|Who's Afraid of Big Data? Not this panel...|
There have long been calls for authors’ underlying research data to be made accessible, so as to substantiate research conclusions, suggest further work and so on. The main Plenaries concerned themselves with Big Data: usually unstructured sets of elements of unprecedented scale and scope, such as the whole of Wikipedia, accumulated Google searches, the biomedical literature or the visible galaxies in the universe. The challenge of ‘mining’ these datasets is to bring structure to them so that new insights emerge beyond those arising from limited or sampled data. This requires automation, substantial computing resources, appropriate but faster human intervention and sometimes crowdsourcing.
|Gemma Hersh from Elsevier on TDM|
Inevitably there are barriers and issues. The data themselves are often inadequate; for example, not all drug trials are published, and negative or non-results are frequently excluded from papers. Research data are not always structured and standardized, and authors are often untutored in databases and ontologies. The default policy, it was recommended, should be openness in the availability of authors’ published and underlying data, standardized with full metadata and unique identifiers, to make data usable and mitigate the need for sophisticated mining.
|CrossRef's Rachael Lammey|
Rejecting the name tag Cassandra, Paul Uhlir of the National Academies urged a note of caution. Big Data is changing the public and academic landscape, harbouring threats of disintermediation, complexity, luddism and inequality and exposing weaknesses in reproducibility, scientific method, data policy, metrics and human resources, amongst others.
|Paul F. Uhlir urges caution|
Judging by the remainder of these sessions and the audience reaction, excitement was more noticeable than apprehension.
ALPSP, of course, is on the ball: it has just issued a Member Briefing on Text and Data Mining (member login required) and will publish a special online-only issue of Learned Publishing before the end of this month.