|The panel line up for questions|
The problem with not sharing
Lee-Ann Coleman, Head of Scientific, Technical and Medical Information at the British Library, chaired the session. She has particular insight into how researchers use data, having worked on the DRYAD project and now on DataCite. Sharing data amongst researchers poses a number of challenges. Coleman acknowledged that publishers have helped by requiring it, but this is not yet standard practice. The lack of sharing can be a real problem, particularly in public health or multidisciplinary areas. The current system does not realise the maximum return on shared data, despite a focus on open data from policy makers and organisations such as the Royal Society.
|Lee-Ann Coleman kicks off the session|
Read more about DataCite here.
What practical challenges do publishers face in making data open?
Phil Hurst is Publisher at the Royal Society, which published the research report Science as an open enterprise in 2012. It highlighted the need to deal with the deluge of data, to exploit it for the benefit of the development of science, and to preserve the principle of openness. Hurst asserted that before you can analyse data, you need to open it up. Why bother? A recent outbreak of E. coli was a classic case study of how open, shared data helped to bring a deadly bacterial outbreak under control quickly.
The report highlights the power of opening up data for science and provides a vision of all scientific literature online. The Royal Society makes sharing data a condition of publication. The data should go into a repository so that the article can link to it. In practical terms, it is still early days. Hurst observed that you need to identify suitable repositories, establish appropriate criteria and share a list to guide authors. One repository they are working with is DRYAD.
|Phil Hurst and a nasty strain of E. coli|
The Society has amended its licences to allow text and data mining and works with partners to facilitate it. Challenges to take into account include how to manage access control for text and data mining purposes. There are differences between subjects and varying degrees of willingness to share across the spectrum of science. Sharing data allows analysts to conduct meta-analyses, modelling, and data and text mining; ultimately, it enables scientists to get new scientific value from content.
Developing taxonomies to track and map data
Richard Kidd, Business Development Manager for the Strategic Innovation Group at the Royal Society of Chemistry, outlined how they had approached data analysis at the RSC by using topic modelling to determine a set of true topics. They identified (or invented) 12 broad subjects, which then generated 100+ categories. These were narrowed down and then mapped to existing categories.
|Richard Kidd from the RSC in action|
They are now looking at data in their publications and patterns in data for sub-domains and hope that this approach will allow them to look at their back list and bring back the original data points.
Chemists don't have a community norm of sharing; the culture is centred on the laboratory group. There is a lack of available standards, and there are concerns about releasing data when patents could be developed. This leads to a more protective culture in relation to research data that can be at odds with open data principles. However, the RSC will be operating the EPSRC National Chemical Database, a domain repository for chemical sciences. Use and reuse are a priority, particularly data availability feeds.
The rise of the 'meta journal'
Brian Hole of open access publisher Ubiquity Press outlined how researchers’ needs drive their publishing efforts. The model they use encourages researchers to share data. Hole is a strong proponent of what he calls the social contract of science and considers publication not only of research but also of research data to be an essential part of it. As a result, an author’s conclusions can be validated and their work more efficiently built upon by the research community. Conversely, he regards withholding data from the community as effectively scientific malpractice. He argues that this principle applies to publishers, librarians and repositories as well as researchers.
|Brian Hole from Ubiquity Press|
Ubiquity Press are developing 'metajournals' to aid in discovery of research outputs scattered throughout the world in different repository silos, and also to provide incentives for researchers to openly share their data according to best practices. The metajournals provide researchers with citable publications for their data or software, which are then referenced by other researchers in articles and books. The citations are then tracked along with the public impact of papers (using altmetrics). The platform so far includes metajournals in public health, psychology, archaeology and research software, with more to come including economics and history. Read more about Ubiquity Press' metajournals here.
If you are interested in data, join us at the ALPSP Conference this September to hear Fiona Murphy from Wiley and a panel of industry specialists discuss Data: Not the why, but the how (and then what?). Book online by 14 June to secure the early bird rate.