Tuesday, 3 May 2016

Democratizing eBook Publishing: The rise and rise of e-publishing through the cloud


We spoke to Sabine Guerry, founder of 123 Library, about the rise of e-publishing in the cloud, and why publishers should consider this approach.

For those that are approaching this topic for the first time, can you explain what e-publishing in the cloud is about?

Cloud based systems, or Software as a Service (SaaS) as they are also known, are a way of combining proprietary data with shared software and storage. They provide publishers with access to hardware, software and maintenance on a licensed basis, without the need to invest in setting up and managing an in-house system.

As eBook distribution models have proliferated, it can be difficult to scale your operation. The restrictions in the current aggregator model have resulted in small to medium sized and specialist publishers being overlooked. Cloud publishing is changing that, offering flexibility over distribution and control over sales, effectively democratising publishing. In its simplest terms, it harnesses the potential of off-site data management service providers to open up possibilities with minimal upfront capital expenditure.

What does this mean for a publisher’s output?

It provides efficiency in your distribution process as you can plug into existing tried and tested systems. Cloud services are accessible for the publisher and their customers and employ the latest user experience approaches for simplicity of use. When you use a cloud based service, it means you can offer access to content directly rather than being solely reliant on aggregators. It puts control of your eBook distribution back into your hands. For academic publishers, cloud publishing platforms can cater for eBook delivery to both individual users and institutions.

What other features can it provide?

Some customisation is usually available in cloud based systems meaning you can change and adapt it for your list and your market, but in a timely and responsive way. Cloud based systems also tend to include cross device capability and include enhanced search and research tools that improve the user experience. Areas such as privacy of customer data, authentication, DRM security, bibliographic reference integration, management tools, web reading, cataloguing, content management and e-commerce can be handled by the system.

How does it usually work if you decide to work with a cloud based solution?

Cloud publishing starts with a set of tools for linking easy-to-use software applications – usually via an API (application programming interface). The API allows publishers to create a bespoke, standalone eBook distribution and sales website, or it can be used to power an existing one. The API also governs how the underlying program components interact and function, so the publisher doesn't have to see or modify them.
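As a purely illustrative sketch (the base URL, endpoint path and payload fields below are invented, not part of any real platform's API), a publisher-side integration might assemble a title-upload request like this:

```python
# Hypothetical sketch of a publisher-side API integration.
# The base URL, endpoint path and payload fields are invented for
# illustration; a real cloud publishing platform defines its own.

def build_upload_request(base_url, isbn, title, price_gbp):
    """Assemble the URL and JSON-ready payload for adding one eBook
    to a (hypothetical) cloud distribution catalogue."""
    url = f"{base_url.rstrip('/')}/v1/catalogue/titles"
    payload = {
        "isbn": isbn,          # 13-digit identifier for the title
        "title": title,
        "price": {"amount": price_gbp, "currency": "GBP"},
        "drm": "watermark",    # platform-managed DRM option
    }
    return url, payload

url, payload = build_upload_request(
    "https://api.example-cloudpublish.org", "9780000000002",
    "An Example Monograph", 24.99)
print(url)
```

In practice the publisher's own system would send this payload over HTTPS; the point is that the API hides the platform's internals behind a small, stable surface the publisher never has to modify.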

Why would you recommend users consider this approach?

Cloud services work particularly well for smaller organizations. They don’t require a team of in-house developers working on bespoke software. They are an ‘off the shelf’ solution with simple integration and easier maintenance. The cloud company will undertake the expense and risk of maintaining and developing the software, which is then ‘shared’ amongst their customers, reducing cost. Cloud-based systems are also scalable, so capacity can match demand. Crucially, they allow you to punch above your weight and provide direct eBook services comparable to those of larger publishers.

Sabine is Director and Founder of 123Library, an eBook B2B delivery tool for publishers. She is an entrepreneur who specializes in developing IT services for the publishing industry. 123Library’s CloudPublish™ platform provides a range of business models and management tools for both end-users and librarians, and complies with academic institutions' technical requirements. www.123library.org.

Tuesday, 5 April 2016

SciHub, sharing and predatory journals: how can publishers justify their existence? Rachel Maund reflects...

Rachel Maund, Owner of Marketability and tutor on ALPSP's Introduction to Journals Marketing course, reflects on the increasing importance of the relationship with your editorial board.

"At last year’s meeting of the Asian Council of Science Editors the talk was all of ‘predatory journals’, and how they damaged the reputation of the rest of us by charging author processing fees but not then adding editorial value in return.

Their actions fanned the flames of the argument that publishers exploit academics and are increasingly redundant in an age when sharing, self-archiving and self-publishing is so easy. 

Witness a spate of wonderful retaliatory articles about ‘xx things publishers do’ (various numbers as different authors came up with more) on sources such as The Scholarly Kitchen. Do look up a few of these if you need a bit of reassurance that all is not lost (and inspiration for future copy).

This year it’s the actions of SciHub that are getting journals publishers hot under the collar. And that genie is well and truly out of the bottle. SciHub’s founder, Alexandra Elbakyan, is convinced she has right on her side by making your research content available free to all, and let’s face it, those of us with an inside knowledge of what goes into publishing quality journals are in a very small minority. How much of your content is already being accessed via SciHub? 

But despite the scary headlines, all is not lost. We can distance ourselves from predatory publishers by constantly making clear how much value and credibility is added through the publishing process, and a majority of the research community does accept this in principle.

SciHub is a tougher proposition, because it’s your content they’re making available. 

And even if they were closed down, free sharing of information (as articulated in Google’s mission many moons ago now) is the future. We have to find ways of promoting why working in partnership with a publisher, whether as an author, researcher or librarian, gives an extra quality of service that’s worth paying for. In other words, we have to find ways to co-exist, as book publishers have had to do with Amazon.

The key may lie with our editorial boards. 

It’s never been more important to work in partnership with them to bring positive messages to the wider research community. A percentage of our audience may be sceptical about what we have to say, but editorial board members are fellow academics, and their views have credibility with their peers. Simple tactics like having campaigns fronted by the Editor-in-Chief (written by you, naturally) can really help.

It’s always been important to work in partnership with editorial boards to deliver marketing to our audiences. But today that’s true more than ever before.

We’re in this together, regardless of what you might read in some of the headlines.

None of us has definitive answers as to how to tackle these challenges, but our Introduction to Journals Marketing workshop is both a reality check for some of the simple tactics that will still work for you, and a great chance to hear what other publishers are doing to mitigate the effects of initiatives such as SciHub.”

Rachel Maund is an international publishing consultant specialising in marketing training, with over 30 years’ practical experience, and is the founder of Marketability.

Introduction to Journals Publishing will run on Wednesday 20 April in London. Book online or further details are available from melissa.marshall@alpsp.org.

Tuesday, 23 February 2016

Why is the business technology side of eJournals so unnecessarily complex? Tracy Gardner reflects...

eJournal technology is an essential part of the scholarly publishing industry. It is also the topic of one of our most popular training courses. Here, we spoke to Understanding eJournal Technology co-tutor, Tracy Gardner, about the challenges of keeping up-to-date in this area.

"One of the biggest challenges publishers face is making sure their content can be easily found in the various discovery resources readers use to find journal articles, and then ensuring the steps between the reader finding the content and reading it are seamless and without barriers. There are so many potential pitfalls along the way, and this issue therefore concerns people working in production, IT, editorial, sales, marketing and customer service.

The pace of change is fast, technology is evolving all of the time, and the driver for much of it has come from the libraries. Libraries are keen to ensure their patrons find and access the content they have selected and purchased, and they feel that by keeping patrons in a library-intermediated environment they can improve the research experience overall. Ultimately the library would like the user to start at the library website, find content they can read and not be challenged along the way.

Simon Inger and I have been running the Understanding eJournal Technology course two or three times a year for ten years now and we have never run the same course twice - it constantly needs to be updated.

Those working in customer facing roles such as sales, marketing and customer service may not fully appreciate how much library technology impacts on the way researchers find and access their content. Many people are surprised to learn that poor usage within an institution is often because something has gone wrong with the way the content is indexed within the library discovery layer, how it is set up in the library link resolver, or issues with authentication.

For those in operational or technology roles, the business technology side of eJournals can seem unnecessarily complex and, especially for those new to the industry, the way the information community works can seem counter to the way many other business sectors operate. What makes sense in classic B2B or B2C environments will not make sense within the academic research community.

Our aim is to help people who work in publishing houses understand how eJournal technology works and how they can most effectively work with libraries to maximise discovery and use of their content. Many people who have attended our course have not been aware of the impact some of their decisions have had, and the course has helped them understand why they need to work in certain ways."

Tracy Gardner will tutor on Understanding eJournal Technology in March and October 2016. Book your place now.

Thursday, 4 February 2016

All Change in Scholarly Communications: How are the Players – Veterans and Newbies – Adapting?

Fiona Murphy reports from #APE2016
Last month, in characteristically bracing January Berlin weather, around 250 intrepid speakers and delegates attended the 11th Academic Publishing in Europe (APE – pronounced “Ahhhpay”) meeting. Keep an eye on Twitter #ape2016 as all of the presentations were recorded and so should become available in the near future.

A number of familiar characters – large publishers, established platform providers, and so forth – whose language seems to have evolved over the past few years – spoke about ‘openness’ and ‘sharing’ rather than preserving business models. Todd Toler of Wiley, for instance, expressed the “publisher’s value proposition” as having shifted from content provision – basically “moving stuff about” to “strengthening knowledge connections”. This feels like a real turning of tides; such players are now actively aiding and abetting our efforts to garner significant knowledge from our scholarly ecosystem.

In point of fact, there was a general theme around intelligence rather than simply the power of data. Barend Mons bemoaned the existence of “a Christmas tree of hyperlinks and the malpractice of supplementary material”, instead calling for the training of experts to really understand how machine learning and human interrogation of data can be meshed together to form a powerful whole – “Open Science as a Social Machine” (keep an eye on the IDCC programme in Amsterdam later this month, as he’ll be expanding on the topic there). Meanwhile, Emma Green of Zapnito – a start-up that helps knowledge-based companies maximise the impact of their associated experts – spoke of growing the ‘knowledge economy’ by reducing the noise and chatter, thereby freeing up the collective intelligence.

John Sack of Highwire’s approach was to examine frictions in the workflow. If workflow is ‘a way of getting things done,’ then instances of friction – with the possible exception of a review stage – largely involve the loss of efficiency. Currently most journal workflows are still based on the original print journal format, but with the version of record shifting online, the resulting misalignments between what is desired and what is produced are causing delays, and infringements of established rules (such as copyright). Friction-reducing tools that can support and simplify the generation, finding, and attribution of scholarly outputs are needed. This can be enabled by standards such as ORCID or ResearcherID for people, and by initiatives such as openRIF/VIVO for connecting people and their roles to their works and activities. This connectivity will surely boost quality and productivity, and speaks to the improved garnering of knowledge from our research landscape that arose as a theme across APE. This connectedness, according to Sack, is about a supported conversation amongst collaborators who are enabled by tools that sift, pre-curate and – potentially – publish their scholarly outputs.

Opportunities for new business models are appearing in a number of points in the workflow – Publons acknowledges and badges peer review activities, Overleaf provides templated support to write journal articles, and Elsevier is leveraging the new Mendeley Data service to enable authors to publish their data and link it immediately with journal articles.

At the same time, policy (and therefore funding) is moving in the same direction. Stephan Kuster, Head of Policy Affairs for Science Europe, explained its function and mission. Science Europe is a think tank set up to support and advise EU national research funding councils on EU R&D policy issues. Open Access is one of nine key priorities; its principles include enabling authors to hold copyright, supporting sustainable archiving, and recognising that publication and dissemination are an integral part of the research process and should be funded as such.

There was a thoughtful debate about Scholarly Communications Networks and whether they add value, which would not have been possible even a few years ago. Fred Dylla, Emeritus Executive Director of the American Institute of Physics, made the salient point that the reputation of the journal still needs to be fundamentally challenged for the landscape to be really disrupted. Currently, the people and institutions making the key decisions about funding, tenure and promotion are still fixated on journal reputations and impact factors. So, despite feeling as though there has been a lot of progress in the last few years, it also seems there’s still a lot to do.

Luckily there are several opportunities coming up to extend and develop our understanding of, and strategies for adapting to, this changing landscape. As well as the aforementioned IDCC later this month, look out for the ALPSP Seminar on research data, digital preservation and innovation in March. Standing on the Digits of Giants is co-organised with the Digital Preservation Coalition and is designed to orientate and empower publishers, research managers and researchers to navigate and flourish in the new landscape.

Another key space to continue these discussions is in the context of the Force11 community, which aims to bring together many of the stakeholders needed at the table to effect change: policy makers, funders, researchers, technologists, publishers, informaticists, lawyers, etc. Force16 promises to be an exciting venue where we’ll be pushing scholarly communications into uncharted territory. Hope to see you there too.

Fiona Murphy, February 2016

Now associated with the Maverick Publishing Specialists, Fiona Murphy has held a range of production and editorial roles at Wiley, Oxford University Press, Random House and Bloomsbury Academic. She specializes in emerging scholarly communications (including Open Science and Open Data) and works to raise expertise and activity levels across the wider research and publications communities. Fiona has written and presented extensively on the research landscape, data and publishing. She is Co-Chair of the World Data System—Research Data Alliance Publishing Data Workflows Working Group, an Editorial Board Member of the Data Science Journal and enjoys organizing meetings. orcid.org/0000-0003-1693-1240

This post was written by Fiona Murphy with the support of Melissa Haendel.

Thursday, 26 November 2015

Standards: chaos minimization, credibility and the human factor

Standard, standards, standards. One is born, one conforms to standards, one dies. Or so Edmund Blackadder might have said.

And yet, as David Sommer and his panel of experts demonstrated earlier this month, standards underpin our scholarly publishing infrastructure. Without them, we could not appoint editorial teams, enable the review process, tag or typeset articles, publish in print or online, catalogue, discover, or even assess the quality of what we published – assuming, that is, we had been allowed through the office door by our standards-compliant HR departments. We couldn’t determine the citation rates of our publications, sell said publications to libraries (all of them naturally sceptical of our unstandardized claims for high usage) or even contact our high-profile UCL author (is this the UCL in London, Belgium, Ecuador, Denmark, Brazil or the USA?). Resolution, disambiguation, standardization is the order of the day.

‘We are’, as Tim Devenport of EDItEUR said, ‘in the chaos minimization business’.

Speakers at the seminar offered overviews of the roles played by CrossRef, Ringgold, ORCID, COUNTER, Thomson Reuters, EDItEUR, libraries (in the guise of the University of Birmingham) and JISC, considering content, institutional and individual identifiers, plus usage, citation, metadata and library standards.

Audio of all talks is available via the ALPSP site, but here are some broader conclusions from issues discussed on the day.

Humans make standards

But we’re remarkably good at breaking them too. The most foolproof systems are those that don’t allow much human intervention at all (ever tried to accurately type a sixteen-digit alphanumerical code on less than eight cups of coffee?). Vendors should build systems that not only pre-populate identifier fields, but actively discourage users from guessing, ignoring or simply making up numbers.

Be the difference

Publishers, funders and institutions need to actively assert their need for standards at every stage of their workflows. Break one part of the article supply chain and something, somewhere, is bound to be lost. (And the worst part? We don’t know where.) That means that the entire supply chain must inform and develop standards, not just 'free ride' on existing ones.

Standards help authors find their voice

If an article can be found by DOI, funding source, award number or ORCID iD – in other words, if one or more of the key standards is applied to a particular publication – then research gets heard above the online ‘noise’. Authors can help themselves by claiming their own iDs, but it’s up to publishers and institutions to show them why it matters.

Identifiers enforce uniqueness

They not only help with functionality (disambiguating data and eradicating duplication), but they ensure correct access rights, help understand a customer base and build stronger client relationships. All of this adds immense value to your data.

Standards build credibility everywhere

We tend to think of publishing standards as being the building blocks of the standard workflows – and they are. But the latest development from ORCID encourages involvement in peer review, with journals and funders now collecting reviewers’ iDs to track review activities. That’s a startling contribution to tenure decisions and research assessments. And what about the prospect of using iDs in job applications to verify your publications?

The Impact Factor is a number, not a standard

OK, so we knew that. And we probably had an opinion on it. But coming on a day when Thomson Reuters announced they were ‘exploring strategic options’ for the Intellectual Property & Science businesses, it was good to hear from the horse’s mouth.

Even the ‘standard’ standards need, well, standardizing

Given the significance of COUNTER usage statistics for library negotiations, the possibility for inaccuracy seems startlingly high. Over 90% of users still require some form of manual intervention, and that means greater likelihood of error. There is a role for standardizing and checking IP information to improve the accuracy of COUNTER data - but for now, no one seems to be claiming that ground.

Slow is good

If a publisher/funder/institution is a late standards adopter, that’s OK. Better to start slow and get it right than to implement poorly and leave a (data) trail of tears. But start. Organizations such as ORCID make available plenty of information about integrating identifiers into publisher and repository workflows.

Standards are not anti-innovation

On the contrary, they facilitate innovation. And they provide the information architecture for innovation to flourish in more than one place.

Share it

Since we can't predict when/where (meta)data will be used, let’s make sure everyone knows as much as possible. Make it open source, or at the very least, make it trustworthy.

And finally…

The mobile charging area at the British Dental Association front desk is a perfect example of the need for rational standards. How many wires?

Martyn Lawrence (@martynlawrence) is Publisher at Emerald Group Publishing and attended the recent ALPSP Setting the Standard seminar in London. He can be contacted at mlawrence@emeraldinsight.com.

Monday, 9 November 2015

Why Publishers Need to Know the Difference between Search and Text Mining

Haralambos “Babis” Marmanis, CTO and VP, Engineering & Product Development at the Copyright Clearance Center, looks at the concepts behind search and text mining and highlights why publishers need to understand the differences in order to make the best use of each.

As the author of works on search and the lead architect of a product which enables text mining of scientific journal articles, I am often asked about the difference between Search and Text Mining, and have observed that the two are sometimes conflated. Unless you work with technology every day, this confusion is certainly understandable. Knowing the differences, however, can open new business opportunities for publishers. Both functions deal with the application of algorithms to natural language text, and both need to cope with the fact that, as compared with “pure data,” text is messy. Text is unstructured, amorphous, and difficult to deal with algorithmically.

While the challenges associated with text are common to both search and text mining, the details with respect to inputs, analytical techniques, outputs, and use cases differ greatly. For years, publishers have been engaged in search engine optimization, designed to make their works more discoverable to users. As publishers are increasingly asked to enable text mining of their content, they enter into new territory – a territory different from that of public search engines. Thus, it is more important than ever to understand the difference between these two distinct mechanisms of processing content, so that optimal business and licensing strategies are chosen for each.

To begin with, let me describe the key concepts for each area. "Search" means the retrieval of documents based on certain search terms. Think, for example, of your usual web search on well-known search engines such as Google, Yahoo or Bing. In search, the typical actions performed by a software system are index-based and designed for the retrieval of documents. The indexing process therefore aims to build a look-up table that organizes the documents based on the words they contain. The output is typically a hyper-link to text/information residing elsewhere, along with a small amount of text which describes what is to be found at the other end of the link. In these systems, no “net new” information is derived from the documents through the processes that are employed to create the search index. The purpose is to find the existing work so that its content can be used.
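The look-up table idea can be sketched in a few lines (a toy illustration of the general principle, not any particular engine's implementation): an inverted index maps each word to the set of documents containing it, so retrieval is simply a matter of intersecting those sets.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

docs = {
    "d1": "Avian flu outbreak reported in poultry",
    "d2": "Vaccine trial for seasonal flu",
    "d3": "Climate effects on migratory birds",
}
index = build_index(docs)
print(search(index, "avian flu"))  # only the document with both words
```

Real engines add ranking, stemming and much more, but the essential point stands: indexing reorganizes the documents for retrieval without deriving any “net new” information from them.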

On the other hand, "text mining" is a less widely understood but well-developed field that deals with analyzing (not finding) text. That is, while text mining can sometimes look at meta-textual issues – for example, tracking the history of science by counting the instances of a specific phrase (e.g., “avian flu”) in articles – more often the goal is to extract expressed information that is useful for particular purposes, not just to find, link to, and retrieve documents that contain specific facts.

Text mining tools accomplish this by allowing computers to rapidly process thousands of articles and integrate a wealth of information. Some tools rely on parsing the text contained in the documents and apply simple algorithms that effectively count the words of interest. Other tools dig deeper and extract basic language structure and meaning (such as identifying noun phrases or genes) or even analyze the complete grammatical structure of millions of sentences in order to gain insights from the textual expression of the authors. By extracting facts along with authors’ interpretations and opinions over a broad corpus of text, this more sophisticated approach can deliver precise and comprehensive information, and in the commercial setting, provides more value than simple word counts.
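The simplest word-counting style of mining mentioned above can also be sketched briefly (an illustrative toy reusing the “avian flu” example, not a production tool):

```python
def phrase_counts(docs, phrase):
    """Count occurrences of a phrase in each document of a corpus."""
    phrase = phrase.lower()
    return {doc_id: text.lower().count(phrase)
            for doc_id, text in docs.items()}

docs = {
    "2004-article": "Avian flu spread rapidly; avian flu cases rose.",
    "2010-article": "New avian flu surveillance methods were proposed.",
    "2015-article": "Genomic tools replaced older surveillance methods.",
}
counts = phrase_counts(docs, "avian flu")
trend = sum(counts.values())  # corpus-level total, not a list of links
```

Note the contrast with the search sketch: the output here is derived information about the corpus (counts, a trend over time) rather than pointers to documents, which is exactly the distinction drawn above.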

Unlike with search, the output of text mining will vary depending on the use to which the researcher wishes to apply the results. In some contexts, the output is digital and designed for machines to process. In other examples, such as using text mining to drive marketing of products and services, the ultimate output will be human-readable text. In other words, even when text mining is performed, sometimes the user needs and receives the full article.

Although both search and text mining involve the parsing and lexical analysis of documents, there are important differences that should drive a publisher’s decisions about investments in text mining and search.

  1. In text mining, the processing and analysis is often done on a project by project basis. Unlike the search functionality provided by search engines, the “how, why, and what” are infinitely variable, and it is difficult to accurately anticipate the inputs, processes, and outputs required. For example, depending on a text miner’s use case, the output may be facts, data, links, or full expression, as opposed to the simple links that are the output of search.
  2. Search is about finding a set of relevant documents, each of which is considered independently by the algorithm; if applied to a single document the process will yield the same result for that document. On the other hand, text mining is mostly about discovering and using information that lives in the fabric of a corpus of documents. Change one document and the fabric of the corpus changes. Mining is usually (but not always) consumptive of the content. So, the “search” process is document-by-document specific, while the “mining” process involves sets of documents and how these documents relate to each other.
  3. Lastly, the mining process aims at extracting “higher-order” information that involves first-, second-, and higher-order correlations that may occur among any combination of the terms, data, or expressions appearing in the corpus of documents that is processed.

In summary, search and text mining should be considered as two quite distinct processing mechanisms, with often different inputs and outputs. While publishers need to engage with both, by conflating them, one loses the unique opportunities and strengths that each provides. With search, it’s all about helping users find the specific content that they are looking for. Text mining goes well beyond search, to find multiple meanings in a publisher’s content in order to derive new value therefrom. Hence, one would expect that, just as the processes themselves differ, publishers’ licenses for the search and text mining processes will differ too.

Tuesday, 13 October 2015

Standard Identifiers, Metrics and Processes in Journal Publishing: Mark Hester asks 'Aren't they a bit...dull?'

Why should we use standards? Identifiers, transaction processes, schemas, metrics and many other things in scholarly publishing have standards, or are developing them. Isn’t this a rather arduous and bureaucratic way of handling things? Are these things really there to make life easier or just another way of overcomplicating an already complex market, taking time away from the efforts of actually producing high quality content?

Here Mark Hester of Aries Systems delves into why we should care.

Aren’t standards a bit... dull?

Standards? Just a bunch of numbers, right? With tedious documentation on how and where to use them? Why would I bother with those?

It’s not hard to see why you might think that, but also easy to see how this is misguided. Jumping straight into a document to read about standards is a little bit like reading the telephone directory when you have no intention of calling someone, or leafing through a Haynes manual when you’re not repairing a car.

An example of a standard from outside publishing might help – EAN-13. What is EAN-13 you might ask? You see examples of it daily – it is the standard for the barcodes we see on everything we buy in the supermarket. Retail staff don’t need to know how EAN-13 works, it is unlikely that they’ve read documentation on it, but they are all grateful that it does work when checking stocks, pricing items and working on the till and, in turn, so are their customers.
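The machinery is simple enough to sketch: an EAN-13 number carries a check digit computed from the first twelve digits (odd positions weighted 1, even positions weighted 3, counting from the left), which is what lets a till reject a misread scan. A minimal version:

```python
def ean13_check_digit(first12: str) -> int:
    """Check digit for a 12-digit EAN prefix: weight the digits
    1,3,1,3,... then round the sum up to the next multiple of 10."""
    total = sum(int(d) * (3 if i % 2 else 1)
                for i, d in enumerate(first12))
    return (10 - total % 10) % 10

def is_valid_ean13(code: str) -> bool:
    """True if the 13-digit code's final digit matches its checksum."""
    return (len(code) == 13 and code.isdigit()
            and ean13_check_digit(code[:12]) == int(code[-1]))

print(is_valid_ean13("4006381333931"))  # prints True: a valid code
```

A single mistyped or misread digit changes the weighted sum, so the checksum fails and the scanner asks for a rescan rather than charging for the wrong product.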

So I ignore standards: what’s the worst that can happen?

When I was a student in the early nineties, the departmental librarian had been using his own classification system for many years. Back then, it didn’t matter much – students got used to its quirks, visitors from other departments were rare, from other universities much rarer still. The people using the service understood it, and that was enough.

Imagine taking this approach in the online world - it would mean that your content would be less discoverable and also less usable. Online library catalogues wouldn’t work if everyone took the same approach as the librarian from my alma mater! Not using DOIs means frustration for researchers who can’t click on references and go straight to the articles, and a simple change to a URL means a broken link. If your content isn’t seen, it affects your reputation and, in the case of a commercial publisher, your profits.

The benefit of standards will only increase as the ‘digital natives’ used to touch screen technology enter academia and the workplace – having to click more than once or search for more than a minute will lead them to go elsewhere.

How can standards enhance my working life and be good for my organization?

Rapid change in scholarly publishing means that new applications are found for standards once they are in place. Adopting standards can ‘future proof’ your content and processes against changes yet to come.

A great example of this is the relentless adoption of gold open access. The publishing standards which enable Copyright Clearance Center’s RightsLink for OA to display different article processing charge policies to different users on the fly developed separately from one another – Ringgold for institutions, ORCID for identifying authors, and FundRef for funder identification. Brought together, however, their machine readability allows flexible APC pricing models and automated billing and payment processing, making life easier and saving time and money for both publishers and institutions.
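To make that concrete, here is a toy sketch of the kind of lookup such machine-readable identifiers enable. The pricing table, waiver rule and every value below are invented for illustration; a real system such as RightsLink encapsulates far more logic.

```python
# Toy illustration: identifiers standing in for free-text names.
# Ringgold-style numeric ids identify institutions and Open Funder
# Registry (FundRef) DOIs identify funders. All values are invented.

INSTITUTION_APC_GBP = {
    "1234": 1200,  # hypothetical Ringgold id with a negotiated rate
    "5678": 1500,
}
WAIVER_FUNDERS = {"10.13039/000000000001"}  # hypothetical funder DOI

def quote_apc(ringgold_id, funder_doi, default_gbp=1800):
    """Return the APC to display, resolved from identifiers alone."""
    if funder_doi in WAIVER_FUNDERS:
        return 0  # funder covers the charge under its agreement
    return INSTITUTION_APC_GBP.get(ringgold_id, default_gbp)

print(quote_apc("1234", None))                     # negotiated rate
print(quote_apc("9999", "10.13039/000000000001"))  # funder covers it
```

Because the inputs are unambiguous identifiers rather than free-text institution or funder names, the price can be resolved on the fly with no human disambiguation step, which is precisely what makes the automated billing described above possible.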

The advantages can be psychological as well as practical – if authors, researchers and librarians see the ORCID or CrossRef logos displayed on your website, they will know that your organization is a serious player, one which will help them, one they can trust.

So what's next?

By now, I hope I’ve convinced you of the importance of standards. But if the prospect of researching the topic still fills you with a sense of dread, there's an upcoming seminar from ALPSP I'm helping to coordinate called Setting the Standard. It's being held in London on Wednesday 11 November and includes speakers from CrossRef, Ringgold, ORCID, COUNTER, Thomson Reuters, EDItEUR, Jisc and an institution. Everything you ever wanted to know about standards, but were too scared to ask.

I hope to see you there.