
Wednesday, 3 May 2017

Data challenges for publishers – teams, tools and changes in the law



We are delighted to be able to share this blog from Warren Clark at Research Information, who attended our recent, popular seminar How to Build a Data-Driven Publishing Organization, chaired by Freddie Quek.

Dealing with data is nothing new to scholarly publishers – but it was clear from a recent ALPSP event that it’s an ever-changing battlefield, reports Warren Clark

How to Build a Data-Driven Publishing Organization, held on 20 April at the Institute for Strategic Studies in London and hosted by ALPSP, showed that many still have much to learn about how to approach the masses of data points generated by companies throughout the publishing cycle.

As John Morton, board chair of Zapaygo, said in his keynote: ‘Most publishers are using less than five per cent of the data they own.’

The event featured many examples of areas in which data could be collected, analysed and presented in a form that would improve profitability for publishers, and provide users with a more personalised experience.

Ove Kähler, director, program management and global distribution at Brill, together with his colleague Lauren Danahy, team leader, applications and data, explored the challenges they faced in developing an in-house data team. Their most significant innovation was to arrange their primary data groups according to where they occurred in the workflow: content validation; product creation; content and data enrichment; content and data distribution; product promotion; and product sales.

The pair explained how they created a team – from existing staff within the company – giving each specific responsibility for one of those data groups, and how that led to improved quality and output of data at each step.

Indeed, the notion that publishers shouldn’t assume that dealing with data means employing new staff was echoed throughout the day, with both David Smith, head of product solutions at IET, and Elisabeth Ling, SVP of analytics at Elsevier, suggesting in the panel discussion that publishers should ‘look at your own team first’, since the skills required are likely to be present already.

 

Choosing tools


As well as who and why, many speakers talked about how they capture, store, analyse and visualise the data they collect. The most extensive account came from IET’s David Smith, who overhauled the IT department’s software tools to develop a more accurate suite of visualisations that product teams could use independently, without the need for continuous IT support. Smith explained that those looking for a ‘single solution’ from a software package that solved all data challenges for publishers would be disappointed, before reeling off half a dozen or more software tools that his team had integrated to develop a solution that suited their needs.

In a session that brought a perspective from outside the publishing industry, Matt Hutchison, director of business intelligence and analytics at Collinson Group, a company that runs global loyalty programmes on behalf of major brands, supported this notion by showing how they had outsourced some of their functions to Amazon Web Services (AWS). Matt Pitchford, solutions architect at AWS, demonstrated that the cloud computing set-up developed for Collinson Group involved more than 20 different pieces of software.

 

What data can bring


Another theme was quality of data – as Graeme Doswell, head of global circulation at Sage Publishing, put it: ‘You need your data capture processes to be as granular as you want your output to be.’ He showed examples of how Sage was using its data to show librarians their levels of usage, making it easier for the sales teams when it came to renewals. David Leeming, publishing consultant at 67 Bricks, gave a further example, specifically in the area of content enrichment.

For Iain Craig, director of strategic market analysis at Wiley, data was used to inform business decisions on new journal launches. He described a major project that involved collecting internal and external data points such as subject matter, number of submissions, journal usage, funding patterns, and many more. The outcomes have helped improve existing journals and suggest where future resources should be deployed for emerging markets.

Similarly, Blair Granville, insights analyst at Portland Press, demonstrated how his team tracked submissions, subscriptions, open access, citations, usage, commissions and click-through rates in order to feed intelligence back to the editorial teams about where their focus should be.

 

Data and the law


The most enlightening paper of the day came from Sarah Day, data marketing professional and associate consultant at DQM-GRC, who spoke about data regulation and governance. She warned against complacency and ignorance when it comes to data, particularly with regard to the upcoming General Data Protection Regulation (GDPR). Already law, but due to become enforceable in May 2018 (allowing time for institutions to ensure compliance), this is an EU-wide revision of privacy laws designed to give individuals more control over their personal data.

‘In spite of Brexit, the UK – and indeed any country outside the EU that offers goods and services to people in the EU – will have to comply,’ said Day. The impact of the new regulations is far-reaching as far as publishers are concerned, and among the most important things they can do is ‘be transparent about what you are doing with an individual’s data’.

Although Day successfully rose to the challenge of explaining GDPR in one minute, it served to demonstrate that managing data in a safe, secure, and legal manner is a complex issue that every publisher will have to address head on.

With more than 50 attendees at the event, drawn from publishers large and small, it’s clear that understanding data – and all the issues that come with it – is a challenge that will only become more important in the years to come, as the amount of data generated grows exponentially.

For more blogs and publishing news from Warren Clark and the excellent team at Research Information please visit: https://www.researchinformation.info/analysis-opinion

Thursday, 30 March 2017

'Just do it': highlights from the ALPSP Open Access seminar



Martyn Lawrence attended last month's ALPSP seminar How to Build a Successful Open Access Books Programme, which was chaired by Frances Pinter.

He offers his thoughts on the day.

This one-day seminar on Open Access monographs brought together a mixed – and refreshingly perky – group of publishers, librarians, funders and authors.

On the heels of the R2R conference, held over the preceding days, chair Frances Pinter set the scene in a room full of industry heavyweights, traditional presses, societies and start-ups. She had briefed the wide range of speakers to talk about the challenges they had overcome and how their offers could be scaled up, not just to showcase their companies.

Here, rather than a blow-by-blow account of each presentation, I’m offering the top ten takeaways from a thoroughly enjoyable day.

1. Monographs are important

The tone was set from the outset. There’s an intangible thing with books: even though you can read on a device, there’s something about a printed book that provokes different emotions from a printed journal. Yes, chapters in edited collections are akin to journal articles (scholarly ‘stuff’ to use the language preferred by Toby Green and Tom Clark) but monographs, by and large, arouse different responses. That’s partly because of their dominance in the humanities and social sciences: because HSS research is so often about the idea, rather than the data, the venue for that idea is venerated – as is the means of expressing it. As the Crossick Report stated: ‘The writing of the long-form publication IS the research process’.

2. Books are under pressure

The problems are hardly new: low sales, declining library budgets, tough distribution, pressure to make publicly-funded work freely available and a changing environment in a platform-led world.

For some disciplines, it’s a relevancy issue in the fake-news, barriers-first world of Trump and Brexit. If STM creates new drugs and builds planes, HSS needs to explain what it offers. Indeed, as Rupert Gatti of Open Book Publishers so eloquently said, it’s time to re-evaluate the entire publishing model. If access to your title results in a 300:1 ratio in favour of the open version (based on data from his presentation), it takes a lot of effort to justify prioritising the single digit. We should be able to communicate in more ways, not fewer.

3. HEFCE monograph policy

Funder attention and OA policies have hitherto focused on journals publishing, because of the desire to kick-start innovation and drive new business models. It’s also been driven by academic priorities in the big-money STM areas.

Ben Johnson (HEFCE) explained why HEFCE is interested in OA for all published outputs:

  • it leads to greater efficiency when university finances are stretched
  • it improves quality of research
  • it leads to impact and reach outside big institutions

A diverse system means that people can choose how they communicate. In STM, 98% of REF returns were journal articles. In HSS, by contrast, the monograph dominated.

The REF after next will require OA monographs, and pilots are being put into place for that. In ten years, there will be a significant percentage of OA books. The equivalent REF value isn’t yet given to e-monographs but that will change.

4. We’re going to play nice

The journey to OA for journals was heated and not always constructive. HEFCE hopes to avoid a repeat for monographs (which, given the expected length of the journey, is a blessing), and it’s worth emphasising that the atmosphere in the room was considerably different from the ALPSP OA event in June 2016 which focused predominantly on journals. There was precious little mention here of ‘drive your APCs’ or ‘milk the P&L’. HEFCE set the tone and subsequent speakers reinforced it: all parties should respect that the pace of change will be up for debate.

5. University presses may be the future melting pot for OA

Perhaps the most interesting news was that the initiative for change is less likely to come from the legacy publishers, or even the start-ups, than from the growing cohort of university presses. Often housed within university libraries (and therefore with a strong mandate to champion OA), they are often far less reactive than the legacy publishers. Two careful presentations from CUP and Taylor & Francis bore this out: progress is cautious in the global publishing houses, partly because agitation from the author community is not high, and partly because of varying geographical and disciplinary opinions about open research.

In the UPs, by contrast, commissioning can be driven by ‘what’s the story?’ rather than ‘where’s the money?’. The rationale for editorial excellence is as strong as ever, but removing the pressure of profit margins means OA books can be more eclectic, more interesting and more exciting than ever before. ‘The value to the university is in profile and reputation, not in income’, said Sue White of University of Huddersfield Press. No one is going in for half-measures, either. As Lara Speicher (UCL Press) noted, authors are watching closely and will quiz publishers over their sales and marketing plans for a title. Having said all of that, the (small) list of OA books published by CUP was notable for its breadth and quality: there’s no indication that OA diminishes the value proposition for readers.

6. Systems really stink

Publishers don’t build systems to give away books for free. OK, so there’s a wisecrack hiding there, but try as you might, it’s really difficult to convince a legacy e-commerce system to offer an article or an entire book with a zero price tag. Such systems simply weren’t built with OA in mind, and rescaffolding them is one of those things that everyone assumes is easy until they try it. Time and again, this issue emerged as a remarkable stumbling block.

7. Discoverability ain’t great either

Three kinds of metadata are needed to make an OA monograph fully discoverable, and they are non-negotiable, functional essentials:

  • content (eg keywords and BIC codes)
  • digital (eg DOIs, ORCiDs, ISBNs)
  • OA-specific (eg the specific CC licence for both articles and images, embargo period, funders, location of the Version of Record)

Without this, scalability of OA programmes will prove tricky. It doesn’t help if third-party vendors don’t make it clear that a print book is digitally OA, or if elements of the metadata drop out on the book’s journey through the post-publication environment. (I was reminded at this point of a recent Scholarly Kitchen piece by Jill O’Neill, in which she described the convoluted process of tracking down what she called ‘an OA monograph in the wild’.)
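To make those three layers concrete, here is a minimal sketch of what such a record might hold. The field names and values are hypothetical illustrations; real programmes would typically exchange this information via a standard schema such as ONIX for Books rather than ad hoc structures:

```python
# Hypothetical sketch of the three metadata layers for an OA monograph.
# All field names and values below are illustrative placeholders.
oa_monograph_metadata = {
    "content": {
        "keywords": ["open access", "scholarly monographs"],
        "bic_codes": ["KNTP"],  # BIC subject code (placeholder)
    },
    "digital": {
        "doi": "10.xxxx/example-monograph",       # placeholder DOI
        "isbns": {"print": "978-0-000-00000-0"},  # placeholder ISBN
        "author_orcids": ["0000-0000-0000-0000"],
    },
    "oa_specific": {
        "licence_text": "CC BY 4.0",
        "licence_images": "CC BY-NC 4.0",  # images may carry a different licence
        "embargo_months": 0,
        "funders": ["Example Funder"],
        "version_of_record": "https://example.org/books/example-monograph",
    },
}
```

If any of these elements drops out on the book’s journey through the post-publication environment – the problem described above – the record stops being fully discoverable.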

Simon Bains, Head of Research Services at the University of Manchester, reinforced this point. Unless metadata is strong, Manchester doesn’t give OA books the same priority. In Bains’ view, JSTOR discoverability is good; OAPEN and DOAB are poor; HathiTrust and Internet Archive are non-existent. The library also prioritises books on reading lists.

As Euan Adie said, ‘metadata is a love-letter to the future’. Without it, OA founders.

8. OA encourages audience-first publishing

Some of the most fascinating presentations came from researchers. Vanesa Castán Broto, Senior Lecturer at UCL, made the forceful point that if academics are not inspired to produce something, they will drag their heels. Broto was adamant that she didn’t want to produce something held only by an elite group in the English-speaking global north. Her OA research on Mozambique, published by UCL Press in English and Portuguese, has seen downloads in 152 countries: ‘it’s a massive incentive for me to publish open and in a language other than English’, she said. This motivation, she added, trumped any accusations of OA vanity publishing.

Broto’s conviction raised an important issue. The bigger publishers are ploughing time and money into an OA monograph programme as a business need: they’re packaging it as part of a wider author services offer. By contrast, the authors are taking risks because they are in the business of communicating their research discoveries to the widest possible audience. In one sense, these factors are symbiotic: authors need publications to be widely available, and publishers are in the business of making that happen. But it’s intriguing to see how these two different rationales will converge, given the issues of scalability and sustainability. For the most part, publishing ‘closed’ in the right journals is still more important than publishing ‘open’ in smaller journals.

9. OA enables innovation

Book launches kill budgets, but authors love them. So in a platform-driven world, what’s the alternative? Online parties, says Xinyuan Wang of UCL, who reported on a campaign supporting her OA book with a MOOC and YouTube videos translated into multiple languages. The greatest impact was in attracting new and wider audiences to the work. It’s an audience-first model that legacy publishers will struggle to match.

The larger point seemed to be that OA publishers, particularly those without legacy models to protect, are potential incubators of innovation. Without a cumbersome legacy model to restrict format or dictate price, they can engage more fully with the long tail of high quality titles. Diversity, said Andrew Lockett of University of Westminster Press, has much greater value once you’re not obsessed with the US library market.

10. Print isn’t going away

Despite everything, physical books still make a difference. Ultimately, that’s why the transition to OA monographs has taken so much longer than it did for journals. Lots of university presses are offering books as short-run print-on-demand editions (often 100 copies) to ensure they cover demand, and OA isn’t replacing print. This was the funding message too: academic choice is a big part of the HEFCE approach. Data from Brill and UCL suggests that print sales are not decimated by OA (the effect on ebooks is more notable).

And this is what’s so interesting – the mix keeps us going. When it comes to OA monographs, what do we want? Everything.

Martyn Lawrence is Publishing Manager at the Royal Armouries Museum, with oversight of the books programme at the museum's three sites (Leeds, the Tower of London and Fort Nelson). He is a frequent contributor to international publishing workshops and training events, including seminars for ALPSP and London Book Fair, and he has chaired numerous conference sessions around the world.
@martynlawrence

ALPSP organises a full professional development programme of seminars, training workshops and webinars. See www.alpsp.org for details.


Wednesday, 12 June 2013

Outsourcing: the good, the bad and the ugly. Edward Wates reflects on the Wiley-Blackwell experience.

Edward Wates is Global Journal Content Director for Wiley-Blackwell. He reflected on the Wiley-Blackwell experience at the ALPSP seminar on Outsourcing, providing insights into what to consider when outsourcing and insourcing, as well as lessons he has learnt.

The wider context for outsourcing is the changing environment: we are in a period of rapid change. Growth is slowing, business is being transformed digitally, and the rise of open access means lower revenue per article. There is restructuring and reinvestment in growth and innovation. Wates believes that you need to invest in article enrichment: publishers want their content to do more, and they want to use it in more interesting ways.

He noted that journal contract renewals are a competitive matter, often associated with improved royalty payments to keep the business. This will be the case for all publishers. All this has an impact on the cost base, so publishers have a responsibility to think about it.

What can be outsourced? You have to be careful about what you do, and balance what you outsource and insource. This starts with redefining core competencies. Wiley-Blackwell insources content acquisition, editorial judgement, sales reach, and purchasing/specification. They outsource technology development, ‘processing activities’, customer services, and marketing collateral and support.

You should use a range of financial metrics, including cost of sale (typesetting, PP&B), plant costs (copy editing, project management) and direct expenses (overheads), with the aim of releasing funds for investment. Production insourcing at Wiley-Blackwell has an ethos of ‘manage more, do less’: they focus on workflow development, specification, purchasing, training and support, relationship management and project management.

Their content strategy is now multi-channel (web, mobile, ebook and print), multi-product and user-focused, containing multiple combinations. They also focus on discoverability. Underpinning principles set the framework for thinking about how they deliver these things. Wates was clear about the need to win hearts and minds internally when outsourcing: if colleagues aren’t fully on board, they can undermine the work.

The benefits of outsourcing include:

  • increased productivity and cost savings
  • redeployment of in-house staff
  • speed to market
  • efficiency
  • access to competencies
  • access to tools.

Often, you may face a challenge at senior level, where there is a lack of understanding of processes and tools. This has led them to focus on tools for content management. Other areas to consider are the commoditisation of production services, how vendor expertise is developing and widening, flexible approaches, and ways of slicing the functions to outsource.

Perceived downsides of outsourcing have to be taken into account. These can include concerns around loss of control, quality and time-zone factors. The impact of growing economies, exposure to exchange-rate fluctuations and staff turnover, as well as takeovers and company failure, are also areas of concern.

And what about quality issues? It helps to pin down exactly what they are: style, language, layout and timeliness – and, most critical of all, the XML. Wates also described the ‘scrutiny effect’, whereby the extra attention given to outsourced work may demand higher quality levels than previously existed.

In the future Wates believes that speed to market, greater consistency and further standardisation will be critical. New media and enriched content will facilitate a move away from a print-oriented way of thinking about outsourcing.

Wednesday, 22 May 2013

Text and Data Mining: practical aspects of licensing and legal aspects

Text and Data Mining: international perspectives, licensing and legal aspects was a Journals Publisher Forum seminar organised in conjunction with ALPSP, the Publishers Association and STM held earlier this week in London. This is the second in a series of posts summarising discussions.

Alistair Tebbit, Government Affairs Manager at Reed Elsevier, outlined his company’s view on evolving publisher solutions in the STM sector. Elsevier has text mining licences with higher education and research institutions, corporates and app developers. Text miners are given access through a transfer of content to the user’s system, enabled by one of two delivery mechanisms under a separate licence: an API or ConSyn. The delivery mechanisms have been set up, and there are running costs. Elsevier’s policy to date has been to charge commercial organisations, but not academic ones, for the value these services add.

Why is content delivery managed this way? Platform stability is a critical reason. Data miners want content at scale – they generally don’t do TDM on a couple of articles – but delivering that scale via the main platform, ScienceDirect.com, is sub-optimal. APIs or ConSyn are the solution, as they leave ScienceDirect untouched: effectively, machine-to-machine traffic is separated from the traffic created by real users going to ScienceDirect.com. Content security is another key issue: free-for-all access for miners on ScienceDirect would not allow bona fide users to be checked, and XML versions are less susceptible to piracy than PDFs. It is also more efficient for genuine text miners, since most prefer to work from XML rather than from article versions on ScienceDirect, and these delivery mechanisms put the content into data miners’ hands fast.
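To make the pattern concrete – authenticated machine-to-machine delivery of XML, kept apart from the human-facing site – here is a minimal sketch. The endpoint, header name and key are hypothetical placeholders, not Elsevier’s actual API or ConSyn:

```python
# Hypothetical sketch only: the URL, header and key below are placeholders.
import urllib.request

API_KEY = "key-issued-under-separate-tdm-licence"  # identifies a bona fide miner
doi = "10.xxxx/example-article"                    # placeholder DOI

req = urllib.request.Request(
    f"https://tdm-api.example-publisher.org/fulltext/{doi}?format=xml",
    headers={"X-Api-Key": API_KEY},  # authenticated, machine-to-machine traffic
)
with urllib.request.urlopen(req) as resp:
    xml_fulltext = resp.read()  # XML full text, less piracy-prone than PDF
```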

With text and data mining outputs, they use a CC BY-NC licence when redistributing the results of text mining in a research-support or other non-commercial tool. They require a DOI link back to the mined article, whenever feasible, when displaying extracted content, and they grant permission to use snippets around extracted entities to allow context when presenting results, up to a maximum of 200 characters or one complete sentence.
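As a rough illustration of how a miner might enforce that snippet limit – a naive sketch of the rule as described, not Elsevier’s implementation – consider:

```python
import re

MAX_SNIPPET_CHARS = 200

def clip_snippet(context: str) -> str:
    """Clip a snippet of context around an extracted entity to one
    complete sentence or 200 characters, whichever comes first.
    Naive: the first '.', '!' or '?' followed by whitespace (or the
    end of the text) is treated as a sentence boundary, so
    abbreviations such as 'e.g.' will fool it."""
    boundary = re.search(r"[.!?](?:\s|$)", context)
    if boundary and boundary.end() <= MAX_SNIPPET_CHARS:
        return context[:boundary.end()].strip()   # one complete sentence
    return context[:MAX_SNIPPET_CHARS].rstrip()   # hard character limit
```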

Licensing is working well at Elsevier and will improve further. The demand to mine is being met, and there are no extra charges in the vast majority of cases. Additional services to support mining will likely be offered as they improve. However, it’s early days: mining demand is embryonic, with low numbers at the moment. Copyright exceptions are a big cause for concern, and there is a major risk of a spike in unauthorised redistribution. Platform stability may be threatened, and there is a risk of a chilling effect on future service innovation.

Duncan Campbell, Associate Director for Journal Digital Licensing at Wiley-Blackwell, provided an overview of emerging solutions in text and data mining, with a publisher perspective on intermediary solutions. Text and data mining is important to publishers because it enriches published content, adds value for customers and aids the development of new products. For researchers, it helps to identify new hypotheses and to discover new patterns, facts and knowledge. For corporate research and development, the benefits are as above; in addition, it accelerates drug discovery and development and maximises the value of information spend.

There are a number of barriers to text and data mining:

  • Access: how can users get hold of content for text mining
  • Content formats: there is no standard cross-publisher format
  • Evaluation: understanding user needs and use cases
  • Uncertainty: what is allowed by law, what is the use of text and data mining output
  • Business models: lack of business pricing models e.g. access to unsubscribed content
  • Scale: define and manage demand; bilateral licensing is unlikely to be scalable.
There is a potential role for an intermediary to help with the publisher/end-user relationship. This could include acting as a single point of access and delivery, and providing standard licensing terms as well as speed and ease of access. An intermediary could make mining extensible and scalable, and could cover the long tail of publishers and end-users. It also enables confidential access, which is especially important in pharma.

Andrew Hughes, Commercial Director at the Newspaper Licensing Agency (NLA), provided a different perspective on text and data mining. Text mining requires copying all the data in order to establish patterns and connections, because computers need to index the data: every word on every page has to be copied, and once the copy exists, it needs to be managed. Copying requires access to the data, so indexing can happen either on the publisher’s own database, where there is a risk of damage, disruption and expense unless carefully managed, or on a copy provided to the text miner’s database, where there are costs and control risks for publishers. He believes you also need to bear in mind that third-party licence partners aren’t always as careful with your data as you are.

In the newspaper sector, press packs are produced by text mining. NLA eClips is a service where the proprietary method of mining content is withheld and a PDF of the relevant articles is supplied. There are substantial risks for publishers in text mining, including the potential for technical errors by miners, challenges around data integrity, and commercial malpractice. There are also cost implications, including the technical load on systems, the management of copies and uses, and opportunity costs.

Hughes cited the Meltwater case, in which the industry had to tackle the unauthorised use of text and data mining for commercial purposes. It took a lot of time and litigation, but Meltwater is now thriving within the NLA rules: the company is licensed by the NLA, and its users are licensed too. This means it operates on fair and equal terms with its competitors, and is an example of how licences can work to the benefit of all parties.