Monday, 18 May 2015

High Value Content: Big Data Meets Mega Text

ALPSP recently updated the Text and Data Mining Member Briefing (member login required). As part of the update, Roy Kaufman, Managing Director of New Ventures at Copyright Clearance Center, provided an overview of the potential of TDM, outlined below.

"Big data may be making headlines, but numbers don’t always tell the whole story. Experts estimate that at least 80 percent of all data in any organization—not to mention on the World Wide Web at large—is what’s known as unstructured data. Examples include email, blogs, journals, PowerPoint presentations, and social media, all of which are primarily made up of text. It’s no surprise, then, that data mining, the computerized process of identifying relationships in huge sets of numbers to uncover new information, is rapidly morphing into text and data mining (TDM), which is creating novel uses for old-fashioned content and bringing new value to it. Why? Text-based resources like news feeds or scientific journals provide crucial information that can guide predictions about whether the stock market will rise or fall, gauge consumers’ feelings about a particular product or company, or uncover connections between protein interactions that lead to the development of a new drug.

For example, a 2010 study at Indiana University in Bloomington found a correlation between the overall mood of the 500 million tweets released on a given day and the movement of the Dow Jones Industrial Average. Specifically, measurements of collective public mood derived from millions of tweets predicted the rise and fall of the Dow Jones Industrial Average up to a week in advance with an accuracy approaching 90 percent, according to study author Johan Bollen, Ph.D., an associate professor in the School of Informatics and Computing. At the time, Dr. Bollen also predicted, presciently, where TDM was headed: from the imprecise, quirky world of Facebook and Twitter to high-value content. He said, "We are hopeful to find equal or better improvements for more sophisticated market models that may in fact include other information derived from news sources and a variety of relevant economic indicators."
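The idea behind this kind of analysis can be sketched in a few lines. This is an illustrative sketch only, not Bollen's actual method: the word lists, scoring, and lag are invented for the example. It scores each day's tweets against small positive/negative vocabularies, then checks how often the mood on day t agrees with the direction of an index move a few days later.

```python
# Illustrative mood-vs-market sketch (word lists and lag are invented,
# not taken from the Indiana University study).

POSITIVE = {"calm", "happy", "confident", "up"}
NEGATIVE = {"anxious", "worried", "fear", "down"}

def mood_score(tweets):
    """Net mood of a batch of tweets: +1 per positive word, -1 per negative."""
    score = 0
    for tweet in tweets:
        for word in tweet.lower().split():
            if word in POSITIVE:
                score += 1
            elif word in NEGATIVE:
                score -= 1
    return score

def lagged_direction_agreement(daily_moods, daily_index_moves, lag=3):
    """Fraction of days where the mood on day t matches the index
    direction on day t + lag (truncating the unmatched tail)."""
    pairs = list(zip(daily_moods[:-lag] if lag else daily_moods,
                     daily_index_moves[lag:]))
    hits = sum(1 for mood, move in pairs if (mood >= 0) == (move >= 0))
    return hits / len(pairs)
```

A real system would replace the word lists with a trained sentiment model and test the correlation statistically, but the shape of the computation is the same.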

In other words, structured data alone is not enough, nor is text mined from the wilds of social media. Wall Street and marketers, eager to predict the right moment to hit buy or sell or to launch an ad campaign, have already moved from mining Facebook and Twitter to licensing high-value content, such as raw newsfeeds from Thomson Reuters and the Associated Press, as well as scientific journal articles reformatted in machine-readable XML. In fact, a 2014 study by Seth Grimes of Alta Plana concluded that the text mining market already exceeds $2 billion per year, with a compound annual growth rate (CAGR) of at least 25%.

Far from being irrelevant in our digital age, high-value content is about to have its moment, and not just to improve the odds in the financial world or help marketers sell soap. It represents a new revenue stream for publishers and their thousands of scientific journals as well. For example, in 2003, immunologist Marc Weeber and his associates used text mining tools to search for scientific papers on thalidomide and then targeted those papers that contained concepts related to immunology. They ultimately discovered three possible new uses for the banned drug. “Type in thalidomide and you get between 2,000 and 3,000 hits. Type in disease and you get 40,000 hits,” writes Weeber in his report in the Journal of the American Medical Informatics Association. “With automated text mining tools, we only had to read 100-200 abstracts and 20 or 30 full papers to create viable hypotheses that others could follow up on, saving countless steps and years of research.”
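Weeber's approach, reduced to its essence, is concept co-occurrence filtering: start from thousands of abstracts mentioning a drug, then keep only those that also mention a concept from a target vocabulary. The sketch below is hypothetical; the immunology terms and matching logic are invented for illustration, and a real system would use a curated thesaurus such as MeSH rather than substring matching.

```python
# Hypothetical concept co-occurrence filter, in the spirit of Weeber's
# thalidomide study. The vocabulary below is illustrative only.

IMMUNOLOGY_TERMS = {"tnf-alpha", "cytokine", "immune", "inflammation"}

def filter_abstracts(abstracts, drug, concepts=IMMUNOLOGY_TERMS):
    """Return abstracts that mention both the drug and at least one
    concept from the target vocabulary."""
    hits = []
    for abstract in abstracts:
        text = abstract.lower()
        if drug in text and any(term in text for term in concepts):
            hits.append(abstract)
    return hits
```

Even this crude filter shows why the reading load drops from thousands of hits to a few hundred: only the intersection of the two concept sets survives.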

The potential of computer-generated, text-driven insight is only increasing. In his 2014 TEDx Talk, Charles Stryker, CEO of the Venture Development Center, points out that the average oncologist, after scouring journals the usual way, reading them one by one, might be able to keep track of six or eight similar cancer cases at a time, recalling details that might help him or her go back, re-read one or two of those articles, and determine the best course of care for a patient with an intractable cancer. The data banks of the two major cancer institutes, on the other hand, hold searchable records of cancer cases that can be reviewed in conjunction with the 3 billion DNA base pairs and 20,000 genes contained within each. Using that data would mean a vast improvement in the odds of finding clues to help treat a tricky case or target the best clinical trial for someone with a rare disease. This information might otherwise have been difficult, if not impossible, for even the most plugged-in oncologist to find, let alone read, recognize patterns in, or retain.

Think, then, of the possibilities of improving healthcare outcomes if the best biomedical research were aggregated in just a few, easily accessible repositories. That’s about to happen. My employer, Copyright Clearance Center (CCC), is coming to market with a new service designed to make it easier to mine high-value journal content. Scientific, technical and medical publishers are opting into the program, and CCC will aggregate and license content to users in XML for text mining. Although the service has not yet fully launched, CCC already has publishers representing thousands of journals and millions of articles participating.

Consider the difficulties of researchers, doctors, or pharmaceutical companies wishing to use text mining to see if cancer patients on a certain diabetes drug might have a better outcome than patients not on the drug. They must go to each publisher, negotiate a price for the rights, get a feed of the journals, and convert that feed into a single usable format. If the top 20 companies did this with the top 20 publishers, it would take 400 agreements, 400 feeds, and 400 XML conversions. The effort would be overwhelming.
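The arithmetic here is the classic pairwise-versus-hub trade-off: direct licensing needs one agreement per (company, publisher) pair, while a central clearinghouse needs one agreement per party on each side. A two-line sketch makes the scaling obvious:

```python
# Agreement counts for direct licensing vs. a central clearinghouse.

def pairwise_agreements(companies, publishers):
    # every company signs separately with every publisher
    return companies * publishers

def clearinghouse_agreements(companies, publishers):
    # each company and each publisher signs once, with the hub
    return companies + publishers
```

For 20 companies and 20 publishers that is 400 agreements versus 40, and the gap widens multiplicatively as either side grows.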

Instead, envision a world where users can avail themselves of an aggregate of all relevant journals in their field of interest. Instead of 400 agreements and feeds to navigate and instead of 400 documents to convert to XML, there would be maybe 40 agreements: 20 between the publishers and CCC and 20 with users. There would be no need for customers to convert the text. In other words, researchers could get their hands on the high-value information they need to move research and healthcare forward, in less time, with less effort. And that’s only the beginning. As Stryker said about the promise of TDM, “We are in the first inning of a nine-inning game. It’s all coming together at this moment in time.”

ALPSP Members can login to the website to view the Briefing here.

Roy Kaufman is Managing Director of New Ventures at the Copyright Clearance Center. He is responsible for expanding service capabilities as CCC moves into new markets and services. Prior to CCC, Kaufman served as Legal Director, Wiley-Blackwell, John Wiley and Sons, Inc. He is a member of the Bar of the State of New York and a member of, among other things, the Copyright Committee of the International Association of Scientific Technical and Medical Publishers and the UK's Gold Open Access Infrastructure Program. He formerly chaired the legal working group of CrossRef, which he helped to form, and also worked on the launch of ORCID. He has lectured extensively on the subjects of copyright, licensing, new media, artists' rights, and art law. Roy is Editor-in-Chief of ‘Art Law Handbook: From Antiquities to the Internet’ and author of two books on publishing contract law. He is a graduate of Brandeis University and Columbia Law School.

Friday, 24 April 2015

Eleven Top Tips on Content Marketing You Can't Live Without

Kate Smith, far right, and the content marketing panel
Kate Smith, Associate Marketing Director at Wiley, chaired the Turning scholarly research into content marketing gold panel in The Faculty at The London Book Fair. She deftly drew out the key issues surrounding the myths and magic of content marketing from a panel of industry experts including Laura Finn, Content Marketing Manager at the Royal Society of Chemistry, Amy Nicholson, Managing Editor at Sticky Content, and Lynne Miller, Associate Director at TBI Communications.

Read the Storify from TBI Communications here and ALPSP here.

In the spirit of all good content marketers, Kate has created a list of eleven top tips drawn from the session.
  1. Start small, pick something easy 
  2. Map out everything in your content ‘universe’ 
  3. Identify the drivers of your funnel and audience journey 
  4. Don’t get too bogged down with personas 
  5. Audience is 1st, Story 2nd, Content 3rd 
  6. Provide some easy guidelines – like making the first sentence of your content 140 characters 
  7. Use other events like a survey someone else in your company has done and repurpose it into a short form content asset (or two, or three, or four…) 
  8. Top and tail other people’s content 
  9. There are different ways to consume content so repurpose the same content in different forms (a list, an infographic, a blog post etc) 
  10. Check out ‘1 post Wonders’ on Tumblr 
  11. Get Amy from Sticky Content’s stakeholder guide
Disagree? Add your own to the comments field below.

Kate Smith is Associate Marketing Director, John Wiley & Sons. She is a seasoned marketer and has led community marketing teams, and more recently Channel teams and B2B marketing, at Wiley for the last 12 years. As a content marketing advocate, Kate has been helping to change the mindset around marketing at Wiley so that marketers are better equipped to connect and engage with customers and end-users.

Thursday, 9 April 2015

Is there a single project management solution that works for all projects? We spoke to Jim Russell to find out.

Project management remains a core skill that publishers need. We spoke to Jim Russell, tutor on the ALPSP Project Management for Publishing training course, about the different ways of handling this essential function.

Why is it that project management is so important to publishing?

Projects are vital to every publishing organization. They are the means by which most change is implemented. Think about new and enhanced products and services, changes to workflows, systems and processes. What about rebranding, sales and marketing campaigns? Acquisitions, restructuring, strategy development and supplier changes are all project-based activities. I could go on, but you get the picture.

What typically goes wrong when managing projects?

All projects carry inherent risk, and under the pressure of high workloads, or when individuals are inexperienced in managing them, projects often fail to match expectations. Project management is meant to help! But sometimes it doesn't.

Why do you think this is the case?

Over many years, traditional project management methods have been based on experience from large projects. Whilst there is still a place for some of these methods on major infrastructure developments, they can cause delay and add too much bureaucracy to smaller and medium sized projects. Add in a lack of training or experience, plus an expectation of fitting this in to your day job, and you have a recipe for disaster.

So how has project management methodology evolved to tackle these issues?

Most projects now involve web-based solutions that need a more flexible approach to project management, and businesses need shorter implementation time scales. ‘Agile’ methods can be highly effective for incremental web development and can work very well.

Sounds like a great solution, are there any drawbacks?

These methods are not appropriate for all projects and the conditions for the successful application of Agile methods do not always exist. There is no single project management solution that works for all projects!

So what would you advise those starting a project?

I always recommend taking a flexible approach to project management. Review the type of project you are working on and take the best parts of different methods to suit your circumstances and business goals. This will enable you to apply a scalable, tailored approach for different types and sizes of project.

Jim Russell is a project management consultant who runs his own training and consultancy company. He has worked with a broad range of organizations within the publishing sector and continues to be a practising project manager.

Jim is co-tutor on ALPSP’s Project Management for Publishing course along with Ruth Wilson from Nature Publishing Group. The next course will run in May 2015.  Further information and booking available on the ALPSP website.

Tuesday, 7 April 2015

Managing Author Fees – Resolving the “Build vs. Buy” Dilemma

In a guest post, Jennifer Goodrich from CCC reflects on the "Build vs. Buy" dilemma when managing author fees for gold open access.

"In its 2014 article, “Digitizing the Consumer Decision Journey,” McKinsey & Company warns that “tools and standards are changing faster than companies can react,” and nowhere is this more true than in the field of Open Access. Open Access publishing requires new business models from publishers, new activities and expertise from authors when paying article processing charges (APCs), new relationships with institutions and funders, and enhanced systems to support all parties. It’s no surprise, then, that publishers are examining their author programs with a particular focus on creating an integrated workflow between the editorial process and the payment of APCs. As a result, many publishers are facing the choice: can they deliver what’s needed with their existing systems or should they instead be working with an outsourced partner?

The “build versus buy dilemma,” of which this is but one example, is a perennial topic of debate in the field of software development – perhaps why InfoWorld described it as “a question of Shakespearean proportions.” Conventional wisdom suggests that building your own solution may be the right decision in areas of key competitive advantage, or where there is no suitable commercial product to deliver your core business requirements. By contrast, buying is often a more cost-effective way to automate and standardize core business processes, and allows the organization to manage risk by transferring the burden of software development and maintenance onto a third party. But which of these scenarios applies in the case of the management of author fees? The answer will depend on the needs of each organization, but when faced with a rapidly changing environment like Open Access publishing, McKinsey suggests many organizations need to adopt a different approach to managing the consumer decision journey—one that embraces the speed that digitization brings and focuses on capabilities in three areas: Discover, Design and Deliver.

Here is how each of these can be applied to the challenge of managing author fees, and what the implications might be when considering a systems solution.


Discover

In the context of Open Access, the Discover phase means drawing upon information about the author (including reviewer or editorial status), the manuscript and publication, the author’s institution, the author’s funding sources, the author’s membership status and more to develop a full customer portrait. Publishers are under growing pressure to capture and share industry standard metadata such as ORCIDs, ISNIs, DOIs, FundRef IDs and Ringgold IDs. They must therefore invest in the interoperability of their existing systems that contain data about authors and their institutions, and also find ways to allow external systems to draw on this information.

Unfortunately, for most publishers, the information in question resides in disparate legacy systems that are not currently integrated and which rely on proprietary identifiers rather than external data standards. Further, with the growth of OA journals and now books, the volume of author transactions is escalating rapidly, and publishers are managing these transactions on systems that were designed for annual subscription fees, rather than for high-volume real-time transactions. As a result, in order to develop a robust author-centric solution, publishers typically have to dedicate significant resources to both unifying legacy systems and building a new transaction system that can evolve with the marketplace and be ready for numerous emerging standards. A fully unified set of internal systems might be the ultimate goal, but it is likely to come at a high price in terms of development costs and management time. For this reason, an outsourced solution can often make more sense in a resource-constrained environment or where there is a need to deliver a solution within a shorter timeframe.


Design

The Design phase is about creating a frictionless experience that uses the data gathered in the Discover phase to ensure interactions are expressly tailored to an author’s stage in his or her publication journey. Publishers have rightly identified that maintaining high levels of customer satisfaction among authors is critical in an Open Access model. A recent report by Forrester Consulting for the Technology Business Management Council called “The Business Technology Scorecard” highlighted that companies now live in the “Age of the Customer.” For publishers, this global trend is already apparent in the move from annually billing institutions for subscriptions to daily collection of APCs from authors.

In this new paradigm, authors cannot be kept at arm’s length from billing and payment processes, which now form an integral part of the author experience. This requires publishers to dramatically shift how their organizations operate – from a business-to-business (B2B) model centered on selling subscriptions to a business-to-consumer (B2C) model centered on managing an exponentially greater number of micro transactions for APCs.

Furthermore, authors’ expectations of online payment solutions are very high. We all live in the digital world of online service providers such as Amazon, Google and Apple, which develop convenient, easy-to-use applications to which we have become accustomed. The pricing of an APC is often dependent on a far larger range of variables than an iTunes download or an Amazon eBook, but these services set the benchmark for a seamless user experience. Rightly or wrongly, manual, slow or partially automated payment solutions that seem clunky and antiquated negatively impact an author’s perception of a publisher. Instead, authors expect intuitive user interfaces (UIs) and robust workflows when they submit manuscripts and pay author charges. Any system used for this purpose is likely to need some sophisticated algorithms “under the hood” in order to meet these expectations.


Deliver

Finally, the Deliver phase requires the creation of a more agile organization with the right people, tools, and processes. McKinsey highlights the need for cross-functional teams with "strong collaborative and communication skills and a relentless commitment to iterative testing, learning, and scaling—at a pace that many companies may find challenging.” For publishers, this means developing closer links among editorial, production, finance and operations teams who may have had relatively little contact under a subscription model. To make it easy to support these cross-functional teams, a robust APC solution must offer simple management of complex pricing, discounting and compliance reporting. It should also be robust enough to test and implement promotional codes and discounts based on factors such as location of the author, institutional affiliation, subscription status of the institution, and membership status of the author, allowing publishers to iterate quickly in response to market demand. Flexibility is the key in this phase of activity, and any solution needs to accommodate this or risk undermining the organization’s capacity to evolve and develop.
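Rule-driven pricing of the kind described above can be kept flexible by separating the discount rules from the quoting logic. The following is a minimal sketch under invented assumptions: the base charge, attribute names, discount rates, and the "apply only the largest matching discount" policy are all illustrative, not any publisher's actual scheme.

```python
# Minimal sketch of rule-driven APC pricing. All names and rates are
# invented for illustration; real schemes vary by publisher and journal.

BASE_APC = 2000.00

DISCOUNT_RULES = [
    # (predicate over author attributes, fractional discount)
    (lambda a: a.get("society_member"), 0.10),
    (lambda a: a.get("institution_subscribes"), 0.15),
    (lambda a: a.get("waiver_country"), 0.50),
]

def quote_apc(author, base=BASE_APC, rules=DISCOUNT_RULES):
    """Quote an APC, applying the single largest matching discount."""
    best = max((rate for pred, rate in rules if pred(author)), default=0.0)
    return round(base * (1 - best), 2)
```

Because the rules are data rather than hard-coded branches, a publisher can add a promotional code or adjust a rate and re-test without touching the quoting code, which is the kind of quick iteration the Deliver phase calls for.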

Reaching a Decision

Having discussed some of the main characteristics of, and challenges to, efficient APC payment processes, we still have to resolve the build versus buy dilemma. In some cases this will come down to an evaluation of the likely return on investment from each of the two options, which means developing a good understanding of the potential costs and benefits in each case. This needs careful thought, both in testing the market for outsourced solutions and in being realistic about the costs of in-house development and ongoing maintenance, which are frequently under-estimated. Even for the very largest publishers, it's difficult to make the case for the upfront investment needed to develop their own systems. It is often a cost-benefit decision, and the ability to share some of the risk with a third party can be a decisive factor.

Consistency and predictability will be crucial considerations in your decision – errors and unscheduled downtime in this area can quickly cost money, not to mention the risk of reputational damage. However, as Tom DeMarco, principal of the Atlantic Systems Guild has noted: “The more important goal is transformation – creating software that changes the world or that transforms a company or how it does business.” If your chosen author fee management solution can deliver this, you really will have resolved the build versus buy dilemma.

About the Author

Jennifer Goodrich is Director of Product Management at Copyright Clearance Center (CCC). She is responsible for driving the development and evolution of CCC’s RightsLink® for Open Access platform. The platform, built on ten years of experience assisting publishers in collecting author charges, is CCC’s next-generation service designed to simplify the collection and management of article processing charges or APCs. She and the RightsLink team work closely with publishers, authors, manuscript management systems, standards organizations, as well as academic and funding institutions, to ensure the platform meets the needs of the Open Access community.

If you are going to the London Book Fair, CCC will be hosting a panel discussion, “You Made the Move to Open Access: What’s Next for Your Business?” on Tuesday, 14 April, 16:00 – 17:00 in The Faculty. CCC is located at Stand 7C16.

Tuesday, 17 March 2015

BMJ's acquisition of the British Veterinary Association Journals

Janet O'Flaherty from the BMJ
In a session designed to consider divestment as a strategy, Janet O'Flaherty, Publisher at BMJ, outlined how the British Veterinary Association chose to work with BMJ as their preferred partner and how the journals have fared in the past six years.

The BVA has been publishing in house since 1888. They were very traditional and conservative, had seen spiralling costs alongside declining revenues, and had lost contract publishing business over several years. There was a technology deficit and they were struggling to keep up to date. They chose the BMJ because of a good natural fit: both are the 'go-to' associations for their professions.

BMJ inherited a weekly magazine, large backlog of content and lots of historical legacy systems. First of all, they moved the staff across to the BMJ, built a new jobs site, reviewed roles and processes, integrated workflows and upgraded hardware. It was important to reassure the staff that they were investing for the future.

They moved print production to BMJ suppliers and renegotiated the supplier contract for hosting. They introduced right-priced subscriptions and tweaked the business model. They didn't have any vets working on the editorial board so they advertised for a Veterinary Editor-in-Chief. They also set up customer focus groups. They refreshed the cover and page layout, bearing in mind the conservative nature of the audience.

Other editorial developments included:

  • Appointed veterinary research editor and clinical editor
  • Refreshed international advisory board to improve reach
  • Improved workflows and author services - online first
  • Research published as one page summary in print for practitioners
  • Editorial to accompany original articles
  • OA hybrid option
  • Boosted social media.

There is a successful and ongoing relationship with the BVA which is profitable for both parties. Online usage increased 60% from 2011 to 2014. They have had positive feedback from readership surveys. They've looked at new revenue streams such as supplements, round tables and webinars. The jobs site has changed and is even more successful, with revenue growing. Editorial turnaround times are now 12 days from receipt to acceptance. They have a stable editorial team and have launched two new journals.

And the lessons learned? These include:

  • Timing to avoid disrupting weekly schedule
  • Poor inherited data especially for subscribers (importance of TRANSFER)
  • Staff getting used to BMJ (letting go of some responsibilities)
  • Transition to new systems and processes
  • Online hosting transfer slow
  • Recruiting first Veterinary Editor-in-Chief
  • Communications are key

Janet O'Flaherty spoke at the ALPSP seminar Disruption, development and divestment held in London on Tuesday 17 March 2015.

The publisher as technology company

Phill Jones from Digital Science
Phill Jones, Head of Publisher Outreach at Digital Science, asked the audience to consider whether publishers need to become technology companies to succeed in today's market.

Information technology has changed everything. A crucial point in time was when telephones and computers collided. Another was when television and the internet collided (think Netflix). Marc Andreessen, co-founder of Netscape and Andreessen Horowitz, coined the phrase "software is eating the world" to describe software developing into something that heavily disrupts an industry. That is what is happening to our industry. Publishing is changing because it is colliding with information technology.

Who are the most powerful players in publishing? Google, Amazon, Apple. One is an advertising company, one an online retailer, one a device manufacturer. What they all have in common is IT.

In science, what is happening now is a transformation from a cottage-industry approach to industrial scale. Things are done on a much larger scale, with a range of researchers each tackling one part of a research project. Yet the technologies of choice in the lab lag behind: Post-it notes still prevail.

Another key driver is policy change (e.g. Neelie Kroes in the EU, through to NIH compliance, where grants could be withheld from researchers who didn't comply with OA requirements).

There is an evaluation gap. Traditional measures of impact don't take into account funders' requirements to measure impact, including societal impact, public engagement and legislative impact. As publishers, we need to be aware of this.

Where does this leave us? The publisher as technology company? When content and technology collide you basically get Open Science. The different stages of research include: getting the idea, doing the research, documenting findings, output and dissemination, maximising return on researcher investment.

There are opportunities to use technology to help with each one of these stages. That is the space that publishers should inhabit. How do you go about this to add value? Digital Science focuses on investment in young companies that have solutions to problems in the science space. To understand what the problems are, they have a consultancy division who undertake research.

Phill Jones chaired the ALPSP seminar Disruption, development and divestment held in London on Tuesday 17 March 2015.

Where does publishing go from here? Tom Clark reflects...

Tom Clark from Emerald
Tom Clark is Chief Officer for Business and Product Innovation at Emerald. In the closing session at the ALPSP seminar Disruption, Development and Divestment, he reflected on the future for publishing.

We are all learning organisations now. In what can be called "the dawn of digital abundance", everyone is a contributor, discoverer and data source. Connectivity is at the heart of internet commerce, but how sustainable is content aggregation? Business models are diverse and fluid, while usefulness and discoverability are playing a stronger hand.

He believes we are in an internet business, and while he isn't interested in articles, he is interested in author problems and needs. The internet is simply too complex not to innovate and orientate towards customer needs. What is a publisher? What does that student think of you? What will they be doing in five years? A digital, lean, agile approach allows publishers to develop better products for their markets.

We face digital abundance. How do we see change? Who are our competitors these days? What is an author? Does marketing work anymore? Where will revenues come from? (Clark thinks they will come from many new and varied sources.) How do I develop new business models? Does everyone else understand what's going on? Business faculty have conflicting demands on their time (teaching, admin, research). Emerald feel they have a range of products and services that can help.

You need to learn to develop in the blink of an eye: what happens on the internet in a minute is huge. There are new rules and new competitors. The ability to harness and connect more things is key. It is worth considering joint ventures, especially if you can't afford to buy or invest.

Mobile will be key. It's not only researchers using it, students are too. What problem can you solve with mobile? Are mobile apps dead and the humble browser reborn? The FT famously ditched its hub app in favour of HTML5 because it was cheaper, searchable and more responsive. Hardware and connectivity improvements are delivering great experiences via 'm' websites. There will be less phone 'litter' and more responsive/intuitive layers.

We are all learning organisations, including Emerald. When a researcher submits and publishes a paper they are enriching their understanding and furthering their knowledge and career. Helping them to ensure their research makes an impact is key.

Clark asked how well we know and understand the data we can get from our own websites. Not very well, he suspects. You need to understand that online attention is decreasing. Divesting the legacy approach is expensive. Are you creating packages of content, delivering through micro-sites, listening, designing the experience? That's where publishers need to be.

What matters? Innovation, risk, customer data. Investment, connectivity, new markets. Usefulness, discoverability, openness. Existing markets, print versus digital, direct marketing. Organic growth, social media, market share. Librarians, discovery services, open access.

It was a big cultural shift for Emerald to do things quickly. You need a mixed team and the ability to make quick decisions: be agile and decisive. Clark closed by observing that no one has all the answers. You have to experiment.

Tom Clark spoke at the ALPSP seminar Disruption, development and divestment held in London on Tuesday 17 March 2015.