Showing posts with label bowker. Show all posts

Tuesday, 8 October 2013

Big Data / Little Data: The practical capture, analysis and integration of data for publishers

Laura Dawson, from Bowker, leans in.
Laura Dawson from Bowker provided the ultimate 'Data 101' for publishers at the Big Data/Little Data session at Frankfurt Book Fair's Contec conference.

She cautioned that data doesn't stop with getting something on Amazon. Bowker has tracked the explosion in the number of books. In the United States there were 900,000 books in print in 1999. This grew to 28 million in 2013. Information is on a massive scale. We are swimming in it.

There is both a problem and an opportunity in this abundance. The problem is with fluidity - all this information is out of the container. Abundance, persistence and fluidity lead to issues with discovery.

There are four different types of metadata:

  1. Bibliographic: basic book information, the classic understanding of metadata.
  2. Commercial: tax codes, proprietary fields.
  3. Transactional: inventory, locations, order and billings, royalties, etc.
  4. Merchandising: descriptive content, marketing copy, consumer oriented content.

Part of the challenge of managing metadata is the number of different sources. There are publisher-prepared files, publisher requests (typically by email), data aggregators (e.g. Bowker), social reading sites, online and offline retailers, and libraries (remember them?).

Other complicating factors for digital metadata include differential timing (physical books require metadata six months before publication, digital books upon publication). There are different attributes and more frequent price changes. Conversions are often outsourced and, in relative terms, this is a whole new process.

Current metadata practice tends to involve creation in four primary departments (editorial/managing editorial, marketing, production and creative services). Management responsibility varies by sender. Most publishers treat publication as the end date for updates (although this is changing). Complete does not mean accurate, and inspection is limited. Prepping metadata is somewhat ad hoc. But it's not all bad news. Many publishing houses are now looking at metadata as a functional map. They are examining the process and putting all data into a metadata repository.

Best practice in organising metadata is emerging. You need a hub - a single source of truth for your data that can deal with multiple contributors and multiple recipients. Define roles clearly and provide a single source. Identifiers are much more efficient for search engines than thesauri. Text matching doesn't work across character sets, or even across languages that use the same characters.
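To make the point about identifiers versus text matching concrete, here is a minimal Python sketch. It is not from the talk: the two records, the ISBN and the helper names are invented for illustration. It shows two feeds describing the same book, where a plain title comparison fails because of an accent but an identifier comparison succeeds.

```python
# Hypothetical records for the same book from two different feeds.
# The ISBN is made up for illustration and does not refer to a real title.
publisher_record = {"isbn13": "9780000000002", "title": "Les Misérables"}
retailer_record = {"isbn13": "978-0-00-000000-2", "title": "Les Miserables"}


def normalise_isbn(raw: str) -> str:
    """Drop hyphens and spaces so differently formatted ISBNs compare equal."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


def same_by_title(a: dict, b: dict) -> bool:
    """Naive text matching: breaks on accents, spelling and translation."""
    return a["title"] == b["title"]


def same_by_identifier(a: dict, b: dict) -> bool:
    """Identifier matching: compares a short, unambiguous code."""
    return normalise_isbn(a["isbn13"]) == normalise_isbn(b["isbn13"])


print(same_by_title(publisher_record, retailer_record))
print(same_by_identifier(publisher_record, retailer_record))
```

The title check could be made smarter with accent folding, but that only patches one case; the identifier check needs no knowledge of language or character set.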

A number of codified representations of a concept should be used because they act as shortcuts for search engines.


Machine language is key. Codes are easier to process than text: faster and less complex. Codes are unambiguous, whereas natural language evolves and is less stable. You can link data sets using ISNI (the International Standard Name Identifier). Content's new vocabulary is based upon:

  • structured content
  • linked data/linked open data
  • the semantic web
  • ontology
  • GoodRelations - an ontology devised specifically for describing products for sale
  • RDF - Resource Description Framework
  • and data visualisation.
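
As a rough illustration of how RDF and GoodRelations fit together, the Python sketch below describes an ebook offer as subject-predicate-object triples and prints them in an N-Triples-like form. It is not from the talk: the `example.org` URIs, the title and the price are invented, the literals are left untyped for brevity, and only a handful of GoodRelations terms are used.

```python
# Namespaces: GoodRelations and the RDF core vocabulary.
GR = "http://purl.org/goodrelations/v1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for this sketch

offer = EX + "offer/1"
price = EX + "price/1"

# Each triple is (subject, predicate, object). Objects wrapped in double
# quotes are literals; everything else is treated as a URI.
triples = [
    (offer, RDF_TYPE, GR + "Offering"),
    (offer, GR + "name", '"An Example Novel (ebook)"'),
    (offer, GR + "hasPriceSpecification", price),
    (price, RDF_TYPE, GR + "UnitPriceSpecification"),
    (price, GR + "hasCurrency", '"GBP"'),
    (price, GR + "hasCurrencyValue", '"4.99"'),
]


def to_ntriples(rows):
    """Serialise triples one per line, with URIs in angle brackets."""
    lines = []
    for s, p, o in rows:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

Because every term is a URI rather than free text, a search engine or aggregator can process the offer without guessing what a field label means.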

Tuesday, 9 October 2012

Tools of Change Frankfurt Key Note 1: Jo Henry on Consumer eBook Monitor Data

The (ebook) world according to Bowker
For the first session at Tools of Change conference on the eve of the Frankfurt Book Fair, Jo Henry from Bowker presented an overview of their Consumer eBook Monitor data.

Who is downloading ebooks? In general, they are male, and a third to a half are under 35. The majority live in urban/suburban areas and half to three quarters are in work. A third have a degree (although this rises to 90% in India).

Future trends include: i) a move towards a more even male/female split; ii) an older readership, with more over-35s in most markets; and iii) an increasingly suburban readership. India is a massive market; however, the US is still the biggest. There is a long tail across to New Zealand, which has a small population and low growth.

Other interesting trends include:

  • engagement in ebooks is not slowing print purchases
  • heavy book buyers are usually promiscuous and buy/borrow from all channels
  • 10% of people who were not buying print books are buying ebooks

What is the role of free in the digital world? Free is driving engagement with paid digital content. If you are a free downloader, you are two and a half times more likely to buy print while still downloading, unless you are in India or South Africa. The survey also asked about piracy. India and Canada score low on the 'never would download illegally' response. Some consumers are conflicted and might consider downloading illegally if they couldn't get a legitimate copy.

Where there is a young market, the device most used for e-reading is a PC. The markets that have adopted e-reading devices most enthusiastically are Canada and the UK. There is also a significant number who read on mobile. Amazon has the strongest market share in the UK, US and New Zealand, while Canada has a higher proportion from Kobo.

Attitudes to pricing are fairly consistent: consumers say ebooks should be cheaper than print books. In most markets they think an ebook should cost 50% of the hardback price and c.70% of the paperback price for adult fiction, except in India, where the ebook is perceived to be worth more than print. For debut authors, prices need to be lower.

She concluded with the following observations:

  • growth rates are fast - particularly in emerging markets
  • engagement with ebooks doesn't always reduce print spend
  • free is driving the ebook market
  • the biggest players don't always dominate the market
  • 'E' is regarded as less valuable than 'P'.