Tuesday 9 October 2012

Tools of Change Frankfurt: Metadata Futures

Karina Luke from BIC introduced a panel on metadata for the future. Graham Bell from EDItEUR began by observing how uncomfortable book publishers are with the concept of metadata. He explained the fundamentals of metadata for the industry and how it can help you begin to discover new metadata within the data.

He went on to describe in more detail what metadata and linked data are. Linked data expresses metadata as a collection of triples. It uses URIs to represent relations and things, and prefers persistent HTTP URIs so they can be 'looked up' to get further details. This lets the data be 'self-describing'. He warned that Linked Open Data adds a further requirement that the data be free and accessible, and counselled bearing this in mind as it may, or may not, be what you want.
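
As a rough illustration of the triple model described above, the sketch below stores a few statements as (subject, predicate, object) tuples in plain Python. All URIs use the reserved example.org domain and are invented for illustration; they are not real identifiers from any publisher or standards body, and a production system would more likely use a dedicated RDF library.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the relation
    obj: str        # URI of another thing, or a literal value


# Hypothetical URIs on the reserved example.org domain.
BOOK = "http://example.org/book/1"
AUTHOR = "http://example.org/person/1"
HAS_TITLE = "http://example.org/vocab/title"
HAS_AUTHOR = "http://example.org/vocab/author"
HAS_NAME = "http://example.org/vocab/name"

triples = [
    Triple(BOOK, HAS_TITLE, "An Example Title"),
    Triple(BOOK, HAS_AUTHOR, AUTHOR),
    Triple(AUTHOR, HAS_NAME, "A. N. Author"),
]


def describe(uri: str, data: List[Triple]) -> Dict[str, List[str]]:
    """Collect every statement whose subject is `uri`, grouped by predicate."""
    result: Dict[str, List[str]] = {}
    for t in data:
        if t.subject == uri:
            result.setdefault(t.predicate, []).append(t.obj)
    return result


def author_names(book_uri: str, data: List[Triple]) -> List[str]:
    """Follow the author link from a book, then read each author's name."""
    names: List[str] = []
    for author_uri in describe(book_uri, data).get(HAS_AUTHOR, []):
        names.extend(describe(author_uri, data).get(HAS_NAME, []))
    return names


print(author_names(BOOK, triples))
```

Because the author is itself a URI rather than a text string, following it leads to further statements about that person, which is the sense in which the data is 'self-describing'.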

Linked data is just another way of expressing the same data. Some practitioners take a loose view of semantics, which is not well suited to the supply chain. You need to be selective about data sources, as the system is based around trust and expectations of persistence. There is a need for common entities, shared vocabulary and a standard approach.

George Lossius' presentation was called 'Navigating the Semantic Web'. He covered definitions of linked data - the semantic web - and why we need it, who is using it now, and the business benefits for the trade. Working in the semantic web isn't a scary thing: it brings you closer to the original, scientific viewpoint, and it's fun.

The semantic web takes the web solution further by providing:
  • web of linked data vs web of documents
  • framework of emerging standards (W3C)
  • structured content - standard way of describing things
  • ontology
  • inference / relationship
  • interoperable
  • combination of data from diverse sources

'The semantic web is a little bit about us: it uses deductive reasoning and inference to do things you ask it to do.'

An example of a semantic website is Breathing Space, a pilot project that aims to explore the value to researchers of compiling and mining a critical mass of data within a discipline. Another example is GSE Research, which aims to provide a bridge between scholarly research and practice in the fields of governance, environment and sustainability. It was interesting to hear him note that the BBC website for Olympic athletes was populated by a semantic search.

Is it relevant to the publishing industry or to trade books? Yes. Your consumers are becoming more demanding, time-poor and intolerant of waiting. So the job in the publishing supply chain is to make it easy and interesting so you don't lose your readers. What the semantic web gives you is the opportunity to create compelling, relevant and interesting material that creates value for them and for your business.

'The semantic web is about fulfilment: the fulfilment of books and the fulfilment of the right content to consumers at the right time.'

Beat Barblan from Bowker provided an illustration of how identification can be difficult online and how the ISNI helps. The ISNI is an ISO standard which uniquely and authoritatively identifies Public Identities across multiple fields of creative activity. For a full definition of ISNI read the website.

It will help with discovery, search ranking, identifying rights holders and distribution. It is the tool that can link the unique content to the creator. It is a bridge identifier that will link while showing enough to disambiguate. The rich content will be found elsewhere. There are just under 1.5 million ISNIs assigned and around 15.5 million provisional records.
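
To make the 'bridge identifier' idea concrete, here is a minimal sketch in which catalogue records are matched to a creator by a shared identifier rather than by name. The records, field names and 16-character identifier strings are placeholders invented for illustration; they are not real ISNI assignments.

```python
# Placeholder identifiers, invented for this example (NOT real ISNIs).
ID_NOVELIST = "0000000000000001"
ID_ILLUSTRATOR = "0000000000000002"

# Two different people who happen to share a name.
creators = {
    ID_NOVELIST: {"name": "John Smith", "field": "fiction"},
    ID_ILLUSTRATOR: {"name": "John Smith", "field": "illustration"},
}

# Catalogue records carry both the display name and the identifier.
works = [
    {"title": "Example Novel", "creator_name": "John Smith",
     "creator_id": ID_NOVELIST},
    {"title": "Example Picture Book", "creator_name": "John Smith",
     "creator_id": ID_ILLUSTRATOR},
]


def titles_by_name(name):
    """Match on the name string alone: cannot tell the two people apart."""
    return [w["title"] for w in works if w["creator_name"] == name]


def titles_by_id(creator_id):
    """Match on the identifier: each work links to exactly one creator."""
    return [w["title"] for w in works if w["creator_id"] == creator_id]


print(titles_by_name("John Smith"))
print(titles_by_id(ID_NOVELIST))
```

The identifier record itself holds only enough detail to tell the two people apart; richer information about each creator would live in the linked datasets.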

Valla Vakili from Small Demons focused on the great chain of narrative in his talk. He used V for Vendetta as an extreme example of how a narrative can break out into the world: it references so many aspects of history and life, including Guy Fawkes. Data collected included:
  • book
  • character in book
  • character's role
  • character's clothing
  • character's clothing was inspired by historical figure of Guy Fawkes
  • where to get the mask (which is also the highest selling mask on Amazon)

Howard Willows at Nielsen BookData closed with an overview of the move toward a single subject classification scheme for the global market. He drew a comparison with the Tower of Babel: there is still a range of systems designed for local languages (e.g. BIC, BISAC, SAB, RVM, YSO) and confusion reigns. This fragmentation undermines the schemes' overarching goal and introduces inefficiency into the supply chain.

There is a gap in the metadata for trading partners across national borders, even between divisions of multinational companies. The traditional fix is mapping, and while this works, it only works up to a point. The problems with mapping are:
  • it's not a complete solution
  • there are often competing versions of varying quality with different outcomes
  • they tend to be either simple and inaccurate or complex and accurate.
Overall there is a degradation of quality and a loss of discoverability, which results in a poor experience and a decline in sales. Mapping has been pushed to breaking point by the growth of digital publishing and online trading, and has outrun interim solutions.
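
The limits of mapping can be sketched with a toy example. The two schemes and their codes below are invented for illustration and do not correspond to real BIC or BISAC codes. The point is only that when a detailed scheme is mapped onto a coarser one, distinct subjects collapse together and cannot be recovered on the way back.

```python
# Hypothetical fine-grained scheme A mapped onto a hypothetical coarser scheme B.
A_TO_B = {
    "A-CRIME-HISTORICAL": "B-CRIME",
    "A-CRIME-POLICE": "B-CRIME",
    "A-ROMANCE": "B-ROMANCE",
}

# The reverse mapping has to pick a single A code for each B code,
# so one of the two crime categories is necessarily dropped.
B_TO_A = {
    "B-CRIME": "A-CRIME-POLICE",
    "B-ROMANCE": "A-ROMANCE",
}


def round_trip(code_a):
    """Send a scheme A code to scheme B and back again."""
    return B_TO_A[A_TO_B[code_a]]


# Codes whose meaning changes after passing through the other scheme.
lost = [code for code in A_TO_B if round_trip(code) != code]
print(lost)
```

Here a historical crime novel comes back classified as a police procedural, a small instance of the quality degradation described above.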

'A global market needs globally understood metadata.'

The best and only viable long term solution is a single universal subject classification scheme. Who will benefit? Publishers through greater control over product data; aggregators through less data manipulation; as well as retailers and consumers through a clearer and much simpler supply chain.

As a result of this need, a new organisational structure has been put together, independent of BIC and any other existing company. THEMA has been born to ensure that the global subject classification scheme stays free to use and truly international.
