Showing posts with label STM Innovations. Show all posts

Wednesday, 4 December 2013

Frank Stein on Watson and the Journey to Cognitive Computing

Frank Stein on cognitive computing
Frank Stein from IBM outlined their project Watson and the Journey to Cognitive Computing at the STM Innovations seminar. Data is exploding, driven by unstructured data (in descending order: video, image, audio, text, structured data). How do we build a system that can take all this information and turn it into something useful for researchers, doctors, etc?

The Watson and Jeopardy! example shows how they have developed a programme that can match deeper evidence and use temporal reasoning, statistical paraphrasing and geospatial reasoning. The evidence is still not 100% certain, but it is about likelihood and confidence.
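To make the idea of 'likelihood and confidence' a little more concrete, here is a minimal sketch of how scores from several kinds of evidence could be combined into one confidence value per candidate answer. This is not IBM's implementation: the evidence names, the weights, the bias and the threshold are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical weights for each kind of evidence (assumption, not from the talk).
# A real system would learn values like these from training data.
WEIGHTS = {"text_match": 2.0, "temporal": 1.0, "geospatial": 1.0}
BIAS = -2.0  # hypothetical offset so that weak evidence yields low confidence


def confidence(evidence):
    """Combine per-dimension evidence scores (each 0..1) into a value in (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * evidence.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def best_answer(candidates, threshold=0.5):
    """Rank candidates by confidence; answer only if the top one clears the threshold."""
    ranked = sorted(
        ((confidence(ev), answer) for answer, ev in candidates.items()),
        reverse=True,
    )
    top_conf, top_answer = ranked[0]
    return (top_answer if top_conf >= threshold else None), top_conf


# Usage: two made-up candidate answers with made-up evidence scores.
candidates = {
    "A": {"text_match": 1.0, "temporal": 1.0, "geospatial": 1.0},
    "B": {"text_match": 0.1},
}
answer, conf = best_answer(candidates)
print(answer, round(conf, 2))
```

The point of the threshold is that the system can decline to answer when no candidate is supported strongly enough, which is the practical meaning of working with confidence rather than certainty.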

What they learned in Jeopardy
The DeepQA approach can accurately answer single-sentence queries with confidence and speed. It is highly dependent on content, content quality, and content formats. They need a combination of technologies to get satisfactory performance (semantic technology, machine learning, information retrieval/search technology, databases and high performance computing techniques). Both structured and unstructured content need to be combined for best results. They now need to extend Watson to handle richer interactions and continuous training/learning.

Here's the IBM video about Watson and the game show Jeopardy!


Watson Decision Advisor in medicine
A data-rich, societally important field helping Watson change how medicine is:
IBM used to produce typewriters
When Stein started, IBM produced typewriters. Now they have 10,000+ products. Their sales agents need help. IBM is building out a portfolio of Watson Solutions including Watson Engagement Advisor for use in situations in which you need stronger ties with constituents and better automated or agent-facilitated conversations. Examples include: bank outreach to customers for cross-sell, cable operator services and support, tax agency advice, etc. 

What's next - Cognitive Computing
Watson is ushering in a new era of computing. We have transitioned from the tabulating systems era to the programmable systems era. Now we are moving into the cognitive systems era. This is a key technology for a new era of computing that takes into account:
  • Content and learning
  • Visual analytics and interaction
  • Data centric systems
  • Cognitive architecture
  • Atomic and nano-scale.

Sayeed Choudhury reflects on the research data revolution

Sayeed Choudhury
Sayeed Choudhury, Associate Dean for Research Data Management, Johns Hopkins University, kicked off the STM Innovations seminar reflecting on 'The Research Data Revolution'.

There is a new economy of sources of data. The challenge as publishers is to develop services.

Data Conservancy is a community that develops solutions for data preservation and sharing to promote cross-disciplinary re-use. It is about preservation - collect and take care of research data; sharing - reveal data's potential and possibilities; and discovery - promote re-use and new combinations.

Is data different?
Data is the new oil (stated in Qatar, European Commission, etc). McKinsey claimed that data is the fourth factor of production and estimated a potential $3 trillion of economic value across seven sectors within the US alone. Todd Park estimates location-sensitive apps generate $90 billion of value annually. Policy movements reflect its importance: the White House Office of Science & Technology Policy Executive memorandum and White House Open Government Initiative are two key initiatives.

Collections
Data are a new form of collections though they are fundamentally different in nature. They are created or converted to digital format for processing by machines. Entirely new methods are required to deal with them. They are, in effect, a new form of special collections.

What is 'Big Data'?
There are definitions based on the V's of Big Data (e.g. volume, velocity, variety). What is clear is that it's different from 'spreadsheet science' (or long-tail science). For Choudhury, if a community's ability to deal with data is overwhelmed, it is 'Big Data' - and it's more about 'M's' (methods or lack thereof) than 'V's'.

Services
There's a core of services that span data from different disciplines and contexts. Archiving is a good example. However, if data collections are basically open, libraries may need to differentiate themselves by the services they offer. They should provide a combination of machine- and human-mediated services. There will be a set of services that only 'experts' will be able to offer.

Data management layers: curation, preservation, archiving, storage

Understanding infrastructure
Data will require fundamentally new systems and infrastructure. Institutional repositories can be useful gateways, but are not long-term solutions (particularly for 'Big Data'). Libraries will need to operate at scale through an integrated, ecosystem approach to infrastructure. Customised 'human-mediated' services are most effective as an interpretative layer on machine-based services.

What about publishers?
No one can claim a specific role or act with a sense of entitlement when it comes to data (whether publishers or librarians). The future of data curation is a competition between information graphs. 'Publishing is about content, not format.' - Wendy Queen, Associate Director of Project Muse, Johns Hopkins University Press