AI was one of the hot topics at last year's ALPSP conference. In this guest blog, Hong Zhou, Senior Product Manager
for Information Discovery and AI at Atypon, gives us the 101 on this transformational development.
Artificial intelligence, or AI, is much more than the latest technology buzzword. According to Gartner, by 2020, AI will positively change the behavior of billions of workers and users. And Tata estimates that the vast majority of those workers will work outside of IT.
But what exactly is AI?
AI is a broad set of technologies
that use the computational capabilities of machines to “think” like humans.
There are many different types of AI, each of which can be used to solve
different problems.
So how can AI be employed by
scholarly publishers? Ultimately, any publishing technology should make the
research experience more productive, increase content usage, and add value to
the publisher’s content. To do that, R&D at Atypon explores ways to help
readers discover useful and relevant information more quickly by improving
search mechanisms and refining content recommendations.
Making content relevant: Recommender systems
Recommender systems will be familiar
to anyone who has received suggestions about what other products to buy before
or after making an online purchase. Publishers can use them to target relevant
products to individual customers by understanding their online site behavior
and interests.
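
As a rough illustration of the idea, the sketch below recommends articles that tend to be read alongside the ones a reader has already seen. It is a minimal co-occurrence approach, not a description of Atypon's system; the function names and the shape of the `histories` data are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(histories):
    """Count how often each pair of articles appears in the same reader's history."""
    co = {}
    for articles in histories.values():
        for a, b in combinations(sorted(set(articles)), 2):
            co.setdefault(a, Counter())[b] += 1
            co.setdefault(b, Counter())[a] += 1
    return co

def recommend(user, histories, co, k=3):
    """Suggest up to k unseen articles that co-occur most with the user's reads."""
    seen = set(histories[user])
    scores = Counter()
    for article in seen:
        for other, count in co.get(article, {}).items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [article for article, _ in ranked[:k]]

# Hypothetical reading histories: reader id -> article ids
histories = {
    "u1": ["a", "b"],
    "u2": ["a", "b", "c"],
    "u3": ["b", "c", "d"],
}
co = build_cooccurrence(histories)
print(recommend("u1", histories, co))
```

A production system would typically weigh recency, dwell time, and content similarity as well; raw co-occurrence counts are only the simplest starting point.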
Anticipating what readers want: Personalized search
AI-driven recommendation technology
can be extended to personalize search as well: reading histories can be used to
adjust search rankings specifically to each user—and even suggest new queries
that may be relevant—with the goal of understanding a user’s intentions even
before they search.
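
One simple way to picture this is to blend a search engine's base relevance score with a reader's affinity for each result's topics. The sketch below does exactly that; the `relevance` and `topics` fields, the `weight` parameter, and the scoring formula are all assumptions chosen for illustration.

```python
from collections import Counter

def topic_profile(reading_history):
    """Turn a reading history into topic shares that sum to 1."""
    counts = Counter(t for doc in reading_history for t in doc["topics"])
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

def personalize(results, profile, weight=0.5):
    """Re-rank results by base relevance plus weighted topic affinity."""
    def score(result):
        affinity = sum(profile.get(t, 0.0) for t in result["topics"])
        return result["relevance"] + weight * affinity
    return sorted(results, key=score, reverse=True)

# Hypothetical data: what the reader has read, and raw search results
history = [{"topics": ["genomics"]}, {"topics": ["genomics", "crispr"]}]
results = [
    {"id": "x", "relevance": 1.0, "topics": ["geology"]},
    {"id": "y", "relevance": 0.9, "topics": ["genomics"]},
]
profile = topic_profile(history)
print([r["id"] for r in personalize(results, profile)])
```

With `weight=0` the ranking falls back to the unpersonalized order, which makes the effect of personalization easy to compare and tune.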
Faster, easier content classification: Semantic auto-tagging
Content tagging underlies many
important website capabilities, such as automating the creation of
topic-specific pages and content bundles, and powering search results and
content recommendations. But tagging documents and maintaining tag sets can be
a daunting undertaking. Auto-taggers powered by intelligent machine learning
algorithms tag articles accurately and even identify which tags may not be
assigned correctly. They save curators time by letting them concentrate their
efforts only on content that’s assigned low “confidence scores” by the
auto-tagger, thus making it easier for publishers to implement and manage
taxonomies.
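
The confidence-score workflow can be sketched as a simple triage step: tags above a threshold are accepted automatically, and the rest are queued for a curator. The threshold value and the record fields below are hypothetical and would depend on the tagger actually in use.

```python
def triage(predictions, threshold=0.8):
    """Split auto-tagger output into accepted tags and a curator review queue."""
    accepted, review = [], []
    for prediction in predictions:
        if prediction["confidence"] >= threshold:
            accepted.append(prediction)
        else:
            review.append(prediction)
    # Least confident first, so curators see the most doubtful tags at the top.
    review.sort(key=lambda p: p["confidence"])
    return accepted, review

# Hypothetical auto-tagger output
predictions = [
    {"article": "A1", "tag": "oncology", "confidence": 0.95},
    {"article": "A2", "tag": "cardiology", "confidence": 0.55},
    {"article": "A3", "tag": "genetics", "confidence": 0.72},
]
accepted, review = triage(predictions)
print([p["article"] for p in accepted], [p["article"] for p in review])
```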
Content enrichment: Natural language processing
Keywords are traditionally extracted
or selected manually, and doing it automatically has typically required a large
amount of training data to identify relationships among topics and key phrases.
When machines understand the meaning of content rather than just the
individual words, they can extract more valuable information from it.
Natural language processing (NLP) automates key phrase extraction and removes
the need to “teach” the engine about the content first. By extracting key
phrases from different sections of the content and ranking them by importance,
NLP ultimately improves content categorization and, by extension, content
discovery.
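
To make section-aware ranking concrete, here is a deliberately simple, unsupervised sketch: candidate phrases are runs of words between stopwords or punctuation, and each occurrence is weighted by the section it appears in. The stopword list and section weights are assumptions, and real NLP pipelines use far richer linguistic signals than this.

```python
import re
from collections import Counter

# Assumed, illustrative values; tune or replace for real content.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "from", "in", "on",
             "to", "is", "are", "with", "by"}
SECTION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}

def candidate_phrases(text):
    """Yield runs of consecutive non-stopwords, broken at punctuation."""
    tokens = re.findall(r"[a-z]+|[.,;:!?]", text.lower())
    phrase = []
    for token in tokens:
        if token in STOPWORDS or not token.isalpha():
            if phrase:
                yield " ".join(phrase)
                phrase = []
        else:
            phrase.append(token)
    if phrase:
        yield " ".join(phrase)

def rank_phrases(sections, k=5):
    """Score each phrase by the summed weights of the sections it occurs in."""
    scores = Counter()
    for name, text in sections.items():
        weight = SECTION_WEIGHTS.get(name, 1.0)
        for phrase in candidate_phrases(text):
            scores[phrase] += weight
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [phrase for phrase, _ in ranked[:k]]

sections = {
    "title": "Knowledge graphs for scholarly publishing",
    "abstract": ("Knowledge graphs connect authors and topics. "
                 "Scholarly publishing benefits from knowledge graphs."),
}
print(rank_phrases(sections, k=2))
```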
Beyond tagging and metadata: Knowledge graphs
A knowledge graph charts all of the
possible connections among publication-related information like authors,
topics, journals, articles, and even external knowledge databases. Based on
these connections, algorithms identify and recommend to researchers the most
influential entities, trending topics, and even co-authors and reviewers based
on their areas of specialization and the subjects about which they’re writing.
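
As a small sketch of how such connections might be used, the example below links authors to topics and to their co-authors, then suggests reviewers for a new submission by topic overlap. Excluding the submitting authors' past co-authors is an assumption added here to illustrate a conflict-of-interest filter; the data layout and function names are likewise hypothetical.

```python
from collections import defaultdict

def build_graph(articles):
    """Link each author to the topics they write about and to their co-authors."""
    author_topics = defaultdict(set)
    coauthors = defaultdict(set)
    for article in articles:
        for author in article["authors"]:
            author_topics[author].update(article["topics"])
            coauthors[author].update(a for a in article["authors"] if a != author)
    return author_topics, coauthors

def suggest_reviewers(submission, author_topics, coauthors, k=3):
    """Rank potential reviewers by how many of the submission's topics they cover."""
    topics = set(submission["topics"])
    excluded = set(submission["authors"])
    for author in submission["authors"]:
        excluded |= coauthors.get(author, set())  # assumed conflict-of-interest rule
    scored = []
    for person, their_topics in author_topics.items():
        overlap = len(topics & their_topics)
        if person not in excluded and overlap:
            scored.append((overlap, person))
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [person for _, person in ranked[:k]]

# Hypothetical publication records
articles = [
    {"authors": ["Ann", "Bob"], "topics": ["nlp", "search"]},
    {"authors": ["Cy"], "topics": ["nlp", "search", "graphs"]},
    {"authors": ["Dee"], "topics": ["nlp"]},
]
author_topics, coauthors = build_graph(articles)
submission = {"authors": ["Ann"], "topics": ["nlp", "search"]}
print(suggest_reviewers(submission, author_topics, coauthors))
```

A full knowledge graph would also carry journals, citations, and external databases as nodes; the two dictionaries here stand in for just the author-topic and author-author edges.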
Granular discoverability for text and images: Semantic enrichment
Suppose a researcher wants to
interpret many figures associated with a single experiment. Editors have to
segment them manually using specialized software—problematic when processing a
large number of them. Machine learning can be used to extract sub-figures and
captions from compound figures and even separate labels from their associated
images, enabling each item to be searched and retrieved individually. Such
automation not only reduces the cost of segmentation but also extracts and
organizes more valuable information so researchers can search for, compare, and
recommend images more precisely and easily.
Search the science, not the text
AI is no longer an aspirational
conversation about the future: many of the technologies discussed above are
available today and in use by publishers. By using AI to provide better search
results for researchers, and to target content more effectively, publishers
can deepen researchers’ engagement with their websites,
increase the value of their content, and further the pursuit of scientific
knowledge by surfacing the information researchers need more quickly and accurately.
Hong Zhou works on Atypon’s next-generation information discovery technologies. Previously, he was the CTO of Digital Fineprint, a startup that leveraged machine learning algorithms for the insurance industry. He also spent a year designing race car games at Eutechnyx. He holds a PhD in 3D modeling with artificial intelligence algorithms from Aberystwyth University and has published widely on computer science.
Website: https://www.atypon.com/
Facebook: https://www.facebook.com/Atypon
LinkedIn: https://www.linkedin.com/company/atypon/
Twitter: https://twitter.com/Atypon