Wednesday, 9 September 2015

Anurag Acharya, co-creator of Google Scholar, asks: What happens when your library is worldwide and all articles are easy to find?

There was a real sense of anticipation in the room as Anurag Acharya, co-creator of Google Scholar, stepped up to deliver the first keynote of the 2015 ALPSP Conference.
Acharya harked back to his time at grad school in 1990. Print was the dominant format. Research had to be physically handled. Every library was limited or bound in different ways. Core collections were widely distributed: each field had its own small set of journals, and articles published in them had wide visibility. Other journals had narrow distribution and were found in far fewer libraries, which meant limited visibility for the articles they published.

Browsing was the common way to find research: tables of contents for newly arrived issues, bibliography sections of papers you read, shelves of the libraries you could walk to. Some libraries had search services, often based on titles, authors and keywords, and sometimes including abstracts. There was no full text indexing and no relevance ranking. The most recent came first. But if you couldn't find it, you couldn't learn from it! In every way you were limited: by the shelves, by your institution's budget, by what you didn't know about.

Fast forward to 2015. Almost all journals worldwide are online. A large fraction of archives are online. Anyone anywhere can browse it all - let your fingers do the walking. Your library is worldwide - online shelves have no ends. Relevance ranking allows all articles to rise - all articles are equally easy to find, new or old, well-known journal or obscure. Full text indexing allows all sections to rise including conclusions and methods.

Anyone anywhere can find it all: all areas, all languages, all time. Your own area or your colleague's, latest research or well-read classics, free to all users. If you can get online you can join the entire global research community. There is so much more that you can actually read, from big deal licenses, free archives and preprints to open access journals and articles.
The transformation is fantastic - he could not have dreamt of this 25 years ago as a grad student. And publishers, societies, libraries and search services have together made this possible.

So how has researcher behaviour changed? What do they look for? What do they read? What do they cite? There has been tremendous growth in queries, with many more users and more queries per user in all research and geographical areas.

Queries evolve: the most growth has been in keyword/concept queries, e.g. author name queries and known item queries. The average query length has increased to 4-5 words. Multiple concepts or entities occur often, and most queries are unique. Queries are no longer limited to the researcher's own area. Relevance ranking makes exploration easy, and broad queries return classics and seminal work. There's a mix of expert and non-expert queries from users, with sustained growth in related area queries. The researcher is no longer limited to narrow areas.

What do they read? There has been steady and sustained growth in reading per user, as well as in the diversity of areas per user, in line with the growth in related area queries. Users read much more, as shown by the growth in both abstracts and full texts.

There is more full text available than ever before. Iterative scanning is a common mode: do a query, scan. Abstracts that have full text links in the search interface are selected more frequently, even if users don't actually read the full text. PDF remains extremely popular for full text, as it allows what is important to the researcher to be accessible to them later.

The Google Scholar team has undertaken research into what researchers cite and the evolution of citation patterns. The full report is published on the Google Scholar blog.

Acharya concluded by observing that if it is useful, researchers will find, read and cite it. The spread of attention is widening across the spectrum to non-elite journals (more specific, less known), older articles, regional journals and dissertations. Good ideas can come from anywhere, and insights are not limited to the well-funded or to the web-published. The top 10 journals still publish many top papers, though the share has fallen from 85% in 1995 to 75% in 2013. The elite are as yet still elite, but less so.

Research is inherently a process of filtering, and abstracts are a crucial part of that process. Forcing full text on early-stage users is not useful, and limiting COUNTER stats to full text misses much of an article's utility to researchers.

He reflected that we are lucky to live in an era of information plenty. Better a glut than a famine.
