
Monday, 1 October 2012

Sarah Price: Library Technology and Metadata - Measuring Impact

The afternoon session at the To Measure or Not To Measure: Driving Usage seminar came from Sarah Price, E-Resources and Serials Coordinator at the University of Birmingham and Co-Chair of KBART.

One of the key things librarians are interested in is ensuring that the content they buy is easy to use, discoverable and accessible for their students. She provided a candid and compelling account of how the University got to grips with critical feedback from students on its eLibrary provision, and how it instigated a major review and development programme to address the issue.

Traditionally, there were two access points to content: the library catalogue (mainly for print collections) and the eLibrary service. Both were accessed via the home page, but neither took into account special collections and the other services on offer. The user interface was very text heavy, old fashioned and not very user friendly, and you had to search separately for ejournals and ebooks, making the experience confusing, unattractive and a source of dissatisfaction.

As a result, the University has invested in a Resource Discovery Service which provides:
  • single search interface and search box (with a Google-like interface)
  • harvesting of collections across institutions
  • much faster search and results retrieval
  • discovery at article and chapter level
  • post search filtering and refinement.
The service is publicly available - with no upfront authentication - as a taster for potential students and academics. However, to access in-depth content you have to sign in with your university account. It is designed to have no dead ends and is integrated with other web services such as the University portal. They worked with Ex Libris to develop the product and included embedded searching as a function.

They added the Primo Central Index to the product; this is a very important part of the discovery service, delivering article-level searching. A user can narrow a search from 'everything' to specific collections, or use advanced search. You can log in with your personal account, which provides access to the full set of content and lifts restrictions. The results for a search term indicate what type of resource each hit is (e.g. articles, books, etc.). Where it is a book, it will show stock and location of copies on a site-specific basis, even including a map of the location in the library. Print and electronic resources are listed alongside each other in the discovery tool. You can see where your search terms appear in each result to check relevance, you can facet or post-filter (e.g. by article, book, library site, date range, author, language, electronic database, etc.), and it will attempt to group similar records.

Another interesting feature for scholarly publishers is the link on each textbook to the in-house reading list management system. This is flagged at the foot of the entry, and you can click through to see the full reading list and then continue through to other titles and services. Crucially, this is helpful for checking against your records whether an academic has added a title to a reading list after receiving an inspection copy.

The resource is embedded in the university portal my.bham within a MyLibrary tab. This is a primary source of traffic to the site. It's early days for analytics, but at the start of term they saw the same amount of traffic from the my.bham portal as from Google Scholar. In addition, index-based searching is generating a lot of traffic from their users.

During the implementation they decided to:
  • still provide database-level links to the native interfaces
  • provide a library-catalogue-only search, but within FindIt@Bham
  • set 'everything' as the search default, but enable limiting the scope
  • link the SFX component of the MetaLib library catalogue to the reading list management system and the University of Birmingham Research Archive (UBIRA).
They dispensed with the A-Z list and pre-search limiters and now rely on post-search filtering facets. They also dispensed with ebook MARC records as metadata input and now harvest directly from SFX. It was a bold decision, but they have found that it works for them. The single search has also been integrated into the portal and the library services homepage.

Price flagged the importance of metadata for discovery. It supports linking to the appropriate copy; allows an appropriate set of links to be presented in a single place; allows the library to accurately and comprehensively display its entire portfolio; accurately depicts the entitled coverage for each user; and allows users to find keywords in full text, not just abstracts.
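The 'appropriate copy' point rests on link resolution, which at Birmingham is handled by SFX (mentioned above). As a rough, hypothetical sketch of how a discovery layer hands a citation to a resolver via OpenURL - the resolver base URL and all the metadata below are invented for illustration:

```python
from urllib.parse import urlencode

# Hypothetical institutional link-resolver base URL (SFX-style).
RESOLVER_BASE = "https://findit.example.ac.uk/resolver"

# Article-level metadata of the kind a discovery layer passes to the
# resolver so it can decide which entitled copy to link to.
citation = {
    "url_ver": "Z39.88-2004",                       # OpenURL 1.0
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",  # journal article format
    "rft.issn": "1234-5678",                        # invented ISSN
    "rft.volume": "12",
    "rft.issue": "3",
    "rft.spage": "145",
    "rft.atitle": "An example article title",
}

print(RESOLVER_BASE + "?" + urlencode(citation))
```

The resolver then uses its knowledge base - the title-level metadata Price returns to below - to translate the citation into a link to a copy the user is entitled to.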

As 'Resource Discovery Service' isn't the most exciting or engaging title, they ran a competition among staff for a new brand name. There were 80 suggestions, but the winner - FindIt@Bham - was felt to tie in well with the overall university brand. They thought long and hard about integrating the Birmingham brand and used pictures of the distinctive campus to customise the out-of-the-box product. They have integrated with the University portal and VLE, and embedded the search in the library Facebook page. Other marketing and promotion included:
  • social media
  • lots of work with the Student Guild
  • postcards/bookmarks
  • university staff and student newsletters
  • focus groups, training and briefing sessions
  • integration and prominent website advertising
  • university-wide plasma screens.
It's early days in terms of measuring impact, but they are reviewing post-launch user feedback and have a continuous improvement strategy and a post-launch authority group in place. They will analyse future quality measures, service and resource usage, and benefits realisation. They expect a big hike in full-text usage, anticipate a massive impact on their ratings, and anticipate seeing value added throughout the supply chain.

It has been interesting to compare results with Google Scholar for certain specific searches. Google Scholar only gives generic results, with no library entitlement, and for one title searched the top result was a JSTOR pdf of a similar book rather than the one sought - their system is much more precise.

When addressing concerns about wider access to content, a demonstration showed that while Google will present the results, it won't present the full text unless it is free to access or the viewer can log in with an entitlement through the library system. Authentication isn't embedded without library intervention - the link resolver.

Already, in comparison with Google Scholar searches, library discovery is context sensitive and the results are more focused. Library discovery also adds value by grouping resources by subject and offering scholarly recommender services.

Her advice to publishers on how to integrate titles into the system includes:
  • send your title-level metadata to link resolvers (KBART; see the sketch after this list)
  • keep it up-to-date (cessations, title changes, etc.)
  • provide your deep linking algorithm
  • allow discovery platforms to harvest your metadata
  • don't be exclusive, be promiscuous!
  • assess usage patterns following integration.
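To make the first point concrete: KBART title lists are tab-separated files with standard column headings. Below is a minimal sketch that writes a list using a subset of the KBART fields; the titles, identifiers, URLs and the filename are invented, and a real list carries the full field set.

```python
import csv

# A subset of the standard KBART column headings.
FIELDS = [
    "publication_title", "print_identifier", "online_identifier",
    "date_first_issue_online", "date_last_issue_online",
    "title_url", "title_id", "embargo_info", "coverage_depth",
]

titles = [  # invented example rows
    ["Journal of Examples", "1234-5678", "8765-4321", "1995-01-01", "",
     "https://publisher.example.com/joe", "JOE", "", "fulltext"],
    ["Example Review", "2345-6789", "9876-5432", "2001-01-01", "",
     "https://publisher.example.com/exr", "EXR", "R1Y", "fulltext"],
]

# Write a tab-separated title list for link-resolver knowledge bases.
with open("publisher_alltitles_2012-10-01.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(FIELDS)
    writer.writerows(titles)
```

Keeping this file current (cessations, title changes, transfers) is exactly the second point on the list: the knowledge base is only as good as the last feed.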
She concluded by saying that integration with library discovery tools is essential to drive usage. This needs to be based on industry good practice, and there is a growing body of evidence of usage increasing (and decreasing) depending on RDS integration.

Thursday, 27 September 2012

David Sommer: COUNTER - New Measures for Scholarly Publishing

David Sommer is a consultant working with a range of publishing clients to grow products and businesses. He has also been a contributor to the COUNTER project, and he completed the morning session at the To Measure or Not to Measure - Driving Usage seminar.

He provided an overview of the latest COUNTER Release 4. The main objectives for the update were to provide a single, unified Code covering all e-resources, including journals, databases, books, reference works, multimedia content, etc. They wanted to improve the database reports and the reporting of archive usage. The update will enable the reporting of mobile usage separately, expand the categories of 'Access Denied' covered, improve the application of XML and SUSHI in the design of the usage reports, and collect metadata that facilitates the linking of usage statistics to other datasets such as subscription information.

The main features of the update are:
  • a single, integrated Code of Practice
  • an expanded list of definitions, including Gold OA, Multimedia Full Content Unit, Record View, etc.
  • an improved database report that includes reporting of result clicks and record views in addition to searches (sessions removed)
  • enhancement of the SUSHI (Standardised Usage Statistics Harvesting Initiative) protocol, designed to facilitate its implementation by vendors and its use by librarians
  • a requirement that Institutional Identifiers, Journal DOI and Book DOI be included in the usage reports, to facilitate not only the management of usage data, but also the linking of usage data to other data relevant to collections of online content
  • a requirement that usage of Gold OA articles within journals be reported separately in a new report - Journal Report 1 GOA
  • a requirement that Journal Report 5 be provided (an archive report, broken down by year, so you can understand journal usage relative to what you are paying)
  • modified Database Reports, in which the previous requirement to report Session counts has been dropped and new requirements to report Record Views and Result Clicks have been added (Database Report 3 has also been renamed Platform Report 1)
  • a new, optional Multimedia Report 1, which covers the usage of non-textual multimedia resources such as audio, video and images, reporting the number of successful requests for full multimedia content units
  • new optional reports covering usage on mobile devices
  • a description of the relative advantages of logfiles and page tags as the basis for counting online usage
  • flexibility in the usage reporting period, allowing customers to specify a date range for their usage reports (see the sketch below).
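The last point, date-range flexibility, is easy to picture with a small sketch. The table below is a simplified, invented stand-in for monthly full-text request counts, not the exact Release 4 report layout:

```python
from datetime import date

# Invented monthly full-text request counts for one journal,
# keyed by (year, month) - a simplified stand-in for a JR1-style row.
monthly_requests = {
    (2012, 1): 410, (2012, 2): 388, (2012, 3): 455,
    (2012, 4): 392, (2012, 5): 301, (2012, 6): 276,
}

def usage_in_range(counts, start, end):
    """Sum the monthly counts for months falling within [start, end]."""
    return sum(
        n for (y, m), n in counts.items()
        if start <= date(y, m, 1) <= end
    )

print(usage_in_range(monthly_requests, date(2012, 2, 1), date(2012, 4, 1)))
# -> 1235 (Feb + Mar + Apr)
```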
Sommer posed an interesting question: what is a mobile device? They have used the WURFL list to define this. The timetable for implementation includes a deadline of 31st December.

Sommer then provided a useful background to Usage Factor (UF), which is designed to complement citation-based measures. While Impact Factors, based on citation data, have become generally accepted as a valid measure of the impact and status of scholarly journals, and are widely used by publishers, authors, funding agencies and librarians, there are misgivings about over-reliance on them. The idea is not to try to kill them off, but to provide other measures to use alongside them, particularly as Impact Factors don't work so well for non-STM disciplines.

Usage Factor provides a new perspective: a complementary measure that will compensate for the weaknesses of Impact Factors in several important ways:
  • UFs will be available for a much larger number of journals
  • all fields of scholarship with online journals will be covered
  • the impact of practitioner-oriented journals is better reflected in usage
  • authors will welcome it as a way to build their profiles.
Four major groups will benefit: authors (especially in practitioner-based fields) who lack reliable global measures; publishers; librarians; and research funding agencies seeking a wider range of credible, consistent quantitative measures of the value and impact of the research output they fund.

The aims and objectives of the project have been to assess whether UF will be statistically meaningful, accepted, robust and credible, and to identify the organisational and economic model. They started in 2007-2008 with market research, including 29 face-to-face interviews across interest groups as well as web survey responses from 155 librarians and 1,400 authors.

Stage two focused on modelling and analysis and involved relevant bodies, publishers and journals. The recommendations included:
  • UF should be calculated using the median rather than the arithmetic mean
  • a range of UFs should ideally be published for each journal: a comprehensive UF plus supplementary factors for selected items
  • UFs should be published as integers - no decimal places
  • UFs should be published with appropriate confidence levels around the average to guide their interpretation
  • UF should be calculated initially on the basis of a maximum usage time window of 24 months
  • UF is not directly comparable across subject groups and should therefore be published and interpreted only within appropriate subject groupings
  • UF should be calculated using a publication window of two years.
There seems to be no reason why ranked lists of journals by usage factor should not gain acceptance. However, small journals and titles with fewer than 100 downloads per item are unsuitable candidates for UF, as their figures are likely to be unreliable.
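Pulling the recommendations together, the calculation itself is simple: the median usage per item, published as an integer, for items in the two-year publication window with usage counted over the 24-month usage window. A minimal sketch, assuming the per-article download counts have already been filtered to those windows (the threshold guard reflects the reliability point above):

```python
from statistics import median

def usage_factor(downloads_per_item):
    """Median downloads per item, reported as an integer (no decimals).

    downloads_per_item: usage counts for items published in the
    two-year publication window, counted over a usage window of
    up to 24 months, per the project recommendations.
    """
    # Titles averaging fewer than ~100 downloads per item were judged
    # too unreliable for a published UF.
    if sum(downloads_per_item) / len(downloads_per_item) < 100:
        return None
    return round(median(downloads_per_item))

print(usage_factor([120, 340, 95, 210, 180, 400]))  # -> 195
```

Using the median rather than the mean keeps a handful of heavily downloaded articles from inflating the figure, which is also one line of defence against gaming.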

Stage three involves testing. The usage data will be used to investigate the following:
  • the effect on UF of using online publication date versus date of first successful request
  • calculation and testing of UF for subject fields not yet covered
  • further gaming scenarios and how they can be detected
  • the stability of UF for low-UF journals, confirming the level below which it shouldn't be provided.
This will deliver a Code of Practice which will include definitions, methodology for calculation, specifications for reporting and independent auditing as well as a description of the role of the Central Registry for UF and funding model.

David closed with a summary of PIRUS2, whose mission is to develop a global standard enabling the recording, reporting and consolidation of online usage statistics for individual journal articles hosted by institutional repositories. Further information is available online.

Vanja Merrild: Marketing Channels for the World's Largest Open Access Publisher

The second session at our To Measure or Not to Measure seminar was presented by Vanja Merrild, a digital marketing specialist working with BioMed Central. Now part of SpringerOpen, BioMed Central has 243 open access journals across biology, medicine and chemistry; 52 journals are society affiliated and 121 have Impact Factors.

Their network of sites and users has:
  • 32 million page views a month
  • over 5 million registered users
  • over 380,000 recipients of the fortnightly BioMed Central newsletter
  • 14,000 new registrants a month
  • a Google PageRank of 8.
As an open access publisher their focus is on submissions: they are author driven, use their own in-house submissions system and concentrate on author data.

Vanja focused on providing best-practice advice, drawn from their experience, on how to drive traffic and usage to content. Her suggestions for email best practice are:

  • build your lists
  • have a good welcome programme - what follows after they've signed up
  • make it easy for recipients to add your email to their address book
  • make it easy to sign up
  • maintain consistency in 'From' lines - builds recognition of a trusted source.

She spoke candidly of the action they took recently to boost results for a newsletter that was dipping:

  1. they made it easier to get to the interesting, relevant content (three rather than five clicks through to the website)
  2. they noted lots of forwards, so added a recommend button
  3. they captured the forwarded email addresses
  4. they made it easier to sign up on the website and from within the email

Segmentation is hugely important: the same message is not relevant for everyone. Editors, authors, members, librarians, scientific interests: each needs its own message. You should test and measure your hypotheses, check spam filters, and look at appearance on mobile devices and in different email clients (particularly the ones your audience uses).

Vanja suggested a range of factors to test and measure. Keep track of your sender reputation with services such as Return Path's senderscore.org. Understand what works where. Check out your 'sleeping beauties' - contacts who need waking up - and find out at what point readers abandon your emails.
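As a hypothetical illustration of 'test and measure' combined with segmentation, a basic segment-level report from a send log might look like the following (the event data and segment names are invented):

```python
from collections import defaultdict

# Invented send-log events: (segment, opened?, clicked?)
events = [
    ("authors", True, True), ("authors", True, False),
    ("authors", False, False), ("librarians", True, False),
    ("librarians", False, False), ("editors", True, True),
]

stats = defaultdict(lambda: {"sent": 0, "opened": 0, "clicked": 0})
for segment, opened, clicked in events:
    stats[segment]["sent"] += 1
    stats[segment]["opened"] += int(opened)
    stats[segment]["clicked"] += int(clicked)

# Per-segment open and click rates show which messages resonate where.
for segment, s in sorted(stats.items()):
    print(f"{segment}: open rate {s['opened'] / s['sent']:.0%}, "
          f"click rate {s['clicked'] / s['sent']:.0%}")
```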

She then went on to provide some best practice guidelines for tweeting and Facebook posts:

  • share more about others than you do about yourself
  • promote others and build relationships
  • have a distinct voice
  • images are important.

Your social media strategy should focus on increased ROI for your business and your time. Create safe tests to experiment and don't make this only one person's role. Be iterative: plan, execute, measure, adjust, repeat. Understand which metrics matter and which relate to your goals. Track metrics before, during and after to show return on investment, and consider benchmarks to better understand what they actually mean. Use analytics not only as a reactive tool to see how you did, but as a proactive tool to hone your branding messages. She closed with suggested tools to measure your activity, including Twitter Counter, TweetReach and TweetStats.

Breda Corish: From 20 Million Pieces of Content to a New Clinical Insight Engine: ClinicalKey

The first session at To Measure or Not to Measure: Driving Usage to Content - Marketing, Measurement and Metrics seminar was presented by Breda Corish, Head of Clinical Reference for UK and Northern Europe at Elsevier.

The focus was on publisher products that drive users to content. ClinicalKey is Elsevier's 'clinical insight engine': designed to think like a physician, it provides information for diagnosis at the point of care, the ability to share relevant answers, and a resource for maintaining knowledge.


It took three years of development work to build a product platform that can answer questions posed within a clinical care context. The scale of information overload is immense: back in the early 90s, the challenge for doctors was that medical knowledge doubled every 19 years. In 2008 the doubling time was down to 18 months. Now the forecast is that by 2020, it will double every 73 days.


This creates the doctor's dilemma: how to access trusted information quickly, with a seamless experience. The challenge for Elsevier was how to unite 20 million pieces of content into one seamless experience, including existing products such as MD Consult, Journals Consult, Procedures Consult, etc.

They started by understanding users. 'Mechanics' are driven by visual, procedural content; doctors are extremely time-pressed and require pre- and post-procedural care resources, with well-defined, fairly narrow, but deep information requirements.

They focused on understanding the patient care management workflow: from diagnosis to creating a care plan, from medical treatment to after-treatment care plans and patient education and compliance. They also identified collateral workflows for doctors on keeping current, sharing information and not working in isolation, but as part of a multi-disciplinary team.

Using this knowledge they moved from unstructured to structured content, turned it into smart content, and made it work in the clinical setting. They created the Elsevier Merged Medical Taxonomy (EMMeT) using 250,000 core clinical concepts, 1 million+ synonyms, 1 million+ hierarchical relationships and 1 million+ ontological relationships.

Through concept mapping they focused on making vast amounts of content easily discoverable using speciality-specific navigation, dynamic clinical summary creation and meaningful related content recommendations. The semantic taxonomy was adapted for the clinical setting and semantic relationships are used to suggest other content (e.g. clinical condition, procedures, etc). Weighted tags are their 'secret sauce for better search'.
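EMMeT's internals aren't public, but the general idea of weighted-tag search can be sketched: documents carry concept tags with editorial weights, query terms are normalised to canonical concepts via the synonym list, and results are ranked by the matched weights. Everything below - names, weights, mappings - is invented for illustration:

```python
# Invented synonym -> canonical concept mapping (EMMeT-style).
SYNONYMS = {"heart attack": "myocardial infarction",
            "mi": "myocardial infarction"}

# Invented documents: concept tags with editorial weights.
DOCS = {
    "doc1": {"myocardial infarction": 0.9, "troponin": 0.4},
    "doc2": {"myocardial infarction": 0.3, "angina": 0.8},
    "doc3": {"stroke": 0.9},
}

def search(query):
    """Rank documents by the total weight of concepts matching the query."""
    concept = SYNONYMS.get(query.lower(), query.lower())
    scored = [(sum(w for c, w in tags.items() if c == concept), doc_id)
              for doc_id, tags in DOCS.items()]
    return [(doc_id, score) for score, doc_id in
            sorted(scored, reverse=True) if score > 0]

print(search("heart attack"))  # -> [('doc1', 0.9), ('doc2', 0.3)]
```

The same weighted relationships that rank results can drive the related-content recommendations described above: concepts linked in the taxonomy suggest other conditions, procedures or drugs worth surfacing.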

The smart content infrastructure is based on four areas:

Product development and enhancement

  • more accurate search results
  • faceted navigation
  • improved content discoverability

Content analytics

  • greater insights into what we publish
  • identification of co-occurring terms
  • link to related external content and data

Personalization

  • individual content recommendations
  • targeted individual marketing
  • contextual advertising

Editorial productivity

  • flexible product types - new collections, image banks, etc.
  • increased speed to market

Usage tracking is based on usage events rather than page views. They have COUNTER-compliant content reports and monthly institution reports based on COUNTER filtering rules. They produce usage reports for different content types (e.g. books, full-text articles, FirstConsult, Medline, guidelines). Every piece of content is tagged so that usage reports can be produced. Monthly usage event reports include analysis of discovery (search, browse), content usage and Advanced Tools usage.

With performance metrics, they want to keep the number of searches per content view small, as this is key to delivering relevant content quickly. They take insight from the usage reports on what search terms people are using and feed those terms back into the product.
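That searches-per-content-view figure can be read straight off a usage-event stream. A minimal, hypothetical sketch (event names and counts invented):

```python
from collections import Counter

# Invented usage events: (month, event_type)
events = [
    ("2012-09", "search"), ("2012-09", "search"),
    ("2012-09", "content_view"), ("2012-09", "content_view"),
    ("2012-10", "search"), ("2012-10", "content_view"),
    ("2012-10", "content_view"), ("2012-10", "content_view"),
]

counts = Counter(events)
for month in sorted({month for month, _ in events}):
    ratio = counts[(month, "search")] / counts[(month, "content_view")]
    print(f"{month}: {ratio:.2f} searches per content view")
# A falling ratio suggests users are reaching relevant content
# with fewer searches - the behaviour the team is optimising for.
```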

A recent Outsell report identified that the 'development of such taxonomies and their use to power the semantic enrichment of collections and aggregations of content will increasingly need to become core competencies for publishers further along the digital transition to higher value-added services.' This is something Elsevier has engaged with directly.


Their plans for the future involve adapting for international markets, using the same content and powerful functionality but adding geography-specific publications. They are also looking at developing interfaces for local languages. Even in cash-strapped health care systems around the world there is still investment in IT, and there is potential for mobile devices and tablets in hospitals: doctors need information when they are on the move, not desk-bound.

They want to do more with content, more with features and functionality for doctors and end users, and to develop the product further for use any time, any place, anywhere. They are looking at how to integrate with clinical and patient record portals and systems so that the experience isn't disrupted, e.g. developing query buttons that ping off to the other database. Something like 80% of hospitals in the UK are still in the early stages of developing these services, so there's great potential.

They are currently selling to institutions, but have recently launched a service for individuals, focused on their particular clinical specialism with the option to add on. Overall, their aim is to hit the 'sweet spot': comprehensive, trusted and fast to answer.