Wednesday 29 April 2009

New suite of tools to enhance search capabilities

I thought this announcement from California-based 'DeepDyve' would be of particular interest given the growing use of semantic web techniques.


DeepDyve Unveils Suite of Tools for Publishers

New Tools Enable Advanced Search and Discovery

APRIL 28, 2009 – Sunnyvale, CALIF — DeepDyve, the research engine for the Deep Web, today unveiled a suite of tools for publishers and scientific societies of all sizes that want to enhance the search capabilities on their websites. The suite of tools leverages DeepDyve’s KeyPhrase™ algorithm, which allows users to input whole sentences, paragraphs or entire articles as their query to find related results.
"Our vision is that search is becoming more sophisticated and more decentralized. Increasingly, users are initiating their research online and they want to have search integrated seamlessly with their reading and browsing behavior — in other words, they want their content to be their query for finding comprehensive answers to difficult questions. These tools give our partners the ability to make their content more findable and to demonstrate the breadth and depth of their collection," said William Park, CEO of DeepDyve. "We're making available some of our most frequently used search capabilities to publishers that want to give their visitors a more compelling search experience."

The products being announced today are designed first to make the publisher's content more discoverable in search engines, which is where, according to a report from Outsell, more than 70% of users begin their research. From there, other tools are available to increase engagement at the publisher's site by allowing users to quickly find related articles based on what they are viewing.

Publisher Landing Pages
Publisher Landing Pages use DeepDyve's technology to enhance the findability of publisher content by Google and other search engines. These co-branded or private label pages present the Web searcher with not just one article at a time, but with a whole page full of closely related articles from the same publisher, with links pointing them to the publisher site. One Landing Page can be set up for each article, yet they are easy to deploy, totally automated and hosted by DeepDyve.

Custom Search API
DeepDyve's next-generation search technology is available to publishers via a web services API (Application Programming Interface). The API can be set up to search only a publisher's own content, or to help users discover other highly relevant documents in the DeepDyve index. Searches can be launched with a few keywords, or by allowing users to use a paragraph or an entire document as a query to find articles that match the concepts described. Results are returned via an XML feed or as a hosted, co-branded web page.

More Like This Document API

The DeepDyve More Like This Document API enables websites to directly interface with the DeepDyve database to search for articles that are similar to a designated document within the DeepDyve index. It is designed for use by publishers whose content has been indexed by DeepDyve and who would like to include a 'related articles' functionality on their site without the painful implementation. The user may select any document to use as a query, and the title and body are compared to other documents in the DeepDyve index. The resulting documents can be limited to a publisher's own content, or may include other content in the DeepDyve index. Results are returned via an XML feed or as a hosted, co-branded web page.

Content Highlight Widget
The Highlight Widget enables users at the publisher site to simply highlight any block of text up to 5,000 characters, then run that selection as a query. DeepDyve returns only the Publisher's articles in the search results via an XML feed or as a hosted, co-branded page of results.
DeepDyve is a search engine that was developed to scour the depths of the so-called Deep Web, the vast collection of information-rich content that is largely overlooked by today's traditional search engines. Since the company's launch in September 2008, DeepDyve has worked closely with major publishers, building an index with hundreds of millions of pages that showcases content from the industry's most respected research organizations, academic institutions and professional associations. The API tools that are being announced today are the next step in DeepDyve's vision for enabling publishers to better utilize the Internet to reach as large an audience as possible.

Early Customer Adoption
The Public Library of Science (PLoS) is one of the first organizations to implement DeepDyve's search technology.
"We are looking forward to DeepDyve powering the search on," said Peter Jerram, CEO of The Public Library of Science. "DeepDyve's cutting-edge technology and ability to use entire sentences as a query will make it much easier for our users to find and discover new original research in science and medicine."

Pricing and Availability
Each of these tools is available for free with advertising revenue sharing, or for a fee which varies depending on volume.

About DeepDyve
DeepDyve, formerly known as Infovell, is the research engine for the Deep Web. DeepDyve unlocks the vast and rich collection of information that is out on the web, but is hidden from today's search engines. Using DeepDyve, people find the in-depth, high-quality information they need to answer tough research questions. The company is headquartered in Sunnyvale, CA, with offices in Shanghai, China. To learn more about DeepDyve, go to or call 1-408-773-0110.

Thursday 23 April 2009

RCUK publishes new open access report

RCUK has published its long-awaited open access report.

The language used in the release is moderate. Ian Diamond, Chair of RCUK, is quoted as saying “The Research Councils look forward to working with their partners across the research community to consider the options”, which seems like a responsible attitude to me.

The headlines are:
That over time (my emphasis) the UK Research Councils will support increased open access, by:
* building on their mandates on grant-holders to deposit research papers in suitable repositories within an agreed time period; and
* extending their support for publishing in open access journals, including through the pay-to-publish model.

I haven’t read the full report yet and will post further comments in due course...


Tuesday 21 April 2009

ACT NOW TO PROTECT YOUR RIGHTS! - important deadline: May 5

Published or sold a book in the United States before January 5 2009? Then you need to pay attention!

As you probably know, May 5 2009 is a very important deadline in the Google Books Settlement. This is the date by which rightsholders need to opt out of the class action settlement. You don’t need to have been involved in the instigation of the class action to be affected; everyone who has published or sold a book in the United States at any time before January 5 2009 may be involved.

You are strongly urged to decide whether to opt out or not. Do not let this deadline pass without actively making a decision!

ALPSP has not adopted a position on the Google Books Settlement and individual rightsholders will obviously need to come to their own decision on whether the terms of the settlement are right for them.

I won’t go into the detail here, but here are a few things about the settlement that will hopefully help you determine whether you need to find out more:

It is about books (and some inserts to books) only. Some journals have been included in Google’s digitization program but these are not included in this settlement.

Google has apparently digitized around 7 million books. Of these, about one million are out of copyright and are therefore not subject to the settlement. Around a further one million are still in print and have been digitized with the agreement of publishers under the partner program; these are also not part of the settlement. The remaining 5 million books are in copyright but out of print and are the subject of this settlement (as are, as I understand it, any in-copyright, out-of-print books not yet digitized).

As far as I can tell, Google has made no attempt to identify the rightsholders of the ~ 5 million books that are the subject of this settlement. In fact they may use the ‘opt-out’ procedure of the class action as justification for their actions if the rightsholders have not come forward. Google terms these ‘orphan works’ (i.e. works where the rightsholder cannot be identified) but as I say I believe that no due process to identify rightsholders has actually been carried out.

The settlement does not cover most of the images in the books which have been (or will be) blanked out.

Access to the digitized versions of these books will only be from the USA (those involved more closely appear confident that this can be achieved but it seems an enormously difficult task to me…)

You need to have the US copyright interest in the book (so if you have retained worldwide rights you are included).

If you choose to opt out of the settlement you are retaining the right to sue Google for copyright infringement with respect to their digitization of these books. However, it strikes me that even if you have no intention of suing Google you will be sending a message that you do not feel that the settlement is in your interests, and you will be objecting to Google’s activities in this area.

If you choose to opt out of the settlement you will not benefit from the payments from Google for the digitization of these works nor from any income generated in the future by the books included in the service. In addition you will not be able to formally comment on the terms of the settlement.

The Publishers Licensing Society have provided some additional information and, in case you've not seen it, a copy of the legal notice summary and full notice.

More information:
The settlement -
FAQs -
To opt out -

You can also email the Settlement Administrator at

Please note that ALPSP is unable to give legal advice regarding this settlement. If you are in any doubt you should consult a lawyer.

Saturday 18 April 2009

Founders of The Pirate Bay jailed for copyright infringement

On the same day as the Digital Britain Summit, the four founders of BitTorrent tracker site The Pirate Bay were handed prison sentences and ordered to pay USD 3.6m in damages for assisting copyright infringement of 'about 30 works'.

The Pirate Bay displays utter contempt for copyright, as demonstrated by the legal disputes section of its website, and has steadfastly refused to remove links to infringing content. It's great to see the Swedish courts taking the matter seriously and dishing out sentences that will hopefully serve as a deterrent for others.

On name-dropping and argument at the Digital Britain Summit

On Friday 17 April I attended the Digital Britain summit at the British Library in London.

The day covered the whole breadth of the topic, from the cable, fibre and wireless infrastructure needed to deliver Next Generation Access (NGA) through to the needs and confidence of consumers, to the issues around intellectual property (IP) and the protection of creativity and innovation.

We were treated to four government ministers including the Prime Minister, Gordon Brown, himself (not that he said much of relevance but it does, as we were frequently reminded, show the seriousness with which the UK government views this issue).

On the subject of piracy and the respect of IP, or lack of it, in the digital environment I got into a heated debate with two middle-aged attendees over coffee who refused to accept that piracy and illegal file sharing had any negative consequences at all (I mention age because we are so frequently told that it is only the young who are so naive about such matters). Instead I was told that there were no costs to producing content (which must come as a bit of a surprise to, for example, the Hollywood film studios!), that copying and distribution were free and that file sharing was free marketing. Of course when I asked how creators and rightsholders should recoup their investment I was told that it's for us to figure out and if we don't someone else will. Great.

The highlight for me was a long chat with Feargal Sharkey, former front man for The Undertones, chart-topping solo artist and now CEO of UK Music. UK Music are doing a huge amount to expose the impact that illegal file sharing and piracy are having on the creative industries and are banner wavers for all of us...

Friday 17 April 2009

ALPSP Free Seminar at London Book Fair: Surviving the future - how authors' rights are impacting scholarly publishing

Going to the London Book Fair on Monday 20 April? Don’t forget the free ALPSP seminar. There is no need to book – just come along on the day.

Surviving the future - how authors' rights are impacting scholarly publishing

Westminster Room, Earls Court One – 12:30 – 13:30

Chair: John Cox, Director, John Cox Associates

Fiona Kearney, Director, UK Business Development and Rights, Oxford University Press
Grace Baynes, Corporate PR Planner, Nature Publishing Group

Our seminar will concentrate on "Authors' rights", based on findings from the ALPSP Scholarly Publishing Practice Survey (2008). When authors grant a licence to journal publishers to publish their articles, what they can do with this material in other ways (putting the accepted manuscript on their own or institutional website, for example) is changing a great deal. Different publishers have different attitudes to how their journal article content is made available. While being concerned not to undermine the "exclusivity" of their own version, they are conscious that they have to find ways of accommodating institutional repositories and other routes. John Cox will be describing what the industry is currently doing, and two publishers will provide case studies of their own experience. So while the seminar is based on the fact that copyright materials can be licensed, it will be more about the form of licensing than copyright as such.
Note: there is no charge to attend this seminar but delegates will need a visitor pass for the Book Fair to enter the Exhibition. To pre-register visit:

For further information please contact Diane French (email) +44 (0)1827 709188

Wednesday 15 April 2009

Web Seminar: Improving the Copyediting Workflow: How Can Technology Help?

ALPSP has a close working relationship with SSP (the Society for Scholarly Publishing) and in fact we publish our journal, Learned Publishing, in cooperation with SSP.

We also occasionally cross-promote events and things that we think will be of interest to our members. A forthcoming web seminar ‘Improving the Copyediting Workflow’ is one such thing – details below!

Register now for the SSP/AAUP Web Seminar: Improving the Copyediting Workflow: How Can Technology Help?
May 7, 2009 - 1:00-2:30 PM EDT

The copyediting function is one of the most expensive and time-consuming parts of the publishing workflow, and has recently come under increasing scrutiny from publishers seeking cheaper and faster ways of delivering content to market. Over the last decade, a variety of tools, ranging from Microsoft Word macros to entire SGML/XML editing environments, have been deployed, using a variety of in-house and commercial expertise. This seminar explores the impact of technology on copyediting in the digital age. It features two case studies from scholarly publishers who have reinvented their copyediting workflows, placed in context by a technical expert who will survey the latest developments in editing software.

Who should attend?

Full of practical suggestions and honest assessments, this seminar will be invaluable to production editors, managing editors, and senior management at small to medium-sized publishers of scholarly content. Under a new collaboration, both SSP and AAUP members are entitled to special "member-only" rates.

Join us if you want to:

Benchmark your own copyediting workflow against successful publishers who have reinvented theirs.
Assess your organization's technology needs, and learn the pros and cons of different options.
Understand which copyediting tasks can be simplified through the application of technology, and which cannot.

Learn about the strategic issues involved in balancing in-house staffing with outsourced services.
Get practical tips for keeping copyediting projects on time, on budget, and under control.

John Muenning, University of Chicago Press
Michael Haskell, Columbia UP
Scott Beebe, Oxford University Press

Greg Suprock, VP & General Manager, Content Services, Cadmus Communications

Registration: Fees are $99 for SSP or AAUP members/$149 for non-members.
See the SSP Web site for additional details and to register:
Remember, with web seminars, all you need is a telephone and a computer with Internet access. Registration is per-computer; you can invite as many staff as you like to participate using a single speakerphone and projector. All participants receive a recording of the seminar after the event.

Wednesday 8 April 2009

Free online seminar on the Google Books Settlement

Those following the Google Books Settlement (or trying to) will be interested in an upcoming web-based information session offered by the Copyright Clearance Center (CCC). Details and signup information are below.

CCC is offering a free, informational online seminar titled The Authors Guild, AAP, Google Settlement: What Authors & Publishers Need to Know as May 5th Approaches on Tuesday, April 14th at 12:00 pm ET / 16:00 GMT. This session features publishing attorney and copyright expert, Lois Wasoff, who will focus on key points of the settlement to help publishers, authors and literary agents understand their options as the opt-out date of May 5th approaches.

For more information and to register, visit the CCC website.

Tuesday 7 April 2009

Accessibility Action Group Newsletter just posted

It must be newsletter day!

Issue 5 of the Publisher Accessibility Newsletter is now available. This is produced quarterly by industry trade bodies and licensing/standards organisations under the umbrella of the ‘Accessibility Action Group’ and is "designed to help publishers meet the precise requirements of people with reading impairments".

Inside you will find an overview of current activities in progress – both in the UK and abroad. We hope you will find the information contained in this document both interesting and educational.

Latest issue of our e-newsletter, ALPSP Alert, now available!

The April issue of ALPSP Alert, our monthly e-newsletter, is now available and is FREE to everyone at ALPSP member organizations.

To access ALPSP Alert you will have to log in to the website (you can request a username and password from the homepage if you need one).

Please note that you may need to refresh the page after logging in for the link to ALPSP Alert to appear.

Monday 6 April 2009

ALPSP Awards 2009

Applications are invited for the 2009 Awards: Best New Journal, Publishing Innovation and, new for 2009, Best eBook Publisher. The winners will be announced at the Awards Dinner on 10 September at the ALPSP International Conference. The closing date for applications for all three awards is 17 June 2009. Full Details; Download brochure (pdf).

ALPSP Award for Publishing Innovation
sponsored by EBSCO

ALPSP Award for Best New Journal
sponsored by DHL Global Mail

ALPSP Award for Best eBook Publisher
sponsored by Ingram Digital

Society of Indexers: Wheatley Medal

The Society of Indexers invites publishers to submit nominations for the prestigious Wheatley Medal for an outstanding index.

Indexes published between 1 January 2008 and 30 April 2009 are eligible for nomination and the deadline for nominations is 30 April 2009.

Further information and the nomination form are available from the Society of Indexers' website.

Friday 3 April 2009

Library of the Future meeting

I attended an interesting JISC-sponsored meeting on 2 April 2009. The Libraries of the Future debate was hosted by Oxford University and was expertly chaired by Vincent Gillespie (J.R.R. Tolkien Professor of English Literature and Language at the University of Oxford). It was the first event that I've been to that was simultaneously available in Second Life and webcast, and it also featured a screen alongside the speakers which displayed real-time comments from those in the auditorium and watching on the web.

There was actually a fair amount of discussion about the publisher of the future and, as the debate wore on, a fair amount of publisher bashing too. Harvard University librarian Robert Darnton spoke vitriolically of "extortionately priced journals" and received a smattering of applause when he suggested that librarians should just stop buying journals. The trouble is that this fatuous remark does little to help anyone or to advance the debate. I am sure he has no intention of ceasing the purchase of journals, and if he did, how would the academics at Harvard continue to do their research without access to the literature?

The issue of peer review also came up, and along with it the familiar arguments: that it's all done for free by academics, so why does the world need publishers (completely ignoring the importance of journal brands in the peer review process), and that the future is in overlay journals on repositories.

Peter Murray-Rust did try to address the issue of the day. In a typically polemical talk he challenged the library community to get their act together with a call to "Just Do It". He also said that it's too easy to simply blame the publishers.

The presentations were, with one exception, all excellent, but I found the tenor of the meeting to be rather anti-private sector. Google (represented at the meeting by Santiago de la Mora, Partnerships Lead for Google Book Search in Europe) attracted criticism for making money via ads on their digitization programme and Google Book Search, and some commentators from the floor bemoaned the fact that the public sector in general, and libraries in particular, had not done this. I am happy to defend publishing and the importance of publishing, but the place to defend capitalism was 50 miles down the M40 motorway at the G20 summit.

I left with Peter Murray-Rust's call to "Just Do It" in my head. It struck me that if replicating the benefits of the peer review / journal brand system is that easy then the academic community or the librarians should 'just do it'; that if the librarians think that they and the students and faculty members they serve would be better off if they cancelled subscriptions to journals, they should 'just do it'; that if overlay journals are such an obvious and easy solution (even though no overlay journal has emerged on the arXiv despite this being talked about for more than a decade) then someone should 'just do it'.

Better that than scoring cheap points in a debate.

The reason that I'm so keen for them to 'just do it' is twofold: the realization will then hit that this stuff isn't that easy and that the publishing industry is not full of fat cats creaming money from the academy and adding no value, and I am 100% confident that professional scholarly publishers can fulfil these functions more effectively and more efficiently.