By Simon Linacre, Digital Science – Silver sponsor of University Press Redux 2024.
With the hype about artificial intelligence reaching fever pitch, it is easy to forget that the importance of this technology is what it can do for you. To understand more about how publishers can benefit from recent advances, Simon Linacre spoke to Digital Science company Writefull about their AI-based academic language service.
In late 2023, Digital Science announced it had fully acquired Writefull, which aims to support users worldwide with all aspects of scholarly writing. Writefull had been part of the Digital Science stable for a while: it won Digital Science’s Catalyst Grant award in 2016 and has been part-owned by the parent company since 2018. As such, the startup represented Digital Science’s first major investment in AI language models, and it shows just how long AI technology had been around before Generative AI entered people’s consciousness in late 2022.
Writefull’s AI language models are trained on billions of sentences taken from millions of journal articles. This scale of training has to be matched with a strong commitment to data privacy, which means its models offer the best possible assistance to its users in activities such as academic writing, copy editing and making revisions.
In its first few years, Writefull expanded its language services to students and researchers at more than 1,500 institutions. Having such support available helps academic publishers down the line, of course, as they see improved standards of writing in the articles eventually submitted to them. However, Writefull also works directly with publishers and their copy editors through integrated workflows; these publishers include the American Chemical Society (ACS), Hindawi, the British Ecological Society and the Royal Society of Chemistry (RSC). In addition, Writefull’s APIs are integrated with Digital Science’s collaborative LaTeX editor Overleaf.
In an interview to learn more about where the idea for Writefull came from and how technology can help shape improvements in scholarly communications, CEO and co-founder Juan Castro says the idea for the company came, as it did for many Digital Science founders, when he was studying for his PhD in artificial intelligence.
“I have always been interested in linguistics, and the interface between artificial intelligence and how language is generated, understood, and how it can be analyzed,” says Juan.
“The idea of Writefull came all the way back to when Hilde [Writefull’s Applied Linguist Hilde van Zeeland] and I were doing our PhDs. Hilde was doing her PhD in applied linguistics, and there was always this question of: Couldn’t we use artificial intelligence to help authors with their academic language? And so it all started from there.”
Juan says the first versions of Writefull were based around how people have used different ‘chunks’ of language in the academic setting. These versions enabled users to search for phrases and see how often they appeared in published papers, or which synonyms were frequently used instead of certain words.
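To make the idea of a phrase search concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not Writefull’s implementation: the tiny `CORPUS` list and the function names `phrase_frequency` and `compare_phrases` are hypothetical, and a real service would search a far larger body of published text.

```python
import re

# Hypothetical toy corpus standing in for sentences from published papers.
CORPUS = [
    "These results suggest that the effect is robust.",
    "Our results suggest that further study is needed.",
    "The results indicate that the model overfits.",
    "These findings suggest that the method generalises.",
]


def phrase_frequency(phrase, sentences):
    """Count the sentences that contain the phrase as whole words (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(phrase.lower()) + r"\b")
    return sum(1 for sentence in sentences if pattern.search(sentence.lower()))


def compare_phrases(phrases, sentences):
    """Rank alternative phrasings by how often each one appears in the corpus."""
    counts = {phrase: phrase_frequency(phrase, sentences) for phrase in phrases}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for phrase, count in compare_phrases(
        ["results indicate that", "results suggest that"], CORPUS
    ):
        print(f"{phrase!r}: {count}")
```

An author unsure which of two phrasings is more common in published papers could compare the counts and pick the more established one.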
It was the development and application of deep learning techniques that opened up the first possibilities for Writefull to work with publishers. As Juan says, “The first use case we identified for publishers was to improve the language of authors’ manuscripts at submission. Hindawi was the first publisher to integrate Writefull this way. Later on, we realised that we could also use Writefull’s language models to evaluate the language quality of manuscripts, either at submission or later in the pipeline.”
“That really took us all the way to where we are now, where we've developed language models in-house that are very tailored to academic writing, and are applicable across a publisher’s portfolio. It helps them cut costs and increase efficiency, especially around timeliness.”
Now that Writefull is fully part of Digital Science, Juan and the team are looking forward to more conversations with publishers to understand their problems and see whether Writefull’s AI-based solutions can help. Juan believes that a lot of publishers have problems around language that could be tackled using Writefull’s AI. One example is the categorization of manuscripts by language quality, which can be used to evaluate the editing needs of submitted manuscripts, to evaluate the copy editing work done, and more: “By using Writefull’s categorization service, you can better budget for copy editing needs and time, and you're also reducing the time to publish.”
“Through our conversations with publishers, we have seen that many do not categorise manuscripts by editing needs, or they do it manually. Manual categorisation is very time-consuming and therefore hardly scalable, and may also lead to inconsistencies.”
Another benefit of paying more attention to the varying quality of manuscripts at the point of submission is that it levels the playing field earlier on for papers that may represent excellent research but are written in poor-quality English, a problem that can disproportionately affect authors from Global South countries. As AI and related technologies develop quickly around us, Juan sees more benefits feeding through to publishers in the future.
Juan thinks that, overall, we will see an improvement in quality. “Another use case we have is with one major chemistry publisher where they're using our Metadata API. The publisher has checks in place to ensure that the XML of the copyedited manuscript corresponds to the original docx or PDF. Before, a human would check all required fields manually: check the authors’ names and surnames, their affiliations, their address details, etc. One of the things the publisher wanted to do was to improve the quality of this process. They now use Writefull’s Metadata API to extract all the metadata to compare the original with the XML, and if there are any differences it will pop it up for a human to manually review.”
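The comparison step Juan describes can be sketched in a few lines of Python. This is a hedged illustration only: it does not call Writefull’s Metadata API or reproduce its response format, and the field names, the sample values and the helpers `normalise` and `find_mismatches` are all hypothetical. It assumes the metadata from the original file and from the XML have already been extracted into two dictionaries.

```python
def normalise(value):
    """Collapse whitespace and ignore case so trivial formatting differences are not flagged."""
    return " ".join(str(value).split()).casefold()


def find_mismatches(original, xml):
    """Return (field, original_value, xml_value) for every field that differs or is missing."""
    mismatches = []
    for field in sorted(set(original) | set(xml)):
        original_value = original.get(field)
        xml_value = xml.get(field)
        if (
            original_value is None
            or xml_value is None
            or normalise(original_value) != normalise(xml_value)
        ):
            mismatches.append((field, original_value, xml_value))
    return mismatches


# Hypothetical metadata extracted from the submitted docx or PDF.
original_meta = {
    "author_1_name": "Ana  García",
    "author_1_affiliation": "University of Somewhere",
    "title": "A Study of Things",
}

# Hypothetical metadata extracted from the copyedited XML.
xml_meta = {
    "author_1_name": "Ana García",
    "author_1_affiliation": "University of Somwhere",
    "title": "A Study of Things",
}

if __name__ == "__main__":
    for field, before, after in find_mismatches(original_meta, xml_meta):
        print(f"Review {field}: {before!r} vs {after!r}")
```

Only the fields that differ are passed to a person, so the human effort goes to the misspelt affiliation rather than to the fields that already match.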
“I think in general this and other processes will be more automated in the future, and that as a result, less manual checking and editing will be needed. As more and more material is submitted for publication, such automated services will become even more valuable in the future.”
It is clear, then, that the AI hype in scholarly communications that we have witnessed in the last year or so follows a quiet revolution that has been under way for many years, with startups like Writefull in the vanguard. However, it also appears that this is just the beginning, and the impact AI will have on academic publishers will be far-reaching in 2024 and beyond.
About the author
Simon Linacre has 20 years’ experience in scholarly communications, has lectured and published on the topics of bibliometrics, publication ethics and research impact, and has recently authored a book on predatory publishing. Simon is an ALPSP tutor and former COPE Trustee, and holds Master’s degrees in Philosophy and International Business.
About Digital Science
Digital Science started in 2010 by looking for ways to solve challenges they were facing as researchers themselves. Today, their innovative technologies empower organizations with insights, analytics and tools that advance the research lifecycle. Their six different product solutions - Dimensions, Altmetric, Writefull, Figshare, ReadCube, and Overleaf - help scholarly publishers to analyze data more effectively, track research outcomes, enhance author services, streamline workflows, and make collaboration more seamless.
About UP Redux 2024
The 5th ALPSP University Press Redux returns as an in-person event on 15 & 16 May 2024, in partnership with Edinburgh University Press. This is part of EUP’s 75th anniversary celebrations and will be held at the John McIntyre Conference Centre, The University of Edinburgh, Edinburgh, UK.