Tuesday 15 March 2016

An Uneven Distribution: Research and Scholarly Resources in the 21st Century (I)


There are many digital resources out there for scholars and students of ancient history and ancient languages, which are my main interests: a really useful searchable version of the classic Liddell and Scott Greek lexicon, for example, and the wonderful resources at the Perseus project; the Electronic Text Corpus of Sumerian Literature at Oxford (ETCSL); the Pennsylvania Sumerian Dictionary (PSD) at the University of Pennsylvania; the Melammu database on Assyria and Babylonia at the University of Helsinki; and the Thesaurus Linguae Graecae (TLG), which gives access to the whole corpus of Greek literature from Homer to the fall of Constantinople in 1453 CE, and so on. Most of these I had occasion to use in the course of writing The Sacred History of Being.

My bread and butter for many years was scholarly communications: we built and ran repositories and encouraged the open access deposit of scholarly papers in them. Open access, for anyone who doesn’t know, is an important subset of digital publishing; it is about improving the circulation and use of research by taking it out from behind publisher paywalls, where possible.

The availability of papers on open access terms, with appropriate licences, has been invaluable for many researchers, myself included. The physics community knows this better than any other part of academia, since they have long had the facility to upload their papers to Paul Ginsparg’s arXiv, a repository of electronic preprints (note the archaic description!) formerly run as a site using the File Transfer Protocol (FTP), based at Los Alamos from 1991 and hosted at Cornell since 2001. Papers become available via a searchable interface shortly after upload. CERN has also maintained an active repository in high energy physics for many years. To a significant extent, the Web itself owes the necessity of its invention to the need for an easy way to organise and disseminate the large collection of papers generated by the research done at CERN.

The publishing community, naturally, was not very happy about the idea when it began to be more generally promoted as a solution to a number of problems in contemporary scholarly communication: if all research were to end up freely available in institutional repositories, or on author websites, it would threaten the subscription fees that publishers charge academic libraries. Without those subscription fees, commercial publishing would find itself largely cut adrift from the academic business of doing research and disseminating it.

In the early days the publishing community made some minor concessions, in that they would perhaps permit a downloadable unformatted version of a paper in addition to the formally published version, which remained behind a paywall. Sometimes ‘unformatted’ was taken to extremes, so that the papers were virtually unreadable: formatting markup was left visible in the text, cluttering the view without contributing anything useful to the document.

Formatting became the thing the publishers clung to as their main value-added contribution to the publishing process, in addition to copy-editing and the organisation of peer review. They also clung to the practice of having authors sign away their copyright in published articles as a condition of acceptance for publication. So along the way we ended up with distinctions between preprints and post-prints; the author’s final copy and the publisher’s final copy; green and gold routes to open access publication; and the invention of rules concerning what authors and institutions could and couldn’t do with these different versions. The publishers argued that they were maintaining the integrity and quality of the publishing process, and their important role in it. Then we ended up with the invention of article processing charges, which attempted to envelop the research publication process entirely within publisher-dictated assessments of cost.

Naturally I’ve compressed a number of years of development in the foregoing, but that is the broad shape of the struggle which has developed since the late 1990s. The publishing community cannot be blamed for attempting to protect their interests, but ultimately it seems obvious that research should not be a free resource which publishers can use to extract increasingly expensive subscriptions from university libraries. In theory at least, publishing should be about the quality of scholarship, and its dissemination.

Unfortunately that is not the perspective of many university administrators and senior academics. Early in the progress of open access, it became possible to see how the community would divide. We spent a lot of time talking to senior academics, with the idea that if we persuaded them of the worth of the open access idea, they would encourage their research students to stop signing away their copyrights and to deposit their work in institutional repositories. Some were interested. Others responded with the specious objection that if they wanted a paper to make an impact they would submit it to Nature, or another publication of similar status; as if we were suggesting that no one should submit papers to high-status, high-impact publications. That fracture in the response should have told us something important about how senior academics understand publishing, and how open access would fare in succeeding years. It’s about status and its modern double, research funding.

Eventually open access began to be promoted as an aspect of institutional reputation management, which is of course about how an institution and its component faculties and departments are perceived. A perception of quality is not necessarily the same thing as quality itself, so reputation management is more problematic than a real assessment of research output. ‘Publish or perish’ was an attitude already well established in UK academia, however, and reputation management became another way to raise an institutional profile, even if it did nothing to clarify the quality of the research. ‘Width’ was also important.

A little later, repository technology was spotted as a way of automating the submission of a sample of research papers to what was called the Research Assessment Exercise (RAE). So the deposit of papers in a repository became an important part of the way in which universities would be assessed for research funding. Open access was now about academics not keeping research information (and their papers) hidden away in departmental records, but making them available to the institution as a whole, as a component of both the institution’s reputation management and its pursuit of research funding.


So open access, and the associated technology, in the end became an adjunct to the already established importance of reputational status and the acquisition of government research income for universities. Yet it still isn’t regarded as a proper publishing route. Is this a strange state of affairs? I think it is, and I will write about it in my next post.

Thomas Yaeger, March 2016
