Critique of EU Council's Conclusions on OA Stevan Harnad 16 Jan 2008 03:26 UTC

     ** Apologies for Cross-Posting **

     Fully hyperlinked version available at
     http://openaccess.eprints.org/index.php?/archives/350-guid.html

Also to be discussed at the DRIVER Summit
     http://openaccess.eprints.org/index.php?/archives/349-guid.html
is this EU Council statement (which shows the tell-tale signs of
penetration by the publisher anti-OA lobby; familiar slogans, decisively
rebutted many, many times, crop up verbatim in the EU Council's language,
though the Council does not appear to realize that it has allowed itself
to become the mouthpiece of these special interests, which are not those
of the research community):

     "Council of the European Union: Conclusions on scientific information
     in the digital age: access, dissemination and preservation"
     http://www.consilium.europa.eu/ueDocs/newsWord/en/intm/97236.doc

Here is my critique of this EU Council statement:

     "the importance of scientific output resulting from publicly funded
     research being available on the Internet at no cost to the reader
     under economically viable circumstances, including delayed open
     access"

(1) 'At no cost to the reader' conflates site-licensing and Open Access
(OA). This wording was no doubt urged by the publisher lobby. The focus
should be on providing free online access webwide. That is OA, and that
makes the objective clear and coherent.

(2) 'Delayed open access' refers to publisher embargoes on author
self-archiving. If embargoes are to be accommodated, it should be made
clear that they apply to the date at which the access to the embargoed
document is made OA, not to the date at which the document is deposited,
which should be immediately upon acceptance for publication. The DRIVER
network of Institutional Repositories (IRs) can then adopt the 'email
eprint request' button that will allow individual users to request and
receive individual copies of the document semi-automatically.

(3) What should be deposited in the author's own institutional IR
immediately upon acceptance for publication is the author's
peer-reviewed, accepted final draft ('postprint'), not the publisher's
PDF (or XML). There are far more publisher embargoes on the PDF/XML than
on the postprint, and the postprint is all that is needed for research
use and progress. The postprint is a supplementary version of the
official publication, provided for OA purposes; it is not the version
with the primary digital preservation problem.

(4) Digital preservation should not be conflated with OA provision:
There is a (separate) problem of the digital preservation of the
publisher's PDF/XML, but this is not the same as the problem of
providing OA to the author's postprint. The postprint, though it can and
should be preserved, is not the canonical copy of the publication, so
the two preservation tasks should not be conflated.

(5) Self-archiving research data is also a different matter from
self-archiving research publications. Data-archiving is not subject to a
publisher embargo, and it needs independent preservation, but
data-access and data-preservation should not be conflated with OA
provision.

(6) Deposit should be directly in each author's own IR: Distributed
institutional depositing and storage should not be conflated with
central harvesting and indexing: Deposit Institutionally, Harvest
Centrally.

(7) Direct central deposit should be avoided except in cases where the
author is institutionally unaffiliated or the author's institution does
not yet have an IR. For those cases, there should be at least one
provisional default repository such as DEPOT.

(8) Research (publications and data) should not be conflated with other
forms of digital content. The problems of cultural heritage archiving,
for example, are not the same as those of research publication
archiving. Nor are the problems of archiving the same as the problem of
access-provision (OA).

     "ensure the long term preservation of scientific information -including
     publications and data"

This is an example of the complete conflation of OA-provision with
digital preservation, including a conflation of authors' supplementary
postprints with the publisher's original, as well as a conflation of
research publications with research data.

DRIVER will not have a coherent programme unless it clearly and
systematically de-conflates OA-provision from digital preservation,
primary publications from authors' supplementary postprints, and
publication-archiving from data-archiving, treating each of these
separately, on its own respective terms.

     "experiments on and wide deployment of scientific data infrastructures
     with cross-border, cross-institution and cross-discipline added-value
     for open access to and preservation of scientific information"

This again conflates OA provision with digital preservation and
conflates publications with data. It also conflates both of these with
IR interoperability, which is yet another matter. (And webwide OA is, by
definition, cross-institution, cross-border and cross-discipline, so
that is a non-issue.)

What is an issue, however, is institutional versus central depositing,
and it is crucial that DRIVER have a clear, coherent policy (insofar as
research archiving is concerned -- this does not necessarily apply to
other forms of digital content): Deposit Institutionally:
Harvest/Index/Search Centrally.

The emphasis of DRIVER should accordingly be on ensuring that the
distributed IRs have the requisite interoperability for whatever central
harvesting, indexing, search and analysis are needed and desired.
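
The "Deposit Institutionally, Harvest Centrally" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any DRIVER software: the names Author, deposit_locus and harvest_centrally are hypothetical, the institution names are placeholders, and DEPOT stands in for the provisional default repository mentioned in point (7).

```python
from dataclasses import dataclass
from typing import Optional

# Provisional default for authors without an IR (assumption: a single default, as in point (7)).
DEFAULT_REPOSITORY = "DEPOT"


@dataclass
class Author:
    name: str
    institution: Optional[str] = None  # None means institutionally unaffiliated


def deposit_locus(author: Author, institutions_with_ir: set) -> str:
    """Return the author's own IR if one exists, otherwise the default repository."""
    if author.institution is not None and author.institution in institutions_with_ir:
        return "IR:" + author.institution
    return DEFAULT_REPOSITORY


def harvest_centrally(repositories: dict) -> list:
    """Build a central index of (title, source repository) pairs.

    Nothing is deposited centrally; the index is derived from the
    distributed repositories, which remain the locus of deposit.
    """
    index = []
    for repo, titles in repositories.items():
        for title in titles:
            index.append((title, repo))
    return sorted(index)


# Hypothetical example data.
irs = {"Univ-A"}
locus_a = deposit_locus(Author("Ann", "Univ-A"), irs)   # has an IR
locus_b = deposit_locus(Author("Bob", "Univ-B"), irs)   # institution has no IR yet
locus_c = deposit_locus(Author("Cy"), irs)              # unaffiliated
central = harvest_centrally({"IR:Univ-A": ["Paper 2"], "DEPOT": ["Paper 1"]})
```

The design point is that the central service is a derived view: it can be rebuilt at any time from the IRs, whereas the reverse is not true of direct central deposit.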

     "promoting, through these policies, access through the internet to
     the results of publicly financed research, at no cost to the reader,
     taking into consideration economically sustainable ways of doing this,
     including delayed open access"

Economic sustainability is again a red herring introduced by the
publishing lobby into language that should only concern the research
community and research access. The economic sustainability of publishing
is not DRIVER's concern.

DRIVER's concern should be interoperable OA-provision (plus whatever
cultural-heritage and other forms of archiving DRIVER wishes to provide
the infrastructure for).

Nor are publisher access-embargoes DRIVER's concern: DRIVER should
merely help ensure immediate deposit in IRs, and it should facilitate
research usage needs through IR interoperability as well as the IRs'
email eprint request button.
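
The distinction between the date of deposit and the date of OA can be made concrete with a small Python sketch. It is a simplified model under assumptions, not the behaviour of any particular repository software: the Deposit record, its field names, and the author_approves flag (standing in for the author's one-click approval of an emailed request) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Deposit:
    title: str
    accepted_on: date                    # date of acceptance for publication
    deposited_on: date                   # immediate deposit: same day as acceptance
    embargo_ends: Optional[date] = None  # None means no publisher embargo


def is_open_access(d: Deposit, today: date) -> bool:
    """An embargo delays access, never the deposit itself."""
    return d.embargo_ends is None or today >= d.embargo_ends


def handle_request(d: Deposit, today: date, author_approves: bool) -> str:
    """Serve OA deposits directly; route embargoed ones through the request button."""
    if is_open_access(d, today):
        return "download"
    return "emailed copy" if author_approves else "request declined"


# Hypothetical example: deposited on acceptance, with a six-month access embargo.
paper = Deposit(
    title="Example postprint",
    accepted_on=date(2008, 1, 16),
    deposited_on=date(2008, 1, 16),
    embargo_ends=date(2008, 7, 16),
)
during = handle_request(paper, date(2008, 3, 1), author_approves=True)
after = handle_request(paper, date(2008, 8, 1), author_approves=False)
```

In this model the embargo changes only the route by which a user obtains the document; the deposit date is unaffected.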

     "2008 working towards the interoperability of national repositories
     of scientific information in order to facilitate accessibility and
     searchability of scientific information beyond national borders"

Insofar as research is concerned, it is not the interoperability of
national repositories that is crucial but the interoperability of all OA
IRs.

     "2009 contributing to an effective overview of progress at European
     level, informing the Commission of results and experiences with
     alternative models for the dissemination of scientific information."

This is again a red herring (for both the EU and for DRIVER) introduced
by the publishing lobby: Research archiving and OA-provision are neither
a matter of alternative publishing models nor a matter of alternatives
to the generic peer-reviewed publication model. Publishing reform and
peer review reform are not DRIVER matters. They can and will evolve too,
but DRIVER should focus on the deposit of current published research as
well as research data in IRs, and the interoperability of those IRs.
That is the immediate problem. The rest is merely speculative for now.

     "B. Invitation to the Commission to implement the measures announced
     in the Communication on "scientific information in the digital age:
     access, dissemination and preservation", and in particular to:
     1. Experiment with open access to scientific publications resulting
     from projects funded by the EU Research Framework Programmes by:
     defining and implementing concrete experiments with open access to
     scientific publications resulting from Community funded research,
     including with open access."

This is a vague way of saying that the publishing lobby has persuaded
the EU not to do the obvious, but to keep on 'experimenting' as if what
needed to be done were not already evident, already tested, already
demonstrated to work, and already being done, worldwide (including by
RCUK, ERC, NIH, and over a dozen universities):

The EU should mandate that all EU-funded research articles (postprints)
are deposited in the fundee's IR immediately upon acceptance for
publication. Access can be set in compliance with embargoes, if desired.
And data-archiving should be strongly encouraged. DRIVER's concern
should be with ensuring that the network of IRs has the requisite
interoperability to make it maximally useful and usable for further
research progress.

----------------------------------------------------------

Here is the video of my presentation to the DRIVER Summit:

    Institutional Versus Central Deposit:
    Optimising DRIVER Policy for the OA Mandate and Metric Era
    http://users.ecs.soton.ac.uk/harnad/Temp/harnad-driver.mov

Here is a written summary statement:

       THE FEEDER AND THE DRIVER:
       Deposit Institutionally, Harvest Centrally

       Stevan Harnad

DRIVER is designing an infrastructure for European and Worldwide Open
Access research output, stored in institutional and disciplinary
repositories, now increasingly under institutional and research-funder
mandates. It is critical for DRIVER to explicitly take into account in
its design (as some research funders have not yet done, because they
have not yet thought it through) that institutional and disciplinary
repositories (IRs and CRs), although they are fully interoperable and on
a par in that respect, nevertheless play profoundly different roles.

Universities and research institutions are the FEEDERS for both kinds of
repositories (IRs and CRs). Universities and research institutions are
the primary providers of research, funded and unfunded, in all
disciplines, for both.

This difference in role and function must be concretely reflected in the
design of the DRIVER infrastructure. The primary locus of deposit for
all research output is the researcher's own institution's IR (except in
the increasingly rare case of institutionally unaffiliated researchers).
Thanks to OAI-interoperability, the metadata for those deposits, or even
the full-text deposits themselves, can also be harvested by (or exported
to) any number of CRs -- discipline-based CRs, funder-based CRs,
theme-based CRs, national CRs.
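
As a rough illustration of what OAI harvesting involves, the Python sketch below builds an OAI-PMH ListRecords request URL and extracts identifiers and titles from a Dublin Core (oai_dc) response. The repository address ir.example.org and the sample record are invented for the example, and a real harvester would also need to fetch the URL and follow resumption tokens, which is omitted here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the request a harvester sends to an IR's OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text: str) -> list:
    """Extract identifier and title from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        records.append({"identifier": ident, "title": title})
    return records


# Invented sample response from a hypothetical IR.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:ir.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example postprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://ir.example.org/oai")
harvested = parse_records(SAMPLE)
```

Because any CR can issue the same request to any compliant IR, the same institutional deposit can feed any number of central services without being deposited twice.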

Neither IRs nor CRs will fill without deposit mandates. This is a hard
lesson that has been learned very late, but it has now at long last
indeed been learned. So the number of institutional and funder mandates
is now set to grow dramatically. Institutional mandates of course always
specify deposit in the institution's own IR. Many funders have mandated
deposit, indicating that deposit can be in either IRs or CRs; but some
funders have stipulated that deposit must be in CRs.

This is a symptom of not having thought OA through. Mandating funders
are of course greatly to be commended for mandating OA, but their
short-sightedness on the locus and means of deposit needs correction,
and DRIVER can and should help with this, pre-emptively, rather than
blindly following the unreflective and incoherent trends in the air
today. Indeed DRIVER must take a coherent position, if it wants OA
content to be provided and OA repositories to be filled.

The model that DRIVER should adopt in designing its infrastructure is
"Deposit Institutionally, Harvest Centrally." That is the way to scale,
speedily and systematically, to 100% OA. I give the reasons in detail in
my talk tomorrow, but for now, I just want to point out the principal
points:

Institutions are the providers -- the source -- of all research.
Institutions have a direct interest in showcasing and managing their own
research output, but they have been even more sluggish than funders in
adopting mandates. If funders mandate central deposit, they neither
cover all of OA output nor do they collaborate coherently with the
providers (the institutions) to scale up systematically to providing OA
to all of their research output. The OAI protocol makes it possible to
harvest content from all OAI-compliant repositories. That is the
coherent, systematic pattern of content provision for which DRIVER
should be designed, not an incoherent patchwork of arbitrary
institutional and central depositing and repositories that will neither
scale up to all of OA nor accelerate its attainment.

And, not to put too fine a point on it, the very notion of Central
Repositories betrays something of a misunderstanding of the online
medium: Is Google a central repository? Is it a repository at all? Do
people deposit directly in Google?

OAIster is an even better model: It was explicitly designed to be an OAI
service provider, a functional overlay on the distributed OA content
providers. Do CRs really need to be any more than that?

Stevan Harnad
AMERICAN SCIENTIST OPEN ACCESS FORUM:
http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
     http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/

UNIVERSITIES and RESEARCH FUNDERS:
If you have adopted or plan to adopt a policy of providing Open Access
to your own research article output, please describe your policy at:
     http://www.eprints.org/signup/sign.php
     http://openaccess.eprints.org/index.php?/archives/71-guid.html
     http://openaccess.eprints.org/index.php?/archives/136-guid.html

OPEN-ACCESS-PROVISION POLICY:
     BOAI-1 ("Green"): Publish your article in a suitable toll-access journal
     http://romeo.eprints.org/
OR
     BOAI-2 ("Gold"): Publish your article in an open-access journal if/when
     a suitable one exists.
     http://www.doaj.org/
AND
     in BOTH cases self-archive a supplementary version of your article
     in your own institutional repository.
     http://www.eprints.org/self-faq/
     http://archives.eprints.org/
     http://openaccess.eprints.org/