Guidelines for Electronic Media Statistics

Guidelines for Electronic Media Statistics Frederick C. Lynden 08 Oct 1993 21:19 UTC
Please reply to Jan Bruusgaard at the address below; not to SERIALST. -ed.

----------------------------Original message----------------------------
     Currently librarians are very interested in keeping
statistical track of media, especially electronic media.  At
a recent meeting of the IFLA Standing Committee on
Statistics, Mr. Jan Bruusgaard, Norway, presented "Remarks
on compiling statistics on electronic media". It is an
excellent document, and Mr. Bruusgaard would like comments
from American librarians by December 15. His address is:

                    Jan Bruusgaard
                    Government Administration Services
                    Documentation Services
                    P.O. Box 8129
                    0033-OSLO
                    NORWAY

His paper will also be considered by the ARL Statistics
Committee at its annual meeting.  Once he has received all
comments, it will be published by the IFLA statistics
section.

Thank you for your assistance.

                    Frederick C. Lynden
                    Brown University
                    AP010037 at Brownvm

Remarks- Compiling Statistics on Electronic Media

I. Purpose
   A.  Libraries should aim to document all types of media
       they offer as fully as possible.

   B.  Electronic media may take up a growing share of the
       library's acquisition budget. Statistics will help
       justify this spending. They can also account for the
       resources devoted to training and motivation.

   C.  Electronic media will be partly a supplement to and
       partly a substitute for printed media.  The part that
       replaces printed material will probably bring about
       a decline in the traditional statistics on printed
       material. For this reason it is important to
       document the corresponding increase in electronic
       media.

   D.  Statistics on electronic media may make it possible
       to estimate the value and benefit of databases in
       relation to:

       1. printed material

       2. other databases or electronic media

       3. cost of the material

   E.  Statistics would give management information to the
       library for planning and budgeting.

II. Background and concepts.

   A.  Parties

       1.  In traditional statistics there is only one
           party - the library itself.  Because of the
           complexity of statistics on electronic media,
           it is natural to include:

           a. database hosts, vendors and producers

           b. Software producers and developers

       2.  If we succeed in establishing standards that are
           acceptable to a. and b., and that take account
           of the kind of information relevant to
           libraries, the compilation of data would be much
           easier.

   B.  Products. Electronic media consist of different
       products that have to be handled in different ways.

       1.  Databases are the most important.  These can be
           classified by medium:

           a. Online-databases
           b. CD-ROM databases

       2.  Databases can also be classified by type:

           a. Bibliographic or reference databases
           b. Fulltext databases
           c. Factual databases

Reference databases are often used as a tool for
retrieving printed material, while fulltext databases would
be used as a replacement for or supplement to printed
material.

       3.  A third way to divide databases is into:

           a.  textual databases
           b.  picture databases
           c.  (In the future) sound databases

       4.  Other types of electronic media that may be
           important are:

           a.  software
           b.  multimedia packages

   C.  Ownership. Electronic media differ from printed
       material in that ownership has a more complex
       structure.

       1.  Printed material would be the library's own
           property. Some electronic documents would be
           as well.  In addition electronic media may be:

       2.  rented

       3.  made available for use only, with the data still
           owned by others, e.g. databases owned by the
           producer.

       4.  Different ownership leads to different placement.

           a.  The media can be placed inside the library

           b.  or it can be placed outside the library,
               where only the terminal and
               telecommunications network give the library
               access to the data.

III. Measurement

   A.  How to measure electronic media?

       1.  Count

           a.  Manual counting. This is done by many today
               and is the simplest way, e.g. number of
               CD-ROM records, number of online database
               searches executed.
           b.  Electronic counting. A major tool may be the
               development of software that counts and
               produces printed reports. This software has
               to work together with the other software
               that is used in the retrieval process.

       2.  Sampling may be another possible method of
           measurement.
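
As an illustration of measurement by sampling, the sketch
below scales counts taken on a few sampled days up to a
full year. The function name, the figures, and the
assumption of 250 opening days are hypothetical and are not
part of the original remarks.

```python
from statistics import mean


def estimate_total(sample_counts, periods_in_year):
    """Scale the average count per sampled period up to a full year."""
    if not sample_counts:
        raise ValueError("need at least one sampled period")
    return mean(sample_counts) * periods_in_year


# Hypothetical counts of online searches on five sampled opening days.
sampled_days = [12, 9, 15, 11, 13]

# Assumption: the library is open 250 days a year.
estimate = estimate_total(sampled_days, periods_in_year=250)
print(f"Estimated searches per year: {estimate:.0f}")
```

An estimate of this kind is only as good as the sample; days
should be chosen so that busy and quiet periods are both
represented.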

   B.  What to measure.

       1.  Collection.

           a.  Macrolevel.

               (1) Number of CD-ROM records, number of
                   diskettes

               (2) Number of databases available. This would
                   show how large a part of the electronic
                   world the library has access to. On the
                   other hand, many database hosts offer
                   many databases, whereas a single library
                   takes advantage of only a few, e.g.
                   Dialog, which offers hundreds of
                   different databases.

           b.  Microlevel

               (1) Number of bytes. One byte corresponds to
                   one letter. This measure may tell
                   something about the size of the
                   databases. (In the computer world the
                   measures kilo-, mega-, giga-, and
                   terabytes are used.) One problem here is
                   that a database consists of both pure
                   data and indexes (overhead), which may be
                   2-10 times as large as the pure data.

               (2) Number of records/documents in the
                   database. Together with the average size
                   of the records, in bytes, this number
                   will give a good picture.  The average
                   size is important because the size of a
                   single record varies greatly: a pure
                   bibliographic record is much smaller
                   than a record with an abstract or full
                   text.
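
The microlevel measures can be combined in a small
calculation. The sketch below is only an illustration: the
class, the field names, and all figures are hypothetical, and
the index overhead is simply an assumed multiple of the pure
data within the 2-10 range mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    records: int
    avg_record_bytes: int
    overhead_factor: float  # index size as a multiple of the pure data

    def data_bytes(self) -> int:
        """Pure data: number of records times average record size."""
        return self.records * self.avg_record_bytes

    def total_bytes(self) -> float:
        """Pure data plus index overhead."""
        return self.data_bytes() * (1 + self.overhead_factor)


def human(n: float) -> str:
    """Express a byte count in decimal units (1 kilobyte = 1000 bytes)."""
    for unit in ("bytes", "kilobytes", "megabytes", "gigabytes"):
        if n < 1000:
            return f"{n:.1f} {unit}"
        n /= 1000
    return f"{n:.1f} terabytes"


# Hypothetical figures: many small bibliographic records versus
# few large fulltext records.
collection = [
    Database("bibliographic", records=200_000, avg_record_bytes=500,
             overhead_factor=2.0),
    Database("fulltext", records=5_000, avg_record_bytes=20_000,
             overhead_factor=2.0),
]

for db in collection:
    print(db.name, db.records, "records,",
          human(db.data_bytes()), "of data,",
          human(db.total_bytes()), "with indexes")
```

With these invented figures both databases hold the same
amount of pure data although their record counts differ by a
factor of 40, which is why the record count is reported
together with the average record size.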

       2.  Use

           a.  Number of logins to a database will give a
               measure of how many requests are handled by
               each electronic source.

           b.  Time. Time spent is an easy measure. Here we
               count the number of minutes connected to
               each database (online databases), or the
               number of minutes a terminal is active
               (CD-ROM).

           c.  Number of bytes fetched from disk (database)
               would give a rather physical measure of use.

           d.  Number of records fetched from disk would
               give the number of logical units that are
               retrieved. This measure has to be used
               together with the average size of the record.

           e.  Number of bytes/records printed to a printer
               or file shows what the user gets out of the
               session. This measure excludes the material
               that is judged irrelevant. It can be seen as
               a parallel to a loan, because this is what
               the user can take out of the library.

           f.  If we presume that the cost spent on a single
               unit reflects its benefit, spending is a
               suitable measure. Costs can be divided into:

               (1) One-time costs

               (2) Subscription costs

               (3) Cost per use or per minute
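
The use measures listed above could be collected from a
session log. The following is a minimal sketch, assuming a
log with one entry per login; the field names, database
names, and figures are all hypothetical.

```python
from collections import defaultdict

# Hypothetical session log, one entry per login. No user identifier
# is stored, so the figures describe sources rather than individuals.
SESSIONS = [
    {"database": "Online A", "minutes": 12, "records_fetched": 40,
     "records_printed": 8, "cost": 6.0},
    {"database": "Online A", "minutes": 8, "records_fetched": 25,
     "records_printed": 2, "cost": 4.0},
    {"database": "CD-ROM B", "minutes": 30, "records_fetched": 90,
     "records_printed": 15, "cost": 0.0},
]

FIELDS = ("minutes", "records_fetched", "records_printed", "cost")


def summarize(sessions):
    """Total logins, time, records and cost for each database."""
    totals = defaultdict(lambda: {"logins": 0, **{f: 0 for f in FIELDS}})
    for s in sessions:
        row = totals[s["database"]]
        row["logins"] += 1
        for f in FIELDS:
            row[f] += s[f]
    return dict(totals)


for name, row in summarize(SESSIONS).items():
    fetched = row["records_fetched"]
    printed_share = row["records_printed"] / fetched if fetched else 0.0
    print(name, row["logins"], "logins,", row["minutes"], "minutes,",
          f"{printed_share:.0%} of fetched records printed,",
          f"cost per login {row['cost'] / row['logins']:.2f}")
```

The share of fetched records that are printed corresponds to
the comparison between items d and e: what was retrieved
against what the user actually took away.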

It is likely that a combination of the measures mentioned
above is the best way to get a suitable picture of the use
and collection of electronic media. One possibility is to
develop an abstract measure that represents the different
factors with unequal weights.
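
One possible form of such an abstract measure is a weighted
sum. The sketch below is an assumption for illustration, not
a proposed standard: each measure is first divided by its
largest value among the sources so that measures in different
units can be combined, and the weights are invented.

```python
def composite_index(usage, weights):
    """Weighted sum of measures, each scaled by its maximum across sources."""
    maxima = {m: max(row[m] for row in usage.values()) for m in weights}
    scores = {}
    for name, row in usage.items():
        scores[name] = sum(
            w * (row[m] / maxima[m] if maxima[m] else 0.0)
            for m, w in weights.items()
        )
    return scores


# Hypothetical yearly figures for two sources.
usage = {
    "Database A": {"logins": 100, "minutes": 500, "records_printed": 200},
    "Database B": {"logins": 50, "minutes": 500, "records_printed": 400},
}

# Hypothetical weights; they sum to 1 so that scores stay between 0 and 1.
weights = {"logins": 0.5, "minutes": 0.25, "records_printed": 0.25}

for name, score in composite_index(usage, weights).items():
    print(f"{name}: {score:.3f}")
```

The choice of weights is a matter of policy rather than
arithmetic; different weights will rank the same sources
differently.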

IV. Problematic areas

   A.  The development of standard counting procedures
       requires some standardizing work by the software
       and database vendors. IFLA or ISO may take the
       initiative in setting standards or guidelines among
       these vendors.

   B.  Even though the library's own catalogue is a database,
       and often the most used one, it should be kept
       separate from other electronic media. Union
       catalogues make the boundary unclear.

   C.  Protection of individuals.  Measures of this kind
       have side effects: they can be used to measure the
       effectiveness of an individual employee, and perhaps
       to map an individual's use of the media.  It is
       important that this kind of measurement is done in
       agreement with the users and employees.

V. More advanced possibilities - research.

There are two fields whose knowledge can be used to obtain
better measures of electronic media.

   A.  Information theory. Using some information theory it
       might be possible to develop measures of what is new
       knowledge and what is only duplication of existing
       knowledge that is also found in other sources. Many
       electronic media have a great deal of overlap, often
       to a much higher degree than printed material.

   B.  Information retrieval. It should be possible to use
       knowledge and methods from this field to give a more
       precise answer to what is retrieved and used, e.g.
       by reference to the principles of recall and
       precision.
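
Both ideas can be illustrated with simple set arithmetic. In
the sketch below, the overlap share is only a crude stand-in
for a true information-theoretic measure, while precision and
recall follow their standard definitions. The function names
and the record identifiers are hypothetical.

```python
def overlap_share(source, others):
    """Share of the records in `source` that also appear in `others`."""
    if not source:
        return 0.0
    return len(source & others) / len(source)


def precision_recall(retrieved, relevant):
    """Precision: relevant share of what was retrieved.
    Recall: retrieved share of what is relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record identifiers held by two databases.
database_a = {"r1", "r2", "r3", "r4"}
database_b = {"r3", "r4", "r5"}
print(f"Share of A duplicated in B: {overlap_share(database_a, database_b):.0%}")

# Hypothetical search: four records retrieved, six relevant in total.
retrieved = {1, 2, 3, 4}
relevant = {2, 4, 5, 6, 7, 8}
precision, recall = precision_recall(retrieved, relevant)
print(f"precision {precision:.2f}, recall {recall:.2f}")
```

Recall can only be computed when the full set of relevant
records is known, which in practice usually requires a test
collection or a sample judged by hand.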

19.8.93

Jan Bruusgaard,
Government Documentation services
POB 8129 Dep
N-0033 Oslo
Norway

Member of IFLA's SC section of statistics.