Re: Bradford distribution, 80/20, and larger samples (3 messages) Marcia Tuttle 05 Jun 2003 20:46 UTC

3 messages:

----------1
Date: Thu, 5 Jun 2003 09:24:57 -0700
From: Jennifer Sweeney <jennifer.sweeney@attbi.com>
Subject: Re: Bradford distribution, 80/20, and larger samples

Steve,

Thanks so much for sharing your findings--these studies are important for
helping us get better at understanding database content and use.  Your study
suggests a couple of points about electronic journal use that I noticed in a
study I did a couple of years ago.  One is that article views in electronic
databases (or abstracts, or citations, or whatever) --virtual use, that
is--may not necessarily be the "same" as somebody taking something off the
physical shelf.  The numbers seem to suggest that something different is
happening, don't they?  Searching behavior may actually be different online
than in the physical environment.  Think about how your students use the
database, and what they use it for.  Is it really the same as using the
print collection?  Or are all these search and retrieval practices changing
somewhat, perhaps enough to influence Bradford theory?  (wow!)

But a more important question might be whether the collection of EBSCO
titles is distributed in terms of disciplines or fields in such a way as to
produce a skewed Bradford distribution, especially if there is a disconnect
between a significant number of the titles in the database and the topics
your students are studying.  It might mean that EBSCO isn't covering what
your students need covered.

On the other hand, haven't Bradfords in specific disciplines been shown to
be off the 80/20 somewhat?  How far off?  I think you alluded to this in
your message.  It could be argued too that 80/12 is certainly Bradford-like,
if somewhat skewed.  You might want to take a close look at the titles in
the sample anyway.

Just a couple of thoughts-
Jen

**************************
Jennifer Sweeney
PhD Student
Department of Information Studies
UCLA
jksweene@ucla.edu
(916) 965-6635

Mailing address:
4896 Steele Way
Fair Oaks, CA  95628

----- Original Message -----
From: "Steve Black" <blacks@MAIL.STROSE.EDU>
To: <SERIALST@LIST.UVM.EDU>
Sent: Thursday, June 05, 2003 7:42 AM
Subject: Bradford distribution, 80/20, and larger samples

>   A well established principle of bibliometrics is that a relatively small
> number of journals get the majority of use.  A mathematical model for this
> is the Bradford Distribution, stated as 1:n:n^2:n^3:n^4... (that's n, n
> squared, n cubed, n to the fourth, etc.).  If n=2, then the Bradford
> Distribution predicts that 1 title will get X uses, the next 2 most heavily
> used will together get X uses, the next 4 together get X uses, the next 8
> together get X uses, and so on, where X remains approximately constant.  The
> n (in this example 2) is called the Bradford multiplier.
>
>   A simpler formulation of the same phenomenon is the 80/20 rule.  It states
> that 80% of use is concentrated in 20% of the titles.
>
>   I downloaded the use statistics for the College of Saint Rose for all
> abstracts viewed by our patrons in EBSCOhost databases from Jan. 2001
> through May 2003, and decided to see if the Bradford Distribution and 80/20
> matched the data.   They don't.
>
>  We had 501,768 abstracts viewed from a pool of 8,097 titles.  No Bradford
> multiplier consistently matched the data, even though I tried grouping use
> data together in a variety of ways.  The reduction in uses, if plotted,
> would not be a smooth curve.  It would look more like a bumpy roller
> coaster, with many changes in the rate of drop in use.  Bradford
> distribution only very roughly matches the data.
>
>   I also discovered that 11.9% of titles, not 20%, accounted for 80% of
> viewed abstracts.
>
>   Now, I've never read of anyone taking either Bradford Distribution or
> 80/20 as strict rules.  It's understood that no data set will match
> precisely.  But these are big differences.  In the past, the gathering of
> use statistics was so onerous as to force the data sets to be relatively
> small, and often restricted to the journals in a single discipline.  I
> wonder if the larger data samples we can now easily get require a better
> model to describe the distribution of use. Models that hold for the journals
> in single disciplines may not hold for interdisciplinary collections.
>
> Steve Black
> Reference, Instruction, and Serials Librarian
> Neil Hellman Library
> The College of Saint Rose
> 392 Western Ave.
> Albany, NY 12203
> (518) 458-5494
> blacks@mail.strose.edu
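
The two checks described in the quoted message can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
function names are hypothetical, the per-title counts are made up, and the
last Bradford zone may be incomplete if the titles run out. It is not the
procedure actually used on the Saint Rose data.

```python
def share_of_titles_for_use(uses, target=0.80):
    """Fraction of titles, taken most-used first, needed to reach
    `target` share of total use (e.g. 0.80 for an 80/20-style check)."""
    ranked = sorted(uses, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("no recorded use")
    running = 0
    for count, u in enumerate(ranked, start=1):
        running += u
        if running >= target * total:
            return count / len(ranked)
    return 1.0


def bradford_zone_totals(uses, n=2):
    """Group ranked titles into zones of 1, n, n^2, ... titles and
    return the total use in each zone.  Under a Bradford distribution
    with multiplier n the totals should be roughly equal."""
    ranked = sorted(uses, reverse=True)
    totals = []
    start, size = 0, 1
    while start < len(ranked):
        totals.append(sum(ranked[start:start + size]))
        start += size
        size *= n
    return totals


# Made-up use counts for seven titles (not real data).
example_uses = [8, 4, 4, 2, 2, 2, 2]
print(bradford_zone_totals(example_uses, n=2))   # [8, 8, 8]
print(share_of_titles_for_use(example_uses))     # 5 of 7 titles
```

With real data, zone totals that swing widely for every n tried would
correspond to the "bumpy roller coaster" described above, and the first
function gives the figure to compare against 20%.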

----------2
Date: Thu, 05 Jun 2003 12:53:36 -0400
From: David Goodman <dgoodman@Princeton.EDU>
Subject: Re: Bradford distribution, 80/20, and larger samples

A variety of results for analogous studies have begun to be reported,
and I particularly want to call attention to
 Davis P, "Patterns in electronic journal usage: challenging the composition
 of geographic consortia"  College & Research Libraries 63, 484-497 (2002)

It is very interesting to see a comparative figure from a four-year college.

Let me in particular ask you what the figure was for full text articles
as compared to abstracts, because at Princeton we typically see a
narrower distribution of use from full articles than abstracts (which I
interpret as meaning that people look at abstracts from a wide range of
sources, but only find the ones from the major journals worth actually
reading through).
The distribution at Princeton across all science and social science
disciplines for cited articles is also more like 90:10 than 80:20. But
cited is not the same as read--either read as abstracts or read as
articles.

The forthcoming general availability of standardized data
(see www.projectcounter.org) will, just as you say, permit the fuller
analysis of this, and permit the comparison of different institutions,
types of institutions, disciplines, aggregators, publishers, and
journals.

--
Dr. David Goodman
Princeton University Library
and
Palmer School of Library and Information Science, Long Island University

e-mail: dgoodman@princeton.edu

----------3
Date: Thu,  5 Jun 2003 13:56:23 -0400
From: Gregory Szczyrbak <gszczyrb@ycp.edu>
Subject: Re: Bradford distribution, 80/20, and larger samples

This is just conjecture, but my guess is that the numbers are different
because of the collections analyzed.

Traditional analysis of collections centered on print collections that were
(usually) very thoughtfully assembled according to well-defined goals for
supporting a curriculum or a community of researchers.

I would argue, and I'm not the first, that the collections that are part of
EBSCOhost and other aggregator databases contain many more titles that are not
part of well-defined collection development goals for specific libraries.
Instead they are selected as part of a 'one-size-fits-all' model.
Aggregators' "take it or leave it" distribution methods force us to accept
titles that would never have been part of a print collection.

I'm not really arguing against aggregator databases.  They very necessarily do
this to keep costs down, yet provide something useful to all of us, even if it
is at the expense of well established principles of bibliometrics. :)

--
Gregory Szczyrbak, Reference Librarian
Schmidt Library
York College of Pennsylvania
York, PA  17405