3 messages:

----------1
Date: Thu, 5 Jun 2003 09:24:57 -0700
From: Jennifer Sweeney <jennifer.sweeney@attbi.com>
Subject: Re: Bradford distribution, 80/20, and larger samples

Steve,

Thanks so much for sharing your findings--these studies are important for helping us get better at understanding database content and use. Your study suggests a couple of points about electronic journal use that I noticed in a study I did a couple of years ago.

One is that article views in electronic databases (or abstracts, or citations, or whatever)--virtual use, that is--may not necessarily be the "same" as somebody taking something off the physical shelf. The numbers seem to suggest that something different is happening, don't they? Searching behavior may actually be different online than in the physical environment. Think about how your students use the database, and what they use it for. Is it really the same as using the print collection? Or are all these search and retrieval practices changing somewhat, perhaps enough to influence Bradford theory? (wow!)

But a more important question might be whether the collection of EBSCO titles is distributed across disciplines or fields in such a way as to produce a skewed Bradford distribution, especially if there is a disconnect between a significant number of the titles in the database and the topics your students are studying. It might mean that EBSCO isn't covering what your students need covered. On the other hand, haven't Bradford distributions in specific disciplines been shown to be off the 80/20 somewhat? How far off? I think you alluded to this in your message. It could be argued, too, that 80/12 is certainly Bradford-like, if somewhat skewed. You might want to take a close look at the titles in the sample anyway.

Just a couple of thoughts-
Jen

**************************
Jennifer Sweeney
PhD Student
Department of Information Studies, UCLA
jksweene@ucla.edu
(916) 965-6635
Mailing address: 4896 Steele Way, Fair Oaks, CA 95628

----- Original Message -----
From: "Steve Black" <blacks@MAIL.STROSE.EDU>
To: <SERIALST@LIST.UVM.EDU>
Sent: Thursday, June 05, 2003 7:42 AM
Subject: Bradford distribution, 80/20, and larger samples

> A well established principle of bibliometrics is that a relatively small number of journals get the majority of use. A mathematical model for this is the Bradford Distribution, stated as 1 : n : n^2 : n^3 : n^4 . . . (that's n, n squared, n cubed, n to the fourth, etc.). If n=2, then the Bradford Distribution predicts that the single most heavily used title will get X uses, the next 2 most heavily used titles will together get X uses, the next 4 will together get X uses, the next 8 will together get X uses, and so on, where X remains approximately constant. The n (in this example 2) is called the Bradford multiplier.
>
> A simpler formulation of the same phenomenon is the 80/20 rule. It states that 80% of use is concentrated in 20% of the titles.
>
> I downloaded the use statistics for the College of Saint Rose for all abstracts viewed by our patrons in EBSCOhost databases from Jan. 2001 through May 2003, and decided to see if the Bradford Distribution and 80/20 matched the data. They don't.
>
> We had 501,768 abstracts viewed from a pool of 8,097 titles. No Bradford multiplier consistently matched the data, even though I tried grouping the use data together in a variety of ways. The reduction in uses, if plotted, would not be a smooth curve. It would look more like a bumpy roller coaster, with many changes in the rate of drop in use. The Bradford distribution only very roughly matches the data.
>
> I also discovered that 11.9% of titles, not 20%, accounted for 80% of viewed abstracts.
>
> Now, I've never read of anyone taking either the Bradford Distribution or 80/20 as a strict rule. It's understood that no data set will match precisely. But these are big differences. In the past, the gathering of use statistics was so onerous as to force the data sets to be relatively small, and often restricted to the journals in a single discipline. I wonder if the larger data samples we can now easily get require a better model to describe the distribution of use. Models that hold for the journals in single disciplines may not hold for interdisciplinary collections.
>
> Steve Black
> Reference, Instruction, and Serials Librarian
> Neil Hellman Library
> The College of Saint Rose
> 392 Western Ave.
> Albany, NY 12203
> (518) 458-5494
> blacks@mail.strose.edu
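A minimal sketch of the zone test Steve describes, assuming only a Python list of per-title view counts; the function names and the sample numbers below are illustrative placeholders, not the Saint Rose data:

# Rough Bradford-zone check: rank titles by use, group them into zones of
# size 1, n, n^2, n^3, ..., and see whether each zone's total use is
# roughly constant, as the 1 : n : n^2 : n^3 ... model predicts.

def bradford_zone_totals(counts, n):
    """counts: per-title use counts; n: candidate Bradford multiplier."""
    ranked = sorted(counts, reverse=True)
    totals, start, size = [], 0, 1
    while start < len(ranked):
        totals.append(sum(ranked[start:start + size]))
        start += size
        size *= n
    return totals

def share_of_titles_for_use_share(counts, use_share=0.80):
    """Smallest fraction of titles accounting for `use_share` of all use."""
    ranked = sorted(counts, reverse=True)
    target = use_share * sum(ranked)
    running = 0
    for i, c in enumerate(ranked, start=1):
        running += c
        if running >= target:
            return i / len(ranked)
    return 1.0

if __name__ == "__main__":
    # Illustrative counts only; substitute the real per-title totals
    # exported from the usage reports.
    counts = [900, 450, 430, 230, 220, 210, 200, 120, 110, 100,
              90, 80, 70, 60, 15, 12, 10, 8, 5, 2]
    for n in (2, 3, 4):
        print("n =", n, "zone totals:", bradford_zone_totals(counts, n))
    print("share of titles needed for 80% of use:",
          round(share_of_titles_for_use_share(counts), 3))

For a multiplier n that fits, the printed zone totals should be roughly constant; the final line reports the smallest share of titles accounting for 80% of use, which would be about 0.20 under a textbook 80/20 rule and 0.119 in the data Steve describes.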
----------2
Date: Thu, 05 Jun 2003 12:53:36 -0400
From: David Goodman <dgoodman@Princeton.EDU>
Subject: Re: Bradford distribution, 80/20, and larger samples

A variety of results for analogous studies have begun to be reported, and I particularly want to call attention to Davis, P., "Patterns in electronic journal usage: challenging the composition of geographic consortia," College & Research Libraries 63: 484-497 (2002).

It is very interesting to see a comparative figure from a 4-year college. Let me in particular ask you what the figure was for full-text articles as compared to abstracts, because at Princeton we typically see a narrower distribution of use for full articles than for abstracts (which I interpret as meaning that people look at abstracts from a wide range of sources, but only find the ones from the major journals worth actually reading through). The distribution at Princeton across all science and social science disciplines for cited articles is also more like 90:10 than 80:20. But cited is not the same as read--either read as abstracts or read as articles.

The forthcoming general availability of standardized data (see www.projectcounter.org) will, just as you say, permit a fuller analysis of this, and permit the comparison of different institutions, types of institutions, disciplines, aggregators, publishers, and journals.

--
Dr. David Goodman
Princeton University Library
and Palmer School of Library and Information Science, Long Island University
e-mail: dgoodman@princeton.edu

----------3
Date: Thu, 5 Jun 2003 13:56:23 -0400
From: Gregory Szczyrbak <gszczyrb@ycp.edu>
Subject: Re: Bradford distribution, 80/20, and larger samples

This is just conjecture, but my guess is that the numbers are different because of the collections analyzed. Traditional collection analysis centered on print collections that were (usually) very thoughtfully built according to well defined goals for supporting a curriculum or a community of researchers. I would argue, and I'm not the first, that the collections that are part of EBSCOhost and other aggregator databases contain many more titles that are not part of well defined collection development goals for specific libraries. Instead they are selected as part of a 'one-size-fits-all' model. Aggregators' "take it or leave it" distribution methods force us to accept titles that would never have been part of a print collection. I'm not really arguing against aggregator databases: they do this out of necessity to keep costs down, yet still provide something useful to all of us, even if it is at the expense of well established principles of bibliometrics.
:)

--
Gregory Szczyrbak, Reference Librarian
Schmidt Library
York College of Pennsylvania
York, PA 17405
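Goodman's question about full-text versus abstract use can be explored with the same kind of concentration calculation run on two sets of per-title counts. A short Python sketch under that assumption; the two count lists are placeholders, not Princeton or Saint Rose figures:

# Compare how concentrated abstract views and full-text views are:
# report the fraction of titles that accounts for 80% of each kind of use.
# An 80:20-like pattern gives roughly 0.20; a 90:10-like pattern, roughly 0.10.

def titles_share_for(counts, use_share=0.80):
    """Smallest fraction of titles accounting for `use_share` of total use."""
    ranked = sorted(counts, reverse=True)
    target = use_share * sum(ranked)
    running = 0
    for i, c in enumerate(ranked, start=1):
        running += c
        if running >= target:
            return i / len(ranked)
    return 1.0

# Placeholder per-title counts; replace with real abstract-view and
# full-text-view totals for the same set of titles.
abstract_views = [900, 700, 400, 300, 150, 120, 100, 80, 60, 50,
                  40, 30, 25, 20, 15, 10, 8, 6, 4, 2]
fulltext_views = [700, 400, 150, 90, 60, 30, 20, 15, 10, 8,
                  6, 5, 4, 3, 2, 2, 1, 1, 1, 0]

print("abstracts: 80% of use comes from",
      round(100 * titles_share_for(abstract_views), 1), "% of titles")
print("full text: 80% of use comes from",
      round(100 * titles_share_for(fulltext_views), 1), "% of titles")

A narrower distribution of full-text use, as Goodman describes for Princeton, would show up as a smaller percentage on the second line, closer to 90:10 than to 80:20.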