Project Aims to Archive the Entire Internet (fwd) Birdie MacLennan 25 Jun 1996 20:03 UTC

Forwarded FYI.  -Birdie

---------------------
Date: Fri, 21 Jun 1996 08:14:55 -0400
From: liber@elk.uvm.edu
Subject: The WHOLE Internet
X-Mailer: Mozilla 2.01 (Macintosh; I; PPC)


Project Aims to Archive the Entire Internet

By LAURIE J. FLYNN

   There's a certain symmetry to the fact that Brewster Kahle
chose a historic site in San Francisco as office space for his
new venture, The Internet Archives. After all, Kahle, a computer
scientist, has adopted the role of chief curator of the world's
digital history.

   Last week, Kahle, a 35-year-old entrepreneur, officially
launched his labor of love: nothing short of establishing a
permanent record of the entire Internet. His ambition is to
create a cultural time capsule that will document the early days
of the digital revolution, preserving it in a digital library
that he will make available as a public resource.

  "There's something very important going on," Kahle told
friends and family at the official kick-off of the company,
which has its offices on the newly converted Presidio Army base
overlooking the Golden Gate Bridge. "The stuff that's going on
in the digital domain now is our cultural history."

   Looking back someday, he predicted, "We'll have a very good
idea of what the late 20th century was like."

   The need for an ongoing record of the Internet has become
sort of a battle cry of the digerati in recent months,
particularly as it has become clear just how fast the Net is
expanding. Estimates vary widely on how fast it is changing,
though anyone who has used the Web knows that new sites come and
go faster than TV sitcoms. And even if a Web site endures, old
pages are often purged from servers to free up precious space.

   "The Net, for all intents and purposes, is completely
different today from what it was a year ago," Kahle said. "It's
gone. Everyone out there is pushing to the future."

   Kahle compares the Net today to the early days of television,
particularly as it relates to major political events. "Early
television just evaporated," he said. "We don't even know what
it looked like. It would be great to see today what campaign
commercials were like in 1950."

   But to create such an archive is a project of untold
proportions, Kahle concedes. So far, he has financed the
project himself, using part of the fortune he amassed when he
sold his Web publishing company, WAIS Inc., to America Online
last year. Eventually, he may bring in additional investors.

   Before creating WAIS in 1992, Kahle, a computer scientist,
helped found the Thinking Machines Corporation, creator of
powerful supercomputers. It was at Thinking Machines that he
first began tackling the question of how to manage huge volumes
of data and make it usable by people.

   But Kahle doesn't plan to achieve his goal of archiving the
Net entirely on his own. Rather, he's accepting help and
donations wherever he can find them. As of last week, he and the
five members of his staff had finished archiving the text of the
Web, essentially by working with a donation of the data from a
Web crawler company.

   Kahle said he hoped to entice others to donate their own
archives with the promise that they will be stored permanently.
He hopes to have a copy of the entire Net, including Web images,
Usenet and gopher sites, by the end of the summer.

   The company is also working with the Smithsonian Institution
to collect Presidential Web sites, a project that will result in
an exhibit at the American History Museum focusing on the Web's
impact on the 1996 election.

   And then the hard part will begin.

   At that point, he said, the company will start working on
providing public access, clearly an even thornier issue than
amassing the data. Kahle says he is working with the major
policy makers and experts on intellectual property, including
law professors at both Stanford University and the University of
California, to help understand the scope of the copyright issues
the company will soon face.

   Privacy concerns, too, will no doubt arise as the company
attempts to change "a medium that's assumed ephemeral into an
enduring one," Kahle said.

   The Internet Archives will consist of essentially two
companies: The archives themselves will reside in a not-
for-profit trust, while Kahle and his colleagues will also
develop software for managing huge amounts of Internet data.
That software will eventually be packaged and sold commercially
for use with Intranets and other large sites, though Kahle has
no specific time frame yet for doing so. The goal, he said, "is
to create a new breed of products for mining terabytes of
data."

   Kahle concedes that not everybody understands the importance
of recording the Net as a sort of historical artifact, and he
admits that many people look at him like he's crazy.

   "They either say, 'How could you possibly do that?,' or 'Why
would you want to?' "

   Kahle answers: "The idea is to have an impact."

Copyright 1996 The New York Times