RE: [Archivists] A variety of fish in my net!

You have discovered us! Yes, it is the Internet Archive which hosts this
discussion list. For those who are not familiar with us, please go to
www.archive.org for details, but here is some top-line information:

We are located in the Presidio of San Francisco, have been in existence for
4 years, but active for 1. We are trying to create a coherent and thorough
archive of the Internet to be made available for free to researchers and
scholars. Our current collection is indeed primarily HTML text; however, we
have just begun to collect Web images and, moving forward, will collect
other sorts of Internet files.

Anybody can get an account with us which allows access to the archives
(http://www.archive.org/proposal.html), but please be aware that our
technological development is still underway. In practical terms that means
the collections are accessible in the UNIX environment only and are in large
flat files that take some computer science knowledge to use. Over the course
of the next year, we will be developing tools which will enable
non-computer-science researchers to use the material.

In terms of providing spiders for others to create specialized crawls: that
is a great idea. We're not able to do that yet, but it is something we have
discussed. We are interested in receiving data donations as long as we can
make them available for free to our users. If someone wanted to set up a
partnership in which Partner A provided the crawler to Partner B to run and
we save and serve the resulting data, we would be interested in working that
out. We would also be interested in running additional crawlers that someone
else (Partner A) provides.

I hope this clarifies and provides more food for thought and collaboration.

Cordially,

Marlita Kahn
Managing Director, Internet Archive
415-561-6802



-----Original Message-----
From: archivists-admin@...
[mailto:archivists-admin@...]On Behalf Of Aaron Swartz
Sent: Monday, July 24, 2000 8:39 AM
To: archivists@...
Subject: Re: [Archivists] A variety of fish in my net!


Electronic Information Systems Librarian <xlib@...> wrote:

> 1. Aaron Swartz wrote of "the work of the Internet Archive" Is that what this
> list is meant to be about? I had completely forgotten about it, but by
> plugging in www.archive.org into my web browser, was reminded that it was here
> that I joined this list! I notice also that the site still says that since
> 1998 they have only been collecting ASCII text. Is that really still the case?
>
> Aaron asked if we should "focus on more specialized archives rather than
> trying to archive the entire Web". Indeed my hope was to elicit help from
> other people in how to archive an extremely specialised subset of electronic
> documents (about or mentioning the Baha'i Faith) with extremely limited
> resources - only a part of my job, and just me with one lowly PC attached to a
> network, as part of a total library staff of 15 people.

Well, the website says the list is for "discussion on Internet libraries"
which is rather broad. Perhaps the Internet Archive could work out a
distributed system allowing people like you to work on smaller subsets
(Baha'i) of the Web and contribute your work to the archive. The archive
could provide you with the tools and technologies to spider and store the
information you'd like, and in return you could provide them with the data.

However, I haven't yet heard from anyone at the archive on this list, so I
don't know how feasible this is.

--
Aaron Swartz |"This information is top security.
<http://swartzfam.com/aaron/>| When you have read it, destroy yourself."
<http://www.theinfo.org/> | - Marshall McLuhan

_______________________________________________
Archivists mailing list
Archivists@...
http://www.archive.org/mailman/listinfo/archivists




Mon Jul 24, 2000 10:42 pm

marlita@...

