RE: [Archivists] Archivists archives

Bryn,

Good to hear from you. I just downloaded webstripper and it worked fine. I
am trying to find a way to archive websites, much like you do, to be viewed
by future generations so that they can experience the same 'look and feel'
that we experience now. You mentioned an interesting problem: if you
view a site as an 'edition', how can you make the updates accessible without
destroying the older site? Would it be possible to deploy a form of indexing
like that used in 'normal' archives, such as the EAD (Encoded Archival
Description) standard? Another question is how you know when a site has been
updated. Are there tools that can notify you when that happens (and ideally
integrate with webstripper so that the update is done automatically)?
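
To make the two questions concrete, here is a rough Python sketch of one way they could fit together. It is only an illustration under my own assumptions, not a description of how webstripper or any existing tool works: the function names, the `index.json` file, and the directory layout are all hypothetical. The idea is to fingerprint each fetched page with a hash, treat a changed fingerprint as the "notification", and store the changed page as a new dated edition next to the earlier ones instead of overwriting them.

```python
import hashlib
import json
from pathlib import Path


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies one version of a page."""
    return hashlib.sha256(data).hexdigest()


def record_edition(archive_dir: Path, url: str, data: bytes, date: str) -> bool:
    """Store `data` as a new edition of `url` only if it differs from the latest one.

    Returns True when a new edition was written (the page changed),
    False when the content matches the most recent edition.
    The index layout is a hypothetical one chosen for this sketch.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    index_path = archive_dir / "index.json"
    index = json.loads(index_path.read_text()) if index_path.exists() else {}

    editions = index.setdefault(url, [])
    digest = content_digest(data)
    if editions and editions[-1]["sha256"] == digest:
        return False  # unchanged since the last check

    # Name the file after its digest so earlier editions are never overwritten.
    (archive_dir / f"{digest}.html").write_bytes(data)
    editions.append({"date": date, "sha256": digest})
    index_path.write_text(json.dumps(index, indent=2))
    return True
```

The page bytes could come from something like `urllib.request.urlopen(url).read()` run on a schedule, with a `True` result prompting a fresh capture of the site. One caveat: pages that embed rotating advertising or timestamps would produce a new hash on every fetch, so a real tool would probably need to ignore such parts before comparing.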

cheers,

Harry Verwayen
Area Manager

IDC Publishers Inc.
350 Fifth Avenue, Suite 1801
Empire State Building
New York, NY 10118

Tel: 212-271-5945
Fax: 212-271-5930
Toll Free: 800-757-7441
Email: harry@...
Internet: http://www.idc.nl



> -----Original Message-----
> From: archivists-admin@...
> [mailto:archivists-admin@...]On Behalf Of Electronic Information
> Systems Librarian
> Sent: Thursday, July 20, 2000 6:50 AM
> To: 'L.H. Grant'; archivists@...
> Subject: RE: [Archivists] Archivists archives
>
>
> Lee,
>
> Until your email I'd forgotten the list even existed.
>
> Perhaps just to start something I'll describe a little bit of
> what our library is trying to do in the way of internet archiving
> - and the way I do it. This may bring forth some comments - like
> - "That's crazy - why don't you.... whatever.." :) Any
> suggestions will be gratefully received.
>
> The Baha'i World Centre Library (http://library.bahai.org) is
> attempting to maintain as complete a record of the development of
> the Baha'i Community as possible. All Baha'i communities and
> publishers are requested to deposit copies of their publications
> with the Library. We also try to collect any mention of the
> Baha'i Faith in any source whatsoever.
>
> Since 1989 we have been receiving a few community or association
> newsletters by email. Until 1999, when the World Centre changed to
> PCs with NT, we were using UNIX in a character-based
> environment. Thus such newsletters were saved to sub-directories
> of UNIX. More and more such items were being received by the end
> of the decade. We used PINE to save the files - many of which we
> could not even read at the time.
>
> However, with the change to PCs and NT during 1998/9, a whole new
> world opened up.
>
> The Library was granted space to create an INTERNAL Web page.
> This has been done, with one section being the "Electronic
> Collection". To this is saved any electronically received item we
> can. At the moment there are 8 parts:
>
> 1. Annual and Convention reports of the National Baha'i Communities
> 2. Electronic Baha'i newsletters
> 3. Electronic Commercial and Scholarly journals
> 4. Miscellaneous
> 5. Online news source archive
> 6. Radio and TV Transcripts
> 7. Theses and Dissertations
> 8. Web page archives.
>
> The goal is to try to save whatever we get in a way that future
> users will be able to experience what current users experience.
> This often means going into the code or program and changing it
> so that it is viewable in the available suite of Microsoft tools.
> Thus there is some reduction of the archival integrity in favour
> of the final product looking as much as possible like it did originally.
>
> The items saved come from either emails or the Web.
>
> Within emails, the items may arrive in HTML form, as word-processed
> attachments (Word, WordPerfect, etc.), as plain ASCII text, or as some hybrid.
>
> They are all saved to the networked drive. HTML emailed files are
> then opened in Internet Explorer from the network drive and
> edited with Notepad to ensure that all the links work in the
> local environment. They are then imported into the Library's Web
> using FrontPage 2000. Non-HTML files are examined to see if they
> can be opened by Internet Explorer with some semblance of their
> original look. Word files are simply opened, but some text files
> do get messed up. I do whatever I can to preserve their
> original look as much as possible - sometimes converting them to
> HTML, sometimes not. I then import those to the Libarchive web
> site and create the links.
>
> Thus the items exist in two places - on the network drive,
> (sometimes in both the original format and a local copy but we
> have not been consistent about this), and in the Intra-web.
>
> Initially, the Library only had one Web, but after about 3 months
> of building the Web we suddenly found that all the HTML pages had
> been "themed" by FrontPage. This is sort of like taking the
> Mona Lisa and changing its background and colours!
>
> Thus a second web, Libarchive, was created in which no
> themes are used. The advantage of having saved the items on the
> network drive became apparent, as we were able to re-import them
> all. The Libarchive web has no home pages. These are all created
> on the major Library Web, get themed appropriately, then point to
> the unthemed Libarchive site.
>
> Archiving the Web items is also very challenging.
>
> For single pages, we usually can simply save the page in Internet
> Explorer, which nicely saves the associated images etc. in a
> sub-directory associated with the file. These are then imported
> into the Archives web and linked to from the index pages in the main web.
> This is mainly how we archive any reference to the Baha'i Faith
> in online news sources and other web sites.
>
> For entire web sites, I have been using a program called
> Web-stripper. However, I have not yet fully come to terms with
> it, and sometimes find myself saving either too much or too
> little. Further, updating the web page is problematic. We don't
> always want to overwrite the existing pages, since they can be
> considered an earlier edition, but we don't want to duplicate the
> entire site. Thus far we have got around it by first renaming
> the original site's home page and saving any image files associated
> with it to another part of the Web before updating the entire
> site. That way we can save a snapshot of what the site used to
> look like at different points in its history.
>
> It is understood that many of the links within a file lead to
> external pages and will degrade over time. It is also understood
> that many of the advertising links still point to
> an external site and will continue to show new and different
> advertising from what was seen originally - until such time as that
> link dies and an empty space is left.
>
> After some 6 months of experimentation the Libarchive web has some
> 7,511 files, of which (if I understand the FrontPage Site
> summary) 2,903 are pictures.
>
> At the moment all of this is only visible within our organization
> (the Baha'i World Centre) and does not exist on our Public Web
> Site. Various decisions need to be made before the collection, or
> parts of it, could be made public.
>
> So that's it in a nutshell. Does it sound anything like what
> somebody else is doing out there?
>
> warm regards,
> Bryn Deamer
> Electronic Information Systems Librarian
> Baha'i World Centre Library
> xlib@...
> http://library.bahai.org
>
>
>
> _______________________________________________
> Archivists mailing list
> Archivists@...
> http://www.archive.org/mailman/listinfo/archivists
>

Thu Jul 20, 2000 8:12 pm

harry@...
