Without knowing the technical details, the idea makes sense, but
I wonder if the Library of Congress, under their NDIIPP program,
would be the more appropriate body to oversee this?
http://www.digitalpreservation.gov/
The mission of the National Archives is comparatively narrow in
that it covers only the records of the federal Government
(broadly construed)--in itself a huge project. And I wonder how
they are dealing with this authentication issue in their ERA
project. It will be worth discussing at the Society of American
Archivists meeting--speaking as someone who is only beginning to
get involved in electronic records issues.
Thanks,
Chris
--
Christopher J. Prom
Assistant University Archivist
University of Illinois Archives
19 Library
1408 W. Gregory Dr.
Urbana, IL 61801
phone: 217.333.0798
fax: 217.333.2868
e-mail: prom@...
web: http://web.library.uiuc.edu/ahx
On Tue, 10 Jul 2007, Brad Jensen wrote:
> For those who are not aware, there is a computational procedure
> you can do for any digital file, that creates a unique number,
> called a hash, that only matches that exact file.
>
> There is a Federal standard for one hashing algorithm, called
> SHA-1, which produces a 160-bit number. More commonly used today
> is the SHA-256 hash, which generates a 256-bit number.
>
> Another term for this is 'digital thumbprint'.
>
> In the following discussion I am referring implicitly to the use
> of the SHA-256 hash.
>
> If you take a digital file 'A', and you change the order of two
> characters in the file, the hash becomes completely different.
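
A minimal Python sketch of the behavior described above, using the
standard hashlib module. The helper name thumbprint and the sample
text are illustrative assumptions, not part of the proposal.

```python
import hashlib


def thumbprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


original = b"The quick brown fox"
swapped = b"The quick borwn fox"  # two adjacent characters transposed

# The two digests share no obvious relationship even though the
# inputs differ only in the order of two characters.
print(thumbprint(original))
print(thumbprint(swapped))
```

The raw digest is 32 bytes (256 bits); the hexadecimal form printed
here is simply a readable encoding of those bytes.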
>
> For practical purposes, no two digital files will have the same
> thumbprint. You cannot predict what the thumbprint will be for a
> file without computing it, and it is computationally infeasible
> to forge or modify a file to match an existing thumbprint.
>
> There are digital time stamping services on the internet that
> register these 'thumbprints' to prove that a particular file
> existed at a particular date and time, and that it has not
> changed since.
>
> The US Postal Service offers a time stamping service for a small
> fee that they call an 'Electronic Postmark', but it is kept for
> only seven years. They also require the user to have a digital
> certificate to establish identity of the person time stamping the
> file.
>
> I propose something simpler.
>
> I propose that the National Archives create and offer a free time
> stamping service that does not require a digital certificate. The
> purpose of this is to store and retrieve unique file identifiers
> that will establish that a file existed at a certain date and
> time, and has not changed.
>
> Then files can be archived in multiple locations across a
> distributed network, and their identity and authenticity will
> remain unquestionable.
>
> This service would be a public good, similar to the digital time
> source offered by the Navy, for example.
>
> The National Archives will keep these timestamps in perpetuity.
> They would basically be entries in a database, with a 32-byte
> thumbprint, date and time. They would be a public record, so
> anyone can look up a thumbprint and know the date and time it was
> registered.
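
One way to picture the records described above is the in-memory
Python sketch below. The class TimestampRegistry and its methods are
hypothetical names chosen for illustration; a real service would sit
on a persistent database behind a web interface, and the rule that
the earliest registration wins is an assumption.

```python
import hashlib
from datetime import datetime, timezone


class TimestampRegistry:
    """Hypothetical in-memory stand-in for the proposed public database."""

    def __init__(self):
        # Maps a 32-byte SHA-256 digest to the UTC time it was first seen.
        self._entries = {}

    def register(self, digest: bytes) -> datetime:
        """Record the digest if it is new; return its registration time."""
        if len(digest) != 32:
            raise ValueError("expected a 32-byte SHA-256 digest")
        # Assumption: re-registering an existing digest keeps the
        # original timestamp rather than overwriting it.
        return self._entries.setdefault(digest, datetime.now(timezone.utc))

    def lookup(self, digest: bytes):
        """Return the registration time, or None if never registered."""
        return self._entries.get(digest)


registry = TimestampRegistry()
digest = hashlib.sha256(b"contents of some file").digest()
print(registry.register(digest).isoformat())
```

Only the digest crosses the wire in this sketch, which matches the
point that the file itself never has to be submitted.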
>
> Can others see the value of this idea?
>
> I can write the basic software for this. One part would be a
> database for the National Archives with a web XML interface for
> registering and retrieving the thumbprints.
>
> It would include a feature to thumbprint each day's database
> entries, to eliminate any possibility of human interference in
> the process. You don't have to trust anybody or even the
> institution, since the thumbprints are impossible to forge.
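
The daily thumbprint could be sketched roughly as follows in Python.
The function name daily_seal is hypothetical, and two details are
assumptions beyond what the message states: entries are sorted so
the result does not depend on storage order, and each day's seal
folds in the previous day's seal so that days are chained together.

```python
import hashlib


def daily_seal(entries, previous_seal: bytes = b"") -> bytes:
    """Hash one day's (digest, timestamp) entries into a single seal.

    entries: iterable of (32-byte digest, ISO 8601 timestamp string).
    previous_seal: the prior day's seal, or b"" for the first day.
    """
    h = hashlib.sha256()
    h.update(previous_seal)
    # Sorting makes the seal independent of the order in which the
    # database happens to return the rows.
    for digest, timestamp in sorted(entries):
        h.update(digest)
        h.update(timestamp.encode("ascii"))
    return h.digest()


day1 = [
    (hashlib.sha256(b"file A").digest(), "2007-07-10T09:00:00Z"),
    (hashlib.sha256(b"file B").digest(), "2007-07-10T14:30:00Z"),
]
seal1 = daily_seal(day1)
print(seal1.hex())
```

If a seal is published widely each day, altering any stored entry
after the fact would produce a seal that no longer matches the
published value.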
>
> The second thing would be a program, downloadable from a web
> page, to calculate and submit the thumbprint. I can write it in
> Windows, publish the source, and others could do the same for
> Linux, etc.
>
> What could it be used for? Scanned images, photographs, text
> documents, backup files, sound recordings, web pages, newspapers,
> anything that can be digitized.
>
> Since the only submission is the thumbprint and not the file,
> files can remain private yet still be authenticated later.
>
> And the processing load on the server is tiny.
>
> The alternative to having someone like the National Archives do
> it is to do it ourselves as a distributed database with
> replication across many sites and servers.
>
> I can do it myself, but this needs institutional support to last
> forever.
>
> That institution can be a formal body like the National Archives,
> or an ad hoc self-organizing one. Perhaps the latter makes sense
> in this global internet world.
>
> I think of this as the 'Forever Project' since it is the first
> thing designed to last forever.
>
> Brad Jensen
> President
> LaserVault LLC
> www.laservault.com