Archival
- J1 Lee
- Oct 8, 2023
Updated: Jun 22, 2024
Analog media (newspapers, books, documents, records, and articles) have been preserved for a very long time in libraries and archives. Because these items are fragile and can easily be destroyed or lost, people keep copies of them in order to preserve and record history. Digital media is just as susceptible to being lost. Media that disappears this way is called lost media, and it usually appears when a hosting service shuts down, leaving the public unable to access that piece of media ever again.
It is very easy to overlook the need for Internet archival, because it almost seems as if whatever is on the Internet will simply stay there forever. This is not the case, and just as with its analog counterparts, the solution is archival. The biggest organization responsible for this task is the Internet Archive, a non-profit that strives to gather everything (literally everything) that is on the Internet for the public to access. The Internet Archive is a combination of many projects, the most prominent being the “Wayback Machine”, a time machine for the Internet. The Wayback Machine works by using website crawlers, which copy the files for a website and then upload them onto servers that host these snapshots publicly. This is useful not only for reviewing lost websites, but also for reviewing deleted social media posts and content that has been removed from an active site. The archive holds at least two copies of everything, meaning that a great deal of storage is required. One copy of the archive is ninety-nine petabytes, which is equivalent to ninety-nine million gigabytes. To put this into perspective, the petabox, a server capable of hosting one petabyte, can store the equivalent of 3906 average laptops, which works out to roughly 256 gigabytes per laptop.
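The storage figures above can be checked with a little arithmetic. The following is only a rough sketch: it assumes decimal units (1 petabyte = 1,000,000 gigabytes) and a hypothetical "average laptop" with 256 GB of storage, a value chosen here because it reproduces the 3906 figure. The function and constant names are made up for illustration.

```python
# Assumption: decimal (SI) units, where 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def petabytes_to_gigabytes(petabytes):
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GB_PER_PB


def laptops_per_petabox(laptop_gb, petabox_pb=1):
    """Count how many whole laptops of `laptop_gb` fit in one petabox."""
    return int(petabytes_to_gigabytes(petabox_pb) // laptop_gb)


ARCHIVE_COPY_PB = 99  # size of one copy of the archive, from the post
LAPTOP_GB = 256       # assumption: storage of an "average laptop"

one_copy_gb = petabytes_to_gigabytes(ARCHIVE_COPY_PB)
two_copies_gb = 2 * one_copy_gb  # the archive keeps at least two copies

print(f"One copy:   {one_copy_gb:,} GB")
print(f"Two copies: {two_copies_gb:,} GB")
print(f"Laptops per petabox: {laptops_per_petabox(LAPTOP_GB)}")
```

Under these assumptions, one copy comes to 99 million gigabytes, so holding two copies means storing at least 198 million gigabytes.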
The archive does not consist only of websites; it also contains old software, a book library, and an audio library. Even very obscure media such as Tiles and Tribulations V 1.5, a PC game from 1992, is saved on the platform. Taken individually, most of the data saved on the site has little relevance and can seem almost pointless to save. At the archive's sheer scale, that is inevitable. The difference between analog media and digital media is that anyone can publish digital media with little to no effort, while producing a piece of analog media takes far more. A blog like this would have been impossible to publish, let alone archive, if it were analog, as a high school student would likely have neither the funds nor the platform to push such a piece; now, however, I can publish this for the whole world to see.
The Internet has allowed more and more people to voice their ideas and publish to the public. These minor sites and irrelevant ideas may seem unimportant on their own, but as a whole, they provide an important perspective on how humanity used the Internet. Unlike analog archival, the archival of the Internet records not only major events, but also the shifting landscape, language, and trends of the Internet. Every minor piece takes up only kilobytes, yet together they form a ninety-nine petabyte image of a development in humanity.