Project Background
Getty pursues its mission in Los Angeles and around the world through the work of its constituent programs—Getty Conservation Institute, Getty Foundation, J. Paul Getty Museum, and Getty Research Institute—serving the general interested public and a wide range of professional communities through an understanding of the visual arts.
In 2019, Getty began a website redesign project, changing the technology stack and updating the way we interact with our communities online. The legacy website contained more than 19,000 web pages, and we knew many were no longer useful or relevant and should be retired, possibly after being archived. This led us to leverage the content we’d captured using the Internet Archive’s Archive-It service.
We’d been crawling our site since 2017, but had treated the results more as a record of institutional change over time than as an archival resource to be consulted after deletion of a page. We needed to direct traffic to our Wayback Machine captures, thus ensuring deleted pages remain accessible when a user requests a deprecated URL. We decided to dynamically display a link to the archived page from our site’s 404 “Page not found” error page.
Message including the dynamically generated instructions and Internet Archive page link.
The project to audit all existing pages required us to educate content owners across the institution about web archiving practices and purpose. We developed processes for completing human reviews of large amounts of captured content. This work is described in more detail in a 2021 Digital Preservation Coalition blog post that mentions the Web Archives Collecting Policy we developed.
In this blog post we’ll discuss the work required to use the Internet Archive’s data API to add the necessary link on our 404 pages pointing to the most recent Wayback Machine capture of a deleted page.
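As a rough illustration of the lookup involved, here is a minimal sketch in Python. It assumes the public Wayback Machine availability endpoint and its `archived_snapshots.closest` JSON shape; the post does not say which Internet Archive API or language was actually used, and Archive-It collections may be served from a different endpoint. The function names and the example URL are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Wayback Machine availability endpoint. The "data API"
# mentioned in the post may refer to a different Internet Archive service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query_url(missing_url: str) -> str:
    """Build the availability query for a URL that returned a 404."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': missing_url})}"


def extract_capture_url(payload: dict) -> Optional[str]:
    """Return the reported capture URL, or None if no usable capture exists."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("url")


def find_capture(missing_url: str, timeout: float = 5.0) -> Optional[str]:
    """Query the endpoint over the network and return a capture URL, if any."""
    with urlopen(build_query_url(missing_url), timeout=timeout) as response:
        return extract_capture_url(json.load(response))
```

A 404 handler could call `find_capture` with the requested URL and render the archive link only when the result is not `None`, falling back to the standard "Page not found" message otherwise.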