
Link Expiry

In which I ruminate on the problem of old links and stale references.

We all know it. You search for an issue or topic you're interested in, click a few links, and boom: dead end. The page no longer lives there, the domain is gone, or the server ended up at the bottom of a river. Even my website is no exception.

While hypertext documents shouldn't change, we all know they can, and often do. Which is why there's such interest in tools like archive.org and the Wayback Machine. These services regularly crawl the web, and also let users submit interesting material for archiving. The Wayback Machine is frequently used to ensure a particular version of a page or site is preserved.

I started thinking about this because I read an article about strategies for linking to obsolete websites (thanks Beko Pharm). One was to run a periodic link checker to find stale or broken links on your site, optionally swapping outdated references for fresh ones, or for links into the Wayback Machine. While this is all well and good, I think it might be more useful to self-archive sites: use something like wget to pull down the document and its associated resources and host it yourself (statically), or at least provide an archive for people to download and inspect.
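If it helps make the idea concrete, here's a minimal sketch in Python of what both halves could look like: walk a folder of HTML files, flag external links that no longer answer, and shell out to wget to self-archive a page. The script layout, the archive directory name, and the choice of wget flags are all my assumptions; treat this as a starting point, not a finished tool.

#!/usr/bin/env python3
"""Sketch of a link checker with a self-archiving hook.

Scans local HTML files for external links, flags ones that no
longer resolve, and can mirror a page locally with wget.
"""

import subprocess
import sys
from html.parser import HTMLParser
from pathlib import Path
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect absolute href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith("http"):
                    self.links.append(value)


def is_alive(url, timeout=10):
    """Return True if the URL answers with a non-error status.

    Note: some servers reject HEAD requests, so a GET fallback
    would make this more robust.
    """
    req = Request(url, method="HEAD", headers={"User-Agent": "link-checker"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (HTTPError, URLError, TimeoutError):
        return False


def archive(url, dest="archive"):
    """Self-archive a page with wget: the page plus the resources it needs."""
    subprocess.run(
        ["wget", "--page-requisites", "--convert-links",
         "--adjust-extension", "--directory-prefix", dest, url],
        check=False,
    )


if __name__ == "__main__":
    site_root = Path(sys.argv[1] if len(sys.argv) > 1 else ".")
    for page in site_root.rglob("*.html"):
        parser = LinkExtractor()
        parser.feed(page.read_text(errors="replace"))
        for url in parser.links:
            if not is_alive(url):
                print(f"DEAD  {url}  (in {page})")
            # To archive everything pre-emptively, call archive(url) here.

You'd point it at your site's output directory (say, python check_links.py ./public) and wire it into cron to get the "periodic" part.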

Has anyone given this any further thought? It doesn't sound like a technically complicated project, but I'm sure someone has already trodden this path and either come to some sort of outcome or found a reason it's not worth doing.

By Nathan

Hello! My name is Nathan and I'm a technologist living and working in the south east. I love breaking crap and fixing it. I tend to break more than I fix. When I'm not breaking and fixing stuff, I'm playing games with my son or going to Disney with my family. I strongly support open source software, hardware and greater transparency in government.