If you really need the information, don't give up yet. It may be stored elsewhere, for instance in Google's cache. When Googlebot, Google's web crawler, crawls the web, it not only indexes the content, i.e., builds a database of keywords from the pages it crawls, but also saves, or "caches", a significant portion of the data it finds. You may still be able to reach otherwise inaccessible webpages through Google's cache.
For instance, I wanted information on a program called mbxtools. I performed a Google search and saw the following listed:
Flag Duplicates; MBXTools - Monochrome
Edition. The goal is the same as in Mark Duplicates tool (see Mailbox Tools for
Eudora), only this one is written in ... brana.ns.users.sbb.co.yu/software.htm - 29k
But when I clicked on "Miscellaneous Programs" to follow the link to the webpage, I received a "Cannot find server or DNS Error" page. Google's listing for the page, however, also included a link titled "Cached" at the bottom. Clicking that link took me to Google's cached version of the page, which gave me the information I was looking for.
At the top of the cached page, Google stated when it had cached the information: November 28, 2006, the last time Google had successfully indexed the page. Since I performed the search on March 23, 2007, I was seeing a copy of the page as it looked almost four months earlier.
If you have a bookmark or otherwise know the URL of a page that is no longer accessible, you can also tell Google directly to go to its cached version of the page, if one is available, by searching for cache: followed by the URL. In this case, I could put
cache:http://brana.ns.users.sbb.co.yu/software.htm
in Google's search field to have it look in its cache for the page. Alternatively, I could put http://google.com/search?q=cache: followed by the URL of the webpage I wished to visit in my browser's address field, e.g.
http://google.com/search?q=cache:http://brana.ns.users.sbb.co.yu/software.htm
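If you need to do this for more than a page or two, the address can be assembled in a few lines of Python. This is only a minimal sketch of the URL pattern described above; the function name google_cache_url is made up for illustration, and the percent-encoding step is an added precaution for URLs that contain characters such as & or #, not something Google's interface is known to require.

```python
from urllib.parse import quote

# Search address taken from the example above.
GOOGLE_SEARCH = "http://google.com/search?q="


def google_cache_url(page_url: str) -> str:
    """Build a Google search URL that asks for the cached copy of page_url.

    Hypothetical helper: it only assembles the string, it does not
    contact Google or check that a cached copy exists.
    """
    # Leave ":" and "/" readable, but percent-encode characters such as
    # "&", "?" and "#" so they are not parsed as part of the outer URL.
    return GOOGLE_SEARCH + "cache:" + quote(page_url, safe=":/")


print(google_cache_url("http://brana.ns.users.sbb.co.yu/software.htm"))
```

For the page in this example, the result is the same address shown above, which can be pasted into the browser's address field.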
A page won't be cached if Google never indexed it or if the web page designer included code in his or her page telling Google not to cache the page.
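One common form of such code is a robots meta tag carrying the noarchive directive, e.g. <meta name="robots" content="noarchive">. The sketch below, which assumes that is the mechanism a page uses, checks a page's HTML for that tag with Python's standard html.parser module. The class and function names are illustrative, and opt-outs sent another way (for example in an HTTP response header) are not detected.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a robots meta tag includes the noarchive directive."""

    def __init__(self):
        super().__init__()
        self.noarchive = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        # "robots" applies to all crawlers, "googlebot" to Google only.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noarchive" in directives:
                self.noarchive = True


def page_opts_out_of_caching(html: str) -> bool:
    """Return True if the HTML contains a noarchive robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noarchive
```

If this returns True for a page you saved before it disappeared, that would be one explanation for the absence of a "Cached" link in the search results.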
Some website developers have even used Google's cache to recover information lost from their sites after hardware problems with their web servers, as related in Cache at the End of His Rainbow.
Another place you can search for saved copies of webpages is the Internet Archive, aka the Wayback Machine. When I went there and put in the URL for the webpage I wanted to access, I found it had copies saved from many dates, starting on October 14, 2002 and ending on August 20, 2006.
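Saved copies at the Wayback Machine can also be addressed directly. The sketch below assumes the address pattern the site uses for snapshots, https://web.archive.org/web/ followed by a date stamp and the original URL. That pattern is not described above, and the function name wayback_url is illustrative. As far as I know, the Wayback Machine shows the capture nearest the requested date when there is none for that exact day.

```python
from datetime import date


def wayback_url(page_url: str, when: date) -> str:
    """Build a Wayback Machine address for page_url near the given date.

    Hypothetical helper: it only formats the address and does not check
    that the Internet Archive actually holds a copy.
    """
    # The date stamp is written as YYYYMMDD, e.g. 20060820.
    return f"https://web.archive.org/web/{when:%Y%m%d}/{page_url}"


# The most recent copy I found was dated August 20, 2006.
print(wayback_url("http://brana.ns.users.sbb.co.yu/software.htm", date(2006, 8, 20)))
```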
References: