The GPO has released a white paper on the results of its recently completed “Web Harvesting Pilot Project” and is requesting comments from the public. The comment period has been extended until Friday, Feb. 8, apparently because the GPO isn’t getting any response, so the folks at Free Government Information are urging people to take the time to look at the project and comment.
What is the project? Many publications issued by federal agencies are not being included in the Federal Depository Library Program (FDLP), which distributes all government documents. These documents have come to be known as “fugitive publications.” With increasing frequency, federal agencies are publishing content only in electronic formats, and they frequently fail to inform GPO of these new publications for inclusion in the FDLP. In light of the large number of publications that have become fugitive, GPO is seeking Web crawler and other technologies that can identify and harvest fugitive documents and publications from agency Web sites.
A summary of the results of the pilot project (they used the EPA website as the subject of the pilot) is available here (PDF), and the simple online comment form is here on the GPO website.