The Web Archiving Project (WARP) is a National Diet Library (NDL) initiative that preserves Japanese web resources and offers a one-stop search across free databases and archived websites. As of January 2008, WARP held 2,100 websites and 1,500 e-journal titles, letting researchers retrieve web pages that are no longer online, such as the site of the Japanese Organizing Committee for the 2002 FIFA World Cup Korea/Japan. This presentation, given at the NCC 2008 Open Meeting at the Hyatt Regency Hotel in Atlanta on April 3, 2008, outlines WARP's capabilities and its permission-based harvesting, under which stakeholders can limit the extent of harvesting and access.
“WARP: Web Archiving Project”
JSIST Class 2007
Kenichiro Shimada
Gordon W. Prange Collection, University of Maryland Libraries
NCC 2008 Open Meeting
Hyatt Regency Hotel, Atlanta, April 3, 2008
WARP: NDL’s Research Tools for a Type of Web Resources
• One-stop search for free databases and archived web sites
• Alternative web research tools for
  - Obsolete web pages
WARP: Web Archiving Project
http://warp.ndl.go.jp/
• 2,100 websites and 1,500 e-journal titles (Jan. 2008)
• Coverage: national government, prefectures, designated cities, merged municipalities, foundations, organizations, universities, events, e-journals
Web Archiving Project (Cont.)
• Harvesting web resources by robot
  - Collects the HTML front page and the pages linked from it
  - Cannot collect deep web pages, e.g. pages generated through CGI (Common Gateway Interface)
  - http://warp.ndl.go.jp/ft_WARP_Mechanism.pdf
• Permission-based
• Stakeholders can specify (limit) the extent of web harvesting and access
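The harvesting behavior described on this slide can be illustrated with a minimal Python sketch. It is not WARP's actual robot: the names `LinkCollector`, `is_harvestable`, and `linked_pages` are hypothetical, and the rule used to recognize "deep web" links (a query string, a `/cgi-bin/` path, or a `.cgi` suffix) is an assumption made only for illustration. Given the HTML of a front page, it lists the same-site linked pages a simple robot could follow and skips the CGI-style links it could not collect.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on one page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_harvestable(url, site_host):
    """Assumed rule: same host, and no sign of a CGI-generated page."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    if parts.netloc != site_host:
        return False
    # Treat query strings and cgi paths as "deep web" (an assumption).
    if parts.query or "/cgi-bin/" in parts.path or parts.path.endswith(".cgi"):
        return False
    return True


def linked_pages(front_page_url, html):
    """Return the static same-site pages linked from the front page."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(front_page_url).netloc
    pages = []
    for href in collector.hrefs:
        absolute = urljoin(front_page_url, href)
        if is_harvestable(absolute, host) and absolute not in pages:
            pages.append(absolute)
    return pages


if __name__ == "__main__":
    sample = (
        '<a href="about.html">About</a>'
        '<a href="/cgi-bin/search.cgi?q=x">Search</a>'
        '<a href="http://other.example/x.html">Elsewhere</a>'
    )
    print(linked_pages("http://www.example.org/index.html", sample))
```

A permission-based service would additionally restrict this list to whatever scope each stakeholder has agreed to; that step is omitted here because the slides do not describe how it is implemented.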
Advantages: WARP
• WARP
  - Can retrieve obsolete web pages (currently not available)
    e.g. Website of the “Japanese Organizing Committee for the 2002 FIFA World Cup Korea/Japan”
    Original URL: http://www.jawoc.or.jp/index.html
Notes: WARP
• WARP
  - Not all government bodies have given consent
    e.g. Prefectures (33 out of 47), ordinance-designated cities (13 out of 17)
  - Stakeholders can specify the extent of harvesting and access
  - Check the date of data harvesting