Heritrix
Introduction
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
Documentation
An Introduction To Heritrix
Links
http://crawler.archive.org/
http://archive-access.sourceforge.net/
http://en.wikipedia.org/wiki/Heritrix/
http://download.www.opendocs.net/heritrix/