Package org.archive.crawler.processor

Class Summary
BeanShellProcessor
    A processor which runs a BeanShell script on the CrawlURI.
CrawlMapper
    A simple crawl splitter/mapper, dividing up CandidateURIs/CrawlURIs between crawlers by diverting some range of URIs to local log files (which can then be imported to other crawlers).
HashCrawlMapper
    Maps URIs to one of N crawler names by applying a hash to the URI's (possibly-transformed) classKey.
LexicalCrawlMapper
    A simple crawl splitter/mapper, dividing up CandidateURIs/CrawlURIs between crawlers by diverting some range of URIs to local log files (which can then be imported to other crawlers).
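
The hash-based mapping described for HashCrawlMapper can be illustrated with a small standalone sketch. This is not the Heritrix implementation: the class name HashMapperSketch, the method mapToCrawler, the crawler names, and the use of String.hashCode() are all assumptions chosen for illustration, and the real processor may use a different hash function and key transformation. The sketch only shows the general idea that a given classKey always lands on the same one of N crawler names.

```java
import java.util.Arrays;
import java.util.List;

public class HashMapperSketch {

    /**
     * Hypothetical helper (not part of Heritrix): picks one of the N
     * crawler names for a classKey by hashing the key and reducing the
     * hash modulo N. String.hashCode() is an assumed stand-in hash.
     */
    static String mapToCrawler(String classKey, List<String> crawlerNames) {
        // floorMod keeps the bucket index non-negative even when the
        // hash code itself is negative.
        int bucket = Math.floorMod(classKey.hashCode(), crawlerNames.size());
        return crawlerNames.get(bucket);
    }

    public static void main(String[] args) {
        // Assumed crawler names and local identity, for illustration only.
        List<String> crawlers = Arrays.asList("crawler0", "crawler1", "crawler2");
        String localName = "crawler0";

        for (String classKey : new String[] {"example.com", "archive.org", "example.org"}) {
            String target = mapToCrawler(classKey, crawlers);
            // A URI mapped to another crawler would be diverted (for
            // example, written to a local log for later import) rather
            // than crawled locally.
            String action = target.equals(localName) ? "crawl locally" : "divert to " + target;
            System.out.println(classKey + " -> " + action);
        }
    }
}
```

Because the mapping depends only on the key and the list of names, every crawler that is given the same list reaches the same decision for a given URI without any coordination. Changing the number of names changes most assignments, which is a known trade-off of plain modulo hashing.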
 



Copyright © 2003-2011 Internet Archive. All Rights Reserved.