Classes in org.archive.crawler.datamodel used by org.archive.crawler.admin | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.datamodel | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlHost
Represents a single remote "host". |
|
CrawlServer
Represents a single remote "server". |
|
CrawlSubstats
Collector of statistics for a 'subset' of a crawl, such as a server (host:port), host, or frontier group (e.g., queue). |
|
CrawlSubstats.HasCrawlSubstats
|
|
CrawlSubstats.Stage
|
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
CredentialStore
Front door to the credential store. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
|
RobotsDirectives
Represents the directives that apply to a user-agent (or set of user-agents). |
|
RobotsExclusionPolicy
RobotsExclusionPolicy represents the actual policy adopted with respect to a specific remote server, usually constructed by consulting the robots.txt file, if any, that the server provided. |
|
RobotsHonoringPolicy
RobotsHonoringPolicy represents the strategy used by the crawler for determining how robots.txt files will be honored. |
|
Robotstxt
Utility class for parsing 'robots.txt' format directives and representing them as a list of named user-agents and a map from user-agents to RobotsDirectives. |
|
UriUniqFilter.HasUriReceiver
URIs that have not been seen before 'visit' this 'Visitor'. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.datamodel.credential | |
---|---|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.deciderules | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.deciderules.recrawl | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.event | |
---|---|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.extractor | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.fetcher | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlHost
Represents a single remote "host". |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
|
ServerCache
Server and Host cache. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.filter | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.framework | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
Checkpoint
Record of a specific checkpoint on disk. |
|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlOrder
Represents the 'root' of the settings hierarchy. |
|
CrawlSubstats.HasCrawlSubstats
|
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
|
ServerCache
Server and Host cache. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.frontier | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlServer
Represents a single remote "server". |
|
CrawlSubstats
Collector of statistics for a 'subset' of a crawl, such as a server (host:port), host, or frontier group (e.g., queue). |
|
CrawlSubstats.HasCrawlSubstats
|
|
CrawlSubstats.Stage
|
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
|
UriUniqFilter
A UriUniqFilter passes URI objects to a destination (receiver) if the passed URI object has not been previously seen. |
|
UriUniqFilter.HasUriReceiver
URIs that have not been seen before 'visit' this 'Visitor'. |
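
The UriUniqFilter and UriUniqFilter.HasUriReceiver descriptions above amount to an "already seen" filter that forwards only first-time URIs to a receiver. The following is a minimal sketch of that idea, not the Heritrix implementation: the names `SeenFilterSketch`, `SeenFilter`, `Receiver`, `add`, and `count` are hypothetical, URIs are plain strings, and the seen-set is an in-memory `HashSet` chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SeenFilterSketch {

    /** Hypothetical receiver: gets each URI the first time it is seen. */
    interface Receiver {
        void receive(String uri);
    }

    /** Hypothetical in-memory filter; this is NOT the Heritrix UriUniqFilter interface. */
    static final class SeenFilter {
        private final Set<String> seen = new HashSet<>();
        private final Receiver destination;

        SeenFilter(Receiver destination) {
            this.destination = destination;
        }

        /** Forwards the URI to the receiver only if it has not been added before. */
        void add(String uri) {
            if (seen.add(uri)) { // Set.add returns false for a duplicate
                destination.receive(uri);
            }
        }

        /** Number of distinct URIs seen so far. */
        long count() {
            return seen.size();
        }
    }

    public static void main(String[] args) {
        List<String> scheduled = new ArrayList<>();
        SeenFilter filter = new SeenFilter(scheduled::add);

        filter.add("http://example.com/");
        filter.add("http://example.com/a");
        filter.add("http://example.com/"); // duplicate, not forwarded

        System.out.println(scheduled);
    }
}
```

In this sketch the receiver is a callback, so the code that decides what to do with a new URI (for example, scheduling it) stays separate from the bookkeeping of which URIs have been seen.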
Classes in org.archive.crawler.datamodel used by org.archive.crawler.io | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.postprocessor | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.prefetch | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlSubstats.HasCrawlSubstats
|
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.processor | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.processor.recrawl | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.scope | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.selftest | |
---|---|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.settings | |
---|---|
CrawlOrder
Represents the 'root' of the settings hierarchy. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.url | |
---|---|
CrawlOrder
Represents the 'root' of the settings hierarchy. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.util | |
---|---|
CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
UriUniqFilter
A UriUniqFilter passes URI objects to a destination (receiver) if the passed URI object has not been previously seen. |
|
UriUniqFilter.HasUriReceiver
URIs that have not been seen before 'visit' this 'Visitor'. |
Classes in org.archive.crawler.datamodel used by org.archive.crawler.writer | |
---|---|
CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network/internal/out-of-band conditions occur. |