org.archive.crawler.datamodel
Interface FetchStatusCodes

All Known Implementing Classes:
AbstractFrontier, AdaptiveRevisitFrontier, ARCWriterProcessor, BdbFrontier, BeanShellProcessor, CrawlMapper, CrawlServer, CrawlStateUpdater, CrawlSubstats, CrawlURI, DomainSensitiveFrontier, FetchDNS, FetchFTP, FetchHTTP, FrontierScheduler, HashCrawlMapper, LexicalCrawlMapper, LinksScoper, PreconditionEnforcer, Preselector, QuotaEnforcer, RuntimeLimitEnforcer, ToeThread, WARCWriterProcessor, WorkQueueFrontier, WriterPoolProcessor

public interface FetchStatusCodes

Constant flag codes to be used in lieu of per-protocol codes (such as HTTP's 200 or 404) when network, internal, or out-of-band conditions occur. The URISelector may use such codes, along with user-configured options, to determine whether, when, and how many times a CrawlURI might be reattempted.
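
As a rough illustration of how a selector might combine these codes with a user-configured retry limit, consider the sketch below. It is not Heritrix code: the class `RetryPolicySketch`, its methods, the choice of which codes count as transient, and the numeric values are all assumptions made for the example. The real values are listed on the Constant Field Values page.

```java
/**
 * Hypothetical retry policy keyed on out-of-band fetch status codes.
 * Not part of Heritrix; names and values here are illustrative only.
 */
final class RetryPolicySketch {
    // Placeholder values, NOT the real FetchStatusCodes constants.
    static final int S_CONNECT_FAILED = -1001;
    static final int S_CONNECT_LOST = -1002;
    static final int S_TIMEOUT = -1003;
    static final int S_ROBOTS_PRECLUDED = -1004;

    /** Assumption: only connection-level failures are treated as transient. */
    static boolean isTransient(int status) {
        switch (status) {
            case S_CONNECT_FAILED:
            case S_CONNECT_LOST:
            case S_TIMEOUT:
                return true;
            default:
                return false;
        }
    }

    /** Retry only transient failures, and only while under the configured limit. */
    static boolean shouldRetry(int status, int attemptsSoFar, int maxRetries) {
        return isTransient(status) && attemptsSoFar < maxRetries;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(S_TIMEOUT, 1, 3));
        System.out.println(shouldRetry(S_ROBOTS_PRECLUDED, 0, 3));
    }
}
```

Keeping the classification (`isTransient`) separate from the limit check means the user-configured part stays a plain parameter rather than being baked into the code table.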

Author:
gojomo

Field Summary
static int S_BLOCKED_BY_CUSTOM_PROCESSOR
          Blocked by custom prefetcher processor.
static int S_BLOCKED_BY_QUOTA
          Blocked due to exceeding an established quota.
static int S_BLOCKED_BY_RUNTIME_LIMIT
          Blocked due to exceeding an established runtime.
static int S_BLOCKED_BY_USER
          blocked from fetch by user setting.
static int S_CONNECT_FAILED
          HTTP connect failed
static int S_CONNECT_LOST
          HTTP connect broken
static int S_DEEMED_CHAFF
          'chaff' detection of traps/content of negligible value applied
static int S_DEEMED_NOT_FOUND
          synthetic status, used when some other status (such as connection-lost) is considered by policy the same as a document-not-found
static int S_DEFERRED
          temporary status assigned URIs awaiting preconditions; appearance in logs is a bug
static int S_DELETED_BY_USER
          deleted from frontier by user
static int S_DNS_SUCCESS
          DNS success
static int S_DOMAIN_PREREQUISITE_FAILURE
          DNS prerequisite failed, precluding attempt
static int S_DOMAIN_UNRESOLVABLE
          DNS lookup failed
static int S_GETBYNAME_SUCCESS
          InetAddress.getByName success
static int S_OTHER_PREREQUISITE_FAILURE
          Other prerequisite failed, precluding attempt
static int S_OUT_OF_SCOPE
          out-of-scope upon reexamination (only when scope changes during crawl)
static int S_PREREQUISITE_UNSCHEDULABLE_FAILURE
          Prerequisite could not be scheduled, precluding attempt
static int S_PROCESSING_THREAD_KILLED
          Processing thread was killed
static int S_ROBOTS_PRECLUDED
          robots rules precluded fetch
static int S_ROBOTS_PREREQUISITE_FAILURE
          Robots prerequisite failed, precluding attempt
static int S_RUNTIME_EXCEPTION
          Unexpected runtime exception; see runtime-errors.log
static int S_SERIOUS_ERROR
          severe Java 'Error' conditions (OutOfMemoryError, StackOverflowError, etc.) during URI processing
static int S_TIMEOUT
          HTTP timeout (before any meaningful response received)
static int S_TOO_MANY_EMBED_HOPS
          overstepped embed/trans hops
static int S_TOO_MANY_LINK_HOPS
          overstepped link hops
static int S_TOO_MANY_RETRIES
          multiple retries all failed
static int S_UNATTEMPTED
          fetch never tried (perhaps protocol unsupported or illegal URI)
static int S_UNFETCHABLE_URI
          URI recognized as unsupported or illegal
static int S_UNQUEUEABLE
          URI could not be queued in Frontier; when URIs are properly filtered for format, should never occur
 

Field Detail

S_UNATTEMPTED

static final int S_UNATTEMPTED
fetch never tried (perhaps protocol unsupported or illegal URI)

See Also:
Constant Field Values

S_DOMAIN_UNRESOLVABLE

static final int S_DOMAIN_UNRESOLVABLE
DNS lookup failed

See Also:
Constant Field Values

S_CONNECT_FAILED

static final int S_CONNECT_FAILED
HTTP connect failed

See Also:
Constant Field Values

S_CONNECT_LOST

static final int S_CONNECT_LOST
HTTP connect broken

See Also:
Constant Field Values

S_TIMEOUT

static final int S_TIMEOUT
HTTP timeout (before any meaningful response received)

See Also:
Constant Field Values

S_RUNTIME_EXCEPTION

static final int S_RUNTIME_EXCEPTION
Unexpected runtime exception; see runtime-errors.log

See Also:
Constant Field Values

S_DOMAIN_PREREQUISITE_FAILURE

static final int S_DOMAIN_PREREQUISITE_FAILURE
DNS prerequisite failed, precluding attempt

See Also:
Constant Field Values

S_UNFETCHABLE_URI

static final int S_UNFETCHABLE_URI
URI recognized as unsupported or illegal

See Also:
Constant Field Values

S_TOO_MANY_RETRIES

static final int S_TOO_MANY_RETRIES
multiple retries all failed

See Also:
Constant Field Values

S_DEFERRED

static final int S_DEFERRED
temporary status assigned URIs awaiting preconditions; appearance in logs is a bug

See Also:
Constant Field Values

S_UNQUEUEABLE

static final int S_UNQUEUEABLE
URI could not be queued in Frontier; when URIs are properly filtered for format, should never occur

See Also:
Constant Field Values

S_ROBOTS_PREREQUISITE_FAILURE

static final int S_ROBOTS_PREREQUISITE_FAILURE
Robots prerequisite failed, precluding attempt

See Also:
Constant Field Values

S_OTHER_PREREQUISITE_FAILURE

static final int S_OTHER_PREREQUISITE_FAILURE
Other prerequisite failed, precluding attempt

See Also:
Constant Field Values

S_PREREQUISITE_UNSCHEDULABLE_FAILURE

static final int S_PREREQUISITE_UNSCHEDULABLE_FAILURE
Prerequisite could not be scheduled, precluding attempt

See Also:
Constant Field Values

S_DEEMED_NOT_FOUND

static final int S_DEEMED_NOT_FOUND
synthetic status, used when some other status (such as connection-lost) is considered by policy the same as a document-not-found

See Also:
Constant Field Values
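
The policy-driven substitution described for S_DEEMED_NOT_FOUND could look roughly like the following. This is a minimal sketch, not the crawler's implementation: `DeemedNotFoundSketch`, `applyPolicy`, and the boolean switch are hypothetical, and the constant values are placeholders rather than the real ones.

```java
/**
 * Hypothetical mapping of a raw status onto the synthetic not-found status.
 * Not part of Heritrix; names and values here are illustrative only.
 */
final class DeemedNotFoundSketch {
    // Placeholder values, NOT the real FetchStatusCodes constants.
    static final int S_CONNECT_LOST = -2001;
    static final int S_TIMEOUT = -2002;
    static final int S_DEEMED_NOT_FOUND = -2003;

    /**
     * Returns S_DEEMED_NOT_FOUND when the (assumed) policy switch says a lost
     * connection counts as document-not-found; otherwise returns the status unchanged.
     */
    static int applyPolicy(int status, boolean treatConnectLostAsNotFound) {
        if (treatConnectLostAsNotFound && status == S_CONNECT_LOST) {
            return S_DEEMED_NOT_FOUND;
        }
        return status;
    }

    public static void main(String[] args) {
        System.out.println(applyPolicy(S_CONNECT_LOST, true) == S_DEEMED_NOT_FOUND);
        System.out.println(applyPolicy(S_TIMEOUT, true) == S_TIMEOUT);
    }
}
```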

S_SERIOUS_ERROR

static final int S_SERIOUS_ERROR
severe Java 'Error' conditions (OutOfMemoryError, StackOverflowError, etc.) during URI processing

See Also:
Constant Field Values
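
To show how S_SERIOUS_ERROR differs from S_RUNTIME_EXCEPTION, the sketch below maps the two Throwable families onto separate codes. It is an assumption about how a processing loop might assign them, not a description of ToeThread: `ErrorMappingSketch`, `process`, `STATUS_OK`, and all numeric values are hypothetical.

```java
/**
 * Hypothetical wrapper that turns failures of a processing step into status codes.
 * Not part of Heritrix; names and values here are illustrative only.
 */
final class ErrorMappingSketch {
    // Placeholder values, NOT the real FetchStatusCodes constants.
    static final int STATUS_OK = 200;
    static final int S_RUNTIME_EXCEPTION = -3001;
    static final int S_SERIOUS_ERROR = -3002;

    /** Runs one step and reports which kind of failure, if any, occurred. */
    static int process(Runnable step) {
        try {
            step.run();
            return STATUS_OK;
        } catch (RuntimeException e) {
            // Unexpected but recoverable: the JVM itself is still healthy.
            return S_RUNTIME_EXCEPTION;
        } catch (Error e) {
            // java.lang.Error subclasses such as StackOverflowError.
            return S_SERIOUS_ERROR;
        }
    }

    public static void main(String[] args) {
        System.out.println(process(() -> { throw new IllegalStateException("boom"); }));
        System.out.println(process(() -> { throw new StackOverflowError(); }));
    }
}
```

Catching `Error` is normally discouraged in application code; it is shown here only because the field description says such conditions are recorded as a status during URI processing.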

S_DEEMED_CHAFF

static final int S_DEEMED_CHAFF
'chaff' detection of traps/content of negligible value applied

See Also:
Constant Field Values

S_TOO_MANY_LINK_HOPS

static final int S_TOO_MANY_LINK_HOPS
overstepped link hops

See Also:
Constant Field Values

S_TOO_MANY_EMBED_HOPS

static final int S_TOO_MANY_EMBED_HOPS
overstepped embed/trans hops

See Also:
Constant Field Values

S_OUT_OF_SCOPE

static final int S_OUT_OF_SCOPE
out-of-scope upon reexamination (only when scope changes during crawl)

See Also:
Constant Field Values

S_BLOCKED_BY_USER

static final int S_BLOCKED_BY_USER
blocked from fetch by user setting.

See Also:
Constant Field Values

S_BLOCKED_BY_CUSTOM_PROCESSOR

static final int S_BLOCKED_BY_CUSTOM_PROCESSOR
Blocked by custom prefetcher processor. A check against scope or against filters in a custom prefetch processor ruled that the CrawlURI should not be crawled. TODO: Add to documentation and help page.

See Also:
Constant Field Values

S_BLOCKED_BY_QUOTA

static final int S_BLOCKED_BY_QUOTA
Blocked due to exceeding an established quota. TODO: Add to documentation and help page.

See Also:
Constant Field Values

S_BLOCKED_BY_RUNTIME_LIMIT

static final int S_BLOCKED_BY_RUNTIME_LIMIT
Blocked due to exceeding an established runtime. TODO: Add to documentation and help page.

See Also:
Constant Field Values

S_DELETED_BY_USER

static final int S_DELETED_BY_USER
deleted from frontier by user

See Also:
Constant Field Values

S_PROCESSING_THREAD_KILLED

static final int S_PROCESSING_THREAD_KILLED
Processing thread was killed

See Also:
Constant Field Values

S_ROBOTS_PRECLUDED

static final int S_ROBOTS_PRECLUDED
robots rules precluded fetch

See Also:
Constant Field Values

S_DNS_SUCCESS

static final int S_DNS_SUCCESS
DNS success

See Also:
Constant Field Values

S_GETBYNAME_SUCCESS

static final int S_GETBYNAME_SUCCESS
InetAddress.getByName success

See Also:
Constant Field Values


Copyright © 2003-2011 Internet Archive. All Rights Reserved.