org.archive.crawler.datamodel
Class CrawlSubstats
java.lang.Object
org.archive.crawler.datamodel.CrawlSubstats
- All Implemented Interfaces:
- java.io.Serializable, FetchStatusCodes
public class CrawlSubstats
- extends java.lang.Object
- implements java.io.Serializable, FetchStatusCodes
Collector of statistics for a 'subset' of a crawl,
such as a server (host:port), host, or frontier group
(e.g., a queue).
- Author:
- gojomo
- See Also:
- Serialized Form
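As a rough illustration of collecting statistics per crawl subset, the sketch below keeps one small counter object per host in a map. It does not use Heritrix itself: `PerHostStatsSketch`, `Counters`, `recordSuccess`, and `forHost` are hypothetical names chosen for this example, and only the counter names `fetchSuccesses` and `successBytes` are borrowed from this class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: these types are not part of Heritrix.
final class PerHostStatsSketch {

    // Minimal stand-in for a per-subset collector such as CrawlSubstats.
    static final class Counters {
        long fetchSuccesses;
        long successBytes;
    }

    // One collector per subset key; here the key is assumed to be a host name.
    private final Map<String, Counters> byHost = new ConcurrentHashMap<>();

    // Look up (or lazily create) the collector for a host, then update it.
    void recordSuccess(String host, long bytes) {
        Counters c = byHost.computeIfAbsent(host, h -> new Counters());
        synchronized (c) {
            c.fetchSuccesses++;
            c.successBytes += bytes;
        }
    }

    // Returns the collector for a host, or null if nothing was recorded for it.
    Counters forHost(String host) {
        return byHost.get(host);
    }

    public static void main(String[] args) {
        PerHostStatsSketch stats = new PerHostStatsSketch();
        stats.recordSuccess("example.org", 1200);
        stats.recordSuccess("example.org", 800);
        Counters c = stats.forHost("example.org");
        System.out.println(c.fetchSuccesses + " fetches, " + c.successBytes + " bytes");
    }
}
```

The same pattern would apply with a server (host:port) or a frontier queue name as the map key.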
Fields inherited from interface org.archive.crawler.datamodel.FetchStatusCodes
S_BLOCKED_BY_CUSTOM_PROCESSOR, S_BLOCKED_BY_QUOTA, S_BLOCKED_BY_RUNTIME_LIMIT, S_BLOCKED_BY_USER, S_CONNECT_FAILED, S_CONNECT_LOST, S_DEEMED_CHAFF, S_DEEMED_NOT_FOUND, S_DEFERRED, S_DELETED_BY_USER, S_DNS_SUCCESS, S_DOMAIN_PREREQUISITE_FAILURE, S_DOMAIN_UNRESOLVABLE, S_GETBYNAME_SUCCESS, S_OTHER_PREREQUISITE_FAILURE, S_OUT_OF_SCOPE, S_PREREQUISITE_UNSCHEDULABLE_FAILURE, S_PROCESSING_THREAD_KILLED, S_ROBOTS_PRECLUDED, S_ROBOTS_PREREQUISITE_FAILURE, S_RUNTIME_EXCEPTION, S_SERIOUS_ERROR, S_TIMEOUT, S_TOO_MANY_EMBED_HOPS, S_TOO_MANY_LINK_HOPS, S_TOO_MANY_RETRIES, S_UNATTEMPTED, S_UNFETCHABLE_URI, S_UNQUEUEABLE
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
totalScheduled
long totalScheduled
fetchSuccesses
long fetchSuccesses
fetchFailures
long fetchFailures
fetchDisregards
long fetchDisregards
fetchResponses
long fetchResponses
robotsDenials
long robotsDenials
successBytes
long successBytes
totalBytes
long totalBytes
fetchNonResponses
long fetchNonResponses
novelBytes
long novelBytes
novelUrls
long novelUrls
notModifiedBytes
long notModifiedBytes
notModifiedUrls
long notModifiedUrls
dupByHashBytes
long dupByHashBytes
dupByHashUrls
long dupByHashUrls
CrawlSubstats
public CrawlSubstats()
tally
public void tally(CrawlURI curi,
CrawlSubstats.Stage stage)
- Examines the CrawlURI and, based on its status and internal values,
updates the tallies.
- Parameters:
curi - the CrawlURI to examine
stage - the stage being tallied
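The sketch below illustrates the general shape of a stage-driven tally. It is not the Heritrix implementation: the Javadoc names `CrawlSubstats.Stage` but does not list its constants, so the enum values here are assumed, and the real method takes a `CrawlURI` where this hypothetical version takes only a byte count. Which counters each stage touches is likewise an assumption.

```java
// Hypothetical stand-in for illustration; not the Heritrix class.
final class TallySketch {

    // Assumed stage constants; the Javadoc does not enumerate them.
    enum Stage { SCHEDULED, SUCCEEDED, FAILED, DISREGARDED }

    long totalScheduled;
    long fetchSuccesses;
    long fetchFailures;
    long fetchDisregards;
    long successBytes;

    // Assumed signature: a byte count replaces the CrawlURI argument.
    synchronized void tally(long contentBytes, Stage stage) {
        switch (stage) {
            case SCHEDULED:
                totalScheduled++;
                break;
            case SUCCEEDED:
                fetchSuccesses++;
                successBytes += contentBytes;
                break;
            case FAILED:
                fetchFailures++;
                break;
            case DISREGARDED:
                fetchDisregards++;
                break;
        }
    }

    public static void main(String[] args) {
        TallySketch stats = new TallySketch();
        stats.tally(0, Stage.SCHEDULED);
        stats.tally(0, Stage.SCHEDULED);
        stats.tally(1500, Stage.SUCCEEDED);
        stats.tally(0, Stage.FAILED);
        System.out.println("scheduled=" + stats.totalScheduled
                + " successes=" + stats.fetchSuccesses
                + " failures=" + stats.fetchFailures
                + " successBytes=" + stats.successBytes);
    }
}
```

Marking the method `synchronized` is a design choice of this sketch, made so that several worker threads could update one collector safely; the Javadoc does not say how the real class handles concurrency.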
getFetchSuccesses
public long getFetchSuccesses()
getFetchResponses
public long getFetchResponses()
getSuccessBytes
public long getSuccessBytes()
getTotalBytes
public long getTotalBytes()
getFetchNonResponses
public long getFetchNonResponses()
getTotalScheduled
public long getTotalScheduled()
getFetchDisregards
public long getFetchDisregards()
getRobotsDenials
public long getRobotsDenials()
getRemaining
public long getRemaining()
getRecordedFinishes
public long getRecordedFinishes()
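The Javadoc does not describe how `getRemaining()` and `getRecordedFinishes()` are derived. One plausible reading, offered here only as an assumption, is that finishes are successes plus failures, and that remaining is the scheduled total minus every URI already disposed of. The hypothetical `DerivedCountsSketch` class below works through that arithmetic using the field names listed above.

```java
// Hypothetical illustration of derived counts; the formulas are assumptions,
// not confirmed by the Javadoc.
final class DerivedCountsSketch {
    long totalScheduled;
    long fetchSuccesses;
    long fetchFailures;
    long fetchDisregards;

    // Assumed: a "recorded finish" is a URI that ended in success or failure.
    long recordedFinishes() {
        return fetchSuccesses + fetchFailures;
    }

    // Assumed: remaining = scheduled minus everything already disposed of.
    long remaining() {
        return totalScheduled - (fetchSuccesses + fetchFailures + fetchDisregards);
    }

    public static void main(String[] args) {
        DerivedCountsSketch s = new DerivedCountsSketch();
        s.totalScheduled = 10;
        s.fetchSuccesses = 4;
        s.fetchFailures = 2;
        s.fetchDisregards = 1;
        System.out.println("finishes=" + s.recordedFinishes()
                + " remaining=" + s.remaining());
    }
}
```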
getNovelBytes
public long getNovelBytes()
getNovelUrls
public long getNovelUrls()
getNotModifiedBytes
public long getNotModifiedBytes()
getNotModifiedUrls
public long getNotModifiedUrls()
getDupByHashBytes
public long getDupByHashBytes()
getDupByHashUrls
public long getDupByHashUrls()
Copyright © 2003-2011 Internet Archive. All Rights Reserved.