java.lang.Object
  javax.management.Attribute
    org.archive.crawler.settings.Type
      org.archive.crawler.settings.ComplexType
        org.archive.crawler.settings.ModuleType
          org.archive.crawler.framework.Processor
            org.archive.crawler.postprocessor.WaitEvaluator
public class WaitEvaluator
A processor that determines when a URI should be revisited next. It does not account for DNS and robots.txt expiration; the Frontiers should handle those separately.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType |
---|
ComplexType.MBeanAttributeInfoIterator |
Field Summary | |
---|---|
static java.lang.String |
ATTR_CHANGED_FACTOR
Factor by which the wait is decreased when the document has changed. |
static java.lang.String |
ATTR_DEFAULT_WAIT_INTERVAL
Fixed wait time for 'unknown' change status. |
static java.lang.String |
ATTR_INITIAL_WAIT_INTERVAL
Default wait time after initial visit. |
static java.lang.String |
ATTR_MAX_WAIT_INTERVAL
Maximum wait between visits |
static java.lang.String |
ATTR_MIN_WAIT_INTERVAL
Minimum wait between visits |
static java.lang.String |
ATTR_UNCHANGED_FACTOR
Factor by which the wait is increased when the document is unchanged. |
static java.lang.String |
ATTR_USE_OVERDUE_TIME
Indicates if the amount of time the URI was overdue should be added to the wait time before the new wait time is calculated. |
protected static java.lang.Double |
DEFAULT_CHANGED_FACTOR
|
protected static java.lang.Long |
DEFAULT_DEFAULT_WAIT_INTERVAL
|
protected static java.lang.Long |
DEFAULT_INITIAL_WAIT_INTERVAL
|
protected static java.lang.Long |
DEFAULT_MAX_WAIT_INTERVAL
|
protected static java.lang.Long |
DEFAULT_MIN_WAIT_INTERVAL
|
protected static java.lang.Double |
DEFAULT_UNCHANGED_FACTOR
|
protected static java.lang.Boolean |
DEFAULT_USE_OVERDUE_TIME
|
(package private) java.util.logging.Logger |
logger
|
Fields inherited from class org.archive.crawler.framework.Processor |
---|
ATTR_DECIDE_RULES, ATTR_ENABLED, attrDecideRules |
Fields inherited from class org.archive.crawler.settings.ComplexType |
---|
definition, definitionMap |
Fields inherited from interface org.archive.crawler.frontier.AdaptiveRevisitAttributeConstants |
---|
A_CONTENT_STATE_KEY, A_DISCARD_REVISIT, A_FETCH_OVERDUE, A_LAST_CONTENT_DIGEST, A_LAST_DATESTAMP, A_LAST_ETAG, A_NUMBER_OF_VERSIONS, A_NUMBER_OF_VISITS, A_TIME_OF_NEXT_PROCESSING, A_WAIT_INTERVAL, A_WAIT_REEVALUATED, CONTENT_CHANGED, CONTENT_UNCHANGED, CONTENT_UNKNOWN |
Constructor Summary | |
---|---|
WaitEvaluator(java.lang.String name)
Constructor |
|
WaitEvaluator(java.lang.String name,
java.lang.String description,
java.lang.Long default_inital_wait_interval,
java.lang.Long default_max_wait_interval,
java.lang.Long default_min_wait_interval,
java.lang.Double default_unchanged_factor,
java.lang.Double default_changed_factor)
Constructor |
Method Summary | |
---|---|
protected void |
innerProcess(CrawlURI curi)
Classes subclassing this one should override this method to perform their custom actions on the CrawlURI. |
Methods inherited from class org.archive.crawler.framework.Processor |
---|
checkForInterrupt, finalTasks, getController, getDecideRule, getDefaultNextProcessor, initialTasks, innerRejectProcess, isContentToProcess, isEnabled, isExpectedMimeType, isHttpTransactionContentToProcess, kickUpdate, process, report, rulesAccept, rulesAccept, setDefaultNextProcessor, spawn |
Methods inherited from class org.archive.crawler.settings.ModuleType |
---|
addElement, listUsedFiles |
Methods inherited from class org.archive.crawler.settings.Type |
---|
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
Methods inherited from class javax.management.Attribute |
---|
getName, hashCode |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
java.util.logging.Logger logger
public static final java.lang.String ATTR_INITIAL_WAIT_INTERVAL
protected static final java.lang.Long DEFAULT_INITIAL_WAIT_INTERVAL
public static final java.lang.String ATTR_MAX_WAIT_INTERVAL
protected static final java.lang.Long DEFAULT_MAX_WAIT_INTERVAL
public static final java.lang.String ATTR_MIN_WAIT_INTERVAL
protected static final java.lang.Long DEFAULT_MIN_WAIT_INTERVAL
public static final java.lang.String ATTR_UNCHANGED_FACTOR
protected static final java.lang.Double DEFAULT_UNCHANGED_FACTOR
public static final java.lang.String ATTR_CHANGED_FACTOR
protected static final java.lang.Double DEFAULT_CHANGED_FACTOR
public static final java.lang.String ATTR_DEFAULT_WAIT_INTERVAL
protected static final java.lang.Long DEFAULT_DEFAULT_WAIT_INTERVAL
public static final java.lang.String ATTR_USE_OVERDUE_TIME
protected static final java.lang.Boolean DEFAULT_USE_OVERDUE_TIME
Constructor Detail |
---|
public WaitEvaluator(java.lang.String name)
name - The name of the module
public WaitEvaluator(java.lang.String name, java.lang.String description, java.lang.Long default_inital_wait_interval, java.lang.Long default_max_wait_interval, java.lang.Long default_min_wait_interval, java.lang.Double default_unchanged_factor, java.lang.Double default_changed_factor)
name - The name of the module
description - Description of the module
default_inital_wait_interval - The default value for initial wait time
default_max_wait_interval - The maximum value for wait time
default_min_wait_interval - The minimum value for wait time
default_unchanged_factor - The factor for changing wait times of unchanged documents (will be multiplied by this value)
default_changed_factor - The factor for changing wait times of changed documents (will be divided by this value)
Method Detail |
---|
protected void innerProcess(CrawlURI curi) throws java.lang.InterruptedException
Overrides: innerProcess in class Processor
Parameters: curi - The CrawlURI being processed.
Throws: java.lang.InterruptedException
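The field and constructor descriptions on this page suggest how a new wait interval might be derived: the wait of an unchanged document is multiplied by the unchanged factor, the wait of a changed document is divided by the changed factor, a fixed wait applies to an unknown change status, and the result stays between the minimum and maximum wait. The following is a minimal standalone sketch of that arithmetic, not the actual body of innerProcess. The class name WaitIntervalSketch, the enum ContentState, the method nextWait, the millisecond unit, the rounding, and the choice to return the default wait without clamping are all assumptions made for illustration.

```java
// Standalone sketch; it does not use or reproduce any Heritrix classes.
final class WaitIntervalSketch {

    /** Hypothetical stand-in for the changed / unchanged / unknown content states. */
    enum ContentState { CHANGED, UNCHANGED, UNKNOWN }

    /**
     * Computes a next wait interval from the current one.
     * All durations are assumed to share one unit (for example milliseconds).
     */
    static long nextWait(long currentWait, long overdue, boolean useOverdueTime,
                         ContentState state, double unchangedFactor, double changedFactor,
                         long minWait, long maxWait, long defaultWait) {
        if (state == ContentState.UNKNOWN) {
            // Fixed wait for an unknown change status.
            // Assumption: returned as-is, without clamping.
            return defaultWait;
        }
        // Optionally add the overdue time before the new wait is calculated.
        long base = useOverdueTime ? currentWait + overdue : currentWait;
        double scaled = (state == ContentState.UNCHANGED)
                ? base * unchangedFactor   // unchanged: multiply by the factor
                : base / changedFactor;    // changed: divide by the factor
        // Keep the result within [minWait, maxWait].
        long rounded = Math.round(scaled);
        return Math.max(minWait, Math.min(maxWait, rounded));
    }

    public static void main(String[] args) {
        long hour = 3_600_000L; // one hour in milliseconds
        long next = nextWait(24 * hour, 0L, false, ContentState.UNCHANGED,
                1.5, 1.5, hour, 672 * hour, 72 * hour);
        System.out.println("next wait in ms: " + next);
    }
}
```

With both factors greater than 1, repeated unchanged visits lengthen the wait until it reaches the maximum, while changed visits shorten it toward the minimum. The real defaults are not stated on this page, so the values passed in main are placeholders only.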