org.archive.crawler.filter
Class HTTPMidFetchUnchangedFilter
java.lang.Object
javax.management.Attribute
org.archive.crawler.settings.Type
org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ModuleType
org.archive.crawler.framework.Filter
org.archive.crawler.filter.HTTPMidFetchUnchangedFilter
- All Implemented Interfaces:
- java.io.Serializable, javax.management.DynamicMBean, CoreAttributeConstants, AdaptiveRevisitAttributeConstants
public class HTTPMidFetchUnchangedFilter
- extends Filter
- implements AdaptiveRevisitAttributeConstants
A mid-fetch filter for HTTP fetcher processors. It evaluates the HTTP
response headers to predict whether the document has changed since it last
passed through this filter. It does this by comparing the Last-Modified and
ETag values with the values stored during the last processing of the URI.
If both values are present, they must agree on predicting no change;
otherwise a change is predicted (return true).
If only one of the values is present, it alone is used to predict whether a
change has occurred.
If neither value is present, the filter returns true (predicts a change).
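The decision rules above can be illustrated with a minimal, standalone sketch. It is not the Heritrix implementation: the class name `HeaderChangeSketch`, the methods `compare` and `predictsChange`, and the numeric state values are hypothetical, and the sketch assumes a header counts as "present" only when both the current and the stored value are available.

```java
/** Hypothetical illustration of the prediction rules; not the Heritrix source. */
class HeaderChangeSketch {
    // Assumed state values, chosen for this sketch only.
    static final int MISSING = -1;
    static final int UNCHANGED = 0;
    static final int CHANGED = 1;

    /** Compares one header value against the value stored from the previous visit. */
    static int compare(String current, String previous) {
        if (current == null || previous == null) {
            return MISSING; // assumption: nothing to compare means "missing"
        }
        return current.equals(previous) ? UNCHANGED : CHANGED;
    }

    /** Returns true when a change is predicted, false when no change is predicted. */
    static boolean predictsChange(String lastModified, String prevLastModified,
                                  String etag, String prevEtag) {
        int dateState = compare(lastModified, prevLastModified);
        int etagState = compare(etag, prevEtag);

        if (dateState == MISSING && etagState == MISSING) {
            return true; // neither value present: predict change
        }
        if (dateState == MISSING) {
            return etagState == CHANGED; // only the etag is usable
        }
        if (etagState == MISSING) {
            return dateState == CHANGED; // only last-modified is usable
        }
        // Both present: they must agree on "unchanged" to predict no change.
        return !(dateState == UNCHANGED && etagState == UNCHANGED);
    }

    public static void main(String[] args) {
        // Both headers match the stored values, so no change is predicted.
        System.out.println(predictsChange("d1", "d1", "\"abc\"", "\"abc\"")); // prints false
    }
}
```

In this sketch a disagreement between the two headers is resolved conservatively toward "changed", matching the rule that both values must agree before no change is predicted.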
- Author:
- Kristinn Sigurdsson
- See Also:
- Serialized Form
Fields inherited from interface org.archive.crawler.frontier.AdaptiveRevisitAttributeConstants |
A_CONTENT_STATE_KEY, A_DISCARD_REVISIT, A_FETCH_OVERDUE, A_LAST_CONTENT_DIGEST, A_LAST_DATESTAMP, A_LAST_ETAG, A_NUMBER_OF_VERSIONS, A_NUMBER_OF_VISITS, A_TIME_OF_NEXT_PROCESSING, A_WAIT_INTERVAL, A_WAIT_REEVALUATED, CONTENT_CHANGED, CONTENT_UNCHANGED, CONTENT_UNKNOWN |
Fields inherited from interface org.archive.crawler.datamodel.CoreAttributeConstants |
A_ANNOTATIONS, A_CONTENT_DIGEST, A_CONTENT_TYPE, A_CREDENTIAL_AVATARS_KEY, A_DELAY_FACTOR, A_DISTANCE_FROM_SEED, A_DNS_FETCH_TIME, A_DNS_SERVER_IP_LABEL, A_ETAG_HEADER, A_FETCH_BEGAN_TIME, A_FETCH_COMPLETED_TIME, A_FETCH_HISTORY, A_FORCE_RETIRE, A_FTP_CONTROL_CONVERSATION, A_FTP_FETCH_STATUS, A_HERITABLE_KEYS, A_HTML_BASE, A_HTTP_BIND_ADDRESS, A_HTTP_PROXY_HOST, A_HTTP_PROXY_PORT, A_HTTP_TRANSACTION, A_LAST_MODIFIED_HEADER, A_LOCALIZED_ERRORS, A_META_ROBOTS, A_MINIMUM_DELAY, A_MIRROR_PATH, A_PREREQUISITE_URI, A_REFERENCE_LENGTH, A_RETRY_DELAY, A_RRECORD_SET_LABEL, A_RUNTIME_EXCEPTION, A_SOURCE_TAG, A_STATUS, A_WRITTEN_TO_WARC, HEADER_TRUNC, LENGTH_TRUNC, TIMER_TRUNC, TRUNC_SUFFIX |
Method Summary |
protected boolean |
innerAccepts(java.lang.Object o)
Classes subclassing this one should override this method to perform
their custom determination of whether or not to accept the object given to it. |
Methods inherited from class org.archive.crawler.settings.ComplexType |
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, unsetAttribute |
Methods inherited from class org.archive.crawler.settings.Type |
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
Methods inherited from class javax.management.Attribute |
getName, hashCode |
Methods inherited from class java.lang.Object |
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
HEADER_PREDICTS_MISSING
public static final int HEADER_PREDICTS_MISSING
- See Also:
- Constant Field Values
HEADER_PREDICTS_UNCHANGED
public static final int HEADER_PREDICTS_UNCHANGED
- See Also:
- Constant Field Values
HEADER_PREDICTS_CHANGED
public static final int HEADER_PREDICTS_CHANGED
- See Also:
- Constant Field Values
HTTPMidFetchUnchangedFilter
public HTTPMidFetchUnchangedFilter(java.lang.String name)
- Constructor
- Parameters:
name
- Module name
HTTPMidFetchUnchangedFilter
public HTTPMidFetchUnchangedFilter(java.lang.String name,
java.lang.String description)
- Constructor
- Parameters:
name
- Module name
description
- A description of the module's functions
innerAccepts
protected boolean innerAccepts(java.lang.Object o)
- Description copied from class:
Filter
- Classes subclassing this one should override this method to perform
their custom determination of whether or not to accept the object given to it.
- Overrides:
innerAccepts
in class Filter
- Parameters:
o
- The object
- Returns:
- True if it passes the filter.
Copyright © 2003-2011 Internet Archive. All Rights Reserved.