org.archive.crawler.framework
Class Filter

java.lang.Object
  extended by javax.management.Attribute
      extended by org.archive.crawler.settings.Type
          extended by org.archive.crawler.settings.ComplexType
              extended by org.archive.crawler.settings.ModuleType
                  extended by org.archive.crawler.framework.Filter
All Implemented Interfaces:
java.io.Serializable, javax.management.DynamicMBean
Direct Known Subclasses:
CrawlScope, DecidingFilter, HopsFilter, HTTPMidFetchUnchangedFilter, OrFilter, PathDepthFilter, SurtPrefixFilter, TransclusionFilter, URIListRegExpFilter, URIRegExpFilter

public class Filter
extends ModuleType

Base class for filter classes.

Several classes allow 'filters' to be applied to them. Filters are classes that, based on an arbitrary object passed to them, return a boolean stating whether it passes the filter. Applying filters can thus affect the behavior of those classes. This class provides the basic framework for filters. All detailed filter implementations inherit from it, and it is itself considered to be a 'null' filter (always returns true).

Author:
Gordon Mohr
See Also:
Processor, Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType
ComplexType.MBeanAttributeInfoIterator
 
Field Summary
static java.lang.String ATTR_ENABLED
           
 
Fields inherited from class org.archive.crawler.settings.ComplexType
definition, definitionMap
 
Constructor Summary
Filter(java.lang.String name)
          Creates a new 'null' filter.
Filter(java.lang.String name, java.lang.String description)
          Creates a new 'null' filter.
 
Method Summary
 boolean accepts(java.lang.Object o)
           
protected  boolean getFilterOffPosition(CrawlURI curi)
          If the filter is disabled, the value returned by this method is what filters return as their disabled setting.
protected  boolean innerAccepts(java.lang.Object o)
          Classes subclassing this one should override this method to perform their custom determination of whether or not the object given to it passes the filter.
 void kickUpdate()
           
protected  boolean returnTrueIfMatches(CrawlURI curi)
          Checks to see if filter functionality should be inverted for this curi.
 java.lang.String toString()
           
 
Methods inherited from class org.archive.crawler.settings.ModuleType
addElement, listUsedFiles
 
Methods inherited from class org.archive.crawler.settings.ComplexType
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, unsetAttribute
 
Methods inherited from class org.archive.crawler.settings.Type
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient
 
Methods inherited from class javax.management.Attribute
getName, hashCode
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

ATTR_ENABLED

public static final java.lang.String ATTR_ENABLED
See Also:
Constant Field Values
Constructor Detail

Filter

public Filter(java.lang.String name,
              java.lang.String description)
Creates a new 'null' filter.

Parameters:
name - the name of the filter.
description - a description of the filter suitable for showing in the user interface.

Filter

public Filter(java.lang.String name)
Creates a new 'null' filter.

Parameters:
name - the name of the filter.
Method Detail

accepts

public boolean accepts(java.lang.Object o)

getFilterOffPosition

protected boolean getFilterOffPosition(CrawlURI curi)
If the filter is disabled, the value returned by this method is what filters return as their disabled setting. The default is to return 'true' so that processing continues, but some filters (the exclude filters, for example) will want to return false when disabled so that processing can continue.

Parameters:
curi - CrawlURI to use as context. Passed curi can be null.
Returns:
This filter's 'off' position.
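
The disabled-filter behaviour described above can be sketched as follows. This is a minimal, self-contained illustration and not Heritrix source: OffPositionSketch, ExcludeStyleSketch, the enabled field and setEnabled are hypothetical stand-ins, and the plain boolean field is only an assumed substitute for whatever the ATTR_ENABLED setting controls in the real class.

```java
// Hypothetical stand-in for Filter; models only the enabled/off-position logic.
class OffPositionSketch {
    // Assumption: a plain field stands in for the ATTR_ENABLED setting.
    private boolean enabled = true;

    public void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    // Value reported while the filter is disabled; the default is true.
    protected boolean getFilterOffPosition(Object curi) {
        return true;
    }

    // 'Null' filter behaviour: everything passes.
    protected boolean innerAccepts(Object o) {
        return true;
    }

    public boolean accepts(Object o) {
        if (!enabled) {
            return getFilterOffPosition(o);
        }
        return innerAccepts(o);
    }
}

// Hypothetical exclude-style filter: a true result means "exclude this URI",
// so its off position is false (a disabled exclude filter excludes nothing).
class ExcludeStyleSketch extends OffPositionSketch {
    @Override
    protected boolean getFilterOffPosition(Object curi) {
        return false;
    }
}
```

Under these assumptions, a disabled OffPositionSketch still reports true for any object, while a disabled ExcludeStyleSketch reports false.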

returnTrueIfMatches

protected boolean returnTrueIfMatches(CrawlURI curi)
Checks to see if filter functionality should be inverted for this curi.

All filters will by default return true if the curi is accepted by the filter. If this method returns false, then the filter will return true if the curi doesn't match.

Classes extending this class should override this method with appropriate code.

Parameters:
curi - Current CrawlURI
Returns:
true for default behaviour, false otherwise.
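
The inversion described above amounts to comparing the match result with this flag. The sketch below shows one plausible way the two values could be combined; it is an assumption for illustration, not a copy of the accepts implementation. InversionSketch, its constructor arguments, and the substring test used as the match are all hypothetical.

```java
// Hypothetical stand-in showing how an inversion flag can flip a match result.
class InversionSketch {
    private final String needle;
    private final boolean trueIfMatches;

    InversionSketch(String needle, boolean trueIfMatches) {
        this.needle = needle;
        this.trueIfMatches = trueIfMatches;
    }

    // true: default behaviour; false: the filter's result is inverted.
    protected boolean returnTrueIfMatches(Object curi) {
        return trueIfMatches;
    }

    // Illustrative match: the object's string form contains the needle.
    protected boolean innerAccepts(Object o) {
        return String.valueOf(o).contains(needle);
    }

    // Assumed combination: the result is true when the match outcome
    // equals the value of returnTrueIfMatches.
    public boolean accepts(Object o) {
        return returnTrueIfMatches(o) == innerAccepts(o);
    }
}
```

With the flag set to true the sketch accepts objects that match; with the flag set to false it accepts exactly the objects that do not match.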

innerAccepts

protected boolean innerAccepts(java.lang.Object o)
Classes subclassing this one should override this method to perform their custom determination of whether or not the object given to it passes the filter.

Parameters:
o - The object
Returns:
True if it passes the filter.
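
A subclass overriding this hook might look like the sketch below. To stay self-contained it does not extend the real Filter: NullFilterSketch is a hypothetical stand-in for the base class, and RegExpFilterSketch (with its regex constructor argument) is an invented example of a custom determination, not one of the subclasses listed for this class.

```java
// Hypothetical stand-in for the base class: a 'null' filter that passes everything.
class NullFilterSketch {
    protected boolean innerAccepts(Object o) {
        return true;
    }

    public boolean accepts(Object o) {
        return innerAccepts(o);
    }
}

// Hypothetical subclass: passes objects whose string form fully matches a regex.
class RegExpFilterSketch extends NullFilterSketch {
    private final java.util.regex.Pattern pattern;

    RegExpFilterSketch(String regex) {
        this.pattern = java.util.regex.Pattern.compile(regex);
    }

    @Override
    protected boolean innerAccepts(Object o) {
        // The argument is an arbitrary object, so guard against null
        // and compare against its string form.
        return o != null && pattern.matcher(o.toString()).matches();
    }
}
```

Callers keep using accepts; only the overridden innerAccepts changes what passes the filter.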

toString

public java.lang.String toString()
Overrides:
toString in class ComplexType

kickUpdate

public void kickUpdate()


Copyright © 2003-2011 Internet Archive. All Rights Reserved.