org.archive.crawler.scope
Class SurtPrefixScope
java.lang.Object
javax.management.Attribute
org.archive.crawler.settings.Type
org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ModuleType
org.archive.crawler.framework.Filter
org.archive.crawler.framework.CrawlScope
org.archive.crawler.scope.ClassicScope
org.archive.crawler.scope.RefinedScope
org.archive.crawler.scope.SurtPrefixScope
- All Implemented Interfaces:
- java.io.Serializable, javax.management.DynamicMBean
Deprecated. As of release 1.10.0, replaced by DecidingScope.
public class SurtPrefixScope
- extends RefinedScope
A specialized CrawlScope suitable for the most common crawl needs.
Roughly, as with other existing CrawlScope variants, SurtPrefixScope's logic
is that a URI is included if:
( ( isSeed(uri) || focusFilter.accepts(uri) ) ||
transitiveFilter.accepts(uri) ) && ! excludeFilter.accepts(uri)
Specifically, SurtPrefixScope uses a SurtFilter to test for focus-inclusion.
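The inclusion rule above can be sketched in plain Java. This is a minimal illustration rather than the Heritrix implementation: the class `ScopeRuleSketch`, its fields, and the use of `java.util.function.Predicate` over plain strings are all assumptions made for the example.

```java
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from Heritrix.
final class ScopeRuleSketch {
    private final Set<String> seeds;            // stands in for isSeed(uri)
    private final Predicate<String> focus;      // stands in for focusFilter
    private final Predicate<String> transitive; // stands in for transitiveFilter
    private final Predicate<String> exclude;    // stands in for excludeFilter

    ScopeRuleSketch(Set<String> seeds,
                    Predicate<String> focus,
                    Predicate<String> transitive,
                    Predicate<String> exclude) {
        this.seeds = seeds;
        this.focus = focus;
        this.transitive = transitive;
        this.exclude = exclude;
    }

    // A URI is in scope if any inclusion test passes and the exclusion test does not.
    boolean accepts(String uri) {
        boolean included = seeds.contains(uri)
                || focus.test(uri)
                || transitive.test(uri);
        return included && !exclude.test(uri);
    }
}
```

Note that the exclusion test has the final say: a seed that also matches the exclude predicate is rejected.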
- Author:
- gojomo
- See Also:
- Serialized Form
Method Summary
protected boolean focusAccepts(java.lang.Object object)
- Deprecated. Check if a URI is part of this scope.
void initialize(CrawlController controller)
- Deprecated. Initialize is called just before the crawler starts to run.
void kickUpdate()
- Deprecated. Re-read prefixes after an update.
Methods inherited from class org.archive.crawler.framework.CrawlScope
addSeed, addSeedListener, checkClose, getSeedfile, isSameHost, isSeed, listUsedFiles, refreshSeeds, seedsIterator, seedsIterator, toString
Methods inherited from class org.archive.crawler.settings.ComplexType
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, unsetAttribute
Methods inherited from class org.archive.crawler.settings.Type
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient
Methods inherited from class javax.management.Attribute
getName, hashCode
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
ATTR_SURTS_SOURCE_FILE
public static final java.lang.String ATTR_SURTS_SOURCE_FILE
- Deprecated.
- See Also:
- Constant Field Values
ATTR_SEEDS_AS_SURT_PREFIXES
public static final java.lang.String ATTR_SEEDS_AS_SURT_PREFIXES
- Deprecated.
- See Also:
- Constant Field Values
ATTR_SURTS_DUMP_FILE
public static final java.lang.String ATTR_SURTS_DUMP_FILE
- Deprecated.
- See Also:
- Constant Field Values
ATTR_ALSO_CHECK_VIA
public static final java.lang.String ATTR_ALSO_CHECK_VIA
- Deprecated.
- Whether the 'via' of a CrawlURI should also be checked
to see if it begins with one of the SURT prefixes.
- See Also:
- Constant Field Values
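As a rough illustration of what the "also check via" setting implies, the sketch below accepts a URI when either its own SURT form or, optionally, the SURT form of its 'via' begins with a configured prefix. The class `ViaCheckSketch`, its method names, and the representation of SURT forms as plain strings are assumptions for the example, not the Heritrix API.

```java
import java.util.List;

// Hypothetical sketch; SURT forms are passed in as ready-made strings.
final class ViaCheckSketch {
    private final List<String> surtPrefixes;
    private final boolean alsoCheckVia;

    ViaCheckSketch(List<String> surtPrefixes, boolean alsoCheckVia) {
        this.surtPrefixes = surtPrefixes;
        this.alsoCheckVia = alsoCheckVia;
    }

    // True if the given SURT-form string starts with any configured prefix.
    private boolean hasKnownPrefix(String surt) {
        if (surt == null) {
            return false;
        }
        for (String prefix : surtPrefixes) {
            if (surt.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // viaSurt may be null, for example for a seed with no referring URI.
    boolean focusAccepts(String uriSurt, String viaSurt) {
        if (hasKnownPrefix(uriSurt)) {
            return true;
        }
        return alsoCheckVia && hasKnownPrefix(viaSurt);
    }
}
```

With the flag enabled, a URI outside the prefix set is still accepted when the URI it was discovered from lies inside it, which widens the scope by one hop.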
DEFAULT_ALSO_CHECK_VIA
public static final java.lang.Boolean DEFAULT_ALSO_CHECK_VIA
- Deprecated.
surtPrefixes
SurtPrefixSet surtPrefixes
- Deprecated.
SurtPrefixScope
public SurtPrefixScope(java.lang.String name)
- Deprecated.
initialize
public void initialize(CrawlController controller)
- Deprecated.
- Description copied from class: CrawlScope
- Initialize is called just before the crawler starts to run.
The settings system is up and initialized at this point, so it can be used.
This initialization happens after ComplexType.earlyInitialize(CrawlerSettings).
- Overrides:
- initialize in class CrawlScope
- Parameters:
- controller - Controller object.
focusAccepts
protected boolean focusAccepts(java.lang.Object object)
- Deprecated.
- Check if a URI is part of this scope.
- Overrides:
- focusAccepts in class ClassicScope
- Parameters:
- object - An instance of UURI or of CandidateURI.
- Returns:
- True if the focus filter accepts the passed object.
kickUpdate
public void kickUpdate()
- Deprecated.
- Re-read prefixes after an update.
- Overrides:
- kickUpdate in class ClassicScope
- See Also:
- CrawlScope.kickUpdate()
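One way to picture "re-read prefixes after an update" is a holder that rebuilds its prefix list from a source whenever it is kicked. The sketch below is an assumption-laden illustration: `ReloadablePrefixes` and the `Supplier`-based source are hypothetical, and the real class reads its prefixes through the Heritrix settings system rather than a supplier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a prefix holder that can be refreshed on demand.
final class ReloadablePrefixes {
    private final Supplier<List<String>> source; // assumed stand-in for the configured prefix source
    private volatile List<String> current;       // snapshot used for matching

    ReloadablePrefixes(Supplier<List<String>> source) {
        this.source = source;
        this.current = snapshot();
    }

    // Copy the source so later changes are invisible until the next kickUpdate().
    private List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(source.get()));
    }

    // Re-read the prefixes and swap in the new snapshot.
    void kickUpdate() {
        current = snapshot();
    }

    boolean matches(String surt) {
        for (String prefix : current) {
            if (surt.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

Taking a snapshot means a changed source has no effect on matching until `kickUpdate()` is called, which mirrors the explicit re-read step described above.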
Copyright © 2003-2011 Internet Archive. All Rights Reserved.