org.archive.crawler.scope
Class PathScope
java.lang.Object
javax.management.Attribute
org.archive.crawler.settings.Type
org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ModuleType
org.archive.crawler.framework.Filter
org.archive.crawler.framework.CrawlScope
org.archive.crawler.scope.ClassicScope
org.archive.crawler.scope.SeedCachingScope
org.archive.crawler.scope.PathScope
- All Implemented Interfaces:
- java.io.Serializable, javax.management.DynamicMBean
Deprecated. As of release 1.10.0. Replaced by DecidingScope.
public class PathScope
- extends SeedCachingScope
A core CrawlScope suitable for the most common
crawl needs.
Roughly, its logic is that a URI is included if:
(( isSeed(uri) || focusFilter.accepts(uri) )
|| transitiveFilter.accepts(uri) )
&& ! excludeFilter.accepts(uri)
The focusFilter may be specified by either:
- adding a 'mode' attribute to the scope element. mode="broad" is equivalent to no focus; modes "path", "host", and "domain" imply a SeedExtensionFilter will be used, with the scope element providing its configuration
- adding a focus subelement
If unspecified, the focusFilter will default to an accepts-all filter.
The transitiveFilter may be specified by supplying a transitive subelement. If unspecified, a TransclusionFilter will be used, with the scope element providing its configuration.
The excludeFilter may be specified by supplying an exclude subelement. If unspecified, an accepts-none filter will be used, meaning that no URI matches the exclude filter and therefore no URI is excluded.
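The inclusion rule and the defaults described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Heritrix implementation: the class name ScopeRuleSketch, the inScope method, the use of java.util.function.Predicate in place of Filter, and the example URIs are all hypothetical. The transitive filter here is a simple stand-in and does not model TransclusionFilter.

```java
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical stand-in for the rule quoted in the class description.
// Predicate<String> replaces Heritrix's Filter type for illustration only.
public class ScopeRuleSketch {

    // Default focus described above: accepts every URI.
    static final Predicate<String> ACCEPT_ALL = uri -> true;

    // Default exclude described above: accepts no URI, so nothing is excluded.
    static final Predicate<String> ACCEPT_NONE = uri -> false;

    // ((isSeed || focus) || transitive) && !exclude
    static boolean inScope(String uri,
                           Set<String> seeds,
                           Predicate<String> focusFilter,
                           Predicate<String> transitiveFilter,
                           Predicate<String> excludeFilter) {
        boolean isSeed = seeds.contains(uri);
        return ((isSeed || focusFilter.test(uri)) || transitiveFilter.test(uri))
                && !excludeFilter.test(uri);
    }

    public static void main(String[] args) {
        Set<String> seeds = Set.of("http://example.com/docs/");

        // Assumed path-style focus: stay under the seed's path prefix.
        Predicate<String> pathFocus = uri -> uri.startsWith("http://example.com/docs/");
        // Assumed exclude rule: skip zip archives.
        Predicate<String> noZips = uri -> uri.endsWith(".zip");

        // Under the focus and not excluded.
        System.out.println(inScope("http://example.com/docs/guide.html",
                seeds, pathFocus, ACCEPT_NONE, noZips));
        // Under the focus but matched by the exclude filter.
        System.out.println(inScope("http://example.com/docs/archive.zip",
                seeds, pathFocus, ACCEPT_NONE, noZips));
        // With the default focus and default exclude, any URI is in scope.
        System.out.println(inScope("http://other.example.org/",
                seeds, ACCEPT_ALL, ACCEPT_NONE, ACCEPT_NONE));
    }
}
```

Because the exclude check is applied last with a logical AND, an exclude match overrides seed status, focus, and transitive acceptance alike.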
- Author:
- gojomo
- See Also:
- Serialized Form
Constructor Summary
PathScope(java.lang.String name)
Deprecated.
Method Summary
protected boolean additionalFocusAccepts(java.lang.Object o)
Deprecated. Check if URI is accepted by the additional focus of this scope.
protected boolean focusAccepts(java.lang.Object o)
Deprecated. Check if URI is accepted by the focus of this scope.
protected boolean transitiveAccepts(java.lang.Object o)
Deprecated.
Methods inherited from class org.archive.crawler.settings.ComplexType
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, unsetAttribute
Methods inherited from class org.archive.crawler.settings.Type
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient
Methods inherited from class javax.management.Attribute
getName, hashCode
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
ATTR_TRANSITIVE_FILTER
public static final java.lang.String ATTR_TRANSITIVE_FILTER
- Deprecated.
- See Also:
- Constant Field Values
ATTR_ADDITIONAL_FOCUS_FILTER
public static final java.lang.String ATTR_ADDITIONAL_FOCUS_FILTER
- Deprecated.
- See Also:
- Constant Field Values
additionalFocusFilter
Filter additionalFocusFilter
- Deprecated.
transitiveFilter
Filter transitiveFilter
- Deprecated.
PathScope
public PathScope(java.lang.String name)
- Deprecated.
transitiveAccepts
protected boolean transitiveAccepts(java.lang.Object o)
- Deprecated.
- Overrides:
- transitiveAccepts in class ClassicScope
- Parameters:
- o - the URI to check.
- Returns:
- True if the transitive filter accepts the passed object.
focusAccepts
protected boolean focusAccepts(java.lang.Object o)
- Deprecated.
- Description copied from class: ClassicScope
- Check if URI is accepted by the focus of this scope. This method should be overridden in subclasses.
- Overrides:
- focusAccepts in class ClassicScope
- Parameters:
- o - the URI to check.
- Returns:
- True if the focus filter accepts the passed object.
additionalFocusAccepts
protected boolean additionalFocusAccepts(java.lang.Object o)
- Deprecated.
- Description copied from class: ClassicScope
- Check if URI is accepted by the additional focus of this scope. This method should be overridden in subclasses.
- Overrides:
- additionalFocusAccepts in class ClassicScope
- Parameters:
- o - the URI to check.
- Returns:
- True if the additional focus filter accepts the passed object.
Copyright © 2003-2011 Internet Archive. All Rights Reserved.