org.archive.crawler.deciderules
Class ScopePlusOneDecideRule

java.lang.Object
  extended by javax.management.Attribute
      extended by org.archive.crawler.settings.Type
          extended by org.archive.crawler.settings.ComplexType
              extended by org.archive.crawler.settings.ModuleType
                  extended by org.archive.crawler.deciderules.DecideRule
                      extended by org.archive.crawler.deciderules.ConfiguredDecideRule
                          extended by org.archive.crawler.deciderules.PredicatedDecideRule
                              extended by org.archive.crawler.deciderules.SurtPrefixedDecideRule
                                  extended by org.archive.crawler.deciderules.ScopePlusOneDecideRule
All Implemented Interfaces:
java.io.Serializable, javax.management.DynamicMBean, SeedListener

public class ScopePlusOneDecideRule
extends SurtPrefixedDecideRule

Rule that allows one level of discovery beyond the configured scope. For example, with domain scope it accepts the first otherwise out-of-scope link found on an in-scope page, but not further hops from that out-of-scope page.
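
The "scope plus one" behavior described above can be illustrated with a minimal, self-contained sketch. This is not the Heritrix implementation: the class ScopePlusOneSketch, its accepts method, and the use of a plain set of host names in place of a SurtPrefixSet are all assumptions made for illustration.

```java
import java.net.URI;
import java.util.Set;

// Hypothetical illustration only; not part of org.archive.crawler.deciderules.
final class ScopePlusOneSketch {
    // Stand-in for the configured scope (the real rule uses a SURT prefix set).
    private final Set<String> inScopeHosts;

    ScopePlusOneSketch(Set<String> inScopeHosts) {
        this.inScopeHosts = inScopeHosts;
    }

    // True when the URI's host is one of the configured in-scope hosts.
    private boolean inScope(String uri) {
        if (uri == null) {
            return false;
        }
        String host = URI.create(uri).getHost();
        return host != null && inScopeHosts.contains(host);
    }

    // Accept when the URI itself is in scope, or when the page it was
    // discovered on (its via) is in scope. A link found on an
    // out-of-scope page is therefore not accepted.
    boolean accepts(String uri, String via) {
        return inScope(uri) || inScope(via);
    }

    public static void main(String[] args) {
        ScopePlusOneSketch rule = new ScopePlusOneSketch(Set.of("example.org"));
        // In-scope seed.
        System.out.println(rule.accepts("http://example.org/a", null));
        // Out-of-scope link found on an in-scope page: one hop is allowed.
        System.out.println(rule.accepts("http://other.example.com/b", "http://example.org/a"));
        // Link found on that out-of-scope page: a second hop is not allowed.
        System.out.println(rule.accepts("http://third.example.net/c", "http://other.example.com/b"));
    }
}
```

Running main prints true, true, false, matching the three cases in the description.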

Version:
$Date: 2006-09-25 17:16:55 +0000 (Mon, 25 Sep 2006) $ $Revision: 4649 $
Author:
Shifra Raffel
See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType
ComplexType.MBeanAttributeInfoIterator
 
Field Summary
static java.lang.String ATTR_SCOPE
           
static java.lang.String DOMAIN
           
static java.lang.String HOST
           
 
Fields inherited from class org.archive.crawler.deciderules.SurtPrefixedDecideRule
ATTR_ALSO_CHECK_VIA, ATTR_REBUILD_ON_RECONFIG, ATTR_SEEDS_AS_SURT_PREFIXES, ATTR_SURTS_DUMP_FILE, ATTR_SURTS_SOURCE_FILE, DEFAULT_ALSO_CHECK_VIA, DEFAULT_REBUILD_ON_RECONFIG, surtPrefixes
 
Fields inherited from class org.archive.crawler.deciderules.ConfiguredDecideRule
ALLOWED_TYPES, ATTR_DECISION
 
Fields inherited from class org.archive.crawler.deciderules.DecideRule
ACCEPT, PASS, REJECT
 
Fields inherited from class org.archive.crawler.settings.ComplexType
definition, definitionMap
 
Constructor Summary
ScopePlusOneDecideRule(java.lang.String name)
          Constructor.
 
Method Summary
protected  boolean evaluate(java.lang.Object object)
          Evaluates whether the given object's URI, or the URI it was discovered from, is in scope.
protected  SurtPrefixSet getPrefixes()
          Synchronized get of prefix set to use
protected  SurtPrefixSet getPrefixes(java.lang.Object o)
          Synchronized get of prefix set to use.
protected  java.lang.String getScope(java.lang.Object o)
          Decides whether to use host or domain scope.
protected  void readPrefixes(java.lang.Object o)
          Patch the SURT prefix set so that it only includes the appropriate prefixes.
 
Methods inherited from class org.archive.crawler.deciderules.SurtPrefixedDecideRule
addedSeed, buildSurtPrefixSet, dumpSurtPrefixSet, getSeedfile, kickUpdate, prefixFrom, readPrefixes
 
Methods inherited from class org.archive.crawler.deciderules.PredicatedDecideRule
decisionFor
 
Methods inherited from class org.archive.crawler.deciderules.ConfiguredDecideRule
singlePossibleNonPassDecision
 
Methods inherited from class org.archive.crawler.deciderules.DecideRule
getController
 
Methods inherited from class org.archive.crawler.settings.ModuleType
addElement, listUsedFiles
 
Methods inherited from class org.archive.crawler.settings.ComplexType
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, toString, unsetAttribute
 
Methods inherited from class org.archive.crawler.settings.Type
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient
 
Methods inherited from class javax.management.Attribute
getName, hashCode
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

ATTR_SCOPE

public static final java.lang.String ATTR_SCOPE
See Also:
Constant Field Values

HOST

public static final java.lang.String HOST
See Also:
Constant Field Values

DOMAIN

public static final java.lang.String DOMAIN
See Also:
Constant Field Values
Constructor Detail

ScopePlusOneDecideRule

public ScopePlusOneDecideRule(java.lang.String name)
Constructor.

Parameters:
name - Name of this rule.
Method Detail

evaluate

protected boolean evaluate(java.lang.Object object)
Evaluates whether the given object's URI, or the URI it was discovered from, is in scope.

Overrides:
evaluate in class SurtPrefixedDecideRule
Parameters:
object - to evaluate
Returns:
true if either the URI or its via (the URI it was discovered from) is in scope

getPrefixes

protected SurtPrefixSet getPrefixes()
Synchronized get of prefix set to use

Returns:
SurtPrefixSet to use for check
See Also:
SurtPrefixedDecideRule.getPrefixes()

getPrefixes

protected SurtPrefixSet getPrefixes(java.lang.Object o)
Synchronized get of prefix set to use.

Parameters:
o - Context object.
Returns:
SurtPrefixSet to use for check
See Also:
SurtPrefixedDecideRule.getPrefixes()

readPrefixes

protected void readPrefixes(java.lang.Object o)
Patch the SURT prefix set so that it only includes the appropriate prefixes.

Parameters:
o - Context object.
See Also:
SurtPrefixedDecideRule.readPrefixes()

getScope

protected java.lang.String getScope(java.lang.Object o)
Decides whether to use host or domain scope.

Parameters:
o - Context
Returns:
String indicating host or domain scope
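
The difference between host and domain scope can be sketched in terms of SURT prefixes, where host labels are reversed and comma-terminated. The sketch below is an assumption-laden illustration, not the logic of getScope or readPrefixes: the class SurtScopeSketch and its helper methods are hypothetical, and only the http scheme and the host part of a URI are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of org.archive.crawler.deciderules.
final class SurtScopeSketch {
    // Reverse dot-separated labels into comma-terminated SURT order,
    // e.g. "www.archive.org" becomes "org,archive,www,".
    static String reversedLabels(String name) {
        List<String> labels =
                new ArrayList<>(Arrays.asList(name.toLowerCase(Locale.ROOT).split("\\.")));
        Collections.reverse(labels);
        return String.join(",", labels) + ",";
    }

    // Host scope: the closing ")" pins the prefix to exactly this host.
    static String hostPrefix(String host) {
        return "http://(" + reversedLabels(host) + ")";
    }

    // Domain scope: leaving the parenthesis open also matches subdomains.
    static String domainPrefix(String domain) {
        return "http://(" + reversedLabels(domain);
    }

    // SURT form of the root URI of a host (simplified).
    static String surtOf(String host) {
        return "http://(" + reversedLabels(host) + ")/";
    }

    public static void main(String[] args) {
        String candidate = surtOf("crawler.archive.org");
        // Domain prefix for archive.org covers the crawler subdomain.
        System.out.println(candidate.startsWith(domainPrefix("archive.org")));
        // Host prefix for www.archive.org does not.
        System.out.println(candidate.startsWith(hostPrefix("www.archive.org")));
    }
}
```

Running main prints true, then false: the open-ended domain prefix matches any host under the domain, while the closed host prefix matches only the named host.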


Copyright © 2003-2011 Internet Archive. All Rights Reserved.