org.archive.crawler.deciderules
Class TransclusionDecideRule

java.lang.Object
  extended by javax.management.Attribute
      extended by org.archive.crawler.settings.Type
          extended by org.archive.crawler.settings.ComplexType
              extended by org.archive.crawler.settings.ModuleType
                  extended by org.archive.crawler.deciderules.DecideRule
                      extended by org.archive.crawler.deciderules.ConfiguredDecideRule
                          extended by org.archive.crawler.deciderules.PredicatedDecideRule
                              extended by org.archive.crawler.deciderules.TransclusionDecideRule
All Implemented Interfaces:
java.io.Serializable, javax.management.DynamicMBean

public class TransclusionDecideRule
extends PredicatedDecideRule

Rule ACCEPTs any CrawlURI whose path-from-seed ('hopsPath'; see CandidateURI.getPathFromSeed()) ends with at least one, but no more than the configured maximum number of, non-navlink (non-'L') hops. Otherwise (the path-from-seed is empty, ends with a navlink ('L') hop, or ends with more non-navlink hops than the maximum), this rule returns PASS.

Thus, it allows things like embedded resources (frames/images/media) and redirects to be transitively included ('transcluded') in a crawl, even if they would otherwise be excluded, for some reasonable number of hops (1-4).
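
The trailing-hops test described above can be illustrated with a small stand-alone sketch. It is not the Heritrix source: the class name TransclusionSketch, the method withinTransHops, and its parameters are hypothetical, and the separate cap on speculative ('X') hops is an assumption based on the DEFAULT_MAX_SPECULATIVE_HOPS field listed on this page. The 'E' (embed) and 'R' (redirect) hop codes in the usage note are likewise assumed for illustration.

```java
/** Hypothetical stand-alone sketch of the trailing-hops test; not the Heritrix implementation. */
final class TransclusionSketch {
    static final char NAVLINK_HOP = 'L';     // navlink hop code, as named in the class description
    static final char SPECULATIVE_HOP = 'X'; // speculative hop code, as named in the field summary

    private TransclusionSketch() {
    }

    /**
     * Returns true when hopsPath ends with at least one, but no more than
     * maxTransHops, non-navlink hops. Assumption: speculative hops inside
     * that trailing run are additionally capped by maxSpeculativeHops.
     */
    static boolean withinTransHops(String hopsPath, int maxTransHops, int maxSpeculativeHops) {
        if (hopsPath == null || hopsPath.isEmpty()) {
            return false; // empty path-from-seed: predicate fails, so the rule would PASS
        }
        int transHops = 0;
        int speculativeHops = 0;
        // Walk backwards from the tail until the first navlink hop.
        for (int i = hopsPath.length() - 1; i >= 0; i--) {
            char hop = hopsPath.charAt(i);
            if (hop == NAVLINK_HOP) {
                break;
            }
            transHops++;
            if (hop == SPECULATIVE_HOP) {
                speculativeHops++;
            }
        }
        return transHops > 0
                && transHops <= maxTransHops
                && speculativeHops <= maxSpeculativeHops;
    }
}
```

Under these assumptions, withinTransHops("LLE", 2, 1) is true (one trailing non-navlink hop), while withinTransHops("LEER", 2, 1) is false (three trailing hops exceed the maximum of two) and withinTransHops("LL", 2, 1) is false (the path ends with a navlink hop).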

Author:
gojomo
See Also:
Transclusion, Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType
ComplexType.MBeanAttributeInfoIterator
 
Field Summary
(package private) static java.lang.Integer DEFAULT_MAX_SPECULATIVE_HOPS
          Default maximum speculative ('X') hops.
(package private) static java.lang.Integer DEFAULT_MAX_TRANS_HOPS
          Default maximum transitive hops (of any type).
 
Fields inherited from class org.archive.crawler.deciderules.ConfiguredDecideRule
ALLOWED_TYPES, ATTR_DECISION
 
Fields inherited from class org.archive.crawler.deciderules.DecideRule
ACCEPT, PASS, REJECT
 
Fields inherited from class org.archive.crawler.settings.ComplexType
definition, definitionMap
 
Constructor Summary
TransclusionDecideRule(java.lang.String name)
          Usual constructor.
 
Method Summary
protected  boolean evaluate(java.lang.Object object)
          Evaluate whether given object is within the threshold number of transitive hops.
 
Methods inherited from class org.archive.crawler.deciderules.PredicatedDecideRule
decisionFor
 
Methods inherited from class org.archive.crawler.deciderules.ConfiguredDecideRule
singlePossibleNonPassDecision
 
Methods inherited from class org.archive.crawler.deciderules.DecideRule
getController, kickUpdate
 
Methods inherited from class org.archive.crawler.settings.ModuleType
addElement, listUsedFiles
 
Methods inherited from class org.archive.crawler.settings.ComplexType
addElementToDefinition, checkValue, earlyInitialize, getAbsoluteName, getAttribute, getAttribute, getAttribute, getAttributeInfo, getAttributeInfo, getAttributeInfoIterator, getAttributes, getDataContainerRecursive, getDataContainerRecursive, getDefaultValue, getDescription, getElementFromDefinition, getLegalValues, getLocalAttribute, getMBeanInfo, getMBeanInfo, getParent, getPreservedFields, getSettingsHandler, getUncheckedAttribute, getValue, globalSettings, invoke, isInitialized, isOverridden, iterator, removeElementFromDefinition, setAsOrder, setAttribute, setAttribute, setAttributes, setDescription, setPreservedFields, toString, unsetAttribute
 
Methods inherited from class org.archive.crawler.settings.Type
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient
 
Methods inherited from class javax.management.Attribute
getName, hashCode
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_MAX_TRANS_HOPS

static final java.lang.Integer DEFAULT_MAX_TRANS_HOPS
Default maximum transitive hops (of any type). Default (package-private) access so that it can be accessed by unit tests.


DEFAULT_MAX_SPECULATIVE_HOPS

static final java.lang.Integer DEFAULT_MAX_SPECULATIVE_HOPS
Default maximum speculative ('X') hops. Default (package-private) access so that it can be accessed by unit tests.

Constructor Detail

TransclusionDecideRule

public TransclusionDecideRule(java.lang.String name)
Usual constructor.

Parameters:
name - Name of this DecideRule.
Method Detail

evaluate

protected boolean evaluate(java.lang.Object object)
Evaluate whether given object is within the threshold number of transitive hops.

Specified by:
evaluate in class PredicatedDecideRule
Parameters:
object - Object to make decision on.
Returns:
true if the number of trailing transitive (non-navlink) hops is greater than 0 and no more than the configured maximum


Copyright © 2003-2011 Internet Archive. All Rights Reserved.