Uses of Class
org.archive.crawler.deciderules.DecideRule

Packages that use DecideRule
org.archive.crawler.deciderules Provides classes for a simple decision rules framework. 
org.archive.crawler.deciderules.recrawl   
org.archive.crawler.fetcher   
org.archive.crawler.framework   
org.archive.crawler.postprocessor   
org.archive.crawler.processor   
 

Uses of DecideRule in org.archive.crawler.deciderules
 

Subclasses of DecideRule in org.archive.crawler.deciderules
 class AcceptDecideRule
          Rule which responds ACCEPT to anything passed in.
 class AddRedirectFromRootServerToScope
           
 class BeanShellDecideRule
          Rule which runs a BeanShell script to make its decision.
 class ClassKeyMatchesRegExpDecideRule
          Rule applies configured decision to any CrawlURI whose class key matches the supplied regexp.
 class ConfiguredDecideRule
          Rule which can be configured to ACCEPT or REJECT at operator's option.
 class ContentTypeMatchesRegExpDecideRule
          DecideRule whose decision is applied if the URI's content-type is present and matches the supplied regular expression.
 class ContentTypeNotMatchesRegExpDecideRule
          DecideRule whose decision is applied if the URI's content-type is present and does not match the supplied regular expression.
 class DecideRuleSequence
          DecideRuleSequence represents a series of DecideRules, which are applied in turn to give the final result.
 class ExceedsDocumentLengthTresholdDecideRule
           
 class ExternalGeoLocationDecideRule
          A rule that can be configured to take alternate implementations of the ExternalGeoLocationInterface.
 class ExternalImplDecideRule
          A rule that can be configured to take alternate implementations of the ExternalImplInterface.
 class FetchStatusDecideRule
          Rule applies the configured decision for any URI which has a fetch status equal to the 'target-status' setting.
 class FetchStatusMatchesRegExpDecideRule
           
 class FetchStatusNotMatchesRegExpDecideRule
           
 class FilterDecideRule
          FilterDecideRule wraps a legacy Filter for use in DecideRule contexts.
 class HasViaDecideRule
          Rule applies the configured decision for any URI which has a 'via' (essentially, any URI that was not a seed or some kinds of mid-crawl adds).
 class HopsPathMatchesRegExpDecideRule
          Rule applies configured decision to any CrawlURIs whose 'hops-path' (string like "LLXE" etc.) matches the supplied regexp.
 class IsCrossTopmostAssignedSurtHopDecideRule
          Applies its decision if the current URI differs in that portion of its hostname/domain that is assigned/sold by registrars (AKA its 'topmost assigned SURT' or 'public suffix'.)
 class MatchesFilePatternDecideRule
          Compares suffix of a passed CrawlURI, UURI, or String against a regular expression pattern, applying its configured decision to all matches.
 class MatchesListRegExpDecideRule
          Rule applies configured decision to any CrawlURIs whose String URI matches the supplied regexps.
 class MatchesRegExpDecideRule
          Rule applies configured decision to any CrawlURIs whose String URI matches the supplied regexp.
 class NotExceedsDocumentLengthTresholdDecideRule
           
 class NotMatchesFilePatternDecideRule
          Rule applies configured decision to any URIs which do *not* match the supplied (file-pattern) regexp.
 class NotMatchesListRegExpDecideRule
          Rule applies configured decision to any URIs which do *not* match the supplied regexps.
 class NotMatchesRegExpDecideRule
          Rule applies configured decision to any URIs which do *not* match the supplied regexp.
 class NotOnDomainsDecideRule
          Rule applies configured decision to any URIs that are *not* in one of the domains in the configured set of domains, filled from the seed set.
 class NotOnHostsDecideRule
          Rule applies configured decision to any URIs that are *not* on one of the hosts in the configured set of hosts, filled from the seed set.
 class NotSurtPrefixedDecideRule
          Rule applies configured decision to any URIs that, when expressed in SURT form, do *not* begin with one of the prefixes in the configured set.
 class OnDomainsDecideRule
          Rule applies configured decision to any URIs that are on one of the domains in the configured set of domains, filled from the seed set.
 class OnHostsDecideRule
          Rule applies configured decision to any URIs that are on one of the hosts in the configured set of hosts, filled from the seed set.
 class PathologicalPathDecideRule
          Rule REJECTs any URI which contains an excessive number of identical, consecutive path-segments (e.g. http://example.com/a/a/a/boo.html has 3 consecutive '/a' segments).
 class PredicatedDecideRule
          Rule which applies the configured decision only if a test evaluates to true.
 class PrerequisiteAcceptDecideRule
          Rule which ACCEPTs all 'prerequisite' URIs (those with a 'P' in the last hopsPath position).
 class QueueOverbudgetDecideRule
          Applies configured decision to every candidate URI that would overbudget its queue.
 class RejectDecideRule
          Rule which answers REJECT to everything evaluated.
 class ScopePlusOneDecideRule
          Rule allows one level of discovery beyond configured scope.
 class SeedAcceptDecideRule
          Rule which ACCEPTs all 'seed' URIs (those for which isSeed is true).
 class SurtPrefixedDecideRule
          Rule applies configured decision to any URIs that, when expressed in SURT form, begin with one of the prefixes in the configured set.
 class TooManyHopsDecideRule
          Rule REJECTs any CrawlURIs whose total number of hops (length of the hopsPath string, traversed links of any type) is over a threshold.
 class TooManyPathSegmentsDecideRule
          Rule REJECTs any CrawlURIs whose total number of path-segments (as indicated by the count of '/' characters not including the first '//') is over a given threshold.
 class TransclusionDecideRule
          Rule ACCEPTs any CrawlURIs whose path-from-seed ('hopsPath' -- see CandidateURI.getPathFromSeed()) ends with at least one, but not more than the given number of, non-navlink (i.e. non-'L') hops.
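
The summaries above for DecideRuleSequence, TooManyPathSegmentsDecideRule and PathologicalPathDecideRule can be illustrated with a minimal sketch. It does not use the Heritrix classes: the Decision enum, the Rule interface, the helper names and both thresholds are hypothetical stand-ins, and the sketch assumes that a later non-PASS decision in a sequence overrides an earlier one.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not the org.archive.crawler.deciderules API.
class DecideRuleSequenceSketch {
    enum Decision { ACCEPT, REJECT, PASS }

    interface Rule {
        Decision decisionFor(String uri);
    }

    // Assumed thresholds; in a real crawl these would be operator settings.
    static final int MAX_PATH_SEGMENTS = 20;
    static final int MAX_REPETITIONS = 3;

    // Counts '/' characters, not including the first "//" (after the scheme).
    static int pathSegmentCount(String uri) {
        int start = uri.indexOf("//");
        int from = (start < 0) ? 0 : start + 2;
        int count = 0;
        for (int i = from; i < uri.length(); i++) {
            if (uri.charAt(i) == '/') {
                count++;
            }
        }
        return count;
    }

    // Length of the longest run of identical, consecutive path segments.
    // Assumes an absolute URI, so the first part after "//" is the host.
    static int longestSegmentRun(String uri) {
        int start = uri.indexOf("//");
        String rest = (start < 0) ? uri : uri.substring(start + 2);
        String[] parts = rest.split("/");
        int best = 0;
        int run = 0;
        String prev = null;
        for (int i = 1; i < parts.length; i++) { // skip the host at index 0
            run = parts[i].equals(prev) ? run + 1 : 1;
            prev = parts[i];
            best = Math.max(best, run);
        }
        return best;
    }

    // Applies each rule in turn; assumption: the last non-PASS decision wins.
    static Decision decide(List<Rule> rules, String uri) {
        Decision result = Decision.PASS;
        for (Rule rule : rules) {
            Decision d = rule.decisionFor(uri);
            if (d != Decision.PASS) {
                result = d;
            }
        }
        return result;
    }

    // Accept everything, then reject over-long and pathological paths.
    static List<Rule> exampleRules() {
        return Arrays.<Rule>asList(
            uri -> Decision.ACCEPT,
            uri -> pathSegmentCount(uri) > MAX_PATH_SEGMENTS ? Decision.REJECT : Decision.PASS,
            uri -> longestSegmentRun(uri) >= MAX_REPETITIONS ? Decision.REJECT : Decision.PASS);
    }

    public static void main(String[] args) {
        List<Rule> rules = exampleRules();
        System.out.println(decide(rules, "http://example.com/index.html"));
        System.out.println(decide(rules, "http://example.com/a/a/a/boo.html"));
    }
}
```

Ordering matters in this sketch: placing the accept-all rule first lets the later, narrower rules veto individual URIs.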
 

Methods in org.archive.crawler.deciderules that return DecideRule
protected  DecideRule DecidingFilter.getDecideRule(java.lang.Object o)
           
protected  DecideRule DecidingScope.getDecideRule(java.lang.Object o)
           
 

Uses of DecideRule in org.archive.crawler.deciderules.recrawl
 

Subclasses of DecideRule in org.archive.crawler.deciderules.recrawl
 class IdenticalDigestDecideRule
          Rule applies configured decision to any CrawlURIs whose prior-history content-digest matches the latest fetch.
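
As a rough illustration of the digest comparison described above, the following sketch checks whether the latest content digest equals the one recorded for the previous fetch. The class name, the list-based history and the digest strings are assumptions made for this example; the actual rule reads its history from the CrawlURI.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the IdenticalDigestDecideRule implementation.
class IdenticalDigestSketch {
    // digestHistory holds content digests, oldest first; the last entry
    // is assumed to be the digest of the latest fetch.
    static boolean latestMatchesPrior(List<String> digestHistory) {
        int n = digestHistory.size();
        if (n < 2) {
            return false; // no prior fetch to compare against
        }
        String latest = digestHistory.get(n - 1);
        String prior = digestHistory.get(n - 2);
        return latest != null && latest.equals(prior);
    }

    public static void main(String[] args) {
        // Unchanged content: the configured decision would apply.
        System.out.println(latestMatchesPrior(Arrays.asList("sha1:AAA", "sha1:AAA")));
        // Changed content: the rule would not apply its decision.
        System.out.println(latestMatchesPrior(Arrays.asList("sha1:AAA", "sha1:BBB")));
    }
}
```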
 

Uses of DecideRule in org.archive.crawler.fetcher
 

Methods in org.archive.crawler.fetcher that return DecideRule
protected  DecideRule FetchHTTP.getMidfetchRule(java.lang.Object o)
           
 

Uses of DecideRule in org.archive.crawler.framework
 

Methods in org.archive.crawler.framework that return DecideRule
protected  DecideRule Processor.getDecideRule(java.lang.Object o)
           
 

Methods in org.archive.crawler.framework with parameters of type DecideRule
protected  boolean Processor.rulesAccept(DecideRule rule, java.lang.Object o)
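
One plausible way a boolean helper with this signature could reduce a three-valued rule decision to a yes/no answer is sketched below. The types are hypothetical stand-ins, and treating everything other than REJECT as acceptance is an assumption of this sketch, not something this page states.

```java
// Hypothetical sketch; not the org.archive.crawler.framework.Processor code.
class RulesAcceptSketch {
    enum Decision { ACCEPT, REJECT, PASS }

    interface Rule {
        Decision decisionFor(Object o);
    }

    // Assumption: only an explicit REJECT blocks processing; PASS and ACCEPT both let the object through.
    static boolean rulesAccept(Rule rule, Object o) {
        return rule.decisionFor(o) != Decision.REJECT;
    }

    public static void main(String[] args) {
        Rule rejectAll = o -> Decision.REJECT;
        Rule noOpinion = o -> Decision.PASS;
        System.out.println(rulesAccept(rejectAll, "http://example.com/"));
        System.out.println(rulesAccept(noOpinion, "http://example.com/"));
    }
}
```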
           
 

Uses of DecideRule in org.archive.crawler.postprocessor
 

Methods in org.archive.crawler.postprocessor that return DecideRule
protected  DecideRule SupplementaryLinksScoper.getLinkRules(java.lang.Object o)
           
protected  DecideRule LinksScoper.getRejectLogRules(java.lang.Object o)
           
 

Uses of DecideRule in org.archive.crawler.processor
 

Methods in org.archive.crawler.processor that return DecideRule
protected  DecideRule CrawlMapper.getMapOutlinkDecideRule(java.lang.Object o)
           
 



Copyright © 2003-2011 Internet Archive. All Rights Reserved.