Uses of Class
org.archive.crawler.datamodel.CandidateURI

Packages that use CandidateURI
org.archive.crawler.datamodel   
org.archive.crawler.deciderules Provides classes for a simple decision rules framework. 
org.archive.crawler.framework   
org.archive.crawler.frontier   
org.archive.crawler.postprocessor   
org.archive.crawler.processor   
org.archive.crawler.scope   
org.archive.crawler.util   
 

Uses of CandidateURI in org.archive.crawler.datamodel
 

Subclasses of CandidateURI in org.archive.crawler.datamodel
 class CrawlURI
          Represents a candidate URI and the associated state it collects as it is crawled.
 

Methods in org.archive.crawler.datamodel that return CandidateURI
 CandidateURI CandidateURI.createCandidateURI(UURI baseUURI, Link link)
          Utility method for creating CandidateURIs found while extracting links from this CrawlURI.
 CandidateURI CandidateURI.createCandidateURI(UURI baseUURI, Link link, int scheduling, boolean seed)
          Utility method for creating CandidateURIs found while extracting links from this CrawlURI.
static CandidateURI CandidateURI.createSeedCandidateURI(UURI uuri)
           
static CandidateURI CandidateURI.fromString(java.lang.String uriHopsViaString)
          Given a string containing a URI, optionally followed by whitespace-delimited hops-path and via information, create a CandidateURI instance.
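
The input shape described for fromString (a URI, then an optional hops-path and an optional via, separated by whitespace) can be illustrated with a small stand-alone parser. This is only a sketch of the string format as described above, not the Heritrix implementation: the class ParsedCandidateLine, its fields, and the choice of an empty hops-path and a null via for missing fields are all assumptions made for illustration.

```java
/**
 * Hypothetical holder for the three whitespace-delimited fields.
 * Not a Heritrix class; it only illustrates the described string format.
 */
final class ParsedCandidateLine {
    final String uri;
    final String hopsPath; // assumed: empty string when the field is absent
    final String via;      // assumed: null when the field is absent

    private ParsedCandidateLine(String uri, String hopsPath, String via) {
        this.uri = uri;
        this.hopsPath = hopsPath;
        this.via = via;
    }

    /** Split "URI [hops-path [via]]" on runs of whitespace. */
    static ParsedCandidateLine parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts[0].isEmpty()) {
            throw new IllegalArgumentException("no URI in line");
        }
        String hops = parts.length > 1 ? parts[1] : "";
        String via = parts.length > 2 ? parts[2] : null;
        return new ParsedCandidateLine(parts[0], hops, via);
    }
}
```

For instance, ParsedCandidateLine.parse("http://example.com/page L http://example.com/") yields the URI, a sample hops-path value of "L", and the via URI as three separate fields.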
 

Methods in org.archive.crawler.datamodel that return types with arguments of type CandidateURI
 java.util.Collection<CandidateURI> CrawlURI.getOutCandidates()
          Returns discovered candidate URIs.
 

Methods in org.archive.crawler.datamodel with parameters of type CandidateURI
 void UriUniqFilter.add(java.lang.String key, CandidateURI value)
          Add given uri, if not already present.
 void UriUniqFilter.addForce(java.lang.String key, CandidateURI value)
          Add given uri, all the way through to underlying destination, even if already present.
 void UriUniqFilter.addNow(java.lang.String key, CandidateURI value)
          Immediately add uri.
 void UriUniqFilter.forget(java.lang.String key, CandidateURI value)
          Forget that the item was seen.
static CrawlURI CrawlURI.from(CandidateURI caUri, long ordinal)
          Make a CrawlURI from the passed CandidateURI.
 CrawlHost ServerCache.getHostFor(CandidateURI cauri)
          Get the CrawlHost associated with cauri.
 CrawlServer ServerCache.getServerFor(CandidateURI cauri)
          Get the CrawlServer associated with cauri.
static java.lang.String CrawlServer.getServerKey(CandidateURI cauri)
          Get key to use doing lookup on server instances.
protected  void CandidateURI.inheritFrom(CandidateURI ancestor)
          Inherit (copy) the relevant keys-values from the ancestor.
 void UriUniqFilter.HasUriReceiver.receive(CandidateURI item)
           
 boolean CandidateURI.sameDomainAs(CandidateURI other)
          Compares the domain of this CandidateURI with that of another CandidateURI.
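
The UriUniqFilter entries in this table describe an "already seen" filter that forwards accepted items to a receiver. The sketch below shows one way those semantics could fit together, using an in-memory set of keys. SketchUniqFilter is a hypothetical class written for illustration, it uses a java.util.function.Consumer in place of HasUriReceiver, and its forget method takes only the key; none of this is the Heritrix code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical in-memory filter; illustrates add / addForce / forget only. */
final class SketchUniqFilter<T> {
    private final Set<String> seen = new HashSet<>();
    private final Consumer<T> receiver; // stand-in for a receiver callback

    SketchUniqFilter(Consumer<T> receiver) {
        this.receiver = receiver;
    }

    /** Forward the value only if the key has not been seen before. */
    void add(String key, T value) {
        if (seen.add(key)) {
            receiver.accept(value);
        }
    }

    /** Forward the value even if the key was already seen. */
    void addForce(String key, T value) {
        seen.add(key);
        receiver.accept(value);
    }

    /** Forget the key so that a later add() treats it as new. */
    void forget(String key) {
        seen.remove(key);
    }

    /** Number of distinct keys currently remembered. */
    int count() {
        return seen.size();
    }
}
```

With this sketch, calling add twice with the same key forwards the value once, while addForce forwards it again regardless.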
 

Method parameters in org.archive.crawler.datamodel with type arguments of type CandidateURI
 void CrawlURI.replaceOutlinks(java.util.Collection<CandidateURI> links)
          Replace the current collection of links with the passed collection.
 

Constructors in org.archive.crawler.datamodel with parameters of type CandidateURI
CrawlURI(CandidateURI caUri, long o)
          Create a new instance of CrawlURI from a CandidateURI.
 

Uses of CandidateURI in org.archive.crawler.deciderules
 

Methods in org.archive.crawler.deciderules with parameters of type CandidateURI
 void SurtPrefixedDecideRule.addedSeed(CandidateURI curi)
           
 

Uses of CandidateURI in org.archive.crawler.framework
 

Methods in org.archive.crawler.framework with parameters of type CandidateURI
 boolean CrawlScope.addSeed(CandidateURI curi)
          Add a new seed to scope.
 java.lang.String Frontier.getClassKey(CandidateURI cauri)
           
protected  boolean Scoper.isInScope(CandidateURI caUri)
          Determine whether the given CandidateURI is accepted by the crawl scope.
protected  void Scoper.outOfScope(CandidateURI caUri)
          Called when a CandidateURI is ruled out of scope.
 void Frontier.schedule(CandidateURI caURI)
          Schedules a CandidateURI.
 

Uses of CandidateURI in org.archive.crawler.frontier
 

Methods in org.archive.crawler.frontier with parameters of type CandidateURI
 void RecoveryJournal.added(CandidateURI curi)
           
 void FrontierJournal.added(CandidateURI curi)
           
protected  CrawlURI WorkQueueFrontier.asCrawlUri(CandidateURI caUri)
           
protected  CrawlURI AbstractFrontier.asCrawlUri(CandidateURI caUri)
           
protected  void AdaptiveRevisitFrontier.batchSchedule(CandidateURI caUri)
           
protected  java.lang.String AbstractFrontier.canonicalize(CandidateURI cauri)
          Canonicalize passed CandidateURI.
protected  java.lang.String AdaptiveRevisitFrontier.canonicalize(CandidateURI cauri)
          Canonicalize passed CandidateURI.
 void RecoveryJournal.emitted(CandidateURI curi)
           
 void FrontierJournal.emitted(CandidateURI curi)
          Note that a CrawlURI was emitted for processing.
 void RecoveryJournal.finishedDisregard(CandidateURI curi)
           
 void FrontierJournal.finishedDisregard(CandidateURI curi)
           
 void RecoveryJournal.finishedFailure(CandidateURI curi)
           
 void FrontierJournal.finishedFailure(CandidateURI curi)
           
 void RecoveryJournal.finishedSuccess(CandidateURI curi)
           
 void FrontierJournal.finishedSuccess(CandidateURI curi)
           
 java.lang.String AbstractFrontier.getClassKey(CandidateURI cauri)
           
 java.lang.String AdaptiveRevisitFrontier.getClassKey(CandidateURI cauri)
           
 java.lang.String IPQueueAssignmentPolicy.getClassKey(CrawlController controller, CandidateURI cauri)
           
 java.lang.String SurtAuthorityQueueAssignmentPolicy.getClassKey(CrawlController controller, CandidateURI cauri)
           
 java.lang.String TopmostAssignedSurtQueueAssignmentPolicy.getClassKey(CrawlController controller, CandidateURI cauri)
           
abstract  java.lang.String QueueAssignmentPolicy.getClassKey(CrawlController controller, CandidateURI cauri)
          Get the String key (name) of the queue to which the given CandidateURI should be assigned.
 java.lang.String BucketQueueAssignmentPolicy.getClassKey(CrawlController controller, CandidateURI curi)
           
 java.lang.String HostnameQueueAssignmentPolicy.getClassKey(CrawlController controller, CandidateURI cauri)
           
protected  QueueAssignmentPolicy AbstractFrontier.getQueueAssignmentPolicy(CandidateURI cauri)
           
protected  void AdaptiveRevisitFrontier.innerSchedule(CandidateURI caUri)
           
 void WorkQueueFrontier.receive(CandidateURI caUri)
          Accept the given CandidateURI for scheduling, as it has passed the alreadyIncluded filter.
 void AdaptiveRevisitFrontier.receive(CandidateURI item)
           
 void RecoveryJournal.rescheduled(CandidateURI curi)
           
 void FrontierJournal.rescheduled(CandidateURI curi)
           
 void WorkQueueFrontier.schedule(CandidateURI caUri)
          Arrange for the given CandidateURI to be visited, if it is not already scheduled/completed.
 void AdaptiveRevisitFrontier.schedule(CandidateURI caURI)
           
 void RecoveryJournal.writeLongUriLine(java.lang.String tag, CandidateURI curi)
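
The getClassKey methods listed in this table return the name of the queue to which a URI is assigned. As a rough illustration of the idea behind a hostname-based policy, the sketch below derives a key from the host part of a URI string. HostKeySketch, its classKeyFor method, the lower-casing step, and the "default" fallback key are all assumptions for illustration; the sketch does not take a CrawlController and is not the HostnameQueueAssignmentPolicy implementation.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical helper: one queue key per host name. */
final class HostKeySketch {
    /** Return the lower-cased host of the URI, or a fallback key. */
    static String classKeyFor(String uri) {
        String host = URI.create(uri).getHost();
        if (host == null) {
            return "default"; // assumed fallback for URIs without a host
        }
        return host.toLowerCase(Locale.ROOT);
    }
}
```

Under this sketch, two URIs on the same host receive the same key and would therefore land in the same queue.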
           
 

Uses of CandidateURI in org.archive.crawler.postprocessor
 

Methods in org.archive.crawler.postprocessor with parameters of type CandidateURI
protected  boolean SupplementaryLinksScoper.isInScope(CandidateURI caUri)
           
protected  void SupplementaryLinksScoper.outOfScope(CandidateURI caUri)
          Called when a CandidateURI is ruled out of scope.
protected  void LinksScoper.outOfScope(CandidateURI caUri)
           
protected  void FrontierScheduler.schedule(CandidateURI caUri)
          Schedule the given CandidateURI with the Frontier.
 

Uses of CandidateURI in org.archive.crawler.processor
 

Methods in org.archive.crawler.processor with parameters of type CandidateURI
protected  boolean CrawlMapper.decideToMapOutlink(CandidateURI cauri)
           
protected  void CrawlMapper.divertLog(CandidateURI cauri, java.lang.String target)
          Note the given CandidateURI in the appropriate diversion log.
protected  java.lang.String HashCrawlMapper.map(CandidateURI cauri)
          Look up the crawler node name to which the given CandidateURI should be mapped.
protected abstract  java.lang.String CrawlMapper.map(CandidateURI cauri)
          Look up the crawler node name to which the given CandidateURI should be mapped.
protected  java.lang.String LexicalCrawlMapper.map(CandidateURI cauri)
          Look up the crawler node name to which the given CandidateURI should be mapped.
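
The map methods above look up the crawler node name for a given CandidateURI. One common way to do such a lookup by hash is to reduce a hash of some key to an index into a list of node names, as sketched below. HashMapperSketch is hypothetical: the choice of String.hashCode, the key passed in, and the node list are assumptions, and HashCrawlMapper may compute its mapping differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hash-based mapper from a string key to a node name. */
final class HashMapperSketch {
    private final List<String> nodeNames;

    HashMapperSketch(List<String> nodeNames) {
        if (nodeNames.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodeNames = new ArrayList<>(nodeNames); // defensive copy
    }

    /** Map a key to a node; the same key always maps to the same node. */
    String map(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(key.hashCode(), nodeNames.size());
        return nodeNames.get(index);
    }
}
```

Because the mapping depends only on the key and the node list, every crawler that shares the same list would agree on which node owns a given key.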
 

Uses of CandidateURI in org.archive.crawler.scope
 

Methods in org.archive.crawler.scope with parameters of type CandidateURI
 void SeedListener.addedSeed(CandidateURI uuri)
           
 

Uses of CandidateURI in org.archive.crawler.util
 

Fields in org.archive.crawler.util declared as CandidateURI
(package private)  CandidateURI FPMergeUriUniqFilter.PendingItem.caUri
           
 

Methods in org.archive.crawler.util with parameters of type CandidateURI
 void FPMergeUriUniqFilter.add(java.lang.String key, CandidateURI value)
           
 void SetBasedUriUniqFilter.add(java.lang.String key, CandidateURI value)
           
 void FPMergeUriUniqFilter.addForce(java.lang.String key, CandidateURI value)
           
 void SetBasedUriUniqFilter.addForce(java.lang.String key, CandidateURI value)
           
 void FPMergeUriUniqFilter.addNow(java.lang.String key, CandidateURI value)
           
 void SetBasedUriUniqFilter.addNow(java.lang.String key, CandidateURI value)
           
 void FPMergeUriUniqFilter.forget(java.lang.String key, CandidateURI value)
           
 void BloomUriUniqFilter.forget(java.lang.String canonical, CandidateURI item)
           
 void SetBasedUriUniqFilter.forget(java.lang.String key, CandidateURI value)
           
protected  void FPMergeUriUniqFilter.pend(long fp, CandidateURI value)
          Place the given FP/CandidateURI pair into the pending set, awaiting a merge to determine if it's actually accepted.
 void BenchmarkUriUniqFilters.receive(CandidateURI item)
           
 

Constructors in org.archive.crawler.util with parameters of type CandidateURI
FPMergeUriUniqFilter.PendingItem(long fp, CandidateURI value)
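
The pend method and the PendingItem constructor above pair a fingerprint with a value and hold the pair until a merge decides whether it is accepted. The sketch below shows that pend-then-merge pattern with sorted in-memory collections. PendMergeSketch, its flush method, and the rule of keeping the first value per fingerprint are assumptions for illustration, not the FPMergeUriUniqFilter implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical pend-then-merge filter keyed by long fingerprints. */
final class PendMergeSketch<T> {
    private final TreeSet<Long> seen = new TreeSet<>();
    private final TreeMap<Long, T> pending = new TreeMap<>();

    /** Queue a fingerprint/value pair; nothing is accepted yet. */
    void pend(long fp, T value) {
        pending.putIfAbsent(fp, value); // assumed: first value per fingerprint wins
    }

    /**
     * Merge pending fingerprints into the seen set, in ascending
     * fingerprint order, and return the values that were newly accepted.
     */
    List<T> flush() {
        List<T> accepted = new ArrayList<>();
        for (Map.Entry<Long, T> entry : pending.entrySet()) {
            if (seen.add(entry.getKey())) {
                accepted.add(entry.getValue());
            }
        }
        pending.clear();
        return accepted;
    }
}
```

Deferring the decision to flush lets many candidates be checked against the seen set in one ordered pass instead of one lookup per item.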
           
 



Copyright © 2003-2011 Internet Archive. All Rights Reserved.