Packages that use CandidateURI | |
---|---|
org.archive.crawler.datamodel | |
org.archive.crawler.deciderules | Provides classes for a simple decision rules framework. |
org.archive.crawler.framework | |
org.archive.crawler.frontier | |
org.archive.crawler.postprocessor | |
org.archive.crawler.processor | |
org.archive.crawler.scope | |
org.archive.crawler.util | |
Uses of CandidateURI in org.archive.crawler.datamodel |
---|
Subclasses of CandidateURI in org.archive.crawler.datamodel | |
---|---|
class |
CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
Methods in org.archive.crawler.datamodel that return CandidateURI | |
---|---|
CandidateURI |
CandidateURI.createCandidateURI(UURI baseUURI,
Link link)
Utility method for creating CandidateURIs found while extracting links from this CrawlURI. |
CandidateURI |
CandidateURI.createCandidateURI(UURI baseUURI,
Link link,
int scheduling,
boolean seed)
Utility method for creating CandidateURIs found while extracting links from this CrawlURI. |
static CandidateURI |
CandidateURI.createSeedCandidateURI(UURI uuri)
|
static CandidateURI |
CandidateURI.fromString(java.lang.String uriHopsViaString)
Given a string containing a URI, optionally followed by whitespace-delimited hops-path and via info, create a CandidateURI instance. |
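
The input format described for `fromString` (a URI, then an optional hops-path, then an optional via) can be illustrated with a small sketch. `ParsedCandidate` and its fields are hypothetical stand-ins, not Heritrix classes, and the splitting rule is an assumption based only on the one-line description above; the actual parsing in `CandidateURI.fromString` may differ.

```java
import java.util.Objects;

/** Hypothetical stand-in for illustration; not the Heritrix CandidateURI class. */
final class ParsedCandidate {
    final String uri;
    final String pathFromSeed; // hops-path; empty string when absent (assumption)
    final String via;          // null when absent (assumption)

    ParsedCandidate(String uri, String pathFromSeed, String via) {
        this.uri = Objects.requireNonNull(uri);
        this.pathFromSeed = pathFromSeed;
        this.via = via;
    }

    /** Split "URI [hops-path [via]]" on runs of whitespace. */
    static ParsedCandidate parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts[0].isEmpty()) {
            throw new IllegalArgumentException("no URI in input");
        }
        String hops = parts.length > 1 ? parts[1] : "";
        String via = parts.length > 2 ? parts[2] : null;
        return new ParsedCandidate(parts[0], hops, via);
    }
}
```

Under these assumptions, `ParsedCandidate.parse("http://example.com/a L http://example.com/")` yields the URI, a hops-path of `L`, and the via URI, while a bare URI yields an empty hops-path and no via.
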
Methods in org.archive.crawler.datamodel that return types with arguments of type CandidateURI | |
---|---|
java.util.Collection<CandidateURI> |
CrawlURI.getOutCandidates()
Returns discovered candidate URIs. |
Methods in org.archive.crawler.datamodel with parameters of type CandidateURI | |
---|---|
void |
UriUniqFilter.add(java.lang.String key,
CandidateURI value)
Add given uri, if not already present. |
void |
UriUniqFilter.addForce(java.lang.String key,
CandidateURI value)
Add given uri, all the way through to underlying destination, even if already present. |
void |
UriUniqFilter.addNow(java.lang.String key,
CandidateURI value)
Immediately add uri. |
void |
UriUniqFilter.forget(java.lang.String key,
CandidateURI value)
Forget that the item was seen. |
static CrawlURI |
CrawlURI.from(CandidateURI caUri,
long ordinal)
Make a CrawlURI from the passed CandidateURI. |
CrawlHost |
ServerCache.getHostFor(CandidateURI cauri)
Get the CrawlHost associated with cauri. |
CrawlServer |
ServerCache.getServerFor(CandidateURI cauri)
Get the CrawlServer associated with cauri. |
static java.lang.String |
CrawlServer.getServerKey(CandidateURI cauri)
Get key to use doing lookup on server instances. |
protected void |
CandidateURI.inheritFrom(CandidateURI ancestor)
Inherit (copy) the relevant keys-values from the ancestor. |
void |
UriUniqFilter.HasUriReceiver.receive(CandidateURI item)
|
boolean |
CandidateURI.sameDomainAs(CandidateURI other)
Compares the domain of this CandidateURI with that of another CandidateURI. |
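
The `UriUniqFilter` entries in the table above (`add`, `addForce`, `forget`, and the `HasUriReceiver.receive` callback) describe an already-seen filter that forwards items to a receiver. The following is a minimal in-memory sketch of that contract. `SeenFilter` and its nested `Receiver` are hypothetical names, not Heritrix types, and the behavior is inferred only from the one-line descriptions; `addNow` is omitted because this sketch has no batching to bypass.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory analogue of the add/addForce/forget contract. */
final class SeenFilter<T> {
    /** Hypothetical counterpart of a receiver callback. */
    interface Receiver<T> {
        void receive(T item);
    }

    private final Set<String> seen = new HashSet<>();
    private final Receiver<T> receiver;

    SeenFilter(Receiver<T> receiver) {
        this.receiver = receiver;
    }

    /** Pass the item on only if the key has not been seen before. */
    void add(String key, T value) {
        if (seen.add(key)) {
            receiver.receive(value);
        }
    }

    /** Pass the item on even if the key was already seen, and remember the key. */
    void addForce(String key, T value) {
        seen.add(key);
        receiver.receive(value);
    }

    /** Drop the key so that a later add() treats it as novel again. */
    void forget(String key, T value) {
        seen.remove(key);
    }
}
```

With a receiver that appends to a list, two `add` calls with the same key deliver the item once, `addForce` delivers it again, and `forget` followed by `add` delivers it a third time.
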
Method parameters in org.archive.crawler.datamodel with type arguments of type CandidateURI | |
---|---|
void |
CrawlURI.replaceOutlinks(java.util.Collection<CandidateURI> links)
Replace the current collection of links with the passed list. |
Constructors in org.archive.crawler.datamodel with parameters of type CandidateURI | |
---|---|
CrawlURI(CandidateURI caUri,
long o)
Create a new instance of CrawlURI from a CandidateURI. |
Uses of CandidateURI in org.archive.crawler.deciderules |
---|
Methods in org.archive.crawler.deciderules with parameters of type CandidateURI | |
---|---|
void |
SurtPrefixedDecideRule.addedSeed(CandidateURI curi)
|
Uses of CandidateURI in org.archive.crawler.framework |
---|
Methods in org.archive.crawler.framework with parameters of type CandidateURI | |
---|---|
boolean |
CrawlScope.addSeed(CandidateURI curi)
Add a new seed to scope. |
java.lang.String |
Frontier.getClassKey(CandidateURI cauri)
|
protected boolean |
Scoper.isInScope(CandidateURI caUri)
Determine whether the given CandidateURI is in scope. |
protected void |
Scoper.outOfScope(CandidateURI caUri)
Called when a CandidateURI is ruled out of scope. |
void |
Frontier.schedule(CandidateURI caURI)
Schedules a CandidateURI. |
Uses of CandidateURI in org.archive.crawler.frontier |
---|
Methods in org.archive.crawler.frontier with parameters of type CandidateURI | |
---|---|
void |
RecoveryJournal.added(CandidateURI curi)
|
void |
FrontierJournal.added(CandidateURI curi)
|
protected CrawlURI |
WorkQueueFrontier.asCrawlUri(CandidateURI caUri)
|
protected CrawlURI |
AbstractFrontier.asCrawlUri(CandidateURI caUri)
|
protected void |
AdaptiveRevisitFrontier.batchSchedule(CandidateURI caUri)
|
protected java.lang.String |
AbstractFrontier.canonicalize(CandidateURI cauri)
Canonicalize passed CandidateURI. |
protected java.lang.String |
AdaptiveRevisitFrontier.canonicalize(CandidateURI cauri)
Canonicalize passed CandidateURI. |
void |
RecoveryJournal.emitted(CandidateURI curi)
|
void |
FrontierJournal.emitted(CandidateURI curi)
Note that a CrawlURI was emitted for processing. |
void |
RecoveryJournal.finishedDisregard(CandidateURI curi)
|
void |
FrontierJournal.finishedDisregard(CandidateURI curi)
|
void |
RecoveryJournal.finishedFailure(CandidateURI curi)
|
void |
FrontierJournal.finishedFailure(CandidateURI curi)
|
void |
RecoveryJournal.finishedSuccess(CandidateURI curi)
|
void |
FrontierJournal.finishedSuccess(CandidateURI curi)
|
java.lang.String |
AbstractFrontier.getClassKey(CandidateURI cauri)
|
java.lang.String |
AdaptiveRevisitFrontier.getClassKey(CandidateURI cauri)
|
java.lang.String |
IPQueueAssignmentPolicy.getClassKey(CrawlController controller,
CandidateURI cauri)
|
java.lang.String |
SurtAuthorityQueueAssignmentPolicy.getClassKey(CrawlController controller,
CandidateURI cauri)
|
java.lang.String |
TopmostAssignedSurtQueueAssignmentPolicy.getClassKey(CrawlController controller,
CandidateURI cauri)
|
abstract java.lang.String |
QueueAssignmentPolicy.getClassKey(CrawlController controller,
CandidateURI cauri)
Get the String key (name) of the queue to which the CrawlURI should be assigned. |
java.lang.String |
BucketQueueAssignmentPolicy.getClassKey(CrawlController controller,
CandidateURI curi)
|
java.lang.String |
HostnameQueueAssignmentPolicy.getClassKey(CrawlController controller,
CandidateURI cauri)
|
protected QueueAssignmentPolicy |
AbstractFrontier.getQueueAssignmentPolicy(CandidateURI cauri)
|
protected void |
AdaptiveRevisitFrontier.innerSchedule(CandidateURI caUri)
|
void |
WorkQueueFrontier.receive(CandidateURI caUri)
Accept the given CandidateURI for scheduling, as it has passed the alreadyIncluded filter. |
void |
AdaptiveRevisitFrontier.receive(CandidateURI item)
|
void |
RecoveryJournal.rescheduled(CandidateURI curi)
|
void |
FrontierJournal.rescheduled(CandidateURI curi)
|
void |
WorkQueueFrontier.schedule(CandidateURI caUri)
Arrange for the given CandidateURI to be visited, if it is not already scheduled/completed. |
void |
AdaptiveRevisitFrontier.schedule(CandidateURI caURI)
|
void |
RecoveryJournal.writeLongUriLine(java.lang.String tag,
CandidateURI curi)
|
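
The `getClassKey` methods listed above return the String key (name) of the queue to which a URI is assigned. As an illustration of the idea only, the sketch below derives a key from the URI's hostname. `HostQueueKey`, `classKey`, and `DEFAULT_KEY` are hypothetical names, and the use of `java.net.URI` is an assumption; it does not reproduce the logic of `HostnameQueueAssignmentPolicy` or any other Heritrix policy.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical sketch of a hostname-based queue key; not a Heritrix class. */
final class HostQueueKey {
    /** Hypothetical fallback key for URIs that have no usable host. */
    static final String DEFAULT_KEY = "default";

    /** Return the lower-cased host of the URI, or DEFAULT_KEY if there is none. */
    static String classKey(String uri) {
        try {
            String host = new URI(uri).getHost();
            return host == null ? DEFAULT_KEY : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            // Unparseable input still needs some queue; use the fallback.
            return DEFAULT_KEY;
        }
    }
}
```

Keying on the host means that all URIs from one site share a queue, which is one plausible reason for a frontier to compute such a key; other policies in the table evidently key on IP address or SURT prefix instead.
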
Uses of CandidateURI in org.archive.crawler.postprocessor |
---|
Methods in org.archive.crawler.postprocessor with parameters of type CandidateURI | |
---|---|
protected boolean |
SupplementaryLinksScoper.isInScope(CandidateURI caUri)
|
protected void |
SupplementaryLinksScoper.outOfScope(CandidateURI caUri)
Called when a CandidateURI is ruled out of scope. |
protected void |
LinksScoper.outOfScope(CandidateURI caUri)
|
protected void |
FrontierScheduler.schedule(CandidateURI caUri)
Schedule the given CandidateURI with the Frontier. |
Uses of CandidateURI in org.archive.crawler.processor |
---|
Methods in org.archive.crawler.processor with parameters of type CandidateURI | |
---|---|
protected boolean |
CrawlMapper.decideToMapOutlink(CandidateURI cauri)
|
protected void |
CrawlMapper.divertLog(CandidateURI cauri,
java.lang.String target)
Note the given CandidateURI in the appropriate diversion log. |
protected java.lang.String |
HashCrawlMapper.map(CandidateURI cauri)
Look up the crawler node name to which the given CandidateURI should be mapped. |
protected abstract java.lang.String |
CrawlMapper.map(CandidateURI cauri)
Look up the crawler node name to which the given CandidateURI should be mapped. |
protected java.lang.String |
LexicalCrawlMapper.map(CandidateURI cauri)
Look up the crawler node name to which the given CandidateURI should be mapped. |
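
The `map` methods above look up the crawler node name to which a URI should be mapped. One simple way to make such a mapping deterministic is to hash a key and reduce it modulo the number of nodes, sketched below. `HashNodeMapper` and its node-name list are hypothetical, and the use of `String.hashCode()` is an assumption chosen for brevity; the hashing actually used by `HashCrawlMapper` is not shown in this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hash-based mapper from a string key to a node name. */
final class HashNodeMapper {
    private final List<String> nodes;

    HashNodeMapper(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        this.nodes = new ArrayList<>(nodes); // defensive copy
    }

    /** Map the key to a node; the same key always yields the same node. */
    String map(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(index);
    }
}
```

A crawler node could compare the returned name with its own to decide whether to keep an outlink or divert it, which appears to be the role of `decideToMapOutlink` and `divertLog` in the table.
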
Uses of CandidateURI in org.archive.crawler.scope |
---|
Methods in org.archive.crawler.scope with parameters of type CandidateURI | |
---|---|
void |
SeedListener.addedSeed(CandidateURI uuri)
|
Uses of CandidateURI in org.archive.crawler.util |
---|
Fields in org.archive.crawler.util declared as CandidateURI | |
---|---|
(package private) CandidateURI |
FPMergeUriUniqFilter.PendingItem.caUri
|
Methods in org.archive.crawler.util with parameters of type CandidateURI | |
---|---|
void |
FPMergeUriUniqFilter.add(java.lang.String key,
CandidateURI value)
|
void |
SetBasedUriUniqFilter.add(java.lang.String key,
CandidateURI value)
|
void |
FPMergeUriUniqFilter.addForce(java.lang.String key,
CandidateURI value)
|
void |
SetBasedUriUniqFilter.addForce(java.lang.String key,
CandidateURI value)
|
void |
FPMergeUriUniqFilter.addNow(java.lang.String key,
CandidateURI value)
|
void |
SetBasedUriUniqFilter.addNow(java.lang.String key,
CandidateURI value)
|
void |
FPMergeUriUniqFilter.forget(java.lang.String key,
CandidateURI value)
|
void |
BloomUriUniqFilter.forget(java.lang.String canonical,
CandidateURI item)
|
void |
SetBasedUriUniqFilter.forget(java.lang.String key,
CandidateURI value)
|
protected void |
FPMergeUriUniqFilter.pend(long fp,
CandidateURI value)
Place the given FP/CandidateURI pair into the pending set, awaiting a merge to determine if it's actually accepted. |
void |
BenchmarkUriUniqFilters.receive(CandidateURI item)
|
Constructors in org.archive.crawler.util with parameters of type CandidateURI | |
---|---|
FPMergeUriUniqFilter.PendingItem(long fp,
CandidateURI value)
|