org.archive.crawler.selftest
Class SelfTestCrawlJobHandler

java.lang.Object
  extended by org.archive.crawler.admin.CrawlJobHandler
      extended by org.archive.crawler.selftest.SelfTestCrawlJobHandler
All Implemented Interfaces:
CrawlStatusListener, CrawlURIDispositionListener

public class SelfTestCrawlJobHandler
extends CrawlJobHandler
implements CrawlURIDispositionListener

An override of CrawlJobHandler that gains access to the end-of-crawl-job message.

Version:
$Id: SelfTestCrawlJobHandler.java 4667 2006-09-26 20:38:48Z paul_jack $
Author:
stack

Field Summary
 
Fields inherited from class org.archive.crawler.admin.CrawlJobHandler
DEFAULT_PROFILE, DEFAULT_PROFILE_NAME, ORDER_FILE_NAME, PROFILES_DIR_NAME, RECOVER_LOG
 
Constructor Summary
SelfTestCrawlJobHandler(java.io.File jobsDir, java.lang.String selfTestName, java.lang.String url)
           
 
Method Summary
 void crawledURIDisregard(CrawlURI curi)
          Notification of a crawled URI that is to be disregarded.
 void crawledURIFailure(CrawlURI curi)
          Notification of a failed crawling of a URI.
 void crawledURINeedRetry(CrawlURI curi)
          Notification of a failed crawl of a URI that will be retried (failure due to possible transient problems).
 void crawledURISuccessful(CrawlURI curi)
          Notification of a successfully crawled URI.
 void crawlEnded(java.lang.String sExitMessage)
          Called when a CrawlController has ended a crawl and is about to exit.
 void crawlStarted(java.lang.String message)
          Called on crawl start.
 
Methods inherited from class org.archive.crawler.admin.CrawlJobHandler
addJob, addProfile, checkDirectory, checkpointJob, crawlCheckpoint, crawlEnding, crawlPaused, crawlPausing, crawlResuming, createNewJob, createSettingsHandler, deleteJob, deleteProfile, deleteURIsFromPending, deleteURIsFromPending, discardNewJob, doFlush, ensureNewJobWritten, getCompletedJobs, getCurrentJob, getDefaultProfile, getInitialMarker, getJob, getNewJob, getNextJobUID, getPendingJobs, getPendingURIsList, getProfiles, getStateJobFile, importUri, importUri, importUris, importUris, importUris, isCrawling, isRunning, kickUpdate, loadJob, loadOptions, loadProfile, newJob, newJob, newProfile, pauseJob, requestCrawlStop, resumeJob, setDefaultProfile, startCrawler, startNextJob, startNextJobInternal, stop, stopCrawler, terminateCurrentJob, updateRecoveryPaths
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SelfTestCrawlJobHandler

public SelfTestCrawlJobHandler(java.io.File jobsDir,
                               java.lang.String selfTestName,
                               java.lang.String url)
Method Detail

crawlStarted

public void crawlStarted(java.lang.String message)
Description copied from interface: CrawlStatusListener
Called on crawl start.

Specified by:
crawlStarted in interface CrawlStatusListener
Overrides:
crawlStarted in class CrawlJobHandler
Parameters:
message - Start message.

crawlEnded

public void crawlEnded(java.lang.String sExitMessage)
Description copied from interface: CrawlStatusListener
Called when a CrawlController has ended a crawl and is about to exit.

Specified by:
crawlEnded in interface CrawlStatusListener
Overrides:
crawlEnded in class CrawlJobHandler
Parameters:
sExitMessage - Type of exit. Should be one of the STATUS constants defined in CrawlJob.
See Also:
CrawlJob
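
The override pattern described above (subclassing the handler to observe the end-of-crawl message) can be sketched as follows. This is a minimal illustration, not the Heritrix implementation: StatusListenerSketch, BaseHandlerSketch, EndMessageCapturingHandler, and awaitExitMessage are hypothetical stand-ins so that the block compiles without Heritrix on the classpath, and the "Finished" string is a placeholder rather than a confirmed CrawlJob STATUS constant.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for CrawlStatusListener; only the two callbacks
// overridden on this page are modeled.
interface StatusListenerSketch {
    void crawlStarted(String message);
    void crawlEnded(String exitMessage);
}

// Hypothetical stand-in for the parent handler class.
class BaseHandlerSketch implements StatusListenerSketch {
    @Override public void crawlStarted(String message) { }
    @Override public void crawlEnded(String exitMessage) { }
}

// Overrides crawlEnded so that a waiting thread can read the exit message.
class EndMessageCapturingHandler extends BaseHandlerSketch {
    private final CountDownLatch ended = new CountDownLatch(1);
    private volatile String exitMessage;

    @Override
    public void crawlEnded(String exitMessage) {
        super.crawlEnded(exitMessage);   // keep the parent's behavior
        this.exitMessage = exitMessage;  // remember how the crawl exited
        ended.countDown();               // release any waiting thread
    }

    // Returns the exit message, or null if the crawl has not ended in time
    // or the waiting thread was interrupted.
    String awaitExitMessage(long timeout, TimeUnit unit) {
        try {
            return ended.await(timeout, unit) ? exitMessage : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return null;
        }
    }
}

class EndOfCrawlSketch {
    public static void main(String[] args) {
        EndMessageCapturingHandler handler = new EndMessageCapturingHandler();
        handler.crawlStarted("started");
        handler.crawlEnded("Finished"); // placeholder exit message
        System.out.println(handler.awaitExitMessage(1, TimeUnit.SECONDS));
    }
}
```

A latch is used here so that a test thread can block until the crawl thread reports its exit; whether the real class synchronizes this way is not stated on this page.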

crawledURIDisregard

public void crawledURIDisregard(CrawlURI curi)
Description copied from interface: CrawlURIDispositionListener
Notification of a crawled URI that is to be disregarded. Usually this means that the robots.txt file for the relevant site forbids crawling this URI, so it will not be kept. Other reasons may apply. In all cases the URI was successfully downloaded but will not be stored.

Specified by:
crawledURIDisregard in interface CrawlURIDispositionListener
Parameters:
curi - The relevant CrawlURI

crawledURIFailure

public void crawledURIFailure(CrawlURI curi)
Description copied from interface: CrawlURIDispositionListener
Notification of a failed crawling of a URI. The failure is of a type that precludes retries (either by its very nature or because it has been retried too many times).

Specified by:
crawledURIFailure in interface CrawlURIDispositionListener
Parameters:
curi - The relevant CrawlURI

crawledURINeedRetry

public void crawledURINeedRetry(CrawlURI curi)
Description copied from interface: CrawlURIDispositionListener
Notification of a failed crawl of a URI that will be retried (failure due to possible transient problems).

Specified by:
crawledURINeedRetry in interface CrawlURIDispositionListener
Parameters:
curi - The relevant CrawlURI

crawledURISuccessful

public void crawledURISuccessful(CrawlURI curi)
Description copied from interface: CrawlURIDispositionListener
Notification of a successfully crawled URI.

Specified by:
crawledURISuccessful in interface CrawlURIDispositionListener
Parameters:
curi - The relevant CrawlURI
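
The four disposition callbacks documented above partition every processed URI into one outcome. One way a listener might tally them is sketched below, under stated assumptions: DispositionListenerSketch, Outcome, and DispositionTally are hypothetical names, and a plain String stands in for CrawlURI so that the block is self-contained. It does not show what SelfTestCrawlJobHandler itself does in these methods.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical labels for the four callbacks documented on this page.
enum Outcome { SUCCESSFUL, NEED_RETRY, DISREGARD, FAILURE }

// Hypothetical stand-in for CrawlURIDispositionListener; a String replaces
// CrawlURI so the sketch compiles without Heritrix.
interface DispositionListenerSketch {
    void crawledURISuccessful(String uri);
    void crawledURINeedRetry(String uri);
    void crawledURIDisregard(String uri);
    void crawledURIFailure(String uri);
}

// Counts how many notifications of each kind have been received.
class DispositionTally implements DispositionListenerSketch {
    private final Map<Outcome, Integer> counts = new EnumMap<>(Outcome.class);

    // Synchronized because callbacks may arrive from several crawl threads.
    private synchronized void record(Outcome outcome) {
        counts.merge(outcome, 1, Integer::sum);
    }

    @Override public void crawledURISuccessful(String uri) { record(Outcome.SUCCESSFUL); }
    @Override public void crawledURINeedRetry(String uri)  { record(Outcome.NEED_RETRY); }
    @Override public void crawledURIDisregard(String uri)  { record(Outcome.DISREGARD); }
    @Override public void crawledURIFailure(String uri)    { record(Outcome.FAILURE); }

    // Number of notifications seen so far for the given outcome.
    synchronized int count(Outcome outcome) {
        return counts.getOrDefault(outcome, 0);
    }
}

class DispositionTallySketch {
    public static void main(String[] args) {
        DispositionTally tally = new DispositionTally();
        tally.crawledURINeedRetry("http://example.com/a");  // transient problem
        tally.crawledURISuccessful("http://example.com/a"); // retry succeeded
        tally.crawledURIDisregard("http://example.com/private"); // e.g. robots.txt
        for (Outcome outcome : Outcome.values()) {
            System.out.println(outcome + " = " + tally.count(outcome));
        }
    }
}
```

Note that a single URI can produce a retry notification followed later by a success or a failure, so the retry count is not a count of distinct URIs.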


Copyright © 2003-2011 Internet Archive. All Rights Reserved.