org.archive.crawler.frontier
Class AdaptiveRevisitQueueList

java.lang.Object
  extended by org.archive.crawler.frontier.AdaptiveRevisitQueueList
All Implemented Interfaces:
Reporter

public class AdaptiveRevisitQueueList
extends java.lang.Object
implements Reporter

Maintains an ordered list of AdaptiveRevisitHostQueues used by a Frontier.

The list is ordered by AdaptiveRevisitHostQueue.getNextReadyTime(), with the smallest value at the top of the list and the remaining queues following in ascending order.
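The ordering rule can be illustrated with a minimal sketch. The HostQueue class below is a hypothetical stand-in for AdaptiveRevisitHostQueue that models only the ordering key, and the time unit is an assumption. It is not the actual implementation of this class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReadyTimeOrderSketch {
    // Hypothetical stand-in for AdaptiveRevisitHostQueue; only the ordering key is modelled.
    static final class HostQueue {
        final String hostName;
        final long nextReadyTime; // assumed to be a timestamp in milliseconds

        HostQueue(String hostName, long nextReadyTime) {
            this.hostName = hostName;
            this.nextReadyTime = nextReadyTime;
        }

        long getNextReadyTime() {
            return nextReadyTime;
        }
    }

    // Returns a copy sorted so that the smallest next ready time comes first.
    static List<HostQueue> orderByNextReadyTime(List<HostQueue> queues) {
        List<HostQueue> sorted = new ArrayList<>(queues);
        sorted.sort(Comparator.comparingLong(HostQueue::getNextReadyTime));
        return sorted;
    }

    public static void main(String[] args) {
        List<HostQueue> queues = new ArrayList<>();
        queues.add(new HostQueue("b.example.org", 3000L));
        queues.add(new HostQueue("a.example.org", 1000L));
        queues.add(new HostQueue("c.example.org", 2000L));
        for (HostQueue hq : orderByNextReadyTime(queues)) {
            System.out.println(hq.hostName + " ready at " + hq.getNextReadyTime());
        }
    }
}
```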

The list maintains a list of hostnames in a separate DB. On creation, the list tries to open the DB at a specified location. If the DB already exists, the list creates HQs for all the hostnames in it, discarding those that turn out to be empty.

Any BDB DatabaseException is converted to an IOException by public methods. The conversion keeps the original stack trace in place of the one created for the IOException, so that the true source of the exception is not lost.
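One way to carry out such a conversion is sketched below. FakeDatabaseException is a hypothetical stand-in for com.sleepycat.je.DatabaseException so that the example has no BDB dependency, and the helper names are illustrative assumptions rather than members of this class.

```java
import java.io.IOException;

class ExceptionConversionSketch {
    // Hypothetical stand-in for com.sleepycat.je.DatabaseException.
    static final class FakeDatabaseException extends Exception {
        FakeDatabaseException(String message) {
            super(message);
        }
    }

    // Simulates a database call that fails.
    static void failingDatabaseCall() throws FakeDatabaseException {
        throw new FakeDatabaseException("simulated database failure");
    }

    // Builds an IOException that carries the message and stack trace of the original exception.
    static IOException toIOException(Exception e) {
        IOException converted = new IOException(e.getMessage());
        converted.setStackTrace(e.getStackTrace()); // replace the new trace with the original one
        return converted;
    }

    // A public method in the style described above: callers only ever see IOException.
    static void publicOperation() throws IOException {
        try {
            failingDatabaseCall();
        } catch (FakeDatabaseException e) {
            throw toIOException(e);
        }
    }

    public static void main(String[] args) {
        try {
            publicOperation();
        } catch (IOException e) {
            e.printStackTrace(); // the trace points at failingDatabaseCall
        }
    }
}
```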

Author:
Kristinn Sigurdsson

Constructor Summary
AdaptiveRevisitQueueList(com.sleepycat.je.Environment env, com.sleepycat.bind.serial.StoredClassCatalog catalog)
           
 
Method Summary
 void close()
          Closes all HQs and the Environment.
 AdaptiveRevisitHostQueue createHQ(java.lang.String hostName, int valence)
          Creates a new AdaptiveRevisitHostQueue.
 long getAverageDepth()
          Returns the average depth of all the HQs in this list
 float getCongestionRatio()
          Returns the congestion ratio.
 long getDeepestQueueSize()
          Returns the size of the largest (deepest) queue.
 AdaptiveRevisitHostQueue getHQ(java.lang.String hostName)
          Get an AdaptiveRevisitHostQueue for the specified host.
 java.lang.String[] getReports()
          Get an array of report names offered by this Reporter.
 long getSize()
          Returns the number of URIs in all the HQs in this list
 AdaptiveRevisitHostQueue getTopHQ()
           
 long getUriCount()
          The total number of URIs queued in all the HQs belonging to this list.
protected  void reorder(AdaptiveRevisitHostQueue hq)
          This method reorders the host queues.
 void reportTo(java.io.PrintWriter writer)
          Make a default report to the passed-in Writer.
 void reportTo(java.lang.String name, java.io.PrintWriter writer)
          Make a report of the given name to the passed-in Writer. If the name is null, give the default report.
 java.lang.String singleLineLegend()
          Return a legend for the single-line summary report as a String.
 java.lang.String singleLineReport()
          Return a short single-line summary report as a String.
 void singleLineReportTo(java.io.PrintWriter writer)
          Make a single-line summary report to the passed-in writer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AdaptiveRevisitQueueList

public AdaptiveRevisitQueueList(com.sleepycat.je.Environment env,
                                com.sleepycat.bind.serial.StoredClassCatalog catalog)
                         throws java.io.IOException
Throws:
java.io.IOException
Method Detail

getHQ

public AdaptiveRevisitHostQueue getHQ(java.lang.String hostName)
Get an AdaptiveRevisitHostQueue for the specified host.

If one does not already exist, null is returned

Parameters:
hostName - The host's name
Returns:
an AdaptiveRevisitHostQueue for the specified host

createHQ

public AdaptiveRevisitHostQueue createHQ(java.lang.String hostName,
                                         int valence)
                                  throws java.io.IOException
Creates a new AdaptiveRevisitHostQueue.

If an HQ already exists for the specified hostName, the existing HQ is returned as it is. Its existing valence will not be updated to reflect a different valence.

Parameters:
hostName - The host's name
valence - number of simultaneous connections allowed to this host
Returns:
the newly created HQ, or the existing HQ if one already existed for hostName
Throws:
java.io.IOException
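The documented behaviour of getHQ and createHQ can be sketched as a get-or-create lookup. The map-backed storage and the HostQueue stand-in below are assumptions made for illustration; the sketch does not show how this class actually stores its queues.

```java
import java.util.HashMap;
import java.util.Map;

class CreateHqSketch {
    // Hypothetical stand-in for AdaptiveRevisitHostQueue.
    static final class HostQueue {
        final String hostName;
        final int valence;

        HostQueue(String hostName, int valence) {
            this.hostName = hostName;
            this.valence = valence;
        }
    }

    // Assumed storage: one queue per host name.
    private final Map<String, HostQueue> queuesByHost = new HashMap<>();

    // Returns the queue for the host, or null if none exists.
    HostQueue getHQ(String hostName) {
        return queuesByHost.get(hostName);
    }

    // Returns the existing queue unchanged if there is one; otherwise creates a new one.
    HostQueue createHQ(String hostName, int valence) {
        HostQueue existing = queuesByHost.get(hostName);
        if (existing != null) {
            return existing; // the requested valence is ignored in this case
        }
        HostQueue created = new HostQueue(hostName, valence);
        queuesByHost.put(hostName, created);
        return created;
    }

    public static void main(String[] args) {
        CreateHqSketch list = new CreateHqSketch();
        HostQueue first = list.createHQ("example.org", 1);
        HostQueue second = list.createHQ("example.org", 5);
        System.out.println(first == second);
        System.out.println(second.valence);
    }
}
```

A caller that needs a different valence for an existing host therefore cannot obtain it through createHQ alone.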

getTopHQ

public AdaptiveRevisitHostQueue getTopHQ()

getSize

public long getSize()
Returns the number of URIs in all the HQs in this list

Returns:
the number of URIs in all the HQs in this list

getAverageDepth

public long getAverageDepth()
Returns the average depth of all the HQs in this list

Returns:
the average depth of all the HQs in this list (rounded down)

getDeepestQueueSize

public long getDeepestQueueSize()
Returns the size of the largest (deepest) queue.

Returns:
the size of the largest (deepest) queue.
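The three size statistics above (total size, average depth rounded down, and deepest queue size) can be sketched over a plain array of per-queue depths. The array input and the method names are assumptions for illustration, as is the choice to return 0 for an empty list.

```java
class QueueDepthStatsSketch {
    // Sum of the depths of all queues.
    static long totalSize(long[] depths) {
        long total = 0;
        for (long depth : depths) {
            total += depth;
        }
        return total;
    }

    // Average depth; integer division rounds down for non-negative values.
    static long averageDepth(long[] depths) {
        if (depths.length == 0) {
            return 0; // assumption: an empty list reports an average of 0
        }
        return totalSize(depths) / depths.length;
    }

    // Depth of the deepest queue, or 0 if there are no queues.
    static long deepestQueueSize(long[] depths) {
        long deepest = 0;
        for (long depth : depths) {
            deepest = Math.max(deepest, depth);
        }
        return deepest;
    }

    public static void main(String[] args) {
        long[] depths = {5L, 2L, 4L};
        System.out.println(totalSize(depths));
        System.out.println(averageDepth(depths));
        System.out.println(deepestQueueSize(depths));
    }
}
```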

getCongestionRatio

public float getCongestionRatio()
Returns the congestion ratio.

The congestion ratio is equal to the total number of queues divided by the number of queues that are currently being processed or are snoozed (i.e. not ready). A congestion ratio of 1 indicates no congestion.

Returns:
the congestion ratio
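The formula as documented can be written out directly. The method and parameter names below are hypothetical, and how this class counts queues in each state is not shown here.

```java
class CongestionRatioSketch {
    // Total number of queues divided by the number that are busy or snoozed (not ready).
    // With a zero divisor, float division yields Infinity (or NaN when both counts are zero).
    static float congestionRatio(int totalQueues, int busyOrSnoozedQueues) {
        return (float) totalQueues / busyOrSnoozedQueues;
    }

    public static void main(String[] args) {
        System.out.println(congestionRatio(4, 4));
        System.out.println(congestionRatio(4, 2));
    }
}
```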

reorder

protected void reorder(AdaptiveRevisitHostQueue hq)
This method reorders the host queues. It is only called by the AdaptiveRevisitHostQueues that this list 'owns', when their reported next ready time is being updated.

Parameters:
hq - The calling HQ
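A minimal sketch of what such a reordering step might look like: the queue whose next ready time changed is removed and re-inserted at its sorted position. The list-backed storage and the HostQueue stand-in are assumptions, not the data structure this class necessarily uses.

```java
import java.util.ArrayList;
import java.util.List;

class ReorderSketch {
    // Hypothetical stand-in for AdaptiveRevisitHostQueue.
    static final class HostQueue {
        final String hostName;
        long nextReadyTime;

        HostQueue(String hostName, long nextReadyTime) {
            this.hostName = hostName;
            this.nextReadyTime = nextReadyTime;
        }
    }

    // Kept sorted by nextReadyTime, smallest first.
    private final List<HostQueue> sorted = new ArrayList<>();

    void add(HostQueue hq) {
        insertInOrder(hq);
    }

    // Call after hq.nextReadyTime has changed to restore the ordering.
    void reorder(HostQueue hq) {
        sorted.remove(hq); // identity-based removal, since equals is not overridden
        insertInOrder(hq);
    }

    // Inserts after all queues whose next ready time is smaller or equal.
    private void insertInOrder(HostQueue hq) {
        int i = 0;
        while (i < sorted.size() && sorted.get(i).nextReadyTime <= hq.nextReadyTime) {
            i++;
        }
        sorted.add(i, hq);
    }

    // The queue that becomes ready first, or null if the list is empty.
    HostQueue top() {
        return sorted.isEmpty() ? null : sorted.get(0);
    }

    public static void main(String[] args) {
        ReorderSketch list = new ReorderSketch();
        HostQueue a = new HostQueue("a.example.org", 1000L);
        HostQueue b = new HostQueue("b.example.org", 2000L);
        list.add(a);
        list.add(b);
        a.nextReadyTime = 3000L;
        list.reorder(a);
        System.out.println(list.top().hostName);
    }
}
```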

getUriCount

public long getUriCount()
The total number of URIs queued in all the HQs belonging to this list.

Returns:
total number of URIs queued in all the HQs belonging to this list.

close

public void close()
Closes all HQs and the Environment.


getReports

public java.lang.String[] getReports()
Description copied from interface: Reporter
Get an array of report names offered by this Reporter. A name in brackets indicates that a free-form String, in accordance with the informal description inside the brackets, may yield a useful report.

Specified by:
getReports in interface Reporter
Returns:
String array of report names, empty if there is only one report type

singleLineReport

public java.lang.String singleLineReport()
Description copied from interface: Reporter
Return a short single-line summary report as a String.

Specified by:
singleLineReport in interface Reporter
Returns:
String single-line summary report

reportTo

public void reportTo(java.io.PrintWriter writer)
Description copied from interface: Reporter
Make a default report to the passed-in Writer. Should be equivalent to reportTo(null, writer)

Specified by:
reportTo in interface Reporter
Parameters:
writer - to receive report
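The stated equivalence with reportTo(null, writer) is commonly achieved by delegation, as in the sketch below. The report text and the helper methods are hypothetical; only the two reportTo signatures mirror the Reporter interface.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class DefaultReportSketch {
    // Default report: delegates to the named variant with a null name.
    void reportTo(PrintWriter writer) {
        reportTo(null, writer);
    }

    // Named report; a null name selects the default report.
    void reportTo(String name, PrintWriter writer) {
        if (name == null) {
            writer.print("default report"); // placeholder content
        } else {
            writer.print("report: " + name); // placeholder content
        }
    }

    // Helper that captures the default report as a String.
    static String renderDefault() {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new DefaultReportSketch().reportTo(writer);
        writer.flush();
        return buffer.toString();
    }

    // Helper that captures a named report as a String.
    static String render(String name) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new DefaultReportSketch().reportTo(name, writer);
        writer.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(renderDefault());
        System.out.println(render(null));
    }
}
```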

reportTo

public void reportTo(java.lang.String name,
                     java.io.PrintWriter writer)
Description copied from interface: Reporter
Make a report of the given name to the passed-in Writer. If the name is null, give the default report.

Specified by:
reportTo in interface Reporter
Parameters:
writer - to receive report

singleLineReportTo

public void singleLineReportTo(java.io.PrintWriter writer)
Description copied from interface: Reporter
Make a single-line summary report to the passed-in writer

Specified by:
singleLineReportTo in interface Reporter
Parameters:
writer - to receive report

singleLineLegend

public java.lang.String singleLineLegend()
Description copied from interface: Reporter
Return a legend for the single-line summary report as a String.

Specified by:
singleLineLegend in interface Reporter
Returns:
String single-line summary legend


Copyright © 2003-2011 Internet Archive. All Rights Reserved.