org.archive.crawler.frontier
Class BdbWorkQueue

java.lang.Object
  extended by org.archive.crawler.frontier.WorkQueue
      extended by org.archive.crawler.frontier.BdbWorkQueue
All Implemented Interfaces:
java.io.Serializable, java.lang.Comparable, CrawlSubstats.HasCrawlSubstats, Frontier.FrontierGroup, Reporter

public class BdbWorkQueue
extends WorkQueue
implements java.lang.Comparable, java.io.Serializable

One independent queue of items with the same 'classKey' (e.g., host).

Author:
gojomo
See Also:
Serialized Form

Field Summary
 
Fields inherited from class org.archive.crawler.frontier.WorkQueue
classKey, substats
 
Constructor Summary
BdbWorkQueue(java.lang.String classKey, BdbFrontier frontier)
          Create a virtual queue inside the given BdbMultipleWorkQueues
 
Method Summary
protected  void deleteItem(WorkQueueFrontier frontier, CrawlURI peekItem)
          Removes the given item from the queue.
protected  long deleteMatchingFromQueue(WorkQueueFrontier frontier, java.lang.String match)
          Delete URIs matching the given pattern from this queue.
protected static java.lang.String getPrefixClassKey(byte[] byteArray)
          Returns a hex string of the given byte array (used for logging key prefixes).
protected  void insertItem(WorkQueueFrontier frontier, CrawlURI curi, boolean overwriteIfPresent)
          Insert the given curi, whether it is already present or not.
protected  CrawlURI peekItem(WorkQueueFrontier frontier)
          Returns the first item from the queue without deleting it.
 
Methods inherited from class org.archive.crawler.frontier.WorkQueue
clearHeld, compareTo, deleteMatching, dequeue, enqueue, expend, getClassKey, getContextUURI, getCount, getPendingExpenditure, getReports, getSessionBalance, getSubstats, getTotalBudget, getTotalExpenditure, getWakeTime, incrementSessionBalance, isHeld, isOverBudget, isRetired, noteError, peek, refund, reportTo, reportTo, resume, setActive, setHeld, setRetired, setSessionBalance, setTotalBudget, setWakeTime, singleLineLegend, singleLineReport, singleLineReportTo, suspend, unpeek, update
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.lang.Comparable
compareTo
 

Constructor Detail

BdbWorkQueue

public BdbWorkQueue(java.lang.String classKey,
                    BdbFrontier frontier)
Create a virtual queue inside the given BdbMultipleWorkQueues

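Parameters:
classKey - the key shared by all items in this queue (e.g., host)
frontier - the BdbFrontier in which this queue is created

Method Detail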

deleteMatchingFromQueue

protected long deleteMatchingFromQueue(WorkQueueFrontier frontier,
                                       java.lang.String match)
                                throws java.io.IOException
Description copied from class: WorkQueue
Delete URIs matching the given pattern from this queue.

Specified by:
deleteMatchingFromQueue in class WorkQueue
Parameters:
frontier - WorkQueues manager.
match - the pattern to match
Returns:
count of deleted URIs
Throws:
java.io.IOException - if there was a problem while deleting
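
The contract above (delete everything that matches, return the count) can be illustrated with a minimal in-memory sketch. It assumes that `match` is a Java regular expression and that the queue is a plain list of URI strings; both are assumptions made for illustration, and `DeleteMatchingSketch` is a hypothetical class, not part of Heritrix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: an in-memory list stands in for the
// BDB-backed queue, and 'match' is assumed to be a regular expression.
class DeleteMatchingSketch {

    // Removes every URI that fully matches the pattern and
    // returns the number of removed entries.
    static long deleteMatching(List<String> queue, String match) {
        Pattern pattern = Pattern.compile(match);
        long deleted = 0;
        Iterator<String> it = queue.iterator();
        while (it.hasNext()) {
            if (pattern.matcher(it.next()).matches()) {
                it.remove(); // safe removal while iterating
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>(Arrays.asList(
                "http://a.example/x.jpg",
                "http://a.example/y.html",
                "http://a.example/z.jpg"));
        long count = deleteMatching(queue, ".*\\.jpg");
        System.out.println(count + " deleted, " + queue.size() + " remaining");
    }
}
```

Returning the count as a `long` mirrors the signature documented above.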

deleteItem

protected void deleteItem(WorkQueueFrontier frontier,
                          CrawlURI peekItem)
                   throws java.io.IOException
Description copied from class: WorkQueue
Removes the given item from the queue. This is only used to remove the first item in the queue, so it is not necessary to implement a random-access queue.

Specified by:
deleteItem in class WorkQueue
Parameters:
frontier - Work queues manager.
peekItem - the item to remove; this is the first item in the queue
Throws:
java.io.IOException - if there was a problem while deleting the item

peekItem

protected CrawlURI peekItem(WorkQueueFrontier frontier)
                     throws java.io.IOException
Description copied from class: WorkQueue
Returns the first item from the queue without deleting it.

Specified by:
peekItem in class WorkQueue
Returns:
The peeked item, or null
Throws:
java.io.IOException - if there was a problem while peeking
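
Because deleteItem is only ever asked to remove the item that peekItem returned, a head-only structure is sufficient. The sketch below illustrates that peek-then-delete contract with an `ArrayDeque`; `HeadOnlyQueueSketch` and its string items are hypothetical stand-ins, not the BDB-backed implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of the peek-then-delete contract;
// strings stand in for CrawlURI instances.
class HeadOnlyQueueSketch {
    private final Deque<String> items = new ArrayDeque<>();

    // Appends an item at the tail of the queue.
    void insertItem(String uri) {
        items.addLast(uri);
    }

    // Returns the head without removing it, or null if the queue is empty.
    String peekItem() {
        return items.peekFirst();
    }

    // Removes the given item, which must be the current head.
    // No random access is needed because callers only delete what they peeked.
    void deleteItem(String peeked) {
        if (peeked == null || !peeked.equals(items.peekFirst())) {
            throw new IllegalStateException("only the head item can be deleted");
        }
        items.removeFirst();
    }

    public static void main(String[] args) {
        HeadOnlyQueueSketch queue = new HeadOnlyQueueSketch();
        queue.insertItem("http://example.com/a");
        queue.insertItem("http://example.com/b");
        String head = queue.peekItem(); // still in the queue after this call
        queue.deleteItem(head);         // now removed
        System.out.println(queue.peekItem());
    }
}
```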

insertItem

protected void insertItem(WorkQueueFrontier frontier,
                          CrawlURI curi,
                          boolean overwriteIfPresent)
                   throws java.io.IOException
Description copied from class: WorkQueue
Insert the given curi, whether it is already present or not. Hook for subclasses.

Specified by:
insertItem in class WorkQueue
Parameters:
frontier - WorkQueueFrontier.
curi - CrawlURI to insert.
overwriteIfPresent - whether to overwrite an entry that is already present
Throws:
java.io.IOException - if there was a problem while inserting the item
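
One plausible reading of the overwriteIfPresent flag, inferred from its name only, is "replace an existing entry when true, keep it when false". The sketch below shows that reading with a sorted in-memory map; `InsertSketch`, its string keys, and its string values are hypothetical, and the real method may behave differently.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: a sorted map stands in for the backing store.
// The overwrite semantics shown here are an assumption based on the flag name.
class InsertSketch {
    private final Map<String, String> store = new TreeMap<>();

    // Inserts the value under the key. An existing entry is replaced only
    // when overwriteIfPresent is true; otherwise it is left untouched.
    void insertItem(String key, String value, boolean overwriteIfPresent) {
        if (overwriteIfPresent) {
            store.put(key, value);
        } else {
            store.putIfAbsent(key, value);
        }
    }

    // Returns the stored value for the key, or null if absent.
    String get(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        InsertSketch sketch = new InsertSketch();
        sketch.insertItem("k1", "first", false);
        sketch.insertItem("k1", "second", false); // kept as "first"
        sketch.insertItem("k1", "third", true);   // replaced by "third"
        System.out.println(sketch.get("k1"));
    }
}
```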

getPrefixClassKey

protected static java.lang.String getPrefixClassKey(byte[] byteArray)
Parameters:
byteArray - Byte array to get hex string of.
Returns:
Hex string of the passed-in byte array (used for logging key prefixes).
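
A byte-array-to-hex conversion of this kind can be sketched as follows. `HexSketch.toHex` is a hypothetical helper written for illustration; the letter case and exact formatting produced by getPrefixClassKey are not stated here, so lowercase two-digit output is an assumption.

```java
// Hypothetical helper, not the Heritrix implementation.
// Assumes lowercase output with two hex digits per byte.
class HexSketch {

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes are not sign-extended.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(new byte[] {0x0f, (byte) 0xa0})); // prints "0fa0"
    }
}
```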


Copyright © 2003-2011 Internet Archive. All Rights Reserved.