Package org.archive.crawler.util

Interface Summary
Transformer<Original,Transformed> Transforms objects from one thing into another.
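
As a rough illustration of the one-to-one mapping this interface describes, here is a minimal sketch. `SketchTransformer`, its `transform` method, and `TransformerSketch.transformAll` are hypothetical stand-ins written for this example; the real interface's method name and signature are assumed, not taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the package's Transformer interface.
// The method name and signature are assumptions made for this sketch.
interface SketchTransformer<Original, Transformed> {
    Transformed transform(Original original);
}

class TransformerSketch {
    // Applies the transformer to each element in order and collects the
    // results, skipping any element whose transformed value is null.
    static <O, T> List<T> transformAll(List<O> input, SketchTransformer<O, T> transformer) {
        List<T> out = new ArrayList<>();
        for (O item : input) {
            T result = transformer.transform(item);
            if (result != null) {
                out.add(result);
            }
        }
        return out;
    }
}
```

A caller could pass a lambda such as `s -> s.length()` to map a list of strings to their lengths.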
 

Class Summary
BdbUriUniqFilter A BDB implementation of an AlreadySeen list.
BenchmarkUriUniqFilters BenchmarkUriUniqFilters
BloomUriUniqFilter An MG4J BloomFilter-based implementation of an AlreadySeen list.
CheckpointUtils Utilities useful for checkpointing.
CrawledBytesHistotable  
DiskFPMergeUriUniqFilter Crude FPMergeUriUniqFilter using a disk data file of raw longs as the overall FP record.
FPMergeUriUniqFilter UriUniqFilter based on merging FP arrays (in memory or from disk).
FPUriUniqFilter UriUniqFilter storing 64-bit UURI fingerprints, using an internal LongFPSet instance.
IoUtils I/O utils.
LogReader This class contains a variety of methods for reading log files (or other text files containing repeated lines with similar information).
LogUtils Logging utils.
MemFPMergeUriUniqFilter Crude all-in-memory FP-merging UriUniqFilter.
MemUriUniqFilter A purely in-memory UriUniqFilter based on a HashSet, which remembers every full URI string it sees.
NoopUriUniqFilter A UriUniqFilter that doesn't actually provide any uniqueness filter on presented items: all are passed through.
RecoveryLogMapper  
SetBasedUriUniqFilter UriUniqFilter based on an underlying UriSet (essentially a Set).
Sorts  
StringIntPair  
StringIntPairComparator  
Transform<Original,Transformed> A transformation of a collection.
TransformIterator<Original,Transformed>  
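
The UriUniqFilter descriptions above contrast remembering every full URI string (as MemUriUniqFilter does with a HashSet) with storing only a 64-bit fingerprint per URI (as FPUriUniqFilter does). The sketch below illustrates that trade-off only. `SeenFilterSketch` and its methods are hypothetical names, and FNV-1a is used purely as a stand-in 64-bit hash; it is not claimed to be the fingerprint function these classes use.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of two "already seen" strategies.
class SeenFilterSketch {
    // Strategy 1: keep every full URI string (exact, but memory-hungry).
    private final Set<String> seenUris = new HashSet<>();
    // Strategy 2: keep only a 64-bit fingerprint per URI (compact, with a
    // small chance that two different URIs collide).
    private final Set<Long> seenFingerprints = new HashSet<>();

    // Returns true only the first time this exact URI string is presented.
    boolean addFullString(String uri) {
        return seenUris.add(uri);
    }

    // Returns true only the first time this fingerprint is presented.
    boolean addFingerprint(String uri) {
        return seenFingerprints.add(fnv1a64(uri));
    }

    // FNV-1a 64-bit hash over the UTF-8 bytes; an assumed stand-in for a
    // real URI fingerprint function.
    static long fnv1a64(String s) {
        long hash = 0xcbf29ce484222325L; // FNV offset basis
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;      // FNV prime
        }
        return hash;
    }
}
```

In both strategies the first presentation of a URI returns true, meaning it is new and may be passed through, and a repeat returns false. The fingerprint variant trades exactness for a fixed eight bytes per entry.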
 

Exception Summary
SeedUrlNotFoundException  
 



Copyright © 2003-2011 Internet Archive. All Rights Reserved.