java.lang.Object
  javax.management.Attribute
    org.archive.crawler.settings.Type
      org.archive.crawler.settings.ComplexType
        org.archive.crawler.settings.ModuleType
          org.archive.crawler.framework.Processor
            org.archive.crawler.processor.recrawl.PersistProcessor
public abstract class PersistProcessor
Superclass for Processors which utilize BDB-JE for URI state (including most notably history) persistence. Includes many static utility methods (including a main()).
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType |
---|
ComplexType.MBeanAttributeInfoIterator |
Field Summary | |
---|---|
static java.lang.String | URI_HISTORY_DBNAME name of history Database |
Fields inherited from class org.archive.crawler.framework.Processor |
---|
ATTR_DECIDE_RULES, ATTR_ENABLED, attrDecideRules |
Fields inherited from class org.archive.crawler.settings.ComplexType |
---|
definition, definitionMap |
Constructor Summary | |
---|---|
PersistProcessor(java.lang.String name, java.lang.String string) | Usual constructor |
Method Summary | |
---|---|
static int | copyPersistSourceToHistoryMap(java.io.File context, java.lang.String sourcePath, com.sleepycat.collections.StoredSortedMap<java.lang.String,st.ata.util.AList> historyMap) Populates a given StoredSortedMap (history map) from an old environment db or a persist log. |
protected static com.sleepycat.je.DatabaseConfig | historyDatabaseConfig() |
static void | main(java.lang.String[] args) Utility main for importing a log into a BDB-JE environment or moving a database between environments (2 arguments), or simply dumping a log to stderr in a more readable format (1 argument). |
java.lang.String | persistKeyFor(CrawlURI curi) Return a preferred String key for persisting the given CrawlURI's AList state. |
static int | populatePersistEnv(java.lang.String sourcePath, java.io.File envFile) Populates a new environment db from an old environment db or a persist log. |
static EnhancedEnvironment | setupCopyEnvironment(java.io.File env) |
static EnhancedEnvironment | setupCopyEnvironment(java.io.File env, boolean readOnly) |
protected boolean | shouldLoad(CrawlURI curi) Whether the current CrawlURI's state should be loaded. |
protected boolean | shouldStore(CrawlURI curi) Whether the current CrawlURI's state should be persisted (to log or direct to database). |
Methods inherited from class org.archive.crawler.framework.Processor |
---|
checkForInterrupt, finalTasks, getController, getDecideRule, getDefaultNextProcessor, initialTasks, innerProcess, innerRejectProcess, isContentToProcess, isEnabled, isExpectedMimeType, isHttpTransactionContentToProcess, kickUpdate, process, report, rulesAccept, rulesAccept, setDefaultNextProcessor, spawn |
Methods inherited from class org.archive.crawler.settings.ModuleType |
---|
addElement, listUsedFiles |
Methods inherited from class org.archive.crawler.settings.Type |
---|
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
Methods inherited from class javax.management.Attribute |
---|
getName, hashCode |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String URI_HISTORY_DBNAME
Constructor Detail |
---|
public PersistProcessor(java.lang.String name, java.lang.String string)
Parameters:
name -
string -
Method Detail |
---|
protected static com.sleepycat.je.DatabaseConfig historyDatabaseConfig()
public java.lang.String persistKeyFor(CrawlURI curi)
Parameters:
curi - CrawlURI
protected boolean shouldStore(CrawlURI curi)
Parameters:
curi - CrawlURI
protected boolean shouldLoad(CrawlURI curi)
Parameters:
curi - CrawlURI
public static int populatePersistEnv(java.lang.String sourcePath, java.io.File envFile) throws com.sleepycat.je.DatabaseException, java.io.IOException
Parameters:
sourcePath - source of old entries: can be a path to an existing environment db, or a URL or path to a persist log
envFile - path to new environment db (or null for a dry run)
Throws:
com.sleepycat.je.DatabaseException
java.io.IOException
public static int copyPersistSourceToHistoryMap(java.io.File context, java.lang.String sourcePath, com.sleepycat.collections.StoredSortedMap<java.lang.String,st.ata.util.AList> historyMap) throws com.sleepycat.je.DatabaseException, java.io.IOException, java.net.MalformedURLException, java.io.UnsupportedEncodingException
Parameters:
sourcePath - source of old entries: can be a path to an existing environment db, or a URL or path to a persist log
historyMap - map to populate (or null for a dry run)
Throws:
com.sleepycat.je.DatabaseException
java.io.IOException
java.net.MalformedURLException
java.io.UnsupportedEncodingException
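Both populatePersistEnv and copyPersistSourceToHistoryMap accept null as the destination to request a dry run. The sketch below illustrates that convention with plain java.util collections in place of StoredSortedMap and AList. The class name DryRunCopySketch and the method copyEntries are hypothetical, and treating the returned int as the number of entries seen is an assumption, since this page does not say what the return value means.

```java
import java.util.Map;
import java.util.SortedMap;

/** Hypothetical stand-in; not part of Heritrix. */
final class DryRunCopySketch {

    /**
     * Walks every entry of source. When target is non-null each entry is
     * copied into it; when target is null nothing is written (dry run).
     * Returning the number of entries visited is an assumption of this sketch.
     */
    static int copyEntries(Map<String, String> source,
                           SortedMap<String, String> target) {
        int count = 0;
        for (Map.Entry<String, String> entry : source.entrySet()) {
            if (target != null) {
                target.put(entry.getKey(), entry.getValue());
            }
            count++;
        }
        return count;
    }
}
```

Under this reading, a caller could run the copy once with a null destination to check that the source is readable before committing to a real write.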
public static void main(java.lang.String[] args) throws com.sleepycat.je.DatabaseException, java.io.IOException
Parameters:
args - command-line arguments
Throws:
com.sleepycat.je.DatabaseException
java.io.IOException
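The summary for main describes two modes selected by argument count: one argument dumps a log to stderr, two arguments import a log or move a database into an environment. A minimal sketch of that dispatch follows. The class PersistMainSketch, the Mode enum, and the USAGE fallback for any other argument count are hypothetical; this page does not document what main does in that case.

```java
/** Hypothetical illustration of argument-count dispatch; not the Heritrix source. */
final class PersistMainSketch {

    enum Mode { DUMP_LOG, COPY_TO_ENV, USAGE }

    /** Chooses a mode from the number of command-line arguments. */
    static Mode modeFor(String[] args) {
        switch (args.length) {
            case 1:
                return Mode.DUMP_LOG;     // 1 argument: dump a log in readable form
            case 2:
                return Mode.COPY_TO_ENV;  // 2 arguments: import or move into an environment
            default:
                return Mode.USAGE;        // assumed fallback, not documented on this page
        }
    }
}
```

In the two-argument case the arguments presumably correspond to the sourcePath and envFile parameters of populatePersistEnv, though that mapping is not stated here.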
public static EnhancedEnvironment setupCopyEnvironment(java.io.File env) throws com.sleepycat.je.DatabaseException
Throws:
com.sleepycat.je.DatabaseException
public static EnhancedEnvironment setupCopyEnvironment(java.io.File env, boolean readOnly) throws com.sleepycat.je.DatabaseException
Throws:
com.sleepycat.je.DatabaseException