java.lang.Object
  java.util.AbstractCollection<E>
    java.util.AbstractSet<E>
      java.util.TreeSet<java.lang.String>
        org.archive.util.PrefixSet
          org.archive.util.SurtPrefixSet
public class SurtPrefixSet
Specialized TreeSet for keeping a set of String prefixes. Redundant prefixes (those that are themselves prefixed by other set entries) are eliminated.
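The redundancy rule described above can be illustrated with a small self-contained sketch. This is not the org.archive.util source: the class name `PrefixSetSketch` is hypothetical, and the use of `headSet`/`tailSet` is one plausible way to get this behavior from a sorted set, assumed here for illustration.

```java
import java.util.Iterator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch of a prefix-reducing set; not the library implementation.
class PrefixSetSketch extends TreeSet<String> {
    private static final long serialVersionUID = 1L;

    /** True if s itself, or some prefix of s, is already in the set. */
    public boolean containsPrefixOf(String s) {
        if (contains(s)) {
            return true;
        }
        // In a reduced set, the only entry that can prefix s is the
        // greatest entry that sorts strictly before s.
        SortedSet<String> smaller = headSet(s);
        return !smaller.isEmpty() && s.startsWith(smaller.last());
    }

    /** Adds s unless it is redundant; evicts entries that s makes redundant. */
    @Override
    public boolean add(String s) {
        if (containsPrefixOf(s)) {
            return false; // already covered by an existing entry
        }
        boolean added = super.add(s);
        // Entries that start with s sort contiguously right after s.
        Iterator<String> it = tailSet(s, false).iterator();
        while (it.hasNext() && it.next().startsWith(s)) {
            it.remove();
        }
        return added;
    }

    public static void main(String[] args) {
        PrefixSetSketch set = new PrefixSetSketch();
        set.add("http://(org,archive,www,)/a/b");
        set.add("http://(org,archive,");          // evicts the longer entry
        set.add("http://(org,archive,www,)/c");   // rejected as redundant
        System.out.println(set.size() + " " + set); // prints: 1 [http://(org,archive,]
    }
}
```

Because the set is kept reduced on every `add`, a lookup only needs to inspect one neighbor in sort order rather than scan every entry.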
Constructor Summary | |
---|---|
SurtPrefixSet() | |
Method Summary | | |
---|---|---|
void | convertAllPrefixesToDomains() | Changes all prefixes so that they only enforce a general domain (allowing subdomains). For prefixes that don't include a ')', no change is necessary. |
void | convertAllPrefixesToHosts() | Changes all prefixes so that they enforce an exact host. |
static java.lang.String | convertPrefixToDomain(java.lang.String prefix) | |
static java.lang.String | convertPrefixToHost(java.lang.String prefix) | |
void | exportTo(java.io.Writer fw) | |
static java.lang.String | getCandidateSurt(java.lang.Object object) | Calculate the SURT-form URI to use as a candidate against prefixes, from the given Object (CandidateURI or UURI). |
void | importFrom(java.io.Reader r) | Read a set of SURT prefixes from a reader source; keep sorted and with redundant entries removed. |
void | importFromMixed(java.io.Reader r, boolean deduceFromSeeds) | Import SURT prefixes from a reader with mixed URI and SURT prefix format. |
void | importFromUris(java.io.Reader r) | |
static void | main(java.lang.String[] args) | Allow class to be used as a command-line tool for converting URL lists (or naked host or host/path fragments implied to be HTTP URLs) to implied SURT prefix form. |
static java.lang.String | prefixFromPlain(java.lang.String u) | Given a plain URI or hostname/hostname+path, deduce an implied SURT prefix from it. |
Methods inherited from class org.archive.util.PrefixSet |
---|
add, containsPrefixOf |
Methods inherited from class java.util.TreeSet |
---|
addAll, ceiling, clear, clone, comparator, contains, descendingIterator, descendingSet, first, floor, headSet, headSet, higher, isEmpty, iterator, last, lower, pollFirst, pollLast, remove, size, subSet, subSet, tailSet, tailSet |
Methods inherited from class java.util.AbstractSet |
---|
equals, hashCode, removeAll |
Methods inherited from class java.util.AbstractCollection |
---|
containsAll, retainAll, toArray, toArray, toString |
Methods inherited from class java.lang.Object |
---|
finalize, getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface java.util.Set |
---|
containsAll, equals, hashCode, removeAll, retainAll, toArray, toArray |
Constructor Detail |
---|
public SurtPrefixSet()
Method Detail |
---|
public void importFrom(java.io.Reader r)
Parameters:
r - reader over file of SURT-format strings
Throws:
java.io.IOException
public void importFromUris(java.io.Reader r)
Parameters:
r - Where to read from.
public void importFromMixed(java.io.Reader r, boolean deduceFromSeeds)
Parameters:
r - the reader to import the prefixes from
deduceFromSeeds - true to also import SURT prefixes implied from normal URIs/hostname seeds
public static java.lang.String prefixFromPlain(java.lang.String u)
Parameters:
u - URI or almost-URI to consider
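As a rough illustration of deducing a SURT prefix from a plain URI or bare hostname, the sketch below reverses the host labels and trims the path back to its last slash. The class and method names (`SurtPrefixSketch`, `impliedPrefix`) are hypothetical, and the specific rules are assumptions of this sketch (defaulting to `http://`, leaving a bare host open-ended, ignoring ports and user info), so prefixFromPlain itself may behave differently.

```java
import java.util.Locale;

// Hypothetical sketch; the rules below are assumptions, not the library's logic.
class SurtPrefixSketch {

    /** Reverses "www.archive.org" into "org,archive,www," (lowercased). */
    static String reverseHost(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            sb.append(labels[i]).append(',');
        }
        return sb.toString();
    }

    /** Builds an implied SURT prefix from a URI or scheme-less host/path. */
    static String impliedPrefix(String u) {
        String s = u.trim();
        if (!s.contains("://")) {
            s = "http://" + s; // assumption: naked hosts are treated as HTTP
        }
        int schemeEnd = s.indexOf("://") + 3;
        String scheme = s.substring(0, schemeEnd);
        String rest = s.substring(schemeEnd);
        int slash = rest.indexOf('/');
        if (slash < 0) {
            // assumption: a bare host yields an open prefix with no ')'
            return scheme + "(" + reverseHost(rest);
        }
        String host = rest.substring(0, slash);
        String path = rest.substring(slash);
        // keep the path only up to and including its last slash
        path = path.substring(0, path.lastIndexOf('/') + 1);
        return scheme + "(" + reverseHost(host) + ")" + path;
    }

    public static void main(String[] args) {
        // prints: http://(org,archive,www,)/movies/
        System.out.println(impliedPrefix("www.archive.org/movies/a.html"));
        // prints: http://(org,archive,
        System.out.println(impliedPrefix("http://archive.org"));
    }
}
```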
public static java.lang.String getCandidateSurt(java.lang.Object object)
Parameters:
object - CandidateURI or UURI
public void exportTo(java.io.Writer fw) throws java.io.IOException
Parameters:
fw
Throws:
java.io.IOException
public void convertAllPrefixesToHosts()
public static java.lang.String convertPrefixToHost(java.lang.String prefix)
public void convertAllPrefixesToDomains()
public static java.lang.String convertPrefixToDomain(java.lang.String prefix)
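The host and domain conversions carry no detail text here, so the following is only a guess at their shape. It takes from the summary that a prefix without a ')' already admits subdomains, and assumes that an exact-host prefix ends at the ')'. The names `SurtPrefixConversionSketch`, `toDomain` and `toHost` are hypothetical.

```java
// Hypothetical sketch of host/domain prefix conversion; the truncation
// rules are assumptions and may not match the library methods exactly.
class SurtPrefixConversionSketch {

    /** Drops the ')' and anything after it so that subdomains also match. */
    static String toDomain(String prefix) {
        int close = prefix.indexOf(')');
        return close < 0 ? prefix : prefix.substring(0, close);
    }

    /** Ends the prefix at ')' so that only the exact host matches. */
    static String toHost(String prefix) {
        int close = prefix.indexOf(')');
        if (close >= 0) {
            return prefix.substring(0, close + 1); // discard path info
        }
        // open-ended domain prefix: close it off
        return prefix.endsWith(",") ? prefix + ")" : prefix + ",)";
    }

    public static void main(String[] args) {
        String p = "http://(org,archive,www,)/movies/";
        System.out.println(toDomain(p)); // prints: http://(org,archive,www,
        System.out.println(toHost(p));   // prints: http://(org,archive,www,)
    }
}
```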
public static void main(java.lang.String[] args) throws java.io.IOException
Parameters:
args - cmd-line arguments: may include input file
Throws:
java.io.IOException