java.lang.Object org.archive.extractor.CharSequenceLinkExtractor
public abstract class CharSequenceLinkExtractor
Abstract superclass providing utility methods for LinkExtractors that prefer to work on a CharSequence rather than a stream. Rough draft in progress: incomplete and untested.
Field Summary | |
---|---|
protected UURI | base |
protected ExtractErrorListener | extractErrorListener |
protected java.util.LinkedList<Link> | next |
protected UURI | source |
protected java.lang.CharSequence | sourceContent |
Constructor Summary |
---|
CharSequenceLinkExtractor() |
Method Summary | |
---|---|
protected java.lang.CharSequence | charSequenceFrom(java.io.InputStream content, java.nio.charset.Charset charset) |
protected java.lang.CharSequence | createCharSequenceFrom(java.io.InputStream content, java.nio.charset.Charset charset) |
static void | extract(java.lang.CharSequence content, UURI source, UURI base, java.util.List<Link> collector, ExtractErrorListener extractErrorListener) Convenience method to do default extraction. |
protected abstract boolean | findNextLink() Scan to the next link(s), if any, loading them into the next buffer. |
boolean | hasNext() |
protected static CharSequenceLinkExtractor | newDefaultInstance() |
java.lang.Object | next() |
Link | nextLink() Alternative to Iterator.next() which returns type Link. |
void | remove() |
void | reset() Discard all state. |
void | setup(UURI sourceandbase, java.lang.CharSequence content, ExtractErrorListener listener) Convenience method for when source and base are the same. |
void | setup(UURI sourceandbase, java.io.InputStream content, java.nio.charset.Charset charset, ExtractErrorListener listener) Convenience version of the stream-based setup for the common case where source and base are the same. |
void | setup(UURI source, UURI base, java.lang.CharSequence content, ExtractErrorListener listener) |
void | setup(UURI source, UURI base, java.io.InputStream content, java.nio.charset.Charset charset, ExtractErrorListener listener) Set up the LinkExtractor to operate on the given stream and charset, treating the given base as the initial 'base' URI for resolving relative URIs. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected UURI source
protected UURI base
protected ExtractErrorListener extractErrorListener
protected java.lang.CharSequence sourceContent
protected java.util.LinkedList<Link> next
Constructor Detail |
---|
public CharSequenceLinkExtractor()
Method Detail |
---|
public void setup(UURI source, UURI base, java.io.InputStream content, java.nio.charset.Charset charset, ExtractErrorListener listener)
Specified by: setup in interface LinkExtractor
Parameters:
source - source URI
base - base URI (usually the source URI) for URI derelativizing
content - input stream of content to scan for links
charset - Charset to consult to decode stream to characters
listener - ExtractErrorListener to notify, rather than raising exception through extraction loop

public void setup(UURI source, UURI base, java.lang.CharSequence content, ExtractErrorListener listener)
Parameters:
source
base
content
listener

public void setup(UURI sourceandbase, java.lang.CharSequence content, ExtractErrorListener listener)
Parameters:
sourceandbase
content
listener

public void setup(UURI sourceandbase, java.io.InputStream content, java.nio.charset.Charset charset, ExtractErrorListener listener)
Specified by: setup in interface LinkExtractor
Parameters:
sourceandbase - URI to use as source and base for derelativizing
content - input stream of content to scan for links
charset - Charset to consult to decode stream to characters
listener - ExtractErrorListener to notify, rather than raising exception through extraction loop

public Link nextLink()
Specified by: nextLink in interface LinkExtractor
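The base parameter of setup exists so that relative references found in the content can be resolved ("derelativized") into absolute URIs. The following is a minimal sketch of that step only. It uses java.net.URI from the JDK as a stand-in for UURI, whose API is not shown on this page, and the class and method names are hypothetical.

```java
import java.net.URI;

// Hypothetical helper: java.net.URI stands in for UURI, whose API is not shown here.
final class BaseResolutionSketch {
    // Resolve a possibly relative reference against the base URI.
    static URI derelativize(URI base, String reference) {
        return base.resolve(reference);
    }

    // Mirrors the convenience overloads: one URI serves as both source and base.
    static URI derelativize(String sourceAndBase, String reference) {
        return derelativize(URI.create(sourceAndBase), reference);
    }

    public static void main(String[] args) {
        URI base = URI.create("http://example.com/a/b.html");
        System.out.println(derelativize(base, "c.html"));                 // http://example.com/a/c.html
        System.out.println(derelativize(base, "/d.html"));                // http://example.com/d.html
        System.out.println(derelativize(base, "http://other.example/e")); // http://other.example/e
    }
}
```

An absolute reference passes through unchanged, which is why using the source URI as the base is a safe default.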
public void reset()
Specified by: reset in interface LinkExtractor

public boolean hasNext()
Specified by: hasNext in interface java.util.Iterator

protected abstract boolean findNextLink()

public java.lang.Object next()
Specified by: next in interface java.util.Iterator

public void remove()
Specified by: remove in interface java.util.Iterator
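The members above suggest a buffered-iterator pattern: a subclass implements findNextLink() to scan forward and load results into the next buffer, while hasNext() and next() drain that buffer. The sketch below illustrates the pattern under stated assumptions and is not the actual implementation: String replaces Link, the class names are hypothetical, and the regex-based subclass is an invented example of a concrete extractor.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the pattern; String replaces Link for brevity.
abstract class BufferedExtractorSketch implements Iterator<String> {
    protected CharSequence sourceContent;
    protected final LinkedList<String> next = new LinkedList<>();

    void setup(CharSequence content) {
        reset();
        this.sourceContent = content;
    }

    // Discard all state.
    void reset() {
        sourceContent = null;
        next.clear();
    }

    // Scan forward, adding any links found to the next buffer.
    // Returns false once the content is exhausted.
    protected abstract boolean findNextLink();

    @Override
    public boolean hasNext() {
        return !next.isEmpty() || findNextLink();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return next.removeFirst();
    }
    // remove() is left at Iterator's default, which throws UnsupportedOperationException.
}

// Invented concrete subclass: finds double-quoted href attribute values with a regex.
final class HrefExtractorSketch extends BufferedExtractorSketch {
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");
    private Matcher matcher;

    @Override
    void setup(CharSequence content) {
        super.setup(content);
        matcher = HREF.matcher(content);
    }

    @Override
    void reset() {
        super.reset();
        matcher = null;
    }

    @Override
    protected boolean findNextLink() {
        if (matcher == null || !matcher.find()) {
            return false;
        }
        next.add(matcher.group(1));
        return true;
    }
}
```

Because the buffer is a list, one call to findNextLink() may load several links at once, and hasNext() will not rescan until the buffer is empty.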
protected java.lang.CharSequence charSequenceFrom(java.io.InputStream content, java.nio.charset.Charset charset)
Parameters:
content
charset

protected java.lang.CharSequence createCharSequenceFrom(java.io.InputStream content, java.nio.charset.Charset charset)
Parameters:
content
charset
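This page does not say how these two methods decode the stream or how they differ. As a rough illustration of the general task, turning an InputStream into a CharSequence with a given Charset, here is a sketch that uses only JDK classes. The class and method names are hypothetical, and the real methods may well avoid holding the entire content in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper; not the implementation behind charSequenceFrom.
final class CharSequenceDecodingSketch {
    // Read the whole stream, decoding bytes to characters with the given charset.
    static CharSequence decode(InputStream content, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(content, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers need no throws clause in this sketch.
            throw new UncheckedIOException(e);
        }
        return sb; // StringBuilder implements CharSequence
    }
}
```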
public static void extract(java.lang.CharSequence content, UURI source, UURI base, java.util.List<Link> collector, ExtractErrorListener extractErrorListener)
Parameters:
content
source
base
collector
extractErrorListener

protected static CharSequenceLinkExtractor newDefaultInstance()
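The extract signature suggests a one-shot helper that gathers every link into collector and reports problems to the listener instead of throwing from inside the extraction loop. The sketch below shows that shape under several assumptions: java.net.URI stands in for both UURI and Link, the source parameter is omitted, ErrorListenerSketch is a hypothetical interface (the real ExtractErrorListener API is not shown on this page), and the regex scan is an invented placeholder for the default extractor.

```java
import java.net.URI;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical listener; the real ExtractErrorListener API is not shown on this page.
interface ErrorListenerSketch {
    void noteError(Exception e, CharSequence offendingText);
}

final class ExtractAllSketch {
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // Collect every resolvable link; report unresolvable ones and keep going.
    static void extract(CharSequence content, URI base,
                        List<URI> collector, ErrorListenerSketch listener) {
        Matcher m = HREF.matcher(content);
        while (m.find()) {
            String raw = m.group(1);
            try {
                collector.add(base.resolve(raw));
            } catch (IllegalArgumentException e) {
                // URI.resolve(String) rejects malformed references; notify instead of aborting.
                listener.noteError(e, raw);
            }
        }
    }
}
```

Routing failures to a listener lets one malformed link be recorded without losing the links that follow it.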