org.archive.util.iterator
Class RegexpLineIterator
java.lang.Object
org.archive.util.iterator.LookaheadIterator<Transformed>
org.archive.util.iterator.TransformingIteratorWrapper<java.lang.String,java.lang.String>
org.archive.util.iterator.RegexpLineIterator
- All Implemented Interfaces:
- java.util.Iterator<java.lang.String>
public class RegexpLineIterator
- extends TransformingIteratorWrapper<java.lang.String,java.lang.String>
Utility class providing an Iterator interface over line-oriented
text input. It is configured with three arguments: a regexp matching
lines to ignore (such as blank lines or comments), a regexp matching
lines to treat as input, and a replacement template describing what
to return from each input line (such as a whitespace-trimmed token
with any trailing comment removed). With suitable patterns it can
handle a number of line-based formats.
The public static members provide ready-made pattern configurations
for common cases.
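The ignore/extract/replace configuration described above can be illustrated with a minimal, self-contained sketch that uses only java.util.regex. It does not call RegexpLineIterator itself. The class name LineFilterSketch, the method entries, and the three pattern strings are assumptions made for illustration; in particular, the patterns are not the values of the library's constants, which this page does not list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineFilterSketch {
    // Hypothetical patterns for illustration; NOT the library's constant values.
    static final String IGNORE = "\\s*(#.*)?";               // blank or comment-only line
    static final String EXTRACT = "^\\s*(\\S+)\\s*(#.*)?$";  // one token, optional trailing comment
    static final String TEMPLATE = "$1";                     // emit the first capture group

    /** Applies the ignore/extract/template configuration to each line and collects the results. */
    static List<String> entries(Iterator<String> lines, String ignore,
                                String extract, String template) {
        Pattern ignorePattern = Pattern.compile(ignore);
        Pattern extractPattern = Pattern.compile(extract);
        List<String> out = new ArrayList<>();
        while (lines.hasNext()) {
            String line = lines.next();
            if (ignorePattern.matcher(line).matches()) {
                continue; // ignorable line: blank or comment
            }
            Matcher m = extractPattern.matcher(line);
            if (m.matches()) {
                // The pattern spans the whole line, so the line is replaced
                // by the expanded template.
                out.add(m.replaceFirst(template));
            }
            // Lines matching neither pattern are silently dropped in this sketch.
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "# seeds", "", "  http://example.com/  # home", "http://example.org/");
        // prints [http://example.com/, http://example.org/]
        System.out.println(entries(input.iterator(), IGNORE, EXTRACT, TEMPLATE));
    }
}
```

Swapping in a different extract pattern and template is what lets the same loop serve other line-based formats.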
- Author:
- gojomo
Constructor Summary
RegexpLineIterator(java.util.Iterator<java.lang.String> inner,
java.lang.String ignore,
java.lang.String extract,
java.lang.String replace)
Method Summary
protected java.lang.String
transform(java.lang.String line)
Transforms one input line, returning the portion extracted from a matching line, or null for a line that is skipped.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
COMMENT_LINE
public static final java.lang.String COMMENT_LINE
- See Also:
- Constant Field Values
NONWHITESPACE_ENTRY_TRAILING_COMMENT
public static final java.lang.String NONWHITESPACE_ENTRY_TRAILING_COMMENT
- See Also:
- Constant Field Values
TRIMMED_ENTRY_TRAILING_COMMENT
public static final java.lang.String TRIMMED_ENTRY_TRAILING_COMMENT
- See Also:
- Constant Field Values
ENTRY
public static final java.lang.String ENTRY
- See Also:
- Constant Field Values
ignoreLine
protected java.util.regex.Matcher ignoreLine
extractLine
protected java.util.regex.Matcher extractLine
outputTemplate
protected java.lang.String outputTemplate
Constructor Detail
RegexpLineIterator
public RegexpLineIterator(java.util.Iterator<java.lang.String> inner,
java.lang.String ignore,
java.lang.String extract,
java.lang.String replace)
Method Detail
transform
protected java.lang.String transform(java.lang.String line)
- Transforms a single input line. Skips lines matching ignoreLine;
extracts the desired portion of lines matching extractLine,
formatted according to outputTemplate; and reports, for
information only, any line matching neither.
- Specified by:
transform
in class TransformingIteratorWrapper<java.lang.String,java.lang.String>
- Parameters:
line
- the line of text to transform.
- Returns:
- the text extracted from the line, or null if the line is skipped
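The three outcomes described for transform (skip, extract, report) can be sketched as a standalone class. This is an approximation built from the field names and signatures on this page, not the library's source. The class name TransformSketch is hypothetical, and the null return for skipped or unmatched lines, the use of System.err for the report, and the reuse of each Matcher through reset() are assumptions of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransformSketch {
    // Matchers are created once and re-targeted at each line with reset(),
    // as the Matcher-typed fields ignoreLine and extractLine suggest.
    private final Matcher ignoreLine;
    private final Matcher extractLine;
    private final String outputTemplate;

    TransformSketch(String ignore, String extract, String replace) {
        this.ignoreLine = Pattern.compile(ignore).matcher("");
        this.extractLine = Pattern.compile(extract).matcher("");
        this.outputTemplate = replace;
    }

    /** Returns the extracted text, or null when the line yields no entry (assumed convention). */
    String transform(String line) {
        if (ignoreLine.reset(line).matches()) {
            return null; // ignorable line, nothing to emit
        }
        if (extractLine.reset(line).matches()) {
            // Expand the template ($1, $2, ...) against the successful match.
            StringBuffer out = new StringBuffer();
            extractLine.appendReplacement(out, outputTemplate);
            return out.toString();
        }
        // Neither pattern matched: report it, but do not fail.
        System.err.println("line matched neither pattern: " + line);
        return null;
    }
}
```

Because a Matcher holds mutable state, an object written this way should not be shared between threads without external synchronization.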
Copyright © 2003-2011 Internet Archive. All Rights Reserved.