org.archive.io.arc
Class ARCReaderFactory.CompressedARCReader
java.lang.Object
org.archive.io.ArchiveReader
org.archive.io.arc.ARCReader
org.archive.io.arc.ARCReaderFactory.CompressedARCReader
- All Implemented Interfaces:
- ARCConstants, ArchiveFileConstants
- Enclosing class:
- ARCReaderFactory
public class ARCReaderFactory.CompressedARCReader
- extends ARCReader
Compressed arc file reader.
- Author:
- stack
Fields inherited from interface org.archive.io.arc.ARCConstants
ARC_FILE_EXTENSION, ARC_GZIP_EXTRA_FIELD, ARC_MAGIC_NUMBER, CHECKSUM_FIELD_KEY, CHECKSUM_HEADER_FIELD_KEY, CODE_HEADER_FIELD_KEY, COMPRESSED_ARC_FILE_EXTENSION, DEFAULT_ENCODING, DEFAULT_GZIP_HEADER_LENGTH, DEFAULT_MAX_ARC_FILE_SIZE, DOT_ARC_FILE_EXTENSION, DOT_COMPRESSED_ARC_FILE_EXTENSION, DOT_COMPRESSED_FILE_EXTENSION, FILENAME_FIELD_KEY, FILENAME_HEADER_FIELD_KEY, GZIP_HEADER_BEGIN, HEADER_FIELD_SEPARATOR, IP_HEADER_FIELD_KEY, LINE_SEPARATOR, LOCATION_HEADER_FIELD_KEY, MAX_METADATA_LINE_LENGTH, MINIMUM_RECORD_LENGTH, OFFSET_FIELD_KEY, OFFSET_HEADER_FIELD_KEY, REQUIRED_VERSION_1_HEADER_FIELDS, STATUSCODE_FIELD_KEY, TOKENIZED_PREFIX
Fields inherited from interface org.archive.io.ArchiveFileConstants
ABSOLUTE_OFFSET_KEY, CDX, CDX_FILE, CDX_LINE_BUFFER_SIZE, COMPRESSED_FILE_EXTENSION, CRLF, DATE_FIELD_KEY, DEFAULT_DIGEST_METHOD, DUMP, GZIP_DUMP, HEADER, INVALID_SUFFIX, LENGTH_FIELD_KEY, MIMETYPE_FIELD_KEY, NOHEAD, OCCUPIED_SUFFIX, READER_IDENTIFIER_FIELD_KEY, RECORD_IDENTIFIER_FIELD_KEY, SINGLE_SPACE, TYPE_FIELD_KEY, URL_FIELD_KEY, VERSION_FIELD_KEY
Method Summary
ARCRecord get(long offset)
Get the record at the passed offset.
protected void gotoEOR(ArchiveRecord rec)
Skip over any trailing new lines at end of the record so we're lined up ready to read the next.
java.util.Iterator<ArchiveRecord> iterator()
Returns an ArchiveRecord iterator.
Methods inherited from class org.archive.io.arc.ARCReader
createArchiveRecord, createCDXIndexFile, dump, fixSpaceInURL, getDeleteFileOnCloseReader, getDotFileExtension, getFileExtension, getVersion, isAlignedOnFirstRecord, isDate, isLegitimateIPValue, isNumber, isParseHttpHeaders, main, output, output, outputRecord, setAlignedOnFirstRecord, setParseHttpHeaders
Methods inherited from class org.archive.io.ArchiveReader
cdxOutput, cleanupCurrentRecord, close, currentRecord, get, getCurrentRecord, getFileName, getIn, getInputStream, getInputStream, getLogger, getOptions, getReaderIdentifier, getStrippedFileName, getStrippedFileName, getTrueOrFalse, initialize, isCompressed, isDigest, isStrict, isValid, logStdErr, outputRecord, rewind, setCompressed, setDigest, setIn, setReaderIdentifier, setStrict, setVersion, stripExtension, validate, validate
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ARCReaderFactory.CompressedARCReader
public ARCReaderFactory.CompressedARCReader(java.io.File f)
throws java.io.IOException
- Constructor.
- Parameters:
f
- Compressed arcfile to read.
- Throws:
java.io.IOException
ARCReaderFactory.CompressedARCReader
public ARCReaderFactory.CompressedARCReader(java.io.File f,
long offset)
throws java.io.IOException
- Constructor.
- Parameters:
f
- Compressed arcfile to read.
offset
- Position at which to start reading the file.
- Throws:
java.io.IOException
ARCReaderFactory.CompressedARCReader
public ARCReaderFactory.CompressedARCReader(java.lang.String f,
java.io.InputStream is,
boolean atFirstRecord)
throws java.io.IOException
- Constructor.
- Parameters:
f
- Compressed arcfile.
is
- InputStream to use.
- Throws:
java.io.IOException
get
public ARCRecord get(long offset)
throws java.io.IOException
- Get the record at the passed offset.
- Overrides:
get
in class ArchiveReader
- Parameters:
offset
- Byte index into arcfile at which a record starts.
- Returns:
- An ARCRecord reference.
- Throws:
java.io.IOException
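The offset has to point at the exact byte where a record begins. The sketch below illustrates why, using only the JDK. It assumes that each record in a compressed arcfile is stored as its own gzip member (this page does not state that), and `OffsetReadSketch`, `gzipMember`, `concat`, and `readFrom` are hypothetical names, not part of `org.archive.io`. Unlike a real reader, `readFrom` does not stop at the record boundary: it decompresses everything from the offset onward, so the demo reads the last member.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipException;

final class OffsetReadSketch {

    /** Compresses one record body as a standalone gzip member. */
    static byte[] gzipMember(String body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    /** Joins two byte arrays, mimicking members written back to back in one file. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    /** Decompresses everything from the given byte offset to the end of the data. */
    static String readFrom(byte[] archive, int offset) throws IOException {
        InputStream raw = new ByteArrayInputStream(archive, offset, archive.length - offset);
        try (GZIPInputStream gz = new GZIPInputStream(raw)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] first = gzipMember("first record\n");
        byte[] second = gzipMember("second record\n");
        byte[] archive = concat(first, second);

        // The second member starts right after the first one ends.
        System.out.print(readFrom(archive, first.length));

        // An offset inside a member is not a gzip header, so opening it fails.
        try {
            readFrom(archive, 1);
        } catch (ZipException e) {
            System.out.println("bad offset: " + e.getMessage());
        }
    }
}
```

In practice the offsets passed to `get` would come from an index of record start positions rather than being computed by hand.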
iterator
public java.util.Iterator<ArchiveRecord> iterator()
- Description copied from class:
ArchiveReader
- Returns an ArchiveRecord iterator.
Of note: on an IOException, especially a ZipException while reading compressed
ARCs, the iterator tries to move to the next record rather than fail the iteration.
If ArchiveReader.strict is not set, this will usually succeed.
- Overrides:
iterator
in class ArchiveReader
- Returns:
- An iterator over ARC records.
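The skip-on-error behaviour described above can be sketched with plain JDK types. This is an illustration under stated assumptions, not the library's implementation: `RecordSource`, `sourceOf`, `CORRUPT`, and `collect` are hypothetical stand-ins, records are modelled as strings, and throwing an unchecked exception in strict mode is an assumption about what "fail the iteration" means.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LenientIterationSketch {

    /** Hypothetical marker for an entry that cannot be read. */
    static final String CORRUPT = "<corrupt>";

    /** Hypothetical stand-in for a reader that yields one record per call. */
    interface RecordSource {
        /** Returns the next record, or null at end of input. */
        String next() throws IOException;
    }

    /** Builds a source over a list; CORRUPT entries raise IOException after being consumed. */
    static RecordSource sourceOf(List<String> items) {
        Iterator<String> it = items.iterator();
        return () -> {
            if (!it.hasNext()) {
                return null;
            }
            String item = it.next();
            if (CORRUPT.equals(item)) {
                throw new IOException("corrupt record");
            }
            return item;
        };
    }

    /** Iterator that skips unreadable records unless strict is set. */
    static Iterator<String> iterator(RecordSource source, boolean strict) {
        return new Iterator<String>() {
            private String pending;
            private boolean done;

            @Override
            public boolean hasNext() {
                while (pending == null && !done) {
                    try {
                        pending = source.next();
                        if (pending == null) {
                            done = true;
                        }
                    } catch (IOException e) {
                        if (strict) {
                            throw new IllegalStateException("Unreadable record", e);
                        }
                        // Lenient mode: drop this record and try the next one.
                    }
                }
                return pending != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String record = pending;
                pending = null;
                return record;
            }
        };
    }

    /** Drains an iterator into a list. */
    static List<String> collect(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", CORRUPT, "b");
        System.out.println(collect(iterator(sourceOf(items), false)));
    }
}
```

The sketch only terminates because the source advances past a bad entry before throwing; a reader that cannot find the start of the next record would have to give up instead.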
gotoEOR
protected void gotoEOR(ArchiveRecord rec)
throws java.io.IOException
- Description copied from class:
ARCReader
- Skip over any trailing new lines at end of the record so we're lined up
ready to read the next.
- Overrides:
gotoEOR
in class ARCReader
- Throws:
java.io.IOException
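As a rough illustration of the inherited description, skipping trailing new lines so the stream is positioned at the next record can be sketched as follows. This uses only the JDK; `SkipTrailingNewlinesSketch` and `skipTrailingNewlines` are hypothetical names, and the override in this class may work differently on compressed input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

final class SkipTrailingNewlinesSketch {

    /**
     * Consumes consecutive '\n' bytes and leaves the stream positioned at the
     * first byte that is not a new line. Returns how many bytes were skipped.
     */
    static int skipTrailingNewlines(PushbackInputStream in) throws IOException {
        int skipped = 0;
        int b;
        while ((b = in.read()) == '\n') {
            skipped++;
        }
        if (b != -1) {
            // Not a new line: push it back so the next read sees it.
            in.unread(b);
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "\n\nnext record".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        int skipped = skipTrailingNewlines(in);
        System.out.println("skipped " + skipped + ", next byte: " + (char) in.read());
    }
}
```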
Copyright © 2003-2011 Internet Archive. All Rights Reserved.