org.archive.io.arc
Class ARCReaderFactory.CompressedARCReader

java.lang.Object
  extended by org.archive.io.ArchiveReader
      extended by org.archive.io.arc.ARCReader
          extended by org.archive.io.arc.ARCReaderFactory.CompressedARCReader
All Implemented Interfaces:
ARCConstants, ArchiveFileConstants
Enclosing class:
ARCReaderFactory

public class ARCReaderFactory.CompressedARCReader
extends ARCReader

Compressed arc file reader.

Author:
stack

Nested Class Summary
 
Nested classes/interfaces inherited from class org.archive.io.ArchiveReader
ArchiveReader.ArchiveRecordIterator, ArchiveReader.RandomAccessBufferedInputStream
 
Field Summary
 
Fields inherited from class org.archive.io.arc.ARCReader
HEADER_FIELD_NAME_KEYS, logger
 
Fields inherited from class org.archive.io.ArchiveReader
MAX_ALLOWED_RECOVERABLES
 
Fields inherited from interface org.archive.io.arc.ARCConstants
ARC_FILE_EXTENSION, ARC_GZIP_EXTRA_FIELD, ARC_MAGIC_NUMBER, CHECKSUM_FIELD_KEY, CHECKSUM_HEADER_FIELD_KEY, CODE_HEADER_FIELD_KEY, COMPRESSED_ARC_FILE_EXTENSION, DEFAULT_ENCODING, DEFAULT_GZIP_HEADER_LENGTH, DEFAULT_MAX_ARC_FILE_SIZE, DOT_ARC_FILE_EXTENSION, DOT_COMPRESSED_ARC_FILE_EXTENSION, DOT_COMPRESSED_FILE_EXTENSION, FILENAME_FIELD_KEY, FILENAME_HEADER_FIELD_KEY, GZIP_HEADER_BEGIN, HEADER_FIELD_SEPARATOR, IP_HEADER_FIELD_KEY, LINE_SEPARATOR, LOCATION_HEADER_FIELD_KEY, MAX_METADATA_LINE_LENGTH, MINIMUM_RECORD_LENGTH, OFFSET_FIELD_KEY, OFFSET_HEADER_FIELD_KEY, REQUIRED_VERSION_1_HEADER_FIELDS, STATUSCODE_FIELD_KEY, TOKENIZED_PREFIX
 
Fields inherited from interface org.archive.io.ArchiveFileConstants
ABSOLUTE_OFFSET_KEY, CDX, CDX_FILE, CDX_LINE_BUFFER_SIZE, COMPRESSED_FILE_EXTENSION, CRLF, DATE_FIELD_KEY, DEFAULT_DIGEST_METHOD, DUMP, GZIP_DUMP, HEADER, INVALID_SUFFIX, LENGTH_FIELD_KEY, MIMETYPE_FIELD_KEY, NOHEAD, OCCUPIED_SUFFIX, READER_IDENTIFIER_FIELD_KEY, RECORD_IDENTIFIER_FIELD_KEY, SINGLE_SPACE, TYPE_FIELD_KEY, URL_FIELD_KEY, VERSION_FIELD_KEY
 
Constructor Summary
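ARCReaderFactory.CompressedARCReader(java.io.File f)
          Creates a reader for the given compressed ARC file.
ARCReaderFactory.CompressedARCReader(java.io.File f, long offset)
          Creates a reader for the given compressed ARC file, starting at the given offset.
ARCReaderFactory.CompressedARCReader(java.lang.String f, java.io.InputStream is, boolean atFirstRecord)
          Creates a reader for a compressed ARC file supplied as an input stream.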
 
Method Summary
 ARCRecord get(long offset)
          Get record at passed offset.
protected  void gotoEOR(ArchiveRecord rec)
          Skip over any trailing newlines at the end of the record so that the reader is lined up to read the next one.
 java.util.Iterator<ArchiveRecord> iterator()
          Returns an ArchiveRecord iterator.
 
Methods inherited from class org.archive.io.arc.ARCReader
createArchiveRecord, createCDXIndexFile, dump, fixSpaceInURL, getDeleteFileOnCloseReader, getDotFileExtension, getFileExtension, getVersion, isAlignedOnFirstRecord, isDate, isLegitimateIPValue, isNumber, isParseHttpHeaders, main, output, output, outputRecord, setAlignedOnFirstRecord, setParseHttpHeaders
 
Methods inherited from class org.archive.io.ArchiveReader
cdxOutput, cleanupCurrentRecord, close, currentRecord, get, getCurrentRecord, getFileName, getIn, getInputStream, getInputStream, getLogger, getOptions, getReaderIdentifier, getStrippedFileName, getStrippedFileName, getTrueOrFalse, initialize, isCompressed, isDigest, isStrict, isValid, logStdErr, outputRecord, rewind, setCompressed, setDigest, setIn, setReaderIdentifier, setStrict, setVersion, stripExtension, validate, validate
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ARCReaderFactory.CompressedARCReader

public ARCReaderFactory.CompressedARCReader(java.io.File f)
                                     throws java.io.IOException
Creates a reader for the given compressed ARC file.

Parameters:
f - Compressed arcfile to read.
Throws:
java.io.IOException

ARCReaderFactory.CompressedARCReader

public ARCReaderFactory.CompressedARCReader(java.io.File f,
                                            long offset)
                                     throws java.io.IOException
Creates a reader for the given compressed ARC file, starting at the given offset.

Parameters:
f - Compressed arcfile to read.
offset - Position at which to start reading the file.
Throws:
java.io.IOException

ARCReaderFactory.CompressedARCReader

public ARCReaderFactory.CompressedARCReader(java.lang.String f,
                                            java.io.InputStream is,
                                            boolean atFirstRecord)
                                     throws java.io.IOException
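Creates a reader for a compressed ARC file supplied as an input stream.

Parameters:
f - Compressed arcfile.
is - InputStream to use.
atFirstRecord - Whether the stream is aligned on the first record (see isAlignedOnFirstRecord in ARCReader).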
Throws:
java.io.IOException

Method Detail

get

public ARCRecord get(long offset)
              throws java.io.IOException
Get record at passed offset.

Overrides:
get in class ArchiveReader
Parameters:
offset - Byte index into arcfile at which a record starts.
Returns:
An ARCRecord reference.
Throws:
java.io.IOException
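
To illustrate what "record at passed offset" means for a compressed file, the sketch below uses only java.util.zip and does not call this library. It assumes that a compressed ARC file stores each record as its own gzip member, so a member's starting byte offset can be used to begin decompression there. The class GzipMemberOffsetSketch and its methods are hypothetical names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not part of org.archive.io.
class GzipMemberOffsetSketch {

    /** Appends text as one gzip member and returns the offset where the member starts. */
    static long appendMember(ByteArrayOutputStream out, String text) {
        long start = out.size();
        // Closing a GZIPOutputStream finishes the member; closing the
        // underlying ByteArrayOutputStream has no effect, so it can be reused.
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return start;
    }

    /** Decompresses from the given offset to the end of the data. */
    static String readFrom(byte[] data, long offset) {
        int start = (int) offset;
        ByteArrayInputStream in =
                new ByteArrayInputStream(data, start, data.length - start);
        try (GZIPInputStream gz = new GZIPInputStream(in)) {
            ByteArrayOutputStream text = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                text.write(buf, 0, n);
            }
            return new String(text.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        long first = appendMember(file, "record one\n");
        long second = appendMember(file, "record two\n");
        System.out.println("first member at " + first + ", second at " + second);
        // Start decompressing at the second member's offset.
        System.out.print(readFrom(file.toByteArray(), second));
    }
}
```

An offset that does not fall on the start of a gzip member makes the GZIPInputStream constructor fail, which is one reason the offset parameter must point at the start of a record.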

iterator

public java.util.Iterator<ArchiveRecord> iterator()
Description copied from class: ArchiveReader
Returns an ArchiveRecord iterator. If an IOException occurs during iteration (in particular a ZipException while reading a compressed ARC), the iterator tries to move on to the next record rather than fail the iteration. If ArchiveReader.strict is not set, this will usually succeed.

Overrides:
iterator in class ArchiveReader
Returns:
An iterator over ARC records.
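
The skip-on-error behaviour described above can be sketched with the standard library alone. This is not the iterator's actual implementation: the class LenientIterationSketch, its methods, and the way strict mode rethrows the error are all assumptions made for illustration. Each record is modelled as a separate gzip member held in its own byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not part of org.archive.io.
class LenientIterationSketch {

    /** Compresses text into a standalone gzip member. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses one member; throws IOException (often ZipException) if it is damaged. */
    static String gunzip(byte[] member) throws IOException {
        try (GZIPInputStream gz =
                new GZIPInputStream(new ByteArrayInputStream(member))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Reads every member, skipping unreadable ones unless strict is true. */
    static List<String> readAll(List<byte[]> members, boolean strict) {
        List<String> records = new ArrayList<>();
        for (byte[] member : members) {
            try {
                records.add(gunzip(member));
            } catch (IOException e) {
                if (strict) {
                    // Assumed strict behaviour: fail the whole iteration.
                    throw new UncheckedIOException(e);
                }
                // Lenient mode: give up on this record and try the next one.
            }
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] damaged = gzip("b");
        damaged[0] = 0; // break the gzip magic number
        List<byte[]> members = Arrays.asList(gzip("a"), damaged, gzip("c"));
        System.out.println(readAll(members, false)); // prints [a, c]
    }
}
```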

gotoEOR

protected void gotoEOR(ArchiveRecord rec)
                throws java.io.IOException
Description copied from class: ARCReader
Skip over any trailing newlines at the end of the record so that the reader is lined up to read the next one.

Overrides:
gotoEOR in class ARCReader
Throws:
java.io.IOException
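
The general idea of skipping trailing newlines so that a stream is positioned at the next record can be sketched as follows. This is a minimal stand-alone example, not the override's actual code, which may handle compressed input differently. The class SkipTrailingNewlinesSketch and the method skipNewlines are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of org.archive.io.
class SkipTrailingNewlinesSketch {

    /** Consumes consecutive '\n' bytes and returns how many were skipped. */
    static int skipNewlines(PushbackInputStream in) {
        try {
            int skipped = 0;
            int c;
            while ((c = in.read()) == '\n') {
                skipped++;
            }
            if (c != -1) {
                // Put back the first byte of the next record.
                in.unread(c);
            }
            return skipped;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "\n\nnext record".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(data));
        System.out.println(skipNewlines(in));  // prints 2
        System.out.println((char) in.read());  // prints n
    }
}
```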


Copyright © 2003-2011 Internet Archive. All Rights Reserved.