org.archive.io.warc
Class WARCReaderFactory.UncompressedWARCReader

java.lang.Object
  extended by org.archive.io.ArchiveReader
      extended by org.archive.io.warc.WARCReader
          extended by org.archive.io.warc.WARCReaderFactory.UncompressedWARCReader
All Implemented Interfaces:
ArchiveFileConstants, WARCConstants
Enclosing class:
WARCReaderFactory

public class WARCReaderFactory.UncompressedWARCReader
extends WARCReader

Uncompressed WARC file reader.

Author:
stack

Nested Class Summary
 
Nested classes/interfaces inherited from class org.archive.io.ArchiveReader
ArchiveReader.ArchiveRecordIterator, ArchiveReader.RandomAccessBufferedInputStream
 
Field Summary
 
Fields inherited from class org.archive.io.ArchiveReader
MAX_ALLOWED_RECOVERABLES
 
Fields inherited from interface org.archive.io.warc.WARCConstants
COLON_SPACE, COMPRESSED_WARC_FILE_EXTENSION, CONTENT_DESCRIPTION, CONTENT_LENGTH, CONTENT_TYPE, CONTINUATION, CONTINUATION_INDEX, CONVERSION, CONVERSION_INDEX, DEFAULT_ENCODING, DEFAULT_MAX_WARC_FILE_SIZE, DOT_COMPRESSED_FILE_EXTENSION, DOT_COMPRESSED_WARC_FILE_EXTENSION, DOT_WARC_FILE_EXTENSION, FTP_CONTROL_CONVERSATION_MIMETYPE, HEADER_FIELD_KEYS, HEADER_FIELD_SEPARATOR, HEADER_KEY_BLOCK_DIGEST, HEADER_KEY_CONCURRENT_TO, HEADER_KEY_DATE, HEADER_KEY_ETAG, HEADER_KEY_FILENAME, HEADER_KEY_ID, HEADER_KEY_IP, HEADER_KEY_LAST_MODIFIED, HEADER_KEY_PAYLOAD_DIGEST, HEADER_KEY_PROFILE, HEADER_KEY_TRUNCATED, HEADER_KEY_TYPE, HEADER_KEY_URI, HEADER_LINE_ENCODING, HTTP_REQUEST_MIMETYPE, HTTP_RESPONSE_MIMETYPE, MAX_LINE_LENGTH, MAX_WARC_HEADER_LINE_LENGTH, METADATA, METADATA_INDEX, NAMED_FIELD_CHECKSUM_LABEL, NAMED_FIELD_DESCRIPTION, NAMED_FIELD_FILEDESC, NAMED_FIELD_IP_LABEL, NAMED_FIELD_RELATED_LABEL, NAMED_FIELD_TRUNCATED, NAMED_FIELD_TRUNCATED_VALUE_HEAD, NAMED_FIELD_TRUNCATED_VALUE_LENGTH, NAMED_FIELD_TRUNCATED_VALUE_TIME, NAMED_FIELD_TRUNCATED_VALUE_UNSPECIFIED, NAMED_FIELD_WARCFILENAME, PLACEHOLDER_RECORD_LENGTH_STRING, PROFILE_REVISIT_IDENTICAL_DIGEST, PROFILE_REVISIT_NOT_MODIFIED, REQUEST, REQUEST_INDEX, RESOURCE, RESOURCE_INDEX, RESPONSE, RESPONSE_INDEX, REVISIT, REVISIT_INDEX, TRUNCATED_VALUE_UNSPECIFIED, TYPE, TYPES, TYPES_LIST, WARC_010_ID, WARC_010_MAGIC, WARC_FILE_EXTENSION, WARC_HEADER_ENCODING, WARC_ID, WARC_MAGIC, WARC_VERSION, WARCINFO, WARCINFO_INDEX, WSP
 
Fields inherited from interface org.archive.io.ArchiveFileConstants
ABSOLUTE_OFFSET_KEY, CDX, CDX_FILE, CDX_LINE_BUFFER_SIZE, COMPRESSED_FILE_EXTENSION, CRLF, DATE_FIELD_KEY, DEFAULT_DIGEST_METHOD, DUMP, GZIP_DUMP, HEADER, INVALID_SUFFIX, LENGTH_FIELD_KEY, MIMETYPE_FIELD_KEY, NOHEAD, OCCUPIED_SUFFIX, READER_IDENTIFIER_FIELD_KEY, RECORD_IDENTIFIER_FIELD_KEY, SINGLE_SPACE, TYPE_FIELD_KEY, URL_FIELD_KEY, VERSION_FIELD_KEY
 
Constructor Summary
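WARCReaderFactory.UncompressedWARCReader(java.io.File f)
          Creates a reader for the given uncompressed WARC file.
WARCReaderFactory.UncompressedWARCReader(java.io.File f, long offset)
          Creates a reader for the given uncompressed WARC file, positioned at the given offset.
WARCReaderFactory.UncompressedWARCReader(java.lang.String f, java.io.InputStream is)
          Creates a reader that reads uncompressed WARC content from the given input stream.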
 
Method Summary
 
Methods inherited from class org.archive.io.warc.WARCReader
createArchiveRecord, createCDXIndexFile, dump, getDeleteFileOnCloseReader, getDotFileExtension, getFileExtension, gotoEOR, initialize, main, output, readExpectedChar
 
Methods inherited from class org.archive.io.ArchiveReader
cdxOutput, cleanupCurrentRecord, close, currentRecord, get, get, getCurrentRecord, getFileName, getIn, getInputStream, getInputStream, getLogger, getOptions, getReaderIdentifier, getStrippedFileName, getStrippedFileName, getTrueOrFalse, getVersion, isCompressed, isDigest, isStrict, isValid, iterator, logStdErr, output, outputRecord, outputRecord, rewind, setCompressed, setDigest, setIn, setReaderIdentifier, setStrict, setVersion, stripExtension, validate, validate
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WARCReaderFactory.UncompressedWARCReader

public WARCReaderFactory.UncompressedWARCReader(java.io.File f)
                                         throws java.io.IOException
Creates a reader for the given uncompressed WARC file.

Parameters:
f - Uncompressed WARC file to read.
Throws:
java.io.IOException

WARCReaderFactory.UncompressedWARCReader

public WARCReaderFactory.UncompressedWARCReader(java.io.File f,
                                                long offset)
                                         throws java.io.IOException
Creates a reader for the given uncompressed WARC file, positioned at the given offset.

Parameters:
f - Uncompressed WARC file to read.
offset - Offset at which to position the reader.
Throws:
java.io.IOException
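
The offset parameter relies on the file being uncompressed, so a position in the file can be reached directly. The following is a minimal sketch of that idea using only the Java standard library. It does not use or reproduce any org.archive class: the class name OffsetReadSketch, the helper readLinesAt, and the sample record text are all hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class OffsetReadSketch {

    /**
     * Hypothetical helper: seeks to a byte offset in an uncompressed file
     * and returns up to {@code count} lines read from that position.
     */
    static List<String> readLinesAt(Path file, long offset, int count) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset); // jump straight to the requested position
            String line;
            while (lines.size() < count && (line = raf.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative sample data only: two tiny records with empty content blocks.
        String first = "WARC/1.0\r\nWARC-Type: warcinfo\r\nContent-Length: 0\r\n\r\n\r\n\r\n";
        String second = "WARC/1.0\r\nWARC-Type: response\r\nContent-Length: 0\r\n\r\n\r\n\r\n";

        Path tmp = Files.createTempFile("sketch", ".warc");
        try {
            Files.write(tmp, (first + second).getBytes(StandardCharsets.US_ASCII));
            // The sample is pure ASCII, so the string length equals the byte length.
            long offset = first.length();
            System.out.println(readLinesAt(tmp, offset, 2));
            // prints [WARC/1.0, WARC-Type: response]
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In this sketch the offset is assumed to fall exactly on the first byte of a record; an offset that lands mid-record would yield lines from the middle of that record.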

WARCReaderFactory.UncompressedWARCReader

public WARCReaderFactory.UncompressedWARCReader(java.lang.String f,
                                                java.io.InputStream is)
Creates a reader that reads uncompressed WARC content from the given input stream.

Parameters:
f - Name of the uncompressed WARC file being read.
is - Input stream that supplies the uncompressed WARC content.
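
Because this constructor accepts any InputStream, the content does not have to come from a local file. As a rough illustration of what reading uncompressed WARC content from a stream involves, the sketch below parses the header of the first record using only the Java standard library. It is not the parser used by WARCReader: the class StreamHeaderSketch, its methods, the "version" map key, and the sample record are hypothetical, and the sketch assumes header lines of the form "Name: value" terminated by CRLF and followed by a blank line.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class StreamHeaderSketch {

    /** Reads one line, dropping the CR/LF terminator; returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        int c = in.read();
        if (c == -1) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        while (c != -1 && c != '\n') {
            if (c != '\r') {
                sb.append((char) c);
            }
            c = in.read();
        }
        return sb.toString();
    }

    /**
     * Hypothetical helper: reads the version line and the named fields of
     * the first record header, stopping at the blank line that ends it.
     */
    static Map<String, String> readHeader(InputStream in) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        String line = readLine(in);
        if (line == null) {
            return fields; // empty stream: nothing to parse
        }
        fields.put("version", line); // "version" is a key chosen for this sketch
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int sep = line.indexOf(": ");
            if (sep > 0) {
                fields.put(line.substring(0, sep), line.substring(sep + 2));
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative sample data only.
        String record = "WARC/1.0\r\nWARC-Type: warcinfo\r\nContent-Length: 3\r\n\r\nabc\r\n\r\n";
        InputStream is = new ByteArrayInputStream(record.getBytes(StandardCharsets.US_ASCII));
        Map<String, String> header = readHeader(is);
        System.out.println(header.get("WARC-Type"));
    }
}
```

The sketch stops after the header; a full reader would go on to consume the number of content bytes the header declares before looking for the next record.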


Copyright © 2003-2011 Internet Archive. All Rights Reserved.