org.archive.io
Class ArchiveReaderFactory

java.lang.Object
  extended by org.archive.io.ArchiveReaderFactory
All Implemented Interfaces:
ArchiveFileConstants
Direct Known Subclasses:
ARCReaderFactory, WARCReaderFactory

public class ArchiveReaderFactory
extends java.lang.Object
implements ArchiveFileConstants

Factory that returns an Archive file Reader. Returns Readers for ARCs or WARCs.

Version:
$Date: 2007-03-09 23:57:28 +0000 (Fri, 09 Mar 2007) $ $Revision: 4977 $
Author:
stack

Field Summary
 
Fields inherited from interface org.archive.io.ArchiveFileConstants
ABSOLUTE_OFFSET_KEY, CDX, CDX_FILE, CDX_LINE_BUFFER_SIZE, COMPRESSED_FILE_EXTENSION, CRLF, DATE_FIELD_KEY, DEFAULT_DIGEST_METHOD, DOT_COMPRESSED_FILE_EXTENSION, DUMP, GZIP_DUMP, HEADER, INVALID_SUFFIX, LENGTH_FIELD_KEY, MIMETYPE_FIELD_KEY, NOHEAD, OCCUPIED_SUFFIX, READER_IDENTIFIER_FIELD_KEY, RECORD_IDENTIFIER_FIELD_KEY, SINGLE_SPACE, TYPE_FIELD_KEY, URL_FIELD_KEY, VERSION_FIELD_KEY
 
Constructor Summary
protected ArchiveReaderFactory()
          Shuts off public access to the default constructor.
 
Method Summary
protected  void addUserAgent(java.net.HttpURLConnection connection)
           
protected  java.io.InputStream asRepositionable(java.io.InputStream is)
           
static ArchiveReader get(java.io.File f)
           
static ArchiveReader get(java.io.File f, long offset)
           
static ArchiveReader get(java.lang.String arcFileOrUrl)
          Get an Archive file Reader on passed path or url.
static ArchiveReader get(java.lang.String s, java.io.InputStream is, boolean atFirstRecord)
          Wrap a Reader around passed Stream.
static ArchiveReader get(java.net.URL u)
          Get an ARCReader.
static ArchiveReader get(java.net.URL u, long offset)
          Get an Archive Reader aligned at offset.
protected  ArchiveReader getArchiveReader(java.io.File f)
           
protected  ArchiveReader getArchiveReader(java.io.File f, long offset)
           
protected  ArchiveReader getArchiveReader(java.lang.String arcFileOrUrl)
           
protected  ArchiveReader getArchiveReader(java.lang.String id, java.io.InputStream is, boolean atFirstRecord)
           
protected  ArchiveReader getArchiveReader(java.lang.String arcFileOrUrl, long offset)
           
protected  ArchiveReader getArchiveReader(java.net.URL u)
           
protected  ArchiveReader getArchiveReader(java.net.URL f, long offset)
           
protected  boolean isCompressed(java.io.File f)
           
protected  ArchiveReader makeARCLocal(java.net.URLConnection connection)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ArchiveReaderFactory

protected ArchiveReaderFactory()
Shuts off public access to the default constructor.

Method Detail

get

public static ArchiveReader get(java.lang.String arcFileOrUrl)
                         throws java.net.MalformedURLException,
                                java.io.IOException
Get an Archive file Reader on the passed path or URL. Uses a simple heuristic to decide whether the argument is a path or a URL.

Parameters:
arcFileOrUrl - File path or URL pointing at an Archive file.
Returns:
An Archive file Reader.
Throws:
java.net.MalformedURLException
java.io.IOException
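
The heuristic itself is not documented here. As a rough illustration only, one plausible check treats the argument as a URL when it begins with a known scheme prefix. The class name PathOrUrlSketch, the method looksLikeUrl, and the scheme list are assumptions made for this sketch, not the factory's actual logic.

```java
import java.util.Locale;

// Hypothetical sketch, NOT the library's code: guesses whether a string
// names a URL or a local file path by looking for a scheme prefix.
final class PathOrUrlSketch {
    // Assumed scheme list; the real factory may recognise others.
    private static final String[] SCHEMES = {"http://", "https://", "file:"};

    static boolean looksLikeUrl(String arcFileOrUrl) {
        String lower = arcFileOrUrl.toLowerCase(Locale.ROOT);
        for (String scheme : SCHEMES) {
            if (lower.startsWith(scheme)) {
                return true;
            }
        }
        // Anything without a recognised scheme is treated as a file path.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeUrl("http://example.org/sample.arc.gz"));
        System.out.println(looksLikeUrl("/var/data/sample.warc.gz"));
    }
}
```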

getArchiveReader

protected ArchiveReader getArchiveReader(java.lang.String arcFileOrUrl)
                                  throws java.net.MalformedURLException,
                                         java.io.IOException
Throws:
java.net.MalformedURLException
java.io.IOException

getArchiveReader

protected ArchiveReader getArchiveReader(java.lang.String arcFileOrUrl,
                                         long offset)
                                  throws java.net.MalformedURLException,
                                         java.io.IOException
Throws:
java.net.MalformedURLException
java.io.IOException

get

public static ArchiveReader get(java.io.File f)
                         throws java.io.IOException
Parameters:
f - An Archive file to read.
Returns:
An ArchiveReader
Throws:
java.io.IOException

getArchiveReader

protected ArchiveReader getArchiveReader(java.io.File f)
                                  throws java.io.IOException
Throws:
java.io.IOException

get

public static ArchiveReader get(java.io.File f,
                                long offset)
                         throws java.io.IOException
Parameters:
f - An Archive file to read.
offset - Have returned Reader set to start reading at this offset.
Returns:
An ArchiveReader
Throws:
java.io.IOException

getArchiveReader

protected ArchiveReader getArchiveReader(java.io.File f,
                                         long offset)
                                  throws java.io.IOException
Throws:
java.io.IOException

get

public static ArchiveReader get(java.lang.String s,
                                java.io.InputStream is,
                                boolean atFirstRecord)
                         throws java.io.IOException
Wrap a Reader around passed Stream.

Parameters:
s - Identifying String for this Stream, used in error messages. Must end with the name of the file the ArchiveReader is to be put on, because this code looks at the file ending to decide whether to return an ARC or a WARC reader.
is - Stream. It will be wrapped with an implementation of RepositionableStream unless it already supports repositioning.
atFirstRecord - Are we at first Record?
Returns:
ArchiveReader.
Throws:
java.io.IOException
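
A minimal sketch of choosing a reader type from the ending of the identifying string, as the description of s suggests. The suffixes checked here (.arc, .warc, each optionally followed by .gz) and the names ArchiveKindSketch and kindOf are assumptions for illustration; the endings the factory actually accepts may differ.

```java
import java.util.Locale;

// Hypothetical sketch of suffix-based format detection; not the library's code.
final class ArchiveKindSketch {
    enum Kind { ARC, WARC, UNKNOWN }

    static Kind kindOf(String id) {
        String name = id.toLowerCase(Locale.ROOT);
        // Ignore a trailing compression suffix before checking the format suffix.
        if (name.endsWith(".gz")) {
            name = name.substring(0, name.length() - ".gz".length());
        }
        if (name.endsWith(".warc")) {
            return Kind.WARC;
        }
        if (name.endsWith(".arc")) {
            return Kind.ARC;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(kindOf("crawl-0001.warc.gz"));
        System.out.println(kindOf("http://example.org/a/sample.arc"));
    }
}
```

An identifier that ends in neither suffix yields UNKNOWN in this sketch; a caller would presumably report that as an error rather than guess a format.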

asRepositionable

protected java.io.InputStream asRepositionable(java.io.InputStream is)
Parameters:
is - Stream to make repositionable.
Returns:
If the passed stream is already a RepositionableInputStream, returns it unchanged; otherwise returns it wrapped in a RepositionableStream.

getArchiveReader

protected ArchiveReader getArchiveReader(java.lang.String id,
                                         java.io.InputStream is,
                                         boolean atFirstRecord)
                                  throws java.io.IOException
Throws:
java.io.IOException

get

public static ArchiveReader get(java.net.URL u,
                                long offset)
                         throws java.io.IOException
Get an Archive Reader aligned at offset. This version of get does not bring the file local; instead it tries to stream across the net by making an HTTP/1.1 Range request to the remote HTTP server (RFC 2616, Section 14.35).

Parameters:
u - HTTP URL for an Archive file.
offset - Offset into file at which to start fetching.
Returns:
An ArchiveReader aligned at offset.
Throws:
java.io.IOException
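
The Range request described above can be sketched with the JDK's HttpURLConnection. This is an illustration under stated assumptions, not the factory's implementation: the names RangeRequestSketch, rangeHeader and openAt are hypothetical, and treating any status other than 206 as a failure for a non-zero offset is a choice made for this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch: stream a remote file starting at a byte offset.
final class RangeRequestSketch {
    // Builds an open-ended byte range: from offset to the end of the resource.
    static String rangeHeader(long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        return "bytes=" + offset + "-";
    }

    // Opens a stream positioned at offset without downloading the whole file.
    static InputStream openAt(URL u, long offset) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) u.openConnection();
        conn.setRequestProperty("Range", rangeHeader(offset));
        int status = conn.getResponseCode();
        // 206 Partial Content means the server honoured the Range header.
        if (offset > 0 && status != HttpURLConnection.HTTP_PARTIAL) {
            conn.disconnect();
            throw new IOException("Range request not honoured, status " + status);
        }
        return conn.getInputStream();
    }
}
```

A server that ignores the header replies 200 with the full body, so a caller that skips the status check would read from offset zero instead of the requested position.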

getArchiveReader

protected ArchiveReader getArchiveReader(java.net.URL f,
                                         long offset)
                                  throws java.io.IOException
Throws:
java.io.IOException

get

public static ArchiveReader get(java.net.URL u)
                         throws java.io.IOException
Get an ARCReader. Pulls the ARC local into wherever the System Property java.io.tmpdir points, then hands back an ARCReader that points at this local copy. A close on this ARCReader instance will remove the local copy.

Parameters:
u - A URL that points at an ARC.
Returns:
An ARCReader.
Throws:
java.io.IOException
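
The copy-local-then-delete-on-close behaviour described above can be sketched with the JDK alone. Everything named here (LocalCopySketch, DeleteOnCloseStream, copyLocal, the "archive-" prefix) is hypothetical, and the sketch returns a plain stream rather than an ARCReader; it only illustrates the lifecycle of the temporary file.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: copy a remote stream into the default temporary
// directory (java.io.tmpdir) and remove the copy when the reader is closed.
final class LocalCopySketch {
    // Stream over the local copy that deletes the file on close.
    static final class DeleteOnCloseStream extends FilterInputStream {
        private final Path file;

        DeleteOnCloseStream(Path file) throws IOException {
            super(Files.newInputStream(file));
            this.file = file;
        }

        Path file() {
            return file;
        }

        @Override
        public void close() throws IOException {
            try {
                super.close();
            } finally {
                Files.deleteIfExists(file);
            }
        }
    }

    static DeleteOnCloseStream copyLocal(InputStream remote, String suffix) throws IOException {
        // With no directory argument, the default temp directory is used.
        Path tmp = Files.createTempFile("archive-", suffix);
        try {
            Files.copy(remote, tmp, StandardCopyOption.REPLACE_EXISTING);
            return new DeleteOnCloseStream(tmp);
        } catch (IOException e) {
            // Do not leave a partial copy behind if the transfer fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
    }
}
```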

getArchiveReader

protected ArchiveReader getArchiveReader(java.net.URL u)
                                  throws java.io.IOException
Throws:
java.io.IOException

makeARCLocal

protected ArchiveReader makeARCLocal(java.net.URLConnection connection)
                              throws java.io.IOException
Throws:
java.io.IOException

addUserAgent

protected void addUserAgent(java.net.HttpURLConnection connection)

isCompressed

protected boolean isCompressed(java.io.File f)
                        throws java.io.IOException
Parameters:
f - File to test.
Returns:
True if f is compressed.
Throws:
java.io.IOException
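
How compression is detected is not stated here. One common approach, shown purely as an assumption, is to read the first two bytes of the file and compare them with the GZIP magic number. The names GzipCheckSketch and looksGzipped are hypothetical; the factory may instead rely on the file extension or on other checks.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical sketch: test for a GZIP header rather than trusting the name.
final class GzipCheckSketch {
    static boolean looksGzipped(File f) throws IOException {
        try (InputStream in = new FileInputStream(f)) {
            int b0 = in.read();
            int b1 = in.read();
            if (b0 < 0 || b1 < 0) {
                // Fewer than two bytes: cannot be a GZIP member.
                return false;
            }
            // The magic number is stored little-endian: 0x1f then 0x8b.
            int magic = (b1 << 8) | b0;
            return magic == GZIPInputStream.GZIP_MAGIC;
        }
    }
}
```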


Copyright © 2003-2011 Internet Archive. All Rights Reserved.