java.lang.Object org.archive.io.GenericReplayCharSequence
public class GenericReplayCharSequence
Provides a (Replay)CharSequence view on recorded streams (a prefix
buffer and overflow backing file) that can handle streams of multibyte
characters.
For better performance on ISO-8859-1 text, use Latin1ByteReplayCharSequence.
Call close() on this class when done so that it can clean up its resources.
The implementation currently works by checking whether all of the content to read fits in the in-memory buffer. If so, we decode it into a CharBuffer and keep this around for CharSequence operations. This CharBuffer is discarded on close.
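As an illustration of the in-memory path described above, here is a minimal sketch that uses only standard JDK classes. The class name InMemoryDecodeSketch and its decode method are hypothetical and are not part of org.archive.io; the parameter names simply mirror the documented constructor, and the header length of 7 bytes is made up for the demo.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the actual Heritrix implementation.
final class InMemoryDecodeSketch {

    /**
     * Decodes the bytes in buffer[responseBodyStart, size) with the given
     * encoding and returns the result as a read-only CharBuffer, which
     * already implements CharSequence (length, charAt, subSequence).
     */
    static CharSequence decode(byte[] buffer, long size,
                               long responseBodyStart, String encoding) {
        int start = Math.toIntExact(responseBodyStart);
        int length = Math.toIntExact(size) - start;
        Charset charset = Charset.forName(encoding);
        // Charset.decode replaces malformed input rather than throwing.
        CharBuffer decoded = charset.decode(ByteBuffer.wrap(buffer, start, length));
        return decoded.asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        // 7 bytes of made-up "headers", then a body with multibyte characters.
        byte[] raw = "HDR\r\n\r\nna\u00efve caf\u00e9".getBytes(StandardCharsets.UTF_8);
        CharSequence body = decode(raw, raw.length, 7, "UTF-8");
        System.out.println(body.length() + " chars: " + body);
    }
}
```

The body occupies 12 bytes but decodes to 10 characters, which is why the length can only be reported after decoding when the encoding is multibyte.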
If the content length is greater than the in-memory buffer, we decode the buffer plus the backing file into a new file, named for the backing file with a suffix of the encoding we write the file in. We then run with a memory-mapped CharBuffer against this file to implement CharSequence. The reason for this implementation is that CharSequence must be able to return its length, and for multibyte encodings the length in characters is only known once the content has been decoded.
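The overflow path can be sketched with standard JDK classes as follows. This is an illustration under stated assumptions, not the class's actual code: MappedDecodeSketch and its methods are hypothetical names, and UTF-16BE is assumed as the on-disk encoding because a CharBuffer view of a mapped file reads two bytes per char in big-endian order by default.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not the actual Heritrix implementation.
final class MappedDecodeSketch {

    /** Decodes the stream and writes it next to backingFile as fixed-width UTF-16BE. */
    static Path decodeToFile(InputStream in, String encoding, Path backingFile)
            throws IOException {
        // Name the decoded file for the backing file plus an encoding suffix.
        Path decoded = backingFile.resolveSibling(backingFile.getFileName() + ".UTF-16BE");
        try (Reader reader = new InputStreamReader(in, Charset.forName(encoding));
             Writer writer = new OutputStreamWriter(
                     Files.newOutputStream(decoded), StandardCharsets.UTF_16BE)) {
            char[] chunk = new char[8192];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                writer.write(chunk, 0, n);
            }
        }
        return decoded;
    }

    /** Maps the decoded file read-only and exposes it as a CharBuffer. */
    static CharBuffer map(Path decoded) throws IOException {
        try (FileChannel channel = FileChannel.open(decoded, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
                          .asCharBuffer();
        }
    }
}
```

With two bytes per char on disk, length() is simply the file size divided by two and charAt(i) is a fixed-offset read, so no decoding state is needed after the file is written. A single mapping is limited to Integer.MAX_VALUE bytes.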
Obvious optimizations would keep the decodings around, whether the in-memory decoded buffer or the file of decodings written to disk, but the general usage pattern when processing URIs is that the decoding is used by one processor only. Also of note, files usually fit into the in-memory buffer.
We might also be able to maintain 3 windows that move across the file, decoding a window at a time and trying to keep one of the buffers just in front of the regex processing. The length returned would then be either only the length from the current position to the end of the current block, or an estimate obtained by multiplying the backing file's length by the decoder's estimate of average character size. This would save us writing out the decoded file. We'd have to do the latter for files that are larger than Integer.MAX_VALUE.
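The length estimate mentioned above could be computed roughly as in the following sketch. It relies on the standard CharsetDecoder.averageCharsPerByte() method; the class and method names are hypothetical, and the result is only an estimate for variable-width encodings such as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Hypothetical helper, not part of org.archive.io.
final class LengthEstimateSketch {

    /**
     * Estimates how many chars a file of byteLength bytes decodes to,
     * using the decoder's own average chars-per-byte figure.
     */
    static long estimateCharLength(long byteLength, String encoding) {
        CharsetDecoder decoder = Charset.forName(encoding).newDecoder();
        return (long) Math.ceil(byteLength * (double) decoder.averageCharsPerByte());
    }

    public static void main(String[] args) {
        long bytes = 3_000_000_000L; // larger than Integer.MAX_VALUE
        System.out.println(estimateCharLength(bytes, "UTF-8"));
    }
}
```

Because the estimate is returned as a long, it can exceed the int range of CharSequence.length(), so a windowed design would still need to report a per-block length to callers.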
Field Summary | |
---|---|
protected static java.util.logging.Logger | logger |
Constructor Summary | |
---|---|
GenericReplayCharSequence(byte[] buffer, long size, long responseBodyStart, java.lang.String encoding) | Constructor for all in-memory operation. |
GenericReplayCharSequence(ReplayInputStream contentReplayInputStream, java.lang.String backingFilename, java.lang.String characterEncoding) | Constructor for overflow-to-disk-file operation. |
Method Summary | |
---|---|
char | charAt(int index) |
void | close() - Call this method when done so implementation has chance to clean up resources. |
protected void | finalize() |
int | length() |
java.lang.CharSequence | subSequence(int start, int end) |
java.lang.String | toString() |
Methods inherited from class java.lang.Object |
---|
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected static java.util.logging.Logger logger
Constructor Detail |
---|
public GenericReplayCharSequence(byte[] buffer, long size, long responseBodyStart, java.lang.String encoding) throws java.io.IOException
Parameters:
buffer - In-memory buffer of recordings prefix. We read from here first and will only go to the backing file if size requested is greater than buffer.length.
size - Total size of stream to replay in bytes. Used to find EOS. This is total length of content including HTTP headers if present.
responseBodyStart - Where the response body starts in bytes. Used to skip over the HTTP headers if present.
backingFilename - Path to backing file with content in excess of what's in buffer.
encoding - Encoding to use reading the passed prefix buffer and backing file. For now, should be java canonical name for the encoding. (If null is passed, we will default to ByteReplayCharSequence).
Throws:
java.io.IOException
public GenericReplayCharSequence(ReplayInputStream contentReplayInputStream, java.lang.String backingFilename, java.lang.String characterEncoding) throws java.io.IOException
Parameters:
contentReplayInputStream - inputStream of content
backingFilename - hint for name of temp file
characterEncoding - Encoding to use reading the stream. For now, should be java canonical name for the encoding.
Throws:
java.io.IOException
Method Detail |
---|
public void close()
Specified by: close in interface ReplayCharSequence
protected void finalize() throws java.lang.Throwable
Overrides: finalize in class java.lang.Object
Throws: java.lang.Throwable
public int length()
Specified by: length in interface java.lang.CharSequence
public char charAt(int index)
Specified by: charAt in interface java.lang.CharSequence
public java.lang.CharSequence subSequence(int start, int end)
Specified by: subSequence in interface java.lang.CharSequence
public java.lang.String toString()
Specified by: toString in interface java.lang.CharSequence
Overrides: toString in class java.lang.Object