org.archive.util
Class TextUtils

java.lang.Object
  extended by org.archive.util.TextUtils

public class TextUtils
extends java.lang.Object


Constructor Summary
TextUtils()
           
 
Method Summary
static java.lang.String escapeForHTML(java.lang.String s)
          Minimally escapes a string so that it can be placed inside an XML/HTML attribute.
static java.lang.String escapeForHTMLJavascript(java.lang.String s)
          Escapes a string so that it can be passed as an argument to JavaScript in a JSP page.
static java.lang.String escapeForMarkupAttribute(java.lang.String s)
          Escapes a string so that it can be placed inside an XML/HTML attribute.
static java.lang.String exceptionToString(java.lang.String message, java.lang.Throwable e)
           
static java.lang.String getFirstWord(java.lang.String s)
           
static java.util.regex.Matcher getMatcher(java.lang.String pattern, java.lang.CharSequence input)
          Get a matcher object for a precompiled regex pattern.
static boolean matches(java.lang.String pattern, java.lang.CharSequence input)
          Utility method using a precompiled pattern instead of using the matches method of the String class.
static void recycleMatcher(java.util.regex.Matcher m)
           
static java.lang.String replaceAll(java.lang.String pattern, java.lang.CharSequence input, java.lang.String replacement)
          Utility method using a precompiled pattern instead of using the replaceAll method of the String class.
static java.lang.String replaceFirst(java.lang.String pattern, java.lang.CharSequence input, java.lang.String replacement)
          Utility method using a precompiled pattern instead of using the replaceFirst method of the String class.
static java.lang.String[] split(java.lang.String pattern, java.lang.CharSequence input)
          Utility method using a precompiled pattern instead of using the split method of the String class.
static java.lang.CharSequence unescapeHtml(java.lang.CharSequence cs)
          Replaces HTML Entity Encodings.
static void writeEscapedForHTML(java.lang.String s, javax.servlet.jsp.JspWriter out)
          Utility method for writing a (potentially large) String to a JspWriter, escaping it for HTML display, without constructing another large String of the whole content.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextUtils

public TextUtils()
Method Detail

getMatcher

public static java.util.regex.Matcher getMatcher(java.lang.String pattern,
                                                 java.lang.CharSequence input)
Get a matcher object for a precompiled regex pattern. This method tries to reuse Matcher objects for efficiency: it can hold one Matcher per pattern per thread for recycling. Matchers retrieved should be returned for reuse via the recycleMatcher() method, but no errors will occur if they are not. This method is a frequently accessed hotspot.

Parameters:
pattern - the string pattern to use
input - the character sequence the matcher should be using
Returns:
a matcher object loaded with the submitted character sequence
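
The reuse scheme described above (one idle Matcher per pattern per thread, handed back through recycleMatcher()) can be pictured with the following minimal sketch. It is not the TextUtils source: the class name MatcherPoolSketch, the ThreadLocal map, and the reset behaviour are assumptions made for illustration, using only java.util.regex.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of per-thread Matcher reuse; not the TextUtils implementation. */
final class MatcherPoolSketch {
    // One idle Matcher per pattern string, kept separately for each thread (assumed layout).
    private static final ThreadLocal<Map<String, Matcher>> IDLE =
            ThreadLocal.withInitial(HashMap::new);

    static Matcher getMatcher(String pattern, CharSequence input) {
        // Remove the idle matcher so the same instance is never handed out twice.
        Matcher m = IDLE.get().remove(pattern);
        if (m == null) {
            return Pattern.compile(pattern).matcher(input);
        }
        return m.reset(input); // rebind the recycled matcher to the new input
    }

    static void recycleMatcher(Matcher m) {
        m.reset(""); // drop the reference to the previous input
        IDLE.get().put(m.pattern().pattern(), m);
    }

    public static void main(String[] args) {
        Matcher m = getMatcher("\\d+", "crawl 42 pages");
        try {
            if (m.find()) {
                System.out.println(m.group()); // prints 42
            }
        } finally {
            recycleMatcher(m); // optional, but lets the next call on this thread reuse m
        }
    }
}
```

Returning the matcher in a finally block keeps the pool populated even when matching code throws. Skipping the call only costs a fresh Matcher on the next request.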

recycleMatcher

public static void recycleMatcher(java.util.regex.Matcher m)

replaceAll

public static java.lang.String replaceAll(java.lang.String pattern,
                                          java.lang.CharSequence input,
                                          java.lang.String replacement)
Utility method using a precompiled pattern instead of using the replaceAll method of the String class. This method will also be reusing Matcher objects.

Parameters:
pattern - regex pattern string to match against (a precompiled Pattern is used internally)
input - the character sequence to check
replacement - the String to substitute every match with
Returns:
the String with all the matches substituted
See Also:
Pattern
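
For comparison, String.replaceAll compiles its regex on every call. A minimal sketch of the alternative, assuming a simple shared cache of compiled patterns (the class ReplaceAllSketch and its ConcurrentHashMap are hypothetical and do not reflect how TextUtils stores its patterns):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

/** Hypothetical sketch: replaceAll backed by a cache of compiled patterns. */
final class ReplaceAllSketch {
    // Compiled patterns keyed by their source string (assumed cache layout).
    private static final Map<String, Pattern> COMPILED = new ConcurrentHashMap<>();

    static String replaceAll(String pattern, CharSequence input, String replacement) {
        // Compile once per distinct pattern string, then reuse the Pattern.
        Pattern p = COMPILED.computeIfAbsent(pattern, Pattern::compile);
        // As with String.replaceAll, "$1" in the replacement refers to capture group 1.
        return p.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(replaceAll("\\s+", "a  b \t c", " "));          // prints a b c
        System.out.println(replaceAll("(\\w+)@(\\w+)", "user@host", "$2:$1")); // prints host:user
    }
}
```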

replaceFirst

public static java.lang.String replaceFirst(java.lang.String pattern,
                                            java.lang.CharSequence input,
                                            java.lang.String replacement)
Utility method using a precompiled pattern instead of using the replaceFirst method of the String class. This method will also be reusing Matcher objects.

Parameters:
pattern - regex pattern string to match against (a precompiled Pattern is used internally)
input - the character sequence to check
replacement - the String to substitute the first match with
Returns:
the String with the first match substituted
See Also:
Pattern

matches

public static boolean matches(java.lang.String pattern,
                              java.lang.CharSequence input)
Utility method using a precompiled pattern instead of using the matches method of the String class. This method will also be reusing Matcher objects.

Parameters:
pattern - regex pattern string to match against (a precompiled Pattern is used internally)
input - the character sequence to check
Returns:
true if character sequence matches
See Also:
Pattern
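
Assuming this method keeps the semantics of String.matches, the pattern has to match the entire input, not just a substring of it. The sketch below shows that difference using java.util.regex directly; the class MatchesSketch and its method names are illustrative only.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch contrasting whole-input matching with substring search. */
final class MatchesSketch {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only when the whole input consists of digits (Matcher.matches).
    static boolean matchesWhole(CharSequence input) {
        return DIGITS.matcher(input).matches();
    }

    // True when any run of digits occurs somewhere in the input (Matcher.find).
    static boolean containsMatch(CharSequence input) {
        return DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("2011"));       // prints true
        System.out.println(matchesWhole("year 2011"));  // prints false
        System.out.println(containsMatch("year 2011")); // prints true
    }
}
```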

split

public static java.lang.String[] split(java.lang.String pattern,
                                       java.lang.CharSequence input)
Utility method using a precompiled pattern instead of using the split method of the String class.

Parameters:
pattern - regex pattern string to split by (a precompiled Pattern is used internally)
input - the character sequence to split
Returns:
array of Strings split by pattern
See Also:
Pattern

getFirstWord

public static java.lang.String getFirstWord(java.lang.String s)
Parameters:
s - String to find the first word in (words are delimited by whitespace).
Returns:
First word in the passed string, or null if no word is found.
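
A minimal sketch of the documented behaviour, assuming that leading whitespace is skipped and that a null argument also yields null (neither is stated above). The class FirstWordSketch is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of first-word extraction; not the TextUtils implementation. */
final class FirstWordSketch {
    // A "word" here is any run of non-whitespace characters.
    private static final Pattern WORD = Pattern.compile("\\S+");

    static String getFirstWord(String s) {
        if (s == null) {
            return null; // assumption: null input is treated as "no word found"
        }
        Matcher m = WORD.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(getFirstWord("  GET /index.html")); // prints GET
        System.out.println(getFirstWord("   "));               // prints null
    }
}
```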

escapeForHTMLJavascript

public static java.lang.String escapeForHTMLJavascript(java.lang.String s)
Escapes a string so that it can be passed as an argument to JavaScript in a JSP page. This method takes a string and returns the same string with every single quote escaped by a preceding backslash. Line breaks are replaced with '\n', and less-than signs and ampersands are replaced with HTML entities.

Parameters:
s - The string to escape
Returns:
The same string escaped.
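
The four substitutions listed above can be sketched in a single pass over the input. This is an illustration under assumptions, not the actual implementation: the class JsEscapeSketch is hypothetical, only the line feed character is treated as a line break, and carriage returns pass through unchanged.

```java
/** Hypothetical sketch of the documented escaping rules; not the TextUtils source. */
final class JsEscapeSketch {
    static String escapeForHTMLJavascript(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\'': sb.append("\\'"); break;  // single quote -> backslash + quote
                case '\n': sb.append("\\n"); break;  // line feed -> the two characters \ and n
                case '&':  sb.append("&amp;"); break;
                case '<':  sb.append("&lt;"); break;
                default:   sb.append(c);             // everything else, including \r, is copied
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeForHTMLJavascript("it's <b> & \n")); // prints it\'s &lt;b> &amp; \n
    }
}
```

Working character by character means the ampersands introduced by the entities are never escaped a second time.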

escapeForMarkupAttribute

public static java.lang.String escapeForMarkupAttribute(java.lang.String s)
Escapes a string so that it can be placed inside an XML/HTML attribute. Replaces ampersand, less-than, greater-than, single-quote, and double-quote with escaped versions.

Parameters:
s - The string to escape
Returns:
The same string escaped.
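
A minimal sketch of the five replacements, assuming the usual entity forms. The exact "escaped versions" are not stated above, so &amp;#39; for the single quote (rather than &amp;apos;) is a guess, and the class AttributeEscapeSketch is hypothetical.

```java
/** Hypothetical sketch of attribute escaping; entity choices are assumptions. */
final class AttributeEscapeSketch {
    static String escapeForMarkupAttribute(String s) {
        return s.replace("&", "&amp;")    // must run first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");   // assumed form; &apos; is another possibility
    }

    public static void main(String[] args) {
        // prints a&lt;b &amp; &quot;c&quot; &#39;d&#39;&gt;
        System.out.println(escapeForMarkupAttribute("a<b & \"c\" 'd'>"));
    }
}
```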

escapeForHTML

public static java.lang.String escapeForHTML(java.lang.String s)
Minimally escapes a string so that it can be placed inside an XML/HTML attribute. Escapes less-than signs and ampersands.

Parameters:
s - The string to escape
Returns:
The same string escaped.

writeEscapedForHTML

public static void writeEscapedForHTML(java.lang.String s,
                                       javax.servlet.jsp.JspWriter out)
                                throws java.io.IOException
Utility method for writing a (potentially large) String to a JspWriter, escaping it for HTML display, without constructing another large String of the whole content.

Parameters:
s - String to write
out - destination JspWriter
Throws:
java.io.IOException
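
The point of this method is to avoid building a second large String. One way to do that is to write unescaped runs straight from the source string and emit entities in between, as in the sketch below. It targets java.io.Writer (the superclass of JspWriter) so that it runs without the servlet API. The class name, the helper escapeToString, and the choice to escape only less-than signs and ampersands are assumptions.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical sketch of streaming HTML escaping; not the TextUtils source. */
final class StreamingEscapeSketch {
    static void writeEscapedForHTML(String s, Writer out) throws IOException {
        int start = 0; // start of the pending run of characters that need no escaping
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            String entity = (c == '<') ? "&lt;" : (c == '&') ? "&amp;" : null;
            if (entity != null) {
                out.write(s, start, i - start); // flush the run before the special character
                out.write(entity);
                start = i + 1;
            }
        }
        out.write(s, start, s.length() - start); // flush the tail
    }

    // Convenience wrapper for trying the method out; it defeats the streaming purpose.
    static String escapeToString(String s) {
        StringWriter sw = new StringWriter();
        try {
            writeEscapedForHTML(s, sw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeToString("a<b&c")); // prints a&lt;b&amp;c
    }
}
```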

unescapeHtml

public static java.lang.CharSequence unescapeHtml(java.lang.CharSequence cs)
Replaces HTML Entity Encodings.

Parameters:
cs - The CharSequence to decode HTML entity encodings in
Returns:
the same CharSequence or an unescaped String.
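
A minimal sketch of entity decoding that returns the original CharSequence when nothing needs replacing. Which entities the real method recognises is not stated above. This sketch assumes decimal numeric entities plus four common named ones, and the class UnescapeSketch is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of HTML entity decoding; the supported entity set is an assumption. */
final class UnescapeSketch {
    // Group 1: decimal code point, group 2: one of a few named entities.
    private static final Pattern ENTITY = Pattern.compile("&(?:#(\\d{1,7})|(amp|lt|gt|quot));");

    static CharSequence unescapeHtml(CharSequence cs) {
        Matcher m = ENTITY.matcher(cs);
        if (!m.find()) {
            return cs; // nothing to decode: hand back the same object
        }
        StringBuilder sb = new StringBuilder(cs.length());
        int last = 0;
        do {
            sb.append(cs, last, m.start()); // copy text before the entity
            if (m.group(1) != null) {
                int cp = Integer.parseInt(m.group(1));
                if (Character.isValidCodePoint(cp)) {
                    sb.appendCodePoint(cp);
                } else {
                    sb.append(m.group()); // leave out-of-range references untouched
                }
            } else {
                sb.append(named(m.group(2)));
            }
            last = m.end();
        } while (m.find());
        sb.append(cs, last, cs.length()); // copy the tail
        return sb.toString();
    }

    private static char named(String name) {
        switch (name) {
            case "amp": return '&';
            case "lt":  return '<';
            case "gt":  return '>';
            default:    return '"'; // "quot", the only remaining alternative in the pattern
        }
    }

    public static void main(String[] args) {
        System.out.println(unescapeHtml("a &lt; b &#38; c")); // prints a < b & c
    }
}
```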

exceptionToString

public static java.lang.String exceptionToString(java.lang.String message,
                                                 java.lang.Throwable e)
Parameters:
message - Message to put at top of the string returned. May be null.
e - Exception to write into a string.
Returns:
Formatted string made of the passed message and the stack trace of the passed exception.
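
One common way to produce such a string is to print the stack trace into a StringWriter. The sketch below assumes the message goes on its own first line and is simply omitted when null; the actual layout used by TextUtils is not specified here, and the class ExceptionTextSketch is hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical sketch of message-plus-stack-trace formatting; layout is an assumption. */
final class ExceptionTextSketch {
    static String exceptionToString(String message, Throwable e) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        if (message != null) {
            pw.println(message); // message first, on its own line
        }
        e.printStackTrace(pw);   // then the standard stack trace text
        pw.flush();
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.print(exceptionToString("Fetch failed", new IllegalStateException("boom")));
    }
}
```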


Copyright © 2003-2011 Internet Archive. All Rights Reserved.