org.geotoolkit.util
Class Strings

Object
  extended by Static
      extended by Strings

public final class Strings
extends Static

Utility methods working on String or CharSequence instances. Some methods defined in this class duplicate functionality already provided by the String class, but work on a generic CharSequence instance instead of a String. Other methods perform their work directly on the provided StringBuilder.


Unicode support
Every method defined in this class works on code points instead of characters when appropriate. Consequently, those methods should behave correctly with characters outside the Basic Multilingual Plane (BMP).
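
To illustrate the difference, here is a minimal sketch (not the Geotk source; the class and method names are hypothetical) that tests for upper-case text once by code point and once by char. With a character outside the BMP, the char-based loop sees two surrogates, neither of which is an upper-case letter.

```java
final class CodePointSketch {
    /** Tests each code point; a surrogate pair is read as one unit. */
    static boolean isUpperCaseByCodePoint(CharSequence text) {
        for (int i = 0; i < text.length();) {
            int cp = Character.codePointAt(text, i);
            if (!Character.isUpperCase(cp)) {
                return false;
            }
            i += Character.charCount(cp);   // advance by 1 or 2 chars
        }
        return true;
    }

    /** Tests each char; the two halves of a surrogate pair are tested separately. */
    static boolean isUpperCaseByChar(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isUpperCase(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // U+10400 (DESERET CAPITAL LETTER LONG I) lies outside the BMP.
        String deseret = new String(Character.toChars(0x10400));
        System.out.println(isUpperCaseByCodePoint(deseret));
        System.out.println(isUpperCaseByChar(deseret));
    }
}
```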


Handling of null values
Some methods accept a null argument, in particular the methods converting the given String to another String which may be the same. For example, the camelCaseToAcronym(String) method returns null if the string to convert is null. Some other methods, like count(String, char), handle a null argument like an empty string. The methods that don't accept a null argument are explicitly documented as throwing a NullPointerException.

Since:
3.09 (derived from 3.00)
Version:
3.20
Author:
Martin Desruisseaux (Geomatys)
See Also:
Arrays.toString(Object[]), XArrays.containsIgnoreCase(String[], String)
Module:
utility/geotk-utility

Field Summary
static String[] EMPTY
          A zero-length array.
 
Method Summary
static String camelCaseToAcronym(String text)
          Creates an acronym from the given text.
static String camelCaseToSentence(CharSequence identifier)
          Given a string in camel case (typically a Java identifier), returns a string formatted like an English sentence.
static StringBuilder camelCaseToWords(CharSequence identifier, boolean toLowerCase)
          Given a string in camel case, returns a string with the same words separated by spaces.
static String commonPrefix(String s1, String s2)
          Returns the longest sequence of characters which is found at the beginning of the two given strings.
static String commonSuffix(String s1, String s2)
          Returns the longest sequence of characters which is found at the end of the two given strings.
static int count(CharSequence text, char c)
          Counts the number of occurrences of the given character in the given character sequence.
static int count(String text, char c)
          Counts the number of occurrences of the given character in the given string.
static int count(String text, String toSearch)
          Returns the number of occurrences of the toSearch string in the given text.
static boolean endsWith(CharSequence sequence, CharSequence suffix, boolean ignoreCase)
          Returns true if the given character sequence ends with the given suffix.
static boolean equalsIgnoreCase(CharSequence s1, CharSequence s2)
          Returns true if the two given strings are equal, ignoring case.
static String[] getLinesFromMultilines(String text)
          Returns a String instance for each line found in a multi-line string.
static int indexOf(CharSequence string, CharSequence part, int fromIndex)
          Returns the index within the given string of the first occurrence of the specified part, starting at the specified index.
static boolean isAcronymForWords(CharSequence acronym, CharSequence words)
          Returns true if the first string is likely to be an acronym of the second string.
static boolean isJavaIdentifier(CharSequence identifier)
          Returns true if the given identifier is a legal Java identifier.
static boolean isUpperCase(CharSequence text)
          Returns true if every character in the given character sequence is upper-case.
static byte[] parseBytes(String values, char separator, int radix)
          Splits the given string around the given character, then parses each item as a byte.
static double[] parseDoubles(String values, char separator)
          Splits the given string around the given character, then parses each item as a double.
static float[] parseFloats(String values, char separator)
          Splits the given string around the given character, then parses each item as a float.
static int[] parseInts(String values, char separator, int radix)
          Splits the given string around the given character, then parses each item as an int.
static long[] parseLongs(String values, char separator, int radix)
          Splits the given string around the given character, then parses each item as a long.
static short[] parseShorts(String values, char separator, int radix)
          Splits the given string around the given character, then parses each item as a short.
static boolean regionMatches(CharSequence string, int offset, CharSequence part)
          Returns true if the given string at the given offset contains the given part, in a case-sensitive comparison.
static void remove(StringBuilder buffer, String search)
          Removes every occurrence of the given string in the given buffer.
static void replace(StringBuilder buffer, int start, int end, char[] chars)
          Replaces the characters in a substring of the buffer with characters in the specified array.
static void replace(StringBuilder buffer, String search, String replacement)
          Replaces every occurrence of the given string in the given buffer.
static int skipLines(CharSequence string, int numToSkip, int startAt)
          Returns the index of the first character after the given number of lines.
static String spaces(int length)
          Returns a string of the specified length filled with white spaces.
static String[] split(String toSplit, char separator)
          Splits a string around the given character.
static boolean startsWith(CharSequence sequence, CharSequence prefix, boolean ignoreCase)
          Returns true if the given character sequence starts with the given prefix.
static CharSequence toASCII(CharSequence text)
          Replaces some Unicode characters by ASCII characters on a "best effort basis".
static CharSequence token(CharSequence text, int offset)
          Returns the token starting at the given offset in the given text.
static String trim(String text)
          Returns a string with leading and trailing white spaces omitted.
static String trimFractionalPart(String value)
          Trims the fractional part of the given formatted number, provided that it doesn't change the value.
static void trimFractionalPart(StringBuilder buffer)
          Trims the fractional part of the given formatted number, provided that it doesn't change the value.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EMPTY

public static final String[] EMPTY
A zero-length array. This constant plays a role equivalent to Collections.EMPTY_LIST.

Since:
3.20
Method Detail

spaces

public static String spaces(int length)
Returns a string of the specified length filled with white spaces. This method tries to return a pre-allocated string if possible.

Parameters:
length - The string length. Negative values are clamped to 0.
Returns:
A string of length length filled with white spaces.

count

public static int count(String text,
                        String toSearch)
Returns the number of occurrences of the toSearch string in the given text. The search is case-sensitive.

Parameters:
text - String to search in, or null.
toSearch - The string to search in the given text. Must contain at least one character.
Returns:
The number of occurrences of toSearch in text, or 0 if text was null or empty.
Throws:
IllegalArgumentException - If the toSearch string is null or empty.
Since:
3.11

count

public static int count(String text,
                        char c)
Counts the number of occurrences of the given character in the given string. This method performs the same work as count(CharSequence, char), but is faster.

Parameters:
text - The text in which to count the number of occurrences, or null.
c - The character to count.
Returns:
The number of occurrences of the given character, or 0 if text was null.

count

public static int count(CharSequence text,
                        char c)
Counts the number of occurrences of the given character in the given character sequence. This method performs the same work as count(String, char), but on a more generic interface.

Parameters:
text - The text in which to count the number of occurrences, or null.
c - The character to count.
Returns:
The number of occurrences of the given character, or 0 if text was null.

split

public static String[] split(String toSplit,
                             char separator)
Splits a string around the given character. The array returned by this method contains each substring of the given string that is terminated by the given character or is terminated by the end of the string. The substrings in the array are in the order in which they occur in the given string. If the character is not found in the input, then the resulting array has just one element, namely the given string.

This method is similar to the standard String.split(String) method except for the following:

Parameters:
toSplit - The string to split, or null.
separator - The delimiting character (typically the comma).
Returns:
The array of strings computed by splitting the given string around the given character, or an empty array if toSplit was null.
Since:
3.18
See Also:
String.split(String)
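
The behavior described above can be sketched with plain JDK calls. This is an illustration only, not the Geotk implementation: the class name SplitSketch is hypothetical, and the choices to keep empty substrings and to leave whitespace untouched are assumptions, since the list of differences from String.split(String) is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {
    /** Splits around a literal character, without regular expressions. */
    static String[] split(String toSplit, char separator) {
        if (toSplit == null) {
            return new String[0];           // null input gives an empty array
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = toSplit.indexOf(separator, start)) >= 0) {
            parts.add(toSplit.substring(start, end));   // substring terminated by the separator
            start = end + 1;
        }
        parts.add(toSplit.substring(start));            // substring terminated by the end of the string
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String part : split("lat,lon,height", ',')) {
            System.out.println(part);
        }
    }
}
```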

parseDoubles

public static double[] parseDoubles(String values,
                                    char separator)
                             throws NumberFormatException
Splits the given string around the given character, then parses each item as a double.

Parameters:
values - The string containing the values to parse, or null.
separator - The delimiting character (typically the comma).
Returns:
The array of numbers parsed from the given string, or an empty array if values was null.
Throws:
NumberFormatException - If at least one number cannot be parsed.
Since:
3.19

parseFloats

public static float[] parseFloats(String values,
                                  char separator)
                           throws NumberFormatException
Splits the given string around the given character, then parses each item as a float.

Parameters:
values - The string containing the values to parse, or null.
separator - The delimiting character (typically the comma).
Returns:
The array of numbers parsed from the given string, or an empty array if values was null.
Throws:
NumberFormatException - If at least one number cannot be parsed.
Since:
3.19

parseLongs

public static long[] parseLongs(String values,
                                char separator,
                                int radix)
                         throws NumberFormatException
Splits the given string around the given character, then parses each item as a long.

Parameters:
values - The string containing the values to parse, or null.
separator - The delimiting character (typically the comma).
radix - the radix to be used for parsing. This is usually 10.
Returns:
The array of numbers parsed from the given string, or an empty array if values was null.
Throws:
NumberFormatException - If at least one number cannot be parsed.
Since:
3.19

parseInts

public static int[] parseInts(String values,
                              char separator,
                              int radix)
                       throws NumberFormatException
Splits the given string around the given character, then parses each item as an int.

Parameters:
values - The string containing the values to parse, or null.
separator - The delimiting character (typically the comma).
radix - the radix to be used for parsing. This is usually 10.
Returns:
The array of numbers parsed from the given string, or an empty array if values was null.
Throws:
NumberFormatException - If at least one number cannot be parsed.
Since:
3.19
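
The split-then-parse contract shared by the parseXXX methods can be sketched as follows. This is not the Geotk source: the class name ParseIntsSketch is hypothetical, and trimming whitespace around each item before parsing is an assumption of this sketch.

```java
import java.util.regex.Pattern;

final class ParseIntsSketch {
    /** Splits around a literal separator, then parses each item in the given radix. */
    static int[] parseInts(String values, char separator, int radix) {
        if (values == null) {
            return new int[0];              // null input gives an empty array
        }
        // Pattern.quote makes the separator literal; the -1 limit keeps trailing empty items.
        String[] items = values.split(Pattern.quote(String.valueOf(separator)), -1);
        int[] parsed = new int[items.length];
        for (int i = 0; i < items.length; i++) {
            // Integer.parseInt throws NumberFormatException for an unparsable item.
            parsed[i] = Integer.parseInt(items[i].trim(), radix);
        }
        return parsed;
    }

    public static void main(String[] args) {
        for (int value : parseInts("ff,10,7", ',', 16)) {
            System.out.println(value);
        }
    }
}
```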

parseShorts

public static short[] parseShorts(String values,
                                  char separator,
                                  int radix)
                           throws NumberFormatException
Splits the given string around the given character, then parses each item as a short.

Parameters:
values - The string containing the values to parse, or null.
separator - The delimiting character (typically the comma).
radix - the radix to be used for parsing. This is usually 10.
Returns:
The array of numbers parsed from the given string, or an empty array if values was null.
Throws:
NumberFormatException - If at least one number cannot be parsed.
Since:
3.19

parseBytes

public static byte[] parseBytes(String values,
                                char separator,
                                int radix)
                         throws NumberFormatException
Splits the given string around the given character, then parses each item as a byte.

Parameters:
values - The string containing the values to parse, or null.
separator - The delimiting character (typically the comma).
radix - the radix to be used for parsing. This is usually 10.
Returns:
The array of numbers parsed from the given string, or an empty array if values was null.
Throws:
NumberFormatException - If at least one number cannot be parsed.
Since:
3.19

replace

public static void replace(StringBuilder buffer,
                           String search,
                           String replacement)
Replaces every occurrence of the given string in the given buffer. This method invokes StringBuilder.replace(int, int, String) for each occurrence of search found in the buffer.

Parameters:
buffer - The string in which to perform the replacements.
search - The string to replace.
replacement - The replacement for the target string.
Throws:
NullPointerException - if any of the arguments is null.
See Also:
String.replace(char, char), String.replace(CharSequence, CharSequence), StringBuilder.replace(int, int, String)
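
A loop of this kind might look like the sketch below, which is an illustration rather than the Geotk source. The class name ReplaceSketch is hypothetical, and two details are assumptions: an empty search string is treated as a no-op, and the search resumes after the inserted replacement so that the replacement text itself is never rescanned.

```java
import java.util.Objects;

final class ReplaceSketch {
    /** Replaces each occurrence of search in the buffer, in place. */
    static void replace(StringBuilder buffer, String search, String replacement) {
        Objects.requireNonNull(buffer, "buffer");
        Objects.requireNonNull(search, "search");
        Objects.requireNonNull(replacement, "replacement");
        if (search.isEmpty()) {
            return;                         // assumption: nothing to replace
        }
        int i = buffer.indexOf(search);
        while (i >= 0) {
            buffer.replace(i, i + search.length(), replacement);
            // Resume after the text just inserted.
            i = buffer.indexOf(search, i + replacement.length());
        }
    }

    public static void main(String[] args) {
        StringBuilder buffer = new StringBuilder("a-b-c");
        replace(buffer, "-", "--");
        System.out.println(buffer);
    }
}
```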

replace

public static void replace(StringBuilder buffer,
                           int start,
                           int end,
                           char[] chars)
Replaces the characters in a substring of the buffer with characters in the specified array. The substring to be replaced begins at the specified start and extends to the character at index end - 1.

Parameters:
buffer - The buffer in which to perform the replacement.
start - The beginning index in the buffer, inclusive.
end - The ending index in the buffer, exclusive.
chars - The array that will replace previous contents.
Throws:
NullPointerException - if the buffer or chars argument is null.
Since:
3.20
See Also:
StringBuilder.replace(int, int, String)

remove

public static void remove(StringBuilder buffer,
                          String search)
Removes every occurrence of the given string in the given buffer. This method invokes StringBuilder.delete(int, int) for each occurrence of search found in the buffer.

Parameters:
buffer - The string in which to perform the removals.
search - The string to remove.
Throws:
NullPointerException - if any of the arguments is null.
See Also:
StringBuilder.delete(int, int)

trim

public static String trim(String text)
Returns a string with leading and trailing white spaces omitted. White spaces are identified by the Character.isWhitespace(int) method.

This method is similar in purpose to String.trim(), except that the latter considers every ASCII control code below 32 to be whitespace. This has the effect of removing X3.64 escape sequences as well. Users should invoke this Strings.trim method instead if they need to preserve X3.64 escape sequences.

Parameters:
text - The string from which to remove leading and trailing white spaces, or null.
Returns:
A string with leading and trailing white spaces removed, or null if the given string was null.
See Also:
String.trim()
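
The difference from String.trim() can be seen in a small sketch built on Character.isWhitespace(int). This is not the Geotk source, and the class name TrimSketch is hypothetical. The ESC character (U+001B) that starts an X3.64 sequence is not whitespace for Character.isWhitespace, so it survives here, whereas String.trim() strips it because its code is below 32.

```java
final class TrimSketch {
    /** Trims leading and trailing code points for which Character.isWhitespace is true. */
    static String trim(String text) {
        if (text == null) {
            return null;
        }
        int start = 0;
        int end = text.length();
        while (start < end) {
            int cp = text.codePointAt(start);
            if (!Character.isWhitespace(cp)) break;
            start += Character.charCount(cp);
        }
        while (end > start) {
            int cp = text.codePointBefore(end);
            if (!Character.isWhitespace(cp)) break;
            end -= Character.charCount(cp);
        }
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String escape = "\u001B[0m";        // an X3.64 (ANSI) reset sequence
        System.out.println(trim(escape).length());
        System.out.println(escape.trim().length());
    }
}
```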

trimFractionalPart

public static String trimFractionalPart(String value)
Trims the fractional part of the given formatted number, provided that it doesn't change the value. This method assumes that the number is formatted in the US locale, typically by the Double.toString(double) method.

More specifically, if the given string ends with a '.' character followed by a sequence of '0' characters, then those characters are omitted. Otherwise this method returns the string unchanged. This is an "all or nothing" method: either the fractional part is completely removed, or it is left unchanged.

Examples
This method returns "4" if the given value is "4.", "4.0" or "4.00", but returns "4.10" unchanged (including the trailing '0' character) if the input is "4.10".

Use case
This method is useful before parsing a number, if that number should preferably be parsed as an integer before attempting to parse it as a floating point number.

Parameters:
value - The value to trim if possible, or null.
Returns:
The value without the trailing ".0" part (if any), or null if the given string was null.
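
The "all or nothing" rule can be sketched as below. This is an illustration under the rule stated above, not the Geotk source, and the class name TrimFractionSketch is hypothetical.

```java
final class TrimFractionSketch {
    /** Removes a trailing '.' followed by zero or more '0' characters, or returns the value as-is. */
    static String trimFractionalPart(String value) {
        if (value == null) {
            return null;
        }
        int i = value.length();
        while (i > 0 && value.charAt(i - 1) == '0') {
            i--;                            // skip trailing zeros
        }
        if (i > 0 && value.charAt(i - 1) == '.') {
            return value.substring(0, i - 1);   // only zeros after the dot: drop them all
        }
        return value;                       // a non-zero digit was found first: keep everything
    }

    public static void main(String[] args) {
        System.out.println(trimFractionalPart(Double.toString(4.0)));
        System.out.println(trimFractionalPart("4.10"));
    }
}
```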

trimFractionalPart

public static void trimFractionalPart(StringBuilder buffer)
Trims the fractional part of the given formatted number, provided that it doesn't change the value. This method performs the same work as trimFractionalPart(String) except that it modifies the given buffer in-place.

Use case
This method is useful after a double value has been appended to the buffer, in order to make it appear like an integer when possible.

Parameters:
buffer - The buffer to trim if possible.
Throws:
NullPointerException - if the argument is null.

toASCII

public static CharSequence toASCII(CharSequence text)
Replaces some Unicode characters by ASCII characters on a "best effort basis". For example the 'é' character is replaced by 'e' (without accent).

The current implementation replaces only the characters in the range 00C0 to 00FF, inclusive. Other characters are left unchanged.

Note that if the given character sequence is an instance of StringBuilder, then the replacement will be performed in-place.

Parameters:
text - The text to scan for Unicode characters to replace by ASCII characters, or null.
Returns:
The given text with substitution applied, or text if no replacement has been applied.
Since:
3.18
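
One way to approximate this behavior with the JDK alone is sketched below. It is not the Geotk implementation: the class name ToAsciiSketch is hypothetical, and using java.text.Normalizer to find the base letter is an assumption of this sketch. Characters in the 00C0 to 00FF range that have no ASCII base letter under canonical decomposition (for example 'Æ' or '×') are left unchanged here, which may differ from Geotk.

```java
import java.text.Normalizer;

final class ToAsciiSketch {
    /** Replaces accented letters in U+00C0..U+00FF by their ASCII base letter, when one exists. */
    static CharSequence toASCII(CharSequence text) {
        if (text == null) {
            return null;
        }
        StringBuilder buffer = null;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '\u00C0' && c <= '\u00FF') {
                // Canonical decomposition: e.g. U+00E9 becomes 'e' followed by U+0301.
                char base = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD).charAt(0);
                if (base < 0x80) {
                    if (buffer == null) {
                        // Reuse a StringBuilder argument so the replacement happens in place.
                        buffer = (text instanceof StringBuilder)
                                ? (StringBuilder) text : new StringBuilder(text);
                        text = buffer;
                    }
                    buffer.setCharAt(i, base);
                }
            }
        }
        return text;    // the original instance if nothing was replaced
    }

    public static void main(String[] args) {
        System.out.println(toASCII("caf\u00E9"));
    }
}
```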

camelCaseToSentence

public static String camelCaseToSentence(CharSequence identifier)
Given a string in camel case (typically a Java identifier), returns a string formatted like an English sentence. This heuristic method performs the following steps:
  1. Invoke camelCaseToWords(CharSequence, boolean), which separates the words on the basis of character case. For example "transferFunctionType" becomes "transfer function type". This works fine for ISO 19115 identifiers.

  2. Next, replace all occurrences of '_' with spaces in order to take into account another common naming convention, which uses '_' as a word separator. This convention is used by NetCDF attributes like "project_name".

  3. Finally, ensure that the first character is upper-case.

Exception to the above rules
If the given identifier contains only upper-case letters, digits and the '_' character, then the identifier is returned "as is" except for the '_' characters, which are replaced by '-'. This works well for identifiers like "UTF-8" or "ISO-LATIN-1", for example.

Note that those heuristic rules may be modified in future Geotk versions, depending on the practical experience gained.

Parameters:
identifier - An identifier with no spaces, in which words begin with an upper-case character, or null.
Returns:
The identifier with spaces inserted after what looks like words, or null if the given argument was null.
Since:
3.18 (derived from 3.09)
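
The steps and the exception described above can be condensed into the following sketch. It is a simplified illustration, not the Geotk source: the class name SentenceSketch and the helper isUpperCaseCode are hypothetical, the word splitting is done inline instead of delegating to camelCaseToWords, acronyms get no special treatment, and for brevity the loop works on char values rather than code points.

```java
final class SentenceSketch {
    static String camelCaseToSentence(CharSequence identifier) {
        if (identifier == null) {
            return null;
        }
        String text = identifier.toString();
        if (isUpperCaseCode(text)) {
            return text.replace('_', '-');              // exception to the rules
        }
        StringBuilder buffer = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                buffer.append(' ');                     // step 2: '_' as word separator
            } else if (i != 0 && Character.isUpperCase(c)
                    && Character.isLowerCase(text.charAt(i - 1))) {
                buffer.append(' ').append(Character.toLowerCase(c));   // step 1: new word
            } else {
                buffer.append(c);
            }
        }
        if (buffer.length() != 0) {
            buffer.setCharAt(0, Character.toUpperCase(buffer.charAt(0)));   // step 3
        }
        return buffer.toString();
    }

    /** True if the text contains only upper-case letters, digits and '_'. */
    private static boolean isUpperCaseCode(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!(Character.isUpperCase(c) || Character.isDigit(c) || c == '_')) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(camelCaseToSentence("transferFunctionType"));
        System.out.println(camelCaseToSentence("project_name"));
    }
}
```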

camelCaseToWords

public static StringBuilder camelCaseToWords(CharSequence identifier,
                                             boolean toLowerCase)
Given a string in camel case, returns a string with the same words separated by spaces. A word begins with an upper-case character following a lower-case character. For example if the given string is "PixelInterleavedSampleModel", then this method returns "Pixel Interleaved Sample Model" or "Pixel interleaved sample model" depending on the value of the toLowerCase argument.

If toLowerCase is false, then this method inserts spaces but does not change the case of characters. If toLowerCase is true, then this method changes to lower case the first character after each space inserted by this method (note that this intentionally excludes the very first character in the given string), except if the second character is upper case, in which case the word is assumed to be an acronym.

The given string is usually a programmatic identifier like a class name or a method name.

Parameters:
identifier - An identifier with no spaces, in which words begin with an upper-case character.
toLowerCase - true for changing the first character of words to lower case, except for the first word and acronyms.
Returns:
The identifier with spaces inserted after what looks like words, returned as a StringBuilder in order to allow modifications by the caller.
Throws:
NullPointerException - if the identifier argument is null.
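
The word-splitting rule can be sketched as follows. This follows the description above but is not the Geotk source; the class name WordsSketch is hypothetical, and edge cases beyond those described (such as digits or consecutive acronyms) are handled in whatever way falls out of the simple rule.

```java
final class WordsSketch {
    static StringBuilder camelCaseToWords(CharSequence identifier, boolean toLowerCase) {
        int length = identifier.length();   // throws NullPointerException if identifier is null
        StringBuilder buffer = new StringBuilder(length + 8);
        int previous = 0;                   // previous code point, none yet
        for (int i = 0; i < length;) {
            int cp = Character.codePointAt(identifier, i);
            int next = i + Character.charCount(cp);
            int toAppend = cp;
            // A word begins with an upper-case character following a lower-case character.
            if (i != 0 && Character.isUpperCase(cp) && Character.isLowerCase(previous)) {
                buffer.append(' ');
                // If the second character of the word is also upper-case, assume an acronym.
                boolean acronym = next < length
                        && Character.isUpperCase(Character.codePointAt(identifier, next));
                if (toLowerCase && !acronym) {
                    toAppend = Character.toLowerCase(cp);
                }
            }
            buffer.appendCodePoint(toAppend);
            previous = cp;
            i = next;
        }
        return buffer;
    }

    public static void main(String[] args) {
        System.out.println(camelCaseToWords("PixelInterleavedSampleModel", false));
        System.out.println(camelCaseToWords("PixelInterleavedSampleModel", true));
    }
}
```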

camelCaseToAcronym

public static String camelCaseToAcronym(String text)
Creates an acronym from the given text. If every character in the given text is upper case, then the text is returned unchanged on the assumption that it is already an acronym. Otherwise this method returns a string containing the first character of each word, where the words are separated by the camel case convention, the '_' character, or any character which is not a Java identifier part (including spaces).

Examples: given "northEast", this method returns "NE". Given "Open Geospatial Consortium", this method returns "OGC".

Parameters:
text - The text for which to create an acronym, or null.
Returns:
The acronym, or null if the given text was null.

isAcronymForWords

public static boolean isAcronymForWords(CharSequence acronym,
                                        CharSequence words)
Returns true if the first string is likely to be an acronym of the second string. An acronym is a sequence of letters or digits built from at least one character of each word in the words string. More than one character from the same word may appear in the acronym, but they must always be the first consecutive characters. The comparison is case-insensitive.

Example: given the string "Open Geospatial Consortium", the following strings are recognized as acronyms: "OGC", "ogc", "O.G.C.", "OpGeoCon".

Parameters:
acronym - A possible acronym of the sequence of words.
words - The sequence of words.
Returns:
true if the first string is an acronym of the second one.
Throws:
NullPointerException - if any of the arguments is null.

isJavaIdentifier

public static boolean isJavaIdentifier(CharSequence identifier)
Returns true if the given identifier is a legal Java identifier. This method returns true if the identifier length is greater than zero, the first character is a Java identifier start and all remaining characters (if any) are Java identifier parts.

Parameters:
identifier - The character sequence to test.
Returns:
true if the given character sequence is a legal Java identifier.
Throws:
NullPointerException - if the argument is null.
Since:
3.20
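
The test described above maps directly onto Character.isJavaIdentifierStart(int) and Character.isJavaIdentifierPart(int). The sketch below is an illustration, not the Geotk source, and the class name IdentifierSketch is hypothetical. Note that this sketch does not reject reserved words such as "class".

```java
final class IdentifierSketch {
    static boolean isJavaIdentifier(CharSequence identifier) {
        int length = identifier.length();   // throws NullPointerException if identifier is null
        if (length == 0) {
            return false;
        }
        int cp = Character.codePointAt(identifier, 0);
        if (!Character.isJavaIdentifierStart(cp)) {
            return false;
        }
        // Check the remaining code points, advancing by 1 or 2 chars each time.
        for (int i = Character.charCount(cp); i < length; i += Character.charCount(cp)) {
            cp = Character.codePointAt(identifier, i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isJavaIdentifier("camelCaseToWords"));
        System.out.println(isJavaIdentifier("1abc"));
    }
}
```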

isUpperCase

public static boolean isUpperCase(CharSequence text)
Returns true if every character in the given character sequence is upper-case.

Parameters:
text - The character sequence to test.
Returns:
true if every character is upper-case.
Throws:
NullPointerException - if the argument is null.
See Also:
String.toUpperCase()

equalsIgnoreCase

public static boolean equalsIgnoreCase(CharSequence s1,
                                       CharSequence s2)
Returns true if the two given strings are equal, ignoring case. This method is similar to String.equalsIgnoreCase(String), except it works on arbitrary character sequences and compares code points instead of characters.

Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
Returns:
true if the two given strings are equal, ignoring case.
Throws:
NullPointerException - if any of the arguments is null.
See Also:
String.equalsIgnoreCase(String)
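
A code-point based comparison might be written as in the sketch below. This is an illustration, not the Geotk source: the class name EqualsSketch is hypothetical, and the choice to compare after both upper-case and lower-case conversion (the same strategy String uses for char values) is an assumption.

```java
final class EqualsSketch {
    static boolean equalsIgnoreCase(CharSequence s1, CharSequence s2) {
        int n1 = s1.length();               // throws NullPointerException if s1 is null
        int n2 = s2.length();               // throws NullPointerException if s2 is null
        int i1 = 0;
        int i2 = 0;
        while (i1 < n1 && i2 < n2) {
            int c1 = Character.codePointAt(s1, i1);
            int c2 = Character.codePointAt(s2, i2);
            i1 += Character.charCount(c1);
            i2 += Character.charCount(c2);
            if (c1 != c2) {
                int u1 = Character.toUpperCase(c1);
                int u2 = Character.toUpperCase(c2);
                if (u1 != u2 && Character.toLowerCase(u1) != Character.toLowerCase(u2)) {
                    return false;
                }
            }
        }
        return i1 == n1 && i2 == n2;        // both sequences must be fully consumed
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCase("Hello", new StringBuilder("hELLO")));
    }
}
```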

regionMatches

public static boolean regionMatches(CharSequence string,
                                    int offset,
                                    CharSequence part)
Returns true if the given string at the given offset contains the given part, in a case-sensitive comparison. This method is equivalent to the following code:
return string.regionMatches(offset, part, 0, part.length());
Except that this method works on arbitrary CharSequence objects instead of Strings only.

Parameters:
string - The string in which to test for the presence of part.
offset - The offset in string where to test for the presence of part.
part - The part which may be present in string.
Returns:
true if string contains part at the given offset.
Throws:
NullPointerException - if any of the arguments is null.
See Also:
String.regionMatches(int, String, int, int)
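
For a generic CharSequence, the equivalent check can be sketched as below. This is not the Geotk source; the class name RegionSketch is hypothetical, and returning false for an out-of-range offset (as String.regionMatches does) is an assumption about the intended behavior.

```java
final class RegionSketch {
    static boolean regionMatches(CharSequence string, int offset, CharSequence part) {
        int length = part.length();         // throws NullPointerException if part is null
        if (offset < 0 || length > string.length() - offset) {
            return false;                   // the part cannot fit at this offset
        }
        for (int i = 0; i < length; i++) {
            if (string.charAt(offset + i) != part.charAt(i)) {
                return false;               // case-sensitive mismatch
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(regionMatches(new StringBuilder("Hello World"), 6, "World"));
    }
}
```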

indexOf

public static int indexOf(CharSequence string,
                          CharSequence part,
                          int fromIndex)
Returns the index within the given string of the first occurrence of the specified part, starting at the specified index. This method is equivalent to the following code:
return string.indexOf(part, fromIndex);
Except that this method works on arbitrary CharSequence objects instead of Strings only.

Parameters:
string - The string in which to perform the search.
part - The substring for which to search.
fromIndex - The index from which to start the search.
Returns:
The index within the string of the first occurrence of the specified part, starting at the specified index, or -1 if none.
Throws:
NullPointerException - if any of the arguments is null.
Since:
3.16
See Also:
String.indexOf(String, int), StringBuilder.indexOf(String, int), StringBuffer.indexOf(String, int)

token

public static CharSequence token(CharSequence text,
                                 int offset)
Returns the token starting at the given offset in the given text. For the purpose of this method, a "token" is any sequence of consecutive characters of the same type, as defined below.

Let c be the first non-blank character located at an index equal to or greater than the given offset. Then the characters that are considered of the same type are:

Parameters:
text - The text for which to get the token.
offset - Index of the first character to consider in the given text.
Returns:
A sub-sequence of text starting at the given offset, or an empty string if there is no non-blank character at or after the given offset.
Throws:
NullPointerException - if the text argument is null.
Since:
3.18 (derived from 3.06)

commonPrefix

public static String commonPrefix(String s1,
                                  String s2)
Returns the longest sequence of characters which is found at the beginning of the two given strings. If one of those strings is null, then the other string is returned.

Parameters:
s1 - The first string, or null.
s2 - The second string, or null.
Returns:
The common prefix of both strings, or null if both strings are null.
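
The contract above, including the null handling, can be sketched as follows. This is not the Geotk source; the class name PrefixSketch is hypothetical, and the final check that avoids cutting a surrogate pair in half is an assumption made to stay consistent with the code point policy of this class.

```java
final class PrefixSketch {
    static String commonPrefix(String s1, String s2) {
        if (s1 == null) return s2;          // also covers the case where both are null
        if (s2 == null) return s1;
        int limit = Math.min(s1.length(), s2.length());
        int i = 0;
        while (i < limit && s1.charAt(i) == s2.charAt(i)) {
            i++;
        }
        // Do not end the prefix on the first half of a surrogate pair.
        if (i > 0 && Character.isHighSurrogate(s1.charAt(i - 1))) {
            i--;
        }
        return s1.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println(commonPrefix("geotoolkit", "geotools"));
    }
}
```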

commonSuffix

public static String commonSuffix(String s1,
                                  String s2)
Returns the longest sequence of characters which is found at the end of the two given strings. If one of those strings is null, then the other string is returned.

Parameters:
s1 - The first string, or null.
s2 - The second string, or null.
Returns:
The common suffix of both strings, or null if both strings are null.

startsWith

public static boolean startsWith(CharSequence sequence,
                                 CharSequence prefix,
                                 boolean ignoreCase)
Returns true if the given character sequence starts with the given prefix.

Parameters:
sequence - The sequence to test.
prefix - The expected prefix.
ignoreCase - true if the case should be ignored.
Returns:
true if the given sequence starts with the given prefix.
Throws:
NullPointerException - if any of the arguments is null.

endsWith

public static boolean endsWith(CharSequence sequence,
                               CharSequence suffix,
                               boolean ignoreCase)
Returns true if the given character sequence ends with the given suffix.

Parameters:
sequence - The sequence to test.
suffix - The expected suffix.
ignoreCase - true if the case should be ignored.
Returns:
true if the given sequence ends with the given suffix.
Throws:
NullPointerException - if any of the arguments is null.

skipLines

public static int skipLines(CharSequence string,
                            int numToSkip,
                            int startAt)
Returns the index of the first character after the given number of lines. This method counts the number of occurrences of '\n', '\r' or "\r\n" starting from the given position. When numToSkip occurrences have been found, the index of the first character after the last occurrence is returned.

Parameters:
string - The string in which to skip a determined amount of lines.
numToSkip - The number of lines to skip. Can be positive, zero or negative.
startAt - Index at which to start the search.
Returns:
Index of the first character after the last skipped line.
Throws:
NullPointerException - if the string argument is null.

getLinesFromMultilines

public static String[] getLinesFromMultilines(String text)
Returns a String instance for each line found in a multi-line string. Each element in the returned array will be a single line. If the given text is already a single line, then this method returns a singleton containing only the given text.

Parameters:
text - The multi-line text from which to get the individual lines.
Returns:
The lines in the text, or null if the given text was null.


Copyright © 2009-2012 Geotoolkit.org. All Rights Reserved.