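Guava Source Code Analysis: Strings

Strings Source Code Analysis

Introduction to the Strings Class

Strings is a collection of static utility methods for working with String, provided by Guava. It offers fewer features than the StringUtils class in apache-commons, but this article analyzes Guava's source code, so that is the class we look at here.

First, let's see which methods the Strings class offers.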

public static void main(String[] args) {
  System.out.println(Strings.commonPrefix("aaa", "ann")); // returns "a"
  System.out.println(Strings.commonSuffix("aaa", "bbb")); // returns ""
  System.out.println(Strings.emptyToNull("")); // returns null
  System.out.println(Strings.isNullOrEmpty("")); // returns true
  System.out.println(Strings.lenientFormat("[%s]123[%s]", "abc", "def")); // returns "[abc]123[def]"
  System.out.println(Strings.nullToEmpty(null)); // returns ""
  System.out.println(Strings.padEnd("abc", 5, '*')); // returns "abc**"
  System.out.println(Strings.padStart("abc", 5, '*')); // returns "**abc"
  System.out.println(Strings.repeat("abc", 5)); // returns "abcabcabcabcabc"
}

Class Diagram

Strings class diagram

Main Fields

Main Methods

Strings()

private Strings() {}

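The private constructor indicates that this class is not meant to be instantiated.

The Strings class is also declared final, so it cannot be subclassed either. Together, the two mark it as a pure static utility class.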
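
To make the pattern concrete, here is a minimal sketch of a non-instantiable utility class written the same way. The class name TextUtils and the method firstNonEmpty are hypothetical, chosen only for illustration; they are not part of Guava.

```java
// Hypothetical utility class that follows the same pattern as Strings.
// 'final' prevents subclassing; the private constructor prevents instantiation.
final class TextUtils {
  private TextUtils() {}

  /** Returns a if it is non-null and non-empty, otherwise b. */
  static String firstNonEmpty(String a, String b) {
    return (a == null || a.isEmpty()) ? b : a;
  }
}
```

Callers can only use the class through its static methods, for example TextUtils.firstNonEmpty("", "b").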

nullToEmpty()

/**
* Returns the given string if it is non-null; the empty string otherwise.
*
* @param string the string to test and possibly return
* @return {@code string} itself if it is non-null; {@code ""} if it is null
*/
public static String nullToEmpty(@Nullable String string) {
  return Platform.nullToEmpty(string);
}

If the given string is null, it is converted to the empty string "" and returned; otherwise the original string is returned.

Platform.nullToEmpty()

static String nullToEmpty(@Nullable String string) {
  return (string == null) ? "" : string;
}

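A static helper with package-private visibility: it returns the empty string if the string is null, and the original string otherwise.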
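
The two normalization directions can be sketched side by side. This is a stand-alone illustration that does not depend on Guava; the class name NullNormalization is hypothetical, and emptyToNull is written here on the assumption that it is simply the mirror image of nullToEmpty, consistent with the usage example at the top of the article.

```java
// Stand-alone sketch of both normalization directions (not Guava's own source).
final class NullNormalization {
  private NullNormalization() {}

  // null -> "", everything else unchanged
  static String nullToEmpty(String s) {
    return (s == null) ? "" : s;
  }

  // "" -> null, everything else (including null) unchanged
  static String emptyToNull(String s) {
    return (s == null || s.isEmpty()) ? null : s;
  }
}
```

After normalizing with nullToEmpty, callers can safely invoke methods such as length() or isEmpty() without a separate null check.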

isNullOrEmpty()

/**
* Returns {@code true} if the given string is null or is the empty string.
*
* <p>Consider normalizing your string references with {@link #nullToEmpty}. If you do, you can
* use {@link String#isEmpty()} instead of this method, and you won't need special null-safe forms
* of methods like {@link String#toUpperCase} either. Or, if you'd like to normalize "in the other
* direction," converting empty strings to {@code null}, you can use {@link #emptyToNull}.
*
* @param string a string reference to check
* @return {@code true} if the string is null or is the empty string
*/
public static boolean isNullOrEmpty(@Nullable String string) {
  return Platform.stringIsNullOrEmpty(string);
}

Returns true if the given string is null or has length 0, and false otherwise.

Platform.stringIsNullOrEmpty()

static boolean stringIsNullOrEmpty(@Nullable String string) {
  return string == null || string.isEmpty();
}

This is where the actual check happens: the result is true when the string is null or has length 0, and false otherwise.
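
One consequence of this definition is that a string containing only whitespace is not considered empty. The sketch below illustrates that with a stand-alone copy of the check; the class EmptyChecks and the helper displayName are hypothetical names used only for this example.

```java
// Stand-alone sketch of the null-or-empty check (not Guava's own source).
final class EmptyChecks {
  private EmptyChecks() {}

  static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  // Hypothetical caller: fall back to a default when no name was supplied.
  static String displayName(String name) {
    return isNullOrEmpty(name) ? "anonymous" : name;
  }
}
```

Here displayName(" ") returns the single space unchanged, because " " has length 1; callers that want to treat blank input as missing need to trim first.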

padStart()

/**
* Returns a string, of length at least {@code minLength}, consisting of {@code string} prepended
* with as many copies of {@code padChar} as are necessary to reach that length. For example,
*
* <ul>
* <li>{@code padStart("7", 3, '0')} returns {@code "007"}
* <li>{@code padStart("2010", 3, '0')} returns {@code "2010"}
* </ul>
*
* <p>See {@link java.util.Formatter} for a richer set of formatting capabilities.
*
* @param string the string which should appear at the end of the result
* @param minLength the minimum length the resulting string must have. Can be zero or negative, in
* which case the input string is always returned.
* @param padChar the character to insert at the beginning of the result until the minimum length
* is reached
* @return the padded string
*/
public static String padStart(String string, int minLength, char padChar) {
  checkNotNull(string); // eager for GWT
  // If the string is already at least minLength long, return it unchanged.
  if (string.length() >= minLength) {
    return string;
  }
  // Prepend (minLength - string.length()) copies of padChar.
  StringBuilder sb = new StringBuilder(minLength);
  for (int i = string.length(); i < minLength; i++) {
    sb.append(padChar);
  }
  sb.append(string);
  return sb.toString();
}

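Prepends the given pad character to a string until it reaches the specified minimum length. If the string is already at least that long, it is returned unchanged.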

padEnd()

/**
* Returns a string, of length at least {@code minLength}, consisting of {@code string} appended
* with as many copies of {@code padChar} as are necessary to reach that length. For example,
*
* <ul>
* <li>{@code padEnd("4.", 5, '0')} returns {@code "4.000"}
* <li>{@code padEnd("2010", 3, '!')} returns {@code "2010"}
* </ul>
*
* <p>See {@link java.util.Formatter} for a richer set of formatting capabilities.
*
* @param string the string which should appear at the beginning of the result
* @param minLength the minimum length the resulting string must have. Can be zero or negative, in
* which case the input string is always returned.
* @param padChar the character to append to the end of the result until the minimum length is
* reached
* @return the padded string
*/
public static String padEnd(String string, int minLength, char padChar) {
  checkNotNull(string); // eager for GWT.
  if (string.length() >= minLength) {
    return string;
  }
  StringBuilder sb = new StringBuilder(minLength);
  sb.append(string);
  for (int i = string.length(); i < minLength; i++) {
    sb.append(padChar);
  }
  return sb.toString();
}

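Appends the given pad character to a string until it reaches the specified minimum length. If the string is already at least that long, it is returned unchanged.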
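
The two padding directions are often combined to line up text in columns. The following is a rough, Guava-free sketch of that idea: padStart and padEnd are re-implemented with the same logic as above, while the class name Padding and the helper row are hypothetical and exist only for this example.

```java
import java.util.Objects;

// Stand-alone sketch of both padding directions (same logic as shown above).
final class Padding {
  private Padding() {}

  static String padStart(String s, int minLength, char padChar) {
    Objects.requireNonNull(s);
    if (s.length() >= minLength) {
      return s; // already long enough, also covers zero or negative minLength
    }
    StringBuilder sb = new StringBuilder(minLength);
    for (int i = s.length(); i < minLength; i++) {
      sb.append(padChar);
    }
    return sb.append(s).toString();
  }

  static String padEnd(String s, int minLength, char padChar) {
    Objects.requireNonNull(s);
    if (s.length() >= minLength) {
      return s;
    }
    StringBuilder sb = new StringBuilder(minLength).append(s);
    for (int i = s.length(); i < minLength; i++) {
      sb.append(padChar);
    }
    return sb.toString();
  }

  // Hypothetical helper: left-align a label in 8 columns, right-align a number in 4.
  static String row(String label, int value) {
    return padEnd(label, 8, '.') + padStart(Integer.toString(value), 4, ' ');
  }
}
```

With these definitions, row("apple", 42) produces "apple...  42". Note that neither method ever truncates: a label longer than 8 characters is kept whole and simply pushes the number to the right.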

repeat()

/**
* Returns a string consisting of a specific number of concatenated copies of an input string. For
* example, {@code repeat("hey", 3)} returns the string {@code "heyheyhey"}.
*
* @param string any non-null string
* @param count the number of times to repeat it; a nonnegative integer
* @return a string containing {@code string} repeated {@code count} times (the empty string if
* {@code count} is zero)
* @throws IllegalArgumentException if {@code count} is negative
*/
public static String repeat(String string, int count) {
  checkNotNull(string); // eager for GWT.

  if (count <= 1) {
    checkArgument(count >= 0, "invalid count: %s", count);
    return (count == 0) ? "" : string;
  }

  // IF YOU MODIFY THE CODE HERE, you must update StringsRepeatBenchmark
  final int len = string.length();
  final long longSize = (long) len * (long) count;
  final int size = (int) longSize;
  if (size != longSize) {
    throw new ArrayIndexOutOfBoundsException("Required array size too large: " + longSize);
  }

  final char[] array = new char[size];
  string.getChars(0, len, array, 0);
  int n;
  for (n = len; n < size - n; n <<= 1) {
    System.arraycopy(array, 0, array, n, n);
  }
  System.arraycopy(array, 0, array, n, size - n);
  return new String(array);
}

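Returns a string made of the input string repeated the given number of times.

Take the string abc as an example, and suppose we need to repeat it 5 times.

After initialization: len = 3; longSize = 3 * 5 = 15; size = 15; array = new char[15];

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14

The call string.getChars(0, len, array, 0) then copies abc to the start of the array, which now looks like this:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
  a  b  c

On entering the for loop, n = 3 and size - n = 12, so System.arraycopy(array, 0, array, 3, 3) runs and the array becomes:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
  a  b  c  a  b  c

The update n <<= 1 shifts n left by one bit, doubling it to 3 × 2 = 6. Because n = 6 < (size - n = 9), one more copy runs: 6 elements are copied from the head of the array to position 6. The array becomes:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
  a  b  c  a  b  c  a  b  c  a  b  c

n doubles again to 12. Now n = 12 > (size - n = 3), so the loop exits.

Finally, the System.arraycopy call after the loop copies the remaining size - n = 3 elements from the head of the array into positions 12-14, giving:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
  a  b  c  a  b  c  a  b  c  a  b  c  a  b  c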
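
The walkthrough above can be reproduced with a small stand-alone sketch. It mirrors the doubling loop of repeat() without depending on Guava, and adds a hypothetical helper, copyOffsets, that records the position at which each System.arraycopy call writes. The class name RepeatSketch and that helper are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of the doubling copy used by repeat() (not Guava's own source).
final class RepeatSketch {
  private RepeatSketch() {}

  static String repeat(String string, int count) {
    if (count < 0) {
      throw new IllegalArgumentException("invalid count: " + count);
    }
    if (count <= 1) {
      return (count == 0) ? "" : string;
    }
    int len = string.length();
    long longSize = (long) len * (long) count;
    int size = (int) longSize;
    if (size != longSize) { // the product does not fit in an int
      throw new ArrayIndexOutOfBoundsException("Required array size too large: " + longSize);
    }
    char[] array = new char[size];
    string.getChars(0, len, array, 0);
    int n;
    for (n = len; n < size - n; n <<= 1) {
      System.arraycopy(array, 0, array, n, n); // double the filled prefix
    }
    System.arraycopy(array, 0, array, n, size - n); // fill the remaining tail
    return new String(array);
  }

  // Hypothetical helper: destination offsets of every arraycopy call, in order.
  static List<Integer> copyOffsets(int len, int count) {
    List<Integer> offsets = new ArrayList<>();
    int size = len * count;
    int n;
    for (n = len; n < size - n; n <<= 1) {
      offsets.add(n);
    }
    offsets.add(n);
    return offsets;
  }
}
```

For abc repeated 5 times, copyOffsets(3, 5) yields [3, 6, 12], matching the three copies in the walkthrough. Because the filled prefix doubles on every iteration, the number of copies grows with the logarithm of count rather than with count itself.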

commonPrefix()

/**
* Returns the longest string {@code prefix} such that {@code a.toString().startsWith(prefix) &&
* b.toString().startsWith(prefix)}, taking care not to split surrogate pairs. If {@code a} and
* {@code b} have no common prefix, returns the empty string.
*
* @since 11.0
*/
public static String commonPrefix(CharSequence a, CharSequence b) {
  checkNotNull(a);
  checkNotNull(b);

  int maxPrefixLength = Math.min(a.length(), b.length());
  int p = 0;
  while (p < maxPrefixLength && a.charAt(p) == b.charAt(p)) {
    p++;
  }
  if (validSurrogatePairAt(a, p - 1) || validSurrogatePairAt(b, p - 1)) {
    p--;
  }
  return a.subSequence(0, p).toString();
}

Finds the longest common prefix of two strings.

validSurrogatePairAt()

/**
* True when a valid surrogate pair starts at the given {@code index} in the given {@code string}.
* Out-of-range indexes return false.
*/
@VisibleForTesting
static boolean validSurrogatePairAt(CharSequence string, int index) {
  return index >= 0
      && index <= (string.length() - 2)
      && Character.isHighSurrogate(string.charAt(index))
      && Character.isLowSurrogate(string.charAt(index + 1));
}

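Java strings are encoded in UTF-16. UTF-16 represents a Unicode character with either one 16-bit code unit (two bytes) or two 16-bit code units. In the two-unit form, the first unit is called the high surrogate and the second the low surrogate. A char holds only one 16-bit code unit, and commonPrefix compares the strings char by char. So when the first differing character is a 4-byte character, the two strings may share the same high surrogate while their low surrogates differ. The pairs are not the same, which means the characters are not the same, so the common prefix length has to be reduced by one; otherwise the result would end in a dangling high surrogate.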
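
A concrete case helps here. The sketch below re-implements the two methods without Guava and applies them to two strings whose last characters are U+1F600 and U+1F601; both encode to the same high surrogate (D83D) but different low surrogates (DE00 and DE01). The class name CommonPrefixSketch and the helper rawMatchLength are hypothetical and exist only to expose the intermediate value of p.

```java
// Stand-alone sketch of commonPrefix and its surrogate check (not Guava's own source).
final class CommonPrefixSketch {
  private CommonPrefixSketch() {}

  static boolean validSurrogatePairAt(CharSequence s, int index) {
    return index >= 0
        && index <= s.length() - 2
        && Character.isHighSurrogate(s.charAt(index))
        && Character.isLowSurrogate(s.charAt(index + 1));
  }

  // Hypothetical helper: number of leading chars that match, ignoring surrogates.
  static int rawMatchLength(CharSequence a, CharSequence b) {
    int max = Math.min(a.length(), b.length());
    int p = 0;
    while (p < max && a.charAt(p) == b.charAt(p)) {
      p++;
    }
    return p;
  }

  static String commonPrefix(CharSequence a, CharSequence b) {
    int p = rawMatchLength(a, b);
    // If the last matched char is the first half of a surrogate pair, drop it.
    if (validSurrogatePairAt(a, p - 1) || validSurrogatePairAt(b, p - 1)) {
      p--;
    }
    return a.subSequence(0, p).toString();
  }
}
```

For the inputs "a\uD83D\uDE00" and "a\uD83D\uDE01", rawMatchLength returns 2 because the letter a and the shared high surrogate both match, while commonPrefix returns just "a" after stepping back over the split pair.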

commonSuffix()

/**
* Returns the longest string {@code suffix} such that {@code a.toString().endsWith(suffix) &&
* b.toString().endsWith(suffix)}, taking care not to split surrogate pairs. If {@code a} and
* {@code b} have no common suffix, returns the empty string.
*
* @since 11.0
*/
public static String commonSuffix(CharSequence a, CharSequence b) {
  checkNotNull(a);
  checkNotNull(b);

  int maxSuffixLength = Math.min(a.length(), b.length());
  int s = 0;
  while (s < maxSuffixLength && a.charAt(a.length() - s - 1) == b.charAt(b.length() - s - 1)) {
    s++;
  }
  if (validSurrogatePairAt(a, a.length() - s - 1)
      || validSurrogatePairAt(b, b.length() - s - 1)) {
    s--;
  }
  return a.subSequence(a.length() - s, a.length()).toString();
}

Finds the longest common suffix of two strings.
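
The surrogate handling works in mirror image here: the scan runs from the end, so the risk is keeping a low surrogate whose high surrogate did not match. Below is a Guava-free sketch of that case, using U+1F600 (D83D DE00) and U+1F200 (D83C DE00), which share a low surrogate but differ in the high one. The class name CommonSuffixSketch is hypothetical.

```java
// Stand-alone sketch of commonSuffix and its surrogate check (not Guava's own source).
final class CommonSuffixSketch {
  private CommonSuffixSketch() {}

  static boolean validSurrogatePairAt(CharSequence s, int index) {
    return index >= 0
        && index <= s.length() - 2
        && Character.isHighSurrogate(s.charAt(index))
        && Character.isLowSurrogate(s.charAt(index + 1));
  }

  static String commonSuffix(CharSequence a, CharSequence b) {
    int max = Math.min(a.length(), b.length());
    int s = 0;
    while (s < max && a.charAt(a.length() - s - 1) == b.charAt(b.length() - s - 1)) {
      s++;
    }
    // If the char just before the matched suffix starts a surrogate pair,
    // the suffix begins with an orphaned low surrogate, so shorten it by one.
    if (validSurrogatePairAt(a, a.length() - s - 1)
        || validSurrogatePairAt(b, b.length() - s - 1)) {
      s--;
    }
    return a.subSequence(a.length() - s, a.length()).toString();
  }
}
```

For "\uD83D\uDE00x" and "\uD83C\uDE00x", the loop matches x and the shared low surrogate, giving s = 2; the check then finds a valid pair starting at index 0 and reduces s to 1, so the result is "x".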

lenientFormat()
/**
* Returns the given {@code template} string with each occurrence of {@code "%s"} replaced with
* the corresponding argument value from {@code args}; or, if the placeholder and argument counts
* do not match, returns a best-effort form of that string. Will not throw an exception under
* normal conditions.
*
* <p><b>Note:</b> For most string-formatting needs, use {@link String#format String.format},
* {@link java.io.PrintWriter#format PrintWriter.format}, and related methods. These support the
* full range of <a
* href="https://docs.oracle.com/javase/9/docs/api/java/util/Formatter.html#syntax">format
* specifiers</a>, and alert you to usage errors by throwing {@link
* java.util.IllegalFormatException}.
*
* <p>In certain cases, such as outputting debugging information or constructing a message to be
* used for another unchecked exception, an exception during string formatting would serve little
* purpose except to supplant the real information you were trying to provide. These are the cases
* this method is made for; it instead generates a best-effort string with all supplied argument
* values present. This method is also useful in environments such as GWT where {@code
* String.format} is not available. As an example, method implementations of the {@link
* Preconditions} class use this formatter, for both of the reasons just discussed.
*
* <p><b>Warning:</b> Only the exact two-character placeholder sequence {@code "%s"} is
* recognized.
*
* @param template a string containing zero or more {@code "%s"} placeholder sequences. {@code
* null} is treated as the four-character string {@code "null"}.
* @param args the arguments to be substituted into the message template. The first argument
* specified is substituted for the first occurrence of {@code "%s"} in the template, and so
* forth. A {@code null} argument is converted to the four-character string {@code "null"};
* non-null values are converted to strings using {@link Object#toString()}.
* @since 25.1
*/
// TODO(diamondm) consider using Arrays.toString() for array parameters
public static String lenientFormat(
    @Nullable String template, @Nullable Object @Nullable ... args) {
  template = String.valueOf(template); // null -> "null"

  if (args == null) {
    args = new Object[] {"(Object[])null"};
  } else {
    for (int i = 0; i < args.length; i++) {
      args[i] = lenientToString(args[i]);
    }
  }

  // start substituting the arguments into the '%s' placeholders
  StringBuilder builder = new StringBuilder(template.length() + 16 * args.length);
  int templateStart = 0;
  int i = 0;
  while (i < args.length) {
    int placeholderStart = template.indexOf("%s", templateStart);
    if (placeholderStart == -1) {
      break;
    }
    builder.append(template, templateStart, placeholderStart);
    builder.append(args[i++]);
    templateStart = placeholderStart + 2;
  }
  builder.append(template, templateStart, template.length());

  // if we run out of placeholders, append the extra args in square braces
  if (i < args.length) {
    builder.append(" [");
    builder.append(args[i++]);
    while (i < args.length) {
      builder.append(", ");
      builder.append(args[i++]);
    }
    builder.append(']');
  }

  return builder.toString();
}

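Inserts the arguments, in order, at the %s placeholders of the template. If the number of placeholders and the number of arguments do not match, the method returns a best-effort result instead of throwing: leftover arguments are appended in square brackets, and placeholders without an argument are left as they are. By contrast, the JDK's String.format throws an exception when a placeholder has no matching argument.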
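
The mismatch cases are easier to see with concrete inputs. The sketch below is a simplified, Guava-free version of the substitution loop; it deliberately omits the lenientToString step, so it assumes the arguments' toString() methods do not throw. The class name LenientFormatSketch and the helper jdkFormatThrows are hypothetical.

```java
import java.util.IllegalFormatException;

// Simplified stand-alone sketch of lenientFormat (not Guava's own source).
final class LenientFormatSketch {
  private LenientFormatSketch() {}

  static String lenientFormat(String template, Object... args) {
    template = String.valueOf(template); // null -> "null"
    if (args == null) {
      args = new Object[] {"(Object[])null"};
    }
    StringBuilder builder = new StringBuilder(template.length() + 16 * args.length);
    int templateStart = 0;
    int i = 0;
    while (i < args.length) {
      int placeholderStart = template.indexOf("%s", templateStart);
      if (placeholderStart == -1) {
        break; // no placeholders left
      }
      builder.append(template, templateStart, placeholderStart);
      builder.append(args[i++]);
      templateStart = placeholderStart + 2;
    }
    builder.append(template, templateStart, template.length());
    if (i < args.length) { // leftover arguments go into square brackets
      builder.append(" [").append(args[i++]);
      while (i < args.length) {
        builder.append(", ").append(args[i++]);
      }
      builder.append(']');
    }
    return builder.toString();
  }

  // Hypothetical helper: reports whether String.format rejects the same input.
  static boolean jdkFormatThrows(String template, Object... args) {
    try {
      String.format(template, args);
      return false;
    } catch (IllegalFormatException e) {
      return true;
    }
  }
}
```

With this sketch, lenientFormat("%s-%s", "a") gives "a-%s" and lenientFormat("[%s]", "a", "b", "c") gives "[a] [b, c]", whereas jdkFormatThrows("%s-%s", "a") is true because String.format raises MissingFormatArgumentException for the unfilled placeholder.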

lenientToString()

private static String lenientToString(@Nullable Object o) {
  if (o == null) {
    return "null";
  }
  try {
    return o.toString();
  } catch (Exception e) {
    // Default toString() behavior - see Object.toString()
    String objectToString =
        o.getClass().getName() + '@' + Integer.toHexString(System.identityHashCode(o));
    // Logger is created inline with fixed name to avoid forcing Proguard to create another class.
    Logger.getLogger("com.google.common.base.Strings")
        .log(WARNING, "Exception during lenientFormat for " + objectToString, e);
    return "<" + objectToString + " threw " + e.getClass().getName() + ">";
  }
}

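Converts an object to a String and returns it.

If the object is null, the string "null" is returned. Otherwise the object's toString method is called. If that call throws an exception, the method logs a warning and returns a placeholder of the form <ClassName@hexIdentityHash threw ExceptionClassName>. The first part mirrors the default Object.toString() output: the class name, '@', and the object's identity hash code in hexadecimal.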
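
To see the fallback in action, the sketch below feeds the same logic an object whose toString() always fails. It is a stand-alone approximation that leaves out the logging call; the class names LenientToStringSketch and Broken are hypothetical.

```java
// Stand-alone sketch of lenientToString without the logging (not Guava's own source).
final class LenientToStringSketch {
  private LenientToStringSketch() {}

  static String lenientToString(Object o) {
    if (o == null) {
      return "null";
    }
    try {
      return o.toString();
    } catch (Exception e) {
      // Same shape as the default Object.toString(): ClassName@hexIdentityHash
      String objectToString =
          o.getClass().getName() + '@' + Integer.toHexString(System.identityHashCode(o));
      return "<" + objectToString + " threw " + e.getClass().getName() + ">";
    }
  }

  // Hypothetical class whose toString() always throws.
  static final class Broken {
    @Override
    public String toString() {
      throw new IllegalStateException("boom");
    }
  }
}
```

Calling lenientToString(new LenientToStringSketch.Broken()) returns a string that starts with "<", names the Broken class and its identity hash, and ends with " threw java.lang.IllegalStateException>". The hash part varies from run to run, so code should not depend on the exact text.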
