Java 9 New Features Series (String Improvements)

Before Java 9

Before Java 9, the source of String looked like this:

package java.lang;

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
    ...
}

As you can see, String is backed internally by a char array, and each char takes 2 bytes (16 bits).
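To make the two-bytes-per-char cost concrete, here is a minimal sketch. The class name CharWidthDemo and the helper charArrayPayloadBytes are made up for this example, and the calculation counts only the character payload, ignoring the array and object headers.

```java
class CharWidthDemo {
    // Bytes used by the character payload of a char[]-backed string.
    // Hypothetical helper: headers and alignment padding are ignored.
    static long charArrayPayloadBytes(String s) {
        return (long) s.length() * Character.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);                // 2
        System.out.println(Character.SIZE);                 // 16
        System.out.println(charArrayPayloadBytes("hello")); // 10
    }
}
```

Even a plain ASCII string such as "hello" spends 10 bytes on 5 characters under this layout.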

Java 9

The official feature description (JEP 254: Compact Strings) explains the change as follows:

  • Motivation:

The current implementation of the String class stores characters in a char array, using two bytes (sixteen bits) for each character. Data gathered from many different applications indicates that strings are a major component of heap usage and, moreover, that most String objects contain only Latin-1 characters. Such characters require only one byte of storage, hence half of the space in the internal char arrays of such String objects is going unused.

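In other words, data from many applications shows that String objects account for a major share of heap usage and that most of them contain only Latin-1 characters. Such characters need only one byte of storage, so half the space in the internal char arrays of those strings goes unused. Java 9 therefore changed the storage structure of String:

  • Description: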

We propose to change the internal representation of the String class from a UTF-16 char array to a byte array plus an encoding-flag field. The new String class will store characters encoded either as ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes per character), based upon the contents of the string. The encoding flag will indicate which encoding is used.

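String now stores its characters in a byte array plus an encoding flag. A string that contains only Latin-1 characters therefore needs half as much array space as before.

The Java 9 source of String looks like this: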
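The idea of a byte array plus an encoding flag can be sketched as follows. This is a toy illustration, not the JDK implementation: the class CompactText, its methods, and the constant names are this example's own choices, and it assumes big-endian byte order for the two-byte case.

```java
final class CompactText {
    static final byte LATIN1 = 0; // one byte per character
    static final byte UTF16 = 1;  // two bytes per character

    private final byte[] value;
    private final byte coder;     // the encoding flag

    CompactText(String s) {
        // Use the one-byte form only if every char fits in a single byte.
        boolean latin1 = true;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) { latin1 = false; break; }
        }
        coder = latin1 ? LATIN1 : UTF16;
        value = new byte[s.length() << coder];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (latin1) {
                value[i] = (byte) c;
            } else {
                value[2 * i] = (byte) (c >>> 8); // high byte first (assumed order)
                value[2 * i + 1] = (byte) c;
            }
        }
    }

    byte coder() { return coder; }

    int byteLength() { return value.length; }

    // Character count: the shift halves the byte count in the two-byte case.
    int length() { return value.length >> coder; }

    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xFF);
        }
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(length());
        for (int i = 0; i < length(); i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CompactText a = new CompactText("hello");
        CompactText b = new CompactText("hi\u4F60");
        System.out.println(a.byteLength()); // 5
        System.out.println(b.byteLength()); // 6
        System.out.println(b.length());     // 3
    }
}
```

Note that a single non-Latin-1 character switches the whole string to two bytes per character, which matches the "based upon the contents of the string" wording in the description.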

package java.lang;

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

    /**
     * The value is used for character storage.
     *
     * @implNote This field is trusted by the VM, and is a subject to
     * constant folding if String instance is constant. Overwriting this
     * field after construction will cause problems.
     *
     * Additionally, it is marked with {@link Stable} to trust the contents
     * of the array. No other facility in JDK provides this functionality (yet).
     * {@link Stable} is safe here, because value is never null.
     */
    @Stable
    private final byte[] value;
    ...
}

String-related classes such as AbstractStringBuilder, StringBuilder, and StringBuffer will be updated to use the same representation, as will the HotSpot VM’s intrinsic string operations.

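String-related classes such as AbstractStringBuilder, StringBuilder, and StringBuffer use the same representation as well. This is already the case in JDK 9, where AbstractStringBuilder stores its contents in a byte array with a coder field, so there is no need to wait for Java 10.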
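Whatever the builders store internally, the public API behaves the same. The small check below (the class name BuilderDemo is made up for this example) appends a non-Latin-1 character to a builder that so far held only Latin-1 characters and reads the result back through the usual methods.

```java
class BuilderDemo {
    // Start with Latin-1-only content, then append one character outside Latin-1.
    static String build() {
        StringBuilder sb = new StringBuilder("abc");
        sb.append('\u4F60');
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = build();
        System.out.println(s.length());              // 4
        System.out.println(s.charAt(3) == '\u4F60'); // true
    }
}
```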