Java 9 New Features Series (String Improvements)
Before Java 9
Before Java 9, the source code of String looked like this:
package java.lang;
As you can see, String is backed internally by a char array, and each character occupies 2 bytes (16 bits).
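To make the two-bytes-per-character cost concrete, here is a minimal sketch that estimates the payload size of a char-array-backed string. The class name `CharArrayFootprint` and its method are hypothetical helpers for illustration, not part of the JDK, and the estimate ignores object headers and array overhead.

```java
// Illustrative sketch only: CharArrayFootprint is a hypothetical helper, not a JDK class.
class CharArrayFootprint {
    // Bytes needed to hold the characters of s in a char[] (2 bytes per char),
    // ignoring object headers and array overhead.
    static int charArrayBytes(String s) {
        return s.length() * Character.BYTES;
    }

    public static void main(String[] args) {
        String text = "hello";
        System.out.println(text.length() + " chars need " + charArrayBytes(text) + " bytes in a char[]");
    }
}
```

For the five-character string "hello", this reports 10 bytes of character data, even though every character would fit in a single byte.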
Java 9
The official feature description (JEP 254, Compact Strings) gives the following explanation:
- Background:
The current implementation of the String class stores characters in a char array, using two bytes (sixteen bits) for each character. Data gathered from many different applications indicates that strings are a major component of heap usage and, moreover, that most String objects contain only Latin-1 characters. Such characters require only one byte of storage, hence half of the space in the internal char arrays of such String objects is going unused.
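In short, data from many applications shows that String objects account for a major share of heap usage, and most of them contain only Latin-1 characters. Such characters need only one byte of storage each, so half of the internal char array in those strings sits idle. Java 9 therefore changed the storage structure of String:
- Description: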
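The observation above can be checked for any given string with a few lines of code. The following is a rough sketch, assuming a hypothetical helper class `Latin1Check`; it treats a string as Latin-1 when every char is at most 0xFF and reports how many bytes a char array would leave unused in that case.

```java
// Illustrative sketch only: Latin1Check is a hypothetical helper, not a JDK class.
class Latin1Check {
    // True when every character fits in one byte (code point <= 0xFF).
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Bytes left unused when a Latin-1 string is stored in a char[]:
    // one spare byte per character. Non-Latin-1 strings waste nothing here.
    static int wastedBytes(String s) {
        return isLatin1(s) ? s.length() : 0;
    }

    public static void main(String[] args) {
        String[] samples = {"hello", "caf\u00e9", "\u4f60\u597d"};
        for (String s : samples) {
            System.out.println(isLatin1(s) + " " + wastedBytes(s));
        }
    }
}
```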
We propose to change the internal representation of the String class from a UTF-16 char array to a byte array plus an encoding-flag field. The new String class will store characters encoded either as ISO-8859-1/Latin-1 (one byte per character), or as UTF-16 (two bytes per character), based upon the contents of the string. The encoding flag will indicate which encoding is used.
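String now stores its characters in a byte array plus an encoding flag, which saves a considerable amount of space for Latin-1 text.
The String source code in Java 9 looks like this: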
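The byte array plus encoding flag idea can be sketched with a small class. This is a simplified illustration, not the JDK implementation: the class name `CompactText`, its method names, and the big-endian byte order chosen for the UTF-16 case are all assumptions made for the example.

```java
// Illustrative sketch only: CompactText is a hypothetical class, not java.lang.String.
final class CompactText {
    static final byte LATIN1 = 0; // one byte per character
    static final byte UTF16 = 1;  // two bytes per character

    private final byte[] value;
    private final byte coder;

    CompactText(String s) {
        boolean latin1 = true;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                latin1 = false;
                break;
            }
        }
        if (latin1) {
            coder = LATIN1;
            value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
        } else {
            coder = UTF16;
            value = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                value[2 * i] = (byte) (c >> 8); // high byte first (assumed order)
                value[2 * i + 1] = (byte) c;    // low byte
            }
        }
    }

    byte coder() {
        return coder;
    }

    // Number of characters: bytes divided by 1 (LATIN1) or 2 (UTF16).
    int length() {
        return value.length >> coder;
    }

    // Bytes actually used to store the characters.
    int storageBytes() {
        return value.length;
    }

    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xFF);
        }
        return (char) (((value[2 * index] & 0xFF) << 8) | (value[2 * index + 1] & 0xFF));
    }
}
```

With this layout, `new CompactText("hello")` needs 5 bytes of character storage instead of 10, while a string containing characters outside Latin-1 still uses two bytes per character, so nothing is lost in that case.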
package java.lang;
String-related classes such as AbstractStringBuilder, StringBuilder, and StringBuffer will be updated to use the same representation, as will the HotSpot VM’s intrinsic string operations.
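String-related classes such as AbstractStringBuilder, StringBuilder, and StringBuffer use the same representation as well. Note that this is part of the same Java 9 change, so it does not have to wait for Java 10 (due for release in March).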
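A builder that shares this representation has one extra concern: it may start out with one byte per character and later receive a character that does not fit. The sketch below shows one way to handle that by widening the buffer on demand. It is a simplified illustration under stated assumptions: `CompactBuilder`, its `inflate` step, the initial capacity of 16, and the big-endian byte order are hypothetical choices for the example, not the JDK source.

```java
// Illustrative sketch only: CompactBuilder is a hypothetical class, not a JDK class.
final class CompactBuilder {
    static final byte LATIN1 = 0; // one byte per character
    static final byte UTF16 = 1;  // two bytes per character

    private byte[] value = new byte[16];
    private byte coder = LATIN1;
    private int count = 0; // number of characters stored so far

    CompactBuilder append(char c) {
        // Switch to two bytes per character the first time a wide char arrives.
        if (coder == LATIN1 && c > 0xFF) {
            inflate();
        }
        ensureCapacity(count + 1);
        if (coder == LATIN1) {
            value[count] = (byte) c;
        } else {
            value[2 * count] = (byte) (c >> 8);
            value[2 * count + 1] = (byte) c;
        }
        count++;
        return this;
    }

    CompactBuilder append(String s) {
        for (int i = 0; i < s.length(); i++) {
            append(s.charAt(i));
        }
        return this;
    }

    // Re-encode the existing Latin-1 bytes as two-byte units.
    private void inflate() {
        byte[] wide = new byte[value.length * 2];
        for (int i = 0; i < count; i++) {
            wide[2 * i] = 0;            // high byte is zero for Latin-1 characters
            wide[2 * i + 1] = value[i]; // low byte is the original byte
        }
        value = wide;
        coder = UTF16;
    }

    // Grow the buffer so it can hold the given number of characters.
    private void ensureCapacity(int chars) {
        int needed = chars << coder;
        if (needed > value.length) {
            value = java.util.Arrays.copyOf(value, Math.max(needed, value.length * 2));
        }
    }

    byte coder() {
        return coder;
    }

    int length() {
        return count;
    }

    @Override
    public String toString() {
        char[] out = new char[count];
        for (int i = 0; i < count; i++) {
            out[i] = (coder == LATIN1)
                    ? (char) (value[i] & 0xFF)
                    : (char) (((value[2 * i] & 0xFF) << 8) | (value[2 * i + 1] & 0xFF));
        }
        return new String(out);
    }
}
```

Appending "abc" keeps the builder in the one-byte mode; appending a character such as '\u4f60' afterwards triggers the widening step once, and later appends simply write two bytes per character.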