Learning StringTable through an interview question

Let's start with the following interview question.

String s1 = "a";
String s2 = "b";
String s3 = "a" + "b";
String s4 = s1 + s2;
String s5 = "ab";
String s6 = s4.intern();

// Question: what does each line print?
System.out.println(s3 == s4);
System.out.println(s3 == s5);
System.out.println(s3 == s6);


String x2 = new String("c") + new String("d");
String x1 = "cd";
x2.intern();

// Question: what if the two lines above (x1 = "cd"; and x2.intern();) are swapped? What about Java 6?
System.out.println(x1 == x2);

The relationship between the constant pool and the string pool

Let's start by analyzing a small demo.

public class Demo02 {
    public static void main(String[] args) {
        String s1 = "a";
        String s2 = "b";
        String s3 = "ab";
    }
}

Then disassemble the class file with the javap command:

javap -v Demo02.class

The disassembled output looks like this:

Classfile /E:/workplace/abc/jvmStudy/out/production/jvmStudy/demo1/Demo02.class
Last modified 2020-2-16; size 482 bytes
MD5 checksum fafaf5d5900d39006ae780bf977babac
Compiled from "Demo02.java"
public class demo1.Demo02
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #6.#24 // java/lang/Object."<init>":()V
#2 = String #25 // a
#3 = String #26 // b
#4 = String #27 // ab
#5 = Class #28 // demo1/Demo02
#6 = Class #29 // java/lang/Object
#7 = Utf8 <init>
#8 = Utf8 ()V
#9 = Utf8 Code
#10 = Utf8 LineNumberTable
#11 = Utf8 LocalVariableTable
#12 = Utf8 this
#13 = Utf8 Ldemo1/Demo02;
#14 = Utf8 main
#15 = Utf8 ([Ljava/lang/String;)V
#16 = Utf8 args
#17 = Utf8 [Ljava/lang/String;
#18 = Utf8 s1
#19 = Utf8 Ljava/lang/String;
#20 = Utf8 s2
#21 = Utf8 s3
#22 = Utf8 SourceFile
#23 = Utf8 Demo02.java
#24 = NameAndType #7:#8 // "<init>":()V
#25 = Utf8 a
#26 = Utf8 b
#27 = Utf8 ab
#28 = Utf8 demo1/Demo02
#29 = Utf8 java/lang/Object
{
public demo1.Demo02();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Ldemo1/Demo02;

public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=4, args_size=1
0: ldc #2 // String a
2: astore_1
3: ldc #3 // String b
5: astore_2
6: ldc #4 // String ab
8: astore_3
9: return
LineNumberTable:
line 5: 0
line 6: 3
line 7: 6
line 8: 9
LocalVariableTable:
Start Length Slot Name Signature
0 10 0 args [Ljava/lang/String;
3 7 1 s1 Ljava/lang/String;
6 4 2 s2 Ljava/lang/String;
9 1 3 s3 Ljava/lang/String;
}
SourceFile: "Demo02.java"

Now find the main method and look at its bytecode:

public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=4, args_size=1
0: ldc #2 // String a
2: astore_1
3: ldc #3 // String b
5: astore_2
6: ldc #4 // String ab
8: astore_3
9: return
LineNumberTable:
line 5: 0
line 6: 3
line 7: 6
line 8: 9
LocalVariableTable:
Start Length Slot Name Signature
0 10 0 args [Ljava/lang/String;
3 7 1 s1 Ljava/lang/String;
6 4 2 s2 Ljava/lang/String;
9 1 3 s3 Ljava/lang/String;
}

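Here, ldc #2 loads the entry at index #2 of the constant pool (the string "a") onto the operand stack.

astore_1 stores the loaded reference into local variable slot 1.

LocalVariableTable is the method's local variable table. Slot 1 corresponds to:
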
3       7     1    s1   Ljava/lang/String;

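The rest of the main method can be read the same way.

The Constant pool section printed by javap is the constant pool stored in the class file. When the class is loaded, the JVM builds the runtime constant pool from it.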

Examining string concatenation through disassembly

Now add one more line of code:

String s4 = s1 + s2;

Recompile Demo02 and disassemble it with javap again. The output is:

Classfile /E:/workplace/abc/jvmStudy/out/production/jvmStudy/demo1/Demo02.class
Last modified 2020-2-16; size 666 bytes
MD5 checksum 8071b7cfbe139cfe3a2c99c48aa13260
Compiled from "Demo02.java"
public class demo1.Demo02
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #10.#29 // java/lang/Object."<init>":()V
#2 = String #30 // a
#3 = String #31 // b
#4 = String #32 // ab
#5 = Class #33 // java/lang/StringBuilder
#6 = Methodref #5.#29 // java/lang/StringBuilder."<init>":()V
#7 = Methodref #5.#34 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#8 = Methodref #5.#35 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#9 = Class #36 // demo1/Demo02
#10 = Class #37 // java/lang/Object
#11 = Utf8 <init>
#12 = Utf8 ()V
#13 = Utf8 Code
#14 = Utf8 LineNumberTable
#15 = Utf8 LocalVariableTable
#16 = Utf8 this
#17 = Utf8 Ldemo1/Demo02;
#18 = Utf8 main
#19 = Utf8 ([Ljava/lang/String;)V
#20 = Utf8 args
#21 = Utf8 [Ljava/lang/String;
#22 = Utf8 s1
#23 = Utf8 Ljava/lang/String;
#24 = Utf8 s2
#25 = Utf8 s3
#26 = Utf8 s4
#27 = Utf8 SourceFile
#28 = Utf8 Demo02.java
#29 = NameAndType #11:#12 // "<init>":()V
#30 = Utf8 a
#31 = Utf8 b
#32 = Utf8 ab
#33 = Utf8 java/lang/StringBuilder
#34 = NameAndType #38:#39 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#35 = NameAndType #40:#41 // toString:()Ljava/lang/String;
#36 = Utf8 demo1/Demo02
#37 = Utf8 java/lang/Object
#38 = Utf8 append
#39 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#40 = Utf8 toString
#41 = Utf8 ()Ljava/lang/String;
{
public demo1.Demo02();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Ldemo1/Demo02;

public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=5, args_size=1
0: ldc #2 // String a
2: astore_1
3: ldc #3 // String b
5: astore_2
6: ldc #4 // String ab
8: astore_3
9: new #5 // class java/lang/StringBuilder
12: dup
13: invokespecial #6 // Method java/lang/StringBuilder."<init>":()V
16: aload_1
17: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
20: aload_2
21: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
27: astore 4
29: return
LineNumberTable:
line 5: 0
line 6: 3
line 7: 6
line 8: 9
line 9: 29
LocalVariableTable:
Start Length Slot Name Signature
0 30 0 args [Ljava/lang/String;
3 27 1 s1 Ljava/lang/String;
6 24 2 s2 Ljava/lang/String;
9 21 3 s3 Ljava/lang/String;
29 1 4 s4 Ljava/lang/String;
}
SourceFile: "Demo02.java"

Let's look at the bytecode generated for the newly added fourth statement:

  1. 9: new #5 // class java/lang/StringBuilder
    Allocates a StringBuilder object with the new instruction.
  2. 13: invokespecial #6 // Method java/lang/StringBuilder."<init>":()V
    Calls the StringBuilder constructor. The descriptor ()V shows that it is the no-argument constructor.
  3. 16: aload_1
    Pushes local variable 1 (s1) onto the operand stack so that the next call can use it.
  4. 17: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    Calls StringBuilder.append.
  5. 20: aload_2
    Pushes s2 onto the stack.
  6. 21: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    Calls append again.
  7. 24: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    Calls StringBuilder.toString.
  8. 27: astore 4
    Stores the result in local variable 4 (s4).
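
Read back as Java source, the steps above correspond roughly to the following. This is a hand-written sketch of what the JDK 8 compiler (major version 52 in the output above) generates for s1 + s2, not its literal output; the class and method names are made up for illustration. Newer javac versions that target Java 9 or later emit an invokedynamic call instead of this StringBuilder chain.

```java
class ConcatDesugared {
    // Hand-written equivalent of `s1 + s2` as compiled by JDK 8 javac.
    static String concat(String s1, String s2) {
        StringBuilder sb = new StringBuilder(); // new, dup, invokespecial <init>
        sb.append(s1);                          // aload_1, invokevirtual append
        sb.append(s2);                          // aload_2, invokevirtual append
        return sb.toString();                   // invokevirtual toString
    }

    public static void main(String[] args) {
        String s3 = "ab";
        String s4 = concat("a", "b");
        System.out.println(s4.equals(s3)); // true: same characters
        System.out.println(s4 == s3);      // false: s4 is a separate heap object
    }
}
```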

Now look at the toString() method of StringBuilder:

@Override
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}

It creates a new String object with the value "ab" from the StringBuilder's internal value array.

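So the result of

System.out.println(s3 == s4);

is clear: it prints false, because s3 refers to the object in the string pool while s4 refers to a newly created object on the heap.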
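
The "new object on every call" behavior of toString() can be checked directly. In the sketch below (the class name is hypothetical), two toString() calls on the same non-empty StringBuilder return strings with equal content that are nevertheless distinct objects, and neither of them is the pooled literal.

```java
class ToStringCopies {
    // Calls toString() twice on one builder and returns both results.
    static String[] build() {
        StringBuilder sb = new StringBuilder().append("a").append("b");
        return new String[] { sb.toString(), sb.toString() };
    }

    public static void main(String[] args) {
        String[] r = build();
        System.out.println(r[0].equals(r[1])); // true: same characters
        System.out.println(r[0] == r[1]);      // false: two separate copies
        System.out.println(r[0] == "ab");      // false: not the pooled literal
    }
}
```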

Compiler optimization

Now define another variable, s5:

String s5 = "a" + "b";

Disassemble again and look at the bytecode of the main method:

 public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=6, args_size=1
0: ldc #2 // String a
2: astore_1
3: ldc #3 // String b
5: astore_2
6: ldc #4 // String ab
8: astore_3
9: new #5 // class java/lang/StringBuilder
12: dup
13: invokespecial #6 // Method java/lang/StringBuilder."<init>":()V
16: aload_1
17: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
20: aload_2
21: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
27: astore 4
29: ldc #4 // String ab
31: astore 5
33: return
LineNumberTable:
line 5: 0
line 6: 3
line 7: 6
line 8: 9
line 9: 29
line 12: 33
LocalVariableTable:
Start Length Slot Name Signature
0 34 0 args [Ljava/lang/String;
3 31 1 s1 Ljava/lang/String;
6 28 2 s2 Ljava/lang/String;
9 25 3 s3 Ljava/lang/String;
29 5 4 s4 Ljava/lang/String;
33 1 5 s5 Ljava/lang/String;
}

Find the instructions for s5:

29: ldc #4 // String ab

31: astore 5

As you can see, "ab" is loaded directly from the constant pool and then stored in the local variable table.

Compare this with the bytecode for String s3 = "ab";

6: ldc #4 // String ab

Both load the same "ab" entry (#4) from the constant pool.

So the following comparison prints true:

System.out.println(s3 == s5); // true

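This is an optimization that javac performs at compile time. "a" and "b" are both constants, so their concatenation has exactly one possible result, and the compiler folds it directly into "ab".

s1 + s2, by contrast, involves variables. Their values may change and cannot be determined at compile time, so the concatenation has to be done at run time with StringBuilder.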
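
One consequence is worth seeing in code: what matters is whether the operands are compile-time constants, not whether they are written as literals. A final local variable initialized with a literal counts as a constant, so javac folds the concatenation just as it does for "a" + "b". The sketch below (class and method names are hypothetical) contrasts the two cases.

```java
class ConstantFolding {
    static boolean foldedWithFinal() {
        final String a = "a";  // constant variable
        final String b = "b";  // constant variable
        String ab = a + b;     // folded to "ab" at compile time
        return ab == "ab";
    }

    static boolean notFoldedWithoutFinal() {
        String a = "a";
        String b = "b";
        String ab = a + b;     // concatenated at run time, new object
        return ab == "ab";
    }

    public static void main(String[] args) {
        System.out.println(foldedWithFinal());       // true
        System.out.println(notFoldedWithoutFinal()); // false
    }
}
```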

When strings are loaded into the string pool

Create a new class Demo03 with the following code:

public class Demo03 {
public static void main(String[] args) {
System.out.println("1");
System.out.println("2");
System.out.println("3");
System.out.println("4");
System.out.println("5");
System.out.println("6");
System.out.println("7");
System.out.println("8");
System.out.println("9");
System.out.println("1");
System.out.println("2");
System.out.println("3");
System.out.println("4");
System.out.println("5");
System.out.println("6");
System.out.println("7");
System.out.println("8");
System.out.println("9");
}
}

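Set a breakpoint on each of the two System.out.println("1"); lines.

Then use the Memory view of the IntelliJ IDEA debugger to inspect the object counts.

[Screenshot: debug-memory-01]

Click Load classes to see how many instances of each class exist at the current point of execution.

[Screenshot: debug-memory-02]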

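At this point there are 2252 String objects.

Step over once and the count becomes 2254 (the System.out.println call creates a class-name string along the way). Step once more and it becomes 2255.

At the next breakpoint the String count is 2262, and stepping further does not change it.

This observation demonstrates two things:

  1. String objects for literals are not created ahead of time. Each one is created lazily, the first time the literal is used.
  2. If the string pool already contains a matching string object, that object is reused and no new one is created. (This applies to literals such as "a". new String("a") still creates a new object as usual.)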
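
The second point can also be checked without a debugger. In this sketch (all class names are hypothetical), the same literal appearing in two different classes resolves to one pooled object, while new String("a") allocates a separate one.

```java
class LiteralInA { static String value() { return "a"; } }
class LiteralInB { static String value() { return "a"; } }

class PoolReuse {
    // Equal literals share one pooled object, even across classes.
    static boolean literalsShared() {
        return LiteralInA.value() == LiteralInB.value();
    }

    // new String(...) creates a distinct object with equal content.
    static boolean newStringIsSeparate() {
        String fresh = new String("a");
        return fresh != LiteralInA.value() && fresh.equals(LiteralInA.value());
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());      // true
        System.out.println(newStringIsSeparate()); // true
    }
}
```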

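Summary of StringTable behavior

  1. Strings in the constant pool are only symbols. They become objects the first time they are used.
  2. The string pool mechanism avoids creating duplicate string objects.
  3. Concatenation of string variables is implemented with StringBuilder.append (as in the JDK 8 bytecode above).
  4. Concatenation of string constants is folded at compile time.
  5. The intern() method can be used to actively put a string that is not yet in the string pool into it.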

The intern() method in Java 8

Create a new test class Demo04 with the following code:

public class Demo04 {

public static void main(String[] args) {
String s = new String("a") + new String("b");

String s2 = s.intern();

System.out.println(s == "ab");
System.out.println(s2 == "ab");
}
}

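Let's analyze it step by step.

  1. When String s = new String("a") + new String("b"); executes:
    The string pool contains the "a" and "b" objects.
    The heap contains new String("a") and new String("b"), plus the concatenated result, which is equivalent to new String("ab").
  2. String s2 = s.intern();
    intern() tries to put this string object into the string pool. If an equal string is already there, nothing is added. Either way, the object in the pool is returned.
    The pool now has an "ab" entry, and that entry is the very same object as the concatenated new String("ab").

So both comparisons print true.

Now suppose we add String x = "ab"; before the call to s.intern() and analyze again.

When s.intern() executes, the pool already contains "ab", so s2 refers to that pooled object while s still refers to the separate heap object. The two comparisons therefore print false and true.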
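
Both orderings can be put side by side in one program. The sketch below assumes a HotSpot JVM running Java 7 or later. Instead of comparing against a literal, it compares the return value of intern() with the original reference, which exposes the same mechanism without depending on when a literal is first resolved. The class name, method names, and the "st-" strings are invented for this example, and the first method only returns true if no equal string has been pooled before the call.

```java
class InternOrder {
    // Case 1: no equal string is in the pool yet.
    static boolean internReturnsSameObject(String suffix) {
        String s = new String("st-") + new String(suffix); // heap object
        String s2 = s.intern(); // Java 7+: s itself is registered in the pool
        return s2 == s;
    }

    // Case 2: an equal string was pooled earlier (stands in for String x = "ab").
    static boolean internReturnsOtherObject(String suffix) {
        String x = ("st-" + suffix).intern();              // pooled first
        String s = new String("st-") + new String(suffix); // separate heap object
        String s2 = s.intern(); // already pooled: returns x, s is left alone
        return s2 == x && s2 != s;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsSameObject("one"));  // true
        System.out.println(internReturnsOtherObject("two")); // true
    }
}
```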

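The intern() method in Java 1.6

In Java 1.6, intern() also tries to put the string into the string pool. If an equal string is already there, nothing is added. If not, a copy of the current object is made and the copy is placed in the pool. In other words, the object in the pool is not the object on which intern() was called.

I won't demonstrate this one in detail (being lazy...).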
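
For readers who want to see the difference anyway, the two policies can at least be modeled. The following is a toy pool built on a HashMap. It is purely illustrative and is not how the JVM's StringTable is implemented; the class ToyPool and its copyOnIntern flag are invented for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class ToyPool {
    private final Map<String, String> table = new HashMap<>();
    private final boolean copyOnIntern; // true models Java 6, false models Java 7+

    ToyPool(boolean copyOnIntern) {
        this.copyOnIntern = copyOnIntern;
    }

    String intern(String s) {
        String pooled = table.get(s);
        if (pooled != null) {
            return pooled;                         // already present: return pooled object
        }
        pooled = copyOnIntern ? new String(s) : s; // Java 6 style stores a copy
        table.put(pooled, pooled);
        return pooled;
    }

    public static void main(String[] args) {
        String x2 = new String("cd");
        System.out.println(new ToyPool(true).intern(x2) == x2);  // false: a copy was pooled
        System.out.println(new ToyPool(false).intern(x2) == x2); // true: x2 itself was pooled
    }
}
```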

Answers to the interview question

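public class Demo01 {
    public static void main(String[] args) {
        String s1 = "a";
        String s2 = "b";
        String s3 = "a" + "b"; // folded at compile time; "ab" is in the string pool
        String s4 = s1 + s2; // StringBuilder concatenation; a new "ab" object on the heap
        String s5 = "ab";
        String s6 = s4.intern(); // returns the "ab" object from the string pool

        // Question: what does each line print?
        System.out.println(s3 == s4); // false
        System.out.println(s3 == s5); // true
        System.out.println(s3 == s6); // true


        String x2 = new String("c") + new String("d"); // "cd" object on the heap
        String x1 = "cd"; // "cd" object in the string pool
        x2.intern();

        // Question: what if the two lines above (x1 = "cd"; and x2.intern();) are swapped? What about Java 6?
        System.out.println(x1 == x2); // false
        // With those two lines swapped: true
        // On Java 6: false in either order, because intern() pools a copy rather than x2 itself
    }
}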