Question
I want to ask a question about avoiding duplicate Strings in Java.
The context is an XML document with tags and attributes like this one:
<product id="PROD" name="My Product"...></product>
With JiBX, this XML is marshalled/unmarshalled to and from a class like this:
public class Product {
    private String id;
    private String name;
    // constructor, getters, setters, methods and so on
}
The program is a long-running batch process, so Product objects are created, used, copied, etc.
Well, the question is: when I analysed the execution with a tool like Eclipse Memory Analyzer (MAT), I found several duplicated Strings. For example, in the id attribute, the value PROD is duplicated across around 2000 instances, and so on.
How can I avoid this situation? Other attributes in the Product class may change their value during execution, but attributes like id, name... don't change so frequently.
I have read something about the String.intern() method, but I haven't used it yet and I'm not sure it's a solution for this. Could I define the most frequent values of those attributes as static final constants in the class?
I hope I have expressed my question clearly. Any help or advice is much appreciated. Thanks in advance.
Recommended answer
Interning would be the right solution, if you really have a problem. Java keeps String literals (and other compile-time constant Strings) in an internal pool. When such a String is needed, the JVM first checks whether an equal String is already in the pool; if so, it does not create a new instance but hands out the reference to the interned String object. Strings built at runtime are not pooled automatically; they only end up in the pool when intern() is called on them.
There are two ways to control this behaviour:
String interned = aString.intern(); // returns a reference to the pooled String with the same content
String notInterned = new String(aString); // creates a new String instance (guaranteed)
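The difference between the two calls can be observed by comparing references with ==. The following is only a minimal sketch; the class name InternDemo and the helper sameObject are made up for illustration.

```java
class InternDemo {
    // Returns true when both references point to the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "PROD";            // literal: comes from the pool
        String copy = new String(literal);  // always a new object on the heap
        String interned = copy.intern();    // pooled instance with the same content

        System.out.println(sameObject(literal, copy));     // false
        System.out.println(sameObject(literal, interned)); // true
        System.out.println(literal.equals(copy));          // true
    }
}
```

The content is equal in all three cases; only the object identity differs, and identity is what a heap analyzer counts as a duplicate.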
So maybe the libraries really do create new instances for all XML attribute values. This is possible, and you won't be able to change it inside the library.
intern has a global effect. An interned String is immediately available "to any object" (this view doesn't really make sense, but it may help to understand it).
So, let's say we have this line in class Foo, method foolish:
String s = "ABCD";
String literals are always interned. The JVM checks whether "ABCD" is already in the pool; if not, "ABCD" is stored in the pool. The JVM then assigns a reference to the interned String to s.
Now, maybe in another class Bar, in method barbar:
String t = "AB"+"CD";
Here "AB" + "CD" is a compile-time constant expression, so the compiler folds it into the single constant "ABCD". That constant is treated like any other literal: the JVM looks whether it is interned already (hey, yes it is) and assigns the reference to the interned String "ABCD" to t. A concatenation whose parts are only known at runtime is different: it produces a new, non-interned String.
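A small sketch can make the literal and concatenation cases concrete. It relies on the language rule that a concatenation of two literals is a constant expression and is therefore pooled, while a concatenation of variables creates a new String at runtime. The class and method names (ConcatDemo, constantConcat, runtimeConcat) are made up for illustration.

```java
class ConcatDemo {
    static final String LITERAL = "ABCD";

    // Constant expression: compiled as the single pooled constant "ABCD".
    static String constantConcat() {
        return "AB" + "CD";
    }

    // The parameters are not compile-time constants, so this creates
    // a new String object that is not taken from the pool.
    static String runtimeConcat(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String t = constantConcat();
        String u = runtimeConcat("AB", "CD");

        System.out.println(t == LITERAL);          // true
        System.out.println(u == LITERAL);          // false
        System.out.println(u.intern() == LITERAL); // true
    }
}
```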
Calling "PROD".intern() on its own may or may not help. Yes, it will intern the String "PROD". But there's a chance that JiBX really creates new Strings for attribute values, with something like
String value = new String(getAttributeValue(attribute));
In that case, value will not hold a reference to the interned String (even if "PROD" is in the pool) but a reference to a new String instance on the heap.
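If the library does hand over fresh String instances, one possible workaround is to intern the value at the point where it enters your own class. The sketch below assumes that the attribute values reach Product through its setters; whether your JiBX binding uses the setters or writes the fields directly is an assumption you would have to check in your binding definition.

```java
class Product {
    private String id;
    private String name;

    // Assumption: the unmarshaller (or your own code) sets values via these setters.
    public void setId(String id) {
        // intern() returns the pooled instance, so equal ids share one object
        this.id = (id == null) ? null : id.intern();
    }

    public void setName(String name) {
        this.name = (name == null) ? null : name.intern();
    }

    public String getId() { return id; }

    public String getName() { return name; }

    public static void main(String[] args) {
        Product a = new Product();
        Product b = new Product();
        a.setId(new String("PROD")); // simulates a fresh instance from the parser
        b.setId(new String("PROD"));
        System.out.println(a.getId() == b.getId()); // true
    }
}
```

This only pays off for attributes with few distinct values that repeat often, such as id in the question; interning values that are mostly unique just adds entries to the pool.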
And, to the other question in your comment: this happens at runtime only. Compiling simply creates class files; the String pool is a data structure on the heap, used by the JVM that executes the application.