The Java Specialists' Newsletter
Issue 014 2001-03-21
Category: Language
Java version:
Insane Strings
by Dr. Heinz M. Kabutz
Welcome to the 14th issue of "The Java(tm) Specialists'
Newsletter", where we look at things that other newsletters would
not dare to mention. Please please use the ideas presented in
this newsletter with caution, they can really mess up a project
if used (incorrectly). A lot of them are educational rather than
practical so that we can understand Java better. This is true
especially for this newsletter, in which we mutate Strings.
Thanks if you forwarded last week's newsletter to some of your
colleagues and friends, I got a few subscriptions as a result
of referrals. The membership has been growing steadily since
I started this newsletter in November 2000. It has now broken
through the 400 barrier and I'm thinking of moving it to a proper
list server before reaching 500. My main concern is that if it
is a free list server it might start sending you spam, which I
really would not like. Any suggestions of free list servers that
do not generate spam would be most welcome. My current
preference is for Topica - which I will choose unless I hear from
you.
Playing with your sanity - Strings
Have a look at the following code:
public class MindWarp {
  public static void main(String[] args) {
    System.out.println(
      "Romeo, Romeo, wherefore art thou oh Romero?");
  }
  private static final String OH_ROMEO =
    "Romeo, Romeo, wherefore art thou oh Romero?";
  private static final Warper warper = new Warper();
}
If we are told that the class Warper does not produce any visible
output when you construct it, what is the output of this program?
The most correct answer is, "you don't know, depends on what
Warper does". Now THERE's a nice question for the Sun Certified
Java Programmer Examination.
In my case, running "java MindWarp" produces the following output:
C:> java MindWarp <ENTER>
Stop this romance nonsense, or I'll be sick
And here is the code for Warper:
import java.lang.reflect.*;

public class Warper {
  private static Field stringValue;
  static {
    // String has a private char [] called "value"
    // if it does not, find the char [] and assign it to value
    try {
      stringValue = String.class.getDeclaredField("value");
    } catch(NoSuchFieldException ex) {
      // safety net in case we are running on a VM with a
      // different name for the char array.
      Field[] all = String.class.getDeclaredFields();
      for (int i=0; stringValue == null && i<all.length; i++) {
        if (all[i].getType().equals(char[].class)) {
          stringValue = all[i];
        }
      }
    }
    if (stringValue != null) {
      stringValue.setAccessible(true); // make field public
    }
  }
  public Warper() {
    try {
      stringValue.set(
        "Romeo, Romeo, wherefore art thou oh Romero?",
        "Stop this romance nonsense, or I'll be sick".
          toCharArray());
      stringValue.set("hi there", "cheers !".toCharArray());
    } catch(IllegalAccessException ex) {} // shhh
  }
}
How is this possible? How can String manipulation in a
completely different part of the program affect our class
MindWarp?
To understand that, we have to look under the hood of Java. In
the language specification it says in §3.10.5:
"Each string literal is a reference (§4.3) to an instance
(§4.3.1, §12.5) of class String (§4.3.3). String objects have a
constant value. String literals-or, more generally, strings that
are the values of constant expressions (§15.28)-are "interned" so
as to share unique instances, using the method String.intern."
The usefulness of this is quite obvious: we use less memory if
two equivalent Strings point at the same object. We can also
manually intern Strings by calling the intern() method.
The language spec goes a bit further:
- Literal strings within the same class (§8) in the same package
(§7) represent references to the same String object (§4.3.1).
- Literal strings within different classes in the same package
represent references to the same String object.
- Literal strings within different classes in different packages
likewise represent references to the same String object.
- Strings computed by constant expressions (§15.28) are computed
at compile time and then treated as if they were literals.
- Strings computed at run time are newly created and therefore
distinct.
- The result of explicitly interning a computed string is the
same string as any pre-existing literal string with the same
contents.
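The rules above can be checked with a small sketch that compares references with == instead of equals(). The class name InternDemo and its compare() method are made up for illustration and are not part of the original newsletter.

```java
public class InternDemo {
  /**
   * Returns four identity comparisons against the literal "hi there":
   * [0] a second identical literal,
   * [1] a compile-time constant expression,
   * [2] a string concatenated at run time,
   * [3] the run-time string after calling intern().
   */
  public static boolean[] compare() {
    String literal = "hi there";
    String sameLiteral = "hi there";
    String constantExpr = "hi " + "there"; // folded by the compiler
    String there = "there";                // not final, so not a constant
    String computed = "hi " + there;       // built at run time
    return new boolean[] {
      literal == sameLiteral,
      literal == constantExpr,
      literal == computed,
      literal == computed.intern()
    };
  }

  public static void main(String[] args) {
    boolean[] r = compare();
    // prints: true true false true
    System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
  }
}
```

Only the string concatenated at run time is a distinct object, and interning it hands back the shared literal instance.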
This means that if a class in another package "fiddles" with an
interned String, it can cause havoc in your program. Is this a
good thing? (You don't need to answer ;-)
Consider this example:
public class StringEquals {
  public static void main(String[] args) {
    System.out.println("hi there".equals("cheers !"));
  }
  private static final String greeting = "hi there";
  private static final Warper warper = new Warper();
}
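Running this against the Warper produces a result of true, which
is really weird, and in my opinion, quite mind-bending. Hey, you
can SEE the values there right in front of you and they are
clearly NOT equal! (By the time main() runs, the static
initializer has already constructed the Warper, so the interned
"hi there" object holds the characters of "cheers !".)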
BTW, for simplicity, the Strings in my examples are exactly the
same length, but you can change the length quite easily as well.
The last example concerns the hash code of String, which is now
cached for performance reasons mentioned in "Java Idiom and
Performance Guide", ISBN 0130142603. (Just for the record, I was
never and am still not convinced that caching the String hash
code in a wrapper object is a good idea, but caching it in String
itself is almost acceptable, considering String literals.)
public class CachingHashcode {
  public static void main(String[] args) {
    java.util.Map map = new java.util.HashMap();
    map.put("hi there", "You found the value");
    new Warper();
    System.out.println(map.get("hi there"));
    System.out.println(map);
  }
  private static final String greeting = "hi there";
}
The output under JDK 1.3 is:
You found the value
{cheers !=You found the value}
Under JDK 1.2 it is:
null
{cheers !=You found the value}
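This is because JDK 1.3 caches the hash code inside the String:
once it has been calculated it does not get recalculated, so if
the value field changes, the hash code stays the same and the
lookup still lands on the original entry. Under JDK 1.2 the hash
code is recalculated from the mutated characters, so the lookup
searches under the hash of "cheers !" and misses the entry that
was stored under the hash of "hi there".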
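The difference between the two JDK versions can be imitated without touching String at all. The sketch below uses a hypothetical mutable Key class (my own invention, not JDK code) that either caches its hash code like the JDK 1.3 String or recalculates it on every call like the JDK 1.2 String.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CachedHashDemo {
  /** Hypothetical stand-in for a String whose characters can be swapped. */
  static final class Key {
    private char[] value;
    private final boolean cacheHash;
    private int hash; // 0 means "not calculated yet"

    Key(String s, boolean cacheHash) {
      this.value = s.toCharArray();
      this.cacheHash = cacheHash;
    }

    /** Plays the role of the Warper: replaces the characters in place. */
    void mutate(String s) { value = s.toCharArray(); }

    private int calculate() {
      int h = 0;
      for (char c : value) h = 31 * h + c; // same formula as String.hashCode()
      return h;
    }

    @Override public int hashCode() {
      if (!cacheHash) return calculate();
      if (hash == 0) hash = calculate();
      return hash;
    }

    @Override public boolean equals(Object o) {
      return o instanceof Key && Arrays.equals(value, ((Key) o).value);
    }

    @Override public String toString() { return new String(value); }
  }

  /** Stores a key, mutates it, then looks it up with the same object. */
  public static String lookupAfterMutation(boolean cacheHash) {
    Map<Key, String> map = new HashMap<>();
    Key key = new Key("hi there", cacheHash);
    map.put(key, "You found the value");
    key.mutate("cheers !");
    return map.get(key);
  }

  public static void main(String[] args) {
    System.out.println(lookupAfterMutation(true));  // cached hash
    System.out.println(lookupAfterMutation(false)); // recalculated hash
  }
}
```

With the cached hash the entry is still found. With the recalculated hash the lookup returns null, because the map compares the stored hash of "hi there" against the new hash of "cheers !".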
Imagine trying to debug this program where SOMEWHERE, one of your
hackers has done a "workaround" by modifying a String literal.
The thought scares me.
[Heinz: Author's note: the comment below on using "final" to solve
this problem is not correct. Firstly, you cannot make arrays
immutable, which is a design flaw in Java, so you could still
change the content of the array even if the handle were final.
Secondly, in JDK 1.5, you can set final fields using reflection.
See Java 5 - "final" is not final anymore, and for a similar
contortion with autoboxing see Mangling Integers.]
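The first point of that note needs no reflection to demonstrate. In this sketch (FinalArrayDemo and its methods are hypothetical names chosen for illustration), the field is final, yet anyone holding the array reference can still change its contents. Handing out a copy is one common way to guard against that.

```java
public class FinalArrayDemo {
  // "final" fixes the reference, not the contents of the array
  private final char[] value = "bye".toCharArray();

  /** Hands out the internal array itself, so callers can modify it. */
  public char[] leak() { return value; }

  /** Hands out a copy, leaving the internal array untouched. */
  public char[] copy() { return value.clone(); }

  @Override public String toString() { return new String(value); }

  public static void main(String[] args) {
    FinalArrayDemo demo = new FinalArrayDemo();
    demo.copy()[0] = 'B';      // changes only the copy
    System.out.println(demo);  // prints: bye
    demo.leak()[0] = 'B';      // changes the "final" array's contents
    System.out.println(demo);  // prints: Bye
  }
}
```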
There is of course a small keyword that would have stopped this
problem, namely "final". I got into the habit a few months ago
of making all my data members final where possible, and it has
paid off more than once. Surprisingly, the char array in String
is not final.
Consider the following example code:
public class Bla {
  private char[] c1 = "hello".toCharArray();
  private final char[] c2 = "bye".toCharArray();
  public String toString() {
    // build Strings from the arrays; c1 + ", " + c2 would only
    // print the arrays' identity, not their contents
    return new String(c1) + ", " + new String(c2);
  }
}

import java.lang.reflect.*;

public class BlaTest {
  private static Field c1;
  private static Field c2;
  static {
    try {
      c1 = Bla.class.getDeclaredField("c1");
      c1.setAccessible(true);
      c2 = Bla.class.getDeclaredField("c2");
      c2.setAccessible(true);
    } catch(NoSuchFieldException ex) { }
  }
  public static void main(String[] args) {
    Bla bla = new Bla();
    try {
      c1.set(bla, "mutatedc1".toCharArray());
      c2.set(bla, "mutatedc2".toCharArray());
    } catch(IllegalAccessException ex) {
      ex.printStackTrace();
    }
    System.out.println(bla);
  }
}
When I run my program, I can quite happily change c1, but when I
try to change c2 I get an exception. String has no reason for
value to be non-final, so it should be final. If you have
contacts at SUN, please forward them this newsletter and ask them
to make value final. It might stop some nasty Java viruses from
completely messing up the JVM.
Until next week, and please remember to forward this newsletter
and send me your comments.
Heinz