Java StringTokenizer respects exactly five whitespace characters and nothing more

What does the following mean?

... and StringTokenizer respects exactly five whitespace characters and nothing more.

http://code.google.com/p/guava-libraries/wiki/StringsExplained#Splitter

+3
4 answers

Presumably this means that StringTokenizer splits on \n, \r, space, tab, and form feed by default. From the source for the simplest constructor:

this(str, " \t\n\r\f", false);
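A minimal sketch of what that default delimiter string implies in practice. The class and helper names here are made up for illustration; only the single-argument `StringTokenizer` constructor comes from the JDK. The vertical tab and the em space are just two examples of whitespace-like characters that fall outside the five defaults.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DefaultDelimitersDemo {
    // Collects every token produced with the default delimiter set " \t\n\r\f".
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Each of the five default delimiters separates tokens.
        System.out.println(tokens("a b\tc\nd\re\ff"));   // [a, b, c, d, e, f]
        // Vertical tab (U+000B) and em space (U+2003) are not in the default set,
        // so each of these inputs comes back as a single token.
        System.out.println(tokens("a\u000Bb").size());   // 1
        System.out.println(tokens("a\u2003b").size());   // 1
    }
}
```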


+9

By default, StringTokenizer splits only on space, \n, \r, \t and \f.

+5

... The behavior is described in the Javadoc for StringTokenizer, but that doesn't make it any less annoying.

But I think Guava's argument extends well beyond the unexpected behavior in this one case. The Java API as a whole is absurdly inconsistent in how it defines whitespace, which is why they created CharMatcher.WHITESPACE. Check out the list of the various definitions compiled by Guava's Kevin Bourrillion.
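To make the inconsistency concrete, here is a small sketch that asks four JDK facilities whether a given character counts as whitespace. It uses only standard library calls (`Character.isWhitespace`, `Character.isSpaceChar`, the regex class `\s` without any flags, and `String.trim`); the class name, the `classify` helper, and the choice of sample characters are illustrative assumptions, not something taken from the Guava wiki.

```java
class WhitespaceDefinitionsDemo {
    // Returns "isWhitespace,isSpaceChar,regex \s,removed by trim" for one character.
    static String classify(char c) {
        String s = String.valueOf(c);
        boolean isWhitespace = Character.isWhitespace(c);
        boolean isSpaceChar = Character.isSpaceChar(c);
        boolean matchesRegex = s.matches("\\s");      // default \s, no Unicode flag
        boolean removedByTrim = s.trim().isEmpty();   // trim drops chars <= U+0020
        return isWhitespace + "," + isSpaceChar + "," + matchesRegex + "," + removedByTrim;
    }

    public static void main(String[] args) {
        // No-break space (U+00A0)
        System.out.println(classify('\u00A0'));   // false,true,false,false
        // Vertical tab (U+000B)
        System.out.println(classify('\u000B'));   // true,false,true,true
        // Em space (U+2003)
        System.out.println(classify('\u2003'));   // true,true,false,false
    }
}
```

None of the three sample characters gets the same answer from all four checks, and none of them is a default StringTokenizer delimiter either.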

+4

I assume the "five whitespace characters" they refer to are: space, \t, \r, \n and \f.
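If those five are too narrow for a given input, one possible workaround is the two-argument `StringTokenizer(String str, String delim)` constructor, which replaces the default set with whatever you pass in. The sketch below is only an illustration: the class name and the extra delimiters (vertical tab and no-break space) are assumptions chosen for the example, not a recommended canonical set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class CustomDelimitersDemo {
    // The five defaults plus vertical tab (U+000B) and no-break space (U+00A0).
    static final String EXTENDED_DELIMITERS = " \t\n\r\f\u000B\u00A0";

    // Tokenizes the input using the caller-supplied delimiter characters.
    static List<String> tokens(String input, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a\u000Bb\u00A0c", EXTENDED_DELIMITERS));   // [a, b, c]
    }
}
```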

+1