After comparing the APIs, we wondered which of the two has the faster split implementation, so I built a simple performance test for the two String split implementations. The result surprised me: in my test case, the StringUtils split method is roughly three times as fast as the Guava Splitter split method.
Test setup: I generate 5000 random strings, each 10000 characters long. The test strings contain commas, which serve as the separator in the test. I invoke the Apache Commons Lang StringUtils.split method and the Guava Splitter with the same test data; the results are shown in the table below.
Test Runs | 1 | 2 | 3 | 4 |
Apache Commons Lang StringUtils.split(…) | 126 ms | 122 ms | 121 ms | 122 ms |
Google Guava Splitter.split(…) | 352 ms | 350 ms | 346 ms | 349 ms |
Here is the source of my simple performance test:
import static org.junit.Assert.assertTrue;

import java.util.LinkedList;
import java.util.List;

import org.apache.commons.lang.RandomStringUtils;
import org.apache.commons.lang.StringUtils;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
import org.springframework.util.StopWatch;

import com.google.common.base.Splitter;

public class StringUtilsTest
{
    static int testDataCount = 5000;
    static int testDataStringSize = 10000;
    static List<String> testData = new LinkedList<String>();
    // Shared stop watch: each test records one named task.
    static StopWatch stopWatch = new StopWatch();

    // Generates the random, comma-containing test strings once for both tests.
    @BeforeClass
    public static void generateTestData() {
        for (int i = 0; i < testDataCount; i++) {
            String data = RandomStringUtils
                    .random(testDataStringSize, "ABCDEFGHIJKLMNOPQRSTUVXYZ,");
            testData.add(data);
        }
    }

    // Prints the timing of both tasks after all tests have run.
    @AfterClass
    public static void printTestSummary() {
        System.out.println(stopWatch.prettyPrint());
    }

    @Test
    public void apacheCommonLangSplit() throws Exception {
        stopWatch.start("Apache Common Lang Split");
        for (String data : testData) {
            String[] elements = StringUtils.split(data, ",");
            // Iterate over the result so every element is actually touched.
            for (String element : elements) {
                assertTrue(element.length() < testDataStringSize);
            }
        }
        stopWatch.stop();
    }

    @Test
    public void guavaSplitterSplit() throws Exception {
        stopWatch.start("Google Guava Splitter");
        Splitter splitter = Splitter.on(",");
        for (String data : testData) {
            Iterable<String> elements = splitter.split(data);
            // The Splitter result is lazy, so the splitting happens during iteration.
            for (String element : elements) {
                assertTrue(element.length() < testDataStringSize);
            }
        }
        stopWatch.stop();
    }
}
Why
Does anybody have an idea why the Guava API is slower than the StringUtils split method in my test? I have read that the Guava Splitter performance should be very good, so I am surprised by the result.
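One factor that may be worth ruling out is JIT warm-up and test order: each test method above runs only once, so its time can include class loading and compilation. The following is only a sketch of a warm-up-then-measure loop using plain JDK classes. `SplitBench`, `generate`, `runOnce` and `bestMillis` are hypothetical names invented for this illustration, and `String.split` is merely a stand-in so the sketch runs without any libraries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

// Hypothetical helper class for illustration; not part of the original test.
final class SplitBench {

    // Builds `count` random strings of `length` characters drawn from `alphabet`.
    static List<String> generate(int count, int length, String alphabet, long seed) {
        Random random = new Random(seed);
        List<String> data = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            StringBuilder sb = new StringBuilder(length);
            for (int j = 0; j < length; j++) {
                sb.append(alphabet.charAt(random.nextInt(alphabet.length())));
            }
            data.add(sb.toString());
        }
        return data;
    }

    // Splits every string once and returns the token count,
    // so the result of the work is actually used.
    static long runOnce(List<String> data, Function<String, Iterable<String>> splitter) {
        long tokens = 0;
        for (String s : data) {
            for (String token : splitter.apply(s)) {
                tokens++;
            }
        }
        return tokens;
    }

    // Runs `warmups` unmeasured passes, then returns the fastest of `runs` measured passes in ms.
    static long bestMillis(List<String> data, Function<String, Iterable<String>> splitter,
                           int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            runOnce(data, splitter);
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            runOnce(data, splitter);
            best = Math.min(best, System.nanoTime() - start);
        }
        return best / 1_000_000;
    }

    public static void main(String[] args) {
        // Assumption: 1000 strings instead of the 5000 used in the test, to keep the sketch quick.
        List<String> data = generate(1000, 10000, "ABCDEFGHIJKLMNOPQRSTUVXYZ,", 42L);
        // Stand-in splitter (JDK only); replace with the library calls to compare them.
        Function<String, Iterable<String>> jdkSplit = s -> Arrays.asList(s.split(","));
        System.out.println("String.split (stand-in): " + bestMillis(data, jdkSplit, 5, 5) + " ms");
    }
}
```

To compare the two libraries, the stand-in could be swapped for `s -> Arrays.asList(StringUtils.split(s, ","))` and for `splitter::split` with the `Splitter` from the test. Note also that the two calls are not strictly equivalent: `StringUtils.split` treats adjacent separators as one and drops empty tokens, while `Splitter.on(",")` keeps empty strings, so Guava returns more elements for the same random input.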
Here are the dependencies I used for the performance test:
<dependencies>
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>12.0</version>
    </dependency>
    <dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.5</version>
    </dependency>
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-beans</artifactId>
        <version>3.1.2.RELEASE</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.10</version>
        <scope>test</scope>
    </dependency>
</dependencies>