- 05.06.2011 1:30 pm - Initial post with Groovy, Java, Scala, awk, Ruby and Python implementations.
- 05.06.2011 4:00 pm - Added the PHP implementation and updated the voting (you can now vote for PHP).
- 06.06.2011 4:00 pm - Added the Bash implementation and updated the voting (you can now also vote for Bash).
Which is the best programming language for converting a simple CSV into another format?
I first blogged three JVM-based solutions, written in Groovy, Java and Scala, that convert a simple CSV file into another format; Rainer sent me the Java-based one. Yesterday Axel Knauf sent an awk-based solution, and since then Niko has sent a Ruby-based solution, Hendrik a Python-based solution, Sebastian a PHP implementation and Julien a Bash version. So there are now Groovy, Java, Scala, awk, Ruby, Python, PHP and Bash implementations.
Here is a complete overview of the different implementations:
The Groovy Implementation:
new File("output.csv") withPrintWriter { out ->
    new File("input.csv") splitEachLine(';') { fields ->
        def name = fields[2]
        def firstname = fields[1]
        def kto = fields[3]
        def blz = fields[4]
        def amount = fields[5]
        out.println "${name};${firstname} ${name};${kto};${blz};${amount}"
    }
}
The Java Implementation:
import java.io.*;

public class CsvConvertor {
    public static void main(String[] args) throws Exception {
        FileWriter out = new FileWriter("output.csv");
        BufferedReader in = new BufferedReader(new FileReader("input.csv"));
        String line;
        while ((line = in.readLine()) != null) {
            String[] fields = line.split(";");
            String name = fields[2];
            String firstname = fields[1];
            String kto = fields[3];
            String blz = fields[4];
            String amount = fields[5];
            out.append(
                String.format("%s;%s %1$s;%s;%s;%s%n",
                    name, firstname, kto, blz, amount));
        }
        out.close();
    }
}
The Scala Implementation:
import io.Source._
import java.io._

object CsvConvertor extends Application {
    val outputCsv = new FileWriter("output.csv")
    val accounts = fromFile("input.csv") getLines() map (line => Account(line))
    accounts foreach (account => outputCsv append (account toCsv))
    outputCsv close
}

case class Account(line: String) {
    val data = line split (';')
    val firstname = data(1)
    val lastname = data(2)
    val kto = data(3)
    val blz = data(4)
    val amount = data(5)
    def toCsv() =
        "%s;%s %1$s;%s;%s;%s%n" format (lastname, firstname, kto, blz, amount)
}
Here is the shell command with the awk script:
awk 'BEGIN { FS=";"; OFS=";" } { print $2,$1" "$2,$3,$4,$5 }' input.csv > output.csv
The pure Ruby Implementation:
The Python Implementation:
import csv

with open('input.csv', 'rb') as input, open('output.csv', 'wb') as output:
    reader = csv.reader(input, delimiter=';')
    writer = csv.writer(output, delimiter=';')
    for input_row in reader:
        firstname, name, accno, bsc, amount = input_row
        output_row = [name, '%s %s' % (firstname, name), accno, bsc, amount]
        writer.writerow(output_row)

# For Jython 2.5, the context manager usage ('with') has to be replaced with a classic try..finally.
# For Python 3.x, the files have to be opened in text mode ('r', 'w') instead of binary mode ('rb', 'wb').
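The Python 3.x remark in the comments above could look roughly like the following sketch. It is not part of Hendrik's gist: the function name `convert_csv` is a hypothetical helper introduced only for illustration, and the field order is assumed to be the same as in the Python 2 listing. The files are opened in text mode, and `newline=''` is passed because the `csv` module wants to handle line endings itself.

```python
import csv


def convert_csv(input_path, output_path):
    """Hypothetical helper: Python 3 variant of the Python 2 listing above."""
    # Python 3: text mode instead of 'rb'/'wb'; newline='' leaves
    # line-ending handling to the csv module.
    with open(input_path, 'r', newline='') as src, \
            open(output_path, 'w', newline='') as dst:
        reader = csv.reader(src, delimiter=';')
        writer = csv.writer(dst, delimiter=';')
        # Assumed input layout (taken from the listing above):
        # firstname;name;accno;bsc;amount
        for firstname, name, accno, bsc, amount in reader:
            writer.writerow([name, '%s %s' % (firstname, name), accno, bsc, amount])
```

Calling `convert_csv('input.csv', 'output.csv')` would then do the same job as the Python 2 script. Note that `csv.writer` ends each row with `\r\n` unless a different `lineterminator` is given.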
The pure PHP Implementation:
<?php
$start = microtime(true);
$fpIn = fopen('input.csv', 'r');
$fpOut = fopen('output-pure.csv', 'w');
while (($row = fgets($fpIn)) !== false)
{
    $fields = explode(";", $row);
    fwrite($fpOut,
        implode(
            ";",
            array(
                "name" => $fields[0],
                "firstname name" => '"'.$fields[1] ." ". $fields[0].'"',
                "kto" => $fields[2],
                "blz" => $fields[3],
                "amount" => $fields[4]
            )
        )
    );
}
$end = microtime(true);
$duration = $end - $start;
echo "Duration: ".round($duration, 2) . "s".PHP_EOL;
The Bash Implementation:
#!/bin/bash
while IFS=$';' read -r name vorname kto blz amount
do
    echo "$name;$vorname $name;$kto;$blz;$amount" >> output.csv
done < input.csv
I'm curious whether there are other implementation proposals (Clojure, Perl, …). If you have one, you can send me the script via Twitter or leave a comment here.
I am also curious which implementation (Groovy, Java, Scala, awk, Ruby, Python, PHP or Bash) you like best, and why. I have created a poll here:
Thanks to Rainer, Axel Knauf, Niko Dittmann, Hendrik Heimbuerger, S. Barthenheier and Julien Guitton for the Java, awk, Ruby, Python, PHP and Bash implementations.
Links:
- Java (Rainer) - https://gist.github.com/1006757
- awk (Axel Knauf) - https://twitter.com/#!/kopfkind/status/77288701725638656
- Ruby (Niko) - https://gist.github.com/1008877
- Python (Hendrik) - https://gist.github.com/1008881
- PHP (Sebastian) - https://gist.github.com/1008966
- Bash (Julien) - https://gist.github.com/1009420
- old blog post: Convert a CSV File with awk - http://goo.gl/U7onj
- old blog post: Convert a CSV File in Groovy, Java or Scala? - http://goo.gl/AAE17