mjl blog
May 31st 2016

Comma or Tab in a Comma Separated Values file?

Quick question.

What do you think is the separator in a “Comma Separated Values” (CSV) file?

If you are like me, or think in ways I can identify with, you just answered “comma”. If you are a software developer on the Microsoft Excel team, you apparently thought this instead: “It depends on the settings of the computer opening the file”.

Yes, that’s right. Excel thinks a file format (and thus, the contents of a file) magically changes when you change settings on your computer. On my Mac, Excel thinks a semicolon is the separator for comma separated values-files. This means that when I open a CSV file, I get multiple rows, but just one column. Each line from the CSV file is shown as is on a row. No separation into columns. You would think that working with tabular data is the core business of Excel. It’s quite the disappointment that it cannot parse the most basic file format in existence, the CSV file…

If you have an idea as to how this came to be, please let me know. Maybe it will alleviate the frustration and pain I’m feeling everytime I try to open a CSV file in Excel…

PS: I’m aware that you can fix Excel’s mistakes when opening a CSV file. I’d rather it didn’t make the mistake in the first place.

Comments

Werner
very annoying indeed. You are not alone with frustration in this area, and you didn't even mention byte-order-marks or text encoding in general.

My quick solution, when I had to process csv files, where I don't know which language settings the user had, is to count the number of semicolons of the first line.
If that is greater than the number of commas, it wins...