Thursday, October 10, 2013

the csv file problem?!

Do you know what a CSV file is...? Even if you have never worked with one, you can try it out in a cloud-based spreadsheet app like Google Spreadsheet. Products like MS Excel also let you save a worksheet in CSV format. Basically, open one such app, type data into a few contiguous columns and rows...and when it comes to saving, pick the option to save it as Comma-Separated Values, or CSV.

The final result will look similar to this when opened in a plain vanilla text editor:


So now you know why it's "comma" separated...however, the comma is not always a comfortable choice (it can easily turn up inside the data itself), hence people resort to some other character like a tab ('\t'), a pipe ('|'), etc.

So the more accurate name for this pattern of data storage is Delimiter-Separated Values, or DSV. It stores tabular data using delimiters (to separate columns) and row-terminators, also called line-terminators (to separate rows).
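To make the pattern concrete, here is a minimal Python sketch of reading DSV text when both the delimiter and the row-terminator are already known. The function name `parse_dsv` and the sample data are made up for illustration, and the sketch assumes the delimiter never appears inside a value (no quoting or escaping is handled).

```python
def parse_dsv(text, delimiter=",", row_terminator="\n"):
    """Split DSV text into a list of rows, each a list of column values.

    Assumption: values never contain the delimiter or the row-terminator,
    so plain string splitting is enough.
    """
    rows = text.split(row_terminator)
    # A trailing row-terminator leaves one empty string at the end; drop it.
    if rows and rows[-1] == "":
        rows.pop()
    return [row.split(delimiter) for row in rows]


# Made-up sample: pipe-delimited, newline-terminated.
sample = "name|city\nAsha|Pune\nRavi|Delhi\n"
table = parse_dsv(sample, delimiter="|")
```

Here `table` holds one list per row, with the header row first.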

Now comes the trickier problem...

Imagine there is a program that can parse DSV files, given the delimiter and the row-terminator. You are handed only a DSV file, with no knowledge of those two things...how do you write a program to find them out?!
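One possible line of attack, sketched in Python below, is a simple consistency heuristic: guess the row-terminator first, then prefer the candidate character that occurs the same non-zero number of times in most rows. This is only a sketch under assumptions, not a definitive answer: the candidate list, the function names and the sample text are all made up for illustration, and quoted values that contain the delimiter will fool it.

```python
from collections import Counter

# Assumption: the delimiter is one of these common characters.
CANDIDATE_DELIMITERS = [",", "\t", "|", ";"]


def guess_row_terminator(text):
    """Return the first row-terminator style found in the text, or None."""
    # Check "\r\n" first so it is not mistaken for a lone "\r" or "\n".
    for terminator in ("\r\n", "\n", "\r"):
        if terminator in text:
            return terminator
    return None


def guess_delimiter(text, row_terminator):
    """Pick the candidate whose per-row count is most consistent."""
    if row_terminator:
        rows = [row for row in text.split(row_terminator) if row]
    else:
        rows = [text] if text else []
    if not rows:
        return None

    best, best_score = None, (0.0, 0)
    for candidate in CANDIDATE_DELIMITERS:
        counts = [row.count(candidate) for row in rows]
        typical_count, frequency = Counter(counts).most_common(1)[0]
        if typical_count == 0:
            continue  # absent from most rows, so unlikely to be the delimiter
        # Score: share of rows agreeing on the count, then the count itself.
        score = (frequency / len(rows), typical_count)
        if score > best_score:
            best, best_score = candidate, score
    return best


# Made-up sample: pipe-delimited, with a stray comma inside one value.
sample = "a|b|c\r\n1,5|2|3\r\n4|5|6\r\n"
terminator = guess_row_terminator(sample)
delimiter = guess_delimiter(sample, terminator)
```

For this sample the comma shows up in only one row, so it is skipped, while the pipe appears twice in every row and wins. Python's standard library also ships `csv.Sniffer`, which makes a similar kind of guess for the delimiter.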

Drop in a comment if you happen to figure this out or if you want an answer to it.

Happy programming.
