On a recent project, I had to implement a CSV parser that would gracefully handle malformed files. I’m talking about files with unescaped quotes, wacky UTF-8 chars, and various other abominations of nature.
I originally assumed FasterCSV would handle this automagically, but it turns out that the library’s most commonly used methods are pretty strict when it comes to handling CSV files.
For example, iterating over a malformed file one row at a time with foreach raises a FasterCSV::MalformedCSVError, even before any rows are yielded to the block:
    FasterCSV.foreach("malformed.csv") do |row|
      # use row here...
    end
Not cool! I managed to get around this by manually looping over each row and rescuing a malformed CSV exception if one gets thrown:
    FasterCSV.open("malformed.csv", "rb") do |csv|
      loop do
        begin
          # shift returns the next parsed row, or nil at end of file
          break unless row = csv.shift
          # use row here...
        rescue FasterCSV::MalformedCSVError => e
          # handle malformed row here...
        end
      end
    end
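One alternative to the loop above is to parse each physical line on its own, so that a bad line cannot affect the ones after it. What follows is only a minimal sketch under a few assumptions: it targets Ruby 1.9 or later, where FasterCSV became the standard CSV library (so the classes are CSV and CSV::MalformedCSVError), and parse_lines_leniently is a hypothetical helper name, not something either library provides. It also assumes no quoted field spans more than one line; such a field would be reported as malformed.

```ruby
require "csv"       # assumption: Ruby 1.9+, where FasterCSV is the standard CSV library
require "stringio"  # only needed for the in-memory sample below

# Hypothetical helper (not part of FasterCSV or CSV).
# Yields every row that parses cleanly and returns an array of
# [line_number, error_message] pairs for the lines that did not.
def parse_lines_leniently(io)
  bad_lines = []
  io.each_line.with_index(1) do |line, number|
    begin
      row = CSV.parse_line(line)
      next if row.nil? || row.empty?  # skip blank lines
      yield row
    rescue CSV::MalformedCSVError => e
      bad_lines << [number, e.message]
    end
  end
  bad_lines
end

# Sample data: the second line has text after a closing quote.
sample = StringIO.new(%(a,b,c\n1,"2"x,3\n4,5,6\n))

bad_lines = parse_lines_leniently(sample) do |row|
  puts row.inspect  # use row here...
end

bad_lines.each do |number, message|
  warn "line #{number} skipped: #{message}"
end
```

The trade-off is that this gives up multi-line quoted fields in exchange for isolating each failure to a single line. It only covers quoting errors; invalid byte sequences would need separate handling.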
Anyone have a better way to do this?