CSV Parser / Lexer

2022-12-23

I've been developing software for quite some time now, but I never got around to writing a compiler or parser of any kind. Why? Because you don't have to. Writing your own parser is almost never a good idea, because well-maintained and well-tested libraries and frameworks exist for almost every platform. It's the same as with security and encryption: don't write your own algorithms. You will screw it up.

Nevertheless, it doesn't hurt to know how stuff works. So I wrote fb.CsvParser, a CSV lexer and parser for .NET, written in C#. CSV is a simple format that's well suited as a first endeavor. A first impulse might be to just split the text by newlines and commas. But then you're in for a treat when a field itself contains any of those characters. So I wrote a state-machine-based lexer that handles all that (I think). I've thrown all the CSV files I could find at it, and the lexer seems to parse them correctly. But then, CSV is not fully standardized, so there could of course be a lot of edge cases where it fails: weird delimiters, escape characters and so on. I don't know.
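To make the state machine idea a bit more concrete, here is a minimal sketch of such a lexer. It is not the actual fb.CsvParser code: the class name CsvSketch, the Parse method and the four states are made up for illustration. It assumes double quotes as the quote character, a doubled quote ("") as the escape inside quoted fields, and it simply drops carriage returns outside of quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch for illustration; not the fb.CsvParser API.
public static class CsvSketch
{
    private enum State { FieldStart, Unquoted, Quoted, AfterQuote }

    public static List<List<string>> Parse(string text, char delimiter = ',')
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var field = new StringBuilder();
        var state = State.FieldStart;

        // Finish the current field and get ready for the next one.
        void EndField()
        {
            row.Add(field.ToString());
            field.Clear();
            state = State.FieldStart;
        }

        // Finish the current field and the current row.
        void EndRow()
        {
            EndField();
            rows.Add(row);
            row = new List<string>();
        }

        foreach (char c in text)
        {
            if (state == State.Quoted)
            {
                // Inside quotes, delimiters and newlines are plain text.
                if (c == '"') state = State.AfterQuote;
                else field.Append(c);
            }
            else if (state == State.AfterQuote && c == '"')
            {
                // A doubled quote inside a quoted field is a literal quote.
                field.Append('"');
                state = State.Quoted;
            }
            else if (c == delimiter) EndField();
            else if (c == '\n') EndRow();
            else if (c == '\r') continue; // assumption: \r only appears as part of \r\n
            else if (c == '"' && state == State.FieldStart) state = State.Quoted;
            else
            {
                // Lenient: any other character is treated as field text.
                field.Append(c);
                state = State.Unquoted;
            }
        }

        // Flush the last row if the input does not end with a newline.
        if (state != State.FieldStart || field.Length > 0 || row.Count > 0) EndRow();

        return rows;
    }
}
```

Given a header line followed by a record with the two quoted fields "Doe, Jane" and "She said ""hi""" plus a line break inside the second field, this sketch returns two rows, while splitting by newlines would have cut the second row in half. Empty lines come out as a row with a single empty field, which a real parser might want to handle differently.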

Anyway, I've put it on GitHub. I also made a CSV viewer for testing.


Tags: csv, parser, lexer, state, machine, .NET, core, C#, dotnet, simple
