Hey,
I did a search and didn't find anything about this in the forums. Google Refine 2.0 was recently released, and it's a tool with amazing applications for anyone dealing with generated data sets. It's great at data cleanup. Check out the site, as I'm sure the videos will tell you more than I ever could :)
A few other perks: it's a desktop app that you access through your web browser, so it runs on your own machine (which is great for anyone working with sensitive data sets)! Oh, and did I mention it's open source?
