How to clean up your data?
Whether it comes from an external source or your own collection effort, raw data is generally messy. There might be empty records, duplicates, unidentifiable inputs, and other anomalies that make it hard to understand what the data can tell you.
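To make these kinds of anomalies concrete, here is a minimal sketch in Python that counts empty records, exact duplicates, and values that cannot be read as numbers in a small CSV sample. The sample data, the `profile` function, and the `age` column are hypothetical and exist only for illustration. Treating a non-numeric value as an "unidentifiable input" is an assumption, and real data will usually need checks tailored to each field.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: one empty record, one duplicate, one unreadable age.
RAW = """name,age
Alice,34
Bob,not available
,
Alice,34
"""


def profile(text, numeric_field="age"):
    """Count empty, duplicated, and non-numeric records in CSV text."""
    rows = list(csv.DictReader(io.StringIO(text)))
    report = {"total": len(rows), "empty": 0, "duplicates": 0, "non_numeric": 0}
    seen = Counter()
    for row in rows:
        # Normalise each cell; a missing cell is treated as an empty string.
        values = tuple((v or "").strip() for v in row.values())
        if not any(values):
            report["empty"] += 1
            continue
        seen[values] += 1
        if seen[values] > 1:
            # Every repeat after the first occurrence counts as a duplicate.
            report["duplicates"] += 1
        if not (row[numeric_field] or "").strip().isdigit():
            report["non_numeric"] += 1
    return report


print(profile(RAW))
```

A summary like this does not fix anything by itself, but it shows how much of the data set is affected before you decide how to clean it.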
A number of tools can give you different views of your data set, which makes problems easier to spot.
[insert tools in "refine" category]
You can also learn more about data cleaning techniques and tips:
- The Data Journalism Handbook has a section on cleaning messy data.
- The Online Journalism Blog also has blog posts on various aspects of data cleaning.