HOUSE SALES SQL DATA CLEANING-MySQL (Click to view full SQL script here)
There is a wide gap between raw data and successful data analysis. Data cleaning bridges this gap, which makes it an important prerequisite for any analysis.
Data cleaning (or data cleansing) is the process of identifying incomplete, incorrect, inconsistent, inaccurate or irrelevant parts of the data and then replacing, modifying or deleting the dirty data. Insights are only as good as the data that informs them, so clean data is more likely to produce good insights.
This project is a step-by-step walkthrough of the process used to clean a dataset of house sales in the United States. The dataset was imported into MySQL Workbench using the Table Data Import Wizard and was found to contain 24,007 rows and 19 columns.
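The reported size of the import can be checked with two short queries. This is a minimal sketch: the table name `house_sales` is a hypothetical placeholder, not the name used in the project's script, and the second query assumes the table lives in the currently selected schema.

```sql
-- Hypothetical table name: house_sales.

-- Number of rows imported (the write-up reports 24,007).
SELECT COUNT(*) AS row_count
FROM house_sales;

-- Number of columns in the table (the write-up reports 19).
SELECT COUNT(*) AS column_count
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name = 'house_sales';
```

If either count differs from the source file, the import is worth repeating before any cleaning starts.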
A LOOK AT THE DATASET
The columns relevant to the analysis were identified, and steps were taken to improve their usability and make sure they are free of errors.
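The exact cleaning steps are in the full script. As an illustration only, the sketch below shows the kind of statements such steps often involve. Every table and column name here (`house_sales`, `owner_name`, `sale_date_text`, `parcel_id`) and the date format are assumptions made for the example, not details taken from the project.

```sql
-- All table and column names below are hypothetical placeholders.

-- If Workbench's safe-update mode rejects an UPDATE without a key-based WHERE,
-- it can be switched off for the current session.
SET SQL_SAFE_UPDATES = 0;

-- 1. Strip stray whitespace and turn empty strings into NULL.
UPDATE house_sales
SET owner_name = NULLIF(TRIM(owner_name), '');

-- 2. Preview text dates parsed as DATE values before changing anything
--    (assumes text such as '04/09/2013'; adjust the format string as needed).
SELECT sale_date_text,
       STR_TO_DATE(sale_date_text, '%m/%d/%Y') AS sale_date
FROM house_sales
LIMIT 10;

-- 3. List rows that share the same parcel and sale date (possible duplicates).
SELECT parcel_id, sale_date_text, COUNT(*) AS copies
FROM house_sales
GROUP BY parcel_id, sale_date_text
HAVING COUNT(*) > 1;
```

Previewing with SELECT before running an UPDATE or DELETE keeps each change easy to verify and to undo.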
After these steps, the dataset is ready for further analysis and can be used to derive meaningful insights about house sales in the United States. This project demonstrates the importance of data cleaning in ensuring data quality and reliability.