What is Exploratory Data Analysis?

Read this article to learn about exploratory data analysis (EDA).

NIST/SEMATECH e-Handbook of Statistical Methods

This article gives an overview of exploratory data analysis (EDA). Many people associate data science with fields like machine learning and artificial intelligence, but EDA often takes up a larger share of a data scientist’s day-to-day work. This is because:

  • Before fitting any sort of machine learning model, it is important to inspect a dataset and get to know it! Often the best way to improve a model is to spend more time thinking about the data itself. EDA can help you make decisions about what data to include, exclude, or transform.
  • Sometimes a data scientist does not plan to fit a predictive model to their data at all. Instead, their goal may be to inspect and analyze existing data to answer questions like: What proportion of visitors to a website made a purchase? Or, how has the purchase rate changed over the last 6 months?
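
To make the second point concrete, here is a minimal sketch of how the two example questions might be answered with plain Python. The `visits` records, their dates, and the helper names `purchase_rate` and `monthly_purchase_rates` are all hypothetical and made up for illustration; a real analysis would load data from a log or database and would likely use a library such as pandas.

```python
from collections import defaultdict
from datetime import date

# Hypothetical visit log (invented for illustration):
# each record is (visit date, whether the visitor made a purchase).
visits = [
    (date(2024, 1, 5), False),
    (date(2024, 1, 17), True),
    (date(2024, 2, 2), False),
    (date(2024, 2, 9), False),
    (date(2024, 2, 20), True),
    (date(2024, 3, 3), True),
    (date(2024, 3, 11), True),
    (date(2024, 3, 28), False),
]


def purchase_rate(records):
    """Return the fraction of visits that ended in a purchase."""
    records = list(records)
    if not records:
        return float("nan")  # no visits, so the rate is undefined
    return sum(1 for _, purchased in records if purchased) / len(records)


def monthly_purchase_rates(records):
    """Group visits by (year, month) and return the purchase rate per month."""
    grouped = defaultdict(list)
    for visit_date, purchased in records:
        grouped[(visit_date.year, visit_date.month)].append((visit_date, purchased))
    # Sort by (year, month) so the result reads chronologically.
    return {month: purchase_rate(rows) for month, rows in sorted(grouped.items())}


# Question 1: what proportion of visitors made a purchase?
overall = purchase_rate(visits)

# Question 2: how has the purchase rate changed month to month?
by_month = monthly_purchase_rates(visits)

print(f"Overall purchase rate: {overall:.1%}")
for (year, month), rate in by_month.items():
    print(f"{year}-{month:02d}: {rate:.1%}")
```

Neither question requires a predictive model: counting and grouping the existing records is enough, which is typical of this kind of EDA.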

This article gives a more formal definition of EDA and describes some of the techniques involved.

After you finish reading the article, return to this page and complete the following assessment: