Key Concepts

Review core concepts you need to learn to master this subject

The Steps of Statistical Model Building

The four primary steps of statistical model building are:

  1. Confirming data assumptions
  2. Building a model on training data
  3. Assessing the model’s fit
  4. Analyzing model results
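
As a rough illustration of how these four steps might fit together, here is a minimal sketch in base R. The data are simulated, and the variable names (`podcast`, `sales`), the 60/40 split, and the IQR-based outlier rule are assumptions chosen for the example, not the course's own dataset or method.

```r
# Simulated data (hypothetical): sales driven by podcast ad spend plus noise.
set.seed(42)
n <- 200
podcast <- runif(n, min = 0, max = 50)
sales <- 5 + 0.3 * podcast + rnorm(n, sd = 2)
ads <- data.frame(podcast = podcast, sales = sales)

# Step 1: confirm data assumptions.
# A quick linearity check via correlation, and an outlier flag
# using the 1.5 * IQR rule (one common convention, assumed here).
linear_cor <- cor(ads$podcast, ads$sales)
q <- unname(quantile(ads$sales, c(0.25, 0.75)))
iqr <- q[2] - q[1]
is_outlier <- ads$sales < q[1] - 1.5 * iqr | ads$sales > q[2] + 1.5 * iqr
n_outliers <- sum(is_outlier)

# Step 2: build a model on training data (60% train, 40% test).
train_idx <- sample(seq_len(n), size = floor(0.6 * n))
train <- ads[train_idx, ]
test <- ads[-train_idx, ]
model <- lm(sales ~ podcast, data = train)

# Step 3: assess the model's fit.
# R-squared on the training data, RMSE on the held-out test data.
train_r2 <- summary(model)$r.squared
test_pred <- predict(model, newdata = test)
test_rmse <- sqrt(mean((test$sales - test_pred)^2))

# Step 4: analyze model results.
# The slope estimates the change in predicted sales per unit of podcast spend.
print(coef(model))
print(c(train_r2 = train_r2, test_rmse = test_rmse))
```

Only base R functions (`lm()`, `predict()`, `summary()`) are used so the sketch runs without extra packages. In practice, step 1 would usually also include visual checks such as a scatter plot.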

Linear Regression in R
Lesson 1 of 1
  1. Linear regression is the workhorse of applied data science; it has long been the most commonly used method by scientists and can be applied to a wide variety of datasets and questions. Unlike more …
  2. While linear regression is perhaps the most widely applied method in data science, it relies on a strict set of assumptions about the relationship between predictor and outcome variables. The m…
  3. Our next step is to check for outlier data points. Linear regression models also assume that there are no extreme values in the data set that are not representative of the actual relationship betwe…
  4. Simple linear regression is not a misnomer: it is an uncomplicated technique for predicting a continuous outcome variable, Y, on the basis of just one predictor variable, X. As detailed in p…
  5. Once we have an understanding of the kind of relationship our model describes, we want to understand the extent to which this modeled relationship actually fits the data. This is typically referred…
  6. Great! We can build a model! But… how do we know if it’s any good? Also, if another data scientist builds a different model using a different independent variable, how can we tell which model is …
  7. In addition to the quantitative measures that characterize our model accuracy, it is always best practice to produce visual summaries to assess our model. First, we should always visualize our mo…
  8. Ready for the real fun? We’ve done our due diligence and confirmed that our data fulfills the assumptions of simple linear regression models; we’ve split our data into test and training subsets, an…
  9. Let’s practice our model interpretation skills! We know that for continuous independent variables, like podcasts, the regression coefficient represents the difference in the predicted value of sale…
  10. Data scientists are often interested in building models to make predictions on new data. While the add_predictions() function from the modelr package makes it easy to predict new values from a tech…
  11. We’ve been able to really dig into the results of simple linear regression models and show how the results convey a substantial amount of information about the relationship between two variables. H…
  12. Time to pull it all together! The interpretation of coefficients in multiple linear regression is slightly different from that of coefficients in simple linear regression. Coefficient of independent c…
  13. Whew, that’s a wrap! You’ve covered a lot of material related to linear regression and its implementation in R. Here are the main concepts we’ve covered: - Statistical model building entails fou…
