This lesson is still being designed and assembled (Pre-Alpha version)

Applied Machine Learning With Tidymodels: Glossary

Key Points

Introduction
  • tidymodels is a collection of packages that are designed to work together to make it easier to build models.

Machine Learning Basics
  • Testing sets are used to estimate a model's performance on new data.

  • Validation sets are used to choose between different models or hyperparameters.

  • Cross-validation estimates a model's performance by repeatedly splitting the training data into folds, fitting on some folds and evaluating on the rest.

  • Accuracy, the proportion of predictions that are correct, is a simple metric for evaluating a model.

  • Confusion matrices tabulate predicted classes against true classes, giving a more detailed view of the errors a model makes.

  • Precision and recall are useful metrics for evaluating models when the classes are imbalanced.

  • Bias and variance are useful concepts for understanding how a model will perform on new data.
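
As a small illustration of the metrics above, the following base R sketch computes a confusion matrix, accuracy, precision and recall by hand. The labels are made up for this example and do not come from the lesson data.

```r
# Toy labels, made up for illustration: 1 = positive class, 0 = negative class.
truth    <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
estimate <- c(1, 1, 0, 1, 0, 0, 0, 0, 0, 0)

# Confusion matrix: predicted classes in rows, true classes in columns.
confusion <- table(estimate = estimate, truth = truth)

tp <- sum(estimate == 1 & truth == 1)  # true positives
fp <- sum(estimate == 1 & truth == 0)  # false positives
fn <- sum(estimate == 0 & truth == 1)  # false negatives
tn <- sum(estimate == 0 & truth == 0)  # true negatives

accuracy  <- (tp + tn) / length(truth)  # share of correct predictions
precision <- tp / (tp + fp)             # share of predicted positives that are right
recall    <- tp / (tp + fn)             # share of actual positives that are found

# A model that always predicts the majority (negative) class still looks
# reasonable on accuracy, but its recall exposes the problem.
always_negative   <- rep(0, length(truth))
baseline_accuracy <- mean(always_negative == truth)
baseline_recall   <- sum(always_negative == 1 & truth == 1) / sum(truth == 1)
```

Here the always-negative baseline gets 70% of the labels right yet finds none of the positives, which is why accuracy alone can mislead on imbalanced classes.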

Data Preparation
  • You can use rsample to split your data into training and testing sets.

  • You can use recipes to define the preprocessing steps for your data.
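
The two points above might look like this in practice. This is a minimal sketch that assumes the built-in iris data, an 80/20 split and a single normalization step; none of these choices come from the lesson itself.

```r
library(rsample)
library(recipes)

set.seed(123)  # make the random split reproducible

# Hold out roughly 20% of the rows for testing, stratified by the outcome.
split <- initial_split(iris, prop = 0.8, strata = Species)
train <- training(split)
test  <- testing(split)

# Describe the preprocessing: centre and scale every numeric predictor.
rec <- recipe(Species ~ ., data = train) |>
  step_normalize(all_numeric_predictors())

# Estimate the means and standard deviations from the training data only,
# then apply the same transformation to both sets.
prepped     <- prep(rec, training = train)
baked_train <- bake(prepped, new_data = NULL)
baked_test  <- bake(prepped, new_data = test)
```

Estimating the recipe on the training set alone keeps information from the testing set out of the preprocessing.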

Building a Simple Workflow
  • A workflow is a series of steps that are executed in order to accomplish a task.
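
In tidymodels, such a series of steps can be bundled into a workflow object that holds the preprocessing and the model together. The sketch below assumes the iris data and a decision tree fitted with the rpart engine, both chosen only for illustration.

```r
library(tidymodels)

set.seed(123)
split <- initial_split(iris, prop = 0.8, strata = Species)

# Step 1: preprocessing, declared as a recipe.
rec <- recipe(Species ~ ., data = training(split)) |>
  step_normalize(all_numeric_predictors())

# Step 2: the model specification (illustrative choice of model and engine).
tree_spec <- decision_tree() |>
  set_engine("rpart") |>
  set_mode("classification")

# Bundle both steps so they are always fitted and applied together.
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(tree_spec)

wf_fit <- fit(wf, data = training(split))

# Predicting runs the recipe on the new data before calling the model.
preds <- predict(wf_fit, new_data = testing(split))
```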

Hyperparameter Tuning
  • Hyperparameters are the settings of a model that are not learned from the data, so we have to set them ourselves.
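
One way to choose hyperparameters is to try several candidate values with cross-validation and keep the best. The sketch below assumes the iris data, a decision tree with two tunable settings, 5 folds and 5 candidate combinations; these are all illustrative choices rather than the lesson's own setup.

```r
library(tidymodels)

set.seed(123)
folds <- vfold_cv(iris, v = 5, strata = Species)

# Mark the hyperparameters to tune with tune() instead of fixing their values.
tree_spec <- decision_tree(cost_complexity = tune(), tree_depth = tune()) |>
  set_engine("rpart") |>
  set_mode("classification")

wf <- workflow() |>
  add_formula(Species ~ .) |>
  add_model(tree_spec)

# Evaluate 5 candidate combinations on every fold.
res <- tune_grid(
  wf,
  resamples = folds,
  grid = 5,
  metrics = metric_set(accuracy)
)

# Keep the combination with the best cross-validated accuracy
# and plug it back into the workflow.
best     <- select_best(res, metric = "accuracy")
final_wf <- finalize_workflow(wf, best)
```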

After Training a Model
  • We can use last_fit() to test a model on the testing data.

  • We can use saveRDS() and readRDS() to save and load a model.
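
Both points can be sketched together as follows. The example assumes the iris data, an untuned decision tree and a temporary file path, all picked for illustration.

```r
library(tidymodels)

set.seed(123)
split <- initial_split(iris, prop = 0.8, strata = Species)

wf <- workflow() |>
  add_formula(Species ~ .) |>
  add_model(
    decision_tree() |>
      set_engine("rpart") |>
      set_mode("classification")
  )

# Fit on the training portion and evaluate once on the testing portion.
final_res <- last_fit(wf, split)
metrics   <- collect_metrics(final_res)

# Pull out the fitted workflow, save it, and load it back.
fitted_wf <- extract_workflow(final_res)
path <- tempfile(fileext = ".rds")  # illustrative location
saveRDS(fitted_wf, path)
reloaded <- readRDS(path)

# The reloaded workflow predicts like the original.
new_preds <- predict(reloaded, new_data = head(iris))
```

Some engines keep part of the fitted model outside the R object, so a plain saveRDS() may not be enough for every model type; the rpart tree used here round-trips without extra steps.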

Full Walkthrough For Classification
  • We can apply what we have learnt to a new dataset.

Glossary

FIXME