Machine Learning Workflows: tidymodels vs. scikit-learn

Comparing ML Model Training, Evaluation, and Prediction in R and Python

Compare machine learning workflows in R and Python by examining tidymodels and scikit-learn side by side. This tutorial provides parallel examples of model training, evaluation, and prediction on the iris dataset, highlighting the key differences and strengths of each ecosystem.

Programming

Author: Alboukadel Kassambara
Published: February 13, 2024
Modified: March 11, 2025

Keywords

tidymodels vs scikit-learn, machine learning in R and Python, caret vs scikit-learn, data science ML workflows, Python vs R machine learning

Introduction

In data science, building and evaluating predictive models is a core task. Both R and Python offer robust ecosystems for machine learning—R with tidymodels (or caret) and Python with scikit-learn. This tutorial provides side-by-side examples using the well-known iris dataset to compare how each language approaches model training, evaluation, and prediction. By the end, you’ll understand the similarities and differences between these workflows, helping you choose the best toolset for your projects.



Comparative Example: Training a Classification Model on the Iris Dataset

We will split the iris dataset into training and testing sets, train a multi-class logistic regression model in each language, and then evaluate each model's performance.

#| label: r-tidymodels
# Load required libraries
library(tidymodels)

# Load the iris dataset
data(iris)

# Split data into training and testing sets
set.seed(123)
iris_split <- initial_split(iris, prop = 0.8)
iris_train <- training(iris_split)
iris_test <- testing(iris_split)

# Define a multinomial regression specification (the multi-class
# extension of logistic regression)
model_spec <- multinom_reg() %>% 
  set_engine("nnet") %>% 
  set_mode("classification")

# Create a workflow and add the model and formula
iris_workflow <- workflow() %>% 
  add_model(model_spec) %>% 
  add_formula(Species ~ .)

# Fit the model
iris_fit <- fit(iris_workflow, data = iris_train)

# Make predictions on the test set
iris_pred <- predict(iris_fit, new_data = iris_test) %>% 
  bind_cols(iris_test)

# Evaluate model performance
iris_metrics <- iris_pred %>% metrics(truth = Species, estimate = .pred_class)
print(iris_metrics)

Output:

# A tibble: 2 × 3
  .metric  .estimator .estimate
  <chr>    <chr>          <dbl>
1 accuracy multiclass     0.967
2 kap      multiclass     0.946
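
The tidymodels output reports Cohen's kappa (kap) alongside accuracy. scikit-learn's classification report does not include kappa, but the metric is available as cohen_kappa_score. A minimal sketch, reusing the y_test and pred objects defined in the Python example that follows:

#| label: python-kappa
# Cohen's kappa for the scikit-learn predictions; mirrors the `kap`
# metric reported by tidymodels above (y_test and pred come from the
# example below).
from sklearn.metrics import cohen_kappa_score

print("Cohen's kappa:", cohen_kappa_score(y_test, pred))

Now the equivalent workflow in Python with scikit-learn:
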
#| label: python-scikit-learn
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Load the iris dataset
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['Species'] = iris.target

# Split the dataset into training and testing sets
X = df.drop('Species', axis=1)
y = df['Species']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

# Initialize and fit a logistic regression model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Make predictions on the test set
pred = model.predict(X_test)

# Evaluate model performance
accuracy = accuracy_score(y_test, pred)
print("Accuracy:", accuracy)
print("Classification Report:")
print(classification_report(y_test, pred))

Output:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        13
           1       1.00      1.00      1.00         6
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

Note

The support column indicates the number of true instances (or actual observations) of each class in the test set. It tells you how many samples belong to each class, which is useful for understanding the distribution of the classes in your dataset.
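
For a closer analogue to the tidymodels workflow object, scikit-learn provides Pipeline, which chains preprocessing and modeling into a single fit/predict unit. A minimal sketch, reusing the X_train, X_test, y_train, and y_test splits from above; the scaling step is illustrative and not part of the original example:

#| label: python-pipeline
# A Pipeline bundles preprocessing and the model, much as a tidymodels
# workflow bundles a formula and a model specification.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipe = Pipeline([
    ("scale", StandardScaler()),                  # illustrative preprocessing step
    ("model", LogisticRegression(max_iter=200)),  # same model as above
])
pipe.fit(X_train, y_train)
print("Pipeline accuracy:", pipe.score(X_test, y_test))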

Discussion

Workflow Comparison

  • Data Splitting:
    Both examples use an 80/20 train/test split with a fixed random seed for reproducibility (prop = 0.8 in initial_split(), test_size = 0.2 in train_test_split). For imbalanced data, both tools also support stratified splits, as shown in the sketch after this list.

  • Model Training:

    • R (tidymodels): Uses a workflow that integrates model specification and formula-based modeling.
    • Python (scikit-learn): Implements a more imperative approach, directly fitting a logistic regression model.
  • Evaluation:
    Both methods evaluate model performance, though the metrics and output formats differ. The tidymodels approach leverages a tidy, pipe-friendly syntax, while scikit-learn provides built-in functions for accuracy and detailed classification reports.
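
Because the iris classes are balanced, a plain random split works well here. For imbalanced outcomes you would typically stratify the split by the target: tidymodels exposes this via the strata argument of initial_split(), and scikit-learn via the stratify argument of train_test_split. A minimal sketch on the Python side, reusing X and y from the example above:

#| label: python-stratified-split
# Stratified split: preserves the class proportions of y in both the
# training and the test set.
from sklearn.model_selection import train_test_split

X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=123, stratify=y
)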

Strengths of Each Approach

  • tidymodels (R):
    Offers a highly modular, consistent workflow that integrates seamlessly with other tidyverse tools, making it easy to extend and customize.

  • scikit-learn (Python):
    Provides a straightforward, widely adopted interface with robust support for a wide range of machine learning algorithms, preprocessing tools, and model-selection utilities such as cross-validation (see the sketch after this list).
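
As one example of those utilities, k-fold cross-validation is a one-liner with cross_val_score. A minimal sketch, reusing X and y from the example above:

#| label: python-cross-validation
# 5-fold cross-validation of the logistic regression model on the full
# feature matrix X and labels y.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

scores = cross_val_score(LogisticRegression(max_iter=200), X, y, cv=5)
print("CV accuracy per fold:", scores)
print("Mean CV accuracy:", scores.mean())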

Conclusion

This side-by-side comparison demonstrates that both tidymodels in R and scikit-learn in Python offer powerful workflows for building and evaluating machine learning models. By understanding the nuances of each approach, you can select the environment that best aligns with your data science needs—or even combine them to leverage the strengths of both.
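
One common way to combine them is to call one ecosystem from the other: the reticulate package embeds Python in an R session, and rpy2 does the reverse from Python. A minimal sketch with rpy2 (an illustration only; it assumes R, the nnet package, and rpy2 are installed):

#| label: python-rpy2
# Calling R from a Python session via rpy2: fit the same multinomial
# model on iris with R's nnet package and print a confusion table.
import rpy2.robjects as ro

ro.r("""
library(nnet)
fit <- multinom(Species ~ ., data = iris, trace = FALSE)
print(table(predicted = predict(fit, iris), actual = iris$Species))
""")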

Happy coding, and enjoy exploring machine learning workflows in both R and Python!

Citation

BibTeX citation:
@online{kassambara2024,
  author = {Kassambara, Alboukadel},
  title = {Machine {Learning} {Workflows:} Tidymodels Vs. Scikit-Learn},
  date = {2024-02-13},
  url = {https://www.datanovia.com/learn/programming/transition/machine-learning-workflows.html},
  langid = {en}
}
For attribution, please cite this work as:
Kassambara, Alboukadel. 2024. “Machine Learning Workflows: Tidymodels Vs. Scikit-Learn.” February 13, 2024. https://www.datanovia.com/learn/programming/transition/machine-learning-workflows.html.