GridSearchCV and RandomizedSearchCV

Jun 26, 2026•7 min read•By Mohammed Vasim

Machine Learning • AI • Data Science

Every hyperparameter in logistic regression (regularization strength C, penalty type, solver) must be set before training. Getting them right requires searching the hyperparameter space systematically. GridSearchCV exhaustively evaluates every combination in a discrete grid; RandomizedSearchCV draws a fixed number of candidates from distributions you specify, including continuous ones. Both use cross-validation to estimate generalization performance for each candidate.

Anchor dataset: Breast Cancer Wisconsin (continues from the implementation post).

python

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import numpy as np

data = load_breast_cancer()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
scaler = StandardScaler()
X_train_sc = scaler.fit_transform(X_train)
X_test_sc  = scaler.transform(X_test)

What Is Hyperparameter Tuning?

Parameters are learned from data during training: the weights $w_{0}, w_{1}, \dots, w_{p}$ that minimize the loss. Hyperparameters control the learning process and must be set before training:

Hyperparameter	What it controls	Typical range
`C`	Inverse regularization strength ($C = 1/\lambda$; smaller $C$ means stronger regularization)	$[0.001, 100]$
`penalty`	Type of regularization	`l1`, `l2`, `elasticnet`
`solver`	Optimization algorithm	`lbfgs`, `liblinear`, `saga`
`max_iter`	Convergence budget	$[100, 10000]$

Choosing C by scoring the model on the same data it was trained on is wrong: the model has already seen that data, so the score is optimistically biased. Cross-validation gives an honest estimate by training on part of the training set and evaluating on the held-out fold, rotating so that each fold is held out once.
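One caveat, offered as a sketch rather than a correction of the series' setup: the code above fits `StandardScaler` once on the whole training set, so during cross-validation each held-out fold has already contributed to the scaling statistics. The effect is likely small on this dataset, but wrapping the scaler and the model in a pipeline keeps every fold fully held out. The pipeline and the variable name `fold_auc` are assumptions of this example and not part of the original post.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaler and model travel together, so each CV split refits the scaler
# on its own training folds only.
pipe = make_pipeline(StandardScaler(), LogisticRegression(C=1.0, max_iter=10000))

# Five folds; each score is the AUC on one held-out fold.
fold_auc = cross_val_score(pipe, X_train, y_train, cv=5, scoring="roc_auc")
print(fold_auc.round(4))
print(f"Mean CV AUC: {fold_auc.mean():.4f} (std {fold_auc.std():.4f})")
```

If you tune a pipeline built with `make_pipeline`, parameter names take the step name as a prefix, for example `logisticregression__C` instead of `C`.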

GridSearchCV — Exhaustive Search

Grid search tests every combination in a discrete parameter grid:

python

from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': [0.01, 0.1, 1, 10, 100],
    'penalty': ['l1', 'l2'],
    'solver': ['liblinear']  # liblinear supports both l1 and l2
}

gs = GridSearchCV(
    LogisticRegression(max_iter=10000),
    param_grid,
    cv=5,
    scoring='roc_auc',
    n_jobs=-1,
    verbose=1
)
gs.fit(X_train_sc, y_train)

print(f"Best params: {gs.best_params_}")
print(f"Best CV AUC: {gs.best_score_:.4f}")

Fitting 5 folds for each of 10 candidates, totalling 50 fits
Best params: {'C': 1, 'penalty': 'l2', 'solver': 'liblinear'}
Best CV AUC: 0.9976

Total fits = 5 (C values) × 2 (penalties) × 5 (CV folds) = 50 fits.
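The same count can be checked without training anything. This is a minimal sketch using scikit-learn's `ParameterGrid`, which enumerates the combinations a grid search would try; the names `candidates` and `n_fits` are illustrative. Note that with `refit=True` there is one further fit on the full training set after the search, which the verbose message does not count.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "C": [0.01, 0.1, 1, 10, 100],
    "penalty": ["l1", "l2"],
    "solver": ["liblinear"],
}
cv_folds = 5

# Every dict in `candidates` is one hyperparameter combination.
candidates = list(ParameterGrid(param_grid))
n_fits = len(candidates) * cv_folds
print(len(candidates), "candidates,", n_fits, "cross-validation fits")
```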

Examining the Full Results Grid

python

import pandas as pd

results_df = pd.DataFrame(gs.cv_results_)
pivot = results_df.pivot_table(
    values='mean_test_score',
    index='param_C',
    columns='param_penalty'
)
print(pivot.round(4))

param_penalty      l1      l2
param_C
0.01           0.9932  0.9935
0.1            0.9963  0.9965
1              0.9974  0.9976
10             0.9973  0.9974
100            0.9971  0.9972

[Figure: heatmap of mean CV AUC with C on the rows (0.01 to 100) and penalty on the columns (L1, L2), showing the same values as the table above; the best cell, C=1 with L2 (0.9976), is marked ★.]

C=1, L2 (marked ★) is the winner at 0.9976. All values in the table are above 0.99 — this dataset has strong signal and the choice of C/penalty matters little at this performance level. On a noisier dataset, the heatmap would show much larger differences across the grid.
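To judge whether gaps this small are meaningful, it can help to put the fold-to-fold spread next to each mean. The sketch below reruns the same grid and pulls `mean_test_score`, `std_test_score` and `rank_test_score` from `cv_results_`; the `summary` frame is an illustrative name, and your printed numbers may differ slightly from the article's depending on library versions.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, _, y_train, _ = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
X_train_sc = StandardScaler().fit_transform(X_train)

gs = GridSearchCV(
    LogisticRegression(max_iter=10000),
    {"C": [0.01, 0.1, 1, 10, 100], "penalty": ["l1", "l2"], "solver": ["liblinear"]},
    cv=5,
    scoring="roc_auc",
)
gs.fit(X_train_sc, y_train)

# One row per candidate: mean AUC, spread across the 5 folds, and rank.
columns = ["param_C", "param_penalty", "mean_test_score", "std_test_score", "rank_test_score"]
summary = (
    pd.DataFrame(gs.cv_results_)[columns]
    .sort_values("rank_test_score")
    .reset_index(drop=True)
)
print(summary.round(4))
```

When the standard deviation across folds is larger than the gap between two candidates' means, the ranking between them is not reliable.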

Evaluating the Best Model

GridSearchCV automatically refits the best model on the full training set (refit=True by default). Use gs.best_estimator_ directly — do not refit manually:

python

from sklearn.metrics import roc_auc_score, confusion_matrix

best_model = gs.best_estimator_
y_pred = best_model.predict(X_test_sc)
y_prob = best_model.predict_proba(X_test_sc)[:, 1]

print(f"Test AUC: {roc_auc_score(y_test, y_prob):.4f}")
print(f"Test Acc: {best_model.score(X_test_sc, y_test):.4f}")
print(confusion_matrix(y_test, y_pred))

Test AUC: 0.9981
Test Acc: 0.9737
[[40  2]
 [ 1 71]]

Test AUC (0.9981) is slightly higher than CV AUC (0.9976), which is normal sampling variation on a 114-sample test set. The confusion matrix is unchanged from the default C=1 run, so the search confirms rather than improves on the default: C=1 was already a good choice here.

The Problem with GridSearch: Exponential Blowup

GridSearch becomes expensive as the parameter space grows:

5 C values × 2 penalties = 10 combinations × 5 folds = 50 fits
Add 3 solver options: 10 × 3 × 5 = 150 fits
Add max_iter with 4 values: 10 × 3 × 4 × 5 = 600 fits
For a neural network with 6 hyperparameters (say 10 values each): 10⁶ combinations × 5 folds = 5,000,000 fits
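The arithmetic in this list is just a product of grid sizes times the number of folds. A small helper makes the growth easy to explore; `grid_fits` is a hypothetical name for this example, and the last line assumes 10 values per hyperparameter purely for illustration.

```python
from math import prod


def grid_fits(values_per_param, cv_folds=5):
    """Number of CV fits for a full grid: product of grid sizes times folds."""
    return prod(values_per_param) * cv_folds


print(grid_fits([5, 2]))        # C x penalty -> 50
print(grid_fits([5, 2, 3]))     # + 3 solvers -> 150
print(grid_fits([5, 2, 3, 4]))  # + 4 max_iter values -> 600
print(grid_fits([10] * 6))      # hypothetical: 6 hyperparameters, 10 values each -> 5000000
```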

GridSearch also spends the same effort on every combination, including unpromising ones. C=0.01 with L1 has the lowest CV AUC in the grid (0.9932), yet GridSearch ran all 5 folds for it anyway.

RandomizedSearchCV — Sampling the Search Space

Instead of evaluating every combination, RandomizedSearchCV randomly samples n_iter combinations. Crucially, it supports continuous distributions: you can search C ∈ [0.001, 100] as a continuous range rather than a discrete set:

python

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import loguniform

param_dist = {
    'C': loguniform(0.001, 100),   # log-uniform over [0.001, 100]
    'penalty': ['l1', 'l2'],
    'solver': ['liblinear'],
}

rs = RandomizedSearchCV(
    LogisticRegression(max_iter=10000),
    param_dist,
    n_iter=20,         # 20 random combinations × 5 folds = 100 fits
    cv=5,
    scoring='roc_auc',
    random_state=42,
    n_jobs=-1
)
rs.fit(X_train_sc, y_train)

print(f"Best params: {rs.best_params_}")
print(f"Best CV AUC: {rs.best_score_:.4f}")

Best params: {'C': 1.34, 'penalty': 'l2', 'solver': 'liblinear'}
Best CV AUC: 0.9977

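RandomizedSearch used 20 iterations (100 fits) against GridSearch's 50 fits and reached essentially the same CV AUC (0.9977 vs 0.9976), so on a search space this small it saves nothing. For larger, harder search spaces, RandomizedSearch typically finds near-optimal solutions in far fewer evaluations than a full grid.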
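To see what candidates a randomized search would draw without fitting any model, scikit-learn exposes the sampling step as `ParameterSampler`. The sketch below draws 5 candidates from the same `param_dist`; the variable name `sampled` and the choice of 5 draws are illustrative assumptions.

```python
from scipy.stats import loguniform
from sklearn.model_selection import ParameterSampler

param_dist = {
    "C": loguniform(0.001, 100),   # continuous: sampled fresh for each candidate
    "penalty": ["l1", "l2"],       # list: one entry picked at random
    "solver": ["liblinear"],
}

# Each element is a dict of hyperparameters, like one row of a grid.
sampled = list(ParameterSampler(param_dist, n_iter=5, random_state=42))
for params in sampled:
    print(f"C={params['C']:.4f}  penalty={params['penalty']}")
```

Because C is drawn from a continuous distribution, every candidate gets its own C value instead of reusing one of five grid points.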

Why Log-Uniform Distribution for C?

C spans orders of magnitude. A uniform distribution over [0.001, 100] would draw about 99% of samples from [1, 100], barely exploring the important low-C region.

python

from scipy.stats import loguniform
import numpy as np

samples = loguniform(0.001, 100).rvs(10, random_state=42)
print(np.sort(samples).round(4))

[0.0019 0.0082 0.0341 0.1234 0.5892 1.3412 4.7821 12.341 34.512 67.891]

Log-uniform spreads samples evenly across decades: each power of 10 gets roughly the same number of samples. This matches the scale on which C matters: the step from C=0.01 to C=0.1 is as significant as the step from C=10 to C=100.
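The contrast can be computed exactly from the two distributions' CDFs, with no sampling involved. This is a small sketch; note that `scipy.stats.uniform` is parameterized by `loc` and `scale`, so the interval [0.001, 100] has to be written as `loc=0.001, scale=100 - 0.001`. The names `frac_uniform` and `frac_loguniform` are illustrative.

```python
from scipy.stats import loguniform, uniform

low, high = 0.001, 100

# scipy's uniform covers [loc, loc + scale], not [a, b].
flat = uniform(loc=low, scale=high - low)
log_flat = loguniform(low, high)

# Probability that a sampled C falls below 1 under each distribution.
frac_uniform = flat.cdf(1.0)
frac_loguniform = log_flat.cdf(1.0)
print(f"P(C < 1), uniform:     {frac_uniform:.4f}")
print(f"P(C < 1), log-uniform: {frac_loguniform:.4f}")
```

Under the uniform distribution only about 1% of draws land below C=1. Under the log-uniform distribution it is 60%, because three of the five decades between 0.001 and 100 lie below 1.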

GridSearch vs RandomizedSearch

Aspect	GridSearchCV	RandomizedSearchCV
Search strategy	All combinations	`n_iter` random samples
Continuous distributions	No (discrete only)	Yes (`scipy.stats`)
Fits required	$\prod_{i} |H_{i}| \times K$	`n_iter` × K
Guaranteed to find best	Yes (in grid)	No (probabilistic)
Efficient for large spaces	No	Yes
When to use	Small grid (< 100 combos)	Large or continuous spaces

GridSearch Results Summary

Top 3 and bottom 2 combinations by CV AUC:

C	Penalty	CV AUC	Rank
1	L2	0.9976	1
1	L1	0.9974	2
10	L2	0.9974	3
0.01	L2	0.9935	9
0.01	L1	0.9932	10

Related Concepts and Honest Limitations

GridSearchCV's refit=True means that after the search is done, it refits the best model on the entire training set. You get a model that was tuned on subsets and finally trained on all of the training data, which is correct. If you manually refit after inspecting gs.best_params_, you get the same result, but it's redundant and error-prone.
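The equivalence is easy to check. The sketch below reruns the article's grid, refits by hand from `best_params_`, and compares predicted probabilities on the test set. Fixing `random_state=0` is an addition made here so that liblinear's results are repeatable; the names `manual` and `same` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
scaler = StandardScaler()
X_train_sc = scaler.fit_transform(X_train)
X_test_sc = scaler.transform(X_test)

param_grid = {"C": [0.01, 0.1, 1, 10, 100], "penalty": ["l1", "l2"], "solver": ["liblinear"]}
# random_state is fixed (an addition here) so repeated fits give identical results.
base = LogisticRegression(max_iter=10000, random_state=0)

gs = GridSearchCV(base, param_grid, cv=5, scoring="roc_auc")  # refit=True by default
gs.fit(X_train_sc, y_train)

# Manual refit: the same base settings plus the winning parameters.
manual = LogisticRegression(max_iter=10000, random_state=0, **gs.best_params_)
manual.fit(X_train_sc, y_train)

same = np.allclose(
    gs.best_estimator_.predict_proba(X_test_sc),
    manual.predict_proba(X_test_sc),
    atol=1e-6,
)
print("Manual refit matches best_estimator_:", same)
```

The manual version only matches because `max_iter` and `random_state` were copied over by hand. Forgetting one of those non-searched settings is the typical way a manual refit goes wrong.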

The honest limitation: both GridSearch and RandomizedSearch assume that CV performance on the training set predicts test performance — which requires that the train and test distributions are similar. If your test set comes from a different time period, geographic region, or demographic than training, even a perfectly tuned model can fail on deployment.

Test Your Understanding

GridSearchCV ran 50 fits (10 combos × 5 folds). If you set cv=10 instead of cv=5, how many total fits would run? Would the best params change? Would the best CV AUC increase, decrease, or stay roughly the same?
RandomizedSearch with n_iter=20 found C=1.34 — not in our original discrete grid of [0.01, 0.1, 1, 10, 100]. If you ran GridSearch on a grid that included C=1.34, would it necessarily outperform GridSearch on [0.01, 0.1, 1, 10, 100]?
loguniform(0.001, 100).rvs(10) drew 10 samples distributed across decades. If you used uniform(0.001, 100).rvs(10) instead, what fraction of samples would fall below C=1?
gs.best_score_ reports the mean CV AUC across 5 folds. The standard deviation across folds is not directly shown but is stored in gs.cv_results_['std_test_score']. If std_test_score = 0.008 for the best combination, does this change your confidence in C=1 being the true optimum?
You have 6 hyperparameters each with 4 values. GridSearch needs 4⁶ × 5 = 20,480 fits. RandomizedSearch with n_iter=100 needs 500 fits. The paper by Bergstra & Bengio (2012) shows RandomizedSearch finds near-optimal solutions with fewer evaluations. Intuitively, why?
