Kernel regression

Kernel regression is a non-parametric technique used to estimate the conditional expectation of a random variable.
Updated: Jun 21, 2024

3 key takeaways

  • Kernel regression is a non-parametric method that estimates the relationship between variables without assuming a predefined form.
  • It uses kernel functions to assign weights to data points, giving more importance to observations near the target point.
  • Kernel regression can model complex, non-linear relationships in data, making it a versatile tool in various fields, including economics, machine learning, and statistics.

What is kernel regression?

Kernel regression is a method used to estimate the conditional expectation ( E[Y|X] ) of a dependent variable ( Y ) given an independent variable ( X ). Unlike parametric methods, which assume a specific functional form for the relationship between ( X ) and ( Y ), kernel regression uses a flexible approach that adapts to the structure of the data.

How does kernel regression work?

Kernel regression works by weighting nearby observations more heavily than those further away when estimating the value of the dependent variable at a given point. The weights are determined by a kernel function, which is typically a symmetric, non-negative function centered on the target point.

The kernel function

A kernel function ( K ) assigns weights to observations based on their distance from the target point ( x_0 ). Commonly used kernel functions include:

  • Gaussian kernel: ( K(u) = \frac{1}{\sqrt{2\pi}} e^{-\frac{u^2}{2}} )
  • Epanechnikov kernel: ( K(u) = \frac{3}{4} (1 - u^2) ) for ( |u| \leq 1 ), and 0 otherwise
  • Uniform kernel: ( K(u) = \frac{1}{2} ) for ( |u| \leq 1 ), and 0 otherwise
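
As a minimal sketch, the three kernels listed above can be written as plain Python functions. The function names are illustrative choices, not part of any particular library, and ( u ) is assumed to be the distance from the target point already divided by the bandwidth.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard normal density: positive for every u, largest at u = 0."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def epanechnikov_kernel(u: float) -> float:
    """Parabolic kernel: 3/4 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def uniform_kernel(u: float) -> float:
    """Constant weight 1/2 on [-1, 1], zero elsewhere."""
    return 0.5 if abs(u) <= 1.0 else 0.0


# Compare the weight each kernel gives to an observation half a
# bandwidth away from the target point (u = 0.5).
for kernel in (gaussian_kernel, epanechnikov_kernel, uniform_kernel):
    print(kernel.__name__, kernel(0.5))
```

Note that the Epanechnikov and uniform kernels give exactly zero weight to observations more than one bandwidth away, while the Gaussian kernel gives every observation some weight, however small.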

The Nadaraya-Watson estimator

One of the most common kernel regression estimators is the Nadaraya-Watson estimator. The estimated value of ( Y ) at ( X = x_0 ) is given by:

[ \hat{m}(x_0) = \frac{\sum_{i=1}^{n} K\left(\frac{x_0 - x_i}{h}\right) y_i}{\sum_{i=1}^{n} K\left(\frac{x_0 - x_i}{h}\right)} ]


Where:

  • ( \hat{m}(x_0) ) is the estimated value of ( Y ) at ( X = x_0 ).
  • ( K ) is the kernel function.
  • ( h ) is the bandwidth parameter that controls the smoothness of the estimate.
  • ( x_i ) and ( y_i ) are the observed values of the independent and dependent variables, respectively.
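
The Nadaraya-Watson formula is a weighted average of the observed ( y_i ), and it might be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the function name `nadaraya_watson`, the default Gaussian kernel, and the small data set are all made up for the example.

```python
import math
from typing import Callable, Sequence


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used here as the default kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(
    x0: float,
    xs: Sequence[float],
    ys: Sequence[float],
    h: float,
    kernel: Callable[[float], float] = gaussian_kernel,
) -> float:
    """Estimate m(x0) as a kernel-weighted average of the observed ys."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    # One weight per observation: K((x0 - x_i) / h).
    weights = [kernel((x0 - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observation receives positive weight at x0")
    # Numerator of the formula divided by its denominator.
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical observations, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 4.2, 8.8, 16.1]
print(nadaraya_watson(2.0, xs, ys, h=0.5))
```

Because the weights are non-negative and the estimate divides by their sum, the result always lies between the smallest and largest observed ( y_i ).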

Choosing the bandwidth

The bandwidth ( h ) is a crucial parameter in kernel regression, determining the width of the kernel function and, consequently, the smoothness of the estimated curve. A small bandwidth leads to a more flexible, less smooth estimate that may capture noise (overfitting), while a large bandwidth results in a smoother estimate that may miss important details (underfitting). Bandwidth selection methods include cross-validation and rule-of-thumb approaches.
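
One way to make cross-validation concrete is leave-one-out selection: for each candidate bandwidth, predict every observation from all the others and keep the bandwidth with the lowest mean squared error. The sketch below assumes a Gaussian kernel, a Nadaraya-Watson estimator, a synthetic noisy sine curve, and an arbitrary candidate grid; the names `nw_predict`, `loo_cv_score`, and `select_bandwidth` are hypothetical.

```python
import math
import random
from typing import List, Sequence


def gaussian_kernel(u: float) -> float:
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_predict(x0: float, xs: Sequence[float], ys: Sequence[float], h: float) -> float:
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel."""
    weights = [gaussian_kernel((x0 - xi) / h) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def loo_cv_score(xs: List[float], ys: List[float], h: float) -> float:
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        # Drop observation i, then predict it from the remaining points.
        xs_rest = xs[:i] + xs[i + 1:]
        ys_rest = ys[:i] + ys[i + 1:]
        try:
            pred = nw_predict(xs[i], xs_rest, ys_rest, h)
        except ZeroDivisionError:
            # Bandwidth so small that every weight underflowed to zero.
            return math.inf
        total += (ys[i] - pred) ** 2
    return total / len(xs)


def select_bandwidth(xs: List[float], ys: List[float], candidates: Sequence[float]) -> float:
    """Return the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))


# Synthetic data (an assumption for the example): a sine curve plus noise.
rng = random.Random(0)
xs = [i / 10 for i in range(41)]
ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]

candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
best_h = select_bandwidth(xs, ys, candidates)
print("selected bandwidth:", best_h)
```

Each score requires a prediction for every observation, and each prediction touches every other observation, so the cost per candidate grows with the square of the sample size. That is manageable for small data sets but becomes a practical concern for large ones.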

Applications of kernel regression

Economics

In economics, kernel regression is used to model non-linear relationships in data, such as estimating demand curves, analyzing income distributions, and studying the effects of policy changes.

Machine learning

In machine learning, kernel regression is employed for tasks such as regression analysis, pattern recognition, and time series forecasting, especially when the relationship between variables is complex and non-linear.


Statistics

Statisticians use kernel regression for smoothing data and non-parametric hypothesis testing, and apply the same kernel-weighting idea in density estimation, providing flexible tools for data analysis without assuming a specific functional form.

Advantages and limitations

Advantages

  • Flexibility: Kernel regression does not assume a specific functional form, allowing it to model complex, non-linear relationships.
  • Smooth estimates: The use of kernel functions provides smooth estimates, making it suitable for data with continuous relationships.


Limitations

  • Computational complexity: Kernel regression can be computationally intensive, especially for large datasets, because each prediction requires computing a weight for every observation.
  • Bandwidth selection: Choosing an appropriate bandwidth is critical and can be challenging, as it significantly affects the estimate’s accuracy and smoothness.

Related topics

  • Non-parametric regression: Explore other non-parametric regression techniques, such as spline regression and local polynomial regression, which also provide flexible modeling approaches.
  • Kernel density estimation: Learn about kernel density estimation, a related technique used to estimate the probability density function of a random variable.
  • Smoothing techniques: Understand various smoothing techniques in statistics and data analysis, including moving averages, loess, and spline smoothing.

Consider exploring these related topics to gain a deeper understanding of kernel regression and its applications in statistical modeling and data analysis.
