Parzen Windows density estimation technique

The Parzen window is a non-parametric density estimation technique used in pattern recognition. It can be viewed as a generalization of the histogram technique.

It is used to derive a density function, f(x).
f(x) is used to implement a Bayes classifier: when a new sample feature x arrives and the class-conditional densities must be evaluated, f(x) supplies them.
f(x) takes a data sample as input and returns the density estimate at that sample.

Consider a d-dimensional hypercube that is assumed to contain k data samples.
The edge length of the hypercube is assumed to be h_n.

Hence the volume of the hypercube is: V_n = h_n^d

We define a hypercube window function φ(u), the indicator function of the unit hypercube centered at the origin:
φ(u) = 1 if |u_i| <= 0.5 for all i = 1, …, d
φ(u) = 0 otherwise
Here, u is a vector, u = (u_1, u_2, …, u_d)^T.
φ(u) should satisfy the following:

  1. \( \varphi(u) \geq 0, \quad \forall u \)
  2. \( \int_{R^{d}} \varphi(u) \, du = 1 \)
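
As a minimal sketch of the window function defined above, the following Python snippet evaluates φ(u) for a single vector. The function name `hypercube_window` and the use of NumPy are assumptions made for illustration, not part of the original text.

```python
import numpy as np

def hypercube_window(u):
    """Indicator of the unit hypercube centered at the origin.

    Returns 1.0 if every coordinate of u satisfies |u_i| <= 0.5,
    and 0.0 otherwise.
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return float(np.all(np.abs(u) <= 0.5))

# A point inside the unit square and a point outside it.
inside = hypercube_window([0.2, -0.4])
outside = hypercube_window([0.2, 0.7])
```

Because the function only takes the values 0 and 1, the first condition (non-negativity) holds by construction, and the second follows because the unit hypercube has volume 1.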

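Let \( V = \int_{R^{d}} \varphi\left(\frac{u}{h}\right) du = \int_{R^{d}} \varphi\left(\frac{u-u_{0}}{h}\right) du = h^{d} \), the volume of a hypercube of side h. Shifting the center to u_0 does not change the volume.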

Since the unit hypercube is centered at the origin, φ(u) is symmetric:
φ(u) = φ(-u)

  • \( \varphi\left(\frac{u-u_{0}}{h}\right) \) is the indicator of a hypercube of side h centered at u_0.
  • Let D = {x_1, x_2, …, x_n} be the data samples.
  • For any x, \( \varphi\left(\frac{x-x_{i}}{h}\right) \) is 1 only if x_i falls in a hypercube of side h centered at x.
  • Hence the number of data points falling in a hypercube of side h centered at x is \( k = \sum_{i=1}^{n} \varphi\left(\frac{x-x_{i}}{h}\right) \)
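
The counting step in the list above can be sketched in Python as follows. The helper name `count_in_window` and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

def count_in_window(x, samples, h):
    """Count the samples x_i that fall in the hypercube of side h centered at x.

    This is k = sum_i phi((x - x_i) / h) with the hypercube window phi.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Treat the samples as an (n, d) array; 1-D input becomes (n, 1).
    samples = np.asarray(samples, dtype=float).reshape(len(samples), -1)
    u = (x - samples) / h
    inside = np.all(np.abs(u) <= 0.5, axis=1)
    return int(inside.sum())

# Hypothetical 2-D samples: two lie near the origin, one is far away.
samples = np.array([[0.0, 0.0], [0.4, 0.4], [2.0, 2.0]])
k = count_in_window([0.0, 0.0], samples, h=1.0)
```

Shrinking h shrinks the hypercube, so fewer samples are counted; with `h=0.5` only the sample at the origin remains inside.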

Hence the estimated density function is: \( f(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h^{d}} \varphi\left(\frac{x-x_{i}}{h}\right) \)

Also, since the volume is V_n = h_n^d (written V = h^d when the subscript n is dropped), the density function becomes:
\( f(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V} \varphi\left(\frac{x-x_{i}}{h}\right) \)
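
A minimal sketch of this estimator in Python, assuming the hypercube window and a fixed edge length h, could look like the snippet below. The function name `parzen_density` and the sample data are hypothetical.

```python
import numpy as np

def parzen_density(x, samples, h):
    """Parzen window estimate f(x) = (1 / n) * sum_i (1 / h^d) * phi((x - x_i) / h)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Treat the samples as an (n, d) array; 1-D input becomes (n, 1).
    samples = np.asarray(samples, dtype=float).reshape(len(samples), -1)
    n, d = samples.shape
    u = (x - samples) / h
    # phi is 1 for samples inside the hypercube of side h centered at x.
    phi = np.all(np.abs(u) <= 0.5, axis=1).astype(float)
    volume = h ** d
    return phi.sum() / (n * volume)

# Hypothetical 1-D samples: three of the four lie within 0.5 of x = 1.2.
samples = np.array([1.0, 1.2, 1.4, 3.0])
estimate = parzen_density(1.2, samples, h=1.0)  # k / (n * V) = 3 / (4 * 1)
```

The choice of h is a trade-off: a small h gives a spiky estimate that follows individual samples, while a large h smooths the estimate and can hide structure in the data.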

f(x) would satisfy the following conditions:

  1. \( f(x) \geq 0, \quad \forall x \)
  2. \( \int f(x) \, dx = 1 \)
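
Both conditions can be checked numerically in one dimension. The sketch below is an illustration under assumed sample values and an assumed grid, not a proof: it evaluates the estimate on a fine grid and approximates the integral with a Riemann sum. The name `parzen_density_1d` is hypothetical.

```python
import numpy as np

def parzen_density_1d(grid, samples, h):
    """Evaluate the 1-D hypercube (box) Parzen estimate at every grid point."""
    u = (grid[:, None] - samples[None, :]) / h
    phi = (np.abs(u) <= 0.5).astype(float)
    return phi.sum(axis=1) / (len(samples) * h)

# Hypothetical samples and a grid wide enough to cover every window.
samples = np.array([0.0, 0.3, 1.0])
grid = np.linspace(-2.0, 3.0, 50001)
dx = grid[1] - grid[0]

f = parzen_density_1d(grid, samples, h=0.5)
integral = f.sum() * dx  # Riemann-sum approximation of the integral of f
```

Each sample contributes a box of width h and height 1 / (n h), that is, area 1 / n, so the n boxes together integrate to 1. The Riemann sum matches this only up to the grid resolution.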
