In R I can create the desired output by doing:

```
data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8),
rep(4.5, 3), rep(5.5, 1), rep(6.5, 8))
plot(density(data, bw=0.5))
```

In Python (with matplotlib), the closest I got was a simple histogram:

```
import matplotlib.pyplot as plt
data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
plt.hist(data, bins=6)
plt.show()
```

I also tried the `normed=True` parameter, but the only thing I could find beyond that was fitting a Gaussian to the histogram.

My latest attempts were around `scipy.stats` and `gaussian_kde`, following examples on the web, but I've been unsuccessful so far.

Sven has shown how to use the class `gaussian_kde` from SciPy, but you will notice that the result doesn't look quite like what you generated with R. This is because `gaussian_kde` tries to infer the bandwidth automatically. You can adjust the bandwidth by changing the `covariance_factor` function of the `gaussian_kde` class. Without changing that function, the curve comes out noticeably smoother than the one from R.

However, if I use the following code:

```
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
density = gaussian_kde(data)
xs = np.linspace(0, 8, 200)
# Override the bandwidth factor, then recompute the kernel covariance
# so that the new factor takes effect.
density.covariance_factor = lambda: .25
density._compute_covariance()
plt.plot(xs, density(xs))
plt.show()
```

I get a curve that is pretty close to what you are getting from R. What have I done? `gaussian_kde` uses a changeable function, `covariance_factor`, to calculate its bandwidth. Before I changed the function, the value returned by `covariance_factor` for this data was about .5. Lowering this value lowered the bandwidth. I had to call `_compute_covariance` after changing that function so that all of the factors would be recalculated correctly. It isn't an exact correspondence with the `bw` parameter from R, but hopefully it gets you headed in the right direction.
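To make the correspondence with R's `bw` more concrete: as far as I understand SciPy's implementation, the Gaussian kernel's standard deviation for one-dimensional data is the bandwidth factor times the sample standard deviation (computed with `ddof=1`), while R's `bw` is the kernel standard deviation itself. Under that assumption, dividing the desired bandwidth by the sample standard deviation gives the factor to pass in. The sketch below uses the public `bw_method` argument and `set_bandwidth` method instead of overriding `covariance_factor`; `r_bw` and `matching_factor` are names made up for this example.

```python
import numpy as np
from scipy.stats import gaussian_kde

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8

# Default bandwidth factor (Scott's rule): n ** (-1/5) for 1-D data.
default_kde = gaussian_kde(data)
default_factor = default_kde.factor

# R's bw is the kernel standard deviation; SciPy scales the sample
# standard deviation by a factor, so convert one into the other.
r_bw = 0.5  # illustrative name: the bandwidth passed to R's density()
matching_factor = r_bw / np.std(data, ddof=1)

kde = gaussian_kde(data, bw_method=matching_factor)
kernel_std = float(np.sqrt(kde.covariance[0, 0]))

# The same change on an existing object, without private methods.
default_kde.set_bandwidth(bw_method=matching_factor)

xs = np.linspace(0, 8, 200)
ys = kde(xs)  # density values, ready for plt.plot(xs, ys)

print(default_factor, matching_factor, kernel_std)
```

For this data the default factor comes out at about 0.51 and the matching factor at about 0.26, which is consistent with the hand-picked 0.25 above. R's `density` also evaluates the estimate on its own grid, so small visual differences may remain.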

Five years later, when I Google "how to create a kernel density plot using python", this thread still shows up at the top!

Today, a much easier way to do this is to use seaborn, a package that provides many convenient plotting functions and good style management.

```
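import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

data = [1.5]*7 + [2.5]*2 + [3.5]*8 + [4.5]*3 + [5.5]*1 + [6.5]*8
sns.set_style('whitegrid')
# seaborn 0.11+ names the bandwidth argument bw_method; older releases used bw.
sns.kdeplot(np.array(data), bw_method=0.5)
plt.show()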
```
