I have a set of X,Y data points (about 10k) that are easy to plot as a scatter plot but that I would like to represent as a heatmap.

I looked through the examples in MatPlotLib and they all seem to already start with heatmap cell values to generate the image.

Is there a method that converts a bunch of x,y, all different, to a heatmap (where zones with higher frequency of x,y would be "warmer")?

If you don't want hexagons, you can use numpy's `histogram2d` function:

```
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.clf()
plt.imshow(heatmap.T, extent=extent, origin='lower')
plt.show()
```

This makes a 50x50 heatmap. If you want, say, 512x384, you can put `bins=(512, 384)` in the call to `histogram2d`.
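As a quick check of what `bins=(512, 384)` produces, here is a minimal sketch (the seeded generator and the variable name `image` are only illustrative). It shows that the first axis of the returned array follows x and the second follows y, which is why the snippet above passes `heatmap.T` to `imshow`:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
x = rng.standard_normal(8873)
y = rng.standard_normal(8873)

heatmap, xedges, yedges = np.histogram2d(x, y, bins=(512, 384))

# The first axis of the result follows x, the second follows y
print(heatmap.shape)             # (512, 384)
# There is one more edge than there are bins along each axis
print(len(xedges), len(yedges))  # 513 385

# imshow draws the first axis vertically, so transpose to put y on the vertical axis
image = heatmap.T
print(image.shape)               # (384, 512)

# Every point lands in exactly one bin when the default range is used
print(int(heatmap.sum()) == x.size)  # True
```

With roughly 9k points spread over about 200k cells, most cells will be empty, so a grid this fine is probably better suited to much larger data sets.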

In *Matplotlib* lexicon, I think you want a **hexbin** plot.

If you're not familiar with this type of plot, it's just a *bivariate histogram* in which the xy-plane is tessellated by a regular grid of hexagons.

So, just as with a histogram, you count the number of points falling in each hexagon: discretize the plotting region as a set of *windows*, assign each point to one of these windows, and finally map the windows onto a *color array*, and you've got a hexbin diagram.

Though hexagons are less commonly used than, e.g., circles or squares, it is intuitive that they are a better choice for the geometry of the binning container:

hexagons have *nearest-neighbor symmetry* (square bins don't; e.g., the distance *from* a point on a square's border *to* a point inside that square is not everywhere equal), and

the hexagon is the highest n-polygon that gives *regular plane tessellation* (i.e., you can safely re-model your kitchen floor with hexagonal-shaped tiles because you won't have any void space between the tiles when you are finished; this is not true for any regular polygon with more sides, n >= 7).
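The tessellation claim can be checked with a little arithmetic. The sketch below assumes tiles meet corner to corner: a regular n-gon has an interior angle of 180(n - 2)/n degrees, and copies of it close up around a shared corner only if that angle divides 360 evenly. The helper name `tiles_plane` is made up for this illustration:

```python
from fractions import Fraction


def tiles_plane(n):
    """Return True if regular n-gons can meet around a corner with no gap."""
    interior = Fraction(180 * (n - 2), n)  # interior angle in degrees
    # Number of tiles around one corner; it must be a whole number
    return (Fraction(360) / interior).denominator == 1


regular_tilers = [n for n in range(3, 13) if tiles_plane(n)]
print(regular_tilers)  # [3, 4, 6]
```

For n >= 7 the interior angle lies strictly between 120 and 180 degrees, so 360 divided by it falls strictly between 2 and 3 and can never be a whole number. That leaves the hexagon as the largest regular polygon that tiles the plane.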

(*Matplotlib* uses the term *hexbin* plot; so do (AFAIK) all of the plotting libraries for *R*; still, I don't know if this is the generally accepted term for plots of this type, though I suspect it's likely given that *hexbin* is short for *hexagonal binning*, which describes the essential step in preparing the data for display.)

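```
from matplotlib import pyplot as PLT
from matplotlib import cm as CM
import numpy as NP

def bivariate_normal(X, Y, sigmax, sigmay, mux, muy):
    # Uncorrelated bivariate Gaussian density. Stand-in for
    # matplotlib.mlab.bivariate_normal, which newer Matplotlib releases no longer ship.
    z = ((X - mux) / sigmax) ** 2 + ((Y - muy) / sigmay) ** 2
    return NP.exp(-z / 2) / (2 * NP.pi * sigmax * sigmay)

# Build a regular grid of x-y points and a value 'z' at each of them
x = y = NP.linspace(-5, 5, 100)
X, Y = NP.meshgrid(x, y)
Z1 = bivariate_normal(X, Y, 2, 2, 0, 0)
Z2 = bivariate_normal(X, Y, 4, 1, 1, 1)
ZD = Z2 - Z1
x = X.ravel()
y = Y.ravel()
z = ZD.ravel()
gridsize = 30
PLT.subplot(111)
# 'C' is optional: it attaches a value to each x-y point, and each hexagon is
# colored by the mean of the values that fall in it. If 'C' is None (default),
# the result is a pure 2D histogram and the color is the point count.
# 'bins=None' uses those values directly rather than a log or binned scale.
PLT.hexbin(x, y, C=z, gridsize=gridsize, cmap=CM.jet, bins=None)
PLT.axis([x.min(), x.max(), y.min(), y.max()])
cb = PLT.colorbar()
cb.set_label('mean value')
PLT.show()
```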
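The example above colors each hexagon by the mean of `C`. For the case in the question, where only the density of scattered points matters, `C` can simply be left out. A minimal sketch, assuming normally distributed test data in place of the real X,Y points (the seed, sample size, and colormap are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the ~10k scattered points from the question
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = rng.standard_normal(10_000)

fig, ax = plt.subplots()
# No 'C' argument: each hexagon is colored by how many points fall inside it
hb = ax.hexbin(x, y, gridsize=30, cmap="viridis")
fig.colorbar(hb, ax=ax, label="points per hexagon")

# The per-hexagon counts are available from the returned collection
counts = hb.get_array()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result; denser regions come out "warmer", and `gridsize` controls how coarse the hexagons are.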

Licensed under: CC-BY-SA with attribution
