I have a histogram (see below) and I am trying to find its mean and standard deviation, along with code that fits a curve to it. I think there is something in SciPy or matplotlib that can help, but none of the examples I've tried work.
    import matplotlib.pyplot as plt
    import numpy as np

    with open('gau_b_g_s.csv') as f:
        v = np.loadtxt(f, delimiter=',', dtype="float", skiprows=1, usecols=None)

    fig, ax = plt.subplots()
    plt.hist(v, bins=500, color='#7F38EC', histtype='step')
    plt.title("Gaussian")
    plt.axis([-1, 2, 0, 20000])
    plt.show()
Take a look at this answer for fitting arbitrary curves to data. Basically, you can use scipy.optimize.curve_fit to fit any function you want to your data. The code below shows how you can fit a Gaussian to some random data (credit to this SciPy-User mailing list post).
    import numpy
    from scipy.optimize import curve_fit
    import matplotlib.pyplot as plt

    # Define some test data which is close to Gaussian
    data = numpy.random.normal(size=10000)

    hist, bin_edges = numpy.histogram(data, density=True)
    bin_centres = (bin_edges[:-1] + bin_edges[1:])/2

    # Define model function to be used to fit to the data above:
    def gauss(x, *p):
        A, mu, sigma = p
        return A*numpy.exp(-(x-mu)**2/(2.*sigma**2))

    # p0 is the initial guess for the fitting coefficients (A, mu and sigma above)
    p0 = [1., 0., 1.]

    coeff, var_matrix = curve_fit(gauss, bin_centres, hist, p0=p0)

    # Get the fitted curve
    hist_fit = gauss(bin_centres, *coeff)

    plt.plot(bin_centres, hist, label='Test data')
    plt.plot(bin_centres, hist_fit, label='Fitted data')
    plt.legend()

    # Finally, let's get the fitting parameters, i.e. the mean and standard deviation
    # (coeff holds A, mu and sigma, in that order):
    print('Fitted mean = ', coeff[1])
    print('Fitted standard deviation = ', coeff[2])

    plt.show()
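To apply the same idea to your own samples rather than to built-in test data, something like the sketch below might be a starting point. The helper name `fit_gaussian_to_hist`, the bin count and the synthetic `samples` array (standing in for the column loaded from the CSV) are all assumptions made for illustration. It seeds the initial guess from the sample statistics and reads one-standard-deviation uncertainties off the diagonal of the covariance matrix that `curve_fit` returns.

```python
import numpy as np
from scipy.optimize import curve_fit


def gauss(x, A, mu, sigma):
    """Gaussian with amplitude A, mean mu and standard deviation sigma."""
    return A * np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))


def fit_gaussian_to_hist(samples, bins=50):
    """Fit gauss() to a density histogram of samples (hypothetical helper)."""
    hist, bin_edges = np.histogram(samples, bins=bins, density=True)
    bin_centres = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Start the optimiser from the sample statistics instead of fixed guesses.
    p0 = [hist.max(), np.mean(samples), np.std(samples)]
    coeff, var_matrix = curve_fit(gauss, bin_centres, hist, p0=p0)
    # Square roots of the covariance diagonal give the parameter uncertainties.
    errors = np.sqrt(np.diag(var_matrix))
    return coeff, errors


# Synthetic stand-in for the real data (assumed values, not from the question).
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=0.2, size=10000)

coeff, errors = fit_gaussian_to_hist(samples)
print("Fitted mean = %.3f +/- %.3f" % (coeff[1], errors[1]))
# The model is symmetric in sigma, so take the absolute value to be safe.
print("Fitted standard deviation = %.3f +/- %.3f" % (abs(coeff[2]), errors[2]))
```

With real data you would pass the array loaded by `np.loadtxt` in place of `samples`. A starting guess taken from the data tends to matter more when the peak is far from 0 or much narrower than 1.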
You can try scikit-learn's Gaussian mixture model estimation, as below:
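    import numpy as np
    from sklearn.mixture import GaussianMixture

    # GaussianMixture replaces the old sklearn.mixture.GMM class,
    # which was removed in scikit-learn 0.20
    gmm = GaussianMixture()

    # sample data
    a = np.random.randn(1000)

    # result
    r = gmm.fit(a[:, np.newaxis])  # fit() expects 2D data of shape (n_samples, n_features)
    print("mean : %f, var : %f" % (r.means_[0, 0], r.covariances_[0, 0, 0]))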
Note that this way the model is fitted to the raw samples directly, so you don't need to estimate your sample distribution with a histogram first.
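If a single Gaussian is all you need, a lighter way to skip the histogram step might be `scipy.stats.norm.fit`, which returns maximum-likelihood estimates of the mean and standard deviation straight from the samples. This is a minimal sketch; the synthetic `samples` array and its parameters are assumptions standing in for your real data.

```python
import numpy as np
from scipy.stats import norm

# Synthetic stand-in for the real data (assumed values, not from the question).
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.5, scale=0.2, size=10000)

# Maximum-likelihood estimates of the mean (loc) and standard deviation (scale).
mu, sigma = norm.fit(samples)
print("mean : %f, std : %f" % (mu, sigma))

# Evaluate the fitted density on a grid, e.g. to overlay on a density histogram.
x = np.linspace(samples.min(), samples.max(), 200)
pdf = norm.pdf(x, loc=mu, scale=sigma)
```

To draw the curve over a histogram, the histogram would need to be normalised (for example with `density=True`) so that both share the same vertical scale.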