How can I plot the empirical CDF of an array of numbers in matplotlib in Python? I'm looking for the CDF analog of pylab's "hist" function.
One thing I can think of is:
from scipy.stats import cumfreq
a = array([...]) # my array of numbers
num_bins = ...

Asked By user248237
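
One possible sketch of an empirical CDF, assuming only NumPy is available: sort the data and pair each sorted value with the fraction of samples at or below it. The helper name `ecdf` and the sample values are hypothetical and not taken from the question.

```python
import numpy as np


def ecdf(values):
    """Return the sorted values and the fraction of samples <= each one."""
    x = np.sort(np.asarray(values, dtype=float))
    # The i-th sorted value (1-based) has i out of n samples at or below it.
    y = np.arange(1, x.size + 1) / x.size
    return x, y


# Hypothetical sample data, chosen only for illustration.
x, y = ecdf([3.0, 1.0, 2.0])
```

With matplotlib installed, something like `plt.step(x, y, where="post")` should draw the usual staircase shape.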

I have an array of lists of numbers, e.g.:
[0] (0.01, 0.01, 0.02, 0.04, 0.03)
[1] (0.00, 0.02, 0.02, 0.03, 0.02)
[2] (0.01, 0.02, 0.02, 0.03, 0.02)
...
[n] (0.01, 0.00, 0.01, 0.05, 0.03)
What I would like to do is efficiently calculate the mean an...

Asked By Alex Reynolds

Is there a convenient way to calculate percentiles for a sequence or single-dimensional numpy array?
I am looking for something similar to Excel's percentile function.
I looked in NumPy's statistics reference, and couldn't find this. All I could find is...

Asked By Uri
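
For reference, NumPy does provide `numpy.percentile`. A minimal sketch, with made-up data that is not from the question:

```python
import numpy as np

# Hypothetical one-dimensional data, for illustration only.
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# A single percentile (the 50th is the median).
median = np.percentile(data, 50)

# Several percentiles at once; the result is an array.
quartiles = np.percentile(data, [25, 75])
```

By default `numpy.percentile` interpolates linearly between neighbouring data points when the requested percentile falls between two samples.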

In R, I am using ccf or acf to compute the pair-wise cross-correlation function so that I can find out which shift gives me the maximum value. From the looks of it, R gives me a normalized sequence of values. Is there something similar in Python's scipy o...
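
A rough sketch of a normalized cross-correlation built on `numpy.correlate`, assuming two equal-length 1-D signals. The helper name `normalized_xcorr` and the test signals are hypothetical, and the lag sign convention here follows `numpy.correlate`, which may differ from R's.

```python
import numpy as np


def normalized_xcorr(a, v):
    """Cross-correlate two equal-length signals, scaled so lag 0 is Pearson's r."""
    a = np.asarray(a, dtype=float)
    v = np.asarray(v, dtype=float)
    # Standardize both signals; dividing one of them by n gives the scaling.
    a = (a - a.mean()) / (a.std() * a.size)
    v = (v - v.mean()) / v.std()
    corr = np.correlate(a, v, mode="full")
    # In "full" mode the lags run from -(len(v) - 1) to len(a) - 1.
    lags = np.arange(-(v.size - 1), a.size)
    return lags, corr


# Hypothetical signals: `delayed` is `base` shifted right by two samples.
base = [0, 0, 1, 2, 1, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 2, 1, 0]
lags, corr = normalized_xcorr(delayed, base)

# A positive best lag means the first argument lags behind the second.
best_lag = lags[np.argmax(corr)]
```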

I have some questions about boxplots in matplotlib:
Question A. What do the markers that I highlighted below with Q1, Q2, and Q3 represent? I believe Q1 is maximum and Q3 are outliers, but what is Q2?
...

Asked By Amelio Vazquez-Reina

I am looking for a function that takes as input two lists, and returns the Pearson correlation, and the significance of the correlation.

Asked By ariel
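
One candidate is `scipy.stats.pearsonr`, which returns the correlation coefficient together with a two-sided p-value. A minimal sketch, assuming SciPy is installed; the two lists are made-up data:

```python
from scipy.stats import pearsonr

# Hypothetical input lists, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

# r is the Pearson correlation; p_value is its two-sided significance.
r, p_value = pearsonr(x, y)
```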

I'm making a visualization of historical stock data for a project, and I'd like to highlight regions of drops. For instance, when the stock is experiencing significant drawdown, I would like to highlight it with a red region.
Can I do this automatically...

Asked By Alex

I've always thought that Python's advantages are code readability and development speed, but time and memory usage were not as good as those of C++.
These stats struck me really hard.
What does your experience tell you about Python vs C++ time and memor...

Asked By Alex

Python has my_sample = random.sample(range(100), 10) to randomly sample without replacement from [0, 100).
Suppose I have sampled n such numbers and now I want to sample one more without replacement (without including any of the previously sampled n), ho...

Asked By necromancer
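
A simple sketch of one way to draw an extra item, using only the standard library: filter out what has already been drawn and pick from the rest. The helper name `sample_one_more` is hypothetical, and this is only one of several possible approaches.

```python
import random


def sample_one_more(population, already_sampled, rng=random):
    """Pick one item from population that is not in already_sampled."""
    taken = set(already_sampled)
    remaining = [item for item in population if item not in taken]
    if not remaining:
        raise ValueError("nothing left to sample")
    return rng.choice(remaining)


# A seeded generator keeps the illustration reproducible.
rng = random.Random(0)
my_sample = rng.sample(range(100), 10)
extra = sample_one_more(range(100), my_sample, rng)
```

This rebuilds the remaining pool on every call, which costs time proportional to the population size. If many extra draws are expected, shuffling the population once and popping items off the end may be cheaper.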

One last newbie pandas question for the day: How do I generate a table for a single Series?
For example:
my_series = pandas.Series([1,2,2,3,3,3])
pandas.magical_frequency_function( my_series )
>> {
1 : 1,
2 : 2,
3 : 3
}
Lots...

Asked By Abe
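
For a single Series, `Series.value_counts()` in pandas is probably the closest match. A minimal sketch using the series from the question:

```python
import pandas as pd

my_series = pd.Series([1, 2, 2, 3, 3, 3])

# value_counts() returns a Series indexed by the distinct values.
counts = my_series.value_counts()

# Sort by value and convert to a plain dict if a mapping is preferred.
counts_dict = counts.sort_index().to_dict()
```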

How can we plot (in Python with matplotlib) bivariate Gaussian distributions, given their centers and covariance matrices as NumPy arrays?
Let's say that our parameters are as follows:
center1=np.array([3,3])
center2=np.array([5,5])
cov1=np.array([ [1.,.5],...

Asked By red

I can't seem to find any Python libraries that do multiple regression. The ones I find only do simple regression. I need to regress my dependent variable (y) against several independent variables (x1, x2, x3, etc.).
For example, with this data:
p...

Asked By Zach
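
One way to sketch an ordinary least squares fit with several predictors is `numpy.linalg.lstsq`, with a column of ones added for the intercept. The data below is made up for illustration (the question's own data is truncated), and the variable names are assumptions.

```python
import numpy as np

# Hypothetical predictors and a response built from known coefficients.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a column of ones for the intercept, then one column per predictor.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares solution; rcond=None selects NumPy's current default cutoff.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coef
```

This gives only the coefficients. Standard errors and p-values would need extra work or a statistics package.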

If I want to calculate the mean of two categories in Pandas, I can do it like this:
data = {'Category': ['cat2','cat1','cat2','cat1','cat2','cat1','cat2','cat1','cat1','cat1','cat2'],
'values': [1,2,3,1,2,3,1,2,3,5,1]}
my_data = DataFrame(data)
m...

Asked By hirolau

