I'm trying to fit a linear regression to a scatter plot I have generated. However, my data is in list format, and all of the examples I can find of using `polyfit` require using `arange`. `arange` doesn't accept lists, though. I have searched high and low for how to convert a list to an array and nothing seems clear. Am I missing something?

Following on, how best can I use my list of integers as input to `polyfit`?

Here is the `polyfit` example I am following:

```
from pylab import *
x = arange(data)
y = arange(data)
m,b = polyfit(x, y, 1)
plot(x, y, 'yo', x, m*x+b, '--k')
show()
```

`arange` *generates* lists (well, NumPy arrays); type `help(np.arange)` for the details. You don't need to call it on existing lists.

```
>>> import numpy as np
>>> x = [1,2,3,4]
>>> y = [3,5,7,9]
>>>
>>> m,b = np.polyfit(x, y, 1)
>>> m
2.0000000000000009
>>> b
0.99999999999999833
```
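On the "how do I convert a list to an array" part of the question, a minimal sketch, assuming made-up data and illustrative variable names: `np.asarray` performs the conversion explicitly, although `polyfit` already does the equivalent internally, so both calls below should give the same fit.

```python
import numpy as np

# Illustrative data (not from the question).
x_list = [1, 2, 3, 4]
y_list = [3, 5, 7, 9]

# Explicit conversion from list to NumPy array.
x = np.asarray(x_list, dtype=float)
y = np.asarray(y_list, dtype=float)

# polyfit accepts plain lists as well as arrays.
m_list, b_list = np.polyfit(x_list, y_list, 1)
m_arr, b_arr = np.polyfit(x, y, 1)

# With an array, the fitted line is evaluated elementwise.
y_line = m_arr * x + b_arr
```

The explicit conversion is mostly useful when you want to do further arithmetic on the data yourself.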

I should add that I tend to use `poly1d` here rather than write out "m*x+b" and the higher-order equivalents, so my version of your code would look something like this:

```
import numpy as np
import matplotlib.pyplot as plt
x = [1,2,3,4]
y = [3,5,7,10] # 10, not 9, so the fit isn't perfect
fit = np.polyfit(x,y,1)
fit_fn = np.poly1d(fit)
# fit_fn is now a function which takes in x and returns an estimate for y
plt.plot(x,y, 'yo', x, fit_fn(x), '--k')
plt.xlim(0, 5)
plt.ylim(0, 12)
plt.show()  # needed outside interactive mode to display the figure
```
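To illustrate the "higher-order equivalents" remark, here is a hedged sketch with made-up data lying exactly on `y = x**2 + 1`. The same `poly1d` wrapper works unchanged for a degree-2 fit, and the names `coeffs` and `fit_fn` are only illustrative.

```python
import numpy as np

# Illustrative data lying exactly on y = x**2 + 1.
x = [0, 1, 2, 3]
y = [1, 2, 5, 10]

# Degree-2 fit; coefficients come back highest power first.
coeffs = np.polyfit(x, y, 2)
fit_fn = np.poly1d(coeffs)

# fit_fn can be evaluated at a scalar or at a whole sequence of points.
y_at_4 = fit_fn(4)
y_fitted = fit_fn(x)
```

Only the degree argument changes; there is no need to write out `a*x**2 + b*x + c` by hand.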

This code:

```
from scipy.stats import linregress
linregress(x,y) #x and y are arrays or lists.
```

returns a result object (a named tuple) with the following fields:

- `slope` (float): slope of the regression line
- `intercept` (float): intercept of the regression line
- `rvalue` (float): correlation coefficient
- `pvalue` (float): two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero
- `stderr` (float): standard error of the estimated slope
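As a rough usage sketch (reusing the illustrative data from the `poly1d` example above), the fields can be read by attribute, and the slope and intercept should agree with what `np.polyfit` gives for a degree-1 fit:

```python
import numpy as np
from scipy.stats import linregress

# Same illustrative data as in the poly1d example.
x = [1, 2, 3, 4]
y = [3, 5, 7, 10]

result = linregress(x, y)

# Read individual fields by attribute name.
slope = result.slope
intercept = result.intercept
r_squared = result.rvalue ** 2

# Cross-check against a degree-1 polyfit on the same data.
m, b = np.polyfit(x, y, 1)
```

`linregress` is convenient when you also want the correlation coefficient or p-value, while `polyfit` generalizes to higher degrees.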

Licensed under: CC-BY-SA with attribution