python - OLS using statsmodel.formula.api versus statsmodel.api -

June 15, 2015

Can anyone explain the difference between `ols` in statsmodels.formula.api and `OLS` in statsmodels.api?

Using the Advertising data from the ISLR text, I ran OLS with both and got different results. I then compared them with scikit-learn's LinearRegression.

```
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression

# Raw string so the backslashes in the Windows path are not treated as escapes
df = pd.read_csv(r"c:\...\advertising.csv")

x1 = df.loc[:, ['tv']]
y1 = df.loc[:, ['sales']]

print("statsmodel.formula.api method")
model1 = smf.ols(formula='sales ~ tv', data=df).fit()
print(model1.params)

print("\nstatsmodel.api method")
model2 = sm.OLS(y1, x1)
results = model2.fit()
print(results.params)

print("\nsci-kit learn method")
model3 = LinearRegression()
model3.fit(x1, y1)
print(model3.coef_)
print(model3.intercept_)
```

The output is as follows:

```
statsmodel.formula.api method
Intercept    7.032594
tv           0.047537
dtype: float64

statsmodel.api method
tv    0.08325
dtype: float64

sci-kit learn method
[[ 0.04753664]]
[ 7.03259355]
```

The statsmodels.api method returns a different coefficient for tv than the statsmodels.formula.api and scikit-learn methods do.

What kind of OLS algorithm is statsmodels.api running that would produce a different result? Does anyone have a link to documentation that could help answer this question?

The difference is due to the presence or absence of an intercept:

- In statsmodels.formula.api, as in the R approach, a constant is automatically added to your data and an intercept is fitted.
- In statsmodels.api, you have to add the constant yourself (see the documentation). Try using add_constant from statsmodels.api:
```
x1 = sm.add_constant(x1) 
```
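To see why the slope changes when the constant is missing, here is a minimal sketch in plain Python. It does not call statsmodels. It uses the textbook closed-form least-squares estimators on made-up data (not the Advertising dataset), and the function names are hypothetical, chosen only for illustration.

```python
# Synthetic data (an assumption for illustration): y = 7 + 0.05 * x exactly.
x = [10.0, 50.0, 100.0, 150.0, 200.0, 250.0]
y = [7.0 + 0.05 * xi for xi in x]


def slope_through_origin(x, y):
    """Least squares for y = b * x (no intercept): b = sum(x*y) / sum(x*x)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def intercept_and_slope(x, y):
    """Least squares for y = a + b * x: b = Sxy / Sxx, a = mean(y) - b * mean(x)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Model without a constant: the line is forced through the origin.
b_no_const = slope_through_origin(x, y)

# Model with a constant: the intercept absorbs the offset.
a, b = intercept_and_slope(x, y)

print("no intercept:   slope = %.5f" % b_no_const)
print("with intercept: intercept = %.5f, slope = %.5f" % (a, b))
```

With the intercept included, the fit recovers the slope used to generate the data. Without it, the line is forced through the origin and the slope is inflated to make up for the missing offset, which is the same pattern as the 0.083 versus 0.0475 discrepancy in the question.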
