
Analyzing data with the Python statsmodels package

news 2026/4/19 19:25:13

Original documentation: https://www.statsmodels.org/stable/index.html

Download the statsmodels installation packages

    aaa@kylin-pc:~/par$ python3 loong/pip-24.0.pyz download statsmodels -d 313 -i https://mirrors.aliyun.com/pypi/simple/ --platform manylinux2014_aarch64 --only-binary=:all: --python-version 3.13 --default-timeout=160
    ...
    Successfully downloaded statsmodels numpy packaging pandas patsy scipy python-dateutil pytz tzdata six

Install statsmodels

    aaa@kylin-pc:~/par$ cd tpy313
    aaa@kylin-pc:~/par/tpy313$ source myenv/bin/activate
    (myenv) aaa@kylin-pc:~/par/tpy313$ pip install --no-index -f 313 statsmodels
    ...
    Successfully installed pandas-2.3.2 patsy-1.0.2 pytz-2026.1.post1 scipy-1.16.3 statsmodels-0.14.6 tzdata-2026.1
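After an offline install like the one above, it can help to confirm which versions actually landed in the virtual environment. The following is a minimal sketch using only the standard library; the helper name installed_versions and the list of packages checked are illustrative assumptions, not part of the original session.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the packages themselves.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the pip output shown above.
    wanted = ["statsmodels", "numpy", "pandas", "patsy", "scipy"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Run inside the activated myenv environment, this should list the same versions that pip reported; a package shown as "not installed" would point to a missing wheel in the download directory.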

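Run the examples from the documentation. Network access is required, because get_rdataset downloads the Guerry dataset.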

    (myenv) aaa@kylin-pc:~/par/tpy313$ python3
    Python 3.13.13 (main, Apr 7 2026, 20:43:47) [Clang 22.1.1 ] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy as np
    >>> import statsmodels.api as sm
    >>> import statsmodels.formula.api as smf
    >>> dat = sm.datasets.get_rdataset("Guerry", "HistData").data
    >>> dat
        dept  Region    Department  Crime_pers  Crime_prop  Literacy  Donations  Infants  ...  Donation_clergy  Lottery  Desertion  Instruction  Prostitutes  Distance  Area  Pop1831
     0     1       E           Ain       28870       15890        37       5098    33120  ...               69       41         55           46           13   218.372  5762   346.03
     1     2       N         Aisne       26226        5521        51       8901    14572  ...               36       38         82           24          327    65.945  7369   513.00
     2     3       C        Allier       26747        7925        13      10973    17044  ...               76       66         16           85           34   161.927  7340   298.26
     3     4       E  Basses-Alpes       12935        7289        46       2733    23018  ...               37       80         32           29            2   351.399  6925   155.90
     4     5       E  Hautes-Alpes       17488        8174        69       6962    23076  ...               64       79         35            7            1   320.280  5549   129.10
    ..   ...     ...           ...         ...         ...       ...        ...      ...  ...              ...      ...        ...          ...          ...       ...   ...      ...
    81    86       W        Vienne       15010        4710        25       8922    35224  ...               44       40         38           65           18   170.523  6990   282.73
    82    87       C  Haute-Vienne       16256        6402        13      13817    19940  ...               78       55         11           84            7   198.874  5520   285.13
    83    88       E        Vosges       18835        9044        62       4040    14978  ...                5       14         85           11           43   174.477  5874   397.99
    84    89       C         Yonne       18006        6516        47       4276    16616  ...               35       51         66           27          272    81.797  7427   352.49
    85   200     NaN         Corse        2199        4589        49      37015    24743  ...               84       83          9           25            1   539.213  8680   195.41

    [86 rows x 23 columns]
    >>> results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()
    >>> print(results.summary())
                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                Lottery   R-squared:                       0.348
    Model:                            OLS   Adj. R-squared:                  0.333
    Method:                 Least Squares   F-statistic:                     22.20
    Date:                Fri, 17 Apr 2026   Prob (F-statistic):           1.90e-08
    Time:                        16:33:51   Log-Likelihood:                -379.82
    No. Observations:                  86   AIC:                             765.6
    Df Residuals:                      83   BIC:                             773.0
    Df Model:                           2
    Covariance Type:            nonrobust
    ===================================================================================
                          coef    std err          t      P>|t|      [0.025      0.975]
    -----------------------------------------------------------------------------------
    Intercept         246.4341     35.233      6.995      0.000     176.358     316.510
    Literacy           -0.4889      0.128     -3.832      0.000      -0.743      -0.235
    np.log(Pop1831)   -31.3114      5.977     -5.239      0.000     -43.199     -19.424
    ==============================================================================
    Omnibus:                        3.713   Durbin-Watson:                   2.019
    Prob(Omnibus):                  0.156   Jarque-Bera (JB):                3.394
    Skew:                          -0.487   Prob(JB):                        0.183
    Kurtosis:                       3.003   Cond. No.                         702.
    ==============================================================================

    Notes:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    >>> nobs = 100
    >>> X = np.random.random((nobs, 2))
    >>> X = sm.add_constant(X)
    >>> beta = [1, .1, .5]
    >>> e = np.random.random(nobs)
    >>> y = np.dot(X, beta) + e
    >>> results = sm.OLS(y, X).fit()
    >>> print(results.summary())
                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       0.263
    Model:                            OLS   Adj. R-squared:                  0.248
    Method:                 Least Squares   F-statistic:                     17.30
    Date:                Fri, 17 Apr 2026   Prob (F-statistic):           3.75e-07
    Time:                        16:35:40   Log-Likelihood:                -14.069
    No. Observations:                 100   AIC:                             34.14
    Df Residuals:                      97   BIC:                             41.95
    Df Model:                           2
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    const          1.4461      0.085     17.023      0.000       1.277       1.615
    x1             0.0461      0.104      0.443      0.658      -0.160       0.253
    x2             0.5766      0.098      5.865      0.000       0.381       0.772
    ==============================================================================
    Omnibus:                       49.277   Durbin-Watson:                   1.995
    Prob(Omnibus):                  0.000   Jarque-Bera (JB):                6.904
    Skew:                           0.074   Prob(JB):                       0.0317
    Kurtosis:                       1.721   Cond. No.                         6.04
    ==============================================================================

    Notes:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    >>>
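One detail of the second example is easy to miss: the noise term e comes from np.random.random, which is uniform on [0, 1) with mean 0.5 rather than zero. The fitted constant therefore estimates 1 + 0.5, which is why the summary reports const = 1.4461 even though beta starts with 1. The sketch below illustrates this under a few assumptions of mine: np.linalg.lstsq stands in for sm.OLS (both compute the least-squares coefficients), a fixed seed and a larger sample are used so the effect is stable, and the function name fit_ols and its center_noise flag are hypothetical.

```python
import numpy as np


def fit_ols(nobs=10_000, seed=0, center_noise=False):
    """Least-squares fit of y = X @ beta + e with uniform noise.

    Hypothetical helper for illustration. np.linalg.lstsq is used in
    place of sm.OLS so that the sketch depends only on NumPy.
    """
    rng = np.random.default_rng(seed)
    X = rng.random((nobs, 2))
    # Prepend a column of ones, the same role sm.add_constant plays.
    X = np.column_stack([np.ones(nobs), X])
    beta = np.array([1.0, 0.1, 0.5])
    e = rng.random(nobs)  # uniform on [0, 1), mean 0.5
    if center_noise:
        e = e - 0.5       # shift the noise to mean 0
    y = X @ beta + e
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


raw = fit_ols()
centered = fit_ols(center_noise=True)
print("uniform noise on [0, 1):   ", np.round(raw, 3))
print("noise shifted to mean zero:", np.round(centered, 3))
```

With the same seed, the two fits differ only in the constant, by the noise mean of 0.5; the slope estimates are unchanged. The uniform noise also explains the low kurtosis (1.721) and the small Jarque-Bera p-value in the summary above, since a uniform distribution has kurtosis 1.8 instead of the 3 of a normal distribution.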
