
Notes | Installing pycaret on macOS

懒麻蛇 2021-08-18


pycaret


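pycaret is the lazy person's toolkit for machine learning. Compared with other open-source machine learning libraries, pycaret is an alternative low-code library that can replace hundreds of lines of code with just a few. It is essentially a wrapper around several machine learning libraries and frameworks, such as scikit-learn, XGBoost, Microsoft LightGBM, and spaCy.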


For example, a few years ago, comparing several sklearn estimators took code along these lines:

    # Regression problem
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    import numpy as np
    from sklearn import model_selection
    from sklearn.metrics import make_scorer, mean_squared_error
    from sklearn.svm import SVR, LinearSVR
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.linear_model import LinearRegression,Ridge,Lasso,ElasticNet,BayesianRidge,SGDRegressor
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.neural_network import MLPRegressor
    from sklearn.ensemble import GradientBoostingRegressor,RandomForestRegressor,ExtraTreesRegressor
    from sklearn.kernel_ridge import KernelRidge


    models=[]
    models.append(('DecisionTree', DecisionTreeRegressor()))
    models.append(('Ridge', Ridge()))
    models.append(('Lasso', Lasso()))
    models.append(('EN', ElasticNet(alpha=0.001,max_iter=10000)))
    models.append(('BayesianRidge',BayesianRidge()))
    models.append(('SVM',SVR()))
    models.append(('KNeighbors',KNeighborsRegressor()))
    models.append(('NN',MLPRegressor()))
    models.append(('GBoosting',GradientBoostingRegressor()))
    models.append(('RF',RandomForestRegressor()))
    models.append(('ExtraTrees',ExtraTreesRegressor()))
    models.append(('SGD',SGDRegressor(max_iter=1000,tol=1e-3)))
    models.append(('Kernel_Ridge',KernelRidge(alpha=0.6, kernel='polynomial', degree=2, coef0=2.5)))
    models.append(('LR_SVR',LinearSVR()))
    models.append(('LR',LinearRegression()))






    def compare_scores_mae(models, X, y):
        """Cross-validate every (name, model) pair and plot the mean negative MAE."""
        cv_means = []
        cv_std = []
        names = []
        for name, model in models:
            # 10-fold cross-validation, scored by negative mean absolute error
            kfold = model_selection.KFold(n_splits=10)
            cv_results = model_selection.cross_val_score(
                model, X, y, cv=kfold, scoring='neg_mean_absolute_error', n_jobs=10)
            cv_means.append(cv_results.mean())
            cv_std.append(cv_results.std())
            names.append(name)
            msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
            print(msg)
        cv_res = pd.DataFrame({"CrossValMeans": cv_means, "CrossValerrors": cv_std, "Algorithm": names})
        # Horizontal bars, one per algorithm, with the CV standard deviation as error bars
        g = sns.barplot(x="CrossValMeans", y="Algorithm", data=cv_res, palette="Set3", orient="h", xerr=cv_std)
        g.set_xlabel("negative MAE")
        g.set_title("Cross validation scores")
        return cv_res


Yes, that is exactly how I did it before I discovered pycaret.

pycaret packages 📦 all of this code into a single function: compare_models()
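For comparison, here is a minimal sketch of what the pycaret side can look like. The helper name `compare_with_pycaret`, the `session_id` value, and the choice to sort by MAE are my own assumptions for illustration; `setup` and `compare_models` come from `pycaret.regression`. Depending on the pycaret version, `setup` may pause and ask you to confirm the inferred column types.

```python
def compare_with_pycaret(df, target):
    """Compare pycaret's regression models on a DataFrame and return the best one.

    `df` is a pandas DataFrame and `target` is the name of the column to predict.
    """
    # Imported lazily so this file still loads on a machine where the
    # pycaret install has not succeeded yet.
    from pycaret.regression import setup, compare_models

    # Prepare the experiment; session_id only fixes the random seed.
    setup(data=df, target=target, session_id=42)

    # Cross-validate the available regressors and rank them by MAE,
    # roughly what compare_scores_mae does by hand.
    return compare_models(sort="MAE")
```

Calling `compare_with_pycaret(df, "price")` on a DataFrame with a hypothetical `price` column would print the comparison table and return the top-ranked model.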




Installation


The official documentation gives a pip-based install:

    # installing for the first time
    pip install pycaret
    # if you have installed a beta version in the past, run the command below to upgrade
    pip install --upgrade pycaret
    # run the code below in your notebook to check the installed version
    from pycaret.utils import version
    version()

or, alternatively, an install inside a conda environment:

    # create a conda environment
    conda create --name yourenvname python=3.6
    # activate environment
    conda activate yourenvname
    # install pycaret
    pip install pycaret
    # create notebook kernel connected with the conda environment
    python -m ipykernel install --user --name yourenvname --display-name "display-name-here"

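On a Colab or Kaggle instance, just use !pip. But if you have ever installed Python or R packages, you know things are rarely that simple: on macOS, installing llvmlite and LightGBM throws all sorts of unexpected errors and the installation fails. It took me about an hour to get pycaret installed on macOS (between you and me, it installs on a Kaggle instance without any trouble).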
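When an install half-succeeds, it helps to see which of the troublesome packages actually made it into the environment. The sketch below is one way to check, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_version` is made up for this note.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The three packages this note struggles with on macOS.
    for name in ("pycaret", "llvmlite", "lightgbm"):
        print(name, installed_version(name) or "not installed")
```

Run it inside the same conda environment you installed into, otherwise it reports on a different interpreter.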





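llvmlite

Installing with pip kept failing with setuptools-related errors, and along the way I found that llvmlite needs cmake. I installed cmake with brew, but it still did not work. Finally, in a corner of GitHub, I found that the easy_install command easily solves the problem of llvmlite failing to install on Python 3.8, so I tried it right away (my own conda environment uses Python 3.5) and the problem went away. I don't know why: pip fails, easy_install works.

    brew install cmake
    easy_install llvmlite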



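LightGBM

LightGBM is one of the strongest tree-based models. Since it is one of the models pycaret includes, installing pycaret also means installing LightGBM. Installing LightGBM on Windows is simple (it is developed by Microsoft): the pip installer bundled with Python is enough. On a Mac, installing with pip runs into errors, so you need to build the C version of LightGBM from source.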

    pip uninstall lightgbm
    git clone --recursive https://github.com/Microsoft/LightGBM ; cd LightGBM
    export CXX=g++-8 CC=gcc-8
    mkdir build ; cd build
    cmake ..
    make -j4

If you find you don't have gcc-8, install it with brew; as far as I remember, cmake is needed as well.

    brew install gcc@8
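Before starting the build, it can save time to confirm that the tools it relies on are on your PATH. This is a small sketch using only the standard library; the helper name `missing_tools` is hypothetical, and the tool list simply mirrors the commands used in the build steps.

```python
import shutil


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # gcc-8 and g++-8 are the compiler names exported before running cmake.
    absent = missing_tools(["git", "cmake", "make", "gcc-8", "g++-8"])
    print("missing:", ", ".join(absent) if absent else "nothing")
```

Anything reported as missing is a candidate for a brew install before you run cmake.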





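Final advice


It is best not to mix conda and pip installs.

Don't upgrade pip; after upgrading you will feel like you need to reinstall Python.


After the upgrade, running pip looks like this: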

    File "F:\anaconda\envs\emotion\lib\site-packages\pkg_resources\__init__.py", line 2331, in resolve
      module = __import__(self.module_name, fromlist=['__name__'], level=0)
    File "F:\anaconda\envs\emotion\lib\site-packages\pip\_internal\__init__.py", line 42, in <module>
      from pip._internal import cmdoptions
    File "F:\anaconda\envs\emotion\lib\site-packages\pip\_internal\cmdoptions.py", line 16, in <module>
      from pip._internal.index import (
    ImportError: cannot import name 'FormatControl'


As a bonus, here is how to downgrade:

https://pypi.org/project/pip/19.1.1/#files

Manually download the second file, unpack it, and run the following in its directory:

    python setup.py install





End













