Databricks Runtime 6.1 pour ML (non pris en charge)
Databricks a publié cette image en octobre 2019.
Databricks Runtime 6.1 for Machine Learning fournit un environnement prêt à l'emploi pour l'apprentissage automatique et la science des données basé sur Databricks Runtime 6.1 (non pris en charge). Databricks Runtime ML contient de nombreuses bibliothèques populaires de Machine Learning, notamment TensorFlow, PyTorch, Keras et XGBoost. Il prend également en charge la formation de Deep Learning distribué avec Horovod.
Pour plus d’informations, notamment les instructions relatives à la création d’un cluster Databricks Runtime ML, consultez IA et Machine Learning sur Databricks.
Nouvelles fonctionnalités
Databricks Runtime 6.1 ML s’appuie sur Databricks Runtime 6.1 . Pour plus d’informations sur les nouveautés de Databricks Runtime 6.1, consultez les notes de publication sur Databricks Runtime 6.1 (non pris en charge).
Améliorations
Bibliothèques de Machine Learning mises à niveau :
- TensorFlow : de 1.13.1 à 1.14.0
- PyTorch : de 1.1.0 à 1.2.0
- Torchvision : de 0.3.0 à 0.4.0
- MLflow : de 1.2.0 à 1.3.0
Environnement du système
L’environnement système de Databricks Runtime 6.1 ML diffère de Databricks Runtime 6.1 comme suit :
- DBUtils : Ne contient pas l’Utilitaire de bibliothèque (dbutils.library) (hérité).
- Pour des clusters GPU, les bibliothèques GPU NVIDIA suivantes :
- Pilote NVIDIA 418.40
- CUDA 10.0
- CUDNN 7.6.0
Bibliothèques
Les sections suivantes répertorient les bibliothèques incluses dans Databricks Runtime ML 6.1 qui diffèrent de celles incluses dans Databricks Runtime 6.1.
Bibliothèques de niveau supérieur
Databricks Runtime 6.1 ML comprend les bibliothèques de niveau supérieur suivantes :
- GraphFrames
- Horovod et HorovodRunner
- MLflow
- PyTorch
- spark-tensorflow-connector
- TensorFlow
- TensorBoard
Bibliothèques Python
Databricks Runtime 6.1 ML utilise Conda pour la gestion des packages Python et comprend de nombreux packages ML populaires. La section suivante décrit l’environnement Conda pour Databricks Runtime 6.1 ML.
Python sur clusters UC
name: databricks-ml
channels:
- Databricks
- pytorch
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _py-xgboost-mutex=2.0=cpu_0
- _tflow_select=2.3.0=mkl
- absl-py=0.8.0=py37_0
- asn1crypto=0.24.0=py37_0
- astor=0.8.0=py37_0
- backcall=0.1.0=py37_0
- backports=1.0=py_2
- bcrypt=3.1.7=py37h7b6447c_0
- blas=1.0=mkl
- boto=2.49.0=py37_0
- boto3=1.9.162=py_0
- botocore=1.12.163=py_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2019.1.23=0
- certifi=2019.3.9=py37_0
- cffi=1.12.2=py37h2e261b9_1
- chardet=3.0.4=py37_1003
- click=7.0=py37_0
- cloudpickle=0.8.0=py37_0
- colorama=0.4.1=py37_0
- configparser=3.7.4=py37_0
- cpuonly=1.0=0
- cryptography=2.6.1=py37h1ba5d50_0
- cycler=0.10.0=py37_0
- cython=0.29.6=py37he6710b0_0
- decorator=4.4.0=py37_1
- docutils=0.14=py37_0
- entrypoints=0.3=py37_0
- et_xmlfile=1.0.1=py37_0
- flask=1.0.2=py37_1
- freetype=2.9.1=h8a8886c_1
- future=0.17.1=py37_0
- gast=0.3.2=py_0
- gitdb2=2.0.5=py37_0
- gitpython=2.1.11=py37_0
- google-pasta=0.1.7=py_0
- grpcio=1.16.1=py37hf8bcb03_1
- gunicorn=19.9.0=py37_0
- h5py=2.9.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- html5lib=1.0.1=py_0
- icu=58.2=h9c2bf20_1
- idna=2.8=py37_0
- intel-openmp=2019.3=199
- ipython=7.4.0=py37h39e3cac_0
- ipython_genutils=0.2.0=py37_0
- itsdangerous=1.1.0=py37_0
- jdcal=1.4=py37_0
- jedi=0.13.3=py37_0
- jinja2=2.10=py37_0
- jmespath=0.9.4=py_0
- jpeg=9b=h024ee3a_2
- keras=2.2.4=0
- keras-applications=1.0.8=py_0
- keras-base=2.2.4=py37_0
- keras-preprocessing=1.1.0=py_1
- kiwisolver=1.0.1=py37hf484d3e_0
- krb5=1.16.1=h173b8e3_7
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=8.2.0=hdf63c60_1
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.36=hbc83047_0
- libpq=11.2=h20c2e04_0
- libprotobuf=3.9.2=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=8.2.0=hdf63c60_1
- libtiff=4.0.10=h2733197_2
- libxgboost=0.90=he6710b0_0
- libxml2=2.9.9=hea5a465_1
- libxslt=1.1.33=h7d1a2b0_0
- llvmlite=0.28.0=py37hd408876_0
- lxml=4.3.2=py37hefd8a0e_0
- mako=1.0.10=py_0
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- mkl=2019.3=199
- mkl_fft=1.0.10=py37ha843d7b_0
- mkl_random=1.0.2=py37hd81dba3_0
- ncurses=6.1=he6710b0_1
- networkx=2.2=py37_1
- ninja=1.9.0=py37hfd86e86_0
- nose=1.3.7=py37_2
- numba=0.43.1=py37h962f231_0
- numpy=1.16.2=py37h7e9f1db_0
- numpy-base=1.16.2=py37hde5b4d6_0
- olefile=0.46=py37_0
- openpyxl=2.6.1=py37_1
- openssl=1.1.1b=h7b6447c_1
- pandas=0.24.2=py37he6710b0_0
- paramiko=2.4.2=py37_0
- parso=0.3.4=py37_0
- pathlib2=2.3.3=py37_0
- patsy=0.5.1=py37_0
- pexpect=4.6.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=5.4.1=py37h34e0f95_0
- pip=19.0.3=py37_0
- ply=3.11=py37_0
- prompt_toolkit=2.0.9=py37_0
- protobuf=3.9.2=py37he6710b0_0
- psutil=5.6.1=py37h7b6447c_0
- psycopg2=2.7.6.1=py37h1ba5d50_0
- ptyprocess=0.6.0=py37_0
- py-xgboost=0.90=py37he6710b0_0
- py-xgboost-cpu=0.90=py37_0
- pyasn1=0.4.7=py_0
- pycparser=2.19=py37_0
- pygments=2.3.1=py37_0
- pymongo=3.8.0=py37he6710b0_1
- pynacl=1.3.0=py37h7b6447c_0
- pyopenssl=19.0.0=py37_0
- pyparsing=2.3.1=py37_0
- pysocks=1.6.8=py37_0
- python=3.7.3=h0371630_0
- python-dateutil=2.8.0=py37_0
- python-editor=1.0.4=py_0
- pytorch=1.2.0=py3.7_cpu_0
- pytz=2018.9=py37_0
- pyyaml=5.1=py37h7b6447c_0
- readline=7.0=h7b6447c_5
- requests=2.21.0=py37_0
- s3transfer=0.2.1=py37_0
- scikit-learn=0.20.3=py37hd81dba3_0
- scipy=1.2.1=py37h7c811a0_0
- setuptools=40.8.0=py37_0
- simplejson=3.16.0=py37h14c3975_0
- singledispatch=3.4.0.3=py37_0
- six=1.12.0=py37_0
- smmap2=2.0.5=py37_0
- sqlite=3.27.2=h7b6447c_0
- sqlparse=0.3.0=py_0
- statsmodels=0.9.0=py37h035aef0_0
- tabulate=0.8.3=py37_0
- tensorboard=1.14.0=py37hf484d3e_0
- tensorflow=1.14.0+db1=mkl_py37h0f35a5d_0
- tensorflow-base=1.14.0+db1=mkl_py37h7ce6ba3_0
- tensorflow-estimator=1.14.0+db1=py_0
- tensorflow-mkl=1.14.0+db1=h4fcabd2_0
- termcolor=1.1.0=py37_1
- tk=8.6.8=hbc83047_0
- torchvision=0.4.0=py37_cpu
- tqdm=4.31.1=py37_1
- traitlets=4.3.2=py37_0
- urllib3=1.24.1=py37_0
- virtualenv=16.0.0=py37_0
- wcwidth=0.1.7=py37_0
- webencodings=0.5.1=py37_1
- websocket-client=0.56.0=py37_0
- werkzeug=0.14.1=py37_0
- wheel=0.33.1=py37_0
- wrapt=1.11.1=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
- argparse==1.4.0
- databricks-cli==0.9.0
- docker==4.1.0
- fusepy==2.0.4
- gorilla==0.3.0
- horovod==0.18.1
- hyperopt==0.1.2.db8
- matplotlib==3.0.3
- mleap==0.8.1
- mlflow==1.3.0
- nose-exclude==0.5.0
- pyarrow==0.13.0
- querystring-parser==1.2.4
- seaborn==0.9.0
- tensorboardx==1.8+db1
prefix: /databricks/conda/envs/databricks-ml
Python sur les clusters GPU
name: databricks-ml-gpu
channels:
- Databricks
- pytorch
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _py-xgboost-mutex=1.0=gpu_0
- _tflow_select=2.1.0=gpu
- absl-py=0.8.0=py37_0
- asn1crypto=0.24.0=py37_0
- astor=0.8.0=py37_0
- backcall=0.1.0=py37_0
- backports=1.0=py_2
- bcrypt=3.1.7=py37h7b6447c_0
- blas=1.0=mkl
- boto=2.49.0=py37_0
- boto3=1.9.162=py_0
- botocore=1.12.163=py_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2019.1.23=0
- certifi=2019.3.9=py37_0
- cffi=1.12.2=py37h2e261b9_1
- chardet=3.0.4=py37_1003
- click=7.0=py37_0
- cloudpickle=0.8.0=py37_0
- colorama=0.4.1=py37_0
- configparser=3.7.4=py37_0
- cryptography=2.6.1=py37h1ba5d50_0
- cudatoolkit=10.0.130=0
- cudnn=7.6.0=cuda10.0_0
- cupti=10.0.130=0
- cycler=0.10.0=py37_0
- cython=0.29.6=py37he6710b0_0
- decorator=4.4.0=py37_1
- docutils=0.14=py37_0
- entrypoints=0.3=py37_0
- et_xmlfile=1.0.1=py37_0
- flask=1.0.2=py37_1
- freetype=2.9.1=h8a8886c_1
- future=0.17.1=py37_0
- gast=0.3.2=py_0
- gitdb2=2.0.5=py37_0
- gitpython=2.1.11=py37_0
- google-pasta=0.1.7=py_0
- grpcio=1.16.1=py37hf8bcb03_1
- gunicorn=19.9.0=py37_0
- h5py=2.9.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- html5lib=1.0.1=py_0
- icu=58.2=h9c2bf20_1
- idna=2.8=py37_0
- intel-openmp=2019.3=199
- ipython=7.4.0=py37h39e3cac_0
- ipython_genutils=0.2.0=py37_0
- itsdangerous=1.1.0=py37_0
- jdcal=1.4=py37_0
- jedi=0.13.3=py37_0
- jinja2=2.10=py37_0
- jmespath=0.9.4=py_0
- jpeg=9b=h024ee3a_2
- keras=2.2.4=0
- keras-applications=1.0.8=py_0
- keras-base=2.2.4=py37_0
- keras-preprocessing=1.1.0=py_1
- kiwisolver=1.0.1=py37hf484d3e_0
- krb5=1.16.1=h173b8e3_7
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=8.2.0=hdf63c60_1
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.36=hbc83047_0
- libpq=11.2=h20c2e04_0
- libprotobuf=3.9.2=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=8.2.0=hdf63c60_1
- libtiff=4.0.10=h2733197_2
- libxgboost=0.90=h688424c_0
- libxml2=2.9.9=hea5a465_1
- libxslt=1.1.33=h7d1a2b0_0
- llvmlite=0.28.0=py37hd408876_0
- lxml=4.3.2=py37hefd8a0e_0
- mako=1.0.10=py_0
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- mkl=2019.3=199
- mkl_fft=1.0.10=py37ha843d7b_0
- mkl_random=1.0.2=py37hd81dba3_0
- ncurses=6.1=he6710b0_1
- networkx=2.2=py37_1
- ninja=1.9.0=py37hfd86e86_0
- nose=1.3.7=py37_2
- numba=0.43.1=py37h962f231_0
- numpy=1.16.2=py37h7e9f1db_0
- numpy-base=1.16.2=py37hde5b4d6_0
- olefile=0.46=py37_0
- openpyxl=2.6.1=py37_1
- openssl=1.1.1b=h7b6447c_1
- pandas=0.24.2=py37he6710b0_0
- paramiko=2.4.2=py37_0
- parso=0.3.4=py37_0
- pathlib2=2.3.3=py37_0
- patsy=0.5.1=py37_0
- pexpect=4.6.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=5.4.1=py37h34e0f95_0
- pip=19.0.3=py37_0
- ply=3.11=py37_0
- prompt_toolkit=2.0.9=py37_0
- protobuf=3.9.2=py37he6710b0_0
- psutil=5.6.1=py37h7b6447c_0
- psycopg2=2.7.6.1=py37h1ba5d50_0
- ptyprocess=0.6.0=py37_0
- py-xgboost=0.90=py37h688424c_0
- py-xgboost-gpu=0.90=py37h28bbb66_0
- pyasn1=0.4.7=py_0
- pycparser=2.19=py37_0
- pygments=2.3.1=py37_0
- pymongo=3.8.0=py37he6710b0_1
- pynacl=1.3.0=py37h7b6447c_0
- pyopenssl=19.0.0=py37_0
- pyparsing=2.3.1=py37_0
- pysocks=1.6.8=py37_0
- python=3.7.3=h0371630_0
- python-dateutil=2.8.0=py37_0
- python-editor=1.0.4=py_0
- pytorch=1.2.0=py3.7_cuda10.0.130_cudnn7.6.2_0
- pytz=2018.9=py37_0
- pyyaml=5.1=py37h7b6447c_0
- readline=7.0=h7b6447c_5
- requests=2.21.0=py37_0
- s3transfer=0.2.1=py37_0
- scikit-learn=0.20.3=py37hd81dba3_0
- scipy=1.2.1=py37h7c811a0_0
- setuptools=40.8.0=py37_0
- simplejson=3.16.0=py37h14c3975_0
- singledispatch=3.4.0.3=py37_0
- six=1.12.0=py37_0
- smmap2=2.0.5=py37_0
- sqlite=3.27.2=h7b6447c_0
- sqlparse=0.3.0=py_0
- statsmodels=0.9.0=py37h035aef0_0
- tabulate=0.8.3=py37_0
- tensorboard=1.14.0=py37hf484d3e_0
- tensorflow=1.14.0+db1=gpu_py37h517d0a7_0
- tensorflow-base=1.14.0+db1=gpu_py37he292aa2_0
- tensorflow-estimator=1.14.0+db1=py_0
- tensorflow-gpu=1.14.0+db1=h0d30ee6_0
- termcolor=1.1.0=py37_1
- tk=8.6.8=hbc83047_0
- torchvision=0.4.0=py37_cu100
- tqdm=4.31.1=py37_1
- traitlets=4.3.2=py37_0
- urllib3=1.24.1=py37_0
- virtualenv=16.0.0=py37_0
- wcwidth=0.1.7=py37_0
- webencodings=0.5.1=py37_1
- websocket-client=0.56.0=py37_0
- werkzeug=0.14.1=py37_0
- wheel=0.33.1=py37_0
- wrapt=1.11.1=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- yaml=0.1.7=had09818_2
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
- argparse==1.4.0
- databricks-cli==0.9.0
- docker==4.1.0
- fusepy==2.0.4
- gorilla==0.3.0
- horovod==0.18.1
- hyperopt==0.1.2.db8
- matplotlib==3.0.3
- mleap==0.8.1
- mlflow==1.3.0
- nose-exclude==0.5.0
- pyarrow==0.13.0
- querystring-parser==1.2.4
- seaborn==0.9.0
- tensorboardx==1.8+db1
prefix: /databricks/conda/envs/databricks-ml-gpu
Packages Spark contenant des modules Python
Package Spark | Module Python | Version |
---|---|---|
graphframes | graphframes | 0.7.0-db1-spark2.4 |
spark-deep-learning | sparkdl | 1.5.0-db5-spark2.4 |
tensorframes | tensorframes | 0.8.1-s_2.11 |
Bibliothèques R
Les bibliothèques R sont identiques aux bibliothèques R dans Databricks Runtime 6.1.
Bibliothèques Java et Scala (cluster Scala 2.11)
En plus des bibliothèques Java et Scala dans Databricks Runtime 6.1, Databricks Runtime 6.1 ML contient les fichiers jar suivants :
ID de groupe | ID d’artefact | Version |
---|---|---|
com.databricks | spark-deep-learning | 1.5.0-db5-spark2.4 |
com.typesafe.akka | akka-actor_2.11 | 2.3.11 |
ml.combust.mleap | mleap-databricks-runtime_2.11 | 0.14.0 |
ml.dmlc | xgboost4j | 0,90 |
ml.dmlc | xgboost4j-spark | 0,90 |
org.graphframes | graphframes_2.11 | 0.7.0-db1-spark2.4 |
org.mlflow | mlflow-client | 1.3.0 |
org.tensorflow | libtensorflow | 1.14.0 |
org.tensorflow | libtensorflow_jni | 1.14.0 |
org.tensorflow | spark-tensorflow-connector_2.11 | 1.14.0 |
org.tensorflow | tensorflow | 1.14.0 |
org.tensorframes | tensorframes | 0.8.1-s_2.11 |