Databricks Runtime 14.3 LTS for Machine Learning

Databricks Runtime 14.3 LTS for Machine Learning provides a ready-to-use environment for machine learning and data science, based on Databricks Runtime 14.3 LTS. Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. It also includes AutoML, a tool that automatically trains machine learning pipelines, and supports distributed deep learning training using Horovod.

New features and improvements

Databricks Runtime 14.3 LTS ML is built on top of Databricks Runtime 14.3 LTS. For information on what's new in Databricks Runtime 14.3 LTS, including Apache Spark MLlib and SparkR, see the Databricks Runtime 14.3 LTS release notes.

System environment

The system environment in Databricks Runtime 14.3 LTS ML differs from Databricks Runtime 14.3 LTS as follows:

  • For GPU clusters, Databricks Runtime ML includes the following NVIDIA GPU libraries:
    • CUDA 11.8
    • cuDNN 8.9.0.131-1
    • NCCL 2.15.5
    • TensorRT 8.5.3-1
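
On a GPU cluster, the library versions listed above can be inspected from a notebook. The following is a minimal sketch, assuming PyTorch (listed among the bundled Python packages) is importable. The helper names `decode_cudnn_version` and `report_gpu_libraries` are hypothetical, and the decoding assumes the cuDNN 8.x integer encoding (major*1000 + minor*100 + patch). PyTorch wheels may bundle their own copies of these libraries, so the values it reports can differ from the system-level versions listed above.

```python
from typing import Dict, Tuple


def decode_cudnn_version(encoded: int) -> Tuple[int, int, int]:
    """Split a cuDNN 8.x integer version (e.g. 8900) into (major, minor, patch)."""
    major, rest = divmod(encoded, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def report_gpu_libraries() -> Dict[str, object]:
    """Collect the CUDA, cuDNN, and NCCL versions as seen by PyTorch.

    torch is imported lazily so this module still loads where torch is absent.
    """
    import torch

    report: Dict[str, object] = {"cuda": torch.version.cuda}
    cudnn = torch.backends.cudnn.version()  # None when cuDNN is unavailable
    report["cudnn"] = decode_cudnn_version(cudnn) if cudnn is not None else None
    if torch.cuda.is_available():
        # Only query NCCL when a GPU is actually visible.
        report["nccl"] = torch.cuda.nccl.version()
    return report
```

Calling `report_gpu_libraries()` on a GPU node returns a dictionary that can be compared by eye against the list above.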

Databricks Runtime 14.3 LTS ML includes XGBoost 1.7.6, which does not support GPU clusters with compute capability 5.2 or below.
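
The compute capability restriction can be checked before launching a GPU training job. The sketch below is illustrative only: the function names are hypothetical, the 5.2 threshold is taken from the note above, and the device query assumes PyTorch is available and uses `torch.cuda.get_device_capability`, which returns a `(major, minor)` tuple.

```python
from typing import Tuple

# Threshold from the note above: compute capability 5.2 and below is unsupported.
MAX_UNSUPPORTED_CAPABILITY: Tuple[int, int] = (5, 2)


def xgboost_gpu_supported(capability: Tuple[int, int]) -> bool:
    """Return True when a (major, minor) compute capability is above 5.2."""
    return tuple(capability) > MAX_UNSUPPORTED_CAPABILITY


def current_device_supported(device_index: int = 0) -> bool:
    """Check the local GPU via PyTorch; returns False when no GPU is visible."""
    import torch  # imported lazily so the pure check above works without torch

    if not torch.cuda.is_available():
        return False
    return xgboost_gpu_supported(torch.cuda.get_device_capability(device_index))
```

Tuples compare element by element, so `(5, 2)` is rejected while `(5, 3)` and `(6, 0)` pass.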

Libraries

The following sections list the libraries included in Databricks Runtime 14.3 LTS ML that differ from those included in Databricks Runtime 14.3 LTS.


Top-tier libraries

Databricks Runtime 14.3 LTS ML includes the following top-tier libraries.

Python libraries

Databricks Runtime 14.3 LTS ML uses virtualenv for Python package management and includes many popular ML packages.

In addition to the packages specified in the following sections, Databricks Runtime 14.3 LTS ML also includes the following packages:

  • hyperopt 0.2.7+db4
  • sparkdl 3.0.0_db1
  • automl 1.24.0

To reproduce the Databricks Runtime ML Python environment in a local Python virtual environment, download the requirements-14.3.txt file and run pip install -r requirements-14.3.txt. This command installs all of the open source libraries that Databricks Runtime ML uses, but it does not install libraries developed by Databricks, such as databricks-automl, databricks-feature-store, or the Databricks fork of hyperopt.
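
After installing from the requirements file, it can be useful to confirm that the local environment matches the pinned versions. The following is a minimal sketch using only the Python standard library. The helper names (`parse_pins`, `installed_version`, `find_mismatches`) are hypothetical, and the sketch assumes the file pins packages in the `name==version` form; other kinds of lines are skipped.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Extract `name==version` pins, skipping comments, blanks, and other lines."""
    pins: Dict[str, str] = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Drop environment markers such as `; python_version < "3.11"`.
        pins[name.strip().lower()] = version.split(";", 1)[0].strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements-14.3.txt").read()))` returns an empty dictionary when every pinned package is installed at the pinned version. Versions are compared as plain strings, so a local suffix such as `+cpu` must match exactly.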

Python libraries on CPU clusters

Library Version Library Version Library Version
absl-py 1.0.0 accelerate 0.25.0 aiohttp 3.9.1
aiosignal 1.3.1 anyio 3.5.0 appdirs 1.4.4
argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 astor 0.8.1
asttokens 2.0.5 astunparse 1.6.3 async-timeout 4.0.3
attrs 22.1.0 audioread 3.0.1 azure-core 1.29.1
azure-cosmos 4.3.1 azure-storage-blob 12.19.0 azure-storage-file-datalake 12.14.0
backcall 0.2.0 bcrypt 3.2.0 beautifulsoup4 4.11.1
black 22.6.0 bleach 4.1.0 blinker 1.4
blis 0.7.11 boto3 1.24.28 botocore 1.27.96
cachetools 5.3.2 catalogue 2.0.10 category-encoders 2.6.3
certifi 2022.12.7 cffi 1.15.1 chardet 4.0.0
charset-normalizer 2.0.4 click 8.0.4 cloudpathlib 0.16.0
cloudpickle 2.0.0 cmdstanpy 1.2.0 comm 0.1.2
confection 0.1.4 configparser 5.2.0 contourpy 1.0.5
cryptography 39.0.1 cycler 0.11.0 cymem 2.0.8
Cython 0.29.32 dacite 1.8.1 databricks-automl-runtime 0.2.20
databricks-cli 0.18.0 databricks-feature-engineering 0.2.0 databricks-sdk 0.1.6
dataclasses-json 0.6.3 datasets 2.15.0 dbl-tempo 0.1.26
dbus-python 1.2.18 debugpy 1.6.7 decorator 5.1.1
deepspeed 0.12.4 defusedxml 0.7.1 dill 0.3.6
diskcache 5.6.3 distlib 0.3.7 docstring-to-markdown 0.11
entrypoints 0.4 evaluate 0.4.1 executing 0.8.3
facets-overview 1.1.1 fastjsonschema 2.19.1 fasttext 0.9.2
filelock 3.9.0 Flask 2.2.5 flatbuffers 23.5.26
fonttools 4.25.0 frozenlist 1.4.1 fsspec 2023.6.0
future 0.18.3 gast 0.4.0 gitdb 4.0.11
GitPython 3.1.27 google-api-core 2.15.0 google-auth 2.21.0
google-auth-oauthlib 1.0.0 google-cloud-core 2.4.1 google-cloud-storage 2.11.0
google-crc32c 1.5.0 google-pasta 0.2.0 google-resumable-media 2.7.0
googleapis-common-protos 1.62.0 greenlet 2.0.1 grpcio 1.48.2
grpcio-status 1.48.1 gunicorn 20.1.0 gviz-api 1.10.0
h5py 3.7.0 hjson 3.1.0 holidays 0.38
horovod 0.28.1 htmlmin 0.1.12 httplib2 0.20.2
huggingface-hub 0.19.4 idna 3.4 ImageHash 4.3.1
imbalanced-learn 0.11.0 importlib-metadata 4.11.3 importlib-resources 6.1.1
ipykernel 6.25.0 ipython 8.14.0 ipython-genutils 0.2.0
ipywidgets 7.7.2 isodate 0.6.1 itsdangerous 2.0.1
jedi 0.18.1 jeepney 0.7.1 Jinja2 3.1.2
jmespath 0.10.0 joblib 1.2.0 joblibspark 0.5.1
jsonpatch 1.33 jsonpointer 2.4 jsonschema 4.17.3
jupyter-client 7.3.4 jupyter-server 1.23.4 jupyter_core 5.2.0
jupyterlab-pygments 0.1.2 jupyterlab-widgets 1.0.0 Keras 2.14.0
Keyring 23.5.0 kiwisolver 1.4.4 langchain 0.0.348
langchain-core 0.0.13 langcodes 3.3.0 langsmith 0.0.79
launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6
lazy_loader 0.3 libclang 15.0.6.1 librosa 0.10.1
lightgbm 4.1.0 llvmlite 0.39.1 lxml 4.9.1
Mako 1.2.0 Markdown 3.4.1 MarkupSafe 2.1.1
marshmallow 3.20.2 matplotlib 3.7.0 matplotlib-inline 0.1.6
mccabe 0.7.0 mistune 0.8.4 ml-dtypes 0.2.0
mlflow-skinny 2.9.2 more-itertools 8.10.0 mpmath 1.2.1
msgpack 1.0.7 multidict 6.0.4 multimethod 1.10
multiprocess 0.70.14 murmurhash 1.0.10 mypy-extensions 0.4.3
nbclassic 0.5.2 nbclient 0.5.13 nbconvert 6.5.4
nbformat 5.7.0 nest-asyncio 1.5.6 networkx 2.8.4
ninja 1.11.1.1 nltk 3.7 nodeenv 1.8.0
Notebook 6.5.2 notebook_shim 0.2.2 numba 0.56.4
numpy 1.23.5 oauthlib 3.2.0 openai 0.28.1
opt-einsum 3.3.0 packaging 23.2 pandas 1.5.3
pandocfilters 1.5.0 paramiko 2.9.2 parso 0.8.3
pathspec 0.10.3 patsy 0.5.3 petastorm 0.12.1
pexpect 4.8.0 phik 0.12.4 pickleshare 0.7.5
Pillow 9.4.0 pip 22.3.1 platformdirs 2.5.2
plotly 5.9.0 pluggy 1.0.0 pmdarima 2.0.4
pooch 1.4.0 preshed 3.0.9 prometheus-client 0.14.1
prompt-toolkit 3.0.36 prophet 1.1.5 protobuf 4.24.0
psutil 5.9.0 psycopg2 2.9.3 ptyprocess 0.7.0
pure-eval 0.2.2 py-cpuinfo 9.0.0 pyarrow 8.0.0
pyarrow-hotfix 0.5 pyasn1 0.4.8 pyasn1-modules 0.2.8
pybind11 2.11.1 pycparser 2.21 pydantic 1.10.6
pyflakes 3.1.0 Pygments 2.11.2 PyGObject 3.42.1
PyJWT 2.3.0 PyNaCl 1.5.0 pynvml 11.5.0
pyodbc 4.0.32 pyparsing 3.0.9 pyright 1.1.294
pyrsistent 0.18.0 pytesseract 0.3.10 python-dateutil 2.8.2
python-editor 1.0.4 python-lsp-jsonrpc 1.1.1 python-lsp-server 1.8.0
pytoolconfig 1.2.5 pytz 2022.7 PyWavelets 1.4.1
PyYAML 6.0 pyzmq 23.2.0 regex 2022.7.9
requests 2.28.1 requests-oauthlib 1.3.1 responses 0.18.0
rope 1.7.0 rsa 4.9 s3transfer 0.6.2
safetensors 0.4.1 scikit-learn 1.1.1 scipy 1.10.0
seaborn 0.12.2 SecretStorage 3.3.1 Send2Trash 1.8.0
sentence-transformers 2.2.2 sentencepiece 0.1.99 setuptools 65.6.3
shap 0.44.0 simplejson 3.17.6 six 1.16.0
slicer 0.0.7 smart-open 5.2.1 smmap 5.0.0
sniffio 1.2.0 soundfile 0.12.1 soupsieve 2.3.2.post1
soxr 0.3.7 spacy 3.7.2 spacy-legacy 3.0.12
spacy-loggers 1.0.5 spark-tensorflow-distributor 1.0.0 SQLAlchemy 1.4.39
sqlparse 0.4.2 srsly 2.4.8 ssh-import-id 5.11
stack-data 0.2.0 stanio 0.3.0 statsmodels 0.13.5
sympy 1.11.1 tabulate 0.8.10 tangled-up-in-unicode 0.2.0
tenacity 8.1.0 tensorboard 2.14.1 tensorboard-data-server 0.7.2
tensorboard-plugin-profile 2.14.0 tensorflow-cpu 2.14.1 tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.35.0 termcolor 2.4.0 terminado 0.17.1
thinc 8.2.2 threadpoolctl 2.2.0 tiktoken 0.5.2
tinycss2 1.2.1 tokenize-rt 4.2.1 tokenizers 0.15.0
tomli 2.0.1 torch 2.0.1+cpu torchvision 0.15.2+cpu
tornado 6.1 tqdm 4.64.1 traitlets 5.7.1
transformers 4.36.1 typeguard 2.13.3 typer 0.9.0
typing-inspect 0.9.0 typing_extensions 4.4.0 ujson 5.4.0
unattended-upgrades 0.1 urllib3 1.26.14 virtualenv 20.16.7
visions 0.7.5 wadllib 1.3.6 wasabi 1.1.2
wcwidth 0.2.5 weasel 0.3.4 webencodings 0.5.1
websocket-client 0.58.0 Werkzeug 2.2.2 whatthepatch 1.0.2
wheel 0.38.4 widgetsnbextension 3.6.1 wordcloud 1.9.3
wrapt 1.14.1 xgboost 1.7.6 xxhash 3.4.1
yapf 0.33.0 yarl 1.9.4 ydata-profiling 4.2.0
zipp 3.11.0

Python libraries on GPU clusters

Library Version Library Version Library Version
absl-py 1.0.0 accelerate 0.25.0 aiohttp 3.9.1
aiosignal 1.3.1 anyio 3.5.0 appdirs 1.4.4
argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 astor 0.8.1
asttokens 2.0.5 astunparse 1.6.3 async-timeout 4.0.3
attrs 22.1.0 audioread 3.0.1 azure-core 1.29.1
azure-cosmos 4.3.1 azure-storage-blob 12.19.0 azure-storage-file-datalake 12.14.0
backcall 0.2.0 bcrypt 3.2.0 beautifulsoup4 4.11.1
black 22.6.0 bleach 4.1.0 blinker 1.4
blis 0.7.11 boto3 1.24.28 botocore 1.27.96
cachetools 5.3.2 catalogue 2.0.10 category-encoders 2.6.3
certifi 2022.12.7 cffi 1.15.1 chardet 4.0.0
charset-normalizer 2.0.4 click 8.0.4 cloudpathlib 0.16.0
cloudpickle 2.0.0 cmake 3.28.1 cmdstanpy 1.2.0
comm 0.1.2 confection 0.1.4 configparser 5.2.0
contourpy 1.0.5 cryptography 39.0.1 cycler 0.11.0
cymem 2.0.8 Cython 0.29.32 dacite 1.8.1
databricks-automl-runtime 0.2.20 databricks-cli 0.18.0 databricks-feature-engineering 0.2.0
databricks-sdk 0.1.6 dataclasses-json 0.6.3 datasets 2.15.0
dbl-tempo 0.1.26 dbus-python 1.2.18 debugpy 1.6.7
decorator 5.1.1 deepspeed 0.12.4 defusedxml 0.7.1
dill 0.3.6 diskcache 5.6.3 distlib 0.3.7
docstring-to-markdown 0.11 einops 0.7.0 entrypoints 0.4
evaluate 0.4.1 executing 0.8.3 facets-overview 1.1.1
fastjsonschema 2.19.1 fasttext 0.9.2 filelock 3.9.0
flash-attn 2.3.6 Flask 2.2.5 flatbuffers 23.5.26
fonttools 4.25.0 frozenlist 1.4.1 fsspec 2023.6.0
future 0.18.3 gast 0.4.0 gitdb 4.0.11
GitPython 3.1.27 google-api-core 2.15.0 google-auth 2.21.0
google-auth-oauthlib 1.0.0 google-cloud-core 2.4.1 google-cloud-storage 2.11.0
google-crc32c 1.5.0 google-pasta 0.2.0 google-resumable-media 2.7.0
googleapis-common-protos 1.62.0 greenlet 2.0.1 grpcio 1.48.2
grpcio-status 1.48.1 gunicorn 20.1.0 gviz-api 1.10.0
h5py 3.7.0 hjson 3.1.0 holidays 0.38
horovod 0.28.1 htmlmin 0.1.12 httplib2 0.20.2
huggingface-hub 0.19.4 idna 3.4 ImageHash 4.3.1
imbalanced-learn 0.11.0 importlib-metadata 4.11.3 importlib-resources 6.1.1
ipykernel 6.25.0 ipython 8.14.0 ipython-genutils 0.2.0
ipywidgets 7.7.2 isodate 0.6.1 itsdangerous 2.0.1
jedi 0.18.1 jeepney 0.7.1 Jinja2 3.1.2
jmespath 0.10.0 joblib 1.2.0 joblibspark 0.5.1
jsonpatch 1.33 jsonpointer 2.4 jsonschema 4.17.3
jupyter-client 7.3.4 jupyter-server 1.23.4 jupyter_core 5.2.0
jupyterlab-pygments 0.1.2 jupyterlab-widgets 1.0.0 Keras 2.14.0
Keyring 23.5.0 kiwisolver 1.4.4 langchain 0.0.348
langchain-core 0.0.13 langcodes 3.3.0 langsmith 0.0.79
launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6
lazy_loader 0.3 libclang 15.0.6.1 librosa 0.10.1
lightgbm 4.1.0 lit 17.0.6 llvmlite 0.39.1
lxml 4.9.1 Mako 1.2.0 Markdown 3.4.1
MarkupSafe 2.1.1 marshmallow 3.20.2 matplotlib 3.7.0
matplotlib-inline 0.1.6 mccabe 0.7.0 mistune 0.8.4
ml-dtypes 0.2.0 mlflow-skinny 2.9.2 more-itertools 8.10.0
mpmath 1.2.1 msgpack 1.0.7 multidict 6.0.4
multimethod 1.10 multiprocess 0.70.14 murmurhash 1.0.10
mypy-extensions 0.4.3 nbclassic 0.5.2 nbclient 0.5.13
nbconvert 6.5.4 nbformat 5.7.0 nest-asyncio 1.5.6
networkx 2.8.4 ninja 1.11.1.1 nltk 3.7
nodeenv 1.8.0 Notebook 6.5.2 notebook_shim 0.2.2
numba 0.56.4 numpy 1.23.5 oauthlib 3.2.0
openai 0.28.1 opt-einsum 3.3.0 packaging 23.2
pandas 1.5.3 pandocfilters 1.5.0 paramiko 2.9.2
parso 0.8.3 pathspec 0.10.3 patsy 0.5.3
petastorm 0.12.1 pexpect 4.8.0 phik 0.12.4
pickleshare 0.7.5 Pillow 9.4.0 pip 22.3.1
platformdirs 2.5.2 plotly 5.9.0 pluggy 1.0.0
pmdarima 2.0.4 pooch 1.4.0 preshed 3.0.9
prompt-toolkit 3.0.36 prophet 1.1.5 protobuf 4.24.0
psutil 5.9.0 psycopg2 2.9.3 ptyprocess 0.7.0
pure-eval 0.2.2 py-cpuinfo 9.0.0 pyarrow 8.0.0
pyarrow-hotfix 0.5 pyasn1 0.4.8 pyasn1-modules 0.2.8
pybind11 2.11.1 pycparser 2.21 pydantic 1.10.6
pyflakes 3.1.0 Pygments 2.11.2 PyGObject 3.42.1
PyJWT 2.3.0 PyNaCl 1.5.0 pynvml 11.5.0
pyodbc 4.0.32 pyparsing 3.0.9 pyright 1.1.294
pyrsistent 0.18.0 pytesseract 0.3.10 python-dateutil 2.8.2
python-editor 1.0.4 python-lsp-jsonrpc 1.1.1 python-lsp-server 1.8.0
pytoolconfig 1.2.5 pytz 2022.7 PyWavelets 1.4.1
PyYAML 6.0 pyzmq 23.2.0 regex 2022.7.9
requests 2.28.1 requests-oauthlib 1.3.1 responses 0.18.0
rope 1.7.0 rsa 4.9 s3transfer 0.6.2
safetensors 0.4.1 scikit-learn 1.1.1 scipy 1.10.0
seaborn 0.12.2 SecretStorage 3.3.1 Send2Trash 1.8.0
sentence-transformers 2.2.2 sentencepiece 0.1.99 setuptools 65.6.3
shap 0.44.0 simplejson 3.17.6 six 1.16.0
slicer 0.0.7 smart-open 5.2.1 smmap 5.0.0
sniffio 1.2.0 soundfile 0.12.1 soupsieve 2.3.2.post1
soxr 0.3.7 spacy 3.7.2 spacy-legacy 3.0.12
spacy-loggers 1.0.5 spark-tensorflow-distributor 1.0.0 SQLAlchemy 1.4.39
sqlparse 0.4.2 srsly 2.4.8 ssh-import-id 5.11
stack-data 0.2.0 stanio 0.3.0 statsmodels 0.13.5
sympy 1.11.1 tabulate 0.8.10 tangled-up-in-unicode 0.2.0
tenacity 8.1.0 tensorboard 2.14.1 tensorboard-data-server 0.7.2
tensorboard-plugin-profile 2.14.0 tensorflow 2.14.1 tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.35.0 termcolor 2.4.0 terminado 0.17.1
thinc 8.2.2 threadpoolctl 2.2.0 tiktoken 0.5.2
tinycss2 1.2.1 tokenize-rt 4.2.1 tokenizers 0.15.0
tomli 2.0.1 torch 2.0.1+cu118 torchvision 0.15.2+cu118
tornado 6.1 tqdm 4.64.1 traitlets 5.7.1
transformers 4.36.1 triton 2.0.0 typeguard 2.13.3
typer 0.9.0 typing-inspect 0.9.0 typing_extensions 4.4.0
ujson 5.4.0 unattended-upgrades 0.1 urllib3 1.26.14
virtualenv 20.16.7 visions 0.7.5 wadllib 1.3.6
wasabi 1.1.2 wcwidth 0.2.5 weasel 0.3.4
webencodings 0.5.1 websocket-client 0.58.0 Werkzeug 2.2.2
whatthepatch 1.0.2 wheel 0.38.4 widgetsnbextension 3.6.1
wordcloud 1.9.3 wrapt 1.14.1 xgboost 1.7.6
xxhash 3.4.1 yapf 0.33.0 yarl 1.9.4
ydata-profiling 4.2.0 zipp 3.11.0

R libraries

The R libraries are identical to the R libraries in Databricks Runtime 14.3 LTS.

Java and Scala libraries (Scala 2.12 cluster)

In addition to the Java and Scala libraries in Databricks Runtime 14.3 LTS, Databricks Runtime 14.3 LTS ML contains the following JARs:

CPU clusters

Group ID Artifact ID Version
com.typesafe.akka akka-actor_2.12 2.5.23
ml.dmlc xgboost4j-spark_2.12 1.7.3
ml.dmlc xgboost4j_2.12 1.7.3
org.graphframes graphframes_2.12 0.8.2-db2-spark3.4
org.mlflow mlflow-client 2.9.2
org.scala-lang.modules scala-java8-compat_2.12 0.8.0
org.tensorflow spark-tensorflow-connector_2.12 1.15.0

GPU clusters

Group ID Artifact ID Version
com.typesafe.akka akka-actor_2.12 2.5.23
ml.dmlc xgboost4j-gpu_2.12 1.7.3
ml.dmlc xgboost4j-spark-gpu_2.12 1.7.3
org.graphframes graphframes_2.12 0.8.2-db2-spark3.4
org.mlflow mlflow-client 2.9.2
org.scala-lang.modules scala-java8-compat_2.12 0.8.0
org.tensorflow spark-tensorflow-connector_2.12 1.15.0