클러스터 CLI(레거시)

아티클
03/01/2024

Important

이 설명서는 사용 중지되었으며 업데이트되지 않을 수 있습니다.

이 정보는 레거시 Databricks CLI 버전 0.18 이하에 적용됩니다. Databricks는 최신 Databricks CLI 버전 0.205 이상을 대신 사용하는 것이 좋습니다. Databricks CLI란?을 참조하세요. Databricks CLI 버전을 찾으려면 다음을 실행 databricks -v합니다.

Databricks CLI 버전 0.18 이하에서 Databricks CLI 버전 0.205 이상으로 마이그레이션하려면 Databricks CLI 마이그레이션을 참조하세요.

Databricks 클러스터 CLI 하위 명령을 databricks clusters에 추가하여 실행합니다. 이러한 하위 명령은 클러스터 API를 호출합니다.

databricks clusters -h

Usage: databricks clusters [OPTIONS] COMMAND [ARGS]...

  Utility to interact with Databricks clusters.

Options:
  -v, --version  [VERSION]
  -h, --help     Show this message and exit.

Commands:
  create           Creates a Databricks cluster.
    Options:
      --json-file PATH  File containing JSON request to POST to /api/2.0/clusters/create.
      --json JSON       JSON string to POST to /api/2.0/clusters/create.
  delete           Removes a Databricks cluster.
    Options:
      --cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
  edit             Edits a Databricks cluster.
    Options:
      --json-file PATH  File containing JSON request to POST to /api/2.0/clusters/edit.
      --json JSON       JSON string to POST to /api/2.0/clusters/edit.
  events Gets events for a Spark cluster.
    Options:
      --cluster-id CLUSTER_ID  Can be found in the URL at https://<databricks-instance>/#/setting/clusters/$CLUSTER_ID/configuration.  [required]
      --start-time TEXT        The start time in epoch milliseconds. If
                               unprovided, returns events starting from the
                               beginning of time.
      --end-time TEXT          The end time in epoch milliseconds. If unprovided,
                               returns events up to the current time
      --order TEXT             The order to list events in; either ASC or DESC.
                               Defaults to DESC (most recent first).
      --event-type TEXT        An event types to filter on (specify multiple event
                               types by passing the --event-type option multiple
                               times). If empty, all event types are returned.
      --offset TEXT            The offset in the result set. Defaults to 0 (no
                               offset). When an offset is specified and the
                               results are requested in descending order, the
                               end_time field is required.
      --limit TEXT             The maximum number of events to include in a page
                               of events. Defaults to 50, and maximum allowed
                               value is 500.
      --output FORMAT          can be "JSON" or "TABLE". Set to TABLE by default.
  get              Retrieves metadata about a cluster.
    Options:
      --cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
  list             Lists active and recently terminated clusters.
    Options:
      --output FORMAT          JSON or TABLE. Set to TABLE by default.
  list-node-types  Lists node types for a cluster.
  list-zones       Lists zones where clusters can be created.
  permanent-delete Permanently deletes a cluster.
    Options:
      --cluster-id CLUSTER_ID  Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
  resize           Resizes a Databricks cluster given its ID.
    Options:
      --cluster-id CLUSTER_ID  Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
      --num-workers INTEGER    Number of workers. [required]
  restart          Restarts a Databricks cluster.
    Options:
      --cluster-id CLUSTER_ID  Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
  spark-versions   Lists possible Databricks Runtime versions.
  start            Starts a terminated Databricks cluster.
    Options:
      --cluster-id CLUSTER_ID  Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.

클러스터 생성

사용 설명서를 표시하려면 databricks clusters create --help를 실행합니다.

databricks clusters create --json-file create-cluster.json

create-cluster.json:

{
  "cluster_name": "my-cluster",
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "Standard_D3_v2",
  "spark_conf": {
    "spark.speculation": true
  },
  "num_workers": 25
}

{
  "cluster_id": "1234-567890-batch123"
}

클러스터 삭제

사용 설명서를 표시하려면 databricks clusters delete --help를 실행합니다.

databricks clusters delete --cluster-id 1234-567890-batch123

성공하면 출력이 표시되지 않습니다.

클러스터의 구성 변경

사용 설명서를 표시하려면 databricks clusters edit --help를 실행합니다.

databricks clusters edit --json-file edit-cluster.json

edit-cluster.json:

{
  "cluster_id": "1234-567890-batch123",
  "num_workers": 10,
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "Standard_D3_v2"
}

성공하면 출력이 표시되지 않습니다.

클러스터에 대한 이벤트 나열

사용 설명서를 표시하려면 databricks clusters events --help를 실행합니다.

databricks clusters events \
--cluster-id 1234-567890-batch123 \
--start-time 1617238800000 \
--end-time 1619485200000 \
--order DESC \
--limit 5 \
--event-type RUNNING \
--output JSON \
| jq .

{
  "events": [
    {
      "cluster_id": "1234-567890-batch123",
      "timestamp": 1619214150232,
      "type": "RUNNING",
      "details": {
        "current_num_workers": 2,
        "target_num_workers": 2
      }
    },
    ...
    {
      "cluster_id": "1234-567890-batch123",
      "timestamp": 1617895221986,
      "type": "RUNNING",
      "details": {
        "current_num_workers": 2,
        "target_num_workers": 2
      }
    }
  ],
  "next_page": {
    "cluster_id": "1234-567890-batch123",
    "start_time": 1617238800000,
    "end_time": 1619485200000,
    "order": "DESC",
    "event_types": [
      "RUNNING"
    ],
    "offset": 5,
    "limit": 5
  },
  "total_count": 11
}

클러스터에 대한 정보 가져오기

사용 설명서를 표시하려면 databricks clusters get --help를 실행합니다.

databricks clusters get --cluster-id 1234-567890-batch123

또는

databricks clusters get --cluster-name my-cluster

{
  "cluster_id": "1234-567890-batch123",
  "spark_context_id": 3124308392469747564,
  "cluster_name": "my-cluster",
  "spark_version": "7.5.x-scala2.12",
  "spark_conf": {
    "spark.databricks.delta.preview.enabled": "true"
  },
  "node_type_id": "Standard_DS3_v2",
  "driver_node_type_id": "Standard_DS3_v2",
  "spark_env_vars": {
    "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
  },
  "autotermination_minutes": 0,
  "enable_elastic_disk": true,
  "disk_spec": {},
  "cluster_source": "JOB",
  "enable_local_disk_encryption": false,
  "azure_attributes": {
    "first_on_demand": 1,
    "availability": "ON_DEMAND_AZURE",
    "spot_bid_max_price": -1.0
  },
  "instance_source": {
    "node_type_id": "Standard_DS3_v2"
  },
  "driver_instance_source": {
    "node_type_id": "Standard_DS3_v2"
  },
  "state": "TERMINATED",
  "state_message": "",
  "start_time": 1619563745373,
  "terminated_time": 1619563822867,
  "last_state_loss_time": 0,
  "num_workers": 8,
  "default_tags": {
    "Vendor": "Databricks",
    "Creator": "someone@example.com",
    "ClusterName": "my-cluster",
    "ClusterId": "1234-567890-batch123",
    "JobId": "1268284",
    "RunName": "Normal job"
  },
  "creator_user_name": "someone@example.com",
  "termination_reason": {
    "code": "JOB_FINISHED",
    "type": "SUCCESS"
  },
  "init_scripts_safe_mode": false
}

사용 가능한 모든 클러스터에 대한 정보 나열

사용 설명서를 표시하려면 databricks clusters list --help를 실행합니다.

databricks clusters list --output JSON | jq .

{
  "clusters": [
    {
      "cluster_id": "1234-567890-batch123",
      "spark_context_id": 3124308392469747564,
      "cluster_name": "my-cluster",
      "spark_version": "7.5.x-scala2.12",
      "spark_conf": {
        "spark.databricks.delta.preview.enabled": "true"
      },
      "node_type_id": "Standard_DS3_v2",
      "driver_node_type_id": "Standard_DS3_v2",
      "spark_env_vars": {
        "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
      },
      "autotermination_minutes": 0,
      "enable_elastic_disk": true,
      "disk_spec": {},
      "cluster_source": "JOB",
      "enable_local_disk_encryption": false,
      "azure_attributes": {
        "first_on_demand": 1,
        "availability": "ON_DEMAND_AZURE",
        "spot_bid_max_price": -1.0
      },
      "instance_source": {
        "node_type_id": "Standard_DS3_v2"
      },
      "driver_instance_source": {
        "node_type_id": "Standard_DS3_v2"
      },
      "state": "TERMINATED",
      "state_message": "",
      "start_time": 1619563745373,
      "terminated_time": 1619563822867,
      "last_state_loss_time": 0,
      "num_workers": 8,
      "default_tags": {
        "Vendor": "Databricks",
        "Creator": "someone@example.com",
        "ClusterName": "my-cluster",
        "ClusterId": "1234-567890-batch123",
        "JobId": "1268284",
        "RunName": "Normal job"
      },
      "creator_user_name": "someone@example.com",
      "termination_reason": {
        "code": "JOB_FINISHED",
        "type": "SUCCESS"
      },
      "init_scripts_safe_mode": false
    },
    ...
  ]
}

사용 가능한 클러스터 노드 유형 나열

사용 설명서를 표시하려면 databricks clusters list-node-types --help를 실행합니다.

databricks clusters list-node-types

{
  "node_types": [
    {
      "node_type_id": "Standard_L80s_v2",
      "memory_mb": 655360,
      "num_cores": 80.0,
      "description": "Standard_L80s_v2",
      "instance_type_id": "Standard_L80s_v2",
      "is_deprecated": false,
      "category": "Storage Optimized",
      "support_ebs_volumes": true,
      "support_cluster_tags": true,
      "num_gpus": 0,
      "node_instance_type": {
        "instance_type_id": "Standard_L80s_v2",
        "local_disks": 1,
        "local_disk_size_gb": 800,
        "instance_family": "Standard LSv2 Family vCPUs",
        "local_nvme_disk_size_gb": 1788,
        "local_nvme_disks": 10,
        "swap_size": "10g"
      },
      "is_hidden": false,
      "support_port_forwarding": true,
      "display_order": 0,
      "is_io_cache_enabled": true,
      "node_info": {
        "available_core_quota": 350.0,
        "total_core_quota": 350.0
      }
    },
    ...
  ]
}

클러스터를 만들기 위한 사용 가능한 영역 나열

참고 항목

이 명령은 Azure Databricks에서 작동하지 않습니다.

사용 설명서를 표시하려면 databricks clusters list-zones --help를 실행합니다.

databricks clusters list-zones

클러스터 영구 삭제

사용 설명서를 표시하려면 databricks clusters permanent-delete --help를 실행합니다.

databricks clusters permanent-delete --cluster-id 1234-567890-batch123

성공하면 출력이 표시되지 않습니다.

클러스터 크기 조정

사용 설명서를 표시하려면 databricks clusters resize --help를 실행합니다.

databricks clusters resize --cluster-id 1234-567890-batch123 --num-workers 10

성공하면 출력이 표시되지 않습니다.

클러스터 다시 시작

사용 설명서를 표시하려면 databricks clusters restart --help를 실행합니다.

databricks clusters restart --cluster-id 1234-567890-batch123

성공하면 출력이 표시되지 않습니다.

사용 가능한 Spark 런타임 버전 나열

사용 설명서를 표시하려면 databricks clusters spark-versions --help를 실행합니다.

databricks clusters spark-versions

{
  "versions": [
    {
      "key": "8.2.x-scala2.12",
      "name": "8.2 (includes Apache Spark 3.1.1, Scala 2.12)"
    },
    ...
  ]
}

클러스터 시작

사용 설명서를 표시하려면 databricks clusters start --help를 실행합니다.

databricks clusters start --cluster-id 1234-567890-batch123

성공하면 출력이 표시되지 않습니다.

클러스터 CLI(레거시)

클러스터 생성

클러스터 삭제

클러스터의 구성 변경

클러스터에 대한 이벤트 나열

클러스터에 대한 정보 가져오기

사용 가능한 모든 클러스터에 대한 정보 나열

사용 가능한 클러스터 노드 유형 나열

클러스터를 만들기 위한 사용 가능한 영역 나열

클러스터 영구 삭제

클러스터 크기 조정

클러스터 다시 시작

사용 가능한 Spark 런타임 버전 나열

클러스터 시작

추가 리소스