How to import a custom CA certificate

When working with Python, you may need to import a custom CA certificate to avoid errors when connecting to an endpoint:

ConnectionError: HTTPSConnectionPool(host='my_server_endpoint', port=443): Max retries exceeded with url: /endpoint (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fb73dc3b3d0>: Failed to establish a new connection: [Errno 110] Connection timed out',))
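Setting `REQUESTS_CA_BUNDLE` works because the `requests` library, when `verify` is left at its default, falls back to that environment variable (or `CURL_CA_BUNDLE`) to locate the CA bundle instead of its built-in one. A simplified sketch of that resolution logic (the bundle path shown is the system bundle that the init script below updates):

```python
import os

def resolve_verify(verify=True):
    # Simplified sketch of how requests picks a CA bundle: when verify
    # is left at its default (True or None), the REQUESTS_CA_BUNDLE or
    # CURL_CA_BUNDLE environment variable overrides the built-in bundle.
    if verify is True or verify is None:
        return (os.environ.get("REQUESTS_CA_BUNDLE")
                or os.environ.get("CURL_CA_BUNDLE")
                or True)
    return verify

# The init script exports this variable for every Spark process.
os.environ["REQUESTS_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"
print(resolve_verify())  # -> /etc/ssl/certs/ca-certificates.crt
```

An explicit `verify="/path/to/ca.pem"` argument still takes precedence over the environment variable.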

To import one or more custom CA certificates to your Azure Databricks cluster:

  1. Create an init script that adds the entire CA chain and sets the REQUESTS_CA_BUNDLE environment variable.

    In this example, the PEM-format CA certificates are added to the file myca.crt at /usr/local/share/ca-certificates/. This file is referenced in the custom-cert.sh init script.

    dbutils.fs.put("/databricks/init-scripts/custom-cert.sh", """#!/bin/bash
    
    cat << 'EOF' > /usr/local/share/ca-certificates/myca.crt
    -----BEGIN CERTIFICATE-----
    <CA CHAIN 1 CERTIFICATE CONTENT>
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    <CA CHAIN 2 CERTIFICATE CONTENT>
    -----END CERTIFICATE-----
    EOF
    
    update-ca-certificates
    
    PEM_FILE="/etc/ssl/certs/myca.pem"
    PASSWORD="<password>"
    JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
    KEYSTORE="$JAVA_HOME/lib/security/cacerts"
    
    CERTS=$(grep 'END CERTIFICATE' $PEM_FILE | wc -l)
    
    # To process multiple certs with keytool, you need to extract
    # each one from the PEM file and import it into the Java KeyStore.
    
    for N in $(seq 0 $(($CERTS - 1))); do
      ALIAS="$(basename $PEM_FILE)-$N"
      echo "Adding to keystore with alias:$ALIAS"
      cat $PEM_FILE |
        awk "n==$N { print }; /END CERTIFICATE/ { n++ }" |
        keytool -noprompt -import -trustcacerts \
                -alias $ALIAS -keystore $KEYSTORE -storepass $PASSWORD
    done
    
    echo "export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt" >> /databricks/spark/conf/spark-env.sh
    """)
    

    To use the custom CA certificates with DBFS FUSE, add this line to the bottom of the init script:

    /databricks/spark/scripts/restart_dbfs_fuse_daemon.sh
    
  2. Attach the init script to the cluster as a cluster-scoped init script.

  3. Restart the cluster.
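The awk pipeline in the init script splits a multi-certificate PEM bundle so that keytool can import each certificate under its own alias. The same splitting logic, sketched in Python (the two-certificate bundle here is placeholder sample data, not a real certificate):

```python
def split_pem(pem_text):
    # Mirror the init script's awk logic: group lines into certificate
    # blocks, each terminated by its END CERTIFICATE line.
    certs, current = [], []
    for line in pem_text.splitlines():
        current.append(line)
        if "END CERTIFICATE" in line:
            certs.append("\n".join(current))
            current = []
    return certs

bundle = """-----BEGIN CERTIFICATE-----
<CA CHAIN 1 CERTIFICATE CONTENT>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<CA CHAIN 2 CERTIFICATE CONTENT>
-----END CERTIFICATE-----"""

# The init script runs one keytool import per entry, with aliases
# myca.pem-0, myca.pem-1, and so on.
for n, cert in enumerate(split_pem(bundle)):
    print(f"alias myca.pem-{n}: {len(cert.splitlines())} lines")
```

This is why the script counts `END CERTIFICATE` lines first: keytool imports only the first certificate it finds in its input, so each block must be fed to it separately.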