Azure Cognitive Services Computer Vision SDK for Python

The Computer Vision service provides developers with access to advanced algorithms for processing images and returning information. Computer Vision algorithms analyze the content of an image in different ways, depending on the visual features you're interested in.

For more information about this service, see "What is Computer Vision?".

Looking for more documentation?

Prerequisites

If you don't have an Azure Subscription

Create a free key, valid for 7 days, with the Try It experience for the Computer Vision service. When the key is created, copy the key and the endpoint name. You need both to create the client.

Keep the following after the key is created:

  • Key value: a 32-character string with the format xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  • Key endpoint: the base endpoint URL, https://westcentralus.api.cognitive.microsoft.com
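
Before creating a client, it can help to sanity-check the two values you copied. The following is a minimal sketch, not part of the SDK: looks_like_key and looks_like_endpoint are hypothetical helper names, and the checks assume only what is stated above (a 32-character key and an https base URL).

```python
from urllib.parse import urlparse


def looks_like_key(value: str) -> bool:
    """Return True if value is a 32-character string with no whitespace."""
    return len(value) == 32 and not any(ch.isspace() for ch in value)


def looks_like_endpoint(value: str) -> bool:
    """Return True if value is an https URL that includes a host name."""
    parsed = urlparse(value)
    return parsed.scheme == "https" and bool(parsed.netloc)


if __name__ == "__main__":
    # Placeholder key in the documented format, not a real credential.
    print(looks_like_key("x" * 32))
    print(looks_like_endpoint("https://westcentralus.api.cognitive.microsoft.com"))
```

A check like this only catches copy-and-paste mistakes such as a truncated key or a missing scheme. It cannot tell whether the service will accept the key.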

If you have an Azure Subscription

The easiest way to create a resource in your subscription is to use the following Azure CLI command. This creates a Cognitive Services key that can be used across many cognitive services. You need to choose an existing resource group name, for example "my-cogserv-group", and a new Computer Vision resource name, such as "my-computer-vision-resource".

RES_REGION=westeurope
RES_GROUP=<resourcegroup-name>
ACCT_NAME=<computervision-account-name>

az cognitiveservices account create \
    --resource-group $RES_GROUP \
    --name $ACCT_NAME \
    --location $RES_REGION \
    --kind CognitiveServices \
    --sku S0 \
    --yes

Install the SDK

Install the Azure Cognitive Services Computer Vision SDK for Python package with pip:

pip install azure-cognitiveservices-vision-computervision

Authentication

Once you create your Computer Vision resource, you need its endpoint and one of its account keys to instantiate the client object.

Use these values when you create the instance of the ComputerVisionClient client object.

For example, use the Bash terminal to set the environment variables:

export ACCOUNT_ENDPOINT=<your-endpoint-url>
export ACCOUNT_KEY=<your-account-key>

For Azure subscription users, get credentials for key and endpoint

If you do not remember your endpoint and key, you can use the following method to find them. If you need to create a key and endpoint, you can use the method for Azure subscription holders or the method for users without an Azure subscription.

Use the Azure CLI snippet below to populate two environment variables with the Computer Vision account endpoint and one of its keys (you can also find these values in the Azure portal). The snippet is formatted for the Bash shell.

RES_GROUP=<resourcegroup-name>
ACCT_NAME=<computervision-account-name>

export ACCOUNT_ENDPOINT=$(az cognitiveservices account show \
    --resource-group $RES_GROUP \
    --name $ACCT_NAME \
    --query endpoint \
    --output tsv)

export ACCOUNT_KEY=$(az cognitiveservices account keys list \
    --resource-group $RES_GROUP \
    --name $ACCT_NAME \
    --query key1 \
    --output tsv)

Create client

Get the endpoint and key from the environment variables, then create the ComputerVisionClient client object.

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

# Get endpoint and key from environment variables
import os
endpoint = os.environ['ACCOUNT_ENDPOINT']
key = os.environ['ACCOUNT_KEY']

# Set credentials
credentials = CognitiveServicesCredentials(key)

# Create client
client = ComputerVisionClient(endpoint, credentials)
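
If either environment variable is unset, the lookup above fails with a bare KeyError. As a minimal sketch, a small helper can report which variable is missing before the client is created. read_settings is a hypothetical name, not an SDK function, and it assumes the variable names ACCOUNT_ENDPOINT and ACCOUNT_KEY used in this article.

```python
import os


def read_settings(environ=os.environ):
    """Return (endpoint, key), or raise RuntimeError naming what is missing."""
    names = ("ACCOUNT_ENDPOINT", "ACCOUNT_KEY")
    # Treat unset and empty values the same way.
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["ACCOUNT_ENDPOINT"], environ["ACCOUNT_KEY"]
```

The two returned values can then be passed to CognitiveServicesCredentials and ComputerVisionClient in the same way as the example above.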

Examples

You need a ComputerVisionClient client object before using any of the following tasks.

Analyze an image

You can analyze an image for certain features with analyze_image. Use the visual_features parameter to set the types of analysis to perform on the image. Common values are VisualFeatureTypes.tags and VisualFeatureTypes.description.

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/1/12/Broadway_and_Times_Square_by_night.jpg/450px-Broadway_and_Times_Square_by_night.jpg"

image_analysis = client.analyze_image(url,visual_features=[VisualFeatureTypes.tags])

for tag in image_analysis.tags:
    print(tag)
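
Printing each tag shows the whole object. As a minimal sketch, the tags can instead be filtered by confidence and listed by name. This assumes each tag exposes name and confidence attributes; confident_tag_names is a hypothetical helper, and the sample objects are stand-ins so the code runs without calling the service.

```python
from types import SimpleNamespace


def confident_tag_names(tags, threshold=0.8):
    """Return names of tags with confidence >= threshold, highest first."""
    kept = [t for t in tags if t.confidence >= threshold]
    kept.sort(key=lambda t: t.confidence, reverse=True)
    return [t.name for t in kept]


# Stand-in objects with made-up values, used only to exercise the helper.
sample_tags = [
    SimpleNamespace(name="street", confidence=0.91),
    SimpleNamespace(name="night", confidence=0.97),
    SimpleNamespace(name="boat", confidence=0.12),
]
print(confident_tag_names(sample_tags))
```

With a real response, the call would be confident_tag_names(image_analysis.tags). The 0.8 threshold is arbitrary and worth tuning for your images.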

Get subject domain list

Review the subject domains used to analyze your image with list_models. These domain names are used when analyzing an image by domain. An example of a domain is landmarks.

models = client.list_models()

for x in models.models_property:
    print(x)

Analyze an image by domain

You can analyze an image by subject domain with analyze_image_by_domain. Get the list of supported subject domains in order to use the correct domain name.

# type of prediction
domain = "landmarks"

# Public domain image of Eiffel tower
url = "https://images.pexels.com/photos/338515/pexels-photo-338515.jpeg"

# English language response
language = "en"

analysis = client.analyze_image_by_domain(domain, url, language)

for landmark in analysis.result["landmarks"]:
    print(landmark["name"])
    print(landmark["confidence"])

Get text description of an image

You can get a language-based text description of an image with describe_image. Request several descriptions with the max_descriptions value (the second argument in the example) if you are doing text analysis for keywords associated with the image. Examples of a text description for the image used in the following example include "a train crossing a bridge over a body of water", "a large bridge over a body of water", and "a train crossing a bridge over a large body of water".

domain = "landmarks"
url = "http://www.public-domain-photos.com/free-stock-photos-4/travel/san-francisco/golden-gate-bridge-in-san-francisco.jpg"
language = "en"
max_descriptions = 3

analysis = client.describe_image(url, max_descriptions, language)

for caption in analysis.captions:
    print(caption.text)
    print(caption.confidence)

Get text from image

You can get any handwritten or printed text from an image. This requires two calls to the SDK: batch_read_file and get_read_operation_result. The call to batch_read_file is asynchronous. In the results of the get_read_operation_result call, you need to check whether the first call completed, using TextOperationStatusCodes, before extracting the text data. The results include the text as well as the bounding box coordinates for the text.

# import models
from azure.cognitiveservices.vision.computervision.models import TextOperationStatusCodes
import time

url = "https://azurecomcdn.azureedge.net/cvt-1979217d3d0d31c5c87cbd991bccfee2d184b55eeb4081200012bdaf6a65601a/images/shared/cognitive-services-demos/read-text/read-1-thumbnail.png"
raw = True
custom_headers = None
numberOfCharsInOperationId = 36

# Async SDK call
rawHttpResponse = client.batch_read_file(url, custom_headers,  raw)

# Get ID from returned headers
operationLocation = rawHttpResponse.headers["Operation-Location"]
idLocation = len(operationLocation) - numberOfCharsInOperationId
operationId = operationLocation[idLocation:]

# SDK call
while True:
    result = client.get_read_operation_result(operationId)
    if result.status not in ['NotStarted', 'Running']:
        break
    time.sleep(1)

# Get data
if result.status == TextOperationStatusCodes.succeeded:
    for textResult in result.recognition_results:
        for line in textResult.lines:
            print(line.text)
            print(line.bounding_box)
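
The example above slices a fixed 36 characters off the Operation-Location header and polls with no upper bound. The following is a minimal sketch of a more defensive variant. It assumes the operation ID is the final path segment of the header URL, and both function names are hypothetical helpers rather than SDK calls.

```python
import time


def operation_id_from_location(operation_location: str) -> str:
    """Return the last path segment of an Operation-Location URL."""
    return operation_location.rstrip("/").rsplit("/", 1)[-1]


def poll_until_done(fetch_status, pending=("NotStarted", "Running"),
                    interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_status() until it returns a non-pending value.

    Raises TimeoutError if the status is still pending once `timeout`
    seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in pending:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("operation still pending after %.0f seconds" % timeout)
        sleep(interval)
```

With the client from this article, fetch_status could be lambda: client.get_read_operation_result(operationId).status. The sleep parameter exists so the loop can be tested without waiting.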

Generate thumbnail

You can generate a thumbnail (JPG) of an image with generate_thumbnail. The thumbnail does not need to be in the same proportions as the original image.

Install Pillow to use this example:

pip install Pillow

Once Pillow is installed, use the package in the following code example to generate the thumbnail image.

# Pillow package
from PIL import Image

# IO package to create local image
import io

width = 50
height = 50
url = "http://www.public-domain-photos.com/free-stock-photos-4/travel/san-francisco/golden-gate-bridge-in-san-francisco.jpg"

thumbnail = client.generate_thumbnail(width, height, url)

# generate_thumbnail streams the image as byte chunks; join them before decoding
image = Image.open(io.BytesIO(b"".join(thumbnail)))

image.save('thumbnail.jpg')

Troubleshooting

General

When you interact with the ComputerVisionClient client object using the Python SDK, the ComputerVisionErrorException class is used to return errors. Errors returned by the service correspond to the same HTTP status codes returned for REST API requests.

For example, if you try to analyze an image with an invalid key, a 401 error is returned. In the following snippet, the error is handled gracefully by catching the exception and displaying additional information about the error.


domain = "landmarks"
url = "http://www.public-domain-photos.com/free-stock-photos-4/travel/san-francisco/golden-gate-bridge-in-san-francisco.jpg"
language = "en"
max_descriptions = 3

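# Exception class the SDK raises for service errors
from azure.cognitiveservices.vision.computervision.models import ComputerVisionErrorException

try:
    analysis = client.describe_image(url, max_descriptions, language)

    for caption in analysis.captions:
        print(caption.text)
        print(caption.confidence)
except ComputerVisionErrorException as e:
    # The HTTP status code is available on the underlying response
    if e.response.status_code == 401:
        print("Error unauthorized. Make sure your key and endpoint are correct.")
    else:
        raise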

Handle transient errors with retries

While working with the ComputerVisionClient client, you might encounter transient failures caused by rate limits enforced by the service, or other transient problems like network outages. For information about handling these types of failures, see Retry pattern in the Cloud Design Patterns guide, and the related Circuit Breaker pattern.

Next steps
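
As a minimal sketch of the retry idea, the helper below retries a call with exponential backoff. It is not part of the SDK: call_with_retries and its parameters are hypothetical, and the caller decides which errors count as transient through the is_transient predicate.

```python
import time


def call_with_retries(func, is_transient, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a transient error wait base_delay * 2**n and try again.

    Non-transient errors, and the error from the final attempt, are re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_transient(exc):
                raise
            # Back off: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

A call such as call_with_retries(lambda: client.describe_image(url), is_transient) would wrap a single SDK request. A production version would typically also add jitter and a cap on the delay, as the Retry pattern guidance discusses.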