Custom Model Deployment fails, no clear error is given

Tharaka Ranathunga 0 Reputation points
2024-05-11T08:12:34.0066667+00:00

I'm new to Azure ML, so any help will be greatly appreciated. I have a custom Keras model built with Python (my_model.keras). I have imported it into Azure ML as a registered model, and now I need to deploy it as a web service. Every time I try, the deployment fails without a clear error. The endpoint returns the following:

"No Helalthy Upstream"

Below is my conda.env.yml

name: tensorflow_env
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.10
  - tensorflow=2.*
  - opencv
  - pillow
  - matplotlib
  - scikit-learn
  - pip
  - pip:
    - azureml-core
    - azureml-inference-server-http

This is my score.py:


import numpy as np
import tensorflow as tf
from PIL import Image
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()  # triggers the interactive authentication seen in the log

# Get the registered model by name and version (replace with your actual values)
model_name = "my_model"
model_version = 5

# Resolve the local path of the registered model and load it
model_path = Model.get_model_path(model_name, version=model_version, _workspace=ws)
loaded_model = tf.keras.models.load_model(model_path)

# Define function to pre-process an image
def preprocess_image(image_path):
  img = Image.open(image_path)  # Use Image.open for flexibility
  img = img.resize((224, 224))  # Resize consistently
  img_array = np.array(img)
  img_array = img_array / 255.0  # Normalize pixel values (common practice)
  img_array = np.expand_dims(img_array, axis=0)  # Add batch dimension
  return img_array

# Define function to predict on an image
def predict(image_path):
  preprocessed_image = preprocess_image(image_path)
  predictions = loaded_model.predict(preprocessed_image)
  class_id = np.argmax(predictions, axis=1)

  # Class names for the two output classes (hard-coded here)
  class_names = ['Not Suitable for Commercial Purpose', 'Suitable for Commercial Purpose']
  predicted_class_name = class_names[class_id.item()]
  return predicted_class_name

# Main function (for testing purposes)
def main():
  # Replace with the URL of your test image in the blob storage
  image_url = "https://silverchefml9194321163.blob.core.windows.net/azureml-blobstore-bf7c4871-0f1f-456d-9c1f-2b7793317bfc/TrainingImages/Test/Suitable for Commercial Purpose/1273 Hay St, West Perth WA 6005, Australia_337_135_2023-01.jpg"
  predicted_class = predict(image_url)
  print(f"Predicted class for {image_url}: {predicted_class}")

# Run main function for testing (optional)
if __name__ == "__main__":
  main()

Below is my log output.



Server Settings
---------------
Entry Script Name: /structure/azureml-app/main.py
Model Directory: /var/azureml-app/azureml-models/my_model/5
Config File: None
Worker Count: 1
Worker Timeout (seconds): 300
Server Port: 31311
Health Port: 31311
Application Insights Enabled: false
Application Insights Key: AppInsights key provided
Inferencing HTTP server version: azmlinfsrv/1.2.1
CORS for the specified origins: None
Create dedicated endpoint for health: None


Server Routes
---------------
Liveness Probe: GET   127.0.0.1:31311/
Score:          POST  127.0.0.1:31311/score


Warnings
---------------
Azmlinfsrv will be migrating to Pydantic 2.0 on 1/15/24. This is a breaking change for any Pydantic 1.0 code.

2024-05-11 07:53:02,034 W [70] azmlinfsrv - Found extra keys in the config file that are not supported by the server.
Extra keys = ['AZUREML_ENTRY_SCRIPT', 'SERVICE_NAME', 'WORKSPACE_NAME', 'SERVICE_PATH_PREFIX', 'SERVICE_VERSION', 'SCORING_TIMEOUT_MS', 'AZUREML_MODEL_DIR', 'HOSTNAME']
2024-05-11 07:53:02,898 I [70] azmlinfsrv - AML_FLASK_ONE_COMPATIBILITY is set. Patched Flask to ensure compatibility with Flask 1.
Initializing logger
2024-05-11 07:53:02,905 I [70] azmlinfsrv - Starting up app insights client
2024-05-11 07:53:03.848041: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-11 07:53:12,946 I [70] azmlinfsrv.print - Performing interactive authentication. Please follow the instructions on the terminal.
WARNING:azureml._vendor.azure_cli_core.auth.identity:To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code PQGFBMD3J to authenticate.

1 answer

  1. Gowtham CP 1,255 Reputation points
    2024-05-12T05:39:12.8833333+00:00

    Hello @Tharaka Ranathunga

    Thanks for reaching out in the Microsoft Q&A!

    The error "No Healthy Upstream" suggests a problem communicating with your model. First, tidy up extra configuration keys, handle any TensorFlow warnings, and ensure authentication works. To fix deployment, double-check how your model loads, preprocesses data, and test it locally. Verify all needed tools are listed in your environment file, confirm the correct model version is selected, and review service settings. By tackling these steps carefully, you can solve the deployment problem in Azure ML.
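A scoring script restructured along these lines avoids the interactive-authentication hang. This is a sketch, not a drop-in fix: the model filename "my_model.keras", the base64 JSON request shape, and the class names are assumptions taken from the question. TensorFlow is imported inside init() so the module stays importable for local unit testing of the preprocessing step.

```python
# score.py sketch for azureml-inference-server-http
import base64
import io
import json
import os

import numpy as np
from PIL import Image

model = None
CLASS_NAMES = ["Not Suitable for Commercial Purpose", "Suitable for Commercial Purpose"]


def init():
    """Called once at server startup: load the model from the mounted
    model directory instead of authenticating to the workspace."""
    global model
    import tensorflow as tf  # deferred import; only needed inside the container

    model_dir = os.environ["AZUREML_MODEL_DIR"]
    model = tf.keras.models.load_model(os.path.join(model_dir, "my_model.keras"))


def preprocess(img):
    """Resize to the network input size and scale pixel values to [0, 1]."""
    img = img.convert("RGB").resize((224, 224))
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return np.expand_dims(arr, axis=0)  # add a batch dimension


def run(raw_data):
    """Called per request with the POST body; expects {"image": "<base64>"}."""
    payload = json.loads(raw_data)
    img = Image.open(io.BytesIO(base64.b64decode(payload["image"])))
    preds = model.predict(preprocess(img))
    return {"prediction": CLASS_NAMES[int(np.argmax(preds, axis=1)[0])]}
```

Because run() receives the raw request body, the client decides how to ship the image; base64-in-JSON is just one workable convention.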

    If you found this solution helpful, consider accepting it.
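For local testing, a small client sketch against the /score route from the log above can help (the port 31311 comes from the server settings in the question; the {"image": "<base64>"} payload shape and the test.jpg filename are assumptions):

```python
import base64
import json
import urllib.request


def build_payload(image_bytes):
    """Wrap raw image bytes in a base64-in-JSON request body."""
    return json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")}).encode()


def smoke_test(image_path, url="http://127.0.0.1:31311/score"):
    """POST one image to the local inference server and return the response body."""
    with open(image_path, "rb") as f:
        body = build_payload(f.read())
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Example (with the server running locally): print(smoke_test("test.jpg"))
```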
