Example: Build and deploy a custom skill with Azure Machine Learning designer

Azure Machine Learning designer is an easy-to-use interactive canvas for creating machine learning models for tasks like regression and classification. Invoking a model created in the designer from a Cognitive Search enrichment pipeline requires a few additional steps. In this example, you will create a simple regression model that predicts the price of an automobile and invoke its inferencing endpoint as an AML skill.

Follow the Regression - Automobile Price Prediction (Advanced) tutorial on the example pipelines & datasets documentation page to create a model that predicts the price of an automobile from its features.

Important

Deploying the model following the real time inferencing process will result in a valid endpoint, but not one that you can use with the AML skill in Cognitive Search.

Register model and download assets

Once the model is trained, register it and follow the steps to download all the files in the trained_model_outputs folder, or download only the score.py and conda_env.yaml files from the model's artifacts page. You will edit the scoring script before the model is deployed as a real-time inferencing endpoint.

Cognitive Search enrichment pipelines work on a single document and generate a request that contains the inputs for a single prediction. The downloaded score.py accepts a list of records and returns a list of predictions as a serialized JSON string. You will make two changes to score.py:

  • Edit the script to work with a single input record, not a list.
  • Edit the script to return a JSON object with a single property, the predicted price.

Open the downloaded score.py and edit the run(data) function. The function is currently set up to expect the following input, as described in the model's _samples.json file:

[
  {
    "symboling": 2,
    "make": "mitsubishi",
    "fuel-type": "gas",
    "aspiration": "std",
    "num-of-doors": "two",
    "body-style": "hatchback",
    "drive-wheels": "fwd",
    "engine-location": "front",
    "wheel-base": 93.7,
    "length": 157.3,
    "width": 64.4,
    "height": 50.8,
    "curb-weight": 1944,
    "engine-type": "ohc",
    "num-of-cylinders": "four",
    "engine-size": 92,
    "fuel-system": "2bbl",
    "bore": 2.97,
    "stroke": 3.23,
    "compression-ratio": 9.4,
    "horsepower": 68.0,
    "peak-rpm": 5500.0,
    "city-mpg": 31,
    "highway-mpg": 38,
    "price": 6189.0
  },
  {
    "symboling": 0,
    "make": "toyota",
    "fuel-type": "gas",
    "aspiration": "std",
    "num-of-doors": "four",
    "body-style": "wagon",
    "drive-wheels": "fwd",
    "engine-location": "front",
    "wheel-base": 95.7,
    "length": 169.7,
    "width": 63.6,
    "height": 59.1,
    "curb-weight": 2280,
    "engine-type": "ohc",
    "num-of-cylinders": "four",
    "engine-size": 92,
    "fuel-system": "2bbl",
    "bore": 3.05,
    "stroke": 3.03,
    "compression-ratio": 9.0,
    "horsepower": 62.0,
    "peak-rpm": 4800.0,
    "city-mpg": 31,
    "highway-mpg": 37,
    "price": 6918.0
  },
  {
    "symboling": 1,
    "make": "honda",
    "fuel-type": "gas",
    "aspiration": "std",
    "num-of-doors": "two",
    "body-style": "sedan",
    "drive-wheels": "fwd",
    "engine-location": "front",
    "wheel-base": 96.5,
    "length": 169.1,
    "width": 66.0,
    "height": 51.0,
    "curb-weight": 2293,
    "engine-type": "ohc",
    "num-of-cylinders": "four",
    "engine-size": 110,
    "fuel-system": "2bbl",
    "bore": 3.15,
    "stroke": 3.58,
    "compression-ratio": 9.1,
    "horsepower": 100.0,
    "peak-rpm": 5500.0,
    "city-mpg": 25,
    "highway-mpg": 31,
    "price": 10345.0
  }
]

Your changes ensure that the model accepts the input that Cognitive Search generates during indexing, which is a single record:

{
    "symboling": 2,
    "make": "mitsubishi",
    "fuel-type": "gas",
    "aspiration": "std",
    "num-of-doors": "two",
    "body-style": "hatchback",
    "drive-wheels": "fwd",
    "engine-location": "front",
    "wheel-base": 93.7,
    "length": 157.3,
    "width": 64.4,
    "height": 50.8,
    "curb-weight": 1944,
    "engine-type": "ohc",
    "num-of-cylinders": "four",
    "engine-size": 92,
    "fuel-system": "2bbl",
    "bore": 2.97,
    "stroke": 3.23,
    "compression-ratio": 9.4,
    "horsepower": 68.0,
    "peak-rpm": 5500.0,
    "city-mpg": 31,
    "highway-mpg": 38,
    "price": 6189.0
}

Replace lines 27 through 30 with:


    for key, val in data.items():
        input_entry[key].append(decode_nan(val))

You also need to change the output that the script generates from a string to a JSON object. Edit the return statement (line 37) in the original file to:

    output = result.data_frame.values.tolist()
    return {
        "predicted_price": output[0][-1]
    }

Here is the updated run function with both changes applied. It accepts a single record as input and returns a JSON object with the predicted price.

def run(data):
    data = json.loads(data)
    input_entry = defaultdict(list)
    # data is now a JSON object not a list of JSON objects
    for key, val in data.items():
        input_entry[key].append(decode_nan(val))

    data_frame_directory = create_dfd_from_dict(input_entry, schema_data)
    score_module = ScoreModelModule()
    result, = score_module.run(
        learner=model,
        test_data=DataTable.from_dfd(data_frame_directory),
        append_or_result_only=True)
    # Original: return json.dumps({"result": result.data_frame.values.tolist()})
    output = result.data_frame.values.tolist()
    # Return the last column of the first row of the data frame
    return {
        "predicted_price": output[0][-1]
    }
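
The reshaping that the edited run function performs can be tried outside Azure Machine Learning. The following is a minimal sketch, not the designer's code: decode_nan is a hypothetical stand-in for the helper of the same name in the generated score.py, and fake_scored_rows stands in for result.data_frame.values.tolist(), assuming the scored value is the last column of each row. The value 6500.0 is a placeholder, not a real prediction.

```python
import json
from collections import defaultdict


def decode_nan(val):
    # Hypothetical stand-in: the generated score.py ships its own decode_nan.
    # Here the string "NaN" becomes a float NaN; other values pass through.
    return float("nan") if val == "NaN" else val


def to_input_entry(raw_json):
    """Turn one JSON record into the column -> [value] mapping built in run()."""
    data = json.loads(raw_json)
    input_entry = defaultdict(list)
    for key, val in data.items():
        input_entry[key].append(decode_nan(val))
    return input_entry


def to_skill_output(scored_rows):
    """Return the last column of the first scored row as a JSON-ready dict."""
    return {"predicted_price": scored_rows[0][-1]}


# A shortened record; a real request carries every column the model expects.
record = json.dumps({"make": "mitsubishi", "engine-size": 92, "price": 6189.0})
entry = to_input_entry(record)

# Assumed shape: input columns followed by the scored label (placeholder value).
fake_scored_rows = [["mitsubishi", 92, 6189.0, 6500.0]]
prediction = to_skill_output(fake_scored_rows)
```

Each key in entry maps to a one-element list, which mirrors what the original list-based loop would have built for a single record.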

Register and deploy the model

With your changes saved, you can now register the model in the portal. Select Register model and provide a valid name. Choose Other for Model Framework, Custom for Framework Name, and 1.0 for Framework Version. Select the Upload folder option and select the folder that contains the updated score.py and conda_env.yaml.

Select the model and then select the Deploy action. The deployment step assumes you have an AKS inferencing cluster provisioned. Container instances are currently not supported in Cognitive Search.

  1. Provide a valid endpoint name
  2. Select the compute type of Azure Kubernetes Service
  3. Select the compute name for your inference cluster
  4. Toggle enable authentication to on
  5. Select Key-based authentication for the type
  6. Select the updated score.py for entry script file
  7. Select the conda_env.yaml for conda dependencies file
  8. Select the deploy button to deploy your new endpoint.
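
Before connecting the endpoint to Cognitive Search, it can help to send it one record directly. The sketch below only builds the HTTP request. It assumes that the key from key-based authentication is sent as a Bearer token in the Authorization header. The URI and key shown are placeholders, and the record is shortened for readability; a real request needs every column the model expects.

```python
import json
import urllib.request


def build_scoring_request(uri, key, record):
    """Build a POST request that carries a single record as a JSON object."""
    body = json.dumps(record).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: the endpoint key is passed as a Bearer token.
        "Authorization": "Bearer " + key,
    }
    return urllib.request.Request(uri, data=body, headers=headers, method="POST")


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder for your endpoint URI
    "<your-endpoint-key>",            # placeholder for your endpoint key
    {"make": "mitsubishi", "engine-size": 92},
)

# To call a real endpoint, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

If the edited score.py is deployed, the response should be a JSON object with a predicted_price property rather than a list of predictions.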

To integrate the newly created endpoint with Cognitive Search:

  1. Add a JSON file containing a single automobile record to a blob container
  2. Configure an AI enrichment pipeline using the import data workflow. Be sure to select JSON as the parsing mode
  3. On the Add Enrichments tab, select a single skill, Extract people names, as a placeholder.
  4. Add a new field to the index called predicted_price of type Edm.Double, and set the Retrievable property to true.
  5. Complete the import data process
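
The single-record blob from step 1 can be produced from a list-shaped sample such as the model's _samples.json. This is a minimal sketch under assumptions: the file name automobile.json and the shortened records are illustrative only, and the file is written to the local temp directory before you upload it to the blob container yourself.

```python
import json
import os
import tempfile


def write_single_record(sample_records, path):
    """Write the first record of a list-shaped sample as one JSON object."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(sample_records[0], handle, indent=2)
    return path


# Shortened stand-in for the list of records found in the sample file.
sample = [
    {"make": "mitsubishi", "engine-size": 92, "price": 6189.0},
    {"make": "toyota", "engine-size": 92, "price": 6918.0},
]

out_path = write_single_record(
    sample, os.path.join(tempfile.gettempdir(), "automobile.json")
)

# Read the file back to confirm it holds a JSON object, not a list.
with open(out_path, encoding="utf-8") as handle:
    written = json.load(handle)
```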

Add the AML Skill to the skillset

From the list of skillsets, select the skillset you created. You will now edit the skillset to replace the people identification skill with the AML skill that predicts prices. On the Skillset Definition (JSON) tab, select Azure Machine Learning (AML) from the skills dropdown, and then select the workspace. For the AML skill to discover your endpoint, the workspace and the search service must be in the same Azure subscription. Select the endpoint that you created earlier in the tutorial, and validate that the skill is populated with the URI and authentication information that you configured when you deployed the endpoint. Copy the skill template and use it to replace the placeholder skill in the skillset. Edit the skill to:

  1. Set the name to a valid name
  2. Add a description
  3. Set degreeOfParallelism to 1
  4. Set the context to /document
  5. Set the inputs to all the required inputs (see the sample skill definition below)
  6. Set the outputs to capture the predicted price returned.

{
      "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
      "name": "AMLdemo",
      "description": "AML Designer demo",
      "context": "/document",
      "uri": "Your AML endpoint",
      "key": "Your AML endpoint key",
      "resourceId": null,
      "region": null,
      "timeout": "PT30S",
      "degreeOfParallelism": 1,
      "inputs": [
        {
          "name": "symboling",
          "source": "/document/symboling"
        },
        {
          "name": "make",
          "source": "/document/make"
        },
        {
          "name": "fuel-type",
          "source": "/document/fuel-type"
        },
        {
          "name": "aspiration",
          "source": "/document/aspiration"
        },
        {
          "name": "num-of-doors",
          "source": "/document/num-of-doors"
        },
        {
          "name": "body-style",
          "source": "/document/body-style"
        },
        {
          "name": "drive-wheels",
          "source": "/document/drive-wheels"
        },
        {
          "name": "engine-location",
          "source": "/document/engine-location"
        },
        {
          "name": "wheel-base",
          "source": "/document/wheel-base"
        },
        {
          "name": "length",
          "source": "/document/length"
        },
        {
          "name": "width",
          "source": "/document/width"
        },
        {
          "name": "height",
          "source": "/document/height"
        },
        {
          "name": "curb-weight",
          "source": "/document/curb-weight"
        },
        {
          "name": "engine-type",
          "source": "/document/engine-type"
        },
        {
          "name": "num-of-cylinders",
          "source": "/document/num-of-cylinders"
        },
        {
          "name": "engine-size",
          "source": "/document/engine-size"
        },
        {
          "name": "fuel-system",
          "source": "/document/fuel-system"
        },
        {
          "name": "bore",
          "source": "/document/bore"
        },
        {
          "name": "stroke",
          "source": "/document/stroke"
        },
        {
          "name": "compression-ratio",
          "source": "/document/compression-ratio"
        },
        {
          "name": "horsepower",
          "source": "/document/horsepower"
        },
        {
          "name": "peak-rpm",
          "source": "/document/peak-rpm"
        },
        {
          "name": "city-mpg",
          "source": "/document/city-mpg"
        },
        {
          "name": "highway-mpg",
          "source": "/document/highway-mpg"
        },
        {
          "name": "price",
          "source": "/document/price"
        }
      ],
      "outputs": [
        {
          "name": "predicted_price",
          "targetName": "predicted_price"
        }
      ]
    }
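
Because every input in the skill definition maps a column name to /document/<column name>, the definition can be generated rather than typed by hand. The sketch below is one way to do that, not part of the portal workflow: build_aml_skill is a hypothetical helper, and the URI and key are the same placeholders used in the sample definition.

```python
import json

# Column names taken from the sample record used throughout this example.
FIELD_NAMES = [
    "symboling", "make", "fuel-type", "aspiration", "num-of-doors",
    "body-style", "drive-wheels", "engine-location", "wheel-base", "length",
    "width", "height", "curb-weight", "engine-type", "num-of-cylinders",
    "engine-size", "fuel-system", "bore", "stroke", "compression-ratio",
    "horsepower", "peak-rpm", "city-mpg", "highway-mpg", "price",
]


def build_aml_skill(uri, key, field_names):
    """Hypothetical helper: assemble the AML skill definition as a dict."""
    return {
        "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
        "name": "AMLdemo",
        "description": "AML Designer demo",
        "context": "/document",
        "uri": uri,
        "key": key,
        "resourceId": None,
        "region": None,
        "timeout": "PT30S",
        "degreeOfParallelism": 1,
        # One input per column, sourced from the matching document field.
        "inputs": [
            {"name": name, "source": "/document/" + name} for name in field_names
        ],
        "outputs": [
            {"name": "predicted_price", "targetName": "predicted_price"}
        ],
    }


skill = build_aml_skill("Your AML endpoint", "Your AML endpoint key", FIELD_NAMES)
skill_json = json.dumps(skill, indent=2)  # paste this into the skillset JSON
```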

Update the indexer output field mappings

The indexer output field mappings determine what enrichments are saved to the index. Replace the output field mappings section of the indexer with the snippet below:

"outputFieldMappings": [
    {
      "sourceFieldName": "/document/predicted_price",
      "targetFieldName": "predicted_price"
    }
  ]

You can now run your indexer and validate that the predicted_price property is populated in the index with the result from your AML skill output.
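
One way to check the result outside the portal is to query the index for the predicted_price field. The sketch below only builds the request and rests on assumptions: the service name my-search-service, the index name my-index, and the key are hypothetical placeholders, and it targets the 2020-06-30 version of the Search Documents REST API.

```python
import urllib.parse
import urllib.request


def build_search_request(service_name, index_name, api_key):
    """Build a GET request that returns predicted_price for all documents."""
    query = urllib.parse.urlencode(
        {"api-version": "2020-06-30", "search": "*", "$select": "predicted_price"}
    )
    url = "https://{0}.search.windows.net/indexes/{1}/docs?{2}".format(
        service_name, index_name, query
    )
    # The key is sent in the api-key request header.
    return urllib.request.Request(url, headers={"api-key": api_key}, method="GET")


search_request = build_search_request(
    "my-search-service",  # hypothetical service name
    "my-index",           # hypothetical index name
    "<your-query-key>",   # placeholder key
)

# To run the query against a real service, uncomment the following lines:
# import json
# with urllib.request.urlopen(search_request) as response:
#     print(json.loads(response.read())["value"])
```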

Next steps

Learn more about adding custom skills to the enrichment pipeline

Learn more about the AML skill