@Priyanka Shah Thanks for the question. Yes, instead of using an existing curated environment, you can create your own environment with the required package-version dependencies specified in a requirements file or conda YAML configuration.
Please follow the document below for the steps:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-environments#use-conda-dependencies-or-pip-requirements-files
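As a minimal sketch of that approach, the conda specification could look like the following (the file name `sparknlp-env.yml` and the Python version are illustrative; the pip pins mirror the versions in the question's Dockerfile):

```yaml
# sparknlp-env.yml — illustrative name; pins mirror the question's Dockerfile
name: sparknlp-env
channels:
  - conda-forge
dependencies:
  - python=3.7
  - pip
  - pip:
      - pyspark==2.4.4
      - spark-nlp==2.5.1
      - azureml-defaults
```

It can then be attached to an environment with `Environment.from_conda_specification(name="sparknlp-env", file_path="sparknlp-env.yml")`, letting AzureML build the conda environment for you instead of relying on a bare base image.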
Deploying spark-nlp model using custom docker image fails in Azure Machine Learning
Issue while deploying spark-nlp model to AML
I am trying to deploy a Spark NLP trained model from here to Azure Machine Learning (using a Python environment).
While deploying, I am using a custom Docker image provided by the spark-nlp documentation here.
This is because when I try to use an existing image such as
env = Environment.get(ws, name='AzureML-PySpark-MmlSpark-0.15')
I get errors when loading the model in the scoring script, as the spark-nlp libraries are not found in that AzureML-PySpark image.
So now I am using a custom Dockerfile as below:
dockerfile = r"""
FROM ubuntu:18.04
ENV NB_USER yuefeng
ENV NB_UID 1000
ENV HOME /home/${NB_USER}
ENV PYSPARK_PYTHON=python3
ENV PYSPARK_DRIVER_PYTHON=python3
RUN apt-get update && apt-get install -y \
tar \
wget \
bash \
rsync \
gcc \
libfreetype6-dev \
libhdf5-serial-dev \
libpng-dev \
libzmq3-dev \
python3 \
python3-dev \
python3-pip \
unzip \
pkg-config \
software-properties-common \
graphviz
RUN adduser --disabled-password \
--gecos "Default user" \
--uid ${NB_UID} \
${NB_USER}
RUN apt-get update && \
apt-get install -y openjdk-8-jdk && \
apt-get install -y ant && \
apt-get clean;
RUN apt-get update && \
apt-get install -y ca-certificates-java && \
apt-get clean && \
update-ca-certificates -f;
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64/
RUN echo "export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/" >> ~/.bashrc
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN pip3 install --upgrade pip
RUN pip3 install --no-cache-dir notebook==5.* numpy pyspark==2.4.4 spark-nlp==2.5.1 azureml-sdk azureml-core pandas mlflow Keras scikit-spark scikit-learn scipy matplotlib pydot tensorflow graphviz
USER root
RUN chown -R ${NB_UID} ${HOME}
USER ${NB_USER}
WORKDIR ${HOME}
CMD ["jupyter", "notebook", "--ip", "0.0.0.0"]
"""
My environment configuration is as below:
from azureml.core import Environment
from azureml.core.model import InferenceConfig
env = Environment("myenv")
env.docker.base_image = None
env.docker.base_dockerfile = dockerfile
env.inferencing_stack_version = 'latest'
inf_config = InferenceConfig(environment=env, entry_script="score.py")
The entry script, score.py, looks like below:
%%writefile score.py
import json
import pyspark
import azureml.core
from azureml.core.model import Model
from azureml.core import Workspace
from pyspark.ml import PipelineModel
from pyspark.context import SparkContext
from pyspark.sql.session import SparkSession
import sys, glob, os
def init():
    # the assignments below must bind the module-level names, so declare
    # them global inside init() rather than at module scope
    global trainedModel, spark
    sys.path.extend(glob.glob(os.path.join(os.path.expanduser("~"), ".ivy2/jars/*.jar")))
    spark = (SparkSession.builder
             .appName("Spark NLP")
             .master("local[4]")
             .config("spark.driver.memory", "16G")
             .config("spark.driver.maxResultSize", "2G")
             .config("spark.kryoserializer.buffer.max", "2000M")
             .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1,com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.15.1-beta")
             .getOrCreate())
    model_name = "nlp_test-register"  # interpolated
    model_path = Model.get_model_path(model_name)
    trainedModel = PipelineModel.load(model_path)
def run(input_json):
    if isinstance(trainedModel, Exception):
        return json.dumps({"trainedModel": str(trainedModel)})
    try:
        sc = spark.sparkContext
        input_list = json.loads(input_json)
        input_rdd = sc.parallelize(input_list)
        input_df = spark.read.json(input_rdd)
        prediction = trainedModel.transform(input_df)
        # collect() returns a plain list of Rows, so select the column
        # on the DataFrame before collecting
        preds = [str(x['ntokens']) for x in prediction.select('ntokens').collect()]
        result = ",".join(preds)
        return result
    except Exception as e:
        result = str(e)
        return result
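For reference, a hypothetical payload that the run() function above would accept is a JSON array of row objects: spark.read.json turns each object's keys into DataFrame columns. The "text" column name here is an assumption; substitute whatever columns the Spark NLP pipeline expects.

```python
import json

# Hypothetical input: a JSON array of row objects; each object's keys
# become DataFrame columns in spark.read.json. The "text" column name
# is an assumption — use the columns your pipeline actually expects.
sample_rows = [
    {"text": "Azure Machine Learning deploys Spark NLP models."},
    {"text": "A second document to score."},
]
input_json = json.dumps(sample_rows)

# json.loads(input_json) is exactly the first thing run() does
decoded = json.loads(input_json)
```

Sending this string as the request body reproduces what run() receives as input_json.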
The Spark NLP trained model is registered successfully in the workspace:

from azureml.core.model import Model

# Register the model
registered_model = Model.register(ws, model_name="nlp_test-register", model_path="./test.mml")

Now, when trying to deploy the model as a local web service:

from azureml.core.webservice import LocalWebservice

deployment_config = LocalWebservice.deploy_configuration(port=6789)
service = Model.deploy(
    ws,
    "myservice",
    [registered_model],
    inf_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)
I get an error while the environment image is being built:
Step 32/45 : RUN if dpkg --compare-versions conda --version | grep -oE '[^ ]+$'
lt 4.4.11; then conda install conda==4.4.11; fi
---> Running in 1d06aaf8f181
/bin/sh: 1: conda: not found
dpkg: error: --compare-versions takes three arguments: <version> <relation> <version>
Type dpkg --help for help about installing and deinstalling packages [*];
Use 'apt' or 'aptitude' for user-friendly package management;
Type dpkg -Dhelp for a list of dpkg debug flag values;
Type dpkg --force-help for a list of forcing options;
Type dpkg-deb --help for help about manipulating *.deb files;
...
Step 33/45 : COPY azureml-environment-setup/mutated_conda_dependencies.yml azureml-environment-setup/mutated_conda_dependencies.yml
---> efe6235c07d2
Step 34/45 : RUN ldconfig /usr/local/cuda/lib64/stubs && conda env create -p /azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad -f azureml-environment-setup/mutated_conda_dependencies.yml && rm -rf "$HOME/.cache/pip" && conda clean -aqy && CONDA_ROOT_DIR=$(conda info --root) && rm -rf "$CONDA_ROOT_DIR/pkgs" && find "$CONDA_ROOT_DIR" -type d -name pycache -exec rm -rf {} + && ldconfig
---> Running in 5382859f6b89
/bin/sh: 1: conda: not found
The command '/bin/sh -c ldconfig /usr/local/cuda/lib64/stubs && conda env create -p /azureml-envs/azureml_da3e97fcb51801118b8e80207f3e01ad -f azureml-environment-setup/mutated_conda_dependencies.yml && rm -rf "$HOME/.cache/pip" && conda clean -aqy && CONDA_ROOT_DIR=$(conda info --root) && rm -rf "$CONDA_ROOT_DIR/pkgs" && find "$CONDA_ROOT_DIR" -type d -name pycache -exec rm -rf {} + && ldconfig' returned a non-zero code: 127
2021/06/17 02:47:36 Container failed during run: acb_step_0. No retries remaining.
failed to run step ID: acb_step_0: exit status 127
Run ID: ccx failed after 13m28s. Error: failed during run, err: exit status 1
Package creation Failed
Does this mean conda is not available in the Docker image? How do I install conda in the image, and what commands should I add to the Dockerfile?
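For context on the failure: AzureML layers its own build steps (Steps 32–34 in the log, which call `conda`) on top of the supplied base Dockerfile, so the base image must already have conda on PATH. A hedged sketch of lines that could be appended to the Dockerfile — the Miniconda installer URL and the /opt/miniconda install path are assumptions, so verify them against the Miniconda download page:

```dockerfile
# Install Miniconda so the AzureML build steps that invoke `conda` succeed.
# Installer URL and /opt/miniconda path are assumptions — verify before use.
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
    bash /tmp/miniconda.sh -b -p /opt/miniconda && \
    rm /tmp/miniconda.sh
ENV PATH=/opt/miniconda/bin:$PATH
```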
<img width="960" alt="error" src="https://user-images.githubusercontent.com/50163025/122332624-4d47d100-cf69-11eb-92d2-ce025500e410.PNG">
Answered by Ramr-msft, 2021-06-17T14:53:21.397+00:00