Azure Container Instance killed for no reason after a few hours

jbx999 246 Reputation points
2021-02-21T11:52:12.423+00:00

I have a very simple Azure Container Instance running from a Docker container based on AdoptOpenJDK. The Dockerfile has nothing but:

   FROM adoptopenjdk:11-jre-hotspot  
   ARG JAR_FILE=target/*.jar  
   COPY ${JAR_FILE} app.jar  
   ENTRYPOINT ["java","-jar","/app.jar"]  

The application itself is a very simple Spring Boot application exposing a REST API. It currently does nothing but expose a simple request for testing. The application performs no real logic, so load or resource consumption should not be a factor. In fact, the graph shows practically constant memory usage at around 285 MB. According to the container's Properties it has 1.5 GB of memory, so it shouldn't be a resource issue.

70363-image.png

However, after a few hours of running happily, the container is just killed. I am also noticing it on a more complex application, but I wanted to verify it with a simpler one.
I am deploying the container with this command:

   az container create \
     --dns-name-label $ACI_CONTAINER_NAME \
     --resource-group $AZURE_RESOURCE_GROUP \
     --name $ACI_CONTAINER_NAME \
     --image $IMAGE_NAME:latest \
     --registry-username $ACR_USERNAME \
     --registry-password $ACR_PASSWORD \
     --restart-policy Never \
     --ports 80 \
     --environment-variables $MY_VARIABLES \
     --secure-environment-variables $MY_SECURE_VARIABLES

I deliberately set the restart policy to Never so that I can see the logs of the container after it stops, but nothing in them shows any problem.

Sometimes the container is killed after a few hours, sometimes after a bit longer. In this case it was killed after 20 hours.

70359-image.png

I can see nothing else in the container logs.

How can I know what is going on? Why is ACI just killing my instances arbitrarily?
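
For anyone asking the same question, here is a minimal sketch of how the container's last state, exit code, and platform events can be pulled with the Azure CLI. It assumes the CLI is installed and logged in and that the group holds a single container. The function name `aci_diagnose` and the arguments `my-rg` and `my-aci` are made up for illustration. It may not reveal the cause of a kill, but it shows what the platform recorded.

```shell
# Sketch: print what ACI recorded for the first container in a container group.
# Assumes the Azure CLI (az) is installed and logged in; aci_diagnose is a
# hypothetical helper name, not part of the CLI.
aci_diagnose() {
  rg="$1"     # resource group
  name="$2"   # container group name

  # Current state of the first container (state, exit code, detail status).
  az container show --resource-group "$rg" --name "$name" \
    --query "containers[0].instanceView.currentState" --output json

  # Events recorded by the platform for that container (pulls, starts, kills).
  az container show --resource-group "$rg" --name "$name" \
    --query "containers[0].instanceView.events" --output table

  # Stdout/stderr captured from the container.
  az container logs --resource-group "$rg" --name "$name"
}
```

Usage: `aci_diagnose my-rg my-aci`, run after the container has stopped. With the restart policy set to Never, the stopped container group should remain available for inspection.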

Azure Container Registry
An Azure service that provides a registry of Docker and Open Container Initiative images.
Azure Container Instances
An Azure service that provides customers with a serverless container experience.

Accepted answer
  1. jbx999 246 Reputation points
    2021-04-04T10:53:01.513+00:00

    In the end I decided to move away from Azure Container Instances. They are a buggy, unstable, half-baked product. After more than 6 weeks of investigation by Support, containers in West EU are still being killed without any reason, while containers in East US have been running happily since February. Furthermore, I just discovered that exposing an ACI that sits inside a VNet requires an Application Gateway, which costs more than the container instance itself! ACI is marketed as a low-cost alternative to a VM, while in fact it will cost double, without the stability.

    My personal advice: stay away from Azure Container Instances.

    3 people found this answer helpful.

6 additional answers

  1. Damir Dobric 6 Reputation points MVP
    2021-09-07T07:18:21.513+00:00

    Dear all,

    I had the same issue at the beginning of the year. After a long investigation I opened a support issue related to ACI.
    It took some time, but the engineering team found and fixed the issue. It was related to an unintended side effect of a defragmentation operation.
    Right now, all my long-running containers (running 3-5 days) complete successfully.

    This was the behavior before the fix:
    129757-image.png

    This is today:
    129758-image.png

    As you can see, unexpected restarts (see arrows) do not happen anymore.

    Just for the record.
    Damir

    1 person found this answer helpful.

  2. Sergio Rodriguez 1 Reputation point
    2021-02-23T17:23:09+00:00

    Same problem here!

    No reason why the container is killed.


  3. Sergio Rodriguez 1 Reputation point
    2021-02-25T07:59:52.683+00:00

    I think I solved the problem. Check whether the container is writing anything to the file system. In my case it was writing a log file, and maybe there is some disk space limitation.
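
    One way to test this hypothesis, as a rough sketch: open a shell in the running container and look at how much space the log directory and its filesystem use. It assumes the image ships a shell plus `df` and `du`. The helper name `disk_report` is made up for illustration.

```shell
# Run inside the container, e.g. after opening a shell with:
#   az container exec --resource-group <rg> --name <name> --exec-command "/bin/sh"
# disk_report is a hypothetical helper, not part of any tool.
disk_report() {
  dir="${1:-/}"
  # Free and used space on the filesystem that holds $dir.
  df -h "$dir"
  # Total size of $dir itself, without crossing into other filesystems.
  du -sxh "$dir" 2>/dev/null
}
```

    Calling `disk_report` with the directory that holds the log file, a few times over several hours, should show whether usage keeps growing.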


  4. Asha 59 1 Reputation point
    2021-06-28T08:30:22.803+00:00

    Same problem here.

    Seen on two kinds of container instances:

    • one doing a basic rsync job and writing to a mounted Azure Storage share
    • one doing a long-running computation (at least 3 days) and also writing to mounted Azure Storage

    We're rather disappointed, as ACI looked, on paper, like the perfect fit for our needs.
