Azure OpenAI Service GPT-4 is excessively (and recently) slow - when will this be resolved?

dankronstal 90 Reputation points
2024-01-25T04:49:05.9966667+00:00

There is a significant performance issue with AOAI GPT-4 models in the last week or so. We are developing a solution using this service as one of several components, and over the past week response times from the REST API have degraded to a point of un-usability. A very simple prompt exchange, as shown in the screenshot below, requires >1 minute to return. This has been tested across multiple GPT-4 models in different subscriptions and deployments, both with and without content filtering, and with API versions 2023-07-01-preview and 2023-12-01-preview, in times of load and during 'cool' periods when token throughput has rested at 0 for a few hours, during peak business hours and during the middle of the night. GPT-3.5 model deployments for the same request payload are <4 seconds, so the issue is specific to GPT-4 models only. Input as to when this issue will be resolved would be appreciated, or other ideas for testing to support the team working on the fix. Screenshot:User's image

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
2,190 questions
{count} votes