Azure OpenAI model version - Deployment vs Chat mismatch

Question

Hello Community,

When I try the new deployment:
Model name: gpt-4-32k
Model version: 0613
Version update policy: Once a new default version is available.
Deployment type: Standard
Content Filter: Default
Tokens per Minute Rate Limit (thousands): 20
Rate limit (Tokens per minute): 20000
Rate limit (Requests per minute): 120

I get the messages following response from the model: Screenshot 2024-05-01 121105

What am I doing wrong?

Thanks a lot

Answer

Istvan Molnar Greetings & Welcome to Microsoft Q&A forum!

This is an expected behavior. Please see FAQs for more details.

When I ask GPT-4 which model it's running, it tells me it's running GPT-3. Why does this happen?

Azure OpenAI models (including GPT-4) being unable to correctly identify what model is running is expected behavior.

Why does this happen?

Ultimately, the model is performing next token prediction in response to your question. The model doesn't have any native ability to query what model version is currently being run to answer your question. To answer this question, you can always go to Azure OpenAI Studio > Management > Deployments > and consult the model name column to confirm what model is currently associated with a given deployment name.

The questions, "What model are you running?" or "What is the latest model from OpenAI?" produce similar quality results to asking the model what the weather will be today. It might return the correct result, but purely by chance. On its own, the model has no real-world information other than what was part of its training/training data. In the case of GPT-4, as of August 2023 the underlying training data goes only up to September 2021. GPT-4 wasn't released until March 2023, so barring OpenAI releasing a new version with updated training data, or a new version that is fine-tuned to answer those specific questions, it's expected behavior for GPT-4 to respond that GPT-3 is the latest model release from OpenAI.

If you wanted to help a GPT based model to accurately respond to the question "what model are you running?", you would need to provide that information to the model through techniques like prompt engineering of the model's system message, Retrieval Augmented Generation (RAG) which is the technique used by Azure OpenAI on your data where up-to-date information is injected to the system message at query time, or via fine-tuning where you could fine-tune specific versions of the model to answer that question in a certain way based on model version.

To learn more about how GPT models are trained and work we recommend watching Andrej Karpathy's talk from Build 2023 on the state of GPT.

I hope this helps. Do let me know if that helps or have any other queries.

If the response helped, please do click Accept Answer and Yes for was this answer helpful.

Doing so would help other community members with similar issue identify the solution. I highly appreciate your contribution to the community.

Share via

Azure OpenAI model version - Deployment vs Chat mismatch

1 answer