During the early phases of developing your generative AI app, you want to experiment and iterate quickly. To quickly assess whether your selected language model and the app you created with prompt flow meet your requirements, you can manually evaluate models and flows in the Azure AI Foundry portal.
Even when your model and app are already in production, manual evaluations remain a crucial part of assessing performance. Because manual evaluations are performed by humans, they can surface insights that automated metrics might miss.
Let’s explore how you can manually evaluate your selected models and app in the Azure AI Foundry portal.
Prepare your test prompts
To begin the manual evaluation process, it’s essential to prepare a diverse set of test prompts that reflect the range of queries and tasks your app is expected to handle. These prompts should cover various scenarios, including common user questions, edge cases, and potential failure points. By doing so, you can comprehensively assess the app’s performance and identify areas for improvement.