During the early phases of developing your generative AI app, you want to experiment and iterate quickly. To quickly assess whether your selected language model and the app you created with prompt flow meet your requirements, you can manually evaluate models and flows in the Azure AI Foundry portal.
Even when your model and app are already in production, manual evaluations remain a crucial part of assessing performance. Because manual evaluations are performed by humans, they can surface insights that automated metrics might miss.
Let’s explore how you can manually evaluate your selected models and app in the Azure AI Foundry portal.
Prepare your test prompts
To begin the manual evaluation process, it’s essential to prepare a diverse set of test prompts that reflect the range of queries and tasks your app is expected to handle. These prompts should cover various scenarios, including common user questions, edge cases, and potential failure points. By doing so, you can comprehensively assess the app’s performance and identify areas for improvement.