To quickly and easily evaluate or compare AI responses in .NET applications, especially in tests, we can use automatic evaluation based on “LLM-as-a-Judge” prompts, with help from Semantic Kernel.
Sample code
Note that you need to set up Semantic Kernel with chat completion first. It is also recommended to set the temperature to 0 so that the judge's scores are as repeatable as possible.
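As a rough sketch of that setup, assuming the OpenAI connector from the Microsoft.SemanticKernel package: the model id and the environment variable name here are placeholders, and any other chat completion connector should work just as well. It is also an assumption that the sample's `executionSettings` accepts OpenAI-specific settings.

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Placeholder: read the API key from an environment variable of your choice.
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("OPENAI_API_KEY is not set.");

// Build a kernel with a chat completion service (the model id is a placeholder).
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(modelId: "gpt-4o-mini", apiKey: apiKey)
    .Build();

// Temperature 0 keeps the judge's answers as deterministic as possible.
var executionSettings = new OpenAIPromptExecutionSettings
{
    Temperature = 0
};
```

The `kernel` and `executionSettings` variables are then the ones used in the sample.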
var json =
    """
    {
        "humor" : {
            "output" : "this maybe funny"
        }
    }
    """;

await foreach (var result in
    kernel.Run(json, executionSettings: executionSettings))
{
    Console.WriteLine($"[{result.Key}]: result: {result.Value?.Item1}, score: {result.Value?.Item2}");
}
Microsoft.Extensions.AI.Evaluation also exists, but it is a work in progress and currently involves too much ceremony for simple use cases.
Feel free to get in touch on Twitter @roaming code