[2406.18181] On the Evaluation of Large Language Models in Unit Test Generation
Lightweight AI Evaluation with SemanticKernel
How Databricks is using synthetic data to simplify evaluation of AI agents