
AI Evals - Model Evaluation & Testing Platform | LLM as a judge | Python SDK

215 views
Oct 16, 2025
22:33

Evaluate Large Language Models in 3 easy steps! Code your own 'Evals':

1:
git clone https://github.com/RGGH/evaluate.git
cd evaluate
cargo run (if you have Rust installed), or else build and run with Docker 🐋

2: Configure .env with your API keys (Note: you can also run this locally, just with Ollama!) ✅

3: Write your Python code, as per the video!
pip install llmeval-sdk

Chapters:
00:00 intro
03:06 single evaluation using the 'evaluate' Python SDK
07:30 batch evaluation using 'evaluate', with 'run_batch'
15:03 the "evaluate" API
21:33 notebook on GitHub

The server can run on your local PC/laptop on Windows, Mac, or Linux. The client you write talks to the server, which calls the LLM providers and does the "LLM as a Judge" part for you.

View the Jupyter Notebook 📖 of the video here: https://github.com/RGGH/llmeval-python-sdk/blob/main/examples/evaluate.ipynb

You can contact me via: https://base21.uk/#contact

Thanks for watching! #evals #LLM-testing
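To make the "LLM as a Judge" idea concrete, here is a minimal, self-contained Python sketch of the pattern for a single case and for a batch. It does not use llmeval-sdk or the evaluate server: every name in it (EvalCase, Verdict, stub_model, stub_judge, evaluate_one, evaluate_batch) is hypothetical, and both the model and the judge are offline stand-ins, where a real setup would make LLM calls. For the actual SDK calls, follow the video and the notebook.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str    # question sent to the model under test
    expected: str  # reference answer the judge compares against


@dataclass
class Verdict:
    passed: bool
    reason: str


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for the model under test (no network call)."""
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}
    return answers.get(prompt, "")


def stub_judge(expected: str, actual: str) -> Verdict:
    """Hypothetical stand-in for the judge; a real judge would be a second LLM."""
    ok = expected.strip().lower() in actual.strip().lower()
    return Verdict(ok, "matches reference" if ok else "does not match reference")


def evaluate_one(
    case: EvalCase,
    model: Callable[[str], str],
    judge: Callable[[str, str], Verdict],
) -> Verdict:
    """Single evaluation: get the model's answer, then ask the judge to grade it."""
    return judge(case.expected, model(case.prompt))


def evaluate_batch(
    cases: List[EvalCase],
    model: Callable[[str], str],
    judge: Callable[[str, str], Verdict],
) -> List[Verdict]:
    """Batch evaluation: grade every case and keep the verdicts in order."""
    return [evaluate_one(case, model, judge) for case in cases]


cases = [
    EvalCase("What is 2 + 2?", "4"),
    EvalCase("Capital of France?", "Paris"),
]
verdicts = evaluate_batch(cases, stub_model, stub_judge)
pass_rate = sum(v.passed for v in verdicts) / len(verdicts)
print(f"pass rate: {pass_rate:.0%}")
```

Passing the model and the judge in as plain callables keeps the sketch easy to adapt: either stub could be swapped for a function that calls a local Ollama model or a hosted provider without touching the evaluation loop.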

