Evaluate Large Language Models in 3 easy steps!
Code your own 'Evals':
1:
git clone https://github.com/RGGH/evaluate.git
cd evaluate
cargo run (if you have Rust installed); otherwise build and run with Docker 🐋
2:
Configure .env with your API keys (Note: you can also run this locally, using just Ollama!) ✅
3:
Write your Python code, as shown in the video!
pip install llmeval-sdk
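
Before moving on from step 2, a quick check can confirm that your .env actually contains values. The following is a minimal sketch using only the Python standard library. The key name EXAMPLE_PROVIDER_API_KEY is a placeholder for illustration, not a name the 'evaluate' server is known to expect, so swap in whatever keys the repository documents.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    # Hypothetical key name: replace with the keys your setup needs.
    print("Missing:", missing_keys(env, ["EXAMPLE_PROVIDER_API_KEY"]))
```

If you run everything locally with Ollama, the list of required keys may well be empty.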
- chapters -
00:00 intro
03:06 single evaluation using 'evaluate' Python SDK
07:30 batch evaluation using 'evaluate' - with 'run_batch'
15:03 the "evaluate" API
21:33 notebook on GitHub
The server can run on your local PC or laptop on Windows, macOS, or Linux.
The client you write talks to the server, which calls the LLM providers and handles the "LLM as a Judge" part for you.
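
To make the "LLM as a Judge" idea concrete, here is a rough sketch of the flow in plain Python. It does not use the llmeval-sdk or the 'evaluate' API: every name in it (EvalCase, Verdict, stub_model, stub_judge, evaluate_cases) is made up for illustration, and both the model and the judge are stubs standing in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expected: str


@dataclass
class Verdict:
    passed: bool
    reason: str


# A judge takes (prompt, expected, actual) and returns a Verdict.
Judge = Callable[[str, str, str], Verdict]


def stub_model(prompt: str) -> str:
    """Stand-in for the model under test; a real setup would call a provider."""
    canned = {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}
    return canned.get(prompt, "")


def stub_judge(prompt: str, expected: str, actual: str) -> Verdict:
    """Stand-in for the judge LLM: here just a case-insensitive comparison."""
    passed = expected.strip().lower() == actual.strip().lower()
    reason = "matches expected" if passed else f"expected {expected!r}, got {actual!r}"
    return Verdict(passed, reason)


def evaluate_cases(
    cases: list[EvalCase], model: Callable[[str], str], judge: Judge
) -> list[Verdict]:
    """Run each case through the model, then let the judge score the answer."""
    return [judge(c.prompt, c.expected, model(c.prompt)) for c in cases]


if __name__ == "__main__":
    cases = [EvalCase("What is 2 + 2?", "4"), EvalCase("Capital of France?", "Paris")]
    verdicts = evaluate_cases(cases, stub_model, stub_judge)
    for case, verdict in zip(cases, verdicts):
        status = "PASS" if verdict.passed else "FAIL"
        print(f"{status}: {case.prompt} ({verdict.reason})")
    print(f"pass rate: {sum(v.passed for v in verdicts)}/{len(verdicts)}")
```

In the real setup, the server plays the role of the judge and model calls, so your client code only needs to send the cases and read back the verdicts.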
View the Jupyter Notebook 📖 from the video here:
https://github.com/RGGH/llmeval-python-sdk/blob/main/examples/evaluate.ipynb
You can contact me via:
https://base21.uk/#contact
Thanks for watching!
#evals #LLM-testing
AI Evals - Model Evaluation & Testing Platform | LLM as a judge | Python SDK