
Machine Learning without data: training small models with 10 examples

474 views
Dec 6, 2025
40:31

Title: Machine Learning without data: training small models with 10 examples

Speaker: Jacek Golebiowski (distil labs: https://www.distillabs.ai/)

Abstract: Organizations need AI that works on their messy data (logs, images, notes, reports) but can't afford to send sensitive information to the cloud. Small Language Models (SLMs) run locally on your own hardware, keeping data private while delivering fast results at a fraction of the cost.

Traditionally, building custom AI for production tasks requires expert teams and months of development. AutoML addresses part of this by automating model selection and training, but practitioners still face the bottleneck of manually labeling thousands of training examples. This is why ChatGPT succeeded: it only asks you to describe what you want, not provide labeled datasets.

We believe custom models need to match this experience, so in this session we present our model training pipeline that extends AutoML to data preparation itself. You define the problem, and a larger AI "teacher" automatically generates and refines training examples to create a specialized "student" model tailored to tasks like request triage, API chat interfaces, and data transformations. We'll dive deeper into the structure of the generated data and demonstrate how to effectively navigate data generation by controlling the latent variables that define each datapoint.
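The abstract does not show the pipeline itself, so the following is only a minimal sketch of the general idea: enumerate combinations of latent variables, turn each combination plus a few seed examples into a teacher prompt, and keep the de-duplicated outputs as training data for a student. Everything named here is hypothetical: the latent variables (`label`, `tone`, `length`), the helper names, and `stub_teacher`, which stands in for a call to a large model and uses no real API. It is not distil labs' implementation.

```python
import itertools
import random
from dataclasses import dataclass

# Hypothetical latent variables for a request-triage task. The variables used
# in the talk are not given in the abstract; these are assumptions.
LATENTS = {
    "label": ["billing", "outage", "feature_request"],
    "tone": ["neutral", "frustrated"],
    "length": ["short", "long"],
}


@dataclass(frozen=True)
class Example:
    text: str
    label: str


def build_prompt(task, seeds, latent):
    """Compose a teacher prompt from the task, seed examples, and one latent assignment."""
    shots = "\n".join(f"- [{s.label}] {s.text}" for s in seeds)
    return (
        f"Task: {task}\nSeed examples:\n{shots}\n"
        f"Write one new {latent['length']} request in a {latent['tone']} tone "
        f"whose correct label is '{latent['label']}'."
    )


def stub_teacher(prompt, latent):
    """Placeholder for a large teacher model; returns a deterministic string."""
    return f"({latent['tone']}, {latent['length']}) sample request about {latent['label']}"


def generate(task, seeds, teacher=stub_teacher, shots_per_prompt=2, rng_seed=0):
    """Produce one synthetic example per combination of latent values."""
    rng = random.Random(rng_seed)
    keys = list(LATENTS)
    out, seen = [], set()
    for values in itertools.product(*(LATENTS[k] for k in keys)):
        latent = dict(zip(keys, values))
        shots = rng.sample(seeds, k=min(shots_per_prompt, len(seeds)))
        text = teacher(build_prompt(task, shots, latent), latent).strip()
        # Minimal "refine" step: drop empty and duplicate generations.
        if text and text not in seen:
            seen.add(text)
            out.append(Example(text=text, label=latent["label"]))
    return out


# A handful of hand-written seed examples (invented for illustration).
seeds = [
    Example("I was charged twice this month.", "billing"),
    Example("The dashboard has been down since 9am.", "outage"),
    Example("Please add CSV export.", "feature_request"),
]
dataset = generate("Route support requests to the right team.", seeds)
print(len(dataset), dataset[0])
```

Because the label is itself one of the latent variables, every generated example arrives already labeled, and sweeping the other variables spreads the data across tones and lengths instead of leaving coverage to chance. In a real setup the stub would be replaced by an actual teacher model and the refine step by a stronger quality filter.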

