TabICL: Scaling In-Context Learning for Large Tabular Datasets

Apr 23, 2026

This research introduces TabICL, a tabular foundation model designed to scale in-context learning (ICL) to large datasets. Earlier models such as TabPFNv2 struggle with large tables because of their high computational cost. TabICL addresses this with a two-stage architecture: it first condenses the columns of each row into a fixed-size row embedding, then runs in-context learning over those row embeddings. This lets the model process up to 500,000 samples efficiently while remaining significantly faster than gradient-boosted trees or deep learning competitors. TabICL achieves state-of-the-art results on the TALENT benchmark, showing that pre-trained transformers can outperform traditional methods on large-scale classification tasks without hyperparameter tuning. The authors also use curriculum learning to handle varied dataset sizes and a hierarchical strategy to handle many classes. Overall, TabICL improves the speed and scalability of foundation models for structured data.
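
The two-stage idea can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the weights are random and untrained, mean pooling over columns stands in for the learned column and row processing, and a single softmax attention step stands in for the ICL transformer. The names `row_embeddings` and `icl_predict` are hypothetical. The sketch only shows that stage 1 yields a row embedding whose size does not depend on the number of columns, and that stage 2 then predicts test labels from labeled training rows without any gradient updates.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def row_embeddings(X, mu, sd, w, b):
    """Stage 1 (toy): embed every cell, then pool over columns.

    X: (n_rows, n_cols); mu, sd: per-column statistics, shape (n_cols,);
    w, b: (d,). Returns (n_rows, d) regardless of n_cols.
    """
    Z = (X - mu) / sd                        # standardize each column
    cells = np.tanh(Z[:, :, None] * w + b)   # (n_rows, n_cols, d)
    return cells.mean(axis=1)                # (n_rows, d)


def icl_predict(E_train, y_train, E_test, n_classes):
    """Stage 2 (toy): test rows attend over labeled training rows."""
    d = E_train.shape[1]
    scores = E_test @ E_train.T / np.sqrt(d)  # (n_test, n_train)
    attn = softmax(scores, axis=1)            # each row sums to 1
    onehot = np.eye(n_classes)[y_train]       # (n_train, n_classes)
    return attn @ onehot                      # class probabilities


# Small synthetic example (assumed data, for shape checking only).
rng = np.random.default_rng(0)
d, n_classes = 8, 3
w, b = rng.normal(size=d), rng.normal(size=d)

X_train = rng.normal(size=(20, 5))
y_train = rng.integers(0, n_classes, size=20)
X_test = rng.normal(size=(4, 5))

# Column statistics come from the training rows only.
mu = X_train.mean(axis=0)
sd = X_train.std(axis=0) + 1e-8

E_train = row_embeddings(X_train, mu, sd, w, b)
E_test = row_embeddings(X_test, mu, sd, w, b)
probs = icl_predict(E_train, y_train, E_test, n_classes)
print(E_train.shape, probs.shape)
```

In this sketch the second stage works on an `(n_rows, d)` array, so its cost grows with the number of rows and the embedding width but not with the number of columns.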
