Coralysis is an R package for the integration, annotation, and analysis of single-cell datasets. It targets a common failure of integration methods: datasets in which cell types are imbalanced or missing from some batches. Using a multi-level divisive algorithm and self-supervised learning, it removes technical batch effects while preserving subtle biological variation that other methods often overlook. Beyond integration, the package supports reference mapping, which transfers labels from an annotated reference to new data, and identifies cell states through probability scores. Benchmark results indicate that Coralysis outranks existing state-of-the-art tools when integrating highly similar but distinct cell populations across diverse platforms. It also performs robustly across data modalities, including single-cell transcriptomics and proteomics (CyTOF and ADT data). Together, these features streamline the single-cell workflow and give a more faithful representation of the cellular landscape in complex experiments.
References:
• Sousa AGG, Smolander J, Junttila S, et al. Coralysis enables sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration. Nucleic Acids Research, 2025, 53(21): gkaf1128.