ICML

Distinctive to ICML

Topics over-represented at ICML vs. the field average (2025)

Neural Scaling (1.1% vs 0.4% field avg) 2.43×
Mixture of Experts (1.1% vs 0.7% field avg) 1.59×
World Models (0.9% vs 0.6% field avg) 1.54×
Curriculum Learning (0.3% vs 0.2% field avg) 1.53×
Reinforcement Learning (8.2% vs 5.5% field avg) 1.49×
Large Language Models (19.5% vs 14.8% field avg) 1.31×
Quantization (2.1% vs 1.7% field avg) 1.29×
Model Compression (3.8% vs 3.3% field avg) 1.14×
Synthetic Data (3.2% vs 2.9% field avg) 1.13×
Explainability (5.1% vs 4.6% field avg) 1.1×
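
The multiplier on each line above reads as the topic's share at ICML divided by its share across the field. A minimal sketch of that arithmetic follows, assuming exactly that definition; the function name `overrepresentation` and the `rounded_shares` dictionary are hypothetical and only illustrate the calculation. Recomputing from the rounded percentages does not always reproduce the listed multiplier (1.1 / 0.4 gives 2.75 rather than 2.43, for example), presumably because the listed values were derived from unrounded shares.

```python
def overrepresentation(venue_share_pct: float, field_share_pct: float) -> float:
    """Ratio of a topic's share at the venue to its share across the field."""
    if field_share_pct <= 0:
        raise ValueError("field share must be positive")
    return venue_share_pct / field_share_pct


# Rounded shares copied from the list above: (ICML %, field-average %).
# The dictionary itself is an illustrative assumption, not the page's data model.
rounded_shares = {
    "Neural Scaling": (1.1, 0.4),
    "Reinforcement Learning": (8.2, 5.5),
    "Large Language Models": (19.5, 14.8),
}

# Recompute each multiplier to two decimals from the rounded inputs.
ratios = {
    topic: round(overrepresentation(venue, field), 2)
    for topic, (venue, field) in rounded_shares.items()
}

for topic, ratio in ratios.items():
    print(f"{topic}: {ratio}x")
```

A value above 1 means the topic takes a larger share of ICML papers than of the field as a whole. Small gaps between a recomputed value and the listed one are expected when the inputs have been rounded to one decimal place.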

Most Cited Papers

Year	Title	Citations	Links
2021	Learning Transferable Visual Models From Natural Language Supervision	43,062	S2 · arXiv
2020	A Simple Framework for Contrastive Learning of Visual Representations	22,864	S2 · arXiv
2019	EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks	22,253	S2 · arXiv
2020	Training data-efficient image transformers & distillation through attention	8,495	S2 · arXiv
2023	BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models	6,959	S2 · arXiv
2021	Zero-Shot Text-to-Image Generation	6,108	S2 · arXiv
2022	Robust Speech Recognition via Large-Scale Weak Supervision	6,062	S2 · arXiv
2022	BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation	5,982	S2 · arXiv
2019	Parameter-Efficient Transfer Learning for NLP	5,867	S2 · arXiv
2021	Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision	5,028	S2 · arXiv
2021	Improved Denoising Diffusion Probabilistic Models	4,860	S2 · arXiv
2021	GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models	4,458	S2 · arXiv
2021	EfficientNetV2: Smaller Models and Faster Training	3,866	S2 · arXiv
2019	Simplifying Graph Convolutional Networks	3,688	S2 · arXiv
2019	SCAFFOLD: Stochastic Controlled Averaging for Federated Learning	3,570	S2
2024	Scaling Rectified Flow Transformers for High-Resolution Image Synthesis	2,984	S2 · arXiv
2019	Theoretically Principled Trade-off between Robustness and Accuracy	2,883	S2 · arXiv
2021	Barlow Twins: Self-Supervised Learning via Redundancy Reduction	2,804	S2 · arXiv
2021	Is Space-Time Attention All You Need for Video Understanding?	2,712	S2 · arXiv
2020	REALM: Retrieval-Augmented Language Model Pre-Training	2,687	S2 · arXiv

[Charts: Paper Count Over Time; Top Topics (2025); Topic Trajectory (Top 10)]