Publications
Recent publications in Machine Learning (2022–present). For a complete list, see my CV or Google Scholar.
2026
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Shobhita Sundaram, John Quan, Ariel Kwiatkowski, Kartik Ahuja, Yann Ollivier, Julia Kempe
OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability
Karen Ullrich, Jingtong Su, Claudia Shi, Arjun Subramonian, Amir Bar, Ivan Evtimov, Nikolaos Tsilivis, Randall Balestriero, Julia Kempe, Mark Ibrahim
ICLR, 2026
How reinforcement learning after next-token prediction facilitates learning
Nikolaos Tsilivis, Eran Malach, Karen Ullrich, Julia Kempe
ICLR, 2026
Workshop version: "How reinforcement learning after next-token prediction facilitates learning" — NeurIPS Workshop on Principles of Generative Modeling (PriGM), 2025. Oral
Soft Tokens, Hard Truths
Natasha Butt, Ariel Kwiatkowski, Ismail Labiad, Julia Kempe*, Yann Ollivier* (*Equal senior authorship)
ICLR, 2026
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
Jingtong Su, Julia Kempe*, Karen Ullrich* (*Equal senior authorship)
ICLR, 2026
2025
Embedding Trust: Semantic Isotropy Predicts Nonfactuality in Long-Form Text Generation
Dhrupad Bhardwaj, Julia Kempe, Tim G. J. Rudner
Preprint, 2025
Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting
Yunzhen Feng, Parag Jain, Anthony Hartshorn, Yaqi Duan*, Julia Kempe* (*Equal senior authorship)
Preprint, 2025
Outcome-based Exploration for LLM Reasoning
Yuda Song, Julia Kempe, Rémi Munos
NeurIPS Workshop on Aligning RL Experimentalists and Theorists (2nd Edition), 2025
What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of Chain-of-Thought
Yunzhen Feng, Julia Kempe, Cheng Zhang, Parag Jain, Anthony Hartshorn
NeurIPS Workshop on Efficient Reasoning, 2025. Spotlight
Tuning without Peeking: Provable Generalization Bounds and Robust LLM Post-Training
Ismail Labiad, Mathurin Videau, Matthieu Kowalski, Marc Schoenauer, Alessandro Leite, Julia Kempe, Olivier Teytaud
Preprint, 2025
Asymmetric REINFORCE for Off-Policy Reinforcement Learning: Balancing Positive and Negative Rewards
Charles Arnal, Gaëtan Narozniak, Vivien Cabannes, Yunhao Tang, Julia Kempe, Rémi Munos
NeurIPS, 2025
PILAF: Optimal Human Preference Sampling for Reward Modeling
Y. Feng, A. Kwiatkowski, K. Zheng, J. Kempe*, Y. Duan* (*Equal senior authorship)
ICML, 2025
Workshop versions: ICLR Workshop on Bidirectional Human–AI Alignment, 2025; COLT Workshop on Foundations of Post-Training, 2025.
Strong Model Collapse
E. Dohmatob, Y. Feng, A. Subramonian, J. Kempe
ICLR, 2025. Spotlight
DRoP: Distributionally Robust Data Pruning
A. Vysogorets, K. Ahuja, J. Kempe
ICLR, 2025. Spotlight
Workshop version: "Towards Robust Data Pruning" — ICLR Workshop on Datacentric Machine Learning, 2024.
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
Nikolaos Tsilivis, Gal Vardi, Julia Kempe
ICLR, 2025
Workshop version: NeurIPS Workshop on Mathematics of Modern Machine Learning, 2024.
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
Y. Feng, E. Dohmatob, P. Yang, F. Charton, J. Kempe
ICLR, 2025
Workshop version: "Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement" — ICML Workshop on Theoretical Foundations of Foundation Models (TF2M), 2024. Earlier title; main version revised framing and results.
2024
Emergent properties with repeated examples
F. Charton, J. Kempe
NeurIPS Workshop on Scientific Methods for Understanding Deep Learning, 2024. 🏆 Debunking Challenge Winner. Oral
Workshop version: "Repeated examples help learn arithmetic" — 4th NeurIPS Workshop on Mathematical Reasoning and AI, 2024. Same core contribution, reframed toward learning arithmetic.
The Price of Implicit Bias in Adversarially Robust Generalization
N. Tsilivis, N. Frank, N. Srebro, J. Kempe
NeurIPS, 2024
Workshop version: "The Best Algorithm for Adversarial Training" — ICLR Workshop on Bridging the Gap Between Practice and Theory in Deep Learning (BGPT), 2024. Earlier title; subset of authors (main version added N. Srebro).
On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds
Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe
Preprint, 2024
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
J. Su, J. Kempe*, K. Ullrich* (*Equal senior authorship)
NeurIPS, 2024
Workshop version: ICML Workshop on Theoretical Foundations of Foundation Models (TF2M), 2024.
Iteration Head: A Mechanistic Study of Chain-of-Thought
V. Cabannes, C. Arnal, W. Bouaziz, A. Yang, F. Charton, J. Kempe
NeurIPS, 2024
Workshop version: ICML Mechanistic Interpretability Workshop, 2024.
Model Collapse Demystified: The Case of Regression
E. Dohmatob, Y. Feng, J. Kempe
NeurIPS, 2024
Workshop version: "Towards a Theoretical Understanding of Model Collapse" — ICLR Workshop on Bridging the Gap Between Practice and Theory in Deep Learning (BGPT), 2024. Earlier title; main version includes expanded results.
Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks
Y. Feng, T. G. J. Rudner, N. Tsilivis, J. Kempe
Transactions on Machine Learning Research, 2024. TMLR Certification Award
Workshop version: "Attacking Bayes: Are Bayesian Neural Networks Inherently Robust?" — 5th Symposium on Advances in Approximate Bayesian Inference (AABI), 2023. Shorter version; revised title and expanded analysis in TMLR.
A Tale of Tails: Model Collapse as a Change of Scaling Laws
E. Dohmatob, Y. Feng, P. Yang, F. Charton, J. Kempe
ICML, 2024
Workshop version: ICLR Workshop on Navigating and Addressing Data Problems for Foundation Models (DPFM), 2024.
Deconstructing the Goldilocks Zone of Neural Network Initialization
A. Vysogorets, A. Dawid, J. Kempe
ICML, 2024
Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors
T. Rudner, Y. Zhang, A. Wilson, J. Kempe
AISTATS, 2024. AISTATS Notable Paper Award. Oral
Embarrassingly Simple Dataset Distillation
Y. Feng, R. Vedantam, J. Kempe
ICLR, 2024
Workshop version: NeurIPS Workshop on Advancing Neural Network Training, 2023.
On the Robustness of Neural Collapse and the Neural Collapse of Robustness
J. Su, Y. Zhang, N. Tsilivis, J. Kempe
Transactions on Machine Learning Research, 2024
Workshop version: NeurIPS Workshop on Unifying Representations in Neural Models, 2023.
Kernels, Data & Physics (Lecture Notes, Les Houches 2022)
F. Cagnetta, D. Oliveira, M. Sabanayagam, N. Tsilivis, J. Kempe
Journal of Statistical Mechanics: Theory and Experiment, 2024
2023
Galaxy Dataset Distillation by Self-Adaptive Trajectory Matching
H. Guan, X. Zhao, Z. Wang, Z. Li, J. Kempe
NeurIPS Workshop on Machine Learning and the Physical Sciences, 2023
Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity
A. Vysogorets, J. Kempe
Journal of Machine Learning Research, 2023 (Vol. 24, Issue 99, pp. 1–23)
2022
What Can The Neural Tangent Kernel Tell Us About Adversarial Robustness?
N. Tsilivis, J. Kempe
NeurIPS, 2022
Wavelets Beat Monkeys at Adversarial Robustness
J. Su, J. Kempe
NeurIPS Workshop on Machine Learning and the Physical Sciences, 2022
Adversarial Noise Injection for Learned Turbulence Simulations
J. Su, J. Kempe, D. Fielding, N. Tsilivis, M. Cranmer, S. Ho
NeurIPS Workshop on Machine Learning and the Physical Sciences, 2022
Can We Achieve Robustness from Data Alone?
N. Tsilivis, J. Su, J. Kempe
ICML Workshop on New Frontiers in Adversarial Machine Learning, 2022