About Me
I’m a third-year computer science PhD student in Northwestern University’s theory group. My research interests are in theoretical computer science and the theoretical foundations of machine learning.
Before beginning my PhD studies at Northwestern University, I completed both my undergraduate and master’s degrees at the same institution. (Go Cats!)
Full CV
Publications and Manuscripts
Manuscripts and Ongoing Projects
On Hint-Induced Answer Shifts in Chain-of-Thought Reasoning
Sara Ahmadian, Pranjal Awasthi, Anxin Guo, Kamesh Munagala, Chirag Pabbaraju, and Aravindan Vijayaraghavan. (alphabetical order)
Manuscript in preparation.
Algorithmic Lemma Discovery via Formal Library Compression
Ruize Xu, Yi Gu, Anxin Guo, Tianchong Jiang, Chenxiao Yang, and Zhiyuan Li.
Manuscript.
Copying Under Repetition: From Prefix Matching to Position Hashing
Anxin Guo, Tianlong Huang, Chenxiao Yang, and Zhiyuan Li.
Manuscript.
Publications
Hallucination is a Consequence of Space-Optimality: A Rate-Distortion Theorem for Membership Testing
Anxin Guo and Jingwei Li.
ICML 2026 (spotlight, top 2.2%).
arXiv
Summary: We model the memorization of random, non-inferable facts as a membership testing problem, connecting Bloom-filter-style error metrics with the log-loss behavior of language models. In the sparse-fact regime, we prove a rate-distortion theorem showing that the optimal space-error tradeoff is governed by a KL-divergence frontier. The result gives an information-theoretic explanation for high-confidence hallucinations: under limited capacity, even an optimal model may assign high confidence to some non-facts rather than simply abstaining or forgetting.
Agnostic Learning of Arbitrary ReLU Activation under Gaussian Marginals
Anxin Guo and Aravindan Vijayaraghavan.
COLT 2025.
arXiv | conference version | recorded virtual talk
Summary: We give the first algorithm for agnostic PAC learning of an arbitrarily biased ReLU neuron under Gaussian input distributions, up to constant approximation. We also prove a hardness separation between SQ (statistical query) and CSQ (correlational statistical query) models for this problem, showing a limitation of gradient-based algorithms in this setting.
To Store or Not to Store: A Graph-Theoretic Approach for Dataset Versioning
Anxin Guo, Jingwei Li, Pattara Sukprasert, Samir Khuller, Amol Deshpande, and Koyel Mukherjee.
IPDPS 2024.
arXiv | conference version
Summary: We study a graph-theoretic framework for dataset versioning that optimizes storage costs while controlling retrieval costs across versions. On the theory side, we prove hardness-of-approximation results and give provably near-optimal algorithms for tree-like graphs of bounded treewidth. These results also lead to practical heuristics with up to 1000x speedups for the "MinSum Retrieval" problem on real-world GitHub repositories.
Education
Northwestern University, Evanston, Illinois (2019-2023)
B.A. in Math and M.S. in Computer Science
Northwestern University, Evanston, Illinois (2023-present)
Ph.D. student in Computer Science
Awards
- Ph.D. student research award (Northwestern CS department), 2024-2025.
- Barris Award for outstanding TA, CS 335: Introduction to the Theory of Computing, 2024 Fall quarter.
- Junior Career Award in Mathematics (Northwestern math department), 2021-2022.
Academic Activities
Research and Teaching
- IDEAL-funded summer research exchange, University of Chicago, Summer 2026. Hosted by Prof. Chao Gao.
- IDEAL-funded summer research exchange, Toyota Technological Institute at Chicago (TTIC), Summer 2025. Hosted by Prof. Zhiyuan Li.
- Teaching Assistant, CS 336: Design and Analysis of Algorithms, 2026 Spring quarter.
- Teaching Assistant, CS 335: Introduction to the Theory of Computing, 2024 Fall quarter. Guest lecture on busy beaver numbers.
Service and Programs
- Program Committee / Reviewer, Reliable ML from Unreliable Data, NeurIPS 2025 Workshop.
- Volunteer, FOCS 2024 (65th IEEE Symposium on Foundations of Computer Science), Chicago, IL.
- Ross Mathematics Program (Asia): first-year student (2018), junior counselor (2019), counselor (2020).
- Directed Reading Program: Spring 2021, read Linear Representations of Finite Groups with Wenyuan Li.
Miscellaneous
I have written three articles on 知乎 about theoretical computer science. Here is the link.
More personal notes
I grew up in Beijing and still have a soft spot for its subway system and shared bicycles.