Ben Lipkin - Home

About Me

I am a PhD candidate at MIT in the Department of Brain & Cognitive Sciences (BCS). I am also a member of the GenLM research consortium, where we are building an open-source ecosystem for language model probabilistic programming. This summer, I am interning at Apple in the Bay Area. I am grateful for my many wonderful mentors and collaborators across these communities.

My research, which is funded by the NSF GRFP and an MIT Presidential Fellowship, draws from diverse disciplines spanning cognitive science, Bayesian machine learning, and NLP. Currently, I'm focused on developing train-time and test-time algorithms for reliably controlling language models. I am particularly interested in tasks involving long-horizon planning and sparse reward.

Prior to starting my PhD, I studied computational neuroscience at the University of Michigan and worked for several years on machine learning applications in neurobiology, which led to publications in Nature, PNAS, and NeurIPS. I have also organized several interdisciplinary workshops, including NLRSE @ ACL 2024 and NHLS @ The U.S. National Science Foundation.

Select Projects
‣ Sampling Algorithms & Programming Models for LLMs
  ‣ AWRS [arXiv]: Fast randomized algorithm for constrained decoding as posterior inference.
  ‣ GenLM [ICLR Oral]: Controlling LLM generation via programmable constraints and sequential Monte Carlo.
  ‣ Decoding [GitHub]: An open-source library for compositional language model programs.
‣ Reasoning, Pragmatics, & World Knowledge
  ‣ ProbSem [CogSci]: Pragmatic semantic parsing via LLM-mediated approximate inference.
  ‣ LINC [EMNLP Outstanding Paper]: Combining LLMs with SMT solvers for provably consistent reasoning.
  ‣ EWoK [arXiv]: Benchmarking LLMs on core world knowledge.
‣ AI for Code & Mathematics
  ‣ BrainCode [NeurIPS]: An investigation of how LLMs encode computer programs.
  ‣ HumanMath [NeurIPS Math AI Workshop]: Opinion piece on the communicative role of mathematics.

Open Source
I care about open source and devote a portion of my time to community contributions. These have included developing and evaluating code models with the StarCoder project by Hugging Face and ServiceNow, writing the first implementation of CFG-guided text generation for the Outlines library by dottxt-ai, and contributing to AI for mathematics with Project Numina.

News

May 2025: I've started an internship at Apple in the Cupertino office.
Apr 2025: "Fast controlled generation from language models with adaptive weighted rejection sampling" uploaded as preprint.
Feb 2025: "Syntactic and semantic control of large language models via sequential Monte Carlo" accepted and selected for Oral Presentation at ICLR conference in Singapore.
Dec 2024: We presented the EWoK project at Google DeepMind.
Dec 2024: "Models can and should embrace the communicative nature of human-generated math" presented at NeurIPS Workshop on Mathematical Reasoning and AI in Vancouver, BC.
Nov 2024: Initial release of the decoding library.
Oct 2024: Invited research talks on Neurosymbolic AI at Microsoft Research and Pitt NLP Seminar.
Sept 2024: Co-organizer for SFI working group on "Assessing Representation in Minds and Artificial Systems" in Santa Fe, NM.
Aug 2024: Program chair for ACL workshop on "Natural Language Reasoning and Structured Explanations" in Bangkok, Thailand.
June 2024: Joined Project Numina as a contributor, facilitating the release of the models, datasets, and code used to win the first AIMO progress prize.
May 2024: "Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models" released as preprint alongside the companion open-source software library and dataset.
May 2024: Student organizer for NSF workshop on "New Horizons in Language Science: Large Language Models, Language Structure, and the Cognitive and Neural Basis of Language" in Arlington, VA.
Apr 2024: "Modeling uncertainty in semantic parsing" presented at the New England NLP Meeting in Providence, RI.
Apr 2024: Awarded NSF GRFP Fellowship.
Apr 2024: "Log probability scores provide a closer match to human plausibility judgments than prompt-based evaluations" presented at the SouthNLP symposium in Atlanta, GA.
Dec 2023: Designed and implemented the first version of CFG-guided generation for the open-source Outlines library.
Dec 2023: "LINC: a neurosymbolic approach for logical reasoning by combining language models with first-order logic provers" wins outstanding paper award at EMNLP conference in Singapore.
Dec 2023: "StarCoder: may the source be with you!" published at TMLR.
Aug 2023: Started the SFI summer program on intelligence and representation at the Isaac Newton Institute in Cambridge, UK.
July 2023: "Evaluating statistical language models as pragmatic reasoners" presented at CogSci in Sydney, Australia, and the ACL NLRSE workshop in Toronto, ON.
May 2023: BigCode evaluation harness released, enabling secure containerized evaluation of code generation language models.
Dec 2022: "This is your brain. This is your brain on code." features our work in MIT News and Communications of the ACM.
Nov 2022: "Convergent representations of computer programs in human and artificial neural networks" presented at NeurIPS conference in New Orleans, LA.
Sept 2022: Awarded MIT Presidential Fellowship.
Sept 2022: Started PhD at MIT BCS.