POPL 2025
Sun 19 - Sat 25 January 2025 Denver, Colorado, United States
Sun 19 Jan 2025 16:36 - 16:54 at Hopscotch - Verified Code Synthesis Chair(s): Stefan Zetzsche

Formal verification has the potential to drastically reduce software bugs, but its high additional cost has hindered large-scale adoption. While Dafny promises to significantly reduce the effort of writing verified programs, users are often required to provide logical annotations to aid the verifier. Here, we explore using a combination of Large Language Models and search to build dafny-annotator: a tool that adds logical annotations to a Dafny method until the verifier can prove it correct. On a test set from the DafnyBench collection of programs, greedy search guided by LLaMa 3.1 8B successfully annotates only 15.7% of the methods. Since this data-driven approach is hindered by the lack of large-scale training data, we propose a method for open-ended synthesis of new Dafny programs in a flexible pipeline where LLMs formulate high-level ideas, implement them, and incrementally propose changes to existing programs, which Dafny validates. This gives us a synthetic dataset, DafnySynth, which we use to augment DafnyBench for training. Fine-tuning on both datasets boosts LLaMa 3.1 8B's success rate to 50.6%, significantly better than the base model or training on either dataset alone. Our results suggest a path towards capable AI assistants for languages that don't yet have large-scale human-generated examples. In turn, such assistants might reduce friction for users and ultimately drive adoption.
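The annotate-until-verified loop described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and not the tool's actual interface: `propose` stands in for the LLM, `failures` stands in for a call to the Dafny verifier that reports how many proof obligations remain unproved, and ranking candidates by that count is just one plausible greedy criterion, since the abstract does not say how candidates are chosen.

```python
from typing import Callable, List, Optional, Sequence


def greedy_annotate(
    lines: List[str],
    propose: Callable[[List[str]], Sequence[str]],  # hypothetical LLM stand-in
    failures: Callable[[List[str]], int],           # hypothetical verifier stand-in
    max_steps: int = 5,
) -> Optional[List[str]]:
    """Greedily insert one annotation per step until nothing is left unproved.

    Returns the annotated program, or None if the search gets stuck.
    """
    current = list(lines)
    remaining = failures(current)
    for _ in range(max_steps):
        if remaining == 0:
            return current
        best, best_score = None, remaining
        # Try every proposed annotation at every possible insertion point.
        for annotation in propose(current):
            for pos in range(len(current) + 1):
                candidate = current[:pos] + [annotation] + current[pos:]
                score = failures(candidate)
                if score < best_score:
                    best, best_score = candidate, score
        if best is None:
            return None  # no proposal reduced the number of failures
        current, remaining = best, best_score
    return current if remaining == 0 else None


# Toy stand-ins so the sketch runs without an LLM or a Dafny installation.
def toy_propose(program: List[str]) -> Sequence[str]:
    return ["invariant 0 <= i <= n", "invariant i == n"]


def toy_failures(program: List[str]) -> int:
    # Pretend the method verifies once the useful invariant follows the loop header.
    for idx, line in enumerate(program[:-1]):
        if line.startswith("while") and program[idx + 1] == "invariant 0 <= i <= n":
            return 0
    return 1


method = ["method M(n: nat)", "{", "var i := 0;", "while i < n", "{ i := i + 1; }", "}"]
result = greedy_annotate(method, toy_propose, toy_failures)
stuck = greedy_annotate(method, lambda p: ["assert true;"], lambda p: 1)
```

In a real setting `failures` would invoke the verifier on the candidate program, so each step costs one verifier call per annotation and insertion point; the greedy choice keeps that cost linear in the number of steps at the price of never backtracking.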

Sun 19 Jan

Displayed time zone: Mountain Time (US & Canada)

16:00 - 18:00
Verified Code Synthesis (Dafny) at Hopscotch
Chair(s): Stefan Zetzsche Amazon Web Services
16:00
18m
Talk
Laurel: Unblocking Automated Verification with Large Language Models
Dafny
Eric Mugnier University of California San Diego, Emmanuel Anaya Gonzalez UCSD, Nadia Polikarpova University of California at San Diego, Ranjit Jhala University of California at San Diego, Zhou Yuanyuan UCSD
16:18
18m
Talk
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search
Dafny
David Brandfonbrener Harvard, Simon Henniger Technical University of Munich, Sibi Raja Harvard, Tarun Prasad Harvard, Chloe Loughridge Harvard University, Federico Cassano Northeastern University, Sabrina Ruixin Hu Harvard, Jianang Yang Million.js, William E. Byrd University of Alabama at Birmingham, USA, Robert Zinkov University of Oxford, Nada Amin Harvard University
16:36
18m
Talk
dafny-annotator: AI-Assisted Verification of Dafny Programs
Dafny
Gabriel Poesia Stanford University, Chloe Loughridge Harvard University, Nada Amin Harvard University
16:54
18m
Talk
Dafny as Verification-Aware Intermediate Language for Code Generation
Dafny
Yue Chen Li Massachusetts Institute of Technology, Stefan Zetzsche Amazon Web Services, Siva Somayyajula Amazon Web Services
17:12
18m
Talk
DafnyBench: A Benchmark for Formal Software Verification
Dafny
Chloe Loughridge Harvard University, Qinyi Sun Massachusetts Institute of Technology, Seth Ahrenbach Beneficial AI Foundation, Federico Cassano Northeastern University, Chuyue Sun Stanford University, Ying Sheng Stanford University, Anish Mudide Massachusetts Institute of Technology, Md Rakib Hossain Misu University of California Irvine, Nada Amin Harvard University, Max Tegmark Massachusetts Institute of Technology
17:30
18m
Talk
Towards Neural Synthesis for SMT-Assisted Proof-Oriented Programming
Dafny
Saikat Chakraborty Microsoft Research, Gabriel Ebner Microsoft Research, Siddharth Bhat University of Cambridge, Sarah Fakhoury Microsoft Research, Sakina Fatima University of Ottawa, Shuvendu K. Lahiri Microsoft Research, Nikhil Swamy Microsoft Research
17:48
12m
Day closing
Day closing
Dafny
Stefan Zetzsche Amazon Web Services