
Program verifiers such as Dafny automate proofs by outsourcing them to an SMT solver. This automation is not perfect, however: the solver often requires hints in the form of helper assertions, which places a burden on the proof engineer. In this paper, we propose Laurel, a tool that alleviates this burden by automatically generating these assertions using large language models (LLMs).
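To make the burden concrete, consider the following small Dafny example (ours, for illustration; it does not appear in the paper). Both branches of the lemma typically fail to verify until facts about sequence concatenation are spelled out as helper assertions, which is exactly the kind of hint Laurel aims to generate:

    function SumSeq(s: seq<int>): int {
      if s == [] then 0 else s[0] + SumSeq(s[1..])
    }

    lemma SumAppend(a: seq<int>, b: seq<int>)
      ensures SumSeq(a + b) == SumSeq(a) + SumSeq(b)
    {
      if a == [] {
        assert a + b == b;  // helper assertion: a sequence equality the solver may miss
      } else {
        // Helper assertion: without it, the solver cannot relate the
        // recursive call to the concatenated sequence.
        assert (a + b)[1..] == a[1..] + b;
        SumAppend(a[1..], b);
      }
    }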

To improve the success rate of LLMs at this task, we design two domain-specific prompting techniques. First, we help the LLM determine the location of the missing assertion by analyzing the verifier's error message and inserting an assertion placeholder at that location (sketched below). Second, we provide the LLM with example assertions from the same codebase, selected using a new proof similarity metric. We evaluate our techniques on our new benchmark DafnyGym, a dataset of complex lemmas extracted from three real-world Dafny codebases. Our evaluation shows that Laurel generates over 56.6% of the required assertions given only a few attempts, making LLMs an affordable tool for unblocking program verifiers without human intervention.
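As a hypothetical sketch of the first technique (the placeholder syntax and error text are ours, not necessarily Laurel's exact format): when the verifier reports a failure inside the inductive branch of the lemma above, a placeholder is inserted at the reported location and the LLM is asked to complete only that hole:

    lemma SumAppend(a: seq<int>, b: seq<int>)
      ensures SumSeq(a + b) == SumSeq(a) + SumSeq(b)
    {
      if a != [] {
        // Verifier error reported here: "a postcondition could not be proved".
        // Placeholder for the LLM to fill, e.g. with
        //   assert (a + b)[1..] == a[1..] + b;
        /* <assertion> */
        SumAppend(a[1..], b);
      }
    }

The second technique then supplies such prompts with example assertions from the same codebase, ranked by the proof similarity metric, as few-shot guidance.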