Automated Program Refinement: Guide and Verify Code Large Language Model with Refinement Calculus (POPL 2025 - POPL Research Papers)

Who

Yufan Cai, Zhe Hou, David Sanan, Xiaokun Luan, Yun Lin, Jun Sun, Jin Song Dong

Track

POPL 2025

Time Zone

The program is currently displayed in (GMT-07:00) Mountain Time (US & Canada).

Use conference time zone: (GMT-07:00) Mountain Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Fri 24 Jan 2025 11:20 - 11:40 at Marco Polo - Synthesis and Compilation Chair(s): Alexander K. Lew

Abstract

Recently, the rise of code-centric large language models (LLMs) appears to have reshaped the software engineering world with low-barrier tools like Copilot that can generate code easily. However, there is no correctness guarantee for the code generated by LLMs, which suffer from the hallucination problem, and their output is fraught with risks. Besides, the end-to-end process from specification to code through LLMs is a non-transparent and uncontrolled black box. This opacity makes it difficult for users to understand and trust the generated code. Addressing these challenges is both necessary and critical. In contrast, program refinement transforms high-level specification statements into executable code while preserving correctness. Traditional tools for program refinement are primarily designed for formal methods experts and lack automation and extensibility. We apply program refinement to guide LLM and validate the LLM-generated code while transforming refinement into a more accessible and flexible framework. To initiate this vision, we propose Refine4LLM, an approach that aims to: (1) Formally refine the specifications, (2) Automatically prompt and guide the LLM using refinement calculus, (3) Interact with the LLM to generate the code and proof obligations, (4) Verify that the generated code satisfies the constraints, thus guaranteeing its correctness, (5) Learn and build more refinement laws to extend the refinement calculus. We evaluated Refine4LLM against the state-of-the-art baselines on program refinement and LLMs benchmarks. The experiment results show that Refine4LLM can efficiently generate more robust code and reduce the time for refinement and verification.

Yufan Cai

National University of Singapore

Zhe Hou

Griffith University

David Sanan

Nanyang Technological University, Singapore

Singapore

Xiaokun Luan

Peking University

Yun Lin

Shanghai Jiao Tong University

China

Jun Sun

Singapore Management University

Singapore

Jin Song Dong

National University of Singapore

Singapore

Time Zone

The program is currently displayed in (GMT-07:00) Mountain Time (US & Canada).

Use conference time zone: (GMT-07:00) Mountain Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

Display full programSpecify a time band

Save

Session Program

Fri 24 Jan
Displayed time zone: Mountain Time (US & Canada) change

10:40 - 12:00	Synthesis and CompilationPOPL at Marco Polo Chair(s): Alexander K. Lew Massachusetts Institute of Technology

10:40 20m Talk		MimIR: An Extensible and Type-Safe Intermediate Representation for the DSL Age POPL Roland Leißa University of Mannheim, School of Business Informatics and Mathematics, Marcel Ullrich Saarland University, Saarland Informatics Campus, Joachim Meyer Saarland University, Saarland Informatics Campus, Sebastian Hack Saarland University, Saarland Informatics Campus Link to publication Pre-print
11:00 20m Talk		Simple Linear Loops: Algebraic Invariants and Applications POPL Rida Ait El Manssour CNRS & IRIF, Paris, George Kenison Liverpool John Moores University, Mahsa Shirmohammadi CNRS & IRIF, Paris, Anton Varonka TU Wien
11:20 20m Talk		Automated Program Refinement: Guide and Verify Code Large Language Model with Refinement Calculus POPL Yufan Cai National University of Singapore, Zhe Hou Griffith University, David Sanan Nanyang Technological University, Singapore, Xiaokun Luan Peking University, Yun Lin Shanghai Jiao Tong University, Jun Sun Singapore Management University, Jin Song Dong National University of Singapore
11:40 20m Talk		Tail Modulo Cons, OCaml, and Relational Separation LogicRemote POPL Clément Allain INRIA, Frédéric Bour Tarides, Basile Clément OCamlPro, François Pottier Inria, Gabriel Scherer Université Paris Cité - Inria - CNRS