Program Analysis via Multiple Context Free Language Reachability (POPL 2025 - POPL Research Papers)

Who

Giovanna Kobus Conrado, Adam Husted Kjelstrøm, Jaco van de Pol, Andreas Pavlogiannis

Track

POPL 2025

Time Zone

The program is currently displayed in (GMT-07:00) Mountain Time (US & Canada).

Use conference time zone: (GMT-07:00) Mountain Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Wed 22 Jan 2025 11:00 - 11:20 at Peek-A-Boo - Program Analysis Chair(s): Roland Leißa

Abstract

\emph{Context-free language (CFL)} reachability is a standard approach in static analyses, where the analysis question (e.g., is there a dataflow from $x$ to $y$?) is phrased as a language reachability problem on a graph $G$ wrt a CFL $\mathcal{L}$. However, CFLs lack the expressiveness needed for high analysis precision. On the other hand, common formalisms for \emph{context-sensitive} languages are too expressive, in the sense that the corresponding reachability problem becomes undecidable. \emph{Are there useful context-sensitive language-reachability models for static analysis?}

In this paper, we introduce \emph{Multiple Context-Free Language (MCFL)} reachability as an \emph{expressive} yet \emph{tractable} model for static program analysis. MCFLs form an infinite hierarchy of mildly context sensitive languages parameterized by a \emph{dimension} $d$ and a \emph{rank} $r$. Larger $d$ and $r$ yield progressively more expressive MCFLs, offering \emph{tunable} analysis precision. We showcase the utility of MCFL reachability by developing a family of MCFLs that approximate interleaved Dyck reachability, a common but undecidable static analysis problem.

Given the increased expressiveness of MCFLs, one natural question pertains to their algorithmic complexity, i.e., \emph{how fast can MCFL reachability be computed?} We show that the problem takes $O(n^{2d+1})$ time on a graph of $n$ nodes when $r=1$, and in $O(n^{d(r+1)})$ time when $r>1$. Moreover, we show that when $r=1$, even the simpler membership problem has a lower bound of $n^{2d}$ based on the Strong Exponential Time Hypothesis, while reachability for $d=1$ has a lower bound of $n^{3}$ based on the combinatorial Boolean Matrix Multiplication Hypothesis. Thus, for $r=1$, our algorithm is optimal within a factor $n$ for all levels of the hierarchy based on the dimension $d$ (and fully optimal for $d=1$).

We implement our MCFL reachability algorithm and evaluate it on underapproximating interleaved Dyck reachability for a standard taint analysis for Android. When combined with existing overapproximate methods, MCFL reachability discovers \emph{all} tainted information on 8 out of 11 benchmarks, while it has remarkable coverage (confirming $94.3%$ of the reachable pairs reported by the overapproximation) on the remaining 3. To our knowledge, this is the first report of high and provable coverage for this challenging benchmark set.

Giovanna Kobus Conrado

Hong Kong University of Science and Technology

Hong Kong

Adam Husted Kjelstrøm

Aarhus University

Denmark

Jaco van de Pol

Aarhus University

Denmark

Andreas Pavlogiannis

Aarhus University

Denmark

Time Zone

The program is currently displayed in (GMT-07:00) Mountain Time (US & Canada).

Use conference time zone: (GMT-07:00) Mountain Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

Display full programSpecify a time band

Save

Session Program

Wed 22 Jan
Displayed time zone: Mountain Time (US & Canada) change

10:40 - 12:00	Program AnalysisPOPL at Peek-A-Boo Chair(s): Roland Leißa University of Mannheim, School of Business Informatics and Mathematics

10:40 20m Talk		Maximal Simplification of Polyhedral Reductions POPL Louis Narmour Colorado State University, University of Rennes, Inria, CNRS, IRISA, Tomofumi Yuki , Sanjay Rajopadhye Colorado State University
11:00 20m Talk		Program Analysis via Multiple Context Free Language Reachability POPL Giovanna Kobus Conrado Hong Kong University of Science and Technology, Adam Husted Kjelstrøm Aarhus University, Jaco van de Pol Aarhus University, Andreas Pavlogiannis Aarhus University
11:20 20m Talk		The Best of Abstract Interpretations POPL Roberto Giacobazzi University of Arizona, Francesco Ranzato University of Padova Pre-print
11:40 20m Talk		An Incremental Algorithm for Algebraic Program Analysis POPL Chenyu Zhou University of Southern California, Yuzhou Fang University of Southern California, Jingbo Wang Purdue University, Chao Wang University of Southern California