ETH AI Digest: #20
Constraint-satisfying neural networks, grammar-perfect code generation, and privacy law meets machine learning practice
In this week's digest:
Neural Networks with Hard Constraints: Πnet projects network outputs onto feasible regions using operator splitting, with shorter training time and better solution quality than existing methods on constrained optimization problems
Grammar-Perfect Code Generation: First constrained decoding method for diffusion models ensures syntactically valid C++ and JSON output, with near-perfect syntactic correctness and a computational overhead of 30-125%
Data Minimization Framework for ML: The DMML framework connects GDPR requirements with machine learning practice and systematically analyzes 13 techniques for privacy-preserving AI
Selected Papers of the Week
1. Πnet: Optimizing hard-constrained neural networks with orthogonal projection layers
Πnet: Projecting neural outputs onto convex constraints for faster, more accurate optimization.

✍️ Authors: Panagiotis D. Grontas, Antonio Terpin, Efe C. Balta, Raffaello D'Andrea, John Lygeros
🏛️ Lab: Automatic Control Laboratory, Institute for Dynamic Systems and Control
⚡ Summary
This paper introduces Πnet, a neural network architecture that guarantees satisfaction of convex constraints by projecting outputs onto feasible regions.
Πnet computes the projection in the forward pass with an operator splitting scheme and backpropagates through it using the implicit function theorem. Compared to existing methods, the authors report shorter training time and better solution quality.
The approach is demonstrated on benchmark optimization problems and multi-vehicle motion planning, showing its ability to handle both convex and non-convex objectives while maintaining constraint satisfaction.
The authors provide a GPU-ready implementation with effective tuning heuristics, making Πnet accessible for real-world applications.
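To make the idea of a projection layer concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and does not reproduce the paper's splitting scheme: it swaps in Dykstra's alternating projection algorithm, a simple method that also splits a hard projection into two easy ones. The constraint set (one hyperplane intersected with a box) and all function names are assumptions chosen for illustration.

```python
def project_hyperplane(z, a, b):
    """Euclidean projection of z onto the hyperplane {x : a.x = b}."""
    scale = (sum(ai * zi for ai, zi in zip(a, z)) - b) / sum(ai * ai for ai in a)
    return [zi - scale * ai for ai, zi in zip(a, z)]


def project_box(z, lower, upper):
    """Euclidean projection of z onto the box [lower, upper]^n."""
    return [min(max(zi, lower), upper) for zi in z]


def project_intersection(y, a, b, lower, upper, iters=200):
    """Approximate the projection of y onto {a.x = b} intersected with the box.

    Dykstra's algorithm (an illustrative stand-in, not the paper's method):
    alternate the two simple projections while carrying correction terms
    p and q, so the iterates approach the projection of y onto the
    intersection rather than an arbitrary feasible point.
    """
    x = list(y)
    p = [0.0] * len(y)  # correction term for the hyperplane step
    q = [0.0] * len(y)  # correction term for the box step
    for _ in range(iters):
        z = [xi + pi for xi, pi in zip(x, p)]
        h = project_hyperplane(z, a, b)
        p = [zi - hi for zi, hi in zip(z, h)]
        w = [hi + qi for hi, qi in zip(h, q)]
        x = project_box(w, lower, upper)
        q = [wi - xi for wi, xi in zip(w, x)]
    return x


# Hypothetical raw network output that violates both constraints.
raw_output = [0.9, 0.8, -0.5]
feasible = project_intersection(raw_output, a=[1.0, 1.0, 1.0], b=1.0,
                                lower=0.0, upper=1.0)
```

For this input the result is approximately [0.55, 0.45, 0.0], which lies in the box and sums to 1. A trainable layer would additionally need gradients through this map, which the paper obtains via the implicit function theorem; that part is not sketched here.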
2. Constrained Decoding of Diffusion LLMs with Context-Free Grammars
Ensuring syntactically perfect code generation by constraining diffusion language models with context-free grammars.

✍️ Authors: Niels Mündler, Jasper Dekoninck, Martin Vechev
🏛️ Lab: Secure, Reliable, and Intelligent Systems Lab
⚡ Summary
Current diffusion language models cannot guarantee that their output conforms to a formal language, such as the syntax of a programming language, which limits their practical utility in code generation.
This paper presents the first constrained decoding method for diffusion models that handles context-free grammars, ensuring outputs like C++ code and JSON data are syntactically valid.
By reducing constrained decoding to an infilling problem and developing efficient algorithms to check language intersections, the authors achieve near-perfect syntactic correctness while improving functional correctness.
The approach maintains reasonable computational overhead (30-125%), making it practical for real-world applications in software development and structured data extraction.
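The infilling view can be illustrated with a toy example. The sketch below is not the paper's algorithm, which handles general context-free grammars. It assumes a single tiny grammar (balanced parentheses), a template in which `?` marks a position that has not been decoded yet, and hypothetical function names. It checks whether a partially filled template can still be completed to a valid string, and uses that check to filter the tokens allowed at a masked position.

```python
def can_complete(template):
    """Return True if every '?' can be replaced by '(' or ')' so that the
    whole string is balanced. Tracks the set of reachable nesting depths."""
    depths = {0}
    for ch in template:
        if ch == "(":
            steps = (1,)
        elif ch == ")":
            steps = (-1,)
        elif ch == "?":
            steps = (1, -1)  # a masked position may become either token
        else:
            raise ValueError(f"unexpected character: {ch!r}")
        # Depth may never go negative: that would close an unopened bracket.
        depths = {d + s for d in depths for s in steps if d + s >= 0}
        if not depths:
            return False
    return 0 in depths


def allowed_tokens(template, pos):
    """Tokens that can be placed at a masked position without making the
    template impossible to complete."""
    if template[pos] != "?":
        raise ValueError("position is already decoded")
    return [tok for tok in "()"
            if can_complete(template[:pos] + tok + template[pos + 1:])]


# Positions can be filled in any order, as in diffusion-style decoding.
first = allowed_tokens("?(?)", 0)   # only "(" keeps the string completable
middle = allowed_tokens("(??)", 1)  # both "(" and ")" remain possible
```

A decoder would mask out every token not returned by `allowed_tokens` before sampling, so an invalid completion can never be produced. The extra check at each step is one source of the overhead mentioned above.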
3. SoK: Data Minimization in Machine Learning
Bridging regulatory requirements with machine learning practices through a comprehensive data minimization framework.

✍️ Authors: Robin Staab, Nikola Jovanović, Kimberly Mai, Prakhar Ganesh, Martin Vechev, Ferdinando Fioretto, Matthew Jagielski
🏛️ Lab: Secure, Reliable, and Intelligent Systems Lab
⚡ Summary
This paper addresses the disconnect between data minimization regulations (like GDPR) and machine learning practices by introducing a unified framework for Data Minimization in Machine Learning (DMML).
The authors systematically analyze 13 ML techniques that implicitly provide data minimization benefits, categorizing them along dimensions including type of minimization, points of application, and privacy guarantees.
Their framework defines actors, pipelines, adversaries, and evaluation metrics, helping practitioners understand which techniques satisfy regulatory requirements.
By bridging technical implementations with regulatory principles, this work enables more effective privacy-preserving machine learning while maintaining utility.
Other noteworthy articles
Robust-Sub-Gaussian Model Predictive Control for Safe Ultrasound-Image-Guided Robotic Spinal Surgery: Novel control framework handles image-based uncertainties with safety guarantees for robotic spine surgery