Welcome to the Meel Group’s web page. We are based at the National University of Singapore.
Broadly, our research group focuses on two thrusts: (i) the development of a rigorous mathematical framework for provably verified, correct-by-construction, explainable probabilistic reasoning systems, building on recent breakthroughs in formal methods and symbolic reasoning; and (ii) the deployment of our algorithmic frameworks in the real world, with applications in reliability estimation of critical infrastructure and computational sustainability.
At a more technical level, we work at the intersection of formal methods and artificial intelligence. We seek to develop formal methods for AI techniques, and also to employ advances in AI for the development of algorithmic and system frameworks in formal methods.
Two observations motivate this research theme:
O1: The success of SAT/SMT/MaxSAT solvers has allowed researchers to employ them as “oracles” inside larger algorithmic components, which gives rise to rich structure in the queries issued to the solvers, and
O2: Modern SAT/SMT solvers achieve scalability and robustness with sophisticated heuristics that are challenging to understand and explain.
O1 necessitates the design of solvers that can exploit the rich structure of queries without losing the generality offered by CNF. Our recent work has proposed new paradigms for CNF+XOR constraints [aaai19] and CNF+ANF constraints [date19].
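To make the structure in O1 concrete, the sketch below (a toy illustration in Python; cnf_plus_xor_query and the clause/XOR representation are our own placeholders, not a real solver API) shows how hashing-based algorithms generate CNF+XOR queries: the CNF part comes from the original problem, while each XOR is a random parity constraint drawn from a universal hash family.

    # Sketch: how hashing-based counting gives rise to CNF+XOR queries.
    import random

    def random_xor(num_vars):
        """Pick each variable with probability 1/2, plus a random parity bit."""
        vars_in_xor = [v for v in range(1, num_vars + 1) if random.random() < 0.5]
        return vars_in_xor, random.randint(0, 1)

    def cnf_plus_xor_query(cnf_clauses, num_vars, num_xors):
        """Conjoin the original CNF with `num_xors` random XOR constraints.
        A CNF+XOR-aware solver can treat the XORs natively (e.g. via
        Gauss-Jordan elimination) instead of translating them into CNF."""
        xors = [random_xor(num_vars) for _ in range(num_xors)]
        return cnf_clauses, xors

    # Example: clause (x1 OR NOT x2) over 3 variables, plus 2 random XORs.
    print(cnf_plus_xor_query([[1, -2]], num_vars=3, num_xors=2))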
A consequence of O2 is that the development of new algorithmic insights has been largely restricted to expert intuition, and the evaluation of new insights has been restricted to measuring solver runtime or a proxy for it. In this project, our focus is to enable data-driven design and understanding of SAT solvers. In particular, we view modern conflict-driven clause learning (CDCL) solvers as a composition of classifiers and regressors for tasks such as branching, clause memory management, and restarting. We employ supervised learning on extensive, multi-gigabyte data extracted from runs of a single SAT solver to perform predictive analytics. Our [SAT-19] paper presents the first version of the system, called CrystalBall, that we are building.
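As a sketch of this classifier view, clause memory management can be cast as predicting whether a learned clause will be useful later. The features and labels below are illustrative stand-ins for the kind of per-clause data one can log from CDCL runs, not CrystalBall’s actual schema.

    # Toy supervised-learning view of clause memory management.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    # Synthetic features per learned clause: [size, LBD ("glue"), recent activity]
    X = rng.random((1000, 3))
    # Synthetic label: 1 if the clause was later used in conflict analysis.
    y = (X[:, 1] < 0.3).astype(int)  # toy rule: low-LBD clauses tend to be useful

    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    # Such a predictor could replace a hand-tuned "keep if LBD <= k" heuristic.
    print(clf.predict([[0.5, 0.1, 0.9]]))  # low LBD -> likely predicted useful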
Constrained sampling and counting are two fundamental problems in data analysis. In constrained sampling, the task is to sample randomly, subject to a given weighting function, from the set of solutions to a given set of constraints. In constrained counting, the task is to compute the total weight, subject to a given weighting function, of the set of solutions of the given constraints. Both problems have applications in machine learning, probabilistic reasoning, planning, and hardware and software testing, among other areas.
Our focus is on the development of new algorithmic techniques for constrained sampling and counting that combine the classical technique of universal hashing with advances in Boolean reasoning. Many of the ideas underlying this approach go back to the 1980s, but they had never been reduced to practice. Recent progress in Boolean reasoning is enabling us to reduce these algorithmic ideas to practice and obtain breakthrough results in constrained sampling and counting, providing a new algorithmic toolbox for design verification, machine learning, probabilistic reasoning, and the like.
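The sketch below conveys the core loop of hashing-based approximate counting under simplifying assumptions: random XOR constraints partition the solution space into cells, a small cell is counted exactly, and the count is scaled back up. The brute-force enumeration stands in for the SAT oracle, which is where the actual scalability comes from; the threshold and trial count are illustrative, not those of our published algorithms.

    # Toy hashing-based approximate counter (brute force stands in for SAT).
    import itertools, random

    def sat_cnf(a, clauses):
        return all(any(a[abs(l) - 1] == (l > 0) for l in c) for c in clauses)

    def sat_xors(a, xors):
        return all(sum(a[v - 1] for v in vs) % 2 == p for vs, p in xors)

    def approx_count(clauses, num_vars, thresh=4, trials=9):
        """Add random XORs until at most `thresh` solutions survive,
        then return the median of (survivors * 2^m) over several trials."""
        estimates = []
        for _ in range(trials):
            for m in range(num_vars + 1):
                xors = [([v for v in range(1, num_vars + 1) if random.random() < 0.5],
                         random.randint(0, 1)) for _ in range(m)]
                cell = [a for a in itertools.product([False, True], repeat=num_vars)
                        if sat_cnf(a, clauses) and sat_xors(a, xors)]
                if len(cell) <= thresh:
                    estimates.append(len(cell) * 2 ** m)
                    break
        return sorted(estimates)[len(estimates) // 2]

    # (x1 OR x2) AND (NOT x1 OR x3) over 4 variables has 8 solutions.
    print(approx_count([[1, 2], [-1, 3]], num_vars=4))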
We are focused on the verification of modern AI systems, which differ significantly from traditional systems in their reliance on probabilistic reasoning. Statistical tests are often employed to argue for the accuracy and robustness of these systems, but such tests are usually performed on a small number of samples, with no theoretical guarantees on their accuracy.
In contrast to testing of traditional hardware and software systems, where a single trace suffices to prove the existence of a bug, one sample is typically not sufficient to prove non-conformity of a sampler. Our recent work [aaai19] presented, to the best of our knowledge, the first algorithmic framework, Barbarik, to test whether the distribution generated by a sampler is close to the uniform distribution.
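For contrast, the naive baseline that Barbarik improves upon can be sketched as follows: draw many samples and compare empirical frequencies against uniform. The sample complexity of this check grows with the size of the solution space, which is precisely the cost that a conditioning-based test such as Barbarik’s avoids; the sampler below is a made-up stand-in.

    # Naive uniformity check for a sampler (baseline, NOT Barbarik's algorithm).
    import random
    from collections import Counter

    solutions = list(range(8))  # pretend these are a formula's 8 solutions

    def biased_sampler():
        """A deliberately non-uniform 'sampler' under test."""
        return random.choice(solutions + [0, 0])  # over-samples solution 0

    def naive_uniformity_check(sampler, n_samples=20000, tol=0.02):
        counts = Counter(sampler() for _ in range(n_samples))
        target = 1 / len(solutions)
        return all(abs(counts[s] / n_samples - target) <= tol for s in solutions)

    print(naive_uniformity_check(biased_sampler))                    # False
    print(naive_uniformity_check(lambda: random.choice(solutions)))  # True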
Interpretability has become a central thread in ML research. As ML algorithms continue to permeate critical application domains such as medicine, law, and transportation, it becomes increasingly important to allow human domain experts to understand and interact with ML solutions. Providing meaningful explanations for automated decisions is an increasingly important task. In this research theme, we are investigating techniques for two categories of explanations:
Interpretable explanations, such as those specified by a small set of rules/formulas. In particular, we have designed a MaxSAT-based formulation for learning interpretable rules in CNF/DNF [cp18] and later extended this formulation by incorporating incremental learning of small rules [aies19]; a sketch of such a rule appears after this list.
In settings where the cognitive burden of complete explanations is too high, people often do not seek to know why a particular event happened, but rather why an alternative one didn’t. Because explanations produced by existing tools are heuristic in nature, they must be validated, and our approach applies developments in formal methods to aid this validation. As an example, our recent work [sat19] proposed a counting-based method to rigorously assess the quality of explanations.
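To illustrate the first category, the sketch below shows what an interpretable CNF rule looks like as a classifier: a sample is labeled positive iff it satisfies every clause. The rule and feature names are toy illustrations, not output of the MaxSAT formulation from [cp18].

    # A hand-written CNF rule acting as a classifier (toy example).
    # Each clause is a list of (feature, expected_value) literals; a clause
    # holds if at least one literal matches, and the rule is the AND of clauses.
    RULE_CNF = [
        [("fever", True), ("cough", True)],              # fever OR cough
        [("age_over_60", False), ("vaccinated", True)],  # under 60 OR vaccinated
    ]

    def classify(sample):
        return all(any(sample[f] == v for f, v in clause) for clause in RULE_CNF)

    patient = {"fever": True, "cough": False, "age_over_60": True, "vaccinated": True}
    print(classify(patient))  # True: both clauses are satisfied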
A hashing-based algorithm for discrete integration over finite domains.
A framework to provide white-box access to the execution of a SAT solver.
An algorithm to generate uniform samples subject to a given set of constraints.
GANAK: A Scalable Probabilistic Exact Model Counter
WAPS: Weighted and Projected Sampling
An ANF and CNF simplifier and converter.
Knowledge Compilation meets Uniform Sampling
An algorithm to compute a minimal independent support for a given CNF formula.
An approximate model counter for the theory of bit-vectors.
A hashing-based approximate sampler for weighted CNF formulas.
A weighted model counter over Boolean domains.
We are always looking for highly motivated Ph.D. students, research assistants, and exceptional undergraduates for summer internships in our group. We work at the intersection of algorithm design and systems; an ideal candidate should therefore have deep expertise in one area and a willingness to learn the other. A strong background in statistics and algorithms/formal methods, along with prior coding experience, is crucial to making a significant contribution to our research.