Our research is motivated by an interest in the fundamental theory and mathematics that underpin modern machine learning and statistical signal processing methodology. For instance, the group has a rich history of working on structural characterizations of Partially Observed Markov Decision Processes (POMDPs), which serve as the mathematical backbone for current reinforcement learning research. In general, we are interested in using deep mathematical tools (e.g., from measure-theoretic probability, statistical learning theory, dynamical systems, or graph theory) to advance state-of-the-art techniques in statistical signal processing, stochastic control, and stochastic optimization. It is also important to ground our research in practical applicability, and thus there are several application areas in which our work tends to be embedded. These research applications can be clustered into three main themes:
- Behavioral Economics, Statistical Signal Processing & Machine Learning – the human sensor interface: How does human decision-making interface with signal processing algorithms? How to optimize sensing under human behavioral constraints? (Stochastic Optimization, POMDPs, Reinforcement Learning and Computational Game Theory)
- Social Network Analytics and Sensing – Network Science with Reinforcement Learning: How to use social networks as a real-time adaptive sensor? How to model & control the dynamics of information flow in social networks? (Stochastic Control, Graph Theory, Dynamical Systems)
- Adversarial Sensing – How should cognitive radars autonomously reconfigure their measurement modes based on their Bayesian estimates? How to calibrate your adversary’s sensor capabilities and intent? How to detect cognition? How to detect coordination among multiple adversarial agents? (Microeconomics and Inverse Reinforcement Learning)
Fundamental Research Areas
Partially Observed Markov Decision Processes (POMDPs) and Controlled Sensing
How to build smart reconfigurable sensors that dynamically adapt their behavior over time? This is a partially observed stochastic control problem. Examples include cognitive radars, sensor scheduling and cognitive radio. Our research in POMDPs focuses mainly on structural results – that is, characterizing the optimal policy using powerful ideas from supermodularity and stochastic dominance – without brute-force computation.
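A minimal sketch of the kind of structural result described above: in a classic two-state "machine replacement" POMDP, the optimal policy is a threshold in the belief, so it can be characterized by a single number rather than computed by brute force. All parameters below (failure probability, costs, discount factor) are hypothetical, and the belief dynamics are simplified to one-step prediction without observations.

```python
import numpy as np

# Two-state machine replacement POMDP (hypothetical parameters).
# State 0 = working, 1 = failed; action 0 = continue, 1 = replace.
# The belief b = P(machine has failed); replacing resets b to 0.
P_FAIL = 0.1      # chance a working machine fails each step
C_RUN_BAD = 1.0   # per-step cost of running a failed machine
C_REPLACE = 0.5   # one-off replacement cost
GAMMA = 0.95      # discount factor

beliefs = np.linspace(0.0, 1.0, 201)   # discretized belief space

def predict(b):
    # One-step belief prediction under "continue" (observation-free sketch).
    return b + (1.0 - b) * P_FAIL

# Value iteration over the belief space.
V = np.zeros_like(beliefs)
for _ in range(500):
    V_cont = C_RUN_BAD * beliefs + GAMMA * np.interp(predict(beliefs), beliefs, V)
    V_repl = C_REPLACE + GAMMA * np.interp(0.0, beliefs, V)
    V = np.minimum(V_cont, V_repl)

# Greedy policy: 1 = replace. It comes out monotone in the belief,
# i.e., a single threshold b*: continue below it, replace above it.
V_cont = C_RUN_BAD * beliefs + GAMMA * np.interp(predict(beliefs), beliefs, V)
V_repl = C_REPLACE + GAMMA * np.interp(0.0, beliefs, V)
policy = (V_cont > V_repl).astype(int)
assert np.all(np.diff(policy) >= 0)    # monotone: threshold structure
```

The structural-results program takes this further: supermodularity and stochastic-dominance conditions on the costs and transition kernel guarantee such threshold policies exist without ever running the value iteration above.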
Inverse Reinforcement Learning and Stochastic Optimization
We are interested in synthesizing and analyzing stochastic approximation algorithms for distributed optimization and game-theoretic learning. How quickly can a constant step-size stochastic approximation track a time-varying parameter or, more generally, the equilibria of a time-varying game? Stochastic averaging theory is the main tool we use to analyze such algorithms. One of our major results is that if the parameter jumps according to a Markov chain with a slow transition matrix, then the asymptotic behavior of the stochastic approximation is captured by a Markovian switched ordinary differential equation or differential inclusion.
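The tracking question above can be illustrated with a toy simulation (all parameters hypothetical): a constant step-size update follows a parameter that jumps between two regimes according to a slow two-state Markov chain. When the jump rate is much smaller than the step size, the iterate locks onto whichever regime is active, consistent with the switched-ODE limit.

```python
import numpy as np

rng = np.random.default_rng(0)

MU = 0.05              # constant step size
EPS = 1e-4             # per-step jump probability (slow chain: EPS << MU)
TARGETS = [0.0, 5.0]   # regime-dependent true parameter values
N = 200_000

state, theta = 0, 0.0
err = []
for n in range(N):
    if rng.random() < EPS:               # rare Markov jump between regimes
        state = 1 - state
    y = TARGETS[state] + rng.normal()    # noisy observation of active target
    theta += MU * (y - theta)            # constant step-size SA update
    err.append(abs(theta - TARGETS[state]))

# After a burn-in, the iterate hovers near the active regime's target,
# with excursions of duration ~1/MU after each jump.
mean_err = np.mean(err[N // 2:])
```

Because EPS << MU, the chain is quasi-static on the algorithm's time scale, so the averaged dynamics reduce to the ODE dθ/dt = θ*(s) − θ with a right-hand side that switches with the chain state s, as in the result stated above.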
Application Areas of Recent Interest