Bridging Theory and Real-World Impact
We advance the foundations of deep learning and reinforcement learning while ensuring our methods deliver measurable impact in healthcare, robotics, transportation, manufacturing, and finance.
The Dynamic Optimization and Reinforcement Learning Lab builds principled, data-driven methods for intelligent systems that learn, plan, and adapt — from supply chains and healthcare to robotics and finance.
Grounded in Markov decision processes, approximate dynamic programming, and bandit theory — and extended through deep and inverse reinforcement learning — our work develops scalable, reliable algorithms for the decisions that matter.
The Dynamic Optimization & Reinforcement Learning Lab (DORL) advances fundamental research in data-driven intelligence and dynamic decision making. Our work bridges theory and practice through innovations in deep learning, reinforcement learning, and dynamic optimization, with the goal of developing efficient, generalizable, and safe AI systems.
We study the foundations of learning under uncertainty, from self-supervised and continual learning to safe and transfer reinforcement learning. We apply these principles to complex real-world systems in supply chain optimization, intelligent healthcare, autonomous robotics, building management, transportation, manufacturing, and finance.
Embedded within the University of Toronto's interdisciplinary research ecosystem, DORL publishes in leading machine learning and applied AI venues while contributing to intelligent systems that shape the future of technology and society.
We develop robust algorithms for decision making in dynamic, uncertain environments — from safe RL to continual and online learning — enabling intelligent systems that adapt over time.
Our mission is to push AI toward efficiency, scalability, and generalization — essential components of future intelligent systems capable of operating across diverse domains.
Active threads across the lab, each with deep theoretical grounding and live applications.
Safe RL, multi-agent RL, risk-sensitive and longevity-aware agents, inverse RL, duality & occupancy-measure formulations, and general-utility objectives.
Hierarchical representation learning, graph neural networks, constrained generative models, discrete diffusion for structured outputs, and out-of-distribution generalization.
Markov decision processes, approximate dynamic programming, combinatorial optimization, scheduling, and mechanism design for self-interested multi-agent systems.
Language-conditioned robot learning, multimodal manipulation, human-robot interaction, sim-to-real transfer, and long-horizon autonomy for real-world tasks.
Vision-language-action models, LLM-based decision making, neurosymbolic AI, and interpretable methods for complex sequential decision-making tasks.
Online learning, test-time training and adaptation, robust time series analysis, continual learning, and belief revision in dynamic environments.
A single PI leading a cohort of PhD researchers spanning reinforcement learning theory, robotics, generative AI, graph learning, and applied operations research.
We welcome prospective students, collaborators, and industry partners exploring new applications of dynamic optimization and reinforcement learning.