Ken Ming Lee

Reinforcement Learning for Retail Trajectory Prediction

Modelled in-store customer paths with maximum-entropy RL
Using real-world data, showed that RL trajectories match actual human paths more closely than TSP and PNN
Paper accepted at AAMAS 2026

Understanding customer movement within retail spaces is essential for optimizing store layouts. Real-world trajectory data can provide highly accurate insights, but collecting it is costly and often infeasible for many retailers. Heuristics such as Travelling Salesman Problem (TSP) and Probabilistic Nearest Neighbours (PNN) are commonly used as inexpensive approximations, but actual customer trajectories deviate by an average of 28% from shortest paths, highlighting a tradeoff between accuracy and practicality. We propose an agent-based modelling framework that casts customer trajectory prediction as a maximum entropy reinforcement learning (RL) problem, balancing reward maximization with stochasticity to better reflect customers with bounded rationality. Using real-world trajectory data from a convenience store, we show that RL-generated trajectories align more closely with customer behaviour than TSP and PNN, providing more accurate estimates of impulse purchase rates and shelf traffic densities. Furthermore, only RL-based predictions yield repositioning decisions for impulse products that align with those derived from actual trajectory data, resulting in comparable estimated profit gains. Our work demonstrates that RL provides a practical, behaviourally grounded alternative that bridges the gap between oversimplified heuristics and data-intensive approaches, making accurate layout optimization more accessible. To encourage further research, the source code is available on GitHub.
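To make the maximum-entropy idea concrete, here is a minimal sketch of soft value iteration on a toy grid. It is not the paper's implementation: the grid, the single goal cell, the step cost, and every name (`soft_value_iteration`, `ACTIONS`, `temperature`) are assumptions chosen for illustration. The temperature controls how far the agent strays from the shortest path. Values near zero approach greedy shortest-path behaviour, while larger values spread probability over detours.

```python
import math

# Hypothetical move set on a grid: up, down, right, left.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def soft_value_iteration(width, height, goal, temperature=0.5,
                         step_cost=1.0, gamma=0.95, sweeps=200):
    """Return a Boltzmann policy {state: [p(a) for a in ACTIONS]} for reaching `goal`.

    Assumes temperature * log(len(ACTIONS)) < step_cost; otherwise the entropy
    bonus outweighs the step cost and wandering becomes preferable to finishing.
    """
    states = [(x, y) for x in range(width) for y in range(height)]
    value = {s: 0.0 for s in states}  # the goal keeps value 0 (absorbing)

    def step(state, action):
        nx, ny = state[0] + action[0], state[1] + action[1]
        # Moves that would leave the grid keep the agent in place.
        return (nx, ny) if 0 <= nx < width and 0 <= ny < height else state

    def q_values(state):
        return [-step_cost + gamma * value[step(state, a)] for a in ACTIONS]

    for _ in range(sweeps):
        for s in states:
            if s == goal:
                continue
            q = q_values(s)
            m = max(q)
            # Soft Bellman backup: V(s) = tau * log sum_a exp(Q(s, a) / tau),
            # computed with the max subtracted for numerical stability.
            value[s] = m + temperature * math.log(
                sum(math.exp((qa - m) / temperature) for qa in q))

    policy = {}
    for s in states:
        if s == goal:
            continue
        q = q_values(s)
        m = max(q)
        weights = [math.exp((qa - m) / temperature) for qa in q]
        total = sum(weights)
        # pi(a | s) is proportional to exp(Q(s, a) / tau).
        policy[s] = [w / total for w in weights]
    return policy


if __name__ == "__main__":
    for tau in (0.05, 0.5):
        pi = soft_value_iteration(4, 4, goal=(3, 3), temperature=tau)
        print(tau, [round(p, 3) for p in pi[(2, 3)]])
```

Sampling actions from the returned policy yields trajectories that mostly head for the goal but occasionally detour, which is the qualitative behaviour the abstract attributes to customers with bounded rationality.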

Website GitHub

Interactive 3D Digital Twin of a Retail Store

First-person and bird's-eye views for immersive exploration
Heatmap overlays of TSP, PNN, RL, and human trajectories for various baskets
User-controlled path replay and basket checkout simulation

To keep the paper focused on its core contributions, store representations were deliberately simplified, and trajectory comparisons are presented as static heatmaps. These heatmaps are informative, but customer interaction with a store is fundamentally a human experience. To fully understand the hotspots in the heatmaps, and the gaps between actual human trajectories and those predicted by algorithms, one needs to see the heatmaps and paths firsthand.

Demo

Transformer Language Model from Scratch

Implemented a GPT-style transformer end-to-end using only PyTorch primitives
Built core components including BPE tokenization, pre-norm transformer blocks, and modern attention mechanisms
Implemented training, optimization, and text generation pipelines with configurable sampling strategies
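As an illustration of what "configurable sampling strategies" usually means in a text generation pipeline, the sketch below combines temperature scaling with optional top-k filtering. The project itself is written in PyTorch; this version uses only the Python standard library so that it runs anywhere, and the function name `sample_token` and its parameters are hypothetical rather than taken from the repository.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token index from raw logits.

    temperature <= 0 is treated as greedy decoding; top_k keeps only the
    k highest-scoring tokens before sampling.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])

    # Rank token indices from highest to lowest logit.
    indices = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        indices = indices[:top_k]

    # Softmax weights over the surviving tokens, with the max subtracted
    # for numerical stability; random.choices normalises the weights.
    scaled = [logits[i] / temperature for i in indices]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.1, 2.0, -1.0, 0.5]
    print([sample_token(logits, temperature=0.8, top_k=2, rng=rng) for _ in range(10)])
```

Lower temperatures concentrate probability on the highest-scoring tokens, and top-k removes the long tail entirely, so the two settings trade diversity against coherence in different ways.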

GitHub

RealmAI: ML-Driven Automated Game Playtesting Platform

Led development of an RL-based system to automate game playtesting and generate gameplay analytics
Built a one-click training pipeline with automatic hyperparameter tuning and model selection on top of Unity ML-Agents
Recognized as Distinguished Capstone Project

Website GitHub Documentation
