The Digital Initiative Presents

Dynamic marketing policies: constructing Markov states for reinforcement learning

12:00 pm March 4, 2020

HBS, Cotting House Conference Room, Room 107

Abstract

Many firms want to target their customers with a sequence of marketing actions, rather than just a single action. We interpret sequential targeting problems as Markov Decision Processes (MDPs), which can be solved using a range of Reinforcement Learning (RL) algorithms. MDPs require the construction of Markov state spaces. These state spaces summarize the current information about each customer in each time period, so that movements over time between Markov states describe customers’ dynamic paths. The Markov property requires that the states are “memoryless”: future outcomes depend only upon the current state, not upon earlier states. Even small breaches of this property can dramatically undermine the performance of RL algorithms. Yet most methods for designing states, such as grouping customers by the recency, frequency, and monetary value of past transactions (RFM), are not guaranteed to yield Markov states. We propose a method for constructing Markov states from historical transaction data by adapting an approach from the computer science literature. Rather than designing states in transaction space, we construct predictions of how customers will respond to a firm’s marketing actions. We then design states using these predictions, grouping customers together if their predicted behavior is similar. To make this approach computationally tractable, we adapt the method to exploit a common feature of transaction data: sparsity. As a result, a problem that faces computational challenges in many settings becomes more feasible in a marketing setting. The method is straightforward to implement, and the resulting states can be used in standard RL algorithms. We evaluate the method using a novel validation approach. The findings confirm that the constructed states satisfy the Markov property and are robust to the introduction of non-Markov distortions in the data. Co-authored with Duncan Simester.
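The core idea of grouping customers by their predicted responses, rather than by raw transaction features, can be illustrated with a minimal sketch. This is not the authors' algorithm: the equal-width binning rule, the predicted purchase probabilities, and the function names below are hypothetical stand-ins for whatever prediction model and similarity grouping the paper actually uses.

```python
import numpy as np

def build_states(pred_responses, n_bins=4):
    """Group customers whose predicted responses to each action are similar.

    pred_responses: (n_customers, n_actions) array of predicted purchase
    probabilities under each candidate marketing action (hypothetical inputs;
    in practice these would come from a model fit on transaction data).
    Customers that fall into the same probability bin for every action are
    assigned the same state.
    """
    # Discretize each predicted probability into n_bins equal-width bins.
    bins = np.clip((pred_responses * n_bins).astype(int), 0, n_bins - 1)
    # A state is identified by the tuple of bin indices across all actions.
    state_ids = {}
    states = []
    for row in bins:
        key = tuple(row)
        if key not in state_ids:
            state_ids[key] = len(state_ids)
        states.append(state_ids[key])
    return np.array(states)

rng = np.random.default_rng(0)
preds = rng.random((1000, 2))   # 1000 customers, 2 candidate actions
states = build_states(preds)    # one integer state label per customer
```

Grouping in prediction space like this is what makes the Markov property plausible: two customers in the same state are, by construction, predicted to respond to the firm's actions in the same way, regardless of how they arrived there.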

Speaker bio

Yuting Zhu is a Ph.D. candidate in marketing at the MIT Sloan School of Management.
