On Feb 12, 2024, Wu visited the University of Kansas and gave an invited talk in the I2S Speaker Series.
Talk title: Achieving causal fairness in bandit-based recommendation
Abstract: In online recommendation, customers arrive sequentially and stochastically from an underlying distribution, and the online decision model recommends an item to each arriving individual conditional on the state of the environment. It is imperative to develop online recommendation algorithms that maximize the expected reward while achieving user-side fairness, i.e., customers who share similar profiles receive similar rewards regardless of their sensitive attributes and the items being recommended. We study how to leverage offline data, incorporate causal inference, and adopt soft intervention to model the item selection strategy in contextual bandits. We present the d-separation based UCB algorithm (D-UCB), which reduces the amount of exploration needed to achieve low cumulative regret, and the fair causal bandit (F-UCB) for achieving counterfactual individual fairness. As offline data often contain confounding and selection biases, ignoring these biases in causal bandits can negatively affect the performance of online recommendation. We present approaches for estimating conditional causal effects and deriving their bounds in the presence of compound biases, and further study how the derived causal bounds affect regret analysis in contextual bandits.
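For readers unfamiliar with the bandit setting in the abstract, below is a minimal sketch of a standard LinUCB-style contextual bandit loop: contexts arrive one at a time, each item keeps ridge-regression statistics, and the item with the highest upper confidence bound is recommended. This is only an illustrative baseline, not the D-UCB or F-UCB algorithms from the talk; all names, dimensions, and the synthetic reward model are assumptions.

```python
# Minimal LinUCB-style contextual bandit loop (illustrative baseline only;
# NOT the D-UCB or F-UCB algorithms described in the talk).
import numpy as np

rng = np.random.default_rng(0)
n_items, dim, alpha, horizon = 5, 4, 1.0, 2000

# Per-item ridge-regression statistics: A = I + sum x x^T, b = sum r x.
A = [np.eye(dim) for _ in range(n_items)]
b = [np.zeros(dim) for _ in range(n_items)]

# Hidden item parameters, used only to simulate customer feedback.
theta_true = rng.normal(size=(n_items, dim))

total_reward = 0.0
for t in range(horizon):
    x = rng.normal(size=dim)          # arriving customer's context/profile
    x /= np.linalg.norm(x)

    # Upper confidence bound per item: reward estimate + exploration bonus.
    ucb = []
    for k in range(n_items):
        A_inv = np.linalg.inv(A[k])
        theta_hat = A_inv @ b[k]
        bonus = alpha * np.sqrt(x @ A_inv @ x)
        ucb.append(theta_hat @ x + bonus)
    k = int(np.argmax(ucb))           # recommend the item with the highest UCB

    r = theta_true[k] @ x + 0.1 * rng.normal()   # simulated reward feedback
    A[k] += np.outer(x, x)            # update the chosen item's statistics
    b[k] += r * x
    total_reward += r

print(f"average reward over {horizon} rounds: {total_reward / horizon:.3f}")
```

The D-UCB and F-UCB algorithms in the talk modify this kind of loop by using causal structure (d-separation, soft interventions) to shrink the exploration needed and by constraining item selection to satisfy counterfactual individual fairness.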
More information can be found here.
A note: it was a very interesting experience watching Super Bowl LVIII (San Francisco 49ers vs. Kansas City Chiefs) on TV and joining the celebrations in downtown Lawrence (after the Chiefs won in overtime) on Feb 11.