Seminars & Colloquia Calendar


DIMACS Theory of Computing Seminar

A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem

Steven Wu, Microsoft Research, NYC

Location:  CoRE 301
Date & time: Wednesday, 18 October 2017, 11:00 AM - 12:00 PM

Abstract:   Bandit learning models the common setting in which the decisions of an algorithm feed back into its training data, and it cannot observe counterfactuals. Such settings include criminal recidivism prediction (would an inmate have committed another crime had he been released?), lending (would an applicant denied a loan have paid it back?), and drug trials (how would a patient have responded to a different drug?). The main tension in bandit learning is balancing exploration with exploitation. However, exploration, which explicitly sacrifices welfare today in exchange for potential future benefit, can be distasteful when decisions pertain to individuals. For example, it might be considered unethical to give drug A to a patient merely to learn about drug A's efficacy while knowing that drug B often cures the illness. In other settings (such as predictive policing), a lack of exploration has been blamed for perpetuating unfairness. This motivates studying the performance of the greedy algorithm, which only exploits and never explores. Although the greedy algorithm is not a no-regret algorithm in general, we show that when adversarially selected contexts are perturbed by a small amount of Gaussian noise, it recovers strong no-regret bounds.
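To make the setting concrete, below is a minimal sketch (not taken from the talk) of a greedy linear contextual bandit with smoothed contexts: each round, contexts are perturbed by Gaussian noise, and the learner plays the arm with the highest reward estimate under per-arm ridge regression, with no exploration bonus. The dimensions, the uniform base-context distribution, the noise scale `sigma`, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T, sigma = 5, 3, 2000, 0.1   # assumed: dims, arms, rounds, perturbation scale

theta = rng.normal(size=(K, d))    # true arm parameters, unknown to the learner

# Per-arm ridge-regression state (Gram matrix and response vector)
A = np.stack([np.eye(d)] * K)
b = np.zeros((K, d))

regret = 0.0
for t in range(T):
    base = rng.uniform(-1, 1, size=(K, d))      # "adversarial" contexts (random here)
    x = base + sigma * rng.normal(size=(K, d))  # smoothing: small Gaussian perturbation
    # Greedy choice: highest estimated reward, no exploration term
    est = np.array([x[k] @ np.linalg.solve(A[k], b[k]) for k in range(K)])
    k = int(np.argmax(est))
    reward = x[k] @ theta[k] + 0.05 * rng.normal()
    # Update only the played arm's regression state
    A[k] += np.outer(x[k], x[k])
    b[k] += reward * x[k]
    regret += max(x[j] @ theta[j] for j in range(K)) - x[k] @ theta[k]
```

The talk's result concerns this kind of dynamic: without the perturbation, an adversary can trap the greedy learner on a suboptimal arm, but the added noise supplies enough incidental exploration for the regret to stay small.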


Special Note to All Travelers

Directions: see the map and driving directions. For information on public transportation, check the New Jersey Transit page.

Unfortunately, cancellations do occur from time to time. Feel free to call our department at 848-445-6969 before embarking on your journey. Thank you.