Bandit models are widely used to capture learning in contexts where agents repeatedly choose actions with uncertain rewards. Examples include firms maximizing profits by experimenting with prices or advertising, randomized controlled trials maximizing outcomes by evaluating alternative treatments, and consumers maximizing utility by trying experience goods. A popular bandit algorithm is the upper confidence bound (UCB) algorithm, which requires sub-Gaussian concentration parameters as inputs. In practice, these parameters are unknown, so the UCB algorithm is not fully data-driven. I propose a method to estimate these parameters and use the estimated parameters to conduct inference with Hoeffding's inequality. I show that asymptotic inference with estimated parameters is valid under mild conditions and optimal under stronger conditions. In finite samples, I establish the validity of inference under an anti-concentration condition. Equipped with the proposed estimator for sub-Gaussian concentration parameters, I adapt the UCB algorithm to settings where these parameters are unknown. In an empirical application, I study price experimentation after the liberalization of the spirits market in Washington State in 2012 and find that the adapted UCB algorithm leads to considerably higher profits. My theoretical results can also be applied to non-standard inference problems that arise in partial identification and machine learning.
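To make the role of the sub-Gaussian parameter concrete, the following is a minimal sketch of a UCB rule whose confidence radius comes from Hoeffding's inequality. It is not the estimator proposed in the paper: the plug-in `sigma_hat` below is simply the sample standard deviation of each arm's rewards, used as a labeled placeholder, and the names `hoeffding_radius` and `run_ucb` are hypothetical.

```python
import math
import random


def hoeffding_radius(sigma, n_pulls, delta):
    """Half-width t solving 2 * exp(-n * t^2 / (2 * sigma^2)) = delta,
    the two-sided Hoeffding bound for the mean of n sigma-sub-Gaussian draws."""
    return sigma * math.sqrt(2.0 * math.log(2.0 / delta) / n_pulls)


def run_ucb(arms, horizon, delta=0.05, init_pulls=2, seed=0):
    """Run a UCB rule with a plug-in concentration parameter.

    `arms` is a list of callables; each takes a random.Random instance and
    returns one reward. Returns the number of pulls per arm.
    ASSUMPTION: sigma_hat is the sample standard deviation, a stand-in for
    a proper estimator of the sub-Gaussian parameter.
    """
    rng = random.Random(seed)
    # Pull every arm init_pulls times so a sample variance exists.
    rewards = [[arm(rng) for _ in range(init_pulls)] for arm in arms]
    for _ in range(horizon - init_pulls * len(arms)):
        indices = []
        for obs in rewards:
            n = len(obs)
            mean = sum(obs) / n
            var = sum((x - mean) ** 2 for x in obs) / (n - 1)
            sigma_hat = math.sqrt(var)  # placeholder plug-in estimate
            indices.append(mean + hoeffding_radius(sigma_hat, n, delta))
        # Pull the arm with the largest upper confidence bound.
        best = max(range(len(arms)), key=indices.__getitem__)
        rewards[best].append(arms[best](rng))
    return [len(obs) for obs in rewards]


if __name__ == "__main__":
    # Two hypothetical price arms with Gaussian profits.
    arms = [lambda r: r.gauss(0.0, 0.5), lambda r: r.gauss(1.0, 0.5)]
    print(run_ucb(arms, horizon=500))
```

With a known `sigma` the same index reduces to the textbook rule; the sketch only shows where an estimated parameter would enter, and a sample standard deviation can understate the true sub-Gaussian parameter for non-Gaussian rewards.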
Although market shares are frequently estimated, commonly applied methods for demand estimation are not robust to estimation error in these shares. While non-negligible estimation error in market shares always introduces bias in the demand parameter estimators, the issue becomes most salient when market shares are estimated as zero. In the presence of zero shares, widely applied estimators of the random coefficient logit model cannot be computed without ad hoc data manipulations. This paper proposes a new estimator of demand parameters for settings with endogenous prices and estimated market shares that is robust to zero-valued market shares. The estimator generalizes the constrained optimization program of Dubé et al. (2012) with probabilistic bounds on the estimation error in market shares. We show consistency when the number of markets T grows sufficiently slowly relative to the number of consumers n, in the sense that log(T)/n → 0, and provide confidence intervals under the same rate condition. Simulations suggest improved finite-sample properties of the proposed estimator relative to conventional alternatives.
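One way to see why a rate condition of the form log(T)/n → 0 arises is to bound all share estimation errors at once. The sketch below assumes, purely for illustration, that each share is an average of n i.i.d. consumer choice indicators and applies Hoeffding's inequality with a union bound over all products and markets; the abstract does not state which probabilistic bound the paper uses, and the function name `share_error_bound` is hypothetical.

```python
import math


def share_error_bound(n_consumers, n_markets, n_products, delta=0.05):
    """Uniform half-width for all estimated shares.

    ASSUMPTION: each share is a mean of n_consumers i.i.d. indicators, so
    P(|s_hat - s| >= t) <= 2 * exp(-2 * n * t^2) for a single share. A union
    bound over n_markets * n_products shares and solving for t gives
    t = sqrt(log(2 * J * T / delta) / (2 * n)).
    """
    n_shares = n_markets * n_products
    return math.sqrt(math.log(2.0 * n_shares / delta) / (2.0 * n_consumers))


if __name__ == "__main__":
    # The bound shrinks in n and grows only logarithmically in T.
    for n, T in [(100, 10), (100, 1000), (10000, 1000)]:
        print(n, T, round(share_error_bound(n, T, n_products=5), 4))
```

Under this assumption, a share estimated as zero is still compatible with any true share in the interval from zero to the bound, which is the kind of slack a constrained program can exploit instead of dropping or perturbing the observation.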
The nested logit model is commonly used to estimate demand in differentiated products markets. However, it and its generalizations require an assumed nesting structure. In this paper, we propose to estimate the nesting structure from the data. For this, we build on a recent generalization of the nested logit model that allows any possible nesting structure and is consistent with utility maximization by heterogeneous consumers. In this setting, estimating the nesting structure amounts to estimating a linear model with many endogenous variables, which is challenging. We show theoretically and in simulations that non-negativity constraints coming from economic theory are sufficient to recover the nesting structure from data. In doing so, we explore the regularization properties of the non-negative least squares estimator, as demonstrated in the statistical literature and extended here to an instrumental variables context. This estimator may be of independent interest.
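As a rough illustration of the non-negative least squares building block, the sketch below minimizes 0.5 * ||y - X b||^2 subject to b >= 0 by projected gradient descent. It covers only the plain regression case, not the instrumental variables extension developed in the paper; the function name `nnls_projected_gradient` and the step-size choice are assumptions made for this example.

```python
def nnls_projected_gradient(X, y, n_iter=5000):
    """Non-negative least squares by projected gradient descent.

    X is a list of rows (lists of floats) with at least one non-zero entry;
    y is a list of floats. The step size 1 / trace(X'X) is a conservative
    choice, since the trace bounds the largest eigenvalue of X'X from above.
    """
    n_rows, n_cols = len(X), len(X[0])
    step = 1.0 / sum(x * x for row in X for x in row)
    beta = [0.0] * n_cols
    for _ in range(n_iter):
        # Residual X b - y and gradient X'(X b - y).
        resid = [
            sum(X[i][j] * beta[j] for j in range(n_cols)) - y[i]
            for i in range(n_rows)
        ]
        grad = [
            sum(X[i][j] * resid[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        beta = [max(0.0, beta[j] - step * grad[j]) for j in range(n_cols)]
    return beta


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [1.0, -1.0]
    # The second coefficient is held at zero by the sign constraint.
    print(nnls_projected_gradient(X, y))
```

The constraint sets coefficients exactly to zero when the unconstrained fit would make them negative, which is the sparsity-inducing behavior the abstract refers to as regularization. For larger problems an active-set solver such as `scipy.optimize.nnls` would be the more usual choice.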