Seminars

On choosing the size of data perturbation in adaptive model selection via penalization


Hung Chen

2011-06-17
12:50:00 - 14:50:00

307, Mathematics Research Center Building (ori. New Math. Bldg.)

Model selection procedures via penalization often use a fixed penalty, such as that of AIC or BIC, to avoid choosing a model that fits a particular data set extremely well. To account for the variability induced by model selection, Ye (1998) introduced generalized degrees of freedom (GDF) as an estimate of the model selection uncertainty that arises from using the same data for both model selection and the associated parameter estimation. Building on generalized degrees of freedom, Shen and Ye (2002) propose a data-adaptive complexity penalty. The estimate of GDF is obtained by randomly perturbing (adding noise to) the output variable and re-running the modeling procedure. In this talk, we address how to select the size of the perturbation through unbiased risk estimation, Stein's lemma, and generalized derivatives, in the context of variable selection in nested linear regression models and wavelet thresholding. We also comment on its connection with the little bootstrap and tiny bootstrap considered in Breiman (1992, 1995, 1996).
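
The perturb-and-refit idea in the abstract can be illustrated with a minimal sketch. It is written in the spirit of a Monte Carlo GDF estimate and is not the speaker's implementation. It assumes Gaussian perturbations with standard deviation tau (the perturbation size the talk is concerned with) and measures, for each observation, the slope of its fitted value against its own perturbation. The names estimate_gdf and ols_fit, the simulated data, and the values of tau and n_perturb are hypothetical choices made for illustration only.

```python
import numpy as np


def estimate_gdf(fit, y, tau, n_perturb=2000, seed=0):
    """Monte Carlo sketch of a GDF estimate (illustrative, not the talk's code).

    fit: callable mapping a response vector to a vector of fitted values.
    tau: standard deviation of the Gaussian perturbation (assumed form).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    # One row of noise per perturbed copy of the response.
    deltas = rng.normal(0.0, tau, size=(n_perturb, y.size))
    # Re-run the modeling procedure on every perturbed response.
    fits = np.array([fit(y + d) for d in deltas])
    # For each observation i, the least-squares slope of fitted value i
    # on perturbation i estimates its sensitivity to its own response.
    d_c = deltas - deltas.mean(axis=0)
    f_c = fits - fits.mean(axis=0)
    slopes = (d_c * f_c).sum(axis=0) / (d_c ** 2).sum(axis=0)
    # The sum of the sensitivities is the GDF estimate.
    return slopes.sum()


def ols_fit(X):
    """Return a fitting procedure: ordinary least squares on a fixed design X."""
    hat = X @ np.linalg.pinv(X)  # projection ("hat") matrix
    return lambda y: hat @ y


# Hypothetical simulated data: n observations, p predictors.
rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

gdf_ols = estimate_gdf(ols_fit(X), y, tau=0.5)
print(gdf_ols)
```

For a fixed linear fit such as ordinary least squares the estimate should land near the number of predictors p for any tau, up to Monte Carlo error. For procedures that select a model, the fitted values depend on the response nonlinearly and the estimate can be sensitive to tau, which is the choice the talk addresses.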