In contemporary statistical applications, the formal inferential problem is typically selected only after some interaction with the data. Usually, an initial exploratory analysis identifies those aspects of a population that appear interesting, and the same dataset is then used to learn about them. Such double use of the data invalidates classical inferential procedures. Selective inference aims to correct for ‘data snooping’ and provide valid inference on the selected parameters. We propose a powerful procedure for selective inference in selection-agnostic settings, where the selection mechanism is unknown or difficult to handle analytically.
Our method performs the exploratory analysis on an artificially randomised version of the data and bases the inference on an orthogonal complement, which is independent of the selection by construction in Gaussian settings. We discuss the relationship between the method and data splitting and describe how, under mild conditions, the method is asymptotically justified in non-Gaussian settings.
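As an illustration only, the randomisation idea can be sketched for a Gaussian sample with known variance. The specific construction below is an assumption made for this sketch, not necessarily the one used in the talk: artificial noise W ~ N(0, sigma^2) is added to form U = Y + gamma*W for selection, and V = Y - W/gamma is kept for inference. The names `randomised_split`, `gamma` and `sigma` are hypothetical. Under these assumptions Cov(U, V) = sigma^2 - sigma^2 = 0, so U and V are independent in the Gaussian case.

```python
import random
import statistics


def randomised_split(y, sigma, gamma, rng):
    """Split y into a selection part u and an inference part v.

    Assumes y_i ~ N(mu_i, sigma^2) with sigma known; gamma > 0 is a
    hypothetical tuning parameter controlling how much information
    is spent on selection.
    """
    w = [rng.gauss(0.0, sigma) for _ in y]            # artificial noise
    u = [yi + gamma * wi for yi, wi in zip(y, w)]     # randomised data for selection
    v = [yi - wi / gamma for yi, wi in zip(y, w)]     # orthogonal complement for inference
    return u, v


def sample_covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((x - ma) * (z - mb) for x, z in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)
sigma, gamma, n, mu = 1.0, 1.0, 20000, 0.5

y = [rng.gauss(mu, sigma) for _ in range(n)]
u, v = randomised_split(y, sigma, gamma, rng)

# Empirical check: u and v should be (nearly) uncorrelated.
cov_uv = sample_covariance(u, v)
print("sample covariance of u and v:", cov_uv)
```

In this sketch any selection rule, however complicated, would be applied to `u` alone; inference would then use `v`, which under the stated assumptions has mean mu and variance sigma^2 (1 + 1/gamma^2) regardless of what was selected.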
This is joint work with Daniel Garcia Rasines.