In-depth discussion on frequency estimation techniques, challenges, and solutions for nonlinear time series analysis. Explore resolution, computational effort, statistical methods, and label switching in Bayesian analysis.
Discussion of Papers of Koen and Woan. John Rice, University of California, Berkeley
Frequency Estimation • For estimating a single frequency (or small number), theoretical results for vanilla models are encouraging. For example, using folding and smoothing • Resolution is much better than that of the natural Fourier frequencies, 1/T. The downside is that the computational effort for a broadband search scales accordingly. The relevance of the periodogram computed at the natural Fourier frequencies, other than as a crude initial screen, is unclear, especially for irregularly spaced data and complex lightcurves. • Folding methods are easy to robustify, so one is not forced to rely on parametric models. • Testing is straightforward (at least conceptually) via permuting observations over times.
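The folding and permutation ideas on this slide can be sketched as follows. This is a minimal illustration, not the method of any of the papers discussed: the binned within-phase sum of squares as the folding statistic, the function names, the bin count, the frequency grid, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def folded_dispersion(t, y, freq, n_bins=10):
    """Within-bin sum of squares after folding the times at a trial frequency."""
    phase = (t * freq) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    counts = np.bincount(bins, minlength=n_bins)
    sums = np.bincount(bins, weights=y, minlength=n_bins)
    occupied = counts > 0
    # sum of squares about the bin means = sum(y^2) - sum(bin_sum^2 / bin_count)
    return np.sum(y ** 2) - np.sum(sums[occupied] ** 2 / counts[occupied])

def best_frequency(t, y, freqs, n_bins=10):
    """Return the trial frequency with the smallest folded dispersion."""
    scores = np.array([folded_dispersion(t, y, f, n_bins) for f in freqs])
    return freqs[np.argmin(scores)], scores.min()

def permutation_pvalue(t, y, freqs, n_perm=50, n_bins=10, seed=0):
    """Permute observations over times and repeat the whole search each time."""
    rng = np.random.default_rng(seed)
    _, observed = best_frequency(t, y, freqs, n_bins)
    hits = 0
    for _ in range(n_perm):
        _, score = best_frequency(t, rng.permutation(y), freqs, n_bins)
        hits += score <= observed
    return (hits + 1) / (n_perm + 1)

# Simulated, irregularly spaced data (all values here are illustrative).
rng = np.random.default_rng(0)
T = 100.0
t = np.sort(rng.uniform(0.0, T, 200))
true_freq = 0.37
y = np.cos(2 * np.pi * true_freq * t) + 0.3 * rng.normal(size=t.size)

# Grid five times finer than the natural Fourier spacing 1/T = 0.01.
freqs = np.arange(0.2, 0.6, 0.002)
f_hat, _ = best_frequency(t, y, freqs)
p_value = permutation_pvalue(t, y, freqs)
print(f_hat, p_value)
```

The cost of the search grows with the number of trial frequencies, which is the computational downside the slide mentions. Replacing bin means with bin medians would be one way to robustify the statistic.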
Drift • Koen has provided thought-provoking material. These kinds of problems provide a playground for statisticians interested in nonlinear time series analysis. • Theory is encouraging, too (Genton and Hall). In a polynomial model for period drift • and for amplitude drift • Bootstrap tests can be readily devised. • The irony is that these superfine levels of resolution imply the necessity of a correspondingly scaled computational effort.
Confusion-Limited Spectra • “LISA throws up very many fascinating statistical challenges” • Theoretically, how many components can be identified in a series of length T, and under what conditions? (Gaps in the data, Doppler modulation, and antenna patterns further complicate the issues.) • Computationally, how can one go about doing this? The periodogram is problematic, since as an estimation method it does not incorporate any constraints on strength and sparsity. • How can uncertainty be quantified?
Hope for strong, sparse signals How many would be picked up using a simple uniform bound?
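One possible reading of a "simple uniform bound" is a single threshold that the whole periodogram stays below, with high probability, when there is no signal. The sketch below assumes regularly spaced data and Gaussian white noise of known variance, so that the periodogram ordinates at the interior Fourier frequencies are independent exponentials. The slide does not state these assumptions, and the function name and simulated signal are illustrative.

```python
import numpy as np

def uniform_threshold(m, sigma2=1.0, alpha=0.05):
    """Level t with P(max of m iid sigma2*Exp(1) ordinates > t) = alpha."""
    # Solves (1 - exp(-t / sigma2))**m = 1 - alpha for t, in a stable form.
    return -sigma2 * np.log(-np.expm1(np.log1p(-alpha) / m))

rng = np.random.default_rng(1)
n = 1024
t = np.arange(n)

# Three strong components and one weak one, all at Fourier frequencies k/n.
strong_bins = [50, 200, 333]
weak_bin = 120
y = sum(0.5 * np.cos(2 * np.pi * k * t / n) for k in strong_bins)
y = y + 0.05 * np.cos(2 * np.pi * weak_bin * t / n)
y = y + rng.normal(0.0, 1.0, n)  # unit-variance white noise

periodogram = np.abs(np.fft.rfft(y)) ** 2 / n
k = np.arange(1, n // 2)  # interior frequencies: skip 0 and Nyquist
threshold = uniform_threshold(len(k))
detected = k[periodogram[k] > threshold]
print(threshold, detected)
```

In this setup a component of amplitude A contributes roughly A²n/4 to its periodogram ordinate, so the strong components (about 64) clear the threshold (about 9.2) while the weak one (about 0.6) is lost in the noise.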
Label Switching and MCMC • One of the main challenges of a Bayesian analysis using mixtures is the nonidentifiability of the components. That is, if exchangeable priors are placed upon the parameters of a mixture model, then the resulting posterior distribution will be invariant to permutations in the labeling of the parameters. As a result, the marginal posterior distributions for the parameters will be identical for each mixture component. Therefore, during MCMC simulation, the sampler encounters the symmetries of the posterior distribution and the interpretation of the labels switch. It is then meaningless to draw inference directly from MCMC output using ergodic averaging. Label switching significantly increases the effort required to produce a satisfactory Bayesian analysis of the data, but is a prerequisite of convergence of an MCMC sampler, and therefore must be addressed. Whilst convergence in MCMC simulation is a complex issue we regard a minimum requirement of convergence for a mixture posterior to be such that we have explored all possible labellings of the parameters*. • Current reversible jump and continuous time samplers are unable to move efficiently around the sample space and new simulation methods are required to apply Bayesian methodology in such contexts. • Jasra, Holmes, and Stephens. Statistical Science, 2005. • *~10^5 for LISA?
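The permutation invariance behind label switching can be checked numerically. The sketch below assumes a Gaussian mixture with a common known standard deviation and uses simulated data; the function name and parameter values are illustrative. It evaluates the likelihood under every relabelling of a three-component mixture.

```python
import itertools
import math
import numpy as np

def mixture_loglik(y, weights, means, sd=1.0):
    """Log-likelihood of a Gaussian mixture with a common standard deviation."""
    y = np.asarray(y)[:, None]
    weights = np.asarray(weights)
    means = np.asarray(means)
    log_comp = (np.log(weights)
                - 0.5 * np.log(2 * np.pi * sd ** 2)
                - 0.5 * ((y - means) / sd) ** 2)
    # log of the sum over components, computed stably, then summed over data
    return np.sum(np.logaddexp.reduce(log_comp, axis=1))

rng = np.random.default_rng(2)
weights = np.array([0.5, 0.3, 0.2])
means = np.array([-2.0, 0.5, 3.0])
labels = rng.choice(3, size=300, p=weights)
y = rng.normal(means[labels], 1.0)

# Relabel the components in every possible way and re-evaluate the likelihood.
logliks = []
for perm in itertools.permutations(range(3)):
    idx = list(perm)
    logliks.append(mixture_loglik(y, weights[idx], means[idx]))
logliks = np.array(logliks)

n_labellings = math.factorial(3)
print(n_labellings, logliks)
```

All relabellings give the same likelihood up to rounding, so with an exchangeable prior the posterior has k! symmetric copies of each mode. Visiting all of them quickly becomes infeasible as the number of components grows.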
L1 constrained estimation • Such procedures are very much in vogue for sparse models. Applications in genomics (microarray data) and other areas. Number of variables may be much larger than number of observations (“small n large p”). • The optimization problem is convex and there are efficient codes. Tibshirani, “Lasso”
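For reference, the lasso in its penalized form can be written as below. The notation is assumed here rather than taken from the slides: y is the n-vector of observations, X the n × p design matrix, β the coefficient vector, and λ ≥ 0 the penalty weight.

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}}
  \; \tfrac{1}{2}\,\lVert y - X\beta \rVert_{2}^{2}
  \;+\; \lambda\,\lVert \beta \rVert_{1},
  \qquad \lambda \ge 0 .
```

Both terms are convex in β, which is why the optimization problem is convex even when p is much larger than n.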
L1 spectral estimation Genovese and Stark, 1993
L1 spectral estimation Thanks to Nicolai Meinshausen. Same setup as in Woan
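An L1-penalized spectral fit can be sketched by running the lasso on a dictionary of cosines and sines. The example below is not the Genovese and Stark procedure and does not reproduce Woan's setup: the simulated data, the frequency grid, the penalty level, and the use of plain iterative soft thresholding (ISTA) as the solver are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista_lasso(X, y, lam, n_iter=1000):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by iterative soft thresholding."""
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta

# Irregularly spaced observations of two sinusoids plus noise (illustrative).
rng = np.random.default_rng(3)
n = 100
t = np.sort(rng.uniform(0.0, 100.0, n))
freqs = 0.05 + 0.01 * np.arange(95)  # 95 candidate frequencies
true_idx = [15, 40]                  # frequencies 0.20 and 0.45
y = (1.0 * np.cos(2 * np.pi * freqs[15] * t + 0.3)
     + 0.8 * np.sin(2 * np.pi * freqs[40] * t)
     + 0.2 * rng.normal(size=n))

# Dictionary with a cosine and a sine column per frequency: p = 190 > n = 100.
angles = 2 * np.pi * np.outer(t, freqs)
X = np.hstack([np.cos(angles), np.sin(angles)])

lam = 0.1 * np.max(np.abs(X.T @ y))  # ad hoc penalty level
beta = ista_lasso(X, y, lam)

# Amplitude per frequency combines its cosine and sine coefficients.
amplitude = np.hypot(beta[:95], beta[95:])
top_two = sorted(np.argsort(amplitude)[-2:].tolist())
print(freqs[top_two], amplitude[top_two])
```

Unlike the periodogram, the fit builds in sparsity through the L1 penalty, at the price of having to choose λ and of some shrinkage in the estimated amplitudes.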