An Attempt to Analyse Baarda’s Iterative Data Snooping Procedure based on Monte Carlo Simulation

Vinicius Francisco Rofatto, Marcelo Tomio Matsuoka, Ivandro Klein

Abstract


William Sealy Gosset, better known as “Student” and a contemporary of Fisher, was one of the pioneers in the development of modern statistical methods and their application to the design and analysis of experiments. Although there were no computers in his time, he discovered the form of the “t distribution” by combining mathematical work with empirical work on random numbers. This is now regarded as an early application of Monte Carlo simulation. Today, with fast computers and large data storage systems, probability distributions can be estimated by computer simulation. Here, we use Monte Carlo simulation to investigate the efficiency of Baarda’s iterative data snooping procedure as a test for outlier identification in the Gauss-Markov model. We highlight that the iterative data snooping procedure can identify more observations than the number of outliers actually simulated, a case of over-identification that receives particular attention in this work. Knowing the probability of over-identification refines the estimate of the probability of a type III error and, probably, of outlier identifiability. With this approach, for the network analysed, a significance level of 0.001 was in general the best choice for avoiding the mistake of excluding the wrong observation. The evaluation of the data snooping procedure was therefore more realistic when the over-identification case was considered in the simulation. Finally, we conclude that, for a GNSS network, the iterative data snooping procedure based on Monte Carlo can locate an outlier on the order of 4.5σ with a high success rate.
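To make the idea concrete, the following is a minimal Python sketch of how such a Monte Carlo experiment could be set up. It is not the authors' implementation. It assumes uncorrelated observations of equal, known standard deviation, a single simulated outlier per trial, and a toy straight-line design matrix in place of the GNSS network. The function names, the outcome labels ("correct", "over", "wrong", "missed") and the default parameters are all hypothetical choices made for illustration.

```python
import numpy as np
from statistics import NormalDist


def iterative_data_snooping(A, y, sigma, alpha):
    """Flag observations by repeating the w-test until none is rejected.

    Assumption: observations are uncorrelated with equal, known sigma.
    """
    k = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    active = list(range(len(y)))  # observations still in the model
    flagged = []
    while len(active) > A.shape[1]:  # keep at least one redundant observation
        Aa, ya = A[active], y[active]
        # Redundancy matrix R = I - A (A^T A)^-1 A^T
        R = np.eye(len(active)) - Aa @ np.linalg.solve(Aa.T @ Aa, Aa.T)
        v = R @ ya  # least-squares residuals
        r = np.diag(R)  # redundancy numbers
        w = np.zeros(len(active))
        ok = r > 1e-10  # skip observations that cannot be tested
        w[ok] = np.abs(v[ok]) / (sigma * np.sqrt(r[ok]))
        j = int(np.argmax(w))
        if w[j] <= k:
            break  # no remaining observation is rejected
        flagged.append(active.pop(j))  # exclude the most suspect one, re-adjust
    return flagged


def monte_carlo(A, sigma=1.0, outlier=4.5, alpha=0.001, trials=2000, seed=0):
    """Estimate outcome rates for one simulated outlier of size outlier*sigma."""
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    counts = {"correct": 0, "over": 0, "wrong": 0, "missed": 0}
    for _ in range(trials):
        # True parameters are set to zero; this does not affect the residuals.
        y = rng.normal(0.0, sigma, m)
        i = int(rng.integers(m))  # observation that receives the outlier
        y[i] += outlier * sigma * rng.choice([-1.0, 1.0])
        flagged = iterative_data_snooping(A, y, sigma, alpha)
        if not flagged:
            counts["missed"] += 1  # outlier not detected
        elif flagged == [i]:
            counts["correct"] += 1  # only the contaminated observation flagged
        elif i in flagged:
            counts["over"] += 1  # outlier flagged together with good data
        else:
            counts["wrong"] += 1  # only good observations were excluded
    return {name: c / trials for name, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical toy design: straight-line fit to 10 observations.
    A_demo = np.column_stack([np.ones(10), np.arange(10.0)])
    for a in (0.001, 0.01, 0.05):
        print(a, monte_carlo(A_demo, alpha=a))
```

In this sketch the "over" rate is tracked separately, so one can observe how lowering the significance level trades missed detections against the exclusion of good observations. The rates it produces depend entirely on the toy design and should not be read as the paper's results.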
