A New Procedure for Multiple Outliers Detection in Linear Regression

Ugah, Tobias Ejiofor and Arum, Kingsley Chinedu and Onwuamaeze, Charity Uchenna and Ossai, Everestus Okafor and Oranye, Henrrietta Ebele and Eze, Nnaemeka Martin and Mba, Emmanuel Ikechukwu and Mba, Ifeoma Christy and Ekene-Okafor, Comfort Njideka and Asogwa, Oluchukwu Chukwuemeka and Okoacha, Nkechi Grace (2003) A New Procedure for Multiple Outliers Detection in Linear Regression. Mathematics and Statistics, 11 (4). pp. 738-745.

Abstract

In this paper, a simple asymptotic test statistic for identifying multiple outliers in linear regression is proposed. Sequential methods of multiple outlier detection test for the presence of a single outlier each time the procedure is applied. That is, the most extreme observation (the one with the largest absolute internally studentized residual from the original fit of the model to all the observations) is tested first. If the test declares this observation an outlier, it is deleted and the model is refitted to the remaining (reduced) observations. The observation with the next largest absolute internally studentized residual from the reduced sample is then tested, and so on. This cycle of deleting observations and recomputing studentized residuals continues until the null hypothesis of no outliers fails to be rejected. In this work, by contrast, our procedure uses only one set of internally studentized residuals, obtained from fitting the model to the original data, throughout the test exercise; the cycle of deleting an observation, refitting the model to the remaining observations and recomputing the absolute internally studentized residuals at each stage of the test is thereby avoided. The test statistic is incorporated into a procedure that entails a sequential application of a function of the internally studentized residuals. The procedure is a straightforward multistage method and is based on a result giving large-sample properties of the internally studentized residuals. Because exact critical values of the test statistic are not available, approximate critical values are obtained from the Bonferroni inequality.
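For reference, the quantities the abstract relies on can be written out as follows. These are the standard textbook definitions of the internally studentized residual and of the Bonferroni bound, not formulas quoted from the paper. The notation (n observations, p regression parameters, hat matrix H, cutoff c) is assumed here, and the normal-quantile cutoff in the last line is only one plausible reading of "approximate critical values"; the paper's actual statistic and critical values may differ.

```latex
% Ordinary least squares fit of y = X\beta + \varepsilon,
% with n observations and p regression parameters (assumed notation).
e = y - X\hat{\beta}, \qquad
H = X (X^{\top} X)^{-1} X^{\top}, \qquad
s^{2} = \frac{e^{\top} e}{n - p}

% Internally studentized residual of observation i;
% h_{ii} is the i-th diagonal element of H.
r_{i} = \frac{e_{i}}{s \sqrt{1 - h_{ii}}}, \qquad i = 1, \dots, n

% Bonferroni inequality applied to the largest absolute residual.
\Pr\Bigl( \max_{1 \le i \le n} |r_{i}| > c \Bigr)
  \le \sum_{i=1}^{n} \Pr\bigl( |r_{i}| > c \bigr)

% Assumption: if each r_i is approximately N(0, 1) in large samples,
% an overall level of at most \alpha is obtained with
c_{\alpha} = \Phi^{-1}\!\Bigl( 1 - \frac{\alpha}{2n} \Bigr)
```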
The new test statistic is very simple to compute and is efficient and effective in large data sets, where more complex methods are difficult to apply because of their enormous computational demands. The results of the simulation study and numerical examples show that the proposed test statistic is very successful in identifying outlying observations.
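The idea of testing in stages from a single fit, with no deletion or refitting, can be illustrated with a minimal Python sketch. It is not the authors' test statistic: the abstract does not state the function of the residuals that is used, so the sketch assumes the plain absolute internally studentized residual, a fixed Bonferroni cutoff taken from the standard normal distribution, and a simple regression with one predictor. The function names `studentized_residuals` and `flag_outliers` and the example data are hypothetical.

```python
import math
from statistics import NormalDist, mean


def studentized_residuals(x, y):
    """Internally studentized residuals for the simple regression y = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)  # residual mean square, p = 2
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, leverage)]


def flag_outliers(x, y, alpha=0.05):
    """Multistage scan over ONE set of residuals, with no deletion or refitting."""
    r = studentized_residuals(x, y)
    n = len(r)
    # Assumption: Bonferroni cutoff from the N(0, 1) large-sample approximation,
    # kept fixed across stages for simplicity.
    crit = NormalDist().inv_cdf(1.0 - alpha / (2.0 * n))
    # Visit observations from the largest absolute residual downwards.
    order = sorted(range(n), key=lambda i: abs(r[i]), reverse=True)
    flagged = []
    for i in order:
        if abs(r[i]) <= crit:  # first non-rejection ends the procedure
            break
        flagged.append(i)
    return flagged, crit


# Hypothetical data: a straight line with small deterministic noise
# and one grossly shifted observation at index 12.
x = [float(i) for i in range(30)]
y = [2.0 + 0.5 * xi + 0.3 * math.sin(i) for i, xi in enumerate(x)]
y[12] += 15.0

flagged, crit = flag_outliers(x, y)
print("cutoff:", round(crit, 3))
print("flagged indices:", flagged)
```

Because every stage reuses the residuals from the original fit, the cost is one regression plus a sort, which is the source of the computational saving the abstract describes. A delete-and-refit procedure would instead call the fitting step once per stage.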

Item Type:	Article
Subjects:	Q Science > Q Science (General)
Divisions:	Faculty of Engineering, Science and Mathematics > School of Mathematics
Depositing User:	Cynthia Ugwuoti
Date Deposited:	30 May 2025 13:02
Last Modified:	30 May 2025 13:02
URI:	http://eprints.gouni.edu.ng/id/eprint/4696
