Journal of Statistical Theory and Applications

Volume 20, Issue 2, June 2021, Pages 219 - 227

An Alternative Measures of Moments Skewness Kurtosis and JB Test of Normality

Authors
Md. Siraj-Ud-Doulah*
Department of Statistics, Begum Rokeya University, Rangpur, 5400, Bangladesh
*Corresponding author. Email: sdoulah_brur@yahoo.com
Received 30 October 2018, Accepted 31 December 2020, Available Online 31 May 2021.
DOI
10.2991/jsta.d.210525.002
Keywords
Robust moments; Robust skewness; Robust Kurtosis; Robust test of normality
Abstract

Knowing the measures of central tendency and dispersion does not give a complete picture of a distribution. Beyond these measures, we need information on skewness and kurtosis, which enable us to describe the shape of the distribution. However, there is evidence that the usual measures may respond poorly in the presence of non-normality or when outliers arise in the data. We examine the performance of the popular and frequently used measures of skewness $\beta_1$ and kurtosis $\beta_2$ and of the Jarque–Bera test of normality, and show that they may not perform as anticipated in the presence of non-normality or outliers. In this paper, we first develop robust measures of moments and formulate new statistics of skewness and kurtosis, which we name robust skewness $\phi_1$ and robust kurtosis $\phi_2$. We then modify the Jarque–Bera test of normality, which we label Robust Jarque–Bera (RJB). These measures should be fairly robust. The effectiveness of the proposed measures is investigated by simulation. The results demonstrate that the newly proposed skewness $\phi_1$, kurtosis $\phi_2$ and RJB test outperform the classical skewness, kurtosis and Jarque–Bera test of normality whether or not a small percentage of outliers is present in the data.

Copyright
© 2021 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

The study of central tendency and dispersion provides valuable information about the central value and the variability of a distribution. Unfortunately, these measures fail to show how the observations are arranged and accumulated about the central value of the distribution. The arrangement and accumulation of the observations establish the characteristics of the distribution with respect to its shape and pattern [1]. By shape characteristics of a distribution, we refer to its degree of asymmetry and peakedness, which are measured by skewness and kurtosis, respectively. Skewness and kurtosis help us to visualize the asymmetry and peakedness of a frequency distribution. The theoretical and practical background of various measures of moments, skewness and kurtosis is documented in several books [2–11] and journal articles [12], to name but a few. Among them, the absolute measures of skewness are not suitable for comparing two series. Karl Pearson's coefficient of skewness is difficult to calculate if the mode is ill-defined, and for moderately asymmetrical distributions its limits are ±3; in practice, these limits are rarely attained. Bowley's coefficient of skewness depends only on the central 50% of the data. The moment-based coefficient of skewness $S_k$ depends on $\beta_1$ and $\beta_2$, with $S_k = 0$ if either $\beta_1 = 0$ or $\beta_2 = -3$; since $\beta_2$ cannot be negative, $S_k = 0$ if and only if $\beta_1 = 0$. Moreover, the most popular location and scale estimators are the mean and standard deviation, which are known to be extremely sensitive to outliers. Although the mean, variance and covariance are the most frequently used summary measures of univariate and multivariate PDFs, we occasionally need to consider higher moments of the PDFs, such as the second, third and fourth moments. The rth moment about the mean ($\mu$) is defined as

$$\mu_r = E(X - \mu)^r, \quad r = 1, 2, 3, \ldots$$

The second, third and fourth moments of a distribution are often used in studying the “shape” of a probability distribution, in particular its skewness, S (i.e., lack of symmetry), and kurtosis, K (i.e., tallness or flatness), which are defined as follows.

One measure of skewness is defined as

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$$

A commonly used measure of kurtosis is given by

$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$

Here we define the most popular and commonly used goodness-of-fit test for normality, the Jarque and Bera [13] test, which uses the information in the skewness and kurtosis and is formulated as

$$JB = n\left[\frac{\beta_1^2}{6} + \frac{(\beta_2 - 3)^2}{24}\right]$$

For a normally distributed variable, $\beta_1 = 0$ and $\beta_2 = 3$, and in that case the value of the JB statistic is expected to be 0. Under the null hypothesis that the data set is normally distributed, Jarque and Bera showed that asymptotically (i.e., in large samples) the JB statistic follows the chi-square distribution with 2 degrees of freedom. If the computed p value of the JB statistic in an application is sufficiently low, which happens if the value of JB is very different from 0, one can reject the hypothesis that the data are normally distributed. But if the p value is reasonably high, which happens if the value of the statistic is close to 0, we do not reject normality [13,14].
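As an illustration of the classical quantities above, the following Python sketch computes $\beta_1$, $\beta_2$, the JB statistic and its asymptotic p value. It is a minimal sketch, not the author's code: the function names are hypothetical, the sample moments use the divisor n, and the p value uses the closed form of the chi-square survival function with 2 degrees of freedom, exp(−JB/2). The statistic is coded with $\beta_1$ squared because that form reproduces the classical JB value reported in Table 1 (0.5246 for n = 12, $\beta_1$ = 0.0457 and $\beta_2$ = 1.9797).

```python
import math


def central_moment(x, r):
    """r-th sample moment about the arithmetic mean (divisor n, an assumption)."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** r for xi in x) / len(x)


def beta1(x):
    """Classical skewness measure beta_1 = mu_3^2 / mu_2^3."""
    return central_moment(x, 3) ** 2 / central_moment(x, 2) ** 3


def beta2(x):
    """Classical kurtosis measure beta_2 = mu_4 / mu_2^2."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2


def jb_statistic(n, b1, b2):
    """JB statistic in the form that matches the values tabulated in Table 1."""
    return n * (b1 ** 2 / 6 + (b2 - 3) ** 2 / 24)


def chi2_sf_2df(stat):
    """P(chi-square with 2 df > stat); for 2 df this equals exp(-stat / 2)."""
    return math.exp(-stat / 2)


# Classical values reported in Table 1 for the carrots data (n = 12).
jb = jb_statistic(12, 0.0457, 1.9797)
p_value = chi2_sf_2df(jb)
```

With the Table 1 inputs, `jb` and `p_value` agree with the tabulated 0.5246 and 0.7692 to about three decimal places, which is why this form of the statistic was chosen for the sketch.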

Since the moments $\mu_r$ are sensitive to outliers from the underlying distribution, the resulting skewness $\beta_1$ and kurtosis $\beta_2$ can be affected by outliers, and so, in turn, can the JB test. In this paper, to overcome this sensitivity, we focus on finding novel and straightforward measures of skewness and kurtosis and a modified JB statistic. We label them robust skewness $\phi_1$, robust kurtosis $\phi_2$ and the robust Jarque–Bera (RJB) test for normality; they are introduced in Section 2. The properties of these new measures are illustrated in Section 3 with real-life data. The performance of the proposed measures is investigated in Section 4 through a Monte Carlo simulation experiment.

2. PROPOSED ROBUST MODIFICATION OF MOMENTS, SKEWNESS, KURTOSIS AND JB STATISTIC

The presence of a small proportion of outliers in a sample can have a large distorting influence on the sample mean and the sample variance. It is well known that these classical estimators, optimal under the normality assumption, are extremely sensitive to atypical observations in the data. Since the measures of skewness and kurtosis are based on the mean and variance, they are also sensitive to outliers. Several measures of the robustness of an estimator exist [15,16]; in this paper, the decile mean (DM) will be used. It is a rich tool that summarizes several aspects of the robustness of an estimator. A survey of the DM is given in [1]. We define the DM as

$$\mathrm{DM} = \frac{D_1 + D_2 + \cdots + D_9}{9}$$

where $D_1, D_2, \ldots, D_9$ are the nine deciles computed from grouped or ungrouped data. On this basis, we develop robust measures of moments, skewness and kurtosis and a robust JB statistic for testing normality.
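The decile mean can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decile_mean` is hypothetical, and the deciles are computed with `statistics.quantiles` using its "inclusive" interpolation rule, which is one of several common decile conventions for ungrouped data and may differ from the convention used in [1].

```python
from statistics import mean, quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 of the sample."""
    # quantiles(..., n=10) returns the 9 cut points that split the data into 10 parts.
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


clean = list(range(11))                # 0, 1, ..., 10
contaminated = clean[:-1] + [1000]     # largest value replaced by an outlier

dm_clean = decile_mean(clean)
dm_contaminated = decile_mean(contaminated)
mean_contaminated = mean(contaminated)
```

In this small example the outlier leaves the decile mean unchanged, because it sits beyond the ninth decile, while the arithmetic mean is pulled far away from the bulk of the data.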

2.1. Robust Moments

In statistics, moments are certain constants of a given distribution and fall under descriptive statistics. Moments help us to establish the nature and form of the underlying distribution. Consider a variable X assuming the values $x_1, x_2, \ldots, x_n$. The rth raw moment of X about any point A is defined by

$$\lambda'_r = \mathrm{DM}(x_i - A)^r, \quad r = 1, 2, \ldots$$

The first four raw moments about the value A are

$$\lambda'_1 = \mathrm{DM}(x_i - A) = \mathrm{DM}(x_i) - A = \mathrm{DM}_x - A$$
$$\lambda'_2 = \mathrm{DM}(x_i - A)^2$$
$$\lambda'_3 = \mathrm{DM}(x_i - A)^3$$
$$\lambda'_4 = \mathrm{DM}(x_i - A)^4$$

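Replacing A by $\mathrm{DM}_x$ in the above expressions gives the central moments:

$$\lambda_1 = \mathrm{DM}(x_i - \mathrm{DM}_x) = \mathrm{DM}(x_i) - \mathrm{DM}_x = \mathrm{DM}_x - \mathrm{DM}_x = 0$$
$$\lambda_2 = \mathrm{DM}(x_i - \mathrm{DM}_x)^2$$
$$\lambda_3 = \mathrm{DM}(x_i - \mathrm{DM}_x)^3$$
$$\lambda_4 = \mathrm{DM}(x_i - \mathrm{DM}_x)^4$$

In general, the rth central moment is defined as

$$\lambda_r = \mathrm{DM}(x_i - \mathrm{DM}_x)^r, \quad r = 1, 2, \ldots$$

Note that an infinite number of moments can be computed for a given distribution, but in practice only the first four moments are needed to investigate the form and characteristics of a distribution.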
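The robust raw and central moments defined above can be sketched as follows. The sketch reads $\mathrm{DM}(x_i - A)^r$ as the decile mean of the transformed values $(x_i - A)^r$; that reading, the function names and the "inclusive" decile rule are all assumptions made for illustration.

```python
from statistics import quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 (inclusive interpolation, an assumption)."""
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


def robust_raw_moment(x, r, origin):
    """lambda'_r: decile mean of (x_i - origin)^r."""
    return decile_mean([(xi - origin) ** r for xi in x])


def robust_central_moment(x, r):
    """lambda_r: robust raw moment taken about the decile mean of x."""
    return robust_raw_moment(x, r, decile_mean(x))


sample = list(range(11))  # symmetric toy sample 0, 1, ..., 10
lam1 = robust_central_moment(sample, 1)
lam2 = robust_central_moment(sample, 2)
lam3 = robust_central_moment(sample, 3)
```

For this symmetric sample the first and third robust central moments vanish, which is consistent with $\lambda_1 = 0$ above.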

2.1.1. Relation between raw moments and central moments

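Recall that

$$\begin{aligned}
\lambda_1 &= \mathrm{DM}(x_i - \mathrm{DM}_x) = \mathrm{DM}(x_i - A + A - \mathrm{DM}_x) \\
&= \mathrm{DM}(x_i - A) - \mathrm{DM}(\mathrm{DM}_x - A) = \mathrm{DM}(x_i - A) - (\mathrm{DM}_x - A) \\
\lambda_1 &= \lambda'_1 - \lambda'_1 = 0
\end{aligned}$$

$$\begin{aligned}
\lambda_2 &= \mathrm{DM}(x_i - \mathrm{DM}_x)^2 = \mathrm{DM}(x_i - A + A - \mathrm{DM}_x)^2 \\
&= \mathrm{DM}(x_i - A)^2 + \mathrm{DM}(\mathrm{DM}_x - A)^2 - 2\,\mathrm{DM}(x_i - A)(\mathrm{DM}_x - A) \\
&= \mathrm{DM}(x_i - A)^2 + (\mathrm{DM}_x - A)^2 - 2\,\mathrm{DM}(x_i - A)(\mathrm{DM}_x - A) \\
\lambda_2 &= \lambda'_2 + (\lambda'_1)^2 - 2\lambda'_1\lambda'_1 \\
\lambda_2 &= \lambda'_2 - (\lambda'_1)^2
\end{aligned}$$

Similarly,

$$\begin{aligned}
\lambda_3 &= \mathrm{DM}(x_i - \mathrm{DM}_x)^3 = \mathrm{DM}(x_i - A + A - \mathrm{DM}_x)^3 \\
\lambda_3 &= \lambda'_3 - 3\lambda'_2\lambda'_1 + 2(\lambda'_1)^3
\end{aligned}$$

and

$$\begin{aligned}
\lambda_4 &= \mathrm{DM}(x_i - \mathrm{DM}_x)^4 = \mathrm{DM}(x_i - A + A - \mathrm{DM}_x)^4 \\
\lambda_4 &= \lambda'_4 - 4\lambda'_3\lambda'_1 + 6\lambda'_2(\lambda'_1)^2 - 3(\lambda'_1)^4
\end{aligned}$$

In general,

$$\lambda_r = \lambda'_r - \binom{r}{1}\lambda'_1\lambda'_{r-1} + \binom{r}{2}(\lambda'_1)^2\lambda'_{r-2} - \cdots + (-1)^r(\lambda'_1)^r, \quad r = 1, 2, \ldots$$

Thus these formulas enable us to pass between the moments about any point A and the moments about the decile mean $\mathrm{DM}_x$, once $\mathrm{DM}_x$ is known.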
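The intermediate steps for the third moment, which the derivation above skips, can be written out as follows. This is a sketch under the same working assumption the derivation for $\lambda_2$ uses, namely that DM can be distributed over sums and constants in the way an expectation can; it uses $\mathrm{DM}_x - A = \lambda'_1$.

$$\begin{aligned}
\lambda_3 &= \mathrm{DM}\left[(x_i - A) - \lambda'_1\right]^3 \\
&= \mathrm{DM}(x_i - A)^3 - 3\lambda'_1\,\mathrm{DM}(x_i - A)^2 + 3(\lambda'_1)^2\,\mathrm{DM}(x_i - A) - (\lambda'_1)^3 \\
&= \lambda'_3 - 3\lambda'_2\lambda'_1 + 3(\lambda'_1)^3 - (\lambda'_1)^3 \\
&= \lambda'_3 - 3\lambda'_2\lambda'_1 + 2(\lambda'_1)^3
\end{aligned}$$

The fourth moment follows in the same way from the binomial expansion of $\left[(x_i - A) - \lambda'_1\right]^4$.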

2.1.2. Effect of change of origin and scale on moments

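Let $y_i = \dfrac{x_i - A}{h}$, where A and h are the origin and scale, respectively. Then

$$x_i - A = h y_i \quad \Rightarrow \quad \mathrm{DM}_x = A + h\,\mathrm{DM}_y$$

Now the rth raw moment of x about any point A is given by

$$\lambda'_r = \mathrm{DM}(x_i - A)^r = \mathrm{DM}(h y_i)^r = h^r\,\mathrm{DM}(y_i)^r = h^r \lambda'_r(y)$$

and the rth moment of x about the decile mean $\mathrm{DM}_x$ is

$$\lambda_r = \mathrm{DM}(x_i - \mathrm{DM}_x)^r = \mathrm{DM}(A + h y_i - A - h\,\mathrm{DM}_y)^r = h^r\,\mathrm{DM}(y_i - \mathrm{DM}_y)^r = h^r \lambda_r(y)$$

Thus the rth moment of the variable x about the decile mean $\mathrm{DM}_x$ is $h^r$ times the rth moment of the variable y about the decile mean $\mathrm{DM}_y$.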
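The change-of-scale rule can be checked numerically on a small example. The sketch below is illustrative only: the data are hypothetical weights in grams re-expressed in kilograms (A = 0, h = 1000), the helper names are not from the paper, h is assumed positive, and the deciles use the "inclusive" rule of `statistics.quantiles`.

```python
import math
from statistics import quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 (inclusive interpolation, an assumption)."""
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


def robust_central_moment(x, r):
    """lambda_r: decile mean of (x_i - DM_x)^r."""
    dm = decile_mean(x)
    return decile_mean([(xi - dm) ** r for xi in x])


# Hypothetical weights in grams; y holds the same weights in kilograms.
x = [212.0, 198.5, 205.3, 221.7, 190.2, 208.8,
     215.4, 201.1, 196.9, 230.6, 203.7, 210.0]
A, h = 0.0, 1000.0
y = [(xi - A) / h for xi in x]

# Check DM_x = A + h * DM_y and lambda_r(x) = h^r * lambda_r(y) for r = 2, 3, 4.
location_holds = math.isclose(decile_mean(x), A + h * decile_mean(y), rel_tol=1e-12)
scaling_holds = all(
    math.isclose(robust_central_moment(x, r), h ** r * robust_central_moment(y, r),
                 rel_tol=1e-9, abs_tol=1e-9)
    for r in (2, 3, 4)
)
```

Both checks compare floating-point results, so they are made with a tolerance rather than exact equality.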

2.2. Robust Skewness and Robust Kurtosis

Literally, skewness means “lack of symmetry” and kurtosis means “convexity of the curve.” We study skewness and kurtosis to get an idea of the shape and pattern of the curve. Robust measures of skewness and kurtosis can be obtained from the proposed robust moments. A relative measure of robust skewness, denoted by $\phi_1$, is defined as follows:

$$\phi_1 = \frac{\lambda_3^2}{\lambda_2^3}$$

The value of $\phi_1$ is zero for a perfectly symmetrical distribution. Since $\phi_1$ itself cannot be negative, the direction of skewness is read from $\lambda_3$: the distribution is positively or negatively skewed according as the value of $\lambda_3$ is positive or negative.

The most important measure of robust kurtosis is $\phi_2$, defined as

$$\phi_2 = \frac{\lambda_4}{\lambda_2^2}$$

For the normal distribution $\phi_2 = 3$. In other words, if $\phi_2 - 3 > 0$, the distribution is leptokurtic; if $\phi_2 - 3 < 0$, the distribution is platykurtic; if $\phi_2 - 3 = 0$, the distribution is mesokurtic.
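The two robust shape measures and the kurtosis classification can be sketched as below. The function names, the tolerance used to decide whether $\phi_2 - 3$ is zero and the "inclusive" decile rule are assumptions made for illustration, not part of the paper.

```python
from statistics import quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 (inclusive interpolation, an assumption)."""
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


def robust_central_moment(x, r):
    """lambda_r: decile mean of (x_i - DM_x)^r."""
    dm = decile_mean(x)
    return decile_mean([(xi - dm) ** r for xi in x])


def robust_skewness(x):
    """phi_1 = lambda_3^2 / lambda_2^3."""
    return robust_central_moment(x, 3) ** 2 / robust_central_moment(x, 2) ** 3


def robust_kurtosis(x):
    """phi_2 = lambda_4 / lambda_2^2."""
    return robust_central_moment(x, 4) / robust_central_moment(x, 2) ** 2


def kurtosis_type(phi2, tol=1e-12):
    """Classify by the sign of phi_2 - 3 (tol is a hypothetical numerical tolerance)."""
    if phi2 - 3 > tol:
        return "leptokurtic"
    if phi2 - 3 < -tol:
        return "platykurtic"
    return "mesokurtic"


sample = list(range(11))  # symmetric toy sample 0, 1, ..., 10
phi1 = robust_skewness(sample)
phi2 = robust_kurtosis(sample)
shape = kurtosis_type(phi2)
```

For the evenly spaced toy sample, $\phi_1$ is zero and $\phi_2$ falls below 3, so the sample is classified as platykurtic.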

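2.2.1. Prove that $\phi_1$ and $\phi_2$ are invariant to changes in origin and scale of measurement

Proof:

Let $\phi_1(x)$ and $\phi_2(x)$ denote the values of $\phi_1$ and $\phi_2$ calculated from a set of observations $x_1, x_2, \ldots, x_n$ pertaining to a variable X. Then

$$\phi_1(x) = \frac{\lambda_3^2(x)}{\lambda_2^3(x)} \quad \text{and} \quad \phi_2(x) = \frac{\lambda_4(x)}{\lambda_2^2(x)}, \quad \text{where } \lambda_r(x) = \mathrm{DM}(x_i - \mathrm{DM}_x)^r$$

Let Y be a transformed variable assuming the values $y_1, y_2, \ldots, y_n$, with $y_i = \dfrac{x_i - A}{h}$, where A and h are the origin and scale, respectively.

Since

$$\lambda_r(x) = h^r \lambda_r(y), \quad r = 1, 2, \ldots$$

the powers of h cancel in both ratios, and the corresponding $\phi$ values are

$$\phi_1(x) = \frac{h^6\lambda_3^2(y)}{h^6\lambda_2^3(y)} = \frac{\lambda_3^2(y)}{\lambda_2^3(y)} = \phi_1(y) \quad \text{and} \quad \phi_2(x) = \frac{h^4\lambda_4(y)}{h^4\lambda_2^2(y)} = \frac{\lambda_4(y)}{\lambda_2^2(y)} = \phi_2(y)$$

Hence the proof.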
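The invariance result can also be illustrated numerically. In this sketch the data, the origin A = 50 and the scale h = 2.5 are hypothetical, h is assumed positive, the helper names are not from the paper, and the deciles use the "inclusive" rule of `statistics.quantiles`.

```python
import math
from statistics import quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 (inclusive interpolation, an assumption)."""
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


def robust_central_moment(x, r):
    """lambda_r: decile mean of (x_i - DM_x)^r."""
    dm = decile_mean(x)
    return decile_mean([(xi - dm) ** r for xi in x])


def phi_pair(x):
    """Return (phi_1, phi_2) computed from the robust central moments."""
    lam2, lam3, lam4 = (robust_central_moment(x, r) for r in (2, 3, 4))
    return lam3 ** 2 / lam2 ** 3, lam4 / lam2 ** 2


# Hypothetical sample and a hypothetical change of origin and scale.
x = [51.2, 48.7, 55.9, 60.3, 47.5, 52.8, 49.9, 71.4, 50.6, 53.3, 46.8, 58.1]
A, h = 50.0, 2.5
y = [(xi - A) / h for xi in x]

phi_x = phi_pair(x)
phi_y = phi_pair(y)
invariant = all(
    math.isclose(a, b, rel_tol=1e-6, abs_tol=1e-9) for a, b in zip(phi_x, phi_y)
)
```

The comparison uses a tolerance because the two sides are computed in floating point from differently scaled data.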

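2.2.2. For any set of values $x_1, x_2, \ldots, x_n$, prove that $\phi_2 \ge 1 + \phi_1$

Proof:

Let us recall that

$$\lambda_2 = \mathrm{DM}(x_i - \mathrm{DM}_x)^2, \quad \lambda_3 = \mathrm{DM}(x_i - \mathrm{DM}_x)^3, \quad \lambda_4 = \mathrm{DM}(x_i - \mathrm{DM}_x)^4$$

Consider the following expression:

$$\begin{aligned}
&\mathrm{DM}\left[a(x_i - \mathrm{DM}_x)^2 + b(x_i - \mathrm{DM}_x) + c\right]^2 \ge 0 \\
&a^2\,\mathrm{DM}(x_i - \mathrm{DM}_x)^4 + b^2\,\mathrm{DM}(x_i - \mathrm{DM}_x)^2 + c^2 + 2ab\,\mathrm{DM}(x_i - \mathrm{DM}_x)^3 + 2ac\,\mathrm{DM}(x_i - \mathrm{DM}_x)^2 + 2bc\,\mathrm{DM}(x_i - \mathrm{DM}_x) \ge 0 \\
&a^2\lambda_4 + b^2\lambda_2 + c^2 + 2ab\lambda_3 + 2ac\lambda_2 + 0 \ge 0
\end{aligned}$$

where the last term vanishes because $\lambda_1 = 0$. Choosing $a = 1$, $b = -\lambda_3/\lambda_2$ and $c = -\lambda_2$, the above expression becomes

$$\lambda_4 - \frac{\lambda_3^2}{\lambda_2} - \lambda_2^2 \ge 0$$

Dividing by $\lambda_2^2$ gives $\phi_2 - \phi_1 - 1 \ge 0$, that is,

$$\phi_2 \ge 1 + \phi_1$$

This completes the proof.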

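2.2.3. For any set of values $x_1, x_2, \ldots, x_n$, prove that $\phi_2 \ge 1$

Proof:

Let us recall that

$$\lambda_2 = \mathrm{DM}(x_i - \mathrm{DM}_x)^2, \quad \lambda_4 = \mathrm{DM}(x_i - \mathrm{DM}_x)^4$$

Consider the following expression:

$$\mathrm{DM}\left[a(x_i - \mathrm{DM}_x)^2 + c\right]^2 \ge 0$$
$$a^2\,\mathrm{DM}(x_i - \mathrm{DM}_x)^4 + c^2 + 2ac\,\mathrm{DM}(x_i - \mathrm{DM}_x)^2 \ge 0$$
$$a^2\lambda_4 + c^2 + 2ac\lambda_2 \ge 0$$

Choosing $a = 1$ and $c = -\lambda_2$, the above expression becomes

$$\lambda_4 - \lambda_2^2 \ge 0$$

Hence, $\phi_2 \ge 1$.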

2.3. Robust Jarque–Bera (RJB) Test of Normality

As a result, and following the measures of robust skewness $\phi_1$ and robust kurtosis $\phi_2$ discussed earlier, for a normal PDF $\phi_1 = 0$ and $\phi_2 = 3$; that is, a normal distribution is symmetric and mesokurtic. Therefore, a simple test of normality is to find out whether the computed values of robust skewness $\phi_1$ and robust kurtosis $\phi_2$ depart from the norms of 0 and 3. The test statistic is defined by

$$RJB = n\left[\frac{\phi_1^2}{6} + \frac{(\phi_2 - 3)^2}{24}\right]$$

Under normality, the value of the RJB statistic is expected to be 0. Under the null hypothesis of normality, RJB is distributed as a chi-square ($\chi^2$) statistic with 2 degrees of freedom. If the p value is reasonably high, which happens if the value of the statistic is close to 0, we do not reject normality. But if the computed p value of the RJB statistic in an application is sufficiently low, which happens if the value of RJB is very different from 0, one can reject the hypothesis that the data are normally distributed.
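Putting the pieces of Section 2 together, the RJB test can be sketched end to end as follows. This is an illustrative sketch rather than the author's implementation: the function names and the sample are hypothetical, the deciles use the "inclusive" rule of `statistics.quantiles`, and the p value uses the chi-square survival function with 2 degrees of freedom, exp(−RJB/2). The statistic is coded with $\phi_1$ squared, the form that reproduces the RJB value reported in Table 1 (0.6898 for n = 12, $\phi_1$ = 0.0539 and $\phi_2$ = 1.8303).

```python
import math
from statistics import quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 (inclusive interpolation, an assumption)."""
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


def robust_central_moment(x, r):
    """lambda_r: decile mean of (x_i - DM_x)^r."""
    dm = decile_mean(x)
    return decile_mean([(xi - dm) ** r for xi in x])


def rjb_statistic(n, phi1, phi2):
    """RJB statistic in the form that matches the values tabulated in Table 1."""
    return n * (phi1 ** 2 / 6 + (phi2 - 3) ** 2 / 24)


def chi2_sf_2df(stat):
    """P(chi-square with 2 df > stat); for 2 df this equals exp(-stat / 2)."""
    return math.exp(-stat / 2)


def rjb_test(x):
    """Return (phi_1, phi_2, RJB, p value) for a sample x."""
    lam2, lam3, lam4 = (robust_central_moment(x, r) for r in (2, 3, 4))
    phi1 = lam3 ** 2 / lam2 ** 3
    phi2 = lam4 / lam2 ** 2
    stat = rjb_statistic(len(x), phi1, phi2)
    return phi1, phi2, stat, chi2_sf_2df(stat)


# Proposed values reported in Table 1 for the carrots data (n = 12).
table1_rjb = rjb_statistic(12, 0.0539, 1.8303)
table1_p = chi2_sf_2df(table1_rjb)

# Hypothetical sample, used only to show the end-to-end call.
result = rjb_test([51.2, 48.7, 55.9, 60.3, 47.5, 52.8,
                   49.9, 71.4, 50.6, 53.3, 46.8, 58.1])
```

A decision rule would then compare the returned p value with a chosen significance level; the paper does not state the level it uses, so none is fixed here.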

3. REAL DATA EXAMPLES

In this section, we apply some well-known graphs, the classical measures and tests and our newly proposed measures and tests to real data sets to find out whether the data are normal or not. Let us first consider the weight of a bag of carrots data, taken from [17]. These data consist of 12 observations. When we apply the usual outlier detection method (Med-MAD) [18,19], we find that the data do not contain any outliers. The results of the graphical, classical and newly proposed measures and tests for these data are given below:

Since the original data set is free from outliers, we see from Figure 1 that the density plot is positively skewed, while the QQ-plot is reasonably normal in shape.

Figure 1

A graphical comparison of normality.

Table 1 reports that the data are positively skewed and platykurtic, with a normal shape, based on both the classical and the proposed estimators. Note also that the classical and proposed JB tests lead to the same inference. It is worth mentioning that Hogg and Tanis [17] used only two graphical tests and concluded that the data are normal.

| Measure | $\mu_3/\lambda_3$ | $\beta_1/\phi_1$ | $\beta_2/\phi_2$ | JB/RJB value | p-value | Remarks |
|---|---|---|---|---|---|---|
| Classical | 0.000087 | 0.0457 | 1.9797 | 0.5246 | 0.7692 | Normal |
| Proposed | 0.000080 | 0.0539 | 1.8303 | 0.6898 | 0.7082 | Normal |

Table 1

Results of the four measures for the weight of a bag of carrots data.

Now consider another data set, the diameter of individual grains of soil, such as porosity, data, taken from [17], which contains 30 observations. First, we check for outliers by the usual method (Med-MAD) [18,19]; it detects 2 outliers (cases 6 and 14). Using both the original data and the data with these outliers deleted, we check the normality of the data set by graphical as well as analytical tests of normality; the results are shown below:

Figure 2 leads to two conclusions: first, the density and QQ plots indicate that the data set is positively skewed with normal characteristics when the data contain outliers; second, the graphs look negatively skewed with a non-normal pattern once the data are free from contamination.

Figure 2

A graphical comparison of normality for data with and without outliers.

Table 2 demonstrates that the classical measures $\mu_3$, $\beta_1$ and $\beta_2$ suggest that the data set is positively skewed and platykurtic, and the classical JB test declares the data set normal, when outliers are present in the data set. On the other hand, when extreme values are present in the data set, our newly proposed robust estimators $\lambda_3$, $\phi_1$ and $\phi_2$ indicate that the data set is negatively skewed and platykurtic, and our newly proposed RJB test identifies non-normality. It is important to note that both tests indicate non-normality when the data set is free from unusual observations. Hogg and Tanis [17], however, used only two graphical tests and concluded that the data are normal. Since the classical measures and JB test fail to discover the actual nature and shape of the data distribution, we recommend our newly proposed measures and RJB test as more efficient and robust for correct inference.

| Measure | $\mu_3/\lambda_3$ (WO) | $\mu_3/\lambda_3$ (WOO) | $\beta_1/\phi_1$ (WO) | $\beta_1/\phi_1$ (WOO) | $\beta_2/\phi_2$ (WO) | $\beta_2/\phi_2$ (WOO) | JB/RJB value (WO) | JB/RJB value (WOO) | p-value (WO) | p-value (WOO) | Remarks (WO) | Remarks (WOO) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Classical | 0.000031 | −8.053e−006 | 0.0166 | 0.0037 | 2.941 | 2.044 | 0.0057 | 1.0659 | 0.997 | 0.586 | Normal | Not normal |
| Proposed | −0.00001 | −8.559e−006 | 0.0075 | 0.0067 | 1.919 | 1.794 | 1.4606 | 1.6970 | 0.482 | 0.428 | Not normal | Not normal |

Table 2

Four measures for the diameter of individual grains of soil, such as porosity, data with outliers (WO) and without outliers (WOO).

Next, we consider another real data set, the weight of packaged product data, taken from [17], which consists of 100 observations. First, we check for outliers by the usual method (Med-MAD) [18,19]; it detects 6 outliers (cases 29, 50, 70, 71, 75 and 81). We test the normality of the data set by graphical and analytical methods; the results are shown below:

Figure 3 demonstrates that the data appear positively skewed and non-normal, because the points fall far from the straight line in the QQ-plot, when the data contain extreme values. Conversely, the data appear negatively skewed and normal, because the points fall along the straight line in the QQ-plot, when the data set is free from outliers.

Figure 3

A graphical comparison of normality for data with and without outliers.

Table 3 shows that the classical statistics $\mu_3$, $\beta_1$ and $\beta_2$ suggest that the data are positively skewed and leptokurtic, and the classical JB test recognizes a non-normal pattern, when outliers are present in the data set. In contrast, our newly proposed statistics $\lambda_3$, $\phi_1$ and $\phi_2$ indicate that the data are negatively skewed and platykurtic, and the RJB test gives the exact identification, when the data set contains outliers. Moreover, both the classical and the proposed measures and tests give the right finding after the outliers are removed. It is noteworthy that Hogg and Tanis [17] used only two graphical tests and concluded that the data are normal. Thus, the classical measures and JB test fail to give the correct inference when extreme observations are present in the data set. On the other hand, our newly proposed measures and RJB test give the correct inference whether outliers are present in the data set or not.

| Measure | $\mu_3/\lambda_3$ (WO) | $\mu_3/\lambda_3$ (WOO) | $\beta_1/\phi_1$ (WO) | $\beta_1/\phi_1$ (WOO) | $\beta_2/\phi_2$ (WO) | $\beta_2/\phi_2$ (WOO) | JB/RJB value (WO) | JB/RJB value (WOO) | p-value (WO) | p-value (WOO) | Remarks (WO) | Remarks (WOO) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Classical | 2.688 | −0.6001 | 0.0352 | 0.0054 | 3.442 | 2.933 | 0.836 | 0.0179 | 0.658 | 0.9910 | Not normal | Normal |
| Proposed | −1.015 | −0.7407 | 0.0120 | 0.0164 | 2.650 | 2.670 | 0.512 | 0.4297 | 0.774 | 0.8066 | Normal | Normal |

Table 3

Four measures for the weight of packaged product data with outliers (WO) and without outliers (WOO).

4. REPORT OF MONTE CARLO SIMULATION STUDY

In this section, we report a Monte Carlo simulation study that aims to compare the performance of the newly proposed robust moment $\lambda_3$, skewness $\phi_1$ and kurtosis $\phi_2$ with the corresponding popular and commonly used classical measures. We also compare the power of our newly proposed RJB test with that of the classical JB test. We simulate non-normal as well as normal data from the uniform distribution. In our simulation experiment, we take different sample sizes, n = 50, 100, 200, 500 and 1000. Each experiment is run 10,000 times, and the results of the tests are reported in Tables 4 and 5.

Table 4 shows that the classical measures $\mu_3$, $\beta_1$ and $\beta_2$ give a higher percentage of normal classifications when the data sets are not normal. For the classical normality test JB, the rate of rejecting the alternative hypothesis H1 is very high when H1 is true. In contrast, the proposed measures $\lambda_3$, $\phi_1$ and $\phi_2$ give a very low percentage of normal classifications when the data sets are not normal. For the proposed normality test RJB, the rate of rejecting the alternative hypothesis H1 is very low when H1 is true.

Power (in percentage)

| n | Measure | $\mu_3/\lambda_3$ | $\beta_1/\phi_1$ | $\beta_2/\phi_2$ | JB/RJB |
|---|---|---|---|---|---|
| 50 | Classical | 13.39 | 13.39 | 13.39 | 13.39 |
| 50 | Proposed | 1.31 | 1.31 | 1.31 | 1.31 |
| 100 | Classical | 19.64 | 19.64 | 19.64 | 19.64 |
| 100 | Proposed | 0.97 | 0.97 | 0.97 | 0.97 |
| 200 | Classical | 27.58 | 27.58 | 27.58 | 27.58 |
| 200 | Proposed | 0.053 | 0.053 | 0.053 | 0.053 |
| 500 | Classical | 31.89 | 31.89 | 31.89 | 31.89 |
| 500 | Proposed | 0.0019 | 0.0019 | 0.0019 | 0.0019 |
| 1000 | Classical | 37.49 | 37.49 | 37.49 | 37.49 |
| 1000 | Proposed | 0.0004 | 0.0004 | 0.0004 | 0.0004 |

Table 4

Performance comparison under non-normality.

Table 5 reports that the classical measures $\mu_3$, $\beta_1$ and $\beta_2$ give a very low percentage of normal classifications when the data sets are normal. For the classical normality test JB, the rate of rejecting the null hypothesis H0 is very high when H0 is true. Conversely, the proposed measures $\lambda_3$, $\phi_1$ and $\phi_2$ give a very high percentage of normal classifications when the data sets are normal. For the proposed normality test RJB, the rate of rejecting the null hypothesis H0 is very low when H0 is true.

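Power (in percentage)

| n | Measure | $\mu_3/\lambda_3$ | $\beta_1/\phi_1$ | $\beta_2/\phi_2$ | JB/RJB |
|---|---|---|---|---|---|
| 50 | Classical | 7.60 | 7.60 | 7.60 | 7.60 |
| 50 | Proposed | 93.47 | 93.47 | 93.47 | 93.47 |
| 100 | Classical | 11.81 | 11.81 | 11.81 | 11.81 |
| 100 | Proposed | 98.89 | 98.89 | 98.89 | 98.89 |
| 200 | Classical | 18.99 | 18.99 | 18.99 | 18.99 |
| 200 | Proposed | 100 | 100 | 100 | 100 |
| 500 | Classical | 34.03 | 34.03 | 34.03 | 34.03 |
| 500 | Proposed | 100 | 100 | 100 | 100 |
| 1000 | Classical | 39.68 | 39.68 | 39.68 | 39.68 |
| 1000 | Proposed | 100 | 100 | 100 | 100 |

Table 5

Performance comparison under normality.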

From the above discussion, we conclude that the proposed measures and test give the right outcome both when the data set is normal and when it is not. Overall, the proposed measures and test perform better than the classical measures and test for checking normality.
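A simulation of the kind described in this section could be organized along the following lines. This is only a sketch under stated assumptions and does not attempt to reproduce Tables 4 and 5: the paper reports the sample sizes and the 10,000 replications but not the significance level, the random seed or the distribution parameters, so the 5% level, the seed, the standard uniform and standard normal samplers and the reduced number of replications used here are all hypothetical choices, as are the function names.

```python
import math
import random
from statistics import quantiles


def decile_mean(values):
    """Average of the nine deciles D1..D9 (inclusive interpolation, an assumption)."""
    deciles = quantiles(values, n=10, method="inclusive")
    return sum(deciles) / len(deciles)


def arithmetic_mean(values):
    return sum(values) / len(values)


def jb_type_p_value(x, average):
    """p value of the JB-type statistic built on `average` (mean -> JB, DM -> RJB)."""
    center = average(x)
    # Second, third and fourth moments about the chosen center.
    m2, m3, m4 = (average([(xi - center) ** r for xi in x]) for r in (2, 3, 4))
    skew = m3 ** 2 / m2 ** 3
    kurt = m4 / m2 ** 2
    stat = len(x) * (skew ** 2 / 6 + (kurt - 3) ** 2 / 24)
    return math.exp(-stat / 2)  # chi-square survival function with 2 df


def rejection_rates(sampler, n, reps, alpha=0.05, seed=0):
    """Share of replications in which JB and RJB reject normality at level alpha."""
    rng = random.Random(seed)
    rejected_jb = rejected_rjb = 0
    for _ in range(reps):
        x = [sampler(rng) for _ in range(n)]
        rejected_jb += jb_type_p_value(x, arithmetic_mean) < alpha
        rejected_rjb += jb_type_p_value(x, decile_mean) < alpha
    return rejected_jb / reps, rejected_rjb / reps


# Small hypothetical runs; the paper uses 10,000 replications per sample size.
uniform_rates = rejection_rates(lambda rng: rng.random(), n=50, reps=200)
normal_rates = rejection_rates(lambda rng: rng.gauss(0.0, 1.0), n=50, reps=200)
```

Passing the averaging rule as a parameter keeps the classical and robust statistics on exactly the same code path, so any difference in the rejection rates comes from the choice between the arithmetic mean and the decile mean alone.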

5. CONCLUSION

In summary, our main objective was to propose new robust measures of moments, skewness and kurtosis that represent the data better than existing tools. We also proposed a new version of the Jarque–Bera test statistic of normality, so that it identifies the correct inference more reliably than existing tests. In both cases we have seen that, irrespective of the presence of outliers, our newly proposed robust measures of moments, skewness and kurtosis and the RJB test perform better than the classical measures and tests for different sample sizes. Note that our proposed measures satisfy the various properties and conditions that we proved in the Appendices. The existing graphical and analytical measures and tests of normality considered here fail to identify the appropriate outcome for real data sets and small to moderate sample sizes when outliers are present in the data. Moreover, both the real-life data and the simulation study demonstrate that our newly proposed robust moments, robust skewness, robust kurtosis and RJB test of normality give sounder results in a variety of situations and hence can be recommended as effective measures and test.

CONFLICTS OF INTEREST

The author declares no conflicts of interest.

AUTHORS' CONTRIBUTIONS

Md. Siraj-Ud-Doulah conceived and designed the study, analyzed and interpreted the data, and wrote the manuscript. The final version of the manuscript was reviewed and approved by the author.

ACKNOWLEDGMENTS

The author would like to thank the anonymous referees for their helpful remarks.

REFERENCES

1. S. Rana, M.S.U. Doulah, H. Midi, and A.H.M.R. Imon, Chiang Mai J. Sci., Vol. 39, 2012, pp. 478–485.
2. D.R. Anderson, J.S. Dennis, and A. Thomas, Introduction to Statistics, West Publishing Company, New York, NY, USA, 1981.
3. L.L. Chao, Statistics: Methods and Application, McGraw-Hill, New York, NY, USA, 1980.
5. E.A. Freedman, Statistics, John Wiley & Sons, Sydney, Australia, 1991.
6. J.E. Freund, Statistics: A First Course, Prentice Hall Inc, New York, NY, USA, 1976.
7. H. William Hays, Statistics, Harcourt Brace College Publisher, New York, NY, USA, 1994.
8. R.V. Hogg and J. Ledolter, Applied Statistics for Engineer and Social Scientists, second ed., Macmillan Publishing Company, New York, NY, USA, 1992.
9. L.R. Iman, L. Ronald, and W.I. Conover, A Modern Approach to Statistics, John Wiley & Sons, New York, NY, USA, 1983.
10. Z.A. Karian and E.A. Tanis, Probability and Statistics: Explorations with MAPLE, second ed., Prentice Hall, New Jersey, NJ, USA, 1999.
11. J. Levin and J.A. Fox, Elementary Statistics in Social Research, Longman, New York, NY, USA, 1997.
14. D.N. Gujarati, Basic Econometrics, fourth ed., McGraw Hill, New York, NY, USA, 2010.
17. R.V. Hogg and E.A. Tanis, Probability and Statistical Inference, sixth ed., Prentice Hall, New Jersey, NJ, USA, 2001.
18. V. Barnett and T. Lewis, Outliers in Statistical Data, third ed., Wiley, New York, NY, USA, 1994.
ISSN (Online)
2214-1766
ISSN (Print)
1538-7887