Our Normality: Tested and Explained

Six-Sigma Normality Testing

A great deal of natural data is approximately normally distributed, and most manufacturing process data is too. What does this mean? Well, it simply means that the majority of the data tends to be concentrated around some center point, and we find fewer and fewer data points as we move away from center.

Knowing whether a data set is normally distributed is important because, when it is, we can easily test hypotheses and draw statistical conclusions. In the realm of manufacturing six-sigma, it means that you can improve your process by using various six-sigma tools such as T-tests, F-tests, ANOVAs and Process Capability Studies without having to transform the data.

To check whether your data is normally distributed, a statistical normality test should be conducted. Note that you want to check the normality of the data early on in your project so that you can draw accurate statistical conclusions. A widely used test for this is the Anderson-Darling Normality Test. You can run it with R (available online at R-Project.org), with Minitab, or with Excel-based statistical software packages.

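The key output of the Anderson-Darling Test is a P-value. If the P-value is greater than 0.05, the test has found no significant evidence against normality at the 5% significance level, so it is reasonable to treat your data as normally distributed. (Strictly speaking, a high P-value does not prove normality with 95% confidence; it only means that normality was not rejected.) When your data passes the test, you can confidently move forward with your six-sigma project.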
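To make the P-value less of a black box, here is a minimal sketch of the Anderson-Darling calculation using only the Python standard library. The function name `anderson_darling_normal` and the two sample data sets are made up for illustration. The P-value uses a commonly published approximation (D'Agostino and Stephens), so treat this as a learning aid: packages such as Minitab or R may report slightly different numbers.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (A2, p_value) for an Anderson-Darling test of normality.

    The mean and standard deviation are estimated from the sample itself.
    """
    n = len(data)
    if n < 8:
        # Assumption: the P-value approximation below is usually quoted for n >= 8.
        raise ValueError("need at least 8 observations")

    xs = sorted(data)
    mu, sigma = mean(xs), stdev(xs)  # stdev divides by n - 1
    std_normal = NormalDist()

    # Fitted normal CDF at each ordered point, clamped so the logs stay finite.
    eps = 1e-12
    cdf = [min(max(std_normal.cdf((x - mu) / sigma), eps), 1 - eps) for x in xs]

    # A2 = -n - (1/n) * sum (2i - 1) * [ln F(x_i) + ln(1 - F(x_(n+1-i)))]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n

    # Small-sample adjustment, then the piecewise P-value approximation.
    a = a2 * (1 + 0.75 / n + 2.25 / n**2)
    if a < 0.2:
        p = 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a**2)
    elif a < 0.34:
        p = 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a**2)
    elif a < 0.6:
        p = math.exp(0.9177 - 4.279 * a - 1.38 * a**2)
    elif a <= 13:
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a**2)
    else:
        p = 0.0
    return a2, p


# Hypothetical data: evenly spaced normal quantiles (bell-shaped) and
# evenly spaced exponential quantiles (strongly right-skewed).
bell_shaped = [NormalDist(50, 2).inv_cdf((i + 0.5) / 100) for i in range(100)]
skewed = [-math.log(1 - (i + 0.5) / 100) for i in range(100)]

for name, sample in (("bell-shaped", bell_shaped), ("skewed", skewed)):
    a2, p = anderson_darling_normal(sample)
    verdict = "normality not rejected" if p > 0.05 else "normality rejected"
    print(f"{name}: A2 = {a2:.3f}, P-value = {p:.4f} -> {verdict}")
```

The sample data here is synthetic and deliberately clean. On real process data, use your statistical package of choice for the official result.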

If you conduct the normality test and find that your data is not normally distributed, then you need to determine why it is not a normal distribution. A good starting point is to look at the histogram of your data. What does it look like? Do you see a uniform distribution? Or a bimodal distribution? Or a distribution that looks like half a bell curve? Or is it skewed or distorted? Histogram shapes that don't look like a bell curve could result in a low P-value and hence fail the normality test.
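If you don't have a charting tool handy, a quick text histogram is enough to spot the shapes described above. The sketch below is an illustration only: the helper `text_histogram` and the two-operator data set are hypothetical, and the bin count of 8 is an arbitrary choice.

```python
from statistics import NormalDist


def text_histogram(data, bins=8, width=40):
    """Return (counts, lines) for a simple equal-width text histogram."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all values are identical; nothing to plot")
    step = (hi - lo) / bins

    counts = [0] * bins
    for x in data:
        # The largest value lands exactly on the upper edge, so clamp it
        # into the last bin.
        idx = min(int((x - lo) / step), bins - 1)
        counts[idx] += 1

    peak = max(counts)
    lines = []
    for k, c in enumerate(counts):
        left_edge = lo + k * step
        bar = "#" * round(width * c / peak)
        lines.append(f"{left_edge:8.2f} | {bar} {c}")
    return counts, lines


# Hypothetical example: two operators running the same process around
# different set points, which produces two separate humps.
operator_a = [NormalDist(10, 0.5).inv_cdf((i + 0.5) / 50) for i in range(50)]
operator_b = [NormalDist(14, 0.5).inv_cdf((i + 0.5) / 50) for i in range(50)]
measurements = operator_a + operator_b

counts, lines = text_histogram(measurements)
print("\n".join(lines))
```

Two humps with a gap between them point to a bimodal distribution, which is a cue to look for two different sources feeding the same data set.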

After you look at the histogram, consider whether the distribution shape makes logical sense. If you can’t explain it, then your population data may actually be normally distributed, in which case you'll need to do some exploration to determine whether a root cause is corrupting your data. For example, maybe you are simply lacking data points. If so, then go collect more samples. Or maybe you have machine operators who are running the process differently from each other and this is causing a bimodal distribution. Or possibly lab technicians are testing the samples differently. In these cases, you want to standardize the operating and testing procedures.

So if your data fails the normality test, always ask yourself: does it make sense? Should your data be normal, with data concentrated around a center point and fewer data points further away from center? If, logically speaking, your data should be normal but it’s not, then dig deeper to solve your data problem.

If your data actually does represent a non-normal distribution, then it can be transformed into a useful format. However, this data transformation process requires statistical expertise. If you are not an expert, then buy your company’s black belt a cup of coffee and pick their brain to learn more about it. (Note that if your data is flawed and you transform it, then you are compounding an error, and you will draw false conclusions in your project.)
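As a taste of what a transformation does, the sketch below applies a natural-log transform to right-skewed data and compares the skewness before and after. This is an assumption-laden illustration, not a recommendation: the log transform is only one common option, it only works on strictly positive values, and the data set and the `sample_skewness` helper are made up for the example. Your black belt may well choose a different transformation for your process.

```python
import math
from statistics import NormalDist, mean, pstdev


def sample_skewness(data):
    """Return the skewness of the data (0 for a perfectly symmetric sample)."""
    m, s = mean(data), pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)


# Hypothetical right-skewed measurements (evenly spaced lognormal quantiles).
raw = [math.exp(NormalDist(0, 0.8).inv_cdf((i + 0.5) / 100)) for i in range(100)]

# The log transform requires strictly positive values.
if min(raw) <= 0:
    raise ValueError("log transform needs strictly positive data")
transformed = [math.log(x) for x in raw]

print(f"skewness before transform: {sample_skewness(raw):.3f}")
print(f"skewness after transform:  {sample_skewness(transformed):.3f}")
```

A skewness near zero after the transform suggests the transformed data is much more symmetric. You would still rerun the normality test on the transformed values before using them.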

Summary:

  1. Collect your data.
  2. Run the Anderson-Darling Normality Test.
  3. If not normal, troubleshoot your data and transform it if necessary.
  4. If normal then move forward with your project.

A six-sigma process yields happy customers and high profits!

 
