The following steps describe how to conduct a hypothesis test for a difference in means. The same steps apply to a hypothesis test on any other population parameter that a Black Belt may conduct.
1. Define the problem or issue to be studied.
2. Define the objective.
3. State the null hypothesis, identified as H0.
- The null hypothesis is a statement of no difference between the before and after states (similar to a defendant being presumed not guilty in court).
H0: μbefore = μafter
The goal of the test is to either reject or not reject H0.
4. State the alternative hypothesis, identified as Ha.
- The alternative hypothesis is what the Black Belt is trying to prove and can be one of the following:
  - Ha: μbefore ≠ μafter (a two-sided test)
  - Ha: μbefore < μafter (a one-sided test)
  - Ha: μbefore > μafter (a one-sided test)
- The alternative chosen depends on what the Black Belt is trying to prove. In a two-sided test, it is important to detect differences from the hypothesized mean, μbefore, that lie on either side of μbefore. The α risk in a two-sided test is split between the two tails of the distribution. In a one-sided test, it is only important to detect a difference on one side or the other.
5. Determine the practical difference (δ).
- The practical difference is the meaningful difference the hypothesis test should detect.
6. Establish the α and β risks for the test.
7. Determine the number of samples needed to obtain the desired β risk.
- Remember that the power of the test is (1 - β).
8. Collect the samples and conduct the test to determine a p-value.
- Use a software package to analyze the data and determine a p-value.
9. Compare the p-value to the decision criterion (α risk) and determine whether to reject H0 in favor of Ha, or not to reject H0.
- If the p-value is less than the α risk, then reject H0 in favor of Ha.
- If the p-value is greater than the α risk, there is not enough evidence to reject H0.
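Steps 7 and 9 can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular software package: it assumes the common normal-approximation formula for comparing two means with equal group sizes and a known common standard deviation, and the function names and the values of `sigma` and `delta` in the usage lines are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(sigma, delta, alpha=0.05, beta=0.10, two_sided=False):
    """Approximate sample size per group for comparing two means.

    Assumes a normal approximation: n = 2 * ((z_alpha + z_beta) * sigma / delta)^2.
    A t-based calculation usually gives a slightly larger n.
    """
    z = NormalDist()
    tail = alpha / 2 if two_sided else alpha  # two-sided tests split the alpha risk
    z_alpha = z.inv_cdf(1 - tail)
    z_beta = z.inv_cdf(1 - beta)              # power of the test is 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


def decide(p_value, alpha=0.05):
    """Step 9: compare the p-value to the alpha risk."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical inputs: standard deviation 2.0, practical difference 1.0.
n_one_sided = samples_per_group(sigma=2.0, delta=1.0)
n_two_sided = samples_per_group(sigma=2.0, delta=1.0, two_sided=True)
print(n_one_sided, n_two_sided, decide(0.01), decide(0.20))
```

Because the two-sided test splits the α risk across both tails, it needs more samples than the one-sided test to reach the same β risk.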
The risks associated with making an incorrect decision are described in the following table.
Decision Table
If the correct answer is: | If the decision is: H0 | If the decision is: Ha
H0 | Right Decision | α Risk (Type I Error)
Ha | β Risk (Type II Error) | Right Decision
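The four cells of the decision table can also be written as a small lookup. This is only an illustrative sketch; the function name `classify_outcome` and its boolean arguments are hypothetical.

```python
def classify_outcome(h0_is_true: bool, rejected_h0: bool) -> str:
    """Map the true state and the test decision to a cell of the decision table."""
    if h0_is_true and rejected_h0:
        return "Type I error (alpha risk)"   # H0 is correct, but Ha was chosen
    if not h0_is_true and not rejected_h0:
        return "Type II error (beta risk)"   # Ha is correct, but H0 was kept
    return "right decision"


for truth in (True, False):
    for rejected in (False, True):
        print(truth, rejected, classify_outcome(truth, rejected))
```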
Depending on the population parameter of interest, there are different types of hypothesis tests; these types are described in the following table.
Note: The table is divided into two sections: parametric and non-parametric. Parametric tests are used when the underlying distribution of the data is known or can be assumed (e.g., the data used for t-testing should follow the normal distribution). Non-parametric tests are used when there is no assumption of a specific underlying distribution of the data.
Different Hypothesis Tests
Category | Hypothesis Test | Underlying Distribution | Purpose
Parametric (assumes the data follows a distribution) | 1 Sample t-Test | Normal | Compares one sample average to a historical average or target
Parametric | 2 Sample t-Test | Normal | Compares two independent sample averages
Parametric | Paired t-Test | Normal | Compares two dependent sample averages
Parametric | Test for Equal Variances | Chi-square | Compares two or more independent sample variances or standard deviations
Parametric | 1 Proportion Test | Binomial | Compares one sample proportion (percentage) to a historical average or target
Parametric | 2 Proportion Test | Binomial | Compares two independent proportions
Parametric | Chi-square Goodness of Fit | Chi-square | Determines whether a data set fits a known distribution
Parametric | Chi-square Test for Independence | Chi-square | Determines whether probabilities classified for one variable are associated with the classification of a second
Non-Parametric (makes no assumption about the underlying distribution of the data) | 1 Sample Sign Test | None | Compares one sample median to a historical median or target
Non-Parametric | Mann-Whitney Test | None | Compares two independent sample medians
2 Sample t-Test Example:
A Black Belt is interested in determining whether temperature has an impact on the yield of a process. The current process runs at 100℃ and results in a nominal yield of 28 kg. The Black Belt would like to change the temperature to 110℃ with the hope of detecting a 3-kg increase in output. The null hypothesis is defined as:
H0: μ100℃ ≥ μ110℃ (one-sided)
and the alternative hypothesis is chosen as:
Ha: μ100℃ < μ110℃ (one-sided)
The practical difference the Black Belt would like to detect is 3 kg (an increase to 31 kg). The test is conducted with α and β risks of 5% and 10%, respectively. To achieve a β risk of 10%, twenty-one samples need to be collected at both 100℃ and 110℃. Twenty-one samples were collected at 100℃, the process temperature was changed to 110℃, and twenty-one more samples were collected. The respective averages and standard deviations were 28.2 and 3.2 kg at 100℃, and 32.4 and 3.2 kg at 110℃. The data was entered into a software program and the p-value was determined to be 0.01. After comparing the p-value (0.01) to the α risk (0.05), H0 is rejected in favor of Ha, because the 1% risk of wrongly concluding that the yield at 110℃ is higher is smaller than the 5% risk the Black Belt was willing to take.
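The test statistic behind this example can be sketched from the summary statistics alone. The sketch below assumes a pooled (equal-variance) two-sample t-test, which the text does not confirm is the variant the software used, and it decides by comparing t to a one-sided critical value taken from a standard t table (1.684 for 40 degrees of freedom at α = 0.05) instead of computing a p-value. Since only rounded summaries are given here, a p-value computed from them need not match the reported 0.01.

```python
from math import sqrt


def pooled_two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for the difference mean_b - mean_a with a pooled variance."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / std_err, df


# Summary statistics from the example: 100 C first, then 110 C.
t_stat, df = pooled_two_sample_t(28.2, 3.2, 21, 32.4, 3.2, 21)

# One-sided critical value for alpha = 0.05 and df = 40 (standard t-table value).
T_CRIT_ONE_SIDED = 1.684

# Ha is that the mean at 110 C is larger, so reject H0 for large positive t.
reject_h0 = t_stat > T_CRIT_ONE_SIDED
print(round(t_stat, 2), df, reject_h0)
```

The statistic lies well above the critical value, which agrees with the decision in the example to reject H0.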
2 Proportion Test Example:
A Black Belt is interested in determining whether a new method of processing forms will result in fewer defective forms. The old method resulted in 5.2% defectives. The Black Belt would like to change to a new method with the hope of reducing the percent defectives to 2.0%. The null hypothesis is defined as:
H0: Pold method ≤ Pnew method
and the alternative hypothesis is chosen as:
Ha: Pold method > Pnew method
The practical difference the Black Belt would like to detect is a reduction of 3.2 percentage points. The test will be conducted with α and β risks of 5% and 10%, respectively. To achieve a β risk of 10%, 579 forms need to be collected for each of the old and new methods; therefore, 579 samples were collected under the old method, the new method was implemented, and 579 more samples were collected. The respective percentages were 5.2% (thirty defectives) and 2.9% (seventeen defectives). The data was entered into a software program and the p-value was determined to be 0.026. Comparing the p-value (0.026) to the α risk (0.05) results in the conclusion that H0 should be rejected in favor of Ha.
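The numbers in this example can be checked with a short sketch. It assumes a one-sided two-proportion z-test with a pooled standard error, and a normal-approximation sample-size formula that uses the pooled proportion under H0. The text does not say which formulas its software used, so both choices, along with the function names, are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def two_proportion_p_value(x_old, n_old, x_new, n_new):
    """One-sided p-value for Ha: P_old > P_new, using a pooled standard error."""
    p_old, p_new = x_old / n_old, x_new / n_new
    pooled = (x_old + x_new) / (n_old + n_new)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = (p_old - p_new) / std_err
    return 1 - Z.cdf(z)


def forms_per_method(p_old, p_new, alpha=0.05, beta=0.10):
    """Approximate one-sided sample size per group (normal approximation)."""
    p_bar = (p_old + p_new) / 2
    z_alpha = Z.inv_cdf(1 - alpha)
    z_beta = Z.inv_cdf(1 - beta)
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_old * (1 - p_old) + p_new * (1 - p_new)))
    return ceil((numerator / (p_old - p_new)) ** 2)


p_value = two_proportion_p_value(30, 579, 17, 579)
n_required = forms_per_method(0.052, 0.020)
print(round(p_value, 3), n_required)
```

Under these assumptions the sketch gives a p-value of about 0.026 and a required sample size of 579 forms per method, matching the figures in the example.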