# A heavy hypothesis test

edited June 20

(5 pt)

Our CDC data set has just over 1000 men who are 5'9''. We can pack them into a data frame and take a sample of size 100 from that data frame as follows:

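``````
import pandas as pd

# Load the CDC data set, then keep the men who are 69 inches (5'9'') tall
df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=5)
print(len(five9))
``````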
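|       | genhlth   | exerany | hlthplan | smoke100 | height | weight | wtdesire | age | gender |
|-------|-----------|---------|----------|----------|--------|--------|----------|-----|--------|
| 838   | excellent | 1       | 1        | 1        | 69     | 160    | 160      | 31  | m      |
| 13979 | good      | 1       | 1        | 1        | 69     | 150    | 165      | 52  | m      |
| 16025 | fair      | 1       | 1        | 1        | 69     | 190    | 150      | 53  | m      |
| 14550 | excellent | 1       | 1        | 0        | 69     | 200    | 180      | 29  | m      |
| 4765  | fair      | 0       | 1        | 1        | 69     | 235    | 180      | 56  | m      |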

The CDC recommends that the average weight of men at this height be 165 pounds, but we suspect that it might be more.

Your exercise is to use the code above (with your special number as the `random_state`) to grab a sample of size 100 and test the null hypothesis that the average weight of men at 5'9'' is 165 against the alternative hypothesis that the actual mean is larger.

Thus, my alternative hypothesis is:

H_A: mu > 165
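
The test described above can be sketched as a small function. This is only a minimal sketch, not a reference solution: the function name `one_sided_z_test` and the synthetic weights are assumptions made for illustration, and in the exercise the input would be `sample.weight`.

```python
import numpy as np
from scipy.stats import norm

def one_sided_z_test(values, mu0, alpha=0.05):
    """Upper-tailed z-test of H_0: mu = mu0 against H_A: mu > mu0."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    m = values.mean()
    s = values.std(ddof=1)     # sample standard deviation, like pandas' .std()
    se = s / np.sqrt(n)        # standard error of the mean
    z = (m - mu0) / se
    p = norm.sf(z)             # upper-tail probability P(Z > z)
    return {"mean": m, "se": se, "z": z, "p": p, "reject": bool(p < alpha)}

# Synthetic stand-in for sample.weight (NOT the CDC data)
rng = np.random.default_rng(0)
fake_weights = rng.normal(loc=175, scale=25, size=100)
result = one_sided_z_test(fake_weights, mu0=165)
print(result)
```

The null hypothesis is rejected when the returned `p` falls below `alpha`.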

• edited June 20

H_0: mu = 165
H_A: mu > 165

With a 95% level of confidence
Therefore alpha = 0.05

``````import pandas as pd
import numpy as np
from scipy.stats import norm

df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=9)
print(len(five9))
``````

``````m=sample.weight.mean()
s=sample.weight.std()
se=s/np.sqrt(100)
[m,s,se]
``````

[176.33, 23.76486456680929, 2.376486456680929]

``````z = (m-165)/se
z
``````

4.767542423037337

``````norm.cdf(-z)
``````

9.324336787626066e-07

At a 95% confidence level (alpha = 0.05):

p = 9.324336787626066e-07 < 0.05

Therefore, we reject the null hypothesis.
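
The arithmetic in this post can be re-checked from the reported summary statistics alone. The sketch below uses `norm.sf`, the survival function, which gives the upper-tail probability directly; it is the same quantity as `norm.cdf(-z)`.

```python
import numpy as np
from scipy.stats import norm

# Summary statistics as reported in the post above
m = 176.33
s = 23.76486456680929
n = 100

se = s / np.sqrt(n)    # standard error of the mean
z = (m - 165) / se     # test statistic under H_0: mu = 165
p = norm.sf(z)         # upper-tail probability P(Z > z)
print(z, p)
```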

• edited June 20

Hypotheses:

H_0: mu=165

H_A: mu>165

Data Set:

``````
import pandas as pd
import numpy as np

df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=1)
print(len(five9))
``````

Mean:

``````m= df.weight.mean()
m
``````

=169.68295

Standard Deviation:

``````sd= df.weight.std()
sd
``````

=40.080969967120254

Standard Error:

``````se= sd/np.sqrt(100)
se
``````

=4.008096996712025

Z Score:

``````z= (m-165)/se
z
``````

=1.1683724230829704

P Value:

P(Z > 1.16) = 0.123

P Value (0.123) > Significance Level (.05)

So, we fail to reject H_0
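
Several of the computations in this thread call `df.weight`, which covers every row of the data set, while the exercise asks about `sample.weight`. A minimal sketch of the difference follows. The data frame here is synthetic (random values, not the CDC data), so only the row counts and the pattern of the calculation carry over.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CDC data frame (NOT the real data)
rng = np.random.default_rng(1)
size = 20000
df = pd.DataFrame({
    "gender": rng.choice(["m", "f"], size=size),
    "height": rng.integers(60, 78, size=size),
    "weight": rng.normal(170, 40, size=size).round(),
})

df_men = df[df.gender == "m"]
five9 = df_men[df_men.height == 69]
sample = five9.sample(100, random_state=1)

# Statistics of the whole data set (all heights, both genders)
print(len(df), df.weight.mean())

# Statistics of the 100 sampled men who are 5'9''
m = sample.weight.mean()
s = sample.weight.std()
se = s / np.sqrt(len(sample))
print(len(sample), m, se)
```

Dividing the standard deviation of all rows by `sqrt(100)` mixes the two: the spread comes from the full data set while the sample size comes from the sample.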

• edited June 20
``````
import pandas as pd
import numpy as np

df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=4)
print(len(five9))
``````

mean:

``````
m = df.weight.mean()
m
``````

=169.68295

standard deviation

``````
sd = df.weight.std()
sd
``````

=40.080969967120254

standard error

``````se = sd/np.sqrt(100)
se
``````

=4.008

Z score

``````z = (m-165)/se
z
``````

=1.167

H_0: mu = 165
H_A: mu > 165

P value:
P(Z > 1.16) = .123
P value (.123) > significance level (.05)

we fail to reject H_0

• edited June 20

H_0: mu = 165
H_A: mu > 165

Mean, Standard Deviation, Number of Observations

``````weight = df.weight
m = weight.mean()
s = weight.std()
n = len(weight)
[m,s,n]
``````

[169.68295, 40.080969967120254, 20000]

Standard Error

``````se = s/10
se
``````

4.008096996712025

Z Score

``````z= (m-165)/se
z
``````

1.1683724230829704

P(Z > 1.17) = .1210

P Value (.1210) > Significance Level (.05)

So, we fail to reject H_0
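
The posts above read the p-value from a table after rounding z to 1.16 or 1.17, which is why they report 0.123 and 0.1210. As a sketch, `norm.sf` from SciPy gives the same upper-tail probabilities without rounding z first:

```python
from scipy.stats import norm

z = 1.1683724230829704   # z-score reported above

for label, value in [("rounded to 1.16", 1.16),
                     ("rounded to 1.17", 1.17),
                     ("unrounded", z)]:
    # norm.sf(value) is the upper-tail probability P(Z > value)
    print(label, norm.sf(value))
```

All three values are well above 0.05, so the choice of rounding does not change the decision here.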

• edited June 20
``````
import pandas as pd
import numpy as np
from numpy import sqrt
from scipy.stats import norm

df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=10)
print(len(five9))
``````

With a 95% level of confidence, our alpha value equals 0.05

``````m = np.mean([160, 150, 170, 180, 160])
m
``````

mean=164

``````s = np.std([160, 150, 170, 180, 160])
s
``````

standard deviation=10.198039027185569

``````se = s/sqrt(100)
se
``````

standard error=1.0198039027185568

``````z = (m-165)/se
z
``````

z-score=-0.9805806756909203

P(Z < -0.98) = 0.1635, so the upper-tail p-value is P(Z > -0.98) = 1 - 0.1635 = 0.8365

p-value (0.8365) > significance level (0.05)

Therefore, we fail to reject the null hypothesis.

H_0: mu = 165
H_A: mu > 165
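
One detail worth checking in this post is which standard deviation is being computed. `np.std` divides by n by default, while pandas' `.std()` divides by n - 1, so the two disagree on the same values. A short sketch using the five numbers typed in above:

```python
import numpy as np
import pandas as pd

values = [160, 150, 170, 180, 160]

pop_sd = np.std(values)               # divides by n (ddof=0), NumPy's default
sample_sd = np.std(values, ddof=1)    # divides by n - 1
pandas_sd = pd.Series(values).std()   # pandas uses ddof=1 by default

print(pop_sd, sample_sd, pandas_sd)
```

The first value matches the 10.198 reported above; the other two are the sample standard deviation that `sample.weight.std()` would return for the same numbers.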

• edited June 20

H_0: mu=165
H_A: mu>165

Data Set imported from:

``````
import pandas as pd

df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=3)
print(len(five9))
``````

The Mean and Standard Deviation are:

``````s = df.weight.std()
m = df.weight.mean()
[m,s]
``````

[169.68295, 40.080969967120254]

The Standard Error is:

``````
import numpy as np

se = s/np.sqrt(100)   # divide by sqrt(n), with n = 100
se
``````

4.008096996712025

The Z score is 1.168 (about 1.17), with a cumulative probability of 0.8790.

P value is: (1-0.8790) = 0.121

P = 0.121 > 0.05, therefore I fail to reject the null hypothesis.

• edited June 20

``````
import pandas as pd

df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=7)
print(len(five9))
``````

Mean:
m = sample.weight.mean()
m
178.98

Standard Deviation:
sd = sample.weight.std()
sd
25.36599967267088

Standard Error:
se=2.53659
se
2.53659

z score:
z=(m-165)/se
z
5.511336085059072

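H_0: mu = 165
H_A: mu > 165
The z-score of 5.51 is far above the critical value of about 1.645 for a one-sided test at alpha = 0.05.
Thus, I reject the null hypothesis.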
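
This post stops at the z-score, so the decision can be backed up either with a critical value or with a p-value. A minimal sketch of both, assuming the usual alpha = 0.05 for a one-sided test:

```python
from scipy.stats import norm

alpha = 0.05
z = 5.511336085059072          # z-score reported above

z_crit = norm.ppf(1 - alpha)   # critical value for an upper-tailed test, about 1.645
p = norm.sf(z)                 # upper-tail p-value P(Z > z)

print(z_crit, p, z > z_crit)
```

Both routes agree: z exceeds the critical value, and equivalently the p-value is below alpha.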

• Hypothesis:
H_0: mu = 165
H_A: mu > 165

Data set:

``````import pandas as pd

df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=5)
print(len(five9))
``````

Mean:

``````m= df.weight.mean()
m
``````

=169.68295

Standard Deviation:

``````sd= df.weight.std()
sd
``````

=40.080969967120254

Standard Error:

``````
m=sample.weight.mean()
se=sample.weight.std()/(10)
[m,se]
``````

=[180.77, 2.9344300635693807]

Margin of Error

``````me=2*se
me
``````

=5.8688601271387615

Z score:

``````z=(m-165)/se
z
``````

=5.374127056488

The Z-score is 5.374127056488, which is far above the critical value of about 1.645 for a one-sided test at the 0.05 level, therefore we can reject the null.
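
The margin of error computed in this post can also be turned into an interval around the sample mean. The sketch below uses the reported numbers and the same multiplier of 2 (a rough stand-in for 1.96). Note that this is a two-sided interval, so it is a companion check rather than the one-sided test itself.

```python
m = 180.77                 # sample mean reported above
se = 2.9344300635693807    # standard error reported above

me = 2 * se                # margin of error, using 2 in place of 1.96
lower, upper = m - me, m + me

print(lower, upper)
print(lower > 165)         # True when 165 lies below the whole interval
```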