A heavy hypothesis test

edited June 20 in Assignments

(5 pt)

Our CDC data set has just over 1000 men who are 5'9''. We can pack them into a data frame and take a sample of size 100 from that data frame as follows:

import pandas as pd

df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
df_men = df[df.gender=='m']
five9 = df_men[df_men.height==69]
sample = five9.sample(100, random_state=5)
print(len(five9))
sample.head()
         genhlth  exerany  hlthplan  smoke100  height  weight  wtdesire  age gender
838    excellent        1         1         1      69     160       160   31      m
13979       good        1         1         1      69     150       165   52      m
16025       fair        1         1         1      69     190       150   53      m
14550  excellent        1         1         0      69     200       180   29      m
4765        fair        0         1         1      69     235       180   56      m

The CDC recommends that the average weight of men at this height be 165 pounds, but we suspect that it might be more.

Your exercise is to use the code above (with your special number as the random_state) to grab a sample of size 100 and test the null hypothesis that the average weight of men at 5'9'' is 165 pounds against the alternative hypothesis that the actual mean is larger.

Thus, my alternative hypothesis is:

%H_A: mu > 165%
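
The whole procedure can be sketched as one small helper. This is only a minimal sketch, not an official solution: the function name `one_sided_z_test` and the simulated weights in the usage lines are hypothetical, invented so that the block runs without downloading the CDC file. With the real data you would pass `sample.weight` instead.

```python
import numpy as np
from scipy.stats import norm


def one_sided_z_test(values, mu0, alpha=0.05):
    """Upper-tailed z-test of H_0: mu = mu0 against H_A: mu > mu0."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    m = values.mean()
    s = values.std(ddof=1)   # sample standard deviation, matching pandas' .std()
    se = s / np.sqrt(n)      # standard error of the mean
    z = (m - mu0) / se       # test statistic
    p = norm.sf(z)           # upper-tail area P(Z > z), same as norm.cdf(-z)
    return {"mean": m, "se": se, "z": z, "p": p, "reject": bool(p < alpha)}


# Hypothetical stand-in for sample.weight: 100 simulated weights, NOT CDC data
rng = np.random.default_rng(0)
toy_weights = rng.normal(loc=175, scale=25, size=100)

result = one_sided_z_test(toy_weights, mu0=165)
print(result)
```

The statistics are computed from the 100 sampled values only, which is why the standard error divides by the square root of the sample size rather than by anything tied to the full data set.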

Comments

  • edited June 20

    %H_0: mu = 165%
    %H_A: mu > 165%

    At the 95% level of confidence, the significance level is %alpha = 0.05%.

    import pandas as pd
    import numpy as np
    from scipy.stats import norm
    
    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=9)
    print(len(five9))
    sample.head()
    

    m=sample.weight.mean()
    s=sample.weight.std()
    se=s/np.sqrt(100)
    [m,s,se]
    

    [176.33, 23.76486456680929, 2.376486456680929]

    z = (m-165)/se
    z
    

    4.767542423037337

    norm.cdf(-z)
    

    9.324336787626066e-07

    At the 5% significance level, the p-value is far below the cutoff:

    9.324336787626066e-07 < 0.05

    Therefore, we reject the null hypothesis.

    mark
  • edited June 20

    Hypotheses:

    %H_0: mu=165%

    %H_A: mu>165%

    Data Set:

    import pandas as pd
    import numpy as np
    
    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=1)
    print(len(five9))
    sample.head()
    

    Mean:

    # Note: df.weight spans the full data set, not the 100-row sample drawn above
    m = df.weight.mean()
    m
    

    =169.68295

    Standard Deviation:

    sd= df.weight.std()
    sd
    

    =40.080969967120254

    Standard Error:

    se= sd/np.sqrt(100)
    se
    

    =4.008096996712025

    Z Score:

    z= (m-165)/se
    z
    

    =1.1683724230829704

    P Value:

    %P(Z > 1.16) = 0.123%

    p-value (0.123) > significance level (0.05)

    So, we fail to reject %H_0%

    mark
  • benben
    edited June 20
    import pandas as pd
    import numpy as np
    
    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=4)
    print(len(five9))
    sample.head()
    

    mean:

    m = df.weight.mean()  # stored as m because the z-score step below uses it
    

    =169.68296

    standard deviation

    sd = df.weight.std()  # stored as sd because the standard error step below uses it
    

    =40.080969967120254

    standard error

    se = sd/np.sqrt(100)
    se
    

    =4.008

    Z score

    z = (m-165)/se
    z
    

    =1.167

    %H_0:mu=165%
    %H_A:mu>165%

    P value:
    %P(Z > 1.16) = .13%
    p-value (.13) > significance level (.05)

    we fail to reject %H_0%

    mark
  • edited June 20

    %H_0:mu=165%
    %H_A:mu>165%

    Mean, Standard Deviation, Population

    weight = df.weight
    m = weight.mean()
    s = weight.std()
    n = len(weight)
    [m,s,n]
    

    [169.68295, 40.080969967120254, 20000]

    Standard Error

    se = s/10
    se
    

    4.008096996712025

    Z Score

    z= (m-165)/se
    z
    

    1.1683724230829704

    %P(Z > 1.17) = .1210%

    p-value (.1210) > significance level (.05)

    So, we fail to reject %H_0%

    mark
  • edited June 20
    import pandas as pd
    import numpy as np
    from numpy import sqrt
    from scipy.stats import norm
    
    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=10)
    print(len(five9))
    sample.head()
    

    With a 95% level of confidence, our alpha value equals 0.05

    m = np.mean([160, 150, 170, 180, 160])
    m
    

    mean=164

    s = np.std([160, 150, 170, 180, 160])
    s
    

    standard deviation=10.198039027185569

    se = s/sqrt(100)
    se
    

    standard error=1.0198039027185568

    z = (m-165)/se
    z
    

    z-score=-0.9805806756909203

    %P(Z < -0.98) = 0.1635%, so the upper-tail p-value is 1 - 0.1635 = 0.8365

    p-value (0.8365) > significance level (0.05)

    Therefore, we fail to reject the null hypothesis.

    %H_0: mu = 165%
    %H_A: mu > 165%

    mark
  • edited June 20

    %H_0: mu=165%
    %H_A: mu>165%

    Data Set imported from:

    import pandas as pd
    
    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=3)
    print(len(five9))
    sample.head()
    

    The Mean and Standard Deviation are:

    s = df.weight.std()
    m = df.weight.mean()
    [m,s]
    

    [169.68295, 40.080969967120254]

    The Standard Error is:

    se = s/10  # 10 = sqrt(100), the square root of the sample size
    se
    

    4.008096996712025

    The Z score is: 1.168 with a Probability of 0.8790 .

    P value is: (1-0.8790)= 0.121

    P = 0.121 > 0.05, therefore I fail to reject the null hypothesis.

    mark
  • edited June 20

    import pandas as pd

    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=7)
    print(len(five9))
    sample.head()
    

    Mean:
    m = sample.weight.mean()
    m
    %178.98%

    Standard Deviation:
    sd = sample.weight.std()
    sd
    %25.36599967267088%

    Standard Error:
    se=2.53659  # standard deviation divided by sqrt(100) = 10, truncated
    se
    %2.53659%

    z score:
    z=(m-165)/se
    z
    %5.511336085059072%

    %H_0:mu=165%
    %H_A:mu>165%
    A z-score of about 5.51 is far above the one-sided 5% critical value of roughly 1.645, so I reject the null hypothesis.

    mark
  • Hypotheses:
    %H_0:mu=165%
    %H_A:mu>165%

    Data set:

    import pandas as pd
    
    df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
    df_men = df[df.gender=='m']
    five9 = df_men[df_men.height==69]
    sample = five9.sample(100, random_state=5)
    print(len(five9))
    sample.head()
    

    Mean:

    m= df.weight.mean()
    m
    

    =169.68295

    Standard Deviation:

    sd= df.weight.std()
    sd
    

    =40.080969967120254

    Standard Error:

    m=sample.weight.mean()
    se=sample.weight.std()/(10)
    [m,se]
    

    =[180.77, 2.9344300635693807]

    Margin of Error

    me=2*se
    me
    

    =5.8688601271387615

    Z score:

    z=(m-165)/se
    z
    

    =5.374127056488

    The Z-score is 5.374127056488, well above the one-sided 5% critical value of roughly 1.645, so we can reject the null hypothesis.

    mark
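
Several of the answers above read the tail probability from a table or stop at the z-score. As a hedged sketch, `scipy.stats.norm` can give both the upper-tail p-value and the one-sided critical value directly. The z-scores below are the ones reported in the comments; the variable names are only illustrative.

```python
from scipy.stats import norm

alpha = 0.05
z_crit = norm.ppf(1 - alpha)  # one-sided critical value, roughly 1.645

# z-scores reported in the comments above
z_scores = [4.767542423037337, 1.1683724230829704, 5.374127056488]

# Upper-tail p-value P(Z > z) for each z-score
p_values = [norm.sf(z) for z in z_scores]

for z, p in zip(z_scores, p_values):
    decision = "reject H_0" if p < alpha else "fail to reject H_0"
    print(f"z = {z:.3f}, p = {p:.3g}, {decision}")
```

For an upper-tailed test, comparing the p-value with alpha and comparing z with the critical value always lead to the same decision, so either check is enough.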