An archive of Mark's Fall 2017 Intro Stat course.

# Examining a random proportion (for 8:00 AM section)

mark

(5 pts)

df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=mark")

#     first_name last_name age gender height weight income smoke100 exerany handedness
# 1      Donna     Dinan  35 female  65.37 164.26   1947        N       Y          R
# 2      Ramon     Davis  20   male  66.59 139.53  22747        Y       Y          R
# 3       Mark      Buss  23   male  74.58 124.21  15489        N       Y          R
# 4      Lidia    Elmore  52 female  63.87 153.64   8369        N       Y          R

1. Use this data to find a 95% confidence interval for the proportion of people who are left handed.
2. Supposedly, about 12% of the population is left-handed. Use this data to perform a hypothesis test of that statement.

I guess your confidence interval should look like this: [0.0268, 0.1332].
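
A minimal sketch of how an interval like this can be computed. It assumes a sample of 100 people of whom 8 are left handed; those counts are an assumption, chosen because they match the midpoint of the interval above. The helper name `prop_ci` is hypothetical.

```r
# Normal-approximation confidence interval for a proportion.
# prop_ci is a hypothetical helper name; x = number of "successes", n = sample size.
prop_ci = function(x, n, level = 0.95) {
  phat = x / n
  se = sqrt(phat * (1 - phat) / n)
  z = qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% level
  c(phat - z * se, phat + z * se)
}

# Assumed counts: 8 left handed out of 100
ci = prop_ci(8, 100)
round(ci, 4)
```

Replacing `z` with the rule-of-thumb multiplier 2 gives a slightly wider interval.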

asiarenee5
df=read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=asiarenee5")

first_name last_name age gender height weight income smoke100 exerany handedness
1   Nannette    Mounts  31 female  61.31 187.23   1210        Y       Y          L
2      Keith   Micheau  28   male  70.63 150.21  27734        Y       Y          R
3   Chantell  Alvarado  25 female  66.22 145.85   6424        N       N          L
4     Arthur     Allen  44   male  67.50 209.12   3734        N       N          R
5     Sheryl   Bednorz  32 female  61.29 138.08  24255        N       N          R
6      Debra Rodriguez  34 female  64.65 108.44   9326        N       Y          L
> table(df$handedness)
 L  R
12 88
> p=0.12
> se=sqrt(p*(1-p)/100)
> c(p-2*se,p+2*se)
[1] 0.05500769 0.18499231
phat=0.12
p0= 0.14
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.05610010

The 95% confidence interval for the proportion of people who are left handed is [0.05500769, 0.18499231]. Performing a hypothesis test with a mean of 0.14 and phat of 0.12, I cannot reject my hypothesis that 14% of the population is left handed. My p value of 0.05610010 is not less than 0.05.

jkelso
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=jkelso")
table(df$handedness)
p=.19
se=sqrt(p*(1-p)/100)
c(p-1.96*se, p+1.96*se)
[1] 0.113109 0.266891

L R
19 81

1. The confidence interval is (.1131,.2669)

phat=.19
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9843839

2. The P-value is not less than .05, so we cannot reject that 12% of the population is left handed.
GetSwifty

that was no good

  first_name  last_name age gender height weight income smoke100 exerany handedness
1    Phyllis      Smith  39 female  61.52 183.00   2507        N       Y          R
2   Jennifer     Willis  36 female  63.27 175.96 152928        Y       Y          R
3    Kenneth     Staton  45   male  72.41 158.97    201        Y       Y          L
4      Brian      Liles  38   male  69.35 170.46  12655        N       Y          R
5       Alma    Reynoso  33 female  62.22 158.13   1957        N       Y          R
6        Yee Schoonover  33 female  63.10 190.13 420836        N       Y          R

table(df$handedness)
 L  R
19 81
p = 0.19
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
95% confidence interval = [1] 0.1115398 0.2684602
phat = 0.19
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9843839

The p-value is not less than 0.05, so we cannot reject the hypothesis that 12% of the population is left handed.

nsugar
  first_name last_name age gender height weight income smoke100 exerany handedness
1      Bryan    Barden  41   male  67.17 182.78  16229        Y       N          R
2     Joanne    Miller  23 female  63.08 151.54  16281        N       N          R
3     Joshua     Tapia  25   male  71.46 162.37  48168        N       Y          R
4     Gerald  Marshall  49   male  71.88 175.65  30826        N       Y          R
5       Emma   Orourke  41 female  62.44 163.40  13781        N       Y          R
6   Mercedes  Martinez  28 female  63.33 222.74    530        Y       Y          L
table(df$handedness)

L  R
18 82
p=.18
se=sqrt(p*(1-p)/100)
c(p-2*se,p+2*se)
[1] 0.1031625 0.2568375
phat=.18
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9675809

1. The 95% Confidence Intervals for my data are: .1032, .2568
2. The P-value is .9675809, which is not less than .05; therefore, we cannot reject the hypothesis that 12% of the population is left handed.
blaiser1

In my data, 21 people were left handed and 79 were right handed.
p=.21
se=.0407 (sqrt((.21*(.79))/100))

so my confidence interval is

(.1285, .291461)

df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?
first_name last_name age gender height weight income smoke100 exerany handedness
1      Marie   Ferrara  52 female  66.22 163.60   8202        Y       N          R
2      Helen  Madrigal  21 female  59.85 215.28  11506        N       Y          L
3       Anna  Prichard  23 female  56.30 141.59    714        Y       N          R
4      Tyler     Boyce  51   male  65.49 150.10   1775        Y       Y          R
5      Agnes    Pruitt  52 female  66.14 178.02   4290        Y       Y          L
6       Mary  Williams  56 female  60.76 176.40   7196        Y       N          R
pnorm(.21,.12,.03249)
1-pnorm(.21,.12,.03249)


So, when we take pnorm(.21,.12,.03249) we get 0.9971979, or we could skip a step and do 1-pnorm(.21,.12,.03249), which comes out to 0.0028021. You’ll notice that it’s smaller than .05, so we REJECT the null. According to my data, I would propose a hypothesis that states that 21% of the population is left handed.
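
The upper-tail calculation above can also be sketched through a z-score, which avoids the subtraction from 1. This is only an illustration using the numbers from this post (21 left handed out of 100, null proportion 0.12); the variable names are placeholders.

```r
# Upper-tail probability for a sample proportion above the null value.
phat = 0.21   # sample proportion from this post
p0 = 0.12     # null proportion
n = 100       # sample size (21 left handed + 79 right handed)

se0 = sqrt(p0 * (1 - p0) / n)   # standard error under the null
z = (phat - p0) / se0           # how many standard errors phat sits above p0

# lower.tail = FALSE asks pnorm for the area to the right of z
upper_tail = pnorm(z, lower.tail = FALSE)
upper_tail
```

The result agrees with 1-pnorm(phat, p0, se0); it differs from the value quoted above only in later decimal places because se0 is not rounded here.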

amandanail
> df=read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=amandanail")
first_name last_name age gender height weight income smoke100 exerany handedness
1      Bryon    Conway  25   male  66.33 188.01   2725        Y       Y          R
2      Steve    Garber  21   male  76.20 165.14 126743        N       N          R
3      David    Jedele  23   male  64.77 138.71  33589        N       Y          R
4      Rubye    Tuttle  21 female  62.70 134.78  10656        N       Y          L
5     Frieda     Bixby  33 female  64.52 178.19   7388        Y       N          R
6     Bonnie    Turner  44 female  63.92 186.30  56072        Y       Y          R
> table(df$handedness)

1. So, we see there are 12 left handed people and 88 right handed people.

 L  R
12 88

2. The confidence interval is [0.02574136 0.13425864]

shiller
df=read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=shiller")
head(df)
#   first_name   last_name age gender height weight income smoke100 exerany handedness
# 1   Cornelia       Heath  59 female  66.51 171.07   1317        N       Y          R
# 2     Donald        Reid  20   male  64.74 146.36  22910        Y       N          R
# 3  Katherine    Driskill  38 female  66.20 184.15   9002        Y       N          R
# 4      Brian     Archila  39   male  70.88 105.76   4637        Y       Y          R
# 5      Debra Schexnayder  21 female  62.36 140.74   5829        Y       Y          R
# 6      Larry        Dunn  35   male  67.63 127.93   5493        Y       Y          L
table(df$handedness)
L  R
13 87
p=0.13
se=sqrt(p*(1-p)/100)
c(p-2*se,p+2*se)
[1] 0.06273931 0.19726069
phat=0.13
p0=0.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.6208556


My 95% confidence interval for people who are left handed is (0.06274, 0.19726). Performing a hypothesis test with a mean of .12 and phat of .13, I cannot reject the hypothesis that 12% of the population is left-handed. My p-value of 0.6208556 is not less than 0.05.

DariousAquarious
table(df$handedness)
 L  R
 9 91

In my population 9 out of 100 are left handed.

sqrt((.9*.81)/100)
[1] 0.0853815
.9-0.170763
[1] 0.729237
.9+0.170763
[1] 1.070763

To a 95% confidence interval the true proportion of left handed people is .9\pm.170763 or [.729237,1.070763]

phat=.09
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.1779551

My null hypothesis is accepted as the final p-value of .1780 is more than the .05 of the null hypothesis.

alogan3
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=alogan3")
head(df)
  first_name  last_name age gender height weight   income smoke100 exerany handedness
1     Ernest Culbertson  37   male  67.36 154.32    25954        N       N          R
2    Leonard      Baird  31   male  68.74 169.88    21447        Y       N          R
3       Tina      Sweet  20 female  67.77 182.46     1582        N       Y          R
4   Kristine    Seltzer  21 female  65.94 168.33 29010583        N       N          R
5       Judy     George  27 female  57.33 128.77    12985        N       Y          R
6    Jeffrey      Brown  29   male  63.71 152.05     8113        Y       Y          R
table(df$handedness)
L    R
18  82
p = 0.18
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.1031625 0.2568375
phat = 0.18
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9675809


(0.1032, 0.2568) is my 95% confidence interval for people who are left-handed. Calculating the hypothesis test with a phat of .18 and a mean of 0.12, I cannot reject the null hypothesis that 12% of people are left-handed because my p-value, 0.9675809, is not less than 0.05.

acuozzi3

My 95% confidence interval for people who are left handed is (.07858572, .22141428). Performing a hypothesis test with a mean of .12 and a phat of .15, I cannot reject the hypothesis that 12% of the population is left-handed. My p-value is .8220449.

   L                         R

1. .07858572 .22141428

2 .8220449

mrothenb

p = .08
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.02574136 0.13425864

phat = .08
p0 = .12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.1091773

Due to the p value I can reject the hypothesis test.

nmitchel

  first_name last_name age gender height weight income smoke100 exerany handedness
1      Tasha    Shelby  43 female  63.18 178.65  43854        N       Y          R
2     Andrea    Walden  51 female  60.98 170.78   7289        Y       N          R
3     Cheryl     Unger  46 female  59.90 179.02   4364        Y       Y          R
4    Melissa      Hart  30 female  63.34 176.42   3398        Y       Y          R
5      Ethel    Barker  32 female  64.61 145.12  88212        N       Y          R
6      Laura  Escareno  43 female  67.14 174.04  12467        Y       N          R
table(df$handedness)
 L  R
15 85
p=.15
se=sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.07858572 0.22141428
phat=.15
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.8220449

data:  15 out of 100, null probability 0.15
X-squared = 0, df = 1, p-value = 0.5
alternative hypothesis: true p is less than 0.15
95 percent confidence interval:
 0.000000 0.217903
sample estimates:
   p
0.15

scrouse
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=scrouse")
head(df)
  first_name   last_name age gender height weight income smoke100 exerany handedness
1     Robert       Klima  20   male  64.06 194.86  46075        Y       Y          R
2      Keith       Duong  42   male  71.38 165.88   4266        Y       Y          R
3    Richard   Rodriquez  34   male  63.30 167.25   2267        N       Y          R
4  Katharina      Ulcena  45 female  61.26 192.28  63607        Y       Y          R
5    Leonard Vandeventer  25   male  69.18 170.20  46165        Y       Y          R
6      Maria     Spencer  22 female  63.30 204.15  25589        Y       N          R

 L  R
13 87
p= 0.13
se=sqrt(p*(1-p)/100)
se=0.0336
c(p-2*se)=0.0964
c(p+2*se)=0.1636
[0.0964, 0.1636]

BryanDadson3
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=BryanDadson3")
head(df)
  first_name  last_name age gender height weight income smoke100 exerany handedness
1       Eric   Mcdonald  20   male  68.33 184.21  14563        Y       Y          R
2     Nicole   Mccreedy  36 female  62.78 191.15   8513        Y       N          R
3     Alicia  Hunsicker  28 female  64.10 119.92 168591        N       N          R
4      Janis       Hark  40 female  62.61 104.19   1384        Y       Y          R
5      James   Hamilton  33   male  69.29 204.69  21821        N       Y          R
6      Glenn Washington  28   male  70.42 157.09   1183        N       N          R
table(df$handedness)

L  R
12 88

p = 0.12
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.05500769 0.18499231

phat = 0.12
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.5

1. The 95% confidence interval for the proportion of people who are left handed is: [0.055, 0.185]
2. The p-value is not less than .05, so we cannot reject that 12% of the population is left handed.
audrey

Here’s my approach to the hypothesis test portion of the question. First, I’ll read in the data and tabulate the handedness column.

df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=audrey")
table(df$handedness)

# Out:
#  L  R
# 10 90


From here, I can see that my sample proportion is \hat{p} = 10/100 = 0.1 , which is certainly not exactly the same as the published proportion of 0.12 . I guess our hypotheses should be:

• H_0: p = 0.12 ,
• H_A: p \neq 0.12 .

To compute the p-value, the question is: what is the probability of generating data at least as far away from 0.12 as 0.1, under the assumption that the true proportion is 0.12? We’ll use a normal distribution to approximate this. Its mean and standard deviation are dictated by the assumed proportion and the sample size. Thus:

\mu = 0.12 \text{ and } \sigma = \sqrt{0.12\times0.88/100} \approx 0.0325.

The probability that we are interested in is represented by the following area:

We can compute this in R as follows:

2*pnorm(0.1,0.12, sqrt(0.12*0.88/100))
# Out: 0.5382527


Since this is a lot bigger than 0.05, we cannot reject the null.
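
Doubling the lower tail works here because 0.1 lies below 0.12. When the sample proportion lies above the null value, the lower tail exceeds 0.5 and doubling it would give a number bigger than 1. A minimal sketch that handles both cases by measuring the distance from the null in standard errors follows; the helper name `two_sided_p` is hypothetical and not part of base R.

```r
# Two-sided p-value for a single proportion, normal approximation.
# two_sided_p is a hypothetical helper name.
two_sided_p = function(phat, p0, n) {
  se0 = sqrt(p0 * (1 - p0) / n)   # standard error under the null
  z = (phat - p0) / se0           # signed distance from p0 in standard errors
  2 * pnorm(-abs(z))              # area in both tails beyond |z|
}

# Sample proportion below the null: same value as 2*pnorm(0.1, 0.12, sqrt(0.12*0.88/100))
two_sided_p(0.10, 0.12, 100)

# Sample proportion above the null: still a valid probability
two_sided_p(0.21, 0.12, 100)
```

Because the function depends only on the absolute distance from 0.12, sample proportions of 0.10 and 0.14 give the same p-value.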