An archive of Mark's Fall 2017 Intro Stat course.

# Examining a random proportion (for 11:00 AM section)

mark

(5 pts)

df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=mark")

#     first_name last_name age gender height weight income smoke100 exerany handedness
# 1      Donna     Dinan  35 female  65.37 164.26   1947        N       Y          R
# 2      Ramon     Davis  20   male  66.59 139.53  22747        Y       Y          R
# 3       Mark      Buss  23   male  74.58 124.21  15489        N       Y          R
# 4      Lidia    Elmore  52 female  63.87 153.64   8369        N       Y          R

1. Use this data to find a 95% confidence interval for the proportion of people who are left handed.
2. Supposedly, about 12% of the population is left-handed. Use this data to perform a hypothesis test of that statement.

I guess your confidence interval should look like this: [0.0268, 0.1332].
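
The two tasks can be sketched in R as follows. This is only a minimal sketch under stated assumptions, not the course's official solution: the helper name `prop_summary` and its arguments are hypothetical, the multiplier of 2 mirrors the rough 95% rule used in the replies, and the two-sided p-value assumes the alternative hypothesis is simply "the proportion is not 12%".

```r
# Hypothetical helper (not from the course materials): summarises a sample
# proportion x/n against a hypothesised proportion p0.
prop_summary <- function(x, n, p0 = 0.12, mult = 2) {
  phat <- x / n
  se <- sqrt(phat * (1 - phat) / n)      # standard error used for the interval
  ci <- c(phat - mult * se, phat + mult * se)
  se0 <- sqrt(p0 * (1 - p0) / n)         # standard error under the null hypothesis
  z <- (phat - p0) / se0
  list(
    phat = phat,
    ci = ci,
    lower_tail = pnorm(phat, p0, se0),   # P(sample proportion <= phat) under H0
    two_sided = 2 * pnorm(-abs(z))       # assumed alternative: p differs from p0
  )
}

# Example: 13 left-handed people in a sample of 100.
res <- prop_summary(13, 100)
print(res)
```

Note that `pnorm(phat, p0, se0)` is a lower-tail probability, so it only serves directly as a p-value when the alternative is "less than 12%"; for a sample proportion above 12% the upper tail or the two-sided value is the more natural quantity to compare against 0.05.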

lilyz
 L  R
13 87
p=0.13
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)


95% Confidence Interval: [0.06273931, 0.19726069]
2.

phat = 0.13
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)


[1] 0.6208556

^cannot reject null hypothesis due to p-value 0.62 being greater than 0.05

TineriTalentati
> df = read.csv("https://www.marksmath.org/cgi-
first_name last_name age gender height
1       Dana Pritchett  33 female  65.29
2   Michelle       Rea  38 female  65.92
3      Mindy     Lusby  56 female  68.00
4       Mary   Friesen  34 female  65.71
5      Robin     Jones  22 female  63.13
6    Gregory    Adkins  25   male  64.97
weight income smoke100 exerany handedness
1 143.83 116672        Y       N          R
2 151.13    150        Y       N          R
3 195.86  76219        N       Y          R
4 154.46   4112        N       Y          R
5 161.12  15601        N       Y          R
6 173.40   9590        N       Y          R
> table(df$handedness)

 L  R
14 86
> p=.14
> se=sqrt(p*(1-p)/100)
> c(p-2*se, p+2*se)
[1] 0.07060259 0.20939741
> phat=.08
> phat=.14
> p0=.12
> se0=sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.7308737

1. Confidence interval: [.0706, .2094]
2. Hypothesis test: 0.7309 -> Cannot reject the null hypothesis due to .73 being larger than .05.

Dancerlikens
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=mlikens")

**Confidence Interval**

p=.13
se=(p*(1-p)/100)
View(df)
c(p-2*se, p+2*se)

Confidence Interval is [0.127738, 0.132262]

**Hypothesis Test**

phat=.13
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat, p0, se0)
[1] 0.6208556

We cannot reject the Null Hypothesis.

LunaLovegood
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=LunaLovegood")
> head(df)
   first_name last_name age gender height weight income smoke100 exerany handedness
1      Ashley    Hudson  55 female  63.54 125.44   1965        Y       Y          L
2       Larry     Ayala  40   male  68.56 143.52    401        N       Y          R
3      Arthur     Kelly  21   male  71.30 166.05  69589        N       Y          R
4 Christopher      Long  44   male  68.90 160.10  16940        Y       Y          R
5      Howard      Hang  44   male  70.13 182.23  13343        N       Y          R
6        Kurt   Sublett  31   male  69.33 204.95   7656        Y       Y          L
> table(df$handedness)

L  R
14 86
> p = 0.14
> se = sqrt(p*(1-p)/100)
> c(p-2*se, p+2*se)
[1] 0.07060259 0.20939741
> phat = 0.14
> p0 = 0.12
> se0 = sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.7308737


Confidence Interval: [.0706, .2094]
The p-value for the hypothesis test is 0.7309. That value is larger than 0.05, so it does not pass the hypothesis test. As such, we fail to reject the null hypothesis.

  first_name last_name age gender height weight income smoke100 exerany handedness
1     Donald   Hankins  32   male  72.02 224.71   3528        Y       Y          L
2       Ruth    French  46 female  63.39 136.03  17533        N       N          L
3      Larae     Mills  59 female  59.91 171.92 103281        N       Y          R
4     Alexis       Zhu  42   male  68.15 168.52  10431        N       Y          R
5   Patricia       Gee  32 female  61.87 154.39   4450        Y       N          R
6      James    Carter  34   male  73.16 183.77  12195        Y       Y          R


table(df$handedness)

 L  R
11 89
> p=.11
> se=sqrt(p*(1-p)/100)
> c(p-2*se, p+2*se)
[1] 0.04742205 0.17257795

~ ~*hypothesis testing zone*~ ~

> phat = 0.11
> se0 = sqrt(p0*(1-p0)/100)
> se0 = sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.3791444

Confidence interval: [0.0474, 0.1726]
Hypothesis test: 0.379
One may not reject the aforementioned hypothesis, good day.

emeli
> df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=emeli")
> head(df)
  first_name last_name age gender height
1    Charles     Smith  49   male  73.69
2   Kathleen   Leonard  28 female  64.40
3     Andrew   Carlson  37 female  63.33
4     Sandra  Martinez  38 female  64.95
5     Krista     White  29 female  65.39
6       Otis     Reyes  30   male  70.27
  weight income smoke100 exerany handedness
1 226.06   1947        N       Y          R
2 142.38    333        Y       Y          R
3 190.32  11730        N       Y          L
4 177.29    802        N       Y          L
5 191.46  25135        Y       Y          R
6 202.01  14952        Y       Y          L
> table(df$handedness)

L  R
25 75
> se = sqrt(.25*(1-.25)/100)
> c(p-2*se, p+2*se)
[1] 0.1633975 0.3366025


Confidence interval is (.16, .34)

> phat = 0.25
> p0 = 0.12
> se0 = sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.9999684
> prop.test(25,100, p = .12, alternative = "less", correct = FALSE)

1-sample proportions test without
continuity correction

data:  25 out of 100, null probability 0.12
X-squared = 16.004, df = 1, p-value = 1
alternative hypothesis: true p is less than 0.12
95 percent confidence interval:
0.0000000 0.3271734
sample estimates:
p
0.25


The p-value is 1 (0.25 is the sample proportion), so you cannot reject the null hypothesis.

oyang
df = read.csv("https://marksmath.org/cgi-

first_name last_name age gender height weight income smoke100 exerany
1      Peggy      Bell  31 female  62.62 203.45   8249        Y       Y
2   Michelle     Cantu  36 female  65.00 188.63   5351        N       N
3    Anthony  Cauffman  45   male  70.25 191.85   1979        N       Y
4      Laura     Smith  40 female  64.87 176.73   1237        N       Y
5      Kitty      Cahn  26 female  64.42 151.86   8715        N       N
6     Sharon    Pastel  27 female  64.11 151.31  64571        Y       Y

handedness
1          R
2          R
3          R
4          R
5          R
6          R

table(df$handedness)

 L  R
13 87
sqrt(p0*(1-p0)/100)
[1] 0.03249615
pnorm(phat,p0,se0)
[1] 0.000127125
.000127125
[1] 0.000127125
(.13-2*0.001131)
[1] 0.127738
sqrt(.13*(1-.13)/100)
[1] 0.03363034
(.13- 0.03363034)
[1] 0.09636966
(.13+ 0.03363034)
[1] 0.1636303
phat=.13
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.6208556

Confidence interval: [0.09636966, 0.1636303]
Hypothesis test: I cannot reject the null hypothesis due to 0.6208556 being bigger than .05

vee
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=vee")
head(df)
  first_name last_name age gender height weight income
1       Eric      Thao  46   male  71.61 197.09   2262
2      Scott     Miles  37   male  66.86 196.26   6170
3      Larry  Martinez  28   male  71.35 159.21   8114
4       Carl   Brickey  49   male  68.63 212.83   7852
5     Ronald   Stclair  38   male  67.97 167.54   4181
6     Samuel    Forbes  22   male  67.77 199.50  36593
  smoke100 exerany handedness
1        N       Y          R
2        Y       Y          R
3        Y       Y          R
4        N       Y          R
5        N       N          R
6        Y       N          R

1. My 95% confidence interval for the proportion of people who are left handed is…

 L  R
21 79
p = 0.21
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2se)
Error: unexpected symbol in "c(p-2*se, p+2se"
c(p-2*se, p+2*se)
[1] 0.1285384 0.2914616

2. Hypothesis test ~

phat = .21
p0 = .12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9971934

This value is larger than .05, so we can not reject the null hypothesis.

Nathan89
df=read.csv("http://marksmath.org/cgi-bin/random_data.csv?username=Nathan89")
head(df)
  first_name  last_name age gender height weight income smoke100 exerany handedness
1     Arthur      Evans  21   male  66.12 187.59   6521        Y       N          R
2  Jacquelyn   Robinson  55 female  62.52 176.21   6178        N       Y          R
3    Dolores Strickland  42 female  60.24 179.07   2160        N       Y          R
4      Brent     Bickel  39   male  70.92 200.13   5616        Y       Y          R
5       Lori       Russ  26 female  66.75 148.23   6526        Y       Y          R
6     Willie     Keplin  25   male  71.90 190.56   7608        N       Y          R
table(df$handedness)
L  R
24 76
24/100
[1] 0.24
p=.24
se=sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.1545834 0.3254166
phat= .24
p0=.12
se0=sqrt(p0*(1-p0)/100)
1- pnorm(phat, p0, se0)
[1] 0.0001109233

1. 95% Confidence Interval: (0.1545834, 0.3254166)
2. Hypothesis test: pnorm gives 0.9998891, so the upper-tail p-value is 0.0001109233. Therefore, we can reject the hypothesis that 12% of the population is left handed.

Megatog

My Table:

df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=Megatog")
first_name last_name age gender height weight income smoke100 exerany handedness
1    Michael   Molinar  24   male  70.33 182.17   5190        Y       N          R
2       Erik      Lima  26   male  70.85 203.72    771        N       Y          R
3       Mark   Higgins  34   male  71.70 181.90  17682        N       Y          R
4      Clint  Dietrich  58   male  66.14 156.02   6783        N       N          R
5      Jason    Craven  29   male  66.60 168.87   4687        Y       N          R
6  Catherine      Rice  39 female  59.84 176.96  50988        Y       Y          R


table(df$handedness)

 L  R
17 83
p = 0.17
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.09487344 0.24512656
phat = 0.17
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9380543

1. [0.09487344, 0.24512656]
2. The p-value in the test is 0.9380543; since that is greater than 0.05, we cannot reject the hypothesis that 12% of the population is left handed.

Nashman92
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=Nashman92")
  first_name last_name age gender height weight income smoke100 exerany handedness
1    Michael    Sutton  39   male  69.06 161.40    477        Y       N          R
2     Darryl   Nichols  33   male  70.02 113.67  20020        Y       Y          R
3  Geraldine   Garrett  33 female  67.26 136.91  27010        Y       Y          R
4  Constance  Gonzales  39 female  61.93 123.83  44504        N       N          R
5    William     Adams  25   male  61.18 164.52  70017        Y       Y          R
6     Judith     Arias  53 female  64.76 217.80 924272        Y       Y          R
table(df$handedness)
L  R
13 87
p = 0.13
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.06273931 0.19726069
phat = 0.13
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.6208556


1. Confidence Interval: [0.06273931, 0.19726069]
2. The p-value for the hypothesis test is 0.6208556. That value is larger than 0.05, so it does not pass the hypothesis test. As such, we fail to reject the null hypothesis.

emma0126
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?
first_name last_name age gender height weight income smoke100 exerany
1     Vincent      Yang  24   male  71.83 189.29  42825        N       Y
2 Christopher    Towler  36   male  67.39 174.96  37914        Y       N
3       Shawn   Johnson  24   male  69.44 154.93 195728        Y       Y
4       Scott Henderson  30   male  65.96 171.23  48302        Y       N
5         Luz   Bransom  26 female  62.63 189.53  62812        Y       N
6       Nidia  Mccleese  38 female  60.08 141.43   4170        Y       N
handedness
1          L
2          R
3          R
4          R
5          L
6          R
table(df$handedness)

 L  R
19 81
se=sqrt(p*(1-p)/100)
Error: object 'p' not found
p=(0.19)
se=sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.1115398 0.2684602
phat=0.08
p0=0.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.1091773
prop.test(8,100, p = 0.12, alternative = "less" , correct = F)

1-sample proportions test without continuity correction

data:  8 out of 100, null probability 0.12
X-squared = 1.5152, df = 1, p-value = 0.1092
alternative hypothesis: true p is less than 0.12
95 percent confidence interval:
0.0000000 0.1364648
sample estimates:
p
0.08

1. Confidence interval: [0.1115, 0.2685]
2. Hypothesis test: 0.1092 -----> Cannot reject the null hypothesis due to 0.11 being larger than 0.05.

jesshcra
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=jesshcra")
head(df)
  first_name last_name age gender height weight income smoke100 exerany
1    Suzanne    Fender  30 female  61.10 137.18   1167        N       N
2       Lois    Campos  40 female  64.44 136.96   2908        N       Y
3       Sean      Wake  37   male  70.09 193.75  19449        Y       Y
4    Shirley    Cooper  23 female  63.08 182.81  11582        N       Y
5    Barbara     Hymel  59 female  64.85 188.41   2976        N       Y
6      Bryon  Williams  24   male  69.42 194.16   5485        N       Y
  handedness
1          L
2          R
3          R
4          R
5          R
6          R
p = .19
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.1115398 0.2684602
phat = .19
p0 = .12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9843839

Confidence interval: [.1115, .2684]
Hypothesis test: 0.9843 -> Cannot reject the null hypothesis due to .98 being larger than .05.

BeauNichols
   first_name last_name age gender height weight income smoke100 exerany handedness
1      Ashley    Hudson  55 female  63.54 125.44   1965        Y       Y          L
2       Larry     Ayala  40   male  68.56 143.52    401        N       Y          R
3      Arthur     Kelly  21   male  71.30 166.05  69589        N       Y          R
4 Christopher      Long  44   male  68.90 160.10  16940        Y       Y          R
5      Howard      Hang  44   male  70.13 182.23  13343        N       Y          R
6        Kurt   Sublett  31   male  69.33 204.95   7656        Y       Y          L
table(df$handedness)

L R
14 86

p = 0.14
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.07060259 0.20939741
phat = 0.14
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.7308737

laurabeth
> df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=laurabeth")
first_name last_name age gender height weight
1    Dorothy    Miller  39 female  61.55 196.20
2      David    Bailey  24   male  71.23 202.14
3  Margarita  Crabtree  37 female  64.51 143.51
4    Kathryn     Davis  24 female  61.72 126.94
5    Michael   Morosow  25   male  66.74 203.72
6       Lisa     Jones  47 female  69.20 182.34
income smoke100 exerany handedness
1  16897        N       Y          R
2   1497        Y       Y          R
3  32101        Y       Y          R
4  32517        Y       Y          R
5    867        N       N          R
6  10344        Y       N          L
> table(df$handedness)

 L  R
19 81
> sqrt(0.19*(1-0.19)/100)
[1] 0.03923009
> c(0.19-2*0.039, 0.19+2*0.039)
[1] 0.112 0.268
> phat = 0.19
> p0 = 0.12
> se0 = sqrt(0.12*(1-0.12)/100)
> pnorm(phat,p0,se0)
[1] 0.9843839

1. Confidence Interval: [0.112, 0.268]
2. Hypothesis Test: [0.9843839] Because the Hypothesis test is greater than 0.05 we cannot reject the null hypothesis.

Erad
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=Erad")
head(df)
  first_name last_name age gender height weight income smoke100 exerany handedness
1    Lashawn     Parks  39 female  62.07 177.27   6312        Y       Y          R
2      Clara   Landers  57 female  65.59 173.18   4364        N       N          R
3        Iva    Spears  32 female  63.22 171.55   4309        N       Y          R
4 Marguerite     Geise  57 female  63.23 173.17   4692        N       Y          R
5    Bernice    Willis  33 female  64.87 198.35  21841        N       N          L
6       Ruby    Miguel  38 female  65.75 206.60 240986        N       Y          R
table(df$handedness)
L  R
15 85
se = sqrt(.15*(1-.15)/100)
sqrt(.15*(1-.15)/100)
[1] 0.03570714
.15-(2*0.03570714)
[1] 0.07858572
.15+(2*0.03570714)
[1] 0.221414
phat= .15
p0= .12
se0= sqrt(p0*(1-p0)/100)
1-pnorm(phat,p0,se0)
[1] 0.1779551


The 95% confidence interval: (0.0786, 0.2214)
The p value is approximately 0.178 which is greater than .05 so we cannot reject the null hypothesis.

brifro
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=mark")
table(df$handedness)
L  R
13 87
p=0.13
> se=sqrt(p*(1-p)/100)
> c(p-2*se,p+2*se)
[1] 0.06273931 0.19726069
Confidence Interval is (.06, .20)
> phat=0.08
> p0=0.12
> se0=sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.1091773


prop.test(8,100,p=0.12,alternative="less",correct=F)
1-sample proportions test without continuity
correction
data: 8 out of 100, null probability 0.12
X-squared = 1.5152, df = 1, p-value = 0.1092
alternative hypothesis: true p is less than 0.12
95 percent confidence interval:
0.0000000 0.1364648
sample estimates:
p
0.08
Confidence Interval: [0.06273931 0.19726069]
Hypothesis Test: [1] 0.1091773, Cannot reject the null hypothesis due to 0.11 being larger than 0.05.
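
Several replies in this thread check the `pnorm` calculation against `prop.test`. The sketch below shows why the two agree in the one-sided "less" case when the continuity correction is turned off. It reuses the counts that appear in those replies (8 out of 100); the variable names `manual_p` and `pt` are just illustrative.

```r
# Compare the hand-computed normal approximation with prop.test.
n <- 100
x <- 8            # count used in the prop.test calls in this thread
p0 <- 0.12        # hypothesised proportion

phat <- x / n
se0 <- sqrt(p0 * (1 - p0) / n)   # standard error under the null hypothesis
z <- (phat - p0) / se0

manual_p <- pnorm(phat, p0, se0) # lower-tail probability computed by hand

# Without the continuity correction, prop.test reports X-squared = z^2 and,
# for alternative = "less", the same lower-tail normal probability.
pt <- prop.test(x, n, p = p0, alternative = "less", correct = FALSE)

print(c(manual = manual_p, prop.test = pt$p.value))
print(c(z_squared = z^2, X_squared = unname(pt$statistic)))
```

The agreement holds only for matching alternatives: with `alternative = "less"` and a sample proportion above 12%, both approaches return a value close to 1, which says nothing about whether the proportion is larger than 12%.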