An archive of Mark's Fall 2017 Intro Stat course.

Examining a random proportion (for 11:00 AM section)

mark

(5 pts)

Using your Discourse login name, load some data into a data frame like so:

df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=mark")
head(df)

#     first_name last_name age gender height weight income smoke100 exerany handedness
# 1      Donna     Dinan  35 female  65.37 164.26   1947        N       Y          R
# 2      Ramon     Davis  20   male  66.59 139.53  22747        Y       Y          R
# 3       Mark      Buss  23   male  74.58 124.21  15489        N       Y          R
# 4      Lidia    Elmore  52 female  63.87 153.64   8369        N       Y          R
  1. Use this data to find a 95% confidence interval for the proportion of people who are left-handed.
  2. Supposedly, about 12% of the population is left-handed. Use this data to perform a hypothesis test of that statement.

I guess your confidence interval should look like this: [0.0268, 0.1332].
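
For what it's worth, an interval of that shape can be reproduced with a short sketch. It assumes a sample proportion of 0.08 from 100 rows and the multiplier `qnorm(0.975)` (about 1.96) rather than 2; neither value is stated above, both are inferred from the interval itself.

```r
# Assumed values (not given in the post): 8 left-handers in a sample of 100.
n <- 100
phat <- 0.08

# Standard error of a sample proportion.
se <- sqrt(phat * (1 - phat) / n)

# 95% normal-based interval; qnorm(0.975) is roughly 1.96.
z <- qnorm(0.975)
ci <- c(phat - z * se, phat + z * se)
round(ci, 4)
```

Using 2 in place of `qnorm(0.975)` gives a very slightly wider interval.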

mark
mark
lilyz
   L  R 
   13 87 
   p=0.13
   se = sqrt(p*(1-p)/100)
   c(p-2*se, p+2*se)

95% Confidence Interval: [0.06273931, 0.19726069]
2.

phat = 0.13
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)

[1] 0.6208556

^cannot reject null hypothesis due to p-value 0.62 being greater than 0.05
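
The problem statement does not name an alternative, so a two-sided test is one reasonable reading. A minimal sketch under that assumption, using the same numbers as above (`z` and `p_two_sided` are names introduced here for illustration):

```r
# Same inputs as in the post above.
phat <- 0.13
p0 <- 0.12
n <- 100

# Standard error under the null hypothesis and the resulting z-score.
se0 <- sqrt(p0 * (1 - p0) / n)
z <- (phat - p0) / se0

# Two-sided p-value: probability of a z-score at least this far from 0.
p_two_sided <- 2 * pnorm(-abs(z))
p_two_sided
```

This comes out to roughly 0.76, still well above 0.05, so the conclusion would not change.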

TineriTalentati
> df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=TineriTalentati")
> head(df)
 first_name last_name age gender height
1       Dana Pritchett  33 female  65.29
2   Michelle       Rea  38 female  65.92
3      Mindy     Lusby  56 female  68.00
4       Mary   Friesen  34 female  65.71
5      Robin     Jones  22 female  63.13
6    Gregory    Adkins  25   male  64.97
  weight income smoke100 exerany handedness
1 143.83 116672        Y       N          R
2 151.13    150        Y       N          R
3 195.86  76219        N       Y          R
4 154.46   4112        N       Y          R
5 161.12  15601        N       Y          R
6 173.40   9590        N       Y          R
> table(df$handedness)

 L  R 
14 86 
> p=.14
> se=sqrt(p*(1-p)/100)
> c(p-2*se, p+2*se)
[1] 0.07060259 0.20939741
> phat=.08
> phat=.14
> p0=.12
> se0=sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.7308737
  1. Confidence interval: [.0706, .2094]
  2. Hypothesis test: 0.7309 -> Cannot reject the null hypothesis due to .73 being larger than .05.
Dancerlikens
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=mlikens")

  **Confidence Interval**
p=.13
se=sqrt(p*(1-p)/100)
View(df)
c(p-2*se, p+2*se)

Confidence Interval is [0.06273931, 0.19726069]

**Hypothesis Test**
phat=.13
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat, p0, se0)   
[1] 0.6208556

We cannot reject the Null Hypothesis.

LunaLovegood
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=LunaLovegood")
> head(df)
   first_name last_name age gender height weight income smoke100 exerany handedness
1      Ashley    Hudson  55 female  63.54 125.44   1965        Y       Y          L
2       Larry     Ayala  40   male  68.56 143.52    401        N       Y          R
3      Arthur     Kelly  21   male  71.30 166.05  69589        N       Y          R
4 Christopher      Long  44   male  68.90 160.10  16940        Y       Y          R
5      Howard      Hang  44   male  70.13 182.23  13343        N       Y          R
6        Kurt   Sublett  31   male  69.33 204.95   7656        Y       Y          L

>table(df$handedness)

L  R 
14 86 
> p = 0.14
> se = sqrt(p*(1-p)/100)
> c(p-2*se, p+2*se)
[1] 0.07060259 0.20939741
> phat = 0.14
> p0 = 0.12
> se0 = sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.7308737

Confidence Interval: [.0706, .2094]
The p-value for the hypothesis test is 0.7309. That value is larger than 0.05, so the result is not statistically significant. As such, we fail to reject the null hypothesis.

avocadoburrito

df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=avocadoburrito")
head(df)

  first_name last_name age gender height weight income smoke100 exerany handedness
1     Donald   Hankins  32   male  72.02 224.71   3528        Y       Y          L
2       Ruth    French  46 female  63.39 136.03  17533        N       N          L
3      Larae     Mills  59 female  59.91 171.92 103281        N       Y          R
4     Alexis       Zhu  42   male  68.15 168.52  10431        N       Y          R
5   Patricia       Gee  32 female  61.87 154.39   4450        Y       N          R
6      James    Carter  34   male  73.16 183.77  12195        Y       Y          R

table(df$handedness)

L  R 
11 89 

> p=.11
> se=sqrt(p*(1-p)/100)
> c(p-2*se, p+2*se)
[1] 0.04742205 0.17257795

                       ~ ~*hypothesis testing zone*~ ~

> phat = 0.11
> p0 = 0.12
> se0 = sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.3791444

Confidence interval: [0.0474, 0.1726]
Hypothesis test: 0.379. One may not reject the aforementioned hypothesis, good day.

emeli
> df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=emeli")
> head(df)
first_name last_name age gender height
1    Charles     Smith  49   male  73.69
2   Kathleen   Leonard  28 female  64.40
3     Andrew   Carlson  37 female  63.33
4     Sandra  Martinez  38 female  64.95
5     Krista     White  29 female  65.39
6       Otis     Reyes  30   male  70.27
weight income smoke100 exerany handedness
1 226.06   1947        N       Y          R
2 142.38    333        Y       Y          R
3 190.32  11730        N       Y          L
4 177.29    802        N       Y          L
5 191.46  25135        Y       Y          R
6 202.01  14952        Y       Y          L
> table(df$handedness)

L  R 
25 75 
> se = sqrt(.25*(1-.25)/100)
> c(p-2*se, p+2*se)
[1] 0.1633975 0.3366025

Confidence interval is (.16, .34)

> phat = 0.25
> p0 = 0.12
> se0 = sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
[1] 0.9999684
> prop.test(25,100, p = .12, alternative = "less", correct = FALSE)

  1-sample proportions test without
continuity correction

data:  25 out of 100, null probability 0.12
 X-squared = 16.004, df = 1, p-value = 1
 alternative hypothesis: true p is less than 0.12
95 percent confidence interval:
 0.0000000 0.3271734
sample estimates:
 p 
0.25 

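The p-value for this lower-tailed test is about 1 (0.25 is the sample proportion, not the p-value), so you cannot reject the null hypothesis in favor of a proportion below 0.12.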
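
With a sample proportion of 0.25, the data sit above 0.12, so a lower-tailed test cannot show a difference. A sketch of the upper-tailed version follows; choosing `alternative = "greater"` is an assumption about the intended test, not something the problem specifies.

```r
# Counts from the table above: 25 left-handers out of 100.
x <- 25
n <- 100
p0 <- 0.12

# Upper-tail p-value from the normal approximation.
phat <- x / n
se0 <- sqrt(p0 * (1 - p0) / n)
p_upper <- pnorm(phat, p0, se0, lower.tail = FALSE)

# The same test via prop.test, without continuity correction.
pt <- prop.test(x, n, p = p0, alternative = "greater", correct = FALSE)

c(normal = p_upper, prop_test = unname(pt$p.value))
```

Both values come out far below 0.05, which would point toward rejecting the 12% claim for this particular sample.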

oyang
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=oyang")
head(df)

first_name last_name age gender height weight income smoke100 exerany
1      Peggy      Bell  31 female  62.62 203.45   8249        Y       Y
2   Michelle     Cantu  36 female  65.00 188.63   5351        N       N
3    Anthony  Cauffman  45   male  70.25 191.85   1979        N       Y
4      Laura     Smith  40 female  64.87 176.73   1237        N       Y
5      Kitty      Cahn  26 female  64.42 151.86   8715        N       N
6     Sharon    Pastel  27 female  64.11 151.31  64571        Y       Y

 handedness
1          R
2          R
3          R
4          R
5          R
6          R

 table(df$handedness)
L  R 
13 87

 sqrt(p0*(1-p0)/100)
[1] 0.03249615
 pnorm(phat,p0,se0)
[1] 0.000127125
.000127125
[1] 0.000127125
 (.13-2*0.001131)
[1] 0.127738
 sqrt(.13*(1-.13)/100)
[1] 0.03363034
 (.13- 2*0.03363034)
[1] 0.06273932
 (.13+ 2*0.03363034)
[1] 0.1972607
phat=.13
p0=.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.6208556

Confidence interval: [0.06273932, 0.1972607]
Hypothesis test: I cannot reject the null hypothesis due to 0.6208556 being bigger than .05

vee
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=vee")
head(df)
first_name last_name age gender height weight income
1       Eric      Thao  46   male  71.61 197.09   2262
2      Scott     Miles  37   male  66.86 196.26   6170
3      Larry  Martinez  28   male  71.35 159.21   8114
4       Carl   Brickey  49   male  68.63 212.83   7852
5     Ronald   Stclair  38   male  67.97 167.54   4181
6     Samuel    Forbes  22   male  67.77 199.50  36593
smoke100 exerany handedness
1        N       Y          R
2        Y       Y          R
3        Y       Y          R
4        N       Y          R
5        N       N          R
6        Y       N          R
  1. My 95% confidence interval for the proportion of people who are left handed is…
L   R 
21 79 
p = 0.21
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2se)
Error: unexpected symbol in "c(p-2*se, p+2se"
c(p-2*se, p+2*se)
[1] 0.1285384 0.2914616
  2. Hypothesis test ~
phat = .21
p0 = .12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9971934

This value is larger than .05, so we cannot reject the null hypothesis.

Nathan89
 df=read.csv("http://marksmath.org/cgi-bin/random_data.csv?username=Nathan89")
 head(df)
 first_name  last_name age gender height weight income smoke100 exerany handedness
1     Arthur      Evans  21   male  66.12 187.59   6521        Y       N          R
2  Jacquelyn   Robinson  55 female  62.52 176.21   6178        N       Y          R
3    Dolores Strickland  42 female  60.24 179.07   2160        N       Y          R
4      Brent     Bickel  39   male  70.92 200.13   5616        Y       Y          R
5       Lori       Russ  26 female  66.75 148.23   6526        Y       Y          R
6     Willie     Keplin  25   male  71.90 190.56   7608        N       Y          R
table(df$handedness)
L  R 
24 76 
 24/100
[1] 0.24
 p=.24
 se=sqrt(p*(1-p)/100)
 c(p-2*se, p+2*se)
[1] 0.1545834 0.3254166
 phat= .24
 p0=.12
 se0=sqrt(p0*(1-p0)/100)
1- pnorm(phat, p0, se0)
[1] 0.0001109233
  1. 95% Confidence Interval: ( 0.1545834 , 0.3254166)
  2. Hypothesis test: the p-value (0.0001109) is smaller than .05. Therefore, we can reject the hypothesis that 12% of the population is left-handed.
Megatog

My Table:

df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=Megatog")
head(df)
first_name last_name age gender height weight income smoke100 exerany handedness
1    Michael   Molinar  24   male  70.33 182.17   5190        Y       N          R
2       Erik      Lima  26   male  70.85 203.72    771        N       Y          R
3       Mark   Higgins  34   male  71.70 181.90  17682        N       Y          R
4      Clint  Dietrich  58   male  66.14 156.02   6783        N       N          R
5      Jason    Craven  29   male  66.60 168.87   4687        Y       N          R
6  Catherine      Rice  39 female  59.84 176.96  50988        Y       Y          R

Code For My Answers:

table(df$handedness)
L  R 
17 83 
p = 0.17
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.09487344 0.24512656
phat = 0.17
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9380543
  1. [0.09487344, 0.24512656]
  2. The p-value in the test is 0.9380543. Since that is greater than 0.05, we cannot reject the hypothesis that 12% of the population is left-handed.
Nashman92
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=Nashman92")


  first_name   last_name age gender height weight income smoke100 exerany handedness
1    Michael    Sutton  39   male  69.06 161.40    477        Y       N          R
2     Darryl   Nichols  33   male  70.02 113.67  20020        Y       Y          R
3  Geraldine   Garrett  33 female  67.26 136.91  27010        Y       Y          R
4  Constance  Gonzales  39 female  61.93 123.83  44504        N       N          R
5    William     Adams  25   male  61.18 164.52  70017        Y       Y          R
6     Judith     Arias  53 female  64.76 217.80 924272        Y       Y          R

table(df$handedness)
L  R 
13 87
p = 0.13
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.06273931 0.19726069
phat = 0.13
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.6208556

1. Confidence Interval: [0.06273931, 0.19726069]
2. The p-value for the hypothesis test is 0.6208556. That value is larger than 0.05, so the result is not statistically significant. As such, we fail to reject the null hypothesis.

emma0126
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=emma0126")
head(df)
first_name last_name age gender height weight income smoke100 exerany
1     Vincent      Yang  24   male  71.83 189.29  42825        N       Y
2 Christopher    Towler  36   male  67.39 174.96  37914        Y       N
3       Shawn   Johnson  24   male  69.44 154.93 195728        Y       Y
4       Scott Henderson  30   male  65.96 171.23  48302        Y       N
5         Luz   Bransom  26 female  62.63 189.53  62812        Y       N
6       Nidia  Mccleese  38 female  60.08 141.43   4170        Y       N
handedness
1          L
2          R
3          R
4          R
5          L
6          R
table(df$handedness)
L  R 
19 81 
se=sqrt(p*(1-p)/100)
Error: object 'p' not found
p=(0.19)
se=sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.1115398 0.2684602
phat=0.08
p0=0.12
se0=sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.1091773
prop.test(8,100, p = 0.12, alternative = "less" , correct = F)
    1-sample proportions test without continuity correction
data:  8 out of 100, null probability 0.12
X-squared = 1.5152, df = 1, p-value = 0.1092

alternative hypothesis: true p is less than 0.12
95 percent confidence interval:
0.0000000 0.1364648
sample estimates:
p
0.08

  1. Confidence interval: [0.1115, 0.2685]
  2. Hypothesis test: 0.1092 -----> Cannot reject the null hypothesis due to 0.11 being larger than 0.05.
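
The `phat=0.08` above does not appear to match the 19 left-handers in the table. One way to avoid carrying over a value from another example is to compute the sample proportion from the counts themselves. A sketch, with a stand-in vector in place of `df$handedness` (the vector `handedness` below is made up to match the 19/81 split shown above):

```r
# Stand-in for df$handedness with the 19 L / 81 R split from the table.
handedness <- c(rep("L", 19), rep("R", 81))

# Sample proportion taken directly from the counts.
tab <- table(handedness)
n <- sum(tab)
phat <- tab[["L"]] / n

# Same lower-tail calculation as in the post, now with phat = 0.19.
p0 <- 0.12
se0 <- sqrt(p0 * (1 - p0) / n)
pnorm(phat, p0, se0)
```

With these counts the result is about 0.984 rather than 0.109, which is still above 0.05.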
jesshcra
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=jesshcra")
head(df)
first_name last_name age gender height weight income smoke100 exerany
1    Suzanne    Fender  30 female  61.10 137.18   1167        N       N
2       Lois    Campos  40 female  64.44 136.96   2908        N       Y
3       Sean      Wake  37   male  70.09 193.75  19449        Y       Y
4    Shirley    Cooper  23 female  63.08 182.81  11582        N       Y
5    Barbara     Hymel  59 female  64.85 188.41   2976        N       Y
6      Bryon  Williams  24   male  69.42 194.16   5485        N       Y
handedness
1          L
2          R
3          R
4          R
5          R
6          R
p = .19
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.1115398 0.2684602
phat = .19
p0 = .12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.9843839

Confidence interval: [.1115, .2685]
Hypothesis test: 0.9843 -> Cannot reject the null hypothesis due to .98 being larger than .05.

BeauNichols

   first_name last_name age gender height weight income smoke100 exerany handedness
1      Ashley    Hudson  55 female  63.54 125.44   1965        Y       Y          L
2       Larry     Ayala  40   male  68.56 143.52    401        N       Y          R
3      Arthur     Kelly  21   male  71.30 166.05  69589        N       Y          R
4 Christopher      Long  44   male  68.90 160.10  16940        Y       Y          R
5      Howard      Hang  44   male  70.13 182.23  13343        N       Y          R
6        Kurt   Sublett  31   male  69.33 204.95   7656        Y       Y          L

table(df$handedness)

L R
14 86

p = 0.14
se = sqrt(p*(1-p)/100)
c(p-2*se, p+2*se)
[1] 0.07060259 0.20939741
phat = 0.14
p0 = 0.12
se0 = sqrt(p0*(1-p0)/100)
pnorm(phat,p0,se0)
[1] 0.7308737

laurabeth
> df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=laurabeth")
> head(df)
  first_name last_name age gender height weight
1    Dorothy    Miller  39 female  61.55 196.20
2      David    Bailey  24   male  71.23 202.14
3  Margarita  Crabtree  37 female  64.51 143.51
4    Kathryn     Davis  24 female  61.72 126.94
5    Michael   Morosow  25   male  66.74 203.72
6       Lisa     Jones  47 female  69.20 182.34
  income smoke100 exerany handedness
1  16897        N       Y          R
2   1497        Y       Y          R
3  32101        Y       Y          R
4  32517        Y       Y          R
5    867        N       N          R
6  10344        Y       N          L
> table(df$handedness)

 L  R 
19 81 
> sqrt(0.19*(1-0.19)/100)
[1] 0.03923009
> c(0.19-2*0.039, 0.19+2*0.039)
[1] 0.112 0.268
> phat = 0.19
> p0 = 0.12
> se0 = sqrt(0.12*(1-0.12)/100)
> pnorm(phat,p0,se0)
[1] 0.9843839
  1. Confidence Interval: [0.112, 0.268]
  2. Hypothesis Test: [0.9843839] Because the p-value is greater than 0.05, we cannot reject the null hypothesis.
Erad
df = read.csv("https://www.marksmath.org/cgi-bin/random_data.csv?username=Erad")
head(df)
first_name last_name age gender height weight income smoke100 exerany handedness
1    Lashawn     Parks  39 female  62.07 177.27   6312        Y       Y          R
2      Clara   Landers  57 female  65.59 173.18   4364        N       N          R
3        Iva    Spears  32 female  63.22 171.55   4309        N       Y          R
4 Marguerite     Geise  57 female  63.23 173.17   4692        N       Y          R
5    Bernice    Willis  33 female  64.87 198.35  21841        N       N          L
6       Ruby    Miguel  38 female  65.75 206.60 240986        N       Y          R
table(df$handedness)
L  R 
15 85 
se = sqrt(.15*(1-.15)/100)
sqrt(.15*(1-.15)/100)
[1] 0.03570714
.15-(2*0.03570714)
[1] 0.07858572
.15+(2*0.03570714)
[1] 0.221414
phat= .15
p0= .12
se0= sqrt(p0*(1-p0)/100)
1-pnorm(phat,p0,se0)
[1] 0.1779551

The 95% confidence interval: (0.0786, 0.2214)
The p value is approximately 0.178 which is greater than .05 so we cannot reject the null hypothesis.

brifro
df = read.csv("https://marksmath.org/cgi-bin/random_data.csv?username=mark")
table(df$handedness)
 L  R 
13 87 
p=0.13
> se=sqrt(p*(1-p)/100) 
> c(p-2*se,p+2*se) 
[1] 0.06273931 0.19726069
Confidence interval is (.06, .20)
> phat=0.08
> p0=0.12
> se0=sqrt(p0*(1-p0)/100)
> pnorm(phat,p0,se0)
 [1] 0.1091773

prop.test(8,100,p=0.12,alternative="less",correct=F)
1-sample proportions test without continuity
correction
data: 8 out of 100, null probability 0.12
X-squared = 1.5152, df = 1, p-value = 0.1092
alternative hypothesis: true p is less than 0.12
95 percent confidence interval:
0.0000000 0.1364648
sample estimates:
p
0.08
Confidence Interval: [0.06273931, 0.19726069]
Hypothesis Test: 0.1091773. Cannot reject the null hypothesis due to 0.11 being larger than 0.05.