# A confidence interval for your random heights

edited June 18

(5pts)

In this problem, we're going to return to our fun web program that generates random CSV data for people. Recall that you can access it via Python like so:

``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=mark')
df.tail()
``````
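|   | first_name | last_name | age | sex | height | weight | income | activity_level |
|---|---|---|---|---|---|---|---|---|
| 0 | Donna | Dinan | 35 | female | 65.37 | 164.26 | 1947 | high |
| 1 | Antonia | Davis | 39 | female | 64.95 | 140.40 | 2188 | none |
| 2 | Stephanie | Buss | 30 | female | 60.75 | 181.83 | 18108 | high |
| 3 | Wendell | Elmore | 26 | male | 64.68 | 157.90 | 1935 | moderate |
| 4 | Nina | Mcilhinney | 21 | female | 59.94 | 163.38 | 5675 | none |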

Also recall that the data is randomly generated, but the random number generator is seeded using the `username` query parameter in the URL. If I execute that command several times, I get the same result every time. That result depends on the `username`, however, so if you run it with your forum `username`, you'll get a different result. We each have our own randomly generated data file!
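
To see why a fixed `username` always yields the same file, here is a minimal sketch of seeded generation. It is only an illustration: the function name `fake_heights` and the mean of 66 and spread of 4 are assumptions made up for this example, not the web program's actual code.

```python
import random

def fake_heights(username, n=100):
    """Return n pseudo-random heights, seeded by the username string.

    Hypothetical stand-in for the web program: the real generator is not
    shown here, so the distribution parameters below are invented.
    """
    rng = random.Random(username)  # the same seed string gives the same stream
    return [round(rng.gauss(66, 4), 2) for _ in range(n)]

same_a = fake_heights("mark")
same_b = fake_heights("mark")
other = fake_heights("audrey")

print(same_a == same_b)  # identical seed, identical data
print(same_a == other)   # a different seed gives different data
```

Running the generator twice with one name gives matching lists, while a second name gives a different list, which is the behavior described above.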

The problem: Using the code above with your `username`, generate your data file and then

1. Compute the average value of the heights in your data (which you've done before),
2. the standard deviation of the heights in your data,
3. the standard error of the heights in your data,
4. the margin of error to use the heights in your data to compute a $(100-s)\%$ confidence interval (where $s$ is your special number), and
5. the resulting $(100-s)\%$ confidence interval for height.
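
In symbols, the five quantities in the list above fit together roughly as follows. The notation is introduced here only for illustration: $n$ is the number of rows, $x_i$ are the individual heights, $s_x$ is the sample standard deviation (written $s_x$ to keep it apart from the special number $s$), and $\Phi$ is the standard normal cumulative distribution function.

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
SE = \frac{s_x}{\sqrt{n}}
$$

$$
\Phi(z^*) = 1 - \frac{s}{200}, \qquad
ME = z^* \cdot SE, \qquad
\text{CI} = \left[\bar{x} - ME,\ \bar{x} + ME\right]
$$

The $s/200$ comes from splitting the excluded $s\%$ evenly between the two tails.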

Be sure to include both the code that you typed and the results in your post.
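
For orientation, the whole computation can be sketched in a few lines. This version assumes made-up heights in place of `df.height` and uses only the standard library so it runs without downloading anything: `statistics.stdev` divides by $n-1$ like the pandas `.std()` default, and `NormalDist().inv_cdf` plays the role of `norm.ppf`. The function name `height_interval` and the sample data are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def height_interval(heights, special_number):
    """Return (xbar, s, se, me, ci) for a (100 - special_number)% interval."""
    n = len(heights)
    xbar = mean(heights)
    s = stdev(heights)               # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)            # standard error of the mean
    tail = special_number / 200      # excluded area in each tail
    z_star = NormalDist().inv_cdf(1 - tail)
    me = z_star * se                 # margin of error
    return xbar, s, se, me, (xbar - me, xbar + me)

# Made-up data standing in for df.height
rng = random.Random(0)
heights = [rng.gauss(66, 4) for _ in range(100)]

xbar, s, se, me, ci = height_interval(heights, special_number=5)
print(f"mean={xbar:.2f} sd={s:.2f} se={se:.3f} me={me:.3f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

With your own data you would pass `list(df.height)` and your own special number instead of the invented values.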

## Comments

• edited June 20

%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=jordan')
df.tail()

heights=df.height.sample(100,random_state=1)
xbar=heights.mean()
xbar
66.73939999999997

s= heights.std()
s
4.122285043492017

from numpy import sqrt
se = s/sqrt(100)
se
0.41222850434920166

from scipy.stats import norm
z=norm.ppf(0.045)
z
-1.6953977102721358

from scipy.stats import norm
z= norm.ppf(0.995)
z
2.5758293035489004

me= z*se
me
1.061830261260809

ci= [xbar-me, xbar+me]
ci
[65.6776, 67.8012]

• edited June 19
``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=sarah')
df.tail()
``````

1) Avg Values

``````
heights=df.height.sample(100,random_state=1)
xbar=heights.mean()
xbar
``````

65.5548

2) Std Deviation

``````
s= heights.std()
s
``````

4.113876610652872

3) Std Error

``````
from numpy import sqrt
se = s/sqrt(100)
se
``````

0.4113876610652872

4) Margin of Error

``````
from scipy.stats import norm
z=norm.ppf(0.045)
z
``````

z = -1.6953977102721358, so zstr = 1.6953977102721358

``````
zstr=-z
me=zstr*se
me
``````

0.6974656986042974

5) 91% Confidence Interval

``````
ci=[xbar-me,xbar+me]
ci
``````

[64.8573343013957, 66.2522656986043]

• edited June 18
``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=ben')
df.tail()
``````

mean: 66.3
sd: 3.89
se: .39
margin of error: .79
z*: 2.05

``````
from scipy.stats import norm
norm.ppf(.02)
``````

96% confidence interval: (65.51, 67.09)

• edited June 18

Code for my data table:

``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=lillian')
df.tail()
``````

1.) Average Value of the Heights:

``````
heights = df.height.sample(100,random_state=1)
xbar = heights.mean()
xbar
``````

= 66.30110000000002

2.) Standard Deviation of the Heights:

``````
s = heights.std()
s
``````

= 3.7302560084917624

3.) Standard Error of the Heights:

``````
from numpy import sqrt
se = s/sqrt(100)
se
``````

= 0.37302560084917624

4.) Margin of Error:

``````
me = 2*se
me
``````

= 0.7460512016983525

5.) Confidence Interval for Height:

``````
[xbar - me, xbar + me]
``````

= [65.55504879830167, 67.04715120169837]

6.) z* Multiplier:

I had a 94% CI, so my number was 6

``````
from scipy.stats import norm
-norm.ppf(0.03)
``````

= 1.880793608151251

• edited June 20
``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=alex')
df.tail()
``````

1.

``````
m= df.height.mean()
m
``````

m= 66.20770000000003

2.

``````
s= df.height.std()
s
``````

s= 3.490168255907652

3.

``````
from numpy import sqrt
s = df.height.std()
se = s/sqrt(100)
se
``````

se= 0.3490168255907652

4. (100-1) = 99% confidence

5.

``````
from scipy.stats import norm
z= norm.ppf(0.995)
z

me= z*se
me

ci= [m-me, m+me]
ci
``````

ci= [65.30869223321172, 67.10670776678835]

• edited June 18

First, I'll import my data and compute my mean and standard deviation:

``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=audrey')
m = df.height.mean()
s = df.height.std()
[m,s]
``````

[66.49599999999998, 3.9782326922309807]

Thus, my standard error is:

``````
se = s/10
se
``````

0.39782326922309807

and my $z^*$-multiplier is 1.96 since:

``````
from scipy.stats import norm
norm.ppf(0.025)
``````

-1.9599639845400545
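
The sign in that output is worth a second look: `norm.ppf(0.025)` returns the lower-tail cutoff, which is negative, and $z^*$ is its absolute value. Here is a small sketch of the relationship, using the standard library's `NormalDist().inv_cdf` as a stand-in for `norm.ppf`. The helper name `z_star` is made up for this example.

```python
from statistics import NormalDist

def z_star(special_number):
    """Positive multiplier for a (100 - special_number)% confidence interval."""
    tail = special_number / 200             # half of the excluded area sits in each tail
    lower = NormalDist().inv_cdf(tail)      # negative, like norm.ppf(tail)
    upper = NormalDist().inv_cdf(1 - tail)  # positive, like norm.ppf(1 - tail)
    assert abs(lower + upper) < 1e-9        # the two cutoffs are mirror images
    return upper

for s in (1, 5, 9):
    print(s, round(z_star(s), 4))
```

For special numbers 1, 5, and 9 this gives multipliers of about 2.576, 1.960, and 1.695, matching the `norm.ppf` values that appear elsewhere in this thread.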

• edited June 20

1.) Mean:

``````
xbar = df.height.mean()
xbar
``````

65.35810000000001

2.) Standard Deviation:

``````
s = df.height.std()
s
``````

3.701679306324183

3.) Standard error

``````
from numpy import sqrt
se = s/sqrt(100)
se
``````

0.37016793063241826

4.) Margin of error

``````
me = 2*se
me
``````

0.7403358612648365

5.) Confidence Interval

``````
[xbar - me, xbar + me]
``````

[64.61776413873517, 66.09843586126485]

6.) z* = 2.33
•

``````
m = df.height.mean()
m
``````

mean=67.02449999999999

``````
s = df.height.std()
s
``````

standard deviation=3.8097767306462633

``````
se = s/10
se
``````

standard error=0.3809776730646263

``````
sp = (100-s)/100
sp
``````

0.9619022326935374

``````
from scipy.stats import norm

z = norm.ppf(sp)
z
``````

z*=1.7732002261111544

``````
me = z*se
me
``````

Margin of Error=0.6755496960214968

``````
ci = [m-me, m+me]
ci
``````

Confidence Interval=[66.3489503039785, 67.70004969602148]

• edited June 20

This is the code importing my data along with the mean and standard deviation:

``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=beau')
df.tail()
s = df.height.std()
m = df.height.mean()
[m,s]
``````

[65.81690000000002, 3.854679615661196]

The standard error is: 0.3854679615661196

``````
se = s/10
se
``````

The z multiplier is: 2.2

``````
from scipy.stats import norm
norm.ppf(0.015)
``````

-2.1700903775845606

The margin of error is: 0.8480295154454632

``````
me = 2.2 * se
me
``````

The confidence interval is: [64.96887048455456, 66.66492951544548]

``````
le = m - (2.2 * se)
re = m + (2.2 * se)
[le,re]
``````
• 1+2) mean and standard deviation

``````
%matplotlib inline
import pandas as pd
df = pd.read_csv('https://www.marksmath.org/cgi-bin/random_data.csv?username=isabel')
m = df.height.mean()
s = df.height.std()
[m,s]
``````

[66.81249999999999, 4.248674273324336]

3) standard error

``````
se = s/10
se
``````

0.4248674273324336

4) margin of error

``````
me = 2*se
me
``````

0.8497348546648672

5) confidence interval

``````
ci = [m-me, m+me]
ci
``````

[65.96276514533511, 67.66223485466486]
