Inference for a Proportion in R

The data here come from a large table of records of heart attack victims. Getting tables into R is a bit complicated, so use this file, which contains only the data on the DIED variable (coded 1=died). Save it on your hard drive in R's working directory (the command getwd() shows where that is). If you name the file DIED4R.txt, you can use this R command to input the data:

> died = scan(file="DIED4R.txt")
Read 12844 items

This puts the data into a variable called "died". Use the table command on this variable to get counts if you do not already have them.

> table(died)
died
    0     1 
11434  1410 
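
If you would rather compute the counts than read them off the table, something like the following should work. This is only a sketch: the vector is rebuilt here from the counts shown above so the example runs without the data file, and the names deaths, n and p_hat are made up for illustration.

```r
# Stand-in for the scanned data (an assumption so the example runs without the file):
# a 0/1 vector with the same counts as the table above.
died <- rep(c(0, 1), times = c(11434, 1410))

deaths <- sum(died == 1)   # number of 1's (patients who died)
n      <- length(died)     # number of subjects
p_hat  <- mean(died)       # sample proportion of deaths

deaths
n
p_hat
```

With the real data, skip the first assignment and use the died variable created by scan.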

The table shows that 1410 of the 12844 patients died. A single command, prop.test, gives a confidence interval and tests any hypothesized proportion p0 you specify. Here we test whether the results from this hospital match a hypothetical national average of 10%. Ignore the X-squared value and use the p-value for the hypothesis test. The command needs the number of 1's (from the table command), the number of subjects (from scan or length), and the hypothesized proportion.

> prop.test(1410,12844,p=0.1)

        1-sample proportions test with continuity correction

data:  1410 out of 12844, null probability 0.1 
X-squared = 13.5385, df = 1, p-value = 0.0002337
alternative hypothesis: true p is not equal to 0.1 
95 percent confidence interval:
 0.1044507 0.1153421 
sample estimates:
        p 
0.1097789  
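
The pieces of this output can also be pulled out individually by saving the result in a variable. The sketch below does that and, for the curious, checks the X-squared value by hand using the usual continuity-corrected formula for one proportion. The names result, x, n, p0 and x2 are chosen here for illustration only.

```r
# Save the test result so its components can be used separately.
result <- prop.test(1410, 12844, p = 0.1)

result$p.value    # the p-value
result$conf.int   # the 95 percent confidence interval
result$estimate   # the sample proportion

# Hand check of X-squared: (|x - n*p0| - 0.5)^2 / (n * p0 * (1 - p0))
x  <- 1410
n  <- 12844
p0 <- 0.1
x2 <- (abs(x - n * p0) - 0.5)^2 / (n * p0 * (1 - p0))
x2

# Upper-tail chi-squared probability with 1 degree of freedom.
pchisq(x2, df = 1, lower.tail = FALSE)
```

The hand-computed x2 and its tail probability should agree with the X-squared and p-value lines printed by prop.test.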

With a p-value of about 0.0002, we reject the hypothesis that the local proportion is the same as the national proportion. However, the confidence interval (about 10.4% to 11.5%) indicates that it is only slightly higher. Is that national average exactly 10.0000%? Is there enough of a difference to matter?
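
One way to look at the size of the difference is to express the interval as an excess over the hypothesized 10%. This is a small sketch using the interval endpoints copied from the output above; the names ci and excess are illustrative.

```r
# 95 percent confidence interval copied from the prop.test output.
ci <- c(0.1044507, 0.1153421)

# Excess over the hypothesized 10%, in percentage points.
excess <- (ci - 0.10) * 100
excess
```

The local rate is plausibly somewhere between roughly 0.4 and 1.5 percentage points above 10%. Whether that matters is a judgment about the subject matter, not something the test can decide.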


© 2006 Robert W. Hayden