A classic data set for investigating categorical variables deals with the RMS Titanic. The Titanic sank in the North Atlantic Ocean on April 15, 1912. There were more than 2,000 passengers and crew on board, and more than 1,500 did not survive. The data set titanic.jmp, provided with this homework, contains data on 887 passengers. For each passenger, the variables Class (class of the passenger) and Survived (whether or not the passenger lived) were recorded. A main question of interest is whether there is an association between class and survival.
(a) Use JMP to obtain the distribution of the variable Survived; this is the marginal distribution of Survived. See the Categorical Data JMP guide for any additional help. Write down the marginal distribution below (give the categories of Survived and the proportion in each category).
(b) One variable that could be associated with the survival of a passenger is their Class. In studying this relationship, which variable is the explanatory variable? Circle the appropriate answer.
Class          Survived
(c) Use JMP to obtain a contingency table and mosaic plot of the relationship between the Class of a passenger and whether or not they Survived. Change the contingency table so that it includes only the Count and Row % values. Print out the output and turn it in with this assignment. See the Categorical Data JMP guide for any additional help.
(d) Give the conditional distribution of Survived given that the passenger was in First class.
(e) Compare this conditional distribution to the overall marginal distribution of Survived from part (a). Are the two distributions similar to each other, or are they different? Explain.
(f) Based on the contingency table and mosaic plot, does it appear there is an association between Class and Survived?
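
For anyone who wants to check the ideas in parts (a), (c), and (d) outside JMP, the following is a minimal Python sketch of how a marginal distribution, a table of counts, and a conditional (Row %) distribution are computed. It is only an illustration: the records are a small hypothetical toy sample, NOT the titanic.jmp data, and the function names marginal, contingency, and conditional are made up for this example. The assigned output should still come from JMP.

```python
from collections import Counter

# Hypothetical toy records (NOT the Titanic data): (Class, Survived) pairs.
records = [
    ("First", "Yes"), ("First", "Yes"), ("First", "Yes"), ("First", "No"),
    ("Second", "Yes"), ("Second", "No"),
    ("Third", "Yes"), ("Third", "No"), ("Third", "No"), ("Third", "No"),
]


def marginal(records):
    """Proportion of all records falling in each Survived category."""
    counts = Counter(survived for _, survived in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def contingency(records):
    """Count of records in each (Class, Survived) cell."""
    return Counter(records)


def conditional(records, given_class):
    """Row proportions: distribution of Survived within one Class."""
    row = Counter(s for c, s in records if c == given_class)
    total = sum(row.values())
    return {category: n / total for category, n in row.items()}


marg = marginal(records)
table = contingency(records)
first = conditional(records, "First")

print("Marginal distribution of Survived:", marg)
print("Count in the (First, Yes) cell:", table[("First", "Yes")])
print("Survived given Class = First:", first)
```

In this toy sample the marginal distribution splits evenly between Yes and No, while the conditional distribution for First is 0.75 Yes and 0.25 No. A gap of that kind between a conditional distribution and the marginal distribution is the sort of evidence parts (e) and (f) ask you to look for in the real data.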