Question Detail: I. Descriptive Statistics: (20 pts)Download the data set gerstman1.sav. Complete...

Question Detail:

I. Descriptive Statistics: (20 pts)

Download the data set gerstman1.sav. Complete the following:

1) List the level of measurement for the variables, AGE, SEX, AGEGRP, SBP1 in the data set and describe the appropriate numerical and descriptive statistics based on these. 4 pts

2) Calculate (by hand) the mean and standard deviation for age based on the first 20 records in the data set. Use the table below to do your calculations. 4 pts

Record Number	AGE	Observed Mean	Difference	Difference Squared
1	3
2	11
3	15
4	46
5	14
6	35
7	46
8	35
9	40
10	29
11	22
12	16
13	31
14	42
15	22
16	45
17	24
18	1
19	28
20	25
Sum

3) Generate numerical and graphical descriptive statistics for each of the variables, namely, AGE, SEX, AGEGRP and SBP1. 8 pts

4) Interpret the output you generated in part 3 for each of the variables in the data set. 4 pts

II. Paired and Independent t tests: (20 pts)

Download the data set HIV.sav and use SPSS to complete the following calculations. Be sure to include interpretation of the SPSS output in your responses.

1) Use the 5-step approach to hypothesis testing and the calculation of the 95% confidence intervals to answer the following research question: Did you observe a significant difference in Systolic Blood Pressure (SBP) over the course of the study? (10 pts)

2) Use the 5-step approach to hypothesis testing and the calculation of the 95% confidence intervals to answer the following research question: Is there a difference in SBP1 based on HIV status? (10 pts)

III. Cross-Tabulation: (20pts)

Download the data set alcohol_Bladder.sav and use SPSS to complete the following calculations. Be sure to include interpretation of the SPSS output in your responses.

1) Use the 5-step approach to hypothesis testing to answer the following research question: In the sample provided in alcohol_Bladder.sav, are the variables income and Bladder Cancer independent of each other? (Note: The question could also be asked as: Is there an association between the variables? A lack of independence implies an association.) (10 pts)

2) Answer the following based on the cross-tabulation of alcohol consumption and Bladder Cancer: (10 pts)

Alcohol consumption * Bladder Cancer Crosstabulation
Count
Alcohol consumption	Bladder Cancer: No	Bladder Cancer: Yes	Total
"Less than 1 drink per week"	30	54	84
4 or more drinks per month	22	115	137
Total	52	169	221

Calculate the odds ratio. 4 pts

Describe how the odds ratio differs from the relative risk or risk ratio and why you would choose it here. 2 pts

Interpret the odds ratio and how it might impact the practice of public health practitioners. 2 pts

If you wanted to know whether this relationship was statistically significant what test(s) could you use? 2 pts
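
The arithmetic behind the odds ratio can be sketched in a few lines of Python. This is only an illustration built from the counts in the crosstabulation above; it assumes that "4 or more drinks per month" is the exposed group and "Less than 1 drink per week" is the reference group, and the variable names are made up for the example. The risk ratio is computed alongside it purely for comparison.

```python
# Counts copied from the crosstabulation above.
# Assumption: "4 or more drinks per month" is the exposed group and
# "Less than 1 drink per week" is the reference group.
exposed_no, exposed_yes = 22, 115
reference_no, reference_yes = 30, 54

# Odds of bladder cancer in each group, and their ratio.
odds_exposed = exposed_yes / exposed_no
odds_reference = reference_yes / reference_no
odds_ratio = odds_exposed / odds_reference

# Risk (proportion with bladder cancer) in each group, and their ratio.
risk_exposed = exposed_yes / (exposed_yes + exposed_no)
risk_reference = reference_yes / (reference_yes + reference_no)
risk_ratio = risk_exposed / risk_reference

print(f"odds ratio = {odds_ratio:.3f}")
print(f"risk ratio = {risk_ratio:.3f}")
```

Swapping which row is treated as the reference group inverts both ratios, so the choice should be stated with the result.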

IV. ANOVA: 20pts

Download the data set inc-pov-hlthins.sav and use SPSS to complete the following calculations. Be sure to include interpretation of the SPSS output in your responses.

1) Produce box plots of income for each region of the US in the data set and interpret them. Based on the box plots do you expect to find a difference between any of the groups? 4 pts

2) Create descriptive statistics for each region, using the variable income. 4 pts

Include skewness and kurtosis in the output. 2 pts

Create a histogram for each group. 2 pts

3) Run the ANOVA for income based on region. Include the ANOVA table and the test for Homogeneity of Variance. Interpret the results. 6 pts

4) Conduct post hoc analysis using Bonferroni and LSD methods to control for multiple testing. 6 pts

Provide the output. 2 pts

Interpret your results. 3 pts

Why do you need to use methods like Bonferroni and LSD with the ANOVA? 1 pt

V. Regression: 20pts

1) Download the data set Gender_BMI.sav and use SPSS to complete the following calculations.

Use an independent t test and simple linear regression to identify whether a relationship exists between gender and BMI. (10 pts)

Run the appropriate t test in SPSS, report the significance of the difference in means and the confidence interval, and interpret the results. 4 pts

Run the simple linear regression in SPSS, report the significance of the variable gender and the overall fit of the model (using r²). Interpret the results. 4 pts

How are these two approaches different? 1 pt

Are your conclusions the same using both tests? 1 pt

2) Answer the following questions using the provided output: 10 pts

Multiple Linear Regression 5 pts

Researchers looked at the Emergency Department Records of 60 adults ages 22 to 46 years who arrived in the ED complaining of chest pain during a 6 month period of time. They did not use a random sample as they wanted 30 males and 30 females in the study. They collected information on BMI (a measure of overweight/obesity), Age, SBP (Systolic Blood Pressure) and the diagnosis of Diabetes. Their first hypothesis (alternative) was that the dependent variable SBP is associated with BMI, Age, Diabetes, and Gender. They conducted a multiple linear regression to test their hypothesis. Here are the results (note that they had two models and chose to use the second one):

Model Summary^c
Model	R	R Square	Adjusted R Square	Std. Error of the Estimate
1	.796^a	.634	.608	5.443
2	.792^b	.627	.607	5.445
a. Predictors: (Constant), Diabetes, Age, Gender, BMI b. Predictors: (Constant), Age, Gender, BMI c. Dependent Variable: SBP

ANOVA^c
Model	Source	Sum of Squares	df	Mean Square	F	Sig.
1	Regression	2824.968	4	706.242	23.839	.000^a
1	Residual	1629.408	55	29.626
1	Total	4454.376	59
2	Regression	2794.222	3	931.407	31.418	.000^b
2	Residual	1660.155	56	29.646
2	Total	4454.376	59
a. Predictors: (Constant), Diabetes, Age, Gender, BMI
b. Predictors: (Constant), Age, Gender, BMI
c. Dependent Variable: SBP

Coefficients^a
Model	Term	Standardized Coefficients (Beta)	t	Sig.	95.0% Confidence Interval for B: Lower Bound	Upper Bound
1	(Constant)		8.092	.000	57.471	95.309
1	Gender	-.189	-2.100	.040	-6.381	-.149
1	BMI	.557	6.130	.000	1.213	2.392
1	Age	.507	6.067	.000	.426	.847
1	Diabetes	-.089	-1.019	.313	-4.752	1.549
2	(Constant)		8.885	.000	55.243	87.407
2	Gender	-.173	-1.950	.056	-6.054	.081
2	BMI	.574	6.413	.000	1.276	2.436
2	Age	.517	6.243	.000	.441	.859
a. Dependent Variable: SBP

1) Which variables in model 1 are significant? 1 pt

2) Which variables in model 2 are significant? 1 pt

3) Why did they choose model 2? 1 pt

4) What is the “fit” of model 2 (the one they chose to use)? 1 pt

5) Is this a good model, why or why not? 1 pt
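
The fit figures in the Model Summary can be reproduced from the ANOVA table, which may help when judging the two models. The sketch below is a cross-check only, not part of the SPSS output; the function name `fit_summary` is made up for the example, and it uses the standard relations R² = SS regression / SS total and adjusted R² = 1 - (1 - R²)(n - 1)/(residual df).

```python
def fit_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Return (R squared, adjusted R squared, F) from ANOVA table entries."""
    ss_total = ss_regression + ss_residual
    r_squared = ss_regression / ss_total
    # Total df equals n - 1.
    df_total = df_regression + df_residual
    adjusted = 1 - (1 - r_squared) * df_total / df_residual
    f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)
    return r_squared, adjusted, f_stat


# Sums of squares and df copied from the ANOVA table above.
model_1 = fit_summary(2824.968, 1629.408, 4, 55)
model_2 = fit_summary(2794.222, 1660.155, 3, 56)

for label, (r2, adj, f) in (("Model 1", model_1), ("Model 2", model_2)):
    print(f"{label}: R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}, F = {f:.3f}")
```

Dropping Diabetes lowers R² only in the third decimal place and leaves adjusted R² almost unchanged, which is the comparison the two-model output invites.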

Multiple Logistic Regression 5 pts

The Emergency Department Researchers selected another 60 adults and again looked at Age, SBP, BMI, Gender, and Diabetes. This time however, they also collected information on whether the chest pain was diagnosed as an MI (aka Heart Attack) or something else. Now their alternative hypothesis was that gender was related to the diagnosis of an MI, after controlling for Age, SBP, BMI, and Diabetes. They used multiple logistic regression to test their hypothesis and these are their results (note that there are multiple models and they chose to use the final one):

Model Fitting Information
Model	-2 Log Likelihood	Chi-Square	df	Sig.
Intercept Only	74.995
Final	16.398	58.598	5	.000

Pseudo R-Square
Cox and Snell	.623
Nagelkerke	.866
McFadden	.767

Parameter Estimates
Heart Attack^a	Term	B	Std. Error	Wald	df	Sig.	Exp(B)	95% Confidence Interval for Exp(B): Lower Bound	Upper Bound
No	Intercept	115.037	43.679	6.936	1	.008
No	BMI	-1.400	.572	5.995	1	.014	.247	.080	.756
No	Age	.037	.116	.099	1	.753	1.037	.826	1.303
No	Diabetes	.811	1.471	.304	1	.581	2.251	.126	40.193
No	SBP	-.469	.213	4.849	1	.028	.626	.412	.950
No	[Gender=1]	-11.866	4.695	6.389	1	.011	7.025E-6	7.088E-10	.070
No	[Gender=2]	0^b	.	.	0	.	.	.	.

1) Is the final model significant? 2 pts

2) What are the odds ratios for each of the significant variables, and what do they mean? 2 pts

3) Will this model help the researchers, why or why not? 1 pt
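
As a reading aid for the output above, the Exp(B) column is the exponential of the B column, and the likelihood ratio chi-square is the drop in -2 log likelihood from the intercept-only model to the final model. The sketch below recomputes both from the printed values; the dictionary and variable names are made up for the example, and small differences from the SPSS figures come from B being rounded to three decimals.

```python
import math

# B coefficients copied from the Parameter Estimates table above.
coefficients = {
    "BMI": -1.400,
    "Age": 0.037,
    "Diabetes": 0.811,
    "SBP": -0.469,
    "[Gender=1]": -11.866,
}

# Exp(B) is the odds ratio for a one-unit increase in the predictor
# (or for the listed category versus the reference category).
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# Likelihood ratio chi-square: difference in -2 log likelihood.
lr_chi_square = 74.995 - 16.398

for name, value in odds_ratios.items():
    print(f"{name}: Exp(B) = {value:.4g}")
print(f"likelihood ratio chi-square = {lr_chi_square:.3f}")
```

Note that the outcome category modeled is "No" heart attack, so each Exp(B) describes the odds of not having an MI.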

005_ucklzk-icpf5vrb.sav 005_vcolzk-jcjiewx1.sav 005_acslzk-vru1b15w.sav 005_t8slzk-ztedysln.sav 005_jcslzk-aw52bi0z.sav

Answered Same Day · Dec 21, 2021

David · Accepted Answer

PUBH6125 FINAL EXAM
PUBH 6125/8125 Biostatistics
Answer the following questions. Copy and paste any required data charts or summaries into this Word document. Use additional space as needed. Be sure to include your name on the document and use the file naming convention. This exam is open book and open notes.
I. Descriptive Statistics: (20 pts)
Download the data set gerstman1.sav. Complete the following:
1) List the level of measurement for the variables, AGE, SEX, AGEGRP, SBP1 in the data set and describe the appropriate numerical and descriptive statistics based on these. 4 pts
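Answer: The following table shows the level of measurement for each variable:
Variable Name	Measurement
AGE	Scale
SEX	Nominal
AGEGRP	Ordinal
SBP1	Scale
The scale variables (AGE and SBP1) are summarized with the mean, standard deviation, range, skewness, and kurtosis, together with a histogram. The categorical variables (SEX and AGEGRP) are summarized with frequency tables and pie charts.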
2) Calculate (by hand) the mean and standard deviation for age based on the first 20 records in the data set. Use the table below to do your calculations. 4 pts
Record Number	AGE	Observed Mean	Difference	Difference Squared
1	3
2	11
3	15
4	46
5	14
6	35
7	46
8	35
9	40
10	29
11	22
12	16
13	31
14	42
15	22
16	45
17	24
18	1
19	28
20	25
Sum
Answer:
Record Number	AGE	Observed Mean	Difference	Difference Squared
1	3	26.5	-23.5	552.25
2	11	26.5	-15.5	240.25
3	15	26.5	-11.5	132.25
4	46	26.5	19.5	380.25
5	14	26.5	-12.5	156.25
6	35	26.5	8.5	72.25
7	46	26.5	19.5	380.25
8	35	26.5	8.5	72.25
9	40	26.5	13.5	182.25
10	29	26.5	2.5	6.25
11	22	26.5	-4.5	20.25
12	16	26.5	-10.5	110.25
13	31	26.5	4.5	20.25
14	42	26.5	15.5	240.25
15	22	26.5	-4.5	20.25
16	45	26.5	18.5	342.25
17	24	26.5	-2.5	6.25
18	1	26.5	-25.5	650.25
19	28	26.5	1.5	2.25
20	25	26.5	-1.5	2.25
Sum	530			3589
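Here the mean = 530/20 = 26.5. Dividing the sum of squared differences by n = 20 gives a standard deviation of √(3589/20) ≈ 13.4. Because these 20 records are a sample, the usual sample formula divides by n - 1 instead (this is also what SPSS reports), which gives √(3589/19) ≈ 13.7.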
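
The hand calculation can be cross-checked with a short script. This is a minimal sketch using only the 20 AGE values from the table; the variable names are made up for the example. It reports the standard deviation with both divisors, n and n - 1, so the two conventions can be compared.

```python
import math

# First 20 AGE values, copied from the table above.
ages = [3, 11, 15, 46, 14, 35, 46, 35, 40, 29,
        22, 16, 31, 42, 22, 45, 24, 1, 28, 25]

n = len(ages)
mean = sum(ages) / n

# Sum of squared differences from the mean (the last column of the table).
sum_sq = sum((x - mean) ** 2 for x in ages)

sd_population = math.sqrt(sum_sq / n)        # divisor n
sd_sample = math.sqrt(sum_sq / (n - 1))      # divisor n - 1

print(f"n = {n}, sum = {sum(ages)}, mean = {mean}")
print(f"sum of squared differences = {sum_sq}")
print(f"SD (divide by n)     = {sd_population:.2f}")
print(f"SD (divide by n - 1) = {sd_sample:.2f}")
```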
3) Generate numerical and graphical descriptive statistics for each of the variables, namely, AGE, SEX, AGEGRP and SBP1. 8 pts
Answer: The following table shows the descriptive statistics for AGE and SBP1:
Descriptive Statistics
Variable	Range	Minimum	Maximum	Mean	Std. Deviation	Variance	Skewness	Kurtosis
age	63	1	64	28.16	14.674	215.322	.167	-.654
sbp1	82	61	143	109.02	16.631	276.579	-.411	-.020
The following figure shows the histogram for AGE:
The following figure shows the histogram for SBP1:
The following table shows the frequency table for SEX:
sex
Value	Frequency	Percent	Valid Percent	Cumulative Percent
F	36	48.0	48.0	48.0
M	39	52.0	52.0	100.0
Total	75	100.0	100.0
The following figure shows the pie chart of SEX:
The following table shows the frequency table for AGEGRP:
agegrp
Value	Frequency	Percent	Valid Percent	Cumulative Percent
0 - 9	16	21.3	21.3	21.3
10 - 19	17	22.7	22.7	44.0
20 - 29	17	22.7	22.7	66.7
30 - 39	15	20.0	20.0	86.7
40 - 49	8	10.7	10.7	97.3
50 - 59	2	2.7	2.7	100.0
Total	75	100.0	100.0
The following figure shows the pie chart of AGEGRP:
4) Interpret the output you generated in part 3 for each of the variables in the data set.  4 pts
Answer: From the tables above, the mean AGE is 28.16 with a standard deviation of 14.674. Ages range from 1 to 64, a range of 63. The skewness coefficient is 0.167, so AGE is slightly positively skewed, and the kurtosis coefficient is -0.654, so the distribution is platykurtic (flatter than a Normal distribution).
The mean SBP1 is 109.02 with a standard deviation of 16.631. Values range from 61 to 143, a range of 82. The skewness coefficient is -0.411, so SBP1 is slightly negatively skewed. The kurtosis coefficient is -0.020, which is so close to zero that the distribution is only marginally platykurtic.
For SEX, 36 of the 75 subjects (48.0%) are female and 39 (52.0%) are male. For AGEGRP, the 10 - 19 and 20 - 29 groups are the largest (17 subjects, 22.7% each) and the 50 - 59 group is the smallest (2 subjects, 2.7%).
II. Paired and Independent t tests: (20 pts)
Download the data set HIV.sav and use SPSS to complete the following calculations.  Be sure to include interpretation of the SPSS output in your responses.
1) Use the 5-step approach to hypothesis testing and the calculation of the 95% confidence intervals to answer the following research question:  Did you observe a significant difference in Systolic Blood Pressure (SBP) over the course of the study? (10 pts)
Answer: Here we want to test whether there is a significant difference in Systolic Blood Pressure (SBP) over the course of the study.
We assume that the systolic blood pressure at the first measurement (sbp1) follows a Normal distribution with mean m1 and the systolic blood pressure at the second measurement (sbp2) follows a Normal distribution with mean m2. We test the null hypothesis H: m1 = m2 against the alternative K: m1 ≠ m2.
Because the same subjects are measured twice, we use a paired-samples t-test. The results are given below:
Paired Differences
Pair	Mean	Std. Deviation	Std. Error Mean	95% Confidence Interval of the Difference: Lower	Upper	t	df	Sig. (2-tailed)
sbp1 - sbp2	6.878	3.415	.402	6.076	7.681	17.088	71	.000
From the table above, the observed value of t is 17.088 with 71 degrees of freedom. The p-value is reported as .000 under 'Sig. (2-tailed)', that is, p < 0.001. Since the p-value is less than 0.05, we reject H at the 5% level of significance and conclude that there is a significant difference in Systolic Blood Pressure (SBP) over the course of the study. The mean of sbp1 - sbp2 is 6.878, so SBP was lower on average at the second measurement.
Here the 95% confidence interval for the difference is (6.076, 7.681).
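
The t statistic and confidence interval can be recomputed from the summary figures in the table as a check on the SPSS output. This is a sketch under one stated assumption: the two-sided 95% critical value for 71 degrees of freedom is taken as roughly 1.994 from a t table rather than computed. The variable names are made up for the example.

```python
import math

# Summary figures copied from the paired-samples output above.
mean_diff = 6.878
sd_diff = 3.415
df = 71
n = df + 1  # number of pairs

# Assumption: two-sided 95% critical value of t for 71 df, from a t table.
T_CRIT = 1.994

standard_error = sd_diff / math.sqrt(n)
t_stat = mean_diff / standard_error
ci_lower = mean_diff - T_CRIT * standard_error
ci_upper = mean_diff + T_CRIT * standard_error

print(f"standard error = {standard_error:.3f}")
print(f"t = {t_stat:.2f}")
print(f"95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

The recomputed values differ from SPSS only in the last decimal place because the inputs are already rounded.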
2) Use the 5-step approach to hypothesis testing and the calculation of the 95% confidence intervals to answer the following research question:  Is there a difference in SBP1 based on HIV status? (10 pts)
Answer: Here we want to test whether there is a significant difference in SBP1 based on HIV status.
We assume that SBP1 for HIV = Y follows Normal distribution with mean m1 and SBP1 for HIV = N follows Normal distribution with mean m2. Here we want to test, H: m1 = m2 vs. K: m1 ≠ m2.
Because the two HIV groups contain different subjects, we use an independent-samples t-test. The results are given below:
Row	Levene's Test for Equality of Variances: F	Sig.	t-test for Equality of Means: t	df	Sig. (2-tailed)	Mean Difference	Std. Error Difference	95% Confidence Interval of the Difference: Lower	Upper
Equal variances assumed	.109	.743	-2.580	70	.012	-9.679	3.752	-17.162	-2.197
From the table above, Levene's Test for Equality of Variances gives F = 0.109 with p-value = 0.743. Since this p-value is greater than 0.05, we fail to reject the null hypothesis of equal variances at the 5% level of significance, so we use the 'Equal variances assumed' row. The observed value of t is -2.580 with 70 degrees of freedom, and the p-value is 0.012 (from 'Sig. (2-tailed)'). Since the p-value is less than 0.05, we reject H at the 5% level of significance and conclude that there is a significant difference in mean SBP1 based on HIV status.
Here the 95% confidence interval for the difference is (-17.162, -2.197).
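
The link between this interval and the test result can be checked from the mean difference and its standard error alone. The sketch below is illustrative: the helper `interval_from_se` is a made-up name, and the critical value of about 1.994 for 70 degrees of freedom is an assumption taken from a t table.

```python
def interval_from_se(mean_difference, standard_error, t_critical):
    """Return (t statistic, lower bound, upper bound) for a mean difference."""
    t_stat = mean_difference / standard_error
    margin = t_critical * standard_error
    return t_stat, mean_difference - margin, mean_difference + margin


# Figures copied from the 'Equal variances assumed' row above.
# Assumption: two-sided 95% critical value of t for 70 df is about 1.994.
t_stat, lower, upper = interval_from_se(-9.679, 3.752, 1.994)

# A 95% interval that excludes zero agrees with a two-sided p-value below 0.05.
excludes_zero = lower > 0 or upper < 0

print(f"t = {t_stat:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
print(f"interval excludes zero: {excludes_zero}")
```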
III. Cross-Tabulation: (20pts)
Download the data set alcohol_Bladder.sav and use SPSS to complete the following calculations. Be sure to include interpretation of the SPSS output in your responses.
1) Use the 5-step approach to hypothesis testing to answer the following research question: In the sample provided in alcohol_Bladder.sav, are the variables income and Bladder Cancer independent of each other? (Note:
