# SPSS Data Analysis American Heart Association Prediction Term Paper

**Pages:** 9 (2342 words) · **Bibliography Sources:** 0 · **File:** .docx · **Level:** College Senior · **Topic:** Other

SPSS Data Analysis

American Heart Association Prediction of Stroke Risks

Over a ten-year study, the American Heart Association collected data on age, blood pressure, and smoking status in order to estimate the risk of stroke within the sample population. In this study, risk is expressed as the probability (times 100) that the patient will have a stroke over the next ten-year period. Smoking status is coded as a dummy variable: 1 indicates a smoker and 0 indicates a nonsmoker.

Data Set (columns: Risk, Age, Blood Pressure, Smoker)

A. Using the data, develop an estimated regression equation that relates the risk of a stroke to the person's age, blood pressure, and whether the person is a smoker.

With three independent variables representing the individual's age, blood pressure, and whether or not they smoke, the analysis calls for a multiple linear regression. The dependent variable is the numeric risk level for each individual, modeled as a function of age, blood pressure, and smoking habits. The general form of the equation is:

Y = a + b1*X1 + b2*X2 + b3*X3

where X1 is age, X2 is blood pressure, and X3 is the smoker dummy. Plugging in the constant and the unstandardized coefficients (column B) reported in the SPSS Coefficients table gives:

Y = -91.759 + 1.077*X1 + 0.252*X2 + 8.740*X3

B. Use the regression analysis tool to obtain complete diagnostics.

Variables Entered/Removed (b)

| Model | Variables Entered | Variables Removed | Method |
|---|---|---|---|
| 1 | smoker patient, blood pressure, patient age (years) (a) | | Enter |

a. All requested variables entered.

b. Dependent Variable: Risk of stroke (%)

Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .935 (a) | .873 | .850 | 5.75657 |

a. Predictors: (Constant), smoker patient, blood pressure, patient age (years)

b. Dependent Variable: Risk of stroke (%)
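The adjusted R Square in the Model Summary can be cross-checked from R Square, the number of cases, and the number of predictors. The sketch below is a minimal illustration that takes n = 20 from the N column of the Residuals Statistics output and p = 3 predictors; the function name `adjusted_r_square` is a hypothetical helper, not part of SPSS.

```python
# Values reported in the SPSS Model Summary output.
r_square = 0.873
n_cases = 20        # N reported in the Residuals Statistics output
n_predictors = 3    # age, blood pressure, smoker dummy


def adjusted_r_square(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


adj = adjusted_r_square(r_square, n_cases, n_predictors)
print(f"adjusted R^2 = {adj:.3f}")
```

The result is about 0.849, which agrees with the reported .850 up to the rounding of R Square to three decimals.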

ANOVA (b)

| Model | | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | | 3 | | 36.823 | .000 (a) |
| | Residual | | 16 | 33.138 | | |
| | Total | | 19 | | | |

a. Predictors: (Constant), smoker patient, blood pressure, patient age (years)

b. Dependent Variable: Risk of stroke (%)
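Two relationships tie the ANOVA output to the Model Summary: the standard error of the estimate is the square root of the residual mean square, and the overall F statistic can be written in terms of R Square and the degrees of freedom. The sketch below checks both using only the figures reported in the SPSS output; small differences are expected because R Square is rounded to three decimals.

```python
import math

mse = 33.138          # Residual Mean Square from the ANOVA output
df_regression = 3
df_residual = 16
r_square = 0.873      # from the Model Summary output

# Standard error of the estimate = sqrt(residual mean square).
std_error_estimate = math.sqrt(mse)

# Overall F expressed through R^2: (R^2 / p) / ((1 - R^2) / (n - p - 1)).
f_from_r2 = (r_square / df_regression) / ((1.0 - r_square) / df_residual)

print(f"std. error of the estimate = {std_error_estimate:.4f}")
print(f"F recomputed from R^2      = {f_from_r2:.2f}")
```

The first value matches the reported 5.75657, and the second lands close to the reported F of 36.823.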

Coefficients (a)

| Model | | B (Unstandardized) | Std. Error (Unstandardized) | Beta (Standardized) | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | -91.759 | 15.223 | | -6.028 | .000 |
| | patient age (years) | 1.077 | .166 | .697 | 6.488 | .000 |
| | blood pressure | .252 | .045 | .553 | 5.568 | .000 |
| | smoker patient | 8.740 | 3.001 | .302 | 2.912 | .010 |

a. Dependent Variable: Risk of stroke (%)

Casewise Diagnostics (a) (columns: Case Number, Std. Residual, Risk of stroke (%), Predicted Value)

a. Dependent Variable: Risk of stroke (%)

Residuals Statistics (a)

| | Minimum | Maximum | Mean | Std. Deviation | N |
|---|---|---|---|---|---|
| Predicted Value | 4.4606 | 54.1511 | 26.9500 | 13.88058 | 20 |
| Std. Predicted Value | -1.620 | 1.960 | .000 | 1.000 | 20 |
| Standard Error of Predicted Value | 1.903 | 3.532 | 2.538 | .445 | 20 |
| Adjusted Predicted Value | 4.8474 | 54.2600 | 26.8973 | 13.98313 | 20 |
| Residual | -13.10645 | 8.55608 | .00000 | 5.28260 | 20 |
| Std. Residual | -2.277 | 1.486 | .000 | .918 | 20 |
| Stud. Residual | -2.418 | 1.678 | .004 | 1.016 | 20 |
| Deleted Residual | -14.78714 | 10.90265 | .05268 | 6.48651 | 20 |
| Stud. Deleted Residual | -2.940 | 1.790 | -.025 | 1.107 | 20 |
| Mahal. Distance | 1.127 | 6.203 | 2.850 | 1.340 | 20 |
| Cook's Distance | .000 | .193 | .057 | .070 | 20 |
| Centered Leverage Value | .059 | .326 | .150 | .071 | 20 |

a. Dependent Variable: Risk of stroke (%)

Curve Fit

Case Processing Summary

| | N |
|---|---|
| Total Cases | 20 |
| Excluded Cases (a) | 0 |
| Forecasted Cases | 0 |
| Newly Created Cases | 0 |

a. Cases with a missing value in any variable are excluded from the analysis.

Variable Processing Summary

| | | Dependent: patient age (years) | Dependent: blood pressure | Dependent: smoker patient | Independent: Risk of stroke (%) |
|---|---|---|---|---|---|
| Number of Positive Values | | 20 | 20 | 10 | 20 |
| Number of Zeros | | 0 | 0 | 10 | 0 |
| Number of Negative Values | | 0 | 0 | 0 | 0 |
| Number of Missing Values | User-Missing | 0 | 0 | 0 | 0 |
| | System-Missing | 0 | 0 | 0 | 0 |

Model Description

| | | |
|---|---|---|
| Model Name | | MOD_1 |
| Dependent Variable | 1 | patient age (years) |
| | 2 | blood pressure |
| | 3 | smoker patient |
| Equation | 1 | Linear |
| Independent Variable | | Risk of stroke (%) |
| Constant | | Included |
| Variable Whose Values Label Observations in Plots | | Unspecified |

Model Summary and Parameter Estimates

Dependent Variable: patient age (years)

| Equation | R Square | F | df1 | df2 | Sig. | Constant | b1 |
|---|---|---|---|---|---|---|---|
| Linear | .423 | 13.186 | 1 | 18 | .002 | 58.104 | .421 |

The independent variable is Risk of stroke (%).

C. Is smoking a significant factor in the risk of a stroke? Explain. Use α = 0.05.

The regression analysis above allows the significance of smoking to be tested directly. In the Coefficients table, the smoker variable has an unstandardized coefficient of 8.740 with a standard error of 3.001, giving t = 2.912 and a significance level (p-value) of .010.

Because .010 is less than α = 0.05, the null hypothesis that the smoking coefficient equals zero is rejected. Age and blood pressure are also significant predictors, but with those two variables held constant, being a smoker raises the predicted risk of a stroke by about 8.7 points for the individuals in the data set. Smoking is therefore a significant factor in the risk of a stroke.
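The significance check for each predictor can be sketched from the Coefficients output: each t statistic is the unstandardized coefficient divided by its standard error, compared against a two-tailed critical value. The critical value 2.120 (α = 0.05, 16 residual degrees of freedom) is taken from a standard t table and is an assumption of this sketch, not a figure from the SPSS output.

```python
# Unstandardized B and Std. Error from the SPSS Coefficients output.
coefficients = {
    "age": (1.077, 0.166),
    "blood pressure": (0.252, 0.045),
    "smoker": (8.740, 3.001),
}

ALPHA = 0.05
# Assumption: two-tailed critical t for alpha = 0.05 and 16 residual df,
# read from a standard t table rather than from the SPSS output.
T_CRITICAL = 2.120


def t_statistic(b: float, std_error: float) -> float:
    """t = coefficient / standard error."""
    return b / std_error


results = {name: t_statistic(b, se) for name, (b, se) in coefficients.items()}

for name, t in results.items():
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{name}: t = {t:.3f} ({verdict} at alpha = {ALPHA})")
```

The recomputed t for the smoker variable is about 2.912, matching the SPSS output. The blood pressure ratio differs slightly from the reported 5.568 because the table rounds B and its standard error to three decimals.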

D. What is the probability of a stroke over the next ten years for Thompson, a 68-year-old smoker who has a blood pressure of 175?

Coefficients from the SPSS regression analysis (unstandardized B):

constant: -91.759

age: 1.077

blood pressure: .252

smoking: 8.740

The equation formulated earlier for the numeric risk value is Y = a + b1*X1 + b2*X2 + b3*X3, where Y is the dependent variable (the numeric risk value), X1 is age, X2 is blood pressure, and X3 is the smoker dummy. A prediction for an individual outside the original data set must use the unstandardized coefficients (column B). The standardized Beta values (.697, .553, .302) describe the variables in standard-deviation units and cannot be combined with raw values of age and blood pressure. With the constant and coefficients included, the equation is Y = -91.759 + 1.077x1 + 0.252x2 + 8.740x3. For a 68-year-old man who smokes and has a blood pressure of 175:

Y = -91.759 + 1.077 (age) + 0.252 (blood pressure) + 8.740 (smoking)

Y = -91.759 + 1.077(68) + 0.252(175) + 8.740(1)

Y = -91.759 + 73.236 + 44.100 + 8.740

Y = 34.317

The risk level is therefore 34.317, or about 34, which corresponds to an estimated probability of roughly .34 of having a stroke within the next ten years. This is above the sample's mean predicted risk of 26.95. Age, elevated blood pressure, and smoking all contribute to the estimate, with smoking alone adding 8.74 points.
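As a cross-check, the prediction can be evaluated in code with the unstandardized B column of the Coefficients output, which is the column expressed in the original units of age, blood pressure, and the smoker dummy. This is a minimal sketch; the function name `predicted_risk` and the constant names are hypothetical.

```python
# Unstandardized coefficients (column B) from the SPSS Coefficients output.
INTERCEPT = -91.759
B_AGE = 1.077
B_BLOOD_PRESSURE = 0.252
B_SMOKER = 8.740


def predicted_risk(age: float, blood_pressure: float, smoker: bool) -> float:
    """Return the predicted ten-year stroke risk (probability times 100)."""
    smoker_dummy = 1 if smoker else 0
    return (
        INTERCEPT
        + B_AGE * age
        + B_BLOOD_PRESSURE * blood_pressure
        + B_SMOKER * smoker_dummy
    )


risk = predicted_risk(age=68, blood_pressure=175, smoker=True)
print(f"predicted risk = {risk:.1f} (probability about {risk / 100:.2f})")
```

With these inputs the sketch gives a risk of about 34.3. Setting `smoker=False` for the same age and blood pressure lowers the estimate by exactly the smoker coefficient, 8.740.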

Question 2

A. Fuel Additives and Mileage

Data Table

| Sample A | Sample B |
|---|---|
| 17.3 | 18.7 |
| 18.4 | 17.8 |
| 19.1 | 21.3 |
| 16.7 | 21 |
| 18.2 | 22.1 |
| 18.5 | 18.7 |
| 17.5 | 19.8 |
| | 20.7 |
| | 20.2 |

Data Rank

| Rank A | Rank B |
|---|---|
| 2 | 8.5 |
| 6 | 4 |
| 10 | 15 |
| 1 | 14 |
| 5 | 16 |
| 7 | 8.5 |
| 3 | 11 |
| | 13 |
| | 12 |
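The ranks can be recomputed from the Data Table by pooling both samples, sorting, and giving tied values the average of the positions they occupy (midranks). The sketch below uses only the standard library; `midranks` is a hypothetical helper name.

```python
sample_a = [17.3, 18.4, 19.1, 16.7, 18.2, 18.5, 17.5]
sample_b = [18.7, 17.8, 21.3, 21.0, 22.1, 18.7, 19.8, 20.7, 20.2]


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


combined_ranks = midranks(sample_a + sample_b)
ranks_a = combined_ranks[: len(sample_a)]
ranks_b = combined_ranks[len(sample_a):]

print("Rank A:", ranks_a, "sum =", sum(ranks_a))
print("Rank B:", ranks_b, "sum =", sum(ranks_b))
```

The two 18.7 readings in Sample B tie for positions 8 and 9 and so each receive rank 8.5. The rank sums come to 34 for Sample A and 102 for Sample B, consistent with the reported rank means of 4.9 and 11.3.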

Data Set

| | Sample A | Sample B |
|---|---|---|
| Sum | | |
| Mean | 17.9571429 | 5.14285714 |
| Variance | 0.6795238 | 2.2641071 |
| Rank Sum | | |
| Rank Mean | 4.9 | 11.3 |
| Combined Sum | | |
| Combined Median Rank | 8.1 | |

Two fuel additives were tested to compare their effect on the mileage of the cars. One sample included seven cars and the other nine. The miles per gallon recorded for each car can be used to determine whether there is a significant difference between the two additives. The two sets of data represent comparable observations with measurable central tendencies: both record the mileage of vehicles used to sample the fuel additives in question. Additionally, the two samples are independent of each other, and the observations within each sample are also independent. The Mann-Whitney test is therefore a viable option for comparing the two independent samples. It ranks the combined observations and examines how one sample fares in comparison to the other, on the assumption that the variances of the two groups are equal.

Thus, the following equation can be used to compute the Mann-Whitney statistic:

$$U_A = n_a n_b + \frac{n_a (n_a + 1)}{2} - T_A$$

where

$n_a = 7$ and $n_b = 9$ (the sample sizes, also used to look up the critical values for U)

$T_A$ = the sum of the ranks of Sample A

$n_a n_b + \frac{n_a (n_a + 1)}{2}$ = the maximum value of $T_A$

With these values, U, P(1), and P(2) were computed. They can then be analyzed to show whether there is a significant difference between the two additives in how they affect the mileage of the vehicles in which they are used.
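The U computation can be sketched directly from the formula. The rank sums used here (34 for Sample A, 102 for Sample B) are not printed in the output above; they are an assumption obtained by midranking the sixteen mileage values in the Data Table, and they agree with the reported rank means of 4.9 and 11.3. The function name `mann_whitney_u` is hypothetical.

```python
n_a, n_b = 7, 9

# Assumption: rank sums from midranking the pooled Data Table values.
t_a = 34.0    # Sample A ranks: 2 + 6 + 10 + 1 + 5 + 7 + 3
t_b = 102.0   # Sample B ranks: 8.5 + 4 + 15 + 14 + 16 + 8.5 + 11 + 13 + 12


def mann_whitney_u(n_self: int, n_other: int, rank_sum: float) -> float:
    """U = n_self * n_other + n_self * (n_self + 1) / 2 - rank_sum."""
    return n_self * n_other + n_self * (n_self + 1) / 2 - rank_sum


u_a = mann_whitney_u(n_a, n_b, t_a)
u_b = mann_whitney_u(n_b, n_a, t_b)

# The two U values always add up to the product of the sample sizes.
assert u_a + u_b == n_a * n_b

print(f"U_A = {u_a}, U_B = {u_b}, smaller U = {min(u_a, u_b)}")
```

Under these inputs U_A is 57 and U_B is 6. The smaller of the two is the value normally compared against a table of critical values for the given sample sizes.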

Ranks (columns: fuel additives (per m), N, Mean Rank, Sum of Ranks) …



### How to Cite "SPSS Data Analysis American Heart Association Prediction" Term Paper in a Bibliography:

APA Style

SPSS Data Analysis American Heart Association Prediction. (2010, February 4). Retrieved January 22, 2021, from https://www.essaytown.com/subjects/paper/spss-data-analysis-american-heart-association/81604

MLA Format

"SPSS Data Analysis American Heart Association Prediction." 4 February 2010. Web. 22 January 2021. <https://www.essaytown.com/subjects/paper/spss-data-analysis-american-heart-association/81604>.

Chicago Style

"SPSS Data Analysis American Heart Association Prediction." Essaytown.com. February 4, 2010. Accessed January 22, 2021. https://www.essaytown.com/subjects/paper/spss-data-analysis-american-heart-association/81604.