I have attached all the files. Look at the file called "final required"; it has all the requirements and explains the project.
REQUIRED FINAL DATA ANALYSIS PROJECT Assignment
Spring 2021, Statistics 285: Section 06
GRAND TOTAL: 150 points

FINAL Data Analysis Project/Methods Paper
Due Date: Friday, May 7, 2021

Requirements: Compile a data set of 20 points (observations, cases, study subjects, etc.) or more with two (2) quantitative variables (2 variables are needed for probability calculations and regression analysis).

Goal: Apply methods learned from Chapters 2 through 8 to a data set gathered or provided by a reliable source (examples: https://www.kaggle.com/datasets ; www.census.gov; www.cdc.gov; etc. Please, not Wikipedia! Thank you!) and write an explanation of the results. (Length: 7-10 pages narrative; Tables-Charts-Figures separate)

Final Paper Due: Friday, May 7, 2021
*****SUBMIT TO ASSIGNMENTS on CANVAS

INCLUDE SEPARATE COVER PAGE TO INCLUDE:
PROJECT TITLE
Submitted to: Lynn A. Agre, MPH, PhD
Statistics 285: Section 06
May 7, 2021
by: Your Name
Rutgers Student ID No.

***REMEMBER to INSERT PAGE NUMBERS IN FINAL PAPER***

I. Abstract: n = 150 words, brief description of proposed study, 5 sentences. (Total 15 points)
Example lead sentences for the Abstract or Summary:
This study will investigate the relationship between x and y variables.
This research will explore how x affects y.
This analysis will explore the association between x and y.

II. Introduction/Background: Brief (at least 2-3 pages) (25 points)
a. State the problem based on preponderance of evidence (at least 3 literature citations within the last 5 years). Literature Review: summarized in two or three paragraphs (reliable sources located on https://www.libraries.rutgers.edu/indexes/google_scholar ; use RU Net ID and log in to access articles for free, i.e. Get it @ R).
b. Who or what is the unit of analysis, i.e. what is the topic being studied?
c. Where are the data collected from, including the source website?
d. When were the data collected, i.e. what time frame (year, months)?
e.
Hypotheses: quantify by setting a threshold value, i.e. >, <, =
Note: Formulate at least 3 hypotheses, with Ho (null) and Ha (alternative) stated.

III. Methods, i.e. Data Analysis Portion (100 points)
Describe the data, i.e. who or what is the unit of analysis and what is being compared (such as US states and then the two variables, such as percent poverty and percent unemployed by US state). Provide the data table. (4 points)
➢ Please note: You need to include all formulas used. Formulas can be typed or cut and pasted from the course slides or a web source, if correct. Step-by-step calculations are not required, but helpful.
➢ Please do not insert pictures of hand-written formulas and calculations in the Word document.
➢ No credit will be given if not legible. Thank you.

From Chapter 2: (10 points) Methods for Describing Data
Calculate all for both variables:
• Mean
• Standard deviation
• Range
• Inter-quartile range, lower and upper limit values
• Create box plot (can use online boxplot or draw by hand in Excel)
• Create histogram

From Chapter 3: (20 points) Probability
Analyze the data set with at least 4 types of probability methods.
• Complementary rule
• Additive rule, for mutually exclusive events
• Conditional rule
• Multiplicative rule
• Bayes rule

From Chapter 4: (choose 4: 20 points) Random Variables and Probability Distribution
• Discrete random variables, mean and standard deviation
• Binomial distribution
• Poisson distribution
• Hypergeometric
• Percentiles, z-scores

From Chapter 5: (10 points) Sampling Distribution
• Calculate the sampling distribution based on the mean
• Compute the sampling distribution of the sample proportion

From Chapter 6: (10 points) Confidence Intervals for Single Sample
• Calculate the population mean, the confidence interval, either sigma (σ)
known or unknown
• Apply either the normal z statistic or t-interval procedure (based on sample size and distribution of data)
• Calculate the necessary sample size

From Chapter 7: (10 points) Hypothesis Testing for Single Sample
• Formulate hypothesis test for μ and illustrate rejection region
• Apply hypothesis test using either the normal z statistic or Student's t-statistic

From Chapter 8: (20 points) Confidence Intervals for Two Samples
• Compute the standard deviation and confidence interval for two independent samples (will need a second data set of n = 20)
• Calculate either a one-tailed or two-tailed hypothesis test
• Calculate the paired difference confidence interval for μd = μ1 - μ2
• Use the large sample test of hypothesis about p1 - p2, normal z statistic

Extra Credit: From Chapter 11: Simple Linear Regression (choose 2 methods: 10 points)
• Fit the model: the least squares approach, y = mx + b or y = b0 + b1x1
• Compute the coefficient of determination, or strength of linear relation, with r²

IV. Results (2-3 pages) (25 points)
a. Describe findings from analyses.
b. Are the two variables related?
c. What are the significant results, and what are the non-significant results?

V. Discussion or Interpretation of the Results (2-3 pages) (25 points)
a. Why is the research question important?
b. How do these statistical methods answer your research question? How do these methods illustrate or explain the objective of your paper?
c. What are the limitations of the study? What other variables can be included?
d. What is the broader impact of the study? Who can benefit from this research?
e. What future studies can be generated from the results of your research?
Research investigators consult the existing body of knowledge or conduct a literature search first when developing and testing hypotheses, to extend, replicate and build on existing empirical research (i.e. based on hypothesis testing), either quantitative or qualitative.

VI.
References - Bibliography (6-10 references, at least 50% from last five years) (20 points)
Citation format: Author last name, first initial (year). Article title. Journal or newspaper, vol. no., page numbers.
Please do not copy and paste the URL (i.e. the web address for the citation). A citation in that format will not receive credit or any points. Example below:
Pickett, K.E., Wilkinson, R.G. (2015). Income inequality and health: A causal review. Social Science and Medicine, 128, 316-326.

Types of articles for possible citation are:
a. Citations on the topic being studied;
b. Scholarly articles discussing application of the statistical method chosen to test your hypothesis.

Make sure to address the following questions in your report, i.e. who, what, where, when, why, and how.
(1) Who or what is the unit of analysis?
(2) Where are the data collected from, including the source website?
(3) When were the data collected, i.e. what time frame (year, months)?
(4) Why is the research question important?
(5) How do these statistical methods answer your research questions? How do they illustrate or explain the objective of your paper?

Data Analysis Project - Common Questions

How to Create a Random Sample
Step No. 1 is to identify a data set: raw data that have not been analyzed. This is the population.
• Column No. 1 contains the unit of analysis, such as US states, countries, teams or players, or perhaps individuals or persons.
• Column No. 2 contains quantitative variable No. 1; Column No. 3 contains quantitative variable No. 2.
The document entitled "Sample Structure of a Data Set", the PDF file listed under the Required Final Data Analysis Project folder under the Canvas Resources folder, displays the format.
Kaggle.com is one source for data; this would be considered the population.
Step No. 2 is to create a random sample from the data set selected. From the population of n = ?
whatever the maximum number is in the Kaggle data set, you can build a data set from the full list of US states or countries using a random number generator available at https://www.random.org
➢ Insert the minimum number (i.e. 1) and then the maximum number of cases, or in this case countries, in the data set (170, for example).
➢ The random number generator will provide a number between 1 and 170, perhaps 67.
➢ Thus, the first random selection from the countries that are on the list is No. 67.
➢ Then, enter country No. 67 in the spreadsheet with the two quantitative variables: continuous variables which are suitable for the methods studied in the course.

Literature Review - Finding Articles
❖ The articles are published papers that are related to the variables you are analyzing in your data set.
❖ Simply summarize the scientific results in those three papers, one paragraph for each paper.
❖ Preponderance of evidence in science refers to the published research on a similar or related topic to the data set that is being analyzed.
❖ You can access relevant literature, i.e. scholar.google.com accessible via libraries.rutgers.edu, and search for articles (not personal web pages or blogs) that pertain to the data topic you are analyzing. See link below.
https://www.libraries.rutgers.edu/indexes/google_scholar

Creating Hypotheses
✓ The hypotheses then pertain to the data set that you are analyzing.
✓ Each method and each formula studied in the course uses the null (Ho) and alternative (Ha) hypotheses, formulated to test the variable against a null value (such as a population mean).

Probability Calculations
• Regarding probability calculations, first create a relative frequency table.
• Probability is based on f over n or x over n, referring to total n = 20.
• To determine the numerator, set a threshold, i.e. count the number of states that have at least a certain threshold of depression rate out of 20 states total.
• Thus, if 5
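
The random-sampling steps and the f over n probability calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the assignment: the population size of 170 is taken from the example in the text, while the function names, the seed, the depression-rate values, and the threshold of 16 are all hypothetical. It also assumes a case cannot be drawn twice (sampling without replacement), which the assignment does not specify.

```python
import random


def draw_sample_ids(population_size, sample_size=20, seed=None):
    """Pick `sample_size` distinct case numbers between 1 and `population_size`.

    Plays the role of repeatedly asking a random number generator for a
    number between 1 and the maximum number of cases in the data set.
    Assumption: no case is selected twice.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


def prob_at_least(values, threshold):
    """Relative frequency f / n of observations at or above `threshold`."""
    f = sum(1 for v in values if v >= threshold)  # numerator: count meeting the threshold
    return f / len(values)                        # denominator: total n


def complement(p):
    """Complementary rule: P(not A) = 1 - P(A)."""
    return 1 - p


# Hypothetical usage: 20 case numbers drawn from a population list of 170.
ids = draw_sample_ids(170, sample_size=20, seed=1)
print("Selected case numbers:", ids)

# Hypothetical depression rates (percent) for the 20 sampled states.
rates = [12.1, 18.4, 15.0, 16.2, 14.7, 19.9, 13.3, 17.5, 11.8, 16.0,
         15.9, 20.3, 14.1, 13.8, 18.0, 12.6, 17.1, 15.4, 16.8, 14.9]
p = prob_at_least(rates, threshold=16)
print("P(rate >= 16):", p)
print("P(rate < 16): ", complement(p))
```

Fixing the seed makes the selection reproducible, so the same 20 cases can be reported in the data table; leave `seed=None` for a fresh draw.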