Lab assignment
BI 108 LAB
Name: ___________________________________________________________ Date: _________________
BIOL 151 Lab | Natural Selection Pre-Lab

Once data are gathered, the first thing scientists do is plot them. A plot gives a visual description of two important aspects of most data sets: 1) the central tendency of the data (e.g., the mean, median, or mode) and 2) the spread of the data around the center of the distribution (e.g., the standard deviation and variance). There are many ways to graph data, but in this activity we are going to learn about histograms. Histograms are graphs that sort continuous data, like height and weight, into groups, or bins, that are shown as bars on the graph. The height of each bar indicates how frequently observations in that group occur in the data set.

See the graph below for an example of a histogram of the weights of newborn puppies. The graph has several components that ALL graphs need. The x-axis and y-axis are labeled, and the units of measurement are given. In this case, the unit of measurement on the y-axis is the number of puppies whose weight (or mass) fell within the size class of each bar on the x-axis. The unit of measurement on the x-axis is grams, the mass of the puppies in each size class.

Note that each bar, which shows the frequency of one group, is bounded by a "(" on one side and a "]" on the other. For example, the values beneath the second bar are given as (136.1, 197.1]. This means that all puppies that weigh more than 136.1 grams and up to and including 197.1 grams are counted in that bar. The "(" means that the number next to it, 136.1, is not included. Instead, 136.1 is included in the range that ends with "]" at that number. So if a data value was 136.1, it would be counted in the [75.1, 136.1] bar. The rule is the same whether a range is written as (X, Y] or [X, Y): a bracket always means the number beside it is included, and a parenthesis always means it is not.
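The bracket rule above can also be written out as a short Python sketch. This is only an illustration and is not part of the Excel workflow used in this lab. The bin edges are the ones quoted for the first two bars, the masses are made-up values rather than the lab dataset, and `bin_label` is a hypothetical helper name.

```python
def bin_label(value, edges):
    """Return the label of the bin that contains value, or None if no bin does.

    Bins are closed on the right: (low, high]. The first bin is closed on
    both sides, [low, high], so the smallest edge value is still counted.
    """
    for i in range(len(edges) - 1):
        low, high = edges[i], edges[i + 1]
        if i == 0 and low <= value <= high:
            return f"[{low}, {high}]"
        if low < value <= high:
            return f"({low}, {high}]"
    return None


# Bin edges quoted in the text for the first two bars of the histogram.
edges = [75.1, 136.1, 197.1]

# Hypothetical puppy masses in grams (assumed values, NOT the lab dataset).
masses = [75.1, 120.0, 136.1, 136.2, 197.1]

# Count how many masses fall in each bin.
counts = {}
for mass in masses:
    label = bin_label(mass, edges)
    counts[label] = counts.get(label, 0) + 1

for label, n in counts.items():
    print(label, n)
```

In this sketch a mass of exactly 136.1 is counted in the [75.1, 136.1] bin, while 136.2 falls in (136.1, 197.1], which matches the rule described in the text.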
The graph alone shows you many things about the sampling regime and the size distribution of newborn puppies. The first thing the graph shows you is the total sample size (n) of the experiment. Based on the height of the first bar, you can see that 9 puppies had a mass that fell in the range of 75.1-136.1 grams.

1. Using the approach described above, how many total puppies were used to generate this frequency histogram?

Another thing the graph shows you is the approximate range of sizes of newborn puppies. The range is the difference between the highest and lowest values in the data set. Although you don't actually see the individual data points on the graph, you can safely assume that the largest puppy is no more than 380.1 grams and the smallest puppy no less than 75.1 grams.

2. Using only the graph, what is the approximate range of newborn puppy masses?

3. The graph also shows roughly what the average mass of the puppies was in this sample. Comparing the frequencies of each bar on the graph, which mass range contains the most puppies?

4. Open the Excel spreadsheet that contains the dataset of puppy masses that was used to create the histogram. Let's figure out the actual range of this dataset. In an empty cell to the right of the dataset, write "Range". Using the skills you learned earlier in the Excel tutorial, calculate the range in the cell to the right of the "Range" label you wrote. In the space below, outline the steps you took and write any formulas you used to solve for the range. Also include the value for the range.

Of course, graphs are just visual depictions of the data. The next step in any analysis is to calculate basic statistics on your data set. Two of the more common statistics calculated for any data set are the mean (or average) and the standard deviation. The average measures the central tendency of the data. Below the word "Range", write the word "Average" or "Mean", whichever you prefer.
In the cell to the right of it, begin typing the following equation: =AVERAGE( and then highlight the data you want averaged. Finish by typing the closing parenthesis and pressing Enter.

5. The average mass of puppies in this data set is (round to 2 decimal places, include units) ____________________ .

We can also get an estimate of the spread of the data around the mean, called the standard deviation. The shorthand for standard deviation is σ, but we will commonly see it in this course as s.d. The s.d. is specific to a dataset, so the number by itself doesn't tell you much. With two similar datasets, however, it can be compared to identify which dataset has more or less spread. A high standard deviation means the values are spread far from the mean; a small standard deviation means the values are close to the mean. The s.d. also gives an idea of how much of your data falls within a certain range of the mean. For data that follow a normal (bell-shaped) distribution, these percentages are constant:
· 68% of the data can be found within ±1 standard deviation of the mean
· 95% of the data can be found within ±2 standard deviations of the mean
· 99.7% (roughly 99%) of the data can be found within ±3 standard deviations of the mean

Below the word "Average/Mean", write "standard deviation". Then, in the cell below it, begin to write the following formula: =STDEV( As you did when you calculated the mean, click and drag over the values that you want to take the standard deviation of. Close the equation with a ")" and press Enter.

6. The standard deviation is (round to 2 decimal places, include units) ______________________ .

7. Let's interpret what this standard deviation means in terms of our puppy mass data set. Use the mean you calculated in #5 and the standard deviation from #6 to answer the following questions.
a) Within the puppy masses, 68% of the data is between which two values? Show your work.
b) Within the puppy masses, 95% of the data is between which two values? Show your work.
c) Within the puppy masses, 99% of the data is between which two values? Show your work.

Name: _________________________________________ Date: ___________________
Natural Selection | Principles of Biology II

OBJECTIVES
By the end of this lab, students will be able to:
· Use Excel to sort data, find means, and calculate standard deviation
· Draw a histogram by hand and in Microsoft Excel, and understand its use
· Describe mechanisms that affect the distribution of a trait in a population
· Explain how directional, stabilizing, and disruptive selection influence the mean and standard deviation of phenotypic traits in natural populations

INTRODUCTION
Natural selection is one of the major forces of evolution acting on populations. The major requirements for selection to occur are that 1) traits (or phenotypes) vary among individuals in populations, 2) some of this variation is due to genetic differences among individuals, and 3) variation in traits affects some aspect of fitness. If scientists know the distribution of a trait over time, they can identify changes in the population that result from a change in the environment. Selective pressures can act on the variation in a population to produce stabilizing, disruptive, or directional selection.

One way to diagram variance is with a histogram, a type of bar graph that shows the abundance of each category. In the example to the right, the number of people of each height is graphed. In the same way, we can look at the abundance of each phenotype of a trait. In today's lab, we will look at the mean and variance of shark tooth length in a population of sharks. The length of a tooth can be related to many factors, such as the size of the shark, the typical size of its prey, the difficulty of catching prey, or sexual selection.
We will then examine how natural selection can change the mean and variance of shark tooth length in a population and use these changes to interpret what type of selection may have occurred.

METHODS – EXERCISE 1
1. Open an internet browser window (preferably Firefox; do not use Internet Explorer or Microsoft Edge). Navigate to our Blackboard Page > Labs > Natural Selection. Download the Microsoft Excel file entitled "Shark Tooth Data".

2. To get an idea of how our data look, we will create a histogram. At first glance, a histogram looks just like a bar chart (and it is one), but the bars touch each other. This makes it more obvious that the graph characterizes one dataset, rather than each bar representing a different dataset. A histogram can show the variation in a trait, which in this case is tooth length. The first thing we need to do is identify the highest and lowest values of the dataset. Highlight the data values in Microsoft Excel and then select the sort button (shown to the left) to arrange your values from smallest to largest. If the lowest number is a decimal (e.g. 22.6), round down to the next whole number (e.g. 22). If the highest number is a decimal (e.g. 33.2), round up to the next whole number (e.g. 34). We will use these to give us a range of values for the X-axis on our graph. Include units.
· Lowest number, rounded: _________________
· Highest number, rounded: ________________

3. Using the values in #2, calculate the range of the data (the highest value minus the lowest value). Include units.
· Range: _______________________

4. There is no general rule for how many bins, or bars, to create; in this case, we will use 8 bins. Divide the range by the number of bins to get the bin size. Round this number up to the next highest 0.1 (e.g. 2.34 rounds up to 2.4). Include the units.
· Bin size: _____________________

Bin Values [min #, max #)     Frequency (# of Data Points in Bin)

5. Now that we have the highest and lowest values for the graph as well as the bin size, we can figure out how many data values can be found in each bin to create our histogram. The first column “Bin Values” are the values for the bars