Hardy-Weinberg Going Hard

Devansh Taori (per. 2)
Biology AP

Introduction

Evolution occurs in populations of organisms and involves variation in the population, heredity, and differential survival. One way to study evolution is to study how the frequency of alleles in a population changes from generation to generation. An important question is how alleles are inherited, not just from two parental organisms, but across a whole population. Exploring how allele frequencies change in populations can help us predict what will happen to a population in the future.
 
Mathematical models and computer simulations are tools used to explore the complexity of biological systems that might otherwise be difficult or impossible to study. Several models can be applied to questions about evolution. In this experiment, we built a spreadsheet that models how a hypothetical gene pool changes from one generation to the next. The model explores parameters that affect allele frequencies, such as selection, mutation, and migration.

Important to this experiment is the Hardy-Weinberg equilibrium, which is a principle stating that the genetic variation in a population will remain constant from one generation to the next in the absence of disturbing factors. When mating is random in a large population with no disruptive circumstances, the law predicts that both genotype and allele frequencies will remain constant because they are in equilibrium.


Hypothesis
I hypothesize that the larger the population of a certain species is, the closer that population will be to the Hardy-Weinberg equilibrium because minor changes and variations in the frequency of an allele will have less of an impact on the greater population.


Procedure

For this model, assume that all the organisms in our hypothetical population are diploid. This organism has a gene locus with two alleles — A and B. To begin this model, let’s define a couple of variables. Let p = the frequency of the A allele and let q = the frequency of the B allele.

Bring up the spreadsheet on your computer. The examples here are based on Microsoft Excel, but almost any modern spreadsheet can work, including Google’s online Google Docs (https://docs.google.com) and Zoho’s online spreadsheet (http://www.zoho.com).


Somewhere in the upper left corner (in this case, cell D2), enter a value for the frequency of the A allele. This value should be between 0 and 1. Type in labels in your other cells. Because this is a spreadsheet, you could simply type in the value for the frequency of the B allele (for example, 0.5); however, when making a model it is best to have the spreadsheet do as many of the calculations as possible. All of the alleles in the gene pool are either A or B; therefore p + q = 1 and q = 1 - p. In cell D3, enter the formula to calculate the value of q.


In spreadsheet lingo it is =1-D2.


Let’s explore how one important spreadsheet function works before we incorporate it into our model. In a nearby empty cell, enter the function (we will remove it later).


=RAND()


Note that the parentheses have nothing between them. If you are on a PC, try hitting the F9 key several times to force recalculation. On a Mac, press cmd + or cmd =.


The RAND function returns random numbers between 0 and 1 in decimal format. This is a powerful feature of spreadsheets. It allows us to enter a sense of randomness to our calculations if it is appropriate — and here it is when we are “randomly” choosing gametes from a gene pool. Go ahead and delete the RAND function in the cell.


Let’s select two gametes from the gene pool. In cell E5, let’s generate a random number, compare it to the value of p, and then place either an A gamete or a B gamete in the cell. We’ll need two functions to do this, the RAND function and the IF function. Check the help menu if necessary.


Note that the function entered in cell E5 is =IF(RAND()<=D$2,"A","B")


Be sure to include the $ in front of the 2 in the cell address D2. It will save time later when you build onto this spreadsheet.


The formula in this cell basically says that if a random number between 0 and 1 is less than or equal to the value of p, then put an A gamete in this cell, or if it is not less than or equal to the value of p, put a B gamete in this cell. IF functions and RAND functions are very powerful tools when you try to build models for biology.
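For readers who would rather see the same logic outside a spreadsheet, here is a minimal Python sketch of the gamete formula. The function name pick_gamete and the seed value are my own assumptions for illustration; only the comparison against p comes from the spreadsheet.

```python
import random

def pick_gamete(p, rng=random):
    """Return "A" with probability p, otherwise "B".

    Mirrors the spreadsheet formula =IF(RAND()<=D$2,"A","B").
    """
    return "A" if rng.random() <= p else "B"

# Hypothetical check: draw 1000 gametes with p = 0.8 and see how many are A.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample = [pick_gamete(0.8, rng) for _ in range(1000)]
fraction_a = sample.count("A") / len(sample)
print(fraction_a)  # should land close to 0.8
```

The fraction of A gametes should hover near whatever p value is passed in, which is the same check the recalculate key gives you in the spreadsheet.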


Now create the same formula in cell F5, making sure that it is formatted exactly like E5. When you have this completed, press the recalculate key to force a recalculation of your spreadsheet. If you have entered the functions correctly in the two cells, you should see changing values in the two cells. (This is part of the testing and retesting that you have to do while model building.)


Try recalculating 10–20 times. Try changing your p value to 0.8 or 0.9. Does the spreadsheet still work as expected? Try lower p values. If you don’t get approximately the expected numbers, check and recheck your formulas now, while it is early in the process.


Copy these two formulas in E5 and F5 down for about 16 rows to represent gametes that will form 16 offspring for the next generation. (To copy the formulas, click on the bottom right-hand corner of the cell and, with your finger pressed down on the mouse, drag the cell downward.)


We’ll put the zygotes in cell G5. The zygote is a combination of the two randomly selected gametes. In spreadsheet vernacular, you want to concatenate the values in the two cells. In cell G5 enter the function =CONCATENATE(E5,F5), and then copy this formula down as far as you have gametes.


The next columns on the sheet, H, I, and J, are used for bookkeeping — that is, keeping track of the numbers of each zygote’s genotype. They are rather complex functions that use IF functions to help us count the different genotypes of the zygotes.


The function in cell H5 is =IF(G5="AA",1,0), which basically means that if the value in cell G5 is AA, then put a 1 in this cell; if not, then put a 0. The BB count in cell J5 follows the same pattern: =IF(G5="BB",1,0).


Now let’s tackle the nested IF function. This is needed to test for either AB or BA. In cell I5, enter the nested function: =IF(G5="AB",1,(IF(G5="BA",1,0))).


This example requires an extra set of parentheses, which is necessary to nest functions. This function basically says that if the value in cell G5 is exactly equal to AB, then put a 1; if not, then if the value in cell G5 is exactly BA, put a 1; if it is neither, then put a 0 in this cell. Copy these three formulas down for all the rows in which you have produced gametes.
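The zygote and bookkeeping columns can be sketched in Python as well. This is only an illustration under my own assumptions: the names make_zygote and count_genotypes are hypothetical, and the if/elif chain stands in for the IF and nested IF formulas in columns H, I, and J.

```python
import random

def make_zygote(p, rng):
    """Pick two gametes and join them, like =CONCATENATE(E5,F5)."""
    first = "A" if rng.random() <= p else "B"
    second = "A" if rng.random() <= p else "B"
    return first + second

def count_genotypes(zygotes):
    """Tally AA, AB (either AB or BA), and BB zygotes."""
    counts = {"AA": 0, "AB": 0, "BB": 0}
    for zygote in zygotes:
        if zygote == "AA":
            counts["AA"] += 1
        elif zygote in ("AB", "BA"):  # same job as the nested IF
            counts["AB"] += 1
        else:
            counts["BB"] += 1
    return counts

rng = random.Random(1)  # fixed seed, chosen arbitrarily
zygotes = [make_zygote(0.5, rng) for _ in range(16)]
counts = count_genotypes(zygotes)
print(zygotes)
print(counts)
```

The three counts should always add up to the number of zygotes, which is a quick way to catch a bookkeeping mistake.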


Enter the labels for the columns you’ve been working on: gametes in cell E4, zygote in cell G4, AA in cell H4, AB in cell I4, and BB in cell J4.


As before, try recalculating a number of times to make sure everything is working as expected. You could use a p value of 0.5, and then you’d see numbers similar to the ratios you would get from flipping two coins at once. Don’t go on until you are sure the spreadsheet is making correct calculations. Try out different values for p. Make sure that the number of zygotes adds up.
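To know what "working as expected" looks like, it helps to have the expected numbers in hand. Under Hardy-Weinberg, the expected genotype frequencies are p*p for AA, 2*p*q for AB, and q*q for BB. The short sketch below turns those into expected counts; the function name expected_genotype_counts is my own, and the 16 offspring match the number of rows suggested earlier.

```python
def expected_genotype_counts(p, n_offspring):
    """Expected AA, AB, and BB counts under Hardy-Weinberg proportions."""
    q = 1 - p
    return {
        "AA": p * p * n_offspring,
        "AB": 2 * p * q * n_offspring,
        "BB": q * q * n_offspring,
    }

expected = expected_genotype_counts(0.5, 16)
print(expected)  # {'AA': 4.0, 'AB': 8.0, 'BB': 4.0}
```

With p = 0.5 and 16 offspring, that works out to 4 AA, 8 AB, and 4 BB, the same 1:2:1 ratio as flipping two coins. Any single recalculation will wobble around those numbers rather than hit them exactly.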


Now, copy the cells E5 through J5 down for as many zygotes as you’d like in the first generation. Use the SUM function to calculate the numbers of each genotype in the H, I, and J columns. Use the genotype frequencies to calculate new allele frequencies and to recalculate new p and q values. Make a bar graph of the genotypes using the chart tool.
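The step of turning genotype counts back into allele frequencies is simple arithmetic: every AA zygote carries two A alleles, every AB zygote carries one, and each zygote carries two alleles in total. Here is a small sketch of that calculation; the function name allele_frequencies is my own and is not part of the spreadsheet.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (p, q) recalculated from genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)  # diploid: two alleles per zygote
    p = (2 * n_aa + n_ab) / total_alleles     # A alleles from AA and AB zygotes
    q = 1 - p
    return p, q

p_new, q_new = allele_frequencies(4, 8, 4)
print(p_new, q_new)  # 0.5 0.5
```

For example, 4 AA, 8 AB, and 4 BB zygotes contain 16 A alleles out of 32, so the new p is 0.5.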


You now have a model with which you can explore how allele frequencies behave and change from generation to generation.


Try out different starting allele frequencies in the model. Look for and describe the patterns that you find as you try out different allele frequencies. Develop and use a pattern to select your values to test and organize your exploration. In particular, test your model with extreme values and intermediate values.


Try adding additional generations to your model to look at how allele frequencies change over multiple generations. To do this, use your newly recalculated p and q values to seed the next generation. Once you’ve included the second generation, you should be able to copy additional generations so that each generation’s recalculated p and q values become the starting values for the next.
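Chaining generations together is easier to see in a loop than in copied spreadsheet blocks. The sketch below is my own simplified version, not the spreadsheet itself: it draws two gametes per offspring, recalculates p directly from the A gametes drawn, and feeds that p into the next generation. The names next_generation_p and simulate, and the seed, are assumptions for illustration.

```python
import random

def next_generation_p(p, n_offspring, rng):
    """Draw 2 gametes per offspring and return the new frequency of A."""
    n_gametes = 2 * n_offspring
    a_count = sum(1 for _ in range(n_gametes) if rng.random() <= p)
    return a_count / n_gametes

def simulate(p0, n_offspring, generations, seed=None):
    """Return the list of p values, starting with p0, one per generation."""
    rng = random.Random(seed)
    history = [p0]
    for _ in range(generations):
        history.append(next_generation_p(history[-1], n_offspring, rng))
    return history

history = simulate(0.5, 17, 5, seed=3)
print(history)
```

Counting A gametes directly gives the same p as counting genotypes first, since each AA contributes two A gametes and each AB contributes one.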


Try to create a graph of p values over several generations, for different-sized populations. See if you can detect a pattern of how population size affects the inheritance pattern. Be sure to try out both large and small populations of offspring.
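One way to compare population sizes without eyeballing a graph is to repeat the simulation several times per size and average how far p ends up from where it started. This is a rough sketch under my own assumptions: the function names, the 20 replicates, and the seed are arbitrary choices, and the sizes 17, 997, and 4997 are the ones tested in this lab.

```python
import random

def drift_after(p0, n_offspring, generations, rng):
    """Return |final p - starting p| after the given number of generations."""
    p = p0
    n_gametes = 2 * n_offspring
    for _ in range(generations):
        a_count = sum(1 for _ in range(n_gametes) if rng.random() <= p)
        p = a_count / n_gametes
    return abs(p - p0)

def mean_drift(p0, n_offspring, generations=5, replicates=20, seed=0):
    """Average the absolute change in p over several independent runs."""
    rng = random.Random(seed)
    total = sum(drift_after(p0, n_offspring, generations, rng)
                for _ in range(replicates))
    return total / replicates

results = {size: mean_drift(0.5, size) for size in (17, 997, 4997)}
for size, drift in results.items():
    print(size, round(drift, 4))
```

Averaging over replicates matters because a single run of a small population can, by chance, stay close to its starting frequency.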



Results

To demonstrate my results, I took a video. I first set the p value to 0.5 and tested five generations at population sizes of 17, 997, and 4997. Then, I set the p value to 0.2 and tested five generations at the same population sizes. All charts/graphs/pictures are displayed in the video.

As we can see, my hypothesis is supported, because the allele frequency stays more consistent when the population is larger. When the population is smaller, the chance that any random variation affects the overall frequency is larger.

How do inheritance patterns or allele frequencies change in a population over one generation?
Over one generation, when the population is isolated and there are no outside factors affecting it, inheritance patterns and allele frequencies remain approximately the same. This is corroborated by the results above, where the allele frequency remains extremely close to p=0.5, q=0.5 and p=0.2, q=0.8. This shows that when there are no factors that could change the allele frequency in the population (like mutations), inheritance patterns remain the same. However, if there are outside factors acting on the population, then allele frequencies are liable to change over one generation.

How do inheritance patterns or allele frequencies change in a population over multiple generations?

Over multiple generations, the trend for inheritance patterns and allele frequencies is essentially the same as it is for one generation. Because there are no outside factors acting on the population, the frequencies will remain about the same from generation to generation. This is supported by my results above, where allele frequencies remained largely the same (p=0.5, q=0.5 and p=0.2, q=0.8), even across five generations. Aside from random chance in small populations, the only way that inheritance patterns could change is if there are outside factors acting (like predation or mutation).

What can you change in your model? If you change something, what does the change tell you about how alleles behave?

In the model, the population size can be changed. If the population is made smaller, the allele frequencies are less likely to remain the same. For example, with populations of 997 and 4997, the variation is going to be a lot less (per my hypothesis). However, if the population size is 5 or 10, then changes in allele frequency are much more likely to carry over from generation to generation. This is supported by my results, where the population size of 17 had a LOT more change in allele frequency than the population sizes of 997 and 4997. Thus, the overall inheritance patterns are much more likely to change in smaller populations.

Do alleles behave the same way if you make a variable more extreme? Less extreme?

No, alleles don't behave the same way if you make a variable more or less extreme. In the context of this experiment, the variable that changes is the population size. We can see that the alleles didn't behave the same in populations of different sizes because genetic drift tended to cause greater changes in allele frequencies in small populations than in large populations, whether over one generation or many. When looking at the pie charts, the population sizes of 997 and 4997 were a lot closer to staying the same, whereas the chart for the population size of 17 changed dramatically.

Do alleles behave the same way no matter what the population size is?

No, alleles don't behave the same way no matter what the population size is. As I explained above, because genetic drift affects small populations to a greater extent, allele frequencies will change dramatically when there is a smaller population but to a lesser extent when the population size is greater.
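There is a way to put a rough number on this. Each generation samples 2N gametes, so the new p behaves like a binomial proportion, and its standard deviation after one generation is sqrt(p*q/(2N)). This is a textbook result from binomial sampling, not something I measured in the lab, and the function name drift_sd is my own.

```python
import math

def drift_sd(p, n_offspring):
    """Standard deviation of p after one generation of random sampling."""
    q = 1 - p
    return math.sqrt(p * q / (2 * n_offspring))

spread = {size: drift_sd(0.5, size) for size in (17, 997, 4997)}
for size, sd in spread.items():
    print(size, round(sd, 4))
```

With p = 0.5, the expected one-generation spread is about 0.086 for 17 offspring, about 0.011 for 997, and about 0.005 for 4997, which lines up with the population of 17 changing the most.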

Conclusion

Ultimately, my hypothesis was supported, because the larger populations stayed far closer to Hardy-Weinberg equilibrium than the smaller ones. In particular, when I tested p=0.5, q=0.5 with population sizes 17, 997, and 4997, the population size of 4997 had the most consistent allele frequency over 5 generations. That is because genetic drift affected it to a far lesser extent than it affected the population of 17. When I tested p=0.2, q=0.8 with the same population sizes, I saw the same pattern.

Thus, we can effectively make the claim that larger populations are much closer to reaching Hardy-Weinberg equilibrium. If I were to do this experiment again, I would change the population size to perhaps be a lot smaller (5) and a lot larger (999997). That way, I could test the allele frequencies when variables are taken to extremes. I would also run a lot more tests to gain further consistency in my results.

Potential sources of error include mistakes made when entering values or formulas into Excel, and calculation problems within the computer itself.

Thanks so much, Mr. Wong for letting us do this experiment! And thank you, Manas, for grading this beautiful assignment. <3 Love you, bro.

#TheWongStrikesBack #Yuh
