# Class Attendance Experiment

On Monday, I conducted a simple experiment in my Advanced Cell Biology class.  My hypothesis was that the sub-population of students who attend class is likely to score higher on exams.  To test my hypothesis, I looked up the frequency of scoring an “A” on Exam 1.  53 students scored an “A” out of 117 enrolled (yes, that seems high, but our CBN majors are really smart!).  During class I took a survey using PollEverywhere (below).  16 out of 26 students who responded reported that they scored an “A” on the same exam. More than 26 students attended the class but I figured that 26 was a reasonable sample of those who attended. What’s the probability that this happened by chance alone?

This is an example of the classic “marble and urn” problem.  You have an urn containing 100 marbles–80 of them are black and 20 of them are white.  If you randomly draw 10 marbles from the urn, what’s the chance of drawing 8 black and 2 white?  Or of drawing 3 black and 7 white?  The probability can be calculated based on the hypergeometric distribution.
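As a minimal sketch of the urn example in R, the two draws can be evaluated with `dhyper`. The variable names are my own, and I treat "black" as the outcome being counted, so the first argument is the number of black marbles drawn.

```r
# Urn: 80 black and 20 white marbles; draw 10 without replacement.
# dhyper(x, m, n, k): x = counted outcomes drawn, m = counted outcomes in urn,
# n = other outcomes in urn, k = number of draws.
p_8_black <- dhyper(8, 80, 20, 10)  # 8 black and 2 white
p_3_black <- dhyper(3, 80, 20, 10)  # 3 black and 7 white

print(p_8_black)
print(p_3_black)
```

Because black marbles make up 80% of the urn, the first outcome should come out far more likely than the second.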

The “null hypothesis” is that the students attending class are a random subset of all those enrolled in the class (that is, no particular enrichment of A’s).  Among my sample of 26 votes, I counted 16 A’s.  The total contents of the “urn” were 53 A’s and 117-53 grades that were not an A.
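Written out, the quantity of interest is the upper tail of the hypergeometric distribution. Here $X$ is my own notation for the number of A's in a random sample of 26, and $64 = 117 - 53$ is the number of grades that were not an A:

$$
P(X \ge 16) = \sum_{k=16}^{26} \frac{\binom{53}{k}\binom{64}{26-k}}{\binom{117}{26}}
$$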

I calculated this using the dhyper function in the statistical programming environment R.  Here’s the command:

```
> sum(dhyper(16:26,53,117-53,26))
[1] 0.04829361
```

The arguments are:

• 16:26 (the range from the number of A’s I observed in the sample up to the sample size; summing over this range gives the probability of finding 16 or more A students in a sample of 26)
• 53 (the total number of students in the class who got an A)
• 117 – 53 (the total number of students in the class who didn’t get an A)
• 26 (the sample size)
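As a cross-check, the same tail probability can be sketched with `phyper`, the cumulative version of `dhyper`. The variable names below are my own; note that with `lower.tail = FALSE`, `phyper` returns P(X > q), so I pass 15 to get the probability of 16 or more.

```r
a_total   <- 53   # students in the class who got an A
enrolled  <- 117  # students enrolled
sampled   <- 26   # survey responses
a_sampled <- 16   # A's among the responses

# Sum of the point probabilities for 16, 17, ..., 26 A's in the sample
p_sum <- sum(dhyper(a_sampled:sampled, a_total, enrolled - a_total, sampled))

# Upper tail in one call: P(X > 15) is the same as P(X >= 16)
p_tail <- phyper(a_sampled - 1, a_total, enrolled - a_total, sampled,
                 lower.tail = FALSE)

print(c(p_sum, p_tail))
```

The two values should agree up to floating-point error.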

The result says that, if the null hypothesis were true, the probability of finding 16 or more A’s in a sample of 26 would be 0.048.  So, at a threshold of p<0.05, I reject the null hypothesis and conclude that students attending class are more likely to score higher on exams.  This is not a cause-and-effect proof that attending class will get you a higher grade.  But it may be worth considering!  And certainly if you did not get an A and want to do better, it might help to attend class!