Dear students,

Today we begin our descent into the world of collections of data. No longer will we be content with one or two numbers—we want them by the hundreds!

For our first problem involving a data collection, we’ll look at testing whether or not a six-sided die (d6) is fair. We could do it with something like this:

Random generator = new Random(); int nOnes = 0; int nTwos = 0; int nThrees = 0; int nFours = 0; int nFives = 0; int nSixes = 0; for (int i = 0; i < 100000; ++i) { int roll = generator.nextInt(6) + 1; if (roll == 1) { ++nOnes; } else if (roll == 2) { ++nTwos; } else if (roll == 3) { ++nThrees; } else if (roll == 4) { ++nFours; } else if (roll == 5) { ++nFives; } else if (roll == 6) { ++nSixes; } } System.out.prinf("%8d%n", nOnes); System.out.prinf("%8d%n", nTwos); System.out.prinf("%8d%n", nThrees); System.out.prinf("%8d%n", nFours); System.out.prinf("%8d%n", nFives); System.out.prinf("%8d%n", nSixes);

Yes, this works, but everything is done six times. Imagine testing a d30! We need something better. Instead of six separate counters, we need an *array*, whichs hold a collection of six numbers. We make an array to hold 6 numbers like so:

int[] counts = new int[6];

We access individual elements of an array through an index:

counts[0] = 0; int n123s = numbers[0] + numbers[1] + numbers[2];

We can ask an array its length through its `length`

property:

System.out.println(counts.length); // prints 6

The beauty of an array is that we can create a loop that spins through the legal indices and processes the whole collection in just a few lines of code:

int nrolls = 0; for (int i = 0; i < counts.length; ++i) { nrolls += counts[i]; }

For our second problem, we’ll look at a collection of our own data. We’ll ask everyone to report two numbers:

- The number of children your grandparents had (i.e., the number of parents you have plus their brothers and sisters). For example, I have two parents, three uncles, and two aunts, so I’d report 7.
- The number of children your parents had, including any half- or step-siblings, and including you. For example, I have just one brother, so I’d report 2.

We will collect this data in a plain text file and then process it to determine the following statistics:

- the average number of children in your families
- the relative frequency of each number of children in your families
- the relationship between the number of children in the previous generation and in your own

To calculate the relationship between the two numbers, we will essentially ask this question: “Is the number of children in your family a function of the number of children in your parents’ families?” We will compute a trend line—a linear regression—that gives such a function, and we will see if it’s a good fit or not.

Linear regression is computed with the following formulae:

meanXY - meanX * meanY m = ---------------------- meanXX - meanX * meanX b = meanY - m * meanX y = mx + b

We’ll plot the results and see how good a model this function is.

See you next class!

Sincerely,

P.S. Here’s the code we wrote together…

package lecture1107; import java.util.Random; public class FairDieTerrible { public static void main(String[] args) { int n1s = 0; int n2s = 0; int n3s = 0; int n4s = 0; int n5s = 0; int n6s = 0; Random generator = new Random(); // float f = 5.6F; long nspins = 10000000l; for (long i = 0; i < nspins; ++i) { int roll = generator.nextInt(6) + 1; if (roll == 1) { ++n1s; } else if (roll == 2) { ++n2s; } else if (roll == 3) { ++n3s; } else if (roll == 4) { ++n4s; } else if (roll == 5) { ++n5s; } else if (roll == 6) { ++n6s; } else { throw new RuntimeException("This should never happen!"); } } System.out.printf("1: %d%n", n1s); System.out.printf("2: %d%n", n2s); System.out.printf("3: %d%n", n3s); System.out.printf("4: %d%n", n4s); System.out.printf("5: %d%n", n5s); System.out.printf("6: %d%n", n6s); } }

package lecture1107; import java.util.Random; import java.util.Scanner; public class FairDieAwesome { public static void main(String[] args) { int nsides = 30; int[] counts = new int[nsides]; Random generator = new Random(); long nspins = 100000000l; for (long i = 0; i < nspins; ++i) { int roll = generator.nextInt(nsides) + 1; counts[roll - 1]++; } for (int i = 0; i < counts.length; ++i) { System.out.printf("%d: %d%n", i + 1, counts[i]); } } }

package lecture1107; import java.io.File; import java.io.FileNotFoundException; import java.util.Scanner; public class Generations { public static void main(String[] args) throws FileNotFoundException { File inFile = new File("/Users/johnch/numbers.csv"); Scanner in = new Scanner(inFile); while (in.hasNextInt()) { int x = in.nextInt(); int y = in.nextInt(); System.out.printf("%d,%d%n", x, y); } in.close(); } }

## Comments

For the quartersheet, we just do the arrays practice problems in CodinBat?