CS 145 Lecture 26 – Arrays Cont’d
Dear students,
Today we continue looking at the data that we collected last time:
- The number of children your grandparents had (i.e., the number of parents you have plus their brothers and sisters). For example, I have two parents, three uncles, and two aunts, so I’d report 7.
- The number of children your parents had, including any half- or step-siblings, and including you. For example, I have just one brother, so I’d report 2.
We collected this data in a plain text file, and we will process it to determine the following statistics:
- the average number of children in your families
- the relative frequency of each number of children in your families
- the relationship between the number of children in the previous generation and in your own
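The first two statistics can be sketched with a couple of loops and a counting array. This is only a rough illustration, not the code from class: the class name `FamilyStats`, its method names, and the sample values are made up, and it assumes every count is non-negative.

```java
public class FamilyStats {
    // Mean of the values, using floating-point division so the
    // fractional part is not truncated.
    public static double mean(int[] values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum / (double) values.length;
    }

    // Element k of the result is the fraction of entries equal to k.
    // Assumes all values are non-negative.
    public static double[] relativeFrequencies(int[] values) {
        int max = 0;
        for (int value : values) {
            max = Math.max(max, value);
        }

        // counts[k] tallies how many families reported k children.
        int[] counts = new int[max + 1];
        for (int value : values) {
            counts[value]++;
        }

        double[] frequencies = new double[counts.length];
        for (int k = 0; k < counts.length; ++k) {
            frequencies[k] = counts[k] / (double) values.length;
        }
        return frequencies;
    }

    public static void main(String[] args) {
        int[] children = {2, 3, 2, 1, 2}; // made-up sample, not class data
        System.out.println("mean = " + mean(children));

        double[] frequencies = relativeFrequencies(children);
        for (int k = 0; k < frequencies.length; ++k) {
            System.out.println(k + " children: " + frequencies[k]);
        }
    }
}
```

The counting array works here because the values are small non-negative integers, so each value can serve directly as an index.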
To calculate the relationship between the two numbers, we will essentially ask this question: “Is the number of children in your family a function of the number of children in your parents’ families?” We will compute a trend line—a linear regression—that gives such a function, and we will see if it’s a good fit or not.
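The slope m and intercept b of the trend line are computed from means over all the samples, where x is the number of children in the previous generation and y is the number in your own: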
$$m = \frac{\text{meanXY} - \text{meanX} \cdot \text{meanY}}{\text{meanXX} - \text{meanX} \cdot \text{meanX}}$$

$$b = \text{meanY} - m \cdot \text{meanX}$$

$$y = mx + b$$
We’ll plot the results and see how good a model this function is.
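Besides eyeballing a plot, one common way to put a number on "how good a model" is the coefficient of determination, R². That measure is my addition here, not something the formulas above include. The sketch below fits the line with those formulas and then computes R²; the class name `TrendLine`, its methods, and the sample data are hypothetical.

```java
public class TrendLine {
    // Returns {m, b} for the line y = m * x + b, using the
    // mean-based formulas. Assumes the xs are not all identical,
    // since that would make the denominator zero.
    public static double[] fit(int[] xs, int[] ys) {
        int n = xs.length;
        double sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
        for (int i = 0; i < n; ++i) {
            sumX += xs[i];
            sumY += ys[i];
            sumXX += xs[i] * xs[i];
            sumXY += xs[i] * ys[i];
        }

        double meanX = sumX / n;
        double meanY = sumY / n;
        double meanXX = sumXX / n;
        double meanXY = sumXY / n;

        double m = (meanXY - meanX * meanY) / (meanXX - meanX * meanX);
        double b = meanY - m * meanX;
        return new double[] {m, b};
    }

    // R^2 = 1 - (sum of squared residuals) / (total sum of squares).
    // Values near 1 mean the line explains most of the variation in y.
    // Assumes the ys are not all identical.
    public static double rSquared(int[] xs, int[] ys, double m, double b) {
        double meanY = 0;
        for (int y : ys) {
            meanY += y;
        }
        meanY /= ys.length;

        double residual = 0;
        double total = 0;
        for (int i = 0; i < xs.length; ++i) {
            double predicted = m * xs[i] + b;
            residual += (ys[i] - predicted) * (ys[i] - predicted);
            total += (ys[i] - meanY) * (ys[i] - meanY);
        }
        return 1 - residual / total;
    }

    public static void main(String[] args) {
        // Made-up points that lie exactly on y = 2x + 1.
        int[] xs = {1, 2, 3, 4};
        int[] ys = {3, 5, 7, 9};

        double[] line = fit(xs, ys);
        System.out.println("m = " + line[0] + ", b = " + line[1]);
        System.out.println("R^2 = " + rSquared(xs, ys, line[0], line[1]));
    }
}
```

Because the sample points sit exactly on a line, the fit recovers a slope of 2 and an intercept of 1 with an R² of 1. Real survey data will be far noisier, and an R² near 0 would suggest that one generation's family size tells us little about the next.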
Next we’ll examine some other data that belongs to us: our birthdays! We will do a quick check of the canonical birthday problem:
Put n people in a room. What’s the likelihood that they all have different birthdays? At what n does a shared birthday flip from unlikely to likely?
Do we have any shared birthdays in this class? We’ll find out. Arrays will help.
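Before looking at our own data, we can work out the theoretical answer. The sketch below assumes 365 equally likely birthdays and ignores leap days; the class name `BirthdayOdds` and its methods are made up for illustration. It multiplies together the chance that each new person misses all the birthdays already taken.

```java
public class BirthdayOdds {
    // Probability that n people all have different birthdays,
    // assuming 365 equally likely days. Person i (counting from 0)
    // must avoid the i days already taken.
    public static double allDifferent(int n) {
        double probability = 1.0;
        for (int i = 0; i < n; ++i) {
            probability *= (365 - i) / 365.0;
        }
        return probability;
    }

    // Smallest n at which a shared birthday is more likely than not,
    // i.e. where the all-different probability first drops below 0.5.
    public static int tippingPoint() {
        int n = 1;
        while (allDifferent(n) >= 0.5) {
            ++n;
        }
        return n;
    }

    public static void main(String[] args) {
        for (int n = 5; n <= 40; n += 5) {
            System.out.printf("n = %d: P(all different) = %.4f%n", n, allDifferent(n));
        }
        System.out.println("Tipping point: " + tippingPoint());
    }
}
```

Under these assumptions the tipping point is 23 people, which is far fewer than most of us would guess.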
Here’s your TODO list to complete before we meet again:
- Solve two problems from Array-1 and two problems from Array-2 on CodingBat. On a quarter sheet, write down the names of the problems you solved and at least one of the solutions.
See you next class!
![](http://www.twodee.org/images/signature_first.png)
P.S. Here’s the code we wrote together…
Generations.java
```java
package lecture1109;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Scanner;

public class Generations {
  public static void main(String[] args) throws FileNotFoundException {
    File inFile = new File("/Users/johnch/numbers.csv");
    Scanner in = new Scanner(inFile);

    int nsamples = in.nextInt();
    int[] oldChildren = new int[nsamples];
    int[] youngChildren = new int[nsamples];

    for (int i = 0; i < nsamples; ++i) {
      oldChildren[i] = in.nextInt();
      youngChildren[i] = in.nextInt();
    }

    in.close();

    // for (int i = 0; i < oldChildren.length; ++i) {
    //   System.out.println(oldChildren[i]);
    // }
    // System.out.println(Arrays.toString(oldChildren));

    int sumSoFar = 0;
    int sumXX = 0;
    int sumXY = 0;
    int sumX = 0;
    int sumY = 0;
    for (int i = 0; i < youngChildren.length; ++i) {
      sumY += youngChildren[i];
      sumX += oldChildren[i];
      sumXX += oldChildren[i] * oldChildren[i];
      sumXY += oldChildren[i] * youngChildren[i];
      System.out.println(oldChildren[i] + "," + youngChildren[i]);
    }

    double meanX = sumX / (double) nsamples;
    double meanY = sumY / (double) nsamples;
    double meanXX = sumXX / (double) nsamples;
    double meanXY = sumXY / (double) nsamples;

    double m = (meanXY - meanX * meanY) / (meanXX - meanX * meanX);
    double b = meanY - m * meanX;

    System.out.printf("y = %f * x + %f%n", m, b);

    // double mean = sumSoFar / (double) youngChildren.length;
    // System.out.println(mean);
  }
}
```
Birthdays.java
```java
package lecture1109;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Scanner;

public class Birthdays {
  public static void main(String[] args) throws FileNotFoundException {
    Scanner in = new Scanner(new File("/Users/johnch/birthdays.csv"));

    int[] counts = new int[31 * 12];

    while (in.hasNextInt()) {
      int month = in.nextInt();
      int day = in.nextInt();

      int daysBeforeThisMonth = (month - 1) * 31;
      int i = daysBeforeThisMonth + day - 1;

      // increment that day's counter
      counts[i]++;
    }

    in.close();

    System.out.println(Arrays.toString(counts));

    for (int i = 0; i < counts.length; ++i) {
      if (counts[i] > 1) {
        int month = i / 31 + 1;
        int day = i % 31 + 1;
        System.out.println(month + " " + day);
      }
    }
  }
}
```