# teaching machines

## CS 145 Lecture 26 – Arrays Cont’d

November 9, 2016. Filed under cs145, fall 2016, lectures.

Dear students,

Today we continue looking at the data that we collected last time:

1. The number of children your grandparents had (i.e., the number of parents you have plus their brothers and sisters). For example, I have two parents, three uncles, and two aunts, so I’d report 7.
2. The number of children your parents had, including any half- or step-siblings, and including you. For example, I have just one brother, so I’d report 2.

We collected this data in a plain text file, and we will process it to determine the following statistics:

• the average number of children in your families
• the relative frequency of each number of children in your families
• the relationship between the number of children in the previous generation and in your own
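The first two statistics need only a loop and a counting array. Here is a minimal sketch, not the code from class: the class name `FamilyStats`, its method names, and the sample data are made up for illustration, and it assumes the array is non-empty and holds no negative numbers.

```java
import java.util.Arrays;

class FamilyStats {
    // Average of the values. The cast forces floating-point division.
    static double mean(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; ++i) {
            sum += values[i];
        }
        return sum / (double) values.length;
    }

    // freq[k] is the fraction of families that reported exactly k children.
    static double[] relativeFrequencies(int[] values) {
        // Find the largest report so we know how many counters we need.
        int max = 0;
        for (int i = 0; i < values.length; ++i) {
            max = Math.max(max, values[i]);
        }

        // Use each report as an index into the array of counters.
        int[] counts = new int[max + 1];
        for (int i = 0; i < values.length; ++i) {
            counts[values[i]]++;
        }

        double[] freq = new double[counts.length];
        for (int k = 0; k < counts.length; ++k) {
            freq[k] = counts[k] / (double) values.length;
        }
        return freq;
    }

    public static void main(String[] args) {
        int[] sample = {2, 3, 2, 1, 2}; // made-up data, not the class's
        System.out.println(mean(sample));
        System.out.println(Arrays.toString(relativeFrequencies(sample)));
    }
}
```

The trick in `relativeFrequencies` is the same one we use for birthdays: the value itself picks which counter to increment.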

To calculate the relationship between the two numbers, we will essentially ask this question: “Is the number of children in your family a function of the number of children in your parents’ families?” We will compute a trend line—a linear regression—that gives such a function, and we will see if it’s a good fit or not.

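Linear regression is computed with the following formulae, where x is the number of children in the previous generation, y is the number in your own, and a name like meanXY stands for the mean of the products x times y: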

$$m = \frac{\text{meanXY} - \text{meanX} \cdot \text{meanY}}{\text{meanXX} - \text{meanX} \cdot \text{meanX}}$$

$$b = \text{meanY} - m \cdot \text{meanX}$$

$$y = mx + b$$
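To check the formulae by hand, it helps to feed them points that we already know lie on a line. The sketch below uses three made-up points on y = 2x + 1, so the computed slope and intercept should come out very close to 2 and 1. The class name `TrendLine` and the method `fit` are hypothetical, and the code assumes both arrays have the same length and that the x values are not all identical (otherwise the denominator is zero).

```java
class TrendLine {
    // Returns {m, b} for the line y = m * x + b fitted to the points (x[i], y[i]).
    static double[] fit(int[] x, int[] y) {
        int n = x.length;
        double sumX = 0;
        double sumY = 0;
        double sumXX = 0;
        double sumXY = 0;

        for (int i = 0; i < n; ++i) {
            sumX += x[i];
            sumY += y[i];
            sumXX += x[i] * x[i];
            sumXY += x[i] * y[i];
        }

        // The sums are doubles, so these divisions keep their fractions.
        double meanX = sumX / n;
        double meanY = sumY / n;
        double meanXX = sumXX / n;
        double meanXY = sumXY / n;

        double m = (meanXY - meanX * meanY) / (meanXX - meanX * meanX);
        double b = meanY - m * meanX;
        return new double[] {m, b};
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3}; // made-up points on y = 2x + 1
        int[] y = {3, 5, 7};
        double[] line = fit(x, y);
        System.out.printf("y = %f * x + %f%n", line[0], line[1]);
    }
}
```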

We’ll plot the results and see how good a model this function is.
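A plot lets us judge the fit by eye. If we also want a number, one common choice (not something from our in-class code) is the coefficient of determination, R squared. It is 1 when every point sits exactly on the line and 0 when the line does no better than always guessing the average y. A minimal sketch, with the hypothetical names `FitQuality` and `rSquared`, assuming the y values are not all identical:

```java
class FitQuality {
    // Compares the line's squared errors to the squared errors of always guessing the mean of y.
    static double rSquared(int[] x, int[] y, double m, double b) {
        double sumY = 0;
        for (int i = 0; i < y.length; ++i) {
            sumY += y[i];
        }
        double meanY = sumY / y.length;

        double residualSquares = 0; // error left over after using the line
        double totalSquares = 0;    // error from using the mean alone
        for (int i = 0; i < y.length; ++i) {
            double predicted = m * x[i] + b;
            residualSquares += (y[i] - predicted) * (y[i] - predicted);
            totalSquares += (y[i] - meanY) * (y[i] - meanY);
        }

        return 1 - residualSquares / totalSquares;
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3}; // made-up points on y = 2x + 1
        int[] y = {3, 5, 7};
        System.out.println(rSquared(x, y, 2, 1)); // the true line
        System.out.println(rSquared(x, y, 0, 5)); // a flat line at the mean
    }
}
```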

Next we’ll examine some other data that belongs to us: our birthdays! We will do a quick check of the canonical birthday problem:

Put n people in a room. What’s the likelihood that all have different birthdays? At what n does a shared birthday flip from unlikely to likely?
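We can answer the question numerically before looking at our own data. The sketch below assumes 365 equally likely birthdays and ignores leap days, which is the usual simplification. Person 1 can have any birthday, person 2 must avoid 1 day, person 3 must avoid 2 days, and so on, so we multiply those fractions together. The class and method names are hypothetical.

```java
class BirthdayProblem {
    // Probability that n people all have different birthdays.
    static double probabilityAllDifferent(int n) {
        if (n > 365) {
            return 0.0; // more people than days, so a match is certain
        }
        double p = 1.0;
        for (int i = 0; i < n; ++i) {
            // Person i must avoid the i birthdays already taken.
            p *= (365 - i) / 365.0;
        }
        return p;
    }

    // Smallest n at which a shared birthday is at least as likely as not.
    static int tippingPoint() {
        int n = 1;
        while (probabilityAllDifferent(n) > 0.5) {
            ++n;
        }
        return n;
    }

    public static void main(String[] args) {
        int n = tippingPoint();
        System.out.println(n + " people: " + (1 - probabilityAllDifferent(n)));
    }
}
```

Under these assumptions the flip happens at 23 people, far fewer than most people guess.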

Do we have any shared birthdays in this class? We’ll find out. Arrays will help.

Here’s your TODO list to complete before we meet again:

• Solve two problems from Array-1 and two problems from Array-2 on CodingBat. On a quarter sheet, write down the names of the problems you solved and at least one of your solutions.

See you next class!

Sincerely,

P.S. Here’s the code we wrote together…

#### Generations.java

    package lecture1109;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Arrays;
    import java.util.Scanner;

    public class Generations {
      public static void main(String[] args) throws FileNotFoundException {
        File inFile = new File("/Users/johnch/numbers.csv");
        // Scanner splits on whitespace by default, so nextInt expects
        // whitespace-separated integers in the file.
        Scanner in = new Scanner(inFile);

        // The first number in the file says how many pairs follow.
        int nsamples = in.nextInt();
        int[] oldChildren = new int[nsamples];
        int[] youngChildren = new int[nsamples];

        // Each sample is a pair: grandparents' children, then parents' children.
        for (int i = 0; i < nsamples; ++i) {
          oldChildren[i] = in.nextInt();
          youngChildren[i] = in.nextInt();
        }

        in.close();

    //    for (int i = 0; i < oldChildren.length; ++i) {
    //      System.out.println(oldChildren[i]);
    //    }
    //    System.out.println(Arrays.toString(oldChildren));

        int sumSoFar = 0;
        int sumXX = 0;
        int sumXY = 0;
        int sumX = 0;
        int sumY = 0;

        // x is the previous generation, y is our own.
        for (int i = 0; i < youngChildren.length; ++i) {
          sumY += youngChildren[i];
          sumX += oldChildren[i];
          sumXX += oldChildren[i] * oldChildren[i];
          sumXY += oldChildren[i] * youngChildren[i];
          // Echo each pair as x,y so we can plot it.
          System.out.println(oldChildren[i] + "," + youngChildren[i]);
        }

        // Cast to double so the division doesn't throw away the fraction.
        double meanX = sumX / (double) nsamples;
        double meanY = sumY / (double) nsamples;
        double meanXX = sumXX / (double) nsamples;
        double meanXY = sumXY / (double) nsamples;

        // Slope and intercept of the trend line.
        double m = (meanXY - meanX * meanY) / (meanXX - meanX * meanX);
        double b = meanY - m * meanX;

        System.out.printf("y = %f * x + %f%n", m, b);

    //    double mean = sumSoFar / (double) youngChildren.length;
    //    System.out.println(mean);
      }
    }


#### Birthdays.java

    package lecture1109;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Arrays;
    import java.util.Scanner;

    public class Birthdays {
      public static void main(String[] args) throws FileNotFoundException {
        Scanner in = new Scanner(new File("/Users/johnch/birthdays.csv"));

        // One counter per date. Every month gets 31 slots to keep the index
        // math simple; slots for dates that don't exist just stay at 0.
        int[] counts = new int[31 * 12];

        // Each birthday is a month number followed by a day number.
        while (in.hasNextInt()) {
          int month = in.nextInt();
          int day = in.nextInt();

          // Months and days start at 1, but array indices start at 0.
          int daysBeforeThisMonth = (month - 1) * 31;
          int i = daysBeforeThisMonth + day - 1;

          // increment that day's counter
          counts[i]++;
        }

        in.close();

        System.out.println(Arrays.toString(counts));

        // Any counter above 1 is a shared birthday. Undo the index math to
        // recover the month and day.
        for (int i = 0; i < counts.length; ++i) {
          if (counts[i] > 1) {
            int month = i / 31 + 1;
            int day = i % 31 + 1;
            System.out.println(month + " " + day);
          }
        }
      }
    }
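The index arithmetic in Birthdays.java is easy to get wrong by one, so it can help to isolate it and check that it round-trips. This is only a sketch of that idea, not code we wrote in class; the class `DateIndex` and its method names are hypothetical.

```java
class DateIndex {
    // Maps a 1-based month and day to a slot in an array of 31 * 12 counters.
    static int toIndex(int month, int day) {
        return (month - 1) * 31 + (day - 1);
    }

    // Recovers the 1-based month from a slot.
    static int toMonth(int index) {
        return index / 31 + 1;
    }

    // Recovers the 1-based day from a slot.
    static int toDay(int index) {
        return index % 31 + 1;
    }

    public static void main(String[] args) {
        int i = toIndex(2, 14);
        System.out.println(i + " -> " + toMonth(i) + " " + toDay(i));
    }
}
```

January 1 lands in slot 0 and December 31 in slot 371, the last slot of the array, so every date fits.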