
Big Data









Average Height & Weight

Because the full data set takes so long to load, we’ll work with a smaller file (only 1000 rows) for the rest of the lesson. Don’t take this to mean that computers can’t handle big data sets, though: it only seems slow here because the file has to be loaded again in each code window. Normally, you load the data once and can then run all of your calculations relatively quickly!
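To illustrate the "load once, calculate many times" idea, here is a minimal sketch. It assumes a tiny made-up CSV held in memory (the `csv_text` string and its values are hypothetical stand-ins for a real file), so it runs without downloading anything.

```python
import io
import pandas

# Hypothetical stand-in for a CSV file on disk (the values are made up)
csv_text = "height_inches\n65.0\n70.0\n72.0\n"

# Load the data once...
data = pandas.read_csv(io.StringIO(csv_text))

# ...then reuse the same DataFrame for as many calculations as you like
tallest = data.height_inches.max()
shortest = data.height_inches.min()
row_count = len(data)
print(tallest, shortest, row_count)
```

Each calculation after the `read_csv` call works on data already in memory, which is why it is fast compared with loading the file again.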


Now, try going to the website the data came from and see if you can figure out what the average height and weight of these individuals are… we bet you can’t! Let’s use Python to calculate the mean in a fraction of a second using NumPy’s mean function!

import pandas  # work with big data
import numpy  # statistics functions

# Load data
human_data = pandas.read_csv('HumanHeightWeightData_1000.csv')

# Print the average height
print(numpy.mean(human_data.height_inches))
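The same approach should work for any numeric column. The sketch below uses a small made-up table instead of the lesson file, and the column name `weight_pounds` is an assumption for illustration only; check the actual column names in the CSV before using it. It also shows that a pandas column has its own `mean()` method, which gives the same answer as `numpy.mean`.

```python
import numpy
import pandas

# Made-up sample values; the real lesson file has 1000 rows.
# 'weight_pounds' is a hypothetical column name, not taken from the lesson.
sample = pandas.DataFrame({
    'height_inches': [60.0, 66.0, 72.0],
    'weight_pounds': [110.0, 150.0, 190.0],
})

# Average of each column using NumPy
mean_height = numpy.mean(sample.height_inches)
mean_weight = numpy.mean(sample.weight_pounds)

# pandas columns also provide a mean() method
mean_height_pandas = sample.height_inches.mean()

print(mean_height, mean_weight, mean_height_pandas)
```

To see which columns the real file contains, `print(human_data.columns)` lists their names.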
