Assignment #05

Unit 05

Before the next unit, work through the material provided in Unit 5 and solve the following exercise blocks in separate notebooks.

Notebook 1

#05-01: Looping vs vectorizing

Consider this:

import numpy as np
count = np.arange(10, 20)
v1a = np.full(count.shape, False)
v1b = v1a.copy()
v2a = v1a.copy()
v2b = v1a.copy()
  1. Write a loop that achieves the same result as the following line of code, so that v1a and v1b are equal again:
v1a[count > 16] = True
  2. Write a vectorized statement using logical indexing that achieves the same result as the following loop, so that v2a and v2b are equal again:
for i, ct in enumerate(count):
    if ct > 13 and ct < 17:
        v2a[i] = True

Notebook 2

For this exercise block, you will work with a real data set, nowcast data from a numerical weather prediction model in Glacier National Park, Canada, WX_GNP.csv.

Here is a description of the spreadsheet variables:

  datetime: datetime in the form YYYY-MM-DD HH:MM:SS
station_id: ID of virtual weather station (i.e., weather model grid point)
        hs: Snow height (cm)
      hn24: Height of new snow within last 24 hours (cm)
      hn72: Height of new snow within last 72 hours (cm)
      rain: Liquid water accumulation within last 24 hours (mm)
      iswr: Incoming shortwave radiation (also referred to as irradiance) (W/m2)
      ilwr: Incoming longwave radiation (W/m2)
        ta: Air temperature (degrees Celsius)
        rh: Relative humidity (%)
        vw: Wind speed (m/s)
        dw: Wind direction (degrees)
      elev: Station elevation (m asl)
#05-02: Explore WX
  • Read the comma-separated spreadsheet WX_GNP.csv into a data frame.
  • How many different virtual stations are included in the data frame?
  • How many unique time stamps does the data frame contain?
  • What are the earliest and latest time records? Which time records are displayed in the first and last rows?
  • How many stations are located above 2000 m but below 2200 m?
  • Characterize the elevation distribution of the stations with different percentiles
Tip

The methods .unique(), .min(), .max(), .describe(), and .quantile() will be handy for these tasks.
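
As a rough illustration of how these methods behave, here is a minimal sketch on a tiny made-up data frame. The data frame `toy` and all of its values are hypothetical and are not taken from WX_GNP.csv; only the column names mirror the description above.

```python
import pandas as pd

# Hypothetical toy data, NOT taken from WX_GNP.csv
toy = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2020-01-01 06:00:00", "2020-01-01 06:00:00",
        "2020-01-02 06:00:00", "2020-01-02 06:00:00",
    ]),
    "station_id": ["A", "B", "A", "B"],
    "elev": [1900, 2100, 1900, 2100],
})

# .unique() returns the distinct values; len() counts them
n_stations = len(toy["station_id"].unique())

# .min() and .max() also work on datetime columns
first_time = toy["datetime"].min()
last_time = toy["datetime"].max()

# .quantile() accepts a single probability or a list of probabilities
elev_quartiles = toy["elev"].quantile([0.25, 0.5, 0.75])

# .describe() bundles count, mean, std, min, quartiles, and max
print(toy["elev"].describe())
```

Note that each station appears once per time stamp in this toy example, so counting rows is not the same as counting stations.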

#05-03: Subset WX and compute more summary stats
  • Create another data frame as a subset of WX that contains all the data from one station_id of your choice.
  • What are the mean air temperature and its standard deviation?
  • What is the median relative humidity rh when either hn24 is greater than 10 cm or rain is greater than 2 mm? What about the median rh during the opposite conditions?
  • Compute a new column hn72_check that should conceptually be identical to hn72. Use only hn24 to derive hn72_check.
  • Test whether hn72_check is indeed equal to hn72. If it is not, why not?
  • Store the new data frame in a csv file.
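
Two techniques are useful for the tasks above: combining conditions into a logical mask, and summing over a moving window. The following is only a sketch on made-up numbers. The data frame `toy`, its values, and the column name `hn_3day_sum` are hypothetical, and the window of 3 rows assumes exactly one record per day, which you should verify for the real data.

```python
import pandas as pd

# Hypothetical values for a single station, NOT taken from WX_GNP.csv.
# Assumption: one row per day, so 3 consecutive rows span 72 hours.
toy = pd.DataFrame({
    "hn24": [0.0, 12.0, 3.0, 0.0, 5.0],
    "rain": [0.0, 0.0, 3.0, 0.0, 0.0],
    "rh":   [60.0, 95.0, 90.0, 55.0, 80.0],
})

# Element-wise OR of two conditions; each condition needs its own parentheses
wet = (toy["hn24"] > 10) | (toy["rain"] > 2)

# Median under the condition, and under the opposite condition via ~
median_rh_wet = toy.loc[wet, "rh"].median()
median_rh_dry = toy.loc[~wet, "rh"].median()

# Moving sum over 3 rows; the first two rows lack a full window and become NaN
toy["hn_3day_sum"] = toy["hn24"].rolling(window=3).sum()
print(toy)
```

Because the first rows of a moving sum are NaN, a plain element-wise comparison against another column will report those rows as unequal even before any physical differences come into play.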


Solutions

import numpy as np
count = np.arange(10, 20)
v1a = np.full(count.shape, False)
v1b = v1a.copy()
v2a = v1a.copy()
v2b = v1a.copy()

v1a[count > 16] = True

for i, ct in enumerate(count):
    if ct > 16:
        v1b[i] = True

print(f"v1a is equal to v1b: {np.array_equal(v1a, v1b)}")

for i, ct in enumerate(count):
    if ct > 13 and ct < 17:
        v2a[i] = True

v2b[(count > 13) & (count < 17)] = True

print(f"v2a is equal to v2b: {np.array_equal(v2a, v2b)}")
v1a is equal to v1b: True
v2a is equal to v2b: True
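
The vectorized statement above uses `&` rather than `and`. The short sketch below illustrates why, reusing `count` from the exercise; the names `mask`, `mask_alt`, and `raised` are introduced here only for illustration.

```python
import numpy as np

count = np.arange(10, 20)

# & combines two boolean arrays element by element
mask = (count > 13) & (count < 17)

# np.logical_and is an equivalent, more verbose spelling
mask_alt = np.logical_and(count > 13, count < 17)

# Python's `and` needs one truth value per operand,
# which is ambiguous for arrays with more than one element
try:
    (count > 13) and (count < 17)
    raised = False
except ValueError:
    raised = True

print(count[mask])
print(f"`and` raised a ValueError: {raised}")
```

The parentheses around each comparison matter, because `&` binds more tightly than `>` and `<`.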

Download the notebook: 05_tasks_solved.ipynb