Assignment #05

Unit 05

By next week, work through the material provided in Unit 5 and solve the following exercises in a notebook.

This week, you will work with a real data set: nowcast data from a numerical weather prediction model in Glacier National Park, Canada. You can find the data set on ILIAS at 05_unit/WX_GNP.csv.

Here is a dictionary that explains the meaning of the spreadsheet variables:

WX_dict = {
    'datetime': 'datetime in the form YYYY-MM-DD HH:MM:SS',
    'station_id': 'ID of virtual weather station (i.e., weather model grid point)',
    'hs': 'Snow height (cm)',
    'hn24': 'Height of new snow within last 24 hours (cm)',
    'hn72': 'Height of new snow within last 72 hours (cm)',
    'rain': 'Liquid water accumulation within last 24 hours (mm)',
    'iswr': 'Incoming shortwave radiation (also referred to as irradiance) (W/m2)',
    'ilwr': 'Incoming longwave radiation (W/m2)',
    'ta': 'Air temperature (degrees Celsius)',
    'rh': 'Relative humidity (%)',
    'vw': 'Wind speed (m/s)',
    'dw': 'Wind direction (degrees)',
    'elev': 'Station elevation (m asl)'
}

for key, explanation in WX_dict.items():
    print(f'{key:>10}: {explanation}')
  datetime: datetime in the form YYYY-MM-DD HH:MM:SS
station_id: ID of virtual weather station (i.e., weather model grid point)
        hs: Snow height (cm)
      hn24: Height of new snow within last 24 hours (cm)
      hn72: Height of new snow within last 72 hours (cm)
      rain: Liquid water accumulation within last 24 hours (mm)
      iswr: Incoming shortwave radiation (also referred to as irradiance) (W/m2)
      ilwr: Incoming longwave radiation (W/m2)
        ta: Air temperature (degrees Celsius)
        rh: Relative humidity (%)
        vw: Wind speed (m/s)
        dw: Wind direction (degrees)
      elev: Station elevation (m asl)
#05-01: Explore WX
  • Read the comma separated spreadsheet WX_GNP.csv
  • How many different virtual stations are included in the data frame?
  • How many unique time stamps does the data frame contain?
  • What are the earliest and latest time records? Which time records are displayed in the first and last rows?
  • How many stations are located above 2000 m but below 2200 m?
  • Characterize the elevation distribution of the stations with a number of percentiles.
Tip

The methods .unique(), .min(), .max(), .describe(), and .quantile() will be handy for these tasks.
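
As a rough illustration of how these methods behave, here is a minimal sketch on a small made-up data frame. The frame `toy`, its station names, and its elevations are invented for this example and are not taken from WX_GNP.csv; only the column names follow the dictionary above.

```python
import pandas as pd

# Synthetic toy data (NOT the real WX_GNP.csv): three invented stations,
# each with the same two time stamps.
toy = pd.DataFrame({
    "datetime": ["2019-01-01 06:00:00", "2019-01-01 12:00:00"] * 3,
    "station_id": ["A", "A", "B", "B", "C", "C"],
    "elev": [1900, 1900, 2100, 2100, 2300, 2300],
})

# Count distinct values in a column.
n_stations = len(toy["station_id"].unique())
n_times = len(toy["datetime"].unique())

# Every station appears once per time stamp, so reduce to one row per
# station before counting stations or describing their elevations.
stations = toy.drop_duplicates("station_id")

# Combine two conditions with & (each condition needs its own parentheses).
in_band = stations[(stations["elev"] > 2000) & (stations["elev"] < 2200)]

# Selected percentiles of the station elevations.
percentiles = stations["elev"].quantile([0.25, 0.5, 0.75])

print(n_stations, n_times, len(in_band))
print(percentiles)
```

Whether you count rows or stations makes a difference here: the elevation of a station is repeated in every record of that station.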

#05-02: Use datetime provided in WX
  • How many records are available from 2019?
  • Create a new data frame WX19 that contains all records from the 2018/19 winter season. Let’s take all records between September and May. The only meteorological variables we need in addition to the meta variables station_id and datetime are ta and iswr.
  • Save WX19 as an Excel file (‘.xlsx’).
Tip

To solve the first question in a pythonic way, let’s make use of pandas datetime features. Either re-read the csv file with the argument parse_dates=["datetime"], or run the line WX["datetime"] = pd.to_datetime(WX["datetime"]). Either approach will tell pandas that the column of the spreadsheet named ‘datetime’ contains a string with date and time information.

All of a sudden, you can access very useful attributes, e.g. WX['datetime'].dt.year. Check out this cheat sheet table for datetime attributes available in this manner.
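
The conversion and the .dt accessor can be sketched on a tiny made-up frame. The values in `toy` are invented for illustration and do not come from WX_GNP.csv.

```python
import pandas as pd

# Synthetic toy data (NOT the real WX_GNP.csv).
toy = pd.DataFrame({
    "datetime": ["2018-12-31 18:00:00", "2019-01-01 06:00:00", "2019-06-15 12:00:00"],
    "ta": [-8.5, -10.0, 12.0],
})

# Before the conversion the column holds plain strings, so .dt is not available.
toy["datetime"] = pd.to_datetime(toy["datetime"])

# After the conversion, date components are accessible through .dt
# and can be used to build boolean masks.
is_2019 = toy["datetime"].dt.year == 2019
n_2019 = int(is_2019.sum())
months = toy["datetime"].dt.month.tolist()

print(n_2019)
print(months)
```

Boolean masks like `is_2019` can be passed directly to the data frame for subsetting, e.g. `toy[is_2019]`.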

#05-03: Quick ’n dirty time series
  • From WX19, create a new data frame WXzoom1 that contains the records from the first two weeks of January and any station_id of your choice.
  • Using the WXzoom1 data frame, plot a line graph of the temperature that also shows the measurements with circles and, in a different figure, plot an area curve of the incoming shortwave radiation.
Tip

If your x-axis shows integers instead of dates and times, compare WXzoom1 before and after running the following line of code: WXzoom1.set_index('datetime', inplace=True). What changed? Run your plot command again. Do you now see a proper time series representation with time on the x-axis?
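
A minimal sketch of what changes, using an invented three-row frame rather than WXzoom1. It uses the non-inplace form of .set_index() so that both versions can be compared side by side.

```python
import pandas as pd

# Synthetic toy data (NOT the real WX_GNP.csv).
toy = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2019-01-01 06:00", "2019-01-01 12:00", "2019-01-01 18:00"]
    ),
    "ta": [-12.0, -7.5, -9.0],
})

# By default the row labels are consecutive integers.
print(type(toy.index).__name__)

# set_index() moves the column into the row labels and returns a new frame.
indexed = toy.set_index("datetime")
print(type(indexed.index).__name__)
print(indexed.columns.tolist())
```

Since pandas plots against the index by default, a call such as `indexed['ta'].plot()` places the time stamps on the x-axis.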

#05-04: Summary stats
  • From WX19, create a new data frame WXzoom2 that contains the records from the first two weeks of January and the first two station_id’s in WX19['station_id'].unique().
  • Using the WXzoom2 data frame, what is the average temperature of each of the stations?
  • Using the WXzoom2 data frame, plot the temperature curves of both stations into one figure.
Tip
  1. You might be tempted to use some sort of syntax like ... in ... to filter for the relevant station_id’s. Why does that not work? Instead, use the pandas method .isin().
  2. Use the .groupby() method to compute the result of question 2 in one line of code.
  3. Either,
  • use the .pivot() method to reshape the data frame, so that the temperature at each station has its own column. Apply the .plot() method to the reshaped data frame,
  • or alternatively, create a figure and axes using plt.subplots() and then fill the axes with the two lines one after another by subsetting WXzoom2 to the individual station_id’s.
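
The three methods named in these tips can be sketched on a made-up frame. The stations and temperatures in `toy` are invented for illustration; the variable names `wanted`, `subset`, `mean_ta`, and `wide` are placeholders, not part of the assignment.

```python
import pandas as pd

# Synthetic toy data (NOT the real WX_GNP.csv): three invented stations.
toy = pd.DataFrame({
    "datetime": pd.to_datetime(["2019-01-01", "2019-01-02"] * 3),
    "station_id": ["A", "A", "B", "B", "C", "C"],
    "ta": [-10.0, -6.0, -4.0, -2.0, 1.0, 3.0],
})

# First two station ids in order of appearance.
wanted = toy["station_id"].unique()[:2]

# .isin() tests each row's value against the collection and returns a boolean mask.
subset = toy[toy["station_id"].isin(wanted)]

# Split the rows by station, then average the temperature within each group.
mean_ta = subset.groupby("station_id")["ta"].mean()

# Reshape from long to wide: one row per time stamp, one column per station.
wide = subset.pivot(index="datetime", columns="station_id", values="ta")

print(mean_ta)
print(wide)
```

Calling `.plot()` on a wide frame like `wide` draws one line per column, which is why the reshaping step helps with the combined figure.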
#05-05: Summary stats
  • Using the WX19 data frame, compute which station has the maximum total incoming shortwave radiation.
  • Create a plot as similar as possible to the following plot.

Tip
  1. You should be able to compute the total irradiance for each station based on all the tips you got up until here. The step from the total irradiance at each station to getting the station_id with the maximum total irradiance is a bit confusing and tricky. Don’t get hung up here.
  2. If you made it to here, you can create the figure no problem! Look into options to plt.subplots(), the axes method .fill_between(), and control the transparency of the shaded areas with the alpha argument to your plot methods.
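
One possible way to get from per-station totals to the station label is the pandas method .idxmax(), which returns the index label of the largest value. The sketch below uses invented numbers, not values from WX19.

```python
import pandas as pd

# Synthetic toy data (NOT the real WX_GNP.csv).
toy = pd.DataFrame({
    "station_id": ["A", "A", "B", "B", "C", "C"],
    "iswr": [100.0, 150.0, 300.0, 250.0, 50.0, 75.0],
})

# Sum per station: the result is a Series indexed by station_id.
total = toy.groupby("station_id")["iswr"].sum()

# .max() would return the largest total; .idxmax() returns its index label,
# which here is the station_id.
top_station = total.idxmax()

print(total)
print(top_station)
```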