This week, we have two main objectives. First, we want to install our own development Python package. By setting up a local development package that contains useful math equations and functions, and running our computations and visualizations in a separate notebook, this assignment demonstrates how a larger project could be split into packaged code and analysis code. This will keep things organized and tidy, not only for yourself but also for future thesis supervisors and other collaborators.
Second, we continue working with our module solar to run some real-life calculations. Overall, these exercises give you the opportunity to practise writing and applying your own functions, creating and working with a DataFrame that has a datetime index, manipulating data, and plotting it with different plot types, from quick working plots to polished figures.
Your own development package
Exercise #07-01: mytoolbox
Download mytoolbox and unzip it into your directory $SCIPRO. Install the local package mytoolbox into your conda environment scipro2024. You will find more detailed information on how to do this in the lecture material on packages.
In your directory $SCIPRO/10_unit, start a new notebook and see whether you can import the module solar from mytoolbox by
from mytoolbox import solar
If that worked and you have access to all the functionalities that come with solar, add a new variable to solar: check123 = "Successfully installed local package in dev mode!". Save the file.
Go back to your notebook and reload solar like this:
from importlib import reload
reload(solar)
Run solar.check123. Do you see the string? Do you understand what that means?
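If you want to see the reload mechanism in isolation, here is a minimal sketch that does not touch mytoolbox at all. It writes a throwaway module named demo_module (a hypothetical stand-in for solar.py) into a temporary directory, imports it, edits the file on disk, and reloads it. The names demo_module, VALUE and the temporary directory are assumptions made for this illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module on disk (hypothetical stand-in for solar.py)
tmp_dir = Path(tempfile.mkdtemp())
module_file = tmp_dir / "demo_module.py"
module_file.write_text("VALUE = 1\n")

# Make the temporary directory importable and import the module
sys.path.append(str(tmp_dir))
importlib.invalidate_caches()
import demo_module

had_check_before = hasattr(demo_module, "check123")
print("check123 available before edit:", had_check_before)

# Edit the file on disk, as you would do in your editor
module_file.write_text('VALUE = 1\ncheck123 = "edited on disk"\n')

# Without reload, the running session still holds the old version;
# reload re-executes the file and updates the existing module object
importlib.reload(demo_module)
print("check123 after reload:", demo_module.check123)
```

The same idea applies to solar: the new variable only shows up in the running notebook after the module has been reloaded (or the kernel restarted).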
Note: Don’t get hung up if you can’t get the local package installed in development mode! Try the method of modifying sys.path by adding the directory that contains the solar module to your path (again, see lecture material on packages for more details. hint: sys.path.append()). If that doesn’t work either, just copy and paste solar.py into your current working directory.
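A minimal sketch of the sys.path fallback mentioned in the note. The directory used here is purely hypothetical; replace it with the folder on your machine that actually contains solar.py.

```python
import sys
from pathlib import Path

# Hypothetical location, adjust to the folder that contains solar.py
module_dir = Path.home() / "scipro" / "mytoolbox" / "mytoolbox"

# Append the folder only once, so re-running the cell does not add duplicates
if str(module_dir) not in sys.path:
    sys.path.append(str(module_dir))

print(sys.path[-1])
# Afterwards, `import solar` searches this folder as well
# (it only succeeds if solar.py really lives there)
```

Note that this change only lasts for the current Python session, so the cell has to run at the top of every notebook that needs the module.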
With all the equations packed into our module and/or package, we can continue to answer and visualize some pretty cool questions. Start a new notebook and solve the remaining exercises there.
More exercises on the methods from previous Units
#10-02: Applying our module solar
We want to look at the entire year 2023 in Dornbirn. Create a Pandas DataFrame indexed by the datetime of the year in hourly sampling.
Use the functions of our solar module to compute new columns of the DataFrame: the hourangle, the declination, the solar elevation angle, the solar azimuth, and also the clear-sky irradiance. Use the latitude and longitude of Dornbirn as defined by the variables LAT_DO and LON_DO in solar.
When computing the DataFrame, I get a runtime warning that there were invalid values encountered in a function call to arccos. This means we have to expect some missing values in our data set. Use the DataFrame methods .isna() and .sum() to check how many NaNs our data set contains and which columns are affected.
The remaining exercises below all use the DataFrame created in the previous exercise. It is the same DataFrame, solar_Dornbirn.csv, that we already used in Assignment #08. So, if you have trouble solving exercise #10-02, just copy the csv file over from the other Unit.
#10-03: Idealized conditions in Dornbirn
Let’s look a bit deeper into the idealized clear-sky irradiance in Dornbirn over 2023.
Create quick working plots like the following ones without spending time to style them etc. You just want to look at the data as conveniently and quickly as possible.
Compute the maximum and median clear-sky irradiance for Dornbirn each day in 2023. Similarly to your working plots above, apply the Pandas .plot() method (here, .plot.area()) to create a plot that you then style a little bit with legend, title, ylabel. Make it look as close as possible to this one:
Note that the circles representing monthly means are a bit of a tricky part. Here are a few tips:
First resample to monthly sampling (using sampling frequency 'MS') and compute the average.
You will then get a Series with the desired values at the index values ‘start of each month’.
To display the circles at the middle of each month in the graph, add a time offset of 14 days to the index of your resampled monthly mean Pandas Series.
#10-04: Histograms
We stick with the same DataFrame, but want to look at the data in a different way.
We start simple and create some very quick working plots: a histogram of the solar elevation angle and one of the solar azimuth.
Now that you have your working plots, let’s modify them a bit. Create three panels next to each other that share the same y-axis. Try to emulate the following plot as closely as possible.
Working with angles lends itself to plotting on a polar axis. There is nothing like a polar histogram readily available in Matplotlib, but a quick Internet search highlights several solutions that adapt the Matplotlib bar chart on a polar axis to your needs. An example with random data for your inspiration:
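One possible way to build such a polar histogram is sketched here with random angles, not the Dornbirn data. The counts come from np.histogram and are drawn as bars on a polar axis. The bin count, the compass-style orientation (0° at the top, clockwise) and the title are choices made for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in data: 1000 angles in degrees, uniformly spread over the circle
rng = np.random.default_rng(42)
angles_deg = rng.uniform(0.0, 360.0, size=1000)

# Bin the angles (in radians) into equally wide sectors
n_bins = 24
counts, bin_edges = np.histogram(
    np.deg2rad(angles_deg), bins=n_bins, range=(0.0, 2.0 * np.pi)
)
widths = np.diff(bin_edges)

# Draw one bar per sector on a polar axis
fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.bar(bin_edges[:-1], counts, width=widths, align="edge", edgecolor="black")

# Compass-like orientation: 0 degrees at the top, angles increasing clockwise
ax.set_theta_zero_location("N")
ax.set_theta_direction(-1)
ax.set_title("Polar histogram of random angles")
```

For real data with missing values, drop the NaNs before calling np.histogram.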
Exercise #10-05: Box plot with multiple artists
This exercise will help you to refine your skills in reshaping data frames to create working plots with multiple artists efficiently.
Extract the month information from the datetime index and write it to a new column named “month”.
Create a wide data frame with month as columns and elevation angle h as values using the .pivot() method.
Try to reproduce the following figure
Solutions
Solution: Notebook
#10-02: Applying our module solar
We want to look at the entire year 2023 in Dornbirn. Create a Pandas DataFrame indexed by the datetime of the year in hourly sampling.
Use the functions of our solar module to compute new columns of the DataFrame: the hourangle, the declination, the solar elevation angle, the solar azimuth, and also the clear-sky irradiance. Use the latitude and longitude of Dornbirn as defined by the variables LAT_DO and LON_DO in solar.
When computing the DataFrame, I get a runtime warning that there were invalid values encountered in a function call to arccos. This means we have to expect some missing values in our data set. Use the DataFrame methods .isna() and .sum() to check how many NaNs our data set contains and which columns are affected.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mytoolbox import solar

# Generate a datetime series for all of 2023 with hourly sampling
start_date = '2023-01-01'
end_date = '2023-12-31'
datetime = pd.date_range(start=start_date, end=end_date, freq='h')
DF = pd.DataFrame(index=datetime)
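One detail worth checking when building an hourly index: a date-only end point such as '2023-12-31' is interpreted as midnight at the start of that day, so the range stops at 2023-12-31 00:00. The sketch below compares that variant with one that spells out the last hour of the year. The variable names are chosen for this illustration only.

```python
import pandas as pd

# Date-only end point: the last timestamp is 2023-12-31 00:00
index_date_only = pd.date_range(start="2023-01-01", end="2023-12-31", freq="h")

# Explicit last hour: the range covers all 365 days of 2023
index_full_year = pd.date_range(
    start="2023-01-01 00:00", end="2023-12-31 23:00", freq="h"
)

print(len(index_date_only), index_date_only[-1])
print(len(index_full_year), index_full_year[-1])
```

A full non-leap year in hourly sampling has 365 × 24 = 8760 time stamps, which is a handy sanity check for the length of your index.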
tau 0
declination 0
h 0
alpha 287
iswr_clearsky 0
dtype: int64
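Per-column counts like the ones above can be produced by chaining .isna() and .sum(). Here is a minimal sketch on a toy DataFrame, not the Dornbirn data. The NaNs are created on purpose by passing values outside [-1, 1] to np.arccos, which mimics the runtime warning mentioned in the exercise. The column names are only borrowed for illustration.

```python
import numpy as np
import pandas as pd

# arccos is only defined on [-1, 1]; values outside that interval yield NaN.
# np.errstate silences the "invalid value encountered" warning for this demo.
with np.errstate(invalid="ignore"):
    angles = np.arccos(np.array([0.5, 1.0, 1.0000001, -1.2]))

toy = pd.DataFrame({"alpha": angles, "h": [10.0, 20.0, 30.0, 40.0]})

# .isna() marks each missing cell with True, .sum() counts the True values per column
nan_counts = toy.isna().sum()
print(nan_counts)
```

Tiny floating-point overshoots such as 1.0000001 are a common reason why arccos returns NaN in trigonometric formulas.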
#10-03: Idealized conditions in Dornbirn
Let’s look a bit deeper into the idealized clear-sky irradiance in Dornbirn over 2023.
Create quick working plots like the following ones without spending time to style them etc. You just want to look at the data as conveniently and quickly as possible.
Compute the maximum and median clear-sky irradiance for Dornbirn each day in 2023. Similarly to your working plots above, apply the Pandas .plot() method (here, .plot.area()) to create a plot that you then style a little bit with legend, title, ylabel. Make it look as close as possible to this one:
Note that the circles representing monthly means are a bit of a tricky part. Here are a few tips:
First resample to monthly sampling (using sampling frequency 'MS') and compute the average.
You will then get a Series with the desired values at the index values ‘start of each month’.
To display the circles at the middle of each month in the graph, add a time offset of 14 days to the index of your resampled monthly mean Pandas Series.
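The resampling steps from the tips can be sketched as follows. The example uses a synthetic random series instead of the real clear-sky irradiance, so only the mechanics carry over, not the values. The series name iswr_clearsky is borrowed from the DataFrame column, and the plotting itself is left out.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series for 2023 (random numbers, NOT real irradiance)
index = pd.date_range("2023-01-01 00:00", "2023-12-31 23:00", freq="h")
rng = np.random.default_rng(0)
series = pd.Series(rng.random(len(index)), index=index, name="iswr_clearsky")

# Daily maximum and median in one step: one row per day, one column per statistic
daily = series.resample("D").agg(["max", "median"])

# Monthly mean, labelled at the start of each month ('MS' = month start)
monthly_mean = series.resample("MS").mean()

# Shift the labels by 14 days so that markers land near the middle of each month
monthly_mean.index = monthly_mean.index + pd.Timedelta(days=14)

print(daily.head())
print(monthly_mean.head())
```

After the shift, the first label is 2023-01-15 instead of 2023-01-01, which is where the circle for January would be drawn.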
#10-04: Histograms
We stick with the same DataFrame, but want to look at the data in a different way.
We start simple and create some very quick working plots: a histogram of the solar elevation angle and one of the solar azimuth.
DF['h'].plot.hist()
<Axes: ylabel='Frequency'>
DF['alpha'].plot.hist()
<Axes: ylabel='Frequency'>
Now that you have your working plots, let’s modify them a bit. Create three panels next to each other that share the same y-axis. Try to emulate the following plot as closely as possible.
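A rough sketch of the layout mechanics for three side-by-side panels with a shared y-axis, using synthetic data. The content of the third panel (elevation angles above the horizon only) and all labels are assumptions made for this illustration and not necessarily what the target figure shows.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data with a few missing azimuth values
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "h": rng.uniform(-60.0, 65.0, size=500),
    "alpha": rng.uniform(0.0, 360.0, size=500),
})
toy.loc[::50, "alpha"] = np.nan

# One row, three columns; sharey=True links the y-axes of all panels
fig, axes = plt.subplots(1, 3, sharey=True, figsize=(12, 4))

axes[0].hist(toy["h"], bins=20)
axes[0].set_xlabel("Solar elevation angle (deg)")
axes[0].set_ylabel("Frequency")

# Drop NaNs explicitly before handing the data to Matplotlib
axes[1].hist(toy["alpha"].dropna(), bins=20)
axes[1].set_xlabel("Solar azimuth (deg)")

# Hypothetical third panel: only elevation angles above the horizon
daytime = toy.loc[toy["h"] > 0, "h"]
axes[2].hist(daytime, bins=20)
axes[2].set_xlabel("Solar elevation angle, h > 0 (deg)")

fig.tight_layout()
```

Because the y-axes are shared, the y-label only needs to be set on the leftmost panel.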
Exercise #10-05: Box plot with multiple artists
This exercise will help you to refine your skills in reshaping data frames to create working plots with multiple artists efficiently.
Extract the month information from the datetime index and write it to a new column named “month”.
Create a wide data frame with month as columns and elevation angle h as values using the .pivot() method.
Try to reproduce the following figure
DF['month'] = DF.index.month
DF_wide = DF.pivot(columns='month', values='h')
DF_wide.plot.box()

# The exact figure including the styles:
# fig, ax = plt.subplots(figsize=(6, 4))
# DF_wide.plot.box(ax=ax)
# ax.set_ylabel(r"Solar elevation angle ($^\circ$)")
# ax.set_xlabel("Month of year")
# ax.set_title("Variation in solar elevation angle over the entire year")
# plt.savefig("boxplot_series.png")
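To see what the pivot step does to the shape of the data, here is a small sketch on a four-row toy frame with made-up values. The existing datetime index is kept as the row index, each month becomes a column, and every row holds a value only in the column of its own month.

```python
import pandas as pd

# Toy frame with made-up elevation angles on four dates in three months
index = pd.to_datetime(["2023-01-15", "2023-01-16", "2023-02-15", "2023-03-15"])
toy = pd.DataFrame({"h": [10.0, 12.0, 20.0, 30.0]}, index=index)
toy["month"] = toy.index.month

# Long to wide: rows stay the datetime index, columns become the months
wide = toy.pivot(columns="month", values="h")
print(wide)

# Number of actual (non-NaN) values per month column
values_per_month = wide.notna().sum()
print(values_per_month)
```

In this wide layout each column is a separate data set, so the box plot in the solution shows one box per month.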