import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Correction of 2nd exam 2024/25
NES—Programmiertechniken (Introduction to scientific programming)
21.03.2025
Please work through the following tasks. You have 75 minutes to complete the exam.
Make sure you have executed all relevant code cells and saved the notebook before the end of the exam.
This is a correction of the exam questions. As always in programming, there are many ways to achieve similar results. My suggestions here follow the coding exercises we discussed in class.
If you want to solve this exam as an exercise, download the uncorrected notebook.
Question 1: Multiple-choice (5 points)
To tick correct answers, replace - [ ] with - [x]
Careful: Wrongly ticked answers will give negative points!
(a) Loops and vectorization
Which of the following statements are true? (2.5 points)
(b) Indexing
Which of the statements about the following code block are true? (2.5 points)
a = np.arange(12).reshape((3, 4))
b = a * 2
nz = np.nonzero(a > 8)
b[nz] = a[nz]
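To check statements about this cell, it can help to inspect the intermediate values. The following sketch reruns the cell and prints what `np.nonzero` returns; the comments describe the values for this specific 3×4 array only.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))   # values 0..11 in 3 rows and 4 columns
b = a * 2                           # a new array; a itself is not modified
nz = np.nonzero(a > 8)              # tuple: (row indices, column indices)
b[nz] = a[nz]                       # overwrite only the selected elements of b

print(nz)   # row indices [2 2 2], column indices [1 2 3]
print(b)    # the last row becomes [16  9 10 11]
```

Because `nz` is a tuple with one index array per dimension, it can be used directly to index any array of the same shape.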
Question 2: Spreadsheet data
- Read the dataset global_energy.csv into a DataFrame called df. (1 point)
df = pd.read_csv('global_energy.csv')
df
 | Country | Year | Renewable Energy Share (%) | Fossil Fuel Dependency (%) | Industrial Energy Use (%) |
---|---|---|---|---|---|
0 | Canada | 2018 | 50.38 | 49.540 | 43.39 |
1 | Germany | 2018 | 53.14 | 42.270 | 43.11 |
2 | Russia | 2018 | 48.71 | 43.800 | 33.10 |
3 | Brazil | 2018 | 52.16 | 45.240 | 36.70 |
4 | UK | 2018 | 45.28 | 41.320 | 38.89 |
... | ... | ... | ... | ... | ... |
245 | India | 2000 | 42.31 | 48.380 | 39.11 |
246 | Australia | 2000 | 56.51 | 45.100 | 35.06 |
247 | China | 2000 | 45.87 | 44.365 | 36.89 |
248 | USA | 2000 | 47.52 | 48.900 | 38.89 |
249 | Japan | 2000 | 36.53 | 48.360 | 42.95 |
250 rows × 5 columns
- How many unique countries does the data set contain? (1 point)
countries = df['Country'].unique().shape[0]
- What are the oldest and most recent years recorded in the DataFrame? (1 point)
df['Year'].min(), df['Year'].max()
(np.int64(2000), np.int64(2024))
- How many rows does the DataFrame contain with ‘Renewable Energy Share (%)’ more than 50 % and at the same time ‘Fossil Fuel Dependency (%)’ lower than 50 %? (2 points)
np.sum((df['Renewable Energy Share (%)'] > 50) & (df['Fossil Fuel Dependency (%)'] < 50))
np.int64(65)
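The counting pattern used here can be tried on a small example. This is a minimal sketch with made-up numbers; the DataFrame `toy` and its column names are hypothetical and not taken from global_energy.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from global_energy.csv
toy = pd.DataFrame({
    "renewable": [55.0, 48.0, 61.0, 52.0],
    "fossil": [45.0, 40.0, 53.0, 49.0],
})

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds more tightly than > and <.
mask = (toy["renewable"] > 50) & (toy["fossil"] < 50)

# True counts as 1, so summing the mask counts the matching rows.
n_rows = int(mask.sum())
print(n_rows)
```

`np.sum(mask)` and `mask.sum()` give the same count; which one to use is a matter of taste.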
- Create a subset df_sub of the DataFrame df that contains only records from Canada and USA, and only the following columns: ‘Year’, ‘Renewable Energy Share (%)’, ‘Fossil Fuel Dependency (%)’. Make sure you do this in a way that avoids any potential side effects. (2 points)
df_sub = df.loc[df['Country'].isin(['Canada', 'USA']), ['Year', 'Renewable Energy Share (%)', 'Fossil Fuel Dependency (%)']].copy()
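The explicit `.copy()` is what avoids side effects: the subset becomes an independent object. The sketch below illustrates this on a hypothetical toy DataFrame (`toy`, `sub`, and the column `share` are made-up names for illustration).

```python
import pandas as pd

# Hypothetical toy data, not taken from global_energy.csv
toy = pd.DataFrame({
    "Country": ["Canada", "USA", "Japan"],
    "Year": [2018, 2018, 2018],
    "share": [50.0, 47.0, 36.0],
})

# Select rows and columns in a single .loc call, then copy explicitly.
sub = toy.loc[toy["Country"].isin(["Canada", "USA"]), ["Year", "share"]].copy()

# Modifying the copy leaves the original DataFrame untouched.
sub["share"] = 0.0
print(toy["share"].tolist())
```

Without the copy, a later assignment to the subset may trigger a SettingWithCopyWarning, depending on the pandas version.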
- Using df_sub and the groupby method, create a quick working plot like the following: (2 points)
df_annual = df_sub.groupby('Year').median()
df_annual.plot(title='Median percentages in USA and Canada')
plt.show()
- The following code cell computes a pandas Series range_renewables. Compute a NumPy array range_renewables_loop that contains the same values as range_renewables but use a for-loop for the computation instead of the groupby method. (2 points)
df_min = df.groupby('Country').min()
df_max = df.groupby('Country').max()
range_renewables = df_max['Renewable Energy Share (%)'] - df_min['Renewable Energy Share (%)']
range_renewables_loop = []
for country in df['Country'].unique():
    renewables = df.loc[df['Country'].isin([country]), 'Renewable Energy Share (%)']
    range_renewables_loop.append(renewables.max() - renewables.min())
range_renewables_loop = np.array(range_renewables_loop)
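One detail is worth keeping in mind when comparing the two results: groupby sorts the group keys alphabetically by default, whereas unique() returns the countries in order of first appearance. The values agree, but their order can differ. A minimal sketch with hypothetical toy data (`toy`, `share`, `by_group`, and `by_loop` are made-up names):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from global_energy.csv
toy = pd.DataFrame({
    "Country": ["USA", "Canada", "USA", "Canada"],
    "share": [40.0, 50.0, 47.0, 52.0],
})

# groupby result: index sorted alphabetically (Canada, USA)
grouped = toy.groupby("Country")["share"]
by_group = grouped.max() - grouped.min()

# Loop result: order of first appearance (USA, Canada)
countries = toy["Country"].unique()
by_loop = []
for country in countries:
    values = toy.loc[toy["Country"] == country, "share"]
    by_loop.append(values.max() - values.min())
by_loop = np.array(by_loop)

# Reorder the groupby result by the loop's country order before comparing.
print(np.allclose(by_group.loc[countries].to_numpy(), by_loop))
```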
- Do the equivalent computation of the following code cell using df and range_renewables_loop to display the country. (1 point)
Tip: If you cannot remember the equivalent NumPy method, briefly skim the documentation of idxmax for help.
range_renewables.idxmax()
'Germany'
df['Country'].unique()[range_renewables_loop.argmax()]
'Germany'
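The difference between the two methods is that idxmax returns an index label, while argmax returns an integer position that still has to be mapped back to a label. A small sketch with hypothetical values (`labels`, `values`, and `series` are made-up names):

```python
import numpy as np
import pandas as pd

# Hypothetical values, for illustration only
labels = np.array(["A", "B", "C"])
values = np.array([3.0, 9.0, 5.0])
series = pd.Series(values, index=labels)

# pandas: idxmax returns the index label of the maximum
print(series.idxmax())

# NumPy: argmax returns the position, which is then used to look up the label
print(labels[values.argmax()])
```

This only works if the label array and the value array are in the same order, which is why the solution above uses df['Country'].unique() in both the loop and the lookup.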
Question 3: datetime calculations
- Create a DataFrame df with a DatetimeIndex ranging from 2025-01-01 00:00 to 2025-12-31 23:00 in hourly sampling. (1 point)
df = pd.DataFrame(index=pd.date_range('2025-01-01 00:00', '2025-12-31 23:00', freq='h'))
- Create a column named hour_of_year that stores the hour count from 1 to the total number of hours in 2025. (1 point)
df["hour_of_year"] = np.arange(1, df.shape[0] + 1)
- Create another column sin_wave that computes the sine of hour_of_year, using the formula: (1 point)
\[ \sin\left(\frac{2\pi \cdot \text{hour\_of\_year}}{8760}\right) \]
df["sin_wave"] = np.sin((2 * np.pi * df["hour_of_year"]) / 8760)
- Find the first datetime where sin_wave is close to 0 (using the function np.isclose). (1 point)
df.index[np.isclose(df['sin_wave'], 0)][0]
Timestamp('2025-07-02 11:00:00')
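The reason for using np.isclose instead of `== 0` is floating-point rounding: at the half-year point the argument is approximately π, and the computed sine is a tiny non-zero number. The sketch below rebuilds the hourly values as plain arrays (the names `index`, `hour_of_year`, and `first` are chosen here for illustration) and compares both approaches.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2025-01-01 00:00", "2025-12-31 23:00", freq="h")
hour_of_year = np.arange(1, len(index) + 1)
sin_wave = np.sin(2 * np.pi * hour_of_year / 8760)

# Exact comparison: the sine near pi is about 1.2e-16 in floating point, not 0.
print((sin_wave == 0).any())

# np.isclose compares within a tolerance (default atol=1e-8),
# so the half-year point is detected.
first = index[np.isclose(sin_wave, 0)][0]
print(first)
```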
- Resample sin_wave to daily minimum, mean, and maximum values and plot the results of the month of September. Reproduce the following figure as closely as possible.
Tip: Use the Axes method fill_between for the shaded area artist. (7 points)
dfr = pd.DataFrame()
dfr['max'] = df['sin_wave'].resample('D').max()
dfr['min'] = df['sin_wave'].resample('D').min()
dfr['mean'] = df['sin_wave'].resample('D').mean()
dfr = dfr.loc[dfr.index.month==9]
fig, ax = plt.subplots()
ax.fill_between(dfr.index, dfr['min'], dfr['max'], alpha=0.2, label="Range")
dfr['mean'].plot(label="Average")
ax.set_title("Sine wave")
ax.set_ylabel("Unitless")
ax.legend()
ax.grid(linestyle=':', which='both')
plt.show()
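As mentioned at the start, there are many ways to reach the same result. One possible alternative for the resampling step, shown here only as a sketch, is a single resample call with `agg` followed by partial string indexing for September. The names `daily` and `september` are chosen for illustration, and the plotting part is left out.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2025-01-01 00:00", "2025-12-31 23:00", freq="h")
df = pd.DataFrame(index=index)
df["sin_wave"] = np.sin(2 * np.pi * np.arange(1, len(index) + 1) / 8760)

# One resample call with three aggregations; the result has the
# columns min, mean, and max.
daily = df["sin_wave"].resample("D").agg(["min", "mean", "max"])

# Partial string indexing selects all days of September 2025.
september = daily.loc["2025-09"]
print(september.shape)
```

The resulting columns can be passed to fill_between and plot in the same way as the columns of dfr above.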
Question 4: Functions and Classification
- Define a function determine_phase_of_water that determines whether water is frozen or liquid at the given temperature. The function should take one positional argument temp and one keyword argument unit, which defaults to "Celsius". (6 points)
  - If unit is not one of "Celsius" or "Farenheit", raise a ValueError with a meaningful error message.
    (If you don’t know how to raise a ValueError, print a meaningful error message and return None.)
  - If temp is below 0°C or 32°F, return "solid".
  - Otherwise, return "liquid".
- Write the docstring for the function, also including the Parameters and Returns sections. (2 points)
def determine_phase_of_water(temp, unit="Celsius"):
    """Determine the phase state of water at a given temperature

    Parameters
    ----------
    temp : float
        Temperature of water
    unit : str
        Unit of given temperature

    Returns
    -------
    str
        Phase state
    """
    if unit not in ["Celsius", "Farenheit"]:
        raise ValueError("Unknown unit.")
    if unit == "Celsius" and temp < 0:
        return "solid"
    elif unit == "Farenheit" and temp < 32:
        return "solid"
    else:
        return "liquid"
## You can use this code block to check the results of your function.
# No need to change anything here.
print(determine_phase_of_water(-5))
print(determine_phase_of_water(70))
print(determine_phase_of_water(20, "Farenheit"))
try:
    print(determine_phase_of_water(273, "Kelvin"))
except ValueError as e:
    print(e)
solid
liquid
solid
Unknown unit.
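Again, other solutions are possible. The following variant is only a sketch of one alternative: it keeps the freezing points in a lookup table, so the check for valid units and the comparison share one data structure. The names `FREEZING_POINTS` and `phase_of_water_lookup` are hypothetical and not part of the exam.

```python
# Hypothetical variant: freezing points kept in a lookup table.
# The spelling "Farenheit" follows the exam text.
FREEZING_POINTS = {"Celsius": 0, "Farenheit": 32}


def phase_of_water_lookup(temp, unit="Celsius"):
    """Return "solid" below the freezing point of water, else "liquid"."""
    if unit not in FREEZING_POINTS:
        # Name the offending value and the accepted options in the message.
        raise ValueError(
            f"Unknown unit {unit!r}; expected one of {sorted(FREEZING_POINTS)}."
        )
    return "solid" if temp < FREEZING_POINTS[unit] else "liquid"


print(phase_of_water_lookup(-5))
print(phase_of_water_lookup(20, "Farenheit"))
```

Supporting a further unit would then only require a new entry in the table, and the error message tells the caller which units are accepted.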