import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.read_csv("data/bouldercreek_09_2013.txt", skiprows=25, sep='\t')
df.head()
df.shape
df2 = df.iloc[1:2876,2:5].copy()
df2
Have a look at numpy.random documentation. Choose a distribution you have no familiarity with, and try to sample from and visualize it.
from numpy.random import default_rng
rng = default_rng()
dh = rng.dirichlet((10, 5, 3), 20).transpose()
dh
plt.barh(range(20), dh[0])
plt.barh(range(20), dh[1], left=dh[0], color='g')
plt.barh(range(20), dh[2], left=dh[0] + dh[1], color='r')
plt.title("Lengths of Strings")
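Each Dirichlet draw is a vector of non-negative proportions that sums to 1, which is why every stacked bar in the plot has the same total length. A quick check, as a minimal sketch: the seed value and the names `samples` and `row_sums` are arbitrary choices for illustration, not part of the exercise.

```python
import numpy as np

# Seeded generator so the check is repeatable; the seed is an arbitrary choice.
rng = np.random.default_rng(42)

# 20 draws from a 3-component Dirichlet, same parameters as the cell above.
samples = rng.dirichlet((10, 5, 3), 20)

# Each row is one draw; its components are proportions that add up to 1.
row_sums = samples.sum(axis=1)
print(samples.shape)
print(np.allclose(row_sums, 1.0))
```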
Load the streamgage data set with Pandas, subset the week of the 2013 Front Range flood (September 11 through 15), and create a hydrograph (line plot) of the discharge data using Pandas, linking it to an empty matplotlib ax object. Create a second axis that displays the whole dataset. Adapt the title and axes' labels using matplotlib.
discharge = pd.read_csv("data/bouldercreek_09_2013.txt",
                        skiprows=27, delimiter="\t",
                        names=["agency", "site_id", "datetime",
                               "timezone", "discharge", "discharge_cd"])
# Parse the timestamps so they can be compared against dates and plotted on a time axis.
discharge["datetime"] = pd.to_datetime(discharge["datetime"])
# Keep records from September 11 onward, up to (not including) September 15.
front_range = discharge[(discharge["datetime"] >= "2013-09-11") &
                        (discharge["datetime"] < "2013-09-15")]
fig, ax = plt.subplots()
front_range.plot(x="datetime", y="discharge", ax=ax)
ax.set_xlabel("")  # no label
ax.set_ylabel("Discharge, cubic feet per second")
ax.set_title("Front Range flood event 2013")
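The boolean mask used for the subset can also be written as a label slice once the timestamps are the index. The sketch below compares the two on a stand-in table. The frame `demo` and its values are invented placeholders (assumptions for illustration), not the streamgage record.

```python
import numpy as np
import pandas as pd

# Stand-in for the streamgage table: 15-minute timestamps across September 2013.
# The discharge values are invented placeholders, not real measurements.
stamps = pd.date_range("2013-09-01", "2013-09-30 23:45", freq="15min")
demo = pd.DataFrame({"datetime": stamps, "discharge": np.arange(len(stamps))})

# Boolean mask, same form as the cell above: lower bound inclusive, upper bound exclusive.
by_mask = demo[(demo["datetime"] >= "2013-09-11") & (demo["datetime"] < "2013-09-15")]

# Label slice on a DatetimeIndex: both end dates are included in full.
by_label = demo.set_index("datetime").loc["2013-09-11":"2013-09-14"]

print(len(by_mask), len(by_label))
```

Both selections cover the same four calendar days. The label slice is shorter to write, while the mask makes the exclusive upper bound explicit.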
discharge = pd.read_csv("data/bouldercreek_09_2013.txt",
                        skiprows=27, delimiter="\t",
                        names=["agency", "site_id", "datetime",
                               "timezone", "flow_rate", "height"])
# Parse the timestamps; without this the x axis is treated as plain strings.
discharge["datetime"] = pd.to_datetime(discharge["datetime"])
fig, ax = plt.subplots()
flood = discharge[(discharge["datetime"] >= "2013-09-11") &
                  (discharge["datetime"] < "2013-09-15")]
# Inset axes for the whole dataset: [left, bottom, width, height] in figure fractions.
ax2 = fig.add_axes([0.65, 0.575, 0.25, 0.3])
flood.plot(x="datetime", y="flow_rate", ax=ax)
discharge.plot(x="datetime", y="flow_rate", ax=ax2)
ax2.legend().set_visible(False)
ax.set_xlabel("")  # no label
ax.set_ylabel("Discharge, cubic feet per second")
ax.legend().set_visible(False)
ax.set_title("Front Range flood event 2013")
Check the documentation of the savefig method and find out how to comply with journals that require figures as PDF files with dpi >= 300.
fig.savefig?
fig.savefig("figure1.pdf", dpi=300)
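As a small self-contained sketch of the same call: the output format can be stated explicitly with `format="pdf"` instead of being inferred from the file extension, and `bbox_inches="tight"` trims surrounding whitespace. In a PDF, lines and text are stored as vectors, so `dpi` mainly affects any rasterized elements. The plotted numbers, the figure size, and the in-memory buffer are assumptions made for illustration only.

```python
import io

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))  # figure size in inches; an arbitrary choice
ax.plot([0, 1, 2], [0, 1, 4])  # placeholder data, not the streamgage record

# Write the PDF into memory instead of a file so the sketch leaves nothing on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="pdf", dpi=300, bbox_inches="tight")
pdf_bytes = buffer.getvalue()

# A PDF file starts with the marker %PDF.
print(pdf_bytes[:4])
```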
Display your data using one or more plot types from the example gallery. Which ones to choose will depend on the content of your own data file. If you are using the streamgage file bouldercreek_09_2013.txt, you could make a histogram of the number of days with a given mean discharge, use bar plots to display daily discharge statistics, or explore the different ways matplotlib can handle dates and times for figures.
discharge = pd.read_csv("data/bouldercreek_09_2013.txt",
                        skiprows=27, delimiter="\t",
                        names=["agency", "site_id", "datetime",
                               "timezone", "discharge", "discharge_cd"])
discharge.head()
# Count how many records were logged at each distinct discharge value.
disp = discharge.groupby("discharge")["datetime"].count().reset_index(name="count")
disp
disp["discharge"] = disp["discharge"].astype(object)
fig, ax = plt.subplots()
disp.plot(x="discharge", y="count", kind="bar", ax=ax)
ax.set_xlabel("Discharge, cubic feet per second")
ax.set_ylabel("Number of records")
ax.set_title("Number of records per discharge value")
ax.tick_params(labelsize=8, pad=8)
fig.savefig("figure2.pdf", dpi=300)
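The exercise prompt also suggests working with daily mean discharge. One possible way to get one value per day is `resample("D").mean()` on a series indexed by timestamp. The sketch below uses an invented three-day series (the values 100, 200, and 300 are placeholders, not measurements) so the result is easy to verify by eye.

```python
import numpy as np
import pandas as pd

# Placeholder 15-minute series covering three days; the values are invented.
index = pd.date_range("2013-09-11", periods=3 * 96, freq="15min")
series = pd.Series(np.repeat([100.0, 200.0, 300.0], 96), index=index, name="discharge")

# Collapse the 96 readings of each calendar day into a single mean value.
daily_mean = series.resample("D").mean()
print(daily_mean.tolist())
```

With the real table, the same pattern would apply after parsing the `datetime` column and setting it as the index. `daily_mean.plot(kind="hist", ax=ax)` would then draw the histogram of days per mean discharge.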