ADM1F_SRT: Input/output sensitivity

Here we explore the relationships between inputs and outputs. In the ADM1F: Execution time example we showed how to run the model with perturbed input values from the influent.dat and param.dat files. Assuming you ran that example and produced the outputs_influent.csv and outputs_params.csv files, we use those outputs here to study the relationships between influents and outputs, and between params and outputs. If not, uncomment the code in cells [5] and [18] below and re-run the simulations.

Authors: Wenjuan Zhang and Elchin Jafarov

[1]:
import adm1f_utils as adm1fu
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

1. Influent/Output sensitivity

[2]:
# navigate to simulations folder
os.chdir('../../simulations')
[3]:
# Set the path to the ADM1F executable
ADM1F_EXE = '/Users/elchin/project/ADM1F_WM/build/adm1f'

# Set the value of percentage and sample size for lhs
percent = 0.1 # NOTE: for params percent should be <= 0.05
sample_size = 100
variable = 'influent'    # influent/params/ic
method = 'uniform'    #'uniform' or 'lhs'
[4]:
index = adm1fu.create_a_sample_matrix(variable, method, percent, sample_size)
print()
print('Number of elements participating in the sampling:', len(index))
Saves a sampling matrix [sample_size,array_size] into var_influent.csv
sample_size,array_size:  (100, 11)
Each column of the matrix corresponds to a variable perturbed 100 times around its original value
var_influent.csv SAVED!

Number of elements participating in the sampling: 11
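
For orientation, a minimal sketch of how such a perturbation matrix could be built for both supported methods is shown below (the actual logic lives in adm1fu.create_a_sample_matrix; the function name sample_matrix and the base values are hypothetical):

import numpy as np
from scipy.stats import qmc

def sample_matrix(base_values, percent, sample_size, method='uniform', seed=0):
    # Perturb each base value within +/- percent of its nominal value,
    # either with independent uniform draws or a Latin hypercube design
    base = np.asarray(base_values, dtype=float)
    lo, hi = base * (1 - percent), base * (1 + percent)
    if method == 'uniform':
        u = np.random.default_rng(seed).random((sample_size, base.size))
    else:  # 'lhs'
        u = qmc.LatinHypercube(d=base.size, seed=seed).random(sample_size)
    return lo + u * (hi - lo)

# e.g., three hypothetical influent values perturbed by 10%
demo = sample_matrix([2.5, 4.4, 140.0], percent=0.1, sample_size=100)
demo.shape   # (100, 3)
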
[5]:
# Uncomment the line below to re-run the simulations and regenerate outputs_influent.csv
#exe_time = adm1fu.adm1f_output_sampling(ADM1F_EXE, variable, index)
[6]:
output_name, output_unit = adm1fu.get_output_names()
alloutputs = pd.read_csv('outputs_influent.csv', sep=',', header=None)
alloutputs.columns = output_name
[7]:
alloutputs
[7]:
Ssu Saa Sfa Sva Sbu Spro Sac Sh2 Sch4 Sic ... Alk NH3 NH4 LCFA percentch4 energych4 efficiency VFA/ALK ACN sampleT
0 7.37820 3.30795 56.7230 6.60272 8.46032 6.26942 1908.38 0.000123 48.7774 615.799 ... 8148.98 8.64671 987.952 56.7230 57.2720 66.2091 52.7615 0.222004 116.5620 24.3594
1 7.28460 3.26614 55.8953 6.52078 8.34621 6.18289 2033.37 0.000121 48.4068 645.849 ... 8602.48 9.61694 1048.750 55.8953 56.8728 65.9860 53.8907 0.223891 92.1431 24.8179
2 6.85866 3.07582 52.1661 6.11321 7.84990 5.79147 1926.30 0.000114 47.7048 651.060 ... 8442.25 9.57599 1031.740 52.1661 56.7510 61.6157 52.8110 0.216106 125.4410 27.1630
3 6.72726 3.01709 51.0289 5.97094 7.70738 5.67153 1931.76 0.000112 47.6768 660.852 ... 8558.31 9.84058 1044.000 51.0289 56.6698 61.9098 53.9655 0.213730 116.0810 27.9786
4 7.55763 3.38810 58.3212 6.83699 8.63179 6.43593 2242.26 0.000126 49.1049 657.957 ... 9061.95 10.49270 1109.360 58.3212 57.2318 68.8331 51.4620 0.234238 77.3666 23.5162
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
95 7.65859 3.43319 59.2237 6.78604 8.84032 6.52986 1911.45 0.000128 48.9244 616.258 ... 8197.36 8.37148 985.507 59.2237 56.5753 71.6467 52.7621 0.221140 125.5700 23.0752
96 7.40606 3.32040 56.9708 6.57190 8.52763 6.29525 2049.40 0.000123 48.6287 649.584 ... 8699.15 9.58421 1054.650 56.9708 56.5357 69.7230 54.2485 0.223168 102.1390 24.2222
97 7.06582 3.16840 53.9710 6.30301 8.09601 5.98132 1778.88 0.000118 48.2397 605.950 ... 7819.47 8.22490 949.405 53.9710 57.3804 61.6582 52.9718 0.215719 129.6520 25.9740
98 6.85456 3.07399 52.1312 6.12749 7.83407 5.78776 2146.27 0.000114 48.1629 685.484 ... 9144.20 11.07250 1119.840 52.1312 56.8556 64.3136 53.8121 0.222069 88.1808 27.1834
99 7.44704 3.33870 57.3346 6.67541 8.53678 6.33321 1983.08 0.000124 48.5798 627.253 ... 8349.28 9.04090 1016.750 57.3346 57.0513 66.1387 53.8114 0.225090 103.0960 24.0295

100 rows × 67 columns
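
Before correlating inputs with outputs, it can help to check which outputs actually respond to the perturbations. One quick measure (illustrative, not part of the original workflow) is the coefficient of variation per output column:

# coefficient of variation (std/mean) per output; larger means more responsive
cv = (alloutputs.std() / alloutputs.mean().abs()).sort_values(ascending=False)
cv.head(10)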

[8]:
influent_name, influent_index = adm1fu.get_influent_names()
[9]:
# Since create_a_sample_matrix does not perturb every column of influent.dat,
# use index to pick the headers of the variables that were actually sampled
header = [influent_name[i] for i in index]

influent_inputs = pd.read_csv('var_influent.csv', sep=',', header=None)
influent_inputs.columns = header
influent_inputs.head()
[9]:
S_su_in S_aa_in S_fa_in S_ac_in S_IN_in X_ch_biom_in X_pr_biom_in X_li_biom_in X_I_in Q Temp
0 2.41679 4.54049 3.26551 1.06715 0.00794 8.21103 7.65897 5.45471 18.04704 139.57635 31.64409
1 2.71197 4.44197 2.94117 0.97991 0.00798 8.47247 8.44312 5.01331 16.95071 136.99766 32.47646
2 2.37594 4.05372 3.09329 1.10619 0.00801 8.84280 8.55680 4.62146 18.06979 125.17005 31.95536
3 2.70155 4.55292 3.31319 1.00561 0.00784 9.14260 8.30096 4.69829 17.67225 121.52161 37.86524
4 2.35939 4.30043 3.00319 1.05070 0.00860 8.26193 9.19056 5.36216 19.24420 144.58137 35.68530
[10]:
# merge influent and output datasets
inout=pd.concat([influent_inputs,alloutputs], axis=1)
inout.head()
[10]:
S_su_in S_aa_in S_fa_in S_ac_in S_IN_in X_ch_biom_in X_pr_biom_in X_li_biom_in X_I_in Q ... Alk NH3 NH4 LCFA percentch4 energych4 efficiency VFA/ALK ACN sampleT
0 2.41679 4.54049 3.26551 1.06715 0.00794 8.21103 7.65897 5.45471 18.04704 139.57635 ... 8148.98 8.64671 987.952 56.7230 57.2720 66.2091 52.7615 0.222004 116.5620 24.3594
1 2.71197 4.44197 2.94117 0.97991 0.00798 8.47247 8.44312 5.01331 16.95071 136.99766 ... 8602.48 9.61694 1048.750 55.8953 56.8728 65.9860 53.8907 0.223891 92.1431 24.8179
2 2.37594 4.05372 3.09329 1.10619 0.00801 8.84280 8.55680 4.62146 18.06979 125.17005 ... 8442.25 9.57599 1031.740 52.1661 56.7510 61.6157 52.8110 0.216106 125.4410 27.1630
3 2.70155 4.55292 3.31319 1.00561 0.00784 9.14260 8.30096 4.69829 17.67225 121.52161 ... 8558.31 9.84058 1044.000 51.0289 56.6698 61.9098 53.9655 0.213730 116.0810 27.9786
4 2.35939 4.30043 3.00319 1.05070 0.00860 8.26193 9.19056 5.36216 19.24420 144.58137 ... 9061.95 10.49270 1109.360 58.3212 57.2318 68.8331 51.4620 0.234238 77.3666 23.5162

5 rows × 78 columns

The correlation heatmap below shows that four influents have the strongest impact on the results: X_ch_biom_in, X_pr_biom_in, X_li_biom_in, and Q.

[11]:
corr=inout.corr()
plt.figure(figsize=(21,5))
sns.heatmap(corr.iloc[0:11,11:-1], cmap=sns.diverging_palette(220, 10, as_cmap=True))
plt.title('Correlation Matrix [influent/results]',fontsize=16);
plt.ylabel('Influent',fontsize=16)
plt.xlabel('Outputs',fontsize=16)
[11]:
Text(0.5, 23.09375, 'Outputs')
../_images/jupyter_notebook_io_sensitivity_analysis_13_1.png
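
The four influents named above can also be read off programmatically. As a sketch, rank each influent by its strongest absolute correlation with any output, reusing the corr slicing from the cell above:

# strongest absolute correlation of each influent with any output
impact = corr.iloc[0:11, 11:-1].abs().max(axis=1).sort_values(ascending=False)
impact.head(4)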

Let’s merge the pH values from the results with the influents and explore the correlations.

[12]:
influent_ph=pd.concat([influent_inputs,alloutputs[' pH ']], axis=1)
[13]:
plt.figure(figsize=(10,4))
influent_ph.corr().iloc[-1].plot(linewidth=2)
plt.title('Correlation plot (pH)')
plt.xlabel('Influents')
plt.ylabel('Correlation')
plt.ylim([-1,1])
[13]:
(-1.0, 1.0)
../_images/jupyter_notebook_io_sensitivity_analysis_16_1.png
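
The scatter plot in the next cell uses X_pr_biom_in. One way to justify that choice programmatically (a small illustrative step) is to pick the influent with the largest absolute correlation to pH:

# influent most strongly correlated (in absolute value) with pH
ph_corr = influent_ph.corr()[' pH '].drop(' pH ')
ph_corr.abs().idxmax()
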
[14]:
plt.scatter(influent_ph['X_pr_biom_in'], influent_ph[' pH '])
plt.xlabel('X_pr_biom_in',fontsize=18)
plt.ylabel('pH',fontsize=18);
../_images/jupyter_notebook_io_sensitivity_analysis_17_0.png
[15]:
from pandas.plotting import scatter_matrix
scatter_matrix(influent_ph, alpha=0.6,figsize=(10,10));
../_images/jupyter_notebook_io_sensitivity_analysis_18_0.png
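
Pearson correlation only captures linear dependence; if the scatter matrix hints at nonlinearity, a rank-based (Spearman) correlation is a quick robustness check (an optional extra step, not part of the original workflow):

# rank-based correlation of each influent with pH
influent_ph.corr(method='spearman')[' pH '].drop(' pH ').sort_values()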

2. Params/Output sensitivity

[16]:
# Set the value of percentage and sample size for lhs
percent = 0.05 # NOTE: for params percent should be <= 0.05
sample_size = 100
variable = 'params'    # influent/params/ic
method = 'uniform'    #'uniform' or 'lhs'
[17]:
index = adm1fu.create_a_sample_matrix(variable, method, percent, sample_size)
print()
print('Number of elements participating in the sampling:', len(index))
Saves a sampling matrix [sample_size,array_size] into var_params.csv
sample_size,array_size:  (100, 92)
Each column of the matrix corresponds to a variable perturbed 100 times around its original value
var_params.csv SAVED!

Number of elements participating in the sampling: 92
[18]:
# Uncomment the line below to re-run the simulations and regenerate outputs_params.csv
#exe_time = adm1fu.adm1f_output_sampling(ADM1F_EXE, variable, index)
[19]:
output_name, output_unit = adm1fu.get_output_names()
alloutputs = pd.read_csv('outputs_params.csv', sep=',', header=None)
alloutputs.columns = output_name
[20]:
param_name, param_index = adm1fu.get_param_names()
[21]:
# Since create_a_sample_matrix does not perturb every column of param.dat,
# use index to pick the headers of the parameters that were actually sampled
header = [param_name[i] for i in index]

param_inputs = pd.read_csv('var_params.csv', sep=',', header=None)
param_inputs.columns = header
param_inputs.head()
[21]:
f_sI_xc f_xI_xc f_ch_xc f_pr_xc f_li_xc N_xc N_I N_aa C_xc C_sI ... k_A_Bpro k_A_Bac k_A_Bco2 k_A_BIN kLa K_H_h2o_base K_H_co2_base K_H_ch4_base K_H_h2_base k_P
0 0.09116 0.27346 0.21535 0.21711 0.27475 0.00251 0.00420 0.00736 0.02895 0.02792 ... 1.026524e+10 1.086406e+10 9.205019e+09 1.087446e+10 207.51543 0.02859 0.03361 0.00146 0.00071 50821.70460
1 0.09692 0.25605 0.18183 0.21486 0.27367 0.00293 0.00450 0.00648 0.02930 0.02715 ... 9.961014e+09 9.209860e+09 9.484090e+09 1.097333e+10 185.69982 0.03129 0.03583 0.00146 0.00079 45097.70847
2 0.09653 0.25089 0.18351 0.19403 0.22666 0.00246 0.00420 0.00649 0.02824 0.03114 ... 1.043165e+10 1.081807e+10 9.359366e+09 9.475087e+09 218.85580 0.02930 0.03748 0.00140 0.00074 53707.49901
3 0.09891 0.25074 0.19437 0.20372 0.23318 0.00263 0.00468 0.00666 0.02873 0.02895 ... 9.431350e+09 1.031777e+10 9.787729e+09 1.030247e+10 184.26372 0.03229 0.03850 0.00127 0.00086 49069.07961
4 0.10742 0.26412 0.20268 0.20954 0.26893 0.00263 0.00414 0.00723 0.02958 0.03157 ... 1.090281e+10 1.020322e+10 1.063838e+10 1.076841e+10 189.12319 0.02950 0.03578 0.00138 0.00083 54000.23123

5 rows × 92 columns

[22]:
# merge params and output datasets
inout=pd.concat([param_inputs,alloutputs], axis=1)
inout.head()
[22]:
f_sI_xc f_xI_xc f_ch_xc f_pr_xc f_li_xc N_xc N_I N_aa C_xc C_sI ... Alk NH3 NH4 LCFA percentch4 energych4 efficiency VFA/ALK ACN sampleT
0 0.09116 0.27346 0.21535 0.21711 0.27475 0.00251 0.00420 0.00736 0.02895 0.02792 ... 8909.63 9.53919 1060.990 61.5561 56.0945 65.7436 52.8869 0.222452 87.1484 25.3731
1 0.09692 0.25605 0.18183 0.21486 0.27367 0.00293 0.00450 0.00648 0.02930 0.02715 ... 7707.69 10.88420 927.238 56.1139 62.1148 62.3272 55.0303 0.147138 654.5290 25.3731
2 0.09653 0.25089 0.18351 0.19403 0.22666 0.00246 0.00420 0.00649 0.02824 0.03114 ... 7873.10 8.67396 927.390 68.2248 58.8253 64.9817 54.1026 0.191081 196.3670 25.3731
3 0.09891 0.25074 0.19437 0.20372 0.23318 0.00263 0.00468 0.00666 0.02873 0.02895 ... 7961.45 10.83540 953.650 64.3854 57.4965 66.4942 55.2003 0.118815 -1663.6400 25.3731
4 0.10742 0.26412 0.20268 0.20954 0.26893 0.00263 0.00414 0.00723 0.02958 0.03157 ... 8755.37 10.43910 1081.950 67.7978 56.9557 64.9350 53.1639 0.222974 94.6821 25.3731

5 rows × 159 columns

[23]:
corr=inout.corr()
plt.figure(figsize=(25,15))
sns.heatmap(corr.iloc[0:92,92:-1], cmap=sns.diverging_palette(220, 10, as_cmap=True))
plt.title('Correlation Matrix [params/results]',fontsize=16);
plt.ylabel('Parameters',fontsize=16)
plt.xlabel('Outputs',fontsize=16)
[23]:
Text(0.5, 113.09375, 'Outputs')
../_images/jupyter_notebook_io_sensitivity_analysis_27_1.png
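
With 92 parameters the heatmap is hard to read row by row. As with the influents, a short illustrative helper can list the parameters with the strongest overall effect, using the same corr slicing as the cell above:

# top 10 parameters by their strongest absolute correlation with any output
top_params = corr.iloc[0:92, 92:-1].abs().max(axis=1).nlargest(10)
top_params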