What is GSMOD?

GSMOD is stochastic modelling software that allows users to add calibration-constrained Monte Carlo (or quasi-Monte Carlo) simulation capabilities to existing MODFLOW and MT3D models through an easy-to-use graphical user interface, as illustrated in Figure 1. The capacity to add stochastic simulations to existing models is what makes GSMOD extremely useful and unique:

  • Groundwater modellers do not have to significantly change their existing modelling workflow in order to perform stochastic simulations throughout their modelling project. They can continue using their preferred MODFLOW / MT3D modelling software (Groundwater Vistas, GMS, ModelMuse, FloPy...), adding stochastic capabilities as needed, depending on the objectives of the project.
  • Given that only a few changes to the modelling workflow need to be implemented, the modeller can make the final decision to perform stochastic simulations towards the end of the project, depending on the available budget and time. The idea is for the stochastic simulations to complement the traditional modelling results, not to replace them.
  • Since GSMOD uses existing groundwater models, consulting companies can revisit groundwater models developed in the past and contact clients to offer this new product using those existing models. This approach adds significant value to the modelling results at a reasonable cost.


Figure 1: GSMOD stochastic modelling workflow



GSMOD: A stochastic modelling platform for MODFLOW / MT3D (part I)

The use of stochastic models offers many advantages when trying to understand and manage uncertainty in hydrogeological studies. However, many challenges arise when implementing these types of models, which might lead one to question their benefits given the expected costs.

GSMOD was developed with the objective of making the stochastic modelling process as simple as possible, including the following features:

  • Capacity to add calibration-constrained Monte Carlo (or quasi-Monte Carlo) simulation capabilities to existing MODFLOW / MT3D models, independently of the original modelling software.
  • Different post-processing options such as stochastic hydrographs with confidence bands (Figure 1) and probability of exceedance maps (Figure 2).
  • Capacity to filter out the stochastic realizations that do not represent the observed historic behaviour (poor calibration).
  • Control over the model executions, including a heuristic solver approach that detects when a model fails to converge and tries to re-run it with more robust solver parameters.
  • All pre- and post-processing tools wrapped within an easy-to-use GUI (Figure 3).


Figure 1: Concentration hydrograph with confidence band (1000 realizations; 423 of 1000 with NRMS transport < 10%)


Figure 2: Example of probability map of exceeding concentration limit (1000 realizations; 423 of 1000 with NRMS transport < 10%)


Figure 3: GSMOD v1.3 Graphical User Interface



GSMOD: A stochastic modelling platform for MODFLOW/MT3D (part II)

Stochastic groundwater models can be a key contribution to hydrogeological studies, especially when the limitations of traditional workflows seem evident. Although efforts have been made to promote their use, they remain rather uncommon within the industry. There are several reasons for this, including:

  • The lack of knowledge about the type of results that can be generated, and about their potential contribution to commercial projects.
  • The idea that stochastic groundwater models are too complicated to develop, requiring expertise that consultants might not have, along with excessive implementation times.

Stochastic models are indeed sophisticated tools; however, the key to their success lies in (1) understanding how they can make a significant difference throughout a particular project, (2) getting the most value out of the stochastic realizations by using appropriate post-processing methodologies, and (3) speeding up their implementation through the use of specialized software. Following these ideas, new features have been added to GSMOD, including:

  • Improved processing and post-processing times of the stochastic simulations through the use of parallel computing.
  • Capacity to generate histograms of the stochastic results (Figure 1), in addition to the stochastic hydrographs previously presented in GSMOD part I.
  • Capacity to generate contour maps based on frequency analysis of the stochastic realizations (Figure 2), in addition to the probability maps previously presented in GSMOD part I.
  • Capacity to identify the parameters with the greatest influence on the calibration and predictive simulations (Figures 3 and 4), allowing the modeller to evaluate potential areas of improvement for the model and the hydrogeological characterization. Which parameters matter most for the predictions of interest in your study? Is your calibration highly dependent on these parameters? Are these parameters well characterized?
  • Capacity to export contour map animations to GIF and MPG files (Figure 5).


Figure 1: Histogram of the maximum concentration at a given location for a total of 1000 realizations (423 of 1000 with NRMS transport < 10%)


Figure 2: Drawdown maps for the best calibrated model vs the 95th percentile of 357 realizations (357 of 500 with NRMS flow < 5%)


Figure 3: Influence of natural recharge on the flow calibration for a total of 1000 realizations (555 of 1000 with NRMS flow < 5%)


Figure 4: Parameter influence on the flow calibration (555 of 1000 with NRMS flow < 5%)


Figure 5: Probability of exceeding chloride concentration limit (423 of 1000 realizations with NRMS transport < 10%)



GSMOD: Current characteristics and capabilities

GSMOD is stochastic modelling software that allows users to add calibration-constrained Monte Carlo (or quasi-Monte Carlo) simulation capabilities to existing MODFLOW/MT3D models through an easy-to-use graphical user interface.

The current characteristics/capabilities of GSMOD are:

- Compatibility with MODFLOW-2000, MODFLOW-2005, MODFLOW-NWT and MODFLOW-USG, as well as MT3DMS, MT3D-USGS and USG-Transport.

- Capacity to run flow stochastic models, flow-transport stochastic models, and transport stochastic models based on deterministic flow models. In the latter case, GSMOD runs the flow model only once to save computational time.

- Capacity to generate stochastic parameters based on Uniform, Log-Uniform, Normal, Log-Normal, Triangular and PERT distributions, and to trim the lower and upper limits of Normal and Log-Normal distributions to avoid unrealistic parameter values when needed. Each distribution can represent either the value of the parameter or a multiplier of the parameter values that already exist in the model.
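
As a rough illustration of how such distributions can be sampled (a hedged sketch, not GSMOD's internal code; the parameter names and bounds are hypothetical), the PERT distribution can be drawn through its Beta representation, and the trimmed Log-Normal through truncation in log space:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

def sample_pert(a, m, b, size, lam=4.0):
    """Sample a PERT distribution (a = min, m = mode, b = max) via its Beta form."""
    alpha = 1.0 + lam * (m - a) / (b - a)
    beta = 1.0 + lam * (b - m) / (b - a)
    return a + (b - a) * rng.beta(alpha, beta, size)

def sample_trimmed_lognormal(mu, sigma, lo, hi, size):
    """Log-Normal sample with the tails trimmed to [lo, hi] (in natural units)."""
    a, b = (np.log(lo) - mu) / sigma, (np.log(hi) - mu) / sigma
    return np.exp(stats.truncnorm.rvs(a, b, loc=mu, scale=sigma,
                                      size=size, random_state=rng))

n = 1000
k_zone1 = sample_pert(2.0, 5.0, 12.0, n)                  # parameter value (m/d)
k_mult = sample_trimmed_lognormal(0.0, 0.5, 0.2, 5.0, n)  # multiplier of existing K field
recharge = stats.loguniform.rvs(1e-5, 1e-3, size=n, random_state=rng)  # Log-Uniform
```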


- Capacity to use purely random, Latin Hypercube (near-random) and Sobol sequence (quasi-random) parameter sampling methods.
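
The three sampling strategies can be reproduced with scipy.stats.qmc, as in the following minimal sketch (hypothetical parameter bounds; not GSMOD's implementation):

```python
import numpy as np
from scipy.stats import qmc

n, dim = 128, 3  # realizations x stochastic parameters (e.g. Kx, Sy, recharge)
l_bounds, u_bounds = [0.1, 0.01, 1e-5], [50.0, 0.3, 1e-3]

# Purely random sampling of the unit hypercube
unit_random = np.random.default_rng(0).random((n, dim))

# Latin Hypercube: stratified ("near-random") coverage of each marginal
unit_lhs = qmc.LatinHypercube(d=dim, seed=0).random(n)

# Sobol sequence: quasi-random, low-discrepancy coverage of the whole space
unit_sobol = qmc.Sobol(d=dim, scramble=True, seed=0).random(n)

# Rescale from the unit hypercube to the parameter bounds
params_lhs = qmc.scale(unit_lhs, l_bounds, u_bounds)
params_sobol = qmc.scale(unit_sobol, l_bounds, u_bounds)
```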

- Capacity to use an improved calibration-constrained Monte Carlo approach that considerably increases the number of runs that are consistent with the historic behaviour of the system (i.e. low NRMS), while maintaining the stochastic nature of the method.

- Capacity to calculate calibration statistics for each valid model run and to post-process the stochastic runs based on different calibration criteria (known as calibration-constrained Monte Carlo, or stochastic modelling with Bayesian probability quantification). This makes it possible to generate a set of realizations that complies both with the probabilistic distributions of the flow and transport parameters and with the historic behaviour of the system (good calibration). The user can change and test different calibration criteria in a matter of seconds, assigning different weights to expert knowledge (the probability distributions of the defined parameters) and to the observed values (the calibration statistics).
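
A minimal sketch of the underlying bookkeeping, assuming NRMS is computed as the (weighted) RMS of the residuals divided by the observed range, in percent; the head arrays below are synthetic stand-ins for real observations and stochastic model outputs:

```python
import numpy as np

def nrms(observed, simulated, weights=None):
    """Normalized RMS (%): weighted RMS of the residuals over the observed range."""
    w = np.ones_like(observed) if weights is None else weights
    rms = np.sqrt(np.sum(w * (simulated - observed) ** 2) / np.sum(w))
    return 100.0 * rms / (observed.max() - observed.min())

rng = np.random.default_rng(1)
observed_heads = rng.uniform(100.0, 150.0, size=20)
simulated_heads = observed_heads + rng.normal(0.0, 3.0, size=(1000, 20))

nrms_values = np.array([nrms(observed_heads, sim) for sim in simulated_heads])
accepted = nrms_values < 5.0                # user-defined calibration criterion
posterior_runs = simulated_heads[accepted]  # realizations consistent with history
print(f"{accepted.sum()} of {accepted.size} runs accepted")
```

Changing the criterion (the 5% threshold or the weights) only re-filters the stored results, which is why different calibration criteria can be tested in seconds.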

- Capacity to change the observed data and associated weights used to calculate the calibration statistics without having to re-run the stochastic simulations, making it possible to test different objective functions (i.e. calibration functions) depending on the model objectives. This also makes it easy to fix any errors in the input data.

- Capacity to perform parallel runs and multi-threading analyses of the stochastic results using different CPU cores within one computer, with the additional possibility to combine computational results from up to 16 different computers (distributed computing).
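
GSMOD's internals are not public, but the general parallel-execution pattern can be sketched as follows, assuming a hypothetical folder layout (runs/real_0000, ...) and a MODFLOW-2005 executable named mf2005 on the path:

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def run_realization(i: int) -> tuple[int, bool]:
    """Run one pre-built MODFLOW realization in its own working folder."""
    workdir = Path(f"runs/real_{i:04d}")
    result = subprocess.run(["mf2005", "model.nam"], cwd=workdir,
                            capture_output=True, text=True)
    # Simplistic success check; a robust check would parse the listing file
    converged = "Normal termination" in result.stdout
    return i, converged

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=8) as pool:  # e.g. one worker per core
        status = dict(pool.map(run_realization, range(1000)))
```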


- Capacity to halt and post-process the stochastic runs at any point to verify whether the preliminary results are as expected, allowing the user to correct any potential issue without having to run all stochastic realizations.

- Capacity to detect and filter out models that fail to converge or that have numerical balance errors above a user-defined limit for a percentage of the model time steps (e.g. 99% of the time steps need to have a balance error lower than 1%).
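
A hedged sketch of such a filter, assuming the percent discrepancies per time step have already been parsed from the listing files (synthetic values are used below):

```python
import numpy as np

def passes_balance_check(discrepancy_pct, limit=1.0, required_fraction=0.99):
    """True if at least `required_fraction` of the time steps have
    an absolute balance error below `limit` (in percent)."""
    ok = np.abs(discrepancy_pct) < limit
    return ok.mean() >= required_fraction

rng = np.random.default_rng(2)
discrepancy = rng.normal(0.0, 0.5, size=(500, 120))  # (realizations, time steps)
valid = np.array([passes_balance_check(d) for d in discrepancy])
print(f"{valid.sum()} of {valid.size} realizations pass the balance check")
```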

- Use of an innovative and exclusive heuristic solver approach that is able to automatically re-run problematic stochastic realizations with more robust solver parameters.
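
The exact heuristic is proprietary to GSMOD, but the general retry pattern might look like the following sketch, in which a hypothetical ladder of increasingly relaxed MODFLOW-NWT solver settings is tried until the run converges (the solver record written here is schematic, not a complete NWT input file):

```python
import subprocess
from pathlib import Path

# Hypothetical ladder of increasingly robust solver settings:
# (HEADTOL, FLUXTOL, MAXITEROUT) loosened and raised at each attempt.
SOLVER_LADDER = [(1e-4, 500.0, 100), (1e-3, 5000.0, 300), (1e-2, 50000.0, 600)]

def write_nwt_file(path: Path, headtol: float, fluxtol: float, maxiterout: int) -> None:
    """Write a schematic NWT solver record (illustrative only)."""
    path.write_text(f"{headtol} {fluxtol} {maxiterout} 1e-5 1 0 0 1 SIMPLE\n")

def run_with_retries(workdir: Path) -> bool:
    for headtol, fluxtol, maxiterout in SOLVER_LADDER:
        write_nwt_file(workdir / "model.nwt", headtol, fluxtol, maxiterout)
        out = subprocess.run(["mfnwt", "model.nam"], cwd=workdir,
                             capture_output=True, text=True)
        if "Normal termination" in out.stdout:
            return True   # converged; keep this realization
    return False          # still failing: flag the realization as non-convergent
```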

- Capacity to identify the best calibrated model and the importance of each stochastic parameter with respect to the calibration through the use of Global Sensitivity Analysis techniques. That said, calibrating the model is not an objective of GSMOD, just a by-product of the stochastic runs.
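
GSMOD's specific Global Sensitivity Analysis technique is not detailed here; as a simple stand-in, a rank-correlation screening between the sampled parameters and the resulting NRMS can be sketched as follows (all data synthetic):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
n = 1000
params = {                     # sampled stochastic inputs, one array per parameter
    "K_zone1": rng.lognormal(0.0, 0.8, n),
    "recharge": rng.uniform(1e-5, 1e-3, n),
    "Sy": rng.uniform(0.01, 0.3, n),
}
# Stand-in objective function: NRMS dominated by K_zone1 in this synthetic case
nrms_flow = 5.0 + 3.0 * np.log(params["K_zone1"]) ** 2 + rng.normal(0.0, 1.0, n)

for name, values in params.items():
    rho, _ = spearmanr(values, nrms_flow)
    print(f"{name:10s} |Spearman rho| = {abs(rho):.2f}")
```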

- Capacity to omit user-defined realizations and to replace the internal NRMS calculations with external values provided by the user.

- Capacity to generate stochastic hydrographs and histograms for heads, pore water pressures, drawdowns, concentrations and global flow and mass balances.
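
A minimal sketch of how such a confidence-band hydrograph can be assembled from the accepted realizations (synthetic data; the percentile choices are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
t = np.arange(50)  # years
# Synthetic stand-in: heads at one observation point for 423 accepted runs
heads = 120.0 - 0.3 * t + 0.2 * rng.normal(size=(423, t.size)).cumsum(axis=1)

p5, p50, p95 = np.percentile(heads, [5, 50, 95], axis=0)

fig, ax = plt.subplots()
ax.fill_between(t, p5, p95, alpha=0.3, label="5-95% confidence band")
ax.plot(t, p50, label="median of accepted runs")
ax.set_xlabel("time (years)")
ax.set_ylabel("head (m)")
ax.legend()
fig.savefig("stochastic_hydrograph.png", dpi=150)
```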

- Capacity to generate stochastic maps based on statistical analyses of the stochastic runs, including probability of exceedance maps and contour maps based on percentile criteria. For each map, different options exist to integrate the information with respect to depth (mean, maximum or minimum values in depth, or results by layer).
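
Conceptually, both map types reduce to cell-wise statistics across the accepted realizations, as in this sketch with a synthetic concentration array:

```python
import numpy as np

# Synthetic stand-in: concentrations from 423 accepted runs on a 3-layer grid
rng = np.random.default_rng(5)
conc = rng.lognormal(mean=4.0, sigma=1.0, size=(423, 3, 100, 100))
limit = 250.0                                         # e.g. a limit in mg/L

conc_depth_max = conc.max(axis=1)                     # depth integration: max over layers
prob_exceed = (conc_depth_max > limit).mean(axis=0)   # probability-of-exceedance map
p95_map = np.percentile(conc_depth_max, 95, axis=0)   # 95th-percentile contour map
```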


- Capacity to export the stochastic results to Excel spreadsheets, CSV files, PNG images, GIF/MP4 animations and raster files.

- Capacity to export all necessary information related to the stochastic realizations for users to be able to perform analyses outside of GSMOD with their preferred programming language, e.g. Python.

- Capacity to run stochastic predictive simulations that are a continuation of a stochastic calibration model. For each predictive simulation, the flow and transport initial conditions are automatically updated from the respective calibration run, while user-specified stochastic parameter values are taken from the calibration models so that parameters that should share the same values (or series of values) remain consistent across both modelling periods.

- Capacity to detect user errors while setting up the GSMOD project, to avoid problems during the stochastic runs. Part of the set-up process is automatically filled in by GSMOD based on analyses of the numerical model files. An important objective of GSMOD is to make the stochastic modelling process as simple as possible.


- Comprehensive documentation within the help menu, including various tutorials that explain how to add stochastic simulation capabilities to existing groundwater models.



GSMOD: Calibration-constrained Monte Carlo simulations for pore water pressure prediction in open pit mining

Numerical groundwater models are regularly used to predict pore water pressures (PWPs) in open pit mines. The resulting PWPs are used in geomechanical analyses to evaluate slope stability and the need for active depressurization. These groundwater models tend to be extremely complex, both from a hydrogeological and a numerical point of view.

Important efforts have been made to improve the quality of the modelling predictions, including the use of highly discretized 3D models and the representation of hydromechanical coupled processes, if required. That said, there is still a tendency to rely exclusively on deterministic models, which limits the capacity to quantify and manage the underlying uncertainty.

Stochastic predictions of PWPs for an open pit project located in Chile are presented in Figures 1 to 4 to demonstrate the advantages of performing calibration-constrained Monte Carlo (or quasi-Monte Carlo) simulations. These figures give important information about the uncertainty of the modelling results, including the fact that the degree of uncertainty of the PWPs decreases with time in most piezometers located close to the excavation (e.g. piezometer VWP-1 in Figure 1). This behaviour is explained by the nature of the groundwater flow around the open pit, which tends to be highly conditioned by the existing seepage faces and active depressurization systems. Conversely, the uncertainty of the calculated PWPs increases with time in piezometers located further away from the open pit and in the proximity of relevant geological structures (e.g. piezometer VWP-2 in Figure 2).

The stochastic simulations also reveal how the best calibrated model can display PWPs in the upper range of the stochastic results for certain periods of time (a conservative result from a geomechanical point of view) and in the lower range for other periods (an optimistic result from a geomechanical point of view), as shown for piezometer VWP-3 in Figure 3. This result refutes the idea that a conservative historic behaviour will necessarily lead to conservative predictions, and the idea that the best calibrated model represents a middle point within the expected uncertainty. All of this corroborates the advantages of performing stochastic simulations, especially if the available field observations are limited.

Finally, it is concluded that the uncertainty of the decrease in PWPs grows with time in all piezometers around the mining operation (e.g. piezometer VWP-1 in Figure 4). This result reinforces the importance of analyzing in detail the type of predictive variables to consider for the objectives of the study: PWPs, decrease of PWPs, and/or rate of decrease of PWPs.

Adding stochastic capabilities to the existing MODFLOW-USG open pit model took less than a day thanks to the use of GSMOD. Running times were on the order of days thanks to the use of parallel computing. Preliminary results could be reviewed after just a couple of hours to correct potential issues and to get an idea of how the final stochastic results would look.



Figure 1: PWPs in piezometer VWP-1 for 500 model runs (212 model runs with NRMS < 7.5%) 


Figure 2: PWPs in piezometer VWP-2 for 500 model runs (212 model runs with NRMS < 7.5%) 


Figure 3: PWPs in piezometer VWP-3 for 500 model runs (212 model runs with NRMS < 7.5%)


Figure 4: Decrease in PWPs in piezometer VWP-1 for 500 model runs (212 model runs with NRMS < 7.5%)




GSMOD: Stochastic groundwater modelling in commercial projects

The question of how much time and how many resources are required to implement, run and post-process stochastic groundwater simulations is key to making stochastic groundwater modelling a common practice and a profitable activity. Three important factors are presented below:


1) Do groundwater modellers need to considerably modify their existing modelling workflows to perform stochastic simulations? 

The chances of successfully implementing stochastic groundwater models decrease if important modifications to the existing modelling workflows are required. Such modifications increase the risk of failing to deliver modelling results in reasonable times, which could have major repercussions if the main results expected by the client are exclusively associated with the stochastic component of the model.

A more convenient approach is to modify the existing modelling workflows as little as possible, and to think of the stochastic simulations as a complement to the traditional deterministic modelling results. In this way, consulting companies can offer stochastic groundwater modelling as an additional task or as a continuation of traditional modelling projects, instead of as a new and uncertain type of project in which the development of a (probably expensive) numerical model is conditioned by the stochastic simulations.


2) Are stochastic computational times compatible with commercial projects?

Stochastic simulations take time, although running times can be considerably reduced through parallel and distributed computing; any stochastic modelling workflow therefore has to include the option of performing parallel simulations. That said, there is an additional critical factor to consider: where in the traditional modelling process should the stochastic modelling begin? This question is intrinsically related to the initial question of how much existing modelling workflows need to be modified to perform stochastic simulations.

A convenient option is to start the stochastic modelling process in the latest stages of the modelling project, ideally after finishing the calibration and the deterministic predictive simulations. This approach requires minimal changes to the existing modelling workflows, taking the stochastic simulation process out of the iterative modelling loop of "conceptual model - calibration - sensitivity analyses - predictive simulations", which might be completely incompatible with stochastic modelling times. It also makes it possible to implement stochastic simulations using previously developed models, which opens new business opportunities not only with current clients but also with past ones.

This approach has the additional benefit of starting the stochastic simulations after finishing the deterministic simulations, which are considered the traditional main deliverable. If the stochastic results are consistent with the deterministic results, then the traditional deterministic approach was adequate, and the stochastic results help demonstrate the robustness of the conceptual and numerical model, adding quantitative estimations of the model uncertainty that could ultimately allow any adopted safety factors to be readjusted. A properly developed model will very likely follow this route. If the stochastic results lead to different conclusions than the deterministic results, then the stochastic simulations can be considered a key contribution to the project, and a revision of the deterministic and stochastic components of the model has to be made to understand the differences.

Postponing the stochastic modelling process to the latest stages of the project has the drawback of conditioning the stochastic results on decisions taken during model construction and calibration. My personal opinion is that the practical benefits of the stochastic modelling approach described in the previous paragraphs outweigh this type of disadvantage, especially considering the currently limited use of stochastic groundwater modelling within the industry. Independent of this, more advanced stochastic modelling methodologies that try to address this specific issue exist, such as stochastic multi-model approaches. These modelling techniques are, however, more expensive.


3) Do modellers need to develop their own pre- and post-processing modelling routines?

Although it is often interesting and beneficial to develop in-house processing routines, having access to robust, easy-to-use and widely available stochastic modelling software can make an important difference from both a technical and a business perspective. Developing adequate in-house codes takes time, especially if these codes need to be flexible enough to be compatible with diverse modelling objectives and configurations, and efficient enough to process tens or even hundreds of gigabytes of stochastic modelling results in reasonable times.

All of this said, the main problem with relying exclusively on in-house routines is that clients will very likely prefer validated, easy-to-use and widely available software packages that do not tie their projects to a specific consultant. Furthermore, one can argue that the existence and continuous improvement of this type of easy-to-use, widely available software is what will end up making stochastic groundwater modelling a common practice in the future.



Estimation of optimal complexity in groundwater models using cross-validation methods

The level of complexity that a groundwater model should have is an important and recurrent question for hydrogeologists. A few weeks ago an interesting discussion about this topic took place on LinkedIn (link), with one of the main conclusions being that the optimal complexity should minimize predictive error/uncertainty by improving the fit to the existing observations without overfitting them, as described in Figure 1. Understanding the meaning and implications of this figure is key to the modelling process. With that in mind, I decided to implement two sets of models to illustrate how different degrees of complexity affect predictive error. We will also see how to estimate the optimal level of complexity in groundwater models using cross-validation techniques commonly used by data scientists.

In the context of this article, model complexity will mainly refer to how many parameters the model has (e.g. the number of hydraulic conductivity zones or pilot points), and not necessarily to the type of equations that the model is using (e.g. fully saturated vs Richards' equations, surface water as boundary conditions vs fully coupled sw-gw, etc.), which opens a broader discussion. Additionally, the existing observations have to be reasonably related to the predictive variables, in terms of type, locations, scales and associated stresses.

 
"GSMOD, Monte Carlo, stochastic, mining, hydrogeology, groundwater, software, calibration, free, simulation, transport, MODFLOW, MT3D, FEFLOW, slope stability, finite difference, USG, USGS, hydrograph, probability, histogram, python, matplotlib, water resources, climate change, geology, geosciences, probability, concentrations, chemistry, geochemistry, demo, predictions, model, modeling, modelling, flopy, leapfrog, system, risk, evaluation, null space, uncertainty, analysis, heterogeneity, hydrology, dynamic, coupling, numpy, assessment, interactive, parameter, distribution, thomas, booth, pest, modeling"
Figure 1: Relationship between model complexity and predictive uncertainty, taken from the Australian Groundwater Modelling Guidelines (Barnett et al., 2012).


Relationship between model complexity and predictive error

In simple terms, Figure 1 indicates that adding complexity to a model (e.g. adding K zones) allows us to obtain a better fit to the existing observations (e.g. a lower NRMS), which reduces predictive error/uncertainty, but that too much complexity (e.g. too many K zones) could potentially lead to overfitting and increased predictive error/uncertainty. It is concluded that a model should aim to have an optimal level of complexity. Clearly, the definition of the model parameters has to be related to the conceptual understanding of the system, to the available field information, and to the modelling objectives; however, for the purpose of this article we will just focus on the complexity of the model.


Example 1: Generic Excel model

We start with a very simple example implemented in Excel, to better understand the basic concepts behind Figure 1. The main idea is to (1) obtain synthetic observations from a generic model that will be known to us, (2) add a small error to each observation, (3) fit a series of trendlines with different degrees of complexity to the previously generated observations, and (4) compare the predictive capacity of each trendline with respect to the "true prediction" given by the model that was used to generate the synthetic observations.

It is clear that in real problems we will never know the "true prediction", which is the main reason why we model; using a known generic model here is simply a convenient technique that allows us to illustrate the relationship between model complexity and predictive error. The generic model chosen for creating the synthetic observations is f(x) = x^2, with random errors between +/- 50, as shown in Figure 2.


"GSMOD, Monte Carlo, stochastic, mining, hydrogeology, groundwater, software, calibration, free, simulation, transport, MODFLOW, MT3D, FEFLOW, slope stability, finite difference, USG, USGS, hydrograph, probability, histogram, python, matplotlib, water resources, climate change, geology, geosciences, probability, concentrations, chemistry, geochemistry, demo, predictions, model, modeling, modelling, flopy, leapfrog, system, risk, evaluation, null space, uncertainty, analysis, heterogeneity, hydrology, dynamic, coupling, numpy, assessment, interactive, parameter, distribution, thomas, booth, pest, modeling"
Figure 2: Generic Excel model and synthetic observations including random error.


We fit trendlines with different degrees of complexity to our synthetic observations, starting with a straight line and continuing with polynomials of degree 2, 4 and 6. For each trendline we define a forecast period, which we visually compare to the "true prediction" (a minimal Python sketch of this experiment is included after the list below). The results are presented in Figure 3, which shows that:

  • increasing the model complexity always allows for a better fit to the existing observations (i.e. R² gets closer to 1 as we increase the complexity),
  • increasing the complexity leads to better predictions up to a certain level of complexity,
  • increasing the complexity beyond that level affects the predictive capacity of the model due to overfitting.
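
A minimal Python sketch of this experiment (numpy.polyfit standing in for the Excel trendlines; the synthetic errors differ from those in Figure 2, so the exact RMS values will differ too):

```python
import numpy as np

rng = np.random.default_rng(6)
x_obs = np.linspace(0, 20, 21)
y_obs = x_obs ** 2 + rng.uniform(-50, 50, x_obs.size)  # synthetic observations
x_fore = np.linspace(20, 30, 11)                       # forecast period
y_true = x_fore ** 2                                   # "true prediction"

for degree in (1, 2, 4, 6):
    coef = np.polyfit(x_obs, y_obs, degree)
    fit_rms = np.sqrt(np.mean((np.polyval(coef, x_obs) - y_obs) ** 2))
    pred_rms = np.sqrt(np.mean((np.polyval(coef, x_fore) - y_true) ** 2))
    print(f"degree {degree}: fit RMS = {fit_rms:8.1f}, predictive RMS = {pred_rms:10.1f}")
```

The fit RMS decreases monotonically with the polynomial degree, while the predictive RMS is typically minimized at degree 2 and grows again for the higher degrees.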

This simple example shows that under- and over-parameterization can lead to significant predictive errors. It is also concluded that improving the fit as much as possible does not necessarily lead to better predictions; on the contrary, it can cause important predictive errors due to overfitting.

It is important to mention that if no errors had been added to the synthetic observations, all fitted polynomials of degree 2 and higher would have reproduced the underlying function exactly (y = x^2). This demonstrates that predictive error due to overfitting is an amplification of the measurement error, as indicated in Figure 1, confirming that the quality of the observations is fundamental for the modelling process.


"GSMOD, Monte Carlo, stochastic, mining, hydrogeology, groundwater, software, calibration, free, simulation, transport, MODFLOW, MT3D, FEFLOW, slope stability, finite difference, USG, USGS, hydrograph, probability, histogram, python, matplotlib, water resources, climate change, geology, geosciences, probability, concentrations, chemistry, geochemistry, demo, predictions, model, modeling, modelling, flopy, leapfrog, system, risk, evaluation, null space, uncertainty, analysis, heterogeneity, hydrology, dynamic, coupling, numpy, assessment, interactive, parameter, distribution, thomas, booth, pest, modeling"
Figure 3: Effect of model complexity on the predictive error for the generic Excel model.


Example 2: MODFLOW/MT3DMS model

A flow and transport model is implemented using MODFLOW-2005 and MT3DMS to study the relationship between complexity and predictive error in groundwater models, with a brief description presented in Figure 4. The base model has twelve hydraulic conductivity zones, defining a known level of complexity. Infiltration of polluted water occurs over a period of 5 years with a concentration of 3000 mg/L, followed by a period of 45 years in which no infiltration occurs.

It is important to mention that the concepts explained in this article are also valid if pilot points instead of zones are being used, although additional considerations need to be taken. See the additional comments at the end of this article for more information.


"GSMOD, Monte Carlo, stochastic, mining, hydrogeology, groundwater, software, calibration, free, simulation, transport, MODFLOW, MT3D, FEFLOW, slope stability, finite difference, USG, USGS, hydrograph, probability, histogram, python, matplotlib, water resources, climate change, geology, geosciences, probability, concentrations, chemistry, geochemistry, demo, predictions, model, modeling, modelling, flopy, leapfrog, system, risk, evaluation, null space, uncertainty, analysis, heterogeneity, hydrology, dynamic, coupling, numpy, assessment, interactive, parameter, distribution, thomas, booth, pest, modeling"
Figure 4: Description of the MODFLOW/MT3DMS base model.


Similarly to the Excel model, synthetic observations are obtained from our base model (i.e. concentrations in MW-01 to MW-10), to which a certain amount of error is added. The objective is to use these observations to calibrate models with different degrees of complexity (i.e. the base model with different numbers of K zones: 1, 6, 12, 36), and to visually compare the predictive capacity of each calibrated model with respect to the "true prediction" given by the base model. The results are presented in Figure 5, which corroborates the conclusions from the Excel model:

  • increasing the model complexity always allows for a better fit to the existing observations (i.e. adding K zones allows lower RMS values to be obtained, with particularly poor behaviour for the "1 zone model"),
  • increasing the complexity leads to better predictions up to a certain level of complexity,
  • increasing the complexity beyond that level affects the predictive capacity of the model due to overfitting.

It is important to understand that each calibrated model in Figure 5 corresponds to the best achievable fit for a given level of complexity considering all observations (MW-01 to MW-10). In that sense, the calibration and predictive errors shown in the first chart (i.e. the "1 zone model") are almost exclusively associated with the lack of complexity of the model and not with the chosen K value (i.e. no other K value could lead to a better fit to the existing observations using only one K zone), whereas the errors of the last chart (i.e. the "36 zones model") are associated with an excess of complexity that leads to overfitting in most observation wells.


"GSMOD, Monte Carlo, stochastic, mining, hydrogeology, groundwater, software, calibration, free, simulation, transport, MODFLOW, MT3D, FEFLOW, slope stability, finite difference, USG, USGS, hydrograph, probability, histogram, python, matplotlib, water resources, climate change, geology, geosciences, probability, concentrations, chemistry, geochemistry, demo, predictions, model, modeling, modelling, flopy, leapfrog, system, risk, evaluation, null space, uncertainty, analysis, heterogeneity, hydrology, dynamic, coupling, numpy, assessment, interactive, parameter, distribution, thomas, booth, pest, modeling"
Figure 5: Effect of model complexity on the predictive error for the MODFLOW/MT3DMS model.


Estimation of optimal complexity in groundwater models

A common approach used by data scientists for estimating the optimal complexity of a model is to separate the observations into two datasets: the training dataset, used to calibrate the model, and the testing dataset, used to verify that no under- or overfitting is occurring. The associated methods are grouped under the name of cross-validation techniques, including the Train-Test split, K-Fold CV, and Shuffle-and-Split. The use of cross-validation techniques is a fundamental step in the modelling process for data scientists since, as demonstrated above, minimum predictive error is only achieved at the optimal complexity. It should be noted that cross-validation methods should be applied only when the existing observations are reasonably related to the predictions of interest, in terms of type, location, scales, and associated stresses.

Cross-validation in data science is to some extent similar to an existing groundwater modelling approach in which the historic period is split into a calibration and a validation period. That said, the cross-validation techniques used by data scientists are fairly sophisticated algorithms, formulated with the specific objective of finding the optimal level of complexity of a model. Furthermore, the idea of separating the historic period into a calibration and a validation period is often misinterpreted in groundwater modelling, to the point that in many cases the distinction is not even attempted because it is considered useless. This happens due to a misunderstanding of the underlying problem and the available methods.

For the purpose of this article we apply a very simple version of the Train-Test split method to the MODFLOW/MT3DMS model, i.e. (1) we separate the observations into training (years 0 to 14) and testing (years 15 to 17) datasets, (2) we use the training dataset to calibrate the models associated with the different levels of complexity, and (3) we estimate the optimal complexity by finding the minimum RMS for the testing dataset (not for the training dataset, whose RMS always decreases with the number of K zones). The results are presented in Figure 6, which shows that the optimal complexity lies approximately between six and twelve K zones, although other levels of complexity could have been evaluated for a more precise result.
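
A schematic sketch of the Train-Test bookkeeping follows; the "calibrated model" is an entirely synthetic stand-in (the bias/overfit numbers are invented purely to mimic the qualitative behaviour of the MODFLOW/MT3DMS runs):

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.arange(18)              # years 0-17
train, test = t <= 14, t > 14  # Train-Test split by time

obs = 100.0 + 10.0 * t + rng.normal(0.0, 1.0, t.size)  # synthetic observations

def simulated_conc(n_zones):
    """Stand-in for 'calibrate a model with n_zones K zones and return its series'."""
    bias = {1: 8.0, 6: 2.0, 12: 1.0, 36: 0.5}[n_zones]     # fit improves with zones
    overfit = {1: 0.0, 6: 0.5, 12: 1.5, 36: 6.0}[n_zones]  # extrapolation error grows
    drift = overfit * (t - 14.0).clip(min=0.0) ** 1.5      # error appears in test years
    return 100.0 + 10.0 * t + bias * rng.normal(size=t.size) + drift

for n_zones in (1, 6, 12, 36):
    sim = simulated_conc(n_zones)
    rms_train = np.sqrt(np.mean((sim[train] - obs[train]) ** 2))
    rms_test = np.sqrt(np.mean((sim[test] - obs[test]) ** 2))
    print(f"{n_zones:2d} zones: train RMS = {rms_train:5.1f}, test RMS = {rms_test:5.1f}")
```

The optimal complexity is the one that minimizes the testing RMS, not the training RMS.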


"GSMOD, Monte Carlo, stochastic, mining, hydrogeology, groundwater, software, calibration, free, simulation, transport, MODFLOW, MT3D, FEFLOW, slope stability, finite difference, USG, USGS, hydrograph, probability, histogram, python, matplotlib, water resources, climate change, geology, geosciences, probability, concentrations, chemistry, geochemistry, demo, predictions, model, modeling, modelling, flopy, leapfrog, system, risk, evaluation, null space, uncertainty, analysis, heterogeneity, hydrology, dynamic, coupling, numpy, assessment, interactive, parameter, distribution, thomas, booth, pest, modeling"
Figure 6: Estimation of the optimal complexity using the Train-Test split method.


Summary

Finding the optimal level of complexity of a model is key for minimizing predictive error. Under- and overfitting can deteriorate the predictive capacity of a model, especially if measurement errors are significant. Cross-validation methods from data science can be used to estimate the optimal complexity of groundwater models when the existing observations are reasonably related to the predictive variables of interest, and when the amount and quality of the observations are adequate. That said, the use of these cross-validation methods is still uncommon in groundwater modelling, which means there is not yet enough evidence to draw conclusions about their real contribution to consulting projects considering all practicalities. Independent of this, it is important for groundwater modellers to understand the implications of under- and overfitting, and the existing approaches for estimating the optimal level of complexity of a model.


Additional comments

Model complexity can have important practical implications for the development of a modelling project, which were not discussed in this article. Complex models can take an important amount of time to implement and run, potentially affecting the tasks that the modeller will be able to perform considering the available deadlines and budget. These factors need to be considered when estimating the optimal level of complexity of a model, ideally within a tier-based risk assessment framework.

The application of cross-validation techniques requires an important degree of automation, especially when using the most sophisticated types of cross-validation methods. GSMOD is in the process of incorporating cross-validation techniques commonly used by data scientists as part of its modelling workflow. These methods will be embedded within the Bayesian uncertainty analysis framework currently offered by GSMOD.

When using PEST and Pilot Points, modellers have additional tools that allow them to control the "complexity" of the model without having to change the number of Pilot Points. One of these tools is Tikhonov regularisation. Visit the PEST homepage for additional information.


Calibration-constrained Monte Carlo and additional methods

Groundwater model predictions can be affected by different sources of uncertainty, potentially leading to important repercussions in the decision-making process that follows them. Modelling workflows that rely exclusively on calibration (meaning history matching) and local sensitivity analysis for data assimilation usually give limited insight into the effect that these sources of uncertainty have on the model predictions. This is particularly important when (1) the future behaviour of the system is expected to be considerably different than the existing and past behaviour, (2) the observation data is not directly related to the predictive variables of interest, and (3) the amount, quality, distribution, and type of observation data leads to a strong non-unique response for specific results of interest.

Among the most important sources of uncertainty, it is worth mentioning:

  • the uncertainty associated with hydraulic and transport parameters, due to the heterogeneity of geological materials and a potential lack of representative field information at the required scales,
  • the structural uncertainty associated with the locations and geometries of hydrogeological units, the basement, and faults/dykes,
  • the uncertainty associated with recharge and discharge processes, including climate,
  • the errors and inadequate representativity of field observations, considering the scales of measurement, the heterogeneity, and the interest of the hydrogeological study,
  • and the numerical assumptions and limitations of the modelling software: Is the assumption of (equivalent) porous media adequate in my particular study? Is the modelling code adding numerical dispersion to my transport models? Is the assumption of constant total stress (Δσ = 0) in the traditional groundwater flow equations adequate in models that include excavation and/or deposition of material?

Different methods exist for studying and estimating predictive uncertainty, many of them relying on some form of Bayesian analysis: there is prior knowledge of the system expressed through probability distributions (e.g. hydraulic conductivities estimated from literature and field tests, that vary between Kmin and Kmax with a mode of Km, following a triangular distribution), which is improved by adjusting the model to match existing observations (e.g. heads, concentrations, flows...), leading to posterior knowledge of the system (i.e. a subset of the original K values that reproduces the historic behaviour of the system while being consistent with the literature values and field data). 

The most direct implementation of Bayesian analysis in groundwater modelling is usually called calibration-constrained Monte Carlo (or ABC rejection), with an example presented in Figure 1. In this method, selected model parameters (K, S…) are characterized by probability distributions (prior knowledge; purple histogram), which are sampled and evaluated in the model for several runs (realizations), leading to a set of results that can be classified based on user-defined calibration criteria (posterior knowledge; green histogram). The accepted realizations are used for predictive purposes. In the example of Figure 1, the hydraulic conductivity was originally estimated to follow a PERT distribution with a minimum of 2 m/d, a mode of 5 m/d, and a maximum of 12 m/d (prior knowledge; purple histogram), ending up with values that vary between 3 m/d and 11 m/d with a mode of 7.5 m/d (posterior knowledge; green histogram).
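
A minimal sketch of this rejection scheme for the single-parameter example (the model run is replaced by an invented NRMS function purely for illustration; in practice each sample would drive a full MODFLOW/MT3D run):

```python
import numpy as np

rng = np.random.default_rng(8)

def sample_pert(a, m, b, size, lam=4.0):
    """Sample a PERT(a = min, m = mode, b = max) prior via its Beta form."""
    alpha = 1.0 + lam * (m - a) / (b - a)
    beta = 1.0 + lam * (b - m) / (b - a)
    return a + (b - a) * rng.beta(alpha, beta, size)

def run_model_nrms(k):
    """Invented stand-in for running the model with conductivity k, returning NRMS (%)."""
    return 5.0 + 4.0 * abs(np.log(k / 7.0)) + rng.normal(0.0, 1.0)

k_prior = sample_pert(2.0, 5.0, 12.0, size=10_000)  # prior (purple histogram)
nrms = np.array([run_model_nrms(k) for k in k_prior])
k_posterior = k_prior[nrms < 7.5]                   # accepted runs (green histogram)
print(f"acceptance rate: {k_posterior.size / k_prior.size:.1%}")
```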

Figure 1: Results from a calibration-constrained Monte Carlo analysis using GSMOD (10,000 model runs).


Calibration-constrained Monte Carlo methods offer many advantages as an uncertainty analysis tool, including:

  • the use of any combination of probability distribution functions to represent the prior knowledge of the system (e.g. uniform, normal, triangular, log-normal, etc.),
  • no assumptions with respect to the resulting posterior distribution, which can end up having any shape, such as uniform, with one peak (unimodal) or several peaks (multimodal),
  • no artificial tendency to underestimate regions of extreme parameter values, or relatively low (but still relevant) posterior probabilities (left and right margins of the histograms),
  • and no relationship between model runs, which means that all realizations can be run independently from each other using parallel computing.

However, the same characteristics that make calibration-constrained Monte Carlo appealing lead to an important disadvantage: the method tends to generate a low number of realizations that fit the historic behaviour of the system (e.g. an acceptable NRMS) relative to the total number of model runs, requiring long computational times. This happens because the method makes no particular effort to find the realizations that best fit the historic observations. As an example, only 3% of the model runs in Figure 1 comply with the defined NRMS criteria. Although several approaches exist for reducing the associated computational times (e.g. the use of parallel computing, grid coarsening, narrowing the prior distributions, and the use of metamodels), calibration-constrained Monte Carlo methods may be infeasible in many projects.

Different uncertainty analysis methods try to improve the efficiency of calibration-constrained Monte Carlo approaches while aiming to address a similar objective, such as the popular Null Space Monte Carlo method available in PEST (Doherty, 2015) and the equally popular Iterative Ensemble Smoother available in PEST++ (White et al., 2020). More information about these methods can be found in the series of webinars Decision-Support Modelling Principles and Practice from the Groundwater Modelling Decision Support Initiative (link). It is worth mentioning that GSMOD is in the process of adding compatibility with PEST++, to offer users the wide variety of tools associated with this popular software package.

In addition, a new uncertainty analysis method is being developed specifically for GSMOD. Results can be seen in Figure 2, which shows its application to the same model as in Figure 1, increasing the original success rate from 3% to 51% while maintaining good sampling of the parameter space, as seen in Figure 3. It is important to mention that no method can achieve a success rate of 100% unless the acceptance criterion is very loose (e.g. a high NRMS limit), which would turn the method into a full Monte Carlo. This new methodology aims to offer a good compromise between Monte Carlo and mathematical optimization approaches, with the following specific advantages:

  • the use of any probability distribution functions to represent the prior knowledge of the system, and no assumptions with respect to the resulting posterior distribution,
  • sampling extreme parameter values is not necessarily a problem,
  • the algorithm tries to even out over- and underestimated regions of the parameter space as the realizations progress,
  • and no realizations are spent on the calculation of a Jacobian, which tends to make the method compatible with models that rely on a large number of parameters.

Figure 2: Comparison between the calibration-constrained Monte Carlo results (top chart) and the new uncertainty analysis method (bottom chart), for the same number of total runs (1000); accepted runs in dark blue, rejected runs in light blue. Right plot indicates NRMS of each model run from 1 to 1000 with the same colour legend as the hydrographs.


Figure 3: Comparison of the posterior probability distribution for K, generated by the calibration-constrained Monte Carlo method (top chart) and the new uncertainty analysis method (bottom chart), for the same number of total runs (1000).


Independent of the chosen uncertainty analysis method, what is important to understand is that trying to estimate predictive uncertainty is key to the modelling process. A rigorous quantification must be done with more or less sophisticated methods, such as the ones mentioned throughout this article; however, even these methods are meaningless without a good (and very likely multidisciplinary) hydrogeological understanding of the system, which must rely on adequate field information. They are also meaningless if unethical practices are present throughout the modelling process. Needless to say, a good hydrogeological understanding of the system in conjunction with appropriate modelling practices can often lead to successful results even without the use of advanced uncertainty analysis methods, as long as the modelling objectives are clear and the concepts of uncertainty and risk-benefit are present throughout the development of the project, as in tier-based risk assessments.



References

Doherty JE (2015). Calibration and Uncertainty Analysis for Complex Environmental Models. Watermark Numerical Computing, Brisbane, Australia.

White JT, Hunt RJ, Doherty JE, and Fienen MN (2020). PEST++ version 5, a parameter estimation and uncertainty analysis software suite optimized for large environmental models. U.S. Geological Survey Techniques and Methods Report, in press.