Configuration Arguments
This page lists the available configuration arguments. For examples of what a config file can look like, see the example configuration files.
Data entries
dataset: Defines which dataset will be used. Currently supported is camelsus (CAMELS-US dataset by Newman et al., 2015). The code is intended to support other datasets but might require specific adaptations, such as extending the parent class to handle differences in dataset structure or features.
concept_data_dir: Specifies the path to the data source required for the conceptual model. This path should be defined in the configuration file src/utils/data_dir.yml.
forcings: This entry can be ignored if the dataset is not camelsus, unless it is strictly required by a newly defined dataset. It can be either a string or a list of strings corresponding to forcing products in the CAMELS dataset.
  Examples: [daymet, maurer, maurer_extended, nldas].
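Because forcings accepts either a single string or a list of strings, code that reads the config usually normalizes it to a list first. The following is a minimal sketch of that step. The helper name normalize_forcings is hypothetical, and the check assumes that the four products listed in the examples above are the only valid ones, which the project may not enforce.

```python
from typing import List, Union

# Forcing products named in the examples above (assumed here to be the full set).
CAMELS_US_FORCINGS = {"daymet", "maurer", "maurer_extended", "nldas"}


def normalize_forcings(forcings: Union[str, List[str]]) -> List[str]:
    """Return `forcings` as a list, whether it was given as a string or a list.

    Hypothetical helper for illustration; not part of the project's code.
    """
    names = [forcings] if isinstance(forcings, str) else list(forcings)
    unknown = [name for name in names if name not in CAMELS_US_FORCINGS]
    if unknown:
        raise ValueError(f"Unknown forcing product(s): {unknown}")
    return names
```

With this sketch, normalize_forcings("daymet") and normalize_forcings(["daymet"]) both yield a one-element list, so downstream code only has to handle the list case.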
General experiment entries
basin_file: Specifies the full or relative path to a text file containing the basin IDs used for training, validation, and testing. Each line in the file should contain a single basin ID, as defined in the dataset.
train_start_date: Start date of the training period (first day of discharge) in the format DD/MM/YYYY. Corresponding pairs of start and end dates denote the different periods.
train_end_date: End date of the training period (last day of discharge) in the format DD/MM/YYYY.
valid_start_date: Start date of the validation period (first day of discharge) in the format DD/MM/YYYY.
valid_end_date: End date of the validation period (last day of discharge) in the format DD/MM/YYYY.
metrics: Specifies the list of metrics to calculate during validation (testing). Available metrics include: NSE, Alpha-NSE, Beta-NSE, FHV, FMS, FLV, KGE, Beta-KGE, Peak-Timing, Peak-MAPE, Pearson-r.
  Reference: For a full list of available metrics, see src/utils/metrics.
experiment_name: Defines the name of your experiment, which is used as the folder name (plus a date-time string suffix) for saving the model and results.
device: Which device to use, in the format cuda:0, cuda:1, etc. for GPUs, or cpu.
seed: Fixed random seed. If empty, a random seed is generated for this run.
precision: Sets the precision for the model. Supported options: float32, float64.
verbose: Specifies the verbosity level of the model's logging and progress display.
  0: Only log informational messages; progress bars are not shown.
  1: Show progress bars along with informational messages.
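To make the DD/MM/YYYY date format and the one-ID-per-line basin file concrete, here is a minimal sketch of how such entries could be parsed. The function names parse_period, parse_basin_ids, and read_basin_ids are hypothetical, and the dates and basin IDs in the usage note are purely illustrative; the project's own loading code may differ.

```python
from datetime import date, datetime
from pathlib import Path
from typing import List, Tuple

DATE_FORMAT = "%d/%m/%Y"  # DD/MM/YYYY, as described in the entries above


def parse_period(start: str, end: str) -> Tuple[date, date]:
    """Parse a start/end pair and check that the period is not reversed."""
    start_date = datetime.strptime(start, DATE_FORMAT).date()
    end_date = datetime.strptime(end, DATE_FORMAT).date()
    if start_date > end_date:
        raise ValueError(f"Start date {start} is after end date {end}")
    return start_date, end_date


def parse_basin_ids(text: str) -> List[str]:
    """Return one basin ID per non-empty line, kept as strings."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_basin_ids(basin_file: str) -> List[str]:
    """Read basin IDs from a text file with a single ID on each line."""
    return parse_basin_ids(Path(basin_file).read_text())
```

For example, parse_period("01/10/1980", "30/09/2000") returns a pair of date objects, and parse_basin_ids("01013500\n01022500\n") returns both IDs as strings. Keeping the IDs as strings preserves any leading zeros.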
Conceptual model entries
concept_model: Specifies the conceptual model to use. Supported models include exphydro. The code is intended to support other conceptual models but might require specific adaptations, such as extending the parent class to accommodate the specifics of a newly defined model.
  exphydro: A two-bucket model (water and snow) with 5 processes and 6 parameters (Höge et al., 2022).
ode_solver_lib: Specifies the library used for solving ODEs. Supported options include scipy and torchdiffeq.
  scipy: Solves ODEs using solve_ivp from scipy.integrate. Reference: SciPy Documentation.
    Supported methods: RK45, RK23, DOP853, Radau, BDF, LSODA.
    Example for adaptive-step solvers:
      ode_solver_lib: scipy
      odesmethod: RK23
      rtol: 1e-4
      atol: 1e-6
    Note: Methods such as euler and rk4 are not part of the scipy module and have been separately implemented in the model class.
    Example for fixed-step solvers:
      ode_solver_lib: scipy
      odesmethod: euler
      time_step: 0.5
  torchdiffeq: Solves ODEs using the torchdiffeq library. Reference: torchdiffeq documentation.
    Supported methods: euler, rk4, midpoint, adaptive_heun, bosh3, dopri5.
    Example:
      ode_solver_lib: torchdiffeq
      odesmethod: dopri5
      rtol: 1e-4
      atol: 1e-6
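The difference between the adaptive-step options (rtol, atol) and the fixed-step option (time_step) is easier to see with a small example. The sketch below shows what fixed-step euler and rk4 integration with a given time_step amounts to, in plain Python. It is an illustration only, not the implementation in the project's model class, and all function names are hypothetical.

```python
from typing import Callable, List

State = List[float]
Rhs = Callable[[float, State], State]


def euler_step(f: Rhs, t: float, y: State, h: float) -> State:
    """One explicit Euler step of size h."""
    return [yi + h * di for yi, di in zip(y, f(t, y))]


def rk4_step(f: Rhs, t: float, y: State, h: float) -> State:
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
    k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate_fixed(f: Rhs, y0: State, t0: float, t1: float,
                    time_step: float, method: str = "euler") -> State:
    """Advance y0 from t0 to t1 using a constant step size."""
    step = {"euler": euler_step, "rk4": rk4_step}[method]
    n_steps = round((t1 - t0) / time_step)
    t, y = t0, list(y0)
    for _ in range(n_steps):
        y = step(f, t, y, time_step)
        t += time_step
    return y


def linear_decay(t: float, y: State) -> State:
    """Toy right-hand side dy/dt = -y, used only to exercise the solvers."""
    return [-yi for yi in y]
```

Integrating linear_decay from 0 to 1 with time_step 0.5 illustrates the trade-off: euler takes two cheap steps and lands noticeably off the exact solution exp(-1), while rk4 with the same step is far closer. Adaptive-step methods choose the step size themselves to meet rtol and atol, which is why they take tolerances instead of time_step.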
Neural network entries
data_dir: Specifies the folder that contains the data obtained by running the conceptual model. The path should be src/data/data_dir; make sure the data is located in the correct folder.
nn_model: Specifies the neural network model to use. Supported models include mlp and lstm. The code is intended to support other neural network models but might require specific adaptations.
  mlp: A multi-layer perceptron model with fully connected layers.
  lstm: A Long Short-Term Memory model.
hidden_size: Specifies the number of hidden units in each layer of the neural network.
  Example: [32, 32, 32, 32, 32]
seq_length: Length of the input sequence. Only required for LSTM models.
nn_dynamic_inputs: Specifies the dynamic inputs to the neural network.
  Example: [s_snow, s_water, prcp, tmean]
nn_mech_targets: Specifies the mechanistic targets of the neural network (the neural network outputs).
  Example: [ps_bucket, pr_bucket, m_bucket, et_bucket, q_bucket]
target_variables: Specifies the main target variables for the neural network, i.e. the ones used to train the model.
  Example: [obs_runoff]
Note: The nn_dynamic_inputs, nn_mech_targets, and target_variables entries should be consistent with the variables in the dataset and be included as model_inputs, nn_mech_targets, and target_variables, respectively, in the concept_model entry defined in the file src/utils/concept_model_vars.yml.
loss_pretrain: Specifies the loss function to use during the pre-training phase. Supported options include nse and mae, but the code is intended to support other loss functions.
lr_pretrain: Specifies the learning rate for the pre-training phase.
epochs_pretrain: Specifies the number of epochs for the pre-training phase.
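The consistency requirement in the note above can be checked mechanically before training. The following is a minimal sketch of such a check. The function name find_inconsistent_vars is hypothetical, and both dictionaries are assumed to have already been loaded from the config file and from the concept_model entry of src/utils/concept_model_vars.yml; the actual contents of that file are not shown here.

```python
from typing import Dict, List

# Config entry -> key it must match in the concept_model entry, per the note above.
CONFIG_TO_VARS_KEY = {
    "nn_dynamic_inputs": "model_inputs",
    "nn_mech_targets": "nn_mech_targets",
    "target_variables": "target_variables",
}


def find_inconsistent_vars(cfg: Dict[str, List[str]],
                           model_vars: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, per config entry, the names missing from the model's variable lists."""
    problems: Dict[str, List[str]] = {}
    for cfg_key, vars_key in CONFIG_TO_VARS_KEY.items():
        allowed = set(model_vars.get(vars_key, []))
        missing = [name for name in cfg.get(cfg_key, []) if name not in allowed]
        if missing:
            problems[cfg_key] = missing
    return problems


# Illustrative values taken from the examples above; the real file may list more.
example_model_vars = {
    "model_inputs": ["s_snow", "s_water", "prcp", "tmean"],
    "nn_mech_targets": ["ps_bucket", "pr_bucket", "m_bucket", "et_bucket", "q_bucket"],
    "target_variables": ["obs_runoff"],
}
example_cfg = {
    "nn_dynamic_inputs": ["s_snow", "s_water", "prcp", "tmean"],
    "nn_mech_targets": ["ps_bucket", "pr_bucket", "m_bucket", "et_bucket", "q_bucket"],
    "target_variables": ["obs_runoff"],
}
```

An empty result from find_inconsistent_vars(example_cfg, example_model_vars) means the three entries agree; otherwise the returned dictionary names the offending variables for each entry.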
Hybrid model entries
data_dir: Same as in the Neural network entries.
hybrid_model: Specifies the hybrid model to use. Supported models include exphydroM100. The code is intended to support other hybrid models but might require specific adaptations.
  exphydroM100: A hybrid model that combines a conceptual model with a neural network model (Höge et al., 2022). See class ExpHydroM100 in src/modelzoo_hybrid/exphydroM100.py for more details.
concept_model: Same as in the Conceptual model entries.
ode_solver_lib: Same as in the Conceptual model entries, but only torchdiffeq is supported for hybrid models.
basin_file: Same as in the General experiment entries.
nn_model_dir: Specifies the path to the pre-trained neural network model.
  Note: If nn_model_dir is not specified, the model will be trained from scratch and all the Neural network entries should be defined in the configuration file.
scale_target_vars: Specifies whether to scale the target variables. If set to True, the target variables will be scaled using the mean and standard deviation of the training period.
loss: Specifies the loss function to use. Supported options include mse, nse, and nse-nh.
epochs: Specifies the number of epochs to train the model.
patience: Specifies the patience for early stopping.
clip_gradient_norm: If a value is given, gradients are clipped to that norm during training.
batch_size: Specifies the batch size for training. If set to -1, the whole dataset will be used in a single batch.
optimizer: Specifies the optimizer to use. Supported options include adam and sgd.
learning_rate: Learning rate. Can be either a single number (for a constant learning rate) or a dictionary. See How to adjust learning rate in the PyTorch documentation for more information.
  Example:
    learning_rate:
      initial: 0.001
      decay: 0.5
      decay_step_fraction: 2
  Note: The learning rate will be decayed by a factor of decay every decay_step_fraction epochs.
log_n_basins: Specifies the number of basins to log during training. If set to 0, no basins will be logged.
log_every_n_epochs: If set to a value greater than 0, logs figures and metrics and saves the model every n epochs.
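The two accepted forms of learning_rate can be illustrated with a short sketch. It reads the note above literally as step decay: the rate is multiplied by decay once every decay_step_fraction epochs. The function name learning_rate_at is hypothetical, and counting epochs from 0 is an assumption; the project's scheduler may count or round differently.

```python
from typing import Dict, Union

LearningRate = Union[float, Dict[str, float]]


def learning_rate_at(epoch: int, learning_rate: LearningRate) -> float:
    """Return the learning rate for a given epoch (epochs counted from 0, assumed)."""
    if not isinstance(learning_rate, dict):
        # A single number means a constant learning rate.
        return float(learning_rate)
    # Number of completed decay intervals before this epoch.
    n_decays = epoch // int(learning_rate["decay_step_fraction"])
    return learning_rate["initial"] * learning_rate["decay"] ** n_decays


# The dictionary form from the example above.
example_lr = {"initial": 0.001, "decay": 0.5, "decay_step_fraction": 2}
```

Under these assumptions, the example configuration keeps the rate at 0.001 for epochs 0 and 1, halves it for epochs 2 and 3, and halves it again from epoch 4 onward.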