Configuration Arguments ======================= This page provides a list of possible configuration arguments. For examples of how a config file could look like, check out the files: - `tutorials/01-RunConceptModel/config_run_concept.yml `_ - `tutorials/02-PretrainNNmodel/config_pretrain_nn_model.yml `_ Data entries ------------ - ``dataset``: Defines which dataset will be used. Currently supported is ``camelsus`` (CAMELS-US dataset by `Newman et al., 2015 `_. The code is intended to support other datasets but might require specific adaptations, such as extending the parent class to handle differences in dataset structure or features. - ``concept_data_dir``: Specifies the path to the data source required for the conceptual model. This path should be defined in the configuration file ``src/utils/data_dir.yml``. - ``forcings``: This entry can be ignored if the dataset is not ``camelsus`` or unless it is strictly required by a newly defined dataset. It can be either a string or a list of strings corresponding to forcing products in the CAMELS dataset. Examples: ``[daymet, maurer, maurer_extended, nldas]``. General experiment entries -------------------------- - ``basin_file``: Specifies the full or relative path to a text file containing the basin IDs used for training, validation, and testing. Each line in the file should contain a single basin ID, as defined in the dataset. - ``train_start_date``: Start date of the training period (first day of discharge) in the format `DD/MM/YYYY`. Corresponding pairs of start and end dates denote the different periods. - ``train_end_date``: End date of the training period (last day of discharge) in the format `DD/MM/YYYY`. - ``valid_start_date``: Start date of the validation period (first day of discharge) in the format `DD/MM/YYYY`. - ``valid_end_date``: End date of the validation period (last day of discharge) in the format `DD/MM/YYYY`. - ``metrics``: Specifies the list of metrics to calculate during validation (testing). Available metrics include: `NSE`, `Alpha-NSE`, `Beta-NSE`, `FHV`, `FMS`, `FLV`, `KGE`, `Beta-KGE`, `Peak-Timing`, `Peak-MAPE`, `Pearson-r`. **Reference**: For a full list of available metrics, see `src/utils/metrics`. - ``experiment_name``: Defines the name of your experiment that will be used as a folder name (+ date-time string suffix) to save the model and results., - ``device``: Which device to use in format of ``cuda:0``, ``cuda:1``, etc, for GPUs or ``cpu``. - ``seed``: Fixed random seed. If empty, a random seed is generated for this run. - ``precision``: Sets the precision for the model. Supported options: `float32`, `float64`. - ``verbose``: Specifies the verbosity level of the model's logging and progress display. - ``0``: Only log informational messages; progress bars are not shown. - ``1``: Show progress bars along with informational messages. Conceptual model entries ------------------------ - ``concept_model``: Specifies the conceptual model to use. Supported models include ``exphydro``. The code is intended to support other conceptual models but might require specific adaptations, such as extending the parent class to accommodate the specifics of a newly defined model. ``exphydro``: A two-bucket model (water and snow) with 5 processes and 6 parameters. (`Höge et al., 2022. `_) - ``ode_solver_lib``: Specifies the library used for solving ODEs. Supported options include ``scipy`` and ``torchdiffeq``. - ``scipy``: Solves ODEs using ``solve_ivp`` from ``scipy.integrate``. Reference: `SciPy Documentation `_. Supported methods: ``RK45``, ``RK23``, ``DOP853``, ``Radau``, ``BDF``, ``LSODA``. Example: For adaptive-step solvers: - ``ode_solver_lib``: ``scipy`` - ``odesmethod``: ``RK23`` - ``rtol``: ``1e-4`` - ``atol``: ``1e-6`` **Note**: Methods such as ``euler`` and ``rk4`` are not part of the `scipy` module and have been separately implemented in the model class. For fixed-step solvers: - ``ode_solver_lib``: ``scipy`` - ``odesmethod``: ``euler`` - ``time_step``: ``0.5`` - ``torchdiffeq``: Solves ODEs using the ``torchdiffeq`` library. Reference: `torchdiffeq documentation `_. Supported methods: ``euler``, ``rk4``, ``midpoint``, ``adaptive_heun``, ``bosh3``, ``dopri5``. Example: - ``ode_solver_lib``: ``torchdiffeq`` - ``odesmethod``: ``dopri5`` - ``rtol``: ``1e-4`` - ``atol``: ``1e-6`` Neural network entries ---------------------- - ``data_dir``: Specifies the folder that contains the data obtained by running the conceptual model. The path should be: ``src/data/data_dir`` - beware of locating the data in the correct folder. - ``nn_model``: Specifies the neural network model to use. Supported models include ``mlp`` and ``lstm``. The code is intended to support other neural network models but might require specific adaptations. ``mlp``: A multi-layer perceptron model with fully connected layers. ``lstm``: A Long Short-Term Memory model. - ``hidden_size``: Specifies the number of hidden units in each layer of the neural network. Example: ``[32, 32, 32, 32, 32]`` - ``seq_length``: Length of the input sequence. Only required for LSTM models. - ``nn_dynamic_inputs``: Specifies the dynamic inputs to the neural network. Example: ``[s_snow, s_water, prcp, tmean]`` - ``nn_mech_targets``: Specifies the mechanistic targets to the neural network (neural network outputs). Example: ``[ps_bucket, pr_bucket, m_bucket, et_bucket, q_bucket]`` - ``target_variables``: Specifies the main target variables for the neural network - the one that will be used to train the model. Example: ``[obs_runoff]`` **Note**: The ``nn_dynamic_inputs``, ``nn_mech_targets``, adn ``target_variables`` entries should be consistent with the variables in the ``dataset`` and be inluded as ``model_inputs``, ``nn_mech_targets``, and ``target_variables``, respectively, in the ``concept_model`` entry definded in the file ``src/utils/concept_model_vars.yml``. - ``loss_pretrain``: Specifies the loss function to use during the pre-training phase. Supported options include `nse` and `mae`, but the code is intended to support other loss functions. - ``lr_pretrain``: Specifies the learning rate for the pre-training phase. - ``epochs_pretrain``: Specifies the number of epochs for the pre-training phase. Hybrid model entries -------------------- ``data_dir``: Same as in the **Neural network entries**. - ``hybrid_model``: Specifies the hybrid model to use. Supported models include ``exphydroM100``. The code is intended to support other hybrid models but might require specific adaptations. ``exphydroM100``: A hybrid model that combines a conceptual model with a neural network model. (`Höge et al., 2022. `_). See ``class ExpHydroM100`` in `src/modelzoo_hybrid/exphydroM100.py` for more details. - ``concept_model``: Same as in the **Conceptual model entries**. - ``ode_solver_lib``: Same as in the **Conceptual model entries** but only ``torchdiffeq`` is supported for hybrid models. - ``basin_file``: Same as in the **General experiment entries**. - ``nn_model_dir``: Specifies the path to the pre-trained neural network model. **Note**: If ``nn_model_dir`` is not specified, the model will be trained from scratch and all the **Neural network entries** should be defined in the configuration file. - ``scale_target_vars``: Specifies whether to scale the target variables. If set to `True`, the target variables will be scaled using the `mea` and `standard deviation` of the training period. - ``loss``: Specifies the loss function to use. Supported options include `mse`, `nse`, and `nse-nh`. - ``epochs``: Specifies the number of epochs to train the model. - ``patience``: Specifies the patience for early stopping. - ``clip_gradient_norm``: If a value, clips the gradients during training to that norm. - ``batch_size``: Specifies the batch size for training. If set to `-1`, the whole dataset will be used in a single batch. - ``optimizer``: Specifies the optimizer to use. Supported options include `adam` and `sgd`. - ``learning_rate``: Learning rate. Can be either a single number (for a constant learning rate) or a dictionary. See `How to adjust learning rate `_ in the Pytorch documentation for more information. Example: ``learning_rate``: - ``initial``: ``0.001`` - ``decay``: ``0.5`` - ``decay_step_fraction``: ``2`` **Note**: The learning rate will be decayed by a factor of ``decay`` every ``decay_step_fraction`` epochs. - ``log_n_basins``: Specifies the number of basins to log during training. If set to `0`, no basins will be logged. - ``log_every_n_epochs``: If a value and greater than `0`, logs figures and metrics, and saves the model after each `n` epochs.