Debugging
Overview
Teaching: 30 min
Exercises: 15 min
Questions
How can I handle errors/warnings?
Objectives
Fix a broken recipe
Every user encounters errors. Once you know why certain types of errors occur, they become much easier to fix. The good news is that ESMValTool records its output messages in log files, which can be used for debugging or for monitoring a run. This lesson explains the different types of errors and when you are likely to encounter them.
Log files
Each time we run ESMValTool, it produces a new output directory. This directory should contain the run folder that is automatically generated by ESMValTool. To examine it, we run recipe_example.yml, which can be found in Setup. Let’s download it to our working directory esmvaltool_tutorial, which was created during Configuration.
In a new terminal, go to our working directory esmvaltool_tutorial, where the file recipe_example.yml is located, and run the recipe:
cd esmvaltool_tutorial
esmvaltool run recipe_example.yml
esmvaltool: command not found
The shell reports this error because the conda environment esmvaltool has not been activated, so the esmvaltool command cannot be found. To fix the error, activate the environment before running the recipe:
conda activate esmvaltool
conda environment
More information about the conda environment can be found at Installation.
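The same check can be scripted. The following is a minimal sketch that only tests whether a command can be found on the current PATH; the helper name command_available is ours and is not part of ESMValTool.

```python
import shutil


def command_available(name: str) -> bool:
    """Return True if `name` resolves to an executable on the current PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if command_available("esmvaltool"):
        print("esmvaltool found on PATH")
    else:
        # Typical cause in this lesson: the conda environment is not active.
        print("esmvaltool not found; try: conda activate esmvaltool")
```

If the second message is printed, activating the environment as shown above is the first thing to try.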
After activating the environment, run the recipe again. Once it finishes, let’s change the working directory to the run folder and list its files:
cd esmvaltool_output/recipe_example_#_#/run
ls
diag_timeseries_temperature main_log_debug.txt main_log.txt recipe_example.yml resource_usage.txt
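Because every run creates a new recipe_example_#_# directory, it can be handy to locate the most recent one automatically. The sketch below is one way to do this, under the assumption that the newest run is the directory with the latest modification time; the function name latest_run_dir is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(output_root: str, recipe_name: str = "recipe_example") -> Optional[Path]:
    """Return the `run` folder of the newest `<recipe_name>_*` directory, or None."""
    candidates = [p for p in Path(output_root).glob(f"{recipe_name}_*") if p.is_dir()]
    if not candidates:
        return None
    # Assumption: the most recently modified directory belongs to the latest run.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return newest / "run"


if __name__ == "__main__":
    print(latest_run_dir("esmvaltool_output"))
```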
ESMValTool writes the output messages, warnings, and any errors that occur during pre-processing to main_log_debug.txt and main_log.txt. To inspect them, we can look inside the files. For example:
cat main_log.txt
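Log files can get long, so it often helps to show only warnings and errors. The sketch below assumes that the log level (INFO, WARNING, ERROR) appears as a separate word among the first few words of a line, as it does in the messages quoted in this lesson; the layout of your log files may differ, and select_levels is a hypothetical helper.

```python
from typing import Iterable, List, Sequence


def select_levels(lines: Iterable[str],
                  levels: Sequence[str] = ("WARNING", "ERROR")) -> List[str]:
    """Keep lines whose first five words include one of the given log levels."""
    selected = []
    for line in lines:
        text = line.rstrip("\n")
        # Assumption: the level is one of the leading whitespace-separated words.
        if any(word in levels for word in text.split()[:5]):
            selected.append(text)
    return selected


if __name__ == "__main__":
    try:
        with open("main_log.txt", encoding="utf-8") as handle:
            for entry in select_levels(handle):
                print(entry)
    except FileNotFoundError:
        print("main_log.txt not found in the current directory")
```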
Now, let’s have a look inside the folder diag_timeseries_temperature:
cd diag_timeseries_temperature/timeseries_diag
ls
log.txt resource_usage.txt settings.yml
In log.txt, ESMValTool writes the output messages, warnings, and possible errors that are related to the diagnostic script.
If you encounter an error and don’t know what it means, it is important to read the log information. Sometimes knowing where the error occurred is enough to fix it, even if you don’t entirely understand the message. However, note that you may not always be able to find the error or fix it. In that case, the ESMValTool community can help you figure out what went wrong.
Different log files
In the run directory, there are two log files, main_log_debug.txt and main_log.txt. What are their differences?
Solution
The main_log_debug.txt file contains the output messages from the pre-processor, whereas main_log.txt shows general errors and warnings that might occur while running the recipe and the diagnostic script.
Let’s change some settings in the recipe to run a regional pre-processor.
We use a text editor called nano to open the recipe file:
cd ~/esmvaltool_tutorial
nano recipe_example.yml
Text editor side note
No matter what editor you use, you will need to know where it searches for and saves files. If you start it from the shell, it will (probably) use your current working directory as its default location. We use nano in examples here because it is one of the least complex text editors. Press ctrl + O to save the file, and then ctrl + X to exit nano.
See the recipe_example.yml
01 # ESMValTool
02 # recipe_example.yml
03 ---
04 documentation:
05   description: Demonstrate basic ESMValTool example
06
07   authors:
08     - demora_lee
09     - mueller_benjamin
10     - swaminathan_ranjini
11
12   maintainer:
13     - demora_lee
14
15   references:
16     - demora2018gmd
17     # Some plots also appear in ESMValTool paper 2.
18
19   projects:
20     - ukesm
21
22 datasets:
23   - {dataset: HadGEM2-ES, project: CMIP5, exp: historical, mip: Omon, ensemble: r1i1p1, start_year: 1859, end_year: 2005}
24
25 preprocessors:
26   prep_timeseries: # For 0D fields
27     annual_statistics:
28       operator: mean
29
30 diagnostics:
31   # --------------------------------------------------
32   # Time series diagnostics
33   # --------------------------------------------------
34   diag_timeseries_temperature:
35     description: simple_time_series
36     variables:
37       timeseries_variable:
38         short_name: thetaoga
39         preprocessor: prep_timeseries
40     scripts:
41       timeseries_diag:
42         script: ocean/diagnostic_timeseries.py
Keys and values in recipe settings
The ESMValTool pre-processors cover a broad range of operations on the input data, such as time manipulation, area manipulation, land-sea masking, and variable derivation. Let’s add the preprocessor extract_region to the section prep_timeseries:
25 preprocessors:
26   prep_timeseries: # For 0D fields
27     annual_statistics:
28       operator: mean
29     extract_region:
30       start_longitude: -10
31       end_longitude: 40
32       start_latitude: 27
33       end_latitude: 70
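To build intuition for what these four bounds select, here is a minimal sketch that tests whether a single point falls inside the box. It is not how ESMValTool implements extract_region; the function in_region and its modulo-360 treatment of longitudes are our own assumptions for illustration.

```python
def in_region(lon: float, lat: float,
              start_longitude: float, end_longitude: float,
              start_latitude: float, end_latitude: float) -> bool:
    """Return True if the point (lon, lat) lies inside the longitude/latitude box.

    Longitudes are compared modulo 360, so 350 and -10 denote the same meridian.
    Limitation of this sketch: a box spanning the full 360 degrees is not handled.
    """
    if not (start_latitude <= lat <= end_latitude):
        return False
    width = (end_longitude - start_longitude) % 360
    offset = (lon - start_longitude) % 360
    return offset <= width


if __name__ == "__main__":
    box = dict(start_longitude=-10, end_longitude=40,
               start_latitude=27, end_latitude=70)
    print(in_region(350, 50, **box))  # same meridian as -10 degrees
    print(in_region(45, 50, **box))   # east of the box
```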
We also change the projects value from ukesm to tutorial:
19   projects:
20     - tutorial
Then, we save the file and run the recipe:
esmvaltool run recipe_example.yml
ValueError: Tag 'tutorial' does not exist in section 'projects' of esmvaltool/config-references.yml
2020-06-29 18:09:56,641 UTC [46055] INFO If you suspect this is a bug or need help,
please open an issue on https://github.com/ESMValGroup/ESMValTool/issues and
attach the run/recipe_*.yml and run/main_log_debug.txt files from the output directory.
The values of the keys authors, maintainer, projects, and references in the recipe must be known to ESMValTool:
- A list of ESMValTool authors, maintainers, and projects can be found in config-references.yml.
- ESMValTool references in BibTeX format can be found in the ESMValTool/esmvaltool/references directory.
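The rule above amounts to a membership check: every tag in the documentation section of the recipe must appear in a list of known tags. The following sketch illustrates the idea with plain dictionaries. The function unknown_tags and the layout of the known mapping are hypothetical and do not reproduce how ESMValTool reads config-references.yml.

```python
from typing import Dict, List, Set


def unknown_tags(documentation: Dict[str, object],
                 known: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per recipe key, the tags that are missing from `known`."""
    problems: Dict[str, List[str]] = {}
    for key, tags in documentation.items():
        if key not in known:
            continue  # free-text keys such as 'description' are not checked
        missing = [tag for tag in tags if tag not in known[key]]
        if missing:
            problems[key] = missing
    return problems


if __name__ == "__main__":
    documentation = {
        "description": "Demonstrate basic ESMValTool example",
        "authors": ["demora_lee"],
        "projects": ["tutorial"],
    }
    # Illustrative subset of known tags, not the real file contents.
    known = {"authors": {"demora_lee"}, "projects": {"ukesm"}}
    print(unknown_tags(documentation, known))
```

For the edited recipe, the check flags tutorial under projects, which matches the ValueError shown above.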
ESMValTool can’t locate the data
You are assisting a colleague with ESMValTool. The colleague changes the CMIP5 entry in project: CMIP5 to CMIP6 and runs the recipe. However, ESMValTool encounters an error like:
esmvalcore._recipe_checks.RecipeError: Missing data
2020-06-29 17:26:41,303 UTC [43830] INFO If you suspect this is a bug or need help, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues and attach the run/recipe_*.yml and run/main_log_debug.txt files from the output directory.
What suggestions would you give your colleague for fixing the error?
Solution
- Inspect main_log.txt
- Check user-config.yml to see if the correct directory for input data is specified
- Check the available data with respect to exp, mip, ensemble, start_year, and end_year
- Check the variable name in the diag_timeseries_temperature section of the recipe
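One of the checks above, whether the available files cover the requested years, can be sketched as follows. The sketch assumes that file names end in a date range such as _185912-186911.nc whose first four digits are the year; the file name in the example is hypothetical, and missing_years is not an ESMValTool function.

```python
import re
from typing import Iterable, List

# Assumption: names end in "_<start>-<end>.nc" and each date starts with the year.
DATE_RANGE = re.compile(r"_(\d{4})\d*-(\d{4})\d*\.nc$")


def missing_years(filenames: Iterable[str], start_year: int, end_year: int) -> List[int]:
    """Return the requested years that no file name claims to cover.

    A year counts as covered even if a file only contains part of it.
    """
    covered = set()
    for name in filenames:
        match = DATE_RANGE.search(name)
        if match:
            first, last = int(match.group(1)), int(match.group(2))
            covered.update(range(first, last + 1))
    return [year for year in range(start_year, end_year + 1) if year not in covered]


if __name__ == "__main__":
    files = ["thetaoga_Omon_HadGEM2-ES_historical_r1i1p1_185912-186911.nc"]  # hypothetical
    print(missing_years(files, 1859, 1870))
```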
Check pre-processed data
The setting save_intermediary_cubes in the configuration file can be used to save the pre-processed data. More information about this setting can be found at Configuration.
save_intermediary_cubes
Note that this setting should only be used for debugging, as it significantly slows down the recipe and increases disk usage because many output files need to be stored.
Check diagnostic script path
The result of the pre-processor is passed to the diagnostic_timeseries.py script, which is specified in the recipe as:
40     scripts:
41       timeseries_diag:
42         script: ocean/diagnostic_timeseries.py
The diagnostic scripts are located in the folder diag_scripts in the ESMValTool installation directory <path_to_esmvaltool>. To find where ESMValTool is located on your system, see Installation.
Let’s see what happens if we change the script path to:
40     scripts:
41       timeseries_diag:
42         script: diag_scripts/ocean/diagnostic_timeseries.py
esmvaltool run recipe_example.yml
esmvalcore._task.DiagnosticError: Cannot execute script 'diag_scripts/examples/diagnostic.py' (esmvaltool/diag_scripts/diag_scripts/examples/diagnostic.py): file does not exist.
2020-06-29 20:39:31,669 UTC [53008] INFO If you suspect this is a bug or need help, please open an issue on https://github.com/ESMValGroup/ESMValTool/issues and attach the run/recipe_*.yml and run/main_log_debug.txt files from the output directory.
The script path should be relative to the diag_scripts directory. This means that the script diagnostic_timeseries.py is located in <path_to_esmvaltool>/diag_scripts/ocean/.
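The rule can be illustrated with a small sketch: a relative script path is joined onto the diag_scripts directory, while an absolute path is used as given. This mirrors the behaviour described in this lesson rather than the actual ESMValTool code, and resolve_script is a hypothetical name.

```python
from pathlib import PurePosixPath


def resolve_script(script: str, diag_scripts_root: str) -> PurePosixPath:
    """Join a relative script path onto the diag_scripts root; keep absolute paths."""
    path = PurePosixPath(script)
    if path.is_absolute():
        return path
    return PurePosixPath(diag_scripts_root) / path


if __name__ == "__main__":
    root = "esmvaltool/diag_scripts"
    print(resolve_script("ocean/diagnostic_timeseries.py", root))
    # Repeating 'diag_scripts' in the recipe doubles it in the resolved path:
    print(resolve_script("diag_scripts/ocean/diagnostic_timeseries.py", root))
```

The second call shows why the path reported in the error message contains diag_scripts twice.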
Alternatively, the script path can be an absolute path. To examine this, we can download the script from the ESMValTool repository:
wget https://github.com/ESMValGroup/ESMValTool/raw/master/esmvaltool/diag_scripts/ocean/diagnostic_timeseries.py
One way to get the absolute path is to run:
readlink -f diagnostic_timeseries.py
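If readlink is not available on your system, a similar result can be obtained from Python. This is a minimal sketch using the standard library; absolute_path is our own helper name.

```python
from pathlib import Path


def absolute_path(filename: str) -> str:
    """Return the absolute path of `filename`, resolved against the current directory."""
    return str(Path(filename).resolve())


if __name__ == "__main__":
    print(absolute_path("diagnostic_timeseries.py"))
```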
Then we can update the script path and run the recipe:
40 scripts:
41 timeseries_diag:
42 script: <path_to_script>/diagnostic_timeseries.py
esmvaltool run recipe_example.yml
Now examine ./esmvaltool_output/recipe_example_#_#/run/diag_timeseries_temperature/timeseries_diag/ to see if it worked!
Available recipe and diagnostic scripts
ESMValTool provides a broad suite of recipes and diagnostic scripts for different disciplines like atmosphere, climate metrics, future projections, IPCC, land, ocean, ….
Re-running a diagnostic
Look at the main_log.txt file and answer the following question: How can you re-run the diagnostic script?
Solution
The main_log.txt file contains information on how to re-run the diagnostic script without re-running the pre-processors:
2020-06-29 20:36:32,844 UTC [52810] INFO To re-run this diagnostic script, run:
Running the command printed on the next line of the log re-runs the diagnostic.
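To pull that command out of a long log, one option is to search for the message and take the line after it. The sketch below assumes the command sits on the single line directly after the message, as described above. The helper rerun_command is hypothetical, and it returns only the first match when several diagnostics are logged.

```python
from typing import Iterable, Optional

MARKER = "To re-run this diagnostic script, run:"


def rerun_command(lines: Iterable[str]) -> Optional[str]:
    """Return the line following the first re-run message, or None if absent."""
    iterator = iter(lines)
    for line in iterator:
        if MARKER in line:
            following = next(iterator, None)
            return following.strip() if following is not None else None
    return None


if __name__ == "__main__":
    try:
        with open("main_log.txt", encoding="utf-8") as handle:
            print(rerun_command(handle))
    except FileNotFoundError:
        print("main_log.txt not found in the current directory")
```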
Memory issues
If you run out of memory, try setting max_parallel_tasks to 1 in the configuration file. Then, check how much memory the run needs by inspecting the file run/resource_usage.txt in the output directory. Using that number, you can increase the number of parallel tasks again to a value that fits the memory available on your system.
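The arithmetic behind this advice can be sketched as follows. It assumes that each parallel task needs roughly the peak memory observed in the single-task run and keeps some headroom for the rest of the system; the function name and the default headroom of 0.8 are our own choices, not ESMValTool settings.

```python
def suggest_max_parallel_tasks(peak_memory_gb: float,
                               available_memory_gb: float,
                               headroom: float = 0.8) -> int:
    """Estimate how many tasks fit in memory, never returning fewer than 1."""
    if peak_memory_gb <= 0:
        raise ValueError("peak_memory_gb must be positive")
    # Assumption: every task uses about as much memory as the single-task run.
    usable = available_memory_gb * headroom
    return max(1, int(usable // peak_memory_gb))


if __name__ == "__main__":
    # Example numbers only: 4 GB peak per task on a machine with 32 GB.
    print(suggest_max_parallel_tasks(4.0, 32.0))
```

Treat the result as a starting point and check run/resource_usage.txt again after increasing max_parallel_tasks.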
Key Points
There are three different kinds of log files: main_log.txt, main_log_debug.txt, and log.txt.