How to create a workflow that skips gfs_post?

Hi-

We are switching over to ufs-srweather-app, and we don't need any of the post output. The MRW App allowed a "--workflow ufs-mrweather_wo_post" option for the create_newcase script to skip gfs_post. Is there an equivalent option when creating a new workflow for an SRW experiment, so that the post jobs are left out?

Thank you in advance!

-Paddy McCarthy

In reply to wmayfield

Thanks, Will-

When I add that line to the regional_workflow/ush/config.sh and run ./generate_FV3LAM_wflow.sh, I get this:

ERROR:
  From function:  "compare_config_scripts"
  In file:  "compare_config_scripts.sh"
  Full path to file:  "/glade/scratch/paddy/ufs-srweather-app/regional_workflow/ush/compare_config_scripts.sh"
The variable specified by var_name in the user-specified experiment/
workflow configuration file (EXPT_CONFIG_FN) does not appear in the de-
fault experiment/workflow configuration file (EXPT_DEFAULT_CONFIG_FN):
  EXPT_CONFIG_FN = "config.sh"
  EXPT_DEFAULT_CONFIG_FN = "config_defaults.sh"
  var_name = "RUN_TASK_RUN_POST"
Please assign a default value to this variable in the default configura-
tion file and rerun.
Exiting with nonzero status.
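
For context, the message above says that a variable set in config.sh has no default in config_defaults.sh. A rough way to see which names would trip that check is sketched below. The helper names list_vars and missing_defaults are made up for illustration, and the parsing is a simplification, not the actual logic of compare_config_scripts.sh.

```shell
# Sketch only: report variables assigned in a user config file that have
# no assignment in the defaults file. Assumes plain VAR=value lines.

# Print the names of variables assigned at the start of a line.
list_vars() {
  sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | sort -u
}

# Print variables found in the user config ($1) but not in the defaults ($2).
missing_defaults() {
  user_cfg=$1
  default_cfg=$2
  for var in $(list_vars "$user_cfg"); do
    grep -Eq "^[[:space:]]*${var}=" "$default_cfg" || echo "$var"
  done
}

# Example, run from regional_workflow/ush:
#   missing_defaults config.sh config_defaults.sh
```

Any name it prints is one the defaults file of that checkout does not know about, which usually means the checkout predates the option.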
 

Do I need to be running on the dev tree? I currently have v1.0.1.

 

Thanks again,

-Paddy.

Hi Will,

I think the most recent error that Paddy is getting is because we (the DTC SRW team) haven't updated the hash of regional_workflow in the Externals.cfg file (in ufs-srweather-app) to the HEAD of regional_workflow.  I've been meaning to do that but we need to do some tests first.  But Paddy can still try it; could you help him do that?  Thanks.

Gerard

In reply to gerard.ketefian

Thanks, Gerard-

I put the suggested parameter into config_defaults.sh instead of config.sh, and the error went away, but I think it's still trying to run post. The post jobs are still showing up in the rocotosphere, but my run hasn't gotten there yet...

 

-Paddy.

Hi Paddy, 

To try Gerard's suggestion I would recommend recloning the repo into a new directory (ufs-srweather-app-develop or something) and building there (leave the one you have as ufs-srweather-app-v1.0.1). 

1.  git clone -b develop https://github.com/ufs-community/ufs-srweather-app.git ufs-srweather-app-develop

2. Go into ufs-srweather-app-develop and edit the Externals.cfg to uncomment "branch=develop" and comment out "hash=bb54e59" (shown below)

[regional_workflow]
protocol = git
repo_url = https://github.com/NOAA-EMC/regional_workflow
# Specify either a branch name or a hash but not both.
branch = develop
#hash = bb54e59
local_path = regional_workflow
required = True

3. Continue with the getting started process by running ./manage_externals/checkout_externals etc.
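
If you would rather script step 2 than edit by hand, something like the following should work. It is only a sketch: the function name use_develop_regional_workflow is made up, and it assumes the [regional_workflow] section contains a commented "#branch = develop" line and an active "hash = ..." line, as described above. sed keeps a backup as Externals.cfg.bak.

```shell
# Sketch only: within the [regional_workflow] section of an Externals.cfg,
# uncomment the branch line and comment out the hash line.
use_develop_regional_workflow() {
  cfg=$1
  sed -i.bak '/^\[regional_workflow\]/,/^\[/{
    s/^#[[:space:]]*branch[[:space:]]*=/branch =/
    s/^hash[[:space:]]*=/#hash =/
  }' "$cfg"
}

# The three steps above, in order:
#   git clone -b develop https://github.com/ufs-community/ufs-srweather-app.git ufs-srweather-app-develop
#   cd ufs-srweather-app-develop
#   use_develop_regional_workflow Externals.cfg
#   ./manage_externals/checkout_externals
```

The edit is limited to the one section so that hashes pinned for the other externals are left alone.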

Let us know how that goes.

Jamie

Hi-

The model no longer runs for me using the regional_workflow development branch. After some effort I got everything compiled with the development regional workflow (I didn't realize for a bit that I needed a totally different environment to compile the dev version). Now I'm getting DEAD status on the initialization steps with the suggested combination:

paddy@cheyenne1:/glade/scratch/paddy/expt_dirs/test_CONUS_25km_GFSv15p2$ rocotostat -w FV3LAM_wflow.xml -d FV3LAM_wflow.db -v 10
       CYCLE                    TASK                       JOBID               STATE         EXIT STATUS     TRIES      DURATION
================================================================================================================================
201906150000               make_grid                     2332059                DEAD                   1         1           6.0
201906150000               make_orog                           -                   -                   -         -             -
201906150000          make_sfc_climo                           -                   -                   -         -             -
201906150000           get_extrn_ics                     2332060                DEAD                   1         1           6.0
201906150000          get_extrn_lbcs                     2332061                DEAD                   1         1           6.0
201906150000                make_ics                           -                   -                   -         -             -
201906150000               make_lbcs                           -                   -                   -         -             -
201906150000                run_fcst                           -                   -                   -         -             -
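
In case it is useful, here is a small filter that pulls just the DEAD task names out of output like the above. It is a sketch that assumes the column layout shown (CYCLE, TASK, JOBID, STATE, ...); the function name dead_tasks is made up.

```shell
# Sketch only: read rocotostat output on stdin and print the names of
# tasks whose STATE column (4th field) is DEAD.
dead_tasks() {
  awk '$4 == "DEAD" { print $2 }'
}

# Example, run from the experiment directory:
#   rocotostat -w FV3LAM_wflow.xml -d FV3LAM_wflow.db -v 10 | dead_tasks
```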

Is this related to Mike Kavulich's message to me about the model not running on cheyenne these days, after the recent downtimes? I'm starting to think I should wait until we are sure the SRW app runs on cheyenne before proceeding. 

However, the workflow *does* now skip the gfs_post jobs!

Thanks again for all the help!

-Paddy.
 

Hi Paddy,

Yes, this is related to Mike's message about the SRW App being broken on cheyenne.

One quick fix you can try is to comment out (or remove) the line "module purge" in the script "regional_workflow/ush/load_modules_run_task.sh".  It's around line 135.  Then relaunch the workflow by calling "./launch_FV3LAM_wflow.sh" from your main experiment directory.  If you've exceeded your maxtries for one or more tasks, first go into your FV3LAM_wflow.xml file (in your experiment directory) and search for "maxtries", and increase the corresponding number(s) as needed.
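
The same two edits can be scripted roughly as follows. This is a sketch, not a tested recipe: the function names are made up, and set_maxtries assumes the attribute appears in the XML as a literal number, e.g. maxtries="1". Note that it raises the limit for every task, so edit by hand if you only want to change one. Both functions leave a .bak copy of the original file.

```shell
# Sketch only: comment out any active "module purge" line in a script.
disable_module_purge() {
  sed -i.bak 's/^\([[:space:]]*\)module purge/\1#module purge/' "$1"
}

# Sketch only: set every numeric maxtries="N" attribute in a workflow
# XML file ($1) to the given value ($2).
set_maxtries() {
  xml=$1
  tries=$2
  sed -i.bak "s/maxtries=\"[0-9][0-9]*\"/maxtries=\"${tries}\"/g" "$xml"
}

# Example:
#   disable_module_purge regional_workflow/ush/load_modules_run_task.sh
#   cd <your experiment directory>
#   set_maxtries FV3LAM_wflow.xml 3
#   ./launch_FV3LAM_wflow.sh
```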

Gerard

Still failing... Now it's only make_grid that fails -- get_extrn_ics and get_extrn_lbcs are succeeding:

(NPL) paddy@cheyenne4:/glade/scratch/paddy/expt_dirs/test_CONUS_25km_GFSv15p2$ rocotostat -w FV3LAM_wflow.xml -d FV3LAM_wflow.db -v 10
       CYCLE                    TASK                       JOBID               STATE         EXIT STATUS     TRIES      DURATION
================================================================================================================================
201906150000               make_grid                     2335979                DEAD                   1         1          15.0
201906150000               make_orog                           -                   -                   -         -             -
201906150000          make_sfc_climo                           -                   -                   -         -             -
201906150000           get_extrn_ics                     2335980           SUCCEEDED                   0         1          16.0
201906150000          get_extrn_lbcs                     2335981           SUCCEEDED                   0         1          16.0
201906150000                make_ics                           -                   -                   -         -             -
201906150000               make_lbcs                           -                   -                   -         -             -
201906150000                run_fcst                           -                   -                   -         -             -
 

It looks as if libnetcdff.so.7 is not found. Module netcdf/4.8.1 is loaded when the job runs. I turned on DEBUG, and I've included the log/make_grid.log file contents at the end of this mail.
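
A quick way to confirm which shared libraries the executable cannot resolve is to run ldd on it in the same module environment the job uses. The filter below is a sketch for picking out the unresolved ones; the function name missing_libs is made up, and it assumes the usual "libname => not found" lines that ldd prints.

```shell
# Sketch only: read ldd output on stdin and print the names of
# libraries reported as "not found".
missing_libs() {
  awk '/not found/ { print $1 }'
}

# Example (executable path taken from the log):
#   ldd /glade/scratch/paddy/develop-ufs-srweather-app/bin/regional_esg_grid | missing_libs
```

If libnetcdff.so.7 shows up there, the loaded netcdf module is not putting a matching Fortran library on the loader path for that executable.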

Thanks again for all of your help,

-Paddy.

 

cat log/make_grid.log

The following have been reloaded with a version change:
  1) mpt/2.22 => mpt/2.19

+ /glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/ush/load_modules_run_task.sh make_grid /glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/jobs/JREGIONAL_MAKE_GRID

Initializing the shell function "module()" (and others) in order to be
able to use "module load ..." to load necessary modules ...

The following have been reloaded with a version change:
  1) hdf5/1.10.6 => hdf5/1.10.8


The following have been reloaded with a version change:
  1) g2/3.4.1 => g2/3.4.2     2) g2tmpl/1.9.1 => g2tmpl/1.10.0


The following have been reloaded with a version change:
  1) netcdf/4.7.4 => netcdf/4.8.1


Loading modules for task "make_grid" ...
Warning: library /glade/p/ral/jntp/UFS_CAM/ncar_pylib_20200427 is missing from NPL clone registry.

Now using NPL virtual environment at path:
    /glade/p/ral/jntp/UFS_CAM/ncar_pylib_20200427

Use deactivate to remove NPL from environment


Currently Loaded Modules:
  1) ncarenv/1.3          12) png/1.6.35      23) upp/10.0.6
  2) ncarcompilers/0.5.0  13) pio/2.5.2       24) gfsio/1.4.1
  3) cmake/3.18.2         14) esmf/8_1_1      25) sfcio/1.4.1
  4) hpc/1.1.0            15) fms/2020.04.03  26) landsfcutil/2.4.1
  5) intel/19.1.1         16) g2/3.4.2        27) bacio/2.4.1
  6) mkl/2020.0.1         17) g2tmpl/1.10.0   28) w3nco/2.4.1
  7) hpc-intel/19.1.1     18) ip/3.3.3        29) nemsio/2.5.2
  8) mpt/2.22             19) sp/2.3.3        30) nemsiogfs/2.5.3
  9) hpc-mpt/2.22         20) sigio/2.3.2     31) netcdf/4.8.1
 10) jasper/2.0.22        21) w3emc/2.7.3     32) wgrib2/2.0.8
 11) zlib/1.2.11          22) crtm/2.3.0      33) make_grid.local

 


Launching J-job (jjob_fp) for task "make_grid" ...
  jjob_fp = "/glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/jobs/JREGIONAL_MAKE_GRID"


========================================================================
Entering script:  "JREGIONAL_MAKE_GRID"
In directory:     "/glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/jobs"

This is the J-job script for the task that generates grid files.
========================================================================

========================================================================
Entering script:  "exregional_make_grid.sh"
In directory:     "/glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/scripts"

This is the ex-script for the task that generates grid files.
========================================================================

No arguments have been passed to the script in file

  "/glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/scripts/exregional_make_grid.sh"


Starting grid file generation...

Creating namelist file (rgnl_grid_nml_fp) to be read in by the grid
generation executable (exec_fp):
  rgnl_grid_nml_fp = "/glade/scratch/paddy/expt_dirs/test_CONUS_25km_GFSv15p2/grid/tmp/regional_grid.nml"
  exec_fp = "/glade/scratch/paddy/develop-ufs-srweather-app/bin/regional_esg_grid"
/glade/scratch/paddy/develop-ufs-srweather-app/bin/regional_esg_grid: error while loading shared libraries: libnetcdff.so.7: cannot open shared object file: No such file or directory
0.00user 0.00system 0:00.02elapsed 11%CPU (0avgtext+0avgdata 472maxresident)k
0inputs+0outputs (0major+49minor)pagefaults 0swaps

ERROR:
  From script:  "exregional_make_grid.sh"
  Full path to script:  "/glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/scripts/exregional_make_grid.sh"
Call to executable (exec_fp) that generates a ESGgrid-type regional grid
returned with nonzero exit code:
  exec_fp = "/glade/scratch/paddy/develop-ufs-srweather-app/bin/regional_esg_grid"
Exiting with nonzero status.

ERROR:
  From script:  "JREGIONAL_MAKE_GRID"
  Full path to script:  "/glade/scratch/paddy/develop-ufs-srweather-app/regional_workflow/jobs/JREGIONAL_MAKE_GRID"
Call to ex-script corresponding to J-job "JREGIONAL_MAKE_GRID" failed.
Exiting with nonzero status.