Wgrib2 was designed to be parallelized by what may be called dataflow programming.
Data flows into a black box and data flows out. One way to parallelize is to
divide the data flow into N streams, process each stream separately and then recombine
the streams at the end of the processing. Wgrib2m parallelizes wgrib2 this way.
Because this approach uses pipes, its speed is limited by pipe speed,
disk speed, the number of CPUs (cores) on a node, and the
overhead of setting up and running the parallel job. Pipe speed can be increased
by increasing the pipe buffer size (Linux kernel 2.6.35+).
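The split/process/recombine pattern described above can be sketched in bash with
named pipes. This is only an illustration, not the wgrib2m code: text lines stand
in for grib messages, tr stands in for wgrib2, and the round-robin distribution
and the function name split_process_merge are assumptions made for the example.

```bash
# Sketch only: split stdin into N streams, process each stream in its own
# worker, then recombine. Lines stand in for grib messages and `tr` stands
# in for wgrib2; round-robin distribution is an assumption of this sketch.
split_process_merge() {
    local n=$1 tmp i
    tmp=$(mktemp -d) || return 1
    for ((i = 0; i < n; i++)); do
        mkfifo "$tmp/in.$i"
        # worker i reads its named pipe and writes its own output file
        tr 'a-z' 'A-Z' < "$tmp/in.$i" > "$tmp/out.$i" &
    done
    # distributor: open every pipe first (so no worker waits forever on an
    # unused pipe), then send record k to stream (k-1) mod n
    awk -v n="$n" -v dir="$tmp" '
        BEGIN { for (i = 0; i < n; i++) printf "" > (dir "/in." i) }
        { print > (dir "/in." ((NR - 1) % n)) }'
    wait                      # all workers have finished
    cat "$tmp"/out.*          # recombine the streams
    rm -rf "$tmp"
}
```

For example, printf 'a\nb\nc\nd\ne\n' | split_process_merge 2 prints A, C, E, B, D
(one per line): every record is processed, but the recombined order differs from
the input order because each stream's output is concatenated in turn.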
Wgrib2m parallelizes a wgrib2 command by dividing the data flow into N
streams which are processed independently. Only a limited number of output
options are supported. Note that the inventory from wgrib2m is in a different
order than the inventory from a wgrib2 command. For -new_grid to work,
the input file must have UGRD and VGRD in the same grib message (the default
for NCEP production forecasts). Note that wgrib2m follows the copygb convention:
only UGRD and VGRD are interpolated using vector interpolation.
The wgrib2m command was written to regrid NCEP-production-type
files (no submessages, except that U and V are in the same grib message),
so the input and output files follow this convention. The script was later
extended to handle other grib-writing output options. Note that this
script has been tested using the bash shell and requires named pipes.
It has not been tried on Windows and is unlikely to work there.
wgrib2 output options supported by wgrib2m
- all other output options should not be used
wgrib2m restrictions on the output options
- Each output option must write to a different file
- Each output option must write to the output file for every record processed.
- You can use the -match option because -match selects the record prior to processing
- You cannot use -if to select the record to be output (see restriction 2)
- Output options can only write grib (e.g. -netcdf and -csv are not allowed)
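The first restriction can be checked before a run. The helper below is a
hypothetical pre-flight check, not part of wgrib2m; it assumes you pass it the
output file names you intend to use and it succeeds only if they are all different.

```bash
# Hypothetical helper (not part of wgrib2m): return success only if every
# output file name given as an argument is distinct.
outputs_are_distinct() {
    local dups
    # uniq -d prints one copy of each name that occurs more than once
    dups=$(printf '%s\n' "$@" | sort | uniq -d)
    [ -z "$dups" ]
}
```

For example, outputs_are_distinct out221.grb out3.grb succeeds, while
outputs_are_distinct out.grb out.grb fails.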
wgrib2 reading options supported by wgrib2m
- processing a regular grib file (not a pipe)
- -i (reading the inventory from stdin), added in v1.1
- -import will cause problems
wgrib2 options that work differently in wgrib2m
Some options still work but may behave differently in wgrib2m.
Since the processing is split into N streams, each copy of
wgrib2 will not see all the records. For example, you
may want to calculate the 1000 mb-500 mb thickness. If one
copy of wgrib2 gets the 1000 mb Z and another gets the 500 mb Z,
then you can't calculate the thickness. This will affect
wgrib2m N (wgrib2 subset options)
for N > 1, executes wgrib2 (wgrib2 subset options) in N streams
for N < -1, produces a script that runs -N streams (e.g. N = -4 gives 4 streams)
grep ":HGT:" nam.idx | wgrib2m 3 -i nam.grb2 -set_grib_type c3 -grib_out HGT.c3
wgrib2m 4 IN.grb -set_grib_type c3 -new_grid_winds -new_grid ncep grid 221 out22.grb -new_grid ncep grid 3 out3.grb
Using CentOS 6.4 on an FX 8320 (8 cores), there was little speedup with N > 4 when
using 1 MB grib messages. With grib messages smaller than 64 KB (the pipe buffer size), the
processing scaled better with the number of streams.
Code location: http://www.ftp.cpc.ncep.noaa.gov/wd51we/wgrib2_aux_progs/wgrib2m