Climate Prediction Center
 
 

 
 

wgrib2: multi-processing with -for_n and -n

Introduction

wgrib2 is a serial program: it runs on only one CPU, even on a computer with 32 available CPUs. Wouldn't it be fun to run it on all 32 CPUs? (OK, the other users may complain, but you get the point.) Rather than pull out the MPI textbook, we'll show a script-level solution.

Assumptions

  • CPU time is longer than the I/O time
  • each record can be handled independently
  • multiple CPUs are available on the same machine/node
  • a two-CPU version is sufficient documentation

The inventory number, -n

Our first step is to add the inventory number to each inventory line. You can see the inventory number with the -n option. Once we have added the inventory number, we can have one copy of wgrib2 process the odd numbers and another process the even numbers.

Note that the inventory number is not the same as the record number, for several reasons: the order of processing may be read from standard input by -i, some messages may have submessages, and some records may be skipped by -match and other options.

Even and Odd, -for_n

The -for_n option is like the -for option except that it uses the inventory number rather than the record number.

To select the odd records to process, you use the option -for_n 1:99999:2. Here, 99999 is just a large number greater than the number of records. You could also use -for_n 1::2. To process the even fields, use -for_n 2::2.
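As a sanity check, the two selections can be mimicked without wgrib2. The sketch below assumes a hypothetical file with 10 records and uses seq as a stand-in for the inventory numbers. It only illustrates the arithmetic of the two strides, not wgrib2 itself.

```shell
nrec=10                       # assumed number of records (hypothetical)
odd=$(seq 1 2 "$nrec")        # numbers that -for_n 1::2 would select
even=$(seq 2 2 "$nrec")       # numbers that -for_n 2::2 would select

# Every inventory number should be claimed by exactly one of the two jobs.
both=$(printf '%s\n%s\n' "$odd" "$even" | sort -n | tr '\n' ' ')

echo "odd:  $(echo $odd)"
echo "even: $(echo $even)"
echo "both: $both"
```

If the two lists, sorted together, give 1 through nrec with no gaps or repeats, the two jobs cover the file exactly once.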

Pipes, fifo

Now that we can run wgrib2 on the even and odd records, how do we produce the output? Here is a simple way.
f=file
# run both halves in the background so they use two CPUs, then wait for both
wgrib2 $f -ijsmall_grib 1:10 1:10 /tmp/p1 -for_n 1:88888:2 >/dev/null &
wgrib2 $f -ijsmall_grib 1:10 1:10 /tmp/p2 -for_n 2:88888:2 >/dev/null &
wait
cat /tmp/p1 /tmp/p2 >output
rm /tmp/p1 /tmp/p2
The above method is not optimal: it uses temporary files and rearranges the order of the records (all the odd-numbered records come first, followed by all the even-numbered ones). A better method is to use pipes and a simple program that reads the pipes and writes out a merged output file.
f=file
mkfifo /tmp/p1.$$
mkfifo /tmp/p2.$$
wgrib2 $f -ijsmall_grib 1:10 1:10 /tmp/p1.$$ -for_n 1::2 -flush >/dev/null &
wgrib2 $f -ijsmall_grib 1:10 1:10 /tmp/p2.$$ -for_n 2::2 -flush >/dev/null &
gmerge output /tmp/p1.$$ /tmp/p2.$$
rm /tmp/p1.$$ /tmp/p2.$$
The program, gmerge, simply reads a grib message from p1 and writes it to output. Then it reads a grib message from p2 and writes it to output. This is repeated until there is no data left (the pipes are closed). The source code is available. The -flush option makes sure that output buffers are flushed (i.e., written) before going to the next record. Otherwise, some writes may be delayed while they are consolidated into larger writes.
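The alternating read can be tried out without GRIB data. The following is a minimal sketch, not gmerge itself: it assumes text lines stand in for grib messages, two awk filters stand in for the two wgrib2 jobs, and a shell loop stands in for the merge program. The temporary directory and variable names are made up for the illustration.

```shell
tmp=$(mktemp -d)
mkfifo "$tmp/p1" "$tmp/p2"

# Stand-ins for the two wgrib2 jobs: odd lines to p1, even lines to p2.
seq 1 10 | awk 'NR % 2 == 1' > "$tmp/p1" &
seq 1 10 | awk 'NR % 2 == 0' > "$tmp/p2" &

# Open both pipes for reading on file descriptors 3 and 4.
exec 3< "$tmp/p1" 4< "$tmp/p2"

# Alternate: one item from p1, then one from p2, until p1 is empty.
merged=""
while IFS= read -r a <&3; do
    merged="$merged$a "
    if IFS= read -r b <&4; then
        merged="$merged$b "
    fi
done

exec 3<&- 4<&-      # close the pipes
wait                # reap the two background jobs
rm -r "$tmp"
echo "$merged"
```

Because the reader takes one item from each pipe in turn, the merged output comes back in the original order, which the cat of two temporary files does not do.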

Limitations

The above script-level multi-tasking is very simple and is limited to the available CPUs on the "node". For large MPP computers, the number of CPUs on a node is much smaller than the total number of CPUs that are available.
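Within one node, the odd/even split generalizes to a stride equal to the number of workers. The sketch below only prints the -for_n argument each worker would get. The helper name for_n_args is hypothetical, the CPU count is taken from getconf with a fallback of 2, and whether gmerge accepts more than two pipes is not stated here, so the merge step is left out.

```shell
# Number of online CPUs on this node; fall back to 2 if getconf cannot tell.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)

# for_n_args N: print one -for_n argument per worker (hypothetical helper).
# Worker i handles inventory numbers i, i+N, i+2N, ...
for_n_args() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        echo "-for_n $i::$n"
        i=$((i + 1))
    done
}

for_n_args "$ncpu"
```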

Usage

-n                  prints the inventory number
-for_n I:J:K        same as for n = I to J by K
-for_n I:J          same as for n = I to J by 1
-for_n I::K         same as for n = I to MAX_INTEGER by K
-for_n I            same as for n = I to MAX_INTEGER by 1
See also: -for
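
The four forms can be checked with a small shell function that lists the inventory numbers a given argument would select. This is an illustrative sketch only: expand_for_n is a made-up name, the number of records is passed in by hand, and an open or oversized upper limit is clamped to that number rather than to MAX_INTEGER.

```shell
# expand_for_n SPEC NREC: list the numbers that "-for_n SPEC" would select
# from a file assumed to hold NREC inventory lines (hypothetical helper).
expand_for_n() {
    spec=$1
    nrec=$2
    # Split SPEC on ':'; missing or empty fields become empty strings.
    IFS=: read -r first last step <<EOF
$spec
EOF
    # A missing J means "to the end"; a J past the end is clamped.
    if [ -z "$last" ] || [ "$last" -gt "$nrec" ]; then
        last=$nrec
    fi
    seq "$first" "${step:-1}" "$last"
}

expand_for_n 1:99999:2 10    # I:J:K
expand_for_n 2:4 10          # I:J
expand_for_n 2::2 10         # I::K
expand_for_n 8 10            # I
```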

NOAA/ National Weather Service
National Centers for Environmental Prediction
Climate Prediction Center
5200 Auth Road
Camp Springs, Maryland 20746
Climate Prediction Center Web Team
Page last modified: May 15, 2009