Blog Posts


GEOS-5 DAS timings

Posted by Amidu Oloso Jul 3, 2008

Below is a high-level summary of elapsed times, in seconds, for the different parts of GEOS-5 DAS. A step that appears more than once has one line per invocation, followed by its total:

 


CopyAnaRS                  ---      15.48
Total CopyAnaRS            ---      15.48
-----
CopyGCMRS                  ---       0.14
Total CopyGCMRS            ---       0.14
-----
lnbcs                      ---       0.14
Total lnbcs                ---       0.14
-----
VerifyRS                   ---       0.05
Total VerifyRS             ---       0.05
-----
acquire                    ---      17.85
Total acquire              ---      17.85
-----
PreAnaQC                   ---       9.60
Total PreAnaQC             ---       9.60
-----
FixUnblocked               ---       0.01
Total FixUnblocked         ---       0.01
-----
FixEndian                  ---       9.54
Total FixEndian            ---       9.54
-----
AnalysisRun                ---    1366.24
AnalysisRun                ---    1253.35
AnalysisRun                ---    1252.58
AnalysisRun                ---    1241.28
Total AnalysisRun          ---    5113.45
-----
ssprepqc                   ---     109.31
ssprepqc                   ---      80.28
ssprepqc                   ---      93.44
ssprepqc                   ---      81.62
Total ssprepqc             ---     364.65
-----
pqc_init                   ---       0.02
pqc_init                   ---       0.24
pqc_init                   ---       0.28
pqc_init                   ---       0.19
Total pqc_init             ---       0.73
-----
pqc_fv2ss                  ---      59.83
pqc_fv2ss                  ---      48.39
pqc_fv2ss                  ---      47.61
pqc_fv2ss                  ---      47.51
Total pqc_fv2ss            ---     203.34
-----
pqc_ssprev                 ---      13.36
pqc_ssprev                 ---      10.11
pqc_ssprev                 ---      12.39
pqc_ssprev                 ---      11.11
Total pqc_ssprev           ---      46.97
-----
pqc_cqcht                  ---       3.93
pqc_cqcht                  ---       1.15
pqc_cqcht                  ---       3.77
pqc_cqcht                  ---       0.99
Total pqc_cqcht            ---       9.84
-----
pqc_raobcor                ---       6.20
pqc_raobcor                ---       1.93
pqc_raobcor                ---       5.40
pqc_raobcor                ---       1.53
Total pqc_raobcor          ---      15.06
-----
pqc_profcqc                ---       2.58
pqc_profcqc                ---       2.58
pqc_profcqc                ---       2.65
pqc_profcqc                ---       2.53
Total pqc_profcqc          ---      10.33
-----
pqc_arqc                   ---       0.75
pqc_arqc                   ---       0.69
pqc_arqc                   ---       0.79
pqc_arqc                   ---       0.70
Total pqc_arqc             ---       2.93
-----
pqc_acarsqc                ---       0.88
pqc_acarsqc                ---       0.55
pqc_acarsqc                ---       0.75
pqc_acarsqc                ---       0.87
Total pqc_acarsqc          ---       3.06
-----
pqc_cqcvad                 ---       0.68
pqc_cqcvad                 ---       0.56
pqc_cqcvad                 ---       0.67
pqc_cqcvad                 ---       0.55
Total pqc_cqcvad           ---       2.46
-----
pqc_oiqcc                  ---      20.72
pqc_oiqcc                  ---      13.64
pqc_oiqcc                  ---      18.75
pqc_oiqcc                  ---      15.28
Total pqc_oiqcc            ---      68.39
-----
analyzer                   ---    1255.12
analyzer                   ---    1252.26
analyzer                   ---    1251.19
analyzer                   ---    1240.60
Total analyzer             ---    4999.17
-----
make_satinfo.x             ---       0.11
make_satinfo.x             ---       0.14
make_satinfo.x             ---       0.13
make_satinfo.x             ---       0.14
Total make_satinfo.x       ---       0.52
-----
gsi.x                      ---    1215.49
gsi.x                      ---    1198.80
gsi.x                      ---    1210.60
gsi.x                      ---    1206.79
Total gsi.x                ---    4831.69
-----
sac.x                      ---       2.98
sac.x                      ---       2.48
sac.x                      ---       2.93
sac.x                      ---       2.32
Total sac.x                ---      10.72
-----
CoupleAnaToGcm             ---      24.99
CoupleAnaToGcm             ---      24.77
CoupleAnaToGcm             ---      21.10
CoupleAnaToGcm             ---      22.66
Total CoupleAnaToGcm       ---      93.51
-----
GcmRun                     ---    2101.75
GcmRun                     ---    2153.11
GcmRun                     ---    2172.61
GcmRun                     ---    1324.23
Total GcmRun               ---    7751.70
-----
gcm                        ---    2100.48
gcm                        ---    2152.09
gcm                        ---    2171.38
gcm                        ---    1277.94
Total gcm                  ---    7701.90
-----
MoveCheckpoint             ---       0.84
MoveCheckpoint             ---       0.46
MoveCheckpoint             ---       0.69
MoveCheckpoint             ---      45.69
Total MoveCheckpoint       ---      47.68
-----
CoupleGcmToAna             ---      60.68
CoupleGcmToAna             ---      66.93
CoupleGcmToAna             ---      65.89
CoupleGcmToAna             ---      14.45
Total CoupleGcmToAna       ---     207.94
-----
dyn2dyn3                   ---      47.71
dyn2dyn3                   ---      42.38
dyn2dyn3                   ---      52.06
dyn2dyn3                   ---       0.05
Total dyn2dyn3             ---     142.20
-----
diag2ods                   ---      37.34
Total diag2ods             ---      37.34
-----

 


Hello everyone and welcome to GEOS 5 Optimization: MERRA Edition! I'm your host, Shawn "Crazy Coder"  Freeman.

 

So what's this all about? Well, my specific task is to improve the performance of the history output operation of the GEOS5 model when generating netCDF/HDF files (flat files/GrADS is another story, to be told another time).

 

The performance issue? The output code is serial. This means the entire model has to wait for output generation to gather, regrid/interpolate (if necessary), and write out the data to a file at whatever interval the user requested. That adds up to a lot of idle time that the processors could be using for more useful things (such as painting the computer cabinets or mowing the fiber connections).

 

The obvious solution is to parallelize the output generation and make it asynchronous to the model itself. Unfortunately, the way the current MAPL and ESMF frameworks work appears to make this difficult without a fair amount of work. At least that's what I've gathered so far (I'm new to the model). The current task really doesn't have the time/resources for that kind of effort.

 

 

Can't do it. Time to go home... yeah, right. I will not be so easily defeated (Read: I like my job and want to keep it).

 

 

 

 

So here's my current solution. Have each process write out its own data to a file. This acts to parallelize the output. Then an "aggregator" running asynchronously to the model grabs all the files and consolidates them into a single output file. The individual processor files are uniquely marked so they won't be overwritten, and they are deleted after the aggregation. The aggregator would most likely be a Python daemon running on a separate node so as not to impact model operations.
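
To make the idea concrete, here is a minimal sketch of the two halves, under loudly stated assumptions: the names `write_part` and `aggregate`, the file naming scheme, and the use of JSON as a stand-in for netCDF/HDF are all hypothetical and chosen only to keep the example self-contained. The write-then-rename step is also an addition of this sketch, so the aggregator never picks up a half-written file. A real daemon would call `aggregate` in a polling loop.

```python
import json
import os
import tempfile
from pathlib import Path


def write_part(out_dir: Path, step: int, rank: int, rows: list) -> Path:
    """Model side: one rank writes only its own rows to a uniquely named file."""
    final = out_dir / f"hist.step{step:04d}.rank{rank:04d}.part.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps({"rank": rank, "rows": rows}))
    os.replace(tmp, final)  # atomic rename: a visible part file is always complete
    return final


def aggregate(out_dir: Path, step: int, n_ranks: int):
    """Aggregator side: merge one step's parts once every rank has written.

    Returns the merged file's path, or None if some ranks are still missing.
    """
    parts = sorted(out_dir.glob(f"hist.step{step:04d}.rank*.part.json"))
    if len(parts) < n_ranks:
        return None  # not all ranks have reported yet; try again later
    merged = []
    for part in parts:  # zero-padded ranks sort correctly, so rows stay in rank order
        merged.extend(json.loads(part.read_text())["rows"])
    target = out_dir / f"hist.step{step:04d}.json"
    target.write_text(json.dumps(merged))
    for part in parts:
        part.unlink()  # delete per-rank files only after the merged file exists
    return target


# Simulate four ranks writing one output step, then one aggregation pass.
with tempfile.TemporaryDirectory() as d:
    out = Path(d)
    for rank in range(4):
        write_part(out, step=1, rank=rank, rows=[[rank, rank + 0.5]])
    merged_path = aggregate(out, step=1, n_ranks=4)
    merged = json.loads(merged_path.read_text())
    leftovers = list(out.glob("*.part.json"))
    print(merged, leftovers)
```

Because the aggregator only reads files the ranks have finished and renamed, the model never blocks on it; the only coupling between the two is the shared directory and the naming convention.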

 

 

 

 

This approach will effectively eliminate the serial bottleneck for output generation; i.e., the model can continue computing without having to wait for the output generation to complete. That keeps those idle processors from getting spare tires.

 

 

 

 

There are drawbacks to this approach. The reliance on an external aggregation program/script to produce the final output file might not be palatable to some. It adds complexity and another possible point of failure. There may also be disk bandwidth contention from writing numerous smaller files in parallel (I'll need to test this to see if it is a concern, but it wouldn't affect the model per se). But given the constraints, this appears to be the best short-term approach.
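
One rough way to start probing the disk contention question is a small timing harness like the sketch below. It is only an assumption-laden stand-in: `time_writes` is a hypothetical helper, threads on one machine substitute for model processes spread across nodes, and the file count and size are arbitrary. Numbers from a laptop say little about a parallel filesystem, so the same comparison would need to be repeated on the target system.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def _write(path: Path, payload: bytes) -> int:
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # force data to disk so the timing includes real I/O
    return len(payload)


def time_writes(out_dir: Path, n_files: int, bytes_per_file: int, workers: int) -> float:
    """Write n_files files of bytes_per_file bytes with `workers` threads; return seconds."""
    payload = b"\0" * bytes_per_file
    paths = [out_dir / f"part{i:04d}.bin" for i in range(n_files)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        written = sum(pool.map(_write, paths, [payload] * n_files))
    elapsed = time.perf_counter() - start
    assert written == n_files * bytes_per_file
    return elapsed


# Same total volume (16 files of 1 MiB), written by one writer and then by 16 at once.
with tempfile.TemporaryDirectory() as d:
    serial = time_writes(Path(d), n_files=16, bytes_per_file=1 << 20, workers=1)
with tempfile.TemporaryDirectory() as d:
    parallel = time_writes(Path(d), n_files=16, bytes_per_file=1 << 20, workers=16)
print(f"1 writer: {serial:.3f} s, 16 writers: {parallel:.3f} s")
```

If the concurrent case is not meaningfully slower than the single-writer case for the same volume, contention is probably not the first thing to worry about.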

 

 

 

 

 

 

 

 

 

Tune in next $TIME_PERIOD for the next installment of the MERRA Output Optimization blog!

 

 

 

 

 

 

 

 

 

~Shawn

 

 

 

 

 

 
