Climate Prediction "Resolution Independent" Trickle Data Interface for Distributed Computing

Abstract
Climate prediction programs currently deployed on distributed computing platforms have very high attrition rates (~10% participant loss per ~15% of simulation completed). In addition, these programs in their current form return almost no data. Although trickle 'work units' are now used to grant participant credit, little has been done to advance the technology.

Fundamental distributed computing problems relating to climate prediction that must be fixed

API Scope

Trickle messages, as of this time, are specified as follows.
Trickle messages let applications communicate with the server during the execution of a workunit.
These trickle messages are intended for applications that have long workunits (running over multiple days or weeks).



Trickle-up messages
Trickle-up messages go from application to server. They are handled by trickle handler daemons running on the server. Each message is tagged with a 'variety' (a character string). Each daemon handles messages of a particular variety. (This is used, typically, to distinguish different applications.)

Example uses:

Client-side API

To send a trickle-up message, call: int boinc_send_trickle_up(char* variety, char* text);
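
Because the call takes char* rather than const char*, an application that builds its message in a std::string needs writable buffers before calling it. The following is a minimal sketch of one way to do that, not the BOINC implementation. The helper send_trickle and the type alias TrickleSender are hypothetical names introduced here; the sender is passed in as a function pointer of the same shape as the call above so that the sketch compiles without the BOINC headers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same shape as the boinc_send_trickle_up signature quoted above.
// Passing it as a parameter keeps this sketch buildable without BOINC headers.
using TrickleSender = int (*)(char* variety, char* text);

// Hypothetical helper (not part of BOINC): copies both strings into writable,
// NUL-terminated buffers and forwards them to the sender.
// Returns whatever the sender returns.
int send_trickle(TrickleSender sender, const std::string& variety,
                 const std::string& text) {
    assert(sender != nullptr);
    std::vector<char> variety_buf(variety.begin(), variety.end());
    variety_buf.push_back('\0');
    std::vector<char> text_buf(text.begin(), text.end());
    text_buf.push_back('\0');
    return sender(variety_buf.data(), text_buf.data());
}
```

In an application linked against BOINC, the call might look like send_trickle(boinc_send_trickle_up, "climate_diag", payload), where "climate_diag" is a made-up variety name.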
 
Server-side API

To handle trickle-up messages, use a 'trickle_handler' daemon.
This is a program, based on sched/trickle_handler.cpp, linked with a function

int handle_trickle(MSG_FROM_HOST&);
struct MSG_FROM_HOST { ... };
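
As an illustration of what a handler body might do, the sketch below checks the variety and pulls one value out of the message text. It is a sketch under stated assumptions, not the code in sched/trickle_handler.cpp: the struct MsgFromHostSketch is a stand-in with assumed members (the real MSG_FROM_HOST fields are elided above), and the variety "climate_diag", the tag cpu_time, and the return codes are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Stand-in for MSG_FROM_HOST. Both members are assumptions for illustration.
struct MsgFromHostSketch {
    std::string variety;  // tag chosen by the sending application
    std::string xml;      // message body
};

// Returns the text between <tag> and </tag>, or an empty string if absent.
std::string extract_tag(const std::string& xml, const std::string& tag) {
    assert(!tag.empty());
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t begin = xml.find(open);
    if (begin == std::string::npos) return "";
    const std::size_t start = begin + open.size();
    const std::size_t end = xml.find(close, start);
    if (end == std::string::npos) return "";
    return xml.substr(start, end - start);
}

// Hypothetical handler: accepts one variety and reads the CPU time from it.
// Returns 0 when handled, nonzero otherwise (an assumed convention).
int handle_trickle_sketch(const MsgFromHostSketch& msg, double& cpu_seconds) {
    if (msg.variety != "climate_diag") return 1;  // not for this daemon
    const std::string value = extract_tag(msg.xml, "cpu_time");
    if (value.empty()) return 2;                  // field missing
    cpu_seconds = std::strtod(value.c_str(), nullptr);
    return 0;
}
```

A production handler would use a real XML parser; the string search here only keeps the example short.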




Overall design goals and principles

Model Data
Why needed
Model date range for trickle (start time, stop time)
Format START: YYYYYY-DDD; STOP: YYYYYY-DDD, forced leading zeros
Overall model stability
Should only be reported fortnightly
Ocean model stability
Not required for slab ocean models, but should be mandatory for non-slab models.
Land model stability
Should only be reported fortnightly
Flux correction stability
Should be reported every 7 days
Convection stability
Optional; not needed in models that have no support for convection
Clausius-Clapeyron trends
The C-C equation governs the water-holding capacity of the atmosphere. Ideally, there should be numbers for the Arctic, Mid-Latitudes (N & S hemispheres), Tropics and Antarctic.
CFL Stability
Three mathematicians named Courant, Friedrichs, and Lewy formulated a criterion that, if violated, leads to the "blowing up" of a finite-difference weather prediction model. The CFL criterion is: the speed of the fastest winds in the model must be less than or equal to the grid spacing divided by the time step. Update daily.
Model data version numbers
For version tracking as well as core science code tracking ... {CC=1; CFL=2; ...} one standardized string in one XML field, printable not binary.
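
The CFL criterion in the list above can be checked with a few lines of arithmetic. The sketch below computes the ratio u_max * dt / dx (often called the Courant number) and tests whether it is at most 1, which is the same statement as "fastest wind <= grid spacing / time step". The function names and the units (metres, seconds) are assumptions made for the example.

```cpp
#include <cassert>

// Courant number for the fastest wind: C = u_max * dt / dx.
double courant_number(double max_wind_m_per_s, double grid_spacing_m,
                      double time_step_s) {
    assert(grid_spacing_m > 0.0 && time_step_s > 0.0);
    return max_wind_m_per_s * time_step_s / grid_spacing_m;
}

// The criterion as stated above: u_max <= dx / dt, i.e. C <= 1.
bool cfl_satisfied(double max_wind_m_per_s, double grid_spacing_m,
                   double time_step_s) {
    return courant_number(max_wind_m_per_s, grid_spacing_m, time_step_s) <= 1.0;
}
```

With illustrative numbers (not taken from any particular model), a 100 km grid and an 1800 s time step allow winds up to about 55.6 m/s: cfl_satisfied(50.0, 100000.0, 1800.0) holds, while cfl_satisfied(60.0, 100000.0, 1800.0) does not.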


Global Data

Average temperature
Arbitrary
Average daily rainfall
Arbitrary
Average daily snowfall
Arbitrary
Number of High Pressure Cells (+ average size)
Shape and vector categorization is possible for up to 5 cells
Number of Low Pressure Cells (+ average size)
Shape and vector categorization is possible for up to 5 cells
Number of Median Pressure Cells
For statistical monitoring of the model
Pressure cell deviation
Separation between the highest and lowest pressure cells, expressed in standard deviations
Accumulated snowfall : Sea
TBA
Accumulated snowfall : Land
TBA
Accumulated rainfall : Sea
TBA
Accumulated rainfall : Land
TBA
Global data version numbers
For version tracking as well as core science code tracking
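
The pressure cell deviation item is open to more than one reading. One possible interpretation, offered only as a sketch, is the gap between the highest and lowest cell pressures divided by the standard deviation of all cell pressures. The function name, the choice of the population standard deviation, and the hPa unit are assumptions, not part of the proposal.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// (max - min) / sigma over the central pressures of all detected cells.
// Assumes the population standard deviation; returns 0 if all values are equal.
double pressure_cell_deviation(const std::vector<double>& pressures_hpa) {
    assert(pressures_hpa.size() >= 2);
    double sum = 0.0;
    for (double p : pressures_hpa) sum += p;
    const double mean = sum / pressures_hpa.size();

    double squared = 0.0;
    for (double p : pressures_hpa) squared += (p - mean) * (p - mean);
    const double sigma = std::sqrt(squared / pressures_hpa.size());
    if (sigma == 0.0) return 0.0;

    const auto extremes =
        std::minmax_element(pressures_hpa.begin(), pressures_hpa.end());
    return (*extremes.second - *extremes.first) / sigma;
}
```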


Regional Data

Number of High Pressure Cells (+ average size)
Tropics, Mid Latitudes, Arctic AND Americas, Europe-Africa, Asia, Australasian differential; vector data possible for up to 5 cells in this dataset
Number of Low Pressure Cells (+ average size)
Tropics, Mid Latitudes, Arctic AND Americas, Europe-Africa, Asia, Australasian differential; vector data possible for up to 5 cells in this dataset
Tropical average temperature
Area inside the Tropic of Capricorn & Tropic of Cancer
Tropical average temperature : LAND
Area inside the Tropic of Capricorn & Tropic of Cancer
Tropical average temperature : SEA
Area inside the Tropic of Capricorn & Tropic of Cancer
Mid Latitudes average temperature

Mid Latitudes average temperature : LAND

Mid Latitudes average temperature : SEA

Arctic & Antarctic average temperature
Using Arctic & Antarctic Geodesic Circle
Antarctica differential average temperature
Antarctica is mostly land, not sea as with the Arctic
Clausius-Clapeyron trends (Arctic, Tropics, Mid Latitudes)
The C-C equation governs the water-holding capacity of the atmosphere, but it may play out differently in different parts of the world due to seasonal variation and model input variation.
Ocean sector level average temperature
North Atlantic (North & South), Caribbean Sea, South Atlantic (North & South), Indian Ocean (North, Central & South), Pacific (North, Tropics & South), Tasman Sea, Mediterranean Sea, Southern Ocean (surrounds Antarctica) -- FOR NON SLAB OCEAN MODELS
"Cold Equator" trend line
Mainly for slab ocean models, although no model is totally immune from this effect; for monitoring only. If too frigid, abort the model!
Regional data version numbers
For version tracking as well as core science code tracking


National Level Data

SEE APPENDIX A
National level data is vital for the national weather agencies, and making national weather data available may vastly expand funding for this branch of science.
National Level data version numbers
For version tracking as well as core science code tracking


Optional Data

Pollution dispersion (National or Regional Level)
Useful for sulfur models; volcanoes and smokestacks are treated equally here.
Sea Ice granulation or solidification events
Trend data that applies to Antarctic and Arctic regions only; may require some math and climate database reads to find.
Sea Ice extent
Trend data that applies to Antarctic and Arctic regions only; a snapshot number.


Runtime Diagnostic Data

Trickle version number (software + science code)
The trickle software should be upgradeable during a simulation. The climate simulation programs should not even know it is there; apart from some geoid database locking policies, no changes should be needed.
CPU Time (seconds)
For granting credit to users
Average seconds per timestep
For user to track real CPU speed & credit calibration
Median seconds per timestep
For user to track real CPU speed & credit calibration
Minor exceptions handled (+ type)
For debugging and program optimization
Start & Stop record
For tracking restarts
Program loop diagnostic data
For beta testing
Hashsums for data groups
Data integrity
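
Reporting both the average and the median seconds per timestep is useful because a few very slow steps (for example, while the host is busy with other work) pull the average up but barely move the median. A minimal sketch of both calculations follows; the function names are hypothetical and the input is assumed to be a list of per-timestep durations in seconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean of the per-timestep durations.
double mean_seconds(const std::vector<double>& step_seconds) {
    assert(!step_seconds.empty());
    return std::accumulate(step_seconds.begin(), step_seconds.end(), 0.0) /
           step_seconds.size();
}

// Median of the per-timestep durations. The vector is taken by value so the
// caller's data is not reordered by the sort.
double median_seconds(std::vector<double> step_seconds) {
    assert(!step_seconds.empty());
    std::sort(step_seconds.begin(), step_seconds.end());
    const std::size_t n = step_seconds.size();
    return n % 2 == 1
               ? step_seconds[n / 2]
               : (step_seconds[n / 2 - 1] + step_seconds[n / 2]) / 2.0;
}
```

For durations of 1, 1, 1, 1 and 6 seconds the mean is 2 seconds while the median stays at 1 second, which is the gap the two fields are meant to expose.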


Data formatting issues
The data should be in XML or XML-DB form. It should be readable and unambiguously structured.
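
As a small example of readable, unambiguous output, the sketch below writes the model date range from the Model Data section (START and STOP in YYYYYY-DDD form with forced leading zeros) as an XML fragment. The element names model_date_range, start and stop are assumptions, as are the function names; the proposal does not fix a schema.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a model date as YYYYYY-DDD: six-digit year, three-digit day of
// year, both zero-padded.
std::string format_model_date(int year, int day_of_year) {
    assert(year >= 0 && year <= 999999);
    assert(day_of_year >= 1 && day_of_year <= 366);
    char buf[16];
    std::snprintf(buf, sizeof buf, "%06d-%03d", year, day_of_year);
    return buf;
}

// Builds the XML fragment for one trickle's date range (hypothetical tags).
std::string date_range_xml(int start_year, int start_day,
                           int stop_year, int stop_day) {
    std::string xml = "<model_date_range>";
    xml += "<start>" + format_model_date(start_year, start_day) + "</start>";
    xml += "<stop>" + format_model_date(stop_year, stop_day) + "</stop>";
    xml += "</model_date_range>";
    return xml;
}
```

Fixed-width fields keep the values sortable as plain text and make a truncated value easy to spot.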

Trickle up issues
It would be advisable to have each trickle be no more than 4 kb in size.
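
One way to respect the size limit is to split the data groups across several trickles. The sketch below packs XML fragments greedily, starting a new message whenever the next fragment would push the current one over the budget. It reads "4 kb" as 4096 bytes, which is an assumption; pack_trickles and kMaxTrickleBytes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: "4 kb" is read as 4096 bytes of message text.
constexpr std::size_t kMaxTrickleBytes = 4096;

// Greedy packing: append fragments to the current message until the next one
// would exceed the budget, then start a new message. Fragment order is kept.
std::vector<std::string> pack_trickles(const std::vector<std::string>& fragments,
                                       std::size_t budget = kMaxTrickleBytes) {
    std::vector<std::string> messages;
    std::string current;
    for (const std::string& fragment : fragments) {
        // A single fragment larger than the budget cannot be packed at all.
        assert(fragment.size() <= budget);
        if (!current.empty() && current.size() + fragment.size() > budget) {
            messages.push_back(current);
            current.clear();
        }
        current += fragment;
    }
    if (!current.empty()) messages.push_back(current);
    return messages;
}
```

Greedy packing does not always give the fewest messages, but it preserves the order of the data groups, which keeps the receiving side simple.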

Further reading


APPENDIX A: National Level Data



North America
  • 1
  • 2
  • 3
  • 4
  • 5

Greater Europe
  • 1
  • 2
  • 3
  • 4
  • 5

South Asia
  • 1
  • 2
  • 3
  • 4
  • 5


Central America
  • 1
  • 2
  • 3
  • 4
  • 5

Lesser Europe
  • European exclaves and overseas territories

East Asia
  • 1
  • 2
  • 3
  • 4
  • 5


South America
  • 1
  • 2
  • 3
  • 4
  • 5

Africa
  • 1
  • 2
  • 3
  • 4
  • 5

Australasia
  • 1
  • 2
  • 3
  • 4
  • 5


At this time these are suggestions.


APPENDIX B: Prehistoric Oceans & Land masses

There needs to be a long-term working group set up to determine how best to upload data for ocean arrangements that date back further than 2 million years. Until the trickle work unit technology is working adequately, this avenue of research should be put on hold. Ideally, geographic reporting regions should be set out in 50-million-year chunks.


APPENDIX C: Outer Planets Climate Simulations

There needs to be a working group set up to define a common message set for this application. Both rocky and gas planets need to have a common answerback API.







Author
Max Power

Initial idea
21 April 2006

Document created
15 May 2007

Last modified
01 February 2010

Last edit
API, Outer Planets