aivika-distributed

Parallel distributed discrete event simulation module for the Aivika library http://www.aivikasoft.com

Latest on Hackage: 1.0


BSD3 licensed by David Sorokin
Maintained by David Sorokin

This package extends the aivika-transformers [1] package and allows running parallel distributed simulations. It uses an optimistic strategy known as the Time Warp method. To synchronize the global virtual time, it uses Samadi's algorithm.
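
To give an intuition for this synchronization, here is a small self-contained Haskell sketch of the idea behind computing the global virtual time (GVT): it is a lower bound over the local virtual times of all logical processes and over the timestamps of messages still in transit, so that no event earlier than the GVT can ever be rolled back and older state can be reclaimed. The types and names below are purely illustrative and are not the package's actual implementation of Samadi's algorithm.

    -- A conceptual sketch of how the global virtual time (GVT) is
    -- bounded in Time Warp simulations. This is NOT the
    -- aivika-distributed API; all names below are hypothetical.

    -- | The state reported by one logical process.
    data LPReport = LPReport
      { lpLocalTime :: Double   -- ^ local virtual time of the process
      , lpInTransit :: [Double] -- ^ timestamps of its unacknowledged messages
      }

    -- | The GVT is a lower bound of all local clocks and in-transit
    -- messages. No event with a timestamp below the GVT can ever be
    -- rolled back, so state older than the GVT may be reclaimed.
    computeGVT :: [LPReport] -> Double
    computeGVT reports = minimum (localTimes ++ transitTimes)
      where
        localTimes   = map lpLocalTime reports
        transitTimes = concatMap lpInTransit reports

    main :: IO ()
    main = print (computeGVT [LPReport 10.5 [9.8], LPReport 12.0 []])
    -- prints 9.8: the oldest in-transit message bounds the GVT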

Moreover, this package uses the author's modification that allows recovering the distributed simulation after temporary connection errors whenever possible. For that, you have to explicitly enable the recovery mode and enable monitoring of all logical processes, including the specialized Time Server process, as shown in one of the test examples included in the distribution.
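
The sketch below only illustrates the shape of such a configuration; the record fields in it are hypothetical placeholders rather than the real aivika-distributed API, and the actual parameters are demonstrated in the test examples shipped with the package.

    -- A hypothetical sketch of enabling recovery and monitoring.
    -- The field names below are placeholders, NOT the real
    -- aivika-distributed API; see the test examples included in
    -- the distribution for the actual parameters.

    data LPParams = LPParams
      { lpRecoveryEnabled   :: Bool -- ^ recover after temporary connection errors
      , lpMonitoringEnabled :: Bool -- ^ monitor the peer logical processes
      }

    data TimeServerParams = TimeServerParams
      { tsMonitoringEnabled :: Bool -- ^ the Time Server monitors the processes too
      }

    -- Recovery only works when monitoring is enabled both for the
    -- logical processes and for the specialized Time Server process.
    clusterParams :: (LPParams, TimeServerParams)
    clusterParams =
      ( LPParams { lpRecoveryEnabled = True, lpMonitoringEnabled = True }
      , TimeServerParams { tsMonitoringEnabled = True }
      )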

With the recovery mode enabled, you can try to build a distributed simulation using ordinary computers connected through an ordinary network. Such a distributed model could even span computers located on different continents, connected through the Internet. The most exciting thing is that this is an optimistic distributed simulation with possible rollbacks, and optimistic methods are assumed to better exploit the parallelism inherent in the models.

You can test the distributed simulation on a single laptop, although the package is really intended for a multi-core computer or for computers connected in a distributed cluster.

There are additional packages that allow you to run distributed simulation experiments by the Monte Carlo method. They let you save the simulation results in SQL databases and then generate a report or a set of reports consisting of HTML pages with charts, histograms, links to CSV tables, summary statistics etc. Please consult the AivikaSoft [3] website for more details.

Regarding the speed of simulation, a recent rough estimate follows. This estimate may change from version to version. For example, in version 1.0 the rollback log was rewritten, which had a significant effect.

The distributed simulation module is up to 8-15 times slower than the sequential aivika [2] simulation library running the equivalent sequential models. The lower estimate of 8 times is likely to correspond to complex models. The upper estimate of 15 times will probably correspond to quite simple event-oriented and process-oriented models, where the sequential module can be exceptionally fast.

Note that on a single computer with an 8-core processor you can run up to 7 parallel logical processes together with the Time Server process. On a 36-core processor, you can launch up to 35 logical processes simultaneously.
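
As a back-of-the-envelope illustration combining the 8-15 times overhead above with this core allocation, the following sketch computes an idealized speedup over the sequential library. All numbers are assumptions taken from this text rather than measurements, and the estimate ignores the message passing costs discussed next.

    -- A rough estimate of distributed speedup, using the 8-15 times
    -- per-process overhead quoted above. Illustrative assumptions
    -- only, not benchmarks of aivika-distributed.

    -- | Logical processes that fit on one machine: one core is
    -- reserved for the Time Server process.
    logicalProcesses :: Int -> Int
    logicalProcesses cores = cores - 1

    -- | Idealized speedup over the sequential aivika library when the
    -- model splits evenly into n logical processes, each running
    -- overhead times slower than the sequential module (ignoring
    -- message passing and rollbacks, which only make things worse).
    idealSpeedup :: Double -> Int -> Double
    idealSpeedup overhead n = fromIntegral n / overhead

    main :: IO ()
    main = do
      let n = logicalProcesses 36     -- 35 logical processes on 36 cores
      print (idealSpeedup 8 n)        -- ~4.4x for complex models
      print (idealSpeedup 15 n)       -- ~2.3x for simple models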

At the same time, message passing between the logical processes can dramatically decrease the speed of the distributed simulation, especially if the messages cause rollbacks. Thus, much depends on the distributed model itself.

Finally, you can use the following test model [4] as an example.

[1] http://hackage.haskell.org/package/aivika-transformers

[2] http://hackage.haskell.org/package/aivika

[3] http://www.aivikasoft.com

[4] https://github.com/dsorokin/aivika-distributed-test

Changes

Version 1.0

  • Optimized the rollback log.

  • Increased the default rollback log threshold.

  • Reinstated the size threshold for the output message queue.

Version 0.8

  • Removed the restriction on the number of output messages that previously led to throttling.

Version 0.7.4.2

  • Provided a more precise estimate of the simulation speed.

Version 0.7.4.1

  • Updated the estimation of speed in the description after recent changes in the sequential module.

Version 0.7.4

  • A more graceful termination of the time server in case of self-destruction by time-out.

Version 0.7.3

  • Updated so that external software tools could monitor the distributed simulation.

Version 0.7.2

  • Improved the stopping of the logical processes when shutting down the cluster.

Version 0.7.1

  • Added time server and logical process strategies to shut down the cluster in case of failure within the specified timeout intervals.

Version 0.7

  • Fixed the use of the LP abbreviation.

Version 0.6

  • Using the mwc-random package for generating random numbers by default.

Version 0.5.1

  • Added functions expectEvent and expectProcess.

  • Added the Guard module.

Version 0.5

  • Added the ability to restore the distributed simulation after temporary connection errors.

  • Better finalisation of the distributed simulation.

  • Implemented lazy references.

Version 0.3

  • Started using Samadi's algorithm to synchronize the global virtual time.

  • The logical processes must call registerDIO to connect to the time server.

  • Increased the default synchronization time-out and delay.

  • Increased the default log size threshold.
