Although not an essential system requirement, an SSD will be included in our ideal system specification in future.
In a recent support interaction involving delays running a huge model, we were asked
exactly what happens behind the scenes when you press <F5> to update the results
charts according to the latest changes in the Editor. The client was keen to understand
which processes were involved, and how they used memory and disk, in order to
determine why the very same model ran dramatically faster on one of our machines
at Implied Logic.
What follows is a more or less verbatim copy of our response, edited only for context
and to remove any personal or client-identifiable data. However, the recommendation
about the potentially significant advantages of SSD over HDD is entirely universal
and a strong factor in our decision to circulate this more widely.
The detailed modelling context
The model in question explores cost dynamics in a national transmission network
and utilises multiple templates and scenarios. Detailed cost reporting is enabled
at the service–resource level by setting Cost Allocation
= Transparent in the global Other
Details dialog, and this significantly increases the size of the results
file to well over 0.5 GB per scenario.
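To give a feel for why that switch has such an effect on file size, here is a purely
illustrative back-of-the-envelope calculation (the element counts below are hypothetical
and not taken from the client's model):

    # Hypothetical counts, for illustration only: transparent cost allocation
    # stores results for every service-resource pair, so the stored volume grows
    # with the product of the two element counts.
    services, resources = 100, 150        # hypothetical element counts
    results_per_pair = 25                 # hypothetical number of cost results per pair
    periods = 40 * 4                      # e.g. 40 years modelled quarterly
    bytes_per_value = 8                   # one double-precision value

    pairs = services * resources
    size = pairs * results_per_pair * periods * bytes_per_value
    print(f"{pairs} pairs -> {size / 2**30:.2f} GiB of pair-level results")
    # ~0.45 GiB from the pair-level results alone, before the ordinary
    # per-element results are counted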
A simulation of this size is always going to take some time to process, but the
alarm bells rang when we reported a round trip of eight minutes against the client’s
experience of four hours! Our hardware was known to be both newer and more powerful
than the client machine, but not to such an extent as to explain a thirty-fold difference!
In order to account for the discrepancy, we felt it was both necessary and insightful
to explain more specifically what STEM does at run time. We were thus able to make
a very specific recommendation to improve the model-run experience for our client.
Step by step, a model run with STEM
In order to shed light on the apparent memory and CPU usage, here is a slightly
idealised explanation of what happens when you run a model from the Editor:
1. the Editor, ed.exe, first checks and then saves the model (.dtl, .dtm and .icp files)
2. it then asks the core STEM process, stem.dll, to orchestrate the model run, which
   involves a sequence of intentionally separate processes, all of which, for mostly
   cosmetic reasons, remain hidden and report their status via the Editor window if it
   is present (which is why the Editor will never look very busy or memory hungry in
   Task Manager)
3. if scenarios are required, then:
   a. the standalone model compiler, dtlcompw.exe, loads the working model, applies the
      requested scenarios in memory and then saves the separate per-scenario .dtm files
      in turn in the <model name>.scn folder
   b. the standalone checker, chkmodlw.exe, checks each of the requested scenario files
4. if replication is required, then:
   a. the standalone model compiler, dtlcompw.exe, loads the working model and each
      scenario in turn, processes the templates, and then saves the corresponding
      per-scenario .exp.* files
   b. the standalone checker, chkmodlw.exe, checks each of the generated .exp files
5. the working model and each scenario are run in turn:
   a. the model engine, model.exe, does all of the calculations, proceeding one period
      at a time, saving all of the results for each successive period sequentially in a
      .rmr (raw model results) file
   b. the results sorter, smrsortw.exe, reads the successive values in time for each
      element and result by random access from the .rmr file and then writes each
      corresponding element–result vector as a contiguous block in the .smr (sorted
      model results) file to provide acceptable on-demand access to the results (a
      minimal sketch of this re-ordering follows the list)
6. the results program, results.exe, loads the updated results:
   a. for the working model and each scenario:
      i.   the updated inputs are loaded from the .dtm and matched against any existing
           record of the set of element names of each type, either from a .smw file on
           first load, or from the previous results in memory
      ii.  the lists for each type of element are augmented with items corresponding to
           each homogeneous collection of elements of the same type
      iii. when cost allocation is enabled, the list of service/transformation–resource
           pairs is augmented with items corresponding to the relevant element–collection
           and collection–collection pairs
   b. results are accessed on demand from the .smr, and from derived results calculated
      in memory, as required, to update pre-existing charts and when drawing graphs
      interactively.
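To make step 5.b more concrete, here is a minimal sketch of that re-ordering, written
in Python against an invented, simplified file layout; the real .rmr and .smr formats,
and the actual implementation inside smrsortw.exe, are of course more involved:

    # Minimal sketch only: assumes an invented layout in which the .rmr holds one
    # 8-byte double per element-result series for each period, written period by
    # period, exactly as described in step 5.a above.
    import struct

    def sort_raw_results(rmr_path, smr_path, num_series, num_periods):
        """Re-order period-major raw results into series-major sorted results."""
        with open(rmr_path, "rb") as raw, open(smr_path, "wb") as srt:
            for s in range(num_series):                  # one element-result series at a time
                values = []
                for p in range(num_periods):             # gather its value from every period
                    raw.seek((p * num_series + s) * 8)   # random access into the .rmr
                    values.append(struct.unpack("<d", raw.read(8))[0])
                # write the whole time vector as one contiguous block in the .smr
                srt.write(struct.pack(f"<{num_periods}d", *values))

The point to notice is the access pattern: the inner loop makes one small random read
into the .rmr for every value it gathers, whereas the .smr is written strictly
sequentially. That pattern is harmless while the .rmr fits in the disk cache and
painful once it does not, which is exactly where the advice further below comes in.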
Note: part of the complexity of this approach comes down to a trade-off between
memory and disk usage, the need to serialise the processing of scenarios when handling
very large datasets, and a general principle of using operating-system-enforced
process isolation to reliably modularise the system into a small number of individually
verifiable tasks.
The consequent advice
The first observation is that merely updating results charts in situ can take a
significant time if the number of ‘elements’ has been radically increased
by enabling service–resource cost reporting. Step 6.a.i
above is significantly faster if there is no .smw
file to match against (i.e. if you temporarily hide the results workspace), and will
also be faster if you are prepared to close the results before each subsequent model
run. This does not mean ‘never save a workspace again’; it simply provides
a shortcut when debugging.
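We should stress that we are not describing the actual matching logic inside
results.exe here, but a naive sketch like the one below illustrates why reconciling a
radically enlarged element list against a previously saved record can cost far more
than loading the same list afresh:

    # Hypothetical illustration only, not the algorithm used by results.exe:
    # a naive reconciliation tests every updated name against the whole saved
    # list, so its cost grows with the product of the two list lengths.
    def reconcile(saved_names, updated_names):
        matched, added = [], []
        for name in updated_names:
            if name in saved_names:       # linear scan of the saved list
                matched.append(name)
            else:
                added.append(name)
        return matched, added             # ~len(updated_names) * len(saved_names) comparisons

With service–resource cost reporting enabled, both lists become very much longer, so
even a modest per-comparison cost mounts up; with no .smw to match against, the saved
side is simply empty.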
Our second observation is that it should now be clear why step 5.b
becomes a performance bottleneck once the .rmr
file exceeds the size of the disk cache, as it most certainly would in this example.
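To put indicative numbers on that (the per-access figures below are typical device
characteristics, and the miss count is hypothetical, chosen only to show the order of
magnitude involved; none of these are measurements from either machine):

    # Indicative figures only, not measurements from either machine.
    hdd_random_access_s = 0.008      # ~8 ms per random read on a laptop HDD
    ssd_random_access_s = 0.0001     # ~0.1 ms per random read on an SSD

    # Once the .rmr no longer fits in the disk cache, some proportion of the
    # sorter's random reads hit the device itself; this count is hypothetical.
    uncached_reads = 2_000_000

    print(f"HDD: ~{uncached_reads * hdd_random_access_s / 60:.0f} minutes lost to seeks")
    print(f"SSD: ~{uncached_reads * ssd_random_access_s / 60:.1f} minutes lost to seeks")
    # roughly 267 minutes versus 3.3 minutes for the same access pattern, the
    # same order of magnitude as the four-hour / eight-minute gap described above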
So our recommendation was that, if you do not have an SSD now, then
you should urgently request a suitably equipped replacement machine.
Implied Logic has used SSD-based laptops for all modelling and software work since
January 2011, and the change has made a huge impact on every aspect of computer
performance, from boot time onwards.