Developing smif¶

smif is under active development at github.com/nismod/smif

Install¶

The smif codebase is contained in src/smif.

Install the library in develop mode using the command:

python setup.py develop

If you also wish to use the GUI while using smif in develop mode, you’ll need to navigate to the src/smif/app folder and run the commands:

npm install
npm run build

Testing¶

We use pytest for testing, with tests under tests/ matching the module and class structure of smif/.

Install requirements for testing:

pip install -r test-requirements.txt

Run tests:

python setup.py test

Integration testing¶

The smif test suite includes a number of integration tests:

- tests/cli contains a few system integration tests, running small models using the smif command-line interface
- tests/data_layer contains data store integration tests which interact with data stores, including the filesystem and a database.

To run Postgres database integration tests, you will need:

- a Postgres installation (currently testing on 9.6 or greater)
- a test_smif database, and a user with login and permissions to create and drop tables on the database
- to set the PG... environment variables before running the tests

For example, assuming Postgres is installed and your user has database creation rights:

createdb test_smif
export PGHOST=localhost
export PGPORT=5432
export PGUSER=username
export PGPASSWORD=password
python -m pytest tests/data_layer
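
A test run could check for missing configuration before trying to connect. The sketch below is hypothetical: the helper name missing_pg_variables and the assumption that exactly the four variables from the example above are required are illustrative, not part of smif.

```python
import os

# The variables set in the example above; assumed here to be the required set.
REQUIRED_PG_VARIABLES = ("PGHOST", "PGPORT", "PGUSER", "PGPASSWORD")


def missing_pg_variables(environ=None):
    """Return the names of required PG... variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_PG_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_pg_variables()
    if missing:
        print("Unset Postgres variables: " + ", ".join(missing))
    else:
        print("Postgres environment looks complete")
```

Passing the environment as an argument keeps the check easy to test without modifying os.environ.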

Documentation¶

We use better-apidoc to build documentation from the reStructuredText sources under smif/docs and from the NumPy-style docstrings used throughout the codebase. Documentation is generated and hosted on readthedocs.

Setuptools should allow for building the docs from the project root:

python setup.py docs

There is also a Makefile for building the docs locally, with options for multiple formats:

cd docs/
make html

This generates a local version in smif/docs/_build/html that can be opened with a browser.

Versioning¶

We intend to follow semantic versioning, with major versions for any incompatible changes to the public API. Note that tags should follow PEP 440, which places stricter constraints on tags than semantic versioning does.

Releases¶

smif is deployed as a package on the Python Package Index, PyPI. A full guide to packaging and distributing projects is available online.

Deployment to PyPI is handled by Travis CI.

To make a release, create an annotated tag, and submit a pull request:

git tag -a v0.2.0       # create annotated tag (will need a message)
git describe            # show current commit in relation to tags

You’ll need to specify the tag to push, using either the --tags flag or the tag name:

git push upstream master --tags
git push upstream v0.2.0        # alternatively

Code style¶

Linting is handled by pre-commit hooks, which can be installed from the root of the repository using:

pre-commit install

Errors and messages¶

As a general guideline, smif fails fast, with errors that users can understand in context, whether they call smif through the Python API, CLI, HTTP API or GUI.

When handling errors, we raise custom exceptions (with an informative name and message) which can be communicated out through STDERR, HTTP response or error box.

In normal operations, we catch all errors from the standard library and other dependencies close to where they may arise, re-raising with a custom SmifException if they can’t be handled directly.

For example:

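import networkx

try:
    # topological_sort is lazy in networkx 2.x, so consume it to detect cycles here
    order = list(networkx.topological_sort(graph))
except networkx.NetworkXUnfeasible as err:
    raise SmifNotImplementedError("JobGraphs must not contain cycles") from err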

Error messages should contain concrete details from the immediate context if brief and relevant. This might include names and small values, but not lists or serialisations of large or even medium-sized data structures. Errors and messages can be extended with extra context if we catch and re-raise further up the stack.
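
The guideline above might look like the following in practice. This is a minimal sketch: SmifException is redefined locally so the example runs alone, and ExampleDataNotFoundError, read_parameter and read_model_parameter are hypothetical names, not smif's API.

```python
class SmifException(Exception):
    """Local stand-in for the base exception named in the text."""


class ExampleDataNotFoundError(SmifException):
    """Hypothetical subclass, for illustration only."""


def read_parameter(parameters, name):
    """Look up a parameter, re-raising a standard library error as a custom one."""
    try:
        return parameters[name]
    except KeyError as err:
        # Name the missing item, but do not serialise the whole dictionary.
        raise ExampleDataNotFoundError(
            "Parameter '{}' not found".format(name)) from err


def read_model_parameter(model_name, parameters, name):
    """Catch further up the stack and extend the message with extra context."""
    try:
        return read_parameter(parameters, name)
    except ExampleDataNotFoundError as err:
        raise ExampleDataNotFoundError(
            "{} (requested by model '{}')".format(err, model_name)) from err
```

The message stays short at each level: the inner function knows only the parameter name, and the outer one adds the model name it has in scope.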

Error boundaries¶

There are three major boundaries where we catch and handle errors:

- around a job (a call to Model.simulate) - independent jobs shouldn’t cause others to fail
- around a modelrun - independent modelruns shouldn’t cause others to fail
- around the smif process - errors should be reported, followed by a clean exit if the process cannot continue.

At program boundaries, we catch anything inheriting from SmifException and pass on the message. Stack traces are only shown if running in debug mode, or as the result of a programming error (we missed something - it’s a bug).
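
A boundary around independent jobs could be sketched as below. The run_jobs function and the shape of its jobs argument are assumptions made for illustration, and SmifException is a local stand-in so the example runs alone; smif's scheduler may be structured quite differently.

```python
import traceback


class SmifException(Exception):
    """Local stand-in for smif's base exception."""


def run_jobs(jobs, debug=False):
    """Run each job inside its own error boundary and collect the outcomes.

    jobs is a dict mapping a job name to a zero-argument callable
    (a hypothetical shape chosen for this sketch).
    """
    results, failures = {}, {}
    for name, job in jobs.items():
        try:
            results[name] = job()
        except SmifException as err:
            # Expected failure: pass on the message and let other jobs run.
            failures[name] = str(err)
            if debug:
                # Stack traces for expected errors only in debug mode.
                traceback.print_exc()
    return results, failures
```

Exceptions that do not inherit from SmifException are deliberately not caught here, so a programming error still surfaces with its stack trace.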

Logging¶

Log messages should be used sparingly, following the Python logging guidelines (https://docs.python.org/3/howto/logging.html#when-to-use-logging):

print() displays console output for ordinary usage of the CLI (respond with a message or similar usual channel for API/GUI)
CRITICAL errors are the last thing logged before a daemon is forced to quit (scheduler or server process)
ERROR level errors are communicated to user, typically causing jobs, requests or batch jobs to fail.
WARN indicates an event that a client may not be able to, or may not need to, do anything about - including error handling and unexpected events (failover, fallback). Use warnings.warn if client code should be modified, for example if deprecating a method.
INFO reports on events that occur during normal operation (e.g. start/stop modelrun, jobs)
DEBUG records events at a finer grain. Prefer adding debug statements temporarily while debugging, and do not commit them without justification.

CRITICAL, ERROR and WARN are shown with any verbosity level, and we should not typically expect to see any of them.

INFO messages are shown at the first level of verbosity (-v).

DEBUG messages are shown at the second level of verbosity (-vv).
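
The mapping from verbosity flags to log levels described above can be sketched with argparse and logging. The parser below is a hypothetical stand-in (the program name "example" and the --verbose long option are assumptions), not smif's actual command-line definition.

```python
import argparse
import logging


def log_level(verbosity):
    """Map the number of -v flags to a logging level."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    # With no flags, only WARNING, ERROR and CRITICAL are shown.
    return logging.WARNING


def build_parser():
    """Build a parser that counts repeated -v flags (-v, -vv)."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    logging.basicConfig(level=log_level(args.verbose))
    logging.getLogger(__name__).info("Shown with -v or -vv")
```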

Module import relationship diagram¶

Class diagrams¶

Decision - simulation class interaction/interface design¶

UML for smif decision and simulation interaction

Data flow¶

Locating the data required by a particular simulation model could become complicated. A data input may be provided as scenario data or as the output from another model. Scenario data vary between model runs as different scenarios are explored. Model outputs vary between model runs and possibly within model runs, as coupled models iterate to find stable solutions to loops in the dependency graph or as decision algorithms run multiple simulations to explore possible interventions.

The two abstractions introduced are a DataInterface and a DataHandle. A DataInterface has responsibility for accessing and persisting data and results, for example to a file system or database. A DataHandle has responsibility for directing a simulation model’s requests to the correct dataset, given the modelrun, requesting model, particular spatial or temporal resolution, and current iteration state. The containing layers - ModelRun, SosModel, ModelSet - must incrementally add and update details when creating a specialised DataHandle to pass in to each simulation model.

This class diagram shows part of the API to DataInterface and the smaller API to DataHandle, which internally makes use of DataHandle’s private attributes.

Class diagram for smif DataHandle / DataInterface composition
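
The division of responsibility between the two abstractions can be illustrated with a small sketch. The class and method names below (InMemoryDataInterface, ExampleDataHandle, write_results, read_results, set_results, get_results) are hypothetical and much simpler than smif's real classes; spatial and temporal resolution are omitted.

```python
class InMemoryDataInterface:
    """Hypothetical stand-in for a DataInterface: persists results in a dict."""

    def __init__(self):
        self._results = {}

    def write_results(self, modelrun, model, output, timestep, iteration, data):
        self._results[(modelrun, model, output, timestep, iteration)] = data

    def read_results(self, modelrun, model, output, timestep, iteration):
        return self._results[(modelrun, model, output, timestep, iteration)]


class ExampleDataHandle:
    """Hypothetical handle: remembers the context so a model need not pass it."""

    def __init__(self, store, modelrun, model, timestep, iteration=None):
        self._store = store
        self._modelrun = modelrun
        self._model = model
        self._timestep = timestep
        self._iteration = iteration

    def set_results(self, output, data):
        """Store an output of the model that owns this handle."""
        self._store.write_results(
            self._modelrun, self._model, output,
            self._timestep, self._iteration, data)

    def get_results(self, source_model, output):
        """Read another model's output for the same run, timestep and iteration."""
        return self._store.read_results(
            self._modelrun, source_model, output,
            self._timestep, self._iteration)


store = InMemoryDataInterface()
water = ExampleDataHandle(store, "run_a", "water", timestep=2020)
energy = ExampleDataHandle(store, "run_a", "energy", timestep=2020)
water.set_results("demand", [1.5, 2.5])
print(energy.get_results("water", "demand"))
```

The simulation model only names the data it wants; the handle's private attributes supply the modelrun, timestep and iteration needed to locate the right dataset in the store.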