Evgeny Metelkin
2025-09-05
DRAFT
In Part 1, we looked at the landscape of QSP model formats—their origins, strengths, and limitations. We saw how tools approach modeling in different ways, yet still leave teams struggling with locked-in projects, fragile reproducibility, and poor collaboration. In this follow-up, I want to step back and explore the problem from a software engineering perspective: what practices and design principles could make QSP modeling more transparent, modular, and reproducible, and how model formats can evolve to support that shift.
In software, the “X as code” idea has proven itself many times over—Infrastructure as Code (Terraform), Configuration as Code (Ansible), and Pipeline as Code (Jenkins), among others. The core idea is simple: we don’t just manage artifacts (infrastructure, configuration, pipelines) directly—we manage their textual representation under version control, even if those artifacts were never treated as “code” before. In all cases, the authoritative source is human-readable text. This doesn’t exclude working with diagrams or tables—authoring can stay visual or interactive, but the canonical format must still be code. This shift brought massive gains in transparency, modularity, and reproducibility, and accelerated progress in software engineering.
The question is: can we do the same in QSP?
If we look back at the popular formats discussed earlier, not all of them can be considered model as code. Some are well-suited; others are not. QSP formats that are locked in binary or tool-specific project files cannot be diffed, reviewed, split, or merged—and therefore cannot be treated as code. Formats that are formally “text-based” but too complex or unstructured for a human to read or write also only partially fit this concept.
Checklist: how to tell if your model is really “as code”
- The authoritative source is human-readable text, not a binary or tool-specific project file.
- Two versions of the model can be compared with standard diff tools and merged without a proprietary application.
- The model can be split into parts that are reviewed, tested, and reused independently.
- A person can realistically read and write the format directly, not just generate it.
When this principle is in place, modelers gain access to an entire ecosystem of tools: version control, diff and merge utilities, code completion, automated testing, and CI/CD pipelines. Nothing needs to be reinvented—these are the same practices that software engineers already use. The result is versioned, testable, modular, and reusable models.
In practice, a model becomes more than a loose collection of files: it turns into a full engineering project with a clear folder structure, documentation, and automation. This approach improves reproducibility, makes validation and review more systematic, and simplifies onboarding new team members.
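To make “testable” concrete, here is a minimal sketch of what a model test could look like. Everything in it is hypothetical: the `simulate` function stands in for whatever entry point your project exposes, and the toy two-compartment dynamics exist only to give the test something to check.

```julia
# test/runtests.jl: a hypothetical smoke test; `simulate` and all numbers
# are illustrative stand-ins, not a real project's API.
using Test

# Toy stand-in for the real model: two compartments exchanging mass by
# first-order transfer, integrated with a fixed-step Euler scheme.
function simulate(p; t_end = 10.0, dt = 1e-3)
    central, peripheral = p.dose, 0.0
    for _ in 0.0:dt:t_end
        flux = (p.k12 * central - p.k21 * peripheral) * dt
        central    -= flux
        peripheral += flux
    end
    return (; central, peripheral)
end

@testset "mass balance" begin
    p = (dose = 100.0, k12 = 0.3, k21 = 0.1)
    s = simulate(p)
    # No elimination in the toy model, so total drug amount is conserved.
    @test s.central + s.peripheral ≈ p.dose
    @test s.central ≥ 0 && s.peripheral ≥ 0
end
```

Once a test like this lives in the repository, it can run on every change, locally or in CI, so regressions surface immediately.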
Code transparency means that every aspect of a model is visible, reviewable, and understandable at a glance. This covers not only the model structure, but also the equations, parameters, and the assumptions embedded within it. A transparent format allows collaborators (and even regulators) to see what exactly the model does without needing the original author or a proprietary tool to “explain” it.
A second requirement is traceable change. Transparency is not only about being able to read the code, but also about being able to see what changed and why between versions. With text-based formats, differences are captured by standard tools (e.g., git diff), making version control, peer review, and collaboration far more effective. Binary or opaque formats cannot provide this—they hide changes inside unreadable blobs.
Fig. 1. Traceable change in a NONMEM file. The change is traceable but not easily readable.
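In its simplest form, a text-level change looks like the schematic unified diff below (the file name, parameter, and values are invented). Without opening any tool, both the change and its rationale are visible:

```diff
--- a/model/pk.txt
+++ b/model/pk.txt
@@ -12,3 +12,3 @@
 ; central compartment
-kel = 0.15   ; elimination rate [1/h], fitted on study A
+kel = 0.21   ; elimination rate [1/h], refitted after adding study B
 Vc  = 5.0    ; central volume [L]
```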
A desirable (not strictly mandatory) requirement is self-explanatory code. This means that the model description carries enough context—through clear naming, annotations, and units—that a new reader can understand the intent without constantly referring back to external notes or publications. While not every project achieves this ideal, self-describing code lowers the entry barrier for new collaborators, reduces misinterpretation, and makes the model more resilient to staff turnover or long gaps between updates.
Fig. 2. Traceable change in self-explanatory code. Easy to understand at a glance.
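Here is what this style can look like in a deliberately simplified Julia sketch; the biology, names, and numbers are all invented to illustrate the idea, not to be a real model:

```julia
# model/insulin.jl: an illustrative fragment only; the biology, names, and
# numbers are invented to show self-explanatory style.

const k_insulin_clear  = 0.25   # insulin clearance rate [1/min]
const glucose_baseline = 5.0    # fasting plasma glucose [mmol/L]
const secretion_slope  = 2.0    # secretion per unit glucose excess [pmol/L/min per mmol/L]

"Rate of change of plasma insulin [pmol/L/min] at the current state."
function insulin_dynamics(insulin_pmol_L, glucose_mmol_L)
    # Secretion responds only to glucose above the fasting baseline.
    secretion = secretion_slope * max(glucose_mmol_L - glucose_baseline, 0.0)
    clearance = k_insulin_clear * insulin_pmol_L   # first-order clearance
    return secretion - clearance
end
```

The point is not the language but the habit: units, provenance, and intent live next to the equations they describe.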
By contrast, QSP environments that store models in binary, closed, or otherwise non-readable formats cannot ensure transparency. They block the very practices—review, versioning, collaboration—that modern scientific software relies on. Even when dedicated comparison tools exist, they are typically ad-hoc, tied to a single platform, and rarely integrate smoothly into a team’s normal project workflow. As a result, they are used sporadically and do not replace true text-level transparency.
Fig. 3. Non-traceable change in a SimBiology project: just files, nothing more.
Modularity is the ability to divide a project into independent parts that can be developed, tested, and reused separately. It comes from having clear interfaces and well-defined dependencies between components.
Here we use the term broadly: it includes both the separation of different project layers—model, data, scripts (covered in more detail in the Separation of Concerns section)—and the internal modularization of the model itself into subcomponents that are easier to manage and understand.
What modularity brings:
- Parallel work: the model, the data processing, and the analysis scripts can be developed by different people without conflicts.
- Isolated testing: each component can be verified on its own before the pieces are combined.
- Reuse: a well-separated submodel or dataset can move into the next project instead of being rebuilt from scratch.
- Easier comprehension: small, well-bounded components are simpler to read, review, and maintain than one monolithic file.
A natural way to bring this modularity into practice is through a clear and consistent project structure. When each component has its own dedicated place, it becomes easier to navigate, test, and extend the project. In other words, a well-structured repository is the simplest form of modularity:
qsp-project/
  model/        # core: states, processes/reactions, equations
  data/         # measurements for calibration/validation (+ units, sources)
  scenarios/    # protocols: simulations, fitting, sensitivity
  docs/         # annotations, assumptions, limitations, references
  pipelines/    # CLI scripts and CI/CD recipes (build, test, report)
  julia/        # Julia code for simulation and analysis
  R/            # R code for simulation and analysis
  project.yml   # project metadata, dependencies, configuration
This kind of layout turns a QSP model into something that looks and behaves like any other modern software project: modular, reviewable, and reproducible.
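The exact schema of project.yml depends on the tooling you standardize on; the sketch below only illustrates the kind of metadata worth capturing, and every field name and version in it is an example, not a standard:

```yaml
# project.yml: an illustrative sketch; the schema depends on your tooling,
# and every field name and version here is an example, not a standard.
name: qsp-project
version: 0.3.0
description: Example QSP model packaged as a software project
authors:
  - Jane Modeler <jane@example.com>
dependencies:        # pinned versions let the project rebuild identically
  julia: "1.10"
  R: "4.4"
entrypoints:         # the commands collaborators and CI are expected to run
  test: pipelines/run_tests.jl
  simulate: pipelines/run_simulations.jl
```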
Longevity means that a model remains usable and trustworthy years after it was first created. Instead of “remembering which buttons we clicked,” the project can be rebuilt, rerun, and revalidated from its source.
What makes longevity possible:
- Plain-text sources under version control, so the complete history survives with the project.
- Pinned dependencies and recorded tool versions (for example, in project.yml), so the computational environment can be recreated.
- Documentation of assumptions, units, and data sources that travels with the model rather than living in someone's memory.
- Scripted pipelines that rebuild every result from source, with no reliance on remembered clicks.
Longevity turns a QSP model from a one-off experiment into a sustainable scientific asset—something that can be reliably shared, revisited, and built upon.
Automation means that models are executed and tested through scripts instead of manual clicks. Once a model is code, it can be integrated into pipelines that ensure consistent, repeatable runs for simulations, parameter estimation, or sensitivity analyses.
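A runner script can be as small as the sketch below. The toy `simulate` function stands in for the project's real model, and the paths follow the layout shown earlier; the scripted part (one command in, one artifact out) is what matters:

```julia
# pipelines/run_simulations.jl: a minimal runner sketch. The toy `simulate`
# stands in for the project's real model entry point.
using DelimitedFiles

function simulate(scenario::AbstractString)
    # The toy ignores `scenario`; a real model would dispatch on it.
    ts   = collect(0.0:1.0:24.0)          # time grid [h]
    conc = 100.0 .* exp.(-0.1 .* ts)      # toy mono-exponential decay
    return hcat(ts, conc)
end

function main(args)
    scenario = isempty(args) ? "baseline" : first(args)
    outfile  = joinpath("results", "$scenario.csv")
    mkpath(dirname(outfile))
    open(outfile, "w") do io
        writedlm(io, ["time_h" "conc"], ',')       # header row
        writedlm(io, simulate(scenario), ',')
    end
    @info "Simulation finished" scenario outfile
end

main(ARGS)
```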
With CI/CD, every commit can automatically launch validation tasks, generate reports, or even dispatch heavy computations to a more powerful server or cluster. This reduces human error, scales effortlessly, and makes QSP projects more reliable and collaborative.
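Wired into CI, the same scripts run on every commit. The sketch below assumes GitHub Actions and the community julia-actions/setup-julia action; adapt the names, versions, and runners to your own infrastructure:

```yaml
# .github/workflows/validate.yml: a sketch assuming GitHub Actions; adapt
# action names, versions, and runners to your own setup.
name: validate-model
on: [push, pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@v2
        with:
          version: "1.10"
      - name: Run model tests
        run: julia --project=. test/runtests.jl
      - name: Rebuild baseline simulation
        run: julia --project=. pipelines/run_simulations.jl baseline
```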
License: CC-BY-4.0