"Let's stop hoping software development projects will work out, and start trusting they will. But how do you validate trust?"

Developing software requires knowledge in two domains: the subject matter and engineering. Most members of a team are expert in one but not the other. In my earlier blog Code that does the Right Thing I explored what happens when we cannot prove that subject-matter experts reliably transferred their knowledge to software engineers. At that point team members can only hope an application will do the right thing, and projects start suffering from Hopitis.

This week I will look at the way we can counter Hopitis.

"How do you verify trust?"

Dealing with trust within a team is considered a 'soft' issue and is often the responsibility of a project manager or Scrum master. This reflects, not entirely coincidentally, two fundamentally different approaches to project management. But spoiler alert: neither cures Hopitis.

Current remedy 1: Rigid project management

The first is the rigid approach. This is commonly called 'waterfall' management, but I disagree with the somewhat dismissive meaning attached to that name. I prefer 'rigid'. Rigid is not bad: it is sturdy, and it provides many organisations with a system of control. Maybe it is not the most effective system, and it is certainly not my first choice, but it can get the job done and it integrates well with other business units and departments. Rigid project management methods revolve around governance, responsibility, and accountability. Within that regulatory framework subject-matter experts and software developers are often surprisingly free to do their job. Rigid systems deal in advance with expected problems: if something like X happens, who is responsible for solving it, and who is accountable? Many rigid systems excel in risk analysis, and with solid risk analysis you can detect the early stages of Hopitis. But signalling a risk is not the same as solving it. A proper risk assessment includes remedial actions, but when trust is lost within a team, it is exceptionally difficult to regain. Trust takes years to build, seconds to break, and forever to fix.

Current remedy 2: Agile project management

The second approach is known as agile and seeks to keep trust issues from developing into problems through short feedback loops. After each software development iteration (often called a sprint) domain experts and developers sit together and review the new functionality. This demonstration is intended to build and retain trust. One team member has a dedicated role, the product owner, to mediate between subject-matter experts and software engineers. After consulting users and other stakeholders, the product owner signs off on the features that the team will work on. But intermediation is a double-edged sword. Although it avoids the pitfall of engineers needing to become domain experts, it introduces another step in communication. The subject-matter experts may eventually trust the product owner in the team. But that trust does not automatically extend to the software developers. How do we confirm that knowledge was reliably and consistently transferred from the subject-matter experts to the product owner? And from the product owner to the engineers?

Virtually all project management methods agree that knowledge transfer is important, but none can guarantee the reliability of that transfer. In most instances the problem is recognised and piled together with the other soft communication issues that teams need to deal with. Nor does any current method leave a structural trail of trust by which future teams can review and assess the code and trust it as a basis for continued sound development.

Trust and verifiable understanding

Trust grows incrementally, as a step function. In most professional relationships we start with a little initial trust; call that the benefit of the doubt. Trust will not increase by itself: something has to happen to move it upwards over time. The most important 'something' that pushes trust up is those moments when we get to verify that we understand each other. If we fail to do that for a prolonged period of time, trust tends to decrease.

Domain Engineers can help subject-matter experts and software engineers understand each other; they have gone through the process of learning both languages. They are the crucial missing link between the two. But on their own, a Domain Engineer with a shared language does not give anyone outside the team tangible and reproducible proof of understanding.

We need to provide the Domain Engineer with a metric - or better, a model - for verifiable understanding. Currently, to reach mutual understanding, both sides agree on a shared mental model. That model is ambiguous, and everyone has their own interpretation of it. The model is contained in project plans, proposals, and assessments. These documents may be structured, but they are still texts that require interpretation. When our interpretations seem to align, people think they understand each other.

Explicit understanding === trust

Domain Engineers can make the shared mental model explicit by defining hypotheses: "If X then assert Y" - much akin to unit tests. But unlike unit tests, which check that the application is doing the thing right, the Domain Engineer's checks assert whether or not the application is doing the right thing. Individual hypotheses reflect the knowledge transferred from the domain expert to the software engineer, e.g. "If the time gap between two mentions of a person in a given set of archival records is larger than X years, they cannot be the same data point", "If the number of hours that a consultant is expected to work per year is below 1636, then report that they do not meet EU subsidy requirements", etc.

The hypotheses need to be phrased in such a way that they can be automatically tested. When we manage to do that, the model can grow over time as the project progresses. After every change in the code we can reassert the model against the data and the code. When the application passes the assessment, we have verified that software engineers and subject-matter experts still understand each other and that the code is doing the right thing. If it fails, we can flag the disparity immediately and discuss it.
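Phrased in executable form, such a hypothesis can sit right next to the ordinary unit tests. Here is a minimal Python sketch of the archival-records example; the function and constant names (`could_be_same_person`, `MAX_GAP_YEARS`) and the 50-year threshold are illustrative assumptions, not part of any real codebase:

```python
from datetime import date

# Hypothetical rule from the shared model:
# "If the time gap between two mentions of a person is larger than
#  50 years, they cannot be the same data point."
MAX_GAP_YEARS = 50

def could_be_same_person(mention_a: date, mention_b: date) -> bool:
    """Return False when the gap between two mentions exceeds the
    threshold agreed with the subject-matter expert."""
    gap_years = abs((mention_a - mention_b).days) / 365.25
    return gap_years <= MAX_GAP_YEARS

# The hypothesis as automatically testable assertions:
def test_large_gap_cannot_be_same_person():
    assert not could_be_same_person(date(1700, 1, 1), date(1800, 1, 1))

def test_small_gap_may_be_same_person():
    assert could_be_same_person(date(1700, 1, 1), date(1720, 1, 1))
```

Run under a test runner such as pytest, these assertions are re-checked after every code change, which is exactly the reassert step described above.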

Example

Rule 1: Subject-matter Expert X on September 27, 2020 16:44:23 - "If the time gap between two mentions of a person in a given set of archival records is larger than 50 years, they cannot be the same data point."

...

Rule 17: Subject-matter Expert Y on November 12, 2020 12:51:42 - "If a user claims two different data points are the same they need to send Documents as proof."

...

Rule 34: Subject-matter Expert Y on November 12, 2020 12:53:38 - "If the Documents match some condition, then an ADMIN user can merge the two data points."

When the team implements the feature 'add ADMIN users', they reassert their shared model to verify understanding. The model immediately raises the issue that on September 27th expert X claimed a conflicting truth. The model does not care whether expert X was wrong, whether the transferred knowledge has become outdated, or whether the software engineer did not understand X properly. It only flags that Rule 1 and Rule 34 conflict, and that there is no longer a consistent shared understanding of what the application is supposed to do when merging two data points that happen to be people.
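One way to make this reassertion automatic is to give every rule the same interface and check the verdicts for contradictions. The sketch below is one possible encoding, not the author's implementation; the field names (`gap_years`, `user_role`, `documents_match`) are assumptions, and each rule returns True (allowed), False (forbidden), or None (not applicable):

```python
# Hypothetical encoding of two rules from the shared model.
def rule_1(scenario):
    # "Gap larger than 50 years -> cannot be the same data point."
    if scenario["gap_years"] > 50:
        return False
    return None

def rule_34(scenario):
    # "If the Documents match, an ADMIN user can merge the data points."
    if scenario["user_role"] == "ADMIN" and scenario["documents_match"]:
        return True
    return None

def reassert(rules, scenario):
    """Apply every rule; flag a conflict when one rule allows what
    another forbids for the same scenario."""
    verdicts = {name: rule(scenario) for name, rule in rules.items()}
    decided = {n: v for n, v in verdicts.items() if v is not None}
    if True in decided.values() and False in decided.values():
        return "conflict", decided
    return "ok", decided

# An ADMIN merging two mentions 60 years apart, with matching Documents:
scenario = {"gap_years": 60, "user_role": "ADMIN", "documents_match": True}
status, verdicts = reassert({"Rule 1": rule_1, "Rule 34": rule_34}, scenario)
# status == "conflict": Rule 1 forbids the merge, Rule 34 allows it.
```

The model itself stays silent on who is right; it only reports that the two verdicts disagree, which is the team's cue to talk.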

The Domain Engineer in the team decides to resolve the issue by modifying Rule 1 in the model:

Subject-matter Expert Y on November 19, 2021 17:04:10 - "If the time gap between two mentions of a person in a given set of archival records is larger than 50 years, they cannot be the same data point unless that data point is created by an ADMIN user."
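With the exception in place, the two rules can be checked again and, for an ADMIN-created merge, they now agree. A self-contained sketch, with illustrative function and field names that are assumptions of this post rather than real code:

```python
# Hypothetical encoding of the modified Rule 1 and of Rule 34.
def rule_1_modified(gap_years, created_by):
    # Gap > 50 years forbids identity, unless the data point was
    # created by an ADMIN user (the new exception).
    return not (gap_years > 50 and created_by != "ADMIN")

def rule_34(documents_match, user_role):
    # Matching Documents allow an ADMIN user to merge the data points.
    return documents_match and user_role == "ADMIN"

# An ADMIN merging two mentions 60 years apart: both rules now agree,
# so reasserting the model no longer flags a conflict.
assert rule_1_modified(60, created_by="ADMIN")
assert rule_34(documents_match=True, user_role="ADMIN")
```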

Every time the Domain Engineer changes and reasserts the model, trust between software developers and subject-matter experts grows. And years from now, when people have left and new engineers need to add a feature to the codebase, they too can rely on this model. Verifiable understanding leaves no room for vagueness or ambiguity. It makes the expectations of the code transparent and reproducible.

