The ambiguity of problems
Damian Płaza (@raimeyuu)
411 tests?
Imagine that we work on a feature that requires doing some work around the VeryImportantGraphAnalyzer class.
public class VeryImportantGraphAnalyzer
{
    public VeryImportantGraphAnalyzer(
        IGraphBuilder graphBuilder,
        IGraphValidator graphValidator,
        IGraphShuffler graphShuffler,
        INodeFinder nodeFinder,
        IEdgeComposer edgeComposer
    )
    {
        //...
    }
}
Usually it is a pleasure to work with it - the domain is so interesting - computer science at its essence!
We follow "best practices" - we have a lot of tests around it. This brings the feeling of safety and confidence.
It turns out that in order to implement the feature we need to change the class itself - two new methods are required, and to provide the new functionality, VeryImportantGraphAnalyzer needs to get some new collaborators invited to help with the job (collaborators? invited? You might be interested in The ambiguity of composition).
We usually try to follow TDD, but this time we are in a hurry (who isn't, right?), so we decide to check what the scope of changes is.
Let's just add two new collaborators that we think we might need to implement the feature.
public class VeryImportantGraphAnalyzer
{
    public VeryImportantGraphAnalyzer(
        IGraphBuilder graphBuilder,
        IGraphValidator graphValidator,
        IGraphShuffler graphShuffler,
        INodeFinder nodeFinder,
        IEdgeComposer edgeComposer,
        ICollisionDetector collisionDetector, // 👈🏻 new collaborator!
        IAreaBuilder areaBuilder // 👈🏻 new collaborator!
    )
    {
        //...
    }
}
We run the tests aaaand... 411 tests out of 2598 failed ❌.
Well, tests should help us, but somehow this does not feel right.
We've heard stories of people moaning about how "a single method caused 4 days of test fixing" - maybe that's the case here too?
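Most of those 411 failures probably look alike. A minimal sketch of what one of them might be - the test name, the xUnit/Moq setup and the assertions are assumptions made for illustration, not taken from the real suite:
[Fact]
public void Analyzes_a_simple_graph()
{
    // Every test builds the analyzer by hand, so every test knows the full constructor signature.
    var analyzer = new VeryImportantGraphAnalyzer(
        new Mock<IGraphBuilder>().Object,
        new Mock<IGraphValidator>().Object,
        new Mock<IGraphShuffler>().Object,
        new Mock<INodeFinder>().Object,
        new Mock<IEdgeComposer>().Object
        // 👈🏻 no ICollisionDetector, no IAreaBuilder - this call no longer matches the constructor
    );

    // ...act and assert...
}
Add two parameters to the constructor and every test that assembles the class by hand has to change - whether or not it cares about collision detection or areas. A few hundred such tests make for a few hundred red marks.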
After some minutes of being alone with the VeryImportantGraphAnalyzer class, we see it has grown a bit... too much?
Well, it's only 1458 lines of code now. Could be worse.
Everything is centralized, easy to find out what is going on. We have a lot of tests, maybe some are broken now, but we can fix them.
During the next daily we just say that we need to refactor the class a bit and that this might take up to 2 days.
"Ok, next", we hear and we get back to the production line.
10 days?
It turned out that the feature is now complete and ready for testing. Now it's time for the QA engineers to shine and take shots at our work.
We can grab another task from the backlog and start working on it.
Fortunately or unfortunately, along with our change other features were also added, so it's pretty hard to isolate just our feature.
Each deployment takes at least a couple of hours, so we need to be really careful with what we are deploying.
In case of failure we need to roll back, but thank God it can be done within 2 hours at most.
The Product Owner is a bit nervous - the feature was supposed to be available already; it's been 10 days and we are still not sure if it's working.
"Feature freeze" is announced - no new features until the current one is deployed.
But developers can't just sit and "do nothing" - "time for a maintenance sprint", the Product Owner says.
QA Engineers, Developers and the Product Owner work together to test everything, just to make sure.
Finally, all features are deployed. The customer picks it up and... there's a failure due to a specific customer configuration in production.
"Start rollback procedure" we hear and almost automatically we start to do it.
In 2 weeks?
The problem with the configuration was resolved - it turned out that no one had actually tested this particular combination of settings.
We added another step to the approval process so that this won't happen again.
We are safe.
The feature freeze was lifted, so now we can get back to business as usual - daily standups, people giving their statuses, grabbing tasks from the backlog and going to the workstations to increment the product.
During the dependency analysis the team sees that one of the next features requires help from another department.
People from that tribe are hard to get - they are busy with their own stuff as they work with the customers directly.
To use the time of such a valuable expert wisely, we need to prepare a lot of things upfront.
"Please give a time when I should book this person" - Product Owner says.
After seven planning poker sessions and hectoliters of coffee, we have a plan - "in two weeks we will be ready", we say.
The Product Owner requests the expert's time in two weeks.
But there's a bumpy road ahead - we wrestle with failing tests, broken build pipelines and data cleaning - the team didn't make it in time.
"Sorry, you need to wait another two weeks", the other department's leader says.
And we wait.
10 classes?
The time has come - the expert joined the team.
We were advised on how to handle various complex data relationships so that it would be easier to implement the feature.
This required changes in some database tables - some of them were too big, some of them were too small.
We even joked that we do DDD - Database-Driven Design - as we started with the database.
But there was a small, or maybe not that small, problem - as we changed the database structure, 22% of the integration tests failed at runtime because the classes playing the roles of repositories were affected - even though we had the repository pattern implemented.
We took an additional 3 days to carefully refactor those 10 classes and prepare migration scripts, and we scheduled a "maintenance window" during which our services were not available.
We deployed the changes, ran the migrations, and... everything was working.
At least this time there were no additional problems.
Problems
Let's stop for a while, dear Reader.
What could one notice in those tiny tales?
Did they sound familiar to you?
From the outside, all those cases might suggest that there were some problems along the way.
Based on the business context, team context, technological context - they might seem quite "obvious".
"Yeah, we need to wait for other people to help us, what's the problem with that?", some might ask.
But how do we know that it is normal behavior? A baseline?
There's another nice tale (a bonus tale!):
A little crayfish and a little fish were swimming in the lake and having fun underwater, until the crayfish was mature enough to leave the water and walk onto the beach. The crayfish was amazed by the sky, the trees, the birds chirping and the humans running around. He came back underwater and said "Little fish, there's a whole new world outside the water!" and the little fish replied "Water? What is that?".
What is your water, dear Reader?
Listen to your tests
Let's take the first tale, in which 411 tests were failing.
What if the problem was not in the tests?
Tests are written by human beings (and now by "AI"?), so they are not inherently bad.
As the authors of GOOS wrote:
When writing unit and integration tests, we stay alert for areas of the code that are difficult to test. When we find a feature that’s difficult to test, we don’t just ask ourselves how to test it, but also why is it difficult to test
So all those 411 tests were whispering to us - "Hey, there's something wrong with the design of the object".
What could be improved? What could be done differently?
And as they continue:
Our experience is that, when code is difficult to test, the most likely cause is that our design needs improving. The same structure that makes the code difficult to test now will make it difficult to change in the future.
Untestable code tells us something about the design - it is just a diagnostic signal sent and propagated by the tests.
The only question is if we are able to listen to it.
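What might "listening" look like here? One possible direction - just a sketch, where IGraphTopology and the way the collaborators are grouped are my assumptions, not a prescription - is to notice which collaborators always change together and let them form a single, more cohesive concept:
// Placeholder types for this sketch only.
public sealed class GraphDefinition { }
public sealed class Graph { }

// Hypothetical: if the builder, validator and shuffler always change together,
// maybe they are hiding a single concept worth naming.
public interface IGraphTopology
{
    Graph Prepare(GraphDefinition definition);
}

public class VeryImportantGraphAnalyzer
{
    public VeryImportantGraphAnalyzer(
        IGraphTopology graphTopology, // 👈🏻 one cohesive collaborator instead of three
        INodeFinder nodeFinder,
        IEdgeComposer edgeComposer
    )
    {
        //...
    }
}
Fewer constructor parameters means fewer tests that need to know every detail of the object's assembly - and the next change in how graphs are prepared touches one seam instead of hundreds of test files.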
It's often said that "tests provide quick feedback", but that phrase is ambiguous too (should I write about it as well?) - the feedback is not only about correctness (if you are more interested in specification and verification, please check The ambiguity of TDD) or about understanding the consumer's needs and then providing a well-structured API.
The feedback is also about the "metabolic health" of the code. Let's check what the GOOS authors say:
Our response is to regard the process of writing tests as a valuable early warning of potential maintenance problems and to use those hints to fix a problem while it’s still fresh.
So there are multiple levels of feedback provided by tests - we might get warnings about consumer needs, about correctness, about the growing entropy.
What do you think, dear Reader - what could those 411 failing tests be whispering to us?
Listen to your release process
If we apply similar reasoning (or model if you wish) to the other tales, we might see that the problems are communicating something to us.
They are not just "problems" - as we already stated, they are diagnostic signals.
In another tale we've seen that it took 10 days to go through the QA process for all the changes and then deploy - without avoiding problems in production.
What does this signal mean? Are we able to interpret it?
It took a couple of hours to deploy new changes - will this rate encourage making small changes and small deployments?
Or maybe it's going to reinforce the "big bang" deployments?
As someone said:
We have an abundance of data, but a scarcity of insights.
In this case, the process whispers to us, begging to put our lenses on it and understand the underlying context.
Everything is emitting signals, but we rarely subscribe to them and use that information to build intelligence.
Listen to your dependencies
In the next tale we've seen that the team was not able to deliver the feature on time as there was a dependency on another department.
The team needed to "just give numbers" so that project managers could plan the work.
Planning the knowledge work - it must be fun, right?
"But Damian, if you have external dependency you need to communicate the plan and schedule, right?", some could ask.
I do not have the answers.
But I do have questions.
What does it mean that we need to wait for another department?
Do we have Scrum ceremonies during which we plan the work?
What is "the water" in this context, dear Reader?
Listen to your classes
Isn't it strange that by changing the database structure, the classes also changed?
Or maybe there's nothing strange in it - it's just a consequence of introducing a change.
Some smart folks could say there's a coupling between the database and the application code.
Another signal, huh.
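To make that signal a bit more concrete, here is one hypothetical shape such coupling could take - the interfaces, methods and types below are made up for illustration:
// Placeholder types for this sketch only.
public sealed class GraphId { }
public sealed class Graph { }

// A "repository" that still speaks in tables and columns - change the database structure
// and the ripple reaches every caller and every test.
public interface IGraphRepository
{
    System.Data.DataTable GetGraphNodesTable(GraphId id);
    void UpdateNodeColumn(GraphId id, string columnName, object value);
}

// An alternative that speaks the language of the domain - the tables behind it
// can change shape without the callers (or their tests) noticing.
public interface IDomainGraphRepository
{
    Graph FindBy(GraphId id);
    void Save(Graph graph);
}
Having "a repository pattern implemented" is not the same as having the database hidden - the first interface is a repository in name only.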
As GOOS authors stated, it warns about "future maintenance problems" - what future maintenance problems does it try to warn us about?
What does it tell us about the design?
And what about failing tests - what do they whisper to us?
Listen to your problems
Problems are a powerful tool.
Of course, it depends on how we understand the word "problem".
I have a background in control engineering, and in that field "an error" is just a part of life, a part of the control feedback loop.
It's a signal, a very useful one.
Also, a huge part of such a feedback control system is observability - sensors that work as "ears and eyes" for the controller, providing the measurements it makes decisions upon.
Still, the most difficult part is to look through "the water", see beyond our own biases, misconceptions, and tear down abstractions that we've built around us to support us.
The structure, "the water", might push us back to the same patterns, the same solutions, the same topology.
To challenge the status quo we must jump into the never-ending loop of improvement, full of unlearning (!) and learning - and all of this requires being humble, curious and willing to act.
Interestingly, when things go wrong, meaning when we experience problems - whether they manifest through failing tests, problems in production, difficulties in delivery or issues in the deployment process - people typically want to minimize the risk by creating more structure, adding more processes.
Make all the steps bigger (structure-wise) - more approvals, more meetings, more resource planning (I was barely able to write it, believe me), more estimates, etc.
All to reduce the risk - actually increasing the risk by buying illusory confidence. What if one could make the steps smaller?
Make smaller changes?
Establish smaller, truly cross-functional, dedicated teams? (You might be interested in The ambiguity of team work)
An additional observation might be that changing the technical parts is easy - but changing the way we think, the way other people operate, the way we communicate - that's the real challenge.
The Software System is a part of a bigger system - the organization, the society, the world. It does not exist in a vacuum.
Next time you experience a problem, try to listen to it, dear Reader, and ask yourself:
What is "the water" I can't see?