Evolve, don't hack

#1 Bugs strike back!

Yet another day, yet another bug.

Turned out that one of the algorithms didn't work properly.

The whole algorithm was related to something that, in my opinion, is the most "DDD"-ed one could get - graphs.

The potential fix was scoped to this little method:

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // details of the algorithm
}

(I skipped the details due to NDA)

What now?

Probably, the best way is to understand the outside-in perspective and the needs of the consumer.

Alone?

Hell no!

By gathering the brilliant minds around, in the software teaming spirit, we figured out that we are missing very important aspect in the traversal steps.

The step should stop immediately when finding a "special" node - GroupingNode - that was what we were really looking for.

Just so you know - it wasn't "ten minutes tinkering".

We informally went through Example Mapping-ish process, when we laid out the need, the rules and examples.

Together. (you might be interested in The ambiguity of team work)

Writing "code" was merely an act of execution.

#1 Model

Ok, what did I know?

The problem - missing termination condition - finding a GroupingNode.

Let's try to express it as a question: has GroupingNode been found?

Ok, what is the simplest model that is able to answer this question?

Possibly it is something that can state:

yes (GroupingNode has found)
no (GroupingNode has not been found)

Which means we are dealing with a simple boolean model!

So the intuition might tell us that we need to start with "a clear state" (1. initialize as no GroupingNode found):

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    bool hasFoundGroupingNode = false; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm
}

Next step would be to use our model that is able to answer the main question (2. model answering the question):

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    bool hasFoundGroupingNode = false; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    if(hasFoundGroupingNode) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
}

And then we are almost there - the last piece is to identify a node that plays the role of a GroupingNode (3. identify a GroupingNode):

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    bool hasFoundGroupingNode = false; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    if(hasFoundGroupingNode) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
    if(currentNode is GroupingNode _)
    {
        hasFoundGroupingNode = true; // 👈 3. identify a GroupingNode
    }
    // rest of the details
}

9 lines in total.

Phew, that was a real blast.

Verification - all good!

#2 Bugs strike back

You can't predict the future, ain't you dear Reader?

Me neither.

Turned out that there are some scenarios when it's not sufficient to only identify a GroupingNode, but also to check if a currently processed node is grouped by an already found GroupingNode.

"But it wasn't specified in the user story", one could vishously shout.

Well, believe me or not, no amount of specification can give more feedback than shipping stuff. (if you want to understand more, please check (Fr)Agile).

Let's get back to our misaligned behavior.

It seems that our current solution needs some additional effort.

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    bool hasFoundGroupingNode = false; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    if(hasFoundGroupingNode) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
    if(currentNode is GroupingNode _)
    {
        // 🤔 is this model sufficient?
        hasFoundGroupingNode = true; // 👈 3. identify a GroupingNode
    }
    // rest of the details
}

Let's revisit the main question - did it change?

Well, yes but no.

Additionally to identifying a GroupingNode we also need to check if a currently processed node is grouped by it.

We could extend this question:

- Has a GroupingNode been found?
+ Has a GroupingNode been found for a given node?

Which basically means that our current model is not capable of providing a right answer.

bool model is not sufficient!

#2 Model

What would be the easiest possible model that would be capable of giving more advanced answer?

A tuple?

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    (bool hasFoundGroupingNode, GroupingNode? groupingNode) groupingNodeRule =
        (false, null); // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    // 👇 we already changed the existing behavior
    if(groupingNodeRule.hasFoundGroupingNode && groupingNodeRule.groupingNode.Groups(currentNode)) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
    if(currentNode is GroupingNode groupingNode)
    {
        groupingNodeRule  = 
            (true, groupingNode); // 👈 3. identify a GroupingNode
    }
    // rest of the details
}

It's easy, but is it simple? (you might be wondering what I am talking about - please feel invited to read Simple isn't easy)

In fact, we could employ even "simpler" solution - a nullable GroupingNode!

By dropping bool we make the model smaller and decision could be based on existence.

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    GroupingNode? groupingNode = null; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    if(groupingNode.HasValue && groupingNode.Value.Groups(currentNode)) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
    if(currentNode is GroupingNode _groupingNode)
    {
        groupingNode = _groupingNode; // 👈 3. identify a GroupingNode
    }
    // rest of the details
}

That's actually a really minimalistic solution, isn't it dear Reader?

It's quite typical to initialize variables as null at the beginning of a method, right? (because they are varying, right?)

I know the solution is working.

Although there's something hidden in the statement "the solution is working".

You know what is it, dear Reader?

For me, now.

Easy becomes complex so it might be that in three weeks (or days?) I will completely forget what does the null mean.

Of course it means "missingness", right? (you might be interested in reading Rethinking "missingness")

Maybe we should slow down and rethink the problem again, and rethink representing it with "the code"?

#3 Model

If we look closely onto the termination condition:

if(groupingNode.HasValue && groupingNode.Value.Groups(currentNode))

It might seem fine but before we didn't need "and" (&&).

Now the problem changed and the solution should follow - structure-wise we evolved a simple bool into a nullable GroupingNode.

But is it a right structure?

A simple nullable type does not offer us clear understanding as it does not communicate with the correct language.

We don't think in terms of null GroupingNode and non-null GroupingNode grouping a node, isn't it?

It feels as if we are missing a concept being represented with the code (a concept? you might be interested in Concept Maps)

Ok, shall we try to represent this abstract idea that lives in the problem area?

// 👇 abstract idea 
private abstract record IdentifiedGroupingNode
{
    public abstract bool IsIdentifiedFor(Node node);
}

This abstract idea is our model that is capable of answering a main question, expressing the problem.

It expresses WHAT, without telling HOW.

What we know is that currently there are two scenarios able to bring the answer:

private abstract record IdentifiedGroupingNode
{
    public abstract bool IsIdentifiedFor(Node node);

    public static IdentifiedGroupingNode NotFound => new NoGroupingNodeFound();
    public static IdentifiedGroupingNode Found => new GroupingNodeFound();

    // 👇 represents "false"
    private sealed record NoGroupingNodeFound : IdentifiedGroupingNode
    {
        public override bool IsIdentifiedFor(Node node) => false;
    }

    // 👇 represents "true"
    private sealed record GroupingNodeFound : IdentifiedGroupingNode
    {
        public override bool IsIdentifiedFor(Node node) => true;
    }
}

Ok, it looks scary as hell.

We got a bool, now we have a beast.

Look how big it is!

But how can it be used?

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    IdentifiedGroupingNode groupingNode =
        IdentifiedGroupingNode.NotFound; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    if(groupingNode.IsIdentifiedFor(currentNode)) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
    if(currentNode is GroupingNode _)
    {
        groupingNode = IdentifiedGroupingNode.Found; // 👈 3. identify a GroupingNode
    }
    // rest of the details
}

That is kind of surprising, isn't it?

What's more, we didn't change the algorithm yet.

We firstly evolved the model.

"Again".

Now probably it's time to evolve the behavior to fix the bug we were originally assigned to.

But one can notice that the decision how to answer the question is no longer available in the ImportantTraversalAlgorithm method.

It is not longer responsible for that.

The responsibility to provide the answer was moved to a designated "structure":

private abstract record IdentifiedGroupingNode
{
    public abstract bool IsIdentifiedFor(Node node);

    public static IdentifiedGroupingNode NotFound => new NoGroupingNodeFound();
    public static IdentifiedGroupingNode Found(GroupingNode groupingNode) => 
        new GroupingNodeFound(groupingNode);

    private sealed record NoGroupingNodeFound : IdentifiedGroupingNode
    {
        public override bool IsIdentifiedFor(Node node) => false;
    }

    // 👇 now our model requires additional information to be able to answer the question
    private sealed record GroupingNodeFound(GroupingNode GroupingNode) : IdentifiedGroupingNode
    {
        public override bool IsIdentifiedFor(Node node) => 
            GroupingNode.Groups(node);
    }
}

And this little change causes a compilation error which we happily fix:

public IEnumerable<Node> ImportantTraversalAlgorithm(Node node)
{
    // existing initialization
    IdentifiedGroupingNode groupingNode =
        IdentifiedGroupingNode.NotFound; // 👈 1. initialize as no GroupingNode found
    // details of the algorithm

    if(groupingNode.IsIdentifiedFor(currentNode)) // 👈 2. model answering the question
    {
        return Traversal.Stop;
    }

    // rest of the details
    if(currentNode is GroupingNode _groupingNode)
    {
        groupingNode = IdentifiedGroupingNode.Found(_groupingNode); // 👈 3. identify a GroupingNode
    }
    // rest of the details
}

We did checks and it turned out that everything is working as expected.

What's interesting, we also started talking about identified grouping node in a slightly different way.

Do you know why?

Models: evolve, don't hack

You might be thinking dear Reader: "wow Damian, you turned a simple bool into a object-oriented pile of c..lasses".

Actually, records, but yes - I hear you, dear Reader.

We experienced an evolution from a simple bool into a larger structure.

Some might whipser: "mother of god, was it worth it?"

I don't know - the time will reveal the truth.

However, I would ask another question - why didn't we start with a larger structure from the early beginning?

We could use it as the #1 Model and then even less changes will be needed.

"KISS, YAGNI" - I hear immediately in the back of my head. (is everything alright? do you hear voices too?)

And in fact - we didn't need such structure from the first moments.

But as the reality challenged our model, it turned out it is not enough.

Up to that time, it was Just Enough Model™.

Hacking?

I would consider the #2 Model a hack, a workaround, and not a right solution.

Not because I "love complexity" (as some KISS-lovers might state), but because it did not provide a right place to accomodate further changes.

The problem manifested once, might eventually manifest itself.

Is it a speculation? Yes, of course.

A little bit of speculation on that level is justifiable.

"Working in the absence of knowledge is the job".

Why "Hacking"? Because it works for me, now.

When another person comes, he or she might "just want to fix the bug" and move on.

Will it work for other person (including future me), in the future?

ImportantTraversalAlgorithm will grow.

It will get more and more responsibilities.

More "logic" will land in its hands until it will collapse under its weight (you might be interested in A solid grasp of responsibility).

Did all the 2000LOC+ friends happen over a single sprint?

Evolving?

"But now it's hidden from the sight and it's not explicit how does it work!", some might shout (in their heads).

Is the current model "easy"? No.

Is the current model "simple"?

I have never told that #3 Model is free from downsides.

I made a trade.

I made a trade by picking options in the future by providing "a structure" that will accomodate changes within its area of responsibilities, instead of assigning those responsibilities to our ImportantTraversalAlgorithm.

I made a gamble that expressing the right language is far more important that "simple" (in fact easy, see Simple isn't easy), "first-thought" solution.

Could I have used different structure?

Well, I can't really say, but yes..

As I expressed in Essentially bounded, Accidentally unlimited:

Conclusion 🔍

There is no single solution, but there is a single problem.

In the #2 Bugs strike back, we needed to express a closed set of variants (that's why a bool was not sufficient).

We could use OneOf nuget package and use discriminated unions.

The most important aspect is that we evolved the model by refactoring the structure (a bool is a data structure).

Conclusion 🔍

Make the change easy, then make the easy change

That's why "refactoring sprints" or "tech debt maintaince sprints" are harmful and pointless.

"Knowledge structures" (essentially representations of our understanding of the problem) are attracting more and more changes.

They tend to "grow knowledge belly fat" rapidly.

Of course, modularization, thoughtful responsibilities assignment and encapsulation might help us limiting the scope of the possible changes.

But one needs to design systems and applications really mindfully so structures yield those attributes.

Hacking or evolving?

Even if the #3 Model is "larger", it has clear purposes.

No solution is free from disadvantages that's why we need to be aware of the trades we make.

Models need to evolve to provide business value, no matter how small they are.

One of such values is the ability to change the "code" in an easy way.

Sometimes we need to make that "easy way" happening by representing concepts (Concepts, Entities, Data) and letting them do the job.

We should still start with the question:

Question 🤔

What is the simplest possible model that answers the main question?

Which forces us to find the right question.

But it is merely a start.

So next time dear Reader, if you have an assignment to fix a bug or provide a new capability, remember:

Conclusion 🔍

Evolve models, don't hack them