Wisdom: System Complexity

What is a system?

A system is a group of parts that act together to accomplish a common goal.

Examples:

Company made up of employees

Sentence made up of words

Team made up of players

Cloth made up of threads

Organ made up of cells

Note: A part can be a system itself.

Need for a System

Specialization

Some parts can do some things that other parts cannot.

Example:

Becoming a chartered accountant or a neurosurgeon takes so much specialized skill and experience that one person cannot be both in a single lifetime. A hospital that needs both has to hire two different people.

Simultaneity

In some situations, multiple tasks need to be done at the same time to achieve the desired result.

Example:

In an office, a guard has to keep an eye on the door while, at exactly the same time, a peon makes tea and a clerk fills in a voucher.

Readiness

Some work needs to be done continuously, without any break.

Example:

The borders of a country need to be guarded constantly, without a minute’s break. A single part cannot work continuously, because after working for some time it burns out and needs to be repaired. There have to be shifts.

Advantages of System

Risk Reduction

If a part is in a system, the chance of damage to that part is greatly reduced, because in case of a problem the other parts can come to its rescue.

Example:

A person living in a village has a greater chance of surviving an illness, an injury, or a famine than a person living alone in a jungle.

Resiliency

A system gives resiliency to the whole group: if one part fails, another part can continue providing the same services until the first part is repaired or replaced.

Example:

If a farmer has many different types of crops sown simultaneously on different acres, then in case of an infestation some of the crops are likely to be saved. If the farmer has sown only one type of crop, then it would be an all or nothing situation.

Quality

Specialization brings quality to the output. With fewer responsibilities, a part can invest more time in education and can narrow its area of study, learning a great deal about a very small set of things.

Example:

A large corporation can hire specialized, highly educated personnel, each with a very limited area of responsibility. A small company cannot afford that and has to settle for mediocre-level people, each wearing multiple hats.

Economy

With fewer responsibilities, a part can choose to do what it is best at. Also, since the number of duties is smaller, less time is spent on context switching, setting up, and moving to a new workplace.

Example:

In an economy where trade is safe and frequent, different regions can afford to grow only the crops their land is best suited to, and rely on trade with other regions for the other crops they need. A self-sufficient village, on the other hand, has to grow everything it needs and suffers in quality and quantity as a result.

Power

A group of parts working together has a very high chance of defeating an individual part working alone. It is simple strength in numbers. The parts can also surround the enemy part and attack all at once instead of taking a linear, one-by-one approach.

Example:

Surrounding the enemy was the essence of the German army's technique for overrunning much of continental Europe in a matter of months. The trick was to always engage enemy forces in small parts, so that even though the German army was smaller in number overall, it was in the majority in each individual battle. Note that there is no advantage in having more resources if those resources cannot all be used at once when needed.

Disadvantages of System

Complexity

Complexity is the number of connections between the parts of a system. If no part of a system has any connection to any other part, then the complexity of the system is exactly zero. It does not matter how many parts there are in the system.

The problem with connections is that as the number of connections grows, the number of paths also grows.

Defining Paths

A path is a flow of execution.

Example:

The vegetable that reaches my table every day has the following path:

Farm -> Farm Market -> Transport Vehicle -> City Whole Sale Market -> Local Vegetable Shop.

When the complexity grows, the path becomes longer, and multiple paths come into existence where there was formerly only one. Let us suppose the farmer, instead of waiting for the annual flood, now chooses to buy artificial fertilizers:

Mine -> Fertilizer Factory -> Transport Vehicle -> Fertilizer Whole Sale Market -> Village Fertilizer Shop -> Farm -> Farm Market -> Transport Vehicle -> City Whole Sale Market -> Local Vegetable Shop

So far there is still only one path, though its length has increased. The longer the path becomes, the more ways there are to short-circuit it into smaller paths.

Mine -> Fertilizer Factory -> Transport Vehicle -> Fertilizer Whole Sale Market -> Village Fertilizer Shop -> Farm -> Farm Market -> Transport Vehicle -> City Whole Sale Market

Note that in the path above, some people choose to buy vegetables directly from the wholesale market; that is a short circuit. The other path still exists, because some people still buy from the local vegetable shop instead of going to the wholesale market. We now have two paths instead of one. It can become more complicated.

Mine -> Farm -> Farm Market -> Transport Vehicle -> City Whole Sale Market -> Local Vegetable Shop

In the path above, some of the farmers choose to mine the minerals on their own (or recycle excretions), while other farmers still buy fertilizers from the market. A new path is therefore created, and the two existing ones still exist.

Of course some of the farmers may choose to buy directly from wholesale market, in which case we get 4 paths. There are many combinations.

As the number of connections between the parts of a system grows, the number of paths also grows. The same thing can happen in multiple ways, through different flows of execution. The number of words needed to explain the system grows. The number of test cases grows. It becomes harder and harder to capture all the paths.

To tackle complexity, there has to be a manager: an overriding executor that can initiate, roll back, pause, resume, or stop any flow. A flow is a path in use. Flow is like current; path is like wire.
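The growth in paths can be checked with a small sketch. It treats the vegetable supply chain above as a list of connections and counts the distinct routes from the mine to the table. The function name `countPaths`, the "Table" end point, and the split of "Transport Vehicle" into two separately named vehicles are assumptions made for illustration only.

```javascript
// Count distinct paths from `from` to `to` in a directed graph given as [a, b] pairs.
// Assumes the graph has no cycles.
function countPaths(edges, from, to) {
  const next = new Map();
  for (const [a, b] of edges) {
    if (!next.has(a)) next.set(a, []);
    next.get(a).push(b);
  }
  const memo = new Map();
  function walk(node) {
    if (node === to) return 1;
    if (memo.has(node)) return memo.get(node);
    let total = 0;
    for (const n of next.get(node) || []) total += walk(n);
    memo.set(node, total);
    return total;
  }
  return walk(from);
}

// One long chain: a single path.
const chain = [
  ["Mine", "Fertilizer Factory"],
  ["Fertilizer Factory", "Fertilizer Transport"],
  ["Fertilizer Transport", "Fertilizer Wholesale Market"],
  ["Fertilizer Wholesale Market", "Village Fertilizer Shop"],
  ["Village Fertilizer Shop", "Farm"],
  ["Farm", "Farm Market"],
  ["Farm Market", "Vegetable Transport"],
  ["Vegetable Transport", "City Wholesale Market"],
  ["City Wholesale Market", "Local Vegetable Shop"],
  ["Local Vegetable Shop", "Table"],
];
// One extra connection: some people buy directly from the wholesale market.
const withShortcut = [...chain, ["City Wholesale Market", "Table"]];
// A second extra connection: some farmers mine their own minerals.
const withOwnMining = [...withShortcut, ["Mine", "Farm"]];

console.log(countPaths(chain, "Mine", "Table"));         // 1
console.log(countPaths(withShortcut, "Mine", "Table"));  // 2
console.log(countPaths(withOwnMining, "Mine", "Table")); // 4
```

Two added connections turn one path into four, which is the count reached in the text.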

How to Handle Complexity in a Computer Program

In a computer program there are methods that perform actions. If any method can call any other method, then the number of paths increases and so does the complexity. We have to divide methods into two types:

Methods that can call other methods.

Methods that cannot call other methods.

The First Type of Methods

These are the paths. When you want to change a path, change the body of this method.

Example:

Suppose the business logic is this:

The user inputs data in a form.

The form is sent to the code-behind.

The code-behind validates the user input.

The code-behind inserts the data in the database.

A success message is displayed.

What we need to do is make five methods of the second type and call them one by one from a method of the first type. The function that validates data cannot, for example, insert it in the database. The function that validates data just returns true or false to the method of the first type.

Why use this scheme where methods are divided into two types? Because we can easily change the path, and because we can very easily find out what is going on.

Let us suppose there is a change in the business logic. Now we have to redirect the user to another web page instead of displaying the success message on the same page. How do we do it? Very simple: make a new method that redirects the user. Then, in the method of the first type, replace the call to the method that displays the success message with a call to the method that redirects.
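The workflow above could be sketched roughly as follows. This is only an illustration under assumptions: the form, the database, and the page are plain objects, and every function name here is hypothetical (the "Do" and "Handle" prefixes follow the naming convention proposed later in this article).

```javascript
// Second type of method: each one does a single job and calls none of our other methods.
function DoReadForm(form) {
  return { name: form.name, amount: form.amount };
}
function DoValidation(data) {
  return typeof data.name === "string" && data.name.trim() !== "" &&
         Number.isFinite(data.amount) && data.amount > 0;
}
function DoInsert(db, data) {
  db.push(data); // stand-in for a real database insert
}
function DoShowSuccess(page) {
  page.message = "Order saved.";
}
function DoShowError(page) {
  page.message = "Invalid input.";
}

// First type of method: it only decides which method to call, and in what order.
function HandleSubmit(form, db, page) {
  const data = DoReadForm(form);
  if (!DoValidation(data)) {
    DoShowError(page);
    return;
  }
  DoInsert(db, data);
  DoShowSuccess(page);
}

// Usage with plain objects standing in for the form, database and page.
const db = [];
const page = {};
HandleSubmit({ name: "Ali", amount: 250 }, db, page);
console.log(page.message, db.length);
```

Reading `HandleSubmit` from top to bottom shows the whole path, which is the point of the split.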
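A minimal sketch of that change, assuming plain objects for the database and the page. All names and the URL "/orders/thanks" are hypothetical. Only one line of the first type of method changes; the other methods stay untouched.

```javascript
// Second type of method (unchanged by the new requirement).
function DoValidation(data) {
  return typeof data.name === "string" && data.name.trim() !== "";
}
function DoInsert(db, data) {
  db.push(data);
}
function DoShowSuccess(page) {
  page.message = "Saved."; // the old last step, no longer called
}

// New method of the second type, added for the new requirement.
function DoRedirect(page, url) {
  page.location = url;
}

// First type of method: the path lives here, so this is the only body that changes.
function HandleSubmit(data, db, page) {
  if (!DoValidation(data)) return false;
  DoInsert(db, data);
  // Before the change this line was: DoShowSuccess(page);
  DoRedirect(page, "/orders/thanks");
  return true;
}
```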

How to Document This?

Use a Sequence Diagram. A Sequence Diagram shows the flow. Don’t confuse yourself with the concepts of “message passing” and “objects” yet. Just try to make a Sequence Diagram without any object-oriented stuff!

We have to make a separate Sequence Diagram for every functionality. Some of the functionalities are:

Customer can add a new order.

Customer can view orders.

Customer can change an order.

Customer can delete an order.

A report of orders last year that exceed Rs. 100,000.

Audit trailing.

Searching in orders.

In event-driven programming, the user can initiate almost any functionality. We have to call the first type of method from the event handler.

What Is This Architecture Called?

The above scheme is very close to the MVC architecture. In MVC, the controller is the logic that controls the workflow, i.e. the path. Model and View are not allowed to call each other directly. All interaction between Model and View has to go through the controller. Of course, there are variations of MVC that do allow Model and View to call each other directly, but that defeats the purpose.

Do not think too much about the concept of objects yet. Do not consider Model, View and Controller as objects. Consider them as functions.

Let's give this scheme a name. Let's call it "Director-Actor Procedural Design". A Director is a method that calls other methods. An Actor is a method that does not call another method but does the work itself.

A director does not do any work. It delegates the work to actors and orchestrates them. It decides whether, when, and whom to call. It also handles exceptions, because it knows what to do, that is, which other method to call in case of an exception. It also allocates and deallocates resources. An actor cannot allocate a resource itself, because the same resource may need to be used by another actor: for example, one actor initializes an object with values and then the next actor uses that object. If the second actor allocated its own resource, it would be a new object, which we do not want. An actor cannot deallocate a resource either, because in case of an exception control goes back to the director, and then only the director can deallocate the resource.

There are no director-directors, meaning there is no super-director that calls a director. The directors are called asynchronously by event handlers.

The directors are the effective event handlers, meaning almost all the work that needs to be done in an event handler is done by a director. There is no logic in the event handler except passing the values of the related controls to the director; that is, there is no if-else or loop in event handlers. The only job of an event handler is to call the right director and pass on the values of the related controls.

A director is control-type-agnostic, meaning it does not have to know what controls are displayed to the user, be it a group of radio buttons or a drop-down list. The user can select an option in any way: by checking a radio button, by selecting an option from a drop-down list, or by clicking a button. It is the job of the form, and of the event handler in that form, to extract the values from the controls and send them to the appropriate director. Therefore, if the user interface changes, for example if your company decides to use a web page instead of a web form, you do not need to change the director.

A director is also database-provider-agnostic. It does not matter to the director whether you are using SQL Server, Oracle, MS Access, or even a flat-file database. The inner workings of the database are handled by the stored-procedures layer, and the calling of those stored procedures is handled by the model layer.
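The resource and exception rules can be sketched as below. This is an illustration under assumptions, not a definitive implementation: `makePool` is a hypothetical stand-in for whatever hands out connections or file handles, and all other names are invented for the example.

```javascript
// Hypothetical resource pool; `inUse` lets us see whether everything was released.
function makePool() {
  const pool = { inUse: 0 };
  pool.acquire = () => { pool.inUse += 1; return { rows: [] }; };
  pool.release = () => { pool.inUse -= 1; };
  return pool;
}

// Actors: they receive their resources and never allocate or release them.
function DoInitOrder(order, values) {
  order.customer = values.customer;
  order.total = values.total;
}
function DoSaveOrder(conn, order) {
  if (order.total < 0) throw new Error("negative total");
  conn.rows.push({ customer: order.customer, total: order.total });
}
function DoReportFailure(errors, err) {
  errors.push(err.message);
}

// Director: allocates, orchestrates, handles the exception, and deallocates.
function HandleSaveOrder(values, pool, errors) {
  const order = {};             // one object, shared by both actors
  const conn = pool.acquire();  // allocated here, not inside an actor
  try {
    DoInitOrder(order, values);
    DoSaveOrder(conn, order);
    return true;
  } catch (err) {
    DoReportFailure(errors, err); // the director knows whom to call on failure
    return false;
  } finally {
    pool.release(conn);           // runs on both the success and the failure path
  }
}
```

Because control returns to the director when an actor throws, the `finally` block is the one place that can always release the resource.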
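As a rough sketch of this split, the two event handlers below read the selected value from two different kinds of control and pass it to the same director. The handler and director names, the `order` object, and the shipping example are assumptions; `event.target`, `value`, `options`, and `selectedIndex` are the standard DOM properties of input and select elements.

```javascript
// Actor: does the actual work.
function DoSetShipping(order, method) {
  order.shipping = method;
}

// Director: knows nothing about radio buttons or drop-down lists.
function HandleChooseShipping(order, method) {
  DoSetShipping(order, method);
}

// Event handlers: no if-else, no loops, only value extraction and one call.
function onRadioChange(event, order) {
  HandleChooseShipping(order, event.target.value);
}
function onDropDownChange(event, order) {
  const list = event.target;
  HandleChooseShipping(order, list.options[list.selectedIndex].value);
}
```

Swapping the radio buttons for a drop-down list means swapping one event handler for the other; `HandleChooseShipping` stays as it is.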
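One way to picture this is a director that is handed the insert actor of whichever storage is in use. The two actors below are hypothetical stand-ins for the model layer (an in-memory store and a flat-file-like text store); a real system would call stored procedures instead.

```javascript
// Two hypothetical model-layer actors with the same shape.
function DoInsertInMemory(store, record) {
  store.rows.push(record);
}
function DoInsertFlatFile(store, record) {
  store.text += JSON.stringify(record) + "\n";
}

function DoValidation(record) {
  return Number.isInteger(record.id) && record.id > 0;
}

// Director: the same body works with either storage.
function HandleAdd(record, store, doInsert) {
  if (!DoValidation(record)) return false;
  doInsert(store, record);
  return true;
}

// Usage: only the arguments change when the storage changes.
const memoryStore = { rows: [] };
const fileStore = { text: "" };
HandleAdd({ id: 1 }, memoryStore, DoInsertInMemory);
HandleAdd({ id: 2 }, fileStore, DoInsertFlatFile);
```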

Similarities With MVC

The above looks very much like the MVC pattern. The Director is very much the same as the Controller in MVC. The differences are that the Controller need not be an object (it can be a method), and that only directors may call other methods, and they may call only actors.

Similarities With Object-Oriented-Programming

The central theme of object-oriented programming is not limiting who calls whom; it is what data a function acts on. This is called encapsulation. All the other stuff in object-oriented programming is built around this basic concept. Inheritance and polymorphism are not in their full glow without encapsulation.

The above scheme is based on limiting who calls whom. This is independent of what data the functions act on. There may even be no data that the function acts on, for example when the function creates a file or displays something on screen, i.e. there are no variables. The function can also be pure, meaning it acts only on its input data and does not change anything in its environment: nothing in the file system, the database, or the network. The above scheme works in all such scenarios.

Back To Ways of Reducing Complexity

A little recap

It was decided to have two types of methods. The first type, named Directors, can call other methods. The second type, named Actors, cannot call other methods. It was also decided that the first type would manage the workflow.

We can put a prefix to names of both types of methods. For directors the appropriate prefix is “Handle”, for actors the appropriate prefix is “Do”. So a validation method is called “DoValidation”, an insert method is called “DoInsert”, a file writer method is called “DoWriteFile” etc. The handlers are like these: “HandleSubmit”, “HandlePageLoad”, “HandleCancel”, “HandleRefreshGrid”, “HandleDelete” etc.

So for every desired situation, we have a handler. The handler manages resources and executes operations by asking doers to do what they are supposed to do. A desired situation is an anticipated situation, such as “user clicks on submit button”, “user sends invalid input”, “user wants to load the page” etc. In event-driven programming, an anticipated situation is already associated with an event handler therefore that event-handler would have to call a director.

So, you are a software developer; you can make software and websites. A colleague of yours in your company is an accountant; he can write vouchers and make ledger entries. Another colleague of yours in the same company is a marketer; he can run a media campaign or do a user survey. All three of you are in the company’s office, sitting idle, waiting for the manager to tell you when to work and what to work on. The point is that all three of you are operatives, doers, actors: you do things, but you don’t decide when to do them or what to work on. Your manager should tell you what to do, for example whether it is a new ERP or a new website. Your manager should tell the accountant which account at which bank to work on. Your manager should tell the marketer which people to target and when.

You, your accountant friend, and your marketer acquaintance are methods. You take inputs. You cannot order each other to work. You don’t decide when to work. You can't handle unexpected situations, such as a power failure, on your own; you have to report back to the manager, and the manager then decides what to do.

We have two sets of synonyms in our scheme:

Directors, Handlers, Managers, Executors.

Actors, Doers, Operatives.

Should A Handler Method Call Another Handler Method?

Now it is getting complex. We have had only one level of hierarchy up till now. We have handlers that call doers, and that’s it. No handler is allowed to call another handler. The question is: what if a workflow is part of a larger workflow?

Many operations in add and edit are the same. In both, you have to validate user input. In both, you have to show a success message if the task is successful and an error message otherwise.

I do not recommend this level of software reuse. Handlers shouldn’t call other handlers, period. To implement it, we would have to change the bodies of the operatives. We would have to put conditions in the operatives to do certain things when called by one handler, other things when called by another handler, and common things when called by any handler. The point of separating the two types of methods is that one is independent of the other. If both depend on each other, then there is no advantage in the separation. In fact, it is not a separation at all.

The biggest problem in software is that things are interrelated. One change in one part results in a lot of changes in other parts. We want to minimize the impact of changes. There are ways to do this, such as object-oriented programming, where we can change private methods without any effect outside the object. We can also write automated tests to make sure that all effects are accounted for. Still, the essence of the problem is not tackled. Object-oriented programming does not limit who calls whom very effectively; a private method can call a method in some other object. Automated testing can give us extra eyes, but it does not make the garden greener. Some situations can be overlooked in testing.

The real solution is to reduce dependencies. What is a dependency? A dependency exists when the behavior of one thing changes when the behavior of another thing changes. For example, a virus in my computer affects my productivity because I depend on my computer to do my work, but a virus in a doctor’s computer does not affect his productivity because he does not use the computer to do his work.

Dependencies in software are the cost of code reuse. When we reuse a method, we call it from more than one place. In most situations, the body of that method has to react differently depending on who is calling it.

The solution is to make one thing independent of the other. The other thing would still depend on the first one, and that is OK; that is code reuse. We have to turn a two-way dependency into a one-way dependency. How? Make the actors small. This can only be accomplished if the Single Responsibility Principle is followed at the method level, meaning one method does exactly one thing.

Example:

Insert and Update:

You have an “insert” operation and an “update” operation. You have to do user-input validation in both cases. The validation is the same in both cases, except that in the case of an update you also have to validate the id of the record. The id of the record can be the value of an item in a drop-down list, or it can be in a hidden field, whatever. Before you start the update operation, you have to make sure that the id of the record is there, that it is numeric, and that it is non-negative.

You should make a method that validates all fields except the id field. Then you should make a separate method that validates only the id field. Now, in the insert workflow you call only the first method, and in the update workflow you call both methods. As a bonus, you get a ready-made validation function for the delete workflow. Note that in the delete workflow you only have to validate the id.
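The split could look something like this sketch. It assumes a record with a single `name` field and uses a `Map` as a stand-in for the database; all function names are hypothetical.

```javascript
// Validates every field except the id.
function DoValidateFields(record) {
  return typeof record.name === "string" && record.name.trim() !== "";
}
// Validates only the id: present, numeric, and non-negative.
function DoValidateId(id) {
  const text = id == null ? "" : String(id);
  return /^\d+$/.test(text);
}

function DoInsert(db, record) { db.set(String(db.size + 1), { name: record.name }); }
function DoUpdate(db, record) { db.set(String(record.id), { name: record.name }); }
function DoDelete(db, id) { db.delete(String(id)); }

// Insert workflow: field validation only.
function HandleInsert(db, record) {
  if (!DoValidateFields(record)) return false;
  DoInsert(db, record);
  return true;
}
// Update workflow: both validations.
function HandleUpdate(db, record) {
  if (!DoValidateId(record.id) || !DoValidateFields(record)) return false;
  DoUpdate(db, record);
  return true;
}
// Delete workflow: id validation only.
function HandleDelete(db, id) {
  if (!DoValidateId(id)) return false;
  DoDelete(db, id);
  return true;
}
```

Neither validation actor contains a condition about which workflow called it; the directors choose the combination they need.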

Create File If Not Exist Then Write In It:

In some situations you have to write to a file if it exists, and if it does not exist, create the file first and then write to it. These are two separate operations: create the file if it does not exist, and write to the file. There should be two methods.
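A minimal sketch of the two methods, assuming a Node.js environment and its built-in `fs` module (the article itself does not name a platform). The function names are hypothetical. The writer opens the file with the "r+" flag, which fails if the file is missing, so that it really does only one thing.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Actor 1: create the file only if it does not exist yet.
function DoCreateFileIfNotExist(filePath) {
  if (!fs.existsSync(filePath)) fs.writeFileSync(filePath, "");
}

// Actor 2: append text to an existing file; never creates one.
function DoWriteFile(filePath, text) {
  const fd = fs.openSync(filePath, "r+"); // "r+" throws if the file is missing
  try {
    const end = fs.fstatSync(fd).size;    // write position: end of the file
    fs.writeSync(fd, text, end);
  } finally {
    fs.closeSync(fd);
  }
}

// Director: decides the order of the two operations.
function HandleWriteFile(filePath, text) {
  DoCreateFileIfNotExist(filePath);
  DoWriteFile(filePath, text);
}

// Usage in a fresh temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "director-actor-"));
const demoFile = path.join(demoDir, "notes.txt");
HandleWriteFile(demoFile, "first line\n");
HandleWriteFile(demoFile, "second line\n");
console.log(fs.readFileSync(demoFile, "utf8"));
```

A workflow that must never create a file can call `DoWriteFile` alone, without any condition being added to it.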

Allow User to Type Numeric Keys With Or Without Dot:

In a web page this is client-side scripting. You catch the key the user pressed, and if it is not numeric you cancel the event. There can be two situations. In one, the user can type only an integer, such as the id of a record or a count of something, that is, things which cannot be decimal. In the other, the user can input a decimal, such as the price of an item. You want to validate the key in both cases and decide whether to cancel the event.

You should make two methods: one validates numeric keys only, without dots, and the other validates dot keys only. Then, in a situation where you need to accept both, write something like this:

if (!(DoIsValidNum(txtData.value) || DoIsValidDot(txtData.value)))
    evt.Cancel = true;

When you validate numeric keys only:

if (!DoIsValidNum(txtData.value))
    evt.Cancel = true;
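For reference, here is a runnable variant of the same idea, written as a sketch against the standard DOM keyboard event (`evt.key` and `evt.preventDefault()`) rather than the `txtData.value` and `evt.Cancel` names used above. The two director names are hypothetical. It checks single keys only: it does not stop a second dot from being typed, and it would also block keys such as Backspace unless those are let through separately.

```javascript
// Actors: each checks exactly one thing about the pressed key.
function DoIsValidNum(key) {
  return /^[0-9]$/.test(key);
}
function DoIsValidDot(key) {
  return key === ".";
}

// Director for integer-only fields, such as the id of a record.
function HandleIntegerKey(evt) {
  if (!DoIsValidNum(evt.key)) evt.preventDefault();
}

// Director for decimal fields, such as the price of an item.
function HandleDecimalKey(evt) {
  if (!(DoIsValidNum(evt.key) || DoIsValidDot(evt.key))) evt.preventDefault();
}
```

In a page, these directors would be called from the key event handlers of the relevant text boxes.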

Summary

Things are better off in a system than alone. It is good for them (risk reduction) and for the system (resiliency).

There are needs that make systems unavoidable: specialization, simultaneity, and readiness.

The effectiveness of a system is directly proportional to the number of interactions between its parts. Unfortunately, the complexity of the system is also proportional to that number.

There is a way to reduce complexity while keeping effectiveness at the same level: allow only one kind of worker to use other workers.

In a computer program:
- Only one type of method, the Directors, should be able to call other methods, the Actors, which do the work.
- A Director shouldn't be allowed to call another Director.
- An Actor should perform only one task.
- A Director shouldn't do any work except orchestrating the Actors.
- A Director should decide which Actors to call and when, in what order, what to pass to them, and what to do with the returned values.
- Exception handling should be done by Directors only.


10 October, 2011
