Writing good logic in code
This article does not relate to a specific language, but to many languages although I am only familiar with about 10 so you will have to decide whether the comments are appropriate for yours or not.
There are two major points to cover and the reason for needing good logic is simple. I'm not sure there are many statistics but I bet the majority of software bugs are related to broken or incomplete logic. The first point is that we need to reduce the amount of logic required in the first place and the second is that we need to simplify and rework any logic that is required in order to make it readable, maintainable and testable.
How can we reduce logic? The first and obvious point is that we need to learn it properly. I saw some code the other day that said something like:
and it made me chuckle but this is typical of the first point. Can you see what is wrong with it? If you are checking for the string SC appearing in the first two characters of a string and then checking its length is equal to 2, surely the string EQUALS "sc" in other words, the following would be equivalent:
Which is clearer of the two? The second is by far clearer and has many less potential defects. The first example could have an accidental = instead of ==, it could have different string references for the first and second checks and it could get the substring values wrong. All of these on top of the basic potential defects found in both examples. Now imagine if we counted the potential defects in a piece of software before and after rationalising poor logic like this and you might have a 50% reduction in defect risk - nice! There are plenty of examples of places where logic is confusing, how many people have used horrific "if, then, else if, if, then end if.." as if that is perfectly acceptable. Ask your bosses to enforce a simple logic policy that something is simple or it is re-factored.
The second way to reduce logic is to use polymorphism and inheritance to create structural logic. You need to understand the difference between structural logic and behavioural logic because one should always be implemented in polymorhism and the other might use polymorphism. A car does not need to control whether the drive-shafts turn the wheels because for a given car, they always will. That is not to say that all drive-shafts turn wheels but for a given car, once it is built, that is how it works. On the other hand, the gear to use in the car is not fixed but is dependent on how the driver is driving it. It requires behavioural control and can change frequently, the drive shafts are fixed and have logic or behaviour dependent on the structure. How do we equate this to software? Suppose we have a simple application that has 2 dialogs. One for normal users and one for administrators, they have largely the same properties and perhaps one has an additional option on it. When we want to query the properties set by the user in the dialog, we could say:
or we could think we are being slightly cleverer by using the pointers/references:
Sure, it's not the end of the world but it already has a built-in assumption in the first example that the dialog has a name that is not going to change and in the first and second one that there are only two dialogs. If you add another type in both of these examples, the code will still compile. This is a structual situation since once the dialog type is defined, presumably it stays set until the application is 'rebuilt' or restarted for another user. In this case, we could create a base class or interface for our dialogs, keep a pointer to the base class and lose the logic:
Ah you say, but I do things differently for each dialog and need the logic brackets to separate it. Most of the time you do not. At the level you are handling this, you probably do not need to know about the detail but if you do, ask the dialog to do it or ask the dialog for a handler class which can be specialised for each type so that the caller still does not need to know what type of dialog is displayed. If you create a new type, you will need to implement the base class interface or dialog so once the compiler is happy, you will have precisely 0 strutural defects with your code!
Behavioural logic is related to things that change all the time, you might receive a network message and decide what to do based on the message number:
Remember that although this logic might be required you can still use design patterns to handle things in a way that does not create large and convoluted logic that is hard to maintain. You could argue that for simple cases the logic is OK but my experience is that there is no simple case - what if you accidentally mistype the value you are checking or put a value in twice, do you really test every single message to make sure it works?
The second area that is important is simplifying logic in order to make it usable. This follows on from the behavioural discussion since the structural logic should be hidden away in polymorhic function calls.
How do we simplify logic? Again, we need to learn how to re-factor logic and we can do things like logic reversal, so instead of:
which is very common in software, you can reverse the logic and have the individual items dictate what their logic needs to be:
Can you see how much neater that is? Consider this as a possible refactoring tool whenever you see terse logic statements.
Another technique is very simple but often underused. You create a function with a helpful name and you move the logic into that function or more commonly functions. I can only think it is laziness that prevents us creating these helper functions that can turn otherwise impossibly complex logic statements into a collection of helpful and very readible statements:
The logic that defines each of the two unrelated functions will make more sense in separate functions than lumped together in a single statement. Also if one type of message is not being handled, a single easily testable function can be examined.
The last suggestion for majorly helping after everything else is done is to use automated testing tools (many great free ones exist such as csunit - although not sure why we don't like paying for things!). I am constantly amazed about how often I am caught out by a very innocuous defect in an otherwise simple function. A function to add items to a list? How hard could that be but let us consider what might fail even in a single function. 1) The list could be null/unitialised, 2) The item might already exist in the list which might not be allowed 3) The list could be full 4) The item might not be the right type for the list (not always easy to trap with the compiler for un-typed collections) 5) You might be trying to insert it at an invalid position 6) The object you are trying to add might be null and this might not be allowed. You get the idea - there are often more considerations than we can think of so how to do we cope? We don't. We have peer code reviews, we get trained, we get experience, we use robust languages and frameworks and we test,test,test. How can we test the function? We can throw a whole load of automatic data at the function and find out what happens in certain realistic conditions or we specifically handle the exceptions or errors that might be thrown by an invalid condition (or both). We don't necessarily care about running out of memory after adding 4 gazillion items to a list but it might be an issue. We might instead care about what will happen with a full list or a null object, we might want to test the logic by setting external conditions and calling the function. If you can't easily test the function then break it down until you can. Remember that our functions often assume too much about the parameter data or member variables they use. These assumptions coupled with poor logic design is defect central!
Unfortuantely most logic issues are not found until after release because there are too many of them to test and many are very subtle or specific. By carefully approaching the design and build process, you will find a massive reduction in these!
There are two major points to cover and the reason for needing good logic is simple. I'm not sure there are many statistics but I bet the majority of software bugs are related to broken or incomplete logic. The first point is that we need to reduce the amount of logic required in the first place and the second is that we need to simplify and rework any logic that is required in order to make it readable, maintainable and testable.
How can we reduce logic? The first and obvious point is that we need to learn it properly. I saw some code the other day that said something like:
if ( myString.Length == 2 && myString.SubString(0,2).ToUpper() == "SC" )...
and it made me chuckle but this is typical of the first point. Can you see what is wrong with it? If you are checking for the string SC appearing in the first two characters of a string and then checking its length is equal to 2, surely the string EQUALS "sc" in other words, the following would be equivalent:
if ( myString.ToUpper() == "SC" )...
Which is clearer of the two? The second is by far clearer and has many less potential defects. The first example could have an accidental = instead of ==, it could have different string references for the first and second checks and it could get the substring values wrong. All of these on top of the basic potential defects found in both examples. Now imagine if we counted the potential defects in a piece of software before and after rationalising poor logic like this and you might have a 50% reduction in defect risk - nice! There are plenty of examples of places where logic is confusing, how many people have used horrific "if, then, else if, if, then end if.." as if that is perfectly acceptable. Ask your bosses to enforce a simple logic policy that something is simple or it is re-factored.
The second way to reduce logic is to use polymorphism and inheritance to create structural logic. You need to understand the difference between structural logic and behavioural logic because one should always be implemented in polymorhism and the other might use polymorphism. A car does not need to control whether the drive-shafts turn the wheels because for a given car, they always will. That is not to say that all drive-shafts turn wheels but for a given car, once it is built, that is how it works. On the other hand, the gear to use in the car is not fixed but is dependent on how the driver is driving it. It requires behavioural control and can change frequently, the drive shafts are fixed and have logic or behaviour dependent on the structure. How do we equate this to software? Suppose we have a simple application that has 2 dialogs. One for normal users and one for administrators, they have largely the same properties and perhaps one has an additional option on it. When we want to query the properties set by the user in the dialog, we could say:
if ( Dialog.Name == "Admin" )
{
}
else
{
}
or we could think we are being slightly cleverer by using the pointers/references:
if ( AdminDialog != null )
{
}
else
{
}
Sure, it's not the end of the world but it already has a built-in assumption in the first example that the dialog has a name that is not going to change and in the first and second one that there are only two dialogs. If you add another type in both of these examples, the code will still compile. This is a structual situation since once the dialog type is defined, presumably it stays set until the application is 'rebuilt' or restarted for another user. In this case, we could create a base class or interface for our dialogs, keep a pointer to the base class and lose the logic:
MyDialog.DoSomething(); // Will call whatever is currently attached
Ah you say, but I do things differently for each dialog and need the logic brackets to separate it. Most of the time you do not. At the level you are handling this, you probably do not need to know about the detail but if you do, ask the dialog to do it or ask the dialog for a handler class which can be specialised for each type so that the caller still does not need to know what type of dialog is displayed. If you create a new type, you will need to implement the base class interface or dialog so once the compiler is happy, you will have precisely 0 strutural defects with your code!
Behavioural logic is related to things that change all the time, you might receive a network message and decide what to do based on the message number:
if ( Message.Number == 1 )
{
// Handle message 1
}
else...
Remember that although this logic might be required you can still use design patterns to handle things in a way that does not create large and convoluted logic that is hard to maintain. You could argue that for simple cases the logic is OK but my experience is that there is no simple case - what if you accidentally mistype the value you are checking or put a value in twice, do you really test every single message to make sure it works?
The second area that is important is simplifying logic in order to make it usable. This follows on from the behavioural discussion since the structural logic should be hidden away in polymorhic function calls.
How do we simplify logic? Again, we need to learn how to re-factor logic and we can do things like logic reversal, so instead of:
if ( VariousConditions )
{
Control1.Enable = true;
}
else if ( SomethingElse )
{
Control1.Enable = true;
Control2.Enable = true;
}
else if ( AnotherThing )
{
// You get the idea
}
which is very common in software, you can reverse the logic and have the individual items dictate what their logic needs to be:
Control1.Enable = VariousConditions || SomethingElse;
Control2.Enable = SomethingElse || AnotherThing;
Can you see how much neater that is? Consider this as a possible refactoring tool whenever you see terse logic statements.
Another technique is very simple but often underused. You create a function with a helpful name and you move the logic into that function or more commonly functions. I can only think it is laziness that prevents us creating these helper functions that can turn otherwise impossibly complex logic statements into a collection of helpful and very readible statements:
if ( MessageIsTaggedAsUrgent(Message) || MessageIsFirstInQueue(Message) )
{
DealWithIt();
}
The logic that defines each of the two unrelated functions will make more sense in separate functions than lumped together in a single statement. Also if one type of message is not being handled, a single easily testable function can be examined.
The last suggestion for majorly helping after everything else is done is to use automated testing tools (many great free ones exist such as csunit - although not sure why we don't like paying for things!). I am constantly amazed about how often I am caught out by a very innocuous defect in an otherwise simple function. A function to add items to a list? How hard could that be but let us consider what might fail even in a single function. 1) The list could be null/unitialised, 2) The item might already exist in the list which might not be allowed 3) The list could be full 4) The item might not be the right type for the list (not always easy to trap with the compiler for un-typed collections) 5) You might be trying to insert it at an invalid position 6) The object you are trying to add might be null and this might not be allowed. You get the idea - there are often more considerations than we can think of so how to do we cope? We don't. We have peer code reviews, we get trained, we get experience, we use robust languages and frameworks and we test,test,test. How can we test the function? We can throw a whole load of automatic data at the function and find out what happens in certain realistic conditions or we specifically handle the exceptions or errors that might be thrown by an invalid condition (or both). We don't necessarily care about running out of memory after adding 4 gazillion items to a list but it might be an issue. We might instead care about what will happen with a full list or a null object, we might want to test the logic by setting external conditions and calling the function. If you can't easily test the function then break it down until you can. Remember that our functions often assume too much about the parameter data or member variables they use. These assumptions coupled with poor logic design is defect central!
Unfortuantely most logic issues are not found until after release because there are too many of them to test and many are very subtle or specific. By carefully approaching the design and build process, you will find a massive reduction in these!