What is code smell?
A code smell is an indication of poor code quality. If code smells are present, you get the feeling that something is wrong when viewing the code. We’ll show you the known code smells and explain how to get rid of them.
What is a code smell?
A code smell is a characteristic of code that causes experienced programmers to feel that the code isn’t clean. Having a good `nose´ for code smells is roughly comparable to the intuition of a master craftsman. If he sees a tangle of cables or badly laid wires, he immediately knows that something is wrong. Looking at the source code is the only way to notice the difference.
The well-known British programmer Martin Fowler points to his colleague Kent Beck as the creator of the term. On Fowler’s blog, he provides the following definition:
`A code smell is a surface indication that usually corresponds to a deeper problem in the system.´ – Source: https://martinfowler.com/bliki/CodeSmell.html
Therefore, a code smell indicates a systemic problem. Either the code was written by a programmer with insufficient competence or by constantly changing programmers. In the latter case, the fluctuating levels of competence, knowledge of the codebase and familiarity with guidelines and standards leads to poor code quality. The code starts to stink.
A code smell is not to be confused with selective deficiencies, which are to be expected in every codebase. First, a code smell is only an indication. If a programmer notices a code smell, further examination is necessary to determine whether there’s actually a systemic problem.
If it becomes apparent that the code smells because it’s bad, there are different approaches to cleaning up the code. Often these are refactoring approaches, which are designed to improve the structure of the code while maintaining functionality. However, depending on the size and complexity of the code, it may be impossible to remove code smells. Then only a `re-write´, i.e., starting from scratch, will help.
What kinds of code smells are there?
Code is a medium that exists at different levels of abstraction. Let’s think of an application. This consists of a multitude of individual items, all of which are defined in code like variables, functions, classes, modules. To consider the different code smells, we distinguish between the abstraction levels:
- General code smells
- Code smells on a functional level
- Code smells on a class level
- Code smells on an application level
General code smells
General code smells can be found on every level of an application. In most cases, they are not clear errors, but rather patterns that have a negative impact. A rule of thumb in programming states that code should be written primarily for the reader. Inexperienced programmers often write code that delivers the desired result but is incomprehensible.
Poor naming of code constructs
Naming individual code constructs well, such as variables and functions, is a fine art. Unfortunately, there are often meaningless, nonsensical or even contradictory names in code. The prime example is variable names consisting of only one letter: x, i, n, etc. Since these names don’t give any context, sense and purpose are not apparent.
It’s better to use expressive names. Then the code reads like natural language and it’s easier to follow the program flow and detect logical inconsistencies.
No uniform code style
Clean code looks like it was written as an integrated whole. You can immediately see that the code was created based on certain rules. If this regularity is missing, it’s a code smell.
Variation in naming suggests that several people wrote parts of the code independently of each other. If it was a team, it suggests there was obviously no uniform specification.
Curious? Read our article `What is clean code?´.
Undefined variables
Some languages such as C++, Java and JavaScript allow you to declare a variable without specifying an initial value. The variable exists from the declaration but is not defined. This may result in subtle bugs. This code smell is particularly fatal in carelessly typed languages because without an initial definition, it’s not recognisable which type is expected for the variable.
Hard-coded values in the code
Inexperienced programmers often make this mistake. To compare a variable with a specific value, they enter the value directly. This is also referred to as `hard-coded values´. Hard-coded values are problematic if they appear in several places in the program. This is because the individual copies tend to mutate independently over time.
It’s better to define a constant for the value. This establishes a `single source of truth´ (SSOT), i.e., any code that needs the value accesses the same constant defined in a single place.
Magic numbers in the code
A magic number is a special case of hard-coded values. Let’s imagine that our code works with a value limited to 24 bits. 24 bits correspond to 16,777,216 possible values or the numbers from 0 to 16,777,215.
Unfortunately, a comment is missing so it’s not clear how the value came about later. A magic number has been created. The lack of knowledge about the origin of the value increases the risk that it will be changed accidentally:
We can spare ourselves these sources of error. We include the definition of the value as an expression in the assignment of the constant. This way, we make the idea behind the value explicit. Optionally, we use a subsequent assert statement to document the actual value in the code:
Deeply nested control statements
In most languages, it’s possible to switch code constructs arbitrarily deeply. Unfortunately, this also increases the complexity, which makes it more difficult to see through the code when reading it. A particularly common code smell is deeply nested control statements such as loops and branches.
There are several approaches to resolving the nesting. Here we use the Boolean AND operator to place the two conditions within one `if´ statement:
Another approach, which works with any number of conditions, uses `guard clauses´. We break out of the function as soon as one of the necessary conditions is not fulfilled:
The rating “deeply nested” is subjective. As a rule of thumb, control statements should be nested a maximum of three levels deep, and four levels deep in exceptional cases. Deeper is almost never advisable or necessary. If you feel tempted to nest deeper, you should consider refactoring.
Code smells on a functional level
In most languages, functions are the basic unit of execution of code. There are smaller pieces of code, such as variables and expressions, but these stand alone. Writing clean functions requires some experience. We’ll show you a few of the most common code smells on a functional level.
Lack of handling of exceptions
Functions receive input values as arguments. In many cases, only certain values or value ranges are valid. It’s the responsibility of the programmer to check input values for validity and handle exceptions accordingly.
Access to variable from higher-level scope in function
Scopes are a fundamental feature of most programming languages. They determine which names are defined at which positions in the code. Subordinate scopes inherit from the scopes above them. If an externally defined variable is accessed within a function, this is a code smell. This is because the value may have changed between the definition of the variable and the function call:
Ideally, you should only access values within functions that have been presented as arguments. To avoid having to present the same value over and over again, default parameters are used:
Accessing a variable from the scope of an enclosing function creates a `closure´. This is not a code smell. Closures are an important concept in JavaScript programming.
Code smells on a class level
Object-oriented programming (OOP) can help to increase code reuse and reduce complexity. Unfortunately, the odds of making mistakes in class design are high, which later become noticeable as code smell.
One of the most common code smells on the class level is the ‘god object’ or the corresponding ‘god class’. A god object combines all kinds of functionality that don’t actually belong together and violates the principle of ‘separation of concerns’.
Code smells are often encountered in connection with the inheritance hierarchy. Sometimes code is unnecessarily distributed across multiple levels of inheritance. The mistake is also often made when inheritance is used instead of composition as the primary approach to composing objects.
The unintended use of ‘data classes’ is also considered a code smell. These are classes that don’t implement their own behaviour. If no methods are defined other than generic getters and setters, you should consider using a simpler data structure such as a Dict or a Struct.
Code smells on an application level
It’s every programmer’s nightmare: you’re assigned to work on an existing application, and a first look at the code reveals that the entire application is in a single, huge file. It’s unclear how the individual code components are related. Like a bowl of pasta, everything is haywire in the ‘spaghetti code’.
Abstractions are missing, there’s no subdivision into classes and, at best, only a few functions. Instead, there are all sorts of code duplications, which result in subtle bugs. Furthermore, there’s a lot of use of global variables.
The use of global variables is a particularly pervasive code smell. Because global variables can potentially be changed from anywhere in the code, you open the door to hard-to-fix errors. When debugging, you are forced to look for errors everywhere.
Another application-specific code smell is the lack of separation of concerns. Especially in PHP projects, it’s easy to mix markup, functionality and presentation in one file. This makes code reuse and refactoring difficult. It is better to strive for a clean separation of concerns, e.g., by using the ‘Model-View-Controller’ (MVC) pattern.
Another application-specific code smell is the lack of separation of concerns. Especially in PHP projects, it’s easy to mix markup, functionality and presentation in one file. This makes code reuse and refactoring difficult. It is better to strive for a clean separation of concerns, e.g., by using the “Model-View-Controller” (MVC) pattern.
How do you get rid of a code smell?
In principle, it’s preferable to continuously keep code clean. The best way to do this is to follow the pattern:
- Prototype: create draft
- Test: test the functionality
- Refactor: clean up the code
- Ship: deliver code in production environment
Unfortunately, the third point is often skipped, especially when the decision is made by a manager with no coding experience. The reasoning is: the code works so why keep putting effort into it? Over a longer period of time, the code starts to stink and technical debt accumulates.
If an existing codebase contains code smells, it’s not possible to start from scratch. The task then is to clean up the code. The best approach is to take an incremental approach and first delineate areas within the codebase that can be cleaned up relatively easily. Separate the working or improvable parts from the ‘dark corners’ that are better left alone.
Separating the less stinky code areas from the more stinky ones reduces complexity. This makes it easier to test code and perform debugging steps. Automated code review tools that detect code smells and offer suggestions or tips for cleaning up can help.
Refactoring approaches are effective in removing code smells. Functionality is encapsulated in functions or existing functions are split. Removing unused sections of code and improving the naming of the code makes the codebase leaner. The golden rules ‘Don’t repeat yourself’ (DRY) and ‘You ain’t gonna need it’ (YAGNI) help as guidelines.