Refactoring Techniques

Introduction

Let’s begin by considering: “What is Refactoring?”

The definition of refactoring is:

a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behaviour

Refactoring is a term originated from the Smalltalk community of developers back in the mid-late nineties.

Two of the most prolific programmers of recent times, Martin Fowler and Kent Beck literally wrote the book on the subject of refactoring called “Refactoring: Improving the Design of Existing Code” (well, written by Martin with contributions from Kent).

In 2009 both Martin and Kent helped with a rewrite of the book that focused more on the Ruby language than the original book’s target language of Java. This follow-up book was called “Refactoring: The Ruby Edition” and it’s that book which is the primary driving force of this post.

Since reading the Ruby edition I wanted to have a short summarised version of some of the more commonly used refactoring techniques (mainly for my own reference). By that I mean the techniques described in the book that I find interesting and use a lot in my day to day programming life.

Languages

These refactoring techniques aren’t specific to the Ruby language (although my implementation examples are). You can use them when working with JavaScript or PHP (or any other language for that matter).

Programming languages don’t all offer identical APIs and so sometimes you might need to tweak the examples slightly to fit your environment.

Regardless, the idioms and syntax differences between languages become redundant when you just focus on the pattern(s) behind the proposed solution.

Why refactor?

The purpose of refactoring is to improve the quality, clarity and maintainability of your code. Simple really.

But also, refactoring can be a great lesson in understanding an unfamiliar code base.

Think about it, if you inherit a poorly designed code base that you’ve not seen before and you now need to either fix a bug or add a new feature, then implementing the code necessary would be a lot easier once you had refactored it to be in a more stable, maintainable and ultimately ‘understandable’ state.

Otherwise you would be forced to retro fit your new code on top of a poorly designed foundation and that would be the start of a very unhappy relationship.

When should you refactor?

You’ll usually find the time you start refactoring the most is when you are fixing bugs or adding new features.

For example, you typically first need to understand the code that has already been written (regardless of whether it was you who wrote it originally or someone else).

The process of refactoring helps you better understand the code, in preparation for modifying it.

But don’t fall into the trap of thinking that refactoring is something you set aside time for, or only consider at the start/end of a project. It’s not. Refactoring should be done in small chunks throughout the entire life cycle of the project.

As the great Uncle Bob once said:

leave a module in a better state than you found it

…what this suggests is that refactoring is essential to your daily coding process.

Tests

Before we get started, it’s important to mention that you should have tests in place when you’re refactoring.

You can refactor without tests, but realise that without tests to back you up then you can have no confidence in the refactoring you are implementing.

Refactoring can result in substantial changes to the code and architecture but still leave the top layer API the same. So while you’re refactoring remember the old adage…

program to an interface, not an implementation

We want to avoid changing a public API where ever possible (as that’s one of the tenets of refactoring).

If you don’t have tests then I recommend you write some (now)… don’t worry, I’ll wait.

Remember, the process of writing tests (even for an application you don’t know) will help solidify your understanding and expectations of the code you’re about to work on.

Code should be tested regularly while refactoring to ensure you don’t break anything. Keep the ‘red, green, refactor’ feedback loop tight. Tests help confirm if your refactoring has worked or not. Without them you’re effectively flying blind.

So although I won’t explicitly mention it below when discussing the different refactoring techniques, it is implied that on every change to your code you should really be running the relevant tests to ensure no broken code appears.

Refactoring Techniques

There are many documented refactoring techniques and I do not attempt to cover them all, as this post would end up becoming a book in itself. So I’ve picked what I feel are the most common and useful refactoring techniques and I try my best to explain them in a short and concise way.

I’ve put these techniques in order of how you might approach refactoring a piece of code, in a linear, top to bottom order. This is a personal preference and doesn’t necessarily represent the best way to refactor.

Final note: with some of the techniques I have provided a basic code example, but to be honest some techniques are so simple they do not need any example. The Extract Method is one such technique that although really useful and important, providing a code example would be a waste of time and space.

So without further ado, let’s begin…

Rename Method

The single most effective and simple refactoring you can implement is to rename a property/attribute, method or object.

Renaming identifiers can reduce the need for code comments and nearly always helps to promote greater clarity.

You’ll find that renaming things is a fundamental part of other refactoring techniques to aid understanding of the code.

This technique relies on giving items a descriptive name to ensure the developer knows at a glance exactly what it does. The following technique Introduce Explaining Variable is effectively the same.

Introduce Explaining Variable

So here is a technique specifically based around the premise of renaming.

If you have a complicated expression (for example, you’ll typically have a long winded set of conditions within an if statement) then place that complex expression into a temp variable and give it a descriptive identifier.

For example:

unless "This is a String with some CAPS".scan(/([A-Z])/).empty?
  puts "capitalised text was found"
end

Should be:

caps_not_found = "This is a String with some CAPS".scan(/([A-Z])/).empty?

unless caps_not_found
  puts "capitalised text was found"
end

Note: this is the only technique that finds temps (i.e. local variables) acceptable. This is because temps are deemed to be less reusable than methods (due to their very nature being ‘local’) and so introducing temps is something that shouldn’t be considered lightly. Maybe consider using the Extract Method technique instead before using this particular technique.

Also, don’t worry about performance until you know you have a performance issue to worry about. Developers will always suggest that calling methods is slower than running code inline, but good programming is about readability and maintainability, and extracted methods are not only easier to understand but are much more reusable by other methods.

So if you are considering using the Introduce Explaining Variable technique, first decide whether the temp would be more useful if it was available to other methods (that way you could use Extract Method instead and avoid defining a temp altogether).

Inline Temp

Temp variables are a bit of a code smell as they make methods longer and can make the Extract Method more awkward (as you’d have to pass through more data to the extracted method).

Inline Temp effectively removes the temp variable altogether by just using the value assigned to it (I’d only suggest doing this if the temp is only used once or if the resulting value has come from a method invocation).

For example:

def add_stuff
    1 + 1
end

def do_something
    temp_variable_with_descriptive_name = add_stuff
    puts "Number is #{temp_variable_with_descriptive_name}"
end

Should be:

def add_stuff
    1 + 1
end

def do_something
    puts "Number is #{add_stuff}"
end

Note: a temp by itself doesn’t do any harm, and in some instances can actually make the code clearer (especially if using a result from a method invocation and the method identifier doesn’t indicate the intent as well as it should).

But most likely you’ll end up using this technique to aid the Extract Method technique as less temp vars means less requirement to pass through additional parameters to the extracted method.

Split Temp Variable

This technique aims to resolve the concern of violating the SRP (Single Responsibility Principle), although slightly tamer in the sense that SRP is aimed more at Classes/Objects and methods, not typically variable assignments.

But regardless if a temporary variable is assigned to more than once and it is not a loop variable or a collecting/accumulator variable then it is a temp considered to have too many responsibilities.

For example: (this is a daft example, but what the heck)

temp = 2 * (height + width)
temp = height * width

Becomes:

perimeter = 2 * (height + width)
area = height * width

As you can see, the temp variable was handling more responsibility than it should be and so by creating two appropriately distinct temps we ensure greater code clarity.

Replace Temp With Query

This technique has a very similar intent to Inline Temp in that one of its primary focuses is to aid the Extract Method.

The subtle but important difference between this technique and Inline Temp is that the complex expression assigned to the temp needs to be first moved to a method (whereas the Inline Temp technique is different in that the temp may already be using a method invocation).

For example:

class Box
  attr_reader :length, :width, :height

  def initialize length, width, height
    @length = length
    @width  = width
    @height = height
  end

  def volume
    # `area` is the temp
    area = length * width
    area * height
  end
end

Becomes:

class Box
  attr_reader :length, :width, :height

  def initialize length, width, height
    @length = length
    @width  = width
    @height = height
  end

  def volume
    # notice `area` is now a direct method call
    area * height
  end

  def area
    length * width
  end
end

This technique can help to shorten a long method by not having to define lots of temp variables just to hold values.

If the extracted query method is given an identifier that aptly describes its purpose then the code still can be considered clear and descriptive.

Also, it is considered bad form to define a variable which changes once it has been set (hence moving to a method better indicates an unstable value).

Note: this technique can sometimes be made easier to implement once you’ve used Split Temp Variable.

Remember this technique (as with other techniques) is an incremental step towards removing non-essential temps, so consider using Inline Temp afterwards, thus removing the need for the temp altogether.

Replace Temp With Chain

This is yet another technique designed to rid your code of temp variables.

If you have a temp variable holding the result of calling an object’s method, and follow the assignment by using that temp to carry out more method calls, then you should consider chaining method calls instead.

The implementation is quite simple, you just have to ensure the methods called return self (or this if using a language like JavaScript).

By allowing methods to chain we again have the opportunity to remove an unnecessary temps.

For example:

class College
    def create_course
        puts "create course"
    end

    def add_student
        puts "add student"
    end
end

temp = College.new
temp.create_course
temp.add_student
temp.add_student
temp.add_student

Becomes:

class College
    # static method so can be accessed without creating an instance
    def self.create_course
        college = College.new
        puts "create course"
        college # return new object instance
    end

    def add_student
        puts "add student"
        self # refers to the new object instance
    end
end

college = College.create_course
                 .add_student
                 .add_student
                 .add_student

Extract Method

Here it is! In my opinion ‘The’ most used and important refactoring technique.

The implementation behind this technique is very simple. It consists of breaking up long methods by shifting overly complex chunks of code into new methods which have very descriptive identifiers.

For example:

class Foo
  attr_reader :bar

  def initialize bar
    @bar = bar
  end

  def do_something
    puts "my baz" # notice this is duplication
    puts bar
  end

  def do_something_else
    puts "my baz" # notice this is duplication
    puts "Something else"
    puts bar
  end
end

Becomes:

class Foo
  attr_reader :bar

  def initialize bar
    @bar = bar
  end

  def do_something
    baz
    puts bar
  end

  def do_something_else
    baz
    puts "Something else"
    puts bar
  end

  def baz
    puts "my baz"
  end
end

But be careful with handling local variables as you’ll need to pass them through to the extracted method and that can be difficult if there are lots of temps in use. Sometimes to facility the Extract Method you’ll need to first incorporate other techniques such as Replace Temp With Query and Inline Temp.

Inline Method

Sometimes you want the opposite of the Extract Method technique. Imagine a method exists whose content is already simple and clear, and whose identifier adds no extra benefit. In this instance we’re just making an extra call for no real benefit.

So to fix this problem we’ll convert the method invocation into an inlined piece of code (unless of course the method is used in multiple places, in that case leave it where it is as having it in a separate method keeps our code DRY).

Move Method

In a previous post about Object-Oriented Design I explained that you should query your classes/objects to ensure the methods they define are actually where they should be (another reason is ‘feature envy’, if a method is asking another class a lot of questions then it may be an indication the method is on the wrong object).

The Move Method technique ensures this decoupling by simply moving the identified misplaced method onto the correct one.

Once the method has been moved you should clean up the previously passed parameters by seeing what can be moved over to the other object or whether additional data needs to be passed over now via the method invocation.

For example:

class Gear
    attr_reader :chainring, :cog, :rim, :tire

    def initialize (chainring, cog, rim, tire)
      @chainring = chainring
      @cog       = cog
      @rim       = rim
      @tire      = tire

      # let's asked the question:
      # "Please Mr. Gear what is your tire size?"
      # hmm? notice this doesn't sound like it quite fits the purpose of a 'Gears' class
    end

    def ratio
      chainring / cog.to_f
    end

    def gear_inches
        # tire goes around rim twice for diameter
        ratio * (rim + (tire * 2))
    end
end

Becomes:

class Gear
    attr_reader :chainring, :cog, :rim, :tire

    def initialize (chainring, cog, rim, tire)
      @chainring = chainring
      @cog       = cog
      @rim       = rim
      @tire      = tire.size
    end

    def ratio
      chainring / cog.to_f
    end

    def gear_inches
        # tire goes around rim twice for diameter
        ratio * (rim + (tire * 2))
    end
end

class Tire
  def self.size
    5
  end
end

From the original class/object keep the original method in place while you test and change it so it now delegates to the method on the new object. Then slowly refactor by replacing delegating calls throughout your code base with direct calls to the method via its new host.

Finally, remove the old method altogether and the tests should tell you if you missed a replacement somewhere.

Replace Method With Method Object

You may run into a problem where you have a long method you want to use Extract Method on, but the number of temporary local variables are too great to allow you to utilise the Extract Method technique (because passing around that many variables would be just as messy as the long method itself).

To resolve this issue you could look at different types of smaller refactors (such as Inline Temp) but in some cases it would actually be better to first move the contents of the long method into an entirely new object.

So the first thing to do is create a new class named after the long method and add the temp local vars as properties/attributes of the class/object.

Now when you try to implement Extract Method you don’t have to pass around the temp vars because they are now available throughout the class/object.

Then from within the original class/object you can delegate any calls to the original method on to the object (you’ll still pass on the original arguments to the method within the new object but from there on the method extraction becomes easier).

For example:

class Foo
  def bar
    puts "We're doing some bar stuff"
  end

  def baz(a, b, c)
    if a == 'something'
      # do something
    end

    if b == 'else'
      # do else
    end

    if c == 'none'
      # do none
    end
  end
end

Becomes:

class Foo
  def bar
    puts "We're doing some bar stuff"
  end
end

class Baz
  attr_accessor :a, :b, :c

  def initialize(a, b, c)
    @a = a
    @b = b
    @c = c

    if a == 'something'
      # do something
    end

    if b == 'else'
      # do else
    end

    if c == 'none'
      # do none
    end
  end
end

From here we’re now in a better state to use both the Extract Method and Replace Conditional with Polymorphism techniques to refactor the Baz class.

Replace Loop With Collection Closure Method

If you write a loop that parses a collection and interacts with the individual elements within the collection then move that interaction out into a separate closure based method (meaning you replace the loop with an Enumerable method).

This refactoring may not be as clear or impressive as other refactoring techniques but the motivation behind it is that you hide the ugly details of the loop behind a nicer iteration method, allowing the developer looking at the code to focus on the business logic instead.

For example:

managers = []
employees.each do |e|
    managers << e if e.manager?
end

Becomes:

managers = employees.select { |e| e.manager? }

Ruby has a few of these types of enumerable methods but other languages such as PHP and JavaScript aren’t so lucky.

JavaScript has a couple of accumulators: Array#reduce and Array#reduceRight but they aren’t very useful as closure based collection methods compared to Ruby which has methods such as Enumerable#inject, Enumerable#select (seen in above example) or Enumerable#collect.

Note: in JavaScript you can implement a similar effect with clever use of closures.

Pull Up Method

When you have duplicated code across two separate classes then the best refactoring technique to implement is to pull that duplicate code up into a super class so we DRY (Don’t Repeat Yourself) out the code and allow it to be used in multiple places without duplication (meaning changes in future only have to happen in one place).

For example:

class Person
  attr_reader :first_name, :last_name

  def initialize first_name, last_name
    @first_name = first_name
    @last_name = last_name
  end

end

class MalePerson < Person
  # This is duplicated in the `FemalePerson` class
  def full_name
    first_name + " " + last_name
  end

  def gender
    "M"
  end
end

class FemalePerson < Person
  # This is duplicated in the `MalePerson` class
  def full_name
    first_name + " " + last_name
  end

  def gender
    "F"
  end
end

Becomes:

class Person
  attr_reader :first_name, :last_name

  def initialize first_name, last_name
    @first_name = first_name
    @last_name = last_name
  end

  def full_name
    first_name + " " + last_name
  end
end

class MalePerson < Person
  def gender
    "M"
  end
end

class FemalePerson < Person
  def gender
    "F"
  end
end

Form Template Method

The technique is reliant on inheritance: a parent class and two sub classes of that parent. The two sub classes have methods which have similar steps, in the same order and yet the steps themselves are different.

The technique involves moving the sequence of steps into the parent class and then using polymorphism to allow the sub classes to handle the differences in the steps.

Here is a silly example (I’m no good at giving real examples; you may have noticed), here is an example of our problematic code…

class Foo; end

class Bar < Foo
  def initialize
    @hey = 1
    @hai = 2
  end

  def qux
    @a = @hey + @hai
    @b = @a * 10
    @a + @b
  end
end

class Baz < Foo
  def initialize
    @hey = 5
    @hai = 7
  end

  def qux
    @a = @hey + @hai
    @b = @a * 10 * 20
    @a + @b
  end
end

bar = Bar.new
baz = Baz.new

puts bar.qux
puts baz.qux

…we could try to inject the values each sub class requires but then we still have a lot of duplication in this code.

We can see the sequence of steps is:

determine what a should be
determine what b should be
return a specific calculation

…so we can clean up our code a little by abstracting the commonality…

class Foo
  def initialize(hey=1, hai=1)
    @hey = hey
    @hai = hai
  end

  def qux
    determine_a
    determine_b
    result
  end

  def determine_a
    @a = @hey + @hai
  end

  def result
    @a + @b
  end
end

class Bar < Foo
  protected

  def determine_b
    @b = @a * 10
  end
end

class Baz < Foo
  protected

  def determine_b
    @b = @a * 10 * 20
  end
end

bar = Bar.new(1, 2)
baz = Baz.new(5, 7)

puts bar.qux
puts baz.qux

Extract Surrounding Method

If you find you have different methods which contain almost identical code but with a slight variant in the middle, then pull up the duplicated code into a single method and pass a code block to the newly created method which it yields to in order to execute the unique behaviour…

def do_something
    puts 1
    yield
    puts 3
end

do_something { puts 2 }

This is actually a common pattern in Ruby known as the ‘wrap around’ method. This technique is similar to the Form Template Method, but is different in that you can use it without forcing an inheritance model on your code.

Note: JavaScript doesn’t have the ability to pass a code block but it can be replicated by passing a function that acts like a callback…

function doSomething (callback) {
        console.log(1);
        callback();
        console.log(3);
}

doSomething(function(){
        console.log(2);
});

…although in the latest versions of Node (as of November 2013) Generators are implemented and would allow JavaScript code to yield similar to how Ruby works.

Self Encapsulate Field

When inheriting properties from a parent class/object then it can be more flexible if the parent class only allows access to the properties from within a getter/setter.

The motivation for this technique is that a sub class can override and modify the behaviour of the getter/setter without affecting the parent class’ implementation. Which is similar to how the Decorator design pattern works (e.g. modifying the behaviour without affecting the original).

This technique should only be used once you find the coupling between objects is becoming a problem. Otherwise direct access to properties and instance variables should be acceptable initially.

For example:

def total
  @base_price * (1 + @tax_rate)
end

Becomes:

attr_reader :base_price, :tax_rate

def total
  base_price * (1 + tax_rate)
end

Introduce Named Parameter

When method arguments are unclear then convert them into named parameters so they become clearer (and easier to remember).

Although Ruby supports named parameters…

def turnOnTheTV (channel: 1, volume: 1); end
turnOnTheTV(channel: 101, volume: 10)

…neither PHP or JavaScript do, so for PHP you can pass an associated Array and with JavaScript you can pass an Object/Hash.

For example (JavaScript):

function turnOnTheTV(c, v){}
turnOnTheTV(101, 10);

Becomes:

function turnOnTheTV (config) {
    // config.channel === 101
    // config.volume  === 10
}
turnOnTheTV({ channel: 101, volume: 10 });

Note: ECMAScript 6.0 (the latest JavaScript specification - which is still being worked on as of Nov 2013) implements named parameters.

Remove Redundancy

This isn’t an explicit technique, more a grouping of techniques.

The principle idea being that: code evolves, and as it evolves you may find techniques you previously implemented (as part of an earlier refactoring) have since become redundant.

Imagine you implemented the “Introduce Named Parameter” technique (passing a hash with named properties as a single argument instead of multiple unidentified arguments).

Now, after some other refactorings have taken place, you discover the method originally refactored is no longer as complex and so your argument hash refactor has been reduced down to just a single named property.

In this particular scenario you should remove the named parameter and simply pass a single argument instead.

This principle applies with other refactoring techniques.

Imagine an earlier refactoring included implementing a default parameter value for a method call. As your code evolves, if you discover you now only ever call the method with an argument then the default value becomes redundant and makes the code more complex than it needs to be by providing a default value. So just remove the redundant code.

Dynamic Method Definition

Sometimes defining multiple methods can be wasteful when functionally they carry out similar steps.

For example, imagine we had the following code…

def failure do 
  self.result = "failure" 
end 

def success do 
  self.result = "success" 
end 

def error do 
  self.result = "error" 
end

Notice how the functions are structurally identical. They simply set a result property to have a value This can be refactored using Ruby’s define_method method (which let’s you create methods dynamically at run time)…

[:failure, :success, :error].each do |method| 
  define_method method do 
    self.result = method.to_s 
  end 
end

Note: you could also abstract this code into a more reusable (and easier to maintain) function like so…

def dynamic_methods(*method_names, &block) 
  method_names.each do |method_name| 
    define_method method_name do 
      instance_exec(method_name, &block)
    end 
  end 
end

You can also use this technique to help ease creating properties on an object. For example, I used this technique in my MVCP blog post to dynamically create instance variables…

require 'app/presenters/base' 
require 'app/models/person' 

class Presenters::Person < Presenters::Base 
  attr_reader :run, :name, :age 

  def initialize 
    @run = true 

    model = Person.new('Mark', '99') 
    prepare_view_data({ :name => model.name, :age => model.age }) 
  end 
end 

module Presenters 
  class Base 
    attr_accessor :model 

    def prepare_view_data hash 
      hash.each do |name, value| 
        instance_variable_set("@#{name}", value) 
      end 
    end 
  end 
end

Extract Class

This is a pretty standard technique which helps ensure your objects abide by the SRP (Single Responsibility Principle).

If you find your classes are doing too much then simply create a new class and move the relevant fields and methods over one by one (while running the tests as you go to ensure all code continues working as expected).

Doing so you’ll end up with two small, focused and clean classes which are easier to manage.

Hide Delegate

This technique focuses on the principle of object encapsulation. Specifically decoupling two or more objects by reducing the context the objects have of each other.

The following code demonstrates the idea…

module Bar
  def display
    puts "Bar Stuff"
  end
end

module Baz
  def display
    puts "Baz Stuff"
  end
end

class Foo
  include Bar

  def do_something
    display
  end
end

foo = Foo.new
foo.do_something

…as you can see, the user only needs to rely on the interface having a do_something method.

The implementation details of do_somthing (in this case the delegation off to another method) are hidden.

If we changed include Bar for include Baz, or maybe we don’t mixin a module at all and just write some code inside of do_something, it doesn’t matter because the public interface is set as far as the user is concerned.

Replace Array with Object

The motivation for this technique is to convert a simple data container which holds multiple data types into an object with clear and descriptive identifiers.

This principle helps to present your complex data into a more sensible format (I demonstrated this in a previous post on object-oriented design). This technique also makes the data interaction more maintainable by providing an easier and understandable interface to the data.

Here is an example where we’re violating the principle of a clean data interaction…

class Foo 
  attr_reader :data 

  def initialize(data) 
    @data = data 
  end 

  def do_something 
    data.each do |item| 
      puts item[0] 
      puts item[1] 
      puts '---' 
    end 
  end 
end 

obj = Foo.new([[10, 25],[3, 9],[41, 7]]) 
obj.do_something

Notice in the first example how our code has far too much knowledge (context) about the object it’s interacting with. It knows that the Array index zero holds an X coordinate and the Array index one holds a Y coordinate.

If that format was to change (let’s say the X and Y swap places) then that would cause our code to break in unexpected ways.

But now take a look at the following example which works around this issue by converting our complex data structure into a cleaner data format…

class Foo 
  attr_reader :new_data 

  def initialize(data) 
    @new_data = transform(data) 
  end 

  def do_something 
    new_data.each do |item| 
      # now we are able to reference easily understandable 
      # property names (rather than item[0], item[1]) 
      puts item.coord_x 
      puts item.coord_y 
      puts '---' 
    end 
  end 

  Transform = Struct.new(:coord_x, :coord_y) 

  def transform(data) 
    data.collect { |item| Transform.new(item[0], item[1]) } 
  end 
end 

obj = Foo.new([[10, 25],[3, 9],[41, 7]]) 
obj.do_something

…here we convert the Array into an object and instead can more easily and safely reference the data we’re interested in via recognisable property identifiers. This doesn’t mean if the data source changes that we’ll totally avoid all problems but it’ll be clearer where the problem is arising.

Replace Conditional with Polymorphism

This is one of the most useful refactoring techniques available to you, and there are two ways it can help:

It removes the code smell of conditional logic
It demonstrates perfectly the principle of object-oriented design

The following example shows the typical procedural attempt to handle different scenarios based on the data object type being passed…

class Foo
  def initialize(data)
    @data = data
  end

  def do_something
    if @data.class == Bar
      puts "Bar!"
    elsif @data.class == Baz
      puts "Baz!"
    elsif @data.class == Qux
      puts "Qux!"
    end
  end
end

class Bar; end
class Baz; end
class Qux; end

foo_bar = Foo.new(Bar.new)
foo_bar.do_something

foo_baz = Foo.new(Baz.new)
foo_baz.do_something

foo_qux = Foo.new(Qux.new)
foo_qux.do_something

…as you can see, if we have a new Class type we need to go back and to modify the Foo base class. This violates the OCP (Open/Closed Principle) which states a file should be open for extension but closed for modification.

For us to abide by OCP we can use polymorphism and a trusted interface/duck typing to solve the problem…

class Foo
  def initialize(data)
    @data = data
  end

  def do_something
    @data.identifier
  end
end

class Bar
  def identifier
    puts "#{self.class}!"
  end
end

class Baz
  def identifier
    puts "#{self.class}!"
  end
end

class Qux
  def identifier
    puts "#{self.class}!"
  end
end

foo_bar = Foo.new(Bar.new)
foo_bar.do_something

foo_baz = Foo.new(Baz.new)
foo_baz.do_something

foo_qux = Foo.new(Qux.new)
foo_qux.do_something

Notice we have removed the need for a conditional and just sent the message to the relevant object to be handled. Much cleaner and easier to maintain and scale.

Decompose Conditional

Not all conditional statements can be avoided through the use of polymorphism. In those cases you can simplify the conditional logic (and the subsequent statements) by extracting them into external methods.

Here is a simple example…

if date < SUMMER_START || date > SUMMER_END 
  charge = # some complex calculation if it's winter 
else 
  charge = # some other complex calculation if it's summer 
end

…which we can refactor like so…

if not_summer(date) 
  charge = winter_charge 
else 
  charge = summer_charge 
end

…much better.

Introduce Null Object

The motivation behind this technique is to avoid using a conditional whose purpose is to check whether a property exists or not before using it.

Here is a simple example of what we want to avoid…

class Post
  attr_reader :id

  def initialize id
    @id        = id
    @published = false
  end

  def self.find_and_publish id
    # Simulated database operation
    post = Posts.find { |post| post.id == id }
    post.publish unless post.nil?
  end

  def publish
    puts @published = true
  end
end

Posts = [Post.new(1), Post.new(2)]

Post.find_and_publish(0) # displays nothing
Post.find_and_publish(1) # displays true

…in the above example we check whether post is nil or not. If it isn’t nil then we call the publish method, otherwise we don’t do anything.

This is kind of ugly.

The following code demonstrates how we can avoid that problem by introducing the concept of having an object to handle null scenarios (it’s the same principle of using duck typing/trusted interfaces/polymorphism)…

class Post
  attr_reader :id

  def initialize id
    @id        = id
    @published = false
  end

  def self.find_and_publish id
    # Simulated database operation
    post = Posts.find { |post| post.id == id } || NullPost.new
    post.publish
  end

  def publish
    puts @published = true
  end
end

class NullPost
  def publish
    # noop
  end
end

Posts = [Post.new(1), Post.new(2)]

Post.find_and_publish(0) # displays nothing
Post.find_and_publish(1) # displays true

…as you can see, effectively we have the same code with the exception that we no longer check for nil? in the second example and instead we rely on another object NullPost implementing the same interface but returns a null related value.

This way we’re using objects to handle our logic. Yes, we end up with more code (one extra Class) but ultimately this is more maintainable and understandable than lots of inline logic.

Conclusion

There are still many different refactoring techniques that I’ve not included. But hopefully you’ve found this quick reference useful so far.