Home » Best Practices » Extracting methods

Extracting methods

Overview

A block of noisy code will be analyzed in this article. This code does not clearly communicates what it is doing and violates Single Responsibility & Separation of Concerns principles. In order to improve the readability and overall quality of the code, it will be refactored first using Extract Method and Extract Class techniques.

What this code is doing?

Can you recognize at first glance what the following method “Process” in an “Order” class is doing? How it processes the order?

    public class Order
    {
        ...
        
        public void Process()
        {
            int priceReductionInPercents = 0;
            if (DateTime.Now.DayOfWeek == DayOfWeek.Tuesday)
            {
                if (Items.Count() >= 10)
                {
                    priceReductionInPercents = 7;
                }
                else if (Items.Count() >= 5)
                {
                    priceReductionInPercents = 5;
                }
            }

            var totalPrice = Items.Sum(x => x.Price)*((100m - priceReductionInPercents)/100m);

            if (totalPrice < 0)
            {
                throw new NegativePriceException(this);
            }

            var package = new Package(Items, priceReductionInPercents, totalPrice);
            var bill = new Bill(Customer, package);
            _mailer.Send(bill);

            decimal shippingCostsReduction = 0;
            if (totalPrice > 100)
            {
                shippingCostsReduction = 100;
            }
            else if (totalPrice > 50)
            {
                shippingCostsReduction = 50;
            }

            if (shippingCostsReduction > 0)
                package.ReduceShippingCostsByPercents(shippingCostsReduction);

            _warehouse.Send(package, Customer.Address);
        }
    }

This example demonstrates a typical violation of the Separation of concerns (wikipedia) principle of object-oriented software development (the way I understand it), which helps to make the code cleaner, and to improve maintainability and readability. Obviously, the “Process” method is doing several different things. Some of these things are implemented directly in the method, like calculation of price reduction/discounts. Some are delegated to other classes (_mailer, _warehouse).

Let’s simply write down what the Process method is doing:

  • creates the package object
  • calls a mailer class to send a bill for this package per e-mail to the customer who ordered it
  • calls a warehouse class to trigger sending the package per post to the customer

Creation of package consists of 3 things:

  • calculate the price reduction
  • calculate the total price, with previously calculated reduction/discount
  • create the package object with ordered items. The package object also receives the total price, discount on total price and discount on shipping costs.

This is the exact workflow of processing the order! These actions (workflow steps) must be performed in this order, and that’s the purpose of the “Process” method. But why the workflow is not visible there at first glance?! Because it’s lost in the lines of maths mixed with instantiation of objects and other stuff. The method does not tell what it does but tells how several workflow steps are implemented.

To improve the readability of the code, the “Process” method should expose this workflow and be literally closer to the list of workflow steps we’ve already written down above. Extract Method is an essential and very simple refactoring technique required for doing that. By the way, if you work with C# and ReSharper, like I do – you need only to mark the code block and use a shortcut to extract method: it’s Alt+Ctrl+M in ReSharper keyboard schema.

That’s how the code looks like after the first refactoring which was simply extracting the code blocks to methods. It’s not perfect, still, but already improved. Can you recognize now at first glance what the “Process” method is doing?.

    public class Order
    {
        ...
        
        public void Process()
        {
            var priceReductionInPercents = CalculatePriceReductionInPercents();
            var totalPrice = CalculateTotalPrice(priceReductionInPercents);

            var package = CreatePackage(totalPrice, priceReductionInPercents);
            SendTheBillPerEmailFor(package);
            SendToCustomer(package);
        }

        private int CalculatePriceReductionInPercents()
        {
            if (!TodayIsTuesday())
                return 0;

            if (Items.Count() >= 10)
                return 7;

            if (Items.Count() >= 5)
                return 5;

            return 0;
        }

        private static bool TodayIsTuesday()
        {
            return DateTime.Now.DayOfWeek == DayOfWeek.Tuesday;
        }

        private decimal CalculateTotalPrice(int priceReductionInPercents)
        {
            var priceSum = Items.Sum(x => x.Price);

            var totalPrice = priceSum * GetPriceFactor(priceReductionInPercents);
            if (totalPrice < 0)
                throw new NegativePriceException(this);

            return totalPrice;
        }

        private static decimal GetPriceFactor(int priceReductionInPercents)
        {
            return (100m - priceReductionInPercents) / 100m;
        }

        private void SendTheBillPerEmailFor(Package package)
        {
            var bill = new Bill(Customer, package);
            _mailer.Send(bill);
        }

        private Package CreatePackage(decimal totalPrice, decimal priceReductionInPercents)
        {
            var package = new Package(Items, priceReductionInPercents, totalPrice);
            var shippingCostsReduction = CalculateShippingCostsReduction(totalPrice);

            if (shippingCostsReduction > 0)
                package.ReduceShippingCostsByPercents(shippingCostsReduction);
            return package;
        }

        private static decimal CalculateShippingCostsReduction(decimal totalPrice)
        {
            if (totalPrice <= 50)
                return 0;

            if (totalPrice <= 100)
                return 50;

            return 100;
        }

        private void SendToCustomer(Package package)
        {
            _warehouse.Send(package, Customer.Address);
        }

Improved readability is already a big benefit in my opinion.

Also, if any of the code parts has to be changed, it’s a way easier to do it in a small method, than in a big one, where things are cross-wired and changing any part of it can cause side-effects in another part. For the same reason it is easier to track such changes afterwards as well.

Another benefit of extracting methods is that it opens the door to the next refactoring step – extracting classes. Now the code is better readable and improved overall. But it’s still lacking object orientation. The code (still) violates an important principle of object-oriented code – Single responsibility (wikipedia). Single responsibility principle means that every class should have a single responsibility, and that responsibility should be entirely encapsulated by the class. The “Order” class still implements and is responsible for several things:

  • the workflow of processing the order, as described above
  • the calculation of discount on items
  • the calculation of total price, with previously calculated discount
  • the calculation of discount on shipping costs
  • the creation of a package using the calculated discounts and total price

But the main responsibility of the class is to store some data (Items) and to implement the workflow of processing orders, not to implement every single step of this workflow. So what I did in the next refactoring was simply moving different aspects of business logic into their own objects:

  • DiscountOnItems class is responsible for calculation of discount on items (makes sense, right? )
  • Price class is responsible for calculation of total price of items, considering given discount
  • DiscountOnShippingCosts class is responsible for calculation of discount on shipping costs

All these classes have each a dedicated exclusive responsibility and do nothing else! The code after moving methods to the classes:

	public class Order
	{
		...
		public void Process()
		{
			var package = CreatePackage();
			SendTheBillPerEmailFor(package);
			SendToCustomer(package);
		}

		private Package CreatePackage()
		{
			var discount = new DiscountOnItems(Items);
			var price = new Price(Items, discount);
			var discountOnShippingCosts = new DiscountOnShippingCosts(price);
			return new Package(Items, price, discountOnShippingCosts);
		}

		private void SendTheBillPerEmailFor(Package package)
		{
			var bill = new Bill(Customer, package);
			_mailer.Send(bill);
		}

		private void SendToCustomer(Package package)
		{
			_warehouse.Send(package, Customer.Address);
		}

		...
	}

	
	public class DiscountOnItems
	{
		private readonly decimal _percentage;

		public DiscountOnItems(IEnumerable<IItem> items)
		{
			_percentage = CalculatePriceReductionInPercents(items);
		}

		public decimal ApplyTo(decimal amount)
		{
			var factor = (100m - _percentage)/100m;
			return amount * factor;
		}

		private int CalculatePriceReductionInPercents(IEnumerable<IItem> items)
		{
			if (!TodayIsTuesday())
				return 0;

			var itemsCount = items.Count();

			if (itemsCount >= 10)
				return 7;

			if (itemsCount >= 5)
				return 5;

			return 0;
		}

		private static bool TodayIsTuesday()
		{
			return DateTime.Now.DayOfWeek == DayOfWeek.Tuesday;
		}
	}

	
	public class Price
	{
		private readonly decimal _price;

		public Price(IEnumerable<IItem> items, DiscountOnItems discount)
		{
			var itemsTotalPrice = items.Sum(x => x.Price);
			
			_price = discount.ApplyTo(itemsTotalPrice);

			if (_price < 0)
				throw new NegativePriceException(this);
		}

		public bool IsOver(decimal limit)
		{
			return _price > limit;
		}
	}


	public class DiscountOnShippingCosts
	{
		private decimal _percentage;

		public DiscountOnShippingCosts(Price price)
		{
			_percentage = CalculateDiscountInPercents(price);
		}

		public void ApplyTo(Package package)
		{
			package.ReduceShippingCostsByPercents(_percentage);
		}

		private static decimal CalculateDiscountInPercents(Price price)
		{
			if (price.IsOver(100))
				return 100;

			if (price.IsOver(50))
				return 50;

			return 0;
		}
	}

The creation of package is still in the Order class only because I did not want to create something like a “PackageFactory” for that (which actually can be a very good idea if other discount types will appear). For me it feels more natural to have a constructor of the “Package” class to do this job, so I’d rather move the method “CreatePackage” to the new constructor, which only takes the Items as an argument:

public class Package
{
	...
	public Package (IEnumerable<IItem> items)
	{
		_discount = new DiscountOnItems(items);
		_price = new Price(items, discount);
		_discountOnShippingCosts = new DiscountOnShippingCosts(price);
		...
	}
...

After that the “Order” class will look like this:

        public void Process()
        {
            var package = new Package(items);
            SendTheBillPerEmailFor(package);
            SendToCustomer(package);
        }

        private void SendTheBillPerEmailFor(Package package)
        {
            var bill = new Bill(Customer, package);
            _mailer.Send(bill);
        }

        private void SendToCustomer(Package package)
        {
            _warehouse.Send(package, Customer.Address);
        }

A lot of implementation details, for which order processing is not directly responsible for, were moved out to the classes which ARE directly and exclusively responsible for that: calculation of discounts and even the creation of package. The “Process” method doesn’t care about how discounts are calculated, it only cares about packaging the items, sending a bill and sending a package. The class is now a way easier to read, understand and maintain. The workflow is visible at first glance.

Core of the problem

I often see methods which are much larger than in my example, with longer lines, more logic and heavily nested. The biggest problem is in my opinion that some developers disagree that the initial “Process” method is messy. They would find a code which follows the Separation of concerns principle too modular, the extracted methods and classes too small. I think, that the reason is that these developers are used to working with huge procedural methods so much so that they can’t recognize neither the problem of large method or lacking object-orientation, partly because they never did it differently and can’t do it differently (yet!). The principles of object oriented software development are all there for a good reason, it is important to read about them, to understand and apply them.

One can hardly argue with the improved readability. And this, together with other benefits I mentioned, are reasons enough to do the refactoring and to drop the habit of writing big code blocks.

A small step for developer, a big step for readability

Among the extracted methods there is even a very small one:

            private static bool TodayIsTuesday()
            {
                return DateTime.Now.DayOfWeek == DayOfWeek.Tuesday;
            }

This could be considered as too small piece of logic to create a method for. It’s just one line! However, first of all, the method provides exactly the information it should – if today is a Tuesday – and contains all the logic required to figure this out. So, it already deserves to be a separately encapsulated thing, because it handles an encapsulated and independent concern. Second, check how the method call looks like:

Before:

	if (DateTime.Now.DayOfWeek == DayOfWeek.Tuesday)
  After:

	if (TodayIsTuesday())

A developer would understand both, but the readability makes a big difference, making understanding faster.

Try to pronounce the first version:
“if date time now day of week equals equals day of week Tuesday”
Sounds really weird, like a secret code (ironically, it is computer code).

Compare with the method call:
“if today is Tuesday”
This is proper English and this is understandable immediately.

Extracting the methods brings the possibility to give pieces of logic good sounding, readable and fast understandable names, hiding the technical implementation behind these names. Remember – code is processed by compilers, but it will be read and has to be understood by people. Compiler doesn’t care, but your fellow developers do. If one can use proper human language, at least when it comes to writing code for business workflow, then it should be used.

Clothing storage metaphor


Having all implementation details together in one method is like having different kinds of clothing lying on the floor. At some point it becomes an unpleasantly looking heap, which grows constantly, starts to include other things than clothes, and at some point smells really strange.


Having all implementation details together in different methods of one class is like having all the clothing sorted but stored in a very large storage rack. There are labels on a rack saying where the jeans are stored and where the socks are and where each other type of clothing belongs to. Everybody who enters the room always sees all of the clothes. And you can’t just hide it, even if you don’t need certain clothes – the storage rack is in the room and all the clothing is always visible.


Having separate classes is like having all the clothing sorted and stored in a wardrobe with several doors or in a chest of drawers. One drawer has all the socks, another one all the jeans, etc. Each drawer has a label saying exactly what’s inside, like “Socks”, “Jeans”, etc. A person who enters the room don’t see any of it, unless opening a drawer, which is only done when a certain clothing type is needed. And actually that’s absolutely fine, because first – you don’t need clothing >99% of time you spend in this room, second – you can easily have any piece of clothing within seconds, and last but not least – it is aesthetically more pleasant.

Magically, there is unlimited number of drawers and chests available. If one drawer is full, you can always “create” another one, for example instead of the “Jeans” drawer make one drawer for the blue jeans and another one for the black ones.

Further reading

  • SOLID principles (wikipedia) – five basic principles of object-oriented programming and design, which include Single responsibility (“S” in SOLID).
  • “Refactoring” book by Martin Fowler, chapters “Extract method” & “Extract class”

The word “and”

While preparing this article I’ve found a brilliantly simple guideline of when a method should be splitted into several methods (source):

A co-worker of mine has the phrase he uses when judging if a given method needs to be broken up into smaller ones:
If, when describing the activity of the code to another programmer you use the word ‘and’, the method needs to be split into at least one more part.

But why there is a need to describe code verbally? The code can briefly describe by itself what it’s doing, because there is (almost) a complete freedom of naming variables and methods in code using a human language.

Picture Sources

Thanks for the pics:

Level Up Code, January 2014


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: