I'm Don Werve. I help companies build software that scales. I love talking shop, so feel free to shoot me a message – I reply to everyone that does, time and spam-filter permitting.

Join my mailing list, and you too can learn from my mistakes experience.
Untangling the God Object.

Back around the year two-thousand, shortly after the world failed to end on schedule, I worked as a sysadmin for a company now known only as “LexisNexis”. My job consisted of building out automated deployment and provisioning systems, all of which were written in Perl, and some web-based frontends to those tools, most of which were written in PHP.

Even in those days, when we had to walk uphill – both ways – just to get to our compilers, web development was a lot of fun, and I wanted to learn more. Since the company I worked for sold a large webapp built using Serious Enterprise Tools, I figured that if I wanted to get into Serious Web Development, I should probably learn Java.

So I signed up for an “Introduction to Object-Oriented Programming in Java” course at my local community college.

They might as well have taught me to play the banjo.

If you’ve ever taken a class on Object-Oriented Programming (OOP), or read through a book that explains the fundamental concepts, you probably saw an example – somewhere in the chapter where they explain what objects are – that looked a bit like this:

class Animal
    # methods shared by all animals
end

class Pig < Animal
    # pig-specific code
end

class Dog < Animal
    # dog-specific code
end

There’s a mountain of weapons-grade badness here, but we’re going to focus on one specific concept: the idea that objects are things. Maybe not a Pig or a Dog, but certainly a User, a Shipment or a Product.

It’s such a sneaky idea. So subtle. I didn’t even realize this how I was thinking of objects, until years later, when I received a much-needed smack-in-the-head by a talented coworker. What makes it super awful is that things work just fine… at first.

To see what I mean, we’re going to need to look at some real-world code.

Let’s make a mess.

Many moons ago, I was tasked with adding search to an application for a client. This isn’t the exact code I wrote for them, of course, but it’s a good idea of my general approach at the time. I’m also hugely simplifying the code, so that we can focus on the insidious – rather than the obvious – badness.

This particular client sold things on the internet, and the application was their online store, little more than a catalog with a shopping cart. The client wanted to add search to make products more discoverable, which is simple enough.

Since Product is an object, it should know how to search for itself, so let’s add a static method to the class:

class Product < ActiveRecord::Base
    # Many, many other codes live up here.

    def self.search(query)
        keywords = split_query_into_keywords(query)
        build_finder_from_keywords(keywords)
    end

    def self.split_query_into_keywords(query)
        query.split(/\s+/)
    end

    def self.build_finder_from_keywords(keywords)
        keywords.inject(Product) do |model, keyword|
            model.where("description LIKE ?", keyword)
        end
    end
end

Even if you’re not that into Rails, it should be easy to work out what’s going on here. We’ve added a search method to the Product class, which splits a user-supplied query apart by spaces, then searches for products that match all terms in the description.

The next step is to try and make our search a little more relevant, by sorting the resulting product list by popularity. This isn’t so bad; we can just add another method to handle ordering:

def self.search(query)
    keywords = split_query_into_keywords(query)
    finder = build_finder_from_keywords(keywords)
    order_results_by_popularity(finder)
end

def self.order_results_by_popularity(finder)
    # Big SQL goes here.
end

Still not too bad. Now, customers are price sensitive, so the client wanted customers to be able to search by price as well as by the description. So if the customer typed in something like $30, then we should find products in that price range:

def self.build_finder_from_keywords(keywords)
    keywords.inject(Product) do |model, keyword|
        result = maybe_finder_from_price(model, keyword)
        result || maybe_finder_from_keyword(model, keyword)
    end
end

def self.maybe_finder_from_price(model, keyword)
    return unless (match = keyword.match(/^\$(\d+)/))
    price = match[1].to_i
    lower = (price * 0.90)
    upper = (price * 1.25)
    model.where("(price > ? AND price < ?)", lower, upper)
end

def self.maybe_finder_from_keyword(model, keyword)
    model.where("description LIKE ?", keyword)
end

Things are starting to get a little uglier. It’s all contained within Product, but at every step, we’re adding more and more methods that do nothing but hang off of Product.search, and this is in a simplified-for-a-blog-post example. The real code was as appealing as a a pork milkshake.

The next thing my client wanted was suggestions. Given a user’s purchase history, recommend items that other people with similar tastes have purchased, that are in the same price range as what the user is searching for.

Now we’ve got a problem. The database queries aren’t that bad, but now Product#search needs to know about the user performing the search, and we need to pass back more than just an array of results… where do we even put things? Maybe just return a hash?

def self.search(query, user)
    keywords = split_query_into_keywords(query)
    finder = build_finder_from_keywords(keywords)
    ordered_results = order_results_by_recent_popularity(finder)
    suggestions = find_friend_suggestions(ordered_results, user)
    { results: ordered_results, suggestions: suggestions }
end

def self.find_friend_suggestions(products, user)
    # More SQL goes here.
end

Of course, this breaks all of the code that was expecting Product.search to return an array, so we’ll need to update that. And next in the pipeline is saving which queries led to products being purchased…

Things are officially ugly.

Product has a bunch of new methods, all of which get called from Product#search. We’ve got variables getting passed around everywhere, and have no place to add out-of-band results to the search query.

Not to mention, we can’t store that user or query anywhere, because we’re in the Product class, and this is a multithreaded app, so if anybody else runs a search at the same time, we’ve got an nondeterministic bug with undefined behavior (those are buckets of fun to track down).

Every step was small and sensible, and yet I had neatly managed to wedge my wooden footwear deep inside the machine I was trying to build.

One object to rule them all.

By thinking of the Product as the bucket that should hold all product-related methods, I had accidentally created a God Object, and not one of the cool gods like Thor or Dionysus.

Cthulhu had come to my land of milk and honey.

Starting out, this felt like a very natural approach. Objects know everything about themselves. Rather than telling each object how to do something, you’re supposed to ask the object to do things for you – which is what I had been doing.

But this approach leads to a violation of the conjunction rule we touched on when we talked about writing readable code. To describe all of what Product does, we need large assortment of “ands”, a choice selection of “ors” and more than a few “buts” – and probably lots of profanity as well.

So it seems that we need to split things apart, but how?

You must construct additional pylons.

Let’s take a step back.

We write software to solve problems. This process involves taking what we know about the problem, figuring out what sort of solution would be acceptable, and then figuring out the bits in between those two things.

Those in-between bits usually involve turning data into other, different data. Our programs choreograph the process of turning web requests into data structures into credit-card transactions into boxes sent to peoples' houses on their birthdays – and in return, get a paycheck and an assortment of bagels on Thursdays.

To my mind, this sounds like a story – the epic tale of Alice Places An Order For Artisan Coffee. And stories are told by actors, which as it turns out, is a much better metaphor for how objects behave:

  • Each actor has a well-defined role: the hero, the villain, the plucky assistant, and so on.

  • When an actor does something out-of-character, the audience notices. We might not be able to predict the entire plot, but we have a good sense of how the character should behave.

  • Every action moves the plot forward.

Who knew that thespians had such a solid grasp of software engineering!

When I stopped thinking of objects as things, and started thinking of them as actors, something magical happened inside my brain.

Make a mess, then clean it up.

Let’s go back to our problem and try breaking things into a more sensible set of objects. At the moment, we’ve got two roles in one body: the Holder of Product-Knowledge and the Searcher of Things. We can start by moving search out of Product:

class Product < ActiveRecord::Base
    def search(query, user)
        ProductSearch.new(query, user).results
    end
end

…and into a separate ProductSearch class:

class ProductSearch
    attr_reader :query, :user

    def initialize(query, user)
        @query = query
        @user = user
    end

    def results
        keywords = split_query_into_keywords(query)
        finder = build_finder_from_keywords(keywords)
        order_results_by_popularity(finder)
    end

    private

    # Followed by all the search methods from Product.
end

Leaving Product.search where it is is intentional. Other code depends on search being where it is, and we don’t want to have to change any more than is really necessary. This is a the essence of refactoring – the code is cleaner, but we’re not adding any new features.

Now it’s time to tackle the problem we got stuck on before: adding suggestions to the search results. This creates a new role in our code: the Suggester of Products:

class ProductSuggester
    def initialize(finder, user)
        @finder = finder
        @user = user
    end

    def results
        # Many deep SQL magics.
    end

    # Other methods that also do magic.
end

Which we then need to tie into the existing code. Right now, results are returned as a plain Array, which is no longer versatile enough to handle the role of Holder of Results.

Since we don’t want to rewrite everything that relied on search returning an array, we can craft a replacement that respects the Array interface, on to which we can hang a set of product suggestions:

class ProductSearchResult
    include Enumerable

    attr_reader :products, :suggestions

    def initialize(products, suggestions)
        @products = products
        @suggestions = suggestions
    end

    def each(&block)
        products.each(&block)
    end
end

Now that there’s a place to put those product suggestions, we can tie the final pieces together:

class ProductSearch
    def results
        keywords = split_query_into_keywords(query)
        finder = build_finder_from_keywords(keywords)
        products = order_results_by_popularity(finder)
        suggestions = ProductSuggester.new(finder, user).results
        results = ProductSearchResult.new(products, suggestions)
    end
end

Not only is each piece small enough to understand, but we now have a clear path for adding new functionality going forward – and we didn’t even need to make drastic changes to the existing codebase.

A Natural Consequence of Extreme Sanity.

Even though we started with (my) tangled mess, it wasn’t that nasty to fix, because we rethought the code first, rather than just scrapping everything and starting over from scratch.

I’ve led my share of rewrites, and experience tells me that even if you rebuild everything, if you don’t change how you code, then the end result will probably be a worse mess than what you started with.

For me, this approach – thinking of objects as a set of actors working together to solve the problem – has been not inly incredibly enlightening, but a key part of learning to build software that can scale.

It’s a pity it never came up in any of my programming classes. Clever algorithms and data structures are certainly useful, but I would rather have learned first how to not bury myself – not to mention the rest of my team - under a pile of technical debt.

Maybe I should have studied theater instead…