Wednesday, February 2, 2011

Stardate 2011.31.1: Why Me(ssage Brokers)?

Boarding the Enterprise (architecture)

Enterprise-level application development is often very different from non-enterprise application development. It requires a lot more discipline, and a much better understanding of the application's design by everyone on the project. The reason is fairly simple: the architecture very likely supports modularization, but if, at the lowest level, the developers have glued the modules together with static classes or plain and simple data sharing, then what was the point of having modules in the first place? Modules are intended to have clearly delineated boundaries between each area of the application, so if every module depends on every other module, then there's something disturbingly wrong.

How does one typically fix this? There are three ways, usually: A message broker, an enterprise service bus, or a simple service locator/inversion of control container, depending on the requirements of the system. I'd like to discuss all three here today, but I don't know enough about service buses to write anything intelligible about them. On the other hand, I've used service locators and message brokers out the yin-yang, so I'll write about those. For more info on service buses, I highly recommend Udi Dahan's blog, wherein he talks about them most of the time. :)

An IoC container/service locator is a simple construct that allows you to grab an implementation of an interface without having to know how it was instantiated or what the concrete class behind it is. This allows a very clean, simple architecture. I tend to use an IoC container rather than a service locator, simply for the all-powerful constructor injection, which requires each class to declare its dependency on the interface in question explicitly.

A message broker is most often used to communicate between processes or programs. Brokers typically support a Publish/Subscribe model, where a programmer can reach a whole host of programs by publishing a single message on the broker. The same model can also be used for inter-module communication within one application, and it cleanly supports future integration with other applications.

How are these two similar, and what is the motivation for using them in the first place?

A clean architecture may have explicit dependencies between modules, but it expresses them in the form of interfaces. These interfaces can be stored in an intermediate project so that everyone has access to them. While this gives that project a high afferent coupling, the coupling is to an interface, so it's less likely that something breaks when the implementing class has to change how it satisfies the interface. When you rely directly on the implementing class (instead of the interface), you get into trouble when that class has to change substantially, mainly because the code that uses it may need to change as well. If the calling code depends only upon the interface, i.e. the core functionality that the class provides, you save in two ways: the interface itself is less likely to change, and the implementation class is free to change however it needs to as long as it still satisfies the interface.

A service locator or IoC tool can be used to mitigate the complexity of the construction of an object, as well as the inherent coupling to the concrete implementation class. Whereas before you might have had:

IUsageTrackingService tracker = new SqlUsageService(connectionString, other_bs);

Now, you would have the rather simpler:

IUsageTrackingService tracker = ServiceLocator.Resolve<IUsageTrackingService>();

Instead of having whatever class needs access to the usage tracking service newing it up itself, it is pulled in through a static service locator here and the using-class doesn't need to know the connection string or anything else about the usage tracker in order to get a reference to one. You can see how useful this is if you ever need to unit-test the usage tracker: Do you want to have it crush a database, or simply output the usage into a string which can be inspected way more easily? And unit-testing the class that uses the usage tracker? FUHGEDABODIT! (Pass an interface reference into the constructor of the using-class, and just mock it.)
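To make the "pass an interface reference into the constructor" idea concrete, here is a minimal sketch. Only IUsageTrackingService comes from the example above; the Track method, InMemoryUsageService and AdvertisementPanel are names I made up for illustration, so treat them as assumptions rather than a real API.

```csharp
using System;
using System.Collections.Generic;

// The interface from the post; the single Track method is an assumed shape.
public interface IUsageTrackingService
{
    void Track(string userId, string action);
}

// Hypothetical test double: records usage in memory instead of hitting a database.
public sealed class InMemoryUsageService : IUsageTrackingService
{
    public List<string> Entries { get; } = new List<string>();

    public void Track(string userId, string action)
    {
        Entries.Add(userId + ":" + action);
    }
}

// Hypothetical using-class: its dependency is declared in the constructor,
// so nobody can create it without supplying a tracker.
public sealed class AdvertisementPanel
{
    private readonly IUsageTrackingService _tracker;

    public AdvertisementPanel(IUsageTrackingService tracker)
    {
        if (tracker == null) throw new ArgumentNullException(nameof(tracker));
        _tracker = tracker;
    }

    public void Click(string userId, string adId)
    {
        _tracker.Track(userId, "clicked:" + adId);
    }
}
```

In a unit test you would hand AdvertisementPanel an InMemoryUsageService (or a mock from whatever mocking library you like) and inspect Entries afterwards; in production the container would supply the SQL-backed implementation instead.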

A message broker works in a strangely similar fashion. Where the service locator/IoC tool still relies on the interface (in this case IUsageTrackingService), a message broker goes one step further and says, "I don't even know who is going to be getting this, but here's some indication that something is happening! AAAAAAGGHHHHH!" Here's how I think of a message broker: When you need something to happen (let's say we're still on usage tracking) but there's absolutely no reason whatsoever for a class to care who or what responds when an advertisement is clicked in the application, then a message broker can very, very cleanly implement that. In this case, you just have the class fire off a message such as:

broker.Publish(new AdvertisementClicked(module: Modules.Comedy, user: user.ID, advertisement: ad.ID)); // Only using named arguments for clarity

Now, anyone who might want to catch that event can simply be subscribed to it when the message broker is set up, and that leads to much, much cleaner code. Consider also that if the message broker were to just push the message into a queue, the message could be processed completely outside of the main application somewhere, and the publishing class would pay for little more than the cost of enqueueing it.
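For the curious, here is roughly what the smallest possible in-process broker could look like. This is a sketch under my own assumptions, not the API of any particular library: the Subscribe/Publish signatures are invented, the property types on AdvertisementClicked are guesses, and the whole thing is synchronous and not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Message from the post, reduced to plain properties (the types are assumptions).
public sealed class AdvertisementClicked
{
    public AdvertisementClicked(string module, int user, int advertisement)
    {
        Module = module;
        User = user;
        Advertisement = advertisement;
    }

    public string Module { get; }
    public int User { get; }
    public int Advertisement { get; }
}

// Hypothetical minimal broker: synchronous, in-process, not thread-safe.
public sealed class MessageBroker
{
    private readonly Dictionary<Type, List<Delegate>> _handlers =
        new Dictionary<Type, List<Delegate>>();

    public void Subscribe<TMessage>(Action<TMessage> handler)
    {
        List<Delegate> list;
        if (!_handlers.TryGetValue(typeof(TMessage), out list))
        {
            list = new List<Delegate>();
            _handlers[typeof(TMessage)] = list;
        }
        list.Add(handler);
    }

    public void Publish<TMessage>(TMessage message)
    {
        List<Delegate> list;
        if (!_handlers.TryGetValue(typeof(TMessage), out list)) return;

        // Iterate over a copy so a handler may subscribe during dispatch.
        foreach (var handler in list.ToArray())
        {
            ((Action<TMessage>)handler)(message);
        }
    }
}
```

Notice that the publisher only ever sees MessageBroker and the message type. Handlers are looked up by the exact compile-time type of the message, so this sketch does no base-class or interface matching; a real broker would likely offer that, plus unsubscription and queuing.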

The key here is that the message broker removed dependency even upon the interface itself, so now the interface that you've created can change, if required, without the using-class having to change its code to handle it. It can be argued that in this case, you may not even need the interface at all, but I'll leave it as-is for now.

Downsides 

There are several downsides to each of these methods. I'll discuss those now briefly because they are pretty self-explanatory, in my opinion.

ServiceLocator/IoC container - With a static ServiceLocator, you get the added drawback of relying on a static singleton, which isn't great for testability. At the same time, an IoC container can hinder testing as well. With the ServiceLocator pattern, you tend to run into problems when someone wants to use a class but doesn't realize it depends on some interface. Since the dependency isn't explicitly declared in the constructor of the class, it's not nearly as easy for the user to know that they have to register that dependency before using said class. With an IoC container, this is less of a problem because containers typically offer constructor injection, which lets the class declare the interface explicitly in its constructor.

Both of these methods have something of a learning curve, because you never technically know which implementation is being used unless you're in the place where that implementation is registered for the interface. You may very well *not* have access to that place; for instance, if you are working in a module, and all of the implementations are registered in a bootstrapper that is part of a totally different project, then you may not be able to see it. At the same time, both methods lend themselves to very simple programming, where you just get the reference and call functions on it - dirt simple.

With a message broker, the learning curve is much, much steeper. In addition to not knowing which particular class is processing the message you care about, there could be tons of other classes processing the same message as well, and any of them can interfere. If you don't have access to all of them, then it could be one of those other classes that has a problem with the message. For this reason, it is a standard of mine that, when a class is processing a message, I log it. That way, when it fails, it is at least logged and I can go back and see exactly where it failed, if not specifically what failed.
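One way to enforce that "always log it" rule without repeating yourself is to wrap each handler before subscribing it. The helper below is a sketch of that idea; LoggedHandler, its Wrap method and the log callback are all hypothetical names, and in real code the callback would be whatever logging framework you already use.

```csharp
using System;

// Hypothetical helper: wraps a message handler so every attempt and failure is logged.
public static class LoggedHandler
{
    public static Action<TMessage> Wrap<TMessage>(
        string subscriberName,
        Action<TMessage> inner,
        Action<string> log)
    {
        return message =>
        {
            log(subscriberName + " handling " + typeof(TMessage).Name);
            try
            {
                inner(message);
            }
            catch (Exception ex)
            {
                // Record which subscriber failed, then rethrow so the caller can decide what to do.
                log(subscriberName + " failed on " + typeof(TMessage).Name + ": " + ex.Message);
                throw;
            }
        };
    }
}
```

The log then tells you which subscriber choked on which message type, which is usually the hard part to figure out when several classes listen for the same message.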

The message broker also typically requires some knowledge of threading, as well as of weak references. When an event is published, the handlers could run on the main thread or on a background thread. This can vastly change the programming model, and is a bit harder for most developers to wrap their heads around. Similarly, the pub/sub model that is usually used is difficult for some developers to reason about. Weak references come in because a broker that holds ordinary references to its subscribers keeps them alive for as long as the broker itself lives, and some programmers seem to struggle with those as well.
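Since weak references trip people up, here is a small sketch of what a weakly-held subscription might look like. WeakSubscription and TryDeliver are names I invented for illustration, and the code assumes .NET 4.5 or later, where the generic WeakReference<T> class is available.

```csharp
using System;

// Hypothetical subscription that does not keep its subscriber alive.
public sealed class WeakSubscription<TSubscriber, TMessage>
    where TSubscriber : class
{
    private readonly WeakReference<TSubscriber> _target;
    private readonly Action<TSubscriber, TMessage> _handler;

    // The handler receives the subscriber as a parameter, so the delegate itself
    // does not need to capture (and thereby strongly reference) the subscriber.
    public WeakSubscription(TSubscriber subscriber, Action<TSubscriber, TMessage> handler)
    {
        _target = new WeakReference<TSubscriber>(subscriber);
        _handler = handler;
    }

    // Returns false once the subscriber has been collected, so a broker can drop the entry.
    public bool TryDeliver(TMessage message)
    {
        TSubscriber subscriber;
        if (!_target.TryGetTarget(out subscriber)) return false;
        _handler(subscriber, message);
        return true;
    }
}
```

The catch is the handler: if you pass a lambda that closes over the subscriber, you have a strong reference again and the weak one buys you nothing. That subtlety is a big part of why this trips people up.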

Both of these methods are more complicated than simply wiring classes together directly, but they save you time and money in the long run - it's a matter of how much extra complexity you're willing to take on up-front, in exchange for safety and clean code down the line.

Conclusion

I hope this post was useful to you in some way, and helped you to understand the very basics of how to use a message broker in an enterprise application. It's very high-level, but if you have low-level questions, feel free to post them in the comments and I'll see what I can do to answer them in future posts!