AOP != Interception

Recently, a number of authors and writers have been talking about AOP (Aspect-Oriented Programming), and how incredibly powerful and wonderful the whole thing is. And yet, for the vast majority of them, what they're really referring to is an old pattern called Interception. These people aren't stupid, nor are they misguided; while Interception is a powerful mechanism in its own right, it's not the same as AOP, and while they do share a number of defining characteristics, to understand AOP as Interception is like thinking OOP is data structs plus a bunch of function pointers. Just as we didn't understand objects until we got past the idea that objects are "just code and data", if we're to truly understand the potential power inherent in AOP, we need to get beyond this thinking that Interception and AOP are the same thing. They're obviously related at some points, but that's not the same thing as equality.

DefinitionsFor starters, because 90% of all arguments are based on differences in semantics, let's establish our terminology up front. Specifically,

Aspect-Oriented Programming:

"Aspect-oriented software development is a new technology for separation of concerns (SOC) in software development. The techniques of AOSD make it possible to modularize crosscutting aspects of a system." --AOSD homepage

"The central idea of AOP is that while the hierarchical modularity mechanisms of object-oriented languages are extremely useful, they are inherently unable to modularize all concerns of interest in complex systems. Instead, we believe that in the implementation of any complex system, there will be concerns that inherently crosscut the natural modularity of the rest of the implementation.

"AOP does for crosscutting concerns what OOP has done for object encapsulation and inheritance--it provides language mechanisms that explicitly capture crosscutting structure. This makes it possible to program crosscutting concerns in a modular way, and achieve the usual benefits of improved modularity: simpler code that is easier to develop and maintain, and that has greater potential for reuse. We call a well modularized crosscutting concern an aspect." --An Overview of AspectJ

Interception: "The Interceptor architectural pattern allows services to be added transparently to a framework and triggered automatically when certain events occur." --Patterns of Software Architecture, Vol 2 by Schmidt et. al, p. 109 At first blush, these two sound amazingly similar--Interceptors fire at certain execution points in order to provide additional behavior to the object methods they hijack, and formalized AOP "weaves" code into declarative points in the source code to add additional behavior above and beyond the objects they crosscut. As with so many things, however, the difference is really in the details and the layer beneath what's immediately obvious. As part of this discussion, however, we need to define a few more terms in order to truly see why these two overlap so much:

Separation of concerns: A concern is essentially an atomic thing we want to model in our software system; for example, within an accounting system, the idea of an asset is a "concern". Therefore, we want to try and capture all ideas related to assets into a single coherent "thing" we call an object in OO systems. This allows us to keep better track of the abstractions within the system, allowing for better maintenance and functionality over time. Every programming language implicitly holds this as its central goal, even if what those concerns are and how they should be captured vary wildly.

Cross-cutting concern: A crosscutting concern, as the AspectJ literature states, is a part of the system that cannot be captured as a first-class cosntruct within normal object model semantics. For example, consider the act of tracing method entry and exit. Normally, this is something that can only be done directly within each class itself, via something like if (Debugging.isEnabled())

Debugging.trace("Method foo() entered");

// . . .

if (Debugging.isEnabled())

Debugging.trace("Method foo() exiting");

As you can well imagine, this isn't behavior that can be inherited from a base class or provided via an internal reference to another class (the canonical way for OO systems to build clean separation of concerns is via inheritance or composition); each and every single method that wants to be traced is going to have to include these four lines of code. Other, more complex, crosscutting concerns common within enterprise systems include persistence (fighting the age-old object-relational impedance mismatch), synchronization behavior (how should my objects act in the face of being called from multiple threads?), and remoting (how should my objects allow for invocation from across process boundaries?). All of these are problems that aren't captured cleanly in a traditional OO environment.

Interception and AOP

The Interceptor pattern from POSA2, which I'm using as the canonical definition of interception until a better reference comes along, states that the problem is "Developing frameworks that can be extended transparently." The solution suggested starts by saying

Allow applications to extend a framework transparently by registering 'out-of-band' services with the framework via predefined interfaces, then let the framework trigger these services automatically when certain events occur. (Footnore: In this context, "events" denotes application-level events such as the delivery of requests and responses within an ORB framework. These events are often visible only within the framework implementation.) In addition, open the framework's implementation so that the out-of-band services can access and control certain aspects of the framework's behavior. In other words, Interception implicitly relies on several things:

Explicit support from the system for interception. POSA2 calls it the framework, COM+, .NET, servlets and EJB do it at container boundaries, but the underlying idea is the same--somewhere, some kind of construct (what I'll colloquially refer to as "The Plumbing") is what's providing the interception behavior, explicitly routing the call flow through the Interceptor as part of normal processing. This means that only those artifacts understood by the plumbing as first-class citizens can be so Intercepted--anything that's not within the purview of the plumbing cannot be intercepted. For example, in the servlet framework (starting from the 2.3 specification), Interception is provided via filters, allowing servlet authors the opportunity to "hook" the request and response to a servlet or JSP (or any other URL, for that matter). However, Interception can only apply to URL request/response semantics, so filters cannot, for example, intercept calls to ordinary Java classes or calls via other channels, like RMI, to other systems.

The notion of a request/response channel. Interception implicitly relies heavily on the idea of request/response semantics, since interceptors almost always fire around method entry and exit. This means that a large number of object interaction semantics are left untouched and unavailable. (If this is a bit abstract, hang on--this will make more sense later.)

Interception is almost universally a runtime construct. POSA2 makes explicit reference to this fact, citing the idea that these services won't always be desired or necessary for all cases, so the behavior from the Interceptor cannot be layered in statically--it needs to be registered at runtime. Unfortunately, this also implies a certain amount of runtime overhead, particularly if the interceptor's event model is fairly coarse- grained. For example, in the .NET context object architecture, method calls both into and out of an object inside of a context can be intercepted via the IMessageSink-and-friends architecture. However, once registered, a context interceptor must be invoked for every method call in and/or out of the context; there is no way to specify that the interceptor is only valid for "methods of this type". As a side note, thus far it would be possible to come away with the assumption that Interception is a remoting-only artifact, that it's only possible when control passes across process boundaries. This simply isn't true--for example, Java provides for a generic interception architecture via its Dynamic Proxy API, allowing Java programmers to create Interceptors around any object that implements any interface. (Dynamic proxies do this by constructing a concrete implementation of that interface that sits between the actual implementation and the caller; the caller only sees an interface reference, and has no idea that the thing on the other side of its reference isn't the original object.)

AOP, on the other hand, relies on two fundamental constructs as part of its definition: join points, and advice. Join points are used to describe to the AOP system where exactly in the object model code should be "woven" into the original source base, and advice is the actual code to weave at those join points. The richness of the AOP system depends quite strongly on its join point model; for example, AspectJ defines a rich join point model that includes, among other things, field get/set operations, method call entry and exit, as well as method execution entry and exit. The difference between method call and method execution, by the way, is the difference of where the call is made, versus where the call is actually carried out. This means that:

AOP systems need no underlying "plumbing". This statement is somewhat controversial, since obviously something needs to do the aspect weaving at some point in the execution lifecycle. However, this frequently isn't a runtime construct, but a compile-time or load-time operation; for example, AspectJ is a compiler, performing all the weaving at compile-time.

AOP systems can weave against anything. An AOP system is limited only by its join point model--for example, AspectJ can easily define an object-relational persistence aspect that weaves around JavaBean classes, such that any field-get access against a non-transient field would immediately trigger a database access to obtain the latest version of that field, and any field-set access against a non-transient field would immediately trigger a database access to set that value into the database. While the performance of such an aspect system would be horrendous (think of all the round-trips!), the fact that this could be done against any JavaBean class, which could be used in a servlet system, a Swing app, or a J2ME device is powerful. As an example of where an AOP system is capable of capturing elements that a traditional Interception system couldn't, consider the very real problem today of SQL injection attacks in web-based systems.

For those of you unfamiliar with this problem, the canonical example is that of the login screen of many webapps. If you're like me, the easiest way to authenticate somebody seems to be something like this:

Present a form with the input elements "username" and "password" to the user. Submit the form to a servlet or HttpHandler.

In that servlet or HttpHandler, construct a SQL query of the form "SELECT * FROM users WHERE username = '" + form.getParameter("username") + "' AND password = '" + form.getParameter("password") + "'" and execute it. If the query comes back with a row in the results, they're obviously OK; if not, they either don't exist in the system or they got the password wrong. Reject. The problem here is simple: what happens when somebody submits their username as "Bob' -- Ignore the rest of this line"? The SQL query turns into: SELECT *

FROM users

WHERE username = 'Bob' -- Ignore the rest of this line AND password = ''

As the DBAs in the audience will tell us, the double-dash is the standard form of the SQL single-line comment, which means that we never even check if the data passed in the "password" field of the form was correct or not. It gets worse--what if the malicious attacker passes "Bob'; DROP TABLE users" as his input? Yikes!

Solving this problem is obviously not something that we can fix simply by writing a base class or utility class that programmers can call--the necessary code to prevent this is already there, via the PreparedStatement in JDBC (similar support is there in .NET, too). The problem is that the programmer didn't use a PreparedStatement, whether out of ignorance or apathy or maliciousness.

Clearly this isn't an area that an Interceptor can solve, since the problem doesn't fit cleanly into a request/response model--the code to do SQL access won't always fit around a single method call. It could be part of a larger method, like finding search hits. (Remember, this isn't just a problem on logins; it's going to be present everywhere user input is used to construct a SQL query or statement, so any INSERT, UPDATE, or DELETE statement is just as vulnerable.) But an AOP system, by virtue of its richer joinpoint model, could capture the call out to the JDBC Connection.createStatement() method and immediately add code to throw a runtime exception--in essence, crash the system so the faux pas gets caught during unit testing or QA and fixed. (Wes Isberg, of the AspectJ team, goes into much deeper detail about using AOP in this fashion.)

As another point of consideration, thus far all discussion of these crosscutting concerns have been domain-independent: persistence, call tracing, security, and so forth. A large number of domain-specific crosscutting concerns creep up as part of any system, however; for example, the AspectJ team describes a simple example where an aspect is written for a simple graphical object system to track all "object moves", where the object is told to move to different coordinates on the screen, in order to force a repaint. This is obviously domain-dependent, and yet captures an obvious crosscutting concern, that of forcing the graphics system to refresh the screen contents after a change. Again, this is not something that can easily be captured using an Interception-based design.

ConclusionCertainly, the line can be blurred between these two concepts without too much difficulty; for example, John Lam's CLAW implementation does runtime modification of .NET CIL to perform the necessary aspect code-weaving at load-time. Is this Interception or AOP? Arguably, it's somewhere in between the two--it has the benefits of being able to express itself in compiled form, only against those methods that are being explicitly named in the joinpoints, but suffers from a severely reduced join point model (again, just method entry and exit).

Just as clearly, it's possible to build a system that is AOP in nature using Interception as the underlying implementation--EJB, servlets, ASP.NET, MTS/COM+ and the .NET context model all serve as standing testimony to that fact. The JBoss EJB implementation stands as quite possibly the most intense example of how Interception can provide AOP-motivated solutions, and definitely serves as a glowing testimony of how a well-constructed Interception framework can solve a whole host of problems.

In summation, Interception can be seen as a simplified form of AOP (just as Visual Basic was long thought of as a simplified form of OOP), in that it defines an extremely simple join point model, that of method-entry and method-exit at component boundaries, with the advice woven in at runtime via interceptor callback registration. Certainly, Interception can be seen as the first big step to take to understanding AOP; it even makes sense in scenarios that would be horribly complicated by using an AOP system. But never be fooled into thinking that the story ends there. Interception is clearly best used at component boundaries, and AOP seems best used entirely within component boundaries.