Sunday, February 28, 2010

Separation of concerns: Exception handling

This week (26-02-2010 ... 05-03-2010) I'm in Egypt for business.

While using a local ATM, I got an on-screen error message along the lines of "Error xxx occurred while performing activity yyy". The descriptive text made it clear that activity yyy, the one I was performing, was actually a service on a very specific back-end; it even mentioned the back-end by name. From error xxx I could also derive that this was a session management issue within the back-end itself.

Why is this wrong?

Well, for a number of reasons, but here are the main ones:


Security


- Security-wise, this message exposed sensitive implementation details to the user, on screen. Mentioning system names, the nature of exceptions and so on to a user is a risk: any mischievous user could use this information to their own benefit. These kinds of mistakes are easy to avoid. The ATM user is simply not concerned with this information and should not be bothered with it; a simple "Service not available" or "Service temporarily unavailable" message would have sufficed. A system administrator, for example, is the one who should be concerned with it.
- Similarly, activity names expose functional context (implementation details on the functional level) to the ATM user. Same issue: why would you give out such information to a user? Shield the "activity concerns" from the user by not telling them too much; "Service temporarily not available" would suffice.
- The back-end exception was delivered almost 1:1 to the front-end. The builder had implemented the handling of specific back-end exceptions as "pass-through", meaning that any system which needs to react to these exceptions has to be concerned with the actual implementation of the back-end exceptions. Note that this last one is a 'guesstimate', as obviously I cannot look into the system, but from interpreting the error message I have strong reason to believe so.
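To make the first point concrete, here is a minimal sketch in Python of what the front-end could do instead. Every name in it (`BackendSessionError`, `withdraw`, `handle_atm_request`, the back-end name in the error text) is made up for illustration; I obviously do not know how the real ATM software is built.

```python
import logging

logger = logging.getLogger("atm")

GENERIC_MESSAGE = "Service temporarily unavailable"


class BackendSessionError(Exception):
    """Hypothetical back-end failure that carries sensitive detail."""


def withdraw(amount):
    # Stand-in for the real back-end call; it always fails in this sketch.
    raise BackendSessionError("session 4711 expired on CORE-BANK-01")


def handle_atm_request(amount):
    """Return the text that is shown on the ATM screen."""
    try:
        withdraw(amount)
        return "Please take your cash"
    except Exception:
        # The full detail, including the stack trace, goes to the log,
        # where a system administrator can read it.
        logger.exception("withdrawal failed")
        # The user only gets a neutral message.
        return GENERIC_MESSAGE
```

The user-facing text and the diagnostic detail travel through two different channels: the screen gets the neutral message, the log gets everything else.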

Separation of back-end exceptions from integration-level exceptions

What strikes me as odd is that there are still many applications which, having to integrate with back-end systems one way or another, expose back-end details to consumers or even to user interfaces. Good practice is to shield domain-specific (back-end) exceptions from the consumer of a service. This is commonly referred to as "separation of concerns". The consumer of a SOA service does not need to know about the domain-specific service, because that would cause what Thomas Erl refers to as "contract-to-implementation" coupling, one of the four negative coupling types.
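One way to do this shielding is to translate back-end exceptions into a single integration-level exception at the service boundary. The sketch below assumes a hypothetical back-end exception `LegacySessionError` and a hypothetical service function `get_balance`; neither comes from a real system.

```python
class ServiceUnavailableError(Exception):
    """Integration-level exception: the only failure type consumers see."""


class LegacySessionError(Exception):
    """Hypothetical exception owned by one specific back-end."""


def backend_get_balance(account_id):
    # Stand-in for a back-end call that fails with its own exception type.
    raise LegacySessionError("SESS-EXPIRED on host corebank01")


def get_balance(account_id):
    """Service contract: return a balance or raise ServiceUnavailableError."""
    try:
        return backend_get_balance(account_id)
    except LegacySessionError:
        # Translate to the contract-level exception with a neutral message.
        # 'from None' keeps the back-end exception out of the default
        # traceback output; a real service would log it internally first.
        raise ServiceUnavailableError("balance service unavailable") from None
```

A consumer written against `get_balance` only ever has to handle `ServiceUnavailableError`, so replacing the back-end does not change the contract.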

While we are at it, think about this one. It is not directly related to the issue at hand, but it is good practice anyhow:

Separation of technical exceptions from functional exceptions

What I encounter many times in my profession is that, in the way exceptions are implemented, people often do not distinguish between technical and functional exceptions. Technical issues (real exceptions) and functional concerns (not really exceptions) are easily intertwined; sometimes the conceptual difference between the two is ignored or not recognized, leading to complex implementations.

Technical exceptions are defined for situations which break the normal operation of the system (e.g. Service Not Available, Connection timeout, Timeout, Service down for maintenance). These typically happen when something is wrong with a system, the network, the database etc. As you can see, these are significant failures, as they prevent the software system from working properly, that is, from executing the core service logic. Typically these problems require rigorous resolution and are considered disruptive to the execution of the core service capability. If you encounter one of these, a retry mechanism may make sense, but not always.

Functional exceptions are not really exceptions. They typically happen while executing the core logic of a service, and are conditions which make sense to that core logic. Note that in these situations nothing is wrong with the system. Examples are "Customer not found", "Customer status is inactive" etc. Business rules are typically based on these kinds of exceptions. If you encounter one of these, the consumer can typically anticipate that a successive call to this service will return the same result.
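The difference in how a consumer should react can be sketched as follows. The class names `TechnicalError` and `FunctionalError` and the helper `call_with_retry` are my own illustration, not part of any particular framework: technical failures are retried, functional ones are passed straight through because a second call would give the same answer.

```python
class TechnicalError(Exception):
    """Something is wrong with the system (network, database, ...)."""


class FunctionalError(Exception):
    """A business condition such as 'customer not found'; the system is fine."""


def call_with_retry(operation, attempts=3):
    """Retry technical failures; let functional ones pass through at once.

    'attempts' is assumed to be at least 1.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except TechnicalError as exc:
            # Possibly transient, so remember it and try again.
            last_error = exc
        # FunctionalError is deliberately not caught here: repeating the
        # call would return the same result.
    raise last_error
```

Whether a retry is appropriate at all still depends on the technical failure; "service down for maintenance" is unlikely to go away within three quick attempts.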

A strategy I tend to follow is to map technical status and functional status into two separate fields: TECH_STATUS and FUNC_STATUS.

Whenever a service capability is executed, the response to any consumer always contains a technical status. As the technical status indicates whether something went wrong technically, it is important to both request-response and fire-forget patterns.
The functional status, however, may not always be available, e.g. when a fire-forget pattern is executed.

Some examples are listed below.

request-response:

  • tech status = service down for maintenance; func status = don't care/not available
  • tech status = ok; func status = ok
  • tech status = ok; func status = no data found

fire-forget:

  • tech status = connection not established; func status = not available
  • tech status = ok; func status = not available

Obviously, a fire-forget pattern will not return a functional status as it will be executed offline from the consumer logic execution. Potentially a functional status may be returned asynchronously but this would not always be necessary. Sometimes it is sufficient to know that -eventually- the service core logic gets executed.
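The two-field strategy can be sketched as a small response type. The fields follow TECH_STATUS and FUNC_STATUS from above (lower-cased as Python attributes); the status values, the function names and the in-memory `queue` and `customers` arguments are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceResponse:
    tech_status: str                   # always present
    func_status: Optional[str] = None  # None means "not available"


def request_response(customers, customer_id, service_up=True):
    """Synchronous lookup: the core logic runs before the response is built."""
    if not service_up:
        # Technical failure: the core logic never ran, so no functional status.
        return ServiceResponse(tech_status="SERVICE_DOWN")
    if customer_id not in customers:
        return ServiceResponse(tech_status="OK", func_status="NO_DATA_FOUND")
    return ServiceResponse(tech_status="OK", func_status="OK")


def fire_forget(queue, message, connected=True):
    """Hand the message over; the core logic runs later, offline."""
    if not connected:
        return ServiceResponse(tech_status="CONNECTION_NOT_ESTABLISHED")
    queue.append(message)
    # Accepted for later processing: technically fine, functionally unknown.
    return ServiceResponse(tech_status="OK")
```

A consumer checks `tech_status` first and only looks at `func_status` when the technical status is ok and a functional status is actually present.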

Hope you had fun reading; until next time...
