Friday, July 2, 2010

ACID vs. BASE

We all know that ACID and BASE are opposites, both in the chemical world, as well as in the IT world. They are used for different purposes and should not be confused with each other as they are distinctly different. In the IT world, ACID and BASE can be used complementary, similar to the chemical world.

In IT-land, both mechanisms are intended to make sure the end result of an operation leaves a system in a consistent state. The way they do it however, are radically different. And which one you need, depends on your needs, or better, your business's needs. Default reaction of almost everyone at business level would be "I need ACID". In my entire career I have only ever met one business principal who was fine with BASE after I had suggested it. Everyone always wanted to see 'the full monty' regarding consistence, no-one was willing to do any trade-offs to free up some system resources, even if this meant purchasing more powerful hardware resources.

The difference for a person applying the patterns, between ACID and BASE, are concentrated around the amount of effort required to implement (design-time), and the amount of effort for systems to execute the pattern, also known as scalability (runtime). Where nowadays, most middle-ware has support for ACID making life for designers a lot easier, there is no such thing for BASE. Find out why in this article.

ACID - Atomic, Consistent, Isolated, Durable

A transaction must be explicitly committed or fully rolled back (sometimes the environment or middleware automates part of this concern). In principle, as long as a transaction is not committed or rolled back, it can keep or lock huge amounts of resources, like memory, disk space etcetera). While in progress, the system resources consumed keep the overall solution from being scalable, as smaller resource consumption makes more operations fit the same machine. The bigger volume of resources a transaction comprises, the less scalable the system becomes, as less operations fit the same machine, or same group of machines.

If it is really required, the approach is a (often standardized - for example "XA" distributed transactions) method to ensure a system is consistent at all times. The concept of a distributed transaction spans across multiple (parts of) a system and has those parts participating in this transaction. Any participating (service) logic will be eventually entirely committed, or fully rolled back.
Note that committed is not always analogous to successfully executed to the fullest extent; it rather means that the system is left in a consistent state. It could be that in order to make things 'committable', or consistent, an intermediate valid system state is reached which also allows committing but does not reach the ultimate intended goal - yet. An example is where logic has a state deferral mechanism in place which is used to make the system consistent and to break up the transaction in manageable parts. This is by the way an excellent mechanism to make transactions more scalable. Because the spanned logic, hence consumed resources, are broken into smaller pieces, the individual parts become more manageable and inherently makes them more scalable. Furthermore, if the transaction coordination mechanism is a 'generic' implementation, typically it has safeguards built-in to make sure that everything is fine, like a double-commit pattern. This all costs time and resources and can be counter-productive if your system architects, designers and developers do not know what they are doing.

BASE - Basically Available, Soft state, Eventually consistent

In a BASE pattern, the need to commit does not exist. If things happen according to plan, they are done. No need to confirm when they are done, to anyone, if you don't want to, contrary to a pattern like XA-transactions. There is a downside to the BASE pattern: there is no (explicit) rollback mechanism. If does not exist, the architect/designer is responsible for creating so-called compensation algorithms. These are pieces of logic which handle specific exception scenarios. It is easy to forget one or two so it requires more than ample planning and design to make sure all crucial scenarios are covered. The pattern coordinator (the implemented consistency control mechanism of this pattern) must manage its own state and progress in the process. This means that it must track what it did, and based on when and where things went wrong, explicitly execute compensation activities for undoing the parts of the work which did go OK until that moment. There is however one big advantage: because the system generally does not need to keep track of your process's status -you manage that for the system in a tailored-to-the-need way- less resources are consumed, which would have a very positive influence on your system's performance and scalability. Also, as the architect or designer can choose to selectively or strategically apply checkpoint logic, potentially less logic is executed. This allows a service designer to focus more on the actual core service logic.This is because the architect is in full control of the "when and how" of any applied core logic as well as supporting logic. Since no standardized method to ensure consistency of the system exists, temporary inconsistency is a almost-certain consequence. It is not only a consequence; it is the actual foundation why this pattern is so much more scalable. Temporary inconsistency is allowed in a BASE pattern to make matters more manageable. The information and core service capability is "basically available". Contrary to an ACID managed consistence, some inaccuracy is temporarily allowed and data may change while being used (must be known and accepted by the consumer of the service) to reduce the amount of consumed resources (soft state). Eventually, when all service logic is executed, the system is left in a consistent state (eventually consistent). Presumably, as the core service logic executes relatively quickly, the amount of occasions where the service logic is required but only available in inconsistent state should be generally manageable.

Scalability and consistency patterns

As ACID resources are generally larger groups of resources, the easiest way of scaling up, is scaling vertically, also known as purchasing larger machines. As there is a limit to the physical size of a machine - you can only get them 'this big', this is not a very future-proof approach. Also, when increasing hardware capacity this way, quite often a certain amount of capacity is wasted to make room for the new hardware (ie. by replacing the current machine with the next bigger model, or by replacing smaller memory modules by bigger ones: what happens to the old hardware?).

Horizontal scaling, also known as deploying -concurrently- more machines to process more data, is far more extensible and typically cheaper, as you always can add capacity without wasting the capacity you already have. By having more machines, two methods of horizontal scaling can be applied: functional scaling or sharding. Functional scaling is distributing pieces of logic of the same capability to a certain group of hardware nodes, and another group of functionality to another group of hardware nodes. This is -ie in the database world- known as partitioning. Sharding is the act of deploying the same functionality/capability across multiple nodes. By nature, horizontal scaling is more complex than vertical scaling, but the benefits may outweigh the cost of the design and architecture.

BASE resources are typically a lot smaller and inherently less hardware nodes are required, or smaller nodes are required to reach the necessary system capacity.

In a SOA, it is because of the nature of the comprised distributed service capabilities and the availability of standards, or better the lack thereof, for implementing consistency patterns, extremely difficult to reach 'transactional integrity' across service boundaries. By nature, the ACID approach is difficult to implement, if not impossible with certain platforms. The BASE approach is, especially outside service boundaries more easily implementable. If we look at task services and certainly orchestrated task services, the BASE approach is the default approach and anything 'better' must be custom-built. This is why it's such a good idea to offer BASE approach more often than the ACID approach. Offering BASE is more easily if it's known what the accepted margin-of-error is by the business or project principal. This can be discovered during requirements clarification phase, a very important phase of every service delivery effort. Without a proper requirements and business process investigation (discovery, clarification, refinement), it's virtually impossible to see whether you need ACID or BASE.