Tuesday, November 23, 2010

High Volume Transaction Processing

I came across a presentation on payment processing by Voca at InfoQ. The key design principles used by Voca may be of interest to anyone who deals with the problems of high volume transaction processing. These principles are:

1. Minimize movement of data: Moving data across physical and logical layers can generate heavy network traffic and also necessitates complex transaction and exception management across multiple layers. The idea is that whenever a set of transactions needs to be validated or transformed, do it within the database instead of moving the data to other layers and then bringing it back to the database.

2. Task parallelization: This is an area that many of us might not have considered. Using the Work Manager / worker architectural pattern, tasks can be executed by multiple nodes, which could be separate physical nodes, with the ability to add more nodes on demand.

3. Physically partition data: When the all-in-one database hits its scalability limits, the option is to partition the database. It would be ideal to envision the possible physical partitioning of the data and implement it right from the beginning.

4. Optimized reads and writes of volatile data: This is one principle that most of us already adhere to by having the necessary indexes, managing fetch sizes, etc.

5. Minimize contention: Contention should be avoided wherever possible, as it makes the workers wait for resources or data to be released and directly impacts performance. One option is not to wait for the release and instead look for an alternate source of the data / resource. Of course, this requires a well-thought-out design of multiple synchronized instances providing the data / resource.

6. Asynchronous decoupling: Using middleware such as a message queue can certainly help in this area, thereby improving response times for the consuming applications.

7. Keep complex business logic outside the database: Considering the limited scalability options of the database, it would be ideal to minimize the database's workload by shifting it to other tiers where possible.

8. Caching frequently accessed data: There is no point in traversing multiple physical and logical layers to fetch the same data multiple times. Caching such data in the appropriate layers will leave the network and other resources free for other useful tasks.
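
To make principles 2 and 6 a little more concrete, here is a minimal sketch in Python of a work manager handing transactions to a pool of workers through a queue. It is not taken from the Voca presentation; the names validate, worker and run_work_manager, the transaction fields and the in-process queue are all assumptions made for illustration. In a real deployment the workers would more likely be separate nodes fed by a message queue.

```python
import queue
import threading


def validate(txn):
    """Hypothetical per-transaction task: accept only positive amounts."""
    return txn["amount"] > 0


def worker(tasks, results):
    """Pull transactions off the shared queue until a None sentinel arrives."""
    while True:
        txn = tasks.get()
        if txn is None:
            break
        results.put((txn["id"], validate(txn)))


def run_work_manager(transactions, worker_count=4):
    """Start the workers, queue the transactions, and collect the outcomes."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for txn in transactions:
        tasks.put(txn)
    for _ in threads:
        tasks.put(None)  # one stop signal per worker
    for t in threads:
        t.join()

    # All workers have finished, so the results queue is no longer changing.
    outcome = {}
    while not results.empty():
        txn_id, ok = results.get()
        outcome[txn_id] = ok
    return outcome


if __name__ == "__main__":
    sample = [{"id": 1, "amount": 120.0}, {"id": 2, "amount": -5.0}]
    print(run_work_manager(sample, worker_count=2))
```

The producer only puts work on the queue and never calls a worker directly, so adding capacity is a matter of raising worker_count (or, in a distributed setup, adding nodes that read from the same queue).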

The referenced presentation is available on the InfoQ website at http://www.infoq.com/presentations/qcon-voca-architecture-spring

Wednesday, July 21, 2010

Scalability

When we talk about scalability for a web application, it is quite common for us to think of load balanced web farms. This obviously requires the application session state to be managed using one of the OutProc options. Depending on how heavily the session state data is used and on the implementation approach, the state server (be it the state service or SQL Server) may soon hit its scalability / performance limits and become a bottleneck.

This prompted me to think about possible scalability and availability solutions for the State Server. It is again possible to have a load balanced farm of servers to serve the session state requests. But this requires the state information to be replicated across all such servers, so that it is always available on every load balanced state server. Obviously, this is additional overhead. There are third party tools (like scalenet) that replicate the session and cache information across multiple servers.

As I was looking for an alternative solution, I came across a Best Practices article on state management published by Microsoft. The article suggests a simple approach: partition the sessions using the PartitionResolver interface, and determine the server for each session from a key derived from the session id.
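
The idea behind such partitioning can be sketched in a few lines of Python. This is not the ASP.NET interface itself, only an illustration of deriving a server from the session id; the server addresses, the STATE_SERVERS list and the resolve_partition function are hypothetical names chosen for this example.

```python
import hashlib

# Hypothetical state server addresses, for illustration only.
STATE_SERVERS = [
    "state-server-a:42424",
    "state-server-b:42424",
    "state-server-c:42424",
]


def resolve_partition(session_id, servers=STATE_SERVERS):
    """Map a session id to one state server, always the same one for that id."""
    # Use a stable hash so that every web server in the farm computes the
    # same answer (Python's built-in hash() of a string varies per process).
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for sid in ("abc123", "def456", "ghi789"):
        print(sid, "->", resolve_partition(sid))
```

Because each session lives on exactly one server, nothing has to be replicated. One trade-off of this simple modulo scheme is that changing the number of servers remaps most session ids to a different server, so existing sessions would be lost unless that is handled separately.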

If any of you have implemented this or any other approach, do post about it here.

Tuesday, March 2, 2010

Framework Vs Library

I was trying to understand the difference between a Framework and a Library and came across an article by Martin Fowler on Inversion of Control. What I understand from that article is that Inversion of Control is what makes a Framework different from a Library. I have in the past been involved in the development of reusable components and wondered whether they could be called Frameworks. The answer seems to be no, as in those cases I had not inverted control over to the component. So, those packages were just libraries and do not qualify to be called Frameworks. What is the significant differentiator then? Based on my reading of various articles, I think the following are the key differentiators:

1. You call a Library method, whereas a Framework calls your code.
2. A Library provides an API for your use, whereas a Framework provides an API to which your application should conform.
3. A Library is something contained within the application; i.e., if a library is not suitable, we can always substitute a different library. A Framework, on the other hand, is a container for the application; if the framework ends up not meeting the need, one may need to throw away the entire application.
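
The first differentiator is easy to see in code. The following Python sketch is only an illustration: math stands in for a library, and MiniFramework is a hypothetical toy framework written for this example, not a real package.

```python
import math


# Library style: our code is in charge and calls the library when it wants to.
def circle_area(radius):
    return math.pi * radius ** 2


# Framework style: a tiny hypothetical framework that owns the control flow.
class MiniFramework:
    def __init__(self):
        self._handlers = {}

    def register(self, event_name, handler):
        """The application hands its code over to the framework."""
        self._handlers[event_name] = handler

    def run(self, events):
        """The framework decides when, and whether, to call that code."""
        results = []
        for name, payload in events:
            handler = self._handlers.get(name)
            if handler is not None:
                results.append(handler(payload))
        return results


if __name__ == "__main__":
    print(circle_area(2.0))  # we call the library

    app = MiniFramework()
    app.register("area", circle_area)  # we conform to the framework's API
    print(app.run([("area", 1.0), ("unknown", 3.0)]))  # the framework calls us
```

In the first call the application decides when math is used. In the second, the application only registers a function and the loop inside run decides when it is invoked, which is the inversion of control described above.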

I also enjoyed reading the discussion on Why I hate Frameworks.