Saturday, June 1, 2013

Software Quality Attributes: Trade-off Analysis

We all know that software quality is not just about meeting the functional requirements, but also about the extent to which the software meets a combination of quality attributes. Building quality software requires much attention to be paid to identifying and prioritizing the quality attributes, and to designing and building the software to adhere to those. Again, going by the saying "you cannot manage what you cannot measure", it is also important to design the software with the ability to collect metrics around these quality attributes, so that the degree to which the end product satisfies a specific quality attribute can be measured and monitored.

It has always remained a challenge for software architects and designers to come up with the right mix of quality attributes with appropriate priorities. This is further complicated because these attributes are highly interlinked: a higher priority on one can result in an adverse impact on another. Here is a sample matrix showing the inter-dependencies of some of the software quality attributes.




  • Availability: +Reliability, +Robustness
  • Efficiency: -Flexibility, -Interoperability, -Maintainability, -Portability, -Reliability, -Robustness, -Testability
  • Flexibility: -Efficiency, -Integrity, +Maintainability, +Portability, +Reliability, +Robustness
  • Integrity: -Efficiency, -Interoperability, -Reusability, -Testability
  • Interoperability: -Efficiency, +Flexibility, -Integrity, +Portability
  • Maintainability: +Availability, -Efficiency, +Flexibility, +Reliability, +Testability
  • Portability: -Efficiency, +Flexibility, +Interoperability, -Maintainability, +Reusability, +Testability
  • Reliability: +Availability, -Efficiency, +Flexibility, +Maintainability, +Robustness, +Testability
  • Reusability: -Efficiency, +Flexibility, -Integrity, -Reliability, +Testability
  • Robustness: +Availability, -Efficiency, +Reliability
  • Testability: +Availability, -Efficiency, +Flexibility, +Maintainability, +Reliability

While the '+' sign indicates a positive impact, the '-' sign indicates a negative impact. This is only a likely indication of the dependencies, and in reality it could be different. The important takeaway, however, is that there is a need for planning and prioritizing the quality attributes for every piece of software being designed or built, and the prioritization has to be accomplished keeping in mind the inter-dependencies amongst the quality attributes. This would mean that some trade-offs have to be made, and business and IT should be in agreement with these trade-off decisions.
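To make such trade-off discussions concrete, the matrix can be captured as data and queried for conflicts between prioritized attributes. The following is a minimal sketch; the attribute subset and signs shown are illustrative, not the full matrix:

```python
# A hypothetical subset of the inter-dependency matrix, captured as data.
# impact[a][b] == "+" means prioritizing a tends to help b; "-" means it hurts b.
impact = {
    "efficiency":      {"flexibility": "-", "maintainability": "-", "portability": "-"},
    "maintainability": {"availability": "+", "efficiency": "-", "testability": "+"},
    "reliability":     {"availability": "+", "efficiency": "-", "robustness": "+"},
}

def conflicts(prioritized):
    """Return pairs (a, b) where both attributes are prioritized but a hurts b."""
    return [(a, b)
            for a in prioritized
            for b, sign in impact.get(a, {}).items()
            if sign == "-" and b in prioritized]

# Prioritizing efficiency and maintainability together flags a mutual conflict:
print(conflicts(["efficiency", "maintainability"]))
# [('efficiency', 'maintainability'), ('maintainability', 'efficiency')]
```

Each flagged pair is a candidate for an explicit trade-off decision that business and IT sign off on.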

SEI's Architecture Tradeoff Analysis Method (ATAM) provides a structured method to evaluate the trade-off points. The ATAM not only reveals how well an architecture satisfies particular quality goals (such as performance or modifiability), but also provides insight into how those quality attributes interact with each other, that is, how they trade off against each other. Such design decisions are critical; they have the most far-reaching consequences and are the most difficult to change after a system has been implemented.

A prerequisite of an evaluation is to have a statement of quality attribute requirements and a specification of the architecture with a clear articulation of the architectural design decisions. However, it is not uncommon for quality attribute requirement specifications and architecture renderings to be vague and ambiguous. Therefore, two of the major goals of ATAM are to

  • elicit and refine a precise statement of the architecture’s driving quality attribute requirements 
  • elicit and refine a precise statement of the architectural design decisions

Sensitivity points use the language of the attribute characterizations. So, when performing an ATAM, the attribute characterizations are used as a vehicle for suggesting questions and analyses that guide the evaluators towards potential sensitivity points. For example, the priority of a specific quality attribute might be a sensitivity point if it is a key property for achieving an important latency goal (a response) of the system. It is not uncommon for an architect to answer an elicitation question by saying: "we haven't made that decision yet". However, it is important to flag key decisions that have been made as well as key decisions that have not yet been made.

All sensitivity points and tradeoff points are candidate risks. By the end of the ATAM, all sensitivity points and tradeoff points should be categorized as either a risk or a non-risk. The risks/non-risks, sensitivity points, and tradeoffs are gathered together in three separate lists. 
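As a rough illustration of that bookkeeping, the sketch below models decisions and derives the sensitivity and tradeoff lists; the decision names and attribute effects are invented for illustration and are not part of the ATAM itself:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    name: str
    effects: dict  # quality attribute -> "+" or "-"

def sensitivity_and_tradeoff_points(decisions):
    """A decision affecting any attribute is a sensitivity point; one that
    helps some attributes while hurting others is also a tradeoff point."""
    sensitivity = [d.name for d in decisions if d.effects]
    tradeoffs = [d.name for d in decisions
                 if "+" in d.effects.values() and "-" in d.effects.values()]
    return sensitivity, tradeoffs

decisions = [
    Decision("encrypt all messages", {"security": "+", "latency": "-"}),
    Decision("add a read-through cache", {"latency": "+"}),
]
sens, trade = sensitivity_and_tradeoff_points(decisions)
print(sens)   # ['encrypt all messages', 'add a read-through cache']
print(trade)  # ['encrypt all messages']
```

Everything on both lists would then be reviewed and categorized as either a risk or a non-risk by the end of the evaluation.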

Saturday, May 18, 2013

Why Software Product Delivery is not Identical to a Car Delivery?

I happened to sit in one of the project review meetings where the client raised a question on software delivery, expecting it to be usable in production on the day of delivery by the vendor. He went ahead and compared it with the use case of a customer driving off in a car upon taking delivery. I know all the project managers out there will jump in to say don't compare apples with oranges. On the one hand, yes, software development is unique and cannot be compared with the production of a tangible product. On the other hand, attempts are being made by various standards organizations to help the industry achieve high-maturity process capability and thus deliver production-quality software consistently.

Software product vendors selling or licensing software products can, however, be compared to car manufacturers. Take, for example, the Microsoft Office product suite: one can just go and buy it off the shelf and use it, just like driving off in a car. This is possible because product vendors, over a period of time, acquire an enormous amount of knowledge of the targeted product domain and subject the product to many cycles of testing before announcing its readiness.

A typical car takes at least a few years from concept to production, usually in the order of 3 to 6 years. During this period, the product undergoes various cycles of tests, which include crash tests, drivability on the road conditions of the target market, etc. Similarly, software product vendors conceptualize the product idea and then work on it over a period of time. Vendors attempt to achieve a faster time to market by adopting newer product development methodologies. The one that works best is to identify the smallest piece of the software that can meet certain specific use cases and then build on top of it over a period of time.

In the case of a software product, it is the vendor's responsibility to subject the product to various tests in production-like environments before getting it out to customers. These include alpha and beta tests, wherein interested end users are engaged to use it in production environments and provide feedback on various aspects, with the developers working to have all the critical issues addressed. Achieving a zero-defect state may not be a possibility, and vendors take a balanced approach as to whether to wait until all the identified issues are addressed or to reach out to the market with certain known issues, which can be addressed in subsequent product releases.

But it is a different story when it comes to bespoke software projects. The major challenges of bespoke software projects that differentiate them from software products include:


  1. It is the client who specifies the requirements. Though the vendor might facilitate documenting the requirements, it is the customer who formally agrees as to what needs to be produced.
  2. Lack of understanding of and agreement on the non-functional requirements, which are not documented in most cases.
  3. The vendor might not be an expert in the target domain. Even when the vendor has expertise in the domain, it is the customer's need that is final, not the vendor's understanding of the requirement.
  4. The ever-changing requirements. By the time the vendor delivers the software, the requirements as agreed by the client would have undergone change due to various reasons.
  5. It is difficult to unambiguously understand the requirements as the project progresses through various phases involving humans with varying abilities.
  6. There are dependencies on various pre-existing or emerging software and hardware in the production environments of the client.
  7. The client has roles and responsibilities to assume to make the project successful. However, it is for the vendor to ensure that the client understands their role and responsibility throughout the project life cycle.


Another point to consider in this discussion is what constitutes delivery. A well-written SoW (Statement of Work) clearly lists the acceptance criteria, which when met would constitute acceptance of the delivery by the client. For a given set of requirements, no two vendors would build an identical solution. That's due to the tools and technologies out there for use and the varying intellectual abilities of those involved in building the software. What the client wants matters more than how the vendor delivers the solution. For this reason, the client shall assume the responsibility of performing user acceptance tests. Sometimes, the delivery of software might mean implementation in the production environment, which might involve data migration and appropriately configuring various pre-existing software and / or hardware.

In the end, it is all about managing the expectations of the client. Expectations are not just set at the start; they need to be appropriately managed throughout the project life cycle.

Sunday, May 12, 2013

Application Integration - Business & IT Drivers

Design Patterns are finding increased use for their apparent benefits: when there is a solution for a common, recognized problem, why reinvent the solution, instead of using the design that is expected to address the problem? However, it is not one size fits all, as there is a combination of factors that may influence the choice of design and architecture. Enterprise Application Integration is a complex undertaking, and it requires a thorough understanding of the applications being integrated and the various business and IT drivers that necessitate the integration.

There are numerous approaches, methods and tools to design and implement application integration, and it is always a challenge to choose the right combination of design, tools and methods to accomplish the business need. Selecting the right design requires good knowledge of the problem at hand and the other related attributes that drive the solution. In this blog, let us try to identify the various business and IT drivers that architects need to understand well before making a choice of design, method, tool or approach.

It is also worth understanding that these drivers should be considered collectively. While a particular tool or design might seem to solve the specific need, it might require certain other needs to be compromised. Thus it is a mix of art and science that architects need to apply.

Business Drivers:


  • Business Process Efficiency - The level of efficiency that the business wants to achieve through the application integration is a basic need that should be considered. Being a basic need, there is a tendency amongst architects to neglect to have this documented, thus leaving a chance of failure in this area. Knowing this will also be useful in validating the design through the implementation phases.
  • Latency of Business Events - Certain business events are time sensitive and need to be propagated to the target systems within a time interval. Achieving low-latency integration may require tooling the components at a much lower layer of the computing hardware.
  • Information / Data Flow Direction - Whether the data or information should flow one way or as a request / response combination has an influence on the choice of design.
  • Routing / Distribution - The information or data flow might be just a broadcast, or in some cases there has to be a response for each request. Similarly, the target applications may be one or many, and the set of targets could be dynamic based on certain data combinations. These routing and brokering rules have to be identified to orchestrate the data flow appropriately.
  • Level of Automation - End-to-end automation might require changes to the source and / or target applications. Alternatively, it might be the best choice to leave this semi-automatic, so that the existing systems and related processes may remain unchanged. An example of this could be the participation of a legacy system which is likely to be sunset in the near term, in which case changes to the legacy system are not preferred.
  • Inter-Process Dependencies - Dependencies between processes determine whether to use serial or parallel processing. It is important to identify and understand this need, so that processes which can run in parallel are identified and designed accordingly, so as to achieve efficiency.
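To illustrate the routing / distribution driver above, here is a minimal content-based routing sketch; the event fields, rules and target system names are purely illustrative:

```python
# A minimal content-based router: each rule pairs a predicate with target systems.
def route(event, rules):
    """Return the list of target systems an event should be delivered to."""
    targets = []
    for predicate, systems in rules:
        if predicate(event):
            targets.extend(systems)
    return targets

rules = [
    (lambda e: e["type"] == "order",  ["billing", "warehouse"]),
    (lambda e: e["amount"] > 10_000,  ["fraud-check"]),
]

print(route({"type": "order", "amount": 25_000}, rules))
# ['billing', 'warehouse', 'fraud-check']
```

A real broker would externalize these rules as configuration, but the same idea of evaluating event content against routing rules underlies most brokered integration designs.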


IT Drivers:


  • Budget / TCO - Any enterprise initiative will be budget driven, and the Return on Investment must be established to get the consent of the project governance body (or Steering Committee). The choice of tools, design and approach should be made considering the allocated budget and aiming to achieve a lower TCO (Total Cost of Ownership).
  • Technology and Skills - It is also important that the design and architecture consider the technology in use in the organization and the availability of skilled resources to build and maintain the integration implementation. Application integration solutions require continuous maintenance, as either the source or target systems, or even the business processes, are likely to undergo changes.
  • Legacy Investment - All enterprises have investments in legacy systems, as technology obsolescence happens faster than planned. It would be prudent for enterprises to explore opportunities to get returns out of such legacy investments where possible. Architects shall consider this aspect while designing integration solutions and thus facilitate a planned sunset of legacy systems over a period of time.
  • Application Layers - Integration can be achieved at various layers, viz. the data layer, application layer, UI layer, etc. The choice of integration layer depends on various business drivers, and an appropriate choice needs to be made to achieve the desired efficiency, latency and other needs.
  • Application Complexity - An enterprise is likely to have a portfolio of applications within and outside to integrate with. The complexity of the applications would have a direct influence on the design and architecture of the integration solution as well. A thorough study and evaluation of the applications is a must to come up with a good integration solution.

As you would know, the above is not an exhaustive list, and there are various other business and IT drivers that need consideration. In addition, quality attributes like security, availability, performance, maintainability, extensibility, etc. need consideration in choosing the right design and architecture for the application integration solution. I will be writing another blog on Quality of Service (QoS) considerations for Application Integration solutions.

Saturday, April 13, 2013

A Framework for Cloud Strategy and Planning

Cloud technologies are here to stay, but not without concerns. The concerns are different for different organizations, and there is no 'one size fits all' solution to address them. It is for the enterprise to come up with a cloud strategy and carefully weigh the pros and cons of cloud adoption with due consideration to the competing constraints. When not done well, this could have a telling impact on the business.

A model-based or framework-based approach to coming up with the cloud strategy could help maximize value creation out of cloud adoption. With their expertise in standards and frameworks and in the internal business capabilities, Enterprise Architects are in the best position to take up this mantle. Though there are various standards and frameworks that can be used to strategize and plan cloud adoption, Mike Walker in his article suggests a simple three-step framework for Cloud Strategy and Planning, which he adapts from TOGAF and other frameworks. The following diagram visualizes the mapping of the TOGAF methods to the Cloud Strategy Planning Framework.





The following are the seven activities that form part of the three-step Cloud Strategy Planning Framework:

Strategy Rationalization:

1. Establish Scope and Approach - This activity is all about educating the stakeholders about cloud computing, envisioning, and then coming up with the business model for cloud computing.

2. Set Strategic Vision - This is about coming up with a strategic vision for cloud computing in the enterprise; for this purpose, due consideration shall be given to the IT and business strategic objectives of the organization, cloud computing patterns and technologies, and feasibility and readiness.

3. Identify & Prioritize Capabilities - This is an important activity as it lays down the criteria for evaluating the capabilities in the form of IT and business value drivers. The outcome of this activity would be the key capabilities that are subject to further analysis.
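In practice, a capability prioritization like this often boils down to a weighted-scoring exercise against the agreed value drivers. The sketch below is illustrative only; the drivers, weights and capability profiles are invented, not taken from the framework:

```python
# Hypothetical value drivers and their agreed weights (must sum to 1.0 here).
weights = {"business_value": 0.5, "cloud_fit": 0.3, "readiness": 0.2}

# Each capability is scored 1-5 against every driver.
capabilities = {
    "email":        {"business_value": 2, "cloud_fit": 5, "readiness": 5},
    "core_banking": {"business_value": 5, "cloud_fit": 2, "readiness": 1},
}

def score(profile):
    """Weighted sum of a capability's driver scores."""
    return sum(weights[k] * v for k, v in profile.items())

ranked = sorted(capabilities, key=lambda c: score(capabilities[c]), reverse=True)
print(ranked)  # ['email', 'core_banking']
```

The top-ranked capabilities are the ones carried forward into the profiling activity that follows.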


Cloud Valuation:

4. Profile Capabilities - This again is a critical activity, in which the current maturity levels of the capabilities are determined, and then risks and other constraints like architectural fit, readiness, feasibility, etc. are assessed.

5. Recommend Deployment Patterns - Based on the capability profiling, come up with recommended cloud services and deployment patterns based on architectural fit, value and risk. Of course, this will call for due consideration of proven practices and industry trends.


Business Transformation Planning:

6. Define and Prioritize Opportunities - This activity is more like coming up with a business case for a cloud transformation opportunity, and yes, it should include the perceived benefits, expected risks, technology impacts, and a detailed architecture.

7. Define Business Transformation Roadmap - This activity is about performing an assessment of implementation risks and dependencies and developing a transformation roadmap, which should be validated with the business users.

In line with the above framework, Mike Walker suggests a top down model as below:




You may wish to check out the original article here.

Sunday, April 7, 2013

NGINX - A High Performance HTTP Server

What you will find about NGINX from its website is that it is a free, open source, lightweight HTTP web server and reverse proxy. NGINX doesn't rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load. We have found that it holds up to what it says. Here is how we ended up using NGINX for one of our clients and found it meeting our expectations.


We were on a performance testing assignment for a game application, where much of the game content consisted of static Flash files (of course, they were dynamic in the sense that they had embedded ActionScript for execution on the front end) and image files, quite a few of which were above 500KB in size. Needless to say, this was the compressed size, and as such, enabling compression did not offer any performance gains. These files were served by Apache, which also served dynamic PHP requests.


Our initial performance tests indicated that at about 1000 concurrent hits, the Apache server's memory consumption shot up considerably, beyond 2 GB, and quite a few requests were timing out. Examination of the server configuration revealed that the server was configured to serve 256 child processes and that it was using the 'prefork' MPM module, which limits each child process to a single thread. This technically limits the number of concurrent requests the Apache server can serve to 256.


We also figured out that Apache needs to use the 'prefork' module in order to safely serve PHP requests. Though later versions of PHP are reported to work with the other MPM module ('worker', which can use multiple threads per child process), many still have concerns around thread safety. The 256-process limit was also a hard limit, and if it needed to be increased, the Apache server had to be rebuilt.


With this diagnosis, the choice we had was to have the static content served by a server other than Apache and leave Apache to serve only the PHP requests. We thought of trying out NGINX and, without wasting much time, went ahead and implemented it. NGINX was configured to listen on port 80 and serve the contents from the root folder of Apache. The Apache server was configured to listen on a non-standard port, and NGINX was also configured to act as a reverse proxy for all PHP requests by getting them processed by Apache.
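A configuration along these lines would look roughly as follows; the document root path and the port number are assumptions for illustration, not the exact configuration we used:

```nginx
server {
    listen 80;
    root /var/www/html;              # the same document root Apache was using

    # Static files (Flash, images) are served directly by NGINX
    location / {
        try_files $uri $uri/ =404;
    }

    # PHP requests are proxied to Apache, now listening on a non-standard port
    location ~ \.php$ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```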


We performed the tests again and found a tremendous improvement. The average latency at the 1000-concurrent-hits stress level came down to under 5 seconds from about 80 seconds. We could also observe that NGINX was consuming just around 10MB of memory. NGINX, by itself, does not have any limitation on the number of concurrent connections; it is constrained only by the limits of the server itself and the other daemons running on it.

Sunday, March 31, 2013

Software Complexity - an IT Risk perspective

IT risk management predominantly focuses on IT security, though there are other categories of IT risk which are less known. Software complexity is one such important but less-discussed IT risk. Basically, software complexity is one of many non-functional quality attributes, which in most cases is ignored completely, leading to the associated risks materializing and thus to financial and non-financial impacts on the business entity. In this blog, let us try to understand software complexity from an IT risk perspective, and shed some light on whose baby it is and how to manage or control it.


What is Software Complexity?


Let us first understand what software complexity is. The simple definition of complexity is how hard a piece of software is to understand, use and / or verify. Complexity is relative to the observer, meaning that what appears complex to one person might be well understood by another. For example, a software engineer familiar with control systems theory would see good structure and organization in a well-designed attitude control system, but another engineer, unfamiliar with the theory, would have much more difficulty understanding the design.


Basili defines complexity as a measure of the resources expended by a system while interacting with a piece of software to perform a given task. In terms of end users, this could be understood as usability, i.e. how much time the user needs to spend to accomplish a task using the software and how much training is required to get familiar with it. Similarly, in terms of developers, this could be understood as maintainability, i.e. how easy it is for new developers to understand the software and work on it to further improve it or to address issues. Again, in terms of the IT infrastructure team, this could be understood as the level of effort needed to deploy and support it, which again should be considered under maintainability.


Why is Complexity an Evil?


Complexity, as the above definition goes, requires more resources to be expended than normal and is thus counterproductive. We have numerous best practices, standards and frameworks that advocate eliminating complexity, but somehow it creeps in and poses a challenge in most cases. The consequence of software complexity, as we have put it above, is clearly risk, and it could even be a business risk when we look at it from the end user's perspective. The following is a sample list of risks that could arise due to software complexity:


  • Complex software is difficult to verify and validate, and this could lead to new defects surfacing even after months of production use. Depending on the severity of the defect, the impact could be low to very high.
  • Software complexity could render business processes inefficient, as the end users may have to spend more time using the software to execute certain business processes, which could result in losing competitiveness and thereby losing market share.
  • Failure by the development teams to consider the downstream complexity could result in a state of 'project successful, but mission failed', i.e. even if the project teams have successfully completed the project, the resulting software product might be complex enough to render it useless, thereby turning the entire investment in the project into a loss.
  • Post-production support will always be a challenge, as it is difficult to retain trained and capable resources to maintain complex software; this is a most common risk that enterprises continue to battle with.
  • Complexity in deploying and distributing the software is another area of concern for the infrastructure team. Sometimes, complex software might demand so many computing resources that the costs of the hardware and other infrastructure might outweigh the perceived benefits of using the software.


The above list is only illustrative; it could be even bigger, and I would let you come up with more such consequences in the form of feedback.


Measuring Software Complexity


To be managed or controlled, complexity should first be measurable. While a detailed discussion of software complexity measures is out of the scope of this blog (I might write another blog on measuring software complexity), at a high level, the following measures can be used.



Complexity Metric - Primary Measure of …

  • Cyclomatic complexity (McCabe) - Soundness and confidence; measures the number of linearly-independent paths through a program module; a strong indicator of testing effort
  • Halstead complexity measures - Algorithmic complexity, measured by counting operators and operands; a measure of maintainability
  • Henry and Kafura metrics - Coupling between modules (parameters, global variables, calls)
  • Bowles metrics - Module and system complexity; coupling via parameters and global variables
  • Troy and Zweben metrics - Modularity or coupling; complexity of structure (maximum depth of structure chart); calls-to and called-by
  • Ligier metrics - Modularity of the structure chart

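As a taste of the first metric, McCabe's cyclomatic complexity can be approximated by counting decision points in the code. The sketch below does this for Python source; it is a simplification of the full control-flow-graph definition, shown only to make the metric concrete:

```python
import ast

def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + the number of decision points."""
    tree = ast.parse(source)
    decision_nodes = (ast.If, ast.For, ast.While, ast.IfExp,
                      ast.And, ast.Or, ast.ExceptHandler)
    return 1 + sum(isinstance(node, decision_nodes) for node in ast.walk(tree))

code = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""
print(cyclomatic_complexity(code))  # 3: two branch points plus one
```

Commercial and open source static analysis tools compute this (and the other metrics above) far more rigorously; the point is that complexity can be turned into a number that can be tracked and gated.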


Controlling Software Complexity

As we saw in the definition section, complexity is context sensitive, i.e. the same software may be viewed at different complexity levels by different classes of users. While this is a challenge for developers, they find it a useful shelter to stay away from addressing it. Developers primarily depend on the business system analysts, who in most cases play the role of representing the end users, and when the analysts fail to see the complexity that the end users would, the development teams are misled. In one of the projects we have been working on, the business system analyst, who was also tasked to triage the defects logged by end users, simply turned down such defects, suggesting the users learn to use the system. This definitely goes against controlling the complexity and thereby managing the IT risk.

Controlling software complexity calls for adopting certain practices in the areas of architecture, design, build, testing and project management, some of which are listed below:

  • Set the expectations: Overly stringent requirements and simplistic hardware interfaces can complicate software, and a lack of consideration for testability can complicate verification efforts. Cultivating awareness about the downstream complexity and how detrimental it could be, if ignored, will help in getting the much-needed team collaboration and commitment.
  • Requirement Reviews: Ambiguous software requirements have been a cause of complexity, as many times the developers tend to assume things. Similarly, certain requirements may be unnecessary or overly stringent, which can be spotted in the review process, leading to a reduction in complexity.
  • Domain Skills: Having the relevant product or domain skills will go a long way in reducing downstream complexity, as the development team will be familiar with how the industry handles such requirements.
  • Architecture Reviews: A good architectural design will go a long way in reducing the downstream complexity, and design and architecture reviews early on help address this need. It may even make sense to have an Architecture Review Board comprising representatives who can use their expertise in specific areas to spot problem areas.
  • COTS: Third-party libraries, usually termed COTS (Commercial Off-The-Shelf) components, often come with unneeded features that add to the complexity of the software. It is important to make a careful and thoughtful decision in the case of COTS.
  • Software Testability: Design the Software with testability in mind, so that the complexity can be verified by the QA teams.

Conclusion:


Much of the planning for mitigating the software complexity risk revolves around the software acquisition or software development process. This calls for close coordination and collaboration between the IT risk managers and the development teams. Though risk management is well understood as an important function of project managers, how well the risk of complexity is managed is still questionable. Project managers are primarily concerned with those risks that might impact project success and fail to consider those, like complexity, that emerge downstream, after the project. It has also been observed that reduction of complexity is an important characteristic of disruptive innovation. Innovations emerge as disruptive when they address the complexity in existing products or systems and are thus able to capture market segments one after the other on their course to the top of the cliff.

Saturday, March 23, 2013

Surviving Disruptive Innovations


I recently made a presentation on Disruptive Technologies at the Chennai Chapter of ISACA. I chose the topic in the context of presenting a picture of the pace at which disruption is happening in the IT world and what upcoming disruptions to watch out for. As I was preparing the agenda and content for the presentation, I was curious to find out how successful enterprises are managing, or rather surviving, disruptions, and in the process stumbled upon some of the research work done by Clayton Christensen.


It was interesting to observe a few things from his theory, which are the following:


Good management principles would not be of great help in managing or surviving disruptive innovations. Christensen cites the example of how Toyota came up and disrupted General Motors. He sees a pattern in how disruptions happen in the form of an S-curve, where the top of the curve is a cliff. Leaders and leadership teams follow the best management principles to climb up the S-curve, and when they reach the top, they just fall off the cliff.


An extendable core is the key enabler of innovations becoming disruptive. A potentially disruptive innovation may appear insignificant in terms of the competitive capabilities of the incumbent's existing products, tempting the incumbent to ignore it. But having an extendable core, the new entrant quietly enhances its capabilities, slowly gets into the mainstream market of the incumbent, and then disrupts the whole market, driving the big and well-managed incumbents out of it.


Disruption emerges from where it is least expected. For example, we now find it very comfortable to use a smartphone for various jobs which were otherwise performed by special-purpose devices. Examples include GPS devices, digital cameras and even PCs. GPS device manufacturers may still believe that the GPS feature of a smartphone is not a threat to them, as special-purpose GPS devices have certain unique advantages which smartphones don't. But be reminded that smartphones have the extendable core and can easily address this capability gap; soon GPS devices will be a thing of the past, and we are already seeing the signs of it.


While there are many other interesting observations to note, I would leave it to you to find those out. I was then curious to look into cases of disruption that happened in the past. The following three cases were of interest to me:


Kodak: Kodak ruled the photography market for a whole century. Its management had all the best qualities and was praised in all respects. Kodak has many innovations, and many firsts, to its credit. Yet with such a track record, it recently went into bankruptcy and sold its patent portfolio, which included close to 1000 patents, to salvage some value. It is natural for us to think that the emergence of digital cameras disrupted Kodak in a big way. But as many would know, Kodak knew that the digital era was emerging, and it was the first to introduce a digital camera, back in the 1970s. So what went wrong, and how did it fail to sustain that innovation and stay alive in the market? Kodak kept believing until the early 2000s that photographic film wouldn’t die so soon. The other interesting observation from Kodak’s failure is that with a heavyweight team of experts, sustaining innovation is really expensive and the outside view is most likely ignored.


That’s where management tends to give up on some innovations, as the time and investment may not be worth it while they are doing decent business with their current line of products. The situation is different for new entrants: startups usually break the rules of convention and are in a position to pursue such innovations a lot more cheaply, and in a progressive manner. Startups usually focus on a market which is ignored, or to which the incumbents don’t pay much attention, thereby not drawing the attention of the incumbents until a point where it is difficult for the incumbents to respond.


Nokia: Nokia made it big in cellular phones but failed to get its innovation strategy right with smartphones. In Nokia’s case too, its research team came up with a prototype smartphone with internet access and a touch interface way back in 2003. But the management, again going by good management principles, and citing the risk that the product might not succeed and the very high cost of its development, turned down the proposal to pursue the plan further. A few years later, Apple launched its iPhone.


Netflix: The Netflix case is a little different. Netflix had been very successful in its DVD rental business, and it did in fact see the emergence of disruptive innovation in the form of streaming video. It even responded by pursuing research in that direction and developed a video streaming service. What went wrong, according to analysts, is that it got its business model and pricing wrong: it bundled the traditional service and the digital streaming service together and increased the pricing. It would have been more appropriate to offer digital streaming under a different brand or as a separate service, as surveys indicated that DVD rentals still accounted for 70% of total video sales in the US.


Now, given that good management principles alone don’t help in sustaining or surviving disruptive innovations, what should organizations do to stay alive in today’s world, where IT has enabled disruptive innovations to emerge at a much faster pace, leaving very little time for incumbents to respond? We also keep hearing that “breaking the rules” is the way to foster innovation. While disruption is always seen as a risk to be managed, how well enterprises can come up with the right risk mitigation and contingency plans to handle the risk of disruption is still a mystery.


You may check out my presentation on the subject on Slideshare, and feel free to share your views and thoughts on this topic. You may google some of the great articles and papers on the theory of disruptive innovation by Clayton Christensen. You will also find some good video lectures of his on YouTube.

Saturday, February 9, 2013

Stress Testing a Multi-player Game Application

I recently had an opportunity to consult for a friend of mine on stress testing a multi-player game application. This was a totally new experience for me, and this blog post details how I approached the need to simulate the required amount of stress and test the application under stressed conditions.

About the Application Architecture

The application was developed using Flash ActionScript, with a few PHP scripts for some of the support activities. The multi-player platform is aided by the SmartFox multi-player gaming middleware. It also makes use of MySQL. The Flash files containing the ActionScript, along with a lot of images, are hosted on an Apache web server, which also serves the PHP. Apache, MySQL and SmartFox are all hosted on a single cloud-hosted server running Linux.

The test approach

My first take on the test approach was to focus on simulating stress on the server and keep the client out of the scope of this test. This made sense, as all of the Flash ActionScript executes on the client side, where in reality there is typically a single user playing the game on a client device, with all the CPU, memory and related resources of that device available to the client-side application. Thus, in reality, there is no multi-player stress on the client.

Given the focus on the impact of stress on server resources, I had to understand how the client communicates with the server, and the request/response protocols and related payloads. I used Fiddler to monitor the traffic out of the client device on which the game was being played, and I could only see HTTP requests fetching the Flash and image files and a few of the PHP files. I could not find any HTTP traffic to the SmartFox server, and figured out that those requests are TCP socket requests and thus not captured by Fiddler.
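As a quick sanity check in this kind of investigation, a plain TCP connect to the middleware port confirms that the traffic really is raw sockets rather than HTTP. Here is a minimal illustrative sketch in Python (the host and port arguments are placeholders, not the real SmartFox endpoint):

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds.

    Traffic over such a socket never shows up in an HTTP-level proxy
    like Fiddler, which is why the SmartFox requests were invisible there.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A probe like this is only a reachability check; inspecting the actual socket payloads would need a packet-level tool such as Wireshark rather than an HTTP proxy.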

The test tools

At this stage, it was clear that we needed to simulate stress on Apache by sending in a large number of HTTP requests, and to simulate stress on the SmartFox server over TCP sockets as well. We have a choice of numerous open source tools to simulate HTTP traffic. I chose JMeter for the HTTP traffic simulation: it is open source, UI driven, and easy to set up and use, and it also supports multi-node load simulation.
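Conceptually, JMeter's thread groups boil down to many concurrent virtual users issuing requests against the server. A toy Python sketch of that thread-per-user model follows; the URL and counts are placeholders, and JMeter itself layers ramp-up, timers, assertions and reporting on top of this basic idea:

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def hit(url: str) -> int:
    """Issue one GET request and return the HTTP status code."""
    with urlopen(url) as resp:
        return resp.status

def simulate_users(url: str, users: int = 10, requests_per_user: int = 5):
    """Fire users * requests_per_user GETs concurrently, one thread per
    virtual user, and return the list of status codes observed."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(hit, url)
                   for _ in range(users * requests_per_user)]
        return [f.result() for f in futures]
```

In practice there is no reason to hand-roll this for HTTP load; the sketch only illustrates the model that a JMeter thread group executes.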

I still needed to find a tool for simulating load on sockets. I checked with SmartFox to see if they offer a stress test tool, but they don’t. A search through the SmartFox forums revealed that a custom tool is the way to go, and to make it easier, we can use one of the SmartFox client API libraries, which are available for .NET, Java, ActionScript and a few other languages. I settled on the .NET route, as C# is the language I have been working with in recent years.

I built a multi-threaded custom .NET tool using the SmartFox client API to simulate the stress on SmartFox. To my surprise, the SmartFox client API library has not been designed to work with multi-threading, and SmartFox support confirmed this behaviour. I then decided to redesign my custom tool to use a multi-process architecture, and it worked fine.
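The multi-process workaround can be sketched as follows. This is an illustrative Python analogue, not the actual .NET tool: each simulated player gets its own OS process with its own connection, so a client library that is not thread-safe is never shared between threads. The simple echo payload here stands in for the real SmartFox protocol.

```python
import socket
from multiprocessing import Pool

def one_player(args):
    """Simulate one player in its own process: open a dedicated TCP
    connection, send a payload, and return the server's reply."""
    host, port, payload = args
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)   # signal end of request
        return conn.recv(1024)

def run_load(host, port, players=8):
    """Spawn one worker process per simulated player and collect replies.
    Isolation by process sidesteps any thread-safety issues in the
    client-side library each worker would use."""
    jobs = [(host, port, b"player-%d" % i) for i in range(players)]
    with Pool(processes=players) as pool:
        return pool.map(one_player, jobs)
```

The trade-off versus threads is higher per-player overhead, so the number of simulated players per load-generation node is lower; that is one reason to spread the load generators across several client machines, as was done here.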

I also needed a server monitoring tool to monitor and measure various server performance parameters under stress conditions. I found the cloud-based New Relic to be the tool of choice for monitoring the Linux server hosting the game components.

The test execution

I had JMeter configured on three nodes (one being the monitoring node) and set it up to spawn the desired number of threads. I had the custom .NET tool on another client, set up to spawn the desired number of processes making a sequence of TCP socket requests. I also engaged a couple of QA resources to play the game and record the user experience under stress conditions.

The test execution went well and we could gather the needed data to form an opinion and make recommendations.

Saturday, January 26, 2013

Solution Architecture - Aligning Solutions with Business Needs

As the adoption of Enterprise Architecture & IT Governance is on the rise, we keep hearing a lot about IT-business alignment. ISACA positions Strategy Alignment as an important knowledge area, alongside five other knowledge areas. Enterprise Architecture, being an important function in architecting or engineering the enterprise itself, has a major role to play in appropriately deriving the IT strategy from the business strategy and ensuring its execution as intended. Within the overall Enterprise Architecture function, it is the Solution Architecture function that acts as the bridge between Business Architecture and the Technology / Application Architecture. Solution Architecture is therefore a key role in transforming business needs into IT solutions, and it is this function that is key to ensuring that the IT solutions are in alignment with business needs and strategies.

Now, while most Solution Architects are well aware of this important need at the early stage of solution design and architecture, things still keep going wrong, and the business in general is not happy with the solutions delivered. While most Solution Architects do well in the initial phase of solutioning, somewhere down the line things tend to stray and gaps creep in, resulting in a misaligned solution being delivered. A close examination of various misaligned projects reveals the causes listed below. This list is not exhaustive and is not in any order of priority.

Requirement Analysis

Typically, a detailed Business Requirements Specification document is the input to the Solution Architecture function, and the solution that the architects propose largely depends on the veracity of this document. Good knowledge of the business domain of the organization in general, and the ability to critically review the requirements, would bring out possible gaps that, if ignored, might creep into the solution implementation. Needless to stress here are the non-functional requirements, which on most occasions remain undocumented and contribute largely to the gaps that creep in. The lack of formal processes that call for iterative reviews of the business case, and in turn the requirements specifications, is a cause for concern.

Ideally, Solution Architects should be engaged from the initial stages of problem definition, so that they are in a position to gather the needed details which might not figure in the requirements specifications.

Expectation Management

Traditionally, the business just throws in the problem they want to solve, or the opportunity they want to exploit, and perhaps participates in defining the requirements. Thereafter, they leave it to the architects and developers to do the rest and finally deliver the solution. In one project for a leading multinational bank, when the solution was delivered, the business reacted by stating that this was not what they wanted; they found that the requirements specification, duly accepted by them, was a total mismatch with what was expected. It is better, and easier, to have this expectation mismatch ironed out early on, before getting deep into implementation. While the expectation mismatch is generally around non-functional needs, like usability, portability and reliability, at times even the functional requirements leave room for misunderstanding by different teams, leading to misalignment. The best way to address this is to involve the business in periodic reviews, for example by using Agile practices. Even with Agile, this can happen if those representing the business do not have the same understanding of the business needs at large.

Validating the solution design with the business representatives, and keeping a watch on the expected design outcome for possible impacts on expectations as the implementation progresses, is the best way to address this concern.

Skill Gaps

Solutioning is not a pure science, and one should be able to blend science and art. Getting the best solution depends on the Solution Architects’ ability to visualize the final outcome, given the constraints and choice of tools, and to make sure that this is what the business would want. While skill gaps in other related functions are still a cause for concern, deficiencies in the Solution Architecture function cost dearly, as it plays its part early in the life cycle.


Choice of Tools & Technology

While the choice of tools and technology is not generally within the domain of the Solution Architect, being knowledgeable about the various tools and technologies that can be leveraged within the constraints of the IT strategy goes a long way in proposing the best solution. Another concern in this area is that when the Technical Architects or the development team encounter feasibility issues with a chosen tool or technology, the Solution Architect has to jump in again and align or tweak the solution to overcome such issues without impacting the expected final outcome.

Culture, Communication & Collaboration

While continuous review and monitoring would help bring out exceptions, it is for the organization to nurture a culture in which everyone knows the business and IT strategy and the value of the work they deliver towards IT solutions. The Solution Architecture team should have a thorough understanding of the IT strategy and ensure that the solutions they propose are in line with it. It is also important that the teams collaborate well and work towards the common objective of ensuring that the execution of the solution design progresses as expected, with timely escalation of exceptions for appropriate action.

Roles & Responsibilities

A lack of clear definition of roles and responsibilities leads to negligence in the deliverables of individuals and teams. As with the project I cited earlier, the business representative who approved the requirements specification did so without being sure of what he was approving. Reviews and monitoring are very much needed to spot such lapses early on. The culture of the organization should also be such that teams and team members own the problem and contribute to a solution that meets the expectations of the business.

Review and Monitoring

Depending on the size and strategic importance of the project, the program and project management teams should keep a watch for any potential impact on the expected outcome; any exceptions observed should be analysed for possible changes to the solution design and implementation, in turn managing expectations appropriately. This calls for identifying the key goals and outcomes upfront and coming up with appropriate metrics to track.

Another area to watch is the set of assumptions, constraints and dependencies identified as part of the project charter. It is impossible to iron out all ambiguities at the start of a project, and as such it is prevalent practice to start off by making assumptions about such ambiguities and proceed. However, effort should go into validating those assumptions and making them unambiguous as the project progresses. Any change in an assumption might trigger a redesign of the solution. The same applies to the various known constraints and dependencies under which the solution is designed and implemented.


As we can observe, all stakeholders have a role to play in ensuring alignment with the business need. However, the Solution Architecture function sets it off, and the other teams carry on from there. The Solution Architects hold the key: staying in close touch with the implementation teams, making themselves available for clarifications, being receptive to emerging issues around the solution design, and coming up with the needed improvements to the solution design.

Related Reads:
Solution Architect - Understanding the role