Friday, May 18, 2012

Pre-project reviews help projects fail fast, at the start


A new project was approved, and a 15-member team sat around the table at the project kick-off meeting with excitement and enthusiasm. The business analysts put together the stories, and the developers started developing them. Periodic project status reviews were held, issues and risks were discussed in these meetings, and the project manager did well in managing them. Then came a message from the stakeholders: all work on the project was to be stopped with immediate effect.

That may sound familiar to anyone in IT, as one study says that 37% of IT projects are at risk of failing. Another study finds that 31% of projects are cancelled before even hitting the finish line. It really hurts the project team when a project is cancelled or shelved. But there can be good reasons for doing so, and such decisions are taken when continuing with the project would only increase the loss and would not bring value to the sponsors. This is where ‘fail fast’ helps: sometimes pulling a project down even earlier saves a lot of effort and money for the sponsors.

Let us see what can be done even before starting any work on a project, so that potential failure is spotted in the pre-project stage itself and the project is not allowed to start in the first place. When the need for a project is felt, the executive management (sometimes called the Project Review Board) reviews it and, if it sees merit, gives a go for the project. In some cases, by the time the Project Review Board (PRB) sees the project proposal, considerable effort has already been incurred in the form of requirement gathering and analysis. The review by the PRB members, when done well, brings out the risk exposure and the RoI (Return on Investment) of the project, which in turn influence the decision. It is important, however, that the PRB members are presented with adequate information that helps them take the right decision. A well-drafted checklist or template that captures all the required information helps ensure no details are missed and reduces the number of projects failing down the road.

The project proposal template shall, at a minimum, capture the following details, so that the PRB can take an informed decision.

Motivation:

The problem the project is expected to solve, or the business opportunity it will help the business capitalize on, should be stated well, so that the PRB members can appreciate the motivation behind the project. It is appropriate to quantify the value or benefit the solution would bring when implemented as intended. Some projects bring immediate benefits, whereas others bring benefits only in the longer term; in the latter case, this should be called out explicitly. Similarly, some projects bring monetary benefits, while others bring intangible value.

Estimated investment:

A project may not make sense if the required investment is so high that the net value or benefit becomes negligible. Most projects get hit on account of cost overrun. The estimate should be close to accurate, and the use of a template will make sure that all cost elements are considered. In some cases, a better estimate can be made only after gathering requirements in sufficient detail. In such cases, it would be wise for the PRB to sponsor the initial efforts alone and let the proposal be placed for review again with a more accurate estimate.
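To make this concrete, here is a rough sketch in Python of how a template-driven cost estimate and RoI check might look. The cost categories and figures are invented for illustration only:

```python
# Illustrative sketch: a template-driven cost estimate and RoI check.
# All cost categories and figures below are hypothetical examples.

def estimate_total_cost(cost_elements):
    """Sum every cost element so that none is silently dropped."""
    return sum(cost_elements.values())

def roi(expected_benefit, total_cost):
    """Return on investment as a fraction of the cost."""
    return (expected_benefit - total_cost) / total_cost

costs = {
    "requirements_and_analysis": 40_000,
    "development": 180_000,
    "testing": 70_000,
    "infrastructure": 30_000,
    "post_production_support": 25_000,  # often missed without a template
}

total = estimate_total_cost(costs)
print(f"Total estimated cost: {total}")
print(f"RoI at a 400k benefit: {roi(400_000, total):.2%}")
```

The point of the template is the dictionary of cost elements: an estimate built from a fixed checklist of categories is much less likely to silently omit recurring items such as post-production support.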

Constraints:

All projects are constrained by various ifs and buts. Each possible constraint should be identified and called out in the proposal document. Each constraint will have one or more associated risk items, for which contingency and mitigation plans have to be put in place. Risk management practice has evolved in recent times, and there are methods and techniques with which risks can be quantified (as a factor of probability and impact) and the overall risk exposure of the proposed project can be measured. This helps the reviewers take an informed decision on whether that much risk is worth taking for the value the project might give back.
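As a simple illustration, risk exposure as a factor of probability and impact can be sketched as below. The risk items, probabilities and impact figures are made-up examples:

```python
# Sketch: quantifying overall risk exposure as probability x impact.
# The risk items, probabilities and impact figures are invented.

risks = [
    {"name": "key vendor delay",     "probability": 0.3, "impact": 50_000},
    {"name": "regulatory change",    "probability": 0.1, "impact": 200_000},
    {"name": "skill unavailability", "probability": 0.5, "impact": 30_000},
]

def exposure(risk):
    """Expected monetary exposure of a single risk item."""
    return risk["probability"] * risk["impact"]

total_exposure = sum(exposure(r) for r in risks)
print(f"Overall risk exposure: {total_exposure:.0f}")
```

A single exposure figure like this lets the PRB compare the risk directly against the quantified value the project is expected to return.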

Solution longevity and re-usability:

It helps if the expected life of the solution is determined. Some solutions may have a shorter life but offer re-usability with minor changes or enhancements. This has to be determined considering the longevity of the tools and technology that form part of the solution. A considerable number of projects get cancelled when the sponsors learn that the useful life of the solution is going to be short and that the project may not be completed on time, further reducing that longevity.

The above are some of the key factors that influence the decision on a project. There are many other factors which, depending on the nature and size of the project, may have significant influence on the decision making. For instance, security and compliance requirements could pose a significant risk, which the PRB needs to know about while taking the go decision. Other factors that may find a place in the proposal document include deployment infrastructure, post-production maintenance aspects, choice of technology, resource availability, etc.

On top of everything, the team that prepares the project proposal document should have expertise in the business domain, the technology used, estimation, and related skill areas. Typically this requires a team of architects of the appropriate specializations.

Saturday, May 12, 2012

Software Testability Review


Introduction

Close to 40% of the overall software development effort is spent on testing activities, which include test design, test preparation, test execution, and test result analysis. Improving test criteria and coverage, test automation, the use of tools for test analysis, and test case re-use are popular methods used to bring this cost down.

The testability of a software application also influences the testing effort, making it harder or easier to test and to analyse the test results. Testability is an important factor in achieving an efficient and effective test process. Organizations attempt to achieve higher testability by adopting a systematic approach that cuts across the SDLC, called Testability Engineering. Software architects or specialized test architects are called in at different stages of product design and development to ensure that testability is designed and built in as part of the system, since it plays a vital role in influencing the cost of testing. The testability review is one of the primary techniques which, when adopted at the right stages, will ensure high testability. This blog focuses primarily on testability reviews.

Testability Reviews

Reviews in general are a popular and widely used technique for software project teams to isolate defects at various stages of design and development. While testers do participate in most such reviews, whether they are called upon to bring out potential testing problems during these reviews needs validation. Let us see some of the testability heuristics derived from testing practices and related literature.

Architecture Level


Concentrate control structure related to a particular functionality in one cluster (or class). At times, software architects tend to overlook this aspect, which can lead to dependent functions or methods being part of different clusters; in distributed development, these could even sit with different development teams. This calls for unnecessary development of test stubs or test drivers while testing.

Give higher priority to the modularity of a system than to the reuse of components. This one is debatable, but the reason it is an important testability heuristic is that a higher stress on reusability may mean the non-availability of some dependent methods or components, leading to delays in testing. One has to carefully evaluate the pros and cons and take a call on this heuristic.

Implement a standard test interface within each domain class. Implement a standard test interface within base classes and technical classes if they are selected for direct class testing. A standard test interface in all classes causes minimal additional implementation effort and has a high potential payback.
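As an illustration of what such a standard test interface might look like (the class names and invariant checks below are invented, not prescribed by any framework):

```python
# Sketch of a "standard test interface": every domain class exposes the
# same self-check hook, so a generic test driver can exercise any class
# uniformly. The classes and invariants below are invented examples.

class Testable:
    """Mixin defining the standard test interface."""
    def self_test(self):
        """Return True if the object's invariants hold."""
        raise NotImplementedError

class Account(Testable):
    def __init__(self, balance=0):
        self.balance = balance
    def self_test(self):
        return self.balance >= 0

class Order(Testable):
    def __init__(self, items=None):
        self.items = list(items or [])
    def self_test(self):
        return all(qty > 0 for _, qty in self.items)

# The driver needs no per-class knowledge:
objects = [Account(100), Order([("widget", 2)])]
print(all(obj.self_test() for obj in objects))
```

Because every class answers the same call, one generic driver can sweep across all domain objects, which is where the high payback for minimal effort comes from.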

Introduce observation points at semantically meaningful points. Observation points help in collecting or watching the states and values of various objects or variables as the code execution passes through different stages. They are very helpful for debugging or troubleshooting some of the hard-to-simulate defects.

Map your test strategy and your design approach with respect to inheritance hierarchies. Polymorphic method calls between classes within an inheritance hierarchy may require re-testing of the superclass when a subclass changes and vice versa. There are at least three different ways you can deal with this problem:

  1. Avoid inheritance, use delegation instead: re-testing is not necessary.
  2. Use inheritance, perform an analysis of the dependencies between super- and subclasses, and use a selective regression testing strategy.
  3. Use inheritance, don’t analyse dependencies and rerun all test cases (test automation strategy).
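Option 1 above can be sketched as follows; the class names are invented, and the point is only that delegation makes the dependency explicit and easy to stub:

```python
# Sketch of option 1: delegation instead of inheritance. A change in the
# delegated class does not force re-testing polymorphic call paths in a
# subclass, and the dependency is trivial to stub. Names are invented.

class Logger:
    def log(self, msg):
        return f"[log] {msg}"

# Instead of `class AuditedService(Logger):`, delegate:
class AuditedService:
    def __init__(self, logger):
        self.logger = logger  # explicit dependency, easy to replace

    def run(self):
        return self.logger.log("service executed")

class FakeLogger:
    """Test stub; delegation makes the substitution trivial."""
    def log(self, msg):
        return f"[fake] {msg}"

print(AuditedService(Logger()).run())
print(AuditedService(FakeLogger()).run())
```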


Design Level


Make control structures explicit. Sometimes control structure is hidden in data, which creates the chance of omitting important test cases, thereby reducing test coverage. Being explicit helps testers overcome this problem.
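The following sketch contrasts control structure hidden in data with an explicit version; the pricing rules are invented examples:

```python
# Sketch: control structure hidden in data vs. made explicit.
# The pricing rules are invented examples.

# Hidden in data: a tester may not realise that each key is a distinct
# branch needing its own test case (plus one for the unknown-key default).
_rates = {"standard": 1.0, "student": 0.5, "senior": 0.75}

def price_hidden(category, base):
    return base * _rates.get(category, 1.0)

# Explicit: every branch is visible and countable.
def price_explicit(category, base):
    if category == "student":
        return base * 0.5
    elif category == "senior":
        return base * 0.75
    else:  # standard and unknown categories
        return base

print(price_hidden("student", 10))
print(price_explicit("senior", 10))
```

Both versions behave the same; the explicit form simply makes every branch visible to someone designing test cases for coverage.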

Avoid cyclic dependencies, especially between methods. Cyclic dependencies make the determination of test order complicated and also necessitate stubs and drivers and this could significantly increase the testing efforts.

Avoid unmotivated polymorphic method parameters, especially if strict subtyping does not apply. This can have a multiplier effect on the number of test cases or the amount of test data, as all combinations of parameter classes need to be tested. An alternative is to use strict subtyping, or to minimize the number of parameter classes and restrict the type by appropriate casts wherever possible.

Avoid implicit inputs and side-effects. It is easy to miss implicit input and side-effects of methods, especially if they are not well documented.

Avoid unmotivated state behavior of objects. Such behavior can lead to an increase in the number of test cases, as it becomes necessary to test each method for each object state.

Implement a state testing function for each test relevant class and restrict its invocation by test drivers if necessary. The state of an object is an important part of the test result after each test case execution but normally not accessible from the outside. Encapsulation makes testing more difficult. Breaking the encapsulation introduces unwanted dependencies between test drivers and the class to test.

Compensate test relevant information loss by built-in-tests. Built-in-test facilities like assertions provide a way to review and control intermediate computation results, thereby reducing the impact of information loss on the testing efficiency.
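A small sketch of built-in tests using assertions on intermediate results (the computation itself is an invented example):

```python
# Sketch of built-in tests: assertions on intermediate computation
# results compensate for information loss that an external test driver
# could not otherwise see. The computation is an invented example.

def moving_average(values, window):
    assert window > 0, "window must be positive"       # built-in test
    assert len(values) >= window, "not enough values"  # built-in test
    result = []
    for i in range(len(values) - window + 1):
        chunk = values[i:i + window]
        avg = sum(chunk) / window
        # Built-in test on an intermediate result that is invisible
        # from the outside:
        assert min(chunk) <= avg <= max(chunk)
        result.append(avg)
    return result

print(moving_average([1, 2, 3, 4], 2))
```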

There has to be at least one input element (or combination of input values) for each output element. Unachievable output values (called output inconsistencies) may be indicators of unreachable statements and paths. Avoiding them makes test result analysis a lot easier.

Provide means to trigger all exceptions. Some exceptions should never occur in theory, but they need consideration, as handling them could improve the reliability of the system. It is also a challenge for testers to trigger these exceptions so that the test cases cover all of them. Testable exception handling requires a design strategy and perhaps simulation of failure modes.
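One way to make such failure modes triggerable is to inject the failure-prone dependency, as in this invented sketch:

```python
# Sketch: making a hard-to-reproduce exception triggerable by injecting
# the failure-prone dependency. The storage example is invented.

class StorageFullError(Exception):
    pass

class RealStorage:
    def write(self, data):
        return len(data)  # happy path; a full disk is hard to reproduce

class FailingStorage:
    """Test double that simulates the hard-to-reproduce failure mode."""
    def write(self, data):
        raise StorageFullError("simulated disk full")

def save_report(storage, report):
    try:
        return ("ok", storage.write(report))
    except StorageFullError as exc:
        return ("error", str(exc))

print(save_report(RealStorage(), "abc"))
print(save_report(FailingStorage(), "abc"))
```

Because `save_report` receives its storage rather than creating it, a test can exercise the exception path without actually filling a disk.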

Code Level


Don’t squeeze the code. Code readability is an important factor which will be of great help when the testers are to design test cases and achieve maximum code coverage, especially in unit testing.

Avoid variable reuse. Variable reuse leads to implicit information loss, i.e. loss of intermediate computation results, and can be avoided by using more variables.
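A tiny sketch of the problem; both functions below are invented examples and compute the same result, but the first loses the intermediate value a tester might want to observe:

```python
# Sketch of the variable-reuse problem. Both invented functions return
# the same value, but the first overwrites the gross amount, losing an
# intermediate result a tester might want to check.

def net_price_reused(base, tax_rate, discount):
    amount = base * (1 + tax_rate)  # gross amount...
    amount = amount - discount      # ...now overwritten; gross is lost
    return amount

def net_price_distinct(base, tax_rate, discount):
    gross = base * (1 + tax_rate)   # intermediate result preserved
    net = gross - discount
    return net

print(net_price_reused(100, 0.25, 5))
print(net_price_distinct(100, 0.25, 5))
```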

Minimize the number of unachievable paths. Unachievable paths will impact the code coverage as it would be difficult for the testers to design test cases that traverse through such paths. In order to reduce the number of unachievable paths, avoid correlated decisions.

Avoid recursive implementations of algorithms if there is no checking of invariants. It is difficult to test recursive algorithms because we cannot easily create a stub for the component or method we want to test. A solution to this problem is to split the recursion into a pair of methods that call each other, or to use built-in assertions.
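A sketch of a recursive implementation with built-in invariant checks; binary search is used here only as a familiar example:

```python
# Sketch: a recursive algorithm with invariant checks built in, so every
# level of the recursion verifies itself even though the recursive call
# cannot be stubbed. Binary search is used only as a familiar example.

def binary_search(items, target, lo=0, hi=None):
    if hi is None:
        hi = len(items)
    # Invariants: the interval stays in bounds and the slice is sorted.
    assert 0 <= lo <= hi <= len(items)
    assert all(items[i] <= items[i + 1] for i in range(lo, hi - 1))
    if lo >= hi:
        return -1
    mid = (lo + hi) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search(items, target, mid + 1, hi)
    return binary_search(items, target, lo, mid)

print(binary_search([1, 3, 5, 7, 9], 7))
print(binary_search([1, 3, 5, 7, 9], 4))
```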

Summary

Dealing with testability issues during reviews of software artifacts should be the first choice to achieve an effective and efficient test process. Testability is not a characteristic of code alone but applies to the architecture and design as well.  The heuristics presented above are not exhaustive as more heuristics may need consideration depending on the test & design strategies, the nature of the system, the tools and technologies used for development and testing.

Wednesday, April 25, 2012

GRC for IT Architects


Governance, Risk and Compliance (GRC), as most of you might know, is more than a catchy acronym used by IT and security professionals; it is an approach or framework that an organization adopts to ensure proper management and control.

The broad term Governance calls for a better way of managing the business, which includes protecting the assets of the organization (information being one such asset) and sustaining the organization irrespective of the business or economic climate. Risks are the unforeseen events or forces which could potentially have a severe impact on the overall performance of the organization. Better governance cannot be achieved without a good risk management program in place. The risk appetite of an organization should be known to the stakeholders, who should manage or control the risks so that the risk exposure stays well within the risk appetite. The term Compliance denotes the organization’s approach to complying with the legislative requirements of the various countries in which it operates, and also with its social commitments.

GRC exists at different levels; for instance, governance could exist at the corporate level, the project level, or the sub-organization level. While the goals of GRC at the various levels are the same, the means and techniques used to achieve them vary.

As one can observe, these three terms are inter-related, and it is for that reason that a 360-degree view of all three together is needed. GRC aligns the various components of the enterprise (processes, employees, systems and partners) to be more efficient and more manageable, leading to better business performance.

An organization is primarily comprised of People, Processes and Technology. The technology domain in turn is made up of Data, Applications and Infrastructure. The Corporate GRC goals can be met when these components are aligned to meet the respective goals.

Many of the risks that today’s organizations are battling are around the data and applications used within and outside the organization. IT architects in turn play an important role in designing solutions involving data, applications and infrastructure. It is therefore important for the IT architect that the solution design process is aligned with the organization’s GRC framework.

The Information Systems Audit and Control Association has recently released COBIT 5, which helps organizations get more value from both information and technology investments. Through its governance approach, COBIT 5 helps maximize the trust in, and the value from, an organization’s information and technology. Let us go over some of the questions stakeholders would raise in the context of the governance and management of enterprise IT, and see how they are relevant for IT architects.

How do I get value from use of IT? Are end users satisfied with the quality of IT?

IT investments are about enabling business change and are expected to bring enormous value to the business. But 2 out of 10 enterprise IT projects are outright failures. Keeping a focus on value delivery from the proposal stage until the delivery of the solution is likely to improve the chances of success. Architects should establish the business value the solution could bring, so that the stakeholders can make an informed decision on whether to go ahead with the investment.

The perceived value of IT investments also depends on user satisfaction with the service delivered by the solution. Usability should not be ignored for any reason, and to achieve this, architects should collaborate with target end users on a continuous basis to elicit feedback.

How do I manage performance of IT?

As businesses depend heavily on IT, the performance of IT to the satisfaction of the business is important. Among various other reasons, poor or sub-optimal solution design is a major cause of IT’s non-performance. Here again, IT architects have an opportunity to factor in the best design practices and the ability to generate appropriate metrics, so that each IT service can be measured and monitored in terms of its performance.

How do I best exploit new technology for new strategic opportunities?

Information technology is advancing at a fast pace, and trends shift frequently. Newer tools and technology frameworks coming onto the market make enabling business change easier and easier. At the same time, this calls for people with the related skills. Architects have to do a balancing act: not missing the opportunities that newer technology and tools offer, while not risking the business by taking on such changes so early that the skills to manage them are hard to get. Many a time, exploiting new technology ahead of the competition can spur business growth.

How dependent am I on external providers? How well are IT outsourcing agreements being managed? How do I obtain assurance over external providers?

Organizations are embracing the cloud and have started looking at SaaS applications, as these offer a higher degree of flexibility in terms of investments and capabilities. This is happening even though quite a few security and other compliance concerns remain that the industry is still trying to address. The resulting engagement of more external vendors calls for well-drafted SLAs, which should be in line with the security and regulatory compliance needs of the organization. A careful evaluation of the product and the vendor is essential, as outsourcing does not absolve the organization of its compliance obligations.

What are the control requirements for information?

Information and data as assets are gaining significance, and in the next few years the ability to control and manage large volumes of data from discrete sources in an efficient and effective manner will be sought by almost all organizations. At the same time, data breaches are on the rise, and information security practice is drawing considerable attention from CIOs. It is time that CIOs or CSOs put in place an information governance program, identifying and classifying sensitive data and information and defining the control requirements around them. This will require all applications to be designed appropriately so that these control requirements are implemented.

Did I address all IT-related risks?

Risk is one of the important areas to be managed well in order to minimize uncertainty and its impact on the business. Risk management has to be practiced at every level, including IT architecture. IT architects should practice risk management right from the proposal stage through delivery and even after that. A lack of risk management skills among architects could itself be a risk.

Am I running an efficient and resilient IT operation?

With its high dependence on IT, today’s enterprise needs an efficient, effective, secure and resilient IT infrastructure for its survival and success. This requires the sub-systems of IT to perform well and at the same time be architected flexibly enough to accommodate change. Architects should always be willing to embrace change and make sure that the solutions they design are receptive to such changes.

How do I control the cost of IT? How do I use IT resources in the most effective and efficient manner? What are the most effective and efficient sourcing options?

The architects who design IT solutions are not usually constrained by a budget, which is why, in most cases, the solutions designed are not necessarily cost-efficient. Ideally, the architecture team should adopt better budgeting and estimation techniques and should be able to quantify the capital and operational costs, allowing the stakeholders to take informed decisions.

Do I have enough people for IT? How do I develop and maintain their skills, and how do I manage their performance?

Choosing the right tools and technology should also mean ensuring the availability of people to manage and support them. Architects sometimes get carried away by the features and abilities of such tools, or are influenced by vendors, and eventually end up incurring huge costs in finding and retaining skilled people. Architects should seriously consider the talent available in house, and the availability of such skills in the market on demand, while making these choices.

How do I get assurance over IT?

Quite often, IT is pulled in to diagnose why an application came crashing down. Developers, architects, network engineers, hardware engineers and others come together to troubleshoot the problem and come up with corrective and preventive actions. Every such instance throws up a new root cause, and the teams keep learning from such outages. But what the end-user community wants is a stable and reliable system the business can depend on. While it is hard to rule out outages, there should be processes in place that help reduce downtime. Systems should be designed to log the information necessary for troubleshooting, raise alerts upon encountering exceptional conditions, and factor in redundancy in hardware and software components. Periodic audits and reviews should be carried out to ensure that the recovery measures put in place are working.

Is the information I am processing well secured?

With cyber security crimes on the rise, organizations are investing heavily in securing the data and information assets stored within and outside the organization. IT security should be one of the key non-functional requirements that architects consider while designing solutions. The significance of the security needs can vary based on the organization’s nature of business and the information being processed or stored. Many countries have pronounced legislation on security requirements for specific industries and specific classes of data, which should be complied with without exception. Here again, periodic audits and reviews would help assure stakeholders about the level of IT security.

How do I improve business agility through a more flexible IT environment?

Agility is key to quickly turning business changes around as solutions. Flexible IT enables organizations to quickly capitalize on new opportunities, to innovate, and to get ahead of the competition. This saves time and increases efficiency. Some of the key evaluation or design criteria to make this happen are: shared or outsourced infrastructure, the ability to scale up and scale out, reduced complexity, continuous data and application availability, built-in efficiency within every component, etc.

The above is not an exhaustive list of what architects need to take care of. Most of it would be addressed if one follows the best design practices, considering all of the often-undocumented qualities (scalability, availability, maintainability, usability, etc.) required of the solutions and applying the right design patterns.


Sunday, April 15, 2012

Emerging Cloud Trends – Impact on IT


A recent Gartner report identified five cloud computing trends that could affect cloud strategy through 2015. While cloud computing has a significant potential impact on every aspect of IT, uncertainty, confusion and misunderstandings continue to exist, and these five sub-trends will be accelerating and need to be factored into the planning process. This means that CIOs will be inclined to revise their cloud strategies to align with these trends. It also means that enterprises will need IT workers with the skills to make this strategic shift successful. Here are the five sub-trends and the skills each would demand.

Formal Decision Frameworks facilitate Cloud Investment Optimization

The benefits of cloud include the shift from CAPEX to OPEX models, reduced spending, greater agility and reduced complexity. These benefits do not come for free; they come with challenges in the form of security, lack of transparency, performance and availability concerns, vendor lock-in, licensing constraints, integration needs, etc. It is important that these benefits and concerns are carefully mapped against the needs of the enterprise, that an appropriate decision is made, and that the necessary monitoring and management processes are put in place. Each of these benefits needs to be quantified considering the organization’s current and future priorities and constraints. For instance, a financial services firm may find greater agility to be a challenge rather than a benefit, because greater agility could mean more frequent changes, which would have an impact on the reliability and stability of the applications. Realizing such an impact mid-course could result in rolling back from cloud adoption, and the resulting damage is obvious.

Over the next few years, organizations will put in place appropriate decision frameworks, specifically for cloud adoption, so that the benefits and risks are known upfront and decisions are taken appropriately. The skills this trend may demand include risk management, IT security, IT governance, estimation and metrics.
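As a toy illustration of such a decision framework, a weighted-scoring sketch might look like the following. The criteria, weights and scores are invented; a real framework would be tailored to the organization's priorities and would quantify each factor far more rigorously:

```python
# Toy weighted-scoring decision framework for a cloud-adoption decision.
# Criteria, weights and scores are invented; "security_risk" and
# "vendor_lock_in" are scored so that a higher score means less risk.

criteria = {           # relative weights, summing to 1.0
    "cost_savings":   0.30,
    "agility":        0.20,
    "security_risk":  0.30,
    "vendor_lock_in": 0.20,
}

def weighted_score(scores, weights):
    """Combine 0-10 scores using the criterion weights."""
    return sum(scores[name] * w for name, w in weights.items())

option_cloud  = {"cost_savings": 8, "agility": 9, "security_risk": 4, "vendor_lock_in": 3}
option_onprem = {"cost_savings": 4, "agility": 5, "security_risk": 8, "vendor_lock_in": 9}

print(f"cloud:   {weighted_score(option_cloud, criteria):.2f}")
print(f"on-prem: {weighted_score(option_onprem, criteria):.2f}")
```

Even a crude framework like this forces the benefits and concerns to be stated and weighed explicitly, rather than discovered mid-course.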

Hybrid Cloud Computing as an Imperative

As there are enough reasons for enterprises not to move all their IT onto the public cloud, Gartner sees a unified cloud model, where a cloud of clouds is a possibility: a single cloud may comprise multiple cloud platforms, part of which could be internal. As everyone knows, the key challenge with hybrid cloud computing is the integration of applications and data between on-premise and cloud applications.

This calls for existing internal applications to be enhanced to support integration with external cloud applications, while the cloud applications should expose APIs for consumption by other cloud applications and/or the organization’s internal applications. Applications on the public cloud need to adhere to industry standards and best practices, so as to support the varying integration needs of their customers. The skills that an IT professional should start seriously looking at to get on with this trend are EAI (Enterprise Application Integration), SOA (Service Oriented Architecture), ETL (Extract, Transform and Load) and EII (Enterprise Information Integration).

Cloud Brokerage will facilitate Cloud Consumption

As cloud adoption proliferates, so does the need for assistance in consuming cloud services. Gartner believes that Cloud Service Brokers (CSBs) are one of the most necessary and attainable opportunities for service providers, service distributors and internal IT organizations. The CSB model provides an architectural, business and IT operations model for enabling, delivering and managing different cloud services within a federated and consistent provisioning, billing, security administration and support framework. This helps unify cloud services delivery and management. Gartner has designated Jamcracker a “Cool Vendor in Cloud Service Brokerages”.

This trend will call for IT professionals to have a great deal of knowledge of SOA, in addition to the various standards, practices and tools for service provisioning, delivery monitoring, billing and management.

Cloud-Centric Design becomes a necessity

Migrating existing workloads with highly variable resource needs to cloud platforms is among the immediate opportunities many organizations are looking to utilize. But this alone will not make cloud adoption complete, as it results in various work-around approaches to make things work with existing applications, bypassing standards and best practices. This might work in the near term but may not scale or yield the real benefits in the longer term. Organizations should start looking at developing cloud-optimized applications that exploit the potential of the cloud. Even internal applications should be designed with a cloud-centric model, so that they can exploit the private cloud platform and make integration with public cloud applications over hybrid cloud computing platforms easier.

This trend will expect application and solution architects to start acquiring the necessary cloud skills, so that the solutions they architect are cloud-centric, have identifiable service end points for use by various other internal and external applications, and factor in support for cloud service brokerages. The design patterns, standards and practices around cloud-centric design are evolving, and it is important for IT workers to keep a watch on this area.

Cloud Computing influences future Data Center and Operational Models

In public cloud computing, the providers have implemented a model in which the provisioning, delivery and management of services is optimized and automated to a great degree. This also ensures optimal utilization of the underlying hardware while minimizing energy and other operational costs. Enterprises are attempting to implement similar models within their own data centers and to set up private clouds for consumption by their internal consumers. This trend is growing, and Gartner predicts that in the next few years any data center implementation (small or big, internal or external) will follow the cloud model.

This trend will expect infrastructure architects to be cloud-aware and familiar with the underlying tools and technologies that form part of cloud service provisioning, delivery and management.

Reference: Gartner report "Five Cloud Computing Trends That Will Affect Your Cloud Strategy Through 2015." The report is available on Gartner's website at http://www.gartner.com/resId=1920517.

Thursday, March 15, 2012

Enterprise Application Integration - Challenges


The increased complexity and diversity of information systems, and the inability to rebuild them from scratch, are forcing enterprises to look at EAI as an alternative solution that helps extend the life of existing applications while adding newer applications to meet changing needs. EAI, if not done well, could add to the woes of the enterprise. Here are some of the typical challenges an EAI project will face, which need to be worked around to reap the real benefits of EAI.

  • Change Management Plan
It is important that employees buy in to the EAI initiative and are taken into confidence about the changes it brings. With EAI this is especially important, as the integration could span many applications, not only within the enterprise but also with partner and vendor applications. Changes to such applications and/or their processes are often necessary to implement the best EAI solution. Without a proper Change Management Plan to support the initiative, various business and IT teams may resist or be reluctant to support the EAI project, which could lead to failure of the initiative.
  • Project Costs
According to Gartner, 50% of EAI projects run over budget. Even when cost is under control, projects slip their schedules and hit production later than expected. Typically, the cost of acquiring knowledge of the various systems, additional software and hardware, infrastructure support, vendor management, etc. is ignored or under-estimated at the planning stage, resulting in cost overruns. In addition, EAI requires many technical and business decisions to be taken during the course of the project, which calls for deeply experienced experts who come with a hefty price tag. While the project cost itself is not the issue, it is important that the overall cost is well estimated and the return on investment is established to the satisfaction of management, so that they continue to support the EAI initiative. If this is not done well, and the project cost keeps shooting up or the schedule keeps slipping, management is likely to withdraw support, which would mean shelving the EAI initiative.
  • Continuous Support
Like any other application, an EAI project is not a one-time implementation; it needs continuous support and maintenance. This is evident as the participating applications keep evolving, and the business processes around which the integration is orchestrated keep changing with the growth and diversity of business operations. In most EAI projects executed by vendors this aspect is ignored, and the recurring cost of support and maintenance can come as a surprise.
  • Choice of Technology & Tool
While the business team looks for a quick, easy solution that is flexible and cheap, the IT team looks for reliability and ease of use, among other things. The IT team also expects the business team to be appreciative of certain limitations of the technology and tools, which is a concern for the business team. It is extremely difficult to choose a technology or tool that meets all needs. In the integration space, for instance, EAI tools are great for real-time integration of small chunks of data between applications, whereas bulk ETL-style integration needs call for different tools. Similarly, EAI tools typically do not support complex transformations; instead, the source or target solutions need to be enhanced to handle them. Architects have a key role to play in establishing the process and practice aspects of EAI within the enterprise, and it is important that these are thought through ahead of tool and technology selection.
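Since such transformation logic often has to live in adapter code at the source or target rather than in the EAI tool itself, here is a minimal sketch of one such adapter; the systems, field names and conversion rules are all invented for illustration:

```python
# Minimal field-mapping adapter: translates a (hypothetical) source CRM
# record into a (hypothetical) target ERP schema. The currency conversion
# is the kind of transformation EAI tools rarely handle natively.

def to_target_schema(source_record):
    """Map a source CRM record to the target ERP format."""
    full_name = f"{source_record['first_name']} {source_record['last_name']}".strip()
    return {
        "customer_name": full_name,
        "credit_limit_usd": round(
            source_record["credit_limit"] * source_record["fx_rate"], 2),
        "status": "ACTIVE" if source_record["is_active"] else "INACTIVE",
    }

record = {"first_name": "Asha", "last_name": "Rao",
          "credit_limit": 1000.0, "fx_rate": 0.012, "is_active": True}
target = to_target_schema(record)
```

Keeping such mappings in one well-tested place, rather than scattered across integrations, is what makes them maintainable as the participating applications evolve.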
Connecting people and technology is always a challenge, and it is magnified further by the great many choices of tools and technology. Understanding these challenges well, and putting an appropriate plan in place well in advance to work around them, will help ensure the success of EAI initiatives in an enterprise.

Saturday, February 18, 2012

Developers’ take away from a support project


Developers usually prefer development projects over production support projects. They want new challenges in terms of technology and like to work with the latest tools and platforms. As most development projects offer this, developers usually try to stay away from production support. In reality, though, production support projects offer certain key benefits that are very much required as developers move up in their career path. Let us examine some of them here.

The real life business scenarios

A software project begins with perceived business requirements as drafted by Business Analysts and approved by customers. In most cases the requirements are far from complete, which forces developers to live with ambiguity and leaves room for more defects in the product. However much the software is tested, once it hits production, real-life business scenarios will throw it out of gear and make it fail. Those involved in support projects therefore get the opportunity to deal with production business scenarios, which sharpens their business and domain knowledge. Given that the world has started embracing cloud and SaaS applications, there will be less development and more customization and configuration management. This means that domain skills will rank very high among SaaS providers and consumers.

Better product / domain knowledge

In product development, it is quite possible that a developer or a team of developers works on just a small part of a product. Developers on development projects therefore have little opportunity to gain a complete understanding of the product. Developers in production support, by contrast, get to work with all parts of the product, and sometimes across other products too. They gain better visibility into the operating processes and practices associated with a use case, and thereby better product and domain knowledge.

Solution design skill

Developers tend to believe that support projects offer little opportunity in the solution design space, which is a myth. A production defect is far more difficult to deal with than one identified during the development life cycle. Resolving a production defect involves, at a high level, the following steps:

  • Quickly come up with a data fix to maintain data integrity if it is impacted by the defect.
  • Perform a root-cause analysis and identify the real-life scenarios that could lead to the defect being encountered.
  • Come up with an interim workaround, if one is available, to prevent recurrence in the short term.
  • Identify the best solution to prevent recurrence – this is challenging, as the solution has to be designed within the existing product architecture, with minimal effort and least impact on already working software.


Each of these steps, done well against real-life scenarios, adds tremendous value to a developer's abilities and leads them towards becoming software or solution architects. Solutions in support projects reach production quicker than in development projects, and as such earn high appreciation from business teams.
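As an illustration of the first of those steps, a data fix is typically a small, reversible scripted correction run inside a transaction so it can be previewed before being committed. A sketch using an in-memory SQLite table, with the schema and the defect invented for the example:

```python
import sqlite3

def apply_data_fix(conn, dry_run=True):
    """Correct orders the defect left in an inconsistent state.
    With dry_run=True the change is counted, then rolled back."""
    cur = conn.cursor()
    cur.execute("UPDATE orders SET status = 'SHIPPED' "
                "WHERE status = 'PAID' AND shipped_at IS NOT NULL")
    affected = cur.rowcount
    if dry_run:
        conn.rollback()   # preview only: undo the change
    else:
        conn.commit()     # apply the correction for real
    return affected

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, shipped_at TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "PAID", "2012-01-01"),
                  (2, "PAID", None),
                  (3, "SHIPPED", "2012-01-02")])
conn.commit()

preview = apply_data_fix(conn, dry_run=True)   # how many rows would change
fixed = apply_data_fix(conn, dry_run=False)    # actually commit the fix
```

The dry-run pass is what gives the support engineer confidence that the fix touches exactly the rows intended, before any production data is modified.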

Code Re-factoring

Learning from one’s own mistakes is a good way of learning, but learning from others’ mistakes is a smarter one. Every time a developer attempts to resolve a production defect, he may be looking into code written by someone else and may come across many different ways of achieving a result. Taken positively, a support developer can enjoy reading through code written by others, picking up better algorithms and, at the same time, learning how not to write code. This will surely improve their coding abilities.
Developers supporting a production instance of a software product will realize how important code readability is, and hopefully will make it a habit to write readable code with appropriate comments and indentation.

Trouble shooting expertise

Usually software products move to production after at least three levels of testing. A defect in production means it has slipped through all the testing phases during development, so the scenario under which it surfaces is not one that was visualized earlier. Some such defects are very difficult to reproduce, without which resolving them is a nightmare. Those involved in support projects are quite often exposed to such scenarios, and over a period gain good troubleshooting expertise. Read my other blog post on debugging performance problems.

Collaboration with other teams

During the development phase, a software developer looks up to his lead for clarifications on assigned work and does not get exposed to other teams. Those in production support, by contrast, get to work with various other teams: infrastructure, IT security, subject matter experts, quality assurance, business analysts, end users, and third-party vendors whose components are used. This collaboration and interaction builds additional skills, both technical and soft.

Conclusion

Being in production, support projects enable the enterprise to perform its operations and earn profits on an ongoing basis; they play a vital part in business continuity. As long as production software is well supported and maintained, IT heads will not think of replacing it unless a major technology overhaul is expected.

Of course, there are certain downsides to support projects too. For instance, one may have to be on call to handle emergencies, and sometimes a hard-to-crack defect can result in tremendous pressure and stress.

Saturday, February 4, 2012

The skills that transform an IT Manager to an IT Architect


A typical question that an aspiring architect has to answer in a hiring interview is: “In your opinion, how is an architect different from an IT Manager?” I have been on the asking side on many occasions and have rarely received convincing answers. This is understandable, as in most organizations even someone titled Architect ends up managing IT as opposed to architecting it. Let us attempt to compare and contrast the skills these roles demand. Please note this is not an attempt to compile a complete list of skills for either role.

1.       The Engagement

The Architect’s engagement starts with the identification of a business pain point; at times, it is the Architect who should spot the pain points and propose remedies. The IT Manager, on the other hand, is engaged the moment a project has been scoped and is ready for execution and implementation. The engagement of an Architect begins with the challenge of making a business case (with the help of fellow architects specializing in specific areas) to the stakeholders, with appropriately quantified projections that lead to a positive RoI. When this task is done well, the number of projects shelved mid-course comes down considerably.

2.       Big Picture thinking

The Architect has to be the one who can visualize the big picture of any given problem or possible solution, in line with the enterprise’s long- and short-term goals, the anticipated business growth, and the technology trends in the related area. This visualization helps Architects appropriately prioritize the various initiatives and draw out short-term and long-term road maps. Big-picture visualization is an integral part of the enterprise’s IT strategy lifecycle: determining the future state and planning the transformations from current state to future state. The IT Manager, on the other hand, is engaged only in the transformation tasks, thereby leading the organization to the future state.

When some aspiring architects were asked how they would do capacity planning, their responses were based on a pinned-down Statement of Work, which in an architect engagement would not exist; they could not imagine a situation where they would be designing a solution with absolutely no requirements on hand.

3.       Project Execution

IT Managers have teams to execute, implement and manage various projects, whereas the Architect plays an independent and in most cases individual-contributor role. Architects have to stay with the project execution teams and help them by bringing in course corrections whenever they deviate from the intended plan. This requires the Architect to possess hands-on ability in the related technology, so as to hand-hold the project teams during execution. IT Managers, on the other hand, may not be required to do this; instead they simply coordinate the engagement of Architects with the teams.

4.       Risk Management

While risk management is essential throughout the IT management life cycle, performing it early adds huge value to solution execution, as it enables better decision-making in terms of resource planning and training needs, and brings in the ability to remedy risks. That is the challenge an Architect faces: identifying all the potential risks the solution may lead to. IT Managers also face challenges in risk management, but theirs is a lot easier, as by then many things will have taken concrete shape with less ambiguity.

5.       Staying on top of the trends

IT Managers are not normally pressed to stay on top of technology trends; it is enough for them to stay on top of technology that is mature and has good industry adoption. The Architect, on the other hand, has to stay on top of the trends, which helps in gauging the longevity of a proposed initiative or solution. It is also important for the Architect to have a good grasp of industry predictions and analysis, and the ability to choose the less risky path to lead the enterprise to the future state.

Saturday, January 21, 2012

SaaS Security and Compliance Concerns

Security is one of the major concerns holding enterprises back from embracing the cloud. But some consider it manageable and have started adopting cloud-based SaaS applications; cloud-based enterprise solutions like Salesforce, ServiceNow and NetSuite are gaining ground. Let us attempt to list some of the key areas where the security and compliance needs differ from those of on-premise applications.

Data Security 

Unlike on-premise applications, with SaaS applications the underlying data is located outside the organization’s physical and logical boundaries. If the SaaS provider has in turn hosted the application with a cloud platform or infrastructure provider, the same is true for the SaaS provider too. The very fact that the data is located outside the enterprise can be a serious security concern, depending on the sensitivity of the data and the laws governing it.

Mature SaaS applications expose service interfaces to facilitate data integration with the enterprise’s other on-premise applications. These interfaces need to be equally well secured, as they can be exploited to steal or manipulate data.

This means the data, stored elsewhere, should be secured using well-established practices and methods such as encryption and access control. In addition, all interfaces that expose data should be secured with appropriate access control, authorization and logging techniques. The SaaS provider should also make data access logs available to the enterprise for review and audit.
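As a rough illustration of that interface-level combination of access control, authorization and audit logging, here is a sketch of a decorator guarding a data-exposing call; the role names, log format and function are all invented for the example:

```python
import functools
from datetime import datetime, timezone

audit_log = []  # stand-in; real deployments write to durable, reviewable storage

def secured(allowed_roles):
    """Allow the call only for permitted roles, and record every attempt."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(user, *args, **kwargs):
            allowed = user["role"] in allowed_roles
            audit_log.append({
                "when": datetime.now(timezone.utc).isoformat(),
                "user": user["name"],
                "op": fn.__name__,
                "allowed": allowed,
            })
            if not allowed:
                raise PermissionError(f"{user['name']} may not call {fn.__name__}")
            return fn(user, *args, **kwargs)
        return inner
    return wrap

@secured(allowed_roles={"analyst", "admin"})
def export_customer_data(user, customer_id):
    return {"customer_id": customer_id, "records": []}  # stub payload

result = export_customer_data({"name": "asha", "role": "analyst"}, 42)
```

Note that denied attempts are logged as well: it is precisely the failed accesses that the enterprise wants to see when reviewing the provider's access logs.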

Another important concern is what happens upon service termination: how the provider ensures proper return of the complete data, and the destruction of any copies the services may have required.

When the SaaS provider assures adequate back-up and recovery tools and processes, it should also be ensured that the location of the back-up data is physically and logically secure and that the data is stored with an acceptable form of encryption. Access to such locations should likewise be controlled, so that SaaS consumers can access only their very own back-up data and not that of others.

Data Segregation 

One of the key characteristics of a SaaS application is multi-tenancy, where an application instance is shared by multiple customers. Customer data handled by the application therefore needs appropriate physical or logical segregation, so that a customer’s users access only their own data. For the consuming enterprise this can be a big concern, as lack of adequate control could result in its data being accessed by other consumers of the application, who could be competitors.

In this context, it is essential to know how the application and data stores are architected and what controls are in place to ensure each customer’s data is isolated to its own users. This is a larger issue, and a lapse in any of the following areas could result in a compromise:


  • A design or architectural flaw in the user authentication and / or authorization 
  • A design or architectural flaw in the customer data separation 
  • A defect in the user authentication routine 
  • Regression issue caused by a design or application change 
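One common safeguard against the first two flaws listed above is to route every data access through a single tenant-scoped choke point, so that no code path can forget the tenant filter. A sketch using SQLite, with the table, columns and tenant names invented:

```python
import sqlite3

class TenantScopedStore:
    """All reads go through this class, which always applies the tenant filter."""

    def __init__(self, conn, tenant_id):
        self.conn = conn
        self.tenant_id = tenant_id

    def fetch_invoices(self):
        # The tenant_id predicate is applied here, once, for every caller.
        cur = self.conn.execute(
            "SELECT id, amount FROM invoices WHERE tenant_id = ?",
            (self.tenant_id,))
        return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER, tenant_id TEXT, amount REAL)")
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)",
                 [(1, "acme", 100.0), (2, "acme", 250.0), (3, "rival", 999.0)])
conn.commit()

acme = TenantScopedStore(conn, "acme").fetch_invoices()    # only acme's rows
rival = TenantScopedStore(conn, "rival").fetch_invoices()  # only rival's rows
```

Centralizing the filter this way turns the "forgot the tenant clause" class of regression into a structural impossibility rather than something every code review must catch.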


Development Model 

Having seen how a flaw or change in the application can compromise data security, the SaaS provider should also assure consumers that its design and development model addresses SaaS security concerns. This means the SaaS provider should have the following practices as part of its application design and development model.


  • The developers should be security-aware, and the design and code they produce should be of a high standard. The development team should be well trained in remediating security threats as they are detected.
  • The development lifecycle should provide for security reviews and security testing at the end of every phase / sprint.
  • As the application instance is shared by multiple customers, there may be a need for customer-specific patches and far quicker releases of smaller changes. This may require agile development methods to be adopted.
  • The engineering process should include configuration management, as shorter release cycles and customer-specific patches require better source code control, specifically in code branching and merging.


Infrastructure Security 

Like in-house production environments, the environments hosting the SaaS application should be well secured. The SaaS provider should establish this by exposing the processes and practices it follows and by adhering to popular security frameworks and standards. This need is the same as for the enterprise’s own on-premise or cloud-hosted applications.

Regulatory Compliance 

A number of countries have legislation stipulating compliance with certain processes and practices when it comes to data; in the US, for instance, there are HIPAA, SOX, etc. The SaaS model complicates this further, as the enterprise may operate out of one country, the SaaS provider out of another, and the data may be located in yet another country, or even several.

From the SaaS provider’s perspective, customers from all over the world may subscribe to the application, in which case the provider should establish the compliance practices necessary to meet all such legal requirements. Some countries legally require data to be retained for a certain minimum period, which the SaaS vendor should support.

Needless to say, governments will monitor cross-border data traffic and may seek clarifications and explanations, for which adequate audit trails and logs are essential.

Identity Management 

Provisioning users and managing their access and authorization permissions across applications is becoming increasingly complex, and enterprises are looking at central identity stores coupled with single sign-on to address this problem. SaaS can bring additional complexity here. Most enterprise SaaS providers support both a local identity store and integration with the enterprise’s own identity store.

With a local identity store, adequate security around the store and the processes for managing it gain significance. Any lapse in this area could lead to a compromise of data security.

Where integration with the enterprise’s own identity store is preferred, the additional concerns lie in establishing trust relationships with the SaaS application and securing the federation of user identities between the identity store and the cloud-hosted SaaS application.

Conclusion 

Adopting a SaaS application does not absolve the enterprise from governing the information assets associated with it. The CxOs of the enterprise remain responsible whether these assets are managed on-premise or by the SaaS vendor. SaaS vendors in turn should make sure they address all these needs, so as to win the confidence and trust of enterprises in selling their software services.

Check out my presentation on the same topic on slideshare.

Saturday, December 24, 2011

Driving fast into the Tech Lane


As I was driving down to a restaurant with a friend of mine, we were chatting about another common friend and his new venture in mobile applications. The conversation soon gained a technical flavor, and it was a nice drive into the fast-changing technology lane. Here are some excerpts from our conversation during the drive.

On why enterprises are in a hurry to port existing applications to mobile platform...

Technology is evolving fast, and enterprises will soon embrace mobile devices ranging from smartphones to tablets. Every tech worker owns a smart mobile device of his or her choice. Many such workers hold senior positions in the enterprise, are keen to use their devices for work, and try to influence the IT heads to allow them in the work environment. This is in fact a challenge for CIOs in terms of information security and confidentiality. But as the trend grows, IT heads have no option but to embrace it and regulate it with a formal BYOD (Bring Your Own Device) policy and the controls and governance framework around it.

On how BYOD is relevant in the context of mobile applications...

Yes, as BYOD gains acceptance, the next big challenge is to get existing applications working on such devices, so that employees don’t have to be provided with a desktop or even a laptop. This in turn drives the need to port applications to mobile platforms. Many tools and methodologies are emerging in this space to facilitate building mobile applications from the ground up and porting existing legacy applications to mobile platforms. “Write once, deploy anywhere” is the USP of today’s development tool vendors.

On how legacy applications can be ported...

This is where Service Orientation gains importance. Business capabilities are identified and exposed as reusable services, and a portal application is then built on top to present them appropriately for end-user access on a variety of devices. Organizations may also consider embracing cloud-based SaaS applications to replace legacy applications. And yes, migration to the cloud can be a daunting task, but CIOs see a longer-term benefit in doing so. An alternative shorter-term solution is to get a virtual desktop on the mobile device and work on whatever legacy app runs on the desktop.
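The wrapping pattern described here can be sketched with nothing but the Python standard library; the "legacy" routine, the URL scheme and the approval rule below are invented for illustration:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def legacy_credit_check(customer_id):
    """Stand-in for business logic buried inside an existing legacy application."""
    return {"customer_id": customer_id, "approved": customer_id % 2 == 0}

class ServiceEndpoint(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expose the legacy routine at /credit-check/<id> as JSON over HTTP,
        # consumable by web, mobile, or partner clients alike.
        customer_id = int(self.path.rsplit("/", 1)[-1])
        body = json.dumps(legacy_credit_check(customer_id)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request console output
        pass

server = HTTPServer(("127.0.0.1", 0), ServiceEndpoint)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
with urllib.request.urlopen(f"http://127.0.0.1:{port}/credit-check/42") as resp:
    payload = json.loads(resp.read())
server.shutdown()
```

The point of the pattern is that the mobile (or any other) front end only ever sees the JSON contract; the legacy implementation behind it can later be replaced by a SaaS application without touching the clients.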

About the concerns on cloud...

Yes, there still are concerns that keep organizations away from the cloud, though the trend is changing; most organizations have already moved less critical applications to the public cloud. Just as central / reserve banks regulate the banking industry, it is time for an industry consortium to come up with an independent regulatory body and framework to help establish trust among enterprises, which in turn would ease some of the security concerns. While industries like banking and healthcare have reasons to be wary of the cloud, other industries are showing serious signs of embracing it.

On the amount of data that banks process and manage and whether that could be a deterrent for cloud adoption...

Cloud or not, data quality and data maintenance are emerging as critical functions. Dirty and redundant data is recognized as having a considerable impact on an organization’s profits. Tools have emerged for assuring data quality, data de-duplication and master data management. Computing hardware and related technologies like virtualization have made vertical and horizontal scaling very easy, thereby making the use of these data-intensive tools possible.

We both enjoyed this conversation, and I am sure you will enjoy reading it too.

Friday, December 16, 2011

Debugging a performance problem


As with any typical application development, performance is conveniently ignored in most phases of the development life cycle; despite being a key non-functional requirement, it mostly remains undocumented. Moreover, the development, test and UAT environments may not really represent real-world production usage, so some performance problems cannot be spotted earlier. Even if the application is load tested, certain factors in the production environment, like data growth and user load, may lead to performance degradation over time.

While most performance problems can be spotted and resolved easily, some are a challenge and may take sleepless nights to resolve. A structured approach helps address such issues within a reasonably quick time frame. Here is a step-by-step approach that should work in most cases.

1.       Understand the production environment

It is important to understand the production environment thoroughly, so as to identify the various hardware and networking resources and the middleware components involved in application delivery. In a typical n-tiered application, a request may pass through and be processed by multiple appliances and servers before the user receives a response. Also understand which of these components can collect logs / metrics or be monitored in real time.

2.       Understand the specific feedback from the end users

Gather details such as who noticed the performance degradation, in what time frame, and whether it repeats in a pattern or is simply pulling the system down. Also determine whether the entire application is slowing down or only specific components are underperforming. Try to experience the problem first hand, sitting alongside an end user or, if possible, using appropriate user credentials. The ‘who’ also matters: in certain circumstances the application may slow down only for users in a specific role, as the amount of data processed and transmitted can differ by role.

3.       Review available logs and metrics

Gather the logs and metrics collected by the various hardware and software components, and look for information relevant to the specific application, or more specifically to the set of requests that demonstrate the performance issue. As logging itself can be a performance overhead, logs are often switched off or set to collect only minimal detail. If so, configure or make the necessary code change to achieve an appropriate level of logging, then collect the required details by re-deploying the application to a production-equivalent environment.
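A common way to collect such timing data at an adjustable logging level is a small instrumentation helper like the following sketch (the operation name and logger configuration are illustrative):

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("perf")

@contextmanager
def timed(operation):
    """Time a block and emit the measurement at DEBUG level, so the
    instrumentation is effectively free when the logger is quiet."""
    stats = {"operation": operation}
    start = time.perf_counter()
    try:
        yield stats
    finally:
        stats["elapsed_ms"] = (time.perf_counter() - start) * 1000
        logger.debug("%s took %.1f ms", operation, stats["elapsed_ms"])

# Normally left at WARNING; flipped to DEBUG only while investigating.
logging.basicConfig(level=logging.DEBUG)

with timed("fetch_orders") as stats:
    time.sleep(0.01)  # stand-in for the real database call
```

Because the level is controlled by configuration, the same build can run quietly in production and verbosely in the production-equivalent environment where the problem is being reproduced.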

4.       Isolate the problem area

This step is very important and can be very challenging too. Take the help of developers and of performance and load testing tools to simulate the problem, and in the meanwhile monitor key measurements as the request and response pass through the various hardware and software components.

By analyzing the data gathered from the end user or from first-hand experience, together with the available logs and metrics, try to isolate the issue to a specific hardware or software component. This is best done step by step:

a.       Trace the request from the UI to its final destination, which is typically the database.

b.      If the request reaches the final destination, measure the time taken to cross the various physical and logical layers and look for anything that could cause the slowdown. If a hardware resource is over-utilized, requests may be queued up or rejected after a timeout; look for such information in the logs.

c.       Then review the response cycle and try to spot delays in the return path.

d.      Try the elimination technique, whereby each involved component, starting from the bottom, is cleared of being the performance bottleneck.

Experience and expertise in the application and infrastructure architecture come in handy for spotting the problem area quickly. There may be multiple problems, whether or not they contribute to the problem at hand; this can shift the focus to different areas and lengthen the time to resolution. It is important to stay focused and keep proceeding in the right direction.
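Steps (a) to (d) essentially amount to timing each hop and eliminating layers one by one. A toy sketch of per-tier timing, with invented tier names and deliberately skewed delays standing in for real handlers:

```python
import time

def trace_request(tiers):
    """Run each tier's handler in order, recording elapsed milliseconds per
    layer, so the slowest hop stands out immediately."""
    timings = {}
    for name, handler in tiers:
        start = time.perf_counter()
        handler()
        timings[name] = (time.perf_counter() - start) * 1000
    return timings

# Simulated request path; the "database" tier is the deliberate bottleneck.
tiers = [
    ("load_balancer", lambda: time.sleep(0.001)),
    ("app_server", lambda: time.sleep(0.002)),
    ("database", lambda: time.sleep(0.02)),
]
timings = trace_request(tiers)
slowest = max(timings, key=timings.get)
```

In a real investigation the "handlers" are the measurements pulled from each component's logs or monitoring agent, but the elimination logic is the same: rank the hops and clear them from the bottom up.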

5.       Simulate the problem in Test /UAT environment

Make sure the findings are correct by simulating the problem multiple times. This will reveal much more data and help characterize the problem better.

6.       Perform reviews

If the problem area has already been isolated in one of the steps above, narrow the scope of the review to the components involved; if not, the scope is wider and every component in the request-response cycle must be examined. Code reviews to debug performance issues require unique skills. For instance, looping blocks, disk usage and processor-intensive operations are candidates for detailed review. Similarly, in a distributed application, too many back-and-forth calls between physical tiers can easily contribute to a performance problem. Good knowledge of the various third-party components and operating system APIs consumed by the application can also be helpful.

When the problem is isolated to a server and the application components seem fine, it is possible that other services or components running on the server are loading its resources and thereby impacting the application under review. If the problem is isolated to the database server, look for deadlocks, appropriate indexes, etc. Sometimes a lack of archival / data-retention policies results in database tables growing much faster than expected, leading to performance degradation.
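For the index check, most database engines can report whether a query will actually use an index; SQLite, for example, offers EXPLAIN QUERY PLAN, as this sketch (with invented table and index names) shows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

def query_plan(conn, sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(str(row[-1]) for row in rows)  # last column is the detail text

plan = query_plan(conn, "SELECT total FROM orders WHERE customer_id = 7")
uses_index = "USING INDEX" in plan.upper()
```

Other engines expose the same idea differently (EXPLAIN in MySQL and PostgreSQL, execution plans in SQL Server), but the diagnostic question is identical: is the slow query scanning the whole table, or walking an index?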

7.       Identify the root cause

By now, one should have identified the specific application procedure or function causing the problem. Validate it with more simulations and tests in production-equivalent environments.

8.       Come up with solution

It is not over yet: root-cause identification must be followed by a solution. Sometimes the solution requires a change in architecture and may have a larger impact on the entire application. An ideal solution prevents the problem from recurring without introducing new problems, and requires minimal effort. If the ideal solution is not possible within the various constraints, offer a break-fix solution so that the business continues, and plan to implement the ideal solution in the longer term.

I hope this is a useful read for those of you in production support. Feel free to share your thoughts on this subject in the comments.