Showing posts with label roi. Show all posts

Sunday, April 10, 2016

Economics of Software Resiliency

Resilience is a design feature that enables software to recover from a disruptive event. In effect, it is automated recovery after a disastrous event has occurred. Given the option, we would want the software that we build or buy to have resilience built in. Resilience, however, comes at a cost, and the economics of the benefit should be examined before deciding what level of resilience is required. The cost and effectiveness of the recovery or resilience capabilities need to be balanced against the events that cause disruption or downtime. These costs may be reduced, or rather optimized, if the expectation of failure or compromise is lowered through preventative measures, deterrence, or avoidance.

There is a trade-off between protective measures and investments in survivability, i.e., between the cost of preventing the event and the cost of recovering from it. Another key factor that influences this decision is the cost of such an event if it occurs. This suggests that a number of combinations need to be evaluated, depending on the resiliency of the primary systems, the criticality of the application, and the options available for backup systems and facilities.

This analysis closely resembles the risk management process. The following elements form part of it:


Identify problems


The events that could lead to failure of the software are numerous. Developers know that exception handling is an important best practice to adhere to while designing and developing a software system. Most modern programming languages provide support for catching and handling exceptions. At a low level, this helps in identifying the exceptions encountered by a particular application component at run-time. There may be certain events which cannot be handled from within the component and which require an external component to monitor and handle them. Regardless of the exception handling ability of the programming language, the architects designing the system should identify and document such exceptions and design a solution to overcome them, so that the system becomes more resilient and reliable. The following are the primary sources of problems or exceptions that need to be handled to make the system more resilient:


  • Dependency on Hardware / Software resources - Whenever the designed system needs to access a hardware resource, for example a specified folder on the local disk drive, expect situations such as the folder not being there, the application context not having enough permissions to perform its actions, disk space being exhausted, etc. This applies equally to software resources such as an operating system, a third party software component, etc.
  • Dependency on external Devices / Servers / Services / Protocols - Access to external devices like printers, scanners, etc., or to other services used by the application system, like an SMTP service for sending emails, database access, a web service over the HTTPS protocol, etc., could also cause problems, like the remote device not being reachable, a protocol mismatch, request or response data inconsistency, access permission issues, etc.
  • Data inconsistency - In complex application systems, certain scenarios could lead to inconsistent internal data, which may cause the application to get into a dead-lock or a never-ending loop. Such a situation may have a cascading effect, as the affected components quickly consume considerable system resources, leading to a total system crash. This is a typical situation in web applications, where each external request is executed in a separate thread: when such threads get into a 'hung' state, the request queue will, over a period, surpass the installed capacity.


Cost of Prevention / recovery


The cost of prevention depends on the available solutions to overcome or handle such exceptions. For instance, if the issue is the SMTP service being unavailable, then the solution could be to have an alternate, redundant, always active SMTP service running out of a totally different network environment, so that the system can switch over to the alternate service if it encounters issues with the primary one. While the cost of implementing the handling of multiple SMTP services and a fail-over algorithm may not be significant, maintaining a redundant SMTP service could have a significant cost impact. Thus, for each event that may have an impact on software resilience, the total cost of a pro-active solution vis-a-vis a reactive solution should be assessed.
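The fail-over logic itself is indeed small. A minimal sketch, assuming each mail service is wrapped in a callable that raises an exception on failure (the names `send_with_failover`, `primary` and `secondary` are hypothetical):

```python
def send_with_failover(senders, message):
    """Try each sender in order and return the name of the one that succeeded.

    `senders` is a list of (name, callable) pairs. Each callable is assumed
    to raise an exception on failure, e.g. a thin wrapper around an SMTP client.
    """
    errors = []
    for name, send in senders:
        try:
            send(message)
            return name
        except Exception as exc:  # broad on purpose: any failure triggers fail-over
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all mail services failed: " + "; ".join(errors))


# Usage with stand-in senders: the primary is down, the secondary works.
outbox = []

def primary(message):
    raise ConnectionError("primary SMTP unreachable")

def secondary(message):
    outbox.append(message)

used = send_with_failover([("primary", primary), ("secondary", secondary)], "hello")
```

The code is cheap to write; the recurring cost lies in keeping the second service running in a separate network environment.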

Time to Recover & Impact of Event


While the cost of prevention / recovery as assessed above will be an indicator of how expensive the solution is, the Time to Recover and the Impact of such an event will indicate the cost of not having the event handled or worked around. Simple issues like a database dead-lock may be handled reactively by the DBAs, who monitor for such issues and act immediately when such an event arises. But issues like the failure of the network link to an external service may mean extended system unavailability, and thus an impact on the business. So, it is critical to assess the time to recover and the impact that such an event may have if not handled instantly.

Depending on the above metrics, the software architect may suggest a cost-effective solution to handle each such event. The level of resiliency that is appropriate for an organization depends on how critical the system in question is to the business, and on the business impact of a lack of resilience. The organization must understand that resiliency has its own cost-benefit trade-off. The architects should keep this in mind and design solutions to suit the specific organization.
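The comparison of prevention cost against time to recover and impact can be put into numbers with a rough sketch like the one below. The function names and all figures are made-up assumptions for illustration; a real assessment would use the organization's own estimates.

```python
def expected_annual_loss(events_per_year, hours_to_recover, loss_per_hour):
    """Expected yearly cost of an event: frequency x time to recover x impact."""
    return events_per_year * hours_to_recover * loss_per_hour


def prevention_is_justified(annual_prevention_cost, loss_without, loss_with):
    """True when the loss avoided per year exceeds the yearly cost of prevention."""
    return (loss_without - loss_with) > annual_prevention_cost


# Illustrative figures only: two outages a year at 1,000 per hour of downtime.
reactive = expected_annual_loss(2, 8, 1000)     # 8 hours to recover manually
proactive = expected_annual_loss(2, 0.5, 1000)  # half an hour with automatic fail-over
worth_it = prevention_is_justified(6000, reactive, proactive)
```

With these assumed figures the avoided loss (15,000) is higher than the prevention cost (6,000), so the pro-active solution pays for itself; with a less critical system the same calculation could easily point the other way.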

The following are some of the best practices that the architects and the developers should follow while designing and building the software systems:
  • Avoid usage of proprietary protocols and software that make migration or graceful degradation very difficult.
  • Identify and handle single points of failure. Of course, building redundancy has a cost.
  • Loosely couple the service integrations, so that inter-dependence of services is managed appropriately.
  • Identify and overcome weak architecture / designs within the software modules or components.
  • Anticipate failure of every function and design for fall-back scenarios and graceful degradation where appropriate.
  • Design to protect state in multi‐threaded and distributed execution environments.
  • Expect exceptions and implement safe use of inheritance and polymorphism.
  • Manage and handle the bounds of various software and hardware resources.
  • Manage allocated resources by using them only when needed.
  • Be aware of the timeouts of various services and protocols and handle them appropriately.
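As an illustration of the practices on timeouts and graceful degradation, the following sketch runs a call with a time limit and falls back to a degraded answer when the call is slow or fails. The helper name `call_with_fallback` and the stand-in functions are hypothetical; it relies only on the standard `concurrent.futures` module.

```python
import concurrent.futures
import time


def call_with_fallback(primary, fallback, timeout_s):
    """Run `primary`; if it fails or exceeds `timeout_s`, return `fallback()`."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(primary)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return fallback()  # degraded answer instead of a hung request
    except Exception:
        return fallback()  # the primary call itself raised an error
    finally:
        # Do not wait for the slow call; note that its thread keeps running
        # in the background until the call returns on its own.
        executor.shutdown(wait=False)


def slow_service():
    time.sleep(0.5)  # stands in for an unresponsive remote service
    return "live"


def cached_answer():
    return "cached"


result = call_with_fallback(slow_service, cached_answer, timeout_s=0.05)
```

Here `result` is the cached value, because the stand-in service takes longer than the allowed 0.05 seconds. The caller gets a usable answer quickly, at the price of it possibly being stale.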

Monday, November 3, 2014

Information Security - Cost Analysis

Reports indicate that Information Security is now a Board agenda item and that security spending by enterprises is on the rise. This is mainly because of the rise in data breaches worldwide and the increase in hacking and cyber attacks. This impacts all enterprises, be they small, medium or large, and across all segments, i.e. not only the financial sector but all domains. The increased exposure and the financial damages associated with security risks have pushed enterprises to increase their budget allocations in order to mitigate, if not avoid, such risks.

The following recent predictions from Gartner influence Information Security spending among enterprises:

  • By 2015, roughly 10% of overall IT security enterprise product capabilities will be delivered in the cloud.
  • Regulatory pressure will increase in Western Europe and Asia/Pacific from 2014.
  • By year-end 2015, about 30% of infrastructure protection products will be purchased as part of a suite offering.
  • By 2018, more than half of organizations will use security services firms that specialize in data protection, security risk management and security infrastructure management to enhance their security postures.
  • Mobile security will be a higher priority for consumers from 2017 onward.

In the best interests of the investors, any spending or investment should be backed by an appropriate cost-benefit analysis. Applying such cost-benefit justification to the Information Security function is gaining focus but remains a challenge. Quantification forms the basis for being able to perform the cost-benefit analysis. The advantages of quantification are its accuracy, objectivity, and comparability. In addition, quantification is the basis for calculations and statistical analyses. While costing is the comparatively easier aspect, quantifying the benefits is still a challenge, as it depends on the occurrence of uncertain events.

Starting with the idea of a Return on Security Investment (ROSI), several concepts have been developed to support the decision for or against an information security measure. One way to do this is to apply the concept of Net Present Value (NPV) to information security investments.
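As a sketch, a generic NPV formulation adapted to security investments could look like the following. The symbols are assumptions chosen for illustration and may differ from the formulation originally intended:

```latex
\[
\mathrm{NPV} = -I_0 + \sum_{t=1}^{T} \frac{\Delta L_t - C_t}{(1 + i)^t}
\]
```

Here I_0 is the initial investment in the security measure, ΔL_t is the expected reduction in losses from security incidents in period t, C_t is the running cost of the measure in period t, i is the discount rate and T is the planning horizon. Under these assumptions, a positive NPV would speak for the investment and a negative one against it.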


The following are the four aspects of Information Security costs:

  • Information Security Management - This is about the costs associated with the Information Security function, which comprises People, Process and Technology. Though quantifying this aspect of the cost is straightforward, measuring the benefits is not.
  • Incidental costs of Information Security related decisions - As we all know, Information Security is a cross functional task, and every person and process in the organization needs to contribute towards it. As such, the implementation of any security control will cause additional overhead in other departments or functions. For instance, regulating the fair use of the Internet will require some involvement from the HR function in the form of policies, code of conduct, ethics, etc. Here, quantifying both the costs and the benefits is not easy.
  • Cost of capital for Security investments - Like any investment, capital invested in the security function has a cost, and quantifying this element of cost is not a challenge.
  • Costs arising out of security incidents - This resembles Risk Management, and all the principles of measuring risk apply here as well. The risk of a security incident can be measured as the product of its probability and its impact. However, quantifying this as an absolute value requires identifying the impacted information and / or related resources and the value of those resources. Many people have opined that information is the currency of the organization, but it has a dynamic value, i.e. the value of information depends not only on its significance to the organization but also on its significance to others.

A common way of categorising and structuring costs in a repeatable and comparable way is required to manage the associated challenges. On that basis, it becomes possible to identify cost-drivers and to analyse different security management approaches like the following:

  • Balance Sheet Oriented Approach - where the costs are categorized and quantified under personnel, hardware, software and services. This approach does not take into consideration the cross functional aspect of the security function.
  • Life Cycle Oriented Approach - where the costs are categorized and quantified against the various life cycle phases of the security function. Typically, the life cycle of the security function would be along the lines of Plan - Do - Check - Assess, in which case the costs are quantified with respect to each of the life cycle phases. This approach takes the project management view and can be useful for quantifying the incremental cost of a specific security initiative, but it will not be useful for assessing the costs of the security management function as a whole.
  • Process Oriented Approach - where the costs are categorized into direct and indirect costs at process level. Direct costs could comprise People and Technology, and indirect costs could comprise the costs allocated by various functions towards a specific process and the quantified costs of risk avoidance and risk mitigation. This approach can be customized further to suit the varying needs of the enterprise.
  • Control Oriented Approach - where costs are categorized with respect to individual security controls, which can be added up to ascertain the cost for a security area. However, with this approach it is challenging to put in place a standard approach and framework for ascertaining the costs at control level. The cost of every control comprises a share of the fixed organizational overhead, in addition to the variable costs of people, technology and processes.
  • Layer Oriented Approach - where information security costs are categorized against the different layers of the ISMS, namely Management System, People & Processes, Architecture & Concepts, Operational Measures and Pre-requisites.

While quantifying the benefits is not very easy, Quantitative Risk Analysis techniques can be applied to ascertain the cost of not implementing a specific security process or control, which can then be considered as the benefit of implementing that control or process. Another technique that can be useful to categorize and visualize the cost-benefits is modeling and simulation.
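A very small simulation can illustrate the idea. The sketch below estimates the average yearly loss from an incident with and without a control, and treats the difference, less the cost of the control, as its benefit. The function names, the uniform impact distribution and all figures are assumptions made for this example only.

```python
import random


def simulate_annual_loss(probability, impact_low, impact_high, years=10_000, seed=42):
    """Average yearly loss when an incident occurs with `probability` per year
    and its impact is drawn uniformly between `impact_low` and `impact_high`.

    The uniform distribution is an illustrative assumption, not a recommendation.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    total = 0.0
    for _ in range(years):
        if rng.random() < probability:
            total += rng.uniform(impact_low, impact_high)
    return total / years


def control_benefit(loss_without_control, loss_with_control, control_cost):
    """Net yearly benefit: loss avoided minus what the control costs."""
    return loss_without_control - loss_with_control - control_cost


# Illustrative figures: the control lowers the yearly probability from 30% to 10%.
without_control = simulate_annual_loss(0.30, 10_000, 50_000)
with_control = simulate_annual_loss(0.10, 10_000, 50_000)
net_benefit = control_benefit(without_control, with_control, control_cost=2_000)
```

With these assumptions the simulated loss without the control lands near the analytical value of 0.30 x 30,000 = 9,000 per year. A richer model could replace the uniform distribution with one fitted to actual incident data.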