Saturday, December 28, 2024

Setting up a Security Operations Center (SOC) for Small Businesses

In today's digital age, security is not optional for any business, irrespective of its size. Small businesses face increasing cyber threats, making it essential to have robust security measures in place. A SOC is a dedicated team responsible for monitoring, detecting, and responding to cybersecurity incidents in real time. It acts as the frontline defense against cyber threats, helping to safeguard your business's data, reputation, and operations. By establishing a SOC, you can proactively address security risks and enhance your overall cybersecurity posture.

The cost of setting up a full in-house SOC may be prohibitive for a small business, in which case it can engage a Managed Service Provider for all or part of the services. If the business can afford its own team, it can subscribe to cloud-based technology services and tools to facilitate SOC operations.

Here’s an attempt to provide guidance in setting up a SOC, even on a limited budget.

The Objectives

Before setting up a SOC, it's crucial to outline the objectives. The people, process and technology to be used for the SOC largely depend on those objectives. Here are some common goals for a small business SOC:

  • Protecting assets: The SOC monitors and protects the organization's assets, such as intellectual property, personnel data, and business systems.
  • Responding to incidents: The SOC identifies and responds to security incidents, analyzing suspicious activity and taking action to contain and remediate the incident.
  • Gathering threat intelligence: The SOC gathers and analyzes threat intelligence to stay up to date on cyber threats and vulnerabilities.
  • Managing vulnerabilities: The SOC identifies and assesses vulnerabilities in the organization's IT infrastructure and systems, and prioritizes and remediates them.
  • Ensuring compliance: The SOC ensures that the organization complies with relevant security regulations and standards.

The SOC Team

Building a competent SOC team is essential for the success of security operations. Depending on the budget and resources, the SOC team may include:

  • SOC Manager: Develops the organization's security strategy, including hiring, processes, and technology. They provide technical guidance and managerial oversight.
  • Threat Hunters: Proactively look for threats that may have evaded automated detection. They use data analysis, threat intelligence, and experience to uncover potential breaches and hidden vulnerabilities.
  • Security Analysts: Monitor security events and alerts from various sources, such as intrusion detection and prevention systems (IDPS), security information and event management (SIEM) systems, and endpoint detection and response (EDR) solutions.
  • Incident Responders: Focus on containment, eradication, and recovery of confirmed cybersecurity incidents. They need specific skills in incident management, crisis control, and restoring systems to normal operations.
  • Threat Intelligence Analysts: Use threat intelligence to perform assessments to discover the primary aim of the attack and which systems were affected.
  • IT Support: Assist with deploying and maintaining security tools and technologies.
  • Compliance Auditor: Ensures that SOC members are following protocols and adhering to government or industry regulations. They play a key role in standardizing processes within a SOC.

If staffing a full team is not feasible, consider outsourcing certain functions to managed security service providers (MSSPs) or utilizing part-time consultants. Alternatively, depending on the volume of work, some roles may be combined and rolled up to one employee.

Essential Tools & Technologies

Equipping your SOC with the right tools and technologies is critical. Here are some essential components:

  • Security Information and Event Management (SIEM) System: Collects and analyzes logs and other associated data from various infrastructure assets, including applications, to provide real-time alerts and insights. SIEM is the fundamental technology that forms the core of a SOC. Modern SIEM tools can leverage Artificial Intelligence to correlate data from different sources and help the SOC team make better decisions; a minimal sketch of such a correlation rule appears after this list.
  • Intrusion Detection and Prevention Systems (IDPS): Analyze network traffic to identify and prevent cyber threats. An IDPS can be either a hardware appliance with pre-loaded software or a virtual service, and it can use various methods to identify attacks, such as signature matching, anomaly detection, behavioral analysis, and threat intelligence. Here again, AI is being explored to improve the efficiency and effectiveness of detection and prevention.
  • Endpoint Detection and Response (EDR) Tools: Help organizations detect, contain, and respond to cyberattacks. EDR tools can collect endpoint data from various sources, including on-premises and cloud services. They can also give SOC teams remote control over endpoints to perform immediate mitigation.
  • Incident Response Tools: Facilitate the investigation and remediation of security incidents. Modern tools can help SOC teams automate routine response tasks, such as isolating compromised endpoints.
  • Vulnerability scanners: Detect weaknesses in systems and applications before attackers can exploit them. They can scan networks, systems, and applications for known vulnerabilities and misconfigurations.
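
To make the SIEM bullet above concrete, here is a minimal, illustrative correlation rule in Python. It flags a source IP that records several failed logins followed by a success within a short window. The event format, threshold and window are assumptions made for this sketch, not the behaviour of any particular SIEM product.

```python
# Flag an IP with several failed logins followed by a success (illustrative rule).
from collections import defaultdict
from datetime import timedelta

FAILED_THRESHOLD = 5
WINDOW = timedelta(minutes=10)

def correlate(events):
    """events: iterable of dicts like
    {"time": datetime, "ip": "203.0.113.7", "outcome": "failure" or "success"}"""
    failures = defaultdict(list)                 # source IP -> failure timestamps
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["outcome"] == "failure":
            failures[ev["ip"]].append(ev["time"])
        else:                                    # a success: look back over recent failures
            recent = [t for t in failures[ev["ip"]] if ev["time"] - t <= WINDOW]
            if len(recent) >= FAILED_THRESHOLD:
                alerts.append(f"Possible brute force from {ev['ip']} at {ev['time']}")
    return alerts
```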

If you have hosted your applications on cloud infrastructure, it is likely that your Cloud Service Provider (CSP) offers some or all of the above tools as a service. Of course, subscribing to such services may result in additional cost. While budget constraints may limit the number of tools you can acquire, prioritize those that address your most critical security needs.

SOC Processes

Establishing clear, well-defined processes is vital for the smooth functioning of your SOC. The NIST Cybersecurity Framework is a good fit for businesses of any size; from it, you can define the processes that are essential and relevant given the size, threat landscape and risk tolerance of the business. Key processes include:

  • Incident Detection and Reporting: Define steps for identifying and reporting incidents, including automated alerts (see the sketch after this list) and manual reporting procedures.
  • Incident Response and Remediation: Outline the actions to take when an incident occurs, including containment, eradication, and recovery.
  • Threat Hunting: Proactively search for potential threats and vulnerabilities within your network.
  • Regular Audits and Assessments: Conduct periodic reviews to evaluate the effectiveness of your security measures and identify areas for improvement.
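
As an illustration of the automated alerting mentioned in the first process above, the following sketch forwards a detection to a team chat webhook so that a human is notified immediately. The webhook URL and payload shape are placeholders, not a real service.

```python
# Forward an incident alert to a team chat webhook (illustrative; URL is a placeholder).
import json
import urllib.request

WEBHOOK_URL = "https://chat.example.com/hooks/soc-alerts"   # hypothetical endpoint

def report_incident(severity: str, summary: str, source: str) -> int:
    payload = {"text": f"[{severity.upper()}] {summary} (source: {source})"}
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status                       # HTTP 200 means the alert was delivered

# report_incident("high", "5 failed logins then a success for 'admin'", "SIEM rule 42")
```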

Training & Up-skilling

Continuous training and development are essential for keeping your SOC team prepared to handle evolving threats. Offer regular training sessions, certifications, and workshops to enhance their skills and knowledge. Encourage your team to stay updated on the latest cybersecurity trends, tools, and best practices.

Continuous Improvement

Once your SOC is operational, regularly monitor its performance and effectiveness. Collect and analyze data on incidents, response times, and resolution success rates. Use this information to identify areas for improvement and make necessary adjustments. Continuously updating and refining your SOC processes will help you stay ahead of emerging threats.

Conclusion

Setting up a Security Operations Center may seem daunting, especially for small businesses with limited resources. However, by defining clear objectives, assembling a skilled team, investing in essential tools, and establishing robust processes, you can create an effective SOC that enhances your cybersecurity defenses. Proactive monitoring and continuous improvement will help protect your business from cyber threats and ensure long-term success.

Monday, August 5, 2019

The Biggest Predictions for the IT Industry

By Team TechJury


The IT industry is continuously evolving. Thanks to its dynamic nature, we are getting access to improved products and services. Moving forward, the industry is poised for even more growth as developers and IT experts work around the clock to upend what IT is today.

Here are some predictions on what comes next within the industry.

AR For Businesses


The world met augmented reality (AR) in its true form when Pokemon GO was launched in 2016. Aside from video games, AR will be making its way into businesses as well. Firms such as realty companies can use AR to provide virtual and in-depth tours of rooms and homes. Others can use it to create more engaging marketing strategies.

More Jobs In IT


With automation making its way in various industries, there’s no doubting that the growth of AI will penetrate the workplace too. For this reason, we can expect more and more IT-related jobs in the future.

According to recent data, the total number of IT jobs will increase by 13% between 2016 and 2026. Considering the current rate, there might even be more jobs to look out for.

Smarter AI


According to Forbes, the next step for AI is the processing of emotions. By making AI capable of the feat, we can make it even smarter, thus allowing it to do more jobs that only humans are capable of at the moment.

Blockchain Acceptance


Blockchain is one of the emerging trends in the IT industry. Even though some experts believe it is a promising technology, it still has a long way to go to earn trust. We have yet to see what blockchain can change in this sector.

Meanwhile, blockchain disruptions are already a reality. A good example is the telecom industry. If everything goes according to plan, we should see huge improvements in telecommunication services soon.

These are just some of the current trends and predictions for IT in the coming years. Obviously, exciting things are up ahead for the industry.


Friday, March 30, 2018

Enterprise Architecture Framework - Non-Functional Attributes

Non-Functional Attributes (NFAs) always exist, though their significance and priority differ when considered against other functional or non-functional attributes. It's particularly important to pay attention to them in the initial phase of EA framework development, as these attributes may have a direct or indirect impact on some of the functional attributes of the framework. Considering non-functional attributes early in the lifecycle matters because NFAs tend to be cross-cutting and drive important aspects of your architecture, and they therefore have considerable impact on important aspects of your test strategy. For example, security requirements will drive the need to support security testing, performance requirements will drive the need for stress and load testing, and so on. These testing needs in turn may drive aspects of your test environments and your testing tool choices.


The Enterprise Architecture team will interact closely with all the other management processes in an organisation, especially the IT management processes. When all these processes work together effectively, an enterprise will be able to manage strategic changes and drive business transformation effectively and efficiently. Often, little thought is given in organisations to integrating the EA processes with the other management processes. Identifying and considering NFAs early on will certainly help in proactively addressing such issues. Having a clear picture of the NFAs helps the enterprise architects take into account innovative alternatives or trade-offs before presenting decision-ready options.


NFAs play a vital role in defining certain atomic properties of each enterprise architecture framework. The challenge with NFAs is that they are difficult to trace and identify, and it is difficult to define metrics to measure their performance. Described below are the typical NFAs that need to be considered while developing the EA Framework:


  • Adaptability – Be it people, process or technology, adaptability as an attribute has never been more needed in the enterprise workplace. With change happening at a faster pace than ever before, adaptability is becoming a key attribute of every resource, human resources as much as systems. The resources identified as part of the EA Framework should have the ability to accept and absorb the changes that come along. This way, the longevity of the EA Framework can be extended with fewer or minimal changes to the framework itself.
  • Compatibility – An EA Framework will have many artifacts that interface not only with other internal artifacts but also with external actors. Making this work seamlessly requires that the interfaces remain compatible with each other at all times. The EA Framework should be developed with this important aspect of compatibility in mind, and any incremental change should not break compatibility, so that functional performance is not impacted. Considering the compatibility of the artifacts in the initial phase of developing the EA Framework will save considerable effort compared to fixing a compatibility issue that surfaces later in the lifecycle.
  • Cohesiveness – Cohesion is the uniqueness in purpose of the system elements. A certain amount of formality is essential in providing uniformity and forming a coherent aggregate. This is critical when the components of EA Framework are developed by people both from a centralized EA team and from projects and programs. Obviously, lower level architectures should conform to the upper level architectures and unnecessary duplication should be avoided. Cohesion has to be considered in developing components or models describing a certain target area from different viewpoints. Utilizing a formal EA framework in an appropriate way is critical in achieving uniformity and cohesion in EA products.
  • Conceptuality – The benefit of enterprise architecture (EA) management is directly coupled to the underlying conceptualization of the enterprise. This conceptualization should reflect the goals pursued by the EA management endeavor and focus on the areas of interest of the involved stakeholders. A conceptual model captures the essential concepts that are present, or should be present, in a specific artifact or entity and thus makes the understanding or visualization of that entity easier and unambiguous.
  • Coupling – It describes the level of dependencies between modules and components of the system. Loosely coupled systems minimize the assumptions they make about one another while still providing a meaningful interchange. Conversely, tightly coupled systems have a restrictive effect on the variability and evolution of the connected components or systems. The level of coupling that is appropriate for a particular system component should be ascertained and considered while developing the EA Framework; a small sketch contrasting tight and loose coupling appears after this list.
  • Diversity – Diversity is the difference between the systems or components of the EA Framework in terms of technology, methodology, principles, process, environment, etc. Diversity shall be at the manageable level, so as to minimise the cost of maintaining expertise in and connectivity between multiple processing environments. The advantages of minimum diversity include: standard packaging of components; predictable implementation impact; predictable valuations and returns; redefined testing; and increased flexibility to accommodate future changes. 
  • Dependability – As system operations become more pervasive, the enterprise becomes more dependent on them. Dependable systems are characterized by a number of attributes, including reliability, availability, safety and security. For some attributes, probability-based theoretic foundations exist, enabling the application of dependability analysis techniques. To ensure that stakeholders at different levels share the same understanding, considering the level of dependability expected of the systems and components becomes critical. This will also ensure that the systems and components are developed and implemented as expected.
  • Extensibility – One of the capabilities of the enterprise architecture is to allow various artifacts and prebuilt integrations to be extended with little or no effort. Extensibility also ensures that such system or component extensions are protected when changes or revisions are implemented later on. It is essential to evaluate and consider the appropriate level of extensibility of each system or component that is part of the EA Framework in the initial phase.
  • Flexibility – A quality attribute of business information systems that contributes to the prevention of aging. It may also be considered the capability of the enterprise to connect people, process and information in a way that allows the enterprise to become more flexible and responsive to the dynamics of its ever-changing environment, stakeholders and competitors. This requires simplification of the underlying technology and related infrastructure and creation of a consolidated view of, and access to, all available resources in the enterprise.
  • Interoperability – The ability of systems (including organizations) to exchange and use exchanged information without knowledge of the characteristics or inner workings of the collaborating systems (or organizations). Clearly, making systems interoperable can mean many things. The strongest driver is technical interoperability: the technical problem of sharing information that already exists in different systems from different times and places by enabling sharing, or at least providing connected technical services. It is therefore imperative to develop the big picture of what data the enterprise needs to share, receive as incoming data and send to other systems. Both end points may reside within the enterprise, or some may reside in external enterprises.
  • Maintainability – The ease with which a system or component can be modified to correct faults, improve performance or other attributes, or adapt to a changed environment. A fast and continuously changing business environment demands flexible systems that are easy to modify and maintain. Maintainability is affected by the maturity of the human resources involved, the maturity of the process governing change management, the quality of the systems' supporting documentation, the systems' architectural quality and the quality of the enterprise ecosystem on which the system executes. Thus, identifying and appropriately documenting the expectations around this attribute will certainly help in implementing a better EA Framework.
  • Portability – It is the ability of the system to run under a different environment without any disruptions. Portability depends on the symmetry of conformance of both applications and the platform to the architected API. That is, the platform must support the API as specified, and the application must use no more than the specified API. Documenting the level of portability expected early on would contribute considerably in designing and developing the systems in line with the target platforms or ecosystems.
  • Robustness – It is the ability of a system to recover elegantly after failure or restart. Clearly, robust and easily modifiable automation is fundamental to achieving an enterprise’s vision for the future. However, such benefits don’t come without their price. Hard work and management commitment, both from IT and from the highest levels of the business are needed to build the kind of integrated IT architecture plans that will make the difference between success and failure in today’s highly competitive business climate.  
  • Scalability – It is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. Scalability, as a property of systems, is generally difficult to define and in any particular case it is necessary to define the specific requirements for scalability on those dimensions that are deemed important. The concept of scalability is desirable in technology as well as business settings.
  • Security – With ever-evolving cyber threats to both IT and OT, security has become a very important NFA to consider in the development of the EA Framework. Given its significance, security requirements should ideally be intertwined with the EA Framework. Security must be designed into data elements from the beginning; it cannot be added later. Systems, data, and technologies must be protected from unauthorized access and manipulation. Headquarters information must be safeguarded against inadvertent or unauthorized alteration, sabotage, disaster, or disclosure.
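
The Coupling attribute above is easiest to see in code. The sketch below, with purely hypothetical class names, contrasts a report that is tightly coupled to one concrete data source with one that depends only on a minimal interface and therefore tolerates change in the underlying component.

```python
# Tight versus loose coupling in miniature (illustrative; names are hypothetical).
from typing import Protocol

class OracleCustomerDb:                           # one concrete system component
    def fetch_customers(self) -> list[str]:
        return ["Acme Ltd", "Globex Inc"]

class TightlyCoupledReport:
    def __init__(self) -> None:
        self.db = OracleCustomerDb()              # hard dependency on a single component

    def run(self) -> str:
        return ", ".join(self.db.fetch_customers())

class CustomerSource(Protocol):                   # the only assumption the report makes
    def fetch_customers(self) -> list[str]: ...

class LooselyCoupledReport:
    def __init__(self, source: CustomerSource) -> None:
        self.source = source                      # any compliant component can be plugged in

    def run(self) -> str:
        return ", ".join(self.source.fetch_customers())

print(LooselyCoupledReport(OracleCustomerDb()).run())
```
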
Most of the attributes mentioned above are easily recognized as non-functional requirements with respect to a software system. Though Enterprise Architecture by itself may not be a 'software system', it is a 'system' that depicts the blueprint of the enterprise's overall business activities, answering the basic questions of What, Who, When, Where and How. Enterprise Architecture has multiple layers, and the implementation of software and IT systems is one such layer. To ensure that the stakeholders involved in different layers get an accurate view of the principles, strategies and guidelines, it is important to identify, analyze and consider these NFAs early in the EA Framework development lifecycle.

Sunday, March 25, 2018

Securing the Operational Technology (OT) - The Challenges

OT - Overview

Operational Technology (OT) is, generally, the technology used on the manufacturing or operational floor. OT has evolved considerably in recent years, from purely mechanical technology to data-driven technologies such as Robotic Process Automation (RPA) that leverage IoT, Machine Learning and Artificial Intelligence. The impetus from Industrial IoT (IIoT) brings more and more automation capability and connected behavior onto the manufacturing floor. The adoption of IT and related technologies in OT is now the common norm, and so is the need for alignment and convergence with the IT function.
IoT sensors are deployed everywhere: inside a manufacturing floor, along gas pipelines, inside a moving automobile, to monitor stock movements, and so on. Though these dispersed IoT devices perform small functions, the data they produce and the decisions taken based on that data are critical. It is therefore being realized that OT can give rise to critical security issues, depending on the size and criticality of the enterprise.

The adoption of IIoT and related technologies brings many benefits to businesses, such as smart machines and real-time intelligence from the factory floor, but it also increases the attack surface and requires continuous connectivity between IT and OT. The differing culture and mindset of the IT and OT functions, combined with a few other factors, often leads to conflicts.

Hackers and cybercriminals are now looking at critical infrastructure systems as targets. Motivations include holding systems hostage for ransom, stock price manipulation, denial of production operations, and more. For example, hackers may take control of your car on a highway and demand a ransom, which could be life threatening. Similarly, hackers may get hold of an energy grid and shut down the power supply for a region, or even a nation as a whole. The connected nature of the devices and systems involved in modern OT poses serious challenges as they get hooked on to IT-owned network infrastructure, wireless access points, and mobile networks.

Securing the OT

The introduction of new technologies to drive improvements such as production and supply chain efficiency and asset management has led to closer and more open integration between IT and shop floor systems. But the increasing connectivity of previously isolated manufacturing systems, together with a reliance on remote supporting services for operational maintenance, has introduced new vulnerabilities for cyber attack. Not only is the number of attacks growing, but so is their sophistication. As OT security becomes a widely discussed topic, the awareness of OT operators is rising, but so is the knowledge and understanding of OT-specific problems and vulnerabilities in the hacker community.

It's true that the systems and devices involved in OT are often based on the same technologies as IT, and as such many of the threats they face are exactly the same. However, it is an open secret that OT security is not the same as IT security. While securing OT systems requires an integrated approach similar to IT, the objectives are inverted, with availability being the primary requirement, followed by integrity and confidentiality. There are other important differences as well, which mean that the OT infrastructure cannot be managed as an extension of the IT infrastructure.

Here are some of the areas that make OT different from IT and thus pose a challenge for IT security experts:

1. Visibility:

From the perspective of the organizational units responsible for the IT security function, OT has been somewhat off the radar. This is because the IT function is not involved in the evaluation, selection and procurement of OT systems. Moreover, such OT systems often come with dedicated, networked IT systems, which can mean isolated data centers being set up on the manufacturing floor without the knowledge of the IT function. Until recently, or even now in certain cases, the IT systems involved in OT were treated as an integral part of production machinery rather than computerized information systems, so the ultimate responsibility for their operation and maintenance, regardless of the cause of potential failure, was assigned to the OT function and not the IT function. In most cases, OT staff often don't know what types of IT or IoT devices and equipment they have as part of their OT ecosystem.


2. Skill Gap:

One of the biggest challenges facing the industry is deciding who is responsible for OT security: should it be the IT or the OT function? Given their background and resources, in many cases IT security teams are being asked to take ownership of coordinating security for OT. However, they typically lack OT-specific skills. Defining the security controls and processes for OT systems requires in-depth knowledge of those systems, so that the interests and priorities of the OT function are also taken care of. The cybersecurity industry is projected to reach 1.8 million unfilled roles by 2020. The added complexities of a converged IT/OT security environment could amplify perceived barriers to entry, as organizations struggle to manage the aging workforce of their plant teams alongside the millennial generation of new cybersecurity talent.


3. Availability and Safety:

For a manufacturing company, the production line and its smooth functioning are paramount. Companies lose revenue when their production line is shut down for maintenance, planned or unplanned. Nobody wants to disturb OT equipment, because any downtime can turn into millions of dollars in lost productivity, highly vocal, disgruntled customers and regulatory fines. Machines must reach a high OEE (overall equipment effectiveness). There is no time to allow IT-style updates and patches that take down equipment.

In many cases where OT systems are involved in delivering essential services, such as electricity or water, or in maintaining safety systems at chemical plants or dams, availability is a significant parameter. Even momentary non-availability could lead to catastrophe in certain cases. Enabling high availability of OT systems, and maintaining the confidentiality of some sensitive information processed by those systems, requires additional security controls. Not only are many of these now-connected OT system components quite vulnerable to compromise, a failure in one of them could also have a catastrophic effect on human life and property.


4. Processes:

Safety and security for employees and customers have always been top priorities for the OT function, and its processes and guidelines are usually defined with that in mind. The IT function doesn't even factor in plant or employee physical safety, except where physical access systems are under its domain. IT's top priority is to protect the data. OT's top priority, however, is to protect the availability and integrity of the process, with security (confidentiality) coming last. At the same time, the OT system components designed for direct control, supervisory control or the safe operation of manufacturing processes could turn into a safety hazard if any component or subsystem involved is compromised. Business systems are also critical, but their failure is unlikely to result in the uncontrolled release of hazardous materials or energy.


5. Legacy: 

It is not uncommon for the computers and related software systems used as part of OT to remain in use for over a decade without being replaced or changed. These computers and their software are designed for specific functions, interfacing with the plants and equipment involved in the manufacturing process. It largely depends on the plant or equipment vendor to come out with software and related IT hardware enhancements; otherwise, such systems may not be compatible with upgraded IT hardware or operating systems. Consequently, such systems remain vulnerable to a wide range of cyber threats that have already been mitigated on the systems used by the IT function.


6. Disparate Technologies:

Until recently, or even now in most cases, OT architectures have run on a separate and isolated infrastructure, and as such they have traditionally been isolated from the Internet. One reason for this is that these systems are often hard-wired to work with a plant or piece of equipment, receiving and processing signals and disseminating instructions back to various components. Some OT systems support only obsolete, insecure operating systems. OT system vendors also do not feel obliged to increase the security capabilities of their systems. Something as benign as an active system scan can cause these devices to fail, which can have serious if not catastrophic results.

System-dedicated networks, multiple domains and dedicated supporting systems require more resources to achieve a maturity level comparable with IT. They also greatly increase the complexity of monitoring and maintaining security levels. The sophisticated nature of OT infrastructure technologies means that most IT security and threat intelligence solutions don't have visibility into, let alone the ability to defend against, attacks on critical infrastructure. This creates a challenge in defining and implementing coherent security policies across production plants.


7. IIoT Impact: 

The Industry 4.0 revolution is having a great impact on manufacturing environments. It offers significant opportunities for improving production effectiveness, in particular through continual, online information about manufacturing processes and equipment. However, the use of new IoT technologies also has an impact on security. It's not just about networks, of course: there are many components, including sensors and actuators (transducers), 'smart things', fog nodes, (industrial and intelligent) IoT gateways, IoT platforms and so forth. For IT, some of these components are 'different' from the cybersecurity perspective they are used to. New protocols (including wireless) and mesh network architectures increase the number of potential access points to the network and require a different approach to security.

8. Culture:

The IT function is responsible for maintaining and securing information and related resources, helping ensure data confidentiality, integrity and availability, and in the process protecting corporate information and related assets, including networks, from cyberattacks. They're less familiar with the OT space, and often display little interest in knowing what their counterparts do to keep it safe and operational. In contrast, the OT function monitors and fixes issues in highly complex and sensitive industrial plants, with operational safety, reliability, and continuity as the top priorities. They don't deal or work with the IT function, and certainly don't want it getting involved in their operational issues.

Each group is concerned that the other side will wreak havoc in their environment. When there is a need to secure OT against cyberthreats, plant engineers worry that if IT team members get involved, they'll compromise system safety and stability. Unsanctioned changes to these systems might cripple the plant, cause an explosion, or worse. These concerns are justified. After all, when it comes to OT, IT staff members are in uncharted waters. At the same time, the IT function is concerned that vulnerable OT networks will introduce new threats into IT networks, threatening corporate assets, data, and systems.

Conclusion:

As industrial organizations begin to connect their machines to the network, the differences in security requirements for IT versus operational technology (OT) are becoming more important to understand.
For a long time there were no good practices or formal regulations for manufacturers on how to provide even minimal security protection on devices such as medical devices.

IT and OT teams are discovering the need to work together in order to deploy cybersecurity solutions throughout the enterprise; from headquarters to remote locations, and the factory floor. Hackers are going after intellectual property, financial data and customer information. CIOs report that intellectual property can constitute more than 80% of company value. Now is the time for OT and IT leaders to develop strong partnerships to promote operational efficiency, safety and competitive advantage.

Neither OT team members nor IT team members are experts in defending OT systems against emerging cyberthreats. Because OT networks were previously disconnected from the external world, engineering staff never had to deal with such threats. Meanwhile, IT staff members who deal with cyberthreats on a daily basis don't fully understand how these new threats will affect OT systems.  Nevertheless, both sides must cooperate, because neither group can protect industrial systems singlehandedly. Given the divergent cultures, technologies, and objectives of IT and OT, the two groups must overcome a significant divide, including mutual suspicion.

To ensure IT and OT collaboration, business-level oversight and leadership is required. More and more organizations are taking senior, experienced engineers from OT business units, usually from under the COO, and moving them under the CIO hierarchy. This interdisciplinary model combines expertise and roles that straddle and unify both sides of the IT-OT fence. Some organizations have taken this one step further. Instead of aligning IT roles under the CIO, they're creating a new C-level role to facilitate this management strategy. 

The higher up the organizational ladder that IT-OT convergence decisions are being made, the better the chances for success in bridging the gap.

Sunday, December 25, 2016

The Mobile Phone Is Your Private Property

This morning, during my walk, a person came out of a construction site and requested that I lend him my phone to make a call. I was not comfortable doing so, primarily for three reasons: first, he was a stranger to me; second, he seemed to be working at the construction site and could have sought help from those around his workplace, who would be more comfortable helping him; third, my mobile is my private identity, and I would not want a stranger to use it to impersonate me. I did not lend my phone on that occasion.

How about you? Would you mind lending your phone for such requests? I understand, the answer will be "it depends." Thanks to the "selfie" feature, seeking help from a stranger to take a snap on your mobile phone is no longer required. Anyway, I thought it would be useful to list out the concerns, so that one can decide how safe it is to part with one's smartphone. These apply to stolen or lost mobile phones as well.

Your Phone Contains Sensitive Information


You have your email configured on your mobile, and typically it does not require you to log in every time you use the mail app. So lending your phone may allow the stranger to gain access to your emails, and the longer the phone remains with the stranger, the larger the impact of such a compromise could be. Similarly, your social media accounts do not expect any additional authentication. It is needless to say what a smart or malicious stranger could do with access to your social media accounts. Exposing all the intimate details of our lives because of a lost, stolen or hacked phone is a serious issue.

Banking / Payment Applications


"There is an App for everything". Yes, every bank and the investment advisors are rolling out their own Apps with pre-stored credentials for the mobile savvy customers. Mobile users, find it convenient to use such an App, without having to login every time. However, the issue of how many such Apps will you install on your mobile phone is an issue to be discussed in a separate blog. For the purpose this blog let us consider the prevailing App culture. Driven by the Digital economy, there are humpteen number of Payment / eWallet Apps out in the store. The user convenience always wins over the security requirements and as such most such Apps doesn't requie a login to initiate a payment. This could be a potential risk one should be aware of and be careful about.


Personal & Corporate Information


If you are working for an organization, it is most likely that you have set up your corporate email account on your smartphone as well, and there you are putting your organization's data and information at risk. Your organization would have a BYOD policy and procedure stating what precautions you should take with the corporate data you use or access on your smartphone. If you are a senior-level executive, it is likely that you have access to organizational applications configured on your mobile. This includes compromise of your own or your organization's cloud storage, if any is configured on the phone.

Illegitimate Calls / Messages



In addition to your device, your mobile phone number (SIM) is very much linked to your identity. As such, any calls or messages that such a stranger sends using your phone will be logged against your identity, and you are responsible and answerable for any consequences that may arise out of such calls or messages. Even if the activity is legitimate, the other person might call or message you back in future, with or without any specific intent.



AVAST published research in February 2016; according to them, their researchers were able to recover the following files from the 20 phones that were sold:

  • More than 1,200 photos
  • More than 200 photos with adult content
  • 149 photos of children
  • More than 300 emails and text messages
  • More than 260 Google searches, including 170 searches for adult content
  • Two previous owners’ identities
  • Three invoices
  • One working contract
  • One adult video

Given the ever-evolving capabilities of smartphones, the devices are increasingly becoming one's identity, and as such they should be handled with care and caution, or else one has to face the consequences that may arise from a compromise.

Sunday, November 13, 2016

A Software Product Vs Project

In short, a software project is all about executing a Statement of Work for an internal or external customer, where what the customer asked for is treated as right, irrespective of what is ideal or what the end user would expect. Some projects, though, are scoped in such a way that certain aspects of the non-functional requirements are left to the choice of the project teams.

Product development isn't about implementing exactly what the customer asked for. In product development, the product manager owns and comes up with the product requirements. A large product or product suite typically comprises many projects and will evolve over time.

Unlike a project the product will be improved continuously without an end date based on feedback from end users and the product team prioritizes what needs to be built next based on its perceived value for its target users or customers.

A project on the other hand is funded with specific goals, a business case in mind and with finite expected value and cost.

Here is an attempt to bring out the differences between a software project and product and such differences are categorised as below:

The Mindset:

Projects are often started with the main focus on delivering on time, under budget, within scope and with a temporary team. All these constraints are set in stone, and any deviation is viewed seriously, which may impact the course of the project depending on the methodology adopted. So the project team's mindset will focus primarily on the project parameters that determine the success of delivery, and not necessarily on the success of the product that the project may form part of. This is all the more so as the resources keep changing: resources with little or no knowledge of the business domain may still deliver the project, but the product may be poor.

Products tend to have a longer lifetime than projects and are mostly built with more focus on the outcome than the output. Product teams are given the freedom and responsibility to think of a strategy they believe will result in the best product within the boundary of a product framework. This leads to less waste and more creativity in the product development process, allowing room for embracing change continuously.

Management:

The product roadmap is key to the success of the product, and as such the product manager should align the product vision and strategy with that of the business. A project manager, on the other hand, is responsible for executing on a predefined objective.

A project manager's function is to create a plan that the project will follow, and then to drive the people involved in the project to follow that plan with as little change as possible. If deviations from the planned execution are beyond an accepted threshold, the project manager must escalate and explain the situation to the stakeholders, who in turn will either accept the deviation or may choose to fail the project.

A product manager with the focus on constantly evaluating the viability of the product, will typically follow an agile approach with shorter sprints of developments, so the product evolves incrementally, delivering values at every stage.

Motivation:

With the project team's primary focus on delivering on time and within budget, the team does not have much room to be creative. This brings down motivation, because the team loses its sense of purpose and its autonomy in how to operate.

On the other hand, as resources typically stay longer with product teams, they get aligned to the product strategy and vision, and they are given the freedom to bring their own thinking and creativity into the product, process and methodology. Feedback and collaboration with stakeholders create the right environment, where people reach a higher potential and operate autonomously, resulting in better problem solving, higher ownership of outcomes, and faster time to market.

Tools:

Product management software and project management software are entirely different tools, each designed for a different type of role and to help address different business needs. Product management software helps product managers organize, develop, and communicate the product strategy, while project management software helps project managers track execution and, incidentally, manage resource allocation, risks and issues.

Scope:

Product scope is defined as "The features and functions that characterize a product, service, or result". Whereas the project scope is defined as "The work performed to deliver a product, service, or result with the specified features and functions".

The Product Scope defines all the capabilities of a product from the user's point of view. The product is the end result of your project and is characterized by the Product Scope. Thus, the Product Scope description includes the features of a product, what the product will look like with these features, and how it will work. Product Scope also describes the ways of measuring the product's performance.

The Project Scope, on the other hand, is an agreement on the work needed to deliver the product, service, or result. To develop the product's features, you establish a project that has a schedule, budget, and resource allocation. In other words, the work you do to construct your product is the Project Scope.

Design & Architecture:

The product owner or manager is responsible for defining the architecture and design of the product, which should take the following into consideration:
  • Business Idea & Strategy
  • Identifying and Creating a product feature
  • Aligning with Market Trends
  • Define Product Performance Indicators
  • Prioritize the implementation of features and bugs
Though a project may include the product architecture and design as part of the scope, the focus of the project team will be more on the following:
  • Defining the project scheduling, taking into account the deliverables at various milestones.
  • Monitoring the budget
  • Planning and managing resources
  • Problem and issue management
  • Risk management
  • Managing the scope creep.

Saturday, October 1, 2016

DNS Security Extensions - Complexities To Be Aware Of

The Domain Name System (DNS) primarily offers a distributed database storing typed values by name. The DNS acts like a phone book for the Internet, translating human-readable domain names into IP addresses. Since close to 100% of internet requests are made by domain name, DNS servers must resolve those names into IP addresses, which results in a very high load on DNS servers located across the world. In order to support such a high frequency of requests, DNS employs a tree-wise hierarchy in both name and database structure.


However, the wide-open nature of DNS leaves it susceptible to DNS hijacking and DNS cache poisoning attacks that redirect users to a different address than where they intended to go. This means that despite entering the correct web address, the user might be taken to a different website. DNS Security Extensions (DNSSEC) were brought in as the answer to this problem.


DNSSEC is designed to protect Internet resolvers (clients) from forged DNS in order to prevent DNS tampering. DNSSEC offers protection against spoofing of DNS data by providing origin authentication, ensuring data integrity and authentication of non-existence by using public-key cryptography. It digitally signs the information published by the DNS with a set of cryptographic keys, making it harder to fake, and thus more secure.


DNSSEC introduces certain additional record types into the DNS: RRSIG (the digital signature), DNSKEY (the public key), DS (Delegation Signer), and NSEC (a pointer to the next secure record). The new message header bits are AD (authenticated data) and CD (checking disabled). A DNSSEC-validating resolver uses these records and public-key (asymmetric) cryptography to prove the integrity of the DNS data.
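
The following sketch shows how these record types can be retrieved for inspection, assuming the third-party dnspython package (version 2.x or later) is installed; the zone queried is simply an example of a signed zone.

```python
# Retrieve DNSSEC-related records for a signed zone (illustrative sketch).
import dns.resolver
import dns.flags

resolver = dns.resolver.Resolver()
resolver.use_edns(0, dns.flags.DO, 4096)       # set the DO bit: ask for DNSSEC data

for rdtype in ("DNSKEY", "DS"):
    answer = resolver.resolve("ietf.org", rdtype)
    for rr in answer:
        print(rdtype, "->", str(rr)[:80])      # truncate long keys for readability
```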


A hash of the public DNSKEY is stored in a DS record. This is stored in the parent zone. The validating resolver retrieves from the parent the DS record and its corresponding signature (RRSIG) and public key (DNSKEY); a hash of that public key is available from its parent. This becomes a chain of trust — also called an authentication chain. The validating resolver is configured with a trust anchor — this is the starting point which refers to a signed zone. The trust anchor is a DNSKEY or DS record and should be securely retrieved from a trusted source.
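
Under the same assumption (dnspython 2.x or later), the relationship described above, that the DS record in the parent is a hash of the child zone's DNSKEY, can be reproduced as follows; the zone name is again only an example.

```python
# Compute a DS record from a zone's key-signing key and show the published DS.
import dns.resolver
import dns.dnssec
import dns.name

zone = dns.name.from_text("ietf.org")

for key in dns.resolver.resolve(zone, "DNSKEY"):
    if key.flags & 0x0001:                         # SEP bit set: usually the key-signing key
        print("computed DS :", dns.dnssec.make_ds(zone, key, "SHA256"))

for ds in dns.resolver.resolve(zone, "DS"):        # the DS published in the parent zone
    print("published DS:", ds)
```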


Successful implementation of DNSSEC depends on its deployment at all levels of the DNS architecture and on adoption by everyone involved in the DNS resolution process. One big step was taken in July 2010 when the DNS root zone was signed. Since then, resolvers can be configured with the root zone as a trust anchor, which allows validation of the complete chain of trust for the first time. The introduction and use of DNSSEC was controversial for over a decade due to its cost and complexity. However, its usage and adoption are steadily growing, and in 2014 DNS overseer ICANN determined that all new generic top-level domains would have to use DNSSEC.


Implementing DNSSEC is not always problem-free. Some faults in DNS are only visible with DNSSEC, and then only when validating, which makes debugging DNSSEC difficult. DNS software features that apply only to DNSSEC still have many issues to be ironed out, leading to disruptions in service. Interoperability amongst DNS software is another issue that adds to the problems. Above all, attackers can abuse improperly configured DNSSEC domains to launch denial-of-service attacks. The following are some of the major complexities one should be aware of.


Zone Content Exposure

DNS is split into smaller pieces called zones. A zone typically starts at a domain name and contains all records pertaining to its subdomains. Each zone is managed by a single manager. For example, kannan-subbiah.com is a zone containing all DNS records for kannan-subbiah.com and its subdomains (e.g. www.kannan-subbiah.com, links.kannan-subbiah.com). Unlike plain DNS, DNSSEC answers are produced at the signed-zone level, and enabling DNSSEC may expose otherwise obscured zone content. Subdomains are sometimes used as login portals or other services that the site owner wants to keep private. A site owner may not want to reveal that "secretbackdoor.example.com" exists, in order to protect that site from attackers.


Non-Existent Domains

Unlike standard DNS, where the server returns an unsigned NXDOMAIN (Non-Existent Domain) response when a subdomain does not exist, DNSSEC guarantees that every answer is signed. For statically signed zones there are, by definition, a fixed number of records. Since each NSEC record points to the next, this results in a finite 'ring' of NSEC records that covers all the subdomains. This technique may unveil internal records if the zone is not configured properly. The information obtained can help map network hosts by enumerating the contents of a zone.
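
The sketch below, again assuming dnspython (2.x or later), shows the mechanism: querying a made-up name in a signed zone that uses NSEC returns NSEC records whose "next name" field points at a real neighbouring name. The probe name, zone and resolver address are illustrative; zones signed with NSEC3 will return NSEC3 records instead.

```python
# Observe the NSEC records returned for a non-existent name (illustrative sketch).
import dns.message
import dns.query
import dns.rdatatype

query = dns.message.make_query("does-not-exist-123.ietf.org", "A", want_dnssec=True)
response = dns.query.udp(query, "8.8.8.8", timeout=5)

for rrset in response.authority:
    if rrset.rdtype == dns.rdatatype.NSEC:
        for nsec in rrset:
            print(f"{rrset.name} exists; next name in the zone: {nsec.next}")
```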


The NSEC3-walking attack

DNSSEC has undergone revisions on multiple occasions and NSEC3 is the current replacement for NSEC. "NSEC3 walking" is an easy privacy-violating attack against the current version of DNSSEC. After a few rounds of requests to a DNSSEC server, the attacker can collect a list of hashes of existing names. The attacker can then guess a name, hash the guess, check whether the hash is in the list, and repeat.  Compared to normal DNS, current DNSSEC (with NSEC3) makes privacy violations thousands of times faster for casual attackers, or millions of times faster for serious attackers. It also makes the privacy violations practically silent: the attackers are guessing names in secret, rather than flooding the legitimate servers with guesses. NSEC3 is advertised as being much better than NSEC. 
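
To see why this offline guessing works, here is a sketch of the NSEC3 hash construction from RFC 5155: an iterated SHA-1 over the wire-format owner name plus a salt, encoded in base32hex. The guessed name, salt and iteration count below are made-up examples.

```python
# NSEC3-style hashing of a guessed name (illustrative sketch, per RFC 5155).
import base64
import hashlib

B32_STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
B32_HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"     # NSEC3 uses the "base32hex" alphabet

def wire_name(name: str) -> bytes:
    """Encode a domain name in DNS wire format (length-prefixed labels)."""
    out = b""
    for label in name.rstrip(".").lower().split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"

def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(wire_name(name) + salt).digest()
    for _ in range(iterations):                    # extra iterations, as signalled by the zone
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32encode(digest).decode("ascii").translate(str.maketrans(B32_STD, B32_HEX))

# An attacker who has collected NSEC3 hashes from a zone can test guesses offline:
print(nsec3_hash("secretbackdoor.example.com", "aabbccdd", 10))
```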


Key Management

DNSSEC was designed to operate in various modes, each providing different security, performance and convenience tradeoffs. Live signing solves the zone content exposure problem in exchange for less secure key management. The most common DNSSEC mode is offline signing of static zones. This allows the signing system to be highly protected from external threats by keeping the private keys on a machine that is not connected to the network. This operating model works well when the DNS information does not change often.

Key management for DNSSEC is similar to key management for TLS and has similar challenges. Enterprises that decide to manage DNSSEC internally need to generate and manage two sets of cryptographic keys – the Key Signing Key (KSK), critical in establishing the chain of trust, and the Zone Signing Key (ZSK), used to sign the domain name’s zone. Both types of keys need to be changed periodically in order to maintain their integrity. The more frequently a key is changed, the less material an attacker has to help him perform the cryptanalysis that would be required to reverse-engineer the private key.  

An attacker could decide to launch a Denial of Service (DoS) attack at the time of key rollover. That is why it is recommended to introduce some "jitter" into the rollover plan by introducing a small random element to the schedule. Instead of rolling the ZSK every 90 days like clockwork, a time within a 10-day window either side may be picked, so that it is not predictable.
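
A minimal sketch of that jitter idea, using the same example periods as above: schedule the next ZSK rollover roughly every 90 days, shifted by a random offset of up to 10 days either way so the exact date is not predictable.

```python
# Pick an unpredictable date for the next ZSK rollover (illustrative sketch).
import random
from datetime import date, timedelta

def next_rollover(last: date, base_days: int = 90, jitter_days: int = 10) -> date:
    offset = random.randint(-jitter_days, jitter_days)   # the "jitter"
    return last + timedelta(days=base_days + offset)

print(next_rollover(date(2016, 10, 1)))
```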


Reflection/Amplification Threat

DNSSEC works over UDP, and the answers to DNS queries can be very long, containing multiple DNSKEY and RRSIG records. This is an attractive target for attackers since it allows them to ‘amplify’ their reflection attacks. If a small volume of spoofed UDP DNSSEC requests is sent to nameservers, the victim will receive a large volume of reflected traffic. Sometimes this is enough to overwhelm the victim’s server, and cause a denial of service. Specifically, an attacker sends a corrupted network packet to a certain server that then reflects it back to the victim. Using flaws in DNSSEC, it is possible to use that extra-large response as a way to amplify the number of packets sent – anywhere up to 100 times. That makes it an extremely effective tool in efforts to take servers offline.



The problem isn't with DNSSEC or its functionality, but rather how it's administered and deployed. DNSSEC is the best way to combat DNS hijacking, but the complexity of the signatures increases the possibility of administrators making mistakes. DNS is already susceptible to amplification attacks because there aren't a lot of ways to weed out fake traffic sources.


"DNSSEC prevents the manipulation of DNS record responses where a malicious actor could potentially send users to its own site. This extra security offered by DNSSEC comes at a price as attackers can leverage the larger domain sizes for DNS amplification attacks," Akamai said in a report.

Sunday, August 7, 2016

Distributed Ledger - Strengths That Warrants Its Adoption

Blockchain is the most talked-about technology today and is likely to have a pervasive impact on all industry segments, most notably Banking and Financial Services. Blockchain packs together the principles of cryptography, game theory and peer-to-peer networking. Blockchain, once the formal name for the tracking database underlying the cryptocurrency bitcoin, is now used broadly to refer to any distributed ledger that uses software algorithms to record transactions with reliability and anonymity. An increasingly interesting aspect of blockchain use is the concept of smart contracts, whereby business rules implied by a contract are embedded in the blockchain and executed with the transaction.


Built on peer-to-peer technology, blockchain uses advanced encryption to guarantee the provenance of every transaction. The secure and resilient architecture that protects the distributed ledger is one of its key advantages. Other benefits of blockchain include reduction in cost, complexity and time, in addition to trusted record keeping and discoverability. Blockchain has the potential to make trading processes more efficient, improve regulatory control and could also displace traditional trusted third-party functions. Blockchain holds the potential for all participants in a business network to share a system of record. This distributed, shared ledger will provide consensus, provenance, immutability and finality around the transfer of assets within business networks.


The Banking and Financial Services industries the world over are seriously looking at this technology. The central banks in many countries, including India, have formed committees to evaluate the adoption of blockchain technology, which is expected to address some of the problems the industry has been trying to overcome for many years. For the financial services sector, blockchain offers the opportunity to overhaul existing banking infrastructure, speed settlements and streamline stock exchanges. While many institutions understand its potential, they are still trying to work out whether blockchain technology offers a cost-cutting opportunity or represents a margin-eroding threat that could put them out of business.


As with cloud computing, there are three categories of blockchain: public, private, and hybrid. A public blockchain is a fully decentralized “trustless” system open to everyone, where the ledger is updated by anonymous users. A private blockchain finds its use within a bank or an institution, where the organization controls the entire system. A hybrid blockchain combines both implementations and is open to a controlled group of trusted and vetted users who update, preserve, and maintain the network collectively. Blockchain exploration has propelled banks in multiple directions, from examining fully decentralized systems that embed bitcoin or other virtual tokens to function, to ones where only authorized and vetted users are granted access to the network.


The technology is being commercialised by several industry groups, which are coming out with use cases across different industry verticals. With the surge in funding for FinTech innovations, blockchain technology may find retail and institutional adoption in about 3 to 5 years, while some expect this to take even longer. Some institutions have invested in in-house development, while others have partnered in their pursuit to adopt blockchain as part of their mainstream business technology.


Listed here are some of the key strengths that drive the adoption of the technology the world over.

Trusted

With the frequency at which data breaches are happening, users are seeking to have control over sensitive data. Blockchain by its nature puts users in total control. Applied to payments, blockchain allows users to retain control of their information and grant access only to the information about a single transaction. Participants are able to trust the authenticity of the data on the ledger without recourse to a central body. Transactions are digitally signed; the maintenance and validation of the distributed ledger is performed by a network of communicating nodes running dedicated software, which replicates the ledger amongst the participants in a peer-to-peer network and guarantees the ledger’s integrity. Participants will also want the ability to roll back transactions in instances of fraud or error, which can be done on a blockchain by adding a compensating record, as long as there are permission mechanisms to allow this, along with a framework for dispute resolution.
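As a rough illustration of how participants can trust signed entries without a central body, the following sketch, which assumes the third-party cryptography package and uses purely hypothetical transaction fields, signs a transaction with an Ed25519 key and shows that any tampering invalidates the signature.

    import json
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    # Each participant holds a private key; only the public key is shared with the network.
    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    # A hypothetical transaction, serialised deterministically before signing.
    transaction = json.dumps({"from": "A", "to": "B", "amount": 100}, sort_keys=True).encode()
    signature = private_key.sign(transaction)

    # Any node can verify the entry against the sender's public key.
    public_key.verify(signature, transaction)   # passes silently

    # The same signature no longer matches a tampered transaction.
    tampered = json.dumps({"from": "A", "to": "B", "amount": 999}, sort_keys=True).encode()
    try:
        public_key.verify(signature, tampered)
    except InvalidSignature:
        print("tampered transaction rejected")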

Traceability

The cryptographic connection between each block and the next forms one link of the chain. This link preserves a trace of the information flow across the chain, enabling participants or regulators to trace information flows back through the entire chain. The distributed ledger is immutable: entries can be added, but not deleted. This information potentially includes, but is not limited to, ownership, transaction history, and data lineage of information stored on the shared ledger. If provenance is tracked on a blockchain belonging collectively to participants, no individual entity or small group of entities can corrupt the chain of custody, and end users can have more confidence in the answers they receive.
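Here is a minimal sketch of that cryptographic link, using only Python's standard library and illustrative block fields: each block stores the hash of its predecessor, so altering any earlier entry breaks every hash that follows and the tampering is immediately detectable.

    import hashlib
    import json

    def block_hash(block):
        # Hash the block's contents (excluding its own hash field) deterministically.
        payload = {k: v for k, v in block.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def append_block(chain, data):
        # Link each new block to the hash of the previous one.
        prev_hash = chain[-1]["hash"] if chain else "0" * 64
        block = {"index": len(chain), "data": data, "prev_hash": prev_hash}
        block["hash"] = block_hash(block)
        chain.append(block)

    def verify_chain(chain):
        # Recompute every hash and check each link to its predecessor.
        for i, block in enumerate(chain):
            if block["hash"] != block_hash(block):
                return False
            if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
                return False
        return True

    chain = []
    append_block(chain, {"asset": "X", "owner": "A"})
    append_block(chain, {"asset": "X", "owner": "B"})
    print(verify_chain(chain))          # True
    chain[0]["data"]["owner"] = "Z"     # tamper with history
    print(verify_chain(chain))          # False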

Resiliency

Blockchain operates seamlessly and removes dependency on a central infrastructure for service availability. Distributed processing allows participants to continue operating in case of failure of any participant. Data on the ledger is pervasive and persistent, creating a reliable distributed store from which transaction data can be recovered in case of local system failure, giving the system very strong built-in data resiliency. Distributed ledger-based systems are more resilient to systemic operational risk because the system as a whole is not dependent on a centralised third party. With many contributors, and thus many copies acting as back-ups, the ledger should be more resilient than a centralised database.

Reconciliation

Use cases that centre on increasing efficiency by removing the need for reconciliation between parties seem to be particularly attractive. Blockchain provides the benefits of ledgers without suffering from the problem of concentration. Instead, each entity runs a “node” holding a copy of the ledger and maintains full control over its own assets. Transactions propagate between nodes in a peer-to-peer fashion, with the blockchain ensuring that consensus is maintained. Reconciling, matching and verifying data points through manual or even electronic means would be eliminated, or at least reduced, because everyone in the network accessing the distributed ledger would be working off exactly the same data. This is especially true for syndicated loans, where information is mutualised and all participants work from the same data set in real time or near-real time.

Distributed

When a blockchain transaction takes place, a number of networked computers process the transaction and confirm one another’s calculations. The record of such transactions thus continually expands and is shared in real time by thousands of people. Billions of people around the world lack access to banks and currency exchange. Blockchain-based distributed ledgers could change this. Just as the smartphone gave people without telephone lines access to communication, information, and electronic commerce, these technologies can give a person the legitimacy needed to open a bank account or borrow money, without having to prove ownership of real estate or meet other qualifications that are challenging in many countries.


Efficiency Gains

Removal of slow, manual and exception-handling steps in existing end-to-end processes will lead to significant efficiency gains. Blockchain also removes the need for a clearing house or financial establishment to act as intermediary, facilitating quick, secure, and inexpensive value exchanges. Blockchain ensures an effective alignment between usage and cost thanks to its transparency, accuracy and the significantly lower cost of cryptocurrency transactions. Distributed ledger technology has the potential to reduce duplicative recordkeeping, eliminate reconciliation, minimise error rates and facilitate faster settlement. In turn, faster settlement means less risk in the financial system and lower capital requirements.

Sunday, April 10, 2016

Economics of Software Resiliency

Resilience is a design feature that enables software to recover from the occurrence of a disruptive event; in effect, it is automated recovery from disastrous events after they occur. Given the option, we would want the software that we build or buy to have this resilience built in. Obviously, resilience comes at a cost, and the economics of the benefit should be examined before deciding what level of resilience is required. There is a need to balance the cost and effectiveness of the recovery or resilience capabilities against the events that cause disruption or downtime. These costs may be reduced, or rather optimized, if the expectation of failure or compromise is lowered through preventative measures, deterrence, or avoidance.

There is a trade-off between protective measures and investments in survivability, i.e., the cost of preventing the event versus recovering from it. Another key factor that influences this decision is the cost of such an event if it occurs. This suggests that a number of combinations need to be evaluated, depending on the resiliency of the primary systems, the criticality of the application, and the options for backup systems and facilities.

This analysis is, in essence, identical to the risk management process. The following elements form part of this process:


Identify problems


The events that could lead to failure of the software are numerous. Developers know that exception handling is an important best practice to adhere to while designing and developing a software system. Most modern programming languages provide support for catching and handling exceptions. At a low level, this helps in identifying the exceptions encountered by a particular application component at run time. There may be certain events that cannot be handled from within the component and require an external component to monitor and handle them. Beyond the exception-handling facilities of the programming language, the architects designing the system should identify and document such exceptions and design a solution to recover from them, so that the system becomes more resilient and reliable. The following would primarily bring out possible problems or exceptions that need to be handled to make the system more resilient:


  • Dependency on Hardware / Software resources - Whenever the designed system needs to access a hardware resource, for example a specified folder on the local disk drive, anticipate situations such as the folder not being there, the application context not having enough permissions to perform its actions, or disk space being exhausted (see the sketch after this list). The same applies to software resources such as the operating system or a third-party software component.
  • Dependency on external Devices / Servers / Services / Protocols - Access to external devices like printers or scanners, or to other services exposed for use by the application system, such as an SMTP service for sending emails, database access, or a web service over HTTPS, could also cause problems: the remote device not being reachable, a protocol mismatch, request or response data inconsistency, access permissions, and so on.
  • Data inconsistency - In complex application systems, certain scenarios can lead to inconsistent internal data, which may drive the application into a deadlock or never-ending loop. Such a situation can have a cascading effect, as the affected components quickly consume considerable system resources and lead to a total system crash. This is a typical situation in web applications: each external request is executed in a separate thread, and when such threads get into a 'hung' state, the request queue will soon surpass the installed capacity.
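As an illustration of the first point above, the following sketch, with a hypothetical folder path and size margin, anticipates the common failure modes of a local-disk dependency, such as a missing folder, missing permissions or exhausted disk space, rather than letting them surface as unhandled exceptions.

    import logging
    import os
    import shutil

    def write_report(folder, filename, content):
        try:
            os.makedirs(folder, exist_ok=True)                # the folder may not exist yet
            if shutil.disk_usage(folder).free < len(content) + 1_000_000:
                logging.error("Not enough disk space for %s", filename)
                return False
            with open(os.path.join(folder, filename), "w", encoding="utf-8") as fh:
                fh.write(content)
            return True
        except PermissionError:
            logging.error("No permission to write to %s", folder)
        except OSError as exc:                                # any other I/O failure
            logging.error("Could not write report: %s", exc)
        return False

    write_report("/var/reports", "daily.txt", "example content")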


Cost of Prevention / Recovery


The cost of prevention depends on the solutions available to overcome or handle such exceptions. For instance, if the issue is the SMTP service being unavailable, the solution could be an alternate, always-active SMTP service running out of a totally different network environment, so that the system can switch over to the alternate service when it encounters issues with the primary one. While the cost of implementing support for multiple SMTP services and a fail-over algorithm may not be significant, maintaining a redundant SMTP service could have a significant cost impact. Thus, for each event that may affect the software's resilience, the total cost of a proactive solution should be assessed against that of a reactive solution.
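A minimal sketch of such a fail-over using Python's standard smtplib follows; the host names are hypothetical placeholders for a primary SMTP service and a redundant one running in a different network.

    import smtplib
    from email.message import EmailMessage

    SMTP_SERVICES = [("smtp.primary.example.com", 587),   # primary service
                     ("smtp.backup.example.net", 587)]    # redundant service, different network

    def send_with_failover(msg):
        for host, port in SMTP_SERVICES:
            try:
                with smtplib.SMTP(host, port, timeout=10) as smtp:
                    smtp.starttls()
                    smtp.send_message(msg)
                    return host                            # report which service was used
            except (smtplib.SMTPException, OSError):
                continue                                   # try the next service
        raise RuntimeError("all configured SMTP services are unavailable")

    msg = EmailMessage()
    msg["From"], msg["To"], msg["Subject"] = "noreply@example.com", "ops@example.com", "Alert"
    msg.set_content("Nightly job completed.")
    # send_with_failover(msg)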

Time to Recover & Impact of Event


While the cost of prevention / recovery assessed above indicates how expensive the solution is, the time to recover and the impact of such an event indicate the cost of not having the event handled or worked around. Simple issues like a database deadlock may be handled reactively by the DBAs, who monitor for such issues and act immediately when one arises. But issues such as the network link to an external service failing may mean extended system unavailability and thus impact the business. So, it is critical to assess the time to recover and the impact that such an event may have if not handled instantly.

Depending on the above metrics, the software architect may suggest a cost-effective solution to handle each such event. The level of resiliency that is appropriate for an organization depends on how critical the system in question is for the business, and on the impact that a lack of resilience would have on the business. Resiliency has its own cost-benefit trade-off; the architects should keep this in mind and design solutions to suit the specific organization.

The following are some of the best practices that the architects and the developers should follow while designing and building the software systems:
  • Avoid usage of proprietary protocols and software that make migration or graceful degradation very difficult.
  • Identify and handle single points of failure. Of course, building redundancy has cost.
  • Loosely couple the service integrations, so that inter-dependence of services is managed appropriately.
  • Identify and overcome weak architecture / designs within the software modules or components.
  • Anticipate failure of every function and design fall-back scenarios, with graceful degradation where appropriate.
  • Design to protect state in multi‐threaded and distributed execution environments.
  • Expect exceptions and implement safe use of inheritance and polymorphism.
  • Manage and handle the bounds of various software and hardware resources.
  • Manage allocated resources by using them only when needed.
  • Be aware of the timeouts of various services and protocols and handle them appropriately (a sketch follows this list).
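To illustrate the last two points, the following sketch, with a hypothetical URL and cache value, sets an explicit timeout on an external call and degrades gracefully to the last known good value instead of hanging or failing outright.

    import json
    import urllib.request

    _last_good_rates = {"USD": 1.0}   # fallback value served during outages

    def fetch_rates(url="https://rates.example.com/latest.json", timeout=5):
        global _last_good_rates
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                _last_good_rates = json.load(resp)   # refresh the cache on success
        except (OSError, json.JSONDecodeError):
            # URLError and socket timeouts are subclasses of OSError;
            # degrade gracefully by keeping the cached copy.
            pass
        return _last_good_rates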

Sunday, March 20, 2016

Big Data for Governance - Implications for Policy, Practice and Research

A recent IDC forecast shows that the Big Data technology and services market will grow at a 26.4% compound annual growth rate to $41.5 billion through 2018, or about six times the growth rate of the overall information technology market. Additionally, by 2020 IDC believes that line of business buyers will help drive analytics beyond its historical sweet spot of relational (performance management) to the double-digit growth rates of real-time intelligence and exploration/discovery of the unstructured worlds.

This predicted growth is expected to have a significant impact on all organizations, be they small, medium or large, including exchanges, banks, brokers, insurers, data vendors, and technology and services suppliers. It also extends beyond the organization, with the increasing focus on rules and regulations designed to protect a firm’s employees, customers and shareholders as well as the economic wellbeing of the state in which the organization resides. This pervasive use and commercialization of big data analytical technologies is likely to have far-reaching implications for meeting regulatory obligations and governance-related activities.

Certain disruptive technologies such as complex event processing (CEP) engines, machine learning, and predictive analytics using emerging big-data technologies such as Hadoop, in-memory, or NoSQL illustrate a trend in how firms are approaching technology selection to meet regulatory compliance requirements. A distinguishing factor between big data analytics and regular analytics is the performative nature of Big Data and how it goes beyond merely representing the world but actively shapes it.


Analytics and Performativity


Regulators are staying on top of big data tools and technologies and are leveraging them to search through vast amounts of organizational data, both structured and unstructured, to prove a negative. This forces organizations to use the latest and most effective forms of analytics in order to avoid regulatory sanctions and stay compliant. Analytical outputs may provide a basis for strategic decision making by regulators, who may refine and adapt regulatory obligations accordingly and then require firms to use related forms of analytics to test for compliance. Compliance analytics do not simply report on practices; they also shape them, with accelerated decision making turning strategic planning from a long-term, top-down exercise into a bottom-up, reflexive one. Due to 'automation bias' and the privileged nature of the underlying visualization algorithms, compliance analytics may not be neutral in the data and information they provide and the responses they elicit.

Technologies which implement surveillance and monitoring capabilities may also create self-disciplined behaviours through a pervasive suspicion that individuals are being currently observed or may have to account for their actions in the future. The complexity and heterogeneity of underlying data and related analytics provides a further layer of technical complexity to banking matters and so adds further opacity to understanding controls, behaviours and misdeeds. 

Design decisions are embedded within technologies shaped by underlying analytics and further underpinned by data. Thus, changes to one part of a system may cause a cascading effect on the outcome. Data accuracy may also unduly influence outcomes. This underscores the need to understand big data analytics at the level of micro practice and from the bottom up.


Information Control and Privacy


The collection and storage of Big Data raises concerns over privacy. In some cases, the uses of Big Data can run afoul of existing privacy laws. In all cases, organizations risk backlash from customers and others who object to how their personal data is collected and used. This can present a challenge for organizations seeking to tap into Big Data’s extraordinary potential, especially in industries with rigorous privacy laws such as financial services and healthcare. Some wonder whether these laws, which were not developed with Big Data in mind, sufficiently address both privacy concerns and the need to access large quantities of data to reach the full potential of the new technologies.

The challenges to privacy arise because technologies collect so much data and analyze them so efficiently that it is possible to learn far more than most people had predicted or can predict. These challenges are compounded by limitations of the traditional technologies used to protect privacy. The degree of awareness and control can determine information privacy concerns; however, that degree may depend on personal privacy risk tolerance. In order to be perceived as ethical, an organization must ensure that individuals are aware that their data is being collected and have control over how their data is used. As data privacy regulations impose increasing levels of administration and sanctions, we expect policy makers at the global level to be placed under increased pressure to mitigate regulatory conflicts and multijurisdictional tensions between data privacy and financial services regulations.

Technologies such as social media or cloud computing facilitate data sharing across borders, yet legislative frameworks are moving in the opposite direction towards greater controls designed to prevent movement of data under the banner of protecting privacy. This creates a tension which could be somewhat mediated through policy makers’ deeper understanding of data and analytics at a more micro level and thereby appreciate how technical architectures and analytics are entangled with laws and regulations. 

The imminent introduction of data protection laws will further require organizations to account for how they manage information, requiring much more responsibility from data controllers. Firms are likely to be required to understand the privacy impact of new projects and correspondingly assess and document perceived levels of intrusiveness. 


Implementing an Information Governance Strategy


The believability of analytical results when there is limited visibility into the trustworthiness of the data sources is one of the foremost concerns an end user will have. A common challenge associated with the adoption of any new technology is walking the fine line between speculative application development, assessing pilot projects as successful, and transitioning those successful pilots into the mainstream. The enormous speed and volume of data processed with Big Data technologies can cause the slightest discrepancy between expectation and performance to exacerbate quality issues. This may be further compounded by metadata complications when conceiving definitions for unstructured and semi-structured data.

This necessitates that organizations work towards developing an enterprise-wide information governance strategy with related policies. The governance strategy should encompass the continued development and maturation of processes and tools for data quality assurance, data standardization, and data cleansing. The management of metadata and its preservation, so that it can be evidenced to regulators and courts, should also be considered when formulating strategies and tactics. The policies should be high-level enough to be relevant across the organization while allowing each function to interpret them according to its own circumstances.
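As a small illustration of what continuous data quality assurance might look like in practice, the following sketch, which assumes the pandas library and uses hypothetical column names, profiles a data set for missing values and duplicates so that the findings can be tracked against the governance policies.

    import pandas as pd

    def quality_report(df: pd.DataFrame) -> dict:
        # Summarise basic quality indicators for a data set.
        return {
            "rows": len(df),
            "duplicate_rows": int(df.duplicated().sum()),
            "nulls_per_column": df.isnull().sum().to_dict(),
            "column_types": {col: str(dtype) for col, dtype in df.dtypes.items()},
        }

    # Hypothetical transaction extract
    df = pd.DataFrame({"account": ["A1", "A2", None], "amount": [100.0, 250.5, 250.5]})
    print(quality_report(df))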

Outside of regulations expressly for Big Data, lifecycle management concerns for Big Data are fairly similar to those for conventional data. One of the biggest differences, of course, is in providing the resources needed for data storage, considering the rate at which the data grows. Different departments will need access to data for different lengths of time, which factors into how long data is kept. Lifecycle principles are inherently related to data quality issues as well, since such data is only truly accurate once it has been cleaned and tested for quality. As with conventional data, lifecycle management for Big Data is industry specific and must adhere to the corresponding external regulations.

Security issues must be part of an Information Governance strategy, which will require current awareness of regulatory and legal data security obligations so that a data security approach can be developed based on repeatable and defensible best practices.