Showing posts with label Architecture. Show all posts

Thursday, December 18, 2025

DNS as a Threat Vector: Detection and Mitigation Strategies

The Domain Name System (DNS) is often described as the “phonebook of the Internet” as its primary role is to translate human-readable domain names into IP addresses. DNS is a critical control plane for modern digital infrastructure — resolving billions of queries per second, enabling content delivery, SaaS access, and virtually every online transaction. Its ubiquity and trust assumptions make it a high‑value target for attackers and a frequent root cause of outages.

Unfortunately, this essential service can be exploited as a DoS vector. Attackers can harness misconfigured authoritative DNS servers, open DNS resolvers, or the networks that support such activities to direct a flood of traffic at a target, degrading service availability and causing disruption on a large scale. This misuse of DNS capabilities makes it a potent tool in the hands of cybercriminals.

In recent years, DNS has increasingly become both a threat vector and a single point of failure, exploited through hijacks, cache poisoning, tunnelling, DDoS attacks, and misconfigurations. Even when not directly attacked, DNS fragility can cascade into global service disruptions.

The July 2025 Cloudflare 1.1.1.1 outage is a stark reminder of this fragility. Although the root cause was an internal configuration error, the incident coincided with a BGP hijack of the same prefix by Tata Communications India (AS4755), amplifying the complexity of diagnosing DNS‑related failures. The outage lasted 62 minutes and effectively made “all Internet services unavailable” for millions of users relying on Cloudflare’s resolver.

This blog explores why DNS is such a potent threat vector, identifies modern attack methods, explains how organizations can detect and mitigate such attacks, and outlines the strategies required to build resilient DNS architectures.
 

Why DNS is the "Silent Killer" of Networks


DNS is frequently overlooked in security budgets because it is an open, trust-based protocol. Most firewalls are configured to allow DNS traffic (UDP/TCP Port 53) without deep inspection, as blocking it would effectively break the internet for users. Attackers exploit this "open door" to hide malicious activity within seemingly legitimate queries.

To understand the stakes, we only need to look at recent high-profile failures:

The AWS "DynamoDB" DNS Chain Reaction (October 2025): A massive 15-hour outage hit millions of users when a DNS error prevented AWS applications from locating DynamoDB instances. This triggered a "waterfall effect" across the US-East-1 region, proving that even internal DNS misconfigurations can cause global economic paralysis. 
 
The Cloudflare "Bot Management" Meltdown (November 2025): While not a malicious attack, this incident highlighted the fragility of DNS-related configuration files. A database permission error caused a "feature file" to bloat, crashing the proxy software that handles a fifth of the world’s web traffic.
 
The Aisuru Botnet (Q3 2025): This record-breaking botnet launched hyper-volumetric DDoS attacks peaking at 29.7 Tbps. By flooding DNS resolvers with massive volumes of traffic, the botnet caused significant latency and unreachable states for AI and tech companies throughout late 2025.


Why DNS Is an Attractive Threat Vector


DNS is a prime target because:
 
  • It is universally trusted — most organizations do not inspect DNS deeply.
  • It is often unencrypted — enabling interception and manipulation.
  • It is essential for every connection — making it a high‑impact failure point.
  • It is distributed and complex — involving resolvers, authoritative servers, registrars, and routing.
  • It is frequently misconfigured — creating opportunities for attackers.

Attackers exploit DNS for both disruption and covert operations.


Common DNS Attack Vectors


Common DNS attack vectors exploit the Domain Name System to redirect users, steal data, or disrupt services. Attackers leverage DNS's fundamental role in translating names to IPs, often using vulnerabilities like misconfigurations or outdated software for initial access or as part of larger campaigns. The following are some of the key attack vectors:

  • DNS Hijacking: Also known as DNS redirection, this is a method in which an attacker manipulates the DNS resolution process (through routers, endpoints, DNS resolvers, or registrar accounts) to redirect users from legitimate websites to malicious ones. This can lead to data theft, malware distribution, and phishing attacks. During the Cloudflare outage, a coincidental BGP hijack of the 1.1.1.0/24 prefix was observed, demonstrating how routing manipulation can mimic DNS hijacking symptoms.
  • DNS Cache Poisoning: Also known as DNS spoofing, this is an attack in which corrupted DNS data is injected into a DNS resolver's cache. The resolver then returns an incorrect IP address for a legitimate website, redirecting users to an attacker-controlled, often malicious, website without their knowledge. The attack exploits weaknesses in the DNS protocol, which was originally built on a principle of trust and lacks built-in verification mechanisms for the data it handles. Modern resolvers implement mitigations like source port randomization, but legacy systems remain vulnerable.
  • DNS Tunneling: A technique that encodes non-DNS traffic within DNS queries and responses, effectively creating a covert communication channel. It is often used to bypass network security measures like firewalls, as DNS traffic is typically trusted and rarely subject to deep inspection. A DNS tunneling attack involves two main components: a compromised client inside a protected network and a server controlled by an attacker on the public internet. Cybercriminals primarily use it for Command and Control (C2), data exfiltration, malware delivery, and network footprinting. Because DNS is often allowed outbound by default, tunneling is a favorite technique for APTs.
  • DNS Flood Attack: A DNS flood is a type of distributed denial-of-service (DDoS) attack in which an attacker floods a particular domain’s DNS servers in an attempt to disrupt DNS resolution for that domain. If a user cannot find the phonebook, they cannot look up the address needed to reach a particular resource. By disrupting DNS resolution, a DNS flood attack compromises the ability of a website, API, or web application to respond to legitimate traffic. While the July 2025 Cloudflare incident was not a DDoS attack, it demonstrated how DNS unavailability, regardless of cause, can cripple global connectivity.
  • Registrar and Zone File Compromise: The unauthorized alteration of DNS records at the registrar or in the zone file, which can be used to redirect user traffic to malicious websites, capture sensitive information, host malware, or disrupt services. Attackers typically compromise registrar accounts and zone files through stolen credentials, registrar vulnerabilities, or domain shadowing.


DNS Detection Strategies


DNS detection strategies focus on analyzing traffic patterns and query content for anomalies (such as long or random subdomains, high query volume, and rare record types) to spot threats like tunneling, Domain Generation Algorithms (DGAs), or malware. They use AI/ML, threat intelligence, and SIEMs for real-time monitoring, payload analysis, and traffic analysis, complemented by DNSSEC and rate limiting for prevention. Legacy security tools often miss DNS threats. Modern detection requires a data-centric approach, which includes:
 
  • Entropy Analysis: Monitoring for "high entropy" in domain names. Legitimate domains like google.com have low entropy. Long, random strings like a1b2c3d4e5f6.malicious.io are a red flag for tunneling or DGA (Domain Generation Algorithms) used by malware.
  • Linguistic/Readability Analysis: More advanced DGAs use dictionary words (e.g., carhorsebatterystaplehousewindow.example) to evade entropy-based detection. Natural Language Processing (NLP) techniques and readability indices can help determine if a domain name is a coherent, human-readable phrase or a machine-generated string of words.
  • NXDOMAIN Monitoring: A sudden spike in "NXDOMAIN" (Domain Not Found) responses often indicates a DNS Water Torture attack or a compromised bot trying to "call home" to randomized command-and-control servers.
  • Response-to-Query Ratio: DGA-infected hosts may exhibit unusual bursts of DNS queries, especially during off-peak hours, when network activity is typically low. If an internal host is sending 10,000 queries but only receiving 1,000 responses, it may be participating in a DDoS attack or scanning for vulnerabilities.
  • Lack of Caching: Legitimate domains are frequently visited and cached. DGA domains are typically short-lived, resulting in many cache misses and repeated queries for new domains that lack a history.
  • IP Address Behavior: Observing the resolved IP addresses can provide context. If many random domains resolve to the same IP or IP range, it might indicate a C2 server infrastructure.
  • DNSSEC Validation: DNSSEC ensures the authenticity of DNS responses and the integrity of zone data. While not a silver bullet, DNSSEC validation protects against cache poisoning and man-in-the-middle tampering with DNS responses.
  • BGP Monitoring for DNS Prefixes: Because DNS availability depends on routing stability, organizations should monitor BGP announcements for their DNS prefixes and use RPKI to validate route origins. The Cloudflare incident highlighted how BGP anomalies can complicate the diagnosis of DNS outages.
  • Resolver Telemetry and Logging: Collect logs from recursive resolvers, forwarders, and authoritative servers, and correlate them with firewall logs, proxy logs, and endpoint telemetry. This helps identify C2 activity and exfiltration attempts.
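
To make the entropy and NXDOMAIN ideas above a little more concrete, here is a minimal Python sketch. The function names, the 3.5-bit threshold, and the minimum label length are assumptions picked for illustration, not tuned values, and a production detector would combine many more signals (including the linguistic checks mentioned above for dictionary-based DGAs).

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of a single DNS label."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_suspicious(qname: str, threshold: float = 3.5, min_len: int = 10) -> bool:
    """Flag a query whose leftmost label is both long and high-entropy.

    threshold and min_len are illustrative assumptions, not tuned values.
    """
    label = qname.rstrip(".").split(".")[0].lower()
    return len(label) >= min_len and shannon_entropy(label) > threshold


def nxdomain_ratio(rcodes: list[str]) -> float:
    """Fraction of responses in an observation window that are NXDOMAIN."""
    if not rcodes:
        return 0.0
    return sum(1 for r in rcodes if r == "NXDOMAIN") / len(rcodes)


if __name__ == "__main__":
    for name in ("google.com", "a1b2c3d4e5f6.malicious.io"):
        print(name, is_suspicious(name))
    print(nxdomain_ratio(["NOERROR", "NXDOMAIN", "NXDOMAIN", "NOERROR"]))
```

In practice the NXDOMAIN ratio would be tracked per client and per time window, and an alert raised only when it departs sharply from that client's own baseline.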


Strategies for building a resilient DNS Architecture


DNS mitigation strategies involve securing servers (ACLs, patching, DNSSEC), controlling access (MFA, strong passwords), monitoring traffic for anomalies, rate-limiting queries, hardening configurations (closing open resolvers), and using specialized DDoS protection services to prevent amplification, hijacking, and spoofing attacks, ensuring domain integrity and availability. A resilient DNS architecture shall consider the following:

  • Redundant, Anycast‑Based DNS Architecture: An Anycast-based DNS architecture uses one single IP address for multiple, geographically distributed DNS servers, routing user queries to the nearest server via Border Gateway Protocol (BGP) for reduced latency, improved reliability, load balancing, and inherent DDoS protection, making services faster and more resilient by sharing traffic across many points of presence (PoPs). This reduces the blast radius of outages. Cloudflare’s outage demonstrated how anycast misconfigurations can cause global failures — but also why anycast remains essential for scale.
  • Implement DNSSEC for Authoritative Zones: DNSSEC for Authoritative Zones secures DNS by adding digital signatures (RRSIGs) to DNS records using public-key cryptography, ensuring data authenticity and integrity, preventing spoofing; administrators sign zones with keys (ZSK/KSK), publish public keys (DNSKEY), and establish a chain of trust by adding DS records to parent zones, allowing resolvers to verify responses against tampering. This process involves key generation, zone signing on the primary server, and trust delegation to the parent, protecting DNS data from forgery.
  • Enforce DNS over HTTPS (DoH) or DNS over TLS (DoT): DNS over TLS (DoT) encrypts DNS on its own port (853) and is simpler/faster, while DNS over HTTPS (DoH) hides DNS traffic within standard HTTPS (port 443), making it harder to block but slightly slower; DoT is better for network visibility (admins), while DoH offers greater user privacy by blending with web traffic, making it ideal for bypassing censorship but potentially bypassing network controls. During the Cloudflare outage, DoH traffic remained more stable because it relied on domain‑based routing rather than IP‑based resolution.
  • Use DNS Firewalls and Response Policy Zones: DNS Firewalls using Response Policy Zones (RPZs) are a powerful security layer that intercepts DNS queries, checks them against lists (zones) of known malicious domains (phishing, malware, C&C), and then modifies the response to block, redirect (to a "walled garden"), or simply prevent access, stopping threats at the DNS level before users even reach harmful sites. Essentially, RPZs let you customize DNS behaviour to enforce security policies, overriding normal resolution for threats, and are a key defense against modern cyberattacks.
  • Adopt Zero‑Trust Principles for DNS: Implementing Zero Trust principles for the Domain Name System (DNS) means applying a "never trust, always verify" approach to every single DNS query and the resulting network connection, moving beyond implicit trust. This transforms DNS from a potential blind spot into a critical policy enforcement point in a modern security architecture.
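
As a rough illustration of the DNS firewall idea above, the following Python sketch mimics the lookup step of a policy zone: a query name is checked against a list of blocked domains and, on a match, the normal answer is overridden. The domain names, the action strings, and the `policy_action` helper are all hypothetical, and the sketch simplifies real RPZ behaviour by treating every entry as covering its subdomains.

```python
from typing import Optional

# Hypothetical policy zone: blocked domain -> action to take instead of resolving.
POLICY = {
    "malicious.example": "NXDOMAIN",
    "phish.example": "WALLED_GARDEN",
}


def policy_action(qname: str, policy: dict[str, str] = POLICY) -> Optional[str]:
    """Return the override action if qname equals or sits under a listed domain."""
    labels = qname.rstrip(".").lower().split(".")
    # Try the full name first, then each parent domain in turn.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in policy:
            return policy[candidate]
    return None  # no policy match: resolve normally


if __name__ == "__main__":
    for name in ("c2.malicious.example.", "www.example.org"):
        print(name, policy_action(name))
```

The useful property is that the decision happens before any connection is attempted, which is what lets the resolver act as a policy enforcement point.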

Treat DNS as a monitored, controlled, and authenticated service — not a blind trust channel.


Conclusion


DNS is no longer just a networking utility; it is a frontline security perimeter. As seen in the outages of 2025, a single DNS failure—whether from a 30 Tbps botnet or a simple configuration error—can take down the digital economy. Organizations must move toward Proactive DNS Observability to catch threats before they resolve.

The path forward requires deep visibility, strong authentication, redundant architectures, continuous monitoring, secure routing, and encryption.

DNS may be one of the oldest Internet protocols, but securing it is one of the most urgent challenges of the modern threat landscape.

Friday, June 19, 2015

Information Security - Reducing Complexity


Change is constant, and everything around us is evolving. Primarily, the evolution is happening in the following categories:

Threats:

The threat landscape has changed drastically since the 1980s and even the 1990s. Between 1980 and 2000, a good anti-virus and firewall solution was considered good enough for an organization. Those alone are no longer enough, as hackers now use sophisticated tools, technology and skills to attack organizations. The motive behind hacking has also evolved: hacking, though illegal, has become a commercially viable profession or business.

Compliance:

With the pace at which the threat landscape is evolving, governments have reason to be concerned, as they increasingly leverage technology to better serve citizens and thereby take on greater security risk. To combat such challenges, governments have come up with regulatory compliance requirements, making things even more complex for the CSOs of enterprises.

Technology:

Technology is evolving at a much faster pace, and the things around us are getting smarter, with the ability to connect and communicate over the internet. On the other side, considerable progress has been achieved in Artificial Intelligence, Machine Learning, etc. These newer ‘smarter things’ add to the complexity, as CSOs have to handle the threats that they bring to the surface.

Needless to say, hackers too make the best use of the technology evolution, improving their attack capabilities day by day.

Business Needs:

The driver for adopting these evolutions is the business need. As businesses want to stay ahead of the competition, they leverage evolving technologies to surge ahead. With a shorter time to market, all departments, including the security organization, should be capable of accepting and implementing such changes at a faster pace. Due to this time pressure, there is a tendency to look for easier and quicker ways to implement changes, ignoring best practices.


Consumerization

IT today aims to simplify things for consumers within and outside the organization. This raises user expectations and leads to too many changes, some of them unrealistic. This may include users bringing their own anything (BYOA), and will soon include Bring Your Own Identity with chips implanted under the skin. For example, employees who work at EpiCenter, the new high-tech office campus in Sweden, can wave their hands to open doors using an RFID chip implanted under the skin.

Connected world

Most enterprises are now connected with their business partners for exchanging business data. With this, the IT system perimeter extends, to some extent, to that of the partners as well. Rules and policies have had to be relaxed to support such connected systems. Everyday things are now turning into connected things as well, adding to the complexity.

Big data

Handling the resulting volume of data calls for big data tools. While this complexity did exist earlier, the attacks were not that sophisticated then. Today, given the sophistication of attacks, simplifying the handling of huge volumes of data is very much required.

Skillset

The threat landscape is widening and the attacks are getting sophisticated, which call for even better tools and technologies to be used to prevent or counter them. This means that there is a continuous change in the method, approach, tools and technology used, making it difficult to maintain and manage the skills of the human resources.

Application Eco System

A midsized organization will have hundreds of applications, needing to have different exceptions to the policies and rules. These applications may in turn use third party components and thus the chances of a vulnerability within these applications is very high. Given that these applications constantly undergo change and evolve, there is a possibility that the code or component left behind might expose a vulnerability.


How does this impact

Complexity impacts the security capability in many ways and the following are some:

Accuracy in Detection

Complexity makes the detection of a compromise difficult. Handling and correlating large volumes of logs from different devices, often from different vendors, will always be a challenge, and this makes timely and accurate detection a remote possibility. A successful countermeasure requires accurate detection in the pre-infection stage, or at least in the infection stage. The later a compromise is detected, the more complex it is to counter.

Resources

Each new security technology requires people to properly deploy, operate and maintain it. But it is difficult to add new heads to the security organization whenever a new tool or technology is considered. Similarly, legacy solutions put in place by employees who have since left the organization are likely to remain untouched for fear of breaking things.

Vulnerabilities and Exposures

With the huge number of applications used by the enterprise, vulnerability assessment is a complex and huge exercise, unless it is integrated into the build and delivery process by mandating a security vulnerability assessment. With innumerable applications, components, and operating systems connecting to the enterprise network, this is almost impossible. Needless to say, with wearables and other smarter things connecting to the network, there is no telling what vulnerabilities exist in them and could in turn be exploited by hackers.

Methods for reducing complexity

Complexity is certainly bad, and reducing it will be beneficial both in terms of cost and otherwise. However, simplification by any means should not result in compromising the needed detection and protection abilities. A balanced approach is necessary so that the risk, cost and complexity are well balanced and beneficial to the organization. The following are some of the methods that may help reduce the complexity:

  • Integrated processes as against isolated security processes. Every Business process should have the security related processes integrated within, so that every person in the organization will by default contribute towards security. The security process framework shall be designed in such a manner that it evolves over a period based on experience and feedback.
  • Practicing an Agile approach within the security organization, so that the complexity is hidden within tools and appliances through automation. An Agile approach also helps the security organization embrace changes faster, especially when implementing changes in response to a detected threat or compromise. Such practices have to be adopted carefully into the security framework.
  • Outsourcing security operations to Managed Security Service Providers (MSSPs) is certainly an option for small and medium enterprises; it takes some of the complexity away and thus benefits the organization. Note, however, that outsourcing does not absolve the security organization of responsibility for any security incident or breach.
  • “Shrinking the Rack”: consolidating technologies, whereby devices that combine multiple technologies and capabilities may be easier to deploy and administer. At the same time, this carries the risk of ‘having all eggs in one basket’, i.e. when such a device or solution is hacked, the organization is wide open to the hackers.
  • Mandating periodic code, component and process refactoring, whereby unneeded legacy code, components and processes are periodically reviewed and removed from the system. This helps keep the applications maintainable and secure. Also instill security as a culture amongst all employees, so that they handle security indicators responsibly.

Thursday, August 28, 2014

Architectural Security aspects of BGP/MPLS

The inherent benefits of MPLS (Multiprotocol Label Switching) have led to its widespread use for providing IP VPN services. With the emerging trend of connected systems, a global enterprise today is well connected with its partners, with MPLS being the preferred choice. Border Gateway Protocol (BGP) is used to interconnect such autonomous systems by exchanging routing information across them. The emergence of Multiprotocol Extensions and other variations of BGP has furthered the adoption of MPLS VPNs. Along the same lines, security concerns about using such networks are also on the rise. Customers' specific security demands are also emerging as they experience data breaches and security incidents.

The objective of this blog is not to explain BGP/MPLS as such; instead, let us examine how BGP/MPLS addresses the typical security requirements. The following sections have been extracted from RFC 4381, published by the Internet Engineering Task Force (IETF) in 2006.


Address Space, Routing, and Traffic Separation

BGP/MPLS allows distinct IP VPNs to use the same address space, which can also be private address space. This is achieved by adding a 64-bit Route Distinguisher (RD) to each IPv4 route, making VPN-unique addresses also unique in the MPLS core. This "extended" address is also called a "VPN-IPv4 address". Thus, customers of a BGP/MPLS IP VPN service do not need to change their current addressing plan. The address space on the CE-PE link (including the peering PE address) is considered part of the VPN address space. Since address space can overlap between VPNs, the CE-PE link addresses can overlap between VPNs. For practical management considerations, SPs typically address CE-PE links from a global pool, maintaining uniqueness across the core.
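
To see why overlapping customer addresses stay distinct in the core, here is a small Python sketch that builds a VPN-IPv4 address by prepending a Route Distinguisher to an IPv4 address. It assumes the type 0 RD layout from RFC 4364 (2-byte type, 2-byte AS number, 4-byte assigned number); the AS number 64512 and the assigned numbers 1 and 2 are made-up values for illustration.

```python
import ipaddress
import struct


def route_distinguisher(asn: int, assigned: int) -> bytes:
    """Encode a type 0 RD: 2-byte type (0), 2-byte AS number, 4-byte assigned number."""
    return struct.pack("!HHI", 0, asn, assigned)


def vpn_ipv4(rd: bytes, address: str) -> bytes:
    """Prepend the 8-byte (64-bit) RD to the 4-byte IPv4 address."""
    if len(rd) != 8:
        raise ValueError("RD must be 8 bytes (64 bits)")
    return rd + ipaddress.IPv4Address(address).packed


if __name__ == "__main__":
    # Two customers using the same private address, each with its own RD.
    a = vpn_ipv4(route_distinguisher(64512, 1), "10.0.0.1")
    b = vpn_ipv4(route_distinguisher(64512, 2), "10.0.0.1")
    print(a.hex(), b.hex(), a != b)
```

Both customers keep 10.0.0.1 in their own addressing plan, while the core only ever sees the two different 12-byte values.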

On the data plane, traffic separation is achieved by the ingress PE pre-pending a VPN-specific label to the packets. The packets with the VPN labels are sent through the core to the egress PE, where the VPN label is used to select the egress VRF. Given the addressing, routing, and traffic separation across a BGP/MPLS IP VPN core network, it can be assumed that this architecture offers in this respect the same security as a layer-2 VPN. It is not possible to intrude from a VPN or the core into another VPN unless this has been explicitly configured. If and when confidentiality is required, it can be achieved in BGP/MPLS IP VPNs by overlaying encryption services over the network. However, encryption is not a standard service on BGP/MPLS IP VPNs.
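
The label-based separation described above can be pictured with a toy Python model. This is only a sketch: the label values, VRF names, and the `LabeledPacket` type are invented for illustration, and real PEs learn label bindings through BGP rather than a hard-coded table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledPacket:
    vpn_label: int
    dst_ip: str
    payload: bytes


# Hypothetical label-to-VRF bindings held by the egress PE.
EGRESS_LABEL_TO_VRF = {100: "vrf-customer-a", 200: "vrf-customer-b"}


def ingress_push(vpn_label: int, dst_ip: str, payload: bytes) -> LabeledPacket:
    """Ingress PE: prepend the VPN-specific label for the customer's VRF."""
    return LabeledPacket(vpn_label, dst_ip, payload)


def egress_select_vrf(pkt: LabeledPacket) -> str:
    """Egress PE: the VPN label, not the destination IP, selects the VRF."""
    if pkt.vpn_label not in EGRESS_LABEL_TO_VRF:
        raise ValueError(f"unknown VPN label {pkt.vpn_label}; packet dropped")
    return EGRESS_LABEL_TO_VRF[pkt.vpn_label]


if __name__ == "__main__":
    # Same destination address in both VPNs, kept apart purely by the label.
    print(egress_select_vrf(ingress_push(100, "10.0.0.1", b"a")))
    print(egress_select_vrf(ingress_push(200, "10.0.0.1", b"b")))
```

Because the destination address plays no part in choosing the VRF, two customers with identical addressing never land in each other's routing table.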

Hiding of the BGP/MPLS IP VPN Core Infrastructure

Service providers and end-customers do not normally want their network topology revealed to the outside. This makes attacks more difficult to execute: If an attacker doesn't know the address of a victim, he can only guess the IP addresses to attack. Since most DoS attacks don't provide direct feedback to the attacker it would be difficult to attack the network. It has to be mentioned specifically that information hiding as such does not provide security. However, in the market this is a perceived requirement. 

With a known IP address, a potential attacker can launch a DoS attack more easily against that device. Therefore, the ideal is to not reveal any information about the internal network to the outside world. This applies to the customer network and the core. A number of additional security measures also have to be taken: most of all, extensive packet filtering. For security reasons, it is recommended for any core network to filter packets from the "outside" (Internet or connected VPNs) destined to the core infrastructure. This makes it very hard to attack the core, although some functionality such as pinging core routers will be lost. Traceroute across the core will still work, since it addresses a destination outside the core.

Being reachable from the Internet automatically exposes a customer network to additional security threats. Appropriate security mechanisms have to be deployed such as firewalls and intrusion detection systems. This is true for any Internet access, over MPLS or direct. A BGP/MPLS IP VPN network with no interconnections to the Internet has security equal to that of FR or ATM VPN networks. With an Internet access from the MPLS cloud, the service provider has to reveal at least one IP address (of the peering PE router) to the next provider, and thus to the outside world.

Resistance to Attacks

To attack an element of a BGP/MPLS IP VPN network, it is first necessary to know the address of the element. The addressing structure of the BGP/MPLS IP VPN core is hidden from the outside world. Thus, an attacker cannot know the IP address of any router in the core to attack. The attacker could guess addresses and send packets to these addresses. However, due to the address separation of MPLS each incoming packet will be treated as belonging to the address space of the customer. Thus, it is impossible to reach an internal router, even by guessing IP addresses.

In the case of a static route that points to an interface, the CE router doesn't need to know any IP addresses of the core network or even of the PE router. This has the disadvantage of needing a more extensive (static) configuration, but is the most secure option. In this case, it is also possible to configure packet filters on the PE interface to deny any packet to the PE interface. This protects the router and the whole core from attack. In all other cases, each CE router needs to know at least the router ID (RID, i.e., peer IP address) of the PE router in the core, and thus has a potential destination for an attack.

A potential attack could be to send an extensive number of routes, or to flood the PE router with routing updates. Both could lead to a DoS, however, not to unauthorised access. To reduce this risk, it is necessary to configure the routing protocol on the PE router to operate as securely as possible. This can be done in various ways: 

  • By accepting only routing protocol packets, and only from the CE router. The inbound ACL on each CE interface of the PE router should allow only routing protocol packets from the CE to the PE. 
  • By configuring MD5 authentication for routing protocols. This is available for BGP (RFC 2385 [6]), OSPF (RFC 2154 [4]), and RIP2 (RFC 2082 [3]), for example. 

This avoids packets being spoofed from other parts of the customer network than the CE router. It requires the service provider and customer to agree on a shared secret between all CE and PE routers. It is necessary to do this for all VPN customers. It is not sufficient to do this only for the customer with the highest security requirements.

It is theoretically possible to attack the routing protocol port to execute a DoS attack against the PE router. This in turn might have a negative impact on other VPNs on this PE router. For this reason, PE routers must be extremely well secured, especially on their interfaces to CE routers. ACLs must be configured to limit access only to the port(s) of the routing protocol, and only from the CE router.
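
The ACL rule described above can be expressed as a small predicate. The Python sketch below assumes BGP (TCP port 179) is the CE-PE routing protocol and models only packets addressed to the PE itself; the `Packet` type and the addresses are hypothetical, and a real deployment would express this in the router's own ACL syntax.

```python
from dataclasses import dataclass

BGP_PORT = 179  # TCP port used by BGP


@dataclass(frozen=True)
class Packet:
    src_ip: str
    protocol: str  # "tcp", "udp", "icmp", ...
    src_port: int
    dst_port: int


def permit_to_pe(pkt: Packet, ce_ip: str) -> bool:
    """Permit a packet addressed to the PE only if it is BGP from the known CE.

    Either side may open the BGP session, so port 179 can appear as the
    source or the destination port.
    """
    is_bgp = pkt.protocol == "tcp" and BGP_PORT in (pkt.src_port, pkt.dst_port)
    return pkt.src_ip == ce_ip and is_bgp


if __name__ == "__main__":
    ce = "192.0.2.1"  # documentation address standing in for the CE router
    print(permit_to_pe(Packet(ce, "tcp", 40000, 179), ce))
    print(permit_to_pe(Packet("192.0.2.99", "tcp", 40000, 179), ce))
    print(permit_to_pe(Packet(ce, "icmp", 0, 0), ce))
```

Everything that is not routing protocol traffic from the expected CE is refused, which is why pinging the PE stops working once such a filter is in place.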

Label Spoofing

Similar to IP spoofing attacks, where an attacker fakes the source IP address of a packet, it is also theoretically possible to spoof the label of an MPLS packet. For security reasons, a PE router should never accept a packet with a label from a CE router. RFC 3031 [9] specifies: "Therefore, when a labeled packet is received with an invalid incoming label, it MUST be discarded, UNLESS it is determined by some means that forwarding it unlabeled cannot cause any harm."

There remains the possibility to spoof the IP address of a packet being sent to the MPLS core. Since there is strict address separation within the PE router, and each VPN has its own VRF, this can only harm the VPN the spoofed packet originated from; that is, a VPN customer can attack only himself. MPLS doesn't add any security risk here. The Inter-AS and Carrier's Carrier cases are special cases, since on the interfaces between providers typically packets with labels are exchanged. See section 4 of RFC 4381 for an analysis of these architectures.


There are a number of precautionary measures outlined above that a service provider can use to tighten security of the core, but the security of the BGP/MPLS IP VPN architecture depends on the security of the service provider. If the service provider is not trusted, the only way to fully secure a VPN against attacks from the "inside" of the VPN service is to run IPsec on top, from the CE devices or beyond. RFC 4381 discusses many aspects of BGP/MPLS IP VPN security. It has to be noted that the overall security of this architecture depends on all components and is determined by the security of the weakest part of the solution.

Saturday, June 1, 2013

Software Quality Attributes: Trade-off analysis

We all know that software quality is not just about meeting the functional requirements, but also about the extent to which the software meets a combination of quality attributes. Building quality software requires careful attention to identifying and prioritizing the quality attributes, and to designing and building the software to adhere to them. Again, going by the saying "you cannot manage what you cannot measure", it is also important to design the software with the ability to collect metrics around these quality attributes, so that the degree to which the end product satisfies each quality attribute can be measured and monitored.

It has always been a challenge for software architects and designers to come up with the right mix of quality attributes with appropriate priorities. This is further complicated because these attributes are highly interlinked: a higher priority on one can have an adverse impact on another. Here is a sample matrix showing the inter-dependencies of some of the software quality attributes.




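                 | Availability | Efficiency | Flexibility | Integrity | Interoperability | Maintainability | Portability | Reliability | Reusability | Robustness | Testability |
Availability     |              |            |             |           |                  |                 |             | +           |             | +          |             |
Efficiency       |              |            | -           |           | -                | -               | -           | -           |             | -          | -           |
Flexibility      |              | -          |             | -         |                  | +               | +           | +           |             | +          |             |
Integrity        |              | -          |             |           | -                |                 |             |             | -           |            | -           |
Interoperability |              | -          | +           | -         |                  |                 | +           |             |             |            |             |
Maintainability  | +            | -          | +           |           |                  |                 |             | +           |             |            | +           |
Portability      |              | -          | +           |           | +                | -               |             |             | +           |            | +           |
Reliability      | +            | -          | +           |           |                  | +               |             |             |             | +          | +           |
Reusability      |              | -          | +           | -         |                  |                 |             | -           |             |            | +           |
Robustness       | +            | -          |             |           |                  |                 |             | +           |             |            |             |
Testability      | +            | -          | +           |           |                  | +               |             | +           |             |            |             |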


While the '+' sign indicates a positive impact, the '-' sign indicates a negative impact. This is only a likely indication of the dependencies, and in reality it could be different. The important takeaway, however, is that there is a need for planning and prioritizing the quality attributes for every software system being designed or built, and the prioritization has to be accomplished keeping in mind the inter-dependencies amongst the quality attributes. This means that some trade-offs have to be made, and the business and IT should be in agreement with these trade-off decisions.
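
One way to put the matrix to work during prioritization is to encode it and ask which of the chosen attributes pull against each other. The Python sketch below transcribes the '+' and '-' entries from the matrix above as row -> {column: sign}; the `conflicts` helper is a hypothetical name, and reading each cell as "raising the row attribute affects the column attribute" is an assumption about how the matrix is meant to be read.

```python
# Entries transcribed from the matrix above: row attribute -> {column attribute: sign}.
MATRIX = {
    "Availability": {"Reliability": "+", "Robustness": "+"},
    "Efficiency": {"Flexibility": "-", "Interoperability": "-", "Maintainability": "-",
                   "Portability": "-", "Reliability": "-", "Robustness": "-",
                   "Testability": "-"},
    "Flexibility": {"Efficiency": "-", "Integrity": "-", "Maintainability": "+",
                    "Portability": "+", "Reliability": "+", "Robustness": "+"},
    "Integrity": {"Efficiency": "-", "Interoperability": "-", "Reusability": "-",
                  "Testability": "-"},
    "Interoperability": {"Efficiency": "-", "Flexibility": "+", "Integrity": "-",
                         "Portability": "+"},
    "Maintainability": {"Availability": "+", "Efficiency": "-", "Flexibility": "+",
                        "Reliability": "+", "Testability": "+"},
    "Portability": {"Efficiency": "-", "Flexibility": "+", "Interoperability": "+",
                    "Maintainability": "-", "Reusability": "+", "Testability": "+"},
    "Reliability": {"Availability": "+", "Efficiency": "-", "Flexibility": "+",
                    "Maintainability": "+", "Robustness": "+", "Testability": "+"},
    "Reusability": {"Efficiency": "-", "Flexibility": "+", "Integrity": "-",
                    "Reliability": "-", "Testability": "+"},
    "Robustness": {"Availability": "+", "Efficiency": "-", "Reliability": "+"},
    "Testability": {"Availability": "+", "Efficiency": "-", "Flexibility": "+",
                    "Maintainability": "+", "Reliability": "+"},
}


def conflicts(priorities: list[str]) -> list[tuple[str, str]]:
    """Return (row, column) pairs among the chosen attributes that are marked '-'."""
    chosen = set(priorities)
    return sorted(
        (row, col)
        for row in chosen
        for col, sign in MATRIX.get(row, {}).items()
        if col in chosen and sign == "-"
    )


if __name__ == "__main__":
    for pair in conflicts(["Efficiency", "Portability", "Reliability"]):
        print(pair)
```

Each pair returned is a trade-off that business and IT would need to agree on explicitly before the design is fixed.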

SEI's Architecture Trade-off Analysis Method (ATAM) provides a structured method to evaluate the trade-off points. The ATAM not only reveals how well an architecture satisfies particular quality goals (such as performance or modifiability), but it also provides insight into how those quality attributes interact with each other, that is, how they trade off against each other. Such design decisions are critical; they have the most far-reaching consequences and are the most difficult to change after a system has been implemented.

A prerequisite of an evaluation is to have a statement of quality attribute requirements and a specification of the architecture with a clear articulation of the architectural design decisions. However, it is not uncommon for quality attribute requirement specifications and architecture renderings to be vague and ambiguous. Therefore, two of the major goals of ATAM are to

  • elicit and refine a precise statement of the architecture’s driving quality attribute requirements 
  • elicit and refine a precise statement of the architectural design decisions

Sensitivity points use the language of the attribute characterizations. So, when performing an ATAM, the attribute characterizations are used as a vehicle for suggesting questions and analyses that guide the evaluation towards potential sensitivity points. For example, the priority of a specific quality attribute might be a sensitivity point if it is a key property for achieving an important latency goal (a response) of the system. It is not uncommon for an architect to answer an elicitation question by saying: “we haven’t made that decision yet”. However, it is important to flag key decisions that have been made as well as key decisions that have not yet been made.

All sensitivity points and tradeoff points are candidate risks. By the end of the ATAM, all sensitivity points and tradeoff points should be categorized as either a risk or a non-risk. The risks/non-risks, sensitivity points, and tradeoffs are gathered together in three separate lists.