A recent IDC forecast shows that the Big Data technology and services market will grow at a 26.4% compound annual growth rate to $41.5 billion through 2018, or about six times the growth rate of the overall information technology market. Additionally, by 2020 IDC believes that line of business buyers will help drive analytics beyond its historical sweet spot of relational (performance management) to the double-digit growth rates of real-time intelligence and exploration/discovery of the unstructured worlds.
This predicted growth is expected to have significant impact on all organizations, be it small, medium or large, which include exchanges, banks, brokers, insurers, data vendors and technology and services suppliers. This also extends beyond the organization with the increasing focus on rules and regulations designed to protect a firm’s employees, customers and shareholders as well as the economic wellbeing of the state in which the organization resides. This pervasive use and commercialization of big data analytical technologies is likey to have far reaching implications in meeting regulatory obligations and governance related activities.
Certain disruptive technologies such as complex event processing (CEP) engines, machine learning, and predictive analytics using emerging big-data technologies such as Hadoop, in-memory, or NoSQL illustrate a trend in how firms are approaching technology selection to meet regulatory compliance requirements. A distinguishing factor between big data analytics and regular analytics is the performative nature of Big Data and how it goes beyond merely representing the world but actively shapes it.
Analytics and Performativity
Regulators are staying on top of the big data tools and technologies and are leveraging the tools and technologies to search through the vast amount of organizational data both structured and unstructured to prove a negative. This forces the organizations to use the latest and most effective forms of analytics and thus avoid regulatory sanctions and stay compliant. Analytical outputs may provide a basis for strategic decision making by regulators, who may refine and adapt regulatory obligations accordingly and then require firms to use related forms of analytics to test for compliance. Compliance analytics are not simply reporting on practices but also shaping them through accelerated decision making changing strategic planning from a long term top down exercise to a bottom up reflexive exercise. Due to the 'automation bias' or the underlying privileged nature of the visualization algorithms, compliance analytics may not be neutral in the data and information they provide and the responses they elicit.
Technologies which implement surveillance and monitoring capabilities may also create self-disciplined behaviours through a pervasive suspicion that individuals are being currently observed or may have to account for their actions in the future. The complexity and heterogeneity of underlying data and related analytics provides a further layer of technical complexity to banking matters and so adds further opacity to understanding controls, behaviours and misdeeds.
Design decisions are embedded within technologies shaped by underlying analytics and further underpinned by data. Thus, changes to part of the systems may cause a cascading effect on the outcome. Data accuracy may also act to unduly influence outcomes. This underscores the need to understand big data analytics at the level of micro practice and from the bottom up.
Information Control and Privacy
The collection and storage of Big Data, raises concerns over privacy. In some cases, the uses of Big Data can run afoul of existing privacy laws. In all cases, organizations risk backlash from customers and others who object to how their personal data is collected and used. This can present a challenge for organizations seeking to tap into Big Data’s extraordinary potential, especially in industries with rigorous privacy laws such as financial services and healthcare. Some wonder if these laws, which were not developed with Big Data in mind, sufficiently address both privacy concerns and the need to access large quantities of data to reach the full potential of the new technologies.
The challenges to privacy arise because technologies collect so much data and analyze them so efficiently that it is possible to learn far more than most people had predicted or can predict . These challenges are compounded by limitations on traditional technologies used to protect privacy. The degree of awareness and control can determine information privacy concerns; however, the degree may depend on personal privacy risk tolerance. In order to be perceived as being ethical, an organization must ensure that individuals are aware that their data is being collected, and they have control of how their data is used. As data privacy regulations impose increasing levels of administration and sanctions, we expect policy makers at the global level to be placed under increased pressure to mitigate regulatory conflicts and multijurisdictional tensions between data privacy and financial services’ regulations.
Technologies such as social media or cloud computing facilitate data sharing across borders, yet legislative frameworks are moving in the opposite direction towards greater controls designed to prevent movement of data under the banner of protecting privacy. This creates a tension which could be somewhat mediated through policy makers’ deeper understanding of data and analytics at a more micro level and thereby appreciate how technical architectures and analytics are entangled with laws and regulations.
The imminent introduction of data protection laws will further require organizations to account for how they manage information, requiring much more responsibility from data controllers. Firms are likely to be required to understand the privacy impact of new projects and correspondingly assess and document perceived levels of intrusiveness.
Implementing an Information Governance Strategy
The believability of analytical results when there is limited visibility into trustworthiness of the data sources is one of the foremost concern that an end user will have. A common challenge associated with adoption of any new technology is walking the fine line between speculative application development, assessing pilot projects as successful, and transitioning those successful pilots into the mainstream. The enormous speeds and amount of data processed with Big Data technologies can cause the slightest discrepancy between expectation and performance to exacerbate quality issues. This may be further compounded by Metadata complications when conceiving of definitions for unstructured and semi-structured data.
This necessitates the organizations to work towards developing an enterprise wide information governance strategy with related policies. The governance strategy shall encompass continued development & maturation of processes and tools for data quality assurance, data standardization, and data cleansing. The management of meta-data and its preservation, so that it can be evidenced to regulators and courts, should lso be considered when formulating strategies and tactics. The policies should be high-level enough to be relevant across the organization while allowing each function to interpret them according to their own circumstances.
Outside of regulations expressly for Big Data, lifecycle management concerns for Big Data are fairly similar to those for conventional data. One of the biggest differences, of course, is in providing needed resources for data storage considering the rate at which the data grows. Different departments will have various lengths of time in which they will need access to data, which factors into how long data is kept. Lifecycle principles are inherently related to data quality issues as well, since such data is only truly accurate once it has been cleaned and tested for quality. As with conventional data, lifecycle management for Big Data is also industry specific and must adhere to external regulations as such.
Security issues must be part of an Information Governance strategy whichwill require current awareness of regulatory and legal data securityobligations so that a data security approach can be developed based on repeatable and defensible best practices.