Journal #50: Data Analytics

CAPCO JOURNAL #50: DATA ANALYTICS

Data is playing a crucial role in informing decision-making to drive financial institutions forward, and organizations are unlocking hidden value through harvesting, analyzing and managing their data. The papers in this edition demonstrate a growing emphasis on this field, examining such topics as machine learning and AI, regulatory compliance, program implementation, and strategy.

DATA MANAGEMENT: A FOUNDATION FOR EFFECTIVE DATA SCIENCE
Alvin Tan

UNLOCKING VALUE THROUGH DATA LINEAGE
Thadi Murali, Rishi Sanghavi, Sandeep Vishnu

THE CFO OF THE FUTURE
Bash Govender, Axel Monteiro

MACHINE LEARNING FOR ADVANCED DATA ANALYTICS: CHALLENGES, USE-CASES AND BEST PRACTICES TO MAXIMIZE BUSINESS VALUE
Nadir Basma, Maximillian Phipps, Paul Henry, Helen Webb

UNIFYING DATA SILOS: HOW ANALYTICS IS PAVING THE WAY
Luis del Pozo, Pascal Baur

DATA ENTROPY AND THE ROLE OF LARGE PROGRAM IMPLEMENTATIONS IN ADDRESSING DATA DISORDER
Sandeep Vishnu, Ameya Deolalkar, George Simotas

DATA QUALITY IMPERATIVES FOR DATA MIGRATION INITIATIVES: A GUIDE FOR DATA PRACTITIONERS
Gerhard Längst, Jürgen Elsner, Anastasia Berzhanin

ARTIFICIAL INTELLIGENCE AND DATA ANALYTICS: EMERGING OPPORTUNITIES AND CHALLENGES IN FINANCIAL SERVICES

CRISPIN COOMBS | Reader in Information Systems and Head of Information Management Group, Loughborough University
RAGHAV CHOPRA | Loughborough University

Artificial intelligence (AI) systems give financial services firms a new opportunity to develop distinctive capabilities that differentiate them from their peers. Key to this differentiation is the ability to execute business in the most effective and efficient manner and to make the smartest possible business decisions. AI systems can process large amounts of data with levels of accuracy and consistency that humans cannot achieve, providing a route to more accurate predictions and data-driven analytical decision-making. In this paper, we discuss the benefits of AI for improving data analytics and decision-making, current and potential applications of AI within financial services, and operational challenges and potential solutions for AI adoption, and we conclude with requirements for successful adoption of AI systems.

DATA ENTROPY AND THE ROLE OF LARGE PROGRAM IMPLEMENTATIONS IN ADDRESSING DATA DISORDER

SANDEEP VISHNU | Partner, Capco
AMEYA DEOLALKAR | Senior Consultant, Capco
GEORGE SIMOTAS | Managing Principal, Capco

Clutter is a highly pervasive phenomenon. Homeowners are very familiar with this occurrence as their acquisitions grow to fill available space. Closets, garages, basements, and many areas not in obvious sight become dumping grounds for things that do not have immediate utility or a logical place in the house. Now think of a scenario where the volume, velocity, and variety of goods entering the house go up by several orders of magnitude in a very short period of time. The house will simply start to overflow with articles strewn wherever they can fit, with little thought given to order, use, and structure. Enterprises face a similar situation with data, as volumes have grown dramatically over the last two to three years. Organizational reluctance to retire or purge data creates overflowing repositories, dark corners, and storage spaces full of outdated, unseen, and difficult-to-access information, i.e., data clutter. Temporary fixes only add layers to the problem, creating additional waste, maintenance challenges, damage, inefficiency, and improvement impediments. All these factors drive data entropy, which for purposes of this paper is defined as the tendency for data in an enterprise to become increasingly disorderly. Large programs are often data centric and surface data clutter issues. This paper explores the concept of data entropy in today’s world of rapidly expanding data types and volumes entering an organization at exponentially higher speeds, and how large program implementations can be used as catalysts to address data clutter and modernize the data supply chain to streamline data management.

DATA MANAGEMENT: A FOUNDATION FOR EFFECTIVE DATA SCIENCE

ALVIN TAN | Principal Consultant, Capco

Data sourcing and cleansing are often cited by data scientists as among the most critical, yet most time-consuming, aspects of data science. This article examines how data management capabilities, such as data governance and data quality management, can not only reduce the burden of data sourcing and preparation, but also improve quality and trust in the insights delivered by data science. Establishing strong data management capabilities ensures that less time is spent wrangling data to enter into an analytics model and more time is left for actual modeling and identification of actionable business insights. We find that organizations that build analytics data pipelines upon strong data management foundations can extract fuller business value from data science. This provides not only competitive advantage through the insights identified, but also comparative advantage through a virtuous circle of data culture improvements.

DATA QUALITY IMPERATIVES FOR DATA MIGRATION INITIATIVES: A GUIDE FOR DATA PRACTITIONERS

GERHARD LÄNGST | Partner, Capco
JÜRGEN ELSNER | Executive Director, Capco
ANASTASIA BERZHANIN | Senior Consultant, Capco

This article is based on experience gained during a large data migration and business process outsourcing project in 2019. Examining static data linked to approximately 12 million customer records spread across more than 10 source systems led to the early identification of unclean data in approximately 10 percent of the golden source data and resulted in large-scale data remediation efforts that were necessary prior to data migration. Key takeaways and lessons learned about the quality of a financial institution’s customer data are summarized here for the data practitioner, with an emphasis on the methods and techniques used to gain transparency about the overall current state of an institution’s data.

DATA TECHNOLOGIES AND NEXT GENERATION INSURANCE OPERATIONS

IAN HERBERT | Senior Lecturer in Accounting and Financial Management, School of Business and Economics, Loughborough University
ALISTAIR MILNE | Professor of Financial Economics, School of Business and Economics, Loughborough University
ALEX ZARIFIS | Research Associate, School of Business and Economics, Loughborough University

This article uses insights from knowledge management to describe and contrast two approaches to the application of artificial intelligence and data technologies in insurance operations. The first focuses on the automation of existing processes using robotic process automation (RPA). Knowledge is codified, routinized, and embedded in systems. The second focuses on using cognitive computing (AI) to support data-driven human decision-making based on tacit knowledge. These approaches are complementary, and their successful execution depends on a fully developed organizational data strategy. Four cases are presented to illustrate specific applications and data that are being used by insurance firms to effect change of this kind.

MACHINE LEARNING FOR ADVANCED DATA ANALYTICS: CHALLENGES, USE-CASES AND BEST PRACTICES TO MAXIMIZE BUSINESS VALUE

NADIR BASMA | Associate Consultant, Capco
MAXIMILLIAN PHIPPS | Associate Consultant, Capco
PAUL HENRY | Associate Consultant, Capco
HELEN WEBB | Associate Consultant, Capco

As the amount of data produced and stored by organizations increases, the need for advanced analytics to turn this data into meaningful business insights becomes crucial. One such technique is machine learning, a wide set of tools that builds mathematical models with minimal human decision-making. Although machine learning has the potential to be immensely powerful, it requires well-considered planning and the engagement of key business stakeholders. The type of machine learning used will be determined by the business question the organization is trying to answer, as well as the type and quality of data available. Throughout the development process, ethical considerations and explainability need to be addressed by all team members. In this paper, we present some of the challenges of implementing a machine learning project and the best practices to mitigate these challenges.

NATURAL LANGUAGE UNDERSTANDING: RESHAPING FINANCIAL INSTITUTIONS’ DAILY REALITY

BERTRAND K. HASSANI | Université Paris 1 Panthéon-Sorbonne, University College London, and Partner, AI and Analytics, Deloitte

In the past, the data that financial institutions captured and used to understand customers, processes, risks, and, more generally, their environment was mainly structured, i.e., stored in “rigid” databases. Today, that is no longer the case: so-called structured data represents no more than a drop in an ocean of information. The objective of this paper is to present and discuss the opportunities offered by natural language processing (NLP) and natural language understanding (NLU) to analyze unstructured data and automate its treatment. NLP and NLU are essential to understanding and analyzing banks’ internal way of functioning and customer needs in order to bring as much value as possible to the firm and the clients it serves. Consequently, though we will briefly describe some algorithms and explain how to implement them, we will focus on the opportunities offered as well as the drawbacks and pitfalls to avoid in order to make the most of these methodologies.

SYNTHETIC FINANCIAL DATA: AN APPLICATION TO REGULATORY COMPLIANCE FOR BROKER-DEALERS

J. B. HEATON | One Hat Research LLC
JAN HENDRIK WITTE | Honorary Research Associate in Mathematics, University College London

The hype of big data has not escaped the investment management industry, although the reality is that price data from U.S. financial markets are not really big data; price data are small data. The fact that sellers and advisors in financial markets use small data to generate and test investment strategies creates two major problems. First, the economic mechanisms that generate prices (and, therefore, returns) may change through time, so that historical data from an earlier time may tell us little or nothing about future prices and returns. Second, even if data-generating mechanisms are somewhat stable through time, inferences about the profitability of investment strategies may be sensitive to a handful of outliers in the data that get picked up again and again in different strategies mined from the same small data set. In this article, we present an answer to the financial small data problem: using machine learning methods to generate ‘synthetic’ financial data. The essential part of our approach to developing synthetic data is the use of machine learning methods to generate data that might have been generated by financial markets but was not. Synthetic price and return data have numerous uses, including testing new investment strategies and helping investors plan for retirement and other personal investment goals with more realistic future return scenarios. In this article, we focus on a particularly important use of synthetic data: meeting legal and regulatory requirements such as best interest and fiduciary requirements.

THE BIG GAP BETWEEN STRATEGIC INTENT AND ACTUAL, REALIZED STRATEGY

HOWARD YU | LEGO Professor of Management and Innovation, IMD Business School
JIALU SHAN | Research Fellow, IMD Business School

Most executives know what needs to get done, but there is always a gap between intention and the realized strategy of the firm. We investigated three industries (the automotive, banking, and consumer goods sectors) and showed how some companies can close this knowing-doing gap and beat the competition. We relied on hard market data and ranked companies based on the likelihood that they acquire new knowledge in their efforts to prepare for the future. These findings can be generalized to other sectors, providing a set of important lessons for managers at large.

THE CFO OF THE FUTURE

BASH GOVENDER | Managing Principal, Capco
AXEL MONTEIRO | Principal Consultant, Capco

Finance departments of the major financial services organizations (FSOs) have undergone dramatic changes since the great crash of 2008. They have had to cut costs severely while still supporting an expanding portfolio of new regulatory and business requirements. As a result, they have been unable to fully benefit from the innovation boom of the past decade. In order to get a better understanding of the perspectives of the Chief Financial Officers (CFOs) of major FSOs on the current and potential operating models of the finance department, we interviewed a number of CFOs and finance executives across Europe and North America. We found that while the past decade has been tough on these departments, the future can be bright should they be able to institute the necessary digital innovations that the other departments and organizations have benefitted from.

UNIFYING DATA SILOS: HOW ANALYTICS IS PAVING THE WAY

LUIS DEL POZO | Managing Principal, Capco
PASCAL BAUR | Associate Consultant, Capco

This article looks at the ongoing issues associated with fragmented data silos, a problem exacerbated by the ever-increasing amount of data that enterprises must deal with. We highlight the need for unifying data silos and how analytics could help investment firms transform themselves. To support our propositions, we provide a number of real-world examples from investment firms on such journeys. In addition, we provide a roadmap for firms currently on a data analytics-driven transformation journey.

UNLOCKING VALUE THROUGH DATA LINEAGE

THADI MURALI | Principal Consultant, Capco
RISHI SANGHAVI | Senior Consultant, Capco
SANDEEP VISHNU | Partner, Capco

Data and information lifecycle management challenges in a financial services organization (FSO) can be daunting, especially when they relate to data security, integrity, or availability. Large FSOs recognize this and are willing to make investments to address IT cyber risk, data management, and data governance, specifically when the payoff is clearly articulated. In a world of big data, current techniques for information risk and control assessment fall woefully short as they do not provide adequate visibility around data nor do they assist the business in decision making. Data lineage can fill this gap. Thus far, data lineage has largely been directed towards regulatory initiatives focused on risk and finance. However, the broader business use of data lineage is relatively unexplored, in part due to a lack of industry standards or methodologies to guide organizations to realize the full potential of data lineage. This article explores how data lineage standards and patterns can drive substantial value beyond regulatory compliance by holistically considering control optimization and cost reduction.

USING BIG DATA ANALYTICS AND ARTIFICIAL INTELLIGENCE: A CENTRAL BANKING PERSPECTIVE

OKIRIZA WIBISONO | Big Data Analyst, Bank Indonesia
HIDAYAH DHINI ARI | Head of Digital Data Statistics and Big Data Analytics Development Division, Bank Indonesia
ANGGRAINI WIDJANARTI | Big Data Analyst, Bank Indonesia
ALVIN ANDHIKA ZULEN | Big Data Analyst, Bank Indonesia
BRUNO TISSOT | Head of Statistics and Research Support, BIS, and Head of the IFC Secretariat

Information and internet technology have fostered new web-based services that affect every facet of today’s economic and financial activity. For their part, central banks face a surge in “financial big datasets”, reflecting the combination of new, rapidly developing electronic footprints as well as large and growing financial, administrative, and commercial records. This phenomenon has the potential to strengthen analysis for decision-making by providing more complete, immediate, and granular information as a complement to “traditional” macroeconomic indicators. To this end, a number of techniques are being developed, often referred to as “big data analytics” and “artificial intelligence”. However, getting the most out of these new developments is no trivial task. Central banks, like other public authorities, face numerous challenges, especially in handling these new data and using them for policy purposes. This paper covers three main topics in discussing these issues: the main big data sources and associated analytical techniques that are relevant for central banks, the type of insights that can be provided by big data, and how big data is actually used in crafting policy.