“Rrrrrrr”. No, I am not cold or angry. I’m teaching Siri to pronounce my name: Jeroen Aumand. After a few attempts, I settle for second best – an anglicised [Jeron] instead of [Yuron]. Then the app asks for help with my last name and I feel a sense of doom creeping up on me. We are told constantly that speech technology will revolutionise financial services, but beyond turning the lights on in the office, how can we use it in finance?
Today, a number of speech tools have the potential to enhance trading desk operations, market surveillance and cyber security. These include voice identification, natural language understanding and speech-to-text transcription.
Traders will benefit from technology that instantly processes speech into market quotes and prices. Electronic trading already responds automatically, in a limited way, to the words of news broadcasters, industry bodies and regulators. In use today at news agencies and some hedge funds, these solutions are on their way to becoming mainstream for other capital markets players. In the near future, we can expect early adopters to commoditise speech technology and offer it as an additional service to their high-frequency trading clients.
Speech technology can also be used to record live order information in the trade capture window during a trade negotiation. The trader would then validate the order details to complete the trade, increasing the processing speed of voice OTC trades.
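To make the capture-then-validate idea concrete, here is a minimal Python sketch. It assumes the speech-to-text step has already produced a transcript, and that orders are spoken in a hypothetical "<buy|sell> <quantity> <instrument> at <price>" pattern; the `OrderDraft` fields and the `draft_order` helper are illustrative names, not part of any real trade capture system.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderDraft:
    side: str
    quantity: int
    instrument: str
    price: float
    validated: bool = False  # only the trader flips this, never the parser


# Hypothetical spoken pattern: "<buy|sell> <quantity> <instrument> at <price>"
ORDER_PATTERN = re.compile(
    r"\b(?P<side>buy|sell)\s+(?P<qty>\d+)\s+"
    r"(?P<instrument>[a-z][a-z ]*?)\s+at\s+(?P<price>\d+(?:\.\d+)?)\b",
    re.IGNORECASE,
)


def draft_order(transcript: str) -> Optional[OrderDraft]:
    """Pre-fill an order from a transcript, or return None if nothing matches."""
    match = ORDER_PATTERN.search(transcript)
    if match is None:
        return None
    return OrderDraft(
        side=match["side"].lower(),
        quantity=int(match["qty"]),
        instrument=match["instrument"].strip().upper(),
        price=float(match["price"]),
    )


draft = draft_order("ok so I buy 500 vodafone at 101.5 then")
print(draft)
```

The draft stays unvalidated until the trader confirms it, which mirrors the workflow described above: the software fills in the window, the human completes the trade.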
Traders and sales staff can leverage Computer Voice Stress Analysers (CVSAs) and voice inflection software to support negotiations and manage client relationships. CVSAs can identify stress levels in our speech and are already used by call centres to improve client interactions and help operators deal better with callers. Voice inflection software helps speakers sound more positive when talking to clients.
Listening to recordings takes time, and a lot of it. But finding the right conversation in the first place can feel like searching for a needle in a haystack.
Search is an important function of speech technology. It is already used in market surveillance to narrow down the set of conversations being analysed, using both voice and keyword identification. Additionally, speech-to-text provides a transcript that significantly reduces listening time by letting the reviewer jump directly to the relevant part of the conversation.
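As a rough illustration of how a transcript shortens the search, the Python sketch below filters timestamped transcript segments by keyword and, optionally, by speaker. The `Segment` structure and the `find_mentions` function are assumptions made for this example; real surveillance tools will have their own data models and far more robust matching.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Segment:
    start_seconds: float  # offset of the segment within the recording
    speaker: str          # label assumed to come from voice identification
    text: str             # output of the speech-to-text step


def find_mentions(
    segments: Iterable[Segment],
    keywords: Iterable[str],
    speaker: Optional[str] = None,
) -> List[Segment]:
    """Return the segments that contain any keyword, optionally for one speaker."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for seg in segments:
        if speaker is not None and seg.speaker != speaker:
            continue
        # Crude tokenisation: split on whitespace and trim common punctuation.
        words = {w.strip(".,!?;:").lower() for w in seg.text.split()}
        if wanted & words:
            hits.append(seg)
    return hits


call = [
    Segment(0.0, "trader", "Morning, how are you?"),
    Segment(42.0, "client", "Can we talk about the Libor fixing?"),
    Segment(97.5, "trader", "Let me call you back."),
]
for hit in find_mentions(call, ["libor"]):
    print(f"jump to {hit.start_seconds}s: {hit.text}")
```

The reviewer can then open the recording at the reported offset instead of listening from the beginning.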
When speech technology is combined with Artificial Intelligence (AI), recorded conversations can be linked to trades to assess best execution and identify illicit trading. AI can also scan recordings for signs of fraudulent or suspicious conversations. This reinforces the quality of market surveillance and demonstrates banks’ commitment to preventing fraud and minimising financial and reputational risk.
Voice identification is a powerful tool against social engineering, where fraudsters impersonate someone, often a bank employee or a client, to obtain confidential information. On the rise in 2015, social engineering is best combatted by educating employees. But with voice identification, financial institutions can also identify and review all past conversations for possible fraudulent or criminal activities. This can lead to a better understanding of how social engineers operate and enable banks to create effective training tools and campaigns for employees. In addition, the voice profiles of known offenders could be stored and shared with relevant organisations to prevent future attacks, increasing overall security within the industry. Voice identification is already used by some UK, US and Spanish banks to identify customers, instead of passwords or security questions.
For trading, speech technology needs to be both accurate and contextual. For example, it needs to distinguish between homophones such as ‘buy’ and ‘by’. This introduces the notion of contextual risk, whereby speech technology is also required to identify sarcastic or humorous statements. One example of speech technology that addresses contextual intelligence is Hound, which can remember and change preliminary inputs. Later-stage solutions will also need to account for cultural differences in the delivery of a message, such as accents, dialects, words borrowed from other languages and cultural references.
Another challenge is identifying the tonality of a message. IBM Watson, for instance, provides a tone analyser that scores words with either a positive or a negative value to detect and interpret emotional and social nuances. But electronic markets always depend on speed, and a tone analyser whose language models predict the probability of language expression adds to the overall response time. In addition, after interpreting a message there’s still the need to rate the relevance of the information and ‘calculate’ its impact on the market.
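The basic idea of scoring words as positive or negative can be sketched in a few lines of Python. This is a toy illustration only: the lexicon below is invented for the example, and it is not how IBM Watson or any commercial tone analyser works internally.

```python
# Hypothetical lexicon: +1 for positive words, -1 for negative ones.
TONE_LEXICON = {
    "strong": 1,
    "growth": 1,
    "beat": 1,
    "weak": -1,
    "loss": -1,
    "default": -1,
}


def tone_score(message: str) -> int:
    """Sum the lexicon values of the words in a message; unknown words count as 0."""
    words = [w.strip(".,!?;:").lower() for w in message.split()]
    return sum(TONE_LEXICON.get(w, 0) for w in words)


print(tone_score("Strong growth despite a small loss."))  # 1 + 1 - 1 = 1
print(tone_score("Weak quarter, default risk"))           # -1 - 1 = -2
```

A word-by-word sum like this is fast, but it ignores context, which is exactly where the sarcasm, humour and relevance problems described above come in.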
It is clear that speech technology is already here in financial services, improving speed, accuracy and quality of processing, and enhancing customer service. There’s still an appetite for greater efficiencies, not least the ability to support (not replace) good decision making. Although that enthusiasm is tempered by scepticism, especially around security, speech technology’s presence is undeniably set to grow, with opportunities for innovation across capital markets, other financial sectors and other industries.
Jeroen Aumand is a Senior Consultant based in Capco’s London office. He has over a decade of experience in trading, sales and market surveillance for equity derivatives.