While data has always been crucial to the financial services industry, in this era of digitalization the push to source, maintain and analyze ever greater volumes of data is being driven by both regulatory requirements and business opportunities. Investment managers, asset managers, banks and treasurers all have an interest in quickly accessing and analyzing market data to find sources of alpha.
To merge large datasets, various external data sources were traditionally moved onto a proprietary server or data warehouse. Moving the data was not only time consuming, but also posed operational risks: errors, incomplete data sources, single points of failure and cumbersome error-handling techniques, all requiring monitoring and maintenance by IT operations.
Today, the world of data analytics has turned a corner with the advent of cloud-based data solutions that revolve around sharing data rather than moving it. Taking advantage of almost unlimited cloud storage and compute, data warehouses can now be assembled from the equivalent of cloud 'Lego bricks', ultimately delivering cloud-native products.
In the past, meaningful reporting and decision-making required actors in the financial domain to gather market data and combine it with their own data. Traditionally, this meant buying data from third-party data vendors such as Refinitiv, Bloomberg or MSCI. At first, these feeds were imported into vendor-specific databases, each taking a different form (Refinitiv, for example, used one specific data server vendor while others used alternatives). Later, firms would import data feeds directly into their own proprietary database solutions, typically via a daily batch upload.
Handling data in this manner raised a number of issues, however. Databases constantly had to be reconciled, leaving perpetual doubt as to whether the data could be fully trusted. Monitoring was necessary, and error handling became an onerous task. If a delta file went missing, an entire dataset of millions of records would have to be reloaded over the weekend, since there was not enough time to do so overnight.
Third-party vendors eventually realized the potential of the cloud, but at first this came with its own challenges. While the cloud solved some of the problems described above, it created a whole host of new issues. Flexibility and scalability suffered, because organizations had to pre-determine how much data they would use, how frequently, and what type of network to deploy. Organizations that chose to build their own data warehouse on the principle of sharing data with others faced the same set of questions, along with additional security considerations.
To help alleviate these data challenges, new cloud-based technologies were introduced. These new data warehouses, built directly on the cloud, can host extremely large datasets from various sources in one common store, which can then be shared and accessed simultaneously by all users. The big names providing these solutions are Snowflake, Amazon Redshift, Azure Synapse Analytics and Google BigQuery.
These cloud data solutions were developed to deal with two main problems. Firstly, the architecture is specifically designed to handle very large storage volumes while providing a separate query-processing layer. Secondly, they introduce an innovative pricing model based solely on the compute actually consumed, which is far more transparent and better suited to the bursty demand of analytics during normal business hours.
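To make the separation of storage and compute concrete, here is a minimal sketch using Snowflake and the snowflake-connector-python package; the account identifier, credentials, warehouse name and table names are all hypothetical placeholders, not a definitive implementation. A virtual warehouse (compute) is created independently of storage and suspends itself when idle, so costs accrue only while queries actually run:

```python
# A minimal sketch, assuming a Snowflake account and the
# snowflake-connector-python package. All names and credentials
# below are hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_org-my_account",   # hypothetical account identifier
    user="analyst",
    password="...",                # use key-pair auth or SSO in practice
)
cur = conn.cursor()

# Compute is provisioned separately from storage: a 'virtual warehouse'
# is a resizable compute cluster billed only while it is running.
cur.execute("""
    CREATE WAREHOUSE IF NOT EXISTS analytics_wh
      WAREHOUSE_SIZE = 'XSMALL'
      AUTO_SUSPEND = 60      -- suspend after 60s idle: no idle cost
      AUTO_RESUME = TRUE     -- wake automatically on the next query
""")
cur.execute("USE WAREHOUSE analytics_wh")

# The query runs against shared cloud storage; only the compute
# seconds consumed here are charged.
cur.execute("""
    SELECT instrument_id, AVG(price) AS avg_price
    FROM market_data.prices.eod_quotes   -- hypothetical shared table
    WHERE trade_date >= '2023-01-01'
    GROUP BY instrument_id
""")
for row in cur.fetchmany(10):
    print(row)

cur.close()
conn.close()
```

The AUTO_SUSPEND and AUTO_RESUME settings are what make the pricing demand-driven: the warehouse sleeps outside business hours and wakes on the next query.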
There are a number of benefits to cloud-based data solutions.
From a business perspective, the architecture opens up a realm of new commercial possibilities. Business partnerships built around shared common datasets are one good example. For instance, Volvo Cars, Mercedes-Benz and BMW could all decide to contribute to a common dataset in the cloud. Rather than each having to build its own dataset, all would benefit from one source and solution, with each party ensuring the quality and content of the information.
Another example would be a custody bank holding all assets and related data for an asset manager. Instead of sending the agreed data to the asset manager by file transfer, message queue or API, the ability to access the same dataset in the cloud would remove the need to transfer and reconcile the data.
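As a hedged sketch of how such an arrangement could look using Snowflake's secure data sharing (all account, database and table names below are hypothetical), the provider publishes a share and the consumer mounts it as a read-only database, querying the provider's data live with no file transfer and nothing to reconcile:

```python
# A minimal sketch of Snowflake secure data sharing via the
# snowflake-connector-python package. All account, database and
# table names are hypothetical.
import snowflake.connector

# --- Provider side (e.g. the custody bank) --------------------------
provider = snowflake.connector.connect(
    account="custody_bank_account", user="admin", password="..."
)
pcur = provider.cursor()
# Create a share and grant read access to the relevant objects.
pcur.execute("CREATE SHARE IF NOT EXISTS holdings_share")
pcur.execute("GRANT USAGE ON DATABASE custody TO SHARE holdings_share")
pcur.execute("GRANT USAGE ON SCHEMA custody.positions TO SHARE holdings_share")
pcur.execute(
    "GRANT SELECT ON TABLE custody.positions.daily_holdings "
    "TO SHARE holdings_share"
)
# Make the share visible to the asset manager's Snowflake account.
pcur.execute("ALTER SHARE holdings_share ADD ACCOUNTS = asset_mgr_account")

# --- Consumer side (the asset manager) ------------------------------
consumer = snowflake.connector.connect(
    account="asset_mgr_account", user="analyst", password="..."
)
ccur = consumer.cursor()
# Mount the share as a read-only database: no copy, no file transfer.
ccur.execute(
    "CREATE DATABASE IF NOT EXISTS custody_live "
    "FROM SHARE custody_bank_account.holdings_share"
)
# Both sides now query the same single dataset live,
# so there is nothing to reconcile.
ccur.execute("SELECT * FROM custody_live.positions.daily_holdings LIMIT 10")
print(ccur.fetchall())
```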
Confronted with the need to deal with ever more data, financial services providers are turning towards cloud-native data storage solutions and doing away with the need to reconcile data. Because cloud data storage delivers a higher degree of data integrity, financial services providers are now able to make better-informed decisions, with confidence.
SkySparc is a certified Snowflake partner, helping firms to develop data cloud solutions. Find out more here.