The internet. One giant platform with an endless number of disparate data sources and datasets. It's no wonder it is so hard to source statistics or facts you can trust. Take data usage: Domo estimated people use 1.7MB a second, while BroadbandSearch estimated Americans generate approximately 0.16MB a second. Who is correct? Who is in a position to ask such a question?
Spin doctors can have a field day with situations like this, because they can pick and choose which datasets they tell stories from, stories that could easily be disproved if those datasets were properly integrated. This raises the question: what decisions are you, your executives, and your organisation making based on incomplete data?
Globally, you can understand how hard it can be for people to get accurate information from the internet. Within an organisation, you would expect sourcing trusted information to be much easier. While the scale is in most cases far smaller, data silos, differing language, and the explosion of microservices are just some of the complexity multipliers that make life difficult for those seeking to create value for customers from data.
Tom Hallam, Jade's Director of Technology, recently featured on one of our podcasts to discuss one of the remedies for the issues described above, which we will now explore in more depth.
What is a data fabric?
Historically, and also currently for a number of businesses, data sits largely in disparate stores across an organisation. One consequence of this is that analysts can come to very different conclusions when working with data from these various siloed data stores. Confusion reigns. Businesses have typically addressed this by implementing a data lake, which can work very effectively. Unfortunately, in the process of pooling this data, the context in which it was originally designed and captured is often lost. It ends up being closer to a data swamp.
Data fabrics, also referred to as data mesh, enables frictionless access and sharing of data across an organisation or ecosystem, through a single and consistent data management framework. Importantly, this sharing is seamless and connects previously siloed data stores, to give people across the organisation complete access to data, with few or no holes. With the right intention and approach, data fabrics is enabled by good organisational design.
Data fabrics in a financial services environment
David MacKeith, Richard Caven, Richard Nicholson, and the team at AWS have provided a great example of how data fabrics works in the financial services sector, paraphrased slightly below.
Data fabrics provides a promising approach to solving regulatory reporting problems. It addresses the data ownership, governance, and lineage issues associated with regulatory environments. In a data fabrics approach, each data producer (such as a bank) independently maintains and updates its published data. Only when the bank chooses to “publish” a new version of the dataset are changes made visible to subscribed data consumers (such as the financial regulator). Each producer controls the structure of each published dataset, and this structure is described by a data schema.
Meanwhile, the data consumers may gather published data from multiple data producers (such as from each bank under the regulator’s jurisdiction). The consumer may then use a diverse set of cloud technologies to populate data lakes or data warehouses as required. Hence, in a flexible and cost-efficient manner, a regulatory data mesh enables regulated entities to “bridge the gap” between their people, processes, and the systems that produce the data.
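To make the publish-and-subscribe idea above more concrete, here is a minimal sketch in Python. It is illustrative only and is not taken from the AWS example: the names DataProduct, PublishedVersion, and gather are hypothetical, and the schema is assumed to be a simple mapping of field name to type name. It shows a producer editing a draft that consumers cannot see until a version is published.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PublishedVersion:
    """An immutable snapshot that consumers are allowed to see."""
    version: int
    schema: dict  # field name -> type name, controlled by the producer
    rows: tuple   # frozen copy of the rows at publish time


@dataclass
class DataProduct:
    """A dataset owned by one producer (hypothetical example: a bank)."""
    producer: str
    schema: dict
    _draft: list = field(default_factory=list)
    _published: list = field(default_factory=list)

    def update(self, row: dict) -> None:
        """Producer-side change; invisible to consumers until publish()."""
        missing = set(self.schema) - set(row)
        if missing:
            raise ValueError(f"row is missing fields: {sorted(missing)}")
        self._draft.append(dict(row))

    def publish(self) -> PublishedVersion:
        """Freeze the current draft as a new version for subscribers."""
        snapshot = PublishedVersion(
            version=len(self._published) + 1,
            schema=dict(self.schema),
            rows=tuple(dict(r) for r in self._draft),
        )
        self._published.append(snapshot)
        return snapshot

    def latest(self):
        """Consumer-side read: the most recent published version, if any."""
        return self._published[-1] if self._published else None


def gather(products):
    """A consumer (such as a regulator) collects the latest published rows."""
    combined = []
    for product in products:
        snapshot = product.latest()
        if snapshot is None:
            continue  # nothing published yet, so nothing is visible
        for row in snapshot.rows:
            combined.append({"producer": product.producer, **row})
    return combined
```

In this sketch, a row added with update() only appears in gather() after the producer calls publish(), which mirrors the ownership boundary described above. A real implementation would add storage, access control, and schema evolution, none of which is modelled here.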
Important considerations for data fabrics
Leading businesses want to become data-driven, and many are already moving that way. Doing so enables them to avoid opinion-based decisions, streamline processes, adopt automation, and ultimately improve outcomes for both the business and the customer. Data fabrics is one of the best ways to achieve this, and here are some important considerations for any organisation contemplating such a move.
Design to respect context.
This first consideration is about designing teams around the problems they're solving as a department, rather than the jobs they do. This means moving to cross-functional teams that work closely together, rather than having engineers, designers, and analysts separated from one another, sometimes even physically sitting in different offices, buildings, or time zones.
Businesses should never underestimate the value of contextual experience. Developers, designers, product owners, and business analysts working in a particular area of the business build up knowledge and experience that pertains to their environment. From a data governance perspective, they are closest to the data and technology they work on, so if there are changes, they know what was changed and why. When you have teams based on competencies operating far from the coal face, they have the potential to miss nuances or factors that could create significant uplift for the business.
Expose data to the whole organisation.
For data to have the most value, it needs to be accessible to everyone, as needed. This requires some form of API layer that sits on top of these departments (such as GraphQL), which can unite and simplify the varying standards and naming conventions that each of the data stores has. Keep in mind that the more microservices an organisation embraces, the more variation there will be. An owner will need to be assigned to maintain this layer, which in our experience is often the platform team.
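As a rough illustration of what such a layer does, the sketch below uses plain Python rather than GraphQL itself. Everything in it is a hypothetical assumption: the two department stores, their field names, and the get_customer entry point. It shows one place where store-specific naming conventions are translated into a shared vocabulary before data is handed back to the rest of the organisation.

```python
# Hypothetical department stores, each with its own naming convention.
SALES_STORE = {"C-1": {"cust_nm": "Ada Lovelace", "tot_spend": 1200.0}}
SUPPORT_STORE = {"C-1": {"CustomerName": "Ada Lovelace", "OpenTickets": 2}}

STORES = {"sales": SALES_STORE, "support": SUPPORT_STORE}

# Mapping from each store's field names to the shared vocabulary.
FIELD_MAPS = {
    "sales": {"cust_nm": "customer_name", "tot_spend": "total_spend"},
    "support": {"CustomerName": "customer_name", "OpenTickets": "open_tickets"},
}


def to_common(source: str, record: dict) -> dict:
    """Rename a store-specific record into the shared field names."""
    mapping = FIELD_MAPS[source]
    # Fields without an agreed common name are left out of the shared view.
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def get_customer(customer_id: str) -> dict:
    """Single entry point: merge what every store knows about a customer."""
    view = {"customer_id": customer_id}
    for source, store in STORES.items():
        record = store.get(customer_id)
        if record is not None:
            view.update(to_common(source, record))
    return view
```

Calling get_customer("C-1") returns one record with customer_name, total_spend, and open_tickets, regardless of how each department named those fields. In a GraphQL setup, the mapping would typically live in resolvers behind a single schema, but the ownership question is the same: someone has to maintain the mapping as stores change.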
Treat your data like a product.
Implementing data fabrics is easier when organisations embrace data as a product. Doing so requires documentation, which needs to be evangelised around the business. Product-led businesses invest in continual improvement, so the quality of that data is always rising. The functionality, and what's possible as a result, should be ever-increasing too.
When each team (context) protects and maintains its part of the organisation's data, the quality of insights and the speed to market of new projects are greatly improved. A little internal competition can also be very interesting here, to see who can have the most well-maintained datasets.
What risks can organisations face when implementing data fabrics?
While organisations are moving to modern, cloud-first applications, many still have applications that were designed to operate with low-volume loads, long before the internet was used the way it is today. Due to the age and complexity of some applications, some legacy data stores may struggle to cope with the “web load” required to operate as part of a data fabric. There are strategies to mitigate this, such as caching layers, combining data in other stores, and so forth.
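One of the mitigations mentioned above, a caching layer, can be sketched as a simple read-through cache with a time-to-live. This is a minimal illustration under stated assumptions, not a production design: ReadThroughCache and its fetch callback are hypothetical names, the cache lives in memory in a single process, and the fetch function stands in for a slow call to a legacy store.

```python
import time


class ReadThroughCache:
    """Serve repeat reads from memory so the legacy store sees fewer calls."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # function that reads from the legacy store
        self._ttl = ttl_seconds  # how long a cached value stays fresh
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no call to the legacy store
        value = self._fetch(key)  # missing or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a 60-second time-to-live, a record requested hundreds of times a minute reaches the legacy store roughly once a minute. The trade-off is staleness: consumers may see data up to one time-to-live old, so the value needs to be agreed with the team that owns the data.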
Across the various data stores, data will likely be stored in completely different formats, use different terminology and naming conventions, and look very different when comparing one store to another. Unless an organisation agrees on a common data language, and on how this data is presented back to the organisation, it will be very difficult for it to move forward with its data maturity journey. Additionally, when organisations adopt open standards and then bring in new engineering talent, the initial productivity of those engineers is boosted, as they face less of a learning curve during onboarding.
One final risk when implementing data fabrics is the level of agility and organisational design in the business. The confusion and misinterpretation that occur when sharing disparate data across teams that aren't working on joint problems can be very damaging. Since data fabrics is about the free flow of data across a business, agile teams who are centred around solving specific problems are better geared for sharing data that is more contextually useful for other parts of the business. There's also a side effect of teams working closely together: they can learn from others and gain insights that might prompt them to try an alternative way of working, which might lead to a better outcome.
Benefits of data fabrics
It's worth calling out at this stage that data fabrics, or data mesh, is a relatively new way of thinking about big data platforms. That said, we can confidently say that benefits such as speed to market for new technology implementations and enhancements, plus increased value for customers, can be expected by those who embrace data fabrics. There are also two other benefits worth highlighting.
Improved governance
Organisations operating in the financial services space (and many others at that) know full well the pressures faced due to regulation. Whenever new legislation is announced, CIOs and CTOs around the world hold their breath while they consider the implications for their organisation, as well as the investment required to comply. Improvements due to knowledge of domain context, plus embedding security and compliance within teams, certainly make life easier when auditors come knocking. What's more, it also means organisations can spend more time focusing on projects that drive customer experience, acquisition, and retention.
Morale of the development teams
This benefit often flies under the radar but is hugely important, and it ties into our value of People First. To sustain high-performing teams, organisations need to be mindful of cognitive load and keep it to a minimum. Essentially, this means finding the shortest path from idea to production, one that limits the time engineers spend working on the problem.
Previously, development teams worked on projects for months on end, sometimes longer, and only saw the fruits of their work after many months, if they were lucky. So anything that speeds up the process and makes progress visible is helpful. Not only that, it also means there is a chance to build feedback and continual improvement into the process. The other reason keeping morale high is essential, particularly for those in the tech space, is the huge skills shortage we're facing. Even with the ability to work remotely, skilled developers are few and far between. Organisations simply can't afford to risk offering low-performing environments, as top talent will move on.
Now is the time to let data set the record straight.
Yes, data can be manipulated to support any angle, and yes, it can tell 1000 stories too, but we're not interested in that. We're interested in combining datasets and presenting one view of the customer so the data speaks for itself: the truth, the whole truth, and nothing but the truth.
With the right approach to data fabrics, organisations can rise above their competitors and serve their customers in ways they never could have dreamed of before. If you are thinking about implementing data fabrics or want to discuss any of what you've read here, we'd love to talk. As we said, this is a new journey that few have embarked on, but now the time is right.