Top four focus areas to help you shape a data-driven enterprise
3AI March 24, 2022
Author: Praveen Reddy, Vice President (Digital – data, analytics, and cloud) | Genpact
As we navigate the world of data, some significant trends are emerging around data ownership and utilization. These trends have multiple triggers: the ever-rising complexity of data, a lack of proper intelligence about data assets, the need for accelerated digital transformation, and data democratization, to name a few. The only way to respond to these challenges is to enable organizations to become data-driven.
Domain data fabric
The lack of data ownership is the most critical challenge here. If data is the holy grail that drives transformation, then there must be proper context, ownership, and governance around it. It must be treated as an end-to-end (E2E) product rather than a commodity. To achieve this, it is not enough to build monolithic architectures such as data warehouses and data lakes and populate them through conventional ETL processes to meet consumption needs. We must move to a distributed architecture framework that uses the domain as the pivot for distribution and makes each domain self-sustaining in terms of platform and ownership. The result is a mesh type of architecture called the data mesh (a term coined by Zhamak Dehghani, a ThoughtWorks consultant).
But the important question here is how we should define a domain. Is it based on the source data, which is usually closely aligned to business functions or operational systems, or is it based on consumer needs, which are primarily driven by a set of use cases? Consumer-driven data sets typically have to undergo transformations such as normalization, denormalization, and aggregation.
For a domain data fabric to be successful, it needs to tick the following boxes:
- Complete product mindset
- Life cycle management
- Key performance indicators (KPIs)
- Easy accessibility
- Effective governance
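One way to make this checklist concrete is to describe each domain data set with a small descriptor and check it against the five items. The sketch below is only an illustration: the `DataProduct` class, its field names, and the `Stage` values are hypothetical and do not come from any specific data mesh tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Hypothetical life cycle stages for a data product."""
    DRAFT = "draft"
    PUBLISHED = "published"
    DEPRECATED = "deprecated"


@dataclass
class DataProduct:
    """Illustrative descriptor for one domain-owned data product."""
    name: str
    domain: str
    owner: str = ""                                      # accountable team (product mindset)
    stage: Stage = Stage.DRAFT                           # life cycle management
    kpis: dict[str, float] = field(default_factory=dict)  # KPI name -> target value
    access_endpoint: str = ""                            # where consumers read the data
    policies: list[str] = field(default_factory=list)    # governance rules applied

    def readiness_gaps(self) -> list[str]:
        """Return the checklist items this product does not yet satisfy."""
        gaps = []
        if not self.owner:
            gaps.append("ownership")
        if self.stage is Stage.DRAFT:
            gaps.append("life cycle")
        if not self.kpis:
            gaps.append("KPIs")
        if not self.access_endpoint:
            gaps.append("accessibility")
        if not self.policies:
            gaps.append("governance")
        return gaps


# Example: a product with an owner but nothing else in place yet.
orders = DataProduct(name="orders", domain="sales", owner="sales-data-team")
gaps = orders.readiness_gaps()
```

A descriptor like this could live next to the data set itself, so that ownership and governance travel with the product rather than sitting in a central team.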
Advanced and active metadata management solutions
While the management of modern data platforms and tools has progressed tremendously, we cannot say the same about the speed, efficiency, and capability with which metadata is ingested, transformed, and consumed. Much remains to be investigated as far as semantics, governance, reliability, and trust are concerned. As IT systems have grown in complexity and use, the need to govern the different types of assets, stakeholders, and use cases they support has expanded significantly. As a result, metadata management solutions have fallen behind in addressing industry requirements.
To address these gaps, modern solutions need to be scalable and flexible and must support a high degree of collaboration, given that the size and complexity of the metadata that needs to be stored, processed, and connected have also increased. Further, all types of metadata (technical, operational, and business) across different platforms and tools need to be linked in a much more contextual and intelligent manner to meet expectations. In fact, some of the capabilities of modern IT systems (the near-limitless scalability of cloud platforms, embedded collaboration, advances in semantics) will help in addressing these challenges.
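To illustrate what linking technical, operational, and business metadata can look like, here is a minimal sketch of an in-memory metadata graph. The `MetadataGraph` class and the example identifiers are hypothetical; a real metadata platform would add persistence, lineage, and access control on top of this idea.

```python
from collections import defaultdict


class MetadataGraph:
    """Toy graph that links metadata entries of different kinds."""

    def __init__(self):
        self.nodes = {}                # node id -> {"kind": ..., other attributes}
        self.edges = defaultdict(set)  # node id -> ids of linked nodes

    def add(self, node_id, kind, **attrs):
        """Register a metadata entry; kind is technical, operational, or business."""
        self.nodes[node_id] = {"kind": kind, **attrs}

    def link(self, a, b):
        """Connect two entries in both directions."""
        self.edges[a].add(b)
        self.edges[b].add(a)

    def context(self, node_id):
        """Group the entries directly linked to node_id by their kind."""
        grouped = defaultdict(list)
        for other in sorted(self.edges.get(node_id, ())):
            grouped[self.nodes[other]["kind"]].append(other)
        return dict(grouped)


# Example with made-up identifiers: one table, one pipeline run, one glossary term.
graph = MetadataGraph()
graph.add("table:sales.orders", "technical", columns=["id", "amount"])
graph.add("run:nightly_load", "operational", status="success")
graph.add("term:net_revenue", "business", definition="Revenue after refunds")
graph.link("table:sales.orders", "run:nightly_load")
graph.link("table:sales.orders", "term:net_revenue")
table_context = graph.context("table:sales.orders")
```

With the links in place, a user looking at the table can see which business term it feeds and how its last load went, all from one lookup.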
Gartner made a big change last year by scrapping its Magic Quadrant for Metadata Management Solutions and replacing it with a market guide for active metadata. This helps in identifying a broader set of platforms and tools that significantly enable the use of active metadata in an organization’s transformation journey.
Unified and integrated metrics repository
Surprisingly, organizations still struggle to create a simple, flexible, scalable, and unified metrics layer that is driven by metadata. Today, most metrics are predefined, designed, and tightly embedded in an organization’s BI platforms. While BI platforms are good at hiding technical complexity from users (for example, users can avoid writing complex SQL and can easily create and visualize simple metrics based on underlying data), these metrics can’t be extended beyond the BI tools to other platforms and departments within the organization. Nor can they be combined with other metrics to produce a set of integrated metrics without significant engineering effort.
One possible new approach, headless BI (introduced by Base Case), can solve this to some extent. The solution lies in separating visualization from metric definition and ownership. This way, the team that owns the metrics defines and catalogs them only once, and different tools, platforms, and departments can access them based on their needs. This goes a long way toward addressing an organization’s metrics problems.
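The idea of defining a metric once and serving it to many consumers can be sketched as a small registry that compiles a metric into SQL on request. This is a minimal illustration, not how any particular headless BI product works: the `Metric` class, the `register` and `to_sql` helpers, and the table and column names are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A metric defined once by its owning team (illustrative fields)."""
    name: str
    expression: str  # SQL aggregate expression
    table: str       # source table
    owner: str       # team accountable for the definition


REGISTRY: dict[str, Metric] = {}


def register(metric: Metric) -> None:
    """Add a metric to the shared catalog; each name may be defined only once."""
    if metric.name in REGISTRY:
        raise ValueError(f"metric {metric.name!r} is already defined")
    REGISTRY[metric.name] = metric


def to_sql(metric_name: str, group_by: tuple[str, ...] = ()) -> str:
    """Compile a registered metric into a query any tool could run."""
    m = REGISTRY[metric_name]
    dims = ", ".join(group_by)
    select = f"{dims}, " if dims else ""
    sql = f"SELECT {select}{m.expression} AS {m.name} FROM {m.table}"
    if dims:
        sql += f" GROUP BY {dims}"
    return sql


# The finance team defines the metric once...
register(Metric("net_revenue", "SUM(amount - refund)", "sales.orders", "finance"))

# ...and a dashboard and a notebook both ask the same layer for it.
dashboard_query = to_sql("net_revenue", group_by=("region",))
notebook_query = to_sql("net_revenue")
```

Because both consumers go through the same definition, a change to how net revenue is calculated is made in one place and reaches every tool.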
Data observability and data quality
When an organization goes through digital transformation and becomes data-driven, the reliability of its E2E pipelines and the quality of the data flowing through them become extremely important. This changes how we think about data quality, reliability, and usage. Earlier, these issues were specific to an application or process. But as we embrace a data-driven, data-as-a-product mindset, the forecasting, real-time monitoring, tracking, and triaging of issues (collectively called data observability when done end to end) have become critical to any organization and are tightly linked to today’s emerging trends.
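As a rough illustration of the monitoring part of data observability, the sketch below checks one loaded batch for freshness, volume, and null rates. The `check_batch` function, its thresholds, and the column names are hypothetical and chosen only for the example; production tools track many more signals and learn thresholds from history.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def check_batch(
    rows: list[dict],
    loaded_at: datetime,
    now: datetime,
    *,
    max_age: timedelta = timedelta(hours=24),
    min_rows: int = 1,
    max_null_rate: float = 0.05,
    required: tuple[str, ...] = ("id",),
) -> list[str]:
    """Return the names of the checks that this batch fails."""
    issues = []
    # Freshness: was the data loaded recently enough?
    if now - loaded_at > max_age:
        issues.append("freshness")
    # Volume: did we receive at least the expected number of rows?
    if len(rows) < min_rows:
        issues.append("volume")
    # Completeness: is the share of missing values acceptable per column?
    for column in required:
        nulls = sum(1 for row in rows if row.get(column) is None)
        if rows and nulls / len(rows) > max_null_rate:
            issues.append(f"null_rate:{column}")
    return issues


# Example batch loaded 30 hours ago, with one missing id and one missing amount.
now = datetime(2022, 3, 24, 12, 0, tzinfo=timezone.utc)
batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 7.5},
    {"id": None, "amount": 3.0},
]
issues = check_batch(
    batch,
    loaded_at=now - timedelta(hours=30),
    now=now,
    required=("id", "amount"),
)
```

Running checks like these on every pipeline stage, and routing failures to the owning domain team, is one simple way to start tracking and triaging issues before consumers notice them.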
Therefore, it is crucial to have an integrated and intelligent data platform (a single platform or tool, or a combination of them) that can do the following:
- Organize all data sets into domain-based, self-sustaining, and distributed repositories
- View, link, and tag all the assets, based on context-sensitive metadata, quality, and trust
- Support a product mindset
- Provide a unified semantic framework for organization-wide metrics
- Focus on real-time data observability, stewardship, and correctness
Organizations that successfully establish these capabilities, frameworks, and mindsets have a better chance of accelerating and controlling their digital transformation initiatives for better, faster, and more predictable outcomes. This will help businesses become more resilient, handle unexpected scenarios like COVID-19, and avoid being disrupted or forced out of the market.