Achieving success as a purely data-driven organisation is not easy, and there are many obstacles along the way. Generally, as a company uses more and more applications, its data becomes more isolated and inaccessible. Legacy infrastructures and systems make the situation worse, as data tends to become stranded during migrations to the cloud. Another major difficulty arises when migrating data between different public clouds, or between a public cloud and a local data centre.
Typically, companies have their data distributed across multiple local locations and multiple public or private clouds, with structured and unstructured data in a variety of formats. Managing all this requires a range of technologies: at least 74% of organisations globally use six or more data integration tools, making it difficult to integrate, analyse and share data, or to incorporate new data sources.
Research shows that up to 68% of data is never analysed in most organisations, and up to 82% of companies are held back by data silos. As data sources and data volumes grow, data professionals end up spending at least 75% of their time on tasks other than data analysis, preventing companies from making the most of their time and resources.
To address these challenges, the Data Fabric concept has emerged as a trend for data management and analytics, providing a single environment consisting of a unified architecture and services running within the architecture that helps any organisation manage its data to add value and accelerate digital transformation.
It is predicted that by 2024, 25% of data management vendors will provide a complete Data Fabric framework, up from 5% today. It is a permanent and scalable solution for managing all data in a unified environment.
The Data Fabric is therefore a data management architecture designed to optimise access to distributed data so that it can be intelligently selected and orchestrated for self-service delivery to data consumers. Any enterprise that makes use of a data fabric can elevate the value of the company’s data by providing users with real-time access to data, no matter where it is stored.
A Data Fabric architecture is independent of data environments, data usage, data processes and geography, but has the ability to integrate core data management capabilities. This structure automates the discovery and governance of data by generating data that is ready to be analysed and used by artificial intelligence.
A Data Fabric implementation provides a single environment in which to access and collect all data, eliminating silos. It also simplifies data management, covering data integration, data governance and data sharing without requiring multiple separate tools. The result is greater scalability to accommodate growing volumes of data, sources and applications, and an easier path to the cloud, with support for on-premises, hybrid and multi-cloud environments. This in turn reduces dependency on legacy infrastructures and solutions.
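The unified-access idea described above can be made concrete with a small sketch. The Python below is purely illustrative, not any vendor's API: every source (an on-premises database, a cloud object store, a SaaS feed) is wrapped in a connector exposing one common interface, and a catalogue routes logical dataset names to whichever connector holds the data. All class and dataset names here are hypothetical.

```python
from abc import ABC, abstractmethod

class SourceConnector(ABC):
    """Common interface every physical data source must implement."""
    @abstractmethod
    def read(self, dataset: str) -> list:
        ...

class OnPremSQLConnector(SourceConnector):
    def read(self, dataset):
        # A real connector would run SQL against a local warehouse.
        return [{"id": 1, "origin": "on-prem"}]

class CloudObjectStoreConnector(SourceConnector):
    def read(self, dataset):
        # A real connector would fetch and parse objects from cloud storage.
        return [{"id": 2, "origin": "cloud"}]

class DataFabric:
    """Routes logical dataset names to whichever source holds them."""
    def __init__(self):
        self._catalogue = {}  # dataset name -> connector

    def register(self, dataset, connector):
        self._catalogue[dataset] = connector

    def read(self, dataset):
        return self._catalogue[dataset].read(dataset)

fabric = DataFabric()
fabric.register("sales", OnPremSQLConnector())
fabric.register("clickstream", CloudObjectStoreConnector())

# Consumers query by name; they no longer care where the data lives.
rows = fabric.read("sales") + fabric.read("clickstream")
```

The point of the sketch is the catalogue: consumers address data by logical name, so sources can be added, moved or migrated without changing downstream code, which is exactly the silo-elimination property the architecture promises.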
Data Fabric connects multiple locations, types and sources of data, allowing it to be managed, processed and stored as it moves within the fabric. It also makes it easy to access data or share it with applications for advanced analytics. The goals of this architecture include improving customer engagement through mobile applications and interactions, complying with data regulations and optimising business processes, among others.
What constitutes this structure varies by role, but it always starts from the same premise: the Data Fabric enables data to be accessed, integrated and shared across a distributed environment.
Broadly speaking, there are at least three modes of Data Fabric architecture. The first is a decentralised structure: a means of accessing distributed data without consolidating it into a central repository such as a Data Lake or data warehouse.
Second, there is a more inclusive version of the Data Fabric that treats these centralised repositories as unprivileged participants in a distributed data architecture: their data is exposed for access just like any other source, so centralised data is included while decentralised access remains the norm.
The third version sees it as the foundation of a hybrid data architecture, biased in favour of centralised access, offering data architects a way to bridge dispersed data resources and adapt to the access needs of consumers such as data scientists, machine learning engineers and software engineers.
Data creates a competitive advantage for companies, but they must deliver data quickly to meet customer needs. Knowledge-driven businesses are growing at an average of more than 30% per year.
With cloud migration and IoT, along with increasingly cost-effective data storage and processing, data is no longer tied to local centres, but more types of data are located in different places, making it difficult to manage.
A Data Fabric solution is a strategic approach to enterprise storage operations that gets the most out of cloud migration. The architecture can extend anywhere while being centrally managed, spanning public and private clouds, edge devices, IoT and more. It reduces management tasks through automation, accelerates development and deployment, and protects assets without interruption.
It enables changes to be made quickly, resolving problems, managing risk, reducing IT operational overhead and complying with regulations.
In addition, this architecture protects data through strong encryption and advanced restore capabilities, including space-efficient read-only copies. A Data Fabric solution thus improves overall performance, controls costs and simplifies infrastructure configuration and management.
NetApp is a vendor focused on innovations that help create more robust, intelligent and efficient infrastructures. The company strives to deliver applications and data in the right place with the right capabilities. In addition, it conducts enterprise-specific research to drive success through a Data Fabric solution that delivers simplicity and efficiency.
The NetApp solution is integrated into the business fabric so that the company can organise the data infrastructure around discovery, integration, automation, optimisation, protection and data security. For each of these pillars, the company offers the necessary technologies to help design a strategy based on the different requirements and objectives of each company. For example, the Medical University of Hannover (MHH) serves its users with its Data Fabric solution, whether it is for healthcare, research or teaching. The University manages massive amounts of data and the solution has enabled them to find efficiencies in relation to their data.
On the other hand, Talend Data Fabric offers the breadth of capabilities that modern data-driven organisations need in a unified environment, with a cloud-native architecture that allows them to adapt to change faster with built-in data integrity. Talend provides a single environment for transforming raw data into healthy data, removing the need to stitch together multiple data integration tools and support mechanisms. In addition, it generates optimised native code when creating data pipelines, to make the most of cloud platforms.
This service is natively designed to run in on-premises and cloud environments, so it can integrate data from on-premises back office and cloud environments, enabling adoption of new technologies such as Docker containers and Kubernetes. Talend Data Fabric is designed for IT and the business to collaborate and share healthy data with self-service data management.
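To make the "raw data into healthy data" idea concrete, here is a minimal, purely illustrative sketch in plain Python; Talend's actual tooling is graphical and generates its own code, so none of the names below come from its API. Records are validated, the passing ones are cleansed and typed, and the failing ones are routed aside for remediation.

```python
# Hypothetical raw input: two of the three records have quality problems.
raw_records = [
    {"patient_id": "A1", "age": "34"},
    {"patient_id": "",   "age": "29"},    # missing ID -> rejected
    {"patient_id": "A3", "age": "n/a"},   # non-numeric age -> rejected
]

def is_healthy(record):
    """A record is 'healthy' if it has an ID and a numeric age."""
    return bool(record["patient_id"]) and record["age"].isdigit()

def cleanse(record):
    """Normalise types on records that passed validation."""
    return {"patient_id": record["patient_id"], "age": int(record["age"])}

healthy = [cleanse(r) for r in raw_records if is_healthy(r)]
rejected = [r for r in raw_records if not is_healthy(r)]
```

Separating the validation rule from the cleansing step is the key design point: the same rule can gate every pipeline that touches the dataset, which is what makes the resulting data trustworthy for self-service consumers.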
Drug development company AstraZeneca uses the tool to accelerate the process with trusted data in a way that has given them speed and confidence, enabling them to shorten the drug development timeline. The drugmaker claims it takes 3 minutes for 90% of the data to be ready for analysis, reducing planning cycles to 3 hours, saving 99% of time and reducing the duration of each clinical trial.
The Data Fabric architecture is therefore a simplified data orchestration layer, integrating connectors for external databases, business logic, analytics and data streaming. It also supports automated test data management, delivering high-quality data from production systems to the teams that need it.
It also supports data privacy compliance by configuring, managing and auditing access requests in line with national and international privacy regulations. Any data-driven enterprise should consider adopting a Data Fabric for end-to-end data management, combining management tooling, advanced analytics and unified configuration. The result is cost optimisation through improved in-memory performance on commodity hardware, stability, and full scalability without risk.