Data Virtualization changes the game

At the 1968 Olympic Games, Dick Fosbury took gold in the high jump with a new technique. Today nearly all high jumpers use it. It’s the only way to be competitive.

Data virtualization technology has matured and now enriches the classic data warehouse architecture. It’s the only way to deliver data to end users at business speed.

The Business Need

End users need information at the speed of their business: fast, actionable, at the right time and sometimes even in real time. The origin of the data and the storage technology should not matter. Whether you need internal or external data, stored in the cloud or on-premises, data should work for you, not against you.

Once the data is available, users want the freedom to consume it the way they choose. This requires that information is integrated, prepared and presented only once for all kinds of consumption, regardless of the technology used. We like to call this the ‘data as a service’ concept.

The IT challenge

Today’s explosion of information volumes (social data, sensor data, IoT, …), types (structured / unstructured) and sources (cloud, Hadoop) makes it challenging to support the decision-making process. Traditional BI architectures show weaknesses when they need to cover real-time or operational information requirements, or when they need to support an agile delivery process.

Reasons for this lack of flexibility:

- Too many point-to-point integrations are implemented
- Information chain is too long
- Data is stored/duplicated many times along the way
- Lack of clear strategy about how to integrate new types of data
- Lack of tools to integrate new types of data

As a result, it takes too long to get answers to business users and to release new BI functionality. Because IT cannot keep up with demand, the business sees no choice but to build its own solutions: the perfect seed for shadow IT.

The concept of Data Virtualization

Data virtualization, as defined by Rick Van Der Lans, R20: “Data virtualization is the technology that offers data consumers a unified, abstracted, and encapsulated view for querying and manipulating data stored in a heterogeneous set of data stores.”

Data virtualization integrates data scattered across various silos without replicating it. Its functionality is comparable to traditional ETL, but it adds the capability to deliver real-time data integration at lower cost, with more speed and agility.

On top of that, it offers a single “virtual” data layer that delivers unified data services to support multiple applications and users.

DV platforms and servers typically process data in three steps:

- CONNECT: Connect to any source of data using connectors and expose the data as a view to the next step.
- COMBINE: Integrate, cleanse and apply business rules. Data is prepared for the next step.
- CONSUME: The information is published as a data service to consuming applications (reporting systems, …). Data is accessible using SQL, MDX, …
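To make the three steps concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular DV platform works: the two sources (an in-memory SQLite database and a CSV string), the view functions and the `data_service` name are all hypothetical. The point it shows is that data is read from the sources at query time and is never copied into a separate store.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a relational database (in-memory SQLite stands in for it).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Hypothetical source 2: a CSV export, for example from a cloud application.
ORDERS_CSV = "customer_id,amount\n1,100.0\n1,50.0\n2,75.5\n"


def customers_view():
    """CONNECT: expose the database table as rows, read at query time."""
    for cid, name in crm.execute("SELECT id, name FROM customers ORDER BY id"):
        yield {"id": cid, "name": name}


def orders_view():
    """CONNECT: expose the CSV source as rows, read at query time."""
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        yield {"customer_id": int(row["customer_id"]), "amount": float(row["amount"])}


def revenue_per_customer():
    """COMBINE: join both views and apply a business rule (total order amount)."""
    totals = {}
    for order in orders_view():
        key = order["customer_id"]
        totals[key] = totals.get(key, 0.0) + order["amount"]
    return [
        {"customer": c["name"], "revenue": totals.get(c["id"], 0.0)}
        for c in customers_view()
    ]


def data_service():
    """CONSUME: publish the combined result to any consuming application."""
    return revenue_per_customer()
```

Because nothing is replicated, a customer added to the source database shows up in the next call to `data_service()` without any load job in between.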

The best DV platforms utilize a combination of real-time query optimization and rewriting, intelligent caching, and selective data movement to achieve superior response and performance against both on-demand pull and scheduled batch push data requests.
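As a rough illustration of the caching idea mentioned above, the sketch below wraps a view in a simple time-based cache. This is an assumption-laden toy, not a description of a real DV engine: the `CachedView` class, its `ttl_seconds` parameter and the injectable `clock` are hypothetical names chosen for this example, and real platforms decide what to cache in far more sophisticated ways.

```python
import time


class CachedView:
    """Wrap a view function and reuse its result for `ttl_seconds`."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # function that queries the underlying source
        self._ttl = ttl_seconds      # how long a cached result stays valid
        self._clock = clock          # injectable clock, handy for testing
        self._rows = None
        self._loaded_at = None

    def rows(self):
        """Return cached rows, querying the source again only after expiry."""
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._ttl
        if expired:
            self._rows = self._fetch()
            self._loaded_at = now
        return self._rows
```

A view wrapped this way hits the slow source at most once per time window, which trades a bounded amount of staleness for faster responses.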

Data security and metadata management are incorporated in most DV platforms.

Benefits

- Boost agility and respond faster to business change
- Address the limitations of a traditional BI architecture
- Enable operational and/or real-time BI
- Simple incorporation of non-traditional data sources (cloud, big data, IoT, Hadoop, …)
- Enable an agile development process
- Data is prepared and made available once as a service to any information-consuming application

Myths

The idea of data virtualization has existed for a number of years, but only now does technology fully support the concept. A few DV myths have slowed adoption of the technique:

- Query performance is negatively influenced by DV servers
- DV eliminates the need for a data warehouse
- Data virtualization is a synonym for data federation
- Only limited data transformations are possible due to the on-demand concept

Is one of these myths stopping you from considering the added value of DV for your company? Please contact us and let’s discuss!