Durham e-Theses
A Service Late Binding Enabled Solution for Data Integration from Autonomous and Evolving Databases

WANG, CHONG (2010) A Service Late Binding Enabled Solution for Data Integration from Autonomous and Evolving Databases. Doctoral thesis, Durham University.

PDF - Accepted Version (1780Kb)

Abstract

Integrating data from autonomous, distributed and heterogeneous data sources to provide a unified view is a common demand for many businesses. Since the data sources may evolve frequently to satisfy their own independent business needs, solutions which use hard-coded queries to integrate participating databases may incur high maintenance costs when evolution occurs. Thus a new solution which can handle database evolution with lower maintenance effort is required.

This thesis presents a new solution, Service Late Binding Enabled Data Integration (SLEDI), set within a framework that models the essential processes of the data integration activity. It integrates schematically heterogeneous relational databases with lower maintenance costs when the databases evolve. An algorithm named Information Provision Unit Describing (IPUD) is designed to describe each database as a set of Information Provision Units (IPUs). The IPUs are represented as Directed Acyclic Graph (DAG) structured data instead of hard-coded queries, and are further realized as data services, so data integration is achieved through service invocations. Furthermore, a set of processes is defined to handle database evolution by automatically identifying and modifying the IPUs affected by it.
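
The idea of describing a database as DAG-structured units, and of finding the units affected by a schema change, can be illustrated with a minimal sketch. This is not the thesis's IPUD algorithm or its data model: the `IPU` class, the `affected_ipus` and `apply_rename` helpers, and the table and column names are all hypothetical, chosen only to show how graph-structured metadata could be searched and updated without touching hard-coded queries.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter


@dataclass
class IPU:
    """Hypothetical Information Provision Unit: a small dependency graph.

    `edges` maps a node (e.g. a provided attribute) to the schema
    elements it depends on (e.g. "table.column" strings).
    """
    name: str
    edges: dict[str, list[str]] = field(default_factory=dict)

    def nodes(self) -> set[str]:
        """Return every node mentioned in the graph."""
        found = set(self.edges)
        for targets in self.edges.values():
            found.update(targets)
        return found

    def is_acyclic(self) -> bool:
        """Check the DAG property using the standard-library sorter."""
        try:
            tuple(TopologicalSorter(self.edges).static_order())
        except CycleError:
            return False
        return True


def affected_ipus(ipus: list[IPU], changed_element: str) -> list[str]:
    """Names of the IPUs whose graph references the changed schema element."""
    return [ipu.name for ipu in ipus if changed_element in ipu.nodes()]


def apply_rename(ipu: IPU, old: str, new: str) -> IPU:
    """Return a copy of the IPU with one schema element renamed."""
    renamed = {}
    for node, targets in ipu.edges.items():
        key = new if node == old else node
        renamed[key] = [new if t == old else t for t in targets]
    return IPU(ipu.name, renamed)


# Assumed example data: two units over an invented customer/orders schema.
ipus = [
    IPU("customer_contact", {"contact": ["customer.name", "customer.email"]}),
    IPU("order_total", {"total": ["orders.amount"]}),
]

# A column rename only touches the units that reference that column.
to_update = affected_ipus(ipus, "customer.email")
updated = [
    apply_rename(ipu, "customer.email", "customer.mail")
    if ipu.name in to_update else ipu
    for ipu in ipus
]
```

Because the units are data rather than query strings, a schema change in this sketch becomes a lookup followed by a graph edit, which is the kind of maintenance saving the abstract attributes to the DAG representation.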

An extensive evaluation based on a case study is presented. The results show that the schematic heterogeneities defined in this thesis can be resolved by IPUD, except for the relation isomorphism discrepancy. Ten out of thirteen types of schematic database evolution can be handled automatically by the evolution handling processes, as long as the evolution is represented by the designed data model. The computational costs of the automatic evolution handling show a slow linear growth with the number of participating databases. Other characteristics addressed include SLEDI’s scalability and its independence of application domain and database model. The descriptive comparison with other data integration approaches shows that although the Data as a Service approach may result in lower performance under some circumstances, it offers greater flexibility for integrating data from autonomous and evolving data sources.

Item Type: Thesis (Doctoral)
Award: Doctor of Philosophy
Keywords: Database evolution handling, Data as a Service, Service metadata, Service-based data integration
Faculty and Department: Faculty of Science > Engineering and Computing Science, School of
Thesis Date: 2010
Copyright: Copyright of this thesis is held by the author
Deposited On: 15 Mar 2011 15:35