IEEE-Computer Science -- Literature Review



[. . .] The offerings of a DBMS include a "suite of interrelated services and guarantees that enables developers to focus on the specific challenges of their applications, rather than on the recurring challenges involved in managing and accessing large amounts of data consistently and efficiently." (Franklin, Halevy and Maier, 2005) Today's data management scenarios can rarely be fitted into a "conventional relational DBMS or into any other single data model or system." (Franklin, Halevy and Maier, 2005)

Figure 1

A Space of Data Management Solutions

Figure 1 categorizes existing data management solutions along two dimensions. Administrative proximity indicates "how close the various data sources are in terms of administrative control." (Franklin, Halevy and Maier, 2005) Near means that the sources are "under the same or at least coordinated control" and Far indicates "a lower coordination tending towards none at all." (Franklin, Halevy and Maier, 2005)

The closer the administrative control of a group of data sources, the stronger the guarantees, such as permanence and consistency, that the data management system can provide. (Franklin, Halevy and Maier, 2005, paraphrased) Semantic integration is a measure of "how closely the schemas of the various data sources have been matched." (Franklin, Halevy and Maier, 2005) The DBMS represents just one point solution in the data management environment. Data integration systems and data exchange systems are stated to offer "many of the purported services of dataspace systems." (Franklin, Halevy and Maier, 2005) The distinction is that data integration systems "require semantic integration before any services can be provided." (Franklin, Halevy and Maier, 2005)


The goal of 'Personal Information Management' is reported to be to provide "easy access and manipulation of all of the information on a person's desktop, with possible extension to mobile devices, personal information on the Web, or even all the information accessed during a person's lifetime." (Franklin, Halevy and Maier, 2005) Scientific data management involves monitoring, observation and forecasting, and can also be used in "running atmospheric and fluid-dynamics models that simulate past, current and near-future conditions." It is reported that the computations "require importing data and model outputs from other groups…" (Franklin, Halevy and Maier, 2005)

Dataspace Systems

A dataspace is described as a "set of participants and relationships." (Franklin, Halevy and Maier, 2005) The participants in a dataspace are stated to be "individual data sources", which can be "relational databases, XML repositories, text databases, web services and software packages…" and which may be "stored or streamed…" (Franklin, Halevy and Maier, 2005) Some participants are stated to support "expressive query languages" while others are stated to be "opaque and offer only limited interfaces for posing queries such as structured files, web services or other software packages." (Franklin, Halevy and Maier, 2005)

The dataspace system should be able to model any type of relationship between two or more participants. Dataspaces may also be nested within each other. Not every participant in a dataspace will provide the interfaces needed to support all functions of the dataspace support platform (DSSP); data sources will therefore need to be extended in various ways. The following is an example dataspace and the components of a dataspace system.

Figure 2

Example Dataspace and the Components of a Dataspace System

Source: Franklin, Halevy and Maier (2005)

The components of the dataspace system include catalog and browse. The catalog holds information about all the participants in the dataspace and the relationships among them. It accommodates a large number of sources and supports varying levels of information about their structure and capabilities. The DSSP should also support a "model-management environment that allows creating new relationships and manipulation of existing ones." (Franklin, Halevy and Maier, 2005)
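The catalog described above can be pictured as a small registry of participants and the relationships among them. The following is a minimal sketch only, not the design given by Franklin, Halevy and Maier; the names `Participant`, `Relationship`, `Catalog`, `relate`, `related_to` and `supporting` are hypothetical and were chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """One data source in the dataspace (hypothetical structure)."""
    name: str
    kind: str  # e.g. "relational", "xml", "text", "web-service"
    # Query interfaces the source offers, e.g. {"sql"} or {"keyword"}.
    query_interfaces: set = field(default_factory=set)


@dataclass(frozen=True)
class Relationship:
    """A named relationship among two or more participants."""
    kind: str       # e.g. "replica-of", "derived-from" (illustrative labels)
    members: tuple  # names of the participants involved


class Catalog:
    """Records participants, their capabilities and their relationships."""

    def __init__(self):
        self.participants = {}
        self.relationships = []

    def register(self, participant):
        self.participants[participant.name] = participant

    def relate(self, kind, *names):
        # A relationship needs at least two known participants.
        if len(names) < 2:
            raise ValueError("a relationship needs two or more participants")
        unknown = [n for n in names if n not in self.participants]
        if unknown:
            raise KeyError(f"unknown participants: {unknown}")
        rel = Relationship(kind, tuple(names))
        self.relationships.append(rel)
        return rel

    def related_to(self, name):
        """Browse: every relationship that involves the named participant."""
        return [r for r in self.relationships if name in r.members]

    def supporting(self, interface):
        """Names of participants that offer the given query interface."""
        return sorted(
            p.name for p in self.participants.values()
            if interface in p.query_interfaces
        )
```

Recording the query interfaces per participant is one simple way to reflect that some sources are expressive while others are opaque; a real catalog would hold far richer descriptions.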

Query Systems

Search and query should offer the following capabilities:

(1) query everything; and (2) structured query. (Franklin, Halevy and Maier, 2005)

The system should also support meta-data queries, including:

(1) the source of an answer or how the answer was computed;

(2) timestamps on the data items participating in the answer's computation;

(3) specification of which dataspace data items may depend on a specific data item, and the ability to support hypothetical queries. (Franklin, Halevy and Maier, 2005)
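The three kinds of meta-data query listed above can be illustrated by an answer object that carries its own lineage. This is a sketch under stated assumptions, not the mechanism proposed in the source: `Item`, `Answer`, `total` and `what_if` are hypothetical names, and a simple sum stands in for an arbitrary query.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class Item:
    item_id: str
    source: str        # participant the item came from
    value: float
    updated: datetime  # timestamp of the data item


@dataclass(frozen=True)
class Answer:
    value: float
    derivation: str  # how the answer was computed
    inputs: tuple    # the Items that participated in the computation

    def sources(self):
        """(1) Where the answer came from."""
        return sorted({i.source for i in self.inputs})

    def oldest_input(self):
        """(2) Timestamp of the stalest item behind the answer."""
        return min(i.updated for i in self.inputs)

    def depends_on(self, item_id):
        """(3) Whether the answer depends on a specific data item."""
        return any(i.item_id == item_id for i in self.inputs)


def total(items):
    """Example query: sum the values and keep the lineage."""
    items = tuple(items)
    return Answer(sum(i.value for i in items), "sum(value)", items)


def what_if(query, items, item_id, new_value):
    """Hypothetical query: re-run `query` as if one item had another value."""
    changed = [
        replace(i, value=new_value) if i.item_id == item_id else i
        for i in items
    ]
    return query(changed)
```

Because the items are immutable, the hypothetical query works on modified copies and leaves the stored data untouched.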

Finally, all of the search and query services must also work over data sources that are streamed in real time or are being modified. A DSSP is stated to have a storage and indexing component for the following purposes:

(1) to create efficiently queryable associations between data objects in different participants;

(2) to improve accesses to data sources that have limited access patterns;

(3) to enable answering certain queries without accessing the actual data source; and

(4) to support high availability and recovery. (Franklin, Halevy and Maier, 2005)
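Purposes (1) and (3) can be illustrated with a small keyword index kept by the platform itself. This is a minimal sketch, assuming a plain inverted index over whitespace-separated terms; the source does not prescribe an index structure, and `DataspaceIndex`, `add`, `lookup` and `associations` are hypothetical names.

```python
from collections import defaultdict


class DataspaceIndex:
    """Inverted index over objects held by different participants."""

    def __init__(self):
        # term -> set of (participant, object_id) pairs
        self._postings = defaultdict(set)

    def add(self, participant, object_id, text):
        for term in text.lower().split():
            self._postings[term].add((participant, object_id))

    def lookup(self, term):
        """Answer a keyword query from the index alone, without
        contacting the underlying data sources."""
        return sorted(self._postings.get(term.lower(), set()))

    def associations(self, term):
        """Pairs of objects in different participants that share a term."""
        hits = self.lookup(term)
        return [
            (a, b)
            for i, a in enumerate(hits)
            for b in hits[i + 1:]
            if a[0] != b[0]
        ]
```

A source that offers only a limited access pattern can still be searched this way, because lookups are served from the index rather than from the source.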

In addition, the index should be adaptable to "heterogeneous environments." (Franklin, Halevy and Maier, 2005) The goal of the discovery component is also addressed, and it is stated that where a participant offers few data management functions of its own, a DSSP "should be able to imbue such a participant with additional capabilities, such as a schema, a catalog, keyword search and update monitoring." (Franklin, Halevy and Maier, 2005) In addition, the source extension component "supports value-added information held by the DSSP, but not present in all of the initial participants." (Franklin, Halevy and Maier, 2005)

Data Integration Systems -- Collaborative Approach

Doan and McCann (nd), in "Building Data Integration Systems: A Mass Collaboration Approach", report that data integration systems are built primarily by hand in what is described as a "very labor intensive and error prone process." They additionally report that "numerous research activities have been conducted on data integration, both in the AI and database communities." (Doan and McCann, nd)

A great deal of progress has been made in developing conceptual and algorithmic frameworks, in query optimization, in constructing semi-automatic tools for schema matching, wrapper construction and object matching, and in fielding data integration systems on the Internet. (Doan and McCann, nd) Doan and McCann report that the basic idea in their work is "to have users contribute facts and rules in some specified language." (nd) Their work differs from others in several ways:

(1) when building a knowledge base, "potentially any fact or rule being contributed constitutes a parameter whose validity must be checked…", meaning that the number of parameters can be very high and checking them poses a serious problem.

(2) "such knowledge bases must provide some mechanisms to allow users to immediately leverage the contributed information. Providing such mechanisms in the context of knowledge bases can be quite difficult, because it requires performing inference over a large number of possibly inconsistent or varying quality facts. Such mechanisms are considerably much simpler in our case, because feedback on the system parameters can immediately affect the query results." (Doan and McCann, nd)

Chawathe, et al. (nd) report on the Tsimmis project, whose goal is to develop tools that "facilitate the rapid integration of heterogeneous information sources that may include both structured and unstructured data." A common problem that many organizations face today is that of "multiple, disparate information sources and repositories, including database, object stores, knowledge bases, file systems, digital libraries, information retrieval systems, and electronic mail systems." (Chawathe, et al., nd)

Generally, decision makers require information from various sources, yet cannot obtain and apply it in a timely manner because of the challenges of accessing the different systems and because the information obtained is often inconsistent and contradictory. The Tsimmis project adopted a "simple self-describing object model"; the Tsimmis version is named the "Object Exchange Model" (OEM), which is stated to allow "nesting of objects." The basic premise is that "all objects, and their subobjects have labels that describe their meaning." (Chawathe, et al., nd)
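The idea of a self-describing, labeled and nestable object can be sketched as follows. Only the labels and the nesting come from the description above; the class name `OEMObject`, the choice of a tuple to hold subobjects, and the `find` helper are assumptions made for illustration and are not the Tsimmis implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class OEMObject:
    """A labeled object whose value is atomic or a tuple of subobjects."""
    label: str  # describes the meaning of the object
    # Assumption: a tuple always means "nested subobjects".
    value: Union[str, int, float, tuple]

    def is_complex(self):
        return isinstance(self.value, tuple)

    def find(self, label):
        """Yield every object, at any depth, that carries the given label."""
        if self.label == label:
            yield self
        if self.is_complex():
            for child in self.value:
                yield from child.find(label)


# No schema is declared anywhere; the labels alone describe the data.
person = OEMObject("person", (
    OEMObject("name", "Ada"),
    OEMObject("address", (
        OEMObject("city", "London"),
    )),
))
```

Because every object carries its own label, a consumer can search a structure it has never seen before, for example with `list(person.find("city"))`, which is what makes the model suitable for exchanging data between heterogeneous sources.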

Fuxman and Miller (2010) report that inconsistency management "is a fundamental task in any data integration system." Schema integration methodologies are designed to produce global schemas that are "consistent with respect to the sources." (Fuxman and Miller, 2010) However, the schema integration process does not always mirror the semantics that the user has in mind for the global schema, and in many cases no consistent schema meets the user's requirements. (Fuxman and Miller, 2010, paraphrased)

Schmidt and Lyle (2010), in "Why Lean Integration is Important to Data Integration Systems", report that lean integration "is not a one-time effort, you can't just flip a switch and proclaim to be done. It is a long-term strategy for how an organization approaches the challenges of process and data integration." (Schmidt and…


How to Cite "IEEE-Computer Science -- Literature Review" Literature Review in a Bibliography:

APA Style

IEEE-Computer Science -- Literature Review.  (2011, February 8).  Retrieved February 29, 2020, from
