Term Paper: Data Warehousing: A Strategic Weapon

Pages: 38 (10375 words)  ·  Bibliography Sources: 1+  ·  Level: College Senior  ·  Topic: Business


[. . .] On the basis of information provided by Information Sciences at the University of California-Berkeley (1997), a number of defining features distinguish data warehousing from operational applications. These include the following:

The data warehouse is oriented around the major subjects of the enterprise, e.g., customer, vendor, product and productivity. Hence it focuses on data modeling and database design exclusively, and it excludes data that is not useful for DSS processing. In contrast, the operational applications are designed around processes and functions, e.g., loans, savings, bank card and trust for a financial institution. Consequently, they are concerned both with database design and process design, and they contain data that satisfies immediate functional/processing requirements.

The data warehouse and operational applications also differ in how they treat relationships among data. Data warehouse data spans a spectrum of time and therefore represents many static business rules (and, correspondingly, many data relationships) between two or more tables. Operational data, by contrast, maintains an ongoing relationship between two or more tables based on the business rule that is currently in effect.

Data contained within the boundaries of the warehouse is integrated, that is, stored in a singular, globally acceptable fashion, although the underlying operational systems may store the data in various different ways. Data warehouse systems prove most successful when data can be combined from multiple source applications, which requires that all sorts of data inconsistencies be effectively addressed. "Data scrubbing" or "data staging" enables the DSS analyst to focus on using the data that is in the warehouse, without having to wonder about its credibility or consistency. The integration of data is manifested in many ways -- in consistent naming conventions, in consistent measurement of variables, in consistent encoding structures, in consistent physical definition of attributes, and so on.
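As a minimal sketch of what a consistent encoding structure might look like in practice, the following Python fragment maps three source-specific encodings of a single attribute onto one warehouse encoding. The application names, the attribute, and the codes are assumptions made for illustration; they do not come from the sources cited above.

```python
# Hypothetical encodings of one attribute (gender) in three operational
# applications. None of these names or codes come from the paper.
SOURCE_ENCODINGS = {
    "loans": {"m": "M", "f": "F"},
    "savings": {"1": "M", "0": "F"},
    "bank_card": {"male": "M", "female": "F"},
}


def integrate_gender(source, raw_value):
    """Map a source-specific code to the single warehouse encoding."""
    mapping = SOURCE_ENCODINGS[source]
    key = raw_value.strip().lower()
    if key not in mapping:
        # Surface inconsistencies during staging instead of loading them.
        raise ValueError(f"unmapped value {raw_value!r} from {source}")
    return mapping[key]


records = [("loans", "m"), ("savings", "0"), ("bank_card", "Female")]
integrated = [integrate_gender(source, value) for source, value in records]
print(integrated)
```

Rejecting unmapped values at this stage is one possible design choice: it keeps doubtful data out of the warehouse, so the analyst does not have to question it later.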

All data in the data warehouse is "time variant," i.e., accurate as of some moment in time, whereas in the operational environment data is accurate as of the moment of access. Thus, the time horizon represented for the data warehouse is much longer (which can involve years) than that for the operational environment (which ranges from the current values of today to ninety days). Every key structure in the data warehouse contains an element of time either implicitly or explicitly.
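The idea that every key carries an element of time can be illustrated with a short Python sketch. It assumes a hypothetical account-balance table keyed by account identifier and snapshot date; the names and figures are invented for illustration only.

```python
from datetime import date

# Each warehouse key pairs an account with a snapshot date, so the
# element of time is explicit. All names and values are hypothetical.
warehouse = {
    ("acct-1", date(2002, 12, 31)): 500.0,
    ("acct-1", date(2003, 3, 31)): 750.0,
    ("acct-1", date(2003, 6, 30)): 620.0,
}


def balance_as_of(account_id, as_of):
    """Return the latest snapshot balance on or before `as_of`, else None."""
    candidates = [
        (snapshot_date, balance)
        for (account, snapshot_date), balance in warehouse.items()
        if account == account_id and snapshot_date <= as_of
    ]
    if not candidates:
        return None
    # The most recent qualifying snapshot wins.
    return max(candidates, key=lambda pair: pair[0])[1]


print(balance_as_of("acct-1", date(2003, 4, 15)))
```

An operational system would typically hold only the current balance, whereas this structure can answer what the balance was at any recorded moment.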

The data warehouse is nonvolatile. Data warehouse data is a long series of snapshots, and cannot be updated once correctly recorded, while record-to-record real-time updates -- inserts, deletes, and changes -- are done regularly to the operational environment. That is, once data is loaded into the warehouse from the application-oriented operational environment (and/or external sources), it does not change, but is merely accessed there. Therefore, there is no need to be cautious of the update anomaly, an important factor to consider in operational application systems; nor does data warehousing require the complex technologies supporting backup and recovery, transaction and data integrity, and the detection and remedy of deadlock. Data "updating" in the data warehousing environment consists of periodic mass loading of data from the operational environment. The simplicity of data management and the much less rigid response time requirements allow data warehouse designers to take liberties in optimizing the access of data. De-normalization of the physical data model is conducted to enhance performance and simplicity, which are more prominent for data warehouse operations because the amount of data involved is typically very large.
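The nonvolatile, load-and-access pattern described above can be sketched in Python as an append-only store. The class and method names are hypothetical, and the sketch assumes that one batch is loaded per period and that recorded snapshots are never overwritten.

```python
from datetime import date


class SnapshotStore:
    """Append-only store: rows arrive in periodic batches and never change."""

    def __init__(self):
        self._rows = {}

    def load_batch(self, snapshot_date, rows):
        """Mass-load one period's rows, refusing to overwrite a snapshot."""
        for key in rows:
            if (key, snapshot_date) in self._rows:
                raise ValueError(
                    f"snapshot already recorded for {key} on {snapshot_date}"
                )
        for key, row in rows.items():
            # Copy the row so later changes at the source cannot leak in.
            self._rows[(key, snapshot_date)] = dict(row)
        return len(rows)

    def history(self, key):
        """Return the series of snapshots for one key, oldest first."""
        matches = [(d, r) for (k, d), r in self._rows.items() if k == key]
        return sorted(matches, key=lambda pair: pair[0])


store = SnapshotStore()
store.load_batch(date(2003, 6, 30), {"acct-1": {"balance": 620.0}})
store.load_batch(date(2003, 7, 31), {"acct-1": {"balance": 640.0}})
print(len(store.history("acct-1")))
```

Because the only write path is a batch load, the sketch has no record-level update or delete; that is the simplification the paragraph attributes to the warehouse environment.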

Inmon (1999a) emphasized the importance of monitoring the environment of a data warehouse once it has been deployed. As explained by Inmon, managing the data warehouse environment requires two types of monitors: activity monitors and data base monitors. A data warehouse activity monitor analyzes the activity (that is, the queries) that operates against the data warehouse. The activity monitor addresses the following questions:

Who is using the data warehouse?

How much is the data warehouse being used?

What is the nature of the queries that are being asked?

What time of day is the warehouse being used the most?

Are there notable periodic patterns of usage on a weekly basis? On a monthly basis? On a quarterly basis? On an annual basis?

How much growth is there in the usage of the data warehouse?

Should indexes be added to enhance performance?

How should the data warehouse be tuned?
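Several of these questions can be answered by summarizing a query log. The following Python sketch assumes a hypothetical log of (user, timestamp, table) entries; it is not a description of any particular monitoring product, and the data is invented.

```python
from collections import Counter
from datetime import datetime

# Hypothetical query log entries: (user, timestamp, table queried).
query_log = [
    ("ana", datetime(2003, 8, 4, 9, 15), "customer"),
    ("ana", datetime(2003, 8, 4, 9, 40), "product"),
    ("raj", datetime(2003, 8, 4, 14, 5), "customer"),
    ("raj", datetime(2003, 8, 5, 9, 30), "customer"),
]

# Who is using the warehouse, and how much?
queries_per_user = Counter(user for user, _, _ in query_log)

# What time of day is the warehouse used the most?
queries_per_hour = Counter(ts.hour for _, ts, _ in query_log)
busiest_hour = queries_per_hour.most_common(1)[0][0]

# Are there weekly patterns? weekday() returns 0 for Monday.
queries_per_weekday = Counter(ts.weekday() for _, ts, _ in query_log)

# What is being asked about? Frequently hit tables may merit an index.
queries_per_table = Counter(table for _, _, table in query_log)

print(queries_per_user, busiest_hour, queries_per_table.most_common(1))
```

Monthly, quarterly, or annual patterns could be obtained in the same way by counting on a different component of the timestamp.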

Inmon (1999a) also noted the importance of determining what data in the data warehouse is actually being used. As a data warehouse grows in size and in importance, the percentage of its data that is used actually shrinks. Thus, as explained by Inmon, determining which data is being used and which is not provides a basis for removing unused data rather than adding disk storage. The data warehouse activity monitor can be used to determine what data needs to be removed.
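One way to identify removal candidates is to compare each table's last access date against an idleness threshold. The Python sketch below assumes that the activity monitor records a last-access date per table; the table names, the dates, and the 180-day threshold are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical last-access dates per warehouse table, as an activity
# monitor might record them. None means the table was never queried.
last_accessed = {
    "customer": date(2003, 8, 1),
    "product": date(2003, 7, 20),
    "vendor": date(2002, 11, 3),
    "productivity": None,
}


def removal_candidates(as_of, max_idle_days=180):
    """List tables not queried within `max_idle_days` before `as_of`."""
    cutoff = as_of - timedelta(days=max_idle_days)
    return sorted(
        table
        for table, accessed in last_accessed.items()
        if accessed is None or accessed < cutoff
    )


print(removal_candidates(date(2003, 8, 9)))
```

The result is only a list of candidates; whether to archive or delete them remains a judgment for the warehouse administrator.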

As discussed by Inmon (1999a), the data warehouse data base monitor is used to address the following questions:

How has growth been occurring in the data warehouse?

What profile is there of data in the data warehouse:

Key data?

Indexed data?

Non-key data?

The data warehouse data base monitor is used to track the contents of the warehouse and how the contents have changed over time. Both standard detailed data and summarized data are tracked. As described by Inmon, the data base monitor also profiles the classifications of record types within the data warehouse. The results of this form of monitoring may be used by the DSS analyst, who needs to know the profile of data subsets before submitting a query.
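The two concerns of the data base monitor, growth and data profile, can be sketched in Python as follows. The row counts and the column catalog are invented for illustration, and the three roles simply mirror the key, indexed, and non-key categories listed above.

```python
from collections import Counter

# Hypothetical row counts captured at the end of each quarter.
size_history = [
    ("2003-Q1", 1_200_000),
    ("2003-Q2", 1_500_000),
    ("2003-Q3", 1_800_000),
]


def growth_rates(history):
    """Return (period, fractional growth over the previous period)."""
    return [
        (label, (rows - prev_rows) / prev_rows)
        for (_, prev_rows), (label, rows) in zip(history, history[1:])
    ]


# Hypothetical column catalog: column name -> role within its table.
column_roles = {
    "customer_id": "key",
    "snapshot_date": "key",
    "region": "indexed",
    "balance": "non-key",
    "notes": "non-key",
}
profile = Counter(column_roles.values())

for label, rate in growth_rates(size_history):
    print(f"{label}: {rate:.0%}")
print(dict(profile))
```

A profile of this kind is what would let a DSS analyst estimate the cost of a query before submitting it.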

According to Hackathorn (1995), five information flows are associated with data warehousing. The first four move data in from legacy systems (Inflow), up to a more compact form (Upflow), down to archival storage (Downflow), and out to consumers (Outflow); the fifth manages the warehouse itself (Metaflow). Data warehouses require tools to make the functions associated with each flow more effective (Mattison, 1996).

According to Mattison (1996), the tools necessary for developing data warehouses fall into three basic categories based on their activities: acquisition tools (for inflow), storage tools (for upflow and downflow), and access products (for outflow). As explained by Francett (1994), acquisition tools are critical in performing tasks such as modeling, designing, and populating data warehouses. These tools are used to extract data from various sources and transform it (i.e., condition it, clean it up, and denormalize it) to make the data usable in the data warehouse. They are also used to establish the metadata, where information about the data in the warehouse is stored.
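The extract-and-transform step performed by acquisition tools can be illustrated with a small Python sketch. It assumes two hypothetical source tables (customers and orders) and shows cleaning, type conditioning, and denormalization into a single flat record; none of the field names come from the cited authors.

```python
# Hypothetical normalized source data from an operational system.
customers = {
    1: {"name": "  Ada Lovelace ", "region": "west"},
    2: {"name": "Alan Turing", "region": "East"},
}
orders = [
    {"order_id": 10, "customer_id": 1, "amount": "19.90"},
    {"order_id": 11, "customer_id": 2, "amount": "5"},
]


def transform(order):
    """Clean one order and fold its customer attributes into the row."""
    customer = customers[order["customer_id"]]
    return {
        "order_id": order["order_id"],
        # Clean up: trim stray whitespace and standardize case.
        "customer_name": customer["name"].strip(),
        "region": customer["region"].upper(),
        # Condition: convert the text amount to a numeric type.
        "amount": float(order["amount"]),
    }


# Denormalized rows: no join is needed when the warehouse is queried.
warehouse_rows = [transform(order) for order in orders]
print(warehouse_rows[0])
```

Repeating the customer attributes on every order row trades storage for simpler and faster access.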

As explained by Mattison (1996), storage is typically managed by relational databases and other special tools so that the data can be used for effective decision support. Access products, according to Mattison, include data mining tools such as multidimensional analysis products, neural networks, and data discovery tools that support end users in accessing and analyzing the data in the warehouse in various ways. Data mining is the process of making discoveries from large amounts of detailed data (Barry, 1995; Mason, 1995). Data mining tools are used to sift through the data to identify patterns or similarities. With data mining, data evolves into information and then into knowledge, resulting in business intelligence by means of a variety of statistical analyses and data visualization (Brown, 1995; Fogarty, 1994).

Deployment Obstacles

Inmon (1999b) identified a number of factors that have served as obstacles in deploying data warehouses. Each of these obstacles will be reviewed.

According to Inmon (1999b), accessing and pulling data from the source for the data warehouse represents one of the most challenging obstacles to the deployment of a data warehouse. Most often, the legacy systems environment serves as the source of the data needed for the warehouse. As noted by Inmon, a number of problems are associated with accessing and securing data from legacy systems including the following:

Finding legacy data: Data is often so secured and convoluted within the legacy system that accessing it without a map is extremely difficult.

Understanding what data exists and means in the legacy environment: Lack of documentation of data within legacy systems makes it difficult to know what data exists and what the data represents.

Efficient traversal of the legacy environment: Legacy systems are complex, as is accessing the data secured within them; thus, finding one's way while attempting to access data can be challenging.

Lack of integration of legacy data: The data found within legacy systems is most often not integrated, requiring data transformation prior to placing the data…

Cite This Term Paper:

APA Format

Data Warehousing: A Strategic Weapon.  (2003, August 9).  Retrieved April 20, 2019, from https://www.essaytown.com/subjects/paper/data-warehousing-strategic-weapon/8621579
