Data Warehousing Chapter

Pages: 6 (2067 words)  ·  Bibliography Sources: 0  ·  File: .docx  ·  Level: College Senior  ·  Topic: Physics

Data Warehousing Text

• Chapters 7, 9, 13

Exercise #3-pg 161 Chapter 7 -write

As a senior analyst responsible for data staging, you are responsible for the design of the data staging area. If your data warehouse gets input from several legacy systems on multiple platforms, and also regular feeds from two external sources, how will you organize your data staging area? Describe the data repositories you will have for data staging.

Because the data staging area is critical to the functioning of the entire system, its design must account for the needs of everything downstream of it as well as its own limitations and performance characteristics. Data integration is the centerpiece and final product of the system, so the staging area must integrate and assemble data in a way that makes sense to the user. Staging areas, which are responsible for the "assembly" of information coming from the legacy systems, are increasingly built on relational databases. As a result, data is held in staging for longer than it was when staging areas consisted mostly of flat files in formats that downstream components could readily access and use. A data warehouse or staging area today would likely use this more modern style of storage. However, because the data is stored for longer periods, storage overhead also grows to the point where index creation and data migration from the source systems begin to affect the speed and efficiency of the entire system, not just the staging area. An integrated staging area would therefore need to weigh the needs of the system and align itself with either a flat-file storage format or a relational format in which data is stored for longer periods.

An efficient data staging and intermediate storage area also includes multiple data marts, each configured for the platform or context of the information being collected and for the downstream requirements of that data. The choice of repositories depends on several variables that the user would need to evaluate before settling on the system's overall architecture: system security, data archiving, refresh considerations, load increments, and backup and recovery capabilities. Systems differ in these respects depending on how they are designed and implemented, and the user or customer would have to decide which considerations take precedence. The staging area is responsible for many functions within the system, so its design must also account for the consolidation of datasets and the creation of flat files for loading through DBMS utilities.
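The consolidation step mentioned above can be illustrated with a minimal sketch. It assumes each source feed has already been parsed into dictionaries; the feed names, column names, and the `consolidate_to_flat_file` helper are hypothetical and chosen only for illustration. The output is a CSV-formatted flat file of the kind a DBMS bulk-load utility could read.

```python
import csv
import io


def consolidate_to_flat_file(feeds, columns):
    """Merge records from several source feeds into one CSV-formatted string.

    feeds: dict mapping a (hypothetical) source name to a list of dict records.
    columns: the column order the downstream bulk loader is assumed to expect.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: tag every record with the system it came from.
    writer.writerow(["source_system"] + columns)
    for source_name in sorted(feeds):
        for record in feeds[source_name]:
            # Missing fields become empty strings so every row has the same width.
            writer.writerow([source_name] + [record.get(col, "") for col in columns])
    return buffer.getvalue()


# Illustrative data only: one legacy system and one external feed.
feeds = {
    "legacy_mainframe": [{"customer_id": "1001", "balance": "250.00"}],
    "external_feed_a": [{"customer_id": "1001"}],
}
flat_file = consolidate_to_flat_file(feeds, ["customer_id", "balance"])
```

Tagging each row with its source system is one possible design choice; it keeps the origin of a record traceable after the feeds have been merged.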

An efficient repository not only organizes and stores data well but also delivers information effectively. The strength and robustness of the information delivery system therefore come as a direct result of the data warehouse architecture. The flexibility of the system is also tied to this architecture, but it depends on where data is summarized and stored as multidimensional cubes. The delivery portion of the repository architecture includes OLAP, data mining, and the report/query system, where temporary result sets live and where standard reporting data stores provide the primary functionality of this key architectural component. The latter is also a function of the architecture of independent data marts, which store information in a secondary "staging area" to be delivered by the information delivery system(s).

The data staging area should then be organized to keep the data arriving from the two external sources free from pollution while storing it accurately on a system that is compatible with the different legacy systems. This task is easier said than done and involves a survey of the older legacy systems to identify points where data could become polluted or bottlenecked. From an administrator's perspective, accurate and adequate data storage also depends on the architecture of the system: adequate data marts and delivery systems must be implemented, or the entire system could be slowed down or rendered useless for the customer (end user).
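One way to picture this organization is a set of separate repositories inside the staging area: a landing area per source, a cleansed store, and a reject store for polluted records. The sketch below is an assumption-laden illustration, not a prescribed design; the `StagingArea` class, the source names, and the "required fields" rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StagingArea:
    """Hypothetical staging layout: raw landing data, cleansed data, rejects."""
    landing: dict = field(default_factory=dict)   # raw records, keyed by source
    cleansed: list = field(default_factory=list)  # records that passed checks
    rejected: list = field(default_factory=list)  # records held back for review

    def land(self, source, records):
        # Keep each source's raw data separate so one feed cannot pollute another.
        self.landing.setdefault(source, []).extend(records)

    def cleanse(self, required_fields):
        # Move every landed record to either the cleansed or the rejected store.
        for source, records in self.landing.items():
            for record in records:
                tagged = {**record, "source_system": source}
                if all(record.get(f) not in (None, "") for f in required_fields):
                    self.cleansed.append(tagged)
                else:
                    self.rejected.append(tagged)
        self.landing.clear()


# Illustrative usage with made-up feeds.
staging = StagingArea()
staging.land("legacy_hr", [{"employee_id": "E1", "name": "Ana"},
                           {"employee_id": "", "name": "Bo"}])
staging.land("external_feed_1", [{"employee_id": "E2", "name": "Cy"}])
staging.cleanse(required_fields=["employee_id"])
```

Holding rejected records in their own repository, rather than discarding them, leaves the administrator a place to inspect where pollution enters the pipeline.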

Exercise #3-pg. 221 Chapter 9 - write 2 pages

As the data warehouse administrator, describe all the types of metadata you would need for performing your job. Explain how these types would assist you.

There are three basic types of metadata, all of which are important components of the delivery system and of the system's architecture, and each type is integral to the warehouse administrator's job. Metadata tells the administrator what types of data are being stored and describes the demographics of that data. This is key because it allows the administrator to manage the data efficiently according to the parameters and characteristics the metadata outlines. The characteristics of the data, as recorded and reported by the metadata, also directly affect the usability of the data and the uses the end user or customer can make of it. The metadata itself is like a roadmap for the end user, revealing the existence and location of other important parts of the system and giving the administrator a top-down view of the entire storage landscape. The administrator acts as an information navigator of sorts, helping users travel the roads and highways laid out by the structure of this information.

Metadata has many sources, all of which the administrator must understand in order to manage the system, no matter how the architecture is set up. These sources include source systems, data extraction, transformation and cleansing, data loading, data storage, and information delivery. All of these processes create metadata and bear directly on the administrator's work. It is therefore the administrator's responsibility to know the sources and the metadata they generate, and to understand how they relate to the end user's experience of and satisfaction with the system or product being supported.

Operational metadata is data that describes the basic operation of data files. This type of information assists a warehouse administrator in aligning the granularity and schemata in the way that makes the most sense for the architecture of the system. This improves the speed and functionality of the entire system and helps the administrator manage both the metadata and the data files by characterizing how they operate. Many systems are designed to record, store, and deliver certain types of data and metadata within specific components or structures. A warehouse administrator may choose to align these functions with specific parts of the overall system, thereby creating a smoother flow of information from beginning to end.

Extraction and transformation metadata is used by warehouse administrators to understand and categorize data and data functions according to the methods and parameters used to extract and transform them. A warehouse administrator is concerned not only with the proper management of metadata and data files but also with the characteristics of the functions that act on them. Metadata that identifies extraction and transformation methods also helps in designing a system that leans more toward flat-file capabilities or toward relational database capabilities. This choice has ramifications further down the line, in the information delivery portion of the system. It matters because the warehouse administrator can affect the efficiency of each stage of data and metadata movement, not just the basic management of the warehouse and its systems.

End-user metadata helps the end user identify, quantify, and categorize the use of data within the data delivery system. This type of metadata is important to the warehouse administrator because it affects the usability of the data for users or customers downstream from the warehouse. Data is only as effective as its usability, and a warehouse administrator who is well versed in the end users' concerns about data reliability and flexibility can more effectively manage and categorize the data stored upstream, in the warehouse. Each of these types of metadata represents an opportunity both for errors and for excellent service to the end user. An administrator who understands this, and who sees how the three types relate to one another, has good reason to commit to fast, effective metadata management.
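The three categories discussed above can be sketched as a small catalog that an administrator might query by type. This is only an illustrative model: the `MetadataCatalog` class, the entry fields, and the sample descriptions are assumptions made for the example, not part of any particular warehouse product.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataType(Enum):
    OPERATIONAL = "operational"
    EXTRACTION_TRANSFORMATION = "extraction_transformation"
    END_USER = "end_user"


@dataclass(frozen=True)
class MetadataEntry:
    subject: str         # the table, file, or report being described
    kind: MetadataType   # which of the three categories the entry belongs to
    description: str     # free-text note for the administrator or end user


class MetadataCatalog:
    """Hypothetical in-memory catalog of metadata entries."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def by_type(self, kind):
        # Return every entry in the requested category.
        return [e for e in self._entries if e.kind is kind]


# Illustrative entries only.
catalog = MetadataCatalog()
catalog.register(MetadataEntry(
    "customer_dim", MetadataType.OPERATIONAL,
    "Sourced from a legacy account file; one row per customer"))
catalog.register(MetadataEntry(
    "customer_dim", MetadataType.EXTRACTION_TRANSFORMATION,
    "Extracted nightly; name fields trimmed and upper-cased"))
catalog.register(MetadataEntry(
    "customer_dim", MetadataType.END_USER,
    "Business definition: any party holding at least one open account"))

end_user_notes = catalog.by_type(MetadataType.END_USER)
```

Keeping all three categories in one catalog, keyed by the same subject, is one way to let the administrator see how operational details, transformation rules, and business definitions of the same data relate to each other.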

Exercise #2-pg. 337 Chapter 13 - write 2 pages

Assume that you are the data quality expert on the data warehouse project team for a large financial institution with many legacy systems dating back to the 1970s. Review the types of data quality problems you are likely to have and make suggestions on…

How to Cite "Data Warehousing" Chapter in a Bibliography:

APA Style

Data Warehousing.  (2011, January 30).  Retrieved October 22, 2020, from
