MIS 761 - Enterprise Information Management
Author: 雪 夏 • April 26, 2016 • Case Study • 2,198 Words (9 Pages)
MIS 761 Enterprise Information Management
Assignment 2
Student name: Xiaoteng Li
ID:211658566
Word count: 2045
Part 1
As its name suggests, ETL performs three steps: extraction, transformation and loading. To explain what ETL does, we examine these three steps separately and look at the purpose of each.
First, in the extraction step, ETL’s task is to extract relevant data from any accessible source and reformat it into a specified format (Kabiri & Chiadmi 2013, p. 220). The main problem in this step is how to access both local and remote sources to acquire the data. If necessary, data from external entities can also be used. During extraction, ETL selects only the data required to produce the desired output. Extraction is the prerequisite stage of the ETL process.
Next, the transformation step consists of cleaning and conforming the data. Cleaning operations, such as rejecting bad data and handling missing data, aim to correct wrong data and deliver clean data to decision makers. To make the data correct and compatible, conforming operations check business rules and keys and look up reference data. The transformation process is essential for maintaining the quality and integrity of the data (Davenport 2008).
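The cleaning and conforming operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`customer_id`, `amount`, `country`), the business rule that amounts are non-negative, and the `COUNTRY_LOOKUP` reference table are all hypothetical and do not come from the cited sources.

```python
from typing import Optional

# Hypothetical reference data: valid country codes keyed by source spellings.
COUNTRY_LOOKUP = {"australia": "AU", "au": "AU", "china": "CN", "cn": "CN"}


def clean(record: dict) -> Optional[dict]:
    """Cleaning: reject bad rows and deal with missing values."""
    if not record.get("customer_id"):        # no business key: reject the row
        return None
    cleaned = dict(record)
    if cleaned.get("amount") in (None, ""):  # missing value: apply a default
        cleaned["amount"] = 0.0
    cleaned["amount"] = float(cleaned["amount"])
    if cleaned["amount"] < 0:                # assumed business rule: no negative amounts
        return None
    return cleaned


def conform(record: dict) -> Optional[dict]:
    """Conforming: map source values onto the shared reference data."""
    key = str(record.get("country", "")).strip().lower()
    code = COUNTRY_LOOKUP.get(key)
    if code is None:                         # lookup failed: not in the reference data
        return None
    return {**record, "country": code}


def transform(records: list) -> list:
    """Apply cleaning, then conforming, and keep only rows that pass both."""
    out = []
    for record in records:
        row = clean(record)
        if row is not None:
            row = conform(row)
        if row is not None:
            out.append(row)
    return out
```

Calling `transform` on a list of raw dictionaries returns only the rows that survive both stages, with amounts converted to numbers and countries mapped to a single code.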
Finally, in the loading step, ETL loads the data into targets in the Data Warehouse (DW) environment. The loading process accesses the targets and writes the integrated data into them. Without the loading process, ETL cannot present the transformed data to the user to achieve the desired output.
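To show how the three steps fit together, the following is a minimal sketch of an ETL run in Python. It assumes a hypothetical CSV source (`SOURCE_CSV`) and uses an SQLite database as a stand-in for the DW target; the table name `fact_orders` and all column names are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source: a CSV export with more columns than the warehouse needs.
SOURCE_CSV = """order_id,customer,amount,internal_note
1,alice,10.50,ok
2,bob,,check
3,carol,7.25,ok
"""


def extract(text):
    """Extraction: read the source and keep only the required fields."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"order_id": r["order_id"], "customer": r["customer"], "amount": r["amount"]}
        for r in reader
    ]


def transform(rows):
    """Transformation: drop rows with a missing amount and convert types."""
    return [
        (int(r["order_id"]), r["customer"].title(), float(r["amount"]))
        for r in rows
        if r["amount"]
    ]


def load(conn, rows):
    """Loading: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, text):
    """Run the three steps in order: the data is transformed before it is loaded."""
    load(conn, transform(extract(text)))
```

A run such as `run_etl(sqlite3.connect(":memory:"), SOURCE_CSV)` leaves only the cleaned rows in `fact_orders`; the unused `internal_note` column never reaches the target.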
Because ETL has a clear, predefined target, it is regarded as a “monolithic process” even though it effectively has three stages (Davenport 2008).
ETL is a process in which data is extracted from outside sources, transformed to meet operational needs and then loaded into the DW. The ETL approach is appropriate when the DW involves many different databases, especially when the data needs to be transformed in a separate, specialized engine (Serra 2012). ELT, by contrast, stands for extract, load and transform: a process that extracts data from the sources into a “staging database”, where it is transformed and then loaded into the target DW. In the DW environment, ELT processes provide validated and cleaned source data as offline copies.
According to Davenport (2008), ELT makes it possible to isolate the extract and load process from the transformation process. Unlike in ETL, the extract and load process of ELT picks up not only the data it needs now but also data that may be needed in the future. In other words, the entire source can be loaded into the DW. When the data volume is very large, or the source database is the same as the target database, ELT is more suitable than ETL. With ETL, the transformation can be completed in the staging area, in the ETL tool or in the target database, whereas with ELT the transformation can only be completed in the target database.
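The ELT pattern can be sketched in the same spirit. In this illustrative example the whole source is first copied unchanged into a staging table, and the transformation then runs as SQL inside the target database. SQLite again stands in for the DW, and the names `RAW_ROWS`, `stg_orders` and `fact_orders` are hypothetical.

```python
import sqlite3

# Hypothetical raw source rows, including a column that is not needed yet.
RAW_ROWS = [
    ("1", "alice", "10.50", "ok"),
    ("2", "bob", "", "check"),
    ("3", "carol", "7.25", "ok"),
]


def extract_and_load(conn, rows):
    """Extract and load: copy the source as-is, untransformed, into staging."""
    conn.execute(
        "CREATE TABLE stg_orders "
        "(order_id TEXT, customer TEXT, amount TEXT, internal_note TEXT)"
    )
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def transform_in_target(conn):
    """Transform: run the cleaning and type conversion as SQL in the target database."""
    conn.execute(
        """
        CREATE TABLE fact_orders AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               UPPER(customer)           AS customer,
               CAST(amount AS REAL)      AS amount
        FROM stg_orders
        WHERE amount <> ''
        """
    )
    conn.commit()
```

Because `stg_orders` keeps every source column and row, a later transformation can draw on data that was not needed at load time, which is the isolation between loading and transformation that Davenport describes.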
...