1. Extraction, Load and Transform (ELT):
Extraction, Load and Transform (ELT) is the technique of extracting raw data from the source and storing it in data warehouse of the target server and preparing it for endstream users.
ELT comprises of 3 different operations performed on the data:
Extracting data is the technique of identifying data from one or more sources. The sources may be databases, files, ERP, CRM or any other useful source of data.
Loading is the process of storing the extracted raw data in data warehouse or data lakes.
Data transformation is the process in which the raw data source is transformed to the target format required for analysis.
Data from the sources are extracted and stored in the data warehouse. The entire data is not transformed but only the required transformation is done when necessary. Raw data can be retrieved from the warehouse anytime when required. The data transformed as required is then sent forward for analysis. When you use ELT, you move the entire data set as it exists in the source systems to the target. This means that you have the raw data at your disposal in the data warehouse, in contrast to the ETL approach.
2. Extraction, Transform and Load (ETL):
ETL is the traditional technique of extracting raw data, transforming it for the users as required and storing it in data warehouses. ELT was later developed, having ETL as its base. The three operations happening in ETL and ELT are the same except that their order of processing is slightly varied. This change in sequence was made to overcome some drawbacks.
It is the process of extracting raw data from all available data sources such as databases, files, ERP, CRM or any other.
The extracted data is immediately transformed as required by the user.
The transformed data is then loaded into the data warehouse from where the users can access it.
The data collected from the sources are directly stored in the staging area. The transformations required are performed on the data in the staging area. Once the data is transformed, the resultant data is stored in the data warehouse. The main drawback of ETL architecture is that once the transformed data is stored in the warehouse, it cannot be modified again whereas in ELT, a copy of the raw data is always available in the warehouse and only the required data is transformed when needed.
Difference between ELT and ETL:
|ELT tools do not require additional hardware||ETL tools require specific hardware with their own engines to perform transformations|
|Mostly Hadoop or NoSQL database to store data.Rarely RDBMS is used||RDBMS is used exclusively to store data|
|As all components are in one system, loading is done only once||As ETL uses staging area, extra time is required to load the data|
|Time to transform data is independent of the size of data||The system has to wait for large sizes of data. As the size of data increases, transformation time also increases|
|It is cost effective ans available to all business using SaaS solution||Not cost effective for small and medium business|
|The data transformed is used by data scientists and advanced analysts||The data transformed is used by users reading report and SQL coders|
|Creates ad hoc views.Low cost for building and maintaining||Views are created based on multiple scripts.Deleting view means deleting data|
|Best for unstructured and non-relational data. Ideal for data lakes. Suited for very large amounts of data||Best for relational and structured data. Better for small to medium amounts of data|
GeeksforGeeks has prepared a complete interview preparation course with premium videos, theory, practice problems, TA support and many more features. Please refer Placement 100 for details
- Difference between HLP and PCP
- Difference between CPU and GPU
- Difference Between BFS and DFS
- Difference between PIP and PCP
- Difference between LAN and MAN
- Difference between PNG and GIF
- Difference between TCP and RTP
- Difference Between Blu-Ray and DVD
- Difference between CRT and LCD
- Difference between 4G and 5G
- Difference between LAN, MAN and WAN
- Difference between LAN and WAN
- Difference between OOP and POP
- Difference between PIP and HLP
- Difference Between SEO and SEM
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to email@example.com. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.