IS5 - Data warehousing
Advanced level - Second cycle course
90 credits in Computer and Systems Sciences or 90 credits in a DSV bachelor programme with at least 60 credits in Computer and Systems Sciences.
Including at least:
- 7,5 hp database methodology
In this course, student will learn about one of the most important ingredient to a Business Intelligence (BI) and Data Science area, called data warehousing, which is very high demand in the job market. They will learn about techniques to Extract Transform and Load (ETL) the data. They will learn theories about modeling and implementation of data warehouses. In addition, they will get familiar with multi-dimensional data storage, called OnLine Analytical Processing (OLAP). They will also learn about languages that help them to query information from OLAP systems. Data warehouses are used within Business Intelligence (BI) to support organizations strategic decision making by utilizing Decision Support Systems. This course will help students to design and implement data warehouses. They will learn different techniques and principles based on which they can design an effective and efficient data warehouse. In addition, they will experience the full data warehouse life-cycle by implementing a system according to latest advances in this area.
The general goal with this course is to familiarize the students with a special kind of information systems called Data Warehouses – their role, utilization and benefits for organisations, as well as architectures and underlying technologies relevant for their development.
After taking this course the student should have achieved the following objectives:
1. knows central terminology in the area.
2. is able to produce and document dimensional models for a data warehouse based on an informal domain description
3. from a given source, produce routines for data transfer into a data warehouse, and able to write queries to fetch data from a data warehouse using MDX.
4. is able to implement dimensional models in a given system, populate these models with data, and use a front-end systems for extracting and analyzing the data present in a data warehouse. Can use a given ETL system to extract data from different files and load it into relational tables
5. can summarize, present and assess results from research literature in the area
The course contains the following parts:
1. Introduction to decision support systems and Business Intelligence
2. History, state-of-the-art within data warehousing
3. Rationale and patterns for multi-dimensional modeling
4. The ETL process (Extract-Transform-Load) and data integration
5. The Data Warehouse Lifecycle (DW development and maintenance process)
6. OLAP (Online Analytical Processing) and a multi-dimensional query language for data cubes
7. Trends within data warehousing and current research
The course contains a number of lectures, lessons, labs and tutorials. Course schedule with the different activities is provided in Daisy. Students have to do self studies (i.e. read the course literature) and carry out a number of practical assignments. The assignments are completed in groups. The lectures and lessons introduce the students to the theory needed for the completion of the assignments. Tutorials are offered to students working with the assignments. The assignments are reported and discussed during seminars or meetings. The ETL and OLAP part of assignments are based on SQL Server Integration Services (SSIS) and SQL Server Analysis Services (SSAS), which are very high demand and widely used in industry.
Note that this is not a distance course; hence students are strongly recommended to participate in the activities offered during the course.