IS5 - Data warehousing
90 hp Computer and Systems Sciences with at least:
- 7,5 hp database
Data Warehouses are systems used for reporting and data analysis, and they are considered as a core component of business intelligence. This course will help you to design and implement data warehouses. You will learn different techniques and principles based on which you can design an effective and efficient data warehouse. In addition, you will experience the full data warehouse life-cycle by implementing a system according to latest advances in this area.
The general goal with this course is to familiarize the students with a special kind of information systems called Data Warehouses – their role, utilization and benefits for organisations, as well as architectures and underlying technologies relevant for their development.
After taking this course the student should have achieved the following objectives:
- Is able to account for central terminology in the area, (in particular the terminology introduced by Ralph Kimball in the course literature, e.g. multidimensional model, star join schema, fact table, degenerate dimension, ETL, hierarchy, business intelligence, bitmap indexing, OLAP). Is able to explain how a data warehouse functions and how it differs from a traditional OLTP database. Is able to specify ETL processes and multi-dimensional expressions, i.e. MDX, as well as account for the functionality of back- and front-end tools;
- Is able to apply the dimensional modelling techniques presented in the course literature and from a textual domain description produce and document dimensional models for a data warehouse. Be able to apply the strategies for physical data warehouse design presented in the course literature;
- Can, from given sources, produce models and routines for data transfer into a data warehouse and practically apply these in a specific tool for populating a warehouse. Can use a given ETL system to extract data from different files and load it into relational tables. Is able to deal with history in continuous ETL processes;
- Is able to implement dimensional models in a given system, populate these models with data, and use a front-end tool for extracting, calculating and analyzing the data present in a data warehouse;
- Is able to implement theories and techniques for data warehousing presented in given research literature in the area.
The course contains the following parts:
- Information requirements and information modeling relevant for enterprise management level
- History, state-of-the-art, trends for data warehouse
- Multidimensional modeling as a practical method for Data Warehouse design
- Physical design, indexing techniques and the use of aggregate tables
- The ETL process (Extract-Transform-Load)
- Project management and lifecycle during the development and the maintenance of a Data Warehouse
- Decision support systems and business intelligence (BI)
The course is though in a traditional way. It contains a number of lectures, lessons, seminars and tutorials. Course schedule with the different activities is provided in Daisy. Students have to do self studies (i.e. read the course literature) and carry out a number of practical assignments. The assignments are completed in groups. The lectures and lessons introduce the students to the theory needed for the completion of the assignments. Tutorials are offered to students working with the assignments. The assignments are reported and discussed during seminars or meetings.
Note that this is not a distance course; hence students are strongly recommended to participate in the activities offered during the course.