Differences between revisions 2 and 4 (spanning 2 versions)

BIGDATA - Big Data with NoSQL Databases

Requirements

7.5 credits databases and 7 credits programming

Aim

The overall objective of the course is to give the student knowledge about tools and models for managing large amounts of continuously growing and heterogeneous data from many diverse sources.

After completing the course, the student should be able to:
- identify the challenges and opportunities of Big Data
- describe data sources, types of data and properties of Big Data
- describe the modular architecture of the Hadoop framework
- analyze which forms of representation are appropriate with regard to type of data and application
- analyze the needs and effects of distributed storage and analysis
- manage the collection and storage of Big Data
- apply predictive modeling with Big Data

Syllabus

The course discusses the motivations behind the development of Big Data and the technologies developed to handle its properties. Such data usually cannot be handled by traditional database management systems because of its volume and variety and the speed at which it is generated. Alternative forms of data representation have therefore evolved within the NoSQL framework. The course addresses different approaches to NoSQL within Hadoop, which is a modular framework that allows distributed storage and analysis of large amounts of data. The course covers different data sources and types of data, including streaming data. The course also deals with predictive modeling with large amounts of data and gives examples of some typical applications.

Outline

Half speed
Level: undergraduate
Credits: 7.5

Lectures: 12 lectures x 2 hours
Quizzes: 12 (6 theoretical + 6 practical)
Assignments: 3
Projects: 1 (practical project or literature review)
Written examination: 1

Quizzes and the written examination are individual. Assignments are carried out in groups of two students and projects are carried out in groups of four students.

Scheduled supervision is offered four times during the course. Additional supervision is offered via forums on the course platform.

Teaching is in English.

- Revision 2 as of 2017-03-15 14:19:18
  Size: 1762
  Editor: sm@su.se
  Comment:
+ Revision 4 as of 2019-12-16 10:18:45
  Size: 2188
  Editor: sm@su.se
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 3:
-7.5 credits databases and 7.5 credits programming
+7.5 credits databases and 7 credits programming
 Line 5:
-The overall aim of the course is to give the student knowledge of tools and models for managing large data sets and for applications based on them.
+The overall objective of the course is to give the student knowledge about tools and models for managing large amounts of continuously growing and heterogeneous data from many diverse sources.
 Line 7:
-The course will go through the motivation behind the move toward big data and the techniques developed to handle large amounts of data. Such data usually cannot be handled by traditional database management systems because of its volume and variety and the speed at which it is generated. Alternative forms of data representation have therefore been developed within the NoSQL framework. The course covers different approaches to NoSQL within Hadoop, which is a modular framework that allows distributed storage and analysis of large data sets. We will go through different data sources and types of data, including streaming data. The course will also cover predictive modeling with large amounts of data and give examples of some typical applications.
+After completing the course, the student should be able to:
- identify the challenges and opportunities of Big Data
- describe data sources, types of data and properties of Big Data
- describe the modular architecture of the Hadoop framework
- analyze which forms of representation are appropriate with regard to type of data and application
- analyze the needs and effects of distributed storage and analysis
- manage the collection and storage of Big Data
- apply predictive modeling with Big Data
-Line 10:
+Line 17:
-After completing the course, the student should be able to:
 * describe the challenges and opportunities of large data sets
 * describe data sources, types of data and properties of large data sets
 * describe the modular architecture of the Hadoop framework
 * analyze which forms of representation are appropriate with regard to type of data and application
 * analyze the needs and effects of distributed storage and analysis
 * apply the collection and storage of large data sets
 * apply predictive modeling with large data sets
+The course discusses the motivations behind the development of Big Data and the technologies developed to handle its properties. Such data usually cannot be handled by traditional database management systems because of its volume and variety and the speed at which it is generated. Alternative forms of data representation have therefore evolved within the NoSQL framework. The course addresses different approaches to NoSQL within Hadoop, which is a modular framework that allows distributed storage and analysis of large amounts of data. The course covers different data sources and types of data, including streaming data. The course also deals with predictive modeling with large amounts of data and gives examples of some typical applications.
 Line 20:
-Teaching consists of lectures and practical exercises/lessons, as well as a concluding seminar.
Teaching is in English.
+Half speed 
Level: undergraduate 
Credits: 7.5 

Lectures: 12 lectures x 2 hours 
Quizzes: 12 (6 theoretical + 6 practical)
Assignments: 3
Projects: 1 (practical project or literature review)
Written examination: 1

Quizzes and the written examination are individual. Assignments are carried out in groups of two students and projects are carried out in groups of four students.

Scheduled supervision is offered four times during the course. Additional supervision is offered via forums on the course platform.

Teaching is in English.