DAMI - Data Mining in Computer and System Sciences

Aim

Knowledge and understanding: After having taken the course, the student is expected to:

Abilities and skills: After having taken the course, the student is expected to be able to:

Judgements and values: After having taken the course, the student is expected to:

Syllabus

  Data mining and machine learning
  Fielded applications
  Machine learning and statistics
  Generalization as search
  Data mining and ethics

Input: Concepts, instances, attributes
  What’s a concept?
  What’s in an example?
  What’s in an attribute?
  Preparing the input

Output: Knowledge representation
  Decision tables
  Decision trees
  Classification rules
  Association rules
  Rules with exceptions
  Rules involving relations
  Trees for numeric prediction
  Instance-based representation
  Clusters

Algorithms: The basic methods
  Inferring rudimentary rules
  Statistical modeling
  Divide-and-conquer: constructing decision trees
  Covering algorithms: constructing rules
  Mining association rules
  Linear models
  Instance-based learning
  Clustering
  Further reading

Credibility: Evaluating what’s been learned
  Training and testing
  Predicting performance
  Cross-validation
  Other estimates
  Comparing data mining schemes
  Predicting probabilities
  Counting the cost
  Evaluating numeric prediction
  The minimum description length (MDL) principle
  Applying MDL to clustering

Real machine learning schemes
  Decision trees
  Classification rules
  Extending linear models
  Instance-based learning
  Numeric prediction
  Clustering
  Bayesian networks

Transformations: Engineering the input and output
  Attribute selection
  Discretizing numeric attributes
  Some useful transformations
  Automatic data cleansing
  Combining multiple models
  Using unlabeled data
  Further reading

Moving on: Extensions and applications
  Learning from massive datasets
  Incorporating domain knowledge
  Text and Web mining
  Adversarial situations
  Ubiquitous data mining

Outline

Lectures: 8 x 2 hours
Assignment: 1
Seminars: 12 hours



DAMI (last edited 2011-11-21 14:18:14 by sm@dsv.su.se)