Data Mining (ETF RIO DM 51060)

General information 

Module title  Data Mining 
Module code  ETF RIO DM 51060 
Study  ETFB 
Department  Computing and Informatics 
Year  2 
Semester  4 
Module type  Mandatory 
ECTS  5 
Hours  60 
Lectures  35 
Exercises  25 
Tutorials  0 
Module goal  Knowledge and skills to be achieved by students

Introduction to the principles of analysing data in arbitrary contexts and of finding new relations and information useful for strategic decision making.
Introduction to the elements of an internal search process: defining the search targets, collecting the selected data, preparing it for filtering, and an introduction to data mining techniques and algorithms.
Acquiring the knowledge needed to choose the best technique for solving a particular knowledge discovery problem.
Acquiring knowledge about applying data mining techniques and algorithms, and about interpreting and presenting the results obtained.

Syllabus 

INTRODUCTION TO DATA MINING
1. Strategic decision making
2. Strategic planning
3. Knowledge discovery process, defining targets
4. Choice of source data (text, web, image)
DATA MINING TECHNIQUES
5. String matching, brute-force string matching
6. Linear editing algorithms, finite-automata-based string matching, Knuth-Morris-Pratt algorithm, approximate string matching, Wagner-Fischer algorithm for computing string distances
7. Classification, decision-tree-based classification, Bayesian classification, distance-based learners, support vector machines
8. Fuzzy decision trees
9. Clustering, distance measures and symbolic objects, clustering categories, scalable clustering algorithms, approaches based on soft computing, hierarchical symbolic clustering, segmentation
10. Association rules, candidate generation and test methods, rules of interest, multilevel rules, online rule generation, generalized rules, temporal association rules
11. Filtering and data transformation, validation and visualization of results
ARCHITECTURES AND STANDARDIZATION
12. Data mining system architectures, standardization of information obtained by data mining

Literature 

Recommended  1. Notes and slides from lectures (see the Faculty website)
2. Han, Kamber: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000.
Additional  1. Hand, Mannila, Smyth: Principles of Data Mining, MIT Press, 2001 
Didactic methods 

Through lectures, students learn the theory, tasks and application examples within each thematic unit. Lectures consist of a theoretical part, descriptive examples, and the derivation and solution of specific tasks. This gives students a basis for applying the material covered in engineering applications. Additional examples and exam tasks are discussed and solved during the laboratory exercises. Laboratory practice and home assignments enable students to work continuously and to verify their knowledge.
Exams 

During the course, students collect points according to the following system:
  - Attending lectures, exercises and tutorial classes: 10 points; a student with more than three absences from lectures, exercises and/or tutorials cannot earn these points;
  - Home assignments: a maximum of 10 points, assuming 5 to 10 assignments evenly distributed throughout the semester;
  - Partial exams: two written partial exams, with a maximum of 20 points for each positively evaluated partial exam.
A student who earns fewer than 20 points during the semester must re-enroll in the course.
A student who earns 40 or more points during the semester may take the final oral exam, which consists of discussing the partial exam tasks and home assignments and answering simple questions related to the course topics.
The final oral exam provides a maximum of 40 points. To achieve a positive final grade, a student must earn at least 20 points in this exam. Students who do not reach this minimum may take the makeup oral exam.
A student who earns 20 or more but fewer than 40 points during the semester may take the makeup exam. The makeup exam is structured as follows:
  - Written part, structured in the same way as a partial written exam, in which students solve problems from the topics they failed in the partial exams (fewer than 10 points achieved);
  - Oral part, structured in the same way as the final oral exam.
Only students who reach a total score of 40 or more points after passing the written part of the makeup exam may take the oral makeup exam. This score consists of the points earned through class attendance, home assignments, passed partial exams and the passed written part of the makeup exam.
The oral makeup exam provides a maximum of 40 points. To achieve a positive final grade, a student must earn at least 20 points in this exam. Students who do not reach this minimum must re-enroll in the course.
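The point rules above can be sketched as a small calculation. This is only an illustrative reading of the rules, not an official grading tool: the function names are hypothetical, attendance is treated as all-or-nothing (10 or 0 points), and the partial exam arguments are assumed to be the points already credited for each exam.

```python
# Illustrative sketch of the semester point rules; all names are hypothetical.

MAX_HOMEWORK = 10   # home assignments: at most 10 points
MAX_PARTIAL = 20    # each partial exam: at most 20 points
ORAL_PASS_MIN = 20  # minimum oral exam score for a positive grade


def attendance_points(absences: int) -> int:
    """Assumption: 10 points, lost entirely with more than three absences."""
    return 10 if absences <= 3 else 0


def semester_points(absences: int, homework: int,
                    partial1: int, partial2: int) -> int:
    """Sum attendance, homework and credited partial exam points (capped)."""
    return (attendance_points(absences)
            + min(homework, MAX_HOMEWORK)
            + min(partial1, MAX_PARTIAL)
            + min(partial2, MAX_PARTIAL))


def next_step(points: int) -> str:
    """Map the semester total to the step described in the rules."""
    if points < 20:
        return "re-enroll"
    if points < 40:
        return "makeup exam"
    return "final oral exam"


def oral_passed(oral_points: int) -> bool:
    """An oral exam (final or makeup) needs at least 20 of 40 points."""
    return oral_points >= ORAL_PASS_MIN
```

For example, a student with 2 absences, 8 homework points and partial exam scores of 15 and 12 has 10 + 8 + 15 + 12 = 45 semester points, so `next_step(45)` gives `"final oral exam"`. With 4 absences the same student has 35 points and goes to the makeup exam. The maximum semester total is 60 points, which together with the 40-point oral exam gives 100.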

Additional notes
