CS 583: Artificial Intelligence II
(Data Mining)
Instructor: Jugal Kalita
Relevant Links
Class Material
- Syllabus: Textbook, Grading Scheme
Homework Assignments
I will give you 2 or 3 homework assignments. They will involve programs that learn from real or imagined data. Please make sure you finish all assignments before the final
and demo them to me.
Lecture Schedule
Here is the list of topics discussed in class.
- Two lectures: Chapter 1 of Data Mining by Margaret Dunham, Introduction: Basic Data Mining Tasks, Data Mining vs. Knowledge Discovery in Databases, Data Mining Metrics, Data Mining Issues, etc.
- Two lectures: Chapter 2 of Data Mining by Margaret Dunham, Related Concepts: Information Retrieval, Web Search Engines, Machine Learning, Pattern Matching, Fuzzy Sets and Fuzzy Logic, etc.
- Two lectures: Chapter 2 of Neuro-Fuzzy and Soft Computing by Jang, Sun and Mizutani, Fuzzy Sets: Introduction, Basic Definitions and Terminology; Set-Theoretic Operations, Membership Functions and Parameterization; Other Ways of Implementing Fuzzy Union, Intersection and Complement
- Two lectures: Chapter 3 of Neuro-Fuzzy and Soft Computing by Jang, Sun and Mizutani, Fuzzy Rules and Fuzzy Reasoning: Extension Principle, Fuzzy If-Then Rules, Fuzzy Reasoning
- One lecture: Section 5.5 of Expert Systems by Giarratano and Riley, Fuzzy Rules: Max-min Composition, Moment Methods
- Two lectures: Chapter 5 of Introduction to Modern Information Retrieval by Salton and McGill, Retrieval Evaluation: Recall and Precision, Fallout, Generality, Single Value Measures
- One lecture: Chapter 17 of Numerical Methods by Chapra and Canale, Least-Squares Regression: Linear Regression, Straight Line Fitting, Errors, Linearization of Non-linear Relationships, Polynomial Regression, Multiple Linear Regression, General Linear Least Squares, Non-linear Regression
- Two lectures: Chapter 3 of Machine Learning by Mitchell, Decision Tree Learning: Introduction, Entropy Measure, Information Gain, Inductive Biases, Avoiding Overfitting, Continuous-valued Attributes, Differing Cost Attributes
- One lecture: Chapter 5 of an old KEE Manual from Intellicorp, Understanding Rule Based Reasoning: Forward Chaining, Backward Chaining, Choosing Between the Two, Using Rules with Variables; Section 5.3, Production Systems, from Artificial Intelligence by Luger
- One lecture: Chapter 4 of Machine Learning by Mitchell, Artificial Neural Networks: Introduction, Perceptrons, Perceptron Training Rule, Gradient Descent and Delta Rule, Backpropagation Algorithm, Convergence Issues (with help from Tony Anzelmo)
- One lecture: Chapter 9 of Machine Learning by Mitchell, Genetic Algorithms: Genetic Operators, Fitness Function, Example
- Two lectures: Chapter 5 of Data Mining by Margaret Dunham, Clustering: Similarity and Distance Measures, Outliers, Hierarchical Algorithms, Partitional Algorithms, Clustering Large Databases
- Two lectures: Chapter 6 of Data Mining by Margaret Dunham, Association Rules: Definitions, Apriori Algorithm, Sampling, Partitioning Algorithm
- One lecture (by Jeff Schott): Chapter 9 of The Handbook of Data Mining by Ye, Psychometric Methods of Latent Variable Modeling
- One lecture (by Ankur Deshmukh): Chapter 5 of The Handbook of Data Mining by Ye, Bayesian Data Analysis
- One lecture: Naive Bayes Classifiers for Spam Detection by Jugal Kalita, MXLogic, Inc., from Summer 2002
- One lecture (by Steve Boone): Chapter 9 of Data Mining by Margaret Dunham, Temporal Mining
- One lecture (by Ankur Patwa): Chapter 21 of The Handbook of Data Mining by Ye, Text Mining
- One lecture (by Jaya Potharaju): Chapter 27 of The Handbook of Data Mining by Ye, Mining Image Data
- One lecture (by Tony Anzelmo): Chapter 14 of The Handbook of Data Mining by Ye, Data Collection, Preparation, Quality and Visualization
- One lecture (by Priyadarshini Selvam): Chapter 25 of The Handbook of Data Mining by Ye, Mining Customer Relationship Management (CRM) Data