CSCI 4370/5370 Data Mining

Fall 2009, 3:00 pm - 3:50 pm, M/W/F, MCSI 338

 

Office Hours:  10:00 am - 1:00 pm Monday, Wednesday, Friday and 2:00 pm - 3:00 pm Monday, Wednesday (by appointment)
Office:  MCSI 304

E-Mail:  bchen [at] uca [dot] edu (You MUST put CSCI4370/5370 in the subject line)
Textbook:  Data Mining: Concepts and Techniques, 2nd edition, Jiawei Han and Micheline Kamber, 2006

Prerequisite:  CSCI 3360 Database Systems

Announcements

Group Project:  Write Section 4 (Results) of the research paper, due Nov 23

Group Project:  Submit the final paper in IEEE format on Nov 30.  Also prepare a 10-minute presentation focusing on the new approach you used and the improved results

 

Class Slides

Syllabus

Aug 24: Introduction

Aug 26: Association Rules

Aug 31: Classification

Sep 2: My Research and Clustering

Sep 9: Research topics

Sep 14: Ch2 Data Preprocessing

Sep 16: Ch2 Data Preprocessing part2

Sep 21: Ch2 Data Preprocessing part3

Sep 23: Ch3 Data Warehouse

Sep 28: Ch3 Data Warehouse part2

Sep 30: Research Project Proposal Presentation

Oct 5: Ch5 Association Rules with Apriori

Oct 9: Ch5 Association Rules with FP Tree

Oct 12: Midterm Exam         Exam File

Oct 14: Positional Association Rules

Oct 19: Positional Association Rules

Oct 21: Ch6 Classification and Prediction

Oct 26: Ch6 Classification and Prediction: SVM

Oct 28: Protein Local 3D Structure Prediction

Nov 2: Cost Sensitive Decision Tree, by Victor

Nov 4: Naive Bayes Classification

Nov 9: Associative Classification

Nov 11: Ch6 Classification and Prediction End

Nov 16: HSSP-BLOSUM62 Value and WebLOGO

            WebLOGO: program, result example

            HSSP-BLOSUM62: program, input data example

            Python 2.4

Nov 18: Ch7 Clustering

Nov 23: Ch7 Clustering part2     IEEE Format

 

 

Research Topics:

Association Rules --

    Super-rules clustering by positional association rules

    (TJ, Michael, Tim, Mon 4pm)

Classification --

    Protein local 3D structure prediction incorporating Chou-Fasman parameters

    (Matt, Tom, Lee, Mon 2:30)

Clustering --

    Using a biclustering algorithm to improve clustering results (Vincent, Wed 4pm)

    Fuzzy-HKmeans clustering model for protein sequence motif discovery

    (Shabbir, Pavan, Abhinav, Naveen, Wed 2:30) (Luke, Chris, Chris, Wed 2:00)