Course Outline

1. Understanding classification with nearest neighbors

  • The kNN algorithm 
  • Distance calculation 
  • Selecting an optimal k 
  • Data preparation for kNN application 
  • Why is the kNN algorithm considered lazy?

2. Exploring naive Bayes 

  • Core concepts of Bayesian methods 
  • Probability fundamentals 
  • Joint probability
  • Conditional probability through Bayes' theorem 
  • The naive Bayes algorithm 
  • Naive Bayes classification 
  • The Laplace estimator
  • Incorporating numeric features with naive Bayes

3. Exploring decision trees 

  • Divide and conquer strategies 
  • The C5.0 decision tree algorithm 
  • Selecting the optimal split 
  • Pruning the decision tree

4. Exploring classification rules 

  • Separate and conquer approach 
  • The One Rule algorithm 
  • The RIPPER algorithm 
  • Deriving rules from decision trees

5. Exploring regression 

  • Simple linear regression 
  • Ordinary least squares estimation 
  • Correlations 
  • Multiple linear regression

6. Exploring regression trees and model trees 

  • Integrating regression into trees 

7. Exploring neural networks 

  • Transitioning from biological to artificial neurons 
  • Activation functions 
  • Network topology
      • Layer count
      • Information flow direction
      • Node count per layer
  • Training neural networks using backpropagation

8. Exploring Support Vector Machines 

  • Classification using hyperplanes 
  • Identifying the maximum margin 
  • Scenarios involving linearly separable data 
  • Scenarios involving non-linearly separable data 
  • Utilizing kernels for non-linear spaces

9. Exploring association rules 

  • The Apriori algorithm for association rule learning 
  • Evaluating rule interest – support and confidence 
  • Constructing a rule set based on the Apriori principle

10. Exploring clustering

  • Clustering as a machine learning task
  • The k-means algorithm for clustering 
  • Employing distance for cluster assignment and updates 
  • Selecting the appropriate number of clusters

11. Assessing performance for classification 

  • Handling classification prediction data 
  • Examining confusion matrices in detail 
  • Utilizing confusion matrices to gauge performance 
  • Beyond accuracy – additional performance metrics
      • The kappa statistic
      • Sensitivity and specificity
      • Precision and recall
      • The F-measure
  • Visualizing performance tradeoffs
      • ROC curves
  • Forecasting future performance
      • The holdout method
      • Cross-validation
      • Bootstrap sampling

12. Optimizing stock models for enhanced performance 

  • Utilizing caret for automated parameter tuning
      • Developing a simple tuned model
      • Customizing the tuning process
  • Enhancing model performance through meta-learning
      • Understanding ensembles
      • Bagging
      • Boosting
      • Random forests
          • Training random forests
          • Evaluating random forest performance

13. Deep Learning

  • Three Categories of Deep Learning
  • Deep Autoencoders
  • Pre-trained Deep Neural Networks
  • Deep Stacking Networks

14. Review of Specific Application Areas

 21 Hours
