Course Outline

Introduction

This section gives a general overview of when to apply machine learning, what to take into account, and the core concepts, advantages, and disadvantages involved. Topics include data types (structured, unstructured, static, streamed), data quality and volume, data-driven versus user-driven analytics, statistical versus machine learning models, the challenges of unsupervised learning, the bias-variance trade-off, iteration and evaluation, cross-validation, and supervised, unsupervised, and reinforcement learning.

MAJOR TOPICS

1. Grasping Naive Bayes

  • Core concepts of Bayesian methods
  • Probability fundamentals
  • Joint probability
  • Conditional probability via Bayes' theorem
  • The Naive Bayes algorithm
  • Naive Bayes classification
  • The Laplace estimator
  • Applying numeric features within Naive Bayes

2. Grasping Decision Trees

  • Divide and conquer strategies
  • The C5.0 decision tree algorithm
  • Selecting optimal splits
  • Pruning decision trees

3. Grasping Neural Networks

  • From biological to artificial neurons
  • Activation functions
  • Network topology
  • Number of layers
  • Direction of information flow
  • Node count per layer
  • Training neural networks via backpropagation
  • Deep Learning

4. Grasping Support Vector Machines

  • Classification using hyperplanes
  • Identifying the maximum margin
  • Scenarios with linearly separable data
  • Scenarios with non-linearly separable data
  • Utilizing kernels for non-linear spaces

5. Grasping Clustering

  • Clustering as a machine learning task
  • The k-means clustering algorithm
  • Using distance for cluster assignment and updates
  • Selecting the appropriate number of clusters

6. Evaluating Performance for Classification

  • Handling classification prediction data
  • Examining confusion matrices in detail
  • Using confusion matrices to assess performance
  • Beyond accuracy – alternative performance metrics
  • The kappa statistic
  • Sensitivity and specificity
  • Precision and recall
  • The F-measure
  • Visualizing performance trade-offs
  • ROC curves
  • Estimating future performance
  • The holdout method
  • Cross-validation
  • Bootstrap sampling
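As an illustration of the metrics listed above, here is a minimal Python sketch that derives them from the four cells of a binary confusion matrix. It is not taken from the course material; the function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives,
    false negatives, true negatives (hypothetical example inputs).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: agreement adjusted for the agreement expected by chance.
    p_both_positive = ((tp + fn) / total) * ((tp + fp) / total)
    p_both_negative = ((tn + fp) / total) * ((tn + fn) / total)
    expected = p_both_positive + p_both_negative
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Made-up counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.70 while kappa is only 0.40, which shows why accuracy alone can overstate how well a classifier performs.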

7. Optimizing Standard Models for Enhanced Performance

  • Utilizing caret for automated parameter tuning
  • Constructing a simple tuned model
  • Customizing the tuning process
  • Enhancing model performance through meta-learning
  • Understanding ensembles
  • Bagging
  • Boosting
  • Random forests
  • Training random forests
  • Evaluating random forest performance

MINOR TOPICS

8. Grasping Classification via Nearest Neighbors

  • The kNN algorithm
  • Calculating distance
  • Selecting an appropriate k
  • Preparing data for kNN usage
  • Why the kNN algorithm is lazy

9. Grasping Classification Rules

  • Separate and conquer
  • The One Rule algorithm
  • The RIPPER algorithm
  • Rules derived from decision trees

10. Grasping Regression

  • Simple linear regression
  • Ordinary least squares estimation
  • Correlations
  • Multiple linear regression

11. Grasping Regression Trees and Model Trees

  • Incorporating regression into trees

12. Grasping Association Rules

  • The Apriori algorithm for association rule learning
  • Measuring rule interest – support and confidence
  • Building a rule set using the Apriori principle
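To make the two rule-interest measures above concrete, the following Python sketch computes support and confidence over a toy set of transactions. It is an illustration under stated assumptions, not the course's own code: the helper names `support` and `confidence` and the `baskets` data are hypothetical, and the sketch does not implement the Apriori candidate-pruning step itself.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


# Hypothetical market-basket data for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(f"support(bread, milk) = {rule_support:.2f}")
print(f"confidence(bread -> milk) = {rule_confidence:.2f}")
```

The Apriori principle builds on the same quantities: an itemset can only be frequent if all of its subsets are frequent, so candidates whose subsets fall below a chosen support threshold need not be counted.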

Extras

  • Spark/PySpark/MLlib and Multi-armed bandits

Requirements

Proficiency in Python

Duration

21 hours
