
Machine Learning

About the course

This course on Machine Learning was designed by Professor Sudeshna Sarkar, a faculty member in the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur. It provides a concise introduction to the fundamental concepts of machine learning and to popular machine learning algorithms. Dr. Sarkar worked with our development team to create the course content. The course comprises video lectures that can be viewed online or offline at the learner's convenience. A distinctive feature is that the lectures are accompanied by hands-on problem solving with programming in Python.

The course covers the standard and most popular supervised learning algorithms: linear regression, logistic regression, decision trees, k-nearest neighbour, an introduction to Bayesian learning and the naïve Bayes algorithm, support vector machines and kernels, and neural networks with an introduction to deep learning. It also covers basic clustering algorithms, feature reduction methods, and the basics of computational learning theory. In addition, it addresses issues that arise when applying machine learning algorithms: hypothesis space, overfitting, bias and variance, trade-offs between representational power and learnability, evaluation strategies and cross-validation.



Scholarship Application

*CGPA to percentage conversion formula:

Equivalent Percentage = CGPA obtained × 9.5 × (10 / CGPA scale)

Example: If the CGPA obtained is 8.00 on a scale of 10, the equivalent percentage is 8.00 × 9.5 × (10/10) = 76%.
If the CGPA is 3.7 out of 4, the equivalent percentage is 3.7 × 9.5 × (10/4) = 87.88%.
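
The conversion formula above can be sketched in Python as follows. The function name `cgpa_to_percentage` and the input checks are illustrative assumptions, not part of the scholarship form; only the formula itself comes from the text above.

```python
def cgpa_to_percentage(cgpa: float, scale: float = 10.0) -> float:
    """Equivalent Percentage = CGPA obtained x 9.5 x (10 / CGPA scale).

    The name and the validation below are hypothetical; the formula is
    the one stated on this page.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if not 0 <= cgpa <= scale:
        raise ValueError("cgpa must lie between 0 and scale")
    return cgpa * 9.5 * (10.0 / scale)


if __name__ == "__main__":
    # The two worked examples from the text: 8.00 on a 10-point scale
    # and 3.7 on a 4-point scale.
    for cgpa, scale in [(8.00, 10), (3.7, 4)]:
        pct = cgpa_to_percentage(cgpa, scale)
        print(f"CGPA {cgpa} on a scale of {scale}: {pct:.3f}%")
```

The second example evaluates to about 87.875, which the text reports rounded to two decimal places.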


We urge you to provide correct information to the best of your knowledge. Certificates will be withheld if you are found to have misrepresented any data or information.


  1. Introduction

    1. What is the history of machine learning?

    2. What is the difference between a machine learning solution and a programmatic solution?

    3. What is a formal definition of machine learning?

    4. What are some domains and examples of machine learning?

    5. How can we create a (machine) learner?

  2. Different types of Machine Learning

    1. What are the broad types of machine learning?

    2. What are unsupervised, supervised, semi-supervised and reinforcement learning?

    3. What is supervised learning? (in detail)

    4. What are some examples of Classification and Regression problems?

    5. What are features, what do sample training examples look like, and can we draw schematic diagrams for supervised learning?

    6. What is classification learning, and what are some of its tasks and performance metrics?

    7. How do we get data for the learning problems? How are representations of functions used in machine learning? What is the hypothesis space?

  3. Hypothesis Space and Inductive Bias

    1. What is inductive learning?

    2. What are features and feature vectors?

    3. What is the starting point of a classification problem? What are the feature space and hypothesis space for classification problems?

    4. 5 types of representations of a function

    5. Hypothesis space

    6. Terminology (example, training data, instance space, concept, target function)

    7. What is the size of the hypothesis space (for n Boolean features) and what is a hypothesis language?

    8. What is the inductive learning hypothesis?

    9. What are inductive learning and a consistent hypothesis? Why is inductive learning an ill-posed problem?

    10. What are the various types of bias (Occam's razor, MDL, MM) and what are the important issues in machine learning? What is generalization? (bias and variance)

  4. Evaluation and Cross-Validation

    1. What is experimental evaluation of learning algorithms?

    2. How do we evaluate predictions, and what is absolute error? (evaluate predictions)

    3. What are the sum-of-squares error and the number of misclassifications? (evaluate predictions)

    4. What is a confusion matrix?

    5. What are accuracy, precision and recall? (evaluate predictions)

    6. What is sample error and true error?

    7. What are the sources of errors?

    8. What are the difficulties in evaluating hypotheses with limited data, and what are possible solutions?

    9. How can we evaluate with limited training data?

    10. What is the k-fold cross-validation trade-off in machine learning?

  5. Tutorial I

    1. Introduction to Tutorial I

    2. Types of learning : supervised vs unsupervised learning

    3. Example of supervised vs unsupervised learning

    4. Types of features : categorical vs continuous features

    5. Types of supervised learning: regression vs classification

    6. Bias vs Variance

    7. Generalization performance of a learning algorithm

  6. Linear Regression

    1. What is regression (linear functions and other functions) and what are the various types of regression models?

    2. What is linear regression?

    3. Looking at an example of a training set for regression

    4. What is multiple linear regression?

    5. What assumption are we making for errors?

    6. The least square regression line

    7. How do we learn the parameters (for simple and for multiple linear regression)?

    8. What is the delta or LMS method and how do we use gradient descent?

    9. What are the LMS update or delta rule, batch gradient descent and stochastic gradient descent?

  7. Introduction to Decision Trees

    1. What is a decision tree?

    2. How to draw a sample decision tree for discrete data?

    3. How to draw a sample decision tree for continuous data?

    4. Generate a decision tree from training examples

    5. Decision tree for playing tennis

    6. Introduction to ID3 (searching for a good tree)

  8. Learning Decision Tree

    1. How do we select attributes for decision tree? (information gain, entropy)

    2. Example of creating a decision tree (using ID3 algorithm)

    3. What is GINI Index?

    4. How do we split continuous attributes, and what are the practical issues in classification?

    5. Practical issues in classification

  9. Overfitting

    1. What is overfitting?

    2. An example of underfitting and overfitting

    3. Overfitting due to noise or insufficient examples

    4. How to avoid overfitting?

    5. What is MDL?

    6. What are the conditions for pre pruning?

    7. How do we use reduced error pruning for post pruning?

    8. What are the triple tradeoffs in model selection and generalization?

    9. What is regularization?

  10. Python Exercise on Decision Tree and Linear Regression

    1. Python exercise on linear regression

    2. Python exercise on logistic regression

    3. Python exercise on decision tree regression

  11. Tutorial II

    1. How to solve a sample problem in linear regression?

    2. How to solve problems related to decision trees?

    3. How to find the entropy of a set and use in decision trees?

    4. What is information gain?

  12. K-Nearest Neighbour

    1. What is instance based learning and K-Nearest Neighbour algorithm?

    2. What is the standard distance function (Euclidean distance) and what are the 3 issues related to it?

    3. What are some examples of K-Nearest Neighbour and what is the impact of k?

    4. How can we use weighted distance functions?

    5. Why do we need to remove extra features?

    6. What are the various approaches to giving weights?

  13. Feature Selection

    1. Why do we need feature reduction?

    2. What is the curse of dimensionality?

    3. How can we do feature reduction? (selection and extraction)

    4. How can we evaluate feature subset? (wrapper / supervised and filter / unsupervised)

    5. How can we use the feature selection algorithm? (forward and backward selection algorithm)

    6. What are univariate feature selection methods?

    7. What are multivariate feature selection methods?

  14. Feature Extraction

    1. What is feature extraction and what kind of features do we want?

    2. What are principal components (PCs) and how do we choose features?

    3. How do we choose the direction of the principal components (PCs) and how do we use PCA?

    4. How do we choose a feature (axis) for classification and how is Linear discriminant Analysis useful?

  15. Collaborative Filtering

    1. What is a recommender system?

    2. How can we formally define recommendation problem?

    3. What are the two types of recommendation systems? (content, collaborative filtering)

    4. What are the two types of collaborative filtering? (user-based nearest neighbour, item-based nearest neighbour)

    5. What are the two phases of algorithms for collaborative filtering? (neighbourhood formation, recommendation)

    6. What are the issues with user based KNN CF?

    7. What is item based collaborative filtering?

  16. Python Exercise on KNN and PCA

    1. What will we cover?

    2. How do we use KNeighborsClassifier in Python?

    3. How do we use randomized PCA in Python?

    4. How can we do face recognition using PCA and KNN?

  17. Tutorial III

    1. What is the curse of dimensionality?

    2. What is feature selection?

    3. What is feature reduction and PCA? (principal component analysis)

    4. How do you calculate the eigenvalues and eigenvectors of a matrix?

    5. What is K-NN (K Nearest Neighbour) Classification?

  18. Bayesian Learning

    1. How is probability used for modelling concepts?

    2. What is Bayes theorem?

    3. Can we look at an example of Bayes theorem?

    4. How can Bayes theorem be applied to find the hypothesis in machine learning? (MAP hypothesis)

    5. What is Bayes optimal classifier?

    6. Gibbs sampling

  19. Naive Bayes

    1. Naive Bayes algorithm

    2. Naive Bayes algorithm for discrete X

    3. What is smoothing and why is it required?

    4. Can we look at an example of the naive Bayes algorithm for discrete X?

    5. How do we use smoothing when estimating parameters?

    6. What is the assumption that we made in naive Bayes and what happens if it is invalid?

    7. What is Gaussian naive Bayes? (for continuous X, but discrete Y)

    8. What are Bayesian networks?

  20. Bayesian Network

    1. Why do we need Bayesian networks?

    2. Can we look at an example of a Bayesian network?

    3. What does a Bayesian network represent?

    4. What can we do with a Bayesian network (inference)?

    5. Where can we apply Bayesian networks?

    6. How do we define a Bayesian network?

    7. What is the graphical representation of the naive Bayes model?

    8. What is the hidden Markov model?

    9. How is learning helped by Bayesian belief networks?

  21. Python Exercise on Naive Bayes

    1. How to use the naive Bayes classifier?

    2. What is the naive Bayes classifier?

    3. How is the naive Bayes classifier relevant in the context of email spam classification?

  22. Tutorial IV

    1. How do we estimate probabilities from a frequency distribution?

    2. How do we use Bayes' rule?

    3. What is MAP inference?

    4. What is the naive Bayes assumption?

    5. What are Bayesian networks (the structures), inference and marginalization?

  23. Logistic Regression

    1. What are logistic regression (for classification problems) and the sigmoid function?

    2. What are some of the interesting properties of the sigmoid function?

    3. How can we use stochastic gradient descent with logistic regression?

  24. Introduction to Support Vector Machine

    1. Support vector machine

    2. Functional margin

    3. Functional margin of a set of points

    4. Solving the optimization problem

  25. SVM The Dual Formulation

    1. Lagrangian duality in brief

    2. The KKT conditions

    3. Implication of Lagrangian

    4. The dual problem

  26. SVM Maximum Margin with Noise

    1. Linear SVM formulation

    2. Limitation of previous SVM formulation

    3. What objective is to be minimized?

    4. Lagrangian

    5. Dual formulation

  27. Nonlinear SVM and Kernel Function

    1. Non-linear SVM, feature space and kernel function

    2. Kernel trick

    3. Commonly used kernel functions

    4. Performance

  28. SVM Solution to the Dual Problem

    1. SMO algorithm (sequential minimal optimization)

    2. Coordinate ascent

    3. SMO (for dual problem)

  29. Python Exercise on SVM

    1. Support vector classification

    2. Visualize the decision boundaries

    3. Load data

  30. Introduction to NN

    1. Neural network and neuron

    2. Perceptron - basic unit in NN

    3. Gradient descent

    4. Stochastic gradient descent

    5. Multi-layer networks - by stacking many NN

  31. Multilayer Neural Network

    1. Limitation of perceptrons

    2. Multi-layer NN

    3. Power/ Expressiveness of multilayer networks

    4. Two-layer back-propagation neural network

    5. Learning for BP nets

    6. Derivation

  32. Neural Network and Backpropagation Algorithm

    1. Single layer perceptron and boolean functions (OR, XOR)

    2. Representation capability of NNs

    3. Learning in multi-layer NN using back propagation

    4. Derivation

    5. Back propagation algorithm

    6. Training practices: batch vs stochastic and learning in epoch

    7. Overfitting in ANNs and local minima

  33. Deep Neural Network

    1. Deep learning

    2. Hierarchical representation & unsupervised pre-training

    3. Architecture & Training

    4. Pooling

    5. CNN properties

  34. Python Exercise on Neural Network

    1. How can we create an artificial neural network using TensorFlow and TFLearn to recognize handwritten digits?

    2. How do we load dependencies (to recognize handwritten digits)?

    3. How do we load the data (to recognize handwritten digits)?

    4. How do we make the model (to recognize handwritten digits)?

    5. How do we train the model (to recognize handwritten digits)?

    6. What is our takeaway from this exercise (to recognize handwritten digits)?

  35. Tutorial VI

    1. What is a perceptron?

    2. What is perceptron learning rule?

    3. How do we represent a boolean function using a perceptron?

    4. What is forward and backward pass algorithm or backpropagation algorithm?

    5. Stochastic gradient descent and batch gradient descent

    6. Quick overview of some deep learning algorithms

  36. Introduction to Computational Learning Theory

    1. Goal of learning theory & Core aspect of machine learning

    2. PAC

    3. Prototypical concept learning task

  37. Sample Complexity Finite Hypothesis Space

    1. What is Sample Complexity?

    2. Can we look at an example of consistent case?

    3. What is Find-S algorithm and what can it do?

  38. VC Dimension

    1. What kind of theorems do we have when the hypothesis space is infinite?

    2. What is shattering?

    3. What is the definition of VC dimension?

    4. What are the upper bound and lower bound on sample complexity with VC dimension?

  39. Introduction to Ensembles

    1. What is ensemble learning?

    2. How can we use weak learners?

    3. How can we combine learners in Bayesian classifiers?

    4. Why are ensembles successful and what are the main challenges with them?

  40. Bagging and Boosting

    1. What is Bagging?

    2. What is Boosting and what is AdaBoost?

    3. Why does ensembling work?

  41. Introduction to Clustering

    1. What is unsupervised learning and clustering?

    2. What are some applications of clustering, and what are the various aspects of clustering?

    3. Major clustering approaches

    4. How can we measure the quality of clustering?

  42. Kmeans Clustering

    1. What is K-means algorithm?

    2. How can we describe K-means Algorithm, and can we look at an illustration of it?

    3. What are the similarity and distance measures?

    4. What is the proof of convergence of K-means, time complexity, advantages and disadvantages?

    5. What is model based clustering?

    6. How can we apply K-means on a RGB image?

    7. What is EM algorithm?

  43. Agglomerative Hierarchical Clustering

    1. What is hierarchical clustering, bottom up and top down clustering?

    2. What is a Dendrogram?

    3. What is the algorithm for Agglomerative Hierarchical Clustering?

    4. What is the complete link method?

    5. What is average link clustering?

  44. Python Exercise on kmeans clustering

    1. Can we look at Python code for the K-means algorithm?

    2. Can we look at Python code for a Gaussian mixture model?

    3. Hierarchical agglomerative clustering

  45. Tutorial VIII

    1. What is K-means clustering?

    2. Solving a sample problem on K-means clustering

    3. What is agglomerative hierarchical clustering?

    4. What is a Gaussian mixture model?

  46. Machine Learning Final Quiz