Talk to Admissions +60 17 305 1737

Skills & Tools

Use Python numerical, machine learning and NLP libraries such as scikit-learn, NumPy, SciPy, Gensim and NLTK to mine datasets and predict patterns.

Production Standard

Build statistical models — classification and clustering — that generate usable information from raw data.

The Big Picture

Master the basics of data science and machine learning and harness the power of data to forecast what’s next.

See What You’ll Learn

Day 1: Foundations

Data Science Concepts and Course Overview

  • What and Why Data Science?
  • Course Outline

Interactive Environment with Jupyter Notebook

  • Your Programming Development Environment
  • Navigate through directories using the command line
  • Installing and Running Jupyter
  • Learn Jupyter UI and Features

Python Fundamentals

  • Arithmetic and String Operations in Python
  • Identifiers, Lists, Dictionaries, and Sets
  • Operators, Control Structures and Loops


  • Lambda and Map Function
  • Globals and Locals

Data Input and Output

  • Data Structure and Data Types
  • File Handling: Import and Export Data
  • Describe and View Data

Numerical Processing with NumPy

  • NumPy Standard Data Types
  • NumPy Arrays
  • Aggregations: Max, Min, Sum, and Average
  • Mathematical Functions
  • Array Manipulation: Sorting and Filtering

Data Manipulation with Pandas

  • Series and DataFrame
  • Data Selection
  • Indexing and Reindexing
  • Iteration
  • Sorting
  • Statistical Functions
  • Aggregations
  • Missing Data
  • GroupBy
  • Merging/Joining
  • Concatenation
  • Sparse Data
  • Categorical Data

Working with APIs: Twitter API

  • How to use Twitter APIs
  • Pre-processing data from APIs


Data Visualisation

  • Line, Scatter, and Density Plots
  • Multiple Subplots
  • Visualising Errors
  • 3D Plotting
  • Customize Legends, Ticks, Colourbars, and Labels

Hands-on Exercises

Day 2: Exploratory Data Analysis


  • Histograms
  • Histogram Representation and Plotting
  • Summarizing Distributions
  • Variance

Probability Mass Functions (PMFs)

Cumulative Distribution Functions (CDFs)

Probability Density Functions (PDFs)

Modelling Distributions

  • Exponential Distribution
  • Normal Distribution
  • Lognormal Distribution
  • Pareto Distribution
  • Generate Simulated Data

Relationships between variables

  • Scatter Plots
  • Correlation
  • Covariance
  • Pearson's correlation
  • Nonlinear relationships
  • Spearman's rank correlation
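
As an illustration of why this section covers both Pearson's and Spearman's coefficients, here is a minimal sketch using only the Python standard library. The sample data and the helper names (`pearson`, `ranks`, `spearman`) are assumptions made for this example, not course material. It shows that a monotonic but nonlinear relationship gives a Pearson coefficient below 1, while the rank-based Spearman coefficient treats it as a perfect association.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Illustrative data: y grows with x, but along a cubic curve, not a line.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 3 for x in xs]

pearson_r = pearson(xs, ys)
spearman_r = spearman(xs, ys)
print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

In practice a statistics library would normally compute both coefficients; the hand-written version is only meant to make the definitions visible.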

Hypothesis Testing

  • Standard Hypothesis Test
  • Testing a difference in means
  • Testing a correlation
  • Testing proportions
  • Chi-squared tests

Hands-on Exercises

Day 3: Classification and Clustering

Classification Methods

  • Linear Regression
  • Logistic Regression
  • Support Vector Machines
  • K-Nearest Neighbours
  • Decision Trees
  • Random Forests
  • Gradient Boosting Trees
  • Ensemble Learning

Clustering Methods

  • K-means
  • Expectation-Maximization (EM) with Gaussian Mixture Models (GMM)
  • Kernel Density Estimation
  • Hierarchical Trees
  • Density-based Spatial Clustering of Applications with Noise (DBSCAN)
  • Distance Measurements

Feature Engineering

  • Principal Component Analysis (PCA)
  • Linear Discriminant Analysis (LDA)

Hyperparameter Tuning

  • Exhaustive Grid Search
  • Randomized Parameter Optimization
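
To make the difference between the two tuning strategies concrete, the sketch below contrasts them in plain Python. Everything in it is an assumption made for illustration: `validation_error` is a hypothetical stand-in for training and scoring a real model, and the parameter names and values are invented. Grid search scores every combination, while randomized search scores only a fixed number of sampled combinations.

```python
import itertools
import random


def validation_error(learning_rate, depth):
    """Hypothetical stand-in for a model's validation error (lower is better)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2


# Illustrative search space; the names and values are assumptions.
param_grid = {
    "learning_rate": [0.01, 0.1, 1.0],
    "depth": [2, 4, 8],
}


def grid_search(grid):
    """Exhaustive search: evaluate every combination in the grid."""
    names = list(grid)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*grid.values())
    ]
    return min(candidates, key=lambda params: validation_error(**params))


def random_search(grid, n_iter, seed=0):
    """Randomized search: evaluate only n_iter randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_iter)
    ]
    return min(candidates, key=lambda params: validation_error(**params))


best_grid = grid_search(param_grid)
best_random = random_search(param_grid, n_iter=4)
print("Grid search best:  ", best_grid)
print("Random search best:", best_random)
```

Grid search is guaranteed to find the best point in the grid at the cost of evaluating all nine combinations here; randomized search trades that guarantee for a fixed evaluation budget, which matters more as the number of hyperparameters grows.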

Performance Evaluation

Deploy Model as REST API

Hands-on Exercises

Day 4: Text Analytics

NLP & Text Processing Libraries

  • NLTK
  • Gensim

Text Normalization

  • Tokenization
  • Stemming
  • Lemmatization

Text Representation Model

  • Bag of Words (BoW)
  • TF-IDF
  • N-gram
  • Dense Cohort of Terms (dCoT)
  • Marginalized Stacked Denoising Autoencoder (mSDA)
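
The first two representations in this list can be sketched in a few lines of standard-library Python. The three sample documents, the whitespace tokenizer, and the helper names below are assumptions made for the example. The TF-IDF formula used is the plain textbook form (term frequency times the log of inverse document frequency); library implementations typically apply smoothing and normalisation on top of it.

```python
import math
from collections import Counter

# Illustrative corpus; real text would first be normalized and tokenized.
docs = [
    "data science is fun",
    "machine learning uses data",
    "python is fun",
]

tokenized = [doc.split() for doc in docs]
vocabulary = sorted({token for doc in tokenized for token in doc})


def bag_of_words(tokens):
    """Bag of Words: one raw count per vocabulary term, ignoring word order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def idf(term):
    """Inverse document frequency: rarer terms receive a larger weight."""
    document_frequency = sum(1 for doc in tokenized if term in doc)
    return math.log(len(tokenized) / document_frequency)


def tf_idf(tokens):
    """TF-IDF: relative term frequency in the document, weighted by IDF."""
    counts = Counter(tokens)
    return [(counts[term] / len(tokens)) * idf(term) for term in vocabulary]


bow_vectors = [bag_of_words(doc) for doc in tokenized]
tfidf_vectors = [tf_idf(doc) for doc in tokenized]

print(vocabulary)
print(bow_vectors[0])
print([round(weight, 3) for weight in tfidf_vectors[0]])
```

In the first document, "science" appears in only one of the three documents and so receives a higher TF-IDF weight than "data", which appears in two, even though both have the same Bag of Words count.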

Word Embedding

  • Word2Vec
  • FastText

Text Classification

Text Similarity

Hands-on Exercises

Day 5: Real Use Cases

Use Case: Intelligent Chatbot
Use Case: Matching Resumes with Jobs
Use Case: Sentiment Analysis
Use Case: Speech Segmentation
Hands-on Exercises

Learn In


Monday - Friday

9am - 5pm

JUNE 10 – JUNE 14

Monday - Friday

9am - 5pm

Get Answers

Have questions? We’ve got the answers. Get the details on how you can grow in this course.

  • Why is this course relevant today?

    Users, products, and online content generate vast amounts of data, and businesses could make far better-informed decisions if that data were analyzed more deeply with data science. This course provides the tools, methods, and practical experience you need to make accurate predictions from data, which ultimately leads to better business decisions and smarter technology (think recommendation systems or targeted ads).

  • What practical skill sets can I expect to have upon completion of the course?

    This course will give you technical skills in machine learning, algorithms, and data modeling that allow you to make accurate predictions from your data. You will build your models in Python, so you will gain a good grasp of the language. You will also learn how to parse and clean your data, a task that can take up to 70% of a data scientist's time.

  • Whom will I be sitting next to in this course?

    Individuals who have a strong interest in manipulating large data sets, finding patterns in data, and making predictions.

  • Are there any prerequisites?

    Prior experience in other programming languages such as C, C++, or Java will be useful.
