Talk to Admissions +60 17 305 1737

Math and programming skills

Skills & Tools

Use Python numerical, machine learning and NLP libraries such as scikit-learn, NumPy, SciPy, Gensim and NLTK to mine datasets and predict patterns.

Data manipulation tools

Production Standard

Build statistical models — classification and clustering — that generate usable information from raw data.

Learn to make predictions with modeling

The Big Picture

Master the basics of data science and machine learning and harness the power of data to forecast what’s next.

See What You’ll Learn

Day 1: Foundations

Data Science Concepts and Course Overview

  • What and Why Data Science?
  • Course Outline

Interactive Environment with Jupyter Notebook

  • Your Programming Development Environment
  • Navigate through directories using the command line
  • Installing and Running Jupyter
  • Learn Jupyter UI and Features

Python Fundamentals

  • Arithmetic and String Operations in Python
  • Identifiers, Lists, Dictionaries, and Sets
  • Operators, Control Structures and Loops


  • Lambda and Map Function
  • Globals and Locals

Data Input and Output

  • Data Structure and Data Types
  • File Handling: Import and Export Data
  • Describe and View Data

Numerical Processing with NumPy

  • NumPy Standard Data Types
  • NumPy Arrays
  • Aggregations: Max, Min, Sum, and Average
  • Mathematical Functions
  • Array Manipulation: Sorting and Filtering

Data Manipulation with Pandas

  • Series, Dataframe
  • Data Selection
  • Indexing and Reindexing
  • Iteration
  • Sorting
  • Statistical Functions
  • Aggregations
  • Missing Data
  • GroupBy
  • Merging/Joining
  • Concatenation
  • Sparse Data
  • Categorical Data

Working with APIs: Twitter API

  • How to use Twitter APIs
  • Pre-processing data from APIs

Data Visualisation

  • Line, Scatter, and Density Plots
  • Multiple Subplots
  • Visualising Errors
  • 3D Plotting
  • Customize Legends, Ticks, Colourbars, and Labels

Hands-on Exercises

Day 2: Classification and Clustering

Classification Methods

  • Linear Regression
  • Logistic Regression
  • Support Vector Machines
  • K-Nearest Neighbours
  • Decision Trees
  • Random Forests
  • Gradient Boosting Trees
  • Ensemble Learning

Clustering Methods

  • K-means
  • Expectation-Maximization (EM) with Gaussian Mixture Models (GMM)
  • Kernel Density Estimation
  • Hierarchical Trees
  • Density-based Spatial Clustering (DBSCAN)
  • Distance Measurements

Feature Engineering

  • Principal Component Analysis (PCA)
  • Linear Discriminant Analysis (LDA)

Hyperparameter Tuning

  • Exhaustive Grid Search
  • Randomized Parameter Optimization

Model Evaluation

  • Underfitting
  • Overfitting
  • K-fold Cross-Validation
  • Model Verification
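
As a taste of this module, here is a minimal sketch of how K-fold cross-validation splits a dataset into train and test indices. It uses plain Python, and the function name k_fold_indices is an assumption made for illustration; it is not taken from the course materials or from any library, and in practice you would likely rely on a library implementation.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Illustrative helper (hypothetical name): every sample lands in the
    test set exactly once across the k folds.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each fold is held out once for testing while the model is fitted on the rest; averaging the scores over the folds gives a steadier estimate than a single train/test split.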

Performance Evaluation

  • Accuracy
  • Recall
  • Precision
  • F1-Score
  • Macro-Recall
  • Macro-Precision
  • Macro-F1 Score
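
To make these metrics concrete, here is a minimal sketch that computes accuracy, per-class precision, recall and F1, and their macro averages in plain Python. The function names and the toy labels are assumptions made for illustration. Macro-F1 is taken here as the unweighted mean of the per-class F1 scores, which is one common convention.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (no predictions or no true samples) are set to 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) across classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy labels, invented for this example.
y_true = ["cat", "cat", "cat", "dog", "dog", "bird"]
y_pred = ["cat", "cat", "dog", "dog", "dog", "cat"]

scores = per_class_scores(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_average(scores)
print("accuracy:", accuracy(y_true, y_pred))
print("macro P/R/F1:", macro_precision, macro_recall, macro_f1)
```

Because macro averages weight every class equally, the rare "bird" class, which is never predicted correctly, pulls the macro scores well below the overall accuracy.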

Deploy Model as REST API
Practical Use Cases

Day 3: Exploratory Data Analysis


  • Histograms
  • Histograms Representation and Plotting
  • Summarizing Distributions
  • Variance

Probability Mass Functions (PMFs)
Cumulative Distribution Functions (CDFs)
Probability Density Functions (PDFs)

Modelling Distributions

  • Exponential Distribution
  • Normal Distribution
  • Lognormal Distribution
  • Pareto Distribution
  • Generate Simulated Data

Relationships Between Variables

  • Scatter Plots
  • Correlation
  • Covariance
  • Pearson's correlation
  • Nonlinear relationships
  • Spearman's rank correlation
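
The difference between Pearson's and Spearman's correlation is easiest to see on a nonlinear but monotonic relationship. The sketch below is a minimal plain-Python illustration; the helper names and the exponential toy data are assumptions made for this example, not course material.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# y grows exponentially with x: monotonic, but far from a straight line.
xs = [1, 2, 3, 4, 5, 6]
ys = [math.exp(x) for x in xs]
print("Pearson: ", pearson(xs, ys))
print("Spearman:", spearman(xs, ys))
```

Spearman's coefficient reaches 1 because the ordering of the points is perfectly preserved, while Pearson's stays below 1 because the points do not fall on a line.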

Hypothesis Testing

  • Standard Hypothesis Test
  • Testing a difference in means
  • Testing a correlation
  • Testing proportions
  • Chi-squared tests

Hands-on Exercises

Day 4: Text Analytics

NLP & Text Processing Libraries

  • NLTK
  • Gensim

Text Normalization

  • Tokenization
  • Stemming
  • Lemmatization

Text Representation Model

  • Bag of Words (BoW)
  • TF-IDF
  • N-gram
  • Dense Cohort of Terms (dCoT)
  • Marginalized Stacked Denoising Autoencoder (mSDA)
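
For a feel of the two simplest representations in this list, here is a minimal sketch of Bag of Words and TF-IDF in plain Python. The whitespace tokenizer, the function names and the three toy documents are assumptions made for illustration. The weighting used is the basic term frequency times log(N / document frequency); libraries often apply smoothed variants, so their numbers may differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive tokenizer for this sketch: lowercase and split on whitespace."""
    return text.lower().split()


def bag_of_words(docs):
    """One term-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Weight each term by tf * log(N / df), the basic unsmoothed form."""
    bows = bag_of_words(docs)
    n_docs = len(bows)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for bow in bows for term in bow)
    weights = []
    for bow in bows:
        total_terms = sum(bow.values())
        weights.append({
            term: (count / total_terms) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return weights


# Toy corpus, invented for this example.
docs = ["the cat sat", "the dog sat", "the dog barked loudly"]
bows = bag_of_words(docs)
weights = tf_idf(docs)
print(bows[0])
print(weights[0])
```

A word such as "the", which appears in every document, gets a weight of zero, while a word unique to one document, such as "cat", gets the highest weight in that document.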

Word Embedding

  • Word2Vec
  • FastText

Text Classification
Text Similarity
Hands-on Exercises

Day 5: Real Use Cases

Use Case: Sentiment Analysis
Use Case: Sales Prediction
Use Case: Matching Resumes with Jobs
Hands-on Exercises

Meet your instructor

Learn from a skilled instructor with extensive professional experience in data science and big data fields.

Bill Nash, PhD

Lead Instructor

Bill has over 10 years of experience in data science, including 5+ years as a corporate trainer. He has worked across multiple verticals, including business, analytics boutiques, IT establishments and FMCG industries. He is an expert in machine learning and natural language processing with deep knowledge of computational complexity theory. He has a PhD in Cognitive Science and an MSc in Computational Linguistics, as well as many years of experience in statistical data analysis and software development in Python. He has extensive experience in developing software in pure Python as well as in C/C++, with a focus on implementing machine learning algorithms and statistical techniques.

Amril Nurman, PhD

Lead Instructor

Amril currently works as the Chief Architect at a leading telecom analytics solutions provider. He has 15 years of experience managing Artificial Intelligence and Big Data projects in both academia and industry. As the Chief Architect, he has led and managed USD 150M+ data science and big data technical project delivery and operational teams for the company's Telecommunication Traffic Monitoring and Tax Revenue Assurance system, which processes 20+ billion Call Detail Record (CDR) transactions, with an average of 4 terabytes of binary and text data, on a daily basis from telecommunication operators in India, Ghana, Guinea, and Sierra Leone. For the last 5 years, he has conducted extensive data science training classes for students in the Middle East. He earned his PhD in Computer Science from University College London (UCL), UK.

Learn In

JAN 29 – FEB 1

Wed - Sunday

9:30am - 5:30pm

FEB 26 – FEB 29

Wed - Sunday

9:30am - 5:30pm

Get Answers

Have questions? We’ve got the answers. Get the details on how you can grow in this course.

  • Why is this course relevant today?

    Technology is everywhere, and the online world holds vast amounts of data about users, products, and the content we generate. Businesses could make far better-informed decisions if that data were analyzed more deeply using data science. This course provides the tools, methods, and practical experience you need to make accurate predictions from data, which ultimately leads to better business decisions and smarter technology (think recommendation systems or targeted ads).

  • What practical skill sets can I expect to have upon completion of the course?

    This course will give you technical skills in machine learning, algorithms, and data modeling that allow you to make accurate predictions from your data. You will build your models in Python, so you will gain a good grasp of the language. You will also learn how to parse and clean your data, a task that can take up to 70% of your time as a data scientist.

  • Whom will I be sitting next to in this course?

    Individuals who have a strong interest in manipulating large data sets, finding patterns in data, and making predictions.

  • Are there any prerequisites?

    Basic knowledge of computing and statistics is recommended for this course. Knowledge of Python programming is useful but not required.
