CPPDS – Certified Python Programming for Data Science
Are you planning to become a data scientist? If so, you will need to learn the Python programming language. Why? Python is one of the most widely used programming languages among data scientists. It emphasises code readability and clear
programming on both small and large scales, allowing you to focus on your research, product, or project.
In this 5-day journey, you will be exposed to multiple development environments so that you can choose the one that suits you best. You will be taught, step by step, how to program in Python. You will go through all the steps of a Data Science project,
from importing and cleaning data, through analysing it, to visualising it in ways that reveal new insights.
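As a rough illustration of those four steps (import, clean, analyse, visualise), here is a minimal sketch. It is not taken from the course materials: the sample data, the function names, and the text-based chart are all invented for illustration, and it uses only the Python standard library rather than the Pandas and Matplotlib tools covered on Day 4.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample data, invented for illustration only.
RAW_CSV = """city,temperature
Kuala Lumpur,31.5
Penang,30.0
Kuala Lumpur,
Penang,29.0
Kuala Lumpur,32.5
"""


def import_data(text):
    """Step 1, import: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_data(rows):
    """Step 2, clean: drop rows with no temperature and convert the rest to float."""
    cleaned = []
    for row in rows:
        value = (row["temperature"] or "").strip()
        if value:
            cleaned.append({"city": row["city"], "temperature": float(value)})
    return cleaned


def analyse_data(rows):
    """Step 3, analyse: compute the mean temperature per city."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["city"]].append(row["temperature"])
    return {city: mean(values) for city, values in groups.items()}


def visualise_data(averages):
    """Step 4, visualise: build a simple text bar chart, one line per city."""
    lines = []
    for city, avg in sorted(averages.items()):
        lines.append(f"{city:<13} {'#' * round(avg)} {avg:.1f}")
    return "\n".join(lines)


rows = import_data(RAW_CSV)
cleaned = clean_data(rows)
averages = analyse_data(cleaned)
chart = visualise_data(averages)
print(chart)
```

In a real project, each step would typically be handled by a dedicated library, but the overall flow from raw data to a readable summary stays the same.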
Audience Profile
This workshop is intended for individuals who are interested in learning Data Science, or who want to begin their career as a data scientist.
Participant Prerequisites
All participants should have a basic knowledge of programming in any language (Java, C, C++, Pascal, Fortran, JavaScript, PHP, Python, etc.).
Course Objectives
Upon completion of this course, you will be able to:
- Recognise the meaning of the terms “Data Science” and “Machine Learning”.
- Understand the basics of Python.
- Develop and write code easily in Python.
- Deal easily with files and file systems.
- Deal with different sources of data.
- Analyse and visualise data to gain new insights.
Course Outline
The following items describe the outline of the course:
Day 1: Introduction to Programming
- What is an Algorithm?
- What is Programming?
- The Natural Language of the Computer
- Machine Language
- Programming Language Levels
- Translators
Python Basics
- Identifiers, Lists, and Tuples
- Dictionaries, Sets, Strings, Operators, Control Structures, Loops
Day 2: Jupyter Notebook
- Installing and Running Jupyter
- User Interface
- Checkpoints
Functions
- Functions
- Lambda and Map Functions
- Globals and Locals
Pythonic Programming
- List Comprehension
- Generator Expressions
- Exception Handling
Modules and Packages
- Modules
- Documentation
- Packages and Namespaces
Working with Files
- Create, Read, Update, Delete (CRUD) a File
Day 3: Object-Oriented Programming
- OOP in General
- Classes
- Objects
- Constructors
- Instance/Class Data
- Instance/Class Methods
- Inheritance
OS Module
- Working with File Systems
- Walking Directory Trees
- Paths
- Filenames
- Directories
Working with Files
- Creating a File
- Reading a File
- Updating a File
- Deleting a File
Working with JSON Data
- What is JSON and Why Is It Important?
- Module, Serialisation and Deserialisation
Web Scraping (BeautifulSoup)
- What is Web Scraping?
- HTML Tags
- BeautifulSoup Module
- Webpage Scraping Phase
Day 4: Introduction to Matrix Processing (NumPy)
- What is NumPy?
- Ndarray Object, Data Types
- Array Attributes, Array Creation Routines
- Indexing and Slicing
- Array Manipulation
- Mathematical Functions
Data Analysis (Pandas)
- What is Pandas?
- Series
- DataFrame
- Data Importing
- Data Pre-Processing
- Data Grouping
Data Visualisation (Matplotlib)
- What is Matplotlib?
- Line Graphs
- Bar Graphs
- Pie Charts
- Histograms
- Scatter Plots
- Graph Attributes
- Text Annotation
Day 5: Introduction to Applied Machine Learning (Scikit-learn)
- What is Machine Learning?
- Machine Learning Algorithm Types
- Main Steps in Machine Learning Projects
- Introduction to Scikit-learn Module
Capstone Project
Final Evaluation (Exam)
Course Materials
The following materials are included as part of the course:
- iTrain Asia official digital curriculum
Exam Format
The Certified Python for Data Science certification exam lasts 2 hours and consists of 50 multiple-choice questions, with a passing score of 70%. You will receive a professional Certified Python for Data Science certification upon passing the exam.
