CRPDS – Certified R Programming for Data Science
R is a programming language well known for its power in statistical computing. In Data Science, R is used to extract insights from data, and those insights can help companies get ahead of their competitors. This course introduces the fundamentals of the R language, with a specific focus on how it is used in Data Science.
You’ll learn how to gather data and what you can do with it, from reading and cleansing through to manipulation and visualisation. You’ll also be introduced to a wide range of topics, including Big Data and the data analytics lifecycle, exploratory data analysis, and the Shiny R package.
Audience Profile
This course is intended for individuals who are interested in learning Data Science or who want to begin a career as a data scientist.
Participant Prerequisites
All participants should have basic statistical knowledge and some programming experience; no specific programming language is required for this course.
Course Objectives
Upon completion of this course, you will be able to:
- Understand R language fundamentals, including basic syntax, variables, and types.
- Create functions and use control flow.
- Read and write data in R.
- Work with data in R.
- Create and customise visualisations using ggplot2.
- Perform predictive analytics using R.
Course Outline
The following items describe the outline of the course:
Day 1:
Module 1: Introduction to Big Data Analytics
- What is Data?
- Why Data Collection is Important
- Types of Data
- What is Data Science?
- Characteristics of Big Data – The Three V’s of Big Data
- Big Data Analytics and its Types
Module 2: Data Analytics Lifecycle
- Data Analytics Lifecycle Overview
- Detailed Explanation of the Data Analytics Lifecycle
Module 3: Basic Programming Terminologies
- Variables
- Constants
- Keywords
- Comments
- Syntax
Module 4: Getting Started with R
- What is R?
- Install R and RStudio
- Explore RStudio Interface (With Lab Exercises)
Module 5: Data Types in R
- Numbers
- Strings
- Vectors
- Matrices
- Arrays
Day 2:
Module 5: Data Types in R (continued)
- Data Frames
- Lists
- Factors (With Lab Exercises)
Module 6: Control Structures and Functions in R
- Conditional Statements
- Looping Statements
- Operators
- Function Syntax
- Scoping Rules
- Subsetting
- Apply Functions (lapply, sapply, vapply)
- Debugging Tools
- Split Function
Module 7: Dealing with Date and Time in R
- Date Time Representation
- Date Time Arithmetic
- Date Time Comparison
Day 3:
Module 8: Data Gathering
- Reading Data from CSV File
- Reading Data from JSON File
- Reading Data from XML File
- Reading Data from Web
Module 9: Data Cleansing and Exploration
- Extract, Transform and Load (ETL)
- Data Cleansing
- Aggregation, Filtering, Sorting, Joining
- Dealing with Missing Data
- Selecting Columns and Rows
- Data Wrangling
- Summarise and Group By
Module 10: Simulation and Profiling
- Random Sampling
- Generate Random Numbers
- R Profiler
Module 11: Data Visualisation
- What Is Visualisation?
- The Need for Visualisation
- Types of Visualisation
- How to Set Chart Properties
- Activity
Day 4:
Module 12: Getting Deeper into Data Visualisation
- Scatter Plots
- Boxplots
- Bar Charts
- Pie Charts
- Histograms
Module 13: Creating Graphs with ggplot2
- Getting Started with ggplot2
- Mapping Colour, Shape and Size
- Creating Attractive Colour Schemes
- Creating Bar Charts
- Creating Box Plots
Module 14: Advanced Graphs in ggplot2
- Correlation
- Deviation
- Ranking
- Distribution
- Composition
- Time Series Plots
- Groups
- Spatial
Day 5:
Module 15: Shiny R Package
- Introduction
- How to Build a Simple Shiny Module
Final Evaluation (Exam)
Course Materials
The following materials are included as part of the course:
- iTrain Asia official digital curriculum
Exam Format
The Certified R Programming for Data Science certification exam lasts 2 hours and consists of 50 multiple-choice questions, with a passing score of 70%. You will receive the professional Certified R Programming for Data Science certification upon passing the exam.