Course Outline

Introduction

  • The Data Science Process
  • Roles and responsibilities of a Data Scientist

Preparing the Development Environment

  • Libraries, frameworks, languages and tools
  • Local development
  • Collaborative web-based development

Data Collection

  • Different Types of Data
    • Structured 
      • Local databases
      • Database connectors
      • Common formats: xlsx, XML, JSON, CSV, ...
    • Unstructured
      • Clicks, sensors, smartphones
      • APIs
      • Internet of Things (IoT)
      • Documents, pictures, videos, sounds
  • Case study: Collecting large amounts of unstructured data continuously

Data Storage

  • Relational databases
  • Non-relational databases
  • Hadoop: Distributed File System (HDFS)
  • Spark: Resilient Distributed Dataset (RDD)
  • Cloud storage

Data Preparation

  • Ingestion, selection, cleansing, and transformation
  • Ensuring data quality - correctness, meaningfulness, and security
  • Exception reports

Languages used for Preparation, Processing and Analysis

  • R language
    • Introduction to R
    • Data manipulation, calculation and graphical display
  • Python
    • Introduction to Python
    • Manipulating, processing, cleaning, and crunching data

Data Analytics

  • Exploratory analysis
    • Basic statistics
    • Draft visualizations
    • Understanding the data
  • Causality
  • Features and transformations
  • Machine Learning
    • Supervised vs. unsupervised
    • When to use what model
  • Natural Language Processing (NLP)

Data Visualization

  • Best Practices
  • Selecting the right chart for the right data
  • Color palettes
  • Taking it to the next level
    • Dashboards
    • Interactive Visualizations
  • Storytelling with data

Summary and Conclusion

Requirements

  • A general understanding of database concepts
  • A basic understanding of statistics
 35 Hours

