Course Outline

Introduction

Installing and Configuring Dataiku Data Science Studio (DSS)

  • System requirements for Dataiku DSS.
  • Setting up Apache Hadoop and Apache Spark integrations.
  • Configuring Dataiku DSS with web proxies.
  • Migrating from other platforms to Dataiku DSS.

Overview of Dataiku DSS Features and Architecture

  • Core objects and graphs foundational to Dataiku DSS.
  • Understanding recipes in Dataiku DSS.
  • Types of datasets supported by Dataiku DSS.

Creating a Dataiku DSS Project

Defining Datasets to Connect to Data Resources in Dataiku DSS

  • Working with DSS connectors and file formats.
  • Standard DSS formats versus Hadoop-specific formats.
  • Uploading files for a Dataiku DSS project.

Overview of the Server Filesystem in Dataiku DSS

Creating and Using Managed Folders

  • Dataiku DSS recipe for merging folders.
  • Local versus non-local managed folders.

Constructing a Filesystem Dataset Using Managed Folder Contents

  • Performing cleanups with a DSS code recipe.

Working with the Metrics Dataset and the Internal Stats Dataset

Implementing the DSS Download Recipe for HTTP Datasets

Relocating SQL Datasets and HDFS Datasets Using DSS

Ordering Datasets in Dataiku DSS

  • Write ordering versus read-time ordering.

Exploring and Preparing Data Visuals for a Dataiku DSS Project

Overview of Dataiku Schemas, Storage Types, and Meanings

Writing Data Cleansing, Normalization, and Enrichment Scripts in Dataiku DSS

Working with the Dataiku DSS Charts Interface and Types of Visual Aggregations

Utilizing the Interactive Statistics Feature of DSS

  • Univariate analysis versus bivariate analysis.
  • Using the DSS Principal Component Analysis (PCA) tool.

Overview of Machine Learning with Dataiku DSS

  • Supervised ML versus unsupervised ML.
  • References for DSS ML algorithms and feature handling.
  • Deep learning with Dataiku DSS.

Overview of the Flow Derived from DSS Datasets and Recipes

Transforming Existing Datasets in DSS with Visual Recipes

Utilizing DSS Recipes Based on User-Defined Code

Optimizing Code Exploration and Experimentation with DSS Code Notebooks

Writing Advanced DSS Visualizations and Custom Frontend Features with Webapps

Working with the Dataiku DSS Code Reports Feature

Sharing Data Project Elements and Getting Familiar with the DSS Dashboard

Designing and Packaging a Dataiku DSS Project as a Reusable Application

Overview of Advanced Methods in Dataiku DSS

  • Implementing optimized dataset partitioning using DSS.
  • Running selected parts of DSS processing in Kubernetes containers.

Overview of Collaboration and Version Control in Dataiku DSS

Implementing Automation Scenarios, Metrics, and Checks for DSS Project Testing

Deploying and Updating a Project with the DSS Automation Node and Bundles

Working with Real-Time APIs in Dataiku DSS

  • Additional APIs and REST APIs in DSS.

Analyzing and Forecasting Time Series in Dataiku DSS

Securing a Project in Dataiku DSS

  • Managing project permissions and dashboard authorizations.
  • Implementing advanced security options.

Integrating Dataiku DSS with the Cloud

Troubleshooting

Summary and Conclusion

Requirements

  • Proficiency in Python, SQL, and R programming languages.
  • Fundamental knowledge of data processing using Apache Hadoop and Spark.
  • Solid comprehension of machine learning concepts and data models.
  • Background in statistical analysis and data science principles.
  • Experience in visualizing and communicating data insights.

Audience

  • Engineers
  • Data Scientists
  • Data Analysts

Duration

  • 21 hours
