Course Outline

  • Section 1: Introduction to Big Data / NoSQL
    • NoSQL overview
    • CAP theorem
    • When to use NoSQL
    • Columnar storage
    • NoSQL ecosystem
  • Section 2: Cassandra Fundamentals
    • Design and architecture
    • Cassandra nodes, clusters, and data centers
    • Keyspaces, tables, rows, and columns
    • Partitioning, replication, and tokens
    • Quorum and consistency levels
    • Labs: interacting with Cassandra via CQLSH
  • Section 3: Data Modeling – Part 1
    • Introduction to CQL
    • CQL Data Types
    • Creating keyspaces and tables
    • Choosing columns and their data types
    • Defining primary keys
    • Data layout for rows and columns
    • Time to Live (TTL)
    • Querying with CQL
    • Updating data with CQL
    • Collections (list, map, set)
    • Labs: various data modeling exercises using CQL; exploring queries and supported data types
  • Section 4: Data Modeling – Part 2
    • Creating and using secondary indexes
    • Composite keys (partition keys and clustering keys)
    • Time series data
    • Best practices for time series data
    • Counters
    • Lightweight transactions (LWT)
    • Labs: creating and using indexes; modeling time series data
  • Section 5: Data Modeling Labs – Group Design Session
    • Multiple use cases from various domains are presented
    • Students collaborate in groups to develop designs and models
    • Discussion of various designs and analysis of design decisions
    • Lab: implementation of one selected scenario
  • Section 6: Cassandra Drivers
    • Introduction to the Java driver
    • CRUD (Create, Read, Update, Delete) operations using the Java client
    • Asynchronous queries
    • Labs: utilizing the Java API for Cassandra
  • Section 7: Cassandra Internals
    • Understanding Cassandra’s underlying design
    • SSTables, memtables, and commit log
    • Read and write paths
    • Caching mechanisms
    • VNodes
  • Section 8: Administration
    • Hardware selection
    • Cassandra distributions
    • Installing Cassandra
    • Running benchmarks
    • Tools for monitoring performance and node activities
      • DataStax OpsCenter
    • Diagnosing Cassandra performance issues
    • Investigating node crashes
    • Understanding data repair, deletion, and replication
    • Additional troubleshooting tools and tips
    • Cassandra best practices (compaction, garbage collection)
  • Section 9: Bonus Lab (if time permits)
    • Implement a music streaming service similar to Pandora or Spotify on Cassandra

Requirements

  • Familiarity with the Java programming language
  • Comfort in a Linux environment (navigating the command line, editing files with vi or nano)

Lab Environment

A fully operational Cassandra environment will be provided for all students. Participants will need an SSH client and a web browser to access the cluster.

Zero Install: No installation of Cassandra on personal machines is required.

Duration: 21 hours
