Course Outline
- Section 1: Introduction to Big Data / NoSQL
  - NoSQL overview
  - CAP theorem
  - When to use NoSQL
  - Columnar storage
  - NoSQL ecosystem
- Section 2: Cassandra Fundamentals
  - Design and architecture
  - Cassandra nodes, clusters, and data centers
  - Keyspaces, tables, rows, and columns
  - Partitioning, replication, and tokens
  - Quorum and consistency levels
  - Labs: interacting with Cassandra via CQLSH
- Section 3: Data Modeling – Part 1
  - Introduction to CQL
  - CQL Data Types
  - Creating keyspaces and tables
  - Selecting columns and data types
  - Defining primary keys
  - Data layout for rows and columns
  - Time to Live (TTL)
  - Querying with CQL
  - Updating data with CQL
  - Collections (list, map, set)
  - Labs: various data modeling exercises using CQL; exploring queries and supported data types
- Section 4: Data Modeling – Part 2
  - Creating and using secondary indexes
  - Composite keys (partition keys and clustering keys)
  - Time series data
  - Best practices for time series data
  - Counters
  - Lightweight transactions (LWT)
  - Labs: creating and using indexes; modeling time series data
- Section 5: Data Modeling Labs – Group Design Session
  - Multiple use cases from various domains are presented
  - Students collaborate in groups to develop designs and models
  - Discussion of various designs and analysis of design decisions
  - Lab: implementation of one selected scenario
- Section 6: Cassandra Drivers
  - Introduction to the Java driver
  - CRUD (Create, Read, Update, Delete) operations using the Java client
  - Asynchronous queries
  - Labs: utilizing the Java API for Cassandra
- Section 7: Cassandra Internals
  - Understanding Cassandra’s underlying design
  - SSTables, memtables, and commit log
  - Read and write paths
  - Caching mechanisms
  - VNodes
- Section 8: Administration
  - Hardware selection
  - Cassandra distributions
  - Installing Cassandra
  - Running benchmarks
  - Tools for monitoring performance and node activities
  - DataStax OpsCenter
  - Diagnosing Cassandra performance issues
  - Investigating node crashes
  - Understanding data repair, deletion, and replication
  - Additional troubleshooting tools and tips
  - Cassandra best practices (compaction, garbage collection)
- Section 9: Bonus Lab (if time permits)
  - Implement a music streaming service similar to Pandora or Spotify on Cassandra
Requirements
- Familiarity with the Java programming language
- Comfort in a Linux environment (navigating the command line, editing files with vi or nano)
Lab environment:
A fully operational Cassandra environment will be provided for all students. Participants will need an SSH client and a web browser to access the cluster.
Zero install: no installation of Cassandra on personal machines is required.
Duration: 21 hours
Testimonials (1)
It was informative.