Home

Apache Iceberg Code Practice Wiki

Welcome to the Apache Iceberg Code Practice Wiki! This educational resource provides comprehensive guides and additional learning materials to supplement the hands-on coding labs in the main repository.

📚 Wiki Contents

Getting Started - Complete setup guide and first steps
Iceberg Fundamentals - Deep dive into Iceberg concepts and architecture
Lab Guides - Detailed walkthroughs for each lab
Best Practices - Production-ready patterns and tips
Troubleshooting - Common issues and solutions
Learning Path - Recommended order for completing labs
Multi-Engine Guide - Working with Spark, Trino, and DuckDB
Streaming & CDC - Real-time data pipelines with Kafka and Debezium

🎯 Educational Philosophy

This repository follows a hands-on, lab-based approach to learning Apache Iceberg:

Progressive complexity: Labs range from beginner to advanced
Real-world scenarios: Practical exercises based on actual use cases
Multi-engine learning: Experience with different query engines
Vendor independence: Learn concepts that apply across platforms
Production patterns: Best practices you can apply in production

🚀 Quick Start

📊 Progress Tracking

Track your progress by:

Marking completed labs in your notebook
Keeping a checklist of finished exercises
Timing yourself to measure improvement
Revisiting labs after learning new concepts

🤝 Community

Join our community of learners:

Share your solutions and insights
Ask questions in GitHub Issues
Contribute new labs and exercises
Help improve existing content

🔗 Resources

🎓 Lab Overview

Beginner Labs (0-2)

Lab 0: Sample Database Setup
Lab 1: Environment Setup
Lab 2: Basic Iceberg Operations

Intermediate Labs (3-5)

Lab 3: Advanced Features
Lab 4: Spark Optimizations
Lab 5: Real-World Patterns

Advanced Labs (6-11)

Lab 6: Performance & UI
Lab 7: Table Maintenance
Lab 8: Kafka Integration
Lab 9: CDC with Debezium
Lab 10: Spring Boot with Iceberg
Lab 11: Multi-Engine Lakehouse

💡 Tips for Success

Start with Fundamentals

Complete Labs 0-2 before moving to advanced topics
Understand Iceberg's core concepts (metadata, snapshots, manifests)
Practice basic table operations thoroughly
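To make the core concepts above concrete, here is a toy Python model of how table metadata, snapshots, and manifests relate. It is a minimal sketch for intuition only, not the Iceberg API: the class names (`TableMetadata`, `Snapshot`, `Manifest`), the methods (`append`, `scan`), and the file paths are all hypothetical. Real Iceberg also places a manifest list between each snapshot and its manifests, which this sketch collapses for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Manifest:
    """Toy stand-in for a manifest: tracks a group of data file paths."""
    data_files: Tuple[str, ...]


@dataclass(frozen=True)
class Snapshot:
    """Toy stand-in for a snapshot: an immutable view of the table."""
    snapshot_id: int
    manifests: Tuple[Manifest, ...]


@dataclass
class TableMetadata:
    """Toy stand-in for table metadata: all snapshots plus the current one."""
    snapshots: Dict[int, Snapshot] = field(default_factory=dict)
    current_snapshot_id: Optional[int] = None

    def append(self, new_files: List[str]) -> int:
        """Commit an append: reuse existing manifests and add one new manifest."""
        previous: Tuple[Manifest, ...] = ()
        if self.current_snapshot_id is not None:
            previous = self.snapshots[self.current_snapshot_id].manifests
        new_id = len(self.snapshots) + 1  # hypothetical id scheme for the sketch
        new_manifest = Manifest(tuple(new_files))
        self.snapshots[new_id] = Snapshot(new_id, previous + (new_manifest,))
        self.current_snapshot_id = new_id
        return new_id

    def scan(self, snapshot_id: Optional[int] = None) -> List[str]:
        """List data files visible in a snapshot (defaults to the current one)."""
        sid = self.current_snapshot_id if snapshot_id is None else snapshot_id
        if sid is None:
            return []
        return [f for m in self.snapshots[sid].manifests for f in m.data_files]


table = TableMetadata()
first = table.append(["data/a.parquet"])
second = table.append(["data/b.parquet", "data/c.parquet"])

current_files = table.scan()           # reads the latest snapshot
time_travel_files = table.scan(first)  # reads the table as of the first commit
```

Because each commit creates a new snapshot instead of modifying the old one, reading an earlier snapshot ID still returns the files that were visible at that point. That is the idea behind time travel, which the labs explore with real tables.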

Use Multiple Engines

Try the same operations in Spark, Trino, and DuckDB
Understand engine-specific optimizations
Learn which engine suits which use case

Practice Regularly

Consistency beats intensity: 30 minutes a day is better than a single 3-hour session each week
Revisit labs after breaks to reinforce learning
Try to complete labs without looking at solutions

Learn from Mistakes

Read error messages carefully
Understand why your solution didn't work
Try alternative approaches
Check the solution notebooks for patterns

🔧 Environment Options

Kubernetes with k3s (Recommended)

Full-featured environment
Production-like setup
Better resource isolation
Suitable for long-term learning

Docker Compose (Lightweight)

Quick to set up
Lower resource requirements
Good for initial learning
Easier to troubleshoot

📈 Skill Development

By completing all labs, you will develop skills in:

Apache Iceberg table operations and management
Data lakehouse architecture and design
Multi-engine query optimization
Streaming data pipelines with Kafka
Change data capture with Debezium
Performance tuning and monitoring
Production-ready data engineering patterns

🆘 Need Help?

Check the Troubleshooting page
Review Best Practices
Open an issue on GitHub
Start a discussion in GitHub Discussions
Check the Learning Path for guidance

Happy learning! 🎓🏔️
