Data Engineer Interview Handbook

Seven day sprint for data engineering interviews. For candidates with an onsite next week, not next quarter.

The plan · Weekend only · Companion repos

This is the speed run. The full Data Engineering Interview Handbook is 200+ pages. If you have an onsite in seven days, you do not have time for that. You have time for this.

The 7 day plan

Day	Focus	Time	Deliverable
Mon	SQL fundamentals refresher	2h	Solve 5 problems on joins, aggregating, dates
Tue	Window functions	2h	Write `LAG`, `LEAD`, `ROW_NUMBER`, `SUM OVER`, `AVG OVER` from memory
Wed	Python data wrangling	2h	Solve 4 problems on chunking, dedup, interval merging
Thu	Schema design	3h	Sketch 3 schemas on paper before reading the solution
Fri	Pipeline architecture	3h	Design 3 pipelines using the eight beat framework
Sat	Mock interview	2h	60 min loop with a partner, or recorded solo
Sun	Behavioral story bank	2h	Six STAR stories, three minutes each, rehearsed out loud

Total: 16 hours over a week.

Day 1: SQL fundamentals

Lessons: joins intermediate, aggregating intermediate, filtering advanced.

Drill these 5:

10 Lowest Uptime Services. Top N with ties. Trap: LIMIT 10 silently drops rows tied with the 10th row.
2FA Confirmation Rate. Conditional aggregation. Trap: divide by zero.
30 Day Page View Counts. Date filtering. Trap: timezone boundaries.
2nd Most Common Content Type. Tie breaking. Trap: LIMIT 1 OFFSET 1 returns a first place row when two types tie for first.
Active Users by Month. Cohort logic. Trap: double counting users active in multiple months.
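The tie and divide by zero traps above can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module, not the reference solution to any of the drills. The table names, column names, and data are hypothetical, and the cutoff is 2 instead of 10 to keep the data short. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables for illustration only.
    CREATE TABLE service_uptime (service TEXT, uptime_pct REAL);
    INSERT INTO service_uptime VALUES
        ('auth', 95.0), ('billing', 97.0), ('search', 97.0), ('feed', 99.9);
    CREATE TABLE twofa_events (user_id INTEGER, confirmed INTEGER);
""")

# LIMIT 2 keeps exactly two rows, so 'search' is dropped even though
# it ties with 'billing' for second lowest uptime.
limit_rows = conn.execute(
    "SELECT service FROM service_uptime ORDER BY uptime_pct, service LIMIT 2"
).fetchall()

# RANK() gives tied rows the same rank, so every row tied at the cutoff survives.
ranked_rows = conn.execute("""
    SELECT service
    FROM (
        SELECT service, RANK() OVER (ORDER BY uptime_pct) AS rnk
        FROM service_uptime
    ) AS ranked
    WHERE rnk <= 2
    ORDER BY service
""").fetchall()

# Conditional aggregation with a guarded denominator. twofa_events is empty,
# so COUNT(*) is 0. NULLIF turns that 0 into NULL and the rate becomes NULL.
# SQLite already returns NULL on division by zero, but engines such as
# PostgreSQL raise an error, so the guard keeps the query portable.
confirmation_rate = conn.execute("""
    SELECT 1.0 * SUM(CASE WHEN confirmed = 1 THEN 1 ELSE 0 END)
           / NULLIF(COUNT(*), 0)
    FROM twofa_events
""").fetchone()[0]

print(limit_rows)
print(ranked_rows)
print(confirmation_rate)
```

Whether RANK or DENSE_RANK is the right choice depends on how the interviewer defines "top N", so it is worth asking before writing the query.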

Day 2: Window functions

Window functions show up in most senior DE SQL screens. Watch all three lessons: beginner, intermediate, advanced.

Drill: 7 Check Rolling Average, 7 Day Onboarding Conversion, then run the window functions drill timed.
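For the from memory goal, it helps to see all five functions in one query. The sketch below runs them through Python's built-in sqlite3 module (SQLite 3.25 or newer). The daily_checks table and its data are hypothetical, and the rolling window is 3 rows instead of 7 to keep the example small. It is not the solution to either drill.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table for illustration only.
    CREATE TABLE daily_checks (day TEXT, checks INTEGER);
    INSERT INTO daily_checks VALUES
        ('2024-01-01', 10), ('2024-01-02', 20),
        ('2024-01-03', 30), ('2024-01-04', 40);
""")

rows = conn.execute("""
    SELECT
        day,
        checks,
        LAG(checks)  OVER (ORDER BY day) AS prev_checks,   -- previous row, NULL on the first
        LEAD(checks) OVER (ORDER BY day) AS next_checks,   -- next row, NULL on the last
        ROW_NUMBER() OVER (ORDER BY day) AS rn,            -- 1, 2, 3, ...
        SUM(checks)  OVER (ORDER BY day) AS running_total, -- cumulative sum
        AVG(checks)  OVER (
            ORDER BY day
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW       -- current row plus two before it
        ) AS rolling_avg_3
    FROM daily_checks
    ORDER BY day
""").fetchall()

for row in rows:
    print(row)
```

The explicit ROWS frame matters: without it, an ORDER BY window defaults to a frame running from the start of the partition to the current row, which gives a running average instead of a rolling one.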

Day 3: Python wrangling

Lessons: foundations intermediate, collections advanced.

Drill these 4:

Batch Records. Chunking iterables.
Activity Time Ledger. Interval merging.
Batch Partitioner. Hash bucketing.
Batch With Metadata. Stateful iteration.
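The four patterns named above can be sketched with the standard library alone. These are illustrative versions under assumed inputs, not the reference solutions to the drills. All function names and the shape of the metadata dict are hypothetical.

```python
import zlib
from itertools import islice


def chunked(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend its end if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def bucket_for(key, num_buckets):
    """Map a string key to a stable bucket index."""
    # zlib.crc32 is deterministic across processes, unlike the built-in
    # hash() for str, which is salted per process by default.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def batches_with_metadata(iterable, size):
    """Yield each batch with its index and the offset of its first record."""
    offset = 0
    for index, batch in enumerate(chunked(iterable, size)):
        yield {"batch_index": index, "start_offset": offset, "records": batch}
        offset += len(batch)


print(list(chunked(range(5), 2)))
print(merge_intervals([(5, 8), (1, 3), (2, 4)]))
print(bucket_for("user-1", 4))
print(list(batches_with_metadata("abcde", 2)))
```

In an interview it is worth saying out loud why chunked works on generators (it never calls len) and why merge_intervals sorts first (the single pass only compares against the last merged interval).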

Day 4: Schema design

Read first: keys, normalization, dimensional modeling, SCD.
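As a quick reminder of what SCD (slowly changing dimension) handling looks like in practice, here is a minimal sketch of one common variant, Type 2, where a change closes the current row and appends a new one. The function name, the column names, and the use of None for an open ended valid_to are assumptions made for illustration, not something the handbook prescribes.

```python
from datetime import date


def apply_scd2(history, customer_id, new_city, effective):
    """Version a city change in place: close the current row, append a new one."""
    for row in history:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            if row["city"] == new_city:
                return history  # Value unchanged, nothing to version.
            row["valid_to"] = effective  # Close the current row.
    history.append(
        {
            "customer_id": customer_id,
            "city": new_city,
            "valid_from": effective,
            "valid_to": None,  # None marks the current row.
        }
    )
    return history


# Hypothetical dimension rows for illustration only.
history = [
    {"customer_id": 1, "city": "Austin",
     "valid_from": date(2023, 1, 1), "valid_to": None},
]
apply_scd2(history, 1, "Denver", date(2024, 6, 1))
for row in history:
    print(row)
```

A real dimension table would usually also carry a surrogate key and often an is_current flag. Both are left out here to keep the versioning logic visible.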

Then sketch these 3 on paper for 20 minutes each before reading the solution:

Day 5: Pipeline architecture

Read first: data engineering system design.

Memorize the eight beat framework: clarify, estimate, freshness, batch vs stream, storage, topology, failure modes, cost.
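The estimate beat is mostly arithmetic, and it helps to have the conversion ready. The sketch below is one possible back of the envelope helper. The function name, the parameters, and the input numbers are all hypothetical, chosen so the results are round.

```python
def estimate_load(events_per_day, avg_event_bytes, peak_factor):
    """Turn a daily event count into throughput and raw storage figures."""
    seconds_per_day = 24 * 60 * 60
    avg_events_per_sec = events_per_day / seconds_per_day
    return {
        "avg_events_per_sec": avg_events_per_sec,
        # peak_factor is an assumed ratio of peak traffic to average traffic.
        "peak_events_per_sec": avg_events_per_sec * peak_factor,
        # Decimal gigabytes, before compression or replication.
        "raw_gb_per_day": events_per_day * avg_event_bytes / 1e9,
    }


# Hypothetical workload: 86.4 million events per day at 500 bytes each,
# with peaks at three times the average rate.
estimate = estimate_load(events_per_day=86_400_000, avg_event_bytes=500, peak_factor=3)
print(estimate)
```

Numbers like these feed the later beats: the peak rate informs the batch vs stream decision, and the daily volume informs storage and cost.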

Sketch these 3 end to end for 30 minutes each before reading the solution:

Day 6: Mock interview

60 minute loop:

15 min SQL, one medium problem
20 min Python, one medium problem
25 min schema or pipeline design

No partner? Record yourself solving each one out loud, then watch the recording. The talking is the training signal.

Day 7: Behavioral

Six STAR stories, three minutes each. Themes:

Owned an ambiguous problem
Disagreed with a stakeholder
Broke production and recovered
Mentored someone
Killed a project
Shipped fast then cleaned up

Rehearse each one out loud twice. The first attempt will be terrible.

50 common DE behavioral questions: datadriven.io/behavioral-interview-questions.

Weekend only version

If you have only two days, do days 2, 4, 5, and 7: window functions, schema design, pipeline architecture, behavioral. Those four cover the rounds where most senior candidates lose points.

Company specific prep (30 min)

If you know your target, spend 30 minutes on its guide:

Companion repos

data-engineering-interview-handbook. The full handbook with chapter by chapter coverage and 4, 8, and 12 week plans.
data-engineering-interview-questions. 1418 tagged practice problems.
system-design-for-data-engineers. 120 long form pipeline case studies.
data-engineering-cheatsheet. Single page recall reference.
data-engineer-interview-prep. 8 week structured practice track.
awesome-data-engineering-interview. Curated resource list.

License

CC BY-SA 4.0. Linked sandboxes and lessons are hosted at datadriven.io.
