Senior Data Platform Engineer & Lakehouse Architect
AWS · Apache Iceberg · Flink · Spark · DataOps · Fraud & Risk Engineering
I am a Senior Data Platform Engineer and Architect with a track record of designing and delivering production-grade data systems across financial services and industrial IoT. Currently at BrightSource Energy, I lead the architecture of IoT-driven Data Lakehouses on AWS to optimize real-time renewable energy operations at scale.
My career runs from quantitative risk and fraud engineering at tier-1 financial institutions (JPMorgan Chase, Citi) to real-time industrial telemetry, giving me a rare breadth across both regulated financial domains and high-throughput engineering environments.
I specialize in the Modern Data Stack: building open-format Lakehouses with Apache Iceberg and Apache Flink that unify streaming and batch workloads under a single, scalable architecture. I treat data infrastructure with the same engineering rigor as application software: every pipeline ships with CI/CD, Infrastructure as Code, automated quality gates, and full observability.
| Currently | Architecting real-time streaming platforms with Apache Flink & Iceberg on AWS |
| Deepening | DataOps practices, Lakehouse governance, and Data Mesh / Data-as-a-Product patterns |
| Ask me about | Apache Iceberg, Flink, Spark, Airflow, AWS, Python (OOP), or large-scale fraud detection |
| Contact | moshesham@gmail.com |
- BrightSource Energy: Architected an IoT-driven Lakehouse platform on AWS, enabling real-time monitoring and optimization of renewable energy assets.
- JPMorgan Chase: Delivered quantitative risk data pipelines supporting regulatory reporting and real-time fraud detection at enterprise scale.
- Citi: Engineered data infrastructure for risk analytics and compliance reporting across global markets.
- Built and published open-source technical handbooks (43+ stars) used by data engineers for interview preparation and architecture reference.
| Pillar | What I Build | Key Technologies |
|---|---|---|
| Lakehouse Architecture | Open-format platforms with separated compute & storage, ACID guarantees, schema evolution, and time travel. Data products that serve entire organizations from a single source of truth. | Apache Iceberg, Delta Lake, AWS Glue, S3, Databricks |
| DataOps & CI/CD | Automated pipelines where every commit triggers validation and every deployment is reproducible, auditable, and rollback-safe. Data tested the same way application code is. | GitHub Actions, Terraform, Docker, dbt, pytest, Jenkins |
| Streaming & Batch Unification | Real-time event processing and large-scale batch orchestration sharing the same storage layer, with no Lambda-architecture complexity. | Apache Flink, Apache Kafka, Apache Spark, Airflow, PySpark |
| Observability & Resilience | Fault-tolerant systems designed for failure from day one: full-stack monitoring, data quality alerting, SLA enforcement, and automated recovery. | Prometheus, Grafana, AWS CloudWatch, Great Expectations |
| Fraud & Risk Engineering | Real-time scoring pipelines and feature stores for fraud detection, quantitative risk, and regulatory compliance at tier-1 financial institutions. | Streaming ML, Feature Stores, Python, SQL, Risk Models |
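
To make the "data tested the same way application code is" idea concrete, here is a minimal sketch of a quality gate in plain Python. It is an illustration only, not code from any of the platforms above: the `run_quality_gate` function, the `GateResult` type, and the `sensor_id` / `power_kw` fields are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GateResult:
    """Outcome of a quality gate: overall verdict plus human-readable failures."""
    passed: bool
    failures: List[str]


def run_quality_gate(
    rows: List[Dict[str, Optional[object]]],
    required_fields: List[str],
    max_null_ratio: float = 0.0,
) -> GateResult:
    """Check a batch of records before it is published downstream.

    Hypothetical rules for this sketch: the batch must be non-empty, and the
    share of missing (None) values per required field must not exceed
    max_null_ratio.
    """
    failures: List[str] = []
    if not rows:
        return GateResult(False, ["batch is empty"])

    for field in required_fields:
        # A missing key and an explicit None are both counted as null.
        nulls = sum(1 for row in rows if row.get(field) is None)
        ratio = nulls / len(rows)
        if ratio > max_null_ratio:
            failures.append(
                f"{field}: null ratio {ratio:.2f} exceeds {max_null_ratio:.2f}"
            )

    return GateResult(not failures, failures)


if __name__ == "__main__":
    # Hypothetical telemetry batch with one missing reading.
    batch = [
        {"sensor_id": "a", "power_kw": 1.2},
        {"sensor_id": "b", "power_kw": None},
    ]
    result = run_quality_gate(batch, ["sensor_id", "power_kw"])
    print(result.passed, result.failures)
```

In a CI/CD setup, a check like this would typically run as a pytest test or a pipeline step that fails the build when `passed` is false; a dedicated tool such as Great Expectations would replace the hand-written rules in practice.
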
Cloud & IaC · Lakehouse & Storage · Streaming · Batch & Orchestration · Languages · CI/CD & DevOps · Data Warehousing
| Project | Description | Stack |
|---|---|---|
| Data Science Analytical Handbook | Production-grade technical reference covering data engineering patterns, system design, and analytical problem-solving. A go-to resource for senior engineering interviews with 43+ stars. | Python, GitHub Pages, Data Modeling |
| Economic Real-Time Analytics Platform | End-to-end real-time economic data pipeline with automated ingestion, streaming workflows, and interactive dashboards for market analysis. | Streamlit, Python, APIs, GitHub Actions |
| Economic Dashboard API Service | Production-grade FastAPI backend serving economic datasets at scale with containerized deployment, monitoring, and CI/CD automation. | FastAPI, Python, Docker, CI/CD |
| AI Omniscient Architect | LLM-powered system architecture assistant that analyzes codebases and produces intelligent architectural recommendations and design documents. | Python, LLMs, System Design, Automation |
| Databricks Solution Architect Handbook | Comprehensive patterns and reference architectures for Lakehouse solutions on Databricks, covering Iceberg, Delta Lake, and Unity Catalog. | Databricks, Iceberg, Delta Lake, Jupyter |
| Practice Questions Platform | Interactive engineering platform for data engineering problem-solving with automated test suites and structured learning paths. | Python, OOP, Testing Frameworks |
I'm always open to conversations about distributed systems, lakehouse architecture, data strategy, or senior engineering opportunities.





