Airflow
an open-source platform to programmatically author, schedule and monitor data pipelines.
Apache Oozie
an open-source workflow scheduler system to manage Apache Hadoop jobs.
DBT (Data Build Tool)
is a command line tool that enables data analysts and engineers to transform data in their warehouse more effectively.
BMC Control-M
a digital business automation solution that simplifies and automates diverse batch application workloads.
DataKitchen
a DataOps Platform that reduces analytics cycle time by monitoring data quality and providing automated support for the deployment of data and new analytics.
Reflow
Reflow is a system for incremental data processing in the cloud. Reflow enables scientists and engineers to compose existing tools (packaged in Docker images) using ordinary programming constructs.
ElementL
A company currently in stealth mode, founded by ex-Facebook director and GraphQL co-creator Nick Schrock. Its open-source project is Dagster.
Astronomer.io
Astronomer recently re-focused on Airflow support. They make it easy to deploy and manage your own Apache Airflow webserver, so you can get straight to writing workflows.
Piperr.io
Use Piperr’s pre-built data pipelines across enterprise stakeholders: from IT to analytics, and from tech and data science to lines of business (LoBs).
Prefect Technologies
Open-source data engineering platform that builds, tests, and runs data workflows.
Genie
Distributed Big Data Orchestration Service by Netflix
Testing and Production Quality
ICEDQ
software used to automate the testing of ETL/Data Warehouse and Data Migration.
Naveego
A simple, cloud-based platform that allows you to deliver accurate dashboards by taking a bottom-up approach to data quality and exception management.
DataKitchen
a DataOps Platform that improves data quality by providing lean manufacturing controls to test and monitor data.
FirstEigen
Automatic Data Quality Rule Discovery and Continuous Data Monitoring
Great Expectations
Great Expectations is a framework that helps teams save time and promote analytic integrity with a new twist on automated testing: pipeline tests. Pipeline tests are applied to data (instead of code) and at batch time (instead of compile or deploy time).
Enterprise Data Foundation
Open-source enterprise data toolkit providing efficient unit testing, automated refreshes, and automated deployment.
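The pipeline-test idea in the Great Expectations entry above (tests applied to data, at batch time) can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use the Great Expectations API, and the `run_pipeline_tests` helper, the check names, and the sample rows are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def run_pipeline_tests(
    batch: List[Row], checks: Dict[str, Callable[[Row], bool]]
) -> Dict[str, List[int]]:
    """Apply each named check to every row of a batch.

    Returns a mapping from check name to the indices of rows that failed it.
    Checks with no failing rows are omitted, so an empty dict means the
    batch passed.
    """
    failures: Dict[str, List[int]] = {}
    for name, check in checks.items():
        bad = [i for i, row in enumerate(batch) if not check(row)]
        if bad:
            failures[name] = bad
    return failures


# Hypothetical expectations about the data itself, not about the code.
checks = {
    "order_id is present": lambda row: row.get("order_id") is not None,
    "amount is non-negative": lambda row: (
        isinstance(row.get("amount"), (int, float)) and row["amount"] >= 0
    ),
}

# Hypothetical batch arriving at run time.
batch = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": None, "amount": 5.00},
    {"order_id": 3, "amount": -2.50},
]

failures = run_pipeline_tests(batch, checks)
if failures:
    print("Batch rejected:", failures)
```

A pipeline would typically run such checks on each incoming batch and stop or quarantine the batch when `failures` is non-empty, rather than catching the problem only when code is deployed.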
Deployment Automation and Development Sandbox Creation
Jenkins
a ‘CI/CD’ tool used by software development teams to deploy code from development into production
DataKitchen
a DataOps Platform that supports the deployment of all data analytics code and configuration.
Amaterasu
is a deployment tool for data pipelines. Amaterasu allows developers to write and easily deploy data pipelines, and to manage their configuration and dependencies on clusters.
Meltano
aims to be a complete solution for data teams. The name stands for model, extract, load, transform, analyze, notebook, orchestrate: in other words, the data science lifecycle.
Data Science Model Deployment
Domino
accelerates the development and delivery of models with infrastructure automation, seamless collaboration, and automated reproducibility.
Hydrosphere.io
deploys batch Spark functions and machine-learning models, and assures the quality of end-to-end pipelines.
Open Data Group
a software solution that facilitates the deployment of analytics using models.
ParallelM
moves machine learning into production, automates orchestration, and manages the ML pipeline.
Seldon
streamlines the data science workflow, with audit trails, advanced experiments, continuous integration, and deployment.
Metis Machine
Enterprise-scale Machine Learning and Deep Learning deployment and automation platform for rapid deployment of models into existing infrastructure and applications.
Datatron
Automate deployment and monitoring of AI Models.
DSFlow
Go from data extraction to business value in days, not months. Build on top of open source tech, using Silicon Valley’s best practices.
DataMo-Datmo
tools help you seamlessly deploy and manage models in a scalable, reliable, and cost-optimized way.
MLflow
An open source platform for the complete machine learning lifecycle, from Databricks.
Studio.ML
Studio is a model management framework written in Python to help simplify and expedite your model building experience.
Comet.ML
Comet.ml allows data science teams and individuals to automagically track their datasets, code changes, experimentation history, and production models, creating efficiency, transparency, and reproducibility.
Polyaxon
An open source platform for reproducible machine learning at scale.
Missinglink.ai
MissingLink helps data engineers streamline and automate the entire deep learning lifecycle.
Kubeflow
The Machine Learning Toolkit for Kubernetes