Deployment Overview

Helm Chart

Marquez uses Helm to manage deployments onto Kubernetes in a cloud environment. The chart and templates for the HTTP API server and Web UI are maintained in the Marquez repository and can be found in the chart directory. The chart's base values.yaml file includes an option to easily override deployment settings.

Note: The Marquez HTTP API server and Web UI images are published to DockerHub.

Database

The Marquez HTTP API server relies only on PostgreSQL to store dataset, job, and run metadata, which keeps operational overhead minimal. We recommend a cloud-provided database, such as AWS RDS, when deploying Marquez onto Kubernetes.

Architecture

DOCKER

Figure 1: Minimal Marquez deployment via Docker.

KUBERNETES

Figure 2: Marquez deployment via Kubernetes.

COMPONENTS

Component        | Image                               | Description
Marquez Web UI   | marquezproject/marquez-web          | The web UI used to view metadata.
Marquez HTTP API | marquezproject/marquez              | The core API used to collect metadata using OpenLineage.
Database         | bitnami/postgresql or cloud-provided | A PostgreSQL instance used to store metadata.
Scheduler        | User-provided                       | A scheduler used to run a workflow on a particular schedule (e.g., Airflow).
Workflow         | User-provided                       | A workflow using an OpenLineage integration to send lineage metadata to Marquez.

Authentication

Our clients support authentication by automatically sending an API key on each request via Bearer Auth when configured on client instantiation. By default, the Marquez HTTP API does not require any form of authentication or authorization.
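
To make the Bearer Auth behavior concrete, here is a minimal sketch of what a request carrying an API key looks like on the wire. It uses only the Python standard library rather than a Marquez client, so the helper name build_request, the base URL http://localhost:5000, and the key value are illustrative assumptions, not part of the Marquez clients themselves.

```python
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str,
                  api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, adding a Bearer token only when an API key is set.

    build_request is a hypothetical helper for illustration; it is not a
    function provided by the Marquez clients.
    """
    request = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    request.add_header("Accept", "application/json")
    if api_key:
        # Bearer Auth: the key travels in the Authorization header of every request.
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


# Assumed values: a local Marquez API address and a placeholder key.
request = build_request("http://localhost:5000", "/api/v1/namespaces",
                        api_key="example-api-key")
print(request.get_header("Authorization"))  # prints "Bearer example-api-key"

# Sending the request needs a reachable Marquez API server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

When no key is passed, the header is omitted entirely, which matches a default deployment where the API does not require authentication.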

Next Steps

The following guides will help you and your team effectively deploy and manage Marquez in a cloud environment: