0.18.0 - 2021-09-14
- New Add Search API 🎉 @wslulciuc
- Add `.env.example` to override variables defined in docker-compose files @wslulciuc
- Add openlineage-java as dependency @OleksandrDvornik
- Move class SentryConfig from `marquez` to `marquez.tracing` pkg
- Major UI improvements; the UI now uses the Search and Lineage APIs 🎉 @phixMe
- Set default API port to `8080` when running the Marquez shadowjar @wslulciuc
- Update `examples/airflow` to use `openlineage-airflow` and fix the SQL in DAG troubleshooting step @wslulciuc
- Drop `job_versions_io_mapping_inputs` and `job_versions_io_mapping_outputs` tables @OleksandrDvornik
0.17.0 - 2021-08-20
- Update Lineage runs query to improve performance, added tests @collado-mike
- Add POST `/api/v1/lineage` endpoint to docs and deprecate run endpoints @wslulciuc
- Drop `FieldType` enum @wslulciuc
- Run API endpoints that create or modify a job run (scheduled to be removed in `0.19.0`). Please use the POST `/api/v1/lineage` endpoint when collecting job run metadata. @wslulciuc
- Airflow integration, please use the `openlineage-airflow` library instead. @wslulciuc
- Spark integration, please use the `openlineage-spark` library instead. @wslulciuc
- Write only clients for `java` and `python` (scheduled to be removed in `0.19.0`) @wslulciuc
- Dbt integration lib. @wslulciuc
- Common integration lib. @wslulciuc
0.16.1 - 2021-07-13
- dbt packages should look for namespace packages @mobuchowski
- Add common integration dependency to dbt plugins @mobuchowski
- `DatasetVersionDao` queries missing input and output facets @dominiquetipton
- (De)serialization issue for `Run` and `JobData` models @collado-mike
- Prefix spark `openlineage.*` configuration parameters with `spark.*` @collado-mike
- Parse multi-statement sql in class `SqlParser` used in Airflow integration @wslulciuc
- URL-encode namespace on calls to API backend @phixMe
0.16.0 - 2021-07-01
- New Add JobVersion API 🎉 @collado-mike
- New Add DBT integrations for BigQuery and Snowflake 🎉 @mobuchowski
- Reverted delete of BigQueryNodeVisitor to work with vanilla SparkListener @collado-mike
- Promote Lineage API out of beta @OleksandrDvornik
- Display job SQL in UI @phixMe
- Allow upsert of tags @hanbei
- Allow potentially ambiguous URIs with encoded path segments @mobuchowski
- Use source naming convention defined by OpenLineage @mobuchowski
- Return dataset facets @collado-mike
- BigQuery source naming in integrations @mobuchowski
0.15.2 - 2021-06-17
- Add endpoint to create tags @hanbei
- Fixed build & release process for python marquez-integration-common package @collado-mike
- Fixed snowflake and bigquery errors when connector libraries not loaded @collado-mike
- Fixed OpenLineage API does not set Dataset current_version_uuid #1361 @collado-mike
0.15.1 - 2021-06-11
- Factored out common functionality in Python airflow integration @mobuchowski
- Added Airflow task run macro to expose task run id @collado-mike
- Refactored ValuesAverageExpectationParser to ValuesSumExpectationParser and ValuesCountExpectationParser @collado-mike
- Updated SparkListener to extend Spark's SparkListener abstract class @collado-mike
- Use current project version in spark openlineage client @mobuchowski
- Rewrote LineageDao queries and LineageService for performance @collado-mike
- Updated lineage query to include new jobs that have no job version yet @collado-mike
0.15.0 - 2021-05-24
- Add tracing visibility @julienledem
- New Add snowflake extractor 🎉 @mobuchowski
- Add SSLContext to MarquezClient @lewiesnyder
- Add support for LogicalRDDs in spark plan visitors @collado-mike
- New Add Great Expectations based data quality facet support 🎉 @mobuchowski
- Augment tutorial instructions & screenshots for Airflow example @rossturk
- Rewrite correlated subqueries when querying the lineage_events table @collado-mike
- Web time formatting display fix @kachontep
0.14.2 - 2021-05-06
- Unpin `requests` dep in `marquez-airflow` integration @wslulciuc
- Unpin `attrs` dep in `marquez-airflow` integration @wslulciuc
0.14.1 - 2021-05-05
- Updated dataset lineage query to find most recent job that wrote to it @collado-mike
- Pin http-proxy-middleware to 0.20.0 @wslulciuc
0.14.0 - 2021-05-03
- GA tag for website tracking @rossturk
- Basic CTE support in `marquez-airflow` @mobuchowski
- Airflow custom facets, bigquery statistics facets @mobuchowski
- Unit tests for class `JobVersionDao` @wslulciuc
- Sentry tracing support @julienledem
- OpenLineage facets support to API response models 🎉 @wslulciuc
- `BigQueryRelationTransformer` and deleted `BigQueryNodeVisitor` @collado-mike
- Bump postgres to `12.1.0` @wslulciuc
- Update spark job name to reflect spark application name and execution node @collado-mike
- Update `marquez-airflow` integration to use OpenLineage 🎉 @mobuchowski
- Migrate tests to junit 5 @mobuchowski
- Rewrite lineage IO sql queries to avoid job_versions_io_mapping_* tables @collado-mike
- Updated OpenLineage impl to only update dataset version on run completion @collado-mike
0.13.1 - 2021-04-01
- Remove unused implementation of SQL parser in `marquez-airflow` @mobuchowski
- Add inputs and outputs to lineage graph @henneberger
- Updated `NodeId` regex to support URIs with scheme and ports @collado-mike
0.13.0 - 2021-03-30
- Secret support for helm chart @KevinMellott91
- New `seed` cmd to populate `marquez` database with source, dataset, and job metadata allowing users to try out features of Marquez (data lineage, view job run history, etc) 🎉
- Docs on applying db migrations manually
- New Lineage API to support data lineage queries 🎉
- Support for logging errors via sentry
- New Airflow example with Marquez 🎉
- Update OpenLineageDao to stop converting URI structures to contain underscores instead of colons and slashes @collado-mike
- Bump testcontainers dependency to `v1.15.2` @ShakirzyanovArsen
- Register output datasets for a run lazily @henneberger
- Refactor spark plan traversal to find input/output datasets from datasources @collado-mike
- Web UI project settings and default marquez port @phixMe
- Associate dataset inputs on run start @henneberger
- Dataset description is not overwritten on update @henneberger
- Latest tags are returned from dataset @henneberger
- Airflow integration tests on forked PRs @mobuchowski
- Empty nominal end time support @henneberger
- Ensure valid dataset fields for OpenLineage @henneberger
- Ingress context templating for helm chart @KulykDmytro
0.12.2 - 2021-03-16
- Use alpine image for `marquez` reducing image size by `+50%` @KevinMellott91
- Use alpine image for `marquez-web` reducing image size by `+50%` @KevinMellott91
- Ensure `marquez.DAG` is (de)serializable
0.12.0 - 2021-02-08
- Modules: `api`, `web`, `clients`, `chart`, and `integrations`
- Working airflow example
- `runs` table indices for columns: `created_at` and `current_run_state` @phixMe
- New `/lineage` endpoint for OpenLineage support @henneberger
- New graphql endpoint @henneberger
- New spark integration @henneberger
- New API to list versions for a dataset
- Drop `Source.type` enum (now a string type)
- Replace `jdbi.getHandle()` with `jdbi.withHandle()` to free DB connections from pool @henneberger
- Fix `RunListener` when registering outside of the `MarquezContext` builder @henneberger
0.11.3 - 2020-11-02
- Add support for external ID on run creation @julienledem
- Throw `RunAlreadyExistsException` on run ID already exists
- Add BigQuery, Pulsar, and Oracle source types @sreev
- Add run ID support in job meta; the optional run ID will be used to link a newly created job version to an existing job run, while supporting updating the run state and avoiding having to create another run
- Use `postgres` instead of `db` in `marquez.dev.yml`
- Allow multiple postgres containers in test suite @phixMe
0.11.2 - 2020-08-21
- Always migrate db schema on app start in development config
- Update default db username / password
- Use `marquez.dev.yml` on docker compose `up`
0.11.1 - 2020-08-19
- Use shortened name for namespaces in version IDs
- Add namespace to Dataset and Job models
- Add ability to deserialize `int` type to columns @phixMe
- Add `SqlLogger` for SQL profiling
- Add `DatasetVersionId.asDatasetId()` and `JobVersionId.asJobId()`
- Add `DatasetService.getBy(DatasetVersionId): Dataset`
- Add `JobService.getBy(JobVersionId): Job`
- Allow for run transition override via `at=<TIMESTAMP>`, where `TIMESTAMP` is an ISO 8601 timestamp representing the date/time of the state transition. For example: `POST /jobs/runs/{id}/start?at=<TIMESTAMP>`
- `config.yml` -> `marquez.yml`
- Fix dataset version column mappings
0.11.0 - 2020-05-27
- `Run.startedAt`, `Run.endedAt`, `Run.duration` @julienledem
- class `MarquezContext` @julienledem
- class `RunTransitionListener` @julienledem
- Unique identifier class `DatasetId` for datasets @julienledem
- Unique identifier class `JobId` for jobs @julienledem
- class `RunId` @ravikamaraj
- enum `RunState` @ravikamaraj
- class `Version` @ravikamaraj
- Job inputs / outputs are defined as `DatasetId`
- Bump to JDK 11
- Use of API models under `marquez.api.models` pkg
- API docs example to show correct `SQL` key in job context @frankcash
0.10.4 - 2020-01-17
- Fix `RunState.isComplete()`
0.10.3 - 2020-01-17
- Add new logo
- Add `JobResource.locationFor()`
- Fix dataset field versioning
- Fix list job runs
0.10.2 - 2020-01-16
- Added Location header to run creation @nkijak
0.10.1 - 2020-01-11
- Rename `datasets.last_modified`
0.10.0 - 2020-01-08
- Rename table `dataset_tag_mapping`
0.9.2 - 2020-01-07
- Add `Flyway.baselineOnMigrate` flag
0.9.1 - 2020-01-06
- Add redshift data types
- Add links to dropwizard overrides in `config.yml`
0.9.0 - 2020-01-05
- Validate `runID` when linked to dataset change
- Add `Utils.toUuid()`
- Add tests for class `TagDao`
- Add default tags to config
- Add tagging support for dataset fields
- Add `docker/config.dev.yml`
- Add flyway config support
- Replace deprecated `App.onFatalError()`
- Fix error on tag exists
- Fix malformed sql in `RunDao.findAll()`
0.8.0 - 2019-12-12
- Add `Dataset.lastModified`
- Add `tags` table schema
- Add `GET` `/tags`
- Use new Flyway version to fix migration with custom roles
- Modify `args` column in table `run_args`
0.7.0 - 2019-12-05
- Link dataset versions with run inputs
- Add schema required by tagging
- More tests for class `common.Utils`
- Add `ColumnsTest`
- Add `RunDao.insert()`
- Add `RunStateDao.insert()`
- Add `METRICS.md`
- Add prometheus dep and expose `GET` `/metrics`
- Fix dataset field serialization
0.6.0 - 2019-11-29
- Add `Job.latestRun`
- Add debug logging
- Adjust class RunResponse property ordering on serialization
- Update logging on default namespace creation
0.5.1 - 2019-11-20
- Add dataset field versioning support
- Add link to web UI
- Add `Job.context`
- Update semver regex in build-and-push.sh
- Minor updates to job and dataset versioning functions
- Make `Job.location` optional
0.5.0 - 2019-11-04
- Add `lombok.config`
- Add code review guidelines
- Add `JobType`
- Add limit and offset support to NamespaceAPI
- Add Development section to `CONTRIBUTING.md`
- Add class `DatasetMeta`
- Add class `MorePreconditions`
- Added install instructions for docker
- Rename guid column to uuid
- Use admin ping and health
- Update `owner` to `ownerName`
- Remove experimental db table versioning code
- Fix `marquez.jar` rename on `COPY`
0.4.0 - 2019-06-04
- Add quickstart
- Add `GET` `/namespaces/{namespace}/jobs/{job}/runs`
0.3.4 - 2019-05-17
- Change `Datasetdao.findAll()` to order by `Dataset.name`
0.3.3 - 2019-05-14
- Set timestamps to `CURRENT_TIMESTAMP`
0.3.2 - 2019-05-14
- Set `job_versions.updated_at` to `CURRENT_TIMESTAMP`
0.3.1 - 2019-05-14
- Handle `Flyway.repair()` error
0.3.0 - 2019-05-14
- Add `JobResponse.updatedAt`
- Return timestamp strings as ISO format
- Remove unused tables in db schema
0.2.1 - 2019-04-22
- Support dashes (`-`) in namespace
0.2.0 - 2019-04-15
- Add `@NoArgsConstructor` to exceptions
- Add license to `*.java`
- Add column constants
- Add response/error metrics to API endpoints
- Add build info to jar manifest
- Add release steps and plugin
- Add `/jobs/runs/{id}/run`
- Add jdbi metrics
- Add gitter link
- Add `MarquezServiceException`
- Add `-parameters` compiler flag
- Add JSON logging support
- Minor pkg restructuring
- Throw `NamespaceNotFoundException` on `NamespaceResource.get()`
- Fix dataset list error
0.1.0 - 2018-12-18
- Marquez initial public release.