Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support on_schema_change: full_refresh mode #154

Open
jtcohen6 opened this issue Dec 13, 2021 · 10 comments
Open

Support on_schema_change: full_refresh mode #154

jtcohen6 opened this issue Dec 13, 2021 · 10 comments
Labels
enhancement New feature or request paper_cut A small change that impacts lots of users in their day-to-day

Comments

@jtcohen6
Copy link
Contributor

Sujay in Slack:

> Can we have a config for dbt full refresh on schema change for incremental modes?

I started writing a response in Slack, but figured it's better for visibility here!

This was definitely a goal of our initial implementation (#3288), and it's worth revisiting in the future.

It's a bit trickier than the other modes, because (on most databases):

  • We need to create the temp table of "new data" first, in order to infer its column schema
  • Then, if on_schema_change is not ignore (default), we compare column schemas and (depending on mode) adjust the existing column schema

If we detected a change and wanted to trigger a full-refresh, we'd need to do either/both of:

  • Throw away that initial temp table, and jump into a totally different logical branch of the materialization (full-refresh logic)
  • Try to infer the new column schema from the model SQL (where false limit 0), rather than creating an actual temp table of new data. If the column schemas differ, proceed with full-refresh; otherwise, proceed as normal.

Both are possible, it'll just take some work to get there. We should use this opportunity to clean up and modularize some of the incremental materialization logic in the process (no breaking changes).

@github-actions
Copy link

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 12, 2022
@github-actions
Copy link

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

@ma1ster
Copy link

ma1ster commented Jun 23, 2022

I am still interested. Since my models run on ci/cd I have to manually pass full refresh flag to specific models via ci/cd. It's very inconvenient since first of all I must be aware that the schema has changed.

@jtcohen6 jtcohen6 reopened this Oct 6, 2022
@jtcohen6 jtcohen6 removed the Stale label Oct 6, 2022
@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Oct 6, 2022

Reopening as there's definitely still interest in this

Switching team assignment from Team:Execution to Team:Adapters. This would require refactoring the logic flow in most of our incremental materializations. We'd likely end up running the same model SQL twice:

  • First into a view / temp table to detect schema changes
  • Then again, if we've identified schema changes, to fully replace the existing table. This would also require re-compiling the model SQL with is_incremental set to False, which I see as the trickiest component.

@romeritosmorais
Copy link

I'm using incremental for some models, and on_schema_change: append_new_columns in dbt version 1.0.8, I do --full-refresh initially and run, the problem is that when I add new columns they don't appear in the destination table. What should I do?

@jtcohen6 jtcohen6 added the paper_cut A small change that impacts lots of users in their day-to-day label Nov 28, 2022
@github-actions
Copy link

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 28, 2023
@github-actions
Copy link

github-actions bot commented Jun 4, 2023

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@wpromatt
Copy link

This would be a much appreciated feature.

@DRudel
Copy link

DRudel commented Mar 22, 2024

Definitely still interest here!

@dbeatty10 dbeatty10 transferred this issue from dbt-labs/dbt-core Apr 10, 2024
@lachniej
Copy link

yeah this is a much needed feature if still in progress

mikealfare pushed a commit that referenced this issue Dec 2, 2024
Co-authored-by: Jérémy Guiselin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request paper_cut A small change that impacts lots of users in their day-to-day
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants