Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make propertyID global unique #4770

Merged
merged 5 commits into from
Jan 23, 2025
Merged

Make propertyID global unique #4770

merged 5 commits into from
Jan 23, 2025

Conversation

acquamarin
Copy link
Collaborator

@acquamarin acquamarin commented Jan 22, 2025

This PR makes the propertyID global unique which is need by indexCatalogEntry.
Once a property is created in the system, its propertyID never changes.

Our ultimate goal is to unify the columnID with the propertyID, however the backend still reindexes the columnID when we alter the table:

if (hasChanges) {
        // Deleted columns are vaccumed and not checkpointed or serialized.
        std::vector<column_id_t> columnIDs;
        columnIDs.push_back(0);
        for (auto& property : tableEntry->getProperties()) {
            columnIDs.push_back(tableEntry->getColumnID(property.getName()));
        }
        for (auto& directedRelData : directedRelData) {
            directedRelData->checkpoint(columnIDs);
        }
        tableEntry->vacuumColumnIDs(1);
        hasChanges = false;
    }

We should consider removing the columnID, so both frontend and backend can use the propertyID.

@ray6080 ray6080 requested review from ray6080 and removed request for benjaminwinger January 23, 2025 03:07
@@ -23,7 +23,9 @@ class KUZU_API NodeTableCatalogEntry final : public TableCatalogEntry {
common::TableType getTableType() const override { return common::TableType::NODE; }

std::string getPrimaryKeyName() const { return primaryKeyName; }
common::idx_t getPrimaryKeyIdx() const { return propertyCollection.getIdx(primaryKeyName); }
common::property_id_t getPrimaryKeyIdx() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
common::property_id_t getPrimaryKeyIdx() const {
common::property_id_t getPrimaryKeyID() const {

@ray6080
Copy link
Contributor

ray6080 commented Jan 23, 2025

This PR makes the propertyID global unique which is need by indexCatalogEntry. Once a property is created in the system, its propertyID never changes.

Our ultimate goal is to unify the columnID with the propertyID, however the backend still reindexes the columnID when we alter the table:

if (hasChanges) {
        // Deleted columns are vaccumed and not checkpointed or serialized.
        std::vector<column_id_t> columnIDs;
        columnIDs.push_back(0);
        for (auto& property : tableEntry->getProperties()) {
            columnIDs.push_back(tableEntry->getColumnID(property.getName()));
        }
        for (auto& directedRelData : directedRelData) {
            directedRelData->checkpoint(columnIDs);
        }
        tableEntry->vacuumColumnIDs(1);
        hasChanges = false;
    }

We should consider removing the columnID, so both frontend and backend can use the propertyID.

Yeah. I will handle the refactoring of columnID after this.

Copy link

codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 93.18182% with 6 lines in your changes missing coverage. Please review.

Project coverage is 86.34%. Comparing base (99dd5dc) to head (658352c).
Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
src/common/arrow/arrow_row_batch.cpp 0.00% 2 Missing ⚠️
...e/processor/operator/scan/offset_scan_node_table.h 0.00% 2 Missing ⚠️
src/catalog/property_definition_collection.cpp 97.36% 1 Missing ⚠️
src/function/table/call/table_info.cpp 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4770      +/-   ##
==========================================
+ Coverage   86.31%   86.34%   +0.02%     
==========================================
  Files        1397     1397              
  Lines       59935    60121     +186     
  Branches     7384     7390       +6     
==========================================
+ Hits        51734    51912     +178     
- Misses       8034     8043       +9     
+ Partials      167      166       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

Benchmark Result

Master commit hash: 3900ed5623a041dcfc8699fd35a3a56a0dc257ce
Branch commit hash: f877ca47b4fd7734ae32bad7ef9fc5e96be7bb04

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 688.91 694.16 -5.25 (-0.76%)
aggregation q28 6072.46 6135.52 -63.06 (-1.03%)
copy node-Comment 72293.65 N/A N/A
copy node-Forum 5738.01 N/A N/A
copy node-Organisation 1241.58 N/A N/A
copy node-Person 2222.12 N/A N/A
copy node-Place 1210.09 N/A N/A
copy node-Post 31094.70 N/A N/A
copy node-Tag 1278.94 N/A N/A
copy node-Tagclass 1131.31 N/A N/A
copy rel-comment-hasCreator 57251.76 N/A N/A
copy rel-comment-hasTag 84514.48 N/A N/A
copy rel-comment-isLocatedIn 67388.40 N/A N/A
copy rel-containerOf 14588.21 N/A N/A
copy rel-forum-hasTag 3952.17 N/A N/A
copy rel-hasInterest 3091.11 N/A N/A
copy rel-hasMember 118683.39 N/A N/A
copy rel-hasModerator 1305.03 N/A N/A
copy rel-hasType 247.58 N/A N/A
copy rel-isPartOf 252.88 N/A N/A
copy rel-isSubclassOf 276.96 N/A N/A
copy rel-knows 13487.58 N/A N/A
copy rel-likes-comment 172721.74 N/A N/A
copy rel-likes-post 68949.38 N/A N/A
copy rel-organisation-isLocatedIn 221.41 N/A N/A
copy rel-person-isLocatedIn 450.41 N/A N/A
copy rel-post-hasCreator 14682.50 N/A N/A
copy rel-post-hasTag 22613.51 N/A N/A
copy rel-post-isLocatedIn 17736.48 N/A N/A
copy rel-replyOf-comment 48274.61 N/A N/A
copy rel-replyOf-post 36832.42 N/A N/A
copy rel-studyAt 858.08 N/A N/A
copy rel-workAt 1626.55 N/A N/A
filter q14 130.87 125.69 5.18 (4.12%)
filter q15 130.64 131.73 -1.09 (-0.83%)
filter q16 305.68 319.73 -14.05 (-4.40%)
filter q17 446.62 447.14 -0.52 (-0.12%)
filter q18 1900.14 1939.95 -39.82 (-2.05%)
filter zonemap-node 92.75 88.70 4.06 (4.57%)
filter zonemap-node-lhs-cast 89.26 89.76 -0.50 (-0.56%)
filter zonemap-node-null 85.36 85.37 -0.01 (-0.01%)
filter zonemap-rel 5865.60 5657.63 207.98 (3.68%)
fixed_size_expr_evaluator q07 580.23 574.37 5.86 (1.02%)
fixed_size_expr_evaluator q08 813.18 800.97 12.20 (1.52%)
fixed_size_expr_evaluator q09 812.80 802.05 10.75 (1.34%)
fixed_size_expr_evaluator q10 246.22 423.75 -177.54 (-41.90%)
fixed_size_expr_evaluator q11 238.15 231.10 7.05 (3.05%)
fixed_size_expr_evaluator q12 234.45 227.67 6.77 (2.97%)
fixed_size_expr_evaluator q13 1463.34 1470.43 -7.09 (-0.48%)
fixed_size_seq_scan q23 123.73 127.42 -3.69 (-2.90%)
join q29 605.52 602.62 2.90 (0.48%)
join q30 10107.84 10388.23 -280.39 (-2.70%)
join q31 5.58 5.51 0.06 (1.14%)
join SelectiveTwoHopJoin 50.70 53.68 -2.98 (-5.56%)
ldbc_snb_ic q35 2666.57 2493.54 173.03 (6.94%)
ldbc_snb_ic q36 460.25 487.00 -26.75 (-5.49%)
ldbc_snb_is q32 3.63 5.82 -2.19 (-37.63%)
ldbc_snb_is q33 11.28 13.48 -2.20 (-16.31%)
ldbc_snb_is q34 1.44 1.43 0.02 (1.08%)
multi-rel multi-rel-large-scan 1333.39 1317.09 16.29 (1.24%)
multi-rel multi-rel-lookup 44.04 35.80 8.25 (23.04%)
multi-rel multi-rel-small-scan 68.29 94.32 -26.03 (-27.60%)
order_by q25 132.36 133.41 -1.05 (-0.79%)
order_by q26 446.12 451.41 -5.29 (-1.17%)
order_by q27 1453.65 1453.12 0.53 (0.04%)
recursive_join recursive-join-bidirection 304.44 287.44 17.00 (5.91%)
recursive_join recursive-join-dense 7387.90 7335.21 52.68 (0.72%)
recursive_join recursive-join-path 24073.70 23480.96 592.74 (2.52%)
recursive_join recursive-join-sparse 1059.22 1057.62 1.61 (0.15%)
recursive_join recursive-join-trail 7337.00 7319.21 17.78 (0.24%)
scan_after_filter q01 172.08 172.24 -0.16 (-0.09%)
scan_after_filter q02 158.05 158.05 -0.01 (-0.00%)
shortest_path_ldbc100 q37 84.24 92.59 -8.36 (-9.02%)
shortest_path_ldbc100 q38 371.71 377.78 -6.08 (-1.61%)
shortest_path_ldbc100 q39 64.90 62.75 2.15 (3.42%)
shortest_path_ldbc100 q40 387.36 458.83 -71.47 (-15.58%)
var_size_expr_evaluator q03 2065.64 2059.14 6.49 (0.32%)
var_size_expr_evaluator q04 2193.81 2231.35 -37.54 (-1.68%)
var_size_expr_evaluator q05 2611.82 2593.36 18.46 (0.71%)
var_size_expr_evaluator q06 1324.25 1323.22 1.04 (0.08%)
var_size_seq_scan q19 1444.70 1435.43 9.27 (0.65%)
var_size_seq_scan q20 2752.23 2622.40 129.83 (4.95%)
var_size_seq_scan q21 2329.42 2257.93 71.49 (3.17%)
var_size_seq_scan q22 127.48 124.69 2.79 (2.24%)

Copy link

Benchmark Result

Master commit hash: 3900ed5623a041dcfc8699fd35a3a56a0dc257ce
Branch commit hash: cca0f671171c05102022fa767b06a5e9cc8dc0d3

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 692.01 694.16 -2.15 (-0.31%)
aggregation q28 6093.09 6135.52 -42.43 (-0.69%)
filter q14 126.34 125.69 0.66 (0.52%)
filter q15 128.66 131.73 -3.07 (-2.33%)
filter q16 303.59 319.73 -16.14 (-5.05%)
filter q17 446.46 447.14 -0.68 (-0.15%)
filter q18 1933.72 1939.95 -6.23 (-0.32%)
filter zonemap-node 88.67 88.70 -0.03 (-0.03%)
filter zonemap-node-lhs-cast 88.60 89.76 -1.16 (-1.29%)
filter zonemap-node-null 87.61 85.37 2.23 (2.61%)
filter zonemap-rel 5849.11 5657.63 191.48 (3.38%)
fixed_size_expr_evaluator q07 579.90 574.37 5.53 (0.96%)
fixed_size_expr_evaluator q08 809.65 800.97 8.68 (1.08%)
fixed_size_expr_evaluator q09 810.39 802.05 8.34 (1.04%)
fixed_size_expr_evaluator q10 245.33 423.75 -178.42 (-42.11%)
fixed_size_expr_evaluator q11 238.42 231.10 7.32 (3.17%)
fixed_size_expr_evaluator q12 233.84 227.67 6.16 (2.71%)
fixed_size_expr_evaluator q13 1458.53 1470.43 -11.90 (-0.81%)
fixed_size_seq_scan q23 115.36 127.42 -12.06 (-9.47%)
join q29 598.60 602.62 -4.02 (-0.67%)
join q30 10568.44 10388.23 180.21 (1.73%)
join q31 5.01 5.51 -0.50 (-9.10%)
join SelectiveTwoHopJoin 52.97 53.68 -0.71 (-1.33%)
ldbc_snb_ic q35 2644.65 2493.54 151.11 (6.06%)
ldbc_snb_ic q36 471.48 487.00 -15.52 (-3.19%)
ldbc_snb_is q32 4.40 5.82 -1.42 (-24.34%)
ldbc_snb_is q33 14.12 13.48 0.64 (4.78%)
ldbc_snb_is q34 1.42 1.43 -0.01 (-0.51%)
multi-rel multi-rel-large-scan 1326.61 1317.09 9.51 (0.72%)
multi-rel multi-rel-lookup 8.17 35.80 -27.63 (-77.17%)
multi-rel multi-rel-small-scan 57.94 94.32 -36.38 (-38.57%)
order_by q25 130.41 133.41 -3.00 (-2.25%)
order_by q26 450.75 451.41 -0.66 (-0.15%)
order_by q27 1454.55 1453.12 1.42 (0.10%)
recursive_join recursive-join-bidirection 289.23 287.44 1.79 (0.62%)
recursive_join recursive-join-dense 7369.64 7335.21 34.43 (0.47%)
recursive_join recursive-join-path 24135.27 23480.96 654.32 (2.79%)
recursive_join recursive-join-sparse 1070.52 1057.62 12.91 (1.22%)
recursive_join recursive-join-trail 7307.93 7319.21 -11.28 (-0.15%)
scan_after_filter q01 172.97 172.24 0.73 (0.43%)
scan_after_filter q02 157.80 158.05 -0.26 (-0.16%)
shortest_path_ldbc100 q37 86.38 92.59 -6.21 (-6.71%)
shortest_path_ldbc100 q38 406.09 377.78 28.30 (7.49%)
shortest_path_ldbc100 q39 66.09 62.75 3.34 (5.32%)
shortest_path_ldbc100 q40 440.27 458.83 -18.56 (-4.04%)
var_size_expr_evaluator q03 2058.34 2059.14 -0.80 (-0.04%)
var_size_expr_evaluator q04 2181.98 2231.35 -49.38 (-2.21%)
var_size_expr_evaluator q05 2626.48 2593.36 33.12 (1.28%)
var_size_expr_evaluator q06 1322.03 1323.22 -1.18 (-0.09%)
var_size_seq_scan q19 1449.95 1435.43 14.52 (1.01%)
var_size_seq_scan q20 2720.14 2622.40 97.74 (3.73%)
var_size_seq_scan q21 2329.23 2257.93 71.30 (3.16%)
var_size_seq_scan q22 128.26 124.69 3.58 (2.87%)

@acquamarin acquamarin merged commit 2a26123 into master Jan 23, 2025
24 of 25 checks passed
@acquamarin acquamarin deleted the unique-property-idx branch January 23, 2025 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants