Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RHINENG-14146 Optimize Get Recommendations Query #282

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

saltgen
Copy link
Contributor

@saltgen saltgen commented Dec 9, 2024

Why do we need this change? 💭

The current sql query is expensive considering there is a lot of records to sift through.
Analysis details are present in the JIRA card.

That said, some important details are provided in the Additional section below.

Documentation update? 📝

  • Yes
  • No

Security Checklist 🔒

Upon raising this PR please go through RedHatInsights/secure-coding-checklist

💂‍♂️ Checklist 🎯

  • Does this change depend on specific version of Kruize
    • If yes what is the version no:
    • Is that image available in production or needs deployment?
  • Bugfix
  • New Feature
  • Refactor
  • Unittests Added
  • DRY code
  • Dependency Added
  • DB Migration Added

Additional 📣

Indexes added

clusters (last_reported_at)
recommendation_sets (workload_id)
workloads (cluster_id)

Other Optimizations

  • Discarding usage of postgres' DATE method in queries
  • queryParams object is now a native interface

Query Diffs

GetRecommendationSet

Old

SELECT "recommendation_sets"."id","recommendation_sets"."workload_id","recommendation_sets"."container_name","recommendation_sets"."monitoring_start_time","recommendation_sets"."monitoring_end_time","recommendation_sets"."recommendations","recommendation_sets"."updated_at" FROM "recommendation_sets" 
                        JOIN workloads ON recommendation_sets.workload_id = workloads.id
                        JOIN clusters ON workloads.cluster_id = clusters.id
                        JOIN rh_accounts ON clusters.tenant_id = rh_accounts.id
                 WHERE rh_accounts.org_id = '3340851' AND DATE(recommendation_sets.monitoring_end_time) >= '1970-01-01' AND DATE(recommendation_sets.monitoring_end_time) <= '2024-12-02' ORDER BY clusters.last_reported_at LIMIT 10;

New

SELECT recommendation_sets.id, recommendation_sets.container_name AS container, workloads.namespace AS project, workloads.workload_name as workload, workloads.workload_type, clusters.source_id, clusters.cluster_uuid, clusters.cluster_alias, clusters.last_reported_at AS last_reported, recommendation_sets.recommendations FROM "recommendation_sets" 
			JOIN workloads ON recommendation_sets.workload_id = workloads.id
			JOIN clusters ON workloads.cluster_id = clusters.id
			JOIN rh_accounts ON clusters.tenant_id = rh_accounts.id
		 WHERE rh_accounts.org_id = '3340851' AND recommendation_sets.monitoring_end_time >= '1970-01-01 00:00:00' AND recommendation_sets.monitoring_end_time <= '2024-12-09 12:07:47.503' ORDER BY clusters.last_reported_at DESC LIMIT 10

GetRecommendationSetByID

Old

SELECT "recommendation_sets"."id","recommendation_sets"."workload_id","recommendation_sets"."container_name","recommendation_sets"."monitoring_start_time","recommendation_sets"."monitoring_end_time","recommendation_sets"."recommendations","recommendation_sets"."updated_at" FROM "recommendation_sets" JOIN workloads ON recommendation_sets.workload_id = workloads.id JOIN clusters ON workloads.cluster_id = clusters.id JOIN rh_accounts ON clusters.tenant_id = rh_accounts.id WHERE rh_accounts.org_id = '3340851' AND recommendation_sets.id = '4514582e-a156-47ae-8cc7-705a1dd3207b' ORDER BY "recommendation_sets"."id" LIMIT 1

New

SELECT recommendation_sets.id, recommendation_sets.container_name AS container, workloads.namespace AS project, workloads.workload_name as workload, workloads.workload_type, clusters.source_id, clusters.cluster_uuid, clusters.cluster_alias, clusters.last_reported_at AS last_reported, recommendation_sets.recommendations FROM "recommendation_sets" 
			JOIN workloads ON recommendation_sets.workload_id = workloads.id
			JOIN clusters ON workloads.cluster_id = clusters.id
			JOIN rh_accounts ON clusters.tenant_id = rh_accounts.id
		 WHERE rh_accounts.org_id = '3340851' AND recommendation_sets.id = '32a5d36c-aec1-41d7-b19f-5fef015a930a' ORDER BY "recommendation_sets"."id" LIMIT 1

@saltgen
Copy link
Contributor Author

saltgen commented Dec 9, 2024

I'll fix the unittest soon, please feel free to go through the other changes meanwhile.

Copy link
Collaborator

@kgaikwad kgaikwad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me. 👍 I will continue reviewing latest updated changes specific to date field.
Added just two inline suggestions. Please feel free to skip if it doesn't make sense.
Thanks!

Comment on lines 138 to 146
recommendationData["source_id"] = recommendation.SourceID
recommendationData["cluster_uuid"] = recommendation.ClusterUUID
recommendationData["cluster_alias"] = recommendation.ClusterAlias
recommendationData["project"] = recommendation.Project
recommendationData["workload_type"] = recommendation.WorkloadType
recommendationData["workload"] = recommendation.Workload
recommendationData["container"] = recommendation.Container
recommendationData["last_reported"] = recommendation.LastReported
recommendationData["recommendations"] = UpdateRecommendationJSON(handlerName, recommendation.ID, recommendation.ClusterUUID, unitChoices, setk8sUnits, recommendation.Recommendations)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
recommendationData["source_id"] = recommendation.SourceID
recommendationData["cluster_uuid"] = recommendation.ClusterUUID
recommendationData["cluster_alias"] = recommendation.ClusterAlias
recommendationData["project"] = recommendation.Project
recommendationData["workload_type"] = recommendation.WorkloadType
recommendationData["workload"] = recommendation.Workload
recommendationData["container"] = recommendation.Container
recommendationData["last_reported"] = recommendation.LastReported
recommendationData["recommendations"] = UpdateRecommendationJSON(handlerName, recommendation.ID, recommendation.ClusterUUID, unitChoices, setk8sUnits, recommendation.Recommendations)
recommendationData := map[string]interface{}{
"id": recommendation.ID,
"source_id": recommendation.SourceID,
"cluster_uuid": recommendation.ClusterUUID,
"cluster_alias": recommendation.ClusterAlias,
"project": recommendation.Project,
"workload_type": recommendation.WorkloadType,
"workload": recommendation.Workload,
"container": recommendation.Container,
"last_reported": recommendation.LastReported,
"recommendations": UpdateRecommendationJSON(handlerName, recommendation.ID, recommendation.ClusterUUID, unitChoices, setk8sUnits, recommendation.Recommendations)
}

With this approach, it will look more clean and concise. What do you think?

Copy link
Contributor Author

@saltgen saltgen Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll like this change 🙂 579879c

Comment on lines 228 to 236
recommendationSlice["source_id"] = recommendationSet.SourceID
recommendationSlice["cluster_uuid"] = recommendationSet.ClusterUUID
recommendationSlice["cluster_alias"] = recommendationSet.ClusterAlias
recommendationSlice["project"] = recommendationSet.Project
recommendationSlice["workload_type"] = recommendationSet.WorkloadType
recommendationSlice["workload"] = recommendationSet.Workload
recommendationSlice["container"] = recommendationSet.Container
recommendationSlice["last_reported"] = recommendationSet.LastReported
recommendationSlice["recommendations"] = UpdateRecommendationJSON(handlerName, recommendationSet.ID, recommendationSet.ClusterUUID, unitChoices, setk8sUnits, recommendationSet.Recommendations)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Comment on lines +23 to +26
firstOfMonth := time.Date(now.Year(), now.Month(), 1, 0, 0, 0, 0, time.UTC).Truncate(time.Second)

startTime := time.Date(2023, 3, 23, 0, 0, 0, 0, time.UTC).Truncate(time.Second)
endTime := time.Date(2023, 3, 24, 0, 0, 0, 0, time.UTC).Truncate(time.Second)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
firstOfMonth := time.Date(now.Year(), now.Month(), 1, 0, 0, 0, 0, time.UTC).Truncate(time.Second)
startTime := time.Date(2023, 3, 23, 0, 0, 0, 0, time.UTC).Truncate(time.Second)
endTime := time.Date(2023, 3, 24, 0, 0, 0, 0, time.UTC).Truncate(time.Second)
firstOfMonth := time.Date(now.Year(), now.Month(), 1, 0, 0, 0, 0, time.UTC)
startTime := time.Date(2023, 3, 23, 0, 0, 0, 0, time.UTC)
endTime := time.Date(2023, 3, 24, 0, 0, 0, 0, time.UTC)

It won't require to apply Truncate as you are already passing seconds as zero.

startDateSlice := append(dateSlice, startDate.Format(timeLayout))
queryParams["DATE(recommendation_sets.monitoring_end_time) >= ?"] = startDateSlice
endDateStr := c.QueryParam("end_date")
queryParams["recommendation_sets.monitoring_end_time >= ?"] = startTimestamp.Truncate(time.Second)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason behind applying Truncate here? Is it really required here?

startDateStr := c.QueryParam("start_date")

if startDateStr == "" {
startDate = firstOfMonth
startTimestamp = firstOfMonth
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
startTimestamp = firstOfMonth
// Consider first day of the month when start_date is empty
startTimestamp = time.Date(now.Year(), now.Month(), 1, 0, 0, 0, 0, time.UTC)

Will it be possible to remove firstOfMonth and assign value directly with comment for additional info.

if endDateStr == "" {
endDate = now
endTimestamp = now
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
endTimestamp = now
endTimestamp = time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, time.UTC)

endDateSlice := append(dateSlice, endDate.Format(timeLayout))

queryParams["DATE(recommendation_sets.monitoring_end_time) <= ?"] = endDateSlice
queryParams["recommendation_sets.monitoring_end_time <= ?"] = endTimestamp.Truncate(time.Second)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with above suggestion, no need of calling Truncate function.

@upadhyeammit
Copy link
Contributor

Request to rebase! the IQE tests are running most of the times now!

@saltgen saltgen force-pushed the refactor/get-recommendations-query branch from d96ab53 to bd8f4da Compare December 18, 2024 11:27
@upadhyeammit
Copy link
Contributor

/retest

Copy link
Contributor

@upadhyeammit upadhyeammit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query var in GetRecommendationSets and GetRecommendationSetByID is exactly same part from a filter Where("recommendation_sets.id = ?", recommendationID) . This is good candidate to consolidate and take out in another common function.

- queryparam now uses truncated ts
- Added test case for start,end date
- Parallelized tests
@saltgen saltgen force-pushed the refactor/get-recommendations-query branch from bd8f4da to 7e745bc Compare December 23, 2024 08:21
@upadhyeammit
Copy link
Contributor

/retest

@upadhyeammit
Copy link
Contributor

On IQE I can see its failing because of syntax error, however I dont see this locally? hmm.. is it because locally I am on 1.22 version of Go? need to check,

{"file":"/go/src/app/internal/api/handlers.go:118","func":"github.com/redhatinsights/ros-ocp-backend/internal/api.GetRecommendationSetList","level":"error","msg":"unable to fetch records from databaseERROR: syntax error at or near \")\" (SQLSTATE 42601)","service":"rosocp-api","time":"2024-12-24T07:31:53Z"}
{"time":"2024-12-24T07:31:53.233083919Z","id":"","remote_ip":"10.131.11.30","host":"ros-ocp-backend-api.ephemeral-xbkkoa.svc:8000","method":"GET","uri":"/api/cost-management/v1/recommendations/openshift?cluster=ros_ocp_cluster_caqjsNZzY&start_date=2024-12-17","user_agent":"OpenAPI-Generator/1.0.0/python","status":200,"error":"","latency":11045462,"latency_human":"11.045462ms","bytes_in":0,"bytes_out":361}

@saltgen
Copy link
Contributor Author

saltgen commented Dec 24, 2024

On IQE I can see its failing because of syntax error, however I dont see this locally? hmm.. is it because locally I am on 1.22 version of Go? need to check,

{"file":"/go/src/app/internal/api/handlers.go:118","func":"github.com/redhatinsights/ros-ocp-backend/internal/api.GetRecommendationSetList","level":"error","msg":"unable to fetch records from databaseERROR: syntax error at or near \")\" (SQLSTATE 42601)","service":"rosocp-api","time":"2024-12-24T07:31:53Z"}
{"time":"2024-12-24T07:31:53.233083919Z","id":"","remote_ip":"10.131.11.30","host":"ros-ocp-backend-api.ephemeral-xbkkoa.svc:8000","method":"GET","uri":"/api/cost-management/v1/recommendations/openshift?cluster=ros_ocp_cluster_caqjsNZzY&start_date=2024-12-17","user_agent":"OpenAPI-Generator/1.0.0/python","status":200,"error":"","latency":11045462,"latency_human":"11.045462ms","bytes_in":0,"bytes_out":361}

I'll take a look

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants