Startup optimization: Custom DatadogLoggerSink for internal logging / Configuration refactoring #4360

Open
Wants to merge 32 commits into base: master

Conversation


@tonyredondo tonyredondo commented Jul 5, 2023

Summary of changes

This PR contains the following changes to improve startup time in the tracer and to reduce the impact of the internal logger on the calling thread.

  1. A new DatadogSharedFileSink: based on the existing SharedFileSink, this sink removes the mutex acquisition for each LogItem written by using a custom MutexStream that only acquires the mutex when there is something to write to the underlying file stream. Combined with a buffered StreamWriter, this means the mutex is only acquired when the buffer is full or a flush is forced.
  2. A new DatadogDeferredSink: this sink defers the write of each LogItem by enqueuing it and moving the write to the thread pool, removing any impact of the logger on the calling thread. This proved to bring a faster startup time (removing almost 30 ms from the startup process).
  3. New IConfigurationSource and ITelemeteredConfigurationSource method overloads for filtering items in the GetDictionary methods, avoiding the Dictionary -> Where -> ToDictionary pattern (see the sketch after this list).
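
To make the filtering overloads in item 3 concrete, here is a minimal sketch of the idea, with illustrative names only (this is not the PR's actual API shape): the selector is applied while the raw value is being parsed, so a single dictionary is allocated instead of parsing everything and then running Where/ToDictionary over the result.

using System;
using System.Collections.Generic;
using System.Linq;

internal static class FilteredDictionarySketch
{
    // Old shape: parse the full dictionary, then filter with LINQ
    // (two dictionaries plus iterator/delegate allocations).
    public static Dictionary<string, string> ParseThenFilter(string raw, Func<KeyValuePair<string, string>, bool> selector)
        => Parse(raw).Where(selector).ToDictionary(kvp => kvp.Key, kvp => kvp.Value);

    // New shape: apply the selector while parsing, allocating a single dictionary.
    public static Dictionary<string, string> ParseWithFilter(string raw, Func<string, string, bool> selector)
    {
        var result = new Dictionary<string, string>();
        foreach (var pair in raw.Split(','))
        {
            var parts = pair.Split(new[] { ':' }, 2);
            if (parts.Length == 2 && selector(parts[0], parts[1]))
            {
                result[parts[0]] = parts[1];
            }
        }

        return result;
    }

    private static Dictionary<string, string> Parse(string raw)
        => ParseWithFilter(raw, (key, value) => true);
}

For example, a value such as "a:1,b:2,internal_x:3" can be filtered with ParseWithFilter(raw, (k, v) => !k.StartsWith("internal_")) without ever materializing the unfiltered dictionary.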

Reason for change

Reduce performance overhead of the internal logger.

This can be seen on both Execution Benchmark tests and Throughput tests:

(two screenshots in the original PR description: execution-time benchmark and throughput comparisons)

Test coverage

The existing tests must pass to confirm we don't change the current behavior.

@tonyredondo tonyredondo self-assigned this Jul 5, 2023
@github-actions github-actions bot added the area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) label Jul 5, 2023
@tonyredondo tonyredondo changed the title Tony/logger shared mutexstream Custom DatadogLoggerSink for internal logging Jul 6, 2023
@tonyredondo tonyredondo changed the title Custom DatadogLoggerSink for internal logging Startup optimization: Custom DatadogLoggerSink for internal logging / Configuration refactoring Jul 6, 2023
@datadog-ddstaging

datadog-ddstaging bot commented Jul 12, 2023

Datadog Report

Branch report: tony/logger-shared-mutexstream
Commit report: 2d2ff6a

❄️ dd-trace-dotnet: 0 Failed, 1 New Flaky, 291189 Passed, 1091 Skipped, 30m 24.87s Wall Time

New Flaky Tests (1)

  • UtcNow_GivesLegitTime - Datadog.Trace.Tests.TraceContextTests - Last Failure

    Expand for error
     Assert.True() Failure
     Expected: True
     Actual:   False
    

@andrewlock
Member

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4360) - mean (2,991ms)  : 2932, 3050
     .   : milestone, 2991,
    master - mean (3,006ms)  : 2921, 3091
     .   : milestone, 3006,

    section CallTarget+Inlining+NGEN
    This PR (4360) - mean (3,760ms)  : 3670, 3849
     .   : milestone, 3760,
    master - mean (3,841ms)  : 3779, 3902
     .   : milestone, 3841,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4360) - mean (3,119ms)  : 3030, 3208
     .   : milestone, 3119,
    master - mean (3,116ms)  : 3031, 3200
     .   : milestone, 3116,

    section CallTarget+Inlining+NGEN
    This PR (4360) - mean (3,570ms)  : 3467, 3673
     .   : milestone, 3570,
    master - mean (3,625ms)  : 3559, 3691
     .   : milestone, 3625,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4360) - mean (3,099ms)  : 2997, 3202
     .   : milestone, 3099,
    master - mean (3,116ms)  : 3034, 3198
     .   : milestone, 3116,

    section CallTarget+Inlining+NGEN
    This PR (4360) - mean (3,537ms)  : 3463, 3611
     .   : milestone, 3537,
    master - mean (3,597ms)  : 3541, 3653
     .   : milestone, 3597,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4360) - mean (189ms)  : 186, 193
     .   : milestone, 189,
    master - mean (190ms)  : 186, 195
     .   : milestone, 190,

    section CallTarget+Inlining+NGEN
    This PR (4360) - mean (1,088ms)  : 1057, 1119
     .   : milestone, 1088,
    master - mean (1,129ms)  : 1103, 1155
     .   : milestone, 1129,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4360) - mean (372ms)  : 363, 380
     .   : milestone, 372,
    master - mean (372ms)  : 365, 379
     .   : milestone, 372,

    section CallTarget+Inlining+NGEN
    This PR (4360) - mean (1,149ms)  : 1123, 1175
     .   : milestone, 1149,
    master - mean (1,197ms)  : 1168, 1225
     .   : milestone, 1197,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (4360) - mean (358ms)  : 354, 363
     .   : milestone, 358,
    master - mean (360ms)  : 353, 366
     .   : milestone, 360,

    section CallTarget+Inlining+NGEN
    This PR (4360) - mean (1,099ms)  : 1077, 1120
     .   : milestone, 1099,
    master - mean (1,145ms)  : 1128, 1163
     .   : milestone, 1145,


@andrewlock
Member

Benchmarks Report 🐌

Benchmarks for #4360 compared to master:

  • 4 benchmarks are faster, with geometric mean 1.162
  • 2 benchmarks are slower, with geometric mean 1.150
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test at a 5% significance level
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 474μs 154ns 575ns 0 0 0 2.62 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 647μs 296ns 1.11μs 0 0 0 2.63 KB
master WriteAndFlushEnrichedTraces net472 794μs 409ns 1.59μs 0.396 0 0 3.22 KB
#4360 WriteAndFlushEnrichedTraces net6.0 471μs 104ns 387ns 0 0 0 2.63 KB
#4360 WriteAndFlushEnrichedTraces netcoreapp3.1 658μs 377ns 1.46μs 0 0 0 2.63 KB
#4360 WriteAndFlushEnrichedTraces net472 799μs 386ns 1.49μs 0.398 0 0 3.22 KB
Benchmarks.Trace.AppSecBodyBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #4360

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.AppSecBodyBenchmark.ObjectExtractorSimpleBody‑net472 1.241 187.47 151.04
Benchmarks.Trace.AppSecBodyBenchmark.ObjectExtractorSimpleBody‑net6.0 1.139 132.88 116.63

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master AllCycleSimpleBody net6.0 38.9μs 23.3ns 90.4ns 0.0195 0 0 1.65 KB
master AllCycleSimpleBody netcoreapp3.1 41.2μs 144ns 559ns 0.0207 0 0 1.63 KB
master AllCycleSimpleBody net472 42.3μs 126ns 487ns 0.249 0 0 1.69 KB
master AllCycleMoreComplexBody net6.0 222μs 81.9ns 317ns 0.111 0 0 9.22 KB
master AllCycleMoreComplexBody netcoreapp3.1 234μs 99.6ns 345ns 0.117 0 0 9.12 KB
master AllCycleMoreComplexBody net472 240μs 79.6ns 308ns 1.43 0 0 9.28 KB
master ObjectExtractorSimpleBody net6.0 133ns 0.0862ns 0.322ns 0.00396 0 0 280 B
master ObjectExtractorSimpleBody netcoreapp3.1 167ns 0.0591ns 0.229ns 0.00363 0 0 272 B
master ObjectExtractorSimpleBody net472 188ns 0.194ns 0.75ns 0.0446 0 0 281 B
master ObjectExtractorMoreComplexBody net6.0 3μs 1.52ns 5.91ns 0.054 0 0 3.88 KB
master ObjectExtractorMoreComplexBody netcoreapp3.1 4.12μs 2.33ns 9.01ns 0.0516 0 0 3.78 KB
master ObjectExtractorMoreComplexBody net472 4.09μs 1.69ns 6.33ns 0.617 0.00612 0 3.89 KB
#4360 AllCycleSimpleBody net6.0 39.1μs 8.07ns 29.1ns 0.0198 0 0 1.65 KB
#4360 AllCycleSimpleBody netcoreapp3.1 41.5μs 73.4ns 275ns 0.0206 0 0 1.63 KB
#4360 AllCycleSimpleBody net472 42.3μs 17.6ns 65.7ns 0.254 0 0 1.69 KB
#4360 AllCycleMoreComplexBody net6.0 223μs 175ns 678ns 0.11 0 0 9.22 KB
#4360 AllCycleMoreComplexBody netcoreapp3.1 230μs 139ns 521ns 0.114 0 0 9.12 KB
#4360 AllCycleMoreComplexBody net472 242μs 55.5ns 208ns 1.46 0 0 9.28 KB
#4360 ObjectExtractorSimpleBody net6.0 117ns 0.062ns 0.232ns 0.00403 0 0 280 B
#4360 ObjectExtractorSimpleBody netcoreapp3.1 170ns 0.0622ns 0.233ns 0.00376 0 0 272 B
#4360 ObjectExtractorSimpleBody net472 151ns 0.119ns 0.447ns 0.0446 0 0 281 B
#4360 ObjectExtractorMoreComplexBody net6.0 2.98μs 0.967ns 3.62ns 0.0566 0 0 3.88 KB
#4360 ObjectExtractorMoreComplexBody netcoreapp3.1 4.12μs 2.04ns 7.62ns 0.0514 0 0 3.78 KB
#4360 ObjectExtractorMoreComplexBody net472 4.1μs 2.91ns 11.3ns 0.617 0.00613 0 3.89 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 170μs 74.8ns 290ns 0.255 0 0 18.03 KB
master SendRequest netcoreapp3.1 187μs 288ns 1.12μs 0.186 0 0 20.09 KB
master SendRequest net472 0ns 0ns 0ns 0 0 0 0 b
#4360 SendRequest net6.0 169μs 104ns 388ns 0.169 0 0 18.03 KB
#4360 SendRequest netcoreapp3.1 189μs 249ns 966ns 0.189 0 0 20.09 KB
#4360 SendRequest net472 0.000401ns 0.000162ns 0.000583ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 514μs 2.52μs 10.4μs 0.492 0 0 41.95 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 621μs 1.88μs 7.27μs 0.316 0 0 42.02 KB
master WriteAndFlushEnrichedTraces net472 767μs 2.43μs 9.4μs 8.38 2.67 0.381 53.25 KB
#4360 WriteAndFlushEnrichedTraces net6.0 532μs 1.84μs 7.12μs 0.512 0 0 41.86 KB
#4360 WriteAndFlushEnrichedTraces netcoreapp3.1 625μs 756ns 2.93μs 0.306 0 0 41.92 KB
#4360 WriteAndFlushEnrichedTraces net472 791μs 2.1μs 8.13μs 8.2 2.34 0.391 53.22 KB
Benchmarks.Trace.ConfigurationBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
#4360 CreateImmutableTracerSettings net6.0 1.84ms 823ns 3.08μs 0 0 0 37.66 KB
#4360 CreateImmutableTracerSettings netcoreapp3.1 1.82ms 1.05μs 3.92μs 0 0 0 38.64 KB
#4360 CreateImmutableTracerSettings net472 2.45ms 1.49μs 5.36μs 43.3 1.2 0 275.84 KB
Benchmarks.Trace.DbCommandBenchmark - Faster 🎉 Same allocations ✔️

Faster 🎉 in #4360

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.DbCommandBenchmark.ExecuteNonQuery‑netcoreapp3.1 1.119 1,247.49 1,114.80

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 841ns 0.228ns 0.822ns 0.0101 0 0 720 B
master ExecuteNonQuery netcoreapp3.1 1.25μs 0.336ns 1.26ns 0.00944 0 0 720 B
master ExecuteNonQuery net472 1.35μs 0.491ns 1.9ns 0.108 0 0 682 B
#4360 ExecuteNonQuery net6.0 881ns 0.304ns 1.18ns 0.0107 0 0 720 B
#4360 ExecuteNonQuery netcoreapp3.1 1.11μs 0.284ns 1.06ns 0.00945 0 0 720 B
#4360 ExecuteNonQuery net472 1.45μs 0.279ns 1.01ns 0.108 0 0 682 B
Benchmarks.Trace.ElasticsearchBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #4360

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync‑net6.0 1.155 987.28 1,140.33

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.08μs 0.2ns 0.721ns 0.013 0 0 944 B
master CallElasticsearch netcoreapp3.1 1.24μs 0.855ns 3.2ns 0.0125 0 0 944 B
master CallElasticsearch net472 2.1μs 0.71ns 2.75ns 0.151 0.00104 0 955 B
master CallElasticsearchAsync net6.0 987ns 0.216ns 0.748ns 0.0129 0 0 920 B
master CallElasticsearchAsync netcoreapp3.1 1.31μs 0.543ns 2.1ns 0.0132 0 0 992 B
master CallElasticsearchAsync net472 2.11μs 0.533ns 2.06ns 0.161 0.00105 0 1.01 KB
#4360 CallElasticsearch net6.0 987ns 0.35ns 1.31ns 0.0139 0 0 944 B
#4360 CallElasticsearch netcoreapp3.1 1.21μs 0.438ns 1.58ns 0.0127 0 0 944 B
#4360 CallElasticsearch net472 2.1μs 0.76ns 2.94ns 0.151 0.00105 0 955 B
#4360 CallElasticsearchAsync net6.0 1.14μs 0.532ns 2.06ns 0.0135 0 0 920 B
#4360 CallElasticsearchAsync netcoreapp3.1 1.32μs 0.355ns 1.33ns 0.0131 0 0 992 B
#4360 CallElasticsearchAsync net472 2.14μs 0.366ns 1.37ns 0.16 0 0 1.01 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.11μs 0.279ns 1.04ns 0.0125 0 0 912 B
master ExecuteAsync netcoreapp3.1 1.38μs 0.448ns 1.68ns 0.0124 0 0 912 B
master ExecuteAsync net472 1.57μs 0.376ns 1.46ns 0.139 0.000771 0 875 B
#4360 ExecuteAsync net6.0 1.16μs 0.466ns 1.75ns 0.0134 0 0 912 B
#4360 ExecuteAsync netcoreapp3.1 1.29μs 0.807ns 3.02ns 0.0118 0 0 912 B
#4360 ExecuteAsync net472 1.59μs 0.722ns 2.8ns 0.139 0 0 875 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 3.48μs 1.9ns 7.1ns 0.026 0 0 1.9 KB
master SendAsync netcoreapp3.1 4.26μs 1.3ns 4.67ns 0.0322 0 0 2.43 KB
master SendAsync net472 6.91μs 1.77ns 6.86ns 0.477 0 0 3 KB
#4360 SendAsync net6.0 3.52μs 2.4ns 8.99ns 0.0283 0 0 1.9 KB
#4360 SendAsync netcoreapp3.1 4.29μs 2.61ns 9.75ns 0.0321 0 0 2.43 KB
#4360 SendAsync net472 6.93μs 1.72ns 6.65ns 0.474 0 0 3 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.23μs 0.746ns 2.79ns 0.0229 0 0 1.62 KB
master EnrichedLog netcoreapp3.1 1.96μs 1.73ns 6.71ns 0.0214 0 0 1.62 KB
master EnrichedLog net472 2.29μs 1.69ns 6.31ns 0.244 0 0 1.54 KB
#4360 EnrichedLog net6.0 1.26μs 0.647ns 2.42ns 0.0239 0 0 1.62 KB
#4360 EnrichedLog netcoreapp3.1 1.96μs 0.743ns 2.78ns 0.0216 0 0 1.62 KB
#4360 EnrichedLog net472 2.28μs 1.03ns 4ns 0.244 0 0 1.54 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 113μs 91.7ns 355ns 0.0564 0 0 4.21 KB
master EnrichedLog netcoreapp3.1 116μs 167ns 648ns 0 0 0 4.21 KB
master EnrichedLog net472 148μs 173ns 670ns 0.662 0.221 0 4.38 KB
#4360 EnrichedLog net6.0 113μs 128ns 498ns 0.0569 0 0 4.21 KB
#4360 EnrichedLog netcoreapp3.1 117μs 255ns 986ns 0.0575 0 0 4.21 KB
#4360 EnrichedLog net472 151μs 336ns 1.3μs 0.674 0.225 0 4.38 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.86μs 0.604ns 2.18ns 0.0301 0 0 2.18 KB
master EnrichedLog netcoreapp3.1 3.83μs 1.23ns 4.61ns 0.029 0 0 2.18 KB
master EnrichedLog net472 4.52μs 1.14ns 4.43ns 0.315 0 0 1.99 KB
#4360 EnrichedLog net6.0 2.82μs 0.986ns 3.69ns 0.0325 0 0 2.18 KB
#4360 EnrichedLog netcoreapp3.1 3.81μs 1.26ns 4.55ns 0.0285 0 0 2.18 KB
#4360 EnrichedLog net472 4.45μs 0.729ns 2.63ns 0.315 0 0 1.99 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.12μs 0.536ns 2.08ns 0.0155 0 0 1.11 KB
master SendReceive netcoreapp3.1 1.39μs 0.55ns 2.06ns 0.0147 0 0 1.11 KB
master SendReceive net472 1.83μs 0.807ns 3.02ns 0.177 0 0 1.12 KB
#4360 SendReceive net6.0 1.11μs 1.81ns 6.99ns 0.0163 0 0 1.11 KB
#4360 SendReceive netcoreapp3.1 1.47μs 0.93ns 3.6ns 0.0147 0 0 1.11 KB
#4360 SendReceive net472 1.72μs 1.44ns 5.39ns 0.176 0 0 1.12 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.6μs 1.26ns 4.89ns 0.0209 0 0 1.53 KB
master EnrichedLog netcoreapp3.1 3.54μs 1.11ns 4.14ns 0.0213 0 0 1.58 KB
master EnrichedLog net472 4.23μs 21ns 91.5ns 0.31 0 0 1.96 KB
#4360 EnrichedLog net6.0 2.54μs 0.479ns 1.73ns 0.0229 0 0 1.53 KB
#4360 EnrichedLog netcoreapp3.1 3.48μs 1.65ns 6.38ns 0.021 0 0 1.58 KB
#4360 EnrichedLog net472 4.02μs 1.46ns 5.65ns 0.309 0 0 1.96 KB
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #4360

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net472 1.146 554.36 635.09

Faster 🎉 in #4360

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑netcoreapp3.1 1.152 584.22 507.33

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 386ns 0.129ns 0.465ns 0.0075 0 0 536 B
master StartFinishSpan netcoreapp3.1 583ns 0.49ns 1.83ns 0.00731 0 0 536 B
master StartFinishSpan net472 554ns 0.0567ns 0.196ns 0.0852 0 0 538 B
master StartFinishScope net6.0 512ns 0.178ns 0.666ns 0.00915 0 0 656 B
master StartFinishScope netcoreapp3.1 749ns 0.204ns 0.764ns 0.00861 0 0 656 B
master StartFinishScope net472 806ns 0.264ns 0.987ns 0.098 0 0 618 B
#4360 StartFinishSpan net6.0 408ns 0.122ns 0.455ns 0.00793 0 0 536 B
#4360 StartFinishSpan netcoreapp3.1 507ns 0.445ns 1.72ns 0.00728 0 0 536 B
#4360 StartFinishSpan net472 635ns 0.217ns 0.839ns 0.0853 0 0 538 B
#4360 StartFinishScope net6.0 482ns 0.121ns 0.452ns 0.0095 0 0 656 B
#4360 StartFinishScope netcoreapp3.1 699ns 0.135ns 0.506ns 0.00886 0 0 656 B
#4360 StartFinishScope net472 854ns 0.2ns 0.776ns 0.098 0 0 618 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 674ns 0.47ns 1.63ns 0.00918 0 0 656 B
master RunOnMethodBegin netcoreapp3.1 756ns 0.31ns 1.12ns 0.00879 0 0 656 B
master RunOnMethodBegin net472 935ns 0.211ns 0.817ns 0.0979 0 0 618 B
#4360 RunOnMethodBegin net6.0 650ns 0.223ns 0.863ns 0.00949 0 0 656 B
#4360 RunOnMethodBegin netcoreapp3.1 778ns 0.677ns 2.44ns 0.00898 0 0 656 B
#4360 RunOnMethodBegin net472 919ns 1.39ns 5.39ns 0.098 0 0 618 B

@andrewlock
Member

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where the throughput results for the PR are worse than the latest master results (a drop of 5% or greater) are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4360) (11.040M)   : 0, 11040083
    master (11.039M)   : 0, 11038648
    benchmarks/2.33.0 (10.950M)   : 0, 10949608
    benchmarks/2.9.0 (11.003M)   : 0, 11003334

    section Automatic
    This PR (4360) (7.763M)   : 0, 7762643
    master (7.701M)   : 0, 7701120
    benchmarks/2.33.0 (7.621M)   : 0, 7620707
    benchmarks/2.9.0 (7.928M)   : 0, 7927595

    section Trace stats
    master (7.716M)   : 0, 7716275
    benchmarks/2.33.0 (7.602M)   : 0, 7602038

    section Manual
    This PR (4360) (9.901M)   : 0, 9900621
    master (9.985M)   : 0, 9984799
    benchmarks/2.33.0 (9.770M)   : 0, 9770235

    section Manual + Automatic
    This PR (4360) (7.487M)   : 0, 7486799
    master (7.390M)   : 0, 7389573
    benchmarks/2.33.0 (7.189M)   : 0, 7188658

    section Version Conflict
    master (6.712M)   : 0, 6712363
    benchmarks/2.33.0 (6.553M)   : 0, 6553316

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4360) (9.555M)   : 0, 9555401
    master (9.765M)   : 0, 9764849
    benchmarks/2.33.0 (9.426M)   : 0, 9425603
    benchmarks/2.9.0 (9.605M)   : 0, 9604953

    section Automatic
    This PR (4360) (6.755M)   : 0, 6755381
    master (6.735M)   : 0, 6734547
    benchmarks/2.33.0 (6.803M)   : 0, 6802932

    section Trace stats
    master (6.702M)   : 0, 6702121
    benchmarks/2.33.0 (6.810M)   : 0, 6810447

    section Manual
    This PR (4360) (8.615M)   : 0, 8615209
    master (8.572M)   : 0, 8572415
    benchmarks/2.33.0 (8.435M)   : 0, 8434569

    section Manual + Automatic
    This PR (4360) (6.330M)   : 0, 6330111
    master (6.616M)   : 0, 6616222
    benchmarks/2.33.0 (6.465M)   : 0, 6465149

    section Version Conflict
    master (5.860M)   : 0, 5860102
    benchmarks/2.33.0 (5.807M)   : 0, 5806690

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4360) (9.327M)   : 0, 9327482
    master (9.976M)   : 0, 9976017
    benchmarks/2.33.0 (9.426M)   : 0, 9426080
    benchmarks/2.9.0 (8.999M)   : 0, 8999461

    section Automatic
    This PR (4360) (6.894M)   : 0, 6894120
    master (7.092M)   : 0, 7091991
    benchmarks/2.33.0 (6.469M)   : 0, 6468939
    benchmarks/2.9.0 (6.913M)   : 0, 6912666

    section Trace stats
    master (7.239M)   : 0, 7238581
    benchmarks/2.33.0 (6.830M)   : 0, 6830314

    section Manual
    This PR (4360) (8.610M)   : 0, 8610012
    master (8.535M)   : 0, 8534813
    benchmarks/2.33.0 (8.108M)   : 0, 8108026

    section Manual + Automatic
    This PR (4360) (6.699M)   : 0, 6699359
    master (6.990M)   : 0, 6990093
    benchmarks/2.33.0 (6.856M)   : 0, 6856379

    section Version Conflict
    master (6.339M)   : 0, 6339396
    benchmarks/2.33.0 (5.893M)   : 0, 5893016

gantt
    title Throughput Linux x64 (ASM) (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (4360) (7.676M)   : 0, 7676412
    master (7.739M)   : 0, 7738834
    benchmarks/2.33.0 (7.765M)   : 0, 7764607
    benchmarks/2.9.0 (8.089M)   : 0, 8088812

    section No attack
    This PR (4360) (2.182M)   : 0, 2182373
    master (2.207M)   : 0, 2207260
    benchmarks/2.33.0 (2.209M)   : 0, 2208725
    benchmarks/2.9.0 (3.319M)   : 0, 3319024

    section Attack
    This PR (4360) (1.917M)   : 0, 1916755
    master (1.933M)   : 0, 1932642
    benchmarks/2.33.0 (1.911M)   : 0, 1910909
    benchmarks/2.9.0 (2.631M)   : 0, 2630747

    section Blocking
    This PR (4360) (3.827M)   : 0, 3827172
    master (3.845M)   : 0, 3844841
    benchmarks/2.33.0 (3.861M)   : 0, 3860900


@tonyredondo tonyredondo marked this pull request as ready for review July 12, 2023 12:28
@tonyredondo tonyredondo requested review from a team as code owners July 12, 2023 12:28
Member

@andrewlock andrewlock left a comment


tl;dr:

  • We can't change IConfigurationSource, so we need to remove those APIs. Plus, we only seem to actually use the function delegate once, in a path that is only ever called once; I'm not at all convinced the extra complexity it introduces is worth it.
  • The IntegrationId updates look good, but I think we can simplify a lot, given that the only cases where the non-optimal paths are called are when the customer is doing something funky (and incorrect) in code.
  • The Serilog changes look like they could be high impact, and I like them in principle, but they also seem relatively high risk:
    • There are no changes in the throughput tests, which seems surprising, unless we don't write many logs by default (which I believe is the case). In which case, this seems like it may be risky (completely replacing the log infrastructure) without much of an expected benefit? (If there are no logs, there's no benefit!)
    • Are we sure we can/should move log serialization off the main loop? Is there prior art from Serilog/other sinks for this, just to be sure that it's actually safe to do (I know it's not safe in some other logging frameworks)? Thinking about async-local values, shared scopes, etc.

Sorry for all the nay-saying, I do like it in theory! 😄 (well, except for the function pointers 😝)

@@ -106,6 +106,71 @@ private static string GetToStringFast(StringBuilder sb, in EnumToGenerate enumTo
return sb.ToString();
}

private static string GetKeys(StringBuilder sb, in EnumToGenerate enumToGenerate)
Member


Suggestion for a slight clarification

Suggested change
private static string GetKeys(StringBuilder sb, in EnumToGenerate enumToGenerate)
private static string GetConfigurationKeys(StringBuilder sb, in EnumToGenerate enumToGenerate)

@@ -106,6 +106,71 @@ private static string GetToStringFast(StringBuilder sb, in EnumToGenerate enumTo
return sb.ToString();
}

private static string GetKeys(StringBuilder sb, in EnumToGenerate enumToGenerate)
{
if (enumToGenerate.FullyQualifiedName != "Datadog.Trace.Configuration.IntegrationId")
Member


I have to commend your ingenuity, but also, jamming this into the generic "enum extensions" source generator makes my left eye twitch 😛

var tracer = Tracer.Instance;
if (tracer is null)
{
Log.Debug("Tracer.Instance is null after InitializeNoNativeParts was invoked");
Member


Why don't we want to log this anymore?

Member Author

@tonyredondo tonyredondo Aug 3, 2023


Because we never actually log this: prior to this code we are already calling Tracer.Instance.XXXX, so it would crash earlier 🤦🏻

Comment on lines +79 to +87
for (var i = 0; i < _sources.Count; i++)
{
if (_sources[i].GetString(key, NullConfigurationTelemetry.Instance, validator: null, recordValue: true) is { } result)
{
return result.Result;
}
}

return default;
Member


It's not worth changing these [PublicApi] methods in the configuration sources IMO. They're legacy implementations that we will never invoke, and they only remain there because they're public APIs.

}

/// <inheritdoc />
ConfigurationResult<string>? ITelemeteredConfigurationSource.GetString(string key, IConfigurationTelemetry telemetry, Func<string, bool>? validator, bool recordValue)
=> _sources.Select(source => source.GetString(key, telemetry, validator, recordValue)).FirstOrDefault(value => value != null);
{
for (var i = 0; i < _sources.Count; i++)
Member


These ITelemeteredConfigurationSource.GetString() rewrites are fine as we use them


RecordChanges();

void RecordChanges()
Member


Can't you use the GetKeys() approach here? IMO we could just use that and remove the string.Format/string.Create approach entirely (as we're guaranteed to only be working with existing integration ids here).

Comment on lines +43 to +47
#if !NETCOREAPP
ThreadPool.UnsafeQueueUserWorkItem(_waitCallback, null);
#else
ThreadPool.UnsafeQueueUserWorkItem(this, true);
#endif
Member


This stuff all makes me nervous, as I've basically not used it for 10 years, so I don't know the edge cases and issues, and it feels pretty high risk to be replacing a "core" component like this.

It's also not entirely clear to me that it's safe to move the LogEvent to a background thread like this, rather than rendering it immediately. Is this something Serilog does internally? If not, why not? Have you checked if any of that has changed in Serilog 3.0?

So all I can really say is "I hope you've done this right" 😬

Member Author

@tonyredondo tonyredondo Aug 3, 2023


It's also not entirely clear to me that it's safe to move the LogEvent to a background thread like this, rather than rendering it immediately. Is this something Serilog does internally? If not, why not? Have you checked if any of that has changed in Serilog 3.0?

All this is done not because of the rendering, but because of the OpenFile handles of the sink.

I wanted to defer all log handling when doing a cold start because basically we write a bunch of logs at startup and then we don't write any after the app is running normally.

In fact, the whole PR is just an attempt to reduce startup time by reducing time spent in configuration load and log writes. I guess only CI Visibility and Serverless will benefit from this.

This is why this shouldn't affect the Throughput tests, only the Execution-time benchmarks, which is in fact what we see: #4360 (comment)

I'm open for ideas for the R&D week 😉
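
For readers following the thread, here is a minimal sketch of the enqueue-and-drain pattern being discussed (names and structure are illustrative assumptions, not the PR's actual DatadogDeferredSink; the public Serilog package is used here for brevity, while the tracer vendors its own copy): Emit only enqueues the LogEvent and schedules a drain on the thread pool, using the IThreadPoolWorkItem overload on .NET Core 3.0+ to avoid a per-call delegate allocation, and the classic WaitCallback overload elsewhere.

using System.Collections.Concurrent;
using System.Threading;
using Serilog.Core;
using Serilog.Events;

// Illustrative sketch only: the calling thread just enqueues the event; writing happens
// on the thread pool. Ordering and concurrent drains are deliberately not handled here.
internal sealed class DeferredSinkSketch : ILogEventSink
#if NETCOREAPP
    , IThreadPoolWorkItem
#endif
{
    private readonly ConcurrentQueue<LogEvent> _queue = new ConcurrentQueue<LogEvent>();
    private readonly ILogEventSink _inner;
#if !NETCOREAPP
    private readonly WaitCallback _waitCallback;
#endif

    public DeferredSinkSketch(ILogEventSink inner)
    {
        _inner = inner;
#if !NETCOREAPP
        _waitCallback = _ => Execute();
#endif
    }

    public void Emit(LogEvent logEvent)
    {
        _queue.Enqueue(logEvent);
#if NETCOREAPP
        // .NET Core 3.0+: queue the work item object itself; no delegate allocation per call.
        ThreadPool.UnsafeQueueUserWorkItem(this, preferLocal: true);
#else
        // Older runtimes: fall back to the cached WaitCallback overload.
        ThreadPool.UnsafeQueueUserWorkItem(_waitCallback, null);
#endif
    }

    public void Execute()
    {
        // Drain whatever is queued and hand it to the real sink, off the caller thread.
        while (_queue.TryDequeue(out var logEvent))
        {
            _inner.Emit(logEvent);
        }
    }
}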

// This product includes software developed at Datadog (https://www.datadoghq.com/). Copyright 2017 Datadog, Inc.
// </copyright>

using System;
Member


#nullable enable maybe? I'm guessing most of the Serilog code doesn't have that... Also, if we're going to update to Serilog 3.x soon, be aware we may need/want to update this soon.

@@ -0,0 +1,249 @@
// <copyright file="DatadogRollingFileSink.cs" company="Datadog">
Member


This file is basically identical to RollingFileSink, right? It just creates a DatadogDeferredSink instead of a File/Shared sink? Maybe we should add a comment about that?

/// Datadog Shared File Sink is based on the SharedFileSink code, but instead of locking the mutex for every log item,
/// the mutex is acquired on each flush, reducing the impact of the logger while keeping the mutex logic.
/// </summary>
internal sealed class DatadogSharedFileSink : IFileSink, IDisposable
Member


ok, now I'm confused... so many sinks... 😂
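
To make the "mutex only on flush" idea above concrete, here is a rough sketch of the concept (illustrative only; this is not the PR's MutexStream/DatadogSharedFileSink code, which also has to handle rolling, encoding, and failure cases): log lines are appended to an in-memory buffer, and the cross-process Mutex is acquired only when that buffer is flushed to the shared file.

using System;
using System.IO;
using System.Text;
using System.Threading;

// Illustrative sketch: the shared (cross-process) mutex is only taken when the buffer
// is flushed to the shared file, instead of once per log item as in SharedFileSink.
internal sealed class MutexOnFlushWriterSketch : IDisposable
{
    private readonly Mutex _mutex;
    private readonly FileStream _file;
    private readonly MemoryStream _buffer = new MemoryStream();
    private readonly StreamWriter _writer;

    public MutexOnFlushWriterSketch(string path, string mutexName)
    {
        _mutex = new Mutex(initiallyOwned: false, mutexName);
        _file = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.ReadWrite);
        _writer = new StreamWriter(_buffer, new UTF8Encoding(encoderShouldEmitUTF8Identifier: false));
    }

    // Cheap path: no mutex, just append to the in-memory buffer.
    public void WriteLine(string line) => _writer.WriteLine(line);

    // Rare path: take the mutex, copy the buffer to the shared file, reset the buffer.
    public void Flush()
    {
        _writer.Flush();
        if (_buffer.Length == 0)
        {
            return;
        }

        _mutex.WaitOne();
        try
        {
            _buffer.Position = 0;
            _buffer.CopyTo(_file);
            _file.Flush();
            _buffer.SetLength(0);
            _buffer.Position = 0;
        }
        finally
        {
            _mutex.ReleaseMutex();
        }
    }

    public void Dispose()
    {
        Flush();
        _writer.Dispose(); // also disposes the underlying MemoryStream
        _file.Dispose();
        _mutex.Dispose();
    }
}

Combined with the buffered StreamWriter described in the PR summary, the mutex cost is then only paid when the buffer fills up or a flush is forced.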
