Skip to content

Build time tools in the flavor of Dapper

License

Notifications You must be signed in to change notification settings

yatlor/DapperAOT

 
 

Repository files navigation

Let's face it: ADO.NET is a complicated API, and writing "good" ADO.NET code by hand is time consuming and error-prone. But a lot of times you also don't want the ceremony of an ORM like EF or LLBLGenPro - you just want to execute SQL!

For years now, Dapper helped by providing a great low-friction way of talking to arbitrary ADO.NET databases, handling command preparation, invocation, and result parsing.

For example:

public static Customer GetCustomer(DbConnection connection, int id)
    => connection.QuerySingleOrDefault<Customer>(
        "select * from Customers where Id = @id", new { id });
    
int id = ...
var customer = GetCustomer(connection, id);

This works, but: there are problems:

  • it needs a lot of reflection and ref-emit work at runtime, which can impact cold-start performance, AOT-scenarios, and memory usage
  • you need to allocate objects to convey the parameters; this sounds trivial, but they add up
  • we can't really do a lot to help spot problems before they hit you at runtime (well, that's not entirely true - we have some in-house analyzers that do a lot of this, but not all)
  • it needs a lot of complicated cache code
  • it needs to do a lot of work to handle different command types, only identifiable at runtime
  • the code is really really hard to maintain - it is emitting raw IL, which is buggy - and it isn't even remotely practical to emit good async code, forcing a lot of compromises
  • due to the existing code, it is really hard to implement command re-use, which is a huge optimization (it could be done, but it would need a massive refactor)
  • the existing Dapper API has... kinks (we're working on that separately)

So... what if we could change all that?

C# 9 offers new partial method support for use with Roslyn generators, allowing us to take a completely different approach to ADO.NET; consider:

[Command("select * from Customers where Id = @id")]
public static partial Customer GetCustomer(DbConnection connection, int id);

int id = ...
var customer = GetCustomer(connection, id);

Here:

  • everything gets generated at build; no runtime reflection/ref-emit code (eventually; very incomplete!)
    • which also means it is AOT and linker friendly
  • our parameters are expressed naturally as parameters - no additional object allocations
  • we can even make ref / out etc work as expected (which is currently quite painful via vanilla Dapper)
  • it is designed to naturally allow command re-use, reducing allocations considerably
  • since we're already essentially writing a Roslyn analyzer, we can naturally give helpful warnings, errors, etc during build, so there's no surprises at runtime
    • for example, invalid SQL, mismatched parameters, possibly even query hints
  • we don't have to compromise our async code by having it constantly jumping between generated stubs (non-async) and a hoisting helper
  • no complicated cache lookups based on the command/types/etc
  • the code can be optimized to take full advantage of the capabilities of the types it can see during build
  • the generated code is viewable, and the code-generation code is finally maintanable
  • we can think about a range of new features that previously would have been too complicated to insert into the middle of the IL
  • we even have the future flexibility to consider alternative APIs, such as Woodstar

Don't worry; we still intend to allow ad-hoc SQL usage when needed (flexible search, for example); this could just be (TBD):

[Command]
public static partial Customer GetCustomer(DbConnection connection, [Command] string sql, int id);

So we're still fixing the parameters formally, but the SQL can be scenario-specific and controlled by your code.


Current status:

Todo:

  • formal analyzer diagnostics
  • materializer (currently using Dapper's materializer)
  • async including cancellation
  • lists/sequences (multi-row; IEnumerable<T>, IAsyncEnumerable<T>, List<T>, etc)
  • non-queries (void, Task, ValueTask)
  • scalars
  • out/ref args (TBD: what about async)
  • better build-time configuration
  • better run-time configuration
  • transactions
  • ad-hoc SQL support
  • feature injection (profiling, cross-cutting concerns, etc)
  • ambient/implied access to connections (for example, from a .Connection on the current instance)
  • documentation and examples
  • CI, myget
  • pooled awaitables (either using the new .NET 6 bits, or PooledAwait)
  • should we support SQL via a query registry? if so, how? (so the [Command] just knows about a token)
  • create connections inside the method? from what factory? what connection string?
  • returning DataTable / DataSet ?
  • probably lots of other things

But... it works!

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
DapperSystemData 75.82 us 0.615 us 0.545 us 0.3662 - - 3.2 KB
DapperMicrosoftData 79.78 us 1.051 us 0.983 us 0.3662 - - 3.41 KB
DapperAOTSystemData 76.57 us 0.955 us 0.746 us 0.1221 - - 1.95 KB
DapperAOTMicrosoftData 77.73 us 0.554 us 0.518 us 0.1221 - - 1.95 KB

Dapper is already fast in the mode being tested here, so we're not looking for a speed boost; but:

  • we can build on this model to help improve other scenarios where Dapper isn't quite as good
  • we're reducing allocations, mostly by re-using the command instances
  • and we want to look at a build-time generated materializer too, which may crank things further

About

Build time tools in the flavor of Dapper

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C# 99.9%
  • Other 0.1%