Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Placeholder interpolation #309

Merged
merged 38 commits into from
Feb 14, 2015
Merged

Conversation

arvenil
Copy link
Contributor

@arvenil arvenil commented Feb 1, 2015

@methane did awesome job in #297 and since I'm highly interested in this PR I wanted to test it. However #297 was out of sync with master ("This pull request contains merge conflicts that must be resolved.") so I merged changes from master.
Since I was already testing this. I've also:

  • added some basic tests for escape functions (100% code coverage)
  • exported escape functions (in case anyone would like to use them)
  • added some basic sql injection tests (including sql_mode=NO_BACKSLASH_ESCAPES)
  • referenced ported code to mysql source code (just in case if anyone like me;) would like to take a look if port is correct)

Feel free to merge this or throw away, I did this mostly to verify correctness.

One more time thanks for this @methane !

@methane
Copy link
Member

methane commented Feb 2, 2015

Good job!

@arnehormann
Copy link
Member

I'll hopefully get to review this soon.
Some remarks I already have from skimming it:

@arvenil add yourself to the AUTHORS file please.
@methane please close the other PR and reference this one if you are ok with arvenil taking over for you.

I will probably merge #311 first (it's pretty small), as this also hits const.go, you'll have to resync - sorry.

Can you change the tests so SELECT ... CAST is used instead of table writes? I don't know if that's applicable here, but I'd love to reduce the amount of disk IO during testing in the long run. See the DATE/TIME tests for example.

I don't like the Escape... functions, I prefer an externally controllable buffer and quotes embedded in the function.
Have a look at http://play.golang.org/p/ujxynUDVhE (extra points to you for a benchmark gist to compare speed & allocations).
I also think the EscapeQuoting tests are wrong. As a question: how would you be able to upload and retrieve a string foo\x00bar unchanged by the driver?

@methane
Copy link
Member

methane commented Feb 6, 2015

I feel "interpolation" is better word than "substitution." But I'm not sure since I'm native English.

@arvenil May I write optimized version of escaping function?

@arvenil
Copy link
Contributor Author

arvenil commented Feb 6, 2015

@methane Yes, I've added you to https://github.com/arvenil/mysql/tree/placeholder as collaborator so you could push to this PR but do whatever suits you better.

I will address other questions and issues later, when time allows.

@arvenil
Copy link
Contributor Author

arvenil commented Feb 7, 2015

Hi @arnehormann

I also think the EscapeQuoting tests are wrong. As a question: how would you be able to upload and retrieve a string foo\x00bar unchanged by the driver?

For MySQL data is not changed, it is escaped. This is done exactly the same way how MySQL does it. Take a look at https://github.com/mysql/mysql-server/blob/mysql-5.7.5/mysys/charset.c#L823-L932 https://github.com/mysql/mysql-server/blob/mysql-5.7.5/mysys/charset.c#L963-L1038

To prove that data is not changed I've added TestInsertRetrieveEscapedData. You can see there that retrieved value equals inserted foo \x00\n\r\x1a\"'\\.

const (
statusInTrans statusFlag = 1 << iota
statusInAutocommit
statusUnknown1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statusReserved // not in documentation please

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

benchmark                  old ns/op     new ns/op     delta
BenchmarkInterpolation     4065          2533          -37.69%

benchmark                  old allocs     new allocs     delta
BenchmarkInterpolation     15             6              -60.00%

benchmark                  old bytes     new bytes     delta
BenchmarkInterpolation     1144          560           -51.05%

When `interpolateParams` is true, calls to `sql.Db.Query()` and `sql.Db.Exec()` with params interpolates placeholders (`?`) with given params. This reduces roundtrips to database compared with `interpolateParams=false` since it uses prapre, exec and close to support parameters.

NOTE: It make SQL injection vulnerability when connection encoding is not utf8.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the connection encoding is not utf8, the connection becomes vulnerable to SQL injection attacks.

Maybe I misread something, but how does this depend on utf8?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If connection encoding is Shift-JIS, 95 5c 27 20 should be encoded into 95 5c 27 20 since 95 5c represents one multibyte char ().
But escapeBytesBackslash and escapeStringBackslash encode it into 95 5c 5c 27 20 since 5c is \.
This is famous SQL injection vulnerability in Japan.

@julienschmidt
Copy link
Member

Let me also do a final review later before merging.

@arnehormann
Copy link
Member

@julienschmidt I intended to do so. This is huge...

Default: false
```

When `interpolateParams` is true, calls to `sql.Db.Query()` and `sql.Db.Exec()` with params interpolates placeholders (`?`) with given params. This reduces roundtrips to database compared with `interpolateParams=false` since it uses prepare, exec and close to support parameters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use just db.Query() / db.Exec() instead of sql.Db.Query() / sql.Db.Exec()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If interpolateParams is true, placeholders (?) in calls to db.Query() and db.Exec() are interpolated into a single query string with the given parameters. This reduces the number of roundtrips, since the driver has to prepare a statement, execute it with the given parameters and close the statement again with interpolateParams=false.

copy(n, buf)
buf = n
}
buf = buf[0:end]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exact same code from L819-L824 is used 4 times. Move it to a helper function. And rename n to newBuf. n is more or less reserved for numbers 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And why pos+end by the way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did it.
pos+end = len(old)*2 + reserved size. len(buf)*2 is for exponential realloc.

// A whitelist of collations which safe to interpolate parameters.
// ASCII and latin-1 are safe since they are single byte encoding.
// utf-8 is safe since it doesn't conatins ASCII characters in trailing bytes.
safeCollations := []string{"ascii_", "latin1_", "utf8_", "utf8mb4_"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to use a blacklist instead. MySQL only supports 5 unsafe collations: big5, cp932, gb2312, gbk and sjis

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blacklist would then be:
"big5_chinese_ci": 1,
"sjis_japanese_ci": 13,
"gbk_chinese_ci": 28,
"big5_bin": 84,
"gb2312_bin": 86,
"gbk_bin": 87,
"sjis_bin": 88,
"cp932_japanese_ci": 95,
"cp932_bin": 96,

You can also directly check the ID instead of the name then

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@arnehormann
Copy link
Member

@methane praise your patience and thank you for doing this... 😀

@julienschmidt
Copy link
Member

LGTM... almost

@julienschmidt
Copy link
Member

@arvenil please add yourself to the AUTHORS file

julienschmidt added a commit that referenced this pull request Feb 14, 2015
@julienschmidt julienschmidt merged commit 200c80b into go-sql-driver:master Feb 14, 2015
@julienschmidt
Copy link
Member

Thank you!

@methane
Copy link
Member

methane commented Feb 19, 2015

FYI, I've wrote blog post about optimizing this pull request.
http://methane.github.io/2015/02/reduce-allocation-in-go-code/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants