rewriter
Folders and files
Name | Name | Last commit date | ||
---|---|---|---|---|
parent directory.. | ||||
Copyright (c) 2015, 2022, Oracle and/or its affiliates. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2.0, as published by the Free Software Foundation. This program is also distributed with certain software (including but not limited to OpenSSL) that is licensed under separate terms, as designated in a particular file or component or in included license documentation. The authors of MySQL hereby grant you an additional permission to link the program and your derivative works with the separately licensed software that they have included with MySQL. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License, version 2.0, for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA Rewriter plugin =============== This plugin accesses queries after parsing, potentially rewriting them. Terms and definitions ===================== - Rewrite rule: The specification of which queries should be rewritten and if so, how. An example rewrite rule: Rewrite all queries of the form "SELECT * FROM t WHERE c = ?" to "SELECT b FROM t WHERE c < ?" A rewrite rule consists of a pattern and a replacement. - Pattern: The part of the rewrite rule that enables us to determine whether a given query needs to be rewritten. The pattern syntax is identical to prepared statement syntax. - Replacement: A new query, also in prepared statement syntax. - Original query: Query which may get rewritten. This is a query as received by the server. - Rewritten query: Final query after a rule has been applied to an original query. - Literals: SQL literals (character strings, numbers, dates, etc.). Some literals may be extracted from the original query and inserted into the replacement to form the rewritten query. - Parameter markers: These are used for two purposes: - Wild cards for literals. A parameter marker in the pattern matches any literal. - References to matched literals. If a parameter marker is also present in the replacement, the matched literal is injected at that position. This process continues left to right with the rest of the matched literals until there are no more markers in the pattern. Syntactically, parameter markers are represented by '?' as in prepared statements. - Plugin user: DBA or anybody else who is in charge of launching the plugin, or changing the rules in the table. This doesn't include users who simply use the database and have their queries rewritten. Usage and things to know ======================== Installation ------------ It is recommended to install the plugin using the supplied SQL script install_rewriter.sql, which creates a database and table to hold the rewrite rules, and a stored procedure for loading the rules into the plugin. You can then add your rules in the table query_rewrite.rewrite_rules. The table and schema have roughly the following definitions (subject to change): CREATE DATABASE IF NOT EXISTS query_rewrite; CREATE TABLE IF NOT EXISTS query_rewrite.rewrite_rules ( id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, pattern VARCHAR(10000), pattern_database VARCHAR(20), replacement VARCHAR(10000), enabled CHAR(1) NOT NULL DEFAULT 'Y', message VARCHAR(1000), ) CREATE PROCEDURE query_rewrite.flush_rewrite_rules() BEGIN DECLARE message_text VARCHAR(100); COMMIT; SELECT load_rewrite_rules() INTO message_text; IF NOT message_text IS NULL THEN SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = message_text; END IF; END // This procedure commits the current transaction before loading the rules: The transaction must be committed, otherwise there can be no guarantee that the plugin is able to see the new rules. The procedure uses the UDF load_rewrite_rules() from the plugin's shared library to load the rules into plugin memory, and handles errors. It is not recommended to call the UDF yourself. You may also install the plugin using the script install_rewriter_with_optional_column.sql. This creates the table with two additional columns that Rewriter uses to write the actual digest and the normalized form of the pattern. This is useful if you need to figure out why some query fails to get rewritten. You can insert a new rewrite rule by doing: INSERT INTO query_rewrite.rewrite_rules ( pattern, replacement ) VALUES ( 'SELECT * FROM db.t1 WHERE c1 = ?', 'SELECT * FROM db.t1 WHERE c2 = ?' ); It is not recommended to install the plugin yourself (that is, manually using INSTALL PLUGIN) rather than using one of the installation scripts. Loading Rules into the Plugin ----------------------------- The plugin keeps a copy of the rules table in memory to enable quicker matching of rule patterns. When updating the rules table, the updates are not immediately picked up by the plugin. This includes changing its content with a ROLLBACK statement. The current state of the rules table, as seen by the current session, is loaded into the plugin by running CALL query_rewrite.flush_rewrite_rules(). How Rewrites Happen ------------------- All original queries are checked for matches and possibly get rewritten to the replacement query. The matching is done in three stages with increased granularity for performance reasons: 1) Digest match. This is a quick-reject test with a relatively high false positive ratio, but without false negatives. As with all digest calculations there is a (extremely small) risk of hash collisions. There is also a limit on the portion of the query which gets a digest calculated. Hence extremely long queries that differ only far into the query always collide. The digest is calculated by the parser and is not part of the rewrite framework. 2) Tree structure matching. This makes sure that the original query and pattern have the same structure. The check is carried out by comparing the normalized query representation. Please refer to the section "21.7 Performance Schema Statement Digests" in the MySQL manual for details on normalized query representation. Practically, queries such as SELECT 1 FROM table WHERE name = 'Andrew' and SELECT 2 FROM table WHERE name = 'Lars' pass this test, since both are normalized to SELECT ? FROM table WHERE name = ?. 3) Literal matching. At this stage it has been established that the parse trees of the query and the pattern are equal. All that can differ at this point are the literal values. If either the pattern or the replacement is an incorrect SQL query (generates syntax errors), the plugin writes a message in the row's 'message' column. In this case, the rule is not loaded into the plugin. If the rule is later corrected and reloaded, the plugin will write NULL in the 'message' column. If a query is rewritten, an SQL Note is generated to indicate that. It is possible to have a pattern that has more parameters than the replacement - in which case the extra ones are just ignored. The opposite - more parameters in the replacement - is not allowed and causes the rule not to be loaded into memory. The plugin lets you know this by updating the 'message' column in the rules table. Queries using unqualified table references need some extra attention. Table names are resolved in the parser, that is, before any post-parse rewrite plugin is involved. What they get resolved to is of course the session-dependent setting of "current database", the one you set with the USE command. The problem is that Rewriter uses a session of its own to parse and load the rules, and it can't know what the value of "current database" was in the session that inserted the rule. To this end, use the 'pattern_database' column. This has the same effect as Rewriter issuing a USE command before parsing the pattern. All this of course means that a rule with some unqualified table references and a non-NULL value of 'pattern_database' will only apply if that value is indeed the current database. But this is likely what you want anyway. In short: - If a pattern uses only qualified table references, the value of 'pattern_database' won't matter. - If at least one table reference is unqualified, use 'pattern_database' to tell Rewriter for which value of "current database" the rule should apply. When an error occurs on loading a rule the system variable Rewriter_reload_error is set to ON and an error message is written in the rule's 'message' column. Uninstallation -------------- If you want to clean up your tracks completely, i.e., delete the rules table and database, it is recommended to use the supplied uninstall script. If you want only to uninstall the plugin, leaving the rules so that a subsequent installation could picks up where you left, you may issue these statements: DROP FUNCTION load_rewrite_rules; UNINSTALL PLUGIN rewriter; Status and System Variables --------------------------- The plugin defines four status variables: - Rewriter_number_loaded_rules: The number of rewrite rules in the in-memory rewrite hash table. - Rewriter_number_reloads: The number of times the rules table has been loaded into memory. - Rewriter_number_rewritten_queries: the number of queries which have been rewritten since last installing the plugin. - Rewriter_reload_error: ON if an error condition occurred when loading the rewrite rules table. That rule has an error message in its 'message' column. There are two system variables: - rewriter_verbose: Level of verbosity. At level 2, the plugin leaves an SQL Note even for queries that were not rewritten, attempting to explain why. - rewriter_enabled: If set to OFF, the plugin is still called by the server, but immediately returns, doing nothing. This is mainly used for internal testing purposes.