Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track issue for expression test cases generated by LLM #59410

Open
61 tasks
YangKeao opened this issue Feb 11, 2025 · 3 comments
Open
61 tasks

Track issue for expression test cases generated by LLM #59410

YangKeao opened this issue Feb 11, 2025 · 3 comments

Comments

@YangKeao
Copy link
Member

YangKeao commented Feb 11, 2025

I'm trying to generate test cases for expression through LLM. In this issue, I'll track all issues and error cases found by this method. The description will included all sub-issues, and the failed test cases will be added to the comment.

Some of them may be duplicated, I didn't search the github before creating the issue (sorry). Feel free to close them if you find it's duplicated.

@YangKeao YangKeao added the type/bug The issue is confirmed as a bug. label Feb 11, 2025
@YangKeao
Copy link
Member Author

YangKeao commented Feb 11, 2025

The following SQLs don't have same result between TiDB and MySQL v8.4.4.

SELECT REGEXP_LIKE(BINARY 'ABC', 'abc', 'i');
SELECT REGEXP_LIKE(CAST('2023' AS BINARY), '^202');
SELECT REGEXP_LIKE('🇨🇦', '\x{1F1E8}\x{1F1E6}');
SELECT REGEXP_LIKE(0x48656C6C6F, '^Hello$');
SELECT REGEXP_LIKE('π', '\x{03C0}');
SELECT CAST('中文' AS CHAR CHARACTER SET latin1);
SELECT ROUND(123.4, 5);
SELECT ~BINARY '999';
SELECT BINARY 'Abc' REGEXP '^abc';
SELECT REGEXP_INSTR('a\nb', 'a.b', 1, 1, 0, 'n');
SELECT DATE('2024-00-15');
SELECT BIN(current_date);
SELECT GET_FORMAT(DATETIME, 123);
SELECT GET_FORMAT(TIME, 'usa');
SELECT GET_FORMAT(DATE, 0);
SELECT GET_FORMAT(DATE, 'FOO');
SELECT EXTRACT(DAY_HOUR FROM 0x0D0E0A);
SELECT DATE_FORMAT(0, '%Y%m%d');
SELECT DATE_FORMAT('2023-10-30', '');
SELECT REGEXP_REPLACE('', '^$', 'empty');
SELECT REGEXP_REPLACE('123', '1(?=2)', 'X', 1, 0);
SELECT sec_to_time(b'1111');
SELECT sec_to_time(0x1E0);
SELECT ADDDATE('0000-01-01', INTERVAL -1 DAY);
SELECT CONVERT_TZ('0000-01-01 00:00:00', '+00:00', '+01:00');
SELECT CONVERT_TZ('9999-12-31 23:59:59', 'UTC', 'Asia/Kolkata');
SELECT MINUTE('12:30:00+05:00');
SELECT TIME('');
SELECT TIME('NOT_A_TIME');
SELECT STR_TO_DATE('Tuesday, 23 May 2023', '%W, %d %M %Y');
SELECT UNIX_TIMESTAMP('');
SELECT UNIX_TIMESTAMP('2024');
SELECT UNIX_TIMESTAMP('20240228T123456');
SELECT UNIX_TIMESTAMP('28 February 2024');
SELECT UNIX_TIMESTAMP('2024-02-28abc');
SELECT ADDTIME('2023-01-01', '12:34:56');
SELECT ADDTIME('2023-01-01', '237:00:00');
SELECT PERIOD_DIFF('\n202405\n', '\t202406');
SELECT TIMESTAMP('0000-00-00 00:00:00');
SELECT TIMESTAMP('2023-10-10T12:00:00Z');
SELECT TIMESTAMPADD(DAY, -1, '2023-01-01');
SELECT TIMESTAMPADD(WEEK, 2, '2023-12-31');
SELECT TIMESTAMPADD(MONTH, 13, '2022-05-15');
SELECT TIMESTAMPADD(QUARTER, 4, '2023-01-01');
SELECT TIMESTAMPADD(DAY, 0, '2023-10-23');
SELECT TIMESTAMPADD(DAY, 1, 20231023);
SELECT TIMESTAMPADD(MONTH, 1, '2020-02-29');
SELECT TIMESTAMPADD(MONTH, 12, '2023-10-31');
SELECT TO_SECONDS('2024-03-05XYZ');
SELECT BIN(18446744073709551616);
SELECT BIN('');
SELECT FORMAT(1000/3, 5);
SELECT WEEK('2023-01-01', NULL);
SELECT TIME_FORMAT('12:34:56', '');
SELECT OCT('');
SELECT LPAD('abc', 5, '');
SELECT LPAD('abc', 5, '');
SELECT FIELD('2023-10-01', DATE '2023-10-01');
SELECT QUOTE(0xDEADBEEF);
SELECT TO_BASE64(1E+30);
SELECT SUBTIME('2024-03-01', '24:00:00');
SELECT SUBTIME('2024-02-29', '1 12:00:00');
SELECT SUBTIME('9999-12-31 23:59:59.999999', '9999-12-31 23:59:59.999999');
SELECT SUBTIME('2024-02-29', '1 00:00:00');
SELECT FIND_IN_SET(' ', '  , , ,');
SELECT WEIGHT_STRING(123);
SELECT WEIGHT_STRING(TRUE);
SELECT WEIGHT_STRING('a' COLLATE utf8mb4_bin);
SELECT RPAD('abc', 10, '');
SELECT FORMAT_BYTES(0);
SELECT FORMAT_BYTES(1023.999999999);
SELECT FORMAT_BYTES(b'11111111');
SELECT FORMAT_BYTES('');
SELECT FORMAT_BYTES(CHAR(52));
SELECT FORMAT_BYTES(JSON_EXTRACT('{"x": 512}','$.x'));
SELECT is_ipv4_mapped(NULL);
SELECT IS_IPV4(NULL);
SELECT IS_IPV4(IF(1, NULL, '127.0.0.1'));
SELECT IS_UUID(' 6ccd780c-baba-1026-8567-4cc3505b2a62 ');
SELECT IS_IPV4_COMPAT(NULL);
SELECT IS_IPV4_COMPAT(X'00000000000000000000000000000000');
SELECT IS_IPV6(NULL + INTERVAL 1 DAY);
SELECT NAME_CONST('bool1', TRUE);
SELECT NAME_CONST('bool2', FALSE);
SELECT NAME_CONST('pi', PI());
SELECT UUID_TO_BIN(' 6ccd780c-baba-1026-9564-5b8c656024db ');
SELECT json_replace(123, '$', 456);
SELECT INET6_NTOA(1234);
SELECT INET6_NTOA('abcdefghijklmnop');
SELECT JSON_LENGTH(123);
SELECT JSON_LENGTH(123.45);
SELECT JSON_LENGTH(CAST('{"i": 0}' AS BINARY));
SELECT JSON_DEPTH(10);
SELECT JSON_DEPTH(TRUE);
SELECT JSON_INSERT(123, '$.a', 'test');
SELECT JSON_SEARCH('{"h": "i"}', 'all', 'i', '\\', NULL);
SELECT JSON_QUOTE(X'7B');
SELECT JSON_QUOTE(BINARY '\0\x1F');
SELECT JSON_QUOTE(1/0);
SELECT JSON_QUOTE(0xDEADBEEF);
SELECT JSON_ARRAY_APPEND('[1]', '$', JSON_ARRAY(2, 3));
SELECT JSON_SET('{}', '$[last]', 'end')
SELECT MINUTE('12:30:00+05:00')
SELECT FOUND_ROWS() - 3;
SELECT SUBDATE(_utf8mb4'2024-01-01', INTERVAL X'01' DAY);
SELECT TIMEDIFF(CAST('2024-05-01' AS DATE), '2024-05-01 12:30:45');
SELECT TIMEDIFF(CAST('2024-05-01' AS DATE), '2024-05-01 12:30:45');
SELECT TIMEDIFF(123456.789, 987654.321);
SELECT TIMEDIFF('12:34', '56:78');
SELECT TIMEDIFF('1.5:0:0', '3:0:0');
SELECT sec_to_time(3020399.999999)

Too many issues. I will continue to group them tomorrow 🤦 . Have a good sleep tonight.

@YangKeao YangKeao removed the type/bug The issue is confirmed as a bug. label Feb 11, 2025
@2010YOUY01
Copy link

This is cool. I'm curious how did you guide LLM to generate different queries?
like

for func_doc in all_functions_doc:
    LLM_call(func_doc + "please generate queries with edge cases for this function")

@YangKeao
Copy link
Member Author

This is cool. I'm curious how did you guide LLM to generate different queries? like

for func_doc in all_functions_doc:
    LLM_call(func_doc + "please generate queries with edge cases for this function")

@2010YOUY01 You can see the prompt here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants