Skip to content

Content Enhanced BERT-based Text-to-SQL Generation

Notifications You must be signed in to change notification settings

bobkentt/NL2SQL-RULE

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

50 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NL2SQL-BERT

LICENSE

Content Enhanced BERT-based Text-to-SQL Generation https://arxiv.org/abs/1910.07179

(Incorporating database design rule into text-to-sql generation)

Requirements

python 3.6

records 0.5.3

torch 1.1.0

Run

1, Data prepare: Download all origin data( https://drive.google.com/file/d/1iJvsf38f16el58H4NPINQ7uzal5-V4v4 ) and put them at data_and_model directory.

Then run data_and_model/output_entity.py

2, Train and eval:

train.py

Results on BERT-Base-Uncased without EG

Model Dev
logical form
accuracy
Dev
execution
accuracy
Test
logical form
accuracy
Test
execution
accuracy
SQLova 80.6 86.5 80.0 85.5
Our Methods 84.3 90.3 83.7 89.2

Data

One data look:

{
	"table_id": "1-1000181-1",
	"phase": 1,
	"question": "Tell me what the notes are for South Australia ",
	"question_tok": ["Tell", "me", "what", "the", "notes", "are", "for", "South", "Australia"],
	"sql": {
		"sel": 5,
		"conds": [
			[3, 0, "SOUTH AUSTRALIA"]
		],
		"agg": 0
	},
	"query": {
		"sel": 5,
		"conds": [
			[3, 0, "SOUTH AUSTRALIA"]
		],
		"agg": 0
	},
	"wvi_corenlp": [
		[7, 8]
	],
	"bertindex_knowledge": [0, 0, 0, 0, 4, 0, 0, 1, 3],
	"header_knowledge": [2, 0, 0, 2, 0, 1]
}

The Table:

{
	"id": "1-1000181-1",
	"header": ["State/territory", "Text/background colour", "Format", "Current slogan", "Current series", "Notes"],
	"rows": [
		["Australian Capital Territory", "blue/white", "Yaa·nna", "ACT · CELEBRATION OF A CENTURY 2013", "YIL·00A", "Slogan screenprinted on plate"],
		["New South Wales", "black/yellow", "aa·nn·aa", "NEW SOUTH WALES", "BX·99·HI", "No slogan on current series"],
		["New South Wales", "black/white", "aaa·nna", "NSW", "CPX·12A", "Optional white slimline series"],
		["Northern Territory", "ochre/white", "Ca·nn·aa", "NT · OUTBACK AUSTRALIA", "CB·06·ZZ", "New series began in June 2011"],
		["Queensland", "maroon/white", "nnn·aaa", "QUEENSLAND · SUNSHINE STATE", "999·TLG", "Slogan embossed on plate"],
		["South Australia", "black/white", "Snnn·aaa", "SOUTH AUSTRALIA", "S000·AZD", "No slogan on current series"],
		["Victoria", "blue/white", "aaa·nnn", "VICTORIA - THE PLACE TO BE", "ZZZ·562", "Current series will be exhausted this year"]
	]
}

Trained model

https://drive.google.com/open?id=18MBm9qzobTBgWPZlpA2EErCQtsMhlTN2

Reference

https://github.com/naver/sqlova

About

Content Enhanced BERT-based Text-to-SQL Generation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 52.8%
  • Jupyter Notebook 47.2%