Commit 58ecb1f

author
xushengguo-xy
committed
hello quicksql
0 parents  commit 58ecb1f

File tree

1,618 files changed

+410694
-0
lines changed

.gitignore

+26
@@ -0,0 +1,26 @@
1+
target/
2+
!.mvn/wrapper/maven-wrapper.jar
3+
4+
### STS ###
5+
.apt_generated
6+
.classpath
7+
.factorypath
8+
.project
9+
.settings
10+
.springBeans
11+
12+
### IntelliJ IDEA ###
13+
.idea
14+
*.iws
15+
*.iml
16+
*.ipr
17+
18+
*.log
19+
20+
### NetBeans ###
21+
nbproject/private/
22+
nbbuild/
23+
nbdist/
24+
.nb-gradle/
25+
26+
/log/

LICENSE

+21
@@ -0,0 +1,21 @@
1+
MIT License
2+
3+
Copyright (c) 2018 <QSql Project>
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.

NOTICE

+9
@@ -0,0 +1,9 @@
1+
Copyright (c) 2018 QSql project
2+
3+
This product includes modules `qsql-calcite-analysis` and `qsql-calcite-elasticsearch`
4+
developed by the [Calcite] project.
5+
Please visit homepage `http://calcite.apache.org/` or
6+
project on GitHub `https://github.com/apache/calcite` of [Calcite] for more information.
7+
8+
This product contains code from the `imc` Project:
9+
Please visit Github project `https://github.com/picadoh/imc` for more information.

README.md

+324
@@ -0,0 +1,324 @@
1+
![200_200](./doc/picture/logo.jpeg)
2+
3+
4+
5+
[![license](https://img.shields.io/badge/license-MIT-blue.svg?style=flat)](./LICENSE)[![Release Version](https://img.shields.io/badge/release-0.5-red.svg)]()[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)]()
6+
7+
QSQL is a SQL query product that can run queries against a single datastore or correlated queries across multiple datastores. It supports relational databases, non-relational databases, and even datastores that do not natively support SQL (such as Elasticsearch and Druid). In addition, a single SQL query in QSQL can join or union data from multiple datastores. For example, you can run one unified SQL query when part of the data is stored in Elasticsearch and the rest is stored in Hive. Most importantly, QSQL does not depend on any intermediate compute engine; users only need to focus on their data and a unified SQL grammar to complete statistics and analysis.
8+
9+
[English](./doc/README_doc.md)|[中文](./doc/README文档.md)
10+
11+
![1540973404791](./doc/picture/p1.png)
12+
13+
The QSQL architecture consists of three layers:
14+
15+
- Parsing Layer: Parses, validates, and optimizes SQL statements, splits mixed SQL, and finally generates the query plan;
16+
17+
- Computing Layer: Routes the query plan to a specific execution plan, which is then translated into executable code for the given storage or engine (such as an Elasticsearch JSON query or Hive HQL);
18+
19+
- Storage Layer: Extracts and stores the prepared data.
20+
21+
## Build
22+
23+
### Requirements
24+
25+
- CentOS 6.2
26+
- java >= 1.8
27+
- scala >= 2.11
28+
- spark >= 2.2
29+
- [Optional] MySQL, Elasticsearch, Hive, Druid
30+
31+
### Deployment
32+
33+
Uncompress the package qsql-0.5.tar.gz
34+
35+
```shell
36+
tar -zxvf ./qsql-0.5.tar.gz
37+
```
38+
39+
Create a soft link
40+
41+
```shell
42+
ln -s qsql-0.5/ qsql
43+
```
44+
45+
After decompression, the release package contains the following main directories:
46+
47+
- bin: contains all scripts for setting up the environment and running SQL.
48+
- conf: contains all runtime configuration files.
49+
- data: stores test data.
50+
- metastore: contains an embedded database and the create-table scripts used to manage metadata.
51+
52+
In the ```$QSQL_HOME/conf``` directory, configure the following files:
53+
54+
- base-env.sh: Contains the related environment variables:
55+
  - JAVA_HOME
56+
  - SPARK_HOME
57+
  - QSQL_CLUSTER_URL
58+
  - QSQL_HDFS_TMP
59+
- qsql-runner.properties: Contains several runtime properties
60+
- log4j.properties: Contains the logging level settings
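
As a rough illustration, a `base-env.sh` covering the four variables listed above might look like the following. Every value is a placeholder assumed for this sketch (the install paths, the `hdfs://` address, and the temporary directory are not taken from the QSQL distribution); replace them with the values for your own cluster.

```shell
# Hypothetical base-env.sh; all values below are placeholders, not QSQL defaults.

# Java 1.8+ installation path (assumed location)
export JAVA_HOME=/usr/lib/jvm/java-1.8.0
# Spark 2.2+ installation path (assumed location)
export SPARK_HOME=/opt/spark
# Hadoop cluster URL (assumed host and port)
export QSQL_CLUSTER_URL=hdfs://namenode.example.com:9000
# Temporary path on HDFS (assumed directory)
export QSQL_HDFS_TMP=/tmp/qsql
```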
61+
62+
## Examples
63+
64+
### QSQL Shell
65+
66+
```
67+
./bin/qsql -e "select 1"
68+
```
69+
70+
Details: [English](./doc/CLI_doc.md) | [中文](./doc/CLI文档.md)
71+
72+
### Query Example
73+
74+
Several sample queries are included with QSQL. To run one of them, use ```./bin/run-example <class> [params]```
75+
76+
Example 1: Memory Table Query
77+
78+
```
79+
./bin/run-example com.qihoo.qsql.CsvScanExample
80+
```
81+
82+
Example 2: Hive Join MySQL
83+
84+
```
85+
./bin/run-example com.qihoo.qsql.CsvJoinWithEsExample
86+
```
87+
88+
**Note**:
89+
90+
If you are running a hybrid query, make sure that Spark, Hive, and MySQL are deployed on the current machine and that the correct Hive and MySQL connection information has been added to the metastore.
91+
92+
Details: [English](./doc/API_doc.md) | [中文](./doc/API文档.md)
93+
94+
## Configuration Properties
95+
96+
### Environment Variables
97+
98+
| Property Name | Meaning |
99+
| ----------------------------------- | ----------------------- |
100+
| JAVA_HOME | Java installation path |
101+
| SPARK_HOME | Spark installation path |
102+
| QSQL_CLUSTER_URL | Hadoop cluster url |
103+
| QSQL_HDFS_TMP (Optional) | Hadoop temporary path |
104+
| QSQL_DEFAULT_WORKER_NUM (Optional) | Number of workers |
105+
| QSQL_DEFAULT_WORKER_MEMORY (Optional) | Worker memory size |
106+
| QSQL_DEFAULT_DRIVER_MEMORY (Optional) | Driver memory size |
107+
| QSQL_DEFAULT_MASTER (Optional) | Cluster mode in Spark |
108+
| QSQL_DEFAULT_RUNNER (Optional) | Execution mode |
109+
110+
### Runtime Variables
111+
112+
#### Application Properties
113+
114+
| Property Name | Default | Meaning |
115+
| --------------------------------- | ------------------ | ------------------------------------------------------------ |
116+
| spark.sql.hive.metastore.jars | builtin | Hive Jars |
117+
| spark.sql.hive.metastore.version | 1.2.1 | Hive version |
118+
| spark.local.dir | /tmp | Temporary file path used by Spark |
119+
| spark.driver.userClassPathFirst | true | User jars are loaded first during Spark execution |
120+
| spark.sql.broadcastTimeout | 300 | Maximum wait time for broadcasts, in seconds |
121+
| spark.sql.crossJoin.enabled | true | Allows Spark SQL to execute cross joins |
122+
| spark.speculation | true | Enables speculative execution of tasks in Spark |
123+
| spark.sql.files.maxPartitionBytes | 134217728(128MB) | The maximum number of bytes of a single partition when Spark reads a file |
124+
125+
#### Metadata Properties
126+
127+
| Property Name | Default | Meaning |
128+
| --------------------------- | ---------------------- | ------------------------------------------------------------ |
129+
| meta.storage.mode | intern | Metadata storage mode. intern: read the metadata stored in the embedded SQLite database. extern: read the metadata stored in an external database |
130+
| meta.intern.schema.dir | ../metastore/schema.db | The path of the embedded database file |
131+
| meta.extern.schema.driver | (none) | The JDBC driver of the external database |
132+
| meta.extern.schema.url | (none) | The connection URL of the external database |
133+
| meta.extern.schema.user | (none) | The user name for the external database |
134+
| meta.extern.schema.password | (none) | The password for the external database |
135+
136+
## Metadata Management
137+
138+
### Tables
139+
140+
#### DBS
141+
142+
| Fields | Note | Sample |
143+
| ------- | -------------------- | ---------------- |
144+
| DB_ID | Database ID | 1 |
145+
| DESC | Database Description | es index |
146+
| NAME | Database Name | es_profile_index |
147+
| DB_TYPE | Database Type | es, Hive, MySQL |
148+
149+
#### DATABASE_PARAMS
150+
151+
| Fields | Note | Sample |
152+
| ----------- | ----------- | -------- |
153+
| DB_ID | Database ID | 1 |
154+
| PARAM_KEY | Param Key | UserName |
155+
| PARAM_VALUE | Param Value | root |
156+
157+
#### TBLS
158+
159+
| Fields | Note | Sample |
160+
| ------------ | ------------- | ------------------- |
161+
| TBL_ID | Table ID | 101 |
162+
| CREATED_TIME | Creation Time | 2018-10-22 14:36:10 |
163+
| DB_ID | Database ID | 1 |
164+
| TBL_NAME | Table Name | student |
165+
166+
#### COLUMNS
167+
168+
| Fields | Note | Sample |
169+
| ----------- | ------------- | ------------ |
170+
| CD_ID | Column ID | 10101 |
171+
| COMMENT | Field Comment | Student Name |
172+
| COLUMN_NAME | Field Name | name |
173+
| TYPE_NAME | Field Type | varchar |
174+
| INTEGER_IDX | Field Index | 1 |
175+
176+
### Embedded SQLite Database
177+
178+
The ```$QSQL_HOME/metastore``` directory contains the following files:
179+
180+
- sqlite3: The SQLite command-line tool
181+
- schema.db: The embedded SQLite database
182+
- ./linux-x86/sqldiff: A tool that displays the differences between SQLite databases
183+
- ./linux-x86/sqlite3_analyzer: A command-line utility that measures and displays how much space individual tables and indexes use within an SQLite database file, and how efficiently they use it
184+
185+
Connect to the schema.db database via sqlite3 to manipulate the metadata tables:
186+
187+
```shell
188+
sqlite3 ../schema.db
189+
```
190+
191+
### External MySQL Database
192+
193+
Switch the metadata store from the embedded SQLite database to an external MySQL database:
194+
195+
```shell
196+
vim metadata.properties
197+
```
198+
199+
> meta.storage.mode=extern
200+
> meta.extern.schema.driver = com.mysql.jdbc.Driver
201+
> meta.extern.schema.url = jdbc:mysql://ip:port/db?useUnicode=true
202+
> meta.extern.schema.user = YourName
203+
> meta.extern.schema.password = YourPassword
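
Before running the initialization step, it may help to confirm that the file parses as intended. The sketch below is not part of QSQL: `get_prop` is a hypothetical helper, and the script writes a throwaway copy of the settings above to a temporary file rather than touching the real `metadata.properties`. It reads `meta.storage.mode` and, when the mode is `extern`, reports any `meta.extern.schema.*` key that is missing or empty.

```shell
#!/bin/sh
# Throwaway properties file mirroring the settings shown above.
props=$(mktemp)
cat > "$props" <<'EOF'
meta.storage.mode=extern
meta.extern.schema.driver = com.mysql.jdbc.Driver
meta.extern.schema.url = jdbc:mysql://ip:port/db?useUnicode=true
meta.extern.schema.user = YourName
meta.extern.schema.password = YourPassword
EOF

# get_prop FILE KEY: print the value of KEY, trimming spaces around "=".
# (Hypothetical helper for this sketch; QSQL reads the file itself.)
get_prop() {
  key_re=$(printf '%s' "$2" | sed 's/[.]/\\./g')   # escape dots for the regex
  sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$1" \
    | sed 's/[[:space:]]*$//' | tail -n 1
}

mode=$(get_prop "$props" meta.storage.mode)
mode=${mode:-intern}   # the documented default is intern
missing=0
if [ "$mode" = "extern" ]; then
  for suffix in driver url user password; do
    if [ -z "$(get_prop "$props" "meta.extern.schema.$suffix")" ]; then
      echo "missing: meta.extern.schema.$suffix"
      missing=$((missing + 1))
    fi
  done
fi
echo "mode=$mode missing=$missing"
```

To check a real installation, point `props` at the actual properties file instead of the temporary copy.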
204+
205+
Load the sample data into the MySQL database:
206+
207+
```shell
208+
cd $QSQL_HOME/bin/
209+
./metadata --dbType mysql --action init
210+
```
211+
212+
### Configure Metadata
213+
214+
#### Hive
215+
216+
Sample Configuration:
217+
218+
| DB_ID | DESC | NAME | DB_TYPE |
219+
| ----- | ------------ | ------------- | ------- |
220+
| 26 | hive message | hive_database | hive |
221+
222+
| DB_ID | PARAM_KEY | PARAM_VALUE |
223+
| ----- | --------- | ------------ |
224+
| 26 | cluster | cluster_name |
225+
226+
| TBL_ID | CREATE_TIME | DB_ID | TBL_NAME |
227+
| ------ | ------------------- | ----- | ----------- |
228+
| 60 | 2018-11-06 10:44:51 | 26 | hive_mobile |
229+
230+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
231+
| ----- | ------- | ----------- | --------- | ----------- |
232+
| 60 | | retsize | string | 1 |
233+
| 60 | | im | string | 2 |
234+
| 60 | | wto | string | 3 |
235+
| 60 | | pro | int | 4 |
236+
| 60 | | pday | string | 5 |
237+
238+
#### Elasticsearch
239+
240+
Sample Configuration:
241+
242+
| DB_ID | DESC | NAME | DB_TYPE |
243+
| ----- | ---------- | -------- | ------- |
244+
| 24 | es message | es_index | es |
245+
246+
| DB_ID | PARAM_KEY | PARAM_VALUE |
247+
| ----- | ----------- | ---------------- |
248+
| 24 | esNodes | localhost |
249+
| 24 | esPort | 9025 |
250+
| 24 | esUser | es_user |
251+
| 24 | esPass | es_password |
252+
| 24 | esIndex | es_index/es_type |
253+
| 24 | esScrollNum | 156 |
254+
255+
| TBL_ID | CREATE_TIME | DB_ID | TBL_NAME |
256+
| ------ | ------------------- | ----- | -------- |
257+
| 57 | 2018-11-06 10:44:51 | 24 | profile |
258+
259+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
260+
| ----- | ------- | ----------- | --------- | ----------- |
261+
| 57 | comment | id | int | 1 |
262+
| 57 | comment | name | string | 2 |
263+
| 57 | comment | country | string | 3 |
264+
| 57 | comment | gender | string | 4 |
265+
| 57 | comment | operator | string | 5 |
266+
267+
#### MySQL
268+
269+
Sample Configuration:
270+
271+
| DB_ID | DESC | NAME | DB_TYPE |
272+
| ----- | ---------------- | -------------- | ------- |
273+
| 25 | mysql db message | mysql_database | mysql |
274+
275+
| DB_ID | PARAM_KEY | PARAM_VALUE |
276+
| ----- | ------------ | ------------------------------------------ |
277+
| 25 | jdbcDriver | com.mysql.jdbc.Driver |
278+
| 25 | jdbcUrl | jdbc:mysql://localhost:3006/mysql_database |
279+
| 25 | jdbcUser | root |
280+
| 25 | jdbcPassword | root |
281+
282+
| TBL_ID | CREATE_TIME | DB_ID | TBL_NAME |
283+
| ------ | ------------------- | ----- | --------- |
284+
| 58 | 2018-11-06 10:44:51 | 25 | test_date |
285+
286+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
287+
| ----- | ------- | ----------- | --------- | ----------- |
288+
| 58 | comment | id | int | 1 |
289+
| 58 | comment | name | string | 2 |
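
To make the relationship between the four metadata tables concrete, the MySQL sample rows above could be written as SQL along the following lines. This is a sketch only: the `CREATE TABLE` statements and column types are assumptions made so the example runs against a scratch SQLite file, the timestamp column is named `CREATED_TIME` as in the TBLS field list, and the real `schema.db` shipped with QSQL may differ. The final query shows how a table name leads back to its datastore type and JDBC URL.

```shell
#!/bin/sh
# Scratch files only; this does not modify $QSQL_HOME/metastore/schema.db.
sql=$(mktemp)
db=$(mktemp)

cat > "$sql" <<'EOF'
-- Assumed, simplified DDL; the shipped schema may use other types or constraints.
CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, "DESC" TEXT, NAME TEXT, DB_TYPE TEXT);
CREATE TABLE DATABASE_PARAMS (DB_ID INTEGER, PARAM_KEY TEXT, PARAM_VALUE TEXT);
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, CREATED_TIME TEXT, DB_ID INTEGER, TBL_NAME TEXT);
CREATE TABLE COLUMNS (CD_ID INTEGER, COMMENT TEXT, COLUMN_NAME TEXT, TYPE_NAME TEXT, INTEGER_IDX INTEGER);

-- Sample rows from the MySQL configuration tables.
INSERT INTO DBS VALUES (25, 'mysql db message', 'mysql_database', 'mysql');
INSERT INTO DATABASE_PARAMS VALUES (25, 'jdbcDriver', 'com.mysql.jdbc.Driver');
INSERT INTO DATABASE_PARAMS VALUES (25, 'jdbcUrl', 'jdbc:mysql://localhost:3006/mysql_database');
INSERT INTO DATABASE_PARAMS VALUES (25, 'jdbcUser', 'root');
INSERT INTO DATABASE_PARAMS VALUES (25, 'jdbcPassword', 'root');
INSERT INTO TBLS VALUES (58, '2018-11-06 10:44:51', 25, 'test_date');
INSERT INTO COLUMNS VALUES (58, 'comment', 'id', 'int', 1);
INSERT INTO COLUMNS VALUES (58, 'comment', 'name', 'string', 2);

-- Resolve a table name to its datastore type and JDBC URL.
SELECT t.TBL_NAME, d.DB_TYPE, p.PARAM_VALUE
FROM TBLS t
JOIN DBS d ON d.DB_ID = t.DB_ID
JOIN DATABASE_PARAMS p ON p.DB_ID = d.DB_ID
WHERE t.TBL_NAME = 'test_date' AND p.PARAM_KEY = 'jdbcUrl';
EOF

# Run the script only when the sqlite3 command-line tool is on PATH.
if command -v sqlite3 >/dev/null 2>&1; then
  result=$(sqlite3 "$db" < "$sql")
  echo "$result"
else
  result=""
  echo "sqlite3 not found; SQL written to $sql"
fi
```

Note that `DESC` is a reserved word in SQL, which is why the sketch quotes it as a column name.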
290+
291+
## Contributing
292+
293+
We welcome contributions.
294+
295+
If you are interested in QSQL, you can download the source code from GitHub and run the following Maven command in the project root directory:
296+
297+
```shell
298+
mvn -DskipTests clean package
299+
```
300+
301+
If you are planning to make a large contribution, talk to us first! It helps to agree on the general approach. Log an issue on GitHub for your proposed feature.
302+
303+
Fork the GitHub repository, and create a branch for your feature.
304+
305+
Develop your feature and test cases, and make sure that `mvn install` succeeds. (Run extra tests if your change warrants it.)
306+
307+
Commit your change to your branch.
308+
309+
If your change had multiple commits, use `git rebase -i master` to squash them into a single commit, and to bring your code up to date with the latest on the main line.
310+
311+
Then push your commit(s) to GitHub, and create a pull request from your branch to the QSQL master branch. Update the GitHub issue to reference your pull request, and a committer will review your changes.
312+
313+
The pull request may need to be updated (after its submission) for two main reasons:
314+
315+
1. you identified a problem after the submission of the pull request;
316+
2. the reviewer requested further changes;
317+
318+
In order to update the pull request, you need to commit the changes in your branch and then push the commit(s) to GitHub. You are encouraged to use regular (non-rebased) commits on top of previously existing ones.
319+
320+
### Talks
321+
322+
**QQ Group: 932439028**
323+
324+
**WeChat Group: Posted in the QQ Group** [ P.S. Incorrect QQ Group number will raise a NPE :) ]
