Merge from dev
ogolovatyi authored and ogolovatyi committed Feb 22, 2019
2 parents 28545f1 + 44aa6c8 commit 1ce0e57
Showing 12 changed files with 244 additions and 136 deletions.
2 changes: 2 additions & 0 deletions .travis.yml
@@ -8,8 +8,10 @@ install:
- pip install pytest
- pip install pytest-cov
- pip install coveralls
- npm install -g markdownlint-cli
script:
- pytest tabpy-server/server_tests/ --cov=tabpy-server/tabpy_server
- pytest tabpy-tools/tools_tests/ --cov=tabpy-tools/tabpy_tools --cov-append
- markdownlint .
after_success:
- coveralls
16 changes: 9 additions & 7 deletions CONTRIBUTING.md
@@ -74,14 +74,17 @@ pytest tabpy-tools/tools_tests/ --cov=tabpy-tools/tabpy_tools --cov-append
If you have downloaded Tabpy and would like to manually install Tabpy Server
not using pip then follow the steps below [to run TabPy in Python virtual environment](docs/tabpy-virtualenv.md).


## Documentation Updates

For any process, scripts or API changes documentation needs to be updated accordingly.
Please use markdown validation tools like web-based [markdownlint](https://dlaa.me/markdownlint/)
or npm [markdownlint-cli](https://github.com/igorshubovych/markdownlint-cli).

TOC for markdown file is built with [markdonw-toc](https://www.npmjs.com/package/markdown-toc).
TOC for markdown file is built with [markdown-toc](https://www.npmjs.com/package/markdown-toc):

```sh
markdown-toc -i docs/server-startup.md
```

## TabPy with Swagger

Expand All @@ -98,18 +101,17 @@ Access-Control-Allow-Methods = GET, OPTIONS, POST
```

- Start local instance of TabPy server following [TabPy Server Startup Guide](docs/server-startup.md).
- Run local copy of Swagger editor with steps provided at
- Run local copy of Swagger editor with steps provided at
[https://github.com/swagger-api/swagger-editor](https://github.com/swagger-api/swagger-editor).
- Open `misc/TabPy.yml` in Swagger editor.
- In case your TabPy server runs not on `localhost:9004` update
`host` value in `TabPy.yml` accordingly.

## Code styling

On github repo for merge request `pycodestyle` is used to check Python code against our
style conventions.

You need to install `pycodestyle` locally:
On the GitHub repo, `pycodestyle` is used to check the Python code in merge
requests against our style conventions. You can install it and run it locally
on the files you modified:

```sh
pip install pycodestyle
12 changes: 9 additions & 3 deletions README.md
@@ -1,13 +1,16 @@
# TabPy

[![Community Supported](https://img.shields.io/badge/Support%20Level-Community%20Supported-457387.svg)](https://www.tableau.com/support-levels-it-and-developer-tools)
[![Build Status](https://travis-ci.com/tableau/TabPy.svg?branch=master)](https://travis-ci.com/tableau/TabPy)
[![Coverage Status](https://coveralls.io/repos/github/tableau/TabPy/badge.svg)](https://coveralls.io/github/tableau/TabPy)

[![Python 3.6](https://img.shields.io/badge/python-3.6-blue.svg)](https://www.python.org/downloads/release/python-360/)

TabPy (Tableau Python Server) is external server implementation which allows expanding Tableau with executing Python scripts on table calculation.
TabPy (Tableau Python Server) is an external server implementation that lets
Tableau execute Python scripts in table calculations.

All documentation is in the [docs](docs) folder. Consider reading it the next order:
All documentation is in the [docs](docs) folder. Consider reading it in the
following order:

* [About TabPy](docs/about.md)
* [TabPy Server Download Instructions](docs/server-download.md)
@@ -26,7 +29,10 @@ More technical topics:

Other useful resources:

* For all questions not related to the TabPy code (installation, deployment, connections, Python issues, etc.) and requests use the [External Services Forum](https://community.tableau.com/community/forums/externalservices) on [Tableau Community](https://community.tableau.com).
* For all questions not related to the TabPy code (installation, deployment,
connections, Python issues, etc.) and requests use the
[External Services Forum](https://community.tableau.com/community/forums/externalservices)
on [Tableau Community](https://community.tableau.com).
* [Building advanced analytics applications with TabPy](https://www.tableau.com/about/blog/2017/1/building-advanced-analytics-applications-tabpy-64916)
* [Building Data Science Applications with TabPy Video Tutorial](https://youtu.be/nRtOMTnBz_Y)
* [TabPy Tutorial on TabWiki](https://community.tableau.com/docs/DOC-10856)
112 changes: 78 additions & 34 deletions docs/TableauConfiguration.md
@@ -1,5 +1,6 @@
# Using Python in Tableau Calculations

<!-- markdownlint-disable MD004 -->
<!-- toc -->

- [Configuration](#configuration)
@@ -10,106 +11,149 @@
- [Using Deployed Functions](#using-deployed-functions)

<!-- tocstop -->
<!-- markdownlint-enable MD004 -->

## Configuration

Once you have a [TabPy instance](server-startup.md) set up you can easily configure Tableau to use this service for evaluating Python code.
Once you have a [TabPy instance](server-startup.md) set up you can easily
configure Tableau to use this service for evaluating Python code.

### Tableau Desktop

In Tableau Desktop version 10.1 or later:

1. Go to Help->Settings and Performance->Manage External Service Connection...
2. Enter the Server (localhost if running TabPy on the same computer) and the Port (default is 9004).
2. Enter the Server (localhost if running TabPy on the same computer) and the
Port (default is 9004).

![Screenshot of Configuration on Tableau Desktop](img/external-service-configuration.png)

### Tableau Server 2018.2 and Newer Versions

To configure Tableau Server 2018.2 and newer versions to connect to TabPy server
To configure Tableau Server 2018.2 and newer versions to connect to TabPy server
use [TSM command line tool](https://onlinehelp.tableau.com/current/server/en-us/tsm.htm).

To configure a non secure connection to TabPy server set `vizqlserver.extsvc.host` and `vizqlserver.extsvc.port`
parameters:
To configure a non secure connection to TabPy server set `vizqlserver.extsvc.host`
and `vizqlserver.extsvc.port` parameters:

```sh
tsm set vizqlserver.extsvc.host <ip address or host name of the machine hosting TabPy>
tsm set vizqlserver.extsvc.port <port for TabPy>
```

To configure a secure connection to TabPy server use `tsm security vizql-extsvc enable` command
as described at [TSM Security documentation page](https://onlinehelp.tableau.com/current/server/en-us/cli_security_tsm.htm#tsm_security_vizql-extsvc-ssl-enable).
To configure a secure connection to TabPy server use `tsm security vizql-extsvc enable`
command as described at
[TSM Security documentation page](https://onlinehelp.tableau.com/current/server/en-us/cli_security_tsm.htm#tsm_security_vizql-extsvc-ssl-enable).

<!-- markdownlint-disable MD013 -->

```sh
tsm security vizql-extsvc-ssl enable --connection-type <type> --extsvc-host <host_name> --extsvc-port <port> [options] [global options]
```

For how to configure a secure TabPy instance follow instructions at
[TabPy Server Config documentation](server-config.md).
<!-- markdownlint-enable MD013 -->

For how to configure a secure TabPy instance follow instructions at
[TabPy Server Config documentation](server-config.md).

### Tableau Server 2018.1 and Older Versions

For Tableau workbooks with embedded Python code to work on Tableau Server 10.1 or later, you need to go
through a similar setup but using the [tabadmin](https://onlinehelp.tableau.com/current/server/en-us/tabadmin.htm)
command line utility.
The two server settings that need to be configured are `vizqlserver.extsvc.host` and `vizqlserver.extsvc.port`.
For Tableau workbooks with embedded Python code to work on Tableau Server 10.1
or later, you need to go through a similar setup but using the
[tabadmin](https://onlinehelp.tableau.com/current/server/en-us/tabadmin.htm)
command line utility. The two server settings that need to be configured are
`vizqlserver.extsvc.host` and `vizqlserver.extsvc.port`.

```sh
tabadmin stop
tabadmin set vizqlserver.extsvc.host <ip address or host name of the machine hosting TabPy>
tabadmin set vizqlserver.extsvc.host <ip address or host name for TabPy>
tabadmin set vizqlserver.extsvc.port <port for TabPy>
tabadmin configure
tabadmin start
```

Note that you cannot use TabPy secure connection with 2018.1 and older versions of Tableau.
Note that you cannot use TabPy secure connection with 2018.1 and older versions
of Tableau.

Note that it is not necessary to install TabPy on the Tableau Server or Desktop computer-all
that is required is a pointer to a TabPy server instance.
Note that it is not necessary to install TabPy on the Tableau Server or Desktop
computer; all that is required is a pointer to a TabPy server instance.

Once you're done with configuration, you can use Python in calculated fields in Tableau.
Once you're done with configuration, you can use Python in calculated fields in
Tableau.

## Anatomy of a Python Calculation

Tableau can pass code to TabPy through four different functions: SCRIPT_INT, SCRIPT_REAL, SCRIPT_STR and SCRIPT_BOOL to accommodate the different return types.
In the example below you can see a simple function that passes a column of book names (highlighted in blue) to Python for proper casing. Since Python returns an array of string, SCRIPT_STR function is used.
Tableau can pass code to TabPy through four different functions: SCRIPT_INT,
SCRIPT_REAL, SCRIPT_STR and SCRIPT_BOOL to accommodate the different return
types. In the example below you can see a simple function that passes a column
of book names (highlighted in blue) to Python for proper casing. Since Python
returns an array of string, SCRIPT_STR function is used.

![A simple example of a Python calculated field in Tableau Desktop](img/Example1-SimpleFunctionCall.png)
For a SCRIPT call to Python to be successful, it needs to return a result explicitly specified with the `return` keyword (highlighted in red).

In this simple example, there is only one input but you can pass as many arguments to SCRIPT functions as you like. Tableau takes the arguments in the order provided and replaces the _argN placeholders accordingly. In this case ATTR([Book Name]) maps to _arg1 and both are highlighted to indicate the association.
For a SCRIPT call to Python to be successful, it needs to return a result
explicitly specified with the `return` keyword (highlighted in red).

In this simple example, there is only one input but you can pass as many
arguments to SCRIPT functions as you like. Tableau takes the arguments in the
order provided and replaces the _argN placeholders accordingly. In this case
ATTR([Book Name]) maps to _arg1 and both are highlighted to indicate the
association.

Tableau expects the SCRIPT to return a single column that has either a single row or the same number of rows as it passed to TabPy. The example above sends 18 rows of data to TabPy and receives 18 rows back.
Tableau expects the SCRIPT to return a single column that has either a single
row or the same number of rows as it passed to TabPy. The example above sends
18 rows of data to TabPy and receives 18 rows back.
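
As a rough illustration of the row-count rule above, the following Python
sketch mimics what a proper-casing script body might do and checks the shape of
its result. The names `proper_case` and `is_valid_script_result` are
hypothetical and not part of TabPy or Tableau, and `_arg1` is modelled as a
plain list of strings.

```python
def proper_case(_arg1):
    """Return each input string in title case, one output row per input row."""
    return [name.title() for name in _arg1]


def is_valid_script_result(result, rows_sent):
    """Check the rule described above: one row, or as many rows as were sent."""
    return len(result) in (1, rows_sent)


# Hypothetical sample data standing in for a column of book names.
book_names = ["moby dick", "the old man and the sea", "dune"]
cased = proper_case(book_names)
print(cased)
print(is_valid_script_result(cased, len(book_names)))
```

A result with any other number of rows would fail this check, which is the
situation the paragraph above warns about.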

In the example below Tableau passes multiple columns to TabPy and gets a single value (correlation coefficient) back. SUM(Sales) and SUM(Profit) are used as argument 1 and 2 respectively and highlighted in matching colors.
In this case the function `corrcoef` returns a matrix from which the correlation coefficient is extracted such that a single column is returned.
In the example below Tableau passes multiple columns to TabPy and gets a single
value (correlation coefficient) back. SUM(Sales) and SUM(Profit) are used as
argument 1 and 2 respectively and highlighted in matching colors.
In this case the function `corrcoef` returns a matrix from which the correlation
coefficient is extracted such that a single column is returned.

![Using Partitioning settings with calculations](img/Example2-MultipleFunctionCalls.png)

Tableau aggregates the data before sending to TabPy using the level of detail of the view. In this particular example each point in the scatter plot is a Customer and TabPy is receiving SUM(Sales) and SUM(Profit) for each Customer.
Tableau aggregates the data before sending to TabPy using the level of detail
of the view. In this particular example each point in the scatter plot is a
Customer and TabPy is receiving SUM(Sales) and SUM(Profit) for each Customer.

If you would like to run your Python code on disaggregate data, you can achieve this simply by unchecking the Aggregate Measures option under the Analysis menu.
If you would like to run your Python code on disaggregate data, you can achieve
this simply by unchecking the Aggregate Measures option under the Analysis menu.

The example above showcases another capability that can come in handy if you like to run the same Python script multiple times in different contexts. In this particular example, unchecking the Category and Segment boxes in the Table Calculation dialog results in Tableau making multiple calls to TabPy, once per each pane in the visualization.
The example above showcases another capability that can come in handy if you
like to run the same Python script multiple times in different contexts. In this
particular example, unchecking the Category and Segment boxes in the Table
Calculation dialog results in Tableau making multiple calls to TabPy, once per
each pane in the visualization.
Running regression analysis independently for each Segment-Category combination.

In all of these examples the data structure being returned by the function can be consumed by Tableau. This may not always be the case. If your Python code returns a one dimensional array but TabPy is failing to serialize it to JSON, you may want to convert it to a list as shown in the example below.
In all of these examples the data structure being returned by the function can
be consumed by Tableau. This may not always be the case. If your Python code
returns a one dimensional array but TabPy is failing to serialize it to JSON,
you may want to convert it to a list as shown in the example below.

![Converting to list to make the results JSON serializable](img/python-calculated-field.png)
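
The serialization issue can be reproduced outside Tableau. The sketch below
uses the standard-library `array.array` as a stand-in for a one-dimensional
numeric array (an assumption; the text most likely has NumPy arrays in mind)
and shows that converting the result to a list makes it JSON serializable. The
helper name `to_json_safe` is hypothetical.

```python
import json
from array import array

# Stand-in for a one-dimensional array produced by a numeric library.
values = array("d", [1.5, 2.0, 3.25])


def to_json_safe(result):
    """Convert a one-dimensional array-like result into a plain list."""
    return list(result)


try:
    json.dumps(values)
except TypeError as error:
    # The array type itself is not understood by the json module.
    print("not serializable:", error)

print(json.dumps(to_json_safe(values)))
```

With NumPy, calling `.tolist()` on the array is the usual equivalent, since it
also converts the elements to built-in Python types.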

You can find two detailed working examples with downloadable sample Tableau workbooks on [our blog](https://www.tableau.com/about/blog/2017/1/building-advanced-analytics-applications-tabpy-64916).
You can find two detailed working examples with downloadable sample Tableau
workbooks on [our blog](https://www.tableau.com/about/blog/2017/1/building-advanced-analytics-applications-tabpy-64916).

## Using Deployed Functions

[TabPy Tools documentation](tabpy-tools.md) covers in detail how functions could be deployed as endpoints.
You can invoke such endpoints using `tabpy.query` option by specifying the endpoint name and arguments and retrieving the `response` object.
[TabPy Tools documentation](tabpy-tools.md) covers in detail how functions
could be deployed as endpoints.
You can invoke such endpoints using `tabpy.query` option by specifying the
endpoint name and arguments and retrieving the `response` object.

A SCRIPT calculated field in Tableau using the [add endpoint](tabpy-tools.md#deploying-a-function) defined in [TabPy Tools documentation](tabpy-tools.md) could look like the following:
A SCRIPT calculated field in Tableau using the
[add endpoint](tabpy-tools.md#deploying-a-function) defined in
[TabPy Tools documentation](tabpy-tools.md) could look like the following:

```sh
SCRIPT_REAL("
return tabpy.query('add',_arg1,_arg2)['response']",
-SUM([Discount]),SUM([Price]))
```

You can find a detailed working example with a downloadable sample Tableau workbook showing how to publish models and use the published models in calculated fields on [our blog](https://www.tableau.com/about/blog/2017/1/building-advanced-analytics-applications-tabpy-64916).
You can find a detailed working example with a downloadable sample Tableau
workbook showing how to publish models and use the published models in
calculated fields on
[our blog](https://www.tableau.com/about/blog/2017/1/building-advanced-analytics-applications-tabpy-64916).
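
For orientation, a function behind an `add` endpoint could look like the sketch
below. It is a plain-Python assumption about what such a function does
(element-wise addition of two equally long columns), not the definition from
the TabPy Tools documentation, and it leaves out the deployment step.

```python
def add(x, y):
    """Add two equally long columns element by element and return a list."""
    if len(x) != len(y):
        raise ValueError("both arguments must have the same number of rows")
    return [a + b for a, b in zip(x, y)]


# Hypothetical values standing in for -SUM([Discount]) and SUM([Price]).
discounts = [-0.5, -1.0, 0.0]
prices = [10.0, 20.0, 30.0]
print(add(discounts, prices))
```

Returning a plain list keeps the result JSON serializable and gives Tableau one
output row per input row.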
6 changes: 3 additions & 3 deletions docs/about.md
@@ -4,8 +4,8 @@ TabPy framework allows Tableau to remotely execute Python code. It has two compo

1. A process built on Tornado, which allows for the remote execution of Python
code through a set of [REST APIs](server-rest.md). The code can either be immediately
executed or persisted in the server process and exposed as a REST endpoint, to be
called later.
executed or persisted in the server process and exposed as a REST endpoint,
to be called later.

2. A [tools library](tabpy-tools.md) that enables the deployment of such endpoints,
based on Python functions.
@@ -14,5 +14,5 @@ Tableau can connect to the TabPy server to execute Python code on the fly and
display results in Tableau visualizations. Users can control data and parameters
being sent to TabPy by interacting with their Tableau worksheets, dashboard or stories.

For how to configure Tableau to connect to TabPy server follow steps in
For how to configure Tableau to connect to TabPy server follow steps in
[Tableau Configuration Document](TableauConfiguration.md).
3 changes: 2 additions & 1 deletion docs/security.md
@@ -3,7 +3,8 @@
The following security issues should be kept in mind as you use TabPy with Tableau:

- TabPy currently does not use authentication.
- Python scripts can contain code which can harm security on the server where the TabPy is running. For example:
- Python scripts can contain code which can harm security on the server where
the TabPy is running. For example:
- Access file system (read/write)
- Install new Python packages which can contain binary code
- Execute operating system commands
25 changes: 18 additions & 7 deletions docs/server-config.md
@@ -1,32 +1,43 @@
# TabPy Server Configuration Instructions

Default settings for TabPy may be viewed in the tabpy_server/common/default.conf file. This file also contains a commented example of how to set up your TabPy server to only serve HTTPS traffic.
Default settings for TabPy may be viewed in the
tabpy_server/common/default.conf file. This file also contains a
commented example of how to set up your TabPy server to only
serve HTTPS traffic.

Change settings by:

1. Adding environment variables:
- set the environment variable as required by your Operating System. When creating environment variables, use the same name as is in the config file as an environment variable. The files startup.sh and startup.cmd in the root of the install folder have examples of how to set environment variables in both Linux and Windows respectively. Set any desired environment variables and then start the application.
- set the environment variable as required by your Operating System. When
creating environment variables, use the same name as is in the config file
as an environment variable. The files startup.sh and startup.cmd in the root
of the install folder have examples of how to set environment variables in
both Linux and Windows respectively. Set any desired environment variables
and then start the application.
2. Modifying default.conf.
3. Specifying your own config file as a command line parameter.
- For example, by running the command:
```python tabpy.py --config=path\to\my\config```

The default config file is provided to show you the default values but does not need to be present to run TabPy.
The default config file is provided to show you the default values but does not
need to be present to run TabPy.
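
To make the interplay of these options concrete, here is a minimal Python
sketch of how a setting could be resolved from an environment variable, a
config file, and a built-in default. The precedence shown (environment first)
and the `[TabPy]` section name are assumptions for illustration; the text above
does not state either, and `resolve_setting` is a hypothetical helper rather
than TabPy's own code.

```python
import configparser
import os


def resolve_setting(name, config_text, environ=None, default=None):
    """Look up a setting: environment first, then config text, then default.

    The precedence order is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(config_text)
    for section in parser.sections():
        if parser.has_option(section, name):
            return parser.get(section, name)
    return default


# Hypothetical config contents; the section name is an assumption.
example_config = """
[TabPy]
TABPY_TRANSFER_PROTOCOL = https
"""

print(resolve_setting("TABPY_TRANSFER_PROTOCOL", example_config, environ={}))
print(resolve_setting("TABPY_TRANSFER_PROTOCOL", example_config,
                      environ={"TABPY_TRANSFER_PROTOCOL": "http"}))
```

Passing `environ` explicitly keeps the example independent of the variables set
on the machine where it runs.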

## Configuring HTTP vs HTTPS

By default, TabPy serves only HTTP requests. TabPy can be configured to serve only HTTPS requests by setting the following parameter in the config file:
By default, TabPy serves only HTTP requests. TabPy can be configured to serve
only HTTPS requests by setting the following parameter in the config file:

```sh
TABPY_TRANSFER_PROTOCOL = https
```

If HTTPS is selected, the absolute paths to the cert and key file need to be specified in your config file using the following parameters:
If HTTPS is selected, the absolute paths to the cert and key file need to be
specified in your config file using the following parameters:

```sh
TABPY_CERTIFICATE_FILE = C:/path/to/cert/file.crt
TABPY_KEY_FILE = C:/path/to/key/file.key
```

Note that only PEM-encoded x509 certificates are supported for the secure connection scenario.

Note that only PEM-encoded x509 certificates are supported for the secure
connection scenario.
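
Before pointing `TABPY_CERTIFICATE_FILE` at a file, it can help to check that
the file at least has PEM framing. The sketch below is a hypothetical helper,
not part of TabPy; it only looks at the header and footer lines and does not
validate the certificate itself.

```python
PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def looks_like_pem_certificate(text):
    """Return True if the text is framed like a PEM-encoded certificate.

    Only the header and footer are checked; the payload is not decoded.
    """
    stripped = text.strip()
    return stripped.startswith(PEM_HEADER) and stripped.endswith(PEM_FOOTER)


# Placeholder payload for illustration only; this is not a real certificate.
sample = PEM_HEADER + "\nTUlJQ2V4YW1wbGU=\n" + PEM_FOOTER + "\n"
print(looks_like_pem_certificate(sample))
print(looks_like_pem_certificate("binary DER data"))
```

In practice the text would come from reading the certificate file, for example
with `open(path).read()`.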