The Didi dataset: Digital Ink Diagram data

This repository contains a Colab notebook that demonstrates how to handle the digital ink data from the Didi dataset.

The dataset contains digital ink drawings of diagrams, including dynamic drawing information. It aims to foster research in interactive graphical symbolic understanding and was obtained through a prompted data collection effort.

Download the Didi dataset

We provide the raw data in NDJSON format, as well as the prompts in PNG, DOT, and xdot formats.
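Since NDJSON stores one JSON object per line, the raw data can be read with the Python standard library alone. The following is a minimal sketch, not the notebook's own loader: the function name `read_ndjson` is hypothetical, and it makes no assumptions about which fields each record contains.

```python
import json
from typing import Any, Dict, Iterator


def read_ndjson(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of an NDJSON file.

    Hypothetical helper for illustration; the Colab notebook may read
    the data differently.
    """
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines between records
                yield json.loads(line)
```

To see which fields a downloaded file provides, one option is to inspect the first record, for example `sorted(next(read_ndjson(path)).keys())`, where `path` points to a file you downloaded. Reading lazily with a generator avoids loading the whole file into memory.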

The dataset and details about its construction and use are described in this arXiv paper: The Didi dataset: Digital Ink Diagram data.

Visualizing and converting the data

We provide a Colab notebook that demonstrates how to read and visualize the data. It also provides functions to convert the data to TFRecord files for easy use in TensorFlow.

Training and evaluating a model

First, download the Didi dataset. You can either download the raw data directly or use our demo Colab to convert the data into TFRecord files.

Our paper gives more information about a potential train/validation/test split of the data.
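If you want a quick, reproducible split before consulting the paper, one common approach is to assign each example to a partition by hashing a stable identifier. The sketch below is an illustration only and is not the split described in the paper: the function `assign_split`, the string identifier it takes, and the default fractions are all assumptions.

```python
import hashlib


def assign_split(example_id: str,
                 val_fraction: float = 0.1,
                 test_fraction: float = 0.1) -> str:
    """Deterministically map an example identifier to a partition name.

    Hypothetical helper: the identifier and the fractions are assumptions,
    not values taken from the Didi paper.
    """
    digest = hashlib.sha256(example_id.encode("utf-8")).hexdigest()
    # Interpret the first 32 bits of the hash as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    if bucket < test_fraction:
        return "test"
    if bucket < test_fraction + val_fraction:
        return "validation"
    return "train"
```

Because the assignment depends only on the identifier, the same example always lands in the same partition, regardless of file order or how many times the split is recomputed.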

Licenses

The data is licensed by Google LLC under a CC BY 4.0 license. The code is released under an Apache 2 license.