This repository contains a Colab notebook that demonstrates how to handle the digital ink data from the Didi dataset.
The dataset contains digital ink drawings of diagrams with dynamic drawing information and aims to foster research in interactive graphical symbolic understanding. It was obtained through a prompted data collection effort.
We provide the raw data in NDJSON format as well as the prompts in png, dot, and xdot format.
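Since NDJSON stores one JSON object per line, the raw files can be read with the Python standard library alone. The following is a minimal sketch, not the notebook's own loader: the helper name `read_ndjson` and the file name in the usage note are hypothetical, and no assumption is made about which fields each record contains.

```python
import json
from typing import Any, Dict, Iterator


def read_ndjson(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of an NDJSON file.

    Hypothetical helper for illustration; the Colab notebook may read
    the data differently.
    """
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as err:
                # Report the offending line so a damaged download is easy to spot.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from err
```

For example, `next(read_ndjson("diagrams.ndjson")).keys()` (with a placeholder file name) lists the fields of the first record, which is a quick way to inspect the schema before writing any conversion code. Reading lazily with a generator keeps memory use low for large files.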
The dataset and details about its construction and use are described in this arXiv paper: The Didi dataset: Digital Ink Diagram data.
We provide a Colab notebook that demonstrates how to read and visualize the data. It also provides functions to convert the data to TFRecord files for easy use in TensorFlow.
First, download the Didi dataset. You can either download the raw data or use our demo Colab to convert the data into TFRecord files.
Our paper gives more information about a potential train/validation/test split of the data.
The data is licensed by Google LLC under the CC BY 4.0 license. The code is released under the Apache 2.0 license.