VisualDL, a visualization analysis tool of PaddlePaddle, provides a variety of charts to show the trends of parameters, and visualizes model structures, data samples, histograms of tensors, PR curves, ROC curves and high-dimensional data distributions. It enables users to understand the training process and the model structure more clearly and intuitively so as to optimize models efficiently.
VisualDL provides various visualization functions, including tracking metrics in real time, visualizing the model structure, displaying data samples, visualizing the relationship between hyperparameters and model metrics, presenting changes in tensor distributions, showing PR curves, projecting high-dimensional data to a lower-dimensional space and more. Additionally, VisualDL provides VDL.service, which enables developers to easily save, track and share visualization results of experiments. For specific guidelines on each function, please refer to the VisualDL User Guide. For an up-to-date experience, please feel free to try our Online Demo. VisualDL currently iterates rapidly, and new functions will be added continuously.
Browsers supported by VisualDL are:
- Google Chrome ≥ 79
- Firefox ≥ 67
- Microsoft Edge ≥ 79
- Safari ≥ 11.1
VisualDL natively supports the use of Python. Developers can retrieve plentiful visualization results by simply adding a few lines of Python code into the model before training.
- Key Highlights
- Installation
- Usage Guideline
- Function Preview
- Frequently Asked Questions
- Contribution
- More Details
- Technical Communication
The high-level API design makes VisualDL easy to use. A single click starts the visualization of model structures.
The functions cover the visualization of training parameters, data samples, graph structures, histograms of tensors, PR curves and high-dimensional data distributions.
VisualDL visualizes mainstream model formats such as Paddle, ONNX and Caffe, supporting visual analysis for a wide range of users.
By integrating with PaddlePaddle and related modules, VisualDL allows developers to use different components without obstruction and to get the best experience in the PaddlePaddle ecosystem.
python -m pip install visualdl -i https://mirror.baidu.com/pypi/simple
git clone https://github.com/PaddlePaddle/VisualDL.git
cd VisualDL
python setup.py bdist_wheel
pip install --upgrade dist/visualdl-*.whl
Please note that Python 2 has not been officially maintained since January 1, 2020. To keep the code usable, VisualDL now supports only Python 3.
VisualDL stores the data, parameters and other information of the training process in a log file. Users can launch the panel to observe the visualization results.
The back end of VisualDL provides a Python SDK, and a logger can be customized through LogWriter. The interface is as follows:
class LogWriter(logdir=None,
                max_queue=10,
                flush_secs=120,
                filename_suffix='',
                **kwargs)
parameters | type | meaning |
---|---|---|
logdir | string | The path location of log file. VisualDL will create a log file under this path to record information generated by the training process. If not specified, the path will be runs/${CURRENT_TIME} as default. |
max_queue | int | The maximum capacity of the data generated before recording in a log file. Default value is 10. If the capacity is reached, the data are immediately written into the log file. |
flush_secs | int | The maximum cache time of the data generated before recording in a log file. Default value is 120. When this time is reached, the data are immediately written to the log file. (When the log message queue reaches the maximum cache time or maximum capacity, it will be written to the log file immediately) |
filename_suffix | string | Add a suffix to the default log file name. |
display_name | string | This name is displayed under Select Data Stream in the panel. If not set, the default name is logdir. (Useful when logdir is too long or needs to be hidden.) |
file_name | string | Set the name of the log file. If a log file with this file_name already exists, new records are appended to it and the same file continues to be used. Note that the name should include 'vdlrecords'. |
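The interaction between max_queue and flush_secs can be illustrated with a small stand-alone sketch. This is not VisualDL's implementation: `BufferedLog`, its `clock` argument and the `flushed` list (which stands in for the log file) are hypothetical names used only to show the "flush when full or when too old" rule described in the table.

```python
import time


class BufferedLog:
    """Toy buffer that flushes on size (max_queue) or age (flush_secs)."""

    def __init__(self, max_queue=10, flush_secs=120, clock=time.monotonic):
        self.max_queue = max_queue
        self.flush_secs = flush_secs
        self._clock = clock
        self._queue = []
        self._last_flush = clock()
        self.flushed = []  # stands in for the log file on disk

    def add(self, record):
        """Buffer a record and flush if either limit has been reached."""
        self._queue.append(record)
        too_full = len(self._queue) >= self.max_queue
        too_old = self._clock() - self._last_flush >= self.flush_secs
        if too_full or too_old:
            self.flush()

    def flush(self):
        """Move all buffered records to the 'file' and restart the timer."""
        self.flushed.extend(self._queue)
        self._queue.clear()
        self._last_flush = self._clock()


log = BufferedLog(max_queue=3, flush_secs=120)
for step in range(1, 5):
    log.add({"tag": "acc", "step": step})

# Three records were flushed when the queue filled; the fourth is still buffered.
print(len(log.flushed), "flushed,", len(log._queue), "pending")
```

In this sketch, a small max_queue or flush_secs makes data reach the file sooner at the cost of more frequent writes.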
Create a log file and record scalar values:
from visualdl import LogWriter
# create a log file under `./log/scalar_test/train`
with LogWriter(logdir="./log/scalar_test/train") as writer:
    # use `add_scalar` to record scalar values
    writer.add_scalar(tag="acc", step=1, value=0.5678)
    writer.add_scalar(tag="acc", step=2, value=0.6878)
    writer.add_scalar(tag="acc", step=3, value=0.9878)
# you can also use the following method without using context manager `with`:
"""
writer = LogWriter(logdir="./log/scalar_test/train")
writer.add_scalar(tag="acc", step=1, value=0.5678)
writer.add_scalar(tag="acc", step=2, value=0.6878)
writer.add_scalar(tag="acc", step=3, value=0.9878)
writer.close()
"""
In the above example, the log has recorded three scalar values. Developers can view the visualization results of the log file by launching the VisualDL panel. There are two ways to launch the panel:
Use the command line to launch the VisualDL panel:
visualdl --logdir <dir_1, dir_2, ... , dir_n> --model <model_file> --host <host> --port <port> --cache-timeout <cache_timeout> --language <language> --public-path <public_path> --api-only
Parameter details:
parameters | meaning |
---|---|
--logdir | Set one or more directories of the log. All the logs in the paths or subdirectories will be displayed on the VisualDL Board independently. |
--model | Set a path to the model file (not a directory). VisualDL will visualize the model file in the Graph page. PaddlePaddle, ONNX, Keras, Core ML, Caffe and other model formats are supported. Please refer to Graph - Functional Instructions. |
--host | Specify the IP address. The default value is 127.0.0.1. Specify it as 0.0.0.0 or a public IP address so that other machines can visit the VisualDL Board. |
--port | Set the port. The default value is 8040. |
--cache-timeout | Cache time of the backend. If the front end requests the same URL multiple times within the cache time, the returned data are served from the cache. The default cache time is 20 seconds. |
--language | The language of the VisualDL panel. Language can be specified as 'en' or 'zh', and the default is the language used by the browser. |
--public-path | The URL path of the VisualDL panel. The default path is '/app', meaning that the access address is 'http://<host>:<port>/app'. |
--api-only | Decide whether or not to provide only the API. If this parameter is set, VisualDL will only provide the API service without displaying the web page, and the API address is 'http://<host>:<port>/<public_path>/api'. Additionally, if the public_path parameter is not specified, the default address is 'http://<host>:<port>/api'. |
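When the panel is launched from a script or a job scheduler, it can help to assemble the command programmatically. The sketch below is one possible way to do that with the standard library. `build_visualdl_command` is a hypothetical helper, not part of VisualDL; it covers only a subset of the flags listed in the table and a single log directory.

```python
import shlex


def build_visualdl_command(logdir, model=None, host=None, port=None,
                           language=None, api_only=False):
    """Return the argument list for a `visualdl` invocation (hypothetical helper)."""
    args = ["visualdl", "--logdir", logdir]
    optional = [("--model", model), ("--host", host),
                ("--port", port), ("--language", language)]
    for flag, value in optional:
        if value is not None:  # omitted flags fall back to VisualDL's defaults
            args += [flag, str(value)]
    if api_only:
        args.append("--api-only")  # flag without a value
    return args


cmd = build_visualdl_command("./log", host="0.0.0.0", port=8040)
# Prints a shell-safe, copy-pastable form of the command
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd)` on a machine where VisualDL is installed.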
To visualize the log file generated in the previous step, developers can launch the panel through the command:
visualdl --logdir ./log
Developers can start the VisualDL panel in a Python script as follows:
visualdl.server.app.run(logdir,
                        model="path/to/model",
                        host="127.0.0.1",
                        port=8080,
                        cache_timeout=20,
                        language=None,
                        public_path=None,
                        api_only=False,
                        open_browser=False)
Please note: since all parameters except logdir are keyword arguments, developers should specify parameter names when using them.
The interface parameters are as follows:
parameters | type | meaning |
---|---|---|
logdir | string or list[string_1, string_2, ... , string_n] | Set one or more directories of the log. All the logs in the paths or subdirectories will be displayed on the VisualDL Board independently. |
model | string | Set a path to the model file (not a directory). VisualDL will visualize the model file in the Graph page. |
host | string | Specify the IP address. The default value is 127.0.0.1. Specify it as 0.0.0.0 or a public IP address so that other machines can visit the VisualDL Board. |
port | int | Set the port. The default value is 8040. |
cache_timeout | int | Cache time of the backend. If the front end requests the same URL multiple times within the cache time, the returned data are served from the cache. The default cache time is 20 seconds. |
language | string | The language of the VisualDL panel. Language can be specified as 'en' or 'zh', and the default is the language used by the browser. |
public_path | string | The URL path of the VisualDL panel. The default path is '/app', meaning that the access address is 'http://<host>:<port>/app'. |
api_only | boolean | Decide whether or not to provide only the API. If this parameter is set, VisualDL will only provide the API service without displaying the web page, and the API address is 'http://<host>:<port>/<public_path>/api'. Additionally, if the parameter public_path is not specified, the default address is 'http://<host>:<port>/api'. |
open_browser | boolean | Whether or not to open the browser. If this parameter is set as True, the browser will be opened automatically and VisualDL panel will be launched at the same time. If parameter api_only is specified as True, parameter open_browser can be ignored. |
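The address rules for public_path and api_only in the table can be written out as a short sketch. The two functions below are hypothetical helpers that only format strings according to those rows. They do not call VisualDL, and their defaults are taken from the table.

```python
def panel_url(host="127.0.0.1", port=8040, public_path="/app"):
    """Address of the web panel, following the public_path row of the table."""
    return f"http://{host}:{port}/{public_path.strip('/')}"


def api_url(host="127.0.0.1", port=8040, public_path=None):
    """Address of the API service when api_only is set."""
    if not public_path:
        # No public_path given: the API is served directly under /api
        return f"http://{host}:{port}/api"
    return f"http://{host}:{port}/{public_path.strip('/')}/api"


print(panel_url())                       # http://127.0.0.1:8040/app
print(api_url())                         # http://127.0.0.1:8040/api
print(api_url(public_path="/visualdl"))  # http://127.0.0.1:8040/visualdl/api
```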
To visualize the log file generated in the previous step, developers can launch the panel through the following code:
from visualdl.server import app
app.run(logdir="./log")
After launching the panel by one of the above methods, developers can see the visualization results in the browser, as shown below:
VisualDL also provides the LogReader interface to read any data from log files.
class LogReader(file_path='')
parameters | type | meaning |
---|---|---|
file_path | string | File path of the log file. Required. |
If there is a log file named vdlrecords.1605533348.log in the directory ./log, we can retrieve the data under the 'loss' tag in the scalar by:
from visualdl import LogReader
reader = LogReader(file_path='./log/vdlrecords.1605533348.log')
data = reader.get_data('scalar', 'loss')
print(data)
The result is a list, as shown below:
...
id: 5
tag: "Metrics/Training(Step): loss"
timestamp: 1605533356039
value: 3.1297709941864014
...
For more information about LogReader, please refer to LogReader.
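Once the records have been read, they can be post-processed with ordinary Python. The sketch below uses a stand-in `Record` type instead of the real objects returned by `reader.get_data`, so that it runs without a log file. The field names follow the printed output above, the values are made up, and treating `id` as the step and `timestamp` as milliseconds are assumptions.

```python
from collections import namedtuple

# Stand-in for the objects returned by reader.get_data('scalar', 'loss');
# field names follow the printed output, values are invented for illustration.
Record = namedtuple("Record", ["id", "tag", "timestamp", "value"])

records = [
    Record(1, "loss", 1605533350000, 4.2),
    Record(2, "loss", 1605533352000, 3.6),
    Record(3, "loss", 1605533356000, 3.1),
]


def summarize(records):
    """Return (steps, values, elapsed_seconds) for a list of scalar records."""
    steps = [r.id for r in records]    # assumption: id is the step index
    values = [r.value for r in records]
    # Assumption: timestamps are in milliseconds
    elapsed = (records[-1].timestamp - records[0].timestamp) / 1000.0
    return steps, values, elapsed


steps, values, elapsed = summarize(records)
print(steps, min(values), elapsed)
```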
Scalar uses various charts to display how parameters such as accuracy, loss and learning rate change during the training process. Developers can observe a single group or multiple groups of parameters in order to understand the training process and thus speed up model tuning.
After the launch of VisualDL Board, the LogReader continuously reads the recorded data for display in the front end. Hence, the changes of parameters can be visualized in real time, as shown below:
Developers can compare multiple experiments by specifying and uploading the path of each experiment at the same time so as to visualize the same parameters in the same chart.
Image provides real-time visualizations of the image data during the training process, allowing developers to observe the changes of images at different training stages and to deeply understand the effects of the training process.
Audio aims to allow developers to listen to the audio data in real-time during the training process, helping developers to monitor the process of speech recognition and text-to-speech.
Text visualizes the text output of NLP models at any stage, helping developers compare changes in the outputs so as to understand the training process in depth and evaluate the performance of the model easily.
Graph enables developers to visualize model structures with a single click. Moreover, Graph allows developers to explore model attributes, node information, and node inputs and outputs, helping them analyze model structures quickly and understand the direction of data flow easily. Additionally, Graph supports the visualization of both dynamic and static model graphs.
- dynamic graph
- static graph
Histogram displays, in the form of a histogram, how the distribution of a tensor (weight, bias, gradient, etc.) changes during the training process. Developers can adjust the model structure accurately by gaining an in-depth understanding of the effect of each layer.
- Offset Mode
- Overlay Mode
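To make concrete what a single histogram slice contains, the following sketch bins the values of a tensor into equal-width buckets. It is a plain-Python illustration, not VisualDL's binning code. The function name, the bin count and the sample weights are all chosen for the example.

```python
def histogram(values, num_bins=5):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * num_bins
    for v in values:
        # Clamp so that the maximum value lands in the last bin
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical flattened weights of one layer at one training step
weights = [-1.0, -0.5, -0.4, 0.0, 0.1, 0.2, 0.3, 0.9, 1.0]
edges, counts = histogram(weights, num_bins=4)
print(counts)  # [1, 2, 4, 2]
```

Recording one such set of counts per training step gives the sequence of slices that the Offset and Overlay modes arrange in different ways.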
PR Curve displays the precision and recall values under different thresholds, helping developers to find the best threshold efficiently.
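As a minimal sketch of the quantities behind a PR curve, the code below computes precision and recall for a binary classifier at a few thresholds. The labels and scores are invented for illustration. Returning a precision of 1.0 when nothing is predicted positive is a convention chosen for this example.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold count as positive."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention: no positive predictions
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


labels = [1, 1, 0, 1, 0, 0]                 # ground truth (1 = positive)
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]     # predicted probabilities

for threshold in (0.25, 0.5, 0.75):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold here trades recall for precision, which is the trade-off the curve makes visible.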
ROC Curve shows the performance of a classification model at all classification thresholds; the larger the area under the curve, the better the model performs, aiding developers in evaluating the model performance and choosing an appropriate threshold.
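The area under the ROC curve can be computed by hand for a small example. The sketch below sweeps the threshold over the sorted scores and integrates with the trapezoidal rule. It assumes binary labels, at least one sample of each class and no tied scores. The data are invented.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) points obtained by sweeping the threshold over all scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the threshold one sample at a time, from the highest score down
    for score, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(f"AUC = {auc(roc_points(labels, scores)):.3f}")  # AUC = 0.889
```

A classifier that ranks every positive above every negative yields an area of 1.0 with this computation.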
High Dimensional provides three approaches to dimensionality reduction (t-SNE, PCA and UMAP), allowing developers to analyze the relationships within high-dimensional data in depth and to optimize algorithms based on the analysis.
Hyper Parameters visualizes the relationship between hyperparameters and model metrics (such as accuracy and loss) in a rich view, helping you identify the best hyperparameters efficiently.
Performance Analysis (Profiler) visualizes the profiling data collected while your program runs, helping you identify program bottlenecks and optimize performance. Please refer to the VisualDL Profiler Guide.
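Of the three approaches, PCA is the simplest to sketch. The example below estimates the first principal direction with power iteration in plain Python and projects each point onto it. It is a simplified illustration that keeps one component only. It is not the projection code used by VisualDL, and it assumes the data are not all identical. The sample points are invented.

```python
def first_principal_component(rows, iterations=100):
    """Estimate the top principal direction with power iteration."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    vec = [1.0] * dim
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize to unit length
        vec = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec, centered


def project(rows):
    """Project each row onto the first principal component (one coordinate per row)."""
    vec, centered = first_principal_component(rows)
    return [sum(c[j] * vec[j] for j in range(len(vec))) for c in centered]


# Three-dimensional points that vary mostly along the direction (1, 1, 0)
points = [[0.0, 0.1, 1.0], [1.0, 0.9, 1.0], [2.0, 2.1, 1.0], [3.0, 2.9, 1.0]]
print([round(x, 2) for x in project(points)])
```

Keeping two or three components instead of one gives coordinates that can be drawn as a 2D or 3D scatter plot.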
VDL.service enables developers to easily save, track and share visualization results with anyone for free.
If you are confronted with some problems when using VisualDL, please refer to our FAQs.
VisualDL, in which Graph is powered by Netron, is an open source project supported by PaddlePaddle and ECharts.
Developers are warmly welcomed to use, comment and contribute.
For more details on the use of VisualDL, please refer to the VisualDL User Guide.
You are welcome to join the official QQ group 1045783368 to communicate with the PaddlePaddle team and other developers.