ultralytics 8.0.97 confusion matrix, windows, docs updates (ultralytics#2511)

Co-authored-by: Yonghye Kwon <[email protected]>
Co-authored-by: Dowon <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Laughing <[email protected]>
5 people authored May 9, 2023
1 parent 6ee3a9a commit d1107ca
Showing 138 changed files with 747 additions and 354 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/publish.yml
@@ -23,6 +23,8 @@ jobs:
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: "0"  # pulls all commits (needed for correct last-updated dates in Docs)
      - name: Set up Python environment
        uses: actions/setup-python@v4
        with:
6 changes: 5 additions & 1 deletion docs/README.md
@@ -1,3 +1,7 @@
---
description: Learn how to install the Ultralytics package in developer mode and build/serve locally using MkDocs. Deploy your project to your host easily.
---

# Ultralytics Docs

Ultralytics Docs are deployed to [https://docs.ultralytics.com](https://docs.ultralytics.com).
@@ -82,4 +86,4 @@ for your repository and updating the "Custom domain" field in the "GitHub Pages"
![196814117-fc16e711-d2be-4722-9536-b7c6d78fd167](https://user-images.githubusercontent.com/26833433/210150206-9e86dcd7-10af-43e4-9eb2-9518b3799eac.png)

For more information on deploying your MkDocs documentation site, see
the [MkDocs documentation](https://www.mkdocs.org/user-guide/deploying-your-docs/).
6 changes: 5 additions & 1 deletion docs/SECURITY.md
@@ -1,3 +1,7 @@
---
description: Learn how Ultralytics prioritizes security. Get insights into Snyk and GitHub CodeQL scans, and how to report security issues in YOLOv8.
---

# Security Policy

At [Ultralytics](https://ultralytics.com), the security of our users' data and systems is of utmost importance. To
@@ -25,4 +29,4 @@ reach out to us directly via our [contact form](https://ultralytics.com/contact)
via [[email protected]](mailto:[email protected]). Our security team will investigate and respond as soon
as possible.

We appreciate your help in keeping the YOLOv8 repository secure and safe for everyone.
3 changes: 3 additions & 0 deletions docs/datasets/classify/index.md
@@ -1,5 +1,6 @@
---
comments: true
description: Learn how torchvision organizes classification image datasets. Use this code to create and train models. CLI and Python code shown.
---

# Image Classification Datasets Overview
@@ -77,6 +78,7 @@ cifar-10-/
In this example, the `train` directory contains subdirectories for each class in the dataset, and each class subdirectory contains all the images for that class. The `test` directory has a similar structure. The `root` directory also contains other files that are part of the CIFAR10 dataset.
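
As a quick aside, this folder-per-class layout is exactly what torchvision's `ImageFolder` expects — a minimal sketch, assuming torchvision is installed and the path (hypothetical here) follows the structure above:

```python
from torchvision import datasets, transforms

# ImageFolder infers one class per subdirectory of the root folder
dataset = datasets.ImageFolder(
    root='cifar-10-/train',  # hypothetical path matching the layout above
    transform=transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()]),
)
print(dataset.classes)  # class names discovered from the subdirectory names
```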

## Usage

!!! example ""

=== "Python"
@@ -98,4 +100,5 @@ In this example, the `train` directory contains subdirectories for each class in
```

## Supported Datasets

TODO
1 change: 1 addition & 0 deletions docs/datasets/detect/coco.md
@@ -1,5 +1,6 @@
---
comments: true
description: Learn about the COCO dataset, designed to encourage research on object detection, segmentation, and captioning with standardized evaluation metrics.
---

# COCO Dataset
15 changes: 10 additions & 5 deletions docs/datasets/detect/index.md
@@ -1,5 +1,6 @@
---
comments: true
description: Learn about supported dataset formats for training YOLO detection models, including Ultralytics YOLO and COCO, in this Object Detection Datasets Overview.
---

# Object Detection Datasets Overview
@@ -15,11 +16,12 @@ The dataset format used for training YOLO detection models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
    - Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
    - Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
    - Object width and height: The width and height of the object, normalized to be between 0 and 1.

The format for a single row in the detection dataset file is as follows:

```
<object-class> <x> <y> <width> <height>
```
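
For instance, a hypothetical label file for an image containing one person and one car might look like this (values are illustrative only):

```
0 0.481 0.634 0.690 0.713
1 0.127 0.856 0.118 0.154
```
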
@@ -55,6 +57,7 @@ The `names` field is a list of the names of the object classes. The order of the
NOTE: Either `nc` or `names` must be defined. Defining both is not mandatory.

Alternatively, you can directly define class names like this:

```yaml
names:
  0: person
@@ -72,6 +75,7 @@ names: ['person', 'car']
```
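
Putting this together, a minimal complete dataset YAML might look like the sketch below (the paths and dataset name are illustrative, assuming the standard Ultralytics layout of a dataset root containing image folders):

```yaml
path: ../datasets/my-dataset  # hypothetical dataset root
train: images/train  # training images, relative to path
val: images/val  # validation images, relative to path
names:
  0: person
  1: car
```
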
## Usage

!!! example ""

    === "Python"
@@ -93,6 +97,7 @@ names: ['person', 'car']
```

## Supported Datasets

TODO

## Port or Convert label formats
@@ -103,4 +108,4 @@ TODO
```python
from ultralytics.yolo.data.converter import convert_coco

convert_coco(labels_dir='../coco/annotations/')
```
49 changes: 25 additions & 24 deletions docs/datasets/index.md
@@ -1,5 +1,6 @@
---
comments: true
description: Ultralytics provides support for various datasets to facilitate multiple computer vision tasks. Check out our list of main datasets and their summaries.
---

# Datasets Overview
@@ -10,48 +11,48 @@ Ultralytics provides support for various datasets to facilitate computer vision

Bounding box object detection is a computer vision technique that involves detecting and localizing objects in an image by drawing a bounding box around each object.

* [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations.
* [COCO](detect/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning with over 200K labeled images.
* [COCO8](detect/coco8.md): Contains the first 4 images from COCO train and COCO val, suitable for quick tests.
* [Global Wheat 2020](detect/globalwheat2020.md): A dataset of wheat head images collected from around the world for object detection and localization tasks.
* [Objects365](detect/objects365.md): A high-quality, large-scale dataset for object detection with 365 object categories and over 600K annotated images.
* [SKU-110K](detect/sku-110k.md): A dataset featuring dense object detection in retail environments with over 11K images and 1.7 million bounding boxes.
* [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.
* [VOC](detect/voc.md): The Pascal Visual Object Classes (VOC) dataset for object detection and segmentation with 20 object classes and over 11K images.
* [xView](detect/xview.md): A dataset for object detection in overhead imagery with 60 object categories and over 1 million annotated objects.

## [Instance Segmentation Datasets](segment/index.md)

Instance segmentation is a computer vision technique that involves identifying and localizing objects in an image at the pixel level.

* [COCO](segment/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
* [COCO8-seg](segment/coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.

## [Pose Estimation](pose/index.md)

Pose estimation is a technique used to determine the pose of the object relative to the camera or the world coordinate system.

* [COCO](pose/coco.md): A large-scale dataset with human pose annotations designed for pose estimation tasks.
* [COCO8-pose](pose/coco8-pose.md): A smaller dataset for pose estimation tasks, containing a subset of 8 COCO images with human pose annotations.

## [Classification](classify/index.md)

Image classification is a computer vision task that involves categorizing an image into one or more predefined classes or categories based on its visual content.

* [Caltech 101](classify/caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
* [Caltech 256](classify/caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
* [CIFAR-10](classify/cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
* [CIFAR-100](classify/cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
* [Fashion-MNIST](classify/fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
* [ImageNet](classify/imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
* [ImageNet-10](classify/imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
* [Imagenette](classify/imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
* [Imagewoof](classify/imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
* [MNIST](classify/mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.

## [Multi-Object Tracking](track/index.md)

Multi-object tracking is a computer vision technique that involves detecting and tracking multiple objects over time in a video sequence.

* [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations for multi-object tracking tasks.
* [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.
22 changes: 13 additions & 9 deletions docs/datasets/pose/index.md
@@ -1,5 +1,6 @@
---
comments: true
description: Learn how to format your dataset for training YOLO models with Ultralytics YOLO format using our concise tutorial and example YAML files.
---

# Pose Estimation Datasets Overview
@@ -15,26 +16,26 @@ The dataset format used for training YOLO pose models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
    - Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
    - Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
    - Object width and height: The width and height of the object, normalized to be between 0 and 1.
    - Object keypoint coordinates: The keypoints of the object, normalized to be between 0 and 1.

Here is an example of the label format for a pose estimation task:

Format with Dim = 2

```
<class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn>
```

Format with Dim = 3

```
<class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> ... <pxn> <pyn> <pn-visibility>
```

In this format, `<class-index>` is the index of the class for the object, `<x> <y> <width> <height>` are the coordinates of the bounding box, and `<px1> <py1> <px2> <py2> ... <pxn> <pyn>` are the coordinates of the keypoints, normalized to be between 0 and 1. The coordinates are separated by spaces.
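
For instance, a hypothetical Dim = 3 row for one object with two keypoints might look like this (values are illustrative; the visibility flag here follows the COCO convention, where 2 means labeled and visible):

```
0 0.573 0.471 0.263 0.441 0.540 0.322 2 0.607 0.319 2
```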

**Dataset file format**

@@ -62,14 +63,15 @@ The `names` field is a list of the names of the object classes. The order of the
NOTE: Either `nc` or `names` must be defined. Defining both is not mandatory.

Alternatively, you can directly define class names like this:

```yaml
names:
  0: person
  1: bicycle
```

(Optional) If the keypoints are symmetric, a `flip_idx` field is needed, e.g. for the left and right sides of a human body or face.
For example, suppose there are five keypoints of a facial landmark: [left eye, right eye, nose, left point of mouth, right point of mouth], and the original index is [0, 1, 2, 3, 4]; then `flip_idx` is [1, 0, 2, 4, 3] (just exchange the left-right indices, i.e. 0-1 and 3-4, and do not modify the others, like the nose in this example).

**Example**

@@ -86,6 +88,7 @@ flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
```
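
To make the role of `flip_idx` concrete, here is a minimal sketch (not the Ultralytics implementation) of how left-right keypoints can be remapped when an image is flipped horizontally:

```python
import numpy as np

def hflip_keypoints(kpts: np.ndarray, flip_idx: list) -> np.ndarray:
    """kpts: (num_kpts, 2) array of normalized (x, y) keypoint coordinates."""
    flipped = kpts.copy()
    flipped[:, 0] = 1.0 - flipped[:, 0]  # mirror x around the image center
    return flipped[flip_idx]             # swap left/right keypoint slots

# Five hypothetical facial keypoints from the example above
kpts = np.array([[0.30, 0.40],   # left eye
                 [0.70, 0.40],   # right eye
                 [0.50, 0.55],   # nose
                 [0.40, 0.70],   # left mouth corner
                 [0.60, 0.70]])  # right mouth corner
print(hflip_keypoints(kpts, flip_idx=[1, 0, 2, 4, 3]))
```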

## Usage

!!! example ""

=== "Python"
@@ -107,6 +110,7 @@ flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
```

## Supported Datasets

TODO

## Port or Convert label formats
@@ -117,4 +121,4 @@ TODO
```python
from ultralytics.yolo.data.converter import convert_coco

convert_coco(labels_dir='../coco/annotations/', use_keypoints=True)
```
12 changes: 8 additions & 4 deletions docs/datasets/segment/index.md
@@ -1,5 +1,6 @@
---
comments: true
description: Learn about the Ultralytics YOLO dataset format for segmentation models. Use YAML to train Detection Models. Convert COCO to YOLO format using Python.
---

# Instance Segmentation Datasets Overview
@@ -15,23 +16,24 @@ The dataset format used for training YOLO segmentation models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
    - Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
    - Object bounding coordinates: The bounding coordinates around the mask area, normalized to be between 0 and 1.

The format for a single row in the segmentation dataset file is as follows:

```
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
```

In this format, `<class-index>` is the index of the class for the object, and `<x1> <y1> <x2> <y2> ... <xn> <yn>` are the bounding coordinates of the object's segmentation mask. The coordinates are separated by spaces.

Here is an example of the YOLO dataset format for a single image with two object instances:

```
0 0.6812 0.48541 0.67 0.4875 0.67656 0.487 0.675 0.489 0.66
1 0.5046 0.0 0.5015 0.004 0.4984 0.00416 0.4937 0.010 0.492 0.0104
```

Note: Rows do not all have to be the same length; each object contributes as many coordinate pairs as its polygon has points.
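
Because rows can have different lengths, a label file is easiest to read token by token. A minimal parsing sketch (assuming the normalized polygon format above; the function name is hypothetical, not part of the Ultralytics API):

```python
def parse_segment_labels(path):
    """Parse a YOLO segmentation label file into (class_index, polygon) pairs."""
    objects = []
    with open(path) as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue
            cls = int(tokens[0])
            coords = [float(t) for t in tokens[1:]]
            # remaining tokens are x1 y1 x2 y2 ... xn yn, normalized to [0, 1]
            polygon = list(zip(coords[0::2], coords[1::2]))
            objects.append((cls, polygon))
    return objects
```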

**Dataset file format**
Expand All @@ -56,6 +58,7 @@ The `names` field is a list of the names of the object classes. The order of the
NOTE: Either `nc` or `names` must be defined. Defining both is not mandatory.

Alternatively, you can directly define class names like this:

```yaml
names:
  0: person
@@ -73,6 +76,7 @@ names: ['person', 'car']
```

## Usage

!!! example ""

=== "Python"
@@ -103,4 +107,4 @@ names: ['person', 'car']
```python
from ultralytics.yolo.data.converter import convert_coco

convert_coco(labels_dir='../coco/annotations/', use_segments=True)
```