forked from TZstatsADS/ADS_Teaching
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
23 changed files
with
17,023 additions
and
0 deletions.
There are no files selected for viewing
Binary file not shown.
810 changes: 810 additions & 0 deletions
810
Tutorials/wk7-OpenCV_tutorial/.ipynb_checkpoints/Basic Image Analysis-checkpoint.ipynb
Large diffs are not rendered by default.
Oops, something went wrong.
810 changes: 810 additions & 0 deletions
810
Tutorials/wk7-OpenCV_tutorial/Basic Image Analysis.ipynb
Large diffs are not rendered by default.
Oops, something went wrong.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,369 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "integrated-tunnel", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"# Weakly Supervised Learning" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "first-emergency", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Introduction\n", | ||
"\n", | ||
"From Wikipedia - **Supervised learning** is the machine learning task of learning a function that maps an input to an output based on example input-output pairs (i.e. training set). \n", | ||
"\n", | ||
"Most successful techniques, such as deep learning, require ground-truth labels to be given for a big training data\n", | ||
"set; in many tasks, however, it can be difficult to attain strong supervision information due to the high cost of the data-labeling process.\n", | ||
"\n", | ||
"Thus, it is desirable for machine-learning techniques to be able to work with weak supervision. In general, we refer these tasks as weakly supervised learning (WSL)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "liable-activity", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## WSL Settings\n", | ||
"\n", | ||
"**(Strong) supervised learning**: learn a classifier $f: \\mathcal{X} \\mapsto \\mathcal{Y}$ from a training set $D = \\{(x_1,y_1), \\ldots, (x_m, y_m)\\}$, where $\\mathcal{X}$ is the feature space and $\\mathcal{Y} = \\{0,1\\}$ for a binary classification task. Furthermore, we assume that $(x_i,y_i)$ are i.i.d. samples generated from some unknown true distribution $\\mathcal{D}$ on $\\mathcal{X} \\times \\mathcal{Y}$." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "vital-factory", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "fragment" | ||
} | ||
}, | ||
"source": [ | ||
"### Three types of WSL tasks\n", | ||
"\n", | ||
"Zhou(2018) categorize WSL tasks into three types:\n", | ||
"- Incomplete supervision\n", | ||
"- Inexact supervision\n", | ||
"- Inaccurate supervision" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "cultural-realtor", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Incomplete supervision\n", | ||
"\n", | ||
"Incomplete supervision concerns the situation in which we are given a small amount of labeled data, which is insufficient to train a good learner, while abundant unlabeled data are available. \n", | ||
"\n", | ||
"Formally, the task is to learn $f : \\mathcal{X} \\mapsto \\mathcal{Y}$ from a training data set\n", | ||
"$D = \\{(x_1, y_1), \\ldots , (x_l, y_l), x_{l+1}, \\ldots, x_{m}\\}$, where there are $l$ number of labeled training examples (i.e. those given with $y_i$) and $u = m − l$ number of unlabeled instances." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "traditional-signature", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "fragment" | ||
} | ||
}, | ||
"source": [ | ||
"![](./Incomplete.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "monthly-smart", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Incomplete supervision\n", | ||
"\n", | ||
"There are two major techniques for incomplete supervision problems:\n", | ||
"- **Active learning**: there is an ‘oracle’, such as a human expert, that can be queried to get ground-truth labels for selected unlabeled instances.\n", | ||
"- **Semi-supervised learning**: automatically exploit unlabeled data in addition to labeled data to improve learning performance, where no human intervention is assumed" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "damaged-category", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Inexact supervision\n", | ||
"\n", | ||
"Inexact supervision concerns the situation in which some supervision information is given, but not as exact as desired.\n", | ||
"\n", | ||
"Formally, the task is to learn $f : \\mathcal{X} \\mapsto \\mathcal{Y}$ from a training data set $D = \\{(X_1,y_1), \\ldots, (X_m, y_m)\\}$, where $X_i = {x_{i,1}, \\ldots, x_{i,m_i}} \\subset \\mathcal{X}$ is called a bag, $x_{i,j} \\in \\mathcal{X}$ is an instance, $m_i$ is the number of instances in $X_i$. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "suited-condition", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "fragment" | ||
} | ||
}, | ||
"source": [ | ||
"![](./inexact.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "random-vitamin", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Inexact supervision: multi-instance learning\n", | ||
"\n", | ||
"Multi-instance learning is the case where $X_i$ is a positive bag, i.e. $y_i = 1$, if there exists $x_{i,p}$ that is positive, while $p \\in {1, \\ldots, m_i}$ is unknown. The goal is to predict labels for unseen bags. There are also studies trying to identify the key instancethat enables a positive bag to be positive.\n", | ||
"\n", | ||
"Multi-instance learning has been successfully applied to various tasks, including:\n", | ||
"- image categorization/retrieval/annotation\n", | ||
"- text categorization\n", | ||
"- spam detection\n", | ||
"- medical diagnosis\n", | ||
"- face/object detection \n", | ||
"- object class discovery\n", | ||
"- object tracking" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "reserved-commons", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Inaccurate supervision\n", | ||
"\n", | ||
"Inaccurate supervision concerns the situation in which the supervision information is not always ground-truth; in other words, some label information may suffer from errors. In practice, a basic idea is to identify the potentially mislabeled examples, and then try to make some correction. We refer this procedure as \"label correction\". Many WSL methods that address the issue of inaccuarate supervision fall in this category. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "statutory-termination", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "fragment" | ||
} | ||
}, | ||
"source": [ | ||
"![](./Inaccurate.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "persistent-central", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## A typical situation with inaccurate supervision: crowdsourcing\n", | ||
"\n", | ||
"An interesting recent scenario of inaccurate supervision occurs with *crowdsourcing*, a popular paradigm to outsource work to individuals. A famous crowdsourcing system, Amazon Mechanical Turk (AMT), is a market where the user can submit a task, such as annotating images of trees versus non-trees, to be completed by workers in exchange for small monetary payments. \n", | ||
"\n", | ||
"The workers usually come from a large society and each of them is presented with multiple tasks. They are usually independent and relatively inexpensive, and will provide labels based on their own judgments. Among the workers, some\n", | ||
"may be more reliable than others; however, the user usually does not know this in advance because the identities of workers are protected. There may exist ‘spammers’ who assign almost random labels to the tasks (e.g. robots pretendto be a human forthe monetary payment), or ‘adversaries’ who give incorrect answers deliberately." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "arranged-customer", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## Mixed cases\n", | ||
"\n", | ||
"Though the three types of WSL are discussed separately, in practice they often occur simultaneously. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "outstanding-aaron", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"![](./WSL_types.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "defensive-official", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"# An example of WSL methods in image classification" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "covered-stretch", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"\n", | ||
"As mentioned earlier, many different methods exist in the literature of WSL. In this tutorial, we will briefly introduce a simplified version of the method proposed in Section 4 of [Inoue et al. (2017)](https://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w32/Inoue_Multi-Label_Fashion_Image_ICCV_2017_paper.pdf), which addresses the issue of inaccurate supervision." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "facial-oregon", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "fragment" | ||
} | ||
}, | ||
"source": [ | ||
"## Problem setting\n", | ||
"\n", | ||
"Formally, we have a very large training dataset $T$. $T$ consists of tuples of noisy labels $y$ and images $x$, $T =\\{(x_i , y_i), \\ldots,\\}$. Additionally, we have a small dataset $V$ with human verified labels $v$, $V = \\{(x_j , y_j , v_j), \\ldots\\}$. The number of the data in $T$ is significantly larger than that in $V$. For a $d$-class classificaton problem, the $y_j$'s and $v_j$'s are $d$-dimensional one-hot variables that encode the class of the $j$-th object according to its noisy label and verified label. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "postal-costa", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## The two-phase deep learning model\n", | ||
"\n", | ||
"The model is designed to jointly learn to generate accurate labels from noisy labels and to learn a more accurate multilabel classifier from the generated labels. In particular, it consists of two convolutional neural networks (For those who are not familiar with CNN, pytorch provides [a good tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html) among others):\n", | ||
"\n", | ||
"- **Phase 1: label correction**\n", | ||
" \n", | ||
" The classifier $g$ in this phase is called the *label cleaning network*. It learns a mapping from noisy labels $y$ to human-verified labels $v$, conditional on the input image. Its output $c$ denotes the cleaned labels. The classifier $g$ has two separate inputs, the noisy labels $y$ and the visual features $f(x)$. Each input is projected into an embedding by a linear layer and the two are concatenated, then transformed with a hidden linear layer. Finally, $y$ is added to the output by an identity-skip connection and clipped to $[0, 1]$ to remain in the valid label space. For the training, they consider to minimize the $L_1$ loss.\n", | ||
"\n", | ||
"\n", | ||
"\n", | ||
"- **Phase 2: image classication**\n", | ||
"\n", | ||
" The second classifier h is called the image classifier. It learns to predict labels by imitating the first classifier using only the image as input. This network is rather similar to the regular ones except it learns a mapping from input image $x$ to the corrected label $c$ instead of $y$ or $v$. For the training, the cross-entropy loss is minimized. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "drawn-chess", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## The architectures - label correction network\n", | ||
"![](./label_cleaning_net.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "martial-puzzle", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## The architectures - image classifier\n", | ||
"![](./image_clf.png)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "antique-hotel", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "fragment" | ||
} | ||
}, | ||
"source": [ | ||
"For more details of training, please refer to the original paper. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "classified-discharge", | ||
"metadata": { | ||
"slideshow": { | ||
"slide_type": "slide" | ||
} | ||
}, | ||
"source": [ | ||
"## References\n", | ||
"\n", | ||
" - A brief introduction to weakly supervised learning ([Zhou, 2018](https://academic.oup.com/nsr/article/5/1/44/4093912))\n", | ||
" - Multi-Label Fashion Image Classification with Minimal Human Supervision ([Inoue et al., 2017](https://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w32/Inoue_Multi-Label_Fashion_Image_ICCV_2017_paper.pdf))" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"celltoolbar": "Slideshow", | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.8.5" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
Oops, something went wrong.