
Computer Vision
Machine learning basics and recognition
Semester 1
Changjae Oh


Objectives
To understand machine learning basics for high-level vision problems

Machine learning problems
Slide credit: J. Hays


Dimensionality reduction: Principal Component Analysis (PCA)
PCA takes advantage of correlations in data dimensions to produce the best possible lower-dimensional representation based on linear projections (minimizes reconstruction error).
PCA should be used for dimensionality reduction, not for discovering patterns or making predictions. Don't try to assign semantic meaning to the bases.
Other methods: Locally Linear Embedding, Isomap, Autoencoder, etc.
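As a minimal sketch (not from the slides), PCA-based dimensionality reduction can be run with scikit-learn; the data shape and number of components below are assumed purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 500 samples, each a 64-D feature vector (e.g., a flattened 8x8 patch)
X = np.random.rand(500, 64)

# Linear projection onto the top 10 principal components
pca = PCA(n_components=10)
X_low = pca.fit_transform(X)          # (500, 10) lower-dimensional representation
X_rec = pca.inverse_transform(X_low)  # best linear reconstruction from those components

print(X_low.shape, np.mean((X - X_rec) ** 2))  # reconstruction error that PCA minimizes
```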

Machine learning problems

K-means clustering
Figure: original image; clusters on intensity; clusters on color

Mean shift algorithm

Spectral clustering
Group points based on links in a graph

Visual PageRank
Determining importance by random walk
What's the probability that you will randomly walk to a given node?
Create an adjacency matrix based on visual similarity; edge weights determine the probability of transition
C. Oh et al., Probabilistic Correspondence Matching using Random Walk with Restart, BMVC 2012
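The slides only name the idea; as a rough sketch (not the method from the BMVC 2012 paper), node importance under a random walk with restart on a similarity graph can be estimated by power iteration. The adjacency matrix and restart probability below are made up for illustration.

```python
import numpy as np

# Hypothetical adjacency matrix built from visual similarity (edge weights)
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

P = W / W.sum(axis=1, keepdims=True)   # row-normalize: transition probabilities
restart = 0.15                          # probability of restarting the walk
r = np.full(len(W), 1.0 / len(W))       # restart (and initial) distribution

p = r.copy()
for _ in range(100):                    # power iteration until (approximate) convergence
    p = (1 - restart) * (p @ P) + restart * r

print(p)  # probability of randomly walking to each node = its importance
```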

Machine learning problems

The machine learning framework
Apply a prediction function to a feature representation of the image to get the desired output:
f( ) = apple
f( ) = tomato
f( ) = cow
Slide credit: L. Lazebnik

Machine learning framework
y = f(x), where y is the output, f is the prediction function, and x is the image feature
Training: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the prediction function f by minimizing the prediction error on the training set
Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
Slide credit: L. Lazebnik
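A minimal sketch of this train/test procedure in scikit-learn style; the features, labels, and choice of logistic regression are placeholders, not part of the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical labeled training set {(x1, y1), ..., (xN, yN)}
X_train = np.random.rand(100, 16)         # N = 100 feature vectors
y_train = np.random.randint(0, 2, 100)    # binary labels

clf = LogisticRegression().fit(X_train, y_train)  # training: estimate f from the training set

x_test = np.random.rand(1, 16)            # a never-before-seen test example
print(clf.predict(x_test))                # testing: y = f(x)
```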

Machine learning framework
Training: Training Images → Image Features + Training Labels → Classifier Training → Trained Classifier
Testing: Test Image → Image Features → Trained Classifier → Prediction

Example image features: raw pixels, histograms, GIST descriptors, CNN features

Learning a classifier
Given some set of features with corresponding labels, learn a function to predict the labels from the features

Many classifiers to choose from
Neural networks
Naive Bayes
K-nearest neighbour
Bayesian network
Logistic regression
Randomized Forests
Boosted Decision Trees
Deep Convolutional Network

Classifiers: Nearest neighbor
Training examples from class 1
Test example
Training examples from class 2
f(x) = label of the training example nearest to x
All we need is a distance function for our inputs
No training required!
Slide credit: S. Lazebnik
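A tiny sketch of this rule (Euclidean distance and toy 2-D data assumed), underscoring that only a distance function is needed:

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, x):
    dists = np.linalg.norm(X_train - x, axis=1)  # distance from x to every training example
    return y_train[np.argmin(dists)]             # label of the nearest one

# Hypothetical training examples from class 0 and class 1
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])

print(nearest_neighbor_predict(X_train, y_train, np.array([0.95, 0.9])))  # -> 1
```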

Classifiers: Linear
Find a linear function to separate the classes: f(x) = sgn(w · x + b)
Slide credit: L. Lazebnik

Example: Image Classification by K-NN
Image → Feature Extraction (HOG) → Image Feature → K-NN Classifier → Prediction

Recognition task and supervision
Images in the training set must be annotated with the correct answer that the model is expected to produce
Contains a motorbike
Slide credit: S. Lazebnik

Spectrum of supervision
Computer vision
Supervised learning, semi-supervised learning, unsupervised learning, reinforcement learning

Spectrum of supervision
Slide credit: S. Lazebnik

Generalisation
How well does a learned model generalise from the data it was trained on to a new test set?
Training set (labels known) vs. test set (labels unknown)

Generalisation
How well does a learned model generalise from the data it was trained on to a new test set?

EBU7240 Computer Vision
Changjae Oh
Classification
Semester 1, 2021

Overview of recognition tasks
A statistical learning approach
Classic or shallow classification pipeline
Bag of features representation
Classifiers: nearest neighbor, linear, SVM

Verification/Classification
Is this a building?
Adapted from Fei-Fei Li

Detection
Where are the people?
Adapted from Fei-Fei Li

Identification
Is this ?
Adapted from Fei-Fei Li

Semantic Segmentation
Adapted from Fei-Fei Li

Object recognition
A collection of related tasks for identifying objects in digital photographs.
Consists of recognizing, identifying, and locating objects within a picture with a given degree of confidence.
Tasks: image classification, object detection, semantic segmentation, instance segmentation

Image classification vs Object detection
Image classification
Identifying what is in the image and the associated level of confidence; can be binary-label or multi-label classification
Object detection
Localising and classifying one or more objects in an image (object localisation + image classification)

Semantic segmentation vs Instance segmentation
Semantic segmentation
Assigning a label to every pixel in the image.
Treating multiple objects of the same class as a single entity
Instance segmentation
A similar process to semantic segmentation, but identifies, for each pixel, the object instance it belongs to.
Treating multiple objects of the same class as distinct individual objects (or instances)
Typically, instance segmentation is harder than semantic segmentation

Image classification

The machine learning framework
Apply a prediction function to a feature representation of the image to get the desired output:
f( ) = apple
f( ) = tomato
f( ) = cow
Slide credit: L. Lazebnik

Machine learning framework
y = f(x), where y is the output, f is the prediction function, and x is the image feature
Training: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the prediction function f by minimizing the prediction error on the training set
Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
Slide credit: S. Lazebnik

Machine learning framework
Training: Training Images → Image Features + Training Labels → Classifier Training → Trained Classifier
Testing: Test Image → Image Features → Trained Classifier → Prediction

Classic recognition pipeline
Hand-crafted feature representation
Off-the-shelf trainable classifier
Image pixels → Feature representation → Trainable classifier → Class label

Classic representation: Bag of features
Representing images as orderless collections of local features

Motivation 1: Part-based models
Various parts of the image are used separately to determine if and where an object of interest exists
Weber, Welling & Perona (2000), Fergus, Perona & Zisserman (2003)

Motivation 2: Texture models
Texture is characterised by the repetition of basic elements or textons
Texton histogram
Texton dictionary
Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001; Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Motivation 3: Bags of words
Orderless document representation: frequencies of words from a dictionary
Salton & McGill (1983)


Bag of features: Outline
1. Extract local features
2. Learn visual vocabulary
3. Quantize local features using visual vocabulary
4. Represent images by frequencies of visual words
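A rough end-to-end sketch of these four steps (random vectors stand in for real local descriptors such as SIFT, and the vocabulary size is assumed):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# 1. Extract local features: fake 128-D descriptors stand in for SIFT/HOG patches
train_descriptors = rng.random((2000, 128))   # pooled from many training images
image_descriptors = rng.random((200, 128))    # descriptors from one image

# 2. Learn the visual vocabulary by clustering the training descriptors
K = 50
vocab = KMeans(n_clusters=K, n_init=10, random_state=0).fit(train_descriptors)

# 3. Quantize the image's local features: nearest visual word per descriptor
words = vocab.predict(image_descriptors)

# 4. Represent the image by frequencies of visual words (normalized histogram)
bof = np.bincount(words, minlength=K).astype(float)
bof /= bof.sum()
print(bof.shape)   # (50,) orderless bag-of-features representation
```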

1. Local feature extraction
Sample patches and extract descriptors

2. Learning the visual vocabulary
Extracted descriptors from the training set → Clustering → Visual vocabulary

K-means clustering
Want to minimize the sum of squared Euclidean distances between features xi and their nearest cluster centers mk:
D(X, M) = Σ_(cluster k) Σ_(point i in cluster k) (xi − mk)²
Algorithm:
Randomly initialize K cluster centers
Iterate until convergence:
  Assign each feature to the nearest center
  Recompute each cluster center as the mean of all features assigned to it
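A minimal NumPy sketch of this algorithm (the data, K, and the convergence test are chosen for illustration):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), K, replace=False)]       # randomly initialize K centers
    for _ in range(n_iters):
        # Assign each feature to the nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of the features assigned to it
        new_centers = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                                else centers[k] for k in range(K)])
        if np.allclose(new_centers, centers):                # converged
            break
        centers = new_centers
    return centers, labels

centers, labels = kmeans(np.random.rand(300, 2), K=3)
print(centers)
```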

Visual vocabularies
Appearance codebook
Source: B. Leibe

Bag of features: Outline
1. Extract local features
2. Learn visual vocabulary
3. Quantize local features using visual vocabulary
4. Represent images by frequencies of visual words

Classic recognition pipeline
Hand-crafted feature representation
Trainable classifier
Nearest neighbor classifiers, support vector machines
Image pixels → Feature representation → Trainable classifier → Class label

Classifiers: Nearest neighbor
Training examples from class 1
Test example
Training examples from class 2
f(x) = label of the training example nearest to x
All we need is a distance or similarity function for our inputs
No training required!

Functions for comparing histograms
L1 distance: D(h1, h2) = Σ_(i=1..N) |h1(i) − h2(i)|
χ² distance: D(h1, h2) = Σ_(i=1..N) (h1(i) − h2(i))² / (h1(i) + h2(i))
Quadratic distance (cross-bin distance): D(h1, h2) = Σ_(i,j) Aij (h1(i) − h2(j))²
Histogram intersection (similarity function): I(h1, h2) = Σ_(i=1..N) min(h1(i), h2(i))
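A short sketch of the bin-to-bin functions above for two normalized histograms (the example histograms and the small epsilon for empty bins are assumptions):

```python
import numpy as np

def l1_distance(h1, h2):
    return np.sum(np.abs(h1 - h2))

def chi2_distance(h1, h2, eps=1e-10):
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))   # eps guards against empty bins

def histogram_intersection(h1, h2):
    return np.sum(np.minimum(h1, h2))                 # similarity, not a distance

h1 = np.array([0.2, 0.5, 0.3])
h2 = np.array([0.3, 0.3, 0.4])
print(l1_distance(h1, h2), chi2_distance(h1, h2), histogram_intersection(h1, h2))
```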

K-nearest neighbor classifier
For a new point, find the k closest points from the training data
Vote for the class label with the labels of the k points
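A compact k-NN sketch with majority voting (Euclidean distance, toy data, and k = 3 are assumptions):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)              # distance to all training points
    nearest = np.argsort(dists)[:k]                          # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]    # vote for the class label

X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.5, 5.0])))   # -> 1
```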

K-nearest neighbor classifier
Which classifier is more robust to outliers?
Credit: http://cs231n.github.io/classification/

Linear classifiers
Find a linear function to separate the classes: f(x) = sgn(w · x + b)
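A minimal sketch of evaluating f(x) = sgn(w · x + b); the weights and bias are placeholder values rather than learned ones:

```python
import numpy as np

w = np.array([1.5, -2.0])   # hypothetical learned weight vector
b = 0.25                    # hypothetical learned bias

def linear_classify(x):
    return np.sign(w @ x + b)   # +1 on one side of the hyperplane, -1 on the other

print(linear_classify(np.array([1.0, 0.2])), linear_classify(np.array([-1.0, 1.0])))
```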

Visualizing linear classifiers
Example learned weights at the end of learning for CIFAR-10.
Credit: http://cs231n.github.io/classification/

Nearest neighbor vs. linear classifiers
NN pros:
Simple to implement
Decision boundaries not necessarily linear
Works for any number of classes
Nonparametric method
NN cons:
Need good distance function
Slow at test time
Linear pros:
Low-dimensional parametric representation
Very fast at test time
Linear cons:
Works for two classes
How to train the linear function?
What if data is not linearly separable?

Linear classifiers
When the data is linearly separable, there may be more than one separator (hyperplane)
Which separator is the best?

Support vector machines
Find a hyperplane that maximizes the margin between the positive and negative examples
xi positive (yi = 1): xi · w + b ≥ 1
xi negative (yi = −1): xi · w + b ≤ −1
For support vectors: |xi · w + b| = 1
Distance between a point and the hyperplane: |xi · w + b| / ||w||
Therefore, the margin is 2 / ||w||
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin hyperplane
1. Maximize margin 2 / ||w||
2. Correctly classify all training data:
xi positive (yi = 1): xi · w + b ≥ 1
xi negative (yi = −1): xi · w + b ≤ −1
Quadratic optimization problem:
min over w, b:  (1/2) ||w||²   subject to   yi (w · xi + b) ≥ 1
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

SVM parameter learning
Separable data:
min over w, b:  (1/2) ||w||²   subject to   yi (w · xi + b) ≥ 1
(maximize margin; classify training data correctly)
Non-separable data:
min over w, b:  (1/2) ||w||² + C Σi max(0, 1 − yi (w · xi + b))
(maximize margin; minimize classification mistakes)

SVM parameter learning
min over w, b:  (1/2) ||w||² + C Σi max(0, 1 − yi (w · xi + b))
Demo: http://cs.stanford.edu/people/karpathy/svmjs/demo
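As an illustrative sketch of this objective (not the demo's own code), the regularized hinge loss can be minimized by subgradient descent; the synthetic data, learning rate, and C below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # synthetic, roughly separable labels

w, b, C, lr = np.zeros(2), 0.0, 1.0, 0.01
for _ in range(500):                          # subgradient step on 1/2||w||^2 + C*sum(hinge)
    viol = y * (X @ w + b) < 1                # points with a positive hinge loss
    grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
    grad_b = -C * y[viol].sum()
    w, b = w - lr * grad_w, b - lr * grad_b

print(w, b, np.mean(np.sign(X @ w + b) == y))   # weights, bias, training accuracy
```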

Nonlinear SVMs
General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable

Nonlinear SVMs
Linearly separable dataset in 1D:
Non-separable dataset in 1D:
We can map the data to a higher-dimensional space, e.g. x → (x, x²)
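A tiny sketch of that 1-D example with assumed data: points that are not linearly separable on the line become separable after the mapping x → (x, x²).

```python
import numpy as np

# 1-D dataset: class +1 in the middle, class -1 on both sides (not separable by one threshold)
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([-1, -1, 1, 1, 1, -1, -1])

phi = np.stack([x, x ** 2], axis=1)   # map to 2-D feature space: phi(x) = (x, x^2)

# In the lifted space, the horizontal line x^2 = 2.5 separates the two classes
print(np.all(np.sign(2.5 - phi[:, 1]) == y))   # True
```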

SVMs: Pros and cons
Pros:
The non-linear SVM framework is very powerful and flexible
Training is convex optimization; a globally optimal solution can be found
SVMs work very well in practice, even with very small training sample sizes
Cons:
No direct multi-class SVM; must combine two-class SVMs (e.g., one-vs-others)
Computation and memory cost (especially for nonlinear SVMs)

