[SOLVED] CS deep learning algorithm: Impact of Deep Learning, Speech Recognition


Impact of Deep Learning
Speech Recognition
Computer Vision
Recommender Systems
Language Understanding
Drug Discovery and Medical Image Analysis
[Courtesy of R. Salakhutdinov]

Deep Belief Networks: Training
[Hinton & Salakhutdinov, 2006]

Very Large Scale Use of DBNs
[Quoc Le, et al., ICML, 2012]
Data: 10 million 200×200 unlabeled images sampled from YouTube. Training: 1,000 machines (16,000 cores) for 1 week.
Learned network: 3 multi-stage layers, 1.15 billion parameters.
Achieves 15.8% accuracy (previous best: 9.5%) when classifying among roughly 20,000 ImageNet categories.
[Figure: real images that most excite a learned feature, alongside an image synthesized to maximally excite that feature]

Restricted Boltzmann Machines
Graphical models: a powerful framework for representing dependency structure between random variables.
[Figure: RBM as a bipartite graph with visible variables (image pixels) and hidden variables (feature detectors), with pairwise and unary potentials]
An RBM is a Markov random field with:
Stochastic binary visible variables
Stochastic binary hidden variables
Bipartite connections.
Related model classes: Markov random fields, Boltzmann machines, log-linear models.
[Courtesy, R. Salakhutdinov]
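For reference, the standard energy function and conditionals of a binary-binary RBM can be written as follows; the notation W for weights and a, b for visible/hidden biases is an assumption, not taken from the slide.

```latex
% Binary-binary RBM: energy, joint distribution, and factorized conditionals.
\begin{align*}
E(\mathbf{v},\mathbf{h}) &= -\mathbf{a}^\top\mathbf{v} - \mathbf{b}^\top\mathbf{h} - \mathbf{v}^\top W \mathbf{h},
& P(\mathbf{v},\mathbf{h}) &= \tfrac{1}{Z}\, e^{-E(\mathbf{v},\mathbf{h})} \\
% The bipartite structure makes both conditionals factorize:
P(h_j = 1 \mid \mathbf{v}) &= \sigma\big(b_j + \textstyle\sum_i W_{ij} v_i\big),
& P(v_i = 1 \mid \mathbf{h}) &= \sigma\big(a_i + \textstyle\sum_j W_{ij} h_j\big)
\end{align*}
```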

Model Learning
Given a set of i.i.d. training examples $D = \{\mathbf{v}^{(1)}, \ldots, \mathbf{v}^{(N)}\}$, we want to learn the model parameters $\theta = (W, \mathbf{a}, \mathbf{b})$. Maximize the log-likelihood objective:

$\mathcal{L}(\theta) = \frac{1}{N}\sum_{n=1}^{N} \log P(\mathbf{v}^{(n)}; \theta)$

Derivative of the log-likelihood:

$\frac{\partial \mathcal{L}(\theta)}{\partial W_{ij}} = \mathbb{E}_{P_{\text{data}}}[v_i h_j] - \mathbb{E}_{P_{\text{model}}}[v_i h_j]$

That is, a data-dependent expectation (hidden units inferred from training data) minus an expectation under the model's own distribution over visible and hidden units.
[Courtesy, R. Salakhutdinov]
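The model expectation in the gradient above is intractable to compute exactly; in practice it is commonly approximated with contrastive divergence (a standard RBM trainer, not necessarily the exact procedure used in these slides). Below is a minimal NumPy sketch of a CD-1 update; shapes and variable names are hypothetical.

```python
# Minimal sketch of RBM training with 1-step contrastive divergence (CD-1).
# Binary visible and hidden units assumed; names and shapes are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(V, W, a, b, lr=0.05):
    """One CD-1 parameter update on a minibatch V of shape (n, n_visible)."""
    # Positive phase: approximates E_data[v h^T] using P(h | v).
    ph = sigmoid(V @ W + b)                      # P(h=1 | v)
    h = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: one Gibbs step approximates E_model[v h^T].
    pv = sigmoid(h @ W.T + a)                    # P(v=1 | h)
    v_neg = (rng.random(pv.shape) < pv).astype(float)
    ph_neg = sigmoid(v_neg @ W + b)
    n = V.shape[0]
    W += lr * (V.T @ ph - v_neg.T @ ph_neg) / n  # approx. dL/dW
    a += lr * (V - v_neg).mean(axis=0)           # visible bias gradient
    b += lr * (ph - ph_neg).mean(axis=0)         # hidden bias gradient
    return W, a, b

# Toy usage on random binary data (placeholder for real images).
n_visible, n_hidden = 784, 128
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
a, b = np.zeros(n_visible), np.zeros(n_hidden)
V = (rng.random((64, n_visible)) < 0.5).astype(float)
W, a, b = cd1_update(V, W, a, b)
```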

Deep Boltzmann Machines
Learn simpler representations first, then compose more complex ones.
[Figure: Input: pixels; low-level features: edges; higher-level features: combinations of edges; built from unlabeled inputs]
(Salakhutdinov 2008; Salakhutdinov & Hinton, Neural Computation 2012)
[Courtesy, R. Salakhutdinov]

Deep Boltzmann Machines: Model Formulation
[Figure: DBM with visible layer v and hidden layers h1, h2, h3, connected by weight matrices W1, W2, W3]
The building blocks are the same as in RBMs; training requires approximate inference, but it can be done and scales to millions of examples.
[Courtesy, R. Salakhutdinov]
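The approximate inference mentioned above is typically variational mean-field over the hidden layers. Here is a minimal sketch for a two-hidden-layer DBM; layer sizes and variable names are assumptions, not taken from the slides.

```python
# Hypothetical sketch: mean-field inference for a 2-hidden-layer DBM.
# Given an input v, iterate coupled updates for the variational means
# mu1 ~ E[h1 | v] and mu2 ~ E[h2 | v].
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(v, W1, W2, b1, b2, n_iters=10):
    mu2 = np.zeros(W2.shape[1])
    for _ in range(n_iters):
        # h1 receives bottom-up input from v and top-down input from h2.
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T + b1)
        # h2 (top layer) receives input from h1 only.
        mu2 = sigmoid(mu1 @ W2 + b2)
    return mu1, mu2

rng = np.random.default_rng(0)
v = (rng.random(784) < 0.5).astype(float)
W1 = 0.01 * rng.standard_normal((784, 256))
W2 = 0.01 * rng.standard_normal((256, 128))
mu1, mu2 = mean_field(v, W1, W2, np.zeros(256), np.zeros(128))
```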

Samples Generated by the Model
[Figure: training data (left) compared with model-generated samples (right)]
[Courtesy, R. Salakhutdinov]

Handwriting Recognition
Classification error by learning algorithm:

MNIST dataset (60,000 examples of 10 digits; permutation-invariant version):
Logistic regression: 12.0%
K-NN: 3.09%
Neural Net (Platt 2005): 1.53%
SVM (Decoste et al. 2002): 1.40%
Deep Autoencoder (Bengio et al. 2007): 1.40%
Deep Belief Net (Hinton et al. 2006): 1.20%
DBM: 0.95%

Optical Character Recognition (42,152 examples of 26 English letters):
Logistic regression: 22.14%
K-NN: 18.92%
Neural Net: 14.62%
SVM (Larochelle et al. 2009): 9.70%
Deep Autoencoder (Bengio et al. 2007): 10.05%
Deep Belief Net (Larochelle et al. 2009): 9.68%
DBM: 8.40%
[Courtesy, R. Salakhutdinov]
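For comparison, the simplest rows of the MNIST table (logistic regression and k-NN on raw pixels, permutation-invariant) can be approximately reproduced with scikit-learn. This is only a rough sketch; the exact error rates depend on preprocessing and hyperparameters.

```python
# Sketch: permutation-invariant MNIST baselines on raw pixels.
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0
# Standard MNIST split: first 60,000 for training, last 10,000 for testing.
X_train, y_train, X_test, y_test = X[:60000], y[:60000], X[60000:], y[60000:]

for name, clf in [
    ("Logistic regression", LogisticRegression(max_iter=200)),
    ("K-NN (k=3)", KNeighborsClassifier(n_neighbors=3)),
]:
    clf.fit(X_train, y_train)
    err = 1.0 - clf.score(X_test, y_test)
    print(f"{name}: {100 * err:.2f}% test error")
```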

3-D Object Recognition
NORB dataset: 24,000 examples.
Classification error by learning algorithm:
Logistic regression: 22.5%
K-NN (LeCun 2004): 18.92%
SVM (Bengio & LeCun 2007): 11.6%
Deep Belief Net (Nair & Hinton 2009): 9.0%
DBM: 7.2%

Pattern Completion
[Figure: pattern completion examples]
[Courtesy, R. Salakhutdinov]
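Pattern completion with a Boltzmann machine amounts to clamping the observed pixels and Gibbs-sampling the missing ones. A hypothetical sketch for a single trained binary RBM is shown below (the DBM case is analogous but alternates over all layers); W, a, b are assumed to come from a previously trained model.

```python
# Sketch: pattern completion by clamped Gibbs sampling in a trained binary RBM.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def complete(v_partial, observed_mask, W, a, b, n_steps=200):
    """Fill in missing pixels of v_partial; observed_mask marks known pixels."""
    v = v_partial.copy()
    for _ in range(n_steps):
        ph = sigmoid(v @ W + b)
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = sigmoid(h @ W.T + a)
        v_new = (rng.random(pv.shape) < pv).astype(float)
        # Keep observed pixels fixed; only the missing region is resampled.
        v = np.where(observed_mask, v_partial, v_new)
    return v
```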

Learning Shared Representations Across Sensory Modalities
[Figure: an image and its tags (sunset, pacific ocean, baker beach, seashore, ocean) mapped to a shared "concept" representation]
[Courtesy, R. Salakhutdinov]

Multimodal DBM
[Figure: a Gaussian model over dense, real-valued image features and a Replicated Softmax model over word counts, joined through shared hidden layers; inference combines bottom-up and top-down signals]
(Srivastava & Salakhutdinov, NIPS 2012; JMLR 2014)
[Courtesy, R. Salakhutdinov]

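As a rough illustration of how the two pathways meet, here is a hypothetical bottom-up pass that maps dense image features (Gaussian pathway) and word counts (Replicated Softmax pathway) into a shared joint hidden layer. Layer sizes and variable names are assumptions, and real inference in the multimodal DBM also uses top-down feedback.

```python
# Hypothetical sketch: bottom-up pass of a two-pathway multimodal model.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bottom_up(img_feats, word_counts, p):
    # Modality-specific hidden layers.
    h_img = sigmoid(img_feats @ p["W_img"] + p["b_img"])    # Gaussian pathway
    h_txt = sigmoid(word_counts @ p["W_txt"] + p["b_txt"])  # Replicated-Softmax pathway
    # Shared joint layer on top of both pathways.
    return sigmoid(h_img @ p["W_img2"] + h_txt @ p["W_txt2"] + p["b_joint"])

rng = np.random.default_rng(0)
p = {
    "W_img": 0.01 * rng.standard_normal((2048, 1024)), "b_img": np.zeros(1024),
    "W_txt": 0.01 * rng.standard_normal((2000, 1024)), "b_txt": np.zeros(1024),
    "W_img2": 0.01 * rng.standard_normal((1024, 512)),
    "W_txt2": 0.01 * rng.standard_normal((1024, 512)),
    "b_joint": np.zeros(512),
}
img_feats = rng.standard_normal(2048)                 # dense image features
word_counts = rng.poisson(0.1, size=2000).astype(float)  # bag-of-words counts
h = bottom_up(img_feats, word_counts, p)
```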

Text Generated from Images
Given an image, the model generates descriptive tags, for example:
dog, cat, pet, kitten, puppy, ginger, tongue, kitty, dogs, furry
sea, france, boat, mer, beach, river, bretagne, plage, brittany
portrait, child, kid, ritratto, kids, children, boy, cute, boys, italy
insect, butterfly, insects, bug, butterflies, lepidoptera
graffiti, streetart, stencil, sticker, urbanart, graff, sanfrancisco
canada, nature, sunrise, ontario, fog, mist, bc, morning
[Courtesy, R. Salakhutdinov]

More examples of tags generated for given images:
portrait, women, army, soldier, mother, postcard, soldiers
obama, barackobama, election, politics, president, hope, change, sanfrancisco, convention, rally
water, glass, beer, bottle, drink, wine, bubbles, splash, drops, drop

Images Selected from Text
Given a text query, the model retrieves matching images; example queries:
water, red, sunset
nature, flower, red, green
blue, green, yellow, colors
chocolate, cake
[Courtesy, R. Salakhutdinov]

Summary
Efficient learning algorithms for deep learning models; learning more adaptive, robust, and structured representations.
[Figure: example applications, including text and image retrieval / object recognition, speech recognition (HMM decoder), image tagging (e.g., mosque, tower, building, cathedral, dome, castle), multimodal data (e.g., sunset, pacific ocean, beach, seashore), learning a category hierarchy, and caption generation]
Deep models improve the current state of the art in many application domains: object recognition and detection, text and image retrieval, handwritten character and speech recognition, and others.
[Courtesy, R. Salakhutdinov]
