3
12
MovieLens Spark
3.1
34
5
6
7
1 1
8910
Hofmann EM
3.2
3.2.1
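For two events $X$ and $Y$, let $p(X)$ be the prior probability of $X$ and $P(X \mid Y)$ the posterior probability of $X$ given $Y$. Bayes' theorem gives:
$$P(X \mid Y) = \frac{p(Y \mid X)\, p(X)}{p(Y)} \tag{3.1}$$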
$$P(a_1, a_2, \ldots, a_n \mid c_i) = \prod_{j=1}^{n} P(a_j \mid c_i) \tag{3.2}$$
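Let the sample to be classified be $x = \{a_1, a_2, \ldots, a_n\}$ and the set of classes be $\{C_1, C_2, \ldots, C_m\}$. Given the likelihood $P(x \mid c_i)$, the posterior $P(c_i \mid x)$ is:
$$P(c_i \mid x) = \frac{P(x \mid c_i)\, P(c_i)}{P(x)} \tag{3.3}$$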
$$C^{*}(x) = \arg\max_{c_i} P(a_1, a_2, \ldots, a_n \mid c_i)\, P(c_i) \tag{3.4}$$
Substituting (3.2):
$$C^{*}(x) = \arg\max_{c_i} P(c_i) \prod_{j=1}^{n} P(a_j \mid c_i) \tag{3.5}$$
$I = \{i_1, i_2, \ldots, i_n\}$ $\{a_1, a_2, \ldots, a_n\}$ $c_i$
1
2 $I$ $\{a_1, a_2, \ldots, a_n\}$ $P(I_i \mid c_i)$
3 $P(c_i \mid I_1, \ldots, I_n)$,
$$c^{*} = \arg\max_{c_i} P(c_i \mid I_1 = i_1, \ldots, I_n = i_n) = \arg\max_{c_i} P(c_i) \prod_{j=1}^{n} P(a_j \mid c_i) \tag{3.6}$$
4
3.1
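As a minimal sketch of the decision rule (3.5), assuming categorical attributes and plain relative-frequency estimates without smoothing. The names `Sample` and `classify` and the string-valued attributes are illustrative assumptions, not taken from the original implementation.

```scala
object NaiveBayesSketch {
  // One training sample: attribute values a_1..a_n and its class label c.
  case class Sample(attrs: Seq[String], label: String)

  // Returns the class that maximises P(c) * prod_j P(a_j | c), as in (3.5).
  // All probabilities are relative frequencies counted on the training set.
  def classify(train: Seq[Sample], attrs: Seq[String]): String = {
    val scores = train.groupBy(_.label).map { case (c, rows) =>
      val prior = rows.size.toDouble / train.size
      val likelihood = attrs.zipWithIndex.map { case (a, j) =>
        rows.count(_.attrs(j) == a).toDouble / rows.size
      }.product
      c -> prior * likelihood
    }
    scores.maxBy(_._2)._1
  }
}
```

Because no smoothing is applied, an attribute value never seen with a class drives that class's score to zero; a real system would probably need to handle that case.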
3.2.2
11
–
12
13
14
15
16
3.2.3
SVD
ALS
SVD SVD++ [17][18] SVD++
SVD
SVD
ALS
Pan R, Zhou Y Netflix Prize
[19] ALS
ALS
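Let $R$ be the $m \times n$ user-item rating matrix and $X$ its low-rank approximation, as in (3.7):
$$R \approx X = U V^{T} \tag{3.7}$$
where $U \in C^{m \times d}$, $V \in C^{n \times d}$, $d$ is the number of features, $d \le r \le \min(m, n)$, and $r$ is the rank of $R$.
Simon Funk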
$$L(U, V) = \sum_{ij} (R_{ij} - X_{ij})^2 = \sum_{ij} (R_{ij} - U_{i\cdot} V_{j\cdot}^{T})^2 \tag{3.8}$$
(3.8)
$$L(U, V) = \sum_{ij} \left[ (R_{ij} - U_{i\cdot} V_{j\cdot}^{T})^2 + \lambda \left( \|U_{i\cdot}\|_2^2 + \|V_{j\cdot}\|_2^2 \right) \right] \tag{3.9}$$
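Fixing $V$ and setting $\partial L(U, V) / \partial U_{i\cdot} = 0$ gives $U_{i\cdot}$:
$$U_{i\cdot} = R_{i\cdot} V_{u_i} \left( V_{u_i}^{T} V_{u_i} + \lambda n_{u_i} I \right)^{-1}, \quad i \in [1, m] \tag{3.10}$$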
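where $R_{i\cdot}$ is the vector of ratings given by user $i$, $V_{u_i}$ consists of the rows of $V$ for the items rated by user $i$, $I$ is the $d \times d$ identity matrix, and $n_{u_i}$ is the number of items rated by user $i$. Similarly, fixing $U$ gives $V_{j\cdot}$: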
$$V_{j\cdot} = R_{\cdot j}^{T} U_{m_j} \left( U_{m_j}^{T} U_{m_j} + \lambda n_{m_j} I \right)^{-1}, \quad j \in [1, n] \tag{3.11}$$
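where $R_{\cdot j}$ is the vector of ratings received by item $j$, $U_{m_j}$ consists of the rows of $U$ for the users who rated item $j$, and $n_{m_j}$ is the number of users who rated item $j$.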
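Initialize $V$ with small random values (0, 0.01) and update $U$ using (3.10);
then update $V$ using (3.11), and repeat until the RMSE converges.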
R
d R X 3.2
3.2
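The alternating updates (3.10) and (3.11) can be sketched for the special case $d = 1$, where the matrix inverse reduces to a scalar division. This is an illustrative simplification, not the implementation used in the experiments: the names `Rating`, `updateU`, `updateV`, `rmse` and `fit` are assumptions, and $V$ starts from a constant rather than random values so that the example is deterministic.

```scala
object AlsRankOneSketch {
  // An observed rating: (user index i, item index j, value R_ij).
  type Rating = (Int, Int, Double)

  // d = 1 case of (3.10): U_i = sum_j R_ij V_j / (sum_j V_j^2 + lambda * n_ui),
  // where j runs over the items rated by user i.
  def updateU(rs: Seq[Rating], v: Array[Double], m: Int, lambda: Double): Array[Double] =
    Array.tabulate(m) { i =>
      val rated = rs.filter(_._1 == i)
      val den = rated.map { case (_, j, _) => v(j) * v(j) }.sum + lambda * rated.size
      if (den == 0.0) 0.0 else rated.map { case (_, j, r) => r * v(j) }.sum / den
    }

  // d = 1 case of (3.11), symmetric to updateU with users and items swapped.
  def updateV(rs: Seq[Rating], u: Array[Double], n: Int, lambda: Double): Array[Double] =
    Array.tabulate(n) { j =>
      val raters = rs.filter(_._2 == j)
      val den = raters.map { case (i, _, _) => u(i) * u(i) }.sum + lambda * raters.size
      if (den == 0.0) 0.0 else raters.map { case (i, _, r) => r * u(i) }.sum / den
    }

  // Root mean squared error over the observed ratings.
  def rmse(rs: Seq[Rating], u: Array[Double], v: Array[Double]): Double =
    math.sqrt(rs.map { case (i, j, r) => math.pow(r - u(i) * v(j), 2) }.sum / rs.size)

  // Alternate the two updates for a fixed number of rounds.
  def fit(rs: Seq[Rating], m: Int, n: Int, lambda: Double, rounds: Int): (Array[Double], Array[Double]) = {
    var v = Array.fill(n)(1.0) // constant start, an assumption made for determinism
    var u = Array.fill(m)(0.0)
    for (_ <- 1 to rounds) {
      u = updateU(rs, v, m, lambda)
      v = updateV(rs, u, n, lambda)
    }
    (u, v)
  }
}
```

For general $d$, each update solves a $d \times d$ linear system per user or item instead of performing a scalar division.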
3.3
3.3.1
3.3.2
1)
$I = \{i_1, i_2, \ldots, i_n\}$ $\{N_1, N_2, \ldots, N_n\}$
1
2 $I$ $\{N_1, N_2, \ldots, N_n\}$
$P(a_i \mid c)$
3
4 23 $P(c \mid a_1, \ldots, a_n)$
$$P(C = c \mid A_1 = a_1, \ldots, A_n = a_n) = \alpha P(C = c) \prod_{i=1}^{n} P(A_i = a_i \mid C = c)$$
5
2)
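$$sim(i, j) = \frac{\sum_{u=1}^{m} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u=1}^{m} (R_{u,i} - \bar{R}_i)^2}\, \sqrt{\sum_{u=1}^{m} (R_{u,j} - \bar{R}_j)^2}}$$
where $u \in [1, m]$ indexes the users and $i, j$ are items.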
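Let $J$ be the set of the $k$ nearest neighbors of item $i$. The predicted rating of user $u$ for item $i$ is:
$$P_{u,i} = \bar{R}_i + \frac{\sum_{j \in J} (R_{u,j} - \bar{R}_j)\, sim(i, j)}{\sum_{j \in J} |sim(i, j)|}$$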
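The item-based similarity and prediction formulas of this subsection can be sketched in plain Scala as follows. This is an illustrative sketch under stated assumptions: ratings are held in an in-memory `Map` keyed by (user, item), the similarity sum runs over the users who rated both items, and the names `Ratings`, `itemMean`, `sim` and `predict` are hypothetical. The target item must already have at least one rating for its mean to be defined.

```scala
object ItemCfSketch {
  // Observed ratings keyed by (user, item).
  type Ratings = Map[(String, String), Double]

  // Mean rating of an item over the users who rated it.
  def itemMean(r: Ratings, item: String): Double = {
    val xs = r.toSeq.collect { case ((_, i), v) if i == item => v }
    xs.sum / xs.size
  }

  // Similarity of items i and j, summed over the users who rated both.
  def sim(r: Ratings, i: String, j: String): Double = {
    val users = r.keys.toSeq.collect { case (u, it) if it == i && r.contains((u, j)) => u }
    val (mi, mj) = (itemMean(r, i), itemMean(r, j))
    val num = users.map(u => (r((u, i)) - mi) * (r((u, j)) - mj)).sum
    val den = math.sqrt(users.map(u => math.pow(r((u, i)) - mi, 2)).sum) *
      math.sqrt(users.map(u => math.pow(r((u, j)) - mj, 2)).sum)
    if (den == 0.0) 0.0 else num / den
  }

  // Predicted rating of user u for item i, using the k most similar items u has rated.
  def predict(r: Ratings, u: String, i: String, k: Int): Double = {
    val rated = r.keys.toSeq.collect { case (usr, j) if usr == u && j != i => j }
    val top = rated.map(j => (j, sim(r, i, j))).sortBy(-_._2).take(k)
    val den = top.map { case (_, s) => math.abs(s) }.sum
    val num = top.map { case (j, s) => (r((u, j)) - itemMean(r, j)) * s }.sum
    itemMean(r, i) + (if (den == 0.0) 0.0 else num / den)
  }
}
```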
3)
$$sim(a, b) = \frac{\sum_{i=1}^{t} R_{a,i} R_{b,i}}{\sqrt{\sum_{i=1}^{t} (R_{a,i})^2}\, \sqrt{\sum_{i=1}^{t} (R_{b,i})^2}}$$
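Let $L$ be the set of nearest neighbors of user $a$. The predicted rating of user $a$ for item $i$ is:
$$P_{a,i} = \bar{R}_a + \frac{\sum_{b \in L} (R_{b,i} - \bar{R}_b)\, sim(a, b)}{\sum_{b \in L} |sim(a, b)|}$$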
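The user-based cosine similarity and neighbor-weighted prediction of this subsection can be sketched as below. The sketch assumes the two rating vectors are already aligned on the $t$ items both users rated. The names `cosine` and `predict` and the tuple layout for neighbors are illustrative assumptions.

```scala
object UserCfSketch {
  // Cosine similarity of two users' rating vectors over their co-rated items.
  def cosine(x: Seq[Double], y: Seq[Double]): Double = {
    val dot = x.zip(y).map { case (a, b) => a * b }.sum
    val norm = math.sqrt(x.map(a => a * a).sum) * math.sqrt(y.map(b => b * b).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }

  // Prediction for user a: each neighbor b contributes
  // (R_{b,i}, mean rating of b, sim(a, b)).
  def predict(meanA: Double, neighbors: Seq[(Double, Double, Double)]): Double = {
    val den = neighbors.map { case (_, _, s) => math.abs(s) }.sum
    val num = neighbors.map { case (rbi, meanB, s) => (rbi - meanB) * s }.sum
    if (den == 0.0) meanA else meanA + num / den
  }
}
```

When no neighbor has a non-zero similarity, the sketch falls back to the user's own mean rating, which is a design choice of this example rather than something stated in the text.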
3.3.4
Spark
–
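import org.apache.spark.{SparkConf, SparkContext}
import scala.collection.mutable

// Create the Spark context for the cold-start job
val conf = new SparkConf().setAppName("ColdStart")
val sc = new SparkContext(conf)
// rating =
// user_list
// Similarity of each pair of items, keyed by the item pair
val item_sim = new mutable.HashMap[(String, String), Double]()
// Predicted ratings, stored as (String, String, Double) triples
val pred_rating = new mutable.ArrayBuffer[(String, String, Double)]()
// Set of nearest neighbors
val neighbors = new mutable.HashSet[String]()
// Dot product of the two rating vectors vec1 and vec2
val s = vec1.zip(vec2).map { case (f1, f2) => f1 * f2 }.sum
// Distribute the predicted ratings as an RDD
val pred_rating_rdd = sc.parallelize(pred_rating)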
3.4
3.4.1
1)
The Spark cluster consists of PCs: 1 Master node and 2 Slave nodes, all running Ubuntu 14.10, as listed in Table 1.
Table 1
Hostname        IP               Roles
SparkMaster     192.168.1.110    NameNode, JobTracker
SparkWorker1    192.168.1.113    DataNode, TaskTracker
SparkWorker2    192.168.1.101    DataNode, TaskTracker
2)
Java JDK: jdk-7u75-linux-i586.gz
Hadoop: hadoop-2.6.0.tar.gz
SSH
Hadoop, Spark: Spark 1.2.0
IntelliJ IDEA 14.0.3
Scala: scala-sdk-2.10.4
Spark
3.4.2
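The experiments use the MovieLens dataset [20], which contains 100000 ratings on a scale of 1 to 5, given by 943 users to 1682 movies.
80% of the data (80000 ratings) is used as the training set,
and the remaining 20000 ratings as the test set.
The sparsity of the dataset is $\frac{943 \times 1682 - 100000}{943 \times 1682} \times 100\% = 93.7\%$.
The user-item rating matrix has size 943 × 1682.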
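The sparsity figure can be checked with a few lines of Scala. The function name `sparsity` is an illustrative assumption; the inputs are the user, movie and rating counts quoted for the dataset.

```scala
object SparsitySketch {
  // Fraction of user-item cells that carry no rating.
  def sparsity(users: Int, items: Int, ratings: Int): Double = {
    val cells = users.toLong * items
    (cells - ratings).toDouble / cells
  }
}
```

Calling `SparsitySketch.sparsity(943, 1682, 100000)` gives about 0.937, that is 93.7%.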
3.4.3
1)
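Mean absolute error (MAE) is used to measure prediction accuracy. Given $n$ predicted ratings $\{p_1, p_2, \ldots, p_n\}$ and the corresponding actual ratings $\{r_1, r_2, \ldots, r_n\}$, MAE is defined as:
$$MAE = \frac{\sum_{i=1}^{n} |p_i - r_i|}{n}$$
The smaller the MAE, the more accurate the predictions.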
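A minimal sketch of the MAE computation, assuming predicted and actual ratings are given as two sequences of equal length. The name `mae` is illustrative.

```scala
object MaeSketch {
  // Mean absolute error between predicted ratings p_i and actual ratings r_i.
  def mae(predicted: Seq[Double], actual: Seq[Double]): Double = {
    require(predicted.nonEmpty && predicted.size == actual.size,
      "predicted and actual must be non-empty and of equal length")
    predicted.zip(actual).map { case (p, r) => math.abs(p - r) }.sum / predicted.size
  }
}
```

For example, predictions (4, 3, 5) against actual ratings (5, 3, 3) have absolute errors 1, 0 and 2, so the MAE is 1.0.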
2)
PCF
ICF
MCF MAE
3.3
3.3 MovieLens MAE
MAE k
MAE
k
3.5
MovieLens
Spark
1 Schein A I, Popescul A, Ungar L H, et al. Methods and metrics for cold-start
recommendations[C] // Proceedings of the International ACM SIGIR Conference on Research
and Development in Information Retrieval. 2002: 253-260.
2 Guo H. SOAP: Live Recommendations through Social Agents[J]. Delos Workshop on Filtering
& Collaborative Filtering, 1998.
3 Schein A I, Popescul A, Ungar L H, et al. Methods and metrics for cold-start
recommendations[C] // Proceedings of the International ACM SIGIR Conference on Research
and Development in Information Retrieval. 2002: 253-260.
4 Guo H. SOAP: Live Recommendations through Social Agents[J]. Delos Workshop on Filtering
& Collaborative Filtering, 1998.
5 , , . [J]. , 2012(05):59-63.
[Figure: MAE (y-axis, 0.7 to 1) against the number of nearest neighbors k (x-axis, 5 to 30) for ICF, UCF, MCF and PCF]
6 D : 2008
7 D 2005
8 Balabanovic M, Shoham Y. Fab: Content-based, collaborative recommendation[J]. Communications of the
ACM, 1997, 40(3): 66-72.
9 Hofmann T, Puzicha J. Latent class models for collaborative filtering[C] // IJCAI 99. 1999: 688-693.
10 Pennock D, Horvitz E, Lawrence S, et al. Collaborative filtering by personality diagnosis: A
hybrid memory- and model-based approach[C] // UAI 00. 2000: 473-480.
11 Shan H, Kattge J, Reich P, et al. Gap Filling in the Plant Kingdom: Trait Prediction Using
Hierarchical Probabilistic Matrix Factorization[J]. arXiv preprint arXiv:1206.6439, 2012.
12 [J]. 2007, 18(10): 2403-2411.
13 [J]. 2003, 14(9): 1621-1628.
14 [J]. 2010, 20(12): 35-37.
15 [J]. 2010, 36(6): 52-57.
16 . [D]. , 2012.
17 Van Vleck E S. Continuous Matrix Factorizations[M] // Numerical Algebra, Matrix Theory,
Differential-Algebraic Equations and Control Theory. Springer International Publishing, 2015: 299-318.
18 Benzi K, Kalofolias V, Bresson X, et al. Song Recommendation with Non-Negative Matrix
Factorization and Graph Total Variation[J]. arXiv preprint arXiv:1601.01892, 2016.
19 Pan R, Zhou Y, Cao B, et al. One-class collaborative filtering[C] // Data Mining, 2008. ICDM'08.
Eighth IEEE International Conference on. IEEE, 2008: 502-511.
20 . [D]. , 2013.