[SOLVED] information retrieval Java hadoop data mining 3


3

12

MovieLens Spark

3.1

34

5

6

7

1 1

8910

Hofmann EM

3.2

3.2.1

Y X

Y X

X ( )p X Y X

( | )p X Y

$$P(X \mid Y) = \frac{p(Y \mid X)\,p(X)}{p(Y)} \tag{3.1}$$

$$P(a_1, a_2, \dots, a_m \mid c_i) = \prod_{j} P(a_j \mid c_i) \tag{3.2}$$

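A sample $x$ has the attributes $\{a_1, a_2, \dots, a_n\}$, and the classes are $\{C_1, C_2, \dots, C_m\}$, $m$ in total. Given $P(x \mid c_i)$, the posterior $P(c_i \mid x)$ is

$$P(c_i \mid x) = \frac{P(x \mid c_i)\,P(c_i)}{P(x)} \tag{3.3}$$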

$$C^{*}(x) = \arg\max_{c_i} P(a_1, a_2, \dots, a_m \mid c_i)\,P(c_i) \tag{3.4}$$

By (3.2):

$$C^{*}(x) = \arg\max_{c_i} P(c_i) \prod_{j} P(a_j \mid c_i) \tag{3.5}$$
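The decision rule in (3.5) can be illustrated with a minimal Scala sketch. The object name `NaiveBayesSketch`, the string-valued attributes, and the plain frequency estimates (no smoothing) are assumptions made for this example; they are not taken from the original text.

```scala
object NaiveBayesSketch {
  // One training sample: attribute values a_1..a_m and a class label c.
  // (Hypothetical representation chosen for this sketch.)
  type Sample = (Seq[String], String)

  // Prior P(c): fraction of training samples labelled c.
  def prior(data: Seq[Sample], c: String): Double =
    data.count(_._2 == c).toDouble / data.size

  // Conditional P(a_j = v | c), estimated by counting inside class c.
  def conditional(data: Seq[Sample], j: Int, v: String, c: String): Double = {
    val inClass = data.filter(_._2 == c)
    if (inClass.isEmpty) 0.0
    else inClass.count(_._1(j) == v).toDouble / inClass.size
  }

  // Unnormalized score P(c) * prod_j P(a_j | c), the quantity maximized in (3.5).
  def score(data: Seq[Sample], x: Seq[String], c: String): Double =
    x.zipWithIndex.map { case (v, j) => conditional(data, j, v, c) }.product * prior(data, c)

  // C*(x): the class with the largest score.
  def classify(data: Seq[Sample], x: Seq[String]): String =
    data.map(_._2).distinct.maxBy(c => score(data, x, c))
}
```

Calling `NaiveBayesSketch.classify(data, x)` with a list of labelled samples returns the label whose score is largest. Because no smoothing is applied, an attribute value never seen with a class drives that class's score to zero.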

$I = \{i_1, i_2, \dots, i_n\}$, $\{a_1, a_2, \dots, a_n\}$, $c_i$

1

2 $I$, $\{a_1, a_2, \dots, a_n\}$, $P(I_i \mid c_i)$

3 $P(c_i \mid I_1, \dots, I_n)$,

$$P(c_i \mid I_1 = i_1, \dots, I_n = i_n) = \arg\max_{c_i} P(c_i) \prod_{j=1}^{m} P(a_j \mid c_i) \tag{3.6}$$

4

3.1

3.2.2

11

12

13

14

15

16

3.2.3

SVD

ALS

SVD SVD++ [17][18] SVD++

SVD

SVD

ALS

Pan R, Zhou Y Netflix Prize

[19] ALS

ALS

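Let $R$ be the $m \times n$ user-item rating matrix. ALS looks for a low-rank matrix $X$ that approximates $R$, as in (3.7):

$$R \approx X = U V^{T} \tag{3.7}$$

where $U \in C^{m \times d}$, $V \in C^{n \times d}$, and $d$ is the number of latent features, with $d$ much smaller than $r$, where $r \le \min(m, n)$ is the rank of $R$. Following the matrix factorization approach of Simon Funk, the approximation error is measured by the squared loss

$$L(U, V) = \sum_{ij} (R_{ij} - X_{ij})^2 = \sum_{ij} \left( R_{ij} - U_{i\cdot} V_{j\cdot}^{T} \right)^2 \tag{3.8}$$

Adding a regularization term with coefficient $\lambda$ to (3.8) gives

$$L(U, V) = \sum_{ij} \left[ \left( R_{ij} - U_{i\cdot} V_{j\cdot}^{T} \right)^2 + \lambda \left( \lVert U_{i\cdot} \rVert_2^2 + \lVert V_{j\cdot} \rVert_2^2 \right) \right] \tag{3.9}$$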

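Fix $V$ and differentiate $L(U, V)$ with respect to $U_{i\cdot}$. Setting $\frac{\partial L}{\partial U_{i\cdot}} = 0$ gives $U_{i\cdot}$:

$$U_{i\cdot} = R_{i\cdot} V_{u_i} \left( V_{u_i}^{T} V_{u_i} + \lambda n_{u_i} I \right)^{-1}, \quad i \in [1, m] \tag{3.10}$$

where $R_{i\cdot}$ is the rating vector of user $i$, $V_{u_i}$ is the feature matrix of the items rated by user $i$, $I$ is the $d \times d$ identity matrix, $n_{u_i}$ is the number of items rated by user $i$, and $\lambda$ is the regularization coefficient. In the same way, fixing $U$ and solving for $V_{j\cdot}$ gives

$$V_{j\cdot} = R_{\cdot j}^{T} U_{m_j} \left( U_{m_j}^{T} U_{m_j} + \lambda n_{m_j} I \right)^{-1}, \quad j \in [1, n] \tag{3.11}$$

where $R_{\cdot j}$ is the vector of ratings given to item $j$, $U_{m_j}$ is the feature matrix of the users who rated item $j$, and $n_{m_j}$ is the number of those users.

$V$ is initialized with random values between 0 and 0.01. $U$ is then updated with (3.10) and $V$ with (3.11), and the two updates alternate, with the RMSE used to judge convergence.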
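The alternating updates (3.10) and (3.11) can be sketched in plain Scala as follows. This is only an illustration under stated assumptions: the names `AlsSketch`, `solve`, `updateRow`, `sweep`, and `rmse` are hypothetical, the small Gaussian-elimination solver stands in for a proper linear algebra library, and the system matrix is assumed to be non-singular (which holds when the regularization coefficient is positive). Each row update solves $(F^{T} F + \lambda n I)\,x = F^{T} r$, which is equivalent to the closed forms above because the matrix is symmetric.

```scala
object AlsSketch {
  type Vec = Array[Double]

  // Solve A x = b for a small dense system (Gaussian elimination, partial pivoting).
  // Assumes A is non-singular.
  def solve(a: Array[Array[Double]], b: Vec): Vec = {
    val n = b.length
    val m = Array.tabulate(n)(i => a(i) :+ b(i)) // augmented matrix [A | b]
    for (col <- 0 until n) {
      val pivot = (col until n).maxBy(r => math.abs(m(r)(col)))
      val tmp = m(col); m(col) = m(pivot); m(pivot) = tmp
      for (r <- col + 1 until n) {
        val f = m(r)(col) / m(col)(col)
        for (c <- col to n) m(r)(c) -= f * m(col)(c)
      }
    }
    val x = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val s = (i + 1 until n).map(j => m(i)(j) * x(j)).sum
      x(i) = (m(i)(n) - s) / m(i)(i)
    }
    x
  }

  // One row update as in (3.10)/(3.11): `ratings` are the observed ratings of a
  // user (or item) and `factors` the matching rows of the fixed factor matrix.
  def updateRow(ratings: Seq[Double], factors: Seq[Vec], lambda: Double): Vec = {
    val d = factors.head.length
    val n = ratings.size
    val a = Array.tabulate(d, d) { (p, q) =>
      factors.map(f => f(p) * f(q)).sum + (if (p == q) lambda * n else 0.0)
    }
    val b = Array.tabulate(d)(p => factors.zip(ratings).map { case (f, r) => f(p) * r }.sum)
    solve(a, b)
  }

  // One ALS sweep over observations (user, item, rating):
  // update U with V fixed, then V with U fixed.
  def sweep(obs: Seq[(Int, Int, Double)], u: Array[Vec], v: Array[Vec], lambda: Double): Unit = {
    for ((i, rs) <- obs.groupBy(_._1)) u(i) = updateRow(rs.map(_._3), rs.map(r => v(r._2)), lambda)
    for ((j, rs) <- obs.groupBy(_._2)) v(j) = updateRow(rs.map(_._3), rs.map(r => u(r._1)), lambda)
  }

  // Root mean squared error over the observed ratings.
  def rmse(obs: Seq[(Int, Int, Double)], u: Array[Vec], v: Array[Vec]): Double = {
    val sq = obs.map { case (i, j, r) =>
      val pred = u(i).zip(v(j)).map { case (x, y) => x * y }.sum
      (r - pred) * (r - pred)
    }
    math.sqrt(sq.sum / obs.size)
  }
}
```

A driver would initialize `v` with small values, call `sweep` repeatedly, and stop once `rmse` no longer improves or an iteration limit is reached. The stopping rule itself is an assumption here.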

R

d R X 3.2

3.2

3.3

3.3.1

3.3.2

1)

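$I = \{i_1, i_2, \dots, i_n\}$, $\{N_1, N_2, \dots, N_n\}$

1

2 $I$, $\{N_1, N_2, \dots, N_n\}$

$P(a_i \mid c)$

3

4 23 $P(c \mid a_1, \dots, a_n)$

$$P(C = c \mid A_1 = a_1, \dots, A_n = a_n) \propto P(C = c) \prod_{i=1}^{n} P(A_i = a_i \mid C = c)$$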

5

2)

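$$sim(i, j) = \frac{\sum_{u=1}^{m} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u=1}^{m} (R_{u,i} - \bar{R}_i)^2}\,\sqrt{\sum_{u=1}^{m} (R_{u,j} - \bar{R}_j)^2}}$$

where $u \in [1, m]$ indexes the users, $i, j$ are items, $R_{u,i}$ is the rating of user $u$ for item $i$, and $\bar{R}_i$ is the mean rating of item $i$.

Taking the $k$ nearest neighbors of item $i$ as the set $J$, the predicted rating of user $u$ for item $i$ is

$$P_{u,i} = \bar{R}_i + \frac{\sum_{j \in J} (R_{u,j} - \bar{R}_j)\, sim(i, j)}{\sum_{j \in J} |sim(i, j)|}$$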
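The item-based similarity and prediction formulas can be sketched in Scala as below. The sketch assumes a small dense rating matrix in which every user has rated every item. The object name `ItemCfSketch` and its method names are hypothetical, and missing ratings are not handled.

```scala
object ItemCfSketch {
  // ratings(u)(i): rating of user u for item i (dense toy matrix, an assumption).
  type Ratings = Array[Array[Double]]

  // Mean rating of item i over all users.
  def itemMean(r: Ratings, i: Int): Double = r.map(_(i)).sum / r.length

  // sim(i, j): correlation of the mean-centered rating columns of items i and j.
  def sim(r: Ratings, i: Int, j: Int): Double = {
    val (mi, mj) = (itemMean(r, i), itemMean(r, j))
    val num = r.map(row => (row(i) - mi) * (row(j) - mj)).sum
    val den = math.sqrt(r.map(row => math.pow(row(i) - mi, 2)).sum) *
      math.sqrt(r.map(row => math.pow(row(j) - mj, 2)).sum)
    if (den == 0.0) 0.0 else num / den
  }

  // P_{u,i}: item mean plus the similarity-weighted deviations of user u
  // on the neighbor items J.
  def predict(r: Ratings, u: Int, i: Int, neighbors: Seq[Int]): Double = {
    val num = neighbors.map(j => (r(u)(j) - itemMean(r, j)) * sim(r, i, j)).sum
    val den = neighbors.map(j => math.abs(sim(r, i, j))).sum
    if (den == 0.0) itemMean(r, i) else itemMean(r, i) + num / den
  }
}
```

In a real system the neighbor set would be the `k` items most similar to `i`; here it is passed in explicitly to keep the sketch short.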

3)

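$$sim(a, b) = \frac{\sum_{i=1}^{t} R_{a,i} R_{b,i}}{\sqrt{\sum_{i=1}^{t} (R_{a,i})^2}\,\sqrt{\sum_{i=1}^{t} (R_{b,i})^2}}$$

With $L$ the neighbor set of user $a$, the predicted rating of user $a$ for item $i$ is

$$P_{a,i} = \bar{R}_a + \frac{\sum_{b \in L} (R_{b,i} - \bar{R}_b)\, sim(a, b)}{\sum_{b \in L} |sim(a, b)|}$$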
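A minimal Scala sketch of the user-based cosine similarity and prediction follows. The object name `UserCfSketch` and the tuple layout used for neighbors are assumptions for illustration only.

```scala
object UserCfSketch {
  // Cosine similarity between the rating vectors of users a and b over t items.
  def cosine(ra: Seq[Double], rb: Seq[Double]): Double = {
    val dot = ra.zip(rb).map { case (x, y) => x * y }.sum
    val norm = math.sqrt(ra.map(x => x * x).sum) * math.sqrt(rb.map(x => x * x).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }

  // P_{a,i}: mean rating of user a plus the similarity-weighted deviations of
  // the neighbors. Each neighbor is given as the hypothetical triple
  // (rating on item i, neighbor's mean rating, similarity to a).
  def predict(meanA: Double, neighbors: Seq[(Double, Double, Double)]): Double = {
    val num = neighbors.map { case (rbi, meanB, s) => (rbi - meanB) * s }.sum
    val den = neighbors.map { case (_, _, s) => math.abs(s) }.sum
    if (den == 0.0) meanA else meanA + num / den
  }
}
```

The `cosine` function performs the same zip-and-sum dot product as the `vec1.zip(vec2)` line in the Spark excerpt, with the normalization added.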

3.3.4

Spark

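import org.apache.spark.{SparkConf, SparkContext}

// Configure and start the Spark application
val conf = new SparkConf().setAppName("ColdStart")
val sc = new SparkContext(conf)

// rating =>
// user_list

// Pairwise item similarities, keyed by a pair of item identifiers
val item_sim = new scala.collection.mutable.HashMap[(String, String), Double]()

// Predicted ratings as (String, String, Double) triples, presumably (user, item, rating)
val pred_rating = new scala.collection.mutable.ArrayBuffer[(String, String, Double)]()

// Identifiers of the selected neighbors
val neighbors = new scala.collection.mutable.HashSet[String]()

// Dot product of two rating vectors; vec1 and vec2 are built in steps not shown in this excerpt
val s = vec1.zip(vec2).map { case (f1, f2) => f1 * f2 }.sum

// Distribute the collected predictions as an RDD
val pred_rating_rdd = sc.parallelize(pred_rating)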

3.4

3.4.1

1)

The Spark cluster is built from PCs: 1 Master node and 2 Slave nodes, running Ubuntu 14.10, as listed in Table 1.

Table 1

Host            IP               Roles
SparkMaster     192.168.1.110    NameNode, JobTracker
SparkWorker1    192.168.1.113    DataNode, TaskTracker
SparkWorker2    192.168.1.101    DataNode, TaskTracker

2)

Java JDK: jdk-7u75-linux-i586.gz
Hadoop: hadoop-2.6.0.tar.gz
SSH
Hadoop, Spark: Spark 1.2.0
IntelliJ IDEA 14.0.3
Scala: scala-sdk-2.10.4
Spark

3.4.2

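The MovieLens dataset [20] contains 943 users, 1682 movies, and 100000 ratings on a scale of 1 to 5. 80% of the ratings (80000) are used for training and the remaining 20000 for testing. The sparsity of the data is

$$\frac{943 \times 1682 - 100000}{943 \times 1682} \times 100\% = 93.7\%$$

for the $943 \times 1682$ user-item matrix.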
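The sparsity figure can be checked with a short Scala sketch. The object and method names are hypothetical and exist only for this illustration.

```scala
object SparsitySketch {
  // Fraction of empty cells in a users x items rating matrix.
  def sparsity(users: Long, items: Long, ratings: Long): Double =
    1.0 - ratings.toDouble / (users * items)
}
```

For the counts given in the text, `SparsitySketch.sparsity(943, 1682, 100000)` evaluates to about 0.937, which matches the stated 93.7%.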

3.4.3

1)

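MAE (mean absolute error): given $n$ predicted ratings $\{p_1, p_2, \dots, p_n\}$ and the corresponding actual ratings $\{r_1, r_2, \dots, r_n\}$, MAE is

$$MAE = \frac{\sum_{i=1}^{n} |p_i - r_i|}{n}$$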

MAE
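The MAE definition translates directly into Scala. The sketch below is illustrative; the object name `MaeSketch` is an assumption, and the two sequences are assumed to be aligned so that position `i` holds the prediction and the actual rating for the same user-item pair.

```scala
object MaeSketch {
  // Mean absolute error between predicted ratings p_i and actual ratings r_i.
  def mae(predicted: Seq[Double], actual: Seq[Double]): Double = {
    require(predicted.size == actual.size && predicted.nonEmpty,
      "predicted and actual must be non-empty and of equal length")
    predicted.zip(actual).map { case (p, r) => math.abs(p - r) }.sum / predicted.size
  }
}
```

A lower value means the predictions are closer to the actual ratings.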

2)

PCF

ICF

MCF MAE

3.3

3.3 MovieLens MAE

MAE k

MAE

k

3.5

MovieLens

Spark

[Figure: MAE (vertical axis, 0.7 to 1) versus the number of nearest neighbors k (horizontal axis, 5 to 30) for ICF, UCF, MCF, and PCF]

1 Schein A I, Popescul A, Ungar L H, et al. Methods and metrics for cold-start recommendations[C]//Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval. 2002: 253-260.
2 Guo H. SOAP: Live Recommendations through Social Agents[J]. Delos Workshop on Filtering & Collaborative Filtering, 1998.
3 Schein A I, Popescul A, Ungar L H, et al. Methods and metrics for cold-start recommendations[C]//Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval. 2002: 253-260.
4 Guo H. SOAP: Live Recommendations through Social Agents[J]. Delos Workshop on Filtering & Collaborative Filtering, 1998.
5 , , . [J]. , 2012(05): 59-63.
6 . [D]. , 2008.
7 . [D]. , 2005.
8 Balabanovic M, Shoham Y. Fab: Content-based, collaborative recommendation[J]. Communications of the ACM, 1997, 40(3): 66-72.
9 Hofmann T, Puzicha J. Latent class models for collaborative filtering[C]//IJCAI'99. 1999: 688-693.
10 Pennock D, Horvitz E, Lawrence S, et al. Collaborative filtering by personality diagnosis: A hybrid memory- and model-based approach[C]//UAI'00. 2000: 473-480.
11 Shan H, Kattge J, Reich P, et al. Gap Filling in the Plant Kingdom: Trait Prediction Using Hierarchical Probabilistic Matrix Factorization[J]. arXiv preprint arXiv:1206.6439, 2012.
12 . [J]. , 2007, 18(10): 2403-2411.
13 . [J]. , 2003, 14(9): 1621-1628.
14 . [J]. , 2010, 20(12): 35-37.
15 . [J]. , 2010, 36(6): 52-57.
16 . [D]. , 2012.
17 Van Vleck E S. Continuous Matrix Factorizations[M]//Numerical Algebra, Matrix Theory, Differential-Algebraic Equations and Control Theory. Springer International Publishing, 2015: 299-318.
18 Benzi K, Kalofolias V, Bresson X, et al. Song Recommendation with Non-Negative Matrix Factorization and Graph Total Variation[J]. arXiv preprint arXiv:1601.01892, 2016.
19 Pan R, Zhou Y, Cao B, et al. One-class collaborative filtering[C]//Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on. IEEE, 2008: 502-511.
20 . [D]. , 2013.
