Machine Learning from Big Datasets

Machine Learning from Big Datasets

10-405 Big ML 1 Matrix Factorization 2 What is MF and what can you do with it? 3 Recovering latent factors in a matrix n rows m columns v1

1 vij vnm 4 Recovering latent factors in a matrix K*m y1 a1

a2 .. am x2 y2 b1 b2 bm

.. .. n*K x1 ~ v1 1 vij

vnm xn yn 5 What is this for? K*m y1 a1 a2 ..

am x2 y2 b1 b2 bm .. ..

n*K x1 ~ v1 1 vij vnm

xn yn 6 MF for collaborative filtering 7 What is collaborative filtering? Recovering latent factors in a matrix n users m movies v1

1 vij vnm V[i,j] = user is rating of movie j 9 Recovering latent factors in a matrix m movies m movies

y1 a1 a2 .. am x2 y2 b1 b2

bm .. .. n users x1 ~ v1 1

vij vnm xn yn V[i,j] = user is rating of movie j 10 11 12

Recovering latent factors in a matrix m movies m movies y1 a1 a2 .. am x2 y2

b1 b2 bm .. .. n users x1 ~ v1

1 vij vnm xn yn V[i,j] = user is rating of

movie j 13 MF for image modeling 14 15 PC1 MF for images 1000 images 1000 * 10,000,00 10,000 pixels 2 prototypes x1

y1 a1 a2 .. am x2 y2 b1 b2

bm .. .. ~ v1 1

vij PC2 xn yn vnm V[i,j] = pixel j in image i MF for modeling text 17

Recovering latent factors in a matrix The Neatest Little Guide to Stock Market Investing doc term matrix m terms Investing For Dummies, 4th Edition The Little Book of Common Sense Investing: The Only a1 Way a2 to..Guarantee am x1 y1 v1 1 Your Fair

Share of Stock Market Returns b1 b2 bm x2 y2 The Little Book of Value Investing .. .. Value Investing: From Graham to Buffett and Beyond Rich Dads Guide to Investing: What the vij

Rich Invest in, That the Poor and the Middle Class Do Not! Investing in Real Estate, 5th Edition Stock Investing For Dummies vnm Rich Dads Advisors: The ABCs of Real Estate Investing: The Secrets of Finding Hidden Profits Most Investors Miss V[i,j] = TFIDF score of term j in xn yn doc i https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis19 lsa-tutorial/ n documents

~ Recovering latent factors in a matrix dummy n documents stock saving advice ... doc term matrix m terms x1 y1 a1 a2

.. am x2 y2 b1 b2 bm ..

.. Doc = weighted sum of topics xn yn ~ v1 1

vij estate land invest rich vnm V[i,j] = TFIDF score of term j in doc i 20 Investing for real estate Rich Dads Advisors:

The ABCs of Real Estate Investment The little book of common sense investing: Neatest Little Guide to Stock Market Investing MF vs other learning tasks 23

MF is like linear regression 24 MF is like multiple-output multi-variable linear regression 1 = 1 2 = 2 = 25 Multi-output linear regression as MF examples m outputs for each xi n examples

m weight vectors x11 x12 a1 a2 x21 x22 b1 b2 .. ..

W .. am bm 1 = 1 2 = 2 X ~ v1

1 vij Y = vnm

xn1 yn . output 1 output 2 MF is like clustering 27 k-means Clustering Each point is in one cluster Each cluster is a weighte

d average of points centroid s 28 k-means as MF n examples indicators for r clusters 0 1 a1

a2 1 0 b1 b2 .. .. Z original data set cluster means M

.. am bm ~ v1 1

vij X vnm xn yn MF is soft clustering each example is a weighted sum of clusters K*m y1 a1

a2 .. am x2 y2 b1 b2 bm

.. .. n*K x1 ~ v1 1 vij

vnm xn yn 30 How do you do MF? 31 KDD 2011 talk pilfered from .. 32

33 Recovering latent factors in a matrix r x1 y1 a1 a2 x2 y2 b1 b2

.. .. n users m movies m movies W H .. am

bm ~ v1 1 vij V vnm

xn yn V[i,j] = user is rating of movie j 34 35 36 37 Matrix factorization as SGD step size why does this work? 38

Matrix factorization as SGD - why does this work? Heres the key claim: 39 Checking the claim Think for SGD for logistic regression LR loss = compare y and = dot(w,x) similar but now update w (user weights) and x (movie weight) 40 What loss functions are possible? 41 What loss functions are possible?

42 limited memory quasi-Newton ALS = alternating least squares 43 KDD 2011 talk pilfered from .. 44 iterative SGD, no mixing limited memory quasi-Newton param mixing alternating least squares IPM

45 Matrix factorization as SGD - why does this work? Heres the key claim: 46 47 48 W 1 H1 H2 H3 V 11 W 2

V V 33 Strata 1 12 W 2 22 W 3 W 1 H1 H2 H3 V

W 3 V 23 V 31 Strata 2 W 1 W 2 W 3 H1 H2 H3 V Node 1

13 Node 2 V Node 3 21 V 32 Strata 3 Epoch 1 49 iterative SGD, no mixing

limited memory quasi-Newton param mixing alternating least squares IPM 50 51 52 53 Hadoop scalability Hadoop process setup time starts to dominate 54

MF is like logistic regression 55 Linear regression as MF examples training data n examples weight vectors x11 x12 a1 a2 x21

x22 b1 b2 .. .. W .. am bm

1 = 1 2 = 2 X ~ v1 1 vij Y

= vnm xn1 yn . output 1 output 2 Logistic? regression as MF examples training data

n examples weight vectors x11 x12 a1 a2 x21 x22 b1 b2 ..

.. X W .. am bm ~ v1 1

vij Y vnm xn1 yn . output 1

output 2 Vectorizing logistic regression Many ML methods can be rewritten using nothing but vector-matrix operations (vectorizing) Why do this? Simpler (once you understand it well) Faster (given the right infrastructure - e.g., numpy, GPUs, ) Can simplify optimization (more later) 58 Vectorized minibatch logistic regression Computation wed like to vectorize: For each x in the minibatch, compute For each feature j: update w

using j 59 Vectorizing logistic regression Computation wed like to parallelize: For each x in the minibatch Xbatch, compute 1 1 hh = 1 [

1 1 = ][ ] [ ] 60

Vectorizing logistic regression Computation wed like to parallelize: For each x in the minibatch Xbatch, compute +1 [ ] in numpy if M is a matrix M+1 does the right thing so does M.exp(), M.dot(), 61 Vectorizing logistic regression

Computation wed like to parallelize: For each x in the minibatch, compute def logistic(X): return (X.exp() +1).reciprocal() p = logistic(Xb.dot(w)) # B rows, 1 column 62 Binary to softmax logistic regression 1 1 hh = 1 [

1 1 = ][ ] [ ]

63 Binary to softmax logistic regression exp ( ) exp ( ) X

XW 64 http://minpy.readthedocs.io/en/latest/get-started/logistic_regression Matrix multiply,; then exponentiate component-wise prob will have B rows and K columns, and each row will sum to 1 exp ( ) exp ( )

Sum the columns to get the denominator; keepdim=True means that this line will work correctly even though a and a_sum have different shapes XW 65 http://minpy.readthedocs.io/en/latest/get-started/logistic_regression 66 http://minpy.readthedocs.io/en/latest/get-started/logistic_regression 67

http://minpy.readthedocs.io/en/latest/get-started/logistic_regression x.T dy Error on each example x in batch and each class y python bug: should be x.T (transpose) The gradient step! 68 Vectorizing logistic regression Many ML methods can be rewritten

using nothing but vector-matrix operations (vectorizing) Why do this? Simpler (once you understand it well) Faster (given the right infrastructure - e.g., numpy, GPUs, ) Can simplify optimization (more later) 69

Recently Viewed Presentations

  • Mankiw 5/e Chapter 5: The Open Economy

    Mankiw 5/e Chapter 5: The Open Economy

    But the equilibrium level of net exports doesn't change, because the supply of dollars in the foreign exchange market (S-I) has not been affected by the trade policy. (Remember, S = Y-C-G, and the trade policy does not affect Y,...
  • MATH 3033 based on Dekking et al. A Modern Introduction to ...

    MATH 3033 based on Dekking et al. A Modern Introduction to ...

    Then X can be obtained from the standard uniform variable U as Proof: Algorithm Obtain a sample u from the standard uniform RV U Compute a sample from X as since U is Uniform(0, 1) Exponential random variables Applying this...
  • XXX-PDA Domain Name

    XXX-PDA Domain Name

    Single Submission Portal (SSP) The project proposes to study different forms of Single Submission Portals including: Port Community Systems, Cargo Community Systems, Data Pipelines, e-Commerce Platforms, International Trade Integrated Service Platforms or Interorganizational Operating Information Systems.
  • Sources of University Funds - Florida State University

    Sources of University Funds - Florida State University

    Cost reimbur-> SRS allows you to spend then invoice sponsored projects. Fixed price -> Cash comes in first, spend from Cash. E&G funds get a "budget. This amount is entered or "booked" into OMNI and is used to control spending....
  • Welcome to the AML LI1 initiation - Cardiff University

    Welcome to the AML LI1 initiation - Cardiff University

    ECG x3 Within 2 HOURS Cardiac Monitoring - AC220 To be completed at Day 1, Day 7 and Day 14 for cycles 2 - 4 (no small molecule in cycle 1) Please note: ECG machines are not provided in this...
  • ALIBABA.COM: A SMILING COMMUNITY WITH A DREAM E

    ALIBABA.COM: A SMILING COMMUNITY WITH A DREAM E

    Jack Ma- Founder and CEO. One of the first e-commerce entrepreneurs in china. Started Alibaba in 1999. Company will retain "start up" mentality. Known as quirky and disruptive "Smells" all departments. Resigned from Alibaba May 10 2013
  • Tolerances

    Tolerances

    clearance fit. interference fit - a binding fit that requires the parts to be forced together, much as if they were welded. transition fit - may range from a interference fit to a clearance fit between the mated parts. line...
  • Safe Browsing Tips and more! - First Heartland

    Safe Browsing Tips and more! - First Heartland

    Is it safe? They are smart and cunning, and they will use things called tiny links, short links, or redirectors to mask the true destination (which we try and block) that can start with tinyurl.com, goo.gl, bitly.com, ow.ly and a...