Decentralized matrix completion with gossiping

A Riemannian gossip approach to decentralized matrix completion

Authors

B. Mishra, H. Kasai, and A. Saroop

Abstract

In this paper, we propose novel gossip algorithms for the low-rank decentralized matrix completion problem. The proposed approach is based on Riemannian optimization on the Grassmann manifold, which allows local matrix completion by different agents while achieving asymptotic consensus on the global low-rank factors. The resulting approach is scalable and parallelizable. Our numerical experiments show the good performance of the proposed algorithms on various benchmarks.

Downloads

  • Status: Technical report, 2016. A shorter version has been accepted to the 9th NIPS Workshop on Optimization for Machine Learning (OPT2016), to be held in Barcelona.
  • Paper: [arXiv:1605.06968].
  • Hiroyuki Kasai presented the work at IEICE meeting 2016, Japan.
  • Matlab code: [gossipMC_24Nov2016.zip].
    • Update on 24 Nov 2016: In the parallel code, we fixed a bug in recording the time taken by each agent for each parameter update. Using tic and toc outside a parfor loop summed the times of all agents working in parallel, whereas the wall-clock cost of a parallel round is only the maximum time taken by any agent. We now use par.tic and par.toc to record the correct time for each agent working in parallel. This fix is based on the discussion at http://uk.mathworks.com/matlabcentral/fileexchange/27472-partictoc/content/Par.m.
    • Update on 14 Aug 2016: Regularization parameter functionality added with the “options.lambda” structure.
    • We provide a way to set the initial stepsize by linearizing the cost function. The stepsize then diminishes as O(1/t), where t is the iteration count.
    • We provide both online and parallel implementations, and each supports both standard and preconditioned Grassmann gradient updates. The parallel implementation requires Matlab's Parallel Computing Toolbox.
    • The speed of the parallel variant depends on where the algorithms are run (e.g., a single machine versus multiple machines) and on how well the parallel architecture is integrated with Matlab.
    • All the implementations are scalable to large datasets.
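The timing fix described above is not Matlab-specific. The Python sketch below (with a hypothetical `agent_update` workload standing in for one agent's parameter update) illustrates why the wall-clock cost of one parallel round is the maximum per-agent time, not the sum that a single tic/toc around the whole loop would report.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def agent_update(agent_id):
    """Stand-in for one agent's parameter update (simulated workload)."""
    t0 = time.perf_counter()
    time.sleep(0.01 * (agent_id + 1))  # agents do unequal amounts of work
    return time.perf_counter() - t0    # per-agent elapsed time

# Run the agents in parallel and collect each agent's own elapsed time.
with ThreadPoolExecutor(max_workers=3) as pool:
    per_agent = list(pool.map(agent_update, range(3)))

# Wall-clock cost of the round is set by the slowest agent ...
round_time = max(per_agent)
# ... whereas timing around the whole loop effectively sums all agents.
summed_time = sum(per_agent)
```

Here `round_time` corresponds to what par.tic/par.toc measure, while `summed_time` reproduces the original bug.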
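The diminishing stepsize rule can be sketched as follows; this is a Python illustration, with the initial stepsize `gamma0` (obtained in the code by linearizing the cost function) taken as given.

```python
def stepsize(t, gamma0):
    """Diminishing stepsize O(1/t): gamma_t = gamma0 / (1 + t),
    where t is the iteration count and gamma0 the initial stepsize."""
    return gamma0 / (1.0 + t)
```

The schedule starts at `gamma0` for t = 0 and decays as 1/t, which is the standard choice for stochastic/gossip updates.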
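For readers unfamiliar with the standard Grassmann gradient update, the following NumPy sketch (an illustration, not the released Matlab code) shows one step: project the Euclidean gradient onto the horizontal space at U, then retract back onto the manifold with a QR-based (qf) retraction.

```python
import numpy as np

def grassmann_update(U, G, step):
    """One Riemannian gradient step on the Grassmann manifold (sketch).

    U: n x r matrix with orthonormal columns (a point on Gr(n, r)).
    G: n x r Euclidean gradient of the cost at U.
    """
    # Project the Euclidean gradient onto the horizontal space at U.
    rgrad = G - U @ (U.T @ G)
    # Retract the gradient step back onto the manifold via thin QR.
    Q, R = np.linalg.qr(U - step * rgrad)
    # Fix column signs so the qf retraction is well defined.
    Q = Q * np.sign(np.sign(np.diag(R)) + 0.5)
    return Q
```

The preconditioned variant in the released code additionally rescales the gradient; the projection-plus-retraction structure is the same.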

Examples

  • Effect of ρ on arriving at consensus among six agents (only two agents are shown).
  • MovieLens 20M: decentralized matrix completion by four agents.
  • Netflix: comparison of gossip with 10 agents against the batch algorithm RTRMC.
  • Timing performance on an instance of size 10K by 100K with rank 5 and six agents. The figure shows a twofold speedup for the parallel variant.