Friday, June 30, 2017

Slides: Machine Learning Summer School @ Max Planck Institute for Intelligent Systems, Tübingen, Germany




Shai Ben-David (Waterloo): Learning Theory. Slides: part 1, part 2, part 3.

Dominik Janzing (MPI for Intelligent Systems): Causality. Slides here.

Stefanie Jegelka (MIT): Submodularity. Slides here.

Jure Leskovec (Stanford): Network Analysis. Slides: 1, 2, 3, 4.

Ruslan Salakhutdinov (CMU): Deep Learning. Slides: part 1, part 2.

Suvrit Sra (MIT): Optimization. Slides: 1, 2, 3A, 3B.

Bharath Sriperumbudur (Penn State): Kernel Methods. Slides: part 1, part 2, part 3.

Max Welling (Amsterdam): Large Scale Bayesian Inference with an Application to Bayesian Deep Learning. Slides here.

Bernhard Schölkopf (MPI for Intelligent Systems): Introduction to ML and a talk on Causality. Slides here.

h/t Russ





Thursday, June 29, 2017

Learning to Learn without Gradient Descent by Gradient Descent


Learning to Learn without Gradient Descent by Gradient Descent by Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas


We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
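
To make the idea concrete, here is a minimal sketch of such a meta-training loop (PyTorch; the architecture, sizes and synthetic quadratics are illustrative choices of mine, not the paper's setup): an LSTM consumes the last query point and its function value, proposes the next query, and is trained by backpropagating the sum of observed values through a differentiable synthetic objective.

```python
# Minimal "learning to learn" sketch: an LSTM proposes the next query point
# from the previous (query, value) pair and is itself trained by gradient
# descent through differentiable synthetic objectives.
import torch
import torch.nn as nn

class LSTMOptimizer(nn.Module):
    """LSTM that maps the last (query, value) pair to the next query point."""
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.dim, self.hidden = dim, hidden
        self.cell = nn.LSTMCell(dim + 1, hidden)   # input: last query + value
        self.head = nn.Linear(hidden, dim)         # output: next query point

    def rollout(self, f, steps):
        x = torch.zeros(1, self.dim)
        h = torch.zeros(1, self.hidden)
        c = torch.zeros(1, self.hidden)
        total = 0.0
        for _ in range(steps):
            y = f(x)                               # query the black box
            h, c = self.cell(torch.cat([x, y.view(1, 1)], 1), (h, c))
            x = self.head(h)                       # propose the next query
            total = total + f(x)                   # sum of values = loss
        return total

dim, horizon = 2, 20
opt_net = LSTMOptimizer(dim)
meta_opt = torch.optim.Adam(opt_net.parameters(), lr=1e-3)
for _ in range(200):                               # meta-training loop
    center = torch.randn(dim)                      # random synthetic quadratic
    f = lambda z: ((z - center) ** 2).sum()
    loss = opt_net.rollout(f, horizon)
    meta_opt.zero_grad(); loss.backward(); meta_opt.step()
```

At deployment the same rollout is run on a genuinely derivative-free objective: only function values enter the LSTM, so no gradients of f are needed at test time.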




Wednesday, June 28, 2017

Compressive Statistical Learning with Random Feature Moments

Learning, Compressive Sensing and Random Features together !



We describe a general framework --compressive statistical learning-- for resource-efficient large-scale learning: the training collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized moments) that captures the information relevant to the considered learning task. A near-minimizer of the risk is computed from the sketch through the solution of a nonlinear least squares problem. We investigate sufficient sketch sizes to control the generalization error of this procedure. The framework is illustrated on compressive clustering, compressive Gaussian mixture modeling with fixed known variance, and compressive PCA.
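
Here is a toy version of the pipeline for compressive clustering, assuming equal cluster weights (the random Fourier sketch is the paper's ingredient, but the plain least-squares decoder and all sizes are simplifications of mine, not the paper's CL-OMP-style algorithms):

```python
# Toy compressive clustering: compress the data into a vector of random
# Fourier moments, then recover k centroids by nonlinear least squares
# applied to the sketch alone.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
n, d, k, m = 10000, 2, 3, 60              # points, dim, clusters, sketch size

centers = rng.normal(0, 3, (k, d))
X = centers[rng.integers(k, size=n)] + 0.1 * rng.normal(size=(n, d))

W = rng.normal(size=(m, d))               # random frequencies
z = np.exp(1j * X @ W.T).mean(axis=0)     # one-pass sketch: m numbers, not n points

def residual(theta):
    """Sketch mismatch for k equally weighted candidate centroids."""
    C = theta.reshape(k, d)
    z_model = np.exp(1j * C @ W.T).mean(axis=0)
    r = z_model - z
    return np.concatenate([r.real, r.imag])

fit = least_squares(residual, rng.normal(0, 3, k * d))  # random init, nonconvex
print(fit.x.reshape(k, d))                # recovered centroids (up to permutation)
```

The point of the framework shows up in the shapes: after the single pass, the n x d data matrix is never touched again; everything downstream operates on the m-dimensional sketch.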



Compressive optical interferometry



Compressive optical interferometry by Davood Mardani, H. Esat Kondakci, Lane Martin, Ayman F. Abouraddy, George K. Atia

Compressive sensing (CS) combines data acquisition with compression coding to reduce the number of measurements required to reconstruct a sparse signal. In optics, this usually takes the form of projecting the field onto sequences of random spatial patterns that are selected from an appropriate random ensemble. We show here that CS can be exploited in 'native' optics hardware without introducing added components. Specifically, we show that random sub-Nyquist sampling of an interferogram helps reconstruct the field modal structure. The distribution of reduced sensing matrices corresponding to random measurements is provably incoherent and isotropic, which helps us carry out CS successfully.
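
The underlying mathematics can be illustrated with a quick simulation: randomly sample a few points of an interferogram whose modal content is sparse, then recover the modal coefficients by l1 minimization. This mimics the setup, not the optics hardware; the cosine-mode dictionary and the ISTA solver below are my own illustrative choices.

```python
# Random sub-Nyquist sampling of a sparse-mode interferogram, recovered by
# l1 minimization (ISTA).
import numpy as np

rng = np.random.default_rng(1)
N, k, m = 200, 3, 50                         # candidate modes, active modes, samples
t = np.sort(rng.uniform(0, 1, m))            # random sub-Nyquist sample times
omegas = 2 * np.pi * np.arange(1, N + 1)     # candidate mode frequencies

c_true = np.zeros(N)
c_true[rng.choice(N, k, replace=False)] = rng.normal(0, 1, k)
A = np.cos(np.outer(t, omegas))              # sensing matrix: modes at sample times
y = A @ c_true                               # measured interferogram samples

# ISTA for min_c 0.5*||A c - y||^2 + lam*||c||_1
lam, L = 0.01, np.linalg.norm(A, 2) ** 2     # step size 1/L from Lipschitz constant
c = np.zeros(N)
for _ in range(2000):
    g = c - (A.T @ (A @ c - y)) / L          # gradient step on the quadratic term
    c = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0)  # soft threshold

print(np.flatnonzero(np.abs(c) > 0.1), np.flatnonzero(c_true))
```

Note the counts: the highest candidate frequency would demand about 2N uniform samples at the Nyquist rate, yet m = 50 random samples suffice here because only k = 3 modes are active.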





 

Tuesday, June 27, 2017

Paris Machine Learning #10 Ending Season 4 : Large-Scale Video Classification, Community Detection, Code Mining, Maps, Load Monitoring and Cognitive.



This is the final episode of Season 4. Thanks to SCOR for hosting us and for the drinks and food. The room has a capacity of 130 seats (as usual: first come, first served). Most if not all presentations will be in French, but the slides will be in English. Questions in either French or English are welcome.

Video of the meetup is here:



Schedule
  • 6:30 PM: doors open
  • 6:45-9:00 PM: talks
  • 9:00-10:00 PM: socializing

Talks (and slides)



We present state-of-the-art deep learning architectures for feature aggregation. We use them in the context of video representation and explain how we won the YouTube-8M Large-Scale Video Understanding Challenge. No prior knowledge of computer vision is required to understand the work.
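
For context, a common way to do this kind of learnable feature aggregation is a NetVLAD-style pooling layer; the sketch below shows the general mechanism, not the winning model's exact configuration (all sizes are illustrative):

```python
# Minimal NetVLAD-style learnable pooling: per-frame features are softly
# assigned to learned cluster centers and their residuals are aggregated
# into a single fixed-size clip descriptor.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetVLAD(nn.Module):
    def __init__(self, feat_dim=1024, n_clusters=64):
        super().__init__()
        self.assign = nn.Linear(feat_dim, n_clusters)          # soft assignments
        self.centers = nn.Parameter(torch.randn(n_clusters, feat_dim))

    def forward(self, x):                       # x: (batch, n_frames, feat_dim)
        a = F.softmax(self.assign(x), dim=-1)   # (batch, n_frames, n_clusters)
        res = x.unsqueeze(2) - self.centers     # residuals to each center
        v = (a.unsqueeze(-1) * res).sum(dim=1)  # weighted sum over frames
        v = F.normalize(v, dim=-1)              # intra-normalization per cluster
        return F.normalize(v.flatten(1), dim=-1)  # final (batch, K*feat_dim)

pooled = NetVLAD()(torch.randn(2, 300, 1024))   # 300 frames -> one descriptor
```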

Christel Beltran (IBM) Innovating, Differentiating with Cognitive
AI and cognitive computing: why now, why so fast, and perspectives on current use cases.

Community detection in a graph makes it possible to identify groups of individuals as well as their dynamics. After an introduction to the topic, two applications to social networks will be presented: Meetup and LinkedIn. Do we recover the Data Science community?
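
As a minimal illustration of the task, modularity-based community detection is a few lines with networkx (the built-in karate club graph stands in for the Meetup/LinkedIn data):

```python
# Community detection by greedy modularity maximization on a classic
# small social network.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()
communities = greedy_modularity_communities(G)
for i, members in enumerate(communities):
    print(f"community {i}: {sorted(members)}")
```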

The applications of code mining are the same as those of text mining: code generation, automatic translation into another language, extraction of business logic... Yet both the structure and the content of a code document differ strongly from those of a text document. In this talk, we will look at how natural languages and programming languages diverge, and how these particularities affect the way source code is prepared and then processed automatically.
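
One concrete divergence: tokenizing source code calls for a language-aware tokenizer rather than the whitespace splitting a text-mining pipeline would use. A small illustration with Python's own tokenize module:

```python
# Source code tokenizes into typed tokens (names, operators, numbers,
# indentation), unlike the plain words of natural-language text.
import io
import tokenize

src = "def area(r):\n    return 3.14159 * r ** 2\n"
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))
```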

At Qucit we use geographic data collected from hundreds of cities on a daily basis. It comes from many different sources, often from each city's open data website, and serves as input to our models. We can then predict parking times, bike-share station occupancy, stress levels, parking fraud... Gathering this data is tedious work that we aim to automate. To do so, we want to get our data from a source available everywhere: satellite images. We now have good enough precision to detect roads and buildings, and we believe single trees can be detected too. We tested our model on the SpaceNet images and labels, acquired through the SpaceNet Challenge, in which the Topcoder community developed automated methods for extracting building footprints from high-resolution satellite imagery.

Non-Intrusive Load Monitoring is the field of disaggregating a building's electrical consumption, enabling people to increase their energy efficiency and reduce both energy demand and electricity costs. We will present this active research field and the learning challenges at Smart Impulse.





Monday, June 26, 2017

Coherent inverse scattering via transmission matrices: Efficient phase retrieval algorithms and a public dataset - Dataset -

I just came upon this work from the good folks at Rice and Northwestern, which also features an attendant dataset. Enjoy!





Coherent inverse scattering via transmission matrices: Efficient phase retrieval algorithms and a public dataset by Christopher A. Metzler, Manoj K. Sharma, Sudarshan Nagesh, Richard G. Baraniuk, Oliver Cossairt, Ashok Veeraraghavan

A transmission matrix describes the input-output relationship of a complex wavefront as it passes through/reflects off a multiple-scattering medium, such as frosted glass or a painted wall. Knowing a medium's transmission matrix enables one to image through the medium, send signals through the medium, or even use the medium as a lens. The double phase retrieval method is a recently proposed technique to learn a medium's transmission matrix that avoids difficult-to-capture interferometric measurements. Unfortunately, to perform high resolution imaging, existing double phase retrieval methods require (1) a large number of measurements and (2) an unreasonable amount of computation.

In this work we focus on the latter of these two problems and reduce computation times with two distinct methods: First, we develop a new phase retrieval algorithm that is significantly faster than existing methods, especially when used with an amplitude-only spatial light modulator (SLM). Second, we calibrate the system using a phase-only SLM, rather than an amplitude-only SLM which was used in previous double phase retrieval experiments. This seemingly trivial change enables us to use a far faster class of phase retrieval algorithms. As a result of these advances, we achieve a 100x reduction in computation times, thereby allowing us to image through scattering media at state-of-the-art resolutions. In addition to these advances, we also release the first publicly available transmission matrix dataset. This contribution will enable phase retrieval researchers to apply their algorithms to real data. Of particular interest to this community, our measurement vectors are naturally i.i.d. subgaussian, i.e., no coded diffraction pattern is required.
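
As background on the basic building block, here is a toy per-row phase retrieval step: recover one row x of the transmission matrix from intensity-only measurements y = |Ax| made with known SLM patterns A, using plain alternating minimization (Gerchberg-Saxton style). This is the classic slow baseline, not the paper's faster algorithm, and all sizes are illustrative:

```python
# Recover x from y = |A x| with known complex sensing patterns A by
# alternating between a phase guess and a least-squares update.
import numpy as np

rng = np.random.default_rng(2)
n, m = 64, 512                                 # unknowns, measurements
A = (rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))) / np.sqrt(2)
x_true = rng.normal(size=n) + 1j * rng.normal(size=n)
y = np.abs(A @ x_true)                         # intensity-only data

x = rng.normal(size=n) + 1j * rng.normal(size=n)   # random init
Apinv = np.linalg.pinv(A)
for _ in range(200):
    phase = np.exp(1j * np.angle(A @ x))       # guess the missing phases
    x = Apinv @ (y * phase)                    # least-squares update

# recovery holds only up to a global phase; align before comparing
align = np.vdot(x, x_true) / np.abs(np.vdot(x, x_true))
print(np.linalg.norm(x * align - x_true) / np.linalg.norm(x_true))
```

Repeating this for every output pixel yields one row of the transmission matrix per pixel, which is why the per-row computation time dominates and is worth attacking.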
Dataset:

A public dataset containing the transmission matrices of various scattering media can be downloaded here.

The properties of the TM are briefly described here. 



Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Saturday, June 24, 2017

Saturday Morning Videos: Seminars of the Data Science Colloquium of the ENS: Beyond SGD, Functional brain mapping, What physics can tell us about inference?, Can Big Data cure Cancer?

 
As Gabriel mentioned on his Twitter feed, videos of several past seminars of the Data Science Colloquium of the ENS are available here:

March. 7th, 2017, Francis Bach (INRIA) [video]
Title: Beyond stochastic gradient descent for large-scale machine learning
Abstract: Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ('large n') and each of these is large ('large p'). In this setting, online algorithms such as stochastic gradient descent, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. In this talk, I will show how the smoothness of loss functions may be used to design novel algorithms with improved behavior, both in theory and practice: in the ideal infinite-data setting, an efficient novel Newton-based stochastic approximation algorithm leads to a convergence rate of O(1/n) without strong convexity assumptions, while in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate for strongly convex problems, with an iteration cost similar to stochastic gradient descent. (Joint work with Nicolas Le Roux, Eric Moulines and Mark Schmidt.)
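
The single-pass regime the talk discusses is easy to simulate; the sketch below compares the last iterate of constant-step SGD with its Polyak-Ruppert average on least squares (a toy setup of mine, not the talk's exact algorithms):

```python
# One pass of constant-step SGD on least squares, with a running
# Polyak-Ruppert average of the iterates alongside the last iterate.
import numpy as np

rng = np.random.default_rng(3)
n, d = 100000, 10
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.5 * rng.normal(size=n)

w = np.zeros(d)
w_avg = np.zeros(d)
step = 0.01                                   # constant step size
for i in range(n):                            # single pass over the data
    grad = (X[i] @ w - y[i]) * X[i]           # stochastic gradient, one sample
    w -= step * grad
    w_avg += (w - w_avg) / (i + 1)            # running average of iterates

print("last iterate :", np.linalg.norm(w - w_star))
print("averaged     :", np.linalg.norm(w_avg - w_star))
```

The averaged iterate is typically much closer to w_star: the last iterate keeps bouncing in a noise ball of radius set by the step size, while averaging washes that noise out.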


Jan. 10th, 2017, Bertrand Thirion (INRIA and Neurospin) [video]
Title: A big data approach towards functional brain mapping
Abstract: Functional neuroimaging offers a unique view on brain functional organization, which is broadly characterized by two features: the segregation of brain territories into functionally specialized regions, and the integration of these regions into networks of coherent activity. Functional Magnetic Resonance Imaging yields a spatially resolved, yet noisy view of this organization. It also yields useful measurements of brain integrity to compare populations and characterize brain diseases. To extract information from these data, a popular strategy is to rely on supervised classification settings, where signal patterns are used to predict the experimental task performed by the subject during a given experiment, which is a proxy for the cognitive or mental state of this subject. In this talk we will describe how the reliance on large data corpora changes the picture: it boosts the generalizability of the results and provides meaningful priors to analyze novel datasets. We will discuss the challenges posed by these analytic approaches, with an emphasis on computational aspects, and how the use of unlabelled data can further improve the model learned from brain activity data.
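
The decoding setting described above boils down to fitting a classifier on activity patterns; here is a schematic version with scikit-learn, with synthetic data standing in for fMRI:

```python
# "Decoding": predict the experimental condition from activity patterns
# with a cross-validated linear classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_trials, n_voxels = 200, 500
labels = rng.integers(2, size=n_trials)              # two experimental tasks
patterns = rng.normal(size=(n_trials, n_voxels))
patterns[:, :20] += 0.5 * labels[:, None]            # weak task-related signal

clf = LogisticRegression(C=1.0, max_iter=1000)
scores = cross_val_score(clf, patterns, labels, cv=5)
print("decoding accuracy:", scores.mean())
```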

Nov. 8th, 2016, Cristopher Moore (Santa Fe Institute) [video]
Title: What physics can tell us about inference?
Abstract: There is a deep analogy between statistical inference and statistical physics; I will give a friendly introduction to both of these fields. I will then discuss phase transitions in two problems of interest to a broad range of data sciences: community detection in social and biological networks, and clustering of sparse high-dimensional data. In both cases, if our data becomes too sparse or too noisy, it suddenly becomes impossible to find the underlying pattern, or even to tell if there is one. Physics helps us both locate these phase transitions and design optimal algorithms that succeed all the way up to this point. Along the way, I will visit ideas from computational complexity, random graphs, random matrices, and spin glass theory.
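
For the community detection case, the detectability threshold mentioned here has a closed form in the symmetric two-group stochastic block model; a tiny checker of the Kesten-Stigum condition, stated for two equal groups:

```python
# Detectability phase transition for the symmetric 2-group stochastic block
# model: communities can be found better than chance only when
# |c_in - c_out| > 2 * sqrt(mean degree), where c_in and c_out are the
# average numbers of within- and between-group neighbors.
import numpy as np

def detectable(c_in, c_out):
    """Kesten-Stigum condition for the 2-group symmetric SBM."""
    c_mean = (c_in + c_out) / 2            # average degree
    return abs(c_in - c_out) > 2 * np.sqrt(c_mean)

print(detectable(8.0, 2.0))    # True: structure strong enough to detect
print(detectable(5.5, 4.5))    # False: below threshold, no algorithm beats chance
```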

Oct. 11th, 2016, Jean-Philippe Vert (Mines ParisTech, Institut Curie and ENS) [video]
Title: Can Big Data cure Cancer?
Abstract: As the cost and throughput of genomic technologies reach a point where DNA sequencing is close to becoming a routine exam in the clinic, there is a lot of hope that treatments of diseases like cancer can dramatically improve through a digital revolution in medicine, where smart algorithms analyze "big medical data" to help doctors take the best decisions for each patient or to suggest new directions for drug development. While artificial intelligence and machine learning-based algorithms have indeed had a great impact on many data-rich fields, their application to genomic data raises numerous computational and mathematical challenges that I will illustrate on a few examples of patient stratification or drug response prediction from genomic data.
 
 
 
 
 
 
