Tuesday, January 24, 2017

Tonight: Paris Machine Learning Hors Série #5, Season 4, Amazon Web Services Workshop

Prelude: This is the fifth Hors Série of the season, and the next one will also be a Hors Série. We welcome presentation proposals for February 21 and March 8; if you are interested, please fill out this form. The upcoming meetups are as follows:

Thanks to Davidson for hosting tonight's meetup. The video stream can be found here:

Here is the program:

0) Setup & Installation

Here are the instructions from Julien Simon (Amazon Evangelist) for those who want to code along:

1) Julien Simon (Amazon Evangelist), Getting Started with AWS Infrastructure

• Regions, availability zones, security
• Core services: EC2 instances, IAM permissions, Amazon S3, RDS relational databases
• Options other than EC2 for deploying code: Elastic Beanstalk, ECS (Docker), Lambda

2) DevOps

• AWS elasticity and optimization: how to match infrastructure and costs to your needs
• Automating infrastructure creation with CloudFormation
• Automating code deployment with CodePipeline and CodeDeploy.

3) Experience report: AWS @ Predicsis (machine learning as a service)
• Grégoire Morpain (DevOps Engineer), Sylvain Ferrandiz (Chief Data Scientist), Bertrand Grezes-Besset (Co-founder)

4) Amazon Machine Learning and AI Services
• (launched at re:Invent): Amazon Polly (text-to-speech), Amazon Rekognition (image and face recognition), Amazon Lex (chatbots)
• Deep learning: GPU instances, MXNet

Photo credit: GOES-16 captured this view of the moon as it looked across the surface of the Earth on January 15. Like earlier GOES satellites, GOES-16 will use the moon for calibration. (NOAA/NASA)

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Monday, January 23, 2017

Jobs: Data Science Postdoctoral Fellows, Harvard University

Liz just sent me the following:

Dear Igor,
Harvard University just launched its inaugural Data Science Postdoctoral Fellows program and I would be grateful for your help in making sure it reaches interested applicants via your blog. The details are online at datascience.harvard.edu/funding-opportunities, and pasted below.
Thank you,
Liz Langdon-Gray

Elizabeth Langdon-Gray
Assistant Provost for Research Development and Planning

Harvard University, Office of the Vice Provost for Research
2 Arrow Street, 3rd Floor | Cambridge, MA 02138
Here is the announcement:

The Harvard University Data Science Initiative is seeking applications for its inaugural Harvard Data Science Postdoctoral Fellows Program for the 2017-2018 academic year. The normal duration of the Fellowship is two years. Fellows will receive a generous salary as well as an annual allocation for research and travel expenses.

We are looking for researchers whose interests are in data science, broadly construed, including researchers with both a methodological and an applications focus. Fellows will be provided with the opportunity to pursue their research agenda in an intellectually vibrant environment with ample mentorship. We are looking for independent researchers who will seek out collaborations with other fellows and with Harvard faculty.

The Data Science Postdoctoral Fellows Program is supported by the Harvard Data Science Initiative with administrative support from the Office of the Vice Provost for Research. The Data Science Initiative involves faculty from across the university. The Fellows program will concentrate some of its activity in new physical spaces provided by the Computer Science-Statistics Data Science Lab, the Longwood Data Science Lab, as well as space in the Institute for Quantitative Social Sciences.

Funding Priorities

The Data Science Postdoctoral Fellows Program will support outstanding researchers whose interests relate to the following themes:
1. Methodological foundations, including, for example, causal inference, data systems design, deep learning, experimental design, modeling of structured data, random matrix theory, non-parametric Bayesian methods, scalable inference, statistical computation, and visualization.
2. Development of data science approaches tailored to analytical challenges in substantive fields that span the full intellectual breadth of Harvard's faculties. To give some purely illustrative examples, these fields include health sciences (e.g. life and population sciences), earth systems (e.g. climate change research), society (e.g. data that can affect the experience of individuals, or policy and ethical questions), and the economy (e.g. automation, Internet of Things, digital economy). This list is by no means exhaustive.

Successful applicants will be expected to lead their own research agenda, but also to work collaboratively with others, including members of the Harvard faculty, and to contribute to building the data science intellectual community. The Fellows program will offer numerous opportunities to engage with the broader data science community, including through seminar series, informal lunches, mentoring opportunities, opportunities for fellow-led programming, and other networking events. Fellows should expect to spend most of their time in residence at Harvard.

Available Funding

Stipend: $80,000 in salary support per year for an initial two-year appointment. Appointments may be extended for a third year, budget and performance allowing.
Travel: An additional $10,000 will be allocated for research and travel expenses each year.

Eligibility

Applicants must be outstanding, intellectually curious researchers at an early stage of their scholarly career. Applicants are required to have a doctorate in a related area by the expected start date. Applicants should have demonstrated a capacity for independent work, and will be expected to engage with researchers and faculty and to participate in activities convened by the Data Science Initiative. We recognize that strength comes through diversity and actively seek and welcome people with diverse backgrounds, experiences, and identities.

Application

We encourage candidates to apply by February 3, 2017, but will continue to review applications until the positions are filled. Applicants should apply through the application portal linked below. Required application documents include:
1. A CV
2. A cover letter that identifies up to five (and at least two) Harvard faculty members with whom the applicant would like to work.
3. A statement of research interests of up to three pages that succinctly describes the applicant's research interests. The statement should explain the importance and potential impact of this research. If the research is anticipated to require significant space and/or equipment, the candidate should explore arrangements with relevant faculty prior to submitting an application.
4. Up to three representative papers.
5. Names and contact information for at least two and up to five references (the application is complete only when two letters have been submitted). Referees will be provided with a link to the submission portal.

All materials should be submitted as PDF documents. We will strive to make decisions by February 28, 2017.

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law.


Book: A Course in Machine Learning, Hal Daumé III

Hal mentioned it on his twitter feed:

A Course in Machine Learning by Hal Daumé III is here.


Sunday, January 22, 2017

Sunday Morning Insight: Of Planes and Artificial Intelligence in France [translated from French]

This story takes place during the Second World War. Abraham Wald wants to help the war effort but, because he is from Central Europe, he cannot work on the top-secret radar programs or on the Manhattan Project. He ends up at the Statistical Research Group in New York. When the Allied high command asks Abe whether he can help them, he immediately takes on the project.

The problem is simple. During the campaigns over France and Germany aimed at bombing the Nazi war effort, a number of RAF and US Air Force planes do not come back. Those that do return are riddled with bullet and shell holes everywhere, or almost everywhere. The high command wants to know where and how to armor the planes so that more of them survive these campaigns. Their first instinct is to patch the holes of the planes that come back riddled.

Abe makes the following observation: since the planes are being hit over all of their surfaces during these campaigns, one should look at the spots that were not hit. Indeed, the planes that did not come back are the ones that were hit in those spots. To answer the initial question, the planes should be armored in the places that were left untouched on the planes that survived.
This is called selection bias.

This bias appears whenever you look at a group and ask why a certain type of person is absent from it. Another example, closer to home, is mapping artificial intelligence in France from listings of the investment programs, startups, or research teams that already exist. These listing efforts are important and give real visibility to policymakers. But the question one should also ask is: where is there strong societal demand without any corresponding investment programs, startups, or research teams in France?
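Wald's reasoning can be made concrete with a toy simulation. All numbers below are made up for illustration: each plane section has some chance of being hit, and a hit on a vulnerable section (the engine) is far more likely to down the plane than a hit elsewhere. Among the survivors, engine hits then look rare, even though the engine is exactly where armor is needed.

```python
import random

random.seed(0)

# Hypothetical loss probability given a hit on each section.
SECTIONS = {"engine": 0.6, "fuselage": 0.1, "wings": 0.05}
HIT_PROB = 0.3  # chance each section is hit on a sortie

def fly_sortie():
    """Return (sections hit, whether the plane made it home) for one sortie."""
    hits = {s for s in SECTIONS if random.random() < HIT_PROB}
    survived = all(random.random() > SECTIONS[s] for s in hits)
    return hits, survived

survivor_hits = {s: 0 for s in SECTIONS}
survivors = 0
for _ in range(100_000):
    hits, survived = fly_sortie()
    if survived:
        survivors += 1
        for s in hits:
            survivor_hits[s] += 1

# Engine hits are rare among survivors, not because engines are rarely hit,
# but because planes hit in the engine rarely come home: armor where the
# surviving planes look clean.
for s in SECTIONS:
    print(s, survivor_hits[s] / survivors)
```

Only the surviving planes are observed, which is precisely the selection bias in the story: the damage distribution of survivors is not the damage distribution of hits.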

PS:

Abe would go on to run this calculation on more than 400 planes and find that the engines should be protected against 20mm cannon fire and the fuselage against 7.9mm machine-gun fire. A copy of Abe Wald's report can be found here: "A method of estimating plane vulnerability based on damage of survivors"

I first read this striking story on John's blog. Jordan has told it at greater length here. We first discussed it at the Paris Machine Learning meetup when Léon Bottou spoke at meetup 11 of season 1.

Photo credit: Cameron Moll, The Counterintuitive World, Kevin Drum


Saturday, January 21, 2017

Saturday Morning Video: Stan Conference 2017 video streaming

The Stan Conference 2017 is streaming live on YouTube. Andrew starts with the recent US elections.

Here is the program:

• 9:00 AM - 10:00 AM
Dev talk Andrew Gelman:
"10 Things I Hate About Stan"
• 10:00 AM - 10:30 AM
Coffee
• 10:30 AM - 12:00 PM
Contributed talks
1. Jonathan Auerbach, Rob Trangucci:
"Twelve Cities: Does lowering speed limits save pedestrian lives?"
2. "Hierarchical Bayesian Modeling of the English Premier League"
3. Victor Lei, Nathan Sanders, Abigail Dawson:
4. Woo-Young Ahn, Nate Haines, Lei Zhang:
"hBayesDM: Hierarchical Bayesian modeling of decision-making tasks"
5. Charles Margossian, Bill Gillespie:
"Differential Equation Based Models in Stan"
• 12:00 PM - 1:15 PM
Lunch
• 1:15 PM - 2:15 PM
Dev talk Michael Betancourt:
"Everything You Should Have Learned About Markov Chain Monte Carlo"
• 2:15 PM - 2:30 PM
Stretch break
• 2:30 PM - 3:45 PM
Contributed talks
1. Teddy Groves:
"How to Test IRT Models Using Simulated Data"
2. Bruno Nicenboim, Shravan Vasishth:
"Models of Retrieval in Sentence Comprehension"
3. Rob Trangucci:
"Hierarchical Gaussian Processes in Stan"
4. Nathan Sanders, Victor Lei:
"Modeling the Rate of Public Mass Shootings with Gaussian Processes"
• 3:45 PM - 4:45 PM
Mingling and coffee
• 4:45 PM - 5:40 PM
Q&A Panel
• 5:40 PM - 6:00 PM
Closing remarks Bob Carpenter:
"Where is Stan Going Next?"


Friday, January 20, 2017

Learning the structure of learning

There has recently been a flurry of work on learning the structure of new learning architectures. Here is an ICLR 2017 paper on the subject of meta-learning, along with the posters of the recent NIPS symposium on the topic.

Neural Architecture Search with Reinforcement Learning, Barret Zoph, Quoc Le (Open Review is here)

Abstract: Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
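The paper's controller is an RNN trained with reinforcement learning; stripped to its skeleton, the outer loop is "propose an architecture, train it, score it on a validation set, keep the best." A much cruder stand-in for that loop (numpy only, with random search over the width of a random-feature model on a toy regression task; all choices here are illustrative, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: noisy sine regression, split into train / validation.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Xtr, ytr, Xva, yva = X[:150], y[:150], X[150:], y[150:]

def train_and_score(width):
    """Train one candidate 'architecture' (random ReLU features of a given
    width, ridge-regression readout) and return its validation MSE, i.e.
    the signal an architecture-search controller would optimize."""
    W = rng.standard_normal((1, width))
    b = rng.standard_normal(width)
    phi = lambda Z: np.maximum(Z @ W + b, 0.0)
    H = phi(Xtr)
    beta = np.linalg.solve(H.T @ H + 1e-3 * np.eye(width), H.T @ ytr)
    pred = phi(Xva) @ beta
    return float(np.mean((pred - yva) ** 2))

# Architecture search reduced to its skeleton: propose, train, score, keep best.
candidates = [1, 2, 8, 32, 128]
scores = {w: train_and_score(w) for w in candidates}
best = min(scores, key=scores.get)
print("validation MSE per width:", scores, "-> best width:", best)
```

The paper replaces this blind random search with a learned controller that generates architecture descriptions token by token and is updated to make high-scoring descriptions more likely.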

• Jürgen Schmidhuber, Introduction to Recurrent Neural Networks and Other Machines that Learn Algorithms
• Paul Werbos, Deep Learning in Recurrent Networks: From Basics To New Data on the Brain
• Li Deng, Three Cool Topics on RNN
• Risto Miikkulainen, Scaling Up Deep Learning through Neuroevolution
• Jason Weston, New Tasks and Architectures for Language Understanding and Dialogue with Memory
• Oriol Vinyals, Recurrent Nets Frontiers
• Mike Mozer, Neural Hawkes Process Memories
• Ilya Sutskever, Using a slow RL algorithm to learn a fast RL algorithm using recurrent neural networks (Arxiv)
• Marcus Hutter, Asymptotically fastest solver of all well-defined problems
• Nando de Freitas , Learning to Learn, to Program, to Explore and to Seek Knowledge (Video)
• Alex Graves, Differentiable Neural Computer
• Nal Kalchbrenner, Generative Modeling as Sequence Learning
• Panel Discussion Topic: The future of machines that learn algorithms, Panelists: Ilya Sutskever, Jürgen Schmidhuber, Li Deng, Paul Werbos, Risto Miikkulainen, Sepp Hochreiter, Moderator: Alex Graves

Posters of the recent NIPS2016 workshop


Thursday, January 19, 2017

Understanding deep learning requires rethinking generalization

Here is an interesting paper that pinpoints the influence of regularization on learning with neural networks. From the paper:

Our central finding can be summarized as:
Deep neural networks easily fit random labels.

and later:

While simple to state, this observation has profound implications from a statistical learning perspective:
1. The effective capacity of neural networks is large enough for a brute-force memorization of the entire data set.
2. Even optimization on random labels remains easy. In fact, training time increases only by a small constant factor compared with training on the true labels.
3. Randomizing labels is solely a data transformation, leaving all other properties of the learning problem unchanged.

One can also read the interesting comments on OpenReview and on Reddit.

Understanding deep learning requires rethinking generalization by Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training.
Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice.
We interpret our experimental findings by comparison with traditional models.
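The "perfect finite sample expressivity" remark, that a depth-two network can interpolate any labels once the parameter count exceeds the number of data points, can be illustrated in a few lines of numpy (random ReLU first layer, least-squares second layer; the sizes here are arbitrary, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, width = 20, 5, 100  # more hidden units than data points

X = rng.standard_normal((n, d))          # "inputs": pure random noise
y = rng.integers(0, 2, size=n) * 2 - 1   # completely random +/-1 labels

# Depth-two network: random ReLU first layer, second layer fit by least squares.
W = rng.standard_normal((d, width))
H = np.maximum(X @ W, 0.0)               # n x width feature matrix
w_out, *_ = np.linalg.lstsq(H, y.astype(float), rcond=None)

train_error = np.mean(np.sign(H @ w_out) != y)
print("training error on random labels:", train_error)
```

With n points and width > n hidden units, the feature matrix generically has full row rank, so the least-squares fit drives the training error on even random labels to zero, which is the memorization phenomenon the paper studies experimentally at much larger scale.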


Wednesday, January 18, 2017

An NlogN Parallel Fast Direct Solver for Kernel Matrices

When Matrix Factorization meets Machine Learning:

Kernel matrices appear in machine learning and non-parametric statistics. Given N points in d dimensions and a kernel function that requires O(d) work to evaluate, we present an O(dNlogN)-work algorithm for the approximate factorization of a regularized kernel matrix, a common computational bottleneck in the training phase of a learning task. With this factorization, solving a linear system with a kernel matrix can be done with O(NlogN) work. Our algorithm only requires kernel evaluations and does not require that the kernel matrix admits an efficient global low rank approximation. Instead our factorization only assumes low-rank properties for the off-diagonal blocks under an appropriate row and column ordering. We also present a hybrid method that, when the factorization is prohibitively expensive, combines a partial factorization with iterative methods. As a highlight, we are able to approximately factorize a dense 11M×11M kernel matrix in 2 minutes on 3,072 x86 "Haswell" cores and a 4.5M×4.5M matrix in 1 minute using 4,352 "Knights Landing" cores.
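For contrast, the dense baseline the paper accelerates looks like the sketch below: form the regularized kernel matrix and solve the linear system directly, at O(N^2) memory and O(N^3) work (Gaussian kernel and all sizes chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

N, d, lam = 300, 3, 1e-2  # points, dimension, regularization

X = rng.standard_normal((N, d))
y = rng.standard_normal(N)

# Dense Gaussian kernel matrix: O(d N^2) to build, O(N^2) to store.
sq = np.sum(X**2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / 2.0)

# The training-phase bottleneck: solve (K + lam*I) alpha = y.
# A direct dense solve costs O(N^3); the paper's factorization brings the
# cost down by exploiting low-rank structure in off-diagonal blocks.
alpha = np.linalg.solve(K + lam * np.eye(N), y)

residual = np.linalg.norm((K + lam * np.eye(N)) @ alpha - y)
print("solve residual:", residual)
```

At N = 11M, as in the paper's highlight, neither the O(N^2) storage nor the O(N^3) solve above is feasible, which is what motivates the hierarchical factorization.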


Monday, January 16, 2017

Edward: Deep Probabilistic Programming - implementation -

Dustin mentioned it on his Twitter feed:

Deep Probabilistic Programming by Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei

We propose Edward, a Turing-complete probabilistic programming language. Edward builds on two compositional representations---random variables and inference. By treating inference as a first class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning. For flexibility, Edward makes it easy to fit the same model using a variety of composable inference methods, ranging from point estimation, to variational inference, to MCMC. In addition, Edward can reuse the modeling representation as part of inference, facilitating the design of rich variational models and generative adversarial networks. For efficiency, Edward is integrated into TensorFlow, providing significant speedups over existing probabilistic systems. For example, on a benchmark logistic regression task, Edward is at least 35x faster than Stan and PyMC3.
From the Edward page:

A library for probabilistic modeling, inference, and criticism.

Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming.
It supports modeling with
• Directed graphical models
• Neural networks (via libraries such as Keras and TensorFlow Slim)
• Conditionally specified undirected models
• Bayesian nonparametrics and probabilistic programs
It supports inference with
• Variational inference
• Black box variational inference
• Stochastic variational inference
• Inclusive KL divergence: $\text{KL}(p\|q)$
• Maximum a posteriori estimation
• Monte Carlo
• Hamiltonian Monte Carlo
• Metropolis-Hastings
• Compositions of inference
• Expectation-Maximization
• Pseudo-marginal and ABC methods
• Message passing algorithms
It supports criticism of the model and inference with
• Point-based evaluations
• Posterior predictive checks
Edward is built on top of TensorFlow. It enables features such as computational graphs, distributed training, CPU/GPU integration, automatic differentiation, and visualization with TensorBoard.
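One item on the inference list deserves a note: the "inclusive" KL(p||q) penalizes q for missing mass where p has it (mass-covering), whereas the "exclusive" KL(q||p) minimized by standard variational inference tolerates q ignoring a mode (mode-seeking). A small discrete sketch of the asymmetry:

```python
import numpy as np

def kl(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

# A bimodal target p, and a unimodal q concentrated on one mode only.
p = np.array([0.49, 0.02, 0.49])
q = np.array([0.90, 0.05, 0.05])

print("exclusive KL(q||p):", kl(q, p))  # moderate: q ignores a mode cheaply
print("inclusive KL(p||q):", kl(p, q))  # larger: p has mass where q is tiny
```

The two objectives lead to qualitatively different approximations, which is why a library exposing both (alongside MCMC and point estimation) under one modeling language is convenient.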

Authors

Edward is led by Dustin Tran with guidance from David Blei, along with a team of other developers.
We are open to collaboration, and welcome researchers and developers to contribute. Check out the contributing page for how to improve Edward’s software. For broader research challenges, shoot one of us an e-mail.
Edward has benefited enormously from the helpful feedback and advice of many individuals: Jaan Altosaar, Eugene Brevdo, Allison Chaney, Joshua Dillon, Matthew Hoffman, Kevin Murphy, Rajesh Ranganath, Rif Saurous, and other members of the Blei Lab, Google Brain, and Google Research.

Citation

We appreciate citations for Edward.
Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, and David M. Blei. 2016. Edward: A library for probabilistic modeling, inference, and criticism. arXiv preprint arXiv:1610.09787.


Thesis: Privacy-aware and Scalable Recommender Systems using Sketching Techniques by Raghavendran Balu

Congratulations Dr. Balu !

In this thesis, we aim to study and evaluate the privacy and scalability properties of recommender systems using sketching techniques, and to propose scalable privacy-preserving personalization mechanisms. The thesis therefore sits at the intersection of three topics: recommender systems, differential privacy, and sketching techniques. On the privacy side, we are interested both in new privacy-preserving mechanisms and in the evaluation of such mechanisms. We observe that the primary parameter in differential privacy is a control parameter, and we were motivated to find techniques that can assess the privacy guarantees. We are also interested in proposing new mechanisms that preserve privacy while remaining compatible with the evaluation metrics. On the scalability side, we aim to solve the challenges arising in user modeling and item retrieval. User modeling with evolving data poses storage difficulties and requires adapting to new data, and the retrieval aspects also find applications in domains beyond recommender systems. We evaluate the impact of our contributions through extensive experiments conducted on benchmark real datasets, and from the results we conclude that our contributions effectively address the privacy and scalability challenges.
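Without the thesis text at hand, here is a generic illustration of the kind of "sketching technique" its title refers to: a count-min sketch keeps approximate item counts in space sublinear in the number of distinct items, never undercounting but possibly overcounting through hash collisions. Compact, collision-noisy summaries of this sort are what make sketches attractive for scalable user modeling:

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts in O(width * depth) space.
    Estimates never undercount; overcounts come from hash collisions."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        # One bucket per row, derived from a row-salted hash of the item.
        for row in range(self.depth):
            h = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8)
            yield row, int.from_bytes(h.digest(), "big") % self.width

    def add(self, item, count=1):
        for row, col in self._buckets(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Taking the minimum across rows tightens the collision overcount.
        return min(self.table[row][col] for row, col in self._buckets(item))

cms = CountMinSketch()
stream = ["a"] * 50 + ["b"] * 10 + ["c"]
for item in stream:
    cms.add(item)
print(cms.estimate("a"), cms.estimate("b"), cms.estimate("c"))
```

Every estimate is bounded below by the true count and above by the stream length; shrinking `width` trades accuracy for space, which is the knob such systems tune.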
