Title: Learning visual representations from weakly annotated data

Abstract: An unprecedented amount of visual data is now available on the Internet. Wouldn't it be great if a machine could automatically learn from all this information? For example, imagine an autonomous robot that learns how to change a flat tire by watching instruction videos on YouTube, or how to navigate a city by observing street-view imagery. Learning from this data is, however, very challenging, as it comes with only weak supervisory signals, such as the human narration of an instruction video or noisy geotags on street-level imagery. In this talk, I will describe our recent progress on learning visual representations from such weakly annotated visual data.