Frédéric LERASLE's Home Page

Audio-video active speaker detection in meetings (F. Madrigal's postdoc 2016-2020)

This video illustrates deep-learning-based late fusion of audio and video for active speaker detection.
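
As a rough illustration of what late fusion means here, the sketch below combines per-person speaking scores that each modality has already produced on its own, then picks the most likely speaker. It is only a minimal example under stated assumptions, not the method used in the video: the function names, the weighted-average rule, the `audio_weight` parameter, and the `threshold` value are all hypothetical.

```python
def late_fusion(audio_scores, video_scores, audio_weight=0.5):
    """Fuse per-person scores in [0, 1] from two modalities.

    Assumption: each modality was scored independently beforehand,
    and fusion is a plain weighted average (hypothetical choice).
    """
    if len(audio_scores) != len(video_scores):
        raise ValueError("both modalities must score the same people")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    video_weight = 1.0 - audio_weight
    return [audio_weight * a + video_weight * v
            for a, v in zip(audio_scores, video_scores)]


def active_speaker(audio_scores, video_scores, audio_weight=0.5, threshold=0.5):
    """Return the index of the most likely speaker, or None if nobody
    reaches the (hypothetical) decision threshold."""
    fused = late_fusion(audio_scores, video_scores, audio_weight)
    if not fused:
        return None
    best = max(range(len(fused)), key=fused.__getitem__)
    return best if fused[best] >= threshold else None


if __name__ == "__main__":
    # Three meeting participants, scored separately by audio and video.
    print(active_speaker([0.2, 0.9, 0.1], [0.3, 0.7, 0.4]))
```

Fusing at the score level keeps the two modalities independent, so either one can be retrained or replaced without touching the other.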

3D head pose estimation based on key-frames (F. Madrigal's postdoc 2016-2020)

This video illustrates head pose estimation based on an RGB-D sensor and offline key-frame learning.

Human action recognition based on objects (C. Maurice's PhD 2018-2020)

This video illustrates video-based human action recognition that combines Bayesian and CNN inference.

Multiple object tracking based on monocular vision (L. Fagot-Bouquet's PhD 2014-2017, in collaboration with CEA-LIST Saclay)

This video illustrates real-time people tracking based on sparse representations and tracklet inference.

3D human motion capture based on a multi-RGB-D camera platform (J.T. Masse's PhD 2011-2015)

This video illustrates real-time 3D human motion tracking based on synchronized multi-Kinect sensors.

People detection with a Ladybug camera (A.A. Mekonnen's PhD 2003-2006)

This video illustrates real-time people detection with a 360° perspective camera.

Mutual assistance between speech and vision for HRI (B. Burger's PhD: 2006-2009)

This video illustrates our multimodal interface for human-robot interaction (HRI). Speech and gesture interpretations are fused using a late-fusion strategy.

Person tracking based on vision and RFID (T. Germa's PhD: 2007-2010)

This video illustrates person tracking with an active perception system composed of a camera mounted on a pan-tilt unit and a 360° RFID detection system (bottom right image). Both the hardware and software are embedded on a mobile robot operating in a human-cluttered environment. The particle filtering framework enables the fusion of these heterogeneous data.
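
To make the fusion idea concrete, here is a minimal particle filter sketch that tracks a single person's azimuth from two sources of very different precision: a sharp camera bearing and a coarse RFID bearing. It is an illustration under simplifying assumptions, not the system shown in the video: the state is a single angle in degrees, angle wrap-around is ignored, both sensors are modelled with Gaussian noise, and every name and parameter value (`sigma_camera`, `sigma_rfid`, `sigma_motion`, and so on) is hypothetical.

```python
import math
import random


def gaussian_likelihood(error, sigma):
    """Unnormalised Gaussian likelihood of a measurement error."""
    return math.exp(-0.5 * (error / sigma) ** 2)


def particle_filter_step(particles, z_camera, z_rfid, rng,
                         sigma_motion=2.0, sigma_camera=5.0, sigma_rfid=30.0):
    """One predict / update / resample cycle over azimuth particles."""
    # Predict: random-walk motion model (assumed).
    predicted = [p + rng.gauss(0.0, sigma_motion) for p in particles]
    # Update: the two sensors are assumed independent given the state,
    # so their likelihoods multiply. This product is the fusion step.
    weights = [gaussian_likelihood(p - z_camera, sigma_camera)
               * gaussian_likelihood(p - z_rfid, sigma_rfid)
               for p in predicted]
    if sum(weights) == 0.0:
        # No particle explains the measurements: fall back to uniform.
        weights = [1.0] * len(predicted)
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track(z_camera, z_rfid, steps=5, n_particles=2000, seed=0):
    """Run the filter on a static target and return the mean azimuth."""
    rng = random.Random(seed)
    particles = [rng.uniform(-180.0, 180.0) for _ in range(n_particles)]
    for _ in range(steps):
        particles = particle_filter_step(particles, z_camera, z_rfid, rng)
    return sum(particles) / len(particles)


if __name__ == "__main__":
    # Camera reports 41 degrees (precise), RFID reports 30 degrees (coarse).
    print(round(track(41.0, 30.0), 1))
```

Because each particle is weighted by the product of the two likelihoods, the estimate settles close to the precise camera bearing while the RFID reading still constrains the search when the camera measurement is ambiguous.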

A visual landmark framework for mobile robot navigation (J.B. Hayet's PhD: 1999-2003)

This video illustrates the Diligent robot navigating in corridors using visual functions dedicated to the extraction and recognition of visual landmarks, here planar landmarks detected by an onboard camera. The extracted landmarks are characterized by invariant attributes, so that they can be recognized from a wide range of viewpoints.