The RFDT is a multimodal system that couples real-time facial landmark detection with vocal formant simulation. It tracks mouth landmarks with Google’s MediaPipe and maps them to vocal formants in SuperCollider, giving talkbox-like control of audio signals through physical gestures.
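To make the landmark-to-formant idea concrete, here is a minimal sketch of one way such a mapping could work. It is not the RFDT's actual implementation. The landmark indices (13, 14, 61, 291) are the ones commonly used for the inner lips and mouth corners in MediaPipe Face Mesh. The function name `mouth_to_formants`, the `face_width` normalizer, the scaling constants, and the formant ranges are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import hypot

# Face Mesh indices commonly used for the mouth (assumed, not taken from the RFDT).
UPPER_LIP, LOWER_LIP = 13, 14
LEFT_CORNER, RIGHT_CORNER = 61, 291


@dataclass
class Point:
    x: float
    y: float


def _dist(a: Point, b: Point) -> float:
    """Euclidean distance between two 2D landmarks."""
    return hypot(a.x - b.x, a.y - b.y)


def _lerp(t: float, lo: float, hi: float) -> float:
    """Clamp t to [0, 1], then interpolate linearly between lo and hi."""
    t = min(max(t, 0.0), 1.0)
    return lo + t * (hi - lo)


def mouth_to_formants(landmarks: dict, face_width: float) -> tuple:
    """Map mouth shape to (F1, F2) in Hz.

    Hypothetical mapping: vertical opening drives F1 and horizontal
    spread drives F2. The ranges and thresholds are illustrative guesses.
    """
    openness = _dist(landmarks[UPPER_LIP], landmarks[LOWER_LIP]) / face_width
    spread = _dist(landmarks[LEFT_CORNER], landmarks[RIGHT_CORNER]) / face_width
    f1 = _lerp(openness / 0.3, 250.0, 850.0)            # closed -> low F1
    f2 = _lerp((spread - 0.25) / 0.25, 800.0, 2300.0)   # wide -> high F2
    return f1, f2


# Example: a closed, widely spread mouth in normalized image coordinates.
frame = {
    UPPER_LIP: Point(0.5, 0.5),
    LOWER_LIP: Point(0.5, 0.5),
    LEFT_CORNER: Point(0.25, 0.5),
    RIGHT_CORNER: Point(0.75, 0.5),
}
f1, f2 = mouth_to_formants(frame, face_width=1.0)
```

In a live setup, values like these would typically be sent to SuperCollider over OSC on each video frame (sclang listens on UDP port 57120 by default), where they could set the center frequencies of formant filters.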
📄 READ FULL PAPER (PDF)