Screen-Time and Speaking-Time Estimation

Automatic estimation of screen time and speaking time by gender

System summary

Our audio-visual tool automatically detects and tracks faces in a video and identifies the presence of speech. It then predicts the gender of each person shown on screen and of each speaker.
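As a rough illustration of how such detections could be turned into screen-time and speaking-time figures, the sketch below sums the duration of labeled segments per gender. It is not the tool's actual code: the `Segment` type, the function names, and the label strings are hypothetical, and it assumes the face tracker and speech detector have already produced time-stamped segments with a predicted gender. Overlapping face tracks are each counted in full, which is one possible convention among several.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """A face track or speech segment (hypothetical structure)."""
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    gender: str   # label predicted for this segment (assumed given)


def total_time_by_gender(segments):
    """Sum segment durations, in seconds, for each predicted label."""
    totals = defaultdict(float)
    for seg in segments:
        # Guard against malformed segments whose end precedes their start.
        totals[seg.gender] += max(0.0, seg.end - seg.start)
    return dict(totals)


def shares(totals):
    """Convert per-label totals into fractions of the overall total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0.0 for label in totals}
    return {label: t / grand_total for label, t in totals.items()}


# Made-up example data: two face tracks and two speech segments.
face_tracks = [Segment(0.0, 4.0, "female"), Segment(2.0, 8.0, "male")]
speech_segments = [Segment(1.0, 3.0, "female"), Segment(5.0, 7.0, "male")]

screen_time = total_time_by_gender(face_tracks)
speaking_time = total_time_by_gender(speech_segments)
screen_share = shares(screen_time)
speaking_share = shares(speaking_time)
```

The same aggregation is applied to both streams, so screen time and speaking time can be reported side by side for a given video.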

Video demonstration

Ever wonder how our tool sees a video?