01 · Passive liveness can be replayed. Active liveness has to recognize the gesture.
The Challenge
Identt is an identity verification platform that ships eKYC into regulated workflows - banking, fintech, telco, gambling - where a missed spoof is a compliance incident, not a bug ticket. The product flow is the kind that compliance teams want: the user opens their phone, holds up their ID document, then records a short selfie video while performing a sequence of head and eye gestures the server picks for them.
Active liveness - gesture-based - is the right shape for this use case because it's replay-resistant by construction. A photo of the user can't blink. A pre-recorded video of the user blinking can't "look up, then blink, then look right" on the new sequence the server just generated. The protocol is sound; the failure mode is the model. If the gesture recognizer can't reliably tell "head turned left" from "head turned slightly left then back to center" across older devices, dim rooms, and varied facial structures, the fraud team starts paying for it in false rejections - and the customer drops out of the funnel.
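The challenge-response shape described above can be sketched in a few lines. This is an illustrative sketch, not Identt's implementation: the function names, the HMAC signing, and the freshness window are all assumptions made for the example. The point it demonstrates is the replay resistance: the server issues a fresh random sequence, signs it so the verification step can trust its origin and timestamp, and a pre-recorded video can only ever match a sequence that is no longer valid.

```python
import hmac, hashlib, secrets, time

GESTURES = ["blink", "look_left", "look_right", "look_up", "look_down"]

def issue_challenge(secret: bytes, length: int = 3) -> dict:
    """Pick a fresh random gesture sequence and sign it (with a timestamp)
    so the verification step can confirm it was server-issued and recent."""
    seq = [secrets.choice(GESTURES) for _ in range(length)]
    issued_at = int(time.time())
    payload = f"{','.join(seq)}|{issued_at}".encode()
    tag = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {"sequence": seq, "issued_at": issued_at, "tag": tag}

def verify_response(secret: bytes, challenge: dict,
                    recognized: list[str], max_age_s: int = 60) -> bool:
    """Accept only if the signature checks out, the challenge is still
    fresh, and the recognized gestures match the issued sequence in order."""
    payload = f"{','.join(challenge['sequence'])}|{challenge['issued_at']}".encode()
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, challenge["tag"]):
        return False  # tampered or forged challenge
    if time.time() - challenge["issued_at"] > max_age_s:
        return False  # stale challenge - a replayed recording lands here
    return recognized == challenge["sequence"]
```

In production the signed challenge would live server-side (or travel as an opaque token), but the structure is the same: the security of the flow rests on the sequence being unpredictable and short-lived, which is exactly why the recognition model becomes the weakest link.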
The build had to do two things at once: recognize the five gestures (blink, look left, look right, look up, look down) reliably enough to match the false-reject rate banks expect from a modern eKYC product, and confirm that a real human face is producing them - not a static photo, not a screen replay, not a doctored video. All of that server-side, inside Identt's own infrastructure.