Gesture recognition systems have been around for quite some time, but we still can’t say we have discovered all the possibilities they offer or all the approaches we can take.

The arrival on the market of integrated sensors, including infrared transmitters and receivers, together with highly accurate timing systems, makes this technology very promising.

Gesture Control PoC Demo Video

Countless types of sensors can be used for gesture recognition. Given the chance to try our hand in this field, we decided to create a new product: a gesture recognition system built on a Time of Flight (ToF) camera. Even the best gesture recognition system is pointless without a practical use case, so we developed an application that lets you play Tetris using gestures.

Time of Flight
Time of Flight (ToF) is a measure of the time it takes for an object, particle, or wave (acoustic, electromagnetic, etc.) to travel a distance through a medium. This measurement can be used to establish a time standard or to determine velocity or path length, and it is what allows the camera to detect objects in three dimensions.
The technology works by emitting pulses of light with a range of up to five meters. When the pulses hit an object, they are reflected back to the 3D ToF camera, and the time they take to return is used to calculate the object’s distance, or depth. Think of it as sonar or echolocation, but with light instead of sound.
Unlike regular cameras, a ToF camera doesn’t need an external light source to work correctly, which makes it well suited to places that are not always well lit, such as car interiors. Thanks to their small dimensions, ToF sensors can be used in a wide range of applications.
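The distance calculation described above can be sketched in a few lines of Python. This is only an illustration of the principle, not the camera's firmware: the function names are made up for this example, and the speed of light in vacuum is used as an approximation for air.

```python
# Minimal sketch of the time-of-flight principle (illustrative names only).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value, used as an approximation for air


def tof_distance_m(round_trip_time_s: float) -> float:
    """Return the distance to the reflecting object for a measured round-trip time."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the object and back, so the path length is halved.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


def round_trip_time_s(distance_m: float) -> float:
    """Return the time a light pulse needs to reach an object and come back."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    t = round_trip_time_s(5.0)  # object at the five-meter range limit
    print(f"round trip: {t * 1e9:.2f} ns")
    print(f"distance:   {tof_distance_m(t):.2f} m")
```

For an object five meters away the round trip takes roughly 33 nanoseconds, which is why precise timing matters so much for this kind of sensor.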

The most popular among them are:

  • proximity sensors for robots
  • toilet paper and soap dispensers
  • flushing cisterns in toilets
  • sink mixers
  • object sensors in robotic vacuum cleaners
  • cheap user presence detectors in laptops and monitors
  • inventory management systems in vending machines
  • vending machine coin counters
  • ground proximity detectors for drones
  • ceiling proximity detectors for indoor drones
  • presence and gesture sensors in retail outlets
  • or playing Tetris!

Neural Network
Once we had the image, the next step was to answer the question: ‘What’s in the image we’ve just received from the ToF camera?’ We found very little information online about developing gesture recognition on a ToF sensor. After experimenting with deterministic algorithms for some time, we decided to bring out the big guns. We first created a dedicated image preprocessing algorithm, then prepared a convolutional neural network model and collected a dataset of snapshots of the gestures we wanted the network to recognize. After a few iterations of optimizing the model, we achieved a recognition accuracy of 98%.
What’s more, we managed to run our recognition system in real time on a Raspberry Pi 4!
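The article does not describe the preprocessing algorithm itself, so the snippet below is not our implementation. It is a minimal sketch of one common way to prepare a depth frame for a neural network, assuming the hand is the object closest to the camera: pixels outside a depth window are zeroed out and the rest are scaled to the 0..1 range. The function name and the threshold values are hypothetical.

```python
from typing import List

DepthFrame = List[List[float]]  # distances in meters, one inner list per image row


def preprocess_depth_frame(
    frame: DepthFrame,
    near_m: float = 0.2,  # hypothetical lower bound of the gesture zone
    far_m: float = 1.0,   # hypothetical upper bound of the gesture zone
) -> DepthFrame:
    """Keep only pixels inside [near_m, far_m] and scale them to 0..1.

    Nearer pixels get larger values; everything outside the window
    (background or invalid readings) becomes 0.0.
    """
    if not near_m < far_m:
        raise ValueError("near_m must be smaller than far_m")
    span = far_m - near_m
    result: DepthFrame = []
    for row in frame:
        out_row = []
        for depth in row:
            if near_m <= depth <= far_m:
                out_row.append((far_m - depth) / span)
            else:
                out_row.append(0.0)
        result.append(out_row)
    return result


if __name__ == "__main__":
    sample = [[0.2, 0.6, 1.0], [3.0, 0.0, 0.4]]
    for row in preprocess_depth_frame(sample):
        print([round(v, 2) for v in row])
```

A frame normalized this way contains only the hand against an empty background, which gives a convolutional network a much simpler input than the raw depth map.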