Year of Publication

2005

Document Type

Thesis

College

Engineering

Department

Computer Science

First Advisor

Christopher O. Jaynes

Abstract

With computers being embedded in every walk of our life, there is an increasing demand forintuitive devices for human-computer interaction. As human beings use gestures as importantmeans of communication, devices based on gesture recognition systems will be effective for humaninteraction with computers. However, it is very important to keep such a system as non-intrusive aspossible, to reduce the limitations of interactions. Designing such non-intrusive, intuitive, camerabasedreal-time gesture recognition system has been an active area of research research in the fieldof computer vision.Gesture recognition invariably involves tracking body parts. We find many research works intracking body parts like eyes, lips, face etc. However, there is relatively little work being done onfull body tracking. Full-body tracking is difficult because it is expensive to model the full-body aseither 2D or 3D model and to track its movements.In this work, we propose a monocular gesture recognition system that focuses on recognizing a setof arm movements commonly used to direct traffic, guiding aircraft landing and for communicationover long distances. This is an attempt towards implementing gesture recognition systems thatrequire full body tracking, for e.g. an automated recognition semaphore flag signaling system.We have implemented a robust full-body tracking system, which forms the backbone of ourgesture analyzer. The tracker makes use of two dimensional link-joint (LJ) model, which representsthe human body, for tracking. Currently, we track the movements of the arms in a video sequence,however we have future plans to make the system real-time. We use distance transform techniquesto track the movements by fitting the parameters of LJ model in every frames of the video captured.The tracker's output is fed a to state-machine which identifies the gestures made. We haveimplemented this system using four sub-systems. Namely1. Background subtraction sub-system, using Gaussian models and median filters.2. Full-body Tracker, using L-J Model APIs3. Quantizer, that converts tracker's output into defined alphabets4. Gesture analyzer, that reads the alphabets into action performed.Currently, our gesture vocabulary contains gestures involving arms moving up and down which canbe used for detecting semaphore, flag signaling system. Also we can detect gestures like clappingand waving of arms.

Share

COinS