Accurate indoor positioning and tracking based on data fusion from inertial sensors, wireless signals and image devices using Deep Learning techniques



Author: BELMONTE HERNÁNDEZ, Alberto

Title: Accurate indoor positioning and tracking based on data fusion from inertial sensors, wireless signals and image devices using Deep Learning techniques

Date: 2019

Subject: No subject defined

School: E.T.S. DE INGENIEROS DE TELECOMUNICACION

Departments: SEÑALES, SISTEMAS Y RADIOCOMUNICACIONES

Electronic access: http://oa.upm.es/57985/

First supervisor: ÁLVAREZ GARCÍA, Federico

Abstract: Indoor Positioning Systems (IPS) have become one of the main research interests in recent years because of the large number of potential applications in fields such as health, security and retail. Proposed solutions typically try to increase the accuracy of indoor person tracking by using different technologies, and that is the focus of this doctoral thesis. There are two main ways to approach the problem: on the one hand, sensor-based solutions of different kinds (wireless, motion, depth...) that capture information about a person's motion; on the other hand, image-based devices. Although different solutions have been developed with a single technology, the combination of several of them has not been sufficiently exploited. This thesis proposes the use of wireless signals in combination with motion sensors and image-based solutions involving cameras of different kinds. The main reason to combine these technologies is that a deployed Wireless Sensor Network can cover a large area and obtain measurements whenever the device is within wireless range. In contrast, cameras can monitor large areas but are subject to the limitations of indoor environments. If a person being tracked by a camera leaves its field of view, the sensors can continue collecting measurements, allowing seamless tracking of the person. Thanks to the major advances in deep learning over the last decade, this thesis exploits these algorithms to achieve better accuracy than traditional machine-learning methods for wireless sensor tracking and classical computer vision methods. The thesis proposes Artificial Neural Networks (ANNs) that learn from large datasets how to model the behaviour of signals and sensors in an indoor environment, in order to perform positioning and tracking with wireless signals and motion sensors.
Tracking algorithms try to estimate the next position of an object based on previous information. Because it is difficult to know which model is best, Recurrent Neural Networks (RNNs) have been used to learn a regression model that tracks the signals. Generative Adversarial Networks (GANs) have been used in combination with a tracking recovery module to generate new measurements when nothing is received while a person is being tracked, so that the trajectory can be recovered over short periods. For camera devices, deep learning solutions have been employed for object detection, tracking and re-identification. Normal field of view (NFoV), fisheye and 360° cameras have been used to obtain different types of images and test the algorithms. A complete image-based framework has been developed to obtain the position of people in the coordinate system of the monitored environment. Finally, the fusion module combines the estimates from the two independent systems, increasing the final accuracy.

----------SUMMARY (Spanish-language abstract)----------

Indoor positioning systems (IPS) are one of the main interests for researchers thanks to the number of potential applications that can be developed. The proposed solutions focus on the accurate tracking of people. This is the focus of this thesis. The problem can be solved using sensors of different types (wireless, motion, depth...) to obtain information about people's movement, or using imaging devices. Many solutions use a single type of technology, but the strong interest in combining several has been little exploited. This work proposes the use of wireless signals together with inertial sensors and cameras with different optics. The main reason for combining these technologies is that a wireless sensor network (WSN) can monitor large areas and receive measurements within wireless range. In contrast, cameras can monitor large environments but only within the room in which they are installed. If a person is being tracked by the camera and leaves the room, the sensor network can keep following the individual without interruption. Thanks to the rise of deep learning, this work focuses on its use to achieve higher accuracy than earlier techniques. Artificial neural networks (ANNs) can be applied to model the behaviour of signals and sensors in indoor environments to perform positioning and tracking tasks. Tracking algorithms filter detections to fit an accurate motion model. Given the complexity, recurrent neural networks (RNNs) were chosen to create a regression model that performs the tracking. Generative adversarial networks (GANs) make it possible to generate new data based on specific inputs. A tracking recovery module has been developed for situations in which a person is lost and is tracked again after short periods. For imaging, the deep learning solutions employed focus on object detectors and on person tracking and re-identification algorithms. Normal field-of-view, fisheye and 360° cameras have been used to obtain different kinds of perspectives in order to test the algorithms on them. A complete image-based tracking solution has been developed to obtain the real-world coordinates of a person in the monitored environment. Finally, a fusion module combines the estimates of both independent systems, increasing the final accuracy.
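
The abstract does not say how the fusion module combines the two estimates. Purely as an illustration, the sketch below fuses a sensor-based and a camera-based 2D position by inverse-variance weighting, and falls back to the sensor estimate when the person is outside the camera view. The `Estimate` type, the `fuse` function and the isotropic variance model are all assumptions made for this example; they are not taken from the thesis, which may use a different (for instance, learned) fusion scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Estimate:
    """Hypothetical 2D position estimate in metres with an isotropic variance (m^2)."""
    x: float
    y: float
    var: float


def fuse(sensor: Estimate, camera: Optional[Estimate]) -> Estimate:
    """Combine two independent estimates by inverse-variance weighting.

    Assumption for this sketch: both estimates are unbiased and independent.
    If the camera has no detection (person outside its view), the sensor
    estimate is returned unchanged so tracking can continue.
    """
    if camera is None:
        return sensor
    if sensor.var <= 0 or camera.var <= 0:
        raise ValueError("variances must be positive")

    # Weights are the reciprocals of the variances: the more certain
    # estimate contributes more to the fused position.
    w_s = 1.0 / sensor.var
    w_c = 1.0 / camera.var
    total = w_s + w_c

    return Estimate(
        x=(w_s * sensor.x + w_c * camera.x) / total,
        y=(w_s * sensor.y + w_c * camera.y) / total,
        var=1.0 / total,  # fused variance is smaller than either input variance
    )


if __name__ == "__main__":
    sensor = Estimate(x=2.0, y=4.0, var=1.0)
    camera = Estimate(x=4.0, y=2.0, var=0.25)
    print(fuse(sensor, camera))  # pulled towards the lower-variance camera estimate
    print(fuse(sensor, None))    # camera lost: sensor estimate is used as-is
```

Under these assumptions the fused variance is always lower than that of either input, which is one simple way to read the claim that fusion increases the final accuracy.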