Control architecture of a social robot for the navigation and interaction with the environment


Thesis:

Control architecture of a social robot for the navigation and interaction with the environment

Author: ALVARADO VÁSQUEZ, Biel Piero E.

Title: Control architecture of a social robot for the navigation and interaction with the environment

Date: 2019

Subject: No subject defined

School: E.T.S. DE INGENIEROS INDUSTRIALES

Departments: AUTOMATICA, INGENIERIA ELECTRICA Y ELECTRONICA E INFORMATICA INDUSTRIAL

Electronic access: http://oa.upm.es/55813/

Supervisor 1: MATÍA ESPADA, Fernando

Summary: Many proposals have emerged for creating robots that can interact with people. Great emphasis has also been placed on the robot’s physical appearance so that it is well received by the people around it. For this reason, many robots have been designed to work as waiters, domestic helpers, personal assistants and even museum guides that explain portraits and sculptures. Doris, which falls into this category, is an indoor robot designed to work in complex environments and to interact with people. Doris is the modern version of two earlier robots, Blacky and Urbano, designed at the Centre of Automation and Robotics of the Universidad Politécnica de Madrid. Building a robot requires knowledge of mechanics, control, electronics and programming, and programming a robot to be a tour guide is not an easy task, because many aspects must be resolved: its movements, its localization, the information it perceives from the environment, the processing of conditions that may change over time, questions that may come from different people, and so on. Although these tasks can be solved individually, handling them in parallel complicates matters further. This thesis presents a control architecture proposal covering both hardware and software development so that Doris can work efficiently in a museum as a tour guide. To this end, five layers have been developed for the architecture: hardware, logic, link, planning and user. The hardware layer communicates with the logic layer via USB, RS232 or Ethernet. The remaining layers communicate through mechanisms provided by the operating system, such as pipes, shared memory or sockets. The work starts from a hardware structure designed previously in another project, namely Doris’ torso, already installed on the platform. 
A head, an omnidirectional camera, RFID antennas, loudspeakers and other accessories have been installed on this torso to provide interaction as well as localization, navigation and planning. On the software side, a client-server model is proposed in which many threads are spawned so that the different modules can run in parallel, both synchronously and asynchronously. Another proposal of this thesis is a task planner, which allows a series of actions to be carried out inside the museum. These actions are defined as a set of operations, such as the robot’s movement, its localization, route planning, the facial expressions it must show and the events Doris must watch for, such as detecting people, passing through a door, entering a corridor or being asked a question. Since localization is one of the main difficulties of a mobile robot, it is solved through sensor fusion implemented with an extended Kalman filter. The sensor fusion uses an LMS-200 laser to detect reflective beacons, an omnidirectional camera to detect visual markers, and a semantic map. This sensor fusion achieves good localization, with an error of a few millimetres. The robot’s reactive control depends on localization. Two main controllers based on fuzzy logic have been developed for this reactive control. One is a speed controller that makes the robot move smoothly to the museum’s points of interest, and the second is a corridor controller that keeps the robot in the centre of the corridor. This mirrors human behaviour: a person walks along the centre of a corridor and avoids objects or people that get in the way. Localization depends on a semantic map, which is another proposal of this thesis. 
The semantic map is built from sectors, which in turn are subdivided into points of interest through which the robot navigates. This navigation is based on searching for the best path in a graph, for which the most widely used algorithms are Dijkstra’s algorithm and A*. Doris’ interaction with the environment relies on many subsystems, such as the face, the arm, speech, lip synchronization and the emotion system. This thesis proposed the design of a human-looking face made of methacrylate with 20 degrees of freedom, distributed among the eyes, eyebrows, eyelids, cheeks and mouth. The face has been carefully studied to avoid rejection by museum visitors. Speech, another proposal of this thesis, is what allows objects, portraits and sculptures to be explained to visitors. All of this works in synergy with lip synchronization to give the robot a more human appearance, but without provoking rejection. Two applications have also been designed for remote and local control of Doris: a web page for remote control and an Android application for local control. The idea is that these applications make it possible to monitor the robot either from the museum itself or from anywhere else in the world. This point was very important when settling on the client-server architecture. The task planner mentioned earlier is realized in this thesis through a new programming language dedicated exclusively to Doris. The language had to be given a range of features, from those basic to all programming languages, such as if, while and for statements, variable declarations and arrays, to others dedicated to movement and expression, such as say, move and goto, and some slightly more advanced ones such as listening for events. 
The versatility of this programming language lies in its simple grammatical structure, which allows a modular program to be created for each different kind of guided tour. Finally, tests were carried out at the Centre of Automation and Robotics of the Universidad Politécnica de Madrid to bring all these features together in a single execution, showing results in which every component can work together with all the others; the program being executed can also be changed without changing anything in the proposed architecture.
----------ABSTRACT----------
Several approaches have been proposed for creating robots that can interact with people. Special attention has also been devoted to achieving widespread acceptance among the population. Thus, the range of programmed robots goes from waiters, domestic helpers and personal assistants to museum guides that lead people through the different areas and explain sculptures and portraits. Doris, which belongs to this last category, is a mobile robot conceived to work in dynamic indoor environments and to interact with people. It was built as an upgrade of Blacky and Urbano, two robots developed at the Centre of Automation and Robotics at the Universidad Politécnica de Madrid in recent years. Programming a robot is not an easy task, since many issues need to be solved, such as the localization of the robot, its movements, the reading of information from complex environments, the processing of conditions that may arise, the interpretation of an input and the appropriate answer to it, and so on. Once solved, these tasks need to work in synchrony. This dissertation presents a control architecture that involves both hardware and software proposals so that Doris can work as a successful tour-guide robot in museums. 
To this end, five layers have been developed: a hardware layer, a logic layer, a link layer, a task planner layer and a user layer. The hardware layer communicates with the logic layer through USB, RS232 or Ethernet, and the remaining layers communicate with each other through pipes, shared memory, messages and sockets. As for Doris’ hardware, the robot is built on a mobile platform. Prior to this dissertation, a skeleton was mounted on this platform to give the robot a human appearance. Later, the head was attached to the skeleton so that interaction with the environment would be possible, along with additional sensors such as RFID antennas and an omnidirectional camera, which are used for localization, navigation and interaction. Regarding software, a client-server application is proposed that includes multiple threads communicating with each other in order to achieve excellent performance both among internal processes and between the robot and people. The task planner mentioned above is another proposal of this doctoral thesis; it is responsible for specifying the set of actions that Doris must perform in a museum. These actions or tasks are the set of movements that Doris needs to make in order to reach a point of interest inside the museum. A route to this point of interest must be traced with a purpose-built path planner, and for the robot to follow this path two main components must be developed: the controller and the state observer, the latter providing Doris’ oriented position in the plane. Doris’ localization is based on sensor fusion, which combines the detection of reflective beacons by a laser range finder with the detection of visual markers by an omnidirectional camera. This information, together with the information contained in the semantic map, is fused in an Extended Kalman Filter. 
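As an illustration of the fusion step described above, the following is a minimal sketch of a single Extended Kalman Filter correction for a planar pose (x, y, theta) from one range/bearing observation of a beacon at a known map position. It is not the thesis implementation: the measurement model, the function name `ekf_update`, the helper functions and all numeric values are assumptions chosen for illustration only.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def ekf_update(pose, P, z, beacon, R):
    """One EKF correction from a (range, bearing) observation of a known beacon.

    pose: [x, y, theta]; P: 3x3 covariance; z: (range, bearing);
    beacon: (bx, by) in the map frame; R: 2x2 measurement noise covariance.
    """
    x, y, th = pose
    dx, dy = beacon[0] - x, beacon[1] - y
    q = dx * dx + dy * dy
    r = math.sqrt(q)
    # Measurement predicted from the current pose estimate
    z_hat = (r, wrap(math.atan2(dy, dx) - th))
    # Jacobian of the measurement model with respect to (x, y, theta)
    H = [[-dx / r, -dy / r, 0.0],
         [dy / q, -dx / q, -1.0]]
    Ht = transpose(H)
    S = matmul(matmul(H, P), Ht)
    S = [[S[i][j] + R[i][j] for j in range(2)] for i in range(2)]
    K = matmul(matmul(P, Ht), inv2(S))  # 3x2 Kalman gain
    nu = [z[0] - z_hat[0], wrap(z[1] - z_hat[1])]  # innovation
    new_pose = [pose[i] + K[i][0] * nu[0] + K[i][1] * nu[1] for i in range(3)]
    new_pose[2] = wrap(new_pose[2])
    KH = matmul(K, H)
    I_KH = [[(1.0 if i == j else 0.0) - KH[i][j] for j in range(3)]
            for i in range(3)]
    return new_pose, matmul(I_KH, P)

if __name__ == "__main__":
    # Hypothetical prior, noise and beacon values
    P0 = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.05]]
    R = [[0.01, 0.0], [0.0, 0.001]]
    pose, P = ekf_update([1.0, 2.0, 0.0], P0, (3.2, 0.0), (4.0, 2.0), R)
    print(pose, P)
```

In a setup like the one described, such a correction would be applied once per detected beacon or visual marker, each with its own noise covariance, after a prediction step driven by the robot's motion.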
Reflective landmarks are detected using an LMS-200 laser range finder. Detection follows certain conditions for classifying the landmark and locating its center, applying the Law of Sines and using it as the model in an iterative minimum squared error estimator. Visual markers are detected using a Mobotix C25 omnidirectional camera. The fiducial markers proposed in this dissertation are matrices of 6 rows and 5 columns, chosen to avoid orientation issues, and they are placed 275 cm above the ground to avoid occlusions and to capture the maximum number of visual markers. While the camera can capture up to five markers at the same time, the laser can perceive a minimum of two. Concerning the reactive control proposed in this dissertation, two fuzzy controllers were developed. One is a speed controller developed to move the robot smoothly from one point to another. The other is used in hallways, where the robot must keep to the centre of narrow areas to avoid collisions with objects, much as people typically do. Another proposal of this dissertation is a semantic map of the environment that is used for navigation. This map is subdivided into sectors, and each sector contains the points of interest to which the robot must navigate. Different approaches based on Dijkstra’s algorithm or other graph-based planners have been used to solve the navigation and planning problems. Regarding interaction, Doris’ system is based on several elements: the face, the arm, the emotion system, the speaking system and the lip sync system. The simpler the communication is, the better the robot will respond to the user, always taking care to avoid the uncanny valley, which is the drop in people’s acceptance of robots that look almost, but not quite, human. 
Doris’ face has the most common features of a person’s face, that is, eyes, eyelids, eyebrows, nose and lips. It is made of methacrylate and has 20 degrees of freedom. Doris’ speaking system is another proposal of this project; it allows Doris to communicate with people in the museum, thus achieving greater acceptance among them. This speaking system works together with the emotional system, which produces different facial expressions by means of fuzzy logic. The remote control of Doris, one more proposal of this dissertation, is performed over an Ethernet connection. It allows the robot to be operated over a local area network or a wide area network, so that a robot located in a museum in Italy can be operated by a person located in Spain. This was a key reason for choosing the client-server architecture. The client is an application that sends a request to the server (Doris). This request travels through a communication channel, normally Ethernet, and the server then returns the requested information to the user. The architecture should handle different kinds of clients, such as clients connected from a smartphone or tablet and clients connected via the web. As the applications connect to Doris via Ethernet, sockets are used, except for web applications, where WebSockets are used instead. To integrate all the submodules described above, a task planner is proposed that integrates navigation, face and emotions by means of a task list. This list is the source code of a programming language developed for Doris, with which the robot’s user can create or modify the order in which the robot performs its actions. Joining all the submodules into a single execution, a program for touring the Centre of Automation and Robotics at the Universidad Politécnica de Madrid has been developed. Doris’ initial results show that each component of the architecture produces acceptable outputs. 
The new language shows that the modules can be combined and that Doris follows the actions (trajectories, speeches and event-based actions) specified in the program provided by the developer, without any changes to the lower layers of the architecture.
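
The graph-based route planning over the semantic map mentioned in the abstract can be illustrated with a short sketch of Dijkstra's algorithm. This is only an assumption of how points of interest might be represented: the node names, the edge costs (taken here as distances in metres) and the function name `shortest_path` are hypothetical and do not come from the thesis.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over a dict of the form node -> {neighbour: cost}.

    Returns (total_cost, path); (inf, []) if the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == goal:
            break
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(queue, (nd, nxt))
    if goal not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the goal
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]

# Hypothetical points of interest in one sector; costs are distances in metres
museum = {
    "entrance": {"hall": 4.0, "corridor": 2.0},
    "corridor": {"entrance": 2.0, "hall": 1.0, "exhibit": 7.0},
    "hall": {"entrance": 4.0, "corridor": 1.0, "exhibit": 3.0},
    "exhibit": {"hall": 3.0, "corridor": 7.0},
}

if __name__ == "__main__":
    print(shortest_path(museum, "entrance", "exhibit"))
```

A* would follow the same structure, with the queue ordered by the accumulated cost plus a heuristic such as the straight-line distance to the goal.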