Adaptive Learning with Weak Supervision for Robotic Perception

Author: RODRÍGUEZ VÁZQUEZ, Francisco Javier

Title: Adaptive Learning with Weak Supervision for Robotic Perception

Date: 2024

Subject:

School: E.T.S DE INGENIEROS INFORMÁTICOS

Departments: INTELIGENCIA ARTIFICIAL

Electronic access: https://oa.upm.es/81506/

Supervisor 1: MOLINA GONZÁLEZ, Martín
Supervisor 2: CAMPOY CERVERA, Pascual

Abstract: Deep learning, particularly in computer vision and robotic perception, has grown tremendously, but it faces problems of resource disparity and sustainability. This thesis tackles these problems through a multi-pronged approach that combines weak supervision, adaptive learning, and robotic perception, with special emphasis on practical, real-world problems. Weak supervision is central to this research because it addresses the common problem of limited data in deep learning. Traditional supervised learning depends heavily on thoroughly labeled data, and labeling is time-consuming and expensive. Weak supervision drastically reduces this data preparation overhead and allows the use of large datasets that are inadequately labeled or completely unlabeled. This not only makes advanced AI technologies more accessible and efficient for the general public, but also supports the goal of making them more sustainable. The thesis also builds on adaptive learning, a dynamic and self-adjusting learning approach. Methods such as generative adversarial networks (GANs) and domain adaptation techniques let the model learn and adjust from the data itself, reducing the need for extensive labeling. This is especially important when the domain changes or when gathering data is very expensive, and combining it with weak supervision creates a synergy that improves model performance. Finally, the thesis examines how these techniques can be incorporated into robotic perception. As robots become more prevalent in our lives, they face several difficulties in perceiving their environment, such as distribution shifts, varying lighting conditions, and the need for rapid data processing. This research provides efficient and scalable solutions to these problems, applicable to areas such as precision agriculture and industrial inspection.
The goals of this thesis are to reduce the cost of data labeling, create new object detection techniques, emphasize on-board processing for robotic platforms, and validate the approaches with real-world data through industry partnerships. All of these objectives are intended to increase the practicality and influence of deep learning across industries. The contributions include a novel isotropic object detection method using dot annotations, a robust pipeline for unsupervised domain adaptation, a keypoint-based object detection method suited to industrial facility inspections, and the integration of these techniques in various domains. These innovations not only advance the state of the art in their respective fields but also emphasize the practical applicability and scalability of the methods developed. Overall, the thesis is a comprehensive attempt to tackle the difficulties associated with modern deep learning technologies. It seeks to remove obstacles that may prevent the widespread use of these complex models, making them more accessible, efficient, and better suited to practical requirements. The emphasis on weak supervision, adaptive learning, and robotic perception reflects a dedication to a more inclusive and sustainable future for deep learning and artificial intelligence on a large scale. The research highlights the value of open-source contributions for strengthening the community and collective intellectual growth. In doing so, it also strives to make AI technologies more resilient to the ever-changing landscape of computational and financial limitations.