UPM Doctoral Theses: Online Search

Author: SÁNCHEZ MARTÍN, José Angel

Title: Networks in Natural Computing and Precision Medicine

Date: 2021

Subject: No subject defined

School: E.T.S.I. DE SISTEMAS INFORMÁTICOS

Department: SISTEMAS INFORMATICOS

Electronic access: https://oa.upm.es/68759/

Advisor(s):

Advisor: MITRANA, Victor
Advisor: PETRE, Ion

Abstract: The growth in technological and computational knowledge has opened several new paths of innovation for humanity, but at the same time it has exposed the shortcomings of our current methods and vision of the world. In recent years, our civilization has witnessed a growing need for novel and unconventional computing paradigms in all areas of computability theory, with the aim of processing large quantities of information and solving complex problems that are intractable for current computational models. On the other hand, traditional medicine is also facing a bottleneck caused by obstacles arising from individual patients' allergies and disease resistances. To face these issues, there has been an increasing interest in biological systems and the phenomena of the natural world. Whereas the emphasis in systems biology modeling projects is on the application of mathematics and computer science to biology, there has also been an important focus on the inspiration drawn from biology for computer science and mathematics. The two research directions are complementary and synergistically support each other. On one hand, the lessons about robustness and evolvability learned in biomodeling projects can be used to propose novel computing paradigms. On the other hand, the new models of computation inspired by biology are often very efficient in optimization tasks, which are crucial steps in biomodeling. In this thesis, we contribute to knowledge on these two complementary research directions inspired by biological phenomena. Firstly, we describe our findings in the field of bio-inspired paradigms aiming to improve our current computational models and tools. Secondly, we illustrate a new methodology, assisted by computational means, for the prescription of personalized drug therapies customized to the needs of each individual patient.
The first part of our research focuses on the scientific scrutiny of networks of bio-inspired processors, a paradigm belonging to the field of bio-inspired parallel and distributed computational models. Its basic concept is that of an arbitrary graph hosting processors in its nodes. These processors simultaneously perform operations inspired by DNA mechanisms in order to streamline the computation. This parallelism makes it theoretically possible to solve NP-complete problems efficiently, such as the satisfiability problem, the 3-colorability problem and the Hamiltonian path problem. This thesis introduces variants of this model along with studies of their computational complexity. Furthermore, the manuscript illustrates efficient simulations, by means of these models, of known universal systems such as Turing machines or 2-tag systems, as well as simulations between different variants of networks of bio-inspired processors. On the other hand, the availability of big data on genetic information and the knowledge gathered about the behavior of and interactions between genes and proteins have drawn the interest of computational scientists to the field of biology. Lately, mathematical models are being built to explain the complex mechanisms through which our biological bodies work at any moment or to find out the core functionality of a disease. Some of these biological systems are abstracted as genetic networks with the proteins as nodes and the protein-protein interactions (PPI) between them as edges, with the final purpose of steering the system to a planned state. This problem is commonly referred to as system controllability. The solution to this controllability problem can open new possibilities in the field of precision medicine, which may lead us towards a customized treatment uniquely suited to the needs of each patient.
Nevertheless, full controllability is known to require very large input sets to gain control over the network, which makes this approach infeasible in practice. Because of this impediment, researchers proposed a more modest approach based on the manipulation of the system through the influence of a few selected nodes, designated as target nodes. Although this problem is also proven to be difficult, the limited number of targeted nodes has allowed for the proposal of several approximate computational techniques to solve it. Currently, these methods have shown partial success in the examination of complex systems, such as cancer networks. In this thesis, we introduce a novel methodology aimed at identifying drug therapies optimized for an individual patient, through the analysis of that person's genetic information related to the disease. The main concept behind our approach is the generation of PPI networks based on the patient and disease information, on which we apply heuristic algorithms with the objective of solving target controllability through a set of target nodes that is as small as possible. Furthermore, we also explore several network centrality methods in order to study the topology of the network, the expected efficiency of the drug combination over the deficient disease genes, and the nodes of importance for the disease and drug therapies. This thesis is structured in two parts. Part I records my contributions to the field of networks of bio-inspired processors. Chapter 1 gives a brief introduction to the motivation for bio-inspired computing systems in general and networks of bio-inspired processors in particular. This chapter also illustrates the variants of the latter model and its basic properties and mechanisms. Chapter 2 collects the basic concepts and notations used in the following chapters to describe our research activity.
Subsequent chapters (chapters 3, 4, 5, 6 and 7) present our contributions to the state of the art of networks of bio-inspired processors, which include complexity analysis and simulations of the variants with each other. Chapter 8 closes the first part with a summary of the results achieved in this part of the thesis and our thoughts on future paths of investigation. Part II illustrates our methodology in the field of network medicine. Chapter 9 discusses the development of bioinformatics and its contributions towards precision medicine. Chapter 10 describes the network modeling methods, genetic data and software tools necessary to carry out the research projects documented in subsequent chapters. The research results in this latter part are composed of two chapters. In chapter 11, we evaluate the utility of network controllability in precision medicine with the analysis of the controllability results of three representative multiple myeloma patients. Chapter 12 builds upon the foundations set in the previous chapter with an exhaustive study of network centrality and minimum dominating sets in addition to the aforementioned network controllability analysis. Furthermore, we demonstrate how the conclusions of the analysis in both chapters may be used to prescribe personalized drug combination therapies. Lastly, we conclude this topic with chapter 13, where we give a general outline of the research results illustrated in this part of the thesis and our thoughts on future paths of investigation. ----------SUMMARY---------- Advances in technological and computational knowledge have opened new avenues of innovation for humanity, but they have also exposed the deficient nature of our current vision of the world.
In recent years, our civilization has witnessed a growing demand for new computational paradigms in all areas of computability theory, with the aim of processing large amounts of information and solving complex problems that are intractable for current computational models. On the other hand, traditional medicine is not well suited to preventing the allergies and drug resistances that may arise during the treatment of a specific patient. To find a solution to these problems, the scientific community is showing a growing interest in biological systems and the phenomena of the natural world. Although the main approach consists of applying mathematics and computer science to the study of biological systems, there has also been interest in new computational and mathematical models inspired by natural phenomena. The two research directions are complementary and support each other synergistically. On the one hand, the analysis of the robustness and evolutionary capabilities exhibited by biological paradigms can be used to design new computing systems. On the other hand, these models of computation are usually very efficient in optimization tasks, which are essential for the design of biological models. In this thesis we contribute to knowledge of these two complementary lines of research inspired by biological phenomena. First, we describe our findings in the field of bio-inspired paradigms, with the aim of improving our current computational models and tools. Second, we illustrate a new methodology, assisted by computational means, for prescribing personalized drug therapies adapted to the needs of each patient.
The first part of our research focuses on the scientific scrutiny of networks of bio-inspired processors, a paradigm belonging to the field of bio-inspired parallel and distributed computational models. Its basic concept is that of an arbitrary graph hosting processors in its nodes. These processors simultaneously perform operations inspired by DNA mechanisms in order to speed up the computation. From a theoretical perspective, this parallelism makes possible the efficient solution of NP-complete problems such as the satisfiability problem, the 3-colorability problem and the Hamiltonian path problem. This thesis introduces variants of this model together with studies of their computational complexity. In addition, the manuscript illustrates efficient simulations, by means of these models, of known universal systems such as Turing machines or 2-tag systems, as well as simulations between different variants of networks of bio-inspired processors. On the other hand, the availability of big data on genetic information and the studies of the behavior of and interactions between genes and proteins have attracted the interest of computer scientists to the field of biology. Lately, mathematical models are being built to explain the complex mechanisms through which our biological bodies work at any moment or to discover the main mechanisms that govern a disease. Some of these biological systems can be analyzed as genetic networks whose vertices and edges represent proteins and protein-protein interactions (PPI), with the final purpose of steering the represented system to a specific state. This problem is commonly known as system controllability.
The solution to this controllability problem can open new possibilities in the field of precision medicine, which could lead us towards a personalized treatment uniquely adapted to the needs of each patient. However, full controllability is known to require very large input sets to gain control over the network, which makes this approach infeasible in practice. Because of this impediment, researchers proposed a more modest approach based on the manipulation of the system through the influence of a few nodes, designated as target nodes. Although this problem is also difficult, the limited number of nodes has allowed for the proposal of several approximate computational techniques to solve it. Currently, these methods have had partial success in the study of complex systems, such as cancer networks. In this thesis, we present a novel methodology whose objective is the identification of drug therapies optimized for an individual patient, through the analysis of that person's genetic information and its relationship with the disease. The main concept behind our approach is the generation of PPI networks based on the patient and disease information, on which we apply heuristic algorithms with the objective of achieving controllability of the system by means of a minimal set of nodes. We also explore several methods to study the topology of the network, the expected efficiency of the drug combination over the genes affected by the disease, and the nodes of importance for the disease and the drug therapies. This thesis consists of two parts. Part I includes my contributions to the academic discipline of networks of bio-inspired processors.
Chapter 1 provides a brief introduction to bio-inspired computing systems in general and networks of bio-inspired processors in particular. This chapter illustrates the variants of the latter model and its basic properties. Chapter 2 introduces the basic concepts and notations that we use in later chapters to describe our research. The following chapters (chapters 3, 4, 5, 6 and 7) present our contributions to the study of networks of bio-inspired processors: complexity analysis and simulations between different variants of the paradigm. Chapter 8 concludes this first part with a summary of the research results documented in this part of the thesis and our proposals for future lines of research. Part II describes our methodology in the field of networks applied to medicine. Chapter 9 discusses the development of bioinformatics and its contributions to precision medicine. Chapter 10 describes the network modeling methods, the software tools and the genetic data necessary to carry out the research projects documented in later chapters. The research results of this latter part of the thesis are included in the following two chapters. In chapter 11 we evaluate the utility of network controllability in precision medicine through the analysis of the controllability results obtained for three multiple myeloma patients. In chapter 12 we improve the strategy of the previous chapter with the study of network centrality and minimum dominating sets, in addition to the previously mentioned network controllability analysis. The conclusions of these last two chapters demonstrate that the methods of controllability, centrality and dominating set analysis can be used to prescribe therapies based on drug combinations.
Chapter 13 concludes this second and final part of the thesis with a summary of the research results documented in this part and our proposals for future lines of research.
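
The abstract names network centrality and minimum dominating sets among the analyses applied to patient-specific PPI networks. As a rough illustration of those two notions only, the sketch below computes degree centrality and a greedy approximation of a dominating set on a small undirected graph. The graph `TOY_NETWORK`, the node names `P1` to `P6` and both function names are hypothetical and are not taken from the thesis; the greedy heuristic is a standard textbook approximation and is not claimed to be the method used by the author.

```python
from typing import Dict, Set

# Adjacency-set representation of an undirected graph.
Graph = Dict[str, Set[str]]

# Hypothetical toy interaction network; node names are placeholders,
# not real proteins and not data from the thesis.
TOY_NETWORK: Graph = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1", "P5"},
    "P5": {"P4", "P6"},
    "P6": {"P5"},
}


def degree_centrality(graph: Graph) -> Dict[str, float]:
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def greedy_dominating_set(graph: Graph) -> Set[str]:
    """Greedy approximation of a minimum dominating set.

    Repeatedly picks the node that covers (itself plus its neighbors) the
    largest number of still-uncovered nodes. The result is always a
    dominating set, but it is not guaranteed to be of minimum size.
    """
    uncovered = set(graph)
    chosen: Set[str] = set()
    while uncovered:
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(
            sorted(graph),
            key=lambda v: len(({v} | graph[v]) & uncovered),
        )
        chosen.add(best)
        uncovered -= {best} | graph[best]
    return chosen


if __name__ == "__main__":
    print("Degree centrality:", degree_centrality(TOY_NETWORK))
    print("Greedy dominating set:", sorted(greedy_dominating_set(TOY_NETWORK)))
```

In this sketch every node of the graph is either in the returned set or adjacent to a node in it, which is the defining property of a dominating set. The networks described in the abstract are built from patient and disease data, so a realistic analysis would differ from this toy example in both size and data sources.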