Contribution to Market Making algorithms based on Reinforcement Learning

Thesis:

Contribution to Market Making algorithms based on Reinforcement Learning

Author: FALCES MARÍN, Javier

Title: Contribution to Market Making algorithms based on Reinforcement Learning

Date: 2023

Subject:

School: E.T.S. DE INGENIEROS DE TELECOMUNICACION

Departments: SEÑALES, SISTEMAS Y RADIOCOMUNICACIONES

Electronic access: https://oa.upm.es/74184/

First supervisor: LÓPEZ GONZALO, Eduardo

Summary: Market making is a high-frequency trading problem in which reinforcement learning has recently been the subject of intense research. This thesis presents a new approach to market making using deep reinforcement learning that can improve on several profitability and position-management metrics relative to other reference algorithms. The main contribution is a different approach to the solution: instead of generating the quoted prices (bid and ask) directly, a neural network sets parameters of an underlying algorithm, Avellaneda-Stoikov, which minimises position risk. A further contribution is that the reference Avellaneda-Stoikov algorithm is optimised with a genetic algorithm on the training samples; the resulting model is our main benchmark in the tests. The final contribution is a reduction in the number of state features of the reinforcement learning algorithm, selecting them by their mean feature importance in a random forest. Several variants of the deep RL model (Alpha-AS-1, Alpha-AS-2 and Alpha-CS) are trained and backtested on real data (30 days of L2 data for the bitcoin-dollar pair) alongside the Gen-AS model and other baseline models. The models are compared on four performance metrics (the Sharpe, Sortino and P&L-to-MAP ratios, and maximum drawdown). The Gen-AS model substantially improves on two other benchmark algorithms across all metrics. The Alpha-AS models, in turn, improve on Gen-AS in Sharpe, Sortino and P&L-to-MAP. The localised low risk aversion of the Alpha-AS algorithms is reflected in the maximum drawdown, for which several solutions and experiments are proposed.
Our high-frequency simulator, which uses tick and L2 data, and the algorithms are available in a GitHub repository [23]. The simulator can backtest one day in seconds, simulating latencies, and is ready for use in real markets.

ABSTRACT

Market making is a high-frequency trading problem for which solutions based on reinforcement learning (RL) are increasingly being explored. This thesis presents an approach to market making using deep reinforcement learning, with the novelty that, rather than setting the bid and ask prices directly, the neural network output is used to tweak the risk aversion parameter and the output of the Avellaneda-Stoikov procedure, to obtain bid and ask prices that minimise inventory risk. Two further contributions are, first, that the initial parameters for the Avellaneda-Stoikov equations are optimised with a genetic algorithm, and these parameters are also used to create a baseline Avellaneda-Stoikov agent (Gen-AS); and second, that the state-defining features forming the RL agent’s neural network input are selected based on their relative importance by means of a random forest. Several variants of the deep RL model (Alpha-AS-1, Alpha-AS-2 and Alpha-CS) were trained and backtested on real data (L2 tick data from 30 days of bitcoin–dollar pair trading) alongside the Gen-AS model and the other baselines. The performance of the models was recorded through four indicators (the Sharpe, Sortino and P&L-to-MAP ratios, and the maximum drawdown). Gen-AS outperformed the two other baseline models on all indicators, and in turn the two Alpha-AS models substantially outperformed Gen-AS on Sharpe, Sortino and P&L-to-MAP. Localised excessive risk-taking by the Alpha-AS models, as reflected in a few heavy drawdowns, is a source of concern for which possible solutions and experiments are discussed.
Our high-frequency trading simulator, which uses L2 tick data, and the algorithms are openly accessible in a GitHub repository [23]. The simulator can backtest a complete day of trading data in seconds, simulating latencies, and it is ready to use in real markets.
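
For readers unfamiliar with the underlying procedure, the following is a minimal Python sketch of the standard closed-form Avellaneda-Stoikov quotes (reservation price plus optimal spread). It is not the thesis code: the function name, the argument names and the numeric values are illustrative assumptions, and this record does not describe how the Alpha-AS agents map the network output to the risk aversion parameter.

```python
import math


def avellaneda_stoikov_quotes(mid, inventory, gamma, sigma, k, time_left):
    """Return (bid, ask) from the Avellaneda-Stoikov closed-form approximation.

    mid        current mid-price
    inventory  signed position held by the market maker (q)
    gamma      risk aversion parameter (must be > 0)
    sigma      mid-price volatility
    k          order-arrival decay parameter (must be > 0)
    time_left  remaining horizon (T - t)
    """
    # Reservation price: the mid-price shifted against the current inventory.
    reservation = mid - inventory * gamma * sigma ** 2 * time_left
    # Total optimal spread quoted around the reservation price.
    spread = gamma * sigma ** 2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return reservation - spread / 2.0, reservation + spread / 2.0


# Illustrative values only (assumed, not taken from the thesis).
flat_bid, flat_ask = avellaneda_stoikov_quotes(100.0, 0, 0.1, 2.0, 1.5, 1.0)
long_bid, long_ask = avellaneda_stoikov_quotes(100.0, 5, 0.1, 2.0, 1.5, 1.0)
```

With zero inventory the quotes are symmetric around the mid-price; with a long position both quotes shift down, which encourages selling. A larger gamma widens the inventory-driven shift, which is why adjusting it changes how strongly the quotes react to the position.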