dc.contributor.advisor: Dusparic, Ivana
dc.contributor.author: Castagna, Alberto
dc.date.accessioned: 2024-05-20T15:29:08Z
dc.date.available: 2024-05-20T15:29:08Z
dc.date.issued: 2024
dc.date.submitted: 2024
dc.identifier.citation: Castagna, Alberto, Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning, Trinity College Dublin, School of Computer Science & Statistics, Computer Science, 2024
dc.identifier.other: Y
dc.description: APPROVED
dc.description.abstract: Reinforcement Learning (RL) enables an intelligent agent to optimise its performance in a task by continuously taking actions from an observed state and receiving feedback from the environment in the form of rewards. Classic RL typically uses tables or linear approximators to map state-action tuples to the values that maximise reward. Combining RL with deep neural networks (DRL) significantly increases its scalability and enables it to address more complex problems than before. However, DRL also inherits downsides from both RL and deep learning. Although DRL improves generalisation across similar state-action pairs compared to simpler policy representations such as tabular methods, it still requires the agent to adequately explore the state-action space. Additionally, deep methods require more training data, with the volume of data escalating with the complexity and size of the neural network. As a result, DRL requires a long time to collect enough agent-environment samples and to successfully learn the underlying policy. Furthermore, even a slight alteration to the task often invalidates previously acquired knowledge.

To address these shortcomings, Transfer Learning (TL) has been introduced, which enables the use of external knowledge from other tasks or agents to enhance a learning process. The goal of TL is to reduce the learning complexity for an agent dealing with an unfamiliar task by simplifying the exploration process. This is achieved by lowering the amount of new information required by its learning model, resulting in a reduced overall convergence time. TL approaches can be divided into Task-to-Task (T2T) and Agent-to-Agent (A2A) transfer. In T2T, an agent with expertise in a specific task partially or fully reuses its learning model or belief to address a novel, previously unobserved task. In A2A, an agent transfers part of its knowledge to a target agent addressing an identically defined task, hence one with the same state-action domain and a similar reward model. Based on the timing of transfer, A2A can be further classified into online and offline. In online transfer, a novel agent may continuously access knowledge from another agent throughout its entire learning phase; in offline transfer, sharing happens exclusively at initialisation time. State-of-the-art approaches to online A2A TL follow the teacher-student paradigm, in which expert agents transfer their expertise to novices during training through advice-sharing. The transferred knowledge can influence either an agent's action-selection process or the learnt value of an action. Having an optimal teacher to provide advice under the teacher-student framework leads to state-of-the-art performance; indeed, effective transfer relies on the degree of expertise of the teacher. As the student improves its policy, the advice provided by the teacher may become outdated and may overwrite better policies learnt by the receiving agents, i.e., advice that was initially helpful might later prevent the agent from exploring better actions. To mitigate this outdated-advice shortcoming, previous work introduced advising strategies to regulate the transfer process, still assuming that an expert agent is used as the source of advice.

This thesis proposes Expert-Free Online Transfer Learning (EF-OnTL), a novel framework for experience sharing. EF-OnTL enables online transfer learning in multi-agent systems through mutual online knowledge exchange between the learning agents, selecting the most suitable source of transfer at each time. As a result, each target agent receives a customised stream of knowledge tailored to its specific knowledge gaps, and agents are expected to improve their performance by reducing the exploration phase, leading to faster convergence. Unlike existing methods that share actions as advice, EF-OnTL facilitates the reciprocal exchange of knowledge across multiple agents without the need for a fixed expert; furthermore, the target's performance is not capped at the teacher's expertise. Without a fixed, known expert, successful transfer relies on agents producing accurate, fine-grained estimates of their confidence in the knowledge samples that they do have. To this effect, this thesis also introduces a new epistemic uncertainty estimator, State-Action-Reward-Next-state Random Network Distillation (sars-RND), which is based on full RL interactions. Compared to a state-visit counter, sars-RND enables fine-grained estimation during the training phase by taking additional information into account. We evaluate EF-OnTL across four benchmark environments: three standard RL benchmarks of increasing complexity, Cart-Pole, Multi-Team Predator-Prey, and Half Field Offense, and a real-world simulated environment, the Ride-Sharing Ride-Requests Simulator. EF-OnTL demonstrates performance better than or equal to the benchmark TL baselines, and the degree of improvement correlates with the complexity of the environment addressed: in simpler environments the improvement is relatively modest, while in more complex ones it is significantly greater.
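As a concrete illustration of the simple policy representations the abstract contrasts with DRL, below is a minimal sketch of tabular Q-learning in Python; the environment interface (reset/step/actions) and the hyper-parameters are illustrative assumptions, not taken from the thesis.

import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    # Q maps (state, action) tuples to estimated long-term reward.
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration over a discrete action set.
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(a)
            # One-step temporal-difference update toward the Bellman target.
            target = r if done else r + gamma * max(Q[(s2, a2)] for a2 in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q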
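To illustrate the idea behind sars-RND, the following sketch shows a Random Network Distillation estimator fed with full (state, action, reward, next-state) transitions: a fixed random target network is imitated by a trained predictor, and the prediction error serves as epistemic uncertainty, high for rarely seen transitions. Network sizes, the learning rate, and the update routine are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn

class SarsRND(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64, out_dim=32):
        super().__init__()
        in_dim = 2 * state_dim + action_dim + 1   # (s, a, r, s') concatenated
        def mlp():
            return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))
        self.target = mlp()        # fixed random network, never trained
        self.predictor = mlp()     # trained to imitate the target
        for p in self.target.parameters():
            p.requires_grad_(False)
        self.opt = torch.optim.Adam(self.predictor.parameters(), lr=1e-4)

    def uncertainty(self, s, a, r, s_next):
        # Prediction error on a transition: high for rarely seen (s, a, r, s').
        x = torch.cat([s, a, r, s_next], dim=-1)
        return (self.predictor(x) - self.target(x)).pow(2).mean(dim=-1)

    def update(self, s, a, r, s_next):
        # Fit the predictor on observed transitions, lowering their error so
        # that uncertainty decreases for experience already seen in training.
        loss = self.uncertainty(s, a, r, s_next).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()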
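The expert-free transfer loop itself can be sketched as follows: at each transfer step every agent is both a potential teacher and a student, and each target pulls experience from the peer that is most confident on the target's own most uncertain transitions, so no fixed expert is ever designated. The Agent fields, the count-based uncertainty stand-in (a placeholder for an estimator such as sars-RND), and the budget parameters are all illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    buffer: list = field(default_factory=list)   # hashable (s, a, r, s') tuples

    def uncertainty(self, transition):
        # Stand-in for an estimator such as sars-RND: uncertainty is high
        # for transitions this agent has rarely (or never) stored.
        return 1.0 / (1 + self.buffer.count(transition))

def transfer_step(agents, n_queries=4, budget=2):
    for target in agents:
        peers = [a for a in agents if a is not target]
        # 1. Identify the target's knowledge gaps: the peers' transitions
        #    about which the target itself is most uncertain.
        pool = {t for p in peers for t in p.buffer}
        gaps = sorted(pool, key=target.uncertainty, reverse=True)[:n_queries]
        # 2. For each gap, transfer from the currently most confident peer,
        #    integrating it as experience rather than as action advice.
        for t in gaps[:budget]:
            source = min(peers, key=lambda p: p.uncertainty(t))
            if source.uncertainty(t) < target.uncertainty(t):
                target.buffer.append(t)

Because sources are re-selected at every transfer step, the role of teacher moves between agents as their relative confidence changes, which is what removes the cap that a fixed teacher's expertise would otherwise impose on the target.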
dc.publisher: Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
dc.rights: Y
dc.subject: Reinforcement Learning
dc.subject: Transfer Learning
dc.subject: Deep Reinforcement Learning
dc.subject: multi-agent
dc.title: Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning
dc.type: Thesis
dc.type.supercollection: thesis_dissertations
dc.type.supercollection: refereed_publications
dc.type.qualificationlevel: Doctoral
dc.identifier.peoplefinderurl: https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:ACASTAGN
dc.identifier.rssinternalid: 265858
dc.rights.ecaccessrights: openAccess
dc.contributor.sponsor: Science Foundation Ireland (SFI)
dc.identifier.uri: http://hdl.handle.net/2262/108443

