dc.contributor.advisor: Dusparic, Ivana
dc.contributor.author: Castagna, Alberto
dc.date.accessioned: 2024-05-20T15:29:08Z
dc.date.available: 2024-05-20T15:29:08Z
dc.date.issued: 2024
dc.date.submitted: 2024
dc.identifier.citation: Castagna, Alberto, Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning, Trinity College Dublin, School of Computer Science & Statistics, Computer Science, 2024
dc.identifier.other: Y
dc.description: APPROVED
dc.description.abstract: Reinforcement Learning (RL) enables an intelligent agent to optimise its performance in a task by continuously taking actions from an observed state and receiving feedback from the environment in the form of rewards. Classic RL typically uses tables or linear approximators to map state-action tuples to the values that maximise reward. Combining RL with deep neural networks (DRL) significantly increases its scalability and enables it to address more complex problems than before. However, DRL also inherits downsides from both RL and deep learning. Although DRL improves generalisation across similar state-action pairs compared to simpler policy representations such as tabular methods, it still requires the agent to adequately explore the state-action space. Additionally, deep methods require more training data, with the volume of data escalating with the complexity and size of the neural network. As a result, DRL requires a long time to collect enough agent-environment samples and to successfully learn the underlying policy. Furthermore, even a slight alteration to the task often invalidates previously acquired knowledge.

To address these shortcomings, Transfer Learning (TL) has been introduced, which enables the use of external knowledge from other tasks or agents to enhance a learning process. The goal of TL is to reduce the learning complexity for an agent dealing with an unfamiliar task by simplifying the exploration process. This is achieved by lowering the amount of new information required by its learning model, resulting in a reduced overall convergence time. TL approaches can be divided into Task-to-Task (T2T) and Agent-to-Agent (A2A) transfer. In T2T, an agent with expertise in a specific task partially or fully reuses its learning model or belief to address a novel, previously unobserved task. In A2A, an agent transfers part of its knowledge to a target agent addressing an identically defined task, hence one with the same state-action domain and a similar reward model. Based on the timing of transfer, A2A can be further classified into online and offline. In online transfer, a novel agent may continuously access knowledge from another agent throughout its entire learning phase; in offline transfer, sharing happens exclusively at initialisation time. State-of-the-art approaches to online A2A TL follow the teacher-student paradigm, in which expert agents transfer their expertise to novices during training through advice-sharing. The transferred knowledge can influence either an agent's action-selection process or the learnt value of an action. Having an optimal teacher to provide advice under the teacher-student framework leads to state-of-the-art performance; indeed, effective transfer relies on the degree of expertise of the teacher. As the student improves its policy, the advice provided by the teacher may become outdated and may overwrite better policies learnt by the receiving agents, i.e., advice that was initially helpful might later prevent the agent from exploring better actions. To mitigate this outdated-advice shortcoming, previous work introduced advising strategies to regulate the transfer process, still assuming that an expert agent is used as the source of advice.

This thesis proposes Expert-Free Online Transfer Learning (EF-OnTL), a novel framework for experience sharing. EF-OnTL enables online transfer learning in multi-agent systems through mutual online knowledge exchange between the learning agents, selecting the most suitable source of transfer at each time. As a result, each target agent receives a customised stream of knowledge tailored to its specific knowledge gaps, and agents are expected to improve their performance by reducing the exploration phase, leading to faster convergence. Unlike existing methods that share actions as advice, EF-OnTL facilitates the reciprocal exchange of knowledge across multiple agents without the need for a fixed expert; furthermore, the target's performance is not capped at the teacher's expertise. Without a fixed, known expert, successful transfer relies on agents producing accurate, fine-grained estimates of their confidence in the knowledge samples that they do have. To this effect, this thesis also introduces a new epistemic uncertainty estimator, State-Action-Reward-Next-state Random Network Distillation (sars-RND), which is based on full RL interactions. Compared to a state-visit counter, sars-RND enables fine-grained estimation during the training phase by taking additional information into account. We evaluate EF-OnTL across four benchmark environments: three standard RL benchmarks of increasing complexity, Cart-Pole, Multi-Team Predator-Prey, and Half Field Offense, and a real-world simulated environment, the Ride-Sharing Ride-Requests Simulator. EF-OnTL demonstrates performance better than or equal to the benchmark TL baselines, and the degree of improvement correlates with the complexity of the environment addressed: in simpler environments the improvement is relatively modest, while in more complex ones it is significantly greater.
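As a concrete illustration of the simple policy representations the abstract contrasts with DRL, below is a minimal sketch of tabular Q-learning in Python; the environment interface (reset/step/actions) and the hyper-parameters are illustrative assumptions, not taken from the thesis.

import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    # Q maps (state, action) tuples to estimated long-term reward.
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration over a discrete action set.
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(a)
            # One-step temporal-difference update toward the Bellman target.
            target = r if done else r + gamma * max(Q[(s2, a2)] for a2 in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q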
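To illustrate the idea behind sars-RND, the following sketch shows a Random Network Distillation estimator fed with full (state, action, reward, next-state) transitions: a fixed random target network is imitated by a trained predictor, and the prediction error serves as epistemic uncertainty, high for rarely seen transitions. Network sizes, the learning rate, and the update routine are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn

class SarsRND(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64, out_dim=32):
        super().__init__()
        in_dim = 2 * state_dim + action_dim + 1   # (s, a, r, s') concatenated
        def mlp():
            return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))
        self.target = mlp()        # fixed random network, never trained
        self.predictor = mlp()     # trained to imitate the target
        for p in self.target.parameters():
            p.requires_grad_(False)
        self.opt = torch.optim.Adam(self.predictor.parameters(), lr=1e-4)

    def uncertainty(self, s, a, r, s_next):
        # Prediction error on a transition: high for rarely seen (s, a, r, s').
        x = torch.cat([s, a, r, s_next], dim=-1)
        return (self.predictor(x) - self.target(x)).pow(2).mean(dim=-1)

    def update(self, s, a, r, s_next):
        # Fit the predictor on observed transitions, lowering their error so
        # that uncertainty decreases for experience already seen in training.
        loss = self.uncertainty(s, a, r, s_next).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()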
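The expert-free transfer loop itself can be sketched as follows: at each transfer step every agent is both a potential teacher and a student, and each target pulls experience from the peer that is most confident on the target's own most uncertain transitions, so no fixed expert is ever designated. The Agent fields, the count-based uncertainty stand-in (a placeholder for an estimator such as sars-RND), and the budget parameters are all illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    buffer: list = field(default_factory=list)   # hashable (s, a, r, s') tuples

    def uncertainty(self, transition):
        # Stand-in for an estimator such as sars-RND: uncertainty is high
        # for transitions this agent has rarely (or never) stored.
        return 1.0 / (1 + self.buffer.count(transition))

def transfer_step(agents, n_queries=4, budget=2):
    for target in agents:
        peers = [a for a in agents if a is not target]
        # 1. Identify the target's knowledge gaps: the peers' transitions
        #    about which the target itself is most uncertain.
        pool = {t for p in peers for t in p.buffer}
        gaps = sorted(pool, key=target.uncertainty, reverse=True)[:n_queries]
        # 2. For each gap, transfer from the currently most confident peer,
        #    integrating it as experience rather than as action advice.
        for t in gaps[:budget]:
            source = min(peers, key=lambda p: p.uncertainty(t))
            if source.uncertainty(t) < target.uncertainty(t):
                target.buffer.append(t)

Because sources are re-selected at every transfer step, the role of teacher moves between agents as their relative confidence changes, which is what removes the cap that a fixed teacher's expertise would otherwise impose on the target.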
dc.publisher: Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
dc.rights: Y
dc.subject: Reinforcement Learning
dc.subject: Transfer Learning
dc.subject: Deep Reinforcement Learning
dc.subject: multi-agent
dc.title: Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning
dc.type: Thesis
dc.type.supercollection: thesis_dissertations
dc.type.supercollection: refereed_publications
dc.type.qualificationlevel: Doctoral
dc.identifier.peoplefinderurl: https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:ACASTAGN
dc.identifier.rssinternalid: 265858
dc.rights.ecaccessrights: openAccess
dc.contributor.sponsor: Science Foundation Ireland (SFI)
dc.identifier.uri: http://hdl.handle.net/2262/108443

