Grant support

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work is part of the preliminary tasks related to the Harvesting Visual Data (HVD) project (PID2021125051OB-I00) funded by the Ministerio de Ciencia e Innovación of the Spanish Government.

Analysis of institutional authors

Montalvo, J (Corresponding Author); Garcia-Martin, A (Author); Bescos, J (Author)

October 3, 2022
Article
Hybrid Gold

Exploiting semantic segmentation to boost reinforcement learning in video game environments

Published in: MULTIMEDIA TOOLS AND APPLICATIONS. 82 (7): 10961-10979 - 2023-03-01, DOI: 10.1007/s11042-022-13695-1

Authors: Montalvo, Javier; Garcia-Martin, Alvaro; Bescos, Jesus

Affiliations

Univ Autonoma Madrid, VPULab, Ciudad Univ Cantoblanco, E-28049 Madrid, Spain - Author

Abstract

In this work we explore enhancing the performance of reinforcement learning algorithms in video game environments by feeding them better, more relevant data. For this purpose, we use semantic segmentation to transform the images that would be used as input for the reinforcement learning algorithm from their original domain to a simplified semantic domain with just silhouettes and class labels instead of textures and colors, and then we train the reinforcement learning algorithm with these simplified images. We have conducted different experiments to study multiple aspects: the feasibility of our proposal, and its potential benefits to model generalization and transfer learning. Experiments have been performed with the Super Mario Bros video game as the testing environment. Our results show multiple advantages for this method. First, using semantic segmentation enables reaching higher performance than the baseline reinforcement learning algorithm, in fewer episodes and without modifying the actual algorithm; second, it shows noticeable performance improvements when training on multiple levels at the same time; and finally, it allows applying transfer learning for models trained on visually different environments. We conclude that using semantic segmentation can certainly help reinforcement learning algorithms that work with visual data, by refining it. Our results also suggest that other computer vision techniques may be beneficial for data preprocessing. Models and code will be available on GitHub upon acceptance.
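The preprocessing idea in the abstract can be sketched as an observation wrapper: each frame is converted to a map of class labels before the agent sees it, while the reinforcement learning algorithm itself is left untouched. The sketch below is only an illustration under stated assumptions, not the authors' implementation. `toy_segmenter`, `PALETTE`, `SegmentedObservation`, and `DummyEnv` are hypothetical names, and the nearest-colour rule is a stand-in for a trained semantic segmentation network.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]
LabelMap = List[List[int]]

# Hypothetical palette (assumption): class id -> representative RGB colour.
PALETTE: Dict[int, Pixel] = {
    0: (92, 148, 252),  # background
    1: (200, 76, 12),   # ground / blocks
    2: (248, 56, 0),    # player
}


def toy_segmenter(frame: Frame) -> LabelMap:
    """Stand-in for a segmentation network: label each pixel by nearest palette colour."""
    def nearest(pixel: Pixel) -> int:
        return min(
            PALETTE,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[c])),
        )
    return [[nearest(p) for p in row] for row in frame]


class SegmentedObservation:
    """Wraps an environment so the agent only receives class-label maps."""

    def __init__(self, env, segmenter):
        self.env = env
        self.segmenter = segmenter

    def reset(self) -> LabelMap:
        return self.segmenter(self.env.reset())

    def step(self, action):
        frame, reward, done, info = self.env.step(action)
        # Only the observation changes; reward and termination pass through.
        return self.segmenter(frame), reward, done, info


class DummyEnv:
    """Tiny fake environment (assumption) that always returns the same 2x2 frame."""

    FRAME: Frame = [
        [(92, 148, 252), (200, 76, 12)],
        [(248, 56, 0), (90, 150, 250)],
    ]

    def reset(self) -> Frame:
        return self.FRAME

    def step(self, action):
        return self.FRAME, 1.0, True, {}


env = SegmentedObservation(DummyEnv(), toy_segmenter)
labels = env.reset()  # textures and colours are replaced by class ids
```

Because the wrapper changes only the observations, any agent that accepts image-like input could in principle be trained on the label maps without changes to its learning rule.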

Keywords

Domain adaptation; Reinforcement learning; Semantic segmentation; Synthetic data

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal MULTIMEDIA TOOLS AND APPLICATIONS, which, owing to its progression and the impact it has achieved in recent years according to Scopus (SJR), has become a reference in its field. In 2023, the year of publication, the journal ranked in the first quartile (Q1) of the Media Technology category.

From a relative perspective, the normalized impact indicator calculated from the Field Citation Ratio (FCR) of the Dimensions source is 1.15, which indicates that the work is cited above average compared with works in the same discipline and the same year of publication. (source consulted: Dimensions Aug 2025)

Specifically, according to different indexing agencies, this work has accumulated the following citations as of 2025-08-07:

  • WoS: 1
  • Scopus: 3

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-08-07:

  • Academic use, as evidenced by the Altmetric indicator of saves in the personal bibliographic manager Mendeley, totals 14.
  • Bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggest that readers are using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. The claim is supported by the "Capture" indicator, which totals 14 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 0.25.
  • The number of mentions on the social network X (formerly Twitter): 1 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.
  • Assignment of a Handle/URN as an identifier within the deposit in the Institutional Repository: https://repositorio.uam.es/handle/10486/704109

Leadership analysis of institutional authors

There is a significant leadership presence as some of the institution’s authors appear as the first or last signer, detailed as follows: First Author (MONTALVO RODRIGO, JAVIER) and Last Author (BESCOS CANO, JESUS).

The corresponding author is MONTALVO RODRIGO, JAVIER.