
Analysis of institutional authors

Granados Sanandres, Ana (Author); Camacho Fernandez, David (Author); Rodriguez Ortiz, Francisco Borja (Author)

October 19, 2020

Is the contextual information relevant in text clustering by compression?

Published in: EXPERT SYSTEMS WITH APPLICATIONS, 39(10): 8537-8546, 2012-08-01. DOI: 10.1016/j.eswa.2012.01.215

Authors: Granados A; Camacho D; Rodríguez F

Affiliations

Universidad Autónoma de Madrid - Author

Abstract

Usually, when analyzing data that have not been processed or filtered yet, it can be observed that not all the data have equal importance. Thus, it is common to find relevant data surrounded by non-relevant data. This occurs when analyzing textual information due to its intrinsic nature: texts contain words that provide a lot of information about the subject matter, whereas they contain other words with little meaning or relevance. We believe that although in principle the non-relevant words are not as important as the relevant ones, the former constitute the substrate that supports the latter. Since this substrate is the context that surrounds the relevant information, we call it the contextual information. In this paper, we analyze the relevance that the contextual information has in textual data, in a clustering by compression scenario. We generate the contextual information by applying a distortion technique previously developed by the authors. One of the main characteristics of this technique is that it maintains the contextual information. In this paper we compare this technique with three new distortion techniques that destroy the contextual information in different ways. The experimental results support our hypothesis that the contextual information is relevant at least in the area of text clustering by compression. © 2012 Elsevier Ltd. All rights reserved.
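
As a rough illustration of the setting described in the abstract, compression-based clustering is commonly built on the normalized compression distance (NCD) between pairs of documents. The sketch below is not the authors' implementation: the abstract names neither the compressor nor the distance, so the use of `zlib` and of NCD is an assumption, and `remove_words` is a hypothetical helper that simply deletes selected words (one simple way of distorting a text without preserving the surrounding context).

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size. Lower values mean more similar texts.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def remove_words(text: str, words) -> str:
    """Hypothetical distortion: drop every occurrence of the given words (case-insensitive)."""
    drop = {w.lower() for w in words}
    return " ".join(t for t in text.split() if t.lower() not in drop)


if __name__ == "__main__":
    doc_a = "the cat sat on the mat and looked at the dog " * 20
    doc_b = "stock markets fell sharply as investors sold technology shares " * 20
    print("NCD(a, a):", ncd(doc_a, doc_a))
    print("NCD(a, b):", ncd(doc_a, doc_b))
    print(remove_words("The cat sat on the mat", ["the"]))
```

A pairwise NCD matrix computed this way could then be passed to any standard hierarchical clustering routine. Comparing the clusters obtained from original and distorted texts is the kind of experiment the abstract describes.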

Keywords

Compression-based text clustering; Contextual information; Word removal

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal EXPERT SYSTEMS WITH APPLICATIONS, which, owing to its progression and the good impact it has achieved in recent years according to WoS (JCR), has become a reference in its field. In the year of publication, 2012, the journal ranked 13/79 in the category Operations Research & Management Science, placing it in Q1 (first quartile).

From a relative perspective, the normalized impact indicator calculated from the Field Citation Ratio (FCR) of the Dimensions source gives a value of 1.45, which indicates that the work is cited above average compared with works in the same discipline and the same year of publication (source consulted: Dimensions, Jul 2025).

Specifically, according to different indexing agencies, this work had accumulated the following number of citations as of 2025-07-26:

  • WoS: 7
  • Scopus: 8
  • Google Scholar: 13

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-07-26:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that readers are using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. This claim is supported by the "Capture" indicator, which gives a total of 20 (PlumX).

Leadership analysis of institutional authors

There is a significant leadership presence, as authors from the institution appear as first and last author: First Author (GRANADOS SANANDRES, ANA) and Last Author (RODRIGUEZ ORTIZ, FRANCISCO BORJA).