Abstract
© Springer Nature Switzerland AG 2018. This article presents a method that automatically extracts relevant concepts, each consisting of one or several words. Its main contribution is that it works from a single document of any domain, regardless of its length; the experiments, however, use short documents, which are the kind most frequently found on the web. The research was conducted on documents written in Spanish and was tested on multiple randomly selected domains to compare results. To this end, an algorithm was used to automatically identify syntactic patterns in the document. This work builds on the previous work of [1] to obtain its results. The algorithm is based on statistical approximations and on the length of the identifiable patterns contained in the document, and it applies heuristics that can promote or demote patterns according to which of the five processed methods (M1 to M5) is selected. These patterns yield the candidate concepts, which then go through a further evaluation process that produces the final concepts. The proposal offers at least four advantages: (1) it is multi-domain, (2) it is independent of text length, (3) it can work with one or more documents, and (4) it allows garbage or undesirable patterns to be discarded from the beginning. The method was applied to 11 different domains, with precision ranging from 58% to 70% and recall from 25% to 46%.
Original language | American English |
---|---|
Title of host publication | Concept identification from single-documents |
Editors | Javier Del Cioppo-Morstadt, Néstor Vera-Lucio, Martha Bucaram-Leverone, Rafael Valencia-García, Gema Alcaraz-Mármol |
Publisher | Springer Verlag |
Pages | 158-173 |
Number of pages | 16 |
DOI | |
Status | Published - 1 Nov 2018 |
Event | International Conference on Technologies and Innovation: 4th International Conference, CITI 2018, Guayaquil, Ecuador - Guayaquil, Ecuador Duration: 6 Nov 2018 → 9 Nov 2018 Conference number: 4 https://link.springer.com/book/10.1007/978-3-030-00940-3 |
Conference
Conference | International Conference on Technologies and Innovation |
---|---|
Abbreviated title | Technologies and Innovation |
Country/Territory | Ecuador |
City | Guayaquil |
Period | 6 Nov 2018 → 9 Nov 2018 |
Internet address |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2018.