Research on the effectiveness of using LSTM architecture in modeling the cognitive process of recognition
Last updated on 25 February 2025
Authors:
A.V.Miakenkyi*, orcid.org/0000-0002-4141-001X, Dnipro University of Technology, Dnipro, Ukraine
M.O.Aleksieiev, orcid.org/0000-0001-8726-7469, Dnipro University of Technology, Dnipro, Ukraine
S.M.Matsiuk, orcid.org/0000-0001-6798-5500, Dnipro University of Technology, Dnipro, Ukraine
* Corresponding author
Naukovyi Visnyk Natsionalnoho Hirnychoho Universytetu. 2025, (1): 090 - 095
https://doi.org/10.33271/nvngu/2025-1/090
Abstract:
A person’s ability to recognize and distinguish the meanings of words when working with text is one of the higher cognitive functions of the brain and belongs, in particular, to the cognitive process of recognition. In natural language processing (NLP), the problem of determining which meaning a word has in a given text is called word sense disambiguation (WSD). Many approaches to WSD exist, including ones based on neural networks.
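To make the WSD task itself concrete, here is a minimal sketch that picks a sense by counting the words shared between the context and each dictionary gloss (a simplified Lesk-style overlap). This is not the neural method studied in the paper. The English word, the two sense labels, the glosses, and the stop-word list are hypothetical and chosen only for illustration.

```python
# Simplified Lesk-style word sense disambiguation (illustrative only).
# The sense inventory below is a hypothetical toy example, not the SUM dictionary.

STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "or",
             "for", "that", "is", "at", "by", "with"}

SENSES = {
    "bank": {
        "bank_finance": "an institution that accepts deposits of money and grants loans",
        "bank_river": "the sloping land along the edge of a river or lake",
    }
}


def tokenize(text):
    """Lower-case the text and return its set of words without punctuation."""
    return {w.strip(".,;:!?").lower() for w in text.split()}


def disambiguate(word, sentence):
    """Return the sense whose gloss shares the most words with the context."""
    context = tokenize(sentence) - STOPWORDS - {word}
    scores = {
        sense: len(context & (tokenize(gloss) - STOPWORDS))
        for sense, gloss in SENSES[word].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    print(disambiguate("bank", "She opened an account at the bank to deposit money"))
    print(disambiguate("bank", "They fished from the grassy bank of the river"))
```

Exact word overlap is brittle: "deposit" does not match "deposits" here, and the problem is worse in a highly inflected language. That weakness is one reason to lemmatize the text first, and one motivation for learned sequence models.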
Purpose. To create and analyze a bidirectional LSTM (Bi-LSTM) neural network architecture for solving the WSD problem in the Ukrainian language.
Methodology. One modern approach to WSD uses LSTM models, a type of recurrent neural network architecture that can capture long-term dependencies when modeling sequences. To assess the effectiveness of this architecture, two neural networks were built during the study: one with the classic LSTM architecture and one with its improved version, Bi-LSTM. A data set based on the SUM dictionary of the Ukrainian language was also created as part of the study. Both models were trained on this data set, and the results were then compared.
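The difference between the two architectures can be illustrated with a minimal sketch. It is written in plain Python rather than TensorFlow and is not taken from the paper. The helper names (`make_params`, `lstm_step`, `run_lstm`, `run_bilstm`), the random weights, and the toy dimensions are assumptions made for illustration. A unidirectional LSTM state at position t depends only on the tokens up to t. A Bi-LSTM concatenates that state with the state of a second LSTM that reads the sequence in reverse.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(input_dim, hidden_dim, rng):
    """Random weights and zero biases for the four LSTM gates:
    i (input), f (forget), g (cell candidate), o (output)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
            for gate in "ifgo"}


def lstm_step(params, x, h, c):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = x + h  # concatenate the input vector and the previous hidden state

    def gate(name, activation):
        weights, biases = params[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(weights, biases)]

    i, f = gate("i", sigmoid), gate("f", sigmoid)
    g, o = gate("g", math.tanh), gate("o", sigmoid)
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, sequence, hidden_dim):
    """Read the sequence left to right; return the hidden state at each position."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    states = []
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
        states.append(h)
    return states


def run_bilstm(fwd_params, bwd_params, sequence, hidden_dim):
    """Concatenate, per position, a left-to-right and a right-to-left state."""
    forward = run_lstm(fwd_params, sequence, hidden_dim)
    backward = run_lstm(bwd_params, sequence[::-1], hidden_dim)[::-1]
    return [f + b for f, b in zip(forward, backward)]


if __name__ == "__main__":
    rng = random.Random(0)
    fwd, bwd = make_params(3, 4, rng), make_params(3, 4, rng)
    # Two toy "sentences" of one-hot tokens that differ only in the last token.
    seq_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    seq_b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
    same_uni = run_lstm(fwd, seq_a, 4)[1] == run_lstm(fwd, seq_b, 4)[1]
    same_bi = run_bilstm(fwd, bwd, seq_a, 4)[1] == run_bilstm(fwd, bwd, seq_b, 4)[1]
    print("unidirectional state at position 1 unchanged:", same_uni)
    print("bidirectional state at position 1 unchanged:", same_bi)
```

Changing only the last token leaves the unidirectional state at the middle position identical, because that state never sees the right-hand context. The bidirectional state at the same position should differ, which is the property that lets a Bi-LSTM use the words on both sides of an ambiguous word.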
Findings. Analysis of the accuracy of the two models showed that the network built on the Bi-LSTM architecture is the more effective one: accuracy was 73 % for the LSTM model and 83 % for the Bi-LSTM model. The difference is attributed to the additional layer in the Bi-LSTM model, which makes it possible to take into account the full context of a word in the given text.
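As a quick worked reading of the two reported accuracies, the gap can be expressed in percentage points and as a relative reduction in error. The derived quantities below are computed here and are not stated in the paper. They assume that error is simply 100 % minus accuracy and that both models were evaluated on the same data.

```latex
\begin{align*}
\Delta_{\text{acc}} &= 83\,\% - 73\,\% = 10 \text{ percentage points},\\
\text{error}_{\text{LSTM}} &= 100\,\% - 73\,\% = 27\,\%, \qquad
\text{error}_{\text{Bi-LSTM}} = 100\,\% - 83\,\% = 17\,\%,\\
\text{relative error reduction} &= \frac{27 - 17}{27} \approx 0.37 .
\end{align*}
```

Under these assumptions, the Bi-LSTM model makes roughly a third fewer errors than the LSTM model.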
Originality. The paper establishes that, for solving the WSD problem in Ukrainian texts, a neural network model built on the Bi-LSTM architecture is more effective than one built on the classical LSTM architecture.
Practical value. The work proposes a model that resolves word ambiguity in the Ukrainian language and can be used in text processing tasks, in particular for modeling the cognitive process of understanding.
Keywords: cognitive modelling, cognitive process, NLP, WSD, LSTM, Bi-LSTM, pymorphy2, stanza, tensorflow