
Research on the effectiveness of using LSTM architecture in modeling the cognitive process of recognition


Authors:


A. V. Miakenkyi*, orcid.org/0000-0002-4141-001X, Dnipro University of Technology, Dnipro, Ukraine

M. O. Aleksieiev, orcid.org/0000-0001-8726-7469, Dnipro University of Technology, Dnipro, Ukraine

S. M. Matsiuk, orcid.org/0000-0001-6798-5500, Dnipro University of Technology, Dnipro, Ukraine

* Corresponding author.


Full article



Naukovyi Visnyk Natsionalnoho Hirnychoho Universytetu. 2025, (1): 090-095

https://doi.org/10.33271/nvngu/2025-1/090



Abstract:


A person's ability to recognize and distinguish the meanings of words when working with textual information belongs to the higher cognitive functions of the brain, in particular to the cognitive process of recognition. In natural language processing (NLP), the task of identifying the intended meaning of a word in a text is known as word sense disambiguation (WSD). There are many approaches to solving WSD, in particular approaches based on neural networks.
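
To make the task concrete, consider the Ukrainian homonym «коса», which can mean a braid of hair, a scythe, or a sand spit: WSD amounts to choosing one of these senses from an inventory based on the surrounding words. The toy dictionary-overlap baseline below (a simplified Lesk heuristic, not the method of this paper) uses a miniature sense inventory invented purely for this sketch.

# Toy WSD baseline: pick the sense whose illustrative gloss words
# overlap most with the lemmatized context (simplified Lesk heuristic).
SENSES = {
    "braid":  {"волосся", "заплітати", "дівчина"},   # hair, to braid, girl
    "scythe": {"трава", "косити", "поле"},           # grass, to mow, field
    "spit":   {"річка", "пісок", "берег"},           # river, sand, shore
}

def disambiguate(context_lemmas):
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context_lemmas))

# "Дівчина заплітає довгу косу" -> its lemmas overlap the "braid" gloss.
print(disambiguate({"дівчина", "заплітати", "довгий"}))  # braid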


Purpose.
To create and analyze a bidirectional LSTM (Bi-LSTM) neural network architecture for solving the WSD problem in the Ukrainian language.


Methodology.
One of the modern approaches to solving the WSD problem is the use of LSTM models, a type of recurrent neural network architecture that captures long-term dependencies when modeling sequences. To assess the effectiveness of this architecture, two neural networks were built during the study: one with the classic LSTM architecture and one with its improved version, Bi-LSTM. As part of the study, a data set based on the SUM dictionary of the Ukrainian language was also created. The implemented models were trained on the generated data set, after which a comparative analysis of the results was performed.
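
A minimal TensorFlow/Keras sketch of the two compared architectures is given below (TensorFlow is one of the tools named in the keywords). The vocabulary size, embedding dimension, unit count, and number of sense classes are illustrative assumptions, not values from the paper.

import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20_000  # assumed vocabulary size
EMBED_DIM = 128      # assumed embedding dimension
SEQ_LEN = 40         # assumed context window length, in tokens
NUM_SENSES = 5       # assumed number of sense classes

def build_model(bidirectional):
    """Sense classifier over a padded sequence of token IDs."""
    recurrent = layers.LSTM(64)  # 64 units is an illustrative choice
    if bidirectional:
        # The backward pass also reads the context to the right of the
        # target word, the property credited for the accuracy gain.
        recurrent = layers.Bidirectional(recurrent)
    return models.Sequential([
        layers.Input(shape=(SEQ_LEN,), dtype="int32"),
        layers.Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True),
        recurrent,
        layers.Dense(NUM_SENSES, activation="softmax"),
    ])

lstm_model = build_model(bidirectional=False)
bi_lstm_model = build_model(bidirectional=True)
bi_lstm_model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])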


Findings.
Comparing the accuracy of the two models establishes the advantage of the neural network built according to the Bi-LSTM architecture: the LSTM model reached 73 % accuracy, while the Bi-LSTM model reached 83 %. The gain is due to the additional backward layer in the Bi-LSTM model, which makes it possible to take into account the full context of a word in the given text.


Originality.
The paper establishes the effectiveness of a neural network model built on the Bi-LSTM architecture, in comparison with the classical LSTM architecture, for solving the WSD problem in Ukrainian-language texts.


Practical value.
The work proposes a model that resolves word sense ambiguity in the Ukrainian language and that can be used in text processing tasks, in particular for modeling the cognitive process of understanding.
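
A hedged sketch of how such a model might be applied in a text-processing pipeline is shown below. The lemmatization step uses stanza (one of the tools named in the keywords); the word_to_id vocabulary mapping and the trained model handle are hypothetical placeholders, not artifacts published with the paper.

import stanza

# stanza.download("uk")  # one-time download of the Ukrainian models
nlp = stanza.Pipeline("uk")

def lemmatize(text):
    """Map Ukrainian surface forms to lemmas so they match dictionary entries."""
    doc = nlp(text)
    return [word.lemma for sentence in doc.sentences for word in sentence.words]

# Hypothetical downstream use of a trained Bi-LSTM sense classifier
# (ids would be padded to the model's expected sequence length):
# ids = [word_to_id.get(lemma, 0) for lemma in lemmatize("Дівчина заплітає косу")]
# probs = bi_lstm_model.predict([ids])  # distribution over sense classes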



Keywords:
cognitive modelling, cognitive process, NLP, WSD, LSTM, Bi-LSTM, pymorphy2, stanza, tensorflow

