An approach to identifying threats of extracting confidential data from automated control systems based on internet technologies

Keywords: machine learning, personal data, information security, deep learning, countering information security threats, confidential data, identifying the entities of natural language texts

Abstract

Cybercrime is growing and evolving rapidly alongside ubiquitous, global digitalization. The state considers the creation of an environment conducive to information security to be a strategic goal for the development of the information society in Russia. However, the question of how the “state of protection of the individual, society and the state from internal and external information threats” should be achieved in accordance with the “Information Security” and the “Digital Economy of Russia 2024” programs remains open. The aim of this study is to increase the efficiency with which automated control systems identify confidential data in HTML pages, in order to reduce the risk of this data being used in the preparatory and initial stages of attacks on the infrastructure of government organizations. The article describes an approach developed to identify confidential data by combining two neural network technologies: a universal sentence encoder and a bidirectional long short-term memory (BiLSTM) recurrent neural network. An evaluation against a modern natural language processing tool (spaCy) showed the merits of the methodological approach and its prospects for practical application.
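
To make the task in the abstract concrete, the following is a minimal sketch of the surrounding pipeline only: it extracts visible text from an HTML page, splits it into sentences, and passes each sentence to a detector. It is an illustration under stated assumptions, not the authors' implementation. All names (`VisibleTextExtractor`, `html_to_sentences`, `placeholder_detector`, `flag_confidential`) are hypothetical, and the regex-based detector is a deliberately simple stand-in for the sentence-encoder and BiLSTM model the article describes.

```python
import re
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_sentences(html):
    """Return the visible text of an HTML page as a list of sentences."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # Naive segmentation: split after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


# ASSUMPTION: a toy rule-based detector standing in for the neural model.
# It flags sentences containing something shaped like an e-mail or phone number.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s()-]{8,}\d")


def placeholder_detector(sentence):
    return bool(EMAIL.search(sentence) or PHONE.search(sentence))


def flag_confidential(html, detector=placeholder_detector):
    """Return the sentences of an HTML page that the detector flags."""
    return [s for s in html_to_sentences(html) if detector(s)]
```

The `detector` argument is the point where a trained classifier would be plugged in: any callable that maps a sentence to a boolean can replace the placeholder without changing the extraction step.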

Published
2021-09-28
How to Cite
Kuzmin V. N., & Menisov A. B. (2021). An approach to identifying threats of extracting confidential data from automated control systems based on internet technologies. BUSINESS INFORMATICS, 15(3), 35-47. https://doi.org/10.17323/2587-814X.2021.3.35.47