CN113627174A - Sensitive information monitoring method and system based on enterprise historical digitization - Google Patents

Info

Publication number
CN113627174A
Authority
CN
China
Prior art keywords
sensitive
sensitive word
enterprise
historical information
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110940138.XA
Other languages
Chinese (zh)
Inventor
麦英健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Power Supply Bureau Co Ltd
Original Assignee
Shenzhen Power Supply Bureau Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Power Supply Bureau Co Ltd
Priority to CN202110940138.XA
Publication of CN113627174A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a sensitive information monitoring method and system based on enterprise historical digitization. The method comprises the following steps: acquiring enterprise historical information input by a user; using a sensitive word identification model to identify whether the enterprise historical information contains new sensitive words, and updating a dynamic sensitive word bank, which stores the sensitive words, according to the identification result; matching the enterprise historical information against the dynamic sensitive word bank by means of word segmentation; when the enterprise historical information contains sensitive words that match the dynamic sensitive word bank, filtering the enterprise historical information; and when it contains no such sensitive words, displaying the enterprise historical information normally. The invention thereby enables real-time monitoring of sensitive information in co-constructed content while an enterprise's digital history is being built collaboratively.

Description

Sensitive information monitoring method and system based on enterprise historical digitization
Technical Field
The invention relates to the technical field of digital application, in particular to a sensitive information monitoring method and system based on enterprise historical digitization.
Background
With the rapid development of modern science and technology, more and more enterprises are following the technological trend, breaking away from traditional constraints and carrying out digital transformation. By combining cultural communication with high technology to open up a brand-new field, they re-collect, store and process fragmented enterprise historical information and the innovation path of the enterprise's cultural development, forming an enterprise digital history. However, digitizing the enterprise history is a process of co-construction and sharing by all employees, and the enterprise history is a digital asset that must be handled rigorously, so it is necessary to ensure the validity and rigor of the digital history throughout its co-construction.
Most existing co-constructed digital information systems either rely on manual review or protect the validity of their digital content with simple sensitive word filtering. These approaches are inefficient and inflexible overall, and they lack a complete means of monitoring and follow-up processing.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a method and a system for monitoring sensitive information based on enterprise history digitization, which can realize real-time monitoring of sensitive information on co-constructed content in the process of enterprise digitization history co-construction.
In order to solve the technical problem, the invention provides a sensitive information monitoring method based on enterprise historical digitization, which comprises the following steps:
step S1, acquiring enterprise history information input by a user;
step S2, identifying whether new sensitive words exist in the enterprise historical information by using a sensitive word identification model constructed based on a machine learning algorithm, and updating a dynamic sensitive word bank according to an identification result, wherein the dynamic sensitive word bank is used for storing the sensitive words;
step S3, sensitive word matching is carried out on the acquired enterprise historical information and the dynamic sensitive word bank by utilizing word segmentation technology;
step S4, when the enterprise historical information contains sensitive words matched with the dynamic sensitive word bank, filtering the enterprise historical information; and when the sensitive words matched with the dynamic sensitive word bank do not exist in the enterprise historical information, normally displaying the enterprise historical information.
Further, the sensitive word recognition model is constructed by the following steps:
collecting historical information of an enterprise for training;
marking the sensitive words in the historical information of the enterprise for training, and marking the sensitive word level corresponding to the sensitive words in the historical information of the enterprise for training;
and constructing the sensitive word recognition model by using the marked enterprise historical information for training as training data and utilizing a machine learning algorithm.
Further, the dynamic sensitive word bank is also used for storing the sensitive word level; the step S2 further includes: and when the sensitive word recognition model recognizes that a new sensitive word exists in the enterprise historical information, grading the new sensitive word according to the sensitive word level, and storing the new sensitive word and the sensitive word level corresponding to the new sensitive word in the dynamic sensitive word bank so as to update the dynamic sensitive word bank.
Further, the sensitive word level comprises a high risk level; the step S4 further includes: when the sensitive words matched with the dynamic sensitive word bank exist in the enterprise historical information: and acquiring the sensitive word level corresponding to the matched sensitive word from the dynamic sensitive word bank, sending alarm information when the sensitive word level corresponding to the matched sensitive word is a high-risk level, and freezing the enterprise historical information and the account number of the user inputting the enterprise historical information.
Further, the step S4 further includes: when sensitive words matching the dynamic sensitive word bank exist in the enterprise historical information: acquiring the sensitive word level corresponding to the matched sensitive word from the dynamic sensitive word bank; when that level is not the high-risk level, recording the number of violations of the user who input the enterprise historical information and judging whether that number has reached a preset alarm count; if it has, sending alarm information and freezing both the enterprise historical information and the account of the user who input it; and if it has not, filtering the sensitive words out of the enterprise historical information and then displaying its content.
Further, the step S4 further includes: when sensitive words matching the dynamic sensitive word bank exist in the enterprise historical information: recording the number of violations of the user who input the enterprise historical information; judging whether that number has reached a preset alarm count; if it has, sending alarm information and freezing both the enterprise historical information and the account of the user who input it; and if it has not, filtering the sensitive words out of the enterprise historical information and then displaying its content.
The invention also provides a sensitive information monitoring system based on enterprise historical digitization, which comprises: the dynamic sensitive word bank is used for storing sensitive words; the acquisition unit is used for acquiring enterprise history information input by a user; the identification unit is used for identifying whether new sensitive words exist in the enterprise historical information by using a sensitive word identification model constructed based on a machine learning algorithm and updating the dynamic sensitive word bank according to an identification result; the matching unit is used for matching the acquired enterprise historical information with the dynamic sensitive word bank by utilizing a word segmentation technology; the processing unit is used for filtering the enterprise historical information when the sensitive words matched with the dynamic sensitive word bank exist in the enterprise historical information; and when the sensitive words matched with the dynamic sensitive word bank do not exist in the enterprise historical information, normally displaying the enterprise historical information.
Further, the sensitive information monitoring system further includes: a collection unit used for collecting enterprise historical information for training; a labeling unit used for labeling the sensitive words in the training information, together with the sensitive word level corresponding to each labeled sensitive word; and a construction unit used for constructing the sensitive word recognition model with a machine learning algorithm, using the labeled training information as training data.
Further, the dynamic sensitive word bank is also used for storing the sensitive word level; the identification unit is further configured to, when the sensitive word identification model identifies that a new sensitive word exists in the enterprise history information, grade the new sensitive word with respect to the sensitive word level, and store the new sensitive word and the sensitive word level corresponding to the new sensitive word in the dynamic sensitive word bank, so as to update the dynamic sensitive word bank.
Further, the sensitive word level comprises a high-risk level; the processing unit is further configured to: obtain from the dynamic sensitive word bank, when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, the sensitive word level corresponding to the matched sensitive word; when that level is the high-risk level, send alarm information and freeze both the enterprise historical information and the account of the user who input it; when that level is not the high-risk level, record the number of violations of the user who input the enterprise historical information and judge whether that number has reached a preset alarm count; if it has, send alarm information and freeze both the enterprise historical information and the account of the user who input it; and if it has not, filter the sensitive words out of the enterprise historical information and then display its content.
The embodiments of the invention have the following beneficial effects: the dynamic sensitive word bank is updated automatically, which improves the efficiency of maintaining the word bank and therefore of the sensitive word filtering work as a whole, while matching sensitive words by means of word segmentation improves the accuracy with which sensitive information is filtered; the invention can also identify sensitive words in the enterprise historical information and the user's violations, handle them automatically, raise a real-time alarm when necessary so that risks are flagged early, and respond automatically so as to minimize the risk posed by violating content. The invention can therefore monitor and process sensitive information in user co-constructed content in real time, across the full flow, while the enterprise digital history is being co-constructed.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a sensitive information monitoring method based on enterprise history digitization according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a sensitive word recognition process of enterprise history information according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a working process of the method for monitoring sensitive information based on enterprise history digitization according to an embodiment of the present invention when a sensitive word matching a dynamic sensitive word library exists in enterprise history information.
Detailed Description
The following description of the embodiments refers to the accompanying drawings, which are included to illustrate specific embodiments in which the invention may be practiced.
The embodiment of the invention provides a sensitive information monitoring method based on enterprise historical digitization, which can be used to monitor sensitive information in the content that users contribute to the system while the enterprise digital history is being co-constructed.
Referring to fig. 1, the method for monitoring sensitive information based on enterprise history digitization according to the embodiment of the present invention includes the following steps S1-S4.
In step S1, the enterprise history information input by the user is acquired.
In step S2, a sensitive word recognition model constructed based on a machine learning algorithm is used to identify whether a new sensitive word exists in the acquired enterprise history information, and a dynamic sensitive word library is updated according to the identification result.
Specifically, the embodiment of the invention establishes a dynamic sensitive word bank for storing sensitive words, and uses machine learning to identify new sensitive words in enterprise historical information automatically, so that the dynamic sensitive word bank is continuously updated and improved.
Identifying new sensitive words automatically by machine learning requires a sensitive word recognition model, which may be constructed as follows: collecting a large amount of enterprise historical information for training; labeling the sensitive words in the collected training information, together with the sensitive word level corresponding to each labeled sensitive word; and using the labeled training information as training data to construct the sensitive word recognition model with a machine learning algorithm.
The constructed sensitive word recognition model can then recognize new sensitive words in input enterprise historical information and, whenever one is recognized, store it in the dynamic sensitive word bank, so that the dynamic sensitive word bank is continuously updated.
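The construction steps above (collect, label, train) can be illustrated with a minimal sketch. The patent does not name a specific machine learning algorithm, so the example below assumes a one-nearest-neighbour classifier over character trigrams as a stand-in; the labelled words, the level names ("none", "normal", "high"), the similarity threshold and the class name SensitiveWordModel are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical labelled samples taken from annotated training text.
# "none" marks an ordinary word; "normal" and "high" are sensitive word levels.
LABELLED_WORDS = [
    ("anniversary", "none"),
    ("substation", "none"),
    ("scam", "normal"),
    ("idiot", "normal"),
    ("bribe", "high"),
    ("kickback", "high"),
]


def char_trigrams(word: str) -> Set[str]:
    """Return the set of character trigrams of a word, with boundary markers."""
    padded = f"^{word.lower()}$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class SensitiveWordModel:
    """Assumed stand-in model: 1-nearest-neighbour on trigram Jaccard similarity."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold
        self.examples: List[Tuple[Set[str], str]] = []

    def fit(self, samples: Iterable[Tuple[str, str]]) -> "SensitiveWordModel":
        # "Training" stores the trigram set and level of every labelled word.
        self.examples = [(char_trigrams(word), level) for word, level in samples]
        return self

    def predict(self, word: str) -> str:
        # Find the most similar labelled word; fall back to "none" when
        # nothing is similar enough.
        grams = char_trigrams(word)
        best_level, best_similarity = "none", 0.0
        for example_grams, level in self.examples:
            similarity = len(grams & example_grams) / len(grams | example_grams)
            if similarity > best_similarity:
                best_level, best_similarity = level, similarity
        return best_level if best_similarity >= self.threshold else "none"


model = SensitiveWordModel().fit(LABELLED_WORDS)
```

With this toy data, unseen variants such as "bribes" or "kickbacks" inherit the level of the labelled word they resemble, while an unrelated word such as "history" is left unflagged. A production model would be trained on far more data and would likely use context rather than spelling alone.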
Further, the dynamic sensitive word bank is also used for storing the sensitive word level. In step S2, when the constructed sensitive word recognition model recognizes that a new sensitive word exists in the enterprise historical information, the new sensitive word may additionally be assigned a sensitive word level, and the new sensitive word and its level are stored together in the dynamic sensitive word bank, so as to update it. Grading sensitive words in this way makes it possible to set violation levels, so that violating information and the users who input it can be handled automatically according to those levels, as described further below.
Accordingly, the sensitive word recognition process applied to the acquired enterprise historical information in step S2 may be as shown in Fig. 2. In step S21, the acquired enterprise historical information is input into the sensitive word recognition model. In step S22, the sensitive word recognition model determines whether the information contains a new sensitive word. If it does, the process proceeds to step S23, in which the model identifies the sensitive word level corresponding to the new sensitive word, and then to step S24, in which the new sensitive word and its level are stored in the dynamic sensitive word bank, thereby updating it. If no new sensitive word is recognized, the dynamic sensitive word bank is not updated (step S25).
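The S21 to S25 flow can be sketched as a small update routine. This is an illustration under assumptions, not the patented implementation: the word bank is modelled as a plain dictionary from word to level, the input is simplified to a list of candidate words, and toy_classifier is a hypothetical stand-in for the trained recognition model.

```python
from typing import Callable, Dict, Iterable, List


def update_word_bank(
    candidate_words: Iterable[str],
    classify: Callable[[str], str],
    word_bank: Dict[str, str],
) -> List[str]:
    """Add newly recognised sensitive words and their levels to the word bank.

    Returns the list of words that were added; an empty list corresponds to
    step S25 (word bank not updated).
    """
    added: List[str] = []
    for word in candidate_words:          # S21: feed the input to the model
        if word in word_bank:
            continue                      # already stored, so not a new word
        level = classify(word)            # S22/S23: recognise word and level
        if level != "none":
            word_bank[word] = level       # S24: store word together with level
            added.append(word)
    return added


def toy_classifier(word: str) -> str:
    """Hypothetical stand-in for the trained sensitive word recognition model."""
    return {"kickback": "high", "scam": "normal"}.get(word, "none")


word_bank = {"bribe": "high"}
added = update_word_bank(["annual", "kickback", "bribe", "scam"], toy_classifier, word_bank)
```

Here "bribe" is skipped because it is already stored, "annual" is not sensitive, and "kickback" and "scam" are added with their levels.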
Next, in step S3, sensitive word matching is performed on the acquired enterprise history information and the dynamic sensitive word library by using word segmentation technology. That is to say, the embodiment of the invention adopts the word segmentation technology to filter the sensitive words of the acquired enterprise historical information, thereby improving the accuracy of filtering the sensitive words.
In step S4, when there is a sensitive word matching the dynamic sensitive word bank in the enterprise history information, filtering the enterprise history information; and when the sensitive words matched with the dynamic sensitive word bank do not exist in the enterprise historical information, normally displaying the enterprise historical information.
Specifically, in step S3, the obtained enterprise history information and the dynamic sensitive word library are subjected to sensitive word matching by using a word segmentation technique, and whether a sensitive word exists in the enterprise history information is determined. In step S4, when there is no sensitive word matching with the dynamic sensitive word bank in the enterprise history information, it is determined that there is no sensitive word in the enterprise history information, and then the enterprise history information is displayed normally; and when the sensitive words matched with the dynamic sensitive word bank exist in the enterprise historical information, filtering the enterprise historical information.
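Steps S3 and S4 can be sketched as follows. The patent does not say which word segmentation technique is used, so this example assumes forward maximum matching, one common dictionary-based approach; the vocabulary, the word bank contents and the function names are hypothetical, and unspaced ASCII text is used to mimic text without word delimiters.

```python
from typing import Dict, List, Set


def segment(text: str, vocabulary: Set[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest known word,
    falling back to a single character when nothing in the vocabulary matches."""
    max_len = max((len(word) for word in vocabulary), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


def match_sensitive(text: str, vocabulary: Set[str], word_bank: Dict[str, str]) -> List[str]:
    """S3: segment the text, then keep the tokens found in the word bank."""
    tokens = segment(text, vocabulary | set(word_bank))
    return [token for token in tokens if token in word_bank]


def mask(text: str, vocabulary: Set[str], word_bank: Dict[str, str]) -> str:
    """S4 filtering: replace each matched sensitive token with asterisks."""
    tokens = segment(text, vocabulary | set(word_bank))
    return "".join("*" * len(token) if token in word_bank else token for token in tokens)


# Hypothetical general vocabulary and dynamic sensitive word bank.
VOCABULARY = {"class", "room", "report"}
WORD_BANK = {"ass": "normal", "bribe": "high"}

clean_matches = match_sensitive("classroom", VOCABULARY, WORD_BANK)
dirty_matches = match_sensitive("bribereport", VOCABULARY, WORD_BANK)
masked = mask("bribereport", VOCABULARY, WORD_BANK)
```

The "classroom" case shows why matching on segmented words can be more accurate than plain substring search: the substring "ass" occurs in the text, but segmentation yields "class" and "room", so nothing is flagged.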
In step S4, when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, the information and the user who input it may further be handled differently according to the sensitive word level and/or the number of times the user has input sensitive words. By way of example, the sensitive word levels may include a normal level and a high-risk level, in which case the handling may be as follows. For enterprise historical information containing normal-level sensitive words, the sensitive words are filtered out, the content is displayed, and the number of violations of the user who input it is recorded; when that number reaches the preset alarm count, alarm information is sent, the user's account is frozen, and the input content is frozen and not displayed. For enterprise historical information containing high-risk-level sensitive words, alarm information is sent directly, the content is frozen and not displayed, and the account of the user who input it is frozen, so as to prevent greater harm.
Specifically, the above process may be as shown in Fig. 3. When the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, in step S41 the sensitive word level corresponding to the matched sensitive word is obtained from the dynamic sensitive word bank. In step S42, it is determined whether that level is the high-risk level. If it is, then in step S43 alarm information is sent, the content of the enterprise historical information is frozen and not displayed, and the account of the user who input it is frozen. If it is not (i.e., the level is the normal level), then in step S44 the number of violations of the user who input the enterprise historical information is recorded, and it is determined whether that number has reached the alarm count. If it has, the process goes to step S43: alarm information is sent, the content is frozen and not displayed, and the user's account is frozen. If it has not, then in step S45 the sensitive words are filtered out of the enterprise historical information and its content is displayed, i.e., ordinary filtered display.
When alarm information is sent, the system can automatically deliver it to a system administrator by short message (SMS), and the system administrator then reviews and handles the case.
In an embodiment of the present invention, the alarm information may also be issued directly according to the user's number of violations. For example, in step S4, when the input enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, the number of violations of the user who input it is recorded, and it is determined whether that number has reached the preset alarm count. If it has, alarm information is sent, and the enterprise historical information is frozen (not displayed) together with the account of the user who input it. If it has not, the sensitive words are filtered out of the enterprise historical information and its content is then displayed; alternatively, the sensitive word level may be judged further, as described above.
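The branching in steps S41 to S45 can be sketched as a small state-keeping handler. The class name ViolationHandler, the returned status strings and the alarm count of 3 are assumptions made for illustration; the patent only says the alarm count is preset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ViolationHandler:
    word_bank: Dict[str, str]             # word -> "normal" or "high"
    alarm_count: int = 3                  # hypothetical preset alarm count
    violations: Dict[str, int] = field(default_factory=dict)
    frozen_accounts: Set[str] = field(default_factory=set)
    alarms: List[str] = field(default_factory=list)

    def _alarm_and_freeze(self, user: str, reason: str) -> str:
        # S43: send alarm information, freeze the content and the account.
        self.alarms.append(f"{user}: {reason}")
        self.frozen_accounts.add(user)
        return "frozen"

    def handle(self, user: str, matched: List[str]) -> str:
        if not matched:
            return "displayed"                                        # no match
        levels = {self.word_bank[word] for word in matched}           # S41
        if "high" in levels:                                          # S42
            return self._alarm_and_freeze(user, "high-risk word")     # S43
        self.violations[user] = self.violations.get(user, 0) + 1      # S44
        if self.violations[user] >= self.alarm_count:
            return self._alarm_and_freeze(user, "alarm count reached")
        return "filtered"                                             # S45


handler = ViolationHandler({"scam": "normal", "bribe": "high"})
alice_results = [handler.handle("alice", ["scam"]) for _ in range(3)]
bob_result = handler.handle("bob", ["bribe"])
```

In this run the first two normal-level violations by "alice" are filtered and displayed, the third reaches the alarm count and freezes the account, and a single high-risk word freezes "bob" immediately.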
The embodiment of the invention also provides a sensitive information monitoring system based on enterprise historical digitization, which can implement the sensitive information monitoring method based on enterprise historical digitization and comprises a dynamic sensitive word bank, an acquisition unit, an identification unit, a matching unit and a processing unit. The dynamic sensitive word bank is used for storing sensitive words and sensitive word levels. The acquisition unit is used for acquiring enterprise history information input by a user. The identification unit is used for identifying whether new sensitive words exist in the enterprise historical information by using a sensitive word identification model constructed based on a machine learning algorithm and updating the dynamic sensitive word bank according to the identification result. And the matching unit is used for matching the acquired enterprise historical information with the dynamic sensitive word bank by utilizing a word segmentation technology. The processing unit is used for filtering the enterprise historical information when the sensitive words matched with the dynamic sensitive word bank exist in the enterprise historical information; and when the sensitive words matched with the dynamic sensitive word bank do not exist in the enterprise historical information, normally displaying the enterprise historical information.
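How the five components fit together can be sketched as a thin pipeline. This is only one possible arrangement: the class MonitoringSystem, its method names and the three stub functions are hypothetical, and each unit is reduced to a callable so that the data flow between units is visible.

```python
from typing import Callable, Dict, List

WordBank = Dict[str, str]


class MonitoringSystem:
    def __init__(
        self,
        recognise: Callable[[str], WordBank],
        match: Callable[[str, WordBank], List[str]],
        process: Callable[[str, str, List[str]], str],
    ) -> None:
        self.word_bank: WordBank = {}     # dynamic sensitive word bank
        self.recognise = recognise        # identification unit
        self.match = match                # matching unit
        self.process = process            # processing unit

    def submit(self, user: str, text: str) -> str:
        """Acquisition unit entry point: receive user input and run the pipeline."""
        for word, level in self.recognise(text).items():
            self.word_bank.setdefault(word, level)      # update the word bank
        matched = self.match(text, self.word_bank)
        return self.process(user, text, matched)


# Hypothetical stubs standing in for the real units.
def recognise_stub(text: str) -> WordBank:
    return {"kickback": "high"} if "kickback" in text.split() else {}


def match_stub(text: str, word_bank: WordBank) -> List[str]:
    return [word for word in text.split() if word in word_bank]


def process_stub(user: str, text: str, matched: List[str]) -> str:
    return "filtered" if matched else text


system = MonitoringSystem(recognise_stub, match_stub, process_stub)
```

Passing the units in as callables keeps the sketch short and lets each one be replaced independently, for example swapping the whitespace-based stub matcher for a real word segmenter.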
In order to construct a sensitive word recognition model, the sensitive information monitoring system based on enterprise historical digitization further comprises an acquisition unit, a labeling unit and a construction unit, wherein the acquisition unit is used for acquiring a large amount of enterprise historical information for training; the marking unit is used for marking the sensitive words in the historical information of the enterprise for training and marking the sensitive word grades corresponding to the sensitive words in the historical information of the enterprise for training; the construction unit is used for constructing the sensitive word recognition model by using the marked enterprise historical information for training as training data and utilizing a machine learning algorithm.
The identification unit is also used for grading the new sensitive words according to the sensitive word grades when the sensitive word identification model identifies that the new sensitive words exist in the enterprise historical information, and storing the new sensitive words and the sensitive word grades corresponding to the new sensitive words in the dynamic sensitive word bank, so that the dynamic sensitive word bank is updated.
The sensitive word levels may include a normal level and a high-risk level. When the enterprise historical information contains a sensitive word that matches the dynamic sensitive word bank, the processing unit further obtains the level of the matched sensitive word from the dynamic sensitive word bank. If that level is high-risk, the processing unit issues alarm information and freezes both the enterprise historical information (which is not displayed) and the account of the user who entered it. If the level is not high-risk, the processing unit records the violation count of the user who entered the information and judges whether that count has reached a preset alarm threshold. If it has, the processing unit issues alarm information and freezes the enterprise historical information (which is not displayed) and the user's account; if it has not, the processing unit filters the sensitive words out of the enterprise historical information and then displays its content.
The processing unit may also issue an alarm based solely on the user's violation count. For example, when the entered enterprise historical information contains a sensitive word that matches the dynamic sensitive word bank, the processing unit records the violation count of the user who entered the information and judges whether that count has reached the preset alarm threshold. If it has, the processing unit issues alarm information and freezes the enterprise historical information and the account of the user who entered it. If it has not, the processing unit filters the sensitive words out of the enterprise historical information and then displays its content, or it may additionally check the sensitive word level as described above.
When alarm information is issued, the processing unit can automatically send it to a system administrator by SMS and then wait for the administrator to review and handle the case.
In the present invention, the sensitive word recognition model is built on a machine learning algorithm, so new sensitive words in the enterprise historical information entered by users can be recognized automatically and then graded; the dynamic sensitive word bank is thereby continuously updated and refined, which improves the efficiency of maintaining it. During filtering, the invention uses word segmentation to match the enterprise historical information against the dynamic sensitive word bank, which improves the accuracy and flexibility of filtering. The invention grades new sensitive words by sensitive word level, records each user's violation count, and issues alarm information when a user's violation count reaches the alarm threshold or when the enterprise historical information contains a high-risk sensitive word, thereby providing real-time automatic alarms. Depending on whether the information contains sensitive words, whether those words are high-risk, and whether the user's violation count has reached the alarm threshold, the invention automatically handles the information in different ways: displaying it normally, displaying it after filtering out the sensitive words, or freezing the information together with the account of the user who entered it.
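As an illustration of why matching on segmented words can be more accurate than plain substring search, the sketch below segments the text first and only then compares tokens with the word bank. The text does not name a segmentation algorithm; forward maximum matching is used here purely as an assumption, and `segment`, `match_and_filter`, the sample vocabulary, and the `*` mask are hypothetical.

```python
def segment(text: str, vocabulary: set[str]) -> list[str]:
    """Forward maximum matching: at each position take the longest known word,
    falling back to a single character when nothing in the vocabulary fits."""
    max_len = max((len(word) for word in vocabulary), default=1)
    tokens: list[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


def match_and_filter(
    text: str, vocabulary: set[str], lexicon: set[str], mask: str = "*"
) -> tuple[list[str], str]:
    """Return the tokens that match the sensitive word bank and the masked text."""
    tokens = segment(text, vocabulary | lexicon)
    matched = [token for token in tokens if token in lexicon]
    filtered = "".join(
        mask * len(token) if token in lexicon else token for token in tokens
    )
    return matched, filtered


# Hypothetical data: "机密" (confidential) is the only sensitive word.
vocabulary = {"飞机", "密封"}
lexicon = {"机密"}

# "飞机密封舱" contains the characters "机密" only across a word boundary
# ("飞机" + "密封"), so segmentation avoids a false positive.
print(match_and_filter("飞机密封舱", vocabulary, lexicon))
# "机密文件" contains the sensitive word as a token, so it is masked.
print(match_and_filter("机密文件", vocabulary, lexicon))
```

A production system would more likely rely on an established segmenter and a richer dictionary; the point of the sketch is only that the comparison happens on tokens rather than on raw substrings.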
Compared with the prior art, the invention has the following beneficial effects. It updates the dynamic sensitive word bank automatically, which improves the efficiency of maintaining the word bank and, in turn, of the whole sensitive word filtering workflow; matching sensitive words through word segmentation also improves the accuracy with which sensitive information is filtered. The invention further identifies sensitive words in the enterprise historical information and users' violation records, handles each case automatically, raises real-time automatic alarms where necessary so that risks are flagged early, and responds automatically to minimize the risk of violations. The invention can therefore monitor sensitive information in real time across the whole workflow for user-contributed content during the collaborative construction of an enterprise's digital history.
The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of its claims; equivalent changes made in accordance with the claims of the present invention therefore remain within the scope covered by the invention.
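The automatic update of the dynamic sensitive word bank might look like the following sketch. The trained recognition model is outside its scope, so it is represented by a callable that returns `(word, level)` candidates; `update_lexicon`, `Recognizer`, `stub_recognizer`, and the sample words and level strings are hypothetical placeholders.

```python
from typing import Callable, Iterable

# A recognizer takes the submitted text and yields (word, level) candidates.
Recognizer = Callable[[str], Iterable[tuple[str, str]]]


def update_lexicon(
    text: str, recognizer: Recognizer, lexicon: dict[str, str]
) -> list[str]:
    """Store every newly recognized sensitive word with its level.

    Words already in the word bank are left untouched; the function returns
    the words that were added so the caller can log or review them.
    """
    added: list[str] = []
    for word, level in recognizer(text):
        if word not in lexicon:
            lexicon[word] = level
            added.append(word)
    return added


def stub_recognizer(text: str) -> list[tuple[str, str]]:
    # Placeholder standing in for the trained recognition model: it simply
    # looks for two fixed strings and assigns them fixed levels.
    candidates = {"foo": "normal", "bar": "high_risk"}
    return [(word, level) for word, level in candidates.items() if word in text]


lexicon = {"foo": "normal"}
added = update_lexicon("foo and bar", stub_recognizer, lexicon)
# "foo" was already known, so only "bar" is added, with its level.
```

Returning the list of added words leaves room for a human review step before new entries take effect, which the text does not require but which a deployment might want.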

Claims (10)

1. A sensitive information monitoring method based on enterprise historical digitization, characterized by comprising the following steps:
step S1, acquiring enterprise historical information input by a user;
step S2, identifying whether new sensitive words exist in the enterprise historical information by using a sensitive word recognition model constructed based on a machine learning algorithm, and updating a dynamic sensitive word bank according to the recognition result, wherein the dynamic sensitive word bank is used for storing sensitive words;
step S3, matching the acquired enterprise historical information against the dynamic sensitive word bank for sensitive words by using a word segmentation technique;
step S4, when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, filtering the enterprise historical information; and when the enterprise historical information contains no sensitive word matching the dynamic sensitive word bank, displaying the enterprise historical information normally.
2. The sensitive information monitoring method according to claim 1, wherein the sensitive word recognition model is constructed by the following steps:
collecting enterprise historical information for training;
labeling the sensitive words in the enterprise historical information for training, and labeling the sensitive word level corresponding to each of those sensitive words;
and constructing the sensitive word recognition model with a machine learning algorithm, using the labeled enterprise historical information for training as training data.
3. The sensitive information monitoring method of claim 2, wherein the dynamic sensitive word bank is further used for storing the sensitive word level;
the step S2 further includes: when the sensitive word recognition model recognizes that a new sensitive word exists in the enterprise historical information, grading the new sensitive word by sensitive word level, and storing the new sensitive word and its corresponding sensitive word level in the dynamic sensitive word bank so as to update the dynamic sensitive word bank.
4. The sensitive information monitoring method of claim 3, wherein the sensitive word level comprises a high-risk level;
the step S4 further includes: when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, acquiring the sensitive word level corresponding to the matched sensitive word from the dynamic sensitive word bank; and when that sensitive word level is the high-risk level, issuing alarm information and freezing the enterprise historical information and the account of the user who input the enterprise historical information.
5. The sensitive information monitoring method according to claim 4, wherein the step S4 further comprises: when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, acquiring the sensitive word level corresponding to the matched sensitive word from the dynamic sensitive word bank; when that sensitive word level is not the high-risk level, recording the violation count of the user who input the enterprise historical information, and judging whether the violation count has reached a preset alarm threshold; if the violation count has reached the alarm threshold, issuing alarm information and freezing the enterprise historical information and the account of the user who input the enterprise historical information; and if the violation count has not reached the alarm threshold, filtering the sensitive words out of the enterprise historical information and then displaying the content of the enterprise historical information.
6. The sensitive information monitoring method according to claim 1, wherein the step S4 further comprises: when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, recording the violation count of the user who input the enterprise historical information; judging whether the violation count has reached a preset alarm threshold; if the violation count has reached the alarm threshold, issuing alarm information and freezing the enterprise historical information and the account of the user who input the enterprise historical information; and if the violation count has not reached the alarm threshold, filtering the sensitive words out of the enterprise historical information and then displaying the content of the enterprise historical information.
7. A sensitive information monitoring system based on enterprise historical digitization, comprising:
a dynamic sensitive word bank, used for storing sensitive words;
an acquisition unit, used for acquiring enterprise historical information input by a user;
an identification unit, used for identifying whether new sensitive words exist in the enterprise historical information by using a sensitive word recognition model constructed based on a machine learning algorithm, and for updating the dynamic sensitive word bank according to the recognition result;
a matching unit, used for matching the acquired enterprise historical information against the dynamic sensitive word bank for sensitive words by using a word segmentation technique;
a processing unit, used for filtering the enterprise historical information when it contains a sensitive word matching the dynamic sensitive word bank, and for displaying the enterprise historical information normally when it contains no sensitive word matching the dynamic sensitive word bank.
8. The sensitive information monitoring system of claim 7, further comprising:
a collection unit, used for collecting enterprise historical information for training;
a labeling unit, used for labeling the sensitive words in the enterprise historical information for training and labeling the sensitive word level corresponding to each of those sensitive words;
and a construction unit, used for constructing the sensitive word recognition model with a machine learning algorithm, using the labeled enterprise historical information for training as training data.
9. The sensitive information monitoring system of claim 8, wherein the dynamic sensitive word bank is further used for storing the sensitive word level;
the identification unit is further configured to, when the sensitive word recognition model recognizes that a new sensitive word exists in the enterprise historical information, grade the new sensitive word by sensitive word level, and store the new sensitive word and its corresponding sensitive word level in the dynamic sensitive word bank so as to update the dynamic sensitive word bank.
10. The sensitive information monitoring system of claim 9, wherein the sensitive word level comprises a high-risk level; the processing unit is further configured to: when the enterprise historical information contains a sensitive word matching the dynamic sensitive word bank, obtain the sensitive word level corresponding to the matched sensitive word from the dynamic sensitive word bank; when that sensitive word level is the high-risk level, issue alarm information and freeze the enterprise historical information and the account of the user who input the enterprise historical information; when that sensitive word level is not the high-risk level, record the violation count of the user who input the enterprise historical information and judge whether the violation count has reached a preset alarm threshold; if the violation count has reached the alarm threshold, issue alarm information and freeze the enterprise historical information and the account of the user who input the enterprise historical information; and if the violation count has not reached the alarm threshold, filter the sensitive words out of the enterprise historical information and then display the content of the enterprise historical information.
CN202110940138.XA 2021-08-17 2021-08-17 Sensitive information monitoring method and system based on enterprise historical digitization Pending CN113627174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110940138.XA CN113627174A (en) 2021-08-17 2021-08-17 Sensitive information monitoring method and system based on enterprise historical digitization

Publications (1)

Publication Number Publication Date
CN113627174A true CN113627174A (en) 2021-11-09

Family

ID=78385836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110940138.XA Pending CN113627174A (en) 2021-08-17 2021-08-17 Sensitive information monitoring method and system based on enterprise historical digitization

Country Status (1)

Country Link
CN (1) CN113627174A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110209796A (en) * 2019-04-29 2019-09-06 北京印刷学院 A kind of sensitive word detection filter method, device and electronic equipment
CN110288431A (en) * 2019-06-11 2019-09-27 达疆网络科技(上海)有限公司 A method of comment situation to identify malicious user according to user's difference
CN111814822A (en) * 2020-05-25 2020-10-23 北京印刷学院 Sensitive picture detection method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN111475804B (en) Alarm prediction method and system
CN108537544B (en) Real-time monitoring method and monitoring system for transaction system
WO2021218312A1 (en) Method and apparatus for constructing service fraud identification database, and computer device
KR102259838B1 (en) Apparatus and method for building a blacklist of cryptocurrencies
CN107070897A (en) Network log storage method based on many attribute Hash duplicate removals in intruding detection system
CN113609118A (en) Data optimization method applied to big data and big data server
CN115719283A (en) Intelligent accounting management system
CN116739317B (en) Mining winch automatic management and dispatching platform, method, equipment and medium
CN111062827B (en) Engineering supervision method based on artificial intelligence mode
CN113627174A (en) Sensitive information monitoring method and system based on enterprise historical digitization
CN112686446A (en) Machine learning interpretability-oriented credit default prediction method and system
CN116737681A (en) Real-time abnormal log detection method and device, computer equipment and storage medium
CN115309871A (en) Industrial big data processing method and system based on artificial intelligence algorithm
CN112185083A (en) Repeated alarm judging method
CN113065710A (en) Financial prediction system based on artificial intelligence and block chain
CN112417007A (en) Data analysis method and device, electronic equipment and storage medium
CN110674269A (en) Cable information management and control method and system
CN104008614A (en) Self-service machine device system and method based on odor detection
CN111752727B (en) Log analysis-based three-layer association recognition method for database
CN117806890B (en) Slow disk detection processing method based on distributed storage
CN112800321B (en) Ambiguous post identification method based on keyword retrieval and computer equipment
CN113256180B (en) Customer service work order information intelligent dynamic loading method and system based on machine learning
CN116523521A (en) Financial anti-phishing system
CN116823440A (en) Method for obtaining scientific enterprise risk scoring model, risk scoring method and device
CN114169441A (en) Data identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination