CN114281978A - Sensitive word checking method, device, terminal and storage medium - Google Patents

Publication number
CN114281978A
Authority
CN
China
Prior art keywords
character
matching
matched
text
character string
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111554831.XA
Other languages
Chinese (zh)
Inventor
方曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ubtech Technology Co ltd
Original Assignee
Shenzhen Ubtech Technology Co ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention discloses a method, a device, a terminal and a storage medium for sensitive word checking. The method comprises: generating a character tree based on preset sensitive words, wherein the character tree includes a plurality of character strings, different character strings correspond to different sensitive words, and identical characters in different character strings have a pointing relationship; and performing character matching on the text to be detected through each character string in the character tree and the pointing relationship to obtain a matching result. Because matching is driven by the character strings and the pointing relationships in the character tree, the cost of the matching process depends on the length and similarity of the sensitive words, so the complexity can be reduced from the exponential level to the linear level and the processing efficiency is effectively improved.

Description

Sensitive word checking method, device, terminal and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, an apparatus, a terminal, and a storage medium for sensitive word checking.
Background
At present, there are many needs to review and check content (such as articles, scripts, novels or announcements). Current checking is performed by traversing the content once for each sensitive word, which has several problems. First, the volume of service data is large, and a single article is often large as well. Second, the sensitive word library contains many sensitive words, so many words need to be matched. As a result, the check is very costly to complete, and the cost increases exponentially as the amount of content or the number of sensitive words increases, so the check takes a long time, is very inefficient, and cannot meet current needs.
Thus, there is a need for a better solution to the problems of the prior art.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a terminal and a storage medium for sensitive word checking, so as to solve the problems in the prior art.
Specifically, the present invention proposes the following specific examples:
the embodiment of the invention provides a method for sensitive word examination, which comprises the following steps:
generating a character tree based on preset sensitive words; the character tree includes a plurality of character strings; different said character strings correspond to different said sensitive words; the same characters among different character strings have a pointing relation;
and performing character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result.
In a specific embodiment, the performing character matching on the text to be detected through each character string in the character tree and the directional relationship to obtain a matching result includes:
step A, matching the first unmatched character of the text to be detected against the first character of each character string in the character tree;
step B, setting the character string whose first character is successfully matched as the target character string;
step C, judging whether unmatched characters remain in both the target character string and the text to be detected;
step D, if unmatched characters remain in both, setting the character following the most recently matched character in the target character string as a first character, and setting the character following the most recently matched character in the text to be detected as a second character;
step E, judging whether the first character matches the second character;
step F, if the first character matches the second character, executing step C;
step G, if the most recently matched character in the target character string has no next character and has no pointing relationship, ending the current matching detection;
step H, if the first character does not match the second character, judging whether the most recently matched character in the target character string has a pointing relationship;
step I, if the pointing relationship exists, determining a new character string based on the most recently matched character in the target character string and the pointing relationship, updating the target character string to the new character string, and executing step C;
step J, if the pointing relationship does not exist, ending the current matching detection;
step K, when the current matching detection ends, taking the characters successfully matched in the current matching detection as a sub-matching result, and judging whether unmatched characters remain in the text to be detected;
step L, if unmatched characters remain in the text to be detected, executing step A;
and step M, if no unmatched characters remain in the text to be detected, executing a preset ending process and summarizing all the sub-matching results as the matching result.
In a specific embodiment, the performing character matching on the text to be detected through each character string in the character tree and the directional relationship to obtain a matching result includes:
segmenting a text to be detected to obtain a plurality of sub-texts;
performing character matching on each sub-text by adopting each character string and the pointing relation in the character tree in a multithreading parallel mode to obtain a plurality of sub-text matching results;
and synthesizing all the sub-text matching results to obtain a matching result.
In a specific embodiment, the method further comprises the following steps: and outputting the matching result.
In a specific embodiment, the method further comprises the following steps: associating the matching result with the text to be detected to generate an association relationship;
and storing the association relationship and the matching result in a preset database.
The embodiment of the invention also provides a device for sensitive word examination, which comprises:
the building module is used for generating a character tree based on preset sensitive words; the character tree includes a plurality of character strings; different said character strings correspond to different said sensitive words; the same characters among different character strings have a pointing relation;
and the matching module is used for performing character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result.
In a specific embodiment, the matching module is configured to perform the following steps:
step A, matching the first unmatched character of the text to be detected against the first character of each character string in the character tree;
step B, setting the character string whose first character is successfully matched as the target character string;
step C, judging whether unmatched characters remain in both the target character string and the text to be detected;
step D, if unmatched characters remain in both, setting the character following the most recently matched character in the target character string as a first character, and setting the character following the most recently matched character in the text to be detected as a second character;
step E, judging whether the first character matches the second character;
step F, if the first character matches the second character, executing step C;
step G, if the most recently matched character in the target character string has no next character and has no pointing relationship, ending the current matching detection;
step H, if the first character does not match the second character, judging whether the most recently matched character in the target character string has a pointing relationship;
step I, if the pointing relationship exists, determining a new character string based on the most recently matched character in the target character string and the pointing relationship, updating the target character string to the new character string, and executing step C;
step J, if the pointing relationship does not exist, ending the current matching detection;
step K, when the current matching detection ends, taking the characters successfully matched in the current matching detection as a sub-matching result, and judging whether unmatched characters remain in the text to be detected;
step L, if unmatched characters remain in the text to be detected, executing step A;
and step M, if no unmatched characters remain in the text to be detected, executing a preset ending process and summarizing all the sub-matching results as the matching result.
In a specific embodiment, the matching module is configured to:
segmenting a text to be detected to obtain a plurality of sub-texts;
performing character matching on each sub-text by adopting each character string and the pointing relation in the character tree in a multithreading parallel mode to obtain a plurality of sub-text matching results;
and synthesizing all the sub-text matching results to obtain a matching result.
The embodiment of the present invention further provides a terminal, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the method when executing the computer program.
An embodiment of the present invention further provides a storage medium, in which a computer program is stored, and the computer program implements the method when executed.
Therefore, the embodiment of the invention provides a method, a device, a terminal and a storage medium for sensitive word checking. The method comprises: generating a character tree based on preset sensitive words, wherein the character tree includes a plurality of character strings, different character strings correspond to different sensitive words, and identical characters in different character strings have a pointing relationship; and performing character matching on the text to be detected through each character string in the character tree and the pointing relationship to obtain a matching result. Because matching is driven by the character strings and the pointing relationships in the character tree, the cost of the matching process depends on the length and similarity of the sensitive words, so the complexity can be reduced from the exponential level to the linear level and the processing efficiency is effectively improved.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings required to be used in the embodiments will be briefly described below, and it should be understood that the following drawings only illustrate some embodiments of the present invention, and therefore should not be considered as limiting the scope of the present invention. Like components are numbered similarly in the various figures.
Fig. 1 is a flow chart of a method for sensitive word checking according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a character tree generated by a method for sensitive word checking according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for sensitive word checking according to an embodiment of the present invention;
Fig. 4 is another schematic structural diagram of an apparatus for sensitive word checking according to an embodiment of the present invention.
Description of reference numerals:
201: building module; 202: matching module; 203: output module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Hereinafter, the terms "including", "having", and their derivatives, which may be used in various embodiments of the present invention, are only intended to indicate specific features, numbers, steps, operations, elements, components, or combinations of the foregoing, and should not be construed as first excluding the existence of, or adding to, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.
Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of the present invention belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in various embodiments of the present invention.
Example 1
The embodiment 1 of the invention discloses a method for checking sensitive words, which comprises the following steps as shown in figure 1:
S101, generating a character tree based on preset sensitive words; the character tree includes a plurality of character strings; different said character strings correspond to different said sensitive words; the same characters among different character strings have a pointing relation;
Specifically, suppose for example that the sensitive words are abcde, cdey and ex. The constructed character tree is shown in fig. 2. The character tree takes root as its root node, and each character string consists of one or more character nodes, each of which corresponds to one character. Of the three character strings in fig. 2, the leftmost corresponds to the sensitive word abcde, the middle one to the sensitive word cdey, and the rightmost to the sensitive word ex. As for the pointing relationship, c in the leftmost character string points to c in the middle character string; in fig. 2 the pointing relationships are labeled fail. In addition, any node whose fail pointer has no other node to point to points to the root node, which indicates that the current matching process is complete.
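The construction described above resembles the trie with failure links used by the standard Aho-Corasick algorithm. The following is a minimal Python sketch under that assumption; the source does not name the algorithm, and the names `Node`, `path` and `build_character_tree` are hypothetical, introduced only for illustration.

```python
from collections import deque


class Node:
    """One character node of the character tree (hypothetical layout)."""

    def __init__(self, char="", parent=None):
        self.char = char
        self.parent = parent
        self.children = {}   # next character -> child Node
        self.fail = None     # pointing relationship, labeled "fail" in fig. 2
        self.word = None     # set when a sensitive word ends at this node


def path(node):
    """Return the characters on the path from the root down to `node`."""
    chars = []
    while node.parent is not None:
        chars.append(node.char)
        node = node.parent
    return "".join(reversed(chars))


def build_character_tree(words):
    root = Node()
    root.fail = root
    # 1. Insert each sensitive word as a chain of character nodes.
    for w in words:
        node = root
        for ch in w:
            if ch not in node.children:
                node.children[ch] = Node(ch, node)
            node = node.children[ch]
        node.word = w
    # 2. Breadth-first pass: each node's fail pointer targets the node that
    #    spells the longest proper suffix of its path, or the root if none.
    queue = deque()
    for child in root.children.values():
        child.fail = root
        queue.append(child)
    while queue:
        node = queue.popleft()
        for ch, child in node.children.items():
            f = node.fail
            while f is not root and ch not in f.children:
                f = f.fail
            child.fail = f.children.get(ch, root)
            queue.append(child)
    return root
```

With the words abcde, cdey and ex, the c node of the leftmost string points to the c node of the middle string, and the e node of the middle string points to the e node of the rightmost string, which agrees with the pointing relationships described for fig. 2.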
And S102, performing character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result.
In a specific embodiment, the performing character matching on the text to be checked through each character string in the character tree and the directional relationship in step S102 to obtain a matching result includes:
step A, matching the first unmatched character of the text to be detected against the first character of each character string in the character tree;
step B, setting the character string whose first character is successfully matched as the target character string;
step C, judging whether unmatched characters remain in both the target character string and the text to be detected;
step D, if unmatched characters remain in both, setting the character following the most recently matched character in the target character string as a first character, and setting the character following the most recently matched character in the text to be detected as a second character;
step E, judging whether the first character matches the second character;
step F, if the first character matches the second character, executing step C;
step G, if the most recently matched character in the target character string has no next character and has no pointing relationship, ending the current matching detection;
step H, if the first character does not match the second character, judging whether the most recently matched character in the target character string has a pointing relationship;
step I, if the pointing relationship exists, determining a new character string based on the most recently matched character in the target character string and the pointing relationship, updating the target character string to the new character string, and executing step C;
step J, if the pointing relationship does not exist, ending the current matching detection;
step K, when the current matching detection ends, taking the characters successfully matched in the current matching detection as a sub-matching result, and judging whether unmatched characters remain in the text to be detected;
step L, if unmatched characters remain in the text to be detected, executing step A;
and step M, if no unmatched characters remain in the text to be detected, executing a preset ending process and summarizing all the sub-matching results as the matching result.
Specifically, taking fig. 2 as an example, suppose a segment of the article to be detected is abcdexf.
Step 1, starting from the first character a of the article, match it against the child nodes of the root node. If the character a is hit, continue downward from character node a to match the next character of the article, and repeat this operation. In fig. 2, a hits the leftmost character string, so the leftmost character string is the current target character string.
Step 2, continuing to match the following characters as in step 1, the characters a, b, c, d and e of the leftmost character string in fig. 2 are matched in sequence until the e node is reached. At this point the e node is found to have no child node, and the preliminary sensitive word result abcde is collected. At the same time, the current node is changed to the node pointed to by the fail pointer of the e node, and matching continues with the next character of the article.
Step 3, at the e node of the middle character string pointed to by the fail pointer, the next child node y of the middle character string is found not to match the next character x of the article, so matching moves on to the e node of the rightmost character string, which is pointed to by the fail pointer of that e node.
Step 4, at the e node of the rightmost character string, the child node x is found to match the character x of the article, so abcdex is collected into the answer set as the final result for the sensitive words. At the x node of the rightmost character string, x is found to have no child node and its fail pointer points to the root node, so this matching pass is complete. The sub-matching result obtained at this point is abcdex.
Step 5, because the text to be detected is abcdexf and abcdex has finished matching, steps 1 to 4 are executed again starting from the character after x, namely f, until the whole article has been checked.
Therefore, each execution of steps 1 to 4 yields one sub-matching result, and finally all sub-matching results are summarized to obtain the matching result of the text to be detected.
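The walk-through above can be reproduced with a small Python sketch, assuming the character tree is realized as a standard Aho-Corasick automaton (an assumption; the source does not name the algorithm). The function names `build` and `find_sensitive_words` and the node layout are hypothetical. Unlike the description above, which collects the merged span abcdex as one sub-matching result, this sketch reports each sensitive word separately with its start offset.

```python
from collections import deque


def build(words):
    """Build the automaton as a list of nodes; index 0 is the root."""
    nodes = [{"next": {}, "fail": 0, "out": []}]
    for w in words:
        cur = 0
        for ch in w:
            if ch not in nodes[cur]["next"]:
                nodes.append({"next": {}, "fail": 0, "out": []})
                nodes[cur]["next"][ch] = len(nodes) - 1
            cur = nodes[cur]["next"][ch]
        nodes[cur]["out"].append(w)
    # Breadth-first pass that sets the fail pointers level by level.
    queue = deque(nodes[0]["next"].values())
    while queue:
        cur = queue.popleft()
        for ch, nxt in nodes[cur]["next"].items():
            f = nodes[cur]["fail"]
            while f and ch not in nodes[f]["next"]:
                f = nodes[f]["fail"]
            nodes[nxt]["fail"] = nodes[f]["next"].get(ch, 0)
            # Inherit words that end at the fail target.
            nodes[nxt]["out"] += nodes[nodes[nxt]["fail"]]["out"]
            queue.append(nxt)
    return nodes


def find_sensitive_words(text, nodes):
    """Scan `text` once and return (start_offset, word) for every hit."""
    hits = []
    cur = 0
    for i, ch in enumerate(text):
        # On a mismatch, follow fail pointers instead of restarting the scan.
        while cur and ch not in nodes[cur]["next"]:
            cur = nodes[cur]["fail"]
        cur = nodes[cur]["next"].get(ch, 0)
        for w in nodes[cur]["out"]:
            hits.append((i - len(w) + 1, w))
    return hits


automaton = build(["abcde", "cdey", "ex"])
print(find_sensitive_words("abcdexf", automaton))
# prints [(0, 'abcde'), (4, 'ex')]
```

When x is read after abcde, the scan follows the fail pointers from the e node of the leftmost string to the e node of the middle string and then to the e node of the rightmost string, which is the same path as steps 2 to 4.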
In a specific embodiment, the performing character matching on the text to be checked through each character string in the character tree and the directional relationship in step S102 to obtain a matching result includes: segmenting a text to be detected to obtain a plurality of sub-texts; performing character matching on each sub-text by adopting each character string and the pointing relation in the character tree in a multithreading parallel mode to obtain a plurality of sub-text matching results; and synthesizing all the sub-text matching results to obtain a matching result.
Specifically, the scheme matches the text to be checked against the whole character tree at once, so the text only needs to be traversed once. The matching process depends on the length and similarity of the sensitive words, so the complexity is reduced from the exponential level to the linear level and the processing efficiency is effectively improved. The scheme can run on a single node with a single CPU, which meets the requirements of most service scenarios when there is no strict response-time requirement.
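As a point of reference, and assuming the character tree with fail pointers behaves like a standard Aho-Corasick automaton with constant-time child lookup (an assumption; the source does not name the algorithm), the usual cost bounds can be written as follows. The symbols are introduced here for illustration only: n is the length of the text to be detected, m is the total length of all sensitive words, and z is the number of reported matches.

```latex
T_{\text{build}} = O(m), \qquad T_{\text{match}} = O(n + z)
```

Under that assumption the matching cost grows linearly with the text length, independent of how many sensitive words the tree contains.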
To further improve the efficiency of character matching, the task can be executed in parallel: the article is split into several parts, which are distributed to different nodes and CPU tasks for execution. That is, the text to be detected is segmented first, and character matching is then performed on each segment based on the character tree, which further improves processing efficiency.
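A minimal Python sketch of the segment-and-match idea follows. Several things here are assumptions not found in the source: the function names `scan` and `parallel_scan`, the `chunk_size` and `workers` parameters, and in particular the overlap between neighbouring segments, which is added so that a sensitive word straddling a segment boundary is not missed. The `scan` function is a simple stand-in for the character tree matcher, and the word list is assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor


def scan(text, words):
    """Stand-in matcher: return (offset, word) for every occurrence."""
    hits = []
    for w in words:
        start = text.find(w)
        while start != -1:
            hits.append((start, w))
            start = text.find(w, start + 1)
    return hits


def parallel_scan(text, words, chunk_size=4, workers=4):
    # Each segment is extended by (longest word - 1) characters so that a
    # word starting near the end of a segment is still seen in full.
    overlap = max(len(w) for w in words) - 1
    jobs = [
        (begin, text[begin:begin + chunk_size + overlap])
        for begin in range(0, len(text), chunk_size)
    ]

    def run(job):
        begin, segment = job
        # Convert segment-relative offsets back to offsets in the full text.
        return [(begin + off, w) for off, w in scan(segment, words)]

    merged = set()  # a set removes hits reported twice in overlap regions
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(run, jobs):
            merged.update(part)
    return sorted(merged)


print(parallel_scan("abcdexf", ["abcde", "cdey", "ex"]))
# prints [(0, 'abcde'), (4, 'ex')]
```

In CPython, CPU-bound threads are limited by the global interpreter lock, so a process pool or separate machines may be a better fit for real workloads; the thread pool here only mirrors the multithreading wording of the text.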
In a specific embodiment, after obtaining the matching result, the method further includes: and outputting the matching result.
In a specific embodiment, the method further comprises: associating the matching result with the text to be detected to generate an association relationship; and storing the association relationship and the matching result in a preset database.
Specifically, to make it easy to query the matching result of the text to be detected later, the matching result is associated with the text to be detected to generate an association relationship, and the association relationship and the matching result are then stored in a preset database. The matching result corresponding to the text to be detected can subsequently be queried directly from the database, which avoids repeating the character matching.
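One way to realize such an association is to key the stored result by a hash of the text. The sketch below uses Python's built-in `sqlite3` module; the table name `match_results`, the function names, and the choice of a SHA-256 hash as the association key are all assumptions made for illustration, since the source does not specify the database or its schema.

```python
import hashlib
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the hypothetical result store."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS match_results ("
        "text_hash TEXT PRIMARY KEY, result_json TEXT NOT NULL)"
    )
    return db


def text_key(text):
    # The hash acts as the association between a text and its result.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def save_result(db, text, result):
    db.execute(
        "INSERT OR REPLACE INTO match_results VALUES (?, ?)",
        (text_key(text), json.dumps(result)),
    )
    db.commit()


def load_result(db, text):
    """Return the stored result, or None if the text was never checked."""
    row = db.execute(
        "SELECT result_json FROM match_results WHERE text_hash = ?",
        (text_key(text),),
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

A caller would first try `load_result` and run the character matching only when it returns `None`, then call `save_result` with the fresh matching result.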
Example 2
Embodiment 2 of the invention also discloses a device for sensitive word checking, which, as shown in fig. 3, comprises:
the building module 201 is configured to generate a character tree based on a preset sensitive word; the character tree includes a plurality of character strings; different said character strings correspond to different said sensitive words; the same characters among different character strings have a pointing relation;
and the matching module 202 is configured to perform character matching on the text to be detected through each character string in the character tree and the pointing relationship to obtain a matching result.
In a specific embodiment, the matching module 202 is configured to perform the following steps:
step A, matching the first unmatched character of the text to be detected against the first character of each character string in the character tree;
step B, setting the character string whose first character is successfully matched as the target character string;
step C, judging whether unmatched characters remain in both the target character string and the text to be detected;
step D, if unmatched characters remain in both, setting the character following the most recently matched character in the target character string as a first character, and setting the character following the most recently matched character in the text to be detected as a second character;
step E, judging whether the first character matches the second character;
step F, if the first character matches the second character, executing step C;
step G, if the most recently matched character in the target character string has no next character and has no pointing relationship, ending the current matching detection;
step H, if the first character does not match the second character, judging whether the most recently matched character in the target character string has a pointing relationship;
step I, if the pointing relationship exists, determining a new character string based on the most recently matched character in the target character string and the pointing relationship, updating the target character string to the new character string, and executing step C;
step J, if the pointing relationship does not exist, ending the current matching detection;
step K, when the current matching detection ends, taking the characters successfully matched in the current matching detection as a sub-matching result, and judging whether unmatched characters remain in the text to be detected;
step L, if unmatched characters remain in the text to be detected, executing step A;
and step M, if no unmatched characters remain in the text to be detected, executing a preset ending process and summarizing all the sub-matching results as the matching result.
In a specific embodiment, the matching module 202 is configured to:
segmenting a text to be detected to obtain a plurality of sub-texts;
performing character matching on each sub-text by adopting each character string and the pointing relation in the character tree in a multithreading parallel mode to obtain a plurality of sub-text matching results;
and synthesizing all the sub-text matching results to obtain a matching result.
In a specific embodiment, as shown in fig. 4, the apparatus further comprises: and the output module 203 is used for outputting the matching result.
In a specific embodiment, the apparatus further comprises: a storage module, configured to associate the matching result with the text to be detected to generate an association relationship,
and to store the association relationship and the matching result in a preset database.
Example 3
The embodiment 3 of the present invention further discloses a terminal, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the method described in the embodiment 1 when executing the computer program.
Example 4
Embodiment 4 of the present invention also discloses a storage medium, in which a computer program is stored, and when the computer program is executed, the method described in embodiment 1 is implemented.
Therefore, the invention provides a method, a device, a terminal and a storage medium for sensitive word checking. The method comprises: generating a character tree based on preset sensitive words, wherein the character tree includes a plurality of character strings, different character strings correspond to different sensitive words, and identical characters in different character strings have a pointing relationship; and performing character matching on the text to be detected through each character string in the character tree and the pointing relationship to obtain a matching result. Because matching is driven by the character strings and the pointing relationships in the character tree, the cost of the matching process depends on the length and similarity of the sensitive words, so the complexity can be reduced from the exponential level to the linear level and the processing efficiency is effectively improved.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional module or unit in each embodiment of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, or the part of it that in essence contributes over the prior art, can be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a smart phone, a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto. Any changes or substitutions that a person skilled in the art can easily conceive of within the technical scope disclosed by the present invention shall be covered by the scope of the present invention.

Claims (10)

1. A sensitive word checking method, comprising:
generating a character tree based on preset sensitive words; the character tree includes a plurality of character strings; different character strings correspond to different sensitive words; the same characters in different character strings have a pointing relation;
and performing character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result.
2. The method of claim 1, wherein performing character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result comprises:
step A, matching the first unmatched character of the text to be detected by traversing the initial character of each character string;
step B, setting the character string whose initial character is successfully matched as a target character string;
step C, judging whether unmatched characters exist in the target character string and in the text to be detected;
step D, if both judgment results are that unmatched characters exist, setting the next character after the latest matched character in the target character string as a first character, and setting the next character after the latest matched character in the text to be detected as a second character;
step E, judging whether the first character matches the second character;
step F, if the first character matches the second character, executing step C;
step G, if the judgment result is that no next character exists after the latest matched character in the target character string and the latest matched character in the target character string has no pointing relation, finishing the current matching detection;
step H, if the first character does not match the second character, judging whether the latest matched character in the target character string has a pointing relation;
step I, if the judgment result is that the pointing relation exists, determining a new character string based on the latest matched character in the target character string and the pointing relation, updating the target character string to the new character string, and executing step C;
step J, if the judgment result is that the pointing relation does not exist, finishing the current matching detection;
step K, if the current matching detection is finished, taking the characters successfully matched in the current matching detection as a sub-matching result, and judging whether unmatched characters exist in the text to be detected;
step L, if unmatched characters exist in the text to be detected, executing step A;
and step M, if no unmatched characters exist in the text to be detected, executing a preset ending process, and collecting all the sub-matching results as the matching result.
3. The method of claim 1, wherein performing character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result comprises:
segmenting the text to be detected to obtain a plurality of sub-texts;
performing character matching on each sub-text in a multithreaded parallel manner, using each character string in the character tree and the pointing relation, to obtain a plurality of sub-text matching results;
and combining all the sub-text matching results to obtain the matching result.
4. The method of claim 1, further comprising: outputting the matching result.
5. The method of claim 1, further comprising: associating the matching result with the text to be detected to generate an association relation;
and storing the association relation and the matching result in a preset database.
6. A sensitive word checking device, comprising:
a building module, configured to generate a character tree based on preset sensitive words; the character tree includes a plurality of character strings; different character strings correspond to different sensitive words; the same characters in different character strings have a pointing relation;
and a matching module, configured to perform character matching on the text to be detected through each character string in the character tree and the pointing relation to obtain a matching result.
7. The apparatus of claim 6, wherein the matching module is configured to perform the following steps:
step A, matching the first unmatched character of the text to be detected by traversing the initial character of each character string;
step B, setting the character string whose initial character is successfully matched as a target character string;
step C, judging whether unmatched characters exist in the target character string and in the text to be detected;
step D, if both judgment results are that unmatched characters exist, setting the next character after the latest matched character in the target character string as a first character, and setting the next character after the latest matched character in the text to be detected as a second character;
step E, judging whether the first character matches the second character;
step F, if the first character matches the second character, executing step C;
step G, if the judgment result is that no next character exists after the latest matched character in the target character string and the latest matched character in the target character string has no pointing relation, finishing the current matching detection;
step H, if the first character does not match the second character, judging whether the latest matched character in the target character string has a pointing relation;
step I, if the judgment result is that the pointing relation exists, determining a new character string based on the latest matched character in the target character string and the pointing relation, updating the target character string to the new character string, and executing step C;
step J, if the judgment result is that the pointing relation does not exist, finishing the current matching detection;
step K, if the current matching detection is finished, taking the characters successfully matched in the current matching detection as a sub-matching result, and judging whether unmatched characters exist in the text to be detected;
step L, if unmatched characters exist in the text to be detected, executing step A;
and step M, if no unmatched characters exist in the text to be detected, executing a preset ending process, and collecting all the sub-matching results as the matching result.
8. The apparatus of claim 6, wherein the matching module is configured to perform the following steps:
segmenting the text to be detected to obtain a plurality of sub-texts;
performing character matching on each sub-text in a multithreaded parallel manner, using each character string in the character tree and the pointing relation, to obtain a plurality of sub-text matching results;
and combining all the sub-text matching results to obtain the matching result.
9. A terminal, characterized in that it comprises a memory in which a computer program is stored and a processor which, when executing the computer program, implements the method according to any one of claims 1-5.
10. A storage medium, characterized in that a computer program is stored in the storage medium, which computer program, when executed, implements the method of any one of claims 1-5.
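The segmented, multithreaded matching recited in claims 3 and 8 can be sketched as follows. This is a hedged illustration rather than the applicant's implementation: the per-segment matcher is a plain substring search standing in for the character tree, the overlap between neighbouring segments is an added assumption so that a word straddling a segment boundary is not missed, and the names `match_segment`, `parallel_match`, `segment_len` and `workers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def match_segment(task):
    """Find every sensitive word in one sub-text (stand-in matcher)."""
    offset, segment, words = task
    hits = set()
    for word in words:
        start = segment.find(word)
        while start != -1:
            # Report positions relative to the full text.
            hits.add((offset + start, word))
            start = segment.find(word, start + 1)
    return hits


def parallel_match(text, words, segment_len=1000, workers=4):
    """Split the text, match sub-texts in threads, merge the results."""
    words = [w for w in words if w]
    if not words:
        return []
    # Assumption: extend each segment by (longest word - 1) characters
    # so that words crossing a boundary still fall inside one segment.
    overlap = max(len(w) for w in words) - 1
    tasks = [
        (start, text[start:start + segment_len + overlap], words)
        for start in range(0, len(text), segment_len)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(match_segment, tasks))
    merged = set()
    for sub_result in results:
        merged |= sub_result  # the set removes hits found twice in an overlap
    return sorted(merged)


print(parallel_match("aXbadYbad", ["bad"], segment_len=4))
```

In standard CPython, threads mainly show the structure of the approach; for CPU-bound matching a process pool or a runtime without a global interpreter lock would be needed to obtain real parallel speedup.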
CN202111554831.XA 2021-12-17 2021-12-17 Sensitive word checking method, device, terminal and storage medium Pending CN114281978A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111554831.XA CN114281978A (en) 2021-12-17 2021-12-17 Sensitive word checking method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111554831.XA CN114281978A (en) 2021-12-17 2021-12-17 Sensitive word checking method, device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN114281978A true CN114281978A (en) 2022-04-05

Family

ID=80872942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111554831.XA Pending CN114281978A (en) 2021-12-17 2021-12-17 Sensitive word checking method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN114281978A (en)

Similar Documents

Publication Publication Date Title
CN109299258B (en) Public opinion event detection method, device and equipment
CN110009430B (en) Cheating user detection method, electronic device and computer readable storage medium
EP3748507B1 (en) Automated software testing
CN111340054A (en) Data labeling method and device and data processing equipment
CN112162977B (en) MES-oriented mass data redundancy removing method and system
CN111143513B (en) Sensitive word recognition method and device and electronic equipment
CN112347767B (en) Text processing method, device and equipment
CN112115313A (en) Regular expression generation method, regular expression data extraction method, regular expression generation device, regular expression data extraction device, regular expression equipment and regular expression data extraction medium
CN111831685A (en) Query statement processing method, model training method, device and equipment
CN112567377A (en) Expression recognition using character skipping
CN104580109A (en) Method and device for generating click verification code
CN111881288B (en) Method and device for judging true and false of stroke information, storage medium and electronic equipment
CN109783139B (en) Software interface feature extraction method and device and electronic equipment
US9842112B1 (en) System and method for identifying fields in a file using examples in the file received from a user
CN114281978A (en) Sensitive word checking method, device, terminal and storage medium
WO2023024474A1 (en) Data set determination method and apparatus, and computer device and storage medium
CN115796146A (en) File comparison method and device
CN111090737A (en) Word stock updating method and device, electronic equipment and readable storage medium
CN104778202A (en) Analysis method and system based on event evolution process of key words
CN114254591A (en) Construction method and device of simplified and traditional conversion tool
US20170060998A1 (en) Method and apparatus for mining maximal repeated sequence
CN108304467A (en) For matched method between text
CN110309235B (en) Data processing method, device, equipment and medium
CN112559474B (en) Log processing method and device
CN113609279A (en) Material model extraction method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination