CN111753548A - Information acquisition method and device, computer storage medium and electronic equipment - Google Patents

Publication number
CN111753548A
Authority
CN
China
Prior art keywords
character
expression
execution expression
keyword
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010144301.7A
Other languages
Chinese (zh)
Inventor
王海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202010144301.7A priority Critical patent/CN111753548A/en
Publication of CN111753548A publication Critical patent/CN111753548A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing

Abstract

The present disclosure relates to the technical field of computers, and provides an information acquisition method, an information acquisition device, a storage medium, and an electronic device. The information acquisition method includes: acquiring a regular expression, and sequentially acquiring current characters from the regular expression; creating an initial keyword and an initial execution expression; when the current character is a non-special character, determining a target keyword according to the current character and the initial keyword; when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain an updated execution expression, and taking the updated execution expression as a new initial execution expression; and repeating the above steps until the updated execution expression corresponding to the last character in the regular expression is obtained, taking that updated execution expression as a target execution expression, and obtaining target information based on the target execution expression. By combining the regular expression with a simple matching algorithm, the method and the device improve the efficiency and accuracy of information acquisition.

Description

Information acquisition method and device, computer storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an information obtaining method, an information obtaining apparatus, a computer-readable storage medium, and an electronic device.
Background
With the development of computer technology, people often use computers to obtain useful information. For example, in the fields of risk control and hot-word statistics, it is often necessary to determine whether a user has input certain specific words, and to gauge the user's attention to a certain event by aggregating the determination results.
In the prior art, the sentence to be detected is scanned word by word in a loop to determine whether a target word exists. However, the sentence is often too long or the target word too complex, which increases the system resource overhead of acquiring the information.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the present disclosure is to provide an information acquisition method, an information acquisition apparatus, a computer-readable storage medium, and an electronic device, so as to improve the efficiency and accuracy of information acquisition at least to a certain extent.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided an information acquisition method including: acquiring a regular expression, and sequentially acquiring current characters from the regular expression; creating an initial keyword and an initial execution expression; when the current character is a non-special character, determining a target keyword according to the current character and the initial keyword; when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain an updated execution expression, and taking the updated execution expression as a new initial execution expression; and repeating the above steps until the updated execution expression corresponding to the last character in the regular expression is obtained, taking that updated execution expression as a target execution expression, and obtaining target information based on the target execution expression.
In some exemplary embodiments of the present disclosure, the method further comprises: generating a tag when the current character is a non-special character.
In some exemplary embodiments of the present disclosure, determining a target keyword from the current character and the initial keyword includes: combining the current character and the initial keyword to form an updated keyword; and taking the updated keyword as the new initial keyword, repeating the previous step until the obtained character is a special character, and taking the finally obtained updated keyword as the target keyword.
In some exemplary embodiments of the present disclosure, when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain the updated execution expression includes: when the current character is a special character, judging whether a tag exists; if the tag exists, matching the target keyword with the text to be detected, and updating the initial execution expression according to the matching result and the current character to obtain the updated execution expression; and if the tag does not exist, adding the current character to the initial execution expression to obtain the updated execution expression.
In some exemplary embodiments of the present disclosure, updating the initial execution expression according to the matching result and the current character to obtain the updated execution expression includes: and determining identification information according to the matching result, and updating the initial execution expression according to the identification information and the current character to obtain the updated execution expression.
In some exemplary embodiments of the present disclosure, the identification information includes a first identification and a second identification; updating the initial execution expression according to the identification information and the current character to obtain the updated execution expression, including: if the target keyword exists in the text to be detected, adding the current character and the first identifier to the initial execution expression to obtain the updated execution expression; and if the target keyword does not exist in the text to be detected, adding the current character and the second identifier into the initial execution expression to obtain the updated execution expression.
In some exemplary embodiments of the present disclosure, after the updated execution expression is taken as a new initial execution expression, the method further comprises: clearing the target keyword and deleting the tag.
In some exemplary embodiments of the present disclosure, obtaining an updated execution expression corresponding to a last character in the regular expression includes: when the last character is the non-special character, taking an update keyword corresponding to the last character as the target keyword; and matching the target keywords with the text to be detected, and acquiring an updated execution expression corresponding to the last character according to the matching result.
In some exemplary embodiments of the present disclosure, obtaining the updated execution expression corresponding to the last character according to the matching result includes: if the target keyword corresponding to the last character exists in the text to be detected, adding a third identifier to the initial execution expression to obtain an updated execution expression corresponding to the last character; and if the target keyword corresponding to the last character does not exist in the text to be detected, adding a fourth identifier to the initial execution expression to obtain an updated execution expression corresponding to the last character.
In some exemplary embodiments of the present disclosure, obtaining target information based on the target execution expression includes: and performing statistics and/or evaluation according to the target execution expression to determine statistical information and/or evaluation information.
According to an aspect of the present disclosure, there is provided an information acquisition apparatus including: an expression obtaining module, configured to obtain a regular expression and sequentially obtain current characters from the regular expression; an initial word creating module, configured to create an initial keyword and an initial execution expression; a keyword updating module, configured to determine a target keyword according to the current character and the initial keyword when the current character is a non-special character; an expression determining module, configured to update the initial execution expression according to the current character and the target keyword when the current character is a special character, to obtain an updated execution expression, and to take the updated execution expression as a new initial execution expression; and a target information obtaining module, configured to repeat the above steps until the updated execution expression corresponding to the last character in the regular expression is obtained, take that updated execution expression as a target execution expression, and obtain target information based on the target execution expression.
According to an aspect of the present disclosure, there is provided a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing the information acquisition method as described in the above embodiments.
According to an aspect of the present disclosure, there is provided an electronic device including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the information acquisition method as described in the above embodiments.
As can be seen from the foregoing technical solutions, the information obtaining method and apparatus, the computer-readable storage medium, and the electronic device in the exemplary embodiments of the present disclosure have at least the following advantages and positive effects:
The information acquisition method of the exemplary embodiment of the present disclosure acquires a regular expression, sequentially acquires current characters from the regular expression, and creates an initial keyword and an initial execution expression; when the current character is a non-special character, a target keyword is determined according to the current character and the initial keyword; when the current character is a special character, the initial execution expression is updated according to the current character and the target keyword to obtain an updated execution expression, and the updated execution expression is taken as a new initial execution expression; the above steps are repeated until the updated execution expression corresponding to the last character in the regular expression is obtained, that updated execution expression is taken as the target execution expression, and target information is obtained based on the target execution expression. First, the method creates the target keyword by traversing the regular expression; the creation process is simple, can support more complex environments, and optimizes system performance. Second, the updated execution expression is determined through the target keyword to obtain the final target execution expression, which reduces the number of detection loops and improves the efficiency and accuracy of information acquisition. Third, the loop is driven by judging the character type, so the implementation is simple, performs well, and saves system resources.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 schematically shows a flow diagram of an information acquisition method according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram for determining a target keyword according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram for obtaining an update execution expression according to an embodiment of the present disclosure;
FIG. 4 schematically shows a flow diagram of obtaining target information according to an embodiment of the present disclosure;
FIG. 5 schematically shows a block diagram of an information acquisition apparatus according to an embodiment of the present disclosure;
FIG. 6 schematically shows a block schematic of an electronic device according to an embodiment of the present disclosure;
FIG. 7 schematically shows a program product schematic according to an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
With the development of computer technology, it is often necessary to identify user requirements by computer, and a user's recent search records or browsing records are usually detected and counted by means of a specific expression. In the prior art, the sentence to be detected is scanned word by word in a loop to determine whether a target word exists. However, the sentence is often too long or the target word too complex, so that recognition of the target information is inefficient and the system resource overhead increases.
Based on the problems in the related art, an information acquisition method is provided in one embodiment of the present disclosure. The information acquisition method can be applied to, but is not limited to, the following scenarios: hot-word statistics, risk monitoring, business monitoring, user requirement identification, and the like. The specific application scenario of the information acquisition method is not particularly limited, and variations of the application scenario should be understood to fall within the protection scope of the present disclosure. FIG. 1 shows a flow diagram of the information acquisition method. As shown in FIG. 1, the method includes at least the following steps:
step S110: acquiring a regular expression, and sequentially acquiring current characters from the regular expression;
step S120: creating an initial keyword and an initial execution expression;
step S130: when the current character is a non-special character, determining a target keyword according to the current character and the initial keyword;
step S140: when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain an updated execution expression, and taking the updated execution expression as a new initial execution expression;
step S150: repeating the above steps until the updated execution expression corresponding to the last character in the regular expression is obtained, taking that updated execution expression as a target execution expression, and obtaining target information based on the target execution expression.
First, the information acquisition method in the embodiment of the present disclosure creates the target keyword by traversing the regular expression; the creation process is simple, can support more complex environments, and optimizes system performance. Second, the updated execution expression is determined through the target keyword to obtain the final target execution expression, which reduces the number of detection loops and improves the efficiency and accuracy of information acquisition. Third, the loop is driven by judging the character type, so the approach is simple, performs well, and saves system resources.
It should be noted that, the information obtaining method provided in the embodiment of the present disclosure is generally executed by a processor with a computing function, where the processor may include a terminal device, a server, or a processor with a computing function formed by combining a terminal device and a server, and the present disclosure is not limited in particular.
In order to make the technical solution of the present disclosure clearer, the information acquisition method in the present exemplary embodiment is described in detail below by way of example.
In step S110, regular expressions are obtained, and current characters are sequentially obtained from the regular expressions.
In an exemplary embodiment of the present disclosure, a regular expression is composed of special characters and non-special characters. Special characters include metacharacters such as "&", "(", ")", and "!"; non-special characters include letters, digits, Chinese characters, and so on. The present disclosure does not specifically limit the categories of special characters and non-special characters.
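The character classification described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the name `SPECIAL_CHARS` and the helper functions are hypothetical, and the set contains only the four characters the text lists as examples, since the disclosure leaves the full set open.

```python
# Assumed set of special characters: only the four examples named in the text.
SPECIAL_CHARS = frozenset("&()!")


def is_special(ch: str) -> bool:
    """Return True if the character is treated as a special character."""
    return ch in SPECIAL_CHARS


def classify(expression: str) -> list:
    """Pair each character of the expression, in order, with its special flag."""
    return [(ch, is_special(ch)) for ch in expression]


if __name__ == "__main__":
    for ch, special in classify("(a&b)"):
        print(ch, "special" if special else "non-special")
```

Any character outside the assumed set, including letters, digits, and Chinese characters, is classified as non-special.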
In step S120, an initial keyword and an initial execution expression are created.
In an exemplary embodiment of the present disclosure, an initial keyword and an initial execution expression are created so that they can be updated in the subsequent steps; both the initial keyword and the initial execution expression may be null.
In step S130, when the current character is a non-special character, a target keyword is determined according to the current character and the initial keyword.
In an exemplary embodiment of the present disclosure, FIG. 2 schematically illustrates a flowchart of determining a target keyword. As shown in FIG. 2, the type of the current character is judged, and when the current character is a non-special character, the target keyword is determined according to the current character and the initial keyword through the following steps:
in step S210, the current character is merged with the initial keyword to form an updated keyword.
In an exemplary embodiment of the disclosure, the initial keyword is null, the current character is sequentially obtained from the regular expression, and when the current character is a non-special character, the current character and the initial keyword are combined to form an update keyword.
In step S220, the updated keyword is used as a new initial keyword, the previous step is repeated until the obtained character is a special character, and the finally obtained updated keyword is used as a target keyword.
In an exemplary embodiment of the present disclosure, the updated keyword corresponding to the current character is used as a new initial keyword. When the next character adjacent to the current character is obtained, its type is determined; if the next character is still a non-special character, it is merged with the new initial keyword to form the updated keyword, which again becomes the new initial keyword. These steps are repeated: characters are obtained from the regular expression in sequence, their type is judged, and the keyword is updated. When the obtained current character is a special character, the most recently obtained updated keyword is taken as the target keyword.
For example, the process of determining the target keyword is as follows:
A regular expression "(mobile phone & explosion)" is obtained, and current characters are obtained from it in sequence. The first current character is "(", which is judged to be a special character. The next character, "hand" (the first character of the word "mobile phone"), is then obtained as the current character and judged to be a non-special character; it is merged with the initial keyword to form the updated keyword "hand", which serves as the initial keyword of the next loop. The next character, "machine" (the second character of the word "mobile phone"), is then obtained as the current character and judged to be a non-special character; it is merged with the initial keyword "hand" to form the updated keyword "mobile phone", which serves as the initial keyword of the next loop. The next character, "&", is then obtained as the current character and judged to be a special character, so the updated keyword obtained in the previous loop, "mobile phone", is taken as the target keyword.
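Steps S210 and S220 can be sketched roughly as below. The function name `next_target_keyword`, its return shape, and the set `SPECIAL_CHARS` are assumptions made for illustration; the example uses an ASCII stand-in expression rather than the original text.

```python
# Assumed set of special characters (the four examples given in the text).
SPECIAL_CHARS = frozenset("&()!")


def next_target_keyword(expression: str, start: int) -> tuple:
    """Accumulate non-special characters beginning at `start`.

    Returns (target_keyword, stop_index), where stop_index is the position
    of the special character that ended the keyword, or len(expression).
    """
    keyword = ""  # the initial keyword is empty
    i = start
    while i < len(expression) and expression[i] not in SPECIAL_CHARS:
        keyword += expression[i]  # S210: merge current character into keyword
        i += 1                    # S220: updated keyword is the new initial keyword
    return keyword, i


if __name__ == "__main__":
    print(next_target_keyword("(phone&explosion)", 1))
```

Starting at index 1 (just after the opening parenthesis), the sketch collects "phone" and stops at the "&" at index 6, mirroring the walk-through above.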
In an exemplary embodiment of the present disclosure, a tag is generated when the current character is a non-special character. The step of generating the tag may be performed before determining the target keyword or after determining the target keyword, which is not specifically limited by the present disclosure.
In an exemplary embodiment of the present disclosure, when the current character is a non-special character, it is determined whether a tag exists, and if the tag does not exist, the tag is generated. Specifically, the tag may be stored in a database, or the target keyword corresponding to the current character may be marked, which is not specifically limited in this disclosure.
Continuing to refer to fig. 1, in step S140, when the current character is a special character, the initial execution expression is updated according to the current character and the target keyword to obtain an updated execution expression, and the updated execution expression is used as a new initial execution expression.
In an exemplary embodiment of the disclosure, fig. 3 schematically shows a flowchart for obtaining an updated execution expression, and as shown in fig. 3, the method includes the following steps:
In step S310, when the current character is a special character, it is determined whether a tag exists.
In an exemplary embodiment of the present disclosure, when the current character is a special character, it is determined whether a tag exists in the database. If the tag does not exist, there are two possible cases: either the tag was deleted during the previous loop, or the character preceding the current character is a special character and no tag was generated.
In step S320, if the tag exists, matching the target keyword with the text to be detected, and updating the initial execution expression according to the matching result and the current character to obtain an updated execution expression.
In an exemplary embodiment of the present disclosure, the target keyword is matched against the text to be detected. If the target keyword exists in the text to be detected, the identification information is determined to be the first identifier, and the current character and the first identifier are added to the initial execution expression to obtain the updated execution expression; if the target keyword does not exist in the text to be detected, the identification information is determined to be the second identifier, and the current character and the second identifier are added to the initial execution expression to obtain the updated execution expression. The first identifier and the second identifier are different and represent two opposite meanings. For example, the first identifier may be true and the second identifier false, or the first identifier may be false and the second identifier true. Alternatively, the first identifier may be 0 and the second identifier 1, or the first identifier may be 1 and the second identifier 0. Of course, the first identifier and the second identifier may also be represented by other characters, which is not specifically limited by the present disclosure.
In step S330, if there is no tag, the current character is added to the initial execution expression to obtain an updated execution expression.
In an exemplary embodiment of the present disclosure, when no tag exists, the special character is directly added to the initial execution expression corresponding to the current character to obtain an updated execution expression.
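Steps S310 to S330 can be sketched as a single function. This is an illustrative sketch under assumptions: the function name `on_special_char` is hypothetical, the tag is modelled as a boolean rather than a database entry, matching is modelled as a plain substring test, and "true"/"false" are chosen as the first and second identifiers. The identifier is appended before the special character, which is the ordering seen in the worked example result "(true&". The sketch also clears the keyword and tag afterwards, as the disclosure describes for the end of each such loop.

```python
def on_special_char(ch: str, keyword: str, tag: bool, execution: str, text: str) -> tuple:
    """Handle one special character; return (keyword, tag, execution) after the update."""
    if tag:
        # S320: a tag exists, so match the target keyword against the text
        # and append the first ("true") or second ("false") identifier.
        execution += "true" if keyword in text else "false"
    # S330 (and the tail of S320): append the special character itself.
    execution += ch
    # Clear the target keyword and delete the tag before the next loop.
    return "", False, execution


if __name__ == "__main__":
    print(on_special_char("&", "phone", True, "(", "my phone exploded"))
```

When no tag exists, for instance for the opening parenthesis at the start of an expression, only the special character is appended.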
In an exemplary embodiment of the present disclosure, after the updated execution expression is taken as the new initial execution expression, the target keyword is cleared and the tag is deleted. That is, once the initial execution expression has been updated according to the matching result and the current character, the target keyword is cleared, the tag is deleted, and the next loop begins: the next character after the current character is obtained from the regular expression as the new current character, and the above operations are performed on it, until the final updated execution expression is formed.
Continuing to refer to fig. 1, in step S150, the above steps are repeated until an updated execution expression corresponding to the last character in the regular expression is obtained, the updated execution expression corresponding to the last character is taken as a target execution expression, and target information is obtained based on the target execution expression.
In an exemplary embodiment of the present disclosure, the current characters are sequentially obtained from the regular expression, and step S130 or step S140 is repeatedly executed according to the type of the current character until an updated execution expression corresponding to the last character in the regular expression is obtained.
In an exemplary embodiment of the present disclosure, when the last character of the regular expression is a special character, step S140 is performed to obtain an updated execution expression corresponding to the last character, and take the updated execution expression corresponding to the last character as the target execution expression.
In an exemplary embodiment of the present disclosure, when the last character of the regular expression is a non-special character, the target keyword corresponding to the last character is matched against the text to be detected. If the target keyword corresponding to the last character exists in the text to be detected, a third identifier is added to the initial execution expression to obtain the updated execution expression corresponding to the last character; if it does not exist in the text to be detected, a fourth identifier is added to the initial execution expression to obtain the updated execution expression corresponding to the last character. The third identifier is the same as the first identifier, and the fourth identifier is the same as the second identifier.
In an exemplary embodiment of the present disclosure, obtaining target information based on a target execution expression includes: statistics and/or evaluations are performed according to the target execution expression to determine statistical information and/or evaluation information.
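The disclosure does not spell out how the target execution expression is evaluated. One possible reading, sketched below under stated assumptions, treats "&" as logical AND, "!" as logical NOT with higher precedence, and parentheses as grouping, with "true" and "false" as the identifiers. The names `evaluate` and `count_hits` are hypothetical, and the small recursive-descent parser is a swapped-in technique chosen for illustration.

```python
import re


def evaluate(execution: str) -> bool:
    """Evaluate an execution expression such as "(true&!false)" (assumed semantics)."""
    tokens = re.findall(r"true|false|[&!()]", execution)
    pos = 0

    def parse_and() -> bool:
        nonlocal pos
        value = parse_not()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_not()  # always parse the right side before combining
            value = value and rhs
        return value

    def parse_not() -> bool:
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "!":
            pos += 1
            return not parse_not()
        return parse_atom()

    def parse_atom() -> bool:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_and()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if tok in ("true", "false"):
            return tok == "true"
        raise ValueError(f"unexpected token {tok!r}")

    result = parse_and()
    if pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result


def count_hits(executions: list) -> int:
    """Simple statistic: how many execution expressions evaluate to True."""
    return sum(evaluate(e) for e in executions)


if __name__ == "__main__":
    print(evaluate("(true&false)"), count_hits(["(true&true)", "(true&false)"]))
```

Under these assumptions, evaluating one expression per detected text gives the evaluation information, and counting the expressions that evaluate to true across many texts gives a simple form of statistical information.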
In an exemplary embodiment of the present disclosure, fig. 4 schematically illustrates a flowchart of obtaining target information. As shown in fig. 4:
in step S401, a regular expression is obtained, and current characters are obtained from the regular expression in sequence;
in step S402, an initial keyword and an initial execution expression are created, where the initial keyword is empty and the initial execution expression is empty;
in step S403, it is determined whether the current character is a special character;
in step S404, if the current character is a non-special character, the current character is added to the initial keyword to obtain an updated keyword, a tag is generated, and steps S402 to S404 are repeated until the obtained current character is a special character;
in step S405, if the current character is a special character, the updated keyword corresponding to the previous non-special character is used as the target keyword;
in step S406, if the current character is a special character, it is determined whether a tag exists;
in step S407, if no tag exists, the current character is added to the initial execution expression to obtain an updated execution expression, which becomes the new initial execution expression;
in step S408, if the tag exists, it is determined whether the text to be detected includes the target keyword;
in step S409, if the text to be detected includes the target keyword, the first identifier is added to the initial execution expression to obtain an updated execution expression, which becomes the new initial execution expression;
in step S410, if the text to be detected does not include the target keyword, the second identifier is added to the initial execution expression to obtain an updated execution expression, which becomes the new initial execution expression;
in step S411, the target keyword is cleared, the tag is deleted, and step S407 is executed.
It should be noted that steps S401 to S411 are repeated until every character in the regular expression has been obtained.
In step S412, if the last character of the regular expression is a non-special character, the updated keyword corresponding to the last character is used as the target keyword, and steps S408 to S410 are executed again;
in step S413, the updated execution expression corresponding to the last character is taken as the target execution expression, and the target information is acquired based on the target execution expression.
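The flow of steps S401 to S413 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the applicant's implementation: the set of special characters (`SPECIAL_CHARS`) is not enumerated in this passage and is assumed here to be `( ) & | !`; the first and second identifiers are taken to be the strings "true" and "false"; the function name `build_execution_expression` is hypothetical; and ASCII keywords stand in for the keywords of the original example.

```python
# Assumption: the passage does not list the special characters.
SPECIAL_CHARS = frozenset("()&|!")


def build_execution_expression(pattern: str, text: str) -> str:
    """Turn a keyword pattern into a boolean execution expression."""
    keyword = ""      # S402: initial keyword is empty
    expression = ""   # S402: initial execution expression is empty
    tagged = False    # the "tag": set once a non-special character is seen

    for ch in pattern:                      # S401: take characters in order
        if ch not in SPECIAL_CHARS:         # S403/S404: extend the keyword
            keyword += ch
            tagged = True
            continue
        if tagged:                          # S406/S408: a keyword is pending
            # S409/S410: append the first or second identifier
            expression += "true" if keyword in text else "false"
            keyword = ""                    # S411: clear keyword, delete tag
            tagged = False
        expression += ch                    # S407: append the special character

    if tagged:                              # S412: pattern ended on a keyword
        expression += "true" if keyword in text else "false"
    return expression                       # S413: target execution expression


text = "my phone did explode"
print(build_execution_expression("(phone&explode)", text))  # (true&true)
print(build_execution_expression("(phone&fire)", text))     # (true&false)
```

Keyword matching is done here with a plain substring test, which is one possible reading of "the text to be detected includes the target keyword".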
The information acquisition process of the present disclosure is explained in detail below, taking the regular expression "(mobile phone & explosion)" and the text to be detected "My mobile phone exploded in October; can you help me deal with it?" as an example. The process of acquiring the target information is as follows:
First loop: the first character "(" of the regular expression is taken. It is a special character and no tag exists, so it is added to the initial execution expression to obtain the updated execution expression "(", which also becomes the new initial execution expression.
Second loop: the second character "hand" (the first character of the keyword "mobile phone") is taken. It is a non-special character, so it is added to the initial keyword to obtain the updated keyword "hand", which becomes the new initial keyword, and a tag is generated.
Third loop: the third character "machine" (the second character of the keyword "mobile phone") is taken. It is a non-special character, so it is added to the initial keyword to obtain the updated keyword "mobile phone", which becomes the new initial keyword, and a tag is generated.
Fourth loop: the fourth character "&" is taken. It is a special character, so the updated keyword obtained in the previous loop is used as the target keyword, namely "mobile phone", and a tag exists. The target keyword "mobile phone" is then found in the text to be detected, so the first identifier "true" and the fourth character "&" are added to the initial execution expression; the updated execution expression and the new initial execution expression are both "(true&". Finally, the target keyword is cleared and the tag is deleted.
Fifth loop: the fifth character (the first character of the keyword "explosion") is taken. It is a non-special character, so it is added to the initial keyword; the updated keyword and the new initial keyword consist of that single character, and a tag is generated.
Sixth loop: the sixth character "burst" (the second character of the keyword "explosion") is taken. It is a non-special character, so it is added to the initial keyword; the updated keyword and the new initial keyword are both "explosion", and a tag is generated.
Seventh loop: the seventh character ")" is taken. It is a special character, so the updated keyword obtained in the previous loop is used as the target keyword, namely "explosion", and a tag exists. The target keyword "explosion" is then found in the text to be detected, so the first identifier "true" and the seventh character ")" are added to the initial execution expression; the updated execution expression and the new initial execution expression are both "(true&true)". Finally, the target keyword is cleared and the tag is deleted.
After the loop ends, because the last character of the regular expression is a special character, the updated execution expression corresponding to the last character is taken as the target execution expression, namely "(true&true)". Evaluating the target execution expression gives (true&true) = true, so the regular expression matches the text to be detected. A plurality of texts to be detected are checked in this way, and the detection results are counted to finally obtain the statistical information.
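The final evaluation step, in which "(true&true)" is computed to be true, can be illustrated with a small recursive-descent evaluator. This is a sketch under assumptions: the function name `evaluate` is hypothetical, and support for `|` and `!` is assumed for illustration, since the worked example only uses `&` and parentheses.

```python
def evaluate(expression: str) -> bool:
    """Evaluate an expression built from true/false, &, |, ! and parentheses."""
    pos = 0

    def peek() -> str:
        return expression[pos] if pos < len(expression) else ""

    def parse_or() -> bool:
        nonlocal pos
        value = parse_and()
        while peek() == "|":
            pos += 1
            rhs = parse_and()   # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and() -> bool:
        nonlocal pos
        value = parse_not()
        while peek() == "&":
            pos += 1
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not() -> bool:
        nonlocal pos
        if peek() == "!":
            pos += 1
            return not parse_not()
        return parse_atom()

    def parse_atom() -> bool:
        nonlocal pos
        if peek() == "(":
            pos += 1
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return value
        for literal, value in (("true", True), ("false", False)):
            if expression.startswith(literal, pos):
                pos += len(literal)
                return value
        raise ValueError(f"unexpected input at position {pos}")

    result = parse_or()
    if pos != len(expression):
        raise ValueError(f"unexpected input at position {pos}")
    return result


print(evaluate("(true&true)"))   # True
print(evaluate("(true&false)"))  # False
```

A hand-written parser is used instead of Python's built-in `eval` so that only the intended grammar is accepted.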
In addition, when the regular expression is "mobile phone & explosion" (without parentheses), the information acquisition method proceeds as follows:
First loop: the current character is "hand", the updated keyword is "hand", and the updated execution expression is empty.
Second loop: the current character is "machine", the updated keyword is "mobile phone", and the updated execution expression is empty.
Third loop: the current character is "&", the target keyword is "mobile phone", the updated execution expression is "true&", and the target keyword is cleared.
Fourth loop: the current character is the first character of the keyword "explosion", the updated keyword consists of that single character, and the updated execution expression is "true&".
Fifth loop: the current character is "burst" (the second character of the keyword "explosion"), the updated keyword is "explosion", and the updated execution expression is "true&".
After the loop ends, because the last character of the regular expression is a non-special character, the updated keyword corresponding to the last character is used as the target keyword, namely "explosion". It is then judged whether the target keyword "explosion" exists in the text to be detected; since it does, the execution expression is updated to "true&true", and the target execution expression finally obtained is "true&true".
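The loop-by-loop states listed above can be reproduced with a small tracing generator. This is a sketch under assumptions: the name `trace`, the `"<end>"` marker for the post-loop step, and the special-character set are all hypothetical, and a non-empty keyword is used in place of an explicit tag. ASCII keywords stand in for the keywords of the original example.

```python
from typing import Iterator, Tuple

# Assumption: the passage does not list the special characters.
SPECIAL_CHARS = frozenset("()&|!")


def trace(pattern: str, text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (current character, keyword, execution expression) per loop."""
    keyword, expression = "", ""
    for ch in pattern:
        if ch in SPECIAL_CHARS:
            if keyword:  # a pending keyword plays the role of the tag
                expression += "true" if keyword in text else "false"
                keyword = ""  # the target keyword is cleared
            expression += ch
        else:
            keyword += ch
        yield ch, keyword, expression
    if keyword:
        # The pattern ended on a non-special character: match the last keyword.
        expression += "true" if keyword in text else "false"
        yield "<end>", keyword, expression


for step in trace("phone&explode", "my phone did explode"):
    print(step)
```

The last tuple printed carries the target execution expression, which corresponds to the "true&true" result of the example above.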
It should be noted that the above scenario is only an exemplary scenario and does not provide any limitation to the scope of the exemplary embodiments of the present disclosure.
The following describes embodiments of the apparatus of the present disclosure, which may be used to perform the above-mentioned information acquisition method of the present disclosure. For details that are not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the information obtaining method described above in the present disclosure.
Fig. 5 schematically shows a block diagram of an information acquisition apparatus according to one embodiment of the present disclosure.
Referring to fig. 5, an information acquisition apparatus 500 according to an embodiment of the present disclosure includes: an expression obtaining module 501, an initial word creating module 502, a keyword updating module 503, an expression determining module 504, and a target information obtaining module 505. Specifically:
The expression obtaining module 501 is configured to obtain a regular expression and to obtain current characters from the regular expression in sequence.
The initial word creating module 502 is configured to create an initial keyword and an initial execution expression.
The keyword updating module 503 is configured to determine a target keyword according to the current character and the initial keyword when the current character is a non-special character.
The expression determining module 504 is configured to, when the current character is a special character, update the initial execution expression according to the current character and the target keyword to obtain an updated execution expression, and to use the updated execution expression as a new initial execution expression.
The target information obtaining module 505 is configured to repeat the above steps until the updated execution expression corresponding to the last character in the regular expression is obtained, to take the updated execution expression corresponding to the last character as the target execution expression, and to obtain target information based on the target execution expression.
The details of each module of the information acquisition apparatus have already been described in the corresponding information acquisition method, and are therefore not repeated here.
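One way to picture the division into modules 501 to 505 is a class whose methods mirror them. This is only a sketch: the class name, method names, special-character set, and "true"/"false" identifiers are assumptions, and the disclosure itself notes that the division into modules is not mandatory.

```python
class InformationAcquisitionApparatus:
    """Hypothetical class whose methods mirror modules 501 to 505."""

    # Assumption: the passage does not list the special characters.
    SPECIAL_CHARS = frozenset("()&|!")

    def __init__(self, pattern: str, text: str) -> None:
        self.pattern = pattern
        self.text = text

    def characters(self):
        """Module 501: hand out the characters of the pattern in sequence."""
        return iter(self.pattern)

    def create_initial(self):
        """Module 502: create an empty keyword and execution expression."""
        return "", ""

    def update_keyword(self, keyword: str, ch: str) -> str:
        """Module 503: extend the keyword with a non-special character."""
        return keyword + ch

    def update_expression(self, expression: str, keyword: str, ch: str) -> str:
        """Module 504: append the match identifier (if any) and the character."""
        if keyword:
            expression += "true" if keyword in self.text else "false"
        return expression + ch

    def target_expression(self) -> str:
        """Module 505: repeat the steps until the last character is handled."""
        keyword, expression = self.create_initial()
        for ch in self.characters():
            if ch in self.SPECIAL_CHARS:
                expression = self.update_expression(expression, keyword, ch)
                keyword = ""
            else:
                keyword = self.update_keyword(keyword, ch)
        if keyword:  # the pattern ended on a non-special character
            expression = self.update_expression(expression, keyword, "")
        return expression


apparatus = InformationAcquisitionApparatus("(phone&explode)", "my phone did explode")
print(apparatus.target_expression())  # (true&true)
```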
It should be noted that although several modules or units of the apparatus are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the electronic device 600 takes the form of a general-purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610), and a display unit 640.
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the "exemplary methods" section of this specification. For example, the processing unit 610 may execute the following steps shown in fig. 1: step S110, obtaining a regular expression, and sequentially obtaining current characters from the regular expression; step S120, creating an initial keyword and an initial execution expression; step S130, when the current character is a non-special character, determining a target keyword according to the current character and the initial keyword; step S140, when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain an updated execution expression, and taking the updated execution expression as a new initial execution expression; and step S150, repeating the above steps until the updated execution expression corresponding to the last character in the regular expression is obtained, taking the updated execution expression corresponding to the last character as the target execution expression, and obtaining target information based on the target execution expression.
The storage unit 620 may include readable media in the form of volatile storage units, such as a random access memory (RAM) 6201 and/or a cache memory 6202, and may further include a read-only memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 800 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a viewer to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. As shown, the network adapter 660 communicates with the other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section "exemplary methods" of the present description, when said program product is run on the terminal device.
Referring to fig. 7, a program product 700 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.

Claims (13)

1. An information acquisition method, comprising:
acquiring a regular expression, and sequentially acquiring current characters from the regular expression;
creating an initial keyword and an initial execution expression;
when the current character is a non-special character, determining a target keyword according to the current character and the initial keyword;
when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain an updated execution expression, and taking the updated execution expression as a new initial execution expression;
and repeating the above steps until an updated execution expression corresponding to the last character in the regular expression is obtained, taking the updated execution expression corresponding to the last character as a target execution expression, and obtaining target information based on the target execution expression.
2. The information acquisition method according to claim 1, characterized in that the method further comprises:
generating a tag when the current character is the non-special character.
3. The information acquisition method according to claim 1, wherein determining a target keyword from the current character and the initial keyword comprises:
combining the current character and the initial keyword to form an updated keyword;
and taking the updated keyword as a new initial keyword, repeating the previous step until the obtained current character is a special character, and taking the finally obtained updated keyword as the target keyword.
4. The information acquisition method according to claim 1,
when the current character is a special character, updating the initial execution expression according to the current character and the target keyword to obtain the updated execution expression, wherein the updating comprises:
when the current character is a special character, judging whether a tag exists;
if the tag exists, matching the target keyword with the text to be detected, and updating the initial execution expression according to the matching result and the current character to obtain the updated execution expression;
and if the tag does not exist, adding the current character to the initial execution expression to obtain the updated execution expression.
5. The information acquisition method according to claim 4, wherein updating the initial execution expression according to the matching result and the current character to obtain the updated execution expression comprises:
and determining identification information according to the matching result, and updating the initial execution expression according to the identification information and the current character to obtain the updated execution expression.
6. The information acquisition method according to claim 5, wherein the identification information includes a first identification and a second identification;
updating the initial execution expression according to the identification information and the current character to obtain the updated execution expression, including:
if the target keyword exists in the text to be detected, adding the current character and the first identifier to the initial execution expression to obtain the updated execution expression;
and if the target keyword does not exist in the text to be detected, adding the current character and the second identifier into the initial execution expression to obtain the updated execution expression.
7. The information acquisition method according to claim 4, wherein after the updated execution expression is treated as a new initial execution expression, the method further comprises:
clearing the target keyword and deleting the tag.
8. The information acquisition method according to claim 3, wherein acquiring the updated execution expression corresponding to the last character in the regular expression comprises:
when the last character is a non-special character, taking the updated keyword corresponding to the last character as the target keyword;
and matching the target keyword with the text to be detected, and acquiring the updated execution expression corresponding to the last character according to the matching result.
9. The information acquisition method according to claim 8, wherein acquiring the updated execution expression corresponding to the last character according to the matching result includes:
if the target keyword corresponding to the last character exists in the text to be detected, adding a third identifier to the initial execution expression to obtain an updated execution expression corresponding to the last character;
and if the target keyword corresponding to the last character does not exist in the text to be detected, adding a fourth identifier to the initial execution expression to obtain an updated execution expression corresponding to the last character.
10. The information acquisition method according to claim 1, wherein acquiring target information based on the target execution expression includes:
and performing statistics and/or evaluation according to the target execution expression to determine statistical information and/or evaluation information.
11. An information acquisition apparatus characterized by comprising:
an expression obtaining module, configured to obtain a regular expression and sequentially obtain current characters from the regular expression;
an initial word creating module, configured to create an initial keyword and an initial execution expression;
a keyword updating module, configured to determine a target keyword according to the current character and the initial keyword when the current character is a non-special character;
an expression determining module, configured to update the initial execution expression according to the current character and the target keyword when the current character is a special character, to obtain an updated execution expression, and to use the updated execution expression as a new initial execution expression;
and a target information obtaining module, configured to repeat the above steps until an updated execution expression corresponding to the last character in the regular expression is obtained, to take the updated execution expression corresponding to the last character as a target execution expression, and to obtain target information based on the target execution expression.
12. A computer-readable storage medium on which a computer program is stored, the program implementing the information acquisition method according to any one of claims 1 to 10 when executed by a processor.
13. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the information acquisition method according to any one of claims 1 to 10.
CN202010144301.7A 2020-03-04 2020-03-04 Information acquisition method and device, computer storage medium and electronic equipment Pending CN111753548A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010144301.7A CN111753548A (en) 2020-03-04 2020-03-04 Information acquisition method and device, computer storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010144301.7A CN111753548A (en) 2020-03-04 2020-03-04 Information acquisition method and device, computer storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111753548A true CN111753548A (en) 2020-10-09

Family

ID=72673103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010144301.7A Pending CN111753548A (en) 2020-03-04 2020-03-04 Information acquisition method and device, computer storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111753548A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342866A (en) * 2021-06-22 2021-09-03 广州华多网络科技有限公司 Keyword updating method and device, computer equipment and storage medium
CN113342866B (en) * 2021-06-22 2022-06-21 广州华多网络科技有限公司 Keyword updating method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination