WO2023019576A1 - Text search processing method and related device - Google Patents

Text search processing method and related device

Info

Publication number
WO2023019576A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
preset
search
keywords
text
Prior art date
Application number
PCT/CN2021/113863
Other languages
English (en)
French (fr)
Inventor
袁苏亮
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21953817.0A (patent EP4383085A1)
Priority to CN202180006898.1A (patent CN115997201A)
Priority to PCT/CN2021/113863 (patent WO2023019576A1)
Publication of WO2023019576A1
Priority to US18/443,398 (patent US20240184687A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3636 Software debugging by tracing the execution of the program
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques

Definitions

  • This application relates to the field of information and communications technology (ICT), in particular to a text search processing method and related equipment.
  • ICT: information and communications technology
  • Software code usually contains a large number of branch structures. When the software code actually runs, each branch that is executed records its running path information in a log file. When log files are scanned and analyzed, the running path of the software code is generally identified according to the keywords in the running path information, so that the actual running behavior of the software code can be analyzed and corresponding quality policies can be implemented.
  • the embodiment of the present application provides a text search processing method and related equipment, which search for keywords in multiple pieces of running path information through keyword rules in a keyword rule set, so that the search results are more accurate and satisfy the user's search requests.
  • the first aspect of the embodiments of the present application provides a method for text search processing, which can be applied in log analysis and processing scenarios.
  • the method can also be applied to devices such as terminal devices and vehicles.
  • the method may include: acquiring the first text and a set of preset search rules.
  • the first text includes one or more pieces of running path information
  • the set of preset search rules includes one or more preset search rules
  • each preset search rule is used to indicate a logical relationship between at least one keyword. Then, based on the first keyword and the first preset search rule, one or more second keywords are searched.
  • the first keyword is a keyword obtained according to the first running path information
  • the first running path information is any one of the one or more pieces of running path information
  • the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information.
  • a first search result is determined according to the first keyword and one or more second keywords, and the first search result is used to indicate the running behavior of the running path corresponding to the first running path information.
  • the one or more second keywords can specifically be searched for as follows: first obtain the first row value, which identifies the number of the row where the first keyword is located. Then, based on the first row value and the first preset search rule, search for the one or more second keywords within the preset offset range.
  • the preset offset range indicates row offset values between one or more second keywords and the first keyword.
  • searching for one or more second keywords within a preset offset range based on the first row value and the first preset search rule includes: searching for a third keyword within a first preset offset range according to the first row value and the first preset search rule, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.
  • any one of the one or more second keywords, that is, the third keyword, can be searched for within its own corresponding first preset offset range, thereby improving search efficiency.
  • the logical relationship includes at least one of the following: a first identifier, a second identifier, and a third identifier.
  • the first identifier indicates that one or more keywords exist within a preset offset range
  • the second identifier indicates that one of multiple keywords exists within the preset offset range
  • the third identifier indicates that one or more keywords do not exist within the preset offset range.
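The three identifiers can be sketched as a small check over the rows that follow the first keyword. This is a minimal illustration rather than the patent's implementation: the function name `relation_holds`, plain substring matching, the 0-based row index, and the labels "AND", "OR" and "NOT" for the first, second and third identifiers are all assumptions made for the example.

```python
from typing import List


def relation_holds(lines: List[str], first_row: int, relation: str,
                   keywords: List[str], max_offset: int) -> bool:
    """Check one logical relationship within max_offset rows of first_row."""
    # Rows first_row .. first_row + max_offset (inclusive); slicing clips at the end.
    window = lines[first_row:first_row + max_offset + 1]
    present = [any(kw in row for row in window) for kw in keywords]
    if relation == "AND":   # first identifier: every listed keyword exists in range
        return all(present)
    if relation == "OR":    # second identifier: one of the listed keywords exists
        return any(present)
    if relation == "NOT":   # third identifier: none of the listed keywords exist
        return not any(present)
    raise ValueError(f"unknown relation: {relation}")


# Hypothetical four-row log used only to exercise the function.
log = ["KW1 start", "KW2a", "noise", "KW4 end"]
print(relation_holds(log, 0, "OR", ["KW2a", "KW2b"], 2))
```

Because each check only looks at a bounded slice of rows, the cost per keyword depends on the offset range rather than on the size of the whole log.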
  • the one or more second keywords further include a fourth keyword and a fifth keyword.
  • the second preset offset range is obtained from the row offset value between the fourth keyword and the first keyword, where the fourth keyword is one of the one or more second keywords; or, the second preset offset range is obtained from the row offset value between the fifth keyword and the fourth keyword together with the row offset value between the fourth keyword and the first keyword,
  • where the fifth keyword is a keyword different from the fourth keyword among the one or more second keywords.
  • acquiring the first row value may include: acquiring a second text, where the second text is obtained by processing the first text through a hash algorithm. Then, get the first row value based on the second text.
  • in this way, the first text can be transformed into the second text, which stores data by row; based on the second text, the line number of the keyword in each piece of running path information can be marked quickly, and a vertical search can also be performed quickly.
  • the type of the keyword includes a string type and/or a key-value pair type.
  • the obtained first search result is used to detect
  • the same operation may be performed on the remaining running path information based on other preset search rules in the preset search rule set. That is, the text search processing method further includes: based on the sixth keyword and the second preset search rule, searching for one or more seventh keywords from the second running path information, where the sixth keyword is a keyword obtained from the second running path information, and the second preset search rule is any one of one or more preset search rules corresponding to the second running path information.
  • a second search result is determined based on the sixth keyword and one or more seventh keywords. The second search result is used to indicate the running behavior of the running path corresponding to the second running path information.
  • the second running path information is different from the first running path information.
  • the second preset search rule may or may not be the same as the first preset search rule, which is not limited in this application.
  • the sixth keyword and the first keyword may or may not be the same, and are not limited in this application.
  • the embodiment of the present application provides a text search device.
  • the text search device may be a terminal device, a vehicle, a smart car, a computer, and the like.
  • the text search device includes an acquisition unit and a processing unit.
  • the obtaining unit is used to obtain the first text and a set of preset search rules.
  • the first text includes one or more running path information
  • the preset search rule set includes one or more preset search rules
  • each preset search rule indicates a logical relationship between a plurality of keywords included in the corresponding running path information.
  • a processing unit configured to search for one or more second keywords based on the first keyword and the first preset search rule, and determine a first search result based on the first keyword and the one or more second keywords.
  • the first keyword is a keyword obtained according to the first running path information
  • the first running path information is any one of the one or more running path information
  • the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information
  • the first search result is used to indicate the running behavior of the running path corresponding to the first running path information.
  • the acquiring unit is configured to acquire a first row value, where the first row value is used to identify a row number where the first keyword is located.
  • the processing unit is configured to search for one or more second keywords within a preset offset range according to the first row value and the first preset search rule, where the preset offset range indicates the row offset values between the one or more second keywords and the first keyword.
  • the processing unit is further configured to search for a third keyword within a first preset offset range according to the first row value and the first preset search rule, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.
  • the logical relationship includes at least one of the following: a first identifier, a second identifier, and a third identifier.
  • the first identifier indicates that one or more keywords exist within the preset offset range
  • the second identifier indicates that one of multiple keywords exists within the preset offset range
  • the third identifier indicates that one or more keywords do not exist within the preset offset range.
  • the one or more second keywords further include a fourth keyword and a fifth keyword.
  • the processing unit is configured to search for the fourth keyword according to a second preset offset range, where the second preset offset range is obtained from the row offset value between the fourth keyword and the first keyword, and the fourth keyword is one of the one or more second keywords; or, the second preset offset range is obtained from the row offset value between the fifth keyword and the fourth keyword together with the row offset value between the fourth keyword and the first keyword, where the fifth keyword is a keyword different from the fourth keyword among the one or more second keywords.
  • the obtaining unit is configured to obtain a second text, where the second text is obtained by processing the first text through a hash algorithm, and to obtain the first row value based on the second text.
  • the type of the keyword includes a string type and/or a key-value pair type.
  • the processing unit is further configured to: search for one or more seventh keywords from the second running path information based on the sixth keyword and the second preset search rule, the sixth keyword is based on The keyword obtained from the second running path information, and the second preset search rule is any one of one or more preset search rules corresponding to the second running path information. Then, the processing unit determines a second search result according to the sixth keyword and one or more seventh keywords, and the second search result is used to indicate the running behavior of the running path corresponding to the second running path information.
  • a third aspect of the present application provides a vehicle, which may include: a memory configured to store computer-readable instructions. It may also include a processor coupled to the memory, configured to execute computer-readable instructions in the memory, so as to execute the method described in the first aspect or any possible implementation manner of the first aspect.
  • a fourth aspect of the present application provides a server, which may include: a memory configured to store computer-readable instructions. It may also include a processor coupled to the memory, configured to execute computer-readable instructions in the memory, so as to execute the method described in the first aspect or any possible implementation manner of the first aspect.
  • the fifth aspect of the present application provides a computer-readable storage medium.
  • the computer device executes the method described in the first aspect or any possible implementation manner of the first aspect.
  • a sixth aspect of the present application provides a computer program product, which, when run on a computer, enables the computer to execute the method described in the first aspect or any possible implementation manner of the first aspect.
  • the seventh aspect of the present application provides a system-on-a-chip, which may include a processor configured to support the text search device in implementing the functions involved in the method of the first aspect or any possible implementation manner of the first aspect.
  • the chip system may further include a memory, and the memory is used for storing necessary program instructions and data of the text search device.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices. Wherein, the system-on-a-chip may include an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or other programmable logic devices. Further, the chip system may also include an interface circuit and the like.
  • ASIC: application specific integrated circuit
  • FPGA: field programmable gate array
  • each piece of running path information corresponds to one or more preset search rules
  • each preset search rule indicates a logical relationship between the keywords included in the running path information. Therefore, after the first keyword in the first running path information is obtained, one or more second keywords can be searched for according to the first keyword and the first preset search rule, and the first search result can then be determined according to the first keyword and the one or more second keywords.
  • keywords in multiple pieces of running path information are searched through the keyword rules in the keyword rule set. Users do not need to master complex regular expressions; they only need to understand the logical relationship between multiple keywords.
  • dividing a log text to be processed that contains a large amount of data into multiple subtexts, each starting from the first keyword in a piece of running path information, allows the remaining keywords to be searched efficiently, quickly, and accurately.
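The division into subtexts can be sketched as follows. This is only an illustration under assumptions: the function name `split_at_entry` is hypothetical, the entry keyword is matched by substring, and rows that precede the first occurrence of the entry keyword are simply dropped.

```python
from typing import List


def split_at_entry(lines: List[str], entry_keyword: str) -> List[List[str]]:
    """Split a log into subtexts, each starting at a row containing entry_keyword."""
    segments: List[List[str]] = []
    current = None
    for line in lines:
        if entry_keyword in line:
            # A new occurrence of the entry keyword closes the previous subtext.
            if current is not None:
                segments.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        segments.append(current)
    return segments


# Hypothetical log: two occurrences of the entry keyword "KW1".
sample = ["boot", "KW1 a", "x", "KW1 b", "y", "z"]
segments = split_at_entry(sample, "KW1")
print(len(segments))
```

Each subtext can then be checked against the preset search rule independently of the others.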
  • FIG. 1 is a schematic diagram of searching keywords in related schemes
  • FIG. 2 is a schematic flowchart of a method for text search processing provided by the present application
  • FIG. 3A is a schematic diagram of a preset logical relationship among multiple keywords provided by the present application.
  • FIG. 3B is a schematic diagram of an interface for establishing search rules provided by the present application.
  • FIG. 4 is a schematic diagram of a search using the scheme of the present application.
  • FIG. 5 is a schematic diagram of a hardware structure of a communication device provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a message processing device provided by an embodiment of the present application.
  • the embodiment of the present application provides a text search processing method and related equipment, which search for keywords in multiple pieces of running path information through keyword rules in a keyword rule set, so that the search results are more accurate and satisfy the user's search requests.
  • Software code usually contains a large number of running branch structures. In the process of actually running the software code, running different branches will record the corresponding running path information in the log file.
  • the running path of the software code is generally identified according to the keyword (keyword, KW) in the running path information, so as to analyze the actual running behavior of the software code, so as to implement the corresponding quality strategy.
  • FIG. 1 is a schematic diagram of searching keywords in related schemes. It can be seen from FIG. 1 that the log texts generated by the software code in two different running processes (e.g., running process 1 and running process 2) are all in the same log file. If the user wants to search for the logs generated in running process 2 through the keywords KW01, KW02, and KW03 using traditional regular expressions, the search is very likely to return line01.KW01 and line03.KW02 generated in running process 1 together with line09.KW03 generated in running process 2.
  • the embodiment of the present application provides a method for text search processing, in which the keyword rules (keywords rules, KWR) in the keyword rule set (keywords rules set, KWRS) are used to search for the keywords in multiple pieces of running path information, so that the search results are relatively accurate and meet the user's search demands.
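The mismatch described for FIG. 1 can be reproduced with a small sketch. The exact log layout of FIG. 1 is not available here, so the nine rows below and their assignment to running process 1 (rows 1 to 6) and running process 2 (rows 7 to 9) are assumptions chosen only to match the line numbers mentioned in the text.

```python
import re

# Assumed layout: rows 1-6 come from running process 1, rows 7-9 from running process 2.
lines = [
    "line01 KW01",   # process 1
    "line02 other",
    "line03 KW02",   # process 1
    "line04 other",
    "line05 other",
    "line06 other",
    "line07 KW01",   # process 2
    "line08 KW02",   # process 2
    "line09 KW03",   # process 2
]
log = "\n".join(lines)

# DOTALL lets "." cross line breaks, so the match may span both processes.
pattern = re.compile(r"KW01.*?KW02.*?KW03", re.DOTALL)
match = pattern.search(log)

# Convert character positions back to 1-based line numbers.
start_line = log.count("\n", 0, match.start()) + 1
end_line = log.count("\n", 0, match.end()) + 1
print(start_line, end_line)
```

The match starts in the rows of running process 1 and ends in the rows of running process 2, so the hit mixes two running processes even though the user only wanted the log of running process 2.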
  • the text search processing method can be applied in vehicles, and can also be applied in devices such as terminal devices and servers.
  • the vehicle may include but not limited to a smart car and the like.
  • Terminal devices may include, but are not limited to, personal computers, mobile phones, tablet computers, wearable smart devices, and the like.
  • the method for text search processing can be applied in log analysis and processing scenarios.
  • the text search processing method can be applied throughout the entire life cycle of a product, for example in the development and debugging stage and the maintenance stage, which is not limited here.
  • FIG. 2 is a schematic flowchart of a text search processing method provided by an embodiment of the present application.
  • the text search processing method shown in FIG. 2 can be applied in a text search device, which may include but not limited to a vehicle, a terminal device, a server, etc., and is not limited here.
  • the text search processing method includes the following steps:
  • running software codes in different processes will generate corresponding log texts.
  • the log text includes one or more running path information, and each running path information may include multiple keywords. Therefore, for each running process, the first text in the running process can be obtained.
  • the first text can be stored in a device such as a local server, or it can be uploaded to and stored in the cloud. Therefore, the first text can be obtained from a device such as a local server, or obtained from the cloud. In practical applications, there may also be other acquisition methods, which are not limited here.
  • the first text includes, but is not limited to, log text of vehicle power management, software logs of vehicle-mounted products, logs generated during system operation, etc., which are not limited here.
  • the type of the keyword may include a character string (string) type and/or a key-value pair (key-value, K-V) type.
  • the type of the keyword may be other types, which are not specifically limited here.
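The two keyword types can be illustrated with a short sketch. The source does not specify how key-value pairs appear in a log row, so the `key=value` token format, the function names, and the sample row are all assumptions made for this example.

```python
import re


def match_string(line: str, value: str) -> bool:
    """String-type keyword: the row matches if it contains the value."""
    return value in line


def match_key_value(line: str, key: str, value: str) -> bool:
    """Key-value keyword: the row matches if it has a key=value token (assumed format)."""
    for token in re.finditer(r"(\w+)=(\S+)", line):
        if token.group(1) == key and token.group(2) == value:
            return True
    return False


# Hypothetical log row mixing free text and key=value tokens.
row = "vehicle power on ts=12 state=management soc=80"
print(match_string(row, "vehicle"), match_key_value(row, "state", "management"))
```

A key-value match is stricter than a string match: the value only counts when it is attached to the expected key.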
  • 202 Acquire a preset search rule set, where the preset search rule set includes one or more preset search rules, and each preset search rule is used to indicate a logical relationship between at least one keyword.
  • each running path information may include multiple keywords
  • a logical relationship between at least one keyword in each running path information is set as a preset search rule.
  • one piece of running path information may correspond to one or more preset search rules, and one or more preset search rules are combined into the preset search rule set.
  • the logical relationship between at least one keyword included in the running path information A can be understood by referring to FIG. 3A .
  • the logical relationship can be: KW1 needs to appear in the running path information A, which can be identified by the symbol "AND"; either of the keywords KW2a and KW2b needs to appear in the running path information A, which can be identified by the symbol "OR"; KW3 must not appear in the running path information A, which can be identified by the symbol "NOT"; and KW4, the keyword at which the search in the running path information A ends, needs to appear in the running path information A, which can be identified by the symbol "AND".
  • the logical relationship shown in FIG. 3A can be configured as a preset search rule corresponding to the running path information A.
  • the input window of the preset search rule includes a "keyword type" input window, a "keyword" input window, a "keyword value" input window, a "logical relationship" input window, a "preset offset range" input window, and so on.
  • the user can fill in each keyword satisfying the logical relationship in the input window corresponding to FIG. 3B according to actual needs, and then a preset search rule can be obtained.
  • KW1 is the vehicle
  • KW2a is the power supply
  • KW2b is the remaining power
  • KW3 is the management
  • KW4 is the consumption.
  • the user can use KW1 of the String type as the search entry for the running path information A in the "vehicle power management log text", and fill in the "keyword type" input window, the "keyword" input window, the "operator" input window, the "keyword value" input window, the "logical relationship" input window, and the "preset row offset range" input window with, respectively: String, KW1, "", vehicle, AND, 0.
  • the user can also add search windows for two more keywords. For one keyword, the user fills in the "keyword type", "keyword", "operator", "keyword value", "logical relationship", and "preset row offset range" input windows with, respectively: String, KW2a, "", Power, OR, 2. For the other keyword, the user fills in the same input windows with, respectively: String, KW2b, "", remaining power, OR, 2.
  • the user does not want to find the keyword "management" within the preset offset range whose maximum row offset is 5.
  • the user can therefore add another keyword search window and fill in the "keyword type", "keyword", "operator", "keyword value", "logical relationship", and "preset row offset range" input windows with, respectively: K-V, KW3, "", management, NOT, 5.
  • the user also hopes to find the keyword "consumption" within the preset offset range whose maximum row offset is 7.
  • the user can then fill in the "keyword type", "keyword", "operator", "keyword value", "logical relationship", and "preset row offset range" input windows with, respectively: String, KW4, "", consumption, AND, 7.
  • the preset search rule can be a piece of machine-executable code, including but not limited to xml, json, yaml and other formats.
  • the preset search rules corresponding to the running path information A shown in FIG. 3B can be expressed in xml format, specifically as follows:
  • preset search rules corresponding to the running path information A may also be set based on other logical relationships.
  • preset search rules corresponding to each running path information can also be understood with reference to the logical relationship shown in FIG. 3A above, and details are not described here.
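The values entered for running path information A can also be held in a small data structure. The sketch below is not the XML representation referred to in the text, which is not reproduced here; the class name `Condition`, its field names, and the JSON serialization are hypothetical choices, while the field values are the ones listed for KW1, KW2a, KW2b, KW3 and KW4.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Condition:
    kw_type: str      # "String" or "K-V"
    name: str         # keyword label, e.g. "KW1"
    operator: str     # left empty in the example
    value: str        # keyword value to look for
    relation: str     # "AND", "OR" or "NOT"
    max_offset: int   # maximum row offset from the row of the first keyword


# Rule for running path information A, using the values given in the text.
RULE_A = [
    Condition("String", "KW1", "", "vehicle", "AND", 0),
    Condition("String", "KW2a", "", "Power", "OR", 2),
    Condition("String", "KW2b", "", "remaining power", "OR", 2),
    Condition("K-V", "KW3", "", "management", "NOT", 5),
    Condition("String", "KW4", "", "consumption", "AND", 7),
]

# One possible machine-readable form of the same rule.
rule_json = json.dumps([asdict(c) for c in RULE_A], indent=2)
print(rule_json)
```

Keeping the rule as plain data means it can be stored in whichever format (XML, JSON, YAML) the tooling prefers without changing the search logic.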
  • one or more corresponding preset search rules are configured for each running path information. Moreover, when searching for keywords in each running path information, it is necessary to first set a preset search keyword as a search entry for the current running path information. Therefore, the first keyword may be obtained according to the first running path information, or in other words, the preset search keyword in the first running path information may be determined as the first keyword. Then, after the first keyword is obtained, one or more second keywords may be searched from the first running path information based on the first keyword and the first preset search rule.
  • the first preset search rule is the search rule shown in FIG. 3A and FIG. 3B
  • the first running path information is the running path information A in the log text of the vehicle power management.
  • the user can use KW1 in the preset search rule as the search entry of the running route information A, that is, the first keyword.
  • based on KW1 and the preset search rules shown in FIG. 3A and FIG. 3B, other keywords that meet the preset search rules, that is, one or more second keywords such as KW2a or KW2b, and KW4, can be searched for from the running path information A.
  • a maximum row offset may be set for each keyword.
  • the following manner may be specifically adopted, that is, the first row value is obtained, and the first row value is used to identify the row number where the first keyword is located. And, based on the first row value and the first preset search rule, one or more second keywords are searched within the preset offset range.
  • the preset offset range indicates row offset values between one or more second keywords and the first keyword.
  • a row offset range is set for each second keyword, that is, the row offset value between the current second keyword and the first keyword is represented by the row offset range. In this way, you only need to search for the remaining keywords within a certain row offset range, without the need for full-text search, which improves the search efficiency.
  • the described row offset range is an estimate and is not limited here.
  • searching for one or more second keywords within the preset offset range can also be carried out in the following manner: according to the first row value and the first preset search rule, a third keyword is searched for within the first preset offset range, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.
  • KW1 is used as the first keyword, and its row offset range is 0.
  • KW2a is used as a second keyword to be searched, and its maximum row offset range from the row where KW1 is located can be set to be 2.
  • KW2b, KW3, and KW4 are other second keywords that need to be searched, and their maximum row offset ranges from the row where KW1 is located can be set to be 2, 5, and 7 respectively.
  • a row offset range of 2 is taken as an example to illustrate the row offset ranges between KW2a, KW2b and KW1.
  • other row offset ranges can also be set, such as 4 or 8, which is not limited here.
  • the row offset ranges for KW3 and KW4 can also be set to other values, which are not limited here.
  • if the row offset range of KW2a is set to 0, it indicates that KW2a and KW1 are located in the same row in the running path information A; KW3, KW4, and the other keywords can be understood with reference to KW2a.
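Putting the per-keyword offsets together, the search for running path information A can be sketched as follows. This is a minimal illustration under assumptions: the keyword labels KW1, KW2a, KW2b, KW3 and KW4 are used directly as the text to match, rows are indexed from 0, matching is by substring, and the function names are hypothetical. The offsets 2, 2, 5 and 7 are the ones given in the example.

```python
from typing import List


def in_range(lines: List[str], start: int, max_offset: int, value: str) -> bool:
    """True if value appears in rows start .. start + max_offset (inclusive)."""
    return any(value in row for row in lines[start:start + max_offset + 1])


def find_rule_a_matches(lines: List[str]) -> List[int]:
    """Return the rows of KW1 for which the example rule is satisfied."""
    hits = []
    for row, text in enumerate(lines):
        if "KW1" not in text:
            continue  # only rows containing the first keyword are search entries
        ok = (
            (in_range(lines, row, 2, "KW2a") or in_range(lines, row, 2, "KW2b"))  # OR
            and not in_range(lines, row, 5, "KW3")                                # NOT
            and in_range(lines, row, 7, "KW4")                                    # AND
        )
        if ok:
            hits.append(row)
    return hits


# Hypothetical log with two occurrences of KW1; only the first satisfies the rule.
log = ["KW1", "KW2a", "x", "KW4", "KW1", "KW2b", "KW3", "KW4"]
print(find_rule_a_matches(log))
```

In this sketch an offset window may run past the next occurrence of KW1; a fuller implementation would likely clip each window at the boundary of the current segment.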
  • the logical relationship can also be set as whether each keyword needs to appear within the corresponding row offset range.
  • the logical relationship may include at least one of the following: a first identifier, a second identifier, and a third identifier.
  • the first identifier indicates that one or more of the keywords exist within the preset offset range
  • the second identifier indicates that one of the multiple keywords exists within the preset offset range
  • the third identifier indicates that one or more of the keywords do not exist within the preset offset range.
  • the first identifier can be understood as "AND" in the aforementioned FIG. 3A
  • the second identifier can be understood as "OR" in FIG. 3A
  • the third identifier can be understood as "NOT" in FIG. 3A
  • the first identifier, the second identifier, and the third identifier may also be represented by other identifiers, which are not limited here.
  • a preset offset range may be directly set for each second keyword. That is, when the one or more second keywords also include the fourth keyword and the fifth keyword, the preset offset range of each second keyword can be determined in either of the following two ways:
  • the second preset offset range is obtained from the row offset value between the fourth keyword and the first keyword, where the fourth keyword is one of the one or more second keywords. That is to say, a second preset offset range can be set directly for each fourth keyword, and the second preset offset range is then the row offset value between that fourth keyword and the first keyword. For example, if the fourth keyword currently to be searched is KW2a, the row offset value between KW2a and KW1 can be set to 6, 7, 8, and so on, so that the second preset offset range is 6 to 8; in that case it is only necessary to search for KW2a within the 8 rows following the row where KW1 is located.
  • alternatively, the second preset offset range is obtained from the row offset value between the fifth keyword and the fourth keyword together with the row offset value between the fourth keyword and the first keyword, where the fifth keyword is a keyword different from the fourth keyword among the one or more second keywords.
  • the row offset values between the fourth keyword KW2a and KW1 are 6, 7, 8, etc. (that is, the row offset range of KW2a is 6 to 8).
  • the row offset value between KW4 and KW2a can then be set, for example, to 1 to 2.
  • from the row offset range set for the fourth keyword KW2a (i.e., 6 to 8) and the row offset value between KW4 and KW2a (i.e., 1 to 2), it follows that the second preset offset range between KW4 and KW1 is 7 to 10.
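The arithmetic behind the 7 to 10 range can be written out as a short sketch. The function name `compose_offsets` and the representation of a range as a (minimum, maximum) pair are assumptions for illustration; the numbers are those of the example.

```python
from typing import Tuple

Range = Tuple[int, int]  # (minimum row offset, maximum row offset)


def compose_offsets(first_to_mid: Range, mid_to_target: Range) -> Range:
    """Combine two relative offset ranges into one range relative to the first keyword."""
    return (first_to_mid[0] + mid_to_target[0], first_to_mid[1] + mid_to_target[1])


kw2a_from_kw1 = (6, 8)   # KW2a lies 6 to 8 rows after KW1
kw4_from_kw2a = (1, 2)   # KW4 lies 1 to 2 rows after KW2a
kw4_from_kw1 = compose_offsets(kw2a_from_kw1, kw4_from_kw2a)
print(kw4_from_kw1)  # (7, 10)
```

The lower bound is the sum of the two minimum offsets (6 + 1) and the upper bound is the sum of the two maximum offsets (8 + 2).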
  • any one piece of running path information among the one or more pieces of running path information (that is, the first running path information) is taken as an example for illustration.
  • within the preset offset range, a search is performed according to the preset search rule corresponding to the first running path information, and one or more second keywords satisfying the preset search rule can be found.
  • the described hash algorithm includes but is not limited to the MD5 message-digest algorithm (MD5), the MD4 algorithm, etc., which is not limited here.
  • FIG. 4 shows a schematic diagram of a search applying the solution of the present application.
  • the first text can be transformed into the second text after being processed by the hash algorithm.
  • the second text can be understood as a matrix text structure stored in rows.
  • the line where the first keyword (such as: KW1) is located can be quickly marked, such as: No.1, No.5, No.9 and so on.
  • the second keyword found in segment 1 and segment 3 conforms to the preset search rule.
  • the first search result can be determined in combination with the first keyword, for example: the running scenario corresponding to running the first running path information, or the abnormal information generated when running the first running path information, etc.
  • the first search result may reflect the running behavior of the running path corresponding to one or more running path information. For example, through the first search result, it may be learned which running paths have errors or warnings. In this way, it is further possible to analyze the system running conditions and the like based on the error information or warning information corresponding to the running path where the error or warning occurs.
  • the first search result will change as the first preset search rule changes.
  • the running path information A corresponds to two first preset search rules (namely search rule A and search rule B). If search rule A differs from search rule B, the second keywords found in the running path information A according to search rule A will also differ from those found in the running path information A according to search rule B, so the first search results determined will differ as well.
  • the text search processing method may further include: searching for one or more seventh keywords from the second running path information based on the sixth keyword and the second preset search rule. Then, a second search result is determined according to the sixth keyword and the one or more seventh keywords, and the second search result is used to indicate the running behavior of the running path corresponding to the second running path information.
  • the sixth keyword is a preset search keyword in the second running path information.
  • the sixth keyword can be understood with reference to the first keyword in the aforementioned first running path information, which will not be described in detail here.
  • the sixth keyword may or may not be the same as the first keyword, which is not limited here.
  • the second running path information can also be understood as any one of one or more running path information.
  • the second preset search rule is any one of one or more preset search rules corresponding to the second running path information.
  • the second preset search rule described can also be understood with reference to the aforementioned first preset search rule, which will not be repeated here.
  • each piece of running path information corresponds to one or more preset search rules
  • each preset search rule indicates a logical relationship between at least one keyword. Therefore, after the first keyword is obtained according to the first running path information, one or more second keywords can be searched for based on the first keyword and the first preset search rule, and a first search result is then determined based on the first keyword and the one or more second keywords.
  • keywords in multiple pieces of running path information are searched according to the keyword rules in the keyword rule set. Users do not need to master complex regular expressions; they only need to understand the logical relationship between the keywords to be searched.
  • the first text, which contains a large amount of data, is divided into multiple subtexts to be processed, each starting from the first keyword in a piece of running path information, and each piece of running path information corresponds to one or more preset search rules, so the remaining keywords can be searched efficiently, quickly and accurately.
  • the above-mentioned text search device includes corresponding hardware structures and/or software modules for performing various functions.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the above-mentioned text search device can be realized by one physical device, or jointly realized by multiple physical devices, or can be a logical functional unit in one physical device, which is not specifically limited in this embodiment of the present application.
  • FIG. 5 is a schematic diagram of a hardware structure of a communication device provided by an embodiment of the present application.
  • the communication device includes at least one processor 501 , a memory 502 and a transceiver device 503 .
  • the processor 501 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit, or one or more integrated circuits used to control the execution of the programs of this application.
  • the processor 501 can perform operations such as judgment, analysis, and calculation, including searching for one or more second keywords according to the first keyword and the first preset search rule.
  • the processor 501 can also determine a first search result and the like according to the first keyword and one or more second keywords.
  • the transceiver device 503 uses any transceiver-type device to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • the transceiver device 503 may be connected to the processor 501 .
  • the transceiver device 503 can acquire the first text, and acquire a set of preset search rules and the like.
  • the memory 502 can be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It can also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 502 may exist independently, or may be connected with the processor 501 .
  • the memory 502 can also be integrated with the processor 501 .
  • the memory 502 is used to store computer-executed instructions for implementing the solutions of the present application, and the execution is controlled by the processor 501 .
  • the processor 501 is configured to execute the computer-executed instructions stored in the memory 502, so as to implement the text search processing method provided by the above-mentioned method embodiments of the present application.
  • the computer-executed instructions in the embodiments of the present application may also be referred to as application program codes, which is not specifically limited in the embodiments of the present application.
  • the processor 501 may include one or more CPUs, for example, CPU0 and CPU1 in FIG. 5 .
  • this application can divide the text search device into functional units according to the above method embodiments; for example, each functional unit can be divided to correspond to one function, or two or more functions can be integrated into one functional unit.
  • the above-mentioned integrated functional units can be implemented in the form of hardware or in the form of software functional units.
  • FIG. 6 shows a schematic structural diagram of a text search device provided by an embodiment of the present application.
  • an embodiment of the text search device of the present application may include: an acquisition unit 601 and a processing unit 602 .
  • the obtaining unit 601 is configured to obtain a first text, and the first text includes one or more running path information.
  • the acquiring unit 601 is further configured to acquire a preset search rule set, the preset search rule set includes one or more preset search rules, and each preset search rule indicates a preset logical relationship between at least one keyword.
  • the processing unit 602 is configured to search for one or more second keywords based on the first keyword and a first preset search rule.
  • the first keyword is a keyword obtained according to the first running path information
  • the first running path information is any one of the one or more running path information
  • the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information.
  • the processing unit 602 is further configured to determine a first search result based on the first keyword and one or more second keywords.
  • the processing unit 602 is configured to: obtain the first row value, and based on the first row value and the first preset search rule, search for one or more second keywords within a preset offset range.
  • the first row value is used to identify the row number where the first keyword is located, and the preset offset range indicates row offset values between one or more second keywords and the first keyword.
  • the processing unit 602 is further configured to search for a third keyword within a first preset offset range according to the first row value and the first preset search rule , the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.
  • the logical relationship includes a first identifier, a second identifier and/or a third identifier; the first identifier indicates that one or more keywords exist within a preset offset range, the second identifier indicates that one of multiple keywords exists within the preset offset range, and the third identifier indicates that one or more keywords do not exist within the preset offset range.
  • the one or more second keywords further include a fourth keyword and a fifth keyword.
  • the processing unit 602 is configured to search for the fourth keyword according to a second preset offset range, where the second preset offset range is obtained from the row offset value between the fourth keyword and the first keyword, and the fourth keyword is one or more of the second keywords; or, the second preset offset range is obtained from the row offset value between the fifth keyword and the fourth keyword together with the row offset value between the fourth keyword and the first keyword, and the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword.
  • the obtaining unit 601 is configured to: obtain a second text, where the second text is obtained by processing the first text through a hash algorithm; and obtain the first row value based on the second text.
  • the type of the keyword includes a string type and/or a key-value pair type.
  • the processing unit 602 is further configured to: based on the sixth keyword and the second preset search rule, find one or more seventh keywords from the second running path information, the sixth keyword is a keyword obtained according to the second running path information, and the second preset search rule is any one of one or more preset search rules corresponding to the second running path information. Then, the processing unit 602 also determines a second search result according to the sixth keyword and one or more seventh keywords, and the second search result is used to indicate the running behavior of the running path corresponding to the second running path information.
  • the text search device provided in the embodiment of the present application is used to execute the method in the method embodiment corresponding to FIG. 2 , so the embodiment of the present application can be understood with reference to relevant parts in the method embodiment corresponding to FIG. 2 .
  • the text search device may include but not limited to a vehicle, a terminal device, a server, and the like.
  • the text search device is presented in the form of dividing each functional unit in an integrated manner.
  • the "functional unit" here may refer to an application-specific integrated circuit (ASIC), a processor and memory executing one or more software or firmware programs, an integrated logic circuit, and/or other devices that can provide the above functions.
  • the text search device can take the form shown in FIG. 5 .
  • the processor 501 in FIG. 5 may invoke the computer-executed instructions stored in the memory 502, so that the text search device executes the method performed by the text search device in the method embodiment corresponding to FIG. 2 .
  • the function/implementation process of the processing unit 602 in FIG. 6 can be implemented by the processor 501 in FIG. 5 invoking the computer-executed instructions stored in the memory 502.
  • the function/implementation process of the acquiring unit 601 in FIG. 6 can be realized by the transceiver device 503 in FIG. 5 .
  • each component is connected by communication, that is, the processing unit (or processor), the storage unit (or memory) and the transceiver device (transceiver) communicate with each other through an internal connection path, and transfer control and/or data.
  • the foregoing method embodiments of the present application may be applied to a processor, or the processor implements the steps of the foregoing method embodiments.
  • a processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above-mentioned method embodiments may be completed by an integrated logic circuit of hardware in a processor or instructions in the form of software.
  • the processor can be a central processing unit (CPU), a network processor (NP) or a combination of a CPU and an NP, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the methods disclosed in this application can be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, register.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware.
  • the apparatus may include multiple processors or the processor may include multiple processing units.
  • the processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • Memory is used to store computer instructions for execution by the processor.
  • the memory may be a storage circuit or a memory.
  • Memory can be volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory.
  • the non-volatile memory may be a read-only memory, a programmable read-only memory, an erasable programmable read-only memory, an electrically erasable programmable read-only memory or a flash memory.
  • Volatile memory can be random access memory, which acts as external cache memory.
  • the memory may be independent of the processor, or may be a storage unit in the processor, which is not limited here. Although only one memory is shown in the figure, the device may include multiple memories or the memory may include multiple storage units.
  • the transceiver is used to implement content interaction between the processor and other units or network elements.
  • the transceiver may be a communication interface of the device, may also be a transceiver circuit or a communication unit, and may also be a transceiver.
  • the transceiver may also be a communication interface or a transceiver circuit of the processor.
  • the transceiver may be a transceiver chip.
  • the transceiver may also include a sending unit and/or an acquiring unit.
  • the transceiver may include at least one communication interface.
  • the transceiver may also be a unit implemented in software.
  • the processor may interact with other units or network elements through a transceiver. For example: the processor obtains or receives content from other network elements through the transceiver. If the processor and the transceiver are physically separated components, the processor may interact with other units of the device without using the transceiver.
  • the processor, the memory, and the transceiver may be connected to each other through a bus.
  • the bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • words such as “exemplary” or “for example” are used as examples, illustrations or descriptions. Any embodiment or design solution described as “exemplary” or “for example” in the embodiments of the present application shall not be interpreted as being more preferred or more advantageous than other embodiments or design solutions. Rather, the use of words such as “exemplary” or “such as” is intended to present related concepts in a concrete manner.
  • a computer program product includes one or more computer instructions. When computer-executed instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part.
  • a computer can be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that a computer can store to, or a data storage device such as a server or a data center that integrates one or more available media. Available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state disk (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

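Disclosed are a text search processing method and a related device. Keywords in multiple pieces of running path information are searched according to keyword rules in a keyword rule set, so that the search results are more accurate and meet the user's search needs. The method includes: obtaining a first text, where the first text includes one or more pieces of running path information; obtaining a preset search rule set, where each preset search rule indicates a logical relationship between at least one keyword; searching for one or more second keywords based on a first keyword and a first preset search rule, where the first keyword is a keyword obtained according to first running path information, and the first running path information is any one of the one or more pieces of running path information; and determining a first search result based on the first keyword and the one or more second keywords. The method provided in the embodiments of this application can be applied to log text processing in electronic devices such as intelligent vehicles, terminals, and computers.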

Description

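Text search processing method and related device
Technical Field
This application relates to the field of information and communications technology (ICT), and in particular to a text search processing method and a related device.
Background
Software code usually contains a large number of branch structures. When the software code actually runs, each branch that is executed records its corresponding running path information in a log file. When the log file is scanned and analyzed, the running path of the software code is generally identified from the keywords in the running path information, so that the actual running behavior of the software code can be derived and a corresponding quality policy can be applied.
However, related solutions generally search for keywords in the log file by using regular expressions to obtain a search result. Because the logs produced by the software code over a period of time are all written to the same file, a regular-expression search for keywords across multiple pieces of running path information is very likely to jump from the running path information of the current process to the running path information of another process. The search result is then inaccurate and cannot meet the user's search needs.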
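Summary
The embodiments of this application provide a text search processing method and a related device, which search for keywords in multiple pieces of running path information according to keyword rules in a keyword rule set, so that the search results are more accurate and meet the user's search needs.
A first aspect of the embodiments of this application provides a text search processing method, which can be applied to log analysis and processing scenarios. The method can also be applied in devices such as terminal devices and vehicles. The method may include: obtaining a first text and a preset search rule set, where the first text includes one or more pieces of running path information, the preset search rule set includes one or more preset search rules, and each preset search rule indicates a logical relationship among one or more keywords. Then, one or more second keywords are searched for based on a first keyword and a first preset search rule. The first keyword is a keyword obtained according to first running path information, the first running path information is any one of the one or more pieces of running path information, and the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information. A first search result is then determined according to the first keyword and the one or more second keywords, and the first search result indicates the running behavior of the running path corresponding to the first running path information. In this way, on the one hand, in a multi-path search scenario, keywords in multiple pieces of running path information are searched according to the preset search rules in the preset search rule set, so the user does not need to master complex regular expressions and only needs to understand the logical relationship between the keywords to be searched. On the other hand, the first text, which contains a large amount of data, is divided into multiple subtexts to be processed, each starting from the first keyword in a piece of running path information, and each piece of running path information corresponds to one or more preset search rules, so the remaining keywords can be searched efficiently, quickly, and accurately.
In some possible implementations, searching for one or more second keywords based on the first keyword and the first preset search rule may specifically include first obtaining a first row value, which identifies the number of the row where the first keyword is located. Then, based on the first row value and the first preset search rule, the one or more second keywords are searched for within a preset offset range. Note that the preset offset range indicates the row offset values between the one or more second keywords and the first keyword. The preset offset range makes it possible to quickly determine the search range in which a second keyword lies. Only the second keywords that satisfy the preset search rule within the preset offset range need to be searched for, without a global search, so the search is efficient.
In some possible implementations, searching for the one or more second keywords within the preset offset range based on the first row value and the first preset search rule includes: searching for a third keyword within a first preset offset range according to the first row value and the first preset search rule, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword. In this way, any one of the one or more second keywords, that is, the third keyword, can be searched for within its own first preset offset range, which improves search efficiency.
In some possible implementations, the logical relationship includes at least one of the following: a first identifier, a second identifier, and a third identifier. The first identifier indicates that one or more keywords exist within the preset offset range, the second identifier indicates that one of multiple keywords exists within the preset offset range, and the third identifier indicates that one or more keywords do not exist within the preset offset range.
In some possible implementations, the one or more second keywords further include a fourth keyword and a fifth keyword. The fourth keyword is searched for according to a second preset offset range. The second preset offset range is obtained from the row offset value between the fourth keyword and the first keyword, where the fourth keyword is one or more of the second keywords; or the second preset offset range is obtained from the row offset value between the fifth keyword and the fourth keyword together with the row offset value between the fourth keyword and the first keyword, where the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword. This provides several ways of determining the second preset offset range, which suit a variety of possible scenarios.
In some possible implementations, obtaining the first row value may include: obtaining a second text, which is obtained by processing the first text with a hash algorithm, and then obtaining the first row value according to the second text. Because the hash algorithm converts the first text into a second text that stores data by rows, the row number of the keyword in each piece of running path information can be located quickly based on the second text, and vertical lookups can also be performed quickly.
In some possible implementations, the keyword type includes a string type and/or a key-value pair type.
In some possible implementations, in addition to searching the first running path information for one or more second keywords according to the first keyword and the first preset search rule, and using the resulting first search result to detect the running behavior of the running path corresponding to the first running path information, the same processing can be applied to the remaining running path information based on other preset search rules in the preset search rule set. That is, the text search processing method further includes: searching the second running path information for one or more seventh keywords based on a sixth keyword and a second preset search rule, where the sixth keyword is a keyword obtained according to the second running path information, and the second preset search rule is any one of the one or more preset search rules corresponding to the second running path information. Then, a second search result is determined based on the sixth keyword and the one or more seventh keywords. The second search result indicates the running behavior of the running path corresponding to the second running path information.
Note that the second running path information is different from the first running path information. In addition, the second preset search rule may be the same as or different from the first preset search rule, which is not limited in this application. Likewise, the sixth keyword may be the same as or different from the first keyword, which is also not limited in this application.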
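According to a second aspect, the embodiments of this application provide a text search device. The text search device may be a terminal device, a vehicle, an intelligent vehicle, a computer, or the like. The text search device includes an obtaining unit and a processing unit. The obtaining unit is configured to obtain a first text and a preset search rule set, where the first text includes one or more pieces of running path information, the preset search rule set includes one or more preset search rules, and each preset search rule indicates a logical relationship between multiple keywords included in the corresponding running path information. The processing unit is configured to search for one or more second keywords based on a first keyword and a first preset search rule, and to determine a first search result based on the first keyword and the one or more second keywords. Note that the first keyword is a keyword obtained according to first running path information, the first running path information is any one of the one or more pieces of running path information, the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information, and the first search result indicates the running behavior of the running path corresponding to the first running path information.
In some possible implementations, the obtaining unit is configured to obtain a first row value, which identifies the number of the row where the first keyword is located. The processing unit is configured to search for the one or more second keywords within a preset offset range according to the first row value and the first preset search rule, where the preset offset range indicates the row offset values between the one or more second keywords and the first keyword.
In some possible implementations, the processing unit is further configured to search for a third keyword within a first preset offset range according to the first row value and the first preset search rule, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.
In some possible implementations, the logical relationship includes at least one of the following: a first identifier, a second identifier, and a third identifier. The first identifier indicates that one or more keywords exist within the preset offset range, the second identifier indicates that one of multiple keywords exists within the preset offset range, and the third identifier indicates that one or more keywords do not exist within the preset offset range.
In some possible implementations, the one or more second keywords further include a fourth keyword and a fifth keyword. The processing unit is configured to search for the fourth keyword according to a second preset offset range. The second preset offset range is obtained from the row offset value between the fourth keyword and the first keyword, where the fourth keyword is one or more of the second keywords; or the second preset offset range is obtained from the row offset value between the fifth keyword and the fourth keyword together with the row offset value between the fourth keyword and the first keyword, where the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword.
In some possible implementations, the obtaining unit is configured to obtain a second text, which is obtained by processing the first text with a hash algorithm, and to obtain the first row value based on the second text.
In some possible implementations, the keyword type includes a string type and/or a key-value pair type.
In some possible implementations, the processing unit is further configured to search the second running path information for one or more seventh keywords based on a sixth keyword and a second preset search rule, where the sixth keyword is a keyword obtained according to the second running path information, and the second preset search rule is any one of the one or more preset search rules corresponding to the second running path information. The processing unit then determines a second search result according to the sixth keyword and the one or more seventh keywords, and the second search result indicates the running behavior of the running path corresponding to the second running path information.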
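A third aspect of this application provides a vehicle, which may include a memory configured to store computer-readable instructions. It may further include a processor coupled to the memory and configured to execute the computer-readable instructions in the memory so as to perform the method described in the first aspect or any possible implementation of the first aspect.
A fourth aspect of this application provides a server, which may include a memory configured to store computer-readable instructions. It may further include a processor coupled to the memory and configured to execute the computer-readable instructions in the memory so as to perform the method described in the first aspect or any possible implementation of the first aspect.
A fifth aspect of this application provides a computer-readable storage medium. When the instructions are run on a computer device, the computer device is caused to perform the method described in the first aspect or any possible implementation of the first aspect.
A sixth aspect of this application provides a computer program product which, when run on a computer, enables the computer to perform the method described in the first aspect or any possible implementation of the first aspect.
A seventh aspect of this application provides a chip system. The chip system may include a processor configured to support a text search device in implementing the functions involved in the method described in the first aspect or any possible implementation of the first aspect.
Optionally, with reference to the seventh aspect, in a first possible implementation, the chip system may further include a memory configured to store program instructions and data necessary for the text search device. The chip system may consist of a chip, or may include a chip and other discrete devices. The chip system may include an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another programmable logic device. Further, the chip system may also include an interface circuit and the like.
It should be noted that the beneficial effects of the implementations of the second to sixth aspects of this application can be understood with reference to the implementations of the first aspect, and are not repeated here.
In the technical solutions provided in the embodiments of this application, each piece of running path information corresponds to one or more preset search rules, and each preset search rule indicates a logical relationship between multiple keywords included in the running path information. Therefore, after the first keyword in the first running path information is obtained, one or more second keywords can be searched for according to the first keyword and the first preset search rule, and a first search result can then be determined according to the first keyword and the one or more second keywords. On the one hand, in a multi-path search scenario, keywords in multiple pieces of running path information are searched according to the keyword rules in the keyword rule set, so the user does not need to master complex regular expressions and only needs to understand the logical relationship between the keywords to be searched. On the other hand, the log text to be processed, which contains a large amount of data, is divided into multiple subtexts to be processed, each starting from the first keyword in a piece of running path information, so the remaining keywords can be searched efficiently, quickly, and accurately.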
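Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are merely some embodiments of this application.
FIG. 1 is a schematic diagram of searching for keywords in a related solution;
FIG. 2 is a schematic flowchart of a text search processing method provided in this application;
FIG. 3A is a schematic diagram of a preset logical relationship between multiple keywords provided in this application;
FIG. 3B is a schematic diagram of an interface for creating a search rule provided in this application;
FIG. 4 is a schematic diagram of a search that applies the solution of this application;
FIG. 5 is a schematic diagram of the hardware structure of a communication device provided in an embodiment of this application;
FIG. 6 is a schematic structural diagram of a text search device provided in an embodiment of this application.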
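Detailed Description
The embodiments of this application provide a text search processing method and a related device, which search for keywords in multiple pieces of running path information according to keyword rules in a keyword rule set, so that the search results are more accurate and meet the user's search needs.
The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings. It should be understood that the terms "include" and "comprise" used in the specification and claims of this application indicate the presence of the described features, wholes, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or sets thereof. It should also be understood that the terms used in this specification are only for the purpose of describing particular embodiments and are not intended to limit this application.
Software code usually contains a large number of branch structures. When the software code actually runs, each branch that is executed records its corresponding running path information in a log file. When the log file is scanned and analyzed, the running path of the software code is generally identified from the keywords (KW) in the running path information, so that the actual running behavior of the software code can be derived and a corresponding quality policy can be applied.
However, related solutions generally search for keywords in the log file by using regular expressions to obtain a search result. FIG. 1 is a schematic diagram of searching for keywords in a related solution. As FIG. 1 shows, the log text produced by the software code in two different running processes (for example, running process 1 and running process 2) is stored in the same log file. If a user wants to use the keywords KW01, KW02, and KW03 to search for the log produced in running process 2, a conventional regular expression is very likely to return line01.KW01 and line03.KW02 produced in running process 1, together with line09.KW03 produced in running process 2. Clearly, when keywords are searched for across multiple pieces of running path information by using regular expressions, the search is very likely to jump from the running path information of the current process 2 to the running path information of the other process 1. The search result is then inaccurate and cannot meet the user's search needs.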
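The cross-process jump described for FIG. 1 can be reproduced in a few lines. The following is an illustrative sketch only: the log contents and the single multi-line pattern are assumptions made for the example and are not taken from the application.

```python
import re

# Hypothetical log: line01 to line04 come from running process 1,
# line05 to line09 come from running process 2.
log = "\n".join([
    "line01 KW01",   # process 1
    "line02 other",
    "line03 KW02",
    "line04 end",
    "line05 KW01",   # process 2
    "line06 other",
    "line07 KW02",
    "line08 other",
    "line09 KW03",
])

# One pattern spanning lines: KW01, then KW02, then KW03, with anything in between.
pattern = re.compile(
    r"(line\d+) KW01.*?(line\d+) KW02.*?(line\d+) KW03",
    re.DOTALL,
)

match = pattern.search(log)
matched_lines = match.groups()
print(matched_lines)  # ('line01', 'line03', 'line09')
```

The user in this scenario wanted line05, line07, and line09, all from process 2, but the pattern anchors on the first KW01 it meets and then crosses the process boundary to reach KW03.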
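Therefore, to solve the technical problem raised by the related solutions above, the embodiments of this application provide a text search processing method that searches for keywords in multiple pieces of running path information according to keyword rules (KWR) in a keyword rule set (KWRS), so that the search results are more accurate and meet the user's search needs. The text search processing method can be applied in a vehicle, or in devices such as terminal devices and servers. Vehicles may include but are not limited to intelligent vehicles. Terminal devices may include but are not limited to personal computers, mobile phones, tablet computers, and wearable smart devices. The text search processing method can be applied to log analysis and processing scenarios, including but not limited to the analysis of software logs of in-vehicle products and logs produced while a system is running, and also the analysis of logs of other product forms, which is not limited here. In addition, the method can be applied throughout the life cycle of a product, or in stages such as development and debugging or repair and maintenance, which is not limited here.
FIG. 2 is a schematic flowchart of a text search processing method provided in an embodiment of this application. The method shown in FIG. 2 can be applied in a text search device, which may include but is not limited to a vehicle, a terminal device, or a server, which is not limited here. As shown in FIG. 2, the text search processing method includes the following steps:
201. Obtain a first text, where the first text includes one or more pieces of running path information.
In this example, running the software code in different processes produces corresponding log text. The log text includes one or more pieces of running path information, and each piece of running path information may in turn include multiple keywords. Therefore, for each running process, the first text of that running process can be obtained.
In addition, the first text can be stored in a device such as a local server, or uploaded to and stored in the cloud. The first text can therefore be obtained either from a device such as a local server or from the cloud. In practical applications there may be other ways of obtaining it, which is not limited here. The first text includes but is not limited to log text of vehicle power management, software logs of in-vehicle products, and logs produced while a system is running, which is not limited here.
Note that the keyword type may include a string type and/or a key-value (K-V) pair type. The key-value pair type is a dictionary type with an operator, and its general format is: key op value. For example: state=Init, date:20201104. In practical applications the keyword type may also be another type, which is not limited here.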
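The `key op value` format can be illustrated with a small parser. This is a minimal sketch under stated assumptions: the `KeyValue` type, the `parse_key_value` function, and the restriction to the two operators that appear in the examples (`=` and `:`) are all hypothetical and not defined by the application.

```python
import re
from typing import NamedTuple, Optional


class KeyValue(NamedTuple):
    key: str
    op: str
    value: str


# Only "=" and ":" appear in the examples; other operators are not assumed here.
_KV_PATTERN = re.compile(r"^\s*(\w+)\s*([=:])\s*(\S+)\s*$")


def parse_key_value(text: str) -> Optional[KeyValue]:
    """Split a 'key op value' keyword into its three parts, or return None."""
    m = _KV_PATTERN.match(text)
    return KeyValue(*m.groups()) if m else None


print(parse_key_value("state=Init"))     # KeyValue(key='state', op='=', value='Init')
print(parse_key_value("date:20201104"))  # KeyValue(key='date', op=':', value='20201104')
```

A plain string keyword has no operator, so the parser returns `None` for it and the caller can fall back to substring matching.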
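202. Obtain a preset search rule set, where the preset search rule set includes one or more preset search rules, and each preset search rule indicates a logical relationship between at least one keyword.
In this example, because each piece of running path information may include multiple keywords, the logical relationship between at least one keyword in each piece of running path information is defined as one preset search rule. In this way, one piece of running path information can correspond to one or more preset search rules, and the one or more preset search rules together form the preset search rule set.
For example, for running path information A, the logical relationship between the keywords it includes can be understood with reference to FIG. 3A. As shown in FIG. 3A, the logical relationship may be: KW1 must appear in running path information A, which can be marked with the symbol "AND"; either one of KW2a and KW2b appearing in running path information A is sufficient, which can be marked with the symbol "OR"; KW3 must not appear in running path information A, which can be marked with the symbol "NOT"; and KW4 must appear in running path information A and marks the end of the search in running path information A, which can be marked with the symbol "AND". On this basis, the logical relationship shown in FIG. 3A can be configured as the preset search rule corresponding to running path information A.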
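The AND / OR / NOT relationship of FIG. 3A can be checked with simple set logic. The sketch below is an assumption-laden illustration: the dictionary layout `RULE_A` and the function `satisfies` are hypothetical, row offsets are ignored, and "OR" is read as "at least one of the group is present".

```python
from typing import Dict, List, Set

# Logical relationship from FIG. 3A, written as a plain dictionary (hypothetical layout).
RULE_A: Dict[str, List[str]] = {
    "AND": ["KW1", "KW4"],    # must appear
    "OR": ["KW2a", "KW2b"],   # one of these must appear
    "NOT": ["KW3"],           # must not appear
}


def satisfies(rule: Dict[str, List[str]], found: Set[str]) -> bool:
    """Check the keywords found in one piece of running path information against a rule."""
    all_required = all(kw in found for kw in rule["AND"])
    one_of_group = not rule["OR"] or any(kw in found for kw in rule["OR"])
    none_forbidden = not any(kw in found for kw in rule["NOT"])
    return all_required and one_of_group and none_forbidden


print(satisfies(RULE_A, {"KW1", "KW2b", "KW4"}))         # True
print(satisfies(RULE_A, {"KW1", "KW2a", "KW3", "KW4"}))  # False, KW3 is present
print(satisfies(RULE_A, {"KW1", "KW4"}))                 # False, neither KW2a nor KW2b
```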
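FIG. 3B is a schematic diagram of an interface for creating a search rule provided in this application. As FIG. 3B shows, in this interactive display interface the input windows for the preset search rule include a “keyword type” input window, a “keyword” input window, a “keyword value” input window, a “logical relationship” input window, a “preset offset range” input window, and so on. The user can fill the keywords that satisfy the logical relationship into the corresponding input windows in FIG. 3B according to actual needs to obtain one preset search rule.
For example, suppose a user needs to search for keywords in a “vehicle power management log text” and wants to find keywords that satisfy the logical relationship shown in FIG. 3A, for example: KW1 is “vehicle”, KW2a is “power supply”, KW2b is “remaining power”, KW3 is “management”, and KW4 is “consumption”.
The user can then take KW1, of the String type, as the search entry for running path information A in the “vehicle power management log text”, and fill in the “keyword type”, “keyword”, “operator”, “keyword value”, “logical relationship”, and “preset row offset range” input windows with: String, KW1, "", vehicle, AND, 0, respectively.
Similarly, if the user wants either KW2a or KW2b to be found within a preset offset range whose maximum row offset is 2, the user can add search windows for two more keywords. For one keyword, the user fills in the “keyword type”, “keyword”, “operator”, “keyword value”, “logical relationship”, and “preset row offset range” input windows with: String, KW2a, "", power supply, OR, 2, respectively. For the other keyword, the user fills in the same input windows with: String, KW2b, "", remaining power, OR, 2, respectively.
Similarly, suppose the user does not want the keyword “management” to be found within a preset offset range whose maximum row offset is 5. The user can add a search window for one more keyword and fill in the “keyword type”, “keyword”, “operator”, “keyword value”, “logical relationship”, and “preset row offset range” input windows with: K-V, KW3, "", management, NOT, 5, respectively.
In addition, suppose the user wants the keyword “consumption” to be found within a preset offset range whose maximum row offset is 7. The user can fill in the “keyword type”, “keyword”, “operator”, “keyword value”, “logical relationship”, and “preset row offset range” input windows with: String, KW4, "", consumption, AND, 7, respectively.
In this way, after the user has filled in the logical relationship between all the keywords to be searched for in the interactive display interface, the preset search rule corresponding to that logical relationship can be generated. Note that the values of the input windows shown in FIG. 3B are only an illustrative example and do not limit this application. The preset offset range can be understood with reference to step 203 below and is not described in detail here.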
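The preset search rule may be a piece of machine-executable code, in formats including but not limited to xml, json, and yaml. For example, the preset search rule corresponding to running path information A shown in FIG. 3B can be expressed in xml format, as follows:
Figure PCTCN2021113863-appb-000001
Figure PCTCN2021113863-appb-000002
Note that the above only takes the logical relationship shown in FIG. 3A as an example to describe one preset search rule corresponding to running path information A in the vehicle power management log text. In practical applications, the preset search rule corresponding to running path information A can also be set based on other logical relationships. In addition, for multiple pieces of running path information, the one or more preset search rules corresponding to each piece of running path information can also be understood with reference to the logical relationship shown in FIG. 3A, and are not described in detail here.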
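Because the xml listing survives only as figure placeholders, the following sketch shows one possible way to hold the same rule as data and serialize it to json, one of the formats the text names. The `KeywordEntry` class and its field names are hypothetical; only the six values per keyword come from the FIG. 3B example, with the keyword values given in English. This is not a reconstruction of the xml in the figures.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class KeywordEntry:
    kw_type: str     # "String" or "K-V"
    name: str        # keyword label, e.g. "KW1"
    operator: str    # empty in the FIG. 3B example
    value: str       # text to look for in the log
    logic: str       # "AND", "OR" or "NOT"
    max_offset: int  # maximum row offset from the row of KW1


# The five entries filled into the FIG. 3B input windows.
rule_a = [
    KeywordEntry("String", "KW1", "", "vehicle", "AND", 0),
    KeywordEntry("String", "KW2a", "", "power supply", "OR", 2),
    KeywordEntry("String", "KW2b", "", "remaining power", "OR", 2),
    KeywordEntry("K-V", "KW3", "", "management", "NOT", 5),
    KeywordEntry("String", "KW4", "", "consumption", "AND", 7),
]

rule_json = json.dumps([asdict(entry) for entry in rule_a], indent=2)
print(rule_json)
```

Keeping one record per keyword mirrors the one-row-per-keyword layout of the input interface, so a rule can be extended by appending an entry.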
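203. Search for one or more second keywords based on a first keyword and a first preset search rule, where the first keyword is a keyword obtained according to the first running path information.
In this example, one or more corresponding preset search rules have been configured for each piece of running path information. In addition, when the keywords in each piece of running path information are searched for, a preset search keyword must first be set as the search entry for the current running path information. Therefore, the first keyword can be obtained according to the first running path information; in other words, the preset search keyword in the first running path information is determined to be the first keyword. After the first keyword is obtained, one or more second keywords can be searched for in the first running path information based on the first keyword and the first preset search rule.
For example, assume that the first preset search rule is the search rule shown in FIG. 3A and FIG. 3B, and that the first running path information is running path information A in the vehicle power management log text. The user can take KW1 in the preset search rule as the search entry for running path information A, that is, as the first keyword. Based on KW1 and the preset search rule shown in FIG. 3A and FIG. 3B, the other keywords in running path information A that satisfy the preset search rule, that is, the one or more second keywords, can then be found, such as KW2a or KW2b, and KW4.
In some possible examples, to find the remaining keywords quickly, a maximum row offset can be set for each keyword. For example, the search for one or more second keywords in step 203 can be carried out as follows: obtain a first row value, which identifies the number of the row where the first keyword is located; then, based on the first row value and the first preset search rule, search for the one or more second keywords within a preset offset range.
Note that the preset offset range indicates the row offset values between the one or more second keywords and the first keyword. It can also be understood as setting a row offset range for each second keyword, so that the row offset range expresses the row offset value between the current second keyword and the first keyword. In this way, the remaining keywords only need to be searched for within a certain row offset range, without a full-text search, which improves search efficiency. The row offset range described is an estimate, which is not limited here.
For example, searching for the one or more second keywords within the preset offset range according to the first row value and the first preset search rule can also be carried out as follows: search for a third keyword within a first preset offset range according to the first row value and the first preset search rule, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.
For example, taking the preset search rule shown in FIG. 3B, KW1 is the first keyword and its own row offset range is 0. KW2a is one of the second keywords to be found, and its maximum row offset range from the row where KW1 is located can be set to 2, so KW2a only needs to be searched for within the two rows following KW1. Similarly, KW2b, KW3, and KW4 are other second keywords to be found, and their maximum row offset ranges from the row where KW1 is located can be set to 2, 5, and 7 respectively. Note that a row offset range of 2 between KW2a, KW2b and KW1 is used here only as an example; in practical applications other row offset ranges, such as 4 or 8, can also be set, which is not limited here. The row offset ranges of KW3 and KW4 can likewise be set to other values, which is not limited here. In addition, if the row offset range of KW2a is set to 0, KW2a and KW1 are located in the same row of running path information A; the same applies to KW3, KW4, and so on.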
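The bounded lookup described above can be sketched as a scan over a fixed window of rows. The function name `find_within_offset`, the use of 0-based row indices, substring matching, and the sample log lines are all assumptions made for illustration.

```python
from typing import List, Optional


def find_within_offset(lines: List[str], first_row: int,
                       keyword: str, max_offset: int) -> Optional[int]:
    """Return the first row in [first_row, first_row + max_offset] containing keyword.

    An offset of 0 means the keyword must be in the same row as the first keyword.
    """
    last_row = min(first_row + max_offset, len(lines) - 1)
    for row in range(first_row, last_row + 1):
        if keyword in lines[row]:
            return row
    return None


# Hypothetical running path information; KW1 is found at row 0.
lines = ["KW1 start", "noise", "KW2a seen", "noise",
         "noise", "noise", "noise", "KW4 done"]
first_row = 0

print(find_within_offset(lines, first_row, "KW2a", 2))  # 2
print(find_within_offset(lines, first_row, "KW4", 7))   # 7
print(find_within_offset(lines, first_row, "KW3", 5))   # None
```

Only `max_offset + 1` rows are examined per keyword, which is the efficiency gain the text attributes to avoiding a full-text search.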
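When a line offset range is set for each second keyword, the logical relationship may also specify whether each keyword needs to appear within its line offset range. Specifically, the logical relationship may include at least one of the following: a first identifier, a second identifier, and a third identifier. The first identifier indicates that one or more of the keywords exist within the preset offset range; the second identifier indicates that one of multiple keywords exists within the preset offset range; and the third identifier indicates that one or more of the keywords do not exist within the preset offset range.
It should be noted that the first identifier can be understood as "AND" in FIG. 3A, the second identifier as "OR" in FIG. 3A, and the third identifier as "NOT" in FIG. 3A. In actual applications, the first, second and third identifiers may also be represented by other identifiers, and this is not limited here.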
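One way to read the three identifiers is as a predicate evaluated per entry line. The sketch below is an illustration under stated assumptions: the function name `evaluate`, the tuple layout of `conditions`, and the substring match are not from the source, and "OR" is interpreted as "at least one alternative is present", which the description does not spell out.

```python
def evaluate(lines, entry_line, conditions):
    """conditions: list of (keyword, logic, max_offset) tuples,
    with logic in {"AND", "OR", "NOT"}. Returns True when the rule holds."""

    def present(keyword, max_offset):
        # Window runs from the entry line to entry_line + max_offset, inclusive.
        window = lines[entry_line: entry_line + max_offset + 1]
        return any(keyword in line for line in window)

    all_of = [present(k, off) for k, logic, off in conditions if logic == "AND"]
    any_of = [present(k, off) for k, logic, off in conditions if logic == "OR"]
    none_of = [present(k, off) for k, logic, off in conditions if logic == "NOT"]

    or_ok = any(any_of) if any_of else True  # no OR group means no constraint
    return all(all_of) and or_ok and not any(none_of)


# Hypothetical sample mirroring the offsets 2, 2, 5, 7 of FIG. 3B.
sample = ["boot KW1", "mode b", "x", "y", "z", "q", "r", "use consumption"]
rule = [
    ("mode a", "OR", 2),
    ("mode b", "OR", 2),
    ("manage", "NOT", 5),
    ("consumption", "AND", 7),
]
```

Here `evaluate(sample, 0, rule)` holds because one OR alternative appears within two lines, the NOT keyword is absent from the first six lines, and the AND keyword appears within seven lines.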
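In addition, in some examples, a preset offset range may be set directly for each second keyword. That is, the one or more second keywords further include a fourth keyword and a fifth keyword, and the preset offset range of each second keyword may be determined in either of the following two ways:
① Search for the fourth keyword according to a second preset offset range, where the second preset offset range is obtained from the line offset value between the fourth keyword and the first keyword, and the fourth keyword is one of the one or more second keywords. In other words, a second preset offset range may be set directly for each fourth keyword; in this case the second preset offset range is the line offset value directly between that fourth keyword and the first keyword. For example, if the fourth keyword currently to be found is KW2a, the line offset value between KW2a and KW1 may be set to 6, 7, 8, and so on; the second preset offset range is then 6 to 8, so KW2a need only be searched for within the 8 lines following the line where KW1 is located.
② Alternatively, search for the fourth keyword according to a second preset offset range, where the second preset offset range is obtained from the line offset value between the fifth keyword and the fourth keyword together with the line offset value between the fourth keyword and the first keyword, and the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword.
For example, suppose the fifth keyword currently to be searched for is KW4, and the line offset value between the fourth keyword KW2a and KW1 is known to be 6, 7, 8, and so on (that is, the line offset range of KW2a is 6 to 8). To set a line offset range for KW4, a line offset value from KW4 to KW2a (for example, 1 to 2) may be set on top of the line offset range of KW2a (for example, 6 to 8). Then, from the line offset range set for the fourth keyword KW2a (6 to 8) and the line offset value from KW4 to KW2a (1 to 2), the second preset offset range between KW4 and KW1 is obtained, namely 7 to 10.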
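The arithmetic in this example is an addition of range endpoints: the lower bounds add and the upper bounds add. A minimal sketch, in which the function name `chain_offset` and the tuple representation of a range are assumptions:

```python
def chain_offset(range_to_entry, range_to_previous):
    """Combine the offset range of an already-placed keyword (relative to the
    entry keyword) with the offset range of a further keyword relative to it.
    Both arguments and the result are (low, high) tuples, inclusive."""
    low_a, high_a = range_to_entry
    low_b, high_b = range_to_previous
    return (low_a + low_b, high_a + high_b)


# KW2a lies 6 to 8 lines after KW1; KW4 lies 1 to 2 lines after KW2a.
print(chain_offset((6, 8), (1, 2)))  # (7, 10)
```

The result matches the range of 7 to 10 given in the description for KW4 relative to KW1.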
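The following uses any one of the one or more pieces of running path information (that is, the first running path information) as an example. The first text is processed by a hash algorithm to obtain a second text, the first line value is obtained based on the second text, and the first keyword on the line identified by the first line value is used as the search entry. Then, searching within the preset offset range according to the preset search rule corresponding to the first running path information yields the one or more second keywords that satisfy the preset search rule. It should be noted that the hash algorithm includes but is not limited to the MD5 message-digest algorithm (MD5), the MD4 algorithm, and the like, and is not limited here.
For example, FIG. 4 is a schematic diagram of a search to which the solution of this application is applied. As FIG. 4 shows, the first text can be converted into the second text after being processed by the hash algorithm. It should be noted that the second text can be understood as a matrix-like text structure stored by line. The lines where the first keyword (for example, KW1) is located, such as No.1, No.5 and No.9, can be quickly identified from the second text. Then, with the first keyword on lines No.1, No.5 and No.9 as search entries, search and matching are performed segment by segment to find the second keywords, until all segments have been searched. As FIG. 4 shows, the second keywords found in segment 1 and segment 3 satisfy the preset search rule.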
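The description does not specify how the hashed second text is laid out, so the following is only one plausible reading: an index from the MD5 digest of each token to the line numbers on which it occurs, from which the entry lines and the segments they delimit can be read off. The function names, the whitespace tokenization, and the 1-based line numbering are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def build_index(text):
    """Map the MD5 digest of each whitespace-separated token to the
    1-based numbers of the lines that contain it."""
    index = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for token in set(line.split()):
            digest = hashlib.md5(token.encode("utf-8")).hexdigest()
            index[digest].append(line_no)
    return index


def entry_lines(index, keyword):
    """Look up the lines holding the entry keyword without scanning the text."""
    digest = hashlib.md5(keyword.encode("utf-8")).hexdigest()
    return index.get(digest, [])


def segments(entries, total_lines):
    """Split lines 1..total_lines into segments, each starting at an entry line."""
    bounds = list(entries) + [total_lines + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(len(entries))]


# Hypothetical 10-line text with KW1 on lines 1, 5 and 9, as in FIG. 4.
demo_text = "\n".join(
    "KW1 start" if n in (1, 5, 9) else f"line {n}" for n in range(1, 11)
)
demo_index = build_index(demo_text)
```

With this sample, the entry lines are 1, 5 and 9, and each segment runs from one entry line up to the line before the next, which is where the rule would then be evaluated.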
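It should be noted that, where there are multiple pieces of running path information, the search can be understood with reference to the search for one or more keywords in the first running path information, and details are not repeated here.
204. Determine a first search result according to the first keyword and the one or more second keywords.
In this example, after the one or more second keywords are found, the first search result can be determined in combination with the first keyword, for example, the running scenario corresponding to running the first running path information, or the exception information generated when the first running path information is run. The first search result can reflect the running behavior of the running paths corresponding to the one or more pieces of running path information. For example, the first search result can show which running paths produced errors or alarms. The running status of the system can then be analyzed based on the error information or alarm information of those running paths.
It should be noted that the first search result changes as the first preset search rule changes. For example, suppose the same running path information A corresponds to two first preset search rules (search rule A and search rule B). If search rule A and search rule B are different, the second keywords found in running path information A according to search rule A will differ from those found in running path information A according to search rule B, and the first search results determined will therefore also differ.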
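The foregoing mainly describes, from the perspective of any one of the one or more pieces of running path information (that is, the first running path information), how one or more keywords are found. When the first text includes multiple pieces of running path information, the user wants to find the corresponding keywords in each of them so as to analyze the entire first text. Therefore, in some other possible embodiments, the text search processing method may further include: searching for one or more seventh keywords in second running path information based on a sixth keyword and a second preset search rule; and then determining a second search result according to the sixth keyword and the one or more seventh keywords, where the second search result indicates the running behavior of the running path corresponding to the second running path information.
It should be noted that the sixth keyword is the preset search keyword in the second running path information. The sixth keyword can be understood with reference to the first keyword in the first running path information, and details are not repeated here. The sixth keyword may be the same as or different from the first keyword, and this is not limited here. The second running path information can also be understood as any one of the one or more pieces of running path information, but it is different from the first running path information. In addition, the second preset search rule is any one of the one or more preset search rules corresponding to the second running path information. The second preset search rule can also be understood with reference to the first preset search rule, and details are not repeated here.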
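The per-path search described above amounts to repeating the same procedure for each piece of running path information with its own entry keyword and rules. A minimal sketch follows; the function name `search_all`, the dictionary layout of a rule (`entry`, `keywords`, `max_offset`), and the sample data are assumptions for illustration only.

```python
def search_all(path_infos, rules_by_path):
    """path_infos: {path_id: list of log lines}
    rules_by_path: {path_id: list of rule dicts with keys
                    "entry", "keywords" and "max_offset"}
    Returns {path_id: list of {"entry_line", "found"} records}."""
    results = {}
    for path_id, lines in path_infos.items():
        path_results = []
        for rule in rules_by_path.get(path_id, []):
            for i, line in enumerate(lines):
                if rule["entry"] not in line:
                    continue  # not a search entry for this rule
                window = lines[i: i + rule["max_offset"] + 1]
                found = [k for k in rule["keywords"]
                         if any(k in w for w in window)]
                path_results.append({"entry_line": i, "found": found})
        results[path_id] = path_results
    return results


# Hypothetical data: path "A" uses KW1 as its entry, path "B" uses KW6.
paths = {"A": ["KW1 on", "drain high"], "B": ["KW6 up", "ok", "fault 3"]}
rules = {
    "A": [{"entry": "KW1", "keywords": ["drain"], "max_offset": 1}],
    "B": [{"entry": "KW6", "keywords": ["fault"], "max_offset": 2}],
}
```

Each path is searched independently, so the first and second search results in the description correspond to separate entries of the returned dictionary.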
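In the embodiments of this application, each piece of running path information corresponds to one or more preset search rules, and each preset search rule indicates the logical relationship between at least one keyword. Therefore, after the first keyword is obtained according to the first running path information, one or more second keywords can be searched for based on the first keyword and the first preset search rule, and the first search result is then determined according to the first keyword and the one or more second keywords. On the one hand, in a multi-path search scenario, keywords in multiple pieces of running path information are searched for using the keyword rules in a keyword rule set, so the user does not need to master complex regular expressions and only needs to understand the logical relationship between the keywords to be searched for. On the other hand, the first text, which has a large data volume, is divided into multiple to-be-processed sub-texts, each starting from the first keyword of a piece of running path information, and each piece of running path information corresponds to one or more preset search rules, so the remaining keywords can be searched for efficiently, quickly and accurately.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the method. It can be understood that, to implement the foregoing functions, the text search apparatus includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the functions described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of this application.
From the perspective of physical devices, the text search apparatus may be implemented by one physical device, jointly by multiple physical devices, or as a logical functional unit within one physical device. This is not specifically limited in the embodiments of this application.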
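For example, the text search apparatus may be implemented by the communication device in FIG. 5. FIG. 5 is a schematic diagram of the hardware structure of a communication device according to an embodiment of this application. The communication device includes at least one processor 501, a memory 502, and a transceiver device 503.
The processor 501 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the programs of the solution of this application. The processor 501 can perform operations such as judgment, analysis and computation, including searching for one or more second keywords according to the first keyword and the first preset search rule. The processor 501 is further configured to determine the first search result according to the first keyword and the one or more second keywords, and so on.
The transceiver device 503 uses any transceiver-type apparatus to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or wireless local area networks (WLAN). The transceiver device 503 may be connected to the processor 501. The transceiver device 503 may obtain the first text, the preset search rule set, and so on.
The memory 502 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions; a random access memory (RAM) or another type of dynamic storage device that can store information and instructions; an electrically erasable programmable read-only memory (EEPROM); a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like); a magnetic disk storage medium or another magnetic storage device; or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 502 may exist independently or may be connected to the processor 501. The memory 502 may also be integrated with the processor 501.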
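The memory 502 is configured to store computer-executable instructions for executing the solution of this application, and execution is controlled by the processor 501. The processor 501 is configured to execute the computer-executable instructions stored in the memory 502, thereby implementing the text search processing method provided in the foregoing method embodiments of this application.
In a possible implementation, the computer-executable instructions in the embodiments of this application may also be referred to as application program code. This is not specifically limited in the embodiments of this application.
In a specific implementation, as an embodiment, the processor 501 may include one or more CPUs, for example, CPU0 and CPU1 in FIG. 5.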
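From the perspective of functional units, this application may divide the text search apparatus into functional units according to the foregoing method embodiments. For example, each functional unit may correspond to one function, or two or more functions may be integrated into one functional unit. The integrated functional unit may be implemented in the form of hardware or in the form of a software functional unit.
For example, where the functional units are divided in an integrated manner, FIG. 6 is a schematic structural diagram of a text search apparatus according to an embodiment of this application. As shown in FIG. 6, an embodiment of the text search apparatus of this application may include an acquisition unit 601 and a processing unit 602.
The acquisition unit 601 is configured to obtain a first text, where the first text includes one or more pieces of running path information. For the specific implementation, refer to the detailed description of step 201 in FIG. 2; details are not repeated here.
The acquisition unit 601 is further configured to obtain a preset search rule set, where the preset search rule set includes one or more preset search rules, and each preset search rule indicates a preset logical relationship between at least one keyword. For the specific implementation, refer to the detailed description of step 202 in FIG. 2; details are not repeated here.
The processing unit 602 is configured to search for one or more second keywords based on a first keyword and a first preset search rule. The first keyword is a keyword obtained according to first running path information, the first running path information is any one of the one or more pieces of running path information, and the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information. For the specific implementation, refer to the detailed description of step 203 in FIG. 2; details are not repeated here.
The processing unit 602 is further configured to determine a first search result based on the first keyword and the one or more second keywords. For the specific implementation, refer to the detailed description of step 204 in FIG. 2; details are not repeated here.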
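In some optional embodiments, the processing unit 602 is configured to obtain a first line value and, based on the first line value and the first preset search rule, search for one or more second keywords within a preset offset range. The first line value identifies the number of the line where the first keyword is located, and the preset offset range indicates the line offset value between the one or more second keywords and the first keyword. For the specific implementation, refer to the detailed description of step 203 in FIG. 2; details are not repeated here.
In some other optional embodiments, the processing unit 602 is further configured to search for a third keyword within a first preset offset range according to the first line value and the first preset search rule, where the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword. For the specific implementation, refer to the detailed description of step 203 in FIG. 2; details are not repeated here.
In some other optional embodiments, the logical relationship includes a first identifier, a second identifier, and/or a third identifier. The first identifier indicates that one or more keywords exist within the preset offset range, the second identifier indicates that one of multiple keywords exists within the preset offset range, and the third identifier indicates that one or more keywords do not exist within the preset offset range.
In some other optional embodiments, the one or more second keywords further include a fourth keyword and a fifth keyword. The processing unit 602 is configured to search for the fourth keyword according to a second preset offset range, where the second preset offset range is obtained from the line offset value between the fourth keyword and the first keyword, and the fourth keyword is one of the one or more second keywords; or the second preset offset range is obtained from the line offset value between the fifth keyword and the fourth keyword together with the line offset value between the fourth keyword and the first keyword, and the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword.
In some other optional embodiments, the acquisition unit 601 is configured to: obtain a second text, where the second text is obtained by processing the first text with a hash algorithm; and obtain the first line value based on the second text.
In some other optional embodiments, the type of the keyword includes a string type and/or a key-value pair type.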
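The two keyword types could be matched differently against a log line. The sketch below is an assumption-heavy illustration: the `key=value` layout of a K-V entry, the equality-only comparison, and the function name `matches` are not from the source, which leaves the operator and the log format open.

```python
import re


def matches(line, kw_type, keyword, value=None):
    """String type: plain substring match on keyword.
    K-V type: look for 'keyword=<something>' and, if value is given,
    require the captured value to equal it (assumed log layout)."""
    if kw_type == "String":
        return keyword in line
    if kw_type == "K-V":
        m = re.search(rf"\b{re.escape(keyword)}\s*=\s*(\S+)", line)
        return m is not None and (value is None or m.group(1) == value)
    raise ValueError(f"unknown keyword type: {kw_type}")


# Hypothetical log line used only for illustration.
sample_line = "power mode=eco level=3"
```

A String keyword only checks that the text occurs, whereas a K-V keyword ties a key to a particular value, which is why the input form in FIG. 3B has separate "keyword", "operator" and "keyword value" fields.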
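In some other possible implementations, the processing unit 602 is further configured to search for one or more seventh keywords in second running path information based on a sixth keyword and a second preset search rule, where the sixth keyword is a keyword obtained according to the second running path information, and the second preset search rule is any one of the one or more preset search rules corresponding to the second running path information. The processing unit 602 then determines a second search result according to the sixth keyword and the one or more seventh keywords, where the second search result indicates the running behavior of the running path corresponding to the second running path information.
The text search apparatus provided in the embodiments of this application is configured to perform the method in the method embodiment corresponding to FIG. 2, so the embodiments of this application can be understood with reference to the relevant parts of that method embodiment. In addition, the text search apparatus may include but is not limited to a vehicle, a terminal device, a server, and the like.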
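The division into an acquisition unit and a processing unit can be mirrored in software as two small classes. This is only a structural sketch under assumptions: the class and method names, the dictionary layout of a rule, and the "all listed keywords found" criterion for a satisfied rule are illustrative and are not taken from the application.

```python
class AcquisitionUnit:
    """Holds the first text and the preset search rule set (cf. unit 601)."""

    def __init__(self, first_text, rule_set):
        self._lines = first_text.splitlines()
        self._rule_set = rule_set

    def get_lines(self):
        return self._lines

    def get_rule_set(self):
        return self._rule_set


class ProcessingUnit:
    """Searches for second keywords and derives a result (cf. unit 602)."""

    def search(self, lines, rule):
        """Return (entry_line, found_keywords) pairs for one rule."""
        hits = []
        for i, line in enumerate(lines):
            if rule["entry"] in line:
                window = lines[i: i + rule["max_offset"] + 1]
                found = [k for k in rule["keywords"]
                         if any(k in w for w in window)]
                hits.append((i, found))
        return hits

    def result(self, hits, rule):
        """Entry lines at which every listed keyword was found (assumed criterion)."""
        return [i for i, found in hits if len(found) == len(rule["keywords"])]


# Hypothetical usage with a four-line text and a single rule.
acq = AcquisitionUnit(
    "KW1 a\nwake\nKW1 b\nidle",
    [{"entry": "KW1", "keywords": ["wake"], "max_offset": 1}],
)
proc = ProcessingUnit()
demo_rule = acq.get_rule_set()[0]
demo_hits = proc.search(acq.get_lines(), demo_rule)
```

Keeping acquisition separate from processing matches the description, where the acquisition unit may be backed by a transceiver and the processing unit by a processor.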
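In the embodiments of this application, the text search apparatus is presented in a form in which the functional units are divided in an integrated manner. A "functional unit" here may refer to an application-specific integrated circuit (ASIC), a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another component that can provide the foregoing functions. In a simple embodiment, a person skilled in the art can conceive that the text search apparatus may take the form shown in FIG. 5.
For example, the processor 501 in FIG. 5 may invoke the computer-executable instructions stored in the memory 502 so that the text search apparatus performs the method performed by the text search apparatus in the method embodiment corresponding to FIG. 2.
Specifically, the function or implementation process of the processing unit 602 in FIG. 6 may be implemented by the processor 501 in FIG. 5 invoking the computer-executable instructions stored in the memory 502. The function or implementation process of the acquisition unit 601 in FIG. 6 may be implemented by the transceiver device 503 in FIG. 5.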
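In the device of FIG. 5 of this application, the components are communicatively connected; that is, the processing unit (or processor), the storage unit (or memory) and the transceiver device (transceiver) communicate with one another through internal connection paths to transfer control and/or data signals. The foregoing method embodiments of this application may be applied to a processor, or the steps of the foregoing method embodiments may be implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a central processing unit (CPU), a network processor (NP) or a combination of a CPU and an NP, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps and logical block diagrams disclosed in this application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed in this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware. Although only one processor is shown in the figure, the apparatus may include multiple processors, or the processor may include multiple processing units. Specifically, the processor may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.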
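The memory is configured to store computer instructions executed by the processor. The memory may be a storage circuit or a memory. The memory may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory, a programmable read-only memory, an erasable programmable read-only memory, an electrically erasable programmable read-only memory, or a flash memory. The volatile memory may be a random access memory, which is used as an external cache. The memory may be independent of the processor or may be a storage unit in the processor, and this is not limited here. Although only one memory is shown in the figure, the apparatus may also include multiple memories, or the memory may include multiple storage units.
The transceiver is configured to implement content exchange between the processor and other units or network elements. Specifically, the transceiver may be a communication interface of the apparatus, a transceiver circuit or a communication unit, or a transceiver machine. The transceiver may also be a communication interface or transceiver circuit of the processor. Optionally, the transceiver may be a transceiver chip. The transceiver may further include a sending unit and/or an acquisition unit. In a possible implementation, the transceiver may include at least one communication interface. In another possible implementation, the transceiver may be a unit implemented in software. In the embodiments of this application, the processor may interact with other units or network elements through the transceiver. For example, the processor obtains or receives content from other network elements through the transceiver. If the processor and the transceiver are two physically separate components, the processor may exchange content with other units of the apparatus without going through the transceiver.
In a possible implementation, the processor, the memory and the transceiver may be connected to one another by a bus. The bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on.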
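In the embodiments of this application, words such as "exemplary" or "for example" are used to give an example, illustration or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as being preferred or more advantageous over other embodiments or designs. Rather, such words are intended to present the relevant concepts in a concrete manner.
In the embodiments of this application, various examples are given for ease of understanding. However, these are merely examples and do not mean that they are the best way to implement this application.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer-executable instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
The technical solutions provided in this application are described in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the foregoing embodiments is intended only to help understand the method of this application and its core idea. Meanwhile, a person of ordinary skill in the art may, based on the idea of this application, make changes to the specific implementations and the scope of application. In conclusion, the content of this specification should not be construed as a limitation on this application.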

Claims (18)

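  1. A text search method, comprising:
    obtaining a first text, wherein the first text comprises one or more pieces of running path information;
    obtaining a preset search rule set, wherein the preset search rule set comprises one or more preset search rules, each piece of the running path information corresponds to one or more of the preset search rules, and each of the preset search rules is used to indicate a logical relationship between at least one keyword;
    searching for one or more second keywords according to a first keyword and a first preset search rule, wherein the first keyword is a keyword obtained according to first running path information, the first running path information is any one of the one or more pieces of running path information, and the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information; and
    obtaining a first search result according to the first keyword and the one or more second keywords.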
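  2. The method according to claim 1, wherein the searching for one or more second keywords according to the first keyword and the first preset search rule comprises:
    obtaining a first line value, wherein the first line value is used to identify the number of the line where the first keyword is located; and
    searching, based on the first line value and the first preset search rule, for the one or more second keywords within a preset offset range, wherein the preset offset range indicates a line offset value between the one or more second keywords and the first keyword.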
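  3. The method according to claim 2, wherein the searching, based on the first line value and the first preset search rule, for the one or more second keywords within a preset offset range comprises:
    searching, according to the first line value and the first preset search rule, for a third keyword within a first preset offset range, wherein the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.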
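  4. The method according to claim 2 or 3, wherein the logical relationship comprises at least one of the following: a first identifier, a second identifier, and a third identifier, wherein the first identifier indicates that one or more of the keywords exist within the preset offset range, the second identifier indicates that one of the multiple keywords exists within the preset offset range, and the third identifier indicates that one or more of the keywords do not exist within the preset offset range.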
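  5. The method according to any one of claims 2 to 4, wherein the method further comprises:
    the one or more second keywords further comprise a fourth keyword and a fifth keyword;
    searching for the fourth keyword according to a second preset offset range, wherein the second preset offset range is obtained from a line offset value between the fourth keyword and the first keyword, and the fourth keyword is one of the one or more second keywords; or
    the second preset offset range is obtained from a line offset value between the fifth keyword and the fourth keyword and the line offset value between the fourth keyword and the first keyword, and the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword.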
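  6. The method according to any one of claims 2 to 5, wherein the obtaining a first line value comprises:
    obtaining a second text, wherein the second text is obtained by processing the first text with a hash algorithm; and
    obtaining the first line value according to the second text.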
  7. The method according to any one of claims 1 to 6, wherein the type of the keyword comprises a string type and/or a key-value pair type.
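  8. A text search apparatus, comprising:
    an acquisition unit, configured to obtain a first text, wherein the first text comprises one or more pieces of running path information;
    the acquisition unit being configured to obtain a preset search rule set, wherein the preset search rule set comprises one or more preset search rules, and each of the preset search rules is used to indicate a logical relationship between at least one keyword;
    a processing unit, configured to search for one or more second keywords according to a first keyword and a first preset search rule, wherein the first keyword is a keyword obtained according to first running path information, the first running path information is any one of the one or more pieces of running path information, and the first preset search rule is any one of the one or more preset search rules corresponding to the first running path information; and
    the processing unit being configured to obtain a first search result according to the first keyword and the one or more second keywords.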
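  9. The text search apparatus according to claim 8, wherein
    the acquisition unit is configured to obtain a first line value, wherein the first line value is used to identify the number of the line where the first keyword is located; and
    the processing unit is configured to search, according to the first line value and the first preset search rule, for the one or more second keywords within a preset offset range, wherein the preset offset range indicates a line offset value between the one or more second keywords and the first keyword.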
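  10. The text search apparatus according to claim 8 or 9, wherein
    the processing unit is further configured to search, according to the first line value and the first preset search rule, for a third keyword within a first preset offset range, wherein the third keyword is any one of the one or more second keywords, and the first preset offset range corresponds to the third keyword.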
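  11. The text search apparatus according to any one of claims 8 to 10, wherein the logical relationship comprises a first identifier, a second identifier, and/or a third identifier, the first identifier indicates that one or more of the keywords exist within the preset offset range, the second identifier indicates that one of the multiple keywords exists within the preset offset range, and the third identifier indicates that one or more of the keywords do not exist within the preset offset range.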
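  12. The text search apparatus according to any one of claims 8 to 11, wherein the one or more second keywords further comprise a fourth keyword and a fifth keyword;
    the processing unit is configured to search for the fourth keyword according to a second preset offset range, wherein the second preset offset range is obtained from a line offset value between the fourth keyword and the first keyword, and the fourth keyword is one of the one or more second keywords; or the second preset offset range is obtained from a line offset value between the fifth keyword and the fourth keyword and the line offset value between the fourth keyword and the first keyword, and the fifth keyword is a keyword, among the one or more second keywords, that is different from the fourth keyword.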
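  13. The text search apparatus according to any one of claims 8 to 12, wherein the acquisition unit is configured to:
    obtain a second text, wherein the second text is obtained by processing the first text with a hash algorithm; and
    obtain the first line value based on the second text.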
  14. The text search apparatus according to any one of claims 8 to 13, wherein the type of the keyword comprises a string type and/or a key-value pair type.
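  15. A vehicle, comprising a memory and a processor coupled to the memory, wherein the memory is configured to store computer-readable instructions, and the processor is configured to execute the computer-readable instructions in the memory to perform the method according to any one of claims 1 to 7.
  16. A server, comprising a memory and a processor coupled to the memory, wherein the memory is configured to store computer-readable instructions, and the processor is configured to execute the computer-readable instructions in the memory to perform the method according to any one of claims 1 to 7.
  17. A computer-readable storage medium, wherein when instructions are run on a computer apparatus, the computer apparatus is caused to perform the method according to any one of claims 1 to 7.
  18. A computer program product, wherein when the computer program product is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 7.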
PCT/CN2021/113863 2021-08-20 2021-08-20 一种文本搜索处理的方法以及相关设备 WO2023019576A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP21953817.0A EP4383085A1 (en) 2021-08-20 2021-08-20 Text search processing method and related device
CN202180006898.1A CN115997201A (zh) 2021-08-20 2021-08-20 一种文本搜索处理的方法以及相关设备
PCT/CN2021/113863 WO2023019576A1 (zh) 2021-08-20 2021-08-20 一种文本搜索处理的方法以及相关设备
US18/443,398 US20240184687A1 (en) 2021-08-20 2024-02-16 Text search processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/113863 WO2023019576A1 (zh) 2021-08-20 2021-08-20 一种文本搜索处理的方法以及相关设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/443,398 Continuation US20240184687A1 (en) 2021-08-20 2024-02-16 Text search processing method and related device

Publications (1)

Publication Number Publication Date
WO2023019576A1 true WO2023019576A1 (zh) 2023-02-23

Family

ID=85239373

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/113863 WO2023019576A1 (zh) 2021-08-20 2021-08-20 一种文本搜索处理的方法以及相关设备

Country Status (4)

Country Link
US (1) US20240184687A1 (zh)
EP (1) EP4383085A1 (zh)
CN (1) CN115997201A (zh)
WO (1) WO2023019576A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101013430A (zh) * 2007-01-29 2007-08-08 华为技术有限公司 搜索方法及装置
CN102779151A (zh) * 2012-05-10 2012-11-14 北京奇虎科技有限公司 应用程序的搜索方法、装置及系统
US20140280033A1 (en) * 2013-03-13 2014-09-18 Wal-Mart Stores, Inc. Rule triggering for search rule engine
CN105068716A (zh) * 2015-08-11 2015-11-18 广东欧珀移动通信有限公司 信息搜索方法及装置
US20170124075A1 (en) * 2014-05-23 2017-05-04 Yinsheng DENG System for identifying, associating, searching and presenting documents based on relation combination
CN111858062A (zh) * 2020-07-27 2020-10-30 中国平安财产保险股份有限公司 评估规则优化方法、业务评估方法及相关设备

Also Published As

Publication number Publication date
CN115997201A (zh) 2023-04-21
US20240184687A1 (en) 2024-06-06
EP4383085A1 (en) 2024-06-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953817

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021953817

Country of ref document: EP

Effective date: 20240306

NENP Non-entry into the national phase

Ref country code: DE