CN116561292A - Data searching method, device, electronic equipment and computer readable medium

Info

Publication number: CN116561292A
Application number: CN202310551514.5A
Authority: CN
Inventors: 孙博
Original assignee: China Construction Bank Corp; CCB Finetech Co Ltd
Current assignee: China Construction Bank Corp; CCB Finetech Co Ltd
Priority date: 2023-05-16
Filing date: 2023-05-16
Publication date: 2023-08-08

Abstract

The invention discloses a data searching method, a data searching device, electronic equipment and a computer readable medium, and relates to the technical field of big data processing. One embodiment of the method comprises the following steps: receiving operation and maintenance data pushed by each operation and maintenance system, or pulling the operation and maintenance data from each operation and maintenance system; adding classification fields and tag fields to the operation and maintenance data, writing the operation and maintenance data and the corresponding classification fields and tag fields into a database, creating a document according to the operation and maintenance data and the corresponding classification fields and tag fields, and updating the document into an index; receiving a search request sent by a searcher, wherein the search request carries a search statement input by a user; identifying a target field from the search statement; searching the index for a plurality of matching documents according to the target field; and ranking the plurality of documents so that the top-ranked documents are returned to the searcher. This embodiment can solve the technical problem that full-ecosystem operation and maintenance data cannot be conveniently searched.

Description

Data searching method, device, electronic equipment and computer readable medium

Technical Field

The present invention relates to the field of big data processing technologies, and in particular, to a data searching method, a data searching device, an electronic device, and a computer readable medium.

Background

With the spread of computer systems and intelligent office systems, the scale and frequency of operation and maintenance work keep increasing. The information generated while an operation and maintenance system runs can help analyze work content, discover and solve problems, and support queries on related operation and maintenance data. It therefore has great research and archival value, and most operation and maintenance systems record their operation information for subsequent use.

However, because different operation and maintenance systems differ greatly in function and in the complexity of their data, their operation and maintenance data are not linked to one another. Even where some systems implement cross-platform data storage, they cannot support complex searches, and it is difficult to find, in real time and along a given dimension, the full-ecosystem operation and maintenance data of a piece of hardware or a network. Operation and maintenance personnel therefore have to search several systems, or perform multiple searches and manually splice the results, to obtain the desired data.

Disclosure of Invention

In view of the above, the embodiments of the present invention provide a data searching method, apparatus, electronic device and computer readable medium, so as to solve the technical problem that full-ecosystem operation and maintenance data cannot be conveniently searched.

To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a data search method including:

receiving operation and maintenance data pushed by each operation and maintenance system, or pulling the operation and maintenance data from each operation and maintenance system;

adding a classification field and a tag field to the operation and maintenance data, writing the operation and maintenance data and the corresponding classification field and tag field into a database, creating a document according to the operation and maintenance data and the corresponding classification field and tag field, and updating the document into an index;

receiving a search request sent by a searcher, wherein the search request carries a search statement input by a user, and identifying a target field from the search statement;

and searching the index for a plurality of matching documents according to the target field, and ranking the plurality of documents so that the top-ranked documents are returned to the searcher.

Optionally, writing the operation and maintenance data and the corresponding classification field and tag field into a database includes:

generating a tree directory according to the classification field, and adding the operation and maintenance data into the tree directory;

and writing the tree directory, and the correspondence between the operation and maintenance data and the tag field, into a database.

Optionally, creating a document according to the operation and maintenance data and the corresponding classification field and tag field, and updating the document into an index, includes:

generating an index field according to the operation and maintenance data;

and assembling the index field, the operation and maintenance data, and the corresponding classification field and tag field into a document, and writing the document into an index creator so as to update the document into an index.

Optionally, generating an index field according to the operation and maintenance data includes:

generating an index field according to the metadata of the operation and maintenance data; and/or

performing word segmentation on the operation and maintenance data to obtain an index field.

Optionally, identifying the target field from the search statement includes:

preprocessing the search statement, wherein the preprocessing comprises at least one of pinyin conversion processing, completion processing and paraphrase supplementing processing;

extracting keywords and/or tag fields from the preprocessed search statement, and identifying user intention according to the keywords and/or the tag fields so as to obtain associated fields;

performing word segmentation processing on the search statement so as to obtain segmented words;

wherein the target field comprises the keywords and/or tag fields, the associated fields, and the segmented words.

Optionally, searching the index for a plurality of matching documents according to the target field, and ranking the plurality of documents, includes:

for each document in the index, calculating relevance scores between each target field and each index field, classification field and tag field of the document, and computing a weighted sum of the calculated relevance scores, thereby obtaining a BM25 value of the search statement with respect to the document;

and ranking the documents in descending order of BM25 value.

In addition, according to another aspect of an embodiment of the present invention, there is provided a data search apparatus including:

the receiving module is used for receiving the operation and maintenance data pushed by each operation and maintenance system or pulling the operation and maintenance data from each operation and maintenance system;

the storage module is used for adding a classification field and a tag field to the operation and maintenance data, writing the operation and maintenance data and the corresponding classification field and tag field into a database, creating a document according to the operation and maintenance data and the corresponding classification field and tag field, and updating the document into an index;

The processing module is used for receiving a search request sent by a searcher, wherein the search request carries a search statement input by a user, and a target field is identified from the search statement;

and the searching module is used for searching the index for a plurality of matching documents according to the target field, and ranking the documents so as to return the top-ranked documents to the searcher.

Optionally, the storage module is further configured to:

generating an index field according to the operation and maintenance data;

Optionally, the storage module is further configured to:

Optionally, the processing module is further configured to:

Optionally, the search module is further configured to:

According to another aspect of an embodiment of the present invention, there is also provided an electronic device including:

one or more processors;

storage means for storing one or more programs,

The one or more processors implement the method of any of the embodiments described above when the one or more programs are executed by the one or more processors.

According to another aspect of an embodiment of the present invention, there is also provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.

According to another aspect of embodiments of the present invention, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the method according to any of the embodiments described above.

One embodiment of the above invention has the following advantages or benefits: classification fields and tag fields are added to the operation and maintenance data; the operation and maintenance data and the corresponding classification fields and tag fields are written into the database; a document is created from them and updated into the index; a target field is identified from the search statement; and a plurality of matching documents are found in the index according to the target field and ranked. These technical means solve the technical problem in the prior art that full-ecosystem operation and maintenance data cannot be conveniently searched. The embodiment of the invention can not only flexibly store a large amount of complex, multi-type data, but also accurately find the corresponding operation and maintenance data according to different requirements, so that full-ecosystem operation and maintenance data can be conveniently searched.

Further effects of the above optional implementations are described below in connection with the specific embodiments.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:

FIG. 1 is a flow chart of a data search method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a system architecture for implementing a data search method according to an embodiment of the present invention;

FIG. 3 is a flow chart of a data search method according to one referenceable embodiment of the invention;

FIG. 4 is a flow chart of a data search method according to another referenceable embodiment of the invention;

FIG. 5 is a schematic diagram of a data search device according to an embodiment of the present invention;

FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;

FIG. 7 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

In the technical solution of the invention, the collection, analysis, use, transmission and storage of the user personal information involved all comply with relevant laws and regulations, serve legal and reasonable purposes, are not shared, leaked or sold beyond such legal use, and are subject to supervision and management by the regulatory authorities. Necessary measures should be taken to prevent illegal access to such personal information data, to ensure that personnel with access to it comply with relevant laws and regulations, and to ensure the security of the user's personal information. Once such personal information data are no longer needed, risk should be minimized by limiting or even prohibiting data collection and/or by deleting the data.

The embodiment of the invention mainly provides two functions: collecting information and data from complex, multi-type systems, and searching that information and data efficiently, accurately and along multiple dimensions, so that full-ecosystem operation and maintenance information can be conveniently searched.

Fig. 1 is a flowchart of a data search method according to an embodiment of the present invention. As an embodiment of the present invention, as shown in fig. 1, the data searching method may include:

Step 101, receiving the operation and maintenance data pushed by each operation and maintenance system, or pulling the operation and maintenance data from each operation and maintenance system.

As shown in fig. 2, a purpose-built data processing layer service receives the latest operation and maintenance data pushed by each operation and maintenance system (such as operation and maintenance system A, operation and maintenance system B, operation and maintenance system C, etc.). In addition, as an auxiliary measure, the data processing layer service can actively scan the data of each connected operation and maintenance system at regular intervals, so as to fill any gaps and ensure the integrity of the data.

Because of network problems or traffic limits, an operation and maintenance system sometimes cannot guarantee that its latest operation and maintenance data is pushed in real time. If the service only received data passively through an interface, data could be lost, so the service also needs an active data scanning mechanism. For every operation and maintenance system connected to it, the service can flexibly set a scanning period according to that system's data volume or request frequency, so that each system is scanned continuously and regularly at a suitable frequency. It should be noted that the actively scanned operation and maintenance data is screened: information already recorded in the database is ignored and is not recorded again.

The operation and maintenance data is thus obtained in parallel through an active mode and a passive mode, which ensures both the integrity and the timeliness of the incoming data.
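The push and scan paths described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than the implementation of the embodiment: the class name `IngestService`, the in-memory `store`, the `id` key of a record and the default one-hour period are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class IngestService:
    """Hypothetical data processing layer with a push path and a scan path."""
    store: Dict[str, dict] = field(default_factory=dict)         # key -> record
    scan_period: Dict[str, float] = field(default_factory=dict)  # system -> seconds
    last_scan: Dict[str, float] = field(default_factory=dict)    # system -> timestamp

    @staticmethod
    def _key(system: str, record: dict) -> str:
        # Assumes every record carries an "id" that is unique within its system.
        return f"{system}:{record['id']}"

    def receive_push(self, system: str, record: dict) -> None:
        """Passive path: store (or overwrite) a record pushed by a system."""
        self.store[self._key(system, record)] = dict(record)

    def scan_due(self, system: str, now: float) -> bool:
        """A system is due once its own scan period has elapsed."""
        period = self.scan_period.get(system, 3600.0)  # assumed default
        return now - self.last_scan.get(system, float("-inf")) >= period

    def scan(self, system: str, fetch: Callable[[], Iterable[dict]], now: float) -> int:
        """Active path: pull records and keep only those not yet recorded."""
        if not self.scan_due(system, now):
            return 0
        self.last_scan[system] = now
        added = 0
        for record in fetch():
            key = self._key(system, record)
            if key not in self.store:  # already recorded data is ignored
                self.store[key] = dict(record)
                added += 1
        return added


def pull_from_system_a() -> list:
    """Stand-in for a real pull from an operation and maintenance system."""
    return [{"id": "1", "msg": "disk ok"}, {"id": "2", "msg": "cpu high"}]


service = IngestService(scan_period={"system_a": 60.0})
service.receive_push("system_a", {"id": "1", "msg": "disk ok"})
first = service.scan("system_a", pull_from_system_a, now=100.0)
second = service.scan("system_a", pull_from_system_a, now=120.0)
```

In this sketch the first scan adds only the record that was never pushed, and the second scan is skipped because the 60-second period of `system_a` has not yet elapsed.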

Step 102, adding a classification field and a tag field to the operation and maintenance data, writing the operation and maintenance data and the corresponding classification field and tag field into a database, creating a document according to the operation and maintenance data and the corresponding classification field and tag field, and updating the document into an index.

In order to store a large amount of operation and maintenance data flexibly, the embodiment of the invention stores the data in a NoSQL distributed database cluster. The reasons for this choice are that the storage data structure is flexible and can accommodate the operation and maintenance data of various operation and maintenance systems, and that the storage capacity is large and can easily be expanded at any time. Storing the operation and maintenance data in a NoSQL distributed database cluster therefore allows a large amount of data to be stored while preserving the original format of the data.

Because a NoSQL database is used for data storage, the interface service that receives the operation and maintenance data places no strict limit on the structure of the transmitted data, and the original structure of the operation and maintenance data can be kept in storage. While keeping the native data structure, the data processing layer may add additional system information for subsequent searching and management, such as:

1) Adding a classification field: classification fields are added to the operation and maintenance data according to the data source, that is, the operation and maintenance system from which the data was acquired. Examples are the name, number and type of the operation and maintenance system, and deployment information such as the names and numbers of the network segment, deployment machine room, deployment equipment and deployment area. This makes it convenient to search later using a specific classification field as a condition, for example searching by network segment or by deployment machine room.

2) Adding a tag field: a tag field differs from a classification field in that a classification is a finite set of categories reflecting the overall information of the operation and maintenance system, whereas a tag is a phrase or keyword that can describe the functional details of the system as fully as possible. Keywords are extracted from the operation and maintenance data according to preset tag types and used as tag fields. Tags are very valuable search conditions that can greatly increase accuracy in subsequent searches. It should be noted that, depending on which tags the operation and maintenance data hits, one tag field or several tag fields may be added to it.

3) Setting a directory: a tree directory is generated according to the classification fields, and the operation and maintenance data is added to the directory. The tree directory format makes the corresponding operation and maintenance data easier to manage and search; for example, operation and maintenance data under the same category can be found quickly by unit, department, machine room and the like.

Optionally, writing the operation and maintenance data and the corresponding classification field and tag field into a database includes: generating a tree directory according to the classification field, and adding the operation and maintenance data into the tree directory; and writing the tree directory, and the correspondence between the operation and maintenance data and the tag field, into a database. In the embodiment of the invention, after the operation and maintenance data pushed by or pulled from each operation and maintenance system is received, classification fields and tag fields are added to each piece of operation and maintenance data, a tree directory is generated according to the classification fields, and the operation and maintenance data is added into the tree directory. The data processing layer then pushes the tree directory containing the operation and maintenance data, together with the correspondence between the operation and maintenance data and the tag fields, to the data storage layer for persistent storage and subsequent processing.
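As a rough illustration of the three additions above, the following Python sketch enriches a record with classification and tag fields and files it in a nested-dictionary tree directory. The tag vocabulary `PRESET_TAGS`, the level order `DIRECTORY_LEVELS`, the `content` and `id` keys, and the substring-based tag matching are assumptions made for the example; the embodiment does not specify them.

```python
from typing import Dict, List

# Hypothetical preset tag vocabulary; the text only says tags follow preset tag types.
PRESET_TAGS: List[str] = ["intranet", "test environment", "single point deployment"]

# Hypothetical classification fields, ordered from the directory root downward.
DIRECTORY_LEVELS: List[str] = ["machine_room", "network_segment", "system_name"]


def enrich(record: dict, source: Dict[str, str]) -> dict:
    """Return a copy of the record with classification and tag fields added."""
    text = str(record.get("content", "")).lower()
    return {
        **record,  # the native structure of the record is kept as-is
        "classification": {level: source[level] for level in DIRECTORY_LEVELS},
        "tags": [tag for tag in PRESET_TAGS if tag in text],  # zero, one or several
    }


def add_to_tree(tree: dict, enriched: dict) -> None:
    """File the record id under the path given by its classification fields."""
    node = tree
    for level in DIRECTORY_LEVELS:
        node = node.setdefault(enriched["classification"][level], {})
    node.setdefault("_records", []).append(enriched["id"])


tree: dict = {}
source_info = {
    "machine_room": "room-1",
    "network_segment": "10.0.1.0/24",
    "system_name": "system_a",
}
doc = enrich({"id": "r1", "content": "Intranet test environment host rebooted"}, source_info)
add_to_tree(tree, doc)
```

After this runs, `doc["tags"]` holds the two tags found in the content, and the record id can be reached by walking the tree by machine room, network segment and system name.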

In order to facilitate storage and management of a large amount of heterogeneous operation and maintenance data, the embodiment of the invention adopts a NoSQL distributed database cluster as the underlying database of the data storage layer service. The data storage layer comprises the following two services: storing the heterogeneous operation and maintenance data transmitted by the data processing layer by using the flexible data structure of NoSQL; and, for subsequent searching, processing the stored operation and maintenance data into the search index format used by the search service, that is, creating a document according to the operation and maintenance data and the corresponding classification fields and tag fields, and updating the document into an index.

Data pushed from the data processing layer is handled in one of two ways, addition or modification. Data that does not yet exist in the database is added as new data; for data that already exists in the system, the corresponding fields are modified and updated, which ensures the timeliness and correctness of the data. Specifically, the data is stored into the distributed database cluster according to its data type. The data type reflects the information category of the system, such as system information of a numeric or text type, various system calendars of a date type, and system logs, system manuals and the like of a file type. The correspondence between the data identification of the operation and maintenance data and the tag field can be stored in a database table, which represents the correspondence between the operation and maintenance data and the tag field.

Optionally, creating a document according to the operation and maintenance data and the corresponding classification field and tag field, and updating the document into an index, includes: generating an index field according to the operation and maintenance data; and assembling the index field, the operation and maintenance data, and the corresponding classification field and tag field into a document, and writing the document into an index creator so as to update the document into an index. As shown in fig. 2, to serve subsequent data searches, newly entered data triggers a process that updates the search index, so that the data can be found by users as soon as possible. The process mainly comprises the following steps: first, index fields (Field) are generated from the operation and maintenance data; then the classification fields, tag fields and index fields of the operation and maintenance data are assembled into a document (Document); finally, the document is written into an index creator, which updates the document into the index. In other words, the operation and maintenance data of the plurality of operation and maintenance systems are organized into a single index to facilitate searching. After the database and the corresponding index are updated, the data updating process is finished. Finally, the data storage layer stores the generated search index data into a database to facilitate searching.

Optionally, generating an index field according to the operation and maintenance data includes: generating an index field according to the metadata of the operation and maintenance data; and/or performing word segmentation on the operation and maintenance data to obtain an index field. For each piece of operation and maintenance data, data of all types is extracted to generate index fields. Depending on the classification of the data and the type of its content, some index fields store unprocessed metadata directly, such as the data identification, network type and system deployment region, which must be kept intact; other fields, such as system logs, system files and system profiles, are stored after word segmentation so that they can be found as easily as possible.
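The document assembly and index update described above can be sketched in plain Python as a toy inverted index. This is a stand-in for a real full-text engine, not the index creator of the embodiment: the field lists `METADATA_FIELDS` and `TEXT_FIELDS`, the `class.` prefix and the regex-based `segment` function (used here in place of a Chinese word segmenter) are all assumptions.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple

METADATA_FIELDS = ("data_id", "network_type", "region")  # stored whole, unsegmented
TEXT_FIELDS = ("system_log", "system_profile")           # segmented before indexing


def segment(text: str) -> List[str]:
    """Stand-in segmenter: lowercase alphanumeric runs, not a real Chinese segmenter."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_document(record: dict, classification: Dict[str, str],
                   tags: List[str]) -> Dict[str, List[str]]:
    """Assemble index fields, classification fields and tag fields into one document."""
    doc: Dict[str, List[str]] = {}
    for name in METADATA_FIELDS:
        if name in record:
            doc[name] = [str(record[name])]  # metadata is kept intact
    for name in TEXT_FIELDS:
        if name in record:
            doc[name] = segment(str(record[name]))
    for name, value in classification.items():
        doc[f"class.{name}"] = [str(value)]
    doc["tags"] = list(tags)
    return doc


class InvertedIndex:
    """Toy index mapping (field, term) to the ids of documents containing it."""

    def __init__(self) -> None:
        self.postings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
        self.docs: Dict[str, Dict[str, List[str]]] = {}

    def update(self, doc_id: str, doc: Dict[str, List[str]]) -> None:
        """Add a new document, or replace the previous version of an existing one."""
        for fld, terms in self.docs.get(doc_id, {}).items():
            for term in terms:
                self.postings[(fld, term)].discard(doc_id)
        self.docs[doc_id] = doc
        for fld, terms in doc.items():
            for term in terms:
                self.postings[(fld, term)].add(doc_id)

    def lookup(self, fld: str, term: str) -> Set[str]:
        return set(self.postings.get((fld, term), set()))


index = InvertedIndex()
record = {"data_id": "r1", "network_type": "intranet",
          "system_log": "Disk usage high on node-7"}
index.update("r1", build_document(record, {"machine_room": "room-1"}, ["intranet"]))
```

Calling `index.lookup("system_log", "disk")` then returns the id of the record, while the metadata value `r1` is matched only as a whole under `data_id`.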

Step 103, receiving a search request sent by a searcher, wherein the search request carries a search statement input by a user, and identifying a target field from the search statement.

For the search request of the operation and maintenance data, the search service layer can identify the search statement input by the user so as to identify the target field, and then find the operation and maintenance data which meets the user's expectations from the data storage layer according to the identified target field. As shown in fig. 2, for example, input processing such as keyword extraction, pinyin conversion, intent recognition and the like is performed on a search sentence input by a user, so that target fields such as a target keyword, a tag field, an associated field, a word segmentation and the like are recognized, and a search service can obtain a real search intent of the user more accurately.

Step 104, searching the index for a plurality of matching documents according to the target field, and ranking the plurality of documents so that the top-ranked documents are returned to the searcher.

As shown in fig. 2, the target field identified in step 103 is used to find the documents that satisfy the conditions in the search index of the data storage layer; these documents are ranked according to the relevance between the target field and each document, and the top-ranked documents are returned to the searcher. The embodiment of the invention can therefore find, from the data storage layer, the operation and maintenance data that meets the user's requirements along multiple dimensions.

The embodiment of the invention collects a large amount of operation and maintenance data and can provide complete longitudinal and transverse data for an operation and maintenance node (such as a server). The longitudinal data of an operation and maintenance node is all the operation and maintenance data generated over its whole life cycle, and the transverse data is the environment in which the node is located and the content of its adjacent nodes. The embodiment of the invention can therefore connect the longitudinal and transverse data of an operation and maintenance node, so that the value of the node data can be mined effectively.

As can be seen from the embodiments described above, the following technical means are adopted: classification fields and tag fields are added to the operation and maintenance data; the operation and maintenance data and the corresponding classification fields and tag fields are written into the database; a document is created from them and updated into the index; a target field is identified from the search statement; and a plurality of matching documents are found in the index according to the target field and ranked. These means solve the technical problem in the prior art that full-ecosystem operation and maintenance data cannot be conveniently searched. The embodiment of the invention can not only flexibly store a large amount of complex, multi-type data, but also accurately find the corresponding operation and maintenance data according to different requirements, so that full-ecosystem operation and maintenance data can be conveniently searched.
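The optional ranking described earlier, a weighted sum of per-field relevance scores yielding a BM25 value, can be sketched as follows. The sketch uses the common Okapi BM25 formula with the frequently used parameters k1 = 1.2 and b = 0.75; the field weights and the sample documents are assumptions, and the embodiment does not state which BM25 variant or weights it uses.

```python
import math
from typing import Dict, List, Tuple

Doc = Dict[str, List[str]]  # field name -> list of terms


def bm25_field(terms: List[str], docs: Dict[str, Doc], fld: str,
               k1: float = 1.2, b: float = 0.75) -> Dict[str, float]:
    """Okapi BM25 score of every document for the query terms within one field."""
    n_docs = len(docs)
    lengths = {d: len(doc.get(fld, [])) for d, doc in docs.items()}
    avg_len = sum(lengths.values()) / n_docs if n_docs else 0.0
    scores = {d: 0.0 for d in docs}
    for term in set(terms):
        df = sum(1 for doc in docs.values() if term in doc.get(fld, []))
        if df == 0:
            continue  # term absent from this field everywhere
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        for d, doc in docs.items():
            tf = doc.get(fld, []).count(term)
            if tf == 0:
                continue
            norm = 1.0 - b + b * lengths[d] / avg_len  # length normalization
            scores[d] += idf * tf * (k1 + 1.0) / (tf + k1 * norm)
    return scores


def rank(terms: List[str], docs: Dict[str, Doc], weights: Dict[str, float],
         top_k: int = 10) -> List[Tuple[str, float]]:
    """Weighted sum of per-field scores, sorted in descending order."""
    total = {d: 0.0 for d in docs}
    for fld, weight in weights.items():
        for d, score in bm25_field(terms, docs, fld).items():
            total[d] += weight * score
    hits = [(d, s) for d, s in total.items() if s > 0.0]
    return sorted(hits, key=lambda item: item[1], reverse=True)[:top_k]


sample_docs = {
    "r1": {"system_log": ["disk", "usage", "high"], "tags": ["intranet"]},
    "r2": {"system_log": ["cpu", "usage", "high", "high"], "tags": ["test"]},
    "r3": {"system_log": ["network", "ok"], "tags": ["intranet"]},
}
# Hypothetical weights: a tag hit counts twice as much as a log hit.
results = rank(["disk", "intranet"], sample_docs, {"system_log": 1.0, "tags": 2.0})
```

Here `r1` matches in both fields and ranks first, `r3` matches only on the tag and ranks second, and `r2` matches nothing and is not returned.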

Fig. 3 is a flowchart of a data search method according to one referenceable embodiment of the present invention. As still another embodiment of the present invention, as shown in fig. 3, the data searching method may include:

Step 301, receiving the operation and maintenance data pushed by each operation and maintenance system, or pulling the operation and maintenance data from each operation and maintenance system.

Step 302, adding a classification field and a tag field to the operation and maintenance data, writing the operation and maintenance data and the corresponding classification field and tag field into a database, creating a document according to the operation and maintenance data and the corresponding classification field and tag field, and updating the document into an index.

Step 303, receiving a search request sent by a searcher, wherein the search request carries a search statement input by a user.

Step 304, preprocessing the search statement, wherein the preprocessing comprises at least one of pinyin conversion processing, completion processing and paraphrase supplementing processing.

Step 305, extracting keywords and/or tag fields from the preprocessed search statement, and identifying user intention according to the keywords and/or the tag fields, thereby obtaining associated fields.

Compared with a traditional LIKE search, the embodiment of the invention achieves higher accuracy and flexibility by using Lucene full-text search for data searching, and can return the results that best match the user's needs by trying to understand the user's intention.

In order to find the operation and maintenance data that satisfies the given conditions, the search statement input by the user is first processed to help the user express the search intention more clearly. The processing mainly comprises the following steps:

1) Pinyin conversion processing: this step is bidirectional. For non-Chinese characters input by the user, it is checked whether they are the pinyin of a Chinese word; if such a word is found, the characters are converted into the Chinese word, which is added to the search statement. Chinese input is converted into the corresponding pinyin, so as to cope with homophones or wrongly written characters.

For example: 数据库 (database) -> shujuku

ceshi -> 测试 (test)

2) And (3) finishing the supplementing treatment: and assisting the user to complete the input of the search sentence according to the phrase extracted from the search index and the tag field. Because the operation and maintenance system is a relatively independent field, more proper nouns need to be memorized, the function can help the user to perfect input information conveniently, for example, when the user inputs Orac, the database noun of Oracle is complemented.

3) Supplementary treatment of paraphrasing: for the input search sentence, whether other phrases with similar meaning are available or not is searched in a maintained paraphrasing table, if so, the search sentence is supplemented, and the possibility of search hit is enlarged. For the application scenario of the operation and maintenance system, some nouns can be grouped into close meaning words, such as: software and applications, frameworks and architectures, etc.

4) Keyword extraction: as operation and maintenance data, there are many terms or key sentences, such as IP addresses, etc., according to a keyword table maintained in advance, an IP format, etc., keywords of a search can be extracted, and in a subsequent search, a biased search is performed according to the keywords.

5) Tag detection: the entered search statement may contain tags; if a tag is hit, data carrying that tag is searched for in the subsequent search. For example, for the search statement "information systems in the intranet test environment that are deployed at a single point", if a system carries tags such as "intranet", "test environment", "single point deployment" and "information system", it can be considered to basically satisfy the search condition.

6) Intent recognition: based on the extracted keywords and the detected tags, the user's search intent, such as the type of information system sought and the search tendency, can be inferred; this intent helps the subsequent process find data that meets the user's needs more accurately. For example, when a user searches for "systems under high performance pressure recently", the intent can be recognized, and systems with more network requests and higher saturation of resources such as CPU and memory are returned.
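Steps 1) to 5) above can be sketched roughly as follows. This is only a minimal illustration, not the implementation of the embodiment: the lookup tables (`PINYIN_TO_WORD`, `COMPLETIONS`, `SYNONYMS`, `KNOWN_TAGS`), the function name `preprocess` and the whitespace tokenization are all assumptions made for the example, and intent recognition (step 6) is left out.

```python
import re

# Hypothetical lookup tables; a real deployment would load these from
# maintained dictionaries, the search index, and the tag store.
PINYIN_TO_WORD = {"ceshi": "测试", "shujuku": "数据库"}
WORD_TO_PINYIN = {word: py for py, word in PINYIN_TO_WORD.items()}
COMPLETIONS = ["Oracle", "Redis", "Nginx"]
SYNONYMS = {"software": ["application"], "framework": ["architecture"]}
KNOWN_TAGS = ["intranet", "test environment", "single point deployment"]
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def preprocess(query: str) -> dict:
    """Expand a raw search statement into terms, keywords and tags."""
    tokens = query.split()  # whitespace split stands in for real tokenization
    expanded = list(tokens)
    for token in tokens:
        lower = token.lower()
        # 1) Pinyin conversion, in both directions.
        if lower in PINYIN_TO_WORD:
            expanded.append(PINYIN_TO_WORD[lower])
        if token in WORD_TO_PINYIN:
            expanded.append(WORD_TO_PINYIN[token])
        # 2) Auto-completion by prefix match against known phrases.
        expanded += [c for c in COMPLETIONS
                     if c.lower().startswith(lower) and c.lower() != lower]
        # 3) Synonym expansion.
        expanded += SYNONYMS.get(lower, [])
    # 4) Keyword extraction: only IP-shaped strings in this sketch.
    keywords = IP_PATTERN.findall(query)
    # 5) Tag detection: known tags that occur verbatim in the statement.
    tags = [t for t in KNOWN_TAGS if t in query.lower()]
    return {"terms": expanded, "keywords": keywords, "tags": tags}
```

The original tokens are kept and the expansions are appended, so the later search can still match what the user typed while gaining the extra candidates.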

Step 306, performing word segmentation on the search statement to obtain the segmented words.

Because of the particular nature of Chinese, good search results depend mainly on segmenting the search statement into words and on supplementing the recalled data. Because the fields of the indexed data are segmented when the index is built, the search statement must also be segmented, so that the search content matches the stored content and as much data as possible is found.
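The text does not say which word segmenter is used. As an illustration only, the sketch below uses forward maximum matching, a simple dictionary-based technique chosen here for brevity; the function name `segment`, the vocabulary `VOCAB` and the `max_len` limit are assumptions for the example. The key point from the paragraph above is that the same segmentation would be applied both to indexed fields and to the search statement.

```python
def segment(text: str, vocabulary: set, max_len: int = 4) -> list:
    """Forward maximum matching: take the longest vocabulary word at each position."""
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no vocabulary word matches.
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical vocabulary for the example.
VOCAB = {"内网", "测试", "环境", "测试环境", "数据库"}
```

With this vocabulary, `segment("内网测试环境数据库", VOCAB)` prefers the longer entry 测试环境 over 测试 followed by 环境.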

Even a good word segmenter cannot perfectly express the user's intent, because Chinese word segmentation is extremely complex. Sometimes the best result is found by not segmenting at all and searching directly with the whole sentence, that is, in the LIKE form. Therefore, in addition to the segmented search, when few results are recalled or the recall scores are low, additional means are used to expand the range of retrieved data:

1) Character-by-character search: the search statement is split into individual characters, and each character is searched in the search index. This works well when the input is short and very little data is recalled. In the scenario of searching operation and maintenance information systems, the search statement may be complex, or the original data may contain mixed content that affects the segmentation result, so that no segmentation mode readily matches the documents stored in the index. In that case, splitting the statement character by character, searching, and returning the matched results may find a suitable result.

2) Non-segmented search: following the LIKE form of a database query, the search statement is treated as one complete phrase, and data containing that phrase is searched for in the index and recalled. For search statements that mix Chinese and English or that segment poorly, this form can supplement the recalled data so that the required result is found. For example, when searching for the number or name of a function or component of an information system, which is sometimes a mixture of Chinese characters, digits and letters, searching with the complete, unsegmented search term may be more accurate.
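The two supplementary recall modes above can be sketched as follows. This is a minimal illustration under stated assumptions: documents are plain strings held in a dictionary, the fallback is triggered only by the number of results (the recall-score condition is omitted), and the names `build_index`, `char_search`, `phrase_search`, `recall` and the threshold `min_hits` are hypothetical.

```python
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Character-level inverted index: character -> ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for ch in text:
            if not ch.isspace():
                index[ch].add(doc_id)
    return index


def char_search(query: str, index: dict) -> list:
    """Character-by-character search: documents sharing any character, most shared first."""
    hits = defaultdict(int)
    for ch in set(query):
        for doc_id in index.get(ch, ()):
            hits[doc_id] += 1
    return sorted(hits, key=lambda d: (-hits[d], d))


def phrase_search(query: str, docs: dict) -> list:
    """Non-segmented search: the statement as one phrase, like SQL LIKE '%query%'."""
    return [doc_id for doc_id, text in docs.items() if query in text]


def recall(query: str, segmented_hits: list, docs: dict, index: dict, min_hits: int = 3) -> list:
    """Supplement the segmented-search results when too few documents were recalled."""
    results = list(segmented_hits)
    if len(results) < min_hits:
        for doc_id in phrase_search(query, docs) + char_search(query, index):
            if doc_id not in results:
                results.append(doc_id)
    return results
```

The recalled list is still unordered with respect to relevance; it only widens the candidate set that is ranked afterwards.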

Whichever form is used for recall, the recalled data is not yet ordered; to return the results that best fit the user's intent first, the data needs to be sorted.

Step 307, searching a plurality of matched documents in the index according to the target field, and sorting the plurality of documents.

Step 308, returning the top-ranked documents to the searcher.

In addition, the specific implementation of the data searching method in this embodiment has been described in detail above, and the description is therefore not repeated here.

Fig. 4 is a flowchart of a data search method according to another referenceable embodiment of the present invention. As another embodiment of the present invention, as shown in fig. 4, the data searching method may include:

Step 401, receiving operation and maintenance data pushed by each operation and maintenance system, or pulling the operation and maintenance data from each operation and maintenance system.

Step 402, adding a classification field and a tag field to the operation and maintenance data, writing the operation and maintenance data and its corresponding classification field and tag field into a database, creating a document according to the operation and maintenance data and its corresponding classification field and tag field, and updating the document into an index.
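Step 402 can be pictured with the following minimal sketch. It assumes an in-memory dictionary as a stand-in for the database and a simple inverted index keyed by (field, term); the field names `name` and `profile`, the function `store_and_index` and the whitespace tokenization are assumptions made for the example, not details given in the text.

```python
from collections import defaultdict

database = {}                      # stand-in for the database: id -> stored record
inverted_index = defaultdict(set)  # stand-in for the index: (field, term) -> ids


def store_and_index(record_id, ops_data, classification, tags):
    """Attach classification and tag fields, persist the record, then index it."""
    record = dict(ops_data, classification=classification, tags=list(tags))
    database[record_id] = record
    # The document holds the searchable terms of each field.
    document = {
        "name": ops_data.get("name", "").lower().split(),
        "profile": ops_data.get("profile", "").lower().split(),
        "classification": [classification.lower()],
        "tags": [t.lower() for t in tags],
    }
    for field, terms in document.items():
        for term in terms:
            inverted_index[(field, term)].add(record_id)
    return document
```

Keeping the field name in the index key records where a match came from, which is what a later field-weighted ranking would need.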

Step 403, receiving a search request sent by a searcher, where the search request carries a search sentence input by a user.

Step 404, identifying a target field from the search statement.

Step 405, for each document in the index, calculating the relevance score between each target field and each of the index fields, classification fields and tag fields in the document, and computing a weighted sum of the calculated relevance scores, thereby obtaining the BM25 value of the search statement with respect to the document.

Data sorting means adjusting the order of the data to be returned according to certain rules. The aim is to place the results that best match the search semantics, and the data most likely to be selected by the user, as near the top as possible. The focus of this step is on the ranking dimensions adopted and the ranking algorithm used. The embodiment of the invention uses the BM25 algorithm to score the recalled data; comparing the scores of the recalled data yields the relevance order.
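For reference, the commonly cited Okapi form of BM25 is reproduced below. The text does not state which BM25 variant or parameter values the embodiment uses, so this is given only as an assumption. Here f(q_i, D) is the frequency of term q_i in document D, |D| is the document length, avgdl is the average document length, N is the number of documents, n(q_i) is the number of documents containing q_i, and k_1 and b are tuning parameters.

```latex
\[
\mathrm{BM25}(Q, D) = \sum_{i=1}^{n} \mathrm{IDF}(q_i)\,
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{IDF}(q_i) = \ln\!\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
\]
```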

Optionally, the tag field has the highest weight, and that weight is greater than 1. The fields of an information system are of many kinds and are not equally important; for example, a match in the system name or system profile is clearly worth more in a search than a match in the system log or system files. The BM25 values therefore need to be calculated, weighted and summed according to the specific source field of each match before ranking. For example:

It can be seen that tags are important for representing the specific content of a system; giving a higher weight to content matched in the tags ensures that the matching documents are ranked near the top. In addition, matches in important fields such as the system name or system profile are given a positive weighting. Results that might cause confusion are moved appropriately toward the end of the list by reverse weighting.
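A field-weighted BM25 score of this kind could be computed as in the sketch below. The weight values in `FIELD_WEIGHTS` are invented for illustration (the text only says that the tag weight is the highest and greater than 1), the field names are assumptions, and the Okapi BM25 variant with k1 = 1.2 and b = 0.75 is a common default rather than anything the text specifies.

```python
import math

# Hypothetical weights: tags highest (> 1), name and profile boosted, logs reduced.
FIELD_WEIGHTS = {"tags": 3.0, "name": 2.0, "profile": 1.5, "log": 0.5}


def bm25(terms, field_tokens, corpus, k1=1.2, b=0.75):
    """Okapi BM25 of the query terms against one field; corpus is that field for all docs."""
    n_docs = len(corpus)
    avg_len = sum(len(tokens) for tokens in corpus) / n_docs if n_docs else 0.0
    score = 0.0
    for term in terms:
        tf = field_tokens.count(term)
        if tf == 0:
            continue
        df = sum(1 for tokens in corpus if term in tokens)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        norm = 1.0 - b + b * len(field_tokens) / avg_len if avg_len else 1.0
        score += idf * tf * (k1 + 1.0) / (tf + k1 * norm)
    return score


def weighted_score(terms, doc, docs):
    """Weighted sum of the per-field BM25 values for one document."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        corpus = [d.get(field, []) for d in docs]
        total += weight * bm25(terms, doc.get(field, []), corpus)
    return total
```

With these example weights, a document that matches "intranet" in its tags scores higher than one that matches the same term only in its log field, which is the ordering effect the paragraph above describes.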

Step 406, ranking the documents in descending order of their BM25 values.

Step 407, returning the top-ranked documents to the searcher.
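Steps 406 and 407 amount to a descending sort followed by a cut-off. A minimal sketch follows, assuming the scores are already available as a mapping from document id to BM25 value; the function name `top_documents` and the `limit` parameter are hypothetical, since the text does not say how many documents are returned.

```python
import heapq


def top_documents(scores: dict, limit: int = 10) -> list:
    """Return the ids of the highest-scoring documents, best first."""
    ranked = heapq.nlargest(limit, scores.items(), key=lambda item: item[1])
    return [doc_id for doc_id, _ in ranked]
```

`heapq.nlargest` avoids sorting the whole candidate set when only a few top results are needed.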

In addition, the specific implementation of the data searching method in this other embodiment has been described in detail above, and the description is therefore not repeated here.

Fig. 5 is a schematic diagram of a data search device according to an embodiment of the present invention. As shown in fig. 5, the data searching apparatus 500 includes a receiving module 501, a storage module 502, a processing module 503, and a search module 504. The receiving module 501 is configured to receive operation and maintenance data pushed by each operation and maintenance system, or to pull operation and maintenance data from each operation and maintenance system. The storage module 502 is configured to add a classification field and a tag field to the operation and maintenance data, write the operation and maintenance data and its corresponding classification field and tag field into a database, create a document according to the operation and maintenance data and its corresponding classification field and tag field, and update the document into an index. The processing module 503 is configured to receive a search request sent by a searcher, where the search request carries a search statement input by a user, and to identify a target field from the search statement. The search module 504 is configured to find a plurality of matching documents in the index according to the target field and rank the plurality of documents, so as to return the top-ranked documents to the searcher.

Optionally, the storage module 502 is further configured to:

generating an index field according to the operation and maintenance data;

Optionally, the storage module 502 is further configured to:

Optionally, the processing module 503 is further configured to:

Optionally, the search module 504 is further configured to:

The specific implementation of the data searching apparatus according to the present invention is described in detail in the above-described data searching method, and thus the description thereof will not be repeated here.

Fig. 6 illustrates an exemplary system architecture 600 to which the data searching method or data searching apparatus of embodiments of the present invention may be applied.

As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 is used as a medium to provide communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user may interact with the server 605 via the network 604 using the terminal devices 601, 602, 603 to receive or send messages, etc. Various communication client applications such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 601, 602, 603.

The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 605 may be a server providing various services, such as a background management server (by way of example only) that supports shopping websites browsed by users with the terminal devices 601, 602, 603. The background management server may analyze and otherwise process data such as a received item information query request, and feed the processing result back to the terminal device.

It should be noted that, the data searching method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the data searching device is generally disposed in the server 605.

It should be understood that the number of terminal devices, networks and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 7, there is illustrated a schematic diagram of a computer system 700 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.

As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the system 700 are also stored. The CPU 701, ROM 702, and RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.

The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is installed into the storage section 708 as necessary.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the present invention are performed when the computer program is executed by the Central Processing Unit (CPU) 701.

The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, which may, for example, be described as: a processor comprising a receiving module, a storage module, a processing module, and a search module. The names of these modules do not, in some cases, limit the modules themselves.

As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, implement the following method: receiving operation and maintenance data pushed by each operation and maintenance system, or pulling the operation and maintenance data from each operation and maintenance system; adding a classification field and a tag field to the operation and maintenance data, writing the operation and maintenance data and its corresponding classification field and tag field into a database, creating a document according to the operation and maintenance data and its corresponding classification field and tag field, and updating the document into an index; receiving a search request sent by a searcher, wherein the search request carries a search statement input by a user, and identifying a target field from the search statement; and finding a plurality of matching documents in the index according to the target field and ranking the plurality of documents, so that the top-ranked documents are returned to the searcher.

As a further aspect, embodiments of the present invention also provide a computer program product comprising a computer program which, when executed by a processor, implements the method according to any of the above embodiments.

According to the technical solution of the embodiment of the invention, a classification field and a tag field are added to the operation and maintenance data; the operation and maintenance data and its corresponding classification field and tag field are written into the database; a document is created according to the operation and maintenance data and its corresponding classification field and tag field and updated into the index; a target field is identified from the search statement; and a plurality of matching documents are found in the index according to the target field and ranked. These technical means solve the technical problem in the prior art that full-ecological operation and maintenance data cannot be searched conveniently. The embodiment of the invention can not only flexibly store a large amount of complex, multi-type data, but also accurately find the corresponding operation and maintenance data according to different requirements, so that full-ecological operation and maintenance data can be searched conveniently.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. A data search method, comprising:

2. The method of claim 1, wherein writing the operation and maintenance data and its corresponding classification field and tag field into a database comprises:

3. The method of claim 1, wherein creating a document according to the operation and maintenance data and its corresponding classification field and tag field, and updating the document into an index, comprises:

generating an index field according to the operation and maintenance data;

4. The method of claim 3, wherein generating an index field according to the operation and maintenance data comprises:

5. The method of claim 1, wherein identifying a target field from the search statement comprises:

6. The method of claim 5, wherein finding a matching plurality of documents in the index based on the target field and ranking the plurality of documents comprises:

7. A data search device, comprising:

8. The apparatus of claim 7, wherein the storage module is further configured to:

9. The apparatus of claim 7, wherein the storage module is further configured to:

generating an index field according to the operation and maintenance data;

10. The apparatus of claim 9, wherein the storage module is further configured to:

11. The apparatus of claim 7, wherein the processing module is further configured to:

12. The apparatus of claim 11, wherein the search module is further configured to:

13. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs,

the one or more processors implement the method of any of claims 1-6 when the one or more programs are executed by the one or more processors.

14. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-6.

15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-6.