CN116108062A - Data retrieval matching method, device, electronic equipment and storage medium - Google Patents



Publication number
CN116108062A
Authority
CN
China
Prior art keywords
matching
data
target
attribute
searched
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310134978.6A
Other languages
Chinese (zh)
Inventor
朱安宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lianren Healthcare Big Data Technology Co Ltd
Original Assignee
Lianren Healthcare Big Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lianren Healthcare Big Data Technology Co Ltd filed Critical Lianren Healthcare Big Data Technology Co Ltd
Priority to CN202310134978.6A priority Critical patent/CN116108062A/en
Publication of CN116108062A publication Critical patent/CN116108062A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data retrieval matching method, a data retrieval matching device, electronic equipment and a storage medium. The method is characterized by comprising the following steps: receiving an object to be retrieved; extracting attributes of the object to be searched according to a preset data block configuration item, and determining an attribute value to be searched; performing word segmentation on the attribute value to be searched according to a preset word segmentation dictionary, and determining a search attribute word segmentation set, wherein the search attribute word segmentation set comprises at least one attribute word segment to be searched; and determining target data matched with the object to be searched by searching and matching the searching attribute word segmentation set and the plurality of target data blocks. The matching target data can be quickly searched, the data searching efficiency can be improved, the data searching range can be reduced, and the data searching accuracy can be improved.

Description

Data retrieval matching method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data retrieval, and in particular, to a data retrieval matching method, apparatus, electronic device, and storage medium.
Background
Big data is now used across many industries, and users often need to find specific data within very large data sets. Traditional retrieval methods match the searched object against the massive data item by item, so the whole process is slow and results cannot be obtained quickly. In addition, the same type of data in big data often contains many records from different sources, which makes retrieval inaccurate. Taking the medicine field as an example, the complete information of a medicine can be retrieved from a medicine big data system by medicine name and medicine specification; however, because the same medicine may be produced by several manufacturers, several records exist for the same medicine, and directly matching the searched medicine against them is inefficient and inaccurate. In the prior art, therefore, target data cannot be retrieved accurately and quickly from big data, and a large amount of server computing performance is consumed.
Disclosure of Invention
The invention provides a data retrieval matching method, a device, electronic equipment and a storage medium, which are used for realizing accurate and rapid retrieval of a target object in complex and large-scale data.
According to an aspect of the present invention, there is provided a data retrieval matching method, including:
receiving an object to be retrieved;
extracting attributes of the object to be searched according to a preset data block configuration item, and determining an attribute value to be searched;
performing word segmentation on the attribute value to be searched according to a preset word segmentation dictionary, and determining a search attribute word segmentation set, wherein the search attribute word segmentation set comprises at least one attribute word segment to be searched;
and determining target data matched with the object to be searched by searching and matching the searching attribute word segmentation set and the plurality of target data blocks.
According to another aspect of the present invention, there is provided a data retrieval matching device including:
the data receiving module is used for receiving the object to be retrieved;
the attribute extraction module is used for extracting the attribute of the object to be searched according to a preset data block configuration item and determining an attribute value to be searched;
the word segmentation module is used for segmenting the attribute value to be searched according to a preset word segmentation dictionary and determining a search attribute word segmentation set, wherein the search attribute word segmentation set comprises at least one attribute word segment to be searched;
and the data retrieval matching module is used for determining target data matched with the object to be retrieved by carrying out retrieval matching on the retrieval attribute word segmentation set and the plurality of target data blocks.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data retrieval matching method of any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the data retrieval matching method according to any of the embodiments of the present invention when executed.
According to the technical scheme of the embodiment of the invention, an object to be searched is received. Attributes of the object to be searched are extracted according to a preset data block configuration item to determine an attribute value to be searched; partitioning the search object in this way extracts its effective search attributes and improves search accuracy. The attribute value to be searched is segmented according to a preset word segmentation dictionary to determine a search attribute word segmentation set; word segmentation removes noise words that affect search accuracy, narrows the search range, and improves the efficiency and accuracy of search matching. Target data matched with the object to be searched is determined by searching and matching the search attribute word segmentation set against the plurality of target data blocks; because matching is performed against target data that has already been partitioned into data blocks, the data search speed is improved and the data search range is reduced. This addresses the technical problems of low search speed and low accuracy in big data in the prior art, allows target data to be searched accurately and quickly, and saves server computing resources.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data retrieval matching method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another data retrieval matching method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of another data retrieval matching method provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data retrieval matching device according to a third embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device implementing a data retrieval matching method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
Example 1
Fig. 1 is a flowchart of a data search matching method according to an embodiment of the present invention, where the method may be performed by a data search matching device, and the data search matching device may be implemented in hardware and/or software, and the data search matching device may be configured in an electronic device. As shown in fig. 1, the method includes:
S110, receiving an object to be retrieved.
The object to be retrieved may be one or more entity objects that need to be retrieved and matched. An entity object is an input object that can be used for retrieval and typically characterizes a retrieval intent. Entity objects may take various forms, such as text or pictures. In the medical field, the input object to be retrieved may be drug-related data, for example, "drug name: red injection, manufacturer: Region B pharmaceutical industry".
Optionally, before receiving the object to be retrieved, a plurality of input modes capable of inputting the object to be retrieved are preset. For example, setting a search bar capable of inputting in a search page, and receiving an input object to be searched after detecting the input object to be searched; the method can also provide a search option in a search page, and a user can select an object to be searched according to search selection, and search after submitting the object to be searched.
Specifically, an object to be searched input by a user is received, and the object to be searched is searched.
S120, extracting the attribute of the object to be searched according to a preset data block configuration item, and determining the attribute value to be searched.
The data block configuration item may be a preset attribute used to extract attribute values from data. Optionally, before data retrieval, the data block configuration items that can distinguish the data to be retrieved are determined and set for attribute extraction. Illustratively, taking medicine data as an example, if a medicine must be identified by its medicine name, manufacturer and medicine dosage form when it is retrieved, then the medicine name, manufacturer and medicine dosage form are determined as the data block configuration items.
The attribute value to be retrieved may be an entity attribute value in the object to be retrieved that conforms to a data block configuration item. The entity attribute value describes an attribute of the object to be retrieved and may be used to determine the data matching the object to be retrieved. Illustratively, when the data block configuration items are the medicine name, the manufacturer and the medicine dosage form, the attribute values to be retrieved obtained by extracting attributes from the input object according to the data block configuration items may be "A pharmaceutical company", "B pharmaceutical company" and "injection".
Optionally, the object to be searched input by the user is unstructured data under normal conditions, and in the searching process, the unstructured data is converted into structured data by extracting the attribute value conforming to the data block configuration item from the object to be searched, so that the searching efficiency and accuracy can be effectively improved.
Specifically, a preset data block configuration item is obtained, attribute extraction is carried out according to the data block configuration item, namely entity attribute values conforming to the data block configuration item are extracted from the object to be searched, and the entity attribute values extracted by the attributes are determined to be the attribute values to be searched.
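The attribute extraction of S120 can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: it assumes the object to be retrieved arrives as comma-separated "label: value" text, and the names BLOCK_CONFIG_ITEMS and extract_attribute_values, as well as the extra "note" field in the sample input, are hypothetical.

```python
# Hypothetical data block configuration items (illustrative names only).
BLOCK_CONFIG_ITEMS = ["drug name", "manufacturer", "dosage form"]


def extract_attribute_values(raw_object, config_items):
    """Return the attribute values of raw_object that conform to the config items."""
    extracted = {}
    # Assumed input convention: comma-separated "label: value" pairs.
    for field in raw_object.split(","):
        label, sep, value = field.partition(":")
        label = label.strip().lower()
        value = value.strip()
        # Keep only labels listed in the configuration and having a value.
        if sep and label in config_items and value:
            extracted[label] = value
    return extracted


values = extract_attribute_values(
    "drug name: red injection, manufacturer: region B pharmaceutical industry, note: urgent",
    BLOCK_CONFIG_ITEMS,
)
print(values)
```

Fields that are not configured (here the hypothetical "note") are dropped, which is how the sketch turns loosely structured input into structured attribute values.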
S130, word segmentation is carried out on the attribute values to be searched according to a preset word segmentation dictionary, and a search attribute word segmentation set is determined.
The search attribute word segmentation set comprises at least one attribute word segment to be searched. The set of search attribute terms may be a set in which the attribute terms to be searched are stored. Optionally, a search attribute word segmentation queue may be established, and the attribute words to be searched are sequentially stored in the search attribute word segmentation queue.
The preset word segmentation dictionary is a dictionary set in advance for word segmentation. Different word segmentation dictionaries can be set according to the data type of the data source in which the object to be retrieved is searched: when the big data to be searched is car data, a word segmentation dictionary for car data may be used, and when it is medicine data, a word segmentation dictionary for medicine data may be used.
Alternatively, when the preset word segmentation dictionary is used for word segmentation, a word segmentation algorithm may be used for word segmentation of the attribute value to be retrieved. The word segmentation algorithm comprises at least one of a complete segmentation algorithm, a forward longest matching algorithm, a reverse longest matching algorithm, a bidirectional longest matching algorithm and the like.
Optionally, the attribute value to be searched is segmented according to a preset word segmentation dictionary, a search attribute word segmentation set for storing the attribute word segments to be searched may be preset, or after word segmentation is finished, a corresponding search attribute word segmentation set may be established according to the number of the attribute word segments to be searched. In the embodiment of the invention, the storage of the to-be-searched attribute part-words in the search attribute part-word set can be sequentially stored according to the sequence of segmenting the to-be-searched attribute part-words.
Optionally, if hot words exist in the word segmentation result after the attribute value to be searched is segmented according to the preset word segmentation dictionary, the hot words are removed, and the remaining word segmentation result is used as the attribute word segments to be searched.
Specifically, after extracting the attribute of the object to be searched, the attribute value to be searched is required to be segmented, the attribute value to be searched is segmented according to a preset word segmentation dictionary, the attribute segmentation to be searched of the attribute value to be searched is determined, and a search attribute segmentation set is determined according to the attribute segmentation to be searched.
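As an illustration of S130, the following sketch implements one of the algorithms named above, forward longest matching, followed by hot-word removal. It is a sketch under assumptions: the function name forward_longest_match, the sample dictionary, the sample hot-word set and the unspaced sample string are all hypothetical, and characters not covered by the dictionary are emitted one at a time.

```python
def forward_longest_match(text, dictionary, hot_words=frozenset()):
    """Segment text by forward longest matching, then drop hot words."""
    max_len = max([len(word) for word in dictionary] + [1])
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary word
        # is found; a single character is accepted as the fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                break
        i += size
        # Skip whitespace and hot words; keep segments in segmentation order.
        if candidate.strip() and candidate not in hot_words:
            tokens.append(candidate)
    return tokens


DICTIONARY = {"arginine", "inject", "injection", "drug"}  # hypothetical dictionary
HOT_WORDS = {"drug"}                                      # hypothetical hot words

search_words = forward_longest_match("arginineinjectiondrug", DICTIONARY, HOT_WORDS)
print(search_words)
```

Because the longest candidate is tried first, "injection" is preferred over the shorter dictionary entry "inject", and the result list preserves the order in which the segments were cut.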
S140, determining target data matched with the object to be searched by searching and matching the search attribute word segmentation set and the plurality of target data blocks.
The target data block may be a data block obtained by data block compression. Before the search attribute word segmentation set is searched and matched, the data is subjected to data block compression to obtain the target data blocks. Searching and matching against the target data blocks can improve the accuracy of search matching and reduce the probability of omissions.
Alternatively, the target data block may be structured by a data block K-V structure (key=value), where Key of the target data block may identify each target data block, and Value of the target data block may record data stored in the target data block.
Wherein the target data may be data matching the object to be retrieved. The number of the target data may be plural or one.
Optionally, the search attribute word segmentation set is searched and matched with each target data block, whether data matched with the search attribute word segmentation set exists in all target data blocks is queried, and if the data matched with the search attribute word segmentation exists, the target data matched with the object to be searched is determined.
Specifically, the search attribute word segmentation set is obtained and searched and matched against the plurality of target data blocks; the data matched with the search attribute word segmentation set is obtained and determined as the target data matched with the object to be searched.
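The K-V structure of the target data blocks can be pictured with a small sketch. The text does not spell out how data block compression is performed, so the sketch simply assumes that records sharing an attribute word are grouped into one block whose Key is that word and whose Value is the set of record ids; RECORDS and build_target_data_blocks are hypothetical names, and the sample records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical standard records: (record id, pre-segmented attribute words).
RECORDS = [
    (1, ["arginine", "injection", "district A"]),
    (2, ["arginine", "tablet", "district B"]),
    (3, ["glucose", "injection", "district A"]),
]


def build_target_data_blocks(records):
    """Group record ids under each attribute word: Key = word, Value = set of ids."""
    blocks = defaultdict(set)
    for record_id, words in records:
        for word in words:
            blocks[word].add(record_id)
    return dict(blocks)


blocks = build_target_data_blocks(RECORDS)
print(blocks["injection"])  # ids of the records stored in the "injection" block
```

With this layout, matching a search word against the blocks becomes a key lookup instead of a scan over every record.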
According to the technical scheme of the embodiment of the invention, an object to be searched is received. Attributes of the object to be searched are extracted according to a preset data block configuration item to determine an attribute value to be searched; partitioning the search object in this way extracts its effective search attributes and improves search accuracy. The attribute value to be searched is segmented according to a preset word segmentation dictionary to determine a search attribute word segmentation set; word segmentation removes noise words that affect search accuracy, narrows the search range, and improves the efficiency and accuracy of search matching. Target data matched with the object to be searched is determined by searching and matching the search attribute word segmentation set against the plurality of target data blocks; because matching is performed against target data that has already been partitioned into data blocks, the data search speed is improved and the data search range is reduced. This addresses the technical problems of low search speed and low accuracy in big data in the prior art, allows target data to be searched accurately and quickly, and saves server computing resources.
Example two
Fig. 2 is a flowchart of another data retrieval matching method according to the second embodiment of the present invention, and the relationship between the present embodiment and the above embodiment is that a specific matching process of a retrieval attribute word segmentation set and a plurality of target data blocks is described in detail. As shown in fig. 2, the data retrieval matching method includes:
S210, receiving an object to be retrieved.
S220, extracting the attribute of the object to be searched according to a preset data block configuration item, and determining the attribute value to be searched.
S230, word segmentation is carried out on the attribute values to be searched according to a preset word segmentation dictionary, and a search attribute word segmentation set is determined.
S240, determining a target matching block set by carrying out search matching on the search attribute word segmentation set and the target data blocks.
The target matching block set may be a block set obtained by matching the search attribute word segment set with a plurality of target data blocks.
Specifically, the search attribute word segmentation set is obtained and searched and matched against the plurality of target data blocks; the block set matched with the search attribute word segmentation set is obtained and determined as the target matching block set matched with the object to be searched.
Optionally, in another optional embodiment of the present invention, the determining a target matching partition set by performing search matching on the search attribute word segmentation set and the plurality of target data blocks includes:
traversing the search attribute word segmentation set, sequentially carrying out search matching on each to-be-searched attribute word segmentation and the target word segmentation attribute value of each target data block, and determining a plurality of preliminary matching data blocks corresponding to each to-be-searched attribute word segmentation to obtain a preliminary matching block set corresponding to each to-be-searched attribute word segmentation; and determining the target matching block set according to the preliminary matching block sets corresponding to the attribute segmentation words to be searched.
The target word segmentation attribute value may be an identifier of a target data block, and may be used to uniquely identify each target data block. It should be noted that, the target data block is a Key value pair structure, abbreviated as K-V structure, and the target word segmentation attribute value is used as the Key of the target data block, so as to identify the target data block.
The preliminary matching data block can be a target data block with the same target word segmentation attribute value as the attribute word to be searched. Optionally, in the process of searching and matching, matching the attribute word to be searched with the attribute value of the target word of each target data block, determining the attribute value of the target word of the target data block identical to the attribute word to be searched, and obtaining the corresponding target data block as the preliminary matching data block.
The preliminary matching block set may be the block set of a preliminary matching data block. Each target data block has a K-V structure: the target word segmentation attribute value serves as the Key, and the Value is a block set containing the data stored in the target data block. That is, the target data block stores its data in the form of a set.
Optionally, in the target data block, the target word segmentation attribute value is used as a Key of the target data block, the target data block can be identified, when the attribute word to be searched can be matched with the target word segmentation attribute value in the searching and matching process, the target data block of the target word segmentation attribute value is determined, and the block set of the target data block is obtained as the preliminary matching block set.
Optionally, when the search attribute word segmentation set is searched and matched against the target data blocks, the set is traversed: one attribute word segment to be searched is matched against each target data block, then the next attribute word segment to be searched is obtained and matched against each target data block, and so on until all attribute word segments in the search attribute word segmentation set have been searched and matched.
Specifically, a search attribute word segmentation set is obtained, the search attribute word segmentation set is traversed, to-be-searched attribute words in the search attribute word segmentation set are sequentially obtained, search matching is conducted on the to-be-searched attribute word segmentation and target word segmentation attribute values of each target data block, a plurality of preliminary matching data blocks corresponding to each to-be-searched attribute word segmentation are determined, a preliminary matching block set corresponding to each to-be-searched attribute word segmentation is obtained, and the target matching block set is determined according to the preliminary matching block sets corresponding to the to-be-searched attribute word segmentation.
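The traversal described above might look like the following sketch. It assumes the target data blocks are held in a Python dict keyed by the target word segmentation attribute value, so that comparing a search word with the Key of every block reduces to a dictionary lookup; TARGET_BLOCKS and preliminary_matching_sets are hypothetical names, and the sample data is invented.

```python
# Hypothetical target data blocks: Key = target word segmentation attribute
# value, Value = block set (here, ids of the stored records).
TARGET_BLOCKS = {
    "arginine": {1, 2},
    "injection": {1, 3},
    "district A": {1, 3},
}


def preliminary_matching_sets(search_words, target_blocks):
    """Map each search word to the block set of the block whose Key equals it."""
    result = {}
    # Traverse the search attribute word segmentation set in order.
    for word in search_words:
        # A word with no matching Key yields an empty preliminary set.
        result[word] = set(target_blocks.get(word, set()))
    return result


prelim = preliminary_matching_sets(["arginine", "injection", "capsule"], TARGET_BLOCKS)
print(prelim)
```

Each value is copied into a new set so that later set operations on the preliminary results cannot modify the stored blocks.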
Optionally, in another optional embodiment of the present invention, each to-be-searched attribute word in the search attribute word segmentation set may be searched and matched with a target data block, after searching and matching with a target data block, searching and matching the to-be-searched attribute word with a next target data block, and further searching and matching each to-be-searched attribute word with a plurality of preliminary matching data blocks corresponding to each to-be-searched attribute word.
Optionally, in another optional embodiment of the present invention, the determining the target matching block set according to a plurality of the preliminary matching block sets includes:
determining the target matching block set based on intersections of the preliminary matching block sets corresponding to the attribute segmentation words to be searched.
Optionally, in the embodiment of the present invention, the search attribute word segmentation set has a plurality of to-be-searched attribute word segments, and after search matching is performed, preliminary matching data blocks corresponding to the plurality of to-be-searched attribute word segments can be obtained, so as to determine a preliminary matching block set corresponding to the plurality of to-be-searched attribute word segments.
Specifically, a preliminary matching block set corresponding to each attribute word to be searched is obtained, intersections of the preliminary matching block sets corresponding to the attribute words to be searched are determined, and a target matching block set is determined based on the intersections of the preliminary matching block sets corresponding to the attribute words to be searched.
That is, the target matching block set is determined based on whether the same preliminary matching block set exists in the preliminary matching block sets corresponding to the different attribute segmentations to be searched.
Optionally, in another optional embodiment of the present invention, the determining the target matching block set based on intersections of the preliminary matching block sets corresponding to the attribute to be retrieved, includes:
determining a first intersection of the preliminary mark matching block sets corresponding to all the attribute segmentation words to be searched; if the first intersection is not an empty set, determining the first intersection as the target matching block set; if the first intersection sets are empty sets, respectively determining second intersection sets of every two preliminary mark matching block sets, and determining first union sets of all the second intersection sets; if the first union set is not an empty set, determining the first union set as the target matching block set; and if the first union set is an empty set, acquiring second union sets of all the preliminary target matching block sets, and determining the second union sets as the target matching block sets.
The first intersection may be an intersection of the preliminary target matching block sets corresponding to all the attribute segmentations to be retrieved. The second intersection may be an intersection of every two preliminary mark matching block sets, and the number of the second intersections may be plural. The first intersection may be the intersection of all the second intersections. The second union may be a union of all preliminary mark matching block sets. Illustratively, when the number of the preliminary mark matching block sets is 3, the set a, the set B and the set C are set a, set B and set C, respectively, the first intersection may be intersection of set a, set B and set C (a n B n C), the second intersection may be intersection of each two of set a, set B and set C (a n B, B n C and a n C), the first intersection may be intersection of each two of set a, set B and set C (a n B, B n C and a n C) ((a n B) U (a n C)), and the second intersection may be intersection of set a, set B and set C (AUBUC).
Specifically, the preliminary matching block sets corresponding to the plurality of attribute word segments to be searched are obtained, and the first intersection of the preliminary matching block sets corresponding to all the attribute word segments to be searched is determined. If the first intersection is not an empty set, the first intersection is determined as the target matching block set. If the first intersection is an empty set, the second intersection of every two preliminary matching block sets is determined, and the first union of all the second intersections is determined. If the first union is not an empty set, the first union is determined as the target matching block set; if the first union is an empty set, the second union of all the preliminary matching block sets is acquired and determined as the target matching block set.
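The three-level fallback described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention; the function name `select_target_block_set` and the use of integer index identifications are assumptions made for the example.

```python
from itertools import combinations
from typing import List, Set


def select_target_block_set(preliminary_sets: List[Set[int]]) -> Set[int]:
    """Choose the target matching block set from the preliminary sets.

    Hypothetical helper: each preliminary set holds the index
    identifications matched by one attribute word segment.
    """
    if not preliminary_sets:
        return set()

    # Level 1: first intersection (intersection of all preliminary sets).
    first_intersection = set.intersection(*preliminary_sets)
    if first_intersection:
        return first_intersection

    # Level 2: first union (union of all pairwise second intersections).
    first_union: Set[int] = set()
    for a, b in combinations(preliminary_sets, 2):
        first_union |= a & b
    if first_union:
        return first_union

    # Level 3: second union (union of all preliminary sets).
    return set().union(*preliminary_sets)
```

Each level is only reached when the stricter one yields an empty set, so the candidate range stays as narrow as the matches allow.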
S250, determining target data matched with the object to be retrieved according to the target matching block set.
Specifically, a target matching block set is obtained, and target data matched with an object to be retrieved is determined according to the target matching block set.
Optionally, in another optional implementation of the present invention, determining, according to the target matching block set, the target data matched with the object to be retrieved includes:
determining standard data matched with the object to be retrieved according to the target matching block set; and calculating the matching similarity between the object to be retrieved and the standard data through a preset similarity algorithm, and determining the target data matched with the object to be retrieved according to the matching similarity.
The standard data may be data stored in advance, for example in a database or in a storage device. Illustratively, taking drug data as an example, the standard data may be the drug standard data "arginine injection, District A Co., Ltd.".
The similarity algorithm may be an algorithm preset to calculate the similarity between the standard data and the object to be retrieved. The similarity algorithm may include at least one of a cosine similarity algorithm, a word frequency similarity algorithm, a semantic similarity algorithm, a Word2vec (word to vector) algorithm, and an unsupervised text matching algorithm.
The matching similarity may be a similarity value between the standard data and the object to be retrieved.
Optionally, the object to be searched may match a plurality of standard data. In that case, a similarity algorithm is used to calculate the similarity between the object to be searched and each of the matched standard data, yielding a matching similarity score for each standard data. The scores are ranked, and the target data is determined according to the ranking result.
Optionally, a matching similarity threshold is preset. After the matching similarity between each standard data and the object to be retrieved is determined, it is judged whether the matching similarity is greater than the matching similarity threshold; if the matching similarity of a standard data is greater than the threshold, that standard data is determined to be the target data.
Specifically, a plurality of standard data matched with the object to be searched are determined according to the target matching block set, matching similarity between the object to be searched and the plurality of standard data is calculated through a preset similarity algorithm, and target data matched with the object to be searched is determined according to the matching similarity of each standard data.
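As a hedged illustration of the scoring, thresholding and ranking steps, the sketch below uses a token-count cosine similarity, which is only one of the algorithms listed above. The function names, the token lists and the threshold values are assumptions for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine_similarity(tokens_a: List[str], tokens_b: List[str]) -> float:
    """Cosine similarity between two token lists based on token counts."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(
    query_tokens: List[str],
    candidates: Dict[int, List[str]],
    threshold: float = 0.0,
) -> List[Tuple[int, float]]:
    """Score each candidate standard data, keep scores above the
    (hypothetical) threshold, and sort from most to least similar."""
    scored = [
        (index_id, cosine_similarity(query_tokens, tokens))
        for index_id, tokens in candidates.items()
    ]
    kept = [(i, s) for i, s in scored if s > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical tokens for two candidate standard data records.
query = ["red", "injection", "region B"]
candidates = {
    3: ["arginine", "injection", "region A"],
    4: ["vitamin", "injection", "region B"],
}
ranked = rank_candidates(query, candidates)
```

Here record 4 shares two of three tokens with the query and record 3 shares one, so record 4 ranks first; raising `threshold` to 0.5 would keep record 4 only.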
In the technical scheme of this embodiment of the invention, an object to be searched is received; attributes are extracted from the object to be searched according to a preset data block configuration item to determine the attribute value to be searched; the attribute value to be searched is segmented according to a preset word segmentation dictionary to determine a search attribute word segmentation set; the search attribute word segmentation set is matched against the plurality of target data blocks to determine a target matching block set, which further narrows the range in which the target data is determined and improves the efficiency of determining it; and the target data matched with the object to be retrieved is determined according to the target matching block set. The scheme increases the data retrieval speed, narrows the data retrieval range and improves retrieval accuracy, addressing the technical problems of slow and inaccurate retrieval over big data in the prior art. Target data can be retrieved accurately and quickly, matching efficiency is further improved, and the computing resources of the server are saved.
Fig. 3 is a flowchart of another data retrieval matching method according to an embodiment of the present invention, and this embodiment describes in detail a process of performing data blocking on standard data to determine a target data block. As shown in fig. 3, the data retrieval matching method includes:
S310, acquiring standard data.
Optionally, in the embodiment of the present invention, the standard data may be stored in a database in advance, and the standard data may be obtained in the database, or may be input before the data search matching is performed.
Optionally, the standard data may include at least one of an index identification of the standard data and a data name of the standard data. Illustratively, taking drug data in a medical system as an example, the drug data includes the index identification of the drug, the drug name, the drug dosage form name, the drug specification, the manufacturer, the drug standard code and the drug source, as shown in Table 1 below:
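TABLE 1
[Table 1 appears as an image in the original publication. Its columns are: index identification, drug name, drug dosage form name, drug specification, manufacturer, drug standard code, and drug source.]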
Specifically, standard data is acquired.
S320, extracting the attributes of the standard data according to the data block configuration item, and determining the attribute value of the target block.
Optionally, before attribute extraction is performed on the standard data, the data block configuration item used for attribute extraction is determined; attribute extraction is then performed on the standard data according to the data block configuration item to determine the target block attribute value.
Taking the drug data in Table 1 as an example, when a drug needs to be retrieved, the drug is identified by the combination of drug name, manufacturer, drug dosage form and drug specification, so the drug name, manufacturer, drug dosage form and drug specification are determined as the data block configuration items. After attribute extraction is performed on the standard data with index 1 in Table 1, the extracted target block attribute values are: arginine injection, District A pharmaceutical company, injection, and 5ml:0.25g.
Specifically, a preset data block configuration item is obtained, attribute extraction is carried out according to the data block configuration item, namely entity attribute values conforming to the data block configuration item are extracted from standard data, and the entity attribute values extracted by the attribute are determined to be target block attribute values.
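A minimal sketch of attribute extraction driven by configuration items is given below. The field names (`drug_name`, `manufacturer`, and so on) and the record layout are hypothetical; the original does not specify how standard data is stored.

```python
from typing import Dict, List


def extract_block_attributes(
    record: Dict[str, str], config_items: List[str]
) -> Dict[str, str]:
    """Keep only the non-empty attributes named by the configuration items."""
    return {item: record[item] for item in config_items if record.get(item)}


# Hypothetical record modeled on the row with index 1 described above.
record = {
    "index": "1",
    "drug_name": "arginine injection",
    "dosage_form": "injection",
    "specification": "5ml:0.25g",
    "manufacturer": "District A pharmaceutical company",
    "source": "",
}
config_items = ["drug_name", "manufacturer", "dosage_form", "specification"]
target_block_attributes = extract_block_attributes(record, config_items)
```

The same function could be applied to an object to be retrieved, so that queries and standard data are reduced to the same set of attributes before word segmentation.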
S330, word segmentation is carried out on the target block attribute value according to the preset word segmentation dictionary, and the target word segmentation attribute value is determined.
Specifically, after extracting the attributes of the standard data, the target block attribute value needs to be segmented, the target block attribute value is segmented according to a preset word segmentation dictionary, and the target word segmentation attribute value of the target block attribute value is determined.
Taking the drug data of Table 1 as an example, the extracted drug name and manufacturer are segmented using the preset word segmentation dictionary, and the resulting word segments are shown in Table 2 below:
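TABLE 2
[Table 2 appears as an image in the original publication. It lists the word segmentation results for the drug name and the manufacturer.]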
The hot words "limited" and "company" appear in the word segmentation result of the manufacturer. These hot words can be removed to obtain the target word segmentation attribute values of the drug name and the manufacturer.
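Dictionary-based segmentation followed by hot word removal can be sketched as below. The original does not name a segmentation algorithm; forward maximum matching is used here purely as an assumption, and the dictionary and hot word lists are hypothetical.

```python
from typing import AbstractSet, List


def segment(
    text: str,
    dictionary: AbstractSet[str],
    hot_words: AbstractSet[str] = frozenset(),
) -> List[str]:
    """Forward maximum matching: at each position take the longest
    dictionary entry; characters with no match become single tokens.
    Hot words are dropped from the result."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # whitespace only separates tokens
            continue
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return [token for token in tokens if token not in hot_words]


# Hypothetical dictionary and hot words for a manufacturer name.
dictionary = {"region A", "pharmaceutical", "limited", "company"}
hot_words = {"limited", "company"}
manufacturer_tokens = segment(
    "region A pharmaceutical limited company", dictionary, hot_words
)
```

Removing hot words before indexing keeps very common tokens from producing large, uninformative block sets.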
S340, constructing a plurality of target data blocks corresponding to the standard data according to the target word segmentation attribute values.
Specifically, the target word segmentation attribute values are obtained, and the plurality of target data blocks corresponding to the standard data are constructed according to the target word segmentation attribute values.
Optionally, each target data block has a key-value (K-V) structure: the target word segmentation attribute value is used as the Key of the target data block, and the Value of the target data block is a block set that stores the index identifications corresponding to that target word segmentation attribute value. The standard data corresponding to an index identification can then be determined from the index identification.
Illustratively, taking the drug data of table 1 as an example, the following table 3 shows the target data block for constructing standard data according to the target word segmentation attribute value:
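TABLE 3
[Table 3 appears as an image in the original publication. Each target data block pairs a target word segmentation attribute value (Key) with a block set of index identifications (Value).]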
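The K-V structure of the target data blocks resembles an inverted index and could be built roughly as follows. The function name and the sample tokens are assumptions; the actual contents of Table 3 are not reproduced.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_target_data_blocks(
    segmented_records: Dict[int, List[str]]
) -> Dict[str, Set[int]]:
    """Map each target word segmentation attribute value (Key) to the
    block set of index identifications that contain it (Value)."""
    blocks: Dict[str, Set[int]] = defaultdict(set)
    for index_id, tokens in segmented_records.items():
        for token in tokens:
            blocks[token].add(index_id)
    return dict(blocks)


# Hypothetical segmented standard data, keyed by index identification.
segmented_records = {
    3: ["arginine", "injection", "region A"],
    4: ["vitamin", "injection", "region B"],
}
blocks = build_target_data_blocks(segmented_records)
```

Because the blocks are built once from the standard data, they can be loaded into memory and reused for every retrieval request.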
In the technical scheme of this embodiment of the invention, standard data is acquired; attributes are extracted from the standard data according to the data block configuration item to determine the target block attribute value, so that the effective retrieval attributes in the standard data are extracted by blocking the standard data, which can improve retrieval accuracy; the target block attribute value is segmented according to the preset word segmentation dictionary to determine the target word segmentation attribute value, which further narrows the range in which the target data is determined and improves the efficiency of determining it; and a plurality of target data blocks corresponding to the standard data are constructed according to the target word segmentation attribute values. The scheme increases the data retrieval speed, narrows the data retrieval range and improves retrieval accuracy, addressing the technical problems of slow and inaccurate retrieval over big data in the prior art. Target data can be retrieved accurately and quickly, matching efficiency is further improved, and the computing resources of the server are saved.
Optionally, the embodiment of the invention discloses another data retrieval matching method, wherein the method comprises the following steps:
S1, data block compression. Standard data is acquired; attributes are extracted from the standard data according to the data block configuration item to determine the target block attribute value; the target block attribute value is segmented according to the preset word segmentation dictionary to determine the target word segmentation attribute value; and a plurality of target data blocks corresponding to the standard data are constructed according to the target word segmentation attribute values.
S2, block retrieval matching. Receiving an object to be retrieved; extracting attributes of the object to be searched according to a preset data block configuration item, and determining an attribute value to be searched; performing word segmentation on the attribute value to be searched according to a preset word segmentation dictionary, and determining a search attribute word segmentation set; the search attribute word segmentation set and a plurality of target data blocks are subjected to search matching, so that a target matching block set is determined; and determining target data matched with the object to be retrieved according to the target matching block set.
Illustratively, when block retrieval matching begins, the plurality of target data blocks generated in S1 are loaded into memory; Table 3 gives an example of the target data blocks. At retrieval time, "medicine name: red injection, manufacturer: region B pharmaceutical industry" is taken as the object to be searched. Attribute extraction is performed on the object to be searched, and the attribute values to be searched are determined as medicine name: red injection and manufacturer: region B pharmaceutical industry. Word segmentation is performed on the attribute values to be searched to obtain the search attribute word segmentation set {'red', 'injection', 'liquid', 'region B', 'pharmaceutical industry', etc.}. The search attribute word segmentation set is then matched in memory against the plurality of target data blocks to obtain the preliminary matching block sets, as shown in Table 4 below:
TABLE 4
[Table 4 appears as an image in the original publication. It lists the preliminary matching block set obtained for each attribute word segment to be searched.]
The target matching block set is determined to be [3,4] according to the intersection of the plurality of preliminary matching block sets. According to the index identifications in the target matching block set, the corresponding standard data in Table 1 are queried: "arginine injection, 5ml:0.25g, region A pharmaceutical Co., Ltd., code C" and "vitamin injection, injection, 2ml:0.1g, region B high yam industry Co., Ltd., code D". Similarity calculation is then performed between the object to be searched, "medicine name: red injection, manufacturer: region B pharmaceutical industry", and the two standard data to determine the matching similarity with each. After ranking, the target data obtained is "vitamin injection, injection, 2ml:0.1g, region B high yam industry Co., Ltd., code D".
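The lookup step of this example, in which each attribute word segment is matched against the in-memory target data blocks, might look like the sketch below. The block contents are hypothetical stand-ins for Table 3, and skipping word segments with no matching block is an assumption rather than something the original states.

```python
from typing import Dict, List, Set


def preliminary_matches(
    query_tokens: List[str], blocks: Dict[str, Set[int]]
) -> Dict[str, Set[int]]:
    """Return the preliminary matching block set for each query token
    that has a target data block (tokens without one are skipped)."""
    return {
        token: set(blocks[token]) for token in query_tokens if token in blocks
    }


# Hypothetical in-memory target data blocks (Key -> index identifications).
blocks = {
    "injection": {1, 3, 4},
    "liquid": {1, 3, 4},
    "region B": {4, 5},
    "vitamin": {4},
}
query_tokens = ["red", "injection", "liquid", "region B"]

prelim = preliminary_matches(query_tokens, blocks)
# Intersect the preliminary sets to narrow the candidates.
target_set = set.intersection(*prelim.values()) if prelim else set()
```

With these sample blocks only index 4 appears in every preliminary set, so similarity scoring would run on a single candidate instead of the whole standard data set.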
The technical scheme of this embodiment of the invention increases the data retrieval speed and narrows the data retrieval range, effectively improves the accuracy of data retrieval, and addresses the technical problems of slow and inaccurate retrieval over big data in the prior art. Target data can be retrieved accurately and quickly, the retrieval matching range is reduced, matching efficiency is further improved, and the computing resources of the server are saved.
Example III
Fig. 4 is a schematic structural diagram of a data retrieval matching device according to a third embodiment of the present invention. As shown in fig. 4, the apparatus includes: a data receiving module 410, an attribute extraction module 420, a word segmentation module 430, and a data retrieval matching module 440, wherein,
a data receiving module 410, configured to receive an object to be retrieved;
the attribute extraction module 420 is configured to perform attribute extraction on the object to be retrieved according to a preset data block configuration item, and determine an attribute value to be retrieved;
the word segmentation module 430 is configured to segment the attribute value to be searched according to a preset word segmentation dictionary, and determine a search attribute word segmentation set, where the search attribute word segmentation set includes at least one attribute word segment to be searched;
the data retrieval matching module 440 is configured to determine target data that matches the object to be retrieved by performing retrieval matching on the retrieval attribute word segmentation set and the plurality of target data blocks.
In the technical scheme of this embodiment of the invention, an object to be searched is received; attributes are extracted from the object to be searched according to a preset data block configuration item to determine the attribute value to be searched, so that the effective search attributes in the object are extracted by blocking it, which improves search accuracy; the attribute value to be searched is segmented according to a preset word segmentation dictionary to determine a search attribute word segmentation set, which removes noise words that affect search accuracy, narrows the search range, and improves the efficiency and accuracy of search matching; and the target data matched with the object to be searched is determined by matching the search attribute word segmentation set against the plurality of target data blocks, that is, against standard data that has already been blocked. The scheme increases the data retrieval speed, narrows the data retrieval range and improves retrieval accuracy, addressing the technical problems of slow and inaccurate retrieval over big data in the prior art. Target data can be retrieved accurately and quickly, and the computing resources of the server are saved.
Optionally, the data retrieval matching module is specifically configured to:
the search attribute word segmentation set and a plurality of target data blocks are subjected to search matching, so that a target matching block set is determined;
and determining target data matched with the object to be retrieved according to the target matching block set.
Optionally, the data retrieval matching module is specifically further configured to:
traversing the search attribute word segmentation set, sequentially carrying out search matching on each to-be-searched attribute word segmentation and the target word segmentation attribute value of each target data block, and determining a plurality of preliminary matching data blocks corresponding to each to-be-searched attribute word segmentation to obtain a preliminary matching block set corresponding to each to-be-searched attribute word segmentation;
and determining the target matching block set according to the preliminary matching block sets corresponding to the attribute segmentation words to be searched.
Optionally, the data retrieval matching module is specifically further configured to:
and determining the target matching block set based on intersections of the preliminary matching block sets corresponding to the attribute segmentation words to be searched.
Optionally, the data retrieval matching module is specifically further configured to:
determining a first intersection of the preliminary matching block sets corresponding to all the attribute word segments to be searched;
if the first intersection is not an empty set, determining the first intersection as the target matching block set;
if the first intersection is an empty set, determining a second intersection of every two preliminary matching block sets, and determining a first union of all the second intersections;
if the first union is not an empty set, determining the first union as the target matching block set;
and if the first union is an empty set, acquiring a second union of all the preliminary matching block sets, and determining the second union as the target matching block set.
Optionally, the data retrieval matching module is specifically further configured to:
determining standard data matched with the object to be retrieved according to the target matching block set;
and calculating the matching similarity of the object to be searched and the standard data through a preset similarity algorithm, and determining the target data matched with the object to be searched according to the matching similarity.
Optionally, the data retrieval matching device further comprises a standard data acquisition module and a data block construction module;
the standard data acquisition module is used for acquiring standard data;
The attribute extraction module is further used for extracting attributes of the standard data according to the data block configuration item and determining a target block attribute value;
the word segmentation module is used for segmenting the target block attribute value according to the preset word segmentation dictionary and determining a target word segmentation attribute value;
the data block construction module is used for constructing a plurality of target data blocks corresponding to the standard data according to the target word segmentation attribute values.
The data retrieval matching device provided by the embodiment of the invention can execute the data retrieval matching method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM12 and the RAM13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the data retrieval matching method.
In some embodiments, the data retrieval matching method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM12 and/or the communication unit 19. When the computer program is loaded into RAM13 and executed by processor 11, one or more of the steps of the data retrieval matching method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the data retrieval matching method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
Example V
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data retrieval matching method steps as provided by any embodiment of the present invention, the method comprising:
receiving an object to be retrieved;
extracting attributes of the object to be searched according to a preset data block configuration item, and determining an attribute value to be searched;
performing word segmentation on the attribute value to be searched according to a preset word segmentation dictionary, and determining a search attribute word segmentation set, wherein the search attribute word segmentation set comprises at least one attribute word segment to be searched;
and determining target data matched with the object to be searched by searching and matching the searching attribute word segmentation set and the plurality of target data blocks.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium may be, for example, but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Those of ordinary skill in the art will appreciate that the modules or steps of the invention described above may be implemented on general-purpose computing devices. They may be centralized on a single computing device or distributed over a network of computing devices. Alternatively, they may be implemented in program code executable by a computing device, so that they are stored in a memory device and executed by the computing device; they may be fabricated separately as individual integrated circuit modules; or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A data retrieval matching method, comprising:
receiving an object to be retrieved;
extracting attributes of the object to be retrieved according to a preset data block configuration item, and determining an attribute value to be searched;
performing word segmentation on the attribute value to be searched according to a preset word segmentation dictionary, and determining a search attribute word segmentation set, wherein the search attribute word segmentation set comprises at least one attribute word segment to be searched;
and determining target data matching the object to be retrieved by performing search matching between the search attribute word segmentation set and a plurality of target data blocks.
2. The method according to claim 1, wherein determining the target data matching the object to be retrieved by performing search matching between the search attribute word segmentation set and the plurality of target data blocks comprises:
performing search matching between the search attribute word segmentation set and the plurality of target data blocks to determine a target matching block set;
and determining target data matched with the object to be retrieved according to the target matching block set.
3. The method of claim 2, wherein determining the target matching block set by performing search matching between the search attribute word segmentation set and the plurality of target data blocks comprises:
traversing the search attribute word segmentation set, performing search matching in turn between each attribute word segment to be searched and the target word segmentation attribute value of each target data block, and determining a plurality of preliminary matching data blocks corresponding to each attribute word segment to be searched, to obtain a preliminary matching block set corresponding to each attribute word segment to be searched;
and determining the target matching block set according to the preliminary matching block sets corresponding to the plurality of attribute word segments to be searched.
4. The method according to claim 3, wherein determining the target matching block set according to the preliminary matching block sets corresponding to the plurality of attribute word segments to be searched comprises:
determining the target matching block set based on intersections of the preliminary matching block sets corresponding to the plurality of attribute word segments to be searched.
5. The method of claim 4, wherein determining the target matching block set based on intersections of the preliminary matching block sets corresponding to the plurality of attribute word segments to be searched comprises:
determining a first intersection of the preliminary matching block sets corresponding to all the attribute word segments to be searched;
if the first intersection is not an empty set, determining the first intersection as the target matching block set;
if the first intersection is an empty set, determining a second intersection of each pair of preliminary matching block sets, and determining a first union of all the second intersections;
if the first union is not an empty set, determining the first union as the target matching block set;
and if the first union is an empty set, obtaining a second union of all the preliminary matching block sets, and determining the second union as the target matching block set.
6. The method according to claim 2, wherein determining the target data matching the object to be retrieved according to the target matching block set comprises:
determining standard data matched with the object to be retrieved according to the target matching block set;
and calculating a matching similarity between the object to be retrieved and the standard data using a preset similarity algorithm, and determining the target data matching the object to be retrieved according to the matching similarity.
7. The method of claim 2, wherein, before performing search matching between the search attribute word segmentation set and the plurality of target data blocks, the method further comprises:
obtaining standard data;
extracting attributes of the standard data according to the data block configuration items, and determining target block attribute values;
performing word segmentation on the target block attribute value according to the preset word segmentation dictionary, and determining a target word segmentation attribute value;
and constructing a plurality of target data blocks corresponding to the standard data according to the target word segmentation attribute values.
8. A data retrieval matching device, comprising:
the data receiving module is used for receiving the object to be retrieved;
the attribute extraction module is used for extracting the attributes of the object to be retrieved according to a preset data block configuration item and determining an attribute value to be searched;
the word segmentation module is used for segmenting the attribute value to be searched according to a preset word segmentation dictionary and determining a search attribute word segmentation set, wherein the search attribute word segmentation set comprises at least one attribute word segment to be searched;
and the data retrieval matching module is used for determining target data matching the object to be retrieved by performing search matching between the search attribute word segmentation set and a plurality of target data blocks.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data retrieval matching method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the data retrieval matching method of any one of claims 1-7.
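As an informal illustration of the fallback logic recited in claim 5 and the similarity step recited in claim 6, the following Python sketch may help. It is a reading of the claim language under stated assumptions, not the applicant's implementation: blocks are represented by hashable identifiers, the "preset similarity algorithm" is stood in for by `difflib.SequenceMatcher` from the Python standard library, and the function names and the `threshold` parameter are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def target_matching_block_set(preliminary_sets):
    """Select the target matching block set with the three-stage fallback of claim 5."""
    sets = [set(s) for s in preliminary_sets]
    if not sets:
        return set()

    # Stage 1: first intersection of all preliminary matching block sets.
    first_intersection = set.intersection(*sets)
    if first_intersection:
        return first_intersection

    # Stage 2: first union of the second (pairwise) intersections.
    first_union = set()
    for left, right in combinations(sets, 2):
        first_union |= left & right
    if first_union:
        return first_union

    # Stage 3: second union of all preliminary matching block sets.
    return set().union(*sets)


def best_match(query_text, candidates, threshold=0.6):
    """Pick the most similar standard record, or None if below the threshold.

    candidates maps a record id to its standard text. SequenceMatcher is only
    an assumed stand-in for the unspecified preset similarity algorithm.
    """
    best_id, best_score = None, -1.0
    for record_id, text in candidates.items():
        score = SequenceMatcher(None, query_text, text).ratio()
        if score > best_score:
            best_id, best_score = record_id, score
    if best_id is None or best_score < threshold:
        return None
    return best_id, best_score
```

The staged fallback trades precision for recall: the full intersection is the strictest filter, the union of pairwise intersections admits blocks shared by at least two word segments, and the plain union admits any block hit by at least one word segment. The similarity step then ranks whatever candidates survive.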
CN202310134978.6A 2023-02-20 2023-02-20 Data retrieval matching method, device, electronic equipment and storage medium Pending CN116108062A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310134978.6A CN116108062A (en) 2023-02-20 2023-02-20 Data retrieval matching method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310134978.6A CN116108062A (en) 2023-02-20 2023-02-20 Data retrieval matching method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116108062A true CN116108062A (en) 2023-05-12

Family

ID=86255997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310134978.6A Pending CN116108062A (en) 2023-02-20 2023-02-20 Data retrieval matching method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116108062A (en)

Similar Documents

Publication Publication Date Title
CN113590645B (en) Searching method, searching device, electronic equipment and storage medium
JP2022191412A (en) Method for training multi-target image-text matching model and image-text retrieval method and apparatus
CN112989235B (en) Knowledge base-based inner link construction method, device, equipment and storage medium
US20220358178A1 (en) Data query method, electronic device, and storage medium
CN113032673A (en) Resource acquisition method and device, computer equipment and storage medium
CN112560461A (en) News clue generation method and device, electronic equipment and storage medium
CN113408660B (en) Book clustering method, device, equipment and storage medium
CN113901214B (en) Method and device for extracting form information, electronic equipment and storage medium
CN113722600B (en) Data query method, device, equipment and product applied to big data
CN112699237B (en) Label determination method, device and storage medium
CN113408280A (en) Negative example construction method, device, equipment and storage medium
CN116340518A (en) Text association matrix establishment method and device, electronic equipment and storage medium
CN116166814A (en) Event detection method, device, equipment and storage medium
CN116108062A (en) Data retrieval matching method, device, electronic equipment and storage medium
CN108009233B (en) Image restoration method and device, computer equipment and storage medium
CN116089459B (en) Data retrieval method, device, electronic equipment and storage medium
CN115511014B (en) Information matching method, device, equipment and storage medium
CN112818167B (en) Entity retrieval method, entity retrieval device, electronic equipment and computer readable storage medium
CN116737520B (en) Data braiding method, device and equipment for log data and storage medium
CN115795023B (en) Document recommendation method, device, equipment and storage medium
CN115828915B (en) Entity disambiguation method, device, electronic equipment and storage medium
CN113268987A (en) Entity name identification method and device, electronic equipment and storage medium
CN115329748A (en) Log analysis method, device, equipment and storage medium
CN117216398A (en) Enterprise recommendation method, device, equipment and medium
CN114116914A (en) Entity retrieval method and device based on semantic tag and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination