CN114117168A - Label matching method, device, equipment and computer storage medium - Google Patents
- Publication number
- CN114117168A (application number CN202111475736.0A)
- Authority
- CN
- China
- Prior art keywords
- node
- propagation
- label
- score
- determining
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The embodiments of the application provide a label matching method, a label matching device, electronic equipment and a computer storage medium. The method comprises the following steps: determining an alternative node set from the nodes associated with each label in a label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node; determining whether an irrelevant label meeting a preset propagation condition exists among the labels associated with each node in the ith gradient propagation node; when such a label exists, adding the irrelevant label meeting the preset propagation condition to the label group to be matched, and taking the non-alternative nodes associated with that irrelevant label as the (i+1)th gradient propagation node; when no such label exists, determining the propagation score of each label associated with each node in the ith gradient propagation node; and determining the final score of each alternative node in the first gradient propagation node according to the propagation scores of the labels associated with each node in the ith gradient propagation node.
Description
Technical Field
The present application relates to the field of cloud computing technology of financial technology (Fintech), and in particular, to a tag matching method, apparatus, electronic device, and computer storage medium.
Background
With the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually changing to financial technology, but higher requirements are also put forward on the technologies due to the requirements of the financial industry on safety and real-time performance.
In a typical business application system that uses a tag system for data matching, two approaches are common. To achieve real-time response, most data matching is done with exact or fuzzy search algorithms, such as inverted-index queries. To improve the generalization of matching, tags are pre-scored by combining a matrix model with belief propagation; however, the matrix model is slow to compute and must be evaluated over the global tag network. If the matrix model is computed in real time, every tag update or insertion triggers N rounds of iterative training until global convergence is reached, which increases the complexity of the tag matching calculation.
Disclosure of Invention
The application provides a tag matching method, a tag matching device, electronic equipment and a computer storage medium, which can solve the problem of large calculation amount in tag matching in the related art.
The technical scheme of the application is realized as follows:
the embodiment of the application provides a label matching method, which comprises the following steps:
receiving a tag matching request, wherein the tag matching request comprises a tag group to be matched; determining an alternative node set from nodes associated with each label in the label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node;
determining whether an irrelevant label meeting a preset propagation condition exists in each label associated with each node in the ith gradient propagation node; the unrelated tags represent tags not included in the set of tags to be matched; i is an integer of 1 or more;
when it is determined that an irrelevant label meeting the preset propagation condition exists, adding the irrelevant label meeting the preset propagation condition to the label group to be matched, and taking the non-alternative nodes associated with that irrelevant label as the (i+1)th gradient propagation node;
when it is determined that no such irrelevant label exists and the value of i is greater than 1, determining the propagation score of each label associated with each node in the ith gradient propagation node; determining a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node; the final score reflects the degree of matching between the label group to be matched and each alternative node.
In some embodiments, the determining a final score for each candidate node in the first gradient propagation node according to the propagation scores of the respective labels associated with each candidate node in the ith gradient propagation node comprises:
when the value of i is greater than 1 and it is determined that no irrelevant tag meeting the preset propagation condition exists among the tags associated with each node in the ith gradient propagation node, determining the propagation score of each tag associated with each node in the (i-1)th gradient propagation node according to the propagation score of each tag associated with each node in the ith gradient propagation node;
repeating this step until the propagation score of each label associated with each alternative node in the first gradient propagation node is obtained; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each candidate node in the first gradient propagation node.
It can be seen that, in the embodiment of the present application, in the process of performing label matching, according to the propagation scores of the labels associated with each node in the current gradient propagation node, the propagation scores of the labels associated with each node in the previous gradient are determined, that is, the final score of each candidate node is determined in a back propagation manner, so that the accuracy of the matching result can be ensured.
In some embodiments, the method further comprises:
when different nodes in the ith gradient propagation node are determined to be associated with the same target label, determining a propagation score of each node in the different nodes associated with the same target label;
selecting the maximum propagation score of the target label from the propagation scores of each node in the different nodes related to the same target label;
and determining the propagation score of each label associated with each node in the (i-1)th gradient propagation node according to the maximum propagation score of the target label.
It can be seen that in the embodiment of the present application, when different nodes in a certain gradient propagation node are associated with the same label, multiple propagation scores corresponding to the label can be obtained, and the propagation scores can indicate the importance of the label relative to the nodes; at this time, the maximum propagation score is selected as the propagation score of the label, so that the accuracy of the subsequent matching result can be further improved.
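The selection rule described above can be sketched as follows. This is only a minimal illustration: the function name `max_propagation_scores` and the `(node, label, score)` triple layout are hypothetical and are not taken from the application.

```python
def max_propagation_scores(node_label_scores):
    """Keep, for every label, the largest propagation score reported by any node.

    node_label_scores: iterable of (node_id, label, score) triples for the
    nodes of one gradient (hypothetical data layout).
    """
    best = {}
    for _node_id, label, score in node_label_scores:
        # A label shared by several nodes keeps only its maximum score.
        if label not in best or score > best[label]:
            best[label] = score
    return best


# Illustrative values only: two nodes share the label "loan".
scores = [("n1", "loan", 0.4), ("n2", "loan", 0.9), ("n2", "credit", 0.2)]
result = max_propagation_scores(scores)
```

Here the shared label keeps the larger of its two scores, while a label seen on a single node keeps its only score.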
In some embodiments, the method further comprises:
determining an initial score of each label associated with each candidate node in the first gradient propagation node;
if it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, determining the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node as the final score of that candidate node.
It can be seen that, in the embodiment of the present application, when it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, the current propagation network does not need to be extended further. In this case, the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node is taken as the final score of that candidate node, which can greatly reduce the calculation amount of the label matching algorithm.
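A minimal sketch of this shortcut is given below, assuming the initial score of every label is already known. The function name, the label names and the numeric values are hypothetical and serve only as an illustration.

```python
def final_scores_without_propagation(candidate_labels, initial_scores):
    """Sum the initial scores of the labels attached to each candidate node."""
    return {
        node: sum(initial_scores[label] for label in labels)
        for node, labels in candidate_labels.items()
    }


# Hypothetical data: two candidate nodes and illustrative initial scores.
initial_scores = {"core_tag": 2.0, "suitable_tag": 1.0, "optional_tag": 0.3}
candidate_labels = {
    "node_a": ["core_tag", "suitable_tag"],
    "node_b": ["core_tag", "suitable_tag", "optional_tag"],
}
final = final_scores_without_propagation(candidate_labels, initial_scores)
```

No further gradient is visited in this case, so the cost is linear in the number of labels attached to the candidate nodes.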
In some embodiments, the determining whether there is an irrelevant label meeting a preset propagation condition in the labels associated with each of the ith gradient propagation node includes:
if it is determined that the ratio of the number of nodes in the ith gradient propagation node that are associated with the irrelevant label to the total number of nodes in the ith gradient propagation node is greater than a set threshold, determining that an irrelevant label meeting the preset propagation condition exists;
and if it is determined that the ratio of the number of nodes in the ith gradient propagation node that are associated with the irrelevant label to the total number of nodes in the ith gradient propagation node is less than or equal to the set threshold, determining that no irrelevant label meeting the preset propagation condition exists.
It can be seen that, in the embodiment of the application, by setting the preset propagation condition, some implicit irrelevant labels which can affect the matching result can be better found for each gradient propagation node, and the iteration times of the label matching algorithm are reduced.
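The ratio test above can be sketched as follows. The function name `labels_meeting_propagation_condition`, the dictionary layout and the threshold values are assumptions made for illustration; the application does not prescribe a data structure.

```python
def labels_meeting_propagation_condition(gradient_nodes, node_labels,
                                         labels_to_match, threshold):
    """Return the irrelevant labels whose node ratio exceeds the threshold.

    gradient_nodes:  nodes of the i-th gradient
    node_labels:     mapping node -> labels associated with that node
    labels_to_match: labels currently in the group to be matched
    """
    total = len(gradient_nodes)
    if total == 0:
        return set()
    counts = {}
    for node in gradient_nodes:
        for label in set(node_labels[node]):
            # Only labels outside the group to be matched are "irrelevant".
            if label not in labels_to_match:
                counts[label] = counts.get(label, 0) + 1
    # Strictly greater than the threshold, as stated in the text.
    return {label for label, c in counts.items() if c / total > threshold}


node_labels = {"n1": {"a", "x"}, "n2": {"a", "x", "y"}, "n3": {"a"}}
found = labels_meeting_propagation_condition(
    ["n1", "n2", "n3"], node_labels, {"a"}, threshold=0.5)
```

In this example `x` is attached to two of three nodes and qualifies, while `y` is attached to one of three and does not.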
In some embodiments, the determining a set of alternative nodes from the nodes associated with each tag in the set of tags to be matched includes:
determining at least one label meeting a preset pruning condition from the label group to be matched;
and taking the node associated with each label in the at least one label as a candidate node and putting the candidate node into a candidate node set.
It can be seen that, in the embodiment of the application, by presetting pruning conditions, tags meeting actual requirements are selected from tags in the tag group to be matched for subsequent matching, and a certain amount of calculation can be reduced without affecting matching results.
In some embodiments, the determining a final score for each candidate node in the first gradient propagation node according to the propagation scores of the respective labels associated with each candidate node in the ith gradient propagation node comprises:
determining an initial score of each label associated with each node in the ith gradient propagation node under the condition that the value of i is greater than 1;
and determining a final score of each candidate node in the first gradient propagation node according to the propagation score and the initial score of each label associated with each node in the ith gradient propagation node.
It can be seen that, in the embodiment of the present application, the final score of each candidate node in the first gradient propagation node is jointly determined by using the respective label propagation score and the initial score associated with each node in the other gradient propagation nodes, so that the generalization of the matching result can be improved.
In some embodiments, the method further comprises:
pre-determining the characteristic attribute and the excitation value of each label in the label group to be matched;
when the value of i is greater than or equal to 1 and a label associated with a node in the ith gradient propagation node is determined to be a label in the label group to be matched, determining the initial score of that label according to the excitation value of the label and the occurrence frequency of its characteristic attribute in the label group to be matched;
and when a label associated with a node in the ith gradient propagation node is determined not to be a label in the label group to be matched, taking a preset value as the initial score of that label.
It can be seen that, in the embodiment of the present application, the initial score of each tag can be determined in a targeted manner by traversing the determination result whether each tag associated with each node in the current gradient propagation node is a tag in the tag group to be matched, so as to ensure the validity of the final score result.
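The two branches above can be sketched as follows. The text does not state how the excitation value and the occurrence frequency are combined, so the product used as the default `combine` function is purely an assumption, as are the preset value `DEFAULT_SCORE`, the function name and the sample data.

```python
DEFAULT_SCORE = 0.1  # hypothetical preset value for labels outside the group


def initial_score(label, group_attributes, excitation,
                  combine=lambda value, freq: value * freq,
                  default=DEFAULT_SCORE):
    """Initial score of one label.

    group_attributes: mapping label -> characteristic attribute, for the
                      labels in the group to be matched
    excitation:       mapping characteristic attribute -> excitation value
    combine:          ASSUMED combination rule (not given in the source)
    """
    if label not in group_attributes:
        # Label is not in the group to be matched: use the preset value.
        return default
    attribute = group_attributes[label]
    # Occurrence frequency of this attribute within the group to be matched.
    frequency = sum(1 for a in group_attributes.values() if a == attribute)
    return combine(excitation[attribute], frequency)


group_attributes = {"t1": "Core", "t2": "Optional", "t3": "Optional"}
excitation = {"Core": 2.0, "Optional": 0.3}
score_core = initial_score("t1", group_attributes, excitation)
score_optional = initial_score("t2", group_attributes, excitation)
score_other = initial_score("unknown", group_attributes, excitation)
```

Passing a different `combine` function lets the same sketch cover other ways of weighting the frequency.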
The embodiment of the application also provides a label matching device, which comprises a first determining module and a second determining module, wherein,
the first determining module is configured to receive a tag matching request, wherein the tag matching request comprises a tag group to be matched; determine an alternative node set from the nodes associated with each label in the label group to be matched, and take each alternative node in the alternative node set as a first gradient propagation node;
the second determining module is configured to determine whether an irrelevant label meeting a preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, wherein the irrelevant labels represent labels not included in the label group to be matched, and i is an integer greater than or equal to 1; when it is determined that an irrelevant label meeting the preset propagation condition exists, add the irrelevant label meeting the preset propagation condition to the label group to be matched, and take the non-alternative nodes associated with that irrelevant label as the (i+1)th gradient propagation node; when it is determined that no such irrelevant label exists and the value of i is greater than 1, determine the propagation score of each label associated with each node in the ith gradient propagation node; and determine a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node, wherein the final score reflects the degree of matching between the label group to be matched and each alternative node.
The embodiment of the present application provides an electronic device, where the device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the tag matching method provided in one or more of the foregoing technical solutions is implemented.
The embodiment of the application provides a computer storage medium, wherein a computer program is stored in the computer storage medium; the computer program can implement the tag matching method provided by one or more of the above technical solutions after being executed.
The embodiments of the application provide a label matching method, a label matching device, electronic equipment and a computer storage medium. The method comprises the following steps: receiving a tag matching request, wherein the tag matching request comprises a tag group to be matched; determining an alternative node set from the nodes associated with each label in the label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node; determining whether an irrelevant label meeting a preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, wherein the irrelevant labels represent labels not included in the label group to be matched, and i is an integer greater than or equal to 1; when it is determined that an irrelevant label meeting the preset propagation condition exists, adding the irrelevant label meeting the preset propagation condition to the label group to be matched, and taking the non-alternative nodes associated with that irrelevant label as the (i+1)th gradient propagation node; when it is determined that no such irrelevant label exists and the value of i is greater than 1, determining the propagation score of each label associated with each node in the ith gradient propagation node; and determining a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node, wherein the final score reflects the degree of matching between the label group to be matched and each alternative node.
It can be seen that, in the embodiment of the present application, determining whether an irrelevant label meeting the preset propagation condition exists among the labels associated with each node in the ith gradient propagation node amounts to deciding, based on the irrelevant labels, whether the label propagation network can be extended further. In this way, implicit irrelevant labels that can affect the label matching result are more readily found, and the accuracy of subsequent label matching is improved. In addition, once the labels associated with the nodes in some gradient propagation node no longer meet the preset propagation condition, the labels associated with other nodes are not examined any further; that is, the iterative computation of the label matching process stops, which effectively prevents the propagation depth from becoming too large. Compared with the related art, in which every label update or insertion requires N rounds of iterative training until global convergence is reached, the label matching process needs fewer iterations, and the calculation amount of label matching can be greatly reduced.
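The forward expansion summarized above can be sketched as a loop that stops as soon as no irrelevant label meets the propagation condition. This is only an illustration under stated assumptions: the function name `expand_gradients` and the dictionary layout are hypothetical, pruning and scoring are omitted, and "non-alternative nodes" is interpreted here as nodes not yet placed in any earlier gradient.

```python
def expand_gradients(labels_to_match, label_nodes, node_labels, threshold):
    """Build the gradient propagation nodes layer by layer.

    label_nodes: mapping label -> set of nodes associated with it
    node_labels: mapping node -> set of labels associated with it
    Returns (list of gradients, final label group to be matched).
    """
    labels = set(labels_to_match)
    first = set()
    for label in labels:
        first |= label_nodes.get(label, set())
    gradients, seen = [first], set(first)
    while True:
        current = gradients[-1]
        counts = {}
        for node in current:
            for label in node_labels[node]:
                if label not in labels:  # irrelevant label
                    counts[label] = counts.get(label, 0) + 1
        qualifying = {l for l, c in counts.items()
                      if c / len(current) > threshold}
        if not qualifying:
            break  # propagation condition not met: stop expanding
        labels |= qualifying
        next_nodes = set()
        for label in qualifying:
            next_nodes |= label_nodes[label] - seen
        if not next_nodes:
            break  # no new nodes to form another gradient
        gradients.append(next_nodes)
        seen |= next_nodes
    return gradients, labels


label_nodes = {"a": {"n1", "n2"}, "x": {"n1", "n2", "n3"}, "y": {"n3"}}
node_labels = {"n1": {"a", "x"}, "n2": {"a", "x"}, "n3": {"x", "y"}}
gradients, final_labels = expand_gradients({"a"}, label_nodes, node_labels, 0.5)
```

In this example the label `x` pulls `n3` into a second gradient; the label `y` then qualifies as well but brings no new node, so expansion stops after two gradients.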
Drawings
Fig. 1 is a schematic diagram of a network structure of tag matching in the related art;
Fig. 2a is a schematic flow chart of a tag matching method in an embodiment of the present application;
Fig. 2b is a schematic diagram of a network structure for performing tag matching in an embodiment of the present application;
Fig. 2c is a schematic diagram of another network structure for performing tag matching in the embodiment of the present application;
Fig. 2d is a schematic diagram illustrating a score transmission process when the preset propagation condition is not met according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a tag matching apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The present application will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the examples provided herein are merely illustrative of the present application and are not intended to limit the present application. In addition, the following examples are provided as partial examples for implementing the present application, not all examples for implementing the present application, and the technical solutions described in the examples of the present application may be implemented in any combination without conflict.
It should be noted that in the embodiments of the present application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a method or apparatus that comprises a list of elements includes not only the elements explicitly recited, but also other elements not explicitly listed or inherent to the method or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other related elements (e.g., steps in a method or units in a device, where a unit may be part of a circuit, part of a processor, part of a program or software, etc.) in the method or device that includes the element.
The term "and/or" herein is merely an associative relationship that describes an associated object, meaning that three relationships may exist, e.g., I and/or J, may mean: the three cases of the single existence of I, the simultaneous existence of I and J and the single existence of J. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of I, J, R, and may mean including any one or more elements selected from the group consisting of I, J and R.
For example, the tag matching method provided in the embodiment of the present application includes a series of steps, but the tag matching method provided in the embodiment of the present application is not limited to the described steps, and similarly, the tag matching device provided in the embodiment of the present application includes a series of modules, but the tag matching device provided in the embodiment of the present application is not limited to include the explicitly described modules, and may also include modules that are required to be provided for acquiring relevant task data or performing processing based on the task data.
Embodiments of the application are operational with numerous other general purpose or special purpose computing system environments or configurations. Here, the server may be a distributed cloud computing technology environment including a small computer system, a large computer system, and the like.
The electronic device such as the server can realize corresponding functions through the execution of the program module. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
In the related art, tags can be pre-scored by combining a matrix model with belief propagation. Taking Google's PageRank algorithm as an example, the influence (PageRank, PR) value of each node is assumed to be a matrix variable related to the tags of the node, as shown in Fig. 1. To obtain a pre-score, multiple rounds of iterative computation are needed until the PR values of all nodes in the network become stable; the number of rounds is not fixed. In each round, the PR value of each node is multiplied by an adjacency matrix (equivalently, a propagation matrix, which represents the differing rates at which each node propagates to its adjacent nodes). Let the PR value of node $m$ after $n$ rounds of iteration be $PR_{n,m}$, let the PR matrix of round $n$ be $S_n = (PR_{n,0} + PR_{n,1} + \dots + PR_{n,m})$, and let the propagation matrix be $L$; the iterative process is then $S_{n+1} = S_n \cdot L$.
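To make the cost of this related-art scheme concrete, the iteration can be sketched as repeated multiplication until the values stop changing. The sketch assumes that the PR values are stored as a row vector and that the propagation matrix is row-stochastic; the tolerance, the round limit, the function name and the sample matrix are all hypothetical.

```python
def iterate_until_stable(scores, propagation, tol=1e-9, max_rounds=1000):
    """Repeat S <- S * L until no entry changes by more than tol.

    scores:      list of PR values, one per node (row vector, assumed layout)
    propagation: square matrix L as a list of rows
    Returns (stable scores, number of rounds performed).
    """
    size = len(scores)
    for rounds in range(1, max_rounds + 1):
        # Row vector times matrix: entry j sums the contributions of all nodes i.
        updated = [
            sum(scores[i] * propagation[i][j] for i in range(size))
            for j in range(size)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated, rounds
        scores = updated
    return scores, max_rounds


# Illustrative two-node network; the values are not from the application.
L = [[0.5, 0.5],
     [0.2, 0.8]]
stable, rounds = iterate_until_stable([0.5, 0.5], L)
```

Every change to the network alters the propagation matrix, so the whole loop has to be run again, which is the drawback discussed in the text.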
If the real-time and generalization requirements need to be satisfied simultaneously, the following two problems occur:
1) If the matrix model is calculated in real time, every label update or insertion triggers N rounds of iterative training until global convergence is reached.
2) Labels are updated, inserted or deleted concurrently, which causes large-scale dynamic parameter updates to the matrix and leaves the matrix in an unavailable state; if the label network is simply split, the result easily falls into a local optimum, which affects the accuracy of the matching result.
In view of the above technical problems, the following embodiments are proposed.
In some embodiments of the present Application, the tag matching method may be implemented by using a Processor in the tag matching Device, and the Processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
Fig. 2a is a schematic flowchart of a tag matching method in an embodiment of the present application, and as shown in fig. 2a, the method includes the following steps:
Step 200: receiving a tag matching request, wherein the tag matching request comprises a tag group to be matched; and determining an alternative node set from nodes associated with each label in the label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node.
In the embodiment of the present application, the Label matching method may be applied to a mesh structure composed of labels (labels) and cluster nodes (nodes), which is referred to as a Label network for short; here, the cluster node means a set including a plurality of nodes, each of which may correspond to a service entity, wherein the type of the service entity is related to a service scenario to which the tag matching method is actually applied.
Illustratively, the tag group to be matched may include one or more tags; here, each tag in the tag group to be matched has a corresponding characteristic attribute, and each characteristic attribute has its own excitation value, which acts as a weight and participates in the subsequent score calculation.
In the embodiment of the application, when the label group to be matched comprises a plurality of labels, the characteristic attributes of different labels can be the same or different; under the condition that the characteristic attributes of different labels are the same, the excitation values of the labels are also the same; here, the excitation value of the tag indicates an excitation value of the characteristic attribute corresponding to the tag.
In some embodiments, the way in which the characteristic attributes of tags are classified is not limited; for example, the characteristic attributes of tags may be classified conceptually into a core tag (Core), a most suitable tag (Suitable), a preferred tag (Prioritized), an optional tag (Optional), and the like; the characteristic attributes of tags may also be classified in other ways.
For example, which characteristic attribute should be given to each tag in the tag group to be matched, and what its specific excitation value is, may be determined by means of a reduction algorithm or according to human experience, which is not limited in the embodiment of the present application; for example, an excitation value of 2.0 may be manually preset for the core tag (Core), 1.0 for the most suitable tag (Suitable), 0.8 for the preferred tag (Prioritized), and 0.3 for the optional tag (Optional).
Exemplarily, after the tag network receives a tag matching request sent from the outside, a tag group to be matched may be obtained from the tag matching request; and further, determining an alternative node set according to nodes associated with each label in the label group to be matched.
In some embodiments, the nodes associated with each tag in the tag group to be matched may be all used as candidate nodes in the candidate node set.
In some embodiments, the alternative node set may also be determined as follows: determining at least one label meeting a preset pruning condition from the label group to be matched; and taking the node associated with each label in the at least one label as a candidate node and putting the candidate node into the candidate node set.
Here, the preset pruning conditions are some screening conditions set according to actual experience related to the service scenario, and the purpose is to reduce the amount of calculation; for example, in the case that the to-be-matched tag group includes a plurality of tags, at least one tag meeting a preset pruning condition may be determined therefrom; and further, taking the node associated with each label in the at least one label as a candidate node in the candidate node set.
Fig. 2b is a schematic diagram of a network structure for performing tag matching in the embodiment of the present application. As shown in fig. 2b, the tag group to be matched includes four labels whose characteristic attributes are respectively the most suitable label (Suitable), the preferred label (Prioritized), the optional label (Optional) and the Core label (Core); assuming that the preset pruning condition is that a candidate node must be associated with all tags in the tag group to be matched whose characteristic attribute is Core or Suitable, the two nodes in the dotted-line part of fig. 2b are the candidate nodes, and these two candidate nodes constitute the candidate node set.
For example, after the candidate node set is obtained, each candidate node in the candidate node set may be used as a first gradient propagation node; as can be seen in connection with fig. 2b, the first gradient propagation node comprises two nodes in the dashed part.
It can be seen that, in the embodiment of the application, by presetting pruning conditions, tags meeting actual requirements are selected from tags in the tag group to be matched for subsequent matching, and a certain amount of calculation can be reduced without affecting matching results.
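The pruning condition of fig. 2b can be illustrated with a minimal sketch. The function name `select_candidates`, the dictionary layout and the node and tag identifiers are hypothetical and are used only for illustration; the sketch assumes that the pruning condition is "a candidate must be associated with every Core or Suitable tag of the tag group to be matched", and that all associated nodes are kept when no tag has such an attribute.

```python
from typing import Dict, Set


def select_candidates(label_to_nodes: Dict[str, Set[str]],
                      query_attrs: Dict[str, str],
                      required_attrs=("Core", "Suitable")) -> Set[str]:
    """Return the candidate node set for a tag group to be matched.

    label_to_nodes: Label -> Node relationship (hypothetical layout).
    query_attrs: tag -> characteristic attribute for the tag group to be matched.
    """
    required = [t for t, a in query_attrs.items() if a in required_attrs]
    if not required:
        # No tag triggers the pruning condition: keep every associated node.
        nodes: Set[str] = set()
        for tag in query_attrs:
            nodes |= label_to_nodes.get(tag, set())
        return nodes
    # A candidate must be associated with every required tag.
    nodes = set(label_to_nodes.get(required[0], set()))
    for tag in required[1:]:
        nodes &= label_to_nodes.get(tag, set())
    return nodes


# Illustrative data only: four tags, one per characteristic attribute.
label_to_nodes = {
    "t_core": {"n1", "n2", "n3"},
    "t_suit": {"n1", "n2"},
    "t_prio": {"n3"},
    "t_opt": {"n4"},
}
query_attrs = {
    "t_core": "Core",
    "t_suit": "Suitable",
    "t_prio": "Prioritized",
    "t_opt": "Optional",
}
first_gradient = select_candidates(label_to_nodes, query_attrs)
```

With this illustrative data, only the two nodes associated with both the Core tag and the Suitable tag remain and form the first gradient propagation node.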
Step 201: determining whether an irrelevant label meeting a preset propagation condition exists in each label associated with each node in the ith gradient propagation node; the irrelevant tags represent tags not included in the tag group to be matched; i is an integer of 1 or more.
Illustratively, after obtaining the current gradient propagation node, traversing each label associated with each node in the current gradient propagation node to determine an initial score of each label associated with each node in the current gradient propagation node; here, if i is equal to 1, the current gradient propagation node is a first gradient propagation node; if i is equal to 2, the current gradient propagation node is the second gradient propagation node.
For example, the initial score of each label associated with each node in the current gradient propagation node may be determined as follows: the characteristic attribute and the excitation value of each label in the label group to be matched are determined in advance; for a value of i greater than or equal to 1, when a label associated with a node in the ith gradient propagation node is a label in the label group to be matched, its initial score is determined according to its excitation value and the number of times its characteristic attribute appears in the label group to be matched; when the label is not a label in the label group to be matched, a preset value is taken as its initial score.
Illustratively, according to the above step 200, the characteristic attribute and the excitation value of each tag in the tag group to be matched can be predetermined; in the process of traversing each tag associated with each node in the current gradient propagation node, if it is determined that the current tag is a tag in the tag group to be matched, an initial score of the tag may be determined according to the excitation value of the tag and the number of times that the characteristic attribute appears in the tag group to be matched, and specifically, with reference to expression (1), the initial score S of the tag may be:
S = (C / Fn) × Bn    (1)
Here, C represents the basic score of the current label, which can be set manually and defaults to 1, and is used to expand data diversity; Fn represents the number of times the characteristic attribute of the current label appears in the label group to be matched; Bn represents the excitation value of the characteristic attribute corresponding to the current label.
Illustratively, in the process of traversing each label associated with each node in the current gradient propagation node, if the current label is determined not to be a label in the label group to be matched, the initial score of the current label may be set to a preset value; here, the value of the preset value may be set according to an actual service scenario, and the embodiment of the present application is not limited, and for example, the value may be set to 0, and may also be set to other values.
It can be seen that, in the embodiment of the present application, the initial score of each label can be determined in a targeted manner by traversing the determination result of whether each label associated with each node in the current gradient propagation node is a label in the label group to be matched, and then the initial score is used for score calculation of subsequent candidate nodes, so that the validity of the final score result can be ensured.
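As a minimal sketch of expression (1), the following function computes the initial score of a single label. The excitation values are the example values given above; the function name `initial_score`, the argument layout and the tag names are hypothetical, and the preset value for labels outside the label group to be matched is assumed to be 0.

```python
from collections import Counter
from typing import Dict

# Example excitation values per characteristic attribute (from the example above).
EXCITATION = {"Core": 2.0, "Suitable": 1.0, "Prioritized": 0.8, "Optional": 0.3}


def initial_score(tag: str,
                  query_attrs: Dict[str, str],
                  base: float = 1.0,
                  default: float = 0.0) -> float:
    """Initial score S = (C / Fn) * Bn of one label.

    query_attrs maps each tag of the tag group to be matched to its
    characteristic attribute; `default` is the preset value used for
    labels that are not in the tag group (assumed to be 0 here).
    """
    if tag not in query_attrs:
        return default
    attr = query_attrs[tag]
    fn = Counter(query_attrs.values())[attr]  # Fn: occurrences of the attribute
    return (base / fn) * EXCITATION[attr]     # (C / Fn) * Bn


# Illustrative tag group: one Core tag and two Optional tags.
group = {"a": "Core", "b": "Optional", "c": "Optional"}
score_a = initial_score("a", group)   # (1 / 1) * 2.0
score_b = initial_score("b", group)   # (1 / 2) * 0.3
score_z = initial_score("z", group)   # not in the group: preset value
```

Because the Optional attribute appears twice in this illustrative group, each Optional tag receives half of the attribute's excitation value.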
In the embodiment of the application, after the initial scores of the labels associated with each node in the current gradient propagation node are obtained, whether the labels associated with each node in the current gradient propagation node have the irrelevant labels meeting the preset propagation condition is determined by taking the current gradient propagation node as a reference.
Here, the irrelevant tag means a tag that is not included in the tag group to be matched corresponding to the current gradient propagation node; exemplarily, it is assumed that a current gradient propagation node includes a node 1 and a node 2, and a corresponding to-be-matched tag group includes a tag 1 and a tag 2; if node 1 associates tag 1 and tag 3, and node 2 associates tag 1, tag 2, and tag 3, then tag 3 belongs to an unrelated tag.
In some embodiments, determining whether an unrelated label meeting a preset propagation condition exists in the labels associated with each node in the ith gradient propagation node may include: if the ratio of the number of nodes in the ith gradient propagation node that are associated with the unrelated label to the number of all nodes in the ith gradient propagation node is larger than a set threshold, determining that an unrelated label meeting the preset propagation condition exists; and if this ratio is smaller than or equal to the set threshold, determining that no unrelated label meeting the preset propagation condition exists.
Exemplarily, the number ratio of the nodes associated with the unrelated labels in the current gradient propagation node to all the nodes included in the current gradient propagation node is determined; then comparing the quantity ratio with a set threshold value to obtain a comparison result; if the quantity ratio is determined to be larger than the set threshold according to the comparison result, the irrelevant label meeting the preset propagation condition is indicated to exist; otherwise, if the number ratio is determined to be smaller than or equal to the set threshold according to the comparison result, it indicates that no unrelated label meeting the preset propagation condition exists.
Here, the value of the set threshold may be determined according to an actual service scenario, which is not limited in the embodiment of the present application, and for example, the value may be 0.5, or may be other values.
Exemplarily, it is assumed that a current gradient propagation node includes a node 1 and a node 2, and a corresponding to-be-matched tag group includes a tag 1 and a tag 2; if the node 1 is associated with the tag 1 and the tag 3, and the node 2 is associated with the tag 1, the tag 2 and the tag 3, then for the unrelated tag (the tag 3), since the node 1 and the node 2 are both associated with the tag 3, it can be determined that the number of nodes associated with the unrelated tag in the current gradient propagation node is 2, and thus, it can be determined that the ratio of the number of nodes associated with the unrelated tag in the current gradient propagation node to the number of all nodes included in the current gradient propagation node is 1; at this time, if the value of the set threshold is 0.5, since the number ratio 1 is greater than the set threshold 0.5, it can be determined that an irrelevant label (label 3) meeting the preset propagation condition exists in the labels associated with each node in the current gradient propagation node.
It can be seen that, in the embodiment of the application, through setting the preset propagation conditions, the iteration times of the tag matching algorithm can be reduced while some implicit irrelevant tags which can influence the matching result can be better found; this is because, if there is no unrelated label meeting the preset propagation condition in the labels corresponding to the current gradient propagation node, the iterative calculation of the next gradient propagation node is not performed.
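The propagation condition can be sketched as follows. The function name `propagating_tags` and the data layout are hypothetical; the sketch follows the ratio described above (nodes of the current gradient associated with the unrelated label, divided by all nodes of the current gradient) with the example threshold of 0.5.

```python
from typing import Dict, Iterable, Set


def propagating_tags(gradient_nodes: Iterable[str],
                     node_to_labels: Dict[str, Set[str]],
                     query_tags: Set[str],
                     threshold: float = 0.5) -> Set[str]:
    """Return the unrelated labels that meet the preset propagation condition."""
    nodes = list(gradient_nodes)
    counts: Dict[str, int] = {}
    for node in nodes:
        for tag in set(node_to_labels.get(node, ())):
            if tag not in query_tags:          # unrelated label
                counts[tag] = counts.get(tag, 0) + 1
    # Strictly greater than the threshold, as in the description above.
    return {t for t, c in counts.items() if c / len(nodes) > threshold}


# The example above: both nodes of the current gradient are associated with label 3.
node_to_labels = {
    "node1": {"label1", "label3"},
    "node2": {"label1", "label2", "label3"},
}
found = propagating_tags(["node1", "node2"], node_to_labels, {"label1", "label2"})
```

If only one of the two nodes were associated with label 3, the ratio would be exactly 0.5, which does not exceed the threshold, so propagation would stop.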
Step 202: if such an irrelevant tag exists, adding the irrelevant tag meeting the preset propagation condition into the tag group to be matched, and taking the non-alternative nodes associated with that irrelevant tag as the (i + 1)th gradient propagation node; if no such irrelevant tag exists and the value of i is greater than 1, determining the propagation score of each label associated with each node in the ith gradient propagation node, and determining the final score of each candidate node in the first gradient propagation node according to the propagation scores of the labels associated with each node in the ith gradient propagation node.
In this embodiment of the application, when it is determined according to step 201 that an unrelated label meeting the preset propagation condition exists in the labels associated with each node in the current gradient propagation node, this indicates that the unrelated label has an actual meaning and the current propagation network may continue to extend, that is, the unrelated label has a certain influence on the subsequently computed scores of the candidate nodes in the first gradient propagation node; at this time, the unrelated label is added into the label group to be matched, the non-alternative nodes associated with the unrelated label are searched out, and these non-alternative nodes are used as the next gradient propagation node.
Exemplarily, the current gradient propagation node includes a node 1 and a node 2, and the corresponding to-be-matched tag group includes a tag 1 and a tag 2; if the node 1 is associated with the tag 1 and the tag 3, and the node 2 is associated with the tag 1, the tag 2 and the tag 3, it can be determined that an unrelated tag (tag 3) meeting the preset propagation condition exists in the tags associated with each node in the current gradient propagation node, and at this time, the unrelated tag (tag 3) is added into the tag group to be matched, that is, the tag group to be matched now includes the tag 1, the tag 2 and the tag 3; if the tag 3 is also associated with a node 3, the node 3 is a non-alternative node associated with the unrelated tag, and at this time, the node 3 is taken as the next gradient propagation node. Here, the tag group to be matched corresponding to the next gradient propagation node is the tag group to be matched after the unrelated tag is added.
Exemplarily, after obtaining the next gradient propagation node, continuously determining the initial score of each label associated with each node in the next gradient propagation node according to step 201, and determining whether an unrelated label meeting a preset propagation condition exists; the specific implementation manner is similar to the implementation manner described above for the current gradient propagation node, and is not described here again.
Exemplarily, if the current gradient propagation node is the first gradient propagation node, the next gradient propagation node is the second gradient propagation node, and after the initial score of each label associated with each node in the second gradient propagation node is determined according to step 201, it is determined that another unrelated label meeting the preset propagation condition exists, the unrelated label is added into the group of labels to be matched corresponding to the second gradient propagation node, and the non-alternative node associated with the unrelated label is taken as the third gradient propagation node, and the determination is continued according to step 201 until no unrelated label meeting the preset propagation condition exists in the propagation network.
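The gradient-by-gradient expansion described above can be sketched as a loop. This is only an illustration under stated assumptions: the function name `build_gradients` and the data layout are hypothetical, the threshold is the example value 0.5, and nodes already placed in an earlier gradient are excluded from later gradients so that the loop cannot cycle (an assumption made here to guarantee termination).

```python
from typing import Dict, List, Set, Tuple


def build_gradients(candidates: Set[str],
                    node_to_labels: Dict[str, Set[str]],
                    label_to_nodes: Dict[str, Set[str]],
                    query_tags: Set[str],
                    threshold: float = 0.5) -> Tuple[List[Set[str]], Set[str]]:
    """Expand the propagation network one gradient at a time."""
    gradients = [set(candidates)]
    tags = set(query_tags)
    seen = set(candidates)
    while True:
        current = gradients[-1]
        # Count, per unrelated label, the nodes of the current gradient using it.
        counts: Dict[str, int] = {}
        for node in current:
            for tag in node_to_labels.get(node, set()):
                if tag not in tags:
                    counts[tag] = counts.get(tag, 0) + 1
        new_tags = {t for t, c in counts.items() if c / len(current) > threshold}
        if not new_tags:
            break                              # no label meets the propagation condition
        tags |= new_tags                       # extend the tag group to be matched
        nxt: Set[str] = set()
        for tag in new_tags:
            nxt |= label_to_nodes.get(tag, set())
        nxt -= seen                            # keep only nodes not visited before
        if not nxt:
            break
        seen |= nxt
        gradients.append(nxt)
    return gradients, tags


node_to_labels = {
    "node1": {"tag1", "tag3"},
    "node2": {"tag1", "tag2", "tag3"},
    "node3": {"tag3"},
}
label_to_nodes = {
    "tag1": {"node1", "node2"},
    "tag2": {"node2"},
    "tag3": {"node1", "node2", "node3"},
}
gradients, tags = build_gradients({"node1", "node2"}, node_to_labels,
                                  label_to_nodes, {"tag1", "tag2"})
```

In this illustrative network, tag 3 is added to the tag group, node 3 becomes the second gradient propagation node, and expansion stops there because node 3 carries no further unrelated label.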
In some embodiments, determining the final score of each candidate node in the first gradient propagation node according to the propagation scores of the respective labels associated with each node in the ith gradient propagation node may include: when the value of i is greater than 1 and it is determined that no irrelevant tag meeting the preset propagation condition exists in the tags associated with each node in the ith gradient propagation node, determining the propagation score of each tag associated with each node in the (i-1)th gradient propagation node according to the propagation score of each tag associated with each node in the ith gradient propagation node, and so on, until the propagation score of each label associated with each alternative node in the first gradient propagation node is obtained; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each candidate node in the first gradient propagation node; here, the final score is used to reflect the degree of matching between the tag group to be matched and each candidate node.
Exemplarily, it is assumed that, according to step 201, it is determined that, among the labels associated with each node in the second gradient propagation nodes, no unrelated label meeting the preset propagation condition exists, and at this time, i is equal to 2; determining a propagation score of each label associated with each node in the second gradient propagation nodes; further, determining the propagation score of each label associated with each node in the first gradient propagation node according to the propagation score of each label associated with each node in the second gradient propagation node; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each candidate node in the first gradient propagation node.
In some embodiments, determining the final score of each candidate node in the first gradient propagation node according to the propagation scores of the respective labels associated with each of the ith gradient propagation node may include: determining the initial score of each label associated with each node in the ith gradient propagation node under the condition that the value of i is greater than 1; and determining a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node and the initial score.
Exemplarily, in the case that i is determined to be equal to 2 according to the above steps, first, the initial score of each label associated with each node in the second gradient propagation node is determined, and the propagation score of each label is determined according to that initial score; then, the propagation score of each label associated with each node in the first gradient propagation node is determined according to the propagation score of each label associated with each node in the second gradient propagation node; finally, the final score of each candidate node in the first gradient propagation node is determined according to the propagation scores and the initial scores of the labels associated with that candidate node. It can be seen that, taking the case that i is equal to 2 as an example, the propagation score reflects the degree to which the score of each label associated with each node in the second gradient propagation node influences the score of the candidate nodes in the first gradient propagation node that are associated with the same label (i.e., the unrelated label added to the label group to be matched).
It can be seen that, in the embodiment of the present application, the final score of each candidate node in the first gradient propagation node is jointly determined by the propagation scores and the initial scores of the labels associated with each node in the other gradient propagation nodes, so that the generalization of the matching result can be improved.
Exemplarily, it is assumed that the first gradient propagation node includes node 1 and node 2, and the corresponding to-be-matched tag group 1 includes tag 1 and tag 2; the node 1 associates the tag 1 with the tag 3, the node 2 associates the tag 2 with the tag 3, and the tag 3 is added into the tag group 1 to be matched to obtain a tag group 2 to be matched; if the label 3 is associated with the node 3, the second gradient propagation node includes the node 3, and if the node 3 is not associated with other unrelated labels, the propagation score of the node 3 in the second gradient propagation node is calculated. Specifically, the initial score of the node 3 in the second gradient propagation node is determined, the ratio of the excitation value of the characteristic attribute corresponding to the tag 3 to the excitation values of the characteristic attributes corresponding to all tags in the tag group 2 to be matched is determined, and the product of the initial score and the excitation value ratio of the node 3 is used as the propagation score of the node 3.
Illustratively, assuming that the initial score of the node 3 is 0.5, and the excitation values corresponding to the tag 1, the tag 2 and the tag 3 in the tag group 2 to be matched are 2, 1.5 and 0.5, respectively, the excitation value ratio corresponding to the node 3 is 0.5 / (2 + 1.5 + 0.5) = 0.125, and the propagation score of the node 3 is therefore 0.5 × 0.125 = 0.0625.
Further, after the propagation score of the node 3 in the second gradient propagation node is obtained, the process returns to the first gradient propagation node; since the node 1 in the first gradient propagation node is associated with the label 1 and the label 3, and the node 2 is associated with the label 2 and the label 3, that is, both the node 1 and the node 2 are associated with the label 3, when the final scores of the node 1 and the node 2 are calculated, the initial score 0.5 of the label 3 and the propagation score 0.0625 are accumulated to obtain an accumulated score of 0.5625, and the accumulated score is further accumulated with the initial scores of the other associated labels to obtain the final scores of the node 1 and the node 2; that is, the final score of the node 1 is the sum of the excitation value 2 corresponding to the label 1 and the accumulated score 0.5625 corresponding to the label 3, namely 2.5625, and the final score of the node 2 is the sum of the excitation value 1.5 corresponding to the label 2 and the accumulated score 0.5625 corresponding to the label 3, namely 2.0625.
It can be seen that, in the embodiment of the present application, in the process of performing label matching, according to the propagation scores of the labels associated with each node in the current gradient propagation node, the propagation scores of the labels associated with each node in the previous gradient are determined, that is, the final score of each candidate node is determined in a back propagation manner, so that the accuracy of the matching result can be improved.
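The arithmetic of the example with node 1, node 2 and node 3 can be checked with a short sketch. The helper name `propagation_score` is hypothetical; the sketch assumes that the initial scores of label 1 and label 2 equal their excitation values (2 and 1.5), i.e. a basic score of 1 and each characteristic attribute appearing once.

```python
from typing import Dict


def propagation_score(node_initial: float, tag: str,
                      excitation: Dict[str, float]) -> float:
    """Initial score of a node times the excitation value ratio of the tag."""
    ratio = excitation[tag] / sum(excitation.values())
    return node_initial * ratio


# Excitation values of the tag group after label 3 has been added.
excitation = {"label1": 2.0, "label2": 1.5, "label3": 0.5}

prop = propagation_score(0.5, "label3", excitation)  # 0.5 * (0.5 / 4.0)
accumulated = 0.5 + prop                             # initial score of label 3 + propagation
final_node1 = excitation["label1"] + accumulated     # label 1 and label 3
final_node2 = excitation["label2"] + accumulated     # label 2 and label 3
```

The values come out as 0.0625 for the propagation score, 0.5625 for the accumulated score of label 3, and 2.5625 and 2.0625 for node 1 and node 2 respectively.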
In some embodiments, the method may further include: when it is determined that different nodes in the ith gradient propagation node are associated with the same target label, determining the propagation score of the target label for each of these different nodes; selecting the maximum propagation score of the target label from these propagation scores; and determining the propagation score of each label associated with each node in the (i-1)th gradient propagation node according to the maximum propagation score of the target label.
Illustratively, the second gradient propagation node includes a node 3 and a node 4, both of which are associated with a label 4; assuming that the propagation score of the label 4 determined via the node 3 is 0.2 and the propagation score of the label 4 determined via the node 4 is 0.5, then 0.5 is taken as the propagation score of the label 4 to determine the propagation score of each node in the previous gradient propagation node (the first gradient propagation node), and finally, the final score of each candidate node in the first gradient propagation node is determined.
It can be seen that in the embodiment of the present application, when different nodes in a certain gradient propagation node are associated with the same label, multiple propagation scores corresponding to the label can be obtained, and the propagation scores can indicate the importance of the label relative to the nodes; at this time, the maximum propagation score is selected as the propagation score of the label, so that the accuracy of the subsequent matching result can be improved.
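Selecting the maximum propagation score per label can be sketched as below; the function name `merge_propagation_scores` and the nested-dictionary layout are hypothetical.

```python
from typing import Dict


def merge_propagation_scores(per_node: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Keep, for each label, the largest propagation score over all nodes."""
    best: Dict[str, float] = {}
    for scores in per_node.values():
        for tag, score in scores.items():
            if tag not in best or score > best[tag]:
                best[tag] = score
    return best


# The example above: label 4 gets 0.2 via node 3 and 0.5 via node 4.
merged = merge_propagation_scores({
    "node3": {"label4": 0.2},
    "node4": {"label4": 0.5},
})
```

Here `merged` maps label 4 to 0.5, which is the value passed back to the previous gradient.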
The embodiment of the application provides a label matching method, a label matching device, electronic equipment and a computer storage medium, wherein the method comprises the following steps: receiving a tag matching request, wherein the tag matching request comprises a tag group to be matched; determining an alternative node set from nodes associated with each label in the label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node; determining whether an irrelevant label meeting a preset propagation condition exists in the labels associated with each node in the ith gradient propagation node, wherein the irrelevant labels represent labels not included in the label group to be matched, and i is an integer greater than or equal to 1; if such an irrelevant label exists, adding the irrelevant label meeting the preset propagation condition into the label group to be matched, and taking the non-alternative nodes associated with that irrelevant label as the (i + 1)th gradient propagation node; if no such irrelevant label exists and the value of i is greater than 1, determining the propagation score of each label associated with each node in the ith gradient propagation node, and determining the final score of each alternative node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node; the final score is used for reflecting the matching degree between the label group to be matched and each alternative node.
It can be seen that, in the embodiment of the present application, it is determined whether an irrelevant label meeting the preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, that is, whether the label propagation network can continue to extend is decided based on the irrelevant labels, so that implicit irrelevant labels that can affect the label matching result are found more readily and the accuracy of subsequent label matching is improved; in addition, once the labels associated with the nodes in a certain gradient propagation node no longer meet the preset propagation condition, the labels associated with other nodes are no longer examined, that is, the iterative computation of the label matching process stops, which effectively prevents the propagation depth from becoming too large; compared with the related art, in which every update or insertion of a label requires N rounds of iterative training until global convergence is reached, the label matching process needs fewer iterations, and the calculation amount of label matching can be greatly reduced.
In some embodiments, the method may further include: determining an initial score of each label associated with each candidate node in the first gradient propagation node; and, if it is determined that no irrelevant label meeting the preset propagation condition exists in the labels associated with each candidate node in the first gradient propagation node, determining the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node as the final score result of that candidate node.
Exemplarily, in the case that the current gradient propagation node is the first gradient propagation node, after the initial score of each label associated with each candidate node in the first gradient propagation node is determined in the above manner, it may be determined that no unrelated label meeting the preset propagation condition exists in the labels associated with each candidate node in the first gradient propagation node, for example because every label associated with each candidate node is a label in the label group to be matched; at this time, the initial scores of the labels associated with each candidate node are determined according to expression (1), and the sum of the initial scores of the labels associated with each candidate node is used as the final score result of that candidate node.
Exemplarily, it is assumed that the first gradient propagation node includes node 1 and node 2, and the corresponding to-be-matched tag group includes tag 1 and tag 2; if the node 1 is associated with the label 1, and the node 2 is associated with the label 1 and the label 2, and it is determined according to the above formula (1) that the initial scores of the label 1 and the label 2 are 0.3 and 0.8, respectively, the final score result of the node 1 is 0.3, and the final score result of the node 2 is 1.1.
It can be seen that, in the embodiment of the present application, when it is determined that no unrelated label meeting the preset propagation condition exists in the labels associated with each candidate node in the first gradient propagation node, the current propagation network does not need to continue to extend; at this time, the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node is determined as the final score result of that candidate node, so the calculation amount of the label matching algorithm can be greatly reduced.
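When no propagation takes place, the final score is simply a sum of initial scores, as in the example with node 1 and node 2. A minimal sketch, with the hypothetical function name `final_scores_without_propagation`:

```python
from typing import Dict, List


def final_scores_without_propagation(node_to_labels: Dict[str, List[str]],
                                     tag_scores: Dict[str, float]) -> Dict[str, float]:
    """Final score of each candidate node = sum of its labels' initial scores."""
    return {
        node: sum(tag_scores.get(tag, 0.0) for tag in labels)
        for node, labels in node_to_labels.items()
    }


# The example above: initial scores 0.3 (label 1) and 0.8 (label 2).
scores = final_scores_without_propagation(
    {"node1": ["label1"], "node2": ["label1", "label2"]},
    {"label1": 0.3, "label2": 0.8},
)
```

For this data, node 1 scores 0.3 and node 2 scores approximately 1.1 (up to floating-point rounding).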
Exemplarily, after the final score of each candidate node in the first gradient propagation node is obtained, the final scores of the candidate nodes may be sorted in descending order, and the candidate node ranked first is the node with the highest matching degree with the tag group to be matched; alternatively, the final score of each candidate node is first normalized according to the standard deviation, and the normalized results are sorted in descending order, where the candidate node ranked first is the node with the highest matching degree with the tag group to be matched. Since each node corresponds to a service entity, the service entity that best matches the tag group to be matched can be obtained by the method in the embodiment of the present application.
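The ranking step can be sketched as follows. The text does not specify the exact form of normalization "according to the standard deviation"; the sketch assumes a z-score, (score − mean) / population standard deviation, which is only one possible reading. The function name `rank_nodes` is hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def rank_nodes(final_scores: Dict[str, float],
               normalize: bool = False) -> List[Tuple[str, float]]:
    """Sort candidate nodes by (optionally normalized) final score, descending."""
    scores = dict(final_scores)
    if normalize and len(scores) > 1:
        mu = mean(scores.values())
        sigma = pstdev(scores.values())
        if sigma > 0:  # skip normalization when all scores are identical
            scores = {n: (s - mu) / sigma for n, s in scores.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_nodes({"node1": 2.5625, "node2": 2.0625})
normalized = rank_nodes({"node1": 2.5625, "node2": 2.0625}, normalize=True)
best_node = ranking[0][0]
```

Normalization of this kind does not change the order of the candidates; it only rescales the scores so that results from different requests are easier to compare.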
In order to further embody the object of the present application, the present application will be further described with reference to the above-described embodiments.
The network structure shown in fig. 2b is a local network. Fig. 2c is another schematic diagram of a network structure for performing tag matching in this embodiment of the present application; as shown in fig. 2c, the network structure is a bidirectional ring structure, which can be represented by two sets of relationships corresponding to the out-degree and in-degree of the edges, namely Node -> Label and Label -> Node; these are used as inputs of the tag matching algorithm, which is described in detail below:
Step A1: through a search algorithm, the Label -> Node relationship group is obtained by taking the label group to be matched carried in the label matching request as the condition; before traversal starts, a preset pruning condition may be set to prune the network; for example, the preset pruning condition in fig. 2b is that a candidate Node must be associated with all tags in the tag group to be matched whose characteristic attribute is Core or Suitable, so the dotted part in fig. 2b is the first gradient propagation Node of the propagation network, and other Nodes that do not meet the condition are pruned; if no preset pruning condition is set, the original Label -> Node relationship group is used and all associated Nodes are taken as the first gradient propagation Node; here, the first gradient propagation Node is also the set of candidate Nodes for the final matching result.
Step A2: taking each Node in the current gradient propagation Node as a reference, the Node -> Label relationship group is obtained by search, the Labels associated with each Node are traversed, and each Label is given an initial score; if the Label is not in the label group to be matched, its initial score is 0; otherwise, denoting the current Label as N, its basic score as C, the number of times its characteristic attribute appears in the label group to be matched as Fn, and the excitation value of its characteristic attribute as Bn, the initial score is (C/Fn) × Bn.
Step A3: inverting the Node -> Label relationship of step A2 and traversing the resulting Label -> Node relationship; step A2 ignores the influence on the score of other labels (irrelevant labels) not included in the label group to be matched, but because of the propagation network such irrelevant labels also influence the score; here, the ratio of the candidate Nodes associated with an irrelevant label to the total associated Nodes may be set as the propagation condition, with a default threshold of 0.5; as long as the ratio exceeds the threshold, the irrelevant label is considered to have actual meaning and the propagation network may continue to extend; if an irrelevant label meeting the propagation condition exists, it is dynamically added into the label group to be matched, the non-candidate Nodes associated with it are searched out as the next gradient propagation Node, and the process returns to step A2; otherwise, the next step is carried out.
Step A4: traversing again according to the Node -> Label relationship of step A2, and taking the sum of the scores of all Labels associated with each Node in the current gradient propagation Node as the score of that Node; if the Nodes in the current gradient propagation Node are not candidate Nodes, the propagation score of each Label associated with a Node is determined according to the excitation value ratio of the characteristic attribute of that Label; one Label may have a plurality of propagation scores, in which case the largest is taken; after the calculation is finished, the process returns to the previous gradient, until the final score result of each candidate Node in the first gradient propagation Node is obtained; otherwise, the score calculation is complete and the final score result of each candidate Node is returned.
Step A5: normalize the final score result of each alternative Node according to the standard deviation.
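The initial score of step A2 and the normalization of step A5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the dictionaries `attribute_of` and `excitation`, and the default `base_score` are hypothetical. Two readings are assumptions: Fn is taken as the number of labels in the query that share the label's characteristic attribute, and "normalizing according to the standard deviation" is taken as z-score normalization.

```python
from statistics import mean, pstdev


def initial_score(label, query_labels, attribute_of, excitation, base_score=1.0):
    """Step A2 sketch: initial score of one Label, (C / Fn) * Bn."""
    if label not in query_labels:
        return 0.0  # labels outside the group to be matched start at 0
    attr = attribute_of[label]
    # Fn (assumed reading): how many query labels carry the same characteristic attribute
    fn = sum(1 for q in query_labels if attribute_of[q] == attr)
    return (base_score / fn) * excitation[attr]


def normalize(scores):
    """Step A5 sketch, assuming z-score normalization of the final scores."""
    if not scores:
        return {}
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return {node: 0.0 for node in scores}  # all scores equal
    return {node: (s - mu) / sigma for node, s in scores.items()}
```

With two Core labels in a query and a Core excitation value of 2.0, each of them would receive (1/2) × 2.0 under this reading, so repeating an attribute in the query dilutes the score of each label that carries it.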
It can be seen that the tag matching algorithm proposed in the embodiment of the present application needs very few iterations: if the tags in the tag group to be matched are clear and accurate, only one or two calculation steps may be required. Compared with the related art, it can also better discover implicit factors that affect the matching result; for example, implicit tags that do not appear in the query are generated during the calculation, and because the potential features of the tags have already been obtained through pre-computation, the accuracy of the matching result can be ensured. Furthermore, tag matching is carried out by network propagation. To prevent the propagation depth from becoming too large and to avoid loops in the network, the method prunes while propagating and uses the ratio to limit the propagation condition, so that different propagation networks are outlined dynamically.
Illustratively, to decide which characteristic attribute each tag should be given and what excitation value it should take, the core tags, the non-core tags, and their weights (excitation values) may be defined by means of a reduction algorithm (e.g. one based on rough set or fuzzy set theory). The core process of the reduction algorithm is as follows. Assume each tag has all characteristic attributes, and the weighted value of each is the manual weight × the excitation value (the manual weight is set by experience, and the excitation value is initially 1). The input training data set is a series of query tags and their corresponding matching entities: the query tag set can be expressed as an input matrix of weighted values, and, taking all candidate entities together, the matching entities can be expressed as a 0-1 sequence of length N (the number of candidate entities). The reduction algorithm converts the problem into a classification problem, and each iteration solves for an optimal classification neural network. After each iteration, the input matrix is adjusted (excitation values are increased or decreased) according to the reduction condition, and the reduction condition must favor enhancing the classification capability of the input matrix.
The characteristic attribute is used to reduce the size of the propagation network generated each time, and it should not affect the result of tag matching, namely the most suitable tagged entity. The reduction condition in the reduction algorithm is therefore whether the matched-entity result changes, and how the confidence range of the result is affected, after the excitation value of a tag's characteristic attribute is adjusted. Finally, the characteristic attribute with the highest weighted value in each query tag is taken as its unique characteristic attribute, and the excitation value of each characteristic attribute at the final iteration is taken as its final excitation value. In an actual scenario, the characteristic attribute may be assigned to a tag manually and empirically, and the excitation value of each characteristic attribute may likewise be preset; for example, the excitation value of the core tag (Core) is 2.0, the excitation value of the optimum tag (survivable) is 1.0, the excitation value of the preferred tag (Prioritized) is 0.8, and the excitation value of the optional tag (Optional) is 0.3. In addition, the number of characteristic attributes is not fixed: a continuous numerical range can be divided into multiple segments, each segment representing one characteristic attribute, with the range of the segment being the range of its excitation values. The characteristic attributes and excitation values act as prior knowledge and participate indirectly in the calculation of the tag matching score, but they do not directly represent the final matching score of the data. The excitation values therefore do not need to be adjusted frequently and, unlike in a traditional scoring network, are not tied to specific matching queries.
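The preset excitation values and the idea of dividing a continuous range into segments can be sketched as follows. The four preset values are the ones quoted in the text above. The segment boundaries in `SEGMENT_BOUNDS` and the helper `attribute_for` are purely hypothetical and were chosen only so that each preset value falls into its own segment.

```python
from bisect import bisect_right

# Preset excitation values quoted in the description.
PRESET_EXCITATION = {
    "Core": 2.0,
    "survivable": 1.0,
    "Prioritized": 0.8,
    "Optional": 0.3,
}

# Hypothetical segmentation of the excitation-value range (illustrative only):
# [.., 0.5) Optional, [0.5, 0.9) Prioritized, [0.9, 1.5) survivable, [1.5, ..) Core
SEGMENT_BOUNDS = [0.5, 0.9, 1.5]
SEGMENT_NAMES = ["Optional", "Prioritized", "survivable", "Core"]


def attribute_for(excitation_value):
    """Map an excitation value to the characteristic attribute of its segment."""
    return SEGMENT_NAMES[bisect_right(SEGMENT_BOUNDS, excitation_value)]
```

Adding a characteristic attribute then amounts to inserting one more boundary and one more name, which is one way to picture the statement that the number of characteristic attributes is not fixed.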
Fig. 2d is a schematic flow chart of score transmission when the preset propagation condition is not met in the embodiment of the present application. As shown in fig. 2d, init indicates that, after the candidate nodes (Candidate Node) of the first gradient propagation node are found, the Labels associated with them are scored, so that the score corresponding to each candidate node is implicitly transmitted to its associated Labels. sum means that the score of a non-candidate node is the sum of the scores of all Labels associated with it, which is also a transmission. broadcast from a non-candidate node (unknown Node) to its associated Labels distributes the score according to the excitation values of the characteristic attributes as described in step A4; this is a backward transmission that calculates the propagation scores. Finally, broadcast from a Label to its associated candidate nodes transmits only the Label's maximum incoming branch, and the result is accumulated with the candidate node's initial part to obtain its real score. The overall delivery order is therefore init -> sum -> broadcast -> broadcast.
Fig. 3 is a schematic diagram of a component structure of a tag matching apparatus according to an embodiment of the present application, and as shown in fig. 3, the apparatus includes a first determining module 300 and a second determining module 301, wherein,
a first determining module 300, configured to receive a tag matching request, where the tag matching request includes a tag group to be matched; determining an alternative node set from nodes associated with each label in the label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node;
a second determining module 301, configured to determine whether an irrelevant label meeting a preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, where an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when it is determined that an irrelevant label meeting the preset propagation condition exists, add that irrelevant label to the label group to be matched, and take the non-alternative nodes associated with it as the (i+1)th gradient propagation node; and when it is determined that no irrelevant label meeting the preset propagation condition exists and the value of i is greater than 1, determine a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node, where the final score reflects the degree of matching between the label group to be matched and each alternative node.
In some embodiments, the second determining module 301 is configured to determine a final score of each candidate node in the first gradient propagation node according to the propagation scores of the respective labels associated with each candidate node in the ith gradient propagation node, and includes:
when the value of i is greater than 1 and it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, determining the propagation score of each label associated with each node in the (i-1)th gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node;
until the propagation score of each label associated with each alternative node in the first gradient propagation node is obtained; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each candidate node in the first gradient propagation node.
In some embodiments, the second determining module 301 is further configured to:
when different nodes in the ith gradient propagation node are determined to be associated with the same target label, determining a propagation score of each node in the different nodes associated with the same target label;
selecting the maximum propagation score of the target label from the propagation scores of each node in the different nodes related to the same target label;
and determining the propagation score of each label associated with each node in the (i-1)th gradient propagation node according to the maximum propagation score of the target label.
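The maximum-selection rule above can be sketched in a few lines of Python. The function name and the `(node, label, propagation_score)` tuple layout are hypothetical; the sketch only shows that, when several nodes of the same gradient propagate a score to one target label, the largest score is kept.

```python
def max_propagation_scores(contributions):
    """Keep, for each label, the largest propagation score received.

    contributions: iterable of (node, label, propagation_score) tuples,
    one per node-to-label propagation within the ith gradient.
    """
    best = {}
    for _node, label, score in contributions:
        if label not in best or score > best[label]:
            best[label] = score
    return best
```

For example, if nodes `n1` and `n2` propagate 0.4 and 0.7 to the same label, only 0.7 would be carried back toward the (i-1)th gradient.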
In some embodiments, the second determining module 301 is further configured to:
determining an initial score of each label associated with each candidate node in the first gradient propagation node;
if it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, determining the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node as the final score result of that candidate node.
In some embodiments, the second determining module 301 is configured to determine whether an unrelated label meeting a preset propagation condition exists in the labels associated with each of the ith gradient propagation node, and includes:
if it is determined that the ratio of the number of nodes in the ith gradient propagation node that are associated with the irrelevant label to the number of nodes in the ith gradient propagation node is greater than a set threshold, determining that an irrelevant label meeting the preset propagation condition exists;
and if it is determined that this ratio is less than or equal to the set threshold, determining that no irrelevant label meeting the preset propagation condition exists.
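The threshold test above can be sketched as follows. The function and parameter names are hypothetical. The default threshold of 0.5 is the default ratio mentioned in step A3, and the choice of denominator (all nodes of the ith gradient) follows the wording of this embodiment and is an assumption where the text is ambiguous.

```python
def meets_propagation_condition(associated_nodes, gradient_nodes, threshold=0.5):
    """Return True if an irrelevant label should be propagated.

    associated_nodes: nodes associated with the irrelevant label.
    gradient_nodes:   nodes of the ith gradient propagation node.
    """
    gradient = set(gradient_nodes)
    if not gradient:
        return False  # nothing to propagate from
    ratio = len(set(associated_nodes) & gradient) / len(gradient)
    return ratio > threshold  # strictly greater: equality does not propagate
```

A label attached to three of four nodes in the gradient (ratio 0.75) would be propagated, while a label attached to exactly half of them would not, because the condition requires the ratio to exceed the threshold.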
In some embodiments, the first determining module 300 is configured to determine, from the nodes associated with each tag in the tag group to be matched, an alternative node set, and includes:
determining at least one label meeting a preset pruning condition from the label group to be matched;
and taking the node associated with each label in the at least one label as a candidate node and putting the candidate node into a candidate node set.
In some embodiments, the determining a final score for each candidate node in the first gradient propagation node according to the propagation scores of the respective labels associated with each candidate node in the ith gradient propagation node comprises:
determining an initial score of each label associated with each node in the ith gradient propagation node under the condition that the value of i is greater than 1;
and determining a final score of each candidate node in the first gradient propagation node according to the propagation score and the initial score of each label associated with each node in the ith gradient propagation node.
In some embodiments, the second determining module 301 is further configured to:
pre-determining the characteristic attribute and the excitation value of each label in the label group to be matched;
under the condition that the value of i is greater than or equal to 1, when determining that each label associated with each node in the ith gradient propagation node is a label in the label group to be matched, determining an initial score of each label associated with each node in the ith gradient propagation node according to the excitation value of that label and the occurrence frequency of its characteristic attribute in the label group to be matched;
and when determining that each label associated with each node in the ith gradient propagation node is not a label in the label group to be matched, taking a preset value as an initial score of each label associated with each node in the ith gradient propagation node.
In practical applications, each of the first determining module 300 and the second determining module 301 may be implemented by a processor located in an electronic device, and the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
In addition, each functional module in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware or a form of a software functional module.
Based on this understanding, the technical solution of the present embodiment, in essence, or the part contributing to the related art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (Processor) to execute all or part of the steps of the method of the present embodiment. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a RAM, a magnetic disk, or an optical disk.
Specifically, the computer program instructions corresponding to the tag matching method in the present embodiment may be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive. When the computer program instructions corresponding to the tag matching method in the storage medium are read or executed by an electronic device, any one of the tag matching methods of the foregoing embodiments is implemented.
Based on the same technical concept as the foregoing embodiments, fig. 4 shows an electronic device 400 provided in an embodiment of the present application, which may include a memory 401 and a processor 402, wherein,
a memory 401 for storing computer programs and data;
a processor 402 for executing a computer program stored in the memory to implement any of the tag matching methods of the previous embodiments.
In practical applications, the Memory 401 may be a Volatile Memory (Volatile Memory), such as a RAM; or a Non-Volatile Memory (Non-Volatile Memory), such as a ROM, a Flash Memory, a Hard Disk (HDD), or a Solid-State Drive (SSD); or a combination of the above types of memories and provides instructions and data to the processor 402.
The processor 402 may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor. It is understood that, for different tag matching devices, the electronic component implementing the above processor function may be another device, and the embodiments of the present application are not particularly limited in this respect.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present application may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
The foregoing descriptions of the various embodiments emphasize the differences between them; for the same or similar parts, the embodiments may be referred to one another, and for brevity these parts are not described again here.
The methods disclosed in the method embodiments provided by the present application can be combined arbitrarily without conflict to obtain new method embodiments.
Features disclosed in various product embodiments provided by the application can be combined arbitrarily to obtain new product embodiments without conflict.
The features disclosed in the various method or apparatus embodiments provided herein may be combined in any combination to arrive at new method or apparatus embodiments without conflict.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application.
Claims (11)
1. A method of tag matching, the method comprising:
receiving a tag matching request, wherein the tag matching request comprises a tag group to be matched; determining an alternative node set from nodes associated with each label in the label group to be matched, and taking each alternative node in the alternative node set as a first gradient propagation node;
determining whether an irrelevant label meeting a preset propagation condition exists in each label associated with each node in the ith gradient propagation node; the unrelated tags represent tags not included in the set of tags to be matched; i is an integer of 1 or more;
when the situation that the irrelevant label accords with the preset propagation condition exists is determined, the irrelevant label which accords with the preset propagation condition is added into the label group to be matched, and a non-alternative node which is associated with the irrelevant label which accords with the preset propagation condition is used as an i +1 th gradient propagation node;
when it is determined that no irrelevant label meeting the preset propagation condition exists and the value of i is greater than 1, determining a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node; and the final score is used for reflecting the matching degree of the label group to be matched and each alternative node.
2. The method of claim 1, wherein determining a final score for each candidate node in the first gradient propagation node based on the propagation scores of the respective labels associated with each of the ith gradient propagation nodes comprises:
when the value of i is greater than 1 and it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, determining the propagation score of each label associated with each node in the (i-1)th gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node;
until the propagation score of each label associated with each alternative node in the first gradient propagation node is obtained; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each candidate node in the first gradient propagation node.
3. The method of claim 2, further comprising:
when different nodes in the ith gradient propagation node are determined to be associated with the same target label, determining a propagation score of each node in the different nodes associated with the same target label;
selecting the maximum propagation score of the target label from the propagation scores of each node in the different nodes related to the same target label;
and determining the propagation score of each label associated with each node in the (i-1)th gradient propagation node according to the maximum propagation score of the target label.
4. The method of claim 1, further comprising:
determining an initial score of each label associated with each candidate node in the first gradient propagation node;
if it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, determining the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node as the final score result of that candidate node.
5. The method according to claim 1, wherein the determining whether there is an irrelevant label meeting a preset propagation condition in the labels associated with each of the ith gradient propagation nodes comprises:
if it is determined that the ratio of the number of nodes in the ith gradient propagation node that are associated with the irrelevant label to the number of nodes in the ith gradient propagation node is greater than a set threshold, determining that an irrelevant label meeting the preset propagation condition exists;
and if it is determined that this ratio is less than or equal to the set threshold, determining that no irrelevant label meeting the preset propagation condition exists.
6. The method according to claim 1, wherein the determining a set of alternative nodes from the nodes associated with each tag in the set of tags to be matched comprises:
determining at least one label meeting a preset pruning condition from the label group to be matched;
and taking the node associated with each label in the at least one label as a candidate node and putting the candidate node into a candidate node set.
7. The method of claim 1, wherein determining a final score for each candidate node in the first gradient propagation node based on the propagation scores of the respective labels associated with each of the ith gradient propagation nodes comprises:
determining an initial score of each label associated with each node in the ith gradient propagation node under the condition that the value of i is greater than 1;
and determining a final score of each candidate node in the first gradient propagation node according to the propagation score and the initial score of each label associated with each node in the ith gradient propagation node.
8. The method according to claim 4 or 7, characterized in that the method further comprises:
pre-determining the characteristic attribute and the excitation value of each label in the label group to be matched;
under the condition that the value of i is greater than or equal to 1, when determining that each label associated with each node in the ith gradient propagation node is a label in the label group to be matched, determining an initial score of each label associated with each node in the ith gradient propagation node according to the excitation value of that label and the occurrence frequency of its characteristic attribute in the label group to be matched;
and when determining that each label associated with each node in the ith gradient propagation node is not a label in the label group to be matched, taking a preset value as an initial score of each label associated with each node in the ith gradient propagation node.
9. A tag matching apparatus, characterized in that the apparatus comprises:
a first determining module, configured to receive a tag matching request, wherein the tag matching request comprises a tag group to be matched; determine an alternative node set from nodes associated with each label in the label group to be matched, and take each alternative node in the alternative node set as a first gradient propagation node;
a second determining module, configured to determine whether an irrelevant label meeting a preset propagation condition exists among the labels associated with each node in the ith gradient propagation node, wherein an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when it is determined that an irrelevant label meeting the preset propagation condition exists, add that irrelevant label to the label group to be matched, and take the non-alternative nodes associated with it as the (i+1)th gradient propagation node; and when it is determined that no irrelevant label meeting the preset propagation condition exists and the value of i is greater than 1, determine a final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the ith gradient propagation node, wherein the final score reflects the degree of matching between the label group to be matched and each alternative node.
10. An electronic device, characterized in that the device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any of claims 1 to 8.
11. A computer storage medium on which a computer program is stored, characterized in that the computer program realizes the method of any one of claims 1 to 8 when executed by a processor.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111475736.0A CN114117168A (en) | 2021-12-06 | 2021-12-06 | Label matching method, device, equipment and computer storage medium |
PCT/CN2022/099766 WO2023103327A1 (en) | 2021-12-06 | 2022-06-20 | Label matching method and apparatus, and device, computer storage medium, and program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111475736.0A CN114117168A (en) | 2021-12-06 | 2021-12-06 | Label matching method, device, equipment and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114117168A (en) | 2022-03-01 |
Family
ID=80366948
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111475736.0A Pending CN114117168A (en) | 2021-12-06 | 2021-12-06 | Label matching method, device, equipment and computer storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114117168A (en) |
WO (1) | WO2023103327A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023103327A1 (en) * | 2021-12-06 | 2023-06-15 | 深圳前海微众银行股份有限公司 | Label matching method and apparatus, and device, computer storage medium, and program |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10824674B2 (en) * | 2016-06-03 | 2020-11-03 | International Business Machines Corporation | Label propagation in graphs |
CN109582675A (en) * | 2018-11-29 | 2019-04-05 | 北京达佳互联信息技术有限公司 | Tag match method, apparatus, server and storage medium |
CN111967262B (en) * | 2020-06-30 | 2024-01-12 | 北京百度网讯科技有限公司 | Determination method and device for entity tag |
CN112507066A (en) * | 2020-11-18 | 2021-03-16 | 北京三快在线科技有限公司 | Label marking method and device, electronic equipment and readable storage medium |
CN112699237B (en) * | 2020-12-24 | 2021-10-15 | 百度在线网络技术(北京)有限公司 | Label determination method, device and storage medium |
CN114117168A (en) * | 2021-12-06 | 2022-03-01 | 深圳前海微众银行股份有限公司 | Label matching method, device, equipment and computer storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2023103327A1 (en) | 2023-06-15 |
Similar Documents
Publication | Title |
---|---|
CA2562281C (en) | Partial query caching |
US20190303141A1 (en) | Syntax Based Source Code Search |
CN105740386B (en) | Thesis searching method and device based on sorting integration |
Scanagatta et al. | Efficient learning of bounded-treewidth Bayesian networks from complete and incomplete data sets |
JP2013522720A (en) | Determination of word information entropy |
US20230409588A1 (en) | System and method for subset searching and associated search operators |
CN109992590B (en) | Approximate space keyword query method and system with digital attributes in traffic network |
US11556527B2 (en) | System and method for value based region searching and associated search operators |
CN112508177A (en) | Network structure searching method and device, electronic equipment and storage medium |
CN114117168A (en) | Label matching method, device, equipment and computer storage medium |
US20220284023A1 (en) | Estimating computational cost for database queries |
CN115989489A (en) | Concept prediction for automatically creating new intents and assigning examples in a dialog system |
CN116108076B (en) | Materialized view query method, materialized view query system, materialized view query equipment and storage medium |
EP2731021B1 (en) | Apparatus, program, and method for reconciliation processing in a graph database |
Guo et al. | K-loop free assignment in conference review systems |
Blake et al. | Reinforcement learning based decision tree induction over data streams with concept drifts |
Wang et al. | Efficient similarity search for sets over graphs |
Argentini | Ranking aggregation based on belief function theory |
CN110209829B (en) | Information processing method and device |
US20160321575A1 (en) | Scoring entries in a repository of business process models to facilitate searching |
CN112084290B (en) | Data retrieval method, device, equipment and storage medium |
CN116501841B (en) | Fuzzy query method, system and storage medium for data model |
Seleznev et al. | Double-Hashing Algorithm for Frequency Estimation in Data Streams |
Zhang et al. | Algorithms to calculate the most reliable maximum flow in content delivery network |
Meena et al. | Integrating swarm intelligence and statistical data for feature selection in text categorization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||