WO2023103327A1 - Label matching method, apparatus, device, computer storage medium and program - Google Patents

Label matching method, apparatus, device, computer storage medium and program

Info

Publication number
WO2023103327A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
label
propagation
gradient
candidate
Prior art date
Application number
PCT/CN2022/099766
Other languages
English (en)
French (fr)
Inventor
华德义
王和平
黄山
杨峙岳
刘有
杨永坤
白乐
徐嘉杨
饶进阳
邸帅
卢道和
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023103327A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Definitions

  • The present application relates to the technical field of cloud computing in financial technology (Fintech), and involves, but is not limited to, a tag matching method, device, electronic equipment, computer storage medium and computer program product.
  • The present application provides a label matching method, device, electronic equipment, computer storage medium and computer program product, which can address the large amount of calculation required for label matching in the related art.
  • An embodiment of the present application provides a label matching method, the method comprising:
  • receiving a label matching request, where the label matching request includes the label group to be matched; determining a candidate node set from the nodes associated with each label in the label group to be matched, and using each candidate node in the candidate node set as the first gradient propagation node;
  • the determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node includes:
  • The propagation score of each label associated with each node in the previous gradient propagation node is determined according to the propagation score of each label associated with each node in the current gradient propagation node; that is, the final score of each candidate node is determined through back propagation, which can ensure the accuracy of the matching result.
  • the method also includes:
  • the method also includes:
  • when it is determined that, among the labels associated with each candidate node in the first gradient propagation node, there is no irrelevant label that meets the preset propagation condition, the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node is determined as the final score of that candidate node.
  • the determining whether there is an irrelevant label that meets a preset propagation condition among the labels associated with each node in the i-th gradient propagation node includes:
  • the determining the candidate node set from the nodes associated with each tag in the tag group to be matched includes:
  • the node associated with each label in the at least one label is used as a candidate node and put into the candidate node set.
  • the determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node includes:
  • the final score of each candidate node in the first gradient propagation node is determined according to the propagation score and the initial score of each label associated with each node in the i-th gradient propagation node.
  • The final score of each candidate node in the first gradient propagation node is jointly determined from the initial scores and from the propagation scores of the labels associated with each node in the other gradient propagation nodes, which can improve the generalizability of the matching result.
  • the method also includes:
  • when it is determined that a label associated with a node in the i-th gradient propagation node is a label in the label group to be matched, the initial score of that label is determined according to the incentive value of the label and the number of times its feature attribute appears in the label group to be matched; when it is determined that the label is not a label in the label group to be matched, the preset value is used as the initial score of that label.
  • the embodiment of the present application also proposes a tag matching device, the device includes a first determination module and a second determination module, wherein,
  • the first determination module is configured to receive a tag matching request, where the tag matching request includes a tag group to be matched; determine a candidate node set from the nodes associated with each tag in the tag group to be matched; and use each candidate node in the candidate node set as the first gradient propagation node;
  • the second determination module is configured to determine whether there is an irrelevant label that meets a preset propagation condition among the labels associated with each node in the i-th gradient propagation node, where an irrelevant label is a label that is not included in the label group to be matched and i is an integer greater than or equal to 1; when it is determined that such a label exists, add the irrelevant label that meets the preset propagation condition to the label group to be matched, and use the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when it is determined that no such label exists and the value of i is greater than 1, determine the propagation score of each label associated with each node in the i-th gradient propagation node, and determine the final score of each candidate node in the first gradient propagation node according to those propagation scores; the final score is used to reflect the matching degree between the label group to be matched and each candidate node.
  • An embodiment of the present application provides an electronic device. The device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the label matching method provided by one or more of the aforementioned technical solutions.
  • An embodiment of the present application provides a computer storage medium that stores a computer program; when the computer program is executed, the tag matching method provided by one or more of the foregoing technical solutions can be implemented.
  • The embodiment of the present application also provides a computer program product including computer-readable code; when the computer-readable code runs in an electronic device, the processor in the electronic device executes the tag matching method provided by one or more of the aforementioned technical solutions.
  • The embodiment of the present application proposes a tag matching method, device, electronic equipment, computer storage medium and computer program product. The method includes: receiving a tag matching request, where the tag matching request includes a tag group to be matched; determining a candidate node set from the nodes associated with each tag in the tag group to be matched, and using each candidate node in the candidate node set as the first gradient propagation node; determining whether there is an irrelevant tag that meets the preset propagation condition among the tags associated with each node in the i-th gradient propagation node, where an irrelevant tag is a tag that is not included in the tag group to be matched and i is an integer greater than or equal to 1; when it is determined that such a tag exists, adding the irrelevant tag that meets the preset propagation condition to the tag group to be matched, and using the non-candidate nodes associated with that irrelevant tag as the (i+1)-th gradient propagation node; and when it is determined that no such tag exists and the value of i is greater than 1, determining the propagation score of each tag associated with each node in the i-th gradient propagation node, and determining the final score of each candidate node in the first gradient propagation node according to those propagation scores.
  • In the embodiment of the present application, determining whether there are irrelevant labels that meet the preset propagation condition among the labels associated with each node in the i-th gradient propagation node amounts to determining whether the label propagation network can continue to be extended based on irrelevant labels. In this way, implicit irrelevant tags that can affect the tag matching result are more likely to be discovered, which improves the accuracy of subsequent tag matching. In addition, if the labels associated with the nodes in some gradient propagation node no longer meet the preset propagation condition, the labels associated with other nodes are no longer examined; that is, the iterative calculation of the label matching process stops, which effectively prevents the propagation depth from becoming too large. In the related art, each update or insertion of a tag requires N rounds of iterative training until global convergence; by comparison, the embodiment of the present application needs fewer iterations in the tag matching process, which can greatly reduce the amount of calculation for tag matching.
  • FIG. 1 is a schematic diagram of a tag matching network structure in the related art;
  • FIG. 2a is a schematic flow chart of a tag matching method in the embodiment of the present application;
  • FIG. 2b is a schematic diagram of a network structure for label matching in the embodiment of the present application;
  • FIG. 2c is a schematic diagram of another network structure for label matching in the embodiment of the present application;
  • FIG. 2d is a schematic flow chart of score transfer when the preset propagation condition is not met in the embodiment of the present application;
  • FIG. 3 is a schematic diagram of the composition and structure of the label matching device of the embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • The terms “comprising”, “including” or any other variation thereof are intended to cover a non-exclusive inclusion, so that a method or device comprising a series of elements includes not only the explicitly stated elements but also other elements not explicitly listed, or elements inherent in implementing the method or device.
  • An element defined by the phrase “comprising a ...” does not exclude the presence of additional related elements (such as steps in the method or units in the device; a unit may be, for example, part of a circuit, part of a processor, or part of a program or software).
  • The tag matching method provided in the embodiment of the present application includes a series of steps, but it is not limited to the steps described.
  • Likewise, the tag matching device provided in the embodiment of the present application includes a series of modules, but it is not limited to the explicitly recorded modules and may also include modules needed for obtaining relevant task data or for processing based on task data.
  • the embodiments of the present application can be applied to a computer system composed of servers, and can operate together with many other general-purpose or special-purpose computing system environments or configurations.
  • the server may be a distributed cloud computing technology environment including a small computer system, a large computer system, and so on.
  • program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types.
  • the computer system/server can be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computing system storage media including storage devices.
  • In the related art, data pre-scoring on labels can be performed by combining a matrix model with belief propagation.
  • The influence (PageRank, PR) value of each node is a matrix variable related to the node's labels, as shown in FIG. 1; to obtain the data pre-scores, multiple rounds of iterative calculation are required until the PR values of all nodes in the network stabilize, and the number of iterations is variable.
  • The label matching method can be implemented by a processor in the label matching device, and the processor can be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
  • Figure 2a is a schematic flow chart of a label matching method in the embodiment of the present application, as shown in Figure 2a, the method includes the following steps:
  • Step 200: Receive a label matching request, where the label matching request includes the label group to be matched; determine the candidate node set from the nodes associated with each label in the label group to be matched, and use each candidate node in the candidate node set as the first gradient propagation node.
  • The label matching method can be applied to a network structure composed of labels (Label) and cluster nodes (Node), referred to as a label network; here, a cluster node represents a collection of multiple nodes, and each node may correspond to a business entity, where the type of the business entity depends on the business scenario in which the tag matching method is actually applied.
  • The tag group to be matched may include one or more tags; here, each tag in the tag group to be matched has a corresponding feature attribute, and each feature attribute has its own incentive value, which acts as a weight and participates in the subsequent score calculation.
  • The feature attributes of different tags can be the same or different; when different tags have the same feature attribute, their incentive values are also the same, because the incentive value of a tag is the incentive value of the feature attribute corresponding to that tag.
  • The embodiment of the present application does not limit how the feature attributes of tags are classified; for example, they can be divided into core tags (Core), suitable tags (Suitable), preferred tags (Prioritized), optional tags (Optional), and so on, or they can be classified in other ways.
  • For example, the incentive value can be manually preset to 2.0 for the core label, 1.0 for the suitable label, 0.8 for the preferred label, and 0.8 for the optional label.
  • When the tag network receives a tag matching request sent from outside, it can obtain the tag group to be matched from the request and then determine the candidate node set according to the nodes associated with each tag in the tag group to be matched.
  • The node associated with each tag in the tag group to be matched may be used as a candidate node in the candidate node set.
  • Alternatively, the candidate node set may be determined as follows: determine at least one label that meets the preset pruning condition from the label group to be matched, and put the nodes associated with each of these labels into the candidate node set as candidate nodes.
  • The preset pruning condition is a filtering condition set according to practical experience with the business scenario, and its purpose is to reduce the amount of calculation; for example, when the tag group to be matched includes multiple tags, at least one tag that meets the preset pruning condition can be selected from them, and the nodes associated with each selected tag are then used as candidate nodes in the candidate node set.
  • FIG. 2b is a schematic diagram of a network structure for tag matching in the embodiment of the present application.
  • As shown in FIG. 2b, the tag group to be matched includes four tags whose feature attributes are, respectively, suitable (Suitable), preferred (Prioritized), optional (Optional) and core (Core). Assuming the preset pruning condition is that a candidate node must be associated with all labels in the label group to be matched whose feature attribute is Core or Suitable, the two nodes inside the dotted line in FIG. 2b are candidate nodes, and these two candidate nodes form the candidate node set.
  • each candidate node in the candidate node set can be used as the first gradient propagation node; it can be seen from FIG. 2b that the first gradient propagation node includes two nodes in the dotted line.
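  • The pruning example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function name `select_candidates` and the node and tag names are hypothetical and do not come from the application.

```python
from typing import Dict, Set


def select_candidates(node_labels: Dict[str, Set[str]],
                      label_attrs: Dict[str, str],
                      required_attrs: Set[str]) -> Set[str]:
    """Keep nodes associated with every label whose attribute is required.

    node_labels maps a node name to the labels associated with it;
    label_attrs maps each label in the group to be matched to its attribute.
    """
    required = {lab for lab, attr in label_attrs.items() if attr in required_attrs}
    return {node for node, labs in node_labels.items() if required <= labs}


# Hypothetical data: four tags, one per feature attribute, as in FIG. 2b.
label_attrs = {"t1": "Suitable", "t2": "Prioritized", "t3": "Optional", "t4": "Core"}
node_labels = {
    "node_a": {"t1", "t2", "t4"},
    "node_b": {"t1", "t4"},
    "node_c": {"t1", "t3"},   # lacks the Core tag, so it is pruned
    "node_d": {"t2"},
}
candidates = select_candidates(node_labels, label_attrs, {"Core", "Suitable"})
print(sorted(candidates))
```

  • With this data only `node_a` and `node_b` carry both the Core and the Suitable tag, so they form the candidate node set and serve as the first gradient propagation node.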
  • Step 201: Determine whether there are irrelevant labels that meet the preset propagation condition among the labels associated with each node in the i-th gradient propagation node; irrelevant labels are labels that are not included in the label group to be matched; i is an integer greater than or equal to 1.
  • each label associated with each node in the current gradient propagation node is traversed to determine the initial score of each label associated with each node in the current gradient propagation node; here, if i is equal to 1 , then the current gradient propagation node is the first gradient propagation node; if i is equal to 2, the current gradient propagation node is the second gradient propagation node.
  • The initial score of each tag associated with each node in the current gradient propagation node may be determined as follows: the feature attribute and incentive value of each tag in the tag group to be matched are determined in advance; with i greater than or equal to 1, when a label associated with a node in the i-th gradient propagation node is a label in the label group to be matched, its initial score is determined according to the incentive value of the label and the number of times its feature attribute appears in the label group to be matched; when the label is not a label in the label group to be matched, the preset value is used as its initial score.
  • That is, the feature attribute and incentive value of each tag in the tag group to be matched can be determined in advance; while traversing the tags associated with each node in the current gradient propagation node, if the current tag is a tag in the tag group to be matched, its initial score S can be determined according to the incentive value of the tag and the number of times its feature attribute appears in the tag group to be matched, as given by expression (1), where:
  • C represents the basic score of the current label, which can be set manually; the default setting is 1, and it is used to widen the differences in the data;
  • Fn represents the number of times the feature attribute of the current label appears in the label group to be matched;
  • Bn represents the incentive value corresponding to the feature attribute of the current label.
  • If the current label is not a label in the label group to be matched, its initial score can be set to a preset value; the preset value can be chosen according to the actual business scenario and is not limited in this embodiment of the application; for example, it can be set to 0 or to another value.
  • In this way, the initial score of each label can be determined in a targeted manner and then used in the score calculation of the candidate nodes, which helps ensure the validity of the final score.
  • irrelevant labels represent labels that are not included in the label group to be matched corresponding to the current gradient propagation node; for example, assuming that the current gradient propagation node includes node 1 and node 2, the corresponding label group to be matched includes label 1 and label 2; if node 1 is associated with label 1 and label 3, and node 2 is associated with label 1, label 2, and label 3, then label 3 is an irrelevant label.
  • Determining whether there are irrelevant labels that meet the preset propagation condition among the labels associated with each node in the i-th gradient propagation node may include: when the ratio of the number of nodes associated with the irrelevant label in the i-th gradient propagation node to the number of all nodes in the i-th gradient propagation node is greater than the set threshold, determining that there is an irrelevant label that meets the preset propagation condition; when that ratio is less than or equal to the set threshold, determining that there is no irrelevant label that meets the preset propagation condition.
  • The value of the set threshold may be determined according to the actual service scenario and is not limited in this embodiment of the present application; for example, it may be 0.5 or another value.
  • For example, suppose the current gradient propagation node includes node 1 and node 2, and its corresponding label group to be matched includes label 1 and label 2. If node 1 is associated with label 1 and label 3, and node 2 is associated with label 1, label 2 and label 3, then for the irrelevant label (label 3), both node 1 and node 2 are associated with it, so the number of nodes associated with the irrelevant label in the current gradient propagation node is 2, and the ratio of this number to the number of all nodes in the current gradient propagation node is 1. If the set threshold is 0.5, the ratio 1 is greater than the threshold 0.5, so among the labels associated with each node in the current gradient propagation node there is an irrelevant label (label 3) that meets the preset propagation condition.
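  • The threshold check in this example can be written as a short Python sketch. It assumes the ratio is evaluated separately for each irrelevant label; the function name `propagating_labels` and the data layout are hypothetical and not part of the application.

```python
from typing import Dict, Set


def propagating_labels(gradient_nodes: Set[str],
                       node_labels: Dict[str, Set[str]],
                       group: Set[str],
                       threshold: float = 0.5) -> Set[str]:
    """Return irrelevant labels whose node ratio exceeds the threshold."""
    if not gradient_nodes:
        return set()
    counts: Dict[str, int] = {}
    for node in gradient_nodes:
        # Labels outside the group to be matched are the irrelevant labels.
        for lab in node_labels[node] - group:
            counts[lab] = counts.get(lab, 0) + 1
    total = len(gradient_nodes)
    return {lab for lab, c in counts.items() if c / total > threshold}


node_labels = {
    "node 1": {"label 1", "label 3"},
    "node 2": {"label 1", "label 2", "label 3"},
}
result = propagating_labels({"node 1", "node 2"}, node_labels,
                            {"label 1", "label 2"}, threshold=0.5)
print(result)
```

  • Both nodes carry label 3, so its ratio is 2/2 = 1, which is greater than 0.5, and label 3 is returned as an irrelevant label that meets the propagation condition.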
  • Step 202: When it is determined that such a label exists, add the irrelevant label that meets the preset propagation condition to the label group to be matched, and use the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when it is determined that no such label exists and the value of i is greater than 1, determine the propagation score of each label associated with each node in the i-th gradient propagation node, and determine the final score of each candidate node in the first gradient propagation node according to those propagation scores.
  • When it is determined in step 201 that there is an irrelevant label meeting the preset propagation condition among the labels associated with each node in the current gradient propagation node, the irrelevant label is meaningful in practice and the current propagation network can still be extended; that is, the irrelevant label has some influence on the scores of the candidate nodes in the first gradient propagation node. In this case, the irrelevant label is added to the label group to be matched, and the non-candidate nodes associated with the irrelevant label are looked up and used as the next gradient propagation node.
  • For example, suppose the current gradient propagation node includes node 1 and node 2, and its corresponding label group to be matched includes label 1 and label 2. If node 1 is associated with label 1 and label 3, and node 2 is associated with label 1, label 2 and label 3, it can be determined that there is an irrelevant label (label 3) meeting the preset propagation condition among the labels associated with each node in the current gradient propagation node. The irrelevant label (label 3) is then added to the label group to be matched, so that the label group to be matched includes label 1, label 2 and label 3. If label 3 is also associated with node 3, then node 3 is a non-candidate node associated with the irrelevant label, and node 3 is used as the next gradient propagation node.
  • The label group to be matched corresponding to the next gradient propagation node is the label group to be matched after the irrelevant labels have been added.
  • After the next gradient propagation node is obtained, step 201 is repeated for it: the initial score of each label associated with each of its nodes is determined, and it is determined whether there is an irrelevant label that meets the preset propagation condition. The implementation is similar to that described above for the current gradient propagation node and is not repeated here.
  • For example, if the next gradient propagation node is the second gradient propagation node and, after the initial score of each label associated with each of its nodes is determined according to step 201, another irrelevant label meeting the preset propagation condition is found, then that irrelevant label is added to the label group to be matched corresponding to the second gradient propagation node, and the non-candidate nodes associated with it are used as the third gradient propagation node; the judgment of step 201 continues in this way until no irrelevant label meeting the preset propagation condition remains in the propagation network.
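  • The gradient-by-gradient expansion described above might be sketched as the following loop. It is a simplified illustration under two assumptions that are not stated in the application: the propagation ratio is computed per irrelevant label, and a node already placed in an earlier gradient is never selected again (which guarantees termination). All names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def propagating_labels(nodes: Set[str], node_labels: Dict[str, Set[str]],
                       group: Set[str], threshold: float) -> Set[str]:
    """Irrelevant labels carried by more than `threshold` of the nodes."""
    counts: Dict[str, int] = {}
    for node in nodes:
        for lab in node_labels[node] - group:
            counts[lab] = counts.get(lab, 0) + 1
    return {lab for lab, c in counts.items() if c / len(nodes) > threshold}


def build_gradients(candidates: Set[str], node_labels: Dict[str, Set[str]],
                    group: Set[str], threshold: float = 0.5
                    ) -> Tuple[List[Set[str]], Set[str]]:
    """Extend the propagation network until no irrelevant label qualifies."""
    group = set(group)                  # do not mutate the caller's set
    gradients = [set(candidates)]       # gradients[0] is the first gradient
    visited = set(candidates)
    while True:
        new_labels = propagating_labels(gradients[-1], node_labels, group, threshold)
        if not new_labels:
            break                       # propagation condition no longer met
        group |= new_labels             # extend the label group to be matched
        nxt = {n for n, labs in node_labels.items()
               if n not in visited and labs & new_labels}
        if not nxt:
            break                       # no further nodes to propagate to
        gradients.append(nxt)
        visited |= nxt
    return gradients, group


node_labels = {
    "node 1": {"label 1", "label 3"},
    "node 2": {"label 1", "label 2", "label 3"},
    "node 3": {"label 3"},
}
gradients, final_group = build_gradients({"node 1", "node 2"}, node_labels,
                                         {"label 1", "label 2"})
print(gradients, final_group)
```

  • For the three-node example, label 3 qualifies at the first gradient, node 3 becomes the second gradient propagation node, and the loop then stops because node 3 carries no further irrelevant label.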
  • Determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node may include: when the value of i is greater than 1 and it is determined that there is no irrelevant label meeting the preset propagation condition among the labels associated with each node in the i-th gradient propagation node, determining the propagation score of each label associated with each node in the (i-1)-th gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node, and so on until the propagation score of each label associated with each candidate node in the first gradient propagation node is obtained; then determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with that candidate node. Here, the final score reflects the matching degree between the label group to be matched and each candidate node.
  • For example, when i is equal to 2, the propagation score of each label associated with each node in the second gradient propagation node is determined first; then, according to those propagation scores, the propagation score of each label associated with each node in the first gradient propagation node is determined; finally, the final score of each candidate node in the first gradient propagation node is determined according to the propagation score of each label associated with that candidate node.
  • Determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node may also include: when the value of i is greater than 1, determining the initial score of each label associated with each node in the i-th gradient propagation node, and determining the final score of each candidate node in the first gradient propagation node according to both the propagation score and the initial score of each label associated with each node in the i-th gradient propagation node.
  • For example, when i is equal to 2, the initial score of each label associated with each node in the second gradient propagation node is determined first, and the propagation score of each of these labels is determined from its initial score; then, according to the propagation score of each label associated with each node in the second gradient propagation node, the propagation score of each label associated with each node in the first gradient propagation node is determined; finally, the final score of each candidate node in the first gradient propagation node is determined according to the propagation scores and the initial scores of the labels associated with that candidate node.
  • Here, the propagation score reflects how strongly the scores of the labels associated with each node in the second gradient propagation node influence the candidate node scores through the same labels associated with the first gradient propagation node (that is, the irrelevant labels added to the label group to be matched).
  • The final score of each candidate node in the first gradient propagation node is jointly determined from the initial scores and from the propagation scores of the labels associated with each node in the other gradient propagation nodes, which can improve the generalizability of the matching result.
  • the first gradient propagation node includes node 1 and node 2, and its corresponding label group 1 to be matched includes label 1 and label 2; node 1 is associated with label 1 and label 3, and node 2 is associated with label 2 and label 3.
  • Because node 1 in the first gradient propagation node is associated with label 1 and label 3, and node 2 is associated with label 2 and label 3, both nodes are associated with label 3. Therefore, when calculating the final scores of node 1 and node 2, the initial score of label 3 (0.5) and its propagation score (0.0625) are first added to obtain an accumulated score of 0.5625. The final score of node 1 is then the sum of the incentive value 2 corresponding to label 1 and the accumulated score 0.5625 of label 3, that is, 2.5625; the final score of node 2 is the sum of the incentive value 1.5 corresponding to label 2 and the accumulated score 0.5625 of label 3, that is, 2.0625.
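The arithmetic of this example can be checked with a short sketch. The dictionary layout and the names `node_labels`, `group_score`, and `shared_score` are illustrative assumptions, not part of the original text; only the numbers come from the example.

```python
# Node -> associated labels, as in the example (layout is an assumption).
node_labels = {"node1": ["label1", "label3"], "node2": ["label2", "label3"]}

# Scores the example assigns to the labels of the original label group.
group_score = {"label1": 2.0, "label2": 1.5}

# Label 3 is shared by both nodes: initial score 0.5 plus propagation score 0.0625.
shared_score = {"label3": 0.5 + 0.0625}

def final_score(node):
    """Sum the score of every label associated with the node."""
    return sum(group_score.get(label, 0.0) + shared_score.get(label, 0.0)
               for label in node_labels[node])

scores = {node: final_score(node) for node in node_labels}
print(scores)
```

All values here are exact binary fractions, so the sums reproduce the figures in the text without rounding error.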
  • During label matching, the propagation score of each label associated with each node in the previous gradient is determined according to the propagation score of each label associated with each node in the current gradient propagation node; that is, the final score of each candidate node is determined through back propagation, which can improve the accuracy of the matching result.
  • The above method may further include: when different nodes in the i-th gradient propagation node are associated with the same target label, determining the propagation score that each of those nodes assigns to the target label; selecting the maximum of these propagation scores as the propagation score of the target label; and determining, according to the maximum propagation score of the target label, the propagation score of each label associated with each node in the (i-1)-th gradient propagation node.
  • For example, the second gradient propagation node includes node 3 and node 4, both associated with label 4. Suppose the propagation score of label 4 is determined to be 0.2 through node 3 and 0.5 through node 4. Then 0.5 is used as the propagation score of label 4 when determining the propagation scores for the previous gradient propagation node (the first gradient propagation node), and finally the final score of each candidate node in the first gradient propagation node is determined.
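The maximum-selection rule in this example can be sketched as follows. The function name `max_propagation_scores` and the nested-dictionary input are hypothetical choices made for illustration; the text only specifies that the largest propagation score of a shared label is kept.

```python
def max_propagation_scores(per_node_scores):
    """For each label, keep the largest propagation score assigned by any node.

    per_node_scores maps node -> {label: propagation score} (assumed layout).
    """
    best = {}
    for label_scores in per_node_scores.values():
        for label, score in label_scores.items():
            # Keep the larger of the stored score and the new one.
            best[label] = max(best.get(label, score), score)
    return best

# Node 3 and node 4 both propagate a score to label 4, as in the example.
second_gradient = {"node3": {"label4": 0.2}, "node4": {"label4": 0.5}}
propagation = max_propagation_scores(second_gradient)
print(propagation)
```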
  • the embodiment of the present application proposes a label matching method, device, electronic equipment, computer storage medium and computer program product.
  • The method includes: receiving a label matching request, wherein the label matching request includes a label group to be matched; determining a set of candidate nodes from the nodes associated with each label in the label group to be matched, and using each candidate node in the set as the first gradient propagation node; determining whether an irrelevant label that meets the preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node, where an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when such a label exists, adding the irrelevant label that meets the preset propagation condition to the label group to be matched, and using the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when no such label exists and i is greater than 1, determining the propagation score of each label associated with each node in the i-th gradient propagation node; and determining, according to those propagation scores, the final score of each candidate node in the first gradient propagation node, where the final score reflects the degree of matching between the label group to be matched and each candidate node.
  • In the embodiment of the present application, determining whether an irrelevant label that meets the preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node amounts to determining, based on irrelevant labels, whether the label propagation network can continue to extend. This makes it easier to discover implicit irrelevant labels that affect the label matching result and improves the accuracy of subsequent label matching. In addition, if the labels associated with the nodes of some gradient propagation node no longer meet the preset propagation condition, the labels associated with other nodes are no longer examined; that is, the iterative calculation of the label matching process stops, which effectively prevents the propagation depth from becoming too large. Compared with related technologies, in which one label update or insertion requires N rounds of iterative training until global convergence, the embodiment of the present application needs fewer iterations and can greatly reduce the amount of computation required for label matching.
  • The above method may further include: determining the initial score of each label associated with each candidate node in the first gradient propagation node; and, if it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, determining the sum of the initial scores of the labels associated with each candidate node as the final score of that candidate node.
  • When the current gradient propagation node is the first gradient propagation node and it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with its candidate nodes, each label associated with each candidate node is a label in the label group to be matched. In this case, the initial score of each label associated with each candidate node is determined according to formula (1), and the sum of those initial scores is used as the final score of each candidate node.
  • For example, the first gradient propagation node includes node 1 and node 2, and its corresponding label group to be matched includes label 1 and label 2. If node 1 is associated with label 1, node 2 is associated with label 1 and label 2, and the initial scores of label 1 and label 2 are determined according to formula (1) to be 0.3 and 0.8 respectively, then the final score of node 1 is 0.3 and the final score of node 2 is 1.1.
  • After the final score of each candidate node is obtained, the final scores can be sorted in descending order, and the top-ranked candidate node is the node that best matches the label group to be matched. Alternatively, the final scores can first be normalized according to the standard deviation and the normalized results sorted in descending order; the top-ranked candidate node is again the best match. Since each node corresponds to a service entity, the service entity that best matches the label group to be matched can be obtained in this way.
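A minimal sketch of the summing and ranking described above, using the numbers from the node 1 / node 2 example. The text does not define "normalize according to the standard deviation" precisely; the z-score used below (subtract the mean, divide by the population standard deviation) is an assumption, as are the function and variable names.

```python
import statistics

def rank_candidates(final_scores, normalize=False):
    """Return (node, score) pairs sorted from best to worst match."""
    scores = dict(final_scores)
    if normalize and len(scores) > 1:
        values = list(scores.values())
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        if std > 0:
            # Assumed normalization: z-score of each final score.
            scores = {node: (s - mean) / std for node, s in scores.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

initial = {"label1": 0.3, "label2": 0.8}
node_labels = {"node1": ["label1"], "node2": ["label1", "label2"]}

# No irrelevant label propagates, so the final score is the sum of initial scores.
final_scores = {node: sum(initial[label] for label in labels)
                for node, labels in node_labels.items()}

ranking = rank_candidates(final_scores)
normalized_ranking = rank_candidates(final_scores, normalize=True)
print(ranking[0][0])
```

Either ordering puts node 2 first, because normalization of this kind preserves the order of the scores.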
  • the network structure shown in Figure 2b is a partial network
  • Figure 2c is a schematic diagram of another network structure for label matching in the embodiment of the present application. As shown in Figure 2c, the network structure is a bidirectional ring structure that can be represented by two sets of relationships, namely the out-degree and in-degree of the edges: node (Node)->label (Label) and Label->Node. These two relationship sets serve as the input of the label matching algorithm.
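One possible in-memory form of the two relationship sets is a pair of dictionaries, where one is the inversion of the other. This is only a sketch: the text does not say how the relationships are stored, and the names `node_to_labels`, `label_to_nodes`, and `invert` are assumptions.

```python
from collections import defaultdict

def invert(node_to_labels):
    """Build the Label->Node relationship from the Node->Label relationship."""
    label_to_nodes = defaultdict(set)
    for node, labels in node_to_labels.items():
        for label in labels:
            label_to_nodes[label].add(node)
    return dict(label_to_nodes)

# Node->Label relationship (example data, not from the source figures).
node_to_labels = {
    "node1": {"label1", "label3"},
    "node2": {"label2", "label3"},
}
label_to_nodes = invert(node_to_labels)
print(sorted(label_to_nodes["label3"]))
```

Keeping both directions available lets each step of the algorithm look up either the labels of a node or the nodes of a label directly.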
  • the algorithm is described in detail below:
  • Step A1 Use the search algorithm to obtain the Label->Node relationship group based on the label group to be matched attached to the label matching request.
  • Here, preset pruning conditions can be set to prune the network. For example, the preset pruning condition in Figure 2b is that a candidate Node must be associated with all labels in the label group to be matched that have the Core or Suitable feature attribute; the nodes inside the dotted line in Figure 2b therefore form the first gradient propagation node of the propagation network, and the other nodes, which do not meet the condition, are pruned. If no preset pruning condition is set, the original Label->Node relationship group is used and all involved Nodes serve as the first gradient propagation node. The first gradient propagation node is also the set of candidate nodes for the final matching result.
  • Step A2 Based on each Node in the current gradient propagation node, search to obtain the Node->Label relationship group, traverse the Labels associated with each Node, and give each Label an initial score. If the Label is not a label in the label group to be matched, its initial score is 0. Otherwise, let the current Label be N, the basic score be C, the number of times the Label's feature attribute occurs in the label group to be matched be Fn, and the incentive value of the Label's feature attribute be Bn; the initial score is then (C/Fn)*Bn.
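The initial-score rule of step A2 can be sketched as below. The incentive values are the preset ones quoted elsewhere in this document; the basic score C = 1.0, the function name `initial_score`, and the representation of the label group as a label-to-attribute dictionary are assumptions made for illustration.

```python
from collections import Counter

# Preset incentive values Bn per feature attribute, as quoted in the document.
INCENTIVE = {"Core": 2.0, "Suitable": 1.0, "Prioritized": 0.8, "Optional": 0.3}

def initial_score(label, label_group, base_score=1.0):
    """Return (C / Fn) * Bn for a label in the group, and 0 otherwise.

    label_group maps label -> feature attribute (assumed layout);
    base_score is the basic score C (its value is an assumption).
    """
    if label not in label_group:
        return 0.0  # Irrelevant labels start with an initial score of 0.
    attribute = label_group[label]
    # Fn: how often this feature attribute occurs in the label group.
    fn = Counter(label_group.values())[attribute]
    return (base_score / fn) * INCENTIVE[attribute]

group = {"a": "Core", "b": "Optional", "c": "Optional"}
print(initial_score("a", group), initial_score("b", group), initial_score("z", group))
```

Dividing by Fn means that several labels sharing one feature attribute split that attribute's contribution rather than each receiving the full incentive value.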
  • Step A3 Invert the Node->Label relationship from step A2 and traverse the resulting Label->Node relationship. Step A2 ignores the effect on the score of labels not included in the label group to be matched (irrelevant labels), but because of the propagation network these labels also affect the score. Here, the ratio of the candidate Nodes associated with an irrelevant label to all Nodes associated with that label can be set as the propagation condition; the ratio defaults to 0.5. When the ratio exceeds this value, the irrelevant label is considered to have practical significance and the propagation network can continue to extend. If an irrelevant label meets the propagation condition, it is dynamically added to the label group to be matched, the non-candidate nodes associated with it are searched out as the next gradient propagation node, and the procedure returns to step A2; otherwise, it proceeds to the next step.
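The propagation condition of step A3 can be sketched as a ratio test. The function names and the set-based data layout are assumptions; the strict "greater than" comparison follows the summary of the method, which treats a ratio less than or equal to the threshold as not meeting the condition.

```python
def meets_propagation_condition(label, label_to_nodes, candidates, threshold=0.5):
    """Check whether enough of the label's nodes are candidate nodes."""
    nodes = label_to_nodes.get(label, set())
    if not nodes:
        return False
    ratio = len(nodes & candidates) / len(nodes)
    return ratio > threshold

def next_gradient_nodes(label, label_to_nodes, candidates):
    """Non-candidate nodes associated with the label form the next gradient."""
    return label_to_nodes.get(label, set()) - candidates

# Example data (hypothetical): two candidate nodes and two irrelevant labels.
candidates = {"n1", "n2"}
label_to_nodes = {"L3": {"n1", "n2", "n5"}, "L9": {"n1", "n6", "n7"}}

for label in ("L3", "L9"):
    print(label, meets_propagation_condition(label, label_to_nodes, candidates))
```

In this data, L3 is shared by two of its three nodes with the candidate set (ratio 2/3), so it propagates to n5, while L9 (ratio 1/3) does not extend the network.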
  • Step A4 Traverse the Node->Label relationship from step A2 again and take the sum of the scores of all Labels associated with each Node in the current gradient propagation node as the score of that Node. If a Node in the current gradient propagation node is not a candidate node, the propagation score of each Label associated with that Node is determined according to the proportion of the incentive value of the Label's feature attribute among the Labels associated with the Node. A Label may receive multiple propagation scores, in which case the largest one is taken. After the calculation is completed, the procedure returns to the previous gradient until the final score of each candidate node in the first gradient propagation node is obtained; otherwise, the score calculation is complete and the final score of each candidate node is returned.
  • Step A5 Normalize the final score of each candidate Node according to the standard deviation.
  • The label matching algorithm proposed in the embodiment of the present application requires very few iterations: if the labels in the label group to be matched are clear and precise, only one or two calculation steps may be needed. Compared with related technologies, it is also better at discovering hidden factors that affect the matching result. For example, hidden labels that do not appear in the query are generated during the calculation, and because the latent features of these labels have already been obtained through pre-computation, the accuracy of the matching result can be ensured. Furthermore, label matching in this application uses network propagation; to prevent excessive propagation depth and the formation of loopback networks, the network is pruned while propagating and a ratio is used to limit the propagation condition, so that different distribution networks are outlined dynamically.
  • The core and non-core labels and their weights can be defined by means of a reduction algorithm (such as Rough Set fuzzy set theory). The core process of the reduction algorithm is as follows. Assume that each label has all the feature attributes, each with a weight equal to the manually assigned weight multiplied by the incentive value (the manual weight is set empirically, and the initial incentive value is 1). The input training data set is a series of query labels and the corresponding matching entities. The query label set can be an input matrix of weighted values, and the matching entities, taken over all candidate entities, can be a 0-1 sequence of length N (the number of candidate entities). The reduction algorithm transforms the problem into a classification problem: each iteration solves for an optimal classification neural network, and after each round the input matrix is adjusted according to the reduction condition (by increasing or decreasing the incentive value) before the next iteration. The reduction condition must help strengthen the classification ability of the input matrix.
  • The role of feature attributes is to reduce the size of the propagation network generated each time without affecting the result of label matching, that is, the most suitable labeled entity. Therefore, in the reduction algorithm, the reduction condition is whether the matching entity result changes, together with the influence that adjusting the incentive value of a label's feature attribute has on the confidence range of the result. Finally, the feature attribute with the highest weighted value for each query label is used as its unique feature attribute, and the incentive value of each feature attribute at the final iteration is its final incentive value.
  • The incentive values of feature attributes can also be preset. For example, the core label (Core) has an incentive value of 2.0, the most suitable label (Suitable) 1.0, the preferred label (Prioritized) 0.8, and the optional label (Optional) 0.3.
  • the number of characteristic attributes is often not fixed.
  • a continuous numerical range can be decomposed and divided into multiple segments. Each segment represents a characteristic attribute, and the range of the segment is the range of incentive values.
  • Feature attributes and incentive values are equivalent to prior knowledge and participate indirectly in the calculation of label matching scores, but they do not directly represent the final matching score of the data, so the incentive values of feature attributes do not need to be adjusted frequently. Unlike in traditional scoring networks, feature attributes and incentive values are independent of specific matching queries.
  • Fig. 2d is a schematic flow chart of score value transfer when a kind of preset propagation condition is not met in the embodiment of the present application.
  • Here, initialization (init) means that the candidate node (Candidate Node) scores its associated Labels, so that the score of the candidate node is implicitly passed to the associated Labels; sum means that the score of a non-candidate node is the sum of the scores of all Labels associated with it, which is also a transfer; the broadcast between a non-candidate node (Unrelated Node) and its associated Labels propagates the score according to the proportion of the incentive value of the feature attribute described in step A4, which is a reverse transfer; finally, the broadcast from an associated Label to a candidate node propagates the Label's own maximum broadcast score, which is accumulated with the initial score of the candidate node to obtain the real score of the candidate node; that is, the overall transmission sequence is init->sum->broadcast->
  • Fig. 3 is a schematic diagram of the composition and structure of the label matching device according to the embodiment of the present application. As shown in Fig. 3, the device includes a first determination module 300 and a second determination module 301, wherein,
  • the first determination module 300 is configured to receive a label matching request, the label matching request including a label group to be matched; determine a set of candidate nodes from the nodes associated with each label in the label group to be matched; and use each candidate node in the candidate node set as the first gradient propagation node;
  • the second determination module 301 is configured to determine whether an irrelevant label that meets the preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node, where an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when such a label exists, add the irrelevant label that meets the preset propagation condition to the label group to be matched, and use the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when no such label exists and i is greater than 1, determine the propagation score of each label associated with each node in the i-th gradient propagation node; and determine, according to those propagation scores, the final score of each candidate node in the first gradient propagation node, where the final score reflects the degree of matching between the label group to be matched and each candidate node.
  • the second determination module 301 being configured to determine the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node includes:
  • the second determination module 301 is further configured to:
  • the second determination module 301 is further configured to:
  • determine the initial score of each label associated with each candidate node in the first gradient propagation node; and, if it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, determine the sum of the initial scores of the labels associated with each candidate node as the final score of that candidate node.
  • the second determination module 301 is configured to determine whether there are irrelevant labels that meet preset propagation conditions among the labels associated with each node in the i-th gradient propagation node, including:
  • the first determining module 300 is configured to determine a set of candidate nodes from the nodes associated with each tag in the tag group to be matched, including:
  • the node associated with each label in the at least one label is used as a candidate node and put into the candidate node set.
  • the determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node includes:
  • the final score of each candidate node in the first gradient propagation node is determined according to the propagation score and the initial score of each label associated with each node in the i-th gradient propagation node.
  • the second determination module 301 is further configured to:
  • when each label associated with each node in the i-th gradient propagation node is a label in the label group to be matched, determine the initial score of each such label according to its incentive value and the number of times its feature attribute appears in the label group to be matched; when a label associated with a node in the i-th gradient propagation node is not a label in the label group to be matched, use a preset value as its initial score.
  • Both the first determination module 300 and the second determination module 301 can be implemented by a processor located in an electronic device, and the processor can be at least one of an ASIC, DSP, DSPD, PLD, FPGA, CPU, controller, microcontroller, and microprocessor.
  • each functional module in this embodiment may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software function modules.
  • If the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of this embodiment, in essence, or the part that contributes to the related technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, server, network device, etc.) or a processor to execute all or part of the steps of the method of this embodiment.
  • the aforementioned storage medium includes: various media that can store program codes such as U disk, mobile hard disk, read-only memory (Read Only Memory, ROM), RAM, magnetic disk or optical disk.
  • the computer program instructions corresponding to a label matching method in this embodiment can be stored on storage media such as optical discs, hard disks, and U disks.
  • FIG. 4 shows an electronic device 400 provided by an embodiment of the present application, which may include: a memory 401 and a processor 402; wherein,
  • memory 401 configured to store computer programs and data
  • the processor 402 is configured to execute the computer program stored in the memory, so as to implement any tag matching method in the foregoing embodiments.
  • the above-mentioned memory 401 can be a volatile memory (Volatile Memory), such as RAM; or a non-volatile memory (Non-Volatile Memory), such as ROM, flash memory (Flash Memory), hard disk (Hard Disk Drive, HDD) or solid-state drive (Solid-State Drive, SSD); or a combination of the above-mentioned types of memory, and provide instructions and data to the processor 402.
  • the aforementioned processor 402 may be at least one of ASIC, DSP, DSPD, PLD, FPGA, CPU, controller, microcontroller, and microprocessor. It can be understood that, for different tag matching devices, the electronic device used to implement the above processor function may also be other, which is not specifically limited in this embodiment of the present application.
  • The embodiment of the present application also provides a computer program product that carries program code; the instructions included in the program code can be used to execute the label matching method described in the above method embodiments. For details, refer to the above method embodiments, which are not repeated here.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
  • The functions or modules included in the device provided by the embodiments of the present application can be used to execute the methods described in the above method embodiments; for their specific implementation, refer to the descriptions of the above method embodiments. For brevity, details are not repeated here.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, optical storage, etc.) having computer-usable program code embodied therein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

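The embodiments of the present application propose a label matching method, apparatus, electronic device, computer storage medium, and computer program product. The method includes: determining a candidate node set from the nodes associated with each label in a label group to be matched, and using each candidate node in the candidate node set as the first gradient propagation node; determining whether an irrelevant label that meets a preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node; when one exists, adding the irrelevant label that meets the preset propagation condition to the label group to be matched, and using the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when none exists, determining the propagation score of each label associated with each node in the i-th gradient propagation node; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node.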

Description

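A label matching method, apparatus, device, computer storage medium, and program
Cross-Reference to Related Applications
This application is based on, and claims priority to, the Chinese patent application No. 202111475736.0, filed on December 6, 2021 by 深圳前海微众银行股份有限公司 and entitled "Label matching method, apparatus, device, and computer storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of cloud computing technology in financial technology (Fintech), and relates to, but is not limited to, a label matching method, apparatus, electronic device, computer storage medium, and computer program product.
Background
With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology; however, the security and real-time requirements of the financial industry also place higher demands on technology.
In a typical business application system that uses a label system for data matching, two approaches coexist. On the one hand, for real-time responsiveness, most matching is done with exact or fuzzy search algorithms, such as inverted-index queries. On the other hand, for generalization of the matching, a matrix model combined with belief propagation is used to pre-score data with respect to labels. However, the matrix model has poor timeliness and must be computed jointly over the global label network. If the matrix model is computed in real time, a single label update or insertion triggers N rounds of iterative training until global convergence, which increases the complexity of label matching computation.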
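Summary
This application provides a label matching method, apparatus, electronic device, computer storage medium, and computer program product, which can solve the problem in the related art that label matching requires a large amount of computation.
The technical solution of this application is implemented as follows:
An embodiment of this application provides a label matching method, the method including:
receiving a label matching request, the label matching request including a label group to be matched; determining a candidate node set from the nodes associated with each label in the label group to be matched, and using each candidate node in the candidate node set as the first gradient propagation node;
determining whether an irrelevant label that meets a preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node, where an irrelevant label is a label not included in the label group to be matched, and i is an integer greater than or equal to 1;
when one exists, adding the irrelevant label that meets the preset propagation condition to the label group to be matched, and using the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node;
when none exists and i is greater than 1, determining the propagation score of each label associated with each node in the i-th gradient propagation node; and determining, according to the propagation score of each label associated with each node in the i-th gradient propagation node, the final score of each candidate node in the first gradient propagation node, where the final score reflects the degree of matching between the label group to be matched and each candidate node.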
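In some embodiments, determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node includes:
when i is greater than 1 and it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node, determining the propagation score of each label associated with each node in the (i-1)-th gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node;
continuing until the propagation score of each label associated with each candidate node in the first gradient propagation node is obtained; and determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with that candidate node.
It can be seen that, in the embodiments of this application, during label matching the propagation score of each label associated with each node in the previous gradient is determined according to the propagation score of each label associated with each node in the current gradient propagation node; that is, the final score of each candidate node is determined through back propagation, which can ensure the accuracy of the matching result.
In some embodiments, the method further includes:
when it is determined that different nodes in the i-th gradient propagation node are associated with the same target label, determining the propagation score of the target label with respect to each of those nodes;
selecting the maximum propagation score of the target label from among those propagation scores;
determining, according to the maximum propagation score of the target label, the propagation score of each label associated with each node in the (i-1)-th gradient propagation node.
It can be seen that, in the embodiments of this application, when different nodes in a gradient propagation node are associated with the same label, multiple propagation scores are obtained for that label. Because a propagation score indicates the importance of a label relative to a node, selecting the maximum propagation score as the propagation score of the label can further improve the precision of subsequent matching results.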
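In some embodiments, the method further includes:
determining the initial score of each label associated with each candidate node in the first gradient propagation node;
if it is determined that no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, determining the sum of the initial scores of the labels associated with each candidate node in the first gradient propagation node as the final score of that candidate node.
It can be seen that, in the embodiments of this application, when no irrelevant label meeting the preset propagation condition exists among the labels associated with each candidate node in the first gradient propagation node, the current propagation network does not need to extend further. In this case, determining the sum of the initial scores of the labels associated with each candidate node as the final score of that candidate node can greatly reduce the amount of computation of the label matching algorithm.
In some embodiments, determining whether an irrelevant label that meets the preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node includes:
if the ratio of the number of nodes in the i-th gradient propagation node associated with the irrelevant label to the number of nodes in the i-th gradient propagation node is greater than a set threshold, determining that an irrelevant label meeting the preset propagation condition exists;
if the ratio of the number of nodes in the i-th gradient propagation node associated with the irrelevant label to the number of nodes in the i-th gradient propagation node is less than or equal to the set threshold, determining that no irrelevant label meeting the preset propagation condition exists.
It can be seen that, in the embodiments of this application, setting a preset propagation condition makes it possible, for each gradient propagation node, to better discover implicit irrelevant labels that affect the matching result while reducing the number of iterations of the label matching algorithm.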
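In some embodiments, determining the candidate node set from the nodes associated with each label in the label group to be matched includes:
determining, from the label group to be matched, at least one label that meets a preset pruning condition;
using the nodes associated with each of the at least one label as candidate nodes and placing them in the candidate node set.
It can be seen that, in the embodiments of this application, using a preset pruning condition to select, from the label group to be matched, the labels that meet actual requirements for subsequent matching can reduce the amount of computation without affecting the matching result.
In some embodiments, determining the final score of each candidate node in the first gradient propagation node according to the propagation score of each label associated with each node in the i-th gradient propagation node includes:
when i is greater than 1, determining the initial score of each label associated with each node in the i-th gradient propagation node;
determining the final score of each candidate node in the first gradient propagation node according to the propagation score and the initial score of each label associated with each node in the i-th gradient propagation node.
It can be seen that, in the embodiments of this application, the propagation scores and initial scores of the labels associated with each node in the other gradient propagation nodes jointly determine the final score of each candidate node in the first gradient propagation node, which can improve the generalization of the matching result.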
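In some embodiments, the method further includes:
predetermining the feature attribute and incentive value of each label in the label group to be matched;
when i is greater than or equal to 1 and it is determined that a label associated with a node in the i-th gradient propagation node is a label in the label group to be matched, determining the initial score of that label according to its incentive value and the number of times its feature attribute appears in the label group to be matched;
when it is determined that a label associated with a node in the i-th gradient propagation node is not a label in the label group to be matched, using a preset value as the initial score of that label.
It can be seen that, in the embodiments of this application, by traversing the labels associated with each node in the current gradient propagation node and judging whether each is a label in the label group to be matched, the initial score of each label can be determined in a targeted manner, ensuring the validity of the final score.
An embodiment of this application further proposes a label matching apparatus, the apparatus including a first determination module and a second determination module, wherein
the first determination module is configured to receive a label matching request, the label matching request including a label group to be matched; determine a candidate node set from the nodes associated with each label in the label group to be matched; and use each candidate node in the candidate node set as the first gradient propagation node;
the second determination module is configured to determine whether an irrelevant label that meets a preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node, where an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when one exists, add the irrelevant label that meets the preset propagation condition to the label group to be matched, and use the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when none exists and i is greater than 1, determine the propagation score of each label associated with each node in the i-th gradient propagation node; and determine, according to those propagation scores, the final score of each candidate node in the first gradient propagation node, where the final score reflects the degree of matching between the label group to be matched and each candidate node.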
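An embodiment of this application provides an electronic device, the device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the label matching method provided by one or more of the foregoing technical solutions.
An embodiment of this application provides a computer storage medium storing a computer program; when executed, the computer program implements the label matching method provided by one or more of the foregoing technical solutions.
An embodiment of this application further provides a computer program product including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the label matching method provided by one or more of the foregoing technical solutions.
The embodiments of this application propose a label matching method, apparatus, electronic device, computer storage medium, and computer program product. The method includes: receiving a label matching request, the label matching request including a label group to be matched; determining a candidate node set from the nodes associated with each label in the label group to be matched, and using each candidate node in the candidate node set as the first gradient propagation node; determining whether an irrelevant label that meets a preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node, where an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when one exists, adding the irrelevant label that meets the preset propagation condition to the label group to be matched, and using the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation node; when none exists and i is greater than 1, determining the propagation score of each label associated with each node in the i-th gradient propagation node; and determining, according to those propagation scores, the final score of each candidate node in the first gradient propagation node, where the final score reflects the degree of matching between the label group to be matched and each candidate node.
It can be seen that, in the embodiments of this application, determining whether an irrelevant label that meets the preset propagation condition exists among the labels associated with each node in the i-th gradient propagation node amounts to determining, based on irrelevant labels, whether the label propagation network can continue to extend. This makes it easier to discover implicit irrelevant labels that affect the label matching result and improves the accuracy of subsequent label matching. In addition, if the labels associated with the nodes of some gradient propagation node no longer meet the preset propagation condition, the labels associated with other nodes are no longer examined; that is, the iterative calculation of the label matching process stops, which effectively prevents the propagation depth from becoming too large. Compared with the related art, in which one label update or insertion requires N rounds of iterative training until global convergence, the embodiments of this application need fewer iterations during label matching and can greatly reduce the amount of computation required.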
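Brief Description of the Drawings
Figure 1 is a schematic diagram of a network structure for label matching in the related art;
Figure 2a is a schematic flowchart of a label matching method in an embodiment of this application;
Figure 2b is a schematic diagram of a network structure for label matching in an embodiment of this application;
Figure 2c is a schematic diagram of another network structure for label matching in an embodiment of this application;
Figure 2d is a schematic flowchart of score transfer when the preset propagation condition is not met in an embodiment of this application;
Figure 3 is a schematic diagram of the composition of the label matching apparatus of an embodiment of this application;
Figure 4 is a schematic structural diagram of the electronic device provided by an embodiment of this application.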
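Detailed Description
This application is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments provided here are only used to explain this application and are not intended to limit it. In addition, the embodiments provided below are some of the embodiments for implementing this application, not all of them; where no conflict arises, the technical solutions recorded in the embodiments of this application can be implemented in any combination.
It should be noted that, in the embodiments of this application, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a method or apparatus that includes a series of elements includes not only the elements explicitly recorded but also other elements not explicitly listed, or elements inherent to implementing the method or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other related elements in the method or apparatus that includes that element (for example, steps in a method or units in an apparatus, where a unit may be part of a circuit, part of a processor, part of a program or software, and so on).
The term "and/or" in this document merely describes an association between associated objects and indicates that three relationships are possible; for example, I and/or J can mean: I alone, both I and J, or J alone. In addition, the term "at least one" in this document means any one of multiple items or any combination of at least two of multiple items; for example, including at least one of I, J, and R can mean including any one or more elements selected from the set consisting of I, J, and R.
For example, the label matching method provided by the embodiments of this application includes a series of steps but is not limited to the recorded steps; likewise, the label matching apparatus provided by the embodiments of this application includes a series of modules but is not limited to the explicitly recorded modules, and may also include modules needed for acquiring relevant task data or for processing based on the task data.
The embodiments of this application can be applied in a computer system composed of servers and can operate together with many other general-purpose or special-purpose computing system environments or configurations. Here, the server may be a distributed cloud computing environment that includes minicomputer systems, mainframe computer systems, and so on.
Electronic devices such as servers can implement the corresponding functions by executing program modules. Generally, program modules may include routines, programs, object programs, components, logic, data structures, and so on, which perform specific tasks or implement specific abstract data types. The computer system or server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media that include storage devices.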
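In the related art, label-related data pre-scoring can be performed by combining a matrix model with belief propagation. Taking Google's PageRank algorithm as an example, assume that the influence (PageRank, PR) value of each node is a matrix variable related to the node's labels, as shown in FIG. 1. To obtain the pre-score, multiple rounds of iterative computation are needed until the PR values of all nodes in the network stabilize, and the number of rounds is not fixed. Each iteration multiplies the PR value of every node by an adjacency matrix (equivalent to the propagation matrix, which represents the rate at which each node propagates to its neighboring nodes). Let the PR value of node m in round n be $PR_{n,m}$, the PR matrix of round n be $S_n = (PR_{n,0} + PR_{n,1} + \ldots + PR_{n,m})$, and the propagation matrix be $L$; the iteration is then $S_{n+1} = S_n \cdot L$.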
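To make the cost of the related-art approach concrete, the iteration $S_{n+1} = S_n \cdot L$ can be sketched as below. This is only an illustration under assumptions: the function name, the three-node network, the propagation matrix, and the stopping tolerance are all hypothetical and do not come from the application.

```python
def iterate_until_stable(scores, propagation, tol=1e-9, max_rounds=1000):
    """Repeat S_{n+1} = S_n * L until no score changes by more than tol."""
    n = len(scores)
    for rounds in range(1, max_rounds + 1):
        # Row vector times matrix: new score of node j collects shares from every node i.
        nxt = [sum(scores[i] * propagation[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, scores)) < tol:
            return nxt, rounds
        scores = nxt
    return scores, max_rounds


# Hypothetical network: each node passes half of its score to each of the other two.
L = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
stable_scores, rounds_needed = iterate_until_stable([1.0, 0.0, 0.0], L)
```

Even this tiny network needs many rounds before the scores settle, which is the behavior the following paragraphs identify as a problem when labels change in real time.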
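If both real-time performance and generalization are required, the following two problems arise:
1) If the matrix model is computed in real time, a single label update or insertion triggers N rounds of iterative training until global convergence.
2) Concurrent label updates, insertions, or deletions cause wide-ranging dynamic parameter updates to the matrix and leave it unusable, while simply splitting the label network easily leads to a local optimum and harms the accuracy of the matching result.
The following embodiments are proposed to address the above technical problems.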
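In some embodiments of the present application, the label matching method can be implemented by a processor in a label matching apparatus. The processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
FIG. 2a is a schematic flowchart of a label matching method according to an embodiment of the present application. As shown in FIG. 2a, the method includes the following steps:
Step 200: receive a label matching request that includes a label group to be matched; determine a candidate node set from the nodes associated with each label in the label group to be matched, and take each candidate node in the candidate node set as a first gradient propagation node.
In the embodiments of the present application, the label matching method can be applied to a mesh structure composed of labels (Label) and cluster nodes (Node), referred to as a label network. Here, cluster nodes denote a set of multiple nodes, and each node can correspond to one business entity, where the type of business entity depends on the business scenario in which the label matching method is actually applied.
For example, the label group to be matched may include one or more labels. Each label in the group has a corresponding feature attribute, and each feature attribute has its own incentive value, which acts as a weight and takes part in the subsequent score calculation.
In the embodiments of the present application, when the label group to be matched includes multiple labels, different labels may have the same or different feature attributes; labels that share a feature attribute also share an incentive value. Here, the incentive value of a label means the incentive value of the label's feature attribute.
In some embodiments, the way feature attributes of labels are classified is not limited. For example, feature attributes may be divided conceptually into core labels (Core), most-suitable labels (Suitable), prioritized labels (Prioritized), optional labels (Optional), and so on; they may also be classified in other ways.
For example, which feature attribute each label in the label group to be matched should be given, and what its incentive value should be, can be determined with a reduction algorithm or from human experience; the embodiments of the present application do not limit this. For instance, the incentive value may be manually preset to 2.0 for core labels (Core), 1.0 for most-suitable labels (Suitable), 0.8 for prioritized labels (Prioritized), and 0.3 for optional labels (Optional).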
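For example, after the label network receives a label matching request sent from outside, the label group to be matched can be obtained from the request; the candidate node set is then determined from the nodes associated with each label in the label group to be matched.
In some embodiments, every node associated with any label in the label group to be matched may be taken as a candidate node in the candidate node set.
In some embodiments, the candidate node set may instead be determined as follows: determine, from the label group to be matched, at least one label that satisfies a preset pruning condition; take the nodes associated with each of the at least one label as candidate nodes and put them into the candidate node set.
Here, the preset pruning condition is a set of filter conditions configured from practical experience with the business scenario, and its purpose is to reduce the amount of computation. For example, when the label group to be matched includes multiple labels, at least one label that satisfies the preset pruning condition can be determined from them; the nodes associated with each of these labels are then taken as the candidate nodes of the candidate node set.
FIG. 2b is a schematic diagram of a network structure for label matching according to an embodiment of the present application. As shown in FIG. 2b, the label group to be matched includes four labels whose feature attributes are, respectively, most-suitable (Suitable), prioritized (Prioritized), optional (Optional), and core (Core). Assuming the preset pruning condition is that a candidate node must be associated with all labels in the label group to be matched that have the Core or Suitable feature attribute, the two nodes inside the dashed region of FIG. 2b are the candidate nodes, and together they form the candidate node set.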
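The pruning rule of the FIG. 2b example can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the node and label names are assumptions made for the example, and the network below is not the one drawn in FIG. 2b.

```python
def select_candidates(query, label_to_nodes, required=("Core", "Suitable")):
    """Keep only nodes associated with every queried label whose attribute is required.

    query: dict mapping label -> feature attribute (hypothetical layout).
    label_to_nodes: dict mapping label -> set of associated nodes.
    """
    required_labels = [label for label, attr in query.items() if attr in required]
    # Every node associated with at least one queried label.
    touched = set()
    for label in query:
        touched |= label_to_nodes.get(label, set())
    if not required_labels:
        return touched  # no pruning condition applies
    return {
        node for node in touched
        if all(node in label_to_nodes.get(label, set()) for label in required_labels)
    }


# Hypothetical label group: four labels, one per feature attribute.
query = {"A": "Suitable", "B": "Prioritized", "C": "Optional", "D": "Core"}
label_to_nodes = {
    "A": {"n1", "n2", "n3"},
    "B": {"n1"},
    "C": {"n4"},
    "D": {"n1", "n2"},
}
candidates = select_candidates(query, label_to_nodes)
```

Here `n3` and `n4` are pruned because neither is associated with both the Core label and the Suitable label, so only `n1` and `n2` enter the first gradient.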
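For example, after the candidate node set is obtained, each candidate node in it can be taken as a first gradient propagation node; as FIG. 2b shows, the first gradient propagation nodes are the two nodes inside the dashed region.
It can be seen that, in the embodiments of the present application, using a preset pruning condition to select from the label group to be matched the labels that meet practical requirements reduces the amount of computation without affecting the matching result.
Step 201: determine whether, among the labels associated with each node of the i-th gradient propagation nodes, there is an irrelevant label that satisfies the preset propagation condition; an irrelevant label is a label not included in the label group to be matched; i is an integer greater than or equal to 1.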
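For example, after the current gradient propagation nodes are obtained, the labels associated with each of these nodes are traversed to determine the initial score of each label. Here, if i equals 1, the current gradient propagation nodes are the first gradient propagation nodes; if i equals 2, they are the second gradient propagation nodes.
For example, the initial score of each label associated with each node of the current gradient propagation nodes can be determined as follows: the feature attribute and incentive value of each label in the label group to be matched are determined in advance; with i greater than or equal to 1, when a label associated with a node of the i-th gradient propagation nodes is a label in the label group to be matched, its initial score is determined from its incentive value and the number of times its feature attribute occurs in the label group to be matched; when the label is not in the label group to be matched, a preset value is used as its initial score.
For example, as step 200 shows, the feature attribute and incentive value of each label in the label group to be matched can be determined in advance. While traversing the labels associated with each node of the current gradient propagation nodes, if the current label is a label in the label group to be matched, its initial score can be determined from its incentive value and the number of times its feature attribute occurs in the label group to be matched. Referring to expression (1), the initial score S of the label can be:
S=(C/Fn)*Bn     (1)
Here, C denotes the base score of the current label; it can be set manually, defaults to 1, and serves to widen the differences in the data. Fn denotes the number of times the feature attribute of the current label occurs in the label group to be matched. Bn denotes the incentive value of the feature attribute of the current label.
For example, while traversing the labels associated with each node of the current gradient propagation nodes, if the current label is not a label in the label group to be matched, its initial score can be set to a preset value. The preset value can be chosen according to the actual business scenario and is not limited by the embodiments of the present application; for example, it can be 0 or some other value.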
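Expression (1) and the preset value for labels outside the group can be sketched as follows. The incentive values are the example presets mentioned in step 200; the function name, the dictionary layout, and the label names are assumptions made for this illustration.

```python
from collections import Counter

# Example presets from the description: feature attribute -> incentive value Bn.
INCENTIVE = {"Core": 2.0, "Suitable": 1.0, "Prioritized": 0.8, "Optional": 0.3}


def initial_score(label, query, base=1.0, preset=0.0):
    """Return S = (C / Fn) * Bn for a queried label, or the preset value otherwise.

    query: dict mapping label -> feature attribute (hypothetical layout).
    base: the base score C; preset: the score of a label outside the group.
    """
    if label not in query:
        return preset
    attribute = query[label]
    occurrences = Counter(query.values())[attribute]  # Fn
    return (base / occurrences) * INCENTIVE[attribute]


# Hypothetical label group: one Core label and two Suitable labels.
query = {"t1": "Core", "t2": "Suitable", "t3": "Suitable"}
scores = {label: initial_score(label, query) for label in ("t1", "t2", "t9")}
```

With this group, `t1` scores (1/1) * 2.0, `t2` scores (1/2) * 1.0 because the Suitable attribute occurs twice, and the label `t9`, which is not in the group, receives the preset value 0.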
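It can be seen that, in the embodiments of the present application, checking whether each label associated with each node of the current gradient propagation nodes belongs to the label group to be matched allows the initial score of each label to be determined in a targeted way; these initial scores are then used in the subsequent score calculation for the candidate nodes, which ensures the validity of the final score.
In the embodiments of the present application, after the initial scores of the labels associated with each node of the current gradient propagation nodes are obtained, the current gradient propagation nodes continue to serve as the basis for determining whether the labels associated with each of these nodes include an irrelevant label that satisfies the preset propagation condition.
Here, an irrelevant label is a label not included in the label group to be matched that corresponds to the current gradient propagation nodes. For example, suppose the current gradient propagation nodes include node 1 and node 2, and the corresponding label group to be matched includes label 1 and label 2; if node 1 is associated with label 1 and label 3, and node 2 is associated with label 1, label 2, and label 3, then label 3 is an irrelevant label.
In some embodiments, determining whether the labels associated with each node of the i-th gradient propagation nodes include an irrelevant label that satisfies the preset propagation condition may include: if the ratio of the number of nodes in the i-th gradient propagation nodes that are associated with the irrelevant label to the number of all nodes in the i-th gradient propagation nodes is greater than a set threshold, determining that an irrelevant label satisfying the preset propagation condition exists; if the ratio is less than or equal to the set threshold, determining that no such irrelevant label exists.
For example, first determine the ratio of the number of current gradient propagation nodes associated with the irrelevant label to the number of all current gradient propagation nodes; then compare the ratio with the set threshold. If the ratio is greater than the set threshold, an irrelevant label satisfying the preset propagation condition exists; otherwise, if the ratio is less than or equal to the set threshold, no such irrelevant label exists.
Here, the set threshold can be chosen according to the actual business scenario and is not limited by the embodiments of the present application; for example, it can be 0.5 or some other value.
For example, suppose the current gradient propagation nodes include node 1 and node 2, and the corresponding label group to be matched includes label 1 and label 2. If node 1 is associated with label 1 and label 3, and node 2 is associated with label 1, label 2, and label 3, then for the irrelevant label (label 3), both node 1 and node 2 are associated with it, so the number of current gradient propagation nodes associated with the irrelevant label is 2, and the ratio of that number to the number of all current gradient propagation nodes is 1. With a set threshold of 0.5, the ratio 1 is greater than the threshold 0.5, so the labels associated with the current gradient propagation nodes include an irrelevant label (label 3) that satisfies the preset propagation condition.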
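The ratio test in the example above can be sketched as follows. The sketch assumes the ratio is evaluated separately for each irrelevant label; the function name and data layout are hypothetical.

```python
def propagating_labels(gradient_nodes, node_to_labels, query_labels, threshold=0.5):
    """Return the irrelevant labels whose share of associated nodes exceeds the threshold."""
    total = len(gradient_nodes)
    if total == 0:
        return set()
    counts = {}
    for node in gradient_nodes:
        for label in node_to_labels.get(node, ()):
            if label not in query_labels:  # irrelevant: not in the label group to be matched
                counts[label] = counts.get(label, 0) + 1
    # Strictly greater than the threshold, as in the description.
    return {label for label, count in counts.items() if count / total > threshold}


# The example from the text: both nodes carry label 3, so its ratio is 2 / 2 = 1.
node_to_labels = {
    "node1": {"label1", "label3"},
    "node2": {"label1", "label2", "label3"},
}
found = propagating_labels(["node1", "node2"], node_to_labels, {"label1", "label2"})
```

A label carried by exactly half of the nodes would not qualify with the default threshold, because the condition requires the ratio to exceed 0.5 rather than equal it.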
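It can be seen that, in the embodiments of the present application, setting a preset propagation condition makes it easier to discover implicit irrelevant labels that can influence the matching result while also reducing the number of iterations of the label matching algorithm: if the labels of the current gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, no iteration is performed for the next gradient.
Step 202: when such a label exists, add the irrelevant label that satisfies the preset propagation condition to the label group to be matched, and take the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation nodes; when no such label exists and i is greater than 1, determine the propagation score of each label associated with each node of the i-th gradient propagation nodes; and determine, from those propagation scores, the final score of each candidate node among the first gradient propagation nodes.
In the embodiments of the present application, when step 201 determines that the labels associated with the current gradient propagation nodes include an irrelevant label that satisfies the preset propagation condition, the irrelevant label is meaningful and the current propagation network can be extended further; that is, the irrelevant label has some influence on the scores of the candidate nodes among the first gradient propagation nodes. In this case, the irrelevant label is added to the label group to be matched, the non-candidate nodes associated with it are looked up, and those non-candidate nodes are taken as the next gradient propagation nodes.
For example, suppose the current gradient propagation nodes include node 1 and node 2, and the corresponding label group to be matched includes label 1 and label 2. If node 1 is associated with label 1 and label 3, and node 2 is associated with label 1, label 2, and label 3, then the labels associated with the current gradient propagation nodes include an irrelevant label (label 3) that satisfies the preset propagation condition. The irrelevant label (label 3) is then added to the label group to be matched, which now includes label 1, label 2, and label 3. If label 3 is also associated with node 3, then node 3 is a non-candidate node associated with the irrelevant label and is taken as the next gradient propagation node. Here, the label group to be matched that corresponds to the next gradient propagation nodes is the group after the irrelevant label has been added.
For example, after the next gradient propagation nodes are obtained, step 201 is applied again to determine the initial score of each label associated with each of these nodes and to determine whether an irrelevant label satisfying the preset propagation condition exists; the implementation is similar to that for the current gradient propagation nodes and is not repeated here.
For example, if the current gradient propagation nodes are the first gradient propagation nodes and the next are the second gradient propagation nodes, and, after the initial scores of the labels associated with the second gradient propagation nodes are determined according to step 201, another irrelevant label satisfying the preset propagation condition is found, then that irrelevant label is added to the label group to be matched corresponding to the second gradient propagation nodes, the non-candidate nodes associated with it are taken as the third gradient propagation nodes, and step 201 continues to be applied until the propagation network no longer contains an irrelevant label that satisfies the preset propagation condition.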
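In some embodiments, determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes may include: with i greater than 1, when the labels associated with each node of the i-th gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, determining the propagation scores of the labels associated with each node of the (i-1)-th gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes, and so on until the propagation scores of the labels associated with each candidate node among the first gradient propagation nodes are obtained; and determining the final score of each candidate node from the propagation scores of its associated labels. Here, the final score reflects how well the label group to be matched matches each candidate node.
For example, suppose step 201 determines that the labels associated with the second gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, so that i equals 2. The propagation scores of the labels associated with each node of the second gradient propagation nodes are then determined; from these, the propagation scores of the labels associated with each node of the first gradient propagation nodes are determined; finally, the final score of each candidate node is determined from the propagation scores of the labels associated with that candidate node.
In some embodiments, determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes may include: with i greater than 1, determining the initial score of each label associated with each node of the i-th gradient propagation nodes; and determining the final score of each candidate node from both the propagation scores and the initial scores of the labels associated with each node of the i-th gradient propagation nodes.
For example, when the above steps determine that i equals 2, first determine the initial score of each label associated with each node of the second gradient propagation nodes and derive each label's propagation score from its initial score; then, from these propagation scores, determine the propagation scores of the labels associated with each node of the first gradient propagation nodes; finally, determine the final score of each candidate node from its propagation scores and initial scores. Taking i equal to 2 as an example, the propagation score thus reflects how strongly the scores of the labels associated with the second gradient propagation nodes influence the scores of those candidate nodes among the first gradient propagation nodes that are associated with the same label (that is, the irrelevant label added to the label group to be matched).
It can be seen that, in the embodiments of the present application, using both the propagation scores and the initial scores of the labels associated with the nodes of the other gradients to determine the final score of each candidate node improves the generalization of the matching result.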
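For example, suppose the first gradient propagation nodes include node 1 and node 2, and the corresponding label group to be matched 1 includes label 1 and label 2. Node 1 is associated with label 1 and label 3, and node 2 is associated with label 2 and label 3; label 3 is added to label group 1 to obtain label group to be matched 2. If label 3 is associated with node 3, the second gradient propagation nodes include node 3; if node 3 is not associated with any other irrelevant label, the propagation score of node 3 is then calculated. Specifically, first determine the initial score of node 3; then determine the share of the incentive value of label 3's feature attribute in the sum of the incentive values of the feature attributes of all labels in label group to be matched 2; the product of node 3's initial score and this incentive-value share is taken as node 3's propagation score.
For example, suppose the initial score of node 3 is 0.5 and the incentive values of label 1, label 2, and label 3 in label group to be matched 2 are 2, 1.5, and 0.5, respectively. The incentive-value share for node 3 is then 0.5 / (2 + 1.5 + 0.5) = 0.125, and the propagation score of node 3 is 0.5 × 0.125 = 0.0625.
Further, after the propagation score of node 3 is obtained, the calculation returns to the first gradient propagation nodes. Because node 1 is associated with label 1 and label 3, and node 2 is associated with label 2 and label 3, both nodes are associated with label 3. When the final scores of node 1 and node 2 are calculated, the initial score 0.5 of label 3 is first added to the propagation score 0.0625 to give an accumulated score of 0.5625, which is then added to the initial scores of the other associated labels. The final score of node 1 is therefore the sum of the incentive value 2 of label 1 and the accumulated score 0.5625 of label 3, namely 2.5625, and the final score of node 2 is the sum of the incentive value 1.5 of label 2 and the accumulated score 0.5625 of label 3, namely 2.0625.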
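The arithmetic of this worked example can be checked with a few lines of code. The variable names are chosen for the illustration only; the numbers are the ones given in the text.

```python
# Incentive values of the labels in label group to be matched 2 (from the example).
incentives = {"label1": 2.0, "label2": 1.5, "label3": 0.5}

node3_initial = 0.5  # initial score of node 3
label3_initial = 0.5  # initial score of label 3 used when returning to gradient 1

# Share of label 3's incentive value in the whole group: 0.5 / 4.0.
share = incentives["label3"] / sum(incentives.values())

# Propagation score of node 3: initial score times incentive-value share.
propagation = node3_initial * share

# Accumulated score of label 3 seen by the candidate nodes.
label3_accumulated = label3_initial + propagation

# Final scores: each candidate adds the score of its other label.
node1_final = incentives["label1"] + label3_accumulated
node2_final = incentives["label2"] + label3_accumulated
```

All intermediate values here are exactly representable in binary floating point, so the results agree with the figures in the text without rounding.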
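It can be seen that, in the embodiments of the present application, during label matching the propagation scores of the labels associated with the nodes of the previous gradient are determined from the propagation scores of the labels associated with the nodes of the current gradient; that is, the final score of each candidate node is determined by backward propagation, which improves the accuracy of the matching result.
In some embodiments, the method may further include: when different nodes of the i-th gradient propagation nodes are associated with the same target label, determining the propagation score of the target label for each of those nodes; selecting the maximum propagation score of the target label from these propagation scores; and determining, from the maximum propagation score of the target label, the propagation scores of the labels associated with each node of the (i-1)-th gradient propagation nodes.
For example, suppose the second gradient propagation nodes include node 3 and node 4, both associated with label 4. If the propagation score of label 4 is 0.2 when determined through node 3 and 0.5 when determined through node 4, then 0.5 is taken as the propagation score of label 4 and used to determine the propagation score of each node of the previous gradient (the first gradient propagation nodes), and finally the final score of each candidate node among the first gradient propagation nodes.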
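Selecting the largest propagation score per label can be sketched as follows; the function name and the nested-dictionary layout are assumptions made for this illustration.

```python
def max_propagation(per_node_scores):
    """Collapse {node: {label: propagation score}} into {label: largest score}."""
    best = {}
    for label_scores in per_node_scores.values():
        for label, score in label_scores.items():
            if label not in best or score > best[label]:
                best[label] = score
    return best


# The example from the text: label 4 scores 0.2 through node 3 and 0.5 through node 4.
second_gradient = {
    "node3": {"label4": 0.2},
    "node4": {"label4": 0.5},
}
propagated = max_propagation(second_gradient)
```

The resulting mapping holds one score per label, which is what the previous gradient consumes when its own scores are computed.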
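It can be seen that, in the embodiments of the present application, when different nodes of a gradient are associated with the same label, several propagation scores are obtained for that label. Because a propagation score indicates the importance of the label relative to a node, selecting the maximum as the label's propagation score improves the precision of the subsequent matching result.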
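In some embodiments, the method may further include: determining the initial score of each label associated with each candidate node among the first gradient propagation nodes; and, if the labels associated with each candidate node contain no irrelevant label that satisfies the preset propagation condition, taking the sum of the initial scores of the labels associated with each candidate node as that candidate node's final score.
For example, when the current gradient propagation nodes are the first gradient propagation nodes and the initial scores of the labels associated with each candidate node have been determined as described above, suppose the labels associated with each candidate node contain no irrelevant label that satisfies the preset propagation condition, that is, every label associated with every candidate node belongs to the label group to be matched. In this case, the initial score of each such label is determined according to formula (1), and the sum of the initial scores of the labels associated with each candidate node is taken as that candidate node's final score.
For example, suppose the first gradient propagation nodes include node 1 and node 2, and the corresponding label group to be matched includes label 1 and label 2. If node 1 is associated with label 1, node 2 is associated with label 1 and label 2, and formula (1) gives initial scores of 0.3 for label 1 and 0.8 for label 2, then the final score of node 1 is 0.3 and the final score of node 2 is 1.1.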
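When no propagation takes place, the final score is simply a sum of initial scores, as in the following sketch. The function name and data layout are assumptions; the initial scores are the ones from the example.

```python
def final_scores_without_propagation(node_to_labels, initial):
    """Sum the initial scores of each candidate node's labels."""
    return {
        node: sum(initial[label] for label in labels)
        for node, labels in node_to_labels.items()
    }


# The example from the text: label 1 scores 0.3 and label 2 scores 0.8.
initial = {"label1": 0.3, "label2": 0.8}
node_to_labels = {"node1": ["label1"], "node2": ["label1", "label2"]}
totals = final_scores_without_propagation(node_to_labels, initial)
```

Node 1 ends at 0.3 and node 2 at approximately 1.1, so node 2 is the better match for this label group.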
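It can be seen that, in the embodiments of the present application, when the labels associated with each candidate node among the first gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, the current propagation network does not need to be extended; taking the sum of the initial scores of the labels associated with each candidate node as that node's final score greatly reduces the computation of the label matching algorithm.
For example, after the final score of each candidate node among the first gradient propagation nodes is obtained, the final scores can be sorted in descending order, and the top-ranked candidate node is the node that best matches the label group to be matched. Alternatively, the final scores can first be normalized according to the standard deviation, and the normalized results sorted in descending order; the top-ranked candidate node is again the node that best matches the label group to be matched. Because a node corresponds to a business entity, the embodiments of the present application can use this method to obtain the business entity that best matches the label group to be matched.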
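The ranking step can be sketched as follows. The description only says the scores are normalized "according to the standard deviation"; this sketch assumes a z-score (subtract the mean, divide by the population standard deviation), which is one common reading and not necessarily the one the applicant intends. The function name is hypothetical.

```python
from statistics import mean, pstdev


def rank_candidates(final_scores):
    """Normalise final scores as z-scores (an assumption) and sort in descending order."""
    values = list(final_scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All candidates tie; avoid dividing by zero.
        normalised = {node: 0.0 for node in final_scores}
    else:
        normalised = {node: (score - mu) / sigma for node, score in final_scores.items()}
    return sorted(normalised.items(), key=lambda item: item[1], reverse=True)


# Final scores taken from the earlier worked example.
ranking = rank_candidates({"node1": 2.5625, "node2": 2.0625})
best_node = ranking[0][0]
```

Because a z-score is a monotonic transformation, the order of the candidates is the same with or without normalization; the normalized values mainly make scores from different requests comparable.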
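To better illustrate the purpose of the present application, further explanation is given on the basis of the above embodiments.
The network structure shown in FIG. 2b is a local network. FIG. 2c is a schematic diagram of another network structure for label matching according to an embodiment of the present application. As shown in FIG. 2c, the network is a bidirectional ring structure that can be represented by two groups of relations, namely the out-degree and in-degree of the edges: node (Node)->label (Label) and Label->Node. These serve as the input of the label matching algorithm, which is described in detail below: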
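The two relation groups can be held as a pair of dictionaries, one being the inverse of the other. The sketch below derives Label->Node from Node->Label; the function name and the sample network are hypothetical.

```python
def invert(node_to_labels):
    """Build the Label->Node relation group from the Node->Label relation group."""
    label_to_nodes = {}
    for node, labels in node_to_labels.items():
        for label in labels:
            label_to_nodes.setdefault(label, set()).add(node)
    return label_to_nodes


# Hypothetical network with three nodes and three labels.
node_to_labels = {
    "n1": {"l1", "l3"},
    "n2": {"l2", "l3"},
    "n3": {"l3"},
}
label_to_nodes = invert(node_to_labels)
```

Keeping both directions available lets the algorithm move from labels to nodes when building a gradient and from nodes back to labels when scoring it.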
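Step A1: Using a search algorithm with the label group to be matched carried in the label matching request as the condition, obtain the Label->Node relation group. Before traversal begins, a preset pruning condition can be configured to prune the network. For example, the preset pruning condition in FIG. 2b is that a candidate Node must be associated with all labels in the label group to be matched that have the Core or Suitable feature attribute, so the dashed region of FIG. 2b contains the first gradient propagation nodes of the propagation network, and the other nodes, which do not satisfy the condition, are pruned. If no preset pruning condition is configured, the original Label->Node relation group is used and all Nodes involved are taken as the first gradient propagation nodes. Here, the first gradient propagation nodes are also the candidate nodes of the final matching result.
Step A2: Taking each Node of the current gradient propagation nodes as the basis, search for the Node->Label relation group, traverse the Labels associated with each Node, and give each Label an initial score. If a Label is not one of the labels in the label group to be matched, its initial score is 0; otherwise, with the current Label denoted N, the base score C, the number of times the Label's feature attribute occurs in the label group to be matched Fn, and the incentive value of the Label's feature attribute Bn, the initial score is (C/Fn)*Bn.
Step A3: Invert the Node->Label relation of step A2 and traverse Label->Node. Step A2 ignores the effect on the score of labels not included in the label group to be matched (irrelevant labels), yet irrelevant labels also influence the score through the propagation network. Here, the ratio of the candidate Nodes associated with an irrelevant label to all associated Nodes can be used as the propagation condition, with a default ratio of 0.5; once the ratio is exceeded, the irrelevant label is considered meaningful and the propagation network can be extended further. If an irrelevant label satisfying the propagation condition exists, add it dynamically to the label group to be matched, look up the non-candidate nodes associated with it as the next gradient propagation nodes, and return to step A2; otherwise proceed to the next step.
Step A4: Traverse again according to the Node->Label relation of step A2, and take the sum of the scores of all labels associated with each Node of the current gradient propagation nodes as that Node's score. If a Node of the current gradient propagation nodes is not a candidate node, determine the propagation score of each Label associated with that Node according to the incentive-value share of the Label's feature attribute; a Label may have several propagation scores, in which case the largest is used. After the calculation, return to the previous gradient, until the final score of each candidate node among the first gradient propagation nodes is obtained. Otherwise the score calculation is complete, and the final score of each candidate Node is returned.
Step A5: Normalize the final score of each candidate Node according to the standard deviation.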
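Steps A1 to A4 can be put together in one compact sketch. Several choices below are assumptions made only so that the example runs end to end and are not stated in the application: no pruning condition is configured in A1, an irrelevant label added in A3 is given the Optional attribute, the propagation ratio is taken over the nodes of the current gradient, and initial scores are computed against the expanded label group. Step A5 (normalization) is left out. All names are hypothetical.

```python
from collections import Counter

INCENTIVE = {"Core": 2.0, "Suitable": 1.0, "Prioritized": 0.8, "Optional": 0.3}


def match(query, node_to_labels, added_attr="Optional", base=1.0, threshold=0.5):
    """Return {candidate node: final score} for query = {label: feature attribute}."""
    query = dict(query)
    label_to_nodes = {}
    for node, labels in node_to_labels.items():
        for label in labels:
            label_to_nodes.setdefault(label, set()).add(node)

    # A1: without pruning, every node touching a queried label is a candidate.
    candidates = set()
    for label in query:
        candidates |= label_to_nodes.get(label, set())

    # A2/A3: extend the network gradient by gradient while a label qualifies.
    gradients, seen = [candidates], set(candidates)
    while gradients[-1]:
        current = gradients[-1]
        counts = Counter(l for n in current for l in node_to_labels[n] if l not in query)
        hits = [l for l, c in counts.items() if c / len(current) > threshold]
        if not hits:
            break
        for label in hits:
            query[label] = added_attr
        next_nodes = set().union(*(label_to_nodes[l] for l in hits)) - seen
        if not next_nodes:
            break
        gradients.append(next_nodes)
        seen |= next_nodes

    occurrences = Counter(query.values())

    def initial(label):
        if label not in query:
            return 0.0
        attr = query[label]
        return base / occurrences[attr] * INCENTIVE[attr]

    # A4: walk back from the deepest gradient, keeping the largest score per label.
    total_incentive = sum(INCENTIVE[attr] for attr in query.values())
    propagated = {}
    for level in reversed(gradients[1:]):
        best = {}
        for node in level:
            labels = node_to_labels[node]
            node_score = sum(initial(l) + propagated.get(l, 0.0) for l in labels)
            for label in labels:
                if label in query:
                    share = INCENTIVE[query[label]] / total_incentive
                    best[label] = max(best.get(label, 0.0), node_score * share)
        propagated = best

    return {
        node: sum(initial(l) + propagated.get(l, 0.0) for l in node_to_labels[node])
        for node in candidates
    }


network = {"n1": ["l1", "l3"], "n2": ["l2", "l3"], "n3": ["l3"]}
result = match({"l1": "Core", "l2": "Suitable"}, network)
```

In this hypothetical network, label `l3` is carried by both candidates, so it is added to the group and node `n3` forms the second gradient; its score flows back through `l3` and raises both candidates by the same amount, leaving `n1` ahead of `n2` because of its Core label.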
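It can be seen that the label matching algorithm proposed in the embodiments of the present application needs very few iterations; if the labels in the label group to be matched are clear and precise, only one or two steps of computation may be required. Compared with the related art, it also better discovers implicit factors that influence the matching result. For example, implicit labels that do not appear in the query arise during the calculation, and the latent features of these labels have already been obtained through pre-computation, which ensures the accuracy of the matching result. Further, label matching in the present application is also carried out by network propagation; to prevent excessive propagation depth and the formation of loops in the network, the method prunes while propagating and uses a ratio to limit the propagation condition, so that a different propagation network is outlined dynamically for each request.
For example, which feature attribute each label should be given and what its incentive value should be can be decided with a reduction algorithm (for example, rough set theory) that delimits core and non-core labels and their weights (incentive values). The core procedure of the reduction algorithm is as follows: assume that every label has all feature attributes, each with a weight equal to the manual weight times the incentive value (the manual weight is set from experience, and the incentive value is initially 1). The input training data set is a series of query labels and the corresponding matched entities; the set of query labels can be an input matrix of weights, and the matched entities, taken together with all candidate entities, can be a 0-1 sequence of length N (the number of candidate entities). The reduction algorithm turns the problem into a classification problem: each iteration solves for an optimal classification neural network, and after each round the input matrix is adjusted (incentive values are increased or decreased) according to the reduction condition before the next iteration; the reduction condition must be one that strengthens the classification ability of the input matrix.
The purpose of feature attributes is to shrink the propagation network generated each time without affecting the label matching result, that is, the best-fitting labelled entity. The reduction algorithm should therefore take as its reduction condition whether the matched entity result changes and how adjusting the incentive values of the label feature attributes affects the confidence range of the result. In the end, the feature attribute with the highest weight in each query label becomes its only feature attribute, and the incentive value of each feature attribute at the final iteration becomes its final incentive value. In practice, feature attributes can also be assigned to labels manually from experience; likewise, the incentive values of feature attributes can generally be preset, for example 2.0 for core labels (Core), 1.0 for most-suitable labels (Suitable), 0.8 for prioritized labels (Prioritized), and 0.3 for optional labels (Optional). In addition, the number of feature attributes is usually not fixed: a continuous numeric range can be split into several segments, each segment representing one feature attribute, with the segment's range being the range of its incentive value. Feature attributes and incentive values act as prior knowledge that takes part indirectly in the label matching score calculation but cannot directly represent the final matching score of the data, so the incentive values of feature attributes do not need frequent adjustment; unlike in a traditional scoring network, feature attributes and incentive values are independent of the specific matching query.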
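[Rectified under Rule 26, 18.07.2022]
FIG. 2d is a schematic flowchart of score passing when the preset propagation condition is not satisfied, according to an embodiment of the present application. As shown in FIG. 2d, initialization (init) means that, after the candidate nodes (Candidate Node) of the first gradient propagation nodes are found, their associated Labels are scored, which is equivalent to implicitly passing the candidate nodes' scores to the associated Labels. Summation (sum) means that the score of a non-candidate node is the sum of the scores of all Labels associated with it, which is also one pass. The propagation (broadcast) from a non-candidate node (Unrelated Node) to its associated Labels computes propagation scores according to the incentive-value share of the feature attributes described in step A4, which is one backward pass. Finally, the propagation (broadcast) from the associated Labels to the candidate nodes takes each Label's own maximum propagation score, which is added to the candidate node's initial score to obtain the candidate node's actual score. The overall passing order is therefore init->sum->broadcast->broadcast.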
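FIG. 3 is a schematic diagram of the composition of a label matching apparatus according to an embodiment of the present application. As shown in FIG. 3, the apparatus includes a first determining module 300 and a second determining module 301, wherein:
the first determining module 300 is configured to receive a label matching request that includes a label group to be matched; determine a candidate node set from the nodes associated with each label in the label group to be matched; and take each candidate node in the candidate node set as a first gradient propagation node;
the second determining module 301 is configured to determine whether, among the labels associated with each node of the i-th gradient propagation nodes, there is an irrelevant label that satisfies a preset propagation condition, where an irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when such a label exists, add the irrelevant label that satisfies the preset propagation condition to the label group to be matched, and take the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation nodes; when no such label exists and i is greater than 1, determine the propagation score of each label associated with each node of the i-th gradient propagation nodes; and determine, from those propagation scores, the final score of each candidate node among the first gradient propagation nodes, where the final score reflects how well the label group to be matched matches each candidate node.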
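In some embodiments, the second determining module 301 being configured to determine the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes includes:
with i greater than 1, when the labels associated with each node of the i-th gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, determining the propagation scores of the labels associated with each node of the (i-1)-th gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes;
continuing until the propagation scores of the labels associated with each candidate node among the first gradient propagation nodes are obtained; and determining the final score of each candidate node from the propagation scores of its associated labels.
In some embodiments, the second determining module 301 is further configured to:
when different nodes of the i-th gradient propagation nodes are associated with the same target label, determine the propagation score of the target label for each of those nodes;
select the maximum propagation score of the target label from these propagation scores;
determine, from the maximum propagation score of the target label, the propagation scores of the labels associated with each node of the (i-1)-th gradient propagation nodes.
In some embodiments, the second determining module 301 is further configured to:
determine the initial score of each label associated with each candidate node among the first gradient propagation nodes;
if the labels associated with each candidate node among the first gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, take the sum of the initial scores of the labels associated with each candidate node as that candidate node's final score.
In some embodiments, the second determining module 301 being configured to determine whether the labels associated with each node of the i-th gradient propagation nodes include an irrelevant label that satisfies the preset propagation condition includes:
if the ratio of the number of nodes in the i-th gradient propagation nodes that are associated with the irrelevant label to the number of all nodes in the i-th gradient propagation nodes is greater than a set threshold, determining that an irrelevant label satisfying the preset propagation condition exists;
if the ratio is less than or equal to the set threshold, determining that no irrelevant label satisfying the preset propagation condition exists.
In some embodiments, the first determining module 300 being configured to determine the candidate node set from the nodes associated with each label in the label group to be matched includes:
determining, from the label group to be matched, at least one label that satisfies a preset pruning condition;
taking the nodes associated with each of the at least one label as candidate nodes and putting them into the candidate node set.
In some embodiments, determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes includes:
with i greater than 1, determining the initial score of each label associated with each node of the i-th gradient propagation nodes;
determining the final score of each candidate node among the first gradient propagation nodes from both the propagation scores and the initial scores of the labels associated with each node of the i-th gradient propagation nodes.
In some embodiments, the second determining module 301 is further configured to:
determine in advance the feature attribute and incentive value of each label in the label group to be matched;
with i greater than or equal to 1, when a label associated with a node of the i-th gradient propagation nodes is a label in the label group to be matched, determine its initial score from its incentive value and the number of times its feature attribute occurs in the label group to be matched;
when a label associated with a node of the i-th gradient propagation nodes is not a label in the label group to be matched, use a preset value as its initial score.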
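In practical applications, both the first determining module 300 and the second determining module 301 can be implemented by a processor located in an electronic device; the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
In addition, the functional modules in this embodiment may be integrated into one processing unit, may each exist physically on their own, or two or more of them may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented as a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of this embodiment in essence, or the part of it that contributes to the related art, or all or part of the technical solution, can be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the method of this embodiment. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a RAM, a magnetic disk, or an optical disc.
Specifically, the computer program instructions corresponding to the label matching method in this embodiment can be stored on a storage medium such as an optical disc, a hard disk, or a USB flash drive; when the computer program instructions corresponding to the label matching method in the storage medium are read or executed by an electronic device, any of the label matching methods of the foregoing embodiments is implemented.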
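Based on the same technical concept as the foregoing embodiments, FIG. 4 shows an electronic device 400 provided by an embodiment of the present application, which may include a memory 401 and a processor 402, wherein:
the memory 401 is configured to store a computer program and data;
the processor 402 is configured to execute the computer program stored in the memory to implement any of the label matching methods of the foregoing embodiments.
In practical applications, the memory 401 may be a volatile memory (Volatile Memory) such as RAM; or a non-volatile memory (Non-Volatile Memory) such as ROM, flash memory (Flash Memory), a hard disk drive (Hard Disk Drive, HDD), or a solid-state drive (Solid-State Drive, SSD); or a combination of the above kinds of memory. It provides instructions and data to the processor 402.
The processor 402 may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor. It can be understood that, for different label matching devices, other electronic components may be used to implement the above processor functions; the embodiments of the present application impose no specific limitation.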
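An embodiment of the present application further provides a computer program product carrying program code; the instructions included in the program code can be used to execute the label matching method described in the above method embodiments. For details, refer to the above method embodiments, which are not repeated here.
The computer program product may be implemented in hardware, software, or a combination of the two. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a Software Development Kit (SDK).
In some embodiments, the functions or modules of the apparatus provided by the embodiments of the present application can be used to perform the methods described in the method embodiments above; for their specific implementation, refer to the description of the method embodiments above, which is not repeated here for brevity.
The above description of the embodiments tends to emphasize the differences between them; for their common or similar aspects, the embodiments may be referred to one another, and these aspects are not repeated here for brevity.
The methods disclosed in the method embodiments provided by the present application can be combined arbitrarily, where no conflict arises, to obtain new method embodiments.
The features disclosed in the product embodiments provided by the present application can be combined arbitrarily, where no conflict arises, to obtain new product embodiments.
The features disclosed in the method or device embodiments provided by the present application can be combined arbitrarily, where no conflict arises, to obtain new method embodiments or device embodiments.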
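Those skilled in the art should understand that the embodiments of the present application can be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are merely preferred embodiments of the present application and are not intended to limit its scope of protection.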

Claims (12)

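  1. A label matching method, the method comprising:
    receiving a label matching request that includes a label group to be matched; determining a candidate node set from the nodes associated with each label in the label group to be matched, and taking each candidate node in the candidate node set as a first gradient propagation node;
    determining whether, among the labels associated with each node of the i-th gradient propagation nodes, there is an irrelevant label that satisfies a preset propagation condition, wherein the irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1;
    when such a label exists, adding the irrelevant label that satisfies the preset propagation condition to the label group to be matched, and taking the non-candidate nodes associated with the irrelevant label that satisfies the preset propagation condition as the (i+1)-th gradient propagation nodes;
    when no such label exists and i is greater than 1, determining the propagation score of each label associated with each node of the i-th gradient propagation nodes; and determining, from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes, the final score of each candidate node among the first gradient propagation nodes, wherein the final score reflects how well the label group to be matched matches each candidate node.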
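  2. The method according to claim 1, wherein determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes comprises:
    with i greater than 1, when the labels associated with each node of the i-th gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, determining the propagation scores of the labels associated with each node of the (i-1)-th gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes;
    continuing until the propagation scores of the labels associated with each candidate node among the first gradient propagation nodes are obtained; and determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with that candidate node.
  3. The method according to claim 2, wherein the method further comprises:
    when different nodes of the i-th gradient propagation nodes are associated with the same target label, determining the propagation score of the target label for each of the different nodes;
    selecting the maximum propagation score of the target label from the propagation scores of the target label for the different nodes;
    determining, from the maximum propagation score of the target label, the propagation scores of the labels associated with each node of the (i-1)-th gradient propagation nodes.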
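  4. The method according to claim 1, wherein the method further comprises:
    determining the initial score of each label associated with each candidate node among the first gradient propagation nodes;
    if the labels associated with each candidate node among the first gradient propagation nodes contain no irrelevant label that satisfies the preset propagation condition, taking the sum of the initial scores of the labels associated with each candidate node as the final score of that candidate node.
  5. The method according to claim 1, wherein determining whether, among the labels associated with each node of the i-th gradient propagation nodes, there is an irrelevant label that satisfies the preset propagation condition comprises:
    if the ratio of the number of nodes in the i-th gradient propagation nodes that are associated with the irrelevant label to the number of all nodes in the i-th gradient propagation nodes is greater than a set threshold, determining that an irrelevant label satisfying the preset propagation condition exists;
    if the ratio of the number of nodes in the i-th gradient propagation nodes that are associated with the irrelevant label to the number of all nodes in the i-th gradient propagation nodes is less than or equal to the set threshold, determining that no irrelevant label satisfying the preset propagation condition exists.
  6. The method according to claim 1, wherein determining the candidate node set from the nodes associated with each label in the label group to be matched comprises:
    determining, from the label group to be matched, at least one label that satisfies a preset pruning condition;
    taking the nodes associated with each of the at least one label as candidate nodes and putting them into the candidate node set.
  7. The method according to claim 1, wherein determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores of the labels associated with each node of the i-th gradient propagation nodes comprises:
    with i greater than 1, determining the initial score of each label associated with each node of the i-th gradient propagation nodes;
    determining the final score of each candidate node among the first gradient propagation nodes from the propagation scores and the initial scores of the labels associated with each node of the i-th gradient propagation nodes.
  8. The method according to claim 4 or 7, wherein the method further comprises:
    determining in advance the feature attribute and incentive value of each label in the label group to be matched;
    with i greater than or equal to 1, when a label associated with a node of the i-th gradient propagation nodes is a label in the label group to be matched, determining the initial score of that label from its incentive value and the number of times its feature attribute occurs in the label group to be matched;
    when a label associated with a node of the i-th gradient propagation nodes is not a label in the label group to be matched, using a preset value as the initial score of that label.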
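  9. A label matching apparatus, the apparatus comprising:
    a first determining module configured to receive a label matching request that includes a label group to be matched; determine a candidate node set from the nodes associated with each label in the label group to be matched; and take each candidate node in the candidate node set as a first gradient propagation node;
    a second determining module configured to determine whether, among the labels associated with each node of the i-th gradient propagation nodes, there is an irrelevant label that satisfies a preset propagation condition, wherein the irrelevant label is a label not included in the label group to be matched and i is an integer greater than or equal to 1; when such a label exists, add the irrelevant label that satisfies the preset propagation condition to the label group to be matched, and take the non-candidate nodes associated with that irrelevant label as the (i+1)-th gradient propagation nodes; when no such label exists and i is greater than 1, determine the propagation score of each label associated with each node of the i-th gradient propagation nodes; and determine, from those propagation scores, the final score of each candidate node among the first gradient propagation nodes, wherein the final score reflects how well the label group to be matched matches each candidate node.
  10. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1 to 8 when executing the program.
  11. A computer storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.
  12. A computer program product comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device performs the method according to any one of claims 1 to 8.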
PCT/CN2022/099766 2021-12-06 2022-06-20 Label matching method, apparatus, device, computer storage medium, and program WO2023103327A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111475736.0A CN114117168A (zh) 2021-12-06 2021-12-06 Label matching method, apparatus, device, and computer storage medium
CN202111475736.0 2021-12-06

Publications (1)

Publication Number Publication Date
WO2023103327A1 true WO2023103327A1 (zh) 2023-06-15

Family

ID=80366948

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/099766 WO2023103327A1 (zh) 2021-12-06 2022-06-20 Label matching method, apparatus, device, computer storage medium, and program

Country Status (2)

Country Link
CN (1) CN114117168A (zh)
WO (1) WO2023103327A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114117168A (zh) * 2021-12-06 2022-03-01 深圳前海微众银行股份有限公司 Label matching method, apparatus, device, and computer storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351681A1 (en) * 2016-06-03 2017-12-07 International Business Machines Corporation Label propagation in graphs
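CN109582675A (zh) * 2018-11-29 2019-04-05 北京达佳互联信息技术有限公司 Label matching method, apparatus, server, and storage medium
CN111967262A (zh) * 2020-06-30 2020-11-20 北京百度网讯科技有限公司 Method and apparatus for determining entity labels
CN112507066A (zh) * 2020-11-18 2021-03-16 北京三快在线科技有限公司 Label marking method, apparatus, electronic device, and readable storage medium
CN112699237A (zh) * 2020-12-24 2021-04-23 百度在线网络技术(北京)有限公司 Label determination method, device, and storage medium
CN114117168A (zh) * 2021-12-06 2022-03-01 深圳前海微众银行股份有限公司 Label matching method, apparatus, device, and computer storage medium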

Also Published As

Publication number Publication date
CN114117168A (zh) 2022-03-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22902765

Country of ref document: EP

Kind code of ref document: A1