WO2023213233A1 - Task processing method, neural network training method, apparatus, device and medium - Google Patents

Task processing method, neural network training method, apparatus, device and medium

Info

Publication number
WO2023213233A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
scale
matching
graph representation
node
Prior art date
Application number
PCT/CN2023/091356
Other languages
English (en)
Chinese (zh)
Inventor
邰骋
汤林鹏
Original Assignee
墨奇科技(北京)有限公司
Priority date
Filing date
Publication date
Priority claimed from CN202210488516.XA external-priority patent/CN117078977A/zh
Priority claimed from CN202210488466.5A external-priority patent/CN117077751A/zh
Application filed by 墨奇科技(北京)有限公司 filed Critical 墨奇科技(北京)有限公司
Publication of WO2023213233A1 (patent/WO2023213233A1/fr)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries

Definitions

  • Step 101: Obtain first data and second data.
  • the first data and the second data are each one of image data, audio data, text data, molecular structure data, and sequence data;
  • Step 104: Perform graph matching on the graph representation of the first scale of the first data and the graph representation of the first scale of the second data to obtain a first matching result;
  • FIG. 2 shows a schematic diagram of a multi-scale graph representation according to one embodiment of the present disclosure.
  • graph representations 202, 204, and 206 of three scales from high to low constitute a multi-scale graph representation.
  • Each graph representation includes multiple nodes, and the graph representation 206 includes multiple adjacent edges.
  • Graph representations 202, 204, 206 may be obtained by sparsifying dense data 208.
  • the dense data 208 includes three sets of dense data corresponding to the three scales respectively.
  • by sparsifying each set, the graph representations 202, 204, and 206 of the three scales can be obtained.
  • detection-based approaches may be used to determine nodes in dense data.
  • the detection-based method may include key point detection, target detection, or other types of detection, which are not limited here.
  • node sparsification can be performed through a node sparsification network, and the node sparsification network can include a detection network, a saliency network, etc.
  • the node sparsification network is a detection network
  • the dense data is input into the detection network, and the sparse nodes corresponding to the dense data and the corresponding confidence of the nodes are obtained.
  • the node sparsification network is a saliency network
  • the dense data and the feature vector corresponding to the dense data are input into the saliency network, and the saliency score of each dense node corresponding to the dense data is obtained.
  • the nodes of at least one scale may be obtained by merging low-scale nodes, and the low-scale nodes may be obtained by sparsifying dense data.
  • a clustering or graph neural network method may be used to cluster multiple low-scale nodes obtained by sparsification, or a subgraph containing multiple low-scale nodes may be input into the graph neural network to obtain higher-scale nodes and/or node attributes (a simplified multi-scale extraction sketch appears after this list).
  • Multi-scale graph representations can include adjacent edges when the relative relationships between nodes help characterize the data: for example, the distance between two targets in an image, the relationship between the two targets, the association between preceding and following words in speech, and the interaction of different groups in a sequence.
  • the nodes at the second scale are obtained by clustering dense data
  • the nodes at the first scale are obtained by merging the nodes at the second scale, and the merging is then used to obtain the first-scale graph representation.
  • the attributes of the subordinate edge may be determined based on the attributes of the two nodes connected by the subordinate edge.
  • the attributes of the subordinate edge can be determined in various ways according to the vector type attributes and/or the scalar type attributes of the two nodes connected by the subordinate edge, which are not limited here.
  • Step 306: Based on the matching results of the candidate matching edge pairs, determine the matching result between the graph representation of the first data and the graph representation of the second data at that scale.
  • the graph representation of each scale in the multi-scale graph representation may include at least one node, the node may include attributes, and the attributes of the node may include scalar type attributes and vector type attributes.
  • the graph representation of at least one scale in the multi-scale graph representation may include at least one adjacent edge. Each of the at least one adjacent edge is used to characterize the relative relationship between two nodes of the same scale. Each adjacent edge has attributes, and the attributes include scalar type attributes and vector type attributes.
  • the Nth round of training means that the network has undergone at least one round of training and thus has a certain inference ability, but it is not intended to limit the specific number of training rounds of the network.
  • FIG. 10 shows a structural block diagram of a task processing device 1000 according to an embodiment of the present disclosure.
  • the device 1000 includes: a first acquisition unit 1010 configured to acquire first data and second data, the first data and the second data each being one of image data, audio data, text data, molecular structure data, and sequence data;
  • the second acquisition unit 1020 is configured to acquire a graph representation of a first scale of each of the first data and the second data,
  • the graph representation of the first scale includes at least one node of the first scale, wherein the node of the first scale has attributes, and the attributes of the node of the first scale include attributes of vector type;
  • the third acquisition unit 1030 is configured to acquire a graph representation of a second scale of each of the first data and the second data, the second scale being lower than the first scale, the graph representation of the second scale including at least one node of the second scale, wherein the node of the second scale has attributes, and the attributes of the node of the second scale include attributes of vector type;
  • respective multi-scale graph representations, wherein each multi-scale graph representation is determined using the graph representation extraction network and includes a graph representation of the first scale and a graph representation of the second scale;
  • the third graph matching unit 1130 is configured to perform graph matching on the graph representation of the first scale of the first sample data and the graph representation of the first scale of the second sample data to obtain a first current matching result that represents the matching degree of the first scale;
  • the fourth graph matching unit 1140 is configured to perform graph matching on the graph representation of the second scale of the first sample data and the graph representation of the second scale of the second sample data to obtain a second current matching result that represents the matching degree of the second scale (simplified matching and training sketches appear after this list).
  • FIG. 12 illustrates an example configuration of an electronic device 1200 that may be used to implement the methods described herein.
  • Each of the above-described apparatus 1000 and apparatus 1100 may also be fully or at least partially implemented by an electronic device 1200 or similar device or system.
  • Electronic device 1200 may be a variety of different types of devices. Examples of electronic devices 1200 include, but are not limited to: desktop computers, server computers, laptop or netbook computers, mobile devices (e.g., tablet computers, cellular or other wireless phones (e.g., smartphones), notepad computers, mobile stations), wearable devices (e.g., glasses, watches), entertainment devices (e.g., entertainment appliances, set-top boxes communicatively coupled to display devices, game consoles), televisions or other display devices, automotive computers, and the like.
  • a display device 1208, such as a monitor, may be included for displaying information and images to a user.
  • Other I/O devices 1210 may be devices that receive various inputs from the user and provide various outputs to the user, and may include touch input devices, gesture input devices, cameras, keyboards, remote controls, mice, printers, audio input/output devices, and so on.
  • a cloud includes and/or represents a platform for resources.
  • the platform abstracts the underlying functionality of the cloud's hardware (e.g., servers) and software resources.
  • Resources may include applications and/or data that may be used while performing computing processing on a server remote from electronic device 1200 .
  • Resources may also include services provided over the Internet and/or through subscriber networks such as cellular or Wi-Fi networks.
  • the platform can abstract resources and functionality to connect electronic device 1200 with other electronic devices. Therefore, implementation of the functionality described herein can be distributed throughout the cloud. For example, functionality may be implemented partly on the electronic device 1200 and partly through a platform that abstracts the functionality of the cloud.
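
The following is a minimal Python sketch of the multi-scale graph extraction described in the excerpts above. It is an illustration of the data flow only, not the disclosed networks: node sparsification is stood in for by a top-k saliency score, node merging by k-means clustering, and adjacent edges by k-nearest-neighbour links. The names Graph, sparsify_nodes, merge_nodes and add_knn_edges, and the specific saliency, clustering and k choices, are assumptions for illustration.

from dataclasses import dataclass, field

import numpy as np
from sklearn.cluster import KMeans


@dataclass
class Graph:
    """A single-scale graph: node positions, vector-type attributes, adjacent edges."""
    nodes: np.ndarray                            # (N, d) node positions
    attrs: np.ndarray                            # (N, c) vector-type node attributes
    edges: list = field(default_factory=list)    # (i, j) index pairs of adjacent edges


def sparsify_nodes(dense: np.ndarray, feats: np.ndarray, keep: int) -> Graph:
    """Stand-in for a detection/saliency network: keep the `keep` points with the
    highest saliency score (here simply the feature norm)."""
    saliency = np.linalg.norm(feats, axis=1)
    idx = np.argsort(-saliency)[:keep]
    return Graph(nodes=dense[idx], attrs=feats[idx])


def merge_nodes(low: Graph, n_clusters: int) -> Graph:
    """Stand-in for clustering/GNN-based merging: k-means over low-scale nodes;
    cluster centroids become higher-scale nodes with mean attributes."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(low.nodes)
    attrs = np.stack([low.attrs[km.labels_ == k].mean(axis=0) for k in range(n_clusters)])
    return Graph(nodes=km.cluster_centers_, attrs=attrs)


def add_knn_edges(g: Graph, k: int = 3) -> Graph:
    """Add adjacent edges between each node and its k nearest neighbours, modelling
    the relative relationship between nodes of the same scale."""
    dist = np.linalg.norm(g.nodes[:, None, :] - g.nodes[None, :, :], axis=-1)
    for i in range(len(g.nodes)):
        for j in np.argsort(dist[i])[1:k + 1]:
            g.edges.append((i, int(j)))
    return g


# Dense data: 200 two-dimensional points with 8-dimensional feature vectors
# (e.g. candidate key points of an image).
rng = np.random.default_rng(0)
dense_points = rng.normal(size=(200, 2))
dense_feats = rng.normal(size=(200, 8))

second_scale = add_knn_edges(sparsify_nodes(dense_points, dense_feats, keep=32))  # lower scale
first_scale = merge_nodes(second_scale, n_clusters=8)                             # higher scale
print(len(second_scale.nodes), len(second_scale.edges), len(first_scale.nodes))

In the disclosure the sparsification and merging are performed by trained networks (detection, saliency or graph neural networks); the fixed heuristics here only mirror the shape of the pipeline.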
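
A second sketch, under the same caveat, illustrates per-scale graph matching followed by fusion into a multi-scale matching result: nodes of one scale of the first data are assigned to nodes of the same scale of the second data by maximising attribute similarity, each scale yields a scalar matching degree, and the degrees are combined with fixed weights. The cosine-similarity cost, the Hungarian assignment and the equal weights are assumptions; the disclosure's candidate matching edge pairs and matching networks are not reproduced.

import numpy as np
from scipy.optimize import linear_sum_assignment


def match_one_scale(attrs_a: np.ndarray, attrs_b: np.ndarray) -> tuple[list, float]:
    """Match the nodes of one scale of the first data against the same scale of the
    second data; return the matched node pairs and a scalar matching degree."""
    a = attrs_a / np.linalg.norm(attrs_a, axis=1, keepdims=True)
    b = attrs_b / np.linalg.norm(attrs_b, axis=1, keepdims=True)
    sim = a @ b.T                                # pairwise cosine similarity
    rows, cols = linear_sum_assignment(-sim)     # assignment maximising total similarity
    pairs = list(zip(rows.tolist(), cols.tolist()))
    return pairs, float(sim[rows, cols].mean())


def multi_scale_match(scales_a: list, scales_b: list, weights: list) -> float:
    """Fuse the per-scale matching degrees (first scale, second scale, ...) into a
    single multi-scale matching result."""
    degrees = [match_one_scale(a, b)[1] for a, b in zip(scales_a, scales_b)]
    return float(np.average(degrees, weights=weights))


rng = np.random.default_rng(1)
first_a, first_b = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))      # first (higher) scale
second_a, second_b = rng.normal(size=(32, 16)), rng.normal(size=(32, 16))  # second (lower) scale

result = multi_scale_match([first_a, second_a], [first_b, second_b], weights=[0.5, 0.5])
print(f"multi-scale matching result: {result:.3f}")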
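
Finally, a toy training step suggests how first and second current matching results might drive an Nth round of training of a graph representation extraction network. The MLP embedding, the differentiable matching degree (cosine similarity of mean-pooled node embeddings), the 0.5/0.5 fusion and the binary match label are all assumptions made so the example stays runnable; they are not the networks or losses of the disclosure.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a graph representation extraction network over node attributes.
embed = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 16))
optimizer = torch.optim.Adam(embed.parameters(), lr=1e-3)


def scale_matching_degree(nodes_a: torch.Tensor, nodes_b: torch.Tensor) -> torch.Tensor:
    """Differentiable matching degree for one scale: cosine similarity of the
    mean-pooled node embeddings of the two graphs."""
    za, zb = embed(nodes_a).mean(dim=0), embed(nodes_b).mean(dim=0)
    return F.cosine_similarity(za, zb, dim=0)


# A toy sample pair: node attributes for two scales of each piece of data,
# plus a label saying whether the pair should match (1) or not (0).
first_a, first_b = torch.randn(8, 8), torch.randn(8, 8)      # first (higher) scale
second_a, second_b = torch.randn(32, 8), torch.randn(32, 8)  # second (lower) scale
label = torch.tensor(1.0)

first_result = scale_matching_degree(first_a, first_b)      # first current matching result
second_result = scale_matching_degree(second_a, second_b)   # second current matching result
multi_scale = 0.5 * first_result + 0.5 * second_result      # fused multi-scale degree

loss = F.binary_cross_entropy_with_logits(multi_scale, label)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))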

Abstract

Task processing method, neural network training method, apparatus, device and medium. The task processing method comprises: obtaining first data and second data; obtaining a graph representation of a first scale and a graph representation of a second scale of each of the first data and the second data, the second scale being lower than the first scale, the graph representation of each scale comprising a node of that scale, the node of each scale comprising an attribute of vector type, a node of at least one scale of each piece of data being obtained by sparsifying dense data corresponding to that data, and the graph representation of at least one scale of each piece of data comprising adjacent edges representing a relative relationship of the nodes of that scale; performing graph matching at the first scale and at the second scale on the first data and the second data, respectively, so as to obtain a first matching result and a second matching result; and determining a multi-scale matching result according to the first matching result and/or the second matching result, and further determining a task processing result.
PCT/CN2023/091356 2022-05-06 2023-04-27 Task processing method, neural network training method, apparatus, device and medium WO2023213233A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210488516.X 2022-05-06
CN202210488516.XA CN117078977A (zh) 2022-05-06 2022-05-06 Task processing method, neural network training method, apparatus, device and medium
CN202210488466.5 2022-05-06
CN202210488466.5A CN117077751A (zh) 2022-05-06 2022-05-06 Neural network training method, graph representation extraction method and task processing method

Publications (1)

Publication Number Publication Date
WO2023213233A1 (fr)

Family

ID=88646265

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/091356 WO2023213233A1 (fr) 2022-05-06 2023-04-27 Task processing method, neural network training method, apparatus, device and medium

Country Status (1)

Country Link
WO (1) WO2023213233A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020006961A1 (fr) * 2018-07-03 2020-01-09 北京字节跳动网络技术有限公司 Procédé et dispositif d'extraction d'image
EP3876114A2 (fr) * 2020-12-25 2021-09-08 Baidu Online Network Technology (Beijing) Co., Ltd. Procede de recommandation d'un terme de recherche, procede d'entrainement d'un modele cible, dispositif de recommandation d'un terme de recherche, dispositif d'entrainement d'un modele cible, dispositif electronique et produit programme
CN113536383A (zh) * 2021-01-27 2021-10-22 支付宝(杭州)信息技术有限公司 基于隐私保护训练图神经网络的方法及装置
CN113609337A (zh) * 2021-02-24 2021-11-05 腾讯科技(深圳)有限公司 图神经网络的预训练方法、训练方法、装置、设备及介质

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23799219

Country of ref document: EP

Kind code of ref document: A1