CN115080241A - Data processing method and device

Data processing method and device

Info

Publication number: CN115080241A
Authority: CN (China)
Application number: CN202210762371.8A
Other languages: Chinese (zh)
Inventor: 郭志强 (Guo Zhiqiang)
Current Assignee: Alipay Hangzhou Information Technology Co Ltd
Original Assignee: Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210762371.8A
Publication of CN115080241A
Legal status: Pending
Prior art keywords: processing, target objects, operator, operators, data queue

Classifications

    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU] (G Physics; G06 Computing, calculating or counting; G06F Electric digital data processing; G06F9/00 Arrangements for program control, e.g. control units; G06F9/06 using stored programs; G06F9/46 Multiprogramming arrangements)
    • G06F9/5005: Allocation of resources to service a request
    • G06F9/5027: the resource being a machine, e.g. CPUs, servers, terminals
    • G06F9/5061: Partitioning or combining of resources
    • G06F9/5072: Grid computing


Abstract

An embodiment of the present specification provides a data processing method and apparatus. The data processing method is applied to a first processing end and includes: determining a plurality of target objects and a plurality of processing operators corresponding to the target objects, where each target object carries an execution sequence for its processing operators, each processing operator corresponds to one data queue, and a processing operator is an operator that performs dimension reduction on the feature vector of a target object; when any processing operator is to be executed on a target object, freezing the target object and adding it to the data queue corresponding to that processing operator; extracting a plurality of target objects from any data queue and sending them to a second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on the plurality of target objects; and receiving a processing result fed back by the second processing end and unfreezing the plurality of target objects according to the processing result. Target objects are thus collected accurately, and the efficiency of executing processing operators on them is improved.

Description

Data processing method and device
Technical Field
The embodiment of the specification relates to the technical field of computers, in particular to a data processing method. One or more embodiments of the present specification relate to a data processing apparatus, a data processing system, a computing device, a computer-readable storage medium, and a computer program.
Background
With the development of computer technology, more and more computer technology is applied in the field of data processing. Taking the graph learning architecture as an example, a graph is composed of two types of elements, called "points" (also referred to as nodes or vertices) and "edges". Each edge has two vertices as its endpoints, and the edge is said to "connect" its two endpoints.
At present, all the points and edges in a graph can be sent to a graphics processor for calculation, but a huge graph often includes a large number of points and edges, and processing them takes a large amount of time, which greatly affects the efficiency of data processing. An efficient data processing scheme is therefore urgently needed.
Disclosure of Invention
In view of this, the embodiments of the present specification provide a data processing method. One or more embodiments of the present specification relate to a data processing apparatus, a data processing system, a computing device, a computer-readable storage medium, and a computer program, so as to solve the technical problems in the prior art.
According to a first aspect of the embodiments of the present specification, there is provided a data processing method applied to a first processing end, the method including:
determining a plurality of target objects and a plurality of processing operators corresponding to the target objects, wherein the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction on the feature vectors of the target objects;
freezing a target object when any processing operator is to be executed on it, and adding the target object to the data queue corresponding to that processing operator;
extracting a plurality of target objects from any data queue, sending the target objects to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects;
and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result.
According to a second aspect of embodiments of the present specification, there is provided a data processing method applied to a second processing side, the method including:
receiving a plurality of target objects sent by a first processing end, wherein the first processing end determines a plurality of target objects and a plurality of processing operators corresponding to the target objects, freezes a target object when any processing operator is to be executed on it, adds the target object to the data queue corresponding to that processing operator, and extracts the plurality of target objects from that data queue; the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction on the feature vectors of the target objects;
concurrently executing a processing operator corresponding to any data queue on the plurality of target objects to obtain a processing result;
and feeding back the processing result to the first processing end, so that the first processing end unfreezes the plurality of target objects according to the processing result.
According to a third aspect of the embodiments of the present specification, there is provided a data processing apparatus applied to a first processing side, the apparatus including:
the determining module is configured to determine a plurality of target objects and a plurality of processing operators corresponding to the target objects respectively, wherein the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on feature vectors of the target objects;
the freezing module is configured to freeze the target object and add the target object to a data queue corresponding to any processing operator under the condition that any processing operator is executed on the target object;
the sending module is configured to extract a plurality of target objects from any data queue, send the plurality of target objects to the second processing end and enable the second processing end to concurrently execute a processing operator corresponding to any data queue on the plurality of target objects;
and the unfreezing module is configured to receive the processing result fed back by the second processing end and unfreeze the plurality of target objects according to the processing result.
According to a fourth aspect of the embodiments of the present specification, there is provided a data processing apparatus applied to a second processing side, the apparatus including:
the receiving module is configured to receive a plurality of target objects sent by the first processing end, wherein the first processing end determines a plurality of target objects and a plurality of processing operators corresponding to the target objects, freezes a target object when any processing operator is to be executed on it, adds the target object to the data queue corresponding to that processing operator, and extracts the plurality of target objects from that data queue; the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction on the feature vectors of the target objects;
the processing module is configured to concurrently execute a processing operator corresponding to any one data queue on the plurality of target objects to obtain a processing result;
and the feedback module is configured to feed back the processing result to the first processing end, so that the first processing end unfreezes the plurality of target objects according to the processing result.
According to a fifth aspect of embodiments herein, there is provided a data processing system comprising a first processing terminal and a second processing terminal;
the first processing end is configured to determine a plurality of target objects and a plurality of processing operators corresponding to the target objects respectively, wherein the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on feature vectors of the target objects; under the condition that any processing operator is executed on a target object, freezing the target object, and adding the target object to a data queue corresponding to the any processing operator; extracting a plurality of target objects from any data queue and sending the target objects to a second processing end; receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result;
the second processing terminal is configured to receive the plurality of target objects sent by the first processing terminal; concurrently executing a processing operator corresponding to any data queue on the plurality of target objects to obtain a processing result; and feeding back a processing result to the first processing end.
According to a sixth aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which when executed by the processor implement the steps of the data processing method provided in the first aspect or the second aspect.
According to a seventh aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data processing method provided by the first aspect or the second aspect.
According to an eighth aspect of embodiments herein, there is provided a computer program, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of the data processing method provided by the first aspect or the second aspect.
In the data processing method provided in one embodiment of the present specification, a plurality of target objects and a plurality of processing operators corresponding to the target objects are determined, where the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction processing on feature vectors of the target objects; under the condition that any processing operator is executed on a target object, freezing the target object, and adding the target object to a data queue corresponding to the any processing operator; extracting a plurality of target objects from any data queue, sending the target objects to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects; and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result. The multiple target objects corresponding to any processing operator are accurately collected, the processing operator corresponding to any data queue is executed for the multiple target objects concurrently, and data processing efficiency is improved.
Drawings
FIG. 1 is a flow diagram of data processing under a data processing system, according to an embodiment of the present disclosure;
FIG. 2 is a system architecture diagram applied to a graph learning scenario according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a data processing method provided by an embodiment of the present specification;
FIG. 4 is a flow diagram of another data processing method provided by one embodiment of the present description;
FIG. 5 is a flowchart illustrating a data processing method according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present specification;
FIG. 7 is a block diagram of another data processing apparatus according to an embodiment of the present disclosure;
fig. 8 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. This specification may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; those skilled in the art can make similar extensions without departing from the spirit and scope of the present disclosure.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first could be termed a second and, similarly, a second could be termed a first without departing from the scope of one or more embodiments of the present specification. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
First, the noun terms to which one or more embodiments of the present specification relate are explained.
Processing operator: an operator used in the graph or picture processing procedure. For example, in a graph learning algorithm model, a processing operator is an operator for performing dimension reduction on the feature vectors of the points and/or edges in the model; as another example, in a picture recognition scenario, a processing operator is an operator for performing dimension reduction on the feature vectors of the sub-image blocks of the picture.
CPU (Central Processing Unit): the CPU is the final execution unit for data processing and program operation, and is the core of computation and control in a computer system.
GPU (Graphics Processing Unit): the GPU, also known as a display core, visual processor, or display chip, is a microprocessor dedicated to image- and graphics-related operations on personal computers, workstations, game consoles, and some mobile devices (e.g., tablet computers and smartphones).
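For illustration only, the following Python sketch shows one minimal form a dimension-reduction processing operator could take: a matrix multiplication that projects a feature vector onto a lower-dimensional space. The names used (reduce_dim, W1) are hypothetical assumptions, not structures taken from the embodiments.

```python
import numpy as np

def reduce_dim(features: np.ndarray, W: np.ndarray) -> np.ndarray:
    """A toy processing operator: project feature vectors onto a lower
    dimension via matrix multiplication."""
    return features @ W

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 500))      # one target object's 500-dim feature vector
W1 = rng.normal(size=(500, 200))   # hypothetical weights of one operator
y = reduce_dim(x, W1)
print(y.shape)                     # (1, 200): reduced from 500 to 200 dims
```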
In this specification, a data processing method is provided. One or more embodiments of the present specification also relate to a data processing apparatus, a data processing system, a computing device, a computer-readable storage medium, and a computer program, which are described in detail in the following embodiments one by one.
With the development of computer technology, more and more computer technology is applied in the field of data processing. Taking the graph learning architecture as an example, a graph is composed of two types of elements, called "points" (also referred to as nodes or vertices) and "edges". Each edge has two vertices as its endpoints, and the edge is said to "connect" its two endpoints.
In practical application, if the graph nodes used are ultra-large-scale, the graph learning architecture needs to be deployed as a distributed graph learning architecture, and the graph nodes are dispersed to the distributed graph learning devices in the architecture through a graph partitioning algorithm. As graph nodes are scattered across the distributed graph learning apparatuses, critical nodes may exist among them: some neighbor nodes of a critical node are stored in the graph learning device where the critical node itself is located, while its remaining neighbor nodes are stored in other graph learning devices.
When graph learning is performed, the graph learning device where a critical node is located needs to store the node data of the critical node, and that node data also needs to be mapped to the graph learning devices where the other neighbor nodes of the critical node are located; that is, those devices need to store mapping information of the critical node's data. A graph node stored in its own graph learning apparatus may be referred to as a Master node, the node data of a Master node may be referred to as Master data, and the node data mapped into other graph learning apparatuses may be referred to as the Mirror data of the graph node. In this case, the graph node as mapped into other graph learning apparatuses may also be referred to as a Mirror node.
Taking a graph learning algorithm model as an example, a relationship graph in the model comprises a plurality of nodes and edges. Taking node A as an example, only after processing operator X has been executed on node A and on the neighbor nodes of node A can processing operator X+1 be executed on node A; when processing operator X+1 is executed on node A, it is executed on the node A already processed by processing operator X. As a result, for node A, all processing operators corresponding to that node cannot be executed simultaneously in the dimension of a single node; the processing operators corresponding to each node can only be executed in sequence after all nodes or edges are mapped. However, sequentially executing the processing operators corresponding to the nodes takes a lot of time, which greatly affects the efficiency of data processing, and an efficient data processing scheme is therefore urgently needed.
In order to improve the efficiency of data processing, the scheme of one embodiment of the present description provides a data processing method, where the data processing method may determine a plurality of target objects and a plurality of processing operators corresponding to the target objects, where the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators performing dimensionality reduction processing on feature vectors of the target objects; under the condition that any processing operator is executed on a target object, freezing the target object, and adding the target object to a data queue corresponding to the any processing operator; extracting a plurality of target objects from any data queue, sending the target objects to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects; and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result. The multiple target objects corresponding to any processing operator are accurately collected, the processing operator corresponding to any data queue is executed for the multiple target objects concurrently, and data processing efficiency is improved.
Referring to fig. 1, fig. 1 illustrates a data processing flow diagram under a data processing system provided in an embodiment of the present specification, where as shown in fig. 1, the data processing system includes a first processing end and a second processing end;
the first processing end is configured to determine a plurality of target objects and a plurality of processing operators corresponding to the target objects respectively, wherein the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on feature vectors of the target objects; under the condition that any processing operator is executed on a target object, freezing the target object, and adding the target object to a data queue corresponding to the any processing operator; extracting a plurality of target objects from any data queue and sending the target objects to a second processing end; receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result;
the second processing terminal is configured to receive the plurality of target objects sent by the first processing terminal; concurrently executing a processing operator corresponding to any data queue on the plurality of target objects to obtain a processing result; and feeding back a processing result to the first processing end.
By applying the scheme of this embodiment of the specification, a plurality of target objects and a plurality of processing operators corresponding to the target objects are determined, wherein the target objects carry execution sequences for the processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction on the feature vectors of the target objects; when any processing operator is to be executed on a target object, the target object is frozen and added to the data queue corresponding to that processing operator; a plurality of target objects are extracted from any data queue and sent to a second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on them; and a processing result fed back by the second processing end is received, and the plurality of target objects are unfrozen according to the processing result. The plurality of target objects corresponding to any processing operator are thus accurately collected, the processing operator corresponding to any data queue is executed concurrently on them, and data processing efficiency is improved.
One or more embodiments provided by the embodiments of the present specification may be applied to a graph learning scenario, referring to fig. 2, fig. 2 is a schematic diagram illustrating a system architecture applied to the graph learning scenario, where the system may include a first processing end 100 and a plurality of second processing ends 200. The plurality of second processing terminals 200 may establish communication connection with each other through the first processing terminal 100. The first processing terminal 100 may send the target object to a plurality of second processing terminals 200.
The second processing terminal 200 and the first processing terminal 100 establish a connection through a network. The network provides a medium for a communication link between the second processing end and the first processing end. The network may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The second processing end 200 may be a browser, an APP (Application), a web application such as an H5 (HTML5) application, a light application (also referred to as an applet), or a cloud application, and the second processing end 200 may be developed and obtained based on an SDK (Software Development Kit) of the corresponding service provided by the first processing end, such as an RTC SDK. The second processing end 200 may be deployed in an electronic device and run depending on the device or on certain APPs running in the device. The electronic device may, for example, have a display screen and support information browsing, and may be a personal mobile terminal such as a mobile phone, a tablet computer, or a personal computer. Various other types of applications may also be deployed in the electronic device, such as human-machine conversation applications, model training applications, text processing applications, web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The first processing terminal 100 may include a server providing various services, such as a server providing communication services for a plurality of second processing terminals, a server for background training that provides support for models used on the second processing terminals, a server for processing data sent by the second processing terminals, and the like.
It should be noted that the first processing end 100 may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. The server may also be a server of a distributed system, or a server incorporating a blockchain. The server may also be a cloud server of basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, or an intelligent cloud computing server or an intelligent cloud host with artificial intelligence technology.
It should be noted that the data processing method provided in the embodiment of the present specification is generally executed by the first processing end, but in other embodiments of the present specification, the second processing end may also have a similar function to the first processing end, so as to execute the data processing method provided in the embodiment of the present specification. In other embodiments, the data processing method provided in the embodiments of the present specification may also be executed by the second processing end and the first processing end together.
Specifically, the data processing method provided in this specification may be applied to named entity identification in a graph learning scene, such as processing an enterprise relationship diagram to obtain a category name of each enterprise, or processing a commodity relationship diagram to obtain a category name of each commodity.
In practical application, the data processing method provided in this specification can also be applied to data processing procedures in other scenes, such as pet identification in a picture recognition scene, and certainly to further scenes as well; the application scene of the data processing method is not limited in this specification.
Taking the picture recognition scene as an example, if the picture to be recognized is too large, it can be dispersed to the distributed systems in the picture recognition architecture for recognition. Specifically, a plurality of sub-image blocks of the picture to be recognized and a plurality of processing operators corresponding to the sub-image blocks are determined, wherein the sub-image blocks carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction on the feature vectors of the sub-image blocks; when any processing operator is to be executed on a sub-image block, the sub-image block is frozen and added to the data queue corresponding to that processing operator; a plurality of sub-image blocks are extracted from any data queue and sent to a second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on the plurality of sub-image blocks; and a processing result fed back by the second processing end is received, and the plurality of sub-image blocks are unfrozen according to the processing result. The plurality of sub-image blocks corresponding to any processing operator are thus accurately collected, the processing operator corresponding to any data queue is executed concurrently on them, and the efficiency of picture recognition is improved.
Referring to fig. 3, fig. 3 shows a flowchart of a data processing method provided in an embodiment of the present specification, where the data processing method is applied to a first processing end, and specifically includes the following steps:
step 302: determining a plurality of target objects and a plurality of processing operators corresponding to the target objects respectively, wherein the target objects carry execution sequences aiming at the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on the characteristic vectors of the target objects.
In one or more embodiments of the present specification, in order to accurately determine multiple target objects that can be executed concurrently in a plurality of target objects, before processing, it is necessary to determine the plurality of target objects and multiple processing operators corresponding to the plurality of target objects.
Specifically, a target object refers to an object on which data processing needs to be performed, including but not limited to a point or an edge in a graph, a sub-image block in a picture, and the like. A processing operator may be a function, an instruction, executable code, and so on, such as a matrix multiplication, specifically selected according to the actual situation, which is not limited in this embodiment of the present specification. One target object may correspond to a plurality of processing operators, and the target object may carry an execution sequence for those processing operators; the execution sequence gives the order in which the plurality of processing operators are executed on the target object. For example, if the feature vector of a target object has 500 dimensions, after processing operator X is performed on it the feature vector is reduced to 200 dimensions; processing operator X+1 is then performed on the 200-dimensional feature vector, yielding a 100-dimensional feature vector for the target object.
Illustratively, node A corresponds to four processing operators: matrix multiply 1, matrix multiply 2, matrix multiply 3, and matrix multiply 4. Node A also carries an execution sequence A for the four processing operators, where execution sequence A may be {matrix multiply 1, matrix multiply 3, matrix multiply 2, matrix multiply 4}, indicating that matrix multiply 1 is executed on node A first; after matrix multiply 1 completes, matrix multiply 3 is executed on node A; after matrix multiply 3 completes, matrix multiply 2 is executed on node A; and after matrix multiply 2 completes, matrix multiply 4 is executed on node A.
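As a minimal sketch only, a target object carrying an execution sequence could be represented as below; the TargetObject class and its field names are illustrative assumptions, not structures prescribed by this specification.

```python
from dataclasses import dataclass

@dataclass
class TargetObject:
    """Illustrative target object (e.g., a graph node) carrying an
    execution sequence over its processing operators."""
    name: str
    sequence: list          # operator names, in execution order
    next_index: int = 0     # position of the next operator to run
    frozen: bool = False

    def next_operator(self):
        return self.sequence[self.next_index]

# Node A with the execution sequence A from the example above.
node_a = TargetObject("A", ["matrix multiply 1", "matrix multiply 3",
                            "matrix multiply 2", "matrix multiply 4"])
print(node_a.next_operator())   # matrix multiply 1
```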
In practical application, there are various ways for the first processing end to determine the plurality of target objects and the plurality of processing operators corresponding to the target objects, and the first processing end is specifically selected according to an actual situation, which is not limited in this embodiment of the present specification.
In a possible implementation manner of this specification, the first processing end may determine a plurality of target objects that need to be processed, and poll the plurality of processing operators to obtain a plurality of processing operators corresponding to each of the plurality of target objects.
For example, assume that there are four objects, node A, node B, node C, and node D, and six processing operators, matrix multiply 1 through matrix multiply 6. The first processing end determines that the target objects are node A and node B. For node A, the first processing end polls matrix multiply 1 through matrix multiply 6 in sequence and determines that the processing operators corresponding to node A are matrix multiply 1 and matrix multiply 4; for node B, the first processing end polls matrix multiply 1 through matrix multiply 6 in sequence and determines that the processing operators corresponding to node B are matrix multiply 4 and matrix multiply 5.
In another possible implementation manner of this specification, each target object carries a processing operator tag, and after the first processing end determines a plurality of target objects, a plurality of processing operators corresponding to the plurality of target objects may be determined according to the processing operator tags carried by the plurality of target objects.
For example, assume again that there are four objects, node A, node B, node C, and node D, and six processing operators, matrix multiply 1 through matrix multiply 6. The first processing end determines that the target objects are node A and node B, where node A carries processing operator labels 1 and 4, and node B carries processing operator labels 4 and 5. For node A, the processing operators corresponding to labels 1 and 4 are matched among the six processing operators matrix multiply 1 through matrix multiply 6, so the processing operators corresponding to node A are determined to be matrix multiply 1 and matrix multiply 4; in the same way, the processing operators corresponding to node B are determined to be matrix multiply 4 and matrix multiply 5.
In an embodiment of this specification, the execution sequence corresponding to each target object may be preset according to prior knowledge, or may be set by the first processing end according to a received operation requirement. In the latter case, before the step of determining a plurality of target objects and the plurality of processing operators corresponding to the target objects, the method may further include the following steps:
setting, for any target object and according to the operation requirement, an execution sequence of the plurality of processing operators corresponding to that target object, wherein the execution sequence comprises the execution order of the plurality of processing operators;
and respectively setting the data queues corresponding to the processing operators.
Specifically, the operation requirement may be an execution sequence requirement of a processing operator in a data processing process, and may also be any other requirement, including but not limited to an execution time requirement, which is specifically selected according to an actual situation, and this is not limited in this embodiment of the present specification.
For example, suppose node A corresponds to four processing operators: matrix multiply 1, matrix multiply 2, matrix multiply 3, and matrix multiply 4. An operation requirement is received stating: "for node A, first execute matrix multiply 1; after matrix multiply 1 completes, continue with matrix multiply 3; after matrix multiply 3 completes, continue with matrix multiply 2; after matrix multiply 2 completes, continue with matrix multiply 4". According to this operation requirement, the execution sequence A of node A may be set as {matrix multiply 1, matrix multiply 3, matrix multiply 2, matrix multiply 4}.
In practical application, after a plurality of target objects and a plurality of processing operators corresponding to the target objects are determined, data queues corresponding to the plurality of processing operators can be set.
Referring to the above example, node A corresponds to the four processing operators matrix multiply 1, matrix multiply 2, matrix multiply 3, and matrix multiply 4; data queue 1 is set for matrix multiply 1, data queue 2 for matrix multiply 2, data queue 3 for matrix multiply 3, and data queue 4 for matrix multiply 4.
By applying the scheme of the embodiment of the specification, for any target object, according to the operation requirement, the execution sequences of the plurality of processing operators corresponding to the target object are set, and the data queues corresponding to the plurality of processing operators are respectively set, so that the plurality of target objects corresponding to any processing operator can be accurately and quickly processed subsequently.
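As a minimal sketch, assuming simple in-memory FIFO queues (the embodiments do not prescribe a particular queue implementation), setting one data queue per processing operator could look like this:

```python
from collections import deque

operators = ["matrix multiply 1", "matrix multiply 2",
             "matrix multiply 3", "matrix multiply 4"]

# One FIFO data queue per processing operator.
data_queues = {op: deque() for op in operators}
print(list(data_queues))   # four queues, one per operator
```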
Step 304: and under the condition that any processing operator is executed on the target object, freezing the target object, and adding the target object to a data queue corresponding to any processing operator.
In one or more embodiments of the present specification, after the first processing end determines a plurality of target objects and a plurality of processing operators corresponding to the target objects, the target objects may be frozen and added to data queues corresponding to any processing operator when any processing operator is executed on the target objects.
Specifically, freezing may also be understood as hibernation: when any processing operator is to be executed on a target object, freezing the target object means suspending execution of the other processing operators corresponding to it. Any processing operator refers to any one of the plurality of processing operators corresponding to the target object, specifically selected according to the actual situation, which is not limited in this embodiment of the present specification.
For example, assume that node A corresponds to four processing operators: matrix multiply 1, matrix multiply 2, matrix multiply 3, and matrix multiply 4. When matrix multiply 2 is to be executed on node A, the first processing end freezes node A, so that node A is in a dormant state and the first processing end does not execute any other processing operator corresponding to node A; at this time, the first processing end adds node A to data queue 2 corresponding to matrix multiply 2. Assume further that node B corresponds to three processing operators, matrix multiply 2, matrix multiply 3, and matrix multiply 4. When matrix multiply 2 is to be executed on node B, the first processing end freezes node B, so that node B is in a dormant state and the first processing end does not execute any other processing operator corresponding to node B; the first processing end then adds node B to data queue 2 corresponding to matrix multiply 2, at which point data queue 2 includes node A and node B.
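The freeze-and-enqueue step of this example could be sketched as follows; freeze_and_enqueue is a hypothetical helper introduced here purely for illustration.

```python
from collections import deque

data_queues = {"matrix multiply 2": deque()}

def freeze_and_enqueue(obj: dict, operator: str, queues: dict) -> None:
    """Suspend further operators on the object and park it in the data
    queue of the operator it is waiting for."""
    obj["frozen"] = True           # the object goes dormant
    queues[operator].append(obj)   # collected alongside its peers

node_a = {"name": "A", "frozen": False}
node_b = {"name": "B", "frozen": False}
freeze_and_enqueue(node_a, "matrix multiply 2", data_queues)
freeze_and_enqueue(node_b, "matrix multiply 2", data_queues)
print([o["name"] for o in data_queues["matrix multiply 2"]])   # ['A', 'B']
```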
In practical application, there are various ways in which the first processing end adds the target object to the data queue corresponding to any processing operator, and the way is specifically selected according to an actual situation, which is not limited in this embodiment of the present specification.
In a possible implementation manner of this specification, the first processing end may poll the plurality of data queues, determine a data queue corresponding to a processing operator currently executed by the frozen target object, and add the target object to the data queue.
In another possible implementation manner of this specification, the step of searching for the data queue corresponding to the target object according to the preset operator identifier corresponding to the processing operator, that is, the step of adding the target object to the data queue corresponding to any processing operator, may include the following steps:
acquiring a preset operator identifier corresponding to any processing operator;
searching, among the plurality of data queues, for the data queue corresponding to the preset operator identifier;
and adding the target object to the data queue corresponding to the preset operator identifier.
Specifically, the preset operator identifier refers to an operator identifier set in advance; an operator identifier uniquely identifies the corresponding processing operator. For example, the preset operator identifier corresponding to the processing operator matrix multiply 1 is "1".
For example, referring to the example of step 304, when matrix multiply 2 is to be executed on node A, the preset operator identifier "2" corresponding to matrix multiply 2 may be obtained, and the data queue corresponding to the preset operator identifier "2" is searched for among data queue 1, data queue 2, data queue 3, and data queue 4; the search result is data queue 2, and at this time the first processing end may add node A to data queue 2 corresponding to the preset operator identifier "2".
By applying the scheme of this embodiment of the specification, a preset operator identifier corresponding to any processing operator is obtained; the data queue corresponding to the preset operator identifier is searched for among the plurality of data queues; and the target object is added to the data queue corresponding to the preset operator identifier. This improves the efficiency of finding the data queue corresponding to a processing operator, and thereby the efficiency of data processing.
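Under the assumption that each preset operator identifier keys directly into a map of data queues, the lookup described above could be sketched as below; the names are illustrative only.

```python
from collections import deque

# Preset operator identifiers: "2" uniquely identifies matrix multiply 2.
queues_by_id = {"1": deque(), "2": deque(), "3": deque(), "4": deque()}

def enqueue_by_id(obj, operator_id: str, queues: dict) -> None:
    # A direct keyed lookup replaces polling every queue in turn.
    queues[operator_id].append(obj)

enqueue_by_id({"name": "A"}, "2", queues_by_id)
print(len(queues_by_id["2"]))   # 1
```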
Step 306: and extracting a plurality of target objects from any data queue, sending the plurality of target objects to the second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the plurality of target objects.
In one or more embodiments of the present specification, a first processing end determines a plurality of target objects and a plurality of processing operators corresponding to the target objects, where in a case where a target object executes any processing operator, the target object is frozen, and after the target object is added to a data queue corresponding to any processing operator, the plurality of target objects may be extracted from any data queue, and the plurality of target objects are sent to a second processing end, so that the second processing end concurrently executes the processing operator corresponding to any data queue to the plurality of target objects.
Illustratively, data queue 1 corresponding to matrix multiply 1 includes node A, node B, node C, and node D; data queue 2 corresponding to matrix multiply 2 includes node E, node F, and node G. The first processing end may extract a plurality of target objects from data queue 1, or from data queue 2; the number of target objects extracted by the first processing end is specifically selected according to the actual situation, which is not limited in this embodiment of the present specification.
In one or more embodiments of the present disclosure, after the first processing end sends the extracted target objects to the second processing end, the second processing end executes the processing operators corresponding to the plurality of target objects; with a fixed network bandwidth, the transmission alone would consume a large amount of time in the whole data processing procedure. Therefore, in this embodiment of the specification, the first processing end uses multiple threads, so that while the second processing end concurrently executes a processing operator on the plurality of target objects sent by at least one thread, at least one other thread is sending further target objects to the second processing end. That is, the step of extracting a plurality of target objects from any data queue and sending them to the second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on the plurality of target objects, may include the following step:
using multiple threads to sequentially extract the plurality of target objects from any data queue and send them to the second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on the plurality of target objects sent by at least one thread while, at the same time, at least one other thread is sending further target objects.
In one or more embodiments, because the number of target objects that each thread can extract at a time is limited, multiple threads may be utilized to extract the plurality of target objects from the data queue. In order to avoid multiple threads repeatedly extracting the same target object, in this embodiment of the present specification the multiple threads extract target objects from the data queue sequentially. The number of threads is specifically selected according to the actual situation, which is not limited in this embodiment of the present specification.
Illustratively, thread 1 may extract two target objects from the data queue at a time, and thread 2 may extract three. Data queue 1 includes node A, node B, node C, node D, and node E. The first processing end may extract node A and node B from data queue 1 using thread 1, and extract node C, node D, and node E using thread 2. After thread 1 extracts node A and node B, it sends them to the second processing end; upon receiving node A and node B, the second processing end executes matrix multiply 1, the processing operator corresponding to data queue 1, on node A and node B at the same time, while thread 2 is still sending node C, node D, and node E to the second processing end.
By applying the scheme of this embodiment of the specification, multiple threads sequentially extract the plurality of target objects from any data queue and send them to the second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on the target objects sent by at least one thread while at least one other thread is still sending target objects. Transmission of target objects and processing of target objects thus proceed concurrently, network transfer time is effectively overlapped with computation, and data processing efficiency is improved.
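A minimal sketch of this pipelined sending follows, assuming a Python thread pool in which one batch can be in flight while another is being processed; send_and_process is a hypothetical stand-in for the real network transfer and remote execution.

```python
import threading
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor

queue = deque(["A", "B", "C", "D", "E"])   # stands in for data queue 1
lock = threading.Lock()

def take_batch(n: int) -> list:
    """Pop up to n objects; the lock keeps threads from re-extracting
    the same target object."""
    with lock:
        return [queue.popleft() for _ in range(min(n, len(queue)))]

def send_and_process(batch: list) -> str:
    time.sleep(0.1)   # stands in for network transfer plus remote execution
    return f"processed {batch}"

# While one batch is processed remotely, the other is still being sent.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(send_and_process, take_batch(2)),
               pool.submit(send_and_process, take_batch(3))]
    for f in futures:
        print(f.result())
```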
In practical application, the target objects may be the points and edges of a relationship graph in a graph learning algorithm model, and the processing operator is an operator for performing dimension reduction on the feature vectors of the points and/or edges in the model. The step of extracting a plurality of target objects from any data queue and sending them to the second processing end, so that the second processing end concurrently executes the processing operator corresponding to that data queue on the plurality of target objects, may include the following steps:
extracting a plurality of points and/or edges from any data queue;
acquiring the feature dimensions of the plurality of points and/or edges respectively;
converting the plurality of points and/or edges into a matrix format according to the feature dimensions;
and sending the plurality of points and/or edges in the matrix format to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the plurality of points and/or edges.
In one or more embodiments of the present disclosure, the feature vectors of the points and/or edges have corresponding feature dimensions, and the first processing end may convert the extracted points and/or edges into a matrix format, so as to facilitate processing by the second processing end.
Illustratively, if the feature vector A of node A is A_{m×n}, the feature vector B of node B is B_{m×n}, and the feature vector C of node C is C_{m×n}, then A_{m×n} may be taken as the first column of the matrix, B_{m×n} as the second column, and C_{m×n} as the third column, obtaining the plurality of points and/or edges converted into a matrix format. The plurality of points and/or edges in the matrix format are sent to the second processing end, so that the second processing end can execute the processing operator corresponding to any data queue on all of them at the same time, thereby improving the processing efficiency of the second processing end.
By applying the scheme of this embodiment of the specification, a plurality of points and/or edges are extracted from the data queue; the feature dimensions of the plurality of points and/or edges are respectively acquired; the plurality of points and/or edges are converted into a matrix format according to the feature dimensions; and the plurality of points and/or edges in the matrix format are sent to the second processing end, which facilitates processing by the second processing end and improves data processing efficiency.
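Assuming numpy as the host-side representation (an implementation choice, not one mandated by the embodiments), stacking per-node feature vectors into one matrix could look like this:

```python
import numpy as np

# Hypothetical 4-dimensional feature vectors for nodes A, B and C.
feat_a = np.array([1.0, 2.0, 3.0, 4.0])
feat_b = np.array([5.0, 6.0, 7.0, 8.0])
feat_c = np.array([9.0, 0.0, 1.0, 2.0])

# Each feature vector becomes one column of the batched matrix, so a
# single operator invocation can cover all columns at once.
batch = np.column_stack([feat_a, feat_b, feat_c])
print(batch.shape)   # (4, 3)
```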
Step 308: and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result.
In one or more embodiments of the present specification, a first processing end determines a plurality of target objects and a plurality of processing operators corresponding to the target objects, freezes the target objects when any processing operator is executed on a target object, adds the target objects to a data queue corresponding to any processing operator, extracts the plurality of target objects from any data queue, and sends the plurality of target objects to a second processing end, so that the second processing end concurrently executes the processing operators corresponding to any data queue on the plurality of target objects, and further may receive a processing result fed back by the second processing end, and unfreeze the plurality of target objects according to the processing result.
Specifically, thawing may also be understood as releasing from dormancy. The first processing end may unfreeze the plurality of target objects after receiving the processing result fed back by the second processing end; once unfrozen, the first processing end may continue to execute the remaining processing operators on the plurality of target objects. For example, node A is frozen while matrix multiply 1 is performed on it; after node A is unfrozen, matrix multiply 2, the next processing operator after matrix multiply 1, may continue to be performed on node A.
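Unfreezing and advancing to the next operator in the execution sequence could be sketched as below; the field names mirror the illustrative target-object structure used earlier and remain assumptions.

```python
def unfreeze(obj: dict) -> None:
    """Release the object from dormancy and advance its cursor so the
    next operator in its execution sequence can run."""
    obj["frozen"] = False
    obj["next_index"] += 1

node_a = {"name": "A",
          "sequence": ["matrix multiply 1", "matrix multiply 2"],
          "next_index": 0,
          "frozen": True}   # frozen while matrix multiply 1 ran remotely
unfreeze(node_a)
print(node_a["sequence"][node_a["next_index"]])   # matrix multiply 2
```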
By applying the scheme of the embodiment of the description, a plurality of target objects and a plurality of processing operators corresponding to the target objects are determined, wherein the target objects carry execution sequences aiming at the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on the feature vectors of the target objects; under the condition that any processing operator is executed on a target object, freezing the target object, and adding the target object to a data queue corresponding to the any processing operator; extracting a plurality of target objects from any data queue, sending the target objects to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects; and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result. The multiple target objects corresponding to any processing operator are accurately collected, the processing operator corresponding to any data queue is executed for the multiple target objects concurrently, and data processing efficiency is improved.
In practical applications, after receiving the processing result fed back by the second processing end, the first processing end may update the execution states of the multiple target objects, that is, after the step of receiving the processing result fed back by the second processing end, the method may further include the following steps:
and updating the execution states of the plurality of target objects according to the processing result fed back by the second processing end.
Specifically, the execution state of a target object may record whether the target object is currently in a frozen or an unfrozen state, and may also record which processing operator the target object has currently executed up to. The first processing end may update the execution states of the target objects by marking the target objects, for example, marking node A as "frozen"; or it may update the target objects in an execution state table, for example, adding node A to the frozen-state table. The specific choice depends on the actual situation, and this is not limited in the embodiments of the present specification.
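Both bookkeeping styles mentioned above can be sketched as follows (a sketch only; the field and table names are assumptions, not the patent's API):

```python
# Style 1: mark the target object itself.
node_a = {"id": "A", "mark": "frozen"}

# Style 2: maintain a central execution-state table keyed by object id.
frozen_state_table = {}

def freeze(node_id: str, operator: str) -> None:
    # Record that the node is frozen and which operator it has reached.
    frozen_state_table[node_id] = {"state": "frozen", "current_operator": operator}

def unfreeze(node_id: str) -> None:
    # Flip the state back once the processing result has been received.
    frozen_state_table[node_id]["state"] = "unfrozen"

freeze("A", "matrix multiplication 1")
unfreeze("A")
print(frozen_state_table["A"])  # {'state': 'unfrozen', 'current_operator': 'matrix multiplication 1'}
```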
By applying the scheme of the embodiment of the specification, the execution states of the plurality of target objects are updated according to the processing result fed back by the second processing end, so that the current processing operator of each target object can be clearly determined and the next processing operator in the execution sequence can then be executed, improving data processing efficiency.
Referring to fig. 4, fig. 4 is a flowchart illustrating another data processing method provided in an embodiment of the present specification, where the data processing method is applied to a second processing end, and specifically includes the following steps:
step 402: and receiving a plurality of target objects sent by the first processing terminal.
The target object is frozen by a first processing end and is added to a data queue corresponding to any processing operator and extracted from any data queue, the target object carries execution sequences aiming at the processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on a feature vector of the target object.
Step 404: and concurrently executing a processing operator corresponding to any data queue on the plurality of target objects to obtain a processing result.
Step 406: and feeding back the processing result to the first processing end, so that the first processing end unfreezes the plurality of target objects according to the processing result.
Specifically, concurrent execution means that the second processing end executes the corresponding processing operator on a plurality of target objects at the same time. The processing result may indicate that the plurality of target objects have been processed, and may of course also carry information such as the processing time. The specific choice depends on the actual situation, and this is not limited in the embodiments of the present specification.
Illustratively, the second processing end receives three target objects sent by the first processing end, namely node A, node B, and node C. The second processing end simultaneously executes matrix multiplication 1 on node A, node B, and node C, and feeds back the result that node A, node B, and node C have completed processing to the first processing end, so that the first processing end unfreezes node A, node B, and node C according to the processing result.
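A sketch of what executing matrix multiplication 1 on the three nodes "at the same time" can amount to, with the GPU kernel stood in by a batched NumPy product (the weights and dimensions are invented for illustration):

```python
import numpy as np

# One 100-dimensional feature vector per node, stacked into a 3x100 batch.
batch = np.stack([np.random.rand(100) for _ in ("A", "B", "C")])
w1 = np.random.rand(100, 50)  # hypothetical weights of matrix multiplication 1

# A single batched product applies the operator to all three nodes at once.
out = batch @ w1
print(out.shape)  # (3, 50): one reduced feature vector per node
```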
By applying the scheme of the embodiment of the specification, the multiple target objects sent by the first processing end are received, the processing operator corresponding to any data queue is concurrently executed on the multiple target objects to obtain a processing result, and the processing result is fed back to the first processing end so that the first processing end unfreezes the multiple target objects according to the processing result. The processing operator is thus executed concurrently on the multiple target objects, improving data processing efficiency.
In practical applications, a graph learning algorithm has multiple layers of matrix multiplication functions, for example the functions from matrix multiplication function X to matrix multiplication function X+N. The matrix multiplication function X+1 is executed on the feature vector of a node only after that feature vector has been processed by the matrix multiplication function X. For example, if the feature vector of node A is 500-dimensional, then after the matrix multiplication function X is executed on it, the feature vector of node A is reduced to 200 dimensions; the matrix multiplication function X+1 is then executed on the 200-dimensional feature vector of node A, yielding a 100-dimensional feature vector. As a result, parallelization cannot be performed directly at the granularity of a single node.
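The dependency chain can be made concrete with the dimensions from the example (the weight matrices are invented; only the 500 → 200 → 100 reduction comes from the text):

```python
import numpy as np

x = np.random.rand(500)          # 500-dimensional feature vector of node A
w_x = np.random.rand(500, 200)   # hypothetical weights of matrix multiplication function X
w_x1 = np.random.rand(200, 100)  # hypothetical weights of matrix multiplication function X+1

h = x @ w_x    # X reduces the feature vector to 200 dimensions
y = h @ w_x1   # X+1 can only run on X's output: 200 -> 100 dimensions
print(h.shape, y.shape)  # (200,) (100,)
```

Because each stage consumes the previous stage's output for the same node, the opportunity for parallelism lies across nodes within one stage, not across stages for one node.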
The following further describes the data processing method with reference to fig. 5, taking the application of the data processing method provided in this specification in a graph learning scenario as an example. The first processing end may be a CPU side, the second processing end may be a GPU side, the target object may be a node or an edge, and the processing operator may be a matrix multiplication function. Fig. 5 is a flowchart illustrating a processing procedure of a data processing method provided in an embodiment of the present specification. The method is applied to a CPU side and specifically includes the following steps:
step 502: and setting an execution sequence of a plurality of matrix multiplication functions corresponding to any node according to the operation requirement, wherein the execution sequence comprises the execution sequence of the matrix multiplication functions.
Step 504: and respectively setting the data queues corresponding to the matrix multiplication functions.
Specifically, matrix multiplication 1 is executed first for all nodes in the graph, and matrix multiplication 2 is then executed for all nodes in the graph. The two matrix multiplication functions, matrix multiplication 1 and matrix multiplication 2, both reduce the dimensionality of the feature vector of each node in the graph, and they constitute two execution stages. In this case, one data queue may be allocated to each of matrix multiplication 1 and matrix multiplication 2.
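A sketch of steps 502 and 504 under these assumptions, with one FIFO queue per execution stage (the queue and operator names are illustrative, not from the patent):

```python
import queue

# Step 502: the execution sequence every node follows, stage by stage.
execution_sequence = ["matrix multiplication 1", "matrix multiplication 2"]

# Step 504: one data queue per matrix multiplication function, so each queue
# only ever holds nodes that are waiting for the same operator.
data_queues = {op: queue.Queue() for op in execution_sequence}
print(list(data_queues))  # ['matrix multiplication 1', 'matrix multiplication 2']
```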
Step 506: the method comprises the steps of determining a plurality of nodes and a plurality of matrix multiplication functions corresponding to the nodes respectively, wherein the nodes carry execution sequences aiming at the matrix multiplication functions, and each matrix multiplication function corresponds to one data queue.
Specifically, the matrix multiplication function is a function for performing dimension reduction processing on the feature vector of each node. Each node in the graph has an execution sequence; for node A and node B, the execution sequence may be {matrix multiplication 1 for node A, matrix multiplication 1 for node B, matrix multiplication 2 for node A, matrix multiplication 2 for node B}. That is, each node may be regarded as an independent entity, and the nodes automatically execute forward according to the execution sequence.
Step 508: and in the case of executing any matrix multiplication function on the node, freezing the node and adding the node to a data queue corresponding to any matrix multiplication function.
Specifically, at the moment node A reaches matrix multiplication 1, the execution of matrix multiplication 1 on node A may be paused, so that node A enters the sleep state, which may also be understood as a frozen state, and node A is added to the data queue corresponding to matrix multiplication 1. In this way, all nodes in the data queue corresponding to matrix multiplication 1 are nodes that have executed up to matrix multiplication 1.
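Freezing a node the moment it reaches an operator and parking it on that operator's queue might look like this (a self-contained sketch; the node representation and function name are assumptions):

```python
import queue

data_queues = {"matrix multiplication 1": queue.Queue(),
               "matrix multiplication 2": queue.Queue()}

def freeze_and_enqueue(node: dict, operator: str) -> None:
    # Pause execution for this node, mark it frozen, and add it to the
    # queue of the operator it has just reached.
    node["state"] = "frozen"
    node["current_operator"] = operator
    data_queues[operator].put(node)

freeze_and_enqueue({"id": "A"}, "matrix multiplication 1")
print(data_queues["matrix multiplication 1"].qsize())  # 1: node A is waiting
```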
Step 510: and utilizing multiple threads to sequentially extract a plurality of nodes from any data queue and send them to the GPU side, so that the GPU side concurrently executes the matrix multiplication function corresponding to any data queue on the plurality of nodes sent by at least one thread while, at the same time, at least one other thread is sending a plurality of nodes.
Specifically, the CPU may use multiple threads to sequentially extract a plurality of nodes from the data queue corresponding to matrix multiplication 1, and convert the feature vectors of the plurality of nodes on the CPU side into a matrix format. Assuming that each node provides a 100-dimensional feature input for the matrix multiplication 1 function, each node is converted according to this 100-dimensional input into the matrix format used by the GPU. The CPU sequentially extracts 64 nodes from the data queue using multiple threads and sends the 64 nodes to the GPU side, so that the GPU side concurrently executes matrix multiplication 1 on the 64 nodes.
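The batching step could be sketched as follows, assuming 100-dimensional features and a batch of 64 (both figures come from the text; the helper name is invented):

```python
import numpy as np
import queue

q = queue.Queue()
for i in range(200):
    # Each frozen node waiting in the queue carries a 100-dimensional feature.
    q.put({"id": i, "features": np.random.rand(100)})

def take_batch(q: queue.Queue, batch_size: int = 64) -> np.ndarray:
    # Pull up to batch_size nodes off the queue and stack their feature
    # vectors into one matrix that the GPU side can process in a single call.
    nodes = [q.get() for _ in range(min(batch_size, q.qsize()))]
    return np.stack([node["features"] for node in nodes])

batch = take_batch(q)
print(batch.shape)  # (64, 100): 64 nodes per GPU launch
```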
Step 512: and receiving a processing result fed back by the GPU side, and unfreezing the plurality of nodes according to the processing result.
Step 514: and updating the execution states of the plurality of nodes according to the processing result fed back by the GPU side.
Specifically, after the GPU side has finished processing the 64 nodes, the CPU side unfreezes the 64 nodes and updates their states: matrix multiplication 1 has been executed, and the next instruction, matrix multiplication 2, is to be executed next.
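Unfreezing the batch and advancing each node to the next operator in its execution sequence might look like this (illustrative names only):

```python
execution_sequence = ["matrix multiplication 1", "matrix multiplication 2"]

def on_gpu_result(node: dict) -> None:
    # Wake the node and point it at the next operator in its execution sequence.
    node["state"] = "unfrozen"
    done = execution_sequence.index(node["current_operator"])
    if done + 1 < len(execution_sequence):
        node["current_operator"] = execution_sequence[done + 1]

node_a = {"id": "A", "state": "frozen", "current_operator": "matrix multiplication 1"}
on_gpu_result(node_a)
print(node_a)  # next up: matrix multiplication 2
```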
By applying the scheme of the embodiment of the specification, multiple points and/or edges are combined and the same matrix multiplication function is concurrently executed on them. This solves the problem that, when graph learning uses accelerating equipment such as GPUs, parallelization cannot be performed directly at the single-point/edge granularity; computation and communication are overlapped, points/edges are processed in batches, and data processing efficiency is improved.
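The overlap of communication and computation can be sketched with two CPU threads, one sending batches while the other stands in for the GPU-side work (the sleep calls are stand-ins; nothing here is the patent's actual implementation):

```python
import queue
import threading
import time

batches = queue.Queue()

def sender(n_batches: int) -> None:
    # Thread 1: keeps converting and sending batches to the device.
    for i in range(n_batches):
        time.sleep(0.01)   # stand-in for CPU -> GPU transfer of batch i
        batches.put(i)
    batches.put(None)      # sentinel: nothing more to send

def worker() -> None:
    # Thread 2: executes the operator on each batch as soon as it arrives,
    # so computation overlaps with the transfer of the next batch.
    while batches.get() is not None:
        time.sleep(0.01)   # stand-in for the batched matrix multiplication

t1 = threading.Thread(target=sender, args=(4,))
t2 = threading.Thread(target=worker)
t1.start(); t2.start()
t1.join(); t2.join()
```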
Corresponding to the above data processing method embodiment applied to the first processing end, this specification further provides an embodiment of a data processing apparatus, which is applied to the first processing end, and fig. 6 shows a schematic structural diagram of a data processing apparatus provided in an embodiment of this specification. As shown in fig. 6, the apparatus includes:
a determining module 602, configured to determine a plurality of target objects and a plurality of processing operators corresponding to the target objects, where the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators performing dimensionality reduction processing on feature vectors of the target objects;
the freezing module 604 is configured to freeze the target object and add the target object to a data queue corresponding to any processing operator in the case that any processing operator is executed on the target object;
a sending module 606, configured to extract a plurality of target objects from any data queue, and send the plurality of target objects to a second processing end, so that the second processing end concurrently executes a processing operator corresponding to any data queue on the plurality of target objects;
and the unfreezing module 608 is configured to receive the processing result fed back by the second processing end and unfreeze the plurality of target objects according to the processing result.
Optionally, the target object is a point and an edge of a relation graph in the graph learning algorithm model, and the processing operator is an operator for performing dimension reduction processing on a feature vector of the point and/or the edge in the graph learning algorithm model;
a sending module 606 further configured to extract points and/or edges from any of the data queues; respectively acquiring the characteristic dimensions of a plurality of points and/or edges; converting a plurality of points and/or edges into a matrix format according to the characteristic dimension; and sending the plurality of points and/or edges in the matrix format to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the plurality of points and/or edges.
Optionally, the sending module 606 is further configured to utilize multiple threads to sequentially extract multiple target objects from any data queue and send them to the second processing end, so that the second processing end concurrently executes the processing operator corresponding to any data queue on the multiple target objects sent by at least one thread while, at the same time, at least another thread is sending multiple target objects.
Optionally, the apparatus further comprises: the setting module is configured to set an execution sequence of a plurality of processing operators corresponding to any target object according to an operation requirement, wherein the execution sequence comprises an execution sequence of the plurality of processing operators; and respectively setting respective corresponding data queues for the plurality of processing operators.
Optionally, the apparatus further comprises: and the updating module is configured to update the execution states of the plurality of target objects according to the processing result fed back by the second processing end.
Optionally, the freezing module 604 is further configured to obtain a preset operator identifier corresponding to any processing operator; searching a data queue corresponding to the preset operator identification in the plurality of data queues according to the preset operator identification; and adding the target object to a data queue corresponding to the preset operator identification.
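A sketch of the operator-identifier lookup performed by the freezing module (the identifiers and data structures are assumptions for illustration):

```python
import queue

# Hypothetical preset operator identifiers mapped to their data queues.
data_queues_by_id = {"op-1": queue.Queue(), "op-2": queue.Queue()}

def add_to_operator_queue(target_object: dict, operator_id: str) -> None:
    # Look up the data queue by the preset operator identifier, then
    # append the frozen target object to it.
    data_queues_by_id[operator_id].put(target_object)

add_to_operator_queue({"id": "A"}, "op-1")
print(data_queues_by_id["op-1"].qsize())  # 1
```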
The method comprises the steps that a plurality of target objects and a plurality of processing operators corresponding to the target objects are determined by applying the scheme of the embodiment of the specification, wherein the target objects carry execution sequences aiming at the processing operators, each processing operator corresponds to a data queue, and the processing operators are operators for performing dimensionality reduction on feature vectors of the target objects; under the condition that any processing operator is executed on a target object, freezing the target object, and adding the target object to a data queue corresponding to the any processing operator; extracting a plurality of target objects from any data queue, sending the target objects to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects; and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result. The multiple target objects corresponding to any processing operator are accurately collected, the processing operator corresponding to any data queue is executed for the multiple target objects concurrently, and data processing efficiency is improved.
The above is a schematic configuration of a data processing apparatus of the present embodiment. It should be noted that the technical solution of the data processing apparatus and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the data processing apparatus can be referred to the description of the technical solution of the data processing method.
Corresponding to the above data processing method embodiment applied to the second processing end, this specification also provides an embodiment of a data processing apparatus, which is applied to the second processing end, and fig. 7 shows a schematic structural diagram of another data processing apparatus provided in an embodiment of this specification. As shown in fig. 7, the apparatus includes:
a receiving module 702, configured to receive multiple target objects sent by a first processing end, where a plurality of target objects and multiple processing operators corresponding to the target objects are determined by the first processing end for a target object, and when any processing operator is executed on the target object, the target object is frozen, and is added to a data queue corresponding to any processing operator, and is extracted from any data queue, where the target object carries an execution sequence for the multiple processing operators, each processing operator corresponds to one data queue, and the processing operator is an operator for performing dimension reduction processing on a feature vector of the target object;
the processing module 704 is configured to concurrently execute a processing operator corresponding to any one data queue on the plurality of target objects to obtain a processing result;
and a feedback module 706 configured to feed back the processing result to the first processing end, so that the first processing end unfreezes the plurality of target objects according to the processing result.
By applying the scheme of the embodiment of the specification, a plurality of target objects sent by a first processing end are received, wherein the target objects are obtained by determining a plurality of target objects and a plurality of processing operators corresponding to the target objects by the first processing end, freezing the target objects under the condition that any processing operator is executed on the target objects, adding the target objects to a data queue corresponding to any processing operator, and extracting the target objects from any data queue, wherein the target objects carry execution sequences aiming at the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimension reduction processing on feature vectors of the target objects; concurrently executing a processing operator corresponding to any data queue on the plurality of target objects to obtain a processing result; and feeding back the processing result to the first processing end, so that the first processing end unfreezes the plurality of target objects according to the processing result, the processing operator corresponding to any data queue is concurrently executed on the plurality of target objects, and the data processing efficiency is improved.
The above is a schematic configuration of a data processing apparatus of the present embodiment. It should be noted that the technical solution of the data processing apparatus and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the data processing apparatus can be referred to the description of the technical solution of the data processing method.
Fig. 8 shows a block diagram of a computing device according to an embodiment of the present specification. The components of the computing device 800 include, but are not limited to, memory 810 and a processor 820. The processor 820 is coupled to the memory 810 via a bus 830, and the database 850 is used to store data.
Computing device 800 also includes an access device 840 that enables computing device 800 to communicate via one or more networks 860. Examples of such networks include a Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 840 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)), whether wired or wireless, such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 800, as well as other components not shown in FIG. 8, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 8 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 800 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 800 may also be a mobile or stationary server.
Wherein the processor 820 is configured to execute computer-executable instructions, which when executed by the processor, implement the steps of the data processing method provided in fig. 3 or fig. 4 described above.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data processing method provided by the above-mentioned fig. 3 or fig. 4 belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the data processing method provided by the above-mentioned fig. 3 or fig. 4.
An embodiment of the present specification further provides a computer-readable storage medium storing computer-executable instructions, which when executed by a processor, implement the steps of the data processing method provided in fig. 3 or fig. 4.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the data processing method provided by the above-mentioned fig. 3 or fig. 4, and details of the technical solution of the storage medium, which are not described in detail, can be referred to the description of the technical solution of the data processing method provided by the above-mentioned fig. 3 or fig. 4.
An embodiment of the present specification further provides a computer program, wherein when the computer program is executed in a computer, the computer program causes the computer to execute the steps of the data processing method provided in fig. 3 or fig. 4.
The above is a schematic scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program is the same as the technical solution of the data processing method provided by the above-mentioned fig. 3 or fig. 4, and details of the technical solution of the computer program, which are not described in detail, can be referred to the description of the technical solution of the data processing method provided by the above-mentioned fig. 3 or fig. 4.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of combinations of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the embodiments. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (12)

1. A data processing method is applied to a first processing end, and comprises the following steps:
determining a plurality of target objects and a plurality of processing operators corresponding to the target objects respectively, wherein the target objects carry execution sequences aiming at the processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on feature vectors of the target objects;
under the condition that any processing operator is executed on the target object, freezing the target object, and adding the target object to a data queue corresponding to the processing operator;
extracting a plurality of target objects from any data queue, sending the target objects to a second processing end, and enabling the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects;
and receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result.
2. The method according to claim 1, wherein the target objects are points and edges of a relation graph in a graph learning algorithm model, and the processing operators are operators for performing dimension reduction processing on feature vectors of the points and/or the edges in the graph learning algorithm model;
the step of extracting a plurality of target objects from any data queue and sending the plurality of target objects to a second processing end, so that the second processing end concurrently executes a processing operator corresponding to any data queue to the plurality of target objects, includes:
extracting a plurality of points and/or edges from any data queue;
respectively acquiring the characteristic dimensions of the plurality of points and/or edges;
converting the plurality of points and/or edges into a matrix format according to the characteristic dimension;
and sending the plurality of points and/or edges in the matrix format to a second processing end, and enabling the second processing end to concurrently execute the processing operator corresponding to any data queue on the plurality of points and/or edges.
3. The method according to claim 1, wherein the step of extracting a plurality of target objects from any data queue and sending the plurality of target objects to a second processing end, so that the second processing end concurrently executes a processing operator corresponding to any data queue on the plurality of target objects comprises:
and utilizing a plurality of threads to sequentially extract a plurality of target objects from any data queue and send them to a second processing end, enabling the second processing end to simultaneously execute the processing operator corresponding to any data queue on the plurality of target objects sent by at least one thread while, at the same time, at least another thread is sending a plurality of target objects.
4. The method of claim 1, wherein before the step of determining a plurality of target objects and a plurality of processing operators corresponding to the target objects, the method further comprises:
setting an execution sequence of a plurality of processing operators corresponding to any target object according to an operation requirement, wherein the execution sequence comprises an execution order of the plurality of processing operators;
and respectively setting respective corresponding data queues for the plurality of processing operators.
5. The method of claim 1, wherein after the step of receiving the processing result fed back by the second processing end, the method further comprises:
and updating the execution states of the plurality of target objects according to the processing result fed back by the second processing end.
6. The method of claim 1, wherein the step of adding the target object to the data queue corresponding to any processing operator comprises:
acquiring a preset operator identification corresponding to any processing operator;
searching a data queue corresponding to the preset operator identification in a plurality of data queues according to the preset operator identification;
and adding the target object to a data queue corresponding to the preset operator identification.
7. A data processing method is applied to a second processing end, and comprises the following steps:
receiving a plurality of target objects sent by a first processing end, wherein the target objects are obtained by the first processing end determining a plurality of target objects and a plurality of processing operators corresponding to the target objects, freezing a target object when any processing operator is executed on it, adding the target object to the data queue corresponding to that processing operator, and extracting it from that data queue; the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on the feature vectors of the target objects;
concurrently executing a processing operator corresponding to any one data queue on the plurality of target objects to obtain a processing result;
and feeding back the processing result to the first processing end, so that the first processing end unfreezes the target objects according to the processing result.
8. A data processing apparatus applied to a first processing end, the apparatus comprising:
the determining module is configured to determine a plurality of target objects and a plurality of processing operators corresponding to the target objects, wherein the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators perform dimensionality reduction on feature vectors of the target objects;
the freezing module is configured to freeze the target object and add the target object to a data queue corresponding to any processing operator under the condition that any processing operator is executed on the target object;
the sending module is configured to extract a plurality of target objects from any data queue, send the target objects to a second processing end, and enable the second processing end to concurrently execute a processing operator corresponding to any data queue on the target objects;
and the unfreezing module is configured to receive a processing result fed back by the second processing end and unfreeze the plurality of target objects according to the processing result.
9. A data processing apparatus applied to a second processing end, the apparatus comprising:
a receiving module, configured to receive a plurality of target objects sent by a first processing end, wherein the target objects are obtained by the first processing end determining a plurality of target objects and a plurality of processing operators corresponding to the target objects, freezing a target object when any processing operator is executed on it, adding the target object to the data queue corresponding to that processing operator, and extracting it from that data queue; the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on the feature vectors of the target objects;
the processing module is configured to concurrently execute a processing operator corresponding to any one of the data queues on the plurality of target objects to obtain a processing result;
and the feedback module is configured to feed back the processing result to the first processing end, so that the first processing end unfreezes the plurality of target objects according to the processing result.
10. A data processing system comprises a first processing terminal and a second processing terminal;
the first processing terminal is configured to determine a plurality of target objects and a plurality of processing operators corresponding to the target objects, wherein the target objects carry execution sequences for the plurality of processing operators, each processing operator corresponds to one data queue, and the processing operators are operators for performing dimensionality reduction on feature vectors of the target objects; under the condition that any processing operator is executed on the target object, freezing the target object, and adding the target object to a data queue corresponding to the processing operator; extracting a plurality of target objects from any data queue and sending the target objects to a second processing end; receiving a processing result fed back by the second processing end, and unfreezing the plurality of target objects according to the processing result;
the second processing terminal is configured to receive the plurality of target objects sent by the first processing terminal; concurrently executing a processing operator corresponding to any one data queue on the plurality of target objects to obtain a processing result; and feeding back the processing result to the first processing end.
11. A computing device, comprising:
a memory and a processor;
the memory is for storing computer-executable instructions and the processor is for executing the computer-executable instructions, which when executed by the processor implement the steps of the data processing method of any one of claims 1 to 6 or claim 7.
12. A computer readable storage medium storing computer executable instructions which, when executed by a processor, implement the steps of the data processing method of any one of claims 1 to 6 or claim 7.
CN202210762371.8A 2022-06-30 2022-06-30 Data processing method and device Pending CN115080241A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210762371.8A CN115080241A (en) 2022-06-30 2022-06-30 Data processing method and device

Publications (1)

Publication Number Publication Date
CN115080241A true CN115080241A (en) 2022-09-20

Family

ID=83258287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210762371.8A Pending CN115080241A (en) 2022-06-30 2022-06-30 Data processing method and device

Country Status (1)

Country Link
CN (1) CN115080241A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160078077A1 (en) * 2014-09-12 2016-03-17 Palaniappan Gandhi Methods, systems, and apparatus for processing data event streams in a database environment
WO2018121738A1 (en) * 2016-12-30 2018-07-05 北京奇虎科技有限公司 Method and apparatus for processing streaming data task
US20190118085A1 (en) * 2016-09-21 2019-04-25 Tencent Technology (Shenzhen) Company Limited Data processing method and apparatus, and storage medium
CN111198769A (en) * 2018-11-16 2020-05-26 北京京东金融科技控股有限公司 Information processing method and system, computer system and computer readable medium
CN112148455A (en) * 2020-09-29 2020-12-29 星环信息科技(上海)有限公司 Task processing method, device and medium
US20210216373A1 (en) * 2020-01-15 2021-07-15 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for traversing graph database
CN113419824A (en) * 2021-01-25 2021-09-21 阿里巴巴集团控股有限公司 Data processing method, device, system and computer storage medium
CN113613028A (en) * 2021-08-03 2021-11-05 北京达佳互联信息技术有限公司 Live broadcast data processing method, device, terminal, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination