CN114217921A - Distributed parallel computing method, computing device and storage medium - Google Patents

Distributed parallel computing method, computing device and storage medium Download PDF

Info

Publication number
CN114217921A
CN114217921A
Authority
CN
China
Prior art keywords
slave
host
subtasks
description information
subtask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111355296.5A
Other languages
Chinese (zh)
Inventor
蒋永俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp filed Critical Ubtech Robotics Corp
Priority to CN202111355296.5A priority Critical patent/CN114217921A/en
Publication of CN114217921A publication Critical patent/CN114217921A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present application is applicable to the field of computer technology and provides a distributed parallel computing method, a computing device, and a storage medium. A host establishes socket connections with a plurality of slaves and decomposes a preset task into a plurality of subtasks; it then synchronously distributes the description information of the subtasks to the slaves and synchronously distributes the input parameters of the subtasks to the slaves. Each slave executes its subtask according to the received description information and input parameter, generates an execution result, and sends it to the host, and the host receives the execution results of the subtasks sent by the slaves. The host neither configures the slaves' environments nor determines each slave's computing task in advance, and it supports dynamically sending complex and varied parallel computing tasks and collecting the parallel computing results, thereby realizing an efficient and fast distributed parallel computing method.

Description

Distributed parallel computing method, computing device and storage medium
Technical Field
The present application relates to the field of computer technology, and in particular to a distributed parallel computing method, a computing device, and a storage medium.
Background
In the field of computer technology, the hardware computing power of a computer is an important metric. To meet the demands of fast computation on large and highly complex data, simply improving the hardware performance of a single computer is costly and yields limited gains, so parallel computing over a computer cluster is required. To realize cluster parallel computing, the computers in the cluster must cooperate with one another: they share the input data of computing tasks, divide and execute the computing tasks, and aggregate the computing results. The parallel computing power of a computer cluster can theoretically be several times that of a single computer.
At present, a common distributed parallel computing method is the Message Passing Interface (MPI), which offers high performance, large scale, and portability. However, MPI requires configuring the same working environment on every computer in the cluster in advance, password-free login between the computers based on Secure Shell (SSH), and determining each computer's computing task ahead of time, which makes it difficult to configure in large-scale, complex, and changeable parallel task scenarios.
Disclosure of Invention
The embodiments of the present application provide a distributed parallel computing method, a host, a slave, a computing device, and a storage medium, aiming to solve the problem that the existing MPI-based distributed parallel computing method is difficult to configure in large-scale, complex, and changeable parallel task scenarios.
A first aspect of an embodiment of the present application provides a distributed parallel computing method, which is applied to a host computer, where the host computer establishes socket connections with m slave computers, and the method includes:
decomposing a preset task into m subtasks;
synchronously distributing the description information of the m subtasks to the m slave machines;
synchronously distributing the input parameters of the m subtasks to the m slave machines;
receiving the execution results of the m subtasks sent by the m slave machines;
and each slave machine executes the corresponding subtask according to the description information and the input parameter of the received subtask, generates an execution result and sends the execution result to the master machine.
A second aspect of the embodiments of the present application provides a distributed parallel computing method, which is applied to a slave computer, where the slave computer establishes a socket connection with a host computer, and the method includes:
receiving description information of the subtasks sent by the host;
receiving input parameters of the subtasks sent by the host;
executing the subtasks and generating an execution result according to the description information and the input parameters;
and sending the execution result to the host.
A third aspect of embodiments of the present application provides a host that establishes socket connections with m slaves, the host including:
the task decomposition unit is used for decomposing the preset task into m subtasks;
the description information distribution unit is used for synchronously distributing the description information of the m subtasks to the m slave machines;
an input parameter distribution unit, configured to synchronously distribute input parameters of the m subtasks to the m slaves;
an execution result receiving unit, configured to receive execution results of m sub-tasks sent by the m slaves;
and each slave machine executes the corresponding subtask according to the description information and the input parameter of the received subtask, generates an execution result and sends the execution result to the master machine.
A fourth aspect of the embodiments of the present application provides a slave that establishes a socket connection with a master, the slave including:
the description information receiving unit is used for receiving the description information of the subtask sent by the host;
an input parameter receiving unit, configured to receive an input parameter of the subtask sent by the host;
the task execution unit is used for executing the subtasks and generating an execution result according to the description information and the input parameters;
and the execution result sending unit is used for sending the execution result to the host.
A fifth aspect of embodiments of the present application provides a computing device comprising a communication module, a processor, a memory, and a computer program stored in the memory and executable on the processor;
when the computing device is a host, the processor implements the steps of the distributed parallel computing method provided in the first aspect of the embodiment of the present application when executing the computer program;
when the computing device is a slave, the processor implements the steps of the distributed parallel computing method provided by the second aspect of the embodiment of the present application when executing the computer program.
A sixth aspect of embodiments of the present application provides a computer-readable storage medium, which stores a computer program that, when executed by a processor, implements the steps of the distributed parallel computing method provided in the first or second aspect of embodiments of the present application.
The distributed parallel computing method provided by the first aspect of the embodiments of the present application is applied to a host. The host establishes socket connections with a plurality of slaves and decomposes a preset task into a plurality of subtasks; it then synchronously distributes the description information of the subtasks to the slaves and synchronously distributes the input parameters of the subtasks to the slaves. Each slave executes its subtask according to the received description information and input parameter, generates an execution result, and sends it to the host; finally, the host receives the execution results of the subtasks sent by the slaves. The host neither configures the slaves' environments nor determines each slave's computing task in advance, and it supports dynamically sending complex and varied parallel computing tasks and collecting the parallel computing results, thereby realizing an efficient and fast distributed parallel computing method.
The distributed parallel computing method provided by the second aspect of the embodiments of the present application is applied to a slave. The slave establishes a socket connection with the host, receives the description information of a subtask sent by the host, receives the input parameter of the subtask sent by the host, executes the subtask according to the description information and the input parameter, generates an execution result, and finally sends the execution result to the host. The slave thus needs no environment configuration or pre-determined computing task from the host, and it supports dynamically receiving the complex and varied computing tasks sent by the host and returning the computing results, so that, in cooperation with the host, an efficient and fast distributed parallel computing method can be realized.
It is to be understood that, for the beneficial effects of the third aspect to the sixth aspect, reference may be made to the description of the first aspect or the second aspect, and details are not described herein again.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only some embodiments of the present application; other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
Fig. 1 is a first flowchart schematic diagram of a first distributed parallel computing method provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a distributed parallel computing system provided by an embodiment of the present application;
fig. 3 is a second flowchart schematic diagram of a first distributed parallel computing method according to an embodiment of the present application;
fig. 4 is a first flowchart schematic diagram of a second distributed parallel computing method according to an embodiment of the present application;
fig. 5 is a second flowchart schematic diagram of a second distributed parallel computing method according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a host provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a slave provided in an embodiment of the present application;
fig. 8 is a schematic structural diagram of a computing device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
The first distributed parallel computing method provided by the embodiments of the present application is applied to a host and can be executed when the host's processor runs a corresponding computer program. When a preset task needs to be executed, the host first establishes socket connections with multiple slaves and decomposes the preset task into multiple subtasks. It then synchronously distributes the description information of the subtasks to the slaves and synchronously distributes the input parameters of the subtasks to the slaves. Each slave executes its subtask according to the received description information and input parameter, generates an execution result, and sends it to the host, and the host receives the execution results of the subtasks sent by the slaves. The host thereby completes one remote synchronous call and finishes the distributed parallel computation of the preset task. With this first method, the host supports dynamically sending complex and varied parallel computing tasks and collecting the parallel computing results without configuring the slaves' environments or determining each slave's computing task in advance, thereby realizing an efficient and fast distributed parallel computing method.
In application, the master and the slave may be computing devices such as computers or servers. One host computer can establish socket connection with a plurality of slave computers to form a distributed parallel computing system, and the distributed parallel computing system can be a multitasking computer system.
As shown in fig. 1, a first distributed parallel computing method provided in the embodiment of the present application includes the following steps S100 to S104:
step S100, establishing socket connection with m slave machines;
step S101, decomposing the preset task into m subtasks.
In application, m is an integer greater than or equal to 2. Before synchronously sending subtasks to the m slaves, the host must establish socket connections with the m slaves based on the TCP/IP protocol.
As shown in fig. 2, the distributed parallel computing system provided in the embodiment of the present application includes a master 1 and a plurality of slaves 2, where the master 1 establishes a socket connection with each of the slaves 2.
In application, the process of establishing socket connections between the host and the multiple slaves is specifically as follows:
the host synchronously sends socket connection requests to the slaves; specifically, it sends one request to each slave, and the request sent to each slave carries that slave's address and port number;
each slave waits in a connecting state and monitors for the host's socket connection request; upon detecting the request, the slave responds to it and sends its own socket description information to the host;
after the host confirms that a slave's socket description information is correct, it establishes the socket connection with that slave.
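This handshake maps directly onto standard TCP socket calls. Below is a minimal host-side sketch in Python; it is an illustration only, assuming each slave already listens at a known address and port, and the function name, buffer size, and handshake check are not prescribed by this application.

```python
# Hypothetical host-side connection setup: one TCP socket per slave.
import socket

def connect_to_slaves(slave_endpoints):
    """Open a socket to each (address, port) and verify the slave's reply."""
    connections = []
    for addr, port in slave_endpoints:
        sock = socket.create_connection((addr, port))  # socket connection request
        descriptor = sock.recv(1024)                   # slave's socket description info
        if not descriptor:                             # treat an empty reply as incorrect
            sock.close()
            raise ConnectionError(f"bad handshake from {addr}:{port}")
        connections.append(sock)
    return connections

# Usage: sockets = connect_to_slaves([("192.168.1.10", 9000), ("192.168.1.11", 9000)])
```

For brevity the sketch contacts the slaves one after another; a host that must connect to many slaves could issue the requests concurrently.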
In application, the host may establish socket connections with n slaves in advance based on the TCP/IP protocol, where n is set to a relatively large value, for example greater than or equal to the number m of subtasks into which any preset task may be decomposed. In this way, whenever the host later needs to decompose a preset task into m subtasks and distribute them synchronously, it only needs to distribute the m subtasks to m of the n connected slaves, without establishing socket connections with the corresponding number of slaves before each distribution. This is particularly suitable when several different preset tasks must be executed in succession. For example, the host first decomposes preset task A into a subtasks and synchronously distributes them to a slaves for execution; after receiving the execution results of the a subtasks of preset task A, the host decomposes preset task B into b subtasks and synchronously distributes them to b slaves for execution, where a ≠ b and both a and b are less than or equal to n.
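As a tiny sketch of this pre-connected pool, assuming the n sockets are already open and that taking the first m idle connections is an acceptable selection policy (both assumptions, not prescribed here):

```python
# Hypothetical helper over a pool of n pre-established slave connections.
def pick_slaves(pool, m):
    """Select m of the n connected slaves for the next preset task (m <= n)."""
    if m > len(pool):
        raise ValueError("pool must hold at least m connected slaves")
    return pool[:m]  # simplest policy: take the first m connections

# Task A first: workers = pick_slaves(pool, a); then task B: workers = pick_slaves(pool, b)
```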
In application, the host decomposes the preset task into a plurality of mutually independent subtasks with no timing dependence among them. The preset task and the subtasks may be data processing tasks, for example image data processing tasks, graphic data processing tasks, text data processing tasks, audio data processing tasks, and so on; an image data processing task may be a picture data processing task or a video data processing task.
In application, the specific content of the preset task may be to process a plurality of data, and correspondingly, the specific content of the sub-task may be to process one data. According to different types of the preset tasks and the subtasks, the types of data required to be processed by the preset tasks and the subtasks are different, for example, the data can be image data, graphic data, text data, audio data and the like; the image data may be picture data or video data, among others.
step S102, synchronously distributing the description information of the m subtasks to m slave machines;
step S103, synchronously distributing input parameters of m subtasks to m slave machines;
step S104, receiving the execution results of the m subtasks sent by the m slave machines.
In application, in the process of executing the plurality of subtasks corresponding to a preset task, each slave receives the description information and the input parameter of one subtask, executes the corresponding subtask accordingly, generates an execution result, and sends it to the host; the host receives one subtask's execution result from each slave and thus obtains the m execution results of the m subtasks, which together constitute the execution result of the preset task.
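A minimal sketch of steps S102 to S104 follows; the send_msg and recv_msg framing helpers are assumptions (one possible version is given with the task-notification discussion below), the task start/end notifications of the later embodiment are omitted for brevity, and the thread-per-slave design is just one way to make the distribution synchronous.

```python
# Hedged sketch: send each slave one subtask's description information and
# input parameter, then gather the m execution results.
from concurrent.futures import ThreadPoolExecutor

def run_subtask(sock, description: bytes, input_param: bytes) -> bytes:
    send_msg(sock, description)   # description information, e.g. a program name
    send_msg(sock, input_param)   # input parameter, e.g. one piece of image data
    _, result = recv_msg(sock)    # execution result returned by the slave
    return result

def distribute_and_gather(sockets, descriptions, inputs):
    """Dispatch m subtasks to m slaves in parallel and gather the m results."""
    with ThreadPoolExecutor(max_workers=len(sockets)) as pool:
        futures = [pool.submit(run_subtask, s, d, i)
                   for s, d, i in zip(sockets, descriptions, inputs)]
        return [f.result() for f in futures]  # one execution result per subtask
```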
In the application, the description information of the subtask may be a program name of the application program, the input parameter of the subtask may be data that needs to be processed when the application program runs, the slave calls the corresponding application program according to the program name to process the data when the subtask is executed, and the data processing result is returned to the host after the subtask is executed. The type of the application program is different according to the type of the subtask, for example, the application program may be an image data processing program, a graphic data processing program, a text data processing program, an audio data processing program, or the like; the image data processing program may be a picture data processing program or a video data processing program.
In one embodiment, the preset task is to process m image data, the subtask is to process one image data, the description information is a program name of an image data processing program, and the input parameter is one image data.
In application, the m image data can be continuous, the host machine synchronously distributes the m image data to the m slave machines for parallel processing to obtain m processing results returned by the m slave machines, compared with the method that the host machine independently processes the continuous m image data, the processing time can be shortened to 1/m of the original processing time, and the data processing capacity and efficiency are greatly improved.
In application, if each sub-task is a continuous sub-task, for example, each sub-task is a sub-task that needs to process a plurality of continuous data of the same type, the master may send the description information of the sub-task to each slave only once, and then may send a plurality of input parameters and a plurality of corresponding output parameters to each slave continuously.
As shown in fig. 3, in one embodiment, before step S102, the method includes:
step S301, synchronously sending task start notification to m slave machines;
before step S104, the method includes:
step S302, task end notification is synchronously sent to m slave machines.
In application, the task start notification informs each slave to wait to receive the description information of a subtask: each slave monitors and waits for the host to send the task start notification, and after receiving it waits to receive the description information of a subtask. The task end notification informs each slave that the description information of its subtask has been fully distributed: each slave monitors and waits for the host to send the task end notification, and after receiving it executes the corresponding subtask according to the received description information and input parameter, generates an execution result, and sends the execution result to the host.
In application, the task start notification and the task end notification may be task notifications carrying different identifiers, for example a task notification with a first identifier and a task notification with a second identifier respectively, so that a slave can identify the type of a task notification by its identifier and respond accordingly.
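One way to realize such identifiers is a small length-prefixed wire format with a one-byte type tag, as sketched below; the tag values and helper names are assumptions for illustration, not a format defined by this application.

```python
# Hypothetical framing: 1-byte type tag + 4-byte big-endian length + payload.
import struct

MSG_TASK_START = 0x01  # "first identifier": wait for a subtask's description
MSG_TASK_END   = 0x02  # "second identifier": distribution done, start executing
MSG_DATA       = 0x03  # description information / input or output parameters

def send_msg(sock, payload: bytes, msg_type: int = MSG_DATA):
    sock.sendall(struct.pack("!BI", msg_type, len(payload)) + payload)

def recv_msg(sock):
    """Read one frame and return (msg_type, payload)."""
    msg_type, length = struct.unpack("!BI", _recv_exact(sock, 5))
    return msg_type, _recv_exact(sock, length)

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf
```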
As shown in fig. 3, in one embodiment, before step S302, the method includes:
step S303, synchronously distributing the output parameters of the m subtasks to m slave machines.
In the application, the output parameter is used for instructing each slave machine to create a memory space for storing the execution result of the corresponding subtask. Each slave machine locally creates a memory space for storing the execution result of the corresponding subtask according to the received output parameter of the subtask, stores the execution result of the subtask to the memory space after the execution of the corresponding subtask is completed, and sends all memory data stored in the memory space to the host machine, namely, the execution result of the subtask is sent to the host machine.
In application, steps S102, S103, and S303 may be executed simultaneously, that is, the host may send the description information, input parameter, and output parameter of one subtask to each slave at the same time, which saves interaction time between the host and the slaves and thereby shortens the overall parallel computing time. Steps S102, S103, and S303 may also be performed in sequence, that is, the host may send the description information, input parameter, and output parameter of a subtask to each slave one after another; the amount of data sent by the host and received by a slave at a time is then smaller, which improves the efficiency of retransmission after a data transmission error and thereby improves fault tolerance.
The second distributed parallel computing method provided by the embodiments of the present application is applied to a slave and can be executed when the slave's processor runs a corresponding computer program. After establishing a socket connection with the host, the slave receives the description information of a subtask sent by the host, receives the input parameter of the subtask sent by the host, executes the subtask according to the description information and the input parameter, generates an execution result, and finally sends the execution result to the host.
As shown in fig. 4, a second distributed parallel computing method provided in the embodiment of the present application includes the following steps S400 to S404:
step S400, establishing socket connection with a host;
step S401, receiving description information of the subtasks sent by the host;
s402, receiving input parameters of the subtasks sent by the host;
step S403, executing the subtasks and generating an execution result according to the description information and the input parameters;
and step S404, sending the execution result to the host.
In application, before receiving the description information of the subtask sent by the host, the slave must establish a socket connection with the host based on the TCP/IP protocol, and the socket connection between the host and the slave is initiated by the host.
In application, the process of establishing the socket connection between the slave and the host is specifically as follows:
the host sends a socket connection request to the slave; specifically, the request carries the slave's address and port number;
the slave waits in a connecting state and monitors for the host's socket connection request; upon detecting the request, the slave responds to it and sends its own socket description information to the host;
after the host confirms that the slave's socket description information is correct, it establishes the socket connection with the slave.
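The slave side of the same handshake can be sketched as follows, assuming the slave listens on a fixed port and that the socket description information is a simple placeholder payload:

```python
# Hypothetical slave-side connection setup.
import socket

def wait_for_host(port: int):
    """Listen on the given port, accept the host, and reply with a handshake."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("0.0.0.0", port))
    server.listen(1)                    # the slave waits in the connecting state
    conn, host_addr = server.accept()   # monitor and accept the host's request
    conn.sendall(b"SLAVE_SOCKET_DESC")  # send this slave's socket description info
    return conn
```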
In application, the subtask is one of a plurality of mutually independent subtasks, with no timing dependence among them, obtained by the host's decomposition of the preset task. The preset task and the subtasks may be data processing tasks, for example image data processing tasks, graphic data processing tasks, text data processing tasks, audio data processing tasks, and so on; an image data processing task may be a picture data processing task or a video data processing task.
In application, the specific content of the preset task may be to process a plurality of data, and correspondingly, the specific content of the sub-task may be to process one data. According to different types of the preset tasks and the subtasks, the types of data required to be processed by the preset tasks and the subtasks are different, for example, the data can be image data, graphic data, text data, audio data and the like; the image data may be picture data or video data, among others.
In application, the slave receives the description information and the input parameters of the subtasks, executes the corresponding subtasks according to the received description information and the input parameters of the subtasks, generates an execution result and sends the execution result to the master.
In the application, the description information of the subtask may be a program name of the application program, the input parameter of the subtask may be data that needs to be processed when the application program runs, the slave calls the corresponding application program according to the program name to process the data when the subtask is executed, and the data processing result is returned to the host after the subtask is executed. The type of the application program is different according to the type of the subtask, for example, the application program may be an image data processing program, a graphic data processing program, a text data processing program, an audio data processing program, or the like; the image data processing program may be a picture data processing program or a video data processing program.
In one embodiment, the subtask is to process one image data, the description information is a program name of an image data processing program, and the input parameter is one image data.
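Under this embodiment, a slave might resolve the program name carried in the description information through a local registry and apply the program to the input data, as in the sketch below; the registry contents and function names are stand-ins, and a real slave could equally launch a named executable in a subprocess.

```python
# Hypothetical registry mapping program names to local processing routines.
PROGRAMS = {
    b"image_proc": lambda data: data[::-1],  # stand-in for a real image routine
}

def execute_subtask(description: bytes, input_param: bytes) -> bytes:
    """Run the program named by the description information on the input data."""
    program = PROGRAMS.get(description)
    if program is None:
        raise KeyError(f"unknown program name: {description!r}")
    return program(input_param)              # the execution result for the host
```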
In application, if the sub-task is a continuous sub-task, for example, the sub-task is a sub-task that needs to process a plurality of continuous data of the same type, the slave may receive the description information of the sub-task sent by the master only once, and then may continuously receive a plurality of input parameters and a plurality of corresponding output parameters sent by the master.
As shown in fig. 5, in one embodiment, before step S401, the method includes:
step S501, receiving a task start notification sent by a host;
step S502, waiting for receiving the description information of the subtasks;
before step S403, the method includes:
step S503 is to receive a task end notification sent by the host.
In application, the slave monitors and waits for the host to send a task start notification, and after receiving it waits to receive the description information of the subtask; the slave likewise monitors and waits for the host to send a task end notification, and after receiving it executes the corresponding subtask according to the received description information and input parameter, generates an execution result, and sends the execution result to the host.
In the application, the task start notification and the task end notification may be task notifications with different identifiers, for example, the task start notification is a task notification with a first identifier, and the task end notification is a task notification with a second identifier, so that the slave may identify different types of task notifications according to the identifier of the task notification, and then respond accordingly.
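Putting the pieces together, the slave's notification-driven receive loop might look like the following sketch; it reuses the assumed MSG_* tags and framing helpers from the earlier sketch and the execute_subtask registry dispatch, and the two-frame order (description information, then input parameter) and the omission of the output parameter are simplifying assumptions.

```python
# Hedged sketch of a slave reacting to task notifications.
def slave_loop(conn):
    description = input_param = None
    while True:
        msg_type, payload = recv_msg(conn)
        if msg_type == MSG_TASK_START:
            description = input_param = None   # wait for the subtask's description
        elif msg_type == MSG_DATA:
            if description is None:
                description = payload          # first data frame: description info
            else:
                input_param = payload          # second data frame: input parameter
        elif msg_type == MSG_TASK_END:
            result = execute_subtask(description, input_param)
            send_msg(conn, result)             # return the execution result
```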
As shown in fig. 5, in one embodiment, before step S503, the method includes:
step S504, receiving output parameters of the subtasks sent by the host;
step S505, creating a memory space for storing the execution result of the subtask according to the output parameter;
after step S403, the method includes:
step S506, saving the execution result to the memory space.
In the application, the slave machine locally creates a memory space for storing the execution result of the corresponding subtask according to the received output parameter of the subtask, stores the execution result of the subtask to the memory space after the execution of the corresponding subtask is completed, and sends all memory data stored in the memory space to the host machine, namely, the execution result of the subtask is sent to the host machine.
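A minimal sketch of this buffering, assuming the output parameter is simply the byte size of the expected execution result (the actual form of the output parameter is not limited to this):

```python
# Hypothetical result buffer derived from the subtask's output parameter.
def make_result_buffer(output_param: int) -> bytearray:
    """Create the memory space for the subtask's execution result."""
    return bytearray(output_param)

def store_result(buffer: bytearray, result: bytes) -> None:
    """Save the execution result into the pre-allocated memory space."""
    buffer[:len(result)] = result

# Afterwards: send_msg(conn, bytes(buffer))  # all memory data goes back to the host
```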
In application, steps S401, S402, and S504 may be executed simultaneously, that is, the slave may receive the description information, input parameter, and output parameter of the subtask sent by the host at the same time, which saves interaction time between the host and the slave and thereby shortens the overall parallel computing time. Steps S401, S402, and S504 may also be performed in sequence, that is, the slave may receive the description information, input parameter, and output parameter of the subtask one after another; the amount of data received at a time is then smaller, which improves the efficiency of retransmission after a data reception error and thereby improves fault tolerance.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
As shown in fig. 6, an embodiment of the present application further provides a host 100, configured to execute the steps in the first distributed parallel computing method, where the host 100 includes:
a task decomposition unit 101, configured to decompose a preset task into m subtasks;
a description information distribution unit 102, configured to synchronously distribute description information of m subtasks to m slaves;
an input parameter distribution unit 103, configured to synchronously distribute input parameters of m subtasks to m slaves;
an execution result receiving unit 104 is configured to receive execution results of m sub-tasks sent by m slaves.
In one implementation, the master further comprises a communication unit for establishing socket connections with the m slaves.
In one implementation, the host further comprises:
a task start notification sending unit, configured to synchronously send task start notifications to m slaves;
and the task end notification sending unit is used for synchronously sending task end notifications to the m slave machines.
In one implementation, the master further includes an output parameter distribution unit for synchronously distributing the output parameters of the m subtasks to the m slaves.
As shown in fig. 7, an embodiment of the present application further provides a slave 200, configured to execute the steps in the second distributed parallel computing method, where the slave 200 includes:
a description information receiving unit 201, configured to receive description information of a subtask sent by a host;
an input parameter receiving unit 202, configured to receive an input parameter of a subtask sent by a host;
the task execution unit 203 is used for executing the subtasks according to the description information and the input parameters and generating an execution result;
an execution result sending unit 204, configured to send the execution result to the host.
In one embodiment, the slave further comprises a communication unit for establishing a socket connection with the master.
In one embodiment, the slave further comprises:
a task start notification receiving unit, configured to receive a task start notification sent by a host;
the waiting unit is used for waiting to receive the description information of the subtasks;
and the task ending notice receiving unit is used for receiving the task ending notice sent by the host.
In one embodiment, the slave further comprises:
the output parameter receiving unit is used for receiving the output parameters of the subtasks sent by the host;
the memory space creating unit is used for creating a memory space for storing the execution result of the subtask according to the output parameter;
and the storage unit is used for storing the execution result to the memory space.
In application, each unit in the master and the slave may be a software program unit, may be implemented by different logic circuits integrated in the processor or an independent physical component connected to the processor, and may also be implemented by a plurality of distributed processors.
As shown in fig. 8, an embodiment of the present application further provides a computing device 300, including: a communication module 301, at least one processor 302 (only one processor is shown in fig. 8), a memory 303, and a computer program 304 stored in the memory 303 and executable on the at least one processor 302;
when the computing device 300 is a host, the steps in the first distributed parallel computing method embodiment are implemented when the processor 302 executes the computer program 304;
when computing device 300 is a slave, steps in the second distributed parallel computing method embodiment are implemented when processor 302 executes computer program 304.
In application, the computing device may include, but is not limited to, the communication module, the processor, and the memory. Fig. 8 is merely an example of a computing device and does not constitute a limitation; the device may include more or fewer components than shown, combine some components, or use different components, for example input/output devices and network access devices, and may include a display screen for displaying operating parameters.
In application, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
In some embodiments, the memory may be an internal storage unit of the computing device, such as a hard disk or a memory of the computing device. In other embodiments, the memory may also be an external storage device of the computing device, such as a plug-in hard drive, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computing device. The memory may also include both an internal storage unit of the computing device and an external storage device. The memory is used to store the operating system, application programs, a boot loader, data, and other programs, such as the program code of the computer program, and may also be used to temporarily store data that has been output or is to be output.
In application, the display screen may be a thin-film-transistor liquid crystal display (TFT-LCD), a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a quantum-dot light-emitting diode (QLED) display, a seven-segment or eight-segment digital tube, or the like.
In application, the communication module may provide communication solutions applied to the network device, including wireless local area networks (WLANs) such as Wi-Fi, Bluetooth, ZigBee, mobile communication networks, global navigation satellite systems (GNSS), frequency modulation (FM), near-field communication (NFC), infrared (IR), and the like. The communication module may include an antenna, which may have a single element or be an antenna array with multiple elements. The communication module can receive electromagnetic waves through the antenna, frequency-modulate and filter the electromagnetic wave signals, and send the processed signals to the processor; it can also receive a signal to be sent from the processor, frequency-modulate and amplify it, and convert it into electromagnetic waves radiated through the antenna.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/modules, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and reference may be made to the part of the embodiment of the method specifically, and details are not described here.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely illustrated, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the apparatus is divided into different functional modules to perform all or part of the above described functions. Each functional module in the embodiments may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module, and the integrated module may be implemented in a form of hardware, or in a form of software functional module. In addition, specific names of the functional modules are only used for distinguishing one functional module from another, and are not used for limiting the protection scope of the application. The specific working process of the modules in the system may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the steps in the embodiment of the distributed parallel computing method may be implemented.
Embodiments of the present application provide a computer program product, which when running on a computing device, enables the computing device to implement the steps in the above-described embodiments of the distributed parallel computing method.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a host or a slave, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A distributed parallel computing method is applied to a host computer, the host computer and m slave computers establish socket connection, and the method comprises the following steps:
decomposing a preset task into m subtasks;
synchronously distributing the description information of the m subtasks to the m slave machines;
synchronously distributing the input parameters of the m subtasks to the m slave machines;
receiving the execution results of the m subtasks sent by the m slave machines;
and each slave machine executes the corresponding subtask according to the description information and the input parameter of the received subtask, generates an execution result and sends the execution result to the master machine.
2. The distributed parallel computing method according to claim 1, wherein before synchronously distributing the description information of the m subtasks to the m slaves, the method includes:
synchronously sending task start notification to the m slave machines;
before receiving the execution result of the m subtasks sent by the m slaves, the method includes:
synchronously sending task ending notice to the m slave machines;
after receiving the task start notification, each slave waits for receiving the description information of one subtask, and after receiving the task end notification, executes the corresponding subtask according to the received description information of one subtask and the input parameter, generates an execution result, and sends the execution result to the master.
3. The distributed parallel computing method according to claim 1, wherein before receiving the execution results of the m subtasks sent by the m slaves, the method includes:
synchronously distributing the output parameters of the m subtasks to the m slave machines;
and each slave computer creates a memory space for storing the execution result of the corresponding subtask according to the received output parameter of the subtask.
4. A distributed parallel computing method as claimed in any one of claims 1 to 3, wherein the predetermined task is processing m image data, the subtask is processing one image data, the description information is a program name of an image data processing program, and the input parameter is one image data.
5. A distributed parallel computing method is applied to a slave computer, the slave computer and a host computer establish a socket connection, and the method comprises the following steps:
receiving description information of the subtasks sent by the host;
receiving input parameters of the subtasks sent by the host;
executing the subtasks and generating an execution result according to the description information and the input parameters;
and sending the execution result to the host.
6. The distributed parallel computing method of claim 5, wherein before receiving the description information of the subtasks sent by the host, the method comprises:
receiving a task start notification sent by the host;
waiting for receiving the description information of the subtask;
before executing the subtask and generating an execution result according to the description information and the input parameter, the method includes:
receiving a task ending notice sent by the host;
the executing the subtask and generating an execution result according to the description information and the input parameter includes:
and after receiving the task end notification, executing the subtask and generating an execution result according to the description information and the input parameters.
7. The distributed parallel computing method of claim 5, wherein prior to executing the subtasks and generating the execution result based on the description information and the input parameters, comprising:
receiving output parameters of the subtasks sent by the host;
creating a memory space for storing the execution result of the subtask according to the output parameter;
after the sub-task is executed and an execution result is generated according to the description information and the input parameters, the method comprises the following steps:
and storing the execution result to the memory space.
8. A distributed parallel computing method according to any one of claims 5 to 7, wherein the subtask is processing an image data, the description information is a program name of an image data processing program, and the input parameter is an image data.
9. A computing device comprising a communication module, a processor, a memory, and a computer program stored in the memory and executable on the processor;
when the computing device is a host, the processor executes the computer program to realize the steps of the distributed parallel computing method according to any one of claims 1 to 4;
when the computing device is a slave, the processor implements the steps of the distributed parallel computing method according to any one of claims 5 to 8 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the distributed parallel computing method of any of claims 1 to 4 or any of claims 5 to 8.
CN202111355296.5A 2021-11-16 2021-11-16 Distributed parallel computing method, computing device and storage medium Pending CN114217921A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111355296.5A CN114217921A (en) 2021-11-16 2021-11-16 Distributed parallel computing method, computing device and storage medium


Publications (1)

Publication Number Publication Date
CN114217921A 2022-03-22

Family

ID=80697282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111355296.5A Pending CN114217921A (en) 2021-11-16 2021-11-16 Distributed parallel computing method, computing device and storage medium

Country Status (1)

Country Link
CN (1) CN114217921A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114697293A (en) * 2022-03-30 2022-07-01 西安北方华创微电子装备有限公司 Data transmission method, lower computer and controller
CN114697293B (en) * 2022-03-30 2023-11-10 西安北方华创微电子装备有限公司 Data transmission method, lower computer and controller
CN115686870A (en) * 2022-12-29 2023-02-03 深圳开鸿数字产业发展有限公司 Parallel computing method, terminal and computer readable storage medium
CN117370258A (en) * 2023-11-02 2024-01-09 珠海电科星拓科技有限公司 Multipath low-speed I2C expansion method and device for high-speed I2C bus
CN117370258B (en) * 2023-11-02 2024-03-29 珠海电科星拓科技有限公司 Multipath low-speed I2C expansion method and device for high-speed I2C bus


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination