CN116579380A - Data processing method and related equipment

Data processing method and related equipment

Info

Publication number
CN116579380A
Authority
CN
China
Prior art keywords
neural network
terminal device
server
intermediate result
data processing
Prior art date
Legal status
Pending
Application number
CN202210115049.6A
Other languages
Chinese (zh)
Inventor
王仁宇
杨宇庭
张胜涛
钱莉
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202210115049.6A
Priority to PCT/CN2023/071725 (published as WO2023143080A1)
Publication of CN116579380A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation of neural networks, neurons or parts of neurons using electronic means
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application discloses a data processing method and related equipment, which can be used in the field of artificial intelligence. The method comprises the following steps: a terminal device inputs data to be processed into a first neural network to obtain a first intermediate result, and sends the first intermediate result to a server; the server inputs the first intermediate result into a second neural network to obtain a second intermediate result, and sends the second intermediate result to the terminal device; the terminal device inputs the second intermediate result into a third neural network to obtain a prediction result corresponding to the data to be processed. Between two different moments, a first moment and a second moment, the number of neural network layers in the first neural network or the third neural network deployed on the terminal device changes, so that different intermediate results are sent between the terminal device and the server at different moments, further improving the degree of protection of the privacy of user data.

Description

Data processing method and related equipment
Technical Field
The application relates to the field of artificial intelligence, in particular to a data processing method and related equipment.
Background
Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
In order to ensure performance when a neural network executes a target task, existing applications based on deep neural networks often have model parameter counts of 10-100M; for terminal devices with relatively tight computing resources (such as smart wearable devices or smart sensors), the computing resources of the terminal device are often insufficient to complete the computation of the whole neural network.
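As a rough sense of scale (an illustrative calculation, not part of the patent text), the following Python snippet shows why a model at the upper end of that range strains such devices: storing 100M float32 parameters already requires several hundred megabytes before any activation memory is counted.

    # Illustrative arithmetic only: weight storage for a 100M-parameter model.
    params = 100_000_000
    bytes_per_param = 4  # float32
    print(f"{params * bytes_per_param / 2**20:.0f} MiB")  # -> 381 MiB of weights alone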
At present, the data to be processed of a user can be acquired on the terminal device side and sent to a server; after the server processes the data to be processed through a neural network, a prediction result corresponding to the data to be processed is obtained and returned to the terminal device.
However, since the user's data to be processed needs to be transmitted over the network and the server can obtain the original data to be processed, the degree of protection of user data privacy is weak.
Disclosure of Invention
The embodiment of the application provides a data processing method and related equipment. Because the operation of the second neural network is completed by the server, the computer resources of the terminal device occupied in the calculation process of the whole neural network can be reduced; the terminal device inputs the data to be processed into the first neural network for calculation and then sends only the first intermediate result to the server, which avoids leakage of the original data to be processed and improves the degree of protection of user data privacy; moreover, the calculation of the third neural network in the whole neural network is also executed on the terminal device side, which helps to further improve the degree of protection of user data privacy.
In order to solve the technical problems, the embodiment of the application provides the following technical scheme:
In a first aspect, an embodiment of the present application provides a data processing method, which may be used in the field of artificial intelligence. The method is applied to a data processing system. The data processing system comprises a first terminal device and a server; a first neural network and a third neural network are deployed on the first terminal device, and a second neural network is deployed on the server. The first neural network, the second neural network and the third neural network form a target neural network, where the first neural network is located before the second neural network, the third neural network is located after the second neural network, and the second neural network is located between the first neural network and the third neural network.
Further, "the first neural network is located before the second neural network" means that, in the process of inputting the data to be processed into the target neural network and processing it through the target neural network, the data to be processed first passes through the first neural network in the target neural network and then passes through the second neural network in the target neural network. The concept that the third neural network is located after the second neural network can be understood from the foregoing description in the same way and will not be repeated here.
The data processing method comprises the following steps: the first terminal device inputs the data to be processed into the first neural network, obtains a first intermediate result generated by the first neural network, and sends the first intermediate result to the server. The first intermediate result generated by the first neural network may also be referred to as a first hidden vector generated by the first neural network, and includes the data required by the second neural network when performing data processing; further, the "first intermediate result generated by the first neural network" includes data generated by the last neural network layer in the first neural network, or includes data generated by the last neural network layer in the first neural network together with data generated by other neural network layers in the first neural network. The server inputs the first intermediate result into the second neural network, obtains a second intermediate result generated by the second neural network, and sends the second intermediate result to the first terminal device; the meaning of "second intermediate result" may be understood with reference to the meaning of "first intermediate result" and will not be described here. The first terminal device inputs the second intermediate result into the third neural network to obtain a prediction result, generated by the third neural network, corresponding to the data to be processed; the type of information indicated by the prediction result corresponds to the type of the target task.
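The following sketch illustrates the three-stage exchange described above, assuming for concreteness a plain feed-forward target network built with PyTorch; the class names, layer sizes and the in-process method call standing in for network transport are assumptions of the sketch, not elements of the claimed method.

    # A minimal sketch of the first aspect's data flow: the device runs the
    # first and third sub-networks, the server runs the second, and only
    # intermediate results (never the raw data) cross the boundary.
    import torch
    import torch.nn as nn

    class Server:
        def __init__(self, second_nn: nn.Module):
            self.second_nn = second_nn

        def handle(self, first_intermediate: torch.Tensor) -> torch.Tensor:
            # The server sees only the first intermediate result.
            with torch.no_grad():
                return self.second_nn(first_intermediate)

    class FirstTerminalDevice:
        def __init__(self, first_nn: nn.Module, third_nn: nn.Module, server: Server):
            self.first_nn, self.third_nn, self.server = first_nn, third_nn, server

        def predict(self, data: torch.Tensor) -> torch.Tensor:
            with torch.no_grad():
                first_intermediate = self.first_nn(data)           # on device
                second_intermediate = self.server.handle(first_intermediate)
                return self.third_nn(second_intermediate)          # back on device

    # Toy 6-layer target network split 1 / 4 / 1.
    layers = [nn.Linear(16, 16) for _ in range(6)]
    server = Server(nn.Sequential(*layers[1:5]))
    device = FirstTerminalDevice(nn.Sequential(*layers[:1]),
                                 nn.Sequential(*layers[5:]), server)
    prediction = device.predict(torch.randn(1, 16))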
The neural network deployed on the first terminal device changes as follows between two different moments, a first moment and a second moment: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
In this implementation, since the operation of the second neural network is completed by the server, the computer resources of the first terminal device occupied in the calculation process of the whole target neural network can be reduced. The first terminal device inputs the data to be processed into the first neural network and sends only the resulting first intermediate result to the server, which avoids leakage of the original data to be processed and improves the degree of protection of user data privacy; moreover, the calculation of the trailing third neural network in the whole target neural network is also executed on the first terminal device side, further improving the degree of protection of user data privacy. An attacker may attempt to invert an intermediate result sent between the first terminal device and the server to recover the original data to be processed; because the number of neural network layers deployed on the first terminal device changes between the first moment and the second moment, different intermediate results are sent between the first terminal device and the server at different moments, which further increases the difficulty for an attacker to obtain the original data to be processed and further improves the degree of protection of user data privacy.
In one possible implementation of the first aspect, at the first moment the first neural network includes N neural network layers and the third neural network includes S neural network layers; at the second moment the first neural network includes n neural network layers and the third neural network includes s neural network layers, where N and n are different and/or S and s are different. The method further includes: the server sends the n neural network layers and the s neural network layers to the first terminal device.
In this implementation, when the number of neural network layers deployed on the first terminal device changes, the server can send the updated first neural network and the updated third neural network to the first terminal device. This further increases the difficulty for an attacker to determine which neural network is deployed on the first terminal device, and thus the difficulty of inverting an intermediate result to obtain the original data to be processed, further improving the degree of protection of user data privacy.
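A hedged sketch of this re-splitting step follows; the helper below chooses split nodes (n, s) over a list of layers, and shipping the resulting segments as modules is one plausible realization of "the server sends n neural network layers and s neural network layers", not the mechanism mandated by the patent.

    import torch.nn as nn

    def resplit(target_layers: list[nn.Module], n: int, s: int):
        """Split the target network at two nodes: the first n layers and the
        last s layers go to the device, the middle stays on the server."""
        assert 0 < n and 0 < s and n + s < len(target_layers)
        first_nn = nn.Sequential(*target_layers[:n])
        second_nn = nn.Sequential(*target_layers[n:len(target_layers) - s])
        third_nn = nn.Sequential(*target_layers[len(target_layers) - s:])
        return first_nn, second_nn, third_nn

    layers = [nn.Linear(16, 16) for _ in range(8)]
    # First moment: N = 2 front layers and S = 1 back layer on the device.
    first_t1, second_t1, third_t1 = resplit(layers, n=2, s=1)
    # Second moment: n = 3 and s = 2, so a different intermediate result now
    # crosses the network, raising the difficulty of inverting it.
    first_t2, second_t2, third_t2 = resplit(layers, n=3, s=2)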
In one possible implementation of the first aspect, the method further includes: the server determines the first neural network and the third neural network from the target neural network, where the target neural network is the neural network for executing the target task, and the determining factors of the first neural network and the third neural network include: the occupation amount of processor resources of the first terminal device and/or the occupation amount of memory resources of the first terminal device when the target task is not being executed. Optionally, the determining factors of the first neural network and the third neural network may further include any one or more of the following: the number of processes currently running on the first terminal device, the time each process on the first terminal device has been running, the running state of each process on the first terminal device, or other influencing factors, etc., which are not exhaustively listed here. Further, the evaluation index of the "occupation amount of memory resources of the first terminal device" may include any one or more of the following: the size of the total memory resources of the first terminal device, the size of the occupied memory resources of the first terminal device, the occupancy rate of the memory resources of the first terminal device, or other evaluation indexes, etc. The evaluation index of the "occupation amount of processor resources of the first terminal device" may include any one or more of the following: the occupancy rate of the processor resources of the first terminal device, the occupation duration of each processor on the first terminal device when executing the target task, the load allocated to the processors of the first terminal device for executing the target task, the performance of the processors of the first terminal device allocated to execute the target task, or other evaluation indexes capable of reflecting the amount of processor resources occupied on the first terminal device when executing the target task, etc., which are not exhaustively listed here.
In this implementation, since a plurality of application programs usually need to run on the first terminal device, the computer resources that the same terminal device can allocate to the target task may differ at different times. Taking the occupation amount of processor resources and/or memory resources of the first terminal device as determining factors of the first neural network and the third neural network helps to ensure that the neural network deployed on the first terminal device matches the computing capability of the first terminal device, avoiding an increase in the computing pressure of the first terminal device while executing the target task.
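One way such a determining factor could be measured and turned into a layer budget is sketched below; the psutil-based measurement, the thresholds and the sizing rule are assumptions made for illustration rather than a policy stated in the patent.

    import psutil

    def device_layer_budget(total_layers: int) -> int:
        """Illustrative policy: keep more layers on the device when its
        processor and memory occupancy leave more headroom."""
        cpu_occupancy = psutil.cpu_percent(interval=0.1)   # percent
        mem_occupancy = psutil.virtual_memory().percent    # percent
        headroom = 1.0 - max(cpu_occupancy, mem_occupancy) / 100.0
        # Always keep at least one layer locally so the raw data never leaves
        # the device; cap the local share at half the network in this sketch.
        return max(1, int(headroom * total_layers / 2))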
In a possible implementation of the first aspect, the data processing system further includes a second terminal device, where the number of neural network layers in the first neural network deployed on the first terminal device differs from that in the first neural network deployed on the second terminal device, and/or the number of neural network layers in the third neural network deployed on the first terminal device differs from that in the third neural network deployed on the second terminal device; the first terminal device and the second terminal device are terminal devices of different types, and/or the first terminal device and the second terminal device are terminal devices of different models within the same type.
In this implementation, since the computer resource configuration of different types of terminal devices may differ, and the computer resource configuration of different models within the same type may also differ, the computer resources that terminal devices of different types, or of different models within the same type, can allocate to the target task may also differ; deploying different numbers of neural network layers on different terminal devices therefore helps match the neural network to each device's capability.
In one possible implementation of the first aspect, the first neural network, the second neural network and the third neural network may be obtained by the server splitting the target neural network. The server may store a first mapping relationship, which records the number of neural network layers deployed on each type of terminal device; when the server needs to deploy the first neural network and the third neural network to a new first terminal device, the two split nodes corresponding to the first terminal device of the target type may be determined according to the target type of the new first terminal device and the first mapping relationship. Alternatively, the server may store a second mapping relationship, which records the number of neural network layers corresponding to at least one model of each type of terminal device; when the server needs to deploy the first neural network and the third neural network to a new first terminal device, the two split nodes corresponding to the first terminal device of the target type may be determined according to the target type and target model of the new first terminal device and the second mapping relationship. The determining factors of the first neural network and the third neural network in the first mapping relationship (or the second mapping relationship) may include any one or a combination of the following factors: an estimate of the processor resources that the first terminal device can allocate when performing the target task, an estimate of the memory resources that the first terminal device can allocate, or other types of factors, etc.
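A sketch of the two mapping relationships follows, assuming they are plain lookup tables held by the server; the device types, models and split-node values are invented for the example.

    # First mapping: device type -> (n, s) split nodes.
    FIRST_MAPPING = {
        "smart_watch": (1, 1),
        "smart_phone": (3, 2),
    }
    # Second mapping: (device type, model) -> (n, s) split nodes.
    SECOND_MAPPING = {
        ("smart_phone", "model_a"): (2, 1),
        ("smart_phone", "model_b"): (4, 2),
    }

    def split_nodes_for(device_type: str, model: str | None = None):
        """Prefer the per-model entry when one exists, otherwise fall back
        to the per-type entry."""
        if model is not None and (device_type, model) in SECOND_MAPPING:
            return SECOND_MAPPING[(device_type, model)]
        return FIRST_MAPPING[device_type]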
In one possible implementation manner of the first aspect, processor resources occupied during data processing by the first terminal device through the first neural network and the third neural network are smaller than processor resources occupied during data processing by the server through the second neural network, and memory resources occupied during data processing by the first terminal device through the first neural network and the third neural network are smaller than memory resources occupied during data processing by the server through the second neural network.
In this implementation, the second neural network, which occupies more processor resources and more memory resources during data processing, is deployed on the server, so the computer resources of the first terminal device occupied in the calculation process of the whole neural network can be further reduced, reducing the computing pressure of the first terminal device while executing the target task. Moreover, because most of the calculation in the data processing of the whole neural network is executed by the server, a deep neural network with more parameters can be adopted to generate the prediction result corresponding to the data to be processed, improving the accuracy of the prediction result generated by the whole neural network.
In a possible implementation of the first aspect, the data to be processed may specifically be any one of the following: sound data, image data, fingerprint data, ear contour data, sequence data reflecting user habits, text data, point cloud data, or other types of data, etc. This implementation provides multiple representation forms of the data to be processed, expanding the application scenarios of the solution and improving its implementation flexibility.
In a second aspect, an embodiment of the present application provides a data processing method, which may be used in the field of artificial intelligence. The method is applied to a data processing system, the data processing system comprises a first terminal device and a server, a first neural network is deployed on the first terminal device, and a second neural network is deployed on the server. The method comprises the following steps: the first terminal device inputs data to be processed into the first neural network, obtains a first intermediate result generated by the first neural network, and sends the first intermediate result to the server; the server inputs the first intermediate result into the second neural network to obtain a prediction result, generated by the second neural network, corresponding to the data to be processed. The first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between two different moments, a first moment and a second moment.
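A minimal sketch of this two-part variant follows (under the same assumed PyTorch setup as the earlier sketch); here the server returns the final prediction directly instead of a second intermediate result.

    import torch
    import torch.nn as nn

    def two_part_inference(data: torch.Tensor,
                           first_nn: nn.Module,    # on the first terminal device
                           second_nn: nn.Module):  # on the server
        with torch.no_grad():
            first_intermediate = first_nn(data)    # raw data stays on the device
            return second_nn(first_intermediate)   # the server emits the prediction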
In one possible implementation of the second aspect, at the first moment the first neural network includes N neural network layers, and at the second moment the first neural network includes n neural network layers, where N is different from n. The method further includes: the server sends the n neural network layers to the first terminal device.
In one possible implementation of the second aspect, the data processing system further includes a second terminal device, where the number of neural network layers in the first neural network deployed on the first terminal device is different from the number of neural network layers in the first neural network deployed on the second terminal device; the first terminal device and the second terminal device are terminal devices of different types, and/or the first terminal device and the second terminal device are terminal devices of different models within the same type.
The data processing system provided in the second aspect of the embodiment of the present application may also perform the steps performed by the data processing system in the possible implementations of the first aspect. For the specific implementation steps of the second aspect and its possible implementations, the meanings of terms, and the beneficial effects brought by each possible implementation, reference may be made to the descriptions in the possible implementations of the first aspect, which are not repeated here.
In a third aspect, an embodiment of the present application provides a data processing method, which may be used in the field of artificial intelligence. The method is applied to a first terminal device, the first terminal device is included in a data processing system, the data processing system further comprises a server, a first neural network and a third neural network are deployed on the first terminal device, and a second neural network is deployed on the server. The method comprises the following steps: inputting data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network; sending the first intermediate result to the server, where the first intermediate result is used by the server to obtain a second intermediate result using the second neural network; receiving the second intermediate result sent by the server, and inputting the second intermediate result into the third neural network to obtain a prediction result, generated by the third neural network, corresponding to the data to be processed. The first neural network, the second neural network and the third neural network form a target neural network, and the neural network deployed on the first terminal device changes as follows between two different moments, a first moment and a second moment: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
The data processing method provided in the third aspect of the embodiment of the present application may further execute steps executed by the first terminal device in each possible implementation manner of the first aspect, and for specific implementation steps of the third aspect of the embodiment of the present application and each possible implementation manner of the third aspect, and beneficial effects brought by each possible implementation manner, reference may be made to descriptions in each possible implementation manner of the first aspect, which are not described in detail herein.
In a fourth aspect, an embodiment of the present application provides a data processing method, which may be used in the field of artificial intelligence. The method is applied to a server, the server is included in a data processing system, the data processing system further comprises a first terminal device, a first neural network and a third neural network are deployed on the first terminal device, and a second neural network is deployed on the server. The method comprises the following steps: receiving a first intermediate result sent by the first terminal device, where the first intermediate result is obtained based on the data to be processed and the first neural network; inputting the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network; sending the second intermediate result to the first terminal device, where the second intermediate result is used by the first terminal device to obtain, using the third neural network, a prediction result corresponding to the data to be processed. The first neural network, the second neural network and the third neural network form a target neural network, and the neural network deployed on the first terminal device changes as follows between two different moments, a first moment and a second moment: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
The data processing method provided in the fourth aspect of the present application may further perform steps performed by the server in each possible implementation manner of the first aspect, and for specific implementation steps of the fourth aspect of the present application and each possible implementation manner of the fourth aspect, and beneficial effects brought by each possible implementation manner, reference may be made to descriptions in each possible implementation manner of the first aspect, which are not repeated herein.
In a fifth aspect, an embodiment of the present application provides a data processing method, which may be used in the field of artificial intelligence. The method is applied to a first terminal device, the first terminal device is included in a data processing system, the data processing system further comprises a server, a first neural network is deployed on the first terminal device, and a second neural network is deployed on the server. The method comprises the following steps: inputting data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network; sending the first intermediate result to the server, where the first intermediate result is used by the server to obtain, using the second neural network, a prediction result corresponding to the data to be processed. The first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between two different moments, a first moment and a second moment.
The data processing method provided in the fifth aspect of the embodiment of the present application may further perform the steps performed by the first terminal device in each possible implementation manner of the second aspect, and for the specific implementation steps of the fifth aspect of the embodiment of the present application and each possible implementation manner of the fifth aspect, and the beneficial effects brought by each possible implementation manner, reference may be made to descriptions in each possible implementation manner of the second aspect, which are not described herein in detail.
In a sixth aspect, an embodiment of the present application provides a data processing method, which may be used in the field of artificial intelligence. The method is applied to a server, the server is included in a data processing system, the data processing system further comprises a first terminal device, a first neural network is deployed on the first terminal device, and a second neural network is deployed on the server. The method comprises the following steps: receiving a first intermediate result sent by the first terminal device, where the first intermediate result is obtained based on the data to be processed and the first neural network; inputting the first intermediate result into the second neural network to obtain a prediction result, generated by the second neural network, corresponding to the data to be processed. The first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between two different moments, a first moment and a second moment.
The data processing method provided in the sixth aspect of the embodiment of the present application may further perform steps performed by the server in each possible implementation manner of the second aspect, and for specific implementation steps of the sixth aspect of the embodiment of the present application and each possible implementation manner of the sixth aspect, and beneficial effects brought by each possible implementation manner, reference may be made to descriptions in each possible implementation manner of the second aspect, which are not described herein in detail.
In a seventh aspect, embodiments of the present application provide a data processing apparatus that may be used in the field of artificial intelligence. The data processing apparatus is deployed on a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network and a third neural network are deployed on the first terminal device, and a second neural network is deployed on the server. The apparatus includes: an input module, configured to input data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network; a sending module, configured to send the first intermediate result to the server, where the first intermediate result is used by the server to obtain a second intermediate result using the second neural network; and a receiving module, configured to receive the second intermediate result sent by the server. The input module is further configured to input the second intermediate result into the third neural network to obtain a prediction result, generated by the third neural network, corresponding to the data to be processed. The first neural network, the second neural network and the third neural network form a target neural network, and the neural network deployed on the first terminal device changes as follows between two different moments, a first moment and a second moment: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
The data processing apparatus provided in the seventh aspect of the present application may further execute the steps executed by the first terminal device in each possible implementation manner of the first aspect, and for the specific implementation steps of the seventh aspect of the present application and each possible implementation manner of the seventh aspect, the beneficial effects brought by each possible implementation manner may refer to descriptions in each possible implementation manner of the first aspect, which are not described herein in detail.
In an eighth aspect, an embodiment of the present application provides a data processing apparatus, which may be used in the field of artificial intelligence. The data processing apparatus is deployed on a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network and a third neural network are deployed on the first terminal device, and a second neural network is deployed on the server. The apparatus includes: a receiving module, configured to receive a first intermediate result sent by the first terminal device, where the first intermediate result is obtained based on the data to be processed and the first neural network; an input module, configured to input the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network; and a sending module, configured to send the second intermediate result to the first terminal device, where the second intermediate result is used by the first terminal device to obtain, using the third neural network, a prediction result corresponding to the data to be processed. The first neural network, the second neural network and the third neural network form a target neural network, and the neural network deployed on the first terminal device changes as follows between two different moments, a first moment and a second moment: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
The data processing apparatus provided in the eighth aspect of the embodiment of the present application may further perform steps performed by the server in each possible implementation manner of the first aspect, and for specific implementation steps of the eighth aspect of the embodiment of the present application and each possible implementation manner, beneficial effects brought by each possible implementation manner may refer to descriptions in each possible implementation manner of the first aspect, which are not described herein in detail.
In a ninth aspect, an embodiment of the present application provides a data processing apparatus that may be used in the field of artificial intelligence. The data processing apparatus is deployed on a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network is deployed on the first terminal device, and a second neural network is deployed on the server. The apparatus includes: an input module, configured to input data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network; and a sending module, configured to send the first intermediate result to the server, where the first intermediate result is used by the server to obtain, using the second neural network, a prediction result corresponding to the data to be processed. The first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between two different moments, a first moment and a second moment.
The data processing apparatus provided in the ninth aspect of the embodiment of the present application may further perform the steps performed by the first terminal device in each possible implementation manner of the second aspect, and for the specific implementation steps of the ninth aspect of the embodiment of the present application and each possible implementation manner of the ninth aspect, and the beneficial effects brought by each possible implementation manner, reference may be made to descriptions in each possible implementation manner of the second aspect, which are not described herein in detail.
In a tenth aspect, embodiments of the present application provide a data processing apparatus that may be used in the field of artificial intelligence. The data processing apparatus is deployed on a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network is deployed on the first terminal device, and a second neural network is deployed on the server. The apparatus includes: a receiving module, configured to receive a first intermediate result sent by the first terminal device, where the first intermediate result is obtained based on the data to be processed and the first neural network; and an input module, configured to input the first intermediate result into the second neural network to obtain a prediction result, generated by the second neural network, corresponding to the data to be processed. The first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between two different moments, a first moment and a second moment.
The data processing apparatus provided in the tenth aspect of the embodiment of the present application may further perform steps performed by the server in each possible implementation manner of the second aspect, and for specific implementation steps of the tenth aspect of the embodiment of the present application and each possible implementation manner, beneficial effects brought by each possible implementation manner may refer to descriptions in each possible implementation manner of the second aspect, which are not described herein in detail.
In an eleventh aspect, an embodiment of the present application provides a first terminal device, which may include a processor, where the processor is coupled to a memory and the memory stores program instructions; when the program instructions stored in the memory are executed by the processor, the steps performed by the first terminal device in the data processing method of the foregoing aspects are implemented.
In a twelfth aspect, an embodiment of the present application provides a server, which may include a processor, where the processor is coupled to a memory and the memory stores program instructions; when the program instructions stored in the memory are executed by the processor, the steps performed by the server in the data processing method of the foregoing aspects are implemented.
In a thirteenth aspect, an embodiment of the present application provides a data processing system, which may include a first terminal device and a server, where the first terminal device is configured to perform a step performed by the first terminal device in the method described in the first aspect, and the server is configured to perform a step performed by the server in the method described in the first aspect; alternatively, the first terminal device is configured to perform the step performed by the first terminal device in the method described in the second aspect, and the server is configured to perform the step performed by the server in the method described in the second aspect.
In a fourteenth aspect, an embodiment of the present application provides a computer-readable storage medium having stored therein a computer program which, when executed on a computer, causes the computer to perform the steps performed by the first terminal device in the data processing method described in the above aspects, or causes the computer to perform the steps performed by the server in the data processing method described in the above aspects.
In a fifteenth aspect, an embodiment of the present application provides a computer program product, which includes a program that, when executed on a computer, causes the computer to perform the steps performed by the first terminal device in the data processing method described in the above aspects, or causes the computer to perform the steps performed by the server in the data processing method described in the above aspects.
In a sixteenth aspect, an embodiment of the present application provides a circuit system, where the circuit system includes a processing circuit configured to perform a step performed by a first terminal device in the data processing method described in the above aspects, or configured to perform a step performed by a server in the data processing method described in the above aspects.
In a seventeenth aspect, embodiments of the present application provide a chip system, which includes a processor configured to implement the functions involved in the above aspects, for example, transmitting or processing the data and/or information involved in the above methods. In one possible design, the chip system further includes a memory for holding the program instructions and data necessary for the server or the communication device. The chip system may be composed of a chip, or may include a chip and other discrete devices.
Drawings
FIG. 1a is a schematic diagram of an artificial intelligence main body framework according to an embodiment of the present application;
FIG. 1b is a schematic diagram of an application scenario of a data processing method according to an embodiment of the present application;
FIG. 2a is a system architecture diagram of a data processing system according to an embodiment of the present application;
FIG. 2b is a diagram of a system architecture of a data processing system according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of two split nodes corresponding to a target neural network in the data processing method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a first intermediate result in a data processing method according to an embodiment of the present application;
FIG. 7 is another schematic diagram of a first intermediate result in a data processing method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a second intermediate result in the data processing method according to an embodiment of the present application;
FIG. 9 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 10 is a flow chart of updating split nodes corresponding to a target neural network according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a split node corresponding to a target neural network in the data processing method according to an embodiment of the present application;
FIG. 12 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 13 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 15 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 16 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 17 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 18 is a schematic structural diagram of a first terminal device according to an embodiment of the present application;
FIG. 19 is a schematic structural diagram of a server according to an embodiment of the present application;
FIG. 20 is a schematic structural diagram of a chip according to an embodiment of the present application.
Detailed Description
The terms "first", "second" and the like in the description, the claims and the above-described figures are used to distinguish between similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances and are merely a manner of distinguishing objects with the same attributes when describing the embodiments of the application. Furthermore, the terms "comprises", "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, product or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such a process, method, product or apparatus.
Embodiments of the present application are described below with reference to the accompanying drawings. As a person of ordinary skill in the art can appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided by the embodiments of the application are also applicable to similar technical problems.
Referring to FIG. 1a, FIG. 1a shows a schematic structural diagram of an artificial intelligence main body framework, which is described below from the two dimensions of the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects a series of processes from data acquisition to data processing. For example, it may be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, the data undergoes a condensation process of "data - information - knowledge - wisdom". The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (technology for providing and processing information) to the industrial ecological process of the system.
(1) Infrastructure
The infrastructure provides computing capability support for the artificial intelligence system, realizes communication with the outside world, and provides support through the base platform. It communicates with the outside through sensors; computing power is provided by smart chips, which may specifically be hardware acceleration chips such as a central processing unit (central processing unit, CPU), an embedded neural network processor (neural-network processing unit, NPU), a graphics processor (graphics processing unit, GPU), an application specific integrated circuit (application specific integrated circuit, ASIC), or a field programmable gate array (field programmable gate array, FPGA); the base platform includes related platform guarantees and support such as a distributed computing framework and networks, and may include cloud storage and computing, interconnection and interworking networks, and the like. For example, sensors communicate with the outside to obtain data, and the data is provided to smart chips in the distributed computing system provided by the base platform for computation.
(2) Data
The data of the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence. The data relate to graphics, images, voice and text, and also relate to the internet of things data of the traditional equipment, including service data of the existing system and sensing data such as force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
Machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Reasoning refers to the process of simulating human intelligent reasoning modes in a computer or an intelligent system, and carrying out machine thinking and problem solving by using formal information according to a reasoning control strategy, and typical functions are searching and matching.
Decision-making refers to the process of making decisions after intelligent information has been reasoned about, and generally provides functions such as classification, sorting and prediction.
(4) General capability
After the data has been processed, some general-purpose capabilities can be formed based on the result of the data processing, such as algorithms or a general-purpose system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, etc.
(5) Intelligent product and industry application
Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are the encapsulation of the overall artificial intelligence solution, turning intelligent information decision-making into products and realizing practical application. The application fields mainly include: intelligent terminals, intelligent manufacturing, intelligent transportation, smart home, intelligent healthcare, intelligent security, automatic driving, smart cities, etc. The embodiments of the present application can be applied to various fields of artificial intelligence, and in particular to application scenarios in which a first terminal device uses a neural network for data processing; specific examples are as follows.
1. Intelligent terminal field
As an example, in the field of intelligent terminals, the aforementioned intelligent terminal may specifically be a smart wearable device such as a bracelet, a watch, earphones or glasses, or an intelligent terminal such as a mobile phone or a tablet. The intelligent terminal can be configured with a face recognition function: when a user wants to unlock the intelligent terminal, access private data on the intelligent terminal or perform other operations, the intelligent terminal can acquire a face image of the current user and obtain a recognition result corresponding to that face image, and the corresponding operation is triggered only when the current user is determined to be a registered user. Other functions can also be configured on the intelligent terminal, which are not listed one by one here.
2. Smart home field
As an example, in the field of smart home, the aforementioned smart home device may be embodied as a sweeping robot, an air conditioner, a lamp, a water heater, a refrigerator or another type of smart home device. When a user issues a control instruction to the smart home device by voice, the device can obtain a voiceprint recognition result corresponding to the user's voice, and is triggered to execute the operation corresponding to the control instruction only when the user who issued the voice is determined to be a specific user.
To understand the present solution more intuitively, refer to FIG. 1b, which is a schematic diagram of an application scenario of the data processing method provided by an embodiment of the present application. As shown in FIG. 1b, when a user issues a "turn on the air conditioner" instruction by voice to the air conditioner shown in FIG. 1b (an example of a smart home device), the air conditioner can obtain a voiceprint recognition result corresponding to the instruction, and executes the operation of turning on the air conditioner only if it determines that the user who issued the instruction has control authority over the air conditioner. It should be understood that the example in FIG. 1b is intended to facilitate understanding of the application scenario of this solution and does not limit the solution.
3. Automatic driving field
As an example, in the field of automatic driving, a face recognition function may be configured on a vehicle: the vehicle acquires image data of the user's face, obtains a recognition result corresponding to that image data, and triggers vehicle start-up if it determines that the current user has vehicle start authority.
It should be noted that the foregoing examples are merely intended to facilitate understanding of the application scenarios of the embodiments of the present application; in many other application scenarios the first terminal device may also need to use a neural network for data processing, and the examples here do not limit the application scenarios of the embodiments of the present application. In all of the above scenarios, in order to reduce the occupation of the first terminal device's computer resources during the calculation of the whole neural network and to improve the degree of protection of user data privacy, the data processing method provided by the embodiments of the present application may be adopted.
With reference to FIG. 2a and FIG. 2b, a data processing system according to an embodiment of the present application is described below. In one system architecture, please refer to FIG. 2a, which is a system architecture diagram of a data processing system according to an embodiment of the present application. In FIG. 2a, the data processing system may comprise a server 210, a database 220, a first terminal device 230 and a server 240, where the first terminal device 230 comprises a first computing module and the server 240 comprises a second computing module.
In the training stage of the target neural network 201, a training data set is stored in the database 220, the server 210 generates the target neural network 201 for executing the target task, and the target neural network 201 includes a plurality of neural network layers; the server 210 performs iterative training on the target neural network 201 using the training data set in the database 220, resulting in a trained target neural network 201.
The server 240 may obtain the trained target neural network 201, where the server 240 deploys a portion of the neural network layers in the trained target neural network 201 in the first computing module of the first terminal device 230, and deploys another portion of the neural network layers in the trained target neural network 201 in the second computing module of the server 240.
In the reasoning stage of the target neural network 201, the first calculation module in the first terminal device 230 performs a part of data calculation in the target neural network 201, and the second calculation module in the server 240 performs another part of data calculation in the target neural network 201, so as to reduce occupation of computer resources of the first terminal device 230 in the whole calculation process of the neural network.
In another system architecture, referring to FIG. 2b, FIG. 2b is a diagram of a system architecture of a data processing system according to an embodiment of the present application. In fig. 2b, the data processing system may comprise a server 210, a database 220, a first terminal device 230, a first server 241 and a second server 242, the first terminal device 230 comprising a first calculation module and the second server 242 comprising a second calculation module.
The difference between fig. 2b and fig. 2a is that in the system architecture shown in fig. 2a, the server 240 is used to perform the allocation operation of the plurality of neural network layers of the target neural network 201, and the second calculation module in the server 240 is used to complete the calculation of a part of the neural network layers in the target neural network 201. In the system architecture shown in fig. 2b, the first server 241 and the second server 242 are two independent devices, the first server 241 is used for performing the allocation operation of the plurality of neural network layers of the target neural network 201, and the second calculation module in the second server 242 is used for completing the calculation of a part of the neural network layers in the target neural network 201.
In some embodiments of the present application, referring to fig. 2a and fig. 2b, a "user" may directly interact with the first terminal device 230; that is, the first terminal device 230 may directly display the prediction result output by the entire target neural network 201 to the "user". It should be noted that fig. 2a and fig. 2b are only two schematic architecture diagrams of the data processing system provided by the embodiments of the present application, and the positional relationship among the devices, modules, and the like shown in the figures does not constitute any limitation. For example, in other embodiments of the present application, the first terminal device 230 and the client device may be separate devices, where the client device is configured to present the prediction result output by the entire target neural network 201 to the "user", and the first terminal device 230 is configured with an input/output (I/O) interface through which it performs data interaction with the client device.
Further, in the inference phase of the target neural network 201, in one implementation, the target neural network 201 may include a first neural network, a second neural network, and a third neural network.
Still further, the first neural network includes the first several neural network layers of the target neural network 201, and the third neural network includes the last several neural network layers of the target neural network 201. That is, the first neural network is located before the second neural network, the third neural network is located after the second neural network, and the second neural network is located between the first neural network and the third neural network. The first neural network and the third neural network are deployed on the first terminal device, and the second neural network is deployed on the server.
In another implementation, the target neural network 201 may be split into two parts, including a first neural network that is one sub-neural network of the target neural network 201 and a second neural network that is another sub-neural network of the target neural network 201, the first neural network being located before the second neural network. The first neural network is deployed on the first terminal equipment, and the second neural network is deployed on the server.
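As a minimal illustrative sketch (not part of the claimed embodiment), assuming the target neural network can be expressed as an ordered list of PyTorch layers, the two splitting manners described above might be implemented as follows; the function names and the use of nn.Sequential are assumptions for illustration:

```python
import torch.nn as nn

def split_three_way(layers, first_split, second_split):
    # Split an ordered list of layers at the two split nodes:
    # the first and third parts are intended for the terminal device,
    # the second (middle) part for the server.
    first_nn = nn.Sequential(*layers[:first_split])
    second_nn = nn.Sequential(*layers[first_split:second_split])
    third_nn = nn.Sequential(*layers[second_split:])
    return first_nn, second_nn, third_nn

def split_two_way(layers, split):
    # Two-part split: the first sub-network for the device,
    # the second sub-network for the server.
    return nn.Sequential(*layers[:split]), nn.Sequential(*layers[split:])
```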
When the target neural network 201 adopts the above two different splitting manners, the processing flows of the first terminal device and the server are different, and specific implementation flows of the above two splitting manners are described below respectively.
1. The target neural network comprises a first neural network, a second neural network and a third neural network
In order to more intuitively understand the scheme in the embodiment of the present application, please refer to fig. 3, which is a schematic flow chart of a data processing method provided in an embodiment of the present application. As shown in fig. 3, the target neural network includes a first neural network, a second neural network, and a third neural network; the first neural network and the third neural network are deployed on the first terminal device, and the second neural network is deployed on the server.
A1. The first terminal device inputs the original data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network.
A2. The first terminal device sends the first intermediate result to the server.
A3. The server inputs the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network, and sends the second intermediate result to the first terminal device.
A4. The first terminal device inputs the second intermediate result into the third neural network to obtain a prediction result, generated by the third neural network, corresponding to the data to be processed.
It should be understood that the example in fig. 3 is merely intended to facilitate understanding of the present solution, and does not limit the present solution. Specifically, please refer to fig. 4, which is a schematic flow chart of a data processing method according to an embodiment of the present application; the data processing method may include the following steps:
401. The server sends the first neural network and the third neural network to the first terminal device, where the second neural network is deployed on the server, the first neural network includes N neural network layers, the second neural network includes M neural network layers, the third neural network includes S neural network layers, and the first neural network, the second neural network, and the third neural network form the target neural network.
In some embodiments of the present application, the server may determine the number of neural network layers in the first neural network and the number of third neural network layers corresponding to the first terminal device, where at the first moment, the first neural network includes N neural network layers, the second neural network includes M neural network layers, the third neural network includes S neural network layers, the first neural network, the second neural network, and the third neural network form a target neural network, and each of N, M and S is an integer greater than or equal to 1.
The server may send the first neural network and the third neural network to the first terminal device, the first terminal device receives and stores the first neural network and the third neural network, so as to deploy the first neural network and the third neural network on the first terminal device, and deploy the second neural network on the server.
It should be noted that, the server may further deploy the first neural network and the third neural network on the first terminal device in other manners, for example, deploy the first neural network and the third neural network on the first terminal device by using a removable storage device, which is not exhaustive in the embodiment of the present application.
In addition, the target neural network in the embodiment of the present application may be a neural network obtained after preprocessing, where the preprocessing may be pruning, distillation, or another processing manner for reducing the number of parameters of a standard neural network; this is not exhaustive herein. Alternatively, the target neural network in the embodiment of the present application may be a standard neural network; the specific form of the target neural network may be determined in conjunction with the actual application scenario, and is not limited herein.
The server performing step 401 may be the server 240 in the data processing system shown in fig. 2a, or may be the first server 241 in the data processing system shown in fig. 2 b.
The target neural network is a neural network for performing a target task, and the target task may be any type of task. As an example, the target task may be identity verification performed by recognizing input user data; such a verification task may be voiceprint recognition, face recognition, fingerprint recognition, ear print recognition, or another task that verifies a user's identity. As another example, the target task may be a personalized recommendation task, which may generate a personalized charging scheme, a personalized recommended recipe, a personalized recommended exercise scheme, a personalized recommended movie, a personalized recommended application, and so forth; this is not exhaustive herein. As another example, the target task may be a feature extraction task, such as extraction of voiceprint features, extraction of image features, or extraction of text features. As another example, the target task may also be recognizing voice content, translating text between different languages, identifying targets in the surrounding environment, migrating an image style, or another task executed by the first terminal device using a neural network; the embodiment of the present application is not exhaustive as to which types of tasks the target task may be embodied as.
The target neural network may be embodied as a convolutional neural network, a recurrent neural network, a residual neural network, or another type of neural network; the specific form of the target neural network may be determined in conjunction with the specific type of task the target task is, and is not limited herein. The target neural network includes a plurality of neural network layers.
Optionally, the first neural network, the second neural network, and the third neural network are split from the target neural network. In the embodiment corresponding to fig. 4, the plurality of neural network layers included in the whole target neural network are split into three parts; that is, the target neural network in this embodiment corresponds to two split nodes, including a first split node and a second split node, where the first split node is the split node between the first neural network and the second neural network, and the second split node is the split node between the second neural network and the third neural network.
The term "the first neural network is located before the second neural network" means that in the process of inputting the data to be processed into the target neural network and performing data processing through the target neural network, the data to be processed will first pass through the first neural network of the target neural network and then pass through the second neural network of the target neural network. That is, the order of each neural network layer in the target neural network refers to the process that data is transmitted forward in the target neural network, the neural network layer through which the data passes first represents the neural network layer with the front position, and the neural network layer through which the data passes later represents the neural network layer with the rear position. The concept that the third neural network is located after the second neural network can also be understood by the foregoing description, and will not be repeated here.
In order to more intuitively understand the present solution, please refer to fig. 5, which is a schematic diagram of the two split nodes corresponding to the target neural network in the data processing method provided by the embodiment of the present application. In fig. 5, the target neural network is a 34-layer residual neural network (residual network, ResNet-34), and the target task is the extraction of voiceprint features. As shown in fig. 5, the target neural network includes 4 residual blocks; the neural network layers located before the first split node in the target neural network are referred to as the first neural network, the neural network layers located between the first split node and the second split node are referred to as the second neural network, and the neural network layers located after the second split node are referred to as the third neural network; that is, the first neural network is located before the second neural network. It should be understood that the example in fig. 5 is merely intended to facilitate understanding of the present solution, and does not limit the present solution. Further, the parameter amounts of the respective parts of the neural network in fig. 5 are disclosed in the form of a table as follows.
Neural network layer (Layer)    | Parameter number (Parameters)
First convolution layer         | 3*3*32 = 288
Residual block 1                | (3*3*32*32*2)*3 = 55296
Residual block 2                | (3*3*64*64*2)*4 = 294912
Residual block 3                | (3*3*128*128*2)*6 = 1769472
Residual block 4                | (3*3*256*256*2)*3 = 3538944
Pooling layer                   | -
First linear connection layer   | 256*8*256 = 524288
Second linear connection layer  | 256*256 = 65536
TABLE 1
Referring to table 1 above, it can be seen that most of the parameter computation in the data processing process of the entire target neural network is consumed by residual blocks 1 to 4, while the parameter amounts of the first convolution layer and the final linear connection (Linear) layers are small. This analysis shows that the first several neural network layers and the last several neural network layers of the entire target neural network can be deployed on the first terminal device, with the middle neural network layers deployed on the server, so that the computer resources of the first terminal device consumed in the data processing process of the entire target neural network can be greatly reduced.
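For illustration only, a per-module parameter count such as the one underlying table 1 could be obtained with a short PyTorch helper; the function name parameters_per_child is an assumption, not part of the embodiment:

```python
import torch.nn as nn

def parameters_per_child(model: nn.Module) -> dict:
    # Parameter count of each top-level child module, mirroring the
    # per-layer analysis of table 1; heavy modules are candidates for
    # deployment on the server.
    return {name: sum(p.numel() for p in child.parameters())
            for name, child in model.named_children()}

# Usage sketch:
# for name, count in parameters_per_child(target_network).items():
#     print(f"{name}: {count} parameters")
```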
The following describes how the server determines, for the first time, the number of neural network layers to be deployed on a terminal device. In one implementation, the number of neural network layers in the first neural network deployed on the first terminal device is different from the number of neural network layers in the first neural network deployed on a second terminal device, and/or the number of neural network layers in the third neural network deployed on the first terminal device is different from the number of neural network layers in the third neural network deployed on the second terminal device.
The first terminal device and the second terminal device may be different types of terminal devices. As an example, for example, the first terminal device is a wristwatch and the second terminal device is a mobile phone; as another example, for example, the first terminal device is a lamp and the second terminal device is an air conditioner; as another example, the first terminal device is a cell phone, the second terminal device is a tablet, etc., which is not exhaustive herein.
Alternatively, the first terminal device and the second terminal device are terminal devices of the same type but of different models. It should be noted that, in this solution, when two different terminal devices (i.e., the first terminal device and the second terminal device) are each deployed with part of the neural network layers included in the target neural network, and the number of neural network layers deployed on the first terminal device differs from the number deployed on the second terminal device, the two devices may be terminal devices of different types or of different models within the same type; however, this does not mean that the numbers of deployed neural network layers differ between any two terminal devices of different types, nor that they differ between any two terminal devices of different models within the same type.
Optionally, if the first neural network, the second neural network, and the third neural network are obtained by splitting the target neural network, in the embodiment corresponding to fig. 4, the target neural network corresponds to two splitting nodes, where "the splitting nodes corresponding to the target neural network are different" means that the two splitting nodes corresponding to the first terminal device are not identical to the two splitting nodes corresponding to the second terminal device.
Specifically, in the embodiment corresponding to fig. 4, the following three cases exist in the case that the split node corresponding to the target neural network is different: in one case, the first split node corresponding to the first terminal device is the same as the first split node corresponding to the second terminal device, and the second split node corresponding to the first terminal device is different from the second split node corresponding to the second terminal device. In another case, the first split node corresponding to the first terminal device is different from the first split node corresponding to the second terminal device, and the second split node corresponding to the first terminal device is the same as the second split node corresponding to the second terminal device. In another case, the first split node corresponding to the first terminal device is different from the first split node corresponding to the second terminal device, and the second split node corresponding to the first terminal device is different from the second split node corresponding to the second terminal device.
Correspondingly, in the embodiment corresponding to fig. 4, the "split node corresponding to the target neural network is the same" means that the first split node corresponding to the first terminal device is the same as the first split node corresponding to the second terminal device, and the second split node corresponding to the first terminal device is the same as the second split node corresponding to the second terminal device.
In the embodiment of the application, as the configuration of the computer resources of the terminal devices of different types may be different, the configuration of the computer resources of the terminal devices of different types in the same type may also be different, so that the computer resources which can be allocated to the target task by the terminal devices of different types or the terminal devices of different types in the same type may also be different.
Specifically, the following describes the procedure by which the server determines the two split nodes corresponding to a certain first terminal device. If the number of neural network layers deployed on terminal devices of different types may differ, while the number deployed on terminal devices of different models within the same type is the same, a first mapping relationship may be preconfigured on the server, in which the number of neural network layers deployed on each type of terminal device is stored. When the server needs to deploy the first neural network and the third neural network on a new first terminal device, the two split nodes corresponding to the first terminal device of the target type may be determined according to the target type of the new first terminal device and the first mapping relationship.
Before step 401 is performed, when a part of the neural network layers in the target neural network needs to be deployed on the first terminal device, the first terminal device may send a first request to the server, where the first request is used to request acquisition of that part of the neural network layers, and the first request also carries the target type of the first terminal device. The server acquires the two split nodes corresponding to the target type from the first mapping relationship according to the received target type of the first terminal device, and splits the first neural network and the third neural network from the target neural network according to the acquired two split nodes.
The first mapping relationship may be stored on the server in a table, an array, or other forms. For a more intuitive understanding of the present solution, the first mapping relationship is shown in the form of a table below.
TABLE 2
As shown in table 2 above, when the first terminal device appears as a different type of terminal device, the two split nodes corresponding to the target neural network may be the same or different. For example, when the first terminal device appears as a lamp and the first terminal device appears as a refrigerator, the two split nodes corresponding to the target neural network are different; for another example, when the first terminal device is represented as a refrigerator and the first terminal device is represented as an air conditioner, the two split nodes corresponding to the target neural network are the same, and it should be understood that the example in table 2 is only for facilitating understanding of the content in the first mapping relationship, and is not limited to this scheme.
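As a hedged sketch of how such a first mapping relationship might be held on the server, the following Python dictionary maps a device type to its two split nodes; all type names and layer indices below are invented for illustration:

```python
# Hypothetical first mapping relationship:
# device type -> (first split node, second split node).
FIRST_MAPPING = {
    "lamp": (2, 30),
    "refrigerator": (3, 29),
    "air_conditioner": (3, 29),  # same split nodes as the refrigerator
}

def split_nodes_for_type(device_type: str) -> tuple:
    # Look up the two split nodes for a terminal device of the given type.
    return FIRST_MAPPING[device_type]
```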
More specifically, in one implementation, the first mapping relationship is sent to the server by the other device. In another implementation, the first mapping is generated by a server.
Further, the determining factors of the first neural network and the third neural network in the first mapping relationship may include any one or a combination of the following factors: when the first terminal device performs the target task, a pre-estimate of the processor resources allocated by the first terminal device, a pre-estimate of the memory resources allocated by the first terminal device, or other types of factors, etc.
That is, before performing step 401, the server may determine the number of neural network layers to be deployed on each type of terminal device according to the above-mentioned indexes of each type of terminal device. The greater the pre-estimate of the processor resources allocated by the first terminal device, the greater the number of neural network layers deployed on the first terminal device; the smaller that pre-estimate, the smaller the number of deployed neural network layers. Likewise, the greater the pre-estimate of the memory resources allocated by the first terminal device, the greater the number of neural network layers deployed on the first terminal device; the smaller that pre-estimate, the smaller the number of deployed neural network layers.
The processor may specifically be represented by a central processing unit (central processing unit, CPU), a graphics processor (graphics processing unit, GPU), an application specific integrated circuit (application specific integrated circuit, ASIC), or other type of processor, etc., where the specific type of processor configured on the first terminal device may be determined in conjunction with the actual product form, and is not limited herein.
If only one processor is allocated on the first terminal device to perform the target task, the evaluation index of "the pre-estimate of the processor resource allocated by the first terminal device" may include any one or more of the following elements: the occupation time of the processor allocated by the first terminal device for executing the target task and the performance of the processor allocated by the first terminal device for executing the target task. If at least two processors are allocated on the first terminal device to perform the target task, the evaluation index of "the pre-estimate of the processor resource allocated by the first terminal device" may include any one or more of the following elements: the occupation duration of each processor allocated by the first terminal device to execute the target task, the performance of each processor allocated by the first terminal device to execute the target task, the number of processors, the type of each processor or other elements, and the like.
Further, the evaluation index of the performance of the processor may be any one or more of the following: the number of floating-point operations per second (FLOPS) executed by the processor, the number of Dhrystone million instructions executed per second (DMIPS) by the processor (i.e., a measure of how many million instructions the processor executes per second), or other indicators for evaluating the performance of the processor; this is not exhaustive herein.
The evaluation index of the "pre-estimated amount of memory resources allocated by the first terminal device" may be the size of the memory space of the memory allocated by the first terminal device for executing the target task.
It should be noted that the "occupation duration of the processor allocated by the first terminal device to execute the target task" and the "size of the memory space of the memory allocated by the first terminal device to execute the target task" may each be a predicted value range or a predicted determined value. Further, the unit of the "occupation duration of the processor allocated by the first terminal device to execute the target task" may be million instructions per second (MIPS), seconds, or another type of unit; this is not exhaustive herein.
As an example, for example, the occupation duration of the processor allocated by the first terminal device to perform the target task may be 0.5MIPS-1MIPS, and the size of the memory allocated by the first terminal device to perform the target task may be 20M-30M; as another example, for example, the occupation duration of the processor allocated by the first terminal device to perform the target task may be 1.5MIPS, and the size of the memory space of the memory allocated by the first terminal device to perform the target task may be 25M, which is to be understood that this is only for convenience of understanding the present solution, and is not limited to this solution.
If the number of neural network layers deployed on terminal devices of different types may differ, and the number deployed on terminal devices of different models within the same type may also differ, a second mapping relationship may be configured on the server, in which the number of neural network layers corresponding to at least one model of each type of terminal device is stored. When the server needs to deploy the first neural network and the third neural network on a new first terminal device, the two split nodes corresponding to the first terminal device may be determined according to the target type and target model of the new first terminal device and the second mapping relationship.
Before executing step 401, when a part of the neural network layers in the target neural network needs to be deployed on the first terminal device, a first request may be sent to the server, where the first request is used to request to acquire the part of the neural network layers in the target neural network, and the first request further carries a target type of the first terminal device and a target model of the first terminal device. The server can receive the target type of the first terminal equipment and the target model of the first terminal equipment, and acquire two split nodes corresponding to the target type and the target model from the second mapping relation; and splitting the first neural network and the third neural network from the target neural network by the server according to the acquired two splitting nodes.
The second mapping relationship may be stored on the server in a table, an array, or other form. For a more intuitive understanding of the present solution, the second mapping relationship is shown in the form of a table below.
TABLE 3
As shown in table 3, for two first terminal devices of the same type and different models, the two split nodes corresponding to the target neural network may be the same or different. For example, when two different terminal devices appear as different models of lights, the number of neural network layers deployed by the target neural network deployed on all models of lights is the same. When two different terminal devices are respectively a mobile phone of model 0001 and a mobile phone of model 0004, the number of the neural network layers deployed by the target neural network deployed on the two different terminal devices is different, and the example in table 3 is only for facilitating understanding of the content in the second mapping relationship, and is not used for limiting the scheme.
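Analogously, a hedged sketch of the second mapping relationship could key the split nodes on both device type and model; the type names, model numbers, and indices below are invented for illustration:

```python
# Hypothetical second mapping relationship:
# (device type, device model) -> (first split node, second split node).
SECOND_MAPPING = {
    ("phone", "0001"): (6, 26),
    ("phone", "0004"): (4, 28),    # a different model may use different split nodes
    ("lamp", "default"): (2, 30),  # all lamp models share the same split nodes
}

def split_nodes_for_model(device_type: str, device_model: str) -> tuple:
    # Prefer the model-specific entry; fall back to a type-wide default.
    key = (device_type, device_model)
    if key in SECOND_MAPPING:
        return SECOND_MAPPING[key]
    return SECOND_MAPPING[(device_type, "default")]
```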
More specifically, in one implementation, the second mapping relationship is sent to the server by the other device. In another implementation, the second mapping is generated by a server.
Further, the determining factors of the first neural network and the third neural network in the second mapping relationship may include any one or a combination of the following factors: when the first terminal device performs the target task, a pre-estimate of the processor resources allocated by the first terminal device, a pre-estimate of the memory resources allocated by the first terminal device, or other types of factors, etc.
That is, the server may acquire the above-mentioned index of the first terminal device of each model of the at least one model of each type, generate the number of neural network layers deployed on the first terminal device of the target model of a certain target type according to the above-mentioned index of the first terminal device of each model of the at least one model of each type, and repeatedly perform the above-mentioned operations to generate the second mapping relationship. The more the pre-estimated amount of the processor resources allocated by the first terminal device is, the more the number of the neural network layers allocated on the first terminal device is, and the less the pre-estimated amount of the processor resources allocated by the first terminal device is, the fewer the number of the neural network layers allocated on the first terminal device is. The more the pre-estimated amount of memory resources allocated by the first terminal device is, the more the number of neural network layers allocated on the first terminal device is, and the fewer the pre-estimated amount of memory resources allocated by the first terminal device is, the fewer the number of neural network layers allocated on the first terminal device is.
For understanding of the two concepts of "the estimated amount of the processor resource allocated by the first terminal device" and "the estimated amount of the memory resource allocated by the first terminal device" reference may be made to the above description, and details are not repeated here.
In another implementation manner, for different terminal devices such as the first terminal device and the second terminal device, the split nodes corresponding to the target neural network may be the same; that is, the number of neural network layers from the target neural network deployed on different first terminal devices may be the same.
402. The first terminal equipment inputs the data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network.
In the embodiment of the present application, step 401 is an optional step, and if step 401 is executed, the first terminal device may receive the first neural network and the third neural network sent by the server, and store the received first neural network and third neural network locally.
If step 401 is not performed, in one implementation, if the first neural network, the second neural network, and the third neural network are obtained by splitting the target neural network, the server may send the first P neural network layers in the target neural network and the last Q neural network layers in the target neural network to the first terminal device, and send the first indication information to the first terminal device; wherein P is an integer greater than or equal to N, Q is an integer greater than or equal to S, and the first indication information is used for informing the first terminal equipment of the positions of the two split nodes corresponding to the target neural network in the target neural network.
The first terminal equipment stores the received first P neural network layers and the received last Q neural network layers locally, determines a first neural network from the first P neural network layers according to the received first indication information, and determines a third neural network from the last Q neural network layers, namely the first neural network and the third neural network are deployed on the first terminal equipment.
In another implementation manner, the server may further send the trained entire target neural network to the first terminal device, and send first indication information to the first terminal device, where the first indication information is used to inform the first terminal device of positions of two split nodes corresponding to the target neural network in the target neural network. Therefore, the first terminal equipment can split the received target neural network according to the received first indication information to determine the first neural network and the third neural network, namely the first neural network and the third neural network are deployed on the first terminal equipment.
After the first neural network and the third neural network are deployed, the first terminal device may input the data to be processed into the first neural network to obtain the first intermediate result generated by the first neural network. The type of the data to be processed depends on the type of the target task; for example, the data to be processed may be embodied as any of the following: sound data, image data, fingerprint data, profile data of ears, sequence data capable of reflecting user habits, text data, point cloud data, or another type of data. It should be understood that what type of data the data to be processed adopts needs to be determined in conjunction with what type of task is performed through the target neural network, and is not limited herein. This implementation provides multiple expression forms of the data to be processed, expands the application scenarios of the present solution, and improves the implementation flexibility of the present solution.
The "first intermediate result generated by the first neural network" may also be referred to as "first hidden vector generated by the first neural network", where the first intermediate result generated by the first neural network includes data required for the data processing by the second neural network.
Further, in one case, the "first intermediate result generated by the first neural network" includes data generated by a last neural network layer in the first neural network. For a more intuitive understanding of the present solution, please refer to fig. 6, fig. 6 is a schematic diagram of a first intermediate result in the data processing method according to an embodiment of the present application. As shown in fig. 6, the first intermediate result includes data generated by the last neural network layer in the first neural network (i.e., the third convolutional layer in fig. 6), and it should be understood that the example in fig. 6 is merely for convenience of understanding the present solution, and is not limited to the present solution.
In another case, the "first intermediate result generated by the first neural network" includes data generated by the last neural network layer in the first neural network and data generated by other neural network layers in the first neural network. For a more intuitive understanding of the present solution, please refer to fig. 7, fig. 7 is another schematic diagram of a first intermediate result in the data processing method according to an embodiment of the present application. The two split nodes corresponding to the target neural network shown in fig. 7 are identical to the two split nodes corresponding to the target neural network shown in fig. 5, and as shown in fig. 7, the first intermediate result includes not only data generated by the last neural network layer (i.e., the 5 th convolutional layer in fig. 7) in the first neural network, but also data generated by the N-2 nd neural network layer (i.e., the 3 rd convolutional layer in fig. 7) in the first neural network, and it should be understood that the example in fig. 7 is only for facilitating understanding of the present scheme, and is not limited to the present scheme.
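Purely as an illustrative sketch of this second case, forward hooks in PyTorch could be used to collect, in addition to the output of the last neural network layer, the outputs of other designated layers of the first neural network; the helper name and its interface are assumptions:

```python
import torch
import torch.nn as nn

def first_intermediate_result(first_nn: nn.Sequential, x: torch.Tensor,
                              extra_layer_indices=()):
    # The first intermediate result always contains the output of the last
    # layer; optionally it also contains the outputs of other layers
    # (e.g. the (N-2)-th layer in fig. 7), captured with forward hooks.
    captured = {}
    hooks = []
    for i in extra_layer_indices:
        hooks.append(first_nn[i].register_forward_hook(
            lambda _mod, _inp, out, i=i: captured.__setitem__(i, out)))
    last = first_nn(x)
    for h in hooks:
        h.remove()
    return {"last": last, **captured}
```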
403. The first terminal device sends the first intermediate result to the server.
In the embodiment of the application, after the first intermediate result is obtained, the first terminal device can encrypt the first intermediate result and send the encrypted first intermediate result to the server. Among other encryption algorithms employed include, but are not limited to, secure socket layer (secure sockets layer, SSL) encryption algorithms or other types of encryption algorithms, and the like.
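The embodiment does not prescribe a concrete encryption library; as one hedged sketch, the intermediate result could be serialized and encrypted with a symmetric cipher (here the Fernet primitive of the Python cryptography package), with key distribution handled by, for example, an SSL channel:

```python
import io
import torch
from cryptography.fernet import Fernet  # symmetric encryption, for illustration only

def encrypt_intermediate(tensor: torch.Tensor, key: bytes) -> bytes:
    # Serialize the intermediate result and encrypt it before transmission.
    buf = io.BytesIO()
    torch.save(tensor, buf)
    return Fernet(key).encrypt(buf.getvalue())

def decrypt_intermediate(token: bytes, key: bytes) -> torch.Tensor:
    # Decrypt and deserialize the intermediate result on the receiving side.
    plain = Fernet(key).decrypt(token)
    return torch.load(io.BytesIO(plain))

# key = Fernet.generate_key()  # key exchange itself could rely on e.g. SSL/TLS
```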
404. The server inputs the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network.
In the embodiment of the application, after receiving the encrypted first intermediate result, the server can decrypt the encrypted first intermediate result to obtain the first intermediate result, and input the first intermediate result into the second neural network to obtain the second intermediate result generated by the second neural network.
The "second intermediate result generated by the second neural network" may also be referred to as "second hidden vector generated by the second neural network", and the second intermediate result generated by the second neural network includes data required for data processing by the third neural network.
Further, in one case, the "second intermediate result generated by the second neural network" includes the data generated by the last neural network layer in the second neural network. For a more intuitive understanding of the present solution, please refer to fig. 7: the second split node corresponding to the target neural network (i.e., the split node between the second neural network and the third neural network) is located between the pooling layer and the first linear connection layer, so the second intermediate result includes the data generated by the last neural network layer in the second neural network (i.e., the pooling layer in fig. 7). It should be understood that the example in fig. 7 is merely intended to facilitate understanding of the present solution, and does not limit the present solution.
In another case, the "second intermediate result generated by the second neural network" includes data generated by the last neural network layer in the second neural network and data generated by other neural network layers in the second neural network. Referring to fig. 8, fig. 8 is another schematic diagram of a second intermediate result in the data processing method according to the embodiment of the application. As shown in fig. 8, the second intermediate result includes not only the data generated by the last neural network layer in the second neural network (i.e., the last convolutional layer in fig. 8), but also the data generated by the M-2 nd neural network layer in the second neural network (i.e., the 3 rd convolutional layer in fig. 8), and it should be understood that the example in fig. 8 is only for convenience of understanding the present solution, and is not limited to the present solution.
405. The server sends the second intermediate result to the first terminal device.
In the embodiment of the present application, after obtaining the second intermediate result, the server may encrypt the second intermediate result and send the encrypted second intermediate result to the first terminal device, and the encryption algorithm specifically adopted may refer to the description in step 403, which is not described herein.
406. The first terminal device inputs the second intermediate result into a third neural network to obtain a prediction result which is generated by the third neural network and corresponds to the data to be processed, and the type of information indicated by the prediction result corresponds to the type of the target task.
In the embodiment of the application, after receiving the encrypted second intermediate result, the first terminal device may input the second intermediate result into the third neural network, that is, input the second intermediate result into the last S neural network layers of the target neural network, to obtain a prediction result corresponding to the data to be processed generated by the third neural network (that is, obtain a prediction result corresponding to the data to be processed output by the whole target neural network).
The type of the information indicated by the prediction result corresponding to the data to be processed corresponds to the type of the target task. As an example, if the target task is voiceprint recognition, the data to be processed may be voice data, and the prediction result corresponding to the data to be processed is used to indicate whether the data to be processed (i.e., voice data) is a preset user voice. As another example, for example, if the target task is voiceprint feature extraction, the data to be processed may be sound data, and the prediction result corresponding to the data to be processed is voiceprint feature extracted from the data to be processed.
As another example, for example, the target task is face recognition, the data to be processed may be image data of a face of a user, and the prediction result corresponding to the data to be processed is used to indicate whether the user is a preset user. As another example, for example, the target task is fingerprint recognition, the data to be processed is fingerprint data of a user, and the prediction result corresponding to the data to be processed is used to indicate whether the user is a preset user. As yet another example, for example, the objective task is to perform feature extraction on the contour data of the ear of the user, the data to be processed is the contour data of the ear of the user, the prediction result corresponding to the data to be processed is the feature of the contour data of the ear of the user, and the like, and the prediction result corresponding to the data to be processed is not exhaustive here.
The processor resources occupied by the first terminal equipment in the process of data processing through the first neural network and the third neural network are smaller than the processor resources occupied by the server in the process of data processing through the second neural network, and the memory resources occupied by the first terminal equipment in the process of data processing through the first neural network and the third neural network are smaller than the memory resources occupied by the server in the process of data processing through the second neural network.
In the embodiment of the present application, the second neural network is deployed on the server, and the second neural network occupies more processor resources and more memory resources during data processing; therefore, the computer resources of the first terminal device occupied in the whole calculation process of the neural network can be further reduced, and the computing pressure of the first terminal device when executing the target task can be reduced. Moreover, because most of the computation in the data processing process of the whole neural network is executed by the server, a deep neural network with more parameters can be adopted to generate the prediction result corresponding to the data to be processed, thereby improving the accuracy of the prediction result generated by the whole neural network.
In the embodiment of the present application, after obtaining the prediction result corresponding to the data to be processed, the first terminal device may execute subsequent steps according to the prediction result corresponding to the data to be processed, and specifically execute which steps may be determined in combination with the actual application scenario, which is not limited herein.
For a more intuitive understanding of the present solution, please refer to fig. 9, which is a flow chart of a data processing method according to an embodiment of the present application. In fig. 9, the target task executed by the target neural network is the extraction of voiceprint features, and the first neural network, the second neural network, and the third neural network are obtained by splitting the target neural network. As shown in fig. 9:
B1. The first terminal device acquires the data to be processed input by the user (i.e., the voice data input by the user shown in fig. 9).
B2. The first terminal device inputs the data to be processed into the first neural network (i.e., the first N neural network layers of the target neural network shown in fig. 9) to obtain the first intermediate result generated by the first neural network.
B3. The first terminal device encrypts the first intermediate result and sends the encrypted first intermediate result to the server, so as to realize encrypted transmission of the first intermediate result.
B4. The server decrypts the encrypted first intermediate result to obtain the first intermediate result, and inputs it into the second neural network (i.e., the M neural network layers after the first N neural network layers) to obtain the second intermediate result generated by the second neural network.
B5. The server encrypts the second intermediate result and sends the encrypted second intermediate result to the first terminal device, so as to realize encrypted transmission of the second intermediate result.
B6. The first terminal device decrypts the encrypted second intermediate result to obtain the second intermediate result, and inputs it into the third neural network (i.e., the last S neural network layers of the target neural network) to obtain the prediction result output by the whole target neural network and corresponding to the data to be processed (i.e., the voiceprint features extracted from the input voice data).
B7. The first terminal device compares each of the at least one locally stored voiceprint feature with the acquired voiceprint feature to determine whether the acquired voiceprint feature matches any one of the pre-stored voiceprint features, so as to determine whether the user is a user with authority.
It should be understood that the example in fig. 9 is merely intended to facilitate understanding of the present solution, and does not limit the present solution.
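As a small illustrative sketch of step B7 (the embodiment does not specify the comparison metric), cosine similarity against the locally stored voiceprint features might be used, with an assumed threshold value:

```python
import torch
import torch.nn.functional as F

def is_authorized(extracted: torch.Tensor, stored_features,
                  threshold: float = 0.8) -> bool:
    # Compare the extracted voiceprint feature with each locally stored
    # feature; the user is treated as authorized if any cosine similarity
    # reaches the (illustrative) threshold.
    return any(F.cosine_similarity(extracted, s, dim=0).item() >= threshold
               for s in stored_features)
```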
407. The server obtains updated split nodes corresponding to the target neural network, wherein the updated split nodes indicate that the first neural network comprises n neural network layers, the second neural network comprises m neural network layers and the third neural network comprises s neural network layers.
In some embodiments of the present application, after the first neural network and the third neural network are deployed on a determined first terminal device, the server may obtain updated split nodes corresponding to the target neural network (i.e., the neural network to which the neural network layers deployed on the first terminal device belong). That is, at two different moments, namely the first moment and the second moment, the neural network deployed on the first terminal device changes as follows: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes; in other words, the split nodes corresponding to the target neural network differ between the first moment and the second moment.
It should be noted that, in this solution, for the same first terminal device, the split nodes corresponding to the target neural network may be different at two different moments (i.e., the first moment and the second moment); however, this does not mean that the number of deployed neural network layers of the target neural network is different at any two different moments for the same first terminal device.
The meaning of "the split nodes corresponding to the target neural network are different" may refer to the description in the above steps. The updated split nodes indicate that the first neural network includes n neural network layers, the second neural network includes m neural network layers, and the third neural network includes s neural network layers; the first neural network and the third neural network are deployed on the first terminal device, the second neural network is deployed on the server, n, m, and s are each integers greater than or equal to 1, and n and N are different and/or s and S are different.
Further, the positional relationship between the "first neural network" and the "second neural network" in the target neural network, and the positional relationship between the "second neural network" and the "third neural network" in the target neural network may refer to the description in step 401, which is not repeated herein.
In the embodiment of the present application, after obtaining an intermediate result sent between the first terminal device and the server, an attacker may attempt to reverse-derive the original data to be processed from the obtained intermediate result. Because the split nodes corresponding to the neural network differ between the first moment and the second moment, different intermediate results are sent between the first terminal device and the server at different moments, which further increases the difficulty for an attacker to obtain the original data to be processed and further improves the degree of protection of user data privacy.
The following describes the trigger for the server to acquire the updated split nodes corresponding to the target neural network. In one implementation, the server may reacquire the split nodes corresponding to the target neural network every fixed time period; by way of example, the fixed duration may be one day, one week, ten days, fifteen days, one month, or another length, and is not exhaustive herein.
In another implementation, the server may reacquire the split node corresponding to the target neural network at a fixed point in time; as an example, the fixed point in time may be, for example, 2 a.m. 1 of each month, 3 a.m. on monday of each week, or other point in time, etc., which is not exhaustive herein.
In another implementation manner, the first terminal device may send a request message to the server, where the request message is used to request to update the number of the neural network layers deployed by the target neural network, that is, to request to update the deployment situation of the plurality of the neural network layers included in the target neural network on the first terminal device and the server. Alternatively, the request message may be actively triggered by the user through the first terminal device, i.e. the user may actively trigger updating the number of neural network layers deployed by the target neural network, etc.
Further, in one case, the first terminal device may send a request message to the server each time the target task needs to be performed, the request message being used to request updating of the number of neural network layers deployed by the target neural network; in another case, the first terminal device may send a request message to the server for requesting updating of the number of neural network layers deployed by the target neural network every time the target task is performed a target number of times; or may trigger the first terminal device to send the request message to the server in other cases, which is not exhaustive here.
It should be noted that other ways may also exist to trigger the server to obtain the updated split node corresponding to the target neural network, and the specific implementation manner may be flexibly determined in combination with the specific application scenario, which is not limited herein.
The following describes a specific implementation process by which the server acquires the updated split nodes corresponding to the target neural network. The determining factors of the number of neural network layers deployed on the first terminal device may include: the processor resource occupancy of the first terminal device and/or the memory resource occupancy of the first terminal device. Optionally, the determining factors of the first neural network and the third neural network may further include any one or more of the following: the number of processes currently running on the first terminal device, the time that each process on the first terminal device has been running, the running state of each process on the first terminal device, or other factors that can be considered; these may be determined in conjunction with the actual application scenario and are not listed here.
Further, the evaluation index of the "occupation amount of the memory resource of the first terminal device" may include any one or more of the following indexes: the size of the total memory resource of the first terminal device, the size of the occupied memory resource of the first terminal device, the occupancy rate of the memory resource of the first terminal device or other evaluation indexes, etc.
The evaluation index of the "occupation amount of the processor resource of the first terminal device" may include any one or more of the following: the occupation rate of the processor resources of the first terminal device, the occupation duration of each processor on the first terminal device for executing the target task, the load of the processor on the first terminal device for executing the target task allocation, the performance of the processor on the first terminal device for executing the target task allocation, or other evaluation indexes capable of reflecting the occupation amount of the processor resources on the first terminal device for executing the target task, etc. are specifically determined by combining with actual products, and are not exhaustive herein.
Specifically, the server may calculate, according to the amount of occupation of the processor resources of the first terminal device, an estimated amount of the processor resources allocated when the first terminal device executes the target task; correspondingly, the server can calculate the available amount of the memory resources of the first terminal device according to the occupied amount of the memory resources of the first terminal device, and further can obtain the pre-estimated amount of the memory resources allocated when the first terminal device executes the target task.
The server may generate an updated split node corresponding to the target neural network based on the estimated amount of processor resources allocated when the first terminal device performs the target task and the estimated amount of memory resources allocated when the first terminal device performs the target task. If the target neural network is split according to the updated splitting node corresponding to the target neural network, processor resources occupied by the first neural network and the third neural network deployed on the first terminal device in the data processing process are smaller than or equal to the estimated amount of the processor resources allocated when the first terminal device executes the target task; the memory resources occupied by the first neural network and the third neural network deployed on the first terminal device in the data processing process are less than or equal to the pre-estimated amount of memory resources allocated when the first terminal device executes the target task.
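As an illustrative sketch only, given per-layer estimates of processor and memory cost, the server could search for split nodes that keep the device-side layers within the estimated budgets; the brute-force search below, its function name, and the objective of keeping as many layers as possible on the device are assumptions:

```python
def choose_split_nodes(layer_costs, cpu_budget, mem_budget):
    # layer_costs: list of (cpu, mem) estimates per neural network layer.
    # Returns (first_split, second_split) such that the device-side layers
    # (before first_split and from second_split onwards) fit the estimated
    # processor and memory budgets; returns None if no split fits.
    n = len(layer_costs)
    best = None
    for first in range(1, n):                  # at least one front layer on the device
        for second in range(first + 1, n):     # at least one rear layer on the device
            device = layer_costs[:first] + layer_costs[second:]
            cpu = sum(c for c, _ in device)
            mem = sum(m for _, m in device)
            if cpu <= cpu_budget and mem <= mem_budget:
                count = first + (n - second)   # number of device-side layers
                if best is None or count > best[0]:
                    best = (count, (first, second))
    return None if best is None else best[1]
```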
More specifically, the following describes the process in which "the server obtains, according to the occupation amount of the processor resources of the first terminal device, an estimated amount of the processor resources allocated when the first terminal device executes the target task". In one implementation, a regression model on which a training operation has been performed may be stored on the server, where the regression model is used to perform the estimation operation; by way of example and not exhaustively, the regression model may be an autoregressive integrated moving average (ARIMA) model, a recurrent neural network (RNN), or another type of model. The input of the regression model may include the occupancy rate of the processor resources on the first terminal device, the utilization rate of the memory resources on the first terminal device, the number of processes currently running on the first terminal device, and the time that each process on the terminal has been running; the output of the regression model may be the estimated occupancy rate of the processor resources and the estimated occupancy rate of the memory resources corresponding to each process in a future period of time.
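As a non-authoritative illustration of this implementation, the sketch below fits an ARIMA model (one of the model types named above) to a per-process occupancy history, assuming the statsmodels library is available; the sampled history, the model order, and the forecast horizon are illustrative assumptions.

```python
# Hypothetical sketch of the regression-based estimation using ARIMA.
from statsmodels.tsa.arima.model import ARIMA

def forecast_occupancy(history, horizon=5):
    """history: recent occupancy samples of one process (e.g. CPU%, 0-100),
    long enough to fit the model. Returns the estimated occupancy over the
    next `horizon` sampling intervals."""
    model = ARIMA(history, order=(1, 1, 1))  # (p, d, q) chosen for illustration
    fitted = model.fit()
    return fitted.forecast(steps=horizon)

# Per-process forecasts can then be summed to estimate the total future
# occupancy, from which the server derives the estimated available amount
# (e.g. 100% minus the total forecast occupancy).
```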
The server may calculate the estimated available amount of the processor resource and the estimated available amount of the memory resource of the first terminal device in a future period according to the estimated occupancy rate of the processor resource and the estimated occupancy rate of the memory resource corresponding to each process in the future period. Further, in one case, the server may determine the estimated availability of the processor resource of the first terminal device in the future period of time as an estimated amount of the processor resource allocated when the first terminal device performs the target task, and determine the estimated availability of the memory resource of the first terminal device in the future period of time as an estimated amount of the memory resource allocated when the first terminal device performs the target task.
In another case, the server may multiply the estimated available amount of the processor resources of the first terminal device in the future period of time by a first ratio, and determine the obtained product as the estimated amount of the processor resources allocated when the first terminal device executes the target task; and multiply the estimated available amount of the memory resources of the first terminal device in the future period of time by the first ratio, and determine the obtained product as the estimated amount of the memory resources allocated when the first terminal device executes the target task; the first ratio is less than 1.
In another implementation, the server may determine, according to a preset rule, the estimated amount of the processor resources allocated when the first terminal device executes the target task. The server may multiply the current occupation amount of the processor resources of the first terminal device by a second ratio, and determine the obtained product as the estimated occupation amount of the processor resources of the first terminal device in a future period of time; and multiply the current occupation amount of the memory resources of the first terminal device by the second ratio, and determine the obtained product as the estimated occupation amount of the memory resources of the first terminal device in the future period of time; the second ratio is greater than 1.
The server then determines the estimated available amount of the processor resources of the first terminal device in the future period of time according to the estimated occupation amount of the processor resources in that period (for example, as the total amount minus the estimated occupation amount), and likewise determines the estimated available amount of the memory resources in the future period of time according to the estimated occupation amount of the memory resources. Further, the estimated amounts of the processor resources and the memory resources allocated when the first terminal device executes the target task can be determined from these estimated available amounts.
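The two rules above amount to simple arithmetic; the following hedged sketch captures both, where the ratio values and the assumption that availability equals the total amount minus the occupied amount are illustrative.

```python
# Minimal sketch of the two estimation rules described above.

def allocate_from_forecast(forecast_available, first_ratio=0.8):
    # Implementation 1: scale the forecast availability by a ratio < 1,
    # leaving headroom for other processes.
    assert first_ratio < 1
    return forecast_available * first_ratio

def allocate_from_rule(total, current_occupied, second_ratio=1.2):
    # Implementation 2: inflate the current occupancy by a ratio > 1 to get
    # the estimated future occupancy, then allocate what remains.
    assert second_ratio > 1
    estimated_occupied = current_occupied * second_ratio
    return max(total - estimated_occupied, 0)
```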
It should be noted that the foregoing description of "the server obtains, according to the occupation amount of the processor resources of the first terminal device, an estimated amount of the processor resources allocated when the first terminal device executes the target task" merely demonstrates the feasibility of the solution; the server may also obtain this estimated amount in other manners, which are not exhausted here.
Optionally, in order to ensure that the updated split node of the target neural network differs from the split node before the update, in one implementation, after generating the updated split node corresponding to the target neural network according to the estimated amounts of the processor resources and the memory resources allocated when the first terminal device executes the target task, the server may randomly adjust the determined split node, that is, randomly move the position of the split node forward or backward in the target neural network, so as to update the split node once more and obtain the final updated split node corresponding to the target neural network.
Further, in the embodiment corresponding to fig. 4, the target neural network corresponds to two split nodes, so when the determined split nodes are randomly adjusted, only the position of the first split node in the target neural network may be randomly adjusted, or only the position of the second split node may be randomly adjusted, or the positions of both the first split node and the second split node may be randomly adjusted.
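A minimal sketch of this random adjustment is given below; the offset range, the clamping policy, and the function name are illustrative assumptions.

```python
# Hypothetical sketch: move one or both split nodes forward or backward by a
# small random offset, keeping the first node strictly before the second.
import random

def jitter_split_nodes(first, second, num_layers, max_shift=2):
    mode = random.choice(("first", "second", "both"))
    if mode in ("first", "both"):
        first += random.randint(-max_shift, max_shift)
    if mode in ("second", "both"):
        second += random.randint(-max_shift, max_shift)
    # Clamp so that 1 <= first < second <= num_layers - 1.
    first = max(1, min(first, num_layers - 2))
    second = max(first + 1, min(second, num_layers - 1))
    return first, second
```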
For a more intuitive understanding of the solution, referring to fig. 10, fig. 10 is a schematic flowchart of updating the split nodes corresponding to the target neural network in an embodiment of the present application, in which the target neural network is split into a first neural network, a second neural network, and a third neural network. As shown in fig. 10: C1, before the first terminal device executes the target task, it acquires a plurality of parameters associated with the computer resources already occupied on the first terminal device, where the plurality of parameters may include the occupation amount of the processor resources of the first terminal device, the occupation amount of the memory resources of the first terminal device, the number of processes currently running on the first terminal device, and the time that each process on the first terminal device has been running, and the first terminal device sends the plurality of parameters to the server; C2, the server determines, according to the received parameters, the estimated amounts of the processor resources and the memory resources allocated when the first terminal device executes the target task; C3, the server obtains the updated split nodes corresponding to the target neural network according to these estimated amounts; C4, the server randomly moves the updated split nodes forward or backward to obtain the final updated split nodes corresponding to the target neural network; C5, the server determines, from the target neural network according to the final updated split nodes, the n neural network layers included in the first neural network, the m neural network layers included in the second neural network, and the s neural network layers included in the third neural network; and C6, the server sends the n neural network layers included in the first neural network and the s neural network layers included in the third neural network to the first terminal device, so that the first neural network and the third neural network are deployed on the first terminal device and the second neural network is deployed on the server. It should be understood that the example in fig. 10 is merely for facilitating understanding of the solution and is not used to limit it.
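Purely to tie steps C1-C6 together, the following server-side sketch composes the illustrative helpers from the earlier sketches; estimate_budgets is an assumed stub standing in for the resource-estimation step, and none of these names come from the present embodiment.

```python
# Illustrative composition of steps C2-C6 on the server side. pick_split_nodes
# and jitter_split_nodes are the sketches shown earlier in this section.

def estimate_budgets(params):
    # Assumed stub: derive processor/memory budgets from the reported
    # parameters, e.g. via the regression or ratio rules sketched above.
    return params["cpu_available"] * 0.8, params["mem_available"] * 0.8

def update_deployment(params, layers):
    cpu_budget, mem_budget = estimate_budgets(params)               # C2
    first, second = pick_split_nodes(params["cpu_costs"],
                                     params["mem_costs"],
                                     cpu_budget, mem_budget)        # C3
    first, second = jitter_split_nodes(first, second, len(layers))  # C4
    first_nn = layers[:first]                                       # C5
    second_nn = layers[first:second]
    third_nn = layers[second:]
    # C6: first_nn and third_nn are sent to the device; second_nn stays here.
    return first_nn, second_nn, third_nn
```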
In another implementation, the server may adopt different estimation algorithms at different moments to obtain, according to the available amount of the processor resources of the first terminal device, the estimated amount of the processor resources allocated when the first terminal device executes the target task, thereby increasing the probability that different moments correspond to different estimated amounts of processor resources; correspondingly, the server may adopt different estimation algorithms at different moments to obtain, according to the available amount of the memory resources of the first terminal device, the estimated amount of the memory resources allocated when the first terminal device executes the target task, thereby increasing the probability that different moments correspond to different estimated amounts of memory resources. This in turn increases the probability that different moments correspond to different numbers of neural network layers deployed on the first terminal device.
In this embodiment of the application, since a plurality of application programs usually need to run on the first terminal device, the computer resources that the same terminal device can allocate to the target task may differ at different moments. Because the factors that determine the first neural network and the third neural network include the occupation amount of the processor resources of the first terminal device and/or the occupation amount of the memory resources of the first terminal device, the neural networks deployed on the first terminal device can be matched to the computing power of the first terminal device, so as to avoid increasing the computing pressure on the first terminal device during execution of the target task.
408. The server sends n neural network layers included in the first neural network and s neural network layers included in the third neural network to the first terminal device.
In some embodiments of the present application, the server may split the target neural network into the first neural network, the second neural network, and the third neural network according to the two updated split nodes; the server sends the n neural network layers included in the first neural network and the s neural network layers included in the third neural network to the first terminal device, so that the first neural network and the third neural network are deployed on the first terminal device, while the m neural network layers included in the second neural network remain deployed on the server.
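As an illustration of the split itself, the following PyTorch sketch cuts a sequential target network at two split nodes; the toy architecture and split positions are assumptions made only for this example.

```python
# Minimal sketch: split a sequential target network at two updated split nodes.
import torch.nn as nn

target_net = nn.Sequential(*[nn.Linear(64, 64) for _ in range(10)])  # toy network
first_node, second_node = 3, 8  # updated split nodes (illustrative values)

layers = list(target_net.children())
first_nn = nn.Sequential(*layers[:first_node])             # n layers, device side
second_nn = nn.Sequential(*layers[first_node:second_node]) # m layers, server side
third_nn = nn.Sequential(*layers[second_node:])            # s layers, device side
# The server would then serialize first_nn and third_nn (e.g. with torch.save)
# and transmit them to the first terminal device.
```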
In this embodiment of the application, when the number of neural network layers deployed on the first terminal device changes, the server can send the updated first neural network and the updated third neural network to the first terminal device. This further increases the difficulty for an attacker of determining which neural network is deployed on the first terminal device, and therefore the difficulty of inverting an intermediate result to recover the original data to be processed, thereby further improving the degree of protection of user data privacy.
409. The first terminal equipment inputs the data to be processed into the first neural network to obtain a third intermediate result generated by the first neural network, and at a second moment, the first neural network comprises n neural network layers.
In this embodiment of the application, steps 407 to 413 are optional. If step 407 is not performed, steps 408 to 413 need not be performed either; that is, the number of neural network layers deployed on the first terminal device is not updated for different moments of the same terminal device, so the first neural network and the third neural network do not need to be redeployed on the first terminal device.
If step 407 is performed, that is, the number of deployed neural network layers is updated for different moments of the same terminal device, and step 408 is also performed, the first terminal device may receive the n neural network layers included in the first neural network and the s neural network layers included in the third neural network, and store the received layers locally.
If step 407 is executed but step 408 is not, and step 401 was executed (that is, the first neural network, the second neural network, and the third neural network are obtained by splitting the target neural network, and in step 401 the server deployed the first neural network and the third neural network on a new first terminal device for the first time), the two split nodes corresponding to the first neural network and the third neural network may have been determined based on the maximum estimated amount of the computer resources allocatable to the first terminal device, where the "maximum estimated amount of the computer resources allocatable to the first terminal device" includes the "maximum estimated amount of the processor resources allocatable to the first terminal device" and the "maximum estimated amount of the memory resources allocatable to the first terminal device". In that case, the value of N may be greater than or equal to n, and the value of S may be greater than or equal to s.
The server may send second indication information to the first terminal device after obtaining the updated split nodes corresponding to the target neural network, where the second indication information is used to inform the first terminal device of the two updated split nodes. The first terminal device can determine the updated first neural network from the locally stored first neural network according to the second indication information, and determine the updated third neural network from the locally stored third neural network, so that the first neural network and the third neural network are deployed on the first terminal device.
If step 407 is executed but neither step 408 nor step 401 is, and the first neural network, the second neural network, and the third neural network are obtained by splitting the target neural network, then in one implementation, if the first terminal device stores the first P neural network layers and the last Q neural network layers of the target neural network, the server may send the second indication information to the first terminal device after obtaining the updated split nodes; the first terminal device may determine the first neural network from the first P neural network layers according to the received second indication information, and determine the third neural network from the last Q neural network layers, thereby deploying the first neural network and the third neural network on the first terminal device, where P is an integer greater than or equal to n, and Q is an integer greater than or equal to s.
In another implementation, if the first terminal device stores the entire trained target neural network, the server may, after obtaining the updated split nodes corresponding to the target neural network, send second indication information to the first terminal device, where the second indication information is used to inform the first terminal device of the two updated split nodes. The first terminal device can then determine the first neural network and the third neural network from the target neural network according to the second indication information, and the server can determine the second neural network from the target neural network according to the updated split nodes, that is, the first neural network and the third neural network are deployed on the first terminal device and the second neural network on the server.
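The sketch below illustrates, under the assumption of a simple dictionary-shaped message, how the first terminal device might apply the second indication information to a locally stored copy of the target neural network; the message format and field names are not specified by the embodiment.

```python
# Hypothetical device-side handling of the second indication information when
# the whole trained target network (a list of layers) is stored locally.
def apply_indication(stored_layers, indication):
    first_node = indication["first_split_node"]    # assumed message fields
    second_node = indication["second_split_node"]
    first_nn = stored_layers[:first_node]
    third_nn = stored_layers[second_node:]
    # Layers [first_node, second_node) correspond to the second neural
    # network and are executed on the server, not locally.
    return first_nn, third_nn
```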
After the first terminal device is deployed with the first neural network and the third neural network, the data to be processed may be input into the first neural network to obtain a third intermediate result generated by the first neural network, and the specific implementation manner of the foregoing step may refer to the description in step 402, where the concept of the "third intermediate result" is similar to that of the "first intermediate result", and details are not repeated herein.
The embodiment of the present application does not limit the relative number of executions of step 401 and step 409; step 409 may be executed a plurality of times after step 401 is executed once.
410. The first terminal device sends the third intermediate result to the server.
411. The server inputs the third intermediate result into the second neural network to obtain a fourth intermediate result generated by the second neural network, and at the second moment, the second neural network comprises m neural network layers.
412. The server sends the fourth intermediate result to the first terminal device.
413. The first terminal equipment inputs the fourth intermediate result into a third neural network to obtain a prediction result which is generated by the third neural network and corresponds to the data to be processed, and at the second moment, the third neural network comprises s neural network layers.
In the embodiment of the present application, the specific manner of steps 410 to 413 may refer to the descriptions in steps 403 to 406, and the difference is that the "first intermediate result" in steps 403 to 406 is replaced by the "third intermediate result" in steps 410 to 413, and the "second intermediate result" in steps 403 to 406 is replaced by the "fourth intermediate result" in steps 410 to 413, where the meaning of the "fourth intermediate result" is similar to the meaning of the "second intermediate result", and the description is omitted herein.
For a more intuitive understanding of the solution, please refer to fig. 11, which is a schematic diagram of the split nodes corresponding to the target neural network in the data processing method according to an embodiment of the present application. Fig. 11 shows the split nodes before and after updating: the first split node before updating is point X in fig. 11, the first split node after updating is point Y, and the second split node is point H both before and after updating. As shown in fig. 11, between the split nodes before the update and the split nodes after the update, the first neural network deployed on the first terminal device changes, the second neural network deployed on the server changes, and the intermediate result sent by the first terminal device to the server also changes. It should be understood that the embodiment corresponding to fig. 11 is merely for facilitating understanding of the solution and is not used to limit it.
In this embodiment of the application, because the operation of the second neural network is completed by the server, the computer resources of the first terminal device occupied during the computation of the whole target neural network can be reduced. The first terminal device first processes the data to be processed with the first neural network and sends only the first intermediate result to the server, rather than the raw data, which avoids leaking the original data to be processed and improves the degree of protection of user data privacy; and because the computation of the trailing third neural network of the target neural network is also executed on the first terminal device side, the degree of protection of user data privacy is further improved. An attacker may try to invert an intermediate result sent between the first terminal device and the server to recover the original data to be processed; since the number of neural network layers deployed on the first terminal device changes between the first moment and the second moment, different intermediate results are sent between the first terminal device and the server at different moments, which further increases the difficulty for the attacker of obtaining the original data to be processed and further improves the degree of protection of user data privacy.
2. The target neural network comprises a first neural network and a second neural network
In an embodiment of the present application, referring to fig. 12, fig. 12 is a schematic flow chart of a data processing method provided in an embodiment of the present application, where the data processing method provided in the embodiment of the present application may include:
1201. The server sends a first neural network to the first terminal device, and a second neural network is deployed on the server; at a first moment, the first neural network comprises N neural network layers and the second neural network comprises M neural network layers, and the first neural network and the second neural network form a target neural network.
1202. The first terminal equipment inputs the data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network.
1203. The first terminal device sends the first intermediate result to the server.
In the embodiment of the present application, the specific implementation manner of steps 1201 to 1203 may refer to the descriptions in steps 401 to 403 in the corresponding embodiment of fig. 4, where the difference is that, in steps 401 to 403, the target neural network includes a first neural network, a second neural network, and a third neural network; in steps 1201-1203, the target neural network includes a first neural network and a second neural network, the first neural network being located before the second neural network.
Optionally, if the first neural network and the second neural network are obtained by splitting the target neural network, the first neural network refers to a neural network layer located before the target splitting node in the target neural network, and the second neural network refers to a neural network layer located after the target splitting node in the target neural network; an understanding of the concepts of "the first neural network is located before the second neural network" and "the first intermediate result" may refer to the description in the corresponding embodiment of fig. 4, and will not be repeated here.
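For a concrete picture of the two-part deployment, the following PyTorch sketch runs the first N layers on one side and the remaining layers on the other; the toy network, the split position, and the input shape are illustrative assumptions (in a real system the two halves would run on separate machines, with the intermediate result transmitted between them).

```python
# Minimal sketch of the two-part split: the device runs the layers before the
# target split node and ships the intermediate result; the server finishes.
import torch
import torch.nn as nn

target_net = nn.Sequential(*[nn.Linear(32, 32) for _ in range(6)])  # toy network
split_node = 2  # target split node (illustrative)

layers = list(target_net.children())
first_nn = nn.Sequential(*layers[:split_node])   # device side
second_nn = nn.Sequential(*layers[split_node:])  # server side

x = torch.randn(1, 32)                       # data to be processed (illustrative)
first_intermediate = first_nn(x)             # computed on the first terminal device
prediction = second_nn(first_intermediate)   # computed on the server
```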
1204. The server inputs the first intermediate result into the second neural network to obtain a prediction result which is generated by the second neural network and corresponds to the data to be processed, and the type of information indicated by the prediction result corresponds to the type of the target task.
In the embodiment of the application, after receiving the encrypted first intermediate result, the server may decrypt the encrypted first intermediate result to obtain the first intermediate result, and input the first intermediate result into the second neural network to obtain the prediction result corresponding to the data to be processed generated by the second neural network (that is, obtain the prediction result corresponding to the data to be processed output by the whole target neural network), where the type of information indicated by the prediction result corresponds to the type of the target task. Further, the concept of "target task", "prediction result corresponding to data to be processed" may be understood with reference to the description in the embodiment corresponding to fig. 4, which is not described herein.
For a more intuitive understanding of the solution, please refer to fig. 13, which is a schematic flowchart of a data processing method according to an embodiment of the present application. In fig. 13, the target task executed by the target neural network is extraction of voiceprint features. As shown in fig. 13: D1, the first terminal device obtains the data to be processed input by the user (i.e., the voice data input by the user shown in fig. 13). D2, the first terminal device inputs the data to be processed into the first neural network (i.e., the first N neural network layers of the target neural network shown in fig. 13) to obtain a first intermediate result generated by the first neural network. D3, the first terminal device encrypts the first intermediate result and sends the encrypted first intermediate result to the server, so as to realize encrypted transmission of the first intermediate result. D4, the server decrypts the encrypted first intermediate result to obtain the first intermediate result, and inputs it into the second neural network (i.e., the last M neural network layers of the target neural network) to obtain the prediction result output by the whole target neural network and corresponding to the data to be processed (i.e., the voiceprint feature extracted from the input voice data). D5, the server compares the extracted voiceprint feature with each of at least one pre-registered voiceprint feature to determine whether the extracted voiceprint feature matches any of the pre-registered voiceprint features, thereby obtaining a voiceprint recognition result, where the voiceprint recognition result is used to indicate whether the user is an authorized user. D6, the server sends the voiceprint recognition result to the first terminal device. It should be understood that the example in fig. 13 is merely for facilitating understanding of the solution and is not intended to limit it.
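The embodiment does not specify a particular encryption scheme for steps D3 and D4. Purely as a hedged illustration, the sketch below uses Fernet symmetric encryption from the cryptography package, with the shared key assumed to be exchanged out of band and pickle used only as a stand-in serializer for the intermediate result.

```python
# Illustrative sketch of D3/D4: encrypted transmission of the first
# intermediate result between device and server.
import pickle
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # assumed to be shared between device and server
cipher = Fernet(key)

def encrypt_intermediate(tensor_like):
    # Device side (D3): serialize and encrypt the first intermediate result.
    return cipher.encrypt(pickle.dumps(tensor_like))

def decrypt_intermediate(token):
    # Server side (D4): decrypt and deserialize before feeding the second
    # neural network.
    return pickle.loads(cipher.decrypt(token))
```

Any authenticated encryption scheme with a proper key-exchange protocol could play the same role; Fernet is chosen here only for brevity.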
It should be noted that, step 1201 is an optional step, and if step 1201 is not performed, the manner in which the server deploys the first neural network to the first terminal device may refer to the description in step 402 in the corresponding embodiment of fig. 4, which is not described herein.
1205. The server obtains updated split nodes corresponding to the target neural network, and the updated split nodes indicate that the target neural network comprises n neural network layers in the first neural network and m neural network layers in the second neural network.
In the embodiment of the present application, the specific manner of step 1205 may refer to the description of step 407 in the corresponding embodiment of fig. 4, where the difference is that two split nodes corresponding to the target neural network are acquired in step 407, and one split node corresponding to the target neural network is acquired in step 1205.
In addition, the embodiment of the present application does not limit the number of executions between the step 1201 and the step 1205, and the step 1205 may be executed a plurality of times after the step 1201 is executed once.
1206. The server sends n neural network layers included in the first neural network to the first terminal device.
1207. The first terminal equipment inputs the data to be processed into the first neural network to obtain a third intermediate result generated by the first neural network, and at a second moment, the first neural network comprises n neural network layers.
1208. The first terminal device sends the third intermediate result to the server.
1209. And the server inputs the third intermediate result into m neural network layers included in the second neural network to obtain a prediction result which is generated by the second neural network and corresponds to the data to be processed.
In this embodiment of the application, the specific manner of steps 1206 to 1209 may refer to the descriptions of steps 401 to 404 in the embodiment corresponding to fig. 4, the difference being that the "first intermediate result" in steps 401 to 404 is replaced by the "third intermediate result" in steps 1206 to 1209; details are not repeated here.
It should be noted that, steps 1205 to 1209 are optional steps, and if step 1205 is not performed, steps 1206 to 1209 are not required to be performed; if step 1205 is executed, step 1206 is also an optional step, and if step 1205 is executed and step 1206 is not executed, the manner in which the server deploys the first neural network to the first terminal device may be described in step 409 in the corresponding embodiment of fig. 4, which is not described herein.
In this embodiment of the application, because the operation of the second neural network is completed by the server, the computer resources of the first terminal device occupied during the computation of the whole neural network can be reduced; and because the first terminal device first processes the data to be processed with the first neural network and sends only the first intermediate result to the server, rather than the raw data, leakage of the original data to be processed is avoided, which improves the degree of protection of user data privacy.
In order to better implement the above-described scheme of the embodiment of the present application on the basis of the embodiments corresponding to fig. 1 to 13, a related apparatus for implementing the above-described scheme is further provided below. Referring specifically to fig. 14, fig. 14 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, where the data processing apparatus 1400 is disposed on a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network and a third neural network are disposed on the first terminal device, a second neural network is disposed on the server, and the data processing apparatus 1400 includes: an input module 1401, configured to input data to be processed into a first neural network, to obtain a first intermediate result generated by the first neural network; a sending module 1402, configured to send a first intermediate result to a server, where the first intermediate result is used for the server to obtain a second intermediate result by using a second neural network; a receiving module 1403, configured to receive a second intermediate result sent by the server; the input module 1401 is further configured to input the second intermediate result into a third neural network, to obtain a prediction result generated by the third neural network and corresponding to the data to be processed; the first neural network, the second neural network and the third neural network form a target neural network, and the neural network deployed on the first terminal equipment has the following changes at two different moments of the first moment and the second moment: the number of neural network layers in the first neural network changes or the number of neural network layers in the third neural network changes.
In one possible design, at a first moment, the first neural network comprises N neural network layers and the third neural network comprises S neural network layers, and at a second moment, the first neural network comprises n neural network layers and the third neural network comprises s neural network layers, wherein N and n are different and/or S and s are different; the receiving module 1403 is further configured to receive the n neural network layers and the s neural network layers sent by the server.
It should be noted that, the content of information interaction and execution process between each module/unit in the data processing apparatus 1400, and the respective method embodiments corresponding to fig. 3 to 11 in the present application are based on the same concept, and specific content may be referred to the description in the foregoing method embodiments of the present application, which is not repeated herein.
Referring to fig. 15, fig. 15 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, where the data processing apparatus 1500 is deployed on a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network and a third neural network are deployed on the first terminal device, a second neural network is deployed on the server, and the data processing apparatus 1500 includes: a receiving module 1501, configured to receive a first intermediate result sent by a first terminal device, where the first intermediate result is obtained based on data to be processed and a first neural network; an input module 1502, configured to input the first intermediate result into a second neural network, to obtain a second intermediate result generated by the second neural network; a sending module 1503, configured to send a second intermediate result to the first terminal device, where the second intermediate result is used for the first terminal device to obtain a prediction result corresponding to the data to be processed by using the third neural network; the first neural network, the second neural network and the third neural network form a target neural network, and the neural network deployed on the first terminal equipment has the following changes at two different moments of the first moment and the second moment: the number of neural network layers in the first neural network changes or the number of neural network layers in the third neural network changes.
In one possible design, at a first moment, the first neural network comprises N neural network layers and the third neural network comprises S neural network layers, and at a second moment, the first neural network comprises n neural network layers and the third neural network comprises s neural network layers, wherein N and n are different and/or S and s are different; the sending module 1503 is further configured to send the n neural network layers and the s neural network layers to the first terminal device.
It should be noted that, the content of information interaction and execution process between each module/unit in the data processing apparatus 1500, and the respective method embodiments corresponding to fig. 3 to 11 in the present application are based on the same concept, and specific content may be referred to the description in the foregoing method embodiments of the present application, which is not repeated herein.
Referring to fig. 16, fig. 16 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, where a data processing apparatus 1600 is disposed at a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network is disposed on the first terminal device, a second neural network is disposed on the server, and the data processing apparatus 1600 includes: the input module 1601 is configured to input data to be processed into a first neural network, to obtain a first intermediate result generated by the first neural network; a sending module 1602, configured to send a first intermediate result to the server, where the first intermediate result is used for the server to obtain a prediction result corresponding to the data to be processed by using the second neural network; the first neural network and the second neural network form a target neural network, and the number of the neural network layers in the first neural network deployed on the first terminal equipment is changed at two different moments of the first moment and the second moment.
In one possible design, at a first moment, the first neural network includes N neural network layers, and at a second moment, the first neural network includes n neural network layers, N and n being different; the data processing apparatus 1600 further includes a receiving module configured to receive the first neural network sent by the server.
It should be noted that, content such as information interaction and execution process between each module/unit in the data processing apparatus 1600, each method embodiment corresponding to fig. 12 or fig. 13 in the present application is based on the same concept, and specific content may be referred to the description in the foregoing method embodiment of the present application, which is not repeated herein.
Referring to fig. 17, fig. 17 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, where the data processing apparatus 1700 is disposed on a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network is disposed on the first terminal device, a second neural network is disposed on the server, and the data processing apparatus 1700 includes: a receiving module 1701, configured to receive a first intermediate result sent by the first terminal device, where the first intermediate result is obtained based on the data to be processed and the first neural network; an input module 1702, configured to input the first intermediate result into the second neural network, so as to obtain a prediction result generated by the second neural network and corresponding to the data to be processed; the first neural network and the second neural network form a target neural network, and the number of the neural network layers in the first neural network deployed on the first terminal device changes between two different moments, the first moment and the second moment.
In one possible design, at a first moment, the first neural network includes N neural network layers, and at a second moment, the first neural network includes n neural network layers, N and n being different; the apparatus further comprises: a sending module, configured to send the n neural network layers to the first terminal device.
It should be noted that, content such as information interaction and execution process between each module/unit in the data processing apparatus 1700, each method embodiment corresponding to fig. 12 or fig. 13 in the present application is based on the same concept, and specific content may be referred to the description in the foregoing method embodiment of the present application, which is not repeated herein.
Next, referring to fig. 18, fig. 18 is a schematic structural diagram of a first terminal device according to an embodiment of the present application. Specifically, the first terminal device 1800 includes: receiver 1801, transmitter 1802, processor 1803 and memory 1804 (where the number of processors 1803 in the first terminal device 1800 may be one or more, as exemplified by one processor in fig. 18), wherein processor 1803 may include an application processor 18031 and a communication processor 18032. In some embodiments of the application, the receiver 1801, transmitter 1802, processor 1803 and memory 1804 may be connected by a bus or other means.
Memory 1804 may include read-only memory and random access memory, and provides instructions and data to the processor 1803. A portion of the memory 1804 may also include non-volatile random access memory (NVRAM). The memory 1804 stores operating instructions, executable modules, or data structures, or a subset or extended set thereof, where the operating instructions may include various operating instructions for performing various operations.
The processor 1803 controls the operation of the first terminal device. In a specific application, the individual components of the first terminal device are coupled together by a bus system, which may comprise, in addition to a data bus, a power bus, a control bus, a status signal bus, etc. For clarity of illustration, however, the various buses are referred to in the figures as bus systems.
The methods disclosed in the embodiments of the present application described above may be applied to the processor 1803 or implemented by the processor 1803. The processor 1803 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuitry in hardware or instructions in software in the processor 1803. The processor 1803 may be a general-purpose processor, a digital signal processor (digital signal processing, DSP), a microprocessor, or a microcontroller, and may further include an application specific integrated circuit (application specific integrated circuit, ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The processor 1803 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in the memory 1804, and the processor 1803 reads information in the memory 1804 and, in combination with the hardware, performs the steps of the method described above.
The receiver 1801 may be used to receive input numeric or character information and to generate signal inputs related to the relevant settings and function control of the first terminal device. The transmitter 1802 is operable to output numeric or character information via a first interface; the transmitter 1802 is further operable to send instructions to the disk stack via the first interface to modify data in the disk stack; the transmitter 1802 may also include a display device such as a display screen.
In the embodiment of the present application, in one case, the processor 1803 is configured to perform the steps performed by the first terminal device in each of the method embodiments corresponding to fig. 3 to 11. It should be noted that, the specific manner in which the processor 1803 executes the foregoing steps is based on the same concept as that of the method embodiments corresponding to fig. 3 to 11 in the present application, so that the technical effects brought by the same concept are the same as those of the method embodiments corresponding to fig. 3 to 11 in the present application, and the details of the method embodiments shown in the foregoing description of the present application will be referred to and will not be repeated herein.
In another case, the processor 1803 is configured to perform steps performed by the first terminal device in the respective method embodiments corresponding to fig. 12 or fig. 13. It should be noted that, the specific manner in which the processor 1803 executes the foregoing steps is based on the same concept, and the technical effects brought by the embodiments of the method corresponding to fig. 12 or fig. 13 are the same as those brought by the embodiments of the method corresponding to fig. 12 or fig. 13, and the specific details can be referred to the descriptions of the foregoing embodiments of the method according to the present application, which are not repeated herein.
Referring to fig. 19, fig. 19 is a schematic structural diagram of a server according to an embodiment of the present application, specifically, the server 1900 is implemented by one or more servers, where the server 1900 may have a relatively large difference due to configuration or performance, and may include one or more central processing units (central processing units, CPU) 1922 (e.g., one or more processors) and a memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) storing application programs 1942 or data 1944. Wherein the memory 1932 and storage medium 1930 may be transitory or persistent. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations on a server. Still further, a central processor 1922 may be provided in communication with a storage medium 1930 to execute a series of instruction operations in the storage medium 1930 on the server 1900.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
In one case, the cpu 1922 is configured to perform the steps performed by the server in the respective embodiments corresponding to fig. 3 to 11. It should be noted that, the specific manner in which the cpu 1922 executes the foregoing steps is based on the same concept as that of the method embodiments corresponding to fig. 3 to 11, so that the technical effects are the same as those of the method embodiments corresponding to fig. 3 to 11, and the details of the method embodiments are described in the foregoing description of the method embodiments of the present application, which is not repeated herein.
In another case, the central processor 1922 is for performing the steps performed by the server in the respective embodiments corresponding to fig. 12 or fig. 13. It should be noted that, the specific manner in which the cpu 1922 executes the foregoing steps is based on the same concept, and the technical effects brought by the embodiments of the method corresponding to fig. 12 or fig. 13 are the same as those brought by the embodiments of the method corresponding to fig. 12 or fig. 13, and the details of the embodiments of the method are described in the foregoing embodiments of the present application, and are not repeated herein.
Embodiments of the present application also provide a computer program product, which when run on a computer, causes the computer to perform the steps performed by the first terminal device in the method described in the embodiment shown in fig. 3 to 11, or causes the computer to perform the steps performed by the server in the method described in the embodiment shown in fig. 3 to 11, or causes the computer to perform the steps performed by the first terminal device in the method described in the embodiment shown in fig. 12 or 13, or causes the computer to perform the steps performed by the server in the method described in the embodiment shown in fig. 12 or 13.
In an embodiment of the present application, there is also provided a computer-readable storage medium having stored therein a program for performing signal processing, which when run on a computer, causes the computer to perform the steps performed by the first terminal device in the method described in the embodiment shown in fig. 3 to 11, or causes the computer to perform the steps performed by the server in the method described in the embodiment shown in fig. 3 to 11, or causes the computer to perform the steps performed by the first terminal device in the method described in the embodiment shown in fig. 12 or 13, or causes the computer to perform the steps performed by the server in the method described in the embodiment shown in fig. 12 or 13.
The embodiment of the application also provides a data processing system, which may include a first terminal device and a server, where the first terminal device is the first terminal device described in the embodiment shown in fig. 18, and the server is the server described in the embodiment shown in fig. 19.
The data processing device provided by the embodiment of the application can be a chip, and the chip comprises: a processing unit, which may be, for example, a processor, and a communication unit, which may be, for example, an input/output interface, pins or circuitry, etc. The processing unit may execute the computer-executable instructions stored in the storage unit to cause the chip to perform the data processing method described in the embodiment shown in fig. 12 or fig. 13, or to cause the chip to perform the data processing method described in the embodiment shown in fig. 3 to fig. 11. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, etc., and the storage unit may also be a storage unit in the wireless access device side located outside the chip, such as a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a random access memory (random access memory, RAM), etc.
Specifically, referring to fig. 20, fig. 20 is a schematic structural diagram of a chip provided in an embodiment of the present application, where the chip may be represented as a neural network processor NPU 200, and the NPU 200 is mounted as a coprocessor on a main CPU (Host CPU), and the Host CPU distributes tasks. The core part of the NPU is an arithmetic circuit 2003, and the controller 2004 controls the arithmetic circuit 2003 to extract matrix data in the memory and perform multiplication.
In some implementations, the arithmetic circuit 2003 internally includes a plurality of processing units (PEs). In some implementations, the operational circuit 2003 is a two-dimensional systolic array. The operation circuit 2003 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the operational circuit 2003 is a general-purpose matrix processor.
For example, assume that there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 2002 and buffers it on each PE in the arithmetic circuit. The arithmetic circuit fetches matrix A data from the input memory 2001, performs the matrix operation with matrix B, and the obtained partial or final result of the matrix is stored in the accumulator (accumulator) 2008.
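The data flow just described can be mirrored in a few lines of NumPy: the weight matrix stays stationary while partial products accumulate step by step, analogous to the role of the accumulator 2008. The matrix values and the rank-1 formulation are illustrative only.

```python
# Tiny illustration of accumulating partial matrix products, as a systolic
# array does: B is held stationary, columns of A stream through, and each
# step adds a rank-1 partial product into C.
import numpy as np

A = np.arange(6).reshape(2, 3)  # input matrix (illustrative values)
B = np.ones((3, 4))             # weight matrix, as fetched from weight memory
C = np.zeros((2, 4))            # output, accumulated step by step
for k in range(A.shape[1]):     # one "beat" of the pipeline per step
    C += np.outer(A[:, k], B[k, :])  # partial result added to the accumulator
assert np.allclose(C, A @ B)    # the accumulated result equals A x B
```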
The unified memory 2006 is used for storing input data and output data. The weight data is carried directly to the weight memory 2002 by the storage unit access controller (Direct Memory Access Controller, DMAC) 2005. The input data is also carried into the unified memory 2006 through the DMAC.
The bus interface unit 2010 (Bus Interface Unit, BIU) is used for the AXI bus to interact with the DMAC and the instruction fetch buffer (Instruction Fetch Buffer, IFB) 2009: it enables the instruction fetch buffer 2009 to obtain instructions from the external memory, and enables the storage unit access controller 2005 to obtain the raw data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 2006 or to transfer weight data to the weight memory 2002 or to transfer input data to the input memory 2001.
The vector calculation unit 2007 includes a plurality of operation processing units that, when necessary, perform further processing on the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, and magnitude comparison. It is mainly used for non-convolutional/fully connected layer computation in the neural network, such as batch normalization, pixel-level summation, and upsampling of feature planes.
In some implementations, the vector calculation unit 2007 can store the vector of processed outputs to the unified memory 2006. For example, the vector calculation unit 2007 may apply a linear function and/or a nonlinear function to the output of the operation circuit 2003, for example, linearly interpolate the feature plane extracted by the convolution layer, and further, for example, accumulate a vector of values to generate an activation value. In some implementations, the vector calculation unit 2007 generates normalized values, pixel-level summed values, or both. In some implementations, the vector of processed outputs can be used as an activation input to the operational circuitry 2003, e.g., for use in subsequent layers in a neural network.
The instruction fetch buffer (instruction fetch buffer) 2009, connected to the controller 2004, is used for storing instructions used by the controller 2004. The unified memory 2006, the input memory 2001, the weight memory 2002, and the instruction fetch buffer 2009 are all on-chip memories; the external memory is private to the NPU hardware architecture.
In the embodiments corresponding to fig. 3 to 13, at least one neural network layer in the target neural network is disposed on each of the first terminal device and the server, and the operation of the neural network layer in the target neural network may be performed by the operation circuit 2003 or the vector calculation unit 2007.
The processor mentioned in any of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the program of the method of the first aspect.
It should be further noted that the above-described apparatus embodiments are merely illustrative, and that the units described as separate units may or may not be physically separate, and that units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the device provided by the application, the connection relation between the modules represents that the modules have communication connection, and can be specifically implemented as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by software plus the necessary general-purpose hardware, or of course by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. Generally, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structure used to implement the same function can vary, for example an analog circuit, a digital circuit, or a dedicated circuit. However, for the present application, a software program implementation is the preferred embodiment in most cases. Based on such understanding, the technical solution of the present application, essentially or in the part contributing to the prior art, may be embodied in the form of a software product stored in a readable storage medium, such as a floppy disk, a USB disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk of a computer, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods according to the embodiments of the present application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used for implementation, the embodiments may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), a semiconductor medium (e.g., a solid-state drive (SSD)), or the like.

Claims (26)

1. A data processing method, wherein the method is applied to a data processing system, the data processing system includes a first terminal device and a server, a first neural network and a third neural network are deployed on the first terminal device, and a second neural network is deployed on the server, the method includes:
the first terminal device inputs data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network, and sends the first intermediate result to the server;
the server inputs the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network, and sends the second intermediate result to the first terminal device;
the first terminal device inputs the second intermediate result into the third neural network to obtain a prediction result generated by the third neural network and corresponding to the data to be processed;
wherein the first neural network, the second neural network, and the third neural network form a target neural network, and the neural networks deployed on the first terminal device change between a first moment and a second moment in the following manner: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
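Purely as a hedged illustration of the data flow recited in claim 1 (the sub-networks, tensor shapes, and in-process function calls standing in for the device-server transport are all invented for the example, not fixed by the claim):

```python
import torch
import torch.nn as nn

first_nn = nn.Sequential(nn.Linear(8, 16), nn.ReLU())    # deployed on the first terminal device
second_nn = nn.Sequential(nn.Linear(16, 16), nn.ReLU())  # deployed on the server
third_nn = nn.Linear(16, 4)                              # deployed back on the terminal device

data = torch.randn(1, 8)                             # data to be processed
first_intermediate = first_nn(data)                  # device -> server
second_intermediate = second_nn(first_intermediate)  # server -> device
prediction = third_nn(second_intermediate)           # prediction result on the device
```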
2. The method of claim 1, wherein at the first moment the first neural network comprises N neural network layers and the third neural network comprises S neural network layers, and at the second moment the first neural network comprises n neural network layers and the third neural network comprises s neural network layers, wherein N and n are different and/or S and s are different, the method further comprising:
the server sends the n neural network layers and the s neural network layers to the first terminal device.
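A possible realization of the layer update recited in claim 2, sketched with PyTorch serialization (the pack/unpack helpers and the choice of torch.save as the wire format are assumptions; the claim does not specify a transport):

```python
import io
import torch
import torch.nn as nn

def pack_layers(layers: nn.Module) -> bytes:
    buf = io.BytesIO()
    torch.save(layers, buf)  # serialize the layers for transmission
    return buf.getvalue()

def unpack_layers(blob: bytes) -> nn.Module:
    # weights_only=False because whole module objects, not just tensors, are shipped
    return torch.load(io.BytesIO(blob), weights_only=False)

new_head = nn.Sequential(nn.Linear(8, 16), nn.ReLU())  # the n layers for the second moment
new_tail = nn.Sequential(nn.Linear(16, 4))             # the s layers for the second moment
payload = [pack_layers(new_head), pack_layers(new_tail)]  # "server sends ... to the device"
first_nn, third_nn = (unpack_layers(b) for b in payload)  # device swaps in the new layers
```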
3. The method according to claim 1 or 2, characterized in that the method further comprises:
the server determines the first neural network and the third neural network from the target neural network, wherein the factors for determining the first neural network and the third neural network comprise: the processor resource occupancy of the first terminal device and/or the memory resource occupancy of the first terminal device.
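One conceivable policy using the determining factors of claim 3 (the thresholds and layer counts below are invented for illustration; the claim names only the factors, not any concrete rule):

```python
def choose_partition(cpu_occupancy: float, mem_occupancy: float, total_layers: int):
    """Pick (n, s): head and tail layer counts kept on the first terminal device."""
    load = max(cpu_occupancy, mem_occupancy)  # the tighter resource dominates
    if load > 0.8:
        n, s = 1, 1  # heavily loaded device keeps the bare minimum
    elif load > 0.5:
        n, s = 2, 2
    else:
        n, s = 4, 4  # lightly loaded device can host more layers
    s = min(s, total_layers - n - 1)  # always leave at least one layer on the server
    return n, s

n, s = choose_partition(cpu_occupancy=0.9, mem_occupancy=0.4, total_layers=12)
```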
4. The method according to claim 1 or 2, wherein the data processing system further comprises a second terminal device, and the number of neural network layers in the first neural network deployed on the first terminal device differs from that in the first neural network deployed on the second terminal device, and/or the number of neural network layers in the third neural network deployed on the first terminal device differs from that in the third neural network deployed on the second terminal device;
wherein the first terminal device and the second terminal device are terminal devices of different types, and/or the first terminal device and the second terminal device are terminal devices of different models within a same type.
5. The method according to claim 1 or 2, characterized in that
the processor resources occupied by the first terminal device when performing data processing through the first neural network and the third neural network are smaller than the processor resources occupied by the server when performing data processing through the second neural network, and the memory resources occupied by the first terminal device when performing data processing through the first neural network and the third neural network are smaller than the memory resources occupied by the server when performing data processing through the second neural network.
6. The method according to claim 1 or 2, wherein the data to be processed is any one of the following data: sound data, image data of a face, fingerprint data, or contour data of an ear.
7. A data processing method, wherein the method is applied to a data processing system, the data processing system including a first terminal device and a server, the first terminal device having a first neural network disposed thereon, and the server having a second neural network disposed thereon, the method comprising:
the first terminal device inputs data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network, and sends the first intermediate result to the server;
the server inputs the first intermediate result into the second neural network to obtain a prediction result generated by the second neural network and corresponding to the data to be processed;
wherein the first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between a first moment and a second moment.
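A hedged sketch of the two-segment variant of claim 7, where the device computes only the first intermediate result and the server produces the final prediction (shapes and layers are placeholders):

```python
import torch
import torch.nn as nn

first_nn = nn.Sequential(nn.Linear(8, 16), nn.ReLU())  # on the first terminal device
second_nn = nn.Linear(16, 4)                           # on the server

data = torch.randn(1, 8)                    # data to be processed
first_intermediate = first_nn(data)         # sent from the device to the server
prediction = second_nn(first_intermediate)  # final prediction produced on the server
```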
8. The method of claim 7, wherein at the first moment the first neural network comprises N neural network layers, and at the second moment the first neural network comprises n neural network layers, N being different from n, the method further comprising:
the server sends the n neural network layers to the first terminal device.
9. The method according to claim 7 or 8, wherein the data processing system further comprises a second terminal device, and the number of neural network layers in the first neural network deployed on the first terminal device differs from that in the first neural network deployed on the second terminal device;
wherein the first terminal device and the second terminal device are terminal devices of different types, and/or the first terminal device and the second terminal device are terminal devices of different models within a same type.
10. A data processing method, wherein the method is applied to a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network and a third neural network are disposed on the first terminal device, and a second neural network is disposed on the server, the method includes:
inputting data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network;
sending the first intermediate result to the server, wherein the first intermediate result is used by the server to obtain a second intermediate result using the second neural network;
receiving the second intermediate result sent by the server, and inputting the second intermediate result into the third neural network to obtain a prediction result generated by the third neural network and corresponding to the data to be processed;
wherein the first neural network, the second neural network, and the third neural network form a target neural network, and the neural networks deployed on the first terminal device change between a first moment and a second moment in the following manner: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
11. A data processing method, wherein the method is applied to a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network and a third neural network are disposed on the first terminal device, and a second neural network is disposed on the server, the method includes:
receiving a first intermediate result sent by the first terminal device, wherein the first intermediate result is obtained based on data to be processed and the first neural network;
inputting the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network;
sending the second intermediate result to the first terminal device, wherein the second intermediate result is used by the first terminal device to obtain, using the third neural network, a prediction result corresponding to the data to be processed;
wherein the first neural network, the second neural network, and the third neural network form a target neural network, and the neural networks deployed on the first terminal device change between a first moment and a second moment in the following manner: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
12. A data processing method, wherein the method is applied to a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network is disposed on the first terminal device, and a second neural network is disposed on the server, the method includes:
inputting data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network;
sending the first intermediate result to the server, wherein the first intermediate result is used by the server to obtain, using the second neural network, a prediction result corresponding to the data to be processed;
wherein the first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between a first moment and a second moment.
13. A data processing method, wherein the method is applied to a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network is disposed on the first terminal device, and a second neural network is disposed on the server, the method includes:
receiving a first intermediate result sent by the first terminal device, wherein the first intermediate result is obtained based on data to be processed and the first neural network;
inputting the first intermediate result into the second neural network to obtain a prediction result generated by the second neural network and corresponding to the data to be processed;
wherein the first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between a first moment and a second moment.
14. A data processing apparatus, wherein the data processing apparatus is disposed on a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network and a third neural network are disposed on the first terminal device, and a second neural network is disposed on the server, the apparatus comprising:
the input module is used for inputting data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network;
the sending module is used for sending the first intermediate result to the server, wherein the first intermediate result is used by the server to obtain a second intermediate result using the second neural network;
the receiving module is used for receiving the second intermediate result sent by the server;
the input module is further configured to input the second intermediate result into the third neural network, so as to obtain a prediction result generated by the third neural network and corresponding to the data to be processed;
wherein the first neural network, the second neural network, and the third neural network form a target neural network, and the neural networks deployed on the first terminal device change between a first moment and a second moment in the following manner: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
15. The apparatus of claim 14, wherein at the first moment the first neural network comprises N neural network layers and the third neural network comprises S neural network layers, and at the second moment the first neural network comprises n neural network layers and the third neural network comprises s neural network layers, wherein N and n are different and/or S and s are different;
the receiving module is further configured to receive the n neural network layers and the s neural network layers sent by the server.
16. A data processing apparatus, wherein the data processing apparatus is disposed on a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network and a third neural network are disposed on the first terminal device, and a second neural network is disposed on the server, the apparatus comprising:
the receiving module is used for receiving a first intermediate result sent by the first terminal device, wherein the first intermediate result is obtained based on data to be processed and the first neural network;
the input module is used for inputting the first intermediate result into the second neural network to obtain a second intermediate result generated by the second neural network;
the sending module is used for sending the second intermediate result to the first terminal device, wherein the second intermediate result is used by the first terminal device to obtain, using the third neural network, a prediction result corresponding to the data to be processed;
wherein the first neural network, the second neural network, and the third neural network form a target neural network, and the neural networks deployed on the first terminal device change between a first moment and a second moment in the following manner: the number of neural network layers in the first neural network changes, or the number of neural network layers in the third neural network changes.
17. The apparatus of claim 16, wherein at the first moment the first neural network comprises N neural network layers and the third neural network comprises S neural network layers, and at the second moment the first neural network comprises n neural network layers and the third neural network comprises s neural network layers, wherein N and n are different and/or S and s are different;
the sending module is further configured to send the n neural network layers and the s neural network layers to the first terminal device.
18. A data processing apparatus, wherein the data processing apparatus is disposed on a first terminal device, the first terminal device is included in a data processing system, the data processing system further includes a server, a first neural network is disposed on the first terminal device, and a second neural network is disposed on the server, the apparatus comprising:
the input module is used for inputting data to be processed into the first neural network to obtain a first intermediate result generated by the first neural network;
the sending module is used for sending the first intermediate result to the server, wherein the first intermediate result is used by the server to obtain, using the second neural network, a prediction result corresponding to the data to be processed;
wherein the first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between a first moment and a second moment.
19. The apparatus of claim 18, wherein at the first moment the first neural network comprises N neural network layers, and at the second moment the first neural network comprises n neural network layers, N and n being different;
the apparatus further comprises a receiving module, configured to receive the n neural network layers sent by the server.
20. A data processing apparatus, wherein the data processing apparatus is disposed on a server, the server is included in a data processing system, the data processing system further includes a first terminal device, a first neural network is disposed on the first terminal device, and a second neural network is disposed on the server, the apparatus comprising:
the receiving module is used for receiving a first intermediate result sent by the first terminal device, wherein the first intermediate result is obtained based on data to be processed and the first neural network;
the input module is used for inputting the first intermediate result into the second neural network to obtain a prediction result generated by the second neural network and corresponding to the data to be processed;
wherein the first neural network and the second neural network form a target neural network, and the number of neural network layers in the first neural network deployed on the first terminal device changes between a first moment and a second moment.
21. The apparatus of claim 20, wherein at the first moment the first neural network comprises N neural network layers, and at the second moment the first neural network comprises n neural network layers, N and n being different;
the apparatus further comprises: a sending module, configured to send the n neural network layers to the first terminal device.
22. A terminal device, comprising a processor and a memory, the processor being coupled to the memory,
the memory is used for storing programs;
the processor being configured to execute the program in the memory, causing the terminal device to perform the steps performed by the terminal device in the method of any one of claims 1 to 9, claim 10 or claim 12.
23. A server comprising a processor and a memory, the processor being coupled to the memory,
the memory is used for storing programs;
the processor being configured to execute the program in the memory, such that the server performs the steps performed by the server in the method of any one of claims 1 to 9, claim 11 or claim 13.
24. A data processing system, characterized in that the data processing system comprises a terminal device for performing the steps performed by the terminal device in the method according to any one of claims 1 to 6 and a server for performing the steps performed by the server in the method according to any one of claims 1 to 6; or
the terminal device being adapted to perform the steps performed by the terminal device in the method according to any of claims 7 to 10, the server being adapted to perform the steps performed by the server in the method according to any of claims 7 to 10.
25. A computer program product, characterized in that the computer program product comprises a program which, when run on a computer, causes the computer to perform the steps performed by the terminal device in the method according to any one of claims 1 to 9, claim 10 or claim 12, or causes the computer to perform the steps performed by the server in the method according to any one of claims 1 to 9, claim 11 or claim 13.
26. A computer-readable storage medium, in which a program is stored which, when run on a computer, causes the computer to perform the steps performed by the terminal device in the method according to any one of claims 1 to 9, claim 10 or claim 12, or causes the computer to perform the steps performed by the server in the method according to any one of claims 1 to 9, claim 11 or claim 13.
CN202210115049.6A 2022-01-30 2022-01-30 Data processing method and related equipment Pending CN116579380A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210115049.6A CN116579380A (en) 2022-01-30 2022-01-30 Data processing method and related equipment
PCT/CN2023/071725 WO2023143080A1 (en) 2022-01-30 2023-01-10 Data processing method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210115049.6A CN116579380A (en) 2022-01-30 2022-01-30 Data processing method and related equipment

Publications (1)

Publication Number Publication Date
CN116579380A true CN116579380A (en) 2023-08-11

Family

ID=87470469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210115049.6A Pending CN116579380A (en) 2022-01-30 2022-01-30 Data processing method and related equipment

Country Status (2)

Country Link
CN (1) CN116579380A (en)
WO (1) WO2023143080A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116719648B (en) * 2023-08-10 2023-11-07 泰山学院 Data management method and system for computer system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101267478A (en) * 2008-02-20 2008-09-17 陈世庆 Cross-platform control and management system for mobile artificial intelligent voice communication
CN109685202B (en) * 2018-12-17 2023-03-21 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic device
CN111091182A (en) * 2019-12-16 2020-05-01 北京澎思科技有限公司 Data processing method, electronic device and storage medium
CN113436208A (en) * 2021-06-30 2021-09-24 中国工商银行股份有限公司 Edge cloud cooperation-based image processing method, device, equipment and medium

Also Published As

Publication number Publication date
WO2023143080A1 (en) 2023-08-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination