WO2022160155A1 - Method and apparatus for model management - Google Patents

Method and apparatus for model management

Info

Publication number
WO2022160155A1
Authority
WO
WIPO (PCT)
Prior art keywords: model, server, terminal device, container, processed
Prior art date
Application number
PCT/CN2021/074081
Other languages
French (fr)
Chinese (zh)
Inventor
许斌 (Xu Bin)
梁琪 (Liang Qi)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority application: PCT/CN2021/074081
Publication: WO2022160155A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/445: Program loading or initiating

Definitions

  • the present application relates to the field of communications, and more particularly, to a method and apparatus for managing models.
  • Deep learning is an important technology in artificial intelligence. Deep learning aims to build a neural network that simulates the human brain for analysis and learning, and processes data by imitating the working mechanism of the human brain. Deep learning has been widely used in many fields such as image processing and speech recognition.
  • Terminal devices are also increasingly applying deep learning to support complex applications.
  • Increasingly complex computing tasks and increasingly large amounts of data make the computing resources required for deep learning larger and larger, which imposes higher requirements on the hardware resources of terminal devices.
  • terminal devices cannot integrate computing resources with high computing power, thus limiting the application of deep learning in terminal devices.
  • a mobile edge computing method which deploys the hardware resources required for deep learning on the mobile edge computing device.
  • Mobile edge computing devices need to serve different types of terminal devices, so a variety of virtualized containers must be provided or deployed to handle requests from different types of terminal devices, which makes development costly.
  • the present application provides a method and apparatus for managing models, which can reduce development costs and management complexity.
  • a method of managing a model may, for example, be performed by a server, or may also be performed by a component (eg, a circuit or a chip) in the server.
  • the server may be an edge computing device, such as a mobile edge computing (mobile edge computing, MEC) device.
  • the method includes: first, the server obtains a model file of a first model of a terminal device and information (eg, application data) of a task to be processed of the terminal device, where the first model is a model related to the task to be processed and can run in the running environment of the terminal device; then, the server converts the first model into a second model, which can run in the running environment of the server; finally, the server schedules a first virtualized container and loads the second model and the to-be-processed task into the first virtualized container.
  • the server converts or adapts the model uploaded by the terminal device, and the converted model can be run on the server side, that is, adapted to the running environment of the server.
  • the server can use the model adaptation technology to convert the model on the terminal device side into a model that can run on the server side.
  • application developers do not need to develop models adapted to the server side according to the hardware resources of the server, nor do they need to deploy different types of virtualized containers on the server side according to the model type, which reduces development costs and the management complexity of the server side.
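  • The server-side flow described above can be sketched as follows. This is a minimal, hypothetical illustration (all class and function names are illustrative, not defined by the application): obtain the model file and task, convert the model, schedule a container, load it, and return a result.

```python
class ModelConverter:
    """Adapts a terminal-side model file to the server's running environment."""

    def convert(self, model_file: dict) -> dict:
        # A real converter would translate operators and weights to the
        # server's framework; here we only tag the target environment.
        second_model = dict(model_file)
        second_model["runtime"] = "server"
        return second_model


class Server:
    def __init__(self):
        self.converter = ModelConverter()
        self.idle_containers = []  # pool of idle virtualized containers

    def handle_request(self, model_file: dict, task_data: bytes) -> dict:
        # 1. Convert the first (terminal-side) model into the second model.
        second_model = self.converter.convert(model_file)
        # 2. Schedule the first virtualized container: reuse an idle one if
        #    available, otherwise instantiate a new one.
        container = self.idle_containers.pop() if self.idle_containers else {"id": "new"}
        # 3. Load the second model and the to-be-processed task into it.
        container["model"] = second_model
        container["task"] = task_data
        # 4. Return a processing result for the to-be-processed task.
        return {"container_id": container["id"], "runtime": second_model["runtime"]}
```

  • The key point of the sketch is that conversion happens before container scheduling, so a single container type can host any converted model.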
  • the first model is a model related to a task to be processed by the terminal device, which can be understood as: the first model is a training model for processing the task to be processed by the terminal device. For example, if the task to be processed is a certain image task of the terminal device, then the first model is a training model for processing the image task (such as image data).
  • the server obtains the model file of the first model of the terminal device and the information of the tasks to be processed of the terminal device through a network device (such as an access network device or a core network device).
  • the server scheduling the first virtualized container includes: the server queries for an idle container; if there is an idle container, the server uses the idle container as the first virtualized container; or, if there is no idle container, the server obtains the first virtualized container through an instantiation operation.
  • the server can use the idle container directly. In this way, the server can use the existing idle containers, and there is no need to re-pull the image from the image repository and instantiate it into a container, thus saving the overhead of image pulling and instantiation. If there is no idle container, the server needs to re-pull the image from the image repository and instantiate it into a container to obtain the first virtualized container.
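  • The scheduling step above can be illustrated with a short sketch (the image-repository lookup stands in for a real image pull; names are assumptions for illustration):

```python
def schedule_container(idle_pool: list, image_repo: dict, image_name: str) -> dict:
    """Return the first virtualized container."""
    if idle_pool:
        # An idle container exists: reuse it and skip the pull/instantiation cost.
        return idle_pool.pop()
    # No idle container: pull the image and instantiate it into a container.
    image = image_repo[image_name]
    return {"image": image, "state": "instantiated"}
```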
  • the method further includes: returning, by the server, a processing result for the to-be-processed task, where the processing result is obtained by the first virtualization container processing the second model and the to-be-processed task.
  • the server returns the processing result to the terminal device so that the terminal device can perform further processing. In this way, the processing of the to-be-processed task is completed on the server side, thereby saving power consumption on the terminal device side.
  • the server can return the processing result to the terminal device through the network device.
  • a method of managing a model is provided.
  • the method may be performed by, for example, a terminal device, or may also be performed by a component (eg, a circuit or a chip) in the terminal device. This application does not limit this.
  • the method includes: when it is determined that the computing power of the task to be processed is to be offloaded, the terminal device triggers a model move-up mechanism, where the model move-up mechanism is used to upload the first model of the terminal device to the server, the first model is a model related to the task to be processed, and the first model can run in the operating environment of the terminal device; the terminal device then sends a first message, where the first message includes the model file of the first model and the information about the pending task.
  • when the computing power offloading mechanism is triggered, the terminal device can also trigger (or drive) the model move-up mechanism at the same time and upload the model file on the terminal device side to the server. That is, the terminal device transmits the model file on the terminal device side to the server when the computing power is offloaded.
  • the terminal device uploads the model file to the server, so that the server converts or adapts the uploaded model, and the converted model can run on the server side, that is, it is adapted to the running environment of the server. In this way, the tasks to be processed by the terminal device are moved to the server side for processing, which helps to save the power consumption of the terminal device.
  • the server can use the model adaptation technology to convert the model on the terminal device side into a model that can run on the server side.
  • application developers do not need to develop models adapted to the server side according to the hardware resources of the server, nor do they need to deploy different types of virtualized containers on the server side according to the model type, which reduces development costs and the management complexity of the server side.
  • the first model is a model related to a task to be processed by the terminal device, which can be understood as: the first model is a training model for processing the task to be processed by the terminal device. For example, if the task to be processed is an image task of the terminal device, the first model is a training model for processing that image task (eg, image data).
  • the terminal device may send the first message to the server through a network device (such as an access network device or a core network device).
  • the first message may be an inference request message.
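  • A hypothetical sketch of how the terminal device could assemble such an inference request message (the field names are assumptions for illustration, not a format defined by the application):

```python
def build_first_message(model_file: dict, task_info: dict) -> dict:
    """First message: model file of the first model plus pending-task info."""
    return {
        "type": "inference_request",
        "model_file": model_file,
        "task_info": task_info,
    }
```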
  • the method further includes: receiving, by the terminal device, a processing result for the to-be-processed task, where the processing result is obtained by the first virtualization container processing the second model and the to-be-processed task, and the second model is capable of running in the operating environment of the server. In this way, the processing of the to-be-processed task is completed on the server side, thereby saving power consumption on the terminal device side.
  • the terminal device can obtain the processing result through the network device.
  • a third aspect provides an apparatus for managing models, where the apparatus is configured to execute the method provided in the first aspect.
  • the apparatus may include a unit for performing the method provided by the first aspect.
  • an apparatus for managing models is provided, where the apparatus is configured to execute the method provided in the second aspect.
  • the apparatus may include a unit for performing the method provided by the second aspect.
  • an apparatus for managing a model including a processor.
  • the processor is coupled to the memory and can be used to execute instructions in the memory to implement the method in any one of the possible implementations of the first aspect above.
  • the apparatus for managing the model further includes a memory.
  • the apparatus for managing the model further includes a communication interface, and the processor is coupled to the communication interface.
  • the device is a server.
  • the communication interface may be a transceiver, or an input/output interface.
  • the device is a chip configured in a server.
  • the communication interface may be an input/output interface.
  • the transceiver may be a transceiver circuit.
  • the input/output interface may be an input/output circuit.
  • an apparatus for managing a model including a processor.
  • the processor is coupled to the memory and can be used to execute instructions in the memory to implement the method in any of the possible implementations of the second aspect above.
  • the apparatus further includes a memory.
  • the apparatus further includes a communication interface to which the processor is coupled.
  • the apparatus is a terminal device.
  • the communication interface may be a transceiver, or an input/output interface.
  • the apparatus is a chip configured in a terminal device.
  • the communication interface may be an input/output interface.
  • the transceiver may be a transceiver circuit.
  • the input/output interface may be an input/output circuit.
  • a processor including: an input circuit, an output circuit, and a processing circuit.
  • the processing circuit is configured to receive a signal through the input circuit and transmit a signal through the output circuit, so that the processor executes the method in any one of the possible implementations of the first aspect and the second aspect.
  • the above-mentioned processor may be a chip
  • the input circuit may be an input pin
  • the output circuit may be an output pin
  • the processing circuit may be a transistor, a gate circuit, a flip-flop, and various logic circuits.
  • the input signal received by the input circuit may be received and input by, for example, but not limited to, a receiver
  • the signal output by the output circuit may be, for example, but not limited to, output to and transmitted by a transmitter
  • the circuit can be the same circuit that acts as an input circuit and an output circuit at different times.
  • the embodiments of the present application do not limit the specific implementation manners of the processor and various circuits.
  • an apparatus including a processor and a memory.
  • the processor is used for reading the instructions stored in the memory, and can receive signals through the receiver and transmit signals through the transmitter, so as to execute the method in any possible implementation manner of the first aspect and the second aspect.
  • in a specific implementation, there are one or more processors and one or more memories.
  • the memory may be integrated with the processor, or the memory may be provided separately from the processor.
  • the memory can be a non-transitory memory, such as a read-only memory (ROM), which can be integrated with the processor on the same chip or provided separately on different chips; the embodiment of the present application does not limit the type of the memory or the manner in which the memory and the processor are arranged.
  • sending information may be a process of outputting information from the processor
  • receiving information may be a process of receiving information by the processor.
  • the data output by the processor can be output to the transmitter, and the input data received by the processor can come from the receiver.
  • the transmitter and the receiver may be collectively referred to as a transceiver.
  • the device in the above-mentioned eighth aspect may be a chip, and the processor may be implemented by hardware or by software.
  • When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented by software, the processor may be a general-purpose processor implemented by reading software code stored in a memory, which may be integrated in the processor or located outside the processor and exist independently.
  • a ninth aspect provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by an apparatus for managing a model, the apparatus is enabled to implement the method in any possible implementation manner of the first aspect.
  • a tenth aspect provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by an apparatus for managing a model, the apparatus is enabled to implement the method in any possible implementation manner of the second aspect.
  • a computer program product comprising instructions that, when executed by a computer, cause an apparatus for managing models to implement the method provided in the first aspect.
  • a twelfth aspect provides a computer program product comprising instructions that, when executed by a computer, cause an apparatus for managing models to implement the method provided by the second aspect.
  • a communication system including the aforementioned server and terminal device.
  • the communication system further includes access network equipment and/or core network equipment.
  • FIG. 1 is a schematic structural diagram of a mobile communication system to which an embodiment of the present application is applied;
  • Figure 2 is an example diagram of an architecture based on mobile edge computing
  • Figure 3 is an example diagram of an architecture based on cloud computing
  • FIG. 4 is a schematic diagram of a method for managing a model according to an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of an apparatus for managing a model provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of another apparatus for managing a model provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • The technical solutions of the embodiments of the present application can be applied to various communication systems, for example: long term evolution (LTE) systems, frequency division duplex (FDD) systems, time division duplex (TDD) systems, the universal mobile telecommunication system (UMTS), worldwide interoperability for microwave access (WiMAX) communication systems, future 5th generation (5G) or new radio (NR) systems, device-to-device (D2D) communication systems, machine communication systems, vehicle networking communication systems, Internet of things (IoT) systems, and non-terrestrial network (NTN) systems.
  • FIG. 1 is a schematic structural diagram of a mobile communication system to which an embodiment of the present application is applied.
  • the mobile communication system includes a core network device 110 , an access network device 120 and at least one terminal device (such as the terminal device 130 and the terminal device 140 in FIG. 1 ).
  • the terminal equipment is wirelessly connected to the access network equipment, and the access network equipment is wirelessly or wiredly connected to the core network equipment.
  • the core network device and the access network device can be independent physical devices, or the logical functions of the core network device and the access network device can be integrated on the same physical device, or part of the functions of each can be integrated on one physical device.
  • Terminal equipment can be fixed or movable.
  • FIG. 1 is just a schematic diagram, and the communication system may also include other network devices, such as wireless relay devices and wireless backhaul devices, which are not shown in FIG. 1 .
  • the embodiments of the present application do not limit the number of core network devices, access network devices, and terminal devices included in the mobile communication system.
  • the terminal device in this application can communicate with the access network device.
  • terminal equipment may also be referred to as user equipment (UE), an access terminal, a subscriber unit, a subscriber station, a mobile station, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user device.
  • the terminal device in the embodiment of the present application may be a mobile phone, a satellite phone, a cellular phone, a cordless phone, a session initiation protocol (SIP) phone, a tablet computer (pad), or a computer with a wireless transceiver function.
  • the embodiments of the present application do not limit application scenarios.
  • the terminal device may also refer to a chip with a communication function, etc., which is not limited in this embodiment of the present application.
  • the technical solutions of the present application are described in detail by taking a terminal device as an example.
  • the access network device in this application may be a radio access network (radio access network, RAN) device.
  • RAN equipment includes but is not limited to: evolved NodeB (eNB or eNodeB), radio network controller (RNC), NodeB (NB), base station controller (BSC), base transceiver station (BTS), home base station (for example, home evolved NodeB or home NodeB, HNB), baseband unit (BBU), access point (AP) in a wireless fidelity (WiFi) system, wireless relay node, wireless backhaul node, transmission point (TP), or transmission and reception point (TRP); it may also be a gNB or transmission point (TRP or TP) in a 5G (NR) system, a base station in a future mobile communication system, or an access node in a Wi-Fi system.
  • a gNB may include a centralized unit (CU) and a distributed unit (DU).
  • the gNB may also include an active antenna unit (AAU).
  • the CU implements some functions of the gNB, and the DU implements some functions of the gNB.
  • the CU is responsible for processing non-real-time protocols and services, and implementing functions of radio resource control (RRC) and packet data convergence protocol (PDCP) layers.
  • the DU is responsible for processing physical layer protocols and real-time services, and implementing the functions of the radio link control (RLC) layer, the media access control (MAC) layer and the physical (PHY) layer.
  • the AAU implements some physical layer processing functions, radio frequency processing, and related functions of active antennas. Since the information of the RRC layer will eventually become the information of the PHY layer, or is transformed from the information of the PHY layer, in this architecture higher-layer signaling, such as RRC layer signaling, can also be considered to be sent by the DU, or by the DU and the AAU.
  • the network device may be a device including one or more of a CU node, a DU node, and an AAU node.
  • the CU can be divided into network devices in the access network, and the CU can also be divided into network devices in a core network (core network, CN), which is not limited in this application.
  • the access network device may also refer to a chip with communication functions, or may be a device that assumes a base station function in satellite, device-to-device (D2D), vehicle-to-everything (V2X), or machine-to-machine (M2M) communication, and so on.
  • the solutions of the embodiments of the present application may be applicable to edge computing scenarios or cloud computing scenarios.
  • the server in this application has the computing resources required for deep learning, and can handle complex computing tasks (eg, large computing power, huge data, etc.).
  • the server can be a mobile edge computing (MEC) device or a cloud server.
  • the hardware resources required for deep learning are deployed on the MEC device, so that the terminal device can move tasks with high power consumption and high computing power requirements to the MEC side, while the terminal device is locally responsible for tasks with lower complexity (such as result rendering and display), reducing the computing load of the terminal device.
  • the remote server will be referred to as a server for short in the following.
  • FIG. 2 is an example diagram of an architecture based on mobile edge computing.
  • the architecture includes a base station, an MEC server and n UEs (UE1, UE2, ... UEn), where n is a positive integer.
  • the MEC server can be deployed in core network equipment.
  • the MEC server has resources for processing tasks, such as communication resources, computing resources, and storage resources shown in FIG. 2 .
  • the task processing of the UE requires certain communication resources, computing resources and storage resources (for example, FIG. 2 shows the resources required by task 1 in UE1, task 2 in UE2, ... task n in UEn).
  • the task processing process of the UE can be offloaded to the MEC server side. As shown in FIG. 2, the processing of task 1 in UE1, task 2 in UE2, ... task n in UEn can be implemented on the MEC server side.
  • the interaction between the UE and the MEC server may be forwarded by the base station or the core network device, which is not specifically limited.
  • FIG. 3 is an example diagram of an architecture based on cloud computing.
  • the architecture includes cloud server, server, central processing unit, different types of access devices (such as access point AP, base station, satellite device), and different types of UE (such as tablet, mobile phone, laptop).
  • the cloud server is connected to the server through a network.
  • the uplink and downlink communication is implemented between the access network device and the UE through a communication link. Communication between the access network equipment and the central processor is also possible.
  • part or all of the data processing and storage functions of the UE can be moved to the cloud server.
  • FIG. 1 to FIG. 3 are only exemplary descriptions, and do not limit the protection scope of the embodiments of the present application.
  • the technical solutions of the embodiments of the present application can also be applied to other communication systems.
  • the present application does not limit the devices or network elements included in the communication systems of FIGS. 1 to 3 .
  • the communication system may also include other devices, which are not limited.
  • the present application does not specifically limit the number of devices or network elements included in FIG. 1 to FIG. 3 .
  • the terminal device or server includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer.
  • This hardware layer includes hardware such as central processing unit (CPU), memory management unit (MMU), and memory (also called main memory).
  • the operating system may be any one or more computer operating systems that implement business processing through processes, such as a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a Windows operating system.
  • the application layer includes applications such as browsers, address books, word processing software, and instant messaging software.
  • the embodiments of the present application do not specifically limit the specific structure of the execution body of the methods provided by the embodiments of the present application, as long as a program recording the code of these methods can be run to execute the methods provided by the embodiments of the present application.
  • the execution subject of the method provided by the embodiment of the present application may be a terminal device or a server, or a functional module in the terminal device or server that can call and execute a program.
  • various aspects or features of the present application may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques.
  • article of manufacture encompasses a computer program accessible from any computer readable device, carrier or medium.
  • computer readable media may include, but are not limited to: magnetic storage devices (eg, hard disks, floppy disks, or magnetic tapes, etc.), optical disks (eg, compact discs (CDs), digital versatile discs (DVDs) etc.), smart cards and flash memory devices (eg, erasable programmable read-only memory (EPROM), card, stick or key drives, etc.).
  • various storage media described herein can represent one or more devices and/or other machine-readable media for storing information.
  • the term "machine-readable medium” may include, but is not limited to, wireless channels and various other media capable of storing, containing, and/or carrying instructions and/or data.
  • the following describes a method for managing a model according to an embodiment of the present application with reference to FIG. 4 .
  • FIG. 4 is a schematic diagram of a method 300 for managing a model according to an embodiment of the present application.
  • the server in method 300 may be the MEC server in FIG. 2 or the cloud server in FIG. 3 , which is not limited.
  • the method 300 includes:
  • the terminal device triggers a model move-up mechanism, where the model move-up mechanism is used to upload the first model of the terminal device to the server, and the first model is a model related to the task to be processed (eg, application data).
  • the task to be processed refers to a reasoning task of a certain application of the terminal device.
  • the first model is a model related to the task to be processed by the terminal device, which can be understood as: the first model is a training model or a machine learning model for processing the task to be processed by the terminal device.
  • For example, if the task to be processed is an image task of the terminal device, the first model is a training model for processing the image task (eg, image data).
  • the first model in this application is a machine learning model (also referred to as a target model), and can run in the environment of a terminal device.
  • the application data is input into the target model, and the processing result, operation result, or inference result of the application data can be obtained.
  • the first model may be a mathematical model such as a neural network model, a support vector machine model, or the like.
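  • The relationship above (application data in, inference result out) can be illustrated with a trivial stand-in; the "model" below is a toy for illustration only, not a real neural network or support vector machine.

```python
def run_target_model(model, application_data):
    """Feed the application data into the target model to get an inference result."""
    return model(application_data)

# toy stand-in "model": classifies a payload by its size (purely illustrative)
toy_model = lambda data: "large" if len(data) > 4 else "small"
```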
  • Computing power offloading, which can also be referred to as computing power moving up, refers to offloading or moving part or all of the computing functions or storage functions of the tasks to be processed on the terminal device up to the server side for processing.
  • Determining to offload computing power can be understood as triggering the computing power offloading mechanism.
  • when the computing power offloading mechanism is triggered, the terminal device can also trigger (or drive) the model move-up mechanism at the same time and upload the model file on the terminal device side to the server. That is, the terminal device transmits the model file on the terminal device side to the server when the computing power is offloaded.
  • this application does not specifically limit the triggering method of the computing power offloading mechanism, which may be triggered by a user or by a terminal device.
  • the model moving mechanism is used to upload the first model of the terminal device to the server, which specifically includes: uploading the model file (eg, model structure and model parameters, etc.) of the first model of the terminal device to the server.
  • the model parameters refer to parameters related to the model, such as weights, biases and other parameters.
  • the present application does not specifically limit the manner of triggering the model move-up mechanism.
  • If the terminal device determines that an application (APP) has a high computing power requirement or a high computing requirement, it can automatically trigger the model move-up mechanism and upload the local model file and application data to the server, so that the server processes the computing tasks, saving the computing overhead of the terminal.
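  • A sketch of such an automatic trigger decision follows; the GFLOPS metric, the threshold value, and the battery condition are assumptions made for illustration, not values specified by the application.

```python
OFFLOAD_THRESHOLD_GFLOPS = 10.0  # hypothetical local compute budget

def should_trigger_offload(task_gflops: float, battery_low: bool = False) -> bool:
    """Decide whether to trigger computing power offloading (and model move-up)."""
    return task_gflops > OFFLOAD_THRESHOLD_GFLOPS or battery_low
```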
  • the terminal device sends a first message, where the first message includes a model file of the first model of the terminal device and information of a task to be processed.
  • the server obtains the model file of the first model of the terminal device and the information of the task to be processed.
  • the purpose of the terminal device sending the first message is to upload or transmit the local model file of the terminal device (such as the model file of the first model) and the information of the task to be processed to the server side, so that the server side can process the application data.
  • the information of the task to be processed includes application data. It can be understood that, in addition to the application data, the information of the to-be-processed task may also include other information related to the to-be-processed task, which is not specifically limited.
  • the first message may be in the form of an existing message, or may be a newly defined message.
  • the first message may be an inference request message sent by the terminal device.
  • the terminal device may send the first message to the server through the network device.
  • the network device is an access network device or a core network device.
  • The terminal device sends a first message to the access network device, where the first message includes the model file of the first model and the application data of the terminal device; the access network device sends the model file of the first model and the application data of the terminal device to the core network device; after receiving them, the core network device sends the model file of the first model and the application data of the terminal device to the server.
  • the terminal device sends a first message to the access network device, where the first message includes the model file of the first model and the application data of the terminal device; the access network device sends the model file of the first model and the application data of the terminal device to the server.
  • this application does not specifically limit how the server obtains the model file uploaded by the terminal device and the information of the task to be processed.
  • The server may obtain the information of the first model of the terminal device and the pending tasks of the terminal device from the access network device, or obtain them from the core network device.
  • This application does not limit the form or name of the message exchanged between the server and the network device (access network device or core network device) that carries the model file uploaded by the terminal device and the information of the task to be processed; an existing message form may be used, or a newly defined message.
  • the access network device or the core network device may transparently transmit the first message to the server.
  • The access network device or the core network device can process the first message to obtain a second message (the second message includes the model file uploaded by the terminal device and the information of the task to be processed), and then send the second message to the server.
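The two forwarding options above (transparent transmission of the first message, versus processing it into a second message) could be sketched as follows; the message structure and field names are illustrative assumptions:

```python
def transparent_forward(first_message: dict) -> dict:
    # Option 1: the network device passes the first message through unchanged.
    return first_message

def rewrap_forward(first_message: dict) -> dict:
    # Option 2: the network device processes the first message into a second
    # message that still carries the model file and the task information.
    return {
        "model_file": first_message["model_file"],
        "task_info": first_message["task_info"],
        "forwarded_by": "access_or_core_network_device",  # illustrative field
    }
```

Either way, the server ends up holding the same two payloads: the model file and the information of the task to be processed.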
  • the server may perform conversion processing on the first model of the terminal device, so that the converted model can be adapted to the operating environment of the server.
  • the server converts the first model into a second model, where the second model can run in the running environment of the server.
  • first model and the second model have different requirements on the physical environment.
  • the first model can run in the running environment of the terminal device.
  • the second model can run on the server's runtime environment.
  • After obtaining the model file of the first model uploaded by the terminal device, the server uses model conversion technology (or model adaptation technology) to convert or adapt the first model to the second model according to the model file of the first model and in combination with the server's own operating environment, so that the second model is adapted to the operating environment of the server.
  • the model conversion technology refers to the technical means adopted by those skilled in the art in order to convert the model uploaded by the terminal device into a model suitable for the operating environment of the server, and the specific method is not limited.
  • The server uses model conversion technology to convert the TensorFlow model into a Caffe model; that is, the second model is the Caffe model, and the Caffe model is adapted to the Caffe environment of the server.
  • the server can also convert or adapt the model to a model that can run in the server's operating environment. That is to say, the server can convert models of different types of terminal devices based on its own platform operating environment, and the converted models can run in the environment of the server.
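One way to picture this conversion step is a registry of converters keyed by (source framework, server framework); the converter below is a placeholder, since the application deliberately leaves the concrete conversion technique open to those skilled in the art:

```python
from typing import Callable, Dict, Tuple

# Registry of model converters: (source_framework, server_framework) -> converter.
# The registered converter below is a stand-in; a real server would plug in an
# actual model-conversion tool of the implementer's choosing.
_CONVERTERS: Dict[Tuple[str, str], Callable[[bytes], bytes]] = {}

def register_converter(src: str, dst: str):
    def wrap(fn):
        _CONVERTERS[(src, dst)] = fn
        return fn
    return wrap

@register_converter("tensorflow", "caffe")
def _tf_to_caffe(model_file: bytes) -> bytes:
    return b"caffe:" + model_file      # placeholder transformation

def adapt_to_server(model_file: bytes, src_framework: str, server_framework: str) -> bytes:
    """Convert the first model into a second model that can run in the
    server's operating environment."""
    if src_framework == server_framework:
        return model_file              # already adapted, nothing to do
    return _CONVERTERS[(src_framework, server_framework)](model_file)
```

The registry shape matches the text's point: the server adapts whatever the terminal uploads to its own environment, so no per-model environment needs to be deployed in advance.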
  • the server side can obtain the model of the terminal device, and there is no need to deploy additional models on the server side. There is also no need to build an additional model repository on the server side to store models.
  • a unified virtualized environment is deployed on the server side, that is, models uploaded by different terminal devices can run in the unified virtualized environment after being adapted. That is, for models of different types of terminal devices, the server can process requests from the terminal device side in a unified virtualization environment, which reduces the management complexity on the server side.
  • The server can uniformly convert different types of target models uploaded by multiple terminal devices (for example, Caffe models, Davinci models, etc.) into TensorFlow models, so that the different types of target models can run on the server side.
  • If the virtual environment deployed on the server side includes the TensorFlow framework (also referred to as the TF environment), then after the server obtains the model files of different types of terminal devices, it only needs to convert each terminal device's model into a model that matches the TensorFlow framework, without having to customize an environment for each model. Since the server can process different model types in a unified virtualization environment, the server does not need to load or deploy a corresponding virtualized environment for each model type. Application developers likewise do not need to deploy or customize different types of virtual environments on the server according to the type of the target model, thereby reducing development costs.
  • the virtualization environment in this application may include virtualization technologies such as containers (docker) and virtual machines, which are not specifically limited.
  • the server caches the application data uploaded by the terminal device locally, and notifies the container manager to schedule the work container.
  • S340 The server schedules a first virtualized container, and loads the second model and the to-be-processed task of the terminal device into the first virtualized container.
  • the first virtualized container refers to a working container. After the server converts the first model to the second model, the work container can be scheduled so that the work container can process tasks to be processed on the terminal device.
  • Scheduling the first virtualized container by the server includes: the server queries for an idle container; if there is an idle container, the server uses the idle container as the first virtualized container; or, if there is no idle container, the server obtains the first virtualized container through an instantiation operation.
  • Idle containers are containers that are not occupied or in use.
  • If there is an idle container, the server can use the idle container directly. In this way, the server reuses an existing idle container and does not need to re-pull the image from the image repository and instantiate it into a container, thus saving the overhead of image pulling and instantiation. If there is no idle container, the server pulls the image from the image repository and instantiates it into a container to obtain the first virtualized container.
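The scheduling logic of S340 can be sketched as follows; the container pool and the image pull are simulated in plain Python rather than driven by a real container runtime, so the class name and fields are assumptions of this sketch:

```python
class ContainerManager:
    """Simplified container manager: reuse an idle container if one exists,
    otherwise pull the image and instantiate a new container."""

    def __init__(self):
        self.idle: list = []        # pool of idle (unoccupied) containers
        self.pulls = 0              # how many image pulls were needed

    def _pull_and_instantiate(self) -> dict:
        self.pulls += 1             # image pull + instantiation has a cost
        return {"id": f"container-{self.pulls}", "busy": False}

    def schedule(self) -> dict:
        if self.idle:               # idle container found: reuse it directly,
            c = self.idle.pop()     # saving pull/instantiation overhead
        else:                       # no idle container: pull and instantiate
            c = self._pull_and_instantiate()
        c["busy"] = True
        return c

    def release(self, c: dict):
        c["busy"] = False
        self.idle.append(c)
```

The `pulls` counter makes the claimed saving visible: a released container is handed out again without a second pull.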
  • the terminal device may upload the local model file to the server.
  • the server converts or adapts the model uploaded by the terminal device, and the converted model can be run on the server side, that is, adapted to the running environment of the server.
  • The server can use model adaptation technology to convert the model on the terminal device side into a model that can run on the server side. In this way, application developers do not need to develop models adapted to the server side according to the hardware resources of the server, nor do they need to deploy different types of virtualized containers on the server side according to the model type, which reduces development costs and reduces the management complexity of the server side.
  • For ease of description, the terminal device side is referred to as the "end side" for short, and the server side is referred to as the "edge side".
  • Coordination between the end side and the edge side, together with model up-moving, enables intelligent loading on the edge side and allows different types of models to be adapted without additional deployment on the edge side.
  • the server side may return the processing result after processing the task on the terminal device side.
  • The method 300 further includes: the server returning a processing result for the to-be-processed task, where the processing result is obtained by the first virtualization container processing the second model and the to-be-processed task.
  • the terminal device receives the processing result.
  • the server starts the first virtualized container, and uses the second model to process to-be-processed tasks of the terminal device, so as to respond to the inference request on the terminal side.
  • After the processing by the first virtualized container is completed, the server returns the processing result of the to-be-processed task to the terminal device, so that the terminal device can perform further processing. In this way, the to-be-processed task is completed on the server side, thereby saving power consumption on the terminal device side.
  • the server can return the processing result to the terminal device through the network device.
  • the server sends the processing result to the core network device, and then the core network device sends the processing result to the access network device, and finally the access network device sends the processing result to the terminal device.
  • The server may include a data processing module, a container manager, and a working container instance; the server-side method in the present application is then implemented through the data processing module, the container manager, and the working container instance.
  • the data processing module in the server can store the application data transmitted from the terminal device side and cache the application data locally.
  • the data processing module in the server can convert the model uploaded by the terminal device into a model suitable for the container environment of the server.
  • The data processing module in the server notifies the container manager to schedule the container.
  • The container manager determines whether there is an idle container by querying for one. If there is no idle container, it pulls the image, instantiates it into a container, and finally starts the container as the working container. If there is an idle container, it uses that idle container as the working container.
  • the data processing module in the server loads the application data and the transformed model into the worker container.
  • The work container starts the inference process and responds to the inference request from the terminal device. After the work container finishes processing, the processing result or inference result is returned to the terminal device side for further processing by the terminal device side.
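Putting the three server-side roles together (data processing module, container manager, work container), the flow described above might look like the following sketch; the conversion function, container scheduler, and inference stub are injected placeholders rather than real implementations:

```python
def handle_first_message(msg, convert, schedule_container):
    """Server-side flow: cache the application data, adapt the model,
    schedule a work container, load model + data into it, and return the
    processing result to the terminal device.

    `convert` and `schedule_container` are injected so the sketch stays
    independent of any particular conversion tool or container runtime.
    """
    cached = msg["task_info"]["app_data"]          # data processing module caches the data
    second_model = convert(msg["model_file"])      # first model -> second model
    container = schedule_container()               # container manager schedules a worker
    container["model"] = second_model              # load the second model into the container
    container["data"] = cached                     # load the to-be-processed task data
    # The work container runs the inference; stubbed here as a plain function call.
    result = container["infer"](second_model, cached)
    return result                                  # returned to the terminal device side
```

The ordering mirrors S330/S340 in the text: conversion happens before scheduling, and the container only ever sees the already-adapted second model.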
  • the server includes a data processing module, a container manager, and an instance of a work container as an example for description here, and does not constitute a limitation to the embodiments of the present application.
  • the size of the sequence numbers of the above-mentioned processes does not mean the sequence of execution, and the execution sequence of each process should be determined by its functions and inherent logic.
  • the various numerical numbers or serial numbers involved in the above processes are only for the convenience of description, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • the embodiments of the present application further provide corresponding apparatuses, and the apparatuses include corresponding modules for executing the foregoing embodiments.
  • the modules may be software, hardware, or a combination of software and hardware. It can be understood that the technical features described in the method embodiments are also applicable to the following apparatus embodiments.
  • FIG. 5 is a schematic block diagram of an apparatus for managing a model provided by an embodiment of the present application.
  • the apparatus 1000 of the management model may include a transceiver unit 1100 and a processing unit 1200 .
  • the apparatus 1000 of the management model may correspond to the server in the above method embodiment, for example, may be a server or a chip configured in the server.
  • the apparatus 1000 for managing the model may correspond to the server in the method 300 according to the embodiment of the present application, and the apparatus 1000 for managing the model may include a unit for executing the method executed by the server in the method 300 in FIG. 4 .
  • each unit in the apparatus 1000 for managing the model and the above-mentioned other operations or functions are respectively for realizing the corresponding flow of the server in the method 300 in FIG. 4 .
  • the transceiver unit 1100 and the processing unit 1200 may be respectively used for:
  • a transceiver unit 1100 configured to acquire a model file of a first model of a terminal device and information about tasks to be processed of the terminal device, where the first model is a model related to the task to be processed, and the first model can run in the operating environment of the terminal device.
  • the processing unit 1200 is configured to convert the first model into a second model, where the second model can run in the operating environment of the server; and is further configured to schedule a first virtualized container, and convert the second model and the to-be-processed task is loaded into the first virtualized container.
  • The processing unit 1200 being configured to schedule the first virtualized container includes: querying for an idle container; if there is an idle container, using the idle container as the first virtualized container; or, when there is no idle container, obtaining the first virtualized container through an instantiation operation.
  • The transceiver unit 1100 is further configured to return a processing result for the to-be-processed task, where the processing result is obtained by the first virtualization container processing the second model and the to-be-processed task.
  • The transceiver unit 1100 in the apparatus 1000 may correspond to the transceiver 830 in the apparatus 800 shown in FIG. 6, and the processing unit 1200 in the apparatus 1000 may correspond to the processor 810 in the apparatus 800 shown in FIG. 6.
  • The transceiver unit 1100 in the apparatus 1000 may be an input/output interface circuit.
  • the apparatus 1000 for managing a model further includes a storage unit, which can be used to store instructions or data, and the processing unit can call the instructions or data stored in the storage unit to implement corresponding operations.
  • the storage unit may be implemented by at least one memory, for example, may correspond to the memory 820 in the device 800 in FIG. 6 .
  • the apparatus 1000 for the management model may correspond to the terminal device in the above method embodiments, for example, it may be a terminal device or a chip configured in the terminal device.
  • The apparatus 1000 of the management model may correspond to the terminal device in the method 300 according to the embodiment of the present application, and the apparatus 1000 may include units for executing the method performed by the terminal device in the method 300 in FIG. 4. Moreover, each unit in the apparatus 1000 and the above-mentioned other operations or functions are respectively for realizing the corresponding process of the terminal device in the method 300 in FIG. 4.
  • the transceiver unit 1100 and the processing unit 1200 may be respectively used for:
  • The processing unit 1200 is configured to trigger a model up-moving mechanism when it is determined to offload the computing power of the task to be processed, where the model up-moving mechanism is used to upload the first model of the terminal device to the server, the first model is a model related to the to-be-processed task, and the first model can run in the running environment of the terminal device.
  • the transceiver unit 1100 is configured to send the first message, where the first message includes the model file of the first model and the information of the to-be-processed task.
  • The transceiver unit 1100 is further configured to receive a processing result for the to-be-processed task, where the processing result is obtained by the first virtualization container processing the second model and the to-be-processed task, and the second model can run in the operating environment of the server.
  • The transceiver unit 1100 in the apparatus 1000 of the management model may correspond to the transceiver 2020 of the terminal device 2000 shown in FIG. 7, and the processing unit 1200 may correspond to the processor 2010 in the terminal device 2000 shown in FIG. 7.
  • The transceiver unit 1100 in the apparatus 1000 of the management model may be an input/output interface circuit.
  • the apparatus 1000 for managing a model further includes a storage unit, which can be used to store instructions or data, and the processing unit can call the instructions or data stored in the storage unit to implement corresponding operations.
  • the storage unit may be implemented by at least one memory, for example, may correspond to the memory 2030 in the terminal device 2000 in FIG. 7 .
  • an embodiment of the present application further provides an apparatus 800 for managing models.
  • The apparatus 800 includes a processor 810 coupled to a memory 820; the memory 820 is used to store computer programs or instructions and/or data, and the processor 810 is used to execute the computer programs or instructions and/or data stored in the memory 820, so that the methods in the above method embodiments are performed.
  • the apparatus 800 includes one or more processors 810 .
  • the apparatus 800 may further include a memory 820 .
  • the device 800 may include one or more memories 820 .
  • the memory 820 may be integrated with the processor 810, or provided separately.
  • the apparatus 800 may further include a transceiver 830, and the transceiver 830 is used for signal reception and/or transmission.
  • the processor 810 is used to control the transceiver 830 to receive and/or transmit signals.
  • the apparatus 800 is used to implement the operations performed by the server in the above method embodiments.
  • The processor 810 is configured to implement the processing-related operations performed by the server in the above method embodiments, and the transceiver 830 is configured to implement the transceiving-related operations performed by the server in the above method embodiments.
  • FIG. 6 only shows a schematic simplified block diagram.
  • the apparatus 800 may also include other elements, including but not limited to any number of transceivers, processors, controllers, memories, etc., which are not specifically limited.
  • the apparatus 800 may be a chip, such as a chip that can be used in a server, for implementing the relevant functions of the processor 810 .
  • The chip can be a field programmable gate array, an application-specific integrated chip, a system chip, a central processing unit, a network processor, a digital signal processing circuit, a microcontroller, a programmable controller, or another integrated chip for realizing related functions.
  • the chip may optionally include one or more memories for storing program codes, and when the codes are executed, make the processor implement corresponding functions.
  • FIG. 7 is a schematic structural diagram of a terminal device 2000 provided by an embodiment of the present application.
  • the terminal device 2000 can be applied to the system shown in FIG. 1 to perform the functions of the terminal device in the foregoing method embodiments.
  • the terminal device 2000 includes a processor 2010 and a transceiver 2020 .
  • the terminal device 2000 further includes a memory 2030 .
  • The processor 2010, the transceiver 2020 and the memory 2030 can communicate with each other through an internal connection path to transmit control or data signals.
  • the memory 2030 is used to store computer programs
  • The processor 2010 is used to call the computer program from the memory 2030 and run it, so as to control the transceiver 2020 to send and receive signals.
  • the terminal device 2000 may further include an antenna 2040 for sending the uplink data or uplink control signaling output by the transceiver 2020 through wireless signals.
  • the above-mentioned processor 2010 and the memory 2030 can be combined into a processing device, and the processor 2010 is configured to execute the program codes stored in the memory 2030 to realize the above-mentioned functions.
  • the memory 2030 may also be integrated in the processor 2010 or independent of the processor 2010 .
  • the processor 2010 may correspond to the processing unit in FIG. 5 .
  • the foregoing transceiver 2020 may correspond to the communication unit in FIG. 5 , and may also be referred to as a transceiver unit.
  • the transceiver 2020 may include a receiver (or receiver, receiving circuit) and a transmitter (or transmitter, transmitting circuit). Among them, the receiver is used for receiving signals, and the transmitter is used for transmitting signals.
  • the terminal device 2000 shown in FIG. 7 can implement various processes involving the terminal device in the method embodiment shown in FIG. 4 .
  • the operations or functions of each module in the terminal device 2000 are respectively to implement the corresponding processes in the foregoing method embodiments.
  • The above-mentioned processor 2010 may be used to perform the actions described in the foregoing method embodiments that are implemented inside the terminal device, and the transceiver 2020 may be used to perform the actions described in the foregoing method embodiments in which the terminal device sends to or receives from the network device.
  • the above terminal device 2000 may further include a power supply 2050 for providing power to various devices or circuits in the terminal device.
  • The terminal device 2000 may further include one or more of an input unit 2060, a display unit 2070, an audio circuit 2080, a camera 2090, a sensor 2100, etc., and the audio circuit may further include a speaker 2082, a microphone 2084, and the like.
  • The terminal device shown in FIG. 7 is only an example and does not limit the protection scope of the present application; the terminal device may also take other forms.
  • The present application also provides a computer program product. The computer program product includes computer program code; when the computer program code runs on a computer, the computer is made to execute the method on the server side in any of the above method embodiments, or to execute the method on the terminal device side in any of the above method embodiments.
  • The present application further provides a computer-readable storage medium, in which program code is stored; when the program code is run on a computer, the computer is made to execute the method on the server side in any of the above method embodiments, or to execute the method on the terminal device side in any of the above method embodiments.
  • An embodiment of the present application further provides a processing apparatus, including a processor and an interface; the processor is configured to execute the method for managing a model in any of the foregoing method embodiments.
  • An embodiment of the present application further provides a chip, including a processor and an interface; the processor reads the instructions stored in the memory through the interface, and is used to execute the method for managing the model in any of the above method embodiments.
  • An embodiment of the present application further provides a communication system, where the communication system includes the server and the terminal device in the above embodiment.
  • The apparatuses in the above apparatus embodiments completely correspond to the server and the terminal device in the method embodiments, and the corresponding steps are performed by the corresponding modules or units; for example, the communication unit (transceiver) performs the receiving or sending steps in the method embodiments, and steps other than sending and receiving can be performed by the processing unit (processor). For the functions of specific units, reference may be made to the corresponding method embodiments.
  • the number of processors may be one or more.
  • the processor in this embodiment of the present application may be an integrated circuit chip, which has a signal processing capability.
  • each step of the above method embodiments may be completed by a hardware integrated logic circuit in a processor or an instruction in the form of software.
  • The above-mentioned processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; it may also be a system on chip (SoC), a central processing unit (CPU), or a network processor (NP).
  • The processor can also be a digital signal processing circuit (DSP), a microcontroller (micro controller unit, MCU), a programmable logic device (PLD), or another integrated chip.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware.
  • A processing unit for performing the techniques at an apparatus may be implemented in one or more general-purpose processors, DSPs, digital signal processing devices, ASICs, programmable logic devices, FPGAs or other programmable logic devices, discrete gate or transistor logic, discrete hardware components, or any combination of the foregoing.
  • a general-purpose processor may be a microprocessor, or alternatively, the general-purpose processor may be any conventional processor, controller, microcontroller, or state machine.
  • A processor may also be implemented by a combination of computing devices, such as a digital signal processor and a microprocessor, multiple microprocessors, one or more microprocessors in combination with a digital signal processor core, or any other similar configuration.
  • the memory in this embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • By way of example but not limitation, many forms of RAM are available, such as dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM) and direct rambus random access memory (DR RAM).
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof.
  • When implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server or data center to another website, computer, server or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that includes an integration of one or more available media.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., high-density digital video discs (DVDs)), or semiconductor media (e.g., solid state disks (SSDs)), etc.
  • system and “network” are often used interchangeably herein.
  • The term "and/or" herein only describes an association relationship between associated objects, indicating that three relationships can exist; for example, A and/or B can mean: A exists alone, both A and B exist, or B exists alone, where A and B each can be singular or plural.
  • the character "/" generally indicates that the associated objects are an "or” relationship.
  • A component may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
  • Both an application running on a computing device and the computing device itself may be components.
  • One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers.
  • These components can execute from various computer-readable media having various data structures stored thereon.
  • A component may communicate through local and/or remote processes, for example, based on a signal having one or more data packets (e.g., data from two components interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet that interacts with other systems via signals).
  • The disclosed system, apparatus, and method may be implemented in other manners.
  • The apparatus embodiments described above are only illustrative.
  • The division of the units is only a logical function division; in actual implementation, there may be other division methods.
  • Multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The shown or discussed mutual coupling, direct coupling, or communication connection may be implemented through some interfaces, or through indirect coupling or communication connections between devices or units, and may be in electrical, mechanical, or other forms.
  • The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • Each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • In essence, the technical solution of the present application, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

The present application provides a method and apparatus for model management. By acquiring a model uploaded by a terminal device and converting it into a model that can run in the server-side running environment, the method helps reduce server-side management complexity: the server does not need to customize a running environment for the models of different types of terminal devices, which reduces development costs.

Description

Method and apparatus for managing models

Technical field

The present application relates to the field of communications, and more particularly, to a method and apparatus for managing models.

Background
Artificial intelligence (AI) comprises theories, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. Deep learning is an important technology in artificial intelligence. It aims to build a neural network that simulates the human brain for analysis and learning, and processes data by imitating the working mechanism of the human brain. Deep learning is widely used in many fields, such as image processing and speech recognition. At present, terminal devices also increasingly apply deep learning to support complex applications. However, increasingly complex computing tasks and ever-larger volumes of data make the computing resources required for deep learning larger and larger, which places higher requirements on the hardware resources of terminal devices. Due to factors such as small size and low power consumption, terminal devices cannot integrate computing resources with high computing power, which limits the application of deep learning on terminal devices.
To reduce the computing load of terminal devices, a mobile edge computing method has been introduced that deploys the hardware resources required for deep learning on mobile edge computing devices. Mobile edge computing devices must serve different types of terminal devices, so a variety of virtualized containers needs to be provided or deployed to handle requests from those different types of terminal devices, and the development cost is therefore high.
Summary of the Invention
In view of this, the present application provides a method and apparatus for managing models, which can reduce development costs and management complexity.
In a first aspect, a method of managing a model is provided. The method may, for example, be performed by a server, or by a component (e.g., a circuit or a chip) in the server; this application does not limit this. Optionally, the server may be an edge computing device, for example, a mobile edge computing (MEC) device.
The method includes: first, the server obtains a model file of a first model of a terminal device and information (e.g., application data) about a task to be processed of the terminal device, where the first model is a model related to the task to be processed and can run in the running environment of the terminal device; then, the server converts the first model into a second model that can run in the running environment of the server; finally, the server schedules a first virtualized container and loads the second model and the to-be-processed task into the first virtualized container.
The server converts or adapts the model uploaded by the terminal device so that the converted model can run on the server side, that is, it is adapted to the server's running environment. For different types of terminal devices, the server can use this model adaptation technique to convert a model on the terminal device side into a model that can run on the server side. In this way, application developers neither need to develop models adapted to the server side according to the server's hardware resources, nor need to deploy different types of virtualized containers on the server side according to model type, which reduces development costs and also reduces server-side management complexity.
That the first model is a model related to the task to be processed by the terminal device can be understood as: the first model is a trained model used to process the task to be processed by the terminal device. For example, if the task to be processed is an image task of the terminal device, the first model is a trained model for processing that image task (e.g., image data).
Optionally, the server obtains the model file of the first model of the terminal device and the information about the task to be processed through a network device (e.g., an access network device or a core network device).
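The server-side flow described above (receive the model file and task, convert the model, schedule a container, load both into it, and run) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class names, the request fields, the `"converted:"` marker, and the byte-string result are all assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class InferenceRequest:
    model_file: bytes   # first model, trained for the terminal device's runtime
    model_format: str   # terminal-side format, e.g. a hypothetical "tflite"
    task_data: bytes    # information about the task to be processed

class Container:
    def load(self, model: bytes, task: bytes) -> None:
        self.model, self.task = model, task

    def run(self) -> bytes:
        # Placeholder for executing the second model on the task data.
        return b"result-for:" + self.task

class ModelManager:
    SERVER_FORMAT = "server-runtime"   # placeholder for the server's environment

    def handle_request(self, req: InferenceRequest) -> bytes:
        second_model = self.convert(req.model_file, req.model_format)
        container = self.schedule_container()
        container.load(second_model, req.task_data)
        return container.run()   # processing result returned to the terminal device

    def convert(self, model_file: bytes, src_format: str) -> bytes:
        # Model adaptation: map the terminal-side format to the server runtime.
        # A real implementation would invoke a per-format converter here.
        if src_format == self.SERVER_FORMAT:
            return model_file
        return b"converted:" + model_file

    def schedule_container(self) -> Container:
        return Container()
```

The point of the sketch is the ordering: conversion happens before container scheduling, so a single server-side container image can serve models uploaded by any type of terminal device.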
In a possible implementation, the server scheduling the first virtualized container includes: the server queries for an idle container; if an idle container exists, the server uses the idle container as the first virtualized container; otherwise, the server obtains the first virtualized container through an instantiation operation.
That is, if an idle or unused container exists, the server can use it directly. In this way, the server reuses an existing idle container and does not need to pull the image from the image repository again and instantiate it into a container, thereby saving the overhead of image pulling and instantiation. If no idle container exists, the server needs to pull the image from the image repository and instantiate it into a container to obtain the first virtualized container.
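The scheduling rule above can be illustrated with a small sketch: reuse an idle container when one exists; otherwise pull the image and instantiate a new one. The pool, registry, and dictionary-based "container" are assumptions made for the sketch, not part of the patent.

```python
class Registry:
    """Stand-in for an image repository."""
    def pull(self, name: str) -> str:
        return f"image:{name}"

class ContainerScheduler:
    def __init__(self, image_registry: Registry):
        self.image_registry = image_registry
        self.idle_pool = []   # containers that finished earlier tasks

    def schedule(self, image_name: str):
        if self.idle_pool:
            # Reuse an idle container: saves the overhead of image
            # pulling and instantiation.
            return self.idle_pool.pop()
        # No idle container: pull the image and instantiate it.
        image = self.image_registry.pull(image_name)
        return self.instantiate(image)

    def instantiate(self, image: str):
        return {"image": image, "state": "running"}

    def release(self, container) -> None:
        # Returning a finished container to the pool keeps it available
        # for the next request instead of tearing it down.
        self.idle_pool.append(container)
```

A usage pattern would be: `schedule()` before loading the second model, `release()` after the processing result has been returned.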
Optionally, the method further includes: the server returns a processing result for the task to be processed, where the processing result is obtained by the first virtualized container processing the second model and the to-be-processed task. After the first virtualized container finishes processing, the server returns the processing result to the terminal device for further processing. In this way, the processing of the to-be-processed task is completed on the server side, which saves power on the terminal device side.
It can be understood that the present application does not limit the specific manner in which the server returns the processing result. The server may return the processing result to the terminal device through a network device.
In a second aspect, a method of managing a model is provided. The method may, for example, be performed by a terminal device, or by a component (e.g., a circuit or a chip) in the terminal device; this application does not limit this.
The method includes: when the terminal device determines to offload the computing power for a task to be processed, the terminal device triggers a model move-up mechanism, which is used to upload a first model of the terminal device to a server, where the first model is a model related to the task to be processed and can run in the running environment of the terminal device; the terminal device then sends a first message that includes the model file of the first model and the information about the task to be processed.
In the present application, when the computing-power offloading mechanism is triggered, the terminal device can also trigger (or drive) the model move-up mechanism at the same time to upload the model file on the terminal device side to the server. That is, the terminal device transmits the model file on the terminal device side to the server when offloading computing power. By uploading the model file to the server, the terminal device enables the server to convert or adapt the uploaded model so that the converted model can run on the server side, that is, it is adapted to the server's running environment. In this way, the task to be processed by the terminal device is moved to the server side for processing, which helps save the terminal device's power.
For different types of terminal devices, the server can use the model adaptation technique to convert a model on the terminal device side into a model that can run on the server side. In this way, application developers neither need to develop models adapted to the server side according to the server's hardware resources, nor need to deploy different types of virtualized containers on the server side according to model type, which reduces development costs and also reduces server-side management complexity.
That the first model is a model related to the task to be processed by the terminal device can be understood as: the first model is a trained model used to process the task to be processed by the terminal device. For example, if the task to be processed is an image task of the terminal device, the first model is a trained model for processing that image task (e.g., image data).
Optionally, the terminal device may send the first message to the server through a network device (e.g., an access network device or a core network device).
In a possible implementation, the first message may be an inference request message.
Optionally, the method further includes: the terminal device receives a processing result for the task to be processed, where the processing result is obtained by a first virtualized container processing a second model and the to-be-processed task, and the second model can run in the running environment of the server. In this way, the processing of the to-be-processed task is completed on the server side, which saves power on the terminal device side.
It can be understood that the present application does not limit the specific manner in which the terminal device receives the processing result. The terminal device may obtain the processing result through a network device.
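The terminal-side flow of the second aspect (decide whether to offload, build the first message carrying the model file and task information, send it, and receive the result; otherwise run locally) can be sketched as below. The cost threshold, message fields, and callback signatures are illustrative assumptions, not defined by the patent.

```python
def should_offload(task_cost: float, local_budget: float) -> bool:
    # Hypothetical trigger condition: offload when the task would exceed
    # what the terminal device can afford to compute locally.
    return task_cost > local_budget

def build_first_message(model_file: bytes, task_info: bytes) -> dict:
    return {
        "type": "inference_request",   # the first message, per the text
        "model_file": model_file,      # first model, runs on the terminal side
        "task": task_info,             # information about the task to be processed
    }

def process_task(task_cost, local_budget, model_file, task_info,
                 send, run_locally):
    if should_offload(task_cost, local_budget):
        # Model move-up mechanism: upload the model together with the task
        # and wait for the server's processing result.
        return send(build_first_message(model_file, task_info))
    # Otherwise the terminal device processes the task itself.
    return run_locally(model_file, task_info)
```

Here `send` stands for transmitting the first message through a network device and returning the server's processing result, and `run_locally` stands for on-device inference with the first model.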
In a third aspect, an apparatus for managing models is provided, where the apparatus is configured to execute the method provided in the first aspect. Specifically, the apparatus may include units for performing the method provided in the first aspect.
In a fourth aspect, an apparatus for managing models is provided, where the apparatus is configured to execute the method provided in the second aspect. Specifically, the apparatus may include units for performing the method provided in the second aspect.
In a fifth aspect, an apparatus for managing a model is provided, including a processor. The processor is coupled to a memory and can be used to execute instructions in the memory to implement the method in any one of the possible implementations of the first aspect. Optionally, the apparatus further includes the memory. Optionally, the apparatus further includes a communication interface, and the processor is coupled to the communication interface.
In an implementation, the apparatus is a server. When the apparatus is a server, the communication interface may be a transceiver or an input/output interface.
In another implementation, the apparatus is a chip configured in a server. When the apparatus is a chip configured in a server, the communication interface may be an input/output interface.
Optionally, the transceiver may be a transceiver circuit. Optionally, the input/output interface may be an input/output circuit.
In a sixth aspect, an apparatus for managing a model is provided, including a processor. The processor is coupled to a memory and can be used to execute instructions in the memory to implement the method in any one of the possible implementations of the second aspect. Optionally, the apparatus further includes the memory. Optionally, the apparatus further includes a communication interface, and the processor is coupled to the communication interface.
In an implementation, the apparatus is a terminal device. When the apparatus is a terminal device, the communication interface may be a transceiver or an input/output interface.
In another implementation, the apparatus is a chip configured in a terminal device. When the apparatus is a chip configured in a terminal device, the communication interface may be an input/output interface.
Optionally, the transceiver may be a transceiver circuit. Optionally, the input/output interface may be an input/output circuit.
In a seventh aspect, a processor is provided, including an input circuit, an output circuit, and a processing circuit. The processing circuit is configured to receive a signal through the input circuit and transmit a signal through the output circuit, so that the processor executes the method in any one of the possible implementations of the first and second aspects.
In a specific implementation, the above processor may be a chip, the input circuit may be an input pin, the output circuit may be an output pin, and the processing circuit may be transistors, gate circuits, flip-flops, various logic circuits, and the like. The input signal received by the input circuit may be received and input by, for example, but not limited to, a receiver; the signal output by the output circuit may be, for example, but not limited to, output to and transmitted by a transmitter; and the input circuit and the output circuit may be the same circuit, which acts as the input circuit and the output circuit at different times. The embodiments of the present application do not limit the specific implementations of the processor and the various circuits.
In an eighth aspect, an apparatus is provided, including a processor and a memory. The processor is configured to read instructions stored in the memory, and may receive signals through a receiver and transmit signals through a transmitter, so as to execute the method in any possible implementation of the first and second aspects.
Optionally, there are one or more processors and one or more memories.
Optionally, the memory may be integrated with the processor, or the memory may be provided separately from the processor.
In a specific implementation, the memory may be a non-transitory memory, such as a read-only memory (ROM), which may be integrated with the processor on the same chip or disposed on separate chips. The embodiments of the present application do not limit the type of the memory or the manner in which the memory and the processor are arranged.
It should be understood that, in the relevant data interaction process, sending information may be a process of outputting the information from the processor, and receiving information may be a process of the processor receiving the information. Specifically, the data output by the processor may be output to the transmitter, and the input data received by the processor may come from the receiver. The transmitter and the receiver may be collectively referred to as a transceiver.
The apparatus in the eighth aspect may be a chip, and the processor may be implemented by hardware or by software. When implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented by software, the processor may be a general-purpose processor implemented by reading software code stored in a memory, and the memory may be integrated in the processor or located outside the processor and exist independently.
In a ninth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by an apparatus for managing a model, the apparatus is enabled to implement the method in any possible implementation of the first aspect.
In a tenth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed by an apparatus for managing a model, the apparatus is enabled to implement the method in any possible implementation of the second aspect.
In an eleventh aspect, a computer program product containing instructions is provided. When the instructions are executed by a computer, an apparatus for managing models is enabled to implement the method provided in the first aspect.
In a twelfth aspect, a computer program product containing instructions is provided. When the instructions are executed by a computer, an apparatus for managing models is enabled to implement the method provided in the second aspect.
In a thirteenth aspect, a communication system is provided, including the aforementioned server and terminal device. Optionally, the communication system further includes an access network device and/or a core network device.
Description of drawings
FIG. 1 is a schematic diagram of the architecture of a mobile communication system to which an embodiment of the present application is applied;
FIG. 2 is an example diagram of an architecture based on mobile edge computing;
FIG. 3 is an example diagram of an architecture based on cloud computing;
FIG. 4 is a schematic diagram of a method for managing a model according to an embodiment of the present application;
FIG. 5 is a schematic block diagram of an apparatus for managing a model provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another apparatus for managing a model provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
Detailed description of embodiments
The technical solutions of the present application are described below with reference to the accompanying drawings.
The technical solutions of the embodiments of the present application can be applied to various communication systems, for example: a long term evolution (LTE) system, an LTE frequency division duplex (FDD) system, an LTE time division duplex (TDD) system, a universal mobile telecommunications system (UMTS), a worldwide interoperability for microwave access (WiMAX) communication system, a future fifth generation (5G) system or new radio (NR), a device-to-device (D2D) communication system, a machine communication system, a vehicle-to-everything communication system, an Internet of things system, a non-terrestrial network (NTN) satellite communication system, or a future communication system, which is not limited by the embodiments of the present application.
FIG. 1 is a schematic diagram of the architecture of a mobile communication system to which an embodiment of the present application is applied. As shown in FIG. 1, the mobile communication system includes a core network device 110, an access network device 120, and at least one terminal device (e.g., the terminal device 130 and the terminal device 140 in FIG. 1). A terminal device is connected to the access network device wirelessly, and the access network device is connected to the core network device wirelessly or by wire. The core network device and the access network device may be independent physical devices; alternatively, the functions of the core network device and the logical functions of the access network device may be integrated on the same physical device, or one physical device may integrate part of the functions of the core network device and part of the functions of the access network device. Terminal devices may be fixed or mobile. FIG. 1 is only a schematic diagram; the communication system may also include other network devices, such as wireless relay devices and wireless backhaul devices, which are not shown in FIG. 1. The embodiments of the present application do not limit the number of core network devices, access network devices, and terminal devices included in the mobile communication system.
The terminal device in this application can communicate with the access network device. It should be understood that a terminal device may also be referred to as user equipment (UE), an access terminal, a subscriber unit, a subscriber station, a mobile station, a remote station, a remote terminal, a mobile device, a user terminal, a terminal, a wireless communication device, a user agent, or a user apparatus. The terminal device in the embodiments of the present application may be a mobile phone, a satellite phone, a cellular phone, a cordless phone, a session initiation protocol (SIP) phone, a tablet computer (pad), a computer with a wireless transceiver function, a wireless local loop (WLL) station, a personal digital assistant (PDA), a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a terminal device in a 5G network, a terminal device in a future evolved public land mobile network (PLMN), or the like. The embodiments of the present application do not limit the application scenario. A terminal device may also refer to a chip with a communication function, which is not limited in the embodiments of the present application. For ease of description, the technical solutions of the present application are described in detail by taking a terminal device as an example.
本申请中的接入网设备可以是无线接入网(radio access network,RAN)设备。RAN设备包括但不限于:演进型节点B(evolved nodeB,eNB或eNodeB)、无线网络控制器(radio network controller,RNC)、节点B(node B,NB)、基站控制器(base station controller,BSC)、基站收发台(base transceiver station,BTS)、家庭基站(例如,home evolved node B,或home node B,HNB)、基带单元(base band unit,BBU),无线保真(wireless fidelity,WIFI)系统中的接入点(access point,AP)、无线中继节点、无线回传节点、传输点(transmission point,TP)或者发送接收点(transmission and reception point,TRP)等,还可以为5G,如,NR,系统中的gNB,或,传输点(TRP或TP),未来移动通信系统 中的基站或无线保真(wireless fidelity,Wi-Fi)系统中的接入节点,5G系统中的基站的一个或一组(包括多个天线面板)天线面板,或者,还可以为构成gNB或传输点的网络节点,如基带单元(baseband unit,BBU),或,分布式单元(distributed unit,DU)等。本申请的实施例对无线接入网设备所采用的具体技术和具体设备形态不做限定。在一些部署中,gNB可以包括集中式单元(centralized unit,CU)和DU。gNB还可以包括有源天线单元(active antenna unit,AAU)。CU实现gNB的部分功能,DU实现gNB的部分功能。比如,CU负责处理非实时协议和服务,实现无线资源控制(radio resource control,RRC),分组数据汇聚层协议(packet data convergence protocol,PDCP)层的功能。DU负责处理物理层协议和实时服务,实现无线链路控制(radio link control,RLC)层、媒体接入控制(media access control,MAC)层和物理(physical,PHY)层的功能。AAU实现部分物理层处理功能、射频处理及有源天线的相关功能。由于RRC层的信息最终会变成PHY层的信息,或者,由PHY层的信息转变而来,因而,在这种架构下,高层信令,如RRC层信令,也可以认为是由DU发送的,或者,由DU+AAU发送的。可以理解的是,网络设备可以为包括CU节点、DU节点、AAU节点中一项或多项的设备。此外,可以将CU划分为接入网中的网络设备,也可以将CU划分为核心网(core network,CN)中的网络设备,本申请对此不做限定。或者,接入网设备还可以指具备通信功能的芯片等,或者,还可以为卫星以及设备到设备(Device-to-Device,D2D)、车辆外联(vehicle-to-everything,V2X)、机器到机器(machine-to-machine,M2M)通信中承担基站功能的设备等等。The access network device in this application may be a radio access network (radio access network, RAN) device. 
RAN devices include, but are not limited to: an evolved NodeB (eNB or eNodeB), a radio network controller (RNC), a NodeB (NB), a base station controller (BSC), a base transceiver station (BTS), a home base station (for example, a home evolved NodeB or home NodeB, HNB), a baseband unit (BBU), an access point (AP) in a wireless fidelity (WiFi) system, a wireless relay node, a wireless backhaul node, a transmission point (TP), or a transmission and reception point (TRP). The access network device may also be a gNB or a transmission point (TRP or TP) in a 5G system such as NR, a base station in a future mobile communication system, an access node in a wireless fidelity (Wi-Fi) system, one antenna panel or a group of antenna panels (including multiple antenna panels) of a base station in a 5G system, or a network node that constitutes a gNB or a transmission point, such as a baseband unit (BBU) or a distributed unit (DU). The embodiments of the present application do not limit the specific technology and specific device form adopted by the radio access network device. In some deployments, a gNB may include a centralized unit (CU) and a DU. The gNB may also include an active antenna unit (AAU). The CU implements some functions of the gNB, and the DU implements other functions of the gNB. For example, the CU is responsible for processing non-real-time protocols and services, and implements the functions of the radio resource control (RRC) layer and the packet data convergence protocol (PDCP) layer. The DU is responsible for processing physical layer protocols and real-time services, and implements the functions of the radio link control (RLC) layer, the media access control (MAC) layer, and the physical (PHY) layer. The AAU implements some physical layer processing functions, radio frequency processing, and functions related to active antennas.
Since the information of the RRC layer eventually becomes the information of the PHY layer, or is transformed from the information of the PHY layer, in this architecture, higher-layer signaling, such as RRC layer signaling, may also be considered to be sent by the DU, or by the DU and the AAU. It can be understood that the network device may be a device including one or more of a CU node, a DU node, and an AAU node. In addition, the CU may be classified as a network device in the access network, or as a network device in the core network (CN), which is not limited in this application. Alternatively, the access network device may also refer to a chip with a communication function, or may be a satellite or a device that assumes a base station function in device-to-device (D2D), vehicle-to-everything (V2X), or machine-to-machine (M2M) communication, and so on.
The solutions of the embodiments of the present application are applicable to edge computing scenarios or cloud computing scenarios. In an edge computing or cloud computing scenario, part or all of the data processing and storage functions of a terminal device are moved to a remote server (referred to as a server for short). The server in this application has the computing resources required for deep learning, and can handle complex computing tasks (for example, tasks requiring large computing power or involving massive data). For example, the server may be a mobile edge computing (MEC) server or a cloud server. For instance, the hardware resources required for deep learning can be deployed on the MEC device, so that the terminal device can move tasks with high power consumption and high computing power requirements to the MEC side, while the terminal device remains locally responsible for tasks of lower complexity (such as result rendering and display), reducing the computing load of the terminal device. For convenience of description, the remote server is referred to as a server for short in the following.
The following describes a system architecture to which the embodiments of the present application are applied with reference to FIG. 2 and FIG. 3.
FIG. 2 is an example diagram of an architecture based on mobile edge computing. As shown in FIG. 2, the architecture includes a base station, an MEC server, and n UEs (UE1, UE2, ..., UEn), where n is a positive integer. The MEC server may be deployed in a core network device. The MEC server has resources for processing tasks, such as the communication resources, computing resources, and storage resources shown in FIG. 2. The task processing of a UE requires certain communication, computing, and storage resources (for example, FIG. 2 shows the resources required by task 1 of UE1, task 2 of UE2, ..., and task n of UEn). In the mobile edge computing architecture, the task processing of a UE can be offloaded to the MEC server side. As shown in FIG. 2, the processing of task 1 of UE1, task 2 of UE2, ..., and task n of UEn can be implemented on the MEC server side. The interaction between a UE and the MEC server may be forwarded by the base station or the core network device, which is not specifically limited.
FIG. 3 is an example diagram of an architecture based on cloud computing. As shown in FIG. 3, the architecture includes a cloud server, a server, a central processing unit, different types of access devices (for example, an access point AP, a base station, and a satellite device), and different types of UEs (for example, a tablet, a mobile phone, and a laptop). The cloud server is connected to the server through a network. Uplink and downlink communication is implemented between the access network device and the UE through a communication link. The access network device can also communicate with the central processing unit. In the cloud computing architecture, part or all of the data processing and storage functions of a UE can be moved to the cloud server.
It should be understood that the communication systems in FIG. 1 to FIG. 3 are only exemplary descriptions, and do not limit the protection scope of the embodiments of the present application. The technical solutions of the embodiments of the present application can also be applied to other communication systems.
It should also be understood that the present application does not limit the devices or network elements included in the communication systems of FIG. 1 to FIG. 3. For example, the communication system may also include other devices, which is not limited. In addition, the present application does not specifically limit the number of devices or network elements included in FIG. 1 to FIG. 3.
In the embodiments of the present application, a terminal device or a server includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer. The hardware layer includes hardware such as a central processing unit (CPU), a memory management unit (MMU), and memory (also called main memory). The operating system may be any one or more computer operating systems that implement service processing through processes, for example, a Linux, Unix, Android, iOS, or Windows operating system. The application layer includes applications such as a browser, an address book, word processing software, and instant messaging software. Moreover, the embodiments of the present application do not particularly limit the specific structure of the execution body of the methods provided herein, as long as the execution body can communicate according to the methods by running a program that records the code of the methods provided by the embodiments. For example, the execution body of the method provided by the embodiments may be a terminal device or a server, or a functional module in the terminal device or server that can call and execute a program.
Additionally, various aspects or features of the present application may be implemented as a method, an apparatus, or an article of manufacture using standard programming and/or engineering techniques. The term "article of manufacture" as used in this application covers a computer program accessible from any computer-readable device, carrier, or medium. For example, computer-readable media may include, but are not limited to: magnetic storage devices (for example, hard disks, floppy disks, or magnetic tapes), optical discs (for example, compact discs (CDs) and digital versatile discs (DVDs)), smart cards, and flash memory devices (for example, erasable programmable read-only memory (EPROM), cards, sticks, or key drives). In addition, the various storage media described herein may represent one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" may include, but is not limited to, wireless channels and various other media capable of storing, containing, and/or carrying instructions and/or data.
The following describes a method for managing a model according to an embodiment of the present application with reference to FIG. 4.
FIG. 4 is a schematic diagram of a method 300 for managing a model according to an embodiment of the present application. Exemplarily, the server in the method 300 may be the MEC server in FIG. 2 or the cloud server in FIG. 3, which is not limited.
As shown in FIG. 4, the method 300 includes:
S310: When determining to perform computing power offloading for a task to be processed, the terminal device triggers a model move-up mechanism, where the model move-up mechanism is used to upload a first model of the terminal device to a server, and the first model is a model related to the task to be processed (for example, application data).
The task to be processed refers to an inference task of a certain application of the terminal device, for example, an image task of an application of the terminal device. That the first model is a model related to the task to be processed by the terminal device can be understood as follows: the first model is a trained model or machine learning model used to process the task to be processed. For example, if the task to be processed is an image task of the terminal device, the first model is a trained model for processing that image task (for example, image data). The first model in this application is a machine learning model (which may also be called a target model) and can run in the environment of the terminal device. For example, in the terminal device, the application data is input into the target model to obtain a processing result, an operation result, or an inference result of the application data. This application does not specifically limit the first model; for example, the first model may be a mathematical model such as a neural network model or a support vector machine model.
Computing power offloading, which may also be called computing power move-up, refers to offloading or moving part or all of the computing or storage functions of the terminal device's task to be processed to the server side for processing. "Determining to perform computing power offloading" can be understood as triggering the computing power offloading mechanism. In this application, when the computing power offloading mechanism is triggered, the terminal device may also simultaneously trigger (or drive) the model move-up mechanism to upload the model file on the terminal device side to the server. That is, the terminal device transmits the model file on the terminal device side to the server when computing power is offloaded.
It can be understood that this application does not specifically limit the manner of triggering the computing power offloading mechanism, which may be triggered by a user or by the terminal device.
The model move-up mechanism is used to upload the first model of the terminal device to the server, which specifically includes: uploading the model file of the first model of the terminal device (for example, the model structure and model parameters) to the server. The model parameters are parameters related to the model, such as weights and biases.
This application does not specifically limit the manner of triggering the model move-up mechanism. For example, when the terminal device determines that an application (APP) has high computing power requirements or high computing demands, it may automatically trigger the model move-up mechanism and upload the local model file and the application data to the server, so that the server processes the computing task, thereby saving the computing overhead of the terminal.
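The trigger logic above can be sketched as follows. This is a minimal illustration under assumed thresholds; the field names, the threshold value, and the `maybe_offload` helper are hypothetical and are not part of the disclosed method.

```python
# Hypothetical sketch: a terminal device triggers computing power offloading
# together with model move-up when a task's compute demand exceeds a threshold.
COMPUTE_THRESHOLD = 1e9  # assumed: required operations above which offloading helps


def maybe_offload(task, send_to_server):
    """Trigger the model move-up mechanism for compute-heavy tasks."""
    if task["required_ops"] > COMPUTE_THRESHOLD:
        # Model move-up: upload the local model file (structure + parameters)
        # together with the application data of the task to be processed.
        send_to_server({
            "model_file": task["model_file"],
            "app_data": task["app_data"],
        })
        return True
    return False  # otherwise, process the task locally on the terminal device
```

In practice, the decision could equally be driven by battery level, link quality, or an explicit user action, since the patent leaves the trigger condition open.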
S320: The terminal device sends a first message, where the first message includes the model file of the first model of the terminal device and information of the task to be processed. Correspondingly, the server obtains the model file of the first model of the terminal device and the information of the task to be processed.
The purpose of the terminal device sending the first message is to upload or transmit the local model file of the terminal device (for example, the model file of the first model) and the information of the task to be processed to the server side, so that the server side can process the application data.
Optionally, the information of the task to be processed includes application data. It can be understood that, in addition to the application data, the information of the task to be processed may also include other information related to the task, which is not specifically limited.
This application does not limit the form or name of the first message. The first message may follow an existing message form, or may be a newly defined message. Optionally, the first message may be an inference request message sent by the terminal device.
Optionally, the terminal device may send the first message to the server through a network device, where the network device is, for example, an access network device or a core network device. For example, the terminal device sends the first message, including the model file of the first model and the application data of the terminal device, to the access network device; the access network device sends the model file and the application data to the core network device; after receiving them, the core network device sends the model file of the first model and the application data of the terminal device to the server. For another example, the terminal device sends the first message to the access network device, and the access network device sends the model file of the first model and the application data of the terminal device directly to the server.
Correspondingly, this application does not specifically limit how the server obtains the model file uploaded by the terminal device and the information of the task to be processed. The server may obtain the first model of the terminal device and the information of the task to be processed from the access network device, or from the core network device.
This application does not limit the form or name of the message exchanged between the server and the network device (the access network device or the core network device), where the message includes the model file uploaded by the terminal device and the information of the task to be processed; an existing message form may be used, or a newly defined message. For example, the access network device or the core network device may transparently transmit the first message to the server. For another example, the access network device or the core network device may process the first message to obtain a second message (which includes the model file uploaded by the terminal device and the information of the task to be processed), and then send the second message to the server.
After obtaining the model file of the first model uploaded by the terminal device, the server may perform conversion processing on the first model of the terminal device, so that the converted model can be adapted to the operating environment of the server.
S330: The server converts the first model into a second model, where the second model can run in the operating environment of the server.
The first model and the second model have different requirements on the physical environment. The first model can run in the operating environment of the terminal device, while the second model can run in the operating environment of the server.
After obtaining the model file of the first model uploaded by the terminal device, the server converts or adapts the first model into the second model according to the model file and its own operating environment, using a model conversion technology (or model adaptation technology), so that the second model is adapted to the operating environment of the server. The model conversion technology refers to the technical means adopted by those skilled in the art to convert the model uploaded by the terminal device into a model adapted to the server's operating environment; the specific method is not limited.
For example, suppose the first model uploaded by the terminal device is a TensorFlow model, the operating environment of the terminal device is a TensorFlow environment, and the operating environment of the server is a Caffe environment. The server then uses the model conversion technology to convert the TensorFlow model into a Caffe model; that is, the second model is a Caffe model, which is adapted to the Caffe environment of the server.
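The conversion step in S330 can be sketched as a format-keyed dispatch. This is an illustrative skeleton only: the registry, the `convert_model` helper, and the placeholder converter body are assumptions standing in for whatever model conversion technology the server actually deploys (real systems often route through an interchange format such as ONNX).

```python
# Hypothetical sketch of server-side model conversion: a registry maps
# (source format, server format) pairs to converter functions.
CONVERTERS = {}


def register_converter(src, dst):
    """Register a converter from source format `src` to target format `dst`."""
    def wrap(fn):
        CONVERTERS[(src, dst)] = fn
        return fn
    return wrap


@register_converter("tensorflow", "caffe")
def tf_to_caffe(model_file):
    # Placeholder body: a real converter would translate graph operators
    # and copy the model parameters (weights, biases) into the target format.
    return {"format": "caffe", "weights": model_file["weights"]}


def convert_model(model_file, server_format):
    """Adapt an uploaded model file to the server's operating environment."""
    if model_file["format"] == server_format:
        return model_file  # already adapted, nothing to do
    try:
        converter = CONVERTERS[(model_file["format"], server_format)]
    except KeyError:
        raise ValueError(
            f"no converter from {model_file['format']} to {server_format}")
    return converter(model_file)
```

Under this sketch, supporting an additional device-side format (for example, a Davinci model) means registering one more converter, without changing the server's unified operating environment.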
It can be understood that the first model of the terminal device is used here as an example for description, but this application is not limited thereto. In fact, for models of different types of terminal devices (for example, TensorFlow, Caffe, and Davinci models), the server can likewise convert or adapt the model into a model that can run in the server's operating environment. That is to say, the server can convert models of different types of terminal devices based on its own platform operating environment, and the converted models can run in the server's environment. The server side can thus obtain the model of the terminal device without deploying additional models on the server side, and without building an additional model repository on the server side to store models.
In this application, a unified virtualization environment is deployed on the server side; that is, models uploaded by different terminal devices can all run in this unified virtualization environment after being adapted. In other words, for models of different types of terminal devices, the server can process requests from the terminal device side in a unified virtualization environment, which reduces the management complexity on the server side. Taking a server in which the operating environment corresponding to TensorFlow models is deployed (for example, the dependency libraries and supported operators necessary to run TensorFlow models) as an example, the server can uniformly convert different types of target models uploaded by multiple terminal devices (for example, Caffe and Davinci models) into TensorFlow models, so that all types of target models can run on the server side. To put it another way, if the virtual environment deployed on the server side includes the TensorFlow framework (also referred to as the TF environment), then after obtaining the model files of different types of terminal devices, the server only needs to convert each terminal device's model into a model that matches the TensorFlow framework, without having to customize an environment for each model.
Since the server can process different model types in a unified virtualization environment, the server does not need to load or deploy corresponding virtualization environments for different model types. Application developers likewise do not need to deploy or customize different types of virtual environments on the server according to the type of the target model, thereby reducing development costs.
The virtualization environment in this application may include virtualization technologies such as containers (for example, Docker) and virtual machines, which is not specifically limited.
For the application data uploaded by the terminal device, the server caches the uploaded application data locally and notifies the container manager to schedule a working container.
S340: The server schedules a first virtualized container, and loads the second model and the task to be processed of the terminal device into the first virtualized container.
The first virtualized container refers to a working container. After the server converts the first model into the second model, it can schedule a working container so that the working container processes the task to be processed of the terminal device.
Optionally, the server scheduling the first virtualized container includes: the server queries for an idle container; if an idle container exists, the server uses the idle container as the first virtualized container; or, if no idle container exists, the server obtains the first virtualized container through an instantiation operation. An idle container is a container that is not occupied or in use.
That is, if an idle or unused container exists, the server can use it directly. In this way, the server can reuse an existing idle container without re-pulling an image from the image repository and instantiating it into a container, thereby saving the overhead of image pulling and instantiation. If no idle container exists, the server needs to pull an image from the image repository and instantiate it into a container to obtain the first virtualized container.
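The idle-container reuse logic of S340 can be sketched as a small pool. The class and method names (`ContainerManager`, `_pull_image`, `_instantiate`) are illustrative assumptions, and the pull/instantiate bodies are placeholders for real image-registry and container-runtime calls.

```python
# Hypothetical sketch of the container manager's scheduling decision:
# reuse an idle container when one exists, otherwise pull + instantiate.
class ContainerManager:
    def __init__(self):
        self.idle = []  # pool of idle (unoccupied) containers
        self.busy = []  # containers currently serving a task

    def schedule(self):
        """Return a working container, reusing an idle one when possible."""
        if self.idle:
            container = self.idle.pop()  # reuse: no image pull, no instantiation
        else:
            image = self._pull_image()         # pull image from the repository
            container = self._instantiate(image)
        self.busy.append(container)
        return container

    def release(self, container):
        """Return a finished working container to the idle pool."""
        self.busy.remove(container)
        self.idle.append(container)

    def _pull_image(self):
        return "worker-image"  # placeholder for a registry pull

    def _instantiate(self, image):
        return {"image": image}  # placeholder for a container-runtime create/start
```

Releasing containers back to the idle pool rather than destroying them is what makes subsequent requests cheap, which matches the stated motivation of avoiding repeated image pulls.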
In this embodiment of the present application, the terminal device can upload a local model file to the server. The server converts or adapts the uploaded model so that the converted model can run on the server side, that is, it is adapted to the server's operating environment. For different types of terminal devices, the server can use model adaptation technology to convert the model on the terminal device side into a model that can run on the server side. In this way, application developers do not need to develop models adapted to the server side according to the server's hardware resources, nor do they need to deploy different types of virtualized containers on the server side according to the model type, which reduces development costs and the management complexity on the server side. Expressed another way, with the terminal device side referred to as the "device side" and the server side as the "edge side", this application does not require the edge side to load different images for different model types; instead, through device-side awareness and model move-up, the edge side loads models intelligently and can adapt to different types of models without additional deployment on the edge side.
In response to the first message (for example, an inference request message) from the terminal device side, the server side may return the processing result after processing the task from the terminal device side.
Optionally, the method 300 further includes: the server returns a processing result for the task to be processed, where the processing result is obtained by the first virtualized container processing the task to be processed using the second model. Correspondingly, the terminal device receives the processing result.
Specifically, the server starts the first virtualized container and uses the second model to process the task to be processed of the terminal device, so as to respond to the inference request from the device side. After the first virtualized container finishes processing, the server returns the processing result of the task to the terminal device for further processing. In this way, the processing of the task is completed on the server side, thereby saving power consumption on the terminal device side.
It can be understood that this application does not limit the specific manner in which the server returns the processing result. The server may return the processing result to the terminal device through a network device. For example, the server sends the processing result to the core network device, the core network device sends it to the access network device, and finally the access network device sends it to the terminal device.
This application does not specifically limit the internal implementation or structure of the server. Optionally, the server may include a data processing module, a container manager, and working container instances, and the server-side method in this application is then implemented through the data processing module, the container manager, and the working container instances.
The following describes the specific process in conjunction with the data processing module, the container manager, and the working container instance.
For example, the data processing module in the server can receive the application data transmitted from the terminal device side and cache it locally. The data processing module can convert the model uploaded by the terminal device into a model adapted to the server's container environment. In addition, the data processing module notifies the container manager to schedule a container. The container manager determines whether an idle container exists by querying for idle containers. If no idle container exists, it pulls an image, instantiates it into a container, and starts that container as the working container. If an idle container exists, it uses that idle container as the working container. The data processing module then loads the application data and the converted model into the working container. The working container starts the inference process in response to the inference request from the terminal device side. After the working container finishes processing, it returns the processing or inference result to the terminal device side for further processing by the terminal device side.
It can be understood that the server including a data processing module, a container manager, and a working container instance is described here merely as an example, and does not constitute a limitation on the embodiments of this application.
It can be understood that, in some scenarios, some optional features in the embodiments of this application may be implemented independently of other features, for example independently of the solution on which they are currently based, to solve the corresponding technical problem and achieve the corresponding effect; in other scenarios, they may be combined with other features as required. Correspondingly, the apparatuses provided in the embodiments of this application may also implement these features or functions, and details are not described here again.
It should be understood that "an embodiment" mentioned throughout this specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Therefore, the embodiments appearing throughout this specification do not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in one or more embodiments in any suitable manner.
It should be understood that the solutions in the embodiments of this application may be used in reasonable combination, and the explanations or descriptions of the terms appearing in the embodiments may be mutually referenced or explained across the embodiments; this is not limited.
It should also be understood that, in the various embodiments of this application, the sequence numbers of the foregoing processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic. The numbers or sequence numbers involved in the foregoing processes are merely used for ease of description and should not constitute any limitation on the implementation of the embodiments of this application.
Corresponding to the methods provided in the foregoing method embodiments, the embodiments of this application further provide corresponding apparatuses, each including modules for performing the corresponding parts of the foregoing embodiments. The modules may be software, hardware, or a combination of software and hardware. It can be understood that the technical features described in the method embodiments are also applicable to the following apparatus embodiments.
FIG. 5 is a schematic block diagram of an apparatus for managing a model according to an embodiment of this application. As shown in FIG. 5, the apparatus 1000 for managing a model may include a transceiver unit 1100 and a processing unit 1200.
In a possible design, the apparatus 1000 for managing a model may correspond to the server in the foregoing method embodiments; for example, it may be a server, or a chip configured in a server.
Specifically, the apparatus 1000 for managing a model may correspond to the server in the method 300 according to the embodiments of this application, and the apparatus 1000 may include units for performing the method performed by the server in the method 300 in FIG. 4. In addition, the units in the apparatus 1000 and the other operations or functions described above are respectively intended to implement the corresponding procedures of the server in the method 300 in FIG. 4.
In a possible implementation, the transceiver unit 1100 and the processing unit 1200 may be respectively configured as follows.
The transceiver unit 1100 is configured to obtain a model file of a first model of a terminal device and information about a to-be-processed task of the terminal device, where the first model is a model related to the to-be-processed task, and the first model is capable of running in the running environment of the terminal device.
The processing unit 1200 is configured to convert the first model into a second model, where the second model is capable of running in the running environment of the server; and is further configured to schedule a first virtualized container and load the second model and the to-be-processed task into the first virtualized container.
Optionally, that the processing unit 1200 is configured to schedule the first virtualized container includes: querying for an idle container; when an idle container exists, using the idle container as the first virtualized container; or, when no idle container exists, obtaining the first virtualized container through an instantiation operation.
Optionally, the transceiver unit 1100 is further configured to return a processing result for the to-be-processed task, where the processing result is obtained by the first virtualized container processing the second model and the to-be-processed task.
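The division of labor between the two units of apparatus 1000 can be sketched as below. This is a hypothetical illustration only: the class names, the dict-based "models" and "containers", and the result format are stand-ins invented for the example; the patent does not prescribe any concrete data structures.

```python
class ProcessingUnit:
    """Sketch of processing unit 1200: converts the first model into the second
    model and schedules the first virtualized container."""

    def __init__(self):
        self.idle_containers = []

    def convert(self, first_model):
        # First model (terminal-side environment) -> second model (server environment).
        return {"second_model_of": first_model}

    def schedule_container(self):
        # Reuse an idle container when one exists; otherwise obtain one
        # through an instantiation operation.
        if self.idle_containers:
            return self.idle_containers.pop()
        return {"container": "instantiated"}


class ModelManagementApparatus:
    """Sketch of apparatus 1000 on the server side."""

    def __init__(self):
        self.processing_unit = ProcessingUnit()

    def on_first_message(self, model_file, task_info):
        # Transceiver unit 1100 has obtained the model file and task information.
        second_model = self.processing_unit.convert(model_file)
        container = self.processing_unit.schedule_container()
        container["loaded"] = (second_model, task_info)      # load into the container
        return {"result": f"processed {task_info['task']}"}  # returned via transceiver unit
```

The point of the sketch is the ordering: conversion and container scheduling are processing-unit responsibilities, while obtaining the model file and returning the result belong to the transceiver unit.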
It should be understood that the specific process in which each unit performs the foregoing corresponding steps has been described in detail in the foregoing method embodiments; for brevity, details are not repeated here.
It should also be understood that, when the apparatus 1000 for managing a model is a server, the transceiver unit 1100 in the apparatus 1000 may correspond to the transceiver 830 in the apparatus 800 shown in FIG. 6, and the processing unit 1200 in the apparatus 1000 may correspond to the processor 810 in the apparatus 800 shown in FIG. 6.
It should also be understood that, when the apparatus 1000 for managing a model is a chip configured in a server, the transceiver unit 1100 in the apparatus 1000 may be an input/output interface circuit.
Optionally, the apparatus 1000 for managing a model further includes a storage unit. The storage unit may be configured to store instructions or data, and the processing unit may invoke the instructions or data stored in the storage unit to implement the corresponding operations. The storage unit may be implemented by at least one memory, and may correspond to, for example, the memory 820 in the apparatus 800 in FIG. 6.
In a possible design, the apparatus 1000 for managing a model may correspond to the terminal device in the foregoing method embodiments; for example, it may be a terminal device, or a chip configured in a terminal device.
Specifically, the apparatus 1000 for managing a model may correspond to the terminal device in the method 300 according to the embodiments of this application, and the apparatus 1000 may include units for performing the method performed by the terminal device in the method 300 in FIG. 4. In addition, the units in the apparatus 1000 and the other operations or functions described above are respectively intended to implement the corresponding procedures of the terminal device in the method 300 in FIG. 4.
In a possible implementation, the transceiver unit 1100 and the processing unit 1200 may be respectively configured as follows.
The processing unit 1200 is configured to trigger a model upload mechanism when it is determined that computing power offloading is to be performed for a to-be-processed task, where the model upload mechanism is used to upload a first model of the terminal device to a server, the first model is a model related to the to-be-processed task, and the first model is capable of running in the running environment of the terminal device.
The transceiver unit 1100 is configured to send a first message, where the first message includes the model file of the first model and the information about the to-be-processed task.
Optionally, the transceiver unit 1100 is further configured to receive a processing result for the to-be-processed task, where the processing result is obtained by a first virtualized container processing a second model and the to-be-processed task, and the second model is capable of running in the running environment of the server.
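The terminal-side behavior can be sketched as follows. Note that this is an illustrative sketch under stated assumptions: the offload decision policy (a simple latency-budget comparison), the `send_fn` transport hook, and the `FirstMessage` structure are all hypothetical, since the patent deliberately leaves the offload criterion and message encoding open.

```python
from dataclasses import dataclass


@dataclass
class FirstMessage:
    """The 'first message': the model file of the first model plus task information."""
    model_file: object
    task_info: object


class TerminalDevice:
    """Sketch of the terminal-side processing unit (offload decision, model upload
    trigger) and transceiver unit (sending the first message)."""

    def __init__(self, send_fn, local_compute_ms=120.0, latency_budget_ms=50.0):
        self.send_fn = send_fn                  # hypothetical transceiver hook
        self.local_compute_ms = local_compute_ms
        self.latency_budget_ms = latency_budget_ms

    def should_offload(self):
        # Hypothetical policy: offload when estimated local compute time
        # exceeds the latency budget.
        return self.local_compute_ms > self.latency_budget_ms

    def process(self, model_file, task_info):
        if self.should_offload():
            # Model upload mechanism triggered: send the first message and
            # wait for the processing result from the server.
            return self.send_fn(FirstMessage(model_file, task_info))
        return "processed locally"
```

With a fast local estimate the task runs on the device; otherwise the first model and the task information are sent to the server and the returned result is used instead.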
It should be understood that the specific process in which each unit performs the foregoing corresponding steps has been described in detail in the foregoing method embodiments; for brevity, details are not repeated here.
It should also be understood that, when the apparatus 1000 for managing a model is a terminal device, the transceiver unit 1100 in the apparatus 1000 may correspond to the transceiver 2020 in the terminal device 2000 shown in FIG. 7, and the processing unit 1200 in the apparatus 1000 may correspond to the processor 2010 in the terminal device 2000 shown in FIG. 7.
It should also be understood that, when the apparatus 1000 for managing a model is a chip configured in a terminal device, the transceiver unit 1100 in the apparatus 1000 may be an input/output interface circuit.
Optionally, the apparatus 1000 for managing a model further includes a storage unit. The storage unit may be configured to store instructions or data, and the processing unit may invoke the instructions or data stored in the storage unit to implement the corresponding operations. The storage unit may be implemented by at least one memory, and may correspond to, for example, the memory 2030 in the terminal device 2000 in FIG. 7.
As shown in FIG. 6, an embodiment of this application further provides an apparatus 800 for managing a model. The apparatus 800 includes a processor 810. The processor 810 is coupled to a memory 820. The memory 820 is configured to store computer programs or instructions and/or data, and the processor 810 is configured to execute the computer programs or instructions and/or data stored in the memory 820, so that the methods in the foregoing method embodiments are performed.
Optionally, the apparatus 800 includes one or more processors 810.
Optionally, as shown in FIG. 6, the apparatus 800 may further include the memory 820.
Optionally, the apparatus 800 may include one or more memories 820.
Optionally, the memory 820 may be integrated with the processor 810, or disposed separately from it.
Optionally, as shown in FIG. 6, the apparatus 800 may further include a transceiver 830, and the transceiver 830 is configured to receive and/or send signals. For example, the processor 810 is configured to control the transceiver 830 to receive and/or send signals.
As one solution, the apparatus 800 is configured to implement the operations performed by the server in the foregoing method embodiments.
For example, the processor 810 is configured to implement the processing-related operations performed by the server in the foregoing method embodiments, and the transceiver 830 is configured to implement the transceiving-related operations performed by the server in the foregoing method embodiments.
It can be understood that FIG. 6 shows only a simplified schematic block diagram. In practical applications, the apparatus 800 may further include other elements, including but not limited to any number of transceivers, processors, controllers, memories, and the like; this is not specifically limited.
In a possible design, the apparatus 800 may be a chip, for example a chip usable in a server, configured to implement the related functions of the processor 810. The chip may be a field programmable gate array, an application-specific integrated circuit, a system-on-chip, a central processing unit, a network processor, a digital signal processing circuit, or a microcontroller that implements the related functions, and may alternatively be a programmable controller or another integrated chip. Optionally, the chip may include one or more memories configured to store program code; when the code is executed, the processor is enabled to implement the corresponding functions.
FIG. 7 is a schematic structural diagram of a terminal device 2000 according to an embodiment of this application. The terminal device 2000 may be applied to the system shown in FIG. 1 to perform the functions of the terminal device in the foregoing method embodiments. As shown in FIG. 7, the terminal device 2000 includes a processor 2010 and a transceiver 2020. Optionally, the terminal device 2000 further includes a memory 2030. The processor 2010, the transceiver 2020, and the memory 2030 may communicate with each other through an internal connection path to transfer control or data signals. The memory 2030 is configured to store a computer program, and the processor 2010 is configured to invoke the computer program from the memory 2030 and run it, to control the transceiver 2020 to receive and send signals. Optionally, the terminal device 2000 may further include an antenna 2040, configured to send, by using a radio signal, the uplink data or uplink control signaling output by the transceiver 2020.
The processor 2010 and the memory 2030 may be combined into one processing apparatus, and the processor 2010 is configured to execute the program code stored in the memory 2030 to implement the foregoing functions. In specific implementation, the memory 2030 may also be integrated in the processor 2010, or be independent of the processor 2010. The processor 2010 may correspond to the processing unit in FIG. 5.
The transceiver 2020 may correspond to the transceiver unit in FIG. 5, and may also be referred to as a transceiver unit. The transceiver 2020 may include a receiver (also called a receiving machine or a receiving circuit) and a transmitter (also called a transmitting machine or a transmitting circuit). The receiver is configured to receive signals, and the transmitter is configured to send signals.
It should be understood that the terminal device 2000 shown in FIG. 7 can implement the processes involving the terminal device in the method embodiment shown in FIG. 4. The operations or functions of the modules in the terminal device 2000 are respectively intended to implement the corresponding procedures in the foregoing method embodiments. For details, refer to the descriptions in the foregoing method embodiments; to avoid repetition, detailed descriptions are appropriately omitted here.
The processor 2010 may be configured to perform the actions that are implemented inside the terminal device and that are described in the foregoing method embodiments, and the transceiver 2020 may be configured to perform the actions, described in the foregoing method embodiments, of the terminal device sending signals to or receiving signals from the network device. For details, refer to the descriptions in the foregoing method embodiments; details are not repeated here.
Optionally, the terminal device 2000 may further include a power supply 2050, configured to supply power to various components or circuits in the terminal device.
In addition, to make the functions of the terminal device more complete, the terminal device 2000 may further include one or more of an input unit 2060, a display unit 2070, an audio circuit 2080, a camera 2090, a sensor 2100, and the like, and the audio circuit may further include a speaker 2082, a microphone 2084, and the like.
It can be understood that the terminal device shown in FIG. 7 is merely an example and does not limit the protection scope of this application. In fact, the terminal device may also take other forms.
According to the methods provided in the embodiments of this application, this application further provides a computer program product. The computer program product includes computer program code; when the computer program code is run on a computer, the computer is enabled to perform the server-side method in any of the foregoing method embodiments, or the computer is enabled to perform the terminal-device-side method in any of the foregoing method embodiments.
According to the methods provided in the embodiments of this application, this application further provides a computer-readable storage medium. The computer-readable storage medium stores program code; when the program code is run on a computer, the computer is enabled to perform the server-side method in any of the foregoing method embodiments, or the computer is enabled to perform the terminal-device-side method in any of the foregoing method embodiments.
An embodiment of this application further provides a processing apparatus, including a processor and an interface, where the processor is configured to perform the method for managing a model in any of the foregoing method embodiments.
An embodiment of this application further provides a chip, including a processor and an interface, where the processor reads, through the interface, instructions stored in a memory, to perform the method for managing a model in any of the foregoing method embodiments.
An embodiment of this application further provides a communication system, where the communication system includes the server and the terminal device in the foregoing embodiments.
The apparatuses for managing a model in the foregoing apparatus embodiments completely correspond to the server and the terminal device in the method embodiments, and the corresponding steps are performed by the corresponding modules or units. For example, the communication unit (transceiver) performs the receiving or sending steps in the method embodiments, and steps other than sending and receiving may be performed by the processing unit (processor). For the functions of specific units, refer to the corresponding method embodiments. There may be one or more processors.
A person skilled in the art may also understand that the various illustrative logical blocks and steps listed in the embodiments of this application may be implemented by electronic hardware, computer software, or a combination of the two. Whether such functions are implemented by hardware or software depends on the particular application and the design requirements of the overall system. A person skilled in the art may use various methods to implement the described functions for each particular application, but such implementation should not be construed as going beyond the protection scope of the embodiments of this application.
It should be understood that the processor in the embodiments of this application may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method embodiments may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (digital signal processor, DSP), an application-specific integrated circuit (application specific integrated circuit, ASIC), a field programmable gate array (field programmable gate array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; it may alternatively be a system-on-chip (system on chip, SoC), a central processing unit (central processor unit, CPU), a network processor (network processor, NP), a digital signal processing circuit (digital signal processor, DSP), a microcontroller (micro controller unit, MCU), a programmable logic device (programmable logic device, PLD), or another integrated chip. The methods, steps, and logical block diagrams disclosed in the embodiments of this application may be implemented or performed by the processor. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed with reference to the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware.
The techniques described in this application may be implemented in various manners. For example, these techniques may be implemented by hardware, software, or a combination of hardware and software. For a hardware implementation, the processing units used to perform these techniques at an apparatus (for example, a server, a base station, a terminal, a network entity, or a chip) may be implemented in one or more general-purpose processors, DSPs, digital signal processing devices, ASICs, programmable logic devices, FPGAs or other programmable logic apparatuses, discrete gate or transistor logic, discrete hardware components, or any combination thereof. A general-purpose processor may be a microprocessor; optionally, the general-purpose processor may also be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented by a combination of computing apparatuses, for example a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in combination with a digital signal processor core, or any other similar configuration.
It can be understood that the memory in the embodiments of this application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (read-only memory, ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (random access memory, RAM), which is used as an external cache. By way of example rather than limitation, many forms of RAM are available, such as a static random access memory (static RAM, SRAM), a dynamic random access memory (dynamic RAM, DRAM), a synchronous dynamic random access memory (synchronous DRAM, SDRAM), a double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), an enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), a synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and a direct rambus random access memory (direct rambus RAM, DR RAM). It should be noted that the memories of the systems and methods described in this specification are intended to include, but are not limited to, these and any other suitable types of memories.
The foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When software is used for implementation, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or some of the procedures or functions according to the embodiments of this application are generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (digital subscriber line, DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a high-density digital video disc (digital video disc, DVD)), a semiconductor medium (for example, a solid-state drive (solid state disk, SSD)), or the like.
It should be understood that, in this application, "when", "if", and similar expressions all mean that the server or the terminal device performs corresponding processing under certain objective circumstances. They do not limit the time, do not require the server or the terminal device to perform a judging action during implementation, and do not imply any other limitation.
A person of ordinary skill in the art may understand that the terms "first", "second", and other numerals in this application are merely used for ease of description and distinction, and are not intended to limit the scope of the embodiments of this application, nor do they indicate an order.
Unless otherwise specified, an element expressed in the singular in this application is intended to mean "one or more" rather than "one and only one". In this application, unless otherwise specified, "at least one" is intended to mean "one or more", and "a plurality of" is intended to mean "two or more".
另外,本文中术语“系统”和“网络”在本文中常被可互换使用。本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A可以是单数或者复数,B可以是单数或者复数。Additionally, the terms "system" and "network" are often used interchangeably herein. The term "and/or" in this article is only an association relationship to describe the associated objects, indicating that there can be three kinds of relationships, for example, A and/or B, it can mean that A exists alone, A and B exist at the same time, and A and B exist independently The three cases of B, where A can be singular or plural, and B can be singular or plural.
字符“/”一般表示前后关联对象是一种“或”的关系。The character "/" generally indicates that the associated objects are an "or" relationship.
The terms "component", "module", "system", and the like used in this specification denote a computer-related entity, hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or a thread of execution, and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components may execute from various computer-readable media having various data structures stored thereon. The components may communicate by using local and/or remote processes, for example, based on a signal having one or more data packets (for example, data from two components interacting with another component in a local system, in a distributed system, and/or across a network such as the Internet interacting with other systems by means of the signal).
Those of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a logical function division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between the apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of this application essentially, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (14)

  1. A method for managing a model, comprising:
    obtaining, by a server, a first model of a terminal device and information about a to-be-processed task of the terminal device, wherein the first model is a model related to the to-be-processed task, and the first model is capable of running in an operating environment of the terminal device;
    converting, by the server, the first model into a second model, wherein the second model is capable of running in an operating environment of the server; and
    scheduling, by the server, a first virtualized container, and loading the second model and the to-be-processed task into the first virtualized container.
  2. The method according to claim 1, wherein the scheduling, by the server, of a first virtualized container comprises:
    querying, by the server, for an idle container; and
    in a case that an idle container exists, using, by the server, the idle container as the first virtualized container; or
    in a case that no idle container exists, obtaining, by the server, the first virtualized container through an instantiation operation.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    returning, by the server, a processing result for the to-be-processed task, wherein the processing result is obtained by the first virtualized container processing the second model and the to-be-processed task.
  4. A method for managing a model, comprising:
    in a case that it is determined to perform computing-power offloading for a to-be-processed task, triggering, by a terminal device, a model move-up mechanism, wherein the model move-up mechanism is used to upload a first model of the terminal device to a server, the first model is a model related to the to-be-processed task, and the first model is capable of running in an operating environment of the terminal device; and
    sending, by the terminal device, a first message, wherein the first message comprises a model file of the first model and the information about the to-be-processed task.
  5. The method according to claim 4, wherein the method further comprises:
    receiving, by the terminal device, a processing result for the to-be-processed task, wherein the processing result is obtained by a first virtualized container processing a second model and the to-be-processed task, and the second model is capable of running in an operating environment of the server.
  6. An apparatus for managing a model, comprising:
    a transceiver unit, configured to obtain a first model of a terminal device and information about a to-be-processed task of the terminal device, wherein the first model is a model related to the to-be-processed task, and the first model is capable of running in an operating environment of the terminal device; and
    a processing unit, configured to convert the first model into a second model, wherein the second model is capable of running in an operating environment of a server;
    wherein the processing unit is further configured to schedule a first virtualized container and to load the second model and the to-be-processed task into the first virtualized container.
  7. The apparatus according to claim 6, wherein the scheduling of the first virtualized container by the processing unit comprises:
    querying for an idle container; and
    in a case that an idle container exists, using the idle container as the first virtualized container; or
    in a case that no idle container exists, obtaining the first virtualized container through an instantiation operation.
  8. The apparatus according to claim 6 or 7, wherein the transceiver unit is further configured to:
    return a processing result for the to-be-processed task, wherein the processing result is obtained by the first virtualized container processing the second model and the to-be-processed task.
  9. An apparatus for managing a model, wherein the apparatus is a terminal device, and the terminal device comprises:
    a processing unit, configured to trigger a model move-up mechanism in a case that it is determined to perform computing-power offloading for a to-be-processed task, wherein the model move-up mechanism is used to upload a first model of the terminal device to a server, the first model is a model related to the to-be-processed task, and the first model is capable of running in an operating environment of the terminal device; and
    a transceiver unit, configured to send a first message, wherein the first message comprises a model file of the first model and information about the to-be-processed task.
  10. The apparatus according to claim 9, wherein the transceiver unit is further configured to receive a processing result for the to-be-processed task, wherein the processing result is obtained by a first virtualized container processing a second model and the to-be-processed task, and the second model is capable of running in an operating environment of the server.
  11. An apparatus for managing a model, comprising a processor, wherein the processor is coupled to a memory, the memory is configured to store a computer program or instructions, and the processor is configured to execute the computer program or instructions in the memory, so that the method according to any one of claims 1 to 3 is performed.
  12. An apparatus for managing a model, comprising a processor, wherein the processor is coupled to a memory, the memory is configured to store a computer program or instructions, and the processor is configured to execute the computer program or instructions in the memory, so that the method according to claim 4 or 5 is performed.
  13. A computer-readable storage medium storing a computer program or instructions, wherein when the computer program or instructions are executed, the method according to any one of claims 1 to 3 is performed.
  14. A computer-readable storage medium storing a computer program or instructions, wherein when the computer program or instructions are executed, the method according to claim 4 or 5 is performed.
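As a purely illustrative aid to reading claims 1–3 — and not part of the application — the claimed server-side flow (obtain the terminal-side model and task, convert the model for the server's operating environment, schedule a virtualized container by reusing an idle one or instantiating a new one, then process and return the result) could be sketched as follows. All class, method, and string names here (`Server`, `Container`, `handle`, and so on) are hypothetical stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class Container:
    """A virtualized execution container on the server (hypothetical)."""
    idle: bool = True

    def process(self, model: str, task: str) -> str:
        # Stand-in for running the converted model on the offloaded task.
        return f"result({model}, {task})"

@dataclass
class Server:
    pool: list = field(default_factory=list)  # containers already instantiated

    def convert(self, first_model: str) -> str:
        # Convert the terminal-side first model into a second model that
        # runs in the server's operating environment (claim 1).
        return f"server::{first_model}"

    def schedule_container(self) -> Container:
        # Reuse an idle container if one exists; otherwise obtain a new
        # container through an instantiation operation (claim 2).
        for c in self.pool:
            if c.idle:
                c.idle = False
                return c
        c = Container(idle=False)
        self.pool.append(c)
        return c

    def handle(self, first_model: str, task: str) -> str:
        second_model = self.convert(first_model)
        container = self.schedule_container()
        try:
            # Load the second model and the task into the container and
            # return the processing result (claims 1 and 3).
            return container.process(second_model, task)
        finally:
            container.idle = True  # container becomes idle again

server = Server()
print(server.handle("cnn.tflite", "classify:image01"))
```

Note that the idle-reuse branch keeps the pool from growing when requests arrive one after another, which is the point of querying for idle containers before instantiating.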
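The terminal-side counterpart in claims 4 and 5 — decide whether to offload, trigger the model move-up mechanism, and send a first message carrying the model file plus the task information — could likewise be sketched as below. The offloading policy, the `send` transport callback, and every name in this snippet are hypothetical; the application itself does not specify how the offloading decision is made.

```python
def should_offload(task_cost: float, local_budget: float) -> bool:
    # Stand-in offloading policy: offload when the task's estimated
    # compute cost exceeds what the terminal will spend locally.
    return task_cost > local_budget

def offload(send, task: str, task_cost: float,
            model_file: bytes, local_budget: float = 1.0):
    """Terminal-side model move-up (claims 4 and 5); `send` is a
    hypothetical transport callback returning the server's reply."""
    if not should_offload(task_cost, local_budget):
        return None  # process locally; the move-up mechanism is not triggered
    # First message: the terminal-side model file plus task information.
    first_message = {"model_file": model_file, "task": task}
    return send(first_message)  # processing result received from the server

# Loopback callback standing in for the real uplink to the server.
reply = offload(lambda msg: f"processed:{msg['task']}",
                task="classify:image01", task_cost=5.0,
                model_file=b"model-bytes")
print(reply)
```

When the policy decides against offloading, no first message is sent, matching the claim language that the move-up mechanism is triggered only in the case that computing-power offloading is determined.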
PCT/CN2021/074081 2021-01-28 2021-01-28 Method and apparatus for model management WO2022160155A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/074081 WO2022160155A1 (en) 2021-01-28 2021-01-28 Method and apparatus for model management


Publications (1)

Publication Number Publication Date
WO2022160155A1 true WO2022160155A1 (en) 2022-08-04

Family

ID=82654028

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/074081 WO2022160155A1 (en) 2021-01-28 2021-01-28 Method and apparatus for model management

Country Status (1)

Country Link
WO (1) WO2022160155A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109725949A (en) * 2018-12-25 2019-05-07 南京邮电大学 A kind of mobile edge calculations uninstalling system and method based on mobile agent
CN110347500A (en) * 2019-06-18 2019-10-18 东南大学 For the task discharging method towards deep learning application in edge calculations environment
CN111262906A (en) * 2020-01-08 2020-06-09 中山大学 Method for unloading mobile user terminal task under distributed edge computing service system
US20200327371A1 (en) * 2019-04-09 2020-10-15 FogHorn Systems, Inc. Intelligent Edge Computing Platform with Machine Learning Capability



Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21921775

Country of ref document: EP

Kind code of ref document: A1