CN112698945A - Resource allocation method and device based on multiple models, electronic equipment and storage medium - Google Patents


Publication number
CN112698945A
CN112698945A (application CN202011603677.6A)
Authority
CN
China
Prior art keywords
request
model
pod
container
pod unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011603677.6A
Other languages
Chinese (zh)
Inventor
郑振鹏
王健宗
罗剑
程宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011603677.6A
Publication of CN112698945A

Landscapes

  • Computer And Data Communications (AREA)

Abstract

The invention relates to load allocation technology and discloses a multi-model-based resource allocation method, which comprises the following steps: encapsulating each network model with a container encapsulation technology to obtain model containers; invoking POD units to obtain a POD unit set for each model container and generating a service tag; distributing request data to the corresponding model containers with a load balancing component; splitting the request content in the request data to obtain a request block set, dynamically adjusting the number of POD units in the POD unit set with the service tag corresponding to the assigned model container, and acquiring computation space resources from a server with the POD unit set to perform data processing on the request block set. The invention also relates to blockchain technology; the network model containers can be stored in blockchain nodes. The invention further provides a multi-model-based resource allocation apparatus, an electronic device, and a computer-readable storage medium. The invention can dynamically allocate resources to a plurality of network models.

Description

Resource allocation method and device based on multiple models, electronic equipment and storage medium
Technical Field
The present invention relates to the field of load allocation technologies, and in particular, to a resource allocation method and apparatus based on multiple models, an electronic device, and a computer-readable storage medium.
Background
Speech recognition is a complex engineering problem: recognition quality depends heavily on the speaker's region, scenario, language, age, gender, and so on, and no single network model performs well in every respect, so a speech recognition system is often configured with multiple network models. However, when multiple network models work together, a large volume of speech recognition tasks may compete for memory resources, causing the system to crash.
Disclosure of Invention
The invention provides a resource allocation method and device based on multiple models, electronic equipment and a computer readable storage medium, and aims to dynamically allocate resources to multiple network models.
In order to achieve the above object, the present invention provides a resource allocation method based on multiple models, including:
obtaining a plurality of network models, packaging the operating environment of each network model by utilizing a pre-constructed container packaging technology to obtain a model container of each network model, and storing the address information of each model container in a load balancing component;
invoking pre-constructed POD units in each model container to obtain a POD unit set of the model container and generating a service tag of the POD unit set;
receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result;
and performing request splitting on request content in the request data to obtain a request block set, dynamically adjusting the number of POD units in the corresponding POD unit set by using the service tag corresponding to the distributed model container according to the request block set, and acquiring operation space resources from a pre-constructed server by using the POD unit set to execute data processing on the request block set.
Optionally, the invoking a pre-constructed POD unit in each model container to obtain a POD unit set of the model container, and generating a service tag of the POD unit set includes:
operating the model container to obtain a quantity set of the POD units required to operate the model container under a plurality of preset conditions;
setting a quantity interval of POD units in the model container according to the quantity set, and randomly invoking any number of POD units within the quantity interval to obtain a POD unit set of the model container;
and setting an elastic scaling monitoring object and an autoscaling object, and generating a service tag from the quantity interval of the POD units together with the elastic scaling monitoring object and the autoscaling object.
Optionally, the dynamically adjusting, according to the request block set, the number of POD units in the corresponding POD unit set by using the service tag corresponding to the allocated model container includes:
monitoring the number of POD units in the POD unit set by using the elastic scaling monitoring object in the service tag according to the number of request blocks in the request block set, and adding or removing POD units in the POD unit set by using the autoscaling object in the service tag;
and when the number of the POD units equals the number of request blocks in the request block set or reaches the upper limit of the quantity interval, stopping adding POD units and allocating one POD unit to each request block.
Optionally, before the obtaining of the computation space resource from the pre-constructed server by using the POD unit set and performing the data processing on the request block set, the method further includes:
detecting the number of model-container connections of each server;
and connecting the allocated model container to the server with the fewest model-container connections.
Optionally, the acquiring, by using the POD unit set, an operation space resource from a pre-constructed server to execute data processing on the request block set includes:
monitoring the residual space in the memory operation space in the server, and when the residual space is larger than the consumption space of the POD unit set, receiving memory resources by using the POD unit set and executing data processing on the request block set;
when the residual space is smaller than the consumption space of the POD unit set, the memory operation space of the server side is cleaned by using a preset elimination strategy until the residual space is larger than the consumption space of the POD unit set, and the POD unit set is used for receiving memory resources and executing data processing on the request block set.
Optionally, the receiving, by using the load balancing component, a request data set transmitted by a user side, and matching a request address in each request data in the request data set with the address information of the model container includes:
intercepting the request data set sent by the user side by using HTTP (HyperText Transfer Protocol) in the load balancing component;
acquiring a request address and request content included by each request data in the request data set;
and matching the request address with the address information of the model container stored in the load balancing component to obtain a matching result.
Optionally, the encapsulating, by using a pre-constructed container encapsulation technology, the operating environment of each network model to obtain a model container of each network model includes:
obtaining a dependency package of the network model to obtain an operating environment of the network model;
and packaging the operating environment by utilizing a pre-constructed container packaging technology to obtain a model container of each network model.
In order to solve the above problem, the present invention further provides a resource allocation apparatus based on multiple models, the apparatus including:
the container construction module, configured to acquire a plurality of network models, encapsulate the operating environment of each network model by using a pre-constructed container encapsulation technology to obtain a model container of each network model, and store the address information of each model container in a load balancing component;
the tag configuration module, configured to invoke pre-constructed POD units in each model container to obtain a POD unit set of the model container and generate a service tag of the POD unit set;
the request distribution module is used for receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result;
and the resource dynamic invoking module, configured to split the request content in the request data to obtain a request block set, dynamically adjust the number of POD units in the corresponding POD unit set by using the service tag corresponding to the allocated model container according to the request block set, and acquire computation space resources from a pre-constructed server by using the POD unit set to execute data processing on the request block set.
In order to solve the above problem, the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor to cause the at least one processor to perform the multi-model based resource allocation method described above.
In order to solve the above problem, the present invention further provides a computer-readable storage medium including a storage data area and a storage program area, the storage data area storing created data, the storage program area storing a computer program; wherein the computer program, when executed by a processor, implements the multi-model based resource allocation method described above.
In addition, the embodiment of the invention configures a load balancing component, transfers the request data to the container, connects the container with an idle server, prevents resource contention, further adjusts the number of POD units of each container by using a dynamic expansion technology, and can realize reasonable distribution of memory resources. The embodiment of the invention can carry out resource dynamic allocation on a plurality of network models.
Drawings
Fig. 1 is a schematic flowchart of a resource allocation method based on multiple models according to an embodiment of the present invention;
fig. 2 is a flowchart of generating a service tag in the resource allocation method based on multiple models according to the embodiment of the present invention;
fig. 3 is a flowchart of dynamically adjusting a POD unit in the multi-model-based resource allocation method according to an embodiment of the present invention;
FIG. 4 is a block diagram of a resource allocation apparatus based on multiple models according to an embodiment of the present invention;
fig. 5 is a schematic internal structural diagram of an electronic device implementing a resource allocation method based on multiple models according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the application provides a resource allocation method based on multiple models. The executing body of the resource allocation method based on multiple models includes, but is not limited to, at least one of electronic devices, such as a server, a terminal, and the like, which can be configured to execute the method provided by the embodiments of the present application. In other words, the resource allocation method based on multiple models may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.
Fig. 1 is a schematic flow chart of a resource allocation method based on multiple models according to an embodiment of the present invention. In this embodiment, the resource allocation method based on multiple models includes:
s1, obtaining a plurality of network models, packaging the operating environment of each network model by using a pre-constructed container packaging technology to obtain a model container of each network model, and storing the address information of each model container in a load balancing component.
In the embodiment of the present invention, the network models may be different speech recognition models. Different network models can be affected by the environments of different servers; to ensure that each network model performs well on different servers, the embodiment of the invention encapsulates each network model into a container.
In detail, the embodiment of the present invention obtains the dependency package of the network model to obtain the operating environment of the network model; and packaging the operating environment by utilizing a pre-constructed container packaging technology to obtain a model container of each network model.
In the embodiment of the invention, the container packaging technology can be Docker container technology, wherein Docker is an open-source application container engine and is used for packaging and generating a lightweight and portable container. According to the embodiment of the invention, each model container is obtained by packaging the network model, so that the limitation and isolation of resources on different network models can be realized.
Further, the load balancing component in the embodiment of the present invention is an Nginx component, configured to provide a load balancing service for request data sent by the user side.
And S2, invoking pre-constructed POD units in each model container to obtain a POD unit set of the model container and generate a service tag of the POD unit set.
In the embodiment of the present invention, the POD unit is the smallest unit that can be created and deployed in Kubernetes, and is an application instance in a Kubernetes cluster. In the embodiment of the invention, one model container can invoke a plurality of POD units, and the POD units share the computation space resources of the same server, enabling efficient information exchange.
In detail, referring to fig. 2, the S2 includes:
s20, operating the model container to obtain a set of numbers of POD units required to operate the model container under a preset plurality of conditions.
The embodiment of the invention can use the pre-constructed test case to operate each model container, record the number of calling POD units in each model container and form the number set.
S21, setting a quantity interval of POD units in the model container according to the quantity set, and randomly calling any number of POD units in the quantity interval to obtain the POD unit set of the model container.
In an application example of the present invention, the quantity set may be [3, 4, 5, 7, 8, 9, 6], and [3, 9] is then set as the quantity interval of the POD units. When the model container is started, to increase the processing speed, a number of POD units within the quantity interval may first be invoked at random. For example, in the embodiment of the present invention, the value may be 3, giving the POD units (POD1, POD2, POD3).
S22, setting an elastic scaling monitoring object and an autoscaling object, and generating a service tag from the quantity interval of the POD units together with the elastic scaling monitoring object and the autoscaling object.
Furthermore, the embodiment of the invention sets an elastic scaling monitoring object and an autoscaling object, and constructs the service tag using the quantity interval. The autoscaling object can remove POD units or bring in pre-constructed POD units, and the elastic scaling monitoring object can monitor changes in the number of POD units in real time.
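The interval derivation and tag construction of steps S20 to S22 can be sketched in a few lines of Python. This is a minimal illustration only: the function name `build_service_tag` and the tag field names are assumptions of this sketch, not names taken from the patent.

```python
import random

def build_service_tag(observed_pod_counts, seed=None):
    # S20/S21: derive the quantity interval from the POD-unit counts
    # observed while running the model container under preset conditions.
    lower, upper = min(observed_pod_counts), max(observed_pod_counts)
    # Randomly invoke some number of POD units within the interval at startup.
    initial = random.Random(seed).randint(lower, upper)
    pods = [f"POD{i}" for i in range(1, initial + 1)]
    # S22: the service tag bundles the interval with the two scaling objects.
    tag = {
        "interval": (lower, upper),
        "monitor": "elastic-scaling-monitor",  # watches POD-count changes
        "scaler": "autoscaler",                # adds or removes POD units
    }
    return pods, tag

pods, tag = build_service_tag([3, 4, 5, 7, 8, 9, 6])
assert tag["interval"] == (3, 9)
assert 3 <= len(pods) <= 9
```

With the example quantity set from the text, the interval comes out as [3, 9] and the initial Pod set has between 3 and 9 units.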
S3, receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result.
In the embodiment of the present invention, the request data set is a set of request data, where the request data may be request information for acquiring resources, which is sent from a user side to a server side.
The request data in the embodiment of the invention comprises a request address and request content. In detail, the receiving, by the load balancing component, a request data set transmitted by a user side, and matching a request address in each request data in the request data set with address information of the model container includes:
intercepting the request data set sent by the user side by using HTTP (HyperText Transfer Protocol) in the load balancing component;
acquiring a request address and request content included by each request data in the request data set;
and matching the request address with the address information of the model container stored in the load balancing component to obtain a matching result.
Further, according to the matching result, the embodiment of the present invention sends the request content in the request data to the model container matched by the request address.
According to the embodiment of the invention, the load balancing component distributes the request data to different model containers, which improves the responsiveness of the multiple network models, eliminates single points of failure, and thereby improves the availability of the network models.
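The address-matching step performed by the load balancing component can be sketched as a lookup against the stored container addresses. All names here (`match_and_dispatch`, the example container names and paths) are illustrative assumptions, not values from the patent.

```python
def match_and_dispatch(request, container_addresses):
    # Compare the request address against each model container's stored
    # address; the matching result decides where the request content goes.
    for container, address in container_addresses.items():
        if request["address"] == address:
            return container  # dispatch the request content here
    return None  # no model container serves this address

containers = {"mandarin_model": "/asr/mandarin", "dialect_model": "/asr/dialect"}
req = {"address": "/asr/dialect", "content": "audio bytes"}
assert match_and_dispatch(req, containers) == "dialect_model"
```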
S4, request splitting is carried out on the request content in the request data to obtain a request block set, the number of POD units in the corresponding POD unit set is dynamically adjusted by using the service label corresponding to the distributed model container according to the request block set, and the POD unit set is used for acquiring operation space resources from a pre-constructed server to execute data processing on the request block set.
In the embodiment of the present invention, each POD unit may invoke the computation resource space in the server, and in order to ensure fast computation of the model container, the requested content needs to be distributed to a plurality of POD units.
In detail, referring to fig. 3, in the embodiment of the present invention, the dynamically adjusting, according to the request block set, the number of POD units in the corresponding POD unit set by using the service tag corresponding to the allocated model container includes:
step S40, according to the number of request blocks in the request block set, monitoring the number of POD units in the POD unit set by using the elastic scaling monitoring object in the service tag, and adding or removing POD units in the POD unit set by using the autoscaling object in the service tag;
step S41, when the number of POD units equals the number of request blocks in the request block set or reaches the upper limit of the quantity interval, stopping adding POD units and allocating one POD unit to each request block.
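The stopping rule of steps S40 and S41 amounts to clamping the target Pod count to the quantity interval: scale toward one POD unit per request block, but never past the interval's upper limit. A minimal sketch (function name assumed, not from the patent):

```python
def target_pod_count(num_request_blocks, quantity_interval):
    lower, upper = quantity_interval
    # One POD unit per request block, clamped so the count never leaves
    # the quantity interval set in the service tag.
    return max(lower, min(num_request_blocks, upper))

assert target_pod_count(5, (3, 9)) == 5    # one POD unit per request block
assert target_pod_count(12, (3, 9)) == 9   # capped at the interval's upper limit
assert target_pod_count(1, (3, 9)) == 3    # never drops below the lower bound
```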
Further, in this embodiment of the present invention, the acquiring, by using the POD unit set, an operation space resource from a pre-constructed server to execute data processing on the request block set includes:
monitoring the residual space in the memory operation space in the server, and when the residual space is larger than the consumption space of the POD unit set, receiving memory resources by using the POD unit set and executing data processing on the request block set;
when the residual space is smaller than the consumption space of the POD unit set, the memory operation space of the server side is cleaned by using a preset elimination strategy until the residual space is larger than the consumption space of the POD unit set, and the POD unit set is used for receiving memory resources and executing data processing on the request block set.
In the embodiment of the invention, the elimination strategy uses the maxmemory directive of the Remote Dictionary Service (Redis) to set a maximum memory limit; when memory usage in the server exceeds maxmemory, the least recently used POD units in other containers are deleted first, thereby freeing memory.
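The eviction loop described above can be sketched as: while the remaining space is smaller than what the POD unit set will consume, evict the least recently used POD unit to free memory, then grant the allocation. This is a toy model under assumed names (`allocate_with_eviction`, the `server` dict fields); the actual mechanism in the patent is Redis's maxmemory eviction.

```python
def allocate_with_eviction(server, needed, evict_lru_pod):
    # While remaining space < space the POD unit set consumes, apply the
    # elimination strategy (evict a least-recently-used POD unit).
    while server["maxmemory"] - server["used"] < needed:
        freed = evict_lru_pod(server)
        if freed == 0:
            raise MemoryError("eviction could not free enough space")
        server["used"] -= freed
    server["used"] += needed  # grant the memory to the POD unit set
    return server["used"]

# Toy eviction policy: each evicted POD unit frees 10 units of memory.
server = {"maxmemory": 100, "used": 95}
assert allocate_with_eviction(server, 20, lambda s: 10) == 95  # 95 -> 85 -> 75, then +20
```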
In detail, in another embodiment of the present invention, before the acquiring computation space resources from a pre-constructed server by using the POD unit set and performing data processing on the request block set, the method further includes:
detecting the number of model-container connections of each server;
and connecting the allocated model container to the server with the fewest model-container connections.
The embodiment of the invention uses a least-connections algorithm to query, via each server's monitoring address, the number of model containers connected to that server, and uses the load balancing component to connect the allocated model container to the server with the fewest connections.
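The least-connections selection reduces to picking the server whose current connection count is minimal. A minimal sketch (the function name and the example server names are assumptions of this illustration):

```python
def pick_server(connection_counts):
    # Least-connections: choose the server currently holding the fewest
    # model-container connections, as reported via its monitoring address.
    return min(connection_counts, key=connection_counts.get)

servers = {"server_a": 4, "server_b": 2, "server_c": 7}
assert pick_server(servers) == "server_b"
```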
In addition, the embodiment of the invention configures a load balancing component, transfers the request data to the container, connects the container with an idle server, prevents resource contention, further adjusts the number of POD units of each container by using a dynamic expansion technology, and can realize reasonable distribution of memory resources. The embodiment of the invention can carry out resource dynamic allocation on a plurality of network models.
Fig. 4 is a schematic block diagram of a resource allocation apparatus based on multiple models according to the present invention.
The resource allocation apparatus 100 based on multiple models according to the present invention can be installed in an electronic device. According to the implemented functions, the multi-model-based resource allocation apparatus may include a container construction module 101, a tag configuration module 102, a request distribution module 103, and a resource dynamic invoking module 104. A module of the present invention, which may also be referred to as a unit, is a series of computer program segments that can be executed by a processor of an electronic device, can perform a fixed function, and are stored in a memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the container construction module 101 is configured to obtain a plurality of network models, encapsulate an operating environment of each network model by using a pre-constructed container encapsulation technology, obtain a model container of each network model, and store address information of each model container in a load balancing component.
In the embodiment of the present invention, the network models may be different speech recognition models. Different network models can be affected by the environments of different servers; to ensure that each network model performs well on different servers, the embodiment of the invention encapsulates each network model into a container.
In detail, the embodiment of the present invention obtains the dependency package of the network model to obtain the operating environment of the network model; and packaging the operating environment by utilizing a pre-constructed container packaging technology to obtain a model container of each network model.
In the embodiment of the invention, the container packaging technology can be Docker container technology, wherein Docker is an open-source application container engine and is used for packaging and generating a lightweight and portable container. According to the embodiment of the invention, each model container is obtained by packaging the network model, so that the limitation and isolation of resources on different network models can be realized.
Further, the load balancing component in the embodiment of the present invention is an Nginx component, configured to provide a load balancing service for request data sent by the user side.
The tag configuration module 102 is configured to invoke a pre-constructed POD unit in each model container, obtain a POD unit set of the model container, and generate a service tag of the POD unit set.
In the embodiment of the present invention, the POD unit is the smallest unit that can be created and deployed in Kubernetes, and is an application instance in a Kubernetes cluster. In the embodiment of the invention, one model container can invoke a plurality of POD units, and the POD units share the computation space resources of the same server, enabling efficient information exchange.
In detail, the tag configuration module 102 is specifically configured to:
operating the model container to obtain a quantity set of the POD units required to operate the model container under a plurality of preset conditions; setting a quantity interval of POD units in the model container according to the quantity set, and randomly invoking any number of POD units within the quantity interval to obtain a POD unit set of the model container; and setting an elastic scaling monitoring object and an autoscaling object, and generating a service tag from the quantity interval of the POD units together with the elastic scaling monitoring object and the autoscaling object.
The embodiment of the invention can use the pre-constructed test case to operate each model container, record the number of calling POD units in each model container and form the number set.
In an application example of the present invention, the quantity set may be [3, 4, 5, 7, 8, 9, 6], and [3, 9] is then set as the quantity interval of the POD units. When the model container is started, to increase the processing speed, a number of POD units within the quantity interval may first be invoked at random. For example, in the embodiment of the present invention, the value may be 3, giving the POD units (POD1, POD2, POD3).
Furthermore, the embodiment of the invention sets an elastic scaling monitoring object and an autoscaling object, and constructs the service tag using the quantity interval. The autoscaling object can remove POD units or bring in pre-constructed POD units, and the elastic scaling monitoring object can monitor changes in the number of POD units in real time.
The request distribution module 103 is configured to receive, by using the load balancing component, a request data set transmitted by a user, match a request address in each request data in the request data set with address information of the model container, and distribute the request data to a corresponding model container according to a matching result.
In the embodiment of the present invention, the request data set is a set of request data, where the request data may be request information for acquiring resources, which is sent from a user side to a server side.
The request data in the embodiment of the invention comprises a request address and request content. In detail, the request distribution module 103 is specifically configured to: intercept the request data set sent by the user side by using HTTP (HyperText Transfer Protocol) in the load balancing component; acquire the request address and request content included in each request data in the request data set; and match the request address with the address information of the model containers stored in the load balancing component to obtain a matching result.
Further, according to the matching result, the embodiment of the present invention sends the request content in the request data to the model container matched by the request address.
According to the embodiment of the invention, the load balancing component distributes the request data to different model containers, which improves the responsiveness of the multiple network models, eliminates single points of failure, and thereby improves the availability of the network models.
The resource dynamic invoking module 104 is configured to perform request splitting on request content in the request data to obtain a request block set, dynamically adjust the number of POD units in a corresponding POD unit set by using the service tag corresponding to the allocated model container according to the request block set, and acquire computation space resources from a pre-constructed server by using the POD unit set to perform data processing on the request block set.
In the embodiment of the present invention, each POD unit may invoke the computation resource space in the server, and in order to ensure fast computation of the model container, the requested content needs to be distributed to a plurality of POD units.
In detail, in the embodiment of the present invention, the resource dynamic invoking module 104 is specifically configured to: monitor the number of POD units in the POD unit set by using the elastic scaling monitoring object in the service tag according to the number of request blocks in the request block set, and add or remove POD units in the POD unit set by using the autoscaling object in the service tag; and when the number of POD units equals the number of request blocks in the request block set or reaches the upper limit of the quantity interval, stop adding POD units and allocate one POD unit to each request block.
Further, in this embodiment of the present invention, the acquiring, by using the POD unit set, computation space resources from a pre-constructed server to perform data processing on the request block set includes:
monitoring the remaining space in the memory computation space of the server, and when the remaining space is larger than the space consumed by the POD unit set, receiving memory resources with the POD unit set and performing data processing on the request block set;
when the remaining space is smaller than the space consumed by the POD unit set, cleaning the memory computation space of the server with a preset eviction policy until the remaining space is larger than the space consumed by the POD unit set, and then receiving memory resources with the POD unit set and performing data processing on the request block set.
In the embodiment of the invention, the eviction policy sets a maximum memory limit (maxmemory) using the remote dictionary service (Redis); when the memory occupation in the server exceeds maxmemory, the least recently used POD units in other containers are deleted first, thereby freeing memory.
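The eviction behaviour described above — a maxmemory cap with least-recently-used entries removed first, as in Redis's `allkeys-lru` policy — can be modelled with a small sketch. The class, keys, and sizes are illustrative, not the actual Redis mechanism:

```python
from collections import OrderedDict

class LRUEvictor:
    """Toy model of a maxmemory cap with LRU eviction: when total usage
    exceeds the cap, least-recently-used entries are removed first."""
    def __init__(self, maxmemory):
        self.maxmemory = maxmemory
        self.usage = OrderedDict()  # key -> size, least recently used first

    def touch(self, key, size):
        # Record (or refresh) a unit's memory usage, then evict as needed.
        if key in self.usage:
            self.usage.move_to_end(key)
        else:
            self.usage[key] = size
        while sum(self.usage.values()) > self.maxmemory:
            self.usage.popitem(last=False)  # drop the least recently used
```

In real Redis this corresponds to setting `maxmemory` and `maxmemory-policy` in the server configuration rather than implementing the loop by hand.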
In detail, in another embodiment of the present invention, the resource dynamic invoking module 104 is further configured to:
detecting the number of model containers connected to each server;
and connecting the allocated model container to the server with the fewest model-container connections.
In the embodiment of the invention, a least-connections algorithm queries, according to the monitoring address of each server, the number of model containers connected to that server, and the load balancing component connects the allocated model container to the server with the fewest connections.
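A minimal sketch of the least-connections selection; the dict of per-server connection counts is an assumed input shape, standing in for the counts queried via each server's monitoring address:

```python
def pick_server(connection_counts):
    """Least-connections algorithm: return the server whose model-container
    connection count is smallest."""
    # min() over the dict keys, ordered by each server's connection count.
    return min(connection_counts, key=connection_counts.get)
```

Ties are broken by dict iteration order here; a production balancer would typically define an explicit tie-break.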
In addition, the embodiment of the invention configures a load balancing component to forward the request data to the containers and connects the containers to idle servers, preventing resource contention; it further adjusts the number of POD units of each container by using a dynamic scaling technique, thereby achieving reasonable allocation of memory resources. The embodiment of the invention can thus dynamically allocate resources to a plurality of network models.
Fig. 5 is a schematic structural diagram of an electronic device implementing the resource allocation method based on multiple models according to the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program, such as a multi-model based resource allocation program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, such as flash memory, a removable hard disk, a multimedia card, card-type memory (e.g., SD or DX memory), magnetic memory, a magnetic disk, or an optical disk. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only for storing application software installed in the electronic device 1 and various types of data, such as the code of the multi-model based resource allocation program 12, but also for temporarily storing data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (for example, executing a resource allocation program based on multiple models, etc.) stored in the memory 11 and calling data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
Fig. 5 only shows an electronic device with components, and it will be understood by a person skilled in the art that the structure shown in fig. 5 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The multi-model based resource allocation program 12 stored in the memory 11 of the electronic device 1 is a combination of a plurality of computer programs, which when executed in the processor 10, may implement:
obtaining a plurality of network models, packaging the operating environment of each network model by utilizing a pre-constructed container packaging technology to obtain a model container of each network model, and storing the address information of each model container in a load balancing component;
calling a pre-constructed POD unit in each model container to obtain a POD unit set of the model container and generate a service label of the POD unit set;
receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result;
and performing request splitting on request content in the request data to obtain a request block set, dynamically adjusting the number of POD units in the corresponding POD unit set by using the service tag corresponding to the distributed model container according to the request block set, and acquiring operation space resources from a pre-constructed server by using the POD unit set to execute data processing on the request block set.
Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (ROM).
Further, the computer usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:
obtaining a plurality of network models, packaging the operating environment of each network model by utilizing a pre-constructed container packaging technology to obtain a model container of each network model, and storing the address information of each model container in a load balancing component;
calling a pre-constructed POD unit in each model container to obtain a POD unit set of the model container and generate a service label of the POD unit set;
receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result;
and performing request splitting on request content in the request data to obtain a request block set, dynamically adjusting the number of POD units in the corresponding POD unit set by using the service tag corresponding to the distributed model container according to the request block set, and acquiring operation space resources from a pre-constructed server by using the POD unit set to execute data processing on the request block set.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in the claims should not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains the information of a batch of network transactions, used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. Terms such as "first" and "second" are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method for resource allocation based on multiple models, the method comprising:
obtaining a plurality of network models, packaging the operating environment of each network model by utilizing a pre-constructed container packaging technology to obtain a model container of each network model, and storing the address information of each model container in a load balancing component;
calling a pre-constructed POD unit in each model container to obtain a POD unit set of the model container and generate a service label of the POD unit set;
receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result;
and performing request splitting on request content in the request data to obtain a request block set, dynamically adjusting the number of POD units in the corresponding POD unit set by using the service tag corresponding to the distributed model container according to the request block set, and acquiring operation space resources from a pre-constructed server by using the POD unit set to execute data processing on the request block set.
2. The multi-model-based resource allocation method according to claim 1, wherein said invoking a pre-constructed POD unit in each of said model containers, obtaining a set of POD units of said model container, and generating service tags for said set of POD units comprises:
operating the model container to obtain a quantity set of the POD units required for operating the model container under a preset plurality of conditions;
setting a quantity interval of POD units in the model container according to the quantity set, and randomly calling any number of POD units in the quantity interval to obtain a POD unit set of the model container;
and setting an elastic scaling monitor object and an autoscaling object, and generating a service label according to the number interval of the POD units, the elastic scaling monitor object, and the autoscaling object.
3. The multi-model-based resource allocation method according to claim 2, wherein said dynamically adjusting the number of POD units in the corresponding POD unit set according to the request block set by using the service tag corresponding to the allocated model container comprises:
monitoring the number of POD units in the POD unit set according to the number of request blocks in the request block set by using the elastic scaling monitor object in the service tag, and adding or removing POD units in the POD unit set by using the autoscaling object in the service tag;
and when the number of POD units equals the number of request blocks in the request block set, or the upper limit of the number interval is reached, stopping adding POD units and allocating one POD unit to each request block.
4. The multi-model-based resource allocation method of claim 1, wherein before said obtaining computation space resources from a pre-built server using said set of POD units to perform data processing on said set of request blocks, said method further comprises:
detecting the number of model containers connected to each server;
and connecting the allocated model container to the server with the fewest model-container connections.
5. The multi-model-based resource allocation method according to claim 4, wherein said acquiring computation space resources from a pre-built server side by using said POD unit set to perform data processing on said request block set comprises:
monitoring the residual space in the memory operation space in the server, and when the residual space is larger than the consumption space of the POD unit set, receiving memory resources by using the POD unit set and executing data processing on the request block set;
when the residual space is smaller than the consumption space of the POD unit set, the memory operation space of the server side is cleaned by using a preset elimination strategy until the residual space is larger than the consumption space of the POD unit set, and the POD unit set is used for receiving memory resources and executing data processing on the request block set.
6. The method according to any one of claims 1 to 5, wherein the receiving, by the load balancing component, the request data set transmitted by the user end, and matching the request address in each request data in the request data set with the address information of the model container, includes:
intercepting the request data set sent by the user side by using the hypertext transfer protocol (HTTP) in the load balancing component;
acquiring a request address and request content included by each request data in the request data set;
and matching the request address with the address information of the model container stored in the load balancing component to obtain a matching result.
7. The multi-model-based resource allocation method according to any one of claims 1 to 5, wherein the encapsulating the operating environment of each network model by using a pre-constructed container encapsulation technology to obtain a model container of each network model comprises:
obtaining a dependency package of the network model to obtain an operating environment of the network model;
and packaging the operating environment by utilizing a pre-constructed container packaging technology to obtain a model container of each network model.
8. An apparatus for resource allocation based on multiple models, the apparatus comprising:
the system comprises a container construction module, a load balancing component and a load balancing module, wherein the container construction module is used for acquiring a plurality of network models, packaging the operating environment of each network model by utilizing a pre-constructed container packaging technology to obtain a model container of each network model, and storing the address information of each model container in the load balancing component;
the label configuration module is used for calling a pre-constructed POD unit in each model container to obtain a POD unit set of the model container and generate a service label of the POD unit set;
the request distribution module is used for receiving a request data set transmitted by a user side by using the load balancing component, matching a request address in each request data in the request data set with the address information of the model container, and distributing the request data to the corresponding model container according to a matching result;
and the resource dynamic invoking module is used for splitting the request content in the request data to obtain a request block set, dynamically adjusting the number of POD units in the corresponding POD unit set according to the request block set by using the service tag corresponding to the allocated model container, and acquiring operation space resources from a pre-constructed server by using the POD unit set to execute data processing on the request block set.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor to enable the at least one processor to perform the multi-model based resource allocation method of any one of claims 1 to 7.
10. A computer-readable storage medium comprising a storage data area storing created data and a storage program area storing a computer program; characterized in that the computer program, when being executed by a processor, implements the multi-model based resource allocation method according to any one of claims 1 to 7.
CN202011603677.6A 2020-12-29 2020-12-29 Resource allocation method and device based on multiple models, electronic equipment and storage medium Pending CN112698945A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011603677.6A CN112698945A (en) 2020-12-29 2020-12-29 Resource allocation method and device based on multiple models, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112698945A true CN112698945A (en) 2021-04-23

Family

ID=75512261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011603677.6A Pending CN112698945A (en) 2020-12-29 2020-12-29 Resource allocation method and device based on multiple models, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112698945A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination