CN113392798A - Multi-model selection and fusion method for optimizing motion recognition precision under resource limitation - Google Patents

Info

Publication number
CN113392798A
CN113392798A CN202110729963.5A CN202110729963A CN113392798A CN 113392798 A CN113392798 A CN 113392798A CN 202110729963 A CN202110729963 A CN 202110729963A CN 113392798 A CN113392798 A CN 113392798A
Authority
CN
China
Prior art keywords
model
resource
rectangular
recognition
models
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110729963.5A
Other languages
Chinese (zh)
Other versions
CN113392798B (en)
Inventor
张兰
李向阳
刘梦境
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN202110729963.5A
Publication of CN113392798A
Application granted
Publication of CN113392798B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-model selection and fusion method for optimizing action recognition precision under resource limitation, which belongs to the field of intelligent perception and multi-modal fusion and comprises the following steps: step 1, modeling the resource limiting parameters and the resource parameters of each single action recognition model; step 2, training an actor-critic reinforcement learning model to obtain an actor network serving as an online selection model and a critic network serving as a value scoring model; step 3, running the corresponding models according to the model combination, and fusing their recognition results into the final recognition result. The method can handle strict orthogonal resource constraints, can fuse and make use of data of multiple modalities and multiple models, and can achieve higher precision with lower resource occupation than direct end-to-end fusion. The method can be applied to action recognition in multi-modal environments with limited resources, such as smart home, patient care and unmanned driving scenarios.

Description

Multi-model selection and fusion method for optimizing motion recognition precision under resource limitation
Technical Field
The invention relates to the field of intelligent behavior perception, in particular to a multi-model selection and fusion method for optimizing motion recognition accuracy under resource limitation.
Background
With the development of intelligent sensing equipment and artificial intelligence recognition technology, intelligent behavior sensing is receiving more and more attention. In multi-modal perception scenarios such as smart homes, patient care and unmanned driving, fusing the recognition results of multiple models can improve recognition accuracy on the multi-modal perception data collected in those scenarios; this brings opportunities and, at the same time, new challenges.
Existing intelligent behavior perception methods fall mainly into two categories: 1) methods that aim at improving precision; 2) methods that balance resource consumption and precision. The former focus on the final recognition accuracy without regard to resource overhead. The latter take resource constraints into account, such as device occupancy and energy consumption.
However, existing methods for balancing resource consumption and precision consider energy consumption only qualitatively, and do not consider stricter, quantitative resource limits such as memory occupation and the delay caused by computation time. In addition, these methods only fuse two models or two modalities; fusion of more than two models or modalities is not realized.
Disclosure of Invention
Aiming at the problems in the prior art, the invention aims to provide a multi-model selection and fusion method for optimizing action recognition accuracy under resource limitation, which can solve the problems that the existing intelligent perception recognition method for balancing resource consumption and accuracy does not consider more strict quantitative resource limitation and only involves two models or two modes for fusion.
The purpose of the invention is realized by the following technical scheme:
the embodiment of the invention provides a multi-model selection method for optimizing action recognition precision under resource limitation, which comprises the following steps:
step 1, resource limiting parameter modeling and resource parameter modeling of each action recognition model:
modeling a resource limiting parameter determined according to processing resources into a total rectangular model, wherein the total memory limiting parameter of the resource limiting parameter is used as the length of the total rectangular model, and the total delay limiting parameter of the resource limiting parameter is used as the width of the total rectangular model;
modeling a resource parameter of each action recognition model in an action recognition model library into a sub-rectangular model, wherein a memory parameter of the resource parameter is used as the length of the sub-rectangular model, and a time delay parameter of the resource parameter is used as the width of the sub-rectangular model;
the length and the width of the sub-rectangular model are respectively smaller than those of the total rectangular model;
step 2, using an actor-critic reinforcement learning model as an online selection model: using time-aligned multi-modal perception data as input to train the actor network of the online selection model, running each action recognition model in the model combination selected from the action recognition model library by the actor network, fusing the recognition results of the action recognition models to obtain a final recognition result, and judging whether the final recognition result is correct by comparing it with the actual data label;
using the multi-modal perception data and the model combination output by the actor network as input to train the critic network of the actor-critic reinforcement learning model, so as to obtain the value of the current model combination;
using the resource limiting parameter modeling and the resource parameter modeling of each action recognition model from step 1 to judge whether the model combination exceeds the resource limiting parameter;
calculating a reward function from whether the final recognition result is correct and whether the model combination exceeds the resource limit, and updating the parameters of the actor network and the critic network by gradient descent according to the reward function;
step 3, online action recognition:
inputting multi-modal perception data to be recognized into a trained online selection model, outputting a model combination by the online selection model, operating each action recognition model in the model combination and fusing the recognition result of each action recognition model to obtain a final recognition result.
According to the technical scheme provided by the invention, the multi-model selection method for optimizing the action recognition accuracy under the resource limitation provided by the embodiment of the invention has the beneficial effects that:
by modeling the resource limiting parameters as a total rectangular model, rectangular packing can conveniently be used to judge whether the model combination selected by the online selection model meets the resource limit, so that the optimal model combination can be obtained under the resource limit and the recognition precision after multi-model fusion is optimized. The method can dynamically fuse multiple models and modalities under memory and time limits, optimizing action recognition precision.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
FIG. 1 is a flowchart of a multi-model selection and fusion method for optimizing motion recognition accuracy under resource constraints according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of resource constraint parameter modeling provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a system for collecting multimodal data according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an online identification method according to an embodiment of the present invention;
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the specific contents of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention. Details which are not described in detail in the embodiments of the invention belong to the prior art which is known to the person skilled in the art.
Referring to fig. 1, an embodiment of the present invention provides a multi-model selection and fusion method for optimizing motion recognition accuracy under resource constraint, including:
step 1, resource limiting parameter modeling and resource parameter modeling of each action recognition model:
modeling a resource limiting parameter determined according to processing resources into a total rectangular model, wherein the total memory limiting parameter of the resource limiting parameter is used as the length of the total rectangular model, and the total delay limiting parameter of the resource limiting parameter is used as the width of the total rectangular model;
modeling a resource parameter of each action recognition model in an action recognition model library into a sub-rectangular model, wherein a memory parameter of the resource parameter is used as the length of the sub-rectangular model, and a time delay parameter of the resource parameter is used as the width of the sub-rectangular model;
the length and the width of the sub-rectangular model are respectively smaller than those of the total rectangular model;
step 2, using an actor-critic reinforcement learning model as an online selection model: using time-aligned multi-modal perception data as input to train the actor network of the online selection model, running each action recognition model in the model combination selected from the action recognition model library by the actor network, fusing the recognition results of the action recognition models to obtain a final recognition result, and judging whether the final recognition result is correct by comparing it with the actual data label;
using the multi-modal perception data and the model combination output by the actor network as input to train the critic network of the actor-critic reinforcement learning model, so as to obtain the value of the current model combination;
using the resource limiting parameter modeling and the resource parameter modeling of each action recognition model from step 1 to judge whether the model combination exceeds the resource limiting parameter;
calculating a reward function from whether the final recognition result is correct and whether the model combination exceeds the resource limit, and updating the parameters of the actor network and the critic network by gradient descent according to the reward function;
step 3, online action recognition:
inputting multi-modal perception data to be recognized into a trained online selection model, outputting a model combination by the online selection model, operating each action recognition model in the model combination and fusing the recognition result of each action recognition model to obtain a final recognition result.
In the method, the loss function of the actor network is the negative of the mean of the values output by the critic network; the loss function of the critic network is the mean square error between the value it outputs and the reward value subsequently calculated by the reward function.
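As a rough illustration of the two loss definitions above, the following sketch evaluates them on a small batch. It is an assumption-laden toy: plain Python lists stand in for network outputs, and the function names `actor_loss` and `critic_loss` are hypothetical, not taken from the patent.

```python
from statistics import fmean


def actor_loss(critic_values):
    """Negative of the mean value the critic assigns to the actor's outputs."""
    return -fmean(critic_values)


def critic_loss(critic_values, rewards):
    """Mean squared error between the critic's values and the observed rewards."""
    if len(critic_values) != len(rewards):
        raise ValueError("critic_values and rewards must have the same length")
    return fmean((v - r) ** 2 for v, r in zip(critic_values, rewards))


if __name__ == "__main__":
    values = [0.2, 0.6]   # hypothetical critic outputs for two samples
    rewards = [0.0, 1.0]  # hypothetical reward values for the same samples
    print(actor_loss(values), critic_loss(values, rewards))
```

Minimizing the actor loss pushes the actor toward model combinations the critic values highly, while minimizing the critic loss pulls the critic's values toward the rewards actually observed.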
In step 2 of the method, a reward function is calculated by combining the correctness of the final recognition result and whether the model combination exceeds the resource limit parameter in the following manner, wherein the reward function is as follows:
[Equation image BDA0003139657480000041: reward function r as a function of rs and re]
the reward function r ∈ [0,1] combines two terms: whether the model combination exceeds the resource limit, rs ∈ {0,1}; and whether the final recognition result of the model combination is correct, re ∈ {0,1}.
In the above method, whether the model combination exceeds the resource limit is determined as follows:
judging whether the sub-rectangular models corresponding to the resource parameters of the action recognition models in the model combination can all be placed into the total rectangular model established in step 1 for the resource limiting parameter; if they can be placed, determining that the model combination does not exceed the resource limit, namely rs = 1; if they cannot, determining that the model combination exceeds the resource limit, namely rs = 0;
and if the final recognition result matches the actual data label, determining that the final recognition result is correct, namely re = 1; otherwise re = 0.
In the method, a rectangular packing algorithm is used to judge whether the sub-rectangular models corresponding to the resource parameters of the action recognition models in the model combination can be placed into the total rectangular model established in step 1 for the resource limiting parameter.
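The patent does not name a specific rectangular packing algorithm, so the following is only a minimal sketch of one possible feasibility check: a brute-force corner-point backtracking search that is practical only for a handful of models. It assumes each rectangle is a (memory, delay) pair, that rectangles may not rotate or overlap (as the embodiment states), and the names `fits` and `resource_indicator` are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float]  # (length, width) = (memory, delay)


def _overlaps(a, b) -> bool:
    """True if two placed rectangles (x, y, w, h) share interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def fits(total: Rect, subs: List[Rect]) -> bool:
    """Try to place every sub-rectangle inside the total rectangle.

    Rectangles keep their orientation and must not overlap. Candidate
    positions are the origin and the right/top edges of rectangles that
    have already been placed.
    """
    total_w, total_h = total

    def place(remaining, placed) -> bool:
        if not remaining:
            return True
        xs = {0} | {x + w for x, _, w, _ in placed}
        ys = {0} | {y + h for _, y, _, h in placed}
        for i, (w, h) in enumerate(remaining):
            rest = remaining[:i] + remaining[i + 1:]
            for x in xs:
                for y in ys:
                    cand = (x, y, w, h)
                    inside = x + w <= total_w and y + h <= total_h
                    if inside and not any(_overlaps(cand, p) for p in placed):
                        if place(rest, placed + [cand]):
                            return True
        return False

    return place(list(subs), [])


def resource_indicator(total: Rect, subs: List[Rect]) -> int:
    """rs = 1 if the combination fits within the resource limit, else rs = 0."""
    return 1 if fits(total, subs) else 0
```

For example, with a hypothetical limit of (10, 10), the sub-rectangles (6, 5), (4, 5) and (10, 5) can be packed, while two sub-rectangles of (6, 6) cannot, because they fit neither side by side nor stacked.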
The model combination obtained in steps 2 and 3 of the above method includes the selected models and the weight of each action recognition model. The model combination corresponds to a subset of the models in the action recognition model library. The weight of each action recognition model is assigned automatically by the actor network during training, according to the reward of the critic network.
In the method, the multi-modal perception data is the to-be-recognized data sensed by multiple sensors.
In step 2 of the above method, the recognition results of the motion recognition models are fused in a weighted manner according to the weights of the motion recognition models in the model combination.
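The text only states that the recognition results are fused in a weighted manner. One plausible reading, sketched below, is a weighted sum of per-model class-score vectors followed by an argmax. The score-vector representation and the function name `fuse` are assumptions for illustration.

```python
def fuse(predictions, weights):
    """Weighted sum of per-model class scores.

    predictions: one list of class scores per selected model.
    weights: one weight per selected model, in the same order.
    Returns (index of the highest fused score, fused score list).
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    n_classes = len(predictions[0])
    fused = [0.0] * n_classes
    for scores, weight in zip(predictions, weights):
        if len(scores) != n_classes:
            raise ValueError("all models must score the same set of classes")
        for k, score in enumerate(scores):
            fused[k] += weight * score
    label = max(range(n_classes), key=fused.__getitem__)
    return label, fused


if __name__ == "__main__":
    # Two hypothetical models scoring three action classes.
    preds = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
    label, scores = fuse(preds, weights=[0.5, 1.0])
    print(label, scores)
```

In this example the second model carries twice the weight of the first, so its preferred class (index 1) wins even though the first model favours class 0.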
The method can select the optimal model combination meeting the resource limitation condition on line under the limitation of memory and time, and optimize the action recognition precision by dynamically fusing various models and modes.
The embodiments of the present invention are described in further detail below.
Referring to fig. 1, an embodiment of the present invention provides a multi-model selection method for optimizing motion recognition accuracy under resource constraint, including the following steps:
step 1, modeling a resource limitation parameter based on a rectangular packing algorithm: modeling the resource limiting parameter into a total rectangular model, taking the memory limiting parameter of the resource limiting parameter as the length of the total rectangular model, and taking the time delay limiting parameter of the resource limiting parameter as the width of the total rectangular model;
modeling the resource parameter of each action recognition model in the model library into a sub-rectangular model, wherein the memory parameter of the resource parameter is used as the length of the sub-rectangular model, and the time delay parameter of the resource parameter is used as the width of the sub-rectangular model;
the length and the width of the established sub-rectangle model are respectively smaller than those of the total rectangle model, namely the total rectangle model is a large rectangle, and the sub-rectangle models are small rectangles, so that the resource constraint is converted into whether a plurality of selected small rectangles can be placed in the large rectangle (see figure 2), and the small rectangles cannot rotate and cannot be overlapped;
step 2, taking an actor-critic reinforcement learning model as an online selection model, and training the online selection model to select a model combination from the model library online, specifically:
step 21) training the actor network and the critic network of the actor-critic reinforcement learning model, wherein the input of the actor network is the multi-modal perception data and its output is a model combination (comprising the selected models and the weight of each model); the input of the critic network is the multi-modal perception data together with the model combination output by the actor network, and its output is the value of the current model combination; the loss function of the actor network is the negative of the mean of the values output by the critic network; the loss function of the critic network is the mean square error between the value it outputs and the reward value subsequently calculated by the reward function; both networks update their parameters by gradient descent;
step 22) using the following reward function as feedback to update the actor network and the critic network, wherein the reward function r ∈ [0,1] covers two aspects: whether the model combination exceeds the resource limit, rs ∈ {0,1}; and whether the recognition result obtained by fusing the models in the model combination is correct, re ∈ {0,1}; the reward function is specifically:
[Equation image BDA0003139657480000051: reward function r as a function of rs and re]
step 3, operating corresponding action recognition models according to the model combination output by the online selection model, and fusing the recognition result of each action recognition model to obtain a final recognition result;
step 4, calculating the reward function of step 22) from the action recognition models in the model combination and the final fusion result, recording tuples consisting of the input data, the model combination and the reward value, and using this history to update the parameters of the actor network and the critic network by gradient descent.
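The history of (input data, model combination, reward value) tuples described in the last step could be kept in a bounded buffer from which training batches are drawn. The sketch below is one simple way to do this. The class name `History`, the fixed capacity and the uniform random sampling are assumptions, since the patent does not say how the history is stored or sampled.

```python
import random
from collections import deque


class History:
    """Bounded record of (input data, model combination, reward) tuples."""

    def __init__(self, capacity=10000):
        # The oldest tuples are discarded once the capacity is reached.
        self._items = deque(maxlen=capacity)

    def record(self, observation, combination, reward):
        self._items.append((observation, combination, reward))

    def sample(self, batch_size, rng=random):
        """Draw up to batch_size tuples uniformly at random, without replacement."""
        k = min(batch_size, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    history = History(capacity=3)
    history.record("window-1", (0.0, 0.5, 0.8), 1.0)
    history.record("window-2", (1.0, 0.0, 0.0), 0.0)
    print(history.sample(batch_size=2))
```

Each sampled batch would then feed one gradient-descent update of the actor and critic networks.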
The method of the invention models two-dimensional resource constraint, dynamically selects and fuses a plurality of models by online selecting model combination in the steps, and improves the identification precision of multi-model fusion.
Examples
(1) Early preparation:
1a) collecting multi-modal perception data (see fig. 3), including smartphone acceleration sensor data, Wi-Fi data and sound data;
1b) training a plurality of action recognition models in the action recognition model library with perception data of different modalities, including SVM, XGBoost and LSTM models, and measuring the recognition accuracy, memory occupation and recognition delay of each model;
(2) training an online selection model:
21) time alignment is carried out on the collected multi-modal perception data, and the multi-modal perception data are used for training an online selection model;
22) training the actor-critic reinforcement learning model serving as the online selection model with the time-aligned multi-modal perception data, and obtaining a model combination from the model library; a model combination contains several models, each with an attached weight. The action recognition model library holds a plurality of action recognition models, and a model combination contains only some of them, so it can be regarded as a subset of the model library. In addition, each action recognition model in the model combination has a weight that is used when the recognition results are subsequently fused. For example: if the model library has 3 models, the model combination may be a vector (a, b, c), where a, b and c correspond to action recognition models one, two and three, take values between 0 and 1, and represent the weights of those models; a = 0 means that model one is not selected, and a = 0.5 means that the weight of model one is 0.5.
23) Operating each model in the model combination, and fusing the recognition results of each model to obtain a final recognition result;
24) judging whether the final recognition result is correct by comparing it with the actual data label; judging whether the model combination exceeds the resource limit through the modeling of step 1; calculating the reward function from these two conditions, and updating the parameters of the actor network and the critic network by gradient descent;
(3) online action recognition (see fig. 4):
inputting multi-modal perception data into a trained online selection model, outputting a model combination by the online selection model, operating each action recognition model in the model combination and fusing the recognition result of each action recognition model to obtain a final recognition result.
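To make the (a, b, c) weight-vector example from the training step concrete, the sketch below decodes such a vector into the selected subset of the model library. The model names are illustrative placeholders and `decode_combination` is a hypothetical helper. Following the example, a weight of 0 is treated as "not selected".

```python
def decode_combination(vector, model_names):
    """Map a weight vector to {model name: weight}, dropping unselected models.

    A weight of 0 means the model is not selected; any other weight in
    (0, 1] is the model's fusion weight.
    """
    if len(vector) != len(model_names):
        raise ValueError("need exactly one weight per model in the library")
    if any(w < 0 or w > 1 for w in vector):
        raise ValueError("weights must lie between 0 and 1")
    return {name: w for name, w in zip(model_names, vector) if w > 0}


if __name__ == "__main__":
    library = ["model_one", "model_two", "model_three"]  # placeholder names
    print(decode_combination((0, 0.5, 0.8), library))
```

Only the models that survive decoding need to be run at recognition time, which is where the memory and delay savings come from.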
The method of the invention models the resource constraints as a rectangle packing problem, which handles strict quantitative resource limits; the online selection model is designed on the actor-critic framework, which addresses the problem that labels for model combinations are not unique and are difficult to obtain; and the final recognition accuracy is optimized by dynamically selecting model combinations online. The method can handle strict orthogonal resource constraints, can fuse and make use of perception data of multiple modalities and multiple models, and, compared with direct end-to-end fusion, can achieve higher accuracy with lower resource occupation. The method can be applied to action recognition in multi-modal environments with limited resources, such as smart home, patient care and unmanned driving scenarios.
Those of ordinary skill in the art will understand that: all or part of the processes of the methods for implementing the embodiments may be implemented by a program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods as described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A multi-model selection and fusion method for optimizing motion recognition accuracy under resource limitation is characterized by comprising the following steps:
step 1, resource limiting parameter modeling and resource parameter modeling of each action recognition model:
modeling a resource limiting parameter determined according to processing resources into a total rectangular model, wherein the total memory limiting parameter of the resource limiting parameter is used as the length of the total rectangular model, and the total delay limiting parameter of the resource limiting parameter is used as the width of the total rectangular model;
modeling a resource parameter of each action recognition model in an action recognition model library into a sub-rectangular model, wherein a memory parameter of the resource parameter is used as the length of the sub-rectangular model, and a time delay parameter of the resource parameter is used as the width of the sub-rectangular model;
the length and the width of the sub-rectangular model are respectively smaller than those of the total rectangular model;
step 2, using an actor-critic reinforcement learning model as an online selection model: using time-aligned multi-modal perception data as input to train the actor network of the online selection model, running each action recognition model in the model combination selected from the action recognition model library by the actor network, fusing the recognition results of the action recognition models to obtain a final recognition result, and judging whether the final recognition result is correct by comparing it with the actual data label;
using the multi-modal perception data and the model combination output by the actor network as input to train the critic network of the actor-critic reinforcement learning model, so as to obtain the value of the current model combination;
using the resource limiting parameter modeling and the resource parameter modeling of each action recognition model from step 1 to judge whether the model combination exceeds the resource limiting parameter;
calculating a reward function from whether the final recognition result is correct and whether the model combination exceeds the resource limit, and updating the parameters of the actor network and the critic network by gradient descent according to the reward function;
step 3, online action recognition:
inputting multi-modal perception data to be recognized into a trained online selection model, outputting a model combination by the online selection model, operating each action recognition model in the model combination and fusing the recognition result of each action recognition model to obtain a final recognition result.
2. The multi-model selection and fusion method for optimizing motion recognition accuracy under resource constraints according to claim 1, wherein the loss function of the actor network is the negative of the mean of the values output by the critic network; the loss function of the critic network is the mean square error between the value it outputs and the reward value subsequently calculated by the reward function.
3. The method for selecting and fusing multiple models for optimizing motion recognition accuracy under resource constraint according to claim 1 or 2, wherein in the step 2 of the method, a reward function is calculated by combining the correctness of the final recognition result and whether the model combination exceeds the resource constraint parameter in the following way, and the reward function is:
[Equation image FDA0003139657470000021: reward function r as a function of rs and re]
the reward function r ∈ [0,1] combines two terms: whether the model combination exceeds the resource limit, rs ∈ {0,1}; and whether the final recognition result of the model combination is correct, re ∈ {0,1}.
4. The method of claim 3, wherein determining whether the model combination exceeds the resource limit comprises:
judging whether the sub-rectangular models corresponding to the resource parameters of the action recognition models in the model combination can all be placed into the total rectangular model established in step 1 for the resource limiting parameter; if they can be placed, determining that the model combination does not exceed the resource limit, namely rs = 1; if they cannot, determining that the model combination exceeds the resource limit, namely rs = 0;
and if the final recognition result matches the actual data label, determining that the final recognition result is correct, namely re = 1; otherwise re = 0.
5. The method for selecting multiple models for optimizing motion recognition accuracy under resource constraints according to claim 4, wherein a rectangular packing algorithm is used to judge whether the sub-rectangular models corresponding to the resource parameters of the action recognition models in the model combination can be placed into the total rectangular model established in step 1 for the resource limiting parameter.
6. The method for selecting multiple models to optimize motion recognition accuracy under resource constraints according to claim 1 or 2, wherein the combination of models obtained in step 2 and step 3 comprises a plurality of selected motion recognition models and a weight of each motion recognition model.
7. The multi-model selection and fusion method for optimizing motion recognition accuracy under resource constraints according to claim 1 or 2, wherein the multi-modal perception data is multi-sensor perception data to be recognized.
8. The method for selecting and fusing multiple models to optimize motion recognition accuracy under resource constraints according to claim 1 or 2, wherein in step 2, the recognition results of the motion recognition models are fused in a weighted manner according to the weights of the motion recognition models in the model combination.
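The weighted fusion of claim 8 might look like the sketch below. It assumes, purely for illustration, that each selected motion recognition model outputs a vector of per-class scores and that the fused result is the class with the highest weighted sum; the claims state only that the results are fused in a weighted manner, so the score format, the function name `fuse_weighted`, and the argmax decision rule are all assumptions.

```python
from typing import List, Sequence, Tuple


def fuse_weighted(
    scores: Sequence[Sequence[float]], weights: Sequence[float]
) -> Tuple[int, List[float]]:
    """Return (predicted_class_index, fused_scores).

    scores[i] is the per-class score vector of model i (assumed format);
    weights[i] is that model's weight in the model combination.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    num_classes = len(scores[0])
    fused = [0.0] * num_classes
    for model_scores, weight in zip(scores, weights):
        if len(model_scores) != num_classes:
            raise ValueError("all models must score the same set of classes")
        # Accumulate this model's weighted contribution to each class.
        for class_idx, score in enumerate(model_scores):
            fused[class_idx] += weight * score
    # Pick the class with the largest fused score (first one on ties).
    predicted = max(range(num_classes), key=fused.__getitem__)
    return predicted, fused
```

With two models scoring two classes as `[0.6, 0.4]` and `[0.2, 0.8]`, weights `[0.3, 0.7]` favor the second model and select class 1, while weights `[0.9, 0.1]` select class 0.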
CN202110729963.5A 2021-06-29 2021-06-29 Multi-model selection and fusion method for optimizing motion recognition precision under resource limitation Active CN113392798B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110729963.5A CN113392798B (en) 2021-06-29 2021-06-29 Multi-model selection and fusion method for optimizing motion recognition precision under resource limitation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110729963.5A CN113392798B (en) 2021-06-29 2021-06-29 Multi-model selection and fusion method for optimizing motion recognition precision under resource limitation

Publications (2)

Publication Number Publication Date
CN113392798A true CN113392798A (en) 2021-09-14
CN113392798B CN113392798B (en) 2022-09-02

Family

ID=77624468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110729963.5A Active CN113392798B (en) 2021-06-29 2021-06-29 Multi-model selection and fusion method for optimizing motion recognition precision under resource limitation

Country Status (1)

Country Link
CN (1) CN113392798B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108600379A (en) * 2018-04-28 2018-09-28 中国科学院软件研究所 Heterogeneous multi-agent collaborative decision-making method based on deep deterministic policy gradient
WO2019081778A1 (en) * 2017-10-27 2019-05-02 Deepmind Technologies Limited Distributional reinforcement learning for continuous control tasks
CN112700664A (en) * 2020-12-19 2021-04-23 北京工业大学 Traffic signal timing optimization method based on deep reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BOYUAN YAN et al.: "Actor-Critic-Based Resource Allocation for Multi-modal Optical Networks", IEEE *

Also Published As

Publication number Publication date
CN113392798B (en) 2022-09-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant