CN109670600A

CN109670600A - Decision-making method and system based on cloud platform

Info

Publication number: CN109670600A
Application number: CN201811536784.4A
Authority: CN
Inventors: 高超
Original assignee: Qiyuan World (beijing) Information Technology Service Co Ltd
Current assignee: Qiyuan World (beijing) Information Technology Service Co Ltd
Priority date: 2018-12-14
Filing date: 2018-12-14
Publication date: 2019-04-23
Anticipated expiration: 2038-12-14
Also published as: CN109670600B

Abstract

The invention belongs to the field of artificial intelligence and discloses a decision-making method and system based on a cloud platform. The method includes: receiving a simulated training request; configuring hardware resources and initializing a simulation engine, a decision engine and a learning engine; the decision engine obtaining, according to the simulated environment state generated by the simulation engine and an initial model, an instruction that acts on the simulated environment, and repeating this in a loop to obtain a simulated training set; the learning engine training the initial model on the simulated training set to obtain an updated initial model; and sending a terminal device initialization request, which includes the hardware resources and the updated initial model and is used to make the terminal device configure the hardware resources and output, according to the state of the real environment, instructions that act on the real environment. The system includes a first scheduling device, a simulation engine, a decision engine, a learning engine and a second scheduling device. With this scheme the invention forms a complete decision closed loop and achieves decisions that meet expectations.

Description

Decision-making method and system based on cloud platform

Technical field

The invention belongs to the field of artificial intelligence, and in particular relates to a decision-making method and system based on a cloud platform.

Background art

Existing decision systems are based on human-defined rules and cannot adjust to changes in the environment.

Existing intelligent systems are based solely on artificial intelligence and cannot calibrate the results of machine learning. Moreover, most intelligent systems concentrate on the learning side and cannot provide end-to-end decision support. A small number of end-to-end systems exist, but they are still at the research stage: they cannot adapt to production environments that demand large scale and high availability, and they do not support the various deployment forms or the learning-side and decision-side devices.

Summary of the invention

To solve the above problems, one aspect of the present invention provides a decision-making method based on a cloud platform, comprising: receiving a simulated training request, the simulated training request including: hardware resources, a real environment, an initial model and a reinforcement learning algorithm; configuring the hardware resources and initializing a simulation engine, a decision engine and a learning engine; the decision engine obtaining an instruction according to the simulated environment state generated by the simulation engine and the initial model, the instruction acting on the simulated environment, and repeating this in a loop to obtain a simulated training set, each sample of which includes: the state of the simulated environment and the instruction; the learning engine training the initial model on the simulated training set to obtain an updated initial model; and sending a terminal device initialization request, the terminal device initialization request including: the hardware resources and the updated initial model, and being used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

In the decision-making method described above, preferably, the simulated training request further includes: rules; accordingly, the decision engine obtains the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rules; accordingly, the terminal device initialization request includes: the hardware resources, the updated initial model and the rules.

In the decision-making method described above, preferably, the decision engine obtaining the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rules specifically includes: the decision engine first obtains an initial instruction according to the simulated environment state generated by the simulation engine and the initial model, then corrects the initial instruction according to the rules, and takes the corrected result as the output instruction; or the decision engine first judges whether the state of the simulated environment is suited to model prediction or to rule prediction; if it is judged to be suited to model prediction, the instruction is obtained according to the state of the simulated environment and the initial model; if it is judged to be suited to rule prediction, the instruction is obtained according to the state of the simulated environment and the rules.

Another aspect provides a decision system based on a cloud platform, comprising: a first scheduling device, for receiving a simulated training request, the simulated training request including: hardware resources, a real environment, an initial model and a reinforcement learning algorithm, and for configuring the hardware resources and initializing a simulation engine, a decision engine and a learning engine; the simulation engine, for generating a simulated environment to simulate the real environment; the decision engine, for obtaining an instruction according to the state of the simulated environment and the initial model, the instruction being executed in the simulated environment, and for repeating this in a loop to obtain a simulated training set; the learning engine, for training the initial model on the simulated training set to obtain an updated initial model; and a second scheduling device, for sending a terminal device initialization request, the terminal device initialization request including: the hardware resources and the updated initial model, and being used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

In the decision system described above, preferably, the simulated training request received by the first scheduling device further includes: rules; accordingly, the decision engine is used to obtain the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rules; and the terminal device initialization request sent by the second scheduling device includes: the hardware resources, the updated initial model and the rules.

In the decision system described above, preferably, the decision engine being used to obtain the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rules specifically means: the decision engine is used to first obtain an initial instruction according to the simulated environment state generated by the simulation engine and the initial model, then correct the initial instruction according to the rules, and take the corrected result as the output instruction; or the decision engine is used to first judge whether the state of the simulated environment is suited to model prediction or to rule prediction; if it is judged to be suited to model prediction, the instruction is obtained according to the state of the simulated environment and the initial model; if it is judged to be suited to rule prediction, the instruction is obtained according to the state of the simulated environment and the rules.

Another aspect of the present invention provides a decision-making method based on a cloud platform, comprising: receiving a simulated training request, the simulated training request including: hardware resources, a real environment, an initial model and a reinforcement learning algorithm; configuring the hardware resources and initializing a simulation engine, a decision engine and a learning engine; the decision engine obtaining an instruction according to the simulated environment state generated by the simulation engine and the initial model, the instruction acting on the simulated environment, and repeating this in a loop to obtain a simulated training set, each sample of which includes: the state of the simulated environment and the instruction; the learning engine training the initial model on the simulated training set to obtain an updated initial model; sending a terminal device initialization request, the terminal device initialization request including: the hardware resources; configuring the hardware resources, so that the acquired real environment state is returned to the decision engine and the instruction output by the decision engine is executed in the real environment; repeating this in a loop in the real environment to obtain a real training set, each sample of which includes: the state of the real environment and the instruction; the learning engine training the updated initial model on the real training set to obtain a re-updated initial model; and sending a terminal device initialization request, the terminal device initialization request including: the hardware resources and the re-updated initial model, and being used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

A further aspect of the present invention provides a decision system based on a cloud platform, comprising: a first scheduling device, for receiving a simulated training request, the simulated training request including: hardware resources, a real environment, an initial model and a reinforcement learning algorithm, and for configuring the hardware resources and initializing a simulation engine, a decision engine and a learning engine; the simulation engine, for generating a simulated environment to simulate the real environment; the decision engine, for obtaining, according to the state of the simulated environment and the initial model, an instruction that acts on the simulated environment, and for repeating this in a loop to obtain a simulated training set; the learning engine, for training the initial model on the simulated training set to obtain an updated initial model; and a second scheduling device, for sending a terminal device initialization request, the terminal device initialization request including: the hardware resources, so that the terminal device configures the hardware resources, returns the acquired real environment state to the decision engine and executes in the real environment the instruction output by the decision engine; the decision engine is further used to output instructions according to the updated initial model and the state of the real environment, repeating this in a loop in the real environment to obtain a real training set, each sample of which includes: the state of the real environment and the instruction; the learning engine is further used to train the updated initial model on the real training set to obtain a re-updated initial model; the second scheduling device is further used to send a terminal device initialization request, the terminal device initialization request including: the hardware resources and the re-updated initial model, and being used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

The embodiments of the present invention bring the following beneficial effects:

A complete decision closed loop is formed, and decisions that meet expectations are achieved. Being based on cloud technology, the scheme can offer public cloud, private cloud, hybrid cloud and other deployment forms while supporting various heterogeneous terminal devices. It natively supports very large production scales while meeting the availability requirements of production environments.

Brief description of the drawings

Fig. 1 is a schematic flowchart of a decision-making method based on a cloud platform provided by one embodiment of the present invention;

Fig. 2 is a schematic flowchart of a decision-making method based on a cloud platform provided by another embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a decision system based on a cloud platform provided by one embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Referring to Fig. 1, an embodiment of the present invention provides a decision-making method based on a cloud platform, comprising:

Step 101, a simulated training request is received; the simulated training request includes: hardware resources, a real environment, an initial model and a reinforcement learning algorithm.

Specifically, the simulated training request may be sent by a user through a web console or a terminal device, or by a script written in advance. The request includes: hardware resources, a real environment, an initial model and a reinforcement learning algorithm. The hardware resources are the hardware configuration the user selects according to the scale of service to be provided, including but not limited to: number of CPUs, amount of memory, number of GPUs, disk size and number of machines. The real environment is the application environment (or scenario) in which the decision-making method is executed, for example: 1) the "city brain" field: traffic, electric power, communications, energy; 2) the daily-life field: smart home, intelligent assistants, autonomous driving, intelligent translation, etc.; 3) the entertainment field: games, music synthesis, stage-design and sound-effect control, agent performances (such as drone swarms); 4) the industrial field: mechanical control, etc.; 5) the information industry field: recommendation, search ranking, etc.; 6) the education field; 7) the military field. Training directly in the real environment incurs high economic and time costs and reduces efficiency, so a simulated environment needs to be generated from the real environment. The initial model may be a model brought by the user, or a model provided by the cloud platform that the user selects according to actual needs. The initial model is used to output instructions according to the simulated environment state described below, and the instructions act on the simulated environment. Preferably, the initial model is a neural network architecture. The reinforcement learning algorithm is the algorithm used to train the initial model. The cloud platform may take deployment forms such as public cloud, private cloud or hybrid cloud.
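The contents of the simulated training request can be pictured as a small data structure. The following is only a minimal sketch: the class names, field names and example values are hypothetical and are not defined by the specification; only the four request items and the listed hardware dimensions come from the text above.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class HardwareResource:
    """Hardware configuration selected by the user (field names are illustrative)."""
    cpu_count: int
    memory_gb: int
    gpu_count: int
    disk_gb: int
    machine_count: int


@dataclass
class SimulatedTrainingRequest:
    """The four items the text says a simulated training request carries."""
    hardware: HardwareResource
    real_environment: str               # e.g. "traffic", "smart_home", "game"
    initial_model: Any                  # user-supplied or platform-provided model
    rl_algorithm: str                   # name of the reinforcement learning algorithm
    rules: Optional[List[Any]] = None   # optional rules, used only for hybrid prediction


# Example request; every value here is made up for illustration.
request = SimulatedTrainingRequest(
    hardware=HardwareResource(cpu_count=16, memory_gb=64, gpu_count=2,
                              disk_gb=500, machine_count=4),
    real_environment="traffic",
    initial_model={"type": "neural_network"},
    rl_algorithm="policy_gradient",
)
```

Keeping `rules` optional mirrors the text: the basic request has four items, and rules are added only in the preferred variant.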

Step 102, the hardware resources are configured, and the simulation engine, the decision engine and the learning engine are initialized.

The hardware resources are configured for the user according to the requirements in the simulated training request received in step 101, for example physical servers, cluster resources or cloud server instances. The simulation engine generates the simulated environment according to the real environment, which completes the initialization of the environment. In practical applications, the simulation engine may be implemented with a game engine such as the Unreal series or Unity; with a modeling tool such as Simulink; or, for larger scenarios, with a distributed simulation engine such as SpatialOS or KBEngine. The decision engine completes its initialization according to the initial model. The learning engine completes its initialization according to the reinforcement learning algorithm. In practical applications, the learning engine may be implemented with TensorFlow, PyTorch, MXNet, Caffe, MPI, Parameter Server, etc.

Step 103, the decision engine obtains an instruction according to the state of the simulated environment and the initial model, and the instruction is executed in the simulated environment; this is repeated in a loop to obtain a simulated training set, each sample of which includes: the state of the simulated environment and the instruction.

The decision engine uses the initial model to make a prediction from the received state of the simulated environment and obtains an instruction. The instruction acts on the simulated environment and changes its state accordingly. This step is then repeated to obtain a large number of training samples under the simulated environment (also called simulated environment samples). Each sample includes: the state of the simulated environment and the instruction. The collection of these samples is called the simulated training set. The number of repetitions may be determined by a preset count threshold: once the number of repetitions reaches the count threshold, the step stops, and the number reached is the number of repetitions. It may also be determined by a preset duration threshold: for example, timing starts when the request is received, and if the decision duration reaches the duration threshold after N repetitions, the step stops and N is the number of repetitions. This embodiment places no limitation on this.
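The collection loop of step 103 and its two stopping criteria can be sketched as follows. This is a minimal illustration under stated assumptions: `ToySimulatedEnvironment`, `collect_simulated_training_set` and the integer state and instruction types are hypothetical stand-ins for the simulation engine and decision engine, and the timer here starts when the function is called rather than when the request is received.

```python
import time
from typing import Callable, List, Optional, Tuple

State = int
Instruction = int


class ToySimulatedEnvironment:
    """Hypothetical stand-in for a simulation engine: the state is a counter."""

    def __init__(self) -> None:
        self.state: State = 0

    def apply(self, instruction: Instruction) -> State:
        # The instruction acts on the environment and changes its state.
        self.state += instruction
        return self.state


def collect_simulated_training_set(
    env: ToySimulatedEnvironment,
    model: Callable[[State], Instruction],
    max_steps: Optional[int] = None,
    max_seconds: Optional[float] = None,
) -> List[Tuple[State, Instruction]]:
    """Repeat predict-then-act until a count or duration threshold is reached."""
    if max_steps is None and max_seconds is None:
        raise ValueError("set a count threshold, a duration threshold, or both")
    samples: List[Tuple[State, Instruction]] = []
    start = time.monotonic()
    state = env.state
    while True:
        if max_steps is not None and len(samples) >= max_steps:
            break  # count threshold reached
        if max_seconds is not None and time.monotonic() - start >= max_seconds:
            break  # duration threshold reached
        instruction = model(state)            # decision engine predicts
        samples.append((state, instruction))  # one sample: (state, instruction)
        state = env.apply(instruction)        # environment state changes
    return samples


# Toy model: move up until the state reaches 3, then move down.
training_set = collect_simulated_training_set(
    ToySimulatedEnvironment(), model=lambda s: 1 if s < 3 else -1, max_steps=5
)
```

With the count threshold set to 5, the loop records exactly five (state, instruction) pairs; passing `max_seconds` instead makes the number of repetitions N depend on how long each iteration takes.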

Step 104, the learning engine trains the initial model on the simulated training set to obtain an updated initial model.

Based on the reinforcement learning algorithm, the learning engine trains the initial model on the simulated training set. After convergence is reached or a preset time threshold is reached, the parameters of the initial model are updated. The model with updated parameters is called the updated initial model; it is output to the decision engine, which uses it as its decision model.

Step 105, a terminal device initialization request is sent; the terminal device initialization request includes: the hardware resources and the updated initial model, and is used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.
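The two stopping conditions for training (convergence, or a preset time threshold) can be illustrated with a generic loop. This is a sketch only: `train_until_done` and `toy_update` are hypothetical names, and the update step is a deliberately trivial placeholder rather than a real reinforcement learning update, which the specification leaves to the chosen algorithm.

```python
import time
from typing import Callable, List


def train_until_done(
    params: List[float],
    update_step: Callable[[List[float]], List[float]],
    tolerance: float = 1e-6,
    max_seconds: float = 5.0,
) -> List[float]:
    """Apply update steps until parameters stop changing or time runs out."""
    start = time.monotonic()
    while time.monotonic() - start < max_seconds:    # preset time threshold
        new_params = update_step(params)
        change = max(abs(n - p) for n, p in zip(new_params, params))
        params = new_params
        if change < tolerance:                       # convergence reached
            break
    return params


def toy_update(params: List[float]) -> List[float]:
    # Placeholder update (an assumption): move each parameter halfway to 1.0.
    return [p + 0.5 * (1.0 - p) for p in params]


# The returned parameters play the role of the "updated initial model".
updated_model = train_until_done([0.0, 4.0], toy_update)
```

In a real learning engine, `update_step` would be one iteration of the reinforcement learning algorithm named in the request, consuming the simulated training set.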

The updated initial model serves as the decision model of the terminal device. The terminal device uses it to make decisions from the state of the real environment, and each decision causes the terminal device to execute the corresponding behavior in the real environment. The terminal device is one of a variety of heterogeneous terminal devices and may be, for example, a mobile phone, a tablet computer, an automobile or smart hardware. Heterogeneous terminals are thus supported, with uniform deployment and operation and convenient use. Because the decision model is deployed on the terminal device, the terminal device can make decisions in real time, unaffected by weak communication signals or transmission delay between the terminal device and the cloud platform. In other embodiments, in view of the limited hardware resources of the terminal device, the decision model may be deployed on the cloud platform. In that case the terminal device initialization request includes: the hardware resources; the terminal device returns the acquired state of the real environment to the decision engine, the decision engine returns an instruction to the terminal device according to the state of the real environment, and the terminal device executes the instruction in the real environment.
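The two deployment options described above (decision model on the terminal device, or decision model kept on the cloud platform) can be sketched as follows. All names here, including `TerminalInitRequest`, `TerminalDevice` and `thermostat_model`, are hypothetical, and the cloud call is modeled as a plain function call rather than a network request.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Decide = Callable[[Any], Any]


@dataclass
class TerminalInitRequest:
    hardware: Dict[str, int]
    # None means the decision model stays on the cloud platform.
    decision_model: Optional[Decide] = None


class TerminalDevice:
    def __init__(self, request: TerminalInitRequest,
                 cloud_decision_engine: Optional[Decide] = None) -> None:
        self.hardware = request.hardware  # "configure the hardware resources"
        if request.decision_model is not None:
            self._decide = request.decision_model   # on-device decisions
        elif cloud_decision_engine is not None:
            self._decide = cloud_decision_engine    # state is sent to the cloud
        else:
            raise ValueError("no decision model on the device and no cloud engine")

    def step(self, real_state: Any) -> Any:
        """Return the instruction to execute for the given real environment state."""
        return self._decide(real_state)


def thermostat_model(temperature_c: float) -> str:
    # Toy decision model, an assumption for illustration only.
    return "cool" if temperature_c > 26.0 else "idle"


on_device = TerminalDevice(TerminalInitRequest({"cpu_count": 4}, thermostat_model))
cloud_backed = TerminalDevice(TerminalInitRequest({"cpu_count": 1}),
                              cloud_decision_engine=thermostat_model)
```

The on-device variant avoids the round trip to the cloud platform on every decision, while the cloud-backed variant needs fewer resources on the terminal device.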

To further strengthen the basis for prediction (that is, for outputting instructions), the simulated training request further includes: rules. Instructions are then output using a combination of the rules and the initial model, which achieves hybrid prediction; the rules are preset conditions. In practical applications, the rules may be adjusted adaptively for different application scenarios. One method of hybrid prediction is to first use the initial model to produce a prediction result (that is, an output instruction) from the state of the simulated environment, and then correct the prediction result according to the rules. Another method is to judge whether the state of the simulated environment is suited to model prediction or to rule prediction: if it is judged to be suited to model prediction, the initial model processes the state of the simulated environment to obtain the prediction result; if it is judged to be suited to rule prediction, the rules process the state of the simulated environment to obtain the prediction result. A further method is to split the state of the simulated environment into multiple sub-states and to predict each sub-state with either of the two preceding methods.

Through the above scheme, the embodiment of the present invention forms a complete decision closed loop and achieves decisions that meet expectations. Being based on cloud technology, the scheme can offer public cloud, private cloud, hybrid cloud and other deployment forms while supporting various heterogeneous terminal devices. It natively supports very large production scales while meeting the availability requirements of production environments.

Referring to Fig. 2, to further improve the decision model, another embodiment of the present invention provides a decision-making method based on a cloud platform, comprising the following steps:
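The three hybrid prediction methods can be sketched side by side. The driving scenario, the state keys `speed` and `obstacle_distance`, the thresholds and all function names are assumptions made for illustration; the specification only defines the three ways of combining a model with rules.

```python
from typing import Callable, Dict

State = Dict[str, float]


def model_predict(state: State) -> str:
    # Toy "initial model" (an assumption): speed up until 60, then hold.
    return "accelerate" if state["speed"] < 60 else "hold"


def rule_applies(state: State) -> bool:
    # Toy preset condition: an obstacle is closer than 10 units.
    return state["obstacle_distance"] < 10


def rule_predict(state: State) -> str:
    return "brake"


def predict_then_correct(state: State) -> str:
    """Method 1: the model predicts first, then the rules correct the result."""
    instruction = model_predict(state)
    return "brake" if rule_applies(state) else instruction


def select_predictor(state: State) -> str:
    """Method 2: decide whether the state suits rule or model prediction."""
    return rule_predict(state) if rule_applies(state) else model_predict(state)


def predict_per_substate(
    state: Dict[str, State],
    predictor: Callable[[State], str],
) -> Dict[str, str]:
    """Method 3: split the state into sub-states and predict each one."""
    return {name: predictor(sub_state) for name, sub_state in state.items()}


clear_road = {"speed": 40.0, "obstacle_distance": 50.0}
blocked_road = {"speed": 40.0, "obstacle_distance": 5.0}
fleet_instructions = predict_per_substate(
    {"car_a": clear_road, "car_b": blocked_road}, predict_then_correct
)
```

In this toy case methods 1 and 2 give the same answers; they differ in general because method 1 always runs the model and may only partly modify its output, whereas method 2 bypasses the model entirely when the rules apply.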

Step 201, a simulated training request is received; the simulated training request includes: hardware resources, a real environment, an initial model and a reinforcement learning algorithm.

Step 202, the hardware resources are configured, and the simulation engine, the decision engine and the learning engine are initialized. Here the hardware resources are configured by the cloud platform.

Step 203, the decision engine obtains, according to the state of the simulated environment and the initial model, an instruction to be executed in the simulated environment; this is repeated in a loop to obtain a simulated training set, each sample of which includes: the state of the simulated environment and the instruction.

Step 204, the learning engine trains the initial model on the simulated training set to obtain an updated initial model.

Step 205, a terminal device initialization request is sent; the terminal device initialization request includes: the hardware resources. The terminal device configures the hardware resources.

Step 206, the hardware resources are configured, so that the acquired real environment state is returned to the decision engine and the instruction output by the decision engine is executed in the real environment.

Step 207, this is repeated in a loop in the real environment to obtain a real training set, each sample of which includes: the state of the real environment and the instruction.

Step 208, the learning engine trains the updated initial model on the real training set to obtain a re-updated initial model. The re-updated initial model serves as the decision model of the terminal device.

Step 209, a terminal device initialization request is sent; the terminal device initialization request includes: the hardware resources and the decision model, and is used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

For a description of steps 201 to 209, refer to the related content of steps 101 to 105 in the above embodiment; it is not repeated here.
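The ordering of steps 203 to 209 (train on simulated samples first, then refine on real samples, then deploy) can be sketched as a short pipeline. Everything here is a hypothetical illustration: `CounterEnvironment` stands in for both the simulated and the real environment, and `placeholder_train` only records that an update happened instead of running a reinforcement learning algorithm.

```python
from typing import Dict, List, Tuple

Params = Dict[str, int]
Sample = Tuple[int, int]  # (state, instruction)


class CounterEnvironment:
    """Toy environment used for both the simulated and the real phase."""

    def __init__(self) -> None:
        self.state = 0

    def apply(self, instruction: int) -> int:
        self.state += instruction
        return self.state


def collect(env: CounterEnvironment, params: Params, steps: int) -> List[Sample]:
    """Decision engine loop: predict an instruction, record it, apply it."""
    samples: List[Sample] = []
    state = env.state
    for _ in range(steps):
        instruction = params["step_size"]   # toy policy: a constant step
        samples.append((state, instruction))
        state = env.apply(instruction)
    return samples


def placeholder_train(params: Params, samples: List[Sample]) -> Params:
    # Stand-in for the learning engine: only bookkeeping, no real learning.
    return {**params,
            "updates": params["updates"] + 1,
            "samples_seen": params["samples_seen"] + len(samples)}


def two_phase_training(sim_env: CounterEnvironment, real_env: CounterEnvironment,
                       params: Params, steps: int) -> Params:
    simulated_set = collect(sim_env, params, steps)    # step 203
    params = placeholder_train(params, simulated_set)  # step 204
    real_set = collect(real_env, params, steps)        # steps 205 to 207
    params = placeholder_train(params, real_set)       # step 208
    return params                                      # sent to the device in step 209


decision_model = two_phase_training(
    CounterEnvironment(), CounterEnvironment(),
    {"step_size": 2, "updates": 0, "samples_seen": 0}, steps=3,
)
```

The point of the sketch is the data flow: the model used to collect real samples is the one already updated on simulated samples, and only the twice-updated model is deployed to the terminal device.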

Referring to Fig. 3, an embodiment of the present invention provides a decision system based on a cloud platform, which is used to execute the method of the above embodiment and comprises: a first scheduling device 10, a simulation engine 20, a decision engine 30, a learning engine 40 and a second scheduling device 50.

The first scheduling device 10 is used to receive a simulated training request, which includes: hardware resources, a real environment, an initial model and a reinforcement learning algorithm, and to configure the hardware resources and initialize the simulation engine, the decision engine and the learning engine. The simulation engine 20 is used to generate a simulated environment to simulate the real environment. The decision engine 30 is used to obtain an instruction according to the state of the simulated environment and the initial model; the instruction is executed in the simulated environment, and this is repeated in a loop to obtain a simulated training set. The learning engine 40 is used to train the initial model on the simulated training set to obtain an updated initial model. The second scheduling device 50 is used to send a terminal device initialization request, which includes: the hardware resources and the updated initial model, and which is used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

An embodiment of the present invention further provides a decision system based on a cloud platform, which is used to execute the method of the above embodiment and comprises: a first scheduling device 10, a simulation engine 20, a decision engine 30, a learning engine 40 and a second scheduling device 50.

The first scheduling device 10 is used to receive a simulated training request, which includes: hardware resources, a real environment, an initial model and a reinforcement learning algorithm, and to configure the hardware resources and initialize the simulation engine, the decision engine and the learning engine. The simulation engine 20 is used to generate a simulated environment to simulate the real environment. The decision engine 30 is used to obtain an instruction according to the state of the simulated environment and the initial model; the instruction is executed in the simulated environment, and the decision engine repeats this in a loop to obtain a simulated training set. The learning engine 40 is used to train the initial model on the simulated training set to obtain an updated initial model. The second scheduling device 50 is used to send a terminal device initialization request, which includes: the hardware resources. This request is used to make the terminal device configure the hardware resources, return the acquired real environment state to the decision engine 30 and execute in the real environment the instruction output by the decision engine 30. The decision engine 30 is further used to output instructions according to the real environment state; the decision engine 30 repeats this in a loop to obtain a real training set, each sample of which includes: the state of the real environment and the instruction. The learning engine 40 is further used to train the updated initial model on the real training set to obtain a re-updated initial model. The second scheduling device 50 is further used to send a terminal device initialization request, which includes: the hardware resources and the re-updated initial model, and which is used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

It should be understood that, when making decisions, the decision system provided by the above embodiment is illustrated only with the above division into functional modules. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the system may be divided into different functional modules to complete all or part of the functions described above. For example, the first scheduling device and the second scheduling device may be merged into a single scheduling device. In addition, the decision system provided by the above embodiment and the decision-making method belong to the same concept; for its specific implementation, see the method embodiment, which is not repeated here.

As is known from common technical knowledge, the present invention can be implemented in other embodiments without departing from its spirit or essential features. Therefore, the embodiments disclosed above are in all respects merely illustrative and not exclusive. All changes within the scope of the present invention, or within a scope equivalent to that of the present invention, are embraced by the present invention.

Claims

1. A decision-making method based on a cloud platform, characterized in that the decision-making method comprises:

receiving a simulated training request, the simulated training request including: hardware resources, a real environment, an initial model and a reinforcement learning algorithm;

configuring the hardware resources, and initializing a simulation engine, a decision engine and a learning engine;

the decision engine obtaining, according to the simulated environment state generated by the simulation engine and the initial model, an instruction that acts on the simulated environment, and repeating this in a loop to obtain a simulated training set, each sample of the simulated training set including: the state of the simulated environment and the instruction;

the learning engine training the initial model on the simulated training set to obtain an updated initial model;

sending a terminal device initialization request, the terminal device initialization request including: the hardware resources and the updated initial model, and being used to make the terminal device configure the hardware resources and output, according to the state of the real environment, an instruction that acts on the real environment.

2. The decision-making method according to claim 1, characterized in that the simulated training request further includes: rules;

accordingly, the decision engine obtains the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rules;

accordingly, the terminal device initialization request includes: the hardware resources, the updated initial model and the rules.

3. The decision-making method according to claim 2, characterized in that the decision engine obtaining the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rules specifically includes:

the decision engine first obtains an initial instruction according to the simulated environment state generated by the simulation engine and the initial model, then corrects the initial instruction according to the rules, and takes the corrected result as the output instruction; or

the decision engine first judges whether the state of the simulated environment is suited to model prediction or to rule prediction; if it is judged to be suited to model prediction, the instruction is obtained according to the state of the simulated environment and the initial model; if it is judged to be suited to rule prediction, the instruction is obtained according to the state of the simulated environment and the rules.

4. A decision system based on a cloud platform, characterized in that the decision system includes:

A first dispatching device, for receiving a simulated training request, the simulated training request including: a hardware resource, a true environment, an initial model and a reinforcement learning algorithm, and for configuring the hardware resource and initializing a simulation engine, a decision engine and a learning engine;

The simulation engine is used for generating a simulated environment to simulate the true environment;

The decision engine is used for obtaining, according to the state of the simulated environment and the initial model, an instruction acting on the simulated environment, this being executed repeatedly in a loop to obtain a simulated training set;

The learning engine is used for training the initial model according to the simulated training set to obtain an updated initial model;

A second dispatching device, for sending a terminal device initialization request, the terminal device initialization request including: the hardware resource and the updated initial model, and being used to make the terminal device configure the hardware resource and output, according to the state of the true environment, an instruction acting on the true environment.
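
The cooperation of the simulation engine, decision engine and learning engine recited in claim 4 can be sketched, purely by way of example, as follows. The one-dimensional environment, the single-gain model, the reward signal, the exploration offsets and the update rule in `LearningEngine.train` are all assumptions; in particular the update rule is a simple stand-in for the reinforcement learning algorithm, which the claim leaves unspecified.

```python
from typing import List, Sequence, Tuple

# (state, instruction, reward); the reward field is an assumption of this sketch.
Sample = Tuple[float, float, float]


class SimulationEngine:
    """Toy simulated environment: a position that should be driven to 0."""

    def __init__(self, start: float = 1.0) -> None:
        self.start = start
        self.state = start

    def reset(self) -> float:
        self.state = self.start
        return self.state

    def step(self, instruction: float) -> float:
        self.state += instruction
        return -abs(self.state)  # closer to zero gives a higher reward


class DecisionEngine:
    """Maps a state to an instruction using the current model (one gain)."""

    def __init__(self, gain: float) -> None:
        self.gain = gain

    def decide(self, state: float, explore: float = 0.0) -> float:
        return -(self.gain + explore) * state


class LearningEngine:
    """Stand-in learner: move the gain toward the best-rewarded sample."""

    def train(self, gain: float, training_set: List[Sample], step: float = 0.5) -> float:
        best_state, best_instruction, _ = max(training_set, key=lambda s: s[2])
        implied_gain = -best_instruction / best_state
        return gain + step * (implied_gain - gain)


def simulated_training(
    initial_gain: float,
    rounds: int = 3,
    offsets: Sequence[float] = (-0.2, 0.0, 0.2),
) -> float:
    """Collect a simulated training set each round, then update the model."""
    simulation = SimulationEngine()
    decision = DecisionEngine(initial_gain)
    learner = LearningEngine()
    for _ in range(rounds):
        training_set: List[Sample] = []
        for offset in offsets:  # repeated execution in the simulated environment
            state = simulation.reset()
            instruction = decision.decide(state, explore=offset)
            reward = simulation.step(instruction)
            training_set.append((state, instruction, reward))
        decision.gain = learner.train(decision.gain, training_set)
    return decision.gain  # plays the role of the "updated initial model"
```

In this toy setting a gain of 1 would drive the position to zero in a single step, so a call such as `simulated_training(0.2)` moves the gain from 0.2 toward 1 over the three rounds.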

5. The decision system according to claim 4, characterized in that the simulated training request received by the first scheduler subsystem further includes: a rule;

Accordingly, the decision engine is used for obtaining the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rule;

The terminal device initialization request sent by the second dispatching device includes: the hardware resource, the updated initial model and the rule.

6. The decision system according to claim 5, characterized in that the decision engine being used for obtaining the instruction according to the simulated environment state generated by the simulation engine, the initial model and the rule is specifically used for:

First obtaining an initial instruction according to the simulated environment state generated by the simulation engine and the initial model, then correcting the initial instruction according to the rule, and taking the corrected result as the output instruction; or

First judging whether the state of the simulated environment conforms to model prediction or to rule prediction; if it is judged to conform to model prediction, obtaining the instruction according to the state generated by the simulated environment and the initial model; if it is judged to conform to rule prediction, obtaining the instruction according to the state generated by the simulated environment and the rule.

7. A decision-making technique based on a cloud platform, characterized in that the decision-making technique includes:

Receiving a simulated training request, the simulated training request including: a hardware resource, a true environment, an initial model and a reinforcement learning algorithm;

Sending a terminal device initialization request, the terminal device initialization request including: the hardware resource;

Configuring the hardware resource, returning the acquired true environment state to the decision engine, and executing in the true environment the instruction output by the decision engine;

Executing repeatedly in a loop in the true environment to obtain a true training set, each sample in the true training set including: the state of the true environment and the instruction;

The learning engine trains the updated initial model according to the true training set to obtain a re-updated initial model;

Sending a terminal device initialization request, the terminal device initialization request including: the hardware resource and the re-updated initial model, and being used to make the terminal device configure the hardware resource and output, according to the state of the true environment, an instruction acting on the true environment.
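
The two terminal device initialization requests of claim 7 differ in whether a model is included. The following Python sketch illustrates that difference under stated assumptions: `TerminalInitRequest`, `TerminalDevice`, `ToyEnvironment`, the string encoding of the hardware resource and the one-dimensional environment are hypothetical, and the retraining step performed by the learning engine between the two requests is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Model = Callable[[float], float]


@dataclass
class TerminalInitRequest:
    hardware_resource: str          # assumed encoding, e.g. "cpu:2"
    model: Optional[Model] = None   # absent in the first request of claim 7


class ToyEnvironment:
    """Stand-in for the true environment: a single position value."""

    def __init__(self, position: float = 1.0) -> None:
        self.position = position

    def read(self) -> float:
        return self.position

    def apply(self, instruction: float) -> None:
        self.position += instruction


class TerminalDevice:
    def __init__(self, environment: ToyEnvironment) -> None:
        self.environment = environment
        self.hardware_resource: Optional[str] = None
        self.model: Optional[Model] = None

    def initialize(self, request: TerminalInitRequest) -> None:
        # Configure the hardware resource and store the model, if one was sent.
        self.hardware_resource = request.hardware_resource
        self.model = request.model

    def step(self, cloud_decide: Optional[Model] = None) -> Tuple[float, float]:
        """Acquire the state, obtain an instruction, execute it.

        Without a local model the state is passed to the cloud-side decision
        engine (`cloud_decide`); with a local model the terminal decides itself.
        Returns one (state, instruction) sample of the true training set.
        """
        state = self.environment.read()
        decide = self.model if self.model is not None else cloud_decide
        if decide is None:
            raise RuntimeError("no local model and no cloud decision engine")
        instruction = decide(state)
        self.environment.apply(instruction)
        return state, instruction
```

A caller would first initialize the terminal with the hardware resource only, collect `(state, instruction)` samples by calling `step` with the cloud-side decision function, and later initialize it again with a request that carries the re-updated model.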

8. A decision system based on a cloud platform, characterized in that the decision system includes:

A first scheduler subsystem, for receiving a simulated training request, the simulated training request including: a hardware resource, a true environment, an initial model and a reinforcement learning algorithm, and for configuring the hardware resource and initializing a simulation engine, a decision engine and a learning engine;

A second dispatching device, for sending a terminal device initialization request, the terminal device initialization request including: the hardware resource, so that the terminal device configures the hardware resource, returns the acquired true environment state to the decision engine, and executes in the true environment the instruction output by the decision engine;

The decision engine is also used for outputting an instruction according to the updated initial model and the state of the true environment, this being executed repeatedly in a loop in the true environment to obtain a true training set, each sample in the true training set including: the state of the true environment and the instruction;

The learning engine is also used for training the updated initial model according to the true training set to obtain a re-updated initial model;

The second dispatching device is also used for sending a terminal device initialization request, the terminal device initialization request including: the hardware resource and the re-updated initial model, and being used to make the terminal device configure the hardware resource and output, according to the state of the true environment, an instruction acting on the true environment.