CN109241139A - Data processing method, logical model system and data processing system - Google Patents

Data processing method, logical model system and data processing system

Info

Publication number
CN109241139A
CN109241139A (application CN201811018904.1A)
Authority
CN
China
Prior art keywords
model
initial logic
training
logic model
gradient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811018904.1A
Other languages
Chinese (zh)
Other versions
CN109241139B (en)
Inventor
王鹏
向辉
胡文晖
王奇刚
师忠超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201811018904.1A priority Critical patent/CN109241139B/en
Publication of CN109241139A publication Critical patent/CN109241139A/en
Application granted granted Critical
Publication of CN109241139B publication Critical patent/CN109241139B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14 Network analysis or design
    • H04L 41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Stored Programmes (AREA)

Abstract

The present disclosure provides a data processing method, comprising: receiving training logic data input by a user, wherein the training logic data can be used to construct an initial logic model; constructing a plurality of initial logic models based on the training logic data; controlling the plurality of initial logic models to be trained based on sample data to obtain a plurality of trained initial logic models; and determining a target logic model according to the plurality of trained initial logic models.

Description

Data processing method, logical model system and data processing system
Technical field
The present disclosure relates to a data processing method, a logic model system and a data processing system.
Background technique
With the rapid development of electronic technology, large amounts of data usually need to be processed. During model training, for example, the volume of data is very large, so training is commonly accelerated through parallel computation, for instance by means of distributed parallel computing. In the prior art, however, when training a model in parallel the user not only needs to construct the training model but also needs to understand the logic of distributed parallel computing, so the user's development cost is high and the process is cumbersome. How to optimize the operation flow of parallel computing, reduce the user's development cost, and improve the efficiency and flexibility of parallel computation has therefore become an urgent problem to be solved.
Summary of the invention
One aspect of the present disclosure provides a data processing method, comprising: receiving training logic data input by a user, wherein the training logic data can be used to construct an initial logic model; constructing a plurality of initial logic models based on the training logic data; controlling the plurality of initial logic models to be trained based on sample data to obtain a plurality of trained initial logic models; and determining a target logic model according to the plurality of trained initial logic models.
Optionally, determining the target logic model according to the plurality of trained initial logic models comprises: obtaining the model gradients of the plurality of trained initial logic models, and updating the model parameters of the initial logic model based on the model gradients to obtain the target logic model.
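The optional step above can be sketched as follows: the model gradients of the trained initial logic models are collected, averaged, and applied as one update to the shared model parameters to obtain the target logic model. This is a minimal illustrative sketch, not the patent's implementation; the function names, the averaging of gradients, and the use of plain gradient descent are assumptions.

```python
# Illustrative sketch (not from the patent): average the model gradients of
# the trained initial logic models, then update the shared model parameters.

def average_gradients(gradients):
    """Element-wise mean over the per-model gradient vectors."""
    n = len(gradients)
    return [sum(g[i] for g in gradients) / n for i in range(len(gradients[0]))]

def update_parameters(params, grad, lr=0.1):
    """One gradient-descent step on the model parameters (lr is assumed)."""
    return [p - lr * g for p, g in zip(params, grad)]

# One gradient vector per trained initial logic model:
per_model_grads = [[0.2, 0.4], [0.4, 0.0], [0.0, 0.2]]
target_params = update_parameters([1.0, -2.0], average_gradients(per_model_grads))
```

Any commutative reduction of the gradients (sum, weighted mean) would fit the description equally well; averaging is shown because it keeps the update scale independent of the number of models.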
Optionally, controlling the plurality of initial logic models to be trained based on sample data comprises: controlling each initial logic model of the plurality of initial logic models to obtain sub-sample data from the sample data, and controlling each initial logic model to be trained based on the corresponding sub-sample data.
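The sub-sample assignment described above amounts to data-parallel sharding: each initial logic model draws its own portion of the sample data. A minimal sketch follows; the round-robin splitting policy is an assumption, as the patent does not specify how sub-samples are drawn.

```python
def shard(sample_data, num_models):
    """Split the sample data into disjoint sub-sample sets, one per
    initial logic model (round-robin; the actual policy is unspecified)."""
    return [sample_data[i::num_models] for i in range(num_models)]

# Each initial logic model is then trained only on its own subset:
subsets = shard(list(range(10)), 3)
```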
Optionally, the training logic data comprises loss function information and gradient information, and constructing the plurality of initial logic models based on the training logic data comprises: configuring the model parameters of the initial logic model based on the loss function information, and constructing a model computation graph and a gradient computation graph based on the model parameters and the gradient information.
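One way to read this step: the loss function information fixes the model parameters, and the gradient information fixes how gradients are computed, yielding a forward ("model computation") graph and a gradient computation graph. The sketch below reduces both graphs to closures over a one-parameter linear model; the dictionary keys and the squared-loss choice are assumptions, not part of the patent.

```python
def build_model(loss_info, grad_info):
    """Configure the model parameter from the loss-function information and
    build a forward graph and a gradient graph (here: plain closures)."""
    w0 = loss_info["init_w"]  # configured model parameter

    def forward(w, x):        # "model computation graph"
        return w * x

    def gradient(w, x, y):    # "gradient computation graph":
        # d/dw of (w*x - y)^2 when grad_info["scale"] == 2
        return grad_info["scale"] * (w * x - y) * x

    return w0, forward, gradient

w0, forward, gradient = build_model({"init_w": 0.0}, {"scale": 2.0})
```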
Optionally, the method further comprises: receiving control information input by the user, wherein the control information can be used to generate a control instruction, and the control instruction is used to control the training of the initial logic models.
Optionally, controlling the plurality of initial logic models to be trained based on sample data comprises cyclically executing, according to a preset number of cycles: controlling each initial logic model of the plurality of initial logic models to obtain sub-sample data from the sample data; controlling the plurality of initial logic models to be trained respectively based on the corresponding sub-sample data to obtain multiple groups of model gradients corresponding to the plurality of initial logic models; and updating the model parameters of the initial logic model based on the multiple groups of model gradients.
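The preset-cycle loop above can be sketched as synchronous data-parallel gradient descent: every cycle, each initial logic model computes a gradient on its own sub-sample, and the resulting groups of gradients update the shared parameter. All concrete choices below (scalar model y = w*x, squared loss, learning rate, cycle count) are illustrative assumptions.

```python
def run_cycles(w, subsets, num_cycles, lr=0.05):
    """Preset-cycle training loop: in every cycle each initial logic model
    computes a mean gradient on its own sub-sample, and the groups of
    model gradients jointly update the shared parameter w."""
    for _ in range(num_cycles):
        group_grads = []
        for subset in subsets:  # one initial logic model per subset
            g = sum(2 * (w * x - y) * x for x, y in subset) / len(subset)
            group_grads.append(g)
        w -= lr * sum(group_grads) / len(group_grads)  # multi-group update
    return w

# True relation is y = 3*x; two models, each on half the sample data:
data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
w_final = run_cycles(0.0, [data[0::2], data[1::2]], num_cycles=50)
```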
Optionally, the method is used for an electronic device comprising a parameter server and a plurality of compute nodes, each compute node comprising a plurality of computing units, and each computing unit comprising the initial logic model. The method comprises: controlling each computing unit to obtain sub-sample data from the sample data and to train the corresponding initial logic model based on the sub-sample data, obtaining the model gradient of the corresponding initial logic model; uploading the model gradient obtained by each computing unit's training to the compute node to which that computing unit belongs; controlling the compute node to process the received model gradients and upload the processed model gradients to the parameter server; controlling the parameter server to update the model parameters of the initial logic model based on the received processed model gradients, and sending the updated model parameters to each computing unit; and controlling each computing unit to update the corresponding initial logic model based on the received updated model parameters.
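The three-level flow described above (computing units upload gradients to their compute node, the node processes them, the parameter server updates and broadcasts) can be sketched as follows. Taking node-side "processing" to be averaging is one plausible reading, not a statement of the patent's method; all names are hypothetical.

```python
def node_process(unit_gradients):
    """A compute node processes (here: averages) the model gradients
    uploaded by its computing units before forwarding them."""
    return sum(unit_gradients) / len(unit_gradients)

def server_update(param, node_gradients, lr=0.1):
    """The parameter server updates the shared model parameter from the
    processed node gradients; the result is then sent back to every unit."""
    return param - lr * sum(node_gradients) / len(node_gradients)

# Two compute nodes, each with two computing units:
node_a = node_process([0.2, 0.4])
node_b = node_process([0.6, 0.8])
new_param = server_update(1.0, [node_a, node_b])  # broadcast to all units
```

The intermediate node-level reduction cuts the traffic reaching the parameter server from one gradient per computing unit to one per compute node, which is the usual motivation for such a hierarchy.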
Optionally, the plurality of computing units belonging to one compute node comprise one primary computing unit and at least one secondary computing unit, and configuring the model parameters of the initial logic model based on the loss function information comprises: controlling the primary computing unit to extract the loss function information from the training logic data; controlling the primary computing unit to configure the model parameters according to the loss function information; and controlling the secondary computing unit to copy the configured model parameters by accessing the primary computing unit.
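The primary/secondary arrangement above is essentially a configure-then-replicate broadcast within one compute node. A minimal sketch, with hypothetical class and key names (the patent does not define any API):

```python
class ComputingUnit:
    """A computing unit holding (a copy of) the initial logic model's parameters."""
    def __init__(self):
        self.params = None

class PrimaryComputingUnit(ComputingUnit):
    def configure(self, training_logic_data):
        # Extract the loss-function information and configure the parameters.
        self.params = dict(training_logic_data["loss_info"]["init_params"])

def replicate(primary, secondaries):
    """Secondary units copy the configured parameters by accessing the primary."""
    for unit in secondaries:
        unit.params = dict(primary.params)  # copy, not a shared reference

primary = PrimaryComputingUnit()
primary.configure({"loss_info": {"init_params": {"w": 0.0, "b": 1.0}}})
secondaries = [ComputingUnit(), ComputingUnit()]
replicate(primary, secondaries)
```

Copying (rather than sharing a reference) matters here: after replication each unit can update its own model during training without disturbing its peers.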
Optionally, constructing the model computation graph and the gradient computation graph based on the model parameters and the gradient information comprises: controlling the primary computing unit and the secondary computing unit to extract the gradient information from the training logic data, and controlling the primary computing unit and the secondary computing unit to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
Another aspect of the present disclosure provides a logic model system, comprising a plurality of initial logic models, the plurality of initial logic models being logic models constructed based on training logic data input by a user, wherein the plurality of initial logic models can be used to execute: being trained based on sample data to obtain a plurality of trained initial logic models, wherein the plurality of trained initial logic models can be used to determine a target logic model.
Optionally, the logic model system is able to execute: obtaining the model gradients of the plurality of trained initial logic models, and updating the model parameters of the initial logic model based on the model gradients to obtain the target logic model.
Optionally, the logic model system is able to execute: controlling each initial logic model of the plurality of initial logic models to obtain sub-sample data from the sample data, and controlling each initial logic model to be trained based on the corresponding sub-sample data.
Optionally, the training logic data comprises loss function information and gradient information, and the logic model system is able to execute: configuring the model parameters of the initial logic model based on the loss function information, and constructing a model computation graph and a gradient computation graph based on the model parameters and the gradient information.
Optionally, the logic model system is able to execute: receiving control information input by the user, wherein the control information can be used to generate a control instruction, and the control instruction is used to control the training of the initial logic models.
Optionally, the logic model system can cyclically execute, according to a preset number of cycles: controlling each initial logic model of the plurality of initial logic models to obtain sub-sample data from the sample data; controlling the plurality of initial logic models to be trained respectively based on the corresponding sub-sample data to obtain multiple groups of model gradients corresponding to the plurality of initial logic models; and updating the model parameters of the initial logic model based on the multiple groups of model gradients.
Optionally, the logic model system can be used in an electronic device comprising a parameter server and a plurality of compute nodes, each compute node comprising a plurality of computing units, and each computing unit comprising the initial logic model. The logic model system is able to execute: controlling each computing unit to obtain sub-sample data from the sample data and to train the corresponding initial logic model based on the sub-sample data, obtaining the model gradient of the corresponding initial logic model; uploading the model gradient obtained by each computing unit's training to the compute node to which that computing unit belongs; controlling the compute node to process the received model gradients and upload the processed model gradients to the parameter server; controlling the parameter server to update the model parameters of the initial logic model based on the received processed model gradients, and sending the updated model parameters to each computing unit; and controlling each computing unit to update the corresponding initial logic model based on the received updated model parameters.
Optionally, the plurality of computing units belonging to one compute node comprise one primary computing unit and at least one secondary computing unit, and the logic model system is able to execute: controlling the primary computing unit to extract the loss function information from the training logic data; controlling the primary computing unit to configure the model parameters according to the loss function information; and controlling the secondary computing unit to copy the configured model parameters by accessing the primary computing unit.
Optionally, the logic model system is able to execute: controlling the primary computing unit and the secondary computing unit to extract the gradient information from the training logic data, and controlling the primary computing unit and the secondary computing unit to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
Another aspect of the present disclosure provides a data processing system, comprising: a first receiving module, a construction module, a training module and a determination module. The first receiving module receives training logic data input by a user, wherein the training logic data can be used to construct an initial logic model; the construction module constructs a plurality of initial logic models based on the training logic data; the training module controls the plurality of initial logic models to be trained based on sample data to obtain a plurality of trained initial logic models; and the determination module determines a target logic model according to the plurality of trained initial logic models.
Optionally, determining the target logic model according to the plurality of trained initial logic models comprises: obtaining the model gradients of the plurality of trained initial logic models, and updating the model parameters of the initial logic model based on the model gradients to obtain the target logic model.
Optionally, controlling the plurality of initial logic models to be trained based on sample data comprises: controlling each initial logic model of the plurality of initial logic models to obtain sub-sample data from the sample data, and controlling each initial logic model to be trained based on the corresponding sub-sample data.
Optionally, the training logic data comprises loss function information and gradient information, and constructing the plurality of initial logic models based on the training logic data comprises: configuring the model parameters of the initial logic model based on the loss function information, and constructing a model computation graph and a gradient computation graph based on the model parameters and the gradient information.
Optionally, the system further comprises: a second receiving module that receives control information input by the user, wherein the control information can be used to generate a control instruction, and the control instruction is used to control the training of the initial logic models.
Optionally, controlling the plurality of initial logic models to be trained based on sample data comprises cyclically executing, according to a preset number of cycles: controlling each initial logic model of the plurality of initial logic models to obtain sub-sample data from the sample data; controlling the plurality of initial logic models to be trained respectively based on the corresponding sub-sample data to obtain multiple groups of model gradients corresponding to the plurality of initial logic models; and updating the model parameters of the initial logic model based on the multiple groups of model gradients.
Optionally, the system is used for an electronic device comprising a parameter server and a plurality of compute nodes, each compute node comprising a plurality of computing units, and each computing unit comprising the initial logic model. The system is able to execute: controlling each computing unit to obtain sub-sample data from the sample data and to train the corresponding initial logic model based on the sub-sample data, obtaining the model gradient of the corresponding initial logic model; uploading the model gradient obtained by each computing unit's training to the compute node to which that computing unit belongs; controlling the compute node to process the received model gradients and upload the processed model gradients to the parameter server; controlling the parameter server to update the model parameters of the initial logic model based on the received processed model gradients, and sending the updated model parameters to each computing unit; and controlling each computing unit to update the corresponding initial logic model based on the received updated model parameters.
Optionally, the plurality of computing units belonging to one compute node comprise one primary computing unit and at least one secondary computing unit, and configuring the model parameters of the initial logic model based on the loss function information comprises: controlling the primary computing unit to extract the loss function information from the training logic data; controlling the primary computing unit to configure the model parameters according to the loss function information; and controlling the secondary computing unit to copy the configured model parameters by accessing the primary computing unit.
Optionally, constructing the model computation graph and the gradient computation graph based on the model parameters and the gradient information comprises: controlling the primary computing unit and the secondary computing unit to extract the gradient information from the training logic data, and controlling the primary computing unit and the secondary computing unit to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
Another aspect of the present disclosure provides a non-volatile storage medium storing computer-executable instructions which, when executed, are used to implement the method described above.
Another aspect of the present disclosure provides a computer program comprising computer-executable instructions which, when executed, are used to implement the method described above.
Brief description of the drawings
In order that the present disclosure and its advantages may be more fully understood, reference is now made to the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 schematically shows an application scenario of a data processing method and a data processing system according to an embodiment of the present disclosure;
Fig. 2 schematically shows a flow chart of a data processing method according to an embodiment of the present disclosure;
Fig. 3 schematically shows a flow chart of constructing initial logic models according to an embodiment of the present disclosure;
Fig. 4 schematically shows a flow chart of training the initial logic models according to an embodiment of the present disclosure;
Fig. 5 schematically shows a flow chart of determining the target logic model according to an embodiment of the present disclosure;
Fig. 6 schematically shows a flow chart of training the initial logic models according to a preset number of cycles according to an embodiment of the present disclosure;
Fig. 7 schematically shows a flow chart of a data processing method for an electronic device according to an embodiment of the present disclosure;
Fig. 8 schematically shows a flow chart of configuring the model parameters of the initial logic model for an electronic device according to an embodiment of the present disclosure;
Fig. 9 schematically shows a flow chart of constructing the model computation graph and the gradient computation graph for an electronic device according to an embodiment of the present disclosure;
Fig. 10 schematically shows a flow chart of a data processing method according to another embodiment of the present disclosure;
Fig. 11 schematically shows a schematic diagram of a data processing method for an electronic device according to an embodiment of the present disclosure;
Fig. 12 schematically shows a schematic diagram of a logic model system according to an embodiment of the present disclosure;
Fig. 13 schematically shows a block diagram of a data processing system according to an embodiment of the present disclosure;
Fig. 14 schematically shows a block diagram of a data processing system according to another embodiment of the present disclosure; and
Fig. 15 schematically shows a block diagram of a computer system for data processing according to an embodiment of the present disclosure.
Detailed description of the embodiments
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood, however, that these descriptions are merely exemplary and are not intended to limit the scope of the present disclosure. In the following detailed description, numerous specific details are set forth for ease of explanation in order to provide a thorough understanding of the embodiments of the present disclosure. It is evident, however, that one or more embodiments may be practiced without these specific details. Furthermore, in the following description, descriptions of well-known structures and technologies are omitted to avoid unnecessarily obscuring the concepts of the present disclosure.
The terms used herein are for the purpose of describing particular embodiments only and are not intended to limit the present disclosure. The terms "include", "comprise" and the like used herein indicate the presence of the stated features, steps, operations and/or components, but do not preclude the presence or addition of one or more other features, steps, operations or components.
Unless otherwise defined, all terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Where an expression such as "at least one of A, B and C, etc." is used, it should generally be interpreted according to the meaning commonly understood by those skilled in the art (for example, "a system having at least one of A, B and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C, etc.). Where an expression such as "at least one of A, B or C, etc." is used, it should likewise be interpreted according to the meaning commonly understood by those skilled in the art (for example, "a system having at least one of A, B or C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C, etc.). It should also be understood by those skilled in the art that essentially any disjunctive word and/or phrase presenting two or more alternative items, whether in the specification, the claims or the drawings, should be understood to contemplate the possibilities of including one of the items, either of the items, or both items. For example, the phrase "A or B" should be understood to include the possibilities of "A", "B", or "A and B".
Some block diagrams and/or flow charts are shown in the drawings. It should be understood that some blocks of the block diagrams and/or flow charts, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus, so that the instructions, when executed by the processor, create means for implementing the functions/operations illustrated in the block diagrams and/or flow charts.
Accordingly, the techniques of the present disclosure may be implemented in the form of hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of the present disclosure may take the form of a computer program product on a computer-readable medium storing instructions, the computer program product being for use by or in connection with an instruction execution system. In the context of the present disclosure, a computer-readable medium may be any medium that can contain, store, communicate, propagate or transport the instructions. For example, the computer-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus, device or propagation medium. Specific examples of the computer-readable medium include: a magnetic storage device such as a magnetic tape or hard disk (HDD); an optical storage device such as a compact disc (CD-ROM); a memory such as a random access memory (RAM) or flash memory; and/or a wired/wireless communication link.
An embodiment of the present disclosure provides a data processing method, the method comprising: receiving training logic data input by a user, wherein the training logic data can be used to construct an initial logic model; constructing a plurality of initial logic models based on the training logic data; controlling the plurality of initial logic models to be trained based on sample data to obtain a plurality of trained initial logic models; and determining a target logic model according to the plurality of trained initial logic models.
It can be seen that, in the technical solution of the embodiments of the present disclosure, a plurality of initial logic models are constructed from the training logic data input by the user, and the target logic model is obtained through parallel training of the plurality of initial logic models. This optimizes the operation flow of parallel computing, reduces the user's development cost, and improves the efficiency and flexibility of computing in a parallel manner.
Fig. 1 schematically shows an application scenario of a data processing method and a data processing system according to an embodiment of the present disclosure. It should be noted that Fig. 1 is merely an example of a scenario to which the embodiments of the present disclosure can be applied, to help those skilled in the art understand the technical content of the present disclosure; it does not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments or scenarios.
As shown in Fig. 1, the application scenario 100 may include, for example, training logic data 110 input by a user, a plurality of initial logic models 120, and a target logic model 130.
According to an embodiment of the present disclosure, the training logic data 110 input by the user may include, for example, data information for constructing a logic model, such as parameter information of the logic model to be constructed, loss function information, gradient information, and the like.
In an embodiment of the present disclosure, the plurality of initial logic models 120 may include, for example, an initial logic model 121, an initial logic model 122, an initial logic model 123, and so on. The plurality of initial logic models may be constructed, for example, based on the training logic data 110 input by the user.
According to an embodiment of the present disclosure, the plurality of initial logic models 120 may be, for example, models containing parameters to be solved, such as polynomial models, classification decision tree models, neural network models and the like. The plurality of initial logic models 120 may, for example, be identical. The plurality of initial logic models 120 can be trained based on sample data to obtain a plurality of trained initial logic models; each of the plurality of initial logic models 120 can be trained based on different sample data, so the obtained trained initial logic models differ from one another (for example, their model parameters differ).
In an embodiment of the present disclosure, the target logic model 130 may, for example, be determined based on the plurality of trained initial logic models. For example, the target logic model 130 may be one of the plurality of trained initial logic models, or may be obtained by performing corresponding processing on the plurality of trained initial logic models, where the corresponding processing may be, for example, averaging.
According to an embodiment of the present disclosure, a plurality of initial logic models may, for example, be automatically constructed by a distributed training engine based on the training logic data input by the user, and trained based on sample data. Distributed parallel computing can thus be realized, and the training of the model can be accelerated by a distributed cluster.
The training logic data input by the user may be, for example, single-machine job training logic constructed by the user. Through automatic extraction, the distributed training engine can extract the construction information of the logic model and the execution information of the logic model training stage from the single-machine job training logic constructed by the user, and thereby construct distributed training logic comprising the plurality of initial logic models 120. The plurality of initial logic models 120 are trained in parallel through the distributed training logic, thereby accelerating the training of the model.
In an embodiment of the present disclosure, the distributed training engine automatically extracts the single-machine job training logic input by the user and constructs the distributed training logic comprising the plurality of initial logic models 120. This process requires the user neither to understand nor to formulate distributed training logic; the user only needs to provide the single-machine job training logic (for example, only single-machine code). The distributed training engine contains a variety of distributed training techniques, and the approach of automatically extracting the user's single-machine job training logic is applicable to a variety of user scenarios, reducing the user's development cost.
Fig. 2 diagrammatically illustrates the flow chart of the data processing method according to the embodiment of the present disclosure.
As shown in Fig. 2, the method includes operations S210 to S240.
In operation S210, training logic data input by a user are received, where the training logic data can be used to construct an initial logic model.
In the embodiments of the present disclosure, the training logic data input by the user are received, for example, by the distributed training engine. The training logic data may be single-machine job training logic constructed by the user and can be used to construct a logical model. The training logic data may include the construction information of the logical model and the execution information of the model training stage: the distributed training engine can extract the construction information from the single-machine job training logic to construct the initial logic model, and extract the execution information of the model training stage to control the training of the initial logic model.
In operation S220, multiple initial logic models are constructed based on the training logic data.
According to the embodiment of the present disclosure, the distributed training engine can receive the user's training logic data and construct multiple initial logic models, which may, for example, be identical logical models. The initial logic model may be any model containing parameters to be solved, such as a polynomial model, a classification decision tree model, or a neural network model.
In operation S230, the multiple initial logic models are controlled to be trained based on sample data, obtaining multiple trained initial logic models.
In the embodiments of the present disclosure, the sample data are, for example, data used for training the initial logic models. Each of the multiple initial logic models may be trained on different sample data, yielding multiple different trained initial logic models.
For example, each initial logic model may be trained on a different portion of the sample data: if the sample data comprise multiple data items, each initial logic model can randomly select at least one item from them for training. Because the training data of each initial logic model differ, the resulting trained initial logic models also differ.
In operation S240, a target logic model is determined according to the multiple trained initial logic models.
According to the embodiment of the present disclosure, the target logic model may be determined in various ways. One way is to select at least one of the multiple trained initial logic models as the target logic model; another is to process the multiple trained initial logic models to obtain the target logic model.
For example, when selecting from the multiple trained initial logic models, a logical model that satisfies conditions such as good convergence and small model error may be chosen as the target logic model.
Alternatively, the model parameters (or model gradients) of the multiple trained initial logic models may be processed, for example averaged, to obtain a target model parameter (or target model gradient); the processed result is then used to update the model parameters (or model gradients) of the initial logic model, yielding the target logic model.
For example, taking a neural network as the initial logic model, suppose there are multiple identical initial neural network models A1, A2, ..., An. Each is trained based on sample data, producing multiple trained neural network models whose model gradients are, for example, B1, B2, ..., Bn, respectively. Averaging B1, B2, ..., Bn yields an averaged model gradient, and updating the model parameters of the initial neural network model with this averaged gradient gives an updated neural network model C.
The updated neural network model C may be used directly as the target neural network model; alternatively, C may serve as a new initial neural network model and undergo multiple further training cycles to obtain the target neural network model.
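The averaging-and-update step described above can be sketched as follows. This is a minimal illustration, not the disclosure's implementation; the function names (`average_gradients`, `apply_update`) and the learning rate are assumptions.

```python
def average_gradients(gradients):
    """Element-wise average of the per-model gradients B1..Bn."""
    n = len(gradients)
    return [sum(g[i] for g in gradients) / n for i in range(len(gradients[0]))]

def apply_update(params, avg_grad, lr):
    """Plain gradient-descent update of the shared model parameters."""
    return [p - lr * g for p, g in zip(params, avg_grad)]

# Three identical replicas report gradients B1, B2, B3 after one training round.
b1, b2, b3 = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
avg = average_gradients([b1, b2, b3])                # -> [3.0, 4.0]
new_params = apply_update([0.5, 0.5], avg, lr=0.5)   # -> [-1.0, -1.5]
```

The updated parameters would then either define the target model C directly or seed the next training cycle, as described above.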
According to the technical solution of the embodiments of the present disclosure, multiple initial logic models are constructed from the training logic data input by the user, and the target logic model is obtained through parallel training of those models. That is, the distributed training engine automatically extracts the user's training logic data to construct the logical models, so the user neither needs to understand nor to formulate distributed training logic; the user only needs to provide the training logic data (for example, only single-machine code). Automatically extracting the user's training logic data in this way suits a variety of user scenarios and reduces the user's development cost. By training multiple logical models in parallel, the embodiments of the present disclosure optimize the parallel computation workflow and improve the efficiency and flexibility of parallel computation.
Fig. 3 diagrammatically illustrates a flow chart of constructing the initial logic models according to the embodiment of the present disclosure.
As shown in Fig. 3, operation S220 includes operations S221 and S222.
In the embodiments of the present disclosure, the distributed training engine may receive computation-graph-based deep learning training logic input by the user, composed of several parts such as the forward service logic, the backward gradient logic, and the runtime session logic of the model.
For example, the training logic data input by the user include loss function information and gradient information.
The loss function information may be a loss-computation primitive constructed by the user, and the gradient information may be an apply-gradient primitive constructed by the user. Together, the loss-computation primitive and the apply-gradient primitive form a single-machine training file, from which the distributed training engine can construct the initial logic model.
In operation S221, model parameters of the initial logic model are configured based on the loss function information.
According to the embodiment of the present disclosure, the distributed training engine automatically extracts the loss function information from the user's training logic data and configures the model parameters of the initial logic model accordingly, for example by deploying the forward model parameters.
In operation S222, a model computation graph and a gradient computation graph of the logical model are constructed based on the model parameters and the gradient information.
According to the embodiment of the present disclosure, the distributed training engine automatically extracts the gradient information from the user's training logic data and, based on the configured model parameters and the gradient information, constructs the model computation graph and the gradient computation graph of the logical model, for example by deploying the forward model computation subgraph and the backward computation subgraph.
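Operations S221 and S222 can be pictured with a toy sketch in which plain Python callables stand in for the deployed forward and gradient subgraphs. Everything here is illustrative under assumed names (`build_graphs`, the lambda "primitives"); the disclosure does not specify this interface.

```python
def build_graphs(loss_fn, grad_fn, init_params):
    """Configure parameters from the loss info (S221), then wire the
    forward and backward graphs (S222)."""
    params = list(init_params)  # model parameters configured in operation S221

    def forward(x, y):          # stands in for the model computation graph
        return loss_fn(params, x, y)

    def backward(x, y):         # stands in for the gradient computation graph
        return grad_fn(params, x, y)

    return forward, backward, params

# Toy single-machine "primitives" for a 1-D linear model:
# loss = (w*x - y)^2 and its gradient d(loss)/dw = 2*x*(w*x - y).
loss_primitive = lambda p, x, y: (p[0] * x - y) ** 2
grad_primitive = lambda p, x, y: [2 * x * (p[0] * x - y)]

fwd, bwd, params = build_graphs(loss_primitive, grad_primitive, [0.0])
```

With the initial parameter w = 0, evaluating `fwd(2.0, 4.0)` gives the loss 16.0 and `bwd(2.0, 4.0)` gives the gradient [-16.0], mirroring how the extracted primitives drive both graphs from one shared parameter set.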
Fig. 4 diagrammatically illustrates a flow chart of training the initial logic models according to the embodiment of the present disclosure.
As shown in Fig. 4, operation S230 includes operations S231 and S232.
In operation S231, each of the multiple initial logic models is controlled to obtain subsample data from the sample data.
According to the embodiment of the present disclosure, the multiple initial logic models can be trained based on the sample data, where the sample data used by each initial logic model may differ: the subsample data used to train each initial logic model can be different, each being a portion fetched from the full sample data.
In operation S232, each initial logic model is controlled to be trained based on its corresponding subsample data.
In the embodiments of the present disclosure, each initial logic model is trained on its corresponding subsample data, obtaining multiple trained initial logic models. Because the subsample data corresponding to each initial logic model differ, the model parameters (or model gradients) of each trained initial logic model also differ.
Fig. 5 diagrammatically illustrates a flow chart of determining the target logic model according to the embodiment of the present disclosure.
As shown in Fig. 5, operation S240 includes operations S241 and S242.
In operation S241, the model gradients of the multiple trained initial logic models are obtained.
According to the embodiment of the present disclosure, taking initial neural network models as the initial logic models for example, each initial neural network is trained based on its corresponding subsample data, and each of the resulting trained initial neural networks carries a corresponding model gradient.
In operation S242, the target logic model is obtained by updating the model parameters of the initial logic model based on the model gradients.
For example, the model gradients of the multiple trained initial logic models may be averaged, and the model parameters of the initial logic model updated with the averaged model gradient to obtain an updated logical model. The updated logical model may be used as the target logic model, or it may serve as a new initial logic model for multiple further training cycles to obtain the target logic model.
Fig. 6 diagrammatically illustrates a flow chart of training the initial logic models according to a preset number of cycles according to the embodiment of the present disclosure.
As shown in Fig. 6, operation S230 includes operations S233 to S235.
In the embodiments of the present disclosure, the following operations S233 to S235 can be executed cyclically according to a preset number of cycles. The preset number of cycles may be set by the user, for example according to the size of the sample data; alternatively, the loop may terminate once the proportion of sample data that has participated in training satisfies a certain condition.
For example, suppose the sample data comprise 10000 data items and each of the multiple initial logic models randomly obtains 100 items from them per cycle. The preset number of cycles may be user-defined as, say, 120; alternatively, operations S233 to S235 may be executed cyclically until 80% of the 10000 data items have participated in training.
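The coverage-based termination rule mentioned above can be sketched as follows. This is an assumed realization (the function name and the `max_cycles` safeguard are not from the disclosure): the loop keeps sampling random subsamples until 80% of the 10000 items have taken part in training at least once.

```python
import random

def train_until_coverage(n_samples=10000, batch=100, coverage=0.8, max_cycles=10000):
    """Run training cycles, sampling `batch` random items per cycle, until the
    given fraction of the dataset has participated in training at least once."""
    seen = set()
    cycles = 0
    while len(seen) < coverage * n_samples and cycles < max_cycles:
        subsample = random.sample(range(n_samples), batch)  # operation S233
        seen.update(subsample)        # items that have now participated
        cycles += 1                   # one pass of operations S233-S235
    return cycles, len(seen)

cycles, covered = train_until_coverage()
```

Because each cycle re-samples at random, reaching 80% coverage typically takes somewhat more than 80 cycles; the fixed alternative (a user-defined count such as 120) trades this uncertainty for a predictable training budget.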
In operation S233, each of the multiple initial logic models is controlled to obtain subsample data from the sample data.
In the embodiments of the present disclosure, the multiple initial logic models include, for example, A1, A2, ..., An. Each initial logic model obtaining subsample data from the sample data may mean, for example, that each model randomly obtains 100 data items from the 10000 items for training; the 100 randomly obtained items are the subsample data corresponding to that model.
In operation S234, the multiple initial logic models are controlled to be trained respectively on their corresponding subsample data, obtaining multiple groups of model gradients corresponding to the multiple initial logic models.
According to the embodiment of the present disclosure, each initial logic model is trained on its corresponding subsample data and obtains a corresponding model gradient; for example, the model gradients corresponding to the initial logic models A1, A2, ..., An are B1, B2, ..., Bn, respectively.
In operation S235, the model parameters of the initial logic model are updated based on the multiple groups of model gradients.
For example, the multiple groups of model gradients B1, B2, ..., Bn may be averaged to obtain a single group of model gradients B, with which the model parameters of the initial logic model are updated to obtain an updated initial logic model. The method then returns to operation S233 and continues training the initial logic models until the preset number of cycles is met.
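The full S233-S235 cycle can be sketched end to end on a toy problem: several replicas of a one-parameter model each compute a gradient on a random subsample, the gradients are averaged into B, and the shared parameter is updated, cycle after cycle. All names, the data (points on y = 3x), the learning rate, and the cycle count are assumptions for illustration only.

```python
import random

def subsample_gradient(w, data):
    """Average gradient of (w*x - y)^2 over one model's random subsample (S234)."""
    return sum(2 * x * (w * x - y) for x, y in data) / len(data)

def train(cycles=50, n_models=4, batch=10, lr=0.5, seed=0):
    rng = random.Random(seed)
    sample = [(i / 100, 3 * i / 100) for i in range(1, 101)]  # data on y = 3x
    w = 0.0                                                   # shared initial parameter
    for _ in range(cycles):                                   # preset cycle count
        subsamples = [rng.sample(sample, batch) for _ in range(n_models)]  # S233
        grads = [subsample_gradient(w, s) for s in subsamples]             # S234
        w -= lr * sum(grads) / n_models   # S235: average B1..Bn, update w
    return w

w = train()   # w approaches the true slope 3
```

Even though each replica sees a different random subsample per cycle, averaging the gradients before the update keeps all replicas synchronized on a single parameter value, which is the point of operation S235.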
Fig. 7 diagrammatically illustrates a flow chart of a data processing method for an electronic device according to the embodiment of the present disclosure.
As shown in Fig. 7, the method includes operations S701 to S706.
The data processing method of the embodiments of the present disclosure can be applied to an electronic device that includes a parameter server and multiple compute nodes, where each compute node includes multiple computing units and each computing unit includes an initial logic model.
The electronic device supports distributed parallel computing; for example, it may be a server cluster.
In the embodiments of the present disclosure, the parameter server may itself be a compute node. It collects the model gradients obtained by training the multiple initial logic models in the multiple compute nodes, processes the collected gradients, updates the model parameters of the initial logic model based on the processed gradients, and sends the updated model parameters to the multiple compute nodes, where the logical models continue training based on the updated parameters.
According to the embodiment of the present disclosure, a compute node may be, for example, a computer containing multiple computing units capable of parallel computation, such as a computer containing multiple GPUs (or threads) that can compute in parallel.
In operation S701, each computing unit is controlled to obtain subsample data from the sample data and to train its corresponding initial logic model based on the subsample data, obtaining the model gradient of that initial logic model.
For example, each computing unit includes one initial logic model, and the initial logic models of the computing units may be identical. Each computing unit obtains subsample data from the sample data, trains the initial logic model it contains, and obtains a corresponding model gradient after training.
In operation S702, the model gradient obtained by each computing unit's training is uploaded to the compute node to which that computing unit belongs.
In the embodiments of the present disclosure, each computing unit's model gradient is uploaded to its compute node, so each compute node may receive multiple groups of model gradients, one from each of its computing units.
In operation S703, the compute nodes are controlled to process the received model gradients and upload the processed gradients to the parameter server.
According to the embodiment of the present disclosure, each compute node processes the groups of model gradients it received, for example by averaging them into a single group, and uploads that processed group to the parameter server, which then processes the groups of model gradients uploaded by the multiple compute nodes.
In operation S704, the parameter server is controlled to update the model parameters of the initial logic model based on the received processed model gradients.
In the embodiments of the present disclosure, the parameter server receives the groups of model gradients from the multiple compute nodes and updates the model parameters of the initial logic model based on them. For example, the parameter server may process the received groups of gradients according to a preset distributed update strategy (such as averaging) to obtain a single group of gradients, and update the model parameters of the initial logic model with it.
In operation S705, the updated model parameters are sent to each computing unit.
Each computing unit receives the same updated model parameters from the parameter server, which the multiple computing units can use as the model parameters needed to update their own initial logic models.
In operation S706, each computing unit is controlled to update its corresponding initial logic model based on the received updated model parameters.
In the embodiments of the present disclosure, the initial logic model in each computing unit is trained on different subsample data, so the trained initial logic models in the computing units differ. The updated model parameters that each computing unit receives from the parameter server can therefore be used to update the trained initial logic model corresponding to that computing unit.
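The two-level aggregation of operations S701-S706 — units report gradients to their node, nodes average and report to the parameter server, the server averages again and broadcasts updated parameters — can be sketched as below. The function names and the unit learning rate are illustrative assumptions.

```python
def mean(vectors):
    """Element-wise mean of a list of equal-length gradient vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def parameter_server_step(params, node_unit_grads, lr=1.0):
    node_grads = [mean(unit_grads) for unit_grads in node_unit_grads]  # S703, per node
    g = mean(node_grads)                                               # S704, server side
    return [p - lr * gi for p, gi in zip(params, g)]                   # broadcast, S705-S706

# Two compute nodes, two computing units each, one 2-parameter model.
grads = [
    [[1.0, 2.0], [3.0, 4.0]],   # node 1's units -> node mean [2.0, 3.0]
    [[5.0, 6.0], [7.0, 8.0]],   # node 2's units -> node mean [6.0, 7.0]
]
updated = parameter_server_step([0.0, 0.0], grads)  # server mean [4.0, 5.0]
```

Averaging at the node first means each node uploads one group of gradients rather than one per unit, which is the communication saving this hierarchy is after.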
Fig. 8 diagrammatically illustrates a flow chart of configuring the model parameters of the initial logic model in an electronic device according to the embodiment of the present disclosure.
As shown in Fig. 8, operation S221 includes operations S2211 to S2213.
According to the embodiment of the present disclosure, the multiple computing units belonging to one compute node include one main computing unit and at least one secondary computing unit: when a compute node includes two or more computing units, one of them serves as the main computing unit and the others as secondary computing units.
In the embodiments of the present disclosure, each compute node can obtain the training logic data input by the user and start its multiple computing units to construct initial logic models based on the training logic data. Each computing unit has, for example, a built-in service logic extractor that can extract the corresponding information from the training logic data, as detailed below.
In operation S2211, the main computing unit is controlled to extract the loss function information from the training logic data.
In the embodiments of the present disclosure, the main computing unit in a compute node extracts the loss function information from the training logic data, for example the loss-computation primitive in the single-machine job training logic.
In operation S2212, the main computing unit is controlled to configure the model parameters according to the loss function information.
According to the embodiment of the present disclosure, the main computing unit can, for example, deploy in itself the forward model parameters of the compute node to which it belongs, according to the loss function information.
In operation S2213, the secondary computing units are controlled to copy the configured model parameters by accessing the main computing unit.
According to the embodiment of the present disclosure, a secondary computing unit can access the main computing unit belonging to the same compute node and copy its configured model parameters, so that every computing unit in the compute node holds the configured model parameters.
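Operations S2211-S2213 can be sketched as a configure-then-copy handshake between the main and secondary units. The class, method, and field names (`ComputingUnit`, `configure_from_loss_info`, `n_params`) are hypothetical stand-ins, not names from the disclosure.

```python
class ComputingUnit:
    def __init__(self):
        self.params = None

    def configure_from_loss_info(self, loss_info):
        """Main unit only (S2211-S2212): derive the parameter layout
        from the extracted loss function information."""
        self.params = [0.0] * loss_info["n_params"]

    def copy_params_from(self, main_unit):
        """Secondary unit (S2213): replicate the main unit's configured
        parameters by accessing it."""
        self.params = list(main_unit.params)   # an independent copy, not an alias

main = ComputingUnit()
main.configure_from_loss_info({"n_params": 3})
secondary = ComputingUnit()
secondary.copy_params_from(main)
```

Copying (rather than aliasing) the parameter list matters here: after initialization each unit trains on its own subsample, so the units must hold separate parameter storage even though they start from identical values.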
Fig. 9 diagrammatically illustrates a flow chart of constructing the model computation graph and the gradient computation graph in an electronic device according to the embodiment of the present disclosure.
As shown in Fig. 9, operation S222 includes operations S2221 and S2222.
In operation S2221, the main computing unit and the secondary computing units are controlled to extract the gradient information from the training logic data.
In the embodiments of the present disclosure, the main computing unit and the secondary computing units belonging to one compute node can each extract the gradient information from the training logic data input by the user, each unit using it to construct its own initial logic model.
In operation S2222, the main computing unit and the secondary computing units are controlled to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
In the embodiments of the present disclosure, each computing unit can construct its own initial logic model based on the configured model parameters it holds and the gradient information; the initial logic models constructed by the computing units may, for example, be identical.
According to the technical solution of the embodiments of the present disclosure, the multiple compute nodes construct multiple initial logic models from the training logic data input by the user, and the multiple computing units obtain the target logic model through parallel training of those models. That is, the distributed training engine controls the computing units to automatically extract the user's training logic data and construct the logical models, so the user neither needs to understand nor to formulate distributed training logic; the user only needs to provide the training logic data (for example, only single-machine code). Automatically extracting the user's training logic data in this way suits a variety of user scenarios and reduces the user's development cost. By training multiple logical models in parallel, the embodiments of the present disclosure optimize the parallel computation workflow and improve the efficiency and flexibility of parallel computation.
Figure 10 diagrammatically illustrates the flow chart of the data processing method according to another embodiment of the disclosure.
As shown in Figure 10, the method includes operations S210 to S240 and S1010, where operations S210 to S240 are the same as or similar to those described above with reference to Fig. 2 and are not repeated here.
In operation S1010, control information input by the user is received; the control information can be used to generate control instructions, and the control instructions are used to control the training of the initial logic models.
In the embodiments of the present disclosure, each computing unit can receive the control information input by the user. The control information may, for example, be input into the compute node together with the training logic data; during the training of the initial logic models, it can generate control instructions for controlling the training process of the multiple initial logic models.
Figure 11 diagrammatically illustrates a schematic diagram of a data processing method for an electronic device according to the embodiment of the present disclosure.
As shown in Figure 11, the embodiment of the present disclosure includes multiple compute nodes (only one is shown in the figure for illustration), each containing multiple computing units (or multiple threads), one of which is the main computing unit and the others secondary computing units (only one secondary computing unit is shown for illustration).
The embodiment of the present disclosure includes operations S1110 to S1180.
In operation S1110, each computing unit obtains the training logic data input by the user.
In operation S1120, the initial logic models are constructed. For example, the main computing unit obtains the loss function information in the user's training logic data and deploys the forward model parameters in itself according to that information; the other secondary computing units in the same compute node obtain the forward model parameters by accessing the main computing unit; and the main and secondary computing units obtain the gradient information in the user's training logic data and construct the initial logic models based on the forward model parameters and the gradient information.
In operation S1130, first interface information in the session-execution operation input by the user is obtained. When execution reaches the first interface information, each computing unit trains its initial logic model and uploads the model gradient obtained after training to its compute node.
In operation S1140, if a computing unit is the main computing unit, the main computing unit verifies or summarizes the trained logical model in the compute node based on a preset condition. For example, when the main computing unit reaches second interface information in the session-execution operation and the preset condition is met, the trained logical model is verified or summarized, for example by checking model convergence or whether other intermediate variables are normal, thereby verifying whether the model is normal. The preset condition may be set by the user, for example reaching a preset number of training iterations, or whether the sample data used during model training have covered all sample data.
In operation S1150, user-defined operations are executed. For example, each computing unit can execute user-defined operations such as statistical operations during training or visualization operations, helping the user understand the training process.
Execution then returns to operation S1130, and operations S1130 to S1150 are executed cyclically until the preset number of cycles is met.
In operation S1160, each compute node uploads the trained model gradients to the parameter server. A compute node may process (for example, average) the model gradients sent by its computing units and upload the processed gradients to the parameter server; it may also send the parameter server a request for the updated model parameters.
In operation S1170, the model gradients uploaded by the multiple compute nodes are received.
In operation S1180, the model parameters of the initial logic model are updated based on the model gradients according to a preset distributed update strategy (for example, an averaging strategy), and the compute nodes' requests for the updated model parameters are answered.
Figure 12 diagrammatically illustrates the schematic diagram of the logical model system according to the embodiment of the present disclosure.
As shown in Figure 12, a logical model system 1200 disclosed in the embodiments of the present disclosure comprises multiple initial logic models constructed based on the training logic data input by the user. The multiple initial logic models can be used to perform training based on sample data, obtaining multiple trained initial logic models, which in turn can be used to determine the target logic model.
For example, logical model system 1200 includes multiple initial logic models 1210, 1220, 1230, .... Taking neural network models as the logical models for example, the multiple initial logic models may be constructed based on the training logic data 1240 input by the user and can compute in parallel based on the sample data 1250, obtaining trained initial logic models 1210a, 1220a, 1230a, ..., from which the target logic model 1260 is determined.
Figure 13 diagrammatically illustrates the block diagram of the data processing system according to the embodiment of the present disclosure.
As shown in figure 13, data processing system 1300 includes the first receiving module 1310, building module 1320, training module 1330 and determining module 1340.
According to the embodiment of the present disclosure, data processing system 1300 is used for electronic equipment, which includes parameter service Device and multiple calculate nodes, calculate node include multiple computing units, and each computing unit includes initial logic model, the system 1300 are able to carry out: controlling each computing unit and obtain increment notebook data from sample data, and based on the training of increment notebook data Corresponding initial logic model obtains the model gradient of corresponding initial logic model, and the training of each computing unit is obtained Model gradient uploads to the corresponding calculate node of computing unit, and control calculate node handles the model gradient received, and will place Model gradient after reason is uploaded to parameter server, and control parameter server is based on the model gradient updating that receives that treated Updated model parameter is sent to each computing unit, controls each computing unit by the model parameter of initial logic model Corresponding initial logic model is updated based on the updated model parameter received.
Wherein, the first receiving module 1310 can be used for receiving the training logical data of user's input, wherein training logic Data can be used in constructing initial logic model.According to the embodiment of the present disclosure, the first receiving module 1310 can for example be executed Text is with reference to the operation S210 of Fig. 2 description, and details are not described herein.
Building module 1320 can be used for constructing multiple initial logic models based on training logical data.
According to the embodiment of the present disclosure, training logical data includes: loss function information and gradient information;It is patrolled based on training It collects data and constructs multiple initial logic models, comprising: the model parameter based on loss function information configuration initial logic model, base Figure and gradiometer nomogram are calculated in model parameter and gradient information construction logic model.
According to the embodiment of the present disclosure, the multiple computing units for belonging to a calculate node include a main computation unit and extremely A few secondary computing unit, the model parameter based on loss function information configuration initial logic model, comprising: control host computer list Member extracts loss function information from training logical data, and control main computation unit is joined according to loss function information configuration model Number controls model parameter of the secondary computing unit by access main computation unit replicated setup.
According to an embodiment of the present disclosure, constructing the model computation graph and the gradient computation graph based on the model parameters and the gradient information includes: controlling the main computing unit and the secondary computing unit to extract the gradient information from the training logic data, and controlling the main computing unit and the secondary computing unit to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
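As an illustration of how loss function information might configure the parameters and how the two graphs might then be shared by the main and secondary computing units, consider the following sketch (hypothetical names throughout; plain callables stand in for real computation graphs, and the `w0`/`b0` fields of `loss_info` are invented for the example):

```python
def configure_parameters(loss_info):
    # Main computing unit: configure the model parameters from the loss
    # function information extracted from the training logic data.
    return {"w": loss_info.get("w0", 0.0), "b": loss_info.get("b0", 0.0)}

def build_graphs(params, gradient_info):
    # Model computation graph: forward pass of a toy linear model.
    def model_graph(x):
        return params["w"] * x + params["b"]

    # Gradient computation graph: analytic squared-error gradients,
    # selected here by the (hypothetical) gradient information.
    assert gradient_info["method"] == "analytic"
    def gradient_graph(x, y):
        err = model_graph(x) - y
        return {"w": 2 * err * x, "b": 2 * err}

    return model_graph, gradient_graph

loss_info = {"type": "squared_error", "w0": 1.0, "b0": 0.0}
main_params = configure_parameters(loss_info)        # main computing unit
secondary_params = dict(main_params)                 # secondary unit copies by access
model_graph, gradient_graph = build_graphs(main_params, {"method": "analytic"})
```

The point of the sketch is the division of labor: only the main computing unit derives the parameters from the loss function information, while every unit builds the same pair of graphs from the shared parameters and gradient information.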
According to an embodiment of the present disclosure, the construction module 1320 may, for example, execute the operation S220 described above with reference to FIG. 2, and details are not repeated here.
The training module 1330 may be used to control the plurality of initial logic models to be trained based on the sample data, obtaining a plurality of trained initial logic models.
According to an embodiment of the present disclosure, controlling the plurality of initial logic models to be trained based on the sample data includes: controlling each initial logic model among the plurality of initial logic models to obtain sub-sample data from the sample data, and controlling each initial logic model to be trained based on its corresponding sub-sample data.
According to an embodiment of the present disclosure, controlling the plurality of initial logic models to be trained based on the sample data includes cyclically executing, according to a preset number of cycles: controlling each initial logic model among the plurality of initial logic models to obtain sub-sample data from the sample data, controlling the plurality of initial logic models to be trained respectively based on the corresponding sub-sample data, obtaining a plurality of groups of model gradients corresponding to the plurality of initial logic models, and updating the model parameters of the initial logic models based on the plurality of groups of model gradients.
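The cyclic training described above can be sketched as follows (illustrative Python; `draw_subsample`, `model_gradient`, and the toy linear model are hypothetical stand-ins, and a single shared parameter keeps all model replicas synchronized after each cycle's update):

```python
import random

def draw_subsample(sample_data, k, rng):
    # Each initial logic model obtains sub-sample data from the sample data.
    return [sample_data[rng.randrange(len(sample_data))] for _ in range(k)]

def model_gradient(w, subsample):
    # One group of model gradients: squared-error gradient of y = w * x.
    return sum(2 * (w * x - y) * x for x, y in subsample) / len(subsample)

def train(sample_data, num_models=4, cycles=40, lr=0.5, seed=1):
    rng = random.Random(seed)
    w = 0.0                                   # identical across all replicas
    for _ in range(cycles):                   # preset number of cycles
        groups = [model_gradient(w, draw_subsample(sample_data, 8, rng))
                  for _ in range(num_models)] # one gradient group per model
        w -= lr * sum(groups) / len(groups)   # update from all groups
    return w

sample_data = [(x / 10.0, 2.0 * (x / 10.0)) for x in range(-10, 11)]
final_w = train(sample_data)
# final_w approaches the true weight 2.0
```

Because every replica is updated from the same aggregate of all gradient groups, the replicas remain identical at the start of each cycle, which is what allows the per-cycle sub-sampling to be distributed across models.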
According to an embodiment of the present disclosure, the training module 1330 may, for example, execute the operation S230 described above with reference to FIG. 2, and details are not repeated here.
The determination module 1340 may be used to determine a target logic model according to the plurality of trained initial logic models.
According to an embodiment of the present disclosure, determining the target logic model according to the plurality of trained initial logic models includes: obtaining the model gradients of the plurality of trained initial logic models, and updating the model parameters of the initial logic model based on the model gradients to obtain the target logic model.
According to an embodiment of the present disclosure, the determination module 1340 may, for example, execute the operation S240 described above with reference to FIG. 2, and details are not repeated here.
FIG. 14 schematically shows a block diagram of a data processing system according to another embodiment of the present disclosure.
As shown in FIG. 14, the data processing system 1400 includes a first receiving module 1310, a construction module 1320, a training module 1330, a determination module 1340, and a second receiving module 1410. The first receiving module 1310, the construction module 1320, the training module 1330, and the determination module 1340 are the same as or similar to the modules described above with reference to FIG. 13, and details are not repeated here.
The second receiving module 1410 may be used to receive control information input by the user; the control information can be used to generate a control instruction, and the control instruction is used to control the training of the initial logic models. According to an embodiment of the present disclosure, the second receiving module 1410 may, for example, execute the operation S1010 described above with reference to FIG. 10, and details are not repeated here.
Any number of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure, or at least part of their functions, may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure may be split into multiple modules for implementation. Any one or more of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure may be implemented at least partly as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented in any one of, or an appropriate combination of, the three implementation manners of software, hardware, and firmware. Alternatively, one or more of the modules, sub-modules, units, and sub-units according to embodiments of the present disclosure may be implemented at least partly as a computer program module that, when run, performs the corresponding function.
For example, any number of the first receiving module 1310, the construction module 1320, the training module 1330, the determination module 1340, and the second receiving module 1410 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first receiving module 1310, the construction module 1320, the training module 1330, the determination module 1340, and the second receiving module 1410 may be implemented at least partly as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented in any one of, or an appropriate combination of, the three implementation manners of software, hardware, and firmware. Alternatively, at least one of the first receiving module 1310, the construction module 1320, the training module 1330, the determination module 1340, and the second receiving module 1410 may be implemented at least partly as a computer program module that, when run, performs the corresponding function.
FIG. 15 schematically shows a block diagram of a computer system for data processing according to an embodiment of the present disclosure. The computer system shown in FIG. 15 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 15, the computer system 1500 for data processing includes a processor 1501 and a computer-readable storage medium 1502. The system 1500 can execute the method according to the embodiments of the present disclosure.
Specifically, the processor 1501 may include, for example, a general-purpose microprocessor, an instruction set processor and/or a related chipset, and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)). The processor 1501 may also include onboard memory for caching purposes. The processor 1501 may be a single processing unit or multiple processing units for executing different actions of the method flow according to the embodiments of the present disclosure.
The computer-readable storage medium 1502 may be, for example, any medium that can contain, store, communicate, propagate, or transport instructions. For example, a readable storage medium may include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a propagation medium. Specific examples of the readable storage medium include: a magnetic storage device, such as a magnetic tape or a hard disk (HDD); an optical storage device, such as a compact disc (CD-ROM); a memory, such as a random access memory (RAM) or a flash memory; and/or a wired/wireless communication link.
The computer-readable storage medium 1502 may include a computer program 1503, which may include code/computer-executable instructions that, when executed by the processor 1501, cause the processor 1501 to execute the method according to the embodiments of the present disclosure or any variation thereof.
The computer program 1503 may be configured with, for example, computer program code including computer program modules. For example, in an exemplary embodiment, the code in the computer program 1503 may include one or more program modules, for example, a module 1503A, a module 1503B, and so on. It should be noted that the division and number of the modules are not fixed, and those skilled in the art may use suitable program modules or combinations of program modules according to the actual situation; when these program module combinations are executed by the processor 1501, the processor 1501 executes the method according to the embodiments of the present disclosure or any variation thereof.
According to an embodiment of the present disclosure, at least one of the first receiving module 1310, the construction module 1320, the training module 1330, the determination module 1340, and the second receiving module 1410 may be implemented as a computer program module described with reference to FIG. 15, which, when executed by the processor 1501, can implement the corresponding operations described above.
The present disclosure further provides a computer-readable medium. The computer-readable medium may be included in the device/apparatus/system described in the above embodiments, or it may exist separately without being assembled into the device/apparatus/system. The computer-readable medium carries one or more programs which, when executed, implement:
A data processing method, including: receiving training logic data input by a user, wherein the training logic data can be used to construct an initial logic model; constructing a plurality of initial logic models based on the training logic data; controlling the plurality of initial logic models to be trained based on sample data, obtaining a plurality of trained initial logic models; and determining a target logic model according to the plurality of trained initial logic models.
Optionally, determining the target logic model according to the plurality of trained initial logic models includes: obtaining the model gradients of the plurality of trained initial logic models, and updating the model parameters of the initial logic model based on the model gradients to obtain the target logic model.
Optionally, controlling the plurality of initial logic models to be trained based on the sample data includes: controlling each initial logic model among the plurality of initial logic models to obtain sub-sample data from the sample data, and controlling each initial logic model to be trained based on its corresponding sub-sample data.
Optionally, the training logic data includes loss function information and gradient information, and constructing a plurality of initial logic models based on the training logic data includes: configuring the model parameters of the initial logic model based on the loss function information, and constructing the model computation graph and the gradient computation graph based on the model parameters and the gradient information.
Optionally, the above method further includes: receiving control information input by the user, wherein the control information can be used to generate a control instruction, and the control instruction is used to control the training of the initial logic models.
Optionally, controlling the plurality of initial logic models to be trained based on the sample data includes cyclically executing, according to a preset number of cycles: controlling each initial logic model among the plurality of initial logic models to obtain sub-sample data from the sample data, controlling the plurality of initial logic models to be trained respectively based on the corresponding sub-sample data, obtaining a plurality of groups of model gradients corresponding to the plurality of initial logic models, and updating the model parameters of the initial logic models based on the plurality of groups of model gradients.
Optionally, the above method is used for an electronic device that includes a parameter server and a plurality of computing nodes, each computing node including a plurality of computing units, and each computing unit including an initial logic model. The method includes: controlling each computing unit to obtain sub-sample data from the sample data and to train the corresponding initial logic model based on the sub-sample data, obtaining the model gradient of the corresponding initial logic model; uploading the model gradient obtained by each computing unit's training to the computing node to which that computing unit belongs; controlling the computing node to process the received model gradients and upload the processed model gradients to the parameter server; controlling the parameter server to update the model parameters of the initial logic model based on the received processed model gradients; sending the updated model parameters to each computing unit; and controlling each computing unit to update its corresponding initial logic model based on the received updated model parameters.
Optionally, in the above method, the plurality of computing units belonging to one computing node include one main computing unit and at least one secondary computing unit, and configuring the model parameters of the initial logic model based on the loss function information includes: controlling the main computing unit to extract the loss function information from the training logic data, controlling the main computing unit to configure the model parameters according to the loss function information, and controlling the secondary computing unit to copy the configured model parameters by accessing the main computing unit.
Optionally, constructing the model computation graph and the gradient computation graph based on the model parameters and the gradient information includes: controlling the main computing unit and the secondary computing unit to extract the gradient information from the training logic data, and controlling the main computing unit and the secondary computing unit to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
According to an embodiment of the present disclosure, the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, or radio frequency signals, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Those skilled in the art will understand that the features described in the various embodiments and/or claims of the present disclosure may be combined in various ways, even if such combinations are not explicitly described in the present disclosure. In particular, without departing from the spirit and teachings of the present disclosure, the features described in the various embodiments and/or claims of the present disclosure may be combined in various ways. All such combinations fall within the scope of the present disclosure.
Although the present disclosure has been shown and described with reference to certain exemplary embodiments thereof, those skilled in the art should understand that various changes in form and detail may be made to the present disclosure without departing from the spirit and scope of the present disclosure as defined by the appended claims and their equivalents. Therefore, the scope of the present disclosure should not be limited to the above embodiments, but should be determined not only by the appended claims but also by their equivalents.

Claims (10)

1. A data processing method, comprising:
receiving training logic data input by a user, wherein the training logic data can be used to construct an initial logic model;
constructing a plurality of initial logic models based on the training logic data;
controlling the plurality of initial logic models to be trained based on sample data, to obtain a plurality of trained initial logic models; and
determining a target logic model according to the plurality of trained initial logic models.
2. The method according to claim 1, wherein determining the target logic model according to the plurality of trained initial logic models comprises:
obtaining model gradients of the plurality of trained initial logic models; and
updating model parameters of the initial logic models based on the model gradients, to obtain the target logic model.
3. The method according to claim 1, wherein controlling the plurality of initial logic models to be trained based on sample data comprises:
controlling each initial logic model in the plurality of initial logic models to obtain sub-sample data from the sample data; and
controlling each initial logic model to be trained based on corresponding sub-sample data.
4. The method according to claim 1, wherein:
the training logic data comprises loss function information and gradient information; and
constructing the plurality of initial logic models based on the training logic data comprises:
configuring model parameters of the initial logic models based on the loss function information; and
constructing a model computation graph and a gradient computation graph based on the model parameters and the gradient information.
5. The method according to claim 1, further comprising:
receiving control information input by the user, wherein the control information can be used to generate a control instruction, and the control instruction is used to control training of the initial logic models.
6. The method according to claim 4, wherein controlling the plurality of initial logic models to be trained based on sample data comprises cyclically executing, according to a preset number of cycles:
controlling each initial logic model in the plurality of initial logic models to obtain sub-sample data from the sample data;
controlling the plurality of initial logic models to be trained respectively based on corresponding sub-sample data, to obtain a plurality of groups of model gradients corresponding to the plurality of initial logic models; and
updating model parameters of the initial logic models based on the plurality of groups of model gradients.
7. The method according to claim 4, used for an electronic device, wherein the electronic device comprises a parameter server and a plurality of computing nodes, the computing node comprises a plurality of computing units, and each computing unit comprises the initial logic model, the method comprising:
controlling each computing unit to obtain sub-sample data from the sample data, and to train a corresponding initial logic model based on the sub-sample data to obtain a model gradient of the corresponding initial logic model;
uploading the model gradient obtained by the training of each computing unit to the computing node corresponding to the computing unit;
controlling the computing node to process the received model gradients, and uploading the processed model gradients to the parameter server;
controlling the parameter server to update model parameters of the initial logic model based on the received processed model gradients;
sending the updated model parameters to each computing unit; and
controlling each computing unit to update the corresponding initial logic model based on the received updated model parameters.
8. The method according to claim 7, wherein:
the plurality of computing units belonging to one computing node comprise one main computing unit and at least one secondary computing unit;
configuring the model parameters of the initial logic models based on the loss function information comprises:
controlling the main computing unit to extract the loss function information from the training logic data;
controlling the main computing unit to configure the model parameters according to the loss function information; and
controlling the secondary computing unit to copy the configured model parameters by accessing the main computing unit; and
constructing the model computation graph and the gradient computation graph based on the model parameters and the gradient information comprises:
controlling the main computing unit and the secondary computing unit to extract the gradient information from the training logic data; and
controlling the main computing unit and the secondary computing unit to construct the model computation graph and the gradient computation graph based on the configured model parameters and the gradient information.
9. A logic model system, comprising:
a plurality of initial logic models, wherein the plurality of initial logic models are logic models constructed based on training logic data input by a user,
wherein the plurality of initial logic models can be used to execute:
being trained based on sample data to obtain a plurality of trained initial logic models, wherein the plurality of trained initial logic models can be used to determine a target logic model.
10. A data processing system, comprising:
a first receiving module, configured to receive training logic data input by a user, wherein the training logic data can be used to construct an initial logic model;
a construction module, configured to construct a plurality of initial logic models based on the training logic data;
a training module, configured to control the plurality of initial logic models to be trained based on sample data, to obtain a plurality of trained initial logic models; and
a determination module, configured to determine a target logic model according to the plurality of trained initial logic models.
CN201811018904.1A 2018-08-31 2018-08-31 Data processing method, logic model system and data processing system Active CN109241139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811018904.1A CN109241139B (en) 2018-08-31 2018-08-31 Data processing method, logic model system and data processing system


Publications (2)

Publication Number Publication Date
CN109241139A true CN109241139A (en) 2019-01-18
CN109241139B CN109241139B (en) 2023-05-26

Family

ID=65060188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811018904.1A Active CN109241139B (en) 2018-08-31 2018-08-31 Data processing method, logic model system and data processing system

Country Status (1)

Country Link
CN (1) CN109241139B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140200878A1 (en) * 2013-01-14 2014-07-17 Xerox Corporation Multi-domain machine translation model adaptation
CN106557846A (en) * 2016-11-30 2017-04-05 成都寻道科技有限公司 Based on university students school data graduation whereabouts Forecasting Methodology
CN106778684A (en) * 2017-01-12 2017-05-31 易视腾科技股份有限公司 deep neural network training method and face identification method


Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059813A (en) * 2019-02-13 2019-07-26 阿里巴巴集团控股有限公司 The method, device and equipment of convolutional neural networks is updated using GPU cluster
CN110059813B (en) * 2019-02-13 2021-04-06 创新先进技术有限公司 Method, device and equipment for updating convolutional neural network by using GPU cluster
US11640531B2 (en) 2019-02-13 2023-05-02 Advanced New Technologies Co., Ltd. Method, apparatus and device for updating convolutional neural network using GPU cluster
CN110009100A (en) * 2019-03-28 2019-07-12 北京中科寒武纪科技有限公司 The calculation method and Related product of customized operator
CN110009100B (en) * 2019-03-28 2021-01-05 安徽寒武纪信息科技有限公司 Calculation method of user-defined operator and related product
CN110175335A (en) * 2019-05-08 2019-08-27 北京百度网讯科技有限公司 The training method and device of translation model
CN110175335B (en) * 2019-05-08 2023-05-09 北京百度网讯科技有限公司 Translation model training method and device
CN110135582A (en) * 2019-05-09 2019-08-16 北京市商汤科技开发有限公司 Neural metwork training, image processing method and device, storage medium
WO2021052422A1 (en) * 2019-09-17 2021-03-25 第四范式(北京)技术有限公司 System and method for executing automated machine learning solution, and electronic apparatus
CN111860828A (en) * 2020-06-15 2020-10-30 北京仿真中心 Neural network training method, storage medium and equipment
CN111860828B (en) * 2020-06-15 2023-11-28 北京仿真中心 Neural network training method, storage medium and equipment
CN112001500A (en) * 2020-08-13 2020-11-27 星环信息科技(上海)有限公司 Model training method, device and storage medium based on longitudinal federated learning system

Also Published As

Publication number Publication date
CN109241139B (en) 2023-05-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant