WO2023221407A1 - Model generation method, apparatus, and electronic device - Google Patents

Model generation method, apparatus, and electronic device

Info

Publication number
WO2023221407A1
WO2023221407A1 (PCT/CN2022/128534, CN2022128534W)
Authority
WO
WIPO (PCT)
Prior art keywords
calculation graph
differential
target
graph
order
Prior art date
Application number
PCT/CN2022/128534
Other languages
English (en)
French (fr)
Inventor
李懋林
胡晓光
白童心
刘红雨
于佃海
高铁柱
马艳军
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司
Publication of WO2023221407A1 publication Critical patent/WO2023221407A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Definitions

  • The present disclosure relates to the field of computer technology, and in particular to the field of deep learning; specifically, it relates to a model generation method, apparatus, and electronic device.
  • In conventional deep learning tasks, building a neural network is divided into building the forward network and building the reverse (backpropagation) network; after the forward network is built, the reverse network can be completed by using the automatic differentiation mechanism of the deep learning framework to take first-order derivatives of the operators in the forward network.
  • The present disclosure provides a model generation method, apparatus, and electronic device.
  • According to a first aspect, a model generation method is provided, including:
  • obtaining a forward propagation calculation graph of a target model, where the forward propagation calculation graph includes basic operators, and a basic operator is an operator that performs a single mathematical operation;
  • performing differential transformation on the basic operators in the forward propagation calculation graph to obtain a target calculation graph, where the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph; and
  • generating the target model based on the target calculation graph.
  • According to a second aspect, a model generation apparatus is provided, including:
  • an acquisition module, configured to obtain the forward propagation calculation graph of the target model, where the forward propagation calculation graph includes basic operators, and a basic operator is an operator that performs a single mathematical operation;
  • a differential transformation module, configured to perform differential transformation on the basic operators in the forward propagation calculation graph to obtain a target calculation graph, where the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph; and
  • a generation module, configured to generate the target model based on the target calculation graph.
  • According to a third aspect, an electronic device is provided, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method described in the first aspect.
  • According to a fourth aspect, a non-transitory computer-readable storage medium storing computer instructions is provided, where the computer instructions are used to cause a computer to execute the method described in the first aspect.
  • According to a fifth aspect, a computer program product is provided, including a computer program that implements the method of the first aspect when executed by a processor.
  • Figure 1 is a flow chart of a model generation method provided by an embodiment of the present disclosure
  • Figure 2 is a schematic flow chart of the process of using linearize rules to process basic operators in the forward propagation calculation graph to obtain the forward differential calculation graph in an embodiment of the present disclosure
  • Figure 3 is a schematic flow chart of the process of using transpose rules to process the linearized part of the forward differential calculation graph to obtain the reverse differential calculation graph in an embodiment of the present disclosure
  • Figure 4 is a schematic structural diagram of a procedural architecture framework provided by an embodiment of the present disclosure.
  • Figure 5 is a first schematic structural diagram of a model generation device provided by an embodiment of the present disclosure.
  • Figure 6 is a schematic structural diagram of an acquisition module in an embodiment of the present disclosure.
  • Figure 7 is a second schematic structural diagram of a model generation device provided by an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device for implementing a model generation method provided by an embodiment of the present disclosure.
  • Referring to Figure 1, an embodiment of the present disclosure provides a model generation method, which includes the following steps:
  • Step S101 Obtain the forward propagation calculation graph of the target model.
  • the forward propagation calculation graph includes a basic operator, and the basic operator is an operator with one mathematical operation;
  • Step S102 Perform differential transformation on the basic operators in the forward propagation calculation graph to obtain a target calculation graph, where the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph;
  • Step S103 Generate the target model based on the target calculation graph.
  • The above model generation method can be applied to build various models with an automatic differentiation function; that is, the target model can be any type of model that has an automatic differentiation function.
  • For example, the target model may be a model built for flow-field analysis in fluid mechanics, or a model built in a geological exploration scenario to analyze soil composition.
  • The target model may be a model built for the lid-driven cavity flow (LDC) problem, a model built for the Darcy problem in porous-media fluid mechanics to correctly fit the pressure distribution in the soil, a model in fields such as computer vision (CV) or natural language processing (NLP), or a model used in a geological exploration scenario to analyze the distribution of oil in the soil.
  • In one embodiment of the present disclosure, the target model is a model built for the LDC problem. The LDC problem is a classic problem in computational fluid dynamics: a cavity closed on three sides and open at the top is filled with liquid; given that the liquid at the top has a horizontal flow velocity u, the goal is to simulate the liquid flow velocity (both horizontal and vertical components) at every point in the cavity. That is, the target model is used to compute the liquid flow velocity at each point in the cavity from the given horizontal velocity of the liquid at the top.
  • For the LDC problem, in an embodiment of the present disclosure a 10-layer fully connected network with 50 nodes per hidden layer can be used as the neural network model; the rectangular region from [-0.05, -0.05] to [0.05, 0.05] is divided into a grid at a granularity of 100*100, a loss function Loss is designed from the system of partial differential equations and the boundary conditions, and training is performed to obtain the target model.
  • By solving the partial differential equations with the target model, the liquid flow velocity distribution in the horizontal and vertical directions in the cavity is correctly simulated; compared with the traditional method based on the OpenFOAM software, the mean squared error of the result is on the order of 1e-4.
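  • As a purely illustrative aid, the following minimal sketch shows what such a setup can look like in Python, assuming a PaddlePaddle 2.x-style eager API (paddle.nn, paddle.grad, paddle.optimizer); the network size and grid follow the embodiment above, but the PDE residual shown is only the continuity equation and the boundary term is a hypothetical placeholder, so this is not the patent's actual implementation.

    import paddle

    # 10 fully connected layers, 50 nodes per hidden layer (as in the embodiment above)
    def make_net(width=50, depth=10, in_dim=2, out_dim=3):
        layers, dim = [], in_dim
        for _ in range(depth - 1):
            layers += [paddle.nn.Linear(dim, width), paddle.nn.Tanh()]
            dim = width
        layers.append(paddle.nn.Linear(dim, out_dim))   # outputs: u, v, p
        return paddle.nn.Sequential(*layers)

    net = make_net()

    # 100 x 100 grid on the rectangle [-0.05, 0.05] x [-0.05, 0.05]
    xs = paddle.linspace(-0.05, 0.05, 100)
    ys = paddle.linspace(-0.05, 0.05, 100)
    gx, gy = paddle.meshgrid(xs, ys)
    pts = paddle.stack([gx.flatten(), gy.flatten()], axis=1)
    pts.stop_gradient = False

    def pde_loss(net, pts):
        out = net(pts)
        u, v = out[:, 0:1], out[:, 1:2]
        # first-order derivatives via the framework's reverse-mode differentiation;
        # the second-order Navier-Stokes terms would be obtained the same way,
        # which is where the higher-order machinery of this disclosure is needed
        du = paddle.grad(u, pts, create_graph=True)[0]
        dv = paddle.grad(v, pts, create_graph=True)[0]
        continuity = du[:, 0:1] + dv[:, 1:2]            # u_x + v_y = 0
        # boundary_loss(...) would enforce the lid velocity and no-slip walls;
        # it is a hypothetical helper not spelled out in the patent
        return paddle.mean(continuity ** 2)  # + boundary_loss(net)

    opt = paddle.optimizer.Adam(learning_rate=1e-3, parameters=net.parameters())
    for step in range(1000):
        loss = pde_loss(net, pts)
        loss.backward()
        opt.step()
        opt.clear_grad()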
  • The specific generation process of the target model may include the following steps:
  • First, a fully connected network is built as the basic forward pass of the network; that is, the forward propagation calculation graph is constructed. The target model is then generated using the model generation method provided by the embodiments of the present disclosure.
  • Specifically, the forward differential calculation graph and the reverse differential calculation graph can be generated with the method provided by the present disclosure; the forward propagation process of the target model is then built from the forward differential calculation graph, and the whole backpropagation process of the network is built from the reverse differential calculation graph, thereby generating the target model.
  • The above basic operators are operators that perform a single mathematical operation; for example, they can be multiplication operators, addition operators, negation operators, and so on. It can be understood that all operators in the forward propagation calculation graph are basic operators.
  • Forward automatic differentiation can be performed on the basic operators in the forward propagation calculation graph, according to the rules in the related art for forward automatic differentiation of basic operators, to obtain the forward differential calculation graph.
  • Likewise, reverse automatic differentiation can be performed on the basic operators in the forward propagation calculation graph, according to the rules in the related art for reverse automatic differentiation of basic operators, to obtain the reverse differential calculation graph.
  • Generating the target model based on the target calculation graph may mean that the operators of the target calculation graph are added to the corresponding layers of the constructed neural network model, and the operators in each layer are then connected according to the connection relationships between the operators in the target calculation graph.
  • In this implementation, the target calculation graph is obtained by performing differential transformation on the basic operators in the forward propagation calculation graph. Because the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph, the differential transformation of basic operators can realize both the forward process and the reverse process of model generation, which helps simplify the model generation process.
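  • Putting steps S101 to S103 together, the overall flow can be pictured with the short pseudo-Python sketch below; the helper names (build_initial_graph, orig2prim, linearize, transpose, prim2orig, build_model) follow the rules named in this disclosure, but their signatures are illustrative assumptions rather than a real framework API.

    def generate_model(model_definition, mode="reverse"):
        # Step S101: forward propagation calculation graph made of basic operators
        initial_graph = build_initial_graph(model_definition)   # native operators
        forward_graph = orig2prim(initial_graph)                 # basic operators only
        # Step S102: differential transformation of the basic operators
        linearized = linearize(forward_graph)                    # forward differential graph
        target_graph = transpose(linearized) if mode == "reverse" else linearized
        # Step S103: convert back to executable (native) operators and build the model
        return build_model(prim2orig(target_graph))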
  • Optionally, when automatic differentiation is performed in the forward differential mode, linearize-rule differential transformation is performed on the basic operators in the forward propagation calculation graph to obtain the forward differential calculation graph.
  • Specifically, performing differential transformation on a basic operator based on the linearize rule may mean linearizing the basic operator according to a predefined linearize rule, where the linearize rule may be a common linearization method in the related art.
  • Referring to Figure 2, the forward propagation calculation graph may be a calculation graph generated from the following target model:
    def f(x, y):
        return exp(x) / sin(y)
  • In an embodiment of the present disclosure, the target model can be generated based on a procedural architecture framework. The design and implementation of such a framework rely on many states and are centered on multi-dimensional data tensors; for automatic differentiation there are states such as the original tensor, the forward differential tensor, the reverse differential tensor, the first-order differential tensor, and the second-order differential tensor.
  • In Figure 2, x and y are the inputs of the forward propagation calculation graph and z is its output; that is, x, y and z serve as the original tensors.
  • Correspondingly, x', y' and z' in Figure 2 serve as the forward differential tensors. Before the basic operators in the forward propagation calculation graph are differentially transformed, x', y' and z' can first be marked in the graph.
  • Referring to Figure 2, in the forward propagation calculation graph the exponential (exp) operator is first applied to x to obtain the intermediate result t0; at the same time, the sin operator is applied to y to obtain the intermediate result t1; then the div operator performs division with t0 as the dividend and t1 as the divisor to obtain z.
  • Applying the exp operator to x to obtain t0 is a forward computation; its corresponding forward differential process is: input t0 and x' into the mul operator for multiplication to obtain t4, where t4 is the forward differential tensor corresponding to t0.
  • Similarly, applying the sin operator to y to obtain t1 is a forward computation; its corresponding forward differential process is: first input y into the cos operator to obtain the intermediate result t2, then input t2 and y' into the mul operator for multiplication to obtain t5, where t5 is the forward differential tensor corresponding to t1.
  • Correspondingly, dividing t0 by t1 to obtain z is a forward computation; its corresponding forward differential process is: input t4 and t1 into the div operator to obtain t6; input t5 into the neg operator for negation to obtain t7; input t0 and t7 into the mul operator to obtain t8; input t1 into the pow(-2) operator to obtain t3; input t3 and t8 into the mul operator to obtain t9; and finally input t6 and t9 into the add operator to obtain z', which completes the generation of the forward differential calculation graph.
  • In this implementation, the forward differential calculation graph can be obtained simply by applying linearize-rule differential transformation to the basic operators in the forward propagation calculation graph, which further simplifies the model generation process.
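  • The forward differential computation of Figure 2 can be reproduced with the following self-contained Python sketch for f(x, y) = exp(x)/sin(y), using plain floats; the intermediate names t0 to t9 match the description above. This is only an illustration of the effect of the linearize rule, not the framework's implementation.

    import math

    def f_and_jvp(x, y, x_dot, y_dot):
        # forward pass (basic operators only)
        t0 = math.exp(x)              # exp
        t1 = math.sin(y)              # sin
        z = t0 / t1                   # div
        # linearized (forward differential) part, mirroring Figure 2
        t4 = t0 * x_dot               # mul: forward differential of exp(x)
        t2 = math.cos(y)              # cos
        t5 = t2 * y_dot               # mul: forward differential of sin(y)
        t6 = t4 / t1                  # div
        t7 = -t5                      # neg
        t8 = t0 * t7                  # mul
        t3 = t1 ** -2                 # pow(-2)
        t9 = t3 * t8                  # mul
        z_dot = t6 + t9               # add
        return z, z_dot

    # quick check against the analytic derivative dz/dx = exp(x)/sin(y)
    z, dz_dx = f_and_jvp(0.3, 1.1, 1.0, 0.0)
    assert abs(dz_dx - math.exp(0.3) / math.sin(1.1)) < 1e-12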
  • Optionally, performing differential transformation on the basic operators in the forward propagation calculation graph to obtain the target calculation graph includes: when automatic differentiation is performed in the reverse differential mode, differentially transforming the basic operators in the forward propagation calculation graph according to the linearize rule and then the transpose rule, in sequence, to obtain the reverse differential calculation graph.
  • Specifically, the basic operators in the forward propagation calculation graph can first be transformed according to the linearize rule to obtain the forward differential calculation graph, and the operators of the forward differential calculation graph are then transformed according to the transpose rule to obtain the reverse differential calculation graph.
  • Performing differential transformation on an operator based on the transpose rule may mean transposing the operator according to a predefined transpose rule, where the transpose rule may be a common transposition method in the related art.
  • Figure 3 is a schematic flow chart of the process in which, on the basis of the forward differential calculation graph, the linearized operators are differentially transformed according to the transpose rule to obtain the reverse differential calculation graph.
  • In generating the reverse differential calculation graph, x', y' and z' serve as the original tensors, and the corresponding reverse differential tensors x_bar, y_bar and z_bar are first marked in the graph. Each linearized operator in the forward differential calculation graph is then processed according to the transpose rule, yielding the reverse differential calculation graph shown in Figure 3; its specific generation process is illustrated there, with z_bar as the input of the reverse differentiation and x_bar and y_bar as its outputs.
  • In this implementation, the reverse differential calculation graph is obtained by differentially transforming the linearized operators of the forward differential calculation graph according to the transpose rule; the backpropagation part of the target model can then be generated from the reverse differential calculation graph.
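  • Continuing the same example, the sketch below illustrates the effect of the transpose rule: the linearized part of Figure 2, which is linear in (x', y'), is read backwards so that a single cotangent z_bar is propagated to x_bar and y_bar as in Figure 3. Again this is plain illustrative Python, not the framework code.

    import math

    def f_and_vjp(x, y, z_bar):
        # forward pass (same basic operators as before)
        t0 = math.exp(x)
        t1 = math.sin(y)
        z = t0 / t1
        t2 = math.cos(y)
        t3 = t1 ** -2
        # transposed (reverse differential) part: propagate z_bar backwards
        t6_bar = z_bar                # z' = t6 + t9, so the add fans z_bar out
        t9_bar = z_bar
        t4_bar = t6_bar / t1          # transpose of t6 = t4 / t1
        t8_bar = t3 * t9_bar          # transpose of t9 = t3 * t8
        t7_bar = t0 * t8_bar          # transpose of t8 = t0 * t7
        t5_bar = -t7_bar              # transpose of t7 = -t5
        x_bar = t0 * t4_bar           # transpose of t4 = t0 * x'
        y_bar = t2 * t5_bar           # transpose of t5 = t2 * y'
        return z, x_bar, y_bar

    # with z_bar = 1 this yields the gradient of f
    z, dx, dy = f_and_vjp(0.3, 1.1, 1.0)
    assert abs(dx - math.exp(0.3) / math.sin(1.1)) < 1e-12
    assert abs(dy + math.exp(0.3) * math.cos(1.1) / math.sin(1.1) ** 2) < 1e-12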
  • In some complex deep learning tasks, higher-order derivatives are sometimes needed; in deep learning tasks in scientific computing, where systems of partial differential equations are introduced, higher-order derivatives are often required.
  • In particular, reverse differentiation is more efficient when the number of inputs is greater than the number of outputs, while forward differentiation is more efficient when the number of inputs is less than the number of outputs; in higher-order differential computation the number of outputs grows with the order, so forward differentiation becomes increasingly important.
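  • For example, for a function that maps n inputs to m outputs, assembling the full Jacobian needs n forward-differential evaluations (one per input direction) but only m reverse-differential evaluations (one per output cotangent); with a scalar training loss (m = 1) over millions of parameters, reverse differentiation does the job in a single pass, whereas forward differentiation becomes the cheaper choice once the number of outputs exceeds the number of inputs.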
  • Optionally, performing differential transformation on the basic operators in the forward propagation calculation graph to obtain the target calculation graph includes:
  • when the target calculation graph is an I-th order differential calculation graph, performing I target differential transformations on the basic operators in the forward propagation calculation graph, where the k-th of the I target differential transformations includes:
  • performing linearize-rule differential transformation on the basic operators in a first target calculation graph to obtain the k-th order forward differential calculation graph; or performing linearize-rule differential transformation on the basic operators in the first target calculation graph to obtain the k-th order forward differential calculation graph, and then performing transpose-rule differential transformation on the basic operators in the k-th order forward differential calculation graph to obtain the k-th order reverse differential calculation graph;
  • where k is an integer greater than 0 and not greater than I, and I is an integer not less than 2; when k equals 1, the first target calculation graph is the forward propagation calculation graph, and when k is not equal to 1, the first target calculation graph is the (k-1)-th order forward differential calculation graph or the (k-1)-th order reverse differential calculation graph.
  • It can be understood that a differential calculation graph of any order may include a forward differential calculation graph and a reverse differential calculation graph of that order. That is, performing forward differentiation on the (k-1)-th order forward differential calculation graph yields the k-th order forward differential calculation graph, and performing reverse differentiation on it yields the k-th order reverse differential calculation graph; likewise, performing forward differentiation on the (k-1)-th order reverse differential calculation graph yields the k-th order forward differential calculation graph, and performing reverse differentiation on it yields the k-th order reverse differential calculation graph.
  • In this implementation, operators are processed by alternately applying the linearize rule and the transpose rule. Automatic differentiation of any order can thus be achieved: by alternately applying the linearize and transpose rules to the basic operators in the forward propagation calculation graph, the higher-order differential part of the forward or reverse process can be constructed, which further simplifies the model generation process.
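  • The following self-contained sketch illustrates, for the same example function, how alternating the two rules reaches order two: the reverse differential of f (as produced by the transpose rule in Figure 3) is written out explicitly and then linearized once more, with a small dual-number class standing in for the framework's forward transformation. It is an illustration only, not the disclosed implementation.

    import math

    class Dual:
        """A value together with its forward differential (one linearize pass)."""
        def __init__(self, val, dot=0.0):
            self.val, self.dot = float(val), float(dot)
        def __add__(self, other):
            other = _lift(other)
            return Dual(self.val + other.val, self.dot + other.dot)
        __radd__ = __add__
        def __mul__(self, other):
            other = _lift(other)
            return Dual(self.val * other.val,
                        self.val * other.dot + self.dot * other.val)
        __rmul__ = __mul__
        def __truediv__(self, other):
            other = _lift(other)
            return Dual(self.val / other.val,
                        (self.dot * other.val - self.val * other.dot) / (other.val ** 2))
        def __neg__(self):
            return Dual(-self.val, -self.dot)

    def _lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def exp(x):
        x = _lift(x)
        return Dual(math.exp(x.val), math.exp(x.val) * x.dot)

    def sin(x):
        x = _lift(x)
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

    def cos(x):
        x = _lift(x)
        return Dual(math.cos(x.val), -math.sin(x.val) * x.dot)

    def grad_f(x, y, z_bar=1.0):
        # reverse differential of f(x, y) = exp(x)/sin(y), written with the same
        # primitives (this is what the transpose rule of Figure 3 produces)
        t0, t1 = exp(x), sin(y)
        x_bar = z_bar * t0 / t1
        y_bar = -(z_bar * t0 * cos(y)) / (t1 * t1)
        return x_bar, y_bar

    # second order: apply the forward (linearize) transformation to the reverse result
    d2f_dx2 = grad_f(Dual(0.3, 1.0), Dual(1.1, 0.0))[0].dot
    assert abs(d2f_dx2 - math.exp(0.3) / math.sin(1.1)) < 1e-12   # d2/dx2 of exp(x)/sin(y)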
  • Optionally, obtaining the forward propagation calculation graph of the target model includes: obtaining an initial calculation graph of the target model, where the initial calculation graph includes native operators; and converting the native operators in the initial calculation graph into the basic operators to obtain the forward propagation calculation graph;
  • where a native operator is an executable operator having the functionality of at least one basic operator.
  • The above native operator can be an operator formed from several different types of basic operators; that is, a native operator can include two or more mathematical operations. For example, when a native operator is expressed as exp(x)/sin(y), it comprises the following three basic operators: the exp operator, the div operator, and the sin operator.
  • Specifically, before constructing the target model the user first builds a formula model of the target model; for example, the formula model of the target model can be expressed as:
    def f(x, y):
        return exp(x) / sin(y)
  • A corresponding calculation flow chart can then be drawn from the formula model; that is, the initial calculation graph is drawn.
  • Because the operators in a user-written formula are usually native operators, a native operator usually contains more than one basic operator, and the user can customize the composition of a native operator, automatic differentiation of native operators is usually not possible in the related art. For this reason, in the embodiments of the present disclosure the native operators in the initial calculation graph are converted into basic operators, yielding a forward propagation calculation graph composed of basic operators, so that automatic differentiation can subsequently be performed on the basic operators in the forward propagation calculation graph.
  • Basic operators and native operators can share a set of standardized intermediate representations, but unlike the operators in the native operator system, basic operators do not include a kernel (core) implementation. Basic operators are used to express semantics, to convert to and from the native operator system, and to undergo automatic differential transformation; because a basic operator contains no kernel implementation, it cannot be executed directly.
  • Correspondingly, a native operator does include a kernel implementation and is therefore an executable operator.
  • In an embodiment of the present disclosure, a rule orig2prim for converting native operators into basic operators can be predefined; the orig2prim rule is then used to convert the native operators in the initial calculation graph into basic operators, so as to obtain the forward propagation calculation graph.
  • For example, take the native operator elementwise_add in the deep learning framework Paddle: this operator has two inputs and one output, and carries the attributes scale_x, scale_y and scale_out. Splitting it into basic operators may involve broadcast_p, fill_constant_p, mul_p and add_p: whether broadcast_p is needed is decided from the concrete shapes of the two inputs, and if the attributes scale_x, scale_y or scale_out are not 1.0, the corresponding scaling logic is realized with fill_constant_p and mul_p.
  • In another embodiment of the present disclosure, a native operator can also be split directly according to the basic operators it contains; that is, the basic operators are separated out of the native operator and the relationships between the different basic operators are established with connecting edges.
  • In this implementation, converting the native operators of the initial calculation graph into basic operators yields a forward propagation calculation graph composed of basic operators, so that automatic differentiation can subsequently be performed on those basic operators.
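  • As an informal sketch of what an orig2prim rule for the elementwise_add example above might look like (the operator names and the attributes scale_x, scale_y and scale_out come from the description; the graph-building helper graph.add_op is a hypothetical convenience, not a real framework call):

    def orig2prim_elementwise_add(graph, x, y, scale_x=1.0, scale_y=1.0, scale_out=1.0):
        # broadcast_p is inserted only if the two input shapes differ
        if x.shape != y.shape:
            y = graph.add_op("broadcast_p", y, shape=x.shape)
        # scale logic is materialised with fill_constant_p + mul_p only when needed
        if scale_x != 1.0:
            x = graph.add_op("mul_p", x,
                             graph.add_op("fill_constant_p", value=scale_x, shape=x.shape))
        if scale_y != 1.0:
            y = graph.add_op("mul_p", y,
                             graph.add_op("fill_constant_p", value=scale_y, shape=y.shape))
        out = graph.add_op("add_p", x, y)
        if scale_out != 1.0:
            out = graph.add_op("mul_p", out,
                               graph.add_op("fill_constant_p", value=scale_out, shape=out.shape))
        return out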
  • Optionally, after the basic operators in the forward propagation calculation graph are differentially transformed to obtain the target calculation graph, the method further includes: converting the basic operators in the target calculation graph into native operators to obtain a target differential calculation graph;
  • and generating the target model based on the target calculation graph includes: generating the target model based on the target differential calculation graph.
  • Because the operators in the target calculation graph are basic operators, which cannot be executed directly, in the embodiments of the present disclosure the basic operators in the target calculation graph are converted into native operators after the differential transformation is completed, so that the generated target model can execute the operators in the target differential calculation graph.
  • Specifically, a rule prim2orig for converting basic operators into native operators can be defined in advance; the prim2orig rule is then used to convert the basic operators in the target calculation graph into native operators, yielding the target differential calculation graph.
  • For example, take the basic operator add_p: it has two inputs, one output and no attributes. It is converted into the native operator elementwise_add, with the native operator's three attributes scale_x, scale_y and scale_out all set to 1.0. It can be understood that other methods in the related art can also be used to convert between native operators and basic operators, which is not limited here.
  • In this implementation, converting the basic operators in the target calculation graph into native operators after the differential transformation allows the generated target model to execute the operators in the target differential calculation graph.
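  • The opposite direction, a prim2orig rule for the basic operator add_p discussed above, can be sketched in the same spirit (again with the hypothetical graph helper):

    def prim2orig_add_p(graph, x, y):
        # add_p has two inputs, one output and no attributes; elementwise_add recovers it
        # with its three scale attributes left at the neutral value 1.0
        return graph.add_op("elementwise_add", x, y,
                            scale_x=1.0, scale_y=1.0, scale_out=1.0)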
  • Optionally, the calculation graph includes a forward differential calculation graph and a reverse differential calculation graph, and the target model includes a forward network and a reverse network; generating the target model based on the target calculation graph includes:
  • generating the forward network based on the forward propagation calculation graph and the I-th order target calculation graph, and generating the reverse network based on the (I+1)-th order target calculation graph, where the I-th order target calculation graph is the I-th order forward differential calculation graph or the I-th order reverse differential calculation graph, and the (I+1)-th order target calculation graph is the (I+1)-th order forward differential calculation graph or the (I+1)-th order reverse differential calculation graph.
  • It can be understood that the target calculation graph includes the I-th order target calculation graph and the (I+1)-th order target calculation graph.
  • The above (I+1)-th order forward differential calculation graph may be a calculation graph obtained by performing forward or reverse differentiation on the basic operators in the I-th order target calculation graph; correspondingly, the (I+1)-th order reverse differential calculation graph may be a calculation graph obtained by performing forward or reverse differentiation on the basic operators in the I-th order target calculation graph.
  • In this implementation, the forward network is generated from the forward propagation calculation graph and the I-th order target calculation graph, and the reverse network is generated from the (I+1)-th order target calculation graph, thereby completing the generation of the target model.
  • Referring to Figure 4, embodiments of the present disclosure provide a procedural architecture framework on which the differential transformation process of the above model generation method can be implemented.
  • As shown in Figure 4, a set of standardized intermediate representations is predefined in the procedural architecture framework, and native operators and basic operators share these standardized intermediate representations.
  • The process of performing forward and reverse automatic differentiation on an initial calculation graph based on the procedural architecture framework is as follows: the orig2prim rule is first used to convert the native operators in the initial calculation graph into basic operators; the linearize rule and the transpose rule are then used to differentially transform the basic operators, and in this process forward automatic differentiation, reverse automatic differentiation and higher-order automatic differentiation can all be realized, yielding the target calculation graph; finally, the prim2orig rule is used to convert the basic operators in the target calculation graph into native operators, yielding the target differential calculation graph.
  • To implement the above automatic differentiation process, the embodiments of the present disclosure also design the following interfaces for the procedural architecture framework:
  • gradients(xs, ys, ys_bar) -> xs_bar: reverse automatic differentiation interface;
  • forward_gradients(xs, ys, xs_dot) -> ys_dot: forward automatic differentiation interface;
  • enable_prim(): turns on the automatic differentiation mechanism based on basic operators;
  • disable_prim(): turns off the automatic differentiation mechanism based on basic operators;
  • prim_enabled(): its return value indicates whether the basic-operator-based automatic differentiation mechanism is enabled;
  • orig2prim(): converts the native operator system into the basic operator system;
  • prim2orig(): converts the basic operator system into the native operator system.
  • The embodiments of the present disclosure provide a complete automatic differentiation scheme that supports both the forward and reverse modes and higher-order differentiation, and that is based on a procedural architecture, which has better applicability than a functional architecture.
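  • A hedged usage sketch of these interfaces is given below; the interface names and signatures are those listed above, while the import location and the exact call pattern are assumptions for illustration only.

    # Assumed import location; the disclosure does not specify the module path.
    # from paddle.incubate.autograd import (enable_prim, disable_prim, prim_enabled,
    #                                       gradients, forward_gradients, prim2orig)

    def autodiff_example(xs, ys, ys_bar, xs_dot):
        enable_prim()                                # use basic-operator automatic differentiation
        assert prim_enabled()
        xs_bar = gradients(xs, ys, ys_bar)           # reverse mode: xs_bar
        xs_bar2 = gradients(xs, xs_bar, None)        # second order: differentiate the gradient again
        ys_dot = forward_gradients(xs, ys, xs_dot)   # forward mode: ys_dot
        prim2orig()                                  # convert basic operators back to executable ones
        disable_prim()
        return xs_bar, xs_bar2, ys_dot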
  • FIG. 5 is a schematic structural diagram of a model generation device 500 provided by an embodiment of the present disclosure.
  • the model generation device 500 includes:
  • the acquisition module 501 is used to obtain the forward propagation calculation graph of the target model.
  • the forward propagation calculation graph includes basic operators, and the basic operators are operators with one mathematical operation;
  • the differential transformation module 502 is configured to perform differential transformation on the basic operators in the forward propagation calculation graph to obtain a target calculation graph, where the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph;
  • the generation module 503 is configured to generate the target model based on the target calculation graph.
  • Optionally, the differential transformation module 502 is specifically configured to, when automatic differentiation is performed in the forward differential mode, perform linearize-rule differential transformation on the basic operators in the forward propagation calculation graph to obtain the forward differential calculation graph.
  • Optionally, the differential transformation module 502 is specifically configured to, when automatic differentiation is performed in the reverse differential mode, perform differential transformation on the basic operators in the forward propagation calculation graph according to the linearize rule and then the transpose rule, in sequence, to obtain the reverse differential calculation graph.
  • Optionally, the differential transformation module 502 is specifically configured to, when the target calculation graph is an I-th order differential calculation graph, perform I target differential transformations on the basic operators in the forward propagation calculation graph, where the k-th of the I target differential transformations includes: performing linearize-rule differential transformation on the basic operators in a first target calculation graph to obtain the k-th order forward differential calculation graph; or performing linearize-rule differential transformation on the basic operators in the first target calculation graph to obtain the k-th order forward differential calculation graph, and then performing transpose-rule differential transformation on the basic operators in the k-th order forward differential calculation graph to obtain the k-th order reverse differential calculation graph;
  • where k is an integer greater than 0 and not greater than I, and I is an integer not less than 2; when k equals 1, the first target calculation graph is the forward propagation calculation graph, and when k is not equal to 1, the first target calculation graph is the (k-1)-th order forward differential calculation graph or the (k-1)-th order reverse differential calculation graph.
  • Optionally, referring to Figure 6, the acquisition module 501 includes:
  • an acquisition submodule 5011, configured to obtain the initial calculation graph of the target model, where the initial calculation graph includes native operators;
  • a conversion submodule 5012, configured to convert the native operators in the initial calculation graph into the basic operators to obtain the forward propagation calculation graph;
  • where a native operator is an executable operator having the functionality of at least one basic operator.
  • the device also includes:
  • the conversion module 504 is used to convert the basic operators in the target calculation graph into native operators to obtain the target differential calculation graph;
  • the generation module 503 is specifically configured to generate the target model based on the target differential calculation graph.
  • Optionally, the calculation graph includes a forward differential calculation graph and a reverse differential calculation graph, and the target model includes a forward network and a reverse network; the generation module 503 is specifically configured to generate the forward network based on the forward propagation calculation graph and the I-th order target calculation graph, and to generate the reverse network based on the (I+1)-th order target calculation graph, where the I-th order target calculation graph is the I-th order forward differential calculation graph or the I-th order reverse differential calculation graph, and the (I+1)-th order target calculation graph is the (I+1)-th order forward differential calculation graph or the (I+1)-th order reverse differential calculation graph.
  • model generation device 500 provided in this embodiment can implement all the technical solutions of the above model generation method embodiments, and therefore can at least achieve all the above technical effects, which will not be described again here.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure.
  • Electronic devices are intended to refer to various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • As shown in Figure 8, the electronic device 800 includes a computing unit 801, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803.
  • The RAM 803 can also store various programs and data required for the operation of the device 800.
  • Computing unit 801, ROM 802 and RAM 803 are connected to each other via bus 804.
  • An input/output (I/O) interface 805 is also connected to bus 804.
  • Multiple components in the electronic device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and speakers; a storage unit 808, such as a magnetic disk or an optical disc; and a communication unit 809, such as a network card, a modem, or a wireless communication transceiver.
  • the communication unit 809 allows the device 800 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
  • The computing unit 801 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, and the like.
  • the computing unit 801 performs various methods and processes described above, such as the model generation method.
  • the model generation method may be implemented as a computer software program that is tangibly embodied in a machine-readable medium, such as storage unit 808.
  • part or all of the computer program may be loaded and/or installed onto device 800 via ROM 802 and/or communication unit 809 .
  • When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the model generation method described above are performed.
  • the computing unit 801 may be configured to perform the model generation method in any other suitable manner (eg, by means of firmware).
  • Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, such that when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • the program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or devices, or any suitable combination of the foregoing.
  • More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having: a display device for displaying information to the user, such as a cathode ray tube (CRT) or liquid crystal display (LCD) monitor; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer.
  • Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, voice input, or tactile input.
  • The systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer with a graphical user interface or web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communications network). Examples of communication networks include: Local Area Network (LAN), Wide Area Network (Wide Area Network, WAN), and the Internet.
  • Computer systems may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact over a communications network.
  • the relationship of client and server is created by computer programs running on corresponding computers and having a client-server relationship with each other.
  • the server can be a cloud server, a distributed system server, or a server combined with a blockchain.
  • Those of ordinary skill in the art will appreciate that all or part of the processes of the above method embodiments can be implemented by a computer program controlling the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of each of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
  • For a hardware implementation, modules, units, sub-modules, sub-units, and the like can be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or combinations thereof.
  • the technology described in some embodiments of the present disclosure may be implemented through modules (eg, procedures, functions, etc.) that perform the functions described in some embodiments of the present disclosure.
  • Software code may be stored in memory and executed by a processor.
  • the memory can be implemented in the processor or external to the processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A model generation method, apparatus, and electronic device. The method includes: obtaining a forward propagation calculation graph of a target model, where the forward propagation calculation graph includes basic operators and a basic operator is an operator that performs a single mathematical operation (S101); performing differential transformation on the basic operators in the forward propagation calculation graph to obtain a target calculation graph, where the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph (S102); and generating the target model based on the target calculation graph (S103).

Description

模型生成方法、装置和电子设备
相关申请的交叉引用
本申请主张在2022年5月18日在中国提交的中国专利申请号No.202210551044.8的优先权,其全部内容通过引用包含于此。
技术领域
本公开涉及计算机技术领域,尤其涉及深度学习技术领域。具体涉及一种模型生成方法、装置和电子设备。
背景技术
在传统的深度学习任务中,神经网络的搭建分为前向网络的搭建过程和反向网络的搭建过程。在搭建出前向网络之后,可以通过深度学习框架中的自动微分机制对前向网络中的算子求一阶导数,即可完成反向网络的搭建过程。
发明内容
本公开提供了一种模型生成方法、装置和电子设备。
根据本公开的第一方面,提供了一种模型生成方法,包括:
获取目标模型的前向传播计算图,所述前向传播计算图包括基础算子,所述基础算子为具有一次数学运算的算子;
对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图,所述目标计算图包括前向微分计算图和反向微分计算图中的至少一种;
基于所述目标计算图生成所述目标模型。
根据本公开的第二方面,提供了一种模型生成装置,包括:
获取模块,用于获取目标模型的前向传播计算图,所述前向传播计算图包括基础算子,所述基础算子为具有一次数学运算的算子;
微分变换模块,用于对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图,所述目标计算图包括前向微分计算图和反向微分计算图中 的至少一种;
生成模块,用于基于所述目标计算图生成所述目标模型。
根据本公开的第三方面,提供了一种电子设备,包括:
至少一个处理器;以及
与所述至少一个处理器通信连接的存储器;其中,
所述存储器存储有可被所述至少一个处理器执行的指令,所述指令被所述至少一个处理器执行,以使所述至少一个处理器能够执行上述第一方面所述的方法。
根据本公开的第四方面,提供了一种存储有计算机指令的非瞬时计算机可读存储介质,其中,所述计算机指令用于使所述计算机执行上述第一方面所述的方法。
根据本公开的第五方面,提供了一种计算机程序产品,包括计算机程序,所述计算机程序在被处理器执行时实现第一方面所述的方法。
附图说明
附图用于更好地理解本方案,不构成对本公开的限定。其中:
图1是本公开实施例提供的一种模型生成方法的流程图;
图2是本公开实施例中,利用linearize规则对前向传播计算图中的基础算子进行处理,得到前向微分计算图过程的流程示意图;
图3是本公开实施例中,利用transpose规则对前向微分计算图中的线性化部分进行处理,得到反向微分计算图过程的流程示意图;
图4是本公开实施例提供的一种过程式架构框架的结构示意图;
图5是本公开实施例提供的一种生成装置的结构示意图之一;
图6是本公开实施例中获取模块的结构示意图;
图7是本公开实施例提供的一种生成装置的结构示意图之二;
图8本公开实施例提供的用于实现模型生成方法的电子设备的框图。
具体实施方式
以下结合附图对本公开的示范性实施例做出说明,其中包括本公开实施 例的各种细节以助于理解,应当将它们认为仅仅是示范性的。因此,本领域普通技术人员应当认识到,可以对这里描述的实施例做出各种改变和修改,而不会背离本公开的范围和精神。同样,为了清楚和简明,以下的描述中省略了对公知功能和结构的描述。
请参见图1,图1为本公开实施例提供的一种模型生成方法,所述模型生成方法包括以下步骤:
步骤S101、获取目标模型的前向传播计算图,所述前向传播计算图包括基础算子,所述基础算子为具有一次数学运算的算子;
步骤S102、对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图,所述目标计算图包括前向微分计算图和反向微分计算图中的至少一种;
步骤S103、基于所述目标计算图生成所述目标模型。
上述模型生成方法可以应用于搭建各种具有自动微分功能的模型。也即所述目标模型可以是具有自动微分功能的各种类型模型。其中,所述目标模型可以为流体力学中的流场分析场景中搭建的模型,或者,在地质勘探场景中,对土壤成分进行分析场景中所搭建的模型等。
例如,所述目标模型可以为针对顶盖驱动方腔流(Lid-driven Cavity Flow,LDC)问题所搭建的模型。或者,所述目标模型可以是针对多孔介质流体力学场景中的Darcy问题所搭建的模型,以正确拟合出了土壤中的压强分布。或者,所述目标模型可以是图像处理(Compute Vision,CV),自然语言处理(Neuro Liguistic Programming,NLP)等所有领域中的模型。或者,所述目标模型可以为在地质勘探场景中,用于分析土壤中石油分布的模型。
在本公开一个实施例中,所述目标模型可以为针对顶盖驱动方腔流(Lid-driven Cavity Flow,LDC)问题所搭建的模型。其中,所述LDC问题是计算流体力学的一个经典问题,其具体内容为:在一个三面封闭,顶部开放的腔体中装满液体,给定顶部液体有水平方向流速u,目标是模拟出腔体中每个点的液体流速(包括水平方向和垂直方向流速)。即所述目标模型用于基于给定顶部液体有水平方向流速,计算出腔体中每个点的液体流速。
针对所述LDC问题,本公开实施例中,可以使用隐藏层节点数为50的 10层全连接网络作为神经网络模型,在[-0.05,-0.05]到[0.05,0.05]的矩形区域上以100*100的为粒度划分网格,根据偏微分方程组和边界条件设计损失函数Loss,进行训练从而得到所述目标模型。基于所述目标模型对偏微分方程组的求解,从而正确模拟出了腔体内水平方向和垂直方向上的液体流速分布,与基于OpenFOAM软件实现的传统方法结果均方误差在1e-4数量级。
其中,所述目标模型的具体生成过程可以包括如下步骤:
首先搭建全连接网络作为基础的网络前向过程,即搭建出所述前向传播计算图。使用本公开实施例提供的模型生成方法,生成所述目标模型。其中,可以基于本公开提供的方法生成前向微分计算图和反向微分计算图,然后,基于前向微分计算图搭建出目标模型的前向传播过程,基于所述反向微分计算图完成整个网络反向传播过程搭建,从而生成所述目标模型。
上述基础算子为具有一次数学运算的算子,例如,可以是乘法算子、加法算子、取反算子等,请参见下表为本公开实施例列举的部分基础算子:
[Table: list of basic operators given in this embodiment; image not reproduced]
可以理解的是,前向传播计算图中的所有算子均为基础算子。在本公开 实施例中,可以根据相关技术中对基础算子进行前向自动微分的规则,对所述前向传播计算图中的基础算子进行前向自动微分,以得到所述前向微分计算图。同时,可以根据相关技术中对基础算子进行反向自动微分的规则,对所述前向传播计算图中的基础算子进行反向自动微分,以得到所述反向微分计算图。
上述基于所述目标计算图生成所述目标模型可以是指,在所搭建的神经网络模型中各层对应添加所述目标计算图中的算子,并根据所述目标计算图中算子之间的连接关系,对应连接所述神经网络模型各层中的算子。
该实施方式中,通过对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图,由于所述目标计算图包括前向微分计算图和反向微分计算图中的至少一种,因此,本公开通过对基础算子进行微分变换,既可以实现模型生成中的前向过程,还可以实现模型生成中的反向过程,有利于简化模型生成的过程。
可选地,在进行前向微分模式自动微分时,对所述前向传播计算图中的基础算子进行线性化(linearize)规则微分变换,得到所述前向微分计算图。
具体地,所述基于linearize规则对基础算子进行微分变换处理可以是指:基于预先定义的linearize规则对所述基础算子进行线性化处理。其中,所述linearize规则可以是相关技术中常见的线性化方法。
请参见图2,在本公开一个实施例中,对所述前向传播计算图中的基础算子进行微分变换,得到所述前向微分计算图过程的流程示意图。其中,所述前向传播计算图可以是基于如下目标模型生成的计算图:
def f(x,y)
return exp(x)/sin(y)
其中,本公开实施例中可以基于过程式架构框架生成所述目标模型,所述过程式架构框架设计和实现上依赖很多状态,以多维数据张量(tensor)为中心。针对自动微分会存在原始tensor,前向微分tensor,反向微分tensor,一阶微分tensor,二阶微分tensor等这些状态。
图2中x和y为前向传播计算图中的输入,z为前向传播计算图中的输出。即所述x、y和z可以作为所述原始tensor。相应地,图2中的x'、y'和z'可以 作为所述前向微分tensor。在对所述前向传播计算图中的基础算子进行微分变换之前,可以在图中先确定所述x'、y'和z'。
请参见图2,在前向传播计算图中,先利用指数函数(exp)算子对x做exp处理得到中间结果t 0,同时,利用sin算子对y做sin处理,得到中间结果t1,然后,利用div算子以t 0为被除数、以t 1为除数进行除法处理,得到z。
其中,利用exp算子对x做exp处理得到中间结果t 0属于一个前向计算过程,其对应的前向微分过程为:将t 0与x'输入mul算子做乘法处理,得到t 4,其中,t 4为t 0对应的前向微分tensor。
同理,利用sin算子对y做sin处理,得到中间结果t 1属于一个前向计算过程,其对应的前向微分过程为:先将y输入cos算子做cos得到中间结果t 2,再将t 2和y'输入mul算子做乘法处理,得到t 5,其中,t 5为t 1对应的前向微分tensor。
相应地,利用div算子以t 0为被除数、以t 1为除数进行除法处理,得到z属于一个前向计算过程,其对应的前向微分过程为:将所述t 4和t 1输入div算子做除法运算得到t 6,将所述t 5输入neg算子做取反(或取负)运算得到t 7,将t 0和t 7输入mul算子做乘法运算得到t 8。同时,将t 1输入pow-2算子,做取指数幂-2运算,得到t 3,然后,将t 3和t 8输入mul算子做乘法运算得到t 9,最后,将t 6和t 9输入add算子做加法运算,得到z',从而完成所述前向微分计算图的生成过程。
该实施方式中,可以通过使用linearize规则对前向传播计算图中的基础算子进行线性化linearize规则微分变换,即可得到前向微分计算图,从而有利于进一步简化模型生成的过程。
可选地,所述对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图,包括:
在进行反向微分模式自动微分时,对所述前向传播计算图中的基础算子依次按照线性化linearize规则和转置(transpose)规则进行微分变换,得到所述反向微分计算图。
具体地,可以先按照线性化linearize规则对前向传播计算图中的基础算子进行微分变换得到前向微分计算图,然后,按照transpose规则对前向微分 计算图中的基础算子进行微分变化,以得到所述反向微分计算图。
所述基于transpose规则对算子进行微分变换处理可以是指:基于预先定义的transpose规则对算子进行转置处理。其中,所述transpose规则可以是相关技术中常见的转置方法。
请参见图3,为本公开实施例中,在前向微分计算图的基础上,按照所述transpose规则对所述前向微分计算图中的线性化算子进行微分变换,得到所述反向微分计算图的过程的流程示意图。
请参见图3,在反向微分计算图生成的过程中,可以将所述x'、y'和z'可以作为所述原始tensor,然后,先在图中确定对应的反向微分tensor:x_bar、y_bar和z_bar。再基于所述transpose规则对所述前向微分计算图中的每个线性化算子进行处理,即可得到图3所示的反向微分计算图,其具体生成过程如图3所示。其中,z_bar为反向微分的输入,x_bar和y_bar为反向微分的输出。
该实施方式中,通过按照所述transpose规则对所述前向微分计算图中的线性化算子进行微分变换,即可得到所述反向微分计算图,这样,可以基于所述反向微分计算图生成所述目标模型的反向传播部分。
在一些复杂的深度学习任务中,有时会使用到高阶导数。在科学计算领域的深度学习任务中,由于引入偏微分方程组,往往需要使用到高阶导数。
特别地,在输入数量大于输出数量时,反向微分更加高效;在输入数量小于输出数量时,前向微分更加高效。在高阶微分计算中,随着阶数的升高,输出数量会越来越多,前向微分重要性也会越来越高。
相关技术中,当需要基于前向传播计算图生成高阶微分计算图时,在过程式框架中存在如下探索:
A、注册高阶导数算子,以支持高阶自动微分。然而,使用该方案支持高阶自动微分,依赖在框架中添加高阶导数算子,考虑到框架支持的算子数量,是一个较大的工作量。另一方面,随着阶数的增大,高阶导数算子复杂度和开发难度也会急速增加。使用该方案,扩展性差,且无法做到无限阶数的高阶自动微分
B、改变过程式架构框架自动微分机制,使用前向算子组合反向过程。然 而,改变过程式架构框架自动微分机制,使用前向算子组合反向过程。使用该方案支持高阶自动微分,受限于算子体系设计,无法在所有算子上实现高阶微分功能。
C、通过使用二元数支持前向微分。然而,使用该方案支持前向微分,需要为每个算子写虚部的计算逻辑,有较大的开发量,另外受限于算子体系设计,无法在所有算子上实现高阶微分功能。
D、通过调用两次反向微分组合出一次前向微分。然而,使用该方案支持前向微分功能,实际上产生了很多冗余运算,性能较差。
当需要基于前向传播计算图生成高阶微分计算图时,在函数式架构框架中存在如下探索:
通过定义符号化的算子集合和其上的微分规则,配合程序变换实现可组合的前反向高阶自动微分。其中,JAX中使用该方案可以较为完善地支持前反向高阶自动微分功能,具有良好的扩展性。但是符号化的算子集合不能直接执行,需要绑定编译器XLA才能运行。另外函数式接口对一般用户来讲有较大的学习成本,且由于函数式接口要求没有副作用,在神经网络的搭建过程中,网络参数等信息也需要显示地暴漏出来。
为克服上述缺陷,本公开实施例中,还进一步做了如下改进:
可选地,所述对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图,包括:
在所述目标计算图为I阶微分计算图的情况下,对所述前向传播计算图中的基础算子执行I次目标微分变换,其中,所述I次目标微分变换中的第k次微分变换包括:
对第一目标计算图中的基础算子按照线性化linearize规则进行微分变换,得到第k阶前向微分计算图;或者,
对第一目标计算图中的基础算子按照线性化linearize规则进行微分变换,得到第k阶前向微分计算图;对所述第k阶前向微分计算图中的基础算子按照转置transpose规则进行微分变换,得到第k阶反向微分计算图;
其中,所述k为大于0的整数,且所述k不大于所述I,所述I为不小于2的整数;在所述k等于1的情况下,所述第一目标计算图为所述前向传播 计算图,在所述k不等于1的情况下,所述第一目标计算图为:第k-1阶前向微分计算图,或者,第k-1阶反向微分计算图。
可以理解的是,在任意阶微分计算图中均可以包括对应阶的前向微分计算图和对应阶的反向微分计算图。即在所述第k-1阶前向微分计算图进行前向微分可以得到第k阶前向微分计算图,在所述第k-1阶前向微分计算图进行反向微分可以得到第k阶反向微分计算图。或者,也可以在所述第k-1阶反向微分计算图进行前向微分可以得到第k阶前向微分计算图,在所述第k-1阶反向微分计算图进行反向微分可以得到第k阶反向微分计算图。
该实施方式中,通过交替使用linearize规则和transpose规则对算子进行处理,如此,可以实现任意阶数自动化微分,即可以通过交替使用linearize规则和transpose规则对前向传播计算图中的基础算子进行处理,即可完成前向过程或反向过程中高阶微分部分的搭建,从而有利于进一步简化模型的生成过程。
可选地,所述获取所述目标模型的前向传播计算图,包括:
获取所述目标模型的初始计算图,所述初始计算图包括原生算子;
将所述初始计算图中的原生算子转换为所述基础算子,得到所述前向传播计算图;
其中,所述原生算子为具有至少一个基础算子功能的可执行算子。
上述原生算子可以是由多种不同类型的基础算子共同形成的算子,即所述原生算子中可以包括两种或两种以上的数学运算,例如,当原生算子表示为exp(x)/sin(y)时,所述原生算子包括如下三个基础算子exp算子、div算子和sin算子。
具体地,用户在构建所述目标模型之前,需要先搭建目标模型对应的公式模型,例如,所述目标模型的公式模型可以表示为:
def f(x,y)
return exp(x)/sin(y)
然后,可以基于所述公式模型绘制对应的计算流程图,即绘制所述初始计算图。由于用户在构建公式模型时,公式中的算子通常为原生算子,而原生算子通常包括一种以上的基础算子,且用户可以自定义原生算子的组成部 分,因此,相关技术中通常无法实现对原生算子的自动微分。基于此,本公开实施例中,通过将所述初始计算图中的原生算子转换为基础算子,从而得到由基础算子构成的前向传播计算图,以便于后续基于前向传播计算图中的基础算子进行自动化微分。
其中,基础算子和原生算子可以共用一套标准化中间表示,但是与原生算子体系中的算子不同,这些基础算子不包含核心(kernel)实现,基础算子可以用于表达语义,以及用于与原生算子体系之间相互转化,同时,还可以用于进行自动微分变化,由于所述基础算子不包含kernel实现所述基础算子不能直接被执行。相应地,所述原生算子包括所述kernel实现,因此,所述原生算子为可执行算子。
在本公开一个实施例中,可以预先定义由原生算子转为基础算子的规则orig2prim,然后,利用所述orig2prim规则将所述初始计算图中的原生算子转换为基础算子,以得到所述前向传播计算图。例如,以深度学习框架Paddle中的原生算子(elementwise_add)为例,该算子有两个输入一个输出,并且包含scale_x,scale_y,scale_out这些属性。拆分为基础算子可能包括broadcast_p,fill_constant_p,mul_p,add_p,其中根据两个输入的具体形状决定是否需要broadcast_p,如果scale_x,scale_y,scale_out这些属性不是1.0,需要通过fill_constant_p,mul_p实现对应的scale逻辑。
此外,在本公开另一个实施例中,也可以基于所述原生算子中所包含的基础算子的数量直接对原生算子进行拆分。即将所述原生算子中的基础算子直接从原生算子中分离,并通过连接线建立不同基础算子之间的关联关系。
该实施方式中,通过将所述初始计算图中的原生算子转换为基础算子,从而得到由基础算子构成的前向传播计算图,以便于后续基于前向传播计算图中的基础算子进行自动化微分。
可选地,所述对所述前向传播计算图中的基础算子进行微分变换,得到目标计算图之后,所述方法还包括:
将所述目标计算图中的基础算子转换为原生算子,得到所述目标微分计算图;
所述基于所述目标计算图生成所述目标模型,包括:
基于所述目标微分计算图生成所述目标模型。
其中,由于所述目标计算图中的算子为基础算子,而基础算子无法直接被执行。因此,本公开实施例中,在完成微分变换之后,可以将所述目标计算图中的基础算子转换为原生算子,以便于所生成的目标模型能够执行所述目标微分计算图中的算子。
具体地,可以预先定义由基础算子转为原生算子的规则prim2orig,然后,利用所述prim2orig规则将所述目标计算图中的基础算子转换为原生算子,从而得到所述目标微分计算图。例如,以基础算子add_p为例,该算子有两个输入一个输出,没有属性。转化为原生算子(elementwise_add),其中原生算子的三个属性scale_x,scale_y,scale_out均为1.0。
可以理解的是,还可以采用相关技术中的其他方法实现所述原生算子与基础算子之间的相关转换,对此不作限制。
该实施方式中,通过在完成微分变换之后,将所述目标计算图中的基础算子转换为原生算子,以便于所生成的目标模型能够执行所述目标微分计算图中的算子。
可选地,所述计算图包括前向微分计算图和反向微分计算图,所述目标模型包括前向网络和反向网络,所述基于所述目标计算图生成所述目标模型,包括:
基于所述前向传播计算图和第I阶目标计算图生成所述前向网络,以及,基于第I+1阶目标计算图生成所述反向网络,其中,所述第I阶目标计算图为:第I阶前向微分计算图或者第I阶反向微分计算图;所述第I+1阶目标计算图为:第I+1阶前向微分计算图或者第I+1阶反向微分计算图。
可以理解的是,所述目标计算图包括所述第I阶目标计算图和所述第I+1阶目标计算图。
上述第I+1阶前向微分计算图可以是:对第I阶目标计算图中的基础算子进行前向或反向微分得到的计算图。相应地,所述第I+1阶反向微分计算图可以是:对第I阶目标计算图中的基础算子进行前向或反向微分得到的计算图。
该实施方式中,通过基于前向传播计算图和第I阶目标计算图生成所述 前向网络,以及,基于所述第I+1阶目标计算图生成所述反向网络,从而完成所述目标模型的生成过程。
请参见图4,本公开实施例提供了一种过程式架构框架,基于所述过程式架构框架可以实现上述模型生成方法中的微分变换过程。其中,请参见图4,在所述过程式架构框架中预先定义了一套标准化中间表示,原生算子与基础算子共用所述标准化中间表示。基于过程式架构框架对初始计算图进行前向自动化微分和反向自动化微分的过程如下:可以先利用orig2prim规则将初始计算图中的原生算子转换为基础算子,然后,再利用linearize规则和transpose规则对基础算子进行微分变换,还过程中可以实现前向自动微分、反向自动微分以及高阶导数自动微分,从而得到目标计算图,然后,利用所述prim2orig规则将目标计算图中的基础算子转换为原生算子,从而得到目标微分计算图。
本公开实施例为实现上述自动微分过程,还为所述过程式架构框架设计了如下接口:
gradients(xs,ys,ys_bar)->xs_bar反向自动微分接口;
forward_gradients(xs,ys,xs_dot)->ys_dot前向自动微分接口;
enable_prim()开启基于基础算子的自动微分机制;
disable_prim()关闭基于基础算子的自动微分机制;
prim_enabled()返回值表明是否开启了基于基础算子的自动微分机制;
orig2prim()原生算子体系转化为基础算子体系;
prim2orig()基础算子体系转化为原生算子体系。
本公开实施例提供了一套完整的自动微分方案,该方案支持前向、反向两种模式,支持高阶微分功能,并且是基于过程式架构的,比基于函数架构有更好的应用性。
请参见图5,为本公开实施例提供的一种模型生成装置500的结构示意图,所述模型生成装置500,包括:
获取模块501,用于获取目标模型的前向传播计算图,所述前向传播计算图包括基础算子,所述基础算子为具有一次数学运算的算子;
微分变换模块502,用于对所述前向传播计算图中的基础算子进行微分 变换,得到目标计算图,所述目标计算图包括前向微分计算图和反向微分计算图中的至少一种;
生成模块503,用于基于所述目标计算图生成所述目标模型。
可选地,所述微分变换模块502,具体用于在进行前向微分模式自动微分时,对所述前向传播计算图中的基础算子进行线性化linearize规则微分变换,得到所述前向微分计算图。
可选地,所述微分变换模块502,具体用于在进行反向微分模式自动微分时,对所述前向传播计算图中的基础算子依次按照线性化linearize规则和转置transpose规则进行微分变换,得到所述反向微分计算图。
可选地,所述微分变换模块502,具体用于在所述目标计算图为I阶微分计算图的情况下,对所述前向传播计算图中的基础算子执行I次目标微分变换,其中,所述I次目标微分变换中的第k次微分变换包括:
对第一目标计算图中的基础算子按照线性化linearize规则进行微分变换,得到第k阶前向微分计算图;或者,
对第一目标计算图中的基础算子按照线性化linearize规则进行微分变换,得到第k阶前向微分计算图;对所述第k阶前向微分计算图中的基础算子按照转置transpose规则进行微分变换,得到第k阶反向微分计算图;
其中,所述k为大于0的整数,且所述k不大于所述I,所述I为不小于2的整数;在所述k等于1的情况下,所述第一目标计算图为所述前向传播计算图,在所述k不等于1的情况下,所述第一目标计算图为:第k-1阶前向微分计算图,或者,第k-1阶反向微分计算图。
可选地,请参见图6,所述获取模块501,包括:
获取子模块5011,用于获取所述目标模型的初始计算图,所述初始计算图包括原生算子;
转换子模块5012,用于将所述初始计算图中的原生算子转换为所述基础算子,得到所述前向传播计算图;
其中,所述原生算子为具有至少一个基础算子功能的可执行算子。
可选地,请参见图7,所述装置还包括:
转换模块504,用于将所述目标计算图中的基础算子转换为原生算子, 得到所述目标微分计算图;
所述生成模块503,具体用于基于所述目标微分计算图生成所述目标模型。
可选地,所述计算图包括前向微分计算图和反向微分计算图,所述目标模型包括前向网络和反向网络,所述生成模块503,具体用于基于所述前向传播计算图和第I阶目标计算图生成所述前向网络,以及,基于第I+1阶目标计算图生成所述反向网络,其中,所述第I阶目标计算图为:第I阶前向微分计算图或者第I阶反向微分计算图;所述第I+1阶目标计算图为:第I+1阶前向微分计算图或者第I+1阶反向微分计算图。
需要说明地,本实施例提供的模型生成装置500能够实现上述模型生成方法实施例的全部技术方案,因此至少能够实现上述全部技术效果,此处不再赘述。
本公开的技术方案中,所涉及的用户个人信息的获取,存储和应用等,均符合相关法律法规的规定,且不违背公序良俗。
根据本公开的实施例,本公开还提供了一种电子设备、一种可读存储介质和一种计算机程序产品。
图8示出了可以用来实施本公开的实施例的示例电子设备800的示意性框图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。
如图8所示,电子设备800包括计算单元801,其可以根据存储在只读存储器(Read-Only Memory,ROM)802中的计算机程序或者从存储单元808加载到随机访问存储器(Random Access Memory,RAM)803中的计算机程序,来执行各种适当的动作和处理。在RAM803中,还可存储设备800操作所需的各种程序和数据。计算单元801、ROM802以及RAM803通过总线804彼此相连。输入/输出(Input/Output,I/O)接口805也连接至总线804。
电子设备800中的多个部件连接至I/O接口805,包括:输入单元806,例如键盘、鼠标等;输出单元807,例如各种类型的显示器、扬声器等;存储单元808,例如磁盘、光盘等;以及通信单元809,例如网卡、调制解调器、无线通信收发机等。通信单元809允许设备800通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。
计算单元801可以是各种具有处理和计算能力的通用和/或专用处理组件。计算单元801的一些示例包括但不限于中央处理单元(Central Processing Unit,CPU)、图形处理单元(Graphics Processing Unit,GPU)、各种专用的人工智能(Artificial Intelligence,AI)计算芯片、各种运行机器学习模型算法的计算单元、数字信号处理器(Digital Signal Processor,DSP)、以及任何适当的处理器、控制器、微控制器等。计算单元801执行上文所描述的各个方法和处理,例如模型生成方法。例如,在一些实施例中,模型生成方法可被实现为计算机软件程序,其被有形地包含于机器可读介质,例如存储单元808。在一些实施例中,计算机程序的部分或者全部可以经由ROM802和/或通信单元809而被载入和/或安装到设备800上。当计算机程序加载到RAM803并由计算单元801执行时,执行上文描述的模型生成方法的一个或多个步骤。备选地,在其他实施例中,计算单元801可以通过其他任何适当的方式(例如,借助于固件)而被配置为执行模型生成方法。
Various implementations of the systems and techniques described herein above can be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGA), application specific integrated circuits (ASIC), application specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: being implemented in one or more computer programs, where the one or more computer programs can be executed and/or interpreted on a programmable system including at least one programmable processor; the programmable processor may be a special-purpose or general-purpose programmable processor, and can receive data and instructions from a storage system, at least one input device and at least one output device, and transmit data and instructions to the storage system, the at least one input device and the at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus, so that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared or semiconductor systems, apparatuses or devices, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (for example, a cathode ray tube (CRT) or liquid crystal display (LCD) monitor); and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback or tactile feedback); and input from the user may be received in any form (including acoustic input, voice input or tactile input).
The systems and techniques described herein can be implemented in a computing system that includes a back-end component (for example, as a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer having a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware or front-end components. The components of the system can be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include: a local area network (LAN), a wide area network (WAN) and the Internet.
A computer system may include a client and a server. The client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that steps may be reordered, added or deleted using the various forms of flows shown above. For example, the steps described in the present disclosure may be performed in parallel, sequentially or in a different order, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved; no limitation is imposed herein.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program controlling relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
It can be understood that the embodiments described in some embodiments of the present disclosure can be implemented in hardware, software, firmware, middleware, microcode or a combination thereof. For a hardware implementation, the modules, units, sub-modules, sub-units and the like can be implemented in one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or combinations thereof.
For a software implementation, the techniques described in some embodiments of the present disclosure can be implemented by modules (for example, procedures, functions, and so on) that perform the functions described in some embodiments of the present disclosure. Software code can be stored in a memory and executed by a processor. The memory can be implemented within the processor or external to the processor.
The above specific implementations do not constitute a limitation on the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations and substitutions can be made according to design requirements and other factors. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.

Claims (17)

  1. A model generation method, including:
    acquiring a forward propagation calculation graph of a target model, wherein the forward propagation calculation graph includes a basic operator, and the basic operator is an operator with one mathematical operation;
    performing differential transformation on the basic operator in the forward propagation calculation graph to obtain a target calculation graph, wherein the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph; and
    generating the target model based on the target calculation graph.
  2. The method according to claim 1, wherein the performing differential transformation on the basic operator in the forward propagation calculation graph to obtain a target calculation graph includes:
    when performing forward-mode automatic differentiation, performing differential transformation on the basic operator in the forward propagation calculation graph according to a linearization rule to obtain the forward differential calculation graph.
  3. The method according to claim 1, wherein the performing differential transformation on the basic operator in the forward propagation calculation graph to obtain a target calculation graph includes:
    when performing reverse-mode automatic differentiation, performing differential transformation on the basic operator in the forward propagation calculation graph according to a linearization rule and a transpose rule in sequence to obtain the reverse differential calculation graph.
  4. The method according to claim 1, wherein the performing differential transformation on the basic operator in the forward propagation calculation graph to obtain a target calculation graph includes:
    in a case where the target calculation graph is an I-th order differential calculation graph, performing I target differential transformations on the basic operator in the forward propagation calculation graph, wherein the k-th differential transformation of the I target differential transformations includes:
    performing differential transformation on a basic operator in a first target calculation graph according to the linearization rule to obtain a k-th order forward differential calculation graph; or,
    performing differential transformation on the basic operator in the first target calculation graph according to the linearization rule to obtain the k-th order forward differential calculation graph, and performing differential transformation on a basic operator in the k-th order forward differential calculation graph according to the transpose rule to obtain a k-th order reverse differential calculation graph;
    wherein k is an integer greater than 0 and not greater than I, and I is an integer not less than 2; in a case where k is equal to 1, the first target calculation graph is the forward propagation calculation graph, and in a case where k is not equal to 1, the first target calculation graph is the (k-1)-th order forward differential calculation graph or the (k-1)-th order reverse differential calculation graph.
  5. The method according to any one of claims 1 to 4, wherein the acquiring a forward propagation calculation graph of the target model includes:
    acquiring an initial calculation graph of the target model, wherein the initial calculation graph includes a native operator;
    converting the native operator in the initial calculation graph into the basic operator to obtain the forward propagation calculation graph;
    wherein the native operator is an executable operator having the function of at least one basic operator.
  6. The method according to claim 5, wherein, after the performing differential transformation on the basic operator in the forward propagation calculation graph to obtain a target calculation graph, the method further includes:
    converting the basic operator in the target calculation graph into a native operator to obtain a target differential calculation graph;
    wherein the generating the target model based on the target calculation graph includes:
    generating the target model based on the target differential calculation graph.
  7. The method according to claim 4, wherein the calculation graph includes a forward differential calculation graph and a reverse differential calculation graph, the target model includes a forward network and a reverse network, and the generating the target model based on the target calculation graph includes:
    generating the forward network based on the forward propagation calculation graph and an I-th order target calculation graph, and generating the reverse network based on an (I+1)-th order target calculation graph, wherein the I-th order target calculation graph is an I-th order forward differential calculation graph or an I-th order reverse differential calculation graph, and the (I+1)-th order target calculation graph is an (I+1)-th order forward differential calculation graph or an (I+1)-th order reverse differential calculation graph.
  8. A model generation device, including:
    an acquisition module, configured to acquire a forward propagation calculation graph of a target model, wherein the forward propagation calculation graph includes a basic operator, and the basic operator is an operator with one mathematical operation;
    a differential transformation module, configured to perform differential transformation on the basic operator in the forward propagation calculation graph to obtain a target calculation graph, wherein the target calculation graph includes at least one of a forward differential calculation graph and a reverse differential calculation graph; and
    a generation module, configured to generate the target model based on the target calculation graph.
  9. The device according to claim 8, wherein the differential transformation module is specifically configured to, when performing forward-mode automatic differentiation, perform differential transformation on the basic operator in the forward propagation calculation graph according to a linearization rule to obtain the forward differential calculation graph.
  10. The device according to claim 8, wherein the differential transformation module is specifically configured to, when performing reverse-mode automatic differentiation, perform differential transformation on the basic operator in the forward propagation calculation graph according to a linearization rule and a transpose rule in sequence to obtain the reverse differential calculation graph.
  11. The device according to claim 8, wherein the differential transformation module is specifically configured to, in a case where the target calculation graph is an I-th order differential calculation graph, perform I target differential transformations on the basic operator in the forward propagation calculation graph, wherein the k-th differential transformation of the I target differential transformations includes:
    performing differential transformation on a basic operator in a first target calculation graph according to the linearization rule to obtain a k-th order forward differential calculation graph; or,
    performing differential transformation on the basic operator in the first target calculation graph according to the linearization rule to obtain the k-th order forward differential calculation graph, and performing differential transformation on a basic operator in the k-th order forward differential calculation graph according to the transpose rule to obtain a k-th order reverse differential calculation graph;
    wherein k is an integer greater than 0 and not greater than I, and I is an integer not less than 2; in a case where k is equal to 1, the first target calculation graph is the forward propagation calculation graph, and in a case where k is not equal to 1, the first target calculation graph is the (k-1)-th order forward differential calculation graph or the (k-1)-th order reverse differential calculation graph.
  12. The device according to any one of claims 8 to 11, wherein the acquisition module includes:
    an acquisition sub-module, configured to acquire an initial calculation graph of the target model, wherein the initial calculation graph includes a native operator;
    a conversion sub-module, configured to convert the native operator in the initial calculation graph into the basic operator to obtain the forward propagation calculation graph;
    wherein the native operator is an executable operator having the function of at least one basic operator.
  13. The device according to claim 12, wherein the device further includes:
    a conversion module, configured to convert the basic operator in the target calculation graph into a native operator to obtain a target differential calculation graph;
    wherein the generation module is specifically configured to generate the target model based on the target differential calculation graph.
  14. The device according to claim 11, wherein the calculation graph includes a forward differential calculation graph and a reverse differential calculation graph, the target model includes a forward network and a reverse network, and the generation module is specifically configured to generate the forward network based on the forward propagation calculation graph and an I-th order target calculation graph, and to generate the reverse network based on an (I+1)-th order target calculation graph, wherein the I-th order target calculation graph is an I-th order forward differential calculation graph or an I-th order reverse differential calculation graph, and the (I+1)-th order target calculation graph is an (I+1)-th order forward differential calculation graph or an (I+1)-th order reverse differential calculation graph.
  15. An electronic device, including:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein,
    the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the model generation method according to any one of claims 1 to 7.
  16. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the model generation method according to any one of claims 1 to 7.
  17. A computer program product, including a computer program, wherein the computer program, when executed by a processor, implements the model generation method according to any one of claims 1 to 7.
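
To make claims 2 and 3 easier to follow, the short Python sketch below shows, for a single basic multiply operator, what the linearization rule (forward mode, a Jacobian-vector product) and the additional transpose rule (reverse mode, a vector-Jacobian product) produce. The function names linearize_mul and transpose_mul are assumptions made for this sketch, not the patented implementation.

# y = a * b serves as a basic operator with one mathematical operation.
def linearize_mul(a, b):
    # Linearization rule (forward mode): the Jacobian-vector product (JVP)
    # of y = a * b is dy = da * b + a * db.
    def jvp(da, db):
        return da * b + a * db
    return jvp

def transpose_mul(a, b):
    # Transpose rule (reverse mode): transposing the JVP gives the
    # vector-Jacobian product (VJP): (dL/da, dL/db) = (dL/dy * b, dL/dy * a).
    def vjp(dy):
        return dy * b, dy * a
    return vjp

a, b = 3.0, 4.0
print(linearize_mul(a, b)(1.0, 0.0))   # forward mode: dy/da = 4.0
print(transpose_mul(a, b)(1.0))        # reverse mode: (dy/da, dy/db) = (4.0, 3.0)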
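
Claim 4 iterates the transformation to obtain an I-th order differential calculation graph, with the k-th pass consuming the (k-1)-th result. The following minimal sketch mirrors that iteration on a toy symbolic expression rather than on a full calculation graph; the tuple-based expression format and the linearize/nth_order helpers are illustrative assumptions only.

# Expressions are nested tuples ("op", left, right); variables are strings.
def linearize(expr, var):
    # Linearization rule: symbolic derivative of `expr` with respect to `var`.
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op, a, b = expr
    if op == "+":
        return ("+", linearize(a, var), linearize(b, var))
    if op == "*":  # product rule
        return ("+", ("*", linearize(a, var), b), ("*", a, linearize(b, var)))
    raise ValueError(f"unsupported operator: {op}")

def nth_order(expr, var, order):
    # The k-th pass takes the (k-1)-th result as its input, mirroring the
    # "first target calculation graph" of the k-th transformation in claim 4.
    for _ in range(order):
        expr = linearize(expr, var)
    return expr

# Second derivative of x * x with respect to x; the printed tree simplifies to 2.
print(nth_order(("*", "x", "x"), "x", 2))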
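
Claims 5 and 6 convert executable native operators into basic operators before differentiation and convert the differentiated graph back to native operators afterwards. A minimal sketch of such a rewrite pass follows; the softplus decomposition, the DECOMPOSE table and the operator names are assumptions made for illustration, not part of the claimed method.

from dataclasses import dataclass
from typing import List

@dataclass
class Op:
    kind: str
    inputs: List[str]
    output: str

# Hypothetical decomposition table: one native operator expands into a short
# sequence of basic operators, each with a single mathematical operation.
DECOMPOSE = {
    # softplus(x) = log(1 + exp(x))
    "softplus": lambda x, out: [
        Op("exp", [x], out + "_e"),
        Op("add_one", [out + "_e"], out + "_p"),
        Op("log", [out + "_p"], out),
    ],
}

def to_basic(ops: List[Op]) -> List[Op]:
    # Claim 5 direction: native operators -> basic operators before differentiation.
    basic: List[Op] = []
    for op in ops:
        rule = DECOMPOSE.get(op.kind)
        basic.extend(rule(op.inputs[0], op.output) if rule else [op])
    return basic

def to_native(ops: List[Op]) -> List[Op]:
    # Claim 6 direction: the differentiated graph would be pattern-matched back
    # into native operators; left as the identity here to keep the sketch short.
    return ops

native_graph = [Op("softplus", ["x"], "y")]
print(to_basic(native_graph))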
PCT/CN2022/128534 2022-05-18 2022-10-31 模型生成方法、装置和电子设备 WO2023221407A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210551044.8 2022-05-18
CN202210551044.8A CN114897146B (zh) 2022-05-18 2022-05-18 模型生成方法、装置和电子设备

Publications (1)

Publication Number Publication Date
WO2023221407A1 true WO2023221407A1 (zh) 2023-11-23

Family

ID=82723188

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/128534 WO2023221407A1 (zh) 2022-05-18 2022-10-31 模型生成方法、装置和电子设备

Country Status (2)

Country Link
CN (1) CN114897146B (zh)
WO (1) WO2023221407A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114897146B (zh) * 2022-05-18 2023-11-03 北京百度网讯科技有限公司 模型生成方法、装置和电子设备

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107607105B (zh) * 2017-10-30 2019-09-13 长江师范学院 基于分数阶微分的光纤陀螺温度非线性误差补偿方法
CN109615209B (zh) * 2018-12-05 2021-08-03 山东大学 一种时滞电力系统特征值计算方法及系统
CN110990500B (zh) * 2019-03-29 2023-12-15 天维讯达(湖南)科技有限公司 传播路径模型地图建立方法及路径损耗确定方法
CN112529206B (zh) * 2019-09-18 2024-05-17 华为技术有限公司 一种模型运行方法和系统
CN110907926A (zh) * 2019-10-29 2020-03-24 长江大学 基于传播算子的双基地emvs-mimo雷达快速目标定位算法及装置
CN111860278B (zh) * 2020-07-14 2024-05-14 陕西理工大学 一种基于深度学习的人体行为识别算法
CN112947933A (zh) * 2021-02-24 2021-06-11 上海商汤智能科技有限公司 一种算子的执行方法、装置、计算机设备及存储介质
CN114282664A (zh) * 2021-04-26 2022-04-05 阿波罗智联(北京)科技有限公司 自反馈模型训练方法、装置、路侧设备及云控平台

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210357760A1 (en) * 2018-11-09 2021-11-18 Nippon Telegraph And Telephone Corporation Distributed Deep Learning System and Data Transfer Method
CN113449842A (zh) * 2020-03-27 2021-09-28 华为技术有限公司 一种分布式自动微分方法及相关装置
CN113537486A (zh) * 2020-04-16 2021-10-22 罗伯特·博世有限公司 单调算子神经网络的系统和方法
CN114139104A (zh) * 2021-12-10 2022-03-04 北京百度网讯科技有限公司 基于偏微分方程处理流场数据的方法、装置及电子设备
CN114897146A (zh) * 2022-05-18 2022-08-12 北京百度网讯科技有限公司 模型生成方法、装置和电子设备

Also Published As

Publication number Publication date
CN114897146A (zh) 2022-08-12
CN114897146B (zh) 2023-11-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22942431

Country of ref document: EP

Kind code of ref document: A1