CN116578525A - Data processing method and related equipment - Google Patents

Data processing method and related equipment

Info

Publication number
CN116578525A
Authority
CN
China
Prior art keywords
neural network
data
information
characteristic information
performance evaluation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210114746.XA
Other languages
Chinese (zh)
Inventor
陈志堂
吕俊龙
冯昌
耿彦辉
徐志节
陈永伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202210114746.XA priority Critical patent/CN116578525A/en
Priority to PCT/CN2023/072095 priority patent/WO2023143128A1/en
Publication of CN116578525A publication Critical patent/CN116578525A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7839Architectures of general purpose stored program computers comprising a single central processing unit with memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of this application disclose a data processing method and related equipment. The method can be used to solve optimization problems in the field of artificial intelligence and comprises the following steps: performing feature extraction on first data and second data through a first neural network to obtain first feature information; inputting the first feature information into a second neural network for optimization solving, where the performance evaluation information output by the second neural network indicates the performance of a device in its operating environment; and acquiring the updated first feature information corresponding to target performance evaluation information and inverse-transforming that information to obtain the configuration parameters for the device. Because the first data need not be adjusted during the optimization, interference with the output performance evaluation information caused by large fluctuations of the first data is avoided; and because inverse-transforming the updated first feature information directly yields the device's configuration parameters, more suitable configuration parameters can be obtained.

Description

Data processing method and related equipment
Technical Field
This application relates to the field of artificial intelligence, and in particular to a data processing method and related equipment.
Background
Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making. Solving optimization problems with artificial intelligence is a common application of artificial intelligence.
Optimization problems of the following form arise in fields such as wireless networks, electric power systems and power management: the input of a neural network includes a set of parameters obtained from the objective operating environment and a set of configuration parameters for a device; the output of the neural network is performance evaluation information for the device in that operating environment under the input configuration parameters; and the optimization objective is to find a set of configuration parameters that optimizes the performance of the device when it operates in the target environment.
However, it has been found that, because the parameters obtained from the operating environment fluctuate strongly during the optimization, the performance index output by the neural network changes little when different configuration parameters are input; that is, the neural network is insensitive to the configuration parameters, which makes the optimization objective difficult to achieve.
Disclosure of Invention
The embodiments of this application provide a data processing method and related equipment. A mapping relationship between first feature information and performance evaluation information is established in a second neural network, so the first data need not be adjusted during the optimization, avoiding interference with the output performance evaluation information caused by large fluctuations of the first data; and inverse-transforming the obtained second feature information directly yields the device's configuration parameters, so that more suitable configuration parameters can be obtained.
To solve the above technical problem, the embodiments of this application provide the following technical solutions:
In a first aspect, an embodiment of this application provides a data processing method, which can be used to solve optimization problems in the field of artificial intelligence. The data processing method comprises the following steps. An execution device inputs first data and second data into a first neural network and performs feature extraction on them through the first neural network to obtain first feature information. The first data are parameters obtained from the objective environment, i.e. parameters that cannot be adjusted manually in real time, and may include parameters reflecting the operating environment of a target device; the second data include configuration parameters corresponding to the target device operating in that environment. The execution device inputs the first feature information into a second neural network and performs an optimization solution with the second neural network until the optimization objective is met, where the performance evaluation information output by the second neural network indicates the performance of the target device in the operating environment. The optimization process comprises inputting a plurality of pieces of updated first feature information into the second neural network so that the performance evaluation information output by the second neural network improves, where each piece of updated first feature information is obtained based on performance evaluation information previously output by the second neural network.
The execution device acquires second feature information, where the second feature information is the updated first feature information corresponding to target performance evaluation information, and the target performance evaluation information is one piece of performance evaluation information output by the second neural network; that is, the second feature information is the input to the second neural network when it outputs the target performance evaluation information. The execution device inverse-transforms the second feature information to obtain target data corresponding to the second data, where the target data include the configuration parameters for the device; further, the target data and the second data are of the same data type and contain the same number of values.
In this implementation, the first feature information is updated during the optimization, and the updated first feature information is input into the second neural network until the target performance evaluation information output by the second neural network meets the optimization objective. In other words, a mapping relationship between the first feature information and the performance evaluation information is established in the second neural network, so the first data need not be adjusted during the optimization and interference with the output performance evaluation information caused by large fluctuations of the first data is avoided; and inverse-transforming the obtained second feature information directly yields the device's configuration parameters, so more suitable configuration parameters can be obtained.
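The first-aspect flow can be sketched in a few lines. Everything below — the toy feature extractor, the quadratic stand-in for the second neural network, the learning rate and step count — is an illustrative assumption, not the patent's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

env = rng.normal(size=4)      # "first data": environment parameters (fixed)
config = np.zeros(3)          # "second data": initial device configuration
W = rng.normal(size=(3, 3))   # invertible branch acting on the configuration

def extract(env, config):
    """Toy first neural network: couple env features with an invertible config branch."""
    return W @ config + np.tanh(env[:3])

def invert(z, env):
    """Inverse transform: recover the configuration from the latent feature."""
    return np.linalg.solve(W, z - np.tanh(env[:3]))

def performance(z):
    """Toy second neural network: higher is better, with its optimum at z = 1."""
    return -np.sum((z - 1.0) ** 2)

z = extract(env, config)      # first feature information
for _ in range(200):          # optimization runs purely in latent space:
    grad = -2.0 * (z - 1.0)   # gradient of the toy performance predictor
    z = z + 0.05 * grad       # only the feature information is updated

best_config = invert(z, env)  # target data: configuration for the device
```

Note that `env` is never modified inside the loop, which is the point of the first aspect: fluctuations in the environment data cannot perturb the search, and the final configuration falls out of a single inverse transform.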
In one possible implementation of the first aspect, the second data can be recovered from the first feature information by inverse transformation, but the first data cannot. In this implementation, inverse-transforming the first feature information yields only the configuration parameters of the device, not the parameters of the operating environment; that is, a non-invertible neural network layer is used when processing the first data in the first neural network, which reduces the complexity of the first neural network and therefore its size and the computing resources it occupies at run time.
In one possible implementation of the first aspect, the first neural network comprises a first neural network module, a second neural network module and a third neural network module, each of which comprises at least one neural network layer. The execution device performing feature extraction on the first data and the second data through the first neural network comprises: the execution device performs feature extraction on the first data through the first neural network module to obtain feature information of the first data, and performs feature extraction on the second data through the second neural network module to obtain feature information of the second data; the feature information of the first data and of the second data may be represented as vectors, matrices or data in other formats. The execution device then couples the feature information of the first data and the feature information of the second data through the third neural network module. The second and third neural network modules both use invertible ("reversible") neural network layers. The input and output of an invertible neural network layer are mapped bidirectionally, and the input can be recovered from the output by an explicit formula; all information in the input of an invertible layer is preserved in its output, which guarantees that the layer loses none of the input data.
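The invertible-layer property described above — a closed-form inverse that preserves all input information — is exactly what additive coupling layers provide. The sketch below is a generic RealNVP-style coupling step, not the patent's concrete architecture; the split point and the inner subnet are illustrative assumptions:

```python
import numpy as np

def coupling_forward(x, shift_net):
    """Transform one half of the input conditioned on the other half."""
    x1, x2 = x[:2], x[2:]
    y2 = x2 + shift_net(x1)
    return np.concatenate([x1, y2])

def coupling_inverse(y, shift_net):
    """Exact closed-form inverse: subtract the same conditional shift."""
    y1, y2 = y[:2], y[2:]
    x2 = y2 - shift_net(y1)
    return np.concatenate([y1, x2])

# The inner subnet need not itself be invertible for the layer to be invertible.
shift = lambda h: 0.5 * np.tanh(h)

x = np.array([0.3, -1.2, 0.7, 2.0])
y = coupling_forward(x, shift)
x_recovered = coupling_inverse(y, shift)   # recovers x exactly
```

The design choice worth noting is that invertibility comes from the additive structure of the layer, not from the subnet, so the subnet can be an arbitrary neural network.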
In this implementation, feature extraction is performed on the first data and the second data by the first and second neural network modules respectively, which facilitates decoupling the two modules. The goal of the scheme is to make the performance evaluation information output by the second neural network sensitive to the second data; whether it is sensitive to the first data need not be considered. Setting up the first and second neural network modules as two independent modules therefore lets the design effort focus on the module that processes the second data, reducing the difficulty of designing the first neural network.
In one possible implementation of the first aspect, the first neural network module uses a non-invertible neural network layer.
In a possible implementation of the first aspect, the execution device coupling the feature information of the first data and the feature information of the second data through the third neural network module comprises: the execution device couples the feature information of the first data and the feature information of the second data through the third neural network module to obtain target feature information. Further, in one implementation, the execution device may directly take the target feature information as the first feature information. In another implementation, the execution device may couple the target feature information with the feature information of the first data again through the third neural network module to obtain updated target feature information; the execution device may repeat this step several times through the third neural network module and take the final updated target feature information as the first feature information.
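The repeated coupling in the second implementation can be pictured as a small loop. The additive coupling rule, the step weight and the three rounds below are illustrative assumptions, not the patent's third module:

```python
# Feature information of the first data (fixed) and the evolving target feature.
env_feat = [0.5, -0.2]
target = [1.0, 1.0]      # initial target feature information

def couple(t, e):
    """Toy third-module coupling step; the additive form keeps it invertible in t."""
    return t + 0.1 * e

for _ in range(3):       # "repeat this step several times"
    target = [couple(t, e) for t, e in zip(target, env_feat)]

first_feature = target   # final updated target feature -> first feature information
```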
In one possible implementation of the first aspect, the device is any one of the following: a base station, a vehicle, a chip or a power supply. This implementation gives several concrete forms of the device, broadening the application scenarios of the scheme and improving the flexibility of its implementation.
In a second aspect, an embodiment of this application provides a neural network training method, which can be used to solve optimization problems in the field of artificial intelligence. The training method may comprise: a training device performs feature extraction on first training data and second training data through a first neural network to obtain first feature information, where the first training data include parameters reflecting the operating environment of the device, the second training data include configuration parameters corresponding to the device, and the second training data can be recovered from the first feature information by inverse transformation; the first feature information is input into a second neural network to obtain performance evaluation information output by the second neural network, where the performance evaluation information indicates the performance of the device in the operating environment; and the training device iteratively trains the first and second neural networks with a loss function, according to the correct performance evaluation information corresponding to the first and second training data and the output performance evaluation information, until a convergence condition is met, obtaining the trained first neural network and the trained second neural network, where the loss function indicates the similarity between the correct performance evaluation information and the output performance evaluation information.
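A minimal version of this training loop, with the two networks collapsed into a single linear model and the ground-truth performance generated synthetically — all sizes, rates and thresholds below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 5))    # rows: [environment params | config params]
true_w = rng.normal(size=5)
y = X @ true_w                  # "correct performance evaluation information"

w = np.zeros(5)                 # joint parameters of the (toy) two networks
for _ in range(500):            # iterative training
    pred = X @ w                # forward pass: features -> performance
    loss = np.mean((pred - y) ** 2)    # MSE loss: similarity to ground truth
    if loss < 1e-10:            # convergence condition met
        break
    grad = X.T @ (pred - y) / len(y)   # gradient of 0.5 * MSE
    w -= 0.1 * grad             # update both networks jointly
```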
The embodiments of this application thus provide not only the inference-stage steps of the first and second neural networks but also their training-stage steps, broadening the application scenarios of the scheme.
In a possible implementation of the second aspect, the second training data can be recovered from the first feature information by inverse transformation, but the first training data cannot.
In one possible implementation of the second aspect, the first neural network comprises a first neural network module, a second neural network module and a third neural network module, and the training device performing feature extraction on the first training data and the second training data through the first neural network comprises: the training device performs feature extraction on the first training data through the first neural network module to obtain feature information of the first training data, and performs feature extraction on the second training data through the second neural network module to obtain feature information of the second training data; the training device then couples the two sets of feature information through the third neural network module, where the second and third neural network modules both use invertible neural networks.
For the specific implementation steps of the second aspect of the embodiments of this application and its various possible implementations, the specific meanings of the terms, and the beneficial effects of each possible implementation, reference may be made to the descriptions in the various possible implementations of the first aspect, which are not repeated here.
In a third aspect, an embodiment of this application provides a data processing apparatus, which can be used to solve optimization problems in the field of artificial intelligence. The data processing apparatus includes: a feature extraction module, configured to perform feature extraction on first data and second data through a first neural network to obtain first feature information, where the first data include parameters reflecting the operating environment of a device and the second data include configuration parameters corresponding to the device; an optimization solving module, configured to input the first feature information into a second neural network and perform an optimization solution with the second neural network until the optimization objective is met, where the performance evaluation information output by the second neural network indicates the performance of the device in the operating environment, and the optimization process comprises inputting a plurality of pieces of updated first feature information into the second neural network so that the performance evaluation information output by the second neural network improves, each piece of updated first feature information being obtained based on performance evaluation information output by the second neural network; an acquisition module, configured to acquire second feature information, where the second feature information is the updated first feature information corresponding to target performance evaluation information, and the target performance evaluation information is one piece of performance evaluation information output by the second neural network; and a transformation module, configured to inverse-transform the second feature information to obtain target data corresponding to the second data, where the target data include the configuration parameters corresponding to the device.
The data processing apparatus provided in the third aspect of the embodiments of this application may further perform the steps performed by the execution device in the possible implementations of the first aspect. For the specific implementation steps of the third aspect and its possible implementations, and the beneficial effects of each, reference may be made to the descriptions in the possible implementations of the first aspect, which are not repeated here.
In a fourth aspect, an embodiment of this application provides a neural network training apparatus, which can be used to solve optimization problems in the field of artificial intelligence. The neural network training apparatus comprises: a feature extraction module, configured to perform feature extraction on first training data and second training data through a first neural network to obtain first feature information, where the first training data include parameters reflecting the operating environment of a device, the second training data include configuration parameters corresponding to the device, and the second training data can be recovered from the first feature information by inverse transformation; a performance evaluation module, configured to input the first feature information into a second neural network to obtain performance evaluation information output by the second neural network, where the performance evaluation information indicates the performance of the device in the operating environment; and a training module, configured to iteratively train the first and second neural networks with a loss function, according to the correct performance evaluation information corresponding to the first and second training data and the output performance evaluation information, until a convergence condition is met, obtaining the trained first neural network and the trained second neural network, where the loss function indicates the similarity between the correct performance evaluation information and the output performance evaluation information.
The neural network training apparatus provided in the fourth aspect of this application may further perform the steps performed by the training device in the possible implementations of the second aspect. For the specific implementation steps of the fourth aspect and its possible implementations, and the beneficial effects of each, reference may be made to the descriptions in the possible implementations of the second aspect, which are not repeated here.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a program which, when run on a computer, causes the computer to perform the method of the first or second aspect described above.
In a sixth aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored therein, which when run on a computer causes the computer to perform the method of the first or second aspects described above.
In a seventh aspect, an embodiment of this application provides an execution device, which may include a processor and a memory coupled to the processor, where the memory stores program instructions that, when executed by the processor, implement the data processing method described in the first aspect.
In an eighth aspect, an embodiment of this application provides a training device, which may include a processor and a memory coupled to the processor, where the memory stores program instructions that, when executed by the processor, implement the neural network training method described in the second aspect.
In a ninth aspect, an embodiment of this application provides a chip system, where the chip system includes a processor configured to implement the functions involved in the above aspects, for example sending or processing the data and/or information involved in the above methods. In one possible design, the chip system further includes a memory for holding the program instructions and data necessary for the server or the communication device. The chip system may consist of chips, or may include chips and other discrete devices.
Drawings
FIG. 1 is a schematic diagram of the artificial intelligence main framework according to an embodiment of the present application;
FIG. 2a is a system architecture diagram of a data processing system according to an embodiment of the present application;
FIG. 2b is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a first neural network in a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a second neural network module in the first neural network in a data processing method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a third neural network module in the first neural network in a data processing method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an optimization solution performed by a second neural network in a data processing method according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of a neural network training method according to an embodiment of the present application;
FIG. 9 is a schematic illustration of the beneficial effects provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a neural network training apparatus according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of an execution device according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a training device according to an embodiment of the present application;
FIG. 14 is a schematic structural diagram of a chip according to an embodiment of the present application.
Detailed Description
The terms "first", "second" and the like in the description, the claims and the above figures are used to distinguish between similar objects and do not necessarily describe a particular sequence or chronological order. It should be understood that terms so used are interchangeable under appropriate circumstances, and are merely a way of distinguishing objects with the same attributes when describing the embodiments of this application. Furthermore, the terms "comprise" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, product or device that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such a process, method, product or device.
Embodiments of this application are described below with reference to the accompanying drawings. As those of ordinary skill in the art will appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
Referring to FIG. 1, a schematic diagram of the artificial intelligence main framework is shown. The framework is described below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects the general process from data acquisition to processing, for example the general stages of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, the data undergo a refinement from "data" to "information" to "knowledge" to "wisdom". The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (technologies for providing and processing information) to the industrial ecology of the system.
(1) Infrastructure of
The infrastructure provides computing capability support for the artificial intelligence system, realizes communication with the outside world, and provides support through a base platform. The infrastructure communicates with the outside through sensors; computing power is provided by smart chips, which may specifically be hardware acceleration chips such as a central processing unit (CPU), a neural-network processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA); the base platform comprises a distributed computing framework, networks, and other related platform guarantees and supports, and may comprise cloud storage and computing, interconnection and interworking networks, and the like. For example, data obtained by the sensors through external communication is provided to smart chips in a distributed computing system provided by the base platform for computation.
(2) Data
The data of the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence. The data relate to graphics, images, voice and text, and also relate to the internet of things data of the traditional equipment, including service data of the existing system and sensing data such as force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
Wherein machine learning and deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formalized information to carry out machine thinking and problem solving according to a reasoning control strategy; typical functions are searching and matching.
Decision making refers to the process of making decisions after intelligent information has been reasoned about, and generally provides functions such as classification, ranking, and prediction.
(4) General capability
After the data has been processed, some general-purpose capabilities can be formed based on the result of the data processing, such as algorithms or a general-purpose system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, etc.
(5) Intelligent product and industry application
Intelligent products and industry applications refer to the products and applications of an artificial intelligence system in various fields; they are the encapsulation of the overall artificial intelligence solution, productizing intelligent information decision making and realizing deployed applications. The application fields mainly include: intelligent terminals, intelligent manufacturing, intelligent transportation, smart home, intelligent medical treatment, intelligent security, automatic driving, safe cities, and the like.
The embodiment of the application can be applied to various fields in the artificial intelligence field, and particularly, the data processing method provided by the embodiment of the application can be used for solving the optimization problem in a time evolution system. In the time evolution system, the operating environment where the target device is located may be changed, and then parameters for reflecting the operating environment where the target device is located may be changed. The optimization objectives of the optimization problem may include: and at any moment, determining configuration parameters corresponding to the target equipment according to the acquired parameters corresponding to the running environment so as to ensure that the performance of the target equipment in the running environment is optimal. The data processing method in the embodiment of the application can be used in the fields of intelligent terminals, automatic driving, intelligent manufacturing and the like, and a plurality of application scenes of landing to products are described below.
As an example, in the field of intelligent terminals, the target device may be represented as a base station. The evaluation indexes of the optimization target may include: the response rate of the base station (i.e., an example of the target device) to electronic devices within its coverage and the quality of communication between the base station and those electronic devices. Parameters for reflecting the operating environment in which the base station is located may include: the number of electronic devices accessing the base station at the current moment, the frequency bands used by those electronic devices, the number of antenna interfaces of the base station, or other parameters. The configuration parameters of the base station may include: the number of antenna interfaces in the base station that are in an activated state, the number of antenna interfaces used to transmit fourth-generation mobile communication technology (4G) signals, the number of antenna interfaces used to transmit fifth-generation mobile communication technology (5G) signals, or other configuration parameters. It should be understood that the above is merely for convenience in understanding the application scenarios of the embodiments of the present application, and is not intended to limit the present application.
As another example, also in the field of intelligent terminals, the target device may be represented as a power supply in a terminal device. The evaluation indexes of the optimization target may include: the duration for which the battery remains fully charged, the remaining electric quantity of the power supply when the user stops charging, or other evaluation indexes. Parameters for reflecting the operating environment in which the power supply is located may include: the moment at which charging starts, the electric quantity of the power supply when charging starts, the location at which charging takes place, or other parameters. The configuration parameters corresponding to the power supply may include: the charging power or other parameters.
As another example, the target device may appear as a vehicle, for example in the area of autopilot. The evaluation index of the optimization target may include: the length of time it takes for a vehicle to pass through a target road segment, the distance between the vehicle and a nearby vehicle during travel, the distance between the vehicle and other road participating entities during travel, or other evaluation indicators, etc. Parameters for reflecting the operating environment in which the vehicle is located may include: the location of each vehicle in the target road segment, the type of road corresponding to the target road segment, the location of other road participant entities in the target road segment, or other parameters, etc. The configuration parameters corresponding to the vehicle may include: the speed of travel of the vehicle, the travel maneuver employed by the vehicle (e.g., a turn, lane change, stop or other type of maneuver, etc.), or other parameters, etc.
As another example, in the field of smart manufacturing, for example, a target device may be represented as a chip, and the evaluation index of the optimization target may include: the number of instructions to run on the chip per cycle, etc. Parameters for reflecting the operating environment in which the chip is located may include: machine cycles spent in non-stall state, type of active instruction technique employed by the chip, type of load instruction technique employed by the chip, etc. The configuration parameters corresponding to the chip may include: the size of the primary cache in the chip and the size of the secondary cache in the chip, etc.
In the embodiment of the application, specific multiple realization states of the equipment are provided, the application scene of the scheme is expanded, and the realization flexibility of the scheme is improved.
It should be noted that the above examples of application scenarios in the embodiments of the present application are only for facilitating understanding of the present solution and are not intended to limit it; the embodiments of the present application may also be applied to other application scenarios with optimization problems, which are not enumerated one by one here.
Before describing the data processing method provided by the embodiment of the present application in detail, a description is given of a data processing system adopted by the embodiment of the present application. Referring to fig. 2a, fig. 2a is a system architecture diagram of a data processing system according to an embodiment of the present application, in fig. 2a, a data processing system 200 includes an execution device 230, a training device 210, a database 220, a data storage system 240, and a client device 250, where the execution device 230 includes a computing module 231.
The database 220 stores a training data set, and the training device 210 generates a first model/rule 201 and a second model/rule 202. The first model/rule 201 and the second model/rule 202 may be models in the form of a neural network or a non-neural network; in the embodiment of the present application, the first model/rule 201 and the second model/rule 202 are both described by taking a neural network as an example.
In the training phase of the first model/rule 201 and the second model/rule 202, the training device 210 uses the training data set in the database 220 to iteratively train the first model/rule 201 and the second model/rule 202, so as to obtain a trained first model/rule 201 and a trained second model/rule 202. The training device 210 may deploy the trained first model/rule 201 and the trained second model/rule 202 in the computing module 231 of the execution device 230.
The execution device 230 may invoke data, code, etc. in the data storage system 240, or may store data, instructions, etc. in the data storage system 240. The data storage system 240 may be disposed in the execution device 230, or the data storage system 240 may be an external memory with respect to the execution device 230.
In the reasoning stage of the first model/rule 201 and the second model/rule 202, specifically, the computing module 231 of the execution device 230 may perform feature extraction on the acquired first data and second data by using the trained first model/rule 201 to obtain first characteristic information, and then perform an optimization solution by using the second model/rule 202 until the optimization target is met, so as to acquire the configuration parameters corresponding to the target device.
In some embodiments of the present application, referring to fig. 2a, the execution device 230 and the client device 250 may be separate devices, where the execution device 230 is configured with an I/O interface 212, and performs data interaction with the client device 250, and a "user" may perform information interaction with the execution device 230 through the client device 250. It should be noted that fig. 2a is only a schematic architecture diagram of a data processing system according to an embodiment of the present application, and the positional relationship among devices, modules, etc. shown in the figure does not constitute any limitation. For example, in other embodiments of the present application, the execution device 230 may also interact with the "user" directly.
With reference to fig. 2b, fig. 2b is a schematic flow chart of the data processing method according to the embodiment of the present application. A1, the execution device performs feature extraction on first data and second data through a first neural network to obtain first characteristic information, where the first data comprise parameters for reflecting the operating environment in which the target device is located, and the second data comprise configuration parameters corresponding to the target device. A2, the execution device inputs the first characteristic information into a second neural network and performs an optimization solution using the second neural network until an optimization target is met, where the second neural network outputs performance evaluation information used for indicating the performance of the target device in the operating environment; the optimization-solving process comprises inputting updated first characteristic information into the second neural network so as to optimize the performance evaluation information output by the second neural network, the updated first characteristic information being obtained based on the performance evaluation information output by the second neural network. A3, the execution device acquires second characteristic information, where the second characteristic information comprises the updated first characteristic information corresponding to target performance evaluation information, the target performance evaluation information being one piece of performance evaluation information output by the second neural network. A4, the execution device performs an inverse transformation on the second characteristic information to obtain target data corresponding to the second data, where the target data comprise the configuration parameters corresponding to the target device.
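The loop A1-A4 can be sketched end to end. Everything in the following toy example — the shapes, the affine feature map standing in for the first neural network, the concave quadratic surrogate standing in for the second neural network, and the gradient-ascent update — is an illustrative assumption, not the patent's actual models:

```python
import numpy as np

# "First neural network": an invertible map on the configuration part.
# Here it is a fixed affine transform z = W @ u + b, so the inverse
# transformation of step A4 is u = W^{-1} @ (z - b).
rng = np.random.default_rng(0)
W = np.array([[2.0, 0.0], [0.5, 1.0]])   # invertible by construction
b = np.array([0.1, -0.3])

def extract(u):          # A1: feature extraction (configuration part only)
    return W @ u + b

def invert(z):           # A4: inverse transformation
    return np.linalg.solve(W, z - b)

# "Second neural network": a differentiable performance surrogate.
# A concave quadratic with a known optimum z_star, so gradient ascent
# on z stands in for "updating the first characteristic information
# based on the performance evaluation information" (A2).
z_star = np.array([1.0, 2.0])
def performance(z):
    return -np.sum((z - z_star) ** 2)
def grad(z):
    return -2.0 * (z - z_star)

u0 = rng.normal(size=2)        # initial configuration parameters (second data)
z = extract(u0)                # A1
for _ in range(200):           # A2: optimize in feature space, not in data space
    z = z + 0.05 * grad(z)
# A3: z is now the second characteristic information
u_opt = invert(z)              # A4: recover configuration parameters
print(performance(extract(u_opt)))   # close to the optimal value 0
```

Note the design point the patent emphasizes: the first data never change during the loop; only the feature-space variable z is updated, and the configuration parameters are read off at the end through the inverse transformation.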
In the embodiment of the present application, the first characteristic information is updated during the optimization-solving process, and the updated first characteristic information is input into the second neural network until the target performance evaluation information output by the second neural network meets the optimization target; that is, a mapping relationship between the first characteristic information and the performance evaluation information is established in the second neural network, so that the first data do not need to be adjusted during the optimization-solving process, which avoids interference with the output performance evaluation information caused by large fluctuations of the first data. Moreover, by performing an inverse transformation on the obtained second characteristic information, the configuration parameters of the device can be obtained directly, which is conducive to obtaining more suitable configuration parameters.
In order to facilitate understanding of the present solution, before describing in detail the data processing method provided by the embodiment of the present application, related terms and concepts that may be involved in the embodiments of the present application are described below.
(1) First data and second data
The first data are parameters obtained based on the objective environment; that is, the first data are not parameters that can be manually adjusted in real time. The first data may comprise parameters for reflecting the operating environment in which the target device is located; for examples of the first data, reference may be made to the above description of the application scenarios of the embodiments of the present application, which is not repeated here.
The second data and the first data are two different concepts; the second data comprise configuration parameters corresponding to the target device running in the environment. For examples of the second data, reference may be made to the above description of the application scenarios of the embodiments of the present application, which is not repeated here.
(2) Second characteristic information
The second characteristic information is updated first characteristic information corresponding to the target performance evaluation information, and when the second neural network outputs the target performance evaluation information, the second characteristic information is input to the second neural network.
(3) Inverse transformation
"reversible transformation" is a one-to-one function transformation, for example, one of which is mapped from A to B, and the inverse of the aforementioned mapping relationship is from B to A, and the transformation from A to B is referred to as a reversible transformation.
(4) Reversible neural network layer
The input and the output of a reversible neural network layer are mapped bidirectionally: the input of the reversible neural network layer can be recovered from its output through an explicit formula. All information in the input of the reversible neural network layer is retained in its output, which ensures that the layer loses no information about the input data.
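A simple way to see the "explicit inverse formula" property (the additive layer below is an illustrative assumption, not the patent's layer): a layer of the form y = x + t(c), where c is side information, is reversible in x no matter what t is, because x = y − t(c) recovers the input exactly.

```python
import numpy as np

def t(c):
    return np.sin(3.0 * c)       # arbitrary, even non-invertible, function of c

def layer_forward(x, c):
    return x + t(c)              # all information in x is retained in the output

def layer_inverse(y, c):
    return y - t(c)              # explicit formula recovering the input

x = np.array([0.2, -1.1, 4.0])
c = np.array([1.0, 0.5, -2.0])
y = layer_forward(x, c)
print(np.allclose(layer_inverse(y, c), x))   # True
```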
In combination with the above description, a description will be given below of a specific implementation flow of the reasoning phase and training phase of the neural network provided by the embodiment of the present application.
1. Inference phase
Specifically, referring to fig. 3, fig. 3 is a flow chart of a data processing method according to an embodiment of the present application, where the data processing method according to the embodiment of the present application may include:
301. The execution device inputs the first data and the second data into a first neural network, performs feature extraction on the first data and the second data through the first neural network, and obtains first characteristic information generated by the first neural network, where the first data comprise parameters for reflecting the operating environment in which the target device is located, and the second data comprise configuration parameters corresponding to the target device.
In the embodiment of the present application, after the execution device acquires definite first data, the execution device may acquire a group of second data; the aforementioned second data may be a group of second data generated randomly, or a group of second data input by a technician based on experience, and the second data may also be obtained in other manners, which are not exhaustively enumerated here.
The execution device inputs the first data and the second data into a first neural network, and performs feature extraction on the first data and the second data through the first neural network to obtain first feature information generated by the first neural network, wherein the first feature information comprises feature information of the first data and feature information of the second data. It should be noted that, the understanding of the two concepts of the "first data" and the "second data" may refer to the above description, and are not repeated herein.
In one implementation, the first characteristic information can be converted into the second data by inverse transformation, and can also be converted into the first data by inverse transformation; that is, the first neural network is a reversible neural network. For an understanding of the two concepts of "reversible transformation" and "reversible neural network", reference may be made to the above description, which is not repeated here.
In another implementation, only the second data can be obtained from the first characteristic information by inverse transformation, and the first data cannot be obtained by inverse transformation. In the embodiment of the present application, inverse transformation of the first characteristic information then yields only the configuration parameters of the device, not the parameters of the operating environment in which the device is located; that is, an irreversible neural network layer is adopted in the processing of the first data in the first neural network, which reduces the complexity of the first neural network and thereby reduces its size and the computer resources occupied during operation.
In the case where only the second data can be recovered from the first characteristic information by inverse transformation, step 301 may specifically include: the execution device performs feature extraction on the first data through a first neural network module to obtain characteristic information of the first data; performs feature extraction on the second data through a second neural network module to obtain characteristic information of the second data; and couples the characteristic information of the first data and the characteristic information of the second data through a third neural network module to obtain target characteristic information. The characteristic information of the first data and the characteristic information of the second data may be represented as vectors, matrices, or other data formats, which can be flexibly determined in combination with the actual application scenario and are not limited here.
Further, in one implementation, the executing device may directly determine the target feature information as the first feature information. In another implementation manner, the execution device may couple the target feature information with the feature information of the first data again through the third neural network module to obtain updated target feature information; the execution device may repeatedly execute the foregoing steps a plurality of times through the third neural network module, and determine the final updated target feature information as the first feature information.
The first neural network module, the second neural network module, and the third neural network module all belong to the first neural network, and each of them comprises at least one neural network layer. The second neural network module and the third neural network module both adopt reversible neural network layers, while the first neural network module adopts an irreversible neural network layer.
For a more intuitive understanding of the present solution, please refer to fig. 4, fig. 4 is a schematic diagram of a first neural network in the data processing method according to an embodiment of the present application. In fig. 4, the feature information of the first data, the feature information of the second data, and the first feature information are each represented as a vector. The first neural network may include a first neural network module, a second neural network module and a third neural network module, where the second neural network module and the third neural network module both adopt a reversible neural network layer, and the first neural network module adopts an irreversible neural network layer. The execution device inputs the first data and the second data into a first neural network, and a first neural network module in the first neural network is used for extracting features of the first data to obtain feature information of the first data generated by the first neural network module.
The second neural network module in the first neural network is used for performing feature extraction on the second data to obtain the characteristic information of the second data generated by the second neural network module. Since the number of parameters in the first data is often greater than the number of parameters in the second data, a dimension-raising operation may be performed in the process of extracting features of the second data, so that the dimension of the characteristic information of the second data is comparable to the dimension of the characteristic information of the first data, i.e., the difference between the two dimensions is less than or equal to a first threshold. In the embodiment of the present application, making the dimensions of the characteristic information of the first data and of the second data comparable helps reduce the complexity, and therefore the difficulty, of the subsequent optimization-solving process, which is conducive to obtaining more suitable configuration parameters.
The third neural network module in the first neural network may rearrange the characteristic information of the second data in order (i.e. "disorder" in fig. 4), and couple the characteristic information of a part (for convenience of description, hereinafter referred to as "first part") of the characteristic information of the second data after the "disorder" with the characteristic information of the first data to obtain a first coupling result; the third neural network module re-couples the first coupling result with the characteristic information of the other part of the characteristic information of the second data (namely, the characteristic information except the first part in the characteristic information of the second data) to obtain a second coupling result; and the third neural network module splices the characteristic information of the first part with the second coupling result to obtain target characteristic information. The execution device may repeatedly execute the foregoing operations through a third neural network module in the first neural network, so as to implement multiple updates of the target feature information, and determine the final updated target feature information as the first feature information. It should be understood that the example in fig. 4 is merely for convenience in understanding the architecture of the "first neural network", and is not intended to limit the present scheme.
More specifically, regarding the process in which the execution device obtains the characteristic information of the first data through the first neural network module: the first neural network module may be formed of simple neural network layers, and the execution device may transform the input first data through the first neural network module in the first neural network to obtain the characteristic information of the first data, which helps enhance the expression capability of the first data. The "transform" operation may be vectorization, a convolution operation, or another type of transform operation, which are not exhaustively enumerated here.
Regarding the process in which the execution device obtains the characteristic information of the second data through the second neural network module: for a more intuitive understanding of the present solution, please refer to fig. 5, which is a schematic diagram of the second neural network module in the first neural network in the data processing method according to an embodiment of the present application. As shown in fig. 5, there is a straight-through (ST) design in the second neural network module; that is, the execution device may copy the input second data into the characteristic information of the second data through the second neural network module in the first neural network, so as to ensure that all the information of the second data is carried in the characteristic information of the second data.
In order to ensure that the difference between the number of values included in the characteristic information of the second data and the number of values included in the characteristic information of the first data is less than or equal to the first threshold, that is, in order for the characteristic information of the second data to include a target number of values, a dimension-raising operation needs to be performed through the second neural network module in the first neural network. Specifically, in one implementation, the second neural network module may pad a plurality of preset values into the characteristic information of the second data; in another implementation, the second neural network module may pad random values into the characteristic information of the second data, and so on; the specific implementations of the dimension-raising operation are not exhaustively enumerated here.
It should be understood that the example in fig. 5 is only for facilitating understanding of the present solution, and is not limited to this solution, as long as the second neural network module adopts a reversible neural network layer, and a specific implementation manner of the second neural network module may be flexibly determined in combination with an actual situation.
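The straight-through plus dimension-raising design described above can be sketched as follows; the zero-padding scheme and the concrete dimensions are assumptions for illustration, not the patent's exact layer:

```python
import numpy as np

# The second data are copied unchanged into the feature vector (the
# straight-through part) and padded with preset values up to the
# target dimension, so the forward map is trivially reversible:
# the inverse simply drops the padding.
PAD_VALUE = 0.0

def raise_dim(u, target_dim):
    """Copy u through and pad with preset values to target_dim."""
    pad = np.full(target_dim - u.shape[0], PAD_VALUE)
    return np.concatenate([u, pad])

def lower_dim(v, original_dim):
    """Inverse: drop the padded values."""
    return v[:original_dim]

u = np.array([3.0, -1.5])            # second data (configuration parameters)
v = raise_dim(u, target_dim=6)       # feature info of the second data, length 6
print(np.allclose(lower_dim(v, 2), u))   # True: no information is lost
```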
Regarding the process in which the execution device obtains the target characteristic information through the third neural network module: for a more intuitive understanding of the present solution, please refer to fig. 6, which is a schematic diagram of the third neural network module in the first neural network in the data processing method according to the embodiment of the present application. As shown in fig. 6, the execution device may reorder the characteristic information of the second data through the third neural network module of the first neural network (i.e., the "disorder" shown in fig. 6); the third neural network module copies u_{≤d}, included in the shuffled characteristic information of the second data (i.e., the first d values in the characteristic information of the second data), into the target characteristic information, and transforms u_{>d}, included in the shuffled characteristic information of the second data, to obtain the target characteristic information generated by the third neural network module; that is, the remaining characteristic information in the target characteristic information can be obtained by the following formula:

φ(u_{>d}; g(s, u_{≤d})); (1)

where u_{>d} represents the (d+1)-th value to the last value in the shuffled characteristic information of the second data, s represents the characteristic information of the first data, u_{≤d} represents the first d values in the shuffled characteristic information of the second data, and g(s, u_{≤d}) represents the coupling of s and u_{≤d}. Fig. 6 uses an intermediate quantity Θ, with Θ = g(s, u_{≤d}) and φ(u_{>d}; g(s, u_{≤d})) = φ(u_{>d}; Θ); that is, Θ represents the first coupling result of s and u_{≤d}, and φ(u_{>d}; g(s, u_{≤d})) represents the second coupling result of u_{>d} with the first coupling result, the second coupling result being the remaining characteristic information in the target characteristic information. The marking in fig. 6 indicates that formula (1) is capable of reversible transformation.
Optionally, φ(u_{>d}; Θ) can be obtained by the following formula:

φ(u_{>d}; Θ) = u_{>d} ⊙ exp(σ(s, u_{≤d})) + t(s, u_{≤d}); (2)

where u_{>d} represents the (d+1)-th value to the last value in the shuffled characteristic information of the second data, ⊙ represents the element-wise (point-wise) product between two vectors, σ and t represent nonlinear functions, s represents the characteristic information of the first data, u_{≤d} represents the first d values in the shuffled characteristic information of the second data, and exp represents the exponential function.
Correspondingly, the following formula can be deduced from the above formula (2):

u_{>d} = (φ(u_{>d}; Θ) − t(s, u_{≤d})) ⊙ exp(−σ(s, u_{≤d})); (3)

As is clear from formula (3), the formula adopted in formula (2) is capable of reversible transformation; that is, if the remaining characteristic information in the target characteristic information is calculated by formula (2), then u_{>d} can be recovered from the remaining characteristic information in the target characteristic information by the inverse transformation of formula (3).
It should be noted that the examples in fig. 6 and formula (2) are merely intended to demonstrate one implementation of the present solution and are not intended to limit it; in actual situations, the remaining characteristic information in the target characteristic information may also be calculated in other manners, which are not exhaustively enumerated here.
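Formulas (2) and (3) can be checked numerically. In the sketch below, the split point d, the vector dimensions, and the concrete choices of σ and t as small tanh maps are illustrative assumptions; the essential property is that the forward map is invertible in u_{>d} for any σ and t, because they depend only on s and u_{≤d}:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 2                                   # split point: first d values pass through
Ws, Wt = rng.normal(size=(3, 5)), rng.normal(size=(3, 5))

def sigma(s, u_le):                     # nonlinear function of (s, u_{<=d})
    return np.tanh(Ws @ np.concatenate([s, u_le]))

def t(s, u_le):
    return np.tanh(Wt @ np.concatenate([s, u_le]))

def couple(s, u):                       # formula (2): forward coupling
    u_le, u_gt = u[:d], u[d:]
    phi = u_gt * np.exp(sigma(s, u_le)) + t(s, u_le)
    return np.concatenate([u_le, phi])  # u_{<=d} is copied straight through

def uncouple(s, v):                     # formula (3): inverse coupling
    u_le, phi = v[:d], v[d:]
    u_gt = (phi - t(s, u_le)) * np.exp(-sigma(s, u_le))
    return np.concatenate([u_le, u_gt])

s = rng.normal(size=3)                  # characteristic information of the first data
u = rng.normal(size=5)                  # (shuffled) characteristic info of the second data
v = couple(s, u)
print(np.allclose(uncouple(s, v), u))   # True: the coupling is reversible
```

Note that only the u_{>d} part is transformed while u_{≤d} passes through unchanged, which is exactly what makes the inverse of formula (3) computable: the quantities σ(s, u_{≤d}) and t(s, u_{≤d}) are available again when inverting.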
It should be noted that the examples in Figs. 4 to 6 are only intended to facilitate understanding of the present solution; the specific structure of the first neural network may be flexibly determined according to the actual situation and is not limited to these examples. As an example, the first part and the second part of the feature information of the second data may also differ in data amount.
As another example, in another implementation, the third neural network module in the first neural network may couple the second half of the feature information of the second data with the first half of the feature information of the first data, then couple that coupling result with the first half of the feature information of the second data, and finally splice the result obtained in the foregoing step with the second half of the feature information of the second data to obtain the target feature information.
As another example, the third neural network module in the first neural network may couple the second part of the feature information of the second data with the feature information of the first data, and then directly splice the first part of the feature information of the second data with the coupling result obtained in the foregoing step to obtain the target feature information; the specific forms of the third neural network module in the first neural network are not exhaustively listed herein. That is, as long as the second neural network module and the third neural network module both adopt reversible neural network layers, the first neural network module is free to adopt an irreversible neural network layer; which specific neural network layers the first, second, and third neural network modules of the first neural network adopt is not limited herein.
In this case, the second data can be obtained by performing inverse transformation on the first characteristic information, and the first data can likewise be obtained by inverse transformation. Specifically, in one implementation, step 301 may include: the execution device inputs the first data and the second data into the first neural network; the first data and the second data may be spliced through the first neural network, and feature extraction is then performed on the spliced data through the first neural network to obtain the first characteristic information, wherein the entire first neural network is a reversible neural network.
In another implementation, the first neural network includes a first neural network module, a second neural network module, and a third neural network module, and the first, second, and third neural network modules all adopt reversible neural network layers. Step 301 may include: the execution device performs feature extraction on the first data through the first neural network module to obtain feature information of the first data; performs feature extraction on the second data through the second neural network module to obtain feature information of the second data; and couples the feature information of the first data and the feature information of the second data through the third neural network module to obtain the target feature information.
Further, in one implementation, the executing device may directly determine the target feature information as the first feature information. In another implementation manner, the execution device may couple the target feature information with the feature information of the first data again through the third neural network module to obtain updated target feature information; the execution device may repeatedly execute the foregoing steps a plurality of times through the third neural network module, and determine the final updated target feature information as the first feature information.
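As an illustration only, the three-module decomposition and the repeated coupling described above can be sketched as follows; the additive coupling, the linear invertible second module, and all weights are assumptions made for the sketch, not structures mandated by the present solution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical module weights: only the second and third modules need
# to be reversible; the first module may use an irreversible layer.
W1 = rng.standard_normal((4, 4))                      # first module
W2 = np.eye(4) + 0.1 * rng.standard_normal((4, 4))    # second module (invertible)

def first_module(x):
    return np.tanh(W1 @ x)      # irreversible feature extraction is allowed here

def second_module(y):
    return W2 @ y               # invertible linear layer: undo with solve(W2, .)

def third_module(f1, f2):
    return f2 + f1              # additive coupling: invertible in f2 given f1

x = rng.standard_normal(4)      # first data (operating-environment parameters)
y = rng.standard_normal(4)      # second data (configuration parameters)

target = third_module(first_module(x), second_module(y))
for _ in range(3):              # optionally re-couple with the first data's features
    target = third_module(first_module(x), target)
```

Given x, the second data y can be recovered exactly from `target` by subtracting the coupled features and solving the linear layer, mirroring how the second data is recoverable from the first characteristic information.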
In the embodiment of the application, feature extraction is performed on the first data and the second data through the first neural network module and the second neural network module respectively, which facilitates decoupling the first neural network module from the second neural network module. The aim of the scheme is that the performance evaluation information output by the second neural network be sensitive to the second data; whether it is also sensitive to the first data need not be considered. By arranging the first neural network module and the second neural network module as two independent neural network modules, the design emphasis can be placed on the neural network module that processes the second data, thereby reducing the difficulty of designing the first neural network.
302. In the process that the execution device utilizes the second neural network to carry out optimization solution, the first characteristic information is input into the second neural network to obtain performance evaluation information output by the second neural network, and the performance evaluation information is used for indicating the performance of the target device in an operation environment.
In the embodiment of the application, after the execution device obtains the first characteristic information corresponding to the first data and the second data through the first neural network, the execution device can start to perform optimization solution by using the second neural network. In the process that the execution device utilizes the second neural network to carry out optimization solution, the first characteristic information is input into the second neural network, and is processed through the second neural network to obtain performance evaluation information output by the second neural network, wherein the performance evaluation information is used for indicating the performance of the target device in an operation environment.
The optimization solving process comprises the step of inputting the updated first characteristic information into a second neural network so as to optimize the performance evaluation information output by the second neural network. The phrase "to optimize the performance evaluation information output by the second neural network" means that the overall objective of the overall optimization solution is to obtain better performance evaluation information, and does not represent that the performance evaluation information output by the second neural network is better than the performance evaluation information output last time after the updated first feature information is input into the second neural network.
The performance evaluation information may include a value of at least one performance evaluation index, where the value of the at least one performance evaluation index is used to indicate the performance of the target device in the operating environment. Specifically, which types of performance evaluation indexes are required must be determined in combination with the actual operating environment of the target device, which is not limited herein; for examples of actual operating environments of the target device, reference is made to the above description, and examples are not repeated here.
The second neural network need not employ a reversible neural network. As an example, the second neural network may employ a multi-layer perceptron (MLP), a convolutional neural network, a recurrent neural network, or another type of neural network; the possible forms of the second neural network are not exhaustively listed here.
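As a non-limiting sketch, the second neural network can be as simple as a small multi-layer perceptron mapping the first characteristic information to the value of a single hypothetical performance evaluation index; the layer sizes and random weights below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative two-layer perceptron: 7 feature values in, one
# performance evaluation index value out. No reversibility is required.
W1, b1 = rng.standard_normal((16, 7)), np.zeros(16)
W2, b2 = rng.standard_normal((1, 16)), np.zeros(1)

def second_network(feat):
    hidden = np.maximum(0.0, W1 @ feat + b1)  # ReLU hidden layer
    return W2 @ hidden + b2                   # performance evaluation value

feat = rng.standard_normal(7)                 # first characteristic information
score = second_network(feat)
```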
303. In the process of carrying out optimization solving by the execution equipment by utilizing the second neural network, judging whether an optimization target is met, if not, entering a step 304; if yes, go to step 305.
In the embodiment of the application, in the process that the execution device performs optimization solution by using the second neural network, the execution device can judge whether the optimization target is met after obtaining the performance evaluation information output by the second neural network, and if the judgment result is negative, the step 304 is entered; if yes, go to step 305.
In one implementation, the performance evaluation information may include a value of at least one performance evaluation index, and the optimization objective may include a value range of each performance evaluation index in the at least one performance evaluation index.
In another implementation, the optimization target may include that the number of times the execution device inputs the first feature information (including the updated first feature information) into the second neural network reaches a preset number.
In another implementation, the optimization target may include traversing the feature space in which the first feature information is located; this feature space includes a plurality of pieces of feature information that adopt the same data type and include the same number of values as the first feature information, and the other feature information in this feature space may be regarded as updated first feature information. It should be understood that the specific meaning of the "optimization target" may be flexibly determined in combination with the actual application scenario and is not limited herein.
Specifically, suppose the optimization target includes a value range for each of the at least one performance evaluation index. In one implementation, when the value of every performance evaluation index in the at least one performance evaluation index satisfies the value range specified by the optimization target, the execution device may consider that the performance evaluation information output by the second neural network meets the optimization target; if the value of any one performance evaluation index does not satisfy the value range specified by the optimization target, the performance evaluation information output by the second neural network is regarded as not meeting the optimization target.
In another implementation, when the value of any one performance evaluation index in the at least one performance evaluation index satisfies the value range specified by the optimization target, the performance evaluation information output by the second neural network is considered to meet the optimization target; only if the value of every performance evaluation index fails to satisfy the value range specified by the optimization target does the execution device consider that the performance evaluation information output by the second neural network does not meet the optimization target.
It should be noted that, the specific implementation manner of "determining whether the performance evaluation information output by the second neural network meets the optimization target" may be flexibly set in combination with the actual application environment, which is not limited herein.
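Purely to illustrate the first implementation above (every index must fall within its range), a check of the optimization target might look as follows; the index names and ranges are hypothetical:

```python
def meets_optimization_target(perf, ranges):
    # perf: performance index name -> value output by the second network.
    # ranges: index name -> (low, high) admissible interval.
    # Returns True only if every index lies within its specified range.
    return all(low <= perf[name] <= high for name, (low, high) in ranges.items())

perf = {"throughput": 95.0, "latency_ms": 8.5}
ranges = {"throughput": (90.0, float("inf")), "latency_ms": (0.0, 10.0)}
```

Under the second implementation, `all` would simply be replaced by `any`.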
304. In the process that the execution equipment utilizes the second neural network to carry out optimization solution, the updated first characteristic information is input into the second neural network, and updated performance evaluation information output by the second neural network is obtained.
In the embodiment of the application, if the execution device determines that the optimization target is not met, the execution device can input the updated first characteristic information into the second neural network, and process the updated first characteristic information again through the second neural network to obtain updated performance evaluation information output by the second neural network; and re-enter step 303 after obtaining updated performance evaluation information output by the second neural network to determine whether the optimization objective is met.
The updated first characteristic information and the first characteristic information adopt the same type of data and include the same number of values. As an example, if the first feature information is embodied as a vector including 7 values, the updated first feature information is also a vector including 7 values; it should be understood that this example is given merely for ease of understanding and is not intended to limit the present solution.
For the process of acquiring the updated first feature information, in one implementation manner, the execution device may update the input first feature information according to the performance evaluation information output by the second neural network, to obtain updated first feature information.
In another implementation manner, before executing step 304, the executing device may determine a feature space where the first feature information obtained in step 301 is located, and then the executing device obtains a new first feature information (i.e. obtains the updated first feature information) from the feature space where the first feature information is located; the manner in which the updated first characteristic information is obtained by the execution device is not exhaustive here.
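As one hypothetical way of realizing the first manner above (updating the input feature information according to the output performance evaluation information), the feature vector can be nudged along a finite-difference estimate of the performance gradient; the quadratic surrogate below merely stands in for the second neural network:

```python
import numpy as np

rng = np.random.default_rng(2)

def surrogate(feat):
    # Stand-in for the second neural network's performance output;
    # it peaks when every feature value equals 1.
    return -float(np.sum((feat - 1.0) ** 2))

def update_feature(feat, step=0.1, eps=1e-4):
    # Updated first characteristic information: move along a
    # finite-difference gradient estimate of the performance value.
    grad = np.zeros_like(feat)
    for i in range(feat.size):
        delta = np.zeros_like(feat)
        delta[i] = eps
        grad[i] = (surrogate(feat + delta) - surrogate(feat - delta)) / (2 * eps)
    return feat + step * grad

feat = rng.standard_normal(7)
for _ in range(200):
    feat = update_feature(feat)
```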
It should be noted that, the number of times of execution between steps 303 and 304 and steps 301 and 302 is not limited in the embodiment of the present application, and steps 303 and 304 may be executed one or more times after steps 301 and 302 are executed once.
In order to more intuitively understand the present solution, please refer to fig. 7, fig. 7 is a schematic diagram of an optimization solution through a second neural network in the data processing method according to an embodiment of the present application. As shown in fig. 7, B1, the execution device inputs the first characteristic information into the second neural network, and obtains performance evaluation information output from the second neural network. B2, the execution equipment judges whether the optimization target is met, and if the optimization target is not met, the step B3 is entered; if the optimization objective is met, step B4 is entered. And B3, the execution device acquires the updated first characteristic information and reenters the step B1. And B4, the execution equipment acquires second characteristic information, wherein the second characteristic information comprises updated first characteristic information corresponding to target performance evaluation information, and the target performance evaluation information is performance evaluation information output by the second neural network. It should be understood that the example in fig. 7 is merely for facilitating understanding of the present solution, and is not intended to limit the present solution.
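The loop B1 to B4 above can be condensed into a self-contained sketch; the performance model, the initial feature vector, the update rule, and the stopping threshold are all illustrative assumptions:

```python
import numpy as np

def second_network(feat):
    # Stand-in performance model: higher is better, peaking at feat = 0.
    return -float(np.sum(feat ** 2))

feat = np.full(5, 2.0)            # initial first characteristic information
best_feat, best_perf = feat.copy(), second_network(feat)

for _ in range(100):
    perf = second_network(feat)   # B1: evaluate with the second network
    if perf > best_perf:
        best_feat, best_perf = feat.copy(), perf
    if best_perf > -1e-3:         # B2: hypothetical optimization target
        break
    feat = feat * 0.9             # B3: obtain updated feature information

second_feature_info = best_feat   # B4: feature info of the best evaluation
```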
305. The execution device acquires second characteristic information, wherein the second characteristic information comprises updated first characteristic information corresponding to target performance evaluation information, and the target performance evaluation information is performance evaluation information output by a second neural network.
In the embodiment of the application, under the condition that the execution device determines that the optimization target is met, the execution device can acquire the second characteristic information corresponding to the target performance evaluation information. The second characteristic information comprises updated first characteristic information corresponding to the target performance evaluation information, namely the second characteristic information and the first characteristic information adopt the same type of data, and the number of the numerical values included in the second characteristic information and the number of the numerical values included in the first characteristic information are the same. For understanding of the correspondence between the "second feature information" and the "target performance evaluation information", reference may be made to the above description, and details thereof are not repeated here.
The target performance evaluation information is one of a plurality of pieces of performance evaluation information output by the second neural network. Further, the target performance evaluation information may be the optimal one among the plurality of pieces of performance evaluation information output by the second neural network; or it may be the one in which the value of a given performance index is highest among the plurality; alternatively, it may be determined based on other principles, which are not limited herein.
306. The execution device performs inverse transformation on the second characteristic information to obtain target data corresponding to the second data, wherein the target data comprises configuration parameters corresponding to the target device.
In the embodiment of the application, the second data can be obtained by performing inverse transformation on the first characteristic information generated by the first neural network. Because the second characteristic information is updated first characteristic information, that is, the second characteristic information and the first characteristic information adopt the same type of data and include the same number of values, the execution device can perform inverse transformation on the second characteristic information to obtain the target data corresponding to the second data.
The target data comprises configuration parameters corresponding to the target device, further, the target data and the second data adopt the same type of data, and the number of the numerical values included in the target data and the second data is the same.
In the embodiment of the application, the first characteristic information is updated during the optimization solving process, and the updated first characteristic information is input into the second neural network until the target performance evaluation information output by the second neural network meets the optimization target. That is, a mapping relationship between the first characteristic information and the performance evaluation information is established in the second neural network, so the first data does not need to be adjusted during the optimization solving process, which avoids the interference that large fluctuations of the first data would cause to the output performance evaluation information. Moreover, by performing inverse transformation on the obtained second characteristic information, the configuration parameters of the device can be obtained directly, so that more suitable configuration parameters can be obtained.
2. Training phase
Specifically, referring to fig. 8, fig. 8 is a schematic flow chart of a neural network training method provided by an embodiment of the present application, where the neural network training method provided by the embodiment of the present application may include:
801. the training device inputs first training data and second training data into a first neural network, the first training data and the second training data are subjected to feature extraction through the first neural network to obtain first feature information, wherein the first training data comprise parameters for reflecting the running environment of the device, the second training data comprise configuration parameters of the device, and the second training data can be obtained by carrying out inverse transformation on the first feature information.
802. The training device inputs the first characteristic information into the second neural network to obtain performance evaluation information output by the second neural network, wherein the performance evaluation information is used for indicating the performance of the device in the running environment.
In the embodiment of the application, the training device may be preconfigured with a training data set, where the training data set includes multiple sets of training data, and each set of training data may include first training data, second training data, and correct performance evaluation information corresponding to the first training data and the second training data. The specific implementation of the training device to perform steps 801 and 802 may refer to the descriptions of steps 301 and 302 in the corresponding embodiment of fig. 3, which are not described herein. The difference is that the first data in the corresponding embodiment of fig. 3 is replaced with the first training data in the corresponding embodiment of fig. 8, and the second data in the corresponding embodiment of fig. 3 is replaced with the second training data in the corresponding embodiment of fig. 8.
803. The training device trains the first neural network and the second neural network by using a target loss function according to correct performance evaluation information and output performance evaluation information corresponding to the first training data and the second training data, wherein the target loss function indicates similarity between the correct performance evaluation information and the output performance evaluation information.
In the embodiment of the application, after obtaining the performance evaluation information output by the second neural network (i.e. obtaining the predicted performance evaluation information), the training device can generate the function value of the target loss function according to the correct performance evaluation information corresponding to the first training data and the second training data and the output performance evaluation information; the training equipment conducts gradient derivation according to the function value of the target loss function, and conducts reverse updating on the weight parameters of the first neural network and the second neural network so as to complete one-time training of the first neural network and the second neural network.
Wherein the objective loss function indicates a degree of similarity between the correct performance evaluation information and the output performance evaluation information; as examples, the target loss function may specifically take the form of a square loss function, a 0-1 loss function, or other types of loss functions, for example, which are not exhaustive herein.
The training device repeatedly executes steps 801 to 803 for a plurality of times to complete iterative training of the first neural network and the second neural network until convergence conditions are met; the convergence condition may be a convergence condition that satisfies the objective loss function, or may be that the iteration number reaches a preset number, or the like, which is not exhaustive herein.
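Purely as a schematic of steps 801 to 803 repeated until convergence, the sketch below trains a single linear stand-in for the cascade of the first and second neural networks with a squared loss; the data, learning rate, and iteration count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy training set: concatenated first/second training data, with
# "correct" performance evaluation values generated by a known rule.
X = rng.standard_normal((64, 6))
w_true = rng.standard_normal(6)
y = X @ w_true                            # correct performance evaluation info

w = np.zeros(6)                           # trainable weight parameters
for _ in range(500):                      # steps 801-803 iterated
    pred = X @ w                          # forward pass (step 802)
    grad = 2 * X.T @ (pred - y) / len(X)  # gradient of the squared loss
    w -= 0.05 * grad                      # reverse update (step 803)

loss = float(np.mean((X @ w - y) ** 2))
```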
It should be noted that, in the corresponding embodiment of fig. 8, the meanings of the various terms may be referred to in the above description, and the meanings of the various terms in fig. 8 are not described one by one.
In the embodiment of the application, not only the steps of the reasoning stages of the first neural network and the second neural network are provided, but also the steps of the training stages of the first neural network and the second neural network are provided, so that the application scene of the scheme is expanded.
In order to more intuitively understand the beneficial effects of the embodiments of the present application, the following description is made in connection with test data. In this experiment, the first data includes 5 parameter values and the second data includes 1 parameter value; the results are described with reference to fig. 9, which is a schematic diagram of the beneficial effects provided by the embodiment of the present application. Taking as the optimization problem the selection of an optimal strategy in fig. 9, the ordinate of fig. 9 represents the cumulative regret value, where a smaller cumulative regret value represents better performance of the selected strategy. The neural network adopted by reference group 1 is a multi-layer perceptron, and reference group 2 is a modification of reference group 1. As can be seen from fig. 9, the performance of the strategy selected based on the method provided by the embodiment of the present application is far superior to that of the reference groups.
In order to better implement the above-described scheme of the embodiment of the present application on the basis of the embodiments corresponding to fig. 3 to 8, the following provides a related device for implementing the above-described scheme. Referring specifically to fig. 10, fig. 10 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, where a data processing apparatus 1000 includes: the feature extraction module 1001 is configured to perform feature extraction on first data and second data through a first neural network to obtain first feature information, where the first data includes parameters for reflecting an operating environment where the device is located, and the second data includes configuration parameters corresponding to the device; the optimization solving module 1002 is configured to input the first feature information into a second neural network, perform optimization solving by using the second neural network until an optimization target is met, where the second neural network outputs performance evaluation information, the performance evaluation information is used to indicate performance of the device in an operating environment, and the optimization solving process includes inputting a plurality of updated first feature information into the second neural network, so that the performance evaluation information output by the second neural network is optimized, and the updated first feature information is obtained based on the performance evaluation information output by the second neural network; an obtaining module 1003, configured to obtain second feature information, where the second feature information includes updated first feature information corresponding to target performance evaluation information, and the target performance evaluation information is performance evaluation information output by the second neural network; the transforming module 1004 is configured to 
perform inverse transformation on the second feature information to obtain target data corresponding to the second data, where the target data includes configuration parameters corresponding to the device.
In one possible design, the first data cannot be obtained by performing inverse transformation on the first characteristic information.
In one possible design, the first neural network includes a first neural network module, a second neural network module, and a third neural network module, and the feature extraction module 1001 is specifically configured to: extracting the characteristics of the first data through a first neural network module to obtain characteristic information of the first data; feature extraction is carried out on the second data through a second neural network module, so that feature information of the second data is obtained; and coupling the characteristic information of the first data and the characteristic information of the second data through a third neural network module, wherein the second neural network module and the third neural network module adopt reversible neural network layers.
In one possible design, the device is any one of the following: a base station, a vehicle, a chip, or a power source.
It should be noted that, content such as information interaction and execution process between each module/unit in the data processing apparatus 1000, each method embodiment corresponding to fig. 3 to 7 in the present application is based on the same concept, and specific content may be referred to the description in the foregoing method embodiment of the present application, which is not repeated herein.
The embodiment of the application further provides a training device for the neural network, please refer to fig. 11, and fig. 11 is a schematic structural diagram of the training device for the neural network provided by the embodiment of the application. The training apparatus 1100 of the neural network includes: the feature extraction module 1101 is configured to perform feature extraction on first training data and second training data through a first neural network to obtain first feature information, where the first training data includes parameters for reflecting an operating environment where the device is located, the second training data includes configuration parameters corresponding to the device, and performing inverse transformation on the first feature information to obtain second training data; the performance evaluation module 1102 is configured to input the first feature information into a second neural network, and obtain performance evaluation information output by the second neural network, where the performance evaluation information is used to indicate performance of the device in an operating environment; the training module 1103 is configured to iteratively train the first neural network and the second neural network by using a loss function according to the correct performance evaluation information and the output performance evaluation information corresponding to the first training data and the second training data until convergence conditions are satisfied, so as to obtain a trained first neural network and a trained second neural network, where the loss function indicates a similarity between the correct performance evaluation information and the output performance evaluation information.
In one possible design, the first training data cannot be obtained by performing inverse transformation on the first characteristic information.
In one possible design, the first neural network includes a first neural network module, a second neural network module, and a third neural network module, and the feature extraction module 1101 is specifically configured to: perform feature extraction on the first training data through the first neural network module to obtain feature information of the first training data; perform feature extraction on the second training data through the second neural network module to obtain feature information of the second training data; and couple the feature information of the first training data and the feature information of the second training data through the third neural network module, wherein the second neural network module and the third neural network module both adopt reversible neural network layers.
It should be noted that, in the training apparatus 1100 of the neural network, the content such as information interaction and execution process between each module/unit is based on the same concept, and specific content may be referred to the description in the foregoing method embodiment of the present application, which is not repeated herein.
Next, referring to fig. 12, fig. 12 is a schematic structural diagram of an execution device provided in an embodiment of the present application; the data processing apparatus 1000 described in the embodiment corresponding to fig. 10 may be deployed on the execution device 1200 to implement the functions of the execution device in the embodiments corresponding to fig. 3 to fig. 7. Specifically, the execution device 1200 includes: a receiver 1201, a transmitter 1202, a processor 1203, and a memory 1204 (where the number of processors 1203 in the execution device 1200 may be one or more, one processor being taken as an example in fig. 12), and the processor 1203 may include an application processor 12031 and a communication processor 12032. In some embodiments of the application, the receiver 1201, the transmitter 1202, the processor 1203, and the memory 1204 may be connected by a bus or other means.
The memory 1204 may include read-only memory and random access memory, and provides instructions and data to the processor 1203. A portion of the memory 1204 may also include non-volatile random access memory (non-volatile random access memory, NVRAM). The memory 1204 stores operating instructions executable by the processor, executable modules or data structures, or a subset thereof, or an extended set thereof, where the operating instructions may include various operating instructions for implementing various operations.
The processor 1203 controls the operation of the execution device. In a specific application, the individual components of the execution device are coupled together by a bus system, where the bus system may include, in addition to a data bus, a power bus, a control bus, a status signal bus and the like. For clarity of illustration, however, the various buses are all labeled as the bus system in the figure.
The method disclosed in the above embodiments of the present application may be applied to the processor 1203 or implemented by the processor 1203. The processor 1203 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the method described above may be performed by integrated logic circuitry in hardware or by instructions in the form of software in the processor 1203. The processor 1203 may be a general-purpose processor, a digital signal processor (digital signal processor, DSP), a microprocessor or a microcontroller, and may further include an application-specific integrated circuit (application-specific integrated circuit, ASIC), a field-programmable gate array (field-programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 1203 may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly embodied as being performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well established in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1204, and the processor 1203 reads the information in the memory 1204 and performs the steps of the above method in combination with its hardware.
The receiver 1201 may be used to receive input numeric or character information and to generate signal inputs related to the relevant settings and function control of the execution device. The transmitter 1202 may be used to output numeric or character information through a first interface; the transmitter 1202 may also be used to send instructions to a disk group through the first interface to modify data in the disk group; and the transmitter 1202 may also include a display device such as a display screen.
In an embodiment of the present application, in one case, the application processor 12031 in the processor 1203 is configured to perform the data processing method performed by the execution apparatus in the corresponding embodiment of fig. 3 to 7. Specifically, the application processor 12031 is configured to perform feature extraction on first data and second data through a first neural network to obtain first feature information, where the first data includes parameters for reflecting an operating environment where the device is located, and the second data includes configuration parameters corresponding to the device; inputting the first characteristic information into a second neural network, and carrying out optimization solution by using the second neural network until the optimization target is met, wherein the performance evaluation information output by the second neural network is used for indicating the performance of the equipment in an operating environment, the optimization solution process comprises inputting a plurality of updated first characteristic information into the second neural network so as to optimize the performance evaluation information output by the second neural network, and the updated first characteristic information is obtained based on the performance evaluation information output by the second neural network; acquiring second characteristic information, wherein the second characteristic information comprises updated first characteristic information corresponding to target performance evaluation information, and the target performance evaluation information is performance evaluation information output by a second neural network; and carrying out inverse transformation on the second characteristic information to obtain target data corresponding to the second data, wherein the target data comprises configuration parameters corresponding to the equipment.
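The optimization-solution loop described above — repeatedly updating the first characteristic information based on the performance evaluation information output by the second neural network until the optimization target is met, then keeping the best feature vector as the second characteristic information — can be sketched as gradient ascent in feature space. The quadratic `performance` stand-in, the finite-difference update, and the assumed optimum at (1.0, -0.5) are illustrative assumptions; the application does not disclose a concrete evaluator or update rule.

```python
def performance(z):
    # Stand-in for the second neural network: a performance evaluation of a
    # feature vector, maximal at the (assumed) optimum z* = (1.0, -0.5).
    return -((z[0] - 1.0) ** 2 + (z[1] + 0.5) ** 2)

def optimize_features(z0, steps=200, lr=0.1, eps=1e-4):
    """Gradient ascent on the evaluator's output in feature space."""
    z = list(z0)
    best_z, best_p = list(z), performance(z)
    for _ in range(steps):
        base = performance(z)
        # finite-difference estimate of d(performance)/dz
        grad = []
        for i in range(len(z)):
            zp = list(z)
            zp[i] += eps
            grad.append((performance(zp) - base) / eps)
        # updated first characteristic information, driven by the evaluation
        z = [zi + lr * gi for zi, gi in zip(z, grad)]
        p = performance(z)
        if p > best_p:                 # keep the best features seen so far
            best_p, best_z = p, list(z)
    return best_z, best_p              # plays the role of the second characteristic information

best_z, best_p = optimize_features([0.0, 0.0])
# best_z would then be inverse-transformed by the (reversible) first network
# to recover the target configuration parameters — not modeled in this sketch.
```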
It should be noted that the specific manner in which the application processor 12031 performs the above steps is based on the same concept as the method embodiments corresponding to fig. 3 to fig. 7 in the present application, and the technical effects brought by it are the same as those of the method embodiments corresponding to fig. 3 to fig. 7 in the present application; for specific content, reference may be made to the description in the foregoing method embodiments of the present application, which is not repeated here.
Referring to fig. 13, fig. 13 is a schematic structural diagram of a training device provided in an embodiment of the present application. The training apparatus 1100 of the neural network described in the corresponding embodiment of fig. 11 may be disposed on the training device 1300, so as to implement the functions of the training device in the corresponding embodiment of fig. 8. Specifically, the training device 1300 is implemented by one or more servers. The training device 1300 may vary considerably with different configurations or performances, and may include one or more central processing units (central processing units, CPU) 1322 (for example, one or more processors), a memory 1332, and one or more storage media 1330 (for example, one or more mass storage devices) storing application programs 1342 or data 1344. The memory 1332 and the storage medium 1330 may be transitory or persistent. The program stored on the storage medium 1330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the training device. Further, the central processing unit 1322 may be configured to communicate with the storage medium 1330 and perform, on the training device 1300, the series of instruction operations in the storage medium 1330.
The training device 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
In an embodiment of the present application, the central processing unit 1322 is configured to perform the training method of the neural network performed by the training device in the corresponding embodiment of fig. 8. Specifically, the central processing unit 1322 is configured to: perform feature extraction on first training data and second training data through a first neural network to obtain first feature information, where the first training data includes parameters for reflecting an operating environment where a device is located, the second training data includes configuration parameters corresponding to the device, and the second training data can be obtained by performing inverse transformation on the first feature information; input the first feature information into a second neural network to obtain performance evaluation information output by the second neural network, where the performance evaluation information is used to indicate the performance of the device in the operating environment; and perform iterative training on the first neural network and the second neural network by using a loss function according to correct performance evaluation information corresponding to the first training data and the second training data and the output performance evaluation information until a convergence condition is met, so as to obtain a trained first neural network and a trained second neural network, where the loss function indicates the similarity between the correct performance evaluation information and the output performance evaluation information.
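As a hedged illustration of the training loop described above — extracting features, predicting performance, and minimizing a loss between the predicted and the correct performance evaluation information — the toy below fits a one-parameter evaluator by stochastic gradient descent on a squared-error loss. The linear "feature extraction" `z = env + cfg`, the evaluator `w * z`, and the data are all assumptions for illustration, not the networks or data of this application.

```python
# Toy joint training: "feature extraction" z = env + cfg, "evaluator" p_hat = w * z.
# The correct performance evaluation information is p_true = 2 * (env + cfg),
# so the squared-error loss is minimized when w converges to 2.0.
data = [
    (0.5, 1.0, 3.0),    # (environment parameter, configuration parameter, correct performance)
    (1.0, -0.5, 1.0),
    (-0.2, 0.8, 1.2),
]

w = 0.0                  # the single trainable parameter of the stand-in evaluator
lr = 0.05
for epoch in range(500):              # iterative training until convergence
    for env, cfg, p_true in data:
        z = env + cfg                 # stand-in feature extraction (first network)
        p_hat = w * z                 # stand-in performance evaluation (second network)
        grad = 2.0 * (p_hat - p_true) * z   # d/dw of the squared-error loss
        w -= lr * grad                # gradient-descent update
```

In the actual method both networks are updated jointly, and the first network is additionally constrained to be invertible with respect to the configuration stream; the sketch only shows the loss-driven parameter update.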
It should be noted that the specific manner in which the central processing unit 1322 performs the above steps is based on the same concept as the method embodiment corresponding to fig. 8 in the present application, and the technical effects brought by it are the same as those of the method embodiment corresponding to fig. 8 in the present application; for specific details, reference may be made to the description of the method embodiments in the foregoing description of the present application, which is not repeated here.
An embodiment of the present application also provides a computer program product including a program which, when run on a computer, causes the computer to perform the steps performed by the execution device in the method described in the embodiments of fig. 3 to fig. 7 above, or causes the computer to perform the steps performed by the training device in the method described in the embodiment of fig. 8 above.
An embodiment of the present application also provides a computer-readable storage medium in which a program for signal processing is stored, and the program, when run on a computer, causes the computer to perform the steps performed by the execution device in the method described in the embodiments shown in fig. 3 to fig. 7 above, or causes the computer to perform the steps performed by the training device in the method described in the embodiment shown in fig. 8 above.
The execution device, the training device, the data processing apparatus or the training apparatus of the neural network provided in the embodiments of the present application may specifically be a chip. The chip includes a processing unit and a communication unit; the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin or a circuit. The processing unit may execute the computer-executable instructions stored in the storage unit, so that the chip in the execution device performs the data processing method described in the embodiments shown in fig. 3 to fig. 7, or so that the chip in the training device performs the training method of the neural network described in the embodiment shown in fig. 8. Optionally, the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (read-only memory, ROM) or another type of static storage device that can store static information and instructions, or a random access memory (random access memory, RAM).
Specifically, referring to fig. 14, fig. 14 is a schematic structural diagram of a chip provided in an embodiment of the present application. The chip may be embodied as a neural network processing unit NPU 140. The NPU 140 is mounted as a coprocessor on a host CPU (Host CPU), and the host CPU allocates tasks to it. The core part of the NPU is the operation circuit 1403, and the controller 1404 controls the operation circuit 1403 to extract matrix data in a memory and perform multiplication operations.
In some implementations, the operation circuit 1403 internally includes a plurality of processing units (PEs). In some implementations, the operation circuit 1403 is a two-dimensional systolic array. The operation circuit 1403 may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the operation circuit 1403 is a general-purpose matrix processor.
For example, assume that there are an input matrix A, a weight matrix B and an output matrix C. The operation circuit fetches the data corresponding to the matrix B from the weight memory 1402 and buffers it on each PE in the operation circuit. The operation circuit takes the data of matrix A from the input memory 1401, performs a matrix operation with matrix B, and stores the obtained partial result or final result of the matrix in the accumulator (accumulator) 1408.
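The accumulate-as-you-go matrix multiplication described above — partial results of C = A·B collected in an accumulator — can be sketched in plain Python. The loop order below, which sweeps one k-index "wavefront" of partial products at a time before moving to the next, only illustrates the accumulation pattern; it is not the actual PE scheduling of the NPU.

```python
def matmul_accumulate(A, B):
    """C = A @ B, accumulating one k-slice of partial products at a time."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]           # plays the role of the accumulator
    for kk in range(k):                          # one column of A / row of B per step
        for i in range(n):
            for j in range(m):
                C[i][j] += A[i][kk] * B[kk][j]   # partial result accumulated into C
    return C

A = [[1, 2], [3, 4]]   # input matrix A
B = [[5, 6], [7, 8]]   # weight matrix B
C = matmul_accumulate(A, B)   # [[19.0, 22.0], [43.0, 50.0]]
```

After any intermediate `kk` iteration, `C` holds exactly the partial result of the product, which is what the hardware accumulator stores between wavefronts.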
The unified memory 1406 is used for storing input data and output data. The weight data is directly transferred to the weight memory 1402 through the direct memory access controller (Direct Memory Access Controller, DMAC) 1405. The input data is also carried into the unified memory 1406 through the DMAC.
The bus interface unit (Bus Interface Unit, BIU) 1410 is used for the interaction between the AXI bus on one side and the DMAC and the instruction fetch buffer (Instruction Fetch Buffer, IFB) 1409 on the other. The bus interface unit 1410 is used by the instruction fetch buffer 1409 to fetch instructions from an external memory, and is further used by the direct memory access controller 1405 to fetch the raw data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 1406 or to transfer weight data to the weight memory 1402 or to transfer input data to the input memory 1401.
The vector calculation unit 1407 includes a plurality of operation processing units and, if necessary, performs further processing on the output of the operation circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation and size comparison. The vector calculation unit 1407 is mainly used for the calculation of non-convolution/non-fully-connected layers in the neural network, such as batch normalization (batch normalization), pixel-level summation, and up-sampling of a feature plane.
In some implementations, the vector computation unit 1407 can store the vector of processed outputs to the unified memory 1406. For example, the vector calculation unit 1407 may apply a linear function and/or a nonlinear function to the output of the operation circuit 1403, for example, linearly interpolate the feature plane extracted by the convolution layer, and further, for example, accumulate a vector of values to generate an activation value. In some implementations, the vector computation unit 1407 generates normalized values, pixel-level summed values, or both. In some implementations, the vector of processed outputs can be used as an activation input to the arithmetic circuit 1403, e.g., for use in subsequent layers in a neural network.
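Two of the vector-unit operations named above, batch normalization and up-sampling of a feature plane, can be sketched as follows. The one-dimensional versions and the function names `batch_norm`/`upsample_nearest` are simplifying assumptions for illustration, not the hardware implementation.

```python
import math

def batch_norm(xs, eps=1e-5):
    # Normalize a feature vector to zero mean and (near) unit variance;
    # eps guards against division by zero for constant inputs.
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return [(x - mu) / math.sqrt(var + eps) for x in xs]

def upsample_nearest(row, factor=2):
    # Nearest-neighbour up-sampling of a 1-D feature row:
    # each value is repeated `factor` times.
    return [v for v in row for _ in range(factor)]

out = batch_norm([1.0, 2.0, 3.0, 4.0])   # zero-mean, roughly unit-variance vector
up = upsample_nearest([1, 2])            # [1, 1, 2, 2]
```

In a trained network these element-wise operations are applied to the operation circuit's output before it is written back to the unified memory, as described above.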
The instruction fetch buffer (instruction fetch buffer) 1409 connected to the controller 1404 is used for storing instructions used by the controller 1404. The unified memory 1406, the input memory 1401, the weight memory 1402 and the instruction fetch buffer 1409 are all on-chip memories; the external memory is a memory external to the NPU hardware architecture.
Wherein the operation of each of the neural network layers in the first neural network and the operation of each of the neural network layers in the second neural network shown in the above-described respective embodiments may be performed by the operation circuit 1403 or the vector calculation unit 1407.
The processor mentioned in any of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the program of the method of the first aspect.
It should be further noted that the above-described apparatus embodiments are merely illustrative; the units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the application, the connection relation between the modules indicates that they have communication connections with each other, which may specifically be implemented as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus the necessary general-purpose hardware, and of course also by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components and the like. In general, functions performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structures used to implement the same function can be varied, such as analog circuits, digital circuits or dedicated circuits. However, for the present application, a software program implementation is, in more cases, the better embodiment. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a training device, a network device, etc.) to perform the methods according to the embodiments of the present application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used for implementation, the implementation may take, in whole or in part, the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, training device or data center to another website, computer, training device or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a training device or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (Solid State Disk, SSD)), and the like.

Claims (18)

1. A data processing method, characterized in that the data processing method comprises:
performing feature extraction on first data and second data through a first neural network to obtain first feature information, wherein the first data comprises parameters for reflecting an operation environment where equipment is located, and the second data comprises configuration parameters corresponding to the equipment;
inputting the first characteristic information into a second neural network, and performing optimization solution by using the second neural network until an optimization target is met, wherein the performance evaluation information output by the second neural network is used for indicating the performance of the equipment in an operation environment, the optimization solution process comprises inputting a plurality of updated first characteristic information into the second neural network so as to optimize the performance evaluation information output by the second neural network, and the updated first characteristic information is obtained based on the performance evaluation information output by the second neural network;
acquiring second characteristic information, wherein the second characteristic information comprises the updated first characteristic information corresponding to target performance evaluation information, and the target performance evaluation information is performance evaluation information output by the second neural network;
And carrying out inverse transformation on the second characteristic information to obtain target data corresponding to the second data, wherein the target data comprises configuration parameters corresponding to the equipment.
2. The method of claim 1, wherein the first data cannot be obtained by performing inverse transformation on the first characteristic information.
3. The method of claim 1 or 2, wherein the first neural network comprises a first neural network module, a second neural network module, and a third neural network module, the feature extraction of the first data and the second data by the first neural network comprising:
extracting the characteristics of the first data through a first neural network module to obtain characteristic information of the first data;
performing feature extraction on the second data through a second neural network module to obtain feature information of the second data;
and coupling the characteristic information of the first data and the characteristic information of the second data through a third neural network module, wherein the second neural network module and the third neural network module adopt reversible neural network layers.
4. The method according to claim 1 or 2, characterized in that the device is any one of the following: a base station, a vehicle, a chip, or a power source.
5. A method for training a neural network, the method comprising:
performing feature extraction on first training data and second training data through a first neural network to obtain first feature information, wherein the first training data comprises parameters for reflecting an operation environment where equipment is located, the second training data comprises configuration parameters corresponding to the equipment, and the second training data can be obtained by performing inverse transformation on the first feature information;
inputting the first characteristic information into a second neural network to obtain performance evaluation information output by the second neural network, wherein the performance evaluation information is used for indicating the performance of the equipment in an operating environment;
and performing iterative training on the first neural network and the second neural network by using a loss function according to the correct performance evaluation information corresponding to the first training data and the second training data and the output performance evaluation information until convergence conditions are met, so as to obtain the trained first neural network and the trained second neural network, wherein the loss function indicates the similarity between the correct performance evaluation information and the output performance evaluation information.
6. The method of claim 5, wherein the first training data cannot be obtained by performing inverse transformation on the first characteristic information.
7. The method of claim 5 or 6, wherein the first neural network comprises a first neural network module, a second neural network module, and a third neural network module, and the performing feature extraction on the first training data and the second training data through the first neural network comprises:
performing feature extraction on the first training data through the first neural network module to obtain characteristic information of the first training data;
performing feature extraction on the second training data through the second neural network module to obtain characteristic information of the second training data;
and coupling the characteristic information of the first training data and the characteristic information of the second training data through the third neural network module, wherein the second neural network module and the third neural network module both adopt reversible neural networks.
8. A data processing apparatus, characterized in that the data processing apparatus comprises:
the device comprises a feature extraction module, a feature extraction module and a feature extraction module, wherein the feature extraction module is used for carrying out feature extraction on first data and second data through a first neural network to obtain first feature information, the first data comprises parameters for reflecting an operation environment where equipment is located, and the second data comprises configuration parameters corresponding to the equipment;
The optimization solving module is used for inputting the first characteristic information into a second neural network, carrying out optimization solving by using the second neural network until an optimization target is met, wherein the output of the second neural network is performance evaluation information, the performance evaluation information is used for indicating the performance of the equipment in an operation environment, the optimization solving process comprises the steps of inputting a plurality of updated first characteristic information into the second neural network so as to optimize the performance evaluation information output by the second neural network, and the updated first characteristic information is obtained based on the performance evaluation information output by the second neural network;
the acquisition module is used for acquiring second characteristic information, wherein the second characteristic information comprises the updated first characteristic information corresponding to target performance evaluation information, and the target performance evaluation information is one piece of performance evaluation information output by the second neural network;
and the transformation module is used for carrying out inverse transformation on the second characteristic information to obtain target data corresponding to the second data, wherein the target data comprises configuration parameters corresponding to the equipment.
9. The apparatus of claim 8, wherein the first data cannot be obtained by performing inverse transformation on the first characteristic information.
10. The apparatus according to claim 8 or 9, wherein the first neural network comprises a first neural network module, a second neural network module and a third neural network module, the feature extraction module being specifically configured to:
extracting the characteristics of the first data through a first neural network module to obtain characteristic information of the first data;
performing feature extraction on the second data through a second neural network module to obtain feature information of the second data;
and coupling the characteristic information of the first data and the characteristic information of the second data through a third neural network module, wherein the second neural network module and the third neural network module adopt reversible neural network layers.
11. The apparatus according to claim 8 or 9, wherein the device is any one of the following: a base station, a vehicle, a chip, or a power source.
12. A neural network training device, characterized in that the neural network training device comprises:
the device comprises a feature extraction module, a feature extraction module and a feature extraction module, wherein the feature extraction module is used for carrying out feature extraction on first training data and second training data through a first neural network to obtain first feature information, the first training data comprises parameters for reflecting the running environment of equipment, the second training data comprises configuration parameters corresponding to the equipment, and the second training data can be obtained by carrying out inverse transformation on the first feature information;
The performance evaluation module is used for inputting the first characteristic information into a second neural network to obtain performance evaluation information output by the second neural network, wherein the performance evaluation information is used for indicating the performance of the equipment in an operation environment;
and the training module is used for carrying out iterative training on the first neural network and the second neural network by using a loss function according to the correct performance evaluation information corresponding to the first training data and the second training data and the output performance evaluation information until convergence conditions are met, so as to obtain the trained first neural network and the trained second neural network, wherein the loss function indicates the similarity between the correct performance evaluation information and the output performance evaluation information.
13. The apparatus of claim 12, wherein the first training data cannot be obtained by performing inverse transformation on the first characteristic information.
14. The apparatus according to claim 12 or 13, wherein the first neural network comprises a first neural network module, a second neural network module and a third neural network module, the feature extraction module being specifically configured to:
performing feature extraction on the first training data through the first neural network module to obtain characteristic information of the first training data;
performing feature extraction on the second training data through the second neural network module to obtain characteristic information of the second training data;
and coupling the characteristic information of the first training data and the characteristic information of the second training data through the third neural network module, wherein the second neural network module and the third neural network module both adopt reversible neural networks.
15. A computer program product, characterized in that the computer program product comprises a program which, when run on a computer, causes the computer to perform the method according to any one of claims 1 to 4 or causes the computer to perform the method according to any one of claims 5 to 7.
16. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 4 or causes the computer to perform the method of any one of claims 5 to 7.
17. An execution device comprising a processor and a memory, the processor coupled to the memory,
the memory is used for storing programs;
the processor configured to execute a program in the memory, so that the execution device executes the method according to any one of claims 1 to 4.
18. A training device comprising a processor and a memory, the processor being coupled to the memory,
the memory is used for storing programs;
the processor for executing a program in the memory, causing the training device to perform the method of any one of claims 5 to 7.
CN202210114746.XA 2022-01-30 2022-01-30 Data processing method and related equipment Pending CN116578525A (en)

Publications: CN116578525A (pending), published 2023-08-11; also published as WO2023143128A1, published 2023-08-03.
CN113065997B (en) Image processing method, neural network training method and related equipment
CN113095475A (en) Neural network training method, image processing method and related equipment
CN113505883A (en) Neural network training method and device
WO2023179482A1 (en) Image processing method, neural network training method and related device
CN113869496A (en) Acquisition method of neural network, data processing method and related equipment
CN112528108B (en) Model training system, gradient aggregation method and device in model training
CN115238909A (en) Data value evaluation method based on federal learning and related equipment thereof
WO2023045949A1 (en) Model training method and related device
CN115795025A (en) Abstract generation method and related equipment thereof
CN116578525A (en) Data processing method and related equipment
CN112766475B (en) Processing component and artificial intelligence processor
CN115907041A (en) Model training method and device
CN115081615A (en) Neural network training method, data processing method and equipment
CN114254724A (en) Data processing method, neural network training method and related equipment
WO2024016894A1 (en) Method for training neural network and related device
US20240232575A1 (en) Neural network obtaining method, data processing method, and related device
CN116050480A (en) Battery data processing method, machine learning model training method and equipment
WO2024140630A1 (en) Model training method and related device

Legal Events

Date Code Title Description
PB01 Publication