CN111327472B - Method and device for acquiring target network

Method and device for acquiring target network

Info

Publication number
CN111327472B
CN111327472B
Authority
CN
China
Prior art keywords
network
space
processed
local
super
Prior art date
Legal status
Active
Application number
CN202010113213.0A
Other languages
Chinese (zh)
Other versions
CN111327472A (en)
Inventor
希滕
张刚
温圣召
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010113213.0A priority Critical patent/CN111327472B/en
Publication of CN111327472A publication Critical patent/CN111327472A/en
Application granted granted Critical
Publication of CN111327472B publication Critical patent/CN111327472B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The embodiment of the disclosure discloses a method and a device for acquiring a target network. One embodiment of the method comprises: for each reference network in a reference network set, calculating negative gradient information of the reference network with respect to a to-be-processed super network; for the at least one piece of negative gradient information corresponding to the reference network set, calculating the gradient mean of the at least one piece of negative gradient information and setting the gradient mean as a performance feedback parameter of the to-be-processed super network; performing update iterations on the network parameters of the to-be-processed super network according to the performance feedback parameter; and in response to the update iteration operation reaching a set number of iterations, or the variation of the network parameters being less than or equal to a set threshold, marking the to-be-processed super network after the update iteration operation as the target super network. This implementation gives the target super network good consistency with the reference networks in the local network spaces and improves the adaptability of the target super network.

Description

Method and device for acquiring target network
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for acquiring a target network.
Background
To meet the needs of various network structures, a super network can be trained using NAS (Neural Architecture Search). That is, the super network contains a plurality of network structures and can be applied to a plurality of different network structures, thereby realizing the sharing of network structures.
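For illustration only (the patent does not prescribe any concrete representation), a weight-sharing super network of the kind trained by NAS can be sketched in PyTorch as follows; the layer types, channel counts and names are assumptions made up for this sketch:

```python
import torch.nn as nn


class SuperLayer(nn.Module):
    """One super-network layer holding several candidate operations."""

    def __init__(self, channels):
        super().__init__()
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # candidate 0
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),  # candidate 1
            nn.Identity(),                                            # candidate 2 (skip)
        ])

    def forward(self, x, choice):
        return self.candidates[choice](x)


class SuperNet(nn.Module):
    """All sub-networks share these stored parameters (weight sharing)."""

    def __init__(self, channels=16, depth=4):
        super().__init__()
        self.layers = nn.ModuleList([SuperLayer(channels) for _ in range(depth)])

    def forward(self, x, path):
        # `path` picks one candidate per layer, i.e. selects one sub-network.
        for layer, choice in zip(self.layers, path):
            x = layer(x, choice)
        return x
```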
Disclosure of Invention
The embodiment of the disclosure provides a method and a device for acquiring a target network.
In a first aspect, an embodiment of the present disclosure provides a method for acquiring a target network, the method comprising: for each reference network in a reference network set, calculating negative gradient information of the reference network with respect to a to-be-processed super network, where the reference network is used to represent a corresponding local network space in the to-be-processed super network; for the at least one piece of negative gradient information corresponding to the reference network set, calculating the gradient mean of the at least one piece of negative gradient information and setting the gradient mean as a performance feedback parameter of the to-be-processed super network; performing update iterations on the network parameters of the to-be-processed super network according to the performance feedback parameter; and in response to the update iteration operation reaching a set number of iterations, or the variation of the network parameters being less than or equal to a set threshold, marking the to-be-processed super network after the update iteration operation as the target super network.
In some embodiments, the reference network set is obtained by: dividing the network space into at least one local network space with a set size; and selecting at least one reference network from the at least one local network space to form a reference network set.
In some embodiments, dividing the network space into at least one local network space of a set size includes: dividing the network space into at least one local network space of a set size in a specified manner, wherein the specified manner includes at least one of the following: network type, geographical location of the network.
In some embodiments, the selecting at least one reference network from the at least one local network space to form a reference network set includes: for the local network space in the at least one local network space, at least one first reference network is randomly selected, and the at least one first reference network is combined to form a reference network set.
In some embodiments, the selecting at least one reference network from the at least one local network space to form a reference network set includes: randomly selecting at least one second reference network for a local network space of the at least one local network space; and combining at least one second reference network corresponding to the at least one local network space to form a reference network set.
In some embodiments, calculating the negative gradient information of the reference network with respect to the to-be-processed super network includes: after replacing the local network space corresponding to the reference network in the to-be-processed super network with the reference network, calculating the negative gradient information of the reference network with respect to the to-be-processed super network.
In some embodiments, calculating the negative gradient information of the reference network with respect to the to-be-processed super network after the replacement includes: performing forward propagation and backward propagation on the to-be-processed super network based on the reference network to obtain the negative gradient information of the reference network with respect to the to-be-processed super network.
In a second aspect, an embodiment of the present disclosure provides an apparatus for acquiring a target network, the apparatus including: a negative gradient information calculation unit configured to, for each reference network in a reference network set, calculate negative gradient information of the reference network with respect to a to-be-processed super network, where the reference network is used to represent a corresponding local network space in the to-be-processed super network; a performance feedback parameter acquisition unit configured to, for the at least one piece of negative gradient information corresponding to the reference network set, calculate the gradient mean of the at least one piece of negative gradient information and set the gradient mean as a performance feedback parameter of the to-be-processed super network; an update unit configured to perform update iterations on the network parameters of the to-be-processed super network through the performance feedback parameter; and a target network acquisition unit configured to, in response to the update iteration operation reaching a set number of iterations or the variation of the network parameters being less than or equal to a set threshold, mark the to-be-processed super network after the update iteration operation as the target super network.
In some embodiments, the apparatus includes a reference network set acquisition unit configured to acquire a reference network set, and the reference network set acquisition unit includes: a local network space dividing subunit configured to divide the network space into at least one local network space of a set size; and a reference network set constructing subunit configured to select at least one reference network from the at least one local network space to form a reference network set.
In some embodiments, the local network space dividing subunit includes: a local network space dividing module configured to divide the network space into at least one local network space of a set size in a specified manner, wherein the specified manner includes at least one of the following: network type, geographical location of the network.
In some embodiments, the reference network set constructing subunit includes: a first reference network set construction module configured to randomly select at least one first reference network for the local network space in the at least one local network space, and combine the at least one first reference network to form a reference network set.
In some embodiments, the reference network set constructing subunit includes: a second reference network selection module configured to randomly pick at least one second reference network for a local network space of the at least one local network space; and the second reference network set building module is configured to combine at least one second reference network corresponding to the at least one local network space to form a reference network set.
In some embodiments, the negative gradient information calculation unit includes: a negative gradient information calculation subunit configured to calculate the negative gradient information of the reference network with respect to the to-be-processed super network after the reference network replaces the local network space corresponding to the reference network in the to-be-processed super network.
In some embodiments, the negative gradient information calculation subunit includes: a negative gradient information calculation module configured to perform forward propagation and backward propagation on the to-be-processed super network based on the reference network to obtain the negative gradient information of the reference network with respect to the to-be-processed super network.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a memory having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to perform the method for acquiring a target network of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium, on which a computer program is stored, where the program is executed by a processor to implement the method for acquiring a target network of the first aspect.
The method and device for acquiring a target network provided by the embodiments of the disclosure first calculate the negative gradient information of each reference network with respect to a to-be-processed super network; then calculate the gradient mean of the at least one piece of negative gradient information and set the gradient mean as a performance feedback parameter of the to-be-processed super network; then perform update iterations on the network parameters of the to-be-processed super network using the performance feedback parameter; and finally, when the update iteration operation reaches the set number of iterations or the variation of the network parameters is less than or equal to the set threshold, mark the to-be-processed super network after the update iteration operation as the target super network. In this way, the target super network has good consistency with the reference networks in the local network spaces, which improves the adaptability of the target super network. This helps a device running the target super network reach the performance of the reference network structures as soon as possible, improving the data processing efficiency of the device and reducing the memory space the target super network occupies on the device.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for acquiring a target network according to the present disclosure;
FIG. 3 is a schematic diagram of one application scenario of a method for acquiring a target network according to the present disclosure;
FIG. 4 is a flow diagram of yet another embodiment of a method for acquiring a target network according to the present disclosure;
FIG. 5 is a schematic block diagram illustrating one embodiment of an apparatus for acquiring a target network according to the present disclosure;
FIG. 6 is a schematic diagram of an electronic device structure suitable for implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 of a method for acquiring a target network or an apparatus for acquiring a target network to which embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, a network training server 105, and a network structure server 106. The network 104 provides the medium for communication links between the terminal devices 101, 102, 103, the network training server 105 and the network structure server 106, and may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
The terminal devices 101, 102, 103 interact with the network training server 105 over the network 104 to receive or send messages and the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like. Each of these applications processes data through a corresponding network structure.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting web browsing, information search, instant messaging, etc., including but not limited to smart phones, tablet computers, e-book readers, laptop portable computers, desktop computers, etc. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented as a plurality of software or software modules (for example, for providing distributed services), or as a single software or software module, which is not specifically limited herein.
The network structure server 106 stores a network space containing various types of network structures to meet the needs of various application scenarios.
The network training server 105 may be a server that provides various services, for example, selecting a reference network from a network space and training a super network with the reference network.
It should be noted that the method for acquiring the target network provided by the embodiment of the present disclosure is generally performed by the network training server 105, and accordingly, the apparatus for acquiring the target network is generally disposed in the network training server 105.
It should be noted that the network training server 105 and/or the network structure server 106 may be hardware or software. When they are hardware, they may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When they are software, they may be implemented as a plurality of software or software modules (for example, for providing distributed services), or as a single software or software module, which is not specifically limited herein.
It should be understood that the numbers of terminal devices, networks, network training servers and network structure servers in fig. 1 are merely illustrative. Any number of terminal devices, networks, network training servers and network structure servers may be present, as required by the implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method for acquiring a target network in accordance with the present disclosure is shown. The method for acquiring the target network comprises the following steps:
Step 201: for each reference network in the reference network set, calculate the negative gradient information of the reference network with respect to the to-be-processed super network.
In this embodiment, the execution subject of the method for acquiring the target network (e.g., the network training server 105 shown in fig. 1) may acquire the reference network set from the network structure server 106 through a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, UWB (Ultra-Wideband), and other wireless connections now known or developed in the future.
Existing super network training mainly considers the overall performance across the various network structures. As a result, when the trained super network is applied to a specific scenario, it often fails to reach the performance of the standalone network structure suited to that scenario. That is, a super network trained by existing methods has poor performance consistency with standalone network structures and cannot reach their performance.
Therefore, the execution subject of the present application may first obtain a reference network set and calculate the negative gradient information of each reference network while training the to-be-processed super network. The reference network set may be obtained by dividing the network structures in the to-be-processed super network into a plurality of local network spaces. A reference network may be used to represent the corresponding local network space in the to-be-processed super network, and the negative gradient information may be used to characterize the convergence state of the training of the to-be-processed super network.
In some optional implementations of this embodiment, the reference network set is obtained by:
First, the network space is divided into at least one local network space of a set size.
In order to make the trained super network consistent with standalone network structures to the greatest extent, the execution subject of the present application may first divide the network space corresponding to the to-be-processed super network into at least one local network space of a set size. The set size may be a set number of network structures, or network structures with a set number of network layers, and may be determined according to actual needs.
Second, at least one reference network is selected from the at least one local network space to form a reference network set.
After the local network space is determined, the execution subject may select a certain number of network structures from the local network space as reference networks, and form a reference network set through the reference networks.
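As a hypothetical sketch of these two steps, a network space can be modeled as the set of all sub-network paths of the SuperNet sketched above, partitioned into fixed-size local spaces from which reference networks are sampled; the grouping into ordered chunks and all sizes below are illustrative assumptions, not fixed by the patent:

```python
import itertools
import random


def enumerate_paths(depth=4, n_candidates=3):
    """Every sub-network of the SuperNet sketch is a tuple of per-layer choices."""
    return list(itertools.product(range(n_candidates), repeat=depth))


def partition_network_space(paths, local_size):
    """Split the network space into local network spaces of a set size."""
    return [paths[i:i + local_size] for i in range(0, len(paths), local_size)]


def sample_reference_set(local_space, k, seed=0):
    """Randomly select k reference networks from one local network space."""
    return random.Random(seed).sample(local_space, k)


paths = enumerate_paths()                                   # 81 sub-networks
local_spaces = partition_network_space(paths, local_size=16)
reference_set = sample_reference_set(local_spaces[2], k=3)  # from the 3rd local space
```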
In some optional implementations of this embodiment, dividing the network space into at least one local network space of a set size may include: dividing the network space into at least one local network space of a set size in a specified manner.
The network space includes various types of network structures, and in practice these network structures exhibit regional characteristics. Therefore, when dividing the local network spaces, the execution subject can divide the network space into at least one local network space of a set size in a specified manner, so that the trained super network meets actual requirements as far as possible. The specified manner may include at least one of the following: network type, geographical location of the network.
In some optional implementations of this embodiment, the selecting at least one reference network from the at least one local network space to form a reference network set may include: for the local network space in the at least one local network space, at least one first reference network is randomly selected, and the at least one first reference network is combined to form a reference network set.
When the to-be-processed super network is trained for a single local network space, reference networks may be selected from the network structures in that local network space to form the reference network set. For example, suppose the network space is divided into 5 local network spaces and the 3rd local network space is chosen for training the to-be-processed super network. In this case, the execution subject may select reference networks from the 3rd local network space to form the reference network set, which then represents the 3rd local network space when training the to-be-processed super network.
In some optional implementations of this embodiment, the selecting at least one reference network from the at least one local network space to form a reference network set may include the following steps:
First, for each local network space in the at least one local network space, at least one second reference network is randomly selected.
The above describes training the to-be-processed super network through a single local network space. Alternatively, the to-be-processed super network can be trained through a plurality of local network spaces simultaneously. In this case, the execution subject may select a plurality of second reference networks from a plurality of local network spaces; that is, different second reference networks may come from different local network spaces. For example, with the network space divided into 5 local network spaces, the execution subject may select 3 second reference networks from the 1st local network space, 5 from the 2nd, 1 from the 3rd, 2 from the 4th, and 3 from the 5th.
Second, the at least one second reference network corresponding to the at least one local network space is combined to form a reference network set.
After selecting the at least one second reference network, the execution subject may combine the second reference networks into a reference network set. The reference network set at this point may represent the entire network space.
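Continuing the hypothetical helpers above, a reference network set drawn from several local spaces at once might be sampled as follows; the per-space counts mirror the worked example and are otherwise arbitrary:

```python
import random


def sample_across_spaces(local_spaces, counts, seed=0):
    """Draw counts[i] reference networks from local space i and pool them."""
    rng = random.Random(seed)
    reference_set = []
    for space, k in zip(local_spaces, counts):
        reference_set.extend(rng.sample(space, k))
    return reference_set


# Mirroring the worked example: 3, 5, 1, 2 and 3 networks from five local spaces.
# reference_set = sample_across_spaces(local_spaces, counts=[3, 5, 1, 2, 3])
```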
In some optional implementations of this embodiment, calculating the negative gradient information of the reference network with respect to the to-be-processed super network may include: after replacing the local network space corresponding to the reference network in the to-be-processed super network with the reference network, calculating the negative gradient information of the reference network with respect to the to-be-processed super network.
When training the to-be-processed super network, the reference network can take the place of its corresponding local network space in the network space. The resulting negative gradient information of the reference network with respect to the to-be-processed super network then represents the influence of that local network space on the training of the to-be-processed super network, and the training direction of the to-be-processed super network is guided through the reference network.
In some optional implementations of this embodiment, calculating the negative gradient information of the reference network with respect to the to-be-processed super network after the replacement may include: performing forward propagation and backward propagation on the to-be-processed super network based on the reference network to obtain the negative gradient information of the reference network with respect to the to-be-processed super network.
That is, the execution subject can replace the corresponding local network space with the reference network, and then perform forward propagation and backward propagation on the to-be-processed super network within the network space, obtaining the negative gradient information of the reference network with respect to the to-be-processed super network.
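A minimal sketch of this step, assuming the SuperNet from the earlier snippet: running the super network along one reference path stands in for replacing the local network space with the reference network; the loss and data below are placeholders:

```python
import torch.nn.functional as F


def negative_gradient_info(supernet, reference_path, x, target):
    """Forward/backward pass along one reference path; return negated gradients."""
    supernet.zero_grad()
    output = supernet(x, reference_path)   # forward propagation through the reference path
    loss = F.mse_loss(output, target)      # placeholder task loss
    loss.backward()                        # backward propagation
    # Only parameters actually used on this path receive a gradient.
    return {name: -p.grad.detach().clone()
            for name, p in supernet.named_parameters()
            if p.grad is not None}
```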
Step 202: for the at least one piece of negative gradient information corresponding to the reference network set, calculate the gradient mean of the at least one piece of negative gradient information, and set the gradient mean as the performance feedback parameter of the to-be-processed super network.
Each reference network corresponds to one piece of negative gradient information, so the at least one reference network contained in the reference network set corresponds to at least one piece of negative gradient information. As described above, the negative gradient information characterizes the convergence state when training the to-be-processed super network. The execution subject can calculate the gradient mean of the at least one piece of negative gradient information; this mean represents the combined influence of the reference network set on the convergence of the to-be-processed super network, and also reflects the adaptability, or consistency, of the trained super network with respect to the reference networks in the set. The execution subject may set this gradient mean as the performance feedback parameter of the to-be-processed super network.
As described above, the reference networks in the reference network set may come from the same local network space, in which case the gradient mean characterizes the performance of the to-be-processed super network on that local network space. They may also come from different local network spaces, in which case the gradient mean characterizes the performance of the to-be-processed super network across a plurality of local network spaces.
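Averaging the per-reference negative gradients into the performance feedback parameter could then be sketched as follows, under the same assumptions as the preceding snippets:

```python
def performance_feedback(neg_grad_infos):
    """Parameter-wise mean of the per-reference negative gradients."""
    sums, counts = {}, {}
    for info in neg_grad_infos:
        for name, grad in info.items():
            sums[name] = sums.get(name, 0) + grad
            counts[name] = counts.get(name, 0) + 1
    # Off-path parameters appear in fewer dicts; average over actual contributors.
    return {name: sums[name] / counts[name] for name in sums}
```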
Step 203: update and iterate the network parameters of the to-be-processed super network according to the performance feedback parameter.
After obtaining the performance feedback parameter, the execution subject can perform update iterations on the network parameters of the to-be-processed super network according to it, so that after each iteration the to-be-processed super network further adapts to the local network space corresponding to the reference networks.
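Under the same assumptions, one update iteration amounts to a gradient-descent step along the averaged negative gradient; the learning rate is an illustrative choice:

```python
import torch


@torch.no_grad()
def update_supernet(supernet, feedback, lr=0.01):
    """One update iteration: step along the averaged negative gradient."""
    change = 0.0
    for name, p in supernet.named_parameters():
        if name in feedback:
            delta = lr * feedback[name]    # feedback already carries the minus sign
            p.add_(delta)                  # gradient-descent step on shared weights
            change += delta.abs().sum().item()
    return change                          # total parameter variation for the stop test
```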
Step 204: in response to the update iteration operation reaching the set number of iterations or the variation of the network parameters being less than or equal to the set threshold, mark the to-be-processed super network after the update iteration operation as the target super network.
The execution subject may monitor the number of update iterations performed, or the change in the network parameters of the to-be-processed super network. When the update iteration operation reaches the set number of iterations, or the variation of the network parameters is less than or equal to the set threshold, the performance of the to-be-processed super network is consistent with that of the reference networks and the network has stabilized. At this point, the execution subject may mark the to-be-processed super network after the update iteration operation as the target super network. The target super network achieves optimized performance on the corresponding local network space.
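Tying steps 201 to 204 together under the assumptions of the preceding snippets, with the stopping test on iteration count or parameter variation; the data source and hyper-parameters are placeholders:

```python
def train_target_network(supernet, reference_set, data_iter,
                         max_iters=100, threshold=1e-4):
    """Steps 201-204 end to end; returns the super network marked as target."""
    for _ in range(max_iters):
        x, target = next(data_iter)
        infos = [negative_gradient_info(supernet, path, x, target)
                 for path in reference_set]               # step 201
        feedback = performance_feedback(infos)            # step 202
        change = update_supernet(supernet, feedback)      # step 203
        if change <= threshold:                           # step 204 (early stop)
            break
    return supernet                                       # the target super network
```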
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for acquiring a target network according to this embodiment. In the application scenario of fig. 3, the network training server 105 exchanges data with the network structure server 106 via the network 104. The network training server 105 may divide the network space on the network structure server 106 into a plurality of local network spaces and determine a reference network set from the local network spaces. When training the to-be-processed super network, the network training server 105 first calculates the negative gradient information of each reference network with respect to the to-be-processed super network; then calculates the gradient mean of the at least one piece of negative gradient information and sets the gradient mean as the performance feedback parameter of the to-be-processed super network; then performs update iterations on the network parameters of the to-be-processed super network using the performance feedback parameter; and finally, when the update iteration operation reaches the set number of iterations or the variation of the network parameters is less than or equal to the set threshold, marks the to-be-processed super network after the update iteration operation as the target super network. In this manner, the target super network on the network structure server 106 has better consistency with the network structures within the local network space. Thereafter, upon receiving a network request from the terminal device 103, the network training server 105 may send the target super network to the terminal device 103.
The method provided by the embodiment of the disclosure first calculates the negative gradient information of each reference network with respect to a to-be-processed super network; then calculates the gradient mean of the at least one piece of negative gradient information and sets the gradient mean as a performance feedback parameter of the to-be-processed super network; then performs update iterations on the network parameters of the to-be-processed super network using the performance feedback parameter; and finally, when the update iteration operation reaches the set number of iterations or the variation of the network parameters is less than or equal to the set threshold, marks the to-be-processed super network after the update iteration operation as the target super network. In this way, the target super network has good consistency with the reference networks in the local network spaces, which improves the adaptability of the target super network. This helps a device running the target super network reach the performance of the reference network structures as soon as possible, improving the data processing efficiency of the device and reducing the memory space the target super network occupies on the device.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for acquiring a target network is illustrated. The process 400 of the method for acquiring a target network includes the following steps:
Step 401: for each reference network in the reference network set, calculate the negative gradient information of the reference network with respect to the to-be-processed super network.
The content of step 401 is the same as that of step 201, and is not described in detail here.
Step 402: for the at least one piece of negative gradient information corresponding to the reference network set, calculate the gradient mean of the at least one piece of negative gradient information, and set the gradient mean as the performance feedback parameter of the to-be-processed super network.
The content of step 402 is the same as that of step 202, and is not described in detail here.
Step 403: perform update iterations on the network parameters of the to-be-processed super network according to the performance feedback parameter.
The content of step 403 is the same as that of step 203, and is not described in detail here.
Step 404: in response to the update iteration operation reaching the set number of iterations or the variation of the network parameters being less than or equal to the set threshold, mark the to-be-processed super network after the update iteration operation as the target super network.
The content of step 404 is the same as that of step 204, and is not described in detail here.
Step 405: in response to receiving a network request, send the target super network to the device corresponding to the network request.
The execution subject may receive a network request from the terminal devices 101, 102, 103, etc., requesting that the target super network be sent to them. Serving the target super network in this way improves its compatibility, so that it is better applied on the terminal devices 101, 102, 103.
With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present disclosure provides an embodiment of an apparatus for acquiring a target network, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for acquiring a target network according to this embodiment may include: a negative gradient information calculation unit 501, a performance feedback parameter acquisition unit 502, an update unit 503, and a target network acquisition unit 504. The negative gradient information calculation unit 501 is configured to, for each reference network in a reference network set, calculate negative gradient information of the reference network with respect to the to-be-processed super network, where the reference network is used to represent a corresponding local network space in the to-be-processed super network. The performance feedback parameter acquisition unit 502 is configured to, for the at least one piece of negative gradient information corresponding to the reference network set, calculate the gradient mean of the at least one piece of negative gradient information and set the gradient mean as a performance feedback parameter of the to-be-processed super network. The update unit 503 is configured to perform update iterations on the network parameters of the to-be-processed super network through the performance feedback parameter. The target network acquisition unit 504 is configured to, in response to the update iteration operation reaching a set number of iterations or the variation of the network parameters being less than or equal to a set threshold, mark the to-be-processed super network after the update iteration operation as the target super network.
In some optional implementations of this embodiment, the apparatus 500 for acquiring a target network may include a reference network set acquisition unit (not shown in the figure) configured to acquire a reference network set. The reference network set acquisition unit may include a local network space dividing subunit (not shown in the figure) and a reference network set constructing subunit (not shown in the figure). The local network space dividing subunit is configured to divide the network space into at least one local network space of a set size; the reference network set constructing subunit is configured to select at least one reference network from the at least one local network space to form a reference network set.
In some optional implementations of this embodiment, the local network space dividing subunit may include: a local network space dividing module (not shown in the figure) configured to divide the network space into at least one local network space of a set size in a specified manner, where the specified manner includes at least one of: network type, geographical location of the network.
In some optional implementation manners of this embodiment, the reference network set constructing subunit may include: and a first reference network set construction module (not shown) configured to randomly pick at least one first reference network for a local network space of the at least one local network space, and combine the at least one first reference network to form a reference network set.
In some optional implementation manners of this embodiment, the reference network set constructing subunit may include: a second reference network selection module (not shown) and a second reference network set building module (not shown). Wherein the second reference network selection module is configured to randomly select at least one second reference network for a local network space of the at least one local network space; and the second reference network set building module is configured to combine at least one second reference network corresponding to the at least one local network space to form a reference network set.
In some optional implementations of the present embodiment, the negative gradient information calculation unit 501 may include: a negative gradient information calculation subunit (not shown in the figure) configured to calculate the negative gradient information of the reference network with respect to the to-be-processed super network after the reference network replaces the local network space corresponding to the reference network in the to-be-processed super network.
In some optional implementations of the present embodiment, the negative gradient information calculation subunit may include: a negative gradient information calculation module (not shown in the figure) configured to perform forward propagation and backward propagation on the to-be-processed super network based on the reference network to obtain the negative gradient information of the reference network with respect to the to-be-processed super network.
The present embodiment also provides an electronic device, including: one or more processors; a memory having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to perform the above-described method for acquiring a target network.
The present embodiment also provides a computer-readable medium, on which a computer program is stored, which program, when executed by a processor, implements the above-described method for acquiring a target network.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 of an electronic device (e.g., the network training server 105 of FIG. 1) suitable for use in implementing embodiments of the present disclosure. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium mentioned above in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: for each reference network in a reference network set, calculate negative gradient information of the reference network with respect to a to-be-processed super network, where the reference network is used to represent a corresponding local network space in the to-be-processed super network; for the at least one piece of negative gradient information corresponding to the reference network set, calculate the gradient mean of the at least one piece of negative gradient information and set the gradient mean as a performance feedback parameter of the to-be-processed super network; perform update iterations on the network parameters of the to-be-processed super network according to the performance feedback parameter; and in response to the update iteration operation reaching a set number of iterations, or the variation of the network parameters being less than or equal to a set threshold, mark the to-be-processed super network after the update iteration operation as the target super network.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a negative gradient information calculation unit, a performance feedback parameter acquisition unit, an update unit, and a target network acquisition unit. The names of these units do not form a limitation on the unit itself in some cases, for example, the target network acquisition unit may also be described as "a unit that marks the hyper network to be processed as a target hyper network when a set condition is satisfied".
The foregoing description covers only the preferred embodiments of the disclosure and the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above features; it also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, technical solutions formed by substituting the above features with (but not limited to) features with similar functions disclosed in this disclosure.

Claims (16)

1. A method for acquiring a target network, comprising:
for each reference network in a reference network set, calculating negative gradient information of the reference network with respect to a to-be-processed super network, wherein the reference network is used for representing a corresponding local network space in the to-be-processed super network, and the negative gradient information is used for representing a convergence state when training the to-be-processed super network;
for the at least one piece of negative gradient information corresponding to the reference network set, calculating a gradient mean of the at least one piece of negative gradient information, and setting the gradient mean as a performance feedback parameter of the to-be-processed super network;
performing update iterations on the network parameters of the to-be-processed super network according to the performance feedback parameter;
and in response to the update iteration operation reaching a set number of iterations, or a variation of the network parameters being less than or equal to a set threshold, marking the to-be-processed super network after the update iteration operation as the target super network.
2. The method of claim 1, wherein the set of reference networks is obtained by:
dividing the network space into at least one local network space with a set size;
and selecting at least one reference network from the at least one local network space to form a reference network set.
3. The method of claim 2, wherein the dividing the network space into at least one local network space of a set size comprises:
dividing the network space into at least one local network space with a set size according to a specified mode, wherein the specified mode comprises at least one of the following modes: network type, geographical location of the network.
4. The method of claim 2, wherein said choosing at least one reference network from said at least one local network space to form a set of reference networks comprises:
for a local network space of the at least one local network space, at least one first reference network is randomly chosen, and the at least one first reference network is combined to form a reference network set.
5. The method of claim 2, wherein said choosing at least one reference network from said at least one local network space to form a set of reference networks comprises:
randomly picking at least one second reference network for a local network space of the at least one local network space;
and combining at least one second reference network corresponding to the at least one local network space to form a reference network set.
6. The method of claim 1, wherein said calculating negative gradient information of the reference network with respect to the to-be-processed super network comprises:
after replacing the local network space corresponding to the reference network in the to-be-processed super network with the reference network, calculating the negative gradient information of the reference network with respect to the to-be-processed super network.
7. The method of claim 6, wherein said calculating the negative gradient information of the reference network with respect to the to-be-processed super network after replacing the local network space corresponding to the reference network in the to-be-processed super network with the reference network comprises:
performing forward propagation and backward propagation on the to-be-processed super network based on the reference network to obtain the negative gradient information of the reference network with respect to the to-be-processed super network.
8. An apparatus for acquiring a target network, comprising:
a negative gradient information calculation unit configured to, for each reference network in a reference network set, calculate negative gradient information of the reference network with respect to a to-be-processed super network, wherein the reference network is used for representing a corresponding local network space in the to-be-processed super network, and the negative gradient information is used for representing a convergence state when training the to-be-processed super network;
a performance feedback parameter acquisition unit configured to, for the at least one piece of negative gradient information corresponding to the reference network set, calculate a gradient mean of the at least one piece of negative gradient information, and set the gradient mean as a performance feedback parameter of the to-be-processed super network;
an update unit configured to perform update iterations on the network parameters of the to-be-processed super network through the performance feedback parameter;
and a target network acquisition unit configured to, in response to the update iteration operation reaching a set number of iterations or a variation of the network parameters being less than or equal to a set threshold, mark the to-be-processed super network after the update iteration operation as the target super network.
9. The apparatus according to claim 8, wherein the apparatus comprises a reference network set acquisition unit configured to acquire a reference network set, the reference network set acquisition unit comprising:
a local network space dividing subunit configured to divide a network space into at least one local network space of a set size;
a reference network set constructing subunit configured to select at least one reference network from the at least one local network space to constitute a reference network set.
10. The apparatus of claim 9, wherein the local network space dividing subunit comprises:
a local network space dividing module configured to divide the network space into at least one local network space of a set size in a specified manner, wherein the specified manner comprises at least one of the following: network type, and geographical location of the network.
11. The apparatus of claim 9, wherein the reference network set constructing subunit comprises:
a first reference network set constructing module configured to, for a local network space of the at least one local network space, randomly select at least one first reference network, and combine the at least one first reference network to form a reference network set.
12. The apparatus of claim 9, wherein the reference network set constructing subunit comprises:
a second reference network selection module configured to randomly select at least one second reference network for each local network space of the at least one local network space;
and a second reference network set constructing module configured to combine the at least one second reference network corresponding to the at least one local network space to form a reference network set.
13. The apparatus according to claim 8, wherein the negative gradient information calculating unit includes:
a negative gradient information calculation subunit configured to calculate the negative gradient information of the reference network corresponding to the to-be-processed super network after the reference network replaces the local network space corresponding to the reference network in the to-be-processed super network.
14. The apparatus of claim 13, wherein the negative gradient information calculating subunit comprises:
a negative gradient information calculation module configured to perform forward propagation and backward propagation on the to-be-processed super network based on the reference network to obtain the negative gradient information of the reference network corresponding to the to-be-processed super network.
15. An electronic device, comprising:
one or more processors;
a memory having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method according to any one of claims 1-7.
16. A computer-readable medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-7.

Priority Applications (1)

Application Number: CN202010113213.0A
Priority Date: 2020-02-24
Filing Date: 2020-02-24
Title: Method and device for acquiring target network

Publications (2)

Publication Number  Publication Date
CN111327472A        2020-06-23
CN111327472B        2022-08-02

Family

ID=71167167

Family Applications (1)

Application Number: CN202010113213.0A
Title: Method and device for acquiring target network
Priority Date: 2020-02-24
Filing Date: 2020-02-24
Status: Active (granted as CN111327472B)

Country Status (1)

Country: CN
Publication: CN111327472B

Families Citing this family (1)

CN111898061B (priority 2020-07-29, published 2023-11-28), 抖音视界有限公司 (Douyin Vision Co., Ltd.): Method, apparatus, electronic device and computer readable medium for searching network

Citations (2)

CN110580520A (priority 2019-09-11, published 2019-12-17), 北京百度网讯科技有限公司 (Beijing Baidu Netcom Science and Technology Co., Ltd.): Model structure sampling device based on hyper-network and electronic equipment
CN110782010A (priority 2019-10-18, published 2020-02-11), 北京小米智能科技有限公司 (Beijing Xiaomi Intelligent Technology Co., Ltd.): Neural network construction method and device and storage medium

Family Cites Families (1)

US10832139B2 (priority 2018-06-22, published 2020-11-10), Moffett Technologies Co. Limited: Neural network acceleration and embedding compression systems and methods with activation sparsification

Legal Events

Code  Title
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant