Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram 100 illustrating a first embodiment according to the present disclosure. The incremental learning method based on the image processing model comprises the following steps:
S101, obtaining an image processing model to be iterated and an initial incremental network corresponding to the image processing model to be iterated.
In this embodiment, the execution subject of the incremental learning method based on the image processing model may acquire, in various ways, the image processing model to be iterated and the initial incremental network corresponding to the image processing model to be iterated. As an example, the image processing model to be iterated may include various neural network models for image processing, such as a convolutional network for face recognition. The initial incremental network corresponding to the image processing model to be iterated may include a branch network having a structure similar to that of the image processing model to be iterated, and the parameter values of the network parameters of the branch network may be updated according to new training samples. The image processing model to be iterated can be used as a memory storage unit, and its parameter values are usually not adjusted in response to new training samples.
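As a concrete illustration, the following sketch (in Python with PyTorch; the module structure, dimensions, and names are hypothetical and chosen only for exposition) shows one possible way to set up such a fixed "memory" model and a trainable branch network of similar structure:

    import copy
    import torch
    import torch.nn as nn

    # Illustrative sketch only: "base" plays the role of the image processing
    # model to be iterated (the fixed "memory"), and "branch" plays the role
    # of the initial incremental network with a similar structure.
    base = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(16, 128),  # a 128-dimensional image feature (assumed size)
    )
    branch = copy.deepcopy(base)  # similar structure, independent parameters

    for p in base.parameters():   # the memory part is not adjusted in training
        p.requires_grad = False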
S102, respectively inputting a sample image in the target sample data acquired in advance to the image processing model to be iterated and the initial incremental network to obtain a first image feature and a second image feature which respectively correspond to the image processing model to be iterated and the initial incremental network.
In this embodiment, the execution subject may first obtain the target sample data from a local or communicatively connected electronic device through a wired or wireless connection. The target sample data may include the sample image and annotation information corresponding to the sample image. The execution subject may then input the sample image to the image processing model to be iterated and to the initial incremental network, respectively, so as to obtain the first image feature and the second image feature which respectively correspond to the image processing model to be iterated and the initial incremental network. The first image feature may be a feature extracted by the image processing model to be iterated from the sample image in the target sample data, and may generally play the role of "memory"; the second image feature may be a feature extracted by the initial incremental network corresponding to the image processing model to be iterated from the sample image in the target sample data, and may generally play the role of "continuous learning".
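Continuing the sketch above (the batch shape and contents are stand-ins, not prescribed by this disclosure), the two features could be extracted as follows:

    # Extract the first and second image features for a batch of sample images.
    images = torch.randn(8, 3, 32, 32)   # stand-in for sample images
    with torch.no_grad():
        f_o = base(images)               # first image feature ("memory")
    f_s = branch(images)                 # second image feature ("continuous learning")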
S103, generating a fusion feature based on the first image feature and the second image feature.
In this embodiment, based on the first image feature and the second image feature obtained in step S102, the execution subject may fuse the first image feature and the second image feature in various ways to generate a fusion feature. As an example, the execution subject may perform a weighted summation of the first image feature and the second image feature, thereby generating the fusion feature.
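For instance, a fixed weighted summation (the weight below is an assumed value for illustration) would look like:

    # One possible fusion: a fixed weighted summation of the two features.
    alpha = 0.5                                # assumed weight, not prescribed
    fused = alpha * f_o + (1.0 - alpha) * f_s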
S104, adjusting the network parameters of the initial incremental network based on the difference between the fusion feature and the annotation information corresponding to the sample image.
In this embodiment, based on the difference between the fusion feature generated in step S103 and the annotation information corresponding to the input sample image, the execution subject may adjust the network parameters of the initial incremental network in various ways. As an example, the execution subject may determine the difference between the fusion feature generated in step S103 and the annotation information corresponding to the input sample image by using a preset loss function. Then, according to the determined loss value, the execution subject may update the network parameters of the initial incremental network by using a back propagation method.
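Continuing the sketch (the classifier head, label space, and optimizer settings below are assumptions for illustration), only the incremental branch receives gradient updates, while the frozen base keeps its "memory":

    import torch.nn.functional as F

    # Hypothetical classification head mapping the fused feature to labels.
    head = nn.Linear(128, 10)                 # assumed label space of 10 classes
    labels = torch.randint(0, 10, (8,))       # stand-in annotation information
    optimizer = torch.optim.SGD(
        list(branch.parameters()) + list(head.parameters()), lr=0.01
    )

    loss = F.cross_entropy(head(fused), labels)  # difference via a preset loss
    optimizer.zero_grad()
    loss.backward()                              # back propagation
    optimizer.step()                             # only branch/head parameters move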
According to the method provided by the embodiments of the present disclosure, the image processing model to be iterated is divided into an original model part whose parameters are retained and an incremental network whose parameters are adjusted; training samples are used to obtain the image features respectively corresponding to the original model part and the incremental network; the obtained image features are fused; and a loss value is determined according to the fused feature and the annotation information to adjust the parameters of the incremental network. A new incremental learning manner is thus provided, which enables the image processing model to overcome forgetting while learning continuously during training, thereby further improving the performance of the image processing model.
In some optional implementations of this embodiment, based on the first image feature and the second image feature, the execution subject may generate the fusion feature according to the following steps:
Step 1, taking the first image feature and the second image feature as inputs of a preset gate control unit to obtain an output corresponding to the preset gate control unit.
In these implementations, the execution subject may use the first image feature and the second image feature obtained in step S102 as inputs of the preset gate control unit, so as to obtain the output corresponding to the preset gate control unit. The operation logic of the preset gate control unit can be preset, so that different outputs can be generated according to the requirements of the application scenario.
Step 2, generating a fusion feature based on the output corresponding to the preset gate control unit.
In these implementations, the execution subject may generate the fusion feature in various ways based on the output corresponding to the preset gate control unit obtained in Step 1. As an example, the execution subject may directly determine the output corresponding to the preset gate control unit as the fusion feature. As another example, the execution subject may map the output corresponding to the preset gate control unit according to a corresponding mapping rule, and determine the mapped result as the fusion feature.
Based on this optional implementation, the scheme can use the preset gate control unit to fuse the first image feature and the second image feature respectively output by the image processing model to be iterated and by the initial incremental network, so that the memory function can be better exerted, thereby further alleviating the forgetting problem in the training process of the image processing model.
In some optional implementations of this embodiment, the execution subject may further determine the whole formed by the adjusted initial incremental network and the image processing model to be iterated as a new image processing model to be iterated.
In these implementations, the execution subject may determine, as a whole, the initial incremental network whose network parameters have been adjusted in step S104 and the image processing model to be iterated obtained in step S101 as a new image processing model to be iterated. Thus, steps S101-S104 described above may be continued with the determined new image processing model to be iterated for the next iteration.
Based on this optional implementation, the adjusted incremental network and the image processing model to be iterated are combined to serve as a new image processing model to be iterated, thereby realizing continuous learning with incremental data.
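As a rough sketch of this combination (the fusion weights inside the combined module are assumed, mirroring the earlier weighted-summation illustration), the whole can be wrapped as one module and frozen to serve as the next round's "memory":

    # Treat the fixed base and the adjusted branch as one new model to iterate.
    class Combined(nn.Module):
        def __init__(self, base, branch):
            super().__init__()
            self.base = base
            self.branch = branch

        def forward(self, x):
            # Illustrative fusion mirroring the weighted summation above.
            return 0.5 * self.base(x) + 0.5 * self.branch(x)

    new_model_to_iterate = Combined(base, branch)
    for p in new_model_to_iterate.parameters():  # frozen as the next "memory"
        p.requires_grad = False
    # A fresh incremental branch would then be attached for the next round.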
With continued reference to fig. 2, fig. 2 is a schematic diagram 200 according to a second embodiment of the present disclosure. The incremental learning method based on the image processing model comprises the following steps:
S201, obtaining an image processing model to be iterated and an initial incremental network corresponding to the image processing model to be iterated.
S202, respectively inputting a sample image in the pre-acquired target sample data into the image processing model to be iterated and the initial incremental network to obtain a first image feature and a second image feature which respectively correspond to the image processing model to be iterated and the initial incremental network.
S203, performing a preset first operation on the previous-step iteration output corresponding to the preset gate control unit and the first image feature to generate a first feature.
In this embodiment, the execution subject may perform the preset first operation on the previous-step iteration output corresponding to the preset gate control unit and the first image feature to generate the first feature. The preset gate control unit may include a preset gated recurrent unit. The first feature is generally inversely related to the difference between the previous-step iteration output of the preset gate control unit and the first image feature.
As an example, the previous-step iteration output corresponding to the preset gate control unit may be denoted by fm-1, the first image feature by fo, and the first feature by f1. For example, the preset first operation may be f1 = 1 - |fm-1 - fo|. For another example, the preset first operation may further include inputting the absolute value of the difference between the previous-step iteration output of the preset gate control unit and the first image feature to a preset activation function, for example, f1 = 1 - sigmoid(|fm-1 - fo|).
S204, performing a preset second operation on the first feature and the second image feature to generate a second feature.
In this embodiment, the execution subject may perform the preset second operation on the first feature generated in step S203 and the second image feature obtained in step S202, so as to generate the second feature. As an example, the second image feature may be denoted by fs. The preset second operation may include a non-linear operation.
In some optional implementations of this embodiment, the preset second operation may include multiplication. As an example, the second feature may be denoted by f2. The preset second operation may be f2 = f1 · fs.
S205, performing a preset third operation on the first image feature and the second feature to generate a current-step output corresponding to the preset gate control unit.
In this embodiment, the execution subject may perform the preset third operation on the first image feature obtained in step S202 and the second feature generated in step S204, so as to generate the current-step output corresponding to the preset gate control unit. The current-step output corresponding to the preset gate control unit is usually positively correlated with the first image feature and with the second feature. As an example, the current-step output corresponding to the preset gate control unit may be denoted by fm.
In some optional implementations of this embodiment, the preset third operation may include addition. As an example, the preset third operation may be fm = fo + f2.
S206, generating an output corresponding to the preset gate control unit based on the current-step output corresponding to the preset gate control unit.
In this embodiment, the execution subject may generate the output corresponding to the preset gate control unit in various ways based on the current-step output corresponding to the preset gate control unit. As an example, the execution subject may directly determine the current-step output generated in step S205 as the output corresponding to the preset gate control unit. As another example, the execution subject may determine whether the current step number reaches a preset number of loop steps; in response to determining that it has not been reached, the execution subject may take the current-step output corresponding to the preset gate control unit as the previous-step iteration output for the next step and continue to execute steps S203-S205; and in response to determining that it has been reached, the execution subject may determine the current-step output as the output corresponding to the preset gate control unit.
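Putting S203-S206 together, a minimal sketch under the formulas exemplified above (the number of loop steps and the initialization of the previous-step output are assumptions for illustration) could read:

    # Gated fusion loop: f1 = 1 - sigmoid(|fm-1 - fo|); f2 = f1 * fs;
    # fm = fo + f2; repeated for a preset number of loop steps.
    def gated_fusion(f_o, f_s, num_steps=3):     # num_steps is an assumed preset
        f_m = f_o                                # assumed initial previous-step output
        for _ in range(num_steps):
            f1 = 1.0 - torch.sigmoid((f_m - f_o).abs())  # preset first operation
            f2 = f1 * f_s                                # preset second operation (multiplication)
            f_m = f_o + f2                               # preset third operation (addition)
        return f_m                               # output of the preset gate control unit

    fused = gated_fusion(f_o, f_s)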
S207, generating a fusion feature based on the output corresponding to the preset gate control unit.
S208, adjusting the network parameters of the initial incremental network based on the difference between the fusion feature and the annotation information corresponding to the sample image.
S201, S202, S207, and S208 are respectively consistent with S101, S102, S103, and S104 and their optional implementations in the foregoing embodiment, and the above descriptions of S101, S102, S103, and S104 and their optional implementations also apply to S201, S202, S207, and S208, which are not described herein again.
As can be seen from fig. 2, the flow 200 of the incremental learning method based on the image processing model in this embodiment embodies the step of processing the first image feature and the second image feature by using a preset gated recurrent unit to generate the fusion feature. Therefore, the scheme described in this embodiment can provide another training manner of the incremental-learning-based image processing model, further alleviating the forgetting problem in the model training process and improving the incremental learning performance.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the incremental learning method based on the image processing model according to an embodiment of the present disclosure. In the application scenario of fig. 3, a background server may first obtain a face recognition model 301 to be iterated and an initial incremental network 302 corresponding to the face recognition model 301 to be iterated. Next, the background server inputs a sample face image 303 in the target sample data acquired in advance into the face recognition model 301 to be iterated and the initial incremental network 302, respectively, to obtain a first image feature 304 and a second image feature 305 respectively corresponding to the face recognition model 301 to be iterated and the initial incremental network 302. The background server may then generate a fusion feature 306 by fusing the first image feature 304 and the second image feature 305 described above. Then, based on the difference 307 between the fusion feature 306 and the annotation information (e.g., "Zhang San") corresponding to the sample face image 303, the background server may adjust the network parameters of the initial incremental network 302. Optionally, the background server may further determine the whole formed by the adjusted initial incremental network 302 and the face recognition model 301 to be iterated as a new face recognition model to be iterated, so as to perform continuous learning.
At present, the prior art usually adopts either a regularization-based incremental learning method or a replay-based incremental learning method. However, the former corrects the gradient by introducing an extra loss, and it is difficult to trade off the performance on old tasks against that on new tasks due to the limited capacity of the model; the latter requires additional computing resources and storage space for recalling old knowledge, which results in higher training costs as the number of task categories increases. In the method provided by the embodiments of the present disclosure, the image processing model to be iterated is divided into an original model part whose parameters are retained and an incremental network whose parameters are adjusted; training samples are used to obtain the image features respectively corresponding to the original model part and the incremental network; the obtained image features are fused; and a loss value is determined according to the fused feature and the annotation information to adjust the parameters of the incremental network. A new incremental learning manner is thus provided, which enables the image processing model to overcome forgetting while learning continuously during training, thereby improving the performance of the image processing model.
With continued reference to fig. 4, fig. 4 is a flowchart illustration 400 of an image processing method according to an embodiment of the disclosure. The image processing method comprises the following steps:
S401, acquiring an image to be processed.
In the present embodiment, the execution subject of the image processing method may acquire the image to be processed in various ways. The image to be processed may include various images that can be processed by a neural network model, and is not limited herein.
In this embodiment, as an example, the image to be processed may be an image including a human face. The execution subject may acquire the image to be processed from a local or communicatively connected electronic device.
S402, inputting the image to be processed into a pre-trained image processing model, and generating an image processing result.
In this embodiment, the execution subject may input the to-be-processed image acquired in step S401 to a pre-trained image processing model, and generate an image processing result corresponding to the to-be-processed image. The image processing model can be obtained by training through the incremental learning method based on the image processing model as described in any implementation of the foregoing embodiments. The image processing result may correspond to the image processing model. As an example, when the image processing model is a face recognition model, the image processing result may be used to represent information of a person displayed in a face image. As still another example, when the image processing model is a lane detection model, the image processing result may be used to indicate the position of a lane line displayed in an image.
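Reusing the names from the training sketches above (the checkpoint path and the interpretation of the output are hypothetical), inference might look like:

    # Sketch of S401-S402: run a trained model on an image to be processed.
    model = new_model_to_iterate    # or e.g. torch.load("incremental_model.pt")
    model.eval()
    image = torch.randn(1, 3, 32, 32)     # stand-in for the image to be processed
    with torch.no_grad():
        logits = head(model(image))
        result = logits.argmax(dim=1)     # e.g. index of the recognized identity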
As can be seen from fig. 4, the flow 400 of the image processing method in this embodiment embodies the step of processing the image by using an image processing model trained by the incremental learning method based on the image processing model. Therefore, the scheme described in this embodiment can improve the effect of image processing by using an image processing model in which forgetting has been alleviated.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an incremental learning apparatus based on an image processing model, which corresponds to the method embodiment shown in fig. 1 or fig. 2, and which can be applied in various electronic devices in particular.
As shown in fig. 5, the incremental learning apparatus 500 based on an image processing model provided by the present embodiment includes a first obtaining unit 501, a first generating unit 502, a fusion unit 503, and an adjusting unit 504. The first obtaining unit 501 is configured to obtain an image processing model to be iterated and an initial incremental network corresponding to the image processing model to be iterated; the first generating unit 502 is configured to input a sample image in pre-acquired target sample data to the image processing model to be iterated and the initial incremental network, respectively, to obtain a first image feature and a second image feature respectively corresponding to the image processing model to be iterated and the initial incremental network, where the target sample data includes the sample image and corresponding annotation information; the fusion unit 503 is configured to generate a fusion feature based on the first image feature and the second image feature; and the adjusting unit 504 is configured to adjust the network parameters of the initial incremental network based on the difference between the fusion feature and the annotation information corresponding to the sample image.
In the present embodiment, in the incremental learning apparatus 500 based on an image processing model: the specific processing of the first obtaining unit 501, the first generating unit 502, the fusion unit 503, and the adjusting unit 504 and the technical effects thereof may refer to the related descriptions of steps S101, S102, S103, and S104 in the corresponding embodiment of fig. 1, and are not repeated herein.
In some optional implementations of this embodiment, the fusion unit 503 may include: an output subunit (not shown in the figure) configured to take the first image feature and the second image feature as inputs of a preset gate control unit and obtain an output corresponding to the preset gate control unit; and a fusion subunit (not shown in the figure) configured to generate a fusion feature based on the output corresponding to the preset gate control unit.
In some optional implementations of this embodiment, the preset gate control unit may include a preset gated recurrent unit. The output subunit may include: a first generating module (not shown in the figure) configured to perform a preset first operation on the previous-step iteration output corresponding to the preset gate control unit and the first image feature to generate a first feature, where the first feature is inversely related to the difference between the previous-step iteration output corresponding to the preset gate control unit and the first image feature; a second generating module (not shown in the figure) configured to perform a preset second operation on the first feature and the second image feature to generate a second feature; a third generating module (not shown in the figure) configured to perform a preset third operation on the first image feature and the second feature to generate a current-step output corresponding to the preset gate control unit; and a fourth generating module (not shown in the figure) configured to generate an output corresponding to the preset gate control unit based on the current-step output corresponding to the preset gate control unit.
In some optional implementations of this embodiment, the first generating module may be further configured to: determine a difference value between the previous-step iteration output of the preset gate control unit and the first image feature; generate an output value corresponding to the difference value by using a preset activation function; and generate the first feature based on the inverse of the output value.
In some optional implementations of this embodiment, the preset second operation may include multiplication.
In some optional implementations of the present embodiment, the preset third operation may include addition.
In some optional implementations of this embodiment, the incremental learning apparatus based on an image processing model may further include: a determining unit (not shown in the figure) configured to determine the whole formed by the adjusted initial incremental network and the image processing model to be iterated as a new image processing model to be iterated.
In the apparatus provided by the above embodiment of the present disclosure, the first obtaining unit 501 obtains the original model part for parameter retention and the incremental network for parameter adjustment that together form the image processing model to be iterated; the first generating unit 502 uses training samples to obtain the image features respectively corresponding to the original model part and the incremental network; the fusion unit 503 fuses the obtained image features; and the adjusting unit 504 determines a loss value according to the fused feature and the annotation information to adjust the parameters of the incremental network. A new incremental learning manner is thereby provided, which enables the image processing model to overcome forgetting while learning continuously during training, thereby further improving the performance of the image processing model.
With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an image processing apparatus, which corresponds to the method embodiment shown in fig. 4, and which can be applied in various electronic devices in particular.
As shown in fig. 6, the image processing apparatus 600 provided in the present embodiment includes a second acquiring unit 601 and a second generating unit 602. The second acquiring unit 601 is configured to acquire an image to be processed; the second generating unit 602 is configured to input the image to be processed to a pre-trained image processing model and generate an image processing result, where the image processing model is trained according to the incremental learning method based on the image processing model described in any implementation in the foregoing embodiments.
In the present embodiment, in the image processing apparatus 600: the specific processing of the second obtaining unit 601 and the second generating unit 602 and the technical effects thereof can refer to the related descriptions of steps S401 and S402 in the corresponding embodiment of fig. 4, which are not repeated herein.
In the apparatus provided by the above embodiment of the present disclosure, the second generating unit 602 processes the to-be-processed image acquired by the second acquiring unit 601 by using an image processing model trained by the incremental learning method based on the image processing model. Therefore, the scheme described in this embodiment can improve the effect of image processing by using an image processing model in which forgetting has been alleviated.
In the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the provisions of relevant laws and regulations, necessary confidentiality measures have been taken, and public order and good customs are not violated.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 executes the respective methods and processes described above, such as the incremental learning method based on the image processing model or the image processing method. For example, in some embodiments, the incremental learning method based on the image processing model or the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded onto and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the incremental learning method based on the image processing model or the image processing method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured by any other suitable means (e.g., by means of firmware) to perform the incremental learning method based on the image processing model or the image processing method.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.