CN109685723A - A kind of multimedia data information processing method - Google Patents

A kind of multimedia data information processing method

Info

Publication number
CN109685723A
Authority
CN
China
Prior art keywords
image
convolution
prediction model
network prediction
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811306628.9A
Other languages
Chinese (zh)
Inventor
林路路
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jianhu Yunfei Data Technology Co Ltd
Original Assignee
Jianhu Yunfei Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jianhu Yunfei Data Technology Co Ltd filed Critical Jianhu Yunfei Data Technology Co Ltd
Priority to CN201811306628.9A priority Critical patent/CN109685723A/en
Publication of CN109685723A publication Critical patent/CN109685723A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/90
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Image Analysis (AREA)

Abstract

The present invention provides a multimedia data information processing method, the method comprising: obtaining a first neural network prediction model and processing a specified image according to the first neural network prediction model to obtain an intermediate feature map; processing the intermediate feature map successively through the N deconvolution blocks of the first neural network prediction model to obtain a first feature image; obtaining a second neural network prediction model and processing the gradient map of the specified image according to the second neural network prediction model to obtain a second feature image of the specified image; and obtaining a third neural network prediction model and, according to the third neural network prediction model, obtaining a processed image from the input first feature image and second feature image. Because the processed image can effectively restore the scene-information features and edge-information features of the specified image, the present invention can improve the efficiency of image enhancement processing.

Description

A kind of multimedia data information processing method
Technical field
The present invention relates to the field of multimedia technology, and in particular to a multimedia data information processing method.
Background technique
Image enhancement adds information to, or transforms the data of, an original image by some means so as to selectively emphasize features of interest in the image or suppress unwanted features, making the image better match the response characteristics of the human eye. Image enhancement techniques can be divided into two broad classes according to the space in which the enhancement operates: spatial-domain algorithms and frequency-domain algorithms. Spatial-domain algorithms operate directly on the grey levels of the image, whereas frequency-domain algorithms modify the transform coefficients of the image in some transform domain and thus enhance the image indirectly. Existing image enhancement techniques consider only the image as a whole when processing it, which results in low processing efficiency.
Summary of the invention
The present invention provides a multimedia data information processing method, whose technical solution is as follows:
A first neural network prediction model is obtained. The first neural network prediction model comprises N convolution blocks and N deconvolution blocks in one-to-one correspondence with the N convolution blocks, N being a positive integer. Each convolution block comprises a plurality of convolution processing layers; any two convolution processing layers belonging to the same convolution block have the same scale, while any two convolution processing layers belonging to different convolution blocks have different scales. The number of convolution processing layers in each deconvolution block is equal to the number of convolution processing layers in the corresponding convolution block, and the scale of the convolution processing layers in each deconvolution block is the same as the scale of the convolution processing layers in the corresponding convolution block;
The specified image is processed successively through the N convolution blocks of the first neural network prediction model to obtain an intermediate feature map; the intermediate feature map is then processed successively through the N deconvolution blocks of the first neural network prediction model to obtain a first feature image;
A second neural network prediction model is obtained. The second neural network prediction model comprises M convolution blocks, M deconvolution blocks in one-to-one correspondence with the M convolution blocks, and a recurrent neural network model, M being a positive integer. Each convolution block comprises a plurality of convolution processing layers; any two convolution processing layers belonging to the same convolution block have the same scale, while any two convolution processing layers belonging to different convolution blocks have different scales. The number and the scale of the convolution processing layers in each deconvolution block are equal to those of the convolution processing layers in the corresponding convolution block;
The gradient map of the specified image is processed successively through the M convolution blocks of the second neural network prediction model to obtain an intermediate gradient feature map, and the intermediate gradient feature map is processed successively through the M deconvolution blocks to obtain a first feature map. After the weight values of the recurrent neural network model in different gradient directions are determined, the first feature map is processed by the recurrent neural network model with the determined weight values to obtain a second feature image of the specified image;
A third neural network prediction model is obtained, the third neural network prediction model comprising a plurality of convolutional layers;
According to the third neural network prediction model, a processed image is obtained from the input first feature image and second feature image.
Preferably, the multimedia data information processing method further comprises:
constructing a training loss function based on the K groups of training images, the training loss function comprising at least one of a perceptual loss function and an adversarial loss function;
wherein training the enhancement processing model according to the loss function comprises:
superposing the loss function and the training loss function to obtain a superposed function;
training the enhancement processing model based on the superposed function;
obtaining K groups of training images, each group comprising a normal-illumination image and a low-light image corresponding to the normal-illumination image, K being an integer greater than 1, the low-light image being the image obtained by processing the normal-illumination image with a gamma correction function;
constructing, based on the K groups of training images, a loss function L_mse(θ) that satisfies:

L_mse(θ) = (1/K) * Σ_{k=1}^{K} ||f(L_k, θ) - R_k||²

where L_k and R_k are respectively the low-light image and the normal-illumination image in the k-th group of training images; f is the enhancement processing model on which the image enhancement method relies, the enhancement processing model being composed of the first neural network prediction model, the second neural network prediction model and the third neural network prediction model; f(L_k, θ) is the processed image obtained after enhancement processing of the low-light image L_k in the k-th group of training images; θ is the set of parameters of the enhancement processing model; and k is a positive integer not greater than K;
training the enhancement processing model according to the loss function to obtain the first neural network prediction model, the second neural network prediction model and the third neural network prediction model.
The embodiment of the present invention provides a multimedia data information processing method that can obtain a first feature image of the specified image according to a first neural network prediction model, the first feature image reflecting the scene information of the specified image; can obtain a second feature image of the specified image according to a second neural network prediction model; and can finally fuse the first feature image and the second feature image to obtain a processed image. Because the processed image can effectively restore the scene-information features and edge-information features of the specified image, the efficiency of image enhancement processing can be improved.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a flowchart of a multimedia data information processing method provided by an embodiment of the present invention.
Specific embodiment
To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Referring to Fig. 1, an embodiment of the present invention provides a multimedia data information processing method that mainly comprises the following steps:
Step 1: obtain a first neural network prediction model. The first neural network prediction model comprises N convolution blocks and N deconvolution blocks in one-to-one correspondence with the N convolution blocks, N being a positive integer. Each convolution block comprises a plurality of convolution processing layers; any two convolution processing layers belonging to the same convolution block have the same scale, while any two convolution processing layers belonging to different convolution blocks have different scales. The number of convolution processing layers in each deconvolution block is equal to the number of convolution processing layers in the corresponding convolution block, and the scale of the convolution processing layers in each deconvolution block is the same as that of the convolution processing layers in the corresponding convolution block.
Step 2: process the specified image successively through the N convolution blocks of the first neural network prediction model to obtain an intermediate feature map; then process the intermediate feature map successively through the N deconvolution blocks of the first neural network prediction model to obtain a first feature image.
Step 3: obtain a second neural network prediction model. The second neural network prediction model comprises M convolution blocks, M deconvolution blocks in one-to-one correspondence with the M convolution blocks, and a recurrent neural network model, M being a positive integer. Each convolution block comprises a plurality of convolution processing layers; any two convolution processing layers belonging to the same convolution block have the same scale, while any two convolution processing layers belonging to different convolution blocks have different scales. The number and the scale of the convolution processing layers in each deconvolution block are equal to those of the convolution processing layers in the corresponding convolution block.
Step 4: process the gradient map of the specified image successively through the M convolution blocks of the second neural network prediction model to obtain an intermediate gradient feature map; process the intermediate gradient feature map successively through the M deconvolution blocks of the second neural network prediction model to obtain a first feature map; then, after the weight values of the recurrent neural network model in different gradient directions are determined, process the first feature map with the recurrent neural network model using the determined weight values to obtain a second feature image of the specified image.
Step 5: obtain a third neural network prediction model, the third neural network prediction model comprising a plurality of convolutional layers.
Step 6: according to the third neural network prediction model, obtain a processed image from the input first feature image and second feature image.
In a preferred embodiment, the multimedia data information processing method further comprises:
Step 7: construct a training loss function based on the K groups of training images, the training loss function comprising at least one of a perceptual loss function and an adversarial loss function.
Step 8: train the enhancement processing model according to the loss function, which comprises:
Step 9: superpose the loss function and the training loss function to obtain a superposed function.
Step 10: train the enhancement processing model based on the superposed function.
Step 11: obtain K groups of training images, each group comprising a normal-illumination image and a low-light image corresponding to the normal-illumination image, K being an integer greater than 1, the low-light image being the image obtained by processing the normal-illumination image with a gamma correction function.
Step 12: construct, based on the K groups of training images, a loss function L_mse(θ) that satisfies:

L_mse(θ) = (1/K) * Σ_{k=1}^{K} ||f(L_k, θ) - R_k||²

where L_k and R_k are respectively the low-light image and the normal-illumination image in the k-th group of training images; f is the enhancement processing model on which the image enhancement method relies, the enhancement processing model being composed of the first neural network prediction model, the second neural network prediction model and the third neural network prediction model; f(L_k, θ) is the processed image obtained after enhancement processing of the low-light image L_k in the k-th group of training images; θ is the set of parameters of the enhancement processing model; and k is a positive integer not greater than K.
Step 13: train the enhancement processing model according to the loss function to obtain the first neural network prediction model, the second neural network prediction model and the third neural network prediction model.
The multimedia data information processing method provided by the embodiment of the present invention can extract the first feature image and the second feature image of the specified image according to the corresponding network prediction models. The first feature image can be used to reflect the scene-content features of the specified image, and the second feature image can reflect the edge-information features of the specified image. Therefore, the processed image obtained by fusing the first feature image and the second feature image can restore the true content of the scene while also effectively restoring the detailed edge features of the specified image, and the processing effect is good.
Specifically, in Step 1 and Step 3, the first neural network prediction model can be a network model based on a convolutional neural network. The first feature image can be used to reflect the scene information of the specified image, that is, which photographed objects the specified image actually contains. The second neural network prediction model can be a network model based on a convolutional neural network and a recurrent neural network, and the second feature image can be used to reflect the edge features of the specified image. Here, the edges of an image are the places where some characteristic of the image (such as pixel grey level or texture) changes discontinuously; edges generally exist between an object and the background, or between one object and another.
The first neural network prediction model can be a network model based on a convolutional neural network and may include N convolution blocks and N deconvolution blocks in one-to-one correspondence with the N convolution blocks, where N can be a positive integer, for example an integer greater than 1. Each convolution block may include a plurality of convolution processing layers, and each convolution processing layer can be a convolutional layer, a dilated convolutional layer, a down-sampling convolutional layer, or the like. Moreover, any two convolution processing layers belonging to the same convolution block have the same scale, and any two convolution processing layers belonging to different convolution blocks have different scales.
The number of convolution processing layers in each deconvolution block is equal to the number of convolution processing layers in the corresponding convolution block, and their scales are identical. Feature maps obtained by processing the specified image with convolution processing layers of different scales differ in size (i.e., in resolution), and the scale of each convolution processing layer can be determined by the size of its convolution kernel.
When the specified image is processed according to the first neural network prediction model, it can first be processed successively through the N convolution blocks to obtain the intermediate feature map; the intermediate feature map is then processed successively through the N deconvolution blocks, finally yielding the first feature image. Because the N convolution blocks and N deconvolution blocks of the first neural network prediction model correspond one to one, after the specified image is processed according to the first neural network prediction model, the resolution of the resulting first feature image is identical to that of the specified image.
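The resolution bookkeeping described above can be sketched numerically. The toy functions below are not the patent's actual layers; they merely stand in for a convolution block that halves the resolution and a deconvolution block that doubles it, showing why N matched stages return a first feature image the same size as the input:

```python
import numpy as np

def downsample(x):
    # stand-in for a convolution block: halve height and width
    return x[::2, ::2]

def upsample(x):
    # stand-in for a deconvolution block: double height and width
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

image = np.random.rand(64, 64)              # the "specified image"
feat = image
for _ in range(3):                          # N = 3 convolution blocks
    feat = downsample(feat)                 # 8x8 "intermediate feature map"
for _ in range(3):                          # N = 3 matching deconvolution blocks
    feat = upsample(feat)

print(feat.shape)                           # same resolution as the input
```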
For example, the first neural network prediction model may include three convolution blocks and three deconvolution blocks. The first convolution block may include two dilated convolutional layers and one convolutional layer connected in sequence; the second convolution block may include two convolutional layers and one down-sampling convolutional layer connected in sequence; and the third convolution block may include three convolutional layers connected in sequence. Setting two dilated convolutional layers in the first convolution block can effectively enlarge the receptive field of the first feature image output by the first neural network prediction model. In a convolutional neural network, the receptive field refers to the size of the region of the original image onto which a pixel of the feature map output by the network model is mapped.
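The receptive-field effect of the dilated layers follows from standard receptive-field arithmetic for stride-1 convolution stacks, in which a layer with kernel size k and dilation d enlarges the receptive field by (k - 1) * d. The layer lists below are hypothetical, since the patent does not state kernel sizes:

```python
def receptive_field(layers):
    # layers: list of (kernel_size, dilation) pairs; stride 1 is assumed,
    # so each layer adds (k - 1) * d to the receptive field
    rf = 1
    for k, d in layers:
        rf += (k - 1) * d
    return rf

plain = [(3, 1), (3, 1), (3, 1)]      # three ordinary 3x3 convolutions
dilated = [(3, 2), (3, 2), (3, 1)]    # two dilated 3x3 layers plus one plain

print(receptive_field(plain))         # 7
print(receptive_field(dilated))       # 11: dilation widens the view at no cost
```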
The first deconvolution block may include three convolutional layers connected in sequence; the second deconvolution block may include one deconvolution layer and two convolutional layers connected in sequence; and the third deconvolution block may include one deconvolution layer and two convolutional layers connected in sequence.
The second neural network prediction model can be a model based on a convolutional neural network and a recurrent neural network; it may include M convolution blocks, M deconvolution blocks in one-to-one correspondence with the M convolution blocks, a down-sampling model and a recurrent neural network model, where M can be a positive integer, for example an integer greater than 1.
Among the M convolution blocks, each convolution block may include a plurality of convolution processing layers; any two convolution processing layers belonging to the same convolution block have the same scale, and any two convolution processing layers belonging to different convolution blocks have different scales. Among the M deconvolution blocks, the number of convolution processing layers in each deconvolution block is equal to the number of convolution processing layers in the corresponding convolution block, and their scales are identical. The down-sampling model may include down-sampling convolutional layers of a plurality of different scales.
The M convolution blocks of the second neural network prediction model may also correspond one to one with the N convolution blocks of the first neural network prediction model, and the M deconvolution blocks of the second neural network prediction model may also correspond one to one with the N deconvolution blocks of the first neural network prediction model, the scales of every two corresponding convolution blocks being identical and the scales of every two corresponding deconvolution blocks being identical.
In Step 4, the gradient map of the specified image is processed according to the second neural network prediction model to obtain the second feature image of the specified image. The gradient map of the specified image can be obtained first; this gradient map reflects the grey-level change in the neighborhood of each pixel of the specified image. After the terminal processes the gradient map according to the second neural network prediction model, the second feature image of the specified image can be obtained.
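A gradient map of the kind used here can be computed with simple finite differences; the patent does not fix the operator, so the version below, using numpy's differences and the gradient magnitude, is only one plausible choice:

```python
import numpy as np

img = np.array([[0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.]])        # image with a vertical 0 -> 1 edge

gy, gx = np.gradient(img)                 # grey-level change per pixel
grad_map = np.hypot(gx, gy)               # gradient magnitude in each neighborhood

print(grad_map[0])                        # nonzero only around the edge
```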
In Step 6, according to the third neural network prediction model, the first feature image and the second feature image are fused to obtain the processed image. The third neural network prediction model can be a model based on a convolutional neural network. The processed image that the terminal obtains by fusing the first feature image and the second feature image through the third neural network prediction model can not only restore the scene information of the specified image but also effectively restore its edge features; the processed image therefore has a good visual effect, which effectively improves the processing effect of the multimedia data information processing method.
Layered processing is performed on the first feature map and on second feature maps of a plurality of different scales to determine the weight values of the recurrent neural network model in different gradient directions. The gradient directions may include the four directions up, down, left and right.
Second feature maps of different scales provide different edge information: a second feature map of small resolution can provide strong edge information, i.e. the detailed texture information of the image, while a second feature map of large resolution can provide fine edge information, i.e. the overall contour and structural information of the image. Therefore, during the layered processing, the features provided by the second feature maps of the plurality of different scales can be used to constrain the weight values of the different gradient directions in the recurrent neural network (RNN) model, thereby determining the weight values of the RNN model in the different gradient directions. In general, the larger the pixel value of a pixel in the second feature map, the larger the corresponding weight value in the RNN model.
Based on the recurrent neural network model with the determined weight values, the first feature map is processed to obtain the second feature image of the specified image. Among the weight values of the different gradient directions extracted by the recurrent neural network model, the largest weight value can be chosen as the final gradient-direction indication result, and based on this result the recurrent neural network model can be controlled to process the first feature map, obtaining the second feature image of the specified image.
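The selection of the largest direction weight can be sketched as a per-pixel argmax over four direction-weight maps; the shapes below are illustrative, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(7)
weights = rng.random((4, 8, 8))       # weights for up/down/left/right on an 8x8 map

direction = weights.argmax(axis=0)    # winning gradient direction per pixel
max_weight = weights.max(axis=0)      # the weight value actually used

print(direction.shape, max_weight.shape)
```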
The first feature image and the second feature image are superposed, and the superposed result can be processed according to the third neural network prediction model, which is based on a convolutional neural network, to obtain the processed image.
The superposed image is processed according to a plurality of convolutional layers to obtain the processed image. The third neural network prediction model may include a plurality of convolutional layers, for example two convolutional layers. The terminal can process the superposed image according to these convolutional layers to obtain the processed image. The processed image can effectively reflect the scene information and edge features of the specified image, so the visual effect of the processed image is good.
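The superposition-then-convolution step can be imitated in a few lines: the two feature images are stacked along a channel axis and mixed by a hand-written 1x1 "convolution". The 0.6/0.4 weights are purely illustrative; in the patent they would be learned by the third neural network prediction model:

```python
import numpy as np

first = np.random.rand(16, 16)        # scene-information feature image
second = np.random.rand(16, 16)       # edge-information feature image

stacked = np.stack([first, second])   # superposed input, shape (2, 16, 16)
mix = np.array([0.6, 0.4])            # toy 1x1 convolution weights (assumed)
fused = np.tensordot(mix, stacked, axes=1)

print(fused.shape)                    # (16, 16): the processed image
```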
The number and types of the convolution processing layers on which each convolution block or deconvolution block of the first and second neural network prediction models relies can be adjusted according to the actual situation. For example, the first convolution block of the first neural network prediction model may instead be provided with one dilated convolutional layer, or with three or more dilated convolutional layers, and the third convolution block may be provided with four convolutional layers. This is not limited in the embodiments of the present invention.
This embodiment also discloses the steps of training the enhancement processing model, which comprise:
1. Obtain K groups of training images, each group comprising a normal-illumination image and a low-light image corresponding to the normal-illumination image. Here K is an integer greater than 1. The normal-illumination images in the K groups of training images can be relatively clear, higher-brightness images screened manually by developers, or they can be images screened by machine whose brightness and contrast satisfy a preset condition. The low-light image in each group of training images is the image obtained by processing the normal-illumination image in that group with a gamma correction function, so the normal-illumination image and the low-light image in each group can also be called a matched pair.
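Synthesizing a low-light training partner from a normal-illumination image with a gamma curve can be done in one line; the gamma value below is illustrative, as the patent does not fix it:

```python
import numpy as np

normal = np.linspace(0.0, 1.0, 5)     # normal-illumination intensities in [0, 1]
gamma = 2.5                           # assumed gamma-correction exponent
low_light = normal ** gamma           # darkens mid-tones; 0 and 1 stay unchanged

print(low_light.round(3))
```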
2. Construct a loss function based on the K groups of training images.
The loss function L_mse(θ) can satisfy:

L_mse(θ) = (1/K) * Σ_{k=1}^{K} ||f(L_k, θ) - R_k||²
where || || denotes the two-norm; L_k and R_k are respectively the low-light image and the normal-illumination image in the k-th group of training images; f is the enhancement processing model on which the image enhancement method provided by the embodiment of the present invention relies, and, as noted above, this enhancement processing model can be composed of the first neural network prediction model, the second neural network prediction model and the third neural network prediction model; f(L_k, θ) is the processed image obtained after enhancement processing of the low-light image L_k in the k-th group of training images; θ is the set of parameters of the enhancement processing model; and k is a positive integer not greater than K.
f(L_k, θ) - R_k can refer to the difference between the pixel values of corresponding pixels in the two images. It follows that the loss function is the mean square error between the processed image obtained according to the enhancement processing model and the true normal-illumination image.
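That mean square error can be written out directly. The sketch below treats each group's images as small arrays and averages the squared two-norm of the pixel difference over the K groups, matching the description above:

```python
import numpy as np

def mse_loss(enhanced, reference):
    # mean over the K groups of the squared two-norm of the pixel difference
    return np.mean([np.sum((e - r) ** 2) for e, r in zip(enhanced, reference)])

R = [np.ones((2, 2)), np.zeros((2, 2))]           # normal-illumination images R_k
out = [np.full((2, 2), 0.5), np.zeros((2, 2))]    # enhanced outputs f(L_k, theta)

print(mse_loss(out, R))    # 0.5: the first pair contributes 4 * 0.25, the second 0
```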
3. Construct a training loss function based on the K groups of training images.
This step specifically includes:
a. Choose one group of target training images from the K groups of training images.
For example, the terminal can randomly select one group of training images as the target training images.
b. Perform enhancement processing on the target low-light image in the target training images according to the enhancement processing model, obtaining the processed image corresponding to the target low-light image.
The processed image obtained after the enhancement processing model f performs enhancement processing on the target low-light image L can be expressed as f(L, θ).
c. Construct a perceptual loss function based on the target training images.
In the embodiment of the present invention, the terminal can process the target normal-illumination image R in the target training images and the processed image f(L, θ) corresponding to the target low-light image L respectively according to a preset neural network model ψ, and construct a perceptual loss function L_per. The perceptual loss function L_per can satisfy:

L_per = (1 / (W_{i,j} * H_{i,j})) * ||ψ_{i,j}(R) - ψ_{i,j}(f(L, θ))||²

where ψ_{i,j} denotes the feature map extracted by the j-th convolutional layer after the i-th pooling layer of the preset neural network model ψ, and W_{i,j} and H_{i,j} are respectively the length and the width of each feature map in the preset neural network model.
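A toy version of the perceptual loss makes the normalization concrete. Here a 2x2 average pooling stands in for the feature extractor ψ (in the patent ψ is a preset neural network), and the squared feature difference is divided by the feature map's width times height:

```python
import numpy as np

def psi(x):
    # hypothetical feature extractor: 2x2 average pooling
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def perceptual_loss(processed, reference):
    fp, fr = psi(processed), psi(reference)
    h, w = fp.shape
    return np.sum((fp - fr) ** 2) / (w * h)

R = np.ones((4, 4)) * 0.2                 # target normal-illumination image
out = np.zeros((4, 4))                    # processed image f(L, theta)
print(perceptual_loss(out, R))            # uniform feature gap of 0.2 -> 0.04
```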
In the embodiment of the present invention, in order to further improve the processing effect of the enhancement processing model, the terminal can also construct other training loss functions; the training loss function may include at least one of a perceptual loss function and an adversarial loss function.
4. Superpose the loss function and the training loss function to obtain a superposed function.
Suppose the training loss function includes both a perceptual loss function and an adversarial loss function. Correspondingly, the terminal can then superpose the loss function, the perceptual loss function and the adversarial loss function to obtain the superposed function.
5. Train the enhancement processing model based on the superposed function.
Finally, the terminal may train the enhancement processing model based on the superposed function to adjust the parameters θ in the enhancement processing model, thereby obtaining the first neural network prediction model, the second neural network prediction model, and the third neural network prediction model. The trained enhancement processing model can effectively fit the K groups of training images. During training, the parameters in the enhancement processing model may be updated by back-propagation until the superposed function converges.
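The training procedure of step 5 — update the parameters by back-propagated gradients until the superposed function converges — can be sketched with a one-parameter stand-in model f(L, θ) = θ·L. The learning rate, tolerance, and synthetic data below are assumptions for illustration, not values from the patent:

```python
def train(pairs, theta=0.0, lr=0.01, tol=1e-12, max_steps=10000):
    """Gradient descent on theta until the loss stops changing (convergence)."""
    prev = float("inf")
    for _ in range(max_steps):
        # superposed function: here just the MSE term over all K pairs
        loss = sum((theta * L - R) ** 2 for L, R in pairs) / len(pairs)
        if abs(prev - loss) < tol:   # convergence test on the loss value
            break
        # analytic gradient stands in for back-propagation
        grad = sum(2 * (theta * L - R) * L for L, R in pairs) / len(pairs)
        theta -= lr * grad
        prev = loss
    return theta

# K = 3 synthetic (low-light, normal-illumination) scalar pairs with R = 2 * L
pairs = [(0.1, 0.2), (0.3, 0.6), (0.5, 1.0)]
theta = train(pairs)
assert abs(theta - 2.0) < 1e-2   # theta converges toward the true factor
```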
The foregoing descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (2)

1. A multimedia data information processing method, characterized in that the method comprises:
obtaining a first neural network prediction model, wherein the first neural network prediction model comprises N convolution blocks and N deconvolution blocks in one-to-one correspondence with the N convolution blocks, N being a positive integer; each convolution block comprises a plurality of convolution processing layers; any two convolution processing layers belonging to a same convolution block have a same scale, and any two convolution processing layers belonging to different convolution blocks have different scales; the number of convolution processing layers comprised in each deconvolution block is equal to the number of convolution processing layers comprised in the corresponding convolution block, and the scale of the convolution processing layers comprised in each deconvolution block is the same as the scale of the convolution processing layers comprised in the corresponding convolution block;
processing a specified image sequentially through the N convolution blocks of the first neural network prediction model to obtain an intermediate feature map, and processing the intermediate feature map sequentially through the N deconvolution blocks of the first neural network prediction model to obtain a first feature image;
obtaining a second neural network prediction model, wherein the second neural network prediction model comprises M convolution blocks, M deconvolution blocks in one-to-one correspondence with the M convolution blocks, and a recurrent neural network model, M being a positive integer; each convolution block comprises a plurality of convolution processing layers; any two convolution processing layers belonging to a same convolution block have a same scale, and any two convolution processing layers belonging to different convolution blocks have different scales; the number of convolution processing layers comprised in each deconvolution block is equal to the number of convolution processing layers comprised in the corresponding convolution block, and the scale of the convolution processing layers comprised in each deconvolution block is the same as the scale of the convolution processing layers comprised in the corresponding convolution block;
processing a gradient map of the specified image sequentially through the M convolution blocks of the second neural network prediction model to obtain an intermediate gradient feature map; processing the intermediate gradient feature map sequentially through the M deconvolution blocks of the second neural network prediction model to obtain a first feature map; and, after determining weight values of the recurrent neural network model in different gradient directions, processing the first feature map based on the recurrent neural network model with the determined weight values to obtain a second feature image of the specified image;
obtaining a third neural network prediction model, wherein the third neural network prediction model comprises a plurality of convolutional layers; and
obtaining, according to the third neural network prediction model, a processed image from the input first feature image and second feature image.
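The mirrored convolution/deconvolution structure recited in claim 1 can be illustrated by tracking spatial scales through the blocks. This is a sketch under the assumption that each convolution block halves the spatial scale and each mirrored deconvolution block restores the corresponding scale; the particular sizes are examples, not fixed by the patent:

```python
def conv_block_scales(input_size, n_blocks):
    """Spatial size after each of the N convolution blocks.

    All layers inside one block share a scale; the scale changes only
    between blocks (here, assumed to halve per block)."""
    sizes, size = [], input_size
    for _ in range(n_blocks):
        size //= 2
        sizes.append(size)
    return sizes

def deconv_block_scales(conv_sizes, input_size):
    """The N deconvolution blocks mirror the convolution blocks,
    restoring each corresponding scale back up to the input size."""
    return conv_sizes[-2::-1] + [input_size]

N = 3
down = conv_block_scales(64, N)
up = deconv_block_scales(down, 64)
assert down == [32, 16, 8]
assert up == [16, 32, 64]
assert len(down) == len(up) == N   # one-to-one block correspondence
```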
2. The method according to claim 1, characterized in that the method further comprises:
constructing a training loss function based on K groups of training images, wherein the training loss function comprises at least one of a perceptual loss function and an adversarial loss function;
wherein the training of the enhancement processing model according to the loss function comprises:
superimposing the loss function and the training loss function to obtain a superposed function; and
training the enhancement processing model based on the superposed function;
obtaining the K groups of training images, wherein each group of training images comprises a normal-illumination image and a low-light image corresponding to the normal-illumination image, K being an integer greater than 1, and the low-light image being an image obtained by processing the normal-illumination image with a Gamma correction function;
constructing a loss function based on the K groups of training images, wherein the loss function L_mse(θ) satisfies:

L_mse(θ) = (1/K) · Σ_{k=1}^{K} || f(L_k, θ) − R_k ||²

wherein L_k and R_k are, respectively, the low-light image and the normal-illumination image in the k-th group of training images; f is the enhancement processing model on which the image enhancement method is based, the enhancement processing model being composed of the first neural network prediction model, the second neural network prediction model, and the third neural network prediction model; f(L_k, θ) is the processed image obtained after the low-light image L_k in the k-th group of training images undergoes enhancement processing; θ denotes the parameters in the enhancement processing model; and k is a positive integer not greater than K; and
training the enhancement processing model according to the loss function, to obtain the first neural network prediction model, the second neural network prediction model, and the third neural network prediction model.
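The data construction and base loss recited in claim 2 can be sketched as follows: a Gamma correction L = Rᵞ synthesises each low-light image from its normal-illumination counterpart (for pixel values in [0, 1], an exponent γ > 1 darkens the image), and L_mse(θ) averages the squared reconstruction error over the K groups. The exponent γ = 3 and the stand-in enhancement models are assumed example values, not values from the patent:

```python
import numpy as np

def gamma_lowlight(normal_img, gamma=3.0):
    """Synthesise a low-light image L from R via Gamma correction."""
    return np.clip(normal_img, 0.0, 1.0) ** gamma

def l_mse(f, theta, pairs):
    """L_mse(theta) = (1/K) * sum_k || f(L_k, theta) - R_k ||^2."""
    return sum(float(np.sum((f(L, theta) - R) ** 2))
               for L, R in pairs) / len(pairs)

# K = 2 groups of synthetic training images
rng = np.random.default_rng(0)
normals = [rng.random((4, 4)) for _ in range(2)]
pairs = [(gamma_lowlight(R), R) for R in normals]

identity = lambda L, theta: L                 # trivial stand-in model
perfect = lambda L, theta: L ** (1.0 / 3.0)   # exactly inverts gamma = 3
assert l_mse(perfect, None, pairs) < 1e-12    # perfect model has ~zero loss
assert l_mse(identity, None, pairs) > l_mse(perfect, None, pairs)
```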
CN201811306628.9A 2018-11-05 2018-11-05 A kind of multimedia data information processing method Pending CN109685723A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811306628.9A CN109685723A (en) 2018-11-05 2018-11-05 A kind of multimedia data information processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811306628.9A CN109685723A (en) 2018-11-05 2018-11-05 A kind of multimedia data information processing method

Publications (1)

Publication Number Publication Date
CN109685723A true CN109685723A (en) 2019-04-26

Family

ID=66184555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811306628.9A Pending CN109685723A (en) 2018-11-05 2018-11-05 A kind of multimedia data information processing method

Country Status (1)

Country Link
CN (1) CN109685723A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111079744A (en) * 2019-12-06 2020-04-28 鲁东大学 Intelligent vehicle license plate identification method and device suitable for complex illumination environment
CN111079744B (en) * 2019-12-06 2020-09-01 鲁东大学 Intelligent vehicle license plate identification method and device suitable for complex illumination environment

Similar Documents

Publication Publication Date Title
CN111798400B (en) Non-reference low-illumination image enhancement method and system based on generation countermeasure network
CN111292264B (en) Image high dynamic range reconstruction method based on deep learning
CN107123089B (en) Remote sensing image super-resolution reconstruction method and system based on depth convolution network
CN110334805B (en) JPEG domain image steganography method and system based on generation countermeasure network
CN110555465B (en) Weather image identification method based on CNN and multi-feature fusion
CN109064396A (en) A kind of single image super resolution ratio reconstruction method based on depth ingredient learning network
CN108416745B (en) Image self-adaptive defogging enhancement method with color constancy
CN109035260A (en) A kind of sky areas dividing method, device and convolutional neural networks
CN109949224B (en) Deep learning-based cascade super-resolution reconstruction method and device
CN109255758A (en) Image enchancing method based on full 1*1 convolutional neural networks
CN106920221A (en) Take into account the exposure fusion method that Luminance Distribution and details are presented
CN109389569B (en) Monitoring video real-time defogging method based on improved DehazeNet
CN110675462A (en) Gray level image colorizing method based on convolutional neural network
CN108537747A (en) A kind of image repair method based on the convolutional neural networks with symmetrical parallel link
CN111179196B (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN105809643A (en) Image enhancing method based on self-adaptive block channel stretching
CN113420794B (en) Binaryzation Faster R-CNN citrus disease and pest identification method based on deep learning
CN112508812A (en) Image color cast correction method, model training method, device and equipment
Lecca et al. Point‐based spatial colour sampling in Milano‐Retinex: a survey
Tyagi et al. Transformation of Image from Color to Gray Scale using contrast among DPCM and LMS Method
CN105913451B (en) A kind of natural image superpixel segmentation method based on graph model
CN109685723A (en) A kind of multimedia data information processing method
CN110009574A (en) A kind of method that brightness, color adaptively inversely generate high dynamic range images with details low dynamic range echograms abundant
CN104240197B (en) A kind of erasing method for keeping contrast, colour consistency and gray-scale pixels feature
CN111768326B (en) High-capacity data protection method based on GAN (gas-insulated gate bipolar transistor) amplified image foreground object

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190426
