CN112818161B - Method for identifying original image by merging media asset library thumbnail based on deep learning - Google Patents


Info

Publication number
CN112818161B
CN112818161B (application CN202110208085.2A)
Authority
CN
China
Prior art keywords
picture
features
convolution
thumbnail
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110208085.2A
Other languages
Chinese (zh)
Other versions
CN112818161A (en
Inventor
李传咏
陈宁
李贤�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Webber Software Co ltd
Original Assignee
Xi'an Webber Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Webber Software Co ltd filed Critical Xi'an Webber Software Co ltd
Priority to CN202110208085.2A priority Critical patent/CN112818161B/en
Publication of CN112818161A publication Critical patent/CN112818161A/en
Application granted granted Critical
Publication of CN112818161B publication Critical patent/CN112818161B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for identifying an original image from a thumbnail in a converged media asset library based on deep learning, comprising the following steps: S1, inputting preprocessed picture data into the input layer; S2, extracting high-dimensional features with convolution layers, fading out irrelevant factors such as the background through convolution, and using ReLU as the activation function; S3, reducing the size of the matrix through a max pooling layer, which reduces the number of parameters, accelerates computation and prevents over-fitting; S4, converting the fully connected layers into three convolution layers; S5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch; S6, batch preprocessing, scaling each picture to a fixed size; and S10, extracting features, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original image. The invention reduces the dependency of recognition on factors such as picture size, color and deformation, and improves recognition efficiency and accuracy.

Description

Method for identifying original image by merging media asset library thumbnail based on deep learning
Technical Field
The invention relates to the technical field of thumbnail processing, in particular to a method for identifying an original image by a fusion media asset library thumbnail based on deep learning.
Background
With the trend toward media convergence, a large number of picture resources are generated, and managing these resources efficiently poses a serious challenge. At the same time, the need to find an original image from its thumbnail has become urgent in converged media asset library systems, and high-quality thumbnail-to-original identification brings great convenience to users.
The traditional approach to identifying an original image from a thumbnail generally uses a perceptual hash algorithm: the image is first scaled to a fixed-size grayscale image, the grayscale image is then converted into a black-and-white binary image, and each pixel of the binary image is represented by the binary digits 0 and 1 (0 for black, 1 for white), forming a 0-1 feature matrix (also called a fingerprint); finally the original image is identified by comparing the similarity (Hamming distance) of the feature matrices. This algorithm is sensitive to how severely the image has been deformed, and its identification accuracy is easily affected by factors such as image deformation and color, so the accuracy is low.
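The traditional perceptual-hash approach described above can be sketched in a few lines. This is an illustrative average-hash variant run on a tiny hand-made pixel matrix; real systems would first decode and scale the picture, and the helper names here are not from the patent.

```python
def average_hash(gray):
    """Turn a grayscale matrix into a 0/1 fingerprint string.

    Each pixel becomes 1 if it is >= the mean gray value, else 0.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return "".join("1" if p >= mean else "0" for p in pixels)


def hamming_distance(a, b):
    """Number of positions where two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# Two similar 2x4 "images" and one very different one.
img_a = [[10, 20, 200, 210], [15, 25, 205, 215]]
img_b = [[12, 22, 198, 208], [14, 24, 207, 213]]
img_c = [[200, 210, 10, 20], [205, 215, 15, 25]]

h_a, h_b, h_c = (average_hash(i) for i in (img_a, img_b, img_c))
print(hamming_distance(h_a, h_b))  # similar images -> small distance
print(hamming_distance(h_a, h_c))  # different images -> large distance
```

As the patent notes, a crop or color shift changes many pixels relative to the mean, so the fingerprint (and hence the Hamming distance) degrades quickly.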
Therefore, how to provide a method for identifying an original image based on a deep learning merged media asset library thumbnail is an urgent problem to be solved by those skilled in the art.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art. Therefore, the invention aims to provide a method for identifying an original image by a fused media resource library thumbnail based on deep learning, which reduces the dependency of identification on factors such as the size, the color, the deformation and the like of an image, and greatly improves the identification efficiency and the accuracy.
The method for identifying the original image from a thumbnail in a converged media asset library based on deep learning comprises the following specific steps:
s1, inputting the preprocessed picture data into the input layer;
s2, extracting high-dimensional features with convolution layers, fading out irrelevant background factors through convolution, highlighting the main features of the object to be classified, and using ReLU as the activation function;
s3, reducing the size of the matrix through a max pooling layer, which reduces the number of parameters, accelerates computation and prevents over-fitting; the filter size is 2x2 and the stride is 2;
s4, converting the fully connected layers into three convolution layers;
s5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch;
s6, batch preprocessing, scaling each picture to a fixed size for subsequent processing;
s7, extracting features of the batch according to steps S1-S4, storing all extracted features in a feature library, and building an index from each picture to its features to facilitate subsequent search;
s8, judging whether all pictures have been processed; if not, repeating steps S5 to S7 until all pictures are processed, then proceeding to the next step;
s9, receiving a user request to identify the original image from a thumbnail, and scaling the picture P1 to a fixed-size picture P2;
s10, extracting the features of P2, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original picture;
s11, obtaining user feedback and adjusting the model accordingly, optimizing it so that results become more accurate.
Preferably, the convolutional layer extraction high-dimensional feature formula is as follows:
(f * g)(n) = Σ_τ f(τ) g(n − τ)
wherein (f * g)(n) denotes the convolution of f and g.
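The discrete convolution above can be checked numerically. The sketch below implements it directly for finite sequences (indices outside the sequences contribute zero), matching numpy.convolve's default "full" mode.

```python
def convolve(f, g):
    """Discrete convolution (f * g)(n) = sum_t f(t) g(n - t)."""
    n_out = len(f) + len(g) - 1
    out = []
    for n in range(n_out):
        s = 0
        for t in range(len(f)):
            if 0 <= n - t < len(g):  # terms outside g contribute zero
                s += f[t] * g[n - t]
        out.append(s)
    return out


print(convolve([1, 2, 3], [0, 1, 0.5]))  # [0, 1, 2.5, 4, 1.5]
```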
Preferably, S2 specifically comprises the following steps:
s21, adding all-zero padding around the border of the input matrix so that the output matrix keeps the same size as the input matrix;
s22, feeding the padded matrix obtained in S21 into the convolution network;
s23, adding a bias to the result of the convolution operation in S22, and passing the result through the ReLU activation function to the next layer.
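Steps S21-S23 can be sketched for a single channel as below. The "SAME" zero padding, stride 1, and the averaging kernel are illustrative assumptions for a minimal demonstration, not parameters taken from the patent.

```python
import numpy as np

def conv_same_relu(x, kernel, bias):
    """S21-S23 for one channel: zero-pad, convolve, add bias, ReLU."""
    k = kernel.shape[0]              # assume a square kernel with odd size
    p = k // 2
    padded = np.pad(x, p)            # S21: all-zero padding on the border
    h, w = x.shape
    out = np.empty((h, w))
    for i in range(h):               # S22: slide the filter over the input
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return np.maximum(out + bias, 0.0)   # S23: add bias, then ReLU

x = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.full((3, 3), 1 / 9.0)    # simple 3x3 averaging filter
y = conv_same_relu(x, kernel, bias=-1.0)
print(y.shape)  # (4, 4): output size matches the input, as S21 intends
```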
Preferably, steps S1 to S4 constitute the feature extraction stage.
Preferably, the three convolution layers converted from the fully connected layers are one 7x7 convolution and two 1x1 convolutions.
Preferably, the size of the input layer in S1 is selected to be a multiple of 32.
Compared with the prior art, the invention has the beneficial effects that:
(1) The method uses deep learning: with the feature extraction capability of an optimized multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, so that the extracted picture features are more accurate and less affected by factors such as picture deformation, color and cropping. This solves the problems of the traditional perceptual hash algorithm, which has low accuracy and is strongly affected by picture deformation and color. The accuracy of original image identification is greatly improved;
(2) The invention uses a multilayer neural network with 13 convolution layers, 3 fully connected layers and 5 pooling layers. The whole network uses a uniform convolution kernel size (3x3) and max pooling size (2x2); a stack of several small-filter (3x3) convolution layers is used instead of a single large filter (5x5 or 7x7), so that input pictures of any size can be processed while model over-fitting is prevented. With the feature extraction capability of the multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, making the extracted picture features more accurate and greatly improving the accuracy of identifying the original image from the thumbnail.
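The small-filter claim above can be illustrated with a quick count: two stacked 3x3 convolutions see a 5x5 receptive field, and three see 7x7, with fewer weights than the single large filter (counts below are per input/output channel pair, biases ignored).

```python
def receptive_field(num_3x3_layers):
    """Receptive field of a stack of stride-1 3x3 convolutions."""
    # Each extra 3x3 layer grows the receptive field by 2 on each axis total.
    return 1 + 2 * num_3x3_layers

print(receptive_field(2), "=", 5)  # two 3x3 layers cover a 5x5 region
print(receptive_field(3), "=", 7)  # three 3x3 layers cover a 7x7 region

# Weight counts per channel pair: the stacked form is cheaper.
print(2 * 3 * 3, "weights for two 3x3 filters vs", 5 * 5, "for one 5x5")
print(3 * 3 * 3, "weights for three 3x3 filters vs", 7 * 7, "for one 7x7")
```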
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is an overall flowchart of a method for identifying an original image based on a deep learning merged media asset library thumbnail provided by the invention;
fig. 2 is a schematic structural diagram of a feature extraction neural network architecture for recognizing an original image based on a deep learning merged media asset library thumbnail provided by the present invention.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views illustrating only the basic structure of the present invention in a schematic manner, and thus show only the constitution related to the present invention.
Referring to fig. 1-2, a method for identifying an original image based on a deep learning merged media asset library thumbnail is characterized by comprising the following specific method steps:
step one, feature extraction:
s1, inputting the preprocessed picture data into the input layer, where the input layer size is chosen to be a multiple of 32;
s2, extracting high-dimensional features with convolution layers, fading out irrelevant background factors through convolution, highlighting the main features of the object to be classified, and using ReLU as the activation function;
the formula for extracting the high-dimensional characteristic of the convolutional layer is as follows:
(f * g)(n) = Σ_τ f(τ) g(n − τ)
wherein (f * g)(n) denotes the convolution of f and g.
S2 specifically comprises the following steps:
s21, adding all-zero padding around the border of the input matrix so that the output matrix keeps the same size as the input matrix;
s22, feeding the padded matrix obtained in S21 into the convolution network;
s23, adding a bias to the result of the convolution operation in S22, and passing the result through the ReLU activation function to the next layer.
S3, reducing the size of the matrix through the largest pooling layer, reducing parameters, accelerating calculation and preventing over-fitting, wherein the size of the filter is 2x2, and the step length is 2;
s4, converting full connection into convolution, and converting the full connection layer into three convolution layers;
wherein, three convolution layers in the full connection layer are respectively 1 conv 7x7 and 2 conv 1x1.
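The fully-connected-to-convolution conversion in S4 can be motivated by a weight count: a fully connected layer over a fixed feature map has exactly the same weights as one large convolution, and the following fully connected layers map to 1x1 convolutions. The 7x7x512 and 4096 sizes below are illustrative VGG-style assumptions, not values stated in the patent.

```python
# FC over a flattened 7x7x512 map vs a 7x7 convolution with 4096 filters.
fc1_weights = 7 * 7 * 512 * 4096
conv7_weights = 7 * 7 * 512 * 4096   # 4096 filters, each of size 7x7x512
assert fc1_weights == conv7_weights  # identical weight count

# FC 4096 -> 4096 vs a 1x1 convolution with 4096 filters over 4096 channels.
fc2_weights = 4096 * 4096
conv1_weights = 1 * 1 * 4096 * 4096
assert fc2_weights == conv1_weights

# The convolutional form no longer hard-codes the input size, which is
# what lets the network accept pictures of any size.
print("FC layers and their conv equivalents share the same weight counts")
```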
Step two, building the feature library:
s5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch;
s6, batch preprocessing: scaling each picture to a fixed size, such as 224 x 224, for subsequent processing;
s7, extracting features of the batch according to steps S1-S4, storing all extracted features in the feature library, and building an index from each picture to its features to facilitate subsequent search;
s8, judging whether all pictures have been processed; if not, repeating steps S5 to S7 until all pictures are processed, then proceeding to the next step;
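The batch loop of S5-S8 can be sketched as below. The file names, batch size, and the stand-in feature function are illustrative assumptions; the patent does not specify them.

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches from a list (last may be short)."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def fake_feature(name):
    # Stand-in for "scale to a fixed size and run the network" (S6-S7).
    return [float(len(name))]


picture_files = [f"img_{i}.jpg" for i in range(10)]  # would come from a directory
feature_library = {}                                 # picture -> feature vector

for batch in batched(picture_files, batch_size=4):   # S5: read in batches
    for name in batch:                               # S6-S7: preprocess + extract
        feature_library[name] = fake_feature(name)
# S8: the loop ends naturally once every picture has been processed
print(len(feature_library))  # 10
```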
step three, identifying the original image by the thumbnail:
s9, receiving a user request to identify the original image from a thumbnail, and scaling the picture P1 to a fixed-size picture P2, such as 224 x 224;
s10, extracting the features of P2, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original picture;
s11, obtaining user feedback and adjusting the model accordingly, optimizing it so that results become more accurate.
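The lookup in S9-S10 can be sketched as a nearest-neighbor search over the feature library. Cosine similarity is one common choice for comparing feature vectors; the patent does not name a specific similarity measure, and the vectors below are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


feature_library = {
    "p1.jpg": [1.0, 0.0, -0.5],
    "p2.jpg": [0.1, 0.9, 0.0],
    "p3.jpg": [0.9, 0.1, 0.3],
}
query = [0.95, 0.05, 0.25]   # features extracted from the scaled thumbnail P2

# S10: the library picture whose features are most similar to the thumbnail.
best = max(feature_library, key=lambda k: cosine_similarity(query, feature_library[k]))
print(best)
```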
The method uses deep learning: with the feature extraction capability of an optimized multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, so that the extracted picture features are more accurate and less affected by factors such as picture deformation, color and cropping. This solves the problems of the traditional perceptual hash algorithm, which has low accuracy and is strongly affected by picture deformation and color. The accuracy of identifying the original image is greatly improved.
Example 1:
a resource management system refers to the method for identifying the original image by the thumbnail of the merged media resource library based on deep learning, which comprises the following steps:
step 1, using 50 pictures (wherein the training set is 40 thousands, and the testing set is 10 thousands), preprocessing to obtain pictures with the same size (wherein 0 needs to be supplemented when the size is not enough);
step 2, inputting the training data obtained in the step 1 into a neural network model for training to obtain a model for extracting characteristics, wherein the model is used for extracting picture characteristics;
step 3, performing feature extraction on all pictures in the library by using the model obtained in the step 2, and storing the extracted features in a warehouse;
step 4, extracting features of the picture to be recognized by using the model in the step 2, performing similarity calculation on the extracted features and all features stored in the library in the step 3, and storing the result as a similarity array;
and 5, searching whether pictures smaller than a set threshold exist in all the similarity arrays:
if yes, giving the picture as an original picture to be searched;
if not, the original image is not found.
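Steps 4-5 of the example reduce to a threshold check over the similarity (distance) array. The distances and threshold below are illustrative assumptions.

```python
def find_original(distances, threshold):
    """Return the name with the smallest distance below threshold, or None."""
    candidates = {name: d for name, d in distances.items() if d < threshold}
    if not candidates:
        return None                        # step 5: original image not found
    return min(candidates, key=candidates.get)


distances = {"a.jpg": 0.42, "b.jpg": 0.03, "c.jpg": 0.55}
print(find_original(distances, threshold=0.1))        # only b.jpg is close enough
print(find_original({"a.jpg": 0.42}, threshold=0.1))  # None: no match found
```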
The invention uses a multilayer neural network with 13 convolution layers, 3 fully connected layers and 5 pooling layers. The whole network uses a uniform convolution kernel size (3x3) and max pooling size (2x2); a stack of several small-filter (3x3) convolution layers is used instead of a single large filter (5x5 or 7x7), so that input pictures of any size can be processed while model over-fitting is prevented. With the feature extraction capability of the multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, making the extracted picture features more accurate and greatly improving the accuracy of identifying the original image from the thumbnail.
The above description covers only preferred embodiments of the present invention; the protection scope of the present invention is not limited thereto, and any change or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.

Claims (6)

1. A method for identifying an original image by a converged media asset library thumbnail based on deep learning is characterized by comprising the following specific method steps:
s1, inputting preprocessed picture data into the input layer;
s2, extracting high-dimensional features with convolution layers, fading out irrelevant background factors through convolution, highlighting the main features of the object to be classified, and using ReLU as the activation function;
s3, reducing the size of the matrix through a max pooling layer, reducing the number of parameters, and accelerating computation while preventing over-fitting;
s4, converting the fully connected layers into three convolution layers;
s5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch;
s6, batch preprocessing, scaling each picture to a fixed size for subsequent processing;
s7, extracting features of the batch according to steps S1-S4, storing all extracted features in a feature library, and building an index from each picture to its features to facilitate subsequent search;
s8, judging whether all pictures have been processed; if not, repeating steps S5 to S7 until all pictures are processed, then proceeding to the next step;
s9, receiving a user request to identify the original image from a thumbnail, and scaling the picture P1 to a fixed-size picture P2;
s10, extracting features, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original picture;
s11, obtaining user feedback and adjusting the model accordingly, optimizing it so that results become more accurate.
2. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein the convolution layer extracts high-dimensional features using the formula:
(f * g)(n) = Σ_τ f(τ) g(n − τ)
wherein (f * g)(n) denotes the convolution of f and g.
3. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein S2 specifically comprises the following steps:
s21, adding all-zero padding around the border of the input matrix so that the output matrix keeps the same size as the input matrix;
s22, feeding the padded matrix obtained in S21 into the convolution network;
s23, adding a bias to the result of the convolution operation in S22, and passing the result through the ReLU activation function to the next layer.
4. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein steps S1 to S4 constitute the feature extraction stage.
5. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein the three convolution layers converted from the fully connected layers are one 7x7 convolution and two 1x1 convolutions.
6. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein the size of the input layer in S1 is selected to be a multiple of 32.
CN202110208085.2A 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning Active CN112818161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110208085.2A CN112818161B (en) 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110208085.2A CN112818161B (en) 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning

Publications (2)

Publication Number Publication Date
CN112818161A CN112818161A (en) 2021-05-18
CN112818161B true CN112818161B (en) 2023-03-24

Family

ID=75865439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110208085.2A Active CN112818161B (en) 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning

Country Status (1)

Country Link
CN (1) CN112818161B (en)

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022380A (en) * 2016-05-25 2016-10-12 中国科学院自动化研究所 Individual identity identification method based on deep learning
CN106294798B (en) * 2016-08-15 2020-01-17 华为技术有限公司 Image sharing method and terminal based on thumbnail
CN106651765A (en) * 2016-12-30 2017-05-10 深圳市唯特视科技有限公司 Method for automatically generating thumbnail by use of deep neutral network
WO2018128741A1 (en) * 2017-01-06 2018-07-12 Board Of Regents, The University Of Texas System Segmenting generic foreground objects in images and videos
CN107491726B (en) * 2017-07-04 2020-08-04 重庆邮电大学 Real-time expression recognition method based on multichannel parallel convolutional neural network
CN110245659B (en) * 2019-05-21 2021-08-13 北京航空航天大学 Image salient object segmentation method and device based on foreground and background interrelation
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN110751271B (en) * 2019-10-28 2023-05-26 西安烽火软件科技有限公司 Image traceability feature characterization method based on deep neural network
CN111191662B (en) * 2019-12-31 2023-06-30 网易(杭州)网络有限公司 Image feature extraction method, device, equipment, medium and object matching method
CN111475662A (en) * 2020-04-03 2020-07-31 南京云吾时信息科技有限公司 Background retrieval system for graphic database
CN112308859A (en) * 2020-09-01 2021-02-02 北京小米松果电子有限公司 Method and device for generating thumbnail, camera and storage medium
CN112258487A (en) * 2020-10-29 2021-01-22 德鲁动力科技(海南)有限公司 Image detection system and method
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Learning object classes from image thumbnails through deep neural networks";Erkang Chen;《2008 IEEE International Conference on Acoustics, Speech and Signal Processing》;20081231;第829-832页 *

Also Published As

Publication number Publication date
CN112818161A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
WO2019201035A1 (en) Method and device for identifying object node in image, terminal and computer readable storage medium
EP4156017A1 (en) Action recognition method and apparatus, and device and storage medium
US7277584B2 (en) Form recognition system, form recognition method, program and storage medium
CN109919077B (en) Gesture recognition method, device, medium and computing equipment
CN110569814A (en) Video category identification method and device, computer equipment and computer storage medium
CN112163114B (en) Image retrieval method based on feature fusion
CN115937655A (en) Target detection model of multi-order feature interaction, and construction method, device and application thereof
CN111079511A (en) Document automatic classification and optical character recognition method and system based on deep learning
CN110706232A (en) Texture image segmentation method, electronic device and computer storage medium
CN116597267B (en) Image recognition method, device, computer equipment and storage medium
CN112818161B (en) Method for identifying original image by merging media asset library thumbnail based on deep learning
WO2023173552A1 (en) Establishment method for target detection model, application method for target detection model, and device, apparatus and medium
CN115761332A (en) Smoke and flame detection method, device, equipment and storage medium
CN115205884A (en) Bill information extraction method and device, equipment, medium and product thereof
CN115410185A (en) Method for extracting specific name and unit name attributes in multi-modal data
CN114565913A (en) Text recognition method and device, equipment, medium and product thereof
Žižakić et al. Learning local image descriptors with autoencoders
CN113516148A (en) Image processing method, device and equipment based on artificial intelligence and storage medium
CN111178409A (en) Image matching and recognition system based on big data matrix stability analysis
CN117408259B (en) Information extraction method, device, computer equipment and storage medium
CN112036501A (en) Image similarity detection method based on convolutional neural network and related equipment thereof
CN111797973A (en) Method, device and electronic system for determining model structure
CN116152522B (en) Multi-scale feature extraction method and system based on deep learning
CN117033308B (en) Multi-mode retrieval method and device based on specific range
CN114677568B (en) Linear target detection method, module and system based on neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant