CN112818161B - Method for identifying original image by merging media asset library thumbnail based on deep learning - Google Patents


Info

Publication number
CN112818161B
CN112818161B (application CN202110208085.2A)
Authority
CN
China
Prior art keywords
picture
features
convolution
thumbnail
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110208085.2A
Other languages
Chinese (zh)
Other versions
CN112818161A (en
Inventor
李传咏
陈宁
李贤�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Webber Software Co ltd
Original Assignee
Xi'an Webber Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Webber Software Co ltd filed Critical Xi'an Webber Software Co ltd
Priority to CN202110208085.2A priority Critical patent/CN112818161B/en
Publication of CN112818161A publication Critical patent/CN112818161A/en
Application granted granted Critical
Publication of CN112818161B publication Critical patent/CN112818161B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for identifying an original image from a thumbnail in a converged media asset library based on deep learning, comprising the following steps: S1, inputting preprocessed picture data into the input layer; S2, extracting high-dimensional features with convolution layers, fading out irrelevant factors such as the background through convolution, and using ReLU as the activation function; S3, reducing the size of the matrix through a max pooling layer, which reduces the number of parameters, accelerates computation and prevents over-fitting; S4, converting the fully connected layers into three convolution layers; S5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch; S6, batch preprocessing, scaling each picture to a fixed size; and S10, extracting features, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original image. The invention reduces the dependency of recognition on factors such as picture size, color and deformation, and improves recognition efficiency and accuracy.

Description

Method for identifying original image by merging media asset library thumbnail based on deep learning
Technical Field
The invention relates to the technical field of thumbnail processing, in particular to a method for identifying an original image by a fusion media asset library thumbnail based on deep learning.
Background
With the trend toward media convergence, a large number of picture resources are generated, and managing these resources efficiently poses a serious challenge. At the same time, the need to find an original image from its thumbnail has become urgent in converged media asset library systems, and high-quality thumbnail-to-original identification brings great convenience to users.
The traditional approach to identifying an original image from a thumbnail generally uses a perceptual hash algorithm: the image is first scaled to a fixed-size grayscale image, the grayscale image is then converted into a black-and-white binary image, and each pixel of the binary image is represented by the binary digits 0 and 1 (0 for black, 1 for white), forming a 0-1 feature matrix (also called a fingerprint); finally the original image is identified by comparing the similarity (Hamming distance) of the feature matrices. This algorithm is sensitive to how severely the image has been deformed, and its identification accuracy is easily affected by factors such as image deformation and color, so the accuracy is low.
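The traditional perceptual-hash approach described above can be sketched in a few lines. This is an illustrative average-hash variant run on a tiny hand-made pixel matrix; real systems would first decode and scale the picture, and the helper names here are not from the patent.

```python
def average_hash(gray):
    """Turn a grayscale matrix into a 0/1 fingerprint string.

    Each pixel becomes 1 if it is >= the mean gray value, else 0.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return "".join("1" if p >= mean else "0" for p in pixels)


def hamming_distance(a, b):
    """Number of positions where two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# Two similar 2x4 "images" and one very different one.
img_a = [[10, 20, 200, 210], [15, 25, 205, 215]]
img_b = [[12, 22, 198, 208], [14, 24, 207, 213]]
img_c = [[200, 210, 10, 20], [205, 215, 15, 25]]

h_a, h_b, h_c = (average_hash(i) for i in (img_a, img_b, img_c))
print(hamming_distance(h_a, h_b))  # similar images -> small distance
print(hamming_distance(h_a, h_c))  # different images -> large distance
```

As the patent notes, a crop or color shift changes many pixels relative to the mean, so the fingerprint (and hence the Hamming distance) degrades quickly.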
Therefore, how to provide a method for identifying an original image based on a deep learning merged media asset library thumbnail is an urgent problem to be solved by those skilled in the art.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art. Therefore, the invention aims to provide a method for identifying an original image by a fused media resource library thumbnail based on deep learning, which reduces the dependency of identification on factors such as the size, the color, the deformation and the like of an image, and greatly improves the identification efficiency and the accuracy.
The method for identifying the original image from a thumbnail in a converged media asset library based on deep learning comprises the following specific steps:
s1, inputting the preprocessed picture data into the input layer;
s2, extracting high-dimensional features with convolution layers, fading out irrelevant background factors through convolution, highlighting the main features of the object to be classified, and using ReLU as the activation function;
s3, reducing the size of the matrix through a max pooling layer, which reduces the number of parameters, accelerates computation and prevents over-fitting; the filter size is 2x2 and the stride is 2;
s4, converting the fully connected layers into three convolution layers;
s5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch;
s6, batch preprocessing, scaling each picture to a fixed size for subsequent processing;
s7, extracting features of the batch according to steps S1-S4, storing all extracted features in a feature library, and building an index from each picture to its features to facilitate subsequent search;
s8, judging whether all pictures have been processed; if not, repeating steps S5 to S7 until all pictures are processed, then proceeding to the next step;
s9, receiving a user request to identify the original image from a thumbnail, and scaling the picture P1 to a fixed-size picture P2;
s10, extracting the features of P2, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original picture;
s11, obtaining user feedback and adjusting the model accordingly, optimizing it so that results become more accurate.
Preferably, the convolutional layer extraction high-dimensional feature formula is as follows:
(f * g)(n) = Σ_τ f(τ) g(n − τ)
wherein (f * g)(n) denotes the convolution of f and g.
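The discrete convolution above can be checked numerically. The sketch below implements it directly for finite sequences (indices outside the sequences contribute zero), matching numpy.convolve's default "full" mode.

```python
def convolve(f, g):
    """Discrete convolution (f * g)(n) = sum_t f(t) g(n - t)."""
    n_out = len(f) + len(g) - 1
    out = []
    for n in range(n_out):
        s = 0
        for t in range(len(f)):
            if 0 <= n - t < len(g):  # terms outside g contribute zero
                s += f[t] * g[n - t]
        out.append(s)
    return out


print(convolve([1, 2, 3], [0, 1, 0.5]))  # [0, 1, 2.5, 4, 1.5]
```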
Preferably, S2 specifically comprises the following steps:
s21, adding all-zero padding around the border of the input matrix so that the output matrix keeps the same size as the input matrix;
s22, feeding the padded matrix obtained in S21 into the convolution network;
s23, adding a bias to the result of the convolution operation in S22, and passing the result through the ReLU activation function to the next layer.
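Steps S21-S23 can be sketched for a single channel as below. The "SAME" zero padding, stride 1, and the averaging kernel are illustrative assumptions for a minimal demonstration, not parameters taken from the patent.

```python
import numpy as np

def conv_same_relu(x, kernel, bias):
    """S21-S23 for one channel: zero-pad, convolve, add bias, ReLU."""
    k = kernel.shape[0]              # assume a square kernel with odd size
    p = k // 2
    padded = np.pad(x, p)            # S21: all-zero padding on the border
    h, w = x.shape
    out = np.empty((h, w))
    for i in range(h):               # S22: slide the filter over the input
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return np.maximum(out + bias, 0.0)   # S23: add bias, then ReLU

x = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.full((3, 3), 1 / 9.0)    # simple 3x3 averaging filter
y = conv_same_relu(x, kernel, bias=-1.0)
print(y.shape)  # (4, 4): output size matches the input, as S21 intends
```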
Preferably, steps S1 to S4 constitute the feature extraction stage.
Preferably, the three convolution layers converted from the fully connected layers are one 7x7 convolution and two 1x1 convolutions.
Preferably, the size of the input layer in S1 is selected to be a multiple of 32.
Compared with the prior art, the invention has the beneficial effects that:
(1) The method uses deep learning: with the feature extraction capability of an optimized multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, so that the extracted picture features are more accurate and less affected by factors such as picture deformation, color and cropping. This solves the problems of the traditional perceptual hash algorithm, which has low accuracy and is strongly affected by picture deformation and color. The accuracy of original image identification is greatly improved;
(2) The invention uses a multilayer neural network with 13 convolution layers, 3 fully connected layers and 5 pooling layers. The whole network uses a uniform convolution kernel size (3x3) and max pooling size (2x2); a stack of several small-filter (3x3) convolution layers is used instead of a single large filter (5x5 or 7x7), so that input pictures of any size can be processed while model over-fitting is prevented. With the feature extraction capability of the multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, making the extracted picture features more accurate and greatly improving the accuracy of identifying the original image from the thumbnail.
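The small-filter claim above can be illustrated with a quick count: two stacked 3x3 convolutions see a 5x5 receptive field, and three see 7x7, with fewer weights than the single large filter (counts below are per input/output channel pair, biases ignored).

```python
def receptive_field(num_3x3_layers):
    """Receptive field of a stack of stride-1 3x3 convolutions."""
    # Each extra 3x3 layer grows the receptive field by 2 on each axis total.
    return 1 + 2 * num_3x3_layers

print(receptive_field(2), "=", 5)  # two 3x3 layers cover a 5x5 region
print(receptive_field(3), "=", 7)  # three 3x3 layers cover a 7x7 region

# Weight counts per channel pair: the stacked form is cheaper.
print(2 * 3 * 3, "weights for two 3x3 filters vs", 5 * 5, "for one 5x5")
print(3 * 3 * 3, "weights for three 3x3 filters vs", 7 * 7, "for one 7x7")
```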
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is an overall flowchart of a method for identifying an original image based on a deep learning merged media asset library thumbnail provided by the invention;
fig. 2 is a schematic structural diagram of a feature extraction neural network architecture for recognizing an original image based on a deep learning merged media asset library thumbnail provided by the present invention.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views illustrating only the basic structure of the present invention in a schematic manner, and thus show only the constitution related to the present invention.
Referring to fig. 1-2, a method for identifying an original image based on a deep learning merged media asset library thumbnail is characterized by comprising the following specific method steps:
step one, feature extraction:
s1, inputting the preprocessed picture data into the input layer, where the input layer size is chosen to be a multiple of 32;
s2, extracting high-dimensional features with convolution layers, fading out irrelevant background factors through convolution, highlighting the main features of the object to be classified, and using ReLU as the activation function;
the formula for extracting the high-dimensional characteristic of the convolutional layer is as follows:
(f * g)(n) = Σ_τ f(τ) g(n − τ)
wherein (f * g)(n) denotes the convolution of f and g.
S2 specifically comprises the following steps:
s21, adding all-zero padding around the border of the input matrix so that the output matrix keeps the same size as the input matrix;
s22, feeding the padded matrix obtained in S21 into the convolution network;
s23, adding a bias to the result of the convolution operation in S22, and passing the result through the ReLU activation function to the next layer.
S3, reducing the size of the matrix through the largest pooling layer, reducing parameters, accelerating calculation and preventing over-fitting, wherein the size of the filter is 2x2, and the step length is 2;
s4, converting full connection into convolution, and converting the full connection layer into three convolution layers;
wherein, three convolution layers in the full connection layer are respectively 1 conv 7x7 and 2 conv 1x1.
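The fully-connected-to-convolution conversion in S4 can be motivated by a weight count: a fully connected layer over a fixed feature map has exactly the same weights as one large convolution, and the following fully connected layers map to 1x1 convolutions. The 7x7x512 and 4096 sizes below are illustrative VGG-style assumptions, not values stated in the patent.

```python
# FC over a flattened 7x7x512 map vs a 7x7 convolution with 4096 filters.
fc1_weights = 7 * 7 * 512 * 4096
conv7_weights = 7 * 7 * 512 * 4096   # 4096 filters, each of size 7x7x512
assert fc1_weights == conv7_weights  # identical weight count

# FC 4096 -> 4096 vs a 1x1 convolution with 4096 filters over 4096 channels.
fc2_weights = 4096 * 4096
conv1_weights = 1 * 1 * 4096 * 4096
assert fc2_weights == conv1_weights

# The convolutional form no longer hard-codes the input size, which is
# what lets the network accept pictures of any size.
print("FC layers and their conv equivalents share the same weight counts")
```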
Step two, building the feature library:
s5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch;
s6, batch preprocessing: scaling each picture to a fixed size, such as 224 x 224, for subsequent processing;
s7, extracting features of the batch according to steps S1-S4, storing all extracted features in the feature library, and building an index from each picture to its features to facilitate subsequent search;
s8, judging whether all pictures have been processed; if not, repeating steps S5 to S7 until all pictures are processed, then proceeding to the next step;
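The batch loop of S5-S8 can be sketched as below. The file names, batch size, and the stand-in feature function are illustrative assumptions; the patent does not specify them.

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches from a list (last may be short)."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def fake_feature(name):
    # Stand-in for "scale to a fixed size and run the network" (S6-S7).
    return [float(len(name))]


picture_files = [f"img_{i}.jpg" for i in range(10)]  # would come from a directory
feature_library = {}                                 # picture -> feature vector

for batch in batched(picture_files, batch_size=4):   # S5: read in batches
    for name in batch:                               # S6-S7: preprocess + extract
        feature_library[name] = fake_feature(name)
# S8: the loop ends naturally once every picture has been processed
print(len(feature_library))  # 10
```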
step three, identifying the original image by the thumbnail:
s9, receiving a user request to identify the original image from a thumbnail, and scaling the picture P1 to a fixed-size picture P2, such as 224 x 224;
s10, extracting the features of P2, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original picture;
s11, obtaining user feedback and adjusting the model accordingly, optimizing it so that results become more accurate.
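The lookup in S9-S10 can be sketched as a nearest-neighbor search over the feature library. Cosine similarity is one common choice for comparing feature vectors; the patent does not name a specific similarity measure, and the vectors below are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


feature_library = {
    "p1.jpg": [1.0, 0.0, -0.5],
    "p2.jpg": [0.1, 0.9, 0.0],
    "p3.jpg": [0.9, 0.1, 0.3],
}
query = [0.95, 0.05, 0.25]   # features extracted from the scaled thumbnail P2

# S10: the library picture whose features are most similar to the thumbnail.
best = max(feature_library, key=lambda k: cosine_similarity(query, feature_library[k]))
print(best)
```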
The method uses deep learning: with the feature extraction capability of an optimized multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, so that the extracted picture features are more accurate and less affected by factors such as picture deformation, color and cropping. This solves the problems of the traditional perceptual hash algorithm, which has low accuracy and is strongly affected by picture deformation and color. The accuracy of identifying the original image is greatly improved.
Example 1:
a resource management system refers to the method for identifying the original image by the thumbnail of the merged media resource library based on deep learning, which comprises the following steps:
step 1, using 50 pictures (wherein the training set is 40 thousands, and the testing set is 10 thousands), preprocessing to obtain pictures with the same size (wherein 0 needs to be supplemented when the size is not enough);
step 2, inputting the training data obtained in the step 1 into a neural network model for training to obtain a model for extracting characteristics, wherein the model is used for extracting picture characteristics;
step 3, performing feature extraction on all pictures in the library by using the model obtained in the step 2, and storing the extracted features in a warehouse;
step 4, extracting features of the picture to be recognized by using the model in the step 2, performing similarity calculation on the extracted features and all features stored in the library in the step 3, and storing the result as a similarity array;
and 5, searching whether pictures smaller than a set threshold exist in all the similarity arrays:
if yes, giving the picture as an original picture to be searched;
if not, the original image is not found.
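Steps 4-5 of the example reduce to a threshold check over the similarity (distance) array. The distances and threshold below are illustrative assumptions.

```python
def find_original(distances, threshold):
    """Return the name with the smallest distance below threshold, or None."""
    candidates = {name: d for name, d in distances.items() if d < threshold}
    if not candidates:
        return None                        # step 5: original image not found
    return min(candidates, key=candidates.get)


distances = {"a.jpg": 0.42, "b.jpg": 0.03, "c.jpg": 0.55}
print(find_original(distances, threshold=0.1))        # only b.jpg is close enough
print(find_original({"a.jpg": 0.42}, threshold=0.1))  # None: no match found
```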
The invention uses a multilayer neural network with 13 convolution layers, 3 fully connected layers and 5 pooling layers. The whole network uses a uniform convolution kernel size (3x3) and max pooling size (2x2); a stack of several small-filter (3x3) convolution layers is used instead of a single large filter (5x5 or 7x7), so that input pictures of any size can be processed while model over-fitting is prevented. With the feature extraction capability of the multilayer neural network, irrelevant factors such as the background are faded out through convolution and the main features of the object to be classified are highlighted, making the extracted picture features more accurate and greatly improving the accuracy of identifying the original image from the thumbnail.
The above description covers only preferred embodiments of the present invention; the protection scope of the present invention is not limited thereto, and any change or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.

Claims (6)

1. A method for identifying an original image by a converged media asset library thumbnail based on deep learning is characterized by comprising the following specific method steps:
s1, inputting preprocessed picture data into the input layer;
s2, extracting high-dimensional features with convolution layers, fading out irrelevant background factors through convolution, highlighting the main features of the object to be classified, and using ReLU as the activation function;
s3, reducing the size of the matrix through a max pooling layer, reducing the number of parameters, and accelerating computation while preventing over-fitting;
s4, converting the fully connected layers into three convolution layers;
s5, reading pictures from the picture library in batches, loading picture data from the specified directory into memory batch by batch;
s6, batch preprocessing, scaling each picture to a fixed size for subsequent processing;
s7, extracting features of the batch according to steps S1-S4, storing all extracted features in a feature library, and building an index from each picture to its features to facilitate subsequent search;
s8, judging whether all pictures have been processed; if not, repeating steps S5 to S7 until all pictures are processed, then proceeding to the next step;
s9, receiving a user request to identify the original image from a thumbnail, and scaling the picture P1 to a fixed-size picture P2;
s10, extracting features, computing their similarity with all picture features in the feature library, and taking the picture P whose features have the maximum similarity as the original picture;
s11, obtaining user feedback and adjusting the model accordingly, optimizing it so that results become more accurate.
2. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein the convolution layer extracts high-dimensional features using the formula:
(f * g)(n) = Σ_τ f(τ) g(n − τ)
wherein (f * g)(n) denotes the convolution of f and g.
3. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein S2 specifically comprises the following steps:
s21, adding all-zero padding around the border of the input matrix so that the output matrix keeps the same size as the input matrix;
s22, feeding the padded matrix obtained in S21 into the convolution network;
s23, adding a bias to the result of the convolution operation in S22, and passing the result through the ReLU activation function to the next layer.
4. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein steps S1 to S4 constitute the feature extraction stage.
5. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein the three convolution layers converted from the fully connected layers are one 7x7 convolution and two 1x1 convolutions.
6. The method for identifying the original image from a converged media asset library thumbnail based on deep learning as claimed in claim 1, wherein the size of the input layer in S1 is selected to be a multiple of 32.
CN202110208085.2A 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning Active CN112818161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110208085.2A CN112818161B (en) 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110208085.2A CN112818161B (en) 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning

Publications (2)

Publication Number Publication Date
CN112818161A CN112818161A (en) 2021-05-18
CN112818161B true CN112818161B (en) 2023-03-24

Family

ID=75865439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110208085.2A Active CN112818161B (en) 2021-02-24 2021-02-24 Method for identifying original image by merging media asset library thumbnail based on deep learning

Country Status (1)

Country Link
CN (1) CN112818161B (en)

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022380A (en) * 2016-05-25 2016-10-12 中国科学院自动化研究所 Individual identity identification method based on deep learning
CN106294798B (en) * 2016-08-15 2020-01-17 华为技术有限公司 Image sharing method and terminal based on thumbnail
CN106651765A (en) * 2016-12-30 2017-05-10 深圳市唯特视科技有限公司 Method for automatically generating thumbnail by use of deep neutral network
WO2018128741A1 (en) * 2017-01-06 2018-07-12 Board Of Regents, The University Of Texas System Segmenting generic foreground objects in images and videos
CN107491726B (en) * 2017-07-04 2020-08-04 重庆邮电大学 Real-time expression recognition method based on multichannel parallel convolutional neural network
CN110245659B (en) * 2019-05-21 2021-08-13 北京航空航天大学 Image salient object segmentation method and device based on foreground and background interrelation
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN110751271B (en) * 2019-10-28 2023-05-26 西安烽火软件科技有限公司 Image traceability feature characterization method based on deep neural network
CN111191662B (en) * 2019-12-31 2023-06-30 网易(杭州)网络有限公司 Image feature extraction method, device, equipment, medium and object matching method
CN111475662A (en) * 2020-04-03 2020-07-31 南京云吾时信息科技有限公司 Background retrieval system for graphic database
CN112308859A (en) * 2020-09-01 2021-02-02 北京小米松果电子有限公司 Method and device for generating thumbnail, camera and storage medium
CN112258487A (en) * 2020-10-29 2021-01-22 德鲁动力科技(海南)有限公司 Image detection system and method
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Learning object classes from image thumbnails through deep neural networks";Erkang Chen;《2008 IEEE International Conference on Acoustics, Speech and Signal Processing》;20081231;第829-832页 *

Also Published As

Publication number Publication date
CN112818161A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
WO2019201035A1 (en) Method and device for identifying object node in image, terminal and computer readable storage medium
EP4156017A1 (en) Action recognition method and apparatus, and device and storage medium
US7277584B2 (en) Form recognition system, form recognition method, program and storage medium
CN109919077B (en) Gesture recognition method, device, medium and computing equipment
CN110569814A (en) Video category identification method and device, computer equipment and computer storage medium
CN112163114B (en) Image retrieval method based on feature fusion
CN115937655A (en) Target detection model of multi-order feature interaction, and construction method, device and application thereof
CN111079511A (en) Document automatic classification and optical character recognition method and system based on deep learning
CN110706232A (en) Texture image segmentation method, electronic device and computer storage medium
CN116597267B (en) Image recognition method, device, computer equipment and storage medium
CN112818161B (en) Method for identifying original image by merging media asset library thumbnail based on deep learning
WO2023173552A1 (en) Establishment method for target detection model, application method for target detection model, and device, apparatus and medium
CN115761332A (en) Smoke and flame detection method, device, equipment and storage medium
CN115205884A (en) Bill information extraction method and device, equipment, medium and product thereof
CN115410185A (en) Method for extracting specific name and unit name attributes in multi-modal data
CN114565913A (en) Text recognition method and device, equipment, medium and product thereof
Žižakić et al. Learning local image descriptors with autoencoders
CN113516148A (en) Image processing method, device and equipment based on artificial intelligence and storage medium
CN111178409A (en) Image matching and recognition system based on big data matrix stability analysis
CN117408259B (en) Information extraction method, device, computer equipment and storage medium
CN112036501A (en) Image similarity detection method based on convolutional neural network and related equipment thereof
CN111797973A (en) Method, device and electronic system for determining model structure
CN116152522B (en) Multi-scale feature extraction method and system based on deep learning
CN117033308B (en) Multi-mode retrieval method and device based on specific range
CN114677568B (en) Linear target detection method, module and system based on neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant