CN112584146B - Method and system for evaluating inter-frame similarity

Info

Publication number
CN112584146B
Authority
CN
China
Prior art keywords
frame
feature information
similarity
block
information
Prior art date
Legal status
Active
Application number
CN201910944335.1A
Other languages
Chinese (zh)
Other versions
CN112584146A (en)
Inventor
许燚
高龙文
田凯
周水庚
孙胡杨
Current Assignee
Fudan University
Shanghai Bilibili Technology Co Ltd
Original Assignee
Fudan University
Shanghai Bilibili Technology Co Ltd
Application filed by Fudan University, Shanghai Bilibili Technology Co Ltd filed Critical Fudan University
Priority to CN201910944335.1A
Publication of CN112584146A
Application granted
Publication of CN112584146B

Classifications

    • H: Electricity
    • H04: Electric communication technique
    • H04N: Pictorial communication, e.g. television
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock

Abstract

The embodiments of the present application provide an inter-frame similarity evaluation method, which includes the following steps: acquiring a first frame and a second frame in a frame sequence; extracting a plurality of feature information of the first frame and a plurality of feature information of the second frame; partitioning the plurality of feature information of the first frame and the plurality of feature information of the second frame to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame; acquiring a plurality of second blocks associated with each first block; and performing, according to the plurality of second blocks associated with each first block, similarity calculation on each piece of feature information of the first frame and part of the feature information of the second frame respectively, to obtain the inter-frame similarity between the first frame and the second frame. The embodiments of the present application can effectively reduce the computing resources consumed in calculating inter-frame similarity.

Description

Method and system for evaluating inter-frame similarity
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to an inter-frame similarity evaluation method and system, a computer device, and a computer-readable storage medium.
Background
With the application and development of video services in various fields, video encoding and decoding have become key technologies that all parties focus on and develop. Video coding refers to converting a file in one video format into a file in another video format through a specific compression technique, thereby reducing the bandwidth cost of transmission and the space occupied in storage media.
However, video compression based on a given video compression algorithm is typically lossy, and the resulting lossy video is often accompanied by various compression artifacts, such as blocking, edge/texture floating, mosquito noise and jerkiness. Such compression noise inevitably reduces the picture quality of the video and thus degrades the visual experience of video viewers. The inventors found that a frame can be quality-enhanced using the information in other frames according to the inter-frame similarity between different frames, thereby improving the visual experience of video viewers. However, calculating inter-frame similarity is computationally expensive.
It should be noted that the above findings of the inventors are not admitted to be publicly known; they are described only to explain the technical problem addressed by the present application.
Disclosure of Invention
An object of the embodiments of the present application is to provide an inter-frame similarity evaluation method and system, a computer device, and a computer-readable storage medium, which can be used to solve the technical problem that calculating inter-frame similarity consumes too many computing resources.
One aspect of the embodiments of the present application provides a method for evaluating inter-frame similarity, where the method includes: acquiring a first frame and a second frame in a frame sequence; extracting a plurality of feature information of the first frame and a plurality of feature information of the second frame; partitioning the plurality of feature information of the first frame and the plurality of feature information of the second frame to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame; acquiring a plurality of second blocks associated with each first block; and according to a plurality of second blocks associated with each first block, performing similarity calculation on each feature information of the first frame and part of feature information of the second frame respectively to obtain the inter-frame similarity between the first frame and the second frame.
Optionally, obtaining a plurality of second blocks associated with each first block includes: pooling each first block into corresponding first downsampling feature information to obtain M pieces of first downsampling feature information; pooling each second block into corresponding second downsampling feature information to obtain M pieces of second downsampling feature information; calculating the similarity between first downsampling feature information a and each piece of second downsampling feature information, where the first downsampling feature information a corresponds to a first block a; and determining the k second blocks corresponding to the k pieces of second downsampling feature information with the highest similarity as the k second blocks associated with the first block a, where 1 ≤ a ≤ M, 1 ≤ k < M, and a, k and M are natural numbers.
Optionally, performing, according to the plurality of second blocks associated with each first block, similarity calculation on each piece of feature information of the first frame and part of the feature information of the second frame respectively, to obtain the inter-frame similarity between the first frame and the second frame, includes: representing the inter-frame similarity between the first frame and the second frame by a similarity matrix S_t ∈ R^{N×N}, where S_t(i,j) is an element of S_t and represents the similarity between the feature information j of the first frame and the feature information i of the second frame: when the feature information j of the first frame and the feature information i of the second frame are located in a first block and a second block that are associated with each other, the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated; when the feature information j of the first frame and the feature information i of the second frame are not located in an associated first block and second block, the similarity between the feature information j of the first frame and the feature information i of the second frame is set to 0.
Optionally, the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated according to the following formulas:
d_{i,j} = ||F_t(j) - F_{t-1}(i)||_2
S_t(i,j) = exp(-d_{i,j}/β) / Σ_i exp(-d_{i,j}/β)
where F_t(j) is the feature information j of the first frame, F_{t-1}(i) is the feature information i of the second frame, d_{i,j} is the Euclidean distance between the feature information j of the first frame and the feature information i of the second frame, and β is a constant.
Optionally, the inter-frame similarity is used to determine a reference weight between the first frame and the second frame.
Optionally, the method further includes: learning hidden state information at time t through a non-local convolution long-short term memory network, the hidden state information being used for enhancing the first frame; wherein the non-local convolution long-short term memory network is configured to: determine the weight of the hidden state information and the weight of the unit state information output at time t-1 according to the inter-frame similarity between the first frame corresponding to time t and the second frame corresponding to time t-1, and convert the hidden state information and the unit state information output at time t-1 according to these weights to obtain target hidden state information and target unit state information, which serve as input data of the non-local convolution long-short term memory network at time t.
Optionally, converting the hidden state information and the unit state information output at time t-1 according to the weight of the hidden state information and the weight of the unit state information output at time t-1 includes: extracting a plurality of third blocks at corresponding positions from the hidden state information output at time t-1 according to the plurality of second blocks associated with each first block, and generating the target hidden state information according to the hidden state information in the plurality of third blocks and the similarity matrix; and extracting a plurality of fourth blocks at corresponding positions from the unit state information output at time t-1 according to the plurality of second blocks associated with each first block, and generating the target unit state information according to the unit state information in the plurality of fourth blocks and the similarity matrix.
Another aspect of the embodiments of the present application provides an inter-frame similarity evaluation system, where the system includes: a first acquisition module, configured to acquire a first frame and a second frame in a frame sequence; an extraction module, configured to extract a plurality of feature information of the first frame and a plurality of feature information of the second frame; a blocking module, configured to block the plurality of feature information of the first frame and the plurality of feature information of the second frame to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame; a second acquisition module, configured to acquire a plurality of second blocks associated with each first block; and a third acquisition module, configured to perform similarity calculation on each piece of feature information of the first frame and part of the feature information of the second frame according to the plurality of second blocks associated with each first block, so as to obtain the inter-frame similarity between the first frame and the second frame.
Yet another aspect of the embodiments of the present application provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the inter-frame similarity evaluation method according to any one of the above.
Yet another aspect of embodiments of the present application provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, is configured to implement the steps of the inter-frame similarity assessment method according to any one of the above.
According to the inter-frame similarity evaluation method and system, the computer device, and the computer-readable storage medium provided by the embodiments of the present application, the computational complexity of calculating the inter-frame similarity can be dynamically reduced according to the sizes of the first blocks and the second blocks and the number of second blocks associated with each first block, which greatly reduces the computing resources consumed in the calculation while approximately maintaining accuracy.
Drawings
Fig. 1 schematically shows a flowchart of an inter-frame similarity evaluation method according to a first embodiment of the present application;
FIG. 2 schematically shows a sub-flowchart of step S106 in FIG. 1;
FIG. 3 schematically shows another flowchart of an inter-frame similarity evaluation method according to a first embodiment of the present application;
FIG. 4 schematically shows a sub-flowchart of step S110 in FIG. 3;
FIG. 5 schematically illustrates an architecture diagram of a video quality enhancement operation;
FIG. 6 schematically illustrates a workflow diagram of a first non-local module;
FIG. 7 is a schematic diagram showing the operational architecture of a forward non-local convolution long-short term memory network;
FIG. 8 is a block diagram schematically illustrating an inter-frame similarity evaluation system according to a second embodiment of the present application; and
fig. 9 schematically shows a hardware architecture diagram of a computer device suitable for implementing the inter-frame similarity evaluation method according to a third embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the descriptions involving "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions of different embodiments may be combined with each other, provided that the combination can be realized by a person of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be realized, such a combination should be considered absent and outside the protection scope of the present invention.
Example one
Fig. 1 schematically shows a flowchart of an inter-frame similarity evaluation method according to a first embodiment of the present application. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed. The following description is made by way of example with the computer device 2 as the execution subject.
As shown in fig. 1, the method for evaluating inter-frame similarity may include steps S100 to S108, where:
step S100, a first frame and a second frame in a frame sequence are acquired.
The frame sequence is {X_{t-T}, ..., X_{t+T}}. The first frame and the second frame may be any two adjacent frames in the frame sequence, e.g. X_t and X_{t-1}, or X_{t-1} and X_{t-2}.
The frame sequence may be a video segment of a lossy video, which may be compressed video based on various coding schemes, such as H.264/AVC or H.265/HEVC. It is well understood that lossy video obtained via compression may have lost much information, resulting in various compression artifacts.
There will be some temporal relationships between two adjacent frames, such as texture, color and motion trajectories. For example, if an object A is present in the previous frame and also in the next frame, object A constitutes spatio-temporal dependency information between the two frames; based on such spatio-temporal dependency information, a frame with poor details can be repaired using a frame with good details. For the first frame, the information it lost during compression may be present in the second frame or other adjacent frames; similarly, for the second frame, the information it lost during compression may be present in the first frame or other adjacent frames.
Step S102, extracting a plurality of feature information of the first frame and a plurality of feature information of the second frame.
Feature information of each frame may be extracted using methods such as HOG (Histogram of Oriented Gradients) or SIFT (Scale-Invariant Feature Transform), or may be extracted by a deep neural network.
In an exemplary embodiment, the computer device 2 may be configured as an encoder for extracting feature information, wherein the encoder comprises a convolutional neural network and a nonlinear activation function, wherein the convolutional neural network comprises a plurality of convolutional layers.
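For illustration, a minimal sketch of such an encoder follows (PyTorch; the layer count and channel sizes are assumptions, since the embodiment only specifies a convolutional neural network with a plurality of convolutional layers plus a nonlinear activation function):
```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Sketch of the feature-extraction encoder: a few convolutional
    layers plus a nonlinear activation (all sizes are illustrative)."""
    def __init__(self, in_channels: int = 3, feat_channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        # frame: (B, 3, H, W) -> feature map F_t: (B, C, H, W)
        return self.net(frame)
```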
For convenience of description, the following takes the first frame as X_t and the second frame as X_{t-1} to describe this embodiment by way of example. Correspondingly, the feature information extracted from the first frame X_t is F_t, and the feature information extracted from the second frame X_{t-1} is F_{t-1}.
Step S104, partition the plurality of feature information F_t of the first frame X_t and the plurality of feature information F_{t-1} of the second frame X_{t-1} to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame.
The plurality of feature information F_t and the plurality of feature information F_{t-1} may exist in the form of feature maps, and each feature map may include N pieces of feature information. When the feature maps are partitioned, the block size can be set to p × p, i.e., each feature map is partitioned into N/p² blocks, where p is a natural number.
As can be seen from the above, the plurality of feature information F_t of the first frame X_t is divided into N/p² first blocks, and the plurality of feature information F_{t-1} of the second frame X_{t-1} is divided into N/p² second blocks. The blocking operation is sketched below.
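One way the blocking could be implemented (a sketch; names and shapes are illustrative, not from the patent):
```python
import torch

def to_blocks(feat: torch.Tensor, p: int) -> torch.Tensor:
    """Partition a (B, C, H, W) feature map into non-overlapping p x p
    blocks; with N = H * W this yields M = N / p^2 blocks per image."""
    B, C, H, W = feat.shape
    assert H % p == 0 and W % p == 0, "H and W must be multiples of p"
    blocks = feat.reshape(B, C, H // p, p, W // p, p)
    blocks = blocks.permute(0, 2, 4, 1, 3, 5)  # (B, H/p, W/p, C, p, p)
    return blocks.reshape(B, (H // p) * (W // p), C, p, p)  # (B, M, C, p, p)
```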
And step S106, acquiring a plurality of second blocks associated with each first block.
A plurality of second blocks most similar to each first block are found by calculating the similarity between each first block and each second block: the second blocks most similar to the 1st first block, the second blocks most similar to the 2nd first block, ..., and the second blocks most similar to the (N/p²)-th first block.
In an exemplary embodiment, as shown in fig. 2, step S106 includes the following steps: step S200, pool each first block into corresponding first downsampling feature information to obtain M pieces of first downsampling feature information; step S202, pool each second block into corresponding second downsampling feature information to obtain M pieces of second downsampling feature information; step S204, calculate the similarity between first downsampling feature information a and each piece of second downsampling feature information, where the first downsampling feature information a corresponds to a first block a; step S206, determine the k second blocks corresponding to the k pieces of second downsampling feature information with the highest similarity as the k second blocks associated with the first block a, where 1 ≤ a ≤ M, 1 ≤ k < M, a, k and M are natural numbers, and M = N/p².
The N/p² first blocks obtained by partitioning the plurality of feature information F_t are pooled to obtain the N/p² pieces of downsampling feature information of the first frame X_t, i.e., F_t^p; the N/p² second blocks obtained by partitioning the plurality of feature information F_{t-1} are pooled to obtain the N/p² pieces of downsampling feature information of the second frame X_{t-1}, i.e., F_{t-1}^p.
The Euclidean distance between each piece of downsampling feature information in F_t^p and each piece of downsampling feature information in F_{t-1}^p is calculated to obtain the corresponding Euclidean distance matrix. For each piece of downsampling feature information in F_t^p, the k pieces of downsampling feature information in F_{t-1}^p with the shortest Euclidean distance are screened out. Then, according to the correspondence between each first block and the downsampling feature information in F_t^p, and between each second block and the downsampling feature information in F_{t-1}^p, the k second blocks associated with each first block are obtained. A sketch of this procedure follows.
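A hedged sketch of this pooling, distance and top-k screening step (average pooling is assumed as the pooling operator, which the embodiment does not fix):
```python
import torch
import torch.nn.functional as F

def topk_block_pairs(feat_t: torch.Tensor, feat_tm1: torch.Tensor,
                     p: int, k: int) -> torch.Tensor:
    """For each p x p block of F_t, return the indices of the k blocks of
    F_{t-1} with the shortest Euclidean distance between the pooled
    (downsampled) block descriptors. feat_*: (B, C, H, W)."""
    # Pool each block to one C-dim descriptor: M = N / p^2 descriptors.
    desc_t = F.avg_pool2d(feat_t, p).flatten(2).transpose(1, 2)      # (B, M, C)
    desc_tm1 = F.avg_pool2d(feat_tm1, p).flatten(2).transpose(1, 2)  # (B, M, C)
    # Pairwise Euclidean distance matrix between block descriptors.
    dist = torch.cdist(desc_t, desc_tm1)                             # (B, M, M)
    # k second-frame blocks closest to each first-frame block.
    return dist.topk(k, dim=-1, largest=False).indices               # (B, M, k)
```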
Step S108, according to a plurality of second blocks associated with each first block, similarity calculation is carried out on each feature information of the first frame and part of feature information of the second frame respectively, so as to obtain inter-frame similarity between the first frame and the second frame.
The similarity matrix may be an inter-frame pixel-level similarity matrix that can be used in a non-local attention mechanism, such as the non-local convolutional long-short term memory network described later herein.
The inter-frame similarity between the first frame X_t and the second frame X_{t-1} is represented by a similarity matrix S_t ∈ R^{N×N}, where S_t(i,j) is an element of S_t and represents the similarity between the feature information j of the first frame and the feature information i of the second frame:
when the feature information j of the first frame and the feature information i of the second frame are located in a first block and a second block that are associated with each other, the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated; when the feature information j of the first frame and the feature information i of the second frame are not located in an associated first block and second block, the similarity between the feature information j of the first frame and the feature information i of the second frame is set to 0.
The similarity between the feature information j of the first frame and the feature information i of the second frame is calculated according to the following formulas:
d_{i,j} = ||F_t(j) - F_{t-1}(i)||_2
S_t(i,j) = exp(-d_{i,j}/β) / Σ_i exp(-d_{i,j}/β)
where F_t(j) is the feature information j of the first frame, F_{t-1}(i) is the feature information i of the second frame, d_{i,j} is the Euclidean distance between the feature information j of the first frame and the feature information i of the second frame, and β is a constant. A sketch of this block-sparse similarity computation follows.
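A sketch of the computation, using the block indices returned by topk_block_pairs above for a single image; the normalization over the associated positions i is an assumption, and the loop form is for clarity rather than speed:
```python
import torch

def sparse_similarity(feat_t, feat_tm1, topk_idx, p: int, beta: float = 1.0):
    """Compute S_t for one image. feat_*: (C, H, W); topk_idx: (M, k),
    the associated second blocks per first block, e.g.
    topk_block_pairs(...)[0]. Row j indexes positions of frame t, column
    i positions of frame t-1; non-associated entries stay 0."""
    C, H, W = feat_t.shape
    N = H * W
    Ft = feat_t.reshape(C, N).t()      # (N, C): one feature vector per position
    Ftm1 = feat_tm1.reshape(C, N).t()  # (N, C)
    # Block index of every spatial position.
    rows = torch.arange(H) // p
    cols = torch.arange(W) // p
    block_of = (rows[:, None] * (W // p) + cols[None, :]).reshape(N)
    S = torch.zeros(N, N)
    for j in range(N):                                # position j in frame t
        assoc = topk_idx[block_of[j]]                 # its k associated blocks
        mask = torch.isin(block_of, assoc)            # positions i inside them
        d = (Ft[j] - Ftm1[mask]).norm(dim=1)          # Euclidean distances
        S[j, mask] = torch.softmax(-d / beta, dim=0)  # normalized similarity
    return S                                          # (N, N), block-sparse
```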
In an exemplary embodiment, the inter-frame similarity is used to determine a reference weight between the first frame and the second frame. The reference weights include weights of information in the first frame during the second frame enhancement operation or weights of information in the second frame during the first frame enhancement operation.
In an exemplary embodiment, the inter-frame similarity evaluation method may be used in video quality enhancement operations to handle motion patterns between different frames, such as large or blurry motion, with low computational resource occupancy, for example in the non-local module of a Non-Local Convolutional Long Short-Term Memory network (NL-ConvLSTM).
In an exemplary embodiment, as shown in fig. 3, the method further includes step S110: learning hidden state information at time t through a non-local convolution long-short term memory network, the hidden state information being used for enhancing the first frame. The non-local convolution long-short term memory network is configured to: determine the weight of the hidden state information and the weight of the unit state information output at time t-1 according to the inter-frame similarity between the first frame corresponding to time t and the second frame corresponding to time t-1, and convert the hidden state information and the unit state information output at time t-1 according to these weights to obtain target hidden state information and target unit state information, which serve as input data of the non-local convolution long-short term memory network at time t.
In an exemplary embodiment, to further reduce the consumption of computing resources, as shown in fig. 4, the conversion process is as follows: step S400, according to the plurality of second blocks associated with each first block, extract a plurality of third blocks at corresponding positions (the top-k positions in H_{t-1}) from the hidden state information output at time t-1, and generate the target hidden state information according to the hidden state information in the plurality of third blocks and the similarity matrix; step S402, according to the plurality of second blocks associated with each first block, extract a plurality of fourth blocks at corresponding positions from the unit state information output at time t-1, and generate the target unit state information according to the unit state information in the plurality of fourth blocks and the similarity matrix.
Referring to FIG. 5, for ease of understanding, an operational flow of a video enhancement method is provided below. This flow performs an enhancement operation on the first frame X_t to obtain the enhanced frame X̂_t of the first frame.
Step one, acquire the frame sequence {X_{t-T}, ..., X_{t+T}} to be processed, which includes the first frame X_t, the second frame X_{t-1} and other adjacent frames.
Step two, extract a plurality of feature information of each frame in the frame sequence.
The encoder extracts the corresponding feature information F_t from the first frame X_t, the corresponding feature information F_{t-1} from the second frame X_{t-1}, and so on, to obtain the feature information sequence {F_{t-T}, ..., F_{t-2}, F_{t-1}, F_t, F_{t+1}, F_{t+2}, ..., F_{t+T}} corresponding to the frame sequence {X_{t-T}, ..., X_{t+T}}.
Step three, input the plurality of feature information of each frame into the non-local convolution long-short term memory network, and acquire the reference feature information through the non-local convolution long-short term memory network, where the reference feature information includes the hidden state information H_t corresponding to the first frame X_t.
The non-local convolution long-short term memory network comprises a forward non-local convolution long-short term memory network and a backward non-local convolution long-short term memory network, the forward non-local convolution long-short term memory network comprises a first non-local module and a forward LSTM module, the backward non-local convolution long-short term memory network comprises a second non-local module and a backward LSTM module, the first non-local module is used for determining the weight of hidden state information output by a previous frame and the weight of unit state information output by the previous frame according to the inter-frame similarity between two adjacent frames, and the second non-local module is used for determining the weight of hidden state information output by a next frame and the weight of unit state information output by the next frame according to the inter-frame similarity between the two adjacent frames.
The forward non-local convolution long-short term memory network and the backward non-local convolution long-short term memory network are similar, differing only in temporal direction. For ease of understanding, the operation of the non-local convolution long-short term memory network is now described with reference to fig. 7.
With reference to fig. 6 and fig. 7, the work flow of the forward non-local convolution long-short term memory network at time t is taken as an example:
(1) The first non-local module receives: the plurality of feature information F_t of the first frame X_t corresponding to time t, and the plurality of feature information F_{t-1} of the second frame X_{t-1} corresponding to time t-1, where time t is the current time;
(2) The first non-local module calculates the inter-frame similarity between the first frame X_t and the second frame X_{t-1}. The calculation process is as follows:
(2.1) Partition the plurality of feature information F_t and the plurality of feature information F_{t-1} to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame.
The plurality of feature information F_t and the plurality of feature information F_{t-1} may exist in the form of feature maps, and each feature map may include N pieces of feature information. The block size can be set to p × p: the plurality of feature information F_t is divided into N/p² first blocks, and the plurality of feature information F_{t-1} is divided into N/p² second blocks;
(2.2) Pool the N/p² first blocks of F_t to obtain the N/p² pieces of downsampling feature information F_t^p of the first frame X_t; pool the N/p² second blocks of F_{t-1} to obtain the N/p² pieces of downsampling feature information F_{t-1}^p of the second frame X_{t-1};
(2.3) Compute the Euclidean distance between each piece of downsampling feature information in F_t^p and each piece of downsampling feature information in F_{t-1}^p to obtain the corresponding Euclidean distance matrix;
(2.4) For each piece of downsampling feature information in F_t^p, screen out from F_{t-1}^p the k pieces of downsampling feature information with the shortest Euclidean distance, obtaining the k second blocks most similar to each first block (the top-k blocks in F_{t-1});
(2.5) According to the plurality of second blocks most similar to each first block, calculate the similarity matrix S_t ∈ R^{N×N} between the first frame X_t and the second frame X_{t-1}, where S_t(i,j) is an element of S_t representing the similarity between the feature information j of the first frame X_t and the feature information i of the second frame X_{t-1}:
when the feature information j of the first frame X_t and the feature information i of the second frame X_{t-1} are located in a first block and a second block that are associated with each other, the similarity between them is calculated; when they are not located in an associated first block and second block, the similarity between them is set to 0.
The similarity between the feature information j of the first frame and the feature information i of the second frame is calculated according to the following formulas:
d_{i,j} = ||F_t(j) - F_{t-1}(i)||_2
S_t(i,j) = exp(-d_{i,j}/β) / Σ_i exp(-d_{i,j}/β)
where F_t(j) is the feature information j of the first frame, F_{t-1}(i) is the feature information i of the second frame, d_{i,j} is the Euclidean distance between the feature information j of the first frame and the feature information i of the second frame, and β is a constant.
(3) The first non-local module receives the hidden state information H_{t-1} and the cell state information C_{t-1} output at time t-1. According to the k second blocks most similar to each first block, a plurality of third blocks at corresponding positions (the top-k positions in H_{t-1}) are extracted from the hidden state information H_{t-1} output at time t-1, and the target hidden state information Ĥ_{t-1} is generated according to the hidden state information in the plurality of third blocks and the similarity matrix; likewise, a plurality of fourth blocks at corresponding positions are extracted from the cell state information C_{t-1} output at time t-1, and the target cell state information Ĉ_{t-1} is generated according to the cell state information in the plurality of fourth blocks and the similarity matrix. The reference formulas are as follows:
Ĥ_{t-1} = S_t · H_{t-1}
Ĉ_{t-1} = S_t · C_{t-1}
the first non-local module is for assisting in capturing a sequence of frames
Figure BDA0002223753390000137
The trajectory trend in (1) can be seen as a mechanism of attention. The first non-local module may capture global motion trajectories (global motion patterns) more efficiently than motion compensation (motion compensation). In addition, in the processing of the first non-local block, the inter-frame similarity can be directly determined according to the feature information of the corresponding two frames, and an additional network layer (additional layer) for generating a motion vector field (motion field) is required by training, for example, motion compensation.
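A minimal sketch of the conversion Ĥ_{t-1} = S_t · H_{t-1}, Ĉ_{t-1} = S_t · C_{t-1}, for a single image and with S_t held as a dense matrix for simplicity (in practice S_t is block-sparse):
```python
import torch

def convert_states(S: torch.Tensor, H_prev: torch.Tensor, C_prev: torch.Tensor):
    """Apply the similarity matrix as attention weights to the previous
    states. S: (N, N) with S[j, i] = similarity between position j of
    frame t and position i of frame t-1; H_prev, C_prev: (C, H, W)."""
    Ch, Hh, Wh = H_prev.shape
    N = Hh * Wh
    # Each target position j is a similarity-weighted sum over positions i.
    H_hat = (H_prev.reshape(Ch, N) @ S.t()).reshape(Ch, Hh, Wh)
    C_hat = (C_prev.reshape(Ch, N) @ S.t()).reshape(Ch, Hh, Wh)
    return H_hat, C_hat
```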
(4) Input F_t, Ĥ_{t-1} and Ĉ_{t-1} into the forward LSTM module, which outputs the hidden state information H_t and the cell state information C_t at time t. Specifically, this can be written as:
H_t, C_t = ConvLSTM(F_t, Ĥ_{t-1}, Ĉ_{t-1})
illustratively, the forward LSTM module operating principle may be as follows:
f_t = σ(W_xf * F_t + W_hf * Ĥ_{t-1} + b_f)
g_t = tanh(W_xg * F_t + W_hg * Ĥ_{t-1} + b_g)
i_t = σ(W_xi * F_t + W_hi * Ĥ_{t-1} + b_i)
o_t = σ(W_xo * F_t + W_ho * Ĥ_{t-1} + b_o)
C_t = f_t ⊙ Ĉ_{t-1} + i_t ⊙ g_t
H_t = o_t ⊙ tanh(C_t)
where σ denotes the sigmoid function, * denotes convolution and ⊙ denotes element-wise multiplication.
The forget gate receives memory information and decides which part of the memory is retained and which is forgotten. The forgetting factor f_t ∈ [0, 1] represents the weight applied at time t to the target cell state information Ĉ_{t-1} converted from the output at time t-1, and is used to determine whether the memory information learned at time t-1 (i.e., the converted target cell state information Ĉ_{t-1} output at time t-1) passes or partially passes.
The input gate selects the information to be memorized: i_t ∈ [0, 1] indicates the selection weight of the temporary cell state information g_t at time t, where g_t is the temporary cell state information at time t. The term f_t ⊙ Ĉ_{t-1} may indicate the information retained after deletion, and i_t ⊙ g_t may indicate the newly added information; the cell state information C_t at time t is obtained from these two parts.
The output gate outputs the hidden state information H_t at time t, where o_t ∈ [0, 1] denotes the selection weight of the cell state information at time t.
In addition, W_xf, W_hf, W_xg, W_hg, W_xi, W_hi, W_xo and W_ho are all weight parameters in the forward LSTM module; b_f, b_g, b_i and b_o are all bias terms in the forward LSTM module. These parameters are obtained by model training.
It should be noted that the above exemplary structure of the forward LSTM module is not intended to limit the scope of the present invention.
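For concreteness, a minimal ConvLSTM cell along the lines of the equations above; the 3 × 3 kernel and the fused gate convolution are assumptions, not specified by the embodiment:
```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Sketch of the forward LSTM module above; the four gates share one
    convolution over the concatenated input (kernel size is assumed)."""
    def __init__(self, in_ch: int, hid_ch: int, kernel: int = 3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel, padding=kernel // 2)

    def forward(self, F_t, H_hat, C_hat):
        z = self.gates(torch.cat([F_t, H_hat], dim=1))
        f, g, i, o = z.chunk(4, dim=1)
        f, i, o = torch.sigmoid(f), torch.sigmoid(i), torch.sigmoid(o)
        g = torch.tanh(g)                # temporary cell state g_t
        C_t = f * C_hat + i * g          # forget gate + input gate
        H_t = o * torch.tanh(C_t)        # output gate
        return H_t, C_t
```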
(5) Perform a decoding operation on the hidden state information H_t to obtain the residual of the first frame X_t.
The computer device may configure a decoder, where the decoder comprises a convolutional neural network and a nonlinear activation function, and the convolutional neural network comprises a plurality of convolutional layers. The decoder is structurally symmetric to the encoder.
(6) Obtain the enhanced frame X̂_t of the first frame X_t from the residual and the first frame X_t. A sketch of steps (5) and (6) follows.
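A hedged sketch of the decoder and the final combination; layer sizes are illustrative, and the residual skip connection is an assumption consistent with step (6):
```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Sketch of the decoder, structurally symmetric to the encoder
    sketched earlier (layer sizes again illustrative)."""
    def __init__(self, feat_channels: int = 64, out_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_channels, out_channels, kernel_size=3, padding=1),
        )

    def forward(self, H_t: torch.Tensor) -> torch.Tensor:
        return self.net(H_t)  # residual for frame X_t

# Step (6): enhanced frame = input frame plus the decoded residual,
# e.g. X_enhanced = X_t + Decoder()(H_t) for matching shapes.
```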
The video quality enhancement operation provided by this embodiment can improve video quality with fewer computing resources and effectively remove artifacts.
The technical solution provided by this embodiment effectively reduces computational complexity. The analysis is as follows:
The original non-local module calculates the similarity between each piece of feature information of the second frame and each piece of feature information of the first frame to obtain a similarity matrix S_t (S_t ∈ R^{N×N}), and performs the conversion operation on H_{t-1} and C_{t-1} at time t-1 according to the similarity matrix.
For convenience, φ denotes the computational complexity of the non-local module of this embodiment, and ψ denotes the computational complexity of the original non-local module, as shown in Table 1:
         Original non-local module       Non-local module of this embodiment
Time     O(2N²C)                         O((N/p²)²(C + log k) + 2kNCp²)
Space    O(2N²)                          O((N/p²)² + kN/p² + 2kNp²)
TABLE 1
In the case of log k < C, φ = O((N/p²)²C + 2kNCp²), i.e., a constant multiple of (N/p²)²C + 2kNCp², where O denotes the same order of magnitude as the value in parentheses, N denotes the number of pieces of feature information, p × p denotes the block size, and C is the number of channels. Then φ/ψ = 1/(2p⁴) + kp²/N ≤ 1, since kp² ≤ N; therefore, the computational complexity can be dynamically reduced according to k and p. For a given k, taking p = (N/k)^(1/6) minimizes φ/ψ at 1.5(k/N)^(2/3). Further, when p = 10, k = 4, C = 64 and f = 4, φ may be close to O(NC²f²) (a constant multiple of NC²f²), corresponding to the computational complexity of a convolutional layer with convolution kernel size f. With continued reference to Table 1, with p set to 10 and k set to 4, φ may be close to one thousandth of ψ. A quick numeric check of the ratio follows.
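A quick check of the ratio φ/ψ = 1/(2p⁴) + kp²/N for the quoted settings; the value of N is an assumed example, and the ratio depends on it:
```python
# phi/psi = 1/(2 p^4) + k p^2 / N for p = 10, k = 4; N (number of feature
# positions) is an assumed example value here.
p, k, N = 10, 4, 256 * 256
ratio = 1 / (2 * p**4) + k * p**2 / N
print(f"phi/psi ~= {ratio:.5f}")  # ~= 0.00615 for this N; smaller for larger N
```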
Example two
Fig. 8 is a block diagram of an inter-frame similarity evaluation system according to a second embodiment of the present application, which may be partitioned into one or more program modules, stored in a storage medium, and executed by one or more processors to implement the second embodiment of the present application. The program modules referred to in the embodiments of the present application refer to a series of computer program instruction segments that can perform specific functions, and the following description will specifically describe the functions of the program modules in the embodiments.
As shown in fig. 8, the inter-frame similarity evaluation system 800 may include the following components:
a first obtaining module 810, configured to obtain a first frame and a second frame in a frame sequence;
an extracting module 820, configured to extract a plurality of feature information of the first frame and a plurality of feature information of the second frame;
a blocking module 830, configured to block the feature information of the first frame and the feature information of the second frame to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame;
a second obtaining module 840, configured to obtain a plurality of second blocks associated with each first block; and
a third obtaining module 850, configured to perform similarity calculation on each feature information of the first frame and part of feature information of the second frame according to a plurality of second blocks associated with each first block, so as to obtain inter-frame similarity between the first frame and the second frame.
In an exemplary embodiment, the second obtaining module 840 is further configured to: pool each first block into corresponding first downsampling feature information to obtain M pieces of first downsampling feature information; pool each second block into corresponding second downsampling feature information to obtain M pieces of second downsampling feature information; calculate the similarity between first downsampling feature information a and each piece of second downsampling feature information, where the first downsampling feature information a corresponds to a first block a; and determine the k second blocks corresponding to the k pieces of second downsampling feature information with the highest similarity as the k second blocks associated with the first block a, where 1 ≤ a ≤ M, 1 ≤ k < M, and a, k and M are natural numbers.
In an exemplary embodiment, the third obtaining module 850 is further configured to: represent the inter-frame similarity between the first frame and the second frame by a similarity matrix S_t ∈ R^{N×N}, where S_t(i,j) is an element of S_t and represents the similarity between the feature information j of the first frame and the feature information i of the second frame: when the feature information j of the first frame and the feature information i of the second frame are located in a first block and a second block that are associated with each other, the similarity between them is calculated; when the feature information j of the first frame and the feature information i of the second frame are not located in an associated first block and second block, the similarity between them is set to 0.
In an exemplary embodiment, the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated according to the following formulas:
d_{i,j} = ||F_t(j) - F_{t-1}(i)||_2
S_t(i,j) = exp(-d_{i,j}/β) / Σ_i exp(-d_{i,j}/β)
where F_t(j) is the feature information j of the first frame, F_{t-1}(i) is the feature information i of the second frame, d_{i,j} is the Euclidean distance between the feature information j of the first frame and the feature information i of the second frame, and β is a constant.
In an exemplary embodiment, the inter-frame similarity is used to determine a reference weight between the first frame and the second frame.
In an exemplary embodiment, the system further comprises a learning module configured to: learn hidden state information at time t through a non-local convolution long-short term memory network, the hidden state information being used for enhancing the first frame; wherein the non-local convolution long-short term memory network is configured to: determine the weight of the hidden state information and the weight of the unit state information output at time t-1 according to the inter-frame similarity between the first frame corresponding to time t and the second frame corresponding to time t-1, and convert the hidden state information and the unit state information output at time t-1 according to these weights to obtain target hidden state information and target unit state information, which serve as input data of the non-local convolution long-short term memory network at time t.
In an exemplary embodiment, the learning module is further configured to: extract a plurality of third blocks at corresponding positions from the hidden state information output at time t-1 according to the plurality of second blocks associated with each first block, and generate the target hidden state information according to the hidden state information in the plurality of third blocks and the similarity matrix; and extract a plurality of fourth blocks at corresponding positions from the unit state information output at time t-1 according to the plurality of second blocks associated with each first block, and generate the target unit state information according to the unit state information in the plurality of fourth blocks and the similarity matrix.
EXAMPLE III
Fig. 9 schematically shows a hardware architecture diagram of a computer device suitable for implementing the inter-frame similarity evaluation method according to a third embodiment of the present application. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. For example, it may be a smartphone, a tablet computer, a notebook computer, a desktop computer, a monitoring device, a video conference system, a rack server, a blade server, a tower server or a cabinet server (including an independent server or a server cluster composed of multiple servers), and the like. As shown in fig. 9, the computer device 2 at least includes, but is not limited to: a memory 21, a processor 22 and a network interface 23, which may be communicatively coupled to each other through a system bus. Wherein:
the memory 21 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the storage 21 may be an internal storage module of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk provided on the computer device 2, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Of course, the memory 21 may also comprise both an internal memory module of the computer device 2 and an external memory device thereof. In this embodiment, the memory 21 is generally used for storing an operating system installed in the computer device 2 and various types of application software, such as program codes of the inter-frame similarity evaluation method. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is generally configured to control the overall operation of the computer device 2, such as performing control and processing related to data interaction or communication with the computer device 2. In this embodiment, the processor 22 is configured to execute the program code stored in the memory 21 or process data.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is typically used to establish a communication connection between the computer device 2 and other computer devices. For example, the network interface 23 is used to connect the computer device 2 with an external terminal through a network, establish a data transmission channel and a communication connection between the computer device 2 and the external terminal, and the like. The network may be a wireless or wired network such as an Intranet (Intranet), the Internet (Internet), a Global System of Mobile communication (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth (Bluetooth), or Wi-Fi.
It is noted that fig. 9 only shows a computer device with components 21-23, but it is to be understood that not all of the shown components are required to be implemented, and that more or less components may be implemented instead.
In this embodiment, the method for evaluating the inter-frame similarity stored in the memory 21 may be further divided into one or more program modules and executed by one or more processors (in this embodiment, the processor 22) to complete the present invention.
Example four
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the inter-frame similarity evaluation method in the embodiments.
In this embodiment, the computer-readable storage medium includes a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer readable storage medium may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. In other embodiments, the computer readable storage medium may be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device. Of course, the computer-readable storage medium may also include both internal and external storage devices of the computer device. In this embodiment, the computer-readable storage medium is generally used for storing an operating system and various types of application software installed in the computer device, for example, the program code of the inter-frame similarity evaluation method in the embodiment, and the like. Further, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the invention described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by a computing device, and in some cases the steps shown or described may be performed in an order different from that described herein. They may also be separately fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. An inter-frame similarity evaluation method, the method comprising:
acquiring a first frame and a second frame in a frame sequence;
extracting a plurality of feature information of the first frame and a plurality of feature information of the second frame;
partitioning the plurality of feature information of the first frame and the plurality of feature information of the second frame to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame;
acquiring a plurality of second blocks related to each first block by calculating the similarity between each first block and each second block; and
according to the plurality of second blocks associated with each first block, performing similarity calculation between each piece of feature information of the first frame and part of the feature information of the second frame to obtain the inter-frame similarity between the first frame and the second frame, comprising: representing the inter-frame similarity between the first frame and the second frame by a similarity matrix $S_t$, wherein each element $S_t(i,j)$ of $S_t$ represents the similarity between the feature information j of the first frame and the feature information i of the second frame: when the feature information j of the first frame and the feature information i of the second frame are respectively and correspondingly located in a first block and a second block having an association relationship, the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated; and when the feature information j of the first frame and the feature information i of the second frame are not correspondingly located in a first block and a second block having an association relationship, the similarity between the feature information j of the first frame and the feature information i of the second frame is set to 0.
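For concreteness, here is a minimal NumPy sketch of the block-sparse similarity computation described in claim 1. It is not the patented implementation: the original formula images (the FDA figures) are not recoverable, so the symbol $S_t$ above and every name below (`block_sparse_similarity`, `assoc`, `beta`) are illustrative, and the sketch assumes (H, W, C) feature maps, square blocks of side `block`, and the exponential-of-Euclidean-distance similarity of claim 3. The `assoc` mapping comes from the top-k association step of claim 2, sketched after that claim.

```python
import numpy as np

def block_sparse_similarity(F_t, F_prev, assoc, block, beta=1.0):
    """Block-sparse similarity between two (H, W, C) feature maps.

    assoc maps each first-frame block index to the list of second-frame
    block indices associated with it; entries for every other block pair
    are left at 0, as required by claim 1.
    S[i, j] holds the similarity between feature i of the second frame
    and feature j of the first frame.
    """
    H, W, C = F_t.shape
    ft, fp = F_t.reshape(-1, C), F_prev.reshape(-1, C)
    S = np.zeros((H * W, H * W))
    bw = W // block  # number of blocks per row

    def members(b):  # flat pixel indices belonging to block b
        y, x = divmod(b, bw)
        return [(y * block + dy) * W + (x * block + dx)
                for dy in range(block) for dx in range(block)]

    for b1, b2_list in assoc.items():
        for j in members(b1):
            for b2 in b2_list:
                for i in members(b2):
                    S[i, j] = np.exp(-beta * np.linalg.norm(ft[j] - fp[i]))
    return S
```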
2. The method according to claim 1, wherein obtaining a plurality of second blocks associated with each first block comprises:
pooling each first block into corresponding first downsampled feature information to obtain M pieces of first downsampled feature information;
pooling each second block into corresponding second downsampled feature information to obtain M pieces of second downsampled feature information;
calculating the similarity between first downsampled feature information a and each piece of second downsampled feature information, wherein the first downsampled feature information a corresponds to a first block a; and
determining the k second blocks corresponding to the k pieces of second downsampled feature information with the highest similarity as the k second blocks associated with the first block a, wherein 1 ≤ a ≤ M, 1 ≤ k < M, and a, k, and M are natural numbers.
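A hedged sketch of this association step: each block is reduced to one C-dimensional descriptor by average pooling (the claim says only "pooling"; the average is an assumption), and for each first block the k second blocks with the most similar descriptors are retained. All names are illustrative.

```python
import numpy as np

def associate_blocks(F_t, F_prev, block, k, beta=1.0):
    """Map each first-block index to its k associated second-block indices."""
    H, W, C = F_t.shape
    bh, bw = H // block, W // block

    def pool(F):  # average-pool every block to a single vector -> (M, C)
        return F.reshape(bh, block, bw, block, C).mean(axis=(1, 3)).reshape(-1, C)

    d_t, d_prev = pool(F_t), pool(F_prev)              # M descriptors per frame
    dist = np.linalg.norm(d_t[:, None, :] - d_prev[None, :, :], axis=-1)
    sim = np.exp(-beta * dist)                         # (M, M) descriptor similarity
    topk = np.argsort(-sim, axis=1)[:, :k]             # k most similar second blocks
    return {a: topk[a].tolist() for a in range(bh * bw)}
```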
3. The method according to claim 1, wherein the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated by the following formula:
$$S_t(i,j) = e^{-\beta\, D(i,j)}, \qquad D(i,j) = \left\| F_t(j) - F_{t-1}(i) \right\|_2$$

wherein $F_t(j)$ is the feature information j of the first frame, $F_{t-1}(i)$ is the feature information i of the second frame, $D(i,j)$ is the Euclidean distance between the feature information j of the first frame and the feature information i of the second frame, and $\beta$ is a constant.
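In code, the per-pair similarity of claim 3 reduces to a single expression (a sketch; `beta` stands for the constant $\beta$):

```python
import numpy as np

def pair_similarity(f_j, f_i, beta=1.0):
    """exp(-beta * ||F_t(j) - F_{t-1}(i)||_2), as in claim 3."""
    return np.exp(-beta * np.linalg.norm(f_j - f_i))
```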
4. The inter-frame similarity evaluation method according to claim 1, wherein the inter-frame similarity is used to determine a reference weight between the first frame and the second frame.
5. The method of claim 4, further comprising:
learning hidden state information at time t through a non-local convolutional long short-term memory network, wherein the hidden state information is used to enhance the first frame;
wherein the non-local convolutional long short-term memory network is configured to: determine the weight of the hidden state information and the weight of the cell state information output at time t-1 according to the inter-frame similarity between a first frame corresponding to time t and a second frame corresponding to time t-1, and convert the hidden state information and the cell state information output at time t-1 according to these weights to obtain target hidden state information and target cell state information, which serve as input data of the non-local convolutional long short-term memory network at time t.
6. The method according to claim 5, wherein converting the hidden state information and the cell state information output at time t-1 according to the weight of the hidden state information and the weight of the cell state information output at time t-1 comprises:
extracting, according to the plurality of second blocks associated with each first block, a plurality of third blocks at corresponding positions from the hidden state information output at time t-1, and generating the target hidden state information according to the hidden state information in the plurality of third blocks and the similarity matrix; and
extracting, according to the plurality of second blocks associated with each first block, a plurality of fourth blocks at corresponding positions from the cell state information output at time t-1, and generating the target cell state information according to the cell state information in the plurality of fourth blocks and the similarity matrix.
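One plausible reading of claim 6 in code, not the patented implementation: for each target position, the previous hidden (or cell) state is aggregated over the positions of the associated blocks, weighted by the sparse similarity matrix of claim 1. The per-column normalization is an assumption; the claim states only that the states are converted according to the similarity-derived weights.

```python
import numpy as np

def warp_state(state_prev, S):
    """Similarity-weighted aggregation of a previous (H, W, C) state.

    S is the (H*W, H*W) block-sparse matrix of claim 1, with S[i, j]
    relating position i at time t-1 to position j at time t. Because S
    is block-sparse, each output position mixes only the states in its
    associated (third/fourth) blocks.
    """
    H, W, C = state_prev.shape
    flat = state_prev.reshape(-1, C)
    w = S / np.maximum(S.sum(axis=0, keepdims=True), 1e-8)  # normalize per target (assumption)
    return (w.T @ flat).reshape(H, W, C)
```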
7. An inter-frame similarity evaluation system, the system comprising:
the device comprises a first acquisition module, a second acquisition module and a first display module, wherein the first acquisition module is used for acquiring a first frame and a second frame in a frame sequence;
an extraction module, configured to extract a plurality of feature information of the first frame and a plurality of feature information of the second frame;
a blocking module, configured to partition the plurality of feature information of the first frame and the plurality of feature information of the second frame to obtain a plurality of first blocks corresponding to the first frame and a plurality of second blocks corresponding to the second frame;
a second acquisition module, configured to acquire a plurality of second blocks associated with each first block by calculating the similarity between each first block and each second block; and
a third obtaining module, configured to perform similarity calculation on each piece of feature information of the first frame and part of feature information of the second frame according to a plurality of second blocks associated with each first block, so as to obtain inter-frame similarity between the first frame and the second frame;
wherein the third obtaining module is further configured to: represent the inter-frame similarity between the first frame and the second frame by a similarity matrix $S_t$, wherein each element $S_t(i,j)$ of $S_t$ represents the similarity between the feature information j of the first frame and the feature information i of the second frame: when the feature information j of the first frame and the feature information i of the second frame are respectively and correspondingly located in a first block and a second block having an association relationship, the similarity between the feature information j of the first frame and the feature information i of the second frame is calculated; and when the feature information j of the first frame and the feature information i of the second frame are not correspondingly located in a first block and a second block having an association relationship, the similarity between the feature information j of the first frame and the feature information i of the second frame is set to 0.
8. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the inter-frame similarity evaluation method according to any one of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the inter-frame similarity evaluation method according to any one of claims 1 to 6.
CN201910944335.1A 2019-09-30 2019-09-30 Method and system for evaluating interframe similarity Active CN112584146B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910944335.1A CN112584146B (en) 2019-09-30 2019-09-30 Method and system for evaluating interframe similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910944335.1A CN112584146B (en) 2019-09-30 2019-09-30 Method and system for evaluating interframe similarity

Publications (2)

Publication Number Publication Date
CN112584146A CN112584146A (en) 2021-03-30
CN112584146B (en) 2021-09-28

Family

ID=75116590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910944335.1A Active CN112584146B (en) 2019-09-30 2019-09-30 Method and system for evaluating interframe similarity

Country Status (1)

Country Link
CN (1) CN112584146B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114241223B (en) * 2021-12-17 2023-03-24 北京达佳互联信息技术有限公司 Video similarity determination method and device, electronic equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426606A (en) * 2011-11-11 2012-04-25 南京财经大学 Method for retrieving multi-feature image based on particle swarm algorithm
CN104392439A (en) * 2014-11-13 2015-03-04 北京智谷睿拓技术服务有限公司 Image similarity confirmation method and device
CN105578198A (en) * 2015-12-14 2016-05-11 上海交通大学 Video homologous Copy-Move detection method based on time offset characteristic
CN107103270A (en) * 2016-02-23 2017-08-29 云智视像科技(上海)有限公司 A kind of face identification system of the dynamic calculation divided group coefficient based on IDF
CN107122787A (en) * 2017-02-14 2017-09-01 北京理工大学 A kind of image scaling quality evaluating method of feature based fusion
CN107153824A (en) * 2017-05-22 2017-09-12 中国人民解放军国防科学技术大学 Across video pedestrian recognition methods again based on figure cluster
CN109241911A (en) * 2018-09-07 2019-01-18 北京相貌空间科技有限公司 Human face similarity degree calculation method and device
CN109859245A (en) * 2019-01-22 2019-06-07 深圳大学 Multi-object tracking method, device and the storage medium of video object
CN109948666A (en) * 2019-03-01 2019-06-28 广州杰赛科技股份有限公司 Image similarity recognition methods, device, equipment and storage medium
CN110162657A (en) * 2019-05-28 2019-08-23 山东师范大学 A kind of image search method and system based on high-level semantics features and color characteristic

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180087994A (en) * 2017-01-26 2018-08-03 삼성전자주식회사 Stero matching method and image processing apparatus

Also Published As

Publication number Publication date
CN112584146A (en) 2021-03-30

Similar Documents

Publication Publication Date Title
CN109493350B (en) Portrait segmentation method and device
CN107967669B (en) Picture processing method and device, computer equipment and storage medium
CN111629262B (en) Video image processing method and device, electronic equipment and storage medium
CN109816615B (en) Image restoration method, device, equipment and storage medium
CN110956219B (en) Video data processing method, device and electronic system
EP2063644A2 (en) Image encoding device and encoding method, and image decoding device and decoding method
Hayat Super-resolution via deep learning
US11328184B2 (en) Image classification and conversion method and device, image processor and training method therefor, and medium
US20160255357A1 (en) Feature-based image set compression
WO2020043296A1 (en) Device and method for separating a picture into foreground and background using deep learning
US9230161B2 (en) Multiple layer block matching method and system for image denoising
Ding et al. A deep learning approach for quality enhancement of surveillance video
CN112584146B (en) Method and system for evaluating interframe similarity
CN112584158B (en) Video quality enhancement method and system
JP6275719B2 (en) A method for sampling image colors of video sequences and its application to color clustering
US11403782B2 (en) Static channel filtering in frequency domain
CN111861940A (en) Image toning enhancement method based on condition continuous adjustment
CN112132769A (en) Image fusion method and device and computer equipment
CN116486009A (en) Monocular three-dimensional human body reconstruction method and device and electronic equipment
CN115984307A (en) Video object segmentation method and device, electronic equipment and storage medium
CN114627211A (en) Video business card generation method and device, computer equipment and storage medium
WO2020077535A1 (en) Image semantic segmentation method, computer device, and storage medium
Seetharaman A block-oriented restoration in gray-scale images using full range autoregressive model
CN114095728B (en) End-to-end video compression method, device and computer readable storage medium
CN115170451A (en) Sky background replacing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant