CN113573058B - Interframe image coding method based on space-time significance fusion - Google Patents


Info

Publication number
CN113573058B
Authority
CN
China
Prior art keywords
saliency map
time
node
space
saliency
Prior art date
Legal status
Active
Application number
CN202111112916.2A
Other languages
Chinese (zh)
Other versions
CN113573058A (en)
Inventor
蒋先涛
蔡佩华
张纪庄
郭咏梅
郭咏阳
Current Assignee
Kangda Intercontinental Medical Devices Co ltd
Original Assignee
Kangda Intercontinental Medical Devices Co ltd
Priority date
Filing date
Publication date
Application filed by Kangda Intercontinental Medical Devices Co ltd filed Critical Kangda Intercontinental Medical Devices Co ltd
Priority to CN202111112916.2A priority Critical patent/CN113573058B/en
Publication of CN113573058A publication Critical patent/CN113573058A/en
Application granted granted Critical
Publication of CN113573058B publication Critical patent/CN113573058B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 - Quantisation
    • H04N19/134 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 - Data rate or code amount at the encoder output
    • H04N19/169 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an interframe image coding method based on space-time significance fusion, which relates to the technical field of image processing and mainly comprises the following steps: acquiring a time saliency map according to the time domain motion vector of each pixel point; extracting a single-layer image with super-pixel characteristics in the inter-frame image according to the mean value characteristics of all the pixel points; obtaining a transfer matrix according to the single-layer graph and the mean value characteristics of the pixel points corresponding to the nodes; acquiring a space saliency map based on a Markov chain theory according to the transfer matrix; acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map; acquiring a saliency map according to the space-time saliency map; and dynamically adjusting the quantization parameter according to the mean value characteristic of the corresponding pixel point in the saliency map, and encoding the interframe image according to the quantization parameter. The invention combines the motion characteristics of the image in time domain and space domain to obtain the time-space domain saliency map for coding, so that the coded data contains more image information, and the fidelity of the decoded data is improved.

Description

Interframe image coding method based on space-time significance fusion
Technical Field
The invention relates to the technical field of image processing, in particular to an interframe image coding method based on space-time significance fusion.
Background
Currently, with the increasingly widespread application of H.265/HEVC and its extended coding standards, the known perceptual computing models are mainly classified into four categories: the Region-of-Interest calculation model (ROI), the Visual Attention model, the Visual Sensitivity model, and the Cross-Modal Attention model. Perceptual coding methods can be further classified into three categories: preprocessing methods, non-scalable coding methods, and scalable coding methods. A preprocessing method usually performs visual optimization on the inter-frame images of the original video before encoding and does not need to change the encoder. A non-scalable coding method requires changes to the codec while performing the visual optimization. A scalable coding method only needs to change the encoder when the visual optimization is carried out. The criterion for evaluating the performance of perceptual coding is the improvement in coding efficiency or visual quality, and for some real-time applications the computational complexity of the perceptual model also needs to be verified.
Although research on inter-frame image coding based on visual perception has advanced greatly in recent years, shortcomings remain. (1) Saliency calculation for inter-frame images: there is currently a lack of efficient computational models for inter-frame image coding applications. Although international research on the saliency of static images has made great progress, research on saliency detection for dynamic inter-frame images is still in its infancy and has not yet formed a system. (2) The existing perceptual coding framework is not complete enough: information interaction between the inter-frame image perceptron and the encoder is limited to salient-object detection, which is not conducive to information sharing between the two (such as foreground/background partition information, motion type information, and so on).
With the development of the mass-media industry, the requirements on the timeliness and fidelity of video transmission are becoming higher and higher. Against this background, computational models for inter-frame image coding leave considerable room for further research.
Disclosure of Invention
In order to overcome the defects of the prior art when the inter-frame image is coded, the invention provides an inter-frame image coding method based on space-time significance fusion, which comprises the following steps:
s1: acquiring an inter-frame image and extracting the mean value characteristic of each pixel point through a color difference calculation space;
s2: acquiring a time domain motion vector of each pixel point in the interframe image through an optical flow algorithm, and acquiring a time saliency map according to the time domain motion vector of each pixel point;
s3: extracting a single-layer image with super-pixel characteristics in the inter-frame image according to the mean value characteristics of all the pixel points;
s4: obtaining a transfer matrix according to the weight relation between the nodes and the edges in the single-layer graph and the mean value characteristics of the corresponding pixels of the nodes;
s5: acquiring a space saliency map based on a Markov chain theory according to the transfer matrix;
s6: acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map;
s7: carrying out normalization processing in a preset color gradation range according to the space-time saliency map to obtain a saliency map;
s8: and dynamically adjusting the quantization parameter according to the mean value characteristic of the corresponding pixel point in the saliency map, and encoding the interframe image according to the quantization parameter.
Further, the temporal saliency map is composed of the temporal saliency value of each pixel point, where the temporal saliency value is obtained as follows:
MV(x, y) = sqrt( MV_x(x, y)^2 + MV_y(x, y)^2 )
where (x, y) is the pixel coordinate of pixel point i, MV(x, y) is the amplitude of the temporal motion vector, and MV_x(x, y) and MV_y(x, y) are the horizontal and vertical components of the temporal motion vector. The amplitude MV(x, y) is then enhanced using two constant parameters to obtain the enhanced amplitude MV_E(x, y), and MV_E(x, y) is normalized to the preset tone-scale range; the normalized result S_T(i) is the temporal saliency value corresponding to pixel point i.
Further, the nodes of the single-layer graph include transient nodes and absorbing nodes, where each node is connected to the transient nodes that are adjacent to it or that share an edge with its adjacent nodes, and step S4 further includes the following step:
acquiring the weight of the edge between adjacent nodes according to the mean features of the pixel points corresponding to the nodes, and renumbering the nodes.
Further, the weight of the edge between adjacent nodes can be expressed as:
w_mn = e^( -||x_m - x_n|| / σ^2 )
where m and n are two adjacent nodes in the single-layer graph, w_mn is the weight of the edge between node m and node n, x_m and x_n are the mean features of the pixel points corresponding to node m and node n respectively, σ is a constant, and e is the Euler number.
Further, the transition matrix can be expressed by the following formulas:
a_{l_m, l_n} = w_mn if node l_n is connected with node l_m (l_n ∈ N(l_m)); a_{l_m, l_n} = 1 if l_m = l_n; a_{l_m, l_n} = 0 otherwise
D = diag( Σ_{l_n} a_{l_m, l_n} )
P = D^{-1} · A
where l_m is the number of node m after renumbering, l_n is the number of node n after renumbering, A is the adjacency matrix with entries a_{l_m, l_n}, N(m) denotes the set of nodes communicating with node m, D is the degree matrix, P is the transition matrix, and t is the number of transient nodes.
Further, the spatial saliency map is composed of the spatial saliency value of each pixel point, where the spatial saliency value is obtained as follows:
y = (I - Q)^{-1} · c
S_S(i) = ȳ(l_i)
where Q contains the transition probabilities between the transient states after the transition matrix P is expressed as a Markov absorbing chain, I is an r × r matrix, r is the number of absorbing nodes, c is a t-dimensional column vector with all elements equal to 1, and y is the absorption time of the corresponding transient nodes; l_i is the renumbered index of the transient node corresponding to pixel point i, ȳ is the normalized absorption-time vector, and S_S(i) is the spatial saliency value.
Further, the spatio-temporal saliency map is composed of the spatio-temporal saliency value of each pixel point, and the spatio-temporal saliency value can be expressed as:
S_ST(i) = α · S_S(i) + β · S_T(i)
where α is the weight of the spatial saliency map, β is the weight of the temporal saliency map, S_S(i) is the spatial saliency value of pixel point i, S_T(i) is the temporal saliency value of pixel point i, and S_ST(i) is the spatio-temporal saliency value of pixel point i.
Further, the adjustment of the quantization parameter in step S8 can be expressed as:
QP(x, y) = QP_0 + ΔQP
where u(x, y) is the mean feature of the corresponding pixel point i(x, y) in the saliency map; the correction value ΔQP is determined, with a rounding operation Int, by comparing u(x, y) against the thresholds q_1 and q_2, which control the quantization parameter and are in a proportional relation to the mean feature of the corresponding pixel point i(x, y) in the saliency map; QP_0 is the initial value of the quantization coefficient.
Compared with the prior art, the invention has at least the following effects:
(1) The inter-frame image coding method based on spatio-temporal saliency fusion combines the temporal domain and the spatial domain of the image, on the basis of the motion characteristics of the image in both domains, to obtain a spatio-temporal saliency map, and encodes the video inter-frame images according to the normalized result, so that the information interaction between the perceptron and the encoder is no longer limited to salient-object detection, and more foreground/background partition change information and motion type information of the inter-frame images can be obtained during the inter-frame image change process;
(2) with more information exchanged between the perceptron and the encoder, the encoded compressed data retains more related information, so that a video image with higher definition and fidelity is obtained when the compressed data is decoded; the method can therefore be applied to the compression coding of high-definition video;
(3) while the information interactivity is improved, the coding quantization parameter is dynamically adjusted through the spatio-temporal saliency value, which reduces the coding bit rate and increases the coding speed.
Drawings
FIG. 1 is a diagram of method steps for an inter-frame image coding method based on spatio-temporal saliency fusion;
FIG. 2 is a schematic diagram of spatio-temporal saliency fusion.
Detailed Description
The following are specific embodiments of the present invention and are further described with reference to the drawings, but the present invention is not limited to these embodiments.
Example one
The invention aims to solve the problem that, in the prior art, inter-frame image coding in video provides insufficient information interaction, so that video distortion easily occurs during data decoding. As shown in FIG. 1, the invention provides an inter-frame image coding method based on spatio-temporal saliency fusion, which comprises the following steps:
s1: acquiring an inter-frame image and extracting the mean value characteristic of each pixel point through a color difference calculation space;
s2: acquiring a time domain motion vector of each pixel point in the interframe image through an optical flow algorithm, and acquiring a time saliency map according to the time domain motion vector of each pixel point;
s3: extracting a single-layer image with super-pixel characteristics in the inter-frame image according to the mean value characteristics of all the pixel points;
s4: obtaining a transfer matrix according to the weight relation between the nodes and the edges in the single-layer graph and the mean value characteristics of the corresponding pixels of the nodes;
s5: acquiring a space saliency map based on a Markov chain theory according to the transfer matrix;
s6: acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map;
s7: carrying out normalization processing in a preset color gradation range according to the space-time saliency map to obtain a saliency map;
s8: and dynamically adjusting the quantization parameter according to the mean value characteristic of the corresponding pixel point in the saliency map, and encoding the interframe image according to the quantization parameter.
Considering that the human visual perception system is more sensitive in the color difference calculation space (CIELab), in order to make the decoded inter-frame image of the encoded video data better conform to human visual perception habits, the invention pre-processes each inter-frame image of the video after obtaining it: the color of the inter-frame image is converted from the RGB space to the color difference calculation space, and the mean value of each pixel point in the color difference calculation space is calculated as the feature of that pixel point.
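For reference, a minimal Python sketch of this preprocessing step follows. It uses OpenCV as one convenient implementation choice, and reading the "mean value of each pixel point in the color difference calculation space" as the per-pixel mean of the L, a and b channels is an assumption rather than something the text spells out.

```python
import cv2
import numpy as np

def mean_feature(frame_bgr):
    """Per-pixel mean feature in the CIELab color difference space.

    Assumes the 'mean value of each pixel point' is the mean of the
    L, a, b channel values at that pixel (one reading of the text)."""
    lab = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2Lab).astype(np.float32)
    return lab.mean(axis=2)  # H x W map of per-pixel mean features
```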
Since the method is a coding method based on spatio-temporal saliency fusion, it must include both temporal saliency and spatial saliency. For the calculation of temporal saliency, the invention acquires the temporal saliency map of the inter-frame image based on an optical flow algorithm (Lucas-Kanade). Specifically, the horizontal component MV_x(x, y) and the vertical component MV_y(x, y) of the temporal motion vector of each pixel point are obtained through the optical flow algorithm, and the amplitude MV(x, y) of the temporal motion vector is then obtained from the two components:
MV(x, y) = sqrt( MV_x(x, y)^2 + MV_y(x, y)^2 )
Further, through an enhancement operation, the amplitude of the temporal motion vector is converted into an enhanced amplitude MV_E(x, y) using the constant parameters λ1 and λ2; in the present embodiment, the value of λ1 is selected as 10 and the value of λ2 is selected as 2. Finally, MV_E(x, y) is normalized to a preset tone-scale range (in this embodiment the preset range is selected as [0, 255]), and the normalized result is taken as the temporal saliency value S_T(i) of pixel point i(x, y).
The corresponding coordinates of the temporal saliency values are then matched according to the obtained temporal saliency value of each pixel point and the coordinates of the pixel points in the inter-frame image, thereby obtaining the temporal saliency map.
For the calculation of spatial saliency, the method acquires a spatial saliency map at the superpixel level based on a Markov-chain saliency detection method. First, a single-layer graph G(V, E) with superpixel characteristics is extracted from the inter-frame image according to the mean features of all pixel points, where V and E respectively represent the nodes and edges of the single-layer graph G. On the single-layer graph G, each node needs to be connected to the transient nodes that are adjacent to it or that share an edge with its adjacent nodes. Based on this, the weight of the edge E between two adjacent nodes m (the current node) and n (a transient node connected to the current node) can be defined as:
w_mn = e^( -||x_m - x_n|| / σ^2 )
where x_m and x_n are the mean features of the pixel points corresponding to node m and node n respectively, σ is a constant, and e is the Euler number. Renumbering can then be performed so that the first t numbered nodes are transient nodes and the last r numbered nodes are absorbing nodes, where t is the number of transient nodes and r is the number of absorbing nodes.
On this basis, it should be further understood that the transition matrix P on the single-layer graph can be calculated from the adjacency matrix A and the degree matrix D as P = D^{-1} · A. Therefore, to calculate the transition matrix P, the adjacency matrix A and the degree matrix D need to be determined.
According to the weight of the edge between adjacent nodes, the adjacency matrix A can be expressed as:
a_{l_m, l_n} = w_mn if node l_n is connected with node l_m (l_n ∈ N(l_m)); a_{l_m, l_n} = 1 if l_m = l_n; a_{l_m, l_n} = 0 otherwise
From the adjacency matrix, the degree matrix D can be expressed as:
D = diag( Σ_{l_n} a_{l_m, l_n} )
where l_m is the number of node m after renumbering, l_n is the number of node n after renumbering, a_{l_m, l_n} is the corresponding entry of the adjacency matrix, and N(m) denotes the set of nodes communicating with node m. The case l_m = l_n corresponds to the diagonal entries of the adjacency matrix.
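A sketch of assembling A, D and P for the superpixel graph follows. Adjacency is found with a simple 4-neighbour scan of the label map, the diagonal of A is set to 1 as in the reconstruction above, and the renumbering that places the absorbing (boundary-prior) nodes after the transient nodes is left out for brevity; all of these are implementation choices rather than the patent's exact procedure.

```python
import numpy as np

def transition_matrix(labels, feats, edge_weight):
    """Adjacency matrix A, degree matrix D and transition matrix P = D^-1 A (sketch).

    Only directly adjacent superpixels are connected here; the text also
    connects nodes that share an edge with a node's neighbours."""
    n = feats.shape[0]
    A = np.eye(n)  # unit diagonal, a = 1 when l_m = l_n
    # pixels that differ from their right/bottom neighbour mark two connected superpixels
    pairs = set()
    for a, b in [(labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])]:
        diff = a != b
        pairs |= set(zip((a[diff] - labels.min()).tolist(),
                         (b[diff] - labels.min()).tolist()))
    for m, k in pairs:
        A[m, k] = A[k, m] = edge_weight(m, k)  # w_mn for connected node pairs
    D = np.diag(A.sum(axis=1))                 # degree matrix
    return np.linalg.inv(D) @ A                # row-stochastic transition matrix P
```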
Then, according to the obtained transition matrix P, the absorption time of each transient node is calculated based on Markov theory. With the nodes renumbered in this embodiment so that the absorbing states follow the transient states, the transition matrix can be written in the canonical form:
P = [ Q  R ; 0  I ]
where the first t nodes are in the transient states and the last r nodes are in the absorbing states; Q contains the transition probabilities between any two transient states after the transition matrix P is expressed as a Markov absorbing chain, R contains the transition probabilities from any transient state to any absorbing state, I is an r × r matrix, r is the number of absorbing nodes, and c is a t-dimensional column vector with all elements equal to 1. For the absorbing chain, its basic property can be deduced: the matrix
K = (I - Q)^{-1}
has entries k_mn that can be understood as the expected number of times the chain visits transient node n before absorption, assuming the chain starts in transient node m. The absorption time of the corresponding transient nodes can then be calculated as:
y = K · c
The basic idea of the Markov-chain model is to detect saliency using the temporal properties of the absorbing Markov chain. The virtual boundary nodes are taken as prior, boundary-based absorbing nodes, and saliency is calculated from the absorption time. On the basis of the Markov-chain saliency model, the spatial saliency value can be expressed as:
S_S(i) = ȳ(l_i)
where l_i is the renumbered index of the transient node corresponding to pixel point i, ȳ is the normalized absorption-time vector, and S_S(i) is the spatial saliency value. Then, according to the obtained spatial saliency value of each pixel point, the corresponding coordinates of the spatial saliency values are matched according to the coordinates of the pixel points in the inter-frame image, thereby obtaining the spatial saliency map.
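A sketch of the absorption-time computation follows. It assumes the transition matrix is already ordered with the t transient (superpixel) nodes first and the absorbing boundary-prior nodes last, and that every superpixel label maps to one transient node; both assumptions correspond to the renumbering described above.

```python
import numpy as np

def spatial_saliency(P, t, labels):
    """Normalized absorption times of the transient nodes, mapped back to pixels (sketch)."""
    Q = P[:t, :t]                                        # transient-to-transient block
    y = np.linalg.solve(np.eye(t) - Q, np.ones(t))       # absorption times y = (I - Q)^-1 c
    y_bar = (y - y.min()) / (y.max() - y.min() + 1e-12)  # normalized absorption times
    # spatial saliency of a pixel = normalized absorption time of its superpixel node
    return y_bar[labels - labels.min()]
```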
The temporal saliency map and the spatial saliency map are obtained through the above analysis and calculation. The temporal saliency map reflects the dynamic characteristics of the inter-frame images in the video, while the spatial saliency map reflects their static characteristics; linearly fusing the two makes them complement each other. With the weight of the temporal saliency map set to β and the weight of the spatial saliency map set to α, the spatio-temporal saliency value after linear fusion is:
S_ST(i) = α · S_S(i) + β · S_T(i)
where S_ST(i) is the spatio-temporal saliency value of pixel point i, and α and β are constants that generally take values in the range 0.3-0.5.
The corresponding coordinates of the spatio-temporal saliency values are then matched according to the obtained spatio-temporal saliency value of each pixel point and the coordinates of the pixel points in the inter-frame image, thereby obtaining the spatio-temporal saliency map. The pixel-level saliency S_ST is then normalized within the preset tone-scale range ([0, 255]), the resulting saliency value is assigned to all the pixel points it covers, and a saliency map S_map containing every pixel point is obtained.
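The linear fusion and the final normalization can be sketched as follows; the weight value 0.4 is only an example inside the 0.3-0.5 range mentioned above.

```python
import numpy as np

def fuse_saliency(s_spatial, s_temporal, alpha=0.4, beta=0.4):
    """Spatio-temporal fusion S_ST = alpha * S_S + beta * S_T, normalized to [0, 255] (sketch)."""
    s = alpha * s_spatial + beta * s_temporal
    rng = s.max() - s.min()
    return np.zeros_like(s) if rng == 0 else 255.0 * (s - s.min()) / rng
```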
In HEVC, a video inter-frame image is divided into a number of coding block units, and the code rate of a coding block is closely related to the quantization parameter QP and the quantization step QP_step; the value range of QP is [0, 51]. In general, the larger the quantization parameter QP, the higher the distortion of the image. Meanwhile, the foreground region of the video needs a larger allocation of data resources (bits), while the background region should save data resources (bits). Therefore, the invention further dynamically adjusts the quantization parameter based on the mean feature u(x, y) of each pixel point of the coding block in the saliency map S_map, so that the foreground region is coded with a lower QP and the background region with a higher QP. The adjusted quantization parameter is expressed as:
QP(x, y) = QP_0 + ΔQP
where u(x, y) is the mean feature of the corresponding pixel point i(x, y) in the saliency map; the correction value ΔQP is determined, with a rounding operation Int, by comparing u(x, y) against the thresholds q_1 and q_2, which control the quantization parameter and are in a proportional relation to the mean feature of the corresponding pixel point i(x, y) (in this embodiment q_1 = 0.5·2x and q_2 = 0.8·2x, where x is the mean feature of the corresponding pixel point); QP_0 is the initial value of the quantization coefficient.
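Since the exact adjustment rule is not reproduced above, the sketch below shows only one plausible threshold-based reading of it: salient (foreground) blocks get a lower QP and background blocks a higher QP, with the offset magnitude chosen arbitrarily.

```python
def adjust_qp(u, qp0, q1, q2, delta=2):
    """Threshold-based QP correction (assumed reading, not the patent's exact formula)."""
    if u > q2:                      # highly salient block: spend more bits
        qp = qp0 - delta
    elif u < q1:                    # background block: save bits
        qp = qp0 + delta
    else:
        qp = qp0                    # mid-saliency block: keep the initial QP
    return max(0, min(51, qp))      # clamp to the HEVC QP range [0, 51]
```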
FIG. 2.a shows an inter-frame image from a certain video. Extracting its spatial saliency gives the spatial saliency map shown in FIG. 2.b, in which the foreground region and the background region can be clearly separated; extracting its temporal saliency gives the temporal saliency map shown in FIG. 2.c, in which the moving object and the motion type can be expressed; combining the two according to the method of the present invention gives the spatio-temporal saliency map shown in FIG. 2.d, in which the information of the two is well combined, so that more related information can be retained when encoding according to the saliency map.
In summary, the inter-frame image coding method based on spatio-temporal saliency fusion provided by the invention combines the motion characteristics of the image in the temporal domain and the spatial domain to obtain a spatio-temporal saliency map, and encodes the video inter-frame images according to the normalized result, so that the information interaction between the perceptron and the encoder is no longer limited to salient-object detection, and more foreground/background partition change information and motion type information of the inter-frame images can be obtained during the inter-frame image change process.
With more information exchanged between the perceptron and the encoder, the encoded compressed data retains more related information, so that a video image with higher definition and fidelity is obtained when the compressed data is decoded; the method can therefore be applied to the compression coding of high-definition video. While the information interactivity is improved, the coding quantization parameter is dynamically adjusted through the spatio-temporal saliency value, which reduces the coding bit rate and increases the coding speed.
It should be noted that all the directional indicators (such as up, down, left, right, front, and rear … …) in the embodiment of the present invention are only used to explain the relative position relationship between the components, the movement situation, etc. in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indicator is changed accordingly.
Moreover, descriptions of the present invention relating to "first", "second", "a", etc. are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "connected," "secured," and the like are to be construed broadly, and for example, "secured" may be a fixed connection, a removable connection, or an integral part; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, but it must be based on the realization of those skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination of technical solutions should not be considered to exist, and is not within the protection scope of the present invention.

Claims (3)

1. An interframe image coding method based on space-time significance fusion is characterized by comprising the following steps:
s1: acquiring an inter-frame image and extracting the mean value characteristic of each pixel point through a color difference calculation space;
s2: acquiring a time domain motion vector of each pixel point in the interframe image through an optical flow algorithm, and acquiring a time saliency map according to the time domain motion vector of each pixel point;
s3: extracting a single-layer image with super-pixel characteristics in the inter-frame image according to the mean value characteristics of all the pixel points;
s4: obtaining a transfer matrix according to the weight relation between the nodes and the edges in the single-layer graph and the mean value characteristics of the corresponding pixels of the nodes;
s5: acquiring a space saliency map based on a Markov chain theory according to the transfer matrix;
s6: acquiring a space-time saliency map according to the weight relation between the time saliency map and the space saliency map;
s7: carrying out normalization processing in a preset color gradation range according to the space-time saliency map to obtain a saliency map;
s8: dynamically adjusting quantization parameters according to the mean value characteristics of corresponding pixel points in the saliency map, and encoding the interframe images according to the quantization parameters;
in the step S4, the nodes of the single-layer graph include transient nodes and absorbing nodes, where each node is connected to the transient nodes that are adjacent to it or that share an edge with its adjacent nodes, and the step S4 further includes the following step:
acquiring the weight of the edge between adjacent nodes according to the mean features of the pixel points corresponding to the nodes, and renumbering the nodes;
the weight of the edge between adjacent nodes can be expressed as:
w_mn = e^( -||x_m - x_n|| / σ^2 )
wherein m and n are two adjacent nodes in the single-layer graph, w_mn is the weight of the edge between node m and node n, x_m and x_n are the mean features of the pixel points corresponding to node m and node n respectively, σ is a constant, and e is the Euler number;
the transition matrix can be expressed by the following formulas:
a_{l_m, l_n} = w_mn if node l_n is connected with node l_m (l_n ∈ N(l_m)); a_{l_m, l_n} = 1 if l_m = l_n; a_{l_m, l_n} = 0 otherwise
D = diag( Σ_{l_n} a_{l_m, l_n} )
P = D^{-1} · A
wherein l_m is the number of node m after renumbering, l_n is the number of node n after renumbering, A is the adjacency matrix with entries a_{l_m, l_n}, N(m) denotes the set of nodes communicating with node m, D is the degree matrix, P is the transition matrix, and t is the number of transient nodes;
in the step S5, the spatial saliency map is composed of the spatial saliency value of each pixel point, wherein the spatial saliency value is obtained as follows:
y = (I - Q)^{-1} · c
S_S(i) = ȳ(l_i)
wherein Q contains the transition probabilities between the transient states after the transition matrix P is expressed as a Markov absorbing chain, I is an r × r matrix, r is the number of absorbing nodes, c is a t-dimensional column vector with all elements being 1, and y is the absorption time of the corresponding transient nodes; l_i is the renumbered index of the transient node corresponding to pixel point i, ȳ is the normalized absorption-time vector, and S_S(i) is the spatial saliency value;
the adjustment of the quantization parameter in the step S8 can be expressed as:
QP(x, y) = QP_0 + ΔQP
wherein u(x, y) is the mean feature of the corresponding pixel point i(x, y) in the saliency map; the correction value ΔQP is determined, with a rounding operation Int, by comparing u(x, y) against the thresholds q_1 and q_2, which control the quantization parameter and are in a proportional relation to the mean feature of the corresponding pixel point i(x, y) in the saliency map; QP_0 is the initial value of the quantization coefficient.
2. The method as claimed in claim 1, wherein the temporal saliency map is formed by the temporal saliency value of each pixel point, and the temporal saliency value is obtained as follows:
MV(x, y) = sqrt( MV_x(x, y)^2 + MV_y(x, y)^2 )
wherein (x, y) is the pixel coordinate of pixel point i, MV(x, y) is the amplitude of the temporal motion vector, and MV_x(x, y) and MV_y(x, y) are the horizontal and vertical components of the temporal motion vector; the amplitude MV(x, y) is enhanced using two constant parameters to obtain the enhanced amplitude, and the enhanced amplitude is normalized to the preset tone-scale range to obtain the temporal saliency value corresponding to pixel point i.
3. The method as claimed in claim 1, wherein the spatio-temporal saliency map is composed of the spatio-temporal saliency value of each pixel point, and the spatio-temporal saliency value is obtained as:
S_ST(i) = α · S_S(i) + β · S_T(i)
wherein α is the weight of the spatial saliency map, β is the weight of the temporal saliency map, S_S(i) is the spatial saliency value of pixel point i, S_T(i) is the temporal saliency value of pixel point i, and S_ST(i) is the spatio-temporal saliency value of pixel point i.
CN202111112916.2A 2021-09-23 2021-09-23 Interframe image coding method based on space-time significance fusion Active CN113573058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111112916.2A CN113573058B (en) 2021-09-23 2021-09-23 Interframe image coding method based on space-time significance fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111112916.2A CN113573058B (en) 2021-09-23 2021-09-23 Interframe image coding method based on space-time significance fusion

Publications (2)

Publication Number Publication Date
CN113573058A CN113573058A (en) 2021-10-29
CN113573058B true CN113573058B (en) 2021-11-30

Family

ID=78174214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111112916.2A Active CN113573058B (en) 2021-09-23 2021-09-23 Interframe image coding method based on space-time significance fusion

Country Status (1)

Country Link
CN (1) CN113573058B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113965753B (en) * 2021-12-20 2022-05-17 康达洲际医疗器械有限公司 Inter-frame image motion estimation method and system based on code rate control

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103458238A (en) * 2012-11-14 2013-12-18 深圳信息职业技术学院 Scalable video code rate controlling method and device combined with visual perception

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8416992B2 (en) * 2005-01-10 2013-04-09 Thomson Licensing Device and method for creating a saliency map of an image
CN106611427B (en) * 2015-10-21 2019-11-15 中国人民解放军理工大学 Saliency detection method based on candidate region fusion
WO2017120776A1 (en) * 2016-01-12 2017-07-20 Shanghaitech University Calibration method and apparatus for panoramic stereo video system
CN106295542A (en) * 2016-08-03 2017-01-04 江苏大学 A kind of road target extracting method of based on significance in night vision infrared image
US11184641B2 (en) * 2017-05-09 2021-11-23 Koninklijke Kpn N.V. Coding spherical video data
CN107749066A (en) * 2017-11-10 2018-03-02 深圳市唯特视科技有限公司 A kind of multiple dimensioned space-time vision significance detection method based on region
US11122314B2 (en) * 2017-12-12 2021-09-14 Google Llc Bitrate optimizations for immersive multimedia streaming
CN108734173A (en) * 2018-04-20 2018-11-02 河海大学 Infrared video time and space significance detection method based on Gestalt optimizations
CN109547803B (en) * 2018-11-21 2020-06-09 北京航空航天大学 Time-space domain significance detection and fusion method
CN111310768B (en) * 2020-01-20 2023-04-18 安徽大学 Saliency target detection method based on robustness background prior and global information
CN113259664B (en) * 2021-07-15 2021-11-16 康达洲际医疗器械有限公司 Video compression method based on image binary identification

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103458238A (en) * 2012-11-14 2013-12-18 深圳信息职业技术学院 Scalable video code rate controlling method and device combined with visual perception

Also Published As

Publication number Publication date
CN113573058A (en) 2021-10-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant