CN114697678A - Image encoding method, image encoding device, storage medium, and image encoding apparatus - Google Patents


Info

Publication number
CN114697678A
CN114697678A
Authority
CN
China
Prior art keywords
image
coded
motion vector
image unit
unit
Prior art date
Legal status
Pending
Application number
CN202011618683.9A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee
Cambricon Technologies Corp Ltd
Original Assignee
Cambricon Technologies Corp Ltd
Priority date
Filing date
Publication date
Application filed by Cambricon Technologies Corp Ltd filed Critical Cambricon Technologies Corp Ltd
Priority to CN202011618683.9A
Publication of CN114697678A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/513 Processing of motion vectors
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/527 Global motion vector estimation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiment of the application discloses an image coding method, an image coding device, a storage medium and image coding equipment. The method includes: acquiring an image to be coded and a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded; acquiring a global motion vector between the image to be coded and the reference image; acquiring object attribute characteristics in the image unit to be coded; determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector; acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit; and coding the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded. By the method and the device, the accuracy of the motion vector in the inter-frame coding process can be improved, and the quality of image coding can be further improved.

Description

Image encoding method, image encoding device, storage medium, and image encoding apparatus
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image encoding method, an image encoding device, a storage medium, and an image encoding apparatus.
Background
Video exploits the persistence of human vision: playing a series of images in quick succession makes the human eye perceive motion. Because the data volume of video is large, simply transmitting raw video image pictures would require huge transmission resources and storage space. Video can therefore be coded; the main function of video coding is to encode the video pixel data into a video code stream, reducing the data volume of the video and thereby reducing the network bandwidth and storage space required during transmission.
In the existing video coding technology, a video can be divided into a series of video frames, and the video frames are coded by calculating motion vectors between reference frames and the video frames. However, video contains abundant pixel data information (for example, videos of different scenes may contain different information), and the motion vector calculated in the prior art lacks information specific to videos of different scenes, so the accuracy of the motion vector is too low, which affects the video coding quality.
Disclosure of Invention
An embodiment of the present application provides an image encoding method, an image encoding device, a storage medium, and an image encoding apparatus, which can improve accuracy of a motion vector in an inter-frame encoding process, and further improve quality of image encoding.
An aspect of an embodiment of the present application provides an image encoding method, including:
acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded;
acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector;
acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit;
and carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded.
Wherein the global motion vector comprises a first motion vector;
the obtaining a global motion vector between the image to be encoded and the reference image includes:
dividing the image to be coded into N coding image units, and acquiring a coding image unit Ui from the N coding image units; N is a positive integer, i is a positive integer less than or equal to N;
traversing the reference image, determining K reference image units associated with the encoding image unit Ui, and acquiring first matching degrees between the encoding image unit Ui and the K reference image units respectively; k is a positive integer;
acquiring a correlation coding image unit and a correlation reference image unit corresponding to the maximum first matching degree, and acquiring a motion vector between the correlation coding image unit and the correlation reference image unit; the associated coded picture unit belongs to the N coded picture units, and the associated reference picture unit belongs to the K reference picture units;
and determining a motion vector between the associated coding image unit and the associated reference image unit as a first motion vector between the image to be coded and the reference image.
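The four steps above (divide the image into N units, full-search each unit in the reference, and take the displacement of the best-matching pair as the first motion vector) can be sketched as follows. This is a minimal illustration, not the patented implementation: the "matching degree" is modelled here as negative sum of absolute differences (SAD), which is an assumed metric, and all names are illustrative.

```python
import numpy as np

def first_motion_vector(image, reference, block=4):
    """Sketch of the first (global) motion vector: split `image` into
    N coding units, full-search each unit over `reference`, and return
    the motion vector of the pair with the highest matching degree.
    Matching degree is modelled as negative SAD (an assumption)."""
    h, w = image.shape
    best = (-np.inf, (0, 0))  # (best matching degree, motion vector)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            unit = image[y:y + block, x:x + block]
            # exhaustive search of candidate reference units
            for ry in range(0, h - block + 1):
                for rx in range(0, w - block + 1):
                    ref = reference[ry:ry + block, rx:rx + block]
                    score = -np.abs(unit.astype(int) - ref.astype(int)).sum()
                    if score > best[0]:
                        best = (score, (rx - x, ry - y))
    return best[1]
```

The full search is O(h²·w²) and only practical here because the example images are tiny; real encoders restrict the search window, as the later steps of this disclosure do.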
The image to be coded comprises identification information corresponding to the image unit to be coded;
the obtaining of the object attribute feature in the image unit to be encoded and the determining of the target reference image unit corresponding to the image unit to be encoded in the reference image according to the object attribute feature and the global motion vector include:
inputting the image to be coded into a region detection model, and determining the image unit to be coded in the image to be coded according to the identification information;
acquiring the object attribute characteristics corresponding to the image unit to be coded according to the region detection model;
determining a search area range and a motion offset direction corresponding to the image unit to be coded according to the object attribute characteristics;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search area range, the motion offset direction and the global motion vector.
Wherein, the determining, according to the search area range, the motion offset direction, and the global motion vector, a target reference image unit corresponding to the image unit to be encoded in the reference image includes:
determining a search starting point corresponding to the image unit to be coded in the reference image according to the global motion vector;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction.
Wherein the determining, according to the search starting point, the search area range, and the motion offset direction, a target reference image unit corresponding to the image unit to be encoded in the reference image includes:
determining a search area corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction;
acquiring M candidate reference image units covered by the search area, and acquiring second matching degrees between the image unit to be coded and the M candidate reference image units respectively;
and determining the candidate reference image unit corresponding to the maximum second matching degree as the target reference image unit corresponding to the image unit to be coded.
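The three steps above (place a search area around the starting point, enumerate the M candidate reference units it covers, and keep the one with the maximum second matching degree) can be sketched as follows, under stated assumptions: the search area is a square of radius `search_range` around the starting point, the motion offset direction is encoded as a sign mask that prunes candidates, and the second matching degree is again negative SAD. All names and the direction encoding are illustrative, not from the disclosure.

```python
import numpy as np

def best_reference_unit(unit, reference, start, search_range, direction):
    """Return the top-left position of the target reference unit:
    scan only candidates inside the object-attribute-derived search
    range and offset direction, starting from the point given by the
    global motion vector. `direction` is a sign mask, e.g. (1, 0)
    means search rightwards only (an assumed encoding)."""
    bh, bw = unit.shape
    h, w = reference.shape
    dx_dir, dy_dir = direction
    sx, sy = start
    best_score, best_pos = -np.inf, start
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            if dx * dx_dir < 0 or dy * dy_dir < 0:
                continue  # candidate lies outside the allowed offset direction
            x, y = sx + dx, sy + dy
            if not (0 <= x <= w - bw and 0 <= y <= h - bh):
                continue  # candidate falls outside the reference image
            cand = reference[y:y + bh, x:x + bw]
            score = -np.abs(unit.astype(int) - cand.astype(int)).sum()
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos
```

Pruning by direction and range is exactly what reduces the complexity relative to the full search used for the first motion vector.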
Wherein the global motion vector further comprises a second motion vector;
the obtaining a global motion vector between the image to be encoded and the reference image includes:
inputting the image to be coded into a target detection model, and acquiring a target coding object in the image to be coded in the target detection model;
determining a target reference object associated with the target coding object in the reference image;
and determining a motion vector between the target coding object and the target reference object as a second motion vector between the image to be coded and the reference image.
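A sketch of the second motion vector described in the three steps above, assuming the target detection model yields a bounding box for the target object in each image; the `(x, y, w, h)` box format and the centre-displacement definition are illustrative assumptions, since the disclosure does not fix them.

```python
def second_motion_vector(obj_box_encoded, obj_box_reference):
    """Second (object-level) global motion vector: the displacement of
    the detected target object between the image to be encoded and the
    reference image. Boxes are (x, y, w, h); obtaining them from a
    detection model is assumed to happen upstream."""
    (ex, ey, ew, eh), (rx, ry, rw, rh) = obj_box_encoded, obj_box_reference
    # displacement of the object's centre between the two images
    return ((ex + ew / 2) - (rx + rw / 2), (ey + eh / 2) - (ry + rh / 2))
```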
The encoding processing is performed on the image unit to be encoded according to the local motion vector and the global motion vector, and target encoded data corresponding to the image unit to be encoded is generated, including:
acquiring rate distortion cost values corresponding to the local motion vector and the global motion vector respectively, and determining a motion vector corresponding to the minimum rate distortion cost value as a target motion vector;
and carrying out coding processing on the image unit to be coded according to the target motion vector to generate target coding data corresponding to the image unit to be coded.
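The rate-distortion selection in the two steps above can be sketched as follows; the candidate structure, the distortion and rate fields, and the lambda value are illustrative assumptions, not values from the disclosure.

```python
def select_target_mv(candidates, lam=0.85):
    """Pick the motion vector with the minimum rate-distortion cost
    J = D + lambda * R, as in the steps above. Each candidate is a
    dict with 'distortion' and 'rate' fields (an assumed structure)."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])

# Hypothetical usage: the local MV costs J = 100 + 0.85*10 = 108.5,
# the global MV costs J = 90 + 0.85*30 = 115.5, so the local MV wins.
local_mv = {"name": "local", "mv": (1, 0), "distortion": 100, "rate": 10}
global_mv = {"name": "global", "mv": (-1, 2), "distortion": 90, "rate": 30}
target = select_target_mv([local_mv, global_mv])
```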
An aspect of an embodiment of the present application provides an image encoding apparatus, including:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring an image to be coded and acquiring a reference image corresponding to the image to be coded, and the image to be coded comprises an image unit to be coded;
a determining module, configured to obtain a global motion vector between the image to be encoded and the reference image, obtain an object attribute feature in the image unit to be encoded, and determine, according to the object attribute feature and the global motion vector, a target reference image unit corresponding to the image unit to be encoded in the reference image;
a second obtaining module, configured to obtain, according to the target reference image unit, a local motion vector corresponding to the image unit to be encoded;
and the generating module is used for coding the image unit to be coded according to the local motion vector and the global motion vector and generating target coded data corresponding to the image unit to be coded.
Wherein the global motion vector comprises a first motion vector, and the determining module comprises:
a first dividing unit, configured to divide the image to be encoded into N encoded image units and acquire an encoded image unit Ui from the N encoded image units; N is a positive integer, i is a positive integer less than or equal to N;
a first obtaining unit, configured to traverse the reference image, determine K reference image units associated with the encoded image unit Ui, and acquire first matching degrees between the encoded image unit Ui and the K reference image units, respectively; K is a positive integer;
the second acquisition unit is used for acquiring a related coded image unit and a related reference image unit corresponding to the maximum first matching degree and acquiring a motion vector between the related coded image unit and the related reference image unit; the associated coded picture unit belongs to the N coded picture units, and the associated reference picture unit belongs to the K reference picture units;
a first determining unit, configured to determine a motion vector between the associated encoded image unit and the associated reference image unit as a first motion vector between the image to be encoded and the reference image.
The image to be coded comprises identification information corresponding to the image unit to be coded;
the determination module further comprises:
a second determining unit, configured to input the image to be encoded to a region detection model, and determine the image unit to be encoded in the image to be encoded according to the identification information;
a third obtaining unit, configured to obtain the object attribute feature corresponding to the image unit to be encoded according to the region detection model;
a third determining unit, configured to determine, according to the object attribute feature, a search area range and a motion offset direction corresponding to the image unit to be encoded;
and a fourth determining unit, configured to determine, according to the search area range, the motion offset direction, and the global motion vector, a target reference image unit corresponding to the image unit to be encoded in the reference image.
Wherein the fourth determining unit is specifically configured to:
determining a search starting point corresponding to the image unit to be coded in the reference image according to the global motion vector;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction.
Wherein the fourth determining unit is specifically configured to:
determining a search area corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction;
acquiring M candidate reference image units covered by the search area, and acquiring second matching degrees between the image unit to be coded and the M candidate reference image units respectively;
and determining the candidate reference image unit corresponding to the maximum second matching degree as the target reference image unit corresponding to the image unit to be coded.
Wherein the global motion vector further comprises a second motion vector;
the determination module further comprises:
a fourth obtaining unit, configured to input the image to be encoded to a target detection model, and obtain a target encoding object in the image to be encoded in the target detection model;
a fifth determining unit configured to determine a target reference object associated with the target encoding object in the reference image;
a sixth determining unit, configured to determine a motion vector between the target encoding object and the target reference object as a second motion vector between the image to be encoded and the reference image.
Wherein, the generation module includes:
a fifth obtaining unit, configured to obtain rate distortion cost values corresponding to the local motion vector and the global motion vector, and determine a motion vector corresponding to a minimum rate distortion cost value as a target motion vector;
and the generating unit is used for carrying out coding processing on the image unit to be coded according to the target motion vector and generating target coding data corresponding to the image unit to be coded.
One aspect of the present application provides a computer device, comprising: a processor and a memory;
wherein the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the following steps:
acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded;
acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector;
acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit;
and carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded.
An aspect of the embodiments of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program is adapted to be loaded by a processor and execute the following steps:
acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded;
acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector;
acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit;
and carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded.
An aspect of the application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method of the above-described aspect.
In the embodiment of the application, an image to be encoded is acquired, and a reference image corresponding to the image to be encoded is acquired, where the image to be encoded comprises an image unit to be encoded. A global motion vector between the image to be encoded and the reference image is obtained, object attribute characteristics in the image unit to be encoded are obtained, and a target reference image unit corresponding to the image unit to be encoded is determined in the reference image according to the object attribute characteristics and the global motion vector. By means of the object attribute features and the global motion vector of the image unit to be coded, the search area in the reference image can be reduced, which lowers the complexity of determining the target reference image unit in the reference image and improves the accuracy. A local motion vector corresponding to the image unit to be coded is acquired according to the target reference image unit, and the image unit to be coded is then coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded. By acquiring both the local motion vector and the global motion vector corresponding to the image unit to be coded and combining the motion factors corresponding to each, the accuracy of the motion vector in the inter-frame coding process can be improved, and thus the quality of image coding can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic architecture diagram of an image encoding system according to an embodiment of the present application;
fig. 2 is a schematic flowchart of an image encoding method according to an embodiment of the present application;
fig. 3 is a schematic diagram of a method for acquiring a reference image corresponding to an image to be encoded according to an embodiment of the present application;
FIG. 4 is a schematic diagram of obtaining a first matching degree according to an embodiment of the present disclosure;
FIG. 5 is a diagram illustrating a method for determining a target reference image unit corresponding to an image unit to be encoded according to an embodiment of the present disclosure;
fig. 6 is a flowchart illustrating an image encoding method according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an image encoding apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an image encoding system according to an embodiment of the present application. As shown in fig. 1, the image encoding system may include a server 10 and a user terminal cluster. The user terminal cluster may comprise one or more user terminals, and the number of user terminals is not limited here. As shown in fig. 1, the cluster may specifically include a user terminal 100a, a user terminal 100b, a user terminal 100c, …, and a user terminal 100n. The user terminal 100a, the user terminal 100b, the user terminal 100c, …, and the user terminal 100n may each be connected to the server 10 via a network, so that each user terminal can interact with the server 10 through the network.
Each user terminal in the user terminal cluster may be an intelligent terminal with an image coding function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a wearable device, a smart home device, or a head-mounted device. It should be understood that each user terminal in the user terminal cluster shown in fig. 1 may be installed with a target application (i.e., an application client), and when the application client runs in a user terminal, it may perform data interaction with the server 10 shown in fig. 1.
As shown in fig. 1, the server 10 may be configured to obtain a global motion vector between an image to be encoded and a reference image and a local motion vector corresponding to an image unit to be encoded, perform encoding processing on the image unit to be encoded according to the local motion vector and the global motion vector, and generate target encoded data corresponding to the image unit to be encoded. The server 10 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like.
For convenience of understanding, in the embodiment of the present application, one user terminal may be selected as a target user terminal from the plurality of user terminals shown in fig. 1. For example, the user terminal 100a shown in fig. 1 may be used as a target user terminal, and a target application (i.e., an application client) with image coding may be integrated into the target user terminal. At this time, the target user terminal may implement data interaction with the server 10 through the service data platform corresponding to the application client. If the target user terminal can send the image to be encoded and the reference image corresponding to the image to be encoded to the server 10, the server 10 may obtain the global motion vector between the image to be encoded and the reference image and the local motion vector corresponding to the image unit to be encoded, perform encoding processing on the image unit to be encoded according to the local motion vector and the global motion vector, generate target encoded data corresponding to the image unit to be encoded, and send the target encoded data to the target user terminal.
Referring to fig. 2, fig. 2 is a schematic flowchart of an image encoding method according to an embodiment of the present disclosure. The image encoding method may be executed by a computer device, and the computer device may be a server (such as the server 10 in fig. 1), a user terminal (such as any user terminal in the user terminal cluster in fig. 1), or a system composed of a server and a user terminal, which is not limited in this application. As shown in fig. 2, the image encoding method may include steps S101-S104.
S101, acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded.
Specifically, after the video shooting end acquires video data through the shooting device, the computer device can encode the video data to obtain image encoded data corresponding to the video data and send the image encoded data to the decoding end, and the decoding end restores the video data from the image encoded data. Encoding the video data greatly reduces the number of bits transmitted and improves transmission efficiency. When the computer device receives the video data, it can determine the image that currently needs to be coded from the video data as the image to be coded. After the image to be coded is determined, a reference image corresponding to the image to be coded is determined from the video data, and the image to be coded is divided into a plurality of image units to be coded, i.e., into n × n image units to be coded; for example, the image to be coded can be divided into 16 × 16 image units to be coded. After the division, the n × n image units to be coded are encoded to obtain the image encoded data corresponding to the image to be coded.
Specifically, the computer device may divide the images to be encoded into an I frame and a plurality of B or P frames. The I frame is an intra-coded frame (also called a key frame): it is an independent frame carrying all of its own information and can be decoded without referring to other images, and the first frame in a video sequence is always an I frame; that is, when an I frame is encoded, the pixel data in the I frame is encoded directly. A P frame is a forward predicted frame (also called a forward reference frame) and needs to be encoded with reference to the preceding I frame; a decoded P frame is restored from the pixels of the I frame image it references together with the corresponding motion vectors. The image to be encoded in the embodiment of the present application may be an image corresponding to a P frame, in which case the reference image corresponding to the image to be encoded may be the image corresponding to the I frame preceding it. A B frame is a bidirectional predictive coded frame (also called a bidirectional reference frame) that records the differences between the current frame and the preceding and following frames; that is, when decoding the image corresponding to a B frame, the coded data of the frame before the B frame and the coded data of the frame after it must be combined and superimposed with the coded data of the current B frame image to restore the picture.
The image to be encoded in the embodiment of the present application may refer to an image corresponding to a B frame, and if the image to be encoded may be an image corresponding to the B frame, the reference image corresponding to the image to be encoded may refer to an image corresponding to an I frame or a P frame in the forward direction of the image to be encoded, and an image corresponding to an I frame or a P frame in the backward direction of the image to be encoded, where the image frames of different types correspond to different reference relationships.
As shown in fig. 3, fig. 3 is a schematic diagram of a method for acquiring a reference image corresponding to an image to be encoded according to an embodiment of the present application. As shown in fig. 3, if the image to be encoded in the present application is the 2nd frame image and the image frame type of the 2nd frame image is the B frame type, the reference images corresponding to the image to be encoded may be the 1st frame image and the 3rd frame image; if the image to be encoded in the present application is the 5th frame image and the image frame type of the 5th frame image is the P frame type, the reference image corresponding to the image to be encoded may be the 1st frame image. Therefore, the reference image corresponding to the image to be encoded can be determined according to the image frame type corresponding to the image to be encoded.
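As a minimal sketch of the reference relationships of fig. 3, the following maps a frame's type to the indices of its reference frames; the fixed layout (an I frame at position 1, neighbouring anchors for B frames) is an assumption matching the example above, not the general case:

```python
def reference_frames(frame_index, frame_type):
    """Indices of the reference frames for one frame, under the assumed layout."""
    if frame_type == "I":
        return []                                  # I frames decode independently
    if frame_type == "P":
        return [1]                                 # P frames reference the forward I frame
    if frame_type == "B":
        return [frame_index - 1, frame_index + 1]  # forward and backward neighbours
    raise ValueError("unknown frame type")
```

Under this layout, the 2nd frame (B type) references frames 1 and 3, and the 5th frame (P type) references frame 1, as in fig. 3.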
S102, acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector.
Specifically, after receiving the image to be encoded and the reference image, the computer device may perform motion estimation on the image to be encoded and the reference image to obtain a global motion vector between them, where the global motion vector is the offset distance, i.e., the relative displacement, of the image to be encoded with respect to the reference image. Object attribute features in the image unit to be encoded are also acquired; an object attribute feature may refer to the mobility of an object in the image unit to be encoded. For example, if the image unit to be encoded contains a house and a vehicle, the object attribute feature of the house may indicate small relative movement, and the object attribute feature of the vehicle may indicate large relative movement. The target reference image unit corresponding to the image unit to be encoded is determined in the reference image according to the object attribute features and the global motion vector, where the target reference image unit refers to the reference image unit in the reference image with the highest matching degree with the image unit to be encoded, i.e., the best matching reference image unit. Determining the target reference image unit according to the object attribute features and the global motion vector can increase the accuracy of determining the target reference image unit in the reference image and reduce the complexity of determining it.
Optionally, the global motion vector includes a first motion vector, and the specific process of the computer device acquiring the global motion vector between the image to be encoded and the reference image may include: the computer device divides the image to be encoded into N coded image units and obtains a coded image unit Ui among the N coded image units, where N is a positive integer and i is a positive integer less than or equal to N. The reference image is traversed, K reference image units associated with the coded image unit Ui are acquired, and the first matching degrees between the coded image unit Ui and each of the K reference image units are acquired, where K is a positive integer. The associated coded image unit and the associated reference image unit corresponding to the largest first matching degree are acquired, and motion estimation is performed on the associated coded image unit and the associated reference image unit to obtain the motion vector between them, where the associated coded image unit belongs to the N coded image units and the associated reference image unit belongs to the K reference image units. The motion vector between the associated coded image unit and the associated reference image unit is determined as the first motion vector between the image to be encoded and the reference image.
In particular, the computer device may divide the image to be encoded into N coded image units and obtain a coded image unit Ui among the N coded image units. The size of a coded image unit is smaller than or equal to the size of the image to be encoded and far larger than the size of an image unit to be encoded. The manner of dividing the image to be encoded into N coded image units may be the same as or different from the manner of dividing the image to be encoded into a plurality of image units to be encoded. The computer device may traverse the reference image based on the coded image unit Ui to obtain the K reference image units associated with the coded image unit Ui, and acquire a first matching degree between the coded image unit Ui and each of the K reference image units, where one coded image unit and one reference image unit jointly correspond to one first matching degree.
FIG. 4 is a schematic diagram of obtaining a first matching degree according to an embodiment of the present application. As shown in FIG. 4, when the computer device needs to obtain the first matching degrees corresponding to the coded image unit U1, it may traverse the reference image based on the coded image unit U1 and determine the K reference image units associated with the coded image unit U1, where U1 belongs to the N coded image units. For example, the computer device may determine, according to the unit size corresponding to the coded image unit U1, a detection frame for detecting in the reference image, traverse the reference image according to the detection frame, and acquire the K reference image units associated with the coded image unit U1. The first matching degrees between the coded image unit U1 and the K reference image units are calculated separately, i.e., the first matching degree between the coded image unit U1 and reference image unit 1, then the first matching degree between the coded image unit U1 and reference image unit 2, and so on, until the first matching degree between the coded image unit U1 and reference image unit K is calculated. In this way, the K first matching degrees corresponding to the coded image unit can be obtained. By analogy, the K first matching degrees corresponding to each of the N coded image units are acquired.
The computer device may calculate, through a convolutional neural network, the feature maps respectively corresponding to a coded image unit and a reference image unit to obtain two corresponding groups of feature map vectors. For example, when the first matching degree between the coded image unit U1 and reference image unit 1 is calculated, the feature maps respectively corresponding to the coded image unit U1 and reference image unit 1 may be calculated through the convolutional neural network to obtain the feature map vectors respectively corresponding to the coded image unit U1 and reference image unit 1. According to the feature map vectors respectively corresponding to each coded image unit and the corresponding reference image unit, the Euclidean distance between each coded image unit and the corresponding reference image unit is calculated. The first matching degree is determined according to the Euclidean distance between each coded image unit and the corresponding reference image unit: the shorter the Euclidean distance, the higher the corresponding first matching degree; the larger the Euclidean distance, the lower the corresponding first matching degree. By analogy, the K first matching degrees corresponding to each of the N coded image units, i.e., N × K first matching degrees, are obtained. From the N × K first matching degrees corresponding to the N coded image units, the associated coded image unit and the associated reference image unit corresponding to the largest first matching degree (i.e., the shortest Euclidean distance) are acquired, where the associated coded image unit belongs to the N coded image units and the associated reference image unit belongs to the K reference image units.
If the first matching degree between the coded image unit U1 and reference image unit 2 is the largest, the coded image unit U1 is the associated coded image unit and reference image unit 2 is the associated reference image unit corresponding to the largest first matching degree. Motion estimation is performed on the associated coded image unit and the associated reference image unit to acquire the motion vector between them, and that motion vector is determined as the first motion vector between the image to be encoded and the reference image.
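The N × K first-matching-degree search above can be sketched as follows; flattened pixel values stand in for the CNN feature map vectors (an assumption of this sketch), and the displacement of the best-matching (associated) pair is returned as the first motion vector:

```python
import numpy as np

def unit_features(image, unit, size):
    """Flattened pixels of one unit -- a stand-in for a CNN feature map vector."""
    y, x = unit
    return image[y:y + size, x:x + size].astype(float).ravel()

def first_motion_vector(to_encode, reference, size):
    """Exhaustive N x K matching by Euclidean distance; returns the displacement
    of the best-matching (associated) pair as the first motion vector."""
    h, w = to_encode.shape
    best = None  # (Euclidean distance, motion vector)
    for y in range(0, h - size + 1, size):           # the N coded image units
        for x in range(0, w - size + 1, size):
            f_enc = unit_features(to_encode, (y, x), size)
            for ry in range(0, h - size + 1, size):  # candidate reference units
                for rx in range(0, w - size + 1, size):
                    f_ref = unit_features(reference, (ry, rx), size)
                    # Shorter Euclidean distance = higher first matching degree
                    dist = np.linalg.norm(f_enc - f_ref)
                    if best is None or dist < best[0]:
                        best = (dist, (ry - y, rx - x))
    return best[1]

# A reference whose content is the encoded image shifted right by 2 pixels
# should yield (0, 2) as the first motion vector.
enc = np.arange(32).reshape(4, 8)
ref = np.roll(enc, 2, axis=1)
```

The exhaustive double loop mirrors the "traverse the reference image" step; a real encoder would restrict the traversal to the K associated units rather than every position.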
Optionally, the global motion vector further includes a second motion vector, and a specific manner of the computer device acquiring the global motion vector between the image to be encoded and the reference image may further include: and inputting the image to be coded into a target detection model, and acquiring a target coding object in the image to be coded in the target detection model. And determining a target reference object associated with the target coding object in the reference image, and determining a motion vector between the target coding object and the target reference object as a second motion vector between the image to be coded and the reference image.
Specifically, the computer device may further input the image to be encoded into the target detection model and obtain a target coding object, such as a person, a house, or a vehicle, in the image to be encoded, where the target coding object is any one of the one or more objects in the image to be encoded. After the target coding object is determined in the image to be encoded, a target reference object associated with the target coding object can be determined in the reference image according to the feature information of the target coding object, that is, the pixel information of the target coding object after movement is found in the reference image. Motion estimation is performed on the target coding object and the target reference object to obtain the motion vector between them, and that motion vector is determined as a second motion vector between the image to be encoded and the reference image. Since one or more objects exist in the image to be encoded, there are one or more second motion vectors, and each object corresponds to one second motion vector. Even if the motion speed of an object in the image unit to be encoded is very high, the motion track of the target coding object can be detected through the target detection model in combination with the category information of the objects in the image to be encoded; different objects correspond to different second motion vectors, so the second motion vector corresponding to the image unit to be encoded is obtained, which can improve the efficiency of acquiring motion vectors in the encoding process.
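A minimal sketch of the object-level second motion vector: a hypothetical detector output of (label, x, y) per object stands in for the target detection model, and each object's displacement between the image to be encoded and the reference image is taken as its second motion vector:

```python
def second_motion_vectors(encoded_objects, reference_objects):
    """One second motion vector per detected object, keyed by object label."""
    vectors = {}
    for label, x, y in encoded_objects:
        for ref_label, rx, ry in reference_objects:
            if ref_label == label:                 # same object found in the reference image
                vectors[label] = (x - rx, y - ry)  # displacement = second motion vector
    return vectors

# A fast-moving vehicle and a static house get different second motion vectors.
enc_objects = [("vehicle", 40, 10), ("house", 5, 5)]
ref_objects = [("vehicle", 10, 10), ("house", 5, 5)]
```

The (label, x, y) detection format and the object positions are illustrative assumptions; the point is only that each object yields its own second motion vector.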
Optionally, the to-be-encoded image includes identification information corresponding to the to-be-encoded image unit, and the specific process of determining, by the computer device, the target reference image unit corresponding to the to-be-encoded image unit in the reference image according to the object attribute feature and the global motion vector may include: and inputting the image to be coded into the region detection model, and determining an image unit to be coded in the image to be coded according to the identification information. And acquiring object attribute characteristics corresponding to the image unit to be coded according to the area detection model, and determining a search area range and a motion offset direction corresponding to the image unit to be coded according to the object attribute characteristics. And determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search area range, the motion offset direction and the global motion vector.
Specifically, the computer device may input the image to be encoded into the region detection model and determine the image unit to be encoded in the image to be encoded according to the identification information corresponding to the image unit to be encoded, where the identification information may be used to indicate the position of the image unit to be encoded in the image to be encoded, so that the region detection model finds the region corresponding to the image unit to be encoded. Features of the image unit to be encoded are extracted according to the region detection model, the objects in the image unit to be encoded, such as houses, vehicles, and people, are determined, and the object attribute features corresponding to the objects in the image unit to be encoded are acquired. An object attribute feature may refer to the mobility of an object in the image unit to be encoded; for example, the relative movement of a house is small and the relative movement of a vehicle is large. The search area range and the motion offset direction corresponding to the image unit to be encoded are determined according to the object attribute features of the objects in the image unit to be encoded: the relative movement of a house is small, so the corresponding search area range is small; the relative movement of a vehicle is large, so the corresponding search area range is large. After the search area range and the motion offset direction corresponding to the image unit to be encoded are obtained, the target reference image unit corresponding to the image unit to be encoded is determined in the reference image according to the search area range, the motion offset direction, and the global motion vector.
Through the region detection model, objects in different scenes can be detected, so that a search region in a reference image is obtained, a target reference image unit is searched in the search region, and therefore motion estimation is performed on the target reference image unit and an image unit to be coded, and a local motion vector corresponding to the image unit to be coded is obtained. Therefore, the target reference image unit is searched in the search area, the motion estimation accuracy can be improved, the complexity of motion estimation can be reduced, and the image coding efficiency can be improved.
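The mapping from object attribute feature to search area range and motion offset direction might be sketched as follows; the mobility classes and pixel ranges are illustrative assumptions, since the text only requires that larger relative movement yield a larger search area range:

```python
# Hypothetical mobility classes derived from the object attribute feature.
SEARCH_RANGE = {     # half-width of the search window, in pixels
    "static": 8,     # e.g. a house: small relative movement
    "slow": 16,
    "fast": 64,      # e.g. a vehicle: large relative movement
}

def search_window(mobility, motion_offset_direction):
    """Search area range plus a bias point along the motion offset direction."""
    r = SEARCH_RANGE[mobility]
    dx, dy = motion_offset_direction
    # Searching is biased toward the expected trajectory, so positions along
    # the motion offset direction are covered first.
    return {"range": r, "prefer": (dx * r, dy * r)}
```

A vehicle ("fast") thus gets a much wider window than a house ("static"), matching the text's example.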
Optionally, the specific manner of determining, by the computer device, the target reference image unit corresponding to the image unit to be encoded in the reference image according to the search area range, the motion offset direction, and the global motion vector may include: and determining a search starting point corresponding to the image unit to be coded in the reference image according to the global motion vector. And determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction.
Specifically, since the global motion vector is a vector, such as (x, y), the computer device may determine the corresponding coordinate information in the reference image according to the global motion vector. A search starting point corresponding to the image unit to be encoded is determined in the reference image according to the coordinate information corresponding to the global motion vector. Since there are a plurality of global motion vectors, a plurality of search starting points are also determined. The target reference image unit corresponding to the image unit to be encoded is then determined in the reference image according to the search starting points, the search area range, and the motion offset direction.
Optionally, the specific process of determining, by the computer device, the target reference image unit corresponding to the image unit to be encoded in the reference image according to the search starting point, the search area range, and the motion offset direction may include: and determining a search area corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction. Acquiring M candidate reference image units covered by the search area, acquiring second matching degrees between the image unit to be coded and the M candidate reference image units respectively, and determining the candidate reference image unit corresponding to the largest second matching degree as a target reference image unit corresponding to the image unit to be coded.
Specifically, the computer device may determine a search area corresponding to the image unit to be encoded in the reference image according to the search starting point, the search area range, and the motion offset direction. Fig. 5 is a schematic diagram of a method for determining a target reference image unit corresponding to an image unit to be encoded according to an embodiment of the present application. As shown in fig. 5, the position in the reference image having the same coordinates as the image unit to be encoded is determined as the coordinate origin, i.e., (0,0), and the search area range may be adjusted according to the object attribute feature corresponding to the image unit to be encoded. If the object attribute feature corresponding to the image unit to be encoded indicates small mobility, the corresponding search area range may be somewhat smaller; if it indicates large mobility, the corresponding search area range may be somewhat larger. As shown in fig. 5, when the object attribute feature corresponding to the image unit to be encoded indicates very large mobility, the default search area range may not cover the actual position of the image unit to be encoded in the reference image, resulting in low accuracy of the finally obtained target reference image unit. Therefore, the size of the default search area range can be adjusted to enlarge the search area range. Since a plurality of global motion vectors are obtained, there are a plurality of corresponding search starting points; a search starting point within the adjusted search area range may be determined as a valid starting point, and a search starting point outside the adjusted search area range as an invalid starting point. As shown in fig. 5, search starting point 1, search starting point 2, and search starting point 3 are valid starting points, and search starting point 4 and search starting point 5 are invalid starting points. A search area is determined according to each valid search starting point and a preset search size, each valid search starting point has a corresponding search area, and a full search is performed in the search area corresponding to each valid search starting point to determine the target reference image unit corresponding to the image unit to be encoded. Each image unit to be encoded in the image to be encoded has a corresponding second motion vector; that is, the target coding object contained in the image unit to be encoded is determined, and the second motion vector corresponding to that target coding object is used as the second motion vector corresponding to the image unit to be encoded. The search order of the search areas corresponding to the search starting points may be determined according to the motion offset direction corresponding to the image unit to be encoded; for example, the search areas corresponding to the search starting points in the motion offset direction are searched preferentially. After the search areas are determined in the reference image, the M candidate reference image units covered by the search areas may be obtained. Second matching degrees between the image unit to be encoded and each of the M candidate reference image units are acquired, and the candidate reference image unit corresponding to the largest second matching degree is determined as the target reference image unit (i.e., the best matching reference unit) corresponding to the image unit to be encoded. In this way, the efficiency of searching the reference image for the target reference image unit can be improved.
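The starting-point filtering and full search of fig. 5 can be sketched as follows, assuming SAD (sum of absolute differences) as the second-matching-degree criterion and a small full-search window around each valid starting point; both choices are assumptions of this sketch:

```python
import numpy as np

def best_reference_unit(unit, reference, origin, starting_points, search_range, size):
    """Full search around each valid starting point; returns (SAD, offset) of the
    best candidate, where offset is the displacement from the co-located position."""
    oy, ox = origin                      # co-located position of the unit, i.e. (0, 0) in Fig. 5
    # Starting points outside the adjusted search area range are invalid.
    valid = [(dy, dx) for dy, dx in starting_points
             if abs(dy) <= search_range and abs(dx) <= search_range]
    h, w = reference.shape
    best = None
    for sy, sx in valid:
        for dy in range(-1, 2):          # small full-search window around each valid start
            for dx in range(-1, 2):
                y, x = oy + sy + dy, ox + sx + dx
                if 0 <= y <= h - size and 0 <= x <= w - size:
                    cand = reference[y:y + size, x:x + size]
                    # Lower SAD = higher second matching degree
                    sad = int(np.abs(unit.astype(int) - cand.astype(int)).sum())
                    if best is None or sad < best[0]:
                        best = (sad, (y - oy, x - ox))
    return best

# The unit actually sits at offset (0, 2) in the reference; starting
# point (9, 0) lies outside the search range and is dropped as invalid.
ref = np.arange(64).reshape(8, 8)
unit = ref[2:4, 4:6]
```

Only the candidates around valid starting points are ever compared, which is the complexity reduction the text attributes to the adjusted search area.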
S103, according to the target reference image unit, obtaining a local motion vector corresponding to the image unit to be coded.
Specifically, after determining a target reference image unit in a reference image, the computer device may obtain a local motion vector corresponding to an image unit to be encoded according to the target reference image unit. The relative displacement between the target reference image unit and the image unit to be coded can be calculated, and the local motion vector corresponding to the image unit to be coded is determined according to the relative displacement. Referring to fig. 3, a reference image corresponding to an image to be encoded is determined, a reference image unit corresponding to an image unit to be encoded is determined in the reference image, and motion estimation is performed on the image unit to be encoded and the reference image unit to obtain a local motion vector corresponding to the image unit to be encoded. The local motion vector corresponding to the image unit to be encoded may refer to an optimal motion vector selected from a temporal motion vector and a spatial motion vector corresponding to the image unit to be encoded.
And S104, carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector, and generating target coded data corresponding to the image unit to be coded.
Specifically, the computer device may determine a target motion vector according to the local motion vector and the global motion vector, perform encoding processing on the image unit to be encoded according to the target motion vector, and generate the target encoded data corresponding to the image unit to be encoded. The local motion vector and the global motion vectors may be used as candidate motion vectors corresponding to the image unit to be encoded; since there are a plurality of global motion vectors, there are a plurality of candidate motion vectors, and the target motion vector may be determined from among the local motion vector and the global motion vectors.
In the embodiment of the application, an image to be encoded is acquired, and a reference image corresponding to the image to be encoded is acquired, where the image to be encoded includes an image unit to be encoded. A global motion vector between the image to be encoded and the reference image is acquired, object attribute features in the image unit to be encoded are acquired, and the target reference image unit corresponding to the image unit to be encoded is determined in the reference image according to the object attribute features and the global motion vector. By determining the search direction and the search area range in the reference image through the object attribute features corresponding to the image unit to be encoded, determining the search starting points in the reference image according to the global motion vector, and combining the search direction, the search area range, and the search starting points, the target reference image unit corresponding to the image unit to be encoded can be found in the reference image more quickly and more accurately. Therefore, through the object attribute features in the image unit to be encoded and the global motion vector, the search area in the reference image can be reduced, the complexity of determining the target reference image unit in the reference image is reduced, and the accuracy is improved. A local motion vector corresponding to the image unit to be encoded is acquired according to the target reference image unit. The image unit to be encoded is then encoded according to the local motion vector and the global motion vector to generate the target encoded data corresponding to the image unit to be encoded.
According to the method and the device, the local motion vector and the global motion vector corresponding to the image unit to be encoded are obtained and combined, so that the accuracy of the motion vector in the inter-frame coding process can be improved and the quality of image coding can be improved; meanwhile, the target motion vector corresponding to the image unit to be encoded is determined through machine learning (such as the region detection model and image matching techniques), so the efficiency of image coding can also be improved.
As shown in fig. 6, fig. 6 is a schematic diagram of an image encoding method provided in an embodiment of the present application. The method may be executed by a computer device, where the computer device may be a server (such as the server 11 in fig. 1), or a user terminal (such as any user terminal in the user terminal cluster in fig. 1), or a system composed of a server and a user terminal, which is not limited in this application. As shown in fig. 6, the steps of the image encoding method include S201-S205.
S201, acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded.
S202, acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector.
S203, according to the target reference image unit, obtaining a local motion vector corresponding to the image unit to be coded.
The specific implementation manner of steps S201 to S203 in the embodiment of the present application can refer to the description in the embodiment corresponding to fig. 2, and the embodiment of the present application is not described herein again.
And S204, acquiring rate distortion cost values corresponding to the local motion vector and the global motion vector respectively, and determining the motion vector corresponding to the minimum rate distortion cost value as a target motion vector.
Specifically, the computer device may obtain the rate distortion cost values respectively corresponding to the local motion vector and the global motion vector, and use the motion vector corresponding to the minimum rate distortion cost value as the target motion vector corresponding to the image unit to be encoded. The rate distortion cost values corresponding to the local motion vector and the global motion vector can be calculated through a rate distortion calculation function. Therefore, in the process of performing prediction processing on the unit to be encoded, the computer device may determine the rate distortion cost corresponding to the unit to be encoded based on the predicted image unit (i.e., the predicted image unit generated according to the reference image unit and the corresponding motion vector), the unit to be encoded, and the coding auxiliary parameters (e.g., the coding rate parameter and the coding distortion parameter) corresponding to the non-motion estimation mode, and select the coding mode corresponding to the minimum rate distortion cost to obtain the optimal coding performance. The coding rate also reflects the degree of data compression: the lower the coding rate, the more the video data is compressed. Specifically, the calculation formula of the rate-distortion cost can be referred to the following formula (1):
rdcost=dist+bit×λ (1)
where rdcost refers to the rate-distortion cost, dist refers to the coding distortion parameter, bit refers to the coding rate parameter associated with the coding mode (e.g., the non-motion estimation mode), and λ is the Lagrangian factor.

S205, performing coding processing on the image unit to be coded according to the target motion vector, and generating target coded data corresponding to the image unit to be coded.
Specifically, after the computer device determines the target motion vector, a predicted image unit corresponding to the image unit to be encoded may be generated according to the target motion vector corresponding to the image unit to be encoded (the relative displacement between the image unit to be encoded and the target reference image unit) and the pixels corresponding to the target reference image unit in the reference image. After obtaining the predicted image unit, the computer device may obtain the residual (pixel difference) between the predicted image unit and the image unit to be encoded and generate difference image data from the residual. Further, the computer device may transform the difference image data, quantize the transformed difference image data, and generate the target encoded data corresponding to the image unit to be encoded in the encoder. Of course, the image to be encoded may include a plurality of image units to be encoded, and each image unit to be encoded may obtain its corresponding target encoded data based on the above operations, so that the encoded data corresponding to the entire image to be encoded can be obtained. The transformation refers to performing an orthogonal transform on the video frame image to remove the correlation between spatial pixels; the orthogonal transform concentrates the energy originally distributed over the pixels onto a few low-frequency coefficients of the frequency domain, which represent most of the information of the image. This characteristic of the frequency coefficients favors quantization methods based on HVS (Human Visual System) characteristics. The transformation method may include, but is not limited to: the K-L Transform (Karhunen-Loeve Transform), the Discrete Cosine Transform (DCT), and the Discrete Wavelet Transform (DWT).
Quantization refers to a process of reducing the precision of data representation, and by quantization, the amount of data to be encoded can be reduced, and quantization is a lossy compression technique. Quantization may include vector quantization, which is the joint quantization of a set of data, and scalar quantization, which quantizes each input data independently.
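A sketch of the transform-and-quantize step described above, using an orthonormal DCT-II in matrix form and scalar quantization; real encoders use standardized integer transforms, so this is only an illustration of the principle:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    m = np.array([[np.cos(np.pi * (2 * j + 1) * i / (2 * n)) for j in range(n)]
                  for i in range(n)])
    m[0] *= 1.0 / np.sqrt(n)
    m[1:] *= np.sqrt(2.0 / n)
    return m

def transform_quantize(unit, predicted, qstep):
    """Residual -> 2-D DCT -> scalar quantization (lossy)."""
    residual = unit.astype(float) - predicted.astype(float)   # pixel differences
    d = dct_matrix(unit.shape[0])
    coeffs = d @ residual @ d.T      # energy concentrates in low-frequency coefficients
    return np.round(coeffs / qstep)  # scalar quantization: each coefficient independently

# A flat residual of value 10 leaves only the DC coefficient after the DCT,
# showing the energy-concentration property described in the text.
q = transform_quantize(np.full((4, 4), 10), np.zeros((4, 4)), 8)
```

Each coefficient is quantized independently, i.e., this sketch performs scalar quantization; vector quantization would quantize groups of coefficients jointly.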
After the computer device generates the encoded data corresponding to the image to be encoded, the encoded data can be sent to the decoding end, so that the decoding end decodes the encoded data and restores the picture corresponding to the image to be encoded. For example, if user A and user B are engaged in a video call, after user A obtains video data, the video data may be encoded by applying the scheme of this embodiment: the video data is divided into a series of video frames, each video frame is encoded to obtain the encoded data of the video data, and the encoded data may then be sent to the decoding device of user B, so that the decoding device of user B decodes the encoded data to obtain the restored video data, which is displayed on the display device (such as a mobile phone) corresponding to user B. In this way, user A and user B can share content with each other, the network bandwidth in the transmission process and the storage space can be reduced, and the accuracy and efficiency of image restoration can be improved.
In the embodiment of the application, an image to be encoded is acquired, and a reference image corresponding to the image to be encoded is acquired, wherein the image to be encoded comprises an image unit to be encoded. A global motion vector between the image to be encoded and the reference image is obtained, an object attribute feature in the image unit to be encoded is obtained, and a target reference image unit corresponding to the image unit to be encoded is determined in the reference image according to the object attribute feature and the global motion vector. By determining the search direction and the search area range in the reference image through the object attribute feature corresponding to the image unit to be encoded, determining the search starting point in the reference image according to the global motion vector, and combining the search direction, the search area range, and the search starting point, the target reference image unit corresponding to the image unit to be encoded can be found in the reference image more quickly and more accurately. Therefore, through the object attribute feature in the image unit to be encoded and the global motion vector, the search area in the reference image can be reduced, the complexity of determining the target reference image unit in the reference image is reduced, and the accuracy is improved. A local motion vector corresponding to the image unit to be encoded is acquired according to the target reference image unit, and the image unit to be encoded is encoded according to the local motion vector and the global motion vector to generate target encoded data corresponding to the image unit to be encoded.
According to the method and the device, the local motion vector and the global motion vector corresponding to the image unit to be encoded are obtained, so that the corresponding local motion factor and global motion factor are combined. From the plurality of candidate motion vectors corresponding to the local motion vector and the global motion vector, the motion vector corresponding to the minimum rate distortion cost value is determined as the target motion vector, and the image unit to be encoded is encoded according to the target motion vector, so that the accuracy of the motion vector in the inter-frame encoding process can be improved, and the quality of image encoding can be improved. Meanwhile, the method determines the target motion vector corresponding to the image unit to be encoded through machine learning (such as a region detection model, image matching techniques, and the like), so that the efficiency of image encoding can be improved.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an image encoding apparatus according to an embodiment of the present disclosure. The image encoding apparatus may be a computer program (including program code) running in a computer device; for example, the image encoding apparatus is application software. The apparatus can be used for executing the corresponding steps in the image encoding method provided by the embodiments of the present application. As shown in fig. 7, the image encoding apparatus may include: a first acquisition module 11, a determination module 12, a second acquisition module 13, and a generation module 14.
The first obtaining module 11 is configured to obtain an image to be encoded, and obtain a reference image corresponding to the image to be encoded, where the image to be encoded includes an image unit to be encoded;
a determining module 12, configured to obtain a global motion vector between the image to be encoded and the reference image, obtain an object attribute feature in the image unit to be encoded, and determine, according to the object attribute feature and the global motion vector, a target reference image unit corresponding to the image unit to be encoded in the reference image;
a second obtaining module 13, configured to obtain, according to the target reference image unit, a local motion vector corresponding to the image unit to be encoded;
and a generating module 14, configured to perform coding processing on the image unit to be coded according to the local motion vector and the global motion vector, and generate target coded data corresponding to the image unit to be coded.
Wherein the global motion vector comprises a first motion vector, the determining module 12 comprises:
a first partitioning unit 1201, configured to partition the image to be encoded into N encoded image units, and obtain an encoded image unit Ui from the N encoded image units; N is a positive integer, and i is a positive integer less than or equal to N;
a first obtaining unit 1202, configured to traverse the reference image, determine K reference image units associated with the encoded image unit Ui, and obtain first matching degrees between the encoded image unit Ui and the K reference image units, respectively; K is a positive integer;
a second obtaining unit 1203, configured to obtain the associated encoded image unit and the associated reference image unit corresponding to the maximum first matching degree, and obtain a motion vector between the associated encoded image unit and the associated reference image unit; the associated encoded image unit belongs to the N encoded image units, and the associated reference image unit belongs to the K reference image units;
a first determining unit 1204, configured to determine a motion vector between the associated encoded image unit and the associated reference image unit as a first motion vector between the image to be encoded and the reference image.
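A minimal sketch of the block-matching idea behind units 1202–1204 — exhaustively comparing an encoded image unit against reference image units and keeping the best match as the motion vector — might look like this (SAD is used here as one possible "matching degree"; the helper names and frame layout are illustrative assumptions):

```python
def sad(block_a, block_b):
    """Sum of absolute differences: one possible 'matching degree'
    (a smaller SAD means a better match)."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame, top, left, size):
    """Cut a size x size unit out of a frame given as a 2-D list."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_motion_vector(cur, ref, top, left, size):
    """Traverse the reference frame for the reference unit that best matches
    the encoded unit at (top, left); return the motion vector (dy, dx)."""
    target = get_block(cur, top, left, size)
    best = None
    for ty in range(len(ref) - size + 1):
        for tx in range(len(ref[0]) - size + 1):
            cost = sad(target, get_block(ref, ty, tx, size))
            if best is None or cost < best[0]:
                best = (cost, ty - top, tx - left)
    return best[1], best[2]
```

Running this for every encoded image unit and keeping the vector of the best-matching pair corresponds to selecting the first motion vector from the maximum first matching degree.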
The image to be coded comprises identification information corresponding to the image unit to be coded;
the determination module 12 further includes:
a second determining unit 1205, configured to input the image to be encoded to a region detection model, and determine the image unit to be encoded in the image to be encoded according to the identification information;
a third obtaining unit 1206, configured to obtain, according to the region detection model, the object attribute feature corresponding to the image unit to be encoded;
a third determining unit 1207, configured to determine, according to the object attribute feature, a search area range and a motion offset direction corresponding to the image unit to be encoded;
a fourth determining unit 1208, configured to determine, according to the search area range, the motion offset direction, and the global motion vector, a target reference image unit corresponding to the image unit to be encoded in the reference image.
The fourth determining unit 1208 is specifically configured to:
determining a search starting point corresponding to the image unit to be coded in the reference image according to the global motion vector;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction.
The fourth determining unit 1208 is specifically configured to:
determining a search area corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction;
acquiring M candidate reference image units covered by the search area, and acquiring second matching degrees between the image unit to be coded and the M candidate reference image units respectively;
and determining the candidate reference image unit corresponding to the maximum second matching degree as the target reference image unit corresponding to the image unit to be coded.
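The restricted search of units 1207–1208 — a window around a search starting point, pruned by a motion offset direction — can be sketched as follows (the encoding of the direction as a pair of signs is an illustrative assumption, not specified by the embodiment):

```python
def search_window_candidates(start, search_range, direction):
    """Candidate positions inside a restricted search window. `start` is the
    search starting point derived from the global motion vector; `direction`
    is a sign pair derived from the object attribute feature, e.g. (0, 1)
    means 'only search to the right'. Both encodings are illustrative."""
    sy, sx = start
    dy_sign, dx_sign = direction
    candidates = []
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            if dy_sign and dy * dy_sign < 0:
                continue  # prune offsets against the expected vertical motion
            if dx_sign and dx * dx_sign < 0:
                continue  # prune offsets against the expected horizontal motion
            candidates.append((sy + dy, sx + dx))
    return candidates
```

Pruning by direction shrinks the candidate set (here from 9 to 6 positions for a range of 1), which is the complexity reduction the embodiment attributes to the object attribute feature; the M surviving candidates would then be scored by the second matching degree.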
Wherein the global motion vector further comprises a second motion vector;
the determination module 12 further includes:
a fourth obtaining unit 1209, configured to input the image to be encoded into a target detection model, and obtain a target encoding object in the image to be encoded through the target detection model;
a fifth determining unit 1210 for determining a target reference object associated with the target encoding object in the reference image;
a sixth determining unit 1211, configured to determine a motion vector between the target encoding object and the target reference object as a second motion vector between the image to be encoded and the reference image.
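One simple way to realize the second (object-level) motion vector of units 1209–1211 is to take the centroid displacement between the target encoding object and its associated reference object. This is an illustrative assumption: the embodiment does not fix how the vector between the two detected objects is computed.

```python
def centroid(pixels):
    """Centre of mass of an object given as (y, x) pixel coordinates."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return sum(ys) / len(ys), sum(xs) / len(xs)

def object_motion_vector(target_object, reference_object):
    """Second motion vector as the centroid displacement from the associated
    reference object to the target encoding object. Detecting the two
    objects is assumed to be done by the target detection model."""
    cy, cx = centroid(target_object)
    ry, rx = centroid(reference_object)
    return cy - ry, cx - rx
```

The resulting vector captures the object's global movement between the two frames even when the per-block local vectors disagree.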
Wherein, the generating module 14 includes:
a fifth obtaining unit 1401, configured to obtain rate distortion cost values corresponding to the local motion vector and the global motion vector, respectively, and determine a motion vector corresponding to a minimum rate distortion cost value as a target motion vector;
a generating unit 1402, configured to perform encoding processing on the image unit to be encoded according to the target motion vector, and generate target encoded data corresponding to the image unit to be encoded.
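The rate-distortion selection performed by unit 1401 can be sketched with the usual Lagrangian cost J = D + λ·R, picking the candidate motion vector with the minimum cost (the candidate representation and the λ values are illustrative):

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits

def select_target_mv(candidates, lam):
    """Among the candidate motion vectors (local and global), pick the one
    with the minimum rate-distortion cost. `candidates` maps each motion
    vector to a (distortion, bits) pair; the representation is illustrative."""
    return min(candidates, key=lambda mv: rd_cost(*candidates[mv], lam))
```

A larger λ penalizes the bit cost more heavily, so the selected vector can change with the encoder's operating point.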
According to an embodiment of the present application, the steps involved in the image encoding method shown in fig. 2 or fig. 6 may be performed by the respective modules in the image encoding apparatus shown in fig. 7. For example, step S101 shown in fig. 2 may be performed by the first obtaining module 11 in fig. 7, and step S102 shown in fig. 2 may be performed by the determining module 12 in fig. 7; step S103 shown in fig. 2 may be performed by the second obtaining module 13 in fig. 7; step S104 shown in fig. 2 may be performed by the generation module 14 in fig. 7, and so on.
In the embodiment of the application, the reference image corresponding to the image data to be encoded is acquired by acquiring the image to be encoded, and the image data to be encoded comprises the image unit to be encoded. The method comprises the steps of obtaining a global motion vector between an image to be coded and a reference image, obtaining object attribute characteristics in an image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector. The method comprises the steps of determining a search direction and a search area range in a reference image through object attribute features corresponding to image units to be coded, determining a search starting point in the reference image according to a global motion vector, and finding target reference image units corresponding to the image units to be coded in the reference image more quickly and accurately by combining the search direction, the search area range and the search starting point, so that the efficiency and the accuracy of motion estimation are improved. And acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit. And carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded. According to the method and the device, the local motion vector and the global motion vector corresponding to the image unit to be coded are obtained, and the local motion factor and the global motion factor corresponding to each image unit to be coded are combined. 
From the plurality of candidate motion vectors corresponding to the local motion vector and the global motion vector, the motion vector corresponding to the minimum rate distortion cost value is determined as the target motion vector, and the image unit to be encoded is encoded according to the target motion vector, so that the accuracy of the motion vector in the inter-frame encoding process can be improved, and the quality of image encoding can be further improved.
According to an embodiment of the present application, the modules in the image encoding apparatus shown in fig. 7 may be separately or wholly combined into one or several units, or one or more of the units may be further split into multiple functionally smaller sub-units, which can implement the same operations without affecting the technical effects of the embodiments of the present application. The modules are divided based on logical functions; in practical applications, the function of one module may be implemented by multiple units, or the functions of multiple modules may be implemented by one unit. In other embodiments of the present application, the image encoding apparatus may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by the cooperation of multiple units.
According to an embodiment of the present application, the image encoding apparatus shown in fig. 7 may be constructed by running a computer program (including program code) capable of executing the steps of the methods shown in fig. 2 or fig. 6 on a general-purpose computer device, such as a computer including processing elements such as a Central Processing Unit (CPU) and storage elements such as a random access memory (RAM) and a read-only memory (ROM), thereby implementing the image encoding method of the embodiments of the present application. The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above computing device via the computer-readable recording medium, and executed there.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure. As shown in fig. 8, the computer device 1000 may include: a processor 1001, a network interface 1004, and a memory 1005; the computer device 1000 may further include: a user interface 1003 and at least one communication bus 1002, wherein the communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a standard wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. Optionally, the memory 1005 may also be at least one storage device located remotely from the processor 1001. As shown in fig. 8, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 1000 shown in fig. 8, the network interface 1004 may provide a network communication function, and the user interface 1003 is an interface for providing input to a user. The processor 1001 may be configured to invoke the device control application stored in the memory 1005 to implement:
acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded;
acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector;
acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit;
and carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded.
Optionally, the processor 1001 may be configured to invoke a device control application stored in the memory 1005 to implement:
dividing the image to be coded into N coded image units, and acquiring a coded image unit Ui from the N coded image units; N is a positive integer, and i is a positive integer less than or equal to N;
traversing the reference image, determining K reference image units associated with the coded image unit Ui, and acquiring first matching degrees between the coded image unit Ui and the K reference image units, respectively; K is a positive integer;
acquiring an associated coded image unit and an associated reference image unit corresponding to the maximum first matching degree, and acquiring a motion vector between the associated coded image unit and the associated reference image unit; the associated coded image unit belongs to the N coded image units, and the associated reference image unit belongs to the K reference image units;
and determining a motion vector between the associated coding image unit and the associated reference image unit as a first motion vector between the image to be coded and the reference image.
Optionally, the processor 1001 may be configured to call a device control application stored in the memory 1005 to implement:
inputting the image to be coded into a region detection model, and determining the image unit to be coded in the image to be coded according to the identification information;
acquiring the object attribute characteristics corresponding to the image unit to be coded according to the region detection model;
determining a search area range and a motion offset direction corresponding to the image unit to be coded according to the object attribute characteristics;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search area range, the motion offset direction and the global motion vector.
Optionally, the processor 1001 may be configured to invoke a device control application stored in the memory 1005 to implement:
determining a search starting point corresponding to the image unit to be coded in the reference image according to the global motion vector;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction.
Optionally, the processor 1001 may be configured to invoke a device control application stored in the memory 1005 to implement:
determining a search area corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction;
acquiring M candidate reference image units covered by the search area, and acquiring second matching degrees between the image unit to be coded and the M candidate reference image units respectively;
and determining the candidate reference image unit corresponding to the maximum second matching degree as the target reference image unit corresponding to the image unit to be coded.
Optionally, the processor 1001 may be configured to invoke a device control application stored in the memory 1005 to implement:
inputting the image to be coded into a target detection model, and acquiring a target coding object in the image to be coded through the target detection model;
determining a target reference object associated with the target coding object in the reference image;
and determining a motion vector between the target coding object and the target reference object as a second motion vector between the image to be coded and the reference image.
Optionally, the processor 1001 may be configured to invoke a device control application stored in the memory 1005 to implement:
acquiring rate distortion cost values corresponding to the local motion vector and the global motion vector respectively, and determining a motion vector corresponding to the minimum rate distortion cost value as a target motion vector;
and carrying out coding processing on the image unit to be coded according to the target motion vector to generate target coding data corresponding to the image unit to be coded.
It should be understood that the computer device 1000 described in this embodiment of the present application may perform the description of the image encoding method in the embodiment corresponding to fig. 2 or fig. 6, and may also perform the description of the image encoding apparatus in the embodiment corresponding to fig. 7, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device can perform the description of the image encoding method in the embodiment corresponding to fig. 2 or fig. 6, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
As an example, the program instructions described above may be executed on one computer device, or on multiple computer devices located at one site, or on multiple computer devices distributed over multiple sites and interconnected by a communication network, which may constitute a blockchain network.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure is only the preferred embodiments of the present application and is not intended to limit the scope of the claims of the present application; therefore, equivalent variations made according to the claims of the present application still fall within the scope of the present application.

Claims (10)

1. An image encoding method, comprising:
acquiring an image to be coded, and acquiring a reference image corresponding to the image to be coded, wherein the image to be coded comprises an image unit to be coded;
acquiring a global motion vector between the image to be coded and the reference image, acquiring object attribute characteristics in the image unit to be coded, and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the object attribute characteristics and the global motion vector;
acquiring a local motion vector corresponding to the image unit to be coded according to the target reference image unit;
and carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coded data corresponding to the image unit to be coded.
2. The method of claim 1, wherein the global motion vector comprises a first motion vector;
the obtaining a global motion vector between the image to be encoded and the reference image includes:
dividing the image to be coded into N coded image units, and acquiring a coded image unit Ui from the N coded image units; N is a positive integer, and i is a positive integer less than or equal to N;
traversing the reference image, determining K reference image units associated with the coded image unit Ui, and acquiring first matching degrees between the coded image unit Ui and the K reference image units, respectively; K is a positive integer;
acquiring an associated coded image unit and an associated reference image unit corresponding to the maximum first matching degree, and acquiring a motion vector between the associated coded image unit and the associated reference image unit; the associated coded image unit belongs to the N coded image units, and the associated reference image unit belongs to the K reference image units;
and determining a motion vector between the associated coding image unit and the associated reference image unit as a first motion vector between the image to be coded and the reference image.
3. The method according to claim 1, wherein the image to be encoded comprises identification information corresponding to the image unit to be encoded;
the obtaining of the object attribute feature in the image unit to be encoded and the determining of the target reference image unit corresponding to the image unit to be encoded in the reference image according to the object attribute feature and the global motion vector include:
inputting the image to be coded into a region detection model, and determining the image unit to be coded in the image to be coded according to the identification information;
acquiring the object attribute characteristics corresponding to the image unit to be coded according to the region detection model;
determining a search area range and a motion offset direction corresponding to the image unit to be coded according to the object attribute characteristics;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search area range, the motion offset direction and the global motion vector.
4. The method according to claim 3, wherein said determining a target reference picture unit corresponding to the picture unit to be encoded in the reference picture according to the search area range, the motion offset direction and the global motion vector comprises:
determining a search starting point corresponding to the image unit to be coded in the reference image according to the global motion vector;
and determining a target reference image unit corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction.
5. The method according to claim 4, wherein said determining a target reference image unit corresponding to the image unit to be encoded in the reference image according to the search starting point, the search area range and the motion offset direction comprises:
determining a search area corresponding to the image unit to be coded in the reference image according to the search starting point, the search area range and the motion offset direction;
acquiring M candidate reference image units covered by the search area, and acquiring second matching degrees between the image unit to be coded and the M candidate reference image units respectively;
and determining the candidate reference image unit corresponding to the maximum second matching degree as the target reference image unit corresponding to the image unit to be coded.
6. The method of claim 1, wherein the global motion vector further comprises a second motion vector;
the obtaining a global motion vector between the image to be encoded and the reference image includes:
inputting the image to be coded into a target detection model, and acquiring a target coding object in the image to be coded through the target detection model;
determining a target reference object associated with the target coding object in the reference image;
and determining a motion vector between the target coding object and the target reference object as a second motion vector between the image to be coded and the reference image.
7. The method according to claim 1, wherein the encoding the image unit to be encoded according to the local motion vector and the global motion vector to generate target encoded data corresponding to the image unit to be encoded comprises:
acquiring rate distortion cost values corresponding to the local motion vector and the global motion vector respectively, and determining a motion vector corresponding to the minimum rate distortion cost value as a target motion vector;
and carrying out coding processing on the image unit to be coded according to the target motion vector to generate target coding data corresponding to the image unit to be coded.
8. An image encoding device characterized by comprising:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring an image to be coded and acquiring a reference image corresponding to the image to be coded, and the image to be coded comprises an image unit to be coded;
a determining module, configured to obtain a global motion vector between the image to be encoded and the reference image, obtain an object attribute feature in the image unit to be encoded, and determine, according to the object attribute feature and the global motion vector, a target reference image unit corresponding to the image unit to be encoded in the reference image;
a second obtaining module, configured to obtain, according to the target reference image unit, a local motion vector corresponding to the image unit to be encoded;
and the generating module is used for carrying out coding processing on the image unit to be coded according to the local motion vector and the global motion vector to generate target coding data corresponding to the image unit to be coded.
9. A computer device, comprising: a processor and a memory;
the memory stores a computer program that, when executed by the processor, performs the method of any of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which is adapted to be loaded by a processor and to carry out the method of any one of claims 1 to 7.
CN202011618683.9A 2020-12-30 2020-12-30 Image encoding method, image encoding device, storage medium, and image encoding apparatus Pending CN114697678A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011618683.9A CN114697678A (en) 2020-12-30 2020-12-30 Image encoding method, image encoding device, storage medium, and image encoding apparatus


Publications (1)

Publication Number Publication Date
CN114697678A true CN114697678A (en) 2022-07-01

Family

ID=82133727



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination