CN110741635A - Encoding method, decoding method, encoding device, and decoding device - Google Patents


Info

Publication number
CN110741635A
CN110741635A
Authority
CN
China
Prior art keywords
image
target object
information
pixels
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201880037395.9A
Other languages
Chinese (zh)
Inventor
郑萧桢
封旭阳
张李亮
赵丛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Shenzhen Dajiang Innovations Technology Co Ltd
Original Assignee
Shenzhen Dajiang Innovations Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Dajiang Innovations Technology Co Ltd filed Critical Shenzhen Dajiang Innovations Technology Co Ltd
Publication of CN110741635A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103: Selection of coding mode or of prediction mode
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12: Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122: Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An encoding method, a decoding method, an encoding device, and a decoding device are provided. The encoding method includes encoding a current image and generating code stream data, where the code stream data includes identification information used to identify at least one target object in the current image. The identification information includes image region information and pixel information: the image region information includes the position and the size of the image region where the target object is located, and the pixel information includes the attribute of at least one pixel in the image region. By indicating the position and size of the image region through the image region information and indicating the attributes of a plurality of pixels in the image region through the pixel information, the encoding method identifies the target object with finer granularity, which helps the decoding device perform operations on the target object more efficiently and more accurately.

Description

Encoding method, decoding method, encoding device, and decoding device
Copyright declaration
The disclosure of this patent document contains material which is subject to copyright protection. The copyright is owned by the copyright owner. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office official records.
Technical Field
The present application relates to the field of image processing, and in particular, to an encoding method, a decoding method, an encoding apparatus, and a decoding apparatus.
Background
In video surveillance, human-computer interaction, security patrols, and similar scenarios, objects of particular interest (including people, animals, plants, public facilities, vehicles, landscapes, scenery, and the like) typically need to be identified, so that the decoding end or an observer can better track changes of the object in the video stream, which in turn helps the observer observe or interact with the object.
In existing object tracking technology, the content of a video stream is generally analyzed using techniques such as image processing, computer vision, and computer analysis and understanding, and the objects requiring particular attention are identified.
Disclosure of Invention
The present application provides an encoding method, a decoding method, an encoding device, and a decoding device, which identify a target object with finer granularity and help the decoding device perform operations on the target object more efficiently and more accurately.
In a first aspect, an encoding method is provided, including: performing encoding processing on a current image and generating code stream data, where the code stream data includes identification information, the identification information is used to identify at least one target object in the current image, the identification information includes image region information and pixel information, the image region information includes a position and a size of the image region where the target object is located, and the pixel information includes an attribute of at least one pixel in the image region.
In a second aspect, a decoding method is provided, including: obtaining code stream data of a current image, where the code stream data includes identification information, the identification information is used to identify at least one target object in the current image, the identification information includes image region information and pixel information, the image region information includes a position and a size of the image region where the target object is located, and the pixel information includes an attribute of at least one pixel in the image region; and performing decoding processing on at least part of the code stream data.
In a third aspect, an encoding device is provided, including at least one memory for storing computer-executable instructions and at least one processor configured, individually or collectively, to access the at least one memory and execute the computer-executable instructions to: perform encoding processing on a current image and generate code stream data, the code stream data including identification information for identifying at least one target object in the current image, the identification information including image region information and pixel information, the image region information including a position and a size of the image region where the target object is located, and the pixel information including an attribute of at least one pixel in the image region.
In a fourth aspect, a decoding device is provided, including at least one memory for storing computer-executable instructions and at least one processor configured, individually or collectively, to access the at least one memory and execute the computer-executable instructions to: obtain code stream data of a current image, the code stream data including identification information for identifying at least one target object in the current image, the identification information including image region information and pixel information, the image region information including a position and a size of the image region where the target object is located, and the pixel information including an attribute of at least one pixel in the image region; and perform decoding processing on at least part of the code stream data.
The aspects of the present application indicate the position and size of the image region where a target object is located through the image region information, and indicate the attributes of a plurality of pixels in the image region through the pixel information, thereby identifying the target object with finer granularity, which helps a decoding apparatus perform operations on the target object more efficiently and accurately.
Drawings
Fig. 1 is a schematic flow chart of an encoding method according to an embodiment of the present application.
Fig. 2 is a schematic illustration of target objects in an image according to an embodiment of the present application.
Fig. 3 is a schematic flow chart of a decoding method according to an embodiment of the present application.
Fig. 4 is a schematic block diagram of an encoding device according to an embodiment of the present application.
Fig. 5 is a schematic block diagram of an encoding device according to another embodiment of the present application.
Fig. 6 is a schematic block diagram of a decoding device according to an embodiment of the present application.
Fig. 7 is a schematic block diagram of a decoding device according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
First, related technologies and concepts related to embodiments of the present application will be described.
The target object may be an object in the image that requires attention and is to be identified, marked, or observed. It may include a person, an animal, a plant, a public facility, a vehicle, a landscape, scenery, and the like, and may also include other types of objects; it may also be a specific part of a person, animal, plant, public facility, vehicle, landscape, scenery, or other type of object.
Generally, the position and size of the image region should be chosen such that all parts of the target object fall within the image region, or such that at least 80% of the area of the target object falls within the image region.
A sub-image region may be a region within the image region whose pixels share the same attribute.
For a system running from an encoding end to a decoding end, current object tracking technologies are implemented by encoding the video content at the encoding end, then analyzing the video content at the decoding end, finding the objects that require particular attention, and marking them; that is, identification is completed at the decoding end.
Completing identification at the decoding end has the following problems. Video encoding is usually a lossy process, so information in the video content is lost after encoding, and the quality and information content of the video obtained by decoding are reduced to some extent compared with the video content at the encoding end. Analyzing the damaged video content and extracting the objects requiring particular attention at the decoding end therefore usually gives unsatisfactory results. In addition, analyzing the video content and extracting objects at the decoding end consumes a large amount of the decoding end's computing resources; since decoding ends are widely deployed on mobile devices such as mobile phones, and mobile devices are sensitive to power consumption, analyzing the video content at the decoding end consumes computing power and degrades the user experience to a certain degree.
Identifying the target object at the encoding end has the following advantages: 1. the original, uncompressed and lossless video content is analyzed at the encoding end, so the objects requiring particular attention can be extracted more efficiently and more accurately; 2. because the device at the encoding end generally has stronger computing capability, and generally needs to analyze the video content anyway in order to perform additional operations, moving the computation and analysis originally done at the decoding end to the encoding end does not degrade the user experience.
In some implementations, the encoding end may encode the video content using a common video coding standard, such as the H.264/Advanced Video Coding (AVC) standard, the H.265/High Efficiency Video Coding (HEVC) standard, the Audio Video coding Standard (AVS1-P2, AVS2-P2), the VP9 standard, the Alliance for Open Media Video 1 (AV1) standard, or the Versatile Video Coding (VVC) standard, to obtain a video file.
Fig. 1 is a schematic flow chart of an encoding method 100 according to an embodiment of the present application. The encoding method 100 is performed by an encoding device. As shown in Fig. 1, the encoding method 100 includes: S110, performing encoding processing on a current image and generating code stream data, where the code stream data includes identification information, the identification information is used to identify at least one target object in the current image, the identification information includes image region information and pixel information, the image region information includes a position and a size of the image region where the target object is located, and the pixel information includes an attribute of at least one pixel in the image region.
According to the encoding method provided by this embodiment of the application, the position and size of the image region where the target object is located are indicated through the image region information, and the attributes of a plurality of pixels in the image region are indicated through the pixel information, so that the target object is identified with finer granularity and the decoding end can perform operations on the target object more efficiently and more accurately.
In some embodiments, before the encoding processing is performed on the current image in S110 to generate the code stream data, the encoding method 100 may further include: performing image recognition on the current image, determining the target object, and obtaining the identification information of the target object.
For example, the supplemental enhancement information may be SEI (Supplemental Enhancement Information), and the extension data may be ED (Extension Data). SEI and ED may generally be considered parts of the code stream data.
For example, when the image region is rectangular, the image region information may include the coordinates of any corner of the rectangular region (e.g., the coordinates of the upper-left corner), height information of the rectangular region, and width information of the rectangular region.
Further, in other embodiments of the present application, the image region may have other shapes, such as a circle, a polygon, or a shape with curved edges. When the image region is a circle, the image region information may include the coordinates of the center point and radius information.
It should be understood that in the various embodiments of the present application, the image region may include a plurality of sub-image regions, where a sub-image region may be a region whose pixels share the same attribute. For example, one sub-image region may be the region corresponding to the target object and another sub-image region may be the region corresponding to the background; or one sub-image region may be the region corresponding to one part of the target object, another sub-image region may be the region corresponding to another part of the target object, and still another sub-image region may be the region corresponding to the background.
In the embodiments of the present application, the attributes may be given in units of pixels, that is, each pixel corresponds to its own attribute, and correspondingly the pixel information includes information on the attribute of each pixel. Alternatively, the attributes may be given in units of pixel blocks, in which case the pixel information includes information on the attributes of at least one pixel block, where a pixel block includes at least two pixels.
A pixel block may be a regularly shaped block, such as a square or rectangular block, or an irregularly shaped block.
Compared with attributes in units of pixels, attributes in units of pixel blocks can reduce the amount of data stored or transmitted by the encoding device. Those skilled in the art will appreciate that the pixel information may also be obtained in other alternative forms or schemes, which are not listed here one by one.
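The trade-off between per-pixel and per-pixel-block attributes can be sketched as follows. This is an illustrative sketch, not the patent's syntax: the function name and the majority-vote rule for assigning a block's attribute are assumptions.

```python
# Hypothetical sketch: downsample a per-pixel 0/1 mask to one attribute per
# block x block pixel block (majority vote), reducing the amount of attribute
# data by roughly a factor of block**2.
def block_attributes(mask, block=4):
    h, w = len(mask), len(mask[0])
    out = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            # Count object pixels inside this block (clipped at the borders).
            ones = sum(mask[y][x]
                       for y in range(by, min(by + block, h))
                       for x in range(bx, min(bx + block, w)))
            total = (min(by + block, h) - by) * (min(bx + block, w) - bx)
            row.append(1 if ones * 2 >= total else 0)
        out.append(row)
    return out

# An 8x8 mask whose left half is the object becomes a 2x2 grid of attributes.
mask = [[1 if x < 4 else 0 for x in range(8)] for _ in range(8)]
blocks = block_attributes(mask, block=4)
# blocks == [[1, 0], [1, 0]]
```

The same idea applies to multi-valued attributes; only the rule for picking the block's representative value changes.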
Optionally, in the embodiments of the present application, the pixel information may include values assigned to at least one pixel in the image region, where pixels in different sub-image regions are assigned the same or different values. It should be understood that the values of pixels in different sub-image regions of the same image region may be the same or different; for example, the values assigned to the pixels in two non-contiguous sub-image regions may be the same or different. Likewise, the values assigned to pixels in sub-image regions of different image regions may be the same or different; for example, the values assigned to the sub-image regions belonging to the target objects in different image regions may be the same or different, and the values assigned to the sub-image regions belonging to the background in different image regions may be the same or different.
Optionally, in the embodiments of the present application, the attribute of the at least one pixel may include whether the at least one pixel belongs to the target object.
In one possible implementation, among the at least one pixel, a first part of the pixels is assigned a first value to indicate that the first part of the pixels does not belong to the target object; that is, the pixel information includes the values of pixels not belonging to the target object. For example, the image region includes one (or more) sub-image regions constituting the target object, and the image region also includes several sub-image regions constituting the background, which does not belong to the target object.
In another possible implementation, among the at least one pixel, a second part of the pixels is assigned a second value to indicate that the second part of the pixels belongs to the target object; that is, the pixel information includes the values of pixels belonging to the target object. For example, the image region includes one (or more) sub-image regions constituting the target object, and the image region also includes several sub-image regions constituting the background, which does not belong to the target object.
In yet another possible implementation, among the at least one pixel, a first part of the pixels is assigned a first value to indicate that the first part of the pixels does not belong to the target object, and a second part of the pixels is assigned a second value to indicate that the second part of the pixels belongs to the target object.
In one example of attributes in units of pixels, the pixel information may be represented by a template (mask), and the template values may be the binary values 0 and 1: in the pixel information, the template value of a pixel belonging to the target object is 1, and the template value of a pixel belonging to the background is 0. Suppose the image region of target object i is a rectangular region, the image region information of target object i includes the coordinates of the upper-left corner of the rectangular region, height information of the rectangular region, and width information of the rectangular region, and the pixel information of target object i is represented by a template. The specific content of the identification information of target object i may then be as follows.
ar_object_top[i]
ar_object_left[i]
ar_object_width[i]
ar_object_height[i]
for(m=0;m<ar_object_height[i];m++)
for(n=0;n<ar_object_width[i];n++)
mask[m][n]=is object?1:0
where ar_object_top[i], ar_object_left[i], ar_object_width[i], and ar_object_height[i] denote the position and size of target object i: ar_object_top[i] and ar_object_left[i] denote the position of the upper-left corner of target object i, and ar_object_width[i] and ar_object_height[i] denote the width and height of target object i. mask[m][n] denotes the template value corresponding to the pixel whose coordinates are offset by m and n in the vertical and horizontal directions relative to the upper-left corner of the rectangular region. When the pixel belongs to the target object, the value of mask[m][n] is 1; otherwise, when the pixel belongs to the background, the value of mask[m][n] is 0.
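The ar_object_* syntax above can be sketched as ordinary code. This is a hedged illustration: the dictionary layout and the is_object predicate are stand-ins for the encoder's actual syntax elements and object/background decision.

```python
# Sketch of building the identification information for one target object i,
# following the ar_object_top/left/width/height + mask syntax above.
def build_identification_info(top, left, width, height, is_object):
    info = {
        "ar_object_top": top,
        "ar_object_left": left,
        "ar_object_width": width,
        "ar_object_height": height,
    }
    # mask[m][n] = 1 if the pixel at image coordinates (top + m, left + n)
    # belongs to the target object, 0 if it belongs to the background.
    info["mask"] = [[1 if is_object(top + m, left + n) else 0
                     for n in range(width)]
                    for m in range(height)]
    return info

# Example: a 4x4 region whose upper-left corner is at (10, 20), where the
# object occupies the top-left 2x2 pixels (a toy predicate for illustration).
info = build_identification_info(
    10, 20, 4, 4,
    is_object=lambda y, x: y < 12 and x < 22)
```

A decoder receiving this information can locate the rectangle from the four position/size fields and then read the mask to separate object pixels from background pixels inside it.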
Fig. 2 is a schematic diagram of target objects in an image 200 of the present application. As shown in Fig. 2, the image 200 includes a target object 1 and a target object 2, where the image region 1 corresponding to target object 1 is a rectangular region and the image region 2 corresponding to target object 2 is also a rectangular region. In image region 1, a pixel with the value 1 belongs to target object 1 and a pixel with the value 0 does not belong to target object 1; in image region 2, a pixel with the value 1 belongs to target object 2 and a pixel with the value 0 does not belong to target object 2.
In the embodiments of the present application, the attribute of the at least one pixel may include the part of the target object to which the at least one pixel belongs.
In one specific example, the target object is a person. A first part of the at least one pixel is assigned a third value to indicate that the first part of the pixels belongs to the head of the target object, and/or a second part of the at least one pixel is assigned a fourth value to indicate that the second part of the pixels belongs to the hand of the target object. The at least one pixel may further include a third part of pixels that does not belong to the target object but belongs to the background. For example, the third part of the pixels is assigned 0 to indicate that it does not belong to the target object but belongs to the background, the first part of the pixels is assigned 1 to indicate that it belongs to the head of the target object, and the second part of the pixels is assigned 2 to indicate that it belongs to the hand of the target object.
In another specific example, the target object is a car. A first part of the at least one pixel is assigned a fifth value to indicate that the first part of the pixels belongs to the front of the target object, and/or a second part of the at least one pixel is assigned a sixth value to indicate that the second part of the pixels belongs to the rear of the target object. The at least one pixel may further include a third part of pixels that does not belong to the target object but belongs to the background. For example, the third part of the pixels is assigned 0 to indicate that it does not belong to the target object but belongs to the background, the first part of the pixels is assigned 1 to indicate that it belongs to the front of the target object, and the second part of the pixels is assigned 2 to indicate that it belongs to the rear of the target object.
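The two examples above describe a multi-valued mask. A minimal sketch, using the 0/1/2 assignments from those examples; the constant and function names are illustrative, not from the patent:

```python
# Multi-valued mask identifying parts of a target object:
# 0 = background, 1 = first part (head / car front), 2 = second part
# (hand / car rear).
BACKGROUND, PART_A, PART_B = 0, 1, 2

def part_pixels(mask, value):
    """Collect the (row, col) coordinates of pixels carrying a given label."""
    return [(m, n)
            for m, row in enumerate(mask)
            for n, v in enumerate(row)
            if v == value]

mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 0, 0, 2],
]
part_a = part_pixels(mask, PART_A)  # e.g. the 4 head pixels
part_b = part_pixels(mask, PART_B)  # e.g. the 2 hand pixels
```

A decoder can thus recover not only where the object is but which part of it each pixel belongs to, at no extra structural cost over a binary mask.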
In some specific examples, the descriptive characteristic of the at least one pixel may include at least one of: the reflected intensity of the point cloud at the at least one pixel, the infrared intensity at the at least one pixel, and the depth value of the at least one pixel, where depth is a measure of distance, such as the distance to the lens.
A specific identification scheme is provided that can effectively identify a target object while also improving identification efficiency and reducing data storage and transmission. The core idea of this identification scheme is to compare the current image with an encoded image and identify the one or more target objects that have changed relative to the encoded image.
Optionally, in the embodiments of the present application, the target object may be at least one of: an identification object newly added in the current image relative to the encoded image, an identification object whose position in the current image changes relative to the encoded image, an identification object whose size in the current image changes relative to the encoded image, and an identification object whose pixel information in the image region changes in the current image relative to the encoded image.
Alternatively, from the perspective of the performed steps, the encoding method 100 may further include at least one of: determining an identification object newly added in the current image relative to the encoded image as the target object, determining an identification object whose position and/or size in the current image changes relative to the encoded image as the target object, and determining an identification object whose pixel information in the image region changes in the current image relative to the encoded image as the target object.
From the viewpoint of the code stream data, the identification information in the code stream data further includes a category identification bit for indicating at least one of the following situations: the target object is an identification object newly added in the current image relative to the encoded image; the target object is an identification object whose position changes relative to the encoded image; the target object is an identification object whose size changes relative to the encoded image; and the target object is an identification object whose pixel information in the image region changes relative to the encoded image.
It should be understood that, in the embodiments of the present application, an identification object whose position in the current image changes relative to the encoded image may mean that the position of the identification object itself changes, or that the position of the image region where the identification object is located changes. Similarly, an identification object whose size in the current image changes relative to the encoded image may mean that the size of the identification object itself changes, or that the size of the image region where the identification object is located changes.
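The category identification bit described above can be sketched as a small set of flags. This is an assumed encoding for illustration only; the patent does not fix the bit values or names:

```python
# Illustrative sketch of a category identification bit as combinable flags.
from enum import IntFlag

class Category(IntFlag):
    NEW_OBJECT = 1        # newly added relative to the encoded image
    POSITION_CHANGED = 2  # position of the object / its region changed
    SIZE_CHANGED = 4      # size of the object / its region changed
    PIXELS_CHANGED = 8    # pixel information in the region changed

# A single value can indicate several situations at once, e.g. an object
# that both moved and changed size:
cat = Category.POSITION_CHANGED | Category.SIZE_CHANGED
```

A decoder can test each bit to decide which fields (position, size, pixel information) to expect in the rest of the identification information.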
In some implementations, the target object includes an identification object newly added in the current image relative to the encoded image, and the image region information includes the absolute value of the position and the absolute value of the size of the image region where the newly added identification object is located.
In some implementations, the target object may include an identification object whose position changes relative to the encoded image, and the image region information of the target object (i.e., the identification object with the changed position) includes the absolute value of the position of the image region where the target object is located or the relative value of the position change.
In the above implementation in which the current image has a target object (i.e., an identification object with a changed position) relative to the encoded image, there are two possible cases: the size of the target object's image region in the current image either changes or remains unchanged compared with the size of its image region in the encoded image.
In the case of a change, optionally, the image region information of the target object includes the absolute value of the size of the image region where the target object is located or the relative value of the size change. The absolute value of the size refers to the size of the target object's image region in the current image; the relative value of the size change refers to the difference between the size of the target object's image region in the encoded image and the size of its image region in the current image.
In the case where the size remains unchanged, optionally, the image region information of the target object includes an identification bit for indicating that the size of the image region where the target object is located remains unchanged compared with the size of the image region in the encoded image. Optionally, the size of the image region is then not encoded in the image region information of the target object in the code stream data.
In the above implementation in which the current image has such a target object (i.e., an identification object with a changed position) relative to the encoded image, there are also two possible cases: the pixel information of the image region where the target object is located in the current image either changes or remains unchanged compared to the pixels of that image region in the encoded image.
In the case of a change, the pixel information of the target object optionally includes the absolute value of the attribute of at least one pixel of the image region where the target object is located, or the relative value of the attribute change of at least one pixel. The absolute value of the attribute refers to the attribute of at least one pixel of the image region where the target object is located in the current image; it may cover all pixels in the image region, or only the part of the pixels whose attribute has changed. The relative value of the attribute change refers to the difference between the value assigned to a pixel of the image region where the target object is located in the current image and the value assigned to the corresponding pixel in the encoded image.
In the case of remaining unchanged, optionally, the image region information of the target object includes an identification bit indicating that the pixel information of the image region where the target object is located remains unchanged compared to the encoded image. Optionally, the pixel information of the target object is then not encoded in the code stream data.
In some implementations, the target object may include an identification object whose size changes relative to the encoded image, and the image region information of the target object (i.e., the size-changed identification object) includes the absolute value of the size of the image region where the target object is located or the relative value of the size change.
In the above implementation in which the current image has such a target object (i.e., an identification object with a changed size) relative to the encoded image, there are two possible cases: the position of the image region where the target object is located in the current image either changes or remains unchanged compared to its position in the encoded image.
In the case of a change, optionally, the image area information of the target object includes an absolute value of a position of an image area in which the target object is located or a relative value of the change in position.
In the case of remaining unchanged, optionally, the image region information of the target object includes an identification bit indicating that the position of the image region where the target object is located remains unchanged compared to the encoded image. Optionally, the position of the image region is then not encoded in the image region information of the target object in the code stream data.
In the above implementation in which the current image has such a target object (i.e., an identification object with a changed size) relative to the encoded image, there are two possible cases: the pixel information of the image region where the target object is located in the current image either changes or remains unchanged compared to the pixels of that image region in the encoded image.
In the case of a change, optionally, the pixel information of the target object includes the absolute value of the attribute of at least one pixel of the image region where the target object is located, or the relative value of the attribute change of at least one pixel.
In the case of remaining unchanged, optionally, the image region information of the target object includes an identification bit indicating that the pixel information of the image region where the target object is located remains unchanged compared to the encoded image. Optionally, the pixel information of the target object is then not encoded in the code stream data.
In some implementations, the target object may include an identification object whose pixel information in the image region of the current image changes relative to the encoded image; the pixel information of the target object (i.e., the identification object with changed pixel information) then includes the absolute value of the attribute of at least one pixel of the image region where the target object is located in the current image, or the relative value of the attribute change.
In the above implementation in which the current image has such a target object (i.e., an identification object with changed pixel information) relative to the encoded image, there are two possible cases: the position of the image region where the target object is located in the current image either changes or remains unchanged compared to its position in the encoded image.
In the case of a change, optionally, the image area information of the target object includes an absolute value of a position of an image area in which the target object is located or a relative value of the change in position.
In the case of remaining unchanged, optionally, the image region information of the target object includes an identification bit indicating that the position of the image region where the target object is located remains unchanged compared to the encoded image. Optionally, the position of the image region is then not encoded in the image region information of the target object in the code stream data.
In the above implementation in which the current image has such a target object (i.e., an identification object with changed pixel information) relative to the encoded image, there are two possible cases: the size of the image region where the target object is located in the current image either changes or remains unchanged compared to its size in the encoded image.
In the case of a change, optionally, the image area information of the target object includes an absolute value of a size of an image area where the target object is located or a relative value of the size change.
In the case of remaining unchanged, optionally, the image region information of the target object includes an identification bit indicating that the size of the image region where the target object is located remains unchanged compared to its size in the encoded image. Optionally, the size of the image region is then not encoded in the image region information of the target object in the code stream data.
For example, for the case where both the position and the size of the image region where the target object is located remain unchanged, the image region information may include an identification bit indicating that the size and the position of the image region where the target object is located remain unchanged compared to the encoded image.
As will be understood by those skilled in the art, when only one or two of the three parameters, namely the position of the image region where the target object is located, the size of that image region, and the pixels of the target object in the image region, change, the identification information may still include both the image region information and the pixel information.
(Syntax table image: the syntax structure carrying the identification information described above; the semantics of its syntax elements are given below.)
Here, ar_num_objects_minus1 represents the number of objects to be identified in the current picture; ar_object_idx[i] represents the label of the i-th object to be identified in the current image; ar_new_object_flag[ar_object_idx[i]] indicates whether the object labeled ar_object_idx[i] in the current image is a newly appearing object; ar_object_bounding_box_update_flag[ar_object_idx[i]] indicates whether the position and size of the object labeled ar_object_idx[i] have changed between the current image and the encoded image; ar_object_top[ar_object_idx[i]], ar_object_left[ar_object_idx[i]], ar_object_width[ar_object_idx[i]], and ar_object_height[ar_object_idx[i]] represent the position and size of the object labeled ar_object_idx[i], where ar_object_top and ar_object_left give the position of the upper-left corner of the object, and ar_object_width and ar_object_height give its width and height; ar_bounding_box_mask_update_flag[ar_object_idx[i]] indicates whether the attributes of the pixels of the object labeled ar_object_idx[i] have changed. mask[m][n] represents the template value corresponding to the pixel whose coordinates are offset by m and n in the vertical and horizontal directions relative to the upper-left corner of the rectangular region: when the pixel belongs to the target object, mask[m][n] is 1; when the pixel belongs to the background, mask[m][n] is 0.
It should be understood that ar_new_object_flag and similar flags can be regarded as the category identification bits mentioned above. ar_object_idx[i] is the label of a target object, and may also be called the indicator bit, number, or index of the target object, used to indicate which target object it is.
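The mask[m][n] semantics above can be sketched as follows. This is an illustrative sketch only; the helper name and the 5x5 "disc" object shape are invented for the example and are not part of the patent's syntax.

```python
# Sketch of the mask[m][n] semantics: for a rectangular image region,
# each entry is 1 when the pixel offset (m, n) from the region's
# upper-left corner belongs to the target object, and 0 when it
# belongs to the background.

def make_mask(height, width, inside):
    """Build mask[m][n] from a predicate `inside(m, n)` that says
    whether the pixel at that offset belongs to the target object."""
    return [[1 if inside(m, n) else 0 for n in range(width)]
            for m in range(height)]

# Hypothetical object: pixels within distance 2 of the region centre.
mask = make_mask(5, 5, lambda m, n: (m - 2) ** 2 + (n - 2) ** 2 <= 4)

assert mask[2][2] == 1   # centre pixel belongs to the object
assert mask[0][0] == 0   # corner pixel is background
```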
Optionally, in some embodiments of the present application, the code stream data and/or the identification information may further include an indication bit of the encoded image, used to indicate which encoded image is currently being referred to.
In some alternative embodiments of the present application, the encoded image may be determined by searching, among the plurality of images that have already been encoded, using the labels of one or more target objects in the current image as the search condition, for the image whose included target objects are closest to those of the current image, and taking that image as the encoded image for reference.
Alternatively, in yet other embodiments of the present application, the encoded image to be used as a reference may be determined by searching based on at least one parameter value of the same target object as in the current image; that is, when the same target object is present and its position and/or size and/or pixel information is closest, the encoded image to be used as a reference is considered found. The search may also be performed based on at least one parameter value alone; that is, regardless of whether the target object is the same, the image whose position and/or size and/or pixel information is closest is taken as the reference.
In some application scenarios of unmanned aerial vehicles, the unmanned aerial vehicle can control the camera device through a gimbal so that a target object, such as a person, is kept at the center of the picture or at a certain position of the picture. Combined with the encoding method of the embodiments of the present application, this means that the center of the image region where the target object is located is kept at the center of the picture or at a certain position of the picture. In such an application scenario, or in an application scenario in which the position of the image region remains unchanged over a plurality of frames, the image region may be a rectangular region, and the image region information may include the center-point coordinates of the rectangular region, the height information of the rectangular region, and the width information of the rectangular region.
For such an application scenario, the image region is a rectangular region, and the image region information includes the center-point coordinates of the rectangular region, the height information of the rectangular region, and the width information of the rectangular region. The image region information may further comprise an identification bit indicating that the center-point coordinates of the image region where the target object is located remain unchanged compared to the encoded image.
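A decoder receiving the center-point representation above can recover the conventional upper-left-corner representation with simple arithmetic. This is an illustrative sketch under our own naming and integer-division convention; it is not a syntax element of the patent.

```python
# Sketch: convert a rectangular region signalled as centre-point
# coordinates plus width and height into the (top, left) upper-left
# corner form used elsewhere in this description. Integer pixel
# coordinates with floor division are assumed.

def centre_to_top_left(cx, cy, width, height):
    """cx, cy: centre-point column and row; returns (top, left)."""
    return (cy - height // 2, cx - width // 2)

# A region centred at column 960, row 540 (e.g. a 1920x1080 frame centre).
assert centre_to_top_left(960, 540, 100, 60) == (510, 910)
```

When the "centre unchanged" identification bit is set, only the width and height need to be signalled, and the decoder reuses the previous centre point in this conversion.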
It should be understood that each identification object in the embodiments of the present application may have a unique label or index. Moreover, the label or index of the same identification object in different images may be the same.
(Syntax table image: the syntax structure for signalling objects that no longer exist; the semantics of its syntax elements are given below.)
Here, ar_num_cancel_objects represents the number of objects that no longer exist in the current picture relative to the encoded picture; ar_cancel_object_idx[i] represents the label of such an object that no longer exists.
Optionally, in the embodiments of the present application, the identification information may also include content information indicating the content of the target object.
In some cases, the content information may be label information: the label may directly indicate the content of the target object using natural language, and the natural language may be expressed according to the Request For Comments (RFC) 5646 standard of the Internet Engineering Task Force (IETF), i.e., the IETF RFC 5646 standard. In other cases, the content information may be a value; that is, a value may be added whose different settings indicate what the target object is.
Optionally, in some embodiments of the present application, the code stream data may further include image content data of the current image.
In some possible implementations, the image content data of the current image includes reference frame data of the current image and residual data between the current image and the reference frame.
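The reference-plus-residual representation implies the following reconstruction at the decoder. This is a schematic sketch only: plain nested lists stand in for real frame buffers, and clipping, transform, and quantization steps of a real codec are omitted.

```python
# Sketch: reconstruct the current image from reference frame data and
# residual data, applied per pixel. Real codecs operate on transformed,
# quantized residuals; here the residual is taken as a plain per-pixel
# difference for illustration.

def reconstruct(reference, residual):
    """current[i][j] = reference[i][j] + residual[i][j]"""
    return [[r + d for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)]

reference = [[10, 20], [30, 40]]
residual  = [[ 1, -2], [ 0,  5]]
assert reconstruct(reference, residual) == [[11, 18], [30, 45]]
```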
Fig. 3 is a schematic flowchart of a decoding method 300 according to some embodiments provided in the present application. The decoding method 300 is executed by a decoding device. As shown in Fig. 3, the decoding method 300 includes: S310, acquiring code stream data of a current image, where the code stream data includes identification information used to identify at least one target object in the current image, the identification information includes image region information and pixel information, the image region information includes the position and size of the image region where the target object is located, and the pixel information includes attributes of at least one pixel in the image region; and S320, performing decoding processing on at least part of the code stream data.
According to the decoding method provided by the embodiments of the present application, the position and size of the image region where the target object is located are indicated by the image region information, and the attributes of a plurality of pixels in the image region are indicated by the pixel information, so that the target object is identified at a finer granularity, which helps the decoding device operate on the target object more efficiently and more accurately.
In the decoding method provided in this embodiment, the code stream data of the current image acquired in S310 may be the same as the code stream data in the encoding method provided in the present application; for an explanation of the code stream data in S310, reference may be made to the explanation of the code stream data in the encoding method above.
Optionally, in some embodiments of the present application, the attribute of the at least one pixel may include whether the at least one pixel belongs to the target object.
Optionally, in the embodiments of the present application, the image region may comprise a plurality of sub-image regions, and the pixel information may comprise values assigned to at least one pixel in the image region, where pixels in different sub-image regions are assigned different values.
Optionally, in some embodiments of the present application, at least one pixel may be assigned different values in the pixel information, and the decoding processing performed on at least part of the code stream data in S320 may include: determining whether the at least one pixel in the image region belongs to the target object according to the pixel information in the code stream data.
In some possible implementations, among the at least one pixel, a first part of the pixels may be assigned a first value, and determining whether the at least one pixel in the image region belongs to the target object according to the pixel information in the code stream data may include: determining that the first part of the pixels does not belong to the target object when the first part of the pixels in the pixel information in the code stream data corresponds to the first value. For example, when the first part of the pixels in the pixel information corresponds to 0, it is determined that the first part of the pixels does not belong to the target object.
In other possible implementations, a second part of the at least one pixel may be assigned a second value, and determining whether the at least one pixel in the image region belongs to the target object according to the pixel information in the code stream data may include: determining that the second part of the pixels belongs to the target object when the second part of the pixels in the pixel information in the code stream data corresponds to the second value.
It should be understood that, similarly to the encoding method, the two possible implementations described above may be implemented separately or in combination with each other, and this is not limited in this embodiment of the application.
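Combining the two implementations, a decoder can read the per-pixel values to recover the set of object pixels. The following sketch assumes the first value is 0 (background) and the second value is 1 (object), matching the mask semantics; the function name is our own.

```python
# Sketch: decoder-side interpretation of the pixel information. Pixels
# carrying the first value (0) are background; pixels carrying the
# second value (1) belong to the target object.

def pixels_of_object(mask):
    """Return (m, n) offsets, within the image region, of the pixels
    that belong to the target object."""
    return [(m, n)
            for m, row in enumerate(mask)
            for n, value in enumerate(row)
            if value == 1]

mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
assert (1, 1) in pixels_of_object(mask)        # object pixel
assert (0, 0) not in pixels_of_object(mask)    # background pixel
assert len(pixels_of_object(mask)) == 5
```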
Optionally, in some embodiments of the present application, the attribute of the at least one pixel may include the part of the target object to which the at least one pixel belongs.
In some possible implementations, different pixels may be assigned different values in the pixel information, and the decoding processing performed on at least part of the code stream data in S320 may include: determining, according to the pixel information in the code stream data, the part of the target object to which the at least one pixel in the image region belongs.
In a specific example, the target object may be a person. Determining, according to the pixel information in the code stream data, the part of the target object to which the at least one pixel in the image region belongs may include: determining that a first part of the pixels belongs to the head of the target object when the first part of the pixels in the pixel information in the code stream data corresponds to a third value; and/or determining that a second part of the pixels belongs to a hand of the target object when the second part of the pixels in the pixel information in the code stream data corresponds to a fourth value.
In another specific example, the target object may be a car. Determining, according to the pixel information in the code stream data, the part of the target object to which the at least one pixel in the image region belongs may include: determining that a first part of the pixels belongs to the head of the target object when the first part of the pixels in the pixel information in the code stream data corresponds to a fifth value; and/or determining that a second part of the pixels belongs to the tail of the target object when the second part of the pixels in the pixel information in the code stream data corresponds to a sixth value.
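The per-part labelling in the two examples above amounts to a lookup table from pixel value to object part. The concrete value assignments (3, 4, 5, 6) below follow the "third value" through "sixth value" in the text only as an illustration; the patent does not fix these numbers.

```python
# Sketch: per-part pixel labelling. Instead of a binary mask, each
# pixel value selects which part of the target object the pixel
# belongs to. Value assignments are illustrative, not normative.

PERSON_PARTS = {0: "background", 3: "head", 4: "hand"}
CAR_PARTS = {0: "background", 5: "head", 6: "tail"}

def part_of(value, part_table):
    """Map a decoded pixel value to the part of the target object."""
    return part_table.get(value, "unknown")

assert part_of(3, PERSON_PARTS) == "head"
assert part_of(6, CAR_PARTS) == "tail"
```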
For example, the descriptive characteristics of the at least one pixel may include at least one of: the reflected intensity of the point cloud at the at least one pixel, the infrared intensity of the at least one pixel, and the depth value of the at least one pixel.
Optionally, in some embodiments of the present application, the attribute is measured in units of pixel blocks; the pixel information may include attribute information of at least one pixel block, and a pixel block may include at least two pixels.
Optionally, in some embodiments of the present application, the decoding method 300 may further include: determining, according to the category identification bit, that the target object satisfies at least one of the following: it is an identification object newly added in the current image relative to the decoded image; it is an identification object whose position in the current image changes relative to the decoded image; it is an identification object whose size in the current image changes relative to the decoded image; and it is an identification object whose pixel information in the image region changes relative to the decoded image.
In some implementations, the target object may include an identification object newly added in the current image relative to the decoded image, and the image region information may include the absolute value of the position and the absolute value of the size of the image region where the target object is located. When the current image adds an identification object relative to the decoded image, both the image region information and the pixel information should be signalled. The decoding processing performed on at least part of the code stream data in S320 may include: determining the position and size of the image region where the target object (i.e., the newly added identification object) is located according to the image region information in the code stream data.
In some implementations, the target object may include an identification object whose position changes relative to the decoded image, and the image region information of the target object (i.e., the position-changed identification object) includes the absolute value of the position of the image region where the target object is located or the relative value of the position change.
When the image area information includes the relative value of the position change of the image area where the target object is located, S320 performs decoding processing on at least part of the code stream data, which may include: and determining the position of the target object in the image area in the current image according to the position of the target object in the image area in the decoded image and the relative value of the position change of the image area. For example, the decoding apparatus may determine the position of the image region where the target object is located in the decoded image; and determining the position of the image area where the target object is located in the current image according to the position of the image area where the target object is located in the decoded image and the difference value between the position of the image area where the target object is located in the decoded image and the position of the image area where the target object is located in the current image.
In the above implementation manner in which the current image has a target object (i.e., an identification object with a changed position) relative to the decoded image, there may be two situations in which the size of the image area of the target object in the current image is changed or remains unchanged from the size of the image area in the decoded image.
In the case of a change, optionally, the image region information of the target object includes the absolute value of the size of the image region where the target object is located or the relative value of the size change. The absolute value of the size refers to the size of the image region where the target object is located in the current image; the relative value of the size change refers to the difference between the size of the image region where the target object is located in the current image and its size in the decoded image.
When the image area information includes the relative value of the size change of the image area where the target object is located, S320 performs decoding processing on at least part of the code stream data, which may include: and determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image and the relative value of the size change of the image area. For example, the decoding apparatus may determine the size of an image region in which the target object is located in the decoded image; and determining the size of the image area where the target object is located in the current image according to the size of the image area where the target object is located in the decoded image and the difference value between the size of the image area where the target object is located in the decoded image and the size of the image area where the target object is located in the current image.
In the case of being kept unchanged, optionally, the image region information of the target object includes an identification bit for indicating that the size of the image region in which the target object is located is kept unchanged compared to the size of the image region in the decoded image. Optionally, the size of the image area is not encoded in the image area information of the target object in the code stream data. S320, decoding at least a part of the code stream data, and may further include: and determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image. Namely, the size of the image area of the target object in the decoded image is determined as the size of the image area of the target object in the current image.
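The two decoder branches above (size delta signalled vs. "unchanged" identification bit) can be sketched together. Names and the `(width, height)` tuple form are our own assumptions, not the patent's syntax elements.

```python
# Decoder-side sketch for the size of the image region: if the bitstream
# carries a size delta, apply it to the size taken from the decoded
# image; if it carries the "unchanged" identification bit instead,
# reuse the decoded image's size directly.

def decode_size(ref_size, size_delta=None, unchanged=False):
    """ref_size: (width, height) of the region in the decoded image."""
    if unchanged:
        return ref_size
    if size_delta is not None:
        return (ref_size[0] + size_delta[0], ref_size[1] + size_delta[1])
    raise ValueError("bitstream must signal either a size delta or the flag")

assert decode_size((64, 48), size_delta=(8, -4)) == (72, 44)
assert decode_size((64, 48), unchanged=True) == (64, 48)
```

The position of the region can be decoded by the same pattern, with a position delta or a "position unchanged" identification bit.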
In the above implementation manner that the current image has a target object (i.e., an identification object with a changed position) relative to the decoded image, there may be two situations that the pixel information of the image area where the target object is located in the current image is changed or remains unchanged compared to the pixels of the image area where the target object is located in the decoded image.
In the case of a change, the pixel information of the target object optionally includes the absolute value of the attribute of at least one pixel of the image region where the target object is located, or the relative value of the attribute change of at least one pixel. The absolute value of the attribute refers to the attribute of at least one pixel of the image region where the target object is located in the current image; it may cover all pixels in the image region, or only the part of the pixels whose attribute has changed.
When the pixel information includes the relative value of the attribute change of at least one pixel of the image region where the target object is located, the decoding processing performed on at least part of the code stream data in S320 may further include: determining the pixel information of the target object in the current image according to the pixel information of the target object in the decoded image and the relative value of the attribute change of the at least one pixel.
When the pixel information includes information of a part of pixels of which the attributes of the target object in the image region of the current image are changed, the decoding device may consider that the attributes of the rest of pixels are not changed.
In the case of being kept unchanged, optionally, the image region information of the target object includes an identification bit for indicating that the pixel information of the image region where the target object is located is kept unchanged from that in the decoded image. Optionally, the pixel information of the target object is not encoded in the code stream data. Correspondingly, S320 performs decoding processing on at least part of the code stream data, and may further include: and determining the pixel information of the target object in the image area of the current image according to the pixel information of the image area of the target object in the decoded image.
In some implementations, the target object may include an identification object whose size changes relative to the decoded image; the image region information of the target object (i.e., the size-changed identification object) then includes the absolute value of the size of the image region where the target object is located or the relative value of the size change. When the image region information includes the relative value of the size change of the image region where the target object is located, the decoding processing performed on at least part of the code stream data in S320 may include: determining the size of the image region where the target object is located in the current image according to the size of the image region where the target object is located in the decoded image and the relative value of the size change of the image region.
In the above implementation manner in which the current image has a target object (i.e., an identification object with a changed size) relative to the decoded image, there may be two situations in which the position of the image area in which the target object is located in the current image is changed or remains unchanged from the position of the image area in which the target object is located in the decoded image.
In the case of a change, optionally, the image area information of the target object includes an absolute value of a position of an image area in which the target object is located or a relative value of the change in position. When the image area information includes the relative value of the position change of the image area where the target object is located, S320 performs decoding processing on at least part of the code stream data, and may further include: and determining the position of the target object in the image area in the current image according to the position of the target object in the image area in the decoded image and the relative value of the position change of the image area.
In the case of being kept unchanged, optionally, the image region information of the target object includes an identification bit for indicating that the position of the image region where the target object is located is kept unchanged from that in the decoded image. Optionally, the position of the image area is not encoded in the image area information of the target object in the code stream data. S320, decoding at least a part of the code stream data, and may further include: and determining the position of the target object in the image area of the current image according to the position of the target object in the image area of the decoded image. Namely, the position of the image area of the target object in the decoded image is determined as the position of the image area of the target object in the current image.
In the above implementation in which the current image has such a target object (i.e., an identification object with a changed size) relative to the decoded image, there are two possible cases: the pixel information of the image region where the target object is located in the current image either changes or remains unchanged compared to the pixels of that image region in the decoded image.
When the pixel information includes the relative value of the attribute change of at least one pixel of the image region where the target object is located, the decoding processing performed on at least part of the code stream data in S320 may further include: determining the pixel information of the target object in the current image according to the pixel information of the target object in the decoded image and the relative value of the attribute change of the at least one pixel.
In the case of being kept unchanged, optionally, the image region information of the target object includes an identification bit for indicating that the pixel information of the image region where the target object is located is kept unchanged from that in the decoded image. Optionally, the pixel information of the target object is not encoded in the code stream data. Optionally, the pixel information of the target object is not encoded in the code stream data. Correspondingly, S320 performs decoding processing on at least part of the code stream data, and may further include: and determining the pixel information of the target object in the image area of the current image according to the pixel information of the image area of the target object in the decoded image.
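The two cases above (pixel information delta-coded versus carried over by an identification bit) can be sketched as follows. This is an illustrative reconstruction, not the patent's normative process; the field names `unchanged_flag` and `mask_delta` are assumptions, and the relative change is modeled as a per-pixel XOR of a 0/1 template.

```python
# Hypothetical sketch: reconstructing an object's pixel mask at the decoder.
# Field names (unchanged_flag, mask_delta) are illustrative, not from the spec.

def reconstruct_mask(prev_mask, unchanged_flag, mask_delta=None):
    """Return the object's 0/1 template in the current image.

    prev_mask: rows of 0/1 template values from the decoded image.
    unchanged_flag: 1 means the mask is inherited from the decoded image.
    mask_delta: per-pixel XOR delta (1 = this pixel's value flipped).
    """
    if unchanged_flag:
        # Pixel information was not encoded; reuse the decoded image's mask.
        return [row[:] for row in prev_mask]
    # Apply the relative change: flip every pixel whose delta bit is 1.
    return [[p ^ d for p, d in zip(prow, drow)]
            for prow, drow in zip(prev_mask, mask_delta)]

prev = [[0, 1], [1, 1]]
same = reconstruct_mask(prev, unchanged_flag=1)
updated = reconstruct_mask(prev, unchanged_flag=0, mask_delta=[[0, 0], [1, 0]])
```

Here the unchanged case costs only one identification bit in the stream, while the delta case costs one bit per pixel of the region, matching the trade-off the text describes.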
In some implementations, the target object may include an identification object whose pixel information in the image area has changed in the current image relative to the decoded image. In this case, the pixel information of the target object (i.e., the pixel-information-changed identification object) includes an absolute value of the attribute of at least one pixel of the image area where the target object is located, or a relative value of the attribute change.
When the pixel information includes the relative value of the attribute change of at least one pixel of the image area where the target object is located, S320, performing decoding processing on at least part of the code stream data, may include: determining the pixel information of the target object in the current image according to the pixel information of the target object in the decoded image and the relative value of the attribute change of the at least one pixel.
In the case of a change, optionally, the image area information of the target object includes an absolute value of the position of the image area where the target object is located, or a relative value of the position change. When the image area information includes the relative value of the position change, S320, performing decoding processing on at least part of the code stream data, may further include: determining the position of the image area of the target object in the current image according to the position of the image area of the target object in the decoded image and the relative value of the position change.
In the case of remaining unchanged, optionally, the image area information of the target object includes an identification bit indicating that the position of the image area where the target object is located is unchanged from that in the decoded image. Optionally, the position of the image area is then not encoded in the image area information in the code stream data. S320, performing decoding processing on at least part of the code stream data, may further include: determining the position of the image area of the target object in the current image according to the position of the image area of the target object in the decoded image; that is, the position of the image area of the target object in the decoded image is taken as the position of the image area of the target object in the current image.
In the above implementation, where the current image has a target object whose pixel information has changed relative to the decoded image (i.e., a pixel-information-changed identification object), there are two possible cases: the size of the image area where the target object is located in the current image is either changed or unchanged compared with that in the decoded image.
In the case of a change, optionally, the image area information of the target object includes an absolute value of the size of the image area where the target object is located, or a relative value of the size change. When the image area information includes the relative value of the size change, S320, performing decoding processing on at least part of the code stream data, may include: determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image and the relative value of the size change.
In the case of remaining unchanged, optionally, the image area information of the target object includes an identification bit indicating that the size of the image area where the target object is located is unchanged from that in the decoded image. Optionally, the size of the image area is then not encoded in the image area information in the code stream data. S320, performing decoding processing on at least part of the code stream data, may further include: determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image.
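As a hedged sketch of the size-recovery logic in S320, the three ways the size can arrive (absolute value, relative change, or an unchanged-flag) can be modeled as a tagged field; this three-way encoding is an assumption made for illustration, not the exact bitstream syntax.

```python
# Illustrative sketch of S320's size recovery for the target object's region.
# The ('abs' | 'rel' | 'same') tagging of size_field is hypothetical.

def decode_region_size(prev_size, size_field):
    """prev_size: (w, h) of the object's region in the decoded image.
    size_field: ('abs', (w, h)) | ('rel', (dw, dh)) | ('same', None)."""
    kind, value = size_field
    if kind == 'same':            # identification bit: size unchanged
        return prev_size
    if kind == 'abs':             # absolute value of the size
        return value
    dw, dh = value                # relative value of the size change
    return (prev_size[0] + dw, prev_size[1] + dh)

cur = decode_region_size((64, 32), ('rel', (8, -4)))
```

The relative form needs the decoded image's region size as a predictor, which is why the text conditions it on information from the decoded image.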
It should be added that at least some of the above-described implementations can be combined.
In a specific application scenario of an unmanned aerial vehicle, the image area may be a rectangular area, and the image area information may include the center point coordinates, the height information, and the width information of the rectangular area. When the position of the image area where the target object is located remains unchanged while its size changes, the code stream data may omit the value of the center point coordinates and instead carry an identification bit indicating that the center point coordinates of the image area where the target object is located remain unchanged. S320, performing decoding processing on at least part of the code stream data, may then include: determining the center point coordinates of the image area of the target object in the current image according to the center point coordinates of the image area of the target object in the decoded image, and determining the height information and the width information of the image area by decoding them from the image area information.
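The UAV scenario above can be sketched as follows; the dictionary layout and the argument names (`center_unchanged`, `new_wh`, `new_center`) are hypothetical conveniences, not syntax elements from the patent.

```python
# Sketch of the UAV case: center coordinates carried over via an
# identification bit while width/height are decoded anew.

def decode_rect(prev_rect, center_unchanged, new_wh, new_center=None):
    """prev_rect: dict with 'cx', 'cy', 'w', 'h' from the decoded image."""
    if center_unchanged:
        cx, cy = prev_rect['cx'], prev_rect['cy']  # inherit center point
    else:
        cx, cy = new_center                        # decoded from the stream
    w, h = new_wh                                  # size is always decoded here
    return {'cx': cx, 'cy': cy, 'w': w, 'h': h}

prev = {'cx': 100, 'cy': 80, 'w': 40, 'h': 30}
cur = decode_rect(prev, center_unchanged=True, new_wh=(48, 36))
```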
Optionally, in the embodiments of the present application, the identification information may also be used to identify objects removed from the current image relative to the decoded image.
In a possible implementation, the identification information may include label information of the removed object or position information of the removed object.
Optionally, in the embodiments of the present application, the code stream data may further include image content data of the current image. S320, performing decoding processing on at least part of the code stream data, may include: performing decoding processing on the image content data of the current image in the code stream data.
In a possible implementation, the image content data of the current image includes reference frame data of the current image and residual data between the current image and the reference frame.
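A minimal sketch of this reconstruction, assuming the residual is a simple per-pixel difference with 8-bit clipping; motion compensation, transforms, and entropy coding are omitted, so this only illustrates the reference-plus-residual relationship named above.

```python
# Reconstruct the current image from reference-frame data plus residual data.
# Per-pixel addition with clipping to the 8-bit range is an assumption.

def reconstruct(reference, residual):
    """Both arguments are rows of pixel values; output is clipped to [0, 255]."""
    return [[max(0, min(255, r + d)) for r, d in zip(rrow, drow)]
            for rrow, drow in zip(reference, residual)]

ref = [[100, 200], [50, 250]]
res = [[5, -10], [0, 20]]
cur = reconstruct(ref, res)   # → [[105, 190], [50, 255]]
```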
Optionally, in the embodiments of the present application, S320, performing decoding processing on at least part of the code stream data, may include: decoding the identification information in the code stream data to obtain the current image and the decoded identification information.
Optionally, in the embodiments of the present application, S320, performing decoding processing on at least part of the code stream data, may instead include: discarding the identification information without decoding it.
Optionally, in the embodiments of the present application, the identification information may further include content information, and S320, performing decoding processing on at least part of the code stream data, may include: determining the content of the target object according to the content information in the code stream data.
In a possible implementation, the content information may be label information.
In another possible implementation, the content information may be a numerical value.
Alternatively, in the embodiments of the present application, the image region may be a rectangular region.
In a possible implementation, the image area information may include the coordinates of any one corner of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area.
Alternatively, the image area information may include center point coordinates of the rectangular area, height information of the rectangular area, and width information of the rectangular area.
Alternatively, the image area information may include the coordinates of the upper left corner of the rectangular area and the coordinates of the lower right corner of the rectangular area.
Alternatively, the image area information may include the coordinates of the upper right corner of the rectangular area and the coordinates of the lower left corner of the rectangular area.
Optionally, in the embodiments of the present application, the identification information may be located in the auxiliary enhancement information or the extension data of the current image.
As an example in which the attribute is in units of pixels, the pixel information may be represented by a template (mask). A template value is identified by the binary values 0 and 1: the template value of a pixel belonging to the target object is 1, and the template value of a pixel belonging to the background is 0. Suppose the image area of a target object i is a rectangular area, the image area information of the target object i includes the coordinates of the upper left corner of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area, and the pixel information of the target object i is represented by a template. For a decoding device, the specific content of the identification information of the target object i may then be as follows.
ar_object_top[i]
ar_object_left[i]
ar_object_width[i]
ar_object_height[i]
for( m = 0; m < ar_object_height[i]; m++ )
    for( n = 0; n < ar_object_width[i]; n++ )
        mask[m][n] = mask_value
Wherein ar_object_top[i], ar_object_left[i], ar_object_width[i], and ar_object_height[i] denote the position and size of the target object i: ar_object_top[i] and ar_object_left[i] denote the position of the upper left corner of the target object i, and ar_object_width[i] and ar_object_height[i] denote its width and height. mask[m][n] denotes the template value of the pixel whose coordinates are offset by m vertically and n horizontally relative to the upper left corner of the rectangular area. When the decoded mask_value is 1, mask[m][n] is 1, indicating that the pixel belongs to the target object i; when the decoded mask_value is 0, mask[m][n] is 0, indicating that the pixel belongs to the background.
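The parsing loop above can be transcribed into a runnable sketch. Entropy decoding is abstracted away by modeling the bitstream as an iterator of already-decoded values, which is an assumption for illustration; the field order follows the ar_object_* syntax shown above.

```python
# Transcription of the ar_object_* parsing loop for one target object i.
# `stream` is an iterator of already-entropy-decoded values (an assumption).

def parse_object_mask(stream, i, objects):
    obj = {
        'top':    next(stream),   # ar_object_top[i]
        'left':   next(stream),   # ar_object_left[i]
        'width':  next(stream),   # ar_object_width[i]
        'height': next(stream),   # ar_object_height[i]
    }
    # mask[m][n]: 1 -> pixel belongs to object i, 0 -> background
    obj['mask'] = [[next(stream) for _ in range(obj['width'])]
                   for _ in range(obj['height'])]
    objects[i] = obj
    return obj

data = iter([2, 3, 2, 2, 1, 0, 0, 1])  # top, left, w=2, h=2, then 4 mask bits
obj = parse_object_mask(data, 0, {})
```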
For the decoding apparatus, the identification information of the current picture may refer to that of a decoded picture. The specific content of the identification information received by the decoding apparatus may be as follows.
(Syntax table of the identification information; figure not reproduced.)
Wherein a removed-object flag indicates that an object in the decoded image no longer exists in the current image; a syntax element indicates the number of objects identified in the current image; ar_object_idx[i] denotes the label of the i-th identified object; a new-object flag indicates whether the object with label ar_object_idx[i] is a newly appearing object in the current image; and a bounding-box-change flag indicates whether the position and size of the image area of the object with label ar_object_idx[i] have changed between the current image and the decoded image. When this flag is 0, the position and size of the image area of the object in the current image are inherited from those in the decoded image.
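A hedged sketch of this per-object header logic follows. The syntax in the source is partly garbled, so every field name here (`removed_flag`, `bbox_changed_flag`, `object_idx`, `bbox`) is an illustrative reconstruction, not the exact syntax element.

```python
# Hypothetical per-object header handling: a removal flag, and a
# bounding-box-change flag that, when 0, inherits the region from the
# decoded image. All field names are illustrative.

def parse_object_header(fields, prev_objects):
    idx = fields['object_idx']
    if fields.get('removed_flag'):
        return ('removed', idx, None)
    if fields['bbox_changed_flag']:
        bbox = fields['bbox']                 # decoded from the stream
    else:
        bbox = prev_objects[idx]['bbox']      # inherit from decoded image
    return ('present', idx, bbox)

prev = {7: {'bbox': (10, 20, 64, 48)}}
kept = parse_object_header({'object_idx': 7, 'bbox_changed_flag': 0}, prev)
```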
Optionally, in the embodiments of the present application, the code stream data and/or the identification information may further include an indication bit of a decoded picture, for indicating which decoded picture is currently referred to.
The method of the embodiment of the present application is explained in detail above, and the encoding apparatus and the decoding apparatus of the embodiment of the present application are explained in detail below.
Fig. 4 is a schematic block diagram of an encoding apparatus 400 of the embodiments of the present application. As shown in fig. 4, the encoding apparatus 400 includes:
at least one memory 410 for storing computer-executable instructions;
at least one processor 420, individually or collectively, for accessing the at least one memory 410 and executing the computer-executable instructions to perform the following operation:
performing encoding processing on a current image to generate code stream data, wherein the code stream data includes identification information, the identification information is used for identifying at least one target object in the current image, the identification information includes image area information and pixel information, the image area information includes the position and size of an image area where the target object is located, and the pixel information includes an attribute of at least one pixel in the image area.
The encoding device of the embodiment of the application indicates the position and size of the image area where the target object is located through the image area information, and indicates the attributes of a plurality of pixels in the image area through the pixel information, so that the target object is identified at a finer granularity, which helps a decoding device operate on the target object more efficiently and more accurately.
In some embodiments, the attribute of the at least one pixel includes whether the at least one pixel belongs to the target object.
In some embodiments, the image area comprises a plurality of sub-image areas, and the pixel information comprises values assigned to at least one pixel in the image area, wherein pixels in different sub-image areas are assigned different values.
In some embodiments, in the pixel information, different values are assigned to the at least one pixel to indicate whether the at least one pixel belongs to the target object.
In some embodiments, among the at least one pixel, a first portion of pixels is assigned a first value to indicate that the first portion of pixels does not belong to the target object.
In some embodiments, a second portion of the at least one pixel is assigned a second value to indicate that the second portion of pixels belongs to the target object.
In some embodiments, the attribute of the at least one pixel includes the part of the target object to which the at least one pixel belongs.
In some embodiments, in the pixel information, different pixels are assigned different values to indicate that the different pixels belong to different parts of the target object.
In some embodiments, the target object is a person;
a first portion of the at least one pixel is assigned a third value indicating that the first portion of pixels belongs to the head of the target object;
and/or,
a second portion of the at least one pixel is assigned a fourth value indicating that the second portion of pixels belongs to the hand of the target object.
In some embodiments, the target object is a vehicle;
a first portion of the at least one pixel is assigned a fifth value indicating that the first portion of pixels belongs to the head of the target object;
and/or,
a second portion of the at least one pixel is assigned a sixth value indicating that the second portion of pixels belongs to the tail of the target object.
In some embodiments, the attribute of the at least one pixel includes a descriptive feature corresponding to the at least one pixel.
In some embodiments, the descriptive feature corresponding to the at least one pixel includes at least one of the reflected intensity of the point cloud corresponding to the at least one pixel, the infrared intensity corresponding to the at least one pixel, and the depth value corresponding to the at least one pixel.
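For illustration, per-pixel descriptive features could be represented as records like the following; the `PixelAttr` layout is an assumption for clarity, not part of the described bitstream.

```python
# Sketch: pixel information carrying descriptive features rather than a
# binary mask -- per-pixel reflectance / infrared intensity / depth.
from dataclasses import dataclass

@dataclass
class PixelAttr:
    reflectance: float   # point-cloud reflected intensity at this pixel
    infrared: float      # infrared intensity at this pixel
    depth: float         # depth value at this pixel

region = [[PixelAttr(0.8, 0.1, 3.2), PixelAttr(0.7, 0.2, 3.1)]]
nearest = min(p.depth for row in region for p in row)
```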
In some embodiments, the attributes are in units of pixel blocks, the pixel information includes information on attributes of at least one pixel block, and a pixel block includes at least two pixels.
In some embodiments, the target object is at least one object that meets at least one of the following conditions:
an identification object newly added in the current image relative to the encoded image;
an identification object whose position in the current image has changed relative to the encoded image;
an identification object whose size in the current image has changed relative to the encoded image;
an identification object whose pixel information in the image area has changed in the current image relative to the encoded image.
In some embodiments, the code stream data further includes category identification bits for indicating at least one of the following cases:
the target object is a newly added identification object of the current image relative to the coded image;
the target object is a mark object of which the position of the current image is changed relative to the coded image;
the target object is a mark object of which the size of the current image is changed relative to the size of the coded image;
the target object is an identification object of the current image which changes relative to the pixel information in the image area in the coded image.
In some embodiments, the target object includes an identification object newly added in the current image relative to the encoded image, and the image area information includes an absolute value of the position and an absolute value of the size of the image area where the newly added identification object is located.
In some embodiments, the target object includes an identification object whose position in the current image has changed relative to the encoded image;
the image area information includes an absolute value of the position of the image area where the target object is located, or a relative value of the position change.
In some embodiments, the image area information includes an identification bit indicating that the size of the image area where the target object is located remains unchanged from the encoded image.
In some embodiments, the target object includes an identification object whose size in the current image has changed relative to the encoded image;
the image area information includes an absolute value of the size of the image area where the target object is located, or a relative value of the size change.
In some embodiments, the pixel information includes an identification bit indicating that the pixel information of the image area where the target object is located remains unchanged from the encoded image.
In some embodiments, the target object includes an identification object whose pixel information in the current image has changed relative to the encoded image, and the pixel information includes an absolute value of the pixel information or a relative value of the change in pixel information.
In some embodiments, the pixel information includes an identification bit indicating that the pixel information of the image area where the target object is located has changed relative to the encoded image.
In some embodiments, the image area information includes an identification bit indicating that the size and/or position of the image area where the target object is located remains unchanged from the encoded image.
In some embodiments, the image area is a rectangular area, and the image area information includes the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
the image area information includes an identification bit indicating that the center point coordinates of the image area where the target object is located are unchanged compared with the encoded image.
In some embodiments, the identification information is also used to identify objects removed from the current image relative to the encoded image.
In some embodiments, the identification information includes label information of the removed object or position information of the removed object.
In some embodiments, the processor 420 is further configured to:
determine an object to be identified that is newly added in the current image relative to the encoded image as the target object;
determine an object to be identified whose position and/or size in the current image has changed relative to the encoded image as the target object;
and determine an object to be identified whose pixel information in the image area has changed in the current image relative to the encoded image as the target object.
In some embodiments, the identification information further includes content information indicating the content of the target object.
In some embodiments, the content information is label information.
In some embodiments, the content information is a numerical value.
In some embodiments, the image area is a rectangular area.
In some embodiments, the image area information includes the coordinates of an arbitrary corner of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
alternatively,
the image area information includes the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
alternatively,
the image area information includes the coordinates of the upper left corner of the rectangular area and the coordinates of the lower right corner of the rectangular area;
alternatively,
the image area information includes the coordinates of the upper right corner of the rectangular area and the coordinates of the lower left corner of the rectangular area.
In some embodiments, before performing encoding processing on the current image to generate the code stream data, the processor 420 may be further configured to:
perform image recognition on the current image, determine the target object, and obtain the identification information of the target object.
In some embodiments, the identification information is located in the auxiliary enhancement information or the extension data of the current image.
For example, fig. 5 is a schematic block diagram of an encoding device 500 of the embodiments of the present application. As shown in fig. 5, the encoding device 500 may include an encoding module 510 for performing encoding processing, generating code stream data, and the like.
Fig. 6 is a schematic block diagram of a decoding apparatus 600 of the embodiments of the present application. As shown in fig. 6, the decoding apparatus 600 includes:
at least one memory 610 for storing computer-executable instructions;
at least one processor 620, individually or collectively, for accessing the at least one memory 610 and executing the computer-executable instructions to perform the following operations:
acquiring code stream data of a current image, wherein the code stream data comprises identification information, the identification information is used for identifying at least one target object in the current image, the identification information comprises image area information and pixel information, the image area information comprises the position and size of an image area where the target object is located, and the pixel information comprises an attribute of at least one pixel in the image area;
and performing decoding processing on at least part of the code stream data.
The decoding device of the embodiment of the application indicates the position and size of the image area where the target object is located through the image area information, and indicates the attributes of a plurality of pixels in the image area through the pixel information, so that the target object is identified at a finer granularity, which helps the decoding device operate on the target object more efficiently and more accurately.
In some embodiments, the attribute of the at least one pixel includes whether the at least one pixel belongs to the target object.
In some embodiments, the image area comprises a plurality of sub-image areas, and the pixel information comprises values assigned to at least one pixel in the image area, wherein pixels in different sub-image areas are assigned different values.
In some embodiments, in the pixel information, different values are assigned to the at least one pixel, and the processor 620, performing decoding processing on at least part of the code stream data, includes:
determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data.
In some embodiments, the processor 620, determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data, may include:
when a first portion of pixels in the pixel information in the code stream data corresponds to the first value, determining that the first portion of pixels does not belong to the target object.
In some embodiments, the processor 620, determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data, may include:
when a second portion of pixels in the pixel information in the code stream data corresponds to the second value, determining that the second portion of pixels belongs to the target object.
In some embodiments, the attribute of the at least one pixel includes the part of the target object to which the at least one pixel belongs.
In some embodiments, in the pixel information, different pixels are assigned different values,
and the processor 620, performing decoding processing on at least part of the code stream data, includes:
determining the parts of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data.
In some embodiments, the target object is a person;
the processor 620, determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data, includes:
when a first portion of pixels in the pixel information in the code stream data corresponds to the third value, determining that the first portion of pixels belongs to the head of the target object;
and/or,
the processor 620, determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data, includes:
when a second portion of pixels in the pixel information in the code stream data corresponds to the fourth value, determining that the second portion of pixels belongs to the hand of the target object.
In some embodiments, the target object is a vehicle;
the processor 620, determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data, includes:
when a first portion of pixels in the pixel information in the code stream data corresponds to the fifth value, determining that the first portion of pixels belongs to the head of the target object;
and/or,
a second portion of the at least one pixel is assigned the sixth value, and the processor 620, determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data, includes:
when the second portion of pixels in the pixel information in the code stream data corresponds to the sixth value, determining that the second portion of pixels belongs to the tail of the target object.
In some embodiments, the attribute of the at least one pixel includes a descriptive feature corresponding to the at least one pixel.
In some embodiments, the descriptive feature corresponding to the at least one pixel includes at least one of the reflected intensity of the point cloud corresponding to the at least one pixel, the infrared intensity corresponding to the at least one pixel, and the depth value corresponding to the at least one pixel.
In some embodiments, the attributes are in units of pixel blocks, the pixel information includes information on attributes of at least one pixel block, and a pixel block includes at least two pixels.
In some embodiments, the code stream data includes a category identification bit, and the processor 620 is further configured to:
determine, according to the category identification bit, that the target object is at least one object that meets at least one of the following conditions:
an identification object newly added in the current image relative to the decoded image;
an identification object whose position in the current image has changed relative to the decoded image;
an identification object whose size in the current image has changed relative to the decoded image;
and an identification object whose pixel information in the image area has changed in the current image relative to the decoded image.
In some embodiments, the target object includes an identification object newly added in the current image relative to the decoded image, and the image area information includes an absolute value of the position and an absolute value of the size of the image area where the target object is located.
In some embodiments, the target object includes an identification object whose position in the current image has changed relative to the decoded image;
the image area information includes an absolute value of the position of the image area where the target object is located,
alternatively,
the image area information includes a relative value of the position change of the image area where the target object is located, and the processor 620, performing decoding processing on at least part of the code stream data, may include:
determining the position of the image area of the target object in the current image according to the position of the image area of the target object in the decoded image and the relative value of the position change of the image area.
In some embodiments, the image area information includes an identification bit indicating that the size of the image area where the target object is located remains unchanged compared with that in the decoded image;
the processor 620, performing decoding processing on at least part of the code stream data, may further include:
determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image.
In some embodiments, the target object includes an identification object whose size in the current image has changed relative to the decoded image;
the image area information includes an absolute value of the size of the image area,
alternatively,
the image area information includes a relative value of the size change of the image area, and the processor 620, performing decoding processing on at least part of the code stream data, includes:
determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image and the relative value of the size change of the image area.
In some embodiments, the pixel information includes an identification bit indicating that the pixel information of the image region where the target object is located remains unchanged from the decoded image;
the decoding processing performed by the processor 620 on at least part of the code stream data further includes:
determining the pixel information of the image area where the target object is located in the current image according to the pixel information of the image area where the target object is located in the decoded image.
In some embodiments, the code stream data includes the pixel information;
the decoding processing performed by the processor 620 on at least part of the code stream data further includes:
decoding the pixel information of the image area where the target object is located in the current image.
In some embodiments, the code stream data further includes an identification bit indicating that the pixel information of the image region where the target object is located has changed from the decoded image.
In some embodiments, the target object includes an identification object whose pixel information in the current image has changed relative to the decoded image;
the pixel information includes an absolute value of the attribute of the at least one pixel;
alternatively,
the pixel information includes a relative value of the attribute change of the at least one pixel, and the decoding processing performed by the processor 620 on at least part of the code stream data includes:
determining the pixel information of the target object in the current image according to the pixel information of the target object in the decoded image and the relative value of the attribute change of the at least one pixel.
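The pixel-information update in this embodiment can be sketched in the same spirit. Again, this is an illustrative sketch rather than the patented method, and `update_pixel_info` with its per-pixel attribute list is a hypothetical representation of the signalled data:

```python
def update_pixel_info(prev_attrs, deltas=None):
    """Reconstruct per-pixel attribute values for the current image.

    prev_attrs: attribute values of the target object's pixels in the
                decoded image (hypothetical flat list representation).
    deltas:     relative values of the attribute change; None models the
                identification-bit case where the pixel information is
                kept unchanged from the decoded image.
    """
    if deltas is None:
        # Identification bit set: reuse the decoded image's pixel information.
        return list(prev_attrs)
    # Relative values signalled: add the per-pixel delta to each attribute.
    return [a + d for a, d in zip(prev_attrs, deltas)]
```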
In some embodiments, the image region information further includes an identification bit indicating that the image region where the target object is located in the current image is unchanged compared with the decoded image;
the decoding processing performed by the processor 620 on at least part of the code stream data includes:
determining the image area information of the target object in the current image according to the image area information of the target object in the decoded image.
In some embodiments, the image area is a rectangular area, and the image area information includes the coordinates of the center point of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
the image area information further includes an identification bit indicating that the coordinates of the center point of the image area where the target object is located remain unchanged;
the decoding processing performed by the processor 620 on at least part of the code stream data includes:
determining the coordinates of the center point of the image area where the target object is located in the current image according to the coordinates of the center point of the image area where the target object is located in the decoded image.
In some embodiments, the identification information is also used to identify an object that has been removed from the current image relative to the decoded image.
In some embodiments, the identification information includes label information of the removed object or position information of the removed object in the decoded image.
In some embodiments, the decoding processing performed by the processor 620 on at least part of the code stream data includes:
decoding the identification information in the code stream data to obtain the current image and the decoded identification information.
In some embodiments, the decoding processing performed by the processor 620 on at least part of the code stream data includes:
discarding the identification information without decoding it.
In some embodiments, the code stream data further includes image content data of the current image;
the decoding processing performed by the processor 620 on at least part of the code stream data includes:
decoding the image content data of the current image in the code stream data.
In some embodiments, the image content data of the current image includes reference frame data of the current image and residual data between the current image and the reference frame.
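A minimal sketch of what reconstructing from reference frame data plus residual data looks like at the sample level. This is a toy illustration under simplifying assumptions (flat sample lists, no motion compensation or transform); the function name `reconstruct` is hypothetical:

```python
def reconstruct(reference, residual, bit_depth=8):
    """Add per-sample residuals to reference samples, clipped to the
    valid sample range for the given bit depth (e.g. 0..255 for 8-bit)."""
    max_val = (1 << bit_depth) - 1
    return [min(max(r + d, 0), max_val) for r, d in zip(reference, residual)]
```

For example, with 8-bit samples, `reconstruct([100, 250, 3], [10, 10, -5])` clips the second and third samples into range and returns [110, 255, 0].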
In some embodiments, the identification information further includes content information, and
the decoding processing performed by the processor 620 on at least part of the code stream data includes:
determining the content of the target object according to the content information in the code stream data.
In some embodiments, the content information is label information.
In some embodiments, the content information is a numerical value.
In some embodiments, the image region is a rectangular region.
In some embodiments, the image region information includes the coordinates of an arbitrary corner of the rectangular region, the height information of the rectangular region, and the width information of the rectangular region;
alternatively,
the image area information includes the coordinates of the center point of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
alternatively,
the image area information includes the coordinates of the upper left corner of the rectangular area and the coordinates of the lower right corner of the rectangular area;
alternatively,
the image area information includes the coordinates of the upper right corner of the rectangular area and the coordinates of the lower left corner of the rectangular area.
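The alternative rectangular-region signallings above all describe the same rectangle and are interconvertible; a decoder can normalize whichever form is received. A brief illustrative sketch (hypothetical helper names, not part of the disclosure):

```python
def corners_from_center(cx, cy, w, h):
    """Center point + width/height -> (top-left, bottom-right) corners."""
    return (cx - w / 2, cy - h / 2), (cx + w / 2, cy + h / 2)

def center_from_corners(top_left, bottom_right):
    """(top-left, bottom-right) corners -> (center x, center y, width, height)."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    return ((x0 + x1) / 2, (y0 + y1) / 2, x1 - x0, y1 - y0)
```

For example, a center-point signalling of (50, 40) with width 20 and height 10 corresponds to a top-left corner of (40, 35) and a bottom-right corner of (60, 45).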
In some embodiments, the identification information is located in the supplemental enhancement information or extension data of the current image.
For example, fig. 7 is a schematic block diagram of a decoding device 700 according to some embodiments of the present application. As shown in fig. 7, the decoding device 700 may include an obtaining module 710 configured to obtain code stream data of a current image, and a decoding module 720 configured to perform decoding processing on at least part of the code stream data.
It should be understood that the processor referred to in the embodiments of the present application may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
It should also be appreciated that the memory referred to in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
It should be noted that when the processor is a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, the memory (storage module) may be integrated in the processor.
It should be noted that the memory described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The present application also provides a computer-readable storage medium having instructions stored thereon which, when executed on a computer, cause the computer to perform the methods of the above method embodiments.
The embodiments of the present application further provide a computer program that enables a computer to execute the methods of the above method embodiments.
Embodiments of the present application further provide a computing device that includes the computer-readable storage medium described above.
The embodiments of the present application can be applied to aircraft, especially in the field of unmanned aerial vehicles (UAVs).
It should be understood that the division of circuits, sub-circuits, and sub-units in the various embodiments of the present application is illustrative only. Those of ordinary skill in the art will appreciate that the various illustrative circuits, sub-circuits, and sub-units described in connection with the embodiments disclosed herein can be split or combined.
The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more usable media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), a semiconductor medium (e.g., a solid state disk (SSD)), or any combination thereof.
It should be understood that although the embodiments of the present application are described with respect to a total bit width of 16 bits, the embodiments of the present application may also be applied to other bit widths.
It should be appreciated that reference throughout this specification to "one embodiment" or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present application.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
It should be understood that in the embodiments of the present application, "B corresponding to A" means that B is associated with A, and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
For example, the above-described apparatus embodiments are merely illustrative. The division into units is only a logical functional division and may be realized in other ways in practice; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between units or devices may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The above description is only of specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any changes or substitutions that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (140)

  1. An encoding method, comprising:
    encoding a current image to generate code stream data, wherein the code stream data includes identification information, the identification information is used for identifying at least one target object in the current image, the identification information includes image area information and pixel information, the image area information includes the position and size of an image area where the target object is located, and the pixel information includes an attribute of at least one pixel in the image area.
  2. The encoding method of claim 1, wherein the attribute of the at least one pixel comprises whether the at least one pixel belongs to the target object.
  3. The encoding method according to claim 1, wherein the image area comprises a plurality of sub-image areas, and the pixel information comprises values assigned to at least one pixel in the image area;
    wherein pixels in different sub-image regions are assigned different values.
  4. The encoding method according to claim 3, wherein the at least one pixel in the pixel information is assigned different values for indicating whether the at least one pixel belongs to the target object.
  5. The encoding method according to claim 4, wherein a first part of the at least one pixel is assigned a first value to indicate that the first part of pixels does not belong to the target object.
  6. The encoding method according to claim 4 or 5, wherein a second part of the at least one pixel is assigned a second value to indicate that the second part of pixels belongs to the target object.
  7. The encoding method of claim 1, wherein the attribute of the at least one pixel includes a part of the target object to which the at least one pixel belongs.
  8. The encoding method according to claim 7, wherein different pixels in the pixel information are assigned different values for indicating that the different pixels belong to different parts of the target object.
  9. The encoding method according to claim 8, wherein the target object is a person;
    a first part of the at least one pixel is assigned a third value indicating that the first part of pixels belongs to the head of the target object;
    and/or,
    a second part of the at least one pixel is assigned a fourth value indicating that the second part of pixels belongs to the hand of the target object.
  10. The encoding method according to claim 8, wherein the target object is a vehicle;
    a first part of the at least one pixel is assigned a fifth value indicating that the first part of pixels belongs to the head of the target object;
    and/or,
    a second part of the at least one pixel is assigned a sixth value indicating that the second part of pixels belongs to the tail of the target object.
  11. The encoding method as claimed in claim 1, wherein the attribute of the at least one pixel comprises a descriptive feature corresponding to the at least one pixel.
  12. The encoding method as claimed in claim 11, wherein the descriptive feature corresponding to the at least one pixel comprises at least one of: a reflection intensity of a point cloud corresponding to the at least one pixel, an infrared intensity corresponding to the at least one pixel, and a depth value corresponding to the at least one pixel.
  13. The encoding method of , wherein the attribute is measured in units of pixel blocks, and the pixel information comprises information on the attribute of at least one pixel block, the pixel block comprising at least two pixels.
  14. The encoding method of any one of claims 1 to 13, wherein the target object is at least one of the following objects:
    an identification object newly added in the current image relative to the encoded image;
    an identification object whose position in the current image has changed relative to the encoded image;
    an identification object whose size in the current image has changed relative to the encoded image;
    an identification object whose pixel information in the image region has changed in the current image relative to the encoded image.
  15. The encoding method according to claim 14, wherein the code stream data further includes a category identification bit for indicating at least one of the following cases:
    the target object is an identification object newly added in the current image relative to the encoded image;
    the target object is an identification object whose position in the current image has changed relative to the encoded image;
    the target object is an identification object whose size in the current image has changed relative to the encoded image;
    the target object is an identification object whose pixel information in the image area has changed in the current image relative to the encoded image.
  16. The encoding method according to claim 14, wherein the target object includes an identification object newly added in the current image relative to the encoded image, and the image region information includes an absolute value of the position and an absolute value of the size of the image region where the newly added identification object is located.
  17. The encoding method according to claim 14, wherein the target object includes an identification object whose position in the current image has changed relative to the encoded image;
    the image area information includes an absolute value of the position of the image area where the target object is located or a relative value of the position change.
  18. The encoding method according to claim 14 or 17, wherein the image region information includes an identification bit indicating that the size of the image region where the target object is located is unchanged from the encoded image.
  19. The encoding method according to claim 14 or 17, wherein the target object includes an identification object whose size in the current image has changed relative to the encoded image;
    the image area information includes an absolute value of the size of the image area where the target object is located or a relative value of the size change.
  20. The encoding method according to claim 17 or 19, wherein the pixel information includes an identification bit indicating that the pixel information of the image region where the target object is located remains unchanged from the encoded image.
  21. The encoding method of any one of claims 14 to 19, wherein the target object comprises an identification object whose pixel information in the current image has changed relative to the encoded image, and wherein the pixel information comprises an absolute value of the pixel information or a relative value of the change in the pixel information.
  22. The encoding method according to claim 21, wherein the pixel information includes an identification bit indicating that the pixel information of the image region where the target object is located has changed from the encoded image.
  23. The encoding method according to claim 21, wherein the image region information includes an identification bit indicating that the size and/or position of the image region where the target object is located is unchanged from the encoded image.
  24. The encoding method according to claim 19 or 21, wherein the image region is a rectangular region, and the image region information includes the coordinates of the center point of the rectangular region, the height information of the rectangular region, and the width information of the rectangular region;
    the image area information includes an identification bit indicating that the coordinates of the center point of the image area where the target object is located are unchanged compared with the encoded image.
  25. The encoding method of any one of claims 1 to 24, wherein
    the identification information is also used to identify an object that has been removed from the current image relative to the encoded image.
  26. The encoding method of claim 25, wherein
    the identification information includes label information of the removed object or position information of the removed object.
  27. The encoding method of any one of claims 1 to 26, further comprising at least one of the following steps:
    determining an object to be identified that is newly added in the current image relative to the encoded image as the target object;
    determining an object to be identified whose position and/or size in the current image has changed relative to the encoded image as the target object;
    determining an object to be identified whose pixel information in the image area has changed in the current image relative to the encoded image as the target object.
  28. The encoding method according to any one of claims 1 to 27, wherein the identification information further includes content information indicating the content of the target object.
  29. The encoding method according to claim 28, wherein the content information is label information.
  30. The encoding method of claim 28, wherein the content information is a numerical value.
  31. The encoding method of any one of claims 1 to 30, wherein the image region is a rectangular region.
  32. The encoding method according to claim 31, wherein the image area information includes the coordinates of an arbitrary corner of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    alternatively,
    the image area information includes the coordinates of the center point of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    alternatively,
    the image area information includes the coordinates of the upper left corner of the rectangular area and the coordinates of the lower right corner of the rectangular area;
    alternatively,
    the image area information includes the coordinates of the upper right corner of the rectangular area and the coordinates of the lower left corner of the rectangular area.
  33. The encoding method according to any one of claims 1 to 32, wherein before encoding the current image to generate the code stream data, the encoding method further comprises:
    performing image recognition on the current image, determining the target object, and obtaining the identification information of the target object.
  34. The encoding method according to any one of claims 1 to 33, wherein the identification information is located in supplemental enhancement information or extension data of the current image.
  35. A decoding method, comprising:
    acquiring code stream data of a current image, wherein the code stream data includes identification information, the identification information is used for identifying at least one target object in the current image, the identification information includes image area information and pixel information, the image area information includes the position and size of an image area where the target object is located, and the pixel information includes an attribute of at least one pixel in the image area;
    and performing decoding processing on at least part of the code stream data.
  36. The decoding method of claim 35, wherein the attribute of the at least one pixel includes whether the at least one pixel belongs to the target object.
  37. The decoding method according to claim 35, wherein the image area comprises a plurality of sub-image areas, and the pixel information comprises values assigned to at least one pixel in the image area;
    wherein pixels in different sub-image regions are assigned different values.
  38. The decoding method as claimed in claim 37, wherein the at least one pixel in the pixel information is assigned different values, and
    the decoding processing of at least part of the code stream data includes:
    determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data.
  39. The decoding method according to claim 38, wherein the determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data comprises:
    when a first part of pixels in the pixel information in the code stream data corresponds to a first value, determining that the first part of pixels does not belong to the target object.
  40. The decoding method according to claim 38 or 39, wherein
    the determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data includes:
    when a second part of pixels in the pixel information in the code stream data corresponds to a second value, determining that the second part of pixels belongs to the target object.
  41. The decoding method of claim 35, wherein the attribute of the at least one pixel includes a part of the target object to which the at least one pixel belongs.
  42. The decoding method as claimed in claim 41, wherein different pixels in the pixel information are assigned different values, and
    the decoding processing of at least part of the code stream data includes:
    determining the parts of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data.
  43. The decoding method according to claim 42, wherein the target object is a person;
    the determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data includes:
    when a first part of pixels in the pixel information in the code stream data corresponds to a third value, determining that the first part of pixels belongs to the head of the target object;
    and/or,
    the determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data includes:
    when a second part of pixels in the pixel information in the code stream data corresponds to a fourth value, determining that the second part of pixels belongs to the hand of the target object.
  44. The decoding method according to claim 42, wherein the target object is a vehicle;
    the determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data includes:
    when a first part of pixels in the pixel information in the code stream data corresponds to a fifth value, determining that the first part of pixels belongs to the head of the target object;
    and/or,
    the determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data includes:
    when a second part of pixels in the pixel information in the code stream data corresponds to a sixth value, determining that the second part of pixels belongs to the tail of the target object.
  45. The decoding method of claim 35, wherein the attribute of the at least one pixel includes a descriptive feature corresponding to the at least one pixel.
  46. The decoding method of claim 45, wherein the descriptive feature corresponding to the at least one pixel includes at least one of: a reflection intensity of a point cloud corresponding to the at least one pixel, an infrared intensity corresponding to the at least one pixel, and a depth value corresponding to the at least one pixel.
  47. The decoding method according to any one of claims 35 to 46, wherein the attribute is measured in units of pixel blocks, and the pixel information comprises information on the attribute of at least one pixel block, the pixel block comprising at least two pixels.
  48. The decoding method according to any one of claims 35 to 47, wherein the code stream data includes a category identification bit, and the decoding method further comprises:
    determining, according to the category identification bit, that the target object is at least one of the following objects:
    an identification object newly added in the current image relative to the decoded image;
    an identification object whose position in the current image has changed relative to the decoded image;
    an identification object whose size in the current image has changed relative to the decoded image;
    an identification object whose pixel information in the image area has changed in the current image relative to the decoded image.
  49. The decoding method according to claim 48, wherein the target object comprises an identification object newly added in the current image relative to the decoded image, and the image region information comprises an absolute value of the position and an absolute value of the size of the image region where the target object is located.
  50. The decoding method according to claim 48, wherein the target object includes an identification object whose position in the current image has changed relative to the decoded image;
    the image area information includes an absolute value of the position of the image area where the target object is located,
    alternatively,
    the image area information includes a relative value of the position change of the image area where the target object is located, and the decoding processing of at least part of the code stream data includes:
    determining the position of the image area where the target object is located in the current image according to the position of the image area where the target object is located in the decoded image and the relative value of the position change of the image area.
  51. The decoding method according to claim 50, wherein the image region information includes an identification bit indicating that the size of the image region where the target object is located remains unchanged compared with the decoded image;
    the decoding processing of at least part of the code stream data further comprises:
    determining the size of the image area where the target object is located in the current image according to the size of the image area where the target object is located in the decoded image.
  52. 52. The decoding method according to claim 48 or 50, wherein the target object comprises a marker object in which the size of the current image changes from the size of the decoded image;
    the image area information comprises an absolute value of a size of the image area,
    alternatively, the first and second electrodes may be,
    the image area information includes a relative value of a size change of the image area, and the decoding processing of at least part of the code stream data includes:
    and determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image and the relative value of the size change of the image area.
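Claims 50 to 52 describe deriving the image area of the target object in the current image from its area in the decoded image plus signalled relative changes, optionally gated by a size-unchanged identification bit. A minimal Python sketch of that update; all names, the flag semantics, and the coordinate convention are illustrative assumptions, not anything defined in the claims:

```python
from dataclasses import dataclass

@dataclass
class Region:
    x: int  # position of the image area (e.g. top-left or center x)
    y: int
    w: int  # size of the image area
    h: int

def update_region(prev, dx=0, dy=0, dw=0, dh=0, size_unchanged=False):
    """Derive the region of the target object in the current image from its
    region in the decoded image plus relative position/size changes."""
    w, h = (prev.w, prev.h) if size_unchanged else (prev.w + dw, prev.h + dh)
    return Region(prev.x + dx, prev.y + dy, w, h)

prev = Region(x=100, y=80, w=40, h=30)
cur = update_region(prev, dx=5, dy=-2, size_unchanged=True)  # Region(105, 78, 40, 30)
```

Signalling only the relative change (rather than absolute values, the other branch of the claims) keeps the identification information small when objects move little between frames.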
  53. The decoding method according to claim 50 or 52, wherein the pixel information comprises an identification bit for indicating that the pixel information of the image area in which the target object is located remains unchanged compared with the decoded image;
    the decoding of at least part of the code stream data further comprises:
    determining the pixel information of the image area of the target object in the current image according to the pixel information of the image area of the target object in the decoded image.
  54. The decoding method according to claim 50 or 52, wherein the code stream data comprises the pixel information;
    the decoding of at least part of the code stream data further comprises:
    decoding the pixel information of the image area in which the target object is located in the current image.
  55. The decoding method according to claim 54, wherein the code stream data further comprises an identification bit for indicating that the pixel information of the image area in which the target object is located has changed compared with the decoded image.
  56. The decoding method according to any one of claims 48 to 51, wherein the target object comprises an identification object whose pixel information in the current image has changed relative to the decoded image;
    the pixel information comprises an absolute value of an attribute of the at least one pixel;
    alternatively,
    the pixel information comprises a relative value of an attribute change of the at least one pixel, and the decoding of at least part of the code stream data comprises:
    determining the pixel information of the target object in the current image according to the pixel information of the target object in the decoded image and the relative value of the attribute change of the at least one pixel.
  57. The decoding method according to claim 56, wherein the image area information further comprises an identification bit for indicating that the image area in which the target object is located in the current image is unchanged compared with the decoded image;
    the decoding of at least part of the code stream data comprises:
    determining the image area information of the target object in the current image according to the image area information of the target object in the decoded image.
  58. The decoding method according to claim 56, wherein the image area is a rectangular area, and the image area information comprises the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    the image area information further comprises an identification bit for indicating that the center point coordinates of the image area in which the target object is located remain unchanged;
    the decoding of at least part of the code stream data comprises:
    determining the center point coordinates of the image area of the target object in the current image according to the center point coordinates of the image area in which the target object is located in the decoded image.
  59. The decoding method according to any one of claims 35 to 58, wherein
    the identification information is also used to identify an object removed from the current image relative to the decoded image.
  60. The decoding method according to claim 59, wherein
    the identification information comprises label information of the removed object or position information of the removed object in the decoded image.
  61. The decoding method according to any one of claims 35 to 60, wherein the decoding of at least part of the code stream data comprises:
    decoding the identification information in the code stream data to obtain the decoded identification information of the current image.
  62. The decoding method according to any one of claims 35 to 61, wherein the decoding of at least part of the code stream data comprises:
    discarding the identification information without decoding the identification information.
  63. The decoding method according to claim 62, wherein the code stream data further comprises image content data of the current image;
    the decoding of at least part of the code stream data comprises:
    decoding the image content data of the current image in the code stream data.
  64. The decoding method according to claim 63, wherein the image content data of the current image comprises reference frame data of the current image and residual data between the current image and the reference frame.
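As a concrete illustration of claim 64, a decoder can rebuild the current image by adding the residual data to the reference frame samples. A minimal sketch under assumed 8-bit samples stored as nested lists; the clipping range is an assumption, not stated in the claims:

```python
def reconstruct(reference, residual):
    """Rebuild current-image samples as reference + residual, clipped to 8 bits."""
    return [[max(0, min(255, r + d)) for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)]

ref = [[100, 120], [90, 200]]
res = [[5, -10], [0, 70]]
cur = reconstruct(ref, res)  # [[105, 110], [90, 255]]
```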
  65. The decoding method according to any one of claims 35 to 64, wherein the identification information further comprises content information,
    and the decoding of at least part of the code stream data comprises:
    determining the content of the target object according to the content information in the code stream data.
  66. The decoding method according to claim 65, wherein the content information is label information.
  67. The decoding method according to claim 65, wherein the content information is a numerical value.
  68. The decoding method according to any one of claims 35 to 67, wherein the image area is a rectangular area.
  69. The decoding method according to claim 68, wherein the image area information comprises the coordinates of an arbitrary corner of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    alternatively,
    the image area information comprises the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    alternatively,
    the image area information comprises the upper left corner coordinates of the rectangular area and the lower right corner coordinates of the rectangular area;
    alternatively,
    the image area information comprises the upper right corner coordinates of the rectangular area and the lower left corner coordinates of the rectangular area.
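The four representations listed in claim 69 carry the same rectangle and can be converted into one another. A small Python sketch of two such conversions; the function names are illustrative, and the integer center coordinates assume even width and height:

```python
def corners_from_center(cx, cy, w, h):
    """Center point + width/height -> (top-left, bottom-right) corners."""
    return (cx - w // 2, cy - h // 2), (cx + w // 2, cy + h // 2)

def center_from_corners(top_left, bottom_right):
    """Top-left + bottom-right corners -> (cx, cy, width, height)."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    return (x0 + x1) // 2, (y0 + y1) // 2, x1 - x0, y1 - y0

tl, br = corners_from_center(50, 40, 20, 10)  # ((40, 35), (60, 45))
```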
  70. The decoding method according to any one of claims 35 to 69, wherein the identification information is located in the auxiliary enhancement information or extension data of the current image.
  71. An encoding device, comprising:
    at least one memory for storing computer-executable instructions;
    at least one processor, individually or collectively, configured to access the at least one memory and execute the computer-executable instructions to perform the following operations:
    encoding a current image to generate code stream data, wherein the code stream data comprises identification information, the identification information is used for identifying at least one target object in the current image, the identification information comprises image area information and pixel information, the image area information comprises the position and size of an image area in which the target object is located, and the pixel information comprises an attribute of at least one pixel in the image area.
  72. The encoding device according to claim 71, wherein the attribute of the at least one pixel includes whether the at least one pixel belongs to the target object.
  73. The encoding device according to claim 71, wherein the image area comprises a plurality of sub-image areas, and the pixel information comprises a value assigned to the at least one pixel in the image area;
    wherein the pixels in different sub-image areas are assigned different values.
  74. The encoding device according to claim 73, wherein the at least one pixel in the pixel information is assigned different values for indicating whether the at least one pixel belongs to the target object.
  75. The encoding device according to claim 74, wherein a first part of the at least one pixel is assigned a first value to indicate that the first part of pixels does not belong to the target object.
  76. The encoding device according to claim 74 or 75, wherein a second part of the at least one pixel is assigned a second value to indicate that the second part of pixels belongs to the target object.
  77. The encoding device according to claim 71, wherein the attribute of the at least one pixel includes the part of the target object to which the at least one pixel belongs.
  78. The encoding device according to claim 77, wherein different pixels in the pixel information are assigned different values for indicating that the different pixels belong to different parts of the target object.
  79. The encoding device according to claim 78, wherein the target object is a person;
    a first part of the at least one pixel is assigned a third value indicating that the first part of pixels belongs to the head of the target object;
    and/or,
    a second part of the at least one pixel is assigned a fourth value indicating that the second part of pixels belongs to the hand of the target object.
  80. The encoding device according to claim 78, wherein the target object is a vehicle;
    a first part of the at least one pixel is assigned a fifth value for indicating that the first part of pixels belongs to the head of the target object;
    and/or,
    a second part of the at least one pixel is assigned a sixth value indicating that the second part of pixels belongs to the tail of the target object.
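Claims 74 to 80 describe pixel information as per-pixel values that mark membership in the target object and, optionally, the object part. A minimal Python sketch of such a mask; the specific values (0 for background, 3 for head, 4 for hand) are illustrative assumptions, since the claims only require that different parts receive different values:

```python
BACKGROUND, HEAD, HAND = 0, 3, 4  # illustrative values only

def make_mask(width, height, parts):
    """parts: list of (value, x0, y0, x1, y1) sub-image areas.
    Returns a height x width mask whose entries carry the part value."""
    mask = [[BACKGROUND] * width for _ in range(height)]
    for value, x0, y0, x1, y1 in parts:
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = value
    return mask

mask = make_mask(8, 6, [(HEAD, 2, 0, 6, 2), (HAND, 0, 3, 2, 5)])
# mask[1][3] == HEAD, mask[4][1] == HAND, mask[0][0] == BACKGROUND
```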
  81. The encoding device according to claim 71, wherein the attribute of the at least one pixel includes a descriptive characteristic corresponding to the at least one pixel.
  82. The encoding device according to claim 81, wherein the descriptive characteristic corresponding to the at least one pixel includes at least one of: a point cloud reflection intensity corresponding to the at least one pixel, an infrared intensity corresponding to the at least one pixel, and a depth value corresponding to the at least one pixel.
  83. The encoding device according to any one of claims 71 to 82, wherein the attribute is in units of pixel blocks, and the pixel information includes information of an attribute of at least one pixel block, the pixel block including at least two pixels.
  84. The encoding device according to any one of claims 71 to 83, wherein the target object is at least one of the following objects:
    an identification object newly added to the current image relative to the encoded image;
    an identification object whose position in the current image has changed relative to the encoded image;
    an identification object whose size in the current image has changed relative to the encoded image;
    an identification object whose pixel information in the image area has changed in the current image relative to the encoded image.
  85. The encoding device according to claim 84, wherein the code stream data further comprises a category identification bit for indicating at least one of the following cases:
    the target object is an identification object newly added to the current image relative to the encoded image;
    the target object is an identification object whose position in the current image has changed relative to the encoded image;
    the target object is an identification object whose size in the current image has changed relative to the encoded image;
    the target object is an identification object whose pixel information in the image area has changed in the current image relative to the encoded image.
  86. The encoding device according to claim 84, wherein the target object comprises an identification object newly added to the current image relative to the encoded image, and the image area information comprises an absolute value of a position and an absolute value of a size of the image area in which the newly added identification object is located.
  87. The encoding device according to claim 84, wherein the target object comprises an identification object whose position in the current image has changed relative to the encoded image;
    the image area information comprises an absolute value of a position or a relative value of a position change of the image area in which the target object is located.
  88. The encoding device according to claim 84 or 87, wherein the image area information comprises an identification bit for indicating that the size of the image area in which the target object is located is unchanged compared with the encoded image.
  89. The encoding device according to claim 84 or 87, wherein the target object comprises an identification object whose size in the current image has changed relative to the encoded image;
    the image area information comprises an absolute value of a size or a relative value of a size change of the image area in which the target object is located.
  90. The encoding device according to claim 87 or 89, wherein the pixel information comprises an identification bit for indicating that the pixel information of the image area in which the target object is located remains unchanged compared with the encoded image.
  91. The encoding device according to any one of claims 84 to 89, wherein the target object comprises an identification object whose pixel information in the current image has changed relative to the encoded image, and the pixel information comprises an absolute value of the pixel information or a relative value of the change in the pixel information.
  92. The encoding device according to claim 91, wherein the pixel information comprises an identification bit for indicating that the pixel information of the image area in which the target object is located has changed compared with the encoded image.
  93. The encoding device according to claim 91, wherein the image area information comprises an identification bit indicating that the size and/or position of the image area in which the target object is located remains unchanged compared with the encoded image.
  94. The encoding device according to claim 89 or 91, wherein the image area is a rectangular area, and the image area information comprises the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    the image area information comprises an identification bit for indicating that the center point coordinates of the image area in which the target object is located are unchanged compared with the encoded image.
  95. The encoding device according to any one of claims 71 to 94, wherein
    the identification information is also used to identify an object removed from the current image relative to the encoded image.
  96. The encoding device according to claim 95, wherein
    the identification information comprises label information of the removed object or position information of the removed object.
  97. The encoding device according to any one of claims 71 to 96, wherein the processor is further configured to:
    determine an object to be identified that is newly added to the current image relative to the encoded image as the target object;
    determine an object to be identified whose position and/or size in the current image has changed relative to the encoded image as the target object;
    and determine an object to be identified whose pixel information in the image area has changed in the current image relative to the encoded image as the target object.
  98. The encoding device according to any one of claims 71 to 97, wherein the identification information further comprises content information indicating the content of the target object.
  99. The encoding device according to claim 98, wherein the content information is label information.
  100. The encoding device according to claim 98, wherein the content information is a numerical value.
  101. The encoding device according to any one of claims 71 to 100, wherein the image area is a rectangular area.
  102. The encoding device according to claim 101, wherein the image area information comprises the coordinates of an arbitrary corner of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    alternatively,
    the image area information comprises the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    alternatively,
    the image area information comprises the upper left corner coordinates of the rectangular area and the lower right corner coordinates of the rectangular area;
    alternatively,
    the image area information comprises the upper right corner coordinates of the rectangular area and the lower left corner coordinates of the rectangular area.
  103. The encoding device according to any one of claims 71 to 102, wherein before encoding the current image to generate the code stream data, the processor is further configured to:
    perform image recognition on the current image, determine the target object, and obtain the identification information of the target object.
  104. The encoding device according to any one of claims 71 to 103, wherein the identification information is located in the auxiliary enhancement information or extension data of the current image.
  105. A decoding device, comprising:
    at least one memory for storing computer-executable instructions;
    at least one processor, individually or collectively, configured to access the at least one memory and execute the computer-executable instructions to perform the following operations:
    acquiring code stream data of a current image, wherein the code stream data comprises identification information, the identification information is used for identifying at least one target object in the current image, the identification information comprises image area information and pixel information, the image area information comprises the position and size of an image area in which the target object is located, and the pixel information comprises an attribute of at least one pixel in the image area;
    and decoding at least part of the code stream data.
  106. The decoding device according to claim 105, wherein the attribute of the at least one pixel includes whether the at least one pixel belongs to the target object.
  107. The decoding device according to claim 105, wherein the image area comprises a plurality of sub-image areas, and the pixel information comprises a value assigned to the at least one pixel in the image area;
    wherein the pixels in different sub-image areas are assigned different values.
  108. The decoding device according to claim 107, wherein the at least one pixel is assigned different values in the pixel information,
    and the decoding, by the processor, of at least part of the code stream data comprises:
    determining whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data.
  109. The decoding device according to claim 108, wherein the determining, by the processor, whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data comprises:
    when a first part of pixels in the pixel information in the code stream data corresponds to a first value, determining that the first part of pixels does not belong to the target object.
  110. The decoding device according to claim 108 or 109, wherein
    the determining, by the processor, whether the at least one pixel in the image area belongs to the target object according to the pixel information in the code stream data comprises:
    when a second part of pixels in the pixel information in the code stream data corresponds to a second value, determining that the second part of pixels belongs to the target object.
  111. The decoding device according to claim 105, wherein the attribute of the at least one pixel includes the part of the target object to which the at least one pixel belongs.
  112. The decoding device according to claim 111, wherein different pixels in the pixel information are assigned different values,
    and the decoding, by the processor, of at least part of the code stream data comprises:
    determining the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data.
  113. The decoding device according to claim 112, wherein the target object is a person;
    the determining, by the processor, of the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data comprises:
    when a first part of pixels in the pixel information in the code stream data corresponds to a third value, determining that the first part of pixels belongs to the head of the target object;
    and/or,
    the determining, by the processor, of the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data comprises:
    when a second part of pixels in the pixel information in the code stream data corresponds to a fourth value, determining that the second part of pixels belongs to the hand of the target object.
  114. The decoding device according to claim 112, wherein the target object is a vehicle;
    the determining, by the processor, of the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data comprises:
    when a first part of pixels in the pixel information in the code stream data corresponds to a fifth value, determining that the first part of pixels belongs to the head of the target object;
    and/or,
    a second part of the at least one pixel is assigned a sixth value, and the determining, by the processor, of the part of the target object to which the at least one pixel in the image area belongs according to the pixel information in the code stream data comprises:
    determining that the second part of pixels belongs to the tail of the target object according to the sixth value corresponding to the second part of pixels in the pixel information in the code stream data.
  115. The decoding device according to claim 105, wherein the attribute of the at least one pixel includes a descriptive characteristic corresponding to the at least one pixel.
  116. The decoding device according to claim 115, wherein the descriptive characteristic corresponding to the at least one pixel includes at least one of: a point cloud reflection intensity corresponding to the at least one pixel, an infrared intensity corresponding to the at least one pixel, and a depth value corresponding to the at least one pixel.
  117. The decoding device according to any one of claims 105 to 116, wherein the attribute is in units of pixel blocks, and the pixel information includes information of an attribute of at least one pixel block, the pixel block including at least two pixels.
  118. The decoding device according to any one of claims 105 to 117, wherein the code stream data comprises a category identification bit, and the processor is further configured to:
    determine, according to the category identification bit, that the target object is at least one of the following objects:
    an identification object newly added to the current image relative to the decoded image;
    an identification object whose position in the current image has changed relative to the decoded image;
    an identification object whose size in the current image has changed relative to the decoded image;
    an identification object whose pixel information in the image area has changed in the current image relative to the decoded image.
  119. The decoding device according to claim 118, wherein the target object comprises an identification object newly added to the current image relative to the decoded image, and the image area information comprises an absolute value of a position and an absolute value of a size of the image area in which the target object is located.
  120. The decoding device according to claim 118, wherein the target object comprises an identification object whose position in the current image has changed relative to the decoded image;
    the image area information comprises an absolute value of a position of the image area in which the target object is located,
    alternatively,
    the image area information comprises a relative value of a position change of the image area in which the target object is located, and the decoding, by the processor, of at least part of the code stream data comprises:
    determining the position of the image area of the target object in the current image according to the position of the image area of the target object in the decoded image and the relative value of the position change of the image area.
  121. The decoding device according to claim 120, wherein the image area information comprises an identification bit for indicating that the size of the image area in which the target object is located remains unchanged compared with the decoded image;
    the decoding, by the processor, of at least part of the code stream data further comprises:
    determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image.
  122. The decoding device according to claim 118 or 120, wherein the target object comprises an identification object whose size in the current image has changed relative to the decoded image;
    the image area information comprises an absolute value of a size of the image area,
    alternatively,
    the image area information comprises a relative value of a size change of the image area, and the decoding, by the processor, of at least part of the code stream data comprises:
    determining the size of the image area of the target object in the current image according to the size of the image area of the target object in the decoded image and the relative value of the size change of the image area.
  123. The decoding device according to claim 120 or 122, wherein the pixel information comprises an identification bit for indicating that the pixel information of the image area in which the target object is located remains unchanged compared with the decoded image;
    the decoding, by the processor, of at least part of the code stream data further comprises:
    determining the pixel information of the image area of the target object in the current image according to the pixel information of the image area of the target object in the decoded image.
  124. The decoding device according to claim 120 or 122, wherein the code stream data comprises the pixel information;
    the decoding, by the processor, of at least part of the code stream data further comprises:
    decoding the pixel information of the image area in which the target object is located in the current image.
  125. The decoding device according to claim 124, wherein the code stream data further comprises an identification bit for indicating that the pixel information of the image area in which the target object is located has changed compared with the decoded image.
  126. The decoding device according to any one of claims 118 to 121, wherein the target object comprises an identification object whose pixel information in the current image has changed relative to the decoded image;
    the pixel information comprises an absolute value of an attribute of the at least one pixel;
    alternatively,
    the pixel information comprises a relative value of an attribute change of the at least one pixel, and the decoding, by the processor, of at least part of the code stream data comprises:
    determining the pixel information of the target object in the current image according to the pixel information of the target object in the decoded image and the relative value of the attribute change of the at least one pixel.
  127. The decoding device according to claim 126, wherein the image area information further comprises an identification bit for indicating that the image area in which the target object is located in the current image is unchanged compared with the decoded image;
    the decoding, by the processor, of at least part of the code stream data comprises:
    determining the image area information of the target object in the current image according to the image area information of the target object in the decoded image.
  128. The decoding device according to claim 126, wherein the image area is a rectangular area, and the image area information comprises the center point coordinates of the rectangular area, the height information of the rectangular area, and the width information of the rectangular area;
    the image area information further comprises an identification bit for indicating that the center point coordinates of the image area in which the target object is located remain unchanged;
    the decoding, by the processor, of at least part of the code stream data comprises:
    determining the center point coordinates of the image area of the target object in the current image according to the center point coordinates of the image area in which the target object is located in the decoded image.
  129. The decoding device according to any one of claims 105 to 128, wherein
    the identification information is also used to identify an object removed from the current image relative to the decoded image.
  130. The decoding device according to claim 129, wherein
    the identification information comprises label information of the removed object or position information of the removed object in the decoded image.
  131. The decoding device according to any one of claims 105 to 130, wherein the decoding, by the processor, of at least part of the code stream data comprises:
    decoding the identification information in the code stream data to obtain the decoded identification information of the current image.
  132. The decoding device according to any one of claims 105 to 131, wherein the decoding, by the processor, of at least part of the code stream data comprises:
    discarding the identification information without decoding the identification information.
  133. The decoding device according to claim 132, wherein the code stream data further comprises image content data of the current image;
    the decoding, by the processor, of at least part of the code stream data comprises:
    decoding the image content data of the current image in the code stream data.
  134. The decoding device according to claim 133, wherein the image content data of the current image comprises reference frame data of the current image and residual data between the current image and the reference frame.
  135. The decoding device according to claim , wherein the identification information further includes content information,
    the processor performing decoding processing on at least part of the code stream data comprises:
    determining the content of the target object according to the content information in the code stream data.
  136. The decoding device according to claim 135, wherein the content information is label information.
  137. The decoding device according to claim 135, wherein the content information is a numerical value.
  138. The decoding device according to any of claims 105 to 137, wherein the image region is a rectangular region.
  139. The decoding device according to claim 138, wherein the image region information comprises coordinates of an arbitrary corner of the rectangular region, height information of the rectangular region, and width information of the rectangular region;
    alternatively,
    the image region information comprises the coordinates of the center point of the rectangular region, the height information of the rectangular region, and the width information of the rectangular region;
    alternatively,
    the image region information comprises coordinates of the upper left corner of the rectangular region and coordinates of the lower right corner of the rectangular region;
    alternatively,
    the image region information includes coordinates of the upper right corner of the rectangular region and coordinates of the lower left corner of the rectangular region.
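The four alternative signalling forms in claim 139 describe the same rectangle; a decoder can normalize any of them to one canonical representation. The sketch below illustrates this with a `(x, y, width, height)` tuple whose origin is the upper-left corner; the function names and the top-left canonical form are assumptions for illustration, not the patent's syntax.

```python
def rect_from_corner(x, y, width, height):
    # Arbitrary-corner form, assuming the signalled corner is the upper-left.
    return (x, y, width, height)

def rect_from_center(cx, cy, width, height):
    # Center-point form: shift back by half the dimensions.
    return (cx - width // 2, cy - height // 2, width, height)

def rect_from_tl_br(x0, y0, x1, y1):
    # Upper-left plus lower-right corners.
    return (x0, y0, x1 - x0, y1 - y0)

def rect_from_tr_bl(x_tr, y_tr, x_bl, y_bl):
    # Upper-right plus lower-left corners.
    return (x_bl, y_tr, x_tr - x_bl, y_bl - y_tr)
```

All four calls below describe the same 40x20 region with its upper-left corner at (10, 30), so they return identical tuples.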
  140. The decoding device of any of claims 105 to 139, wherein the identification information is located in supplemental enhancement information (SEI) or extension data of the current image.
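Carrying the identification information in SEI fits the behavior of claims 131 and 132: a decoder may parse the payload or discard it without affecting picture reconstruction. The sketch below assumes H.264/HEVC-style SEI framing (payload type and size each coded as bytes, extended by 0xFF); the function name and the `wanted_type` parameter are illustrative assumptions.

```python
def read_sei_messages(data, wanted_type, parse=True):
    """Walk a buffer of concatenated SEI messages. Return the payloads
    whose type equals `wanted_type`; when parse=False, skip everything,
    mirroring a decoder that discards the identification information."""
    i, found = 0, []
    while i < len(data):
        ptype = 0
        while data[i] == 0xFF:        # payload type, 0xFF-extended
            ptype += 255
            i += 1
        ptype += data[i]; i += 1
        psize = 0
        while data[i] == 0xFF:        # payload size, 0xFF-extended
            psize += 255
            i += 1
        psize += data[i]; i += 1
        if parse and ptype == wanted_type:
            found.append(bytes(data[i:i + psize]))
        i += psize                    # skip (discard) the payload bytes
    return found
```

A decoder that does not understand the identification payload simply advances past it by `psize` bytes, which is what makes the SEI placement backward compatible.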
CN201880037395.9A 2018-06-29 2018-06-29 Encoding method, decoding method, encoding device, and decoding device Pending CN110741635A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/093883 WO2020000473A1 (en) 2018-06-29 2018-06-29 Encoding method, decoding method, encoding device, and decoding device

Publications (1)

Publication Number Publication Date
CN110741635A true CN110741635A (en) 2020-01-31

Family

ID=68984440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880037395.9A Pending CN110741635A (en) 2018-06-29 2018-06-29 Encoding method, decoding method, encoding device, and decoding device

Country Status (2)

Country Link
CN (1) CN110741635A (en)
WO (1) WO2020000473A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111510752A (en) * 2020-06-18 2020-08-07 平安国际智慧城市科技股份有限公司 Data transmission method, device, server and storage medium
CN113613014A (en) * 2021-08-03 2021-11-05 北京爱芯科技有限公司 Image decoding method and device and image coding method and device
WO2022062957A1 (en) * 2020-09-28 2022-03-31 Alibaba Group Holding Limited Supplemental enhancement information message in video coding

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6665340B1 (en) * 1998-08-21 2003-12-16 Nec Corporation Moving picture encoding/decoding system, moving picture encoding/decoding apparatus, moving picture encoding/decoding method, and recording medium
CN103517073A (en) * 2013-07-12 2014-01-15 上海交通大学 Video encoding and decoding method, device and system
CN103813169A (en) * 2014-02-19 2014-05-21 北京大学 Extensible object notation method and device for use in video coder/decoder
CN104168482A (en) * 2014-06-27 2014-11-26 中安消技术有限公司 Method and device for video coding and decoding
CN107517385A (en) * 2016-06-16 2017-12-26 华为技术有限公司 The decoding method and device of a kind of video image

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001258031A (en) * 2000-03-08 2001-09-21 Sony Corp Signal processing method, image coder and image decoder
CN102984524B (en) * 2012-12-18 2015-09-30 浙江大学 A kind of video coding-decoding method based on block layer decomposition


Also Published As

Publication number Publication date
WO2020000473A1 (en) 2020-01-02

Similar Documents

Publication Publication Date Title
US11936910B2 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US20220191532A1 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US11003891B2 (en) Image processing method and apparatus, and electronic device
US11606551B2 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US11790561B2 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
EP3816940A1 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN109783658B (en) Image processing method, device and storage medium
US20210082153A1 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US8923640B1 (en) Coherence groups: region descriptors for low bit rate encoding
CN110741635A (en) Encoding method, decoding method, encoding device, and decoding device
US11694368B2 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US11902576B2 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN110971898A (en) Point cloud coding and decoding method and coder-decoder
CN110720224B (en) Image processing method and device
US20240177357A1 (en) Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN114598883A (en) Point cloud attribute prediction method, encoder, decoder and storage medium
US11146826B2 (en) Image filtering method and apparatus
US20140184739A1 (en) Foreground extraction method for stereo video
CN110519597B (en) HEVC-based encoding method and device, computing equipment and medium
CN113191210A (en) Image processing method, device and equipment
CN117097898A (en) Decoding and encoding method based on point cloud attribute prediction, decoder and encoder
CN113205518B (en) Medical vehicle image information processing method and device
WO2022262546A1 (en) Data processing method and apparatus, electronic device, and storage medium
CN109635813B (en) Steel rail region image segmentation method and device
CN111626919B (en) Image synthesis method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200131