CN113936258A - Image processing method, image processing device, electronic equipment and storage medium
- Publication number: CN113936258A
- Application number: CN202111207140.2A (filed 2021-10-15)
- Authority: CN (China)
- Legal status: Pending
Abstract
The present disclosure provides an image processing method, an image processing apparatus, an electronic device and a storage medium, which relate to the field of artificial intelligence, in particular to the technical fields of computer vision and deep learning, and are applicable to scenarios such as smart cities and intelligent transportation. The image processing method comprises the following steps: in response to detecting a target object in an image to be processed, determining first attribute information of the target object in the image to be processed; determining second attribute information of the target object in a target image; and determining whether to update the target image with the image to be processed based on the first attribute information and the second attribute information.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to the technical fields of computer vision and deep learning, and is applicable to scenarios such as smart cities and intelligent transportation. More particularly, it relates to an image processing method, an image processing apparatus, an electronic device, and a storage medium.
Background
To facilitate identifying an object, it is often necessary to screen a video for optimal frames in which the object can be detected. In the related art, the optimal frame is either screened out manually or chosen as the video frame with the highest definition. These approaches inevitably suffer from strong subjectivity, and the screened optimal frames are often far from ideal.
Disclosure of Invention
Based on this, the present disclosure provides an image processing method, apparatus, electronic device, and storage medium that improve the accuracy of a target image.
One aspect of the present disclosure provides an image processing method, including: in response to detecting a target object in an image to be processed, determining first attribute information of the target object in the image to be processed; determining second attribute information of the target object in the target image; and determining whether to update the target image with the image to be processed based on the first attribute information and the second attribute information.
Another aspect of the present disclosure provides an image processing apparatus including: a first attribute determining module, configured to determine, in response to detecting a target object in an image to be processed, first attribute information of the target object in the image to be processed; a second attribute determining module, configured to determine second attribute information of the target object in a target image; and an update determining module, configured to determine, based on the first attribute information and the second attribute information, whether to update the target image with the image to be processed.
Another aspect of the present disclosure provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor which, when executed, enable the at least one processor to perform the image processing method provided by the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the image processing method provided by the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the image processing method provided by the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic view of an application scenario of an image processing method and apparatus according to an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of an image processing method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an image processing method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an image processing method according to another embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of an image processing method according to another embodiment of the present disclosure;
FIG. 6 is a structural block diagram of an image processing apparatus according to an embodiment of the present disclosure; and
FIG. 7 is a block diagram of an electronic device for implementing the image processing method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The present disclosure provides an image processing method comprising a first attribute determination phase, a second attribute determination phase and an update determination phase. In a first attribute determination phase, in response to detecting a target object in an image to be processed, first attribute information of the target object in the image to be processed is determined. In a second attribute determination phase, second attribute information of the target object in the target image is determined. In the update determination stage, it is determined whether to update the target image with the image to be processed based on the first attribute information and the second attribute information.
An application scenario of the method and apparatus provided by the present disclosure will be described below with reference to fig. 1.
Fig. 1 is an application scene diagram of an image processing method and apparatus according to an embodiment of the present disclosure.
As shown in fig. 1, the application scene 100 includes a video capture device 101, a road, a vehicle 102, and a pedestrian 103.
The video capture device 101 is disposed on the side of a road on which a vehicle 102 and a pedestrian 103 can travel. The video capture device 101 may be used to capture video frames of vehicles 102 and pedestrians 103 traveling on a road.
As shown in fig. 1, the application scenario 100 may further include a communication base station 104 and an electronic device 105. The video capture apparatus 101 may be communicatively coupled to the electronic device 105 through a network service provided by the communication base station 104, for example. In this way, the video capture device 101 may upload captured video frames to the electronic device 105 in real time. The electronic device 105 may identify the target object using the received video frame as an image to be processed.
The electronic device 105 may be, for example, any electronic device with processing capabilities, including but not limited to a smartphone, a tablet, a laptop, a desktop computer, or a server. The electronic device 105 may perform detection of the target object in the image to be processed and, when the target object is detected and its display effect in the image to be processed is good, recognize the target object in the image to be processed to obtain a recognition result. The recognition result may include the identity of the target object, a classification result, and the like.
It should be noted that the image processing method provided by the present disclosure may be executed by the electronic device 105 or other electronic devices communicatively connected to the electronic device 105. Accordingly, the image processing apparatus provided by the present disclosure may be provided in the electronic device 105 or in another electronic device communicatively connected to the electronic device 105.
It should be understood that the number and types of video capture devices, vehicles, communication base stations, and electronic equipment in fig. 1 are merely illustrative. There may be any number and type of video capture devices, vehicles, communication base stations, and electronic equipment, as desired for an implementation.
The image processing method provided by the present disclosure will be described in detail below with reference to fig. 2 to 5.
Fig. 2 is a schematic flowchart of an image processing method according to an embodiment of the present disclosure.
As shown in fig. 2, the image processing method 200 of this embodiment may include operations S210 to S230.
In operation S210, in response to detecting a target object in an image to be processed, first attribute information of the target object in the image to be processed is determined.
According to an embodiment of the present disclosure, the image to be processed may be a video frame uploaded by the video capture device in real time, or any video frame in a previously captured video.
According to an embodiment of the present disclosure, a target detection model may be used to detect the image to be processed, and whether the target object is detected is determined according to the detection result of the target detection model. The target detection model may be constructed, for example, based on the You Only Look Once (YOLO) framework. For example, the target detection model may be a PP-YOLO or PP-YOLOv2 model built on the PaddlePaddle deep-learning platform, another single-stage detection model such as YOLOv3, or a two-stage detection model such as R-CNN (Regions with CNN features) built on a convolutional neural network, which is not limited in this disclosure. When the target detection model outputs the position information and class probability of a bounding box and the class probability is greater than a preset value, it can be determined that the target object is detected in the image to be processed.
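As a minimal sketch of this detection step: the `run_detector` handle, its output format, and the 0.5 threshold below are illustrative assumptions, not the PP-YOLO API or the exact implementation of this method.

```python
# Minimal sketch of thresholding detector output; the detector callable,
# its output format, and the default threshold are illustrative assumptions.
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)

def detect_targets(image,
                   run_detector: Callable,
                   score_threshold: float = 0.5) -> List[Box]:
    """Keep only bounding boxes whose class probability exceeds the preset value."""
    detections = run_detector(image)  # assumed format: [(box, class_prob), ...]
    return [box for box, prob in detections if prob > score_threshold]
```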
According to an embodiment of the present disclosure, a classification model may be employed to determine the first attribute information of the target object in the image to be processed. The classification model may include, for example, a Residual Neural Network (ResNet) series model (e.g., ResNet50), an attention model, and the like, which is not limited in this disclosure. In this embodiment, the image to be processed that includes the target object may be input into the classification model, which outputs an attribute classification result for the target object. The attribute classification result may serve as the first attribute information. For example, the first attribute information may include the color category, style category, occlusion category, orientation category, and the like of the target object.
In an embodiment, the size of the target object in the image to be processed may be determined according to a bounding box output by the target detection model, and the size may be used as the first attribute information of the target object. The size may be expressed by, for example, the area of the target object, or the size ratio of the target object to the image to be processed.
It is to be understood that the above-mentioned first attribute information is merely an example to facilitate understanding of the present disclosure, and the present disclosure does not limit thereto. When the first attribute information includes a plurality of attribute information, a plurality of classification branches may be provided, each classification branch being used to obtain one attribute information.
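The multi-branch arrangement might be organized as in the sketch below; the backbone and branch-head interfaces and the attribute names are illustrative assumptions rather than a specific library's API.

```python
# Illustrative multi-branch attribute classification: one classifier head per
# attribute on top of a shared backbone (e.g. a ResNet50 trunk). The
# `backbone` and `branches` interfaces are stand-ins, not a real API.
from typing import Callable, Dict

def classify_attributes(partial_image,
                        backbone: Callable,
                        branches: Dict[str, Callable]) -> Dict[str, object]:
    """Run one classification branch per attribute (color, occlusion, orientation...)."""
    features = backbone(partial_image)
    return {name: head(features) for name, head in branches.items()}
```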
In operation S220, second attribute information of the target object in the target image is determined.
According to an embodiment of the present disclosure, the target image may be an image that contains the target object and has historically been used to identify it. The target object in the target image is the same object as the target object detected in the image to be processed, and the second attribute information is of the same kind as the first attribute information. For example, the second attribute information may be obtained in advance by a method similar to that used to determine the first attribute information in operation S210.
For example, in a task requiring recognition of a plurality of target objects, the operation S220 may first determine a target image having a target object detected in the image to be processed. The attribute information of the target object in the target image may be stored in the memory in advance. The operation S220 may read the second attribute information from the memory after determining the target image.
For example, if the image to be processed is the first frame image captured by the video capture device, the target image is empty, and the image to be processed in which the target object is detected is used directly as the target image. Thus, when subsequent images from the video capture device are received, the target image is that first frame image. Alternatively, the target image may be the target image most recently updated by the image processing method according to the embodiment of the present disclosure.
In operation S230, it is determined whether to update the target image with the to-be-processed image based on the first attribute information and the second attribute information.
According to an embodiment of the present disclosure, the first attribute information and the second attribute information may each include attribute information representing the display effect of the target object. This embodiment can compare the display effect represented by the first attribute information with that represented by the second attribute information. If the display effect represented by the first attribute information is better than that represented by the second attribute information, the target image is updated with the image to be processed; otherwise, the target image is kept unchanged, and the method returns to detecting the target object in newly received images to be processed.
For example, the first attribute information and the second attribute information may each include attribute information characterizing the size of the target object in the image; if the size ratio of the target object in the image to be processed is larger than that in the target image, the target image is updated. Alternatively, they may each include attribute information representing whether the target object is occluded; if the occluded proportion of the target object in the image to be processed is smaller than that in the target image, the target image is updated. Alternatively, they may each include attribute information characterizing whether the target object is complete; if the completeness of the target object in the image to be processed is greater than that in the target image, the target image is updated.
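A hedged sketch of such a comparison follows; the record layout and the order in which the three criteria break ties are illustrative choices, not the claimed rule.

```python
# Sketch of comparing the display effects represented by the first and second
# attribute information; field names and the tie-breaking order are assumptions.
from dataclasses import dataclass

@dataclass
class DisplayInfo:
    complete: bool     # whether the whole target object is visible
    occlusion: float   # occluded proportion, lower is better
    size_ratio: float  # object size / image size, higher is better

def shows_better(candidate: DisplayInfo, current: DisplayInfo) -> bool:
    """True if the image to be processed displays the object better than the target image."""
    if candidate.complete != current.complete:
        return candidate.complete
    if candidate.occlusion != current.occlusion:
        return candidate.occlusion < current.occlusion
    return candidate.size_ratio > current.size_ratio
```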
The image processing method of this embodiment can continuously receive images captured by the video capture device, process each image in order of receipt, and update the target image iteratively. With this method, the target image is screened based on the attribute information of the target object. Compared with the related-art schemes of manual screening or screening by image definition, this helps improve the accuracy of the screened target image, and thereby the accuracy of downstream applications such as object recognition. For example, the target object may be a vehicle and the downstream application one that identifies the vehicle's license plate number; or the target object may be a pedestrian and the downstream application one that identifies the pedestrian. Such downstream applications can be used in fields such as security and traffic management.
FIG. 3 is a schematic diagram of an image processing method according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, when determining the first attribute information of the target object, a partial image including the target object may first be cropped from the image to be processed, and the first attribute information of the target object then determined based on the partial image. This improves the accuracy of the determined first attribute information and reduces the interference that other objects in the image to be processed would otherwise cause to the attributes of the target object.
For example, the image to be processed may be cropped based on the position of the target object in it. The position of the target object can be determined from the position information of the bounding box output by the target detection model; in particular, the partial image enclosed by the bounding box can be cropped out of the image to be processed.
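With a NumPy image and the center-format boxes assumed in the detection sketch above, this crop might look like the following; the array layout is an assumption.

```python
# Crop the partial image enclosed by a bounding box; assumes an HxWxC NumPy
# array and a (center_x, center_y, width, height) box.
import numpy as np

def crop_partial_image(image: np.ndarray, box) -> np.ndarray:
    cx, cy, w, h = box
    x0, y0 = int(cx - w / 2), int(cy - h / 2)
    x1, y1 = int(cx + w / 2), int(cy + h / 2)
    img_h, img_w = image.shape[:2]
    x0, y0 = max(0, x0), max(0, y0)          # clamp to image bounds
    x1, y1 = min(img_w, x1), min(img_h, y1)
    return image[y0:y1, x0:x1]
```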
According to an embodiment of the present disclosure, the image to be processed may include at least two target objects. In that case, at least two bounding boxes are obtained from the target detection model, and when the image to be processed is cropped, at least two partial images are obtained from the at least two bounding boxes, each partial image including one object. Accordingly, when the first attribute information is determined, the first attribute information of each object should be determined.
For example, as shown in fig. 3, in this embodiment 300 the target object is a vehicle, and the image to be processed 310 includes two objects. Inputting the image to be processed 310 into the target detection model 301 yields two sets of bounding box position information 320. Each set of bounding box position information 320 may include the center coordinates of the bounding box enclosing one object and the width and height of that bounding box.
As shown in fig. 3, the image to be processed may be respectively cropped according to the two sets of bounding box position information, thereby obtaining a partial image 311 including one of the two objects and a partial image 312 including the other of the two objects. Then, the partial image 311 and the partial image 312 are respectively input into the classification model 302, and the first sub-attribute information 331 of one object and the first sub-attribute information 332 of another object can be respectively obtained.
According to an embodiment of the present disclosure, in a case where the target object includes at least two objects, the embodiment may maintain the target image separately for each object. In this way, the recognition accuracy for each object in downstream applications can be improved.
For example, in determining the second attribute information of the target object in the target image, the target image may be first determined, the target image including at least two target sub-images for at least two objects, respectively. Each target sub-image includes an object. In this embodiment, after the target object is detected by the target detection model, the identification information of the detected target object may be determined. A target sub-image for each object is then determined from the identification information. Accordingly, when maintaining the target image, an identifier may be assigned to the target image, where the identifier is identification information of an object to which the target image is directed.
Illustratively, a target tracking model may be employed to determine the identification information of the detected target object. For example, a multi-target tracking algorithm may be used to construct the target tracking model. Multi-target tracking algorithms include, for example, the FairMOT (Fair Multi-Object Tracking) algorithm, which detects and tracks targets simultaneously, and the DeepSORT algorithm, which is used only for tracking; the basic idea of DeepSORT is to perform data association using a motion model and appearance information.
For example, the image to be processed may be input into the target tracking model, which outputs the tracking ID of the target object in the image to be processed; the tracking ID serves as the identification information of the target object. Alternatively, the two partial images cropped in embodiment 300 may each be input into the target tracking model 303, which outputs the tracking ID 341 of one object and the tracking ID 342 of the other. Alternatively, each partial image and the first attribute information of the target object in that partial image may be input into the target tracking model together; in this case, the target tracking model may determine the tracking ID by jointly considering the first attribute information and the partial image.
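A sketch of this ID-assignment step follows; the tracker callable mimics a DeepSORT-style interface and is an assumption, not a real library's API.

```python
# Assign a tracking ID to each cropped target object; `tracker_update` is a
# stand-in for a DeepSORT-style association step, not a real library call.
from typing import Callable, List, Optional

def assign_track_ids(tracker_update: Callable,
                     partial_images: List,
                     attrs: Optional[List] = None) -> List[int]:
    """One tracking ID per partial image; attributes may aid the association."""
    attrs = attrs or [None] * len(partial_images)
    return [tracker_update(crop, attr)
            for crop, attr in zip(partial_images, attrs)]
```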
After the target sub-image for each object is determined from the identification information, the attribute information of each object in its target sub-image can be determined. Accordingly, the first attribute information determined earlier should include the first sub-attribute information of each object, and the finally determined second attribute information correspondingly includes the second sub-attribute information of each object; the at least two pieces of second sub-attribute information of the at least two objects together form the second attribute information.
Since the target image is maintained separately for each object, when determining whether to update the target image, it may be determined whether to update the target sub-image with the image to be processed based on the first sub-attribute information of each object and the second sub-attribute information of each object. The first sub-attribute information and the second sub-attribute information may both include the aforementioned attribute information representing the size ratio of the target object in the image, and the like, which is not limited in this disclosure.
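Maintaining one target sub-image per object might then be organized as in the sketch below; the dictionary layout and return values are illustrative assumptions, and `is_better` can be, for example, the `shows_better` comparison from the earlier sketch.

```python
# One target sub-image per tracking ID; the store layout is an assumption.
def maybe_update_target(store: dict, track_id, partial_image,
                        first_sub_attrs, is_better) -> bool:
    """Update the per-object target sub-image; True if it was (re)placed."""
    entry = store.get(track_id)
    if entry is None:  # first frame in which this object is detected
        store[track_id] = (partial_image, first_sub_attrs)
        return True
    _, second_sub_attrs = entry
    if is_better(first_sub_attrs, second_sub_attrs):
        store[track_id] = (partial_image, first_sub_attrs)
        return True
    return False
```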
Fig. 4 is a schematic diagram of an image processing method according to another embodiment of the present disclosure.
According to an embodiment of the present disclosure, the first attribute information and the second attribute information may include attribute information (for convenience of description, referred to as first information) representing an orientation of the target object, in addition to the aforementioned attribute information (for convenience of description, referred to as second information) representing the presentation effect of the target object. Accordingly, different target images may be maintained for different orientations for the same target object. By the method, the accuracy of the determined target image can be further improved, and the identification accuracy of each object in downstream application is improved.
As shown in fig. 4, in this embodiment 400, for the vehicle in the image 410 to be processed, the maintained target image may include four target sub-images, and the vehicle in the four target sub-images may be oriented forward, backward, left, and right, respectively. In a specific scenario, if the orientation of the vehicle only includes left and right directions in all the collected images including the vehicle, the forward and backward target sub-images are empty, that is, the target image of the vehicle only includes the target sub-image 421 and the target sub-image 422.
Accordingly, when determining the second attribute information of the target object in the target image, the target sub-image matching both the target object in the image to be processed and its orientation may be determined first; that is, the determined target sub-image contains the same target object, with the same orientation, as the image to be processed. After the target sub-image is determined, the second information of the target object in that sub-image can be determined. In embodiment 400, analyzing the image to be processed 410 yields the first attribute information, which comprises first information 431 representing the orientation of the vehicle in the image and second information 432 representing the display effect of the vehicle in the image. If the first information 431 indicates an orientation to the right, the target sub-image 421, in which the target object is oriented to the right, can be selected from the target sub-image 421 and the target sub-image 422, and the second information 440 representing the display effect of the vehicle in the target sub-image 421 is determined.
After the second information 432 and the second information 440 are obtained, the two may be compared. If the second information 432 of the vehicle in the image to be processed 410 is better than the second information 440 of the vehicle in the target sub-image 421, the target sub-image 421 may be updated with the image to be processed 410, specifically by replacing the target sub-image 421 with the image to be processed 410.
According to the embodiment of the disclosure, the image to be processed may be any video frame in a sequence of video frames captured by a video capture device. In case the target image comprises at least two target sub-images of the target object oriented differently from each other, one target sub-image may be output for the target object as input for a downstream application.
When outputting the target sub-image, for example, the orientation of the target object in the video frames that precede the image to be processed in the video frame sequence may be counted, and the orientation of the target sub-image to output determined from the counted orientation information. For example, for the previous frames of the image to be processed in the video frame sequence, the target video frames that include the target object may be determined. The target orientation is then determined based on the orientation of the target object in those target video frames: if there are a plurality of target video frames, the target orientation may be the orientation that occurs most frequently among the orientations obtained from them. Finally, the image in which the target object has the target orientation is output from the at least two target sub-images.
For example, if 48 of the preceding video frames include vehicle A, with vehicle A oriented to the left in 40 of them and to the right in 8, the target sub-image finally output is the one in which the target object is oriented to the left.
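This majority vote over the preceding frames can be written as the short sketch below; the history format is an assumption.

```python
# Pick the output orientation by majority vote over the orientations observed
# in the preceding target video frames; the history format is an assumption.
from collections import Counter
from typing import List, Optional

def pick_target_orientation(orientation_history: List[str]) -> Optional[str]:
    """E.g. 40 x 'left' and 8 x 'right' -> 'left'."""
    if not orientation_history:
        return None
    return Counter(orientation_history).most_common(1)[0][0]
```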
Outputting the image with the target orientation can further improve the accuracy of downstream applications.
Fig. 5 is a flowchart illustrating an image processing method according to another embodiment of the present disclosure.
According to an embodiment of the present disclosure, if the target object is a vehicle, the first attribute information may further include, in addition to the second information representing the display effect of the target object, third information indicating whether the vehicle has a license plate. This third information can be obtained, for example, by performing object detection on the image to be processed; accordingly, the objects detected by the target detection model include a license plate in addition to the vehicle.
In one embodiment, it may first be determined whether a license plate of the vehicle is detected in the image to be processed, and whether to update the target image is determined based on the first attribute information and the second attribute information only when the license plate is detected. This ensures that the vehicle in the target image has a visible license plate, which facilitates identifying the license plate number in downstream applications.
In an embodiment, the second information may include first sub information indicating whether the target object is complete, second sub information indicating an occlusion degree of the target object, and third sub information indicating a size ratio of the target object. As shown in fig. 5, the image processing method 500 in this embodiment may include operations S510 to S570. Wherein operation S510 is performed in response to detecting the target object in the image to be processed.
In operation S510, it is determined whether the image to be processed is a first frame image in which the target object is detected. If so, operation S560 is performed, otherwise, operation S520 is performed.
In operation S520, it is determined whether a license plate of a vehicle in the image to be processed is detected. If the license plate is detected, operation S530 is performed, otherwise, operation S570 is performed.
In operation S530, it is determined whether the vehicle is complete in the image to be processed; if the image to be processed includes only part of the vehicle body, the vehicle is incomplete. If the vehicle is incomplete, operation S570 is performed; otherwise, operation S540 is performed.
In operation S540, it is determined whether the degree of occlusion of the vehicle in the image to be processed is greater than the degree of occlusion of the vehicle in the target image. If so, operation S570 is performed, otherwise, operation S550 is performed. It should be noted that, if one target sub-image is maintained for each orientation of the vehicle, the target image mentioned in operation S540 is a target sub-image in which the orientation of the vehicle is the same as the orientation of the vehicle in the image to be processed.
In operation S550, it is determined whether the size ratio of the vehicle in the image to be processed is greater than the size ratio of the vehicle in the target image. If so, operation S560 is performed, otherwise, operation S570 is performed. The size ratio is the ratio of the size of the vehicle in the image to the overall size of the image.
In operation S560, the target image is updated to the image to be processed.
In operation S570, the target image is maintained unchanged.
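Read end to end, operations S510 to S570 amount to the decision function sketched below; all predicates are assumed to come from the detection and classification steps described earlier, and this is an illustration rather than the claimed implementation.

```python
# Sketch of the S510-S570 decision flow for a vehicle target; inputs are
# assumed to be produced by the detection/classification steps above.
def decide_update(is_first_frame: bool,
                  plate_detected: bool,
                  vehicle_complete: bool,
                  occlusion: float, target_occlusion: float,
                  size_ratio: float, target_size_ratio: float) -> bool:
    """Return True to update the target image (S560), False to keep it (S570)."""
    if is_first_frame:                      # S510 -> S560
        return True
    if not plate_detected:                  # S520 -> S570
        return False
    if not vehicle_complete:                # S530 -> S570
        return False
    if occlusion > target_occlusion:        # S540 -> S570
        return False
    return size_ratio > target_size_ratio  # S550 -> S560 / S570
```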
Based on the image processing method provided by the embodiment of the disclosure, the disclosure also provides an image processing device, which will be described in detail below with reference to fig. 6.
Fig. 6 is a block diagram of the structure of an image processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 6, the image processing apparatus 600 of this embodiment may include a first attribute determination module 610, a second attribute determination module 620, and an update determination module 630.
The first attribute determining module 610 is configured to determine first attribute information of a target object in an image to be processed in response to detecting the target object in the image to be processed. In an embodiment, the first attribute determining module 610 may be configured to perform the operation S210 described above, which is not described herein again.
The second attribute determining module 620 is configured to determine second attribute information of the target object in the target image. In an embodiment, the second attribute determining module 620 may be configured to perform the operation S220 described above, which is not described herein again.
The update determining module 630 is configured to determine whether to update the target image with the to-be-processed image based on the first attribute information and the second attribute information. In an embodiment, the update determining module 630 may be configured to perform the operation S230 described above, which is not described herein again.
According to an embodiment of the present disclosure, the first attribute determining module 610 may include an image cropping sub-module and an attribute determining sub-module. The image cropping sub-module is configured to crop the image to be processed based on the position of the target object in the image to be processed, to obtain a partial image including the target object. The attribute determining sub-module is configured to determine the first attribute information of the target object based on the partial image.
According to an embodiment of the present disclosure, the target object includes at least two objects. The image cropping sub-module is configured to crop the image to be processed based on the position of each of the at least two objects in the image to be processed, to obtain at least two partial images respectively including the at least two objects. The attribute determining sub-module is configured to determine the first sub-attribute information of each object based on the partial image, among the at least two partial images, that includes that object.
According to an embodiment of the present disclosure, the target object includes at least two objects; the first attribute information includes first sub-attribute information of each of the at least two objects. The second attribute determining module 620 may be configured to determine the target sub-image for each object and the second sub-attribute information of each object in the target sub-image based on the identification information of each object. The update determining module 630 may be configured to determine whether to update the target sub-image with the image to be processed based on the first sub-attribute information of each object and the second sub-attribute information of each object.
According to an embodiment of the present disclosure, the first attribute information and the second attribute information each include first information representing the orientation of the target object and second information representing the display effect of the target object. The second attribute determining module 620 may be configured to determine a target sub-image in which the orientation of the target object is the same as the orientation represented by the first information, and to determine the second information of the target object in that target sub-image. The update determining module 630 may be configured to update the target sub-image with the image to be processed if the second information of the target object in the image to be processed is better than that in the target sub-image.
According to an embodiment of the present disclosure, the second information includes at least one of: first sub-information representing whether the target object is complete, second sub-information representing the degree to which the target object is occluded, and third sub-information representing the size ratio of the target object.
According to an embodiment of the present disclosure, the target image includes at least two target sub-images in which the target object is oriented differently, and the image to be processed is any video frame in a video frame sequence. The image processing apparatus 600 may further include a target frame determining module, a target orientation determining module, and an image output module. The target frame determining module is configured to determine, for the previous frames of the image to be processed in the video frame sequence, the target video frames that include the target object. The target orientation determining module is configured to determine a target orientation based on the orientation of the target object in the target video frames. The image output module is configured to output, from the at least two target sub-images, an image in which the target object has the target orientation.
According to an embodiment of the present disclosure, the target object includes a vehicle, and the first attribute information includes third information representing whether a license plate of the vehicle is detected. The update determining module 630 is configured to determine whether to update the target image with the to-be-processed image based on the first attribute information and the second attribute information when the third information represents that the license plate of the vehicle is detected.
In the technical solution of the present disclosure, the acquisition, collection, storage, use, processing, transmission, provision, and disclosure of the personal information involved all comply with the provisions of relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 shows a schematic block diagram of an example electronic device 700 that may be used to implement the image processing methods of embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the defects of high management difficulty and weak service extensibility found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
Claims (19)
1. An image processing method comprising:
in response to detecting a target object in an image to be processed, determining first attribute information of the target object in the image to be processed;
determining second attribute information of the target object in the target image; and
determining whether to update the target image with the image to be processed based on the first attribute information and the second attribute information.
2. The method of claim 1, wherein determining first attribute information of the target object in the image to be processed comprises:
cropping the image to be processed based on the position of the target object in the image to be processed, to obtain a partial image comprising the target object; and
determining the first attribute information of the target object based on the partial image.
3. The method of claim 2, wherein the target object comprises at least two objects; wherein:
the cropping the image to be processed based on the position of the target object in the image to be processed comprises: cropping the image to be processed based on the position of each object in the image to be processed, to obtain at least two partial images respectively comprising the at least two objects; and
the determining the first attribute information based on the partial image comprises: determining first sub-attribute information of each object based on the partial image, among the at least two partial images, that comprises the each object.
4. The method of claim 1, wherein the target object comprises at least two objects; the first attribute information includes first sub-attribute information of each of the at least two objects; wherein:
determining second attribute information of the target object in the target image comprises: determining a target sub-image for each object and second sub-attribute information of each object in the target sub-image based on the identification information of each object;
determining whether to update the target image with the image to be processed comprises: determining whether to update the target sub-image with the image to be processed based on the first sub-attribute information of each object and the second sub-attribute information of each object.
5. The method according to claim 1, wherein the first attribute information and the second attribute information each include first information characterizing an orientation of the target object and second information characterizing a presentation effect of the target object; wherein:
the determining second attribute information of the target object in the target image includes: determining a target sub-image in which the orientation of the target object is the same as the orientation represented by the first information; and determining second information of the target object in the target sub-image; and
determining whether to update the target image with the image to be processed comprises: updating the target sub-image with the image to be processed in a case where the second information of the target object in the image to be processed is better than the second information of the target object in the target sub-image.
6. The method of claim 5, wherein the second information comprises at least one of: first sub-information representing whether the target object is complete, second sub-information representing the degree to which the target object is occluded, and third sub-information representing the size ratio of the target object.
7. The method according to claim 5, wherein the target image comprises at least two target sub-images of a target object oriented differently from each other; the image to be processed is any video frame in a video frame sequence; the method further comprises the following steps:
determining, for a previous frame of the image to be processed in the video frame sequence, a target video frame comprising the target object in the previous frame;
determining a target orientation based on an orientation of the target object in the target video frame; and
outputting, from the at least two target sub-images, an image in which the target object has the target orientation.
8. The method of claim 1, wherein the target object comprises a vehicle; the first attribute information includes third information representing whether a license plate of the vehicle is detected; and determining whether to update the target image with the image to be processed comprises:
determining, in a case where the third information represents that the license plate of the vehicle is detected, whether to update the target image with the image to be processed based on the first attribute information and the second attribute information.
9. An image processing apparatus comprising:
a first attribute determining module, configured to determine, in response to detecting a target object in an image to be processed, first attribute information of the target object in the image to be processed;
a second attribute determining module, configured to determine second attribute information of the target object in a target image; and
an update determining module, configured to determine whether to update the target image with the image to be processed based on the first attribute information and the second attribute information.
10. The apparatus of claim 9, wherein the first attribute determination module comprises:
an image cropping sub-module, configured to crop the image to be processed based on the position of the target object in the image to be processed, to obtain a partial image comprising the target object; and
an attribute determining sub-module, configured to determine the first attribute information of the target object based on the partial image.
11. The apparatus of claim 10, wherein the target object comprises at least two objects;
the image cropping sub-module is configured to crop the image to be processed based on the position of each object in the image to be processed, to obtain at least two partial images respectively comprising the at least two objects; and
the attribute determining sub-module is configured to determine first sub-attribute information of each object based on the partial image, among the at least two partial images, that comprises the each object.
12. The apparatus of claim 9, wherein the target object comprises at least two objects; the first attribute information includes first sub-attribute information of each of the at least two objects; wherein:
the second attribute determining module is configured to determine a target sub-image for each object and second sub-attribute information of each object in the target sub-image based on the identification information of each object; and
the update determining module is configured to determine whether to update the target sub-image with the image to be processed based on the first sub-attribute information of each object and the second sub-attribute information of each object.
13. The apparatus according to claim 9, wherein the first attribute information and the second attribute information each include first information characterizing an orientation of the target object and second information characterizing a presentation effect of the target object; wherein:
the second attribute determining module is configured to determine a target sub-image in which the orientation of the target object is the same as the orientation represented by the first information, and to determine second information of the target object in the target sub-image; and
the update determining module is configured to update the target sub-image with the image to be processed in a case where the second information of the target object in the image to be processed is better than the second information of the target object in the target sub-image.
14. The apparatus of claim 13, wherein the second information comprises at least one of: first sub-information characterizing whether the target object is complete, second sub-information characterizing a degree to which the target object is occluded, and third sub-information characterizing a size ratio of the target object.
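Claim 14 enumerates the components of the second information but fixes no formula for comparing presentation effects. A sketch that folds the three sub-informations into a single score; the field names and weights are assumptions, not the patent's method:

```python
from dataclasses import dataclass

@dataclass
class PresentationEffect:
    is_complete: bool       # first sub-information: object fully inside the frame
    occlusion_ratio: float  # second sub-information: 0.0 (clear) .. 1.0 (hidden)
    size_ratio: float       # third sub-information: object area / frame area

def effect_score(e: PresentationEffect) -> float:
    # Illustrative weighting only; the patent does not prescribe one.
    completeness = 1.0 if e.is_complete else 0.0
    return 0.5 * completeness + 0.3 * (1.0 - e.occlusion_ratio) + 0.2 * e.size_ratio

def is_better(candidate: PresentationEffect, current: PresentationEffect) -> bool:
    # Claim 13: update when the candidate's presentation effect is better.
    return effect_score(candidate) > effect_score(current)
```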
15. The apparatus of claim 13, wherein the target image comprises at least two target sub-images in which the orientations of the target object differ from each other, and the image to be processed is any video frame in a video frame sequence; the apparatus further comprising:
a target frame determination module configured to determine, from frames preceding the image to be processed in the video frame sequence, a target video frame that includes the target object;
a target orientation determination module configured to determine a target orientation based on an orientation of the target object in the target video frame; and
an image output module configured to output, from the at least two target sub-images, an image in which the orientation of the target object is the target orientation.
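Claim 15 (like the corresponding method claim) determines a target orientation from the preceding frames, then outputs the archived sub-image with that orientation. A sketch assuming a simple majority vote over the earlier orientations, which the claim itself does not mandate:

```python
from collections import Counter

def pick_output_image(earlier_orientations, sub_images: dict):
    """earlier_orientations: orientation of the tracked object in each preceding
    frame that contained it; sub_images maps orientation -> archived sub-image."""
    if not earlier_orientations:
        return None
    target_orientation, _ = Counter(earlier_orientations).most_common(1)[0]
    return sub_images.get(target_orientation)
```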
16. The apparatus of claim 9, wherein the target object comprises a vehicle, and the first attribute information includes third information representing whether a license plate of the vehicle is detected; wherein the update determination module is configured to:
determine, based on the first attribute information and the second attribute information, whether to update the target image with the image to be processed in a case where the third information represents that the license plate of the vehicle is detected.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any of claims 1-8.
19. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111207140.2A CN113936258A (en) | 2021-10-15 | 2021-10-15 | Image processing method, image processing device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113936258A (en) | 2022-01-14 |
Family
ID=79280045
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111207140.2A Pending CN113936258A (en) | 2021-10-15 | 2021-10-15 | Image processing method, image processing device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113936258A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019105260A1 (en) * | 2017-11-30 | 2019-06-06 | Oppo广东移动通信有限公司 | Depth of field obtaining method, apparatus and device |
KR20190082129A (en) * | 2017-12-29 | 2019-07-09 | 주식회사 이오이스 | Apparatus and method for rendering 3-dimensional image using video |
CN108491784A (en) * | 2018-03-16 | 2018-09-04 | 南京邮电大学 | Real-time single-feature recognition and automatic screenshot method for large-scale live scenes |
CN111726512A (en) * | 2019-03-18 | 2020-09-29 | 浙江宇视科技有限公司 | Area focusing method, device, equipment and storage medium |
WO2020248386A1 (en) * | 2019-06-14 | 2020-12-17 | 平安科技(深圳)有限公司 | Video analysis method and apparatus, computer device and storage medium |
CN111400533A (en) * | 2020-03-02 | 2020-07-10 | 北京三快在线科技有限公司 | Image screening method and device, electronic equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
Claudia Paris et al.: "A novel automatic approach to the update of land-cover maps by unsupervised classification of remote sensing images", IEEE, 4 December 2017 (2017-12-04) *
Wang Pengyue; Mao Zheng; Zhang Chen; Zhang Guangshen: "Global Motion Estimation Based on ORB Feature Matching" (基于ORB特征匹配的全局运动估计), Ordnance Industry Automation (兵工自动化), no. 12, 15 December 2017 (2017-12-15) *
Wang Shentao: "Pedestrian Detection Based on Deep Learning" (基于深度学习的行人检测), Computer Programming Skills & Maintenance (电脑编程技巧与维护), no. 12, 18 December 2019 (2019-12-18) *
Similar Documents
Publication | Title |
---|---|
JP7393472B2 | Display scene recognition method, device, electronic device, storage medium and computer program |
US20210295472A1 | Method and apparatus for recognizing abnormal license plate, device and readable storage medium |
CN114943936B | Target behavior recognition method and device, electronic equipment and storage medium |
CN111814637A | Dangerous driving behavior recognition method and device, electronic equipment and storage medium |
CN113326773A | Recognition model training method, recognition method, device, equipment and storage medium |
CN113378857A | Target detection method and device, electronic equipment and storage medium |
CN116310993A | Target detection method, device, equipment and storage medium |
CN111950345A | Camera identification method and device, electronic equipment and storage medium |
CN113076889B | Container lead seal identification method, device, electronic equipment and storage medium |
CN114005095A | Vehicle attribute identification method and device, electronic equipment and medium |
CN114639143A | Portrait filing method, equipment and storage medium based on artificial intelligence |
CN113569912A | Vehicle identification method and device, electronic equipment and storage medium |
CN113569911A | Vehicle identification method and device, electronic equipment and storage medium |
CN113643260A | Method, apparatus, device, medium and product for detecting image quality |
CN113326766A | Training method and device of text detection model and text detection method and device |
CN113822110A | Target detection method and device |
CN114429631B | Three-dimensional object detection method, device, equipment and storage medium |
CN115147814A | Recognition method of traffic indication object and training method of target detection model |
CN114549584A | Information processing method and device, electronic equipment and storage medium |
CN113936258A | Image processing method, image processing device, electronic equipment and storage medium |
CN115439692A | Image processing method and device, electronic equipment and medium |
CN113887394A | Image processing method, device, equipment and storage medium |
CN114511862A | Form identification method and device and electronic equipment |
CN112700657B | Method and device for generating detection information, road side equipment and cloud control platform |
CN114092739B | Image processing method, apparatus, device, storage medium, and program product |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |