WO2019233266A1 - Image processing method, computer-readable storage medium and electronic device

Image processing method, computer-readable storage medium and electronic device

Info

Publication number
WO2019233266A1
WO2019233266A1 (PCT/CN2019/087590)
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
foreground
processed
area
Prior art date
Application number
PCT/CN2019/087590
Other languages
English (en)
Chinese (zh)
Inventor
陈岩
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Publication of WO2019233266A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques

Definitions

  • the present application relates to the field of computer technology, and in particular, to an image processing method, a computer-readable storage medium, and an electronic device.
  • Smart devices can capture images through the camera, or they can acquire images through transmission with other smart devices.
  • Images captured in different scenes have different color characteristics, and different foreground objects have different performance characteristics.
  • An image processing method, a computer-readable storage medium, and an electronic device are provided.
  • An image processing method includes: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target in the image to be processed; identifying the foreground target if a target area occupied by the foreground target in the image to be processed is greater than an area threshold; and generating an image classification label according to a recognition result of the foreground target.
  • An image processing device includes:
  • An image acquisition module configured to acquire an image to be processed
  • a target detection module configured to perform target detection on the image to be processed, and obtain a foreground target in the image to be processed
  • a target recognition module configured to identify the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold
  • An image classification module is configured to generate an image classification label according to a recognition result of the foreground target.
  • A computer-readable storage medium stores a computer program thereon. When the computer program is executed by a processor, the following operations are implemented: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target; identifying the foreground target if the target area occupied by the foreground target is greater than an area threshold; and generating an image classification label according to a recognition result of the foreground target.
  • An electronic device includes a memory and a processor. The memory stores computer-readable instructions that, when executed by the processor, cause the processor to perform the following operations: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target; identifying the foreground target if the target area occupied by the foreground target is greater than an area threshold; and generating an image classification label according to a recognition result of the foreground target.
  • The image processing method, the computer-readable storage medium, and the electronic device can acquire an image to be processed and perform target detection on the image to be processed to obtain a foreground target.
  • If the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so the image classification label generated from the recognition result of the foreground target classifies the image more accurately.
  • FIG. 1 is an application environment diagram of an image processing method in an embodiment.
  • FIG. 2 is a flowchart of an image processing method according to an embodiment.
  • FIG. 3 is a flowchart of an image processing method in another embodiment.
  • FIG. 4 (a) is a schematic diagram of an image where the target area is less than the area threshold in one embodiment.
  • FIG. 4 (b) is a schematic diagram of an image where the target area is greater than the area threshold in an embodiment.
  • FIG. 5 is a schematic diagram of a model for identifying a foreground and a background of an image in an embodiment.
  • FIG. 6 is a schematic diagram of a model for identifying an image foreground and background in another embodiment.
  • FIG. 7 is a schematic diagram of generating an image classification label in one embodiment.
  • FIG. 8 is a flowchart of an image processing method according to another embodiment.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment.
  • FIG. 10 is a schematic diagram of an image processing circuit in an embodiment.
  • The terms "first", "second", and the like used in this application can be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another.
  • the first client may be referred to as the second client, and similarly, the second client may be referred to as the first client. Both the first client and the second client are clients, but they are not the same client.
  • FIG. 1 is an application environment diagram of an image processing method in an embodiment.
  • the application environment includes a terminal 102 and a server 104.
  • the image to be processed may be transmitted between the terminal 102 and the server 104, and the image to be processed may be classified and processed.
  • the terminal 102 may store several images to be processed, and then send the images to be processed to the server 104.
  • a classification algorithm for classifying images is stored in the server 104, and target detection may be performed on the received to-be-processed image to determine a foreground target included in the to-be-processed image.
  • the terminal 102 may perform classification processing on the image to be processed according to the obtained image classification label.
  • the terminal 102 is an electronic device located at the outermost periphery of a computer network and is mainly used for inputting user information and outputting processing results.
  • the terminal 102 may be a personal computer, a mobile terminal, a personal digital assistant, or a wearable electronic device.
  • the server 104 is a device for responding to a service request while providing a computing service, and may be, for example, one or more computers. In other embodiments provided in this application, the foregoing application environment may further include only the terminal 102 or the server 104, which is not limited herein.
  • FIG. 2 is a flowchart of an image processing method according to an embodiment. As shown in FIG. 2, the image processing method includes operations 202 to 208, wherein:
  • Operation 202: Acquire an image to be processed. The image to be processed may be acquired through a camera of the electronic device, acquired from another electronic device, or downloaded through a network, which is not limited herein.
  • a camera may be installed on the electronic device, and when the electronic device detects a shooting instruction, it controls the camera through the shooting instruction to collect images to be processed. After obtaining the images, the electronic device can process the images immediately or store the images in a folder in a unified manner. After the images stored in the folder reach a certain number, the stored images are processed in a unified manner.
  • the electronic device may store the acquired images in an album, and when the number of images stored in the album is greater than a certain number, processing of the images in the album is triggered.
  • Operation 204: Perform target detection on the image to be processed to obtain a foreground target in the image to be processed.
  • one or more objects are generally included in a scene where an image is captured.
  • the image when shooting outdoor scenes, the image generally includes pedestrians, blue sky, beaches, buildings, etc.
  • the image When shooting indoor scenes, the image generally includes objects such as furniture, appliances, office supplies, and so on.
  • the foreground target refers to the more prominent main target in the image, which is the object that the user is more concerned about.
  • the area in the image other than the foreground target is the background area.
  • the image to be processed is a two-dimensional pixel matrix composed of several pixels
  • the electronic device can detect the foreground target in the image to be processed. It is detected that the foreground target contains some or all of the pixels in the image to be processed, and then the specific position of the foreground target in the image to be processed is marked. Specifically, after detecting the foreground target, the electronic device may mark the foreground target in the image to be processed through a rectangular frame, so that the user can directly see the specific position of the detected foreground target from the image to be processed.
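  • As an illustration of this marking step, a minimal Python sketch using OpenCV is given below; the (x, y, w, h) detection boxes and the file name are assumptions for the example, not part of the original disclosure's notation.

```python
import cv2

def mark_foreground_targets(image_path, detections):
    """Draw a rectangular frame around each detected foreground target.

    `detections` is assumed to be a list of (x, y, w, h) boxes in pixel
    coordinates; how they are produced is left to the detection model.
    """
    image = cv2.imread(image_path)
    for (x, y, w, h) in detections:
        # Mark the specific position of the foreground target with a rectangle.
        cv2.rectangle(image, (x, y), (x + w, y + h), color=(0, 255, 0), thickness=2)
    return image
```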
  • Operation 206: If the target area occupied by the foreground target in the image to be processed is greater than an area threshold, identify the foreground target.
  • a target identifier may be established for the detected foreground target to uniquely identify a foreground target.
  • the electronic device may establish a correspondence between an image identifier, a target identifier, and a target position.
  • the image identifier is used to uniquely identify an image to be processed
  • the target position is used to indicate a specific position of the foreground target in the image to be processed.
  • the detected foreground target is composed of some or all pixels in the image to be processed.
  • the number of pixels contained in the area where the foreground target is located can be counted, and the target area occupied by the foreground target can be calculated based on the counted number of pixels.
  • the target area may be directly expressed by the number of pixels included in the foreground target, or may be expressed by a ratio of the number of pixels included in the foreground target to the number of pixels included in the image to be processed. The larger the number of pixels contained in the foreground target, the larger the corresponding target area.
  • the electronic device obtains the target area of the foreground target after detecting the foreground target. If the target area is greater than the area threshold, the foreground target is considered too large and the corresponding background area is relatively small. When the background area is too small, the recognition of the background is not accurate. At this time, image classification can be performed according to the foreground target. For example, when the foreground object occupies more than 1/2 of the area of the image to be processed, an image classification label is generated according to the recognition result of the foreground object.
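  • A minimal Python sketch of this area test, assuming the foreground target is given as a boolean pixel mask and taking the example threshold of 1/2 of the image:

```python
import numpy as np

AREA_THRESHOLD = 0.5  # e.g. half of the image, per the example above

def target_area_ratio(foreground_mask: np.ndarray) -> float:
    """Number of foreground pixels over the total pixel count of the image."""
    return float(np.count_nonzero(foreground_mask)) / foreground_mask.size

def classify_by_foreground(foreground_mask: np.ndarray) -> bool:
    # True: identify the foreground target; False: fall back to the background.
    return target_area_ratio(foreground_mask) > AREA_THRESHOLD
```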
  • Specifically, the electronic device sets the classification types of the foreground object in advance, and then recognizes which preset classification type the detected foreground object belongs to by using a preset classification algorithm. For example, the electronic device can classify the foreground target as a person, a puppy, a kitten, food, or another type, and then identify which of these types the detected foreground target belongs to.
  • The preset classification algorithm may include, but is not limited to, R-CNN (Region-based Convolutional Neural Network), SSD (Single Shot MultiBox Detector), and YOLO (You Only Look Once).
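  • For illustration, a sketch of target detection with a pretrained SSD model from torchvision, one possible stand-in for the detection algorithms named above; the score threshold and file name are assumptions:

```python
import torch
from torchvision.models import detection
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a pretrained SSD detector (one of the algorithm families named above).
model = detection.ssd300_vgg16(weights="DEFAULT")
model.eval()

image = Image.open("to_be_processed.jpg").convert("RGB")
with torch.no_grad():
    predictions = model([to_tensor(image)])[0]

# Each prediction carries boxes, class labels, and confidence scores; a score
# threshold keeps only the more reliable foreground targets.
keep = predictions["scores"] > 0.5
boxes = predictions["boxes"][keep]    # (x1, y1, x2, y2) per foreground target
labels = predictions["labels"][keep]  # predicted foreground type indices
```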
  • Operation 208: Generate an image classification label according to the recognition result of the foreground target.
  • the foreground type of the foreground object can be obtained, and then the image to be processed can be labeled according to the foreground type.
  • the image classification label can be used to mark the type of the image to be processed.
  • the electronic device can classify the image to be processed according to the image classification label, and then classify the image to be processed.
  • the classification label can also be used to find the image to be processed. For example, the electronic device may store the images corresponding to the same image classification label in an album, so that the user can sort and find the corresponding images.
  • the image to be processed can be classified and processed according to the image classification label. For example, when the foreground target is detected as a person, the portrait area in the image can be subjected to beauty treatment; when the foreground target is detected as a plant, the saturation and contrast of the plant can be improved.
  • The image processing method provided in the foregoing embodiment may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target.
  • If the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so the image classification label generated from the recognition result classifies the image more accurately.
  • FIG. 3 is a flowchart of an image processing method in another embodiment. As shown in FIG. 3, the image processing method includes operations 302 to 316, wherein:
  • Operation 302: Acquire an image set including at least one target image, and calculate the similarity between any two target images.
  • When the image to be processed is identified and an image classification label is generated, this can be done for a single image to be processed or for a batch of images to be processed. For example, an image can be recognized immediately after it is captured and an image classification label generated; it is also possible to store the captured images in the electronic device and, after the captured images exceed a certain number, perform the recognition processing in a unified manner.
  • the image set includes one or more target images, and the target images may be images stored in the electronic device.
  • the images stored by the electronic device may be obtained in different ways, for example, they may be taken by a user through a camera, they may be downloaded on the network, or they may be sent by a friend.
  • the electronic device recognizes the target image in the image collection and generates an image classification label.
  • Generating the image set may specifically include: acquiring at least one target image from a preset file path, and generating the image set according to the acquired target image.
  • the preset file path is used to store images that can be used to identify image classification labels. For example, the preset file path can store only images captured by the user through a camera.
  • an image that needs to be identified may be acquired according to an image generation time when a specified trigger condition is satisfied.
  • a specified trigger condition when a specified trigger condition is met, an image collection is generated according to a target image stored in an electronic device whose storage duration exceeds a duration threshold, and the storage duration refers to a time interval from the time when the target image is acquired by the electronic device to the current time. For example, if the image was captured by a camera, the time is counted from the moment the image is generated by the camera. If the image is downloaded via the network, the time is counted from the moment the image is received.
  • the electronic device can trigger an image recognition process every time a specified time is reached. Or when the number of images included in the image collection exceeds a certain number, the image recognition processing is triggered, which is not limited herein.
  • Operation 304: Classify the target images according to the similarity, wherein the similarity between any two target images in the same type of target images is greater than a similarity threshold.
  • images with a high degree of similarity often have similar recognition results. For example, when continuous shooting is performed by an electronic device, since the interval between successively captured images is relatively short, the captured images are similar, so that the recognition results of the images are relatively close. After generating the image set, the similarity between any two target images in the image set can be calculated. The target images with higher similarity can be identified only once to avoid the consumption of electronic device resources caused by repeated identification.
  • the target image after calculating the similarity of the target image, the target image can be classified according to the similarity, and the images with higher similarity can be classified into the same class.
  • the similarity between the same type of images is relatively high, and the recognition results are relatively close, so that the same type of images can be uniformly processed for recognition.
  • calculate the similarity between any two images in the image set and cluster the target images based on the similarity. Assuming the range of similarity is [0,1], two images with similarity greater than 0.9 can be classified into the same category.
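  • A minimal Python sketch of such threshold-based grouping, assuming a precomputed pairwise similarity matrix with values in [0, 1] (e.g. from histogram or feature comparison) and using a greedy assignment so that any two images in the same class exceed the threshold:

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.9  # similarity range assumed to be [0, 1]

def cluster_by_similarity(similarity: np.ndarray) -> list[list[int]]:
    """Greedy single-pass clustering: an image joins the first class in
    which it is similar enough to every existing member, so any two images
    of the same class have similarity above the threshold."""
    clusters: list[list[int]] = []
    for i in range(similarity.shape[0]):
        for cluster in clusters:
            if all(similarity[i, j] > SIMILARITY_THRESHOLD for j in cluster):
                cluster.append(i)
                break
        else:
            # No existing class accepts this image: start a new class.
            clusters.append([i])
    return clusters
```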
  • Operation 306: Obtain a target image from each type of target image as the image to be processed.
  • a target image After classifying the target image, a target image can be obtained from each type of target image as the image to be processed for recognition processing.
  • The image classification label generated according to the recognition result of the image to be processed can be used as the image classification label of the corresponding target image.
  • a target image may be randomly obtained from each type of target image as an image to be processed, and an image to be processed may also be determined by calculating a similarity difference value.
  • Specifically, an image subset can be generated according to each type of target image; the target images in the image subset are traversed, and the similarities between each target image and the other target images in the image subset are accumulated to obtain a total similarity; an image to be processed is then determined from the image subset according to the total similarity. For example, the total similarity corresponding to each target image in the image subset is calculated; the larger the total similarity, the higher the similarity between the target image and the other target images, and the target image with the largest total similarity can be used as the image to be processed.
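  • A sketch of this selection rule, again assuming a precomputed similarity matrix; the image with the largest accumulated similarity to the rest of its subset is returned:

```python
import numpy as np

def pick_image_to_process(similarity: np.ndarray, cluster: list[int]) -> int:
    """Return the index of the target image whose accumulated similarity to
    the other images of the subset is largest (the most central image)."""
    totals = {
        i: sum(similarity[i, j] for j in cluster if j != i)
        for i in cluster
    }
    return max(totals, key=totals.get)
```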
  • Operation 308: Perform target detection on the image to be processed to obtain a foreground target in the image to be processed.
  • One or more foreground objects may exist in the image to be processed.
  • When only one foreground object exists in the image to be processed, the area occupied by that foreground object is used as the target area; when there are two or more foreground targets, the total area occupied by all foreground targets included in the image to be processed is taken as the target area.
  • the target area is larger than the area threshold, it is considered that the area occupied by the foreground target is larger and the area occupied by the background area is smaller.
  • If the target area is greater than the area threshold, the foreground target is identified; if the target area is less than or equal to the area threshold, the background area other than the foreground target in the image to be processed is identified, and an image classification label is generated based on the recognition result of the background area.
  • Specifically, the electronic device detects a background area in the image to be processed, and after detecting the background area, identifies which background type it belongs to.
  • the electronic device can set the background type of the background area in advance, and then identify which preset background type the background area specifically belongs to through a preset algorithm.
  • the background area can be divided into scenes such as beach, snow, night, blue sky, indoor, etc.
  • the background type corresponding to the background area can be obtained.
  • An image classification label is generated according to the obtained background type.
  • FIG. 4 (a) is a schematic diagram of an image where the target area is less than the area threshold in an embodiment. As shown in FIG. 4 (a), the image includes a background area 402 and a foreground target 404. The area occupied by the background area 402 in the image is larger than the area occupied by the foreground target 404, so the recognition of the background area 402 is more accurate at this time, and an image classification label may be generated according to the recognition result obtained by recognizing the background area 402.
  • FIG. 4 (b) is a schematic diagram of an image where the target area is greater than the area threshold in one embodiment. As shown in FIG. 4 (b), it includes the background area 406 and the foreground target 408.
  • the area occupied by the foreground target 408 in the image is larger than the background area 406.
  • the recognition of the foreground object 408 is more accurate at this time, and an image classification label may be generated according to the recognition result obtained by the recognition of the foreground object 408.
  • the background region can be identified by the classification model
  • the foreground target can be identified by the detection model.
  • the electronic device trains the classification model and the detection model, and outputs a corresponding loss function, respectively.
  • the loss function is a function that can evaluate the confidence of the classification results.
  • the confidence function corresponding to each preset category can be output through the loss function. The higher the confidence level, the greater the probability that the image belongs to the category. In this way, the background type and foreground type corresponding to the image are determined by the confidence level.
  • the background of the image is defined in advance as types of beach, night scene, fireworks, indoor, etc.
  • the electronic device can train the classification model in advance, and the trained classification model can output a loss function.
  • the background region can be detected by the classification model, and the type of the background region can be identified.
  • the confidence function corresponding to each preset background type can be calculated through the loss function, and the background classification result corresponding to the background region is determined through the confidence degree.
  • For example, if the calculated confidence levels of the four types beach, night view, fireworks, and indoor are 0.01, 0.06, 0.89, and 0.04, respectively, the background region of the image to be processed is determined to be of the background type with the highest confidence, namely fireworks.
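  • The selection of the background type by confidence can be sketched in Python as follows; the type list and confidence values are those of the example above:

```python
BACKGROUND_TYPES = ["beach", "night view", "fireworks", "indoor"]

def classify_background(confidences):
    """Return the preset background type with the highest confidence."""
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    return BACKGROUND_TYPES[best]

print(classify_background([0.01, 0.06, 0.89, 0.04]))  # -> "fireworks"
```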
  • FIG. 5 is a schematic diagram of a model for identifying a foreground and a background of an image in an embodiment.
  • the electronic device can train the classification model. Before training the model, the image is labeled with a category label, and the classification model is trained through the image and the corresponding category label. After training the classification model, a first loss function can be obtained.
  • a background region in an image can be detected by a classification model, and a first confidence level corresponding to each preset background type can be calculated by using the obtained first loss function. According to the obtained first confidence level, a background classification result corresponding to the background region can be determined.
  • the electronic device can train the detection model.
  • the foreground targets included in the image are marked with a rectangular frame, and the category corresponding to each foreground target is marked.
  • the detection model is trained through images. After the detection model is trained, a second loss function can be obtained.
  • the foreground objects in the image can be detected by the detection model, and the positions of each foreground object can be output.
  • a second confidence function corresponding to each preset foreground type can be calculated through the second loss function. According to the obtained second confidence level, the foreground classification result corresponding to the foreground target can be determined.
  • the above classification model and detection model can be two independent algorithm models
  • the classification model can be a Mobilenet algorithm model
  • the detection model can be an SSD algorithm model, which is not limited here.
  • FIG. 6 is a schematic diagram of a model for identifying an image foreground and background in another embodiment.
  • the recognition model is a neural network model.
  • the input layer of the neural network receives training images with image category labels, performs feature extraction through a basic network (such as a CNN network), and outputs the extracted image features.
  • the feature layer is used to perform category detection on the background training target to obtain a first loss function
  • the foreground training target is subjected to category detection to obtain a second loss function
  • the foreground training target is subjected to position detection based on the foreground area to obtain a position loss.
  • the neural network may be a convolutional neural network.
  • Convolutional neural networks include a data input layer, a convolutional calculation layer, an activation layer, a pooling layer, and a fully connected layer.
  • the data input layer is used to pre-process the original image data.
  • the pre-processing may include de-averaging, normalization, dimensionality reduction, and whitening processes.
  • De-averaging refers to centering all dimensions of the input data to 0 in order to pull the center of the sample back to the origin of the coordinate system.
  • Normalization is normalizing the amplitude to the same range.
  • Whitening refers to normalizing the amplitude on each characteristic axis of the data.
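  • A minimal NumPy sketch of the de-averaging and normalization steps described above (whitening is omitted here):

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """De-average and normalize raw image data channel by channel."""
    data = image.astype(np.float32)
    data -= data.mean(axis=(0, 1))        # de-averaging: center every dimension at 0
    data /= data.std(axis=(0, 1)) + 1e-8  # normalization: pull amplitudes to the same range
    return data
```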
  • the convolution calculation layer is used for local correlation and window sliding.
  • The weights of each filter connected to the data window in the convolution calculation layer are fixed.
  • Each filter focuses on an image feature, such as vertical edges, horizontal edges, colors, textures, etc., and these filters are combined to obtain the entire image.
  • a filter is a weight matrix.
  • a weight matrix can be used to convolve with data in different windows.
  • the activation layer is used to non-linearly map the output of the convolution layer.
  • the activation function used by the activation layer may be ReLU (Rectified Linear Unit).
  • the pooling layer can be sandwiched between consecutive convolutional layers to compress the amount of data and parameters and reduce overfitting.
  • the pooling layer can use the maximum method or average method to reduce the dimensionality of the data.
  • the fully connected layer is located at the tail of the convolutional neural network, and all neurons between the two layers are connected with weights.
  • a part of the convolutional neural network is cascaded to a first confidence output node, a part of the convolutional layer is cascaded to a second confidence output node, and a part of the convolutional layer is cascaded to a position output node.
  • the background type of the image can be detected according to the first confidence output node, the type of the foreground object of the image can be detected according to the second confidence output node, and the position corresponding to the foreground object can be detected according to the position output node.
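  • A minimal PyTorch sketch of such a network, with a shared convolutional base cascaded to the two confidence output nodes and the position output node; the layer sizes are assumptions for illustration, not the architecture of the patent:

```python
import torch
from torch import nn

class RecognitionModel(nn.Module):
    """Shared convolutional base cascaded to three output nodes: background
    confidence, foreground confidence, and foreground position."""

    def __init__(self, num_background_types: int, num_foreground_types: int):
        super().__init__()
        self.base = nn.Sequential(  # convolution + activation + pooling layers
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.background_head = nn.Linear(32, num_background_types)  # first confidence output node
        self.foreground_head = nn.Linear(32, num_foreground_types)  # second confidence output node
        self.position_head = nn.Linear(32, 4)                       # (x, y, w, h) position output node

    def forward(self, x: torch.Tensor):
        features = self.base(x)
        return (self.background_head(features),
                self.foreground_head(features),
                self.position_head(features))
```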
  • the classification model and the detection model may be stored in an electronic device in advance, and when an image to be processed is acquired, the image to be processed is identified through the classification model and the detection model.
  • the classification model and the detection model generally occupy the storage space of the electronic device, and when a large number of images are processed, the storage capacity requirements of the electronic device are also relatively high.
  • the image can be processed through the classification model and detection model stored locally on the terminal, or the image to be processed can be sent to the server for processing through the classification model and detection model stored on the server.
  • the server can send the trained classification model and detection model to the terminal after training the classification model and detection model, and the terminal does not need to train the above model.
  • the classification model and detection model stored in the terminal can be compressed models, so that the compressed model will occupy less resources, but the corresponding recognition accuracy will be lower.
  • the terminal can decide whether to perform the recognition processing locally on the terminal or the recognition processing on the server according to the number of images to be processed. After the terminal obtains the image to be processed, it counts the number of images of the image to be processed. If the number of images exceeds the preset upload number, the terminal uploads the image to be processed to the server and processes the image to be processed on the server. After processing by the server, the processing result is sent to the terminal.
  • FIG. 7 is a schematic diagram of generating an image classification label in one embodiment.
  • When the background region of the image is identified, the image classification labels that can be obtained include landscape, beach, snow, blue sky, green space, night scene, darkness, backlight, sunrise/sunset, indoor, fireworks, spotlight, etc. When the foreground object of the image is recognized, the available image classification labels include portrait, baby, cat, dog, food, etc.
  • When the area occupied by the foreground object is larger than 1/2 of the image, an image classification label is generated according to the recognition result of the foreground object; when the area occupied by the background region is larger than 1/2 of the image, an image classification label is generated according to the recognition result of the background region.
  • Operation 314: Classify the foreground types identified for each foreground object, and generate a corresponding image classification label according to each foreground type.
  • When generating an image classification label according to the foreground classification result, if it is determined from the foreground classification result that only one foreground type is included in the image to be processed, the image classification label may be directly generated according to that foreground type.
  • If it is determined from the foreground classification result that the image contains foreground objects of two or more foreground types, multi-level image classification labels can be generated according to the foreground types; that is, the obtained foreground types are classified, and a corresponding image classification label is generated according to each foreground type.
  • Specifically, an upper limit on the number of generated labels may be set.
  • When the number of foreground types does not exceed this upper limit, a classification label may be generated for each foreground type; when the number of foreground types exceeds this upper limit, classification labels are generated only for some of the foreground types.
  • the method further includes: counting the number of tags of the image classification tags; and if the number of tags exceeds the upper limit of the number, obtaining a target image classification tag from the foregoing image classification tags. The electronic device can mark the image according to the target image classification label.
  • the image may contain three foreground objects, and the corresponding foreground types are "human”, “dog”, and “cat”, respectively.
  • a corresponding image classification label is generated, which are "target-person”, “target-dog”, and “target-cat”.
  • The number of generated image classification labels is three. Assuming that the upper limit of the number is 2, the number of labels obtained above exceeds the upper limit. The target image classification tags, "target-person" and "target-dog", can then be determined from the above-mentioned image classification tags.
  • the total area of the foreground target corresponding to each image classification tag may be calculated, and the target image classification tag may be obtained from the image classification tags according to the total area.
  • the image classification label corresponding to the largest total area can be obtained as the target image classification label, or the image classification labels can be sorted according to the total area, and the target image classification label can be obtained from the sorted image classification labels.
  • the image classification label "Pic-people” can be generated directly based on the foreground type “people”. If the image contains target A, target B, and target C, and the corresponding foreground types are "human”, “cat", and “person”, respectively, the target A and target C corresponding to "people” can be calculated in the image The total area S 1 , the total area S 2 of the target “B” corresponding to “cat” in the image. If S 1 > S 2 , an image classification label will be generated according to the foreground type “person”; if S 1 ⁇ S 2 , an image classification label will be generated according to the foreground type “cat”.
  • In one embodiment, the number of foreground targets corresponding to each image classification tag may also be counted, and a target image classification tag may be obtained from the image classification tags according to the number of targets.
  • the image classification label corresponding to the largest number of targets can be obtained as the target image classification label, or the image classification labels can be sorted according to the number of targets, and the target image classification label can be obtained from the sorted image classification labels.
  • For example, the image to be processed contains target A, target B, target C, target D, target E, and target F, with corresponding foreground types "person", "dog", "person", "person", "cat", and "dog".
  • The foreground types corresponding to the image to be processed thus include "person", "dog", and "cat"; the image classification tags generated according to these foreground types are "target_person", "target_dog", and "target_cat"; and the corresponding numbers of foreground targets are 3, 2, and 1, respectively.
  • The first two image classification labels sorted by the number of targets, "target_person" and "target_dog", can be used as the target image classification labels.
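  • A sketch of this label-capping rule in Python, reproducing the example above; the label format "target_&lt;type&gt;" and the upper limit of 2 follow the example:

```python
from collections import Counter

LABEL_LIMIT = 2  # upper limit on the number of image classification labels

def target_labels(foreground_types: list[str]) -> list[str]:
    """Generate one label per foreground type and, if the number of labels
    exceeds the upper limit, keep the types with the most foreground targets.

    >>> target_labels(["person", "dog", "person", "person", "cat", "dog"])
    ['target_person', 'target_dog']
    """
    counts = Counter(foreground_types)  # number of targets per foreground type
    ranked = [t for t, _ in counts.most_common(LABEL_LIMIT)]
    return [f"target_{t}" for t in ranked]
```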
  • the operation of identifying the foreground target further includes:
  • Operation 802: Obtain depth data of each detected foreground object, where the depth data is used to represent the distance between the foreground object and the image acquisition device.
  • the depth data is used to indicate the distance between the foreground target and the image acquisition device. It can be considered that the closer the foreground target is to the image acquisition device, the more attention the user receives.
  • Depth data can be obtained by means of, but not limited to, structured light or dual-camera ranging.
  • an electronic device can obtain depth data corresponding to each pixel point in an image to be processed, that is, all pixel points included in the foreground target have corresponding depth data.
  • the depth data corresponding to the foreground target may be the depth data corresponding to any pixel in the foreground target, or the average value of the depth data corresponding to all the pixels included in the foreground target, which is not limited here.
  • Operation 804: Identify a foreground target whose depth data is less than a depth threshold.
  • Specifically, the foreground targets that need to be identified can be filtered by the depth data. Closer foreground targets can be considered the foreground targets that users are more concerned about. When the depth data is less than the depth threshold, the foreground target is considered one the user is more concerned about, and only foreground targets whose depth data is less than the depth threshold may be identified.
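  • A sketch of this depth-based filtering in Python, assuming per-pixel depth data and boolean masks for the detected foreground targets; the depth threshold is a hypothetical value, and the mean depth over a target's pixels is used (a single pixel's depth would also do, as noted below):

```python
import numpy as np

DEPTH_THRESHOLD = 2.0  # meters; a hypothetical value

def targets_to_identify(foreground_masks, depth_map):
    """Keep only the foreground targets whose depth data is below the threshold."""
    kept = []
    for mask in foreground_masks:
        # Mean depth over the target's pixels stands in for the target's depth data.
        if float(np.mean(depth_map[mask])) < DEPTH_THRESHOLD:
            kept.append(mask)
    return kept
```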
  • the operation of identifying the foreground target may further include: acquiring the detected target sharpness of each foreground target, and identifying the foreground target whose target sharpness is greater than the sharpness threshold.
  • multiple foreground targets may be detected from the image to be processed.
  • each foreground target can be identified separately to obtain the foreground type of each foreground target, or one or more of them can be selected for identification to obtain the foreground recognition result.
  • the electronic device After the electronic device detects the foreground target in the image to be processed, it can calculate the target sharpness corresponding to each foreground target.
  • the target sharpness can reflect the sharpness of textures such as the edge details of the foreground target, and can reflect the importance of each foreground object to a certain extent. Therefore, the foreground target for recognition can be obtained according to the target sharpness. For example, when shooting, the user will focus on the object of interest and blur the other objects. When identifying foreground objects, only foreground objects with higher definition can be identified, and foreground objects with lower definition are not identified.
  • the foreground target can include several pixels, and the sharpness of the foreground target can be calculated by calculating the gray difference of each pixel. Generally, the higher the sharpness, the greater the gray difference between pixels; the lower the sharpness, the smaller the gray difference between pixels.
  • Specifically, the target sharpness may be calculated according to algorithms such as, but not limited to, the Brenner gradient method, the Tenengrad gradient method, the Laplace gradient method, the variance method, and the energy gradient method.
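  • As one concrete instance, the Laplace gradient method can be sketched with OpenCV as the variance of the Laplacian response; the sharpness threshold is a hypothetical value to be tuned empirically:

```python
import cv2

SHARPNESS_THRESHOLD = 100.0  # hypothetical value, to be tuned empirically

def target_sharpness(foreground_region) -> float:
    """Variance of the Laplacian: larger gray-level differences between
    neighboring pixels yield a larger variance, i.e. a sharper target."""
    gray = cv2.cvtColor(foreground_region, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def should_identify(foreground_region) -> bool:
    # Identify only foreground targets whose sharpness exceeds the threshold.
    return target_sharpness(foreground_region) > SHARPNESS_THRESHOLD
```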
  • the electronic device After the electronic device detects the foreground target, it can assign a foreground identifier to each foreground target to distinguish different foreground targets. Then, the corresponding relationship between the foreground identifier and the foreground coordinate is established. Each foreground target can be marked by the foreground identifier, and the position of each foreground target in the image to be processed can be located by the foreground coordinate. The electronic device can extract the foreground target through the foreground coordinates and identify the extracted foreground target.
  • the sharpness threshold may be a preset fixed value or a dynamically changing value, which is not limited herein. For example, it may be a fixed value stored in the electronic device in advance, or a value input by a user and dynamically adjusted as required, or a value calculated according to the acquired target sharpness.
  • the foreground target can be identified according to the depth data and the target definition at the same time. Specifically, the detected target sharpness of each foreground target is obtained, and the depth data corresponding to the foreground target whose target sharpness is greater than the sharpness threshold is obtained; and the foreground target whose depth data is smaller than the depth threshold is identified.
  • The image processing method provided in the foregoing embodiment may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target.
  • If the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so the image classification label generated from the recognition result classifies the image more accurately.
  • Although the operations in the flowcharts of FIG. 2, FIG. 3, and FIG. 8 are displayed sequentially according to the arrows, these operations are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order in which these operations must be performed, and they may be performed in other orders. Moreover, at least some of the operations in FIG. 2, FIG. 3, and FIG. 8 may include multiple sub-operations or phases. These sub-operations or phases are not necessarily executed at the same time, but may be performed at different times, and their execution order is not necessarily sequential; they may be performed in turn or alternately with at least a part of the other operations, or of the sub-operations or phases of other operations.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment.
  • the image processing apparatus 900 includes an image acquisition module 902, a target detection module 904, a target recognition module 906, and an image classification module 908, wherein:
  • the image acquisition module 902 is configured to acquire an image to be processed.
  • a target detection module 904 is configured to perform target detection on the image to be processed, and obtain a foreground target in the image to be processed.
  • a target recognition module 906 is configured to identify the foreground target if the target area occupied by the foreground target in the image to be processed is greater than an area threshold.
  • An image classification module 908 is configured to generate an image classification label according to a recognition result of the foreground object.
  • The image processing apparatus may acquire an image to be processed, and perform target detection on the image to be processed to obtain a foreground target.
  • If the target area occupied by the foreground target is greater than the area threshold, the foreground target is identified, and an image classification label is generated based on the recognition result of the foreground target.
  • When the area occupied by the foreground target is relatively large, the foreground target can be identified more accurately, so the image classification label generated from the recognition result classifies the image more accurately.
  • the image acquisition module 902 is further configured to acquire an image set including at least one target image, and calculate the similarity between any two target images; classify the target images according to the similarity, where the similarity between any two target images of the same type is greater than a similarity threshold; and obtain one target image from each type of target image as the image to be processed.
  • the target recognition module 906 is further configured to: if two or more foreground objects are detected from the image to be processed, use the total area of all foreground objects included in the image to be processed as the target area; and if the target area is greater than an area threshold, identify the foreground targets.
  • the target recognition module 906 is further configured to obtain the target sharpness of each detected foreground target, and identify a foreground target whose target sharpness is greater than a sharpness threshold.
  • the target recognition module 906 is further configured to obtain detected depth data of each foreground target, where the depth data is used to represent the distance between the foreground target and the image acquisition device, and to identify the foreground targets whose depth data is less than a depth threshold.
  • the image classification module 908 is further configured to, if two or more foreground objects are detected from the to-be-processed image, classify the foreground types identified for each of the foreground objects, and generate a corresponding image classification label according to each foreground type.
  • the image classification module 908 is further configured to identify a background area other than the foreground target in the image to be processed if the target area is less than or equal to the area threshold, and to generate an image classification label according to the recognition result of the background area.
  • each module in the above image processing apparatus is for illustration only. In other embodiments, the image processing apparatus may be divided into different modules as needed to complete all or part of the functions of the above image processing apparatus.
  • An embodiment of the present application further provides a computer-readable storage medium.
  • One or more non-volatile computer-readable storage media contain computer-executable instructions that, when executed by one or more processors, cause the processors to perform the image processing method provided by the foregoing embodiments.
  • An embodiment of the present application further provides an electronic device.
  • the above electronic device includes an image processing circuit.
  • the image processing circuit may be implemented by hardware and / or software components, and may include various processing units that define an ISP (Image Signal Processing) pipeline.
  • FIG. 10 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 10, for convenience of explanation, only aspects of the image processing technology related to the embodiments of the present application are shown.
  • the image processing circuit includes an ISP processor 1040 and a control logic 1050.
  • The image data captured by the imaging device 1010 is first processed by an ISP processor 1040, which analyzes the image data to capture image statistics that can be used to determine one or more control parameters of the imaging device 1010.
  • the imaging device 1010 may include a camera having one or more lenses 1012 and an image sensor 1014.
  • the image sensor 1014 may include a color filter array (such as a Bayer filter).
  • the image sensor 1014 may obtain the light intensity and wavelength information captured by each imaging pixel of the image sensor 1014, and provide a set of raw images Image data.
  • the sensor 1020 may provide parameters (such as image stabilization parameters) of the acquired image processing to the ISP processor 1040 based on the interface type of the sensor 1020.
  • the sensor 1020 interface may use a SMIA (Standard Mobile Imaging Architecture) interface, other serial or parallel camera interfaces, or a combination of the foregoing interfaces.
  • the image sensor 1014 may also send the original image data to the sensor 1020, and the sensor 1020 may provide the original image data to the ISP processor 1040 based on the interface type of the sensor 1020, or the sensor 1020 stores the original image data in the image memory 1030.
  • the ISP processor 1040 processes the original image data pixel by pixel in a variety of formats.
  • each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 1040 may perform one or more image processing operations on the original image data and collect statistical information about the image data.
  • the image processing operations may be performed with the same or different bit depth accuracy.
  • the ISP processor 1040 may also receive image data from the image memory 1030.
  • the sensor 1020 interface sends the original image data to the image memory 1030, and the original image data in the image memory 1030 is then provided to the ISP processor 1040 for processing.
  • the image memory 1030 may be a part of a memory device, a storage device, or a separate dedicated memory in an electronic device, and may include a DMA (Direct Memory Access) feature.
  • the ISP processor 1040 may perform one or more image processing operations, such as time-domain filtering.
  • the processed image data may be sent to the image memory 1030 for further processing before being displayed.
  • the ISP processor 1040 receives processed data from the image memory 1030 and performs image data processing on the processed data in the raw domain and in the RGB and YCbCr color spaces.
  • the image data processed by the ISP processor 1040 may be output to a display 1070 for viewing by a user and / or further processed by a graphics engine or a GPU (Graphics Processing Unit).
  • the output of the ISP processor 1040 can also be sent to the image memory 1030, and the display 1070 can read image data from the image memory 1030.
  • the image memory 1030 may be configured to implement one or more frame buffers.
  • the output of the ISP processor 1040 may be sent to an encoder / decoder 1060 to encode / decode image data.
  • the encoded image data can be saved and decompressed before being displayed on the display 1070 device.
  • the encoder / decoder 1060 may be implemented by a CPU or a GPU or a coprocessor.
  • the statistical data determined by the ISP processor 1040 may be sent to the control logic unit 1050.
  • the statistical data may include image sensor 1014 statistical information such as auto exposure, auto white balance, auto focus, flicker detection, black level compensation, and lens 1012 shading correction.
  • the control logic 1050 may include a processor and/or a microcontroller that executes one or more routines (such as firmware). The one or more routines may determine control parameters of the imaging device 1010 and control parameters of the ISP processor 1040 based on the received statistical data.
  • control parameters of the imaging device 1010 may include sensor 1020 control parameters (such as gain, integration time for exposure control, and image stabilization parameters), camera flash control parameters, lens 1012 control parameters (such as focus distance for focusing or zooming), or a combination of these parameters.
  • ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (eg, during RGB processing), and lens 1012 shading correction parameters.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM), which is used as external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

An image processing method is disclosed, comprising: acquiring an image to be processed; performing target detection on the image to be processed to obtain a foreground target in the image to be processed; if a target area occupied by the foreground target in the image to be processed is greater than an area threshold, recognizing the foreground target; and generating an image classification label according to the recognition result of the foreground target.
PCT/CN2019/087590 2018-06-08 2019-05-20 Image processing method, computer-readable storage medium and electronic device WO2019233266A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810587091.1A CN108960290A (zh) 2018-06-08 2018-06-08 图像处理方法、装置、计算机可读存储介质和电子设备
CN201810587091.1 2018-06-08

Publications (1)

Publication Number Publication Date
WO2019233266A1 (fr)

Family

ID=64493527

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/087590 WO2019233266A1 (fr) 2018-06-08 2019-05-20 Image processing method, computer-readable storage medium and electronic device

Country Status (2)

Country Link
CN (1) CN108960290A (fr)
WO (1) WO2019233266A1 (fr)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960290A (zh) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 图像处理方法、装置、计算机可读存储介质和电子设备
CN111435447A (zh) * 2019-01-14 2020-07-21 珠海格力电器股份有限公司 识别留胚米的方法、装置和烹饪器具
CN110163810B (zh) * 2019-04-08 2023-04-25 腾讯科技(深圳)有限公司 一种图像处理方法、装置以及终端
CN110334635B (zh) * 2019-06-28 2021-08-31 Oppo广东移动通信有限公司 主体追踪方法、装置、电子设备和计算机可读存储介质
CN111210440B (zh) * 2019-12-31 2023-12-22 联想(北京)有限公司 皮肤对象的识别方法、装置和电子设备
CN111274426B (zh) * 2020-01-19 2023-09-12 深圳市商汤科技有限公司 类别标注方法及装置、电子设备和存储介质
CN113705285A (zh) * 2020-05-22 2021-11-26 珠海金山办公软件有限公司 主体识别方法、装置、及计算机可读存储介质
CN111738354A (zh) * 2020-07-20 2020-10-02 深圳市天和荣科技有限公司 一种自动识别训练方法、系统、存储介质及计算机设备
CN112560698B (zh) * 2020-12-18 2024-01-16 北京百度网讯科技有限公司 图像处理方法、装置、设备和介质
CN113283436B (zh) * 2021-06-11 2024-01-23 北京有竹居网络技术有限公司 图片处理方法、装置和电子设备
CN114220111B (zh) * 2021-12-22 2022-09-16 深圳市伊登软件有限公司 基于云平台的图文批量识别方法及系统
CN117372738A (zh) * 2022-07-01 2024-01-09 顺丰科技有限公司 目标物数量检测方法、装置、电子设备及存储介质
CN116563170B (zh) * 2023-07-10 2023-09-15 中国人民解放军空军特色医学中心 一种图像数据处理方法、系统以及电子设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110007366A1 (en) * 2009-07-10 2011-01-13 Palo Alto Research Center Incorporated System and method for classifying connected groups of foreground pixels in scanned document images according to the type of marking
CN103985114A (zh) * 2014-03-21 2014-08-13 南京大学 一种监控视频人物前景分割与分类的方法
CN107133352A (zh) * 2017-05-24 2017-09-05 北京小米移动软件有限公司 照片显示方法及装置
CN108960290A (zh) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 图像处理方法、装置、计算机可读存储介质和电子设备

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4808267B2 (ja) * 2009-05-27 2011-11-02 シャープ株式会社 画像処理装置、画像形成装置、画像処理方法、コンピュータプログラム及び記録媒体
CN102968802A (zh) * 2012-11-28 2013-03-13 无锡港湾网络科技有限公司 一种基于视频监控的运动目标分析跟踪方法及系统
CN103745230B (zh) * 2014-01-14 2017-05-10 四川大学 一种自适应群体异常行为分析方法
CN104658030B (zh) * 2015-02-05 2018-08-10 福建天晴数码有限公司 二次图像混合的方法和装置
CN105913082B (zh) * 2016-04-08 2020-11-27 北京邦视科技有限公司 一种对图像中目标进行分类的方法及系统
CN107657051B (zh) * 2017-10-16 2020-03-17 Oppo广东移动通信有限公司 一种图片标签的生成方法、终端设备及存储介质
CN108038491B (zh) * 2017-11-16 2020-12-11 深圳市华尊科技股份有限公司 一种图像分类方法及装置

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222419A (zh) * 2019-12-24 2020-06-02 深圳市优必选科技股份有限公司 一种物体识别方法、机器人以及计算机可读存储介质
CN111539962A (zh) * 2020-01-10 2020-08-14 济南浪潮高新科技投资发展有限公司 一种目标图像分类方法、装置以及介质
CN111833303A (zh) * 2020-06-05 2020-10-27 北京百度网讯科技有限公司 产品的检测方法、装置、电子设备及存储介质
CN111833303B (zh) * 2020-06-05 2023-07-25 北京百度网讯科技有限公司 产品的检测方法、装置、电子设备及存储介质
CN111797934A (zh) * 2020-07-10 2020-10-20 北京嘉楠捷思信息技术有限公司 路标识别方法及装置
CN112132206A (zh) * 2020-09-18 2020-12-25 青岛商汤科技有限公司 图像识别方法及相关模型的训练方法及相关装置、设备
CN112182272A (zh) * 2020-09-23 2021-01-05 创新奇智(成都)科技有限公司 图像检索方法及装置、电子设备、存储介质
CN112182272B (zh) * 2020-09-23 2023-07-28 创新奇智(成都)科技有限公司 图像检索方法及装置、电子设备、存储介质
CN113884504A (zh) * 2021-08-24 2022-01-04 湖南云眼智能装备有限公司 一种电容外观检测控制方法及装置
CN116468882A (zh) * 2022-01-07 2023-07-21 荣耀终端有限公司 图像处理方法、装置、设备、存储介质和程序产品
CN116468882B (zh) * 2022-01-07 2024-03-15 荣耀终端有限公司 图像处理方法、装置、设备、存储介质

Also Published As

Publication number Publication date
CN108960290A (zh) 2018-12-07

Similar Documents

Publication Publication Date Title
WO2019233266A1 (fr) Procédé de traitement d'image, support de stockage lisible par ordinateur et dispositif électronique
US10896323B2 (en) Method and device for image processing, computer readable storage medium, and electronic device
CN108764370B (zh) 图像处理方法、装置、计算机可读存储介质和计算机设备
WO2019233297A1 (fr) Procédé de construction d'un ensemble de données, terminal mobile et support de stockage lisible
US11138478B2 (en) Method and apparatus for training, classification model, mobile terminal, and readable storage medium
WO2019233394A1 (fr) Procédé et appareil de traitement d'image, support de stockage et dispositif électronique
CN108777815B (zh) 视频处理方法和装置、电子设备、计算机可读存储介质
WO2019233393A1 (fr) Procédé et appareil de traitement d'image, support de stockage et dispositif électronique
WO2020259179A1 (fr) Procédé de mise au point, dispositif électronique et support d'informations lisible par ordinateur
US20200412940A1 (en) Method and device for image processing, method for training object detection model
WO2020259264A1 (fr) Procédé de suivi d'un sujet, appareil électronique, et support d'enregistrement lisible par ordinateur
CN108897786B (zh) 应用程序的推荐方法、装置、存储介质及移动终端
WO2019237887A1 (fr) Procédé de traitement d'images, dispositif électronique et support d'informations lisible par ordinateur
CN108810418B (zh) 图像处理方法、装置、移动终端及计算机可读存储介质
CN108961302B (zh) 图像处理方法、装置、移动终端及计算机可读存储介质
CN101416219B (zh) 数字图像中的前景/背景分割
CN110580487A (zh) 神经网络的训练方法、构建方法、图像处理方法和装置
CN108984657B (zh) 图像推荐方法和装置、终端、可读存储介质
CN108765033B (zh) 广告信息推送方法和装置、存储介质、电子设备
WO2020001196A1 (fr) Procédé de traitement d'images, dispositif électronique et support d'informations lisible par ordinateur
CN108875619B (zh) 视频处理方法和装置、电子设备、计算机可读存储介质
WO2019233392A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique et support d'informations lisible par ordinateur
WO2019233271A1 (fr) Procédé de traitement d'image, support d'informations lisible par ordinateur et dispositif électronique
CN108241645B (zh) 图像处理方法及装置
CN108717530B (zh) 图像处理方法、装置、计算机可读存储介质和电子设备

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 19815183

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry into the European phase

Ref document number: 19815183

Country of ref document: EP

Kind code of ref document: A1