WO2019223513A1 - Image recognition method, electronic device and storage medium - Google Patents

Image recognition method, electronic device and storage medium

Info

Publication number
WO2019223513A1
WO2019223513A1 (PCT/CN2019/085499)
Authority
WO
WIPO (PCT)
Prior art keywords
image
decision tree
processed
category label
foreground
Prior art date
Application number
PCT/CN2019/085499
Other languages
English (en)
Chinese (zh)
Inventor
陈岩
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2019223513A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items

Definitions

  • The present application relates to the field of image processing technologies, and in particular, to an image recognition method, an electronic device, and a computer-readable storage medium.
  • According to embodiments of the present application, an image recognition method, an electronic device, and a computer-readable storage medium are provided.
  • An image recognition method includes: acquiring an image to be processed; inputting the image to be processed into a decision tree to obtain a category label of the image to be processed, the category label being a scene category of the image to be processed, and the decision tree being obtained by training on images with scene categories; and outputting the category label.
  • An electronic device includes a memory and a processor. The memory stores computer-readable instructions that, when executed by the processor, cause the processor to perform the above operations.
  • A computer-readable storage medium stores a computer program thereon. When the computer program is executed by a processor, the above operations are implemented.
  • In the embodiments of the present application, a to-be-processed image is obtained and input into a decision tree to obtain a category label of the to-be-processed image, and the category label, which is the scene category of the to-be-processed image, is output.
  • The decision tree is used to identify the scene category of the image, and the category label can be obtained simply by inputting the image into the decision tree, which improves the accuracy of image recognition.
  • FIG. 1 is a schematic structural diagram of an electronic device in one or more embodiments.
  • FIG. 2 is a flowchart of an image recognition method in one or more embodiments.
  • FIG. 3 is a flowchart of a method for generating a decision tree in one or more embodiments.
  • FIG. 4 is a flowchart of a method for obtaining a category label from a decision tree in one or more embodiments.
  • FIG. 5 is a flowchart of a method for obtaining a first category label from a foreground decision tree in one or more embodiments.
  • FIG. 6 is a flowchart of a method for obtaining a second category label from a background decision tree in one or more embodiments.
  • FIG. 7A is a schematic diagram of processing a foreground decision tree in one or more embodiments.
  • FIG. 7B is a schematic diagram of processing a background decision tree in one or more embodiments.
  • FIG. 8 is a flowchart of a method for outputting a category label of a preview image in one or more embodiments.
  • FIG. 9 is a structural block diagram of an image recognition apparatus in one or more embodiments.
  • FIG. 10 is a structural block diagram of an image recognition apparatus in other embodiments.
  • FIG. 11 is a schematic diagram of an image processing circuit in one or more embodiments.
  • The terms "first", "second", and the like used in this application can be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another element.
  • the first client may be referred to as the second client, and similarly, the second client may be referred to as the first client. Both the first client and the second client are clients, but they are not the same client.
  • As shown in FIG. 1, a schematic diagram of an internal structure of an electronic device is provided.
  • the electronic device includes a processor, a memory, a camera, and a network interface connected through a system bus.
  • the processor is used to provide computing and control capabilities to support the operation of the entire electronic device.
  • The memory is used to store data, programs, and/or instruction codes. At least one computer program is stored on the memory, and this computer program can be executed by the processor to implement the image recognition method applicable to electronic devices provided in the embodiments of the present application.
  • The memory may include a non-volatile storage medium, such as a magnetic disk, an optical disc, or a read-only memory (ROM), or it may include a random-access memory (RAM).
  • the memory includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and a computer program.
  • the computer program can be executed by a processor to implement an image recognition method provided by various embodiments of the present application.
  • the internal memory provides a cached operating environment for the operating system and computer programs in a non-volatile storage medium.
  • the camera can be used to capture images.
  • the network interface may be an Ethernet card or a wireless network card, and is used to communicate with external electronic devices, for example, it may be used to communicate with a server.
  • FIG. 1 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the electronic device to which the solution is applied.
  • A specific electronic device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • In one embodiment, an image recognition method is provided and described as applied to the above electronic device. As shown in FIG. 2, the method includes the following operations:
  • Operation 202 Acquire an image to be processed.
  • An image to be processed refers to an image whose content needs scene recognition.
  • the image to be processed may be an image acquired by the electronic device through a camera in real time, or an image stored locally in the electronic device in advance.
  • the electronic device can acquire the captured image to be processed in real time through the camera, and the electronic device can also acquire the image to be processed by selecting a locally stored image.
  • Operation 204 Input the image to be processed into a decision tree to obtain a category label of the image to be processed.
  • the category label is a scene category of the image to be processed.
  • the decision tree is obtained by training according to the image with the scene category.
  • A decision tree is a machine learning method for making decisions based on a tree structure.
  • a decision tree is a predictive model that represents a mapping relationship between object attributes and object values.
  • A decision tree can include a root node, multiple internal nodes, and multiple leaf nodes. The leaf nodes correspond to decision results, and every other node corresponds to an attribute test. The sample set contained in each node is divided into its child nodes according to the results of the attribute test, and the root node contains the entire sample set. The path from the root node to each leaf node corresponds to a decision test sequence.
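  • To make this structure concrete, the following minimal Python sketch (an illustration, not part of the patent) shows a tree whose internal nodes test an attribute and whose leaf nodes hold a category label; the attribute names and the two-level tree are assumptions made for the example:

```python
# Minimal sketch (not from the patent) of the tree structure described above:
# internal nodes test an attribute, leaf nodes hold the decision result, and
# prediction follows one root-to-leaf decision test sequence.

class LeafNode:
    def __init__(self, category_label):
        self.category_label = category_label  # decision result, e.g. "portrait"

class InternalNode:
    def __init__(self, attribute, children):
        self.attribute = attribute  # attribute tested at this node
        self.children = children    # maps each test outcome to a child node

def predict(node, sample):
    """Walk from the root to a leaf along the decision test sequence."""
    while isinstance(node, InternalNode):
        node = node.children[sample[node.attribute]]
    return node.category_label

# Hypothetical two-level tree: test "has_face" first, then "dominant_color".
tree = InternalNode("has_face", {
    True: LeafNode("portrait"),
    False: InternalNode("dominant_color", {
        "green": LeafNode("green grass"),
        "blue": LeafNode("blue sky"),
    }),
})
print(predict(tree, {"has_face": False, "dominant_color": "blue"}))  # blue sky
```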
  • the electronic device may input the image to be processed into a decision tree, and the decision tree may be a pre-established decision tree.
  • the scene category of the image may include portrait, cat, dog, baby, food, blue sky, green grass, beach, snow, etc.
  • Decision trees can be obtained by training images with scene categories.
  • the trained decision trees can predict the categories of images.
  • When the electronic device inputs the image to be processed into the trained decision tree, it can obtain the prediction result output by the decision tree, and this prediction result is the category label of the image to be processed.
  • the category tags obtained by the electronic device may be scene categories of images such as portraits, cats, dogs, babies, food, blue sky, green grass, beaches, and snow.
  • Operation 206 Output a category label.
  • After obtaining the prediction result of the decision tree, the electronic device can output the obtained prediction result, that is, output the category label of the image to be processed.
  • In this way, a category label of the image to be processed is obtained and output, where the category label is the scene category of the image to be processed, and the decision tree is obtained by training on images with scene categories.
  • the decision tree is used to identify the scene category of the image, and the category label of the image can be obtained only by inputting the image into the decision tree, which improves the accuracy of image recognition.
  • an image recognition method provided may further include a process of generating a decision tree, and specific operations include:
  • Operation 302 Extract at least two image features in the sample image.
  • The sample images are a portion of images randomly selected from all the images to be examined.
  • Image features are features that can distinguish different scene categories in an image.
  • the image feature may be a color feature or a shape feature.
  • the image features may be preset, and one feature represents a category of the image.
  • For example, a circle among the shape features may indicate that a category in the image is a portrait, green among the color features may indicate that a category in the image is green grass, and blue among the color features may indicate that a category in the image is a blue sky.
  • The electronic device may extract image features from the sample image, and it extracts at least two image features from the sample image.
  • the sample image is a photo of a group of people on the grassland collected by the camera. The photo includes portraits, blue sky, and green grass.
  • the electronic device can extract the shape features and color features of the sample image collected by the camera.
  • The shape feature extracted by the electronic device is circular, and the color features extracted are green and blue, indicating that the sample image collected by the camera includes a portrait, green grass, and a blue sky.
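  • As an illustration of such feature extraction (the patent does not specify a particular algorithm), the sketch below derives coarse color features with NumPy; the channel-dominance rule, the 15% threshold, and the feature names are assumptions made for the example, and a shape-feature detector is omitted for brevity:

```python
import numpy as np

def extract_color_features(image_rgb, threshold=0.15):
    """Coarse color features: flag colors that dominate the image.

    image_rgb: HxWx3 uint8 array. Returns a set such as {"green", "blue"}.
    The channel-dominance rule and the 15% threshold are illustrative only.
    """
    pixels = image_rgb.reshape(-1, 3).astype(np.float32)
    r, g, b = pixels[:, 0], pixels[:, 1], pixels[:, 2]
    features = set()
    if np.mean((g > r) & (g > b)) > threshold:
        features.add("green")   # may indicate green grass
    if np.mean((b > r) & (b > g)) > threshold:
        features.add("blue")    # may indicate a blue sky
    return features

# Synthetic sample: top half blue-ish "sky", bottom half green-ish "grass".
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[:50] = (60, 90, 200)
img[50:] = (40, 180, 60)
print(extract_color_features(img))  # {'green', 'blue'} (set order may vary)
```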
  • Operation 304 Calculate information parameter values when at least two image features are used as nodes.
  • the information parameter value includes one of information gain, information gain ratio, and Gini coefficient.
  • a node refers to a condition point for classifying attributes in a decision tree, and the node may include a root node, an internal node, and a leaf node.
  • The root node, also called the decision node, represents the best choice among several alternatives.
  • the internal node can also be called the state node.
  • the internal node of the decision tree represents the judgment condition.
  • The leaf node, also called the result node, represents the prediction result of the decision tree.
  • the electronic device can separately calculate the information parameter value when each extracted image feature is used as the root node, and the electronic device can also separately calculate the information parameter value when each extracted image feature is used as the internal node. It can be understood that the leaf node represents the prediction result of the decision tree, and the electronic device does not need to calculate the information parameter value when the image feature is used as the leaf node.
  • Operation 306 Use the image feature with the largest information parameter value as the node.
  • the electronic device may calculate an information gain, an information gain ratio, or a Gini coefficient when the image feature is used as a node.
  • The information gain is calculated for the image features one by one. It refers to the difference between the entropy of the set to be classified and the conditional entropy after a certain feature is selected; in other words, it is the information difference before and after the decision tree divides the samples on an attribute.
  • Entropy refers to the expected value of information, and conditional entropy refers to the entropy after conditioning on a selected feature.
  • The ID3 decision tree learning algorithm uses the information gain as the criterion to divide attributes or features.
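  • The following sketch (illustrative Python, not part of the patent) computes the entropy of a label set and the information gain of a feature exactly as defined above; the toy features "has_face" and "dominant_color" and the toy labels are assumptions:

```python
import math
from collections import Counter

def entropy(labels):
    """Expected value of information for a label list: H = -sum p*log2(p)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(samples, labels, feature):
    """Entropy of the set minus the conditional entropy after splitting on
    `feature`, i.e. the ID3 splitting criterion described above."""
    total = len(labels)
    by_value = {}
    for sample, label in zip(samples, labels):
        by_value.setdefault(sample[feature], []).append(label)
    conditional = sum(len(subset) / total * entropy(subset)
                      for subset in by_value.values())
    return entropy(labels) - conditional

# Toy data: does "has_face" predict the scene label better than "dominant_color"?
samples = [{"has_face": True, "dominant_color": "green"},
           {"has_face": True, "dominant_color": "blue"},
           {"has_face": False, "dominant_color": "green"},
           {"has_face": False, "dominant_color": "blue"}]
labels = ["portrait", "portrait", "green grass", "blue sky"]
print(information_gain(samples, labels, "has_face"))        # 1.0
print(information_gain(samples, labels, "dominant_color"))  # 0.5
```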
  • the information gain ratio refers to the ratio of the information gain to the entropy of a selected feature in the training data set.
  • the information gain is often biased to select attributes with many branches, which will lead to the problem of overfitting.
  • the information gain ratio can be used as a compensation measure to solve the problems of information gain.
  • the C4.5 decision tree learning algorithm uses the information gain ratio as a criterion to divide attributes or features.
  • The Gini coefficient indicates the probability that a randomly selected sample in a subset is misclassified. When the electronic device selects image features, the more mixed the image features, the larger the Gini coefficient.
  • The CART decision tree learning algorithm uses the Gini coefficient as the criterion to divide attributes or features.
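  • Continuing the same sketch, the information gain ratio and the Gini coefficient can be computed as alternative information parameter values; this block reuses entropy, information_gain, Counter, samples, and labels from the previous block:

```python
# Continuing the sketch above: the C4.5 gain ratio and the Gini coefficient
# as alternative "information parameter values" for the same split.

def gain_ratio(samples, labels, feature):
    """Information gain divided by the entropy of the feature's own value
    distribution (the split information), compensating the bias of raw
    information gain toward many-branched attributes."""
    split_info = entropy([sample[feature] for sample in samples])
    if split_info == 0:
        return 0.0
    return information_gain(samples, labels, feature) / split_info

def gini(labels):
    """Probability that a randomly drawn sample would be misclassified if
    labelled according to the subset's label distribution."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

print(gain_ratio(samples, labels, "has_face"))  # 1.0
print(gini(labels))                             # 0.625
```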
  • the electronic device can use the image feature with the largest information parameter value as the node. For example, when the electronic device selects the root node, taking the information gain as an example, the electronic device can respectively use multiple image features extracted from the sample image as the root node, and calculate the information gain when the multiple image features are used as the root node. Then, the image feature with the largest information gain is used as the root node. Understandably, the same method can be used to select internal nodes in the decision tree.
  • the information parameter values when at least two image features are used as nodes are calculated, and the image feature with the largest information parameter value is used as the node.
  • the established decision tree can be made more accurate, thereby greatly improving the accuracy of image recognition.
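  • A minimal sketch of this selection rule, reusing information_gain and the toy data from the sketches above:

```python
# Among candidate image features, take the one whose information parameter
# value (here, information gain) is largest as the node.

def choose_best_feature(samples, labels, features):
    return max(features, key=lambda f: information_gain(samples, labels, f))

print(choose_best_feature(samples, labels, ["has_face", "dominant_color"]))
# has_face -- so "has_face" would become the root node in this toy example.
```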
  • the provided image recognition method may further include a process of obtaining a category label from a decision tree.
  • the specific operations include:
  • Operation 402 Input the image to be processed into a decision tree, and the decision tree extracts image features in the image to be processed.
  • After the electronic device obtains the image to be processed, it can input the image into the decision tree for processing. Since the decision tree is obtained based on images with scene categories, the electronic device can obtain the image scene category prediction output by the decision tree.
  • the decision tree can extract image features in the image to be processed. For example, the decision tree can extract shape features, color features, and the like in the image to be processed. The decision tree can extract at least two image features in the image to be processed.
  • the decision tree calculates information parameter values when image features in the image to be processed are used as nodes.
  • The extracted image features can be taken as candidate root nodes, and the information parameter value of each image feature when used as the root node can be calculated.
  • the information parameter value calculated by the decision tree can be the information gain.
  • The decision tree can also take the extracted image features as candidate internal nodes and calculate the information parameter values of the image features when used as internal nodes.
  • the decision tree can compare the calculated information parameter value to obtain the image feature with the largest information parameter value. For example, the decision tree can extract the image features in the image to be processed, calculate the information parameter values when the image feature is the root node, and compare the calculated information parameter values to obtain the image feature with the largest information parameter value. At this time, the decision tree can take the image feature with the largest information parameter value as the root node. Then, after the root node of the decision tree is selected, the decision tree can calculate the information parameter value when the image feature that is not the root node is the internal node. There can be multiple internal nodes in a decision tree. After the decision tree calculates the image feature as the information parameter value of the internal node, the image feature with the largest information parameter value can be used as the first internal node, and so on, the decision tree can select multiple internal nodes.
  • After the root node and the internal nodes are selected, the decision tree is completely established. Once the electronic device inputs the image to be processed into the decision tree, the decision tree can predict the scene category of the image to be processed and output the category label.
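  • Putting the pieces together, the following sketch builds a small tree recursively in the ID3 style: each call takes the remaining feature with the largest information gain as the node and recurses into the child subsets. It is an illustration under the same toy assumptions, not the patent's exact procedure, and reuses LeafNode, InternalNode, predict, choose_best_feature, Counter, and the toy data from the sketches above:

```python
def build_tree(samples, labels, features):
    if len(set(labels)) == 1:                # pure subset -> leaf node
        return LeafNode(labels[0])
    if not features:                         # no tests left -> majority leaf
        return LeafNode(Counter(labels).most_common(1)[0][0])
    best = choose_best_feature(samples, labels, features)
    remaining = [f for f in features if f != best]
    children = {}
    for value in {s[best] for s in samples}:
        subset = [(s, l) for s, l in zip(samples, labels) if s[best] == value]
        sub_samples = [s for s, _ in subset]
        sub_labels = [l for _, l in subset]
        children[value] = build_tree(sub_samples, sub_labels, remaining)
    return InternalNode(best, children)

tree = build_tree(samples, labels, ["has_face", "dominant_color"])
print(predict(tree, {"has_face": False, "dominant_color": "green"}))  # green grass
```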
  • By inputting the image to be processed into the decision tree, the decision tree extracts the image features from the image, calculates the information parameter values when those image features are used as nodes, takes the image feature with the largest information parameter value as a node, and thereby obtains the category label output by the decision tree.
  • the decision tree calculates information parameter values when image features are used as nodes, selects each node accordingly, and then outputs the category label of the image to be processed, which improves the accuracy of decision tree prediction and thus the accuracy of image recognition.
  • an image recognition method provided may further include a process of obtaining a first category label from a foreground decision tree.
  • Specific operations include:
  • an image to be processed is input to a foreground decision tree, and the foreground decision tree extracts features of the foreground image in the image to be processed.
  • a trained decision tree can include a foreground decision tree.
  • the foreground image feature may be an image feature that distinguishes the image foreground scene category.
  • the foreground scene can include portraits, cats, dogs, babies, food, and text.
  • the electronic device can input the image to be processed into the trained foreground decision tree, and the foreground decision tree can extract features of the foreground image in the image to be processed.
  • the foreground image feature may be a shape feature or a color feature.
  • the foreground decision tree separately calculates information parameter values when the foreground image features in the image to be processed are used as nodes.
  • the foreground decision tree can calculate information parameter values when the foreground image features in the image to be processed are used as root nodes. Similarly, the foreground decision tree can also calculate the information parameter values when the foreground image features in the image to be processed are used as internal nodes.
  • A first category label output by the foreground decision tree, which takes the foreground image feature with the largest information parameter value as a node, is obtained.
  • the first category label is a foreground category label in the image to be processed, for example, a foreground category label such as portrait, cat, dog, baby, food, and text.
  • the foreground decision tree can select the root node and the internal node by comparing the size of the calculated information parameter values. After the root node and internal nodes of the foreground decision tree are selected, the foreground decision tree can process the to-be-processed images input by the electronic device, so as to obtain the first category label output from the foreground decision tree.
  • By inputting the to-be-processed image into the foreground decision tree, the foreground decision tree extracts the foreground image features from the image, separately calculates the information parameter value when each foreground image feature is used as a node, and takes the foreground image feature with the largest information parameter value as a node to output the first category label.
  • the first category label output by the foreground decision tree is the foreground scene in the image to be processed.
  • the foreground decision tree separates the foreground scene from the background scene of the image to be processed, which improves the accuracy of image recognition.
  • the provided image recognition method may further include a process of obtaining a second category label from the background decision tree.
  • the specific operations include:
  • the image to be processed is input to a background decision tree, and the background decision tree extracts features of the background image in the image to be processed.
  • the trained decision tree can also include a background decision tree.
  • the background image feature may be an image feature that distinguishes an image background scene category.
  • the background scene can include scene tags such as blue sky, green grass, landscape, indoor, beach, snow, fireworks, spotlight, sunset, and night scene.
  • the electronic device can input the image to be processed into the trained background decision tree, and the background decision tree can extract the characteristics of the background image in the image to be processed.
  • the background image feature may be a shape feature or a color feature.
  • the background decision tree separately calculates information parameter values when the background image features in the image to be processed are used as nodes.
  • The background decision tree can calculate the information parameter values when the background image features in the to-be-processed image are used as root nodes; similarly, it can also calculate the information parameter values when those features are used as internal nodes.
  • A second category label output by the background decision tree, which takes the background image feature with the largest information parameter value as a node, is obtained.
  • the second category label is a background category label in the image to be processed, for example, blue sky, green grass, landscape, indoor, beach, snow, fireworks, spotlight, sunset, and night scenery and other background category labels.
  • the background decision tree can select the root node and the internal node by comparing the size of the calculated information parameter values. After the root node and internal nodes of the background decision tree are selected, the background decision tree can process the to-be-processed images input by the electronic device, so as to obtain the second category label output from the background decision tree.
  • By inputting the to-be-processed image into the background decision tree, the background decision tree extracts the background image features from the image, separately calculates the information parameter values when they are used as nodes, and takes the background image feature with the largest information parameter value as a node to output the second category label.
  • the second category label output by the background decision tree is the background scene in the image to be processed.
  • the background decision tree separates the foreground scene from the background scene of the image to be processed, which improves the accuracy of image recognition.
  • As shown in FIG. 7A, a processing procedure of a foreground decision tree is provided.
  • the decision tree may include a foreground decision tree 710.
  • a rectangular box is used to represent the root node
  • a circle is used to represent the internal node
  • a triangle is used to represent the leaf node.
  • After the electronic device obtains the image to be processed, it can input the image to be processed into the foreground decision tree 710.
  • the foreground decision tree 710 can select the root node 712 and the internal node 714 by calculating the information parameter values of the image to be processed.
  • the to-be-processed image is input to the foreground decision tree 710 to obtain a foreground category label 716.
  • As shown in FIG. 7B, a processing procedure of a background decision tree is provided.
  • the decision tree may include a background decision tree 720.
  • a rectangular box is used to represent the root node
  • a circle is used to represent the internal node
  • a triangle is used to represent the leaf node.
  • After the electronic device inputs the to-be-processed image into the foreground decision tree 710 and the background decision tree 720, it can obtain a foreground category label 716 and a background category label 726 output simultaneously by the foreground decision tree 710 and the background decision tree 720, as shown in the sketch below.
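  • A minimal sketch of this dual-tree arrangement, reusing InternalNode, LeafNode, and predict from the earlier sketch; the two small trees and the feature names are illustrative assumptions:

```python
# The same image features are fed to a foreground decision tree and a
# background decision tree, and both labels are returned together.

def recognize_scene(image_features, foreground_tree, background_tree):
    """Return (first category label, second category label) for one image."""
    foreground_label = predict(foreground_tree, image_features)  # e.g. "portrait"
    background_label = predict(background_tree, image_features)  # e.g. "blue sky"
    return foreground_label, background_label

foreground_tree = InternalNode("has_face", {True: LeafNode("portrait"),
                                            False: LeafNode("food")})
background_tree = InternalNode("dominant_color", {"blue": LeafNode("blue sky"),
                                                  "green": LeafNode("green grass")})
print(recognize_scene({"has_face": True, "dominant_color": "blue"},
                      foreground_tree, background_tree))  # ('portrait', 'blue sky')
```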
  • the provided image recognition method may further include a process of outputting a category label of a preview image, and specific operations include:
  • Operation 802 Obtain a preview image collected through a camera.
  • the electronic device can control the camera to collect images, and when the camera is on, it can collect real-time preview images.
  • the electronic device can obtain the preview image collected through the camera.
  • the preview image can also include different image scenes.
  • the image scenes can be divided into foreground and background.
  • the foreground can include portraits, cats, dogs, babies, and food.
  • the background can include scenes such as blue sky, green grass, landscape, indoor, beach, snow, fireworks, spotlights, sunset and night.
  • Operation 804 Input the preview image into the decision tree to obtain a category label of the preview image.
  • The electronic device may input the preview image into the trained decision tree, which may include a foreground decision tree and a background decision tree. After the electronic device inputs the preview image into the foreground decision tree, it can obtain the foreground category label output by the foreground decision tree; after it inputs the preview image into the background decision tree, it can obtain the background category label output by the background decision tree.
  • Operation 806 Output a category label of the preview image.
  • the electronic device can simultaneously output the foreground category label output from the foreground decision tree and the background category label output from the background decision tree.
  • the preview image is input into the decision tree, the category label of the preview image is obtained, and the category label of the preview image is output.
  • the electronic device can identify the image scene category of the preview image through the decision tree, thereby improving the accuracy of image recognition.
  • In one embodiment, the image recognition method provided may further include a process of adjusting a photographing mode according to the category label, which specifically includes: acquiring the category label of the preview image, and adjusting the photographing mode according to the category label of the preview image.
  • the electronic device can adjust the photographing mode.
  • the photographing mode may include a portrait mode, a landscape mode, and a professional mode.
  • the electronic device can adjust the photographing mode according to the category label of the preview image. For example, when the electronic device obtains the scene category of the preview image whose category label is portrait and blue sky, the photographing mode may be adjusted to the portrait mode.
  • Adjusting the shooting mode according to the category label of the preview image can enhance the shooting effect of the image.
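  • The following sketch shows one way such an adjustment could be wired up; the label-to-mode mapping is an assumption made for illustration and is not specified by the patent, which only gives the portrait and blue sky example above:

```python
# Illustrative mapping from the preview image's category labels to a
# photographing mode. The label set and mode names follow the examples in
# the text; the mapping itself is an assumption.

MODE_BY_FOREGROUND = {
    "portrait": "portrait mode",
    "baby": "portrait mode",
}
MODE_BY_BACKGROUND = {
    "blue sky": "landscape mode",
    "beach": "landscape mode",
    "snow": "landscape mode",
}

def adjust_photographing_mode(foreground_label, background_label):
    """Foreground label takes priority, mirroring the portrait + blue sky
    example above; otherwise fall back to the background, then a default."""
    return (MODE_BY_FOREGROUND.get(foreground_label)
            or MODE_BY_BACKGROUND.get(background_label)
            or "professional mode")

print(adjust_photographing_mode("portrait", "blue sky"))  # portrait mode
print(adjust_photographing_mode("food", "beach"))         # landscape mode
```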
  • an image recognition method is provided, and specific operations for implementing the method are as follows:
  • the electronic device can extract at least two image features in the sample image.
  • The sample images are a portion of images randomly selected from all the images to be examined.
  • Image features are features that can distinguish different scene categories in an image.
  • the image feature may be a color feature or a shape feature.
  • the image feature may be preset, and one feature represents a category of the image.
  • For example, a circle among the shape features may indicate that a category in the image is a portrait, green among the color features may indicate that a category in the image is green grass, and blue among the color features may indicate that a category in the image is a blue sky.
  • The electronic device may extract image features from the sample image, and it extracts at least two image features from the sample image.
  • the electronic device may calculate information parameter values when at least two image features are used as nodes.
  • the electronic device can separately calculate the information parameter value when each extracted image feature is used as the root node, and the electronic device can also separately calculate the information parameter value when each extracted image feature is used as the internal node.
  • the leaf node represents the prediction result of the decision tree, and the electronic device does not need to calculate the information parameter value when the image feature is used as the leaf node.
  • the electronic device may also use the image feature with the largest information parameter value as the node.
  • the electronic device may calculate an information parameter value when the image feature is used as a node.
  • the electronic device may calculate an information gain, an information gain ratio, or a Gini coefficient when the image feature is used as a node.
  • the electronic device can then acquire the image to be processed.
  • An image to be processed refers to an image whose content needs scene recognition.
  • the image to be processed may be an image collected by the electronic device through a camera in real time, or an image stored locally in the electronic device in advance.
  • the electronic device can acquire the captured image to be processed in real time through the camera, and the electronic device can also acquire the image to be processed by selecting a locally stored image.
  • the electronic device may also input the image to be processed into a decision tree to obtain a category label of the image to be processed.
  • the category label is a scene category of the image to be processed.
  • the decision tree is obtained by training on the image with the scene category.
  • the electronic device may input the image to be processed into a decision tree, and the decision tree may be a pre-established decision tree.
  • the decision tree can be obtained based on images with scene categories, where the scene categories of the image can include portraits, cats, dogs, babies, food, blue sky, green grass, beach, snow, etc.
  • Decision trees can be obtained by training images with scene categories.
  • the trained decision trees can predict the categories of images.
  • When the electronic device inputs the image to be processed into the trained decision tree, it can obtain the prediction result output by the decision tree, and this prediction result is the category label of the image to be processed.
  • the category tags obtained by the electronic device may be scene categories of images such as portraits, cats, dogs, babies, food, blue sky, green grass, beaches, and snow.
  • the electronic device may also input the image to be processed into a decision tree, and the decision tree extracts image features in the image to be processed.
  • After the electronic device obtains the image to be processed, it can input the image into the decision tree for processing. Since the decision tree is obtained based on images with scene categories, the electronic device can obtain the image scene category prediction output by the decision tree.
  • After the electronic device inputs the image to be processed into the decision tree, the decision tree can extract image features from the image, for example, shape features and color features.
  • the decision tree can extract at least two image features in the image to be processed.
  • the decision tree can calculate the information parameter values when the image features in the image to be processed are used as nodes.
  • The decision tree takes the image feature with the largest information parameter value as a node and outputs the category label.
  • The electronic device can input the to-be-processed image into the foreground decision tree; the foreground decision tree extracts the foreground image features from the image and separately calculates the information parameter values when those features are used as nodes.
  • The electronic device can then obtain the first category label output by the foreground decision tree, which takes the foreground image feature with the largest information parameter value as a node.
  • the electronic device can input the image to be processed into the background decision tree.
  • the background decision tree extracts the characteristics of the background image in the image to be processed.
  • The background decision tree can calculate the information parameter values when the background image features in the image to be processed are used as nodes.
  • The electronic device can then obtain the second category label output by the background decision tree, which takes the background image feature with the largest information parameter value as a node.
  • the electronic device can then output a category label. After obtaining the prediction result of the decision tree, the electronic device can also output the obtained prediction result, that is, output the category label of the image to be processed.
  • the electronic device can also obtain preview images collected through the camera.
  • the electronic device can input the preview image into the decision tree, and the electronic device can obtain the category label of the preview image and output the category label of the preview image.
  • the electronic device can simultaneously output the foreground category label output from the foreground decision tree and the background category label output from the background decision tree.
  • the electronic device can adjust the photographing mode.
  • the photographing mode may include a portrait mode, a landscape mode, and a professional mode.
  • the electronic device can adjust the photographing mode according to the category label of the preview image.
  • FIG. 9 is a structural block diagram of an image recognition apparatus according to an embodiment. As shown in FIG. 9, the apparatus includes an image acquisition module 910, a label acquisition module 920, and a label output module 930, where:
  • the image acquisition module 910 is configured to acquire an image to be processed.
  • a label acquisition module 920 is configured to input an image to be processed into a decision tree to obtain a category label of the image to be processed, where the category label is a scene category of the image to be processed, and the decision tree is obtained by training on the image with the scene category.
  • the label output module 930 is configured to output a category label.
  • the provided image recognition device may further include a feature extraction module 940, a parameter value calculation module 950, and a node selection module 960, where:
  • a feature extraction module 940 is configured to extract at least two image features in a sample image.
  • a parameter value calculation module 950 is configured to calculate information parameter values when at least two image features are used as nodes.
  • The node selection module 960 is configured to use the image feature with the largest information parameter value as the node.
  • The label acquisition module 920 may be further configured to input the image to be processed into the decision tree; the decision tree extracts the image features from the image and separately calculates the information parameter values when those features are used as nodes, so as to obtain the category label output by the decision tree taking the image feature with the largest information parameter value as a node.
  • The label acquisition module 920 may be further configured to input the image to be processed into the foreground decision tree; the foreground decision tree extracts the foreground image features from the image and separately calculates the information parameter values when those features are used as nodes, so as to obtain the first category label output by the foreground decision tree taking the foreground image feature with the largest information parameter value as a node.
  • The label acquisition module 920 may also be configured to input the image to be processed into the background decision tree; the background decision tree extracts the background image features from the image and calculates the information parameter values when those features are used as nodes, so as to obtain the second category label output by the background decision tree taking the background image feature with the largest information parameter value as a node.
  • The image acquisition module 910 may also be used to acquire a preview image collected by a camera, the label acquisition module 920 may also be used to input the preview image into the decision tree to obtain a category label of the preview image, and the label output module 930 may also be used to output the category label of the preview image.
  • The label acquisition module 920 may be further configured to obtain the category label of the preview image and adjust the photographing mode according to the category label of the preview image.
  • The division of each module in the foregoing image recognition device is only used as an example; in other embodiments, the image recognition device may be divided into different modules as needed to complete all or part of its functions.
  • Each module in the above-mentioned image recognition device may be implemented in whole or in part by software, hardware, and a combination thereof.
  • The above modules may be embedded in hardware form in, or independent of, the processor in the computer device, or may be stored in software form in the memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
  • each module in the image recognition device provided in the embodiments of the present application may be in the form of a computer program.
  • the computer program can be run on a terminal or a server.
  • the program module constituted by the computer program can be stored in the memory of the terminal or server.
  • When the computer program is executed by a processor, the operations of the method described in the embodiments of the present application are implemented.
  • An embodiment of the present application further provides a computer-readable storage medium.
  • One or more non-transitory computer-readable storage media containing computer-executable instructions, which when executed by one or more processors, cause the processors to perform the operations of the image recognition method.
  • A computer program product containing instructions that, when run on a computer, cause the computer to perform an image recognition method.
  • An embodiment of the present application further provides an electronic device.
  • the above electronic device includes an image processing circuit.
  • the image processing circuit may be implemented by hardware and / or software components, and may include various processing units that define an ISP (Image Signal Processing) pipeline.
  • FIG. 11 is a schematic diagram of an image processing circuit in an embodiment. As shown in FIG. 11, for ease of description, only aspects of the image processing technology related to the embodiments of the present application are shown.
  • the image processing circuit includes an ISP processor 1140 and a control logic 1150.
  • The image data captured by the imaging device 1110 is first processed by the ISP processor 1140, which analyzes the image data to capture image statistical information that can be used to determine one or more control parameters of the ISP processor 1140 and/or the imaging device 1110.
  • the imaging device 1110 may include a camera having one or more lenses 1112 and an image sensor 1114.
  • the image sensor 1114 may include a color filter array (such as a Bayer filter).
  • The image sensor 1114 may obtain the light intensity and wavelength information captured by each imaging pixel of the image sensor 1114 and provide a set of raw image data.
  • The sensor 1120 (such as a gyroscope) may provide acquired image-processing parameters (such as image stabilization parameters) to the ISP processor 1140 based on the sensor 1120 interface type.
  • the sensor 1120 interface may use a SMIA (Standard Mobile Imaging Architecture) interface, other serial or parallel camera interfaces, or a combination of the foregoing interfaces.
  • the image sensor 1114 may also send the original image data to the sensor 1120.
  • the sensor 1120 may provide the original image data to the ISP processor 1140 based on the interface type of the sensor 1120, or the sensor 1120 stores the original image data in the image memory 1130.
  • the ISP processor 1140 processes the original image data pixel by pixel in a variety of formats.
  • each image pixel may have a bit depth of 8, 10, 12, or 14 bits, and the ISP processor 1140 may perform one or more image processing operations on the original image data and collect statistical information about the image data.
  • the image processing operations may be performed with the same or different bit depth accuracy.
  • the ISP processor 1140 may also receive image data from the image memory 1130.
  • the sensor 1120 interface sends the original image data to the image memory 1130, and the original image data in the image memory 1130 is then provided to the ISP processor 1140 for processing.
  • the image memory 1130 may be a part of a memory device, a storage device, or a separate dedicated memory in an electronic device, and may include a DMA (Direct Memory Access) feature.
  • the ISP processor 1140 may perform one or more image processing operations, such as time-domain filtering.
  • the processed image data may be sent to the image memory 1130 for further processing before being displayed.
  • the ISP processor 1140 receives processed data from the image memory 1130, and performs image data processing on the processed data in the original domain and in the RGB and YCbCr color spaces.
  • the image data processed by the ISP processor 1140 may be output to a display 1170 for viewing by a user and / or further processed by a graphics engine or a GPU (Graphics Processing Unit).
  • the output of the ISP processor 1140 can also be sent to the image memory 1130, and the display 1170 can read image data from the image memory 1130.
  • the image memory 1130 may be configured to implement one or more frame buffers.
  • the output of the ISP processor 1140 may be sent to an encoder / decoder 1160 to encode / decode image data.
  • the encoded image data can be saved and decompressed before being displayed on the display 1170 device.
  • the encoder / decoder 1160 may be implemented by a CPU or a GPU or a coprocessor.
  • the statistical data determined by the ISP processor 1140 may be sent to the control logic 1150 unit.
  • the statistical data may include statistical information of the image sensor 1114 such as auto exposure, auto white balance, auto focus, flicker detection, black level compensation, and lens 1112 shading correction.
  • The control logic 1150 may include a processor and/or a microcontroller that executes one or more routines (such as firmware). Based on the received statistical data, the one or more routines may determine control parameters of the imaging device 1110 and control parameters of the ISP processor 1140.
  • Control parameters of the imaging device 1110 may include sensor 1120 control parameters (such as gain and integration time for exposure control, and image stabilization parameters), camera flash control parameters, lens 1112 control parameters (such as focus distance for focusing or zooming), or a combination of these parameters.
  • the ISP control parameters may include gain levels and color correction matrices for automatic white balance and color adjustment (eg, during RGB processing), and lens 1112 shading correction parameters.
  • The image processing circuit in FIG. 11 can be used to implement the foregoing image recognition method.
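  • As a conceptual sketch of the feedback loop described above (not a hardware-accurate model; all names and the exposure rule are illustrative assumptions), the ISP processor produces statistics from a raw frame and the control logic turns those statistics into new control parameters:

```python
# Conceptual sketch (not hardware-accurate) of the statistics-to-parameters
# feedback loop described above. All names and rules are illustrative.

def isp_process(raw_frame):
    """Stand-in for the ISP pipeline: return processed pixels and statistics."""
    processed = list(raw_frame)  # e.g. filtering and color-space conversion
    stats = {"mean_luma": sum(raw_frame) / len(raw_frame)}
    return processed, stats

def control_logic(stats, target_luma=128):
    """Stand-in for control logic 1150: derive control parameters from stats."""
    gain = target_luma / max(stats["mean_luma"], 1)
    return {"sensor_gain": gain}  # would be fed back to the imaging device

frame = [90, 100, 110, 120]  # toy stand-in for raw image data
processed, stats = isp_process(frame)
print(stats, control_logic(stats))
```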
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM), which is used as external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image recognition method comprising: acquiring an image to be processed; inputting the image to be processed into a decision tree to obtain a category label for the image to be processed, the category label assigning a scene category to the image to be processed, and the decision tree being obtained by training on images comprising scene categories; and outputting the category label.
PCT/CN2019/085499 2018-05-21 2019-05-05 Image recognition method, electronic device and storage medium WO2019223513A1

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810488963.9 2018-05-21
CN201810488963.9A CN108764321B (zh) 2018-05-21 2018-05-21 图像识别方法和装置、电子设备、存储介质

Publications (1)

Publication Number Publication Date
WO2019223513A1 2019-11-28

Family

ID=64008487

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/085499 WO2019223513A1 2018-05-21 2019-05-05 Image recognition method, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN108764321B
WO (1) WO2019223513A1

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565815A (zh) * 2022-02-25 2022-05-31 包头市迪迦科技有限公司 一种基于三维模型的视频智能融合方法及系统
WO2022125236A1 (fr) * 2020-12-11 2022-06-16 Argo AI, LLC Systèmes et procédés de détection d'objet à l'aide d'informations de stéréo-vision

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764321B (zh) * 2018-05-21 2019-08-30 Oppo广东移动通信有限公司 图像识别方法和装置、电子设备、存储介质
CN110111310B (zh) * 2019-04-17 2021-03-05 广州思德医疗科技有限公司 一种评估标签图片的方法及装置
CN112085386A (zh) * 2020-09-09 2020-12-15 浙江连信科技有限公司 用于风险行为预测的模型训练方法和装置
CN112948608B (zh) * 2021-02-01 2023-08-22 北京百度网讯科技有限公司 图片查找方法、装置、电子设备及计算机可读存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101193206A (zh) * 2006-11-30 2008-06-04 华晶科技股份有限公司 依据使用者的拍照操作习性进行拍照参数自动调整的方法
CN102831447A (zh) * 2012-08-30 2012-12-19 北京理工大学 多类别面部表情高精度识别方法
CN106096542A (zh) * 2016-06-08 2016-11-09 中国科学院上海高等研究院 基于距离预测信息的图像视频场景识别方法
CN107690660A (zh) * 2016-12-21 2018-02-13 深圳前海达闼云端智能科技有限公司 图像识别方法及装置
CN108734214A (zh) * 2018-05-21 2018-11-02 Oppo广东移动通信有限公司 图像识别方法和装置、电子设备、存储介质
CN108764321A (zh) * 2018-05-21 2018-11-06 Oppo广东移动通信有限公司 图像识别方法和装置、电子设备、存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101620615B (zh) * 2009-08-04 2011-12-28 西南交通大学 一种基于决策树学习的自动图像标注与翻译的方法
CN103617427B (zh) * 2013-12-13 2016-06-29 首都师范大学 极化sar图像分类方法
CN107622281B (zh) * 2017-09-20 2021-02-05 Oppo广东移动通信有限公司 图像分类方法、装置、存储介质及移动终端

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101193206A (zh) * 2006-11-30 2008-06-04 华晶科技股份有限公司 依据使用者的拍照操作习性进行拍照参数自动调整的方法
CN102831447A (zh) * 2012-08-30 2012-12-19 北京理工大学 多类别面部表情高精度识别方法
CN106096542A (zh) * 2016-06-08 2016-11-09 中国科学院上海高等研究院 基于距离预测信息的图像视频场景识别方法
CN107690660A (zh) * 2016-12-21 2018-02-13 深圳前海达闼云端智能科技有限公司 图像识别方法及装置
CN108734214A (zh) * 2018-05-21 2018-11-02 Oppo广东移动通信有限公司 图像识别方法和装置、电子设备、存储介质
CN108764321A (zh) * 2018-05-21 2018-11-06 Oppo广东移动通信有限公司 图像识别方法和装置、电子设备、存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Let Decision Tree Tell You: Hello Kitty: Human or Cat?", 27 February 2016 (2016-02-27), Retrieved from the Internet <URL:http://wap.pig66.com/e-677145-1-1.html> *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022125236A1 (fr) * 2020-12-11 2022-06-16 Argo AI, LLC Systèmes et procédés de détection d'objet à l'aide d'informations de stéréo-vision
US11443147B2 (en) 2020-12-11 2022-09-13 Argo AI, LLC Systems and methods for object detection using stereovision information
US11645364B2 (en) 2020-12-11 2023-05-09 Argo AI, LLC Systems and methods for object detection using stereovision information
US12050661B2 (en) 2020-12-11 2024-07-30 Ford Global Technologies, Llc Systems and methods for object detection using stereovision information
CN114565815A (zh) * 2022-02-25 2022-05-31 包头市迪迦科技有限公司 一种基于三维模型的视频智能融合方法及系统
CN114565815B (zh) * 2022-02-25 2023-11-03 包头市迪迦科技有限公司 一种基于三维模型的视频智能融合方法及系统

Also Published As

Publication number Publication date
CN108764321A (zh) 2018-11-06
CN108764321B (zh) 2019-08-30

Similar Documents

Publication Publication Date Title
WO2019233393A1 (fr) Procédé et appareil de traitement d&#39;image, support de stockage et dispositif électronique
WO2019233394A1 (fr) Procédé et appareil de traitement d&#39;image, support de stockage et dispositif électronique
WO2019223513A1 (fr) Procédé de reconnaissance d&#39;image, dispositif électronique et support de stockage
WO2019233263A1 (fr) Procédé de traitement vidéo, dispositif électronique, et support d&#39;enregistrement lisible par ordinateur
CN108810418B (zh) 图像处理方法、装置、移动终端及计算机可读存储介质
US11138478B2 (en) Method and apparatus for training, classification model, mobile terminal, and readable storage medium
US10896323B2 (en) Method and device for image processing, computer readable storage medium, and electronic device
CN108764370B (zh) 图像处理方法、装置、计算机可读存储介质和计算机设备
WO2020001197A1 (fr) Procédé de traitement d&#39;images, dispositif électronique et support de stockage lisible par ordinateur
WO2019233262A1 (fr) Procédé de traitement vidéo, dispositif électronique, et support d&#39;informations lisible par ordinateur
CN108810413B (zh) 图像处理方法和装置、电子设备、计算机可读存储介质
WO2019233266A1 (fr) Procédé de traitement d&#39;image, support de stockage lisible par ordinateur et dispositif électronique
WO2019085792A1 (fr) Dispositif et procédé de traitement d&#39;image, support d&#39;informations lisible et dispositif électronique
WO2019233392A1 (fr) Procédé et appareil de traitement d&#39;image, dispositif électronique et support d&#39;informations lisible par ordinateur
CN110572573B (zh) 对焦方法和装置、电子设备、计算机可读存储介质
CN108024107B (zh) 图像处理方法、装置、电子设备及计算机可读存储介质
CN108961302B (zh) 图像处理方法、装置、移动终端及计算机可读存储介质
WO2019233271A1 (fr) Procédé de traitement d&#39;image, support d&#39;informations lisible par ordinateur et dispositif électronique
WO2020001196A1 (fr) Procédé de traitement d&#39;images, dispositif électronique et support d&#39;informations lisible par ordinateur
WO2019233260A1 (fr) Procédé et appareil d&#39;envoi d&#39;informations de publicité, support d&#39;informations, et dispositif électronique
CN108717530B (zh) 图像处理方法、装置、计算机可读存储介质和电子设备
WO2019114508A1 (fr) Procédé de traitement d&#39;image, appareil, support d&#39;informations lisible par ordinateur et dispositif électronique
CN108804658B (zh) 图像处理方法和装置、存储介质、电子设备
CN109712177B (zh) 图像处理方法、装置、电子设备和计算机可读存储介质
CN107862658B (zh) 图像处理方法、装置、计算机可读存储介质和电子设备

Legal Events

Code and description
121 (Ep): the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19807470; Country of ref document: EP; Kind code of ref document: A1.
NENP: Non-entry into the national phase. Ref country code: DE.
122 (Ep): PCT application non-entry in European phase. Ref document number: 19807470; Country of ref document: EP; Kind code of ref document: A1.