CN111275040A - Positioning method and device, electronic equipment and computer readable storage medium - Google Patents

Positioning method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN111275040A
CN111275040A
Authority
CN
China
Prior art keywords
pixel point
target
distance
information
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010058788.7A
Other languages
Chinese (zh)
Other versions
CN111275040B (en)
Inventor
战赓
欧阳万里
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Priority to CN202010058788.7A priority Critical patent/CN111275040B/en
Publication of CN111275040A publication Critical patent/CN111275040A/en
Priority to JP2022500616A priority patent/JP2022540101A/en
Priority to KR1020227018711A priority patent/KR20220093187A/en
Priority to PCT/CN2021/072210 priority patent/WO2021143865A1/en
Application granted granted Critical
Publication of CN111275040B publication Critical patent/CN111275040B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/12Bounding box

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a positioning method and device, electronic equipment and a computer-readable storage medium. Based on an image feature map of a target image, the method determines, for each pixel point in the target image, a single object anchor frame, namely the object frame corresponding to the object frame information, which reduces the number of object anchor frames used in the object positioning process and reduces the amount of calculation. Meanwhile, based on the image feature map of the target image, the object type information of the object to which each pixel point in the target image belongs, the confidence corresponding to the object frame information and the confidence corresponding to the object type information can be determined, and the final confidence corresponding to the object frame information is then determined based on the two determined confidences, so that the information expression capability of the object frame is effectively enhanced and the accuracy of object positioning based on the object frame is improved.

Description

Positioning method and device, electronic equipment and computer readable storage medium
Technical Field
The present disclosure relates to the field of computer technology and image processing, and in particular, to a positioning method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Object detection or object positioning is an important basic technology in computer vision, and is applied in particular to scenes such as instance segmentation, object tracking, pedestrian recognition and face recognition.
Object detection or object positioning is mostly realized by using object anchor frames. However, this approach suffers from defects such as a large amount of calculation and inaccurate positioning, caused by the large number of object anchor frames used in object positioning and the weak expression capability of the object anchor frames.
Disclosure of Invention
In view of the above, the present disclosure provides at least a positioning method and apparatus.
In a first aspect, the present disclosure provides a positioning method, including:
acquiring a target image;
determining object type information of an object to which each pixel point belongs, object frame information of the object to which each pixel point belongs, a first confidence degree corresponding to the object type information and a second confidence degree corresponding to the object frame information in the target image based on the image feature map of the target image;
respectively determining the target confidence of the object frame information of the object to which each pixel point belongs based on the first confidence and the second confidence;
and determining the positioning information of the object in the target image based on the object frame information of the object to which each pixel point belongs and the target confidence of the object frame information.
In the above embodiment, only one object anchor frame, that is, the object frame corresponding to the object frame information, needs to be determined for each pixel point in the target image based on the image feature map of the target image, so that the number of object anchor frames used in the object positioning process is reduced, the amount of calculation is reduced, and the efficiency of object positioning is improved. Meanwhile, based on the image feature map of the target image, the object type information of the object to which each pixel point in the target image belongs, the confidence degree corresponding to the object frame information and the confidence degree corresponding to the object type information can be determined, and the final confidence degree corresponding to the object frame information is then determined based on these two confidence degrees. The information expression capability of the object frame or the object frame information is thereby effectively enhanced: it can express the positioning information and the object type information of the object frame corresponding to the object frame information, as well as the confidence degree information of the object frame information, which improves the accuracy of object positioning based on the object frame.
In a possible implementation manner, the image feature map includes a classification feature map for classifying objects to which pixel points in the target image belong and a positioning feature map for positioning the objects to which pixel points in the target image belong;
the determining, based on the image feature map of the target image, object type information of an object to which each pixel point belongs, object border information of the object to which each pixel point belongs, a first confidence degree corresponding to the object type information, and a second confidence degree corresponding to the object border information in the target image includes:
determining object type information of an object to which each pixel point in the target image belongs and a first confidence corresponding to the object type information based on the classification feature map;
and determining the object frame information of the object to which each pixel point belongs in the target image and a second confidence corresponding to the object frame information based on the positioning feature map.
In the above embodiment, based on the classification feature map and the positioning feature map of the target image, not only the object frame information of the object to which each pixel point belongs in the target image is determined, but also the object type information of the object to which each pixel point belongs in the target image is determined, and the confidence degrees corresponding to the object type information and the object frame information, respectively, are determined, so that the information expression capability of the object frame is improved, and the accuracy of object positioning based on the object frame is improved.
In a possible implementation manner, the determining, based on the location feature map, object bounding box information of an object to which each pixel point in the target image belongs includes:
respectively determining a target distance range in which the distance between a pixel point and each frame in the object frames of the object to which the pixel point belongs is located based on the positioning feature map aiming at the pixel point in the target image;
respectively determining the target distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs based on the target distance range and the positioning feature map;
and determining the object frame information of the object to which the pixel point belongs based on the position information of the pixel point in the target image and the target distance between the pixel point and each frame.
In the above embodiment, the target distance range in which the distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs is located is determined first, and the target distance between the pixel point and each frame is then determined based on the determined target distance range; the accuracy of the determined target distance can be improved through these two steps of processing. Based on the accurate target distance, an object frame with an accurate position can then be determined for the pixel point, improving the accuracy of the determined object frame.
In a possible implementation manner, determining a target distance range in which the distance between a pixel point and each of the object borders of the object to which the pixel point belongs is located includes:
aiming at one border in the object borders of the object to which one pixel point in the target image belongs, determining the maximum distance between the pixel point and the border based on the positioning feature map;
carrying out segmentation processing on the maximum distance to obtain a plurality of distance ranges;
determining a first probability value that the distance between the pixel point and the bounding box is within each distance range based on the positioning feature map;
based on the determined first probability value, selecting a target distance range in which the distance between the pixel point and the frame is located from the plurality of distance ranges.
In the above embodiment, the distance range corresponding to the maximum probability value may be selected as the target distance range in which the distance between the pixel point and a certain frame is located, so that the accuracy of the determined target distance range is improved, and the accuracy of the distance between the pixel point and the certain frame determined based on the target distance range is improved.
In a possible embodiment, the selecting, based on the determined first probability value, a target distance range in which a distance between the pixel point and the bounding box is located from the plurality of distance ranges includes:
determining a distance uncertainty parameter value between the pixel point and the bounding box based on the positioning feature map;
determining a target probability value that the distance between the pixel point and the bounding box is within each distance range based on the distance uncertainty parameter value and each first probability value;
and taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the pixel point and the bounding box is located.
In the above embodiment, while the first probability value that the distance between the pixel point and the bounding box is within each distance range is determined, an uncertainty parameter value is also determined. The first probability values can be corrected based on this uncertainty parameter value to obtain the target probability value that the distance between the pixel point and the bounding box is within each distance range, which improves the accuracy of the probability values that the distance between the pixel point and the bounding box is within each distance range and is thereby beneficial to improving the accuracy of the target distance range determined based on these probability values.
In a possible implementation manner, determining a second confidence degree corresponding to the object bounding box information includes:
and determining a second confidence corresponding to the object frame information of the object to which the pixel point belongs based on a first probability value corresponding to a target distance range in which the distance between one pixel point in the target image and each frame in the object frame of the object to which the pixel point belongs is located.
In the above embodiment, the confidence of the object frame information of the object to which the pixel point belongs can be determined by using the maximum first probability value corresponding to the distance between the pixel point and each frame, so that the information expression capability of the object frame is enhanced.
In a possible implementation manner, the determining, based on the classification feature map, object type information of an object to which each pixel point in the target image belongs includes:
determining a second probability value of each preset object type of an object to which each pixel point in the target image belongs based on the classification feature map;
and determining the object type information of the object to which the pixel point belongs based on the preset object type corresponding to the maximum second probability value.
In the embodiment, the preset object type corresponding to the maximum probability value is selected as the object type information of the object to which the pixel point belongs, so that the accuracy of the determined object type information is improved.
In a possible implementation manner, the determining, based on the object border information of the object to which each pixel belongs and the target confidence of the object border information, the location information of the object in the target image includes:
screening a plurality of target pixel points from the target image; the distance between different target pixel points in the target image is smaller than a preset threshold value, and the object type information of the objects to which the different target pixel points belong is the same;
selecting, from the object frame information of the objects to which the target pixel points belong, the object frame information corresponding to the highest target confidence, to obtain target frame information;
and determining the positioning information of the object in the target image based on the selected target frame information and the target confidence corresponding to the target frame information.
According to the embodiment, the object frame information with the highest target confidence coefficient is selected from the pixel points which are relatively close and have the same object type information, so that the object is positioned, the number of the object frame information for positioning the object can be effectively reduced, and the timeliness of the object positioning is improved.
In a second aspect, the present disclosure provides a positioning device comprising:
the image acquisition module is used for acquiring a target image;
the image processing module is used for determining object type information of an object to which each pixel point belongs, object frame information of the object to which each pixel point belongs, a first confidence coefficient corresponding to the object type information and a second confidence coefficient corresponding to the object frame information in the target image based on the image feature map of the target image;
the confidence processing module is used for respectively determining the target confidence of the object frame information of the object to which each pixel point belongs based on the first confidence and the second confidence;
and the positioning module is used for determining the positioning information of the object in the target image based on the object frame information of the object to which each pixel point belongs and the target confidence of the object frame information.
In a possible implementation manner, the image feature map includes a classification feature map for classifying objects to which pixel points in the target image belong and a positioning feature map for positioning the objects to which pixel points in the target image belong;
the image processing module is configured to:
determining object type information of an object to which each pixel point in the target image belongs and a first confidence corresponding to the object type information based on the classification feature map;
and determining the object frame information of the object to which each pixel point belongs in the target image and a second confidence corresponding to the object frame information based on the positioning feature map.
In a possible implementation manner, when determining, based on the positioning feature map, object border information of an object to which each pixel point in the target image belongs, the image processing module is configured to:
respectively determining a target distance range in which the distance between a pixel point and each frame in the object frames of the object to which the pixel point belongs is located based on the positioning feature map aiming at the pixel point in the target image;
respectively determining the target distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs based on the target distance range and the positioning feature map;
and determining the object frame information of the object to which the pixel point belongs based on the position information of the pixel point in the target image and the target distance between the pixel point and each frame.
In a possible implementation manner, when determining a target distance range in which the distance between a pixel point and each of the object borders of the object to which the pixel point belongs is located, the image processing module is configured to:
aiming at one border in the object borders of the object to which one pixel point in the target image belongs, determining the maximum distance between the pixel point and the border based on the positioning feature map;
carrying out segmentation processing on the maximum distance to obtain a plurality of distance ranges;
determining a first probability value that the distance between the pixel point and the bounding box is within each distance range based on the positioning feature map;
based on the determined first probability value, selecting a target distance range in which the distance between the pixel point and the frame is located from the plurality of distance ranges.
In a possible embodiment, the image processing module, when selecting, from the plurality of distance ranges, a target distance range in which the distance between the pixel point and the bounding box is located, based on the determined first probability value, is configured to:
determining a distance uncertainty parameter value between the pixel point and the bounding box based on the positioning feature map;
determining a target probability value that the distance between the pixel point and the bounding box is within each distance range based on the distance uncertainty parameter value and each first probability value;
and taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the pixel point and the bounding box is located.
In a possible implementation manner, when determining the second confidence degree corresponding to the object bounding box information, the image processing module is configured to:
and determining a second confidence corresponding to the object frame information of the object to which the pixel point belongs based on a first probability value corresponding to a target distance range in which the distance between one pixel point in the target image and each frame in the object frame of the object to which the pixel point belongs is located.
In a possible implementation manner, when determining the object type information of the object to which each pixel point in the target image belongs based on the classification feature map, the image processing module is configured to:
determining a second probability value of each preset object type of an object to which each pixel point in the target image belongs based on the classification feature map;
and determining the object type information of the object to which the pixel point belongs based on the preset object type corresponding to the maximum second probability value.
In one possible embodiment, the positioning module is configured to:
screening a plurality of target pixel points from the target image; the distance between different target pixel points in the target image is smaller than a preset threshold value, and the object type information of the objects to which the different target pixel points belong is the same;
selecting, from the object frame information of the objects to which the target pixel points belong, the object frame information corresponding to the highest target confidence, to obtain target frame information;
and determining the positioning information of the object in the target image based on the selected target frame information and the target confidence corresponding to the target frame information.
In a third aspect, the present disclosure provides an electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the positioning method as described above.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the steps of the positioning method as described above.
The above-mentioned apparatus, electronic device, and computer-readable storage medium of the present disclosure at least include technical features substantially the same as or similar to technical features of any aspect or any implementation manner of any aspect of the above-mentioned method of the present disclosure, and therefore, for the description of the effects of the above-mentioned apparatus, electronic device, and computer-readable storage medium, reference may be made to the description of the effects of the above-mentioned method contents, which is not repeated herein.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present disclosure and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings may be obtained from the drawings without inventive effort.
Fig. 1 shows a flowchart of a positioning method provided by an embodiment of the present disclosure;
fig. 2 is a flow chart illustrating another positioning method provided by the embodiment of the present disclosure;
fig. 3 is a flowchart illustrating determining object border information of an object to which each pixel point in a target image belongs based on a positioning feature map in yet another positioning method according to an embodiment of the present disclosure;
FIG. 4 is a flowchart illustrating a further positioning method provided by the embodiment of the disclosure, wherein a target distance range in which a distance between a pixel point and a bounding box is located is selected from a plurality of distance ranges based on a determined first probability value;
fig. 5 is a flowchart illustrating a method for determining location information of an object in a target image according to object border information of an object to which each pixel belongs and a target confidence of the object border information in another location method provided by the embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a positioning device provided by an embodiment of the present disclosure;
fig. 7 shows a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it should be understood that the drawings in the present disclosure are for illustrative and descriptive purposes only and are not used to limit the scope of the present disclosure. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this disclosure illustrate operations implemented according to some embodiments of the present disclosure. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. In addition, one skilled in the art, under the direction of the present disclosure, may add one or more other operations to the flowchart, and may remove one or more operations from the flowchart.
In addition, the described embodiments are only a few embodiments of the present disclosure, not all embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
It is to be noted that the term "comprising" will be used in the disclosed embodiments to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
The disclosure provides a positioning method and device, an electronic device, and a computer-readable storage medium, in order to reduce the number of object anchor frames used for positioning and improve the information expression capability of the object anchor frames in the process of positioning an object by using the object anchor frames, so as to improve the accuracy of object positioning. According to the method and the device, only one object anchor frame, namely the object frame corresponding to the object frame information, is determined for each pixel point in the target image based on the image characteristic diagram of the target image, so that the number of the object anchor frames used in the object positioning process is reduced, and the calculation amount is reduced. Meanwhile, based on the image feature map of the target image, the object type information of the object to which each pixel point in the target image belongs, the confidence corresponding to the object frame information and the confidence corresponding to the object type information can be determined, and then the final confidence corresponding to the object frame information is determined based on the two determined confidences, so that the information expression capability of the object frame is effectively enhanced, and the accuracy of object positioning based on the object frame is improved.
The following describes the positioning method and apparatus, electronic device, and computer-readable storage medium according to the present disclosure with specific embodiments.
The embodiment of the disclosure provides a positioning method, which is applied to a terminal device for positioning an object in an image. Specifically, as shown in fig. 1, the positioning method provided by the embodiment of the present disclosure includes the following steps:
and S110, acquiring a target image.
Here, the target image may be an image including a target object captured in an object tracking process, or may be an image including a human face captured in human face detection, and the present disclosure does not limit the use of the target image.
The target image comprises at least one object to be positioned. The object may be an object, a human, an animal, or the like.
The target image may be captured by a terminal device executing the positioning method of the present embodiment, or may be captured by another device and then transmitted to the terminal device executing the positioning method of the present embodiment.
S120, determining object type information of an object to which each pixel point belongs, object frame information of the object to which each pixel point belongs, a first confidence degree corresponding to the object type information and a second confidence degree corresponding to the object frame information in the target image based on the image feature map of the target image.
Before this step is executed, the target image needs to be processed to obtain an image feature map corresponding to the target image. In specific implementation, the convolutional neural network can be used for extracting image features of the target image to obtain an image feature map.
After the image feature map of the target image is determined, the image feature map is processed, and the object type information of the object to which each pixel point belongs, the object frame information of the object to which each pixel point belongs, the first confidence degree corresponding to the object type information and the second confidence degree corresponding to the object frame information in the target image can be determined. In specific implementation, the convolutional neural network may be used to further extract image features from the image feature map to obtain the object type information, the object border information, the first confidence level, and the second confidence level.
The object type information includes an object type of an object to which the pixel point belongs. The object frame information includes a distance between the pixel point and each frame in the object frame corresponding to the object frame information. The object frame may be referred to as an object anchor frame.
The first confidence degree is used for representing the accuracy degree or the credibility degree of the object type information determined based on the image feature map. The second confidence level is used for representing the accuracy or credibility of the object frame information determined based on the image feature map.
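For illustration, the following is a minimal PyTorch-style sketch of per-pixel prediction heads that output class scores, the four border distances making up the object frame information, and a border-information confidence; all layer names, channel sizes and activation choices are assumptions and are not taken from the disclosure.
```python
import torch.nn as nn

class PixelwiseHeads(nn.Module):
    """Illustrative per-pixel prediction heads; names and sizes are assumptions."""
    def __init__(self, in_channels=256, num_classes=80):
        super().__init__()
        # per-pixel object type scores (the first confidence is read off these scores)
        self.cls_head = nn.Conv2d(in_channels, num_classes, kernel_size=3, padding=1)
        # per-pixel distances to the four borders (left, top, right, bottom)
        self.box_head = nn.Conv2d(in_channels, 4, kernel_size=3, padding=1)
        # per-pixel confidence of the predicted border information (the second confidence)
        self.box_conf_head = nn.Conv2d(in_channels, 1, kernel_size=3, padding=1)

    def forward(self, feat):
        cls_scores = self.cls_head(feat)               # [B, C, H, W]
        box_dists = self.box_head(feat).relu()         # [B, 4, H, W], non-negative distances
        box_conf = self.box_conf_head(feat).sigmoid()  # [B, 1, H, W]
        return cls_scores, box_dists, box_conf
```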
S130, respectively determining the target confidence of the object frame information of the object to which each pixel point belongs based on the first confidence and the second confidence.
Here, specifically, a product of the first confidence level and the second confidence level may be used as the target confidence level corresponding to the object border information. The target confidence is used for comprehensively representing the positioning accuracy and the classification accuracy of the object frame corresponding to the object frame information.
Of course, other methods may also be utilized to determine the target confidence, for example, the target confidence may be determined by combining the preset weight of the first confidence, the preset weight of the second confidence, the first confidence and the second confidence, and the present disclosure does not limit the specific implementation scheme for determining the target confidence based on the first confidence and the second confidence.
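As a minimal sketch of the two combination schemes mentioned above, assuming first_conf and second_conf already hold the per-pixel confidences; the weights w1 and w2 are illustrative preset values, not values given in the disclosure.
```python
# Product form: target confidence = first confidence * second confidence
target_conf = first_conf * second_conf

# Weighted form combining preset weights with the two confidences (weights are illustrative)
w1, w2 = 0.6, 0.4
target_conf_weighted = w1 * first_conf + w2 * second_conf
```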
S140, determining the positioning information of the object in the target image based on the object frame information of the object to which each pixel point belongs and the target confidence of the object frame information.
Here, the object frame information of the object to which the pixel point belongs and the target confidence of the object frame information may be used as the positioning information of the object to which the pixel point belongs in the target image, and then the positioning information of each object in the target image is determined based on the positioning information of the object to which each pixel point belongs in the target image.
The method and the device have the advantages that the object frame information of the object to which the pixel point belongs is determined, the target confidence coefficient of the object frame information is also determined, the information expression capacity of the object frame or the object frame information is effectively enhanced, the positioning information and the object type information of the object frame corresponding to the object frame information can be expressed, the confidence coefficient information of the object frame information can be expressed, and accordingly the accuracy of object positioning based on the object frame is improved.
In addition, the above embodiment can determine an object anchor frame, that is, an object frame corresponding to the object frame information, for each pixel point in the target image based on the image feature map of the target image, thereby reducing the number of object anchor frames used in the object positioning process, reducing the amount of calculation, and improving the efficiency of object positioning.
In some examples, as shown in fig. 2, the image feature map includes a classification feature map for classifying objects to which pixel points in the target image belong and a localization feature map for localizing the objects to which pixel points in the target image belong.
In a specific implementation, as shown in fig. 2, the classification feature map and the localization feature map may be obtained by first performing image feature extraction on the target image with a convolutional neural network to obtain an initial feature map, and then processing the initial feature map, for each of the two maps respectively, with four 3×3 convolutional layers having 256 input and output channels.
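A minimal sketch of the two branches described above, assuming PyTorch and a 256-channel initial feature map; the layer count and channel width follow the description, everything else is illustrative.
```python
import torch.nn as nn

def make_branch(channels=256, num_convs=4):
    """Four 3x3 conv layers with 256 input/output channels, as described above."""
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

cls_branch = make_branch()  # produces the classification feature map
loc_branch = make_branch()  # produces the localization (positioning) feature map
```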
After the classification feature map and the positioning feature map are obtained, the object type information of the object to which each pixel point belongs, the object border information of the object to which each pixel point belongs, the first confidence degree corresponding to the object type information, and the second confidence degree corresponding to the object border information in the target image are determined based on the image feature map of the target image, which can be specifically realized by the following steps:
determining object type information of an object to which each pixel point in the target image belongs and a first confidence corresponding to the object type information based on the classification feature map; and determining the object frame information of the object to which each pixel point belongs in the target image and a second confidence corresponding to the object frame information based on the positioning feature map.
In specific implementation, the classification feature map may be subjected to image feature extraction by using a convolutional neural network or a convolutional layer, so as to obtain object type information of an object to which each pixel point belongs and a first confidence corresponding to the object type information. And performing image feature extraction on the positioning feature map by using a convolutional neural network or a convolutional layer to obtain object frame information of an object to which each pixel point belongs and a second confidence corresponding to the object frame information.
In the embodiment, based on the classification feature map and the positioning feature map of the target image, not only the object frame information of the object to which each pixel point belongs in the target image is determined, but also the object type information of the object to which each pixel point belongs in the target image is determined, and the confidence degrees corresponding to the object type information and the object frame information respectively are determined, so that the information expression capability of the object frame is improved, and the accuracy of object positioning based on the object frame is improved.
In some embodiments, as shown in fig. 3, the determining, based on the positioning feature map, object border information of an object to which each pixel point in the target image belongs may specifically be implemented by using the following steps:
s310, aiming at one pixel point in the target image, respectively determining a target distance range in which the distance between the pixel point and each frame in the object frame of the object to which the pixel point belongs is located based on the positioning feature map.
Here, the positioning feature map may be subjected to image feature extraction by using a convolutional neural network or a convolutional layer, so as to determine a target distance range in which a distance between a pixel point and each of object frames of an object to which the pixel point belongs is located.
In specific implementation, the maximum distance between a pixel point and a certain border can be determined based on the positioning feature map; then, carrying out segmentation processing on the maximum distance to obtain a plurality of distance ranges; extracting image features of the positioning feature map by using a convolutional neural network or a convolutional layer to determine a first probability value that the distance between the pixel point and the frame of the edge is within each distance range; finally, based on the determined first probability value, a target distance range in which the distance between the pixel point and the frame is located is selected from the plurality of distance ranges. Specifically, the distance range corresponding to the maximum first probability value may be set as the target distance range.
As shown in fig. 2, the object frame may include an upper frame, a lower frame, a left frame and a right frame, five first probability values a, b, c, d, e corresponding to the left frame and five distance ranges are determined based on the above method, and the distance range corresponding to the largest first probability value b is selected as the target distance range.
In the above, the distance range corresponding to the maximum probability value is selected as the target distance range in which the distance between the pixel point and the frame is located, so that the accuracy of the determined target distance range is improved, and the accuracy of the distance between the pixel point determined based on the target distance range and a certain frame is improved.
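A minimal sketch of this bucketing step, assuming equal-width segmentation of the maximum distance and per-range scores predicted from the positioning feature map; the function and variable names are illustrative.
```python
import torch

def pick_target_range(max_dist, range_logits, num_ranges=5):
    """Select the target distance range for one border of one pixel point.

    max_dist: maximum distance between the pixel point and this border.
    range_logits: [num_ranges] scores predicted from the positioning feature map
    (converting them to probabilities with a softmax is an assumption).
    """
    edges = torch.linspace(0.0, float(max_dist), num_ranges + 1)  # segment the maximum distance
    probs = range_logits.softmax(dim=0)                           # first probability per range
    idx = int(probs.argmax())                                     # range with the largest probability
    return (edges[idx].item(), edges[idx + 1].item()), probs[idx].item()
```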
S320, respectively determining the target distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs based on the target distance range and the positioning feature map.
After the target distance range is determined, a regression network, such as a convolutional neural network, matched with the target distance range is selected, and image feature extraction is performed on the positioning feature map to obtain the target distance between the pixel point and each frame in the object frame of the object to which the pixel point belongs.
On the basis of determining the target distance range, the convolutional neural network is further utilized to determine an accurate distance, and the accuracy of the determined distance can be effectively improved.
In addition, as shown in fig. 2, after the target distance is determined, the determined target distance may be corrected by using a preset or trained parameter or weight N to obtain a final target distance.
As shown in fig. 2, the exact target distance between the pixel point and the left frame is determined by this step, and the target distance is labeled in fig. 2 and denoted by f. As shown in fig. 2, the determined target distance is within the determined target distance range.
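A sketch of the refinement step, under the assumption that the range-specific regression head predicts a normalized offset inside the selected range; the 'scale' argument stands in for the preset or trained correction weight mentioned above.
```python
def refine_distance(range_lo, range_hi, offset_pred, scale=1.0):
    """Regress the exact target distance inside the selected target distance range.

    offset_pred: assumed to lie in [0, 1], predicted by a regression network
    matched to the target distance range.
    """
    dist = range_lo + offset_pred * (range_hi - range_lo)
    return scale * dist  # optional correction with a preset or trained weight
```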
S330, determining the object frame information of the object to which the pixel point belongs based on the position information of the pixel point in the target image and the target distance between the pixel point and each frame.
The position information of each frame in the object frame corresponding to the object frame information in the target image can be determined by using the position information of the pixel point in the target image and the target distance between the pixel point and each frame. And finally, the position information of each frame in the target image can be used as the object frame information of the object to which the pixel point belongs.
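A minimal sketch of turning the pixel position and the four target distances into border positions (the coordinate convention is an assumption):
```python
def decode_box(px, py, d_left, d_top, d_right, d_bottom):
    """Convert per-pixel border distances into the positions of the four borders."""
    x1, y1 = px - d_left, py - d_top        # left and upper borders
    x2, y2 = px + d_right, py + d_bottom    # right and lower borders
    return x1, y1, x2, y2
```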
In the above embodiment, the target distance range where the distance between the pixel point and each frame in the object frame is located is first determined, and then the target distance between the pixel point and each frame is determined based on the determined target distance range. Then, based on the determined accurate target distance, an object frame with an accurate position can be determined for the pixel point, and the accuracy of the determined object frame is improved.
In some embodiments, as shown in fig. 4, the selecting a target distance range from the plurality of distance ranges, in which the distance between the pixel point and a bounding box is located, based on the determined first probability value may be further implemented by:
s410, determining the uncertain distance parameter value between the pixel point and a certain frame based on the positioning feature map.
Here, a convolutional neural network determining a first probability value may be utilized to determine a distance uncertainty parameter value of a pixel point from a bounding box while determining the first probability value that the distance of the pixel point from the bounding box lies within each distance range. The distance uncertainty parameter values here can be used to characterize the confidence level of the determined respective first probability.
And S420, determining a target probability value that the distance between the pixel point and the bounding box is within each distance range based on the distance uncertainty parameter value and each first probability value.
Here, each first probability value is corrected using the distance uncertainty parameter value to obtain a corresponding target probability value.
In particular implementation, the target probability value may be determined using the following formula:
p_{x,n} = exp(s_{x,n} / σ_x) / Σ_{m=1}^{N} exp(s_{x,m} / σ_x)
In the formula, p_{x,n} represents the target probability value that the distance between the pixel point and the border x is within the n-th distance range, N represents the number of distance ranges, σ_x represents the distance uncertainty parameter value corresponding to the border x, s_{x,n} represents the first probability value that the distance between the pixel point and the border x is within the n-th distance range, and s_{x,m} represents the first probability value that the distance between the pixel point and the border x is within the m-th distance range.
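A sketch of this correction, assuming the temperature-like rescaling of the first probability values by the distance uncertainty parameter shown above; the exact functional form is inferred from the symbol definitions rather than stated verbatim in the disclosure.
```python
import torch

def target_probs(first_probs, sigma_x):
    """first_probs: [N] first probability values s_{x,n} for one border x;
    sigma_x: distance uncertainty parameter value for that border."""
    return torch.softmax(first_probs / sigma_x, dim=-1)  # target probability values p_{x,n}
```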
And S430, based on the determined target probability value, selecting a target distance range in which the distance between the pixel point and the frame is located from the distance ranges.
Here, the distance range corresponding to the maximum target probability value may be specifically selected as the target distance range.
In the above embodiment, while the first probability value that the distance between the pixel point and the bounding box is within each distance range is determined, an uncertainty parameter value is also determined. The first probability values can be corrected based on this parameter value to obtain the target probability value that the distance between the pixel point and the bounding box is within each distance range, which improves the accuracy of the probability values that the distance between the pixel point and the bounding box is within each distance range, and is thereby beneficial to improving the accuracy of the target distance range determined based on these probability values.
After determining the target distance between the pixel point and each frame in the corresponding object frame, the confidence level of the corresponding object frame information, i.e. the second confidence level, may be determined by using the following steps:
and determining a second confidence corresponding to the object frame information of the object to which the pixel point belongs based on a first probability value corresponding to a target distance range in which the distance between one pixel point in the target image and each frame in the object frame of the object to which the pixel point belongs is located.
In a specific implementation, an average of first probability values corresponding to target distance ranges corresponding to all borders in an object border of an object to which the pixel point belongs may be used as the second confidence.
Of course, other methods may be used to determine the second confidence level, and the disclosure is not limited to the method of determining the second confidence level based on the first probability value corresponding to the target distance range.
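A minimal sketch of the averaging scheme described above, assuming the selected first probability value for each of the four borders is already available:
```python
def second_confidence(border_probs):
    """border_probs: the first probability values of the selected target distance
    range for the four borders (assumed to be a list of four floats)."""
    return sum(border_probs) / len(border_probs)
```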
In the above embodiment, the confidence of the object frame information of the object to which the pixel point belongs, that is, the second confidence, can be determined by using the maximum first probability value corresponding to the distance between the pixel point and each frame, so that the information expression capability of the object frame is enhanced.
In some embodiments, the determining the object type information of the object to which each pixel point in the target image belongs based on the classification feature map may specifically be implemented by using the following steps:
determining a second probability value of each preset object type of an object to which each pixel point in the target image belongs based on the classification feature map; and determining the object type information of the object to which the pixel point belongs based on the preset object type corresponding to the maximum second probability value.
In specific implementation, the convolutional neural network or convolutional layer can be used to extract the image features of the classification feature map, so as to obtain a second probability value that the object to which the pixel point belongs is of each preset object type. And then, selecting a preset object type corresponding to the maximum second probability value to determine the object type information of the object to which the pixel point belongs. As shown in fig. 2, the second probability value corresponding to the cat determined by the present embodiment is the largest, and therefore it is determined that the object type information corresponds to the cat.
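A minimal sketch of this per-pixel classification step; converting the class scores to probabilities with a softmax, and reading the first confidence off the maximum probability, are assumptions.
```python
import torch

def classify_pixel(class_logits):
    """class_logits: per-pixel scores over the preset object types (shape [C])."""
    probs = class_logits.softmax(dim=0)  # second probability value for each preset type
    conf, cls_id = probs.max(dim=0)      # the maximum gives the chosen type and the first confidence
    return int(cls_id), float(conf)
```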
In the embodiment, the preset object type corresponding to the maximum probability value is selected as the object type information of the object to which the pixel point belongs, so that the accuracy of the determined object type information is improved.
In some embodiments, as shown in fig. 5, the determining the positioning information of the object in the target image based on the object border information of the object to which each pixel belongs and the target confidence of the object border information may specifically be implemented by using the following steps:
s510, screening a plurality of target pixel points from the target image; and the distance between different target pixel points in the target image is smaller than a preset threshold, and the object type information of the objects to which the different target pixel points belong is the same.
Here, the plurality of target pixel points obtained by screening are pixel points on the same object.
S520, selecting, from the object frame information of the objects to which the target pixel points belong, the object frame information corresponding to the highest target confidence, to obtain the target frame information.
For the pixel points on the same object, the object frame information corresponding to the highest target confidence can be selected to position the object, and the other object frame information with lower target confidence can be removed, which reduces the amount of calculation in the object positioning process.
S530, determining the positioning information of the object in the target image based on the selected target frame information and the target confidence corresponding to the target frame information.
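A simplified, illustrative sketch of this selection: pixel points whose positions are closer than a preset threshold and that share the same object type are grouped, and only the box with the highest target confidence in each group is kept; the data layout and the grouping rule are assumptions.
```python
import torch

def select_boxes(boxes, target_confs, cls_ids, centers, dist_thresh=10.0):
    """boxes: list of (x1, y1, x2, y2); target_confs: [N] tensor; cls_ids: list of ints;
    centers: [N, 2] tensor of pixel positions."""
    order = torch.argsort(target_confs, descending=True)
    keep = []
    for i in order.tolist():
        suppressed = False
        for j in keep:
            same_cls = cls_ids[i] == cls_ids[j]
            close = torch.dist(centers[i], centers[j]).item() < dist_thresh
            if same_cls and close:
                suppressed = True  # a nearby, same-type box with higher target confidence exists
                break
        if not suppressed:
            keep.append(i)
    return [(boxes[i], float(target_confs[i])) for i in keep]
```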
According to the embodiment, the object frame information with the highest target confidence coefficient is selected from the pixel points which are relatively close and have the same object type information, so that the object is positioned, the number of the object frame information for positioning the object can be effectively reduced, and the timeliness of the object positioning is improved.
Corresponding to the above positioning method, the embodiment of the present disclosure further provides a positioning apparatus, where the apparatus is used on a terminal device for positioning an object in an image, and the apparatus and each module thereof can perform the same method steps as the above positioning method and can achieve the same or similar beneficial effects, so repeated parts are not described again.
As shown in fig. 6, the present disclosure provides a positioning device including:
and an image obtaining module 610, configured to obtain a target image.
An image processing module 620, configured to determine, based on the image feature map of the target image, object type information of an object to which each pixel point belongs, object border information of the object to which each pixel point belongs, a first confidence corresponding to the object type information, and a second confidence corresponding to the object border information in the target image.
A confidence processing module 630, configured to determine, based on the first confidence and the second confidence, a target confidence of the object border information of the object to which each pixel belongs, respectively.
The positioning module 640 is configured to determine positioning information of an object in the target image based on object border information of an object to which each pixel belongs and a target confidence of the object border information.
In some embodiments, the image feature map includes a classification feature map for classifying objects to which pixel points in the target image belong and a localization feature map for localizing objects to which pixel points in the target image belong;
the image processing module 620 is configured to:
determining object type information of an object to which each pixel point in the target image belongs and a first confidence corresponding to the object type information based on the classification feature map;
and determining the object frame information of the object to which each pixel point belongs in the target image and a second confidence corresponding to the object frame information based on the positioning feature map.
In some embodiments, the image processing module 620, when determining the object border information of the object to which each pixel point in the target image belongs based on the positioning feature map, is configured to:
respectively determining a target distance range in which the distance between a pixel point and each frame in the object frames of the object to which the pixel point belongs is located based on the positioning feature map aiming at the pixel point in the target image;
respectively determining the target distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs based on the target distance range and the positioning feature map;
and determining the object frame information of the object to which the pixel point belongs based on the position information of the pixel point in the target image and the target distance between the pixel point and each frame.
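As an illustrative sketch (not taken from the disclosure), the conversion from a pixel point's position and its per-border target distances to object frame information could look like this, assuming the four borders are parameterized as left/top/right/bottom distances from the pixel point:

```python
def distances_to_box(px, py, left, top, right, bottom):
    """Combine a pixel point's position with its target distance to each border
    to obtain object frame information as (x1, y1, x2, y2).
    The left/top/right/bottom parameterization is an assumption."""
    return (px - left, py - top, px + right, py + bottom)

# A pixel point at (120, 80) that lies 30 px from the left border, 20 px from the
# top, 50 px from the right and 40 px from the bottom of its object.
box = distances_to_box(120, 80, 30, 20, 50, 40)   # -> (90, 60, 170, 120)
```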
In some embodiments, the image processing module 620, when determining a target distance range in which the distance between a pixel point and each border of the object to which the pixel point belongs is located, is configured to:
for one border among the object borders of the object to which a pixel point in the target image belongs, determining the maximum distance between the pixel point and the border based on the positioning feature map;
segmenting the maximum distance into a plurality of distance ranges;
determining a first probability value that the distance between the pixel point and the bounding box is within each distance range based on the positioning feature map;
based on the determined first probability value, selecting a target distance range in which the distance between the pixel point and the frame is located from the plurality of distance ranges.
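A small Python sketch of these steps might look as follows; equal-width segmentation of the maximum distance and a softmax over per-range scores are assumptions, since the disclosure fixes neither choice.

```python
import numpy as np

def pick_target_range(max_dist, range_scores):
    """Segment [0, max_dist] into equal-width distance ranges, turn the scores
    read from the positioning feature map into first probability values, and
    select the range with the highest probability as the target distance range."""
    num_ranges = len(range_scores)
    edges = np.linspace(0.0, max_dist, num_ranges + 1)
    first_prob = np.exp(range_scores - np.max(range_scores))
    first_prob /= first_prob.sum()                 # first probability values per range
    k = int(np.argmax(first_prob))
    return (edges[k], edges[k + 1]), first_prob

target_range, probs = pick_target_range(
    64.0, np.array([0.1, 2.0, 0.3, 1.2, -0.5, 0.0, 0.4, -1.0]))  # -> range (8.0, 16.0)
```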
In some embodiments, the image processing module 620, when selecting the target distance range from the plurality of distance ranges in which the distance between the pixel point and the bounding box is located, based on the determined first probability value, is configured to:
determining a distance uncertainty parameter value of the distance between the pixel point and the border based on the positioning feature map;
determining a target probability value that the distance between the pixel point and the bounding box is within each distance range based on the distance uncertainty parameter value and each first probability value;
and taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the pixel point and the border is located.
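One way to read this combination, sketched below under stated assumptions, is to treat the distance uncertainty parameter value as the width of a smoothing kernel over neighbouring distance ranges; the Gaussian form is purely an assumption, as the disclosure does not specify how the uncertainty value and the first probability values are combined.

```python
import numpy as np

def target_probability_values(first_prob, sigma):
    """Combine the first probability values with a distance uncertainty parameter
    value sigma by spreading probability mass across neighbouring distance ranges
    (assumed Gaussian smoothing), then renormalizing."""
    idx = np.arange(len(first_prob), dtype=float)
    kernel = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / max(sigma, 1e-6)) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)
    target_prob = kernel @ first_prob
    return target_prob / target_prob.sum()

first_prob = np.array([0.05, 0.40, 0.35, 0.20])
target_prob = target_probability_values(first_prob, sigma=1.0)
best_range = int(target_prob.argmax())   # index of the selected target distance range
```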
In some embodiments, the image processing module 620, when determining the second confidence level corresponding to the object bounding box information, is configured to:
and determining, for a pixel point in the target image, a second confidence corresponding to the object frame information of the object to which the pixel point belongs, based on the first probability values corresponding to the target distance ranges in which the distances between the pixel point and the respective borders of the object are located.
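A hedged sketch of this derivation is shown below; averaging the four per-border maxima is only one plausible aggregation, since the disclosure states which probabilities are used but not how they are combined.

```python
import numpy as np

def second_confidence(per_border_first_prob):
    """Derive the second confidence of a pixel point's object frame information
    from the first probability values over the distance ranges of its
    left/top/right/bottom borders (assumed aggregation: mean of per-border maxima)."""
    return float(per_border_first_prob.max(axis=1).mean())

probs = np.array([[0.7, 0.2, 0.1],
                  [0.5, 0.4, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.8, 0.1, 0.1]])
conf2 = second_confidence(probs)   # 0.65
```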
In some embodiments, the image processing module 620, when determining the object type information of the object to which each pixel point in the target image belongs based on the classification feature map, is configured to:
determining a second probability value of each preset object type of an object to which each pixel point in the target image belongs based on the classification feature map;
and determining the object type information of the object to which the pixel point belongs based on the preset object type corresponding to the maximum second probability value.
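The classification branch can be sketched as follows; using a softmax to obtain the second probability values and taking the maximum as the first confidence are assumptions made for illustration.

```python
import numpy as np

def classify_pixel(class_scores):
    """Turn the classification feature map's scores at one pixel point into
    object type information and its first confidence."""
    probs = np.exp(class_scores - np.max(class_scores))
    probs /= probs.sum()                      # second probability values per preset object type
    object_type = int(np.argmax(probs))       # preset type with the maximum second probability value
    first_conf = float(probs[object_type])    # assumed first confidence for that type
    return object_type, first_conf

object_type, first_conf = classify_pixel(np.array([0.2, 2.3, -1.0, 0.7]))
```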
In some embodiments, the positioning module 640 is configured to:
screening a plurality of target pixel points from the target image; the distance between different target pixel points in the target image is smaller than a preset threshold value, and the object type information of the objects to which the different target pixel points belong is the same;
selecting the object frame information corresponding to the highest target confidence from the object frame information of the objects to which the target pixel points belong, so as to obtain target frame information;
and determining the positioning information of the object in the target image based on the selected target frame information and the target confidence corresponding to the target frame information.
An embodiment of the present disclosure discloses an electronic device, as shown in fig. 7, including: a processor 701, a memory 702, and a bus 703, the memory 702 storing machine-readable instructions executable by the processor 701, the processor 701 and the memory 702 communicating via the bus 703 when the electronic device is operating.
The machine readable instructions, when executed by the processor 701, perform the steps of the following positioning method:
acquiring a target image;
determining object type information of an object to which each pixel point belongs, object frame information of the object to which each pixel point belongs, a first confidence degree corresponding to the object type information and a second confidence degree corresponding to the object frame information in the target image based on the image feature map of the target image;
respectively determining the target confidence of the object frame information of the object to which each pixel point belongs based on the first confidence and the second confidence;
and determining the positioning information of the object in the target image based on the object frame information of the object to which each pixel point belongs and the target confidence of the object frame information.
In addition, when the processor 701 executes the machine readable instructions, the method contents in any embodiment described in the above method part may also be executed, which is not described herein again.
A computer program product corresponding to the method and the apparatus provided in the embodiments of the present disclosure includes a computer readable storage medium storing a program code, where instructions included in the program code may be used to execute the method in the foregoing method embodiments, and specific implementation may refer to the method embodiments, which is not described herein again.
The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar parts may be referred to one another, which are not repeated herein for brevity.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the method embodiments, and are not described in detail in this disclosure.

In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative; for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation; for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted or not executed.

In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above are only specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present disclosure, and shall be covered by the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (18)

1. A method of positioning, comprising:
acquiring a target image;
determining object type information of an object to which each pixel point belongs, object frame information of the object to which each pixel point belongs, a first confidence degree corresponding to the object type information and a second confidence degree corresponding to the object frame information in the target image based on the image feature map of the target image;
respectively determining the target confidence of the object frame information of the object to which each pixel point belongs based on the first confidence and the second confidence;
and determining the positioning information of the object in the target image based on the object frame information of the object to which each pixel point belongs and the target confidence of the object frame information.
2. The method according to claim 1, wherein the image feature map comprises a classification feature map for classifying the objects to which the pixels belong in the target image and a localization feature map for localizing the objects to which the pixels belong in the target image;
the determining, based on the image feature map of the target image, object type information of an object to which each pixel point belongs, object border information of the object to which each pixel point belongs, a first confidence degree corresponding to the object type information, and a second confidence degree corresponding to the object border information in the target image includes:
determining object type information of an object to which each pixel point in the target image belongs and a first confidence corresponding to the object type information based on the classification feature map;
and determining the object frame information of the object to which each pixel point belongs in the target image and a second confidence corresponding to the object frame information based on the positioning feature map.
3. The method according to claim 2, wherein the determining, based on the positioning feature map, the object frame information of the object to which each pixel point in the target image belongs includes:
for a pixel point in the target image, respectively determining, based on the positioning feature map, a target distance range in which the distance between the pixel point and each border of the object to which the pixel point belongs is located;
respectively determining the target distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs based on the target distance range and the positioning feature map;
and determining the object frame information of the object to which the pixel point belongs based on the position information of the pixel point in the target image and the target distance between the pixel point and each frame.
4. The method according to claim 3, wherein the determining a target distance range in which the distance between a pixel point and each border of the object to which the pixel point belongs is located includes:
for one border among the object borders of the object to which a pixel point in the target image belongs, determining the maximum distance between the pixel point and the border based on the positioning feature map;
segmenting the maximum distance into a plurality of distance ranges;
determining a first probability value that the distance between the pixel point and the bounding box is within each distance range based on the positioning feature map;
based on the determined first probability value, selecting a target distance range in which the distance between the pixel point and the frame is located from the plurality of distance ranges.
5. The method according to claim 4, wherein selecting a target distance range from the plurality of distance ranges in which the distance between the pixel point and the bounding box is located based on the determined first probability value comprises:
determining a distance uncertainty parameter value of the distance between the pixel point and the border based on the positioning feature map;
determining a target probability value that the distance between the pixel point and the bounding box is within each distance range based on the distance uncertainty parameter value and each first probability value;
and taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the pixel point and the border is located.
6. The method according to claim 4, wherein determining the second confidence level corresponding to the object bounding box information comprises:
and determining, for a pixel point in the target image, a second confidence corresponding to the object frame information of the object to which the pixel point belongs, based on the first probability values corresponding to the target distance ranges in which the distances between the pixel point and the respective borders of the object are located.
7. The method according to any one of claims 2 to 6, wherein the determining object type information of the object to which each pixel point in the target image belongs based on the classification feature map comprises:
determining a second probability value of each preset object type of an object to which each pixel point in the target image belongs based on the classification feature map;
and determining the object type information of the object to which the pixel point belongs based on the preset object type corresponding to the maximum second probability value.
8. The method according to any one of claims 1 to 7, wherein the determining the positioning information of the object in the target image based on the object border information of the object to which each pixel point belongs and the target confidence of the object border information includes:
screening a plurality of target pixel points from the target image; the distance between different target pixel points in the target image is smaller than a preset threshold value, and the object type information of the objects to which the different target pixel points belong is the same;
selecting the object frame information corresponding to the highest target confidence from the object frame information of the objects to which the target pixel points belong, so as to obtain target frame information;
and determining the positioning information of the object in the target image based on the selected target frame information and the target confidence corresponding to the target frame information.
9. A positioning device, comprising:
the image acquisition module is used for acquiring a target image;
the image processing module is used for determining object type information of an object to which each pixel point belongs, object frame information of the object to which each pixel point belongs, a first confidence coefficient corresponding to the object type information and a second confidence coefficient corresponding to the object frame information in the target image based on the image feature map of the target image;
the confidence processing module is used for respectively determining the target confidence of the object frame information of the object to which each pixel point belongs based on the first confidence and the second confidence;
and the positioning module is used for determining the positioning information of the object in the target image based on the object frame information of the object to which each pixel point belongs and the target confidence of the object frame information.
10. The positioning apparatus according to claim 9, wherein the image feature map includes a classification feature map for classifying the objects to which the pixels belong in the target image and a positioning feature map for positioning the objects to which the pixels belong in the target image;
the image processing module is configured to:
determining object type information of an object to which each pixel point in the target image belongs and a first confidence corresponding to the object type information based on the classification feature map;
and determining the object frame information of the object to which each pixel point belongs in the target image and a second confidence corresponding to the object frame information based on the positioning feature map.
11. The positioning apparatus according to claim 10, wherein the image processing module, when determining the object border information of the object to which each pixel point in the target image belongs based on the positioning feature map, is configured to:
for a pixel point in the target image, respectively determining, based on the positioning feature map, a target distance range in which the distance between the pixel point and each border of the object to which the pixel point belongs is located;
respectively determining the target distance between the pixel point and each frame in the object frames of the object to which the pixel point belongs based on the target distance range and the positioning feature map;
and determining the object frame information of the object to which the pixel point belongs based on the position information of the pixel point in the target image and the target distance between the pixel point and each frame.
12. The positioning apparatus according to claim 11, wherein the image processing module, when determining a target distance range in which the distance between a pixel point and each border of the object to which the pixel point belongs is located, is configured to:
for one border among the object borders of the object to which a pixel point in the target image belongs, determining the maximum distance between the pixel point and the border based on the positioning feature map;
segmenting the maximum distance into a plurality of distance ranges;
determining a first probability value that the distance between the pixel point and the bounding box is within each distance range based on the positioning feature map;
based on the determined first probability value, selecting a target distance range in which the distance between the pixel point and the frame is located from the plurality of distance ranges.
13. The positioning apparatus according to claim 12, wherein the image processing module, when selecting the target distance range from the plurality of distance ranges in which the distance between the pixel point and the bounding box is located based on the determined first probability value, is configured to:
determining a distance uncertainty parameter value of the distance between the pixel point and the border based on the positioning feature map;
determining a target probability value that the distance between the pixel point and the bounding box is within each distance range based on the distance uncertainty parameter value and each first probability value;
and taking the distance range corresponding to the maximum target probability value as the target distance range in which the distance between the pixel point and the border is located.
14. The positioning apparatus according to claim 12, wherein the image processing module, when determining the second confidence level corresponding to the object border information, is configured to:
and determining, for a pixel point in the target image, a second confidence corresponding to the object frame information of the object to which the pixel point belongs, based on the first probability values corresponding to the target distance ranges in which the distances between the pixel point and the respective borders of the object are located.
15. The positioning apparatus according to any one of claims 10 to 14, wherein the image processing module, when determining the object type information of the object to which each pixel point in the target image belongs based on the classification feature map, is configured to:
determining a second probability value of each preset object type of an object to which each pixel point in the target image belongs based on the classification feature map;
and determining the object type information of the object to which the pixel point belongs based on the preset object type corresponding to the maximum second probability value.
16. The positioning device according to any one of claims 9 to 15, wherein the positioning module is configured to:
screening a plurality of target pixel points from the target image; the distance between different target pixel points in the target image is smaller than a preset threshold value, and the object type information of the objects to which the different target pixel points belong is the same;
selecting the object frame information corresponding to the highest target confidence from the object frame information of the objects to which the target pixel points belong, so as to obtain target frame information;
and determining the positioning information of the object in the target image based on the selected target frame information and the target confidence corresponding to the target frame information.
17. An electronic device, comprising: the positioning device comprises a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, when an electronic device runs, the processor and the storage medium communicate through the bus, and the processor executes the machine-readable instructions to execute the positioning method according to any one of claims 1 to 8.
18. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, performs the positioning method according to any one of claims 1 to 8.
CN202010058788.7A 2020-01-18 2020-01-18 Positioning method and device, electronic equipment and computer readable storage medium Active CN111275040B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202010058788.7A CN111275040B (en) 2020-01-18 2020-01-18 Positioning method and device, electronic equipment and computer readable storage medium
JP2022500616A JP2022540101A (en) 2020-01-18 2021-01-15 POSITIONING METHOD AND APPARATUS, ELECTRONIC DEVICE, COMPUTER-READABLE STORAGE MEDIUM
KR1020227018711A KR20220093187A (en) 2020-01-18 2021-01-15 Positioning method and apparatus, electronic device, computer readable storage medium
PCT/CN2021/072210 WO2021143865A1 (en) 2020-01-18 2021-01-15 Positioning method and apparatus, electronic device, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010058788.7A CN111275040B (en) 2020-01-18 2020-01-18 Positioning method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111275040A true CN111275040A (en) 2020-06-12
CN111275040B CN111275040B (en) 2023-07-25

Family

ID=70998770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010058788.7A Active CN111275040B (en) 2020-01-18 2020-01-18 Positioning method and device, electronic equipment and computer readable storage medium

Country Status (4)

Country Link
JP (1) JP2022540101A (en)
KR (1) KR20220093187A (en)
CN (1) CN111275040B (en)
WO (1) WO2021143865A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931723A (en) * 2020-09-23 2020-11-13 北京易真学思教育科技有限公司 Target detection and image recognition method and device, and computer readable medium
CN112819003A (en) * 2021-04-19 2021-05-18 北京妙医佳健康科技集团有限公司 Method and device for improving OCR recognition accuracy of physical examination report
WO2021143865A1 (en) * 2020-01-18 2021-07-22 北京市商汤科技开发有限公司 Positioning method and apparatus, electronic device, and computer readable storage medium
CN114613147A (en) * 2020-11-25 2022-06-10 浙江宇视科技有限公司 Vehicle violation identification method and device, medium and electronic equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113762109B (en) * 2021-08-23 2023-11-07 北京百度网讯科技有限公司 Training method of character positioning model and character positioning method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764292A (en) * 2018-04-27 2018-11-06 北京大学 Deep learning image object mapping based on Weakly supervised information and localization method
US20190035101A1 (en) * 2017-07-27 2019-01-31 Here Global B.V. Method, apparatus, and system for real-time object detection using a cursor recurrent neural network
CN109426803A (en) * 2017-09-04 2019-03-05 三星电子株式会社 The method and apparatus of object for identification
CN109522938A (en) * 2018-10-26 2019-03-26 华南理工大学 The recognition methods of target in a kind of image based on deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275040B (en) * 2020-01-18 2023-07-25 北京市商汤科技开发有限公司 Positioning method and device, electronic equipment and computer readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190035101A1 (en) * 2017-07-27 2019-01-31 Here Global B.V. Method, apparatus, and system for real-time object detection using a cursor recurrent neural network
CN109426803A (en) * 2017-09-04 2019-03-05 三星电子株式会社 The method and apparatus of object for identification
CN108764292A (en) * 2018-04-27 2018-11-06 北京大学 Deep learning image object mapping based on Weakly supervised information and localization method
CN109522938A (en) * 2018-10-26 2019-03-26 华南理工大学 The recognition methods of target in a kind of image based on deep learning

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021143865A1 (en) * 2020-01-18 2021-07-22 北京市商汤科技开发有限公司 Positioning method and apparatus, electronic device, and computer readable storage medium
CN111931723A (en) * 2020-09-23 2020-11-13 北京易真学思教育科技有限公司 Target detection and image recognition method and device, and computer readable medium
CN111931723B (en) * 2020-09-23 2021-01-05 北京易真学思教育科技有限公司 Target detection and image recognition method and device, and computer readable medium
CN114613147A (en) * 2020-11-25 2022-06-10 浙江宇视科技有限公司 Vehicle violation identification method and device, medium and electronic equipment
CN114613147B (en) * 2020-11-25 2023-08-04 浙江宇视科技有限公司 Vehicle violation identification method and device, medium and electronic equipment
CN112819003A (en) * 2021-04-19 2021-05-18 北京妙医佳健康科技集团有限公司 Method and device for improving OCR recognition accuracy of physical examination report

Also Published As

Publication number Publication date
JP2022540101A (en) 2022-09-14
WO2021143865A1 (en) 2021-07-22
KR20220093187A (en) 2022-07-05
CN111275040B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
CN108229322B (en) Video-based face recognition method and device, electronic equipment and storage medium
CN111275040A (en) Positioning method and device, electronic equipment and computer readable storage medium
CN110414507B (en) License plate recognition method and device, computer equipment and storage medium
CN109325964B (en) Face tracking method and device and terminal
CN108960211B (en) Multi-target human body posture detection method and system
US8395676B2 (en) Information processing device and method estimating a posture of a subject in an image
CN108009466B (en) Pedestrian detection method and device
CN114119676B (en) Target detection tracking identification method and system based on multi-feature information fusion
CN105678213B (en) Dual-mode mask person event automatic detection method based on video feature statistics
CN106203539B (en) Method and device for identifying container number
CN111814690B (en) Target re-identification method, device and computer readable storage medium
CN112200056A (en) Face living body detection method and device, electronic equipment and storage medium
CN112464797A (en) Smoking behavior detection method and device, storage medium and electronic equipment
CN113378675A (en) Face recognition method for simultaneous detection and feature extraction
CN113989604A (en) Tire DOT information identification method based on end-to-end deep learning
CN112101134B (en) Object detection method and device, electronic equipment and storage medium
CN113657370A (en) Character recognition method and related equipment thereof
CN113378837A (en) License plate shielding identification method and device, electronic equipment and storage medium
CN116363655A (en) Financial bill identification method and system
CN108985216B (en) Pedestrian head detection method based on multivariate logistic regression feature fusion
CN116091781A (en) Data processing method and device for image recognition
US20220405527A1 (en) Target Detection Methods, Apparatuses, Electronic Devices and Computer-Readable Storage Media
CN115019152A (en) Image shooting integrity judgment method and device
CN114494355A (en) Trajectory analysis method and device based on artificial intelligence, terminal equipment and medium
CN116433939B (en) Sample image generation method, training method, recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant