WO2022095514A1 - Image detection method and apparatus, electronic device and storage medium - Google Patents

Image detection method and apparatus, electronic device and storage medium

Info

Publication number
WO2022095514A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
sample
target object
relative position
reference object
Prior art date
Application number
PCT/CN2021/108965
Other languages
English (en)
Chinese (zh)
Inventor
陈明汉
卢彦斌
贺兰懿
危夷晨
Original Assignee
北京迈格威科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京迈格威科技有限公司
Publication of WO2022095514A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/60 Type of objects
    • G06V 20/64 Three-dimensional objects
    • G06V 20/647 Three-dimensional objects by matching two-dimensional images to three-dimensional objects

Definitions

  • the present application relates to the technical field of image processing, and in particular, to an image detection method, apparatus, electronic device and storage medium.
  • In vision-based security warning services, it is a common requirement to identify from a two-dimensional image whether a target object crosses a reference object, for example whether a pedestrian crosses a warning line.
  • In the related art, a crossing event, that is, whether the target object crosses the reference object, is generally identified directly from the two-dimensional image captured by the camera.
  • an image detection method, apparatus, device, and medium according to the embodiments of the present application are proposed to overcome the above problems or at least partially solve the above problems.
  • a first aspect of the present application discloses an image detection method, the method comprising:
  • acquiring multiple two-dimensional images collected at different times, the multiple two-dimensional images comprising a reference object and a target object;
  • for each two-dimensional image in the multiple two-dimensional images, determining a three-dimensional relative position of the target object relative to the reference object according to respective position information of the reference object and the target object on the two-dimensional image; and
  • determining the image detection result according to the corresponding three-dimensional relative positions of the multiple two-dimensional images.
  • In some embodiments, determining the image detection result according to the corresponding three-dimensional relative positions of the multiple two-dimensional images includes:
  • extracting subsequences from the sequence composed of the corresponding three-dimensional relative positions of the multiple two-dimensional images, where each subsequence is a sequence composed of the relative position scores belonging to the same direction dimension in the sequence; and
  • identifying whether the target object crosses the reference object according to the three-dimensional relative positions included in the subsequences and their change trends.
  • the three-dimensional relative position corresponding to any two-dimensional image in the plurality of two-dimensional images includes: a horizontal relative position score and a vertical relative position score;
  • Identifying whether the target object straddles the reference object according to the respective three-dimensional relative positions and change trends included in the subsequence includes:
  • if the absolute values of the longitudinal relative position scores in the respective three-dimensional relative positions included in the subsequence are all smaller than a first preset threshold, and the change trend of the horizontal relative position scores in the respective three-dimensional relative positions included in the subsequence is a change from a second preset threshold to a third preset threshold, it is determined that the target object crosses the reference object, wherein the second preset threshold is less than zero and the third preset threshold is greater than zero;
  • the absolute value of the second preset threshold and the absolute value of the third preset threshold are both values between 0 and 1.
  • the method further includes:
  • the method further includes:
  • the crossing moment at which the target object crosses the reference object is determined according to the shooting moment of the next two-dimensional image.
  • the three-dimensional relative position corresponding to any two-dimensional image in the plurality of two-dimensional images is determined according to the following steps:
  • the three-dimensional relative position prediction model is obtained by using the first training sample to train the first preset model, and the generation process of the first training sample includes the following steps:
  • the position information of the sample target object and the sample reference object on the two-dimensional image is obtained;
  • the first training sample is generated according to the respective position information of the sample reference object and the sample target object on the two-dimensional image and the corresponding three-dimensional relative position labels.
  • the three-dimensional relative position label includes a horizontal relative position label and a vertical relative position label
  • a three-dimensional relative position label is generated, including:
  • according to the marked three-dimensional position, determining the distance from the sample target object to the sample reference object and whether the sample target object is on the left or right side of the reference object;
  • according to the marked three-dimensional position, when it is determined that the sample target object is located in the upper area above the upper limit line, determining the distance from the sample target object to the upper limit line;
  • according to the marked three-dimensional position, when it is determined that the sample target object is located in the middle area between the upper limit line and the lower limit line, determining the distance from the sample target object to the midline;
  • according to the marked three-dimensional position, when it is determined that the sample target object is located in the lower area below the lower limit line, determining the distance from the sample target object to the lower limit line;
  • a longitudinal relative position label is generated according to the size of the sample target object and the distance of the sample target object to the upper limit line, lower limit line or midline.
  • identifying whether the target object straddles the reference object according to the corresponding three-dimensional relative positions of the multiple two-dimensional images includes:
  • the cross-line recognition model is obtained by using the second training sample to train the second preset model, and the generation process of the second training sample includes the following steps:
  • the second training sample is generated according to the three-dimensional relative position of the sample, the cross-line label and/or the cross-line time of the sample target object.
  • the method further includes:
  • the second training sample is generated, including:
  • the second training sample is generated according to the transformed vertical three-dimensional relative position, the horizontal three-dimensional relative position among the three-dimensional relative positions of the samples, and the cross-line label and/or cross-line time of the sample target object.
  • the method further includes:
  • the second training sample is generated, including:
  • the second training sample is generated according to the three-dimensional relative position of the sample, the cross-line label and/or the cross-line time of the sample target object, and the feature information.
  • an image detection device is also disclosed, and the device includes:
  • an image acquisition module for acquiring multiple two-dimensional images collected at different times, the multiple two-dimensional images including a reference object and a target object;
  • a three-dimensional position determination module, configured to, for each two-dimensional image in the plurality of two-dimensional images, determine the three-dimensional relative position of the target object relative to the reference object according to the respective position information of the reference object and the target object on the two-dimensional image;
  • the identification module is configured to identify whether the target object crosses the reference object according to the corresponding three-dimensional relative positions of the plurality of two-dimensional images.
  • an electronic device is also disclosed, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the image detection method described in the embodiments of the first aspect.
  • a computer-readable storage medium is further disclosed, and a computer program stored in the storage medium enables a processor to execute the image detection method according to the embodiments of the first aspect of the present application.
  • a fifth aspect of the embodiments of the present application provides a computer program, including computer-readable codes, which, when the computer-readable codes are executed on a computing and processing device, cause the computing and processing device to execute the above-mentioned image detection method.
  • With the embodiments of the present application, multiple two-dimensional images collected at different times can be obtained, the multiple two-dimensional images including a reference object and a target object; for each of the multiple two-dimensional images, the three-dimensional relative position of the target object relative to the reference object is determined according to the respective position information of the reference object and the target object on the two-dimensional image; after that, the image detection result is determined according to the corresponding three-dimensional relative positions of the multiple two-dimensional images.
  • the 3D relative position of the target object relative to the reference object in each 2D image is determined, and the 3D relative position can more accurately reflect the spatial position between the target object and the reference object.
  • the corresponding three-dimensional relative positions of the multiple two-dimensional images can reflect the movement trend of the target object in space.
  • In this way, the line-crossing of the target object can be described as a problem of spatial position change, so that whether the target object crosses the reference object can be identified more accurately.
  • FIG. 1 is a schematic diagram of a scenario in an implementation process in an embodiment of the present application.
  • FIG. 2 is a flow chart of the steps of the image detection method in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of the principle of a three-dimensional cross-line prediction calculation in the implementation of the present application.
  • FIG. 4 is a schematic flowchart of image detection using a neural network in the implementation of the present application.
  • FIG. 5 is a flow chart of steps for preparing a first training sample in the implementation of the present application.
  • FIG. 6 is a schematic diagram of a simulated three-dimensional scene in the implementation of the present application.
  • FIG. 7 is a flow chart of the steps for obtaining a three-dimensional relative position label in the implementation of the present application.
  • FIG. 8 is a scene diagram of image detection using a neural network in the implementation of the present application.
  • FIG. 9 is another scene diagram of image detection using a neural network in the implementation of the present application.
  • FIG. 10 is a structural block diagram of an image detection apparatus in the implementation of the present application.
  • This application proposes an image detection method, which describes the process of "the target object crossing the line" as an algorithmic problem: according to multiple two-dimensional images captured during the movement of the target object, a sequence of three-dimensional spatial positions of the target object relative to the reference object is obtained, and if the sequence satisfies certain conditions, it can be determined that the target object crosses the reference object.
  • FIG. 1 shows a schematic diagram of a scene in the implementation process of the embodiment of the present application
  • FIG. 2 shows a flowchart of the steps of the image detection method of the embodiment of the present application.
  • the image detection method of the present application is introduced.
  • the method of this embodiment can be applied to an intelligent terminal or server, and may specifically include the following steps:
  • Step S201 Obtain multiple two-dimensional images collected at different times, where the multiple two-dimensional images include a reference object and a target object.
  • multiple two-dimensional images captured by the camera at multiple consecutive moments can be obtained, and the two-dimensional images can be understood as plane images.
  • In an implementation, the camera can be controlled to collect a two-dimensional image of a specified area at each specified time. In this way, multiple two-dimensional images taken of the same area at different times can be obtained. It can be understood that the multiple two-dimensional images are shot at multiple consecutive moments.
  • the specified time may be set according to actual needs, and no additional limitation is made here.
  • the angle of the camera can remain unchanged, that is, the camera can collect two-dimensional images from the same viewing angle. In this way, the recognition error caused by the change of the angle of view of the camera can be reduced.
  • the two-dimensional images at different times may also be captured under different camera perspectives.
  • the two-dimensional image collected by the camera may be collected in real time, that is, the two-dimensional image collected by the camera at a moment is input to the intelligent terminal or server for analysis in real time. Of course, it may also be pre-collected and stored by the camera, and subsequently acquired by the smart terminal or server from the storage location.
  • analyzing the cross-line behavior is generally to analyze the cross-line behavior of the target object relative to the reference object, that is, to identify whether the target object crosses the reference object. Therefore, each 2D image obtained can contain the same reference object and the same target object.
  • the reference object may refer to a warning line, a railing, or another target object in the area that is marked as a space dividing line. When the reference object is a warning line, it is generally a conspicuous straight line mark, such as a yellow straight line mark.
  • the target object may refer to a movable object such as a person, an animal, a motor vehicle, and an intelligent robot.
  • each image includes a reference object 101 and a target object, which is a pedestrian in Figure 1. It can be seen that the position of the target object in each image can be different, that is, the movement trajectory of the pedestrian can be obtained from the three images.
  • the target object and the reference object in each two-dimensional image can be marked, so that the target object and the reference object can be distinguished from other contents in the image.
  • Step S202 For each two-dimensional image in the plurality of two-dimensional images, determine the three-dimensional relative position of the target object relative to the reference object according to the respective position information of the reference object and the target object on the two-dimensional image.
  • each two-dimensional image may include a reference object and a target object
  • For any two-dimensional image, the position information of the reference object on the two-dimensional image and the position information of the target object on the two-dimensional image can be obtained.
  • the position information of the reference object on the two-dimensional image may refer to the two-dimensional coordinate position, and the position information of the target object on the two-dimensional image may also refer to the two-dimensional coordinate position.
  • the two-dimensional coordinate position may refer to, for example, a pixel coordinate position.
  • the position information of the two can be combined to determine the three-dimensional relative position of the target object relative to the reference object.
  • Since the two-dimensional coordinate position of the reference object is generally fixed, while the two-dimensional coordinate position of the target object changes as the target object moves, the two-dimensional coordinate position of the reference object can be used as the benchmark to determine the three-dimensional relative position of the two-dimensional coordinate position of the target object relative to the two-dimensional coordinate position of the reference object.
  • The process of determining the three-dimensional relative position according to the two-dimensional position information can be called a three-dimensional cross-line prediction calculation. Through this calculation, the two-dimensional positions of the moving target object in the two-dimensional images can be converted into a series of three-dimensional relative positions, which use numerical values to quantify the direction of the target object relative to the reference object in space and the distance to the reference object in that direction, so as to accurately quantify the movement trajectory of the target object relative to the reference object.
  • FIG. 3 a schematic diagram of the principle of a three-dimensional cross-line prediction calculation of the present application is shown.
  • the three two-dimensional images shown in FIG. 1 are taken as an example for analysis.
  • In FIG. 3, the circles represent the three-dimensional positions of the pedestrian, and the position of the reference object is shown by the thick arrow.
  • the position of the reference object is fixed, and three positions of the pedestrian are marked in FIG. 3 , which are respectively 201 , 202 , and 203 in the order of movement.
  • the positions 201 , 202 , and 203 can be understood as the positions where the two-dimensional position of the target object on the image plane is projected to the three-dimensional scene.
  • The direction of the target object relative to the reference object in space can be divided into a horizontal direction and a vertical direction relative to the reference object, so the three-dimensional relative position of the target object relative to the reference object can include a horizontal relative position score and a vertical relative position score. In this way, the horizontal relative position score and the vertical relative position score can be used to quantify the distance and direction of the target object relative to the reference object.
  • Since relative position scores in different directions can be used to describe the spatial position of the target object relative to the reference object, the two-dimensional coordinate position of the reference object can be used as a benchmark to determine the three-dimensional relative position between the two-dimensional coordinate position of the target object and the two-dimensional coordinate position of the reference object. Specifically, the horizontal relative position score can be determined according to the positional relationship, in the horizontal direction, between the two-dimensional coordinate position of the target object and that of the reference object, and the longitudinal relative position score can be determined according to the positional relationship, in the vertical direction, between the two-dimensional coordinate position of the target object and that of the reference object.
  • the three-dimensional relative position of the target object relative to the reference object may reflect the direction of the target object relative to the reference object in space and the distance relative to the reference object in this direction.
  • The position score may also identify which side of the reference object the target object is located on, e.g., a negative score indicates that it is to the right of the reference object, and a positive score indicates that it is to the left of the reference object.
  • a distance range can be set along the longitudinal direction of the reference object.
  • Top_line and Bottle_line are set in the longitudinal direction to determine the relative position of the target object in the area between Top_line and Bottle_line.
  • Within this distance range, the lateral relative position score of the target object relative to the reference object is determined.
  • As shown in FIG. 3, at position 202 the target object is, in the horizontal direction, on the left side of the reference object, and in the vertical direction it lies in the area between Top_line and Bottle_line; its distance from the reference object in the horizontal direction is shown by the line segment x1, and its distance in the vertical direction from the upper or lower boundary of the area where the reference object is located is shown by the line segment x2. Based on these, the horizontal relative position score and the vertical relative position score are determined, so that the three-dimensional relative position of the target object relative to the reference object is obtained.
  • Since the three-dimensional relative position can quantify the direction of the target object relative to the reference object in space and the distance to the reference object in that direction, the position information in the two-dimensional image is quantified into numerical scores carrying both direction information and distance information, which can more accurately locate the positional and distance relationship between the target object and the reference object, reduce the difficulty of identification, and improve the accuracy of identification.
  • Step S203 Determine an image detection result according to the corresponding three-dimensional relative positions of the multiple two-dimensional images.
  • the image detection result may refer to detecting whether the target object crosses the reference object.
  • the reference object is a warning line, if the target object crosses the reference object, it is commonly referred to as crossing the line.
  • the three-dimensional relative position of each two-dimensional image may include a horizontal relative position score and a vertical relative position score
  • The three-dimensional relative positions of the multiple two-dimensional images constitute multiple sets of different horizontal relative position scores and vertical relative position scores. In this way, whether the target object crosses the reference object can be determined according to the change trend of the multiple different horizontal relative position scores and the change trend of the vertical relative position scores.
  • the vertical relative position score can generally be used to constrain the horizontal relative position score within a certain spatial range to analyze the change trend of the horizontal relative position score.
  • the longitudinal relative position score can be constrained within a space range of a certain distance between the upper and lower ends of the reference object, and the change trend of the lateral relative position score can be analyzed.
  • For example, when the vertical relative position score indicates that the target object is within the area between Top_line and Bottle_line, and the horizontal relative position score changes from one numerical range to another numerical range, it means that the reference object is crossed.
  • the left and right sides of the reference object are represented by different score value ranges, the left side is represented by a positive value, and the right side is represented by a negative value.
  • the 3D relative position of the target object relative to the reference object in each 2D image is determined, and the 3D relative position can more accurately reflect the spatial position between the target object and the reference object.
  • the corresponding three-dimensional relative positions of the multiple two-dimensional images can reflect the moving trend of the target object in space. In this way, the cross-line of the target object can be described as a transformation problem of spatial position, so as to more accurately identify whether the target object crosses the reference object.
  • In an implementation, the three-dimensional relative position corresponding to each two-dimensional image can be considered as one element of a sequence, so that the three-dimensional relative positions corresponding to the multiple two-dimensional images form a sequence. Therefore, when identifying whether the target object crosses the reference object according to the corresponding three-dimensional relative positions of the multiple two-dimensional images, subsequences can be extracted from the sequence composed of those three-dimensional relative positions, and whether the target object crosses the reference object is identified according to the three-dimensional relative positions included in each subsequence and their change trends.
  • each subsequence is a sequence composed of relative position scores belonging to the same direction dimension in the sequence.
  • the sequence composed of the corresponding three-dimensional relative positions of the multiple two-dimensional images can be regarded as a set of numerical sequences.
  • For example, the sequence composed of the three-dimensional relative positions of the pedestrian relative to the reference object in the three two-dimensional images A, B, and C is {(0.5, 0.3), (0.1, 0.9), (-0.1, 0.2)}.
  • The three-dimensional relative position may include a horizontal relative position score and a vertical relative position score, that is, the sequence can be considered to include position scores in two direction dimensions, where the multiple position scores in each direction dimension constitute a numerical subsequence.
  • In other words, a subsequence extracted from the sequence composed of the corresponding three-dimensional relative positions of the multiple two-dimensional images is a sequence composed of the position scores of one direction dimension taken from the three-dimensional relative position corresponding to each two-dimensional image.
  • For example, the subsequence can be a sequence composed of horizontal relative position scores, such as (0.5, 0.1, -0.1), or a sequence composed of longitudinal relative position scores, such as (0.3, 0.9, 0.2).
  • the change trend of the 3D relative position can be analyzed from different directional dimensions, and then, the change trend of the 3D relative position can be analyzed comprehensively in different directional dimensions to determine whether the target object crosses the reference object.
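  • As a minimal illustration of the sequence and subsequence structure described above (an informal sketch, not the patent's implementation; the list values are taken from the example above and the variable names are assumptions):

```python
# Each two-dimensional image yields one 3D relative position:
# (horizontal relative position score, vertical relative position score).
positions = [(0.5, 0.3), (0.1, 0.9), (-0.1, 0.2)]  # images A, B, C

# A subsequence groups the scores belonging to the same direction dimension.
horizontal_subsequence = [score_lr for score_lr, _ in positions]  # [0.5, 0.1, -0.1]
vertical_subsequence = [score_tb for _, score_tb in positions]    # [0.3, 0.9, 0.2]
```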
  • each subsequence can also include a vertical relative position score and a horizontal relative position score.
  • In this case, the change trend of the three-dimensional relative position can include the change trend of the horizontal relative position score and the change trend of the vertical relative position score.
  • the horizontal relative position is the horizontal position of the target object relative to the reference object
  • the vertical relative position is the vertical position of the target object relative to the reference object.
  • The horizontal relative position score can represent the horizontal distance and orientation of the target object relative to the reference object, such as the value of the line segment x1 at position 202 in FIG. 3 and the orientation relative to the reference object, and the longitudinal relative position score can represent the longitudinal distance and orientation of the target object relative to the reference object.
  • The absolute values of the longitudinal relative position scores in the respective three-dimensional relative positions included in the subsequence may be smaller than the first preset threshold.
  • the second preset threshold value is less than zero
  • the third preset threshold value is greater than zero.
  • the absolute value of the second preset threshold value and the absolute value of the third preset threshold value are both numerical values between 0 and 1.
  • each subsequence can analyze the change trend of the three-dimensional relative position from the corresponding direction dimension, and can analyze the change trend of the three-dimensional relative position in different direction dimensions comprehensively to determine whether the target object crosses the reference object.
  • Specifically, the change trend of the subsequence of the longitudinal dimension can be analyzed first, that is, the change trend of the longitudinal relative position scores in the respective three-dimensional relative positions; if these scores are within the range defined by the first preset threshold, the change trend analysis of the horizontal relative position scores is then carried out.
  • That is, the change trend of the subsequence of the lateral dimension is analyzed next, namely the change trend of the horizontal relative position scores in the respective three-dimensional relative positions; if the horizontal relative position score changes from the second preset threshold to the third preset threshold, it is determined that the target object crosses the reference object.
  • the second preset threshold and the third preset threshold are two different thresholds.
  • the second preset threshold and the third preset threshold may be two thresholds with opposite positive and negative values.
  • For example, if the subsequence of the longitudinal dimension is (0.3, 0.9, 0.2) and each of its scores is less than the threshold 0.5, the subsequence (0.5, 0.1, -0.1) of the lateral dimension is then analyzed; it has the tendency of changing from the second preset threshold value 0.2 to the third preset threshold value -0.2, and therefore it can be considered that the target object crosses the reference object.
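  • The threshold test outlined above can be sketched as follows (an illustrative sketch only; the function name and default threshold values are assumptions, and both crossing directions are allowed for symmetry):

```python
def crosses_reference(horizontal_scores, vertical_scores,
                      first_threshold=0.5, second_threshold=-0.2, third_threshold=0.2):
    """Detect a crossing from the change trend of the lateral subsequence while the
    longitudinal subsequence stays within the band defined by the first threshold."""
    # Condition 1: absolute values of all longitudinal scores are below the first threshold.
    if not all(abs(score) < first_threshold for score in vertical_scores):
        return False
    # Condition 2: the lateral score changes from one side of the reference object to the
    # other, i.e. from <= second_threshold (< 0) to >= third_threshold (> 0), or the reverse.
    right_to_left = horizontal_scores[0] <= second_threshold and horizontal_scores[-1] >= third_threshold
    left_to_right = horizontal_scores[0] >= third_threshold and horizontal_scores[-1] <= second_threshold
    return right_to_left or left_to_right
```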
  • the change trend of the horizontal relative position score and the change trend of the vertical relative position score are both a change trend closely related to time. In this way, when identifying whether the target object crosses the reference object according to the above changing trend, the moment when the target object crosses the line can be determined at the same time.
  • The moment at which the target object crosses the reference object is a moment within the shooting period of the two two-dimensional images corresponding to the two adjacent lateral relative position scores.
  • When the scores of two adjacent lateral relative positions change from less than zero to greater than zero, or from greater than zero to less than zero, it indicates that the target object changes from one lateral side of the reference object to the other lateral side.
  • That is, the target object crosses the reference object, so the shooting moments of the two two-dimensional images corresponding to the two adjacent lateral relative position scores can be obtained, and a moment between the two shooting moments can then be determined as the crossing moment at which the target object crosses the reference object.
  • The crossing moment may be the middle moment of the two shooting moments, or any moment between them.
  • For example, when the lateral relative position score changes from 0.1 to -0.1, the target object crosses the reference object; 0.1 corresponds to two-dimensional image B, whose shooting time is 1:23, and -0.1 corresponds to two-dimensional image C, whose shooting time is 2:01, so the crossing moment can be determined as 1:42, that is, the middle moment.
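  • A sketch of taking the crossing moment as the middle of the two shooting moments (illustrative only; the timestamp representation is an assumption):

```python
from datetime import datetime

def middle_moment(t1: datetime, t2: datetime) -> datetime:
    """Return the moment halfway between two shooting moments."""
    return t1 + (t2 - t1) / 2

# Example from the description: image B shot at 1:23, image C shot at 2:01 (minute:second).
shot_b = datetime(2021, 1, 1, 0, 1, 23)
shot_c = datetime(2021, 1, 1, 0, 2, 1)
print(middle_moment(shot_b, shot_c).time())  # 00:01:42 -> crossing moment 1:42
```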
  • the next two-dimensional image can be continuously acquired, and the next two-dimensional image and the multiple two-dimensional images include the same reference object and the same target object;
  • when it is identified, according to the next two-dimensional image and part of the two-dimensional images in the plurality of two-dimensional images, that the target object crosses the reference object, the crossing moment at which the target object crosses the reference object is determined according to the shooting moment of the next two-dimensional image.
  • Specifically, the shooting moment of the next two-dimensional image can be determined as the crossing moment at which the target object crosses the reference object, or a moment between the shooting moment of the next two-dimensional image and the shooting moment of the last two-dimensional image among the part of the two-dimensional images, such as the middle moment or any moment in between, can be determined as the crossing moment at which the target object crosses the reference object.
  • the process of recognizing the target object crossing the line through the two-dimensional image can be converted into a three-dimensional relative position change process, that is, the two-dimensional image recognition process can be converted into the change of the position score in different directions.
  • the process of "target object crossing the line" is described as an algorithm problem, so as to more accurately determine whether the target object crosses the reference object.
  • the three-dimensional relative position includes a horizontal relative position and a vertical relative position
  • a neural network can be used to learn how to convert positions in the two-dimensional image into a three-dimensional horizontal relative position and vertical relative position, that is, the numerical quantification of the two-dimensional position information is completed by a neural network; further, the process of judging whether the target object crosses the line according to the three-dimensional relative positions can also be completed by a neural network.
  • FIG. 4 shows a schematic flowchart of performing cross-line detection by using a neural network according to an embodiment of the present application. As shown in Figure 4, it includes a three-dimensional relative position prediction model and a cross-line recognition model.
  • the output end of the 3D relative position prediction model can be connected with the input end of the cross-line recognition model.
  • the 3D relative position prediction model and the cross-line recognition model constitute a joint model.
  • the entire cross-line detection can be completed.
  • When the joint model is used for cross-line detection, the position information of the reference object and the target object on the two-dimensional image can be input into the joint model; in this way, the cross-line detection result output by the joint model can be obtained.
  • the three-dimensional relative position prediction model may include a feature layer and multiple fully connected layers.
  • the cross-line identification model includes a feature integration layer, which is used to integrate the horizontal relative position Score lr and the vertical relative position Score tb in the three-dimensional relative position output by the three-dimensional relative position prediction model, and output the cross-line result.
  • As shown in FIG. 4, the input to the three-dimensional relative position prediction model can be the position information of the reference object and the target object on the two-dimensional image, such as the two-dimensional position coordinates of the target object, the two-dimensional position coordinates of the reference object, and the feature information of the reference object, where the feature information of the reference object may include direction information and length information of the reference object.
  • the three-dimensional relative position prediction model is essentially capable of describing multiple two-dimensional images captured from different camera perspectives as an algorithm problem, thus realizing the conversion from two-dimensional positions to three-dimensional positions.
  • The process of converting a two-dimensional position into a three-dimensional position specifically involves the three-dimensional cross-line prediction calculation, which can be understood with reference to the description related to FIG. 1 in the above embodiments. The process of training the first preset model to obtain the three-dimensional relative position prediction model can be understood as gradually training the first preset model to have the ability to perform the three-dimensional cross-line prediction calculation; in this way, the three-dimensional relative position prediction model actually has this ability, so two-dimensional positions can be accurately converted into three-dimensional positions.
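  • As a rough structural sketch of the joint model described above (PyTorch is used here only as one possible framework; the layer sizes, module names and sequence handling are assumptions rather than the patent's architecture):

```python
import torch
import torch.nn as nn

class RelativePositionPredictor(nn.Module):
    """Maps 2D position information of the target and reference objects to the
    three-dimensional relative position scores (Score_lr, Score_tb)."""
    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())   # feature layer
        self.fc = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                nn.Linear(hidden, 2))                        # fully connected layers

    def forward(self, x):
        return self.fc(self.feature(x))  # (batch, 2): Score_lr, Score_tb

class CrossLineRecognizer(nn.Module):
    """Integrates a sequence of relative position scores and outputs a crossing probability."""
    def __init__(self, seq_len: int):
        super().__init__()
        self.integrate = nn.Linear(seq_len * 2, 1)  # feature integration layer

    def forward(self, scores):                      # scores: (batch, seq_len, 2)
        return torch.sigmoid(self.integrate(scores.flatten(start_dim=1)))
```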
  • the 3D relative position label is generated based on the 3D cross-line prediction calculation.
  • the training sample for training the three-dimensional relative position prediction model may be referred to as the first training sample.
  • FIG. 5 a flow chart of the steps for preparing the first training sample is shown, which may specifically include the following steps:
  • Step S501 Mark the three-dimensional position of the sample target object and the three-dimensional position of the sample reference object in the simulated three-dimensional scene.
  • FIG. 6 a simulated three-dimensional scene is shown.
  • The annotation of the three-dimensional position of the sample target object and the three-dimensional position of the sample reference object is performed in the simulated three-dimensional scene; such annotation can be understood as annotation that simulates real positions, and the obtained three-dimensional positions are also simulated three-dimensional positions.
  • FIG. 6 takes the reference object as the warning line as an example for description.
  • the marked 3D position may refer to the 3D coordinates in space.
  • The coordinate value of a certain dimension in the three-dimensional coordinates is allowed to be 0, for example, the three-dimensional coordinate value indicating the height is 0.
  • the two-dimensional images captured by the camera under different viewing angles can be simulated, and the motion trajectory of the target object at different speeds can also be simulated. That is, when performing three-dimensional position annotation, the three-dimensional position of the target object can be marked under different camera angles and different target object speeds, so as to simulate the motion trajectory of the target object in various real situations.
  • Step S502 According to the marked three-dimensional position, the position information of the sample target object and the sample reference object on the two-dimensional image is obtained through camera projection transformation.
  • According to the marked three-dimensional positions, the position information of the sample target object and the sample reference object in the three-dimensional simulation scene on the two-dimensional image can be obtained, that is, the marked three-dimensional positions are converted into two-dimensional positions.
  • the position information of the sample target object and the sample reference object on the two-dimensional image can be obtained through camera projection transformation, wherein the position information on the two-dimensional image can be two-dimensional position coordinates, for example, can be pixel coordinates.
  • the respective position information of the sample target object and the sample reference object on the two-dimensional image is used as the information input to the first preset model.
  • Step S503 Generate a three-dimensional relative position label according to the characteristic information of the sample reference object and the marked three-dimensional position.
  • The characteristic information of the sample reference object may include information such as the length and direction of the sample reference object. Specifically, according to the characteristic information of the sample reference object, the three-dimensional position marked for the sample reference object, and the three-dimensional positions marked for the sample target object, a three-dimensional relative position label of the sample target object relative to the sample reference object at each three-dimensional position is generated.
  • The length information in the feature information of the sample reference object can be used to help determine the distance range above and below the sample reference object, and the direction information can be used to help determine the direction in which the sample target object crosses the reference object, for example, crossing from the left side of the reference object to the right side, or from the right side of the reference object to the left side.
  • For example, if the length of the sample reference object is 2, the distance from the upper end of the sample reference object is 1, and the distance from the lower end of the sample reference object is 1, as shown by the range from Top_line to Bottle_line.
  • the direction of the sample reference object is shown by the arrow in FIG. 5 , and the direction of the target object across the sample reference object is determined based on this direction.
  • the generated three-dimensional relative position label is determined according to the marked three-dimensional position in the simulated three-dimensional scene, and in practice, represents the real three-dimensional relative position. Therefore, these three-dimensional relative position labels can be used as supervised labels for training the first preset model.
  • the process of how to obtain the three-dimensional relative position label is described.
  • the three-dimensional relative position label includes a horizontal relative position label and a vertical relative position label.
  • a flowchart of steps for obtaining a three-dimensional relative position label which may specifically include a process for generating a horizontal relative position label and a process for generating a vertical relative position label.
  • the process of generating the horizontal relative position label is as described in the following steps S701 to S702
  • the process of generating the vertical relative position label is described in the following steps S703 to S707 .
  • Step S701 Determine the distance from the sample target object to the sample reference object and whether the sample target object is on the left or right side of the reference object according to the marked three-dimensional position.
  • According to the marked three-dimensional position, the distance between the sample target object and the sample reference object can be determined, where the distance can refer to the lateral distance relative to the sample reference object, as shown by the line segment x3 in FIG. 5; it can also be determined whether the sample target object is located on the left or right side of the reference object.
  • the lateral distance relative to the sample reference object may further refer to a lateral vertical distance relative to the sample reference object.
  • Step S702 Generate a lateral relative position label according to the size of the sample target object, the distance from the sample target object to the sample reference object, and whether the sample target object is on the left or right side of the reference object.
  • the sample target object is identified as a cylinder, and the size of the sample target object may refer to the radius of the cylinder.
  • the horizontal relative position label can be generated according to the following formula (1).
  • where score_lr is the horizontal relative position label, obj denotes the target object, and dir_lr represents the azimuth parameter value, which indicates the azimuth relationship between the target object and the reference object.
  • the lateral relative position label can be understood as a relative position distance from the sample reference object in the lateral direction, and the lateral relative position label can simultaneously reflect the distance relationship and the direction relationship between the target object and the sample reference object.
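  • Formula (1) itself is not reproduced in this text, so the following is only an assumed, illustrative form relating the quantities it mentions (the distance, the size of the sample target object, and the left/right azimuth parameter dir_lr); it should not be read as the patent's actual formula:

```python
def lateral_score(distance: float, obj_size: float, on_left: bool) -> float:
    """Illustrative lateral relative position label (assumed form): the lateral distance
    to the sample reference object normalized by the object size, signed by side
    (positive on the left, negative on the right, consistent with the description)."""
    dir_lr = 1.0 if on_left else -1.0  # azimuth parameter value
    return dir_lr * distance / obj_size
```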
  • Step S703 Determine the upper limit line, the middle line and the lower limit line according to the length information and direction information of the sample reference object.
  • the upper limit line, the middle line and the lower limit line can be determined according to the length information and direction information of the sample reference object.
  • The distance between the upper limit line and the lower limit line may be greater than or equal to the length of the sample reference object.
  • the midline may refer to the midpoint line of the distance between the upper limit line and the lower limit line, and is used to help define the change of the motion trajectory of the sample target object.
  • Top_line is the upper limit line
  • Bottle_line is the lower limit line
  • Mid_line is the middle line.
  • In this way, the simulated three-dimensional scene can be divided into an upper region, a middle region and a lower region, where the upper region is the region above the upper limit line, the middle region is the region between the upper limit line and the lower limit line, and the lower region is the region below the lower limit line.
  • In this way, the distance from the three-dimensional position of the sample target object to the corresponding line can be determined according to the region where the three-dimensional position is located, and then a longitudinal relative position label for this three-dimensional position is generated according to that distance.
  • Step S704 According to the marked three-dimensional position, when it is determined that the sample target object is located in the upper region above the upper limit line, determine the distance from the sample target object to the upper limit line.
  • In this case, the distance from the sample target object at the three-dimensional position to the upper limit line can be determined, and the longitudinal relative position label corresponding to the three-dimensional position is generated according to formula (2), where score_tb is the longitudinal relative position label and Top_line is the upper limit line.
  • Step S705 According to the marked three-dimensional position, when it is determined that the sample target object is located in the middle region between the upper limit line and the lower limit line, determine the distance from the sample target object to the midline.
  • In this case, the distance from the three-dimensional position of the sample target object to the midline can be determined, and the longitudinal relative position label corresponding to the three-dimensional position is generated according to formula (3), where score_tb is the longitudinal relative position label, Top_line is the upper limit line, and dir_tb is the azimuth parameter value, which represents the azimuth relationship between the target object and Mid_line.
  • Step S706 According to the marked three-dimensional position, when it is determined that the sample target object is located in the lower region below the lower limit line, determine the distance from the sample target object to the lower limit line.
  • In this case, the distance from the three-dimensional position of the sample target object to the lower limit line can be determined, and the longitudinal relative position label corresponding to the three-dimensional position is generated according to formula (4), where score_tb is the longitudinal relative position label and Bot_line is the lower limit line.
  • Step S707 Generate a longitudinal relative position label according to the size of the sample target object and the distance from the sample target object to the upper limit line, lower limit line or midline.
  • After the distance from the three-dimensional position of the sample target object to the corresponding line is determined according to the region where the three-dimensional position is located, the longitudinal relative position label of the three-dimensional position can be generated according to the corresponding formula (2), (3) or (4) above. In this way, the respective longitudinal relative position labels of the three-dimensional positions of the multiple sample target objects are obtained.
  • The process of steps S701 to S707 above is a three-dimensional cross-line prediction calculation process.
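  • Formulas (2) to (4) are likewise not reproduced here; the sketch below only illustrates the region-dependent choice of reference line (upper limit line, midline or lower limit line) that steps S704 to S707 describe, with the normalization by object size stated as an assumption:

```python
def longitudinal_score(y: float, top_line: float, mid_line: float,
                       bot_line: float, obj_size: float) -> float:
    """Illustrative longitudinal relative position label: distance from the marked 3D
    position to the reference line of its region, normalized by the object size."""
    if y > top_line:                 # upper region: distance to the upper limit line
        distance = y - top_line
    elif y < bot_line:               # lower region: distance to the lower limit line
        distance = bot_line - y
    else:                            # middle region: distance to the midline
        distance = abs(y - mid_line)
    return distance / obj_size
```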
  • Step S504 Generate the first training sample according to the respective position information of the sample reference object and the sample target object on the two-dimensional image and the corresponding three-dimensional relative position labels.
  • the position information on the two-dimensional image obtained by the projective transformation of the three-dimensional position and the corresponding three-dimensional relative position label can be used as a training sample pair.
  • In this way, a plurality of training sample pairs can be obtained for the marked three-dimensional positions, and the plurality of training sample pairs constitute the first training samples.
  • each training sample pair includes the respective position information of the sample target object and the sample reference object on the two-dimensional image, and the three-dimensional relative position label of the sample target object.
  • Next, the training process of the first preset model is introduced. Specifically, as shown in FIG. 4, the respective position information of the sample reference object and the sample target object in the first training sample on the two-dimensional image can be input into the first preset model to obtain the predicted three-dimensional relative position output by the first preset model; the first preset model is then updated according to the predicted three-dimensional relative position and the three-dimensional relative position label in the first training sample, and when the first preset model converges, the training ends, so that the first preset model at the end of training is determined as the three-dimensional relative position prediction model.
  • the 3D relative position prediction model can have cross-line prediction calculation capability, that is, the 3D relative position of the target object relative to the reference object can be determined according to the 2D positions of the target object and the reference object.
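  • A condensed sketch of the supervised training loop described above (PyTorch-style; the loss function, optimizer and data-loader interface are assumptions):

```python
import torch
import torch.nn as nn

def train_position_predictor(model, loader, epochs: int = 10):
    """Train the first preset model on pairs of (2D position information, 3D relative position label)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.MSELoss()                       # regress to the (score_lr, score_tb) labels
    for _ in range(epochs):
        for positions_2d, labels_3d in loader:     # first training sample pairs
            optimizer.zero_grad()
            predicted = model(positions_2d)        # predicted 3D relative position
            loss = criterion(predicted, labels_3d)
            loss.backward()
            optimizer.step()
    return model  # after convergence this is used as the 3D relative position prediction model
```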
  • the respective position information of the reference object and the target object on the two-dimensional image may be input into the three-dimensional relative position prediction model to obtain the three-dimensional relative position corresponding to the two-dimensional image.
  • Since the input of the cross-line recognition model is the output of the three-dimensional relative position prediction model, and the output of the three-dimensional relative position prediction model is the three-dimensional relative position corresponding to the two-dimensional position, the training samples for training the cross-line recognition model are also three-dimensional relative positions.
  • the three-dimensional relative position prediction model may be used to generate training samples of the cross-line recognition model, wherein the training samples for training the cross-line recognition model are called second training samples.
  • FIG. 4, FIG. 8 and FIG. 9 show three different scenarios for cross-line detection using neural networks. The different neural network scenarios correspond to second training samples in three different cases, and the different second training samples can achieve their respective training effects.
  • In the first scenario, the respective position information of the sample target object and the sample reference object on the two-dimensional image can be input into the trained three-dimensional relative position prediction model to obtain the sample three-dimensional relative position, which is the accurately converted position of the position information of the sample target object and the sample reference object on the two-dimensional image; after that, the second training sample is generated according to the sample three-dimensional relative position and the cross-line label and/or cross-line moment of the sample target object.
  • In an implementation, the first training sample can be reused, that is, the position information of the sample target object and the sample reference object in the first training sample on the two-dimensional image is input into the three-dimensional relative position prediction model to obtain the three-dimensional relative position output by the model, and then the sample three-dimensional relative position and the cross-line label and/or cross-line time of the sample target object are used as the second training sample.
  • each three-dimensional relative position label in the first training sample can also be directly used as the three-dimensional relative position of the sample in the second training sample.
  • the second training sample may include the relative three-dimensional position of the sample, the cross-line label of the sample target object, or may include the three-dimensional relative position of the sample, the cross-line time of the sample target object, or may include the three-dimensional relative position of the sample , the cross-line label and cross-line time of the sample target object.
  • The cross-line label and/or the cross-line time is used as a supervision label for training the second preset model, where the cross-line time may refer to the real cross-line time, and the cross-line label may be used to represent whether the target object actually crosses the reference object.
  • the second preset model can be used to determine the time when the target object crosses the reference object.
  • a generated second training sample includes multiple three-dimensional relative positions of multiple samples corresponding to multiple two-dimensional positions.
  • During training, the multiple sample three-dimensional relative positions corresponding to the multiple two-dimensional positions can be simultaneously input into the second preset model, so that the second preset model is updated according to the output cross-line recognition result and the cross-line label and/or cross-line time.
  • FIG. 8 another neural network application scenario is shown.
  • In this scenario, an amplitude value transformation can also be performed on the vertical three-dimensional relative position among the sample three-dimensional relative positions to obtain the transformed vertical three-dimensional relative position; then the second training sample is generated according to the transformed vertical three-dimensional relative position, the horizontal three-dimensional relative position among the sample three-dimensional relative positions, and the cross-line label and/or cross-line moment of the sample target object.
  • performing amplitude value transformation on the longitudinal three-dimensional relative position in the three-dimensional relative position of the sample may refer to mapping the longitudinal three-dimensional relative position into a positive value or a negative value.
  • Specifically, if the longitudinal three-dimensional relative position is in the middle area, its value is mapped to a positive value, and if the longitudinal three-dimensional relative position is in the upper or lower area, its value is mapped to a negative value. In this way, it is possible to identify whether the target object bypasses the reference object and crosses into the warning area, thereby improving the accuracy of cross-line detection.
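  • A sketch of the amplitude value transformation (the sign convention comes directly from the description above; the function itself is an assumed implementation):

```python
def transform_vertical_score(score: float, in_middle_area: bool) -> float:
    """Map the longitudinal 3D relative position to a positive value when the object
    is in the middle area and to a negative value when it is in the upper or lower area."""
    return abs(score) if in_middle_area else -abs(score)
```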
  • The transformed vertical three-dimensional relative position and the horizontal three-dimensional relative position can then be used as the sample three-dimensional relative position, which, together with the cross-line label and/or cross-line moment of the sample target object, constitutes the second training sample.
  • In this case the second training sample may include the sample three-dimensional relative position and the cross-line label of the sample target object; or the sample three-dimensional relative position and the cross-line moment of the sample target object; or the sample three-dimensional relative position together with both the cross-line label and the cross-line moment of the sample target object.
  • During training, a plurality of transformed vertical three-dimensional relative positions and horizontal three-dimensional relative positions among the sample three-dimensional relative positions may be input into the second preset model simultaneously, so that the second preset model is updated according to the output cross-line recognition result and the cross-line label and/or cross-line moment.
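  • For illustration only, a minimal sketch of one possible amplitude transformation is given below; the assumption that the region of the sample target object (upper, middle or lower) is known from the labelling, and the particular mapping chosen, are not prescribed by the application.

```python
# Minimal sketch of an amplitude transformation for the longitudinal relative
# position (assumption: the region is known from labelling; this mapping is
# one illustrative choice, not the exact transform used in the application).
def transform_longitudinal(score: float, region: str) -> float:
    """Middle region -> positive value, upper/lower region -> negative value."""
    magnitude = abs(score)
    return magnitude if region == "middle" else -magnitude

print(transform_longitudinal(0.40, "middle"))  #  0.4: inside the warning band
print(transform_longitudinal(0.70, "upper"))   # -0.7: passing above the reference object
```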
  • Referring to FIG. 9, another neural network application scenario is shown.
  • In addition to inputting the sample three-dimensional relative position into the second preset model for training, in order to help the second preset model learn better during training, the feature information output by at least one network layer of the three-dimensional relative position prediction model can additionally be obtained, and the second training sample is generated according to the sample three-dimensional relative position, the cross-line label and/or cross-line moment of the sample target object, and this feature information.
  • That is, the feature information output by at least one network layer of the three-dimensional relative position prediction model can also be obtained, so that the sample three-dimensional relative position, the feature information, and the cross-line label and/or cross-line moment of the sample target object are used as the second training sample.
  • In this case the second training sample may include the sample three-dimensional relative position (the amplitude-transformed longitudinal relative position and the horizontal relative position), the feature information, and the cross-line label of the sample target object; or the sample three-dimensional relative position, the feature information, and the cross-line moment of the sample target object; or the sample three-dimensional relative position, the feature information, and both the cross-line label and the cross-line moment of the sample target object.
  • The sample three-dimensional relative position and the feature information in the second training sample are used as the input to the second preset model during training, while the cross-line label and/or cross-line moment of the sample target object is used as the supervision label.
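  • The following is a minimal sketch of how feature information from one network layer of a relative position prediction model could be collected with a forward hook; the toy architecture and the choice of layer are assumptions made for illustration only.

```python
# Minimal sketch: reuse the output of an intermediate layer as "feature
# information" for the second training sample (toy architecture, illustrative).
import torch
import torch.nn as nn

predictor = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),   # input: 2D position info of target + reference
    nn.Linear(64, 32), nn.ReLU(),  # intermediate layer whose output is reused
    nn.Linear(32, 2),              # output: (lateral, longitudinal) relative position
)

features = {}
predictor[3].register_forward_hook(lambda mod, inp, out: features.update(feat=out.detach()))

positions_2d = torch.randn(1, 8)              # placeholder 2D position information
sample_rel_pos = predictor(positions_2d)      # sample 3D relative position
second_sample = (sample_rel_pos.detach(),     # input part of the second training sample
                 features["feat"],            # feature information from the hooked layer
                 torch.tensor([1.0]))         # supervised cross-line label
```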
  • When actually training the second preset model, any one of the above second training samples can be used. Specifically, the second training sample obtained in any of the above cases is input into the second preset model, and the second preset model is then updated according to the judgment result it outputs and the cross-line label, or according to the judgment result it outputs together with the cross-line label and the cross-line moment, or according to the judgment result it outputs and the cross-line moment, so as to obtain the cross-line recognition model.
  • The loss of the second preset model can be determined according to the cross-line result and the cross-line label, and the second preset model is then updated according to this loss.
  • The loss of the second preset model can also be determined according to the cross-line recognition moment and the cross-line moment, and the second preset model is then updated according to this loss.
  • When the supervision label includes both the cross-line moment and the cross-line label, and the cross-line recognition result correspondingly includes the cross-line recognition moment and the recognition result of whether the line is crossed, a time recognition loss can be determined from the cross-line recognition moment and the cross-line moment, a discriminant loss can be determined from the recognition result and the cross-line label, the loss of the second preset model is determined from the recognition loss and the discriminant loss, and the second preset model is then updated according to this loss.
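  • A minimal sketch of such a combined loss is shown below; treating the crossing moment as a normalised regression target and simply summing the two losses are assumptions made for illustration, not requirements of the application.

```python
# Minimal sketch: combine a time recognition loss and a discriminant loss to
# update the second preset model (values and the unweighted sum are illustrative).
import torch
import torch.nn.functional as F

pred_moment = torch.tensor([0.62], requires_grad=True)   # cross-line recognition moment
pred_logit  = torch.tensor([1.30], requires_grad=True)   # logit for "line was crossed"

true_moment = torch.tensor([0.60])                        # supervised cross-line moment
true_label  = torch.tensor([1.0])                         # supervised cross-line label

recognition_loss  = F.l1_loss(pred_moment, true_moment)
discriminant_loss = F.binary_cross_entropy_with_logits(pred_logit, true_label)
loss = recognition_loss + discriminant_loss
loss.backward()   # gradients used to update the second preset model
```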
  • Referring to FIG. 10, a structural block diagram of an image detection apparatus according to an embodiment of the present application is shown. As shown in FIG. 10, the apparatus may specifically include the following modules:
  • an image obtaining module 1001 configured to obtain a plurality of two-dimensional images collected at different times, the plurality of two-dimensional images including a reference object and a target object;
  • a three-dimensional position determination module 1002, configured to, for each two-dimensional image in the plurality of two-dimensional images, determine the three-dimensional relative position of the target object relative to the reference object according to the position information of the reference object and the target object on the two-dimensional image;
  • the identification module 1003 is configured to determine the image detection result according to the corresponding three-dimensional relative positions of the multiple two-dimensional images.
  • the identification module may specifically include the following units:
  • a sequence extraction unit configured to extract subsequences from the sequence composed of the corresponding three-dimensional relative positions of the multiple two-dimensional images; wherein each subsequence is a sequence composed of relative position scores belonging to the same direction dimension in the sequence;
  • An identification unit configured to identify whether the target object crosses the reference object according to each three-dimensional relative position and change trend included in the subsequence.
  • The three-dimensional relative position corresponding to any two-dimensional image in the plurality of two-dimensional images includes a lateral relative position score and a longitudinal relative position score. The identification unit is specifically configured to determine that the target object crosses the reference object when the absolute values of the longitudinal relative position scores in each three-dimensional relative position included in the subsequence are all smaller than a first preset threshold, and the change trend of the lateral relative position scores in each three-dimensional relative position included in the subsequence is from a second preset threshold to a third preset threshold, wherein the second preset threshold is less than zero, the third preset threshold is greater than zero, and the absolute values of the second preset threshold and the third preset threshold are both values between 0 and 1.
  • the device may also include the following modules:
  • a first moment determination module, configured to, when the scores of two adjacent lateral relative positions in the subsequence change from less than zero to greater than zero, determine the crossing moment at which the target object crosses the reference object as a moment within the shooting period of the two two-dimensional images corresponding to the two adjacent lateral relative positions.
  • the device may also include the following modules:
  • the image obtaining module is further configured to obtain a next two-dimensional image when it is determined that the target object does not cross the reference object, the next two-dimensional image and the plurality of two-dimensional images including the same reference object and the same target object;
  • a second moment determination module, configured to, when it is determined according to the next two-dimensional image and part of the two-dimensional images in the plurality of two-dimensional images that the target object crosses the reference object, determine the crossing moment at which the target object crosses the reference object according to the shooting moment of the next two-dimensional image.
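  • As an illustration, one simple way to pick such a moment within the shooting period of the two frames is linear interpolation of the lateral score, sketched below; the application itself only requires that the crossing moment lie within this period, so the interpolation is an assumption.

```python
# Minimal sketch: interpolate the crossing moment between two adjacent frames
# whose lateral relative position scores change from negative to positive.
def crossing_moment(t_prev, t_next, lat_prev, lat_next):
    """Assumes lat_prev < 0 <= lat_next; returns the time where the score hits zero."""
    alpha = -lat_prev / (lat_next - lat_prev)
    return t_prev + alpha * (t_next - t_prev)

print(crossing_moment(t_prev=10.0, t_next=10.5, lat_prev=-0.1, lat_next=0.3))  # 10.125 s
```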
  • the three-dimensional position determination module is specifically configured to input the position information of the reference object and the target object on the two-dimensional image into a pre-trained three-dimensional relative position prediction model to obtain the three-dimensional relative position corresponding to the two-dimensional image;
  • the three-dimensional relative position prediction model is obtained by using the first training sample to train the first preset model, and the device further includes a first training sample obtaining module, which may specifically include the following units:
  • a labeling unit, configured to label the three-dimensional position of the sample target object and the three-dimensional position of the sample reference object in the simulated three-dimensional scene;
  • a transformation unit, configured to obtain the respective position information of the sample target object and the sample reference object on the two-dimensional image through camera projection transformation according to the marked three-dimensional position;
  • a label generation unit configured to generate a three-dimensional relative position label according to the characteristic information of the sample reference object and the marked three-dimensional position;
  • a first sample generating unit configured to generate the first training sample according to the respective position information of the sample reference object and the sample target object on the two-dimensional image and the corresponding three-dimensional relative position labels.
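  • The camera projection step can be illustrated with a basic pinhole model, as sketched below; the intrinsic parameters and the camera-coordinate convention are assumptions, since the application does not prescribe a particular camera model.

```python
# Minimal sketch: project labelled 3D positions from a simulated scene to 2D
# image positions with a pinhole camera (illustrative intrinsics, camera at
# the origin looking along +Z, points given in camera coordinates).
import numpy as np

K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])

def project(point_3d):
    p = K @ np.asarray(point_3d, dtype=float)
    return p[:2] / p[2]          # pixel coordinates (u, v)

sample_target_3d    = [1.0, 0.2, 5.0]   # labelled 3D position of the sample target object
sample_reference_3d = [0.0, 0.0, 5.0]   # labelled 3D position of a point on the sample reference object
print(project(sample_target_3d), project(sample_reference_3d))
```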
  • the three-dimensional relative position label includes a lateral relative position label and a longitudinal relative position label;
  • the label generating unit may specifically include the following subunits:
  • a first position determination subunit, configured to determine, according to the marked three-dimensional position, the distance from the sample target object to the sample reference object and whether the sample target object is on the left or right side of the sample reference object;
  • a second position determination subunit, configured to generate a lateral relative position label according to the size of the sample target object, the distance from the sample target object to the sample reference object, and whether the sample target object is on the left or right side of the sample reference object;
  • a third position determination subunit, configured to determine the upper limit line, the midline and the lower limit line according to the length information and direction information of the sample reference object;
  • a first distance determination subunit, configured to determine the distance from the sample target object to the upper limit line when it is determined according to the marked three-dimensional position that the sample target object is located in the upper region above the upper limit line;
  • a second distance determination subunit, configured to determine the distance from the sample target object to the midline when it is determined according to the marked three-dimensional position that the sample target object is located in the middle region between the upper limit line and the lower limit line;
  • a third distance determination subunit, configured to determine the distance from the sample target object to the lower limit line when it is determined according to the marked three-dimensional position that the sample target object is located in the lower region below the lower limit line;
  • a label generation subunit, configured to generate a longitudinal relative position label according to the size of the sample target object and the distance from the sample target object to the upper limit line, lower limit line or midline.
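  • A minimal sketch of generating such a longitudinal relative position label is given below; reducing the upper and lower limit lines to Y coordinates and normalising by the target size are simplifying assumptions for illustration.

```python
# Minimal sketch: longitudinal relative position label from the region of the
# sample target object (upper / middle / lower) and its distance to the
# corresponding line, normalised by the object size (illustrative).
def longitudinal_label(target_y, target_size, y_upper, y_lower):
    y_mid = (y_upper + y_lower) / 2.0
    if target_y > y_upper:          # upper region: distance to the upper limit line
        dist = target_y - y_upper
    elif target_y < y_lower:        # lower region: distance to the lower limit line
        dist = y_lower - target_y
    else:                           # middle region: distance to the midline
        dist = abs(target_y - y_mid)
    return dist / target_size

print(longitudinal_label(target_y=3.0, target_size=1.7, y_upper=2.0, y_lower=-2.0))  # ≈ 0.59
```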
  • the identification module is specifically configured to input the corresponding three-dimensional relative positions of the multiple two-dimensional images into a pre-trained cross-line identification model to determine whether the target object crosses the reference object;
  • the cross-line recognition model is obtained by using the second training sample to train the second preset model
  • the device further includes a second training sample obtaining module, which may specifically include the following units:
  • an information input unit configured to input the respective position information of the sample target object and the sample reference object on the two-dimensional image into the three-dimensional relative position prediction model to obtain the three-dimensional relative position of the sample;
  • the second sample generating unit is configured to generate the second training sample according to the three-dimensional relative position of the sample, the cross-line label and/or the cross-line moment of the sample target object.
  • the device further includes:
  • an amplitude transformation module configured to perform amplitude transformation on the vertical three-dimensional relative position in the three-dimensional relative position of the sample to obtain the transformed vertical three-dimensional relative position
  • the second sample generation unit is specifically configured to generate the second training sample according to the transformed vertical three-dimensional relative position, the horizontal three-dimensional relative position in the sample three-dimensional relative position, and the cross-line label and/or cross-line moment of the sample target object.
  • the device further includes the following modules:
  • a feature information obtaining module for obtaining feature information output by at least one network layer in the three-dimensional relative position prediction model
  • the second sample generating unit is specifically configured to generate the second training sample according to the three-dimensional relative position of the sample, the cross-line label and/or the cross-line time of the sample target object, and the feature information.
  • the embodiments of the present application also provide an electronic device, which can be used to execute the image detection method, and can include a memory, a processor, and a computer program stored in the memory and executed on the processor, the processor being configured to perform the described image detection method.
  • the embodiment of the present application also provides a computer-readable storage medium, and the computer program stored in the storage medium enables the processor to execute the image detection method described in the embodiment of the present application.
  • a fifth aspect of the embodiments of the present application provides a computer program, including computer-readable codes, which, when the computer-readable codes are executed on a computing and processing device, cause the computing and processing device to execute the above-mentioned image detection method.
  • The embodiments of the present application may be provided as methods, apparatuses, or computer program products. Accordingly, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing terminal equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal equipment create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Disclosed are an image detection method and apparatus, a device and a medium. The method comprises: obtaining multiple two-dimensional images collected at different moments, the multiple two-dimensional images comprising a reference object and a target object (S201); for each two-dimensional image among the multiple two-dimensional images, determining, according to position information of the reference object and the target object respectively on the two-dimensional image, a three-dimensional relative position of the target object relative to the reference object (S202); and determining an image detection result according to the three-dimensional relative positions corresponding to the multiple two-dimensional images (S203). The image detection method improves the accuracy of cross-line detection.
PCT/CN2021/108965 2020-11-06 2021-07-28 Procédé et appareil de détection d'image, dispositif électronique et support de stockage WO2022095514A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011232710.9 2020-11-06
CN202011232710.9A CN112329645B (zh) 2020-11-06 2020-11-06 图像检测方法、装置、电子设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022095514A1 true WO2022095514A1 (fr) 2022-05-12

Family

ID=74316747

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/108965 WO2022095514A1 (fr) 2020-11-06 2021-07-28 Procédé et appareil de détection d'image, dispositif électronique et support de stockage

Country Status (2)

Country Link
CN (1) CN112329645B (fr)
WO (1) WO2022095514A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329645B (zh) * 2020-11-06 2024-05-28 北京迈格威科技有限公司 图像检测方法、装置、电子设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2899691A1 (fr) * 2012-09-18 2015-07-29 Hangzhou Hikvision Digital Technology Co., Ltd. Procédé et système de poursuite de cible pour caméra dôme à grande vitesse à poursuite intelligente
US20170178345A1 (en) * 2015-12-17 2017-06-22 Canon Kabushiki Kaisha Method, system and apparatus for matching moving targets between camera views
CN108986164A (zh) * 2018-07-03 2018-12-11 百度在线网络技术(北京)有限公司 基于图像的位置检测方法、装置、设备及存储介质
CN110807431A (zh) * 2019-11-06 2020-02-18 上海眼控科技股份有限公司 对象定位方法、装置、电子设备及存储介质
CN112329645A (zh) * 2020-11-06 2021-02-05 北京迈格威科技有限公司 图像检测方法、装置、电子设备及存储介质

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114897875A (zh) * 2022-06-02 2022-08-12 杭州电子科技大学 基于深度学习的微通道下大肠杆菌与微球的三维定位方法
CN114897875B (zh) * 2022-06-02 2022-11-11 杭州电子科技大学 基于深度学习的微通道下大肠杆菌与微球的三维定位方法
CN116524135A (zh) * 2023-07-05 2023-08-01 方心科技股份有限公司 一种基于图像的三维模型生成方法及系统
CN116524135B (zh) * 2023-07-05 2023-09-15 方心科技股份有限公司 一种基于图像的三维模型生成方法及系统
CN116630550A (zh) * 2023-07-21 2023-08-22 方心科技股份有限公司 一种基于多图片的三维模型生成方法及系统
CN116630550B (zh) * 2023-07-21 2023-10-20 方心科技股份有限公司 一种基于多图片的三维模型生成方法及系统

Also Published As

Publication number Publication date
CN112329645A (zh) 2021-02-05
CN112329645B (zh) 2024-05-28

Similar Documents

Publication Publication Date Title
WO2022095514A1 (fr) Procédé et appareil de détection d'image, dispositif électronique et support de stockage
CN111340797B (zh) 一种激光雷达与双目相机数据融合检测方法及系统
Dong et al. Ellipse R-CNN: Learning to infer elliptical object from clustering and occlusion
US10417781B1 (en) Automated data capture
US11928800B2 (en) Image coordinate system transformation method and apparatus, device, and storage medium
CN113450408B (zh) 一种基于深度相机的非规则物体位姿估计方法及装置
Bayraktar et al. Analysis of feature detector and descriptor combinations with a localization experiment for various performance metrics
CN109934847B (zh) 弱纹理三维物体姿态估计的方法和装置
Yu et al. 6dof object pose estimation via differentiable proxy voting loss
CN111178170B (zh) 一种手势识别方法和一种电子设备
Chen et al. Crowd escape behavior detection and localization based on divergent centers
Hu et al. Human interaction recognition using spatial-temporal salient feature
CN107330363B (zh) 一种快速的互联网广告牌检测方法
Darujati et al. Facial motion capture with 3D active appearance models
Liu et al. Visual object tracking with partition loss schemes
Wen et al. CAE-RLSM: Consistent and efficient redundant line segment merging for online feature map building
Fang et al. GraspNet: a large-scale clustered and densely annotated dataset for object grasping
JP2015184743A (ja) 画像処理装置および物体認識方法
Wang et al. Stream query denoising for vectorized hd map construction
CN113470073A (zh) 一种基于深度学习的动物中心追踪方法
Chen et al. Multi-robot point cloud map fusion algorithm based on visual SLAM
CN112270357A (zh) Vio视觉系统及方法
WO2019198233A1 (fr) Dispositif de reconnaissance d'action, procédé de reconnaissance d'action et support d'enregistrement lisible par ordinateur
Lu et al. Slicing-tracking-detection: Simultaneous multi-cylinder detection from large-scale and complex point clouds
Wang Monocular 2D and 3D human pose estimation review

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21888214

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16/08/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21888214

Country of ref document: EP

Kind code of ref document: A1