WO2022179314A1 - Object detection method and electronic device (一种对象检测方法及电子设备) - Google Patents

Object detection method and electronic device


Publication number
WO2022179314A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2022/070160
Other languages
English (en)
French (fr)
Inventor
张灵敏 (Zhang Lingmin)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to EP22758680.7A (published as EP4276683A4)
Publication of WO2022179314A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/02: Services making use of location information
    • H04W 4/023: Services making use of location information using mutual or relative location information between multiple location based services [LBS] targets or of distance thresholds
    • H04W 4/30: Services specially adapted for particular environments, situations or purposes
    • H04W 4/40: Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H04W 4/44: Services for vehicles, for communication between vehicles and infrastructures, e.g. vehicle-to-cloud [V2C] or vehicle-to-home [V2H]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/255: Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G06V 10/70: Arrangements using pattern recognition or machine learning
    • G06V 10/764: Arrangements using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82: Arrangements using pattern recognition or machine learning, using neural networks
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V 20/60: Type of objects
    • G06V 20/64: Three-dimensional objects

Definitions

  • the present application relates to the field of artificial intelligence, and in particular, to an object detection method and electronic device.
  • In an artificial intelligence (AI) image recognition scenario, cameras on the road collect images, and deep learning algorithms are used to detect objects in the images and obtain information such as object categories and locations.
  • the present application provides an object detection method and electronic device, which can improve the object detection accuracy, reduce the probability of missed detection and false detection, and meet the needs of users.
  • an embodiment of the present application provides an object detection method, which is applied to an electronic device including a camera device.
  • The electronic device is fixedly arranged, the electronic device communicates wirelessly with a mobile device, the mobile device and the electronic device are within a certain range, and the mobile device is located on the object. The method includes: the electronic device obtains an image, the image containing at least one object; identifying the objects contained in the image, and obtaining one or more candidate frames of each object and candidate frame information corresponding to each candidate frame; the candidate frame information includes the position of the object in the image, among other items.
  • For example, the first object category is a vehicle category, including sedans, off-road vehicles, buses, trucks, school buses, fire engines, ambulances, police cars, etc.; the second object category is also a vehicle category, including sedans, off-road vehicles, buses, trucks, school buses, fire engines, ambulances, police cars, etc.
  • For example, the first object category is a human category, including infants, children, adults, and the elderly; the second object category is also a human category, including infants, children, adults, and the elderly.
  • When the object is not occluded, the candidate frame of the object includes only the complete outline of the object.
  • In the method, V2X technology (receiving the first message) is used to obtain information such as the location and category of the object. According to this information, combined with the position, category, and other information of the object identified by the object detection algorithm, the detection frame is determined from the multiple candidate frames. This improves the accuracy of the detection frame determined from the candidate frames and reduces the probability of missed detection and false detection in object detection.
  • the object includes at least one of the following: a vehicle, a person; and the location information includes location coordinates.
  • an embodiment of the present application provides an object detection method, which is applied to an electronic device including a camera device.
  • the electronic device is fixedly arranged, the electronic device communicates wirelessly with the mobile device, the mobile device and the electronic device are within a certain range, and the mobile device is located on the object.
  • The method includes: the electronic device obtains an image by photographing or filming at a first angle through the camera device, the image including at least one object; identifying the objects included in the image, and obtaining one or more candidate frames of each object and candidate frame information corresponding to each candidate frame, where the candidate frame information includes the position of the object in the image, the detection probability of the object, the first object category of the object, and the maximum intersection-over-union (max IoU) between the candidate frame and other candidate frames overlapping with it; within a preset time period before or after the image is obtained, receiving a first message from the mobile device, the first message including the position information of the object and the second object category of the object; according to the position information of the object, the position information of the electronic device, the first angle, and the position of the object in the image, determining, from the one or more candidate frames, at least one first candidate frame corresponding to the first message; adjusting parameters corresponding to the at least one first candidate frame, the parameters being related to the candidate frame information; and determining, from the at least one first candidate frame, a detection frame of the object.
  • For example, the first object category is a vehicle category, including sedans, off-road vehicles, buses, trucks, school buses, fire engines, ambulances, police cars, etc.; the second object category is also a vehicle category, including sedans, off-road vehicles, buses, trucks, school buses, fire engines, ambulances, police cars, etc.
  • For example, the first object category is a human category, including infants, children, adults, and the elderly; the second object category is also a human category, including infants, children, adults, and the elderly.
  • When the object is not occluded, the candidate frame of the object includes only the complete outline of the object.
  • In the method, V2X technology (receiving the first message) is used to obtain information such as the location and category of the object. The relevant parameters in the process of determining the detection frame from multiple candidate frames are adjusted according to this information, so that the information of the detection frame determined from the candidate frames is more accurate; this reduces the probability of missed detection and false detection.
  • the object includes at least one of the following: a vehicle, a person; and the location information includes location coordinates.
  • In a possible implementation, adjusting the parameters corresponding to the at least one first candidate frame includes: increasing the value of the detection probability corresponding to each first candidate frame; and obtaining one or more second candidate frames whose detection probability is greater than or equal to a preset detection probability threshold includes: deleting or excluding the first candidate frames whose detection probability is less than the preset detection probability threshold, thereby obtaining the one or more second candidate frames.
  • That is, the candidate frames whose detection probability is less than the detection probability threshold are deleted, i.e., it is determined that they are not detection frames. If the distance between the position of the object in the first message and the position of the object in the image in the candidate frame information of the first candidate frame is less than the preset distance threshold d_threshold, the object corresponding to the first candidate frame and the object sending the first message have a high probability of being the same object; increasing the value of the detection probability in the candidate frame information of the first candidate frame increases the probability that the first candidate frame is determined as a detection frame, and reduces the probability of missed detection.
  • In a possible implementation, the value of the detection probability corresponding to each first candidate frame is increased according to:

    score_n = (1 - d/d_threshold) * score' + (d/d_threshold) * score_orig, if d < d_threshold

    where d represents the distance between the position of the object in the image and the first position obtained by converting the position information in the first message into image coordinates (or, equivalently, d represents the distance between the position information in the first message and the first position obtained by converting the position of the object in the image into the geodetic coordinate system); d_threshold represents the preset distance threshold; score_orig represents the detection probability value corresponding to the first candidate frame; score' represents the set detection probability adjustment threshold, with detection probability threshold ≤ score' ≤ 1.
  • adjusting the parameter corresponding to at least one first candidate frame includes: increasing the intersection ratio threshold corresponding to the first candidate frame.
  • If the distance between the position of the object in the first message and the position of the object in the image in the candidate frame information of the first candidate frame is less than the preset distance threshold d_threshold, the object corresponding to the candidate frame and the object sending the first message have a high probability of being the same object. Increasing the intersection ratio threshold IoU_th corresponding to the first candidate frame in the process of determining the detection frame from multiple candidate frames with the NMS algorithm reduces the probability that the first candidate frame is deleted, increases the probability that it is determined as a detection frame, and reduces the probability of missed detection.
  • In a possible implementation, the intersection ratio threshold corresponding to the first candidate frame is increased according to:

    IoU_th_n = (1 - d/d_threshold) * IoU'_th + (d/d_threshold) * IoU_th, if d < d_threshold

    where d and d_threshold are as defined above; IoU_th indicates the preset intersection ratio threshold in the NMS algorithm; IoU'_th indicates the set adjustment threshold for the intersection ratio threshold, with IoU_th ≤ IoU'_th ≤ 1.
  • In a possible implementation, the method further includes: if the second object category of the object in the first message is inconsistent with the first object category of the object in the candidate frame information of the first candidate frame, determining the second object category in the first message as the object category of the detection frame. That is, the object category of the detection frame is determined according to the object category in the V2X message, which reduces the probability of false detection.
  • In a possible implementation, the second object category of the object in the first message is determined as the object category of the detection frame according to:

    class = class_v, if d < d_threshold; class = class_orig, otherwise

    where d and d_threshold are as defined above; class_v represents the second object category of the object in the first message, and class_orig represents the first object category of the object in the candidate frame information of the first candidate frame.
  • In a possible implementation, determining, from the one or more candidate frames, the at least one first candidate frame corresponding to the first message according to the position of the object in the image includes: traversing all candidate frame information corresponding to the object, comparing a distance D computed from the candidate frame information with a preset distance threshold, and determining the candidate frames whose D is less than the preset distance threshold as first candidate frames. Here, D represents the distance between the position of the object in the image and the first position obtained by converting the position information of the object into image coordinates according to the position information of the electronic device and the first angle. Each candidate frame so determined is a first candidate frame, and the parameters corresponding to it are adjusted.
  • recognizing the object included in the image includes: using the YOLO algorithm or the Fast_RCNN algorithm to identify the object included in the image.
  • an embodiment of the present application provides an electronic device.
  • The electronic device includes: a processor; a memory; and a computer program, where the computer program is stored in the memory; when the computer program is executed by the processor, the electronic device is caused to execute the method of the first aspect and any implementation manner of the first aspect, or of the second aspect and any implementation manner of the second aspect.
  • In another aspect, a computer-readable storage medium includes a computer program; when the computer program runs on an electronic device, the electronic device is caused to perform the method of the first aspect and any implementation manner of the first aspect, or of the second aspect and any implementation manner of the second aspect.
  • In another aspect, a computer program product, when run on a computer, causes the computer to execute the method of the first aspect and any implementation manner of the first aspect, or of the above-mentioned second aspect and any implementation manner of the second aspect.
  • Fig. 1 is a schematic diagram of an AI image recognition scene;
  • Fig. 3 is a schematic diagram of a deployment example of an RSU and an OBU;
  • FIG. 4 is an example diagram of a scene to which the object detection method provided by the embodiments of the present application is applicable;
  • FIG. 5 is a schematic structural diagram of an electronic device to which the object detection method provided by the embodiments of the present application is applicable;
  • FIG. 6 is a schematic flowchart of an object detection method provided by an embodiment of the present application;
  • Fig. 7 is a scene example diagram of the object detection method provided by an embodiment of the present application;
  • Figure 8 is a schematic diagram of the conversion relationship between the geodetic coordinate system and the world coordinate system;
  • FIG. 10 is a schematic structural diagram of an electronic device for object detection provided by an embodiment of the present application.
  • references in this specification to "one embodiment” or “some embodiments” and the like mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • Appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise.
  • The terms "comprising", "including", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.
  • the term “connected” includes both direct and indirect connections unless otherwise specified.
  • first and second are only used for descriptive purposes, and should not be construed as indicating or implying relative importance or implicitly indicating the number of indicated technical features.
  • a feature defined as “first” or “second” may expressly or implicitly include one or more of that feature.
  • Words such as "exemplarily" or "for example" are used to present examples, illustrations, or explanations. Any embodiment or design described in the embodiments of the present application as "exemplarily" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of such words is intended to present the related concepts in a specific manner.
  • Figure 1 shows a scene of AI image recognition.
  • the electronic device 100 is mounted on a bracket by the road.
  • the electronic device 100 includes an AI camera, or the electronic device 100 is an AI camera.
  • the electronic device 100 collects an image of the intersection, and performs image recognition on the collected image.
  • the electronic device 100 uses a deep learning algorithm to detect an object in an image, and can obtain information such as the category and location of the object.
  • Figure 2 shows an example of object detection results.
  • the electronic device 100 uses an object detection algorithm to perform object detection on the collected image 110 , where the detected object is a vehicle, and obtains information such as the position, category, and detection probability of the vehicle in the image (ie, the probability that the object is of this category).
  • For example, the electronic device 100 obtains that the vehicle category of the vehicle 111 in the image 110 is a sedan, with a detection probability of 99%; the vehicle category of the vehicle 112 is a truck, with a detection probability of 98%; and the vehicle category of the vehicle 113 is a bus, with a detection probability of 99%.
  • However, the vehicle 114 in FIG. 2 is not detected, that is, the vehicle 114 is missed; and the vehicle 115 in FIG. 2, whose vehicle category is a sedan, is detected as an off-road vehicle, that is, the vehicle 115 is falsely detected.
  • the embodiment of the present application provides an object detection method.
  • In the method, the electronic device supports vehicle-to-everything (V2X) technology, uses V2X to obtain vehicle information, and combines the vehicle information obtained through V2X in the judgments made during object detection, thereby reducing false detections and missed detections and improving object detection accuracy.
  • V2X is a new generation of information and communication technology that connects the vehicle with everything; where V represents the vehicle, and X represents any object that interacts with the vehicle.
  • X can include vehicles, people, roadside infrastructure and networks.
  • The information interaction modes of V2X include: vehicle to vehicle (V2V), vehicle to infrastructure (V2I), vehicle to pedestrian (V2P), and vehicle to network (V2N).
  • V2X based on cellular technology is called C-V2X.
  • C-V2X is a wireless communication technology for vehicles based on the evolution of cellular network communication technologies such as 3G/4G/5G.
  • C-V2X includes two communication interfaces: one is a short-range direct communication interface (PC5) between terminals such as vehicles, people, and roadside infrastructure; the other is a communication interface (Uu) between such terminals and the network, which is used to achieve reliable communication over long distances and large ranges.
  • Terminals in V2X include roadside units (RSUs) and onboard units (OBUs).
  • FIG. 3 shows a deployment example of an RSU and an OBU.
  • the RSU is a static entity that supports V2X applications and is deployed on the roadside, such as a gantry beside the road; it can exchange data with other entities that support V2X applications (such as RSU or OBU).
  • the OBU is a dynamic entity supporting V2X applications, usually installed in the vehicle, and capable of data exchange with other entities (such as RSU or OBU) supporting V2X applications.
  • the RSU may be the electronic device 100 in FIG. 1 or FIG. 4 .
  • the electronic device 100 of FIG. 1 or FIG. 4 may include an RSU.
  • the OBU may be referred to as a mobile device. Further, if the OBU is set on the vehicle, it can be called an in-vehicle device.
  • the mobile device may include an OBU. Further, if the mobile device is installed on the vehicle, it can be called a vehicle-mounted device.
  • OBUs are installed in the electronic device 100 and the vehicle 200 , respectively.
  • the electronic device 100 and each vehicle 200 can communicate with each other through V2X.
  • the OBU of the vehicle 200 periodically broadcasts a basic safety message (BSM), and the BSM includes the basic information of the vehicle, such as vehicle speed, heading, position, acceleration, vehicle type, predicted path and historical path , vehicle events, etc.
  • the communication distance between the OBUs is the first distance (for example, 500 meters).
  • the electronic device 100 may acquire basic safety messages broadcast by vehicles 200 within a first distance around it through V2X.
  • the electronic device 100 supports the RSU function, and the communication distance between the OBU and the RSU is the second distance (for example, 1000 meters).
  • the electronic device 100 may acquire basic safety messages broadcast by vehicles 200 within a second distance around it through V2X.
  • the following embodiments of the present application take the installation of the OBU in the electronic device 100 as an example for description. It can be understood that the object detection method provided by the embodiment of the present application is also applicable to the case where the electronic device supports the RSU function.
  • the electronic device 100 collects road images, and uses an object detection algorithm to perform object detection on the road images.
  • object detection algorithms include YOLO (you only look once) algorithm, Fast_RCNN, etc.
  • In object detection, one or more candidate identification regions (regions of interest, RoIs) of each detection object are obtained by calculation, which are called candidate frames; then one RoI is screened out from the one or more candidate frames, that is, the detection frame of the detected object is obtained.
  • the detection objects may include vehicles, pedestrians, road signs, and the like.
  • In the object detection method provided by the embodiments of the present application, the electronic device 100 also obtains, through V2X, the broadcast messages of the vehicles 200 within the first distance around it, and combines the obtained broadcast messages of the surrounding vehicles when selecting an RoI from the multiple candidate frames, thereby improving the detection accuracy of object detection.
  • The electronic device 100 may be an AI camera, an electronic device including an AI camera, or an electronic device including an ordinary camera (as distinct from an AI camera).
  • FIG. 5 shows a schematic structural diagram of an electronic device.
  • The electronic device 100 may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna, a wireless communication module 250, a camera sensor 260, and the like.
  • the structures illustrated in the embodiments of the present application do not constitute a specific limitation on the electronic device 100 .
  • The electronic device 100 may include more or fewer components than shown, or combine some components, or split some components, or have a different arrangement of components.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • Processor 210 may include one or more processing units.
  • For example, the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • different processing units may be independent components, or may be integrated in one or more processors.
  • electronic device 100 may also include one or more processors 210 .
  • The controller can generate an operation control signal according to the instruction operation code and timing signal, so as to complete the control of fetching and executing instructions.
  • the processor 210 may include one or more interfaces.
  • The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface, etc.
  • the USB interface 230 is an interface that conforms to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 230 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and peripheral devices.
  • the interface connection relationship between the modules illustrated in the embodiments of the present application is only a schematic illustration, and does not constitute a structural limitation of the electronic device 100 .
  • the electronic device 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
  • the charging management module 240 is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 240 may receive charging input from the wired charger through the USB interface 230 .
  • the charging management module 240 may receive wireless charging input through the wireless charging coil of the electronic device 100 . While the charging management module 240 charges the battery 242 , the power management module 241 can also supply power to the electronic device.
  • the power management module 241 is used to connect the battery 242 , the charging management module 240 and the processor 210 .
  • the power management module 241 receives input from the battery 242 and/or the charging management module 240, and supplies power to the processor 210, the internal memory 221, the external memory interface 220, the wireless communication module 250, and the like.
  • the power management module 241 can also be used to monitor parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance).
  • the power management module 241 may also be provided in the processor 210 .
  • the power management module 241 and the charging management module 240 may also be provided in the same device.
  • the wireless communication function of the electronic device 100 may be implemented by the antenna, the wireless communication module 250 and the like.
  • the wireless communication module 250 may provide wireless communication solutions including Wi-Fi, Bluetooth (BT), and wireless data transmission modules (eg, 433MHz, 868MHz, 915MHz) applied to the electronic device 100 .
  • the wireless communication module 250 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 250 receives electromagnetic waves via the antenna, filters and frequency modulates the electromagnetic wave signals, and sends the processed signals to the processor 210 .
  • the wireless communication module 250 can also receive the signal to be sent from the processor 210, frequency-modulate the signal, amplify the signal, and radiate it into electromagnetic waves through the antenna.
  • The electronic device 100 may receive a broadcast message (a basic safety message) through the wireless communication module.
  • the external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100 .
  • The external memory card communicates with the processor 210 through the external memory interface 220 to implement the data storage function, for example, to save files such as music and videos in the external memory card.
  • Internal memory 221 may be used to store one or more computer programs including instructions.
  • the processor 210 may execute the above-mentioned instructions stored in the internal memory 221, thereby causing the electronic device 100 to execute the object detection method, various applications and data processing provided in some embodiments of the present application.
  • the internal memory 221 may include a code storage area and a data storage area. Among them, the code storage area can store the operating system.
  • the data storage area may store data and the like created during the use of the electronic device 100 .
  • the internal memory 221 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage components, flash memory components, universal flash storage (UFS), and the like.
  • the processor 210 may cause the electronic device 100 to execute the instructions provided in the embodiments of the present application by executing the instructions stored in the internal memory 221 and/or the instructions stored in the memory provided in the processor 210 .
  • Camera sensor 260 is used to capture still images or video. An optical image of an object is generated through the lens and projected onto the camera sensor 260.
  • the camera sensor 260 may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the camera sensor 260 converts the optical signal into an electrical signal, and then passes the electrical signal to the processor 210 for conversion into a digital image signal.
  • The DSP in the processor 210 converts the digital image signal into an image signal in a standard format such as RGB or YUV.
  • When taking a photo, the shutter is opened and light is transmitted to the camera sensor 260 through the lens, where the light signal is converted into an electrical signal; the camera sensor 260 transmits the electrical signal to the processor 210 for processing, and the processor converts it into an image visible to the naked eye.
  • the processor 210 can also perform algorithm optimization on noise, brightness, and skin color of the image; and can also optimize parameters such as exposure and color temperature of the shooting scene.
  • the electronic device 100 collects road images through the camera sensor 260 , and the road images collected by the camera sensor 260 are transmitted to the processor 210 .
  • the processor 210 uses an object detection algorithm such as YOLO and Fast_RCNN to obtain one or more candidate frames (RoIs) of each object in the road image through calculation.
  • the electronic device 100 also receives broadcast messages (basic safety messages) of surrounding vehicles through the wireless communication module 250 .
  • The processor 210 receives the basic safety messages of the surrounding vehicles and, in combination with the acquired basic safety messages, selects one RoI from the multiple candidate frames, that is, obtains the detection frame of the detection object.
  • The system to which the embodiments of the present application are applicable may include more electronic devices than those shown in FIG. 4, for example, a server (such as a cloud server).
  • the electronic device collects road images and uploads the road images to the server.
  • The electronic device receives the basic safety messages periodically broadcast by the OBUs within the first distance, and forwards the acquired basic safety messages to the server.
  • The server uses object detection algorithms such as YOLO and Fast_RCNN to obtain, through calculation, one or more candidate frames (RoIs) of each object in the road image; and, combined with the acquired basic safety messages of the vehicles, selects one RoI from the multiple candidate frames, that is, obtains the detection frame of the detection object.
  • FIG. 6 shows a schematic flowchart of an object detection algorithm provided by an embodiment of the present application.
  • the object detection algorithm provided in this embodiment of the present application may include:
  • S601. Periodically collect images at a first angle according to a period of a first duration; after each image is collected, use an object detection algorithm to identify the image, and obtain one or more candidate frames of each object in the image and the candidate frame information corresponding to each candidate frame.
  • the electronic device uses the camera to periodically collect images at a first angle and according to a period of a first duration (for example, 1s). After each image is collected, an object detection algorithm is used to identify the image.
  • the object detection algorithm is YOLO or Fast_RCNN, etc.
  • the detection object for object detection is a vehicle.
  • the electronic device identifies each collected image, and acquires one or more candidate frames of each object in the image, as well as candidate frame information corresponding to each candidate frame.
  • the candidate frame information includes detection probability, image position, object category (eg, vehicle category), the maximum intersection ratio (max IoU) between this candidate frame and other candidate frames, and so on.
  • The detection probability is the probability that the detection object is of the object category (for example, the vehicle category), also called the score; the image position is the pixel coordinates of the candidate frame in the image, for example, the coordinates of the upper left corner of the candidate frame.
  • Vehicle categories include sedans, off-road vehicles, buses, trucks, school buses, fire trucks, ambulances, police cars, etc.
  • the intersection over union (IoU) is the ratio of the intersection of the two candidate frames to the union of the two candidate frames.
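  • As a minimal illustrative sketch (not part of the patent text), the IoU of two axis-aligned candidate frames can be computed as follows; the (x1, y1, x2, y2) corner layout of a frame is an assumption made for illustration:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```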
  • FIG. 7 shows a scene example diagram of the object detection method provided by the embodiment of the present application.
  • After the image shown in FIG. 7 is identified, candidate frame 0, candidate frame 1, candidate frame 2, candidate frame 3, candidate frame 4, candidate frame 5, candidate frame 6 and candidate frame 7 are obtained; the detection probability (score) and maximum intersection ratio (max IoU) information of each candidate frame are shown in Table 1.
  • S602. Receive a first message from the mobile device of the object; the first message includes location information, category information and the like of the object.
  • For example, the OBU in a vehicle periodically sends the first message according to a second duration (for example, 100 ms); the first message includes the identification of the vehicle, the position of the vehicle (latitude, longitude, and altitude), the category of the vehicle (including sedans, off-road vehicles, buses, trucks, school buses, fire trucks, ambulances, police cars, etc.), the message sending time, and other information.
  • the first message is a BSM message.
  • the BSM message includes the information shown in Table 2.
  • Table 2:
    vehicle ID: vehicle identification
    DSecond: universal time (UTC time)
    VehicleSize: vehicle size (including length, width, height)
    AccelerationSet4Way: four-axis acceleration of the vehicle
    Heading: vehicle heading angle
    Speed: current vehicle speed
    VehicleClassification: vehicle class
    Position3D: vehicle location (latitude, longitude, altitude)
    ...: ...
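  • For illustration only, the Table 2 fields could be carried in a record like the following Python sketch; the field names mirror Table 2, while the types and the send_time field (used later when matching messages to images) are assumptions:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class BasicSafetyMessage:
    vehicle_id: str                             # vehicle identification
    d_second: int                               # DSecond: universal time (UTC)
    vehicle_size: Tuple[float, float, float]    # length, width, height
    acceleration_set_4way: Tuple[float, ...]    # four-axis acceleration
    heading: float                              # vehicle heading angle
    speed: float                                # current vehicle speed
    vehicle_classification: str                 # vehicle class
    position_3d: Tuple[float, float, float]     # latitude B, longitude L, altitude H
    send_time: float                            # message sending time (assumed field)
```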
  • the electronic device receives the first messages periodically sent by surrounding vehicles according to the second duration, and saves the received first messages.
  • S603. Determine a first image corresponding to the first message according to whether the difference between the time for receiving the first message and the time for collecting the image is within a preset time period.
  • Specifically, the electronic device determines that the image acquisition time point is the first moment, and determines the first messages whose message sending time is within a second duration before or after the first moment as the first messages corresponding to the first image.
  • the first image is an image captured by the camera at 18:20:01.
  • the period for the vehicle to send the first message is 100 milliseconds (ie, the second duration is 100 milliseconds).
  • The camera determines that the first messages whose sending time is between (18:20:01 minus 100 milliseconds) and 18:20:01 are the first messages corresponding to the first image.
  • If multiple first messages from the same vehicle are received within this period, only the last first message of that vehicle is retained.
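  • A hedged sketch of this matching step, assuming each message carries vehicle_id and send_time attributes (as in the record sketched after Table 2) and that times are in milliseconds:

```python
def match_messages_to_image(messages, t_image, window_ms=100):
    """Keep, per vehicle, the latest first message whose sending time
    falls within (t_image - window_ms, t_image]."""
    latest = {}
    for msg in messages:
        if t_image - window_ms < msg.send_time <= t_image:
            prev = latest.get(msg.vehicle_id)
            if prev is None or msg.send_time > prev.send_time:
                latest[msg.vehicle_id] = msg   # newer message wins
    return list(latest.values())
```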
  • The electronic device acquires, from each first message corresponding to the first image, information such as the identifier of the corresponding vehicle, the vehicle position, the vehicle category, and the message sending time.
  • For example, the camera obtains that the vehicle position of vehicle 1 is longitude L1, latitude B1, altitude H1, and the vehicle category is a sedan; the vehicle position of vehicle 2 is longitude L2, latitude B2, altitude H2, and the vehicle category is a sedan; the vehicle position of vehicle 3 is longitude L3, latitude B3, altitude H3, and the vehicle category is a sedan; the vehicle position of vehicle 4 is longitude L4, latitude B4, altitude H4, and the vehicle category is a sedan; and the vehicle position of vehicle 5 is longitude L5, latitude B5, altitude H5, and the vehicle category is a sedan.
  • The latitude, longitude, and altitude (B, L, H) of the vehicle position in the first message are converted into pixel coordinates (x_v, y_v).
  • latitude, longitude and altitude are coordinate values in the geodetic coordinate system.
  • Latitude B is the angle between the ground normal of a point and the equatorial plane; starting from the equatorial plane, south is negative, with the range -90° to 0°, and north is positive, with the range 0° to 90°. Longitude L is measured from the starting meridian plane, positive to the east and negative to the west, with the range -180° to 180°. The distance from a point to the ellipsoid along the normal is called the point's altitude H.
  • The world coordinate system takes the center O of the ellipsoid as the coordinate origin; the intersection of the starting meridian plane and the equatorial plane is the X axis, the direction orthogonal to the X axis in the equatorial plane is the Y axis, and the rotation axis of the ellipsoid is the Z axis; the three axes form a right-handed system.
  • The conversion from geodetic coordinates (B, L, H) into world coordinates (X, Y, Z) is:

    X = (N + H) cos B cos L
    Y = (N + H) cos B sin L
    Z = [N(1 - e^2) + H] sin B

    where N is the radius of curvature in the prime vertical and e is the first eccentricity of the Earth. Let the equatorial radius of the reference ellipsoid be a and the polar radius be b, with a greater than b; then e^2 = (a^2 - b^2)/a^2 and N = a/(1 - e^2 sin^2 B)^(1/2).
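  • A minimal sketch of this conversion, assuming the WGS-84 ellipsoid radii as defaults (the patent does not name a specific reference ellipsoid):

```python
import math

def geodetic_to_world(B_deg, L_deg, H, a=6378137.0, b=6356752.314245):
    """Convert geodetic (latitude B, longitude L, altitude H) into
    world (Earth-centered) coordinates per the formulas above."""
    B = math.radians(B_deg)
    L = math.radians(L_deg)
    e2 = (a * a - b * b) / (a * a)                  # first eccentricity squared
    N = a / math.sqrt(1.0 - e2 * math.sin(B) ** 2)  # prime-vertical radius
    X = (N + H) * math.cos(B) * math.cos(L)
    Y = (N + H) * math.cos(B) * math.sin(L)
    Z = (N * (1.0 - e2) + H) * math.sin(B)
    return X, Y, Z
```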
  • The coordinate system X_cY_cZ_c is the camera coordinate system, which takes the camera optical center as the origin; the X axis and the Y axis are parallel to the two sides of the image, and the Z axis coincides with the optical axis.
  • The coordinate system X_iY_iZ_i is the image coordinate system, which takes the intersection of the optical axis and the projection plane as the origin; the X axis and the Y axis are parallel to the two sides of the image, and the Z axis coincides with the optical axis.
  • The coordinate system uv is the pixel coordinate system. The pixel coordinate system and the image coordinate system lie in the same plane but have different origins: viewed from the camera optical center toward the projection plane, the upper left corner of the projection plane is the origin of the pixel coordinate system.
  • The matrix K is the camera intrinsic parameter matrix, which is related to the camera hardware parameters; the matrix [R|t] is the camera extrinsic parameter matrix, which is related to the relative position of the world coordinate system and the pixel coordinate system (for example, the shooting angle of the camera, which may also be referred to as the first angle, and the relative position of the camera and the vehicle). A point (X_w, Y_w, Z_w) in the world coordinate system is converted, through rotation and translation, into a point (u, v) in the pixel coordinate system:

    s [u, v, 1]^T = K [R|t] [X_w, Y_w, Z_w, 1]^T

    where s is the value of the object point in the Z-axis direction (the depth along the optical axis).
  • the pixel coordinates in the converted image are the first position of the object in the first image.
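  • A sketch of the projection above, assuming K, R, and t have already been obtained from camera calibration (their values are not specified in the patent):

```python
import numpy as np

def world_to_pixel(p_world, K, R, t):
    """Project a 3-D world point into pixel coordinates using the
    intrinsic matrix K (3x3) and extrinsics R (3x3), t (3,)."""
    p_cam = R @ np.asarray(p_world, dtype=float) + t  # world -> camera
    uvs = K @ p_cam                                   # camera -> homogeneous pixels
    s = uvs[2]                                        # depth along the optical axis
    return uvs[0] / s, uvs[1] / s                     # (u, v) = (x_v, y_v)
```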
  • The above implementation is described by taking as an example the conversion of the vehicle position in the first message from coordinate values (longitude, latitude, and altitude) in the geodetic coordinate system into pixel coordinate values in the pixel coordinate system. It can be understood that, in other implementations, the image position in the candidate frame information may instead be converted from pixel coordinate values into coordinate values in the geodetic coordinate system and then compared with the vehicle position in the first message; the embodiments of the present application do not limit this.
  • S605 Determine at least one first candidate frame related to the object in the image according to whether the difference between the first position and the position of the object in the image in the candidate frame information is within a preset threshold.
  • Traverse the candidate frames of the first image; if the first distance d between the first position (x_v, y_v) and the image position (x_p, y_p) in the candidate frame information of a candidate frame is less than the preset distance threshold d_threshold (for example, 0.5 meters), determine that the candidate frame is a first candidate frame corresponding to the first message.
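  • A minimal sketch of S605, assuming each candidate frame is a dict with an "image_pos" entry holding (x_p, y_p); both the container type and the key name are assumptions:

```python
import math

def first_candidate_frames(candidates, first_pos, d_threshold=0.5):
    """Return (candidate, d) pairs whose image position lies within
    d_threshold of the converted first position (x_v, y_v)."""
    matched = []
    for cand in candidates:
        d = math.dist(first_pos, cand["image_pos"])  # first distance d
        if d < d_threshold:
            matched.append((cand, d))
    return matched
```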
  • S606 Adjust a parameter corresponding to at least one first candidate frame, where the parameter is related to candidate frame information.
  • For example, the score of the candidate frame corresponding to the first message is increased; in this way, the probability that the candidate frame is determined as a detection frame is increased, and the probability of missed detection is reduced.
  • the detection probability of the candidate frame corresponding to the first message is increased according to the first distance d.
  • The adjusted detection probability score_n of the candidate frame corresponding to the first message is obtained according to the following formula 3:

    score_n = (1 - d/d_threshold) * score' + (d/d_threshold) * score_orig, if d < d_threshold   (formula 3)

    The adjustment factor (1 - d/d_threshold) is the confidence of score_n; the smaller d is, the greater the confidence. score_orig is the detection probability value in the candidate frame information, and score' is the set detection probability adjustment threshold, which satisfies score_th ≤ score' ≤ 1.
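  • A sketch of formula 3 as reconstructed above; the blend form is inferred from the worked example below, in which score' = 0.95 and a confidence of 0.5 yield score_n = 0.87:

```python
def adjust_score(score_orig, d, d_threshold=0.5, score_prime=0.95):
    """Formula 3: raise the detection probability of a candidate frame
    matched to a first message, weighted by confidence 1 - d/d_threshold."""
    if d >= d_threshold:
        return score_orig                  # no sufficiently close first message
    confidence = 1.0 - d / d_threshold     # smaller d -> greater confidence
    return confidence * score_prime + (1.0 - confidence) * score_orig
```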
  • In some embodiments, the intersection ratio threshold IoU_th corresponding to the candidate frame, used when determining the detection frame from multiple candidate frames with the NMS algorithm, is increased; in this way, the probability of the candidate frame being deleted is reduced, that is, the probability that the candidate frame is determined as a detection frame is increased, and the probability of missed detection is reduced. The intersection ratio threshold corresponding to the candidate frame corresponding to the first message is increased according to the first distance d, per the following formula 4:

    IoU_th_n = (1 - d/d_threshold) * IoU'_th + (d/d_threshold) * IoU_th, if d < d_threshold   (formula 4)

    The adjustment factor (1 - d/d_threshold) is the confidence of IoU_th_n; IoU_th is the preset intersection ratio threshold in the NMS algorithm, and IoU'_th is the set adjustment threshold for the intersection ratio threshold, which satisfies IoU_th ≤ IoU'_th ≤ 1.
  • a detection frame related to the object is obtained; the candidate frame information of the detection frame is the finally obtained information of the object.
  • For example, the candidate frames whose detection probability (score) is less than the detection probability threshold score_th (e.g., 0.85) are deleted; that is, it is determined that a candidate frame whose detection probability is lower than the detection probability threshold is not a detection frame.
  • score_th is set to 0.85, and the detection probabilities of the candidate frames with sequence numbers 2, 4, and 7 in Table 1 are smaller than score_th.
  • Suppose a first message is received in which the first distance between the vehicle position and the image position (192, 223) of the candidate frame of sequence number 7 is less than d_threshold; the first message then corresponds to the candidate frame of sequence number 7.
  • Set score' to 0.95. With a confidence of 0.5, the score_n of the candidate frame of sequence number 7 calculated according to formula 3 is 0.87.
  • The candidate frames of sequence numbers 2 and 4 are deleted; since the score_n of the candidate frame of sequence number 7 is greater than score_th, the candidate frame of sequence number 7 is retained, reducing the probability of missed detection.
  • the remaining candidate frames are the candidate frames of sequence number 0, sequence number 1, sequence number 3, sequence number 5, sequence number 6 and sequence number 7.
  • Then a specific algorithm, such as the non-maximum suppression (NMS) algorithm, is used to determine the detection frame from the plurality of candidate frames.
  • The NMS algorithm is introduced as follows: if the IoU of two candidate frames is greater than a preset intersection ratio threshold IoU_th (for example, 0.5), the two candidate frames have a high probability of representing the same detection object; the candidate frame with the smaller score is deleted, and the candidate frame with the higher score is kept. In this way, the candidate frames with lower scores for the same detection object are removed, and the candidate frame with the highest score for the same detection object is determined as the detection frame.
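  • A greedy sketch of this NMS step, reusing the iou() helper sketched earlier; each candidate carries its own intersection ratio threshold so that a threshold raised according to formula 4 takes effect (the per-candidate threshold representation is an assumption made for illustration):

```python
def nms(candidates):
    """Candidates are (box, score, iou_th) tuples; iou_th is the preset
    threshold or the value raised according to formula 4."""
    order = sorted(candidates, key=lambda c: c[1], reverse=True)
    kept = []
    for box, score, iou_th in order:
        # Keep this frame only if it does not overlap any kept,
        # higher-score frame beyond its own threshold.
        if all(iou(box, k[0]) <= iou_th for k in kept):
            kept.append((box, score, iou_th))
    return kept
```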
  • the detection frame is determined from the candidate frames shown in Table 1 according to the NMS algorithm.
  • the remaining candidate frames in Table 1 are the candidate frames of sequence number 0, sequence number 1, sequence number 3, sequence number 5, sequence number 6 and sequence number 7.
  • Assume the preset intersection ratio threshold IoU_th is 0.5.
  • The maximum intersection ratio max IoU of the candidate frames of sequence number 0 and sequence number 1 is 0.542. Since the score value of sequence number 1 (0.91) is greater than the score value of sequence number 0 (0.87), and 0.542 > 0.5, the candidate frame of sequence number 0, which has the lower score value, would be removed.
  • However, a first message is received in which the first distance between the vehicle position and the image position (13.5, 51) of the candidate frame of sequence number 0 is less than d_threshold, so the first message corresponds to the candidate frame of sequence number 0; the confidence calculated from the first distance is 0.6.
  • As shown in Table 3, set IoU'_th to 0.6; the intersection ratio threshold corresponding to the candidate frame of sequence number 0, calculated according to formula 4, is 0.56. Since 0.542 < 0.56, the candidate frame of sequence number 0 is not deleted. In this way, the candidate frames of sequence number 0 and sequence number 1 are both retained.
  • the maximum intersection ratio max IoU of the candidate frames of No. 5 and No. 6 is 0.501. Since the score value of No.
  • the electronic device further determines the vehicle category of the detection object according to the first distance d.
  • If d is less than d_threshold, it indicates that the vehicle corresponding to the candidate frame and the vehicle sending the first message are very likely the same vehicle, and the vehicle category in the first message is determined as the vehicle category of the detection object. Since the accuracy of the vehicle category in the first message is high, this improves the accuracy of object detection.
  • The vehicle category of the detection object is obtained according to the following formula 5:

    class = class_v, if d < d_threshold; class = class_orig, otherwise   (formula 5)

    The adjustment factor is the confidence of the class; the smaller d is, the greater the confidence. class_orig is the vehicle category in the candidate frame information, and class_v is the vehicle category in the first message.
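  • A one-function sketch of formula 5; names mirror the reconstruction above:

```python
def detection_category(class_orig, class_v, d, d_threshold=0.5):
    """Formula 5: when the matched first message is close enough,
    trust its vehicle category; otherwise keep the detected one."""
    return class_v if d < d_threshold else class_orig
```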
  • For example, if the vehicle category in the first message is inconsistent with the vehicle category of the candidate frame corresponding to the first message, the vehicle category in the first message is determined as the vehicle category of the detection frame.
  • For example, the vehicle category in the candidate frame information of the candidate frame of sequence number 0 is an off-road vehicle, and the vehicle category in the first message corresponding to that candidate frame is a sedan; the vehicle category of the detection frame of sequence number 0 is then determined to be a sedan.
  • If no first message corresponding to a candidate frame is received, the vehicle category of the candidate frame is determined as the vehicle category of the detection frame.
  • For example, the vehicle category in the candidate frame information of the candidate frame of sequence number 1 is a sedan, and no first message corresponding to the candidate frame of sequence number 1 is received; it is then determined that the vehicle category of the detection frame of sequence number 1 is a sedan.
  • If the vehicle category in the first message is consistent with the vehicle category of the candidate frame corresponding to the first message, the vehicle category in the first message is likewise determined as the vehicle category of the detection frame.
  • For example, the vehicle category in the candidate frame information of the candidate frame of sequence number 7 is a sedan, and the vehicle category in the first message corresponding to that candidate frame is also a sedan; it is then determined that the vehicle category of the detection frame of sequence number 7 is a sedan.
  • In summary, the object detection method provided by the embodiments of the present application uses V2X technology to obtain information such as the vehicle position and the vehicle category, and adjusts the parameters used by the object detection algorithm in determining the detection object according to this information, so that the information of the detection frame determined from the candidate frames is more accurate; this reduces the probability of missed detection and false detection in object detection.
  • It should be noted that the camera in each embodiment of the present application may be replaced with a video camera, and vice versa.
  • the above-mentioned electronic device includes corresponding hardware structures and/or software modules for executing each function.
  • the embodiments of the present application can be implemented in hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the embodiments of the present application.
  • the electronic device may be divided into functional modules according to the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • The above integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical function division; there may be other division manners in actual implementation.
  • the electronic device 1000 includes: an image acquisition unit 1010 , a processing unit 1020 , a storage unit 1030 and a communication unit 1040 .
  • the image acquisition unit 1010 is used for acquiring images.
  • The processing unit 1020 is configured to control and manage the actions of the electronic device 1000. For example, it can be used to: perform object detection on an image using an object detection algorithm to obtain candidate frame information; determine the candidate frame corresponding to each first message; adjust the candidate frame parameters according to the first distance between the vehicle position in the first message and the image position in the candidate frame information of the corresponding candidate frame; determine a detection frame from the multiple candidate frames based on the adjusted candidate frame parameters; and/or perform other processes for the techniques described herein.
  • the storage unit 1030 is used to store program codes and data of the electronic device 1000 . For example, it can be used to save the received first message; or to save candidate frame information, etc.
  • the communication unit 1040 is used to support the communication between the electronic device 1000 and other electronic devices. For example, it can be used to receive the first message.
  • the unit modules in the above electronic device 1000 include but are not limited to the above image acquisition unit 1010 , the processing unit 1020 , the storage unit 1030 and the communication unit 1040 .
  • the electronic device 1000 may further include a power supply unit and the like.
  • the power supply unit is used to supply power to the electronic device 1000 .
  • the image acquisition unit 1010 may be a camera sensor.
  • the processing unit 1020 may be a processor or a controller, such as a central processing unit (CPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • the storage unit 1030 may be a memory.
  • the communication unit 1040 may be a transceiver, a transceiver circuit, or the like.
  • the image acquisition unit 1010 is a camera sensor (the camera sensor 260 shown in FIG. 5), the processing unit 1020 is a processor (the processor 210 shown in FIG. 5), and the storage unit 1030 may be a memory (the internal memory 221 shown in FIG. 5).
  • the communication unit 1040 may be called a communication interface, and includes a wireless communication module (the wireless communication module 250 shown in FIG. 5).
  • the electronic device 1000 provided in this embodiment of the present application may be the electronic device 100 shown in FIG. 5 .
  • the above-mentioned camera sensor, processor, memory, communication interface, etc. can be connected together, for example, through a bus connection.
  • Embodiments of the present application further provide a computer-readable storage medium in which computer program code is stored; when a processor executes the computer program code, the electronic device performs the methods in the foregoing embodiments.
  • Embodiments of the present application also provide a computer program product which, when run on a computer, causes the computer to execute the methods in the above-mentioned embodiments.
  • The electronic device 1000, the computer-readable storage medium, and the computer program product provided in the embodiments of the present application are all used to execute the corresponding methods provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which are not repeated here.
  • the disclosed electronic devices and methods may be implemented in other manners.
  • the electronic device embodiments described above are only illustrative.
  • The division of the modules or units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another electronic device, or some features may be omitted or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces; an indirect coupling or communication connection between electronic devices or units may be in electrical, mechanical, or other forms.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, and can also be implemented in the form of software functional units.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium.
  • In essence, the technical solutions of the embodiments of the present application, or the parts contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product; the software product is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a magnetic disk, an optical disk, or other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

This application relates to an object detection method and an electronic device, which can reduce the probability of missed selection and false selection in object detection. The method includes: an electronic device obtains an image, recognizes the objects contained in the image, and obtains one or more candidate frames for each object and candidate frame information corresponding to each candidate frame, the candidate frame information including the position of the object in the image, the detection probability of the object, and a first object category of the object; the electronic device also receives a first message from a mobile device, the first message including position information of the object and a second object category of the object; according to the position information of the object and the position of the object in the image, a first candidate frame corresponding to the first message is determined from the one or more candidate frames; a parameter corresponding to at least one first candidate frame is adjusted; one or more second candidate frames whose detection probability is greater than or equal to a preset detection probability threshold are obtained therefrom; and one detection frame of the object is obtained from the one or more second candidate frames through the non-maximum suppression (NMS) algorithm.

Description

Object detection method and electronic device
This application claims priority to Chinese Patent Application No. 202110221889.6, entitled "Object detection method and electronic device", filed with the China National Intellectual Property Administration on February 27, 2021, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of artificial intelligence, and in particular to an object detection method and an electronic device.
Background
With the development of artificial intelligence (AI) technology, AI image recognition is applied more and more widely. For example, a camera on a road captures images, and a deep learning algorithm is used to detect the objects in the images and obtain information such as the category and position of each object.
Due to limiting conditions, for example weather affecting visibility, objects in the image being occluded, or deviations in the shooting angle of the camera, problems such as missed detection and false detection frequently occur in object detection, which cannot meet user needs.
Summary
To solve the above technical problems, this application provides an object detection method and an electronic device, which can improve object detection precision, reduce the probability of missed detection and false detection, and meet user needs.
In a first aspect, an embodiment of this application provides an object detection method, applied to an electronic device including a camera device. The electronic device is fixedly installed; the electronic device communicates wirelessly with a mobile device; the mobile device is within a certain range of the electronic device; and the mobile device is located on the object. The method includes: the electronic device obtains, through the camera device and at a first angle, an image by photographing or video recording, the image containing at least one object; the objects contained in the image are recognized, and for each of the objects one or more candidate frames and one piece of candidate frame information corresponding to each candidate frame are obtained, the candidate frame information including the position of the object in the image, the detection probability of the object, a first object category of the object, and the maximum intersection-over-union max IoU between the candidate frame and the other candidate frames overlapping it; within a preset duration before or after the image is obtained, a first message from the mobile device is received, the first message including position information of the object and a second object category of the object; according to the position information of the object, the position information of the electronic device and the first angle, and the position of the object in the image, at least one first candidate frame corresponding to the first message is determined from the one or more candidate frames; from the at least one first candidate frame, one or more second candidate frames whose detection probability is greater than or equal to a preset detection probability threshold are obtained; and one detection frame of the object is obtained from the one or more second candidate frames through the non-maximum suppression (NMS) algorithm, the candidate frame information corresponding to the detection frame being the information revealing the object. The first object category and the second object category are based on the same category classification standard. For example, the first object category is a vehicle category, including sedan, off-road vehicle, bus, truck, school bus, fire engine, ambulance, police car, and so on; the second object category is also a vehicle category, including sedan, off-road vehicle, bus, truck, school bus, fire engine, ambulance, police car, and so on. Alternatively, the first object category is a category of person, including infant, child, adult, elderly person, and so on; the second object category is also a category of person, including infant, child, adult, elderly person, and so on.
When the object is not occluded, the candidate frame of the object includes only the complete outline of the object.
In this method, V2X technology (receiving the first message) is used to obtain information such as the position and category of the object. Based on the obtained position, category, and other information of the object, combined with the position, category, and other information recognized by the object detection algorithm, a detection frame is determined from multiple candidate frames; compared with recognizing the object with the object detection algorithm alone, this improves the precision of the detection frame determined from the candidate frames and reduces the probability of missed detection and false detection in object detection.
According to the first aspect, the object includes at least one of the following: a vehicle, a person; the position information includes position coordinates.
In a second aspect, an embodiment of this application provides an object detection method, applied to an electronic device including a camera device. The electronic device is fixedly installed; the electronic device communicates wirelessly with a mobile device; the mobile device is within a certain range of the electronic device; and the mobile device is located on the object. The method includes: the electronic device obtains, through the camera device and at a first angle, an image by photographing or video recording, the image containing at least one object; the objects contained in the image are recognized, and for each of the objects one or more candidate frames and one piece of candidate frame information corresponding to each candidate frame are obtained, the candidate frame information including the position of the object in the image, the detection probability of the object, a first object category of the object, and the maximum intersection-over-union max IoU between the candidate frame and the other candidate frames overlapping it; within a preset duration before or after the image is obtained, a first message from the mobile device is received, the first message including position information of the object and a second object category of the object; according to the position information of the object, the position information of the electronic device and the first angle, and the position of the object in the image, at least one first candidate frame corresponding to the first message is determined from the one or more candidate frames; the parameters corresponding to the at least one first candidate frame are adjusted, the parameters being related to the candidate frame information; from the at least one first candidate frame and the candidate frames other than the first candidate frame, one or more second candidate frames whose detection probability is greater than or equal to a preset detection probability threshold are obtained; and one detection frame of the object is obtained from the one or more second candidate frames through the non-maximum suppression (NMS) algorithm, the candidate frame information corresponding to the detection frame being the information revealing the object. The first object category and the second object category are based on the same category classification standard. For example, the first object category is a vehicle category, including sedan, off-road vehicle, bus, truck, school bus, fire engine, ambulance, police car, and so on; the second object category is also a vehicle category, including the same categories. Alternatively, the first object category is a category of person, including infant, child, adult, elderly person, and so on; the second object category is also a category of person, including the same categories.
When the object is not occluded, the candidate frame of the object includes only the complete outline of the object.
In this method, V2X technology (receiving the first message) is used to obtain information such as the position and category of the object. Based on the obtained position, category, and other information, the relevant parameters used in the process of determining the detection frame from multiple candidate frames are adjusted, so that the information of the detection frame determined from the candidate frames is more accurate and of higher precision; this reduces the probability of missed detection and false detection in object detection.
According to the second aspect, the object includes at least one of the following: a vehicle, a person; the position information includes position coordinates.
According to the second aspect, or any implementation of the second aspect above, adjusting the parameters corresponding to the at least one first candidate frame includes: increasing the value of the detection probability corresponding to each first candidate frame. Obtaining one or more second candidate frames whose detection probability is greater than or equal to the preset detection probability threshold includes: deleting or excluding the first candidate frames whose detection probability is less than the preset detection probability threshold, to obtain the one or more second candidate frames.
In this method, candidate frames whose detection probability is less than the detection probability threshold are deleted, i.e., it is determined that a candidate frame whose detection probability is less than the detection probability threshold is not the detection frame. When the distance between the object position in the first message and the object position in the candidate frame information of a first candidate frame is less than the preset distance threshold d_threshold, the probability that the object corresponding to the first candidate frame and the object sending the first message are the same object is high; increasing the value of the detection probability in the candidate frame information of the first candidate frame increases the probability that the first candidate frame is determined to be the detection frame and reduces the probability of missed detection.
According to the second aspect, or any implementation of the second aspect above, increasing the value of the detection probability corresponding to each first candidate frame includes (the formula is rendered as an image in the source; the reconstruction below follows the surrounding definitions and the convex-combination pattern of the IoU threshold formula):
score_n = (1 − d/d_threshold) · score′ + (d/d_threshold) · score_oral
where (1 − d/d_threshold) is the adjustment coefficient; d denotes the distance between the position information in the first message, after coordinate conversion into a first position in the image, and the position of the object in the image; or d denotes the distance between the position of the object in the image, after coordinate conversion into a first position in the geodetic coordinate system, and the position information in the first message; d_threshold denotes the preset distance threshold; score_oral denotes the value of the detection probability corresponding to a first candidate frame; score′ denotes the set detection probability adjustment threshold, with detection probability threshold < score′ < 1.
According to the second aspect, or any implementation of the second aspect above, adjusting the parameters corresponding to the at least one first candidate frame includes: increasing the IoU threshold corresponding to the first candidate frame. Obtaining one detection frame of the object from the one or more second candidate frames through the non-maximum suppression NMS algorithm includes: through the NMS algorithm, under the condition that the max IoU in the candidate frame information of a second candidate frame is less than the increased IoU threshold corresponding to the first candidate frame, stopping the deletion or exclusion of the second candidate frame from the candidate frames, and determining the second candidate frame as the detection frame of the object.
In this method, if the distance between the object position in the first message and the position of the object in the image in the candidate frame information of the first candidate frame is less than the preset distance threshold d_threshold, indicating a high probability that the object corresponding to the candidate frame and the object sending the first message are the same object, the IoU threshold IoU_th corresponding to the first candidate frame in the process of determining the detection frame from multiple candidate frames using the NMS algorithm is increased; this reduces the probability that the first candidate frame is deleted, i.e., increases the probability that the first candidate frame is determined to be the detection frame, reducing the probability of missed detection.
In one implementation, increasing the IoU threshold corresponding to the first candidate frame includes (the formula is rendered as an image in the source; the reconstruction below is consistent with the worked example later in the description):
IoU_th^n = (1 − d/d_threshold) · IoU′_th + (d/d_threshold) · IoU_th
where (1 − d/d_threshold) is the adjustment coefficient; d denotes the distance between the position information in the first message, after coordinate conversion into a first position in the image, and the position of the object in the image; or d denotes the distance between the position of the object in the image, after coordinate conversion into a first position in the geodetic coordinate system, and the position information in the first message; d_threshold denotes the preset distance threshold; IoU_th denotes the preset IoU threshold in the NMS algorithm; IoU′_th denotes the set IoU threshold adjustment threshold, with IoU_th < IoU′_th < 1.
According to the second aspect, or any implementation of the second aspect above, the method further includes: if the second object category of the object in the first message is inconsistent with the first object category of the object in the candidate frame information of the first candidate frame, determining the second object category of the object in the first message as the object category of the detection frame. That is, the object category of the detection frame is determined according to the object category in the V2X message, which reduces the probability of false detection.
If the second object category of the object in the first message is inconsistent with the first object category of the object in the candidate frame information of the first candidate frame, determining the second object category of the object in the first message as the object category of the detection frame includes (the formula is rendered as an image in the source; the reconstruction below follows the surrounding description):
class = class_v if d < d_threshold; otherwise class = class_oral
where d denotes the distance between the position information in the first message, after coordinate conversion into a first position in the image, and the position of the object in the image; or d denotes the distance between the position of the object in the image, after coordinate conversion into a first position in the geodetic coordinate system, and the position information in the first message; d_threshold denotes the preset distance threshold; class_v denotes the second object category of the object in the first message; and class_oral denotes the first object category of the object in the candidate frame information of the first candidate frame.
According to the second aspect, or any implementation of the second aspect above, determining at least one first candidate frame corresponding to the first message from the one or more candidate frames according to the position information of the object, the position information of the electronic device and the first angle, and the position of the object in the image includes: traversing all candidate frame information corresponding to the object, comparing D, obtained by conversion according to the candidate frame information, with the preset distance threshold, and determining a candidate frame for which D is less than the preset distance threshold as a first candidate frame; where D denotes the distance between the position information of the object, after coordinate conversion into a first position in the image according to the position information of the electronic device and the first angle, and the position of the object in the image.
In this method, if the distance between the object position in the first message and the position of the object in the image in a candidate frame's candidate frame information is less than the preset distance threshold d_threshold, indicating a high probability that the object corresponding to the candidate frame and the object sending the first message are the same object, the candidate frame is determined as a first candidate frame, and the parameters corresponding to the first candidate frame are adjusted.
According to the second aspect, or any implementation of the second aspect above, recognizing the objects contained in the image includes: recognizing the objects contained in the image using the YOLO algorithm or the Fast_RCNN algorithm.
In a third aspect, an embodiment of this application provides an electronic device. The electronic device includes: a processor; a memory; and a computer program, where the computer program is stored on the memory; when the computer program is executed by the processor, the electronic device is caused to perform the method described in the first aspect or any implementation of the first aspect, or in the second aspect or any implementation of the second aspect.
For the technical effects corresponding to the third aspect and any implementation of the third aspect, reference may be made to the technical effects corresponding to the first aspect and any implementation of the first aspect, or to the second aspect and any implementation of the second aspect, which are not repeated here.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium includes a computer program; when the computer program runs on an electronic device, the electronic device is caused to perform the method of the first aspect or any implementation of the first aspect, or of the second aspect or any implementation of the second aspect.
For the technical effects corresponding to the fourth aspect and any implementation of the fourth aspect, reference may be made to the technical effects corresponding to the first aspect and any implementation of the first aspect, or to the second aspect and any implementation of the second aspect, which are not repeated here.
In a fifth aspect, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to perform the method of the first aspect or any implementation of the first aspect, or of the second aspect or any implementation of the second aspect.
For the technical effects corresponding to the fifth aspect and any implementation of the fifth aspect, reference may be made to the technical effects corresponding to the first aspect and any implementation of the first aspect, or to the second aspect and any implementation of the second aspect, which are not repeated here.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an AI image recognition scenario;
FIG. 2 is a schematic diagram of an object detection result;
FIG. 3 is a schematic diagram of a deployment example of an RSU and an OBU;
FIG. 4 is an example diagram of a scenario to which the object detection method provided by an embodiment of this application is applicable;
FIG. 5 is a schematic architectural diagram of an electronic device to which the object detection method provided by an embodiment of this application is applicable;
FIG. 6 is a schematic flowchart of the object detection method provided by an embodiment of this application;
FIG. 7 is an example scenario diagram of the object detection method provided by an embodiment of this application;
FIG. 8 is a schematic diagram of the conversion relationship between the geodetic coordinate system and the world coordinate system;
FIG. 9 is a schematic diagram of the conversion relationship between the world coordinate system and the pixel coordinate system;
FIG. 10 is a schematic structural diagram of the object detection electronic device provided by an embodiment of this application.
Detailed Description
The terms used in the following embodiments are for the purpose of describing particular embodiments only and are not intended to limit this application. As used in the specification and the appended claims of this application, the singular expressions "a", "an", "the", "the above", "said", and "this" are intended to also include expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of this application, "at least one" and "one or more" mean one or more than two (including two). The term "and/or" describes the association relationship of associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it.
Reference in this specification to "one embodiment", "some embodiments", and the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of this application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but rather mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless otherwise specifically emphasized. The term "connected" includes both direct and indirect connections, unless otherwise specified.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include one or more of that feature.
In the embodiments of this application, words such as "exemplarily" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplarily" or "for example" in the embodiments of this application should not be construed as more preferred or advantageous than other embodiments or designs. Rather, the use of words such as "exemplarily" or "for example" is intended to present the related concepts in a concrete manner.
With the development of AI technology, AI image recognition is applied more and more widely. FIG. 1 shows an AI image recognition scenario. An electronic device 100 is mounted on a support beside a road. The electronic device 100 includes an AI camera, or the electronic device 100 is an AI camera. The electronic device 100 captures images of the intersection and performs image recognition on the captured images. For example, the electronic device 100 uses a deep learning algorithm to detect objects in the image and can obtain information such as the category and position of each object. FIG. 2 shows an example of an object detection result. The electronic device 100 uses an object detection algorithm to perform object detection on a captured image 110, with vehicles as the detection objects, and obtains information such as the position, category, and detection probability (i.e., the probability that the object belongs to that category) of each vehicle in the image. Exemplarily, the electronic device 100 determines that the vehicle category of vehicle 111 in image 110 is sedan, with a detection probability of 99%; that the vehicle category of vehicle 112 in image 110 is truck, with a detection probability of 98%; and that the vehicle category of vehicle 113 in image 110 is bus, with a detection probability of 99%.
Due to limiting conditions, for example weather affecting visibility, objects in the image being occluded, or deviations in the shooting angle of the camera, problems such as missed detection and false detection frequently occur in object detection. Exemplarily, vehicle 114 in FIG. 2 is not detected, i.e., vehicle 114 is missed; the vehicle category of vehicle 115 in FIG. 2 is sedan, but it is detected as an off-road vehicle, i.e., vehicle 115 is falsely detected.
An embodiment of this application provides an object detection method in which the electronic device supports vehicle-to-everything (V2X) wireless communication technology, uses V2X to obtain vehicle information, and combines the vehicle information obtained through V2X in the object detection decision, reducing false detection and missed detection and improving object detection precision.
V2X is a new generation of information and communication technology that connects vehicles with everything, where V stands for vehicle and X stands for any object that exchanges information with the vehicle; for example, X may include vehicles, people, roadside infrastructure, and the network. The information interaction modes of V2X include: vehicle to vehicle (V2V), vehicle to infrastructure (V2I), vehicle to pedestrian (V2P), and vehicle to network (V2N).
Cellular-based V2X is C-V2X. C-V2X is a vehicular wireless communication technology evolved from cellular network communication technologies such as 3G/4G/5G. C-V2X includes two communication interfaces: one is the short-range direct communication interface (PC5) between terminals such as vehicles, people, and roadside infrastructure; the other is the communication interface (Uu) between such terminals and the network, which enables reliable long-distance, wide-area communication.
Terminals in V2X include the road side unit (RSU) and the on board unit (OBU). Exemplarily, FIG. 3 shows a deployment example of an RSU and an OBU. The RSU is a static entity supporting V2X applications, deployed on the roadside, for example on a gantry beside the road; it can exchange data with other entities supporting V2X applications (such as RSUs or OBUs). The OBU is a dynamic entity supporting V2X applications, usually installed in a vehicle, and can exchange data with other entities supporting V2X applications (such as RSUs or OBUs).
In one implementation, the RSU may be the electronic device 100 in FIG. 1 or FIG. 4.
In another implementation, the electronic device 100 in FIG. 1 or FIG. 4 may include an RSU.
In one implementation, the OBU may be called a mobile device. Further, if the OBU is installed on a vehicle, it may be called an in-vehicle device.
In another implementation, the mobile device may include an OBU. Further, if the mobile device is installed on a vehicle, it may be called an in-vehicle device.
It should be noted that the above implementations can be freely combined, provided they do not contradict each other.
The object detection method provided by the embodiments of this application can be applied to the scenario shown in FIG. 4. OBUs are installed in the electronic device 100 and in each vehicle 200. The electronic device 100 and each vehicle 200 can communicate with each other through V2X. For example, the OBU of a vehicle 200 periodically broadcasts a basic safety message (BSM), which includes basic information of the vehicle, such as the driving speed, heading, position, acceleration, vehicle category, predicted and historical paths, and vehicle events. The communication distance between OBUs is a first distance (for example, 500 meters). The electronic device 100 can obtain, through V2X, the basic safety messages broadcast by the vehicles 200 within the first distance around it.
In other examples, the electronic device 100 supports the RSU function, and the communication distance between an OBU and an RSU is a second distance (for example, 1000 meters). The electronic device 100 can obtain, through V2X, the basic safety messages broadcast by the vehicles 200 within the second distance around it. The following embodiments of this application are described using an example in which an OBU is installed in the electronic device 100. It can be understood that the object detection method provided by the embodiments of this application is equally applicable when the electronic device supports the RSU function.
The electronic device 100 captures road images and performs object detection on the road images using an object detection algorithm. Object detection algorithms include, for example, the YOLO (you only look once) algorithm and Fast_RCNN. In object detection, one or more candidate regions of interest (RoIs), called candidate frames, are first obtained by computation for each detection object; then one RoI is selected from the one or more candidate frames, i.e., the detection frame of the detection object is obtained. Detection objects may include vehicles, pedestrians, road signs, and so on.
In the object detection method provided by the embodiments of this application, the electronic device 100 also obtains, through V2X, the broadcast messages of the vehicles 200 within the first distance around it, and selects one RoI from the multiple candidate frames in combination with the obtained broadcast messages of the surrounding vehicles, improving the detection precision of object detection.
Optionally, the electronic device 100 may be an AI camera, an electronic device including an AI camera, or an electronic device including a camera or camera module (different from an AI camera).
Exemplarily, FIG. 5 shows a schematic structural diagram of an electronic device. The electronic device 100 may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna, a wireless communication module 250, a camera sensor 260, and so on.
It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown, or combine some components, or split some components, or have a different component arrangement. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 210 may include one or more processing units. For example, the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), and so on. Different processing units may be independent components or may be integrated in one or more processors. In some embodiments, the electronic device 100 may also include one or more processors 210. The controller can generate operation control signals according to instruction operation codes and timing signals to complete the control of instruction fetching and instruction execution.
In some embodiments, the processor 210 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface, and so on. The USB interface 230 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, and so on. The USB interface 230 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices.
It can be understood that the interface connection relationships between the modules illustrated in this embodiment are only schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may also adopt interface connection manners different from those in the above embodiment, or a combination of multiple interface connection manners.
The charging management module 240 is used to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 240 may receive the charging input of a wired charger through the USB interface 230. In some wireless charging embodiments, the charging management module 240 may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 242, the charging management module 240 can also supply power to the electronic device through the power management module 241.
The power management module 241 is used to connect the battery 242, the charging management module 240, and the processor 210. The power management module 241 receives input from the battery 242 and/or the charging management module 240 and supplies power to the processor 210, the internal memory 221, the external memory interface 220, the wireless communication module 250, and so on. The power management module 241 can also be used to monitor parameters such as the battery capacity, the battery cycle count, and the battery health status (leakage, impedance). In some other embodiments, the power management module 241 may also be disposed in the processor 210. In other embodiments, the power management module 241 and the charging management module 240 may also be disposed in the same device.
The wireless communication function of the electronic device 100 can be implemented through the antenna, the wireless communication module 250, and so on.
The wireless communication module 250 can provide wireless communication solutions applied on the electronic device 100, including Wi-Fi, Bluetooth (BT), and wireless data transmission modules (for example, 433 MHz, 868 MHz, 915 MHz). The wireless communication module 250 may be one or more devices integrating at least one communication processing module. The wireless communication module 250 receives electromagnetic waves via the antenna, filters and frequency-modulates the electromagnetic wave signals, and sends the processed signals to the processor 210. The wireless communication module 250 can also receive signals to be sent from the processor 210, frequency-modulate and amplify them, and convert them into electromagnetic waves for radiation via the antenna.
In this embodiment of the application, the electronic device 100 can receive broadcast messages (basic safety messages) through the wireless communication module.
The external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capability of the electronic device 100. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function, for example saving files such as music and videos in the external memory card.
The internal memory 221 can be used to store one or more computer programs, which include instructions. By running the above instructions stored in the internal memory 221, the processor 210 can cause the electronic device 100 to perform the object detection method provided in some embodiments of this application, as well as various applications and data processing. The internal memory 221 may include a code storage area and a data storage area. The code storage area may store an operating system. The data storage area may store data created during the use of the electronic device 100, and so on. In addition, the internal memory 221 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage components, flash memory components, universal flash storage (UFS), and so on. In some embodiments, the processor 210 may cause the electronic device 100 to perform the object detection method provided in the embodiments of this application, as well as other applications and data processing, by running instructions stored in the internal memory 221 and/or instructions stored in a memory disposed in the processor 210.
The camera sensor 260 is used to capture still images or video. An object generates an optical image through a lens, which is projected onto the camera sensor 260. The camera sensor 260 may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The camera sensor 260 converts the optical signal into an electrical signal and then passes the electrical signal to the processor 210 to be converted into a digital image signal. The DSP in the processor 210 converts the digital image signal into an image signal in a standard format such as RGB or YUV. For example, when taking a picture, the shutter opens, light is transmitted through the lens to the camera sensor 260, the optical signal is converted into an electrical signal, and the camera sensor 260 passes the electrical signal to the processor 210 for processing, converting it into an image visible to the naked eye. The processor 210 can also perform algorithmic optimization on the noise, brightness, and skin tone of the image, and can also optimize parameters such as the exposure and color temperature of the shooting scene.
In this embodiment of the application, the electronic device 100 captures road images through the camera sensor 260, and the road images captured by the camera sensor 260 are passed to the processor 210. The processor 210 uses an object detection algorithm such as YOLO or Fast_RCNN to compute one or more candidate frames (RoIs) for each object in the road image. The electronic device 100 also receives the broadcast messages (basic safety messages) of surrounding vehicles through the wireless communication module 250. The processor 210 receives the basic safety messages of the surrounding vehicles and, in combination with them, selects one RoI from the multiple candidate frames, thereby obtaining the detection object.
It should be noted that the system to which the embodiments of this application apply may include more electronic devices than shown in FIG. 4. In some embodiments, the system also includes a server (such as a cloud server). In one example, the electronic device captures road images and uploads them to the server. The electronic device receives the basic safety messages periodically broadcast by the OBUs within the first distance from it and forwards the obtained basic safety messages to the server. On the server, an object detection algorithm such as YOLO or Fast_RCNN is used to compute one or more candidate frames (RoIs) for each object in the road image; in combination with the obtained basic safety messages of the vehicles, one RoI is selected from the multiple candidate frames, thereby obtaining the detection object.
The object detection algorithm provided by the embodiments of this application is described in detail below with reference to the accompanying drawings, taking the scenario shown in FIG. 4 as an example.
Exemplarily, FIG. 6 shows a schematic flowchart of an object detection algorithm provided by an embodiment of this application. The object detection algorithm provided by the embodiment of this application may include:
S601. Capture images periodically at a first angle with a period of a first duration; after each image is captured, recognize the image using an object detection algorithm to obtain one or more candidate frames for each object in the image and the candidate frame information corresponding to each candidate frame.
The electronic device uses the camera to capture images periodically at the first angle with a period of the first duration (for example, 1 s). After each image is captured, an object detection algorithm is used to recognize the image. For example, the object detection algorithm is YOLO, Fast_RCNN, or the like. The detection objects of the object detection are vehicles.
The electronic device recognizes each captured image and obtains one or more candidate frames for each object in the image, as well as the candidate frame information corresponding to each candidate frame. The candidate frame information includes the detection probability, the image position, the object category (for example, the vehicle category), the maximum intersection-over-union (max IoU) between this candidate frame and the other candidate frames, and so on. The detection probability is the probability that the detection object belongs to that object category (for example, vehicle category), also called the score; the image position is the pixel coordinates of the candidate frame in the image, including, for example, the pixel coordinates (x_min, y_min) of the upper-left corner of the candidate frame, the pixel coordinates (x_max, y_max) of the lower-right corner of the candidate frame, and the pixel coordinates (x_p, y_p) of the center point of the candidate frame; vehicle categories include sedan, off-road vehicle, bus, truck, school bus, fire engine, ambulance, police car, and so on. The intersection-over-union (IoU) is the ratio of the intersection of two candidate frames to the union of the two candidate frames, as illustrated in the sketch below.
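The IoU can be computed directly from the corner coordinates listed in Table 1. The following Python sketch illustrates the definition only; the function and the frame layout are ours, not part of the patent:

def iou(frame_a, frame_b):
    """Intersection-over-union of two candidate frames.

    Each frame is (x_min, y_min, x_max, y_max), as in Table 1.
    """
    ix_min = max(frame_a[0], frame_b[0])
    iy_min = max(frame_a[1], frame_b[1])
    ix_max = min(frame_a[2], frame_b[2])
    iy_max = min(frame_a[3], frame_b[3])
    # Overlap width/height; zero when the frames are disjoint.
    iw = max(0.0, ix_max - ix_min)
    ih = max(0.0, iy_max - iy_min)
    inter = iw * ih
    area_a = (frame_a[2] - frame_a[0]) * (frame_a[3] - frame_a[1])
    area_b = (frame_b[2] - frame_b[0]) * (frame_b[3] - frame_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0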
Exemplarily, FIG. 7 shows an example scenario diagram of the object detection method provided by an embodiment of this application.
As shown in (a) of FIG. 7, object detection is performed on a first image, and candidate frame 0, candidate frame 1, candidate frame 2, candidate frame 3, candidate frame 4, candidate frame 5, candidate frame 6, and candidate frame 7 are obtained. The upper-left pixel coordinates (x_min, y_min), lower-right pixel coordinates (x_max, y_max), center-point pixel coordinates (x_p, y_p), detection probabilities (score), and maximum intersection-over-union values (max IoU) of candidate frames 0-7 are shown in Table 1.
Table 1
No. x_min y_min x_max y_max x_p y_p score max IoU
0 26 1 71 29 13.5 51 0.87 0.542
1 34 3 83 35 18.5 59 0.91 0.542
2 356 38 417 80 197 248.5 0.52 0.392
3 377 30 435 74 203.5 254.5 0.95 0.392
4 386 47 446 89 216.5 267.5 0.63 0.353
5 354 19 398 49 186.5 223.5 0.86 0.501
6 365 17 412 52 191 232 0.98 0.501
7 380 4 416 30 192 223 0.76 0.192
S602. Receive a first message from the mobile device of an object; the first message includes the position information, category information, and other information of the object.
Vehicles serve as the detection objects, and OBUs serve as the mobile devices. The OBU in a vehicle periodically sends a first message with a period of a second duration (for example, 100 ms); the first message includes information such as the vehicle identifier, the vehicle position (latitude, longitude, and altitude), the vehicle category (including sedan, off-road vehicle, bus, truck, school bus, fire engine, ambulance, police car, and so on), and the message sending time. For example, the first message is a BSM message. Exemplarily, the BSM message includes the information shown in Table 2.
Table 2
BSM message | Description
vehicle ID | Vehicle identifier
DSecond | Coordinated universal time (UTC time)
VehicleSize | Vehicle size (length, width, height)
AccelerationSet4Way | Four-axis acceleration of the vehicle
Heading | Vehicle heading angle
Speed | Current driving speed of the vehicle
VehicleClassification | Vehicle category
Position3D | Vehicle position (latitude, longitude, altitude)
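For illustration only, the BSM fields of Table 2 that this method consumes could be carried in a structure such as the following sketch; the field names mirror Table 2, while the types and naming are our assumptions rather than the V2X message definition:

from dataclasses import dataclass

@dataclass
class BsmMessage:
    """Subset of the BSM fields in Table 2 used by the method (illustrative)."""
    vehicle_id: str        # vehicle ID
    dsecond: float         # DSecond: UTC sending time, in seconds
    heading: float         # Heading: vehicle heading angle
    speed: float           # Speed: current driving speed
    vehicle_class: str     # VehicleClassification, e.g. "sedan"
    latitude: float        # Position3D: latitude B
    longitude: float       # Position3D: longitude L
    altitude: float        # Position3D: altitude H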
The electronic device receives the first messages periodically sent by the surrounding vehicles, each with the period of the second duration, and saves the received first messages.
S603. Determine the first image corresponding to a first message according to whether the difference between the reception time of the first message and the capture time of the image is within a preset duration.
In one implementation, the electronic device takes the capture time point of the image as a first moment, and determines the first messages whose sending time is within the second duration before or after the first moment as the first messages corresponding to the first image.
Exemplarily, the first image is the image captured by the camera at 18:20:01. The period with which a vehicle sends the first message is 100 milliseconds (i.e., the second duration is 100 milliseconds). The camera determines the first messages whose sending time is between (18:20:01 minus 100 milliseconds) and 18:20:01 as the first messages corresponding to the first image.
Optionally, in some examples, if the saved first messages include multiple first messages from the same vehicle, only the last of the multiple first messages from that vehicle is retained. A sketch of this selection is given below.
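A minimal Python sketch of S603 under the structure assumed above (all names are ours): for each vehicle, keep the last first message whose sending time falls within the second duration before the capture time of the image.

def messages_for_image(messages, t_capture, second_duration=0.1):
    """S603 sketch: pick the first messages corresponding to one image.

    messages: iterable of BsmMessage (see the sketch under Table 2)
    t_capture: capture time of the image in seconds (the first moment)
    second_duration: sending period, e.g. 0.1 s (100 ms)
    """
    latest = {}
    for msg in messages:
        # Keep messages sent within the second duration before the capture time.
        if t_capture - second_duration <= msg.dsecond <= t_capture:
            prev = latest.get(msg.vehicle_id)
            # Of several messages from the same vehicle, keep only the last one.
            if prev is None or msg.dsecond > prev.dsecond:
                latest[msg.vehicle_id] = msg
    return list(latest.values())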
S604. According to the position information of the object in the first message, combined with the position information of the electronic device and the first angle, obtain the first position of the object in the first image through coordinate conversion.
The electronic device obtains, from each first message corresponding to the first image, information such as the identifier, vehicle position, vehicle category, and message sending time of the corresponding vehicle. Exemplarily, from the BSM message of vehicle 1, the camera obtains the vehicle position of vehicle 1 as longitude L_1, latitude B_1, altitude H_1, and the vehicle category as sedan. From the BSM message of vehicle 2, the camera obtains the vehicle position of vehicle 2 as longitude L_2, latitude B_2, altitude H_2, and the vehicle category as sedan. From the BSM message of vehicle 3, the camera obtains the vehicle position of vehicle 3 as longitude L_3, latitude B_3, altitude H_3, and the vehicle category as sedan. From the BSM message of vehicle 4, the camera obtains the vehicle position of vehicle 4 as longitude L_4, latitude B_4, altitude H_4, and the vehicle category as sedan. From the BSM message of vehicle 5, the camera obtains the vehicle position of vehicle 5 as longitude L_5, latitude B_5, altitude H_5, and the vehicle category as sedan.
In one implementation, the latitude, longitude, and altitude (B, L, H) of the vehicle position in the first message are converted into pixel coordinates (x_v, y_v).
1. Convert the latitude, longitude, and altitude values of the vehicle position into coordinate values in the world coordinate system.
As shown in FIG. 8, latitude, longitude, and altitude (B, L, H) are coordinate values in the geodetic coordinate system. Latitude B is defined as the angle between the ground normal at a point and the equatorial plane; taking the equatorial plane as the reference, it is negative southward with a range of -90° to 0°, and positive northward with a range of 0° to 90°. Longitude L is measured from the initial geodetic meridian plane, positive eastward and negative westward, with a range of -180° to 180°. The distance from a point along the normal to the ellipsoid surface is called the altitude H of that point.
The world coordinate system takes the ellipsoid center O as the coordinate origin, the intersection line of the initial meridian plane and the equatorial plane as the X axis, the direction orthogonal to the X axis in the equatorial plane as the Y axis, and the rotation axis of the ellipsoid as the Z axis; the three directions form a right-handed system.
The coordinate values (B, L, H) of the geodetic coordinate system (B latitude, L longitude, H altitude) are converted into world coordinate system values (x, y, z) according to formula ① below. The formula is rendered as an image in the source; the reconstruction here is the standard geodetic-to-Cartesian conversion that the surrounding text describes:
x = (N + H) cos B cos L
y = (N + H) cos B sin L
z = (N (1 − e²) + H) sin B   ①
where N is the radius of the prime vertical circle and e is the first eccentricity of the Earth. Let the equatorial radius of the reference ellipsoid be a and the polar radius be b; in the definition of the reference ellipsoid, a is greater than b. Then: e² = (a² − b²)/a², N = a/(1 − e² sin²B)^(1/2).
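Formula ① can be sketched in Python as follows; the WGS-84 semi-axes are example values for the reference ellipsoid, not values specified by the patent:

import math

A = 6378137.0             # equatorial radius a (WGS-84, for illustration)
B_POLAR = 6356752.314245  # polar radius b (WGS-84, for illustration)
E2 = (A ** 2 - B_POLAR ** 2) / A ** 2  # first eccentricity squared e^2

def geodetic_to_world(lat_b_deg, lon_l_deg, h):
    """Formula (1): geodetic (B, L, H) to world coordinates (x, y, z)."""
    b = math.radians(lat_b_deg)
    l = math.radians(lon_l_deg)
    n = A / math.sqrt(1.0 - E2 * math.sin(b) ** 2)  # prime vertical radius N
    x = (n + h) * math.cos(b) * math.cos(l)
    y = (n + h) * math.cos(b) * math.sin(l)
    z = (n * (1.0 - E2) + h) * math.sin(b)
    return x, y, z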
2. Convert the coordinate values in the world coordinate system into pixel coordinates in the image.
There are four coordinate systems in a camera: the world coordinate system, the camera coordinate system, the image coordinate system, and the pixel coordinate system. As shown in FIG. 9, the coordinate system X_cY_cZ_c is the camera coordinate system, which takes the camera optical center as the origin, with the X and Y axes parallel to two edges of the image and the Z axis coinciding with the optical axis. The coordinate system X_iY_iZ_i is the image coordinate system, which takes the intersection of the optical axis and the projection plane as the origin, with the X and Y axes parallel to two edges of the image and the Z axis coinciding with the optical axis. The coordinate system uv is the pixel coordinate system, which lies in the same plane as the image coordinate system but has a different origin; looking from the camera optical center toward the projection plane, the upper-left corner of the projection plane is the origin of the pixel coordinate system.
The world coordinate system values (x, y, z) are converted into pixel coordinate system values (x_v, y_v) according to formula ② below. The formula and its matrices are rendered as images in the source; the reconstruction here is the standard pinhole camera projection consistent with the surrounding description:
s [x_v, y_v, 1]^T = K [R | t] [x, y, z, 1]^T   ②
where K is the camera intrinsic parameter matrix, which is related to the hardware parameters inside the camera, and [R | t] is the camera extrinsic parameter matrix, which is related to the relative position of the world coordinate system and the pixel coordinate system (for example, the shooting angle of the camera, which may also be called the first angle, and the relative position of the camera and the vehicle). That is, a point in the world coordinate system is converted into a point in the pixel coordinate system through the rotation R (with elements r_11 to r_33) and the translation t (with elements t_1 to t_3); s is the value of the object point along the Z axis direction of the pixel coordinate system. For the contents of r_11-r_33 and t_1-t_3, reference may be made to the related art in this field, which is not repeated here.
The pixel coordinates in the image obtained through the conversion are the first position of the object in the first image. A sketch of the projection is given below.
The above implementation is described using the example of converting the vehicle position in the first message from coordinate values (longitude, latitude, and altitude) in the geodetic coordinate system into pixel coordinate values in the pixel coordinate system. It can be understood that, in other implementations, the image position in the candidate frame information may instead be converted from pixel coordinate values in the pixel coordinate system into coordinate values in the geodetic coordinate system and then compared with the vehicle position in the first message; this is not limited in the embodiments of this application.
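A Python sketch of formula ②; the intrinsic matrix values are placeholders, and a real deployment would use the calibrated intrinsics of the camera and the extrinsics corresponding to the first angle:

import numpy as np

def world_to_pixel(point_xyz, K, R, t):
    """Formula (2) sketch: project a world point to pixel coordinates (x_v, y_v)."""
    p_cam = R @ np.asarray(point_xyz, dtype=float) + t  # world -> camera frame
    uvw = K @ p_cam                                     # homogeneous pixel coords
    s = uvw[2]                                          # depth factor s
    return uvw[0] / s, uvw[1] / s

# Placeholder calibration (illustrative values only, not from the patent):
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)     # rotation r_11..r_33
t = np.zeros(3)   # translation t_1..t_3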
S605. Determine at least one first candidate frame related to the object in the image according to whether the difference between the first position and the position of the object in the image in the candidate frame information is within a preset threshold.
The candidate frames of the first image are traversed; if the first distance d between the first position (x_v, y_v) and the image position (x_p, y_p) in the candidate frame information of a candidate frame is less than the preset distance threshold d_threshold (for example, 0.5 meters), the candidate frame is determined as the first candidate frame corresponding to the first message. Here d is the distance between the two points (the formula is rendered as an image in the source; the Euclidean form below follows from the definition):
d = sqrt((x_v − x_p)² + (y_v − y_p)²)
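S605 then reduces to a nearest-distance test over the candidate frames; a Python sketch under the assumptions above:

import math

def first_candidate_frames(candidates, x_v, y_v, d_threshold=0.5):
    """S605 sketch: find the first candidate frames for one first message.

    candidates: list of dicts with center coordinates "x_p" and "y_p"
    (x_v, y_v): first position of the object converted into the image
    Returns (index, d) pairs for frames whose first distance d < d_threshold.
    """
    matches = []
    for i, frame in enumerate(candidates):
        d = math.hypot(x_v - frame["x_p"], y_v - frame["y_p"])  # first distance
        if d < d_threshold:
            matches.append((i, d))
    return matches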
S606. Adjust the parameters corresponding to the at least one first candidate frame; the parameters are related to the candidate frame information.
In some embodiments, if d is less than d_threshold, indicating a high probability that the vehicle corresponding to the candidate frame and the vehicle sending the first message are the same vehicle, the score of the candidate frame corresponding to the first message is increased. This increases the probability that the candidate frame is determined to be the detection frame and reduces the probability of missed detection.
In one implementation, the detection probability of the candidate frame corresponding to the first message is increased according to the first distance d. Exemplarily, the adjusted detection probability score_n of the candidate frame corresponding to the first message is obtained according to formula ③ below. The adjustment coefficient (1 − d/d_threshold) is the confidence of score_n: the smaller d is, the greater the confidence; score_oral is the detection probability value in the candidate frame information, and score′ is the set detection probability adjustment threshold, satisfying score_th < score′ < 1. The formula is rendered as an image in the source; the reconstruction below follows the same convex-combination pattern as formula ④:
score_n = (1 − d/d_threshold) score′ + (d/d_threshold) score_oral   ③
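A Python sketch of formula ③ as reconstructed above (the formula itself is inferred from the surrounding text rather than extracted verbatim):

def adjust_score(score_oral, d, d_threshold, score_prime):
    """Formula (3) as reconstructed: raise the score of a first candidate frame.

    score_prime is score', chosen so that score_th < score' < 1.
    """
    confidence = 1.0 - d / d_threshold  # adjustment coefficient
    return confidence * score_prime + (d / d_threshold) * score_oral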
In some embodiments, if d is less than d_threshold, indicating a high probability that the vehicle corresponding to the candidate frame and the vehicle sending the first message are the same vehicle, the IoU threshold IoU_th corresponding to the candidate frame in the process of determining the detection frame from multiple candidate frames using the NMS algorithm is increased; this reduces the probability that the candidate frame is deleted, i.e., increases the probability that the candidate frame is determined to be the detection frame, reducing the probability of missed detection.
In one implementation, the IoU threshold corresponding to the candidate frame corresponding to the first message is increased according to the first distance d. Exemplarily, the IoU threshold IoU_th^n corresponding to the candidate frame corresponding to the first message is obtained according to formula ④ below. The adjustment coefficient (1 − d/d_threshold) is the confidence of IoU_th^n: the smaller d is, the greater the confidence; IoU_th is the preset value of the IoU threshold in the NMS algorithm, and IoU′_th is the set IoU threshold adjustment threshold, satisfying IoU_th < IoU′_th < 1. The formula is rendered as an image in the source; the reconstruction below matches the worked example later in this description (preset threshold 0.5, IoU′_th 0.6, confidence 0.6, result 0.56):
IoU_th^n = (1 − d/d_threshold) IoU′_th + (d/d_threshold) IoU_th   ④
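A Python sketch of formula ④; with the worked example below it yields 0.56:

def adjust_iou_threshold(iou_th, d, d_threshold, iou_th_prime):
    """Formula (4): IoU threshold for a candidate frame matched to a first message.

    iou_th is the preset NMS threshold; iou_th_prime is IoU'_th,
    chosen so that iou_th < iou_th_prime < 1.
    """
    confidence = 1.0 - d / d_threshold  # adjustment coefficient
    return confidence * iou_th_prime + (d / d_threshold) * iou_th

# Worked example: confidence 0.6 means d/d_threshold = 0.4, so
# adjust_iou_threshold(0.5, 0.4, 1.0, 0.6) == 0.6 * 0.6 + 0.4 * 0.5 == 0.56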
S607. Obtain one detection frame of the object according to the detection probability, in combination with a specific algorithm; the candidate frame information of the detection frame is the finally obtained information of the object.
In some embodiments, candidate frames whose detection probability (score) is less than the detection probability threshold score_th (for example, 0.85) are deleted, i.e., it is determined that a candidate frame whose detection probability is less than the detection probability threshold is not the detection frame.
Exemplarily, score_th is set to 0.85; the detection probabilities of the candidate frames No. 2, 4, and 7 in Table 1 are less than score_th. A first message is received, and the first distance between the vehicle position in the first message and the image position (192, 223) of the candidate frame No. 7 is less than d_threshold, so this first message corresponds to the candidate frame No. 7. score′ is set to 0.95. Exemplarily, as shown in Table 3, the score_n of the candidate frame No. 7 is calculated to be 0.87, with a confidence of 0.5.
Table 3
[Table 3 is rendered as an image in the source; it lists the adjusted detection probabilities score_n and the corresponding confidences of the matched candidate frames.]
In this way, the candidate frames No. 2 and No. 4 are deleted; the score_n of the candidate frame No. 7 is greater than score_th, so the candidate frame No. 7 is retained; this reduces the probability of missed selection. After screening by detection probability, the remaining candidate frames are the candidate frames No. 0, No. 1, No. 3, No. 5, No. 6, and No. 7.
In some embodiments, a specific algorithm, such as the non-maximum suppression (NMS) algorithm, is used to determine the detection frame from the multiple candidate frames.
The NMS algorithm is introduced as follows (a code sketch follows the steps):
1. From the multiple RoIs, select the one with the largest score, denote it box_best, and retain it.
2. Compute the IoU of box_best with each of the remaining RoIs.
3. If the IoU of a candidate frame with box_best is greater than the IoU threshold IoU_th (for example, 0.5), delete that candidate frame. When the IoU of two candidate frames is greater than IoU_th, the probability that the two candidate frames represent the same detection object is high; the candidate frame with the smaller score is deleted and the candidate frame with the higher score is retained. In this way, the lower-score candidate frames of the same detection object can be removed from the multiple candidate frames, and the higher-score candidate frame of the detection object is determined as the detection frame.
4. From the remaining RoIs, select the one with the largest score, and perform steps 1-3 in a loop.
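The following Python sketch combines steps 1-4 with the per-frame IoU threshold of S606: a frame matched to a first message carries its increased threshold from formula ④, while unmatched frames keep the preset value. The iou() helper is the sketch given under S601; all other names are ours.

def nms_with_per_frame_thresholds(frames, scores, iou_thresholds):
    """NMS in which each frame may carry its own IoU threshold (steps 1-4).

    frames: list of (x_min, y_min, x_max, y_max) tuples
    scores: detection probabilities after the S606 adjustment
    iou_thresholds: per-frame thresholds, i.e. the preset IoU_th, or the
        increased value from formula (4) for frames matched to a first message
    Returns the indices of the frames kept as detection frames.
    """
    order = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    kept = []
    while order:
        best = order.pop(0)  # step 1: highest remaining score (box_best)
        kept.append(best)
        survivors = []
        for i in order:
            overlap = iou(frames[best], frames[i])  # step 2, using iou() above
            # Step 3: delete frame i only if the overlap exceeds i's own threshold.
            if overlap <= iou_thresholds[i]:
                survivors.append(i)
        order = survivors  # step 4: repeat on the remaining RoIs
    return kept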
Exemplarily, the detection frame is determined from the candidate frames shown in Table 1 according to the NMS algorithm. After screening by detection probability, the remaining candidate frames in Table 1 are the candidate frames No. 0, No. 1, No. 3, No. 5, No. 6, and No. 7. The preset IoU threshold IoU_th is 0.5. The max IoU of the candidate frames No. 0 and No. 1 is 0.542; since the score 0.91 of No. 1 is greater than the score 0.87 of No. 0, and 0.542 > 0.5, the lower-score candidate frame No. 0 would ordinarily be deleted. However, a first message is received, and the first distance between the vehicle position in the first message and the image position (13.5, 51) of the candidate frame No. 0 is less than d_threshold, so this first message corresponds to the candidate frame No. 0; the confidence (1 − d/d_threshold) computed from d is 0.6. Exemplarily, as shown in Table 3, IoU′_th is set to 0.6, and the IoU threshold corresponding to the candidate frame No. 0 computed according to formula ④ is 0.56; since 0.542 < 0.56, the candidate frame No. 0 is not deleted. In this way, the candidate frames No. 0 and No. 1 are both retained. The max IoU of the candidate frames No. 5 and No. 6 is 0.501; since the score 0.98 of No. 6 is greater than the score 0.86 of No. 5, and 0.501 > 0.5, the lower-score candidate frame No. 5 is deleted. Thus, after screening with the NMS algorithm, the remaining candidate frames in Table 1 are the candidate frames No. 0, No. 1, No. 3, No. 6, and No. 7. As shown in (b) of FIG. 7, the candidate frames No. 0, No. 1, No. 3, No. 6, and No. 7 are determined as detection frames.
In some embodiments, the electronic device also determines the vehicle category of the detection object according to the first distance d.
In one implementation, if d is less than d_threshold, indicating a high probability that the vehicle corresponding to the candidate frame and the vehicle sending the first message are the same vehicle, the vehicle category in the first message is determined as the vehicle category of the detection object. Since the accuracy rate of the vehicle category in the first message is high, this improves the precision of object detection.
Exemplarily, the vehicle category of the detection object is obtained according to formula ⑤ below. The adjustment coefficient (1 − d/d_threshold) is the confidence of class: the smaller d is, the greater the confidence; class_oral is the vehicle category in the candidate frame information, and class_v is the vehicle category in the first message. The formula is rendered as an image in the source; the reconstruction below follows the surrounding description:
class = class_v if d < d_threshold; otherwise class = class_oral   ⑤
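A Python sketch of formula ⑤ together with the no-message case described in the examples that follow (names are ours):

def detection_frame_class(class_oral, class_v=None, d=None, d_threshold=0.5):
    """Formula (5) as reconstructed, plus the no-message case.

    class_oral: vehicle category from the candidate frame information
    class_v: vehicle category from the matched first message, or None when
        no first message corresponding to the candidate frame was received
    """
    if class_v is not None and d is not None and d < d_threshold:
        return class_v   # trust the category carried by the first message
    return class_oral    # otherwise keep the category from detection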
In one example, the vehicle category in the first message is inconsistent with the vehicle category of the candidate frame corresponding to the first message, and the vehicle category in the first message is determined as the vehicle category of the detection frame. Exemplarily, the vehicle category in the candidate frame information of the candidate frame No. 0 is off-road vehicle, and the vehicle category in the first message corresponding to the candidate frame No. 0 is sedan; the vehicle category of the detection frame No. 0 is then determined to be sedan.
In one example, no first message corresponding to a candidate frame is received, and the vehicle category of the candidate frame is determined as the vehicle category of the detection frame. Exemplarily, the vehicle category in the candidate frame information of the candidate frame No. 1 is sedan, and no first message corresponding to the candidate frame No. 1 is received; the vehicle category of the detection frame No. 1 is then determined to be sedan.
In one example, the vehicle category in the first message is consistent with the vehicle category of the candidate frame corresponding to the first message, and the vehicle category in the first message is determined as the vehicle category of the detection frame. Exemplarily, the vehicle category in the candidate frame information of the candidate frame No. 7 is sedan, and the vehicle category in the first message corresponding to the candidate frame No. 7 is sedan; the vehicle category of the detection frame No. 7 is then determined to be sedan.
The object detection method provided by the embodiments of this application uses V2X technology to obtain information such as the vehicle position and vehicle category. According to the obtained vehicle position, vehicle category, and other information, the parameters used by the object detection algorithm in the process of determining the detection object are adjusted, so that the information of the detection frame determined from the candidate frames is more accurate and of higher precision; this reduces the probability of missed detection and false detection in object detection.
It should be noted that, in each embodiment of this application, a camera (相机) can be replaced with a camera module (摄像头), and vice versa.
It should be noted that all or part of the various embodiments of this application can be freely and arbitrarily combined. The combined technical solutions also fall within the scope of this application.
It can be understood that, to implement the above functions, the above electronic device includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art should realize that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the embodiments of this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods for each specific application to implement the described functions, but such implementations should not be considered beyond the scope of the embodiments of this application.
The embodiments of this application may divide the above electronic device into functional modules according to the above method examples; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is only a logical function division; there may be other division manners in actual implementation.
In one example, refer to FIG. 10, which shows a possible schematic structural diagram of the electronic device involved in the above embodiments. The electronic device 1000 includes: an image acquisition unit 1010, a processing unit 1020, a storage unit 1030, and a communication unit 1040.
The image acquisition unit 1010 is used to acquire images.
The processing unit 1020 is used to control and manage the actions of the electronic device 1000. For example, it can be used to: perform object detection on an image using an object detection algorithm to obtain candidate frame information; determine the candidate frame corresponding to each first message; adjust the candidate frame parameters according to the first distance between the vehicle position in the first message and the image position in the candidate frame information of the corresponding candidate frame; determine the detection frame from multiple candidate frames according to the adjusted candidate frame parameters; and/or perform other processes for the techniques described herein.
The storage unit 1030 is used to store the program code and data of the electronic device 1000. For example, it can be used to save the received first messages, or to save candidate frame information, and so on.
The communication unit 1040 is used to support communication between the electronic device 1000 and other electronic devices. For example, it can be used to receive the first message.
Of course, the unit modules in the above electronic device 1000 include but are not limited to the above image acquisition unit 1010, processing unit 1020, storage unit 1030, and communication unit 1040. For example, the electronic device 1000 may also include a power supply unit and the like. The power supply unit is used to supply power to the electronic device 1000.
The image acquisition unit 1010 may be a camera sensor. The processing unit 1020 may be a processor or a controller, for example a central processing unit (CPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The storage unit 1030 may be a memory. The communication unit 1040 may be a transceiver, a transceiver circuit, or the like.
For example, the image acquisition unit 1010 is a camera sensor (the camera sensor 260 shown in FIG. 5), the processing unit 1020 is a processor (the processor 210 shown in FIG. 5), the storage unit 1030 may be a memory (the internal memory 221 shown in FIG. 5), and the communication unit 1040 may be called a communication interface, including a wireless communication module (the wireless communication module 250 shown in FIG. 5). The electronic device 1000 provided in this embodiment of the application may be the electronic device 100 shown in FIG. 5. The above camera sensor, processor, memory, communication interface, and so on can be connected together, for example through a bus.
An embodiment of this application also provides a computer-readable storage medium in which computer program code is stored; when a processor executes the computer program code, the electronic device performs the methods in the above embodiments.
An embodiment of this application also provides a computer program product which, when run on a computer, causes the computer to perform the methods in the above embodiments.
The electronic device 1000, the computer-readable storage medium, and the computer program product provided in the embodiments of this application are all used to perform the corresponding methods provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which are not repeated here.
From the description of the above implementations, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example for illustration. In practical applications, the above functions can be allocated to different functional modules as needed, i.e., the internal structure of the electronic device can be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed electronic device and method may be implemented in other manners. For example, the electronic device embodiments described above are only illustrative; for example, the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another electronic device, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces; an indirect coupling or communication connection between electronic devices or units may be electrical, mechanical, or in other forms.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the parts contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product, which is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a magnetic disk, an optical disk, or other media that can store program code.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto; any variation or replacement within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (16)

  1. An object detection method, applied to an electronic device comprising a camera device, wherein the electronic device is fixedly installed, the electronic device communicates wirelessly with a mobile device, the mobile device is within a certain range of the electronic device, and the mobile device is located on the object, the method comprising:
    obtaining, by the electronic device through the camera device and at a first angle, an image by photographing or video recording, the image containing at least one object;
    recognizing the objects contained in the image, and obtaining one or more candidate frames for each of the objects and one piece of candidate frame information corresponding to each candidate frame, the candidate frame information comprising the position of the object in the image, the detection probability of the object, a first object category of the object, and the maximum intersection-over-union max IoU between the candidate frame and other candidate frames overlapping the candidate frame;
    receiving, within a preset duration before or after the image is obtained, a first message from the mobile device, the first message comprising position information of the object and a second object category of the object;
    determining, from the one or more candidate frames, at least one first candidate frame corresponding to the first message according to the position information of the object, the position information of the electronic device and the first angle, and the position of the object in the image;
    obtaining, from the at least one first candidate frame, one or more second candidate frames whose detection probability is greater than or equal to a preset detection probability threshold;
    obtaining one detection frame of the object from the one or more second candidate frames through a non-maximum suppression NMS algorithm, the candidate frame information corresponding to the detection frame being information revealing the object;
    wherein the first object category and the second object category are based on the same category classification standard.
  2. The method according to claim 1, wherein the object comprises at least one of the following: a vehicle, a person; and the position information comprises position coordinates.
  3. An object detection method, applied to an electronic device comprising a camera device, wherein the electronic device is fixedly installed, the electronic device communicates wirelessly with a mobile device, the mobile device is within a certain range of the electronic device, and the mobile device is located on the object, the method comprising:
    obtaining, by the electronic device through the camera device and at a first angle, an image by photographing or video recording, the image containing at least one object;
    recognizing the objects contained in the image, and obtaining one or more candidate frames for each of the objects and one piece of candidate frame information corresponding to each candidate frame, the candidate frame information comprising the position of the object in the image, the detection probability of the object, a first object category of the object, and the maximum intersection-over-union max IoU between the candidate frame and other candidate frames overlapping the candidate frame;
    receiving, within a preset duration before or after the image is obtained, a first message from the mobile device, the first message comprising position information of the object and a second object category of the object;
    determining, from the one or more candidate frames, at least one first candidate frame corresponding to the first message according to the position information of the object, the position information of the electronic device and the first angle, and the position of the object in the image;
    adjusting parameters corresponding to the at least one first candidate frame, the parameters being related to the candidate frame information;
    obtaining, from the at least one first candidate frame and candidate frames other than the first candidate frame, one or more second candidate frames whose detection probability is greater than or equal to a preset detection probability threshold;
    obtaining one detection frame of the object from the one or more second candidate frames through a non-maximum suppression NMS algorithm, the candidate frame information corresponding to the detection frame being information revealing the object;
    wherein the first object category and the second object category are based on the same category classification standard.
  4. The method according to claim 3, wherein the object comprises at least one of the following: a vehicle, a person; and the position information comprises position coordinates.
  5. The method according to claim 4, wherein adjusting the parameters corresponding to the at least one first candidate frame comprises: increasing the value of the detection probability corresponding to each first candidate frame.
  6. The method according to any one of claims 3-5, wherein obtaining one or more second candidate frames whose detection probability is greater than or equal to the preset detection probability threshold comprises: deleting or excluding first candidate frames whose detection probability is less than the preset detection probability threshold, to obtain the one or more second candidate frames.
  7. The method according to claim 5, wherein increasing the value of the detection probability corresponding to each first candidate frame comprises (the formula is rendered as an image in the source; reconstructed here as in the description):
    score_n = (1 − d/d_threshold) score′ + (d/d_threshold) score_oral
    wherein d denotes the distance between the position information in the first message, after coordinate conversion into a first position in the image, and the position of the object in the image; d_threshold denotes a preset distance threshold; score_oral denotes the value of the detection probability corresponding to a first candidate frame; score′ denotes a set detection probability adjustment threshold; detection probability threshold < score′ < 1.
  8. The method according to any one of claims 3-7, wherein adjusting the parameters corresponding to the at least one first candidate frame comprises: increasing the IoU threshold corresponding to the first candidate frame.
  9. The method according to claim 8, wherein increasing the IoU threshold corresponding to the first candidate frame comprises (the formula is rendered as an image in the source; reconstructed here as in the description):
    IoU_th^n = (1 − d/d_threshold) IoU′_th + (d/d_threshold) IoU_th
    wherein d denotes the distance between the position information in the first message, after coordinate conversion into a first position in the image, and the position of the object in the image; d_threshold denotes a preset distance threshold; IoU_th denotes the preset IoU threshold in the NMS algorithm; IoU′_th denotes a set IoU threshold adjustment threshold; IoU_th < IoU′_th < 1.
  10. The method according to claim 8 or 9, wherein obtaining one detection frame of the object from the one or more second candidate frames through the non-maximum suppression NMS algorithm comprises:
    through the NMS algorithm, under the condition that the max IoU in the candidate frame information of a second candidate frame is less than the increased IoU threshold corresponding to the first candidate frame, stopping deleting or excluding the second candidate frame from the candidate frames, and determining the second candidate frame as the detection frame of the object.
  11. The method according to any one of claims 3-10, further comprising:
    if the second object category of the object in the first message is inconsistent with the first object category of the object in the candidate frame information of the first candidate frame, determining the second object category of the object in the first message as the object category of the detection frame.
  12. The method according to claim 11, wherein whether the second object category of the object in the first message is consistent or inconsistent with the first object category of the object in the candidate frame information of the first candidate frame is determined by the following formula (rendered as an image in the source; reconstructed here as in the description):
    class = class_v if d < d_threshold; otherwise class = class_oral
    wherein d denotes the distance between the position information in the first message, after coordinate conversion into a first position in the image, and the position of the object in the image; d_threshold denotes the preset distance threshold; class_v denotes the second object category of the object in the first message; and class_oral denotes the first object category of the object in the candidate frame information of the first candidate frame.
  13. The method according to any one of claims 1-12, wherein determining, from the one or more candidate frames, at least one first candidate frame corresponding to the first message according to the position information of the object, the position information of the electronic device and the first angle, and the position of the object in the image comprises:
    traversing all candidate frame information corresponding to the object, comparing D, obtained by conversion according to the candidate frame information, with a preset distance threshold, and determining a candidate frame for which D is less than the preset distance threshold as a first candidate frame;
    wherein D denotes the distance between the position information of the object, after coordinate conversion into a first position in the image according to the position information of the electronic device and the first angle, and the position of the object in the image.
  14. The method according to any one of claims 1-13, wherein recognizing the objects contained in the image comprises: recognizing the objects contained in the image using the YOLO algorithm or the Fast_RCNN algorithm.
  15. An electronic device, comprising:
    a processor;
    a memory;
    and a computer program, wherein the computer program is stored on the memory, and when the computer program is executed by the processor, the electronic device is caused to perform the method according to any one of claims 1-14.
  16. A computer-readable storage medium, comprising a computer program, wherein when the computer program runs on an electronic device, the electronic device is caused to perform the method according to any one of claims 1-14.
PCT/CN2022/070160 2021-02-27 2022-01-04 Object detection method and electronic device WO2022179314A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22758680.7A EP4276683A4 (en) 2021-02-27 2022-01-04 OBJECT DETECTION METHOD AND ELECTRONIC DEVICE

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110221889.6A 2021-02-27 2021-02-27 Object detection method and electronic device
CN202110221889.6 2021-02-27

Publications (1)

Publication Number Publication Date
WO2022179314A1 true WO2022179314A1 (zh) 2022-09-01

Family

ID=83048618

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/070160 WO2022179314A1 (zh) 2021-02-27 2022-01-04 一种对象检测方法及电子设备

Country Status (3)

Country Link
EP (1) EP4276683A4 (zh)
CN (1) CN115063733A (zh)
WO (1) WO2022179314A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960266A (zh) * 2017-05-22 2018-12-07 Alibaba Group Holding Limited Image target detection method and apparatus
CN109308516A (zh) * 2017-07-26 2019-02-05 Huawei Technologies Co., Ltd. Image processing method and device
US20190130189A1 (en) * 2017-10-30 2019-05-02 Qualcomm Incorporated Suppressing duplicated bounding boxes from object detection in a video analytics system
US20200028736A1 (en) * 2019-08-26 2020-01-23 Lg Electronics Inc. Method and apparatus for determining an error of a vehicle in autonomous driving system
CN111787481A (zh) * 2020-06-17 2020-10-16 Beihang University 5G-based road-vehicle coordinated high-precision perception method
CN112183206A (zh) * 2020-08-27 2021-01-05 Guangzhou Institute of Software Application Technology, Chinese Academy of Sciences Traffic participant positioning method and system based on a roadside monocular camera

Also Published As

Publication number Publication date
CN115063733A (zh) 2022-09-16
EP4276683A4 (en) 2024-06-05
EP4276683A1 (en) 2023-11-15

Similar Documents

Publication Publication Date Title
US11455793B2 (en) Robust object detection and classification using static-based cameras and events-based cameras
WO2020082745A1 (zh) Camera apparatus adjustment method and related device
WO2020133450A1 (zh) System and method for dynamic networking of mobile devices to share computing power
WO2021147637A1 (zh) Lane recommendation method and apparatus, and in-vehicle communication device
CN109817022B (zh) Method, terminal, automobile and system for obtaining the position of a target object
IT201900011403A1 (it) Detecting illegal use of phone to prevent the driver from getting a fine
US20190025801A1 (en) Monitoring server, distributed-processing determination method, and non-transitory computer-readable medium storing program
WO2021258321A1 (zh) Image acquisition method and apparatus
CN107204055A (zh) Intelligent networked driving recorder
WO2022062786A1 (zh) Interface control method and communication apparatus
WO2018032295A1 (zh) Accident scene reconstruction method and apparatus, and motion monitoring device
WO2021088393A1 (zh) Pose determination method, apparatus and system
US20210229804A1 (en) Traffic information processing equipment, system and method
JPWO2018016151A1 (ja) Image processing apparatus and image processing method
JPWO2018016150A1 (ja) Image processing apparatus and image processing method
JP2018064007A (ja) Solid-state imaging element and electronic device
US20240046604A1 (en) Image processing method and apparatus, and electronic device
WO2021170129A1 (zh) Pose determination method and related device
WO2022179314A1 (zh) Object detection method and electronic device
WO2021164387A1 (zh) Early warning method and apparatus for a target object, and electronic device
JP2020129369A (ja) System and method for image processing using a mobile device
CN113468929A (zh) Motion state recognition method and apparatus, electronic device and storage medium
WO2022237390A1 (zh) Communication method, apparatus and system
CN115195605A (zh) Data processing method and apparatus based on a streaming-media rearview mirror system, and vehicle
CN108154691B (zh) Traffic regulation system and method based on dedicated short-range communication

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22758680

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022758680

Country of ref document: EP

Effective date: 20230808

NENP Non-entry into the national phase

Ref country code: DE