WO2022217834A1 - Method and system for image processing - Google Patents

Method and system for image processing

Info

Publication number
WO2022217834A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
image
area
determining
region
Prior art date
Application number
PCT/CN2021/119300
Other languages
French (fr)
Inventor
Xingming Zhang
Lisheng Wang
Mengchao ZHU
Yayun Wang
Original Assignee
Zhejiang Dahua Technology Co., Ltd.
Application filed by Zhejiang Dahua Technology Co., Ltd.
Priority to EP21936704.2A (publication EP4226274A4)
Priority to KR1020237021895A (publication KR20230118881A)
Publication of WO2022217834A1

Classifications

    • G06V 20/586: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; recognition of traffic objects, e.g. traffic signs, traffic lights or roads; of parking space
    • G06V 20/54: Surveillance or monitoring of activities, e.g. for recognising suspicious objects; of traffic, e.g. cars on the road, trains or boats
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 10/12: Details of acquisition arrangements; constructional details thereof
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V 2201/08: Detecting or categorising vehicles
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 2207/30264: Subject of image: parking
    • G08G 1/146: Traffic control systems for road vehicles indicating individual free spaces in parking areas where the parking area is a limited parking space, e.g. parking garage, restricted space
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • the present disclosure relates to image processing, in particular, to a method and system for management of parking spaces based on image processing.
  • a closed outdoor parking lot may realize guidance and management of parking spaces through the detection of vehicles using cameras.
  • the angle of view of a panoramic camera is larger than that of a monocular camera.
  • an image acquired by the panoramic camera can cover a large scene of 360 degrees, thereby covering more targets (e.g., vehicles).
  • however, the identification rate of a vehicle occupying a small area in an image of a large scene is low, causing underreporting.
  • the method for image processing may include obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  • each of at least one matching area in the image may be enclosed in a region of the multiple regions, and the at least one matching area matches the at least one subject.
  • the determining the multiple regions of the image may include determining a reference area of the image, wherein a deformation degree of the reference area satisfies a second condition; and determining the multiple regions according to the reference area.
  • the multiple regions may contain a set of regions which may contain a first region and a second region.
  • the determining the multiple regions according to the reference area may include identifying one or more reference subjects in the reference area; determining, from the one or more reference subjects in the reference area, a target reference subject that satisfies a third condition; determining an edge of the first region in the image based on an edge of a reference matching area which matches the target reference subject; determining an edge of the second region in the image based on an edge of the reference area; and determining the set of regions according to the edge of the first region and the edge of the second region in the image.
  • the third condition may be related to a position of each of the one or more reference subjects in the reference area.
  • the third condition may be related to an identification confidence level of each of the one or more reference subjects in the reference area.
  • the determining the identification result of the image based on the subject among the at least one subject, the at least a portion of the subject being in the overlapping area of the multiple regions may include determining, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject; and determining the identification result of the image according to an identification confidence level of the subject.
  • the identifying the at least one subject in the multiple regions further may include: for a region of the multiple regions, obtaining an identification frame in the region based on an identification algorithm, wherein the identification frame has a corresponding relationship with a matching area which matches the target in the region; correcting the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line; and identifying the subject in the region according to the corrected identification frame.
  • the at least one subject may include a vehicle.
  • the ground captured by the camera may include a parking area.
  • the method may further include determining a vacant parking space in the parking area according to the identification result of the image.
  • the system for image processing may include: at least one storage device including a set of instructions; and at least one processor configured to communicate with the at least one storage device.
  • the at least one processor may be configured to direct the system to perform operations including: obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  • each of at least one matching area in the image may be enclosed in a region of the multiple regions, and the at least one matching area matches the at least one subject.
  • the determining the multiple regions of the image may include determining a reference area of the image, wherein a deformation degree of the reference area satisfies a second condition; and determining the multiple regions according to the reference area.
  • the multiple regions may contain a set of regions which may contain a first region and a second region.
  • the determining the multiple regions according to the reference area may include identifying one or more reference subjects in the reference area; determining, from the one or more reference subjects in the reference area, a target reference subject that satisfies a third condition; determining an edge of the first region in the image based on an edge of a reference matching area which matches the target reference subject; determining an edge of the second region in the image based on an edge of the reference area; and determining the set of regions according to the edge of the first region and the edge of the second region in the image.
  • the third condition may be related to a position of each of the one or more reference subjects in the reference area.
  • the third condition may be related to an identification confidence level of each of the one or more reference subjects in the reference area.
  • the determining the identification result of the image based on the subject among the at least one subject, the at least a portion of the subject being in the overlapping area of the multiple regions may include determining, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject; and determining the identification result of the image according to an identification confidence level of the subject.
  • the identifying the at least one subject in the multiple regions further may include: for a region of the multiple regions, obtaining an identification frame in the region based on an identification algorithm, wherein the identification frame has a corresponding relationship with a matching area which matches the target in the region; correcting the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line; and identifying the subject in the region according to the corrected identification frame.
  • the at least one subject may include a vehicle.
  • the ground captured by the camera may include a parking area.
  • the operations may further include determining a vacant parking space in the parking area according to the identification result of the image.
  • the system for image processing may include: an obtaining module configured to obtain an image acquired by a camera with a height from the ground satisfying a first condition; a segmenting module configured to determine multiple regions of the image; an identifying module configured to identify at least one subject in the multiple regions; and a determining module configured to determine an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  • Another aspect of embodiments of the present disclosure may provide a non-transitory computer readable medium including executable instructions.
  • when executed by at least one processor, the executable instructions may direct the at least one processor to perform a method.
  • the method may include: obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  • FIG. 1 is a diagram illustrating an image processing system according to some embodiments of the present disclosure
  • FIG. 2 is an exemplary flowchart illustrating a process for image processing according to some embodiments of the present disclosure
  • FIG. 3 is an exemplary flowchart illustrating a process of determining a region according to some embodiments of the present disclosure
  • FIG. 4A is a diagram illustrating a process of determining a region according to some embodiments of the present disclosure
  • FIG. 4B is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
  • FIG. 5 is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure
  • FIG. 6 is an exemplary flowchart illustrating a process of determining an identification result of an image according to some embodiments of the present disclosure
  • FIG. 7 is a diagram illustrating a process of determining an IOU of a subject according to some embodiments of the present disclosure
  • FIG. 8 is an exemplary flowchart illustrating a process of correcting an identification frame according to some embodiments of the present disclosure
  • FIG. 9 is a diagram of correcting an identification frame according to some embodiments of the present disclosure.
  • FIG. 10 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
  • FIG. 11 is a diagram of an image of a vehicle to be detected according to some embodiments of the present disclosure.
  • FIG. 12 is a diagram of pre-detection areas and detection partitions according to some embodiments of the present disclosure.
  • FIG. 13 is a diagram of a segmented area recognition result according to some embodiments of the present disclosure.
  • FIG. 14 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure
  • FIG. 15 is another exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
  • FIG. 16 is a diagram illustrating a distortion and tilt of an image according to some embodiments of the present disclosure.
  • FIG. 17 is a diagram of a distortion correction for a vehicle frame according to some embodiments of the present disclosure.
  • FIG. 18 is a diagram illustrating a structure of a vehicle detection device according to some embodiments of the present disclosure.
  • FIG. 19 is another diagram illustrating a structure of a vehicle detection device according to some embodiments of the present disclosure.
  • FIG. 20 is a diagram illustrating a structure of a computer storage medium according to some embodiments of the present disclosure.
  • the terms "system", "device", and "unit" may be used to distinguish different components, elements, members, parts, or assemblies of different levels, and may be replaced by other expressions.
  • flowcharts may be used to describe operations performed by the system according to the embodiments of the present disclosure. It should be understood that the preceding or following operations may not be performed precisely in order. On the contrary, individual steps may be processed in reverse order or simultaneously. At the same time, other operations may be added to the processes, or a step or several steps may be removed from the processes.
  • FIG. 1 is a diagram illustrating an image processing system according to some embodiments of the present disclosure.
  • the image processing system 100 may be applied to process images, and may be particularly suitable for images with a large number of subjects occupying few pixels, and for images affected by occlusion and distortion of subjects located in the distant view and around the edges of the images, etc.
  • the image processing system 100 may be applied to an application area with a large space, especially a parking area (e.g., an outdoor parking lot, an indoor parking lot, or the like) .
  • the image processing system 100 may be applied to identify a subject in an image.
  • a subject identified by the image processing may be a vehicle parked in the parking area, a vacant parking space in the parking area, or the like.
  • the image processing system 100 may monitor the parking situation of vehicles in the parking area in real time, therefore achieving the guidance and management of parking spaces in the parking area.
  • the image processing system 100 may include a processing device 110, a network 120, an image acquisition device 130, and a supporting rod 140.
  • the processing device 110 may be configured to process information and/or data related to an image. For example, the processing device 110 may acquire an image from the image acquisition device 130 through the network 120. The processing device 110 may segment the image, identify multiple regions obtained by the segmentation, and determine at least one subject in the multiple regions. The processing device 110 may determine an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions. The processing device 110 may determine the multiple regions in the image based on a segmentation frame, a reference area, or the like.
  • the processing device 110 may include one or more sub-processing devices (e.g., a single-core processing device or a multi-core processing device) . In some embodiments, the processing device 110 or a part of the processing device 110 may be integrated into the camera 130.
  • the network 120 may provide channels for information exchange.
  • the network 120 may include a variety of network access points, such as wired or wireless access points, base stations, or network exchange points.
  • Data sources may be connected to the network 120 and send information through the network 120 via the access points mentioned above.
  • the image acquisition device 130 may be an electronic device with a function of acquiring images or videos.
  • the image acquisition device 130 may include a camera for acquiring images.
  • the image acquisition device 130 may be a panoramic camera, a monocular camera, a multi-lens camera, or a rotatable camera.
  • the image acquisition device 130 may continuously collect data, collect data at regular intervals, or collect data based on control instructions.
  • the collected data may be stored in a local storage device, or be sent to a remote site via the network 120 for storage or further processing.
  • the image acquisition device 130 may be installed on the supporting rod 140.
  • the supporting rod 140 has a certain height (e.g., 10 meters, 15 meters, etc. ) , such that the image acquisition device 130 may acquire images of a large area.
  • the supporting rod 140 may be any rod on which the image acquisition device 130 may be installed.
  • the supporting rod 140 may include light poles, or the like.
  • the processing device 110 may include an obtaining module, a segmenting module, an identifying module, and a determining module.
  • the obtaining module may be configured to obtain an image.
  • the image may be acquired by a camera, and a height of the camera from the ground may satisfy a first condition.
  • the image may be a panoramic image.
  • the obtaining module may pre-process the image.
  • the segmenting module may be configured to determine multiple regions of the image. When the segmenting module determines the multiple regions of the image, each matching area in the image is fully enclosed in at least one of the multiple regions.
  • the segmenting module may determine the multiple regions in a variety of ways. For example, the segmenting module may determine the multiple regions through a reference area or a segmentation frame.
  • the identifying module may be configured to identify at least one subject in the multiple regions, in a reference area, etc. For example, the identifying module may identify the at least one subject based on an identification model.
  • the identifying module may obtain a position of a subject in a region through the identification model.
  • the position may be expressed in a variety of ways. For example, the position may be expressed by a target frame that includes the subject.
  • the identification model may be a trained machine learning model.
  • the identification model may include a U-Net model, a YOLO model, or the like.
  • the determining module may be configured to determine an identification result of the image. In some embodiments, the determining module may determine the identification result of the image based on a subject among the at least one subject, and the subject may be in an overlapping area of the multiple regions. In some embodiments, the determining module may determine the subject based on an intersection over union (IOU) of each subject in the multiple regions, and may further determine the identification result of the image.
  • the processing device 110 may also include a training module, which may be configured to train an initial model to obtain a trained model.
  • the training module may be configured to train an initial identification model to obtain the identification model.
  • for the relevant content of the obtaining module, the segmenting module, the identifying module, the determining module, and the training module, refer to FIG. 2, FIG. 3, FIG. 4A, FIG. 4B, FIG. 5, FIG. 6, FIG. 7, FIG. 8, and FIG. 9.
  • the above descriptions of the system and its modules are only for convenience of description, and do not limit the present disclosure to the scope of the embodiments mentioned above. It should be understood that for those skilled in the art, after understanding the principles of the system, various modules may be arbitrarily combined with each other, or subsystems connected to other modules may be formed, without departing from the principles.
  • the obtaining module, the segmenting module, the identifying module, and the determining module may be different modules in one system, or one module may implement the functions of the two or more modules mentioned above.
  • each module may share a storage module, and each module may also have its own storage module. Such variations are all within the scope of the present disclosure.
  • FIG. 2 is an exemplary flowchart illustrating a process for image processing according to some embodiments of the present disclosure.
  • process 200 may include the following operations.
  • the process 200 may be performed by the image acquisition device 130.
  • process 200 may be implemented in the system 100 illustrated in FIG. 1.
  • the process 200 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) .
  • the operations of the illustrated process presented below are intended to be illustrative.
  • the process 200 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order in which the operations of the process 200 are illustrated in FIG. 2 and described below is not intended to be limiting.
  • the processing device 110 may obtain an image acquired by a camera, and a height of the camera from the ground may satisfy a first condition.
  • the first condition may be satisfied if the height of the camera from the ground exceeds a height threshold.
  • the height threshold may be 10 meters, 12 meters, 15 meters, or the like.
  • the height of the camera from the ground may be changed according to the actual scene.
  • the camera may be installed on a device with a certain height for taking an image of an actual scene.
  • the camera may be installed on a light pole to take an image of a parking lot (e.g., an outdoor parking lot) .
  • the height of the camera installed on the light pole from the ground of the outdoor parking lot may be 10-15 meters.
  • the camera may be installed on a pole to take an image of an indoor or an outdoor warehouse. The height of the camera installed on the pole from the ground of the warehouse may be 15-20 meters.
  • the image acquired by the camera may be a panoramic image.
  • the image may be a panoramic image of an outdoor parking lot.
  • the image may be a panoramic image of an indoor warehouse, or the like.
  • the number of images acquired by the camera may be more than one.
  • the camera may be a panoramic camera.
  • the camera may be a four-lens fisheye camera, or the like. Accordingly, the image taken by the camera may be a panoramic image.
  • the camera may be a non-panoramic camera.
  • the camera may be a plurality of monocular cameras, and the plurality of monocular cameras may be installed at different positions to take images of different areas on the ground.
  • the obtaining module may obtain the panoramic image of the ground based on the images of different areas taken by the monocular cameras.
  • the camera may be a rotatable camera.
  • the rotatable camera may be rotated to different angles to take images.
  • the obtaining module may obtain the panoramic image of the ground based on the images acquired by the rotatable camera from various angles.
  • the panoramic image may be deformed. The farther away from the center of the image, the greater the deformation degree of the panoramic image may be.
  • the deformation of the panoramic image may be symmetrical.
  • the panoramic image may include a distant view part and a close view part.
  • the distant view part may refer to an area in the panoramic image representing a physical area in space farther from the camera (also called the distant view)
  • the close view part may refer to a region in the panoramic image representing a physical area in space closer to the camera (also called the close view) .
  • objects in the distant view part may occupy fewer pixels than objects in the close view part.
  • objects in the distant view part may be prone to be occluded by other objects in the close view part.
  • vehicles on the ground in the distant view part may be easily occluded by other objects in the close view part (e.g., tall buildings, trees, etc. ) .
  • the obtaining module may acquire images from the camera periodically (e.g., at an interval of a preset number of frames) .
  • the obtaining module may pre-process the images acquired from the camera. For example, pre-processing may include resolution adjustment, image enhancement, cropping, or the like.
  • the obtaining module may crop the image acquired by the camera.
  • an object in the distant view part of the panoramic image may occupy few pixels, and the object may be easily occluded by other objects.
  • the obtaining module may crop the panoramic image acquired by the camera to remove the distant view part. For example, consider a panoramic image A including an edge a and an edge b, where the edge a is close to the distant view part and the edge b is close to the close view part.
  • the obtaining module may remove the 1/4 area of the panoramic image A that is far away from the edge b and close to the edge a. The remaining area may be used as the effective area of the image for subsequent processing (e.g., operation 220, etc. ) .
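  • As a hedged illustration only (not part of the disclosed method), the cropping just described might be sketched in Python as follows; the assumption that the close view part (edge b) lies at the bottom of the image array and the distant view part (edge a) at the top, as well as the 1/4 ratio being exposed as a parameter, are illustrative choices.

        import numpy as np

        def crop_distant_view(panorama: np.ndarray, remove_ratio: float = 0.25) -> np.ndarray:
            # Remove the portion of the panorama that is far from edge b (assumed bottom,
            # close view) and close to edge a (assumed top, distant view); the remaining
            # rows form the effective area used for subsequent processing.
            rows_to_remove = int(panorama.shape[0] * remove_ratio)
            return panorama[rows_to_remove:, :]
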
  • the processing device 110 may determine multiple regions of the image.
  • the region may be a partial region of the image.
  • the number of regions may be one or more. In some embodiments, all regions combined may include the full content of the image. In some embodiments, a region may be of any shape, e.g., a rectangle, or the like.
  • the overlapping area between the multiple regions may refer to an overlapping area between at least two of the multiple regions.
  • for more descriptions of the overlapping area, refer to operation 240 and the related descriptions.
  • the matching area may refer to the area in the image that is configured to match a subject.
  • the matching area may be a bounding box of any shape (e.g., a rectangle) that may include the subject.
  • the bounding box may be a preset frame to place the subject in.
  • the image may be an image of a parking lot, the subject may be a vehicle, and the matching area may therefore be the parking spaces (including occupied parking spaces and vacant parking spaces) .
  • the image may be an image of a warehouse, the subject may be the goods or the shelf, and the matching area may therefore be the shelf spaces of the goods or the shelf.
  • the matching area may also be an area obtained by rotating, zooming (e.g., zooming in) , and/or merging the parking spaces or the shelf spaces.
  • the bounding box may be a frame determined based on the outline of the subject.
  • the bounding box may be a target frame that identifies a target position output by the identification model. Refer to operation 230 for the identification model and the target frame.
  • the multiple regions may satisfy a condition (also referred to as a fifth condition) .
  • the fifth condition may be that each matching area in the image is included in at least one of the multiple regions. It should be understood that a matching area being included in a region means that the matching area in the image is fully enclosed in the region and is not segmented by an edge of the region. As mentioned above, there may be an overlapping area between the multiple regions, and therefore some matching areas may be enclosed in more than one region.
  • the segmenting module may segment the image in a variety of ways to obtain the multiple regions. For example, the segmenting module may segment the image based on the input of a user by obtaining the input of the user from a user terminal. For another example, the segmenting module may determine the multiple regions through a reference area or a segmentation frame. More descriptions for the determination of the multiple regions may be found elsewhere in the present disclosure (e.g., FIG. 3, FIG. 4, FIG. 5, and the related descriptions thereof) .
  • the segmenting module may evaluate the determined multiple regions to determine whether the segmentation is reasonable, i.e., whether each matching area in the image is completely enclosed in a region. If the segmentation is unreasonable, the segmenting module may re-segment the image or provide feedback to the user for re-segmentation.
  • the evaluation of whether the segmentation is reasonable may be achieved through an evaluation model.
  • the evaluation model may be DNN, CNN, or the like.
  • the multiple regions obtained by segmentation may be input into the evaluation model, and the evaluation of whether the segmentation is reasonable may be the output of the evaluation model.
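  • As a hedged, rule-based alternative to such an evaluation model, the fifth condition (every matching area fully enclosed in at least one region) could also be checked geometrically; the axis-aligned box format (x1, y1, x2, y2) used below is an assumption.

        def box_contains(region, matching_area):
            # True if the matching area is fully enclosed in the region, i.e., it is not
            # segmented by any edge of the region; boxes are (x1, y1, x2, y2).
            return (region[0] <= matching_area[0] and region[1] <= matching_area[1]
                    and matching_area[2] <= region[2] and matching_area[3] <= region[3])

        def segmentation_is_reasonable(regions, matching_areas):
            # The fifth condition: each matching area is enclosed in at least one region;
            # a matching area inside an overlapping area may be enclosed in several regions.
            return all(any(box_contains(r, a) for r in regions) for a in matching_areas)
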
  • the processing device 110 may identify at least one subject in the multiple regions.
  • the subject may be an object that needs to be identified in the image.
  • the subject may change according to the application scenario.
  • the subject may be a parked vehicle (e.g., a vehicle parked in a parking space) .
  • the subject may be goods (e.g., the goods placed in the shelf spaces) .
  • the identifying module may identify the position of the subject in the region.
  • the position may be expressed in a variety of ways. For example, the position may be expressed by a target frame (or a bounding box) that includes the subject. As another example, the position may be expressed by a center point of the subject.
  • the identifying module may identify the at least one subject in the multiple regions in a variety of ways. In some embodiments, the identifying module may identify the at least one subject in the multiple regions based on an identification model.
  • the identification model may be a trained machine learning model.
  • the machine learning model may include a U-Net model, a YOLO model, or the like.
  • the identification model may identify a subject in each of the multiple regions separately.
  • the input of the identification model may be the part of the image corresponding to a region.
  • the part of the image corresponding to the region may be obtained by cropping the image.
  • the part of the image corresponding to the region may also be obtained by masking the parts of the image other than the region.
  • the region in the image may be input into the YOLO model, and the YOLO model may output an area corresponding to the pixels of the subject in the multiple regions.
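  • The cropping and masking options mentioned above may be sketched as follows; this is an illustrative snippet only, the pixel-coordinate box format of a region is an assumption, and the resulting array would then be passed to the identification model (e.g., a YOLO model) for detection.

        import numpy as np

        def region_input(image: np.ndarray, region, mode: str = "crop") -> np.ndarray:
            # region: (x1, y1, x2, y2) in pixel coordinates (assumption).
            x1, y1, x2, y2 = region
            if mode == "crop":
                # Keep only the pixels inside the region.
                return image[y1:y2, x1:x2]
            # Mask: zero out every pixel outside the region while keeping the image size.
            masked = np.zeros_like(image)
            masked[y1:y2, x1:x2] = image[y1:y2, x1:x2]
            return masked
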
  • the training module may train an initial identification model based on training data to obtain the identification model.
  • the training data may include sample images of multiple parking lots, and labels corresponding to the sample images may be the positions of vehicles in the sample images.
  • the sample images may be collected by cameras at heights of 6 meters, 12 meters, and 20 meters from the ground.
  • the identifying module identifying the subject in the region may include acquiring an identification frame in the region, and determining the subject in the region based on the identification frame. There may be a corresponding relationship between the identification frame and the subject, such as a one-to-one correspondence.
  • the identification frame may be used when determining whether the subject or the type of the subject is included in the region.
  • the identification model may determine whether the subject is included in the region by processing the identification frame.
  • the identifying module may correct the identification frame, and identify the subject after the correction.
  • the identification model may process the multiple regions to obtain and output the identification frame.
  • the identifying module may correct the identification frame, and input the corrected identification frame into the identification model to identify the subject in the multiple regions. More descriptions for the correction of the identification frame may be found in FIG. 7 and the related descriptions.
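  • One plausible reading (an assumption, not the disclosed procedure) of the correction described above is to rotate the identification frame by the angle between the actual horizontal line of the matching area and the standard horizontal line, and then take the axis-aligned bounding box of the rotated frame; the frame format, the sign convention of the angle, and the choice of rotating around the frame center are all assumptions.

        import math

        def correct_frame(frame, angle_deg):
            # frame: (x1, y1, x2, y2); angle_deg: angle between the actual horizontal line
            # of the matching area and the standard horizontal line (assumed to be given
            # in degrees, positive meaning counter-clockwise).
            x1, y1, x2, y2 = frame
            cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
            cos_a = math.cos(math.radians(angle_deg))
            sin_a = math.sin(math.radians(angle_deg))
            corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
            rotated = [((x - cx) * cos_a - (y - cy) * sin_a + cx,
                        (x - cx) * sin_a + (y - cy) * cos_a + cy) for x, y in corners]
            xs, ys = zip(*rotated)
            # The corrected identification frame is taken as the axis-aligned box
            # enclosing the rotated frame.
            return (min(xs), min(ys), max(xs), max(ys))
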
  • the processing device 110 may determine an identification result of the image based on a subject among the at least one subject, and the subject may be in an overlapping area of the multiple regions.
  • a subject being in the overlapping area means that at least a portion of the subject appears in the overlapping area, i.e., the overlapping area includes all or a part of the content of the subject.
  • the overlapping area may refer to an area in the image that appears in more than one region among the multiple regions. When a subject appears in more than one region, the subject appears, accordingly, in the overlapping area of the more than one region.
  • the overlapping area may be an overlapping part between two regions, or an overlapping part between more than two regions.
  • the identification result of the image may refer to the identification of the subject and the position of the subject in the image. More descriptions for the representation of the position of the subject may be found elsewhere in the present disclosure.
  • the determining module may determine the subject in the overlapping area of the multiple regions based on an intersection over union (IOU) of each subject in the multiple regions, and may further determine the identification result of the image.
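  • As a hedged sketch of the IOU-based determination described above: detections of the same subject produced by different regions overlap heavily, so a simple cross-region suppression keyed on the identification confidence level can retain a single detection per subject. The IOU threshold of 0.5 and the box format are assumptions for illustration.

        def iou(a, b):
            # Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).
            ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
            ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
            inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
            union = ((a[2] - a[0]) * (a[3] - a[1])
                     + (b[2] - b[0]) * (b[3] - b[1]) - inter)
            return inter / union if union > 0 else 0.0

        def merge_overlapping_detections(detections, iou_thr=0.5):
            # detections: list of (box, confidence) gathered from all regions.
            # Detections of the same subject coming from overlapping regions have a high
            # IOU; keep only the one with the highest identification confidence level.
            detections = sorted(detections, key=lambda d: d[1], reverse=True)
            kept = []
            for box, conf in detections:
                if all(iou(box, k_box) < iou_thr for k_box, _ in kept):
                    kept.append((box, conf))
            return kept
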
  • the subject may be a vehicle
  • the image from the camera may include a parking area, such as an indoor parking lot, an outdoor parking lot, roadside parking spaces, etc.
  • the identifying module may determine a vacant parking space in the parking area according to the identification result of the image. For example, the identifying module may obtain a distribution map of parking spaces in the parking area from a storage device, and then compare the distribution map of parking spaces with the positions of vehicles in the image to determine whether there is a vacant parking space. Vacant spaces in other areas (e.g., a vacant shelf space in a warehouse) may also be determined in a similar way.
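  • A minimal sketch of the comparison just described, assuming the distribution map provides one axis-aligned box per parking space; the occupancy-ratio threshold is an illustrative choice rather than a value taken from the disclosure.

        def occupancy_ratio(space, vehicle):
            # Fraction of the parking-space box covered by the vehicle box; boxes are
            # (x1, y1, x2, y2).
            ix1, iy1 = max(space[0], vehicle[0]), max(space[1], vehicle[1])
            ix2, iy2 = min(space[2], vehicle[2]), min(space[3], vehicle[3])
            inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
            space_area = (space[2] - space[0]) * (space[3] - space[1])
            return inter / space_area if space_area > 0 else 0.0

        def find_vacant_spaces(parking_spaces, vehicle_boxes, occupied_thr=0.3):
            # parking_spaces: dict {space_id: box} from the stored distribution map.
            # vehicle_boxes: vehicle boxes from the identification result of the image.
            return [sid for sid, box in parking_spaces.items()
                    if not any(occupancy_ratio(box, v) >= occupied_thr for v in vehicle_boxes)]
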
  • the image may include a panoramic image, and the panoramic image may be deformed.
  • the accuracy of identification may be improved, and the problems of inaccurate identification when the identification model attempts to identify an entire panoramic image may be avoided (e.g., problems due to a large number of subjects, subjects occupying few pixels, and occlusion and distortion in the panoramic image).
  • ensuring that each matching area in the panoramic image is completely enclosed in a region may further ensure the accuracy of the identification.
  • the same subject may be prevented from being mistaken for two subjects, and therefore the identification accuracy may be further improved.
  • the matching area may be configured to be an enlarged image area of a parking space or a combined image area of multiple parking spaces to deal with the situation that multiple parking spaces are occupied by one vehicle due to illegal parking, therefore improving the accuracy of the determination of vacant parking spaces.
  • FIG. 3 is an exemplary flowchart illustrating a process of determining a region according to some embodiments of the present disclosure.
  • process 300 may include the following operations.
  • process 300 may be implemented in the system 100 illustrated in FIG. 1.
  • the process 300 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19).
  • the operations of the illustrated process presented below are intended to be illustrative.
  • the process 300 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order in which the operations of the process 300 are illustrated in FIG. 3 and described below is not intended to be limiting.
  • operation 220 in FIG. 2 may be performed according to process 300 illustrated in FIG. 3.
  • the processing device 110 may determine a reference area of the image, and a deformation degree of the reference area may satisfy a second condition.
  • the number of reference areas may be one or more.
  • the reference area may be any shape, e.g., a rectangle, a circle, or the like.
  • the deformation degree of the reference area may be the degree of deformation of the image in the reference area relative to the real scene.
  • the image may be distorted.
  • an area that is straight in the real scene may be an area that is curved in the panoramic image.
  • the second condition may be that the deformation degree of the reference area in the image is less than a threshold (e.g., the threshold may be 60%, 50%, etc. ) .
  • the threshold of the deformation degree in the second condition may be adjusted according to actual application scenarios. For example, the height of the camera from the ground may affect the second condition. The farther the distance, the greater the threshold may be in the second condition.
  • the deformation degree of an area in the image may be determined in a variety of ways.
  • the segmenting module may determine the deformation degree based on the degree of deformation of an object in the area of the image relative to the object in the real scene.
  • for example, the area A may include an object whose outline is curved in the image but is actually straight in the real scene.
  • the segmenting module may determine the deformation degree based on the curvature of the outline. The greater the curvature is, the greater the deformation degree is.
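  • One possible (assumed, not prescribed) way to quantify such a curvature-based deformation degree is to measure how far the imaged outline deviates from the straight chord joining its endpoints:

        import math

        def deformation_degree(outline_points):
            # outline_points: list of (x, y) along an outline that is straight in the real
            # scene. As an illustrative measure, use the maximum perpendicular deviation of
            # the points from the chord joining the endpoints, normalized by the chord length.
            (x0, y0), (x1, y1) = outline_points[0], outline_points[-1]
            chord = math.hypot(x1 - x0, y1 - y0)
            if chord == 0:
                return 0.0
            deviations = [abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / chord
                          for x, y in outline_points]
            return max(deviations) / chord
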
  • the segmenting module may determine the reference area of the image in a variety of ways. For example, the segmenting module may determine the deformation degrees of different candidate areas in the image, and determine the reference area based on the deformation degrees of the candidate areas and the second condition. One of the candidate areas with a deformation degree that satisfies the second condition may be determined as the reference area. As another example, the segmenting module may determine the reference area based on a reference line, and more descriptions for determining the reference area based on the reference line, refer to the following descriptions.
  • the reference area may be a matching area or a combined area of multiple matching areas.
  • the image, the multiple regions, the reference area, and the matching area may all include multiple edges.
  • the segmenting module may obtain coordinates of each edge of the image, the multiple regions, the reference area, and the matching area in the same two-dimensional coordinate system (e.g., the two-dimensional coordinate system in FIG. 4A, FIG. 4B, FIG. 5, and FIG. 7).
  • a certain corner of the image may be used as the origin of the coordinate system (point o as shown in FIG. 4A, etc. ) , and the image may be located in the first quadrant of the coordinate system.
  • the two edges corresponding to the length of the image (i.e., a first edge of the image and a second edge of the image) may be parallel to the horizontal axis (the x axis as shown in FIG. 4A) of the coordinate system.
  • the first edge of the image may be closer to the close view part of the image and the origin of the coordinate system.
  • the two edges corresponding to the width of the image (i.e., a third edge of the image and a fourth edge of the image) may be parallel to the vertical axis (the y axis as shown in FIG. 4A) of the coordinate system.
  • the third edge of the image may be closer to the origin of the coordinate system.
  • the segmenting module may determine the first edge, the second edge, the third edge, and the fourth edge of each of the multiple regions, the reference area, and the matching area.
  • the segmenting module may obtain the reference line of the image, designate an edge of the image which intersects with the reference line as an edge of the reference area, and determine, based on a pixel ratio of the length to the width of the image, an area ratio of the two portions of the reference area on the two sides of the reference line.
  • the length and the width of the reference area may be preset pixel values.
  • the segmenting module may designate an edge of the image which intersects with the reference line and is closer to the close view part as an edge of the reference area. For example, the segmenting module may take the first edge of the image as an edge of the reference area.
  • the reference line may refer to a line passing through the image.
  • the reference line may be a centerline of the image.
  • the deformation degree of the image on two sides of the reference line may be symmetrically distributed.
  • the reference line may not pass through a subject (e.g., a vehicle, etc. ) .
  • the segmenting module may move the reference line such that the subject is not segmented by the reference line.
  • the segmenting module may move the reference line to an area between the subject and another subject adjacent to the subject.
  • the size of the reference area may be a preset fixed size.
  • the size of the reference area may be adjusted adaptively according to the size of the image.
  • the size of the reference area may be related to the parameters of the identification model. For example, a preset pixel value of the length of the reference area may be 1920px, and a preset pixel value of the width may be 1080px.
  • the segmenting module may merge the multiple reference areas, and determine the multiple regions through subsequent operations (e.g., operation 320, etc. ) based on the merged area.
  • when there are multiple reference areas, the reference areas may have a corresponding relationship with the multiple regions.
  • the reference areas may include a first reference area configured to determine a region on one side of the reference line, and a second reference area configured to determine another region on the other side of the reference line.
  • the area ratio of the two portions of the reference area on the two sides of the reference line may be related to the pixel ratio of the length to the width of the image, a type (e.g., a first reference area and a second reference area) of the reference area, etc.
  • the pixel ratio of the length to the width of the image may be f
  • the reference line may be parallel to the vertical axis mentioned above.
  • the first reference area may correspond to the left side of the reference line, therefore the area ratio of the left portion of the first reference area to the right portion of the first reference area may be f.
  • the second reference area may correspond to the right side of the reference line, therefore the area ratio of the right portion of the second reference area to the left portion of the second reference area may be f.
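  • The construction of a reference area from the reference line may be sketched as follows, using the relationships described above; the preset 1920x1080 size, the coordinate conventions, and the exact way the ratio f splits the area into a left portion and a right portion are assumptions made for illustration.

        def reference_area_from_line(image_w, image_h, ref_line_x,
                                     area_w=1920, area_h=1080, kind="first"):
            # The reference area is assumed to sit on the first edge of the image (y = 0
            # in the coordinate system above), to have a preset size area_w x area_h, and
            # to be split by the reference line so that the areas of its two portions have
            # the ratio f = image_w / image_h (equivalent to a width ratio, since the
            # height is constant).
            f = image_w / image_h
            if kind == "first":   # used for the regions on the left side of the reference line
                left_w = area_w * f / (1.0 + f)
                right_w = area_w / (1.0 + f)
            else:                 # "second": used for the regions on the right side
                right_w = area_w * f / (1.0 + f)
                left_w = area_w / (1.0 + f)
            x1, x2 = ref_line_x - left_w, ref_line_x + right_w
            return (x1, 0.0, x2, float(area_h))   # (x_left, y_bottom, x_right, y_top)
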
  • the processing device 110 may determine the multiple regions according to the reference area.
  • the specific positions of the multiple regions in the image may be determined directly based on the reference area.
  • the edges of the reference area may be designated as the edges of the multiple regions.
  • multiple edges of the reference area and multiple edges of the image may be combined to determine the edges of different regions, such that there is an overlapping area between the multiple regions, and the overlapping area may be the reference area.
  • the specific positions of the multiple regions in the image may be determined based on the reference area and a reference subject included in the reference area.
  • the multiple regions may include a first region and a second region. There may be a vertical position relationship (i.e., one being on top of another) between the first region and the second region in the set of regions. The first region and the second region may be arranged along the vertical direction of the image.
  • each side of the reference line may include a set of regions, respectively.
  • the segmenting module may identify one or more reference subjects in the reference area.
  • the segmenting module may determine a target reference subject satisfying a third condition from the one or more reference subjects in the reference area.
  • the segmenting module may determine an edge of the first region in the image based on an edge of a reference matching area that corresponds to the target reference subject, determine an edge of the second region in the image based on an edge of the reference area, and determine the set of regions according to the edge of the first region and the edge of the second region in the image.
  • the target reference subject may be a reference subject used to determine the multiple regions.
  • the target reference subject may be one or more.
  • the third condition may be related to the deformation degree of each area where a reference subject is located.
  • the third condition may be that the deformation degree of the reference subject is within a preset threshold range (e.g., 50%-55%, etc. ) .
  • the preset threshold range may be related to the deformation degree of the entire image.
  • the preset threshold range may be determined based on an average deformation degree of the image.
  • the third condition may be related to the position of each of the one or more reference subjects in the reference area.
  • the third condition may be that the distance between the reference subject and the edge of the reference area is less than a threshold.
  • the third condition may be that the reference subject is among the top n reference subjects ranked in ascending order of the distance between the reference subject and the edge of the reference area, with n being larger than 0.
  • the third condition may be that the distance between the reference subject and the second edge of the reference area is less than a threshold.
  • the segmenting module may determine the first reference area and the second reference area based on the reference line.
  • the third condition may be related to the type of the reference area. Continuing to take the example discussed in operation 310, when the first reference area corresponds to the left side of the reference line, the third condition may be that the distance between the reference subject and the edge corresponding to the upper left corner of the first reference area (e.g., the second edge and the third edge) is less than a threshold.
  • the third condition may be related to the position and an identification confidence level of at least one reference subject in the reference area.
  • the third condition may be that a reference object has the maximum identification confidence level among the reference subjects whose distance from the edge of the reference area is less than a threshold.
  • the identification model may obtain the position and the identification confidence level of the reference subject in the reference area by processing the reference area.
  • the identification confidence level may refer to the probability of the reference subject being at the obtained position.
  • the identification confidence level may be related to whether the reference subject completely appears in the reference area. For example, the lower the completeness level, the lower the identification confidence level may be.
  • the segmenting module may select one of the multiple target reference subjects to determine the region, or merge the multiple target reference subjects and determine the region based on the merged result, or proceed in other ways.
  • the multiple regions may be directly determined based on the edge of the reference area.
  • when the segmenting module determines the first region based on the reference matching area, the reference matching area may be completely enclosed in the first region.
  • the segmenting module may designate the edge of the reference matching area as the edge of the first region and keep the types of the edges the same. For example, the first edge of the reference matching area may be designated as the first edge of the first region.
  • the segmenting module may designate the edge of the reference area as the edge of the second region and keep the types of the edges the same. For example, the second edge of the reference area may be designated as the second edge of the second region.
  • the position of the target reference subject in the reference area may be considered. For example, taking the coordinate system mentioned above as an example, if the target reference subject is close to the second edge of the reference area, the first edge of the reference matching area may be designated as the first edge of the first region, and the second edge of the reference area may be designated as the second edge of the second region.
  • when the target reference subject is close to another edge of the reference area, the segmenting module may adopt a method similar to that used when the target reference subject is close to the second edge of the reference area.
  • the segmenting module may determine other edges of the multiple regions based on the edge of the image and/or the edge of the reference area. For example, after the first edge of the first region and the second edge of the second region are determined in the above manner, other edges of the first region and the second region may be determined based on the third or fourth edge of the image or reference area.
  • the distance between two parallel edges of a region may be a preset threshold.
  • the reference area may be determined based on the reference line.
  • the segmenting module may determine the edge of a region based on the reference line. For example, in the coordinate system mentioned above, the segmenting module may use the reference line as either the third edge or the fourth edge of the region, which may be determined correspondingly based on the position of the region relative to the reference line.
  • more descriptions for determining the multiple regions based on the reference area may be found in FIG. 4A and FIG. 4B.
  • FIG. 4A is a diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
  • the segmenting module may segment a panoramic image 400 and obtain a reference area 402.
  • the identifying module may identify the reference area 402.
  • a target reference subject 401 in the reference area 402 may be determined based on the third condition.
  • the target reference subject 401 may be close to the upper left corner of the reference area 402. That is, the distances to the second edge (i.e., the upper edge) and to the third edge (i.e., the left edge) corresponding to the upper left corner of the reference area 402 are both less than a threshold.
  • the target reference subject 401 may be completely enclosed in the reference area 402, and the confidence level may also be high.
  • the target reference subject 401 may correspond to a reference matching area 404.
  • the segmenting module may take the first edge (i.e., the lower edge) of the reference matching area 404 as the first edge (i.e., the lower edge) of the first region 405, and take the edge with a pixel distance of 1080px to the first edge as the second edge (i.e., the upper edge) of the first region.
  • the segmenting module may take the second edge (i.e., the upper edge) of the reference area 402 as the second edge (i.e., the upper edge) of the second region 403, and take the first edge (i.e., the lower edge) of the reference area 402 as the first edge (i.e., the lower edge) of the second region 403.
  • the segmenting module may take the fourth edge (i.e., the right edge) of the reference area 402 as the fourth edge (i.e., the right edge) of both the first region 405 and the second region 403, and take the third edge (i.e., the left edge) of the image as the third edge (i.e., the left edge) of both the first region 405 and the second region 403.
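  • the edge assignments of FIG. 4A can be sketched in a few lines of Python. The snippet below is a minimal, hypothetical illustration under assumed conventions (boxes as (left, top, right, bottom) pixel tuples with y increasing downward, the image left edge at x = 0, a preset 1080 px region height, and an illustrative function name); it is not the exact implementation of the disclosure.

    def fig_4a_regions(reference_area, reference_matching_area, region_height=1080):
        """Sketch: derive the first and second regions of FIG. 4A.

        reference_area / reference_matching_area: (left, top, right, bottom).
        """
        _, ra_t, ra_r, ra_b = reference_area
        _, _, _, rm_b = reference_matching_area

        # First region: its lower edge is the lower edge of the reference
        # matching area; its upper edge lies a preset 1080 px above it.
        first_region = (0, rm_b - region_height, ra_r, rm_b)

        # Second region: its upper and lower edges are those of the reference
        # area itself.
        second_region = (0, ra_t, ra_r, ra_b)

        # Both regions share the right edge of the reference area and the
        # left edge of the image (x = 0), as described for FIG. 4A.
        return first_region, second_region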
  • FIG. 4B is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
  • the identifying module may determine the reference area 412 based on the method stated above.
  • the area ratio of the two portions of the reference area 412 on the left and right sides of the reference line 411 may be the pixel ratio f of the length to the width of the image.
  • the reference area 412 may be used to determine the first region 413 and the second region 414 on the left side of the reference line 411.
  • the identifying module may determine, from the reference area 412, a target reference subject 415 and a reference matching area 416 that corresponds to the target reference subject 415, wherein the determination of the target reference subject 415 is similar to FIG. 4A.
  • the segmenting module may determine the first edge, the second edge, and the third edge of the first region 413 and the second region 414 in a manner similar to that shown in FIG. 4A.
  • the segmenting module may take the reference line 411 as the fourth edge (i.e., the right edge) of the first region 413 and the second region 414.
  • the segmenting module may determine the multiple regions by moving a segmentation frame a plurality of steps on the image. In some embodiments, the segmenting module may move the segmentation frame a plurality of steps according to preset moving distances and moving directions, and take the areas within the segmentation frame at the stopping positions of the segmentation frame between each step as the multiple regions.
  • a size of the segmentation frame may be a preset fixed size.
  • the size of the segmentation frame may be adjusted adaptively according to the image size.
  • the segmenting module may determine a moving distance of each step in the plurality of steps. Specifically, the segmenting module may determine the moving distance of the step based on a target historical subject in a historical region.
  • the historical region may be a region determined by the position of the current segmentation frame. In other words, the historical region may be the latest determined region before the current step of moving.
  • the segmenting module may determine a target edge of the historical region based on the moving direction of the segmentation frame.
  • the target edge may refer to the edge that is closest to the moving direction and intersects with the moving direction among the multiple edges of the historical region. For example, if the segmentation frame in FIG. 5 moves upward, the target edge of the historical region 501 may be the second edge (i.e., the upper edge) .
  • the identifying module may identify at least one historical subject in the historical region. The identification of a historical subject may be the same as or similar to the identification of the subject as described elsewhere in the present disclosure (e.g., operation 230 in FIG. 2).
  • the segmenting module may determine the target historical subject based on the distance between each historical subject and the target edge, and the identification confidence level of each historical subject. In some embodiments, the segmenting module may designate the historical subject that has an identification confidence level lower than a threshold and that is the farthest from the target edge as the target historical subject.
  • the segmenting module may determine the moving distance of the step based on the distance between the edge of a historical matching area intersecting with the moving direction and the edge of the historical region intersecting with the moving direction, wherein the historical matching area corresponds to the target historical subject.
  • FIG. 5 is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
  • the moving direction 505 of the segmentation frame is from bottom to top.
  • the historical region 501 and the target historical subject 502 may be determined based on the position of the segmentation frame.
  • the target historical subject 502 may be a historical subject that is the farthest from the second edge (i.e., the upper edge) of the historical region among all the incomplete historical subjects in the historical region 501, each of the incomplete historical subjects being segmented by the second edge of the historical region.
  • the second edge of the historical region 501 may be the edge of the historical region 501 that intersects with the moving direction 505 and that is farthest along the moving direction 505.
  • the first edge (i.e., the lower edge) of the historical reference area 503 corresponding to the target historical subject 502 may be used as the first edge (i.e., the lower edge) of the segmentation frame after the next step of moving.
  • the first edge of the newly segmented region 504 may be the first edge of the historical reference area 503.
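  • the step-size logic of FIG. 5 may be sketched as follows. This is a hypothetical illustration (the box representation, the dictionary keys, the upward moving direction, the confidence threshold, and the fallback step are assumptions), not the exact implementation of the disclosure.

    def next_frame_after_upward_step(historical_region, historical_subjects,
                                     frame_height, conf_threshold=0.5):
        """Sketch: choose the segmentation frame position after an upward step.

        historical_region: (left, top, right, bottom), y increasing downward.
        historical_subjects: dicts with keys "box" ((l, t, r, b) of the
        matching area) and "confidence".
        """
        left, region_top, right, _ = historical_region

        # Incomplete historical subjects: those cut by the target (upper) edge,
        # restricted here to low-confidence detections as described above.
        incomplete = [s for s in historical_subjects
                      if s["box"][1] < region_top < s["box"][3]
                      and s["confidence"] < conf_threshold]

        if incomplete:
            # Target historical subject: the incomplete subject farthest from
            # the upper edge; the lower edge of its matching area becomes the
            # lower edge of the next segmentation frame, so the cut subject is
            # fully covered by the newly segmented region.
            target = max(incomplete, key=lambda s: s["box"][3])
            new_bottom = target["box"][3]
        else:
            # No cut subject: simply continue from the current upper edge
            # (an assumed default step).
            new_bottom = region_top

        return (left, new_bottom - frame_height, right, new_bottom)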
  • FIG. 6 is an exemplary flowchart illustrating a process of determining an identification result of an image according to some embodiments of the present disclosure.
  • process 600 may be implemented in the system 100 illustrated in FIG. 1.
  • the process 600 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) .
  • the operations of the illustrated process presented below are intended to be illustrative.
  • the process 600 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order in which the operations of the process 600 are illustrated in FIG. 6 and described below is not intended to be limiting.
  • operation 240 in FIG. 2 may be performed according to process 600 illustrated in FIG. 6.
  • the processing device 110 may determine, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject in the overlapped area of the multiple regions.
  • the IOU of a subject may refer to a ratio of an area of intersection between multiple boxes to an area of union of the multiple boxes, and each of the multiple boxes includes at least a portion of the subject.
  • the number of the multiple boxes may be two.
  • the multiple boxes including the at least a portion of the subject may be target frames of the subject in the multiple regions (e.g., two regions).
  • the entire content of the subject 705 may be enclosed in the region 702, and a part of the content of the subject 705 may be enclosed in the region 701. The target frame 703 may refer to the target frame in the region 701, and the target frame 704 may refer to the target frame in the region 702.
  • the IOU of the subject may be the ratio of the area of intersection between the target frame 704 and the target frame 703 to the area of union of the target frame 704 and the target frame 703.
  • the identifying module may determine the position of the subject in the multiple regions and indicate the position of the subject by the target frame.
  • the determining module may establish a coordinate system to determine the coordinates of the target frame where the subject in the multiple regions is located. If there are multiple target frames where the subject is located, the area of intersection and the area of union of the multiple target frames may be calculated to further determine the IOU of each subject.
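  • the IOU described above is the standard intersection over union of two boxes; a minimal Python sketch is shown below, assuming boxes are given as (left, top, right, bottom) pixel tuples.

    def iou(box_a, box_b):
        """Intersection over union of two target frames (l, t, r, b)."""
        inter_l = max(box_a[0], box_b[0])
        inter_t = max(box_a[1], box_b[1])
        inter_r = min(box_a[2], box_b[2])
        inter_b = min(box_a[3], box_b[3])
        inter_area = max(0, inter_r - inter_l) * max(0, inter_b - inter_t)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union_area = area_a + area_b - inter_area
        return inter_area / union_area if union_area > 0 else 0.0

    # e.g., for FIG. 7: iou(target_frame_703, target_frame_704) > 0 indicates
    # that subject 705 lies in the overlapping area of regions 701 and 702.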
  • the determining module may designate a subject with an IOU larger than a threshold (e.g., 0, 0.4, etc. ) as the subject in the overlapping area of the multiple regions.
  • the processing device 110 may determine the identification result of the image according to an identification confidence level of the subject in the overlapping area of the multiple regions.
  • the identification model may output the corresponding position of the subject and the identification confidence level when each of the more than one region is being identified.
  • the same subject may have different identification confidence levels in different regions.
  • the identification confidence level may indicate whether the subject is completely enclosed in a region.
  • the identification confidence level of a subject corresponding to a region may be related to an area of at least a portion of the subject that is located in the region. If the subject is completely enclosed in a region, the identification confidence level may be high in the region.
  • the determining module may process the subject in the overlapping area based on the multiple identification confidence levels of the subject output by the identification model, such that the same subject is only identified as one subject, not multiple subjects. Specifically, for each subject in the overlapping area, the determining module may select the target frame corresponding to the highest confidence level as the position of the subject in the image based on the multiple confidence levels of the subject. When the multiple identification confidence levels are the same, a target frame corresponding to any one of the identification confidence levels may be selected as the position of the subject in the image.
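  • the de-duplication described above may be sketched as follows; the pairing rule, the default IOU threshold, and the data layout are illustrative assumptions, and the IOU function (e.g., the sketch shown earlier) is passed in as a parameter rather than re-implemented here.

    def merge_region_detections(frames_a, frames_b, iou_fn, iou_threshold=0.0):
        """Sketch: combine detections of the same subjects from two regions.

        frames_a / frames_b: lists of (box, confidence) from the two regions.
        Pairs whose IOU exceeds the threshold are treated as one subject and
        only the frame with the higher identification confidence is kept.
        """
        kept = list(frames_a)
        for box_b, conf_b in frames_b:
            duplicate = False
            for i, (box_a, conf_a) in enumerate(kept):
                if iou_fn(box_a, box_b) > iou_threshold:
                    duplicate = True
                    if conf_b > conf_a:   # keep the higher-confidence frame
                        kept[i] = (box_b, conf_b)
                    break
            if not duplicate:
                kept.append((box_b, conf_b))   # non-overlapping detection
        return kept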
  • the identification result of the subject in the overlapping area may be determined, and the identification results of subjects in non-overlapping areas may be combined to obtain the identification result of the image.
  • the non-overlapping areas may refer to areas in the multiple regions that are outside of the overlapping area.
  • the problem that one subject enclosed in more than one region is considered as two subjects, which ultimately causes an error in determining the number of subjects, may be avoided.
  • otherwise, the problem of incorrectly calculating the number of vacant parking spaces may occur due to the number of vehicles in the image being incorrectly determined.
  • FIG. 8 is an exemplary flowchart illustrating a process of correcting an identification frame according to some embodiments of the present disclosure.
  • process 800 may be implemented in the system 100 illustrated in FIG. 1.
  • the process 800 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) .
  • the operations of the illustrated process presented below are intended to be illustrative.
  • the process 800 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order in which the operations of the process 800 are illustrated in FIG. 8 and described below is not intended to be limiting.
  • operation 230 in FIG. 2 may be performed according to process 800 illustrated in FIG. 8.
  • the processing device 110 may obtain an identification frame in the region based on an identification algorithm, and the identification frame has a corresponding relationship with a matching area which matches to the target in the region.
  • the identification algorithm may refer to an algorithm configured to determine the identification frame of a subject in an area (e.g., a region of the multiple regions, a reference area, etc. ) or in an image.
  • there may be a corresponding relationship between the identification frame and the subject in the region, e.g., one-to-one, one-to-many, many-to-one, or the like.
  • the identification algorithm may be included in the identification model.
  • the identification model may output the identification frames in a region.
  • the identification algorithm may be the non-maximum suppression algorithm (NMS) , or the like.
  • the determining module may obtain the identification frames 901, 902, and 903 based on the identification algorithm.
  • the processing device 110 may correct the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line.
  • the matching area matches the subject, and since there is a corresponding relationship between the identification frame and the subject, the identification frame may have a corresponding matching area.
  • the actual horizontal line may refer to a horizontal line parallel to the edge of the matching area in a real scene.
  • the double solid lines 906 shown in FIG. 9 may represent an actual horizontal line.
  • the actual horizontal line may be determined in a variety of ways.
  • the actual horizontal line may be determined according to user input. Taking an image from a camera of a parking lot as an example, the identifying module may obtain a distribution map of the parking spaces in the image, take the parking spaces as the matching areas, and obtain the parking space lines of the parking spaces as the actual horizontal lines.
  • the standard horizontal line may be parallel to the edge of the identification frame.
  • the standard horizontal line may be parallel to the first edge and the second edge of the identification frame.
  • the horizontal axis of the coordinate system may be regarded as the standard horizontal line.
  • the standard horizontal line in FIG. 9 may be lines parallel to the lower edge and the upper edge of the identification frame 901.
  • the identifying module may rotate the identification frame based on the angle between the actual horizontal line and the standard horizontal line, such that the identification frame may have an edge parallel to the actual horizontal line, and the corrected identification frame may thereby be obtained.
  • an angle a may be formed between the actual horizontal line and the standard horizontal line, and the identification frame 901 may be rotated counterclockwise by the angle a to obtain the corrected identification frame 904, wherein the first edge of the corrected identification frame 904 is parallel to the actual horizontal line.
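  • the rotation of the identification frame about its center may be sketched as follows; the corner-point representation, the degree unit, and the sign convention (standard mathematical rotation, which appears clockwise when y increases downward) are assumptions for illustration rather than the exact implementation.

    import math

    def rotate_frame(frame, angle_deg):
        """Sketch: rotate a rectangular identification frame about its center.

        frame: (left, top, right, bottom); returns the four rotated corners.
        """
        l, t, r, b = frame
        cx, cy = (l + r) / 2.0, (t + b) / 2.0
        theta = math.radians(angle_deg)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        corners = [(l, t), (r, t), (r, b), (l, b)]
        rotated = []
        for x, y in corners:
            dx, dy = x - cx, y - cy
            # Standard rotation matrix applied about the frame center; flip
            # the sign of angle_deg for the opposite rotation direction.
            rotated.append((cx + dx * cos_t - dy * sin_t,
                            cy + dx * sin_t + dy * cos_t))
        return rotated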
  • the processing device 110 may identify the subject in the region according to the corrected identification frame.
  • the determining module may input the corrected identification frames of the multiple regions into the identification model, and output target positions of the multiple regions.
  • the determining module may rotate the corrected identification frame of a region and the corresponding subject in the corrected identification frame of the region, such that the corrected identification frame may have an edge parallel to the standard horizontal line.
  • the determining module may input the rotated corrected identification frame into the identification model, and output the at least one subject in the multiple regions. For example, as shown in FIG. 9, the corrected identification frame 904 and the corresponding area may be rotated such that the corrected identification frame 904 may include an edge parallel to the standard horizontal line, and the rotated corrected identification frame 904 and the corresponding area may be input into the identification model for identification.
  • in the description below, the image may include an image of a vehicle to be detected.
  • in the description below, the reference area may include a first pre-detection area and a second pre-detection area.
  • in the description below, the reference matching area may include a first vehicle frame and a second vehicle frame.
  • in the description below, the first reference area may include a first pre-detection area, and the first vehicle frame described below may be obtained from the first reference area.
  • in the description below, the second reference area may include a second pre-detection area, and the second vehicle frame described below may be obtained from the second reference area.
  • in the description below, the multiple regions may include a first detection partition, a second detection partition, a first upper detection partition, a first lower detection partition, a second upper detection partition, and a second lower detection partition.
  • the set of regions corresponding to one side of the first reference area may include the first upper detection partition and the first lower detection partition described below.
  • the set of regions corresponding to one side of the second reference area may include the second upper detection partition and the second lower detection partition below.
  • the identification confidence level may be referred to as a confidence level below.
  • Some embodiments provide a smart solution for the use of old light poles.
  • the construction of smart cities puts forward more ideas about the use of old light poles.
  • urban street lighting is still at a primitive stage, which greatly reduces the utility and efficiency of the related devices.
  • the height of the light poles may be generally 10 to 15 meters, and the distance between the light poles may be relatively large.
  • the application proposes to use panoramic cameras for monitoring, which may ensure that target vehicles directly under the light poles and in the distance may be covered.
  • FIG. 10 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
  • the vehicle detection of the present disclosure may be applied to a vehicle detection device, wherein the vehicle detection device of the present disclosure may be a server, a terminal device, or a system in which the server and the terminal device cooperate with each other.
  • various parts included in the electronic device, such as each unit, sub-unit, module, and sub-module, may all be arranged in the server, in the terminal device, or in both the server and the terminal device, respectively.
  • the server mentioned above may be hardware or software.
  • the server may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server.
  • the server may be implemented as a plurality of software or software modules, for example, a software or software module configured to provide a distributed server.
  • the server may also be implemented as a single software or software module, which is not specifically limited herein.
  • the vehicle detection method in the embodiments of the present disclosure may be implemented by a processor invoking a computer-readable instruction stored in a memory.
  • the vehicle detection device in the embodiments of the present disclosure may be a panoramic camera, and the panoramic camera may be arranged on a light pole of an outdoor closed parking lot for detecting the conditions of vehicles in the parking lot.
  • the vehicle detection method in the embodiments of the present disclosure may specifically include the following steps:
  • an image of a vehicle to be detected may be obtained, and a first pre-detection area and a second pre-detection area at the center of the image of the vehicle to be detected may be set according to a resolution ratio of the image of the vehicle to be detected.
  • the vehicle detection device may obtain a panoramic image taken by a panoramic camera arranged on a light pole, that is, an image of a vehicle to be detected. Please refer to FIG. 11 for details.
  • FIG. 11 is a diagram of an image of a vehicle to be detected according to some embodiments of the present disclosure.
  • the panoramic camera in the embodiment of the present disclosure may be formed by stitching images from a four-eye fisheye camera, and the output panoramic image resolution may be 5520*2700 or 3840*2160. Since the panoramic camera may capture a large 360-degree scene, the count of target vehicles covered by the whole image may be large, resulting in each target vehicle occupying too few pixels. If vehicle detection is performed on the whole image directly, the whole image may be compressed to the input size of a neural network, and each target may disappear after several scalings, resulting in the vehicle detection network being unable to achieve accurate target vehicle detection.
  • the embodiment of the present disclosure proposes an image segmentation method based on deep learning.
  • algorithmic processing may be performed on the segmented sub-images, and target vehicles in each segmented sub-image may be identified to ensure that the target vehicles in the segmented sub-image may be identified with high accuracy.
  • the image of the vehicle to be detected may be specifically divided into a first pre-detection area and a second pre-detection area, as well as four detection partitions. Please refer to FIG. 12 for details.
  • FIG. 12 is a diagram of pre-detection areas and detection partitions according to some embodiments of the present disclosure.
  • the vehicle detection device may obtain the image resolution of the image of the vehicle to be detected, and calculate the resolution ratio, that is, an aspect ratio f of the image of the vehicle to be detected.
  • the vehicle detection device may use a pre-detection frame with a resolution of 1920*1080, and adjust the position of the pre-detection frame in the image of the vehicle to be detected according to the aspect ratio f.
  • the vehicle detection device may overlap the bottom of the pre-detection frame with the bottom of the image of the vehicle to be detected, take the central position where the center point of the image of the vehicle to be detected is located as the boundary, and adjust the position of the pre-detection frame, so that the ratio of the area of the pre-detection frame on the left of the central position to the area on the right of the central position is the same as the value of the aspect ratio f.
  • the area included in the pre-detection frame may be the first pre-detection area, such as the detection area A shown in FIG. 12.
  • the vehicle detection device may overlap the bottom of the pre-detection frame with the bottom of the image of the vehicle to be detected, take the central position where the center point of the image of the vehicle to be detected is located as the boundary, and adjust the position of the pre-detection frame, so that the ratio of the area of the pre-detection frame on the right of the central position to the area on the left of the central position is the same as the value of the aspect ratio f.
  • the area included in the pre-detection frame may be the second pre-detection area.
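  • the placement of the two pre-detection areas may be sketched as follows. The 1920*1080 pre-detection frame size from the example above, the (left, top, right, bottom) box convention with y increasing downward, and the function name are assumptions for illustration, not the exact implementation.

    def pre_detection_areas(image_w, image_h, pre_w=1920, pre_h=1080):
        """Sketch: place the first and second pre-detection areas (FIG. 12).

        The frame bottom coincides with the image bottom; the frame is shifted
        horizontally so that its areas on the two sides of the image's central
        vertical line are in the ratio f = image_w / image_h (or its inverse).
        """
        f = image_w / image_h            # aspect ratio of the image
        x_center = image_w / 2.0
        top, bottom = image_h - pre_h, image_h

        # First pre-detection area: area left of the center / area right = f.
        left_1 = x_center - pre_w * f / (1.0 + f)
        first_area = (left_1, top, left_1 + pre_w, bottom)

        # Second pre-detection area: area right of the center / area left = f.
        left_2 = x_center - pre_w / (1.0 + f)
        second_area = (left_2, top, left_2 + pre_w, bottom)
        return first_area, second_area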
  • vehicle detection may be performed on the first pre-detection area and the second pre-detection area, respectively, and a first vehicle frame in the first pre-detection area and a second vehicle frame in the second pre-detection area may be obtained.
  • the vehicle detection device may use a pre-trained vehicle detection network to perform vehicle detection in the first pre-detection area and the second pre-detection area, respectively.
  • the vehicle detection network in the embodiments of the present disclosure may use a conventional vehicle detection algorithm and a yolo deep learning training architecture, wherein the network model may use a pyramid network structure.
  • the deep network in the pyramid network structure may be configured to identify the large target vehicle.
  • the small target vehicle may be identified in the shallow network.
  • through the network structure, it may be ensured that the deep network focuses on optimizing the recognition of large targets, and the shallow network focuses on optimizing the recognition of small targets.
  • Training parameters of the vehicle detection network in the embodiments of the present disclosure may use 50,000 images of outdoor parking lot scenes covering camera heights of 6 meters, 12 meters, and 20 meters with a material ratio of 3:3:4, and training may iterate about 200,000 times until the training is complete.
  • the vehicle detection device may obtain the first vehicle frame and the confidence level in the first pre-detection area, and the second vehicle frame and the confidence level in the second pre-detection area output by the vehicle detection network.
  • a first detection partition and a second detection partition may be arranged according to the first vehicle frame and/or the second vehicle frame.
  • the vehicle detection device may take the central position of the image of the vehicle to be detected, that is, the vertical dividing line on which the center point of the image is located, and combine the conditions of the first vehicle frame and/or the second vehicle frame to divide the image of the vehicle to be detected into left and right detection partitions, that is, the first detection partition and the second detection partition.
  • since both the first pre-detection area and the second pre-detection area include the central position, both the detected first vehicle frames and the detected second vehicle frames may include a vehicle frame located at the central position. Therefore, the vehicle detection device may detect whether there is a vehicle frame at the central position of the image of the vehicle to be detected. If no such vehicle frame exists, the first detection partition and the second detection partition may be directly divided according to the central position; if such a vehicle frame exists, a dividing line may be arranged offset to the left or right according to the position and size of the vehicle frame at the central position, such that the dividing line does not pass through any vehicle frame. Finally, the vehicle detection device may divide the first detection partition and the second detection partition according to the dividing line.
  • the division method of the embodiments of the present disclosure may ensure that the two detection partitions on the left and right may not segment the target vehicle in the middle of the image of the vehicle to be detected.
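  • the dividing-line choice described above may be sketched as follows; handling only the vehicle frames that straddle the original central line is a simplification, and the function name and margin parameter are assumptions for illustration.

    def choose_dividing_line(image_w, central_frames, margin=0):
        """Sketch: pick a vertical dividing line that cuts no vehicle frame.

        central_frames: (left, top, right, bottom) boxes near the image center.
        Returns the x coordinate of the dividing line.
        """
        x = image_w / 2.0
        blocking = [b for b in central_frames if b[0] < x < b[2]]
        if not blocking:
            return x                       # no frame at the center
        # Shift the line left or right just past the blocking frame(s),
        # whichever shift is smaller, so the line cuts none of them.
        left_shift = x - min(b[0] for b in blocking) + margin
        right_shift = max(b[2] for b in blocking) - x + margin
        return x - left_shift if left_shift <= right_shift else x + right_shift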
  • for the location of the first detection partition, please refer to the detection area B in FIG. 12.
  • the first detection partition may be divided into a first upper detection partition and a first lower detection partition according to the first vehicle frame
  • the second detection partition may be divided into a second upper detection partition and a second lower detection partition according to the second vehicle frame.
  • the vehicle detection device may calibrate the positions of the first vehicle frames in the first pre-detection area, take the row of vehicle targets at the top of the first pre-detection area among the first vehicle frames, and select the target vehicles with the highest confidence level and/or the leftmost target vehicles, that is, the target vehicles S shown in FIG. 12.
  • the vehicle detection device may set the lower edge of the first upper detection partition according to the horizontal coordinate of the lower edge of the vehicle frame in which the target vehicles are located, that is, the horizontal coordinate of the lower edge of the first upper detection partition may be the same as the horizontal coordinate of the lower edge of the vehicle frame in which the target vehicles are located.
  • the left edge of the first upper detection partition may coincide with the left edge of the image of the vehicle to be detected
  • the right edge of the first upper detection partition may coincide with the right edge of the first detection partition.
  • since the panoramic camera may have more occlusions by the target vehicles in a long view, only the lower 3/4 area of the image of the vehicle to be detected may be taken as an effective recognition area. Therefore, the upper edge of the first upper detection partition may be at the horizontal coordinate of the lower edge plus 1080, or at a horizontal line located at 3/4 of the height of the image of the vehicle to be detected.
  • the upper edge of the first lower detection partition may coincide with the upper edge of the first detection partition
  • the left edge of the first lower detection partition may coincide with the left edge of the image of the vehicle to be detected
  • the right edge of the first lower detection partition may coincide with the right edge of the first detection partition
  • the lower edge of the first lower detection partition may coincide with the lower edge of the image of the vehicle to be detected, that is, the detection area C shown in FIG. 12.
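  • the edges of the first upper and first lower detection partitions may be sketched as follows. The interpretation of the 3/4-height cap as y = image_h / 4 in image coordinates (y increasing downward), the box convention, and the function name are assumptions for illustration; the 1080 px offset comes from the text above.

    def split_first_partition(first_partition, target_frame_bottom, image_h):
        """Sketch: split the first detection partition into upper and lower
        detection partitions (FIG. 12). first_partition: (left, top, right,
        bottom) in image coordinates with y increasing downward.
        """
        _, p_top, p_right, _ = first_partition

        # Lower edge of the upper partition: lower edge of target vehicle S.
        lower_edge = target_frame_bottom
        # Upper edge: 1080 px above that lower edge, but not above the line
        # at 3/4 of the image height (assumed y = image_h / 4 from the top).
        upper_edge = max(lower_edge - 1080, image_h / 4.0)
        first_upper = (0, upper_edge, p_right, lower_edge)

        # Lower partition: from the top of the first detection partition down
        # to the image bottom, sharing the image left edge and the partition
        # right edge.
        first_lower = (0, p_top, p_right, image_h)
        return first_upper, first_lower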
  • images may be distorted.
  • a horizontal area in reality may appear as an oblique area in the image; especially for the areas on the two sides of a close shot, the tilt and distortion may be more serious.
  • some areas of the first upper detection partition and the first lower detection partition may overlap with each other. Through the overlapping arrangement of the areas in the embodiment of the present disclosure, the vehicle detection of large target vehicles relatively close to the panoramic camera and small target vehicles far away may be ensured, which may not be affected by the segmentation of the detection partition.
  • the vehicle detection device may divide the second detection partition into the second upper detection partition and the second lower detection partition according to the second vehicle frame through the process mentioned above, which may not be repeated herein.
  • the vehicle detection may be performed on the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, and a target vehicle frame may be marked on the image of the vehicle to be detected according to a vehicle detection result.
  • the vehicle detection device may perform the vehicle detection in the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, which may not only ensure that the small target vehicles in each detection partition may be identified with high accuracy, but also minimize the impact of segmentation of the detection area on the segmentation of the target vehicles.
  • the vehicle detection device may mark the target vehicle frame on the image of the vehicle to be detected according to the vehicle detection result, that is, a diagram of a segmented area recognition result shown in FIG. 13.
  • the embodiment of the present disclosure also proposes a strategy for filtering and detecting the same target in the overlapping area between the detection areas.
  • the following may take the vehicle detection result of the first upper detection partition and the first lower detection partition as an example for description. Please refer to FIG. 14 for details.
  • FIG. 14 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
  • S1005 may include the following sub-steps:
  • the vehicle detection may be performed on the first upper detection partition to obtain third vehicle frames.
  • the vehicle detection may be performed on the first lower detection partition to obtain fourth vehicle frames.
  • a vehicle frame set may be obtained that includes the vehicle frames among the third vehicle frames and the vehicle frames among the fourth vehicle frames that overlap with each other, wherein the vehicle frame set may include the vehicle frames whose frame areas overlap with each other.
  • the vehicle detection device may need to traverse all the third vehicle frames in the first upper detection partition and all the fourth vehicle frames in the first lower detection partition, and complete coordinate mapping in the image of the vehicle to be detected according to coordinate relative points.
  • the vehicle detection device may extract a vehicle frame group in which a vehicle frame from the third vehicle frames and a vehicle frame from the fourth vehicle frames overlap with each other, wherein the overlap may be defined as the intersection ratio of the two vehicle frames being greater than 0.
  • the intersection ratio may be obtained through dividing an intersection area of the two vehicle frames by a union area of the two vehicle frames.
  • the vehicle detection device may group a plurality of vehicle frame groups into a vehicle frame set.
  • for the vehicle frames in the vehicle frame set whose frame areas overlap with each other, one of the two overlapping vehicle frames may be deleted according to a preset rule.
  • the vehicle detection device may filter the vehicle frame groups in the vehicle frame set according to the preset rule.
  • a filter condition in the embodiments of the present disclosure may include but is not limited to:
  • Filtering condition 1: when the intersection ratio of the two vehicle frames in the vehicle frame group is greater than 0.4, deleting the vehicle frame with the lower confidence level.
  • Filtering condition 2: when the center point of a vehicle frame in the vehicle frame group enters a vehicle frame of another detection partition, deleting the vehicle frame with the lower confidence level.
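  • a sketch of the two filtering conditions is shown below; the intersection ratio of the pair is assumed to be precomputed (e.g., with the IOU sketch given earlier), and the data layout and function names are illustrative assumptions.

    def center_inside(box, other):
        """True if the center point of `box` lies inside `other` (l, t, r, b)."""
        cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
        return other[0] <= cx <= other[2] and other[1] <= cy <= other[3]

    def filter_frame_group(frame_a, frame_b, intersection_ratio):
        """Sketch: apply filtering conditions 1 and 2 to an overlapping pair.

        frame_a / frame_b: (box, confidence) from two different partitions.
        Returns the list of vehicle frames kept for this group.
        """
        (box_a, conf_a), (box_b, conf_b) = frame_a, frame_b
        if (intersection_ratio > 0.4                       # condition 1
                or center_inside(box_a, box_b)             # condition 2
                or center_inside(box_b, box_a)):
            # Delete the vehicle frame with the lower confidence level.
            return [frame_a if conf_a >= conf_b else frame_b]
        return [frame_a, frame_b]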
  • FIG. 15 is another exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
  • a staff member may mark a parking detection area on the screen according to the actual situation, so as to combine the scene of the parking lot and use relevant information, such as the area, the parking spaces, or the like, to calculate the degree of distortion and tilt.
  • the parking detection area may specifically be a parking frame or a parking area including a plurality of parking frames.
  • for a target vehicle parking area, with each parking area as a statistical unit and according to the single-row or multi-row parking areas on the ground, a plurality of statistical areas may be drawn.
  • the state of each parking area may be determined, that is, whether there is a parking space or no parking space.
  • the staff may also draw an independent parking space frame with each parking space as a statistical unit, and combine the results of the algorithm to determine the occupancy state of each parking space, that is, whether the parking space is vacant or occupied.
  • the vehicle detection may be performed on the first upper detection partition to obtain a fifth vehicle frame.
  • Images at the far end of the panoramic camera may generally have distortions and tilts. As shown in FIG. 16, vehicles and parking spaces in the image may be distorted and tilted on the screen, which is not conducive to the filtering and recognition of a vehicle frame.
  • edge coordinates of the parking detection area and edge coordinates of the fifth vehicle frame in the case that the center point of the fifth vehicle frame is located in the parking detection area may be obtained.
  • the vehicle detection device may compare the position of the center point of the fifth vehicle frame with the position of the parking detection area. If the parking frame includes the coordinates of the center point, the coordinates of the lower edge of the parking frame and the line segments composed of the lower edge of the parking frame may be obtained directly. If the parking area includes the coordinates of the center point, the coordinates of the lower edge of the parking area may be obtained, wherein the ordinate of a cut-off point of the lower edge of the parking area may be the same as the ordinate of the center point of the fifth vehicle frame.
  • the length of a line segment may be half of the length of the lower edge of the fifth vehicle frame, so as to obtain coordinates of two ends of a truncated line segment at the lower edge of the parking area.
  • an inclination angle of a parking space may be calculated according to edge coordinates of the parking detection area and edge coordinates of the fifth vehicle frame, and an inclined fifth vehicle frame may be obtained by adjusting the fifth vehicle frame according to the inclination angle of the parking space.
  • the vehicle detection device may obtain the coordinates of the lower edge of the fifth vehicle frame and the line segments, calculate the inclination angle of the parking space through two line segments, and recalculate the position of the fifth vehicle frame according to the inclination angle of the parking space.
  • the vehicle detection device may obtain the coordinates, width, and height of the center point of the fifth vehicle frame through the vehicle detection algorithm, so as to calculate the coordinates of a rectangular frame of a target vehicle and the coordinates of each vertex.
  • each side of the rectangular frame may be rotated counterclockwise by the inclination angle of the parking space with the center point as the center to calculate a new vehicle frame, that is, a tilted fifth vehicle frame.
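  • the inclination angle may be computed from the two end points of the truncated lower-edge segment of the parking area, measured relative to the horizontal lower edge of the fifth vehicle frame, as in the short sketch below (the end-point representation, the degree unit, and the function name are assumptions); the tilted fifth vehicle frame can then be obtained by rotating the frame's corners about its center by this angle, e.g. with a rotation routine like the identification-frame sketch shown earlier.

    import math

    def parking_space_inclination(seg_start, seg_end):
        """Sketch: inclination angle (degrees) of the parking-area lower-edge
        segment relative to the horizontal, from its end points (x, y)."""
        dx = seg_end[0] - seg_start[0]
        dy = seg_end[1] - seg_start[1]
        return math.degrees(math.atan2(dy, dx))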
  • duplicate vehicle frames may be filtered by the tilted fifth vehicle frame.
  • the vehicle detection device may use NMS (non-maximum suppression) technology to filter the duplicate vehicle frames, and a filter threshold may be set to be between 0.20 and 0.25. According to the position and coordinates of the new vehicle frame, the filtering of a plurality of detection frames for one target may be completed.
  • FIG. 17 is a diagram of a distortion correction for a vehicle frame according to some embodiments of the present disclosure.
  • FIG. 17 shows a result of calculating the distortion angle and completing the correction of the vehicle frames through the vehicle frame configuration mentioned above.
  • An area frame D may be a preset parking area.
  • the distortion angle a may be calculated according to the data mentioned above, and then each vehicle frame in the parking area may be adjusted. Taking the adjusted frame E and frame F of the two vehicle frames in the middle of the parking area as an example, the NMS value may drop from the original 0.7 to 0.15.
  • the vehicle detection device may obtain the image of the vehicle to be detected, and use the resolution ratio of the image of the vehicle to be detected to arrange the first pre-detection area and the second pre-detection area at the center position of the image of the vehicle to be detected.
  • the vehicle detection may be performed in the first pre-detection area and the second pre-detection area, respectively to obtain the first vehicle frame in the first pre-detection area and the second vehicle frame in the second pre-detection area.
  • the first detection partition and the second detection partition may be arranged according to the first vehicle frame and/or the second vehicle frame.
  • the first detection partition may be divided into the first upper detection partition and the first lower detection partition according to the first vehicle frame
  • the second detection partition may be divided into the second upper detection partition and the second lower detection partition according to the second vehicle frame.
  • the vehicle detection may be performed on the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, and the target vehicle frame may be marked on the image of the vehicle to be detected according to the vehicle detection result.
  • the order of the steps does not indicate a strict execution order and does not constitute any limitation on the implementation process.
  • the specific execution order of each step should be determined according to its function and possible internal logic.
  • FIG. 18 is a diagram illustrating a structure of a vehicle detection device according to some embodiments of the present disclosure.
  • the vehicle detection device 1800 may include a pre-detection module 1810, a vehicle frame module 1820, a detection partition module 1830, and a vehicle detection module 1840.
  • the pre-detection module 1810 may be configured to obtain an image of a vehicle to be detected, and use a resolution ratio of the image of the vehicle to be detected to set a first pre-detection area and a second pre-detection area at a central position of the image of the vehicle to be detected.
  • the vehicle frame module 1820 may be configured to perform vehicle detection on the first pre-detection area and the second pre-detection area, and obtain a first vehicle frame in the first pre-detection area and a second vehicle frame in the second pre-detection area.
  • the detection partition module 1830 may be configured to arrange a first detection partition and a second detection partition according to the first vehicle frame and/or the second vehicle frame, divide the first detection partition into a first upper detection partition and a first lower detection partition according to the first vehicle frame, and divide the second detection partition into a second upper detection partition and a second lower detection partition according to the second vehicle frame.
  • the vehicle detection module 1840 may be configured to perform the vehicle detection on the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, and mark a target vehicle frame in the image of the vehicle to be detected according to a vehicle detection result.
  • FIG. 19 is a diagram illustrating a structure of a vehicle detection device according to another embodiment of the present disclosure.
  • the vehicle detection device 1900 of the embodiments of the present disclosure may include a processor 1910, a memory 1920, an input/output device 1930, and a bus 1940.
  • the processing device 110 in FIG. 1 may be the vehicle detection device 1900.
  • the processor 1910, the memory 1920, and the input/output device 1930 may be connected to the bus 1940, respectively.
  • the memory 1920 may store program data
  • the processor 1910 may be configured to execute the program data to implement the vehicle detection method described in the embodiments mentioned above.
  • the processor 1910 may also be referred to as a CPU (Central Processing Unit) .
  • the processor 1910 may be an integrated circuit chip with signal processing capabilities.
  • the processor 1910 may also be a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or other programmable logic devices, discrete gates or transistor logic devices, or discrete hardware assemblies.
  • the general-purpose processor may be a microprocessor or the processor 1910 may also be any conventional processor, or the like.
  • the present application may also provide a computer storage medium.
  • the computer storage medium 2000 may be configured to store program data 2010.
  • when the program data 2010 is executed by a processor, the vehicle detection method described in the embodiments mentioned above may be implemented.
  • the present application may also provide a computer program product, wherein the computer program product may include a computer program, and the computer program may be operable to cause a computer to execute the vehicle detection method as described in the embodiments of the present disclosure.
  • the computer program product may be a software installation package.
  • the vehicle detection method may be in a form of a software functional unit, and when the vehicle detection method is sold or used as an independent product, the vehicle detection method may be stored in a device, such as a computer readable storage medium.
  • the computer software product may be stored in a storage medium including several instructions to make a computer device (which may be a personal computer, a server, a network device, or the like) or a processor execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the storage media mentioned above may include a USB flash disk, a mobile hard disk, a ROM (Read-only Memory), a RAM (Random Access Memory), magnetic disks, optical disks, and other media that may store program codes.
  • the present disclosure uses specific words to describe the embodiments of the present disclosure.
  • “one embodiment” , “an embodiment” , and/or “some embodiments” means a certain feature, structure, or characteristic related to at least one embodiment of the present disclosure. Therefore, it should be emphasized and noted that “one embodiment” , “an embodiment” , or “an alternative embodiment” mentioned twice or more in different positions in the present disclosure does not necessarily refer to the same embodiment.
  • some features, structures, or characteristics in one or more embodiments of the present disclosure may be combined appropriately.
  • numbers describing the quantity of components and attributes are used. It should be understood that such numbers used for the description of the embodiments use the modifier “about”, “approximately”, or “substantially” in some examples. Unless otherwise stated, “about”, “approximately”, or “substantially” indicates that the number is allowed to vary by ±20%.
  • the numerical parameters used in the description and claims are approximate values, and the approximate values may be changed according to the required characteristics of individual embodiments. In some embodiments, the numerical parameters should consider the prescribed effective digits and adopt the method of general digit retention. Although the numerical ranges and parameters used to confirm the breadth of the range in some embodiments of the present disclosure are approximate values, in specific embodiments, settings of such numerical values are as accurate as possible within a feasible range.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Image Processing (AREA)

Abstract

A method for image processing, the method may include obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.

Description

METHOD AND SYSTEM FOR IMAGE PROCESSING
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202110406602.7, filed on April 15, 2021, the contents of which are hereby incorporated by reference.
TECHNICAL FIELD
The present disclosure relates to image processing, in particular, to a method and system for management of parking spaces based on image processing.
BACKGROUND
With the development of urbanization, the number of vehicles continues to increase, and outdoor closed parking lots are also increasing. Generally, a closed outdoor parking lot realizes a guidance and management of parking spaces in a parking lot through the detection of vehicles using cameras.
The angle of view of a panoramic camera is larger than that of a monocular camera. An image acquired by the panoramic camera can cover a large 360-degree scene, thereby covering more targets (e.g., vehicles). However, due to the characteristics of the panoramic camera, the identification rate of a vehicle occupying a small area in an image of such a large scene is low, causing underreporting.
SUMMARY
The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. The drawings are not to scale. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
One aspect of embodiments of the present disclosure may provide a method and a system for image processing. The method for image processing may include obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
In some embodiments, each of at least one matching area in the image may be enclosed in a region of the multiple regions, and the at least one matching area matches to the at least one subject.
In some embodiments, the determining the multiple regions of the image may include determining a reference area of the image, wherein a deformation degree of the reference area satisfies a second condition; and determining the multiple regions according to the reference area.
In some embodiments, the multiple regions may contain a set of regions which may contain a first region and a second region, and the determining the multiple regions according to the reference area may include identifying one or more reference subjects in the identification area; determining, from the one or more reference subjects in the reference area, a target reference subject that satisfies a third condition; determining an edge of the first region in the image based on an edge of a reference matching area which matches to the target reference subject; determining an edge of the second region in the image based on an edge of the reference area; and determining the set of regions according to the edge of the first region and the edge of the second region in the image.
In some embodiments, the third condition may be related to a position of each of the one or more reference subjects in the reference area.
In some embodiments, the third condition may be related to an identification confidence level of each of the one or more reference subjects in the identification area.
In some embodiments, the determining the identification result of the image based on the subject among the at least one subject, the at least a portion of the subject being in the overlapping area of the multiple regions may include determining, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject; and determining the identification result of the image according to an identification confidence level of the subject.
In some embodiments, the identifying the at least one subject in the multiple regions further may include: for a region of the multiple regions, obtaining an identification frame in the region based on an identification algorithm, wherein the identification frame has a corresponding relationship with a matching area which matches to the target in the region; correcting the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line; and identifying the subject in the region according to the corrected identification frame.
In some embodiments, the at least one subject may include a vehicle, the ground captured by the camera may include a parking area, and the method further may include determining a vacant parking space in the parking area according to the identification result of the image.
Another aspect of embodiments of the present disclosure may provide a system for image processing. The system for image processing may include: at least one storage device including a set of instructions; and at least one processor configured to communicate with the at least one storage device. When executing the set of instructions, the at least one processor may be configured to direct the system to perform operations including: obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
In some embodiments, each of at least one matching area in the image may be enclosed in a region of the multiple regions, and the at least one matching area matches to the at least one subject.
In some embodiments, the determining the multiple regions of the image may include  determining a reference area of the image, wherein a deformation degree of the reference area satisfies a second condition; and determining the multiple regions according to the reference area.
In some embodiments, the multiple regions may contain a set of regions which may contain a first region and a second region, and the determining the multiple regions according to the reference area may include identifying one or more reference subjects in the identification area; determining, from the one or more reference subjects in the reference area, a target reference subject that satisfies a third condition; determining an edge of the first region in the image based on an edge of a reference matching area which matches to the target reference subject; determining an edge of the second region in the image based on an edge of the reference area; and determining the set of regions according to the edge of the first region and the edge of the second region in the image.
In some embodiments, the third condition may be related to a position of each of the one or more reference subjects in the reference area.
In some embodiments, the third condition may be related to an identification confidence level of each of the one or more reference subjects in the identification area.
In some embodiments, the determining the identification result of the image based on the subject among the at least one subject, the at least a portion of the subject being in the overlapping area of the multiple regions may include determining, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject; and determining the identification result of the image according to an identification confidence level of the subject.
In some embodiments, the identifying the at least one subject in the multiple regions further may include: for a region of the multiple regions, obtaining an identification frame in the region based on an identification algorithm, wherein the identification frame has a corresponding relationship with a matching area which matches to the target in the region; correcting the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line; and identifying the subject in the region according to the corrected identification frame.
In some embodiments, the at least one subject may include a vehicle, the ground captured by the camera may include a parking area, and the system further may include determining a vacant parking space in the parking area according to the identification result of the image.
Another aspect of embodiments of the present disclosure may provide a system for image processing. The system for image processing may include: an obtaining module configured to obtain an image acquired by a camera with a height from the ground satisfying a first condition; a segmenting module configured to determine multiple regions of the image; an identifying module configured to identify at least one subject in the multiple regions; and a determining module configured to determine an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
Another aspect of embodiments of the present disclosure may provide a non-transitory computer readable medium including executable instructions. When executed by at least one processor, the executable instructions may direct the at least one processor to perform a method. The method may include: obtaining an image acquired by a camera with a height from the ground satisfying a first condition; determining multiple regions of the image; identifying at least one subject in the multiple regions; and determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures, and wherein:
FIG. 1 is a diagram illustrating an image processing system according to some embodiments of the present disclosure;
FIG. 2 is an exemplary flowchart illustrating a process for image processing according to some embodiments of the present disclosure;
FIG. 3 is an exemplary flowchart illustrating a process of determining a region according to some embodiments of the present disclosure;
FIG. 4A is a diagram illustrating a process of determining a region according to some embodiments of the present disclosure;
FIG. 4B is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure;
FIG. 5 is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure;
FIG. 6 is an exemplary flowchart illustrating a process of determining an identification result of an image according to some embodiments of the present disclosure;
FIG. 7 is a diagram illustrating a process of determining an IOU of a subject according to some embodiments of the present disclosure;
FIG. 8 is an exemplary flowchart illustrating a process of correcting an identification frame according to some embodiments of the present disclosure;
FIG. 9 is a diagram of correcting an identification frame according to some embodiments of the present disclosure;
FIG. 10 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure;
FIG. 11 is a diagram of an image of a vehicle to be detected according to some embodiments of the present disclosure;
FIG. 12 is a diagram of pre-detection areas and detection partitions according to some embodiments of the present disclosure;
FIG. 13 is a diagram of a segmented area recognition result according to some embodiments of the present disclosure;
FIG. 14 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure;
FIG. 15 is another exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure;
FIG. 16 is a diagram illustrating a distortion and tilt of an image according to some embodiments of the present disclosure;
FIG. 17 is a diagram of a distortion correction for a vehicle frame according to some embodiments of the present disclosure;
FIG. 18 is a diagram illustrating a structure of a vehicle detection device according to some embodiments of the present disclosure;
FIG. 19 is another diagram illustrating a structure of a vehicle detection device according to some embodiments of the present disclosure; and
FIG. 20 is a diagram illustrating a structure of a computer storage medium according to some embodiments of the present disclosure.
Additional features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present disclosure may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.
DETAILED DESCRIPTION
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant disclosure. Obviously, drawings described below are only some examples or embodiments of the present disclosure. Those skilled in the art, without further creative efforts, may apply the present disclosure to other similar scenarios according to these drawings. It should be understood that the purposes of these illustrated embodiments are only provided to those skilled in the art to practice the application, and not intended to limit the scope of the present disclosure. Unless obviously obtained from the context or the context illustrates otherwise, the same numeral in the drawings refers to the same structure or operation.
It should be understood that, as used in the disclosure, "system" , "device" , "unit" and/or "module" may be a method for distinguishing different components, elements, members, parts, or assemblies of different levels. However, if other words may achieve the same purpose, the words may be replaced by other expressions.
As used in the disclosure and the appended claims, the singular forms "a," "an," and "the" may include plural referents unless the content clearly dictates otherwise. In general, the terms "comprise" and "include" merely indicate that the clearly identified steps and elements are included, and these steps and elements do not constitute an exclusive listing. The methods or devices may also include other steps or elements. The term "based on" means "based at least in part on." The term "one embodiment" means "at least one embodiment;" the term "another embodiment" means "at least one other embodiment." Related definitions of other terms will be given in the description below.
As used in the disclosure, flowcharts may be used to describe operations performed by the system according to the embodiments of the present disclosure. It should be understood that the preceding or following operations may not be performed precisely in order. On the contrary, individual steps may be processed in reverse order or simultaneously. At the same time, other operations may be added to the processes, or a step or several steps may be removed from the processes.
FIG. 1 is a diagram illustrating an image processing system according to some embodiments of the present disclosure.
The image processing system 100 may be applied to process images, and may be particularly suitable for images with a large number of subjects and small pixels, images affected by occlusion and distortion of subjects located at a distant view and around edges of the images, etc. In some embodiments, the image processing system 100 may be applied to an application area with a large space, especially a parking area (e.g., an outdoor parking lot, an indoor parking lot, or the like) . In some embodiments, the image processing system 100 may be applied to identify a subject in an image. In some embodiments, a subject identified by the image processing may be a vehicle parked in the parking area, a vacant parking space in the parking area, or the like. In some embodiments, the image processing system 100 may monitor the parking situation of vehicles in the parking area in real time, therefore achieving the guidance and management of parking spaces in the parking area.
As shown in FIG. 1, the image processing system 100 may include a processing device 110, a network 120, an image acquisition device 130, and a supporting rod 140.
The processing device 110 may be configured to process information and/or data related to an image. For example, the processing device 110 may acquire an image from the image acquisition device 130 through the network 120. The processing device 110 may segment the image, identify multiple regions obtained by the segmentation, and determine at least one subject in the multiple regions. The processing device 110 may determine an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions. The processing device 110 may determine the multiple regions in the image based on a segmentation frame, a reference area, or the like.
In some embodiments, the processing device 110 may include one or more sub-processing devices (e.g., a single-core processing device or a multi-core processing device) . In some embodiments, the processing device 110 or a part of the processing device 110 may be integrated into the image acquisition device 130.
The network 120 may provide channels for information exchange. The network 120 may include a variety of network access points, such as wired or wireless access points, base stations, or network exchange points. Data sources may be connected to the network 120 and send information through the network 120 via the access points mentioned above.
The image acquisition device 130 may be an electronic device with a function of acquiring images or videos. In some embodiments, the image acquisition device 130 may include a camera for acquiring images. For example, the image acquisition device 130 may be a panoramic camera, a monocular camera, a multi-lens camera, or a rotatable camera. The image acquisition device 130 may continuously collect data, collect data at regular intervals, or collect data based on control instructions. The collected data may be stored in a local storage device, or be sent to a remote site via the network 120 for storage or further processing.
In some embodiments, the image acquisition device 130 may be installed on the supporting rod 140. The supporting rod 140 has a certain height (e.g., 10 meters, 15 meters, etc. ) , such that the image acquisition device 130 may acquire images of a large area. The supporting rod 140 may be any rod on which the image acquisition device 130 may be installed. For example, the supporting rod 140 may include light poles, or the like.
In some embodiments, the processing device 110 may include an obtaining module, a segmenting module, an identifying module, and a determining module.
The obtaining module may be configured to obtain an image. In some embodiments, the image may be acquired by a camera, and a height of the camera from the ground may satisfy a first condition. In some embodiments, the image may be a panoramic image. In some embodiments, the obtaining module may pre-process the image.
The segmenting module may be configured to determine multiple regions of the image. When the segmenting module determines the multiple regions of the image, each matching area in the image is fully enclosed in at least one of the multiple regions. The segmenting module may determine the multiple regions in a variety of ways. For example, the segmenting module may determine the multiple regions through a reference area or a segmentation frame.
The identifying module may be configured to identify at least one subject in the multiple regions, a reference area, etc. For example, the identifying module may identify the at least one subject based on an identification model. The identifying module may obtain a position of a subject in a region through the identification model. The position may be expressed in a variety of ways. For example, the position may be expressed by a target frame that includes the subject. The identification model may be a trained machine learning model. The identification model may include a U-Net model, a YOLO model, or the like.
The determining module may be configured to determine an identification result of the image. In some embodiments, the determining module may determine the identification result of the image based on a subject among the at least one subject, and the subject may be in an overlapping area of the multiple regions. In some embodiments, the determining module may determine the subject based on an intersection over union (IOU) of each subject in the multiple regions, and may further determine the identification result of the image.
The processing device 110 may also include a training module, which may be configured to train an initial model to obtain a trained model. For example, the training module may be configured to train an initial identification model to obtain the identification model.
For the relevant content of the obtaining module, the segmenting module, the identifying module, the determining module, and the training module, refer to FIG. 2, FIG. 3, FIG. 4A, FIG. 4B, FIG. 5, FIG. 6, FIG. 7, FIG. 8, and FIG. 9.
It should be noted that the above descriptions of the image processing system and its modules are only for convenience of description, and do not limit the present disclosure to the scope of the embodiments mentioned above. It should be understood that for those skilled in the art, after understanding the principles of the system, various modules may be arbitrarily combined with each other, or subsystems connected to other modules may be formed, without departing from the principles. In some embodiments, the obtaining module, the segmenting module, the identifying module, and the determining module may be different modules in one system, or one module may implement the functions of two or more of the modules mentioned above. For example, each module may share a storage module, or each module may have its own storage module. Such variations are all within the scope of the present disclosure.
FIG. 2 is an exemplary flowchart illustrating a process for image processing according to some embodiments of the present disclosure. As shown in FIG. 2, process 200 may include the following operations. In some embodiments, the process 200 may be performed by the processing device 110. In some embodiments, process 200 may be implemented in the system 100 illustrated in FIG. 1. For example, the process 200 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) . The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 200 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 200 as illustrated in FIG. 2 and described below is not intended to be limiting.
In 210, the processing device 110 (e.g., the obtaining module) may obtain an image acquired by a camera, and a height of the camera from the ground may satisfy a first condition.
The first condition may be satisfied if the height of the camera from the ground exceeds a height threshold. The height threshold may be 10 meters, 12 meters, 15 meters, or the like. The height of the camera from the ground may be changed according to the actual scene.
In some embodiments, the camera may be installed on a device with a certain height for taking an image of an actual scene. For example, the camera may be installed on a light pole to take an image of a parking lot (e.g., an outdoor parking lot) . The height of the camera installed on the light pole from the ground of the outdoor parking lot may be 10-15 meters. As another example, the camera may be installed on a pole to take an image of an indoor or an outdoor warehouse. The height of the camera installed on the pole from the ground of the warehouse may be 15-20 meters.
The image acquired by the camera may be a panoramic image. For example, the image may be a panoramic image of an outdoor parking lot. As another example, the image may be a panoramic image of an indoor warehouse, or the like. In some embodiments, the number of images acquired by the camera may be more than one.
In some embodiments, the camera may be a panoramic camera. For example, the camera may be a four-lens fisheye camera, or the like. Accordingly, the image taken by the camera may be a panoramic image.
In some embodiments, the camera may be a non-panoramic camera. For example, the camera may be a plurality of monocular cameras, and the plurality of monocular cameras may be installed at different positions to take images of different areas on the ground. The obtaining module may obtain the panoramic image of the ground based on the images of different areas taken by the monocular cameras.
In some embodiments, the camera may be a rotatable camera. The rotatable camera may be rotated to different angles to take images. The obtaining module may obtain the panoramic image of the ground based on the images acquired by the rotatable camera from various angles.
In some embodiments, the panoramic image may be deformed. The farther away from the center of the image, the greater the deformation degree of the panoramic image may be. The deformation of the panoramic image may be symmetrical. The panoramic image may include a distant view part and a close view part. The distant view part may refer to an area in the panoramic image representing a physical area in space farther from the camera (also called the distant view) , and the close view part may refer to a region in the panoramic image representing a physical area in space closer to the camera (also called the close view) . In a panoramic image, for objects of the same physical size, the number of pixels occupied in the distant view part may be smaller than that in the close view part. Therefore, in the panoramic image, objects in the distant view part may be prone to be occluded by other objects in the close view part. For example, in a panoramic image, vehicles on the ground in the distant view part may be easily occluded by other objects in the close view part (e.g., tall buildings, trees, etc. ) .
In some embodiments, the obtaining module may acquire images from the camera periodically (e.g., at an interval of a preset number of frames) . In some embodiments, the obtaining module may pre-process the images acquired from the camera. For example, pre-processing may include resolution adjustment, image enhancement, cropping, or the like.
In some embodiments, the obtaining module may crop the image acquired by the camera. As mentioned above, an object in the distant view part of the panoramic image may occupy few pixels and may be easily occluded by other objects. The obtaining module may crop the panoramic image acquired by the camera to remove the distant view part. For example, consider a panoramic image A including an edge a and an edge b, where the edge a is close to the distant view part and the edge b is close to the close view part. The obtaining module may remove the 1/4 area of the panoramic image A that is far away from the edge b and close to the edge a. The remaining area may be used as the effective area of the image for subsequent processing (e.g., operation 220, etc. ) .
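Merely for illustration, the cropping of the distant view part may be sketched in Python as follows. The assumptions that the distant view occupies the top rows of the image array and that one quarter of the height is removed, as well as the function name crop_distant_view, are introduced only for this sketch and are not limiting.

```python
import numpy as np

def crop_distant_view(panorama: np.ndarray, crop_ratio: float = 0.25) -> np.ndarray:
    """Remove the distant-view strip of a panoramic image.

    Assumes the edge close to the distant view (edge a) is the top row of the
    array and the edge close to the close view (edge b) is the bottom row;
    crop_ratio is the fraction of the height to discard (1/4 in the example).
    """
    height = panorama.shape[0]
    cut = int(height * crop_ratio)
    # Keep the remaining 3/4 of the image as the effective area.
    return panorama[cut:, :]
```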
In 220, the processing device 110 (e.g., the segmenting module) may determine multiple regions of the image.
The region may be a partial region of the image. The number of regions may be one or more. In some embodiments, all regions combined may include the full content of the image. In some embodiments, the region may be of any shape, e.g., a rectangle, or the like.
In some embodiments, there may be an overlapping area between the multiple regions. As used herein, the overlapping area between the multiple regions may refer to an overlapping area between at least two of the multiple regions. For more descriptions about the overlapping area, refer to operation 240 and the related descriptions.
In some embodiments, at least one matching area may be included in the image. The matching area may refer to an area in the image that is configured to match a subject. In some embodiments, the matching area may be a bounding box of any shape (e.g., a rectangle) that may include the subject. The bounding box may be a preset frame to place the subject in. For example, the image may be an image of a parking lot, the subject may be a vehicle, and the matching area may therefore be the parking spaces (including occupied parking spaces and vacant parking spaces) . As another example, the image may be an image of a warehouse, the subject may be the goods or the shelf, and the matching area may therefore be the shelf spaces of the goods or the shelf. The matching area may also be an area obtained by rotating, zooming (e.g., zooming in) , and/or merging the parking spaces or the shelf spaces. Alternatively, the bounding box may be a frame determined based on the outline of the subject. For example, the bounding box may be a target frame that identifies a target position output by the identification model. Refer to operation 230 for the identification model and the target frame.
When the segmenting module determines the multiple regions of the image, the multiple regions may satisfy a condition (also referred to as a fifth condition) . In some embodiments, the fifth condition may be that each matching area in the image is included in at least one of the multiple regions. It should be understood that a matching area being included in a region means that the matching area in the image is fully enclosed in the region and is not segmented by an edge of the region. As mentioned above, there may be an overlapping area between the multiple regions, and therefore some matching areas may be enclosed in more than one region.
The segmenting module may segment the image in a variety of ways to obtain the multiple regions. For example, the segmenting module may segment the image based on the input of a user by obtaining the input of the user from a user terminal. For another example, the segmenting module may determine the multiple regions through a reference area or a segmentation frame. More descriptions for the determination of the multiple regions may be found elsewhere in the present disclosure (e.g., FIG. 3, FIG. 4, FIG. 5, and the related descriptions thereof) .
In some embodiments, the segmenting module may evaluate the determined multiple regions to determine if the segmentation is reasonable. Whether the segmentation is reasonable may refer to whether each matching area in the image is completely enclosed in a region. If the segmentation is unreasonable, the segmenting module may re-segment or feedback to the user for re-segmentation.
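Merely for illustration, a rule-based check of the fifth condition may be sketched in Python as follows, with each matching area and region expressed as an axis-aligned box (x1, y1, x2, y2); the box format and the function names are assumptions of this sketch.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned

def encloses(region: Box, matching_area: Box) -> bool:
    """True if the matching area lies fully inside the region, i.e., is not cut by any region edge."""
    rx1, ry1, rx2, ry2 = region
    mx1, my1, mx2, my2 = matching_area
    return rx1 <= mx1 and ry1 <= my1 and mx2 <= rx2 and my2 <= ry2

def segmentation_is_reasonable(regions: Sequence[Box], matching_areas: Sequence[Box]) -> bool:
    """Fifth condition: every matching area is enclosed in at least one region."""
    return all(any(encloses(r, m) for r in regions) for m in matching_areas)
```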
In some embodiments, the evaluation of whether the segmentation is reasonable may be achieved through an evaluation model. The evaluation model may be DNN, CNN, or the like. For example, the multiple regions obtained by segmentation may be input into the evaluation model, and the evaluation of whether the segmentation is reasonable may be the output of the evaluation model.
In 230, the processing device 110 (e.g., the identifying module) may identify at least one subject in the multiple regions.
The subject may be an object that needs to be identified in the image. The subject may change according to the application scenario. For example, when the acquired image is a parking area, the subject may be a parked vehicle (e.g., a vehicle parked in a parking space) . As another example, when the acquired image is a warehouse, the subject may be goods (e.g., the goods placed in the shelf spaces) .
In some embodiments, for each region, the identifying module may identify the position of the subject in the region. The position may be expressed in a variety of ways. For example, the position may be expressed by a target frame (or a bounding box) that includes the subject. As another example, the position may be expressed by a center point of the subject.
The identifying module may identify the at least one subject in the multiple regions in a variety of ways. In some embodiments, the identifying module may identify the at least one subject in the multiple regions based on an identification model.
The identification model may be a trained machine learning model. The machine learning model may include a U-Net model, a YOLO model, or the like.
The identification model may identify a subject in each of the multiple regions separately. For one of the multiple regions, the input of the identification model may be the part of the image within the region. In some embodiments, the part of the image within the region may be obtained by clipping the image. Alternatively, the part of the image within the region may be obtained by masking the parts of the image outside the region.
Taking the YOLO model as an example, for one of the multiple regions, the region in the image may be input into the YOLO model, and the YOLO model may output an area corresponding to the pixels of the subject in the region.
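Merely for illustration, identifying the regions one by one may be sketched in Python as follows; the detect callable stands in for any trained detector (e.g., a YOLO-style model), and the clipping/masking helpers and the box format (x1, y1, x2, y2) are assumptions of this sketch.

```python
import numpy as np

def region_patch(image: np.ndarray, region, mode: str = "clip") -> np.ndarray:
    """Extract the part of the image belonging to a region.

    region is (x1, y1, x2, y2) in pixel coordinates. "clip" crops the region
    out of the image; "mask" keeps the full image size but zeroes every pixel
    outside the region.
    """
    x1, y1, x2, y2 = map(int, region)
    if mode == "clip":
        return image[y1:y2, x1:x2]
    masked = np.zeros_like(image)
    masked[y1:y2, x1:x2] = image[y1:y2, x1:x2]
    return masked

def identify_regions(image: np.ndarray, regions, detect):
    """Run the detector on each region separately and collect (region, detections) pairs."""
    results = []
    for region in regions:
        patch = region_patch(image, region, mode="clip")
        results.append((region, detect(patch)))  # detect: patch -> detections
    return results
```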
In some embodiments, the training module may train an initial identification model based on training data to obtain the identification model. Taking a parking lot scene as an example, the training data may include sample images of multiple parking lots, and labels corresponding to the sample images may be the positions of vehicles in the sample images. The sample images may be collected by cameras at heights of 6 meters, 12 meters, and 20 meters from the ground.
In some embodiments, for each region, the identifying module identifying the subject in the region may include acquiring an identification frame in the region, and determining the subject in the region based on the identification frame. There may be a corresponding relationship between the identification frame and the subject, such as a one-to-one correspondence. The identification frame may be used when determining whether the subject or the type of the subject is included in the region. For example, the identification model may determine whether the subject is included in the region by processing the identification frame.
In some embodiments, the identifying module may correct the identification frame, and identify the subject after the correction. For example, the identification model may process the multiple regions to obtain and output the identification frame. The identifying module may correct the identification frame, and input the corrected identification frame into the identification model to identify the subject in the multiple regions. More descriptions for the correction of the identification frame may be found in FIG. 8, FIG. 9, and the related descriptions.
In 240, the processing device 110 (e.g., the determining module) may determine an identification result of the image based on a subject among the at least one subject, and the subject may be in an overlapping area of the multiple regions.
The subject in the overlapping area may refer to that at least a portion of the subject appears in the overlapping area, and the overlapping area includes all or a part of the content of the subject.
The overlapping area may refer to an area in the image that appears in more than one region among the multiple regions. When a subject appears in more than one region, the subject appears, accordingly, in the overlapping area of the more than one region. In some embodiments, the  overlapping area may be an overlapping part between two regions, or an overlapping part between more than two regions.
The identification result of the image may refer to the identification of the subject and the position of the subject in the image. More descriptions for the representation of the position of the subject may be found elsewhere in the present disclosure.
In some embodiments, the determining module may determine the subject in the overlapping area of the multiple regions based on an intersection over union (IOU) of each subject in the multiple regions, and may further determine the identification result of the image. For the relevant content of determining the identification result of the image, refer to FIG. 6 and the related descriptions.
In some embodiments, the subject may be a vehicle, and the image from the camera may include a parking area, such as an indoor parking lot, an outdoor parking lot, roadside parking spaces, etc. In some embodiments, the identifying module may determine a vacant parking space in the parking area according to the identification result of the image. For example, the identifying module may obtain a distribution map of parking spaces in the parking area from a storage device, and then compare the distribution map of parking spaces with the positions of vehicles in the image to determine whether there is a vacant parking space for a vehicle. Vacant spaces in other areas (e.g., a vacant shelf space in a warehouse) may also be determined in a similar way.
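Merely for illustration, comparing the distribution map of parking spaces with the detected vehicle positions may be sketched in Python as follows; the axis-aligned box format, the coverage threshold of 0.3, and the function names are assumptions of this sketch.

```python
def box_intersection_area(a, b) -> float:
    """Area of overlap between two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    w = min(ax2, bx2) - max(ax1, bx1)
    h = min(ay2, by2) - max(ay1, by1)
    return max(0.0, w) * max(0.0, h)

def vacant_spaces(parking_spaces, vehicle_boxes, occupied_ratio: float = 0.3):
    """Return the parking spaces not covered by any detected vehicle.

    A space counts as occupied when some vehicle box covers more than
    occupied_ratio of the space's area; the ratio is chosen for illustration.
    """
    vacant = []
    for space in parking_spaces:
        sx1, sy1, sx2, sy2 = space
        space_area = max(0.0, sx2 - sx1) * max(0.0, sy2 - sy1)
        covered = any(
            box_intersection_area(space, v) > occupied_ratio * space_area
            for v in vehicle_boxes
        )
        if not covered:
            vacant.append(space)
    return vacant
```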
As mentioned above, the image may include a panoramic image, and the panoramic image may be deformed. By segmenting the panoramic image and identifying the regions obtained after the segmentation, the accuracy of identification may be improved, and the problems of inaccurate identification when the identification model attempts to identify an entire panoramic image may be avoided (e.g., problems due to a large number of subjects, small pixels, and occlusion and distortion in the panoramic image) . Moreover, when determining the multiple regions, ensuring that each matching area in the panoramic image is completely enclosed in a region may further ensure the accuracy of the identification.
By considering the subject in the overlapping area when determining the identification result of the subject in the image, the same subject may be prevented from being mistaken for two subjects, and therefore the identification accuracy may be further improved.
Taking the scenario that the panoramic image is a parking lot as an example, the matching area may be configured to be an enlarged image area of a parking space or a combined image area of multiple parking spaces to deal with the situation that multiple parking spaces are occupied by one vehicle due to illegal parking, therefore improving the accuracy of the determination of vacant parking spaces.
FIG. 3 is an exemplary flowchart illustrating a process of determining a region according to some embodiments of the present disclosure. As shown in FIG. 3, process 300 may include the following operations. In some embodiments, process 300 may be implemented in the system 100 illustrated in FIG. 1. For example, the process 300 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) . The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 300 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 300 as illustrated in FIG. 3 and described below is not intended to be limiting. In some embodiments, operation 220 in FIG. 2 may be performed according to process 300 illustrated in FIG. 3.
In 310, the processing device 110 (e.g., the segmenting module) may determine a reference area of the image, and a deformation degree of the reference area may satisfy a second condition.
In some embodiments, the number of reference areas may be one or more. In some embodiments, the reference area may be any shape, e.g., a rectangle, a circle, or the like.
The deformation degree of the reference area may be the degree of deformation of the image in the reference area relative to the real scene. In some cases, due to the imaging characteristics of the camera (e.g., a multi-lens camera taking a panoramic image) , the image may be distorted. For example, an area that is straight in the real scene may be an area that is curved in the panoramic image.
The second condition may be that the deformation degree of the reference area in the image is less than a threshold (e.g., the threshold may be 60%, 50%, etc. ) . In some embodiments, the threshold of the deformation degree in the second condition may be adjusted according to actual application scenarios. For example, the height of the camera from the ground may affect the second condition. The greater the height of the camera from the ground, the greater the threshold in the second condition may be.
The deformation degree of an area in the image may be determined in a variety of ways. For example, the segmenting module may determine the deformation degree based on the degree of deformation of an object in the area of the image relative to the object in the real scene. For instance, an area A in the image may include an object whose outline is curved in the image but is actually straight in the real scene. The segmenting module may determine the deformation degree based on the curvature of the outline: the greater the curvature is, the greater the deformation degree is.
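Merely for illustration, one simple curvature-based score may be sketched in Python as follows: the maximum deviation of the sampled outline points from the straight chord between its endpoints, normalized by the chord length. This particular measure is an assumption of the sketch and stands in for any curvature measure that may actually be used.

```python
import numpy as np

def deformation_degree(outline: np.ndarray) -> float:
    """Score how strongly an outline that should be straight bends in the image.

    outline is an (N, 2) array of pixel coordinates sampled along the outline.
    The score is the maximum perpendicular distance of the points from the
    chord joining the first and last point, divided by the chord length
    (0 means perfectly straight; larger values mean stronger deformation).
    """
    p0, p1 = outline[0].astype(float), outline[-1].astype(float)
    chord = p1 - p0
    chord_len = np.linalg.norm(chord)
    if chord_len == 0:
        return 0.0
    rel = outline.astype(float) - p0
    # Perpendicular distance via the 2-D cross product |rel x chord| / |chord|.
    dist = np.abs(rel[:, 0] * chord[1] - rel[:, 1] * chord[0]) / chord_len
    return float(dist.max() / chord_len)
```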
In some embodiments, the segmenting module may determine the reference area of the image in a variety of ways. For example, the segmenting module may determine the deformation degrees of different candidate areas in the image, and determine the reference area based on the deformation degrees of the candidate areas and the second condition. One of the candidate areas with a deformation degree that satisfies the second condition may be determined as the reference area. As another example, the segmenting module may determine the reference area based on a reference line; more descriptions for determining the reference area based on the reference line may be found in the following descriptions.
In some embodiments, the reference area may be a matching area or a combined area of multiple matching areas.
The image, the multiple regions, the reference area, and the matching area may all include multiple edges. In some embodiments, the segmenting module may obtain coordinates of each edge of the image, the multiple regions, the reference area, and the matching area in the same two-dimensional coordinate system (e.g., the two-dimensional coordinate system in FIG. 4A, 4B, 5, and 7) . For example, a certain corner of the image may be used as the origin of the coordinate system (point o as shown in FIG. 4A, etc. ) , and the image may be located in the first quadrant of the coordinate system. The two edges (i.e., a first edge of the image and a second edge of the image) corresponding to the length of the image may be parallel to the horizontal axis (x axis as shown in FIG. 4A) of the coordinate system. Compared to the second edge of the image, the first edge of the image may be closer to the close view part of the image and the origin of the coordinate system. The two edges (i.e., a third edge of the image and a fourth edge of the image) corresponding to the width of the image may be parallel to the vertical axis (y axis as shown in FIG. 4A) of the coordinate system. Compared to the fourth edge of the image, the third edge of the image may be closer to the origin of the coordinate system. In a similar manner, the segmenting module may determine the first edge, the second edge, the third edge, and the fourth edge of each of the multiple regions, the reference area, and the matching area.
In some embodiments, the segmenting module may obtain the reference line of the image, designate an edge of the image which intersects with the reference line as an edge of the reference area, and determine, based on a pixel ratio of the length to the width of the image, an area ratio of the two portions of the reference area on the two sides of the reference line. The length and the width of the reference area may be preset pixel values.
In some embodiments, the segmenting module may designate an edge of the image which intersects with the reference line and is closer to the close view part as an edge of the reference area. For example, the segmenting module may take the first edge of the image as an edge of the reference area.
The reference line may refer to a line passing through the image. In some embodiments, the reference line may be a centerline of the image. For example, for a panoramic image with symmetrical deformation, when the centerline is used as the reference line, the deformation degree of the image on two sides of the reference line may be symmetrically distributed.
In some embodiments, the reference line may not pass through a subject (e.g., a vehicle, etc. ) . When the reference line of the image passes through a subject, the segmenting module may move the reference line such that the subject is not segmented by the reference line. For example, the segmenting module may move the reference line to an area between the subject and another subject adjacent to the subject.
In some embodiments, the size of the reference area may be a preset fixed size. Alternatively, the size of the reference area may be adjusted adaptively according to the size of the image. The size of the reference area may also be related to the parameters of the identification model. For example, a preset pixel value of the length of the reference area may be 1920px, and a preset pixel value of the width may be 1080px.
In some embodiments, when there are multiple reference areas, the segmenting module may merge the multiple reference areas, and determine the multiple regions through subsequent operations (e.g., operation 320, etc. ) based on the merged area.
In some embodiments, when there are multiple reference areas, the reference areas may have a corresponding relationship with the multiple regions. For example, the reference areas may include a first reference area configured to determine a region on one side of the reference line, and a second reference area configured to determine another region on the other side of the reference line.
In some embodiments, the area ratio of the two portions of the reference area on the two sides of the reference line may be related to the pixel ratio of the length to the width of the image, a type (e.g., a first reference area and a second reference area) of the reference area, etc. For example, the pixel ratio of the length to the width of the image may be f, the reference line may be parallel to the vertical axis mentioned above. The first reference area may correspond to the left side of the reference line, therefore the area ratio of the left portion of the first reference area to the right portion of the first reference area may be f. The second reference area may correspond to the right side of the reference line, therefore the area ratio of the right portion of the second reference area to the left portion of the second reference area may be f.
More descriptions for determining the reference area based on the reference line may be found in FIG. 4B and the related descriptions.
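Merely for illustration, placing the first reference area around a vertical reference line may be sketched in Python as follows, using the coordinate conventions above (origin at a corner of the image, the first edge of the image on the x axis) and the 1920px by 1080px preset; the function name and the omission of clamping to the image bounds are assumptions of this sketch.

```python
def first_reference_area(image_w: int, image_h: int, ref_line_x: float,
                         area_w: int = 1920, area_h: int = 1080):
    """Place the first reference area used for the left side of the reference line.

    The area sits on the first edge of the image (y = 0), extends area_h pixels
    upward, and is split by the vertical reference line so that the ratio of its
    left portion to its right portion equals f, the length-to-width pixel ratio
    of the image. Returns (x1, y1, x2, y2).
    """
    f = image_w / image_h
    left_w = area_w * f / (f + 1.0)   # portion on the left of the reference line
    right_w = area_w - left_w         # portion on the right of the reference line
    return (ref_line_x - left_w, 0.0, ref_line_x + right_w, float(area_h))
```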
In 320, the processing device 110 (e.g., the segmenting module) may determine the multiple regions according to the reference area.
In some embodiments, the specific positions of the multiple regions in the image may be determined directly based on the reference area. For example, the edges of the reference area may be designated as the edges of the multiple regions. For instance, if a reference area is included in an image, multiple edges of the reference area and multiple edges of the image may be combined to determine the edges of different regions, such that there is an overlapping area between the multiple regions, and the overlapping area may be the reference area.
In some embodiments, the specific positions of the multiple regions in the image may be determined based on the reference area and a reference subject included in the reference area.
In some embodiments, the multiple regions may include a first region and a second region. There may be a vertical position relationship (i.e., one being on top of another) between the first region and the second region in the set of regions. The first region and the second region may be arranged along the vertical direction of the image.
As mentioned above, the multiple regions may be related to the two sides of the reference line. In some embodiments, each side of the reference line may include a set of regions, respectively.
In some embodiments, in reference to the relevant descriptions of subject identification in operation 230, the segmenting module may identify one or more reference subjects in the reference area. The segmenting module may determine a target reference subject satisfying a third condition from the one or more reference subjects in the reference area. The segmenting module may determine an edge of the first region in the image based on an edge of a reference matching area that corresponds to the target reference subject, determine an edge of the second region in the image based on an edge of the reference area, and determine the set of regions according to the edge of the first region and the edge of the second region in the image.
The target reference subject may be a reference subject used to determine the multiple regions. The target reference subject may be one or more.
In some embodiments, the third condition may be related to the deformation degree of each area where a reference subject is located. For example, the third condition may be that the deformation degree of the reference subject is within a preset threshold range (e.g., 50%-55%, etc. ) . The preset threshold range may be related to the deformation degree of the entire image. For example, the preset threshold range may be determined based on an average deformation degree of the image.
In some embodiments, the third condition may be related to the position of each of the one or more reference subjects in the reference area. For example, the third condition may be that the distance between the reference subject and the edge of the reference area is less than a threshold. As another example, the third condition may be that the reference subject is among the top n reference subjects ranked in ascending order of distance from the edge of the reference area, where n is larger than 0.
Continuing to take the example discussed in operation 310, if the segmenting module uses the first edge of the image as the first edge of the reference area, the third condition may be that the distance between the reference subject and the second edge of the reference area is less than a threshold.
As mentioned above, the segmenting module may determine the first reference area and the second reference area based on the reference line. In some embodiments, the third condition may be related to the type of the reference area. Continuing to take the example discussed in operation 310, when the first reference area corresponds to the left side of the reference line, the third condition may  be that the distance between the reference subject and the edge corresponding to the upper left corner of the first reference area (e.g., the second edge and the third edge) is less than a threshold.
In some embodiments, the third condition may be related to the position and an identification confidence level of at least one reference subject in the reference area. For example, the third condition may be that a reference subject has the maximum identification confidence level among the reference subjects whose distance from the edge of the reference area is less than a threshold.
In reference to the relevant descriptions of subject identification in operation 230, the identification model may obtain the position and the identification confidence level of the reference subject in the reference area by processing the reference area. The identification confidence level may refer to the probability of the reference subject being at the obtained position. In some embodiments, the identification confidence level may be related to whether the reference subject completely appears in the reference area. For example, the less completely the reference subject appears in the reference area, the lower the identification confidence level may be.
In some embodiments, when the segmenting module determines multiple target reference subjects based on the third condition, the segmenting module may select one of the multiple target reference subjects to determine the region, or merge the multiple target reference subjects and determine the region based on the merged result, or proceed in other ways.
In some embodiments, when the reference area does not include a target reference subject, the multiple regions may be directly determined based on the edge of the reference area.
In some embodiments, when the segmenting module determines the first region based on the reference matching area, the reference matching area may be completely enclosed in the region. In some embodiments, the segmenting module may designate the edge of the reference matching area as the edge of the first region and keep the types of the edges the same. For example, the first edge of the reference matching area may be designated as the first edge of the first region. Similarly, the segmenting module may designate the edge of the reference area as the edge of the second region and keep the types of the edges the same. For example, the second edge of the reference area may be designated as the second edge of the second region.
In some embodiments, when the segmenting module determines the edge of the first region based on the reference matching area and determines the edge of the second region based on the reference area, the position of the target reference subject in the reference area may be considered. For example, taking the coordinate system mentioned above as an example, if the target reference subject is close to the second edge of the reference area, the first edge of the reference matching area may be designated as the first edge of the first region, and the second edge of the reference area may be designated as the second edge of the second region. When the target reference subject is located in other positions, the segmenting module may adopt a similar method as the target reference subject close to the second edge of the reference area.
In some embodiments, the segmenting module may determine other edges of the multiple regions based on the edge of the image and/or the edge of the reference area. For example, after the first edge of the first region and the second edge of the second region are determined in the above manner, other edges of the first region and the second region may be determined based on the third or fourth edge of the image or reference area.
In some embodiments, the distance between two parallel edges of a region may be a preset threshold.
As mentioned above, the reference area may be determined based on the reference line. In some embodiments, the segmenting module may determine the edge of a region based on the reference line. For example, in the coordinate system mentioned above, the segmenting module may use the reference line as either the third edge or the fourth edge of the region, which may be determined correspondingly based on the position of the region relative to the reference line.
More descriptions for determining the multiple regions based on the reference area may be found in FIG. 4A and FIG. 4B.
FIG. 4A is a diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
As shown in FIG. 4A, in the coordinate system shown, the segmenting module may segment a panoramic image 400 and obtain a reference area 402. The identifying module may identify the reference area 402. When the segmenting module determines a first region 405 and a second region 403, a target reference subject 401 in the reference area 402 may be determined based on the third condition. The target reference subject 401 may be close to the upper left corner of the reference area 402. That is, the distances from the target reference subject 401 to the second edge (i.e., the upper edge) and to the third edge (i.e., the left edge) that form the upper left corner of the reference area 402 are both less than a threshold. The target reference subject 401 may be completely enclosed in the reference area 402, and the identification confidence level may also be high. The target reference subject 401 may correspond to a reference matching area 404.
The segmenting module may take the first edge (i.e., the lower edge) of the reference matching area 404 as the first edge (i.e., the lower edge) of the first region 405, and take the edge with a pixel distance of 1080px to the first edge as the second edge (i.e., the upper edge) of the first region. The segmenting module may take the second edge (i.e., the upper edge) of the reference area 402 as the second edge (i.e., the upper edge) of the second region 403, and take the first edge (i.e., the lower edge) of the reference area 402 as the first edge (i.e., the lower edge) of the second region 403. The segmenting module may take the fourth edge (i.e., the right edge) of the reference area 402 as the fourth edge (i.e., the right edge) of both the first region 405 and the second region 403, and take the third edge (i.e., the left edge) of the image as the third edge (i.e., the left edge) of both the first region 405 and the second region 403.
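Merely for illustration, the edge assignments of the FIG. 4A example may be sketched in Python as follows; boxes are (x1, y1, x2, y2) with y increasing upward as in the coordinate system above, and the function name and the 1080px region height are assumptions of this sketch.

```python
def regions_from_reference(reference_area, reference_matching_area,
                           region_h: int = 1080):
    """Assemble the first and second regions as in the FIG. 4A example.

    The first region starts at the lower edge of the reference matching area
    and extends region_h pixels upward; the second region reuses the lower and
    upper edges of the reference area. Both regions share the reference area's
    right edge and the image's left edge (x = 0).
    """
    _, ref_y1, ref_x2, ref_y2 = reference_area
    _, match_y1, _, _ = reference_matching_area
    first_region = (0.0, match_y1, ref_x2, match_y1 + region_h)
    second_region = (0.0, ref_y1, ref_x2, ref_y2)
    return first_region, second_region
```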
FIG. 4B is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
As shown in FIG. 4B, in the coordinate system shown in the figure, the identifying module may determine the reference area 412 based on the method stated above. The area ratio of the two portions of the reference area 412 on the left and right side of the reference line may be the pixel ratio f of the length to the width of the image. The reference area 412 may be used to determine the first region 413 and the second region 414 on the left side of the reference line 411.
The identifying module may determine, from the reference area 412, a target reference subject 415 and a reference matching area 416 that corresponds to the target reference subject 415, wherein the determination of the target reference subject 415 is similar to that in FIG. 4A. The segmenting module may determine the first edge, the second edge, and the third edge of each of the first region 413 and the second region 414 in a manner similar to that shown in FIG. 4A. The segmenting module may take the reference line 411 as the fourth edge (i.e., the right edge) of both the first region 413 and the second region 414.
In some embodiments, the segmenting module may determine the multiple regions by moving a segmentation frame a plurality of steps on the image. In some embodiments, the segmenting module may move the segmentation frame a plurality of steps according to preset moving distances and moving directions, and take the areas within the segmentation frame at the stopping positions of the segmentation frame between each step as the multiple regions.
In some embodiments, a size of the segmentation frame may be a preset fixed size. The size of the segmentation frame may be adjusted adaptively according to the image size.
In some embodiments, the segmenting module may determine a moving distance of each step in the plurality of steps. Specifically, the segmenting module may determine the moving distance of the step based on a target historical subject in a historical region. The historical region may be a region determined by the position of the current segmentation frame. In other words, the historical region may be the latest determined region before the current step of moving.
In some embodiments, the segmenting module may determine a target edge of the historical region based on the moving direction of the segmentation frame. The target edge may refer to the edge that is closest to the moving direction and intersects with the moving direction among the multiple edges of the historical region. For example, if the segmentation frame in FIG. 5 moves upward, the target edge of the historical region 501 may be the second edge (i.e., the upper edge) .
The identifying module may identify at least one historical subject in the historical region. The identification of a historical subject may be the same as or similar to the identification of the subject as described elsewhere in the present disclosure (e.g., operation 230 in FIG. 2) . The segmenting module may determine the target historical subject based on the distance between each historical subject and the target edge, and the identification confidence level of each historical subject. In some embodiments, the segmenting module may designate the historical subject that has an identification confidence level lower than a threshold and that is the farthest from the target edge as the target historical subject. The segmenting module may determine the moving distance of the step based on the distance between the edge of a historical matching area intersecting with the moving direction and the edge of the historical region intersecting with the moving direction, wherein the historical matching area corresponds to the target historical subject.
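Merely for illustration, choosing the moving distance of one step for an upward-moving segmentation frame may be sketched in Python as follows; the confidence threshold, the fallback of moving by the full frame height when no cut subject is found, and the box format (x1, y1, x2, y2) with y increasing upward are assumptions of this sketch.

```python
def next_step_distance(historical_region, detections, conf_threshold: float = 0.5):
    """Pick how far an upward-moving segmentation frame advances in one step.

    detections is a list of (matching_area_box, confidence) found in the
    historical region. Among the subjects whose confidence falls below
    conf_threshold (treated here as likely cut by the target edge), the one
    farthest from the target (upper) edge becomes the target historical
    subject, and the frame moves so that its lower edge lands on the lower
    edge of that subject's matching area.
    """
    _, region_y1, _, region_y2 = historical_region
    cut_candidates = [(box, conf) for box, conf in detections if conf < conf_threshold]
    if not cut_candidates:
        # No cut subject: advance by the full frame height (an assumption).
        return region_y2 - region_y1
    # Farthest from the upper edge = smallest lower-edge y coordinate.
    target_box, _ = min(cut_candidates, key=lambda bc: bc[0][1])
    # The new lower edge of the frame coincides with the matching area's lower edge.
    return target_box[1] - region_y1
```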
FIG. 5 is another diagram illustrating a process of determining a region according to some embodiments of the present disclosure.
As shown in FIG. 5, in the coordinate system, the moving direction 505 of the segmentation frame is from bottom to top. The segmentation frame may determine the historical region 501 and the target historical subject 502. The target historical subject 502 may be the historical subject that is the farthest from the second edge (i.e., the upper edge) of the historical region among all the incomplete historical subjects in the historical region 501, each of which is segmented by the second edge of the historical region. The second edge of the historical region 501 may be the edge of the historical region 501 that intersects with the moving direction 505 and that is farthest from the moving direction 505. The first edge (i.e., the lower edge) of the historical matching area 503 corresponding to the target historical subject 502 may be used as the first edge (i.e., the lower edge) of the segmentation frame after the next step of moving. In other words, the first edge of the newly segmented region 504 may be the first edge of the historical matching area 503.
FIG. 6 is an exemplary flowchart illustrating a process of determining an identification result of an image according to some embodiments of the present disclosure. In some embodiments, process 600 may be implemented in the system 100 illustrated in FIG. 1. For example, the process 600 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) . The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 600 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 600 as illustrated in FIG. 6 and described below is not intended to be limiting. In some embodiments, operation 240 in FIG. 2 may be performed according to process 600 illustrated in FIG. 6.
In 610, the processing device 110 (e.g., the determining module) may determine, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject in the overlapping area of the multiple regions.
The IOU of a subject may refer to a ratio of an area of intersection between multiple boxes to an area of union of the multiple boxes, and each of the multiple boxes includes at least a portion of the subject. In some embodiments, the number of the multiple boxes may be two. In some embodiments, the multiple boxes including the at least a portion of the subject may be target frames of the subject in the multiple regions (e.g., two regions) .
For example, as shown in FIG. 7, the entire content of the subject 705 may be enclosed in the region 702, and a part of the content of the subject 705 may be enclosed in the region 701. A target frame 703 may refer to the target frame of the subject 705 in the region 701, and a target frame 704 may refer to the target frame of the subject 705 in the region 702. The IOU of the subject may be the ratio of the area of intersection between the target frame 703 and the target frame 704 to the area of union of the target frame 703 and the target frame 704.
As stated above, the identifying module may determine the position of the subject in the multiple regions and indicate the position of the subject by the target frame. In some embodiments, after determining the position of the subject in the multiple regions, the determining module may establish a coordinate system to determine the coordinates of the target frame where the subject in the multiple regions is located. If there are multiple target frames where the subject is located, the area of intersection and the area of union of the multiple target frames may be calculated to further determine the IOU of each subject.
In some embodiments, the determining module may designate a subject with an IOU larger than a threshold (e.g., 0, 0.4, etc. ) as the subject in the overlapping area of the multiple regions.
In 620, the processing device 110 (e.g., the determining module) may determine the identification result of the image according to an identification confidence level of the subject in the overlapping area of the multiple regions.
If the whole or a part of a subject is enclosed in more than one region, the identification model may output the corresponding position of the subject and the identification confidence level when each of the more than one region is being identified. The same subject may have different identification confidence levels in different regions. For example, the identification confidence level may indicate whether the subject is completely enclosed in a region. The identification confidence level of a subject corresponding to a region may be related to an area of at least a portion of the subject that is located in the region. If the subject is completely enclosed in a region, the identification confidence level may be high in the region.
In some embodiments, after obtaining at least one subject in the overlapping area, the determining module may process the subject in the overlapping area based on the multiple identification confidence levels of the subject output by the identification model, such that the same subject is only identified as one subject, not multiple subjects. Specifically, for each subject in the overlapping area, the determining module may select the target frame corresponding to the highest confidence level as the position of the subject in the image based on the multiple confidence levels of the subject. When the multiple identification confidence levels are the same, a target frame corresponding to any one of the identification confidence levels may be selected as the position of the subject in the image.
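As an illustrative sketch only, the selection of the highest-confidence target frame for one subject in the overlapping area may be written as below, assuming each candidate is a (target_frame, confidence) pair; the function name and data layout are hypothetical.

```python
def best_target_frame(candidates):
    """Select the position of a subject detected in several regions.

    `candidates` is assumed to be a non-empty list of
    (target_frame, confidence) pairs for the same subject; when several
    confidences are equal, the first candidate among them is kept, which
    corresponds to selecting any one of the equal confidence levels.
    """
    return max(candidates, key=lambda c: c[1])[0]
```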
In some embodiments, the identification result of the subject in the overlapping area may be determined, and the identification results of subjects in non-overlapping areas may be combined to obtain the identification result of the image. The non-overlapping areas may refer to areas in the multiple regions that are outside of the overlapping area.
Through the operations in process 600, the problem that one subject enclosed in more than one region is considered as two subjects, which ultimately causes an error in determining the number of subjects, may be avoided. In a realistic scenario, if the number of vehicles in the image is incorrectly determined, the number of vacant parking spaces may be incorrectly calculated.
FIG. 8 is an exemplary flowchart illustrating a process of correcting an identification frame according to some embodiments of the present disclosure.
In some embodiments, process 800 may be implemented in the system 100 illustrated in FIG. 1. For example, the process 800 may be stored in a storage device and/or the storage (e.g., the memory 1920) as a form of instructions, and invoked and/or executed by the processing device 110 (e.g., the processor 1910 of the computing device 1900 as illustrated in FIG. 19) . The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 800 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 800 as illustrated in FIG. 8 and described below is not intended to be limiting. In some embodiments, operation 230 in FIG. 2 may be performed according to process 800 illustrated in FIG. 8.
In 810, for one of the multiple regions, the processing device 110 (e.g., the identifying module) may obtain an identification frame in the region based on an identification algorithm, and the identification frame has a corresponding relationship with a matching area which matches the subject in the region.
The identification algorithm may refer to an algorithm configured to determine the identification frame of a subject in an area (e.g., a region of the multiple regions, a reference area, etc. ) or in an image.
In some embodiments, there is a corresponding relationship between the identification frame and the subject in the region, e.g., one-to-one, one-to-many, many-to-one, or the like.
In some embodiments, the identification algorithm may be included in the identification model. In some embodiments, the identification model may output the identification frames in a region. For example, the identification algorithm may be the non-maximum suppression algorithm (NMS) , or the like.
For example, as shown in FIG. 9, the determining module may obtain the identification frames 901, 902, and 903 based on the identification algorithm.
In 820, the processing device 110 (e.g., the identifying module) may correct the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line.
As stated above, the matching area matches the subject, and since there is a corresponding relationship between the identification frame and the subject, the identification frame may have a corresponding matching area.
The actual horizontal line may refer to a horizontal line parallel to the edge of the matching area in a real scene. For example, the double solid lines 906 shown in FIG. 9 may represent an actual horizontal line.
In some embodiments, the actual horizontal line may be determined in a variety of ways. For example, the actual horizontal line may be determined according to user input. Taking an image from a camera of a parking lot as an example, the identifying module may obtain a distribution map of the parking spaces in the image, take the parking spaces as the matching areas, and obtain the parking space lines of the parking spaces as the actual horizontal lines.
The standard horizontal line may be parallel to the edge of the identification frame. For example, the standard horizontal line may be parallel to the first edge and the second edge of the identification frame. As another example, the horizontal axis of the coordinate system may be regarded as the standard horizontal line. As an example, the standard horizontal line in FIG. 9 may be lines parallel to the lower edge and the upper edge of the identification frame 901.
In some embodiments, the identifying module may rotate the identification frame based on the angle between the actual horizontal line and the standard horizontal line, such that the identification frame may have an edge parallel to the actual horizontal line, and the corrected identification frame may thereby be obtained. For example, as shown in FIG. 9, an angle a may be formed between the actual horizontal line and the standard horizontal line, and the identification frame 901 may be rotated counterclockwise by the angle a to obtain the corrected identification frame 904, wherein the first edge of the corrected identification frame 904 is parallel to the actual horizontal line.
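A minimal sketch of the rotation described above, assuming the identification frame is given by its four corner vertices and the angle is in degrees; the helper name rotate_frame is hypothetical, and clipping to the image boundary is ignored.

```python
import math

def rotate_frame(corners, angle_deg):
    """Rotate the corners of an identification frame about its center.

    `corners` is assumed to be a list of four (x, y) vertices. A positive
    angle rotates counterclockwise in a y-up coordinate system; in screen
    coordinates with y pointing down, the visual direction is reversed.
    """
    cx = sum(x for x, _ in corners) / 4.0
    cy = sum(y for _, y in corners) / 4.0
    a = math.radians(angle_deg)
    rotated = []
    for x, y in corners:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * math.cos(a) - dy * math.sin(a),
                        cy + dx * math.sin(a) + dy * math.cos(a)))
    return rotated
```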
In 830, the processing device 110 (e.g., the identifying module) may identify the subject in the region according to the corrected identification frame.
In some embodiments, the determining module may input the corrected identification frames of the multiple regions into the identification model, and output target positions of the multiple regions. In some embodiments, the determining module may rotate the corrected identification frame of a region and the corresponding subject in the corrected identification frame of the region, such that the corrected identification frame may have an edge parallel to the standard horizontal line. Further, the determining module may input the rotated corrected identification frame into the identification model, and output the at least one subject in the multiple regions. For example, as shown in FIG. 9, the corrected identification frame 904 and the corresponding area may be rotated such that the corrected  identification frame 904 may include an edge parallel to the standard horizontal line, and the rotated corrected identification frame 904 and the corresponding area may be input into the identification model for identification.
In some embodiments, the image may include the image of a vehicle to be detected described below. In some embodiments, the reference area may include the first pre-detection area and the second pre-detection area described below. The reference matching area may include the first vehicle frame and the second vehicle frame described below. For example, the first reference area may include the first pre-detection area described below, and the first vehicle frame described below may be obtained from the first reference area. The second reference area may include the second pre-detection area described below, and the second vehicle frame described below may be obtained from the second reference area. In some embodiments, the multiple regions may include the first detection partition, the second detection partition, the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition described below. For example, the set of regions corresponding to one side of the first reference area may include the first upper detection partition and the first lower detection partition described below, and the set of regions corresponding to one side of the second reference area may include the second upper detection partition and the second lower detection partition described below. In some embodiments, the identification confidence level may be referred to as a confidence level below.
Some embodiments provide a smart solution for the use of old light poles. The construction of smart cities puts forward more ideas about the use of old light poles. At present, urban street lighting is still at a primitive stage, which greatly reduces the function and efficiency of the related devices. By maximizing the utilization rate of street light poles, especially in intelligent monitoring applications, the construction costs may be reduced and the level of urban intelligence may be improved.
Large outdoor parking lots may generally use high-pole lighting. The height of the light poles may generally be 10 to 15 meters, and the distance between the light poles may be relatively large. In this regard, when the light poles are utilized, if only monocular cameras are used for intelligent monitoring, due to problems such as angle and focal length, there may be blind areas between the light poles, resulting in target vehicles being missed. In view of the situation mentioned above, the application proposes to use panoramic cameras for monitoring, which may ensure that target vehicles directly under the light poles and in the distance may be covered.
According to the principles mentioned above, the present disclosure proposes a vehicle detection method based on a panoramic camera. Please refer to FIG. 10 for details. FIG. 10 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
The vehicle detection method of the present disclosure may be applied to a vehicle detection device, wherein the vehicle detection device of the present disclosure may be a server, a terminal device, or a system in which the server and the terminal device cooperate with each other. Correspondingly, the various parts of the vehicle detection device, such as each unit, sub-unit, module, and sub-module, may all be arranged in the server, in the terminal device, or distributed between the server and the terminal device, respectively.
The server mentioned above may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server is software, the server may be implemented as a plurality of software or software modules, for example, a software or software module configured to provide a distributed server. The server may also be implemented as a single software or software module, which is not specifically limited herein. In some possible implementations, the vehicle detection method in the embodiments of the present disclosure may be implemented by a processor invoking a computer-readable instruction stored in a memory. Specifically, the vehicle detection device in the embodiments of the present disclosure may be a panoramic camera, and the panoramic camera may be arranged on a light pole of an outdoor closed parking lot for detecting the conditions of vehicles in the parking lot.
As shown in FIG. 10, the vehicle detection method in the embodiments of the present disclosure may specifically include the following steps:
S1001: an image of a vehicle to be detected may be obtained, and a first pre-detection area and a second pre-detection area at the center of the image of the vehicle to be detected may be set according to a resolution ratio of the image of the vehicle to be detected.
The vehicle detection device may obtain a panoramic image taken by a panoramic camera arranged on a light pole, that is, an image of a vehicle to be detected. Please refer to FIG. 11 for details. FIG. 11 is a diagram of an image of a vehicle to be detected according to some embodiments of the present disclosure. The panoramic camera in the embodiment of the present disclosure may be formed by stitching images from a four-eye fisheye camera, and the resolution of the output panoramic image may be 5520*2700 or 3840*2160. Since the panoramic camera may capture a large 360-degree scene, the count of target vehicles covered by the whole image may be large, resulting in each target vehicle occupying too few pixels. If vehicle detection is performed on the whole image directly, the whole image may be compressed to the input size of a neural network, and each target may disappear after several times of scaling, resulting in the vehicle detection network being unable to achieve accurate target vehicle detection.
Therefore, the embodiment of the present disclosure proposes an image segmentation method according to deep learning. By segmenting the image of the vehicle to be detected, algorithmic processing may be performed on the segmented sub-images, and target vehicles in each segmented sub-image may be identified to ensure that the target vehicles in the segmented sub-images may be identified with high accuracy. For the segmentation of the image of the vehicle to be detected, the image of the vehicle to be detected may be specifically divided into a first pre-detection area and a second pre-detection area, as well as four detection partitions. Please refer to FIG. 12 for details. FIG. 12 is a diagram of pre-detection areas and detection partitions according to some embodiments of the present disclosure.
Specifically, firstly, the vehicle detection device may obtain the image resolution of the image of the vehicle to be detected, and calculate the resolution ratio, that is, an aspect ratio f of the image of the vehicle to be detected. The vehicle detection device may use a pre-detection frame with a resolution of 1920*1080, and adjust the position of the pre-detection frame in the image of the vehicle to be detected according to the aspect ratio f. Specifically, the vehicle detection device may overlap the bottom of the pre-detection frame with the bottom of the image of the vehicle to be detected, take the central position (i.e., the position of the center point) of the image of the vehicle to be detected as the boundary, and adjust the position of the pre-detection frame, so that a ratio of the area of the pre-detection frame on the left of the center position to the area on the right of the center position may be the same as the value of the aspect ratio f. At this time, the area included in the pre-detection frame may be the first pre-detection area, such as the detection area A shown in FIG. 12.
In the same way, the vehicle detection device may overlap the bottom of the pre-detection frame with the bottom of the image of the vehicle to be detected, take the central position (i.e., the position of the center point) of the image of the vehicle to be detected as the boundary, and adjust the position of the pre-detection frame, so that a ratio of the area of the pre-detection frame on the right of the center position to the area on the left of the center position may be the same as the value of the aspect ratio f. At this time, the area included in the pre-detection frame may be the second pre-detection area.
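The placement of the pre-detection frame may be illustrated by the sketch below, under the assumption that the ratio of the areas on the two sides of the central position reduces to a ratio of widths because the frame height is constant; the function name and the (x1, y1, x2, y2) return layout are illustrative assumptions.

```python
def first_pre_detection_area(img_w, img_h, frame_w=1920, frame_h=1080):
    """Position the 1920*1080 pre-detection frame for the first pre-detection area.

    The bottom of the frame coincides with the bottom of the image, and the
    widths on the left and right of the central vertical line have the same
    ratio as the aspect ratio f of the image.
    """
    f = img_w / img_h                 # aspect ratio of the image
    cx = img_w / 2.0                  # central position of the image
    left_w = frame_w * f / (1.0 + f)  # width of the frame left of the center line
    x1 = cx - left_w
    y2 = img_h                        # bottom aligned with the image bottom
    return (x1, y2 - frame_h, x1 + frame_w, y2)
```

The second pre-detection area may then be obtained symmetrically by swapping the roles of the left and right widths.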
S1002: vehicle detection may be performed on the first pre-detection area and the second pre-detection area, respectively, and a first vehicle frame in the first pre-detection area and a second vehicle frame in the second pre-detection area may be obtained.
The vehicle detection device may use a pre-trained vehicle detection network to perform vehicle detection in the first pre-detection area and the second pre-detection area, respectively. The vehicle detection network in the embodiments of the present disclosure may use a conventional vehicle detection algorithm and a yolo deep learning training architecture, wherein the network model may use a pyramid network structure. For different scenarios with large and small target vehicles, the deep network in the pyramid network structure may be configured to identify the large target vehicles. After the deep network features are merged with the shallow network features, the small target vehicles may be identified in the shallow network. Through the network structure, it may be ensured that the deep network focuses on optimizing the recognition of large targets, and the shallow network focuses on optimizing the recognition of small targets. The training of the vehicle detection network in the embodiments of the present disclosure may use 50,000 images of outdoor parking lot scenes, covering camera heights of 6 meters, 12 meters, and 20 meters with a material ratio of 3:3:4, and the training may iterate about 200,000 times until the training is complete.
The vehicle detection device may obtain the first vehicle frame and the confidence level in the first pre-detection area, and the second vehicle frame and the confidence level in the second pre-detection area output by the vehicle detection network.
S1003: a first detection partition and a second detection partition may be arranged according to the first vehicle frame and/or the second vehicle frame.
The vehicle detection device may take the central position of the image of the vehicle to be detected, that is, a vertical dividing line on which the center point of the image is located, and combine the conditions of the first vehicle frame and/or the second vehicle frame to divide the left and right detection partitions in the image of the vehicle to be detected, that is, the first detection partition and the second detection partition.
Specifically, since both the first pre-detection area and the second pre-detection area include the central position, both the detected first vehicle frame and the second vehicle frame may include a vehicle frame located at the central position. Therefore, the vehicle detection device may detect whether there is a vehicle frame at the central position of the image of the vehicle to be detected, and if the vehicle frame does not exist, the first detection partition and the second detection partition may be directly divided according to the central position; if the vehicle frame exists, a dividing line may be arranged offset to the left or right according to the position and size of the vehicle frame at the center position, and the dividing line may not pass through any vehicle frame. Finally, the vehicle detection device may divide the first detection partition and the second detection partition according to the dividing line. Compared with a division made directly according to the central position, the division method of the embodiments of the present disclosure may ensure that the two detection partitions on the left and right do not segment the target vehicle in the middle of the image of the vehicle to be detected. For the location of the first detection partition, please refer to the detection area B in FIG. 12 for details.
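A sketch of the dividing-line adjustment described above, assuming the vehicle frames detected at the central position are given as (x1, y1, x2, y2) boxes; a single adjustment pass is shown, and in practice the check may need to be repeated if the shifted line still passes through another frame.

```python
def dividing_line(img_w, central_frames):
    """Choose a vertical dividing line that does not pass through a vehicle frame.

    `central_frames` is assumed to be the list of vehicle frames that the
    central vertical line of the image passes through; the line is shifted
    to the nearer side of each such frame.
    """
    line_x = img_w / 2.0
    for x1, _, x2, _ in central_frames:
        if x1 < line_x < x2:
            # Offset the line left or right, whichever shift is smaller.
            line_x = x1 if (line_x - x1) <= (x2 - line_x) else x2
    return line_x
```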
S1004: the first detection partition may be divided into a first upper detection partition and a first lower detection partition according to the first vehicle frame, and the second detection partition may be divided into a second upper detection partition and a second lower detection partition according to the second vehicle frame.
The vehicle detection device may calibrate the position of the first vehicle frame in the first pre-detection area, take the row of vehicle targets at the top of the first pre-detection area among the first vehicle frames, and select the target vehicles with the highest confidence level and/or the leftmost target vehicles, that is, the target vehicles S shown in FIG. 12.
The vehicle detection device may set a lower edge of the first upper detection partition according to the horizontal coordinate of the lower edge of the vehicle frame in which the target vehicles are located, that is, the horizontal coordinate of the lower edge of the first upper detection partition may be the same as the horizontal coordinate of the lower edge of the vehicle frame in which the target vehicles are located. In addition, the left edge of the first upper detection partition may coincide with the left edge of the image of the vehicle to be detected, and the right edge of the first upper detection partition may coincide with the right edge of the first detection partition. Considering that the panoramic camera may have more occlusions by the target vehicles in a long view, in order to improve the efficiency of the vehicle detection, only the lower 3/4 area of the image of the vehicle to be detected may be taken as an effective recognition area. Therefore, the upper edge of the first upper detection partition may be at the horizontal coordinate of the lower edge plus 1080, or at a horizontal line located at 3/4 of the height of the image of the vehicle to be detected.
The upper edge of the first lower detection partition may coincide with the upper edge of the first detection partition, the left edge of the first lower detection partition may coincide with the left edge of the image of the vehicle to be detected, the right edge of the first lower detection partition may coincide with the right edge of the first detection partition, and the lower edge of the first lower detection partition may coincide with the lower edge of the image of the vehicle to be detected, that is, the detection area C shown in FIG. 12.
In view of the imaging characteristics of the panoramic camera, images may be distorted. A horizontal area in reality may appear as an oblique area in the image; especially for images on the two sides in a close shot, the tilt and distortion may be more serious. As shown in FIG. 12, some areas of the first upper detection partition and the first lower detection partition may overlap with each other. Through the overlapping arrangement of the areas in the embodiment of the present disclosure, the vehicle detection of large target vehicles relatively close to the panoramic camera and small target vehicles far away may be ensured, and may not be affected by the segmentation of the detection partition.
In the same way, the vehicle detection device may divide the second detection partition into the second upper detection partition and the second lower detection partition according to the second vehicle frame through the process mentioned above, which may not be repeated herein.
S1005: the vehicle detection may be performed on the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, and a target vehicle frame may be marked on the image of the vehicle to be detected according to a vehicle detection result.
After the adaptive segmentation process mentioned above, the vehicle detection device may perform the vehicle detection in the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, which may not only ensure that the small target vehicles in each detection partition may be identified with high accuracy, but also minimize the impact of segmentation of the detection area on the segmentation of the target vehicles.
After the vehicle detection is performed on each detection partition, the vehicle detection device may mark the target vehicle frame on the image of the vehicle to be detected according to the vehicle detection result, that is, a diagram of a segmented area recognition result shown in FIG. 13.
In order to further improve the accuracy of the vehicle detection method, the embodiment of the present disclosure also proposes a strategy for filtering and detecting the same target in the overlapping area between the detection areas. The following may take the vehicle detection result of the first upper detection partition and the first lower detection partition as an example for description. Please refer to FIG. 14 for details. FIG. 14 is an exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
As shown in FIG. 14, S1005 may include the following sub-steps:
S1401: the vehicle detection may be performed on the first upper detection partition to obtain third vehicle frames.
S1402: the vehicle detection may be performed on the first lower detection partition to obtain fourth vehicle frames.
S1403: a vehicle frame set may be obtained, including vehicle frames from the third vehicle frames and vehicle frames from the fourth vehicle frames that overlap with each other, wherein the vehicle frame set may include the vehicle frames whose two frame areas overlap with each other.
When filtering the target vehicles in the overlapping area, the vehicle detection device may need to traverse all the third vehicle frames in the first upper detection partition and all the fourth vehicle frames in the first lower detection partition, and complete coordinate mapping in the image of the vehicle to be detected according to coordinate relative points.
In the overlapping area, the vehicle detection device may extract a vehicle frame group in which a vehicle frame in the third vehicle frames and a vehicle frame in the fourth vehicle frames overlap with each other, wherein the overlap may be defined as an intersection ratio of the two vehicle frames being greater than 0. The intersection ratio may be obtained by dividing the intersection area of the two vehicle frames by the union area of the two vehicle frames. Finally, the vehicle detection device may group a plurality of vehicle frame groups into a vehicle frame set.
S1404: for each pair of vehicle frames in the vehicle frame set whose frame areas overlap with each other, one of the two vehicle frames may be deleted according to a preset rule.
The vehicle detection device may filter the vehicle frame groups in the vehicle frame set according to the preset rule. Specifically, the filtering conditions in the embodiments of the present disclosure may include, but are not limited to, the following conditions, which are combined in the sketch after the conditions:
Filtering condition 1: when the intersection ratio of the two vehicle frames in the vehicle frame group is greater than 0.4, deleting the vehicle frame with the lower confidence level.
Filtering condition 2: when a center point of a vehicle frame in the vehicle frame group enters a vehicle frame of another detection partition, deleting the vehicle frame with the lower confidence level.
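A sketch combining the two filtering conditions above, assuming each vehicle frame is represented as a (box, confidence) tuple and reusing the iou helper sketched earlier; the function name and the convention of returning the frame to delete are hypothetical.

```python
def frame_to_delete(frame_a, frame_b):
    """Apply filtering conditions 1 and 2 to an overlapping vehicle frame group.

    `frame_a` and `frame_b` are assumed to come from different detection
    partitions; the function returns the frame that should be deleted, or
    None if both frames are kept.
    """
    box_a, conf_a = frame_a
    box_b, conf_b = frame_b

    def center(box):
        return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

    def contains(box, point):
        return box[0] <= point[0] <= box[2] and box[1] <= point[1] <= box[3]

    # Condition 1: the intersection ratio of the two frames is greater than 0.4.
    high_overlap = iou(box_a, box_b) > 0.4
    # Condition 2: the center point of one frame enters the other partition's frame.
    center_enters = contains(box_a, center(box_b)) or contains(box_b, center(box_a))

    if high_overlap or center_enters:
        return frame_a if conf_a < conf_b else frame_b
    return None
```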
In addition, in order to solve the problem of image distortion of the panoramic camera, the embodiment of the present disclosure also proposes a non-maximum suppression algorithm based on an offset angle to solve the problems of misidentification and missed detection of distorted target vehicles. Please refer to FIG. 15 for details. FIG. 15 is another exemplary flowchart illustrating a process of vehicle detection according to some embodiments of the present disclosure.
S1501: the first upper detection partition and a preset parking detection area therein may be obtained.
Before the vehicle detection, a staff member may mark a parking detection area on the screen according to the actual situation, so as to combine the scene of the parking lot and use relevant information, such as the areas, parking spaces, or the like, to calculate the distortion and tilt.
The parking detection area may specifically be a parking frame or a parking area including a plurality of parking frames. For example, for a target vehicle parking area, with each parking area as a statistical unit, a plurality of statistical areas may be drawn according to single-row or multi-row parking areas on the ground. By configuring the maximum number of parking spaces in each parking area and combining the results of an algorithm that identifies targets, the state of each parking area may be determined, that is, whether there is a vacant parking space or not. The staff may also draw independent parking space frames, with each parking space as a statistical unit, and combine the results of the algorithm to determine the occupancy state of each parking space, that is, whether the parking space is vacant or occupied.
S1502: the vehicle detection may be performed on the first upper detection partition to obtain a fifth vehicle frame.
Images at the far end of the panoramic camera may generally have distortions and tilts. As shown in FIG. 16, vehicles and parking spaces in the image may be distorted and tilted on the screen, which is not conducive to the filtering and recognition of a vehicle frame.
S1503: in the case that the center point of the fifth vehicle frame is located in the parking detection area, edge coordinates of the parking detection area and edge coordinates of the fifth vehicle frame may be obtained.
The vehicle detection device may compare the position of the center point of the fifth vehicle frame with the position of the parking detection area. If the parking frame includes the coordinates of the center point, the coordinates of the lower edge of the parking frame and the line segment composed of the lower edge of the parking frame may be obtained directly. If the parking area includes the coordinates of the center point, the coordinates of the lower edge of the parking area may be obtained, wherein the ordinate of a cut-off point of the lower edge of the parking area may be the same as the ordinate of the center point of the fifth vehicle frame. The length of the line segment may be half of the length of the lower edge of the fifth vehicle frame, so as to obtain the coordinates of the two ends of a truncated line segment at the lower edge of the parking area.
S1504: an inclination angle of a parking space may be calculated according to edge coordinates of the parking detection area and edge coordinates of the fifth vehicle frame, and an inclined fifth vehicle frame may be obtained by adjusting the fifth vehicle frame according to the inclination angle of the parking space.
Further, the vehicle detection device may obtain the coordinates of the lower edge of the fifth vehicle frame and the line segments, calculate the inclination angle of the parking space through two line segments, and recalculate the position of the fifth vehicle frame according to the inclination angle of the parking space. Specifically, the vehicle detection device may obtain the coordinates, width, and height of the center point of the fifth vehicle frame through the vehicle detection algorithm, so as to calculate the coordinates of a rectangular frame of a target vehicle and the coordinates of each vertex. Combining the calculated inclination angle of the parking space, each side of the rectangular frame may be rotated counterclockwise by the inclination angle of the parking space with the center point as the center to calculate a new vehicle frame, that is, a tilted fifth vehicle frame.
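The angle computation in S1503 and S1504 may be sketched as follows, assuming the lower edge of the parking detection area and the lower edge of the fifth vehicle frame are each given as a pair of endpoints; the returned value is the inclination angle in degrees, which may then be applied to the corners of the vehicle frame, for example with a corner rotation like the rotate_frame sketch given earlier.

```python
import math

def inclination_angle(parking_edge, frame_edge):
    """Angle between the lower edge of the parking area and that of the vehicle frame.

    Each edge is assumed to be ((x1, y1), (x2, y2)); the result is the
    signed angle, in degrees, by which the vehicle frame would be rotated
    to align with the parking space.
    """
    def direction(segment):
        (x1, y1), (x2, y2) = segment
        return math.atan2(y2 - y1, x2 - x1)

    return math.degrees(direction(parking_edge) - direction(frame_edge))
```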
S1505: duplicate vehicle frames may be filtered by the tilted fifth vehicle frame.
The vehicle detection device may use NMS (non-maximum suppression) technology to filter the duplicate vehicle frames, and a filter threshold may be set to be between 0.20 and 0.25. According to the position and coordinates of the new vehicle frame, the filtering of a plurality of detection frames for one target may be completed.
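A sketch of the duplicate-frame filtering with the threshold mentioned above, assuming each frame is a (box, confidence) tuple and reusing the iou helper sketched earlier; this is a generic greedy NMS and not necessarily the exact variant implemented by the vehicle detection device.

```python
def nms(frames, threshold=0.2):
    """Non-maximum suppression over the adjusted vehicle frames.

    `frames` is assumed to be a list of (box, confidence) tuples; a frame
    whose IOU with an already selected frame exceeds the threshold
    (between 0.20 and 0.25 above) is treated as a duplicate and dropped.
    """
    selected = []
    for box, conf in sorted(frames, key=lambda f: f[1], reverse=True):
        if all(iou(box, kept_box) <= threshold for kept_box, _ in selected):
            selected.append((box, conf))
    return selected
```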
Please refer to FIG. 17 for details. FIG. 17 is a diagram of a distortion correction for a vehicle frame according to some embodiments of the present disclosure. FIG. 17 shows a result of calculating the distortion angle and correcting the vehicle frame through the vehicle frame configuration mentioned above. An area frame D may be a preset parking area. The distortion angle a may be calculated according to the data mentioned above, and then each vehicle frame in the parking area may be adjusted. Taking the adjusted frames E and F of the two vehicle frames in the middle of the parking area as an example, the NMS value has dropped from the original 0.7 to 0.15.
In the embodiment of the present disclosure, the vehicle detection device may obtain the image of the vehicle to be detected, and use the resolution ratio of the image of the vehicle to be detected to arrange the first pre-detection area and the second pre-detection area at the center position of the image of the vehicle to be detected. The vehicle detection may be performed in the first pre-detection area and the second pre-detection area, respectively, to obtain the first vehicle frame in the first pre-detection area and the second vehicle frame in the second pre-detection area. The first detection partition and the second detection partition may be arranged according to the first vehicle frame and/or the second vehicle frame. The first detection partition may be divided into the first upper detection partition and the first lower detection partition according to the first vehicle frame, and the second detection partition may be divided into the second upper detection partition and the second lower detection partition according to the second vehicle frame. The vehicle detection may be performed on the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, and the target vehicle frame may be marked on the image of the vehicle to be detected according to the vehicle detection result. In the manner mentioned above, the present disclosure may ensure that small targets in the detection partitions may be accurately identified by setting the detection partitions, while reducing the influence of dividing areas on the segmentation of a target, and improving the accuracy of the vehicle detection method.
For those skilled in the art, it may be understood that in the method of the specific implementation mentioned above, the order in which the steps are written does not indicate a strict execution order and does not constitute any limitation on the implementation process. The specific execution order of each step should be determined according to its function and possible internal logic.
In order to implement the vehicle detection method of the embodiments mentioned above, the application proposes a vehicle detection device. Please refer to FIG. 18 for details. FIG. 18 is a diagram illustrating a structure of a vehicle detection device according to some embodiments of the present disclosure.
As shown in FIG. 18, the vehicle detection device 1800 may include a pre-detection module 1810, a vehicle frame module 1820, a detection partition module 1830, and a vehicle detection module 1840.
The pre-detection module 1810 may be configured to obtain an image of a vehicle to be detected, and use a resolution ratio of the image of the vehicle to be detected to set a first pre-detection area and a second pre-detection area at a central position of the image of the vehicle to be detected.
The vehicle frame module 1820 may be configured to perform vehicle detection on the first pre-detection area and the second pre-detection area, and obtain a first vehicle frame in the first pre-detection area and a second vehicle frame in the second pre-detection area.
The detection partition module 1830 may be configured to arrange a first detection partition and a second detection partition according to the first vehicle frame and/or the second vehicle frame, divide the first detection partition into a first upper detection partition and a first lower detection partition according to the first vehicle frame, and divide the second detection partition into a second upper detection partition and a second lower detection partition according to the second vehicle frame.
The vehicle detection module 1840 may be configured to perform the vehicle detection on the first upper detection partition, the first lower detection partition, the second upper detection partition, and the second lower detection partition, respectively, and mark a target vehicle frame in the image of the vehicle to be detected according to a vehicle detection result.
In order to implement the vehicle detection method of the embodiments mentioned above, the application may also propose another vehicle detection device. Please refer to FIG. 19 for details. FIG. 19 is a diagram illustrating a structure of a vehicle detection device according to another embodiment of the present disclosure.
The vehicle detection device 1900 of the embodiments of the present disclosure may include a processor 1910, a memory 1920, an input/output device 1930, and a bus 1940. In some embodiments, the processing device 110 in FIG. 1 may be the vehicle detection device 1900.
The processor 1910, the memory 1920, and the input/output device 1930 may be connected to the bus 1940, respectively. The memory 1920 may store program data, and the processor 1910 may be configured to execute the program data to implement the vehicle detection method described in the embodiments mentioned above.
In the embodiment, the processor 1910 may also be referred to as a CPU (Central Processing Unit) . The processor 1910 may be an integrated circuit chip with signal processing capabilities. The processor 1910 may also be a general-purpose processor, a DSP (Digital Signal Processor) , an ASIC (Application Specific Integrated Circuit) , an FPGA (Field Programmable Gate Array) , or other programmable logic devices, discrete gates or transistor logic devices, or discrete hardware components. The general-purpose processor may be a microprocessor, or the processor 1910 may also be any conventional processor, or the like.
The present application may also provide a computer storage medium. As shown in FIG. 20, the computer storage medium 2000 may be configured to store program data 2010. When the program data 2010 is executed by a processor, the vehicle detection method described in the embodiments mentioned above may be implemented.
The present application may also provide a computer program product, wherein the computer program product may include a computer program, and the computer program may be operable to cause a computer to execute the vehicle detection method as described in the embodiments of the present disclosure. The computer program product may be a software installation package.
When implementing the vehicle detection method described in the embodiments of the present disclosure mentioned above, the vehicle detection method may be in a form of a software functional unit, and when the vehicle detection method is sold or used as an independent product, the vehicle detection method may be stored in a device, such as a computer readable storage medium. According to the understanding mentioned above, the technical solution of the present disclosure essentially, or the part that contributes to the prior art, or all or part of the technical solution may be embodied in a form of a software product. The computer software product may be stored in a storage medium including several instructions to make a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The storage media mentioned above  may include U disk, mobile hard disk, ROM (Read-only Memory) , RAM (Random Access Memory) , magnetic disks, optical disks, and other media that may store program codes.
The basic concepts have been described above. Obviously, for persons having ordinary skills in the art, the disclosure of the invention is merely by way of example, and does not constitute a limitation on the present disclosure. Although not explicitly stated here, those skilled in the art may make various modifications, improvements, and amendments to the present disclosure. These alterations, improvements, and modifications are intended to be suggested by this disclosure and are within the spirit and scope of the exemplary embodiments of this disclosure.
At the same time, the present disclosure uses specific words to describe the embodiments of the present disclosure. For example, "one embodiment" , "an embodiment" , and/or "some embodiments" means a certain feature, structure, or characteristic related to at least one embodiment of the present disclosure. Therefore, it should be emphasized and noted that “one embodiment” , “an embodiment” , or “an alternative embodiment” mentioned twice or more in different positions in the present disclosure does not necessarily refer to the same embodiment. In addition, some features, structures, or characteristics in one or more embodiments of the present disclosure may be combined appropriately.
In addition, unless explicitly stated in the claims, the order of processing elements and sequences, the use of numbers and letters, or the use of other names described in the present disclosure are not used to limit the order of processes and methods in the present disclosure. Although the foregoing disclosure uses various examples to discuss some embodiments of the present disclosure that are currently considered useful, it should be understood that such details are only for illustrative purposes, and the appended claims are not limited to the disclosed embodiments. On the contrary, the claims are intended to cover all modifications and equivalent combinations that conform to the essence and scope of the embodiments of the present disclosure. For example, although the system components described above may be implemented by hardware devices, the system components may also be implemented only by software solutions, such as installing the described system on an existing server or mobile device.
In the same way, it should be noted that, in order to simplify the expressions disclosed in the present disclosure and thus help the understanding of one or more embodiments of the present disclosure, in the foregoing description of the embodiments of the present disclosure, multiple features are sometimes combined into one example, drawing, or description thereof. However, the method of disclosure does not mean that the subject of the present disclosure requires more features than those mentioned in the claims. In fact, the features of the embodiments are less than all the features of the single embodiment disclosed above.
In some embodiments, numbers describing the number of ingredients and attributes are used. It should be understood that such numbers used for the description of the embodiments use the  modifier "about" , "approximately" , or "substantially" in some examples. Unless otherwise stated, "about" , "approximately" , or "substantially" indicates that the number is allowed to vary by ±20%. Correspondingly, in some embodiments, the numerical parameters used in the description and claims are approximate values, and the approximate values may be changed according to the required characteristics of individual embodiments. In some embodiments, the numerical parameters should consider the prescribed effective digits and adopt the method of general digit retention. Although the numerical ranges and parameters used to confirm the breadth of the range in some embodiments of the present disclosure are approximate values, in specific embodiments, settings of such numerical values are as accurate as possible within a feasible range.
For each patent, patent application, patent application publication, or other materials cited in the present disclosure, such as articles, books, specifications, publications, documents, or the like, the entire contents of which are hereby incorporated into the present disclosure as a reference. The application history documents that are inconsistent or conflict with the content of the present disclosure are excluded, and the documents that restrict the broadest scope of the claims of the present disclosure (currently or later attached to the present disclosure) are also excluded. It should be noted that if there is any inconsistency or conflict between the description, definition, and/or use of terms in the auxiliary materials of the present disclosure and the content of the present disclosure, the description, definition, and/or use of terms in the present disclosure is subject to the present disclosure.
Finally, it should be understood that the embodiments described in the present disclosure are only used to illustrate the principles of the embodiments of the present disclosure. Other variations may also fall within the scope of the present disclosure. Therefore, as an example and not a limitation, alternative configurations of the embodiments of the present disclosure may be regarded as consistent with the teaching of the present disclosure. Accordingly, the embodiments of the present disclosure are not limited to the embodiments introduced and described in the present disclosure explicitly.

Claims (20)

  1. A method for image processing, comprising:
    obtaining an image acquired by a camera with a height from the ground satisfying a first condition;
    determining multiple regions of the image;
    identifying at least one subject in the multiple regions; and
    determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  2. The method of claim 1, wherein each of at least one matching area in the image is enclosed in a region of the multiple regions, and the at least one matching area matches the at least one subject.
  3. The method of claim 1, wherein the determining the multiple regions of the image includes:
    determining a reference area of the image, wherein a deformation degree of the reference area satisfies a second condition; and
    determining the multiple regions according to the reference area.
  4. The method of claim 3, wherein the multiple regions contain a set of regions which contains a first region and a second region, and the determining the multiple regions according to the reference area includes:
    identifying one or more reference subjects in the reference area;
    determining, from the one or more reference subjects in the reference area, a target reference subject that satisfies a third condition;
    determining an edge of the first region in the image based on an edge of a reference matching area which matches to the target reference subject;
    determining an edge of the second region in the image based on an edge of the reference area; and
    determining the set of regions according to the edge of the first region and the edge of the second region in the image.
  5. The method of claim 4, wherein the third condition is related to a position of each of the one or more reference subjects in the reference area.
  6. The method of claim 5, wherein the third condition is related to an identification confidence level of each of the one or more reference subjects in the reference area.
  7. The method of claim 1, wherein the determining the identification result of the image based on the subject among the at least one subject, the at least a portion of the subject being in the overlapping area of the multiple regions includes:
    determining, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject; and
    determining the identification result of the image according to an identification confidence level of the subject.
  8. The method of claim 1, wherein the identifying the at least one subject in the multiple regions further includes:
    for a region of the multiple regions,
    obtaining an identification frame in the region based on an identification algorithm, wherein the identification frame has a corresponding relationship with a matching area which matches the target in the region;
    correcting the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line; and
    identifying the subject in the region according to the corrected identification frame.
  9. The method of claim 1, wherein the at least one subject includes a vehicle, the ground captured by the camera includes a parking area, and the method further includes:
    determining a vacant parking space in the parking area according to the identification result of the image.
  10. A system for image processing, comprising:
    at least one storage device including a set of instructions; and
    at least one processor configured to communicate with the at least one storage device, wherein when executing the set of instructions, the at least one processor is configured to direct the system to perform operations including:
    obtaining an image acquired by a camera with a height from the ground satisfying a first condition;
    determining multiple regions of the image;
    identifying at least one subject in the multiple regions; and
    determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  11. The system of claim 10, wherein each of at least one matching area in the image is enclosed in a region of the multiple regions, and the at least one matching area matches the at least one subject.
  12. The system of claim 10, wherein the determining the multiple regions of the image includes:
    determining a reference area of the image, wherein a deformation degree of the reference area satisfies a second condition; and
    determining the multiple regions according to the reference area.
  13. The system of claim 12, wherein the multiple regions contain a set of regions which contains a first region and a second region, and the determining the multiple regions according to the reference area includes:
    identifying one or more reference subjects in the reference area;
    determining, from the one or more reference subjects in the reference area, a target reference subject that satisfies a third condition;
    determining an edge of the first region in the image based on an edge of a reference matching area which matches to the target reference subject;
    determining an edge of the second region in the image based on an edge of the reference area; and
    determining the set of regions according to the edge of the first region and the edge of the second region in the image.
  14. The system of claim 13, wherein the third condition is related to a position of each of the one or more reference subjects in the reference area.
  15. The system of claim 14, wherein the third condition is related to an identification confidence level of each of the one or more reference subjects in the reference area.
  16. The system of claim 10, wherein the determining the identification result of the image based on the subject among the at least one subject, the at least a portion of the subject being in the overlapping area of the multiple regions includes:
    determining, based on an intersection over union (IOU) of each of the at least one subject in the multiple regions, the subject; and
    determining the identification result of the image according to an identification confidence level of the subject.
  17. The system of claim 10, wherein the identifying the at least one subject in the multiple regions further includes:
    for a region of the multiple regions,
    obtaining an identification frame in the region based on an identification algorithm, wherein the identification frame has a corresponding relationship with a matching area which matches the target in the region;
    correcting the identification frame to obtain a corrected identification frame based on an angle between an actual horizontal line of the matching area corresponding to the identification frame and a standard horizontal line; and
    identifying the subject in the region according to the corrected identification frame.
  18. The system of claim 10, wherein the at least one subject includes a vehicle, the ground captured by the camera includes a parking area, and the operations further include:
    determining a vacant parking space in the parking area according to the identification result of the image.
  19. A system for image processing, comprising:
    an obtaining module configured to obtain an image acquired by a camera with a height from the ground satisfying a first condition;
    a segmenting module configured to determine multiple regions of the image;
    an identifying module configured to identify at least one subject in the multiple regions; and
    a determining module configured to determine an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
  20. A non-transitory computer readable medium, comprising executable instructions that, when executed by at least one processor, direct the at least one processor to perform a method, the method comprising:
    obtaining an image acquired by a camera with a height from the ground satisfying a first condition;
    determining multiple regions of the image;
    identifying at least one subject in the multiple regions; and
    determining an identification result of the image based on a subject among the at least one subject, at least a portion of the subject being in an overlapping area of the multiple regions.
Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191221B (en) * 2021-04-15 2022-04-19 浙江大华技术股份有限公司 Vehicle detection method and device based on panoramic camera and computer storage medium
CN113706920B (en) * 2021-08-20 2023-08-11 云往(上海)智能科技有限公司 Parking behavior judging method and intelligent parking system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7460696B2 (en) * 2004-06-01 2008-12-02 Lumidigm, Inc. Multispectral imaging biometrics
CN104376554B (en) * 2014-10-16 2017-07-18 中海网络科技股份有限公司 A kind of parking offense detection method based on image texture
IL238473A0 (en) * 2015-04-26 2015-11-30 Parkam Israel Ltd A method and system for detecting and mapping parking spaces
CN105678217A (en) * 2015-12-29 2016-06-15 安徽海兴泰瑞智能科技有限公司 Vehicle parking guidance positioning method
CN107767673B (en) * 2017-11-16 2019-09-27 智慧互通科技有限公司 A kind of Roadside Parking management method based on multiple-camera, apparatus and system
CN110717361A (en) * 2018-07-13 2020-01-21 长沙智能驾驶研究院有限公司 Vehicle parking detection method, preceding vehicle start reminding method and storage medium
US11288525B2 (en) * 2018-10-31 2022-03-29 Texas Instruments Incorporated Object detection for distorted images
CN110517288B (en) * 2019-07-23 2021-11-02 南京莱斯电子设备有限公司 Real-time target detection tracking method based on panoramic multi-path 4k video images
CN111968132A (en) * 2020-07-28 2020-11-20 哈尔滨工业大学 Panoramic vision-based relative pose calculation method for wireless charging alignment
CN112183409A (en) * 2020-09-30 2021-01-05 深圳道可视科技有限公司 Parking space detection method based on panoramic image, electronic equipment and storage medium
CN112330601B (en) * 2020-10-15 2024-03-19 浙江大华技术股份有限公司 Fish-eye camera-based parking detection method, device, equipment and medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170255830A1 (en) * 2014-08-27 2017-09-07 Alibaba Group Holding Limited Method, apparatus, and system for identifying objects in video images and displaying information of same
CN109165645A (en) * 2018-08-01 2019-01-08 腾讯科技(深圳)有限公司 A kind of image processing method, device and relevant device
CN111653103A (en) * 2020-05-07 2020-09-11 浙江大华技术股份有限公司 Target object identification method and device
CN113191221A (en) * 2021-04-15 2021-07-30 浙江大华技术股份有限公司 Vehicle detection method and device based on panoramic camera and computer storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4226274A4 *

Also Published As

Publication number Publication date
EP4226274A4 (en) 2024-03-13
CN113191221B (en) 2022-04-19
EP4226274A1 (en) 2023-08-16
KR20230118881A (en) 2023-08-14
CN113191221A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
WO2022078156A1 (en) Method and system for parking space management
CN111461245B (en) Wheeled robot semantic mapping method and system fusing point cloud and image
CN111178236B (en) Parking space detection method based on deep learning
CN110009561B (en) Method and system for mapping surveillance video target to three-dimensional geographic scene model
WO2022217834A1 (en) Method and system for image processing
CN110807496B (en) Dense target detection method
KR102052833B1 (en) Apparatus and method for vehicle speed detection using image tracking
CN113761999B (en) Target detection method and device, electronic equipment and storage medium
WO2021098079A1 (en) Method for using binocular stereo camera to construct grid map
CN103473537B (en) A kind of target image contour feature method for expressing and device
US20210350705A1 (en) Deep-learning-based driving assistance system and method thereof
CN112819895A (en) Camera calibration method and device
CN113469201A (en) Image acquisition equipment offset detection method, image matching method, system and equipment
CN115249355B (en) Object association method, device and computer-readable storage medium
JP2012511754A (en) Method and apparatus for dividing an obstacle
Wang et al. Automatic registration of point cloud and panoramic images in urban scenes based on pole matching
CN115533902A (en) Visual guidance-based unstacking method and device, electronic equipment and system
CN117911668A (en) Drug information identification method and device
CN112529006B (en) Panoramic picture detection method, device, terminal and storage medium
CN116912517B (en) Method and device for detecting camera view field boundary
CN114066930A (en) Planar target tracking method and device, terminal equipment and storage medium
CN114943954B (en) Parking space detection method, device and system
CN114219871A (en) Obstacle detection method and device based on depth image and mobile robot
CN115170679A (en) Calibration method and device for road side camera, electronic equipment and storage medium
CN113642553A (en) Whole and component target detection combined non-constrained license plate accurate positioning method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21936704

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021936704

Country of ref document: EP

Effective date: 20230508

ENP Entry into the national phase

Ref document number: 20237021895

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE