WO2023005275A1 - Traffic behavior recognition method and apparatus, electronic device, and storage medium - Google Patents

Traffic behavior recognition method and apparatus, electronic device, and storage medium Download PDF

Info

Publication number
WO2023005275A1
WO2023005275A1 (PCT/CN2022/087745)
Authority
WO
WIPO (PCT)
Prior art keywords
rider
area
vehicle
passengers
identification result
Prior art date
Application number
PCT/CN2022/087745
Other languages
French (fr)
Chinese (zh)
Inventor
范佳柔
甘伟豪
武伟
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023005275A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to computer technology, and in particular to a traffic behavior recognition method and apparatus, an electronic device, and a storage medium.
  • Traffic behavior recognition can include identifying the passenger-carrying behavior of non-motorized vehicle riders. If illegal passenger-carrying behavior is found, penalties and safety education are required.
  • In related methods, the number of passengers is mainly determined by counting the heads or bodies appearing in the image; if the count is too large, the passenger-carrying behavior is determined to be illegal.
  • The method may include: acquiring an image to be recognized that includes one or more rider areas; and, for any rider area among the one or more rider areas, performing operations that include: determining, in the image to be recognized, the associated vehicle area associated with the rider area, where the rider area includes a vehicle and at least one human body; identifying the number of passengers in the rider area to obtain a passenger count identification result, and identifying the vehicle type in the associated vehicle area to obtain a vehicle type identification result; and determining, according to the passenger count identification result and the vehicle type identification result, whether the target rider in the rider area exhibits illegal passenger-carrying behavior.
  • The present application also proposes a traffic behavior recognition apparatus, which includes: an acquisition module, configured to acquire an image to be recognized that includes one or more rider areas; a first determination module, configured to determine, for any rider area among the one or more rider areas, the associated vehicle area associated with the rider area in the image to be recognized, where the rider area includes a vehicle and at least one human body; a recognition module, configured to identify the number of passengers in the rider area to obtain a passenger count identification result, and to identify the vehicle type in the associated vehicle area to obtain a vehicle type identification result; and a second determination module, configured to determine, according to the passenger count identification result and the vehicle type identification result, whether the target rider in the rider area exhibits illegal passenger-carrying behavior.
  • The present application also proposes an electronic device, which includes: a processor; and a memory for storing processor-executable instructions, where the processor executes the executable instructions to implement the traffic behavior recognition method described in any of the foregoing embodiments.
  • The present application also proposes a computer-readable storage medium storing a computer program, where the computer program is used to cause a processor to execute the traffic behavior recognition method described in any of the foregoing embodiments.
  • The present application also proposes a computer program product, including a computer program stored in a memory, where the traffic behavior recognition method described in any of the foregoing embodiments is implemented when the computer program instructions are executed by a processor.
  • FIG. 1 is a flowchart of a traffic behavior recognition method shown in the present application.
  • FIG. 2 is a flowchart of a method for determining an associated vehicle area shown in the present application.
  • FIG. 3 is a schematic diagram of an object detection process shown in the present application.
  • FIG. 4 is a flowchart of another method for determining an associated vehicle area shown in the present application.
  • FIG. 5 is a schematic flowchart of a method for identifying passenger-carrying behavior shown in the present application.
  • FIG. 6 is a schematic diagram of a rule for judging illegal passenger-carrying behavior shown in the present application.
  • FIG. 7 is a schematic structural diagram of a traffic behavior recognition apparatus shown in the present application.
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device shown in the present application.
  • The purpose of this application is to propose a traffic behavior recognition method (hereinafter referred to as the recognition method).
  • In related methods for identifying illegal passenger-carrying on non-motor vehicles, the number of passengers is mainly determined by counting the heads or bodies appearing in the image; if the count is too large, the behavior is determined to be illegal.
  • However, riders and passengers sit close to one another, so bodies and heads are easily occluded. An accurate count of bodies or heads therefore cannot be obtained, which leads to incorrect identification of the number of passengers.
  • Moreover, different types of non-motorized vehicles have different limits on the number of passengers, and existing methods cannot identify the legality of passenger-carrying behavior for different vehicle types.
  • FIG. 1 is a flowchart of a traffic behavior recognition method shown in this application. The method may include steps S102 to S108.
  • The method uses a neural network model to identify the number of passengers in the rider area and can learn that number adaptively through the model, so that an accurate passenger count can be identified even when there is occlusion in the image to be recognized, thereby improving the accuracy of traffic behavior recognition.
  • The legality of passenger-carrying behavior can be identified based on the passenger count identification result and the vehicle type identification result, so that the vehicle type and the number of passengers are considered together during legality identification, achieving traffic behavior recognition for different vehicle types.
  • The recognition method shown in FIG. 1 can be applied to an electronic device.
  • The electronic device may execute the method by running software logic corresponding to the method.
  • The electronic device may be a notebook computer, a desktop computer, a server, a mobile phone, a tablet (PAD) terminal, or the like.
  • The type of the electronic device is not particularly limited in this application.
  • The electronic device may also be a client device or a server device, which is not specifically limited here. It can be understood that the recognition method can be executed by the client device alone, by the server device alone, or by the client device and the server device in cooperation.
  • The server may be a cloud built from a single server or a server cluster.
  • an electronic device (hereinafter referred to as device) is taken as an example for description.
  • The device may acquire the image to be recognized from an image acquisition device deployed at a road site.
  • The image acquisition device can capture images of a preset field of view of the road scene at a fixed or adjustable angle, and can send the acquired image to be recognized to the electronic device.
  • the image to be recognized may include one or more vehicles and one or more riders.
  • the vehicle may be a non-motorized vehicle.
  • the non-motorized vehicles may be motorcycles, tricycles, electric vehicles and the like.
  • the rider may refer to a person with driving behavior.
  • the device may execute S104.
  • the rider area shown in this application refers to the area enclosed by the detection frame of the target rider in the image to be recognized.
  • the target rider can be specified according to business needs.
  • the target rider may be a rider randomly selected from the riders included in the image to be recognized.
  • the target rider may be the rider with the highest definition among the riders included in the image to be recognized.
  • the target rider may include a rider who is about to leave the viewing area.
  • each rider included in the image to be recognized may be designated as a target rider respectively.
  • the rider area may include a vehicle and at least one human body.
  • the vehicle area shown in this application refers to the area enclosed by the detection frames of the vehicles in the image to be recognized.
  • The associated vehicle area associated with the rider area can be determined at least through the association prediction score or the degree of overlap between the rider area and the vehicle area.
  • The association prediction score or degree of overlap may characterize how closely the rider area and the associated vehicle area are spatially connected.
  • The degree of overlap and the association prediction score are described below respectively.
  • The associated vehicle area can be determined by the degree of overlap between the rider area and the vehicle area.
  • Fig. 2 is a flowchart of a method for determining an associated vehicle area shown in the present application.
  • S104 further includes S202 and S204.
  • In S202, object detection is performed on the image to be recognized to obtain one or more vehicle areas and the rider area.
  • In S204, among the one or more vehicle areas, a target vehicle area having the greatest overlap with the rider area is determined, and the target vehicle area is determined as the associated vehicle area associated with the rider area.
  • In this way, the spatial association between the vehicle and the rider can be used to accurately determine the associated vehicle area, which helps to accurately determine the type of vehicle driven by the target rider and thus to improve the accuracy of traffic behavior recognition.
  • An object detection model can be used to perform object detection and obtain the detection frames corresponding to the riders and vehicles in the image to be recognized; the area enclosed by the target detection frame corresponding to the target rider is determined as the rider area, and the area enclosed by the detection frame corresponding to a vehicle is determined as a vehicle area.
  • The object detection model may be a model built based on Region-based Convolutional Neural Networks (R-CNN), Fast R-CNN, or Faster R-CNN.
  • FIG. 3 is a schematic diagram of an object detection process shown in the present application. It should be noted that, FIG. 3 only schematically illustrates the object detection process, and does not specifically limit the present application.
  • the object detection model 30 shown in FIG. 3 may be a model constructed based on the FASTER-RCNN network.
  • the model may at least include a backbone network (backbone) 31, a candidate frame generation network (Region Proposal Network, RPN) 32, and a region-based convolutional neural network (Region-based Convolutional Neural Network, RCNN) 33.
  • The backbone network 31 can be used to perform several convolution operations on the image to be recognized to obtain a target feature map of the image.
  • The RPN 32 is used to process the target feature map to obtain anchor boxes (anchors) corresponding to each rider and each vehicle in the image to be recognized.
  • The RCNN 33 is used to perform bounding box (bbox) regression and classification based on the anchor boxes output by the RPN 32 and the target feature map output by the backbone network 31, to obtain the rider frame corresponding to each rider and the vehicle frame corresponding to each vehicle in the image to be recognized.
  • Position information and/or size information of each rider frame or vehicle frame may also be obtained.
  • The position and/or size information of a rider frame or vehicle frame can be represented by its four vertex coordinates.
  • The object detection model may first be trained under supervision with several training samples.
  • The position and size of the object frame corresponding to each object (including riders and vehicles) in several sample images may be annotated to obtain the training samples. The model can then be trained under supervision in a conventional manner using these samples until it converges.
  • The trained object detection model can then perform object detection on the image to be recognized to obtain the rider frame corresponding to each rider and the vehicle frame corresponding to each vehicle included in the image. If the image includes multiple riders and/or multiple vehicles, the different rider frames and/or vehicle frames may also be numbered in the detection result.
  • The target rider frame corresponding to the target rider can then be selected, the area enclosed by the target rider frame in the image to be recognized determined as the rider area, and the area enclosed by each vehicle frame determined as a vehicle area.
  • The degree of overlap between each vehicle area and the rider area may be calculated respectively.
  • The vehicle areas may be sorted in descending order of the calculated degrees of overlap, and the top-ranked vehicle area determined as the target vehicle area.
  • The target vehicle area may then be determined as the associated vehicle area associated with the rider area.
  • The degree of overlap may be the ratio of the area of the intersection of the vehicle area and the rider area to the area of their union; that is, the intersection over union (IoU) between the vehicle area and the rider area is used to characterize their overlap.
  • For example, taking the vehicle area as region 1 and the rider area as region 2, and denoting the corner coordinates of region 2 as (h_x1, h_y1) and (h_x2, h_y2), the area of region 2 is S(h) = (h_y2 - h_y1) * (h_x2 - h_x1).
  • the degree of overlap between the vehicle area and the rider area can be determined.
  • the degree of overlap between the vehicle area and the rider area can be accurately calculated, thereby accurately determining the associated vehicle area associated with the rider area, which helps to improve the accuracy of traffic behavior recognition.
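The overlap-based association described above can be sketched as follows. This is a minimal illustration (function and variable names are hypothetical, not from the patent): IoU between two boxes given as (x1, y1, x2, y2) corner coordinates, and selection of the vehicle box with the greatest overlap, as in S204.

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) corner coordinates, like the rider/vehicle frames.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # intersection area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def associated_vehicle_box(rider_box, vehicle_boxes):
    # The vehicle box with the greatest overlap with the rider box is the
    # target vehicle box (equivalent to sorting by overlap and taking the top one).
    return max(vehicle_boxes, key=lambda v: iou(rider_box, v))
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, since the two 2x2 boxes intersect in a 1x1 square and their union has area 7.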
  • The target vehicle area can also be determined by the association prediction score between the rider and the vehicle.
  • FIG. 4 is a flowchart of another method for determining an associated vehicle area shown in the present application.
  • S104 may include S402 to S406.
  • In S402, object detection is performed on the image to be recognized to obtain the one or more vehicle areas and the rider area.
  • In S404, the association score between each of the one or more vehicle areas and the rider area is determined by using a pre-trained association score prediction model.
  • In S406, among the one or more vehicle areas, the vehicle area with the highest association score with the rider area is determined as the target vehicle area, and the target vehicle area is determined as the associated vehicle area associated with the rider area.
  • The association score can accurately represent the degree of correlation between the rider area and a vehicle area, so the associated vehicle area most strongly correlated with the rider area can be determined. This helps to accurately determine the type of vehicle driven by the target rider and, in turn, to improve the accuracy of traffic behavior recognition.
  • For S402, reference may be made to the description of S202; details are not repeated here.
  • The association score prediction model may be a network constructed based on a deep learning network.
  • To train the association score prediction model, images including multiple pairs of vehicle areas and rider areas can first be obtained, and the association score between each pair annotated to obtain several training samples: if a rider area is associated with a vehicle area, the score is labeled 1; otherwise, it is labeled 0. The network can then be trained under supervision with these samples until it converges, after which it can be used to predict the association score between a vehicle area and a rider area in the image to be recognized.
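The 0/1-labeled training scheme above can be illustrated with a deliberately simplified stand-in: a logistic-regression scorer trained by gradient descent on hand-crafted pair features (e.g. the overlap between the two regions). The patent's model is a deep network; everything here, including the feature choice and names, is a hypothetical sketch.

```python
import math

def train_assoc_scorer(pair_feats, labels, lr=0.5, epochs=500):
    # pair_feats: list of feature vectors, one per vehicle/rider region pair
    # labels: 1.0 if the pair is associated, 0.0 otherwise (as annotated above)
    dim = len(pair_feats[0])
    w, b, n = [0.0] * dim, 0.0, len(labels)
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(pair_feats, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # predicted association score in [0, 1]
            for i, xi in enumerate(x):
                gw[i] += (p - y) * xi           # binary cross-entropy gradient
            gb += p - y
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def assoc_score(w, b, feat):
    # Association score for one vehicle/rider pair feature vector.
    z = sum(wi * xi for wi, xi in zip(w, feat)) + b
    return 1.0 / (1.0 + math.exp(-z))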
  • the device may continue to execute S106.
  • the object area in this step (including the rider area and the vehicle area) may be the area surrounded by the object frame corresponding to the object in the image to be recognized.
  • the object region may carry image features related to the object.
  • The rider area described in this application may include first image features related to the rider's passenger-carrying behavior.
  • The first image features may include image features corresponding to the vehicle driven by the rider and to the human bodies carried on it; the number of passengers can be judged from these features.
  • the vehicle area described in the present application may include the second image feature related to the vehicle type.
  • the second image feature may include an image feature corresponding to the vehicle, such as a feature of the number of wheels, a feature of the wheel structure, a feature of the body structure, and the like.
  • the vehicle type can be determined by means of the second image feature.
  • S106 may include S1062 and S1064.
  • In S1062, passenger count identification is performed on the rider area to obtain a passenger count identification result.
  • In S1064, vehicle type identification is performed on the associated vehicle area to obtain a vehicle type identification result.
  • the execution order of S1062 and S1064 is not limited in this application.
  • A rider area map corresponding to the rider area may first be acquired.
  • The rider frame corresponding to the target rider, together with the image to be recognized (or the target feature map obtained by performing feature extraction on it with the backbone network 31), can be input into a region feature extraction unit to obtain the rider area map of the target rider.
  • The rider area map may be a feature map, or an image of the rider area.
  • The region feature extraction unit may include a region of interest feature alignment (ROI Align) unit or a region of interest feature pooling (ROI Pooling) unit.
  • The region feature extraction unit can perform processing such as pooling and convolution on the rider area enclosed by the rider frame to obtain the rider area map.
  • the rider area map may include high-dimensional or low-dimensional image features.
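What an ROI Pooling unit does can be sketched with a fixed-grid max-pooling over the boxed region of a 2-D feature map. This is a simplification for illustration only (real ROI Align additionally uses bilinear sampling, and feature maps have channels); the function names are hypothetical.

```python
def roi_pool(feature_map, box, out_size=2):
    # feature_map: 2-D list (H rows x W cols); box: (x1, y1, x2, y2) in
    # feature-map coordinates, like a rider or vehicle frame.
    x1, y1, x2, y2 = box
    region = [row[x1:x2] for row in feature_map[y1:y2]]

    def split(n, parts):
        # Split range(n) into `parts` nearly equal contiguous bins.
        q, r = divmod(n, parts)
        bins, start = [], 0
        for i in range(parts):
            end = start + q + (1 if i < r else 0)
            bins.append(range(start, end))
            start = end
        return bins

    h_bins = split(len(region), out_size)
    w_bins = split(len(region[0]), out_size)
    # Max-pool each (row-bin, col-bin) cell into one output value, giving a
    # fixed out_size x out_size map regardless of the box size.
    return [[max(region[r][c] for r in hb for c in wb) for wb in w_bins]
            for hb in h_bins]
```

The fixed output size is the point: rider frames of different sizes all yield an area map of the same shape, which the downstream classifier requires.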
  • Passenger count identification can then be performed on the rider area map to obtain the passenger count identification result.
  • The number of passengers can be identified by using a pre-trained passenger count recognition model.
  • The passenger count recognition model may include a classifier built based on a neural network.
  • The recognition results output by the model may include a first recognition result, a second recognition result, and a third recognition result, together with the confidence corresponding to each result.
  • The first recognition result indicates that the number of passengers has reached a first preset number.
  • The second recognition result indicates that the number of passengers has reached a second preset number.
  • The third recognition result indicates that the number of passengers has reached a third preset number.
  • The first, second, and third preset numbers can be set according to business requirements. For example, the first preset number may be 3 or more, the second preset number may be 2, and the third preset number may be 1.
  • The recognition result corresponding to the highest confidence can be selected. For example, suppose the passenger count recognition model classifies the rider area map and the first, second, and third recognition results correspond to confidences of 0.7, 0.2, and 0.1, respectively. The passenger count identification result is then the first recognition result, corresponding to the highest confidence of 0.7.
  • When training the passenger count recognition model, training samples annotated with passenger count information can be obtained, and multiple rounds of supervised training iterations performed with them until the model converges. The trained model can then be used to identify the number of passengers. The adaptive learning capability of the neural network thus improves the accuracy of passenger count identification in various situations, including occlusion.
  • In some scenarios, traffic behavior recognition may be unnecessary or impossible to perform normally. Such scenarios may be referred to as invalid scenarios.
  • For example, although a scene in which a rider pushes a cart, or stands next to the vehicle, includes both a rider and a vehicle, the rider is not driving, so passenger-carrying behavior does not need to be detected in this type of scene.
  • As another example, when the rider or vehicle has low identifiability in the image, the rider or vehicle may not be recognized normally, so traffic behavior recognition may not be performed normally.
  • Therefore, a fourth recognition result indicating that the current identification is invalid may be added to the possible passenger count identification results. If the passenger count identification result for the rider area is the fourth recognition result, the scene in the rider area is an invalid scenario in which traffic behavior identification is unnecessary or impossible, and no traffic behavior identification need be performed for that rider area.
  • The passenger count recognition model may accordingly output the first, second, third, and fourth recognition results, together with the confidence corresponding to each.
  • The fourth recognition result indicates that at least one of the following appears in the image to be recognized: a scene where a rider pushes a cart, a scene where a rider stands beside the vehicle, a scene where multiple riders are close to each other, a scene with low definition, or a scene where the vehicle is occluded.
  • The recognition result corresponding to the highest confidence can be selected.
  • For example, suppose classification of the rider area map yields confidences of 0.1, 0.2, 0.1, and 0.6 for the first, second, third, and fourth recognition results, respectively. The passenger count identification result is then the invalid (fourth) recognition result, corresponding to the highest confidence of 0.6.
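The highest-confidence selection over the four candidate results can be sketched as follows; the result labels are hypothetical stand-ins for the first through fourth recognition results.

```python
def passenger_count_result(confidences):
    # confidences: mapping from candidate recognition result to confidence,
    # covering the four results described above.
    return max(confidences, key=confidences.get)

scores = {"3 or more": 0.1, "2 persons": 0.2, "1 person": 0.1, "invalid": 0.6}
# With these confidences, the "invalid" (fourth) result is selected, so no
# traffic behavior identification is performed for this rider area.
```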
  • Training the passenger count recognition network may include S11 to S13.
  • In S11, first training samples are acquired.
  • The first training samples include a plurality of sample images of riders and corresponding first annotation information on the number of passengers. The first annotation information includes one of the following labels: 1 person, 2 persons, 3 persons, or invalid, where the invalid label covers at least one of the following: a rider pushing a cart, a rider standing next to the vehicle, multiple riders close to each other, low definition, or an occluded vehicle.
  • The first initial network may be any type of neural network.
  • The first initial network may output a passenger count identification result.
  • A first loss may be determined according to the first annotation information, and the parameters of the first initial network updated through a backpropagation operation to complete one parameter iteration.
  • The number of parameter iterations can be preset, and the passenger count recognition network is obtained after the first initial network completes the preset number of iterations.
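The loss-then-update loop of S11 to S13 can be illustrated with a deliberately small stand-in: softmax regression over the four label classes, trained by the cross-entropy gradient for a preset number of iterations. The patent's first initial network is a neural network; this sketch, including all names and the toy features, is hypothetical.

```python
import math

def train_count_classifier(samples, labels, num_classes=4, lr=0.5, iters=300):
    # samples: list of feature vectors for rider areas
    # labels: class indices 0..3 ("1 person", "2 persons", "3 persons", "invalid")
    dim = len(samples[0])
    w = [[0.0] * num_classes for _ in range(dim)]      # dim x classes weights
    for _ in range(iters):                             # preset number of iterations
        for x, y in zip(samples, labels):
            z = [sum(x[d] * w[d][c] for d in range(dim)) for c in range(num_classes)]
            m = max(z)
            e = [math.exp(v - m) for v in z]
            s = sum(e)
            p = [v / s for v in e]                     # softmax confidences
            for c in range(num_classes):
                g = p[c] - (1.0 if c == y else 0.0)    # cross-entropy gradient
                for d in range(dim):
                    w[d][c] -= lr * g * x[d]           # one parameter update
    return w

def predict_count(w, x):
    dim, num_classes = len(w), len(w[0])
    z = [sum(x[d] * w[d][c] for d in range(dim)) for c in range(num_classes)]
    return z.index(max(z))
```

The structure mirrors the description: a loss is computed against the first annotation information, parameters are updated from its gradient, and the trained classifier is taken after the preset iteration count.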
  • In response to the confidence corresponding to the number of passengers reaching a first confidence threshold, that number of passengers is determined as the passenger count identification result of the rider area.
  • The first confidence threshold can be set according to business conditions. For example, if the passenger count determined by the model is 1 person with a corresponding confidence of 0.7, and the first confidence threshold is 0.7, then the result of 1 person is credible and the passenger count identification result can be output. Here, the confidence represents the degree of credibility of the count of 1 person.
  • the credibility of the output recognition result can be guaranteed, thereby ensuring the accuracy of traffic behavior recognition.
  • Various traffic behavior recognition scenarios can be flexibly adapted to by adjusting the first confidence threshold.
  • The first confidence threshold may be set to a higher value (such as 0.9), so that the reliability of the output passenger count identification results is sufficiently high, improving the accuracy of traffic behavior identification.
  • Alternatively, the first confidence threshold can be set to a lower value (for example, 0.6), increasing the number of passenger count identification results that are output and thereby improving the sensitivity of traffic behavior recognition.
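The confidence-threshold gate can be sketched as follows (names hypothetical): the highest-confidence result is output only when its confidence reaches the threshold, so raising the threshold trades sensitivity for accuracy.

```python
def gated_count_result(confidences, threshold=0.7):
    # confidences: mapping from candidate passenger count result to confidence.
    result, conf = max(confidences.items(), key=lambda kv: kv[1])
    # Output the result only when its confidence reaches the first confidence
    # threshold; otherwise report nothing for this rider area.
    return result if conf >= threshold else None
```

With `threshold=0.9` few but highly reliable results are output; with `threshold=0.6` more results are output at lower reliability.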
  • For vehicle type identification, the corresponding associated vehicle area map may first be obtained according to the associated vehicle area.
  • The vehicle frame corresponding to the associated vehicle area, together with the image to be recognized (or the target feature map obtained by performing feature extraction on it with the backbone network 31), can be input into the ROI Pooling unit to obtain the associated vehicle area map.
  • the associated vehicle area map may be a feature map, or an image of the associated vehicle area.
  • vehicle type identification may be performed on the associated vehicle area map to obtain a vehicle type identification result.
  • Vehicle type recognition can be performed by a pre-trained vehicle recognition network.
  • The vehicle recognition network may include a classifier built based on a neural network.
  • The results output by the vehicle recognition network may include the confidence (for example, the probability) with which the vehicle in the vehicle area map is identified as each preset vehicle type.
  • The vehicle type corresponding to the highest confidence may be selected and determined as the vehicle type identification result.
  • Training the vehicle recognition network may include S21 to S23.
  • In S21, second training samples are acquired.
  • The second training samples include a plurality of sample images of vehicles and corresponding second annotation information on vehicle type.
  • The second initial network may be any type of neural network.
  • The second initial network may output a vehicle type identification result.
  • A second loss may be determined according to the second annotation information, and the parameters of the second initial network updated through a backpropagation operation to complete one parameter iteration.
  • The number of parameter iterations can be preset, and the vehicle recognition network is obtained after the second initial network completes the preset number of iterations.
  • the characteristics of neural network self-adaptive learning can be utilized to improve the accuracy of vehicle type identification.
  • In response to the confidence corresponding to the vehicle type reaching a second confidence threshold, that vehicle type is determined as the vehicle type identification result of the vehicle area.
  • The second confidence threshold can be set according to business conditions.
  • Various traffic behavior recognition scenarios can be flexibly adapted to by adjusting the second confidence threshold.
  • The second confidence threshold may be set to a higher value (such as 0.9), so that the reliability of the output vehicle type recognition results is sufficiently high, improving the accuracy of traffic behavior recognition.
  • Alternatively, the second confidence threshold can be set to a lower value (for example, 0.6), increasing the number of vehicle type recognition results that are output and thereby improving the sensitivity of traffic behavior recognition.
  • the device may execute S108.
  • actual recognition results can be output for different vehicle types.
  • it may be determined that the target rider in the rider area is illegally carrying passengers in response to the identification result of the number of passengers being a first identification result; the first identification result indicates that the number of passengers reaches a first preset number.
  • the first preset number may be an empirical value. For example, in a non-motor vehicle scenario, regardless of vehicle type, the number of people including the driver must be fewer than 3; a non-motor vehicle carrying 3 or more people can be considered non-compliant. In this case, the first preset number can be set to 3, and if the number of passengers reaches 3 or more, an illegal passenger-carrying behavior can be determined.
  • it may be determined that the target rider is illegally carrying passengers in response to the identification result of the number of passengers being a second identification result and the vehicle type represented by the type identification result being a preset non-motor vehicle type;
  • the second identification result indicates that the number of people on board reaches a second preset number, and the second preset number is smaller than the first preset number.
  • when the number of passengers reaches the second preset number, the passenger-carrying behavior may be either compliant or illegal. If the vehicle driven by the target rider (the associated vehicle) is of a preset non-motor vehicle type, the behavior can be determined to be illegal; otherwise it can be determined to be compliant.
  • the second preset quantity may be an empirical value.
  • the preset non-motor vehicle type may refer to a vehicle type that is not permitted to carry as many as the second preset number of people.
  • the second preset number is 2
  • the preset non-motor vehicle type is a utility vehicle such as a tricycle.
  • such a utility tricycle can legally carry only 1 person. If the recognized number of passengers is 2, it can be determined to be illegally carrying passengers. If the vehicle is not such a utility vehicle, for example a motorcycle or an electric bicycle, a recognized passenger count of 2 can be determined to be compliant.
  • it may be determined that the target rider has not illegally carried passengers in response to the identification result of the number of passengers being a third identification result;
  • the third identification result indicates that the number of passengers is a third preset number, and the third preset number is smaller than the second preset number.
  • the third preset number may be an empirical value. For example, in a non-motor vehicle scenario, regardless of vehicle type, if the number of people including the driver is one, the behavior can be considered compliant. In this case, the third preset number may be set to 1.
  • it may be determined that the traffic behavior identification for the target rider is invalid in response to the identification result of the number of passengers being a fourth identification result; no further traffic behavior recognition is then required.
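The four cases above form a small decision table over the passenger count and the vehicle type. A sketch using the example values from the text (first preset number 3, second 2, third 1; "tricycle" as the preset non-motor vehicle type — all of these are the document's examples, not fixed requirements):

```python
RESTRICTED_TYPES = {"tricycle"}  # preset non-motor vehicle types (example)

def judge_passenger_behavior(passenger_result, vehicle_type):
    """Map a passenger-count recognition result and a vehicle type to a verdict.

    passenger_result: 1, 2, 3 (a recognized count) or "invalid" (fourth result).
    """
    if passenger_result == "invalid":
        return "invalid"            # fourth result: skip further recognition
    if passenger_result >= 3:       # first result: a violation for any vehicle
        return "illegal"
    if passenger_result == 2:       # second result: depends on the vehicle type
        return "illegal" if vehicle_type in RESTRICTED_TYPES else "compliant"
    return "compliant"              # third result: driver only, always compliant
```

For example, a count of 2 on a tricycle is judged illegal while the same count on an electric bicycle is compliant, matching the distinction drawn above.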
  • a warning message may be issued.
  • the device may be connected to an interactive terminal held by a traffic policeman.
  • when the device recognizes an illegal passenger-carrying behavior, it can package the information corresponding to the target rider, the information of the vehicle driven, and the reason for the violation as alarm information, and send it to the interactive terminal held by the traffic police.
  • after receiving the alarm, the traffic police can carry out corresponding processing. In this way, alarms for illegal passenger-carrying behavior can be issued automatically and in a timely manner, which facilitates the handling of violations.
  • Embodiments will be described below in conjunction with a passenger-carrying behavior recognition scenario for non-motor vehicles.
  • the camera can send the to-be-recognized images collected in the predetermined area to the recognition device for rider behavior detection.
  • the identification device can be equipped with a pre-trained rider-vehicle identification network (hereinafter referred to as network 1), a passenger identification network (hereinafter referred to as network 2) and a vehicle identification network (hereinafter referred to as network 3).
  • the network 1 is used to detect the riders and vehicles appearing in the image to be recognized, and to obtain the corresponding vehicle areas and rider areas.
  • the network 2 can be used to identify the number of people on board.
  • the network 3 can be used to identify the type of vehicle.
  • the identification device can also perform multi-target tracking on each rider appearing in the image to be identified according to the identification results of network 1, obtaining the driving track of each rider, so as to distinguish riders who newly appear in the predetermined area, riders who are still active in the predetermined area, and riders who are about to leave the predetermined area. A rider who is about to leave the predetermined area can be determined as a target rider.
  • FIG. 5 is a schematic flow chart of a method for identifying manned behavior shown in the present application.
  • the rider frames corresponding to the riders and the vehicle frames corresponding to the vehicles appearing in the image to be recognized are identified through network 1. For the target rider frame corresponding to the target rider, the area enclosed by the target rider frame in the image to be recognized is determined as the rider area, and the area enclosed by a vehicle frame is determined as a vehicle area.
  • the overlap degree between each vehicle area and the rider area is determined using IoU, and the target vehicle area corresponding to the maximum overlap degree is determined as the associated vehicle area spatially associated with the rider area.
  • the spatial overlap relationship between the target rider and the vehicle he drives can thus be used to accurately determine the associated vehicle area, which helps to improve the accuracy of vehicle type recognition and obtain accurate passenger-carrying behavior recognition results.
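The overlap degree here is intersection-over-union (IoU) of the two boxes, and the vehicle box with the highest IoU against the rider box becomes the associated vehicle area. A sketch with boxes as `(x1, y1, x2, y2)` tuples (the coordinates below are illustrative):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # intersection area
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def associated_vehicle_area(rider_box, vehicle_boxes):
    """Pick the vehicle box with the maximum overlap degree (IoU)."""
    return max(vehicle_boxes, key=lambda v: iou(rider_box, v))
```

A rider box at `(0, 0, 10, 20)` would be associated with a vehicle box at `(0, 10, 10, 30)` (IoU 1/3) rather than a distant box with zero overlap.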
  • FIG. 6 is a schematic diagram of a judgment rule for illegal passenger-carrying behavior shown in the present application.
  • first, the identification result of the number of passengers is judged. If the identification result is invalid, traffic behavior identification for the target rider may be skipped.
  • S604 may be executed to determine whether the vehicle type represented by the vehicle type recognition result is a tricycle.
  • if the vehicle type is a tricycle, it is determined that the target rider is illegally carrying passengers; otherwise it is determined that the target rider is not illegally carrying passengers.
  • the alarm information can be generated based on the rider information, the vehicle information, and the cause of the violation, and sent to the handheld device of the corresponding traffic police in time, so that the traffic police can process it promptly.
  • the present application also proposes a traffic behavior recognition device.
  • FIG. 7 is a schematic structural diagram of a traffic behavior recognition device shown in the present application.
  • the device 70 may include: an acquisition module 71, configured to acquire an image to be recognized including one or more rider areas; a first determination module 72, configured to, for any one of the rider areas, determine from the image to be recognized an associated vehicle area associated with the rider area, where the rider area includes a vehicle and at least one human body; a recognition module 73, configured to perform passenger-count recognition on the rider area to obtain a passenger-count recognition result, and to perform vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and a second determination module 74, configured to determine, based on the passenger-count recognition result and the vehicle type recognition result, whether the target rider in the rider area has an illegal passenger-carrying behavior.
  • the first determination module 72 is configured to: detect the image to be recognized to obtain one or more vehicle areas and the rider area; and, among the one or more vehicle areas, determine the target vehicle area that overlaps the rider area to the greatest degree, and determine the target vehicle area as the associated vehicle area associated with the rider area.
  • alternatively, the first determination module 72 is configured to: detect the image to be recognized to obtain one or more vehicle areas and the rider area; determine the correlation score between each of the one or more vehicle areas and the rider area; and, among the one or more vehicle areas, determine the target vehicle area with the highest correlation score with the rider area, and determine the target vehicle area as the associated vehicle area associated with the rider area.
  • the identification module 73 is configured to: identify the number of passengers in the rider area to obtain the number of passengers and a corresponding first confidence level; in response to the first confidence level reaching a first confidence threshold, determine the number of passengers as the identification result of the number of passengers in the rider area; perform vehicle type identification on the associated vehicle area to obtain the vehicle type corresponding to the associated vehicle area and a corresponding second confidence level; and, in response to the second confidence level reaching a second confidence threshold, determine the vehicle type as the vehicle type identification result of the vehicle area.
  • the second determining module 74 is configured to: determine that the target rider is illegally carrying passengers in response to the identification result of the number of passengers being a first identification result, where the first identification result indicates that the number of passengers reaches a first preset number; or, in response to the identification result of the number of passengers being a second identification result and the vehicle type represented by the type identification result being a preset non-motor vehicle type, determine that the target rider is illegally carrying passengers, where the second identification result indicates that the number of passengers reaches a second preset number smaller than the first preset number; or, in response to the identification result of the number of passengers being the second identification result and the vehicle type represented by the type identification result not being the preset non-motor vehicle type, determine that the target rider is not illegally carrying passengers; or, in response to the identification result of the number of passengers being a third identification result, determine that the target rider has not illegally carried passengers, where the third identification result indicates that the number of passengers is a third preset number smaller than the second preset number; or, in response to the identification result of the number of passengers being a fourth identification result, determine that the traffic behavior identification for the target rider is invalid.
  • the fourth recognition result indicates that the image to be recognized includes at least one of the following scenes: a scene of a rider pushing a cart; a scene of a rider standing next to the car; a scene of multiple riders close to each other; low-resolution scene; or the scene where the vehicle is blocked.
  • the device 70 further includes: an alarm module, configured to send out an alarm message in response to the target rider in the rider area carrying passengers illegally.
  • the identification result of the number of passengers is obtained by detecting the rider area through the occupancy identification network; the device 70 also includes: a training module of the occupancy identification network, configured to obtain a first training sample, where the first training sample includes a plurality of sample images of riders and first annotation information of the corresponding number of passengers, and the first annotation information includes one of the following labels: 1 person, 2 people, 3 people, or an invalid label, the invalid label including at least one of the following: a rider pushing a cart, a rider standing next to the vehicle, multiple riders close to each other, low definition, and the vehicle being blocked; the first training sample is input into a preset first initial network to obtain the identification result of the number of people in the sample corresponding to each sample image; a first loss is determined based on the identification result of the number of people in the sample and the first annotation information, and the first initial network is optimized based on the first loss to obtain the occupancy identification network.
  • the vehicle type recognition result is obtained by detecting the associated vehicle area through a vehicle recognition network; the device 70 further includes: a training module of the vehicle recognition network, configured to obtain a second training sample, where the second training sample includes a plurality of sample images of vehicles and second label information of the corresponding vehicle types; the second training sample is input into a preset second initial network to obtain the sample vehicle type recognition result of each sample image; a second loss is determined based on the sample vehicle type recognition result and the second label information, and the second initial network is optimized based on the second loss to obtain the vehicle recognition network.
  • Embodiments of the traffic behavior recognition device shown in this application can be applied to electronic equipment.
  • the present application discloses an electronic device, which may include: a processor, and a memory for storing instructions executable by the processor.
  • the processor is configured to invoke the executable instructions stored in the memory to implement the traffic behavior recognition method shown in any one of the foregoing embodiments.
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device shown in the present application.
  • the electronic device may include a processor 801 for executing instructions, a network interface 802 for connecting to a network, a memory 803 for storing operation data for the processor, and a non-volatile memory 804 for storing instructions corresponding to the traffic behavior recognition device; the processor 801, the network interface 802, the memory 803 and the non-volatile memory 804 are coupled through an internal bus 805.
  • the embodiment of the device may be implemented by software, or by hardware or a combination of software and hardware.
  • taking software implementation as an example, a device in the logical sense is formed by the processor of the electronic device where it is located reading the corresponding computer program instructions from the non-volatile memory into the memory for execution.
  • depending on the actual functions of the electronic device where the device of the embodiment is located, the electronic device may also include other hardware, which will not be detailed here.
  • the instructions corresponding to the traffic behavior recognition device may also be directly stored in the memory, which is not limited here.
  • the present application proposes a computer-readable storage medium, the storage medium stores a computer program, and the computer program can be used to make a processor execute the traffic behavior recognition method as shown in any one of the foregoing embodiments.
  • one or more embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (which may include, but are not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
  • each embodiment in the present application is described in a progressive manner; the same and similar parts of the embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments.
  • in particular, for the device embodiments, which substantially correspond to the method embodiments, the description is relatively simple; for relevant parts, please refer to the description of the method embodiments.
  • Embodiments of the subject matter and functional operations described in this application can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that may include the structures disclosed in this application and their structural equivalents, or in a combination of one or more of them.
  • Embodiments of the subject matter described in this application can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier to be executed by, or to control the operation of, a data processing apparatus.
  • alternatively or additionally, the program instructions may be encoded in an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver device for execution by the data processing apparatus.
  • a computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • the processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
  • Computers suitable for the execution of a computer program may include, for example, general and/or special purpose microprocessors, or any other type of central processing unit.
  • a central processing unit will receive instructions and data from a read only memory and/or a random access memory.
  • the basic components of a computer may include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, to receive data from them, transfer data to them, or both.
  • a computer is not required to have such a device.
  • a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storing computer program instructions and data may include all forms of non-volatile memory, media and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM and flash memory devices), magnetic disks (such as internal hard drives or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present application provides a traffic behavior recognition method and apparatus, an electronic device, and a storage medium. The method may comprise: obtaining an image to be recognized comprising one or more rider areas; for any one of the one or more rider areas, executing operations comprising: determining an associated vehicle area associated with the rider area from said image, the rider area comprising a vehicle and at least one human body; performing manned quantity recognition on the rider area to obtain a manned quantity recognition result, and performing vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and determining, according to the manned quantity recognition result and the vehicle type recognition result, whether a target rider in the rider area has an illegal manned behavior.

Description

Traffic Behavior Recognition Method and Apparatus, Electronic Device and Storage Medium
Cross-Reference to Related Applications
This application claims priority to the Chinese patent application with application number 202110873586.2, filed on July 30, 2021, the entire content of which is hereby incorporated into this application by reference.
Technical Field
This application relates to computer technology, and in particular to a traffic behavior recognition method and apparatus, an electronic device, and a storage medium.
Background
As regulatory authorities strengthen supervision, traffic behaviors need to be identified. In some scenarios, traffic behavior recognition can include the recognition of passenger-carrying behaviors of non-motor vehicles; if an illegal passenger-carrying behavior is found, penalties and safety education are required.
At present, the identification of illegal passenger-carrying behaviors of non-motor vehicles mainly determines the number of passengers by recognizing the number of human heads or human bodies appearing in an image; if the number of passengers is too large, the passenger-carrying behavior is determined to be illegal.
Summary of the Invention
This application proposes a traffic behavior recognition method. The method may include: acquiring an image to be recognized including one or more rider areas; for any one of the one or more rider areas, performing operations including: determining, in the image to be recognized, an associated vehicle area associated with the rider area, the rider area including a vehicle and at least one human body; performing passenger-count recognition on the rider area to obtain a passenger-count recognition result, and performing vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and determining, according to the passenger-count recognition result and the vehicle type recognition result, whether the target rider in the rider area has an illegal passenger-carrying behavior.
This application also proposes a traffic behavior recognition apparatus, including: an acquisition module, configured to acquire an image to be recognized including one or more rider areas; a first determination module, configured to, for any one of the one or more rider areas, determine in the image to be recognized an associated vehicle area associated with the rider area, the rider area including a vehicle and at least one human body; a recognition module, configured to perform passenger-count recognition on the rider area to obtain a passenger-count recognition result, and perform vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and a second determination module, configured to determine, according to the passenger-count recognition result and the vehicle type recognition result, whether the target rider in the rider area has an illegal passenger-carrying behavior.
This application also proposes an electronic device, including: a processor; and a memory for storing instructions executable by the processor; the processor executes the executable instructions to implement the traffic behavior recognition method of any of the foregoing embodiments.
This application also proposes a computer-readable storage medium storing a computer program, the computer program being used to cause a processor to execute the traffic behavior recognition method of any of the foregoing embodiments.
This application also proposes a computer program product, including a computer program stored in a memory; when the computer program instructions are executed by a processor, the traffic behavior recognition method of any of the foregoing embodiments is implemented.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Brief Description of the Drawings
In order to more clearly illustrate the technical solutions in one or more embodiments of this application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments described in this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.
FIG. 1 is a flowchart of a traffic behavior recognition method shown in this application;
FIG. 2 is a flowchart of a method for determining an associated vehicle area shown in this application;
FIG. 3 is a schematic diagram of an object detection process shown in this application;
FIG. 4 is a flowchart of another method for determining an associated vehicle area shown in this application;
FIG. 5 is a schematic flowchart of a passenger-carrying behavior recognition method shown in this application;
FIG. 6 is a schematic diagram of a judgment rule for illegal passenger-carrying behavior shown in this application;
FIG. 7 is a schematic structural diagram of a traffic behavior recognition apparatus shown in this application;
FIG. 8 is a schematic diagram of the hardware structure of an electronic device shown in this application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application; rather, they are merely examples of devices and methods consistent with some aspects of this application as recited in the appended claims.
The terminology used in this application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a", "said" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items. It should also be understood that the word "if", as used herein, may be interpreted as "at the time of", "when", or "in response to determining", depending on the context.
This application aims to propose a traffic behavior recognition method (hereinafter referred to as the recognition method). At present, the identification of illegal passenger-carrying behaviors of non-motor vehicles mainly determines the number of passengers by recognizing the number of human heads or human bodies appearing in an image; if the number of passengers is too large, the passenger-carrying behavior is determined to be illegal. On the one hand, in a non-motor vehicle passenger-carrying scene, people are close to each other, and human bodies and heads are easily occluded, so an accurate count of bodies or heads cannot be obtained, leading to incorrect identification of the number of passengers. On the other hand, different types of non-motor vehicles have different requirements for the number of passengers, and existing methods cannot identify the legality of passenger-carrying behaviors for different types of non-motor vehicles.
FIG. 1 is a flow chart of a traffic behavior recognition method shown in this application. The method may include steps S102 to S108.
S102: Acquire an image to be recognized that includes one or more rider areas.
S104: For any one of the one or more rider areas, determine, from the image to be recognized, an associated vehicle area associated with the rider area, where the rider area includes one vehicle and at least one human body.
S106: Identify the number of passengers in the rider area to obtain a passenger-count identification result, and identify the vehicle type of the associated vehicle area to obtain a vehicle-type identification result.
S108: Determine, according to the passenger-count identification result and the vehicle-type identification result, whether the target rider in the rider area is carrying passengers illegally.
In the traffic behavior recognition method shown in this application, on the one hand, a neural network model is used to identify the number of passengers in the rider area. Because the model adaptively learns the passenger count from the rider area, an accurate count can be obtained even when occlusion occurs in the image to be recognized, which improves the accuracy of traffic behavior recognition.
On the other hand, the legality of the passenger-carrying behavior can be judged from both the passenger-count identification result and the vehicle-type identification result, so that the vehicle type and the number of passengers are considered together during the legality judgment. This achieves traffic behavior recognition tailored to different vehicle types.
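The legality decision of step S108 can be sketched as a simple rule lookup. The vehicle type names and per-type passenger limits below are illustrative assumptions for the sketch; the application itself does not fix specific limits:

```python
# Minimal sketch of step S108: judging legality from the passenger-count and
# vehicle-type identification results. The per-type limits are illustrative
# assumptions, not values specified in this application.
PASSENGER_LIMITS = {
    "motorcycle": 2,        # driver plus at most one passenger
    "electric_bicycle": 1,  # driver only
    "tricycle": 3,
}

def is_illegal_carrying(passenger_count: int, vehicle_type: str) -> bool:
    """Return True if the rider area shows illegal passenger carrying."""
    limit = PASSENGER_LIMITS.get(vehicle_type)
    if limit is None:
        return False  # unknown vehicle type: no rule to apply
    return passenger_count > limit
```

Because the limit is looked up per vehicle type, the same passenger count can be legal on one vehicle type and illegal on another, which is exactly the distinction the existing head-counting methods cannot make.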
The recognition method shown in FIG. 1 can be applied to an electronic device, which executes the method by running software logic corresponding to the method. The electronic device may be a notebook computer, a desktop computer, a server, a mobile phone, a PAD terminal, or the like; this application does not specifically limit the type of the electronic device. The electronic device may be a client device or a server device, which is not specifically limited here. It can be understood that the recognition method may be executed by the client device or the server device alone, or by the client device and the server device in cooperation. The server may be a cloud built from a single server or a cluster of servers. In the following description, the execution subject is an electronic device (hereinafter referred to as the device).
In some embodiments, the device may acquire the image to be recognized from an image acquisition device deployed at a road site. The image acquisition device captures images of a preset field of view of the road site at a fixed or adjustable angle and sends the captured images to be recognized to the device.
The image to be recognized may include one or more vehicles and one or more riders. The vehicles may be non-motor vehicles such as motorcycles, tricycles, and electric bicycles. A rider refers to a person exhibiting driving behavior.
After acquiring the image to be recognized, the device may execute S104. The rider area in this application refers to the region enclosed by the detection frame of a target rider in the image to be recognized. The target rider can be specified according to business needs. For example, the target rider may be a rider randomly selected from the riders included in the image; the rider with the highest definition among them; a rider about to leave the field of view; or each rider included in the image may be designated as a target rider in turn. The rider area may include one vehicle and at least one human body.
The vehicle area in this application refers to the region enclosed by the detection frame of a vehicle in the image to be recognized.
In this application, the associated vehicle area associated with a rider area can be determined at least from a correlation prediction score or a degree of overlap between the rider area and a vehicle area. The correlation prediction score or degree of overlap characterizes how closely the rider area and the associated vehicle area are connected in space. The degree of overlap and the correlation prediction score are described separately below.
In some embodiments, the associated vehicle area can be determined from the degree of overlap between the rider area and the vehicle areas.
FIG. 2 is a flow chart of a method for determining an associated vehicle area shown in this application.
As shown in FIG. 2, S104 further includes S202 and S204. In S202, the image to be recognized is detected to obtain one or more vehicle areas and the rider area. In S204, among the one or more vehicle areas, the target vehicle area with the greatest overlap with the rider area is determined and taken as the associated vehicle area associated with the rider area.
By determining the vehicle area with the greatest overlap with the rider area as the associated vehicle area, the spatial relationship between vehicle and rider is exploited to find the correct associated vehicle area. This helps determine accurately the type of vehicle the target rider is driving and thus improves the accuracy of traffic behavior recognition. In some embodiments, in S202, object detection may be performed with an object detection model to obtain the detection frames corresponding to the riders and the vehicles in the image to be recognized; the region enclosed in the image by the target detection frame corresponding to the target rider is determined as the rider area, and the region enclosed by a vehicle's detection frame is determined as a vehicle area.
The object detection model may be built on Region Convolutional Neural Networks (RCNN), Fast Region Convolutional Neural Networks (FAST-RCNN), or Faster Region Convolutional Neural Networks (FASTER-RCNN). This application does not specifically limit the network structure of the object detection model.
FIG. 3 is a schematic diagram of an object detection process shown in this application. It should be noted that FIG. 3 only schematically illustrates the object detection process and does not specifically limit this application.
The object detection model 30 shown in FIG. 3 may be built on the FASTER-RCNN network. The model may include at least a backbone network 31, a Region Proposal Network (RPN) 32, and a Region-based Convolutional Neural Network (RCNN) 33.
The backbone network 31 performs several convolution operations on the image to be recognized to obtain a target feature map of the image. The RPN 32 processes the target feature map to obtain anchor boxes (anchors) corresponding to each rider and each vehicle in the image. The RCNN 33 performs bounding box (bbox) regression and classification based on the anchors output by the RPN 32 and the target feature map output by the backbone network 31, obtaining a rider frame for each rider and a vehicle frame for each vehicle in the image to be recognized. In some examples, position information and/or size information of each rider frame or vehicle frame can be obtained. In some embodiments, the position and/or size of a rider frame or vehicle frame can be represented by the coordinates of its four vertices.
In some examples, the object detection model may first be trained with supervision on a number of training samples. In other examples, the position and size of the object frame corresponding to each object (riders and vehicles) in a number of sample images are annotated to obtain the training samples. The model is then trained with supervision on these samples in a conventional manner until it converges.
After training, the object detection model can be used to perform object detection on the image to be recognized, obtaining a rider frame for each rider and a vehicle frame for each vehicle included in the image. If the image includes multiple riders and/or multiple vehicles, the different rider frames and/or vehicle frames may also be numbered in the detection result.
After obtaining the rider frames and vehicle frames included in the image to be recognized, the target rider frame corresponding to the target rider can be selected; the region it encloses in the image is determined as the rider area, and the region enclosed by each vehicle frame is determined as a vehicle area.
In S204, for any one of the one or more rider areas, the degree of overlap between each vehicle area and the rider area can be calculated. The vehicle areas can be sorted in descending order of the calculated overlap, and the first-ranked vehicle area is determined as the target vehicle area, which is then taken as the associated vehicle area associated with the rider area.
In some embodiments, the degree of overlap may be the ratio of the region where the vehicle area and the rider area intersect to the region of their union, i.e., the Intersection over Union (IoU) between the vehicle area and the rider area characterizes their overlap.
When computing the IoU, it may first be determined whether the vehicle area (hereinafter area 1) and the rider area (hereinafter area 2) overlap at all. If they do, the area of the region where area 1 and area 2 intersect is divided by the area of the region of their union to obtain the IoU between area 1 and area 2, denoted IoU(area 1, area 2).
Assume the coordinates of the upper-left corner of area 1 are (p_x1, p_y1) and of its lower-right corner are (p_x2, p_y2), and that the upper-left corner of area 2 is (h_x1, h_y1) and its lower-right corner is (h_x2, h_y2).
If the condition p_x1 > h_x2 || p_x2 < h_x1 || p_y1 > h_y2 || p_y2 < h_y1 evaluates to 1 (true), area 1 and area 2 do not overlap; in other words, the vehicle corresponding to area 1 and the target rider corresponding to area 2 are not spatially associated.
If the condition evaluates to 0 (false), the length of the intersecting region can be determined as Len = min(p_x2, h_x2) − max(p_x1, h_x1), and its width as Wid = min(p_y2, h_y2) − max(p_y1, h_y1).
After determining the length Len and width Wid of the intersecting region, its area is S1 = Len * Wid.
Afterwards, the area of the union of area 1 and area 2 can be determined as S2 = S(p) + S(h) − S1, where:
the area of area 1 is S(p) = (p_y2 − p_y1) * (p_x2 − p_x1);
the area of area 2 is S(h) = (h_y2 − h_y1) * (h_x2 − h_x1).
Finally, the degree of overlap between the vehicle area and the rider area is IoU = S1 / S2. The overlap between a vehicle area and the rider area can thus be computed accurately, so that the associated vehicle area is determined correctly, which helps improve the accuracy of traffic behavior recognition.
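The overlap computation above can be sketched in Python. A box is given as (x1, y1, x2, y2) upper-left and lower-right corner coordinates, as in the formulas; `pick_associated_vehicle` is a hypothetical helper implementing the max-overlap selection of S204:

```python
def iou(area1, area2):
    """IoU between two axis-aligned boxes (x1, y1, x2, y2), following the
    formulas above: S1 = Len * Wid and S2 = S(p) + S(h) - S1."""
    px1, py1, px2, py2 = area1
    hx1, hy1, hx2, hy2 = area2
    # Non-overlap check: the two regions are not spatially associated.
    if px1 > hx2 or px2 < hx1 or py1 > hy2 or py2 < hy1:
        return 0.0
    length = min(px2, hx2) - max(px1, hx1)   # Len
    width = min(py2, hy2) - max(py1, hy1)    # Wid
    s1 = length * width                      # intersection area
    s2 = (px2 - px1) * (py2 - py1) + (hx2 - hx1) * (hy2 - hy1) - s1  # union
    return s1 / s2

def pick_associated_vehicle(rider_area, vehicle_areas):
    """S204: index of the vehicle area with the greatest overlap (IoU)."""
    return max(range(len(vehicle_areas)),
               key=lambda i: iou(vehicle_areas[i], rider_area))
```

The descending sort described in S204 reduces to a single argmax, since only the first-ranked vehicle area is used.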
In some embodiments, the target vehicle can also be determined from a correlation prediction score between the rider and a vehicle.
FIG. 4 is a flow chart of another method for determining an associated vehicle area shown in this application.
As shown in FIG. 4, S104 may include S402 to S406. In S402, the image to be recognized is detected to obtain the one or more vehicle areas and the rider area. In S404, the correlation scores between the one or more vehicle areas and the rider area are determined with a pre-trained correlation score prediction model. In S406, among the one or more vehicle areas, the target vehicle area with the highest correlation score with the rider area is determined and taken as the associated vehicle area associated with the rider area.
The correlation score thus accurately characterizes the degree of association between the rider area and each vehicle area, so that the vehicle area most strongly associated with the rider area can be determined. This helps determine accurately the type of vehicle the target rider is driving and in turn improves the accuracy of traffic behavior recognition. Note that the description of S402 parallels that of S202 and is not repeated here.
The correlation score prediction network may be built on a deep learning network. To train it, images each containing multiple pairs of vehicle areas and rider areas are first acquired, and the correlation score of each vehicle-area/rider-area pair is annotated to obtain training samples: if a rider area is associated with a vehicle area, the score is annotated as 1; otherwise, 0. The network is then trained with supervision on these samples until it converges. Once trained, the network can predict the correlation score between a vehicle area and the rider area in the image to be recognized.
After determining the associated vehicle area, the device may continue with S106. An object area in this step (a rider area or a vehicle area) may be the region enclosed in the image to be recognized by the object frame corresponding to the object, and carries image features related to the object.
The rider area described in this application may cover first image features related to the rider's passenger-carrying behavior, for example, image features corresponding to the vehicle the rider is driving and the human bodies carried on that vehicle. The number of passengers can be judged from the first image features.
The vehicle area described in this application may cover second image features related to the vehicle type, for example, image features corresponding to the vehicle such as the number of wheels, the wheel structure, and the body structure. The vehicle type can be judged from the second image features.
In some embodiments, S106 may include S1062 and S1064. In S1062, the number of passengers in the rider area is identified to obtain a passenger-count identification result. In S1064, the vehicle type of the associated vehicle area is identified to obtain a vehicle-type identification result. This application does not limit the execution order of S1062 and S1064.
In some embodiments, when executing S1062, a rider area map corresponding to the rider area may be acquired. In some embodiments, the rider frame corresponding to the target rider and the image to be recognized (or the target feature map obtained by extracting features from the image with the backbone network 31) may be input to a region feature extraction unit to obtain the rider area map corresponding to the target rider. The rider area map may be a feature map or an image of the rider area.
The region feature extraction unit may include a Region of Interest Align (ROI Align) unit or a Region of Interest Pooling (ROI Pooling) unit. The region feature extraction unit can apply processing such as pooling and convolution to the rider area enclosed by the rider frame to obtain the rider area map, which may include high-dimensional or low-dimensional image features.
After the rider area map is obtained, the number of passengers can be identified from it to obtain the passenger-count identification result.
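As a much simplified stand-in for the region feature extraction unit, the sketch below merely crops the rider frame out of an image or feature map stored as nested lists indexed [row][column]; real ROI Align or ROI Pooling would additionally resample the region to a fixed output size (with bilinear interpolation or max pooling, respectively), which is omitted here:

```python
# Simplified region extraction: crop the rider frame from a 2-D map.
# The box is (x1, y1, x2, y2) with upper-left and lower-right corners,
# matching the four-vertex frame representation used in the detection step.
def crop_region(feature_map, box):
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in feature_map[y1:y2]]
```

The crop keeps only the features inside the rider frame, which is the input the passenger-count classifier operates on.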
In some embodiments, the passenger count can be identified with a pre-trained passenger-count recognition model, which may include a classifier built on a neural network. The passenger-count identification result output by the model may include a first identification result, a second identification result, and a third identification result, together with a confidence for each. The first identification result indicates that the number of passengers reaches a first preset number; the second identification result, a second preset number; the third identification result, a third preset number. The preset numbers can be set according to business needs: for example, the first preset number may be 3 or more, the second may be 2, and the third may be 1.
When determining the passenger-count identification result, the identification result with the highest confidence can be selected. For example, if classifying the passenger count in the rider area map with the above model yields confidences of 0.7, 0.2, and 0.1 for the first, second, and third identification results respectively, the passenger-count identification result is determined to be the first identification result, corresponding to the highest confidence of 0.7.
To train the passenger-count recognition model, training samples annotated with the number of passengers are first acquired, and the model is trained with supervision over multiple rounds of iteration until it converges. Once trained, the model can be used to identify the number of passengers. The adaptive learning capability of the neural network can thus be exploited to improve the accuracy of passenger-count identification in various situations, including occlusion.
In many scenarios, traffic behavior recognition may be unnecessary or impossible; such scenarios can be called invalid scenarios. For example, although a scene of a rider pushing a vehicle or standing beside it contains both a rider and a vehicle, the rider is not driving, so passenger-carrying behavior need not be detected in such scenes. As another example, in scenes where multiple riders are close together, in low-definition scenes, or in scenes where the vehicle is occluded, the rider or vehicle may be too indistinct to recognize, so traffic behavior recognition may not be possible.
A fourth identification result indicating that the current identification is invalid may be added to the passenger-count identification results obtained for the rider area. If the passenger-count identification result for a rider area is the fourth identification result, the scene in that rider area is an invalid scenario in which traffic behavior recognition is unnecessary or impossible, so traffic behavior recognition can be skipped for that rider area.
In this case, the passenger-count identification result output by the passenger-count recognition model may include the first, second, third, and fourth identification results, together with a confidence for each.
The fourth identification result indicates that the image to be recognized contains at least one of the following scenes: a rider pushing a vehicle, a rider standing beside a vehicle, multiple riders close together, a low-definition scene, or a vehicle being occluded.
When determining the final passenger-count identification result, the result with the highest confidence can be selected. For example, if classifying the passenger count in the rider area map with the above model yields confidences of 0.1, 0.2, 0.1, and 0.6 for the first, second, third, and fourth identification results respectively, the passenger-count identification result is determined to be the invalid identification result, corresponding to the highest confidence of 0.6.
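Selecting the final identification result reduces to an argmax over the classifier confidences. The class names below are illustrative labels for the sketch; the first three entries correspond to the preset passenger counts described above and the fourth to the invalid identification result:

```python
# Illustrative class order: first, second, third, and fourth identification
# results (e.g. 3+ passengers, 2 passengers, 1 passenger, invalid scenario).
RESULTS = ["3_or_more", "2_persons", "1_person", "invalid"]

def select_result(confidences):
    """Return (result, confidence) for the highest-confidence class."""
    best = max(range(len(confidences)), key=confidences.__getitem__)
    return RESULTS[best], confidences[best]
```

A downstream step would skip traffic behavior recognition entirely when the selected result is "invalid".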
In some embodiments, training the passenger-count recognition network may include S11 to S13.
S11: Acquire first training samples.
The first training samples include multiple sample images of riders and corresponding first annotation information on the number of passengers. The first annotation information includes one of the following labels: 1 person, 2 persons, 3 persons, or an invalid label, where the invalid label covers at least one of the following: a rider pushing a vehicle, a rider standing beside a vehicle, multiple riders close together, low definition, or a vehicle being occluded.
S12: Input the first training samples into a preset first initial network to obtain a sample passenger-count identification result for each sample image.
The first initial network may be any type of neural network that outputs a passenger-count identification result.
S13: Determine a first loss based on the sample passenger-count identification results and the first annotation information, and optimize the first initial network based on the first loss to obtain the passenger-count recognition network.
After obtaining the sample passenger-count identification result for each sample image in the first training samples, the first loss can be determined from the first annotation information, and the parameters of the first initial network are updated by backpropagation, completing one parameter iteration. In some embodiments, the number of parameter iterations can be preset, and the passenger-count recognition network is obtained after the first initial network has completed the preset number of iterations.
With this training method, when identifying the number of passengers, on the one hand, invalid scenarios in which passenger-carrying detection and traffic behavior recognition are impossible or unnecessary can be filtered out, improving the efficiency of traffic behavior recognition; on the other hand, the number of passengers can be identified accurately, improving the effect of traffic behavior recognition.
In some embodiments, after passenger-count recognition is performed on the rider area to obtain a passenger count and a corresponding first confidence, the passenger count may be determined as the passenger-count recognition result of the rider area in response to the first confidence reaching a first confidence threshold.
The first confidence threshold can be set according to the business scenario. For example, suppose the model determines a passenger count of 1 with a corresponding confidence of 0.7, and the first confidence threshold is 0.7; the result of 1 passenger is then considered credible and can be output. Here, the confidence represents the degree of credibility of the count of 1 passenger.
By setting a confidence threshold and outputting the passenger-count recognition result only when the confidence reaches it, the credibility of the output result can be guaranteed, thereby ensuring the accuracy of traffic behavior recognition.
In some embodiments, the first confidence threshold can be adjusted to flexibly adapt to a variety of traffic behavior recognition scenarios.
For example, in a scenario that prioritizes the accuracy of violation detection, the first confidence threshold can be set to a higher value (such as 0.9), which makes the output passenger-count results sufficiently credible and improves the accuracy of traffic behavior recognition. Conversely, in a scenario that prioritizes the sensitivity of passenger-carrying recognition, the threshold can be set to a lower value (such as 0.6), which increases the number of passenger-count results that are output and improves the sensitivity of traffic behavior recognition.
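The thresholding described above reduces to a single comparison; a minimal sketch (function name assumed) shows how raising or lowering the threshold trades accuracy against sensitivity:

```python
def accept_count_result(count, confidence, threshold=0.7):
    """Output the passenger-count result only when its confidence reaches
    the first confidence threshold; otherwise report no usable result."""
    return count if confidence >= threshold else None
```

With `threshold=0.9`, fewer but more credible results are output (accuracy-first); with `threshold=0.6`, more results pass through (sensitivity-first).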
In some embodiments, when executing S1064, a corresponding associated-vehicle area map may first be obtained from the associated vehicle area. In some embodiments, the vehicle box corresponding to the associated vehicle area, together with the image to be recognized (or the target feature map obtained by extracting features from that image with the backbone network 31), may be input into an ROI Pooling unit to obtain the associated-vehicle area map. The associated-vehicle area map may be a feature map or an image of the associated vehicle area.
Vehicle type recognition may then be performed on the associated-vehicle area map to obtain a vehicle type recognition result. In some embodiments, vehicle type recognition can be performed with a pre-trained vehicle recognition network, which may include a classifier built on a neural network. The output of the vehicle recognition network may include the confidence (for example, a probability) of recognizing the vehicle in the vehicle area map as each preset vehicle type. When determining the final vehicle type, the type with the highest confidence can be selected; that is, the vehicle type corresponding to the highest confidence is determined as the vehicle type recognition result.
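Selecting the highest-confidence type amounts to an argmax over the per-type confidences; a sketch follows (the type names are illustrative, not the patent's preset list):

```python
def pick_vehicle_type(type_confidences):
    """Given the network's confidence for each preset vehicle type, return
    the highest-confidence type as the vehicle type recognition result."""
    best = max(type_confidences, key=type_confidences.get)
    return best, type_confidences[best]
```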
In some embodiments, training the vehicle recognition network may include S21-S23.
S21. Acquire a second training sample.
The second training sample includes multiple sample images of vehicles and second annotation information indicating the corresponding vehicle types.
S22. Input the second training sample into a preset second initial network to obtain a sample vehicle type recognition result for each sample image.
The second initial network may be any type of neural network and can output a vehicle type recognition result.
S23. Determine a second loss based on the sample vehicle type recognition results and the second annotation information, and optimize the second initial network based on the second loss to obtain the vehicle recognition network.
After the results are computed for the second training sample, the second loss can be determined according to the second annotation information, and the parameters of the second initial network can be updated through a backpropagation operation to complete one parameter iteration. In some embodiments, the number of parameter iterations can be preset; after the second initial network has completed the preset number of iterations, the vehicle recognition network is obtained.
With this training method, the self-adaptive learning capability of neural networks can be leveraged to improve the accuracy of vehicle type recognition.
In some embodiments, after vehicle type recognition is performed on the associated vehicle area to obtain the vehicle type corresponding to the associated vehicle area and a corresponding second confidence, the vehicle type may be determined as the vehicle type recognition result of the vehicle area in response to the second confidence reaching a second confidence threshold.
The second confidence threshold can be set according to the business scenario.
By setting a confidence threshold and outputting the vehicle type recognition result only when the confidence reaches it, the credibility of the output result can be guaranteed, thereby ensuring the accuracy of traffic behavior recognition.
In some embodiments, the second confidence threshold can be adjusted to flexibly adapt to a variety of traffic behavior recognition scenarios. For example, in a scenario that emphasizes the accuracy of passenger-carrying recognition, the second confidence threshold can be set to a higher value (such as 0.9), which makes the output vehicle type results sufficiently credible and improves the accuracy of traffic behavior recognition. Conversely, in a scenario that emphasizes sensitivity, the threshold can be set to a lower value (such as 0.6), which increases the number of vehicle type results that are output and improves the sensitivity of traffic behavior recognition.
After obtaining the rider's passenger-count recognition result and the vehicle type recognition result, the device may execute S108.
In some embodiments, recognition results appropriate to each vehicle type can be output.
When executing S108, in a first aspect, it may be determined that the target rider in the rider area is illegally carrying passengers in response to the passenger-count recognition result being a first recognition result, where the first recognition result indicates that the passenger count has reached a first preset number.
The first preset number may be an empirical value. For example, in a non-motor-vehicle scenario, regardless of vehicle type, the number of people carried, including the driver, must be fewer than 3; a non-motor vehicle carrying 3 or more people can be regarded as non-compliant. In that case the first preset number can be set to 3, and a passenger-carrying violation can be determined whenever the count reaches 3 or more.
In a second aspect, it may be determined that the target rider is illegally carrying passengers in response to the passenger-count recognition result being a second recognition result and the vehicle type indicated by the type recognition result being a preset non-motor-vehicle type, where the second recognition result indicates that the passenger count has reached a second preset number smaller than the first preset number.
In a third aspect, it may be determined that the target rider is not illegally carrying passengers in response to the passenger count indicated by the passenger-count recognition result being the second recognition result and the vehicle type indicated by the type recognition result not being the preset non-motor-vehicle type.
When the passenger count equals the second preset number, the corresponding passenger-carrying behavior may be compliant or non-compliant depending on the vehicle type. If the vehicle driven by the target rider (the associated vehicle) is of the preset non-motor-vehicle type, a violation can be determined; otherwise, the behavior can be determined to be compliant.
The second preset number may be an empirical value. The preset non-motor-vehicle type may refer to vehicles whose approved passenger capacity must not reach the second preset number.
For example, suppose the second preset number is 2 and the preset non-motor-vehicle type is a utility vehicle such as a tricycle. Such a vehicle may legally carry only 1 person, so if a passenger count of 2 is recognized, a violation can be determined. If the vehicle is not such a utility vehicle, for example a motorcycle or an electric bicycle, a recognized passenger count of 2 can be determined to be compliant.
In a fourth aspect, it may be determined that the target rider is not illegally carrying passengers in response to the passenger-count recognition result being a third recognition result, where the third recognition result indicates that the passenger count is a third preset number smaller than the second preset number.
The third preset number may be an empirical value. For example, in a non-motor-vehicle scenario, regardless of vehicle type, a count of 1 person including the driver can be regarded as compliant, in which case the third preset number can be set to 1.
In a fifth aspect, it may be determined that traffic behavior recognition for the target rider is invalid in response to the passenger-count recognition result being a fourth recognition result, so that no further traffic behavior recognition is needed.
Through the five-aspect legality judgment logic above, recognition results that match reality can be output for different vehicle types and scenarios, improving the applicability of legality recognition.
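The five-aspect logic above can be sketched as a single decision function. The preset numbers (3, 2, 1) and the restricted type ("tricycle") are taken from the examples in the text, not from a normative list, and the function name is assumed:

```python
FIRST_PRESET = 3        # count at or above which carrying always violates
SECOND_PRESET = 2       # count that violates only for restricted vehicle types
RESTRICTED_TYPES = {"tricycle"}  # assumed preset non-motor-vehicle types

def judge_carrying(count_result, vehicle_type):
    """Five-branch legality logic; count_result is the recognized passenger
    count, or None for the invalid fourth recognition result."""
    if count_result is None:
        return "invalid"                 # fifth aspect: skip recognition
    if count_result >= FIRST_PRESET:
        return "violation"               # first aspect
    if count_result == SECOND_PRESET:    # second and third aspects
        return "violation" if vehicle_type in RESTRICTED_TYPES else "compliant"
    return "compliant"                   # fourth aspect: single rider
```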
In some embodiments, when it is determined that the target rider is illegally carrying passengers, an alert can be issued.
In some embodiments, the device can be connected to an interactive terminal held by a traffic officer. When the device recognizes a passenger-carrying violation, it can package the information corresponding to the target rider, the information of the vehicle being driven, and the reason for the violation into an alert message and send it to the officer's interactive terminal. The officer can then handle the case after receiving the alert. Violations can thus be flagged automatically and promptly, facilitating enforcement.
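Packaging the violation details into one alert message might look like the following sketch; every field name here is an assumption for illustration:

```python
def build_alert(rider_info, vehicle_info, reason):
    """Bundle the rider, vehicle, and violation-reason details into a
    single alert message for the traffic officer's terminal."""
    return {"rider": rider_info, "vehicle": vehicle_info, "reason": reason}
```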
An embodiment is described below in the context of recognizing passenger-carrying behavior on non-motor vehicles.
Several cameras are deployed in the road scene. A camera can send images captured within a predetermined area to a recognition device for rider behavior detection.
The recognition device can be equipped with a pre-trained rider-vehicle recognition network (network 1), an occupancy recognition network (network 2), and a vehicle recognition network (network 3).
Network 1 is used to detect the riders and vehicles appearing in the image to be recognized, along with the corresponding vehicle areas and rider areas. Network 2 can be used to recognize the passenger count. Network 3 can be used to recognize the vehicle type.
The recognition device can also perform multi-target tracking on each rider appearing in the image according to the recognition results of network 1, obtaining each rider's trajectory and thereby distinguishing riders newly entering the predetermined area, riders still moving within it, and riders about to leave it. A rider about to leave the predetermined area can be determined as the target rider.
FIG. 5 is a schematic flowchart of a passenger-carrying behavior recognition method shown in the present application.
As shown in FIG. 5, after the recognition device receives the image to be recognized, in S501 it uses network 1 to recognize the rider boxes and vehicle boxes corresponding to the riders and vehicles appearing in the image, determines the target rider box corresponding to the target rider, determines the region enclosed by the target rider box in the image as the rider area, and determines the regions enclosed by the vehicle boxes as the vehicle areas.
In S502, IoU is used to determine the degree of overlap between each vehicle area and the rider area, and the target vehicle area with the greatest overlap is determined as the associated vehicle area spatially associated with the rider area. The spatial overlap between the target rider and the vehicle being driven can thus be used to accurately determine the associated vehicle area, which helps improve the accuracy of vehicle type recognition and yields accurate passenger-carrying recognition results.
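The overlap-based association in S502 amounts to computing IoU between the rider box and every vehicle box and keeping the maximum; a minimal sketch with `(x1, y1, x2, y2)` boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def associate_vehicle(rider_box, vehicle_boxes):
    """Return the vehicle box with the greatest overlap with the rider box."""
    return max(vehicle_boxes, key=lambda box: iou(rider_box, box))
```

A rider and the vehicle being ridden overlap almost entirely, so the argmax over IoU typically selects the correct vehicle even in crowded scenes.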
In S503, the rider area map corresponding to the rider area is obtained, and network 2 produces the passenger-count recognition result. In S504, the associated-vehicle area map corresponding to the associated vehicle area is obtained, and network 3 produces the vehicle type recognition result. In this example, it can be checked whether the confidences of the passenger-count recognition result and the vehicle type recognition result each reach 0.8, so that only credible results are retained, improving recognition accuracy.
In S505, whether the target rider's passenger-carrying behavior is a violation is recognized according to the passenger-count recognition result and the type recognition result.
FIG. 6 is a schematic flowchart of judging passenger-carrying violations shown in the present application.
As shown in FIG. 6, in S602 the result indicated by the passenger-count recognition result is judged. If the passenger-count recognition result is invalid, traffic behavior recognition for the target rider may be skipped.
If the passenger count indicated by the recognition result reaches 3, it is determined that the target rider is illegally carrying passengers.
If the passenger count indicated by the recognition result is 2, S604 may be executed to determine whether the vehicle type indicated by the vehicle type recognition result is a tricycle.
If the vehicle type is a tricycle, it is determined that the target rider is illegally carrying passengers; otherwise, it is determined that the target rider is not.
If the passenger count indicated by the recognition result is 1, it is determined that the target rider is not illegally carrying passengers.
Thus, on the one hand, passenger-carrying recognition need not be performed for invalid scenes, improving its efficiency and effectiveness; on the other hand, recognition results matching reality can be output for different vehicle types and scenarios, improving the applicability of the method.
If a passenger-carrying violation is recognized, alert information can be generated based on the corresponding rider information, vehicle information, and violation reason, and sent promptly to the traffic officer's handheld device, facilitating timely handling.
Corresponding to any of the above embodiments, the present application further proposes a traffic behavior recognition apparatus.
FIG. 7 is a schematic structural diagram of a traffic behavior recognition apparatus shown in the present application.
As shown in FIG. 7, the apparatus 70 may include: an acquisition module 71, configured to acquire an image to be recognized that includes one or more rider areas; a first determination module 72, configured to determine, for any one of the one or more rider areas, an associated vehicle area associated with that rider area from the image to be recognized, the rider area including one vehicle and at least one human body; a recognition module 73, configured to perform passenger-count recognition on the rider area to obtain a passenger-count recognition result, and to perform vehicle type recognition on the associated vehicle area to obtain a vehicle type recognition result; and a second determination module 74, configured to determine, according to the passenger-count recognition result and the vehicle type recognition result, whether the target rider in the rider area is illegally carrying passengers.
In some embodiments, the first determination module 72 is configured to: detect the image to be recognized to obtain one or more vehicle areas and the rider area; and, among the one or more vehicle areas, determine the target vehicle area with the greatest overlap with the rider area and determine that target vehicle area as the associated vehicle area associated with the rider area.
In some embodiments, the first determination module 72 is configured to: detect the image to be recognized to obtain one or more vehicle areas and the rider area; determine association scores between the one or more vehicle areas and the rider area through a pre-trained association-score prediction model; and, among the one or more vehicle areas, determine the target vehicle area with the highest association score with the rider area and determine that target vehicle area as the associated vehicle area associated with the rider area.
In some embodiments, the recognition module 73 is configured to: perform passenger-count recognition on the rider area to obtain a passenger count and a corresponding first confidence; in response to the first confidence reaching a first confidence threshold, determine the passenger count as the passenger-count recognition result of the rider area; perform vehicle type recognition on the associated vehicle area to obtain the corresponding vehicle type and a second confidence; and, in response to the second confidence reaching a second confidence threshold, determine the vehicle type as the vehicle type recognition result of the vehicle area.
In some embodiments, the second determination module 74 is configured to: in response to the passenger-count recognition result being a first recognition result, determine that the target rider is illegally carrying passengers, the first recognition result indicating that the passenger count has reached a first preset number; or, in response to the passenger-count recognition result being a second recognition result and the vehicle type indicated by the type recognition result being a preset non-motor-vehicle type, determine that the target rider is illegally carrying passengers, the second recognition result indicating that the passenger count has reached a second preset number smaller than the first preset number; or, in response to the passenger count indicated by the passenger-count recognition result being the second recognition result and the vehicle type indicated by the type recognition result not being the preset non-motor-vehicle type, determine that the target rider is not illegally carrying passengers; or, in response to the passenger-count recognition result being a third recognition result, determine that the target rider is not illegally carrying passengers, the third recognition result indicating that the passenger count is a third preset number smaller than the second preset number; or, in response to the passenger-count recognition result being a fourth recognition result, determine that traffic behavior recognition for the target rider is invalid.
In some embodiments, the fourth recognition result indicates that the image to be recognized includes at least one of the following scenes: a rider pushing the vehicle; a rider standing beside the vehicle; multiple riders close together; a low-definition scene; or a scene in which the vehicle is occluded.
In some embodiments, the apparatus 70 further includes: an alert module, configured to issue alert information in response to the target rider in the rider area illegally carrying passengers.
In some embodiments, the passenger-count recognition result is obtained by detecting the rider area with an occupancy recognition network, and the apparatus 70 further includes a training module for the occupancy recognition network, configured to: acquire a first training sample including multiple sample images of riders and first annotation information of the corresponding passenger counts, the first annotation information including one of the following labels: 1 person, 2 people, 3 people, or an invalid label, where the invalid label covers at least one of the following: a rider pushing the vehicle, a rider standing beside the vehicle, multiple riders close together, low definition, or the vehicle being occluded; input the first training sample into a preset first initial network to obtain a sample passenger-count recognition result for each sample image; determine a first loss based on the sample passenger-count recognition results and the first annotation information; and optimize the first initial network based on the first loss to obtain the occupancy recognition network.
In some embodiments, the vehicle recognition result is obtained by detecting the associated vehicle area with a vehicle recognition network, and the apparatus 70 further includes a training module for the vehicle recognition network, configured to: acquire a second training sample including multiple sample images of vehicles and second annotation information of the corresponding vehicle types; input the second training sample into a preset second initial network to obtain a sample vehicle type recognition result for each sample image; determine a second loss based on the sample vehicle type recognition results and the second annotation information; and optimize the second initial network based on the second loss to obtain the vehicle recognition network.
The embodiments of the traffic behavior recognition apparatus shown in this application can be applied to electronic devices. Accordingly, this application discloses an electronic device, which may include a processor and a memory for storing processor-executable instructions, wherein the processor is configured to invoke the executable instructions stored in the memory to implement the traffic behavior recognition method shown in any of the foregoing embodiments.
FIG. 8 is a schematic diagram of the hardware structure of an electronic device shown in the present application.
As shown in FIG. 8, the electronic device may include a processor 801 for executing instructions, a network interface 802 for network connections, a memory 803 for storing runtime data for the processor, and a non-volatile storage 804 for storing instructions corresponding to the behavior recognition apparatus; the processor 801, network interface 802, memory 803, and non-volatile storage 804 are coupled through an internal bus 805.
The apparatus embodiments may be implemented in software, in hardware, or in a combination of software and hardware. Taking a software implementation as an example, the apparatus, as a logical entity, is formed by the processor of its host electronic device reading the corresponding computer program instructions from non-volatile storage into memory and running them. At the hardware level, in addition to the processor, memory, network interface, and non-volatile storage shown in FIG. 8, the electronic device hosting the apparatus in the embodiments may also include other hardware according to its actual functions, which will not be detailed here.
It can be understood that, to increase processing speed, the instructions corresponding to the traffic behavior recognition apparatus may also be stored directly in memory, which is not limited here.
This application proposes a computer-readable storage medium storing a computer program that can be used to cause a processor to execute the traffic behavior recognition method shown in any of the foregoing embodiments.
本领域技术人员应明白，本申请一个或多个实施例可提供为方法、系统或计算机程序产品。因此，本申请一个或多个实施例可采用完全硬件实施例、完全软件实施例或结合软件和硬件方面的实施例的形式。而且，本申请一个或多个实施例可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质（可以包括但不限于磁盘存储器、CD-ROM、光学存储器等）上实施的计算机程序产品的形式。Those skilled in the art should understand that one or more embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
本申请中的“和/或”表示至少具有两者中的其中一个,例如,“A和/或B”可以包括三种方案:A、B、以及“A和B”。"And/or" in this application means at least one of the two, for example, "A and/or B" may include three options: A, B, and "A and B".
本申请中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于数据处理设备实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。Each embodiment in the present application is described in a progressive manner, the same and similar parts of each embodiment can be referred to each other, and each embodiment focuses on the differences from other embodiments. In particular, for the data processing device embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for relevant parts, please refer to part of the description of the method embodiment.
以上对本申请特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下，在权利要求书中记载的行为或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外，在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中，多任务处理和并行处理也是可以的或者可能是有利的。The foregoing describes specific embodiments of the present application. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing are also possible or may be advantageous.
本申请中描述的主题及功能操作的实施例可以在以下中实现：数字电子电路、有形体现的计算机软件或固件、可以包括本申请中公开的结构及其结构性等同物的计算机硬件、或者它们中的一个或多个的组合。本申请中描述的主题的实施例可以实现为一个或多个计算机程序，即编码在有形非暂时性程序载体上以被数据处理装置执行或控制数据处理装置的操作的计算机程序指令中的一个或多个模块。可替代地或附加地，程序指令可以被编码在人工生成的传播信号上，例如机器生成的电、光或电磁信号，该信号被生成以将信息编码并传输到合适的接收机装置以由数据处理装置执行。计算机存储介质可以是机器可读存储设备、机器可读存储基板、随机或串行存取存储器设备、或它们中的一个或多个的组合。Embodiments of the subject matter and the functional operations described in this application can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware that may include the structures disclosed in this application and their structural equivalents, or in a combination of one or more of them. Embodiments of the subject matter described in this application can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions can be encoded in an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus. A computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
本申请中描述的处理及逻辑流程可以由执行一个或多个计算机程序的一个或多个可编程计算机执行,以通过根据输入数据进行操作并生成输出来执行相应的功能。所述处理及逻辑流程还可以由专用逻辑电路—例如FPGA(现场可编程门阵列)或ASIC(专用集成电路)来执行,并且装置也可以实现为专用逻辑电路。The processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
适合用于执行计算机程序的计算机可以包括，例如通用和/或专用微处理器，或任何其他类型的中央处理单元。通常，中央处理单元将从只读存储器和/或随机存取存储器接收指令和数据。计算机的基本组件可以包括用于实施或执行指令的中央处理单元以及用于存储指令和数据的一个或多个存储器设备。通常，计算机还将可以包括用于存储数据的一个或多个大容量存储设备，例如磁盘、磁光盘或光盘等，或者计算机将可操作地与此大容量存储设备耦接以从其接收数据或向其传送数据，抑或两种情况兼而有之。然而，计算机不是必须具有这样的设备。此外，计算机可以嵌入在另一设备中，例如移动电话、个人数字助理（PDA）、移动音频或视频播放器、游戏操纵台、全球定位系统（GPS）接收机、或例如通用串行总线（USB）闪存驱动器的便携式存储设备，仅举几例。Computers suitable for the execution of a computer program include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The essential elements of a computer may include a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name just a few.
适合于存储计算机程序指令和数据的计算机可读介质可以包括所有形式的非易失性存储器、媒介和存储器设备，例如可以包括半导体存储器设备（例如EPROM、EEPROM和闪存设备）、磁盘（例如内部硬盘或可移动盘）、磁光盘以及CD-ROM和DVD-ROM盘。处理器和存储器可由专用逻辑电路补充或并入专用逻辑电路中。Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
虽然本申请包含许多具体实施细节，但是这些不应被解释为限制任何公开的范围或所要求保护的范围，而是主要用于描述特定公开的具体实施例的特征。本申请内在多个实施例中描述的某些特征也可以在单个实施例中被组合实施。另一方面，在单个实施例中描述的各种特征也可以在多个实施例中分开实施或以任何合适的子组合来实施。此外，虽然特征可以如所述在某些组合中起作用并且甚至最初如此要求保护，但是来自所要求保护的组合中的一个或多个特征在一些情况下可以从该组合中去除，并且所要求保护的组合可以指向子组合或子组合的变型。While this application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as descriptions of features of particular disclosed embodiments. Certain features that are described in this application in multiple embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.
类似地,虽然在附图中以特定顺序描绘了操作,但是这不应被理解为要求这些操作以所示的特定顺序执行或顺次执行、或者要求所有例示的操作被执行,以实现期望的结果。在某些情况下,多任务和并行处理可能是有利的。此外,所述实施例中的各种系统模块和组件的分散不应被理解为在所有实施例中均需要这样的分散,并且应当理解,所描述的程序组件和系统通常可以一起集成在单个软件产品中,或者封装成多个软件产品。Similarly, while operations are depicted in the figures in a particular order, this should not be construed as requiring that those operations be performed in the particular order shown, or sequentially, or that all illustrated operations be performed, to achieve the desired result. In some cases, multitasking and parallel processing may be advantageous. Furthermore, the separation of various system modules and components in the described embodiments should not be construed as requiring such separation in all embodiments, and it should be understood that the described program components and systems can often be integrated together in a single software product, or packaged into multiple software products.
由此,主题的特定实施例已被描述。其他实施例在所附权利要求书的范围以内。在某些情况下,权利要求书中记载的动作可以以不同的顺序执行并且仍实现期望的结果。此外,附图中描绘的处理并非必需所示的特定顺序或顺次顺序,以实现期望的结果。在某些实现中,多任务和并行处理可能是有利的。Thus, certain embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
以上所述仅为本申请一个或多个实施例而已，并不用以限制本申请，凡在本申请一个或多个实施例的精神和原则之内，所做的任何修改、等同替换、改进等，均应包含在本申请一个或多个实施例保护的范围之内。The above descriptions are merely one or more embodiments of the present application and are not intended to limit the present application. Any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of one or more embodiments of the present application shall fall within the protection scope of one or more embodiments of the present application.

Claims (13)

  1. 一种交通行为识别方法,包括:A traffic behavior recognition method, comprising:
    获取包括一个或多个骑手区域的待识别图像;Obtain an image to be identified including one or more rider areas;
    针对所述一个或多个骑手区域中的任一个骑手区域，执行操作包括：For any one of the one or more rider areas, performing operations comprising:
    在所述待识别图像中,确定与该骑手区域关联的关联车辆区域,该骑手区域包括一辆车辆以及至少一个人体;In the image to be recognized, determine an associated vehicle area associated with the rider area, where the rider area includes a vehicle and at least one human body;
    对该骑手区域进行载人数量识别,得到载人数量识别结果,以及对所述关联车辆区域进行车辆类型识别,得到车辆类型识别结果;Identifying the number of passengers in the rider area to obtain the identification result of the number of passengers, and identifying the vehicle type in the associated vehicle area to obtain the identification result of the vehicle type;
    根据所述载人数量识别结果和所述车辆类型识别结果，确定该骑手区域中的目标骑手是否存在违规载人行为。Determining, according to the identification result of the number of passengers and the vehicle type identification result, whether the target rider in the rider area has the behavior of illegally carrying passengers.
  2. 根据权利要求1所述的方法,其中,在所述待识别图像中,确定与该骑手区域关联的所述关联车辆区域,包括:The method according to claim 1, wherein, in the image to be recognized, determining the associated vehicle area associated with the rider area comprises:
    对所述待识别图像进行检测,得到一个或多个车辆区域以及该骑手区域;Detecting the image to be recognized to obtain one or more vehicle areas and the rider area;
    在所述一个或多个车辆区域中，确定与该骑手区域重合度最大的目标车辆区域，并将所述目标车辆区域确定为与该骑手区域关联的关联车辆区域。Determining, among the one or more vehicle areas, a target vehicle area having the greatest degree of overlap with the rider area, and determining the target vehicle area as the associated vehicle area associated with the rider area.
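The overlap-based association recited in claim 2 can be sketched as an intersection-over-union (IoU) comparison between the rider box and each candidate vehicle box. The following minimal sketch is illustrative only and not part of the claimed subject matter; the `(x1, y1, x2, y2)` box convention and the function names are assumptions:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def associate_vehicle(rider_box, vehicle_boxes):
    """Return the detected vehicle box with the greatest overlap with the rider box."""
    if not vehicle_boxes:
        return None
    return max(vehicle_boxes, key=lambda vb: iou(rider_box, vb))
```

A production system would typically also apply a minimum-overlap cutoff so that a rider with no nearby vehicle is left unassociated.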
  3. 根据权利要求1所述的方法,其中,在所述待识别图像中,确定与该骑手区域关联的所述关联车辆区域,包括:The method according to claim 1, wherein, in the image to be recognized, determining the associated vehicle area associated with the rider area comprises:
    对所述待识别图像进行检测,得到一个或多个车辆区域以及该骑手区域;Detecting the image to be recognized to obtain one or more vehicle areas and the rider area;
    通过预先训练的关联性分数预测模型,确定所述一个或多个车辆区域与该骑手区域之间的关联分数;determining an association score between the one or more vehicle areas and the rider area via a pre-trained association score prediction model;
    在所述一个或多个车辆区域中,确定与该骑手区域关联分数最高的目标车辆区域,并将所述目标车辆区域确定为与该骑手区域关联的关联车辆区域。Among the one or more vehicle areas, a target vehicle area with the highest associated score with the rider area is determined, and the target vehicle area is determined as an associated vehicle area associated with the rider area.
  4. 根据权利要求1所述的方法，其中，对该骑手区域进行载人数量识别，得到载人数量识别结果，包括：The method according to claim 1, wherein performing passenger number identification on the rider area to obtain the identification result of the number of passengers comprises:
    对该骑手区域进行载人数量识别,得到载人数量以及对应的第一置信度;Identify the number of passengers in the rider area to obtain the number of passengers and the corresponding first confidence level;
    响应于所述第一置信度达到第一置信度阈值,将所述载人数量确定为该骑手区域的载人数量识别结果;In response to the first confidence level reaching a first confidence level threshold, determining the number of occupants as an identification result of the number of occupants of the rider area;
    对所述关联车辆区域进行车辆类型识别,得到车辆类型识别结果,包括:Performing vehicle type identification on the associated vehicle area to obtain a vehicle type identification result, including:
    对所述关联车辆区域进行车辆类型识别,得到所述关联车辆区域对应的车辆类型以及对应的第二置信度;Performing vehicle type identification on the associated vehicle area to obtain the vehicle type corresponding to the associated vehicle area and the corresponding second degree of confidence;
    响应于所述第二置信度达到第二置信度阈值,将所述车辆类型确定为所述关联车辆区域的车辆类型识别结果。In response to the second confidence level reaching a second confidence level threshold, the vehicle type is determined as a vehicle type identification result for the associated vehicle area.
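The confidence gating recited in claim 4 amounts to accepting each prediction only when its score reaches the corresponding threshold. The sketch below is illustrative only; the threshold values and the convention of returning `None` for a rejected prediction are assumptions, not part of the application:

```python
def gate(label, confidence, threshold):
    """Accept a predicted label only when its confidence reaches the threshold."""
    return label if confidence >= threshold else None

def fuse_recognitions(passenger_pred, vehicle_pred,
                      passenger_thresh=0.8, vehicle_thresh=0.8):
    """Apply the first/second confidence thresholds to the passenger-count and
    vehicle-type predictions; each prediction is a (label, confidence) pair."""
    count, conf1 = passenger_pred
    vtype, conf2 = vehicle_pred
    return gate(count, conf1, passenger_thresh), gate(vtype, conf2, vehicle_thresh)
```

A rejected (below-threshold) prediction would typically cause the frame to be skipped rather than trigger a possibly wrong violation decision.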
  5. 根据权利要求1-4中任一项所述的方法，其中，根据所述载人数量识别结果和所述车辆类型识别结果，确定该骑手区域中的目标骑手是否存在违规载人行为，包括以下任一项：The method according to any one of claims 1-4, wherein determining, according to the identification result of the number of passengers and the vehicle type identification result, whether the target rider in the rider area has the behavior of illegally carrying passengers comprises any one of the following:
    响应于所述载人数量识别结果为第一识别结果,确定所述目标骑手违规载人;所述第一识别结果表征载人数量达到第一预设数量;In response to the identification result of the number of passengers being a first identification result, it is determined that the target rider is illegally carrying passengers; the first identification result indicates that the number of passengers has reached a first preset number;
    响应于所述载人数量识别结果为第二识别结果，并且所述类型识别结果表征的车辆类型为预设的非机动车类型，确定所述目标骑手违规载人；所述第二识别结果表征载人数量达到第二预设数量，所述第二预设数量小于所述第一预设数量；In response to the identification result of the number of passengers being a second identification result and the vehicle type represented by the type identification result being a preset non-motor vehicle type, determining that the target rider is illegally carrying passengers; the second identification result indicates that the number of passengers reaches a second preset number, the second preset number being smaller than the first preset number;
    响应于所述载人数量识别结果表征的载人数量为所述第二识别结果，并且所述类型识别结果表征的车辆类型不是所述预设的非机动车类型，确定所述目标骑手未违规载人；In response to the number of passengers represented by the identification result being the second identification result and the vehicle type represented by the type identification result not being the preset non-motor vehicle type, determining that the target rider is not illegally carrying passengers;
    响应于所述载人数量识别结果为第三识别结果，确定所述目标骑手未违规载人；所述第三识别结果表征载人数量为第三预设数量，所述第三预设数量小于所述第二预设数量；或者，In response to the identification result of the number of passengers being a third identification result, determining that the target rider is not illegally carrying passengers; the third identification result indicates that the number of passengers is a third preset number, the third preset number being smaller than the second preset number; or,
    响应于所述载人数量识别结果为第四识别结果,确定针对所述目标骑手的交通行为识别无效。In response to the passenger number identification result being the fourth identification result, it is determined that the traffic behavior identification for the target rider is invalid.
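The four branches of claim 5 form a simple decision table over the passenger count and vehicle type. A minimal illustrative sketch follows; the preset numbers (3/2/1), the non-motor type names, the string outcomes, and the use of `None` for the fourth (invalid) identification result are all hypothetical conventions, not values stated in the application:

```python
def check_carrying(passenger_result, vehicle_type,
                   non_motor_types=("bicycle", "electric_bicycle"),
                   first_preset=3, second_preset=2):
    """Map the claim-5 branches to 'illegal', 'legal', or 'invalid'."""
    if passenger_result is None:            # fourth result: recognition is invalid
        return "invalid"
    if passenger_result >= first_preset:    # first result: always illegal
        return "illegal"
    if passenger_result == second_preset:   # second result: depends on vehicle type
        return "illegal" if vehicle_type in non_motor_types else "legal"
    return "legal"                          # third result: below the second preset
```

The design point here is that the same passenger count (e.g., two people) can be legal or illegal depending on the associated vehicle's type, which is why claims 1-4 recognize both quantities before this decision is made.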
  6. 根据权利要求5所述的方法,其中,所述第四识别结果表征所述待识别图像包括以下至少一种场景:The method according to claim 5, wherein the fourth recognition result indicates that the image to be recognized includes at least one of the following scenes:
    骑手推车的场景、A scene in which a rider pushes the vehicle,
    骑手站立在车旁的场景、A scene in which a rider stands beside the vehicle,
    多个骑手相互紧靠的场景、Scenes with multiple riders leaning against each other,
    低清晰度的场景、或low-resolution scenes, or
    车辆被遮挡的场景。A scene where the vehicle is occluded.
  7. 根据权利要求1-6中任一项所述的方法,还包括:The method according to any one of claims 1-6, further comprising:
    响应于该骑手区域中的目标骑手违规载人,发出告警信息。In response to the illegal loading of the target rider in the rider area, a warning message is issued.
  8. 根据权利要求1-6中任一项所述的方法，其中，所述载人数量识别结果通过载人识别网络对该骑手区域进行检测获得，其中，训练所述载人识别网络包括：The method according to any one of claims 1-6, wherein the identification result of the number of passengers is obtained by detecting the rider area through a passenger recognition network, and wherein training the passenger recognition network comprises:
    获取第一训练样本,所述第一训练样本包括多张骑手的样本图像以及对应的载人数量的第一标注信息,所述第一标注信息包括以下标签中的一种:Obtain a first training sample, the first training sample includes a plurality of rider sample images and the corresponding first annotation information of the number of passengers, and the first annotation information includes one of the following labels:
    1人、2人、3人、或无效标签,1 person, 2 persons, 3 persons, or invalid label,
    所述无效标签包括以下中的至少一种:The invalid label includes at least one of the following:
    骑手推车、骑手站立在车旁、多个骑手相互紧靠、低清晰度、或车辆被遮挡;Rider pushing the cart, rider standing next to the cart, multiple riders close to each other, low resolution, or the cart is obscured;
    将所述第一训练样本输入预设的第一初始网络,得到各张样本图像对应的样本载人数量识别结果;Inputting the first training sample into the preset first initial network to obtain the recognition result of the number of people in the sample corresponding to each sample image;
    基于所述样本载人数量识别结果与所述第一标注信息确定第一损失，基于所述第一损失优化所述第一初始网络，得到所述载人识别网络。Determining a first loss based on the sample passenger number identification results and the first annotation information, and optimizing the first initial network based on the first loss to obtain the passenger recognition network.
  9. 根据权利要求1-6中任一项所述的方法,其中,所述车辆识别结果通过车辆识别网络对所述关联车辆区域进行检测获得,其中,训练所述车辆识别网络包括:The method according to any one of claims 1-6, wherein the vehicle recognition result is obtained by detecting the associated vehicle area through a vehicle recognition network, wherein training the vehicle recognition network includes:
    获取第二训练样本,所述第二训练样本包括多张车辆的样本图像以及对应的车辆类型的第二标注信息;Obtaining a second training sample, the second training sample includes a plurality of sample images of vehicles and second labeling information corresponding to the vehicle type;
    将所述第二训练样本输入预设的第二初始网络,得到每张样本图像的样本车辆类型识别结果;Inputting the second training sample into a preset second initial network to obtain a sample vehicle type recognition result of each sample image;
    基于所述样本车辆类型识别结果与所述第二标注信息确定第二损失,基于所述第二损失优化所述第二初始网络,得到所述车辆识别网络。A second loss is determined based on the sample vehicle type identification result and the second label information, and the second initial network is optimized based on the second loss to obtain the vehicle identification network.
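Claims 8 and 9 both describe standard supervised training: compute a loss between the network's prediction and the annotation, then optimize the network with that loss. The numpy sketch below shows one softmax cross-entropy SGD step on a toy linear classifier; the four-class label set (mirroring the 1/2/3-person and invalid labels of claim 8), the feature size, and the learning rate are illustrative assumptions, not the networks described in the application:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(W, x, label, lr=0.5):
    """One SGD step of softmax cross-entropy on a linear classifier W (classes x features)."""
    probs = softmax(W @ x)
    loss = -np.log(probs[label] + 1e-12)
    one_hot = np.eye(W.shape[0])[label]
    W = W - lr * np.outer(probs - one_hot, x)  # gradient of cross-entropy w.r.t. W
    return W, loss

# Toy run: 4 classes (e.g. 1/2/3 people or invalid), 3 features.
W = np.zeros((4, 3))
x = np.array([1.0, 0.0, 1.0])
losses = []
for _ in range(10):
    W, loss = train_step(W, x, label=2)
    losses.append(loss)
```

With zero-initialized weights the first loss is ln 4 (uniform probabilities over four classes), and repeated steps on the same sample drive it down, which is the behavior the first/second loss in claims 8 and 9 is optimized to achieve at scale.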
  10. 一种交通行为识别装置,包括:A traffic behavior recognition device, comprising:
    获取模块,用于获取包括一个或多个骑手区域的待识别图像;An acquisition module, configured to acquire images to be identified including one or more rider areas;
    第一确定模块,用于针对所述一个或多个骑手区域中的任一个骑手区域,在所述待识别图像中,确定与该骑手区域关联的关联车辆区域,该骑手区域包括一辆车辆以及至少一个人体;The first determination module is configured to, for any one of the one or more rider areas, in the image to be recognized, determine an associated vehicle area associated with the rider area, the rider area includes a vehicle and at least one human body;
    识别模块,用于对该骑手区域进行载人数量识别,得到载人数量识别结果,以及对所述关联车辆区域进行车辆类型识别,得到车辆类型识别结果;The identification module is used to identify the number of passengers in the rider area to obtain the identification result of the number of passengers, and to identify the vehicle type in the associated vehicle area to obtain the vehicle type identification result;
    第二确定模块,用于根据所述载人数量识别结果和所述车辆类型识别结果,确定该骑手区域中的目标骑手是否存在违规载人行为。The second determining module is configured to determine whether the target rider in the rider area has illegal passenger behavior according to the identification result of the number of passengers and the identification result of the vehicle type.
  11. 一种电子设备,包括:An electronic device comprising:
    处理器;processor;
    用于存储处理器可执行指令的存储器;memory for storing processor-executable instructions;
    其中,所述处理器通过运行所述可执行指令以实现如权利要求1-9中任一项所述的交通行为识别方法。Wherein, the processor implements the traffic behavior recognition method according to any one of claims 1-9 by running the executable instructions.
  12. 一种计算机可读存储介质,所述存储介质存储有计算机程序,所述计算机程序用于使处理器执行如权利要求1-9中任一项所述的交通行为识别方法。A computer-readable storage medium, the storage medium stores a computer program, and the computer program is used to make a processor execute the traffic behavior recognition method according to any one of claims 1-9.
  13. 一种计算机程序产品,包括存储于存储器中的计算机程序,所述计算机程序指令被处理器执行时实现如权利要求1-9中任一项所述的交通行为识别方法。A computer program product, comprising a computer program stored in a memory, when the computer program instructions are executed by a processor, the traffic behavior recognition method according to any one of claims 1-9 is realized.
PCT/CN2022/087745 2021-07-30 2022-04-19 Traffic behavior recognition method and apparatus, electronic device, and storage medium WO2023005275A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110873586.2A CN113516099A (en) 2021-07-30 2021-07-30 Traffic behavior recognition method and device, electronic equipment and storage medium
CN202110873586.2 2021-07-30

Publications (1)

Publication Number Publication Date
WO2023005275A1 true WO2023005275A1 (en) 2023-02-02

Family

ID=78068130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/087745 WO2023005275A1 (en) 2021-07-30 2022-04-19 Traffic behavior recognition method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN113516099A (en)
WO (1) WO2023005275A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516099A (en) * 2021-07-30 2021-10-19 浙江商汤科技开发有限公司 Traffic behavior recognition method and device, electronic equipment and storage medium
CN114419329B (en) * 2022-03-30 2022-08-09 浙江大华技术股份有限公司 Method and device for detecting number of people carried in vehicle
CN116665140A (en) * 2023-03-08 2023-08-29 深圳市旗扬特种装备技术工程有限公司 Method, device, equipment and storage medium for detecting shared single vehicle-mounted human behavior

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131935A (en) * 2020-08-13 2020-12-25 浙江大华技术股份有限公司 Motor vehicle carriage manned identification method and device and computer equipment
CN112395976A (en) * 2020-11-17 2021-02-23 杭州海康威视系统技术有限公司 Motorcycle manned identification method, device, equipment and storage medium
CN112614102A (en) * 2020-12-18 2021-04-06 浙江大华技术股份有限公司 Vehicle detection method, terminal and computer readable storage medium thereof
CN113516099A (en) * 2021-07-30 2021-10-19 浙江商汤科技开发有限公司 Traffic behavior recognition method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113516099A (en) 2021-10-19

Similar Documents

Publication Publication Date Title
WO2023005275A1 (en) Traffic behavior recognition method and apparatus, electronic device, and storage medium
US10963709B2 (en) Hierarchical machine-learning network architecture
WO2020042984A1 (en) Vehicle behavior detection method and apparatus
WO2020048265A1 (en) Methods and apparatuses for multi-level target classification and traffic sign detection, device and medium
US11232350B2 (en) System and method for providing road user classification training using a vehicle communications network
US9971934B2 (en) System and method for partially occluded object detection
JP5127392B2 (en) Classification boundary determination method and classification boundary determination apparatus
US20170220874A1 (en) Partially occluded object detection using context and depth ordering
WO2019223655A1 (en) Detection of non-motor vehicle carrying passenger
US20170259814A1 (en) Method of switching vehicle drive mode from automatic drive mode to manual drive mode depending on accuracy of detecting object
US11460856B2 (en) System and method for tactical behavior recognition
CN111178286B (en) Gesture track prediction method and device and electronic equipment
Raja et al. SPAS: Smart pothole-avoidance strategy for autonomous vehicles
CN113673533A (en) Model training method and related equipment
CN109383519A (en) Information processing method, information processing system and program
CN111081045A (en) Attitude trajectory prediction method and electronic equipment
US20200160059A1 (en) Methods and apparatuses for future trajectory forecast
He et al. Towards C-V2X Enabled Collaborative Autonomous Driving
JP7269694B2 (en) LEARNING DATA GENERATION METHOD/PROGRAM, LEARNING MODEL AND EVENT OCCURRENCE ESTIMATING DEVICE FOR EVENT OCCURRENCE ESTIMATION
CN116052189A (en) Text recognition method, system and storage medium
WO2021193103A1 (en) Information processing device, information processing method, and program
CN111723601A (en) Image processing method and device
Abou El-Seoud et al. A framework of Malicious Vehicles Recognition in Real Time Foggy Weather
US20230391366A1 (en) System and method for detecting a perceived level of driver discomfort in an automated vehicle
US20230109171A1 (en) Operator take-over prediction

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE