CN108009473A - Video structuring processing method, system and storage device based on target behavior attributes - Google Patents

Video structuring processing method, system and storage device based on target behavior attributes

Info

Publication number
CN108009473A
CN108009473A (application CN201711055281.0A; granted as CN108009473B)
Authority
CN
China
Prior art keywords
target
video
frame
detection
alternatively
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711055281.0A
Other languages
Chinese (zh)
Other versions
CN108009473B (en)
Inventor
谢维信 (Xie Weixin)
王鑫 (Wang Xin)
高志坚 (Gao Zhijian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN201711055281.0A priority Critical patent/CN108009473B/en
Publication of CN108009473A publication Critical patent/CN108009473A/en
Application granted granted Critical
Publication of CN108009473B publication Critical patent/CN108009473B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/71: Indexing; Data structures therefor; Storage structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval of video data
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783: Retrieval using metadata automatically derived from the content
    • G06F16/7837: Retrieval using metadata automatically derived from the content, using objects detected or recognised in the video content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Abstract

The invention discloses a method of video structuring processing based on target behavior attributes. The method includes: obtaining basic target attributes using the YOLO object detection algorithm; obtaining the trajectory information of the detected targets using a multi-target tracking algorithm; extracting abnormal video frames using an abnormal-behavior analysis algorithm based on motion optical-flow features; obtaining, according to a self-defined metadata structure and with the above methods, feature information such as target category attributes and target trajectories; correcting false-detection data present in the extracted metadata using a weighted decision method; and uploading the acquired data to a back-end server for further processing. In this way, the present invention can convert unstructured video data into structured data of practical value, improving the network transmission efficiency of a video surveillance system and reducing the load on the back-end server. The present invention also provides a real-time processing system and a device based on target behavior attributes.

Description

Video structuring processing method, system and storage device based on target behavior attributes
Technical field
The present invention relates to the field of computer vision, and more particularly to a video structuring processing method, system and storage device based on target behavior attributes.
Background art
With the development of intelligent surveillance technology, the processing of video has become particularly important. In the prior art, video is mostly processed with image feature detection methods. However, because video data can be very high-dimensional and contains a large number of redundant and irrelevant features, video processing comes under pressure: video cannot be processed quickly, and the accuracy of the extracted target features is reduced. Therefore, in order to meet the needs of the development of intelligent surveillance technology, a video processing method and video processing system with high real-time performance and high accuracy are needed.
Summary of the invention
The technical problem solved by the present invention is to provide a video structuring processing method, system and storage device based on target behavior attributes, capable of processing video with high real-time performance and high accuracy.
In order to solve the above technical problem, the technical solution adopted by the present invention is to provide a video structuring processing method based on target behavior attributes, comprising the following steps:
slicing a video into single-frame pictures;
performing target detection and recognition on the single-frame pictures;
tracking the targets to obtain tracking results; and/or
performing abnormal-behavior detection on the targets.
In order to solve the above technical problem, another technical solution adopted by the present invention is to provide a video structuring processing system based on target behavior attributes, comprising a processor and a memory electrically connected with each other. The processor is coupled to the memory, executes instructions to implement the above video processing method, and stores the processing results produced during execution of the instructions in the memory.
In order to solve the above technical problem, yet another technical solution adopted by the present invention is to provide a device with a storage function, in which program data is stored; when the program data is executed, the above video processing method is implemented. The beneficial effect of the above technical solutions is: unlike the prior art, the present invention slices video into single-frame pictures, performs target detection and recognition on the single-frame pictures, tracks the recognized targets to obtain tracking results, and performs abnormal-behavior detection on the recognized targets. In this process, structured data is extracted from unstructured video, which effectively achieves high real-time and high-accuracy processing of video data for transmission.
Brief description of the drawings
Fig. 1 is a flow diagram of an embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 2 is a flow diagram of another embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 3 is a flow diagram of a further embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 4 is a flow diagram of a further embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 5 is a flow diagram of a further embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 6 is a flow diagram of a further embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 7 is a flow diagram of a further embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 8 is a flow diagram of an embodiment of step S243 in the embodiment provided in Fig. 7;
Fig. 9 is a schematic diagram of the motion space-time container in an embodiment of the video structuring processing method based on target behavior attributes of the present application;
Fig. 10 is a schematic diagram of an embodiment of the device with a storage function of the present application;
Fig. 11 is a schematic structural diagram of an embodiment of the video structuring processing system based on target behavior attributes of the present application.
Detailed description of the embodiments
Hereinafter, exemplary embodiments of the present application will be described with reference to the accompanying drawings. For the sake of clarity and brevity, well-known functions and constructions are not described in detail. The terms described below, which are defined in view of their functions in the present application, may differ according to the intention or practice of users and operators. Therefore, these terms should be defined on the basis of the disclosure throughout this specification.
Referring to Fig. 1, which is a flow diagram of a first embodiment of the video surveillance method based on video structured data and deep learning of the present invention, the method includes:
S10: Read video.
Optionally, reading video includes reading real-time video captured by a camera and/or reading video data recorded and saved in advance. The camera capturing real-time video may be a USB camera, an IP camera based on the RTSP protocol stream, or another type of camera.
In one embodiment, the video that is read is video captured in real time by a USB camera or by an IP camera based on the RTSP protocol stream.
In another embodiment, the video that is read is video recorded and saved in advance, read from local storage or from an external storage device such as a USB flash drive or hard disk, or transferred from the network; these cases are not described in detail one by one here.
S20: Perform structuring processing on the video to obtain structured data.
Optionally, performing structuring processing on the video to obtain structured data specifically means converting the unstructured video data read in step S10 into organized data. Specifically, structured data refers to the data that matters most for subsequent analysis. Optionally, the structured data includes at least one of: the position of a target, the target category, target attributes, the target's motion state, the target's motion trajectory, and the time of the target's appearance. It can be understood that the structured data may also include other categories of information needed by the user (the person using the method or system described in the present invention); other data is either not particularly important, or can be mined from related information such as the structured data. Which information the structured data specifically includes depends on the particular demands. How the video is processed to obtain structured data is explained in detail below.
S30: Upload the structured data to a cloud server, and analyze the structured data in depth to obtain a preset result.
Optionally, after step S20 performs structuring processing on the video, the obtained structured data is uploaded to the cloud server and stored in the storage area of the cloud server.
In one embodiment, the data obtained by the video structuring processing is saved directly to the storage area of the cloud server, both to keep archives and to serve as a database for improving the system.
Optionally, after step S20 processes the video, the obtained structured data is uploaded to the cloud server, and the cloud server further analyzes this structured data in depth.
Optionally, the cloud server performs further in-depth analysis on the structured data uploaded from each monitoring node, where the in-depth analysis includes target trajectory analysis and target flow analysis, or other required analysis; targets include at least one of people, vehicles, animals and the like.
In one embodiment, the further in-depth analysis that the cloud server performs on the structured data uploaded from each monitoring node is trajectory analysis: according to the pattern of an uploaded target's trajectory and its dwell time in the scene, it is determined whether the target is suspicious, whether the target lingers in a certain region for a long time, and whether abnormal behavior such as area intrusion occurs.
In another embodiment, the further in-depth analysis that the cloud server performs on the structured data uploaded from each monitoring node is target flow analysis: according to the structured data uploaded by each monitoring point, the targets appearing at a certain monitoring point are counted, and the flow of targets at that monitoring node in each time period is obtained by statistics. The targets may be pedestrians and vehicles, and the peak or off-peak periods of target flow can also be obtained. The computed target-flow data provides a reference for reasonably prompting pedestrians and drivers to avoid rush hours, or for public resources such as lighting.
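A minimal sketch of the per-period flow statistic described above, assuming each structured record carries the target's entry timestamp (the field name enter_time is illustrative, not fixed by the patent):

```python
from collections import Counter
from datetime import datetime

def hourly_target_flow(records):
    """Count how many targets entered a monitoring node in each hour.

    records: iterable of dicts with an 'enter_time' UNIX timestamp,
    as produced by the structuring step (field name assumed).
    """
    counts = Counter()
    for rec in records:
        hour = datetime.fromtimestamp(rec["enter_time"]).strftime("%Y-%m-%d %H:00")
        counts[hour] += 1
    return counts

# The rush hour falls out directly, e.g.: max(counts, key=counts.get)
```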
By processing video into the structured data that is critical for in-depth analysis, and then uploading only the structured data to the cloud rather than transmitting the whole video to the cloud, this method solves the problems of heavy network transmission pressure and high data-traffic cost.
In one embodiment, according to an advance setting, when each monitoring node uploads the structured data obtained by the video structuring processing system based on target behavior attributes (hereinafter referred to as: the video processing system) to the cloud server, the cloud server analyzes the structured data in depth after storing it.
In another embodiment, when each monitoring node uploads the structured data obtained by the video processing system to the cloud server, the server asks the user to choose whether to perform in-depth analysis after storing the structured data.
In yet another embodiment, whenever necessary, the user can take structured data that already completed one round of in-depth analysis when it was initially uploaded, and run the configured in-depth analysis on it again.
Optionally, the in-depth analysis performed on the structured data uploaded by each monitoring node further comprises: computing statistics on the structured data, analyzing it to obtain the behavior types and abnormal behaviors of one or more targets, raising an alarm for abnormal behaviors, and so on, or other analysis and processing content that the user needs.
How the video is processed into structured data is described in detail below; that is, the present application also provides a method of video structuring processing based on target behavior attributes. In one embodiment, the video structuring processing uses an intelligent analysis module embedding a deep-learning target detection and recognition algorithm, a multi-target tracking algorithm, an abnormal-behavior recognition algorithm based on motion optical-flow features and the like, to convert the unstructured video data read in step S10 into structured data.
Referring to Fig. 2, which is a flow diagram of an embodiment of the method of video structuring processing based on target behavior attributes provided by the present application; this is also step S20 of the above embodiment, which includes steps S22 to S23.
S22: Perform target detection and recognition on the single-frame picture.
Optionally, step S22 performs target detection and recognition on a single-frame picture, where target detection and recognition includes pedestrian detection and recognition, vehicle detection and recognition, animal detection and recognition, and so on.
Optionally, the step in which step S22 performs target detection and recognition on the single-frame picture includes: extracting the feature information of the targets in the single-frame picture, i.e. extracting the feature information of all targets in the single-frame picture, such as the category of each target and its position, where the targets may be pedestrians, vehicles, animals and so on.
In one embodiment, when the single-frame picture contains only pedestrians, target detection and recognition is the detection and recognition of pedestrians, extracting the feature information of all pedestrians in the picture.
In another embodiment, when the single-frame picture contains targets of multiple categories such as pedestrians and vehicles, target detection and recognition detects and recognizes the various categories such as pedestrians and vehicles, i.e. extracts the feature information of the pedestrians, vehicles and so on in the single-frame picture. It can be understood that the categories of targets to be recognized can be specified by the user.
Optionally, the algorithm used by step S22 to perform target detection and recognition on the single-frame picture is an optimized deep-learning object detection algorithm. Specifically, the YOLOv2 deep-learning object detection framework can be used for target detection and recognition; the core of this algorithm is to use the whole image as the network input and to directly regress, at the output layer, the position of each bounding box and the category the bounding box belongs to.
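As one possible realization of this detection step (not the patent's own implementation), the sketch below runs a Darknet-format YOLOv2 model through OpenCV's DNN module; the file paths, input size and confidence threshold are assumptions:

```python
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov2.cfg", "yolov2.weights")  # paths assumed

def detect_targets(frame, conf_thresh=0.5):
    """The whole image is the network input; the output layer directly
    regresses bounding-box positions and per-class scores."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes = []
    for row in net.forward():      # rows: [cx, cy, bw, bh, objectness, class scores...]
        scores = row[5:]
        cls = int(np.argmax(scores))
        if scores[cls] > conf_thresh:
            cx, cy, bw, bh = row[:4] * np.array([w, h, w, h])
            boxes.append((int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh), cls))
    return boxes                   # (x, y, w, h, class) per detected target
```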
Optionally, target detection consists of two parts: model training and model testing.
In one embodiment, for model training, 50% of the pedestrian or vehicle images are taken from the VOC and COCO datasets, and the remaining 50% of the data comes from real surveillance data of streets, indoor corridors, squares and the like. It can be understood that the ratio of data from the public datasets (the VOC and COCO datasets) to data from real surveillance data used in model training can be adjusted as needed: when the proportion taken from the public datasets is higher, the precision of the resulting model under real surveillance scenes is comparatively slightly worse; conversely, when the proportion taken from real surveillance data is higher, the precision improves comparatively.
Optionally, in one embodiment, after step S22 detects a target in the single-frame picture, the target is placed into the tracking queue (hereinafter also called the tracking chain), and a target tracking algorithm can then be used to perform the preset tracking and analysis of the target.
Optionally, before the above step of extracting the feature information of targets in the single-frame picture, the method further comprises: constructing a metadata structure. Optionally, the targets' feature information is extracted according to the metadata structure, i.e. the feature information of the targets in the single-frame picture is extracted according to the metadata structure. In one embodiment, the metadata structure includes basic attribute units of a pedestrian, such as at least one of: the camera address, the times the target enters and leaves the camera, the target's trajectory information at the current monitoring node, the colors the target wears, and snapshots of the target. For example, the metadata structure of a pedestrian may be as shown in Table 1 below; the metadata structure may also include information needed by the user but not included in the table.
Optionally, in one embodiment, in order to save network transmission resources, the metadata structure contains only some basic attribute information; other attributes can be obtained by mining computations on related information such as the target trajectory.
Table 1: Metadata structure of a pedestrian
Property name | Type | Description
Camera ID | short | Camera node serial number
Target appearance time | long | Time the target enters the monitoring node
Target departure time | long | Time the target leaves the monitoring node
Target trajectory | point | Movement trajectory of the target at the current node
Target ID | short | Target ID identification number
Target jacket color | short | One of 10 predefined colors
Target trousers color | short | One of 5 predefined colors
Target whole-body snapshot | image | Records a whole-body snapshot of the target
Target head-and-shoulder snapshot | image | Records a head snapshot of the target
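The pedestrian metadata of Table 1 maps naturally onto a record type; a sketch, with field names translated from the table and Python types standing in for its short/long/point/image annotations:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PedestrianMetadata:
    camera_id: int                     # short: camera node serial number
    enter_time: int                    # long: time the target enters the node
    leave_time: int                    # long: time the target leaves the node
    trajectory: List[Tuple[float, float]] = field(default_factory=list)  # point list
    target_id: int = 0                 # target ID identification number
    jacket_color: int = 0              # index into 10 predefined colors
    trousers_color: int = 0            # index into 5 predefined colors
    full_snapshot: bytes = b""         # image: whole-body crop
    head_shoulder_snapshot: bytes = b""  # image: head-and-shoulder crop
```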
In another embodiment, the metadata structure may also include basic attribute information of a vehicle, such as at least one of: the camera address, the times the target enters and leaves the camera, the target's trajectory information at the current monitoring node, the appearance color of the target, the target's license plate number, and a snapshot of the target.
It can be understood that which information the metadata structure specifically includes, and how the data type of each metadata item is defined, are set initially as needed, or, after the initial setting, designate the particular attribute information that needs to be obtained from among the configured items according to the needs of the user.
In one embodiment, the metadata structure is initially set with categories such as the camera address, the times the target enters and leaves the camera, the target's trajectory information at the current monitoring node, the colors the target wears, and snapshots of the target; when performing target recognition, the user can, according to his own needs, specify obtaining the times the target enters and leaves the camera.
In one embodiment, when the target in the single-frame picture is a pedestrian, the pedestrian's feature information is extracted according to the preset metadata structure of a pedestrian, i.e. at least one of: the times the pedestrian enters and leaves the camera, the address of the camera the pedestrian is currently at, the pedestrian's trajectory information at the current monitoring node, the colors the pedestrian wears, and the pedestrian's current snapshot; or other target attribute information specially designated by the user, such as the times the pedestrian enters and leaves the camera and the colors the pedestrian wears.
Optionally, when a target is detected and recognized from the single-frame picture, the target's feature information is obtained and, at the same time, the image of the target is cropped out of the original video frame; model training is then performed using the YOLOv2-based framework (YOLOv2 is a deep-learning-based object detection and recognition method proposed by Joseph Redmon in 2016).
In one embodiment, when the target detected in the single-frame picture is a pedestrian, the image of the detected pedestrian is cropped out of the original video frame; then head-and-shoulder, upper-body and lower-body detection models trained with the YOLOv2-based framework cut the pedestrian into body parts, determine the clothing color information of the upper-body and lower-body parts, and crop out the pedestrian's head-and-shoulder picture.
In another embodiment, when the target detected in the single-frame picture is a vehicle, the image of the detected vehicle is cropped out of the original video frame; then a vehicle detection model trained with the YOLOv2-based framework detects and recognizes the vehicle, determines its body appearance color, recognizes the license plate information, and crops out the picture of the vehicle. It can be understood that, because the categories of targets to be recognized can be selected by the user, the administrator decides whether detection and recognition of vehicles is carried out.
In yet another embodiment, when the target detected in the single-frame picture is an animal, the image of the detected animal is cropped out of the original video frame; then an animal detection model trained with the YOLOv2-based framework detects and recognizes the animal, determines information such as its appearance color and species, and crops out the picture of the animal. It can be understood that, because the categories of targets to be recognized can be selected by the user, the user decides whether detection and recognition of animals is carried out.
Optionally, target detection and recognition can be performed on one single-frame picture at a time, or on multiple single-frame pictures at the same time.
In one embodiment, one single-frame picture undergoes target detection and recognition at a time, i.e. each round performs target detection and recognition only on the targets in one single-frame picture.
In another embodiment, target detection and recognition can be performed on multiple pictures at a time, i.e. each round performs target detection and recognition on the targets in multiple single-frame pictures simultaneously.
Optionally, after model training with the YOLOv2-based framework, the detected targets are given ID (identity) labels to facilitate association in subsequent tracking. The ID numbers of the different target categories can be preset, and the upper limit of the ID numbers is set by the user.
Optionally, the targets recognized by detection are given ID labels automatically, or ID labels are assigned manually.
In one embodiment, the targets recognized by detection are labeled, where the form of the assigned ID number differs according to the category of the detected target; for example, the ID number of a pedestrian can be set as digit+digit, of a vehicle as uppercase letter+digit, and of an animal as lowercase letter+digit, which is convenient for association in subsequent tracking. The rules can be set according to the user's habits and preferences and are not repeated one by one here.
In another embodiment, the targets recognized by detection are labeled, where the interval that the target's ID number belongs to differs according to the category of the detected target. For example, the ID labels of detected pedestrian targets are set in the interval 1 to 1000000, and the ID labels of detected vehicle targets are set in the interval 1000001 to 2000000. The specific intervals depend on the initial setting and can also be adjusted and changed as needed.
Optionally, ID labeling of detected targets can be completed automatically by the system through presets, or performed manually by the user.
In one embodiment, when a pedestrian or vehicle target is detected and recognized in the single-frame picture, the system can automatically assign an ID label to the detected target, according to the category of the target and the ID numbers labeled before.
In another embodiment, the user manually assigns ID labels to the targets in the picture: the user can independently assign ID labels to targets in single-frame pictures not automatically labeled by the system, to targets that were missed, or to other targets outside the preset detection target categories.
Optionally, referring to Fig. 3, in one embodiment, before step S22 performs target detection and recognition on the single-frame picture, the method further includes:
S21: Slice the video into single-frame pictures.
Optionally, the step of slicing the video into single-frame pictures slices the video read in step S10 into single-frame pictures, in preparation for step S22.
Optionally, in one embodiment, the step of slicing the video into single-frame pictures cuts the video read in step S10 by equidistant frame skipping or by non-equidistant frame skipping.
In one embodiment, the step of slicing the video into single-frame pictures cuts the video read in step S10 by equidistant frame skipping: the number of frames skipped is the same each time, i.e. the same number of frames is skipped at equal intervals when cutting into single-frame pictures, where the skipped frames are frames that contain no important information and can be ignored. For example, skipping 1 frame at a time at equal intervals, the video is sliced by taking frame t, frame t+2, frame t+4, and so on; the skipped frames are frame t+1, frame t+3, and so on. The skipped frames are, by judgment, frames containing no important information, or frames that duplicate (or nearly duplicate) the frames taken.
In another embodiment, the step of slicing the video into single-frame pictures cuts the video read in step S10 by non-equidistant frame skipping: the numbers of frames skipped can differ, i.e. different numbers of frames are skipped at unequal intervals when cutting into single-frame pictures, where the skipped frames are frames judged to contain no important information and therefore negligible. For example, in non-equidistant frame-skipping cutting, frame t is taken, then 2 frames are skipped and frame t+3 is taken, then 1 frame is skipped and frame t+5 is taken, then 3 frames are skipped and frame t+9 is taken; the skipped frames are frames t+1, t+2, t+4, t+6, t+7 and t+8 respectively, which are judged not to contain the information needed for this analysis.
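A minimal sketch of the equidistant frame-skipping cut using OpenCV; the step size is a parameter (the example above, skipping 1 frame at a time, corresponds to step=2):

```python
import cv2

def slice_video(path, step=2):
    """Yield every step-th frame as a single-frame picture, skipping
    the intermediate frames judged to carry no important information."""
    cap = cv2.VideoCapture(path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:        # take frames t, t+step, t+2*step, ...
            yield idx, frame
        idx += 1
    cap.release()
```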
In various embodiments, the step of slicing the video into single-frame pictures can be carried out by the system automatically cutting the read video into single-frame pictures, or the user chooses whether to slice the video into single-frame pictures; the user may also manually input single-frame pictures cut in advance.
Optionally, in one embodiment, after the step of slicing the video into single-frame pictures is completed, step S22 is performed automatically on the single-frame pictures obtained by cutting, i.e. target detection and recognition is performed on the single-frame pictures obtained by cutting; or the user selects whether to perform the target detection and recognition of step S22 on the single-frame pictures obtained by cutting.
Optionally, during the detection and recognition of targets, statistics can be computed, according to certain rules, over the values detected and recognized for each target.
In one embodiment, after step S22, for a certain detected target, statistics are kept of the total number of frames at the current monitoring node (the total number of frames in which it appears), the number of frames in which the detected value is A, the number of frames in which the detected value is B, and so on (there can be one or many detected values, subject to the detection results), and the statistical results are saved for later use.
Alternatively, the method for correction is broadly divided into trajectory corrector and objective attribute target attribute correction.
Alternatively, after obtaining the structural data of each target to target detection, resulting structures data are carried out Correction.It is being corrected to the flase drop data in structural data, correction is voted according to weight ratio, final most The data value of probability be exact value, the data value of a small number of results is flase drop value.
In one embodiment, (call above-mentioned statistical result) after statistics calculates, it is found that detection recognizes certain in step S22 The frame number occurred in current monitor node of one target is 200 frames, wherein there is 180 frames to detect that the jacket color of the target is red Color, the jacket color that the target is detected in 20 frames is black, is voted according to weight ratio, the correction of a final proof target it is accurate It is worth jacket color for red, and corresponding value in structural data is revised as red, is finally completed correction.
Optionally, trajectory correction is specifically as follows: suppose a target appears under a certain monitoring scene for a duration of T frames, so its track point set can be obtained as G = {p1, p2, ..., pN}. The mean and deviation of the track points on the X axis and the Y axis are calculated, and then abnormal and noise track points are rejected. The expression is:
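(The expression itself does not survive in this text; the following is a plausible reconstruction from the definitions above, with the rejection threshold k an assumption.)

$$ \mu_X = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma_X = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu_X\right)^2} $$

with \mu_Y and \sigma_Y defined likewise for the Y axis; a track point p_i = (x_i, y_i) in G is rejected as abnormal or noise when |x_i - \mu_X| > k\sigma_X or |y_i - \mu_Y| > k\sigma_Y.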
In one embodiment, trajectory correction rejects the track points deviating from the mean, reducing noise-point interference.
Optionally, target attribute correction is specifically as follows: target attribute correction corrects the attribute values of the same target based on a weighting criterion. Suppose the jacket-color labels detected for some target are label = {"red", "black", "white", ...}, i.e. there are T observations of the attribute value, one per frame. They are first converted into a numeric encoding M = [m1, m2, m3, ..., mT]; then the encoded value m_x with the highest frequency and its frequency F are obtained; finally the target's attribute value Y (the exact value) is output directly. With 1 denoting a vector of ones, the expressions are as follows:

$$ F = T - \|M - m_x \mathbf{1}\|_0 $$

$$ Y = \mathrm{label}[m_x] $$

where the L0 norm counts the nonzero entries, so F is the number of frames whose detection agrees with m_x. The above must satisfy that m_x is the most frequent value, i.e.

$$ x = \arg\max_{j \in \{1,\dots,T\}} \left( T - \|M - m_j \mathbf{1}\|_0 \right). $$
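A minimal sketch of this majority-vote attribute correction, using the jacket-color example from the embodiment above:

```python
from collections import Counter

def correct_attribute(labels):
    """Weighted-vote correction: the most frequent detected value is the
    exact value; minority values are treated as false detections."""
    (value, freq), = Counter(labels).most_common(1)
    return value, freq

# The 180-red / 20-black example above yields ("red", 180):
print(correct_attribute(["red"] * 180 + ["black"] * 20))
```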
Optionally, in one embodiment, the present invention combines the YOLO object detection framework to perform target recognition and localization, and uses a GoogLeNet network to extract each target's feature vector for subsequent target matching. GoogLeNet is a 22-layer-deep CNN neural network proposed by Google in 2014 and widely used in fields such as image classification and recognition. Since the feature vectors extracted by a deep, many-layered learning network have good robustness and discriminability, the above steps can considerably improve the accuracy of the subsequent tracking of targets.
S23: Track the targets to obtain tracking results.
Optionally, in the step of tracking the detected targets to obtain tracking results, the targets tracked are the targets detected in step S22 or other targets specially designated by the user. Step S23 further comprises: tracking a target, and recording the times the target enters or leaves the monitoring node and each position the target passes through, to obtain the target's motion trajectory. As to how specifically targets are tracked to obtain tracking results, the present application provides, on this basis, a multi-target tracking method improved upon KCF and Kalman, elaborated below.
In another embodiment, the video processing method provided by the present application further comprises step S24 on the basis of steps S21, S22 and S23 in the above embodiment; or the embodiment includes only steps S21, S22 and S24, referring to Fig. 4 and Fig. 5. It can be understood that the video structuring processing method based on target behavior attributes (video structuring processing for short) converts video data into structured data, where the specific conversion process includes: target detection and recognition, target trajectory tracking and extraction, and target abnormal-behavior detection. In one embodiment, the video structuring processing includes target detection and recognition and target trajectory extraction. In another embodiment, the video structuring processing includes target detection and recognition, target trajectory extraction and target abnormal-behavior detection.
S24: Perform abnormal-behavior detection on the targets.
Optionally, step S24 is the operation of performing abnormal-behavior detection on the targets detected and recognized in the above steps.
Optionally, abnormal-behavior detection includes pedestrian abnormal-behavior detection and vehicle abnormal-behavior detection, where pedestrian abnormal behaviors include running, fighting and rioting, and traffic abnormal behaviors include collisions, speeding and so on.
By processing the video with the above method to obtain the significant data, an excessive data volume can be avoided, greatly reducing the pressure of network transmission.
In one embodiment, when abnormal-behavior detection is performed on the pedestrian targets detected in the preceding steps, if at least a preset number of people at one monitoring node are determined to be running, a crowd riot can be determined. For example, it can be set that when step S24 determines 10 people are running abnormally, a crowd riot is determined to occur; in other embodiments, the headcount threshold for determining a riot depends on the specific situation.
In another embodiment, it can be set that when step S24 determines 2 vehicles have a collision anomaly, a traffic accident is determined to occur, and when step S24 determines more than 3 vehicles exhibit collision abnormal behavior, a major traffic accident is determined to occur. It can be understood that the vehicle counts used in the determination can be adjusted as needed.
In yet another embodiment, when step S24 detects that a vehicle's speed exceeds a preset speed value, the vehicle can be determined to be a speeding vehicle, a snapshot of the video corresponding to the vehicle can be saved, and the vehicle's information, including the license plate number, is recognized.
Optionally, in one embodiment, when step S24 detects abnormal behavior, the monitoring node can perform audible and visual alarm processing.
In one embodiment, the content of the audible and visual alarm includes broadcasting voice prompts, such as "Please don't crowd, mind your safety!" or other preset voice prompts; the audible and visual alarm further includes turning on the warning lamp of the corresponding monitoring node to remind passing crowds and vehicles to take care.
Optionally, the severity grade of the abnormal behavior is set according to the number of people exhibiting the abnormal behavior, and different severity grades correspond to different emergency handling measures. The severity grades of abnormal behavior can be divided into yellow, orange and red. The emergency measure corresponding to yellow-grade abnormal behavior is to raise an audible and visual alarm; for orange-grade abnormal behavior, it is to raise an audible and visual alarm while contacting the security personnel responsible for the monitoring point; for a red-grade warning, the measures are to raise an audible and visual alarm, contact the security personnel responsible for the monitoring point, and at the same time promptly report to the police online.
In one embodiment, when the number of people exhibiting abnormal behavior is 3 or fewer, it is set as yellow-grade crowd abnormal behavior; when the number is more than 3 and at most 5, orange-grade crowd abnormal behavior; and when the number is more than 5, red-grade crowd abnormal behavior. The specific numbers can be adjusted according to actual needs and are not repeated one by one here.
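A sketch of this severity grading; the headcount thresholds 3 and 5 follow the embodiment above, and the returned strings name the color grades:

```python
def riot_severity(num_abnormal):
    """Map the number of people behaving abnormally to an alert grade."""
    if num_abnormal <= 3:
        return "yellow"   # audible and visual alarm only
    if num_abnormal <= 5:
        return "orange"   # alarm plus contacting on-site security
    return "red"          # alarm, on-site security and online police report
```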
Optionally, in one embodiment, after the step of performing abnormal-behavior detection on the targets, the method further comprises: if abnormal behavior is detected, saving a snapshot of the current video frame image and transmitting it, together with the feature information of the detected targets exhibiting the abnormal behavior, to the cloud server.
Optionally, the feature information corresponding to the targets exhibiting abnormal behavior can include information such as the camera ID, the abnormal event type, the time the abnormal behavior occurs and a snapshot of the abnormal behavior, and can also include other kinds of required information. The metadata structure of the abnormal behavior sent to the cloud server includes the structure in Table 2 below, and can include information of other categories.
Table 2: Metadata structure of an abnormal behavior
Property name | Data type | Description
Camera ID | short | Camera unique ID
Abnormal event type | short | Two predefined kinds of abnormal behavior
Abnormal occurrence time | long | Time the abnormal situation occurs
Abnormal situation snapshot | image | Records a snapshot of the abnormal behavior
In one embodiment, when abnormal-behavior detection is performed on the targets and pedestrians are detected engaging in the abnormal behavior of fighting, the corresponding snapshot of the current video frame image is saved, and the snapshot is transmitted to the cloud server together with the structured data corresponding to the targets exhibiting the behavior. While the snapshot of the detected abnormal behavior is being sent to the cloud server, this monitoring node performs audible and visual alarm processing and starts the corresponding emergency measures according to the grade of the abnormal behavior.
In another embodiment, when abnormal-behavior detection is performed on the targets and a crowd riot is detected, a snapshot of the current video frame image is saved and sent to the cloud server for further processing by the cloud server; at the same time the monitoring node raises an audible and visual alarm and starts the corresponding emergency measures according to the grade of the abnormal behavior.
Specifically, in one embodiment, the step of performing abnormal-behavior detection on the targets includes: extracting the optical-flow motion information of multiple feature points of one or more targets, and performing clustering and abnormal-behavior detection according to the optical-flow motion information. Based on this, the present application also provides an abnormal-behavior detection method based on clustered optical-flow features, elaborated below.
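As a sketch of the optical-flow extraction this step relies on (the patent's clustering analysis itself is elaborated elsewhere), the following computes dense Farneback optical flow with OpenCV and flags frame pairs whose mean flow magnitude spikes; the threshold is an assumption:

```python
import cv2
import numpy as np

def flow_anomalies(frames, thresh=4.0):
    """Yield (index, mean flow magnitude, anomaly flag) per frame pair;
    a mean magnitude far above normal motion is a crude running/riot cue."""
    prev = None
    for i, frame in enumerate(frames):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            mag = float(np.linalg.norm(flow, axis=2).mean())
            yield i, mag, mag > thresh
        prev = gray
```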
The above method of video structuring processing based on target behavior attributes can convert unstructured video data into structured data, improving the real-time performance of video processing and analysis.
Referring to Fig. 6, which is a flow diagram of an embodiment of a multi-target tracking method improved upon KCF and Kalman, also provided by the present application; this is also step S23 in the above embodiment and specifically includes the following steps S231 to S234:
S231: Combine the tracking chain and the detection boxes corresponding to the first plurality of targets in the previous frame picture to predict the tracking box of each of the first plurality of targets in the current frame.
Optionally, the tracking chain is computed from the tracking of multiple targets in all, or a continuous part of, the single-frame pictures cut from the video before the current frame picture, gathering the trajectory information and empirical values of the multiple targets in all previous pictures.
In one embodiment, the tracking chain is computed from the target tracking of all pictures before the current frame picture and contains all the information of all targets in all frame pictures before the current frame picture.
In another embodiment, the tracking chain is computed from the target tracking of a continuous part of the pictures before the current frame picture, where the more continuous pictures enter the tracking computation, the higher the accuracy of the prediction.
Optionally, combining the targets' feature information in the tracking chain and the detection boxes corresponding to the first plurality of targets in the previous frame picture, the tracking boxes of the tracked first plurality of targets in the current frame picture are predicted, e.g. predicting the positions where the first plurality of targets are likely to appear in the current frame.
In one embodiment, the above step predicts the positions of the tracking boxes of the first plurality of targets in the current frame, i.e. obtains the predicted values of the first plurality of targets.
In another embodiment, the above step predicts the positions of the tracking boxes of the first plurality of targets in the frame following the current frame, where the error of the tracking-box positions predicted for the frame after the current frame is larger than that of the tracking-box positions predicted for the current frame.
Optionally, the first plurality of targets refers to all targets detected in the previous frame picture.
S232: Obtain the tracking boxes, in the current frame, corresponding to the first plurality of targets in the previous frame picture, and the detection boxes of the second plurality of targets in the current frame picture.
Specifically, the second plurality of targets refers to all targets detected in the current frame picture.
Optionally, the tracking boxes in the current frame corresponding to the first plurality of targets in the previous frame picture, and the detection boxes of the second plurality of targets in the current frame picture, are obtained, where a tracking box is a rectangular box (or a box of another shape) at the position where one of the first plurality of targets is predicted to appear in the current frame; a box contains one or more targets.
Optionally, when obtaining the tracking boxes in the current frame corresponding to the first plurality of targets in the previous frame picture and the detection boxes of the second plurality of targets in the current frame picture, the obtained tracking boxes and detection boxes contain the feature information of the targets they correspond to, such as the targets' position information, color features, texture features, etc. Optionally, the corresponding feature information can be set by the user as needed.
S233: Establish a target association matrix between the tracking boxes of the first plurality of targets in the current frame and the detection boxes of the second plurality of targets in the current frame.
Optionally, the target association matrix is established from the tracking boxes in the current frame corresponding to the first plurality of targets in the previous frame picture, obtained in step S232, and the detection boxes corresponding to the second plurality of targets detected in the current frame picture.
In one embodiment, if the number of the first plurality of targets in the previous frame picture is N and the number of targets detected in the current frame is M, a target association matrix W of size M x N is established, where:
the value of A_ij (0 < i <= M; 0 < j <= N) is determined by dist(i,j), IOU(i,j) and m(i,j). Specifically, let I_W and I_h be the width and height of the picture frame, and let dist(i,j) be the centroid distance between the next-frame tracking box predicted for the j-th target in the tracking chain obtained from the previous frame and the detection box of the i-th target obtained by detection and recognition in the current frame. Then the centroid distance normalized by half the picture-frame diagonal is

$$ d(i,j) = \frac{\mathrm{dist}(i,j)}{\tfrac{1}{2}\sqrt{I_W^2 + I_h^2}} $$

and m(i,j) is the Euclidean distance between the two targets' feature vectors,

$$ m(i,j) = \left\| F_{M_i} - F_{N_j} \right\|_2 $$

where F_{M_i} and F_{N_j} are the feature vectors extracted with the GoogLeNet network; feature extraction with a CNN framework model is more robust and discriminative than traditional hand-crafted feature extraction. The purpose of the normalization is mainly to ensure that d(i,j) and IOU(i,j) have a consistent influence on A_ij. IOU(i,j) denotes the overlap ratio between the tracking box, predicted in the current frame, of the j-th target in the previous frame's tracking chain and the detection box of the i-th target obtained by detection and recognition in the current frame, i.e. the intersection of the above tracking box and detection box over their union:

$$ \mathrm{IOU}(i,j) = \frac{|T_j \cap D_i|}{|T_j \cup D_i|} $$

where T_j is the predicted tracking box and D_i the detection box.
Optionally, the value range of IOU(i,j) is 0 <= IOU(i,j) <= 1; the larger the value, the larger the overlap ratio between the above tracking box and detection box.
In one embodiment, when a target is static, the centroid positions detected for the same target in two consecutive frames should be at the same point or deviate very little, so the value of IOU should be approximately 1 and d(i,j) should also tend to 0; the value of A_ij is therefore smaller, and when the targets match, the value of m(i,j) is smaller, so during matching the possibility that the target with ID = j in the tracking chain successfully matches the detected target with ID = i in the detection chain is larger. If the positions of the same target's detection boxes in the two consecutive frames are far apart, with no overlap, then IOU should be 0 and the value of m(i,j) larger, so the value of d(i,j) is larger, and the possibility that the target with ID = j in the tracking chain successfully matches the detected target with ID = i in the detection chain is smaller.
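A sketch of the association-matrix construction: the IOU and the normalized centroid distance follow the definitions above, while the way the three terms combine into A_ij is an assumed sum chosen to behave as just described (the cost grows with d(i,j) and m(i,j) and shrinks as IOU grows):

```python
import numpy as np

def iou(a, b):
    """Overlap ratio of two (x, y, w, h) boxes: intersection over union."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[0] + a[2], b[0] + b[2]), min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def association_matrix(det_boxes, trk_boxes, det_feats, trk_feats, frame_w, frame_h):
    """Cost matrix A of size M x N (detections x tracks); lower is better."""
    half_diag = 0.5 * np.hypot(frame_w, frame_h)
    A = np.zeros((len(det_boxes), len(trk_boxes)))
    for i, (db, df) in enumerate(zip(det_boxes, det_feats)):
        for j, (tb, tf) in enumerate(zip(trk_boxes, trk_feats)):
            dist = np.hypot(db[0] + db[2] / 2 - tb[0] - tb[2] / 2,
                            db[1] + db[3] / 2 - tb[1] - tb[3] / 2)
            d = dist / half_diag                      # normalized centroid distance
            m = float(np.linalg.norm(np.asarray(df) - np.asarray(tf)))  # feature distance
            A[i, j] = d + (1.0 - iou(db, tb)) + m     # assumed combination
    return A
```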
Optionally, besides the centroid distance, the IOU and the Euclidean distance of the targets' feature vectors, the establishment of the target association matrix can also refer to other feature information of the targets, such as color features, texture features, etc. It can be understood that when more indicators are referenced, the accuracy is higher, but the real-time performance drops somewhat accordingly because of the increased computation.
Optionally, in one embodiment, when good real-time performance needs to be guaranteed, in most cases the target association matrix is established only with reference to the position information of the targets in the two frame images.
In one embodiment, the target association matrix between the tracking boxes corresponding to the first plurality of targets and the detection boxes of the current frame corresponding to the second plurality of targets can be established with reference to the targets' position information and the colors the targets wear (or the appearance colors of the targets).
S234: Correct using a target matching algorithm, to obtain the actual positions corresponding to the first-part targets in the current frame.
Optionally, using the target matching algorithm, the target values are corrected according to the observed values of the actually detected targets and the predicted values corresponding to the target detection boxes in step S231, to obtain the actual positions of the first plurality of targets in the current frame, that is, the actual positions in the current frame of those of the first plurality of targets in the previous frame that also appear among the second plurality of targets of the current frame. It should be understood that, because the observed values of the second plurality of targets in the current frame may carry a certain error due to factors such as the clarity of the cut pictures, the predicted positions of the first plurality of targets in the current frame, obtained by combining the tracking chain and the detection boxes of the first plurality of targets in the previous frame picture, are used to correct the actual positions of the second plurality of targets.
Optionally, the target matching algorithm is the Hungarian algorithm; the observed values are the targets' feature information obtained during the target detection and recognition of step S22, including the targets' categories, positions and so on; the predicted value of a target is the position value of the target in the current frame predicted in step S231 by combining the tracking chain and the target's position in the previous frame, together with other feature information. The targets' position information serves as the primary basis of judgment, and the other feature information as secondary bases.
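The Hungarian assignment itself is standard; a sketch using SciPy's implementation (the gating threshold max_cost is an assumption):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_targets(A, max_cost=1.5):
    """Match detections (rows) to tracks (columns) minimizing total cost;
    pairs above max_cost are rejected, leaving unmatched detections
    (candidate new targets) and unmatched tracks (candidate lost targets)."""
    rows, cols = linear_sum_assignment(A)
    matches = [(i, j) for i, j in zip(rows, cols) if A[i, j] <= max_cost]
    unmatched_dets = set(range(A.shape[0])) - {i for i, _ in matches}
    unmatched_trks = set(range(A.shape[1])) - {j for _, j in matches}
    return matches, unmatched_dets, unmatched_trks
```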
Optionally, in one embodiment, the targets among the second plurality whose detection boxes successfully match the tracking boxes of the first plurality of targets in the current frame are defined as the first-part targets; at the same time, the targets among the first plurality whose tracking boxes in the current frame successfully match detection boxes of the current frame are also defined as first-part targets, i.e. each successfully matched pair of tracking box and detection box comes from the same target. It can be understood that a detection box among the second plurality of targets successfully matching a tracking box of the first plurality of targets in the current frame means: the position information and the other feature information correspond one to one, or a relatively large number of items correspond, i.e. the more items correspond, the more probable it is that the match is successful.
In another embodiment, the number of first-part targets is smaller than the first plurality of targets: only part of the tracking boxes of the first plurality of targets in the current frame can successfully match the detection boxes of the second plurality of targets, while some cannot be matched successfully in the current frame according to the feature information used as the matching basis.
Optionally, in different implementations, the step of successfully matching the detection boxes of the second plurality of targets in the current frame with the tracking boxes, in the current frame, of the first plurality of targets in the previous frame includes: judging whether the match succeeds by the centroid distance and/or the overlap ratio between the detection boxes of the second plurality of targets in the current frame and the tracking boxes of the first plurality of targets in the current frame.
In one embodiment, when the centroid distance between the detection box of one or more of the second plurality of targets in the current frame and the tracking box, in the current frame, of one or more of the first plurality of targets in the previous frame is small and the overlap ratio is very high, the target match is judged successful. It can be understood that the cutting times of two adjacent frame pictures are separated very briefly, i.e. the distance a target moves in that interval is very small, so the targets in the two frame pictures can be judged to match successfully in this case.
Optionally, the second plurality of targets includes the first-part targets and the second-part targets, where, as stated above, the first-part targets are: the targets among the second plurality whose detection boxes successfully match the tracking boxes of the first plurality of targets in the current frame. The second-part targets are: the targets among the second plurality whose detection boxes do not successfully match the tracking boxes of the first plurality of targets in the current frame; those of the second-part targets with no record in the tracking chain are defined as newly added targets. It can be understood that, besides newly added targets, another kind of target may exist among the second-part targets: targets that did not match successfully among the first plurality of targets but have appeared in the tracking chain.
In one embodiment, the number of second-part targets can be 0, i.e. all detection boxes of the second plurality of targets in the current frame successfully match the tracking boxes of the first plurality of targets in the current frame, so the number of second-part targets is 0 in this case.
Optionally, after the step of performing correction analysis with the target matching algorithm to obtain the actual positions corresponding to the first-part targets of the current frame, the method includes: filtering out the newly added targets among the second-part targets, and adding the newly added targets to the tracking chain. In another embodiment the method further includes: initializing a corresponding filter tracker with the initial position and/or feature information of each newly added target. In one embodiment the filter tracker includes a Kalman filter, a kernelized correlation filter (KCF), or a filter combining the Kalman filter with the kernelized correlation filter; all of these are multi-target tracking algorithms implemented in software. The filter combining the Kalman filter and the kernelized correlation filter refers to a filter structure whose algorithmic structure unites the two. In other embodiments, the filter tracker may also be another type of filter, as long as it achieves the same function.
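A minimal sketch of initializing a filter tracker for a newly added target, assuming a constant-velocity Kalman model over the box center via OpenCV's cv2.KalmanFilter; a KCF tracker could be created alongside it with cv2.TrackerKCF_create() where available. The noise covariances are illustrative assumptions.

```python
# Per-target Kalman filter: state (x, y, vx, vy), measurement (x, y).
import cv2
import numpy as np

def init_kalman(cx, cy):
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], dtype=np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], dtype=np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2   # assumed
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1  # assumed
    # Initial state from the newly added target's detection-box center.
    kf.statePost = np.array([[cx], [cy], [0], [0]], dtype=np.float32)
    return kf
```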
Optionally, the data of the tracking chain is obtained by training on the data of the previous frame and of all frames before it. The targets in the tracking chain include the first-part targets and third-part targets described above. Specifically, the first-part targets are the targets of the first plurality whose tracking boxes in the current frame successfully match the detection boxes of the second plurality; the third-part targets are the targets in the tracking chain that did not successfully match the second plurality of targets.
It should be understood that the third-part targets are essentially all targets in the tracking chain other than the first-part targets that successfully matched the second plurality.
Optionally, after the step of performing correction analysis with the target matching algorithm in step S234 to obtain the actual positions of the first-part targets of the current frame, the method includes: incrementing the lost-frame count of each third-part target by 1, and removing a target from the tracking chain when its lost-frame count reaches or exceeds a preset threshold. It should be understood that the preset threshold of the lost-frame count is set in advance and can be adjusted as needed.
In one embodiment, when the lost-frame count of a certain third-part target is greater than or equal to the preset threshold, that target is removed from the current tracking chain.
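An illustrative sketch of this lost-frame bookkeeping; the track-record layout and the threshold value are assumptions for demonstration.

```python
# Increment lost counters for unmatched tracks; drop tracks whose counter
# reaches the preset threshold, returning them so their structured data
# can be uploaded upstream.
LOST_FRAME_THRESHOLD = 10  # preset, adjustable as the text notes

def update_lost_counts(tracking_chain, matched_ids):
    removed = []
    for track in list(tracking_chain):
        if track["id"] in matched_ids:
            track["lost_frames"] = 0
        else:
            track["lost_frames"] += 1
            if track["lost_frames"] >= LOST_FRAME_THRESHOLD:
                tracking_chain.remove(track)
                removed.append(track)
    return removed
```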
Optionally, when a certain target is removed from the current tracking chain, the structured data corresponding to that target is uploaded to the cloud server. The cloud server can then combine the structured data of the target with empirical values in its database to perform in-depth analysis of the target's trajectory or abnormal behavior.
It can be understood that, when the structured data of a target removed from the tracking chain is sent to the cloud server, the system executing this method may choose to interrupt the cloud server's in-depth analysis of that target.
Optionally, after the step of performing correction analysis with the target matching algorithm in step S234 to obtain the actual positions of the first-part targets of the current frame, the method includes: incrementing the lost-frame count of each third-part target by 1 and, when the count is below the preset threshold, locally tracking the third-part target to obtain a current tracking value.
Further, in an embodiment, the current tracking value of a third-part target is corrected against the corresponding predicted value of that target to obtain its actual position. Specifically, in an embodiment, the current tracking value is obtained by locally tracking the third-part target with the filter combining the kernelized correlation filter and the Kalman filter, and the predicted value is the position of the third-part target predicted by the Kalman filter.
Optionally, the tracking of the targets detected in step S22 is performed jointly by the Kalman filtering tracker and the filter combining it with the kernelized correlation filtering tracker (KCF).
In one embodiment, when the tracked targets can all be matched, i.e., when there is no suspected lost target, the Kalman filtering tracker alone is sufficient to complete the tracking.
In another embodiment, when a suspected lost target appears among the tracked targets, the filter combining the Kalman filtering tracker and the kernelized correlation filtering tracker is invoked to complete the tracking jointly, or the Kalman filtering tracker and the kernelized correlation filtering tracker are applied in succession.
Optionally, in an embodiment, the step S234 of performing correction with the target matching algorithm to obtain the actual positions of the first-part targets of the current frame includes: for each first-part target, correcting the predicted value corresponding to its tracking box in the current frame against the observation corresponding to its detection box in the current frame, to obtain the actual position of each first-part target.
In one embodiment, for each first-part target, the predicted value of its tracking box in the current frame can be understood as follows: the target's position in the current frame is predicted by combining the empirical values in the tracking chain with its position in the previous frame, and is then corrected with the actual position observed in the current frame (the observation) to obtain the actual position of each first-part target. This operation reduces the inaccuracy in each target's actual value caused by errors in either the predicted value or the observation.
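A minimal sketch of this predict-then-correct step, assuming `kf` is a cv2.KalmanFilter configured as in the earlier initialization sketch and the observation is the detection-box center.

```python
# The filter's prediction for the current frame is fused with the
# detection-box observation to yield the corrected (actual) position.
import numpy as np

def corrected_position(kf, det_cx, det_cy):
    kf.predict()  # predicted value for the current frame
    measurement = np.array([[det_cx], [det_cy]], dtype=np.float32)
    state = kf.correct(measurement)  # fuse prediction with observation
    return float(state[0, 0]), float(state[1, 0])  # corrected (x, y)
```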
Optionally, in one embodiment, the above improved multi-target tracking method based on KCF and Kalman filtering can perform trajectory analysis on multiple targets, recording the time each target enters and leaves the monitoring node and each position it passes under the monitored scene, thereby generating a trajectory chain that clearly reflects the target's motion within the current monitoring node.
Referring to Fig. 7, which is a flow diagram of an embodiment of an abnormal behavior detection method based on clustered optical-flow features also provided by the present application, the method also corresponds to step S24 of the above embodiment and includes steps S241 to S245, as follows:
S241: Perform optical-flow detection on the detection-box regions of one or more targets.
Optionally, before abnormal behavior detection is performed on the targets, detection and identification of the targets has already been completed by a preset algorithm, yielding each target's detection box and its position when target detection is performed on the single-frame picture; optical-flow detection is then performed on the detection boxes of one or more targets. The optical flow contains the motion information of the target. Optionally, the preset algorithm may be the YOLOv2 algorithm or another algorithm with similar functions.
Understandably, because the center of a detection box nearly coincides with the target's center of gravity, the detection boxes and their positions in the acquired single-frame pictures yield the position information of each pedestrian target, or target of another type, in each frame image.
In one embodiment, performing optical-flow detection on the detection boxes of one or more targets essentially obtains the motion information of the optical-flow points inside each target's detection box, including the velocity magnitude and motion direction of each optical-flow point.
Optionally, optical-flow detection obtains the motion information of each optical-flow point and is completed by the LK (Lucas-Kanade) pyramid optical-flow method or another optical-flow method with the same or similar effect.
Optionally, optical-flow detection may be performed each time on the detection box of a single target in each frame picture, or simultaneously on the detection boxes of multiple targets in each frame picture. The number of targets processed per optical-flow detection pass is generally determined by the initial system settings and can be adjusted as needed: when fast optical-flow detection is required, the system can be set to process the detection boxes of multiple targets in each frame picture simultaneously; when very fine optical-flow detection is required, it can be set to process the detection box of one target in each frame picture at a time.
Optionally, in one embodiment, optical-flow detection is performed each time on the detection box of one target across consecutive multi-frame pictures, or on the detection box of one target in a single-frame picture.
Optionally, in another embodiment, optical-flow detection is performed each time on the detection boxes of several or all targets across consecutive multi-frame pictures, or on the detection boxes of several or all targets in a single-frame picture.
Optionally, in one embodiment, before optical-flow detection is performed on a target, the approximate region of the target is first detected in the above step, and optical-flow detection is then performed directly on the regions where targets appear (i.e., the target detection regions) in consecutive frame images. The consecutive frame images used for optical-flow detection are images of identical size.
Optionally, in one embodiment, performing optical-flow detection on a target's detection-box region means performing optical-flow detection on the target's detection-box region in one frame picture, storing the resulting data and information in local storage, and then performing optical-flow detection on the target's detection-box region in the next frame or a preset frame.
In one embodiment, optical-flow detection is performed on one target's detection box and region at a time, proceeding one by one through the detection boxes of all targets in the picture.
In another embodiment, optical-flow detection is performed simultaneously on multiple targets in one picture; that is, each pass performs optical-flow detection on the detection boxes of all or some of the targets in a single-frame picture.
In another embodiment, each pass performs optical-flow detection on the detection boxes of all targets in multiple single-frame pictures.
In another embodiment, each pass performs optical-flow detection on the detection boxes of specifically designated targets of the same category across multiple single-frame pictures.
Optionally, after step S241 the obtained optical-flow information is added to a spatio-temporal model, so that the optical-flow vector information of multiple preceding and following frames can be obtained statistically.
S242: Extract the optical-flow motion information of the feature points corresponding to the detection boxes in at least two consecutive frame images, and calculate the information entropy of each detection-box region.
Optionally, step S242 extracts the optical-flow motion information of the feature points corresponding to the detection boxes in at least two consecutive frame images and calculates the information entropy of each detection-box region; the calculation is performed on the feature points corresponding to the detection-box regions in the at least two consecutive frame images. Here, the optical-flow motion information refers to the motion direction and velocity magnitude of the optical-flow points: the motion direction and displacement of each optical-flow point are extracted, and its velocity is then computed. A feature point is a set of one or more pixels that can represent the target's feature information.
Optionally, after the optical-flow motion information of the feature points corresponding to the detection boxes in two consecutive frame images is extracted, the information entropy of each detection-box region is calculated from the extracted optical-flow motion information. It can be understood that the information entropy is computed from the optical-flow information of all optical-flow points within the target detection region.
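A hedged sketch of one way the information entropy of a detection-box region could be computed from its flow points, assuming a Shannon entropy over binned flow directions; the bin count is an illustrative assumption.

```python
# Entropy of the direction distribution of all optical-flow points inside
# one detection box: low entropy = coherent motion, high entropy = chaotic
# motion (as used in the behavior judgments later in the text).
import numpy as np

def flow_entropy(dx, dy, n_bins=12):
    """dx, dy: per-point pixel displacements inside one detection box."""
    angles = np.arctan2(dy, dx)  # direction of each optical-flow point
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())  # information entropy H
```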
Optionally, step S242 extracts the optical-flow motion information and calculates the information entropy by using the LK (Lucas-Kanade) pyramid optical-flow method (hereinafter the LK optical-flow method) to extract the pixel optical-flow feature information from the rectangular box regions, containing only the pedestrian targets, of adjacent frames; the LK optical-flow extraction algorithm is accelerated on a graphics processor (Graphics Processing Unit, GPU), so that the optical-flow feature information of the pixels can be extracted online in real time. The optical-flow feature information refers to the optical-flow vector information, abbreviated as the optical-flow vector.
Optionally, the optical-flow vector F extracted by the optical-flow algorithm is composed of two two-dimensional matrix vectors, i.e.
F = (U, V)
where each entry of the two matrices corresponds to a pixel position in the image: U(i, j) represents the pixel displacement of a pixel along the X axis between adjacent frames, and V(i, j) represents the pixel displacement of the same pixel along the Y axis between adjacent frames.
Optionally, the pixel displacement refers to the distance a feature point moves between two adjacent frame images, which can be obtained directly by the LK optical-flow extraction algorithm.
In one embodiment, step S242 computes, for single-frame images on which target detection has been completed and whose detection boxes have been obtained, the optical-flow motion information of the feature points corresponding to the detection box of each target. A feature point may also be construed as a point where the image gray value changes sharply, or a point of large curvature on an image edge (i.e., the intersection of two edges). This operation reduces the amount of computation and improves computational efficiency.
Optionally, step S242 can simultaneously calculate the optical-flow information of the feature points corresponding to all, or some, of the detection boxes in two consecutive frame images, or simultaneously calculate it for all detection boxes in more than two consecutive images; the number of images processed per pass is preset in the system and can be adjusted as needed.
In one embodiment, step S242 simultaneously calculates the optical-flow information of the feature points corresponding to all detection boxes in two consecutive frame images.
In another embodiment, step S242 simultaneously calculates the optical-flow information of the feature points corresponding to all detection boxes in more than two consecutive images.
Optionally, step S242 can simultaneously calculate the optical-flow information of the detection boxes of all targets in at least two consecutive frame images, or simultaneously calculate it for the detection boxes of designated, mutually corresponding targets in the at least two consecutive frame images.
In one embodiment, step S242 simultaneously calculates the optical-flow information of the detection boxes of all targets in at least two consecutive frame images, e.g., the optical-flow information of the detection boxes of all targets in frame t and frame t+1.
In another embodiment, step S242 simultaneously calculates the optical-flow information of the detection boxes of designated, corresponding targets in at least two consecutive frame images, e.g., the class-A targets of frame t and the class-A' targets of frame t+1 with IDs labeled 1 to 3; that is, the optical-flow information of the detection boxes of targets A1, A2, A3 and their counterparts A1', A2', A3' is extracted and calculated simultaneously.
S243: Establish cluster points according to the optical-flow motion information and the information entropy.
Optionally, cluster points are established according to the optical-flow motion information extracted in step S242 and the calculated information entropy. The optical-flow motion information is the information reflecting the motion features of the optical flow, including the motion direction and velocity magnitude, and may include other relevant motion feature information; the information entropy is obtained by calculation from the optical-flow motion information.
In one embodiment, the optical-flow motion information extracted in step S242 includes at least one of the motion direction, the motion distance, the velocity magnitude, and other relevant motion feature information.
Optionally, before step S243 establishes cluster points from the optical-flow motion information and the calculated information entropy, the optical flow is first clustered with the K-means algorithm. The number of cluster points can be determined by the number of detection boxes obtained during target detection, and the clustering criterion is that optical-flow points with the same motion direction and velocity magnitude are grouped into one cluster point. Optionally, in one embodiment, the value of K ranges from 6 to 9; K may of course take other values, which are not elaborated here.
Optionally, a cluster point is a set of optical-flow points whose motion directions and velocity magnitudes are identical or approximately identical.
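A minimal sketch of the K-means grouping described above, assuming scikit-learn's KMeans; representing direction as (cos, sin) alongside speed is an implementation choice for illustration, not taken from the patent.

```python
# Cluster flow points by direction and speed; points sharing a label form
# one cluster point in the sense used above.
import numpy as np
from sklearn.cluster import KMeans

def cluster_flow_points(dx, dy, k=6):
    speed = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)
    # Encode direction as (cos, sin) so the angular wrap-around at +/-pi
    # does not split one direction into two clusters.
    feats = np.stack([speed, np.cos(angle), np.sin(angle)], axis=1)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats)
    return labels
```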
S244: Calculate the kinetic energy of the cluster points and/or the kinetic energy of the target detection-box regions. Specifically, taking the cluster points established in step S243 as units, the kinetic energy of each cluster point is calculated, or the kinetic energy within each target detection-box region is calculated at the same time.
In one embodiment, at least one of the kinetic energy of the cluster points established in step S243 and the kinetic energy of the target regions is calculated. It can be understood that, in different embodiments, either calculation can be configured according to specific requirements, or both calculations can be configured at the same time; when only one is needed, the other can be manually disabled. Optionally, according to the position of each cluster point, a motion spatio-temporal container is established from the motion vectors of its preceding and following N frames, and the information entropy of the optical-flow histogram (HOF) of the detection region where each cluster point lies, together with the mean kinetic energy of the cluster-point set, is calculated.
Optionally, the kinetic energy of the target detection-box region is calculated by the following formula (written here in the mean-kinetic-energy form implied by the variable definitions below):
E = (1/k) · Σ_{i=0}^{k-1} (1/2) · m · v_i²
where i = 0, …, k−1 indexes the optical flows within the detection-box region of a single target, v_i is the velocity magnitude of the i-th optical flow, k is the total number of optical flows in the single-target region after clustering and, for convenience of calculation, m = 1. Optionally, in one embodiment, the value of K ranges from 6 to 9; K may of course take other values, which are not elaborated here.
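As a sketch, assuming the mean-kinetic-energy form reconstructed above (with m = 1 by default), the per-box energy could be computed as:

```python
# Mean kinetic energy over the k clustered flow points in one detection
# box, following the variable definitions above.
import numpy as np

def detection_box_kinetic_energy(speeds, m=1.0):
    """speeds: array of the k flow-point velocity magnitudes after clustering."""
    speeds = np.asarray(speeds, dtype=float)
    k = len(speeds)
    return float(np.sum(0.5 * m * speeds ** 2) / k)
```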
S245: Judge abnormal behavior according to the kinetic energy of the cluster points and/or the information entropy.
Optionally, whether the target corresponding to a cluster point exhibits abnormal behavior is judged from the kinetic energy of the cluster points, or of the target detection-box regions, calculated in step S244. When the target is a pedestrian, abnormal behavior includes running, fighting, and rioting; when the target is a vehicle, abnormal behavior includes collision and speeding.
Specifically, the two abnormal behaviors of fighting and running are both related to the information entropy of the target detection-box region and the kinetic energy of the cluster points. When the abnormal behavior is fighting, the optical-flow information entropy of the target detection-box region is large, and the kinetic energy of the cluster points corresponding to the target, or of the target region, is also large. When the abnormal behavior is running, the kinetic energy of the cluster points corresponding to the target, or of the target region, is large, while the optical-flow information entropy of the target detection-box region is small. When no abnormal behavior occurs, both the optical-flow information entropy of the detection-box region corresponding to the target and the kinetic energy of the cluster points, or of the target region, are small.
Optionally, in an embodiment, the step S245 of judging abnormal behavior from the kinetic energy of the cluster points and/or the information entropy further includes: if the optical-flow information entropy of the detection-box region corresponding to the target is greater than or equal to a first threshold, and the kinetic energy of the cluster points corresponding to the target, or of the target detection-box region, is greater than or equal to a second threshold, the abnormal behavior is judged to be fighting.
Optionally, in another embodiment, the step of judging abnormal behavior from the kinetic energy of the cluster points and/or the information entropy further includes: if the information entropy of the detection-box region corresponding to the target is greater than or equal to a third threshold and less than the first threshold, while the kinetic energy of the cluster points corresponding to the target, or of the target detection-box region, is greater than the second threshold, the abnormal behavior is judged to be running.
In one embodiment, for example, the information entropy is denoted H and the kinetic energy is denoted E.
Optionally, the running behavior of a target is judged by comparing the ratio H/E with a trained value range and the kinetic energy with a preset threshold.
In one embodiment, the present invention trains a value range of H/E for the running behavior, with λ1 taking the value 3000, where H/E denotes the ratio of the optical-flow information entropy H of the target detection-box region to the kinetic energy E of that region, and λ1 is a preset kinetic-energy value.
Optionally, the fighting behavior of a target is judged analogously by the ratio H/E.
In one embodiment, the present invention trains a value range of H/E for the fighting behavior, with λ2 taking the value 3.0, where H/E denotes the ratio of the information entropy H to the kinetic energy E, and λ2 is a preset information-entropy value.
Optionally, normal behavior is judged by comparing H and E against corresponding preset values.
In one embodiment, the normal-behavior values obtained by training in the present invention are λ3 = 1500 and λ4 = 1.85, where λ3 is a preset kinetic-energy value smaller than λ1, and λ4 is a preset information-entropy value smaller than λ2.
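Since the trained H/E ranges were given only graphically in the original, the following sketch combines the threshold rules of the two embodiments above into a single decision routine; the rule forms and the concrete values of the first, second, and third thresholds are assumptions for demonstration.

```python
# A hedged sketch of the entropy/energy decision rules described above.
FIRST_THRESHOLD = 3.0    # entropy bound for fighting (assumed value)
SECOND_THRESHOLD = 1500  # kinetic-energy bound (assumed value)
THIRD_THRESHOLD = 1.85   # lower entropy bound for running (assumed value)

def classify_behavior(entropy_h, energy_e):
    """entropy_h: flow entropy of the detection box; energy_e: its kinetic energy."""
    if entropy_h >= FIRST_THRESHOLD and energy_e >= SECOND_THRESHOLD:
        return "fighting"  # both entropy and kinetic energy are high
    if THIRD_THRESHOLD <= entropy_h < FIRST_THRESHOLD and energy_e > SECOND_THRESHOLD:
        return "running"   # high kinetic energy, lower entropy
    return "normal"        # both quantities are small
```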
In one embodiment, when a certain pedestrian target is running, the optical-flow kinetic energy of the cluster points corresponding to that target is large and the optical-flow information entropy is small.
Optionally, when a crowd riot occurs, multiple pedestrian targets are first detected in a single-frame picture; then, when abnormal behavior detection is performed on the detected pedestrian targets and multiple targets are found to exhibit the running anomaly, a crowd riot can be judged to have occurred.
In one embodiment, when abnormal behavior detection is performed on the multiple targets detected in a single-frame picture, and the number of targets whose cluster points have large motion energy and small optical-flow information entropy exceeds a preset threshold, it can be judged that a crowd riot may have occurred.
Optionally, when the target is a vehicle, the judgment of abnormal behavior is likewise based on the dominant optical-flow directions within the detection boxes corresponding to the targets and the distance between the detected vehicles (which can be derived from their position information), to judge whether a collision occurs. It can be understood that when the dominant optical-flow directions of the detection boxes of two vehicle targets are opposite, and the distance between the two vehicles is small, a suspected collision can be judged.
Optionally, the abnormal-behavior judgment results of step S245 are saved and sent to the cloud server.
The method described in steps S241 to S245 can effectively improve the efficiency and real-time performance of abnormal behavior detection.
Optionally, in an embodiment, before the step S242 of extracting the optical-flow motion information of the feature points corresponding to the detection boxes in at least two consecutive frame images and calculating the information entropy of the detection-box regions, the method further includes: extracting the feature points of the at least two consecutive frame images.
Optionally, extracting the feature points of at least two consecutive frame images may extract, per pass, the feature points of the target detection boxes in two consecutive images, or the feature points of the target detection boxes in more than two consecutive images; the number of images extracted per pass is set when the system is initialized and can be adjusted as needed. A feature point is a point where the image gray value changes sharply, or a point of large curvature on an image edge (i.e., the intersection of two edges).
Optionally, in an embodiment, the step S242 further comprises: calculating, with a preset algorithm, the feature points of matched targets in consecutive frame images, and removing the unmatched feature points of the two consecutive frame images.
Optionally, the image-processing function goodFeaturesToTrack() is first called to extract the feature points (also called Shi-Tomasi corner points) within the detected target regions of the previous frame image; the function calcOpticalFlowPyrLK() of the LK-pyramid optical-flow extraction algorithm is then called to compute the feature points of the targets detected in the current frame that match the previous frame, and the feature points that did not move between the two frames are removed, thereby obtaining the optical-flow motion information of the pixels. In this embodiment, the feature points may be Shi-Tomasi corner points, or corner points for short.
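A minimal sketch of this feature-point flow extraction, using the OpenCV functions named above (cv2.goodFeaturesToTrack and cv2.calcOpticalFlowPyrLK); the corner-detection parameters, the mask-based restriction to the detection box, and the 0.5-pixel motion gate are illustrative assumptions.

```python
# Extract Shi-Tomasi corners inside one detection box of the previous
# frame, track them into the current frame with pyramidal LK, and drop
# points that failed to track or did not move.
import cv2
import numpy as np

def lk_flow(prev_gray, curr_gray, box):
    """prev_gray, curr_gray: 8-bit grayscale frames; box: (x, y, w, h)."""
    x, y, w, h = box
    mask = np.zeros(prev_gray.shape, dtype=np.uint8)
    mask[y:y + h, x:x + w] = 255  # restrict corners to the detection box
    p0 = cv2.goodFeaturesToTrack(prev_gray, 100, 0.01, 5, mask=mask)
    if p0 is None:
        return np.empty((0, 2)), np.empty((0, 2))
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    good = status.ravel() == 1
    p0, p1 = p0.reshape(-1, 2)[good], p1.reshape(-1, 2)[good]
    moved = np.linalg.norm(p1 - p0, axis=1) > 0.5  # drop static points
    return p0[moved], p1[moved]  # matched point pairs (flow = p1 - p0)
```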
Optionally, in an embodiment, before the step S243 of establishing cluster points according to the optical-flow motion information, the method further includes: drawing the optical-flow motion directions of the feature points in the picture.
In one embodiment, before the step of establishing cluster points according to the optical-flow motion information, the optical-flow motion direction of each feature point is drawn in each frame image.
Optionally, referring to Fig. 8, in an embodiment, the step S243 of establishing cluster points according to the optical-flow motion information is further followed by steps S2431 and S2432:
S2431: Establish a spatio-temporal container based on the position and motion vectors of the target detection region.
Optionally, the spatio-temporal container is established based on the position information of the target detection region, i.e., the target detection box, and the motion-vector relations of the cluster points within the detection box over the preceding and following frames.
Optionally, Fig. 9 is a schematic diagram of the motion spatio-temporal container in an embodiment, in which AB is the two-dimensional height of the container, BC is its two-dimensional width, and CE is its depth. The depth CE of the container is the number of video frames, and ABCD represents the two-dimensional size of the container, i.e., the size of the target detection box at detection time. It should be understood that the spatio-temporal container model may take other shapes: when the shape of the target detection box changes, the container model can change accordingly.
Optionally, in one embodiment, when the shape of the target detection box changes, the correspondingly established spatio-temporal container changes with the shape of the target detection box.
S2432: Calculate the average information entropy and the average motion kinetic energy of the optical-flow histogram of the detection box corresponding to each cluster point.
Optionally, the average information entropy and average kinetic energy of the optical-flow histogram of the detection box corresponding to each cluster point are calculated. The optical-flow histogram HOF (Histogram of Oriented Optical Flow) is a statistic of the probability that optical-flow points are distributed in certain specific directions.
Optionally, the basic idea of HOF is to project each optical-flow point into the corresponding histogram bin according to its direction value and to weight it by the amplitude of that optical flow; in the present invention, the number of bins is 12. The velocity magnitude and direction of each optical-flow point are computed from its pixel displacements over the inter-frame time T (the time between two adjacent frame images), i.e. the speed is v = sqrt(U² + V²)/T and the direction is θ = arctan(V/U), with U and V the displacements along the X and Y axes defined above.
Using the optical-flow histogram can reduce the influence of factors such as the target's size, the target's motion direction, and video noise on the optical-flow features of the target pixels.
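A hedged sketch of the 12-bin HOF just described: each flow point votes into a direction bin weighted by its amplitude. The default inter-frame time (25 fps) and the normalization step are assumptions for illustration.

```python
# Amplitude-weighted histogram of flow directions over 12 bins.
import numpy as np

def hof(dx, dy, T=0.04, n_bins=12):
    speed = np.hypot(dx, dy) / T                    # v = sqrt(U^2 + V^2) / T
    angle = np.arctan2(dy, dx)                      # motion direction
    hist, _ = np.histogram(angle, bins=n_bins, range=(-np.pi, np.pi),
                           weights=speed)           # amplitude-weighted votes
    total = hist.sum()
    return hist / total if total > 0 else hist      # normalized histogram
```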
Optionally, in different embodiments the categories of abnormal behavior include at least one of fighting, running, rioting, and traffic anomalies.
In one embodiment, when the target is a pedestrian, abnormal behavior includes fighting, running, and rioting.
In another embodiment, when the target is a vehicle, abnormal behavior includes, for example, collision and speeding.
Optionally, in one embodiment, calculating the average information entropy and average kinetic energy of the optical-flow histogram of the detection box corresponding to each cluster point essentially means calculating the average information entropy and average kinetic energy of the optical flow of each cluster center over the preceding and following N frame images.
The above abnormal behavior detection method can effectively improve the intelligence of present-day security systems, while also effectively reducing the amount of computation during abnormal behavior detection, improving the efficiency, real-time performance, and accuracy of the system's detection of abnormal target behavior.
Optionally, the step of tracking the targets to obtain the tracking results is further followed by: sending the structured data of target objects that have left the current monitoring node to the cloud server.
Optionally, during target tracking, when a certain target's feature information, in particular its position information, is not updated within a preset time, the target can be judged to have left the current monitoring node, and its structured data is sent to the cloud server. The preset time can be set by the user, e.g. 5 or 10 minutes, which is not enumerated here.
In one embodiment, during target tracking, when the position information (i.e., the coordinate values) of a certain pedestrian is found not to have been updated within a certain preset time, that pedestrian can be judged to have left the current monitoring node, and the structured data corresponding to the pedestrian is sent to the cloud server.
In another embodiment, during target tracking, when the position coordinates of a certain pedestrian or vehicle are found to stay at the edge of the monitoring node's field of view, the pedestrian or vehicle can be judged to have left the current monitoring node, and its structured data is sent to the cloud server.
Optionally, the preset feature information of a target determined to have left the current monitoring node (e.g., the target's attribute values, motion trajectory, target snapshot, and other required information) is packaged into a preset metadata structure, encoded into a preset format, and transmitted to the cloud server; the cloud server parses the received packaged data, extracts the target's metadata, and saves it to the database.
In one embodiment, the preset feature information of a target determined to have left the current node is packaged into a preset metadata structure, encoded into the JSON data format, and sent over the network to the cloud server; the cloud server parses the received JSON data packet, extracts the metadata structure, and saves it to the database of the cloud server. It should be understood that the preset feature information can be adjusted as needed, which is not enumerated here.
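An illustrative sketch of packaging a departed target's structured data into a JSON payload for the cloud server; the field names are assumptions for demonstration, not the patent's actual metadata structure.

```python
# Serialize the preset feature information of a target that has left the
# monitoring node into a JSON string ready for network transmission.
import json

def pack_target_metadata(target):
    payload = {
        "id": target["id"],
        "category": target.get("category"),      # e.g. pedestrian, vehicle
        "attributes": target.get("attributes"),  # e.g. clothing color, gender
        "trajectory": target.get("trajectory"),  # per-frame positions
        "enter_time": target.get("enter_time"),
        "leave_time": target.get("leave_time"),
        "snapshot": target.get("snapshot"),      # e.g. base64-encoded crop
    }
    return json.dumps(payload)  # encoded and sent to the cloud server
```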
Referring to Fig. 10, the present invention also provides a device 400 with a storage function in which program data are stored; when executed, the program data implement the method of video structuring processing based on target behavior attributes described above and in its embodiments. Specifically, the above device with a storage function may be a memory, a personal computer, a server, a network device, a USB flash drive, or the like.
Referring to Fig. 11, which is a schematic diagram of an embodiment of a video structuring processing system based on target behavior attributes of the present invention: in this embodiment, the video processing system 400 includes a memory 404 coupled with a processor 402; the processor 402 executes instructions at work to implement the video processing method described above and in its embodiments, and stores the processing results produced by the instruction execution in the memory 404.
Optionally, the step S23 of tracking the targets to obtain the tracking results and the step S24 of performing abnormal behavior detection on the targets are both based on the target detection and identification of the single-frame pictures in step S22; only on that basis can the targets be tracked and their abnormal behavior detected.
Optionally, the abnormal behavior detection of step S24 may be carried out directly after step S22 is completed, may be carried out in parallel with step S23, or may follow step S23 and build on its tracking results.
Optionally, when step S24 performs abnormal behavior detection based on the tracking results obtained by the tracking of step S23, the detection of the targets' abnormal behavior can be more accurate.
The method of video structuring processing based on target behavior attributes described in steps S21 to S24 can effectively reduce the network transmission load of surveillance video, effectively improve the real-time performance of the monitoring system, and substantially cut data traffic costs.
Optionally, the step of performing target detection and identification on the single-frame pictures further comprises extracting the feature information of the targets in the single-frame pictures. It can be understood that after the read video is sliced into multiple single-frame pictures, target detection and identification is performed on the sliced single-frame pictures.
Optionally, the feature information of the targets in the single-frame pictures obtained by video slicing is extracted; the targets include pedestrians, vehicles, and animals, and the feature information of buildings, roads, or bridges can also be extracted as needed.
In one embodiment, when the target is a pedestrian, the extracted feature information includes characterizing information such as the pedestrian's position, clothing color, gender, motion state, motion trajectory, dwell time, and other obtainable information.
In another embodiment, when the target is a vehicle, the extracted feature information includes the vehicle's model, body color, travel speed, license plate number, and the like.
In another embodiment, when the target is a building, the extracted feature information includes basic information of the building, such as its number of floors, height, and exterior color.
In another embodiment, when the target is a road or bridge, the extracted feature information includes information such as the road's width, name, and speed limit.
Optionally, the step of performing abnormal behavior detection on the targets includes: extracting the motion vectors of multiple pixels of one or more targets, and performing abnormal behavior detection according to the relations among the motion vectors.
In one embodiment, the details are as described in the abnormal behavior detection method above.
In one embodiment, the structured data initially set to be acquired in the video processing stage includes at least one of the target's position, category, attributes, motion state, motion trajectory, and appearance time. This can be adjusted to the user's needs: the video processing stage may acquire only the targets' position information, or the position and category simultaneously. It can be understood that the information categories acquired in the video processing stage are selected by the user.
Optionally, after the video structuring processing ends, the obtained structured data is uploaded to the cloud server; the cloud server can save the structured data uploaded by each monitoring node and analyze the structured data uploaded by each monitoring node in depth, to obtain preset results.
Optionally, the in-depth analysis the cloud server performs on the structured data uploaded by each monitoring node may be set automatically by the system or configured manually by the user.
In one embodiment, the basic analysis content included in the cloud server's in-depth analysis is preset, such as counting the number of pedestrians, analyzing target trajectories, determining whether targets exhibit abnormal behavior, and counting the number of targets exhibiting abnormal behavior; the in-depth analysis also includes other content specially selected by the user, such as the proportion of targets in each time period and the speeds of targets.
The above are merely embodiments of the present invention and are not intended to limit its scope; any equivalent structural or process transformation made using the contents of the description and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the protection scope of the present invention.

Claims (10)

  1. A method of video structuring processing based on target behavior attributes, characterized by comprising:
    performing target detection and identification on single-frame pictures;
    tracking the targets to obtain tracking results; and/or
    performing abnormal behavior detection on the targets.
  2. The method of video structuring processing based on target behavior attributes according to claim 1, characterized in that the step of performing target detection and identification on the single-frame pictures comprises:
    extracting the feature information of the targets in the single-frame pictures.
  3. The method of video structuring processing based on target behavior attributes according to claim 2, characterized by further comprising, before the step of extracting the feature information of the targets in the single-frame pictures:
    building a metadata structure;
    wherein the feature information of the targets is extracted according to the metadata structure.
  4. The method of video structuring processing based on target behavior attributes according to claim 1, characterized in that the step of tracking the targets to obtain tracking results further comprises:
    tracking the targets, recording the time each target enters or leaves the monitoring node and each position the target passes, to form the motion trajectory of the target.
  5. The method of video structuring processing based on target behavior attributes according to claim 1, characterized in that the step of tracking the targets to obtain tracking results is further followed by: sending the structured data of target objects that have left the current monitoring node to the cloud server.
  6. The method of video structuring processing based on target behavior attributes according to claim 1, characterized in that the step of performing abnormal behavior detection on the targets comprises:
    extracting the optical-flow motion information of multiple feature points of one or more targets, and performing clustering and abnormal behavior detection according to the optical-flow motion information.
  7. The method of video structuring processing based on target behavior attributes according to claim 1, characterized in that the abnormal behavior further comprises at least one of running, fighting, rioting, and traffic anomalies.
  8. The method of video structuring processing based on target behavior attributes according to claim 1, characterized by further comprising, after the step of performing abnormal behavior detection on the targets: if the abnormal behavior is detected, saving a snapshot of the current video frame image and sending it to the cloud server.
  9. A video structuring processing system based on target behavior attributes, characterized by comprising a processor and a memory electrically connected with each other, the processor being coupled with the memory; the processor executes instructions at work to implement the method according to any one of claims 1 to 8, and stores the processing results produced by the instruction execution in the memory.
  10. A device with a storage function, characterized in that program data are stored therein, and when executed the program data implement the method according to any one of claims 1 to 8.
CN201711055281.0A 2017-10-31 2017-10-31 Video structuralization processing method, system and storage device based on target behavior attribute Active CN108009473B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711055281.0A CN108009473B (en) 2017-10-31 2017-10-31 Video structuralization processing method, system and storage device based on target behavior attribute

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711055281.0A CN108009473B (en) 2017-10-31 2017-10-31 Video structuralization processing method, system and storage device based on target behavior attribute

Publications (2)

Publication Number Publication Date
CN108009473A true CN108009473A (en) 2018-05-08
CN108009473B CN108009473B (en) 2021-08-24

Family

ID=62051189

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711055281.0A Active CN108009473B (en) 2017-10-31 2017-10-31 Video structuralization processing method, system and storage device based on target behavior attribute

Country Status (1)

Country Link
CN (1) CN108009473B (en)



Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719216A (en) * 2009-12-21 2010-06-02 西安电子科技大学 Movement human abnormal behavior identification method based on template matching
CN101859436A (en) * 2010-06-09 2010-10-13 王巍 Large-amplitude regular movement background intelligent analysis and control system
CN101902617A (en) * 2010-06-11 2010-12-01 公安部第三研究所 Device and method for realizing video structural description by using DSP and FPGA
CN103941866A (en) * 2014-04-08 2014-07-23 河海大学常州校区 Three-dimensional gesture recognizing method based on Kinect depth image
CN104301697A (en) * 2014-07-15 2015-01-21 广州大学 Automatic public place violence incident detection system and method thereof
US20160021390A1 (en) * 2014-07-15 2016-01-21 Alcatel-Lucent Usa, Inc. Method and system for modifying compressive sensing block sizes for video monitoring using distance information
CN104915655A (en) * 2015-06-15 2015-09-16 西安电子科技大学 Multi-path monitor video management method and device
CN106683121A (en) * 2016-11-29 2017-05-17 广东工业大学 Robust object tracking method in fusion detection process

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JOE_QUAN: "YOLO9000: Better, Faster, Stronger", CSDN *
XUAN-PHUNG HUYNH: "Tracking a Human Fast and Reliably Against", PSIVT 2015: Image and Video Technology *

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108810616A (en) * 2018-05-31 2018-11-13 广州虎牙信息科技有限公司 Object localization method, image display method, device, equipment and storage medium
WO2019228387A1 (en) * 2018-05-31 2019-12-05 广州虎牙信息科技有限公司 Target positioning method and apparatus, video display method and apparatus, device, and storage medium
US11284128B2 (en) 2018-05-31 2022-03-22 Guangzhou Huya Information Technology Co., Ltd. Object positioning method, video display method, apparatus, device, and storage medium
CN110706247A (en) * 2018-06-22 2020-01-17 杭州海康威视数字技术股份有限公司 Target tracking method, device and system
CN109188390B (en) * 2018-08-14 2023-05-23 苏州大学张家港工业技术研究院 High-precision detection and tracking method for moving target
CN109188390A (en) * 2018-08-14 2019-01-11 苏州大学张家港工业技术研究院 A kind of detection of moving target high-precision and method for tracing
CN108900898A (en) * 2018-08-21 2018-11-27 北京深瞐科技有限公司 Video structural method, apparatus and system
CN108984799A (en) * 2018-08-21 2018-12-11 北京深瞐科技有限公司 A kind of video data handling procedure and device
CN109146910A (en) * 2018-08-27 2019-01-04 公安部第研究所 A kind of video content analysis index Evaluation Method based on target positioning
CN109146910B (en) * 2018-08-27 2021-07-06 公安部第一研究所 Video content analysis index evaluation method based on target positioning
CN109363614A (en) * 2018-08-29 2019-02-22 合肥德易电子有限公司 Intelligent integral robot cavity mirror system with high definition video enhancing processing function
CN109171605A (en) * 2018-08-29 2019-01-11 合肥工业大学 Intelligent edge calculations system with target positioning and hysteroscope video enhancing processing function
CN109241898A (en) * 2018-08-29 2019-01-18 合肥工业大学 Object localization method and system, the storage medium of hysteroscope video
CN109124782A (en) * 2018-08-29 2019-01-04 合肥工业大学 Intelligent integral cavity mirror system
CN109124702A (en) * 2018-08-29 2019-01-04 合肥工业大学 Configure the intelligent cavity mirror system of pneumoperitoneum control and central control module
CN109171606A (en) * 2018-08-29 2019-01-11 合肥德易电子有限公司 Intelligent integral robot cavity mirror system
CN109350239A (en) * 2018-08-29 2019-02-19 合肥德易电子有限公司 Intelligent integral robot cavity mirror system with target positioning function
CN109363614B (en) * 2018-08-29 2020-09-01 合肥德易电子有限公司 Intelligent integrated robot cavity mirror system with high-definition video enhancement processing function
CN109171605B (en) * 2018-08-29 2020-09-01 合肥工业大学 Intelligent edge computing system with target positioning and endoscope video enhancement processing functions
CN109124702B (en) * 2018-08-29 2020-09-01 合肥工业大学 Intelligent endoscope system with pneumoperitoneum control and central control module
CN109034124A (en) * 2018-08-30 2018-12-18 成都考拉悠然科技有限公司 A kind of intelligent control method and system
CN109389543B (en) * 2018-09-11 2022-03-04 深圳大学 Bus operation data statistical method, system, computing device and storage medium
CN109389543A (en) * 2018-09-11 2019-02-26 深圳大学 Bus operation data statistical approach, calculates equipment and storage medium at system
WO2020056903A1 (en) * 2018-09-21 2020-03-26 北京字节跳动网络技术有限公司 Information generating method and device
CN109658128A (en) * 2018-11-19 2019-04-19 浙江工业大学 A kind of shops based on yolo and centroid tracking enters shop rate statistical method
CN109859250B (en) * 2018-11-20 2023-08-18 北京悦图遥感科技发展有限公司 Aviation infrared video multi-target detection and tracking method and device
CN109859250A (en) * 2018-11-20 2019-06-07 北京悦图遥感科技发展有限公司 A kind of outer video multi-target detection of aviation red and tracking and device
CN111277816A (en) * 2018-12-05 2020-06-12 北京奇虎科技有限公司 Testing method and device of video detection system
CN109784173A (en) * 2018-12-14 2019-05-21 合肥阿巴赛信息科技有限公司 A kind of shop guest's on-line tracking of single camera
CN109743497B (en) * 2018-12-21 2020-06-30 创新奇智(重庆)科技有限公司 Data set acquisition method and system and electronic device
CN109743497A (en) * 2018-12-21 2019-05-10 创新奇智(重庆)科技有限公司 Data set acquisition method, system and electronic device
CN109903312A (en) * 2019-01-25 2019-06-18 北京工业大学 Running distance statistics method for football players based on video multi-target tracking
CN111753587B (en) * 2019-03-28 2023-09-29 杭州海康威视数字技术股份有限公司 Ground falling detection method and device
CN111753587A (en) * 2019-03-28 2020-10-09 杭州海康威视数字技术股份有限公司 Method and device for detecting falls to the ground
CN110163103A (en) * 2019-04-18 2019-08-23 中国农业大学 Live pig behavior recognition method and apparatus based on video images
CN110135317A (en) * 2019-05-08 2019-08-16 深圳达实智能股份有限公司 Behavior monitoring and management system and method based on a collaborative computing system
CN110378200A (en) * 2019-06-03 2019-10-25 特斯联(北京)科技有限公司 Intelligent security prompting apparatus and method based on behavior feature clustering
WO2021027632A1 (en) * 2019-08-09 2021-02-18 北京字节跳动网络技术有限公司 Image special effect processing method, apparatus, electronic device, and computer-readable storage medium
CN110795595A (en) * 2019-09-10 2020-02-14 安徽南瑞继远电网技术有限公司 Video structured storage method, device, equipment and medium based on edge computing
CN110795595B (en) * 2019-09-10 2024-03-05 安徽南瑞继远电网技术有限公司 Video structured storage method, device, equipment and medium based on edge computing
CN111126152B (en) * 2019-11-25 2023-04-11 国网信通亿力科技有限责任公司 Multi-target pedestrian detection and tracking method based on video
CN111126152A (en) * 2019-11-25 2020-05-08 国网信通亿力科技有限责任公司 Video-based multi-target pedestrian detection and tracking method
CN110706266B (en) * 2019-12-11 2020-09-15 北京中星时代科技有限公司 Aerial target tracking method based on YOLOv3
CN110706266A (en) * 2019-12-11 2020-01-17 北京中星时代科技有限公司 Aerial target tracking method based on YOLOv3
CN111209807A (en) * 2019-12-25 2020-05-29 航天信息股份有限公司 YOLOv3-based video structuring method and system
CN111524113A (en) * 2020-04-17 2020-08-11 中冶赛迪重庆信息技术有限公司 Lifting chain abnormality identification method, system, equipment and medium
CN111814783A (en) * 2020-06-08 2020-10-23 三峡大学 Accurate license plate positioning method based on license plate vertex deviation estimation
CN111800507A (en) * 2020-07-06 2020-10-20 湖北经济学院 Traffic monitoring method and traffic monitoring system
CN111914839A (en) * 2020-07-28 2020-11-10 三峡大学 Synchronous end-to-end license plate positioning and identifying method based on YOLOv3
CN111914839B (en) * 2020-07-28 2024-03-19 特微乐行(广州)技术有限公司 Synchronous end-to-end license plate positioning and identifying method based on YOLOv3
CN113705643A (en) * 2021-08-17 2021-11-26 荣耀终端有限公司 Target detection method and device and electronic equipment

Also Published As

Publication number Publication date
CN108009473B (en) 2021-08-24

Similar Documents

Publication Publication Date Title
CN108009473A (en) Video structural processing method, system and storage device based on target behavior attributes
CN108053427A (en) Improved multi-target tracking method, system and device based on KCF and Kalman filtering
CN108062349A (en) Video monitoring method and system based on video structured data and deep learning
CN108052859A (en) Anomaly detection method, system and device based on clustered optical flow features
CN110738127B (en) Helmet identification method based on unsupervised deep learning neural network algorithm
Liu et al. A computer vision system for early stage grape yield estimation based on shoot detection
CN110852283A (en) Helmet wearing detection and tracking method based on improved YOLOv3
CN102521565B (en) Garment identification method and system for low-resolution video
CN102542289B (en) Pedestrian volume statistics method based on multiple Gaussian counting models
CN103824070B (en) Rapid pedestrian detection method based on computer vision
CN110163114A (en) Face angle and face blur analysis method, system and computer device
CN111091098B (en) Training method of detection model, detection method and related device
CN104036236B (en) Face gender recognition method based on multi-parameter exponential weighting
CN104361327A (en) Pedestrian detection method and system
CN109902560A (en) Fatigue driving early-warning method based on deep learning
CN106355154B (en) Method for detecting frequent passing of people in surveillance video
CN107133569A (en) Multi-granularity annotation method for surveillance video based on large-scale multi-label learning
CN108921119A (en) Real-time obstacle detection and classification method
CN106331636A (en) Intelligent video monitoring system and method for oil pipelines based on behavior event triggering
CN107256017B (en) Route planning method and system
He et al. A robust method for wheatear detection using UAV in natural scenes
CN112396658A (en) Indoor personnel positioning method and positioning system based on video
CN110766123A (en) Fry counting system and fry counting method
CN115841649A (en) Multi-scale people counting method for complex urban scenes
Fang et al. Traffic police gesture recognition by pose graph convolutional networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant