CN113723374A - Video-based alarm method and related device for identifying conflicts between users - Google Patents


Info

Publication number
CN113723374A
Authority
CN
China
Prior art keywords: user, bus, target, grids, video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111285941.0A
Other languages: Chinese (zh)
Other versions: CN113723374B (en)
Inventor
张建军
王东阳
陶政文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Tongbada Electric Technology Co ltd
Guangzhou Tongda Auto Electric Co Ltd
Original Assignee
Guangzhou Tongbada Electric Technology Co ltd
Guangzhou Tongda Auto Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Tongbada Electric Technology Co ltd, Guangzhou Tongda Auto Electric Co Ltd filed Critical Guangzhou Tongbada Electric Technology Co ltd
Priority to CN202111285941.0A
Publication of CN113723374A
Application granted
Publication of CN113723374B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G08 - SIGNALLING
    • G08B - SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 - Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02 - Alarms for ensuring the safety of persons
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 - Television systems
    • H04N7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/183 - Closed-circuit television [CCTV] systems for receiving images from a single remote source
    • H04N7/185 - Closed-circuit television [CCTV] systems for receiving images from a single remote source from a mobile camera, e.g. for remote control


Abstract

The invention relates to the technical field of safety and provides a video-based alarm method and related device for identifying conflicts between users. The method comprises: calling a camera to collect video data of the environment inside a bus; tracking users riding the bus in the video data to generate trajectories of their movement in the bus; detecting, in the video data, the emotions users generate in the bus; if a user generates a negative emotion, determining that user to be a target user; detecting the relationship between two target users according to the trajectories; and, if the relationship is a conflict, executing an alarm operation on the two target users according to the trajectories. This removes the need for manual observation by crew such as drivers, conductors and security guards, greatly reducing the cost of manual observation and improving efficiency; the users' movement trajectories serve as evidence, preventing details of an incident from being missed, ensuring the incident is handled appropriately, and reducing safety risks on the bus.

Description

Video-based alarm method and related device for identifying conflicts between users
Technical Field
The invention relates to the technical field of safety, and in particular to a video-based alarm method and related device for identifying conflicts between users.
Background
In public areas, users come and go frequently and sudden incidents sometimes occur, giving rise to conflicts between users and creating safety hazards for other users nearby.
Although cameras are installed in public areas, conflicts between users arise in an instant. User behavior in public areas is rich, the semantics of actions are difficult to identify directly, and the accuracy of detecting conflicts between users is therefore low. As a result, staff in public areas currently rely mostly on manual observation, and video surveillance serves mainly as reference evidence after a conflict has occurred.
Taking the bus as an example: the bus is one of the public transport means by which users travel, and users board and alight frequently, so safety problems exist when riding the bus, and incidents such as theft, friction and altercations can occur at any time.
At present, buses rely mainly on crew such as drivers, conductors and security personnel to observe safety-related incidents manually and mediate them.
However, manual observation is costly, details of an incident are easily missed, incidents are difficult to handle, and the potential safety hazard is large.
Disclosure of Invention
The embodiments of the invention provide a video-based alarm method and related device for identifying conflicts between users, aiming to solve the problem of how to reuse existing video equipment and improve the efficiency of handling safety-related incidents between users without increasing cost.
In a first aspect, an embodiment of the present invention provides an alarm method for identifying a user conflict based on a video, including:
calling a camera to acquire video data of an environment inside a bus;
tracking a user riding the bus in the video data to generate a trajectory of movement of the user in the bus;
detecting emotions generated in the bus by the user in the video data;
if the user generates negative emotion, determining the user as a target user;
detecting the relation between the two target users according to the track;
and if the relationship is a conflict, executing an alarm operation on the two target users according to the track.
Optionally, the tracking a user riding the bus in the video data to generate a trajectory of the user's movement in the bus comprises:
dividing a plurality of grids on a top view of the bus;
identifying a user riding the bus in the video data to determine an original location of the user at the bus;
projecting the original position to a top view of the bus to obtain a target position;
and recording the grids where the target positions are located according to a time sequence, wherein the grids are used as the moving tracks of the user in the bus.
Optionally, the dividing the plurality of grids on the top view of the bus includes:
determining a seat, a handrail and a pedestrian path on the top view of the bus;
arranging the seats as a grid;
setting an area surrounding the handrail in the pedestrian path as a grid;
and setting the area of the pedestrian path not surrounding the handrail as a grid.
Optionally, the recording the grid where the target location is located in time sequence as a track of the user moving in the bus includes:
dividing a plurality of unit times in sequence;
identifying a time during which the target location stays on the grid for each of the units of time;
determining the grid with the longest dwell time as the grid in which the target position is located in the unit time;
and arranging the grids according to the sequence of the unit time to be used as the moving track of the user in the bus.
Optionally, the detecting, in the video data, an emotion generated in the bus by the user includes:
splitting the video data into synchronous image data and audio data;
identifying expression information generated by the face of the user and action information generated by limbs in the image data;
identifying voice information uttered by the user in the audio data;
and if the user's face shows one or more of the expressions anger, sadness, fear, surprise and disgust, the user's limbs produce one or more of the actions pulling, beating and kicking, and the user utters speech of an offensive or abusive nature, determining that the user has generated a negative emotion in the bus.
Optionally, the detecting a relationship between the two target users according to the trajectory includes:
aligning the trajectories of the two target users in a temporal sequence, the trajectories including a plurality of grids divided on a top view of the bus;
if the alignment is finished, calculating the distance between the grids on the tracks of the two target users;
if the proportion of first target distances is greater than or equal to a first proportion threshold, and the number of second target distances is non-zero with a proportion less than or equal to a second proportion threshold, determining that the relationship between the two target users is a conflict;
wherein a first target distance is a distance greater than or equal to a first distance threshold, a second target distance is a distance less than or equal to a second distance threshold, the first proportion threshold is greater than the second proportion threshold, and the first distance threshold is greater than the second distance threshold.
Optionally, said calculating a distance between said grids on said trajectories of two said target users comprises:
searching the grids with the same unit time on the tracks of the two target users;
inquiring the minimum number of other grids passing from one grid to the other grid according to a walking rule on the top view of the bus as the distance between the grids;
wherein the grids include seats, handrails and walkways, and the walking rule includes:
walking between a seat and an adjacent handrail or walkway is prohibited, walking between a seat and an adjacent seat is permitted, and walking on both handrails and walkways is permitted.
In a second aspect, an embodiment of the present invention further provides an alarm device for identifying a user conflict based on a video, including:
the video data acquisition module is used for calling the camera to acquire video data of the environment in the bus;
a track generation module, configured to track, in the video data, a user riding the bus to generate a track of movement of the user in the bus;
the emotion detection module is used for detecting emotion generated in the bus by the user in the video data;
the target user determining module is used for determining the user as a target user if the user generates negative emotion;
the user relationship detection module is used for detecting the relationship between the two target users according to the track;
and the alarm operation execution module is used for executing an alarm operation on the two target users according to the track if the relationship is a conflict.
Optionally, the trajectory generation module is further configured to:
dividing a plurality of grids on a top view of the bus;
identifying a user riding the bus in the video data to determine an original location of the user at the bus;
projecting the original position to a top view of the bus to obtain a target position;
and recording the grids where the target positions are located according to a time sequence, wherein the grids are used as the moving tracks of the user in the bus.
Optionally, the trajectory generation module is further configured to:
determining a seat, a handrail and a pedestrian path on the top view of the bus;
arranging the seats as a grid;
setting an area surrounding the handrail in the pedestrian path as a grid;
and setting the area of the pedestrian path not surrounding the handrail as a grid.
Optionally, the trajectory generation module is further configured to:
dividing a plurality of unit times in sequence;
identifying a time during which the target location stays on the grid for each of the units of time;
determining the grid with the longest dwell time as the grid in which the target position is located in the unit time;
and arranging the grids according to the sequence of the unit time to be used as the moving track of the user in the bus.
Optionally, the emotion detection module is further configured to:
splitting the video data into synchronous image data and audio data;
identifying expression information generated by the face of the user and action information generated by limbs in the image data;
identifying voice information uttered by the user in the audio data;
and if the user's face shows one or more of the expressions anger, sadness, fear, surprise and disgust, the user's limbs produce one or more of the actions pulling, beating and kicking, and the user utters speech of an offensive or abusive nature, determining that the user has generated a negative emotion in the bus.
Optionally, the user relationship detection module is further configured to:
aligning the trajectories of the two target users in a temporal sequence, the trajectories including a plurality of grids divided on a top view of the bus;
if the alignment is finished, calculating the distance between the grids on the tracks of the two target users;
if the proportion of first target distances is greater than or equal to a first proportion threshold, and the number of second target distances is non-zero with a proportion less than or equal to a second proportion threshold, determining that the relationship between the two target users is a conflict;
wherein a first target distance is a distance greater than or equal to a first distance threshold, a second target distance is a distance less than or equal to a second distance threshold, the first proportion threshold is greater than the second proportion threshold, and the first distance threshold is greater than the second distance threshold.
Optionally, the user relationship detection module is further configured to:
searching the grids with the same unit time on the tracks of the two target users;
inquiring the minimum number of other grids passing from one grid to the other grid according to a walking rule on the top view of the bus as the distance between the grids;
wherein the grids include seats, handrails and walkways, and the walking rule includes:
walking between a seat and an adjacent handrail or walkway is prohibited, walking between a seat and an adjacent seat is permitted, and walking on both handrails and walkways is permitted.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement a method of identifying a user conflict based on video as described in the first aspect.
In a fourth aspect, the present invention further provides a computer-readable storage medium on which a computer program is stored which, when executed by a processor, implements the video-based method for identifying conflicts between users according to the first aspect.
In these embodiments, a camera is called to collect video data of the environment inside the bus; users riding the bus are tracked in the video data to generate trajectories of their movement in the bus; the emotions users generate in the bus are detected in the video data; if a user generates a negative emotion, that user is determined to be a target user; the relationship between two target users is detected according to their trajectories; and if the relationship is a conflict, an alarm operation is executed on the two target users according to the trajectories. Conflicts between users are thus detected along two dimensions, the users' movement trajectories and the emotions they generate, which ensures detection accuracy while reusing existing cameras. Manual observation by crew such as drivers, conductors and security guards is no longer needed, greatly reducing the cost of manual observation and improving efficiency; and the movement trajectories can serve as evidence, preventing details of an incident from being missed, ensuring the incident is handled appropriately, and reducing safety risks on the bus.
Drawings
Fig. 1 is a flowchart of an alarm method for identifying a user conflict based on video according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an alarm device for identifying a user conflict based on video according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a computer device according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a video-based alarm method for identifying conflicts between users according to embodiment one of the present invention. This embodiment is applicable to detecting whether a conflict has occurred between users from their trajectories and emotions. The method may be executed by an alarm device for user conflicts, which may be implemented in software and/or hardware, may be configured in a computer device, and specifically includes the following steps:
step 101, calling a camera to acquire video data of the environment inside the bus.
A bus is a motor vehicle for public transit that follows a fixed route on urban roads, with multiple stops along the route and a dedicated line number (such as 960, 195A or 35), departing at fixed times and stopping at each station for passengers to board and alight. Buses are generally box-shaped, with windows and seats.
In urban areas, buses generally travel at 25-50 km/h; in suburban areas they can reach 80 km/h.
One or more cameras may be installed in the bus. The cameras may be fixed in orientation, operating simultaneously so as to cover the entire interior environment of the bus, or they may be mounted on gimbals and rotate, with the rotation range covering the entire interior.
After the bus starts and travels along its designated route, users board and/or alight at each stop. The cameras can then be activated to continuously collect video data of the bus interior environment and record users' actions on the bus.
And 102, tracking the user riding the bus in the video data to generate a track of the user moving in the bus.
In this embodiment, users riding the bus may be treated as targets and tracked in the video data by a target tracking algorithm, generating a movement trajectory for each user.
In a specific implementation, the target tracking algorithm may include the following categories:
1. classical tracking algorithm
The classical tracking algorithm mainly models or tracks the target features according to the target.
1.1 method for modeling based on target model
These methods model the target's appearance and then locate the target in subsequent frames; examples include region matching, feature-point tracking, active-contour tracking and optical flow. Feature-matching methods first extract the target's features (such as SIFT features, SURF features or Harris corners) and then find the most similar features in subsequent frames to localize the target.
1.2 search-based methods
Methods based on modeling the target must process the whole frame and therefore have poor real-time performance, so a prediction algorithm is added to tracking: the target is searched for near the predicted position, narrowing the search range. One class of prediction algorithms includes Kalman filtering and particle filtering. Another way to narrow the search range is the kernel approach, which applies the principle of steepest descent to iterate the target template along the gradient direction until it reaches the optimal position, as in the Meanshift and Camshift algorithms.
2. Tracking algorithm based on kernel correlation filtering
Correlation filtering from the communications field (which measures the similarity of two signals) has been introduced into target tracking, producing correlation-filter trackers such as MOSSE, CSK, KCF, BACF and SAMF; these can reach hundreds of frames per second and are widely applicable in real-time tracking systems.
3. Tracking algorithm based on deep learning
Against the background of big data, deep learning can train network models whose convolutional features have stronger expressive power. In target tracking, the initial application was to feed the features learned by a network directly into a correlation-filter or Struck tracking framework, obtaining better tracking results, as in the DeepSRDCF method; in essence, the feature representation from convolutional outputs is better than HOG or CN features.
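As a concrete illustration of the classical, model-free end of this spectrum, the sketch below associates per-frame detections into per-user tracks by nearest-centroid matching. It is a simplified stand-in for the trackers surveyed above, not the patent's method; the function name, the greedy matching and the 50-pixel gating distance are illustrative assumptions.

```python
from math import hypot

def track_centroids(frames, max_dist=50.0):
    """Associate per-frame detections (x, y) into per-user tracks by
    nearest-centroid matching; a simplified stand-in for trackers such
    as KCF, MOSSE, or deep-feature trackers."""
    tracks = {}        # track_id -> list of (x, y) positions
    next_id = 0
    for detections in frames:
        unmatched = list(detections)
        # Greedily extend each live track with its closest detection.
        for tid, points in tracks.items():
            last = points[-1]
            best = min(unmatched,
                       key=lambda p: hypot(p[0] - last[0], p[1] - last[1]),
                       default=None)
            if best is not None and hypot(best[0] - last[0], best[1] - last[1]) <= max_dist:
                points.append(best)
                unmatched.remove(best)
        # Any detection left over starts a new track (a newly seen user).
        for p in unmatched:
            tracks[next_id] = [p]
            next_id += 1
    return tracks
```

Real systems replace the greedy nearest-match step with appearance features and a motion model, but the association skeleton is the same.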
In one embodiment of the present invention, step 102 may include the steps of:
and step 1021, dividing a plurality of grids on the top view of the bus.
In this embodiment, most buses have a single deck and some have two. On each deck users either stand or sit, so the top view of each deck of the bus can be gridded, that is, divided into multiple grids, and the trajectory of a user's movement on the bus recorded in terms of these grids.
Considering that each facility on the bus has a specific role, the top view of the bus can be gridded according to those facilities.
Further, the seats, handrails and walkway are located on the top view of the bus.
A seat accommodates a single user, so a single seat may be set as a single grid.
Users generally stand at a handrail, so the area of the walkway surrounding each handrail can be set as several grids of appropriate length and width; these grids can be regarded as handrail grids.
The walkway is where users move, so the area of the walkway not surrounding a handrail can be set as several grids of appropriate length and width.
Furthermore, parts of the bus other than the seats, handrails and walkway (such as the driver's seat, doors and engine cover) are excluded from the grid division.
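Under the assumptions above, the division can be sketched as a mapping from a hand-drawn top-view layout to labeled grid cells. The layout encoding ('A' seat, 'B' handrail area, 'C' walkway, '.' excluded) and the ID scheme are illustrative assumptions, chosen to match the track notation used later (A6, B3, C1, ...).

```python
def divide_grids(layout):
    """Number the cells of each facility type left-to-right, top-to-bottom,
    so a cell gets an ID such as 'A1' (seat), 'B1' (handrail area) or
    'C2' (walkway); '.' cells (driver's seat, doors, engine cover) are
    excluded from the grid."""
    counts = {'A': 0, 'B': 0, 'C': 0}
    grids = {}
    for r, row in enumerate(layout):
        for c, code in enumerate(row):
            if code in counts:
                counts[code] += 1
                grids[(r, c)] = f"{code}{counts[code]}"
    return grids

# Hypothetical 2-row slice of a bus top view: seats flank a walkway
# with one handrail area; '.' marks a door.
cells = divide_grids(["A.C", "ABC"])
```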
Step 1022, identify the user riding the bus in the video data to determine the original location of the user on the bus.
In general, the driver sits in the driver's seat, and conductors and security guards sit in specific positions or wear specific clothing (such as armbands or uniforms). This information is marked in advance as features, and anyone not matching these marks can be regarded as a user riding the bus.
In the video data, the position at which each riding user appears in the bus can be computed as the original position using CenterNet or a network improved upon CenterNet.
And 1023, projecting the original position to a top view of the bus to obtain a target position.
With the camera's extrinsic and intrinsic parameters calibrated in advance, the original position in the camera coordinate system is converted into the world coordinate system using those parameters, and the X and Y coordinates of the result are taken as the position projected onto the top view of the bus, recorded as the target position.
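For points on the floor plane, the composition of calibrated intrinsics and extrinsics reduces to a 3x3 ground-plane homography, so the projection step can be sketched as below. Treating the calibration as a single homography `H` is an assumption; its values would come from offline calibration.

```python
def project_to_topview(px, py, H):
    """Map an image point (the user's foot point, i.e. the 'original
    position') to floor-plane coordinates on the bus top view via a
    ground-plane homography H (3x3 nested list from offline calibration)."""
    # Homogeneous multiply [px, py, 1] by H, then dehomogenize.
    x, y, w = (H[i][0] * px + H[i][1] * py + H[i][2] for i in range(3))
    return (x / w, y / w)
```

The resulting (X, Y) pair is then snapped to whichever grid cell of the top view contains it.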
And step 1024, recording grids where the target positions are located according to the time sequence, and taking the grids as the moving tracks of the users in the bus.
The user's walking is continuous in time, so the grids containing the target position can be recorded in chronological order as the user's movement trajectory in the bus.
In this embodiment, gridding simplifies the environment on the bus: recording the grids a user passes through in chronological order as the movement trajectory simplifies an otherwise complex actual trajectory, improves fault tolerance, facilitates subsequent operations on the trajectories of different users, desensitizes users' personal data and protects their privacy.
In a specific implementation, a duration may be set with reference to users' movement speed, dividing time into successive units of that duration, with the default that a user moves at most one grid per unit time. The duration should not be too short, or it fails to simplify and desensitize, and not too long, or the user's movement trajectory loses continuity; for example, a unit time of 400 ms may be used.
Because a user may straddle grids while moving, the time the target position stays in each grid within each unit time can be identified; the grid with the longest dwell time is determined as the grid where the target position is located for that unit time, and the grids are arranged in unit-time order as the user's movement trajectory in the bus.
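The dwell-time reduction just described can be sketched as follows. The 400 ms unit follows the example above; approximating dwell time by the number of samples falling in each unit is an assumption that holds for evenly spaced video frames.

```python
from collections import Counter

def grid_track(samples, unit_ms=400):
    """Reduce timestamped grid samples (t_ms, grid_id) to one grid per
    unit time: the grid the user occupied longest within that window."""
    per_unit = {}
    for t_ms, grid in samples:
        per_unit.setdefault(t_ms // unit_ms, []).append(grid)
    track = []
    for unit in sorted(per_unit):
        # Grid with the most samples ~ longest dwell time in the unit.
        track.append(Counter(per_unit[unit]).most_common(1)[0][0])
    return track
```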
And 103, detecting the emotion generated in the bus by the user in the video data.
The emotions a riding user generates in the bus are expressed through several dimensions, so in this embodiment they can be detected in real time from different dimensions of the video data.
In a specific implementation, the video data may be split into synchronized image data and audio data.
In the image data, expression information generated by the face of the user and motion information generated by the limbs can be identified through classification algorithms such as CNN and PCNN.
In the audio data, voice information issued by a user can be recognized by a voice recognition algorithm.
If the user's face shows one or more of the expressions anger, sadness, fear, surprise and disgust, the limbs produce one or more of the actions pulling, beating and kicking, and the user utters speech of an offensive or abusive nature, it is determined that the user has generated a negative emotion in the bus.
In the embodiment, negative emotion generated in the bus by the user is detected through facial expression, limb action and voice, multi-modal feature extraction is realized, and therefore the accuracy of detecting the negative emotion generated in the bus by the user is improved.
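Once per-modality recognizers have produced their labels, the fusion rule reads as a conjunction of three checks. The sketch below follows the text's "and" literally; treating the combination as a strict conjunction (rather than, say, any two of three modalities) is an interpretive assumption.

```python
NEGATIVE_EXPRESSIONS = {"anger", "sadness", "fear", "surprise", "disgust"}
AGGRESSIVE_ACTIONS = {"pulling", "beating", "kicking"}

def is_negative_emotion(expressions, actions, speech_is_abusive):
    """Fuse the three modalities: negative facial expression, aggressive
    limb action, and offensive/abusive speech must all be present."""
    return (bool(NEGATIVE_EXPRESSIONS & set(expressions))
            and bool(AGGRESSIVE_ACTIONS & set(actions))
            and speech_is_abusive)
```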
And step 104, if the user generates negative emotion, determining that the user is the target user.
If a user generates a negative emotion on the bus, a sudden incident such as theft, friction or an altercation may have occurred, changing the user's emotional state; such users can then be marked as target users.
Of course, besides incidents such as theft, friction or altercations, normal events are also possible: for example, two users who are friends may converse normally while dimensions such as facial expression, body movement and speech are detected as negative during the conversation.
And 105, detecting the relation between the two target users according to the track.
If an incident such as theft, friction or an altercation occurs on the bus, it generally involves two target users, and the movement trajectories between the two have characteristic patterns. Therefore, by comparing the movement trajectories of the two target users, the relationship between them can be detected and it can be inferred whether such an incident has occurred between them.
In one embodiment of the present invention, step 105 may include the steps of:
step 1051, align the trajectories of the two target users in time order.
Different target users' movement trajectories cover different time spans, so the trajectories of the two target users can be aligned on a time axis in chronological order, facilitating the subsequent comparison of the trajectories.
Further, the trajectory includes a plurality of grids divided on the top view of the bus, each grid associated with a time (e.g., a unit time); where a trajectory has no grid at some time (e.g., a unit time), it can be padded with 0.
For example, the target user a is in a bus at time T1, and the movement trajectory is as follows:
T1 T2 T3 T4 T5 T6 T7 T8 T9 ……
C1 C2 C3 B3 B4 B4 B4 C4 C5 ……
the target user B is in a bus at the time T3, and the moving track is as follows:
T3 T4 T5 T6 T7 T8 T9 ……
C1 C2 C3 C4 C5 C6 A6 ……
wherein T represents a unit time, A represents a seat, B represents a grab bar, and C represents a pedestrian path.
The trajectory of the movement of the target user a is aligned with the trajectory of the movement of the target user B as follows:
T1 T2 T3 T4 T5 T6 T7 T8 T9 ……
C1 C2 C3 B3 B4 B4 B4 C4 C5 ……
0 0 C1 C2 C3 C4 C5 C6 A6 ……
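The alignment in this example can be sketched as zero-padding both tracks onto a shared unit-time axis. Representing a track as a pair (start unit, list of grid IDs) is an assumed encoding.

```python
def align_tracks(a, b):
    """Align two (start_unit, [grid IDs]) tracks on a shared time axis,
    padding units where a user was not yet (or no longer) present with 0."""
    a_start, a_grids = a
    b_start, b_grids = b
    start = min(a_start, b_start)
    end = max(a_start + len(a_grids), b_start + len(b_grids))

    def expand(s, grids):
        # Pad before the track starts and after it ends.
        return [0] * (s - start) + list(grids) + [0] * (end - (s + len(grids)))

    return expand(a_start, a_grids), expand(b_start, b_grids)

# The worked example above: user A boards at T1, user B at T3.
a = (1, ["C1", "C2", "C3", "B3", "B4", "B4", "B4", "C4", "C5"])
b = (3, ["C1", "C2", "C3", "C4", "C5", "C6", "A6"])
aligned_a, aligned_b = align_tracks(a, b)
```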
step 1052, if the alignment is completed, calculating the distance between the grids on the tracks of the two target users.
If the alignment of the two target users' trajectories is complete, the grids can be regarded as characters and the two trajectories as character strings of equal length, and the distance between grids is then calculated over the two trajectories.
Here the distance is a movement distance, referring to the minimum number of moves required to go from one grid to the other.
In a specific implementation, grids with the same unit time can be searched on the tracks of two target users, and the minimum number of other grids passing through from one grid to the other grid according to the walking rule is inquired on the top view of the bus and is used as the distance between the grids.
The grids include seats, handrails, and pedestrian paths, and the walking rules are as follows:
walking between a seat and an adjacent handrail or pedestrian-path grid is prohibited (by default, a user walks along handrail and pedestrian-path grids rather than climbing across seats from them); walking between two adjacent seats is permitted (by default, a user may move across seats); and walking on both handrail and pedestrian-path grids is permitted.
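As a sketch, the walking-rule distance can be computed with a breadth-first search over the grid adjacency graph. The grid-type labels ("A"/"B"/"C", following the letters in the example above), the adjacency representation, and the function names are illustrative assumptions:

```python
from collections import deque

def edge_allowed(type_a, type_b):
    """Walking rule from the text: a step between a seat ("A") and an
    adjacent handrail ("B") or pedestrian-path ("C") grid is prohibited;
    seat-to-seat steps and any step among handrail/path grids are allowed."""
    walkable = {"B", "C"}
    if type_a == "A" and type_b == "A":
        return True
    return type_a in walkable and type_b in walkable

def grid_distance(grid_types, adjacency, src, dst):
    """BFS over the top-view grid graph. Returns the minimum number of
    *other* grids passed through when walking from src to dst (so two
    directly adjacent grids have distance 0), or None if dst is
    unreachable under the walking rules."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, passed = queue.popleft()
        for nxt in adjacency[node]:
            if nxt in seen or not edge_allowed(grid_types[node], grid_types[nxt]):
                continue
            if nxt == dst:
                return passed
            seen.add(nxt)
            queue.append((nxt, passed + 1))
    return None
```

Counting "other grids passed through" (rather than steps taken) mirrors the wording of the text; a step-counting variant would simply be this value plus one.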
Step 1053, if the proportion of first target distances is greater than or equal to a first proportion threshold, and the number of second target distances is non-zero with a proportion less than or equal to a second proportion threshold, determine that the relationship between the two target users is a contradictory relationship.
Here, a first target distance is a distance greater than or equal to a first distance threshold, a second target distance is a distance less than or equal to a second distance threshold, the first proportion threshold is greater than the second proportion threshold, and the first distance threshold is greater than the second distance threshold.
When the distance between the two target users is a first target distance, the two target users are not in close contact, and at such a distance the probability of incidents such as theft, jostling, or shoving is low.
When the distance between the two target users is a second target distance, the two target users are in close contact, and at such a distance the probability of incidents such as theft, jostling, or shoving is higher.
A proportion of first target distances greater than or equal to the first proportion threshold indicates that the two target users are far apart most of the time, so the probability that they are strangers is high; conversely, the probability that they are companions (e.g., relatives or friends) is low.
A non-zero but small proportion of second target distances indicates that for a small part of the time the two target users were close to each other, and incidents such as theft, jostling, or shoving between two strangers (the target users) are likely to occur while moving at such close range.
If the above conditions are all met, the relationship between the two target users can be regarded as a contradictory relationship, that is, a conflict has arisen between the two target users due to an incident such as theft, jostling, or shoving.
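The decision rule of step 1053 can be sketched as follows. All concrete threshold values here are illustrative placeholders; the text only fixes their ordering (first distance threshold above second, first proportion threshold above second):

```python
def is_contradiction(distances,
                     first_distance_threshold=4,
                     second_distance_threshold=1,
                     first_ratio_threshold=0.7,
                     second_ratio_threshold=0.1):
    """Classify the relationship between two target users from the
    per-unit-time grid distances along their aligned trajectories.

    Contradictory = far apart most of the time (likely strangers) yet
    briefly in close contact (a window for theft, jostling, shoving).
    """
    if not distances:
        return False
    n = len(distances)
    far = sum(1 for d in distances if d >= first_distance_threshold)
    near = sum(1 for d in distances if d <= second_distance_threshold)
    mostly_strangers = far / n >= first_ratio_threshold
    brief_close_contact = near > 0 and near / n <= second_ratio_threshold
    return mostly_strangers and brief_close_contact
```

For example, nine far samples and a single close sample satisfy both conditions, whereas users who stay close throughout (companions) or never come close at all do not.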
Step 106, if the relationship is a contradictory relationship, execute an alarm operation on the two target users according to the trajectories.
If the relationship between the two target users is determined to be contradictory, an alarm operation is executed for the two target users: the remote monitoring end is notified, and the trajectories of the two target users are sent to the monitoring end as evidence, so that a worker can conveniently retrieve the video data for review. In particular, the segments in which the distance between the two target users is a second target distance are located in the video data, allowing the worker to quickly find and review the segments in which incidents such as theft, jostling, or shoving may have occurred.
After receiving the alarm, the worker at the monitoring end reviews it against the video data; if an incident such as theft, jostling, or shoving is confirmed, the worker can connect to the loudspeaker of the bus to issue a warning and mediate in response.
In this embodiment, video data is collected for the environment inside the bus, a user riding the bus is tracked in the video data to generate a trajectory of the user's movement in the bus, and the emotion the user exhibits in the bus is detected in the video data. If a user exhibits negative emotion, the user is determined to be a target user, the relationship between two target users is detected according to their trajectories, and if the relationship is a contradictory relationship, an alarm operation is executed on the two target users according to the trajectories. Detecting conflicts between users through two dimensions, namely the users' movement trajectories on the bus and the emotions they exhibit, ensures the accuracy of conflict detection. The existing camera is reused, freeing crew members such as the driver, conductor, and security guard from manual observation, which greatly reduces the cost of manual observation and improves efficiency. In addition, the users' movement trajectories can serve as evidence, preventing details of an incident from being missed, supporting a reasonable response to the incident, and reducing the safety risk on the bus.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Embodiment Two
Fig. 2 is a structural block diagram of an alarm device for identifying user contradictions based on video according to the second embodiment of the present invention, which may specifically include the following modules:
the video data acquisition module 201 is used for calling a camera to acquire video data of the environment in the bus;
a trajectory generation module 202, configured to track, in the video data, a user riding the bus to generate a trajectory of movement of the user in the bus;
the emotion detection module 203 is used for detecting emotion generated in the bus by the user in the video data;
a target user determination module 204, configured to determine that the user is a target user if the user generates a negative emotion;
a user relationship detection module 205, configured to detect a relationship between two target users according to the trajectory;
and an alarm operation executing module 206, configured to execute an alarm operation on the two target users according to the trajectory if the relationship is a contradictory relationship.
In an embodiment of the present invention, the trajectory generation module 202 is further configured to:
dividing a plurality of grids on a top view of the bus;
identifying a user riding the bus in the video data to determine an original position of the user in the bus;
projecting the original position to a top view of the bus to obtain a target position;
and recording the grids where the target positions are located according to a time sequence, wherein the grids are used as the moving tracks of the user in the bus.
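As a minimal sketch of the projection step, assuming a 3x3 planar homography H, calibrated offline, that maps camera-image pixel coordinates to top-view coordinates (the patent does not specify the projection method, so this is one plausible realization; the function name is an assumption):

```python
def project_to_topview(H, point):
    """Apply a 3x3 homography H (row-major nested lists) to an image
    point (x, y), returning the corresponding top-view position.
    The target grid is then whichever grid cell contains this position."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)  # homogeneous normalization
```

In practice H could be estimated once per camera from a few point correspondences between the camera view and the bus floor plan.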
In an embodiment of the present invention, the trajectory generation module 202 is further configured to:
determining a seat, a handrail and a pedestrian path on the top view of the bus;
arranging the seats as a grid;
setting an area surrounding the handrail in the pedestrian path as a grid;
and setting the area of the pedestrian path not surrounding the handrail as a grid.
In an embodiment of the present invention, the trajectory generation module 202 is further configured to:
dividing a plurality of unit times in sequence;
identifying a time during which the target location stays on the grid for each of the units of time;
determining the grid with the longest dwell time as the grid in which the target position is located in the unit time;
and arranging the grids according to the sequence of the unit time to be used as the moving track of the user in the bus.
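The per-unit-time reduction above can be sketched as follows; approximating dwell time by the number of per-frame position samples falling on each grid is an assumption of this sketch (a full implementation would weight samples by inter-frame intervals):

```python
from collections import Counter

def dominant_grid_per_unit(samples, unit=1.0):
    """Reduce timestamped grid samples to one grid per unit time.

    `samples` is a list of (timestamp_seconds, grid_label) pairs from
    the frame-by-frame position projection. Within each unit-time
    bucket, the grid with the longest dwell (here: the most samples)
    is kept; the kept grids, in time order, form the trajectory.
    """
    buckets = {}
    for ts, grid in samples:
        idx = int(ts // unit)
        buckets.setdefault(idx, Counter())[grid] += 1
    return [buckets[i].most_common(1)[0][0] for i in sorted(buckets)]
```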
In an embodiment of the present invention, the emotion detection module 203 is further configured to:
splitting the video data into synchronous image data and audio data;
identifying expression information generated by the face of the user and action information generated by limbs in the image data;
identifying voice information uttered by the user in the audio data;
and if the user's face shows one or more of the expressions of anger, sadness, fear, surprise, and disgust, the user's limbs perform one or more of the actions of pulling, beating, and kicking, and the user utters speech of an offensive or abusive nature, determining that the user generates negative emotion in the bus.
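A sketch of this fusion rule, reading the sentence above as requiring all three cues (the translation is ambiguous between a conjunction and a disjunction); the label strings are taken from the text, while the upstream expression, action, and speech classifiers that produce them are assumed and outside this sketch:

```python
NEGATIVE_EXPRESSIONS = {"anger", "sadness", "fear", "surprise", "disgust"}
NEGATIVE_ACTIONS = {"pulling", "beating", "kicking"}

def has_negative_emotion(expressions, actions, speech_flags):
    """Fuse per-user facial-expression labels, limb-action labels, and
    speech flags into a single negative-emotion decision."""
    face_negative = bool(NEGATIVE_EXPRESSIONS & set(expressions))
    body_negative = bool(NEGATIVE_ACTIONS & set(actions))
    speech_negative = bool(speech_flags.get("offensive") or speech_flags.get("abusive"))
    return face_negative and body_negative and speech_negative
```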
In an embodiment of the present invention, the user relationship detecting module 205 is further configured to:
aligning the trajectories of the two target users in a temporal sequence, the trajectories including a plurality of grids divided on a top view of the bus;
if the alignment is finished, calculating the distance between the grids on the tracks of the two target users;
if the occupation ratio of the first target distance is larger than or equal to a first proportion threshold value, the number of the second target distances is nonzero, and the occupation ratio is smaller than or equal to a second proportion threshold value, determining that the relationship between the two target users is a contradictory relationship;
the first target distance is a distance that is greater than or equal to a first distance threshold, the second target distance is a distance that is less than or equal to a second distance threshold, the first proportional threshold is greater than the second proportional threshold, and the first distance threshold is greater than the second distance threshold.
In an embodiment of the present invention, the user relationship detecting module 205 is further configured to:
searching the grids with the same unit time on the tracks of the two target users;
inquiring the minimum number of other grids passing from one grid to the other grid according to a walking rule on the top view of the bus as the distance between the grids;
wherein the grids include seats, handrails, and pedestrian paths, and the walking rules include:
walking between a seat and an adjacent handrail or pedestrian path is prohibited, walking between two adjacent seats is permitted, and walking on both handrails and pedestrian paths is permitted.
The alarm device for identifying user contradictions based on video provided by the embodiment of the present invention can execute the alarm method for identifying user contradictions based on video provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects of the executed method.
Embodiment Three
Fig. 3 is a schematic structural diagram of a computer device according to a third embodiment of the present invention. FIG. 3 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in FIG. 3 is only an example and should not impose any limitation on the scope of use or functionality of embodiments of the present invention.
As shown in FIG. 3, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that couples various system components including the memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the memory 28, for example, implementing a method for identifying a user contradiction based on video provided by an embodiment of the present invention.
Embodiment Four
The fourth embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements each process of the above alarm method for identifying user contradictions based on video and achieves the same technical effects; to avoid repetition, the details are not repeated here.
A computer readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. An alarm method for identifying user contradictions based on videos is characterized by comprising the following steps:
calling a camera to acquire video data of an environment inside a bus;
tracking a user riding the bus in the video data to generate a trajectory of movement of the user in the bus;
detecting emotions generated in the bus by the user in the video data;
if the user generates negative emotion, determining the user as a target user;
detecting the relation between the two target users according to the track;
and if the relationship is a contradictory relationship, executing alarm operation on the two target users according to the track.
2. The method of claim 1, wherein tracking a user riding in the bus in the video data to generate a trajectory of movement of the user in the bus comprises:
dividing a plurality of grids on a top view of the bus;
identifying a user riding the bus in the video data to determine an original position of the user in the bus;
projecting the original position to a top view of the bus to obtain a target position;
and recording the grids where the target positions are located according to a time sequence, wherein the grids are used as the moving tracks of the user in the bus.
3. The method of claim 2, wherein the dividing the plurality of grids on the overhead view of the bus comprises:
determining a seat, a handrail and a pedestrian path on the top view of the bus;
arranging the seats as a grid;
setting an area surrounding the handrail in the pedestrian path as a grid;
and setting the area of the pedestrian path not surrounding the handrail as a grid.
4. The method of claim 2, wherein said chronologically recording the grid on which the target location is located as a trajectory of the user's movement in the bus comprises:
dividing a plurality of unit times in sequence;
identifying a time during which the target location stays on the grid for each of the units of time;
determining the grid with the longest dwell time as the grid in which the target position is located in the unit time;
and arranging the grids according to the sequence of the unit time to be used as the moving track of the user in the bus.
5. The method of claim 1, wherein the detecting, in the video data, the emotion generated by the user in the bus comprises:
splitting the video data into synchronous image data and audio data;
identifying expression information generated by the face of the user and action information generated by limbs in the image data;
identifying voice information uttered by the user in the audio data;
and if the user's face shows one or more of the expressions of anger, sadness, fear, surprise, and disgust, the user's limbs perform one or more of the actions of pulling, beating, and kicking, and the user utters speech of an offensive or abusive nature, determining that the user generates negative emotion in the bus.
6. The method according to any one of claims 1-5, wherein said detecting a relationship between two of said target users from said trajectory comprises:
aligning the trajectories of the two target users in a temporal sequence, the trajectories including a plurality of grids divided on a top view of the bus;
if the alignment is finished, calculating the distance between the grids on the tracks of the two target users;
if the occupation ratio of the first target distance is larger than or equal to a first proportion threshold value, the number of the second target distances is nonzero, and the occupation ratio is smaller than or equal to a second proportion threshold value, determining that the relationship between the two target users is a contradictory relationship;
the first target distance is a distance that is greater than or equal to a first distance threshold, the second target distance is a distance that is less than or equal to a second distance threshold, the first proportional threshold is greater than the second proportional threshold, and the first distance threshold is greater than the second distance threshold.
7. The method of claim 6, wherein said calculating a distance between said grids on said trajectories of two said target users comprises:
searching the grids with the same unit time on the tracks of the two target users;
inquiring the minimum number of other grids passing from one grid to the other grid according to a walking rule on the top view of the bus as the distance between the grids;
wherein the grids include seats, handrails, and pedestrian paths, and the walking rules include:
walking between a seat and an adjacent handrail or pedestrian path is prohibited, walking between two adjacent seats is permitted, and walking on both handrails and pedestrian paths is permitted.
8. An alert device for identifying user conflicts based on video, comprising:
the video data acquisition module is used for calling the camera to acquire video data of the environment in the bus;
a track generation module, configured to track, in the video data, a user riding the bus to generate a track of movement of the user in the bus;
the emotion detection module is used for detecting emotion generated in the bus by the user in the video data;
the target user determining module is used for determining the user as a target user if the user generates negative emotion;
the user relationship detection module is used for detecting the relationship between the two target users according to the track;
and the alarm operation execution module is used for executing alarm operation on the two target users according to the track if the relationship is a contradictory relationship.
9. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the alarm method for identifying user contradictions based on video according to any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the alarm method for identifying user contradictions based on video according to any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111285941.0A CN113723374B (en) 2021-11-02 2021-11-02 Alarm method and related device for identifying user contradiction based on video

Publications (2)

Publication Number Publication Date
CN113723374A true CN113723374A (en) 2021-11-30
CN113723374B CN113723374B (en) 2022-02-15

Family

ID=78686405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111285941.0A Active CN113723374B (en) 2021-11-02 2021-11-02 Alarm method and related device for identifying user contradiction based on video

Country Status (1)

Country Link
CN (1) CN113723374B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295568A (en) * 2016-08-11 2017-01-04 上海电力学院 The mankind's naturalness emotion identification method combined based on expression and behavior bimodal
US20180300557A1 (en) * 2017-04-18 2018-10-18 Amazon Technologies, Inc. Object analysis in live video content
CN108937970A (en) * 2018-06-06 2018-12-07 姜涵予 A kind of method and device for evaluating and testing affective state
US20190147228A1 (en) * 2017-11-13 2019-05-16 Aloke Chaudhuri System and method for human emotion and identity detection
WO2019132459A1 (en) * 2017-12-28 2019-07-04 주식회사 써로마인드로보틱스 Multimodal information coupling method for recognizing user's emotional behavior, and device therefor
CN110458644A (en) * 2019-07-05 2019-11-15 深圳壹账通智能科技有限公司 A kind of information processing method and relevant device
US20200227036A1 (en) * 2019-01-14 2020-07-16 Ford Global Technologies, Llc Systems and methods of real-time vehicle-based analytics and uses thereof
CN112395921A (en) * 2019-08-16 2021-02-23 杭州海康威视数字技术股份有限公司 Abnormal behavior detection method, device and system
US20210090233A1 (en) * 2019-09-25 2021-03-25 International Business Machines Corporation Cognitive object emotional analysis based on image quality determination

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
SEBASTIAN ZEPF et al., "Driver Emotion Recognition for Intelligent Vehicles: A Survey", ACM Computing Surveys *
Z. YANG et al., "Automatic aggression detection inside trains", 2010 IEEE International Conference on Systems, Man and Cybernetics *
ZHANG Yanhao, "Visual Semantic Representation Analysis of Group Behavior and Its Applications", China Master's Theses Full-text Database, Information Science and Technology *
QU Zhong et al., "Grid-based Analysis of Moving Target Trajectories and Loitering Behavior Detection", Microelectronics & Computer *
FEI Fan, "Research and Implementation of Automatic Detection and Recognition Algorithms for Abnormal Human Behavior in Intelligent Video Surveillance", Wanfang Database *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant