CN112364696B - Method and system for improving family safety by utilizing family monitoring video - Google Patents
Method and system for improving family safety by utilizing family monitoring video
- Publication number
- CN112364696B (application CN202011092212.9A)
- Authority
- CN
- China
- Prior art keywords
- information
- person
- personnel
- expression
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/48—Matching video sequences
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Abstract
The application provides a method and a system for improving home security using a home surveillance video. The method includes: in response to a facial image being a pre-stored facial image, analyzing the facial image to determine whether it contains panic expression information; when the facial image contains panic expression information, extracting sound information and/or person action information from the video frame sequence, and judging whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information; when the person exhibits abnormal behavior, determining the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals; and sending alarm information to the determined alarm terminal. The method detects abnormal behavior in real time from the facial expressions, sounds and actions of persons in the home surveillance video and alerts the relevant persons promptly, guarding against intrusion by outsiders and other unsafe household conditions, and meeting users' expectations of a monitoring system.
Description
Technical Field
The application relates to the technical field of video surveillance, and in particular to a method and a system for improving home security using home surveillance video.
Background
In recent years, home security has received increasing attention from society. Home security currently relies mainly on various sensors and surveillance video, but both still operate in the traditional "record only, never judge" mode: abnormal events and evidence can only be investigated through video playback after the fact, and abnormal behavior cannot be judged, or alarms raised, in real time.
Therefore, it is necessary to provide a method that judges abnormal behavior from the expressions, actions, sounds and the like of persons in surveillance video and initiates an alarm procedure according to the judgment result, so as to improve home security.
Disclosure of Invention
The purpose of the application is to provide a method and a system for improving home security using home surveillance video.
In one aspect, the present application provides a method for improving home security using home surveillance video, including:
acquiring a facial image of a person in a video frame sequence, recognizing the facial image, and determining whether it is a pre-stored facial image;
in response to the facial image being a pre-stored facial image, analyzing the facial image to determine whether it contains panic expression information;
when the facial image contains panic expression information, extracting sound information and/or person action information from the video frame sequence, and judging whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information;
when the person exhibits abnormal behavior, determining the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
and sending alarm information to the determined alarm terminal, where the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence.
In some embodiments of the present application, the method further comprises:
in response to the facial image not being a pre-stored facial image, sending the person's facial image to a preset alarm terminal;
extracting sound information and/or person action information from the video frame sequence, and judging whether the person exhibits abnormal behavior according to the sound information and/or the person action information;
when the person exhibits abnormal behavior, determining the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
and sending alarm information to the determined alarm terminal, where the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence.
In some embodiments of the present application, determining the alarm terminal corresponding to the judgment scene based on the preset correspondence between judgment scenes and alarm terminals includes:
presetting importance levels for a plurality of alarm terminals, and presetting the correspondence between judgment scenes and alarm-terminal importance levels;
determining, according to the judgment scene, the alarm terminal of the importance level corresponding to that scene;
where the judgment scenes include:
determining that the person exhibits abnormal behavior according to any one of the facial image, the sound information and the person action information;
determining that the person exhibits abnormal behavior according to any two of the facial image, the sound information and the person action information;
and determining that the person exhibits abnormal behavior according to the facial image, the sound information and the person action information together.
In some embodiments of the present application, judging whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information includes:
extracting expression data from the person's facial image and calculating the matching degree between the expression data and panic expression model data in an expression database, where the expression database stores panic expression model data for a plurality of levels;
when the matching degree between the expression data and the panic expression model data of any level is greater than a preset matching threshold, judging that the facial image contains panic expression information whose panic level is that level;
and when the panic level of the expression information contained in the facial image is a preset level, determining that the person exhibits abnormal behavior.
In some embodiments of the present application, judging whether the person exhibits abnormal behavior according to the sound information and/or the person action information includes:
recognizing the semantic keywords corresponding to the sound information based on speech recognition technology;
comparing the recognized semantic keywords with pre-configured alarm keywords;
and determining whether the person exhibits abnormal behavior according to the matching degree between the recognized semantic keywords and the alarm keywords.
In some embodiments of the present application, judging whether the person exhibits abnormal behavior according to the sound information and/or the person action information further includes:
extracting acoustic features from the sound information and determining whether the person exhibits abnormal behavior according to the acoustic features.
In some embodiments of the present application, judging whether the person exhibits abnormal behavior according to the sound information and/or the person action information further includes:
inputting the human-body key-point position heat map of each sampled image frame in the image sequence into an action classification extraction model to extract the human-body key-point position feature map of the sampled image frame sequence;
classifying the human actions in the image sequence based on the human-body key-point feature map to obtain the human action recognition result for the video;
and judging whether the person exhibits abnormal behavior according to the human action recognition result.
In some embodiments of the present application, judging whether the person exhibits abnormal behavior according to the sound information and/or the person action information further includes a step of training the action classification model, which includes:
acquiring a training video frame sequence, where the sequence includes training videos and annotation information for the human actions indicated by the training videos;
sampling the video frame sequence to obtain a training sampled image frame sequence;
performing key-point detection on the training sampled image frame sequence to obtain the human-body key-point position heat map of each frame in the sequence;
and inputting the human-body key-point position heat maps of the training sampled image frame sequence into the neural network of the action classification model to be trained for prediction, iteratively adjusting the network's parameters based on the difference between its predictions and the annotation information of the human actions indicated by the corresponding training videos, and stopping the iteration when a preset convergence condition is met, thereby obtaining the trained action classification model.
In some embodiments of the present application, extracting the sound information and/or person action information from the video frame sequence includes:
acquiring a sample video frame sequence covering a first preset period before, and a second preset period after, the time point that triggers the extraction of the sound information and person action information from the video frame sequence;
and extracting the sound information and person action information from the sample video frame sequence.
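As a minimal illustration of the sampling window just described, the following sketch collects the frames lying in the first preset period before, and the second preset period after, the trigger time point. The (timestamp, frame) buffer layout and the 5-second/10-second defaults are illustrative assumptions and are not fixed by the application.

```python
def sample_window(frame_buffer, trigger_time, t1=5.0, t2=10.0):
    """frame_buffer: time-ordered list of (timestamp, frame) pairs.
    Returns the frames in [trigger_time - t1, trigger_time + t2],
    from which sound and person action information is then extracted."""
    return [frame for (ts, frame) in frame_buffer
            if trigger_time - t1 <= ts <= trigger_time + t2]
```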
Another aspect of the present application provides a system for improving home security using home surveillance video, including:
a facial image acquisition unit, configured to acquire a facial image of a person in a video frame sequence, recognize the facial image and determine whether it is a pre-stored facial image;
an expression analysis unit, configured to analyze the facial image in response to it being a pre-stored facial image and determine whether it contains panic expression information;
an abnormal behavior judgment unit, configured to extract sound information and/or person action information from the video frame sequence when the facial image contains panic expression information, and judge whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information;
an alarm terminal determination unit, configured to determine, when the person exhibits abnormal behavior, the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
and an alarm unit, configured to send alarm information to the determined alarm terminal, where the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence.
Compared with the prior art, the method and system for improving home security using home surveillance video provided by the application recognize the facial image of a person in the video and determine whether it is a pre-stored image; if so, the person's expression image is acquired from the video in real time and its expression type is recognized. When the expression is a preset panic expression, the person's behavior images and/or sound data are acquired from the video, abnormal behavior is judged from them, and, if present, an alarm procedure is started. If the facial image is not a pre-stored image, the person's body actions are extracted from the video and abnormal behavior is judged from those actions, likewise starting an alarm procedure when present. The method can detect abnormal behavior in the monitored scene in real time and alert the relevant persons promptly, guarding against intrusion by outsiders and other unsafe household conditions and meeting users' expectations of a monitoring system.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
Fig. 1 is a flow chart of a method for improving home security using home surveillance video according to some embodiments of the present application;
Fig. 2 is a flow chart of another method for improving home security using home surveillance video according to some embodiments of the present application;
Fig. 3 is a schematic structural diagram of a system for improving home security using home surveillance video according to some embodiments of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
In addition, the terms "first" and "second" etc. are used to distinguish different objects and are not used to describe a particular order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
An embodiment of the present application provides a method for improving home security using home surveillance video; the method is illustrated below with reference to the embodiments and figs. 1 and 2.
As shown in fig. 1, the method for improving home security by using home surveillance video in the present application may include:
Step 101: acquiring a facial image of a person in the video frame sequence, recognizing the facial image, and determining whether it is a pre-stored facial image.
The pre-stored facial images may be the facial images of family members, as well as those of guests whom the family members consider safe.
In this embodiment, the execution subject of the method (e.g., a server) may receive the person's facial image from a first terminal device over a wired or wireless connection. The first terminal device may be any electronic device supporting image or video capture, including but not limited to a smartphone, tablet computer, video camera, camera, and the like. The facial image is sent by the first terminal upon detecting a person's face, where a face may include, for example, the person's facial contours (e.g., the outline of the face and the contours of at least two of the facial features). The execution subject may check in real time, or at preset intervals, whether the first terminal has sent a facial image.
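The matching of a detected face against the pre-stored facial images in step 101 could be implemented along the following lines. This is a minimal sketch only: the application does not name a face matcher, so the open-source face_recognition library and the 0.6 tolerance are assumptions for illustration.

```python
import face_recognition  # assumed matcher; not specified by the application

def is_prestored_face(frame, known_encodings, tolerance=0.6):
    """Return True if any face detected in the frame matches a pre-stored encoding."""
    for enc in face_recognition.face_encodings(frame):    # one encoding per detected face
        if any(face_recognition.compare_faces(known_encodings, enc, tolerance=tolerance)):
            return True
    return False

# known_encodings would be built once from family-member / trusted-guest photos, e.g.:
# known_encodings = [face_recognition.face_encodings(
#                        face_recognition.load_image_file(p))[0]
#                    for p in ("family_member.jpg", "trusted_guest.jpg")]
```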
Step 102: in response to the facial image being a pre-stored facial image, analyzing the facial image and determining whether it contains panic expression information.
A panic expression is one that reveals that the person may be in danger: for example, a slightly pained, brow-furrowed expression may mean the person is unwell and needs help, while a wide-mouthed, panicked cry for help indicates immediate danger.
and 103, when the facial image contains the terrorism information, extracting sound information and/or personnel action information in the video frame sequence, and judging whether the personnel has abnormal behaviors according to the facial image, the sound information and/or the personnel action information.
It will be appreciated that the face image only contains panic information, and that a person who may not be accurately reflected is currently in a dangerous state, so that whether the person has abnormal behaviors is determined according to sound information of the person and/or motion information of the person in the video.
Step 104: when the person exhibits abnormal behavior, determining the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals.
It can be understood that importance levels may be preset for a plurality of alarm terminals, along with the correspondence between judgment scenes and alarm-terminal importance levels; the alarm terminal of the importance level corresponding to the judgment scene is then determined according to that scene.
Step 105: sending alarm information to the determined alarm terminal, where the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence.
Compared with the prior art, this method recognizes the facial image of a person in the video and determines whether it is a pre-stored image; if so, the person's expression image is acquired from the video in real time and its expression type is recognized. When the expression is a preset panic expression, the person's behavior images and/or sound data are acquired from the video, abnormal behavior is judged from them, and, if present, an alarm procedure is started. If the facial image is not a pre-stored image, the person's body actions are extracted from the video and abnormal behavior is judged from those actions, likewise starting an alarm procedure when present. The method can detect abnormal behavior in the monitored scene in real time and alert the relevant persons promptly, guarding against intrusion by outsiders and other unsafe household conditions and meeting users' expectations of a monitoring system.
In some variations of the embodiments of the present application, the method further comprises:
and step 106, if the face image is not the pre-stored face image, sending the face image of the person to a preset alarm terminal.
If the face image is not a pre-stored face image, the person of the face image may be an intruder, and of course, may be a person of the family member who knows, but does not record the face image.
And 107, extracting sound information and/or personnel action information in the video frame sequence, and judging whether the personnel has abnormal behaviors according to the sound information and/or the personnel action information.
It will be appreciated that the model for determining whether a person is abnormal based on the information about sound, person action, is the same here as to whether the face image is a pre-determined face image or not, except that the specific parameters of the model may differ, for example, a person whose face image is not a pre-stored face image may be a thief, at which time the user is determined to be a thief if there is an action to take something out by capturing his action information.
And 108, determining an alarm terminal corresponding to the judgment scene based on the preset corresponding relation between the judgment scene and the alarm terminal when the abnormal behavior exists in the personnel.
It will be appreciated that when a family member and a stranger are at home at the same time, if facial images, sounds, and motion information of the family member cannot be captured well, it is also possible to determine what kind of danger the family member may be in through abnormal behavior of the stranger.
Step 109, sending alarm information to the determined alarm terminal, wherein the alarm information comprises the position information of the personnel and the facial images, the action information and/or the sound information of the personnel in the video frame sequence.
In this embodiment, when the face image is not a pre-stored face image, whether the family member is in a dangerous state is determined according to abnormal behavior information of a stranger, so that the family member can send alarm information without acquiring the face image, sound information behavior information and the like of the family member under conditions such as emergency, dangerous condition and the like, and safety of the family member is improved.
In some variant implementations of the embodiments of the present application, determining the alarm terminal corresponding to the judgment scene in step 104, based on the preset correspondence between judgment scenes and alarm terminals, includes:
104a, presetting importance levels for a plurality of alarm terminals and presetting the correspondence between judgment scenes and alarm-terminal importance levels;
104b, determining the alarm terminal of the importance level corresponding to the judgment scene according to that scene;
where the judgment scenes include:
determining that the person exhibits abnormal behavior according to any one of the facial image, the sound information and the person action information.
It is understood that in this case only one of the facial image, sound information and person action information indicates abnormal behavior, while the other two indicate none, i.e., no dangerous state. The alarm information is then sent only to the terminal of one home guardian.
Determining that the person exhibits abnormal behavior according to any two of the facial image, the sound information and the person action information.
It is understood that in this case two of the facial image, sound information and person action information indicate abnormal behavior, while the remaining one indicates none. The alarm information is then sent to the terminals of at least two home guardians.
Determining that the person exhibits abnormal behavior according to the facial image, the sound information and the person action information together.
It is understood that when the facial image, the sound information and the person action information all indicate abnormal behavior, the user is very likely in an extremely dangerous situation. The alarm information is then sent to the terminals of at least two home guardians, and the police (110) are alerted as well.
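The escalation just described can be expressed as a small lookup from the judgment scene (how many of the three modalities indicate abnormal behavior) to the alarm terminals of the corresponding importance level. The terminal identifiers and the exact table below are illustrative assumptions; the application fixes only the principle that more agreeing modalities warrant more, and more important, alarm terminals.

```python
# Illustrative escalation table (assumption): scene = number of modalities
# (face / sound / action) that indicate abnormal behavior.
ALARM_TERMINALS_BY_SCENE = {
    1: ["guardian_primary"],                                      # one modality
    2: ["guardian_primary", "guardian_secondary"],                # two modalities
    3: ["guardian_primary", "guardian_secondary", "police_110"],  # all three
}

def select_alarm_terminals(face_abnormal, sound_abnormal, action_abnormal):
    scene = sum([bool(face_abnormal), bool(sound_abnormal), bool(action_abnormal)])
    return ALARM_TERMINALS_BY_SCENE.get(scene, [])
```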
In some variant implementations of the embodiments of the present application, judging in step 103 whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information includes:
Step 103a: extracting expression data from the person's facial image and calculating the matching degree between the expression data and the panic expression model data in an expression database, where the database stores panic expression model data for a plurality of levels.
The matching degree between the current expression data and the panic expression model data is calculated, and when the matching degree falls within a preset matching threshold, the level it matches is taken as the matched level. The matching degree may be computed by recognizing the expression matching the current expression data, using one or more of holistic recognition, local recognition, deformation extraction, motion extraction, geometric-feature and personal-feature methods.
The specific comparison method in this embodiment may include:
103a1, establishing an x-o-y rectangular coordinate system with the centre of the person's facial image as the coordinate origin o;
103a2, overlaying the person's facial image with the panic expression model data (the panic expression image) in the expression database;
103a3, for a point M on the person's face, finding the corresponding pixel coordinates A1(x1, y1) in the person's facial image and the corresponding pixel coordinates B1(x2, y2) in the panic expression image;
103a4, obtaining the distance between A1 and B1 in the x-o-y rectangular coordinate system;
103a5, repeating steps 103a3 to 103a4 for each point on the person's face, obtaining the coordinate distance between that point's positions in the facial image and the panic expression image, until the distances for all points have been obtained;
103a6, taking the weighted average of the coordinate distances of all points to obtain the matching degree between the current expression data and the panic expression model data.
It will be appreciated that the points M on the person's face may be all or some of the points at the facial contour, eye, mouth, nose and eyebrow positions; typically the points at the eye and eyebrow positions are weighted more heavily than those at the mouth, which in turn are weighted more heavily than those at the nose.
Step 103b: when the matching degree between the expression data and the panic expression model data of any level is greater than a preset matching threshold, judging that the facial image contains panic expression information whose panic level is that level.
The panic levels may include a non-alarm level, a pre-alarm level and an alarm level. If the level matched by the current expression data is the non-alarm level, no alarm prompt or alarm action is taken; if it is the pre-alarm level, the user is prompted whether to send an alarm signal; if it is the alarm level, an alarm signal is sent directly without a prompt.
Step 103c: when the panic level of the expression information contained in the facial image is a preset level, determining that the person exhibits abnormal behavior.
That is, if the level matched by the current expression data is the pre-alarm level or the alarm level, the person is determined to exhibit abnormal behavior.
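Steps 103a1 to 103a6 and the level thresholds above can be sketched as follows. The landmark names, the region weights and the conversion of the weighted mean distance into a matching degree (here 1/(1 + distance), so that a closer overlay yields a higher degree) are illustrative assumptions; the application specifies only the weighted averaging of per-point coordinate distances and the eye/eyebrow over mouth over nose weighting.

```python
import math

# Illustrative region weights (assumption): eye/eyebrow > mouth > nose.
REGION_WEIGHTS = {"eye": 3.0, "eyebrow": 3.0, "mouth": 2.0, "nose": 1.0, "contour": 1.0}

def matching_degree(face_points, model_points):
    """face_points / model_points: {point_name: (x, y)} in the x-o-y frame
    centred on the facial image (103a1), after overlaying the two images (103a2)."""
    total_w = total_d = 0.0
    for name, (x1, y1) in face_points.items():
        x2, y2 = model_points[name]               # 103a3: A1 in face image, B1 in model image
        d = math.hypot(x1 - x2, y1 - y2)          # 103a4: distance |A1B1|
        w = REGION_WEIGHTS[name.split("_")[0]]    # point names like "eye_left_corner" (assumed)
        total_w += w
        total_d += w * d                          # 103a5-103a6: weighted accumulation
    return 1.0 / (1.0 + total_d / total_w)        # higher degree = closer match (assumed mapping)

def panic_level(degree, alarm_th=0.8, pre_alarm_th=0.6):
    """Map a matching degree to the non-alarm / pre-alarm / alarm levels (103b)."""
    if degree > alarm_th:
        return "alarm"          # alarm signal sent directly
    if degree > pre_alarm_th:
        return "pre_alarm"      # user prompted whether to send an alarm
    return "non_alarm"          # no alarm prompt or action
```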
In some variant implementations of the embodiments of the present application, judging in step 103 whether the person exhibits abnormal behavior according to the sound information and/or the person action information may include:
Step 103a: recognizing the semantic keywords corresponding to the sound information based on speech recognition technology.
The sound information may be a call for help issued by the person, which indicates the person's desire to request assistance.
Step 103b: comparing the recognized semantic keywords with pre-configured alarm keywords.
Step 103c: determining whether the person exhibits abnormal behavior according to the matching degree between the recognized semantic keywords and the alarm keywords.
Applying speech recognition to this judgment may, for example, proceed as follows: the semantic keywords corresponding to the sound source's voice information are recognized based on speech recognition technology and compared with the pre-configured alarm keywords; when the recognized semantic keywords match the alarm keywords, the sound source's voice information is determined to indicate a call for help, and when they do not match, it is determined not to indicate one.
Voice in the video may be checked at fixed intervals (e.g., every minute), or voice detection may be kept continuously on. When violent features appear in the voice, a recording of preset duration per segment (e.g., 15 seconds) or a continuous recording is made, and one or more recordings are sent as ambient-recording voice packets at fixed times. After recording, the audio is uploaded to the server for voice separation: all sound sources appearing in the recording are stripped apart and recognized individually, and the recognized words are matched against lexicons of violent language, assault, threats and the like. When the number of matched words exceeds the preset number of alarm keywords (e.g., 5), the family member is judged to be in danger, and the audio segment is sent to the guardian's mobile App as an alarm prompt.
It will be appreciated that the speech recognition techniques used in the methods of these embodiments may be any existing or future speech recognition technique, all of which fall within the scope of protection of the present application. In addition, the alarm keywords may be any keywords that a monitored person (e.g., a child or an elderly person) might utter in a dangerous scene, such as "alarm", "rescue" or "fire", and are not limited here. Preferably, the alarm keywords can also be customized by the user, enabling personalized alarm strategies.
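A minimal sketch of the keyword comparison follows. The transcript is assumed to come from an upstream speech recognizer after source separation; the lexicon contents and the 5-hit threshold (taken from the example above) are illustrative assumptions.

```python
# Illustrative lexicon; real systems would use the configured violent-language,
# assault and threat lexicons described above.
ALARM_KEYWORDS = {"alarm", "rescue", "fire", "help", "police"}  # user-customisable

def has_abnormal_speech(transcript, threshold=5):
    """transcript: recognized text of one separated sound source."""
    hits = sum(1 for word in transcript.lower().split() if word in ALARM_KEYWORDS)
    return hits >= threshold   # threshold taken from the 5-word example above
```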
Further, judging in step 103 whether the person exhibits abnormal behavior according to the sound information and/or the person action information may further include:
extracting acoustic features from the sound information and determining whether the person exhibits abnormal behavior according to the acoustic features. For example, the pitch or intensity features of the sound source's voice information may be analyzed, and sounds whose pitch or intensity is too high taken to indicate abnormal behavior.
In some variant implementations of the embodiments of the present application, judging in step 103 whether the person exhibits abnormal behavior according to the sound information and/or the person action information may further include:
Step 103d: inputting the human-body key-point position heat map of each sampled image frame in the image sequence into the action classification extraction model, and extracting the human-body key-point position feature map of the sampled image frame sequence.
The video frame sequence may be a video to be recognized, formed by continuously imaging a scene containing a person, and may include multiple temporally consecutive image frames.
The human-body key points may be key nodes of the body structure that determine the body's position and posture, for example the body's joints. The preset human-body key points may include, for example, the shoulder, elbow, wrist, hip, knee and ankle joints. The human-body key-point detection model performs feature extraction and key-point localization on each input sampled image frame, detects the probability of each preset key point lying at each position in the image, and then generates the human-body key-point position heat map from these probabilities.
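A key-point position heat map of the kind described above can be generated from detector output as in the following sketch: one Gaussian peak per preset key point, scaled by the detection confidence at that location. The 64x64 grid and the Gaussian width are illustrative assumptions.

```python
import numpy as np

def keypoint_heatmaps(keypoints, h=64, w=64, sigma=2.0):
    """keypoints: list of (x, y, confidence) in heat-map coordinates.
    Returns an array of shape (num_keypoints, h, w): one Gaussian peak per
    preset key point, scaled by its detection confidence."""
    ys, xs = np.mgrid[0:h, 0:w]
    maps = np.zeros((len(keypoints), h, w), dtype=np.float32)
    for k, (x, y, conf) in enumerate(keypoints):
        maps[k] = conf * np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return maps
```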
Step 103e: classifying the human actions in the image sequence based on the human-body key-point position feature map to obtain the human action recognition result for the video.
In this embodiment, the human-body key-point position heat maps of the sampled image frame sequence may be input into the trained action classification model, which classifies the human action indicated by the sequence according to the heat maps, yielding the recognition result for the human action in the video to be recognized.
The trained action classification model may be built on a deep learning network, for example a neural network based on a CNN or an RNN. It may be trained on sample data consisting of image frame sequences extracted from sample videos together with annotation information for the corresponding human actions. In practice, the human action corresponding to each sample video segment is annotated to generate the annotation information for that segment's image frame sequence. The human-body key-point position heat map of each image frame in the sample data is extracted and fed into the action classification model being trained; during training, the model's parameters are iteratively adjusted so that the difference between its classification results on the sample sequences and the corresponding annotations keeps decreasing.
Step 103f: judging whether the person exhibits abnormal behavior according to the human action recognition result.
In this embodiment, the video is sampled to obtain the sampled image frame sequence of the video to be recognized; the trained human-body key-point detection model performs key-point detection on that sequence to obtain the key-point position heat map of each sampled frame, where each heat map represents the probability distribution of a preset key point's position; the heat maps of the sequence are then fed into the trained action classification model for classification, yielding the human action corresponding to the video. Because recognition exploits both the coordination among the body's key points and the temporal continuity of human actions, recognition accuracy is improved.
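As a minimal illustration of the classification step, the following PyTorch sketch stacks the key-point heat maps of a sampled frame sequence into a (keypoints x time x height x width) tensor and classifies it with a small 3D CNN. The architecture, layer sizes and 12-joint count are illustrative assumptions; the text requires only a CNN- or RNN-based deep model.

```python
import torch
import torch.nn as nn

class ActionClassifier(nn.Module):
    def __init__(self, num_keypoints=12, num_actions=10):
        super().__init__()
        self.features = nn.Sequential(                    # key-point feature-map extraction
            nn.Conv3d(num_keypoints, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(64, num_actions)      # action classification head

    def forward(self, heatmaps):                          # (batch, K, T, H, W)
        f = self.features(heatmaps).flatten(1)
        return self.classifier(f)                         # (batch, num_actions) scores
```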
Further, judging in step 103 whether the person exhibits abnormal behavior according to the sound information and/or the person action information further includes a step of training the action classification model, which includes:
acquiring a training video frame sequence, where the sequence includes training videos and annotation information for the human actions indicated by the training videos;
sampling the video frame sequence to obtain a training sampled image frame sequence;
performing key-point detection on the training sampled image frame sequence to obtain the human-body key-point position heat map of each frame in the sequence;
and inputting the human-body key-point position heat maps of the training sampled image frame sequence into the neural network of the action classification model to be trained for prediction, iteratively adjusting the network's parameters based on the difference between its predictions and the annotation information of the human actions indicated by the corresponding training videos, and stopping the iteration when a preset convergence condition is met, thereby obtaining the trained action classification model.
As noted above, the trained action classification model may be built on a CNN- or RNN-based deep learning network and trained on annotated sample data in exactly this manner.
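The training step can be sketched as follows, reusing the ActionClassifier from the previous sketch. The Adam optimiser, cross-entropy loss and loss-plateau convergence test are illustrative assumptions; the application requires only iterative parameter adjustment until a preset convergence condition is met.

```python
import torch
import torch.nn as nn

def train_action_classifier(model, loader, epochs=50, tol=1e-3, lr=1e-3):
    """loader yields (heatmaps, action_label) batches built from annotated,
    sampled training videos."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    prev = float("inf")
    for _ in range(epochs):
        total = 0.0
        for heatmaps, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(heatmaps), labels)       # prediction vs annotation
            loss.backward()                               # iteratively adjust parameters
            opt.step()
            total += loss.item()
        if abs(prev - total) < tol:                       # preset convergence condition
            break
        prev = total
    return model
```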
The following is an embodiment of the system for improving home security using home surveillance video, which implements the method embodiments above. The system recognizes the facial image of a person in the video and determines whether it is a pre-stored image; if so, it acquires the person's expression image in real time and recognizes its expression type, and when the expression is a preset panic expression it acquires the person's behavior images and/or sound data, judges abnormal behavior from them and, if present, starts an alarm procedure. If the facial image is not pre-stored, the person's body actions are extracted from the video and abnormal behavior is judged from them, likewise triggering an alarm procedure when present. The system thus detects abnormal behavior in the monitored scene in real time and alerts the relevant persons promptly, guarding against intrusion by outsiders and other unsafe household conditions.
As shown in fig. 3, the system may include:
a facial image acquisition unit 301, configured to acquire a facial image of a person in the video frame sequence, recognize it, and determine whether it is a pre-stored facial image;
an expression analysis unit 302, configured to analyze the facial image in response to it being a pre-stored facial image, and determine whether it contains panic expression information;
an abnormal behavior judgment unit 303, configured to extract sound information and/or person action information from the video frame sequence when the facial image contains panic expression information, and judge whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information;
an alarm terminal determination unit 304, configured to determine, when the person exhibits abnormal behavior, the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
and an alarm unit 305, configured to send alarm information to the determined alarm terminal, where the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence.
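How the five units could be wired together is sketched below. The class and method names are illustrative assumptions mirroring units 301 to 305; they are not part of the application.

```python
class HomeSecuritySystem:
    """Wires together units 301-305 (names are illustrative)."""
    def __init__(self, face_unit, expression_unit, behavior_unit, terminal_unit, alarm_unit):
        self.face_unit = face_unit              # 301: facial image acquisition / matching
        self.expression_unit = expression_unit  # 302: panic expression analysis
        self.behavior_unit = behavior_unit      # 303: sound / action abnormality judgment
        self.terminal_unit = terminal_unit      # 304: judgment scene -> alarm terminal
        self.alarm_unit = alarm_unit            # 305: alarm dispatch

    def process(self, frames):
        face, prestored = self.face_unit.acquire(frames)
        if prestored and not self.expression_unit.is_panic(face):
            return                              # known face showing no panic: nothing to do
        abnormal, scene = self.behavior_unit.judge(frames, face)
        if abnormal:
            terminals = self.terminal_unit.select(scene)
            self.alarm_unit.send(terminals, frames, face)
```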
It is noted that the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and flowchart illustration, and combinations of blocks in the block diagrams and flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the embodiments, and are intended to be included within the scope of the claims and description.
Claims (8)
1. A method for improving home security using home surveillance video, comprising:
acquiring a facial image of a person in a video frame sequence, recognizing the facial image, and determining whether it is a pre-stored facial image;
in response to the facial image being a pre-stored facial image, analyzing the facial image to determine whether it contains panic expression information;
when the facial image contains panic expression information, extracting sound information and/or person action information from the video frame sequence, and judging whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information;
when the person exhibits abnormal behavior, determining the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
sending alarm information to the determined alarm terminal, wherein the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence;
the method further comprising:
in response to the facial image not being a pre-stored facial image, sending the person's facial image to a preset alarm terminal;
extracting sound information and/or person action information from the video frame sequence, and judging whether the person exhibits abnormal behavior according to the sound information and/or the person action information;
when the person exhibits abnormal behavior, determining the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
sending alarm information to the determined alarm terminal, wherein the alarm information includes the person's location information together with the person's facial image, action information and/or sound information from the video frame sequence;
wherein judging whether the person exhibits abnormal behavior according to the facial image, the sound information and/or the person action information comprises:
extracting expression data from the person's facial image and calculating the matching degree between the expression data and panic expression model data in an expression database, wherein the expression database stores panic expression model data for a plurality of levels;
when the matching degree between the expression data and the panic expression model data of any level is greater than a preset matching threshold, judging that the facial image contains panic expression information whose panic level is that level;
when the panic level of the expression information contained in the facial image is a preset level, determining that the person exhibits abnormal behavior;
the extraction of expression data in the facial image of the person, the calculation of the matching degree of the expression data and the frightening expression model data in an expression database, wherein the frightening expression model data aiming at a plurality of levels are stored in the expression database, and the extraction comprises the following steps:
103a1, taking the center of the facial image of the person as the coordinate origin o and establishing an x-o-y rectangular coordinate system;
103a2, superimposing the facial image of the person on the panic expression model data in the expression database;
103a3, for a point M on the face of the person, determining its pixel coordinate A1(x1, y1) in the facial image of the person and the corresponding pixel coordinate B1(x2, y2) in the panic expression image;
103a4, obtaining the distance between A1 and B1 in the x-o-y rectangular coordinate system;
103a5, repeating steps 103a3 to 103a4 for each point on the face of the person, obtaining the distance between each point's coordinates in the facial image and in the panic expression image, until the distances for all points have been computed;
103a6, taking the weighted average of the distances for all points to obtain the degree of matching between the expression data and the panic expression model data;
wherein the points M on the face of the person may be all or some of the points at the eye, mouth, nose and eyebrow positions, with points at the eye and eyebrow positions weighted more heavily than points at the mouth, which in turn are weighted more heavily than points at the nose.
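By way of illustration only, the following minimal Python sketch mirrors steps 103a1 to 103a6: corresponding landmark coordinates from the person's facial image and a panic expression model are compared point by point, and the per-point distances are weighted-averaged, with eye and eyebrow points weighted above the mouth and the mouth above the nose. The function names, weight values and the distance-to-similarity convention are illustrative assumptions, not part of the claimed method.

```python
import numpy as np

# Assumed per-region weights: eyes/eyebrows > mouth > nose, as claim 1 specifies.
REGION_WEIGHTS = {"eye": 3.0, "eyebrow": 3.0, "mouth": 2.0, "nose": 1.0}

def matching_degree(face_pts, model_pts, regions):
    """Weighted average of distances between corresponding landmarks A_i in the
    person's facial image and B_i in the panic expression model (steps 103a3-103a6),
    both expressed in the x-o-y coordinate system centred on the image (step 103a1).
    """
    face_pts = np.asarray(face_pts, dtype=float)
    model_pts = np.asarray(model_pts, dtype=float)
    dists = np.linalg.norm(face_pts - model_pts, axis=1)   # |A1 - B1| per point
    weights = np.array([REGION_WEIGHTS[r] for r in regions])
    avg_dist = np.average(dists, weights=weights)          # step 103a6
    return 1.0 / (1.0 + avg_dist)  # smaller distance -> higher matching degree

# Example with three landmarks (one eye point, one mouth point, one nose point):
score = matching_degree([(10, 5), (0, -8), (0, 0)],
                        [(11, 5), (0, -9), (0, 0)],
                        ["eye", "mouth", "nose"])
print(f"matching degree: {score:.3f}")  # compared against the preset threshold
```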
2. The method for improving home security using home surveillance video according to claim 1, wherein determining the alarm terminal corresponding to the judgment scene based on the preset correspondence between judgment scenes and alarm terminals comprises:
presetting importance levels for a plurality of alarm terminals, and presetting a correspondence between judgment scenes and the importance levels of the alarm terminals;
determining, according to the judgment scene, the alarm terminal of the importance level corresponding to that scene;
wherein the judgment scenes include:
determining that the person exhibits abnormal behavior according to any one of the face image, the sound information and the person action information;
determining that the person exhibits abnormal behavior according to any two of the face image, the sound information and the person action information;
and determining that the person exhibits abnormal behavior according to all of the face image, the sound information and the person action information.
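A minimal sketch of one way claim 2's correspondence could be held in code, assuming three judgment scenes keyed by how many information sources flagged the abnormality; the scene names, level numbers and terminal identifiers are all hypothetical:

```python
# Hypothetical importance levels and terminals; none are specified by the claims.
ALARM_TERMINALS = {
    1: ["resident_phone"],                                       # lowest importance
    2: ["resident_phone", "community_guard"],
    3: ["resident_phone", "community_guard", "police_station"],  # highest
}

# Judgment scene -> importance level: one, two, or all three of the face image,
# sound information and person action information indicate abnormal behavior.
SCENE_TO_LEVEL = {"one_source": 1, "two_sources": 2, "all_sources": 3}

def select_terminals(num_sources: int) -> list:
    scene = {1: "one_source", 2: "two_sources", 3: "all_sources"}[num_sources]
    return ALARM_TERMINALS[SCENE_TO_LEVEL[scene]]

print(select_terminals(2))  # ['resident_phone', 'community_guard']
```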
3. The method for improving home security using home surveillance video according to claim 1, wherein determining whether the person exhibits abnormal behavior based on the sound information and/or person action information comprises:
recognizing, based on speech recognition technology, the semantic keywords corresponding to the sound information;
comparing the recognized semantic keywords with pre-configured alarm keywords;
and determining whether the person exhibits abnormal behavior according to the degree of matching between the recognized semantic keywords and the alarm keywords.
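To illustrate claim 3, a sketch that assumes an upstream speech-to-text step has already produced a transcript; the keyword list, the hit-count measure of matching degree and the threshold are assumptions:

```python
ALARM_KEYWORDS = {"help", "save me", "robbery", "go away"}  # assumed list

def keywords_indicate_abnormal(transcript: str, threshold: int = 1) -> bool:
    """Compare recognized semantic keywords against pre-configured alarm
    keywords; flag abnormal behavior when enough of them match."""
    hits = sum(1 for kw in ALARM_KEYWORDS if kw in transcript.lower())
    return hits >= threshold

print(keywords_indicate_abnormal("Somebody help, this is a robbery!"))  # True
```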
4. A method for improving home security using home surveillance video according to claim 3, wherein determining whether the person has abnormal behavior based on the sound information and/or person action information, further comprises:
and extracting acoustic characteristics of the sound information, and determining whether abnormal behaviors exist in the personnel according to the acoustic characteristics.
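One plausible realisation of the acoustic-feature branch in claim 4 is an MFCC descriptor fed to a pre-trained classifier; the snippet below sketches the feature extraction with librosa, while the classifier itself (e.g. scream versus normal speech) is an assumption the claim does not detail:

```python
import numpy as np
import librosa  # assumed audio library; any MFCC implementation would serve

def acoustic_features(wav_path: str) -> np.ndarray:
    """Summarise a clip into a fixed-length MFCC descriptor."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape (13, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# features = acoustic_features("clip.wav")
# abnormal = scream_classifier.predict([features])  # hypothetical model
```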
5. The method for improving home security using home surveillance video according to claim 1, wherein determining whether the person exhibits abnormal behavior based on the sound information and/or person action information further comprises:
inputting the human body key point position heat map of each sampled image frame in the image sequence into the action classification model to extract a human body key point feature map for the sampled image frame sequence;
classifying the human body actions in the image sequence based on the human body key point feature map to obtain a human body action recognition result for the video;
and judging whether the person exhibits abnormal behavior according to the human body action recognition result.
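The inference path of claim 5 can be pictured as follows: per-frame key point heat maps are stacked along a temporal axis and passed through a small 3D-convolutional classifier. The network shape, the 17-keypoint convention and the action labels below are illustrative assumptions, not the patented model:

```python
import torch
import torch.nn as nn

class ActionClassifier(nn.Module):
    """Toy stand-in for the action classification model."""
    def __init__(self, n_keypoints: int = 17, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(        # key point feature extraction
            nn.Conv3d(n_keypoints, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(32, n_classes)  # action classification

    def forward(self, heatmaps: torch.Tensor) -> torch.Tensor:
        # heatmaps: (batch, keypoints, sampled frames, height, width)
        return self.head(self.features(heatmaps).flatten(1))

ACTIONS = ["walk", "sit", "fall", "struggle", "fight"]  # assumed label set
model = ActionClassifier()
logits = model(torch.randn(1, 17, 8, 64, 64))  # 8 sampled frames of heat maps
print(ACTIONS[logits.argmax(dim=1).item()])
```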
6. The method for improving home security using home surveillance video according to claim 5, wherein determining whether the person exhibits abnormal behavior based on the sound information and/or the person action information further comprises a step of training the action classification model, the step comprising:
acquiring a training video frame sequence, wherein the sequence comprises training videos and annotation information for the human body actions indicated by the training videos;
sampling the training video frame sequence to obtain a training sampled image frame sequence;
performing key point detection on the training sampled image frame sequence to obtain a human body key point position heat map for each training sampled image frame in the sequence;
inputting the human body key point position heat maps of the training sampled image frame sequence into the neural network corresponding to the action classification model to be trained for prediction; iteratively adjusting the parameters of that neural network based on the difference between its prediction results and the annotation information of the human body actions indicated by the corresponding training videos; and stopping the iteration when a preset convergence condition is met, thereby obtaining the trained action classification model.
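A compressed sketch of the training procedure in claim 6, with random tensors standing in for the annotated training videos and a simple loss-plateau test standing in for the unspecified convergence condition:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for the labelled heat maps of the training sampled image frames.
heatmaps = torch.randn(32, 17, 8, 64, 64)
labels = torch.randint(0, 5, (32,))          # annotated human body actions
loader = DataLoader(TensorDataset(heatmaps, labels), batch_size=8)

# Stand-in network (see the sketch under claim 5 for a fuller version).
model = nn.Sequential(
    nn.Conv3d(17, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 5),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

prev_loss = float("inf")
for epoch in range(100):
    epoch_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)  # difference vs. annotation information
        loss.backward()
        optimizer.step()               # iteratively adjust parameters
        epoch_loss += loss.item()
    if abs(prev_loss - epoch_loss) < 1e-4:  # assumed convergence condition
        break
    prev_loss = epoch_loss
```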
7. The method for improving home security using home surveillance video according to claim 1 or 2, wherein extracting sound information and/or person action information from the video frame sequence comprises:
acquiring a sample video frame sequence spanning a first preset time period before, and a second preset time period after, the time point at which extraction of the sound information and person action information from the video frame sequence is triggered;
and extracting the sound information and person action information from the sample video frame sequence.
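Claim 7's windowed sampling reduces to simple index arithmetic once a trigger frame is known; the frame rate and the two window lengths below are assumed values:

```python
def sample_window(frames, fps, trigger_idx, t1_sec=5.0, t2_sec=10.0):
    """Return frames from t1_sec before to t2_sec after the frame at which
    extraction of sound/person action information was triggered."""
    start = max(0, trigger_idx - int(t1_sec * fps))
    end = min(len(frames), trigger_idx + int(t2_sec * fps))
    return frames[start:end]

clip = sample_window(list(range(3000)), fps=25, trigger_idx=1500)
print(len(clip))  # 375 frames = (5 s + 10 s) * 25 fps
```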
8. A system for improving home security using home surveillance video, for implementing the method of any one of claims 1 to 7, the system comprising:
a face image acquisition unit, configured to acquire a face image of a person from a video frame sequence, identify the face image, and determine whether it is a pre-stored face image;
an expression analysis unit, configured to analyze the face image in response to it being a pre-stored face image and determine whether it contains panic expression information;
an abnormal behavior judging unit, configured to extract sound information and/or person action information from the video frame sequence when the face image contains panic expression information, and to judge whether the person exhibits abnormal behavior according to the face image, the sound information and/or the person action information;
an alarm terminal determining unit, configured to determine, when the person exhibits abnormal behavior, the alarm terminal corresponding to the judgment scene based on a preset correspondence between judgment scenes and alarm terminals;
and an alarm unit, configured to send alarm information to the determined alarm terminal, wherein the alarm information comprises the position information of the person, together with the face image, person action information and/or sound information from the video frame sequence.
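To make the unit structure of claim 8 concrete, here is a skeleton in which each unit is an injected callable; only the data flow between units follows the claims, while every internal detail and name is a stub:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class HomeSecuritySystem:
    acquire_face: Callable        # face image acquisition unit
    analyse_expression: Callable  # expression analysis unit
    judge_abnormal: Callable      # abnormal behavior judging unit
    pick_terminal: Callable       # alarm terminal determining unit
    send_alarm: Callable          # alarm unit

    def run(self, frames: Sequence) -> None:
        face, is_known = self.acquire_face(frames)
        if not is_known:
            self.send_alarm("preset_terminal", face)  # stranger branch of claim 1
        elif self.analyse_expression(face) and self.judge_abnormal(frames, face):
            self.send_alarm(self.pick_terminal(frames), face)

# Wiring with trivial stubs, just to show the call sequence:
system = HomeSecuritySystem(
    acquire_face=lambda f: ("face-image", True),
    analyse_expression=lambda face: True,  # panic expression detected
    judge_abnormal=lambda f, face: True,
    pick_terminal=lambda f: "resident_phone",
    send_alarm=lambda terminal, face: print(f"alarm -> {terminal}"),
)
system.run(frames=[])
```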
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011092212.9A CN112364696B (en) | 2020-10-13 | 2020-10-13 | Method and system for improving family safety by utilizing family monitoring video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112364696A CN112364696A (en) | 2021-02-12 |
CN112364696B true CN112364696B (en) | 2024-03-19 |
Family
ID=74507892
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011092212.9A Active CN112364696B (en) | 2020-10-13 | 2020-10-13 | Method and system for improving family safety by utilizing family monitoring video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112364696B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112861816A (en) * | 2021-03-30 | 2021-05-28 | 中国工商银行股份有限公司 | Abnormal behavior detection method and device |
CN113111215A (en) * | 2021-03-30 | 2021-07-13 | 深圳市冠标科技发展有限公司 | User behavior analysis method and device, electronic equipment and storage medium |
CN112883932A (en) * | 2021-03-30 | 2021-06-01 | 中国工商银行股份有限公司 | Method, device and system for detecting abnormal behaviors of staff |
CN113902997A (en) * | 2021-06-21 | 2022-01-07 | 苏州亿尔奇信息科技有限公司 | Abnormal behavior alarm method and system based on video monitoring |
CN114022955A (en) * | 2021-10-22 | 2022-02-08 | 北京明略软件系统有限公司 | Action recognition method and device |
CN114415528A (en) * | 2021-12-08 | 2022-04-29 | 珠海格力电器股份有限公司 | Intelligent household equipment reminding method and device, computer equipment and storage medium |
CN114694269A (en) * | 2022-02-28 | 2022-07-01 | 江西中业智能科技有限公司 | Human behavior monitoring method, system and storage medium |
CN115240302A (en) * | 2022-07-18 | 2022-10-25 | 珠海格力电器股份有限公司 | Method and device for monitoring indoor safety environment, electronic equipment and storage medium |
CN115294680A (en) * | 2022-08-16 | 2022-11-04 | 杭州萤石软件有限公司 | Method and equipment for realizing personal safety protection function |
CN116915958B (en) * | 2023-09-06 | 2024-02-13 | 广东电网有限责任公司佛山供电局 | One-time operation video monitoring and analyzing method and related device |
CN117575872B (en) * | 2023-11-23 | 2024-05-31 | 周波 | Dangerous chemical management method and system |
CN117635174A (en) * | 2023-12-04 | 2024-03-01 | 中国人寿保险股份有限公司山东省分公司 | Fraud risk assessment method and system for comprehensive multi-mode AI analysis |
CN118015547A (en) * | 2024-02-28 | 2024-05-10 | 成都趣点科技有限公司 | On-duty off-duty intelligent detection method and system based on machine vision |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102970438A (en) * | 2012-11-29 | 2013-03-13 | 广东欧珀移动通信有限公司 | Automatic mobile phone alarming method and device |
CN108734055A (en) * | 2017-04-17 | 2018-11-02 | 杭州海康威视数字技术股份有限公司 | A kind of exception personnel detection method, apparatus and system |
CN108449514A (en) * | 2018-03-29 | 2018-08-24 | 百度在线网络技术(北京)有限公司 | Information processing method and device |
CN108985259A (en) * | 2018-08-03 | 2018-12-11 | 百度在线网络技术(北京)有限公司 | Human motion recognition method and device |
CN109584907A (en) * | 2018-11-29 | 2019-04-05 | 北京奇虎科技有限公司 | A kind of method and apparatus of abnormal alarm |
CN110458101A (en) * | 2019-08-12 | 2019-11-15 | 南京邮电大学 | Inmate's sign monitoring method and equipment based on video in conjunction with equipment |
CN111047828A (en) * | 2019-12-12 | 2020-04-21 | 天地伟业技术有限公司 | Household intelligent security alarm system |
CN111223261A (en) * | 2020-04-23 | 2020-06-02 | 佛山海格利德机器人智能设备有限公司 | Composite intelligent production security system and security method thereof |
CN111601074A (en) * | 2020-04-24 | 2020-08-28 | 平安科技(深圳)有限公司 | Security monitoring method and device, robot and storage medium |
CN111724524A (en) * | 2020-06-11 | 2020-09-29 | 绍兴文理学院 | Access control management system and method based on voice recognition and form recognition |
CN111629184A (en) * | 2020-06-17 | 2020-09-04 | 内蒙古京海煤矸石发电有限责任公司 | Video monitoring alarm system and method capable of identifying personnel in monitoring area |
Non-Patent Citations (1)
Title |
---|
Application of Kinect in home intelligent monitoring systems; Lu Kui, Zhou Feng; 《硬件纵横》; pp. 13-15, 19 *
Also Published As
Publication number | Publication date |
---|---|
CN112364696A (en) | 2021-02-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112364696B (en) | Method and system for improving family safety by utilizing family monitoring video | |
WO2021169209A1 (en) | Method, apparatus and device for recognizing abnormal behavior on the basis of voice and image features | |
Hussain et al. | Activity-aware fall detection and recognition based on wearable sensors | |
CN112328999B (en) | Double-recording quality inspection method and device, server and storage medium | |
CN108734055B (en) | Abnormal person detection method, device and system | |
US10699541B2 (en) | Recognition data transmission device | |
CN109598229B (en) | Monitoring system and method based on action recognition | |
CN109887234B (en) | Method and device for preventing children from getting lost, electronic equipment and storage medium | |
US20220292170A1 (en) | Enrollment System with Continuous Learning and Confirmation | |
KR101979375B1 (en) | Method of predicting object behavior of surveillance video | |
CN115171335A (en) | Image and voice fused indoor safety protection method and device for elderly people living alone | |
Hua et al. | Falls prediction based on body keypoints and seq2seq architecture | |
CN114469076A (en) | Identity feature fused old solitary people falling identification method and system | |
KR102233679B1 (en) | Apparatus and method for detecting invader and fire for energy storage system | |
CN115482485A (en) | Video processing method and device, computer equipment and readable storage medium | |
CN114078603A (en) | Intelligent endowment monitoring system and method, computer equipment and readable storage medium | |
CN116912744B (en) | Intelligent monitoring system and method based on Internet of things | |
CN117197755A (en) | Community personnel identity monitoring and identifying method and device | |
CN112330742A (en) | Method and device for recording activity routes of key personnel in public area | |
KR102648004B1 (en) | Apparatus and Method for Detecting Violence, Smart Violence Monitoring System having the same | |
US20240135713A1 (en) | Monitoring device, monitoring system, monitoring method, and non-transitory computer-readable medium storing program | |
CN109815828A (en) | Realize the system and method for initiative alarming or help-seeking behavior detection control | |
CN112061065B (en) | In-vehicle behavior recognition alarm method, device, electronic device and storage medium | |
KR20230064095A (en) | Apparatus and method for detecting abnormal behavior through deep learning-based image analysis | |
Kodikara et al. | Surveillance based Child Kidnap Detection and Prevention Assistance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||