CN112528824A - Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Info

Publication number
CN112528824A
CN112528824A · CN202011413220.9A · CN112528824B
Authority
CN
China
Prior art keywords
hand
position information
video frame
nose
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011413220.9A
Other languages
Chinese (zh)
Other versions
CN112528824B (en)
Inventor
张发恩
林国森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chuangxin Qizhi Shenzhen Technology Co ltd
Original Assignee
Chuangxin Qizhi Shenzhen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chuangxin Qizhi Shenzhen Technology Co ltd filed Critical Chuangxin Qizhi Shenzhen Technology Co ltd
Priority to CN202011413220.9A priority Critical patent/CN112528824B/en
Publication of CN112528824A publication Critical patent/CN112528824A/en
Application granted granted Critical
Publication of CN112528824B publication Critical patent/CN112528824B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02Alarms for ensuring the safety of persons
    • G08B21/0202Child monitoring systems using a transmitter-receiver system carried by the parent and the child
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B3/00Audible signalling systems; Audible personal calling systems
    • G08B3/10Audible signalling systems; Audible personal calling systems using electric transmission; using electromagnetic transmission

Abstract

The application provides a method and a device for preventing infants from eating foreign matter, an electronic device, and a computer-readable storage medium. The method comprises the following steps: inputting each video frame of a collected video frame sequence into a trained hand detection model to obtain the hand position information output by the model for that frame; determining a hand motion trajectory from the hand position information across the video frame sequence; judging, according to the hand motion trajectory, whether the hand approaches the nose identified in the video frames; if so, judging, according to the hand motion trajectory, whether the dwell time of the hand near the nose reaches a preset duration threshold; and, when the duration threshold is reached, outputting an alarm indicating foreign-matter ingestion. The method and device can accurately monitor the area around the infant's face, accurately track a hand approaching the face, and accurately identify eating behavior and raise an alarm, thereby preventing the infant from ingesting foreign matter.

Description

Method and device for preventing infant from eating foreign matter, electronic device and storage medium
Technical Field
The present disclosure relates to the field of intelligent home appliances, and more particularly, to a method and an apparatus for preventing infants from eating foreign objects, an electronic device, and a computer-readable storage medium.
Background
Young children playing alone tend to put nearby toys into their mouths. If a toy is small, it may be swallowed accidentally, causing serious injury to the child. It is therefore necessary to monitor an infant's behavior and raise an alarm when eating behavior is detected, so as to prevent the infant from ingesting foreign matter.
In the related art, the infant's hands can be monitored by a distance sensor, the monitored motion trajectory data is compared with a preset hand motion trajectory, and if the two match, the infant is determined to exhibit eating behavior. However, this scheme is too simple: in practice a distance sensor can hardly monitor the hand accurately, so its reliability is poor.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and an apparatus for preventing an infant from eating foreign matter, an electronic device, and a computer-readable storage medium, all of which serve to prevent an infant from ingesting foreign matter.
In one aspect, the present application provides a method of preventing an infant from eating foreign matter, comprising:
inputting each video frame in the collected video frame sequence into a trained hand detection model to obtain hand position information in the video frame output by the hand detection model;
determining a hand motion track according to hand position information in each video frame in the video frame sequence;
judging, according to the hand motion trajectory, whether the hand approaches the nose identified in the video frames;
if so, judging, according to the hand motion trajectory, whether the dwell time of the hand near the nose reaches a preset duration threshold;
and outputting an alarm indicating foreign-matter ingestion when the duration threshold is reached.
In an embodiment, prior to inputting each video frame of the sequence of video frames into the hand detection model, the method further comprises:
inputting the video frame into a trained nose detection model;
judging whether the nose detection model outputs nose position information in the video frame;
if the nose position information is output, judging whether the nose position information is matched with the preset appointed nose position information in the video frame;
if not, outputting an alarm indicating that the camera device has shifted.
In an embodiment, the method further comprises:
and if the nose position information is not output, outputting an alarm indicating that the camera device has shifted.
In one embodiment, the determining a hand motion trajectory from hand position information in each video frame of the sequence of video frames comprises:
calculating an intersection-over-union (IoU) from the hand position information in two successive video frames of the video frame sequence;
judging whether the IoU reaches a preset IoU threshold, and if so, determining that the hand position information in the two successive frames corresponds to the same hand;
and determining the hand motion trajectory from the position information of that same hand across the video frame sequence.
In an embodiment, the determining whether a hand is close to an identified nose in the video frame according to the hand motion trajectory includes:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, judging whether the size of the hand region corresponding to the hand position information in the hand motion trajectory lies within a preset size interval;
and determining that the hand is close to the identified nose when the hand region size lies within that interval.
In an embodiment, the sequence of video frames is acquired by a 3D camera;
the judging whether the hand is close to the nose part identified in the video frame or not according to the hand motion track comprises the following steps:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, acquiring a depth information matrix corresponding to the hand position information, and determining the depth information corresponding to the hand position information according to the depth information matrix;
judging whether the depth information is located in a preset depth information interval or not;
when the depth information is located in the depth information interval, determining that the hand is close to the identified nose.
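The depth-based check in the embodiment above could be sketched as follows. This is a minimal Python illustration, not the patent's implementation: the function name, the corner-format box tuple, and the use of a median statistic over the hand region are all assumptions.

```python
import numpy as np

def hand_depth_in_range(depth_map, hand_box, depth_interval):
    """Check whether the depth measured inside the hand box lies in a preset interval.

    depth_map: 2-D array of per-pixel depth from the 3D camera.
    hand_box: (x1, y1, x2, y2) corner coordinates of the hand region.
    depth_interval: (lo, hi) preset depth interval near the nose.
    """
    x1, y1, x2, y2 = (int(v) for v in hand_box)
    patch = depth_map[y1:y2, x1:x2]  # depth values inside the hand region
    if patch.size == 0:
        return False
    depth = float(np.median(patch))  # robust summary of the region's depth
    lo, hi = depth_interval
    return lo <= depth <= hi
```

With a median rather than a single pixel, isolated sensor noise inside the hand region does not flip the decision.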
In an embodiment, the determining whether the staying time of the hand at the peripheral position of the nose reaches a preset time threshold according to the hand motion trajectory includes:
screening out hand position information of which the distance from the identified nose position information is smaller than a preset distance threshold value from the hand motion trajectory, and taking the hand position information as first hand position information;
screening a plurality of hand position information in continuous video frames from the first hand position information to be used as second hand position information;
judging whether the quantity of the second hand position information reaches a preset quantity threshold value or not; the number threshold is obtained by converting the duration threshold;
if so, determining that the staying time of the hand at the peripheral position of the nose reaches the time threshold.
In another aspect, the present application further provides a device for preventing an infant from eating foreign matter, comprising:
the identification module is used for inputting each video frame in the collected video frame sequence into a trained hand detection model and obtaining hand position information in the video frame output by the hand detection model;
the determining module is used for determining a hand motion track according to hand position information in each video frame in the video frame sequence;
the first judging module is used for judging whether the hand part is close to the identified nose part in the video frame or not according to the hand part motion track;
the second judging module is used for judging, when the hand is close to the nose, whether the dwell time of the hand near the nose reaches a preset duration threshold according to the hand motion trajectory;
and the alarm module is used for outputting an alarm indicating foreign-matter ingestion when the duration threshold is reached.
Further, the present application also provides an electronic device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the above method of preventing a child from eating a foreign object.
In addition, the present application also provides a computer-readable storage medium storing a computer program executable by a processor to implement the above method for preventing an infant from eating foreign matter.
In the embodiments of the application, after the hand position information of each video frame in a video frame sequence is identified, a hand motion trajectory is determined from the hand position information across the frames, and whether the hand approaches the nose identified in the video frames is judged according to the trajectory; if so, whether the dwell time of the hand near the nose reaches a duration threshold is judged according to the trajectory; and, when the duration threshold is reached, an alarm indicating foreign-matter ingestion is output. The scheme of the application can accurately monitor the area around the infant's face, accurately track a hand approaching the face, and accurately identify eating behavior and raise an alarm, thereby preventing the infant from ingesting foreign matter.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic view of an application scenario of a method for preventing a child from eating foreign matter according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 3 is a flowchart illustrating a method for preventing a child from eating foreign substances according to an embodiment of the present application;
fig. 4 is a schematic flowchart of a method for checking a position of an image capturing apparatus according to an embodiment of the present disclosure;
FIG. 5 is a diagram illustrating a designated nose position in a video frame according to an embodiment of the present application;
fig. 6 is a flowchart illustrating a method for determining a hand motion trajectory according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating a method for determining whether a hand is close to a nose according to an embodiment of the present disclosure;
FIG. 8 is a schematic flow chart illustrating a method for determining whether a hand is close to a nose according to another embodiment of the present disclosure;
fig. 9 is a block diagram of an apparatus for preventing a child from eating foreign substances according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic view of an application scenario of the method for preventing an infant from eating foreign matter according to an embodiment of the present application. As shown in fig. 1, the application scenario includes an image capture apparatus 40 and a smart device 50. The image capture apparatus 40 may be a head-mounted camera device configured to capture real-time images looking downward from above the infant's head, so as to collect a video frame sequence of the area around the infant's face and transmit it to the smart device 50. The smart device 50 may be a computer host, a tablet computer, or another electronic device with computing capability, and is configured to judge, from the collected video frame sequence, whether the infant exhibits eating behavior.
As shown in fig. 2, the present embodiment provides an electronic apparatus 1 including: at least one processor 11 and a memory 12, one processor 11 being exemplified in fig. 2. The processor 11 and the memory 12 are connected by a bus 10, and the memory 12 stores instructions executable by the processor 11, and the instructions are executed by the processor 11, so that the electronic device 1 can execute all or part of the flow of the method in the embodiments described below. In an embodiment, the electronic device 1 may be the smart device 50 described above.
The memory 12 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The present application also provides a computer readable storage medium storing a computer program executable by a processor 11 to perform the method for preventing a child from eating foreign objects provided by the present application.
Referring to fig. 3, a flow chart of a method for preventing a child from eating foreign substances according to an embodiment of the present application is shown in fig. 3, and the method may include the following steps 310 to 350.
Step 310: and inputting each video frame in the collected video frame sequence into the trained hand detection model to obtain hand position information in the video frame output by the hand detection model.
The scheme can be executed by the smart device. Alternatively, the smart device may be integrated into the camera device, in which case the camera device may directly perform the steps of the scheme after collecting the video frame sequence. For convenience of description, the following takes a smart device as the execution subject.
The video frame sequence is generated in real time by shooting the area around the face of the infant by the camera device, and the video frame in the video frame sequence comprises the image of the area around the face of the infant.
The hand detection model is used to detect a hand in a video frame and can be obtained by training a target detection model. The target detection model may be any of Faster R-CNN (Faster Region-based Convolutional Neural Network), SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), and the like. Before step 310 is performed, the target detection model may be trained on hand sample images. Each hand sample image carries a label indicating the hand position information and category information in that image, and the target detection model outputs predicted position information and predicted category information for the image. A loss function evaluates the difference between the predicted position information and the labeled hand position information of the same sample, and between the predicted and labeled category information, and the network parameters of the target detection model are adjusted accordingly. Iteration is repeated until the loss value stabilizes, at which point the target detection model is deemed to have converged, yielding the hand detection model.
The smart device may input each video frame in the sequence of video frames into the hand detection model in turn. If a hand is present in the video frame, the hand detection model may output hand position information in the video frame. The hand position information is used to indicate the position of the hand in the video frame, and may be generally represented as a rectangular box defining the area where the hand is located. The hand position information may be in the form of a combination of coordinates of the upper left corner and the lower right corner of a rectangular frame defining the region where the hand is located in the image coordinate system to which the video frame belongs, or a combination of coordinates of the center point, the width, and the height of a rectangular frame defining the region where the hand is located in the image coordinate system to which the video frame belongs. The specific form of the hand position information depends on the target detection model for training the hand detection model.
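The two box representations described above (corner coordinates versus center point with width and height) can be converted into one another. As a minimal Python sketch (an illustration, not part of the patent; the tuple layouts are assumptions):

```python
def corners_to_center(box):
    """Convert (x1, y1, x2, y2) corner coordinates to (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1)

def center_to_corners(box):
    """Convert (cx, cy, w, h) back to (x1, y1, x2, y2) corner coordinates."""
    cx, cy, w, h = box
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
```

Which format a detector emits depends on the model family used for training, so a small conversion layer like this keeps the downstream trajectory logic independent of that choice.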
Step 320: and determining a hand motion track according to hand position information in each video frame in the video frame sequence.
The intelligent device can track the hands according to the hand position information in each video frame in the video frame sequence, so that the hand motion track is determined. The hand motion trajectory may indicate a hand motion direction.
Step 330: and judging whether the hand is close to the identified nose part in the video frame or not according to the hand motion track.
The identified nose in the video frame can be represented by nose position information, and the nose position information indicates the position of the nose in the video frame. The nose position information may be represented as a rectangular box defining the area where the nose is located. The nose position information may be in the form of a combination of coordinates of an upper left corner and coordinates of a lower right corner of a rectangular frame defining the region where the nose is located in the image coordinate system to which the video frame belongs, or a combination of coordinates of a center point, a width, and a height of the rectangular frame defining the region where the nose is located in the image coordinate system to which the video frame belongs. The specific form of the nose position information depends on the target detection model for training the nose detection model.
For each hand position information in the hand motion track, the intelligent device can determine the distance between the hand and the nose according to the hand position information and the nose position information belonging to the same video frame as the hand position information. For example, the smart device may determine a distance between the hand position information and the nose position information according to a central point of a rectangular frame corresponding to the hand position information and a central point of a rectangular frame corresponding to the nose position information, and use the distance as the distance between the hand and the nose in the video frame.
The smart device may determine whether the distance between the hand and the nose gradually decreases. On the one hand, if not, it indicates that the hand is not close to the nose. On the other hand, if yes, the hand is close to the nose.
In an embodiment, the intelligent device may extract a plurality of hand position information from the hand motion trajectory, and determine a distance between the hand and the nose according to the selected hand position information and the nose position information belonging to the same video frame as the hand position information, so as to determine whether the hand is close to the nose. By this measure, it is not necessary to calculate the position information of each hand, and the amount of calculation can be reduced.
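The approach test of step 330, together with the sampling optimization just described, could be sketched as follows. This is an illustrative Python sketch under assumed corner-format boxes; the function name, the stride parameter, and the strict "every sampled distance decreases" criterion are assumptions, not the patent's exact rule.

```python
import math

def centers_approaching(hand_boxes, nose_box, stride=5):
    """Check whether the hand-to-nose center distance decreases over sampled frames.

    hand_boxes: per-frame (x1, y1, x2, y2) boxes along the hand motion trajectory.
    nose_box: (x1, y1, x2, y2) box of the identified nose.
    stride: sample every `stride`-th frame to reduce the amount of calculation.
    """
    nx = (nose_box[0] + nose_box[2]) / 2.0
    ny = (nose_box[1] + nose_box[3]) / 2.0
    dists = []
    for x1, y1, x2, y2 in hand_boxes[::stride]:
        hx, hy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        dists.append(math.hypot(hx - nx, hy - ny))  # center-to-center distance
    # Approaching if each sampled distance is smaller than the previous one.
    return all(b < a for a, b in zip(dists, dists[1:]))
```

Sampling with a stride, as the paragraph above suggests, avoids computing a distance for every single frame while still capturing the overall direction of motion.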
Step 340: if so, judging whether the staying time of the hand at the peripheral position of the nose part reaches a preset time threshold value or not according to the hand motion track.
The duration threshold may be an empirical value, or may be set according to eating habits of different children. When the staying time of the hands at the peripheral position of the nose reaches the time threshold, the infant can be determined to have the feeding behavior.
When the intelligent device determines that the hand in the video frame sequence is close to the nose, the hand position information, of which the distance from the identified nose position information is smaller than a preset distance threshold value, can be screened out from the hand motion track of the video frame sequence to serve as the first hand position information. Here, the first hand information is hand position information determined by the first filtering. When the distance between the hand and the nose is smaller than the distance threshold value, the hand can be determined to be located at the peripheral position of the nose. For example, the smart device may determine a distance between the hand and the nose according to a rectangular frame center point corresponding to the hand position information and a rectangular frame center point corresponding to the nose position information, and determine whether the distance is smaller than the distance threshold, thereby screening out the first hand position information.
The intelligent device can screen out a plurality of hand position information located in continuous video frames from the first hand position information to serve as second hand position information. Here, the second hand position information is hand position information determined through the secondary filtering. The smart device may filter out sets of second hand position information from the first hand position information.
Illustratively, the smart device screens out more than 5000 pieces of first hand position information from the hand motion trajectories from the 1001 st video frame to the 19000 th video frame, and screens out 3 sets of second hand position information from the first hand position information in consecutive video frames.
For each set of second hand position information, the smart device may judge whether the quantity of second hand position information reaches a preset quantity threshold. The quantity threshold is converted from the duration threshold: illustratively, with a duration threshold of 3 seconds and a video frame rate of 30 fps, the quantity threshold is 90. On the one hand, for any group of second hand position information, if the quantity in the group does not reach the quantity threshold, it can be determined that the dwell time of the hand near the nose does not reach the duration threshold. On the other hand, if the quantity in the group reaches the quantity threshold, it can be determined that the dwell time of the hand near the nose reaches the duration threshold.
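The duration-to-count conversion and the consecutive-frame screening described above can be sketched in a few lines of Python. The function names and the default values (3 seconds, 30 fps) follow the illustrative figures in the text; everything else is an assumption for illustration.

```python
def frames_threshold(duration_s, fps):
    """Convert a duration threshold into a frame-count (quantity) threshold."""
    return int(duration_s * fps)

def longest_consecutive_run(frame_indices):
    """Length of the longest run of strictly consecutive frame indices."""
    best = run = 1 if frame_indices else 0
    for prev, cur in zip(frame_indices, frame_indices[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best

def dwell_reached(near_nose_frames, duration_s=3, fps=30):
    """True if the hand stayed near the nose for at least `duration_s` seconds.

    near_nose_frames: indices of frames whose hand position passed the
    distance screening (the "first hand position information").
    """
    run = longest_consecutive_run(sorted(near_nose_frames))
    return run >= frames_threshold(duration_s, fps)
```

With the example figures in the text, 3 seconds at 30 fps yields a quantity threshold of 90 consecutive frames.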
Step 350: when the time length threshold is reached, outputting alarm information of edible foreign matters.
When the staying time of the hands at the peripheral position of the nose reaches a time threshold, the infant can be determined to have eating behavior. At this moment, in order to avoid the infant to eat the foreign matter, the intelligent equipment can output the alarm information of eating the foreign matter. Illustratively, the smart device may send out an alarm message in the form of voice through the audio device, such as "the baby is eating something dirty". Or, the intelligent device may output alarm information in the form of text, voice, animation, and the like adapted to a personal terminal (for example, a mobile phone, a tablet computer, an intelligent watch, intelligent glasses, and the like) of the alarm recipient to the personal terminal of the alarm recipient through a pre-configured contact manner of the alarm recipient (which may be a guardian).
Through the measures of steps 310 to 350, the area around the infant's face can be accurately monitored, a hand approaching the infant's face can be accurately tracked, and eating behavior can be accurately identified and alarmed, so that the infant is prevented from ingesting foreign matter.
In an embodiment, before the method for preventing the infant from eating the foreign object is executed, the intelligent device first determines whether a camera capturing a sequence of video frames is shifted. Referring to fig. 4, a flowchart of a method for checking a position of an image capturing apparatus according to an embodiment of the present disclosure is shown in fig. 4, where the method may include the following steps 301 to 304.
Step 301: the video frame is input into the trained nose detection model.
The nose detection model is used to detect a nose in a video frame. The nose detection model can be obtained through training of the target detection model. The target detection model can be any one of the models such as Faster R-CNN, SSD and YOLO. Prior to performing step 310 of the present application, the target detection model may be trained by the nose sample images. The nose sample image may be an image taken overhead from the baby's head containing the baby's nose, the nose sample image carrying a label indicating nose position information and category information in the nose sample image. The target detection model may output predicted location information and predicted category information for the nose sample image. And evaluating the difference between the predicted position information and the nose position information of the same nose sample image and the difference between the category information and the predicted category information through a loss function, so as to adjust the network parameters of the target detection model. And repeating iteration until the function value of the loss function is stable, and determining that the target detection model is converged to obtain the nose detection model.
Step 302: and judging whether the nose detection model outputs nose position information in the video frame.
After the intelligent device inputs the video frame into the nose detection model, whether the nose detection model outputs the nose position information in the video frame can be judged. On one hand, if the nose position information is not output, the nose of the infant does not appear in the video frame, at the moment, the position of the camera device deviates, and alarm information of the deviation of the camera device can be output. Illustratively, the smart device may send out alarm information in the form of voice, such as "device wearing irregularity", through the audio device. Or the intelligent device can output alarm information in the forms of characters, voice, animation and the like which are matched with the personal terminal to the personal terminal of the alarm receiver through the preset contact way of the alarm receiver. On the other hand, if the nose position information is output, execution may continue to step 303.
Step 303: if the nose position information is output, determine whether it matches the preset specified nose position information in the video frame.
Referring to fig. 5, which is a schematic diagram of a specified nose position in a video frame according to an embodiment of the present application, the solid-line box represents the entire video frame and the dashed-line box represents the specified nose position. The specified nose position defines where the infant's nose should appear in the video frame when the camera device is positioned correctly.
The smart device can determine whether the detected nose position information matches the specified nose position information. In one embodiment, the nose position information matches the specified nose position information when the region corresponding to the specified nose position information contains the region corresponding to the nose position information. The smart device can therefore check whether the region corresponding to the nose position information lies within the region corresponding to the specified nose position information.
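The containment rule above can be sketched as follows; the `(x1, y1, x2, y2)` box format and the coordinate values are illustrative assumptions, not conventions stated in the application.

```python
# Matching rule sketched above: the detected nose box matches when the
# specified region fully contains it. Box format (x1, y1, x2, y2) and the
# coordinates below are illustrative assumptions.
def box_contains(outer, inner):
    ox1, oy1, ox2, oy2 = outer
    ix1, iy1, ix2, iy2 = inner
    return ox1 <= ix1 and oy1 <= iy1 and ix2 <= ox2 and iy2 <= oy2

specified = (100, 80, 400, 300)   # dashed box: specified nose position
detected = (180, 120, 260, 190)   # box output by the nose detection model
matched = box_contains(specified, detected)  # nose position information matches
```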
Step 304: if not, output alarm information indicating that the camera device has shifted.
On the one hand, if the nose position information matches the specified nose position information, the camera device is positioned correctly, and the smart device can carry out the method for preventing the infant from eating foreign matter according to the video frame sequence collected by the camera device. On the other hand, if the nose position information does not match the specified nose position information, the camera device has shifted, and the smart device can output alarm information of the camera device deviation.
Through the above measures, the smart device can raise a timely alarm when the camera device shifts position, which avoids judging the infant's feeding behavior from an invalid video frame sequence.
In an embodiment, referring to fig. 6, which is a flowchart illustrating a method for determining a hand motion trajectory according to an embodiment of the present application, the smart device may perform the following steps 321 to 323 when determining the hand motion trajectory.
Step 321: calculate an intersection ratio according to hand position information in two consecutive video frames in the video frame sequence.
The hand detection model may identify two or more hands in a video frame. To track hands accurately, the smart device may calculate an intersection ratio (intersection over union, IoU) for hand position information in two consecutive video frames of the video frame sequence. For example, the smart device may calculate the intersection ratio of hand position information in the 1st and 2nd video frames of the sequence, then in the 2nd and 3rd video frames, then in the 3rd and 4th video frames, and so on.
Step 322: determine whether the intersection ratio reaches a preset intersection ratio threshold; if so, determine that the hand position information in the two consecutive video frames corresponds to the same hand.
Step 323: determine the motion trajectory of the hand according to the hand position information of the same hand in the video frame sequence.
The intersection ratio threshold is used to determine hand position information belonging to the same hand; it may be an empirical value, such as 90%.
After calculating the intersection ratio for the hand position information in two video frames, the smart device can determine whether it reaches the intersection ratio threshold. On the one hand, if the threshold is not reached, the two pieces of hand position information do not belong to the same hand. On the other hand, if the threshold is reached, the two pieces of hand position information belong to the same hand.
After determining hand position information belonging to the same hand in each video frame, the smart device may determine a hand motion trajectory for the hand.
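Steps 321 to 323 can be sketched as follows, assuming axis-aligned `(x1, y1, x2, y2)` boxes and, for simplicity, a single hand linked greedily from frame to frame; the 90% threshold follows the empirical value suggested above.

```python
# Sketch of steps 321-323: compute the intersection ratio (IoU) of hand
# boxes in consecutive frames and link boxes into one trajectory when the
# ratio reaches the threshold. Boxes are (x1, y1, x2, y2); the greedy
# single-hand linking is a simplifying assumption.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def link_track(frames, threshold=0.9):
    """Append a frame's box to the trajectory only when its IoU with the
    previous box reaches the threshold (same hand)."""
    track = [frames[0]]
    for box in frames[1:]:
        if iou(track[-1], box) >= threshold:
            track.append(box)
    return track

frames = [(0, 0, 100, 100), (0, 0, 100, 100), (2, 0, 102, 100), (60, 0, 160, 100)]
track = link_track(frames)  # the last box moved too far: not the same hand
```

With several hands per frame, a production tracker would instead match all box pairs (e.g. by highest IoU) rather than follow one box greedily.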
In an embodiment, referring to fig. 7, a flowchart of a method for determining whether a hand is close to a nose according to an embodiment of the present application is shown, and as shown in fig. 7, when determining whether a hand is close to a nose, a smart device may perform the following steps 331A to 333A.
Step 331A: determine whether the hand is close to the nose in the horizontal direction according to the plurality of hand position information in the hand motion trajectory and the identified nose position information.
When performing step 330, the smart device may determine, according to the foregoing embodiments, whether the hand is close to the nose in the horizontal direction. On the one hand, if the hand is not close to the nose, the smart device may continue to determine the positional relationship between the hand and the nose according to new video frames. On the other hand, if the hand is close to the nose in the horizontal direction, the smart device may continue to step 332A.
Step 332A: if so, determine whether the hand region size corresponding to the hand position information in the hand motion trajectory lies within a preset size interval.
Step 333A: when the hand region size lies within the size interval, determine that the hand is close to the identified nose.
The preset size interval may be an empirical value, or may be configured according to the actual situation of an individual infant. When the hand region size in the video frame lies within the size interval, it can be determined that the infant's hand is close to the nose in the vertical direction.
When the hand is close to the nose in the horizontal direction, the smart device may calculate the hand region size corresponding to the hand position information, in other words, the size of the rectangular box corresponding to the hand position information, and determine whether that size lies within the size interval. On the one hand, if it does not, the hand is not close to the nose in the vertical direction, so it may be determined that the hand is not close to the nose. On the other hand, if it does, the hand is close to the nose in both the horizontal and the vertical direction, so it may be determined that the hand is close to the nose.
Through the above measures, the smart device can judge from two dimensions whether the hand is close to the nose, and can therefore identify the infant's feeding action more accurately.
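A sketch of steps 331A to 333A under stated assumptions: horizontal closeness is modeled here as a center-to-center pixel distance, and the distance and size-interval values are illustrative, not values from the application.

```python
# Sketch of steps 331A-333A: horizontal closeness is modeled as a
# center-to-center distance check, vertical closeness as the hand box area
# lying in a preset size interval. All threshold values are illustrative.
def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def hand_near_nose(hand_box, nose_box, max_dist=50.0, size_interval=(3000, 8000)):
    hx, hy = center(hand_box)
    nx, ny = center(nose_box)
    horiz_close = ((hx - nx) ** 2 + (hy - ny) ** 2) ** 0.5 <= max_dist
    area = (hand_box[2] - hand_box[0]) * (hand_box[3] - hand_box[1])
    lo, hi = size_interval
    return horiz_close and lo <= area <= hi  # close in both dimensions

near = hand_near_nose((300, 200, 380, 270), (310, 210, 350, 250))
far = hand_near_nose((10, 10, 90, 80), (310, 210, 350, 250))
```

The size interval stands in for vertical closeness because, under an overhead camera, the hand appears larger the closer it rises toward the lens.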
In an embodiment, the video frame sequence is collected by a 3D camera; in other words, the camera device includes a 3D camera. In this case, the smart device may obtain a depth information matrix of a video frame from the 3D camera. The depth information matrix may be a two-dimensional matrix with the same width and height as the video frame, whose elements are the depth values of the pixels at the same positions in the video frame; a depth value represents a distance in the vertical direction. For example, if the dimensions of the video frame are 600 × 800 (height by width), the corresponding depth information matrix has 600 rows and 800 columns, and the element in the 100th row and 200th column of the depth information matrix is the depth value of the pixel at the same position in the video frame.
Referring to fig. 8, which is a schematic flowchart of a method for determining whether a hand is close to a nose according to an embodiment of the present application, as shown in fig. 8, when the smart device determines whether a hand is close to a nose, the following steps 331B to 334B may be performed.
Step 331B: determine whether the hand is close to the nose in the horizontal direction according to the plurality of hand position information in the hand motion trajectory and the identified nose position information.
When performing step 330, the smart device may determine, according to the foregoing embodiments, whether the hand is close to the nose in the horizontal direction. On the one hand, if the hand is not close to the nose, the smart device may continue to determine the positional relationship between the hand and the nose according to new video frames. On the other hand, if the hand is close to the nose in the horizontal direction, the smart device may continue to step 332B.
Step 332B: if so, obtain a depth information matrix corresponding to the hand position information, and determine the depth information corresponding to the hand position information according to that matrix.
According to the rectangular box corresponding to the hand position information, the smart device can extract from the depth information matrix of the video frame the two-dimensional sub-matrix bounded by that box, as the depth information matrix corresponding to the hand position information. The smart device can then average the elements of this sub-matrix and use the average as the depth information corresponding to the hand position information. The depth information indicates the position of the hand in the vertical direction.
Step 333B: determine whether the depth information lies within a preset depth information interval.
Step 334B: when the depth information lies within the depth information interval, determine that the hand is close to the identified nose.
The depth information interval may be an empirical value, or may be configured according to the actual situation of an individual infant. When the depth information corresponding to the hand position information in the video frame lies within the depth information interval, it can be determined that the infant's hand is close to the nose in the vertical direction.
The smart device can determine whether the depth information lies within the depth information interval. On the one hand, if it does not, the hand is not close to the nose in the vertical direction, so it may be determined that the hand is not close to the nose. On the other hand, if it does, the hand is close to the nose in both the horizontal and the vertical direction, so it may be determined that the hand is close to the nose.
Through the above measures, the smart device can obtain from the 3D camera a depth information matrix that accurately represents position in the vertical direction, and can therefore judge from two dimensions whether the hand is close to the nose and accurately identify the infant's feeding action.
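Steps 332B and 333B can be sketched as follows; the depth matrix is represented as a plain list of rows, and the depth interval and depth values are illustrative assumptions (e.g. millimetres).

```python
# Sketch of steps 332B-333B: crop the frame's depth matrix with the hand's
# rectangular box, average the cropped depth values, and test the average
# against a preset depth interval. All values are illustrative.
def hand_depth(depth_matrix, box):
    """Mean depth inside box (x1, y1, x2, y2); x2/y2 are exclusive."""
    x1, y1, x2, y2 = box
    values = [depth_matrix[r][c] for r in range(y1, y2) for c in range(x1, x2)]
    return sum(values) / len(values)

def hand_near_nose_depth(depth_matrix, box, depth_interval=(200, 400)):
    lo, hi = depth_interval
    return lo <= hand_depth(depth_matrix, box) <= hi

# 4x6 toy depth map; the hand occupies rows 1-2, columns 2-3
depth = [
    [900, 900, 900, 900, 900, 900],
    [900, 900, 250, 260, 900, 900],
    [900, 900, 270, 280, 900, 900],
    [900, 900, 900, 900, 900, 900],
]
near = hand_near_nose_depth(depth, (2, 1, 4, 3))  # mean depth 265.0 is in range
```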
Referring to fig. 9, which is a block diagram of an apparatus for preventing an infant from eating foreign matter according to an embodiment of the present application, as shown in fig. 9, the apparatus may include:
the identification module 910 is configured to input each video frame in the collected video frame sequence into a trained hand detection model, and obtain hand position information in the video frame output by the hand detection model;
a determining module 920, configured to determine a hand motion trajectory according to hand position information in each video frame of the sequence of video frames;
a first determining module 930, configured to determine whether a hand is close to the identified nose in the video frame according to the hand motion trajectory;
a second judging module 940, configured to, if so, judge according to the hand motion trajectory whether the staying time of the hand at a peripheral position of the nose reaches a preset duration threshold;
and an alarm module 950, configured to output alarm information of foreign matter eating when the duration threshold is reached.
The implementation of the functions and actions of each module in the above apparatus is described in detail in the implementation of the corresponding steps of the method for preventing an infant from eating foreign matter, and is not repeated here.
In the embodiments provided in the present application, the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims (10)

1. A method for preventing an infant from eating foreign matter, comprising:
inputting each video frame in the collected video frame sequence into a trained hand detection model to obtain hand position information in the video frame output by the hand detection model;
determining a hand motion track according to hand position information in each video frame in the video frame sequence;
judging whether the hand is close to the identified nose part in the video frame or not according to the hand motion track;
if so, judging, according to the hand motion track, whether the staying time of the hand at the peripheral position of the nose reaches a preset duration threshold;
and outputting alarm information of foreign matter eating when the duration threshold is reached.
2. The method of claim 1, wherein prior to inputting each video frame of the sequence of video frames into the hand detection model, the method further comprises:
inputting the video frame into a trained nose detection model;
judging whether the nose detection model outputs nose position information in the video frame;
if the nose position information is output, judging whether the nose position information is matched with the preset appointed nose position information in the video frame;
if not, alarm information of the deviation of the camera device is output.
3. The method of claim 2, further comprising:
and if the nose position information is not output, outputting alarm information of the deviation of the camera device.
4. The method of claim 1, wherein determining a hand motion trajectory from hand position information in each of the sequence of video frames comprises:
calculating an intersection ratio according to hand position information in the front video frame and the back video frame in the video frame sequence;
judging whether the intersection ratio reaches a preset intersection ratio threshold value, and if so, determining that the hand position information in the front and back video frames corresponds to the same hand;
and determining the hand motion track according to hand position information of the same hand in the video frame sequence.
5. The method of claim 1, wherein the determining whether a hand is close to an identified nose in the video frame according to the hand motion trajectory comprises:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, judging whether the hand position information in the hand motion track corresponds to the hand area size and is positioned in a preset size interval;
determining that the hand is close to the identified nose when the hand region size is within the size interval.
6. The method of claim 1, wherein the sequence of video frames is acquired by a 3D camera;
the judging whether the hand is close to the nose part identified in the video frame or not according to the hand motion track comprises the following steps:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, acquiring a depth information matrix corresponding to the hand position information, and determining the depth information corresponding to the hand position information according to the depth information matrix;
judging whether the depth information is located in a preset depth information interval or not;
when the depth information is located in the depth information interval, determining that the hand is close to the identified nose.
7. The method of claim 1, wherein the determining, according to the hand motion trajectory, whether the staying time of the hand at the peripheral position of the nose reaches a preset duration threshold comprises:
screening out, from the hand motion trajectory, hand position information whose distance from the identified nose position information is smaller than a preset distance threshold, as first hand position information;
screening out, from the first hand position information, a plurality of pieces of hand position information in consecutive video frames, as second hand position information;
judging whether the quantity of the second hand position information reaches a preset quantity threshold, the quantity threshold being converted from the duration threshold;
and if so, determining that the staying time of the hand at the peripheral position of the nose reaches the duration threshold.
8. A device for preventing an infant from eating foreign matter, comprising:
the identification module is used for inputting each video frame in the collected video frame sequence into a trained hand detection model and obtaining hand position information in the video frame output by the hand detection model;
the determining module is used for determining a hand motion track according to hand position information in each video frame in the video frame sequence;
the first judging module is used for judging whether the hand part is close to the identified nose part in the video frame or not according to the hand part motion track;
the second judgment module is used for, if so, judging according to the hand motion track whether the staying time of the hand at the peripheral position of the nose reaches a preset duration threshold;
and the alarm module is used for outputting alarm information of foreign matter eating when the duration threshold is reached.
9. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method for preventing an infant from eating foreign matter according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program executable by a processor to perform the method for preventing an infant from eating foreign matter according to any one of claims 1 to 7.
CN202011413220.9A 2020-12-02 2020-12-02 Method and device for preventing infant from eating foreign matter, electronic device and storage medium Active CN112528824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011413220.9A CN112528824B (en) 2020-12-02 2020-12-02 Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN112528824A true CN112528824A (en) 2021-03-19
CN112528824B CN112528824B (en) 2022-11-25

Family

ID=74997733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011413220.9A Active CN112528824B (en) 2020-12-02 2020-12-02 Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112528824B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104332022A (en) * 2014-11-21 2015-02-04 上海斐讯数据通信技术有限公司 Warning system and warning method for preventing infant from eating foreign matter
US20190065872A1 (en) * 2017-08-25 2019-02-28 Toyota Jidosha Kabushiki Kaisha Behavior recognition apparatus, learning apparatus, and method and program therefor
CN109993065A (en) * 2019-03-06 2019-07-09 开易(北京)科技有限公司 Driving behavior detection method and system based on deep learning
CN110852190A (en) * 2019-10-23 2020-02-28 华中科技大学 Driving behavior recognition method and system integrating target detection and gesture recognition
CN111931740A (en) * 2020-09-29 2020-11-13 创新奇智(南京)科技有限公司 Commodity sales amount identification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112528824B (en) 2022-11-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant