CN112528824A - Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Info

Publication number
CN112528824A
CN112528824A · CN202011413220.9A · CN112528824B
Authority
CN
China
Prior art keywords
hand
position information
video frame
nose
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011413220.9A
Other languages
Chinese (zh)
Other versions
CN112528824B (en)
Inventor
张发恩
林国森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chuangxin Qizhi Shenzhen Technology Co ltd
Original Assignee
Chuangxin Qizhi Shenzhen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chuangxin Qizhi Shenzhen Technology Co ltd filed Critical Chuangxin Qizhi Shenzhen Technology Co ltd
Priority to CN202011413220.9A priority Critical patent/CN112528824B/en
Publication of CN112528824A publication Critical patent/CN112528824A/en
Application granted granted Critical
Publication of CN112528824B publication Critical patent/CN112528824B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02Alarms for ensuring the safety of persons
    • G08B21/0202Child monitoring systems using a transmitter-receiver system carried by the parent and the child
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B3/00Audible signalling systems; Audible personal calling systems
    • G08B3/10Audible signalling systems; Audible personal calling systems using electric transmission; using electromagnetic transmission

Abstract

The application provides a method and a device for preventing infants from eating foreign matter, an electronic device, and a computer-readable storage medium. The method comprises the following steps: inputting each video frame of a collected video frame sequence into a trained hand detection model to obtain the hand position information output by the model for that frame; determining a hand motion trajectory from the hand position information across the video frame sequence; judging, according to the hand motion trajectory, whether the hand approaches the nose identified in the video frames; if so, judging, according to the hand motion trajectory, whether the dwell time of the hand near the nose reaches a preset duration threshold; and, when the duration threshold is reached, outputting an alarm indicating foreign-matter ingestion. The method and device can accurately monitor the area around the infant's face, accurately track a hand approaching the face, and accurately identify eating behavior and raise an alarm, thereby preventing the infant from ingesting foreign matter.

Description

Method and device for preventing infant from eating foreign matter, electronic device and storage medium
Technical Field
The present disclosure relates to the field of intelligent home appliances, and more particularly, to a method and an apparatus for preventing infants from eating foreign objects, an electronic device, and a computer-readable storage medium.
Background
Young children playing alone tend to put nearby toys into their mouths. If a toy is small, it may be swallowed accidentally, causing serious injury to the child. It is therefore necessary to monitor an infant's behavior and raise an alarm when eating behavior is detected, so as to prevent the infant from ingesting foreign matter.
In the related art, the infant's hands can be monitored by a distance sensor, the monitored motion trajectory data is compared with a preset hand motion trajectory, and if the two match, the infant is determined to exhibit eating behavior. However, this scheme is too simple: in practice a distance sensor can hardly monitor the hand accurately, so its reliability is poor.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and an apparatus for preventing an infant from eating foreign matter, an electronic device, and a computer-readable storage medium, all of which serve to prevent an infant from ingesting foreign matter.
In one aspect, the present application provides a method of preventing an infant from eating foreign matter, comprising:
inputting each video frame in the collected video frame sequence into a trained hand detection model to obtain hand position information in the video frame output by the hand detection model;
determining a hand motion track according to hand position information in each video frame in the video frame sequence;
judging, according to the hand motion trajectory, whether the hand approaches the nose identified in the video frames;
if so, judging, according to the hand motion trajectory, whether the dwell time of the hand near the nose reaches a preset duration threshold;
and outputting an alarm indicating foreign-matter ingestion when the duration threshold is reached.
In an embodiment, prior to inputting each video frame of the sequence of video frames into the hand detection model, the method further comprises:
inputting the video frame into a trained nose detection model;
judging whether the nose detection model outputs nose position information in the video frame;
if the nose position information is output, judging whether the nose position information is matched with the preset appointed nose position information in the video frame;
if not, outputting an alarm indicating that the camera device has shifted.
In an embodiment, the method further comprises:
and if the nose position information is not output, outputting an alarm indicating that the camera device has shifted.
In one embodiment, the determining a hand motion trajectory from hand position information in each video frame of the sequence of video frames comprises:
calculating an intersection-over-union (IoU) from the hand position information in two successive video frames of the video frame sequence;
judging whether the IoU reaches a preset IoU threshold, and if so, determining that the hand position information in the two successive frames corresponds to the same hand;
and determining the hand motion trajectory from the position information of that same hand across the video frame sequence.
In an embodiment, the determining whether a hand is close to an identified nose in the video frame according to the hand motion trajectory includes:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, judging whether the size of the hand region corresponding to the hand position information in the hand motion trajectory lies within a preset size interval;
and determining that the hand is close to the identified nose when the hand region size lies within that interval.
In an embodiment, the sequence of video frames is acquired by a 3D camera;
the judging whether the hand is close to the nose part identified in the video frame or not according to the hand motion track comprises the following steps:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, acquiring a depth information matrix corresponding to the hand position information, and determining the depth information corresponding to the hand position information according to the depth information matrix;
judging whether the depth information is located in a preset depth information interval or not;
when the depth information is located in the depth information interval, determining that the hand is close to the identified nose.
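The depth-based check in the embodiment above could be sketched as follows. This is a minimal Python illustration, not the patent's implementation: the function name, the corner-format box tuple, and the use of a median statistic over the hand region are all assumptions.

```python
import numpy as np

def hand_depth_in_range(depth_map, hand_box, depth_interval):
    """Check whether the depth measured inside the hand box lies in a preset interval.

    depth_map: 2-D array of per-pixel depth from the 3D camera.
    hand_box: (x1, y1, x2, y2) corner coordinates of the hand region.
    depth_interval: (lo, hi) preset depth interval near the nose.
    """
    x1, y1, x2, y2 = (int(v) for v in hand_box)
    patch = depth_map[y1:y2, x1:x2]  # depth values inside the hand region
    if patch.size == 0:
        return False
    depth = float(np.median(patch))  # robust summary of the region's depth
    lo, hi = depth_interval
    return lo <= depth <= hi
```

With a median rather than a single pixel, isolated sensor noise inside the hand region does not flip the decision.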
In an embodiment, the determining whether the staying time of the hand at the peripheral position of the nose reaches a preset time threshold according to the hand motion trajectory includes:
screening out hand position information of which the distance from the identified nose position information is smaller than a preset distance threshold value from the hand motion trajectory, and taking the hand position information as first hand position information;
screening a plurality of hand position information in continuous video frames from the first hand position information to be used as second hand position information;
judging whether the quantity of the second hand position information reaches a preset quantity threshold value or not; the number threshold is obtained by converting the duration threshold;
if so, determining that the staying time of the hand at the peripheral position of the nose reaches the time threshold.
In another aspect, the present application further provides a device for preventing an infant from eating foreign matter, comprising:
the identification module is used for inputting each video frame in the collected video frame sequence into a trained hand detection model and obtaining hand position information in the video frame output by the hand detection model;
the determining module is used for determining a hand motion track according to hand position information in each video frame in the video frame sequence;
the first judging module is used for judging whether the hand part is close to the identified nose part in the video frame or not according to the hand part motion track;
the second judging module is used for judging, when the hand is close to the nose, whether the dwell time of the hand near the nose reaches a preset duration threshold according to the hand motion trajectory;
and the alarm module is used for outputting an alarm indicating foreign-matter ingestion when the duration threshold is reached.
Further, the present application also provides an electronic device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the above method of preventing a child from eating a foreign object.
In addition, the present application also provides a computer-readable storage medium storing a computer program executable by a processor to implement the above method for preventing an infant from eating foreign matter.
In the embodiments of the application, after the hand position information of each video frame in a video frame sequence is identified, a hand motion trajectory is determined from the hand position information across the frames, and whether the hand approaches the nose identified in the video frames is judged according to the trajectory; if so, whether the dwell time of the hand near the nose reaches a duration threshold is judged according to the trajectory; and, when the duration threshold is reached, an alarm indicating foreign-matter ingestion is output. The scheme of the application can accurately monitor the area around the infant's face, accurately track a hand approaching the face, and accurately identify eating behavior and raise an alarm, thereby preventing the infant from ingesting foreign matter.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic view of an application scenario of a method for preventing a child from eating foreign matter according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 3 is a flowchart illustrating a method for preventing a child from eating foreign substances according to an embodiment of the present application;
fig. 4 is a schematic flowchart of a method for checking a position of an image capturing apparatus according to an embodiment of the present disclosure;
FIG. 5 is a diagram illustrating a designated nose position in a video frame according to an embodiment of the present application;
fig. 6 is a flowchart illustrating a method for determining a hand motion trajectory according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating a method for determining whether a hand is close to a nose according to an embodiment of the present disclosure;
FIG. 8 is a schematic flow chart illustrating a method for determining whether a hand is close to a nose according to another embodiment of the present disclosure;
fig. 9 is a block diagram of an apparatus for preventing a child from eating foreign substances according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic view of an application scenario of the method for preventing an infant from eating foreign matter according to an embodiment of the present application. As shown in fig. 1, the application scenario includes an image capture apparatus 40 and a smart device 50. The image capture apparatus 40 may be a head-mounted camera device configured to capture real-time images looking downward from above the infant's head, so as to collect a video frame sequence of the area around the infant's face and transmit it to the smart device 50. The smart device 50 may be a computer host, a tablet computer, or another electronic device with computing capability, and is configured to judge, from the collected video frame sequence, whether the infant exhibits eating behavior.
As shown in fig. 2, the present embodiment provides an electronic apparatus 1 including: at least one processor 11 and a memory 12, one processor 11 being exemplified in fig. 2. The processor 11 and the memory 12 are connected by a bus 10, and the memory 12 stores instructions executable by the processor 11, and the instructions are executed by the processor 11, so that the electronic device 1 can execute all or part of the flow of the method in the embodiments described below. In an embodiment, the electronic device 1 may be the smart device 50 described above.
The memory 12 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The present application also provides a computer readable storage medium storing a computer program executable by a processor 11 to perform the method for preventing a child from eating foreign objects provided by the present application.
Referring to fig. 3, a flow chart of a method for preventing a child from eating foreign substances according to an embodiment of the present application is shown in fig. 3, and the method may include the following steps 310 to 350.
Step 310: and inputting each video frame in the collected video frame sequence into the trained hand detection model to obtain hand position information in the video frame output by the hand detection model.
The scheme can be executed by the smart device. Alternatively, the smart device may be integrated into the camera device, in which case the camera device may directly perform the steps of the scheme after collecting the video frame sequence. For convenience of description, the following takes a smart device as the execution subject.
The video frame sequence is generated in real time by shooting the area around the face of the infant by the camera device, and the video frame in the video frame sequence comprises the image of the area around the face of the infant.
The hand detection model is used to detect a hand in a video frame and can be obtained by training a target detection model. The target detection model may be any of Faster R-CNN (Faster Region-based Convolutional Neural Network), SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), and the like. Before step 310 is performed, the target detection model may be trained on hand sample images. Each hand sample image carries a label indicating the hand position information and category information in that image, and the target detection model outputs predicted position information and predicted category information for the image. A loss function evaluates the difference between the predicted position information and the labeled hand position information of the same sample, and between the predicted and labeled category information, and the network parameters of the target detection model are adjusted accordingly. Iteration is repeated until the loss value stabilizes, at which point the target detection model is deemed to have converged, yielding the hand detection model.
The smart device may input each video frame in the sequence of video frames into the hand detection model in turn. If a hand is present in the video frame, the hand detection model may output hand position information in the video frame. The hand position information is used to indicate the position of the hand in the video frame, and may be generally represented as a rectangular box defining the area where the hand is located. The hand position information may be in the form of a combination of coordinates of the upper left corner and the lower right corner of a rectangular frame defining the region where the hand is located in the image coordinate system to which the video frame belongs, or a combination of coordinates of the center point, the width, and the height of a rectangular frame defining the region where the hand is located in the image coordinate system to which the video frame belongs. The specific form of the hand position information depends on the target detection model for training the hand detection model.
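The two box representations described above (corner coordinates versus center point with width and height) can be converted into one another. As a minimal Python sketch (an illustration, not part of the patent; the tuple layouts are assumptions):

```python
def corners_to_center(box):
    """Convert (x1, y1, x2, y2) corner coordinates to (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1)

def center_to_corners(box):
    """Convert (cx, cy, w, h) back to (x1, y1, x2, y2) corner coordinates."""
    cx, cy, w, h = box
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
```

Which format a detector emits depends on the model family used for training, so a small conversion layer like this keeps the downstream trajectory logic independent of that choice.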
Step 320: and determining a hand motion track according to hand position information in each video frame in the video frame sequence.
The intelligent device can track the hands according to the hand position information in each video frame in the video frame sequence, so that the hand motion track is determined. The hand motion trajectory may indicate a hand motion direction.
Step 330: and judging whether the hand is close to the identified nose part in the video frame or not according to the hand motion track.
The identified nose in the video frame can be represented by nose position information, and the nose position information indicates the position of the nose in the video frame. The nose position information may be represented as a rectangular box defining the area where the nose is located. The nose position information may be in the form of a combination of coordinates of an upper left corner and coordinates of a lower right corner of a rectangular frame defining the region where the nose is located in the image coordinate system to which the video frame belongs, or a combination of coordinates of a center point, a width, and a height of the rectangular frame defining the region where the nose is located in the image coordinate system to which the video frame belongs. The specific form of the nose position information depends on the target detection model for training the nose detection model.
For each hand position information in the hand motion track, the intelligent device can determine the distance between the hand and the nose according to the hand position information and the nose position information belonging to the same video frame as the hand position information. For example, the smart device may determine a distance between the hand position information and the nose position information according to a central point of a rectangular frame corresponding to the hand position information and a central point of a rectangular frame corresponding to the nose position information, and use the distance as the distance between the hand and the nose in the video frame.
The smart device may determine whether the distance between the hand and the nose gradually decreases. On the one hand, if not, it indicates that the hand is not close to the nose. On the other hand, if yes, the hand is close to the nose.
In an embodiment, the intelligent device may extract a plurality of hand position information from the hand motion trajectory, and determine a distance between the hand and the nose according to the selected hand position information and the nose position information belonging to the same video frame as the hand position information, so as to determine whether the hand is close to the nose. By this measure, it is not necessary to calculate the position information of each hand, and the amount of calculation can be reduced.
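The approach test of step 330, together with the sampling optimization just described, could be sketched as follows. This is an illustrative Python sketch under assumed corner-format boxes; the function name, the stride parameter, and the strict "every sampled distance decreases" criterion are assumptions, not the patent's exact rule.

```python
import math

def centers_approaching(hand_boxes, nose_box, stride=5):
    """Check whether the hand-to-nose center distance decreases over sampled frames.

    hand_boxes: per-frame (x1, y1, x2, y2) boxes along the hand motion trajectory.
    nose_box: (x1, y1, x2, y2) box of the identified nose.
    stride: sample every `stride`-th frame to reduce the amount of calculation.
    """
    nx = (nose_box[0] + nose_box[2]) / 2.0
    ny = (nose_box[1] + nose_box[3]) / 2.0
    dists = []
    for x1, y1, x2, y2 in hand_boxes[::stride]:
        hx, hy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        dists.append(math.hypot(hx - nx, hy - ny))  # center-to-center distance
    # Approaching if each sampled distance is smaller than the previous one.
    return all(b < a for a, b in zip(dists, dists[1:]))
```

Sampling with a stride, as the paragraph above suggests, avoids computing a distance for every single frame while still capturing the overall direction of motion.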
Step 340: if so, judging whether the staying time of the hand at the peripheral position of the nose part reaches a preset time threshold value or not according to the hand motion track.
The duration threshold may be an empirical value, or may be set according to eating habits of different children. When the staying time of the hands at the peripheral position of the nose reaches the time threshold, the infant can be determined to have the feeding behavior.
When the intelligent device determines that the hand in the video frame sequence is close to the nose, the hand position information, of which the distance from the identified nose position information is smaller than a preset distance threshold value, can be screened out from the hand motion track of the video frame sequence to serve as the first hand position information. Here, the first hand information is hand position information determined by the first filtering. When the distance between the hand and the nose is smaller than the distance threshold value, the hand can be determined to be located at the peripheral position of the nose. For example, the smart device may determine a distance between the hand and the nose according to a rectangular frame center point corresponding to the hand position information and a rectangular frame center point corresponding to the nose position information, and determine whether the distance is smaller than the distance threshold, thereby screening out the first hand position information.
The intelligent device can screen out a plurality of hand position information located in continuous video frames from the first hand position information to serve as second hand position information. Here, the second hand position information is hand position information determined through the secondary filtering. The smart device may filter out sets of second hand position information from the first hand position information.
Illustratively, the smart device screens out more than 5000 pieces of first hand position information from the hand motion trajectories from the 1001 st video frame to the 19000 th video frame, and screens out 3 sets of second hand position information from the first hand position information in consecutive video frames.
For each set of second hand position information, the smart device may judge whether the quantity of second hand position information reaches a preset quantity threshold. The quantity threshold is converted from the duration threshold: illustratively, with a duration threshold of 3 seconds and a video frame rate of 30 fps, the quantity threshold is 90. On the one hand, for any group of second hand position information, if the quantity in the group does not reach the quantity threshold, it can be determined that the dwell time of the hand near the nose does not reach the duration threshold. On the other hand, if the quantity in the group reaches the quantity threshold, it can be determined that the dwell time of the hand near the nose reaches the duration threshold.
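The duration-to-count conversion and the consecutive-frame screening described above can be sketched in a few lines of Python. The function names and the default values (3 seconds, 30 fps) follow the illustrative figures in the text; everything else is an assumption for illustration.

```python
def frames_threshold(duration_s, fps):
    """Convert a duration threshold into a frame-count (quantity) threshold."""
    return int(duration_s * fps)

def longest_consecutive_run(frame_indices):
    """Length of the longest run of strictly consecutive frame indices."""
    best = run = 1 if frame_indices else 0
    for prev, cur in zip(frame_indices, frame_indices[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best

def dwell_reached(near_nose_frames, duration_s=3, fps=30):
    """True if the hand stayed near the nose for at least `duration_s` seconds.

    near_nose_frames: indices of frames whose hand position passed the
    distance screening (the "first hand position information").
    """
    run = longest_consecutive_run(sorted(near_nose_frames))
    return run >= frames_threshold(duration_s, fps)
```

With the example figures in the text, 3 seconds at 30 fps yields a quantity threshold of 90 consecutive frames.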
Step 350: when the time length threshold is reached, outputting alarm information of edible foreign matters.
When the staying time of the hands at the peripheral position of the nose reaches a time threshold, the infant can be determined to have eating behavior. At this moment, in order to avoid the infant to eat the foreign matter, the intelligent equipment can output the alarm information of eating the foreign matter. Illustratively, the smart device may send out an alarm message in the form of voice through the audio device, such as "the baby is eating something dirty". Or, the intelligent device may output alarm information in the form of text, voice, animation, and the like adapted to a personal terminal (for example, a mobile phone, a tablet computer, an intelligent watch, intelligent glasses, and the like) of the alarm recipient to the personal terminal of the alarm recipient through a pre-configured contact manner of the alarm recipient (which may be a guardian).
Through the measures of steps 310 to 350, the area around the infant's face can be accurately monitored, a hand approaching the infant's face can be accurately tracked, and eating behavior can be accurately identified and alarmed, so that the infant is prevented from ingesting foreign matter.
In an embodiment, before the method for preventing the infant from eating the foreign object is executed, the intelligent device first determines whether a camera capturing a sequence of video frames is shifted. Referring to fig. 4, a flowchart of a method for checking a position of an image capturing apparatus according to an embodiment of the present disclosure is shown in fig. 4, where the method may include the following steps 301 to 304.
Step 301: the video frame is input into the trained nose detection model.
The nose detection model is used to detect a nose in a video frame. The nose detection model can be obtained through training of the target detection model. The target detection model can be any one of the models such as Faster R-CNN, SSD and YOLO. Prior to performing step 310 of the present application, the target detection model may be trained by the nose sample images. The nose sample image may be an image taken overhead from the baby's head containing the baby's nose, the nose sample image carrying a label indicating nose position information and category information in the nose sample image. The target detection model may output predicted location information and predicted category information for the nose sample image. And evaluating the difference between the predicted position information and the nose position information of the same nose sample image and the difference between the category information and the predicted category information through a loss function, so as to adjust the network parameters of the target detection model. And repeating iteration until the function value of the loss function is stable, and determining that the target detection model is converged to obtain the nose detection model.
Step 302: and judging whether the nose detection model outputs nose position information in the video frame.
After the intelligent device inputs the video frame into the nose detection model, whether the nose detection model outputs the nose position information in the video frame can be judged. On one hand, if the nose position information is not output, the nose of the infant does not appear in the video frame, at the moment, the position of the camera device deviates, and alarm information of the deviation of the camera device can be output. Illustratively, the smart device may send out alarm information in the form of voice, such as "device wearing irregularity", through the audio device. Or the intelligent device can output alarm information in the forms of characters, voice, animation and the like which are matched with the personal terminal to the personal terminal of the alarm receiver through the preset contact way of the alarm receiver. On the other hand, if the nose position information is output, execution may continue to step 303.
Step 303: if the nose position information is output, determine whether it matches the preset specified nose position information in the video frame.
Referring to fig. 5, which is a schematic diagram of a specified nose position in a video frame according to an embodiment of the present application, the solid-line box represents the entire video frame and the dashed-line box represents the specified nose position. The specified nose position defines where the infant's nose should appear in the video frame when the camera device is positioned correctly.
The smart device can determine whether the detected nose position information matches the specified nose position information. In one embodiment, the nose position information matches the specified nose position information when the region corresponding to the specified nose position information contains the region corresponding to the nose position information. The smart device can therefore check whether the region corresponding to the nose position information lies within the region corresponding to the specified nose position information.
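The containment rule above can be sketched as follows; the `(x1, y1, x2, y2)` box format and the coordinate values are illustrative assumptions, not conventions stated in the application.

```python
# Matching rule sketched above: the detected nose box matches when the
# specified region fully contains it. Box format (x1, y1, x2, y2) and the
# coordinates below are illustrative assumptions.
def box_contains(outer, inner):
    ox1, oy1, ox2, oy2 = outer
    ix1, iy1, ix2, iy2 = inner
    return ox1 <= ix1 and oy1 <= iy1 and ix2 <= ox2 and iy2 <= oy2

specified = (100, 80, 400, 300)   # dashed box: specified nose position
detected = (180, 120, 260, 190)   # box output by the nose detection model
matched = box_contains(specified, detected)  # nose position information matches
```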
Step 304: if not, output alarm information indicating that the camera device has shifted.
On the one hand, if the nose position information matches the specified nose position information, the camera device is positioned correctly, and the smart device can carry out the method for preventing the infant from eating foreign matter according to the video frame sequence collected by the camera device. On the other hand, if the nose position information does not match the specified nose position information, the camera device has shifted, and the smart device can output alarm information of the camera device deviation.
Through the above measures, the smart device can raise a timely alarm when the camera device shifts position, which avoids judging the infant's feeding behavior from an invalid video frame sequence.
In an embodiment, referring to fig. 6, which is a flowchart illustrating a method for determining a hand motion trajectory according to an embodiment of the present application, the smart device may perform the following steps 321 to 323 when determining the hand motion trajectory.
Step 321: calculate an intersection ratio according to hand position information in two consecutive video frames in the video frame sequence.
The hand detection model may identify two or more hands in a video frame. To track hands accurately, the smart device may calculate an intersection ratio (intersection over union, IoU) for hand position information in two consecutive video frames of the video frame sequence. For example, the smart device may calculate the intersection ratio of hand position information in the 1st and 2nd video frames of the sequence, then in the 2nd and 3rd video frames, then in the 3rd and 4th video frames, and so on.
Step 322: determine whether the intersection ratio reaches a preset intersection ratio threshold; if so, determine that the hand position information in the two consecutive video frames corresponds to the same hand.
Step 323: determine the motion trajectory of the hand according to the hand position information of the same hand in the video frame sequence.
The intersection ratio threshold is used to determine hand position information belonging to the same hand; it may be an empirical value, such as 90%.
After calculating the intersection ratio for the hand position information in two video frames, the smart device can determine whether it reaches the intersection ratio threshold. On the one hand, if the threshold is not reached, the two pieces of hand position information do not belong to the same hand. On the other hand, if the threshold is reached, the two pieces of hand position information belong to the same hand.
After determining hand position information belonging to the same hand in each video frame, the smart device may determine a hand motion trajectory for the hand.
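Steps 321 to 323 can be sketched as follows, assuming axis-aligned `(x1, y1, x2, y2)` boxes and, for simplicity, a single hand linked greedily from frame to frame; the 90% threshold follows the empirical value suggested above.

```python
# Sketch of steps 321-323: compute the intersection ratio (IoU) of hand
# boxes in consecutive frames and link boxes into one trajectory when the
# ratio reaches the threshold. Boxes are (x1, y1, x2, y2); the greedy
# single-hand linking is a simplifying assumption.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def link_track(frames, threshold=0.9):
    """Append a frame's box to the trajectory only when its IoU with the
    previous box reaches the threshold (same hand)."""
    track = [frames[0]]
    for box in frames[1:]:
        if iou(track[-1], box) >= threshold:
            track.append(box)
    return track

frames = [(0, 0, 100, 100), (0, 0, 100, 100), (2, 0, 102, 100), (60, 0, 160, 100)]
track = link_track(frames)  # the last box moved too far: not the same hand
```

With several hands per frame, a production tracker would instead match all box pairs (e.g. by highest IoU) rather than follow one box greedily.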
In an embodiment, referring to fig. 7, a flowchart of a method for determining whether a hand is close to a nose according to an embodiment of the present application is shown, and as shown in fig. 7, when determining whether a hand is close to a nose, a smart device may perform the following steps 331A to 333A.
Step 331A: determine whether the hand is close to the nose in the horizontal direction according to the plurality of hand position information in the hand motion trajectory and the identified nose position information.
When performing step 330, the smart device may determine, according to the foregoing embodiments, whether the hand is close to the nose in the horizontal direction. On the one hand, if the hand is not close to the nose, the smart device may continue to determine the positional relationship between the hand and the nose according to new video frames. On the other hand, if the hand is close to the nose in the horizontal direction, the smart device may continue to step 332A.
Step 332A: if so, determine whether the hand region size corresponding to the hand position information in the hand motion trajectory lies within a preset size interval.
Step 333A: when the hand region size lies within the size interval, determine that the hand is close to the identified nose.
The preset size interval may be an empirical value, or may be configured according to the actual situation of an individual infant. When the hand region size in the video frame lies within the size interval, it can be determined that the infant's hand is close to the nose in the vertical direction.
When the hand is close to the nose in the horizontal direction, the smart device may calculate the hand region size corresponding to the hand position information, in other words, the size of the rectangular box corresponding to the hand position information, and determine whether that size lies within the size interval. On the one hand, if it does not, the hand is not close to the nose in the vertical direction, so it may be determined that the hand is not close to the nose. On the other hand, if it does, the hand is close to the nose in both the horizontal and the vertical direction, so it may be determined that the hand is close to the nose.
Through the above measures, the smart device can judge from two dimensions whether the hand is close to the nose, and can therefore identify the infant's feeding action more accurately.
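A sketch of steps 331A to 333A under stated assumptions: horizontal closeness is modeled here as a center-to-center pixel distance, and the distance and size-interval values are illustrative, not values from the application.

```python
# Sketch of steps 331A-333A: horizontal closeness is modeled as a
# center-to-center distance check, vertical closeness as the hand box area
# lying in a preset size interval. All threshold values are illustrative.
def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def hand_near_nose(hand_box, nose_box, max_dist=50.0, size_interval=(3000, 8000)):
    hx, hy = center(hand_box)
    nx, ny = center(nose_box)
    horiz_close = ((hx - nx) ** 2 + (hy - ny) ** 2) ** 0.5 <= max_dist
    area = (hand_box[2] - hand_box[0]) * (hand_box[3] - hand_box[1])
    lo, hi = size_interval
    return horiz_close and lo <= area <= hi  # close in both dimensions

near = hand_near_nose((300, 200, 380, 270), (310, 210, 350, 250))
far = hand_near_nose((10, 10, 90, 80), (310, 210, 350, 250))
```

The size interval stands in for vertical closeness because, under an overhead camera, the hand appears larger the closer it rises toward the lens.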
In an embodiment, the video frame sequence is collected by a 3D camera; in other words, the camera device includes a 3D camera. In this case, the smart device may obtain a depth information matrix of a video frame from the 3D camera. The depth information matrix may be a two-dimensional matrix with the same width and height as the video frame, whose elements are the depth values of the pixels at the same positions in the video frame; a depth value represents a distance in the vertical direction. For example, if the dimensions of the video frame are 600 × 800 (height by width), the corresponding depth information matrix has 600 rows and 800 columns, and the element in the 100th row and 200th column of the depth information matrix is the depth value of the pixel at the same position in the video frame.
Referring to fig. 8, which is a schematic flowchart of a method for determining whether a hand is close to a nose according to an embodiment of the present application, as shown in fig. 8, when the smart device determines whether a hand is close to a nose, the following steps 331B to 334B may be performed.
Step 331B: determine whether the hand is close to the nose in the horizontal direction according to the plurality of hand position information in the hand motion trajectory and the identified nose position information.
When performing step 330, the smart device may determine, according to the foregoing embodiments, whether the hand is close to the nose in the horizontal direction. On the one hand, if the hand is not close to the nose, the smart device may continue to determine the positional relationship between the hand and the nose according to new video frames. On the other hand, if the hand is close to the nose in the horizontal direction, the smart device may continue to step 332B.
Step 332B: if so, obtain a depth information matrix corresponding to the hand position information, and determine the depth information corresponding to the hand position information according to that matrix.
According to the rectangular box corresponding to the hand position information, the smart device can extract from the depth information matrix of the video frame the two-dimensional sub-matrix bounded by that box, as the depth information matrix corresponding to the hand position information. The smart device can then average the elements of this sub-matrix and use the average as the depth information corresponding to the hand position information. The depth information indicates the position of the hand in the vertical direction.
Step 333B: determine whether the depth information lies within a preset depth information interval.
Step 334B: when the depth information lies within the depth information interval, determine that the hand is close to the identified nose.
The depth information interval may be an empirical value, or may be configured according to the actual situation of an individual infant. When the depth information corresponding to the hand position information in the video frame lies within the depth information interval, it can be determined that the infant's hand is close to the nose in the vertical direction.
The smart device can determine whether the depth information lies within the depth information interval. On the one hand, if it does not, the hand is not close to the nose in the vertical direction, so it may be determined that the hand is not close to the nose. On the other hand, if it does, the hand is close to the nose in both the horizontal and the vertical direction, so it may be determined that the hand is close to the nose.
Through the above measures, the smart device can obtain from the 3D camera a depth information matrix that accurately represents position in the vertical direction, and can therefore judge from two dimensions whether the hand is close to the nose and accurately identify the infant's feeding action.
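Steps 332B and 333B can be sketched as follows; the depth matrix is represented as a plain list of rows, and the depth interval and depth values are illustrative assumptions (e.g. millimetres).

```python
# Sketch of steps 332B-333B: crop the frame's depth matrix with the hand's
# rectangular box, average the cropped depth values, and test the average
# against a preset depth interval. All values are illustrative.
def hand_depth(depth_matrix, box):
    """Mean depth inside box (x1, y1, x2, y2); x2/y2 are exclusive."""
    x1, y1, x2, y2 = box
    values = [depth_matrix[r][c] for r in range(y1, y2) for c in range(x1, x2)]
    return sum(values) / len(values)

def hand_near_nose_depth(depth_matrix, box, depth_interval=(200, 400)):
    lo, hi = depth_interval
    return lo <= hand_depth(depth_matrix, box) <= hi

# 4x6 toy depth map; the hand occupies rows 1-2, columns 2-3
depth = [
    [900, 900, 900, 900, 900, 900],
    [900, 900, 250, 260, 900, 900],
    [900, 900, 270, 280, 900, 900],
    [900, 900, 900, 900, 900, 900],
]
near = hand_near_nose_depth(depth, (2, 1, 4, 3))  # mean depth 265.0 is in range
```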
Referring to fig. 9, which is a block diagram of an apparatus for preventing an infant from eating foreign matter according to an embodiment of the present application, as shown in fig. 9, the apparatus may include:
the identification module 910 is configured to input each video frame in the collected video frame sequence into a trained hand detection model, and obtain hand position information in the video frame output by the hand detection model;
a determining module 920, configured to determine a hand motion trajectory according to hand position information in each video frame of the sequence of video frames;
a first determining module 930, configured to determine whether a hand is close to the identified nose in the video frame according to the hand motion trajectory;
a second judging module 940, configured to, if so, judge according to the hand motion trajectory whether the staying time of the hand at a peripheral position of the nose reaches a preset duration threshold;
and an alarm module 950, configured to output alarm information of foreign matter eating when the duration threshold is reached.
The implementation of the functions and actions of each module in the above apparatus is described in detail in the implementation of the corresponding steps of the method for preventing an infant from eating foreign matter, and is not repeated here.
In the embodiments provided in the present application, the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims (10)

1. A method for preventing an infant from eating foreign matter, comprising:
inputting each video frame in the collected video frame sequence into a trained hand detection model to obtain hand position information in the video frame output by the hand detection model;
determining a hand motion track according to hand position information in each video frame in the video frame sequence;
judging whether the hand is close to the identified nose part in the video frame or not according to the hand motion track;
if so, judging, according to the hand motion track, whether the staying time of the hand at the peripheral position of the nose reaches a preset duration threshold;
and outputting alarm information of foreign matter eating when the duration threshold is reached.
2. The method of claim 1, wherein prior to inputting each video frame of the sequence of video frames into the hand detection model, the method further comprises:
inputting the video frame into a trained nose detection model;
judging whether the nose detection model outputs nose position information in the video frame;
if the nose position information is output, judging whether the nose position information is matched with the preset appointed nose position information in the video frame;
if not, alarm information of the deviation of the camera device is output.
3. The method of claim 2, further comprising:
and if the nose position information is not output, outputting alarm information of the deviation of the camera device.
4. The method of claim 1, wherein determining a hand motion trajectory from hand position information in each of the sequence of video frames comprises:
calculating an intersection ratio according to hand position information in the front video frame and the back video frame in the video frame sequence;
judging whether the intersection ratio reaches a preset intersection ratio threshold value, and if so, determining that the hand position information in the front and back video frames corresponds to the same hand;
and determining the hand motion track according to hand position information of the same hand in the video frame sequence.
5. The method of claim 1, wherein the determining whether a hand is close to an identified nose in the video frame according to the hand motion trajectory comprises:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, judging whether the hand position information in the hand motion track corresponds to the hand area size and is positioned in a preset size interval;
determining that the hand is close to the identified nose when the hand region size is within the size interval.
6. The method of claim 1, wherein the sequence of video frames is acquired by a 3D camera;
the judging whether the hand is close to the nose part identified in the video frame or not according to the hand motion track comprises the following steps:
determining whether the hand is close to the nose in the horizontal direction according to the position information of the plurality of hands in the hand motion track and the identified position information of the nose;
if so, acquiring a depth information matrix corresponding to the hand position information, and determining the depth information corresponding to the hand position information according to the depth information matrix;
judging whether the depth information is located in a preset depth information interval or not;
when the depth information is located in the depth information interval, determining that the hand is close to the identified nose.
7. The method of claim 1, wherein the determining, according to the hand motion trajectory, whether the staying time of the hand at the peripheral position of the nose reaches a preset duration threshold comprises:
screening out, from the hand motion trajectory, hand position information whose distance from the identified nose position information is smaller than a preset distance threshold, as first hand position information;
screening out, from the first hand position information, a plurality of pieces of hand position information in consecutive video frames, as second hand position information;
judging whether the quantity of the second hand position information reaches a preset quantity threshold, the quantity threshold being converted from the duration threshold;
and if so, determining that the staying time of the hand at the peripheral position of the nose reaches the duration threshold.
8. A device for preventing an infant from eating foreign matter, comprising:
the identification module is used for inputting each video frame in the collected video frame sequence into a trained hand detection model and obtaining hand position information in the video frame output by the hand detection model;
the determining module is used for determining a hand motion track according to hand position information in each video frame in the video frame sequence;
the first judging module is used for judging whether the hand part is close to the identified nose part in the video frame or not according to the hand part motion track;
the second judgment module is used for, if so, judging according to the hand motion track whether the staying time of the hand at the peripheral position of the nose reaches a preset duration threshold;
and the alarm module is used for outputting alarm information of foreign matter eating when the duration threshold is reached.
9. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method for preventing an infant from eating foreign matter according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program executable by a processor to perform the method for preventing an infant from eating foreign matter according to any one of claims 1 to 7.
CN202011413220.9A 2020-12-02 2020-12-02 Method and device for preventing infant from eating foreign matter, electronic device and storage medium Active CN112528824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011413220.9A CN112528824B (en) 2020-12-02 2020-12-02 Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN112528824A true CN112528824A (en) 2021-03-19
CN112528824B CN112528824B (en) 2022-11-25

Family

ID=74997733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011413220.9A Active CN112528824B (en) 2020-12-02 2020-12-02 Method and device for preventing infant from eating foreign matter, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112528824B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104332022A (en) * 2014-11-21 2015-02-04 上海斐讯数据通信技术有限公司 Warning system and warning method for preventing infant from eating foreign matter
US20190065872A1 (en) * 2017-08-25 2019-02-28 Toyota Jidosha Kabushiki Kaisha Behavior recognition apparatus, learning apparatus, and method and program therefor
CN109993065A (en) * 2019-03-06 2019-07-09 开易(北京)科技有限公司 Driving behavior detection method and system based on deep learning
CN110852190A (en) * 2019-10-23 2020-02-28 华中科技大学 Driving behavior recognition method and system integrating target detection and gesture recognition
CN111931740A (en) * 2020-09-29 2020-11-13 创新奇智(南京)科技有限公司 Commodity sales amount identification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112528824B (en) 2022-11-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant