CN112597800B - Method and system for detecting sitting-up actions of students in recording and broadcasting system - Google Patents


Info

Publication number
CN112597800B
CN112597800B (application CN202011327975.7A)
Authority
CN
China
Prior art keywords
motion
sitting
judgment
standing
angle
Prior art date
Legal status
Active
Application number
CN202011327975.7A
Other languages
Chinese (zh)
Other versions
CN112597800A (en)
Inventor
张进
蒋守欢
廖亮亮
王满海
Current Assignee
ANHUI TELEHOME DIGITAL TECHNOLOGY CO LTD
Original Assignee
ANHUI TELEHOME DIGITAL TECHNOLOGY CO LTD
Priority date
Filing date
Publication date
Application filed by ANHUI TELEHOME DIGITAL TECHNOLOGY CO LTD
Priority claimed from CN202011327975.7A
Publication of CN112597800A
Application granted
Publication of CN112597800B
Status: Active

Classifications

    • G06V 40/20: Recognition of biometric, human-related or animal-related patterns in image or video data; movements or behaviour, e.g. gesture recognition
    • G06F 18/22: Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06V 10/50: Extraction of image or video features by performing operations within image blocks or by using histograms, e.g. histogram of oriented gradients [HoG]
    • G06V 40/166: Human faces; detection, localisation or normalisation using acquisition arrangements
    • G06V 40/168: Human faces; feature extraction and face representation
    • Y02D 30/70: Reducing energy consumption in wireless communication networks


Abstract

The invention discloses a method and a system for detecting the standing and sitting actions of students in a recording and broadcasting system, comprising the following steps: S100, acquiring a frame of image; S200, preprocessing the acquired image; S300, performing foreground extraction on the preprocessed image to obtain a motion history image; S400, judging possible targets in the motion history image and storing them; S500, performing two-stage judgment on the stored information through a judgment module to obtain the coordinates of standing and sitting. The method detects standing and sitting actions based on the motion history image and the gradient direction, and adopts two-stage judgment to improve detection accuracy. The motion history image overcomes the incomplete-target problem of background-modeling methods, the global motion angle is then extracted from the complete target, and the final decision is made by the two-stage discrimination method.

Description

Method and system for detecting sitting-up actions of students in recording and broadcasting system
Technical Field
The invention relates to the technical field of motion detection, in particular to a method and a system for detecting the standing and sitting actions of students in a recording and broadcasting system.
Background
In intelligent recording and broadcasting, student interaction is monitored mainly by detecting the students' standing and sitting actions; some detection devices use capacitive pressure sensing and similar hardware.
To save cost, methods based on vision processing are receiving increasing attention. CN102096930A uses background modeling to determine the moving target and then template matching to decide whether a student stands up or sits down; it does not consider interference from other movements at all, so its false-detection rate is high. CN110728696A first selects the moving object by background modeling, then makes the standing judgment with feature points and sparse optical flow, and finally gives the result according to a preset line. This method has several disadvantages: 1. a simple background-modeling method can hardly extract a complete moving object in a complex environment such as a classroom; 2. feature points are easily disturbed by lighting and clothing, and the optical-flow method is computationally expensive; 3. the preset-line method is too restrictive: if the camera moves even slightly, the entire decision condition fails.
Disclosure of Invention
The invention provides a method and a system for detecting the standing and sitting actions of students in a recording and broadcasting system, which overcome the above technical defects.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a method for detecting sitting actions of students in a recording and broadcasting system comprises the following steps:
s100, acquiring a frame of image;
s200, preprocessing the acquired image;
s300, performing foreground extraction on the preprocessed image to obtain a motion history image;
s400, judging and storing a possibility target of the motion history graph;
s500, two-stage judgment is carried out on the stored information through a judgment module to obtain coordinates of standing and sitting.
Further, preprocessing the acquired image in S200 includes scaling the height and width of the acquired 4K image to 1/4 of the original height and width, respectively, and then performing gaussian filtering.
Further, the step S300 is to extract the foreground of the preprocessed image to obtain a motion history map;
the method specifically comprises the following steps:
(c1) The inter-frame difference method takes the absolute value of the pixel-value difference between each point in the current frame and the corresponding point of a previous frame; the two-frame difference method uses the previous frame, the three-frame difference method the frame before that, and so on. The formula is shown in equation (1):
F(x,y) = |I(x,y) - I_pre(x,y)|   (1)
where F(x,y) is the result after the frame difference, |·| denotes the absolute value, I(x,y) is the pixel value of the current frame at coordinate (x,y), and I_pre(x,y) is the pixel value of the previous frame at (x,y);
specifically, an annular buffer buf storing 4 frames of images is created; the images in buf[i] and buf[i+2] are taken out, and the result after the frame difference is calculated with equation (1);
(c2) The motion history image sets pixels where motion occurred to the current timestamp and clears pixels whose last motion happened too long ago, as shown in equation (2):
mhi(x,y) = timestamp, if silh(x,y) ≠ 0;  mhi(x,y) = 0, if silh(x,y) = 0 and mhi(x,y) < timestamp - duration;  mhi(x,y) unchanged, otherwise.   (2)
Specifically, a binarization threshold of 13 is set and F(x,y) is binarized to obtain silh(x,y). When silh(x,y) ≠ 0, the motion history image mhi(x,y) takes the current system time timestamp; when silh(x,y) = 0 and mhi(x,y) is smaller than the current timestamp minus the duration, mhi(x,y) is set to 0; otherwise mhi(x,y) is kept unchanged;
the binarized image of the moving object is obtained through the steps, 3*3 morphological corrosion and expansion processing is carried out on the binarized image, and then the circumscribed rectangle of the moving object is extracted to be used as input for judging the possible standing and sitting object module.
Further, in S300 the value of the duration variable is set between 0.3 and 0.5 s according to the actual seating situation of the students in the classroom scene.
Further, step S400, judging possible targets in the motion history image and storing them, specifically comprises the following:
the upward angle of standing is set to 80 < angle < 170, and the downward angle of sitting to 200 < angle < 300;
the specific angle solving comprises gradient direction solving and global motion direction solving;
wherein,
(d1) Gradient direction solving
The gradient direction is calculated as shown in equation (3):
angle(x,y) = fastAtan2(∂mhi/∂y, ∂mhi/∂x)   (3)
Specifically, the Sobel operator is used to compute ∂mhi/∂x and ∂mhi/∂y respectively; the fastAtan2 function then computes the arctangent of the two derivatives to obtain the gradient direction of the motion history image;
(d2) Global motion direction solving
The global motion direction is the average direction of the selected region, from which an angle value between 0 and 360 is obtained;
the average direction is calculated from a weighted orientation histogram. The weight formula is ω = a·mhi(x,y) + b, with b = 1 - t·a and a = 1/dt (so that ω = 1 for pixels updated at the current timestamp and ω = 0 for pixels about to expire), where t is the timestamp in mhi and dt the duration in mhi; thus the most recent motion carries the greatest weight and motion that occurred further in the past carries less;
specifically, the gradient direction map is divided into 12 equal parts over 0 to 360 degrees to obtain an orientation histogram; the coordinate of the histogram maximum is taken as the basic direction; the weights of the motion-history pixels are computed from the weight formula and the initialized weight coefficients; the weighted relative offset from the basic direction is calculated; and the final motion-direction angle is obtained from the offset plus the basic direction.
Further, in step S500 the stored information is judged in two stages by the judgment module to obtain the coordinates of standing and sitting; the two-stage judgment mode is adopted to improve detection precision.
The first-stage judgment uses the continuity of standing and sitting motions: the motion direction of a target is computed over consecutive frames. In the second stage, the standing judgment adds face detection on top of the first-stage result to determine the final decision, while the sitting judgment uses HOG feature-similarity comparison.
Further, in the first-stage judgment, based on the characteristics of students' standing and sitting motions, an upward movement sustained for 10 consecutive frames is considered a standing action and a downward movement sustained for 10 consecutive frames a sitting action.
First, motion judgment between similar regions of adjacent frames is carried out: a feature vector is initialized to store the motion angles and motion coordinate regions of all possible targets. If the motion angle of a foreground region extracted from the current frame satisfies 80 < angle < 170, its coordinates are compared with all coordinates of the previous frame in a non-maximum-suppression manner; two regions are judged similar when the ratio of their overlap area to the smaller of the two areas is greater than 0.5. Considering interference during the motion, the angle may temporarily leave the window, so the frame count is cleared only when the condition 80 < angle < 170 fails for 3 consecutive frames. When a feature vector reaches a frame count greater than 10, all angles stored in the vector are further examined, and only when enough frames satisfy 80 < angle < 95 is the motion judged to be vertical (standing);
the sitting judgment follows the same logic with the downward angle window; at the final determination it is only necessary to check whether the frame count of the feature vector exceeds 10 frames, in which case a sitting action is determined.
Further, the second stage uses face detection for the standing judgment:
10 frames within two seconds at the position determined in the first stage are taken out and combined into one image for detection; if two or more faces are detected and their geometric positions lie in the upper half of the selected region, a standing action is determined;
for the sitting action, the HOG features extracted from each frame after the first-stage position determination are compared for similarity with the HOG features of the stored image; if the similarity value is less than 0.7, sitting is determined.
On the other hand, the invention also discloses a system for detecting the sitting action of students in the recording and broadcasting system, which comprises the following units,
the image acquisition module is used for acquiring images in the lesson scene of the student;
the image preprocessing module is used for preprocessing the acquired image;
the foreground extraction module is used for extracting the foreground of the preprocessed image to obtain a motion history image;
the possibility standing sitting target judging module is used for judging and storing a possibility target of the motion history map;
and the judging module is used for carrying out two-stage judgment on the stored information through the judging module to obtain the coordinates of standing and sitting.
According to the technical scheme, the method for detecting the standing and sitting actions of students in a recording and broadcasting system provided by the invention is based on the motion history image and the gradient direction, and adopts two-stage judgment to improve detection accuracy. The motion history image overcomes the incomplete-target problem of background-modeling methods, the global motion angle is then extracted from the complete target, and the final decision is made by the two-stage discrimination method.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention uses the three-frame difference method together with the motion history image to determine the moving object, which both reduces computational complexity and solves the problem of incomplete moving-object detection. Since multi-frame information is ultimately needed for the judgment, complete targets greatly reduce the complexity of the subsequent decision;
(2) The two-stage discrimination method greatly improves the detection rate. The first-stage discrimination filters out interference from slight movements, but larger-amplitude actions such as raising a hand may still cause misjudgment; adding face detection over several consecutive combined frames avoids the instability of single-frame detection and reduces interference caused by viewing angle;
(3) The invention is simple to deploy: only one pan-tilt camera is needed, extra operations such as drawing reference lines are omitted, and construction complexity is reduced.
Drawings
FIG. 1 is a flow schematic of the method of the present invention;
FIG. 2 is a schematic diagram of a first level decision flow of the present invention;
FIG. 3 is a schematic diagram of the second-stage standing decision by face detection of the present invention;
fig. 4 is a schematic diagram of a sitting motion determination flow of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention.
As shown in fig. 1, in the method for detecting the standing and sitting actions of students in a recording and playing system according to this embodiment, the whole process flow includes: collecting a frame of image; preprocessing the image; performing foreground extraction on the preprocessed image to obtain a motion history image; judging possible targets in the motion history image and storing them; and performing two-stage judgment on the stored information through the judgment module to obtain the coordinates of standing and sitting.
The following is a specific description:
(a) Image acquisition module
Images in the student classroom scene are collected.
(b) Image preprocessing module
The image preprocessing module mainly comprises the steps of scaling and filtering the image, specifically, scaling the height and width of the acquired 4K image to 1/4 of the original height and width respectively, and then performing Gaussian filtering processing.
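As an illustrative sketch (not part of the patent), the preprocessing step can be implemented roughly as follows. A real pipeline would use cv2.resize and cv2.GaussianBlur; this NumPy-only version substitutes plain subsampling and a separable 3×3 binomial kernel for those calls:

```python
import numpy as np

def preprocess(frame):
    """Scale height and width to 1/4 and apply 3x3 Gaussian smoothing.

    cv2.resize + cv2.GaussianBlur would be used in practice; here we
    subsample and convolve a separable kernel to stay self-contained.
    """
    small = frame[::4, ::4].astype(np.float64)   # 4K frame -> quarter size
    k = np.array([1.0, 2.0, 1.0]) / 4.0          # 1-D binomial (Gaussian) kernel
    pad = np.pad(small, 1, mode="edge")          # replicate borders
    rows = k[0] * pad[:, :-2] + k[1] * pad[:, 1:-1] + k[2] * pad[:, 2:]
    return k[0] * rows[:-2, :] + k[1] * rows[1:-1, :] + k[2] * rows[2:, :]
```

The smoothing kernel and border handling are assumptions; the 1/4 scaling factor comes from the text.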
(c) Foreground extraction module
The foreground extraction module is one of the key steps of standing and sitting action detection and localization; subsequent operations are judged on the basis of the foreground target. As discussed above, a simple background-modeling method can hardly extract a complete target, so the combination of the frame-difference method and the motion history image is adopted to solve the completeness problem of target extraction.
(c1) The basic idea of the inter-frame difference method is to take the absolute value of the pixel-value difference between each point in the current frame and the corresponding point of a previous frame; the two-frame difference method uses the previous frame, the three-frame difference method the frame before that, and so on. The formula is shown in equation (1):
F(x,y) = |I(x,y) - I_pre(x,y)|   (1)
where F(x,y) is the result after the frame difference, |·| denotes the absolute value, I(x,y) is the pixel value of the current frame at coordinate (x,y), and I_pre(x,y) is the pixel value of the previous frame at (x,y).
Specifically, an annular buffer buf storing 4 frames of images is created; the images in buf[i] and buf[i+2] are taken out, and the result after the frame difference is calculated with equation (1).
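A minimal sketch of this ring-buffer frame difference (the 4-frame buffer and the buf[i] versus buf[i+2] pairing follow the text; the deque-based implementation is an illustrative assumption):

```python
import numpy as np
from collections import deque

buf = deque(maxlen=4)  # annular buffer holding the last 4 frames

def push_and_diff(frame):
    """Store a frame; once at least 3 are buffered, return the absolute
    difference between frames two apart, i.e. equation (1) with I_pre
    taken two frames back (three-frame style)."""
    buf.append(frame)
    if len(buf) < 3:
        return None
    a, b = buf[-3], buf[-1]  # buf[i] and buf[i+2]
    return np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)
```

Casting to int16 before subtracting avoids uint8 wrap-around on negative differences.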
(c2) The basic idea of the motion history image is to set pixels where motion occurred to the current timestamp and to clear pixels whose last motion happened too long ago, as shown in equation (2):
mhi(x,y) = timestamp, if silh(x,y) ≠ 0;  mhi(x,y) = 0, if silh(x,y) = 0 and mhi(x,y) < timestamp - duration;  mhi(x,y) unchanged, otherwise.   (2)
Specifically, a binarization threshold of 13 is set and F(x,y) is binarized to obtain silh(x,y). When silh(x,y) ≠ 0, mhi(x,y) takes the current system time timestamp; when silh(x,y) = 0 and mhi(x,y) is smaller than the current timestamp minus the duration, mhi(x,y) is set to 0; otherwise mhi(x,y) remains unchanged. The value of the duration variable is typically between 0.3 and 0.5 s, depending on the actual seating situation of the students in the classroom scene.
Through the above steps the binarized image of the moving object is obtained; a 3×3 morphological erosion and dilation is applied to it, and the bounding rectangles of the moving objects are then extracted as input to the possible standing/sitting target judgment module.
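The MHI update of equation (2), with the binarization threshold from the text, can be sketched as follows (the erosion/dilation and bounding-rectangle steps are omitted; parameter names are illustrative):

```python
import numpy as np

THRESH = 13  # binarization threshold from the text

def update_mhi(mhi, fdiff, timestamp, duration=0.5):
    """Equation (2): moving pixels take the current timestamp; pixels
    whose last motion is older than `duration` are cleared to 0."""
    silh = fdiff >= THRESH               # binarized silhouette of motion
    out = mhi.copy()
    out[silh] = timestamp                # motion -> current timestamp
    stale = ~silh & (out < timestamp - duration)
    out[stale] = 0.0                     # expired motion -> 0
    return out                           # all other pixels unchanged
```

OpenCV's motion-template module (cv2.motempl.updateMotionHistory in opencv-contrib) implements the same update.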
(d) Target judgment module for possible standing and sitting
The foreground extraction module outputs a number of possible target bounding rectangles. According to the characteristics of the standing and sitting actions, these rectangles are taller than they are wide, so rectangles that do not match upward or downward motion are excluded first.
Students' standing and sitting actions are continuous processes: standing is an upward movement and sitting a downward movement. However, considering that the actions are not always standard (some students lean forward in the first half of standing up, some sway left and right), the standing angle is set to 80 < angle < 170 and the sitting angle to 200 < angle < 300 in order to cover all possible standing and sitting actions.
Specific angle solutions include gradient direction solutions and global motion direction solutions.
(d1) Gradient direction solving
The gradient direction is calculated as shown in equation (3):
angle(x,y) = fastAtan2(∂mhi/∂y, ∂mhi/∂x)   (3)
Specifically, the Sobel operator is used to compute ∂mhi/∂x and ∂mhi/∂y respectively; the fastAtan2 function then computes the arctangent of the two derivatives to obtain the gradient direction of the motion history image.
(d2) Global motion direction solution
The basic idea of the global motion direction is to calculate the average direction of the selected region, from which an angle value between 0 and 360 is obtained. The average direction is calculated from a weighted orientation histogram. The weight formula is ω = a·mhi(x,y) + b, with b = 1 - t·a and a = 1/dt (so that ω = 1 for pixels updated at the current timestamp and ω = 0 for pixels about to expire), where t is the timestamp in mhi and dt the duration in mhi; from the formula, the most recent motion carries the greatest weight and motion that occurred further in the past carries less.
Specifically, the gradient direction map is divided into 12 equal parts over 0 to 360 degrees to obtain an orientation histogram; the coordinate of the histogram maximum is taken as the basic direction; the weights of the motion-history pixels are computed from the weight formula and the initialized weight coefficients; the weighted relative offset from the basic direction is calculated; and the final motion-direction angle is obtained from the offset plus the basic direction.
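The global-orientation procedure above can be sketched as follows. The 12-bin histogram, the peak-as-basic-direction rule, and the weighted offset follow the text; the exact weight coefficients (a = 1/duration, b = 1 - timestamp·a) are an assumption consistent with the reconstructed formula:

```python
import numpy as np

BINS = 12  # 0..360 degrees split into 12 bins, as in the text

def global_orientation(angles, mhi, timestamp, duration):
    """Weighted average direction of a region: histogram peak as the
    basic direction, plus a time-weighted relative offset."""
    mask = mhi > 0
    if not mask.any():
        return 0.0
    ang = angles[mask]
    hist, edges = np.histogram(ang, bins=BINS, range=(0.0, 360.0))
    base = edges[np.argmax(hist)] + 360.0 / BINS / 2.0  # peak bin centre
    a = 1.0 / duration                 # assumed coefficient
    b = 1.0 - timestamp * a            # b = 1 - t*a from the text
    w = np.clip(mhi[mask] * a + b, 0.0, 1.0)   # recent motion weighs more
    rel = np.mod(ang - base + 180.0, 360.0) - 180.0  # offset in (-180, 180]
    shift = (w * rel).sum() / max(w.sum(), 1e-9)
    return float(np.mod(base + shift, 360.0))
```

OpenCV's cv2.motempl.calcGlobalOrientation performs the equivalent computation on real motion templates.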
(e) Decision module
The judgment module makes the final decision from the foreground content determined in the previous steps and the solved motion direction. The design of this module determines detection precision, so a two-stage judgment mode is adopted. The first stage uses the continuity of standing and sitting motions, judging them by computing the motion direction of a target over consecutive frames; in the second stage, the standing judgment adds face detection on top of the first-stage result to determine the final decision, while the sitting judgment uses HOG feature-similarity comparison.
The first-stage judgment, based on the characteristics of students' standing and sitting motions, considers an upward movement trend sustained for 10 consecutive frames to be a standing action and a downward trend sustained for 10 consecutive frames a sitting action, as shown in fig. 2. In the specific steps, motion judgment between similar regions of adjacent frames is performed first: a feature vector is initialized to store the motion angles and motion coordinate regions of all possible targets. If the motion angle of a foreground region extracted from the current frame satisfies 80 < angle < 170, its coordinates are compared with all coordinates of the previous frame in a non-maximum-suppression manner; two regions are judged similar when the ratio of their overlap area to the smaller of the two areas is greater than 0.5. Considering interference generated during the motion, the angle of up to 3 consecutive frames may fail to satisfy 80 < angle < 170, so the frame count is cleared only when the condition fails for 3 consecutive frames. When a feature vector reaches a frame count greater than 10, all angles stored in the vector are further examined, and only when enough frames satisfy 80 < angle < 95 is the motion judged to be vertical (standing). The sitting judgment is logically similar; at the final determination it is only necessary to check whether the frame count of the feature vector exceeds 10 frames, in which case a sitting action is determined.
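The first-stage logic can be sketched as below. The overlap test, the angle windows, the 3-frame tolerance, and the 10-frame count follow the text; the "enough near-vertical frames" criterion (here: at least half) is an assumption, since the original is ambiguous on the exact threshold:

```python
def overlap_ratio(r1, r2):
    """Intersection area divided by the smaller rectangle's area.
    Rectangles are (x, y, w, h)."""
    x1, y1 = max(r1[0], r2[0]), max(r1[1], r2[1])
    x2 = min(r1[0] + r1[2], r2[0] + r2[2])
    y2 = min(r1[1] + r1[3], r2[1] + r2[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    smaller = min(r1[2] * r1[3], r2[2] * r2[3])
    return inter / smaller if smaller else 0.0

class StandTracker:
    """First-stage standing judgment: more than 10 matched frames with
    80 < angle < 170, tolerating up to 2 consecutive off-angle frames
    (the count resets on the 3rd, per the text)."""
    def __init__(self):
        self.count = 0       # consecutive matched frames
        self.misses = 0      # consecutive frames outside the window
        self.angles = []     # angles stored for the final check

    def update(self, angle, region_matches):
        if 80 < angle < 170 and region_matches:
            self.count += 1
            self.misses = 0
            self.angles.append(angle)
        else:
            self.misses += 1
            if self.misses >= 3:                 # 3 bad frames in a row
                self.count, self.misses, self.angles = 0, 0, []
        if self.count > 10:
            # near-vertical frames (80 < angle < 95) must dominate;
            # "at least half" is an assumed threshold
            vertical = sum(1 for a in self.angles if 80 < a < 95)
            return vertical >= len(self.angles) // 2
        return False
```

The sitting tracker would be identical with the 200-300 window and no vertical check.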
The second stages of the standing and sitting actions differ considerably. For the standing action, the first-stage judgment handles misjudgment caused by students' slight movements, but strongly interfering actions such as raising a hand can still cause misjudgment. To handle this interference, the second stage uses face detection for the standing judgment, as shown in fig. 3. Face detection is a mature technique applied in many scenes; the minimum detectable face can reach 10 × 10 pixels, with high precision and high speed. To improve detection efficiency and robustness, 10 frames within two seconds at the position determined by the first stage are taken out and combined into one image for detection; if two or more faces are detected and their geometric positions lie in the upper half of the selected region, a standing action is determined. For the sitting action, as shown in fig. 4, the HOG features extracted from each frame after the first-stage position determination are compared for similarity with the HOG features of the stored image; if the similarity value is less than 0.7, sitting is determined.
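As an illustrative sketch of the second-stage sitting check: the patent does not specify the similarity metric for the HOG comparison, so cosine similarity is assumed here, together with the 0.7 threshold from the text:

```python
import numpy as np

def hog_similarity(feat_a, feat_b):
    """Cosine similarity between two HOG feature vectors (assumed metric)."""
    a = np.asarray(feat_a, dtype=float).ravel()
    b = np.asarray(feat_b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def judge_sit(stored_feat, current_feat, thresh=0.7):
    """Sitting is confirmed when similarity to the stored standing-frame
    features drops below the 0.7 threshold from the text."""
    return hog_similarity(stored_feat, current_feat) < thresh
```

In practice the HOG vectors themselves would come from an extractor such as cv2.HOGDescriptor.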
Therefore, the invention achieves accurate localization of standing and sitting with only one camera, requires no extra parameter configuration during deployment, and effectively reduces mistaken camera cuts during interaction.
On the other hand, the invention also discloses a system for detecting the sitting action of students in the recording and broadcasting system, which comprises the following units,
the image acquisition module is used for acquiring images in the lesson scene of the student;
the image preprocessing module is used for preprocessing the acquired image;
the foreground extraction module is used for extracting the foreground of the preprocessed image to obtain a motion history image;
the possibility standing sitting target judging module is used for judging and storing a possibility target of the motion history map;
and the judging module is used for carrying out two-stage judgment on the stored information through the judging module to obtain the coordinates of standing and sitting.
According to the technical scheme, the method for detecting the standing and sitting actions of students in a recording and broadcasting system provided by the invention is based on the motion history image and the gradient direction, and adopts two-stage judgment to improve detection accuracy. The motion history image overcomes the incomplete-target problem of background-modeling methods, the global motion angle is then extracted from the complete target, and the final decision is made by the two-stage discrimination method.
It may be understood that the system provided by the embodiment of the present invention corresponds to the method provided by the embodiment of the present invention; for explanation, examples and beneficial effects of the related content, reference may be made to the corresponding parts of the method above.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A method for detecting standing and sitting actions of students in a recording and broadcasting system, characterized in that the method comprises the following steps:
s100, acquiring a frame of image;
s200, preprocessing the acquired image;
s300, performing foreground extraction on the preprocessed image to obtain a motion history image;
s400, judging and storing the possibility of sitting down of the target standing up of the motion history map;
s500, two-stage judgment is carried out on the stored information through a judgment module to obtain coordinates of standing and sitting, the two-stage judgment is carried out on the stored information through the judgment module to obtain the coordinates of standing and sitting, and the detection precision is improved by adopting a two-stage judgment mode;
in the first stage of judgment, the standing and sitting actions are judged by calculating the motion direction of the target over continuous multiple frames, utilizing the continuity characteristic of standing and sitting movements; in the second stage, the standing judgment adds face detection on the basis of the first-stage result to determine the final judgment result, and the sitting judgment uses HOG feature similarity comparison;
in the first stage of judgment, according to the characteristics of students' standing and sitting movements, an upward movement lasting 10 continuous frames is considered a standing action, and a downward movement lasting 10 continuous frames is considered a sitting action;
firstly, motion judgment of similar areas of adjacent frames is carried out: a feature extraction vector is initialized to store the motion angle and motion coordinate area of each possible target; if the motion angle of a foreground area extracted from the current frame satisfies angle > 80 and angle < 170, the coordinates of the area are compared with all coordinates of the previous frame in a non-maximum-suppression manner, and when the ratio of the overlapping area of the two regions to the smaller of their two areas is greater than 0.5, the two areas are judged to be similar areas; during the motion, up to 3 consecutive frames failing the angle > 80 and angle < 170 condition are tolerated, but if 3 consecutive frames fail the condition the frame count is cleared; when a feature vector whose frame count is greater than 10 is found, all angles stored in the vector are further judged, and only when the stored frames satisfy angle > 80 and angle < 95 is the motion judged to be a vertical standing motion;
the sitting action judgment follows the same logic as the standing action judgment; in the final judgment it is only necessary to check whether the frame count of the feature vector exceeds 10 frames, and if so, a sitting action is judged;
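The first-stage continuity check above can be sketched in pure Python. The class and function names, and the per-track bookkeeping, are illustrative assumptions; the angle range, the 3-frame tolerance, the 10-frame confirmation count, and the overlap/min-area > 0.5 similarity test come from the claim:

```python
# Illustrative sketch of the first-stage continuity check.

STAND_MIN, STAND_MAX = 80, 170   # upward motion angle range from the claim
MISS_LIMIT = 3                   # clear the count after 3 consecutive misses
CONFIRM_FRAMES = 10              # frame count needed to declare an action

def is_similar_region(a, b):
    """Two regions (x, y, w, h) are 'similar' when their overlap area
    divided by the smaller region's area exceeds 0.5, as in the claim."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ow = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    oh = max(0, min(ay + ah, by + bh) - max(ay, by))
    smaller = min(aw * ah, bw * bh)
    return smaller > 0 and ow * oh / smaller > 0.5

class StandTrack:
    """Per-target frame counter for the first-stage standing judgment."""
    def __init__(self):
        self.count = 0
        self.misses = 0
        self.angles = []

    def update(self, angle):
        """Feed one frame's global motion angle; return True once the
        target has moved upward for more than CONFIRM_FRAMES frames."""
        if STAND_MIN < angle < STAND_MAX:
            self.count += 1
            self.misses = 0
            self.angles.append(angle)
        else:
            self.misses += 1
            if self.misses >= MISS_LIMIT:   # 3 straight bad frames: reset
                self.count = 0
                self.angles.clear()
        return self.count > CONFIRM_FRAMES
```

The sitting judgment would reuse the same counter with the 200–300 downward angle range substituted.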
for the possible standing targets, the second-stage judgment uses a face detection method: within two seconds at the position determined by the first-stage judgment, 10 frames are taken out and merged into one image for detection, and if two or more faces are detected with their geometric positions in the upper half of the selected area, the target is judged to be a standing action;
for the second-stage sitting judgment, HOG feature similarity comparison is used; if the similarity value is smaller than 0.7, the target is judged to be a sitting action.
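A minimal sketch of the two second-stage decision rules follows. The face-box representation and helper names are assumptions; the "two or more faces in the upper half" rule and the 0.7 HOG-similarity threshold come from the claims:

```python
# Second-stage decision rules (sketch; helper names are assumptions).

def second_stage_standing(face_boxes, region):
    """Confirm standing when >= 2 detected faces have their vertical
    centres in the upper half of the candidate region (x, y, w, h)."""
    rx, ry, rw, rh = region
    upper = [f for f in face_boxes
             if ry <= f[1] + f[3] / 2.0 < ry + rh / 2.0]
    return len(upper) >= 2

def second_stage_sitting(hog_similarity):
    """Confirm sitting when the compared HOG feature similarity
    drops below 0.7."""
    return hog_similarity < 0.7
```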
2. The method for detecting standing and sitting actions of students in a recording and broadcasting system according to claim 1, characterized in that: the preprocessing of the acquired image in S200 comprises scaling the height and width of the acquired 4K image to 1/4 of the original, respectively, and then performing Gaussian filtering.
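The S200 preprocessing can be sketched as follows. This is a rough stand-in, not the patented implementation: the 1/4 scaling is done by plain subsampling rather than interpolated resizing, and a small 3-tap kernel stands in for the unspecified Gaussian filter:

```python
# Sketch of claim 2's preprocessing: 1/4 downscale, then Gaussian blur.

GAUSS_K = [0.25, 0.5, 0.25]  # 3-tap separable Gaussian kernel (assumed size)

def downscale_quarter(img):
    """Keep every 4th row and column of a 2-D pixel grid, so height
    and width each shrink to 1/4 of the original."""
    return [row[::4] for row in img[::4]]

def gaussian_blur(img):
    """Apply the 1-D kernel along rows, then columns (edges replicated)."""
    def blur_1d(row):
        n = len(row)
        return [sum(GAUSS_K[j] * row[min(max(i + j - 1, 0), n - 1)]
                    for j in range(3)) for i in range(n)]
    rows = [blur_1d(r) for r in img]
    cols = [blur_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```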
3. The method for detecting standing and sitting actions of students in a recording and broadcasting system according to claim 1, characterized in that: in S300, foreground extraction is performed on the preprocessed image to obtain a motion history image;
the method specifically comprises the following steps:
(c1) An inter-frame difference method is adopted, i.e., the absolute value of the pixel-value difference between each point of the current frame and the corresponding point of a previous frame; the two-frame difference method uses the previous frame, the three-frame difference method uses the frame two steps back, and so on; the formula is shown in equation (1),
F(x,y) = abs(I(x,y) - I_pre(x,y))    (1)
where F(x,y) represents the result after the frame difference, abs() represents taking the absolute value, I(x,y) represents the pixel value at coordinates (x,y) of the current frame, and I_pre(x,y) represents the pixel value at coordinates (x,y) of the previous frame;
specifically, a ring buffer buf for storing 4 frames of images is created, the images in buf[i] and buf[i+2] are taken out, and the result after the frame difference is calculated using formula (1);
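The ring-buffer pairing above can be sketched as follows (frames are flattened lists of pixel values; the class name is illustrative). With a 4-slot ring, the frame written two pushes earlier sits at index (i + 2) mod 4, which matches the buf[i]/buf[i+2] pairing:

```python
# Sketch of the 4-frame ring buffer feeding equation (1):
# F(x, y) = abs(I(x, y) - I_pre(x, y)), with I_pre two frames back.

BUF_SIZE = 4  # the claim stores 4 frames in the ring buffer

class RingFrameDiff:
    def __init__(self):
        self.buf = [None] * BUF_SIZE
        self.idx = 0  # next write position

    def push(self, frame):
        """Store one frame (flat list of pixels) and return the absolute
        difference against the frame two pushes back, or None until
        enough history has accumulated."""
        self.buf[self.idx % BUF_SIZE] = frame
        prev = self.buf[(self.idx + 2) % BUF_SIZE]  # two steps back mod 4
        self.idx += 1
        if prev is None:
            return None
        return [abs(a - b) for a, b in zip(frame, prev)]
```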
(c2) In the motion history map, pixels where motion occurred are set to the current timestamp, and pixels whose last motion occurred too long ago are cleared, as shown in equation (2),
mhi(x,y) = timestamp, if silh(x,y) ≠ 0; 0, if silh(x,y) = 0 and mhi(x,y) < timestamp − duration; mhi(x,y) unchanged, otherwise    (2)
specifically, a binarization threshold of 13 is set and F(x,y) is binarized to obtain silh(x,y); when silh(x,y) ≠ 0, the motion history map value mhi(x,y) takes the current system time timestamp; when silh(x,y) = 0 and mhi(x,y) is smaller than the current system time timestamp minus the duration, mhi(x,y) is set to 0; otherwise mhi(x,y) is kept unchanged;
the binarized image of the moving object is obtained through the above steps; 3×3 morphological erosion and dilation are applied to it, and then the circumscribed rectangle of the moving object is extracted as input for the possible standing/sitting target judging module.
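The per-pixel update rule of equation (2) can be sketched directly (flat lists stand in for images; the function name is an assumption):

```python
THRESH = 13  # binarization threshold from the claim

def update_mhi(frame_diff, mhi, timestamp, duration):
    """Motion history update per equation (2): moving pixels take the
    current timestamp, stale pixels (last motion older than
    timestamp - duration) are cleared, everything else is kept."""
    out = []
    for f, m in zip(frame_diff, mhi):
        silh = 1 if f > THRESH else 0        # binarize the frame difference
        if silh != 0:
            out.append(float(timestamp))     # motion here: stamp it
        elif m < timestamp - duration:
            out.append(0.0)                  # too old: clear
        else:
            out.append(m)                    # keep previous history
    return out
```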
4. The method for detecting standing and sitting actions of students in a recording and broadcasting system according to claim 3, characterized in that: in S300, the value of the duration variable is set between 0.3 and 0.5 according to the actual seating of students in the classroom scene.
5. The method for detecting standing and sitting actions of students in a recording and broadcasting system according to claim 3, characterized in that: judging and storing possible standing and sitting targets in the motion history map in S400 specifically comprises,
the upward angle of standing is set to angle >80 and angle <170, and the downward angle of sitting is set to angle >200 and angle <300;
the specific angle solving comprises gradient direction solving and global motion direction solving;
wherein,
(d1) Gradient direction solving
The gradient direction calculation is shown in equation (3),
angle(x,y) = fastAtan2(∂mhi/∂y, ∂mhi/∂x)    (3)
specifically, the Sobel operator is used to calculate ∂mhi/∂x and ∂mhi/∂y respectively; the fastAtan2 function is then used to calculate the arctangent, giving the gradient direction of the motion history map;
(d2) Global motion direction solution
The global motion direction is the average direction of the selected area, and an angle value of 0 to 360 is calculated according to the average direction;
the average direction is calculated from a weighted direction histogram, the weight calculation formula being ω = a·x + b, where a = 1/dt and b = 1 − t·a, so that ω = (x − t)/dt + 1, where x is the timestamp stored in mhi at each pixel, t represents the current timestamp in mhi, and dt represents the duration in mhi; in this way the most recent motion has a greater weight and motion that occurred further in the past has a lesser weight;
specifically, the gradient direction diagram is divided into 12 equal parts according to 0 to 360 degrees to obtain a gradient direction histogram, the coordinate of the maximum value of the gradient direction histogram is searched to be used as a basic direction, the weight of the motion history diagram is calculated according to a weight formula and an initialized weight coefficient, the relative offset of the basic direction is calculated, and the final motion direction angle can be obtained according to the offset and the basic direction.
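A pure-Python sketch of this weighted-histogram procedure follows. Two details are simplifying assumptions: the base direction is taken at the centre of the peak bin, and the recency weight uses the reconstructed ω = (x − t)/dt + 1 clamped to [0, 1]:

```python
# Sketch of the global motion direction from a 12-bin weighted histogram.

BINS = 12  # the gradient direction map is split into 12 equal 30-degree bins

def global_motion_direction(angles, stamps, timestamp, duration):
    """Find the peak histogram bin as the base direction, weight each
    pixel's angular offset by recency, and add the weighted mean offset
    to the base direction; returns an angle in [0, 360)."""
    width = 360 // BINS
    hist = [0] * BINS
    for a in angles:
        hist[int(a % 360 // width)] += 1
    base = hist.index(max(hist)) * width + width // 2  # peak bin centre

    w_sum = off_sum = 0.0
    for a, t_pix in zip(angles, stamps):
        w = (t_pix - timestamp) / duration + 1.0  # recent motion weighs more
        w = max(0.0, min(1.0, w))
        off = (a - base + 180) % 360 - 180        # signed offset in (-180, 180]
        w_sum += w
        off_sum += w * off
    if w_sum == 0:
        return float(base % 360)
    return (base + off_sum / w_sum) % 360
```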
6. A system for detecting standing and sitting actions of students in a recording and broadcasting system, for implementing the method of claim 1, characterized in that:
comprising the following units of the device,
the image acquisition and action judgment module is used for acquiring images in the students' lesson scene, judging and storing possible standing and sitting targets in the motion history map, and carrying out two-stage judgment on the stored information through the judgment module to obtain the coordinates of standing and sitting actions;
the image preprocessing module is used for preprocessing the acquired image;
and the foreground extraction module is used for carrying out foreground extraction on the preprocessed image to obtain a motion history image.
CN202011327975.7A 2020-11-24 2020-11-24 Method and system for detecting sitting-up actions of students in recording and broadcasting system Active CN112597800B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011327975.7A CN112597800B (en) 2020-11-24 2020-11-24 Method and system for detecting sitting-up actions of students in recording and broadcasting system

Publications (2)

Publication Number Publication Date
CN112597800A CN112597800A (en) 2021-04-02
CN112597800B true CN112597800B (en) 2024-01-26

Family

ID=75183643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011327975.7A Active CN112597800B (en) 2020-11-24 2020-11-24 Method and system for detecting sitting-up actions of students in recording and broadcasting system

Country Status (1)

Country Link
CN (1) CN112597800B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102096930A (en) * 2011-01-30 2011-06-15 吴柯维 Student standing and sitting detection method for intelligent recorded broadcasting system for teaching
CN106780565A (en) * 2016-11-15 2017-05-31 天津大学 A kind of many students based on light stream and k means clusters rise and sit detection method
CN107480607A (en) * 2017-07-28 2017-12-15 青岛大学 A kind of method that standing Face datection positions in intelligent recording and broadcasting system
CN110414479A (en) * 2019-08-08 2019-11-05 燕山大学 A kind of drinking behavior cognitive method, continuous and discontinuous movement segmentation recognition method
CN110728696A (en) * 2019-09-06 2020-01-24 天津大学 Student standing detection method of recording and broadcasting system based on background modeling and optical flow method
WO2020207328A1 (en) * 2019-04-11 2020-10-15 华为技术有限公司 Image recognition method and electronic device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Motion segmentation and pose recognition with motion history gradients; Gary R; Machine Vision and Applications; pp. 174-184 *
Human behavior recognition and its application in educational recording and broadcasting systems; Dang Dongli; China Master's Theses Full-text Database; pp. 10-26, 36-42 *
Student classroom behavior detection based on Faster R-CNN with multi-channel feature fusion and transfer learning; Bai Jie et al.; Journal of Guangxi Normal University (Natural Science Edition) (No. 05); full text *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant