WO2023101117A1 - Method for managing non-face-to-face education using deep learning-based person recognition

Method for managing non-face-to-face education using deep learning-based person recognition

Info

Publication number
WO2023101117A1
WO2023101117A1 PCT/KR2022/007400 KR2022007400W WO2023101117A1 WO 2023101117 A1 WO2023101117 A1 WO 2023101117A1 KR 2022007400 W KR2022007400 W KR 2022007400W WO 2023101117 A1 WO2023101117 A1 WO 2023101117A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
data
emotion
index
concentration
Prior art date
Application number
PCT/KR2022/007400
Other languages
English (en)
Korean (ko)
Inventor
이충건
김준회
Original Assignee
주식회사 마블러스
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 마블러스 filed Critical 주식회사 마블러스
Publication of WO2023101117A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition

Definitions

  • the present invention relates to a deep learning-based human recognition method and a non-face-to-face education management method, and more particularly, to a non-face-to-face education management method using deep learning-based human recognition.
  • the present invention is to provide a non-face-to-face education management method using deep learning-based human recognition in order to solve the above problems.
  • a method of driving a deep learning module for a mobile environment, performed by a computing device, comprising: receiving image data from an external electronic device; primarily detecting face data from the received image data; secondarily detecting human data when face data is not detected; and managing non-face-to-face image information based on the detected face data or human data.
  • an emotion index may be derived based on the detected face data and human data.
  • managing the non-face-to-face image information may be managing based on the derived emotion index.
  • the method may further include a step of feedback updating the deep learning model by transferring data collected in an online situation.
  • An education method performed by a computing device, comprising: receiving image data from an external electronic device; primarily detecting face data from the received image data; secondarily detecting human data when face data is not detected; and tracking the detected face data or human data in real time and managing non-face-to-face training image information.
  • the managing of the non-face-to-face training image information may include recognizing that the user is away from the seat and sending a warning alarm when the detection state of the detected face data or person data changes for more than a predetermined period of time.
  • the method may further include deriving an emotion index based on the detected face data and human data.
  • the emotion index includes an emotional-state index and a concentration index
  • the emotional-state index is obtained by inputting face data into an emotion recognition model based on facial expression recognition technology, which derives as a probability value either one emotion type among positive, negative, and neutral, or one emotion type among joy, surprise, sadness, anger, fear, displeasure, and calmness
  • the concentration index is obtained by deriving heart rate and heart rate variability through rPPG after face detection in the input image, and concentration types for each of the stages normal, concentration, and immersion can be derived as probability values.
  • Emotion indicators are extracted so that they can be applied in a non-face-to-face education environment, and the educational environment can be actively managed by checking whether the learner is away from the seat as well as by checking the emotion and concentration indicators.
  • FIG. 1 illustrates a communication system according to various embodiments of the present invention.
  • FIG. 2 is a block diagram of a configuration of a video electronic device based on changes in emotion and concentration state according to various embodiments of the present disclosure.
  • Figure 3 shows a block diagram of the configuration of a server according to various embodiments of the present invention.
  • FIG. 4 illustrates a deep learning-based human recognition method and a non-face-to-face education management method according to various embodiments of the present disclosure.
  • FIG. 5 illustrates the structure of a machine learning model according to various embodiments of the present invention.
  • FIG. 6 illustrates an example in which the artificial intelligence logic according to the present invention derives result values for emotion, posture, and concentration.
  • FIG. 7 is an exemplary diagram for explaining the architecture and learning of a deep learning module according to the present invention.
  • FIG. 8 illustrates an example of a process of detailed models constituting an emotion recognition model based on transfer learning in detail.
  • FIG. 9 is a flowchart illustrating an emotion index and concentration index acquisition according to the present invention.
  • FIG. 10 shows the result of performing a demonstration of the face and person recognition model according to the present invention.
  • Reference numerals: processor; 500: artificial neural network; 530: hidden layer; 531: first hidden layer; 532: first unit; 533: second hidden layer
  • FIG. 1 illustrates a communication system according to various embodiments of the present invention.
  • a communication system includes an electronic device 110, a wired/wireless communication network 120, and a server 130.
  • the server 130 obtains image data from the user's electronic device 110 through the wired/wireless communication network 120, derives an emotional state and a concentration state, and then transmits a chatbot message UI corresponding to the derived state back to the user's electronic device 110 through the wired/wireless communication network 120.
  • the electronic device 110 captures and transmits image data including face and posture information for the learning state of the user according to a request of the server 130 through the wired/wireless communication network 120 .
  • the electronic device 110 may be an electronic device, such as a personal computer, a cellular phone, a smartphone, or a tablet computer, that includes a memory capable of storing information, a transceiver capable of transmitting and receiving information, and at least one processor capable of performing computation.
  • the type of electronic device 110 is not limited.
  • the wired/wireless communication network 120 provides a communication path through which the electronic device 110 and the server 130 can transmit and receive signals and data to each other.
  • the wired/wireless communication network 120 is not limited to a communication method according to a specific communication protocol, and an appropriate communication method may be used according to an implementation example.
  • IP Internet Protocol
  • the wired/wireless communication network 120 may be implemented as a wired/wireless Internet network; when the electronic device 110 and the server 130 are implemented as mobile communication terminals, the wired/wireless communication network 120 may be implemented as a wireless network such as a cellular network or a wireless local area network (WLAN) network.
  • WLAN wireless local area network
  • the server 130 receives image data including face and posture information for the learning state of the user from the electronic device 110 through the wired/wireless communication network 120 .
  • the server 130 may be an electronic device including a memory capable of storing information, a transmitting/receiving unit capable of transmitting and receiving information, and at least one processor capable of performing information calculation.
  • FIG. 2 illustrates a block diagram of a configuration of an electronic device according to various embodiments of the present disclosure.
  • an electronic device 110 includes a memory 111, a transceiver 112, and a processor 113.
  • the memory 111 may include volatile memory, non-volatile memory, or a combination of volatile and non-volatile memories. Also, the memory 111 may provide stored data according to a request of the processor 113 .
  • the transceiver 112 is connected to the processor 113 and transmits and/or receives signals. All or part of the transceiver 112 may be referred to as a transmitter, a receiver, or a transceiver.
  • the transceiver 112 may support at least one of various wired and wireless communication standards, such as an Institute of Electrical and Electronics Engineers (IEEE) 802.xx system, an IEEE Wi-Fi system, a 3rd Generation Partnership Project (3GPP) system, a 3GPP Long Term Evolution (LTE) system, a 3GPP 5G New Radio (NR) system, a 3GPP2 system, and Bluetooth.
  • IEEE institute of electrical and electronics engineers
  • 3GPP 3rd generation partnership project
  • LTE 3GPP long term evolution
  • NR 3GPP 5G new radio
  • the processor 113 may be configured to implement the procedures and/or methods proposed in the present invention.
  • the processor 113 controls overall operations of the electronic device 110 to provide content based on machine learning analysis of biometric information.
  • the processor 113 transmits or receives information and the like through the transceiver 112.
  • the processor 113 also writes data to and reads data from the memory 111.
  • the processor 113 may include at least one processor.
  • Figure 3 shows a block diagram of the configuration of the server 130 according to various embodiments of the present invention.
  • a server 130 includes a memory 131 , a transceiver 132 and a processor 133 .
  • the server 130 may be a type of electronic device.
  • the server 130 receives image data including face and posture information about the learning state of the user from the electronic device 110 through the wired/wireless communication network 120 .
  • the server 130 converts the received image data into Mat data, obtains emotional-state and concentration-state values by inputting the Mat data into the emotion recognition SDK module, inputs the emotional-state and concentration-state values into control logic, and calls the chatbot message UI linked to the control logic.
  • the memory 131 is connected to the transceiver 132 and may store information received through communication.
  • the memory 131 is connected to the processor 133 and may store data such as a basic program for operation of the processor 133, an application program, setting information, and information generated by operation of the processor 133.
  • the memory 131 may include volatile memory, non-volatile memory, or a combination of volatile and non-volatile memories. Also, the memory 131 may provide stored data according to a request of the processor 133 .
  • the transceiver 132 is connected to the processor 133 and transmits and/or receives signals. All or part of the transceiver 132 may be referred to as a transmitter, a receiver, or a transceiver.
  • the transceiver 132 may support at least one of various wired and wireless communication standards, such as an Institute of Electrical and Electronics Engineers (IEEE) 802.xx system, an IEEE Wi-Fi system, a 3rd Generation Partnership Project (3GPP) system, a 3GPP Long Term Evolution (LTE) system, a 3GPP 5G New Radio (NR) system, a 3GPP2 system, and Bluetooth.
  • the processor 133 may be configured to implement the procedures and/or methods proposed in the present invention.
  • the processor 133 controls the overall operation of the server 130, for example converting image data into emotional-state and concentration-state values by inputting it into the emotion recognition SDK module and running control logic that uses the emotional-state and concentration-state values as input values.
  • the processor 133 transmits or receives information or the like through the transceiver 132 .
  • the processor 133 writes data to and reads data from the memory 131 .
  • the processor 133 may include at least one processor.
  • the method includes receiving image data from an external electronic device (S100), primarily detecting face data from the received image data (S200), determining whether face data is detected in the received image data (S300), secondarily detecting human data if face data is not detected (S400), deriving an emotion index through a deep learning module (S500), managing non-face-to-face image information based on the emotion index (S600), and feedback-updating the deep learning model by transferring data collected in the online situation (S700).
  • Step S100 is a step of receiving image data including a face captured or transmitted in real time from the user's electronic device 110 .
  • the training method or the deep learning module driving method according to the present invention may receive image data including a user's facial expression or human shape.
  • Step S200 is a step of primarily detecting face data from received image data.
  • Step S300 is a step of determining whether face data is detected in received image data.
  • Step S400 is a step of secondarily detecting human data when face data is not detected.
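  • As an illustration of this face-first, person-fallback flow (steps S200 and S400), the sketch below uses stock OpenCV detectors; the Haar cascade and HOG people detector are stand-ins for the deep learning models described in this document, and the parameters are assumptions.

```python
import cv2

# Stand-in detectors: a Haar cascade for faces, the HOG detector for people.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect(frame):
    """Return ('face', boxes), ('person', boxes), or ('none', [])."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Step S200: primary face detection.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) > 0:
        return "face", faces

    # Step S400: secondary person detection when no face is found.
    people, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    if len(people) > 0:
        return "person", people

    return "none", []

# Usage: kind, boxes = detect(cv2.imread("frame.jpg"))
```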
  • Step S500 is a step of deriving an emotion index through a deep learning module.
  • the received image data can be processed through the OpenCV module or the like.
  • image data can be converted into a Mat (matrix) format handled by the OpenCV image processing library.
  • the derived face data may be converted into Mat data, and the Mat data may be input into an emotion recognition SDK module to be converted into emotion and concentration values.
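  • A minimal sketch of this conversion step is shown below, assuming the image arrives as an encoded (e.g., JPEG) byte stream; in the OpenCV Python bindings a Mat is represented as a NumPy array, and emotion_sdk is a placeholder name for the emotion recognition SDK module.

```python
import cv2
import numpy as np

def bytes_to_mat(image_bytes: bytes) -> np.ndarray:
    """Decode a received image byte stream into an OpenCV Mat (NumPy array)."""
    buf = np.frombuffer(image_bytes, dtype=np.uint8)
    mat = cv2.imdecode(buf, cv2.IMREAD_COLOR)   # BGR Mat, shape (H, W, 3)
    if mat is None:
        raise ValueError("could not decode the received image data")
    return mat

# e.g. mat = bytes_to_mat(payload); emotion, concentration = emotion_sdk(mat)
# where emotion_sdk stands in for the emotion recognition SDK module.
```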
  • the emotional state may include a plurality of emotional state values with a sum of 1.0
  • the concentration state may include a plurality of concentration state values with a sum of 1.0.
  • the emotional state value and the concentration state value represent probability values, and the sum of the corresponding states must always satisfy 1.0.
  • the 7 emotion states and 3 concentration states are calculated as probability values; the 7 emotion probabilities always sum to 1.0, and the 3 concentration probabilities likewise always sum to 1.0. That is, a closed variable space can be formed without additional variables.
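  • The following small example illustrates such a closed probability space, assuming hypothetical raw scores from separate emotion and concentration heads: a softmax over each group guarantees that the 7 emotion probabilities and the 3 concentration probabilities each sum to 1.0.

```python
import numpy as np

EMOTIONS = ["joy", "surprise", "sadness", "anger", "fear", "displeasure", "calmness"]
CONCENTRATION_STAGES = ["normal", "concentration", "immersion"]

def softmax(scores: np.ndarray) -> np.ndarray:
    """Normalize raw scores into probabilities that sum to 1.0."""
    e = np.exp(scores - scores.max())   # subtract max for numerical stability
    return e / e.sum()

# Made-up raw outputs of hypothetical emotion and concentration heads.
emotion_probs = dict(zip(EMOTIONS, softmax(np.array([2.1, 0.3, -0.5, -1.2, -0.8, -0.4, 1.0]))))
concentration_probs = dict(zip(CONCENTRATION_STAGES, softmax(np.array([0.2, 1.4, -0.6]))))

assert abs(sum(emotion_probs.values()) - 1.0) < 1e-9        # 7 emotions sum to 1.0
assert abs(sum(concentration_probs.values()) - 1.0) < 1e-9  # 3 stages sum to 1.0
```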
  • Step S600 may be a step of managing non-face-to-face image information based on the emotion index. Specifically, it will be described later with reference to FIG. 6 .
  • Step S700 may include a step of feedback updating the deep learning model by transferring data collected in an online situation.
  • FIG. 5 illustrates the structure of a machine learning model according to various embodiments of the present invention.
  • MLP multi-layer perceptron
  • Deep learning, one of the emerging technologies in the field of machine learning, uses a neural network composed of a plurality of hidden layers and a plurality of hidden units included in them.
  • these basic features are transformed into high-level features that can better explain the problem to be predicted while passing through a plurality of hidden layers.
  • prior knowledge or intuition of an expert is not required, subjective factors in feature extraction can be removed, and a model with higher generalization ability can be developed.
  • since feature extraction and model construction are composed as one set, there is an advantage in that the final model can be formed through a simpler process compared to existing machine learning approaches.
  • a multi-layer perceptron (MLP) is a type of artificial neural network (ANN) with multiple layers of nodes based on deep learning. Each node is a neuron that uses a non-linear activation function, analogous to the connection patterns of biological neurons. This non-linearity makes it possible to distinguish data that are not linearly separable.
  • the artificial neural network 500 of the MLP model consists of one or more input layers 510, a plurality of hidden layers 530, and one or more output layers 550.
  • Input data, such as the RGB value of each pixel in at least one image per unit time, is input to the nodes of the input layer 510.
  • the user's biometric information (e.g., electrocardiogram information, concentration level, and happiness emotion intensity information) and the adjusted content information (e.g., content genre, content topic, and content channel information) 511 correspond to the basic characteristics (low-level features) of the deep learning model.
  • the hidden layer 530 performs calculations based on input factors.
  • the hidden layer 530 is a layer in which units defined by a plurality of nodes formed by integrating the user's biometric information and the information 511 of the adjusted content are stored. As shown in FIG. 5 , the hidden layer 530 may include a plurality of hidden layers.
  • when the hidden layer 530 is composed of a first hidden layer 531 and a second hidden layer 533, the first hidden layer 531 is a layer that stores first units 532, each defined by a plurality of nodes formed by consolidating the user's biometric information and the adjusted content information 511; the first units 532 correspond to higher-level characteristics of the user's biometric information and the adjusted content information 511.
  • the second hidden layer 533 is a layer in which second units 534, defined as a plurality of nodes formed by consolidating the first units of the first hidden layer 531, are stored; the second units 534 correspond to higher-level characteristics of the first units 532.
  • the output layer 550 may include a plurality of prediction result units 551 .
  • the plurality of prediction result units 551 may include two units of a true unit and a false unit.
  • the true unit is a prediction result unit indicating that, after the content is adjusted to the adjusted content, the emotional state value and concentration state value of the user's face data are likely to be higher than the threshold value
  • the false unit is a prediction result unit indicating that, after the content is adjusted to the adjusted content, the emotional state value and concentration state value of the user's face data are unlikely to be higher than the threshold value.
  • Weights are assigned to connections between the prediction result units 551 and the second units 534 included in the second hidden layer 533, which is the last layer among the hidden layers 530. Based on these weights, it is predicted whether the emotional state value and the concentration state value of the user's face data are greater than or equal to a threshold value after adjusting the content to the adjusted content.
  • the artificial neural network 500 of the MLP model learns by adjusting learning parameters.
  • the learning parameters include at least one of a weight and a bias.
  • the learning parameters are iteratively adjusted through an optimization algorithm called gradient descent. Each time a prediction result is computed from a given data sample (forward propagation), the performance of the network is evaluated through a loss function that measures the prediction error.
  • Each learning parameter of the artificial neural network 500 is adjusted step by step in the direction that minimizes the value of the loss function, and this process is called back-propagation.
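  • The sketch below shows a small MLP trained by gradient descent with back-propagation in the way described above; the layer sizes, the eight input features, and the two-unit (true/false) output are assumptions, not the actual model of this document.

```python
import torch
import torch.nn as nn

# Assumed architecture: two hidden layers and a two-unit (true/false) output.
model = nn.Sequential(
    nn.Linear(8, 16),   # input layer: 8 assumed biometric/content features
    nn.ReLU(),
    nn.Linear(16, 8),   # first hidden layer -> second hidden layer
    nn.ReLU(),
    nn.Linear(8, 2),    # output layer: true / false prediction units
)

loss_fn = nn.CrossEntropyLoss()                            # measures prediction error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # gradient descent

def train_step(features: torch.Tensor, labels: torch.Tensor) -> float:
    logits = model(features)          # forward propagation
    loss = loss_fn(logits, labels)    # evaluate the network via the loss function
    optimizer.zero_grad()
    loss.backward()                   # back-propagation of the error
    optimizer.step()                  # adjust weights/biases to reduce the loss
    return loss.item()

# Example with random stand-in data.
x = torch.randn(32, 8)
y = torch.randint(0, 2, (32,))
print(train_step(x, y))
```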
  • the first depth can be divided into an emotion determination unit, a face detection unit, and a concentration state determination unit.
  • the learner's emotional state value and concentration state value derived from the above-described emotion recognition SDK module are allocated to the emotion determination unit and the concentration state determination unit, respectively, and basic face detection data may correspond to the face detection unit.
  • one of seven types of emotional states of the learner may be derived based on the result input to the emotion determination unit.
  • This emotion determination unit can derive a corresponding result through machine learning modules such as deep learning and artificial intelligence, and is not limited to a specific technology.
  • an appropriate control logic may be calculated based on emotional state values corresponding to the seven emotions determined in the second depth. For example, when a positive emotional state value corresponds to level 0 (less than 0.5) for a long period of time, control logic may be activated to drive a care chatbot UI for user management.
  • a result on the appropriateness of the posture and a determination of whether the user has left the seat may be derived through the posture determination unit and the seat departure determination unit.
  • an output signal may be sent to the user's electronic device to control the chatbot UI so as to present a dialog and a face-position reset guide screen.
  • the chatbot UI may be controlled, and an output signal may be transmitted to the user's electronic device to provide guidance through a voice interface during learning.
  • in the second depth corresponding to the concentration state determination unit, the result can be divided into three categories: a normal state, a concentration state, and an immersion state.
  • if the normal state continues for a long time, it is determined that care is required, and an output signal may be sent to the user's electronic device to control the chatbot UI and provide guidance through a voice interface during learning.
  • the step of managing the non-face-to-face image information is a step of managing the non-face-to-face image information based on the derived emotion index, and may be a step of transmitting, when the emotion index is derived, a predetermined signal to the electronic device photographing the non-face-to-face image.
  • the control logic determines, based on the user's emotion/concentration values, whether learning efficiency has decreased, whether the seat is empty, or whether a change is required.
  • according to the control logic, a chatbot message UI may be called and a signal may be transmitted so that it is displayed and controlled on the user's electronic device.
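  • The following sketch illustrates control logic of this kind; the thresholds, durations, and action names are assumptions rather than values taken from this document.

```python
import time

POSITIVE_THRESHOLD = 0.5      # assumed "level 0" cutoff for positive emotion
CARE_AFTER_SECONDS = 120      # assumed duration before care is triggered

class ControlLogic:
    """Rule-of-thumb mapping from emotion/concentration values to chatbot actions."""

    def __init__(self):
        self.low_positive_since = None
        self.normal_since = None

    def update(self, positive_prob: float, concentration_probs: dict, now: float = None) -> list:
        now = now or time.time()
        actions = []

        # Care chatbot when positive emotion stays below the threshold for a while.
        if positive_prob < POSITIVE_THRESHOLD:
            self.low_positive_since = self.low_positive_since or now
            if now - self.low_positive_since >= CARE_AFTER_SECONDS:
                actions.append("show_care_chatbot_ui")
        else:
            self.low_positive_since = None

        # Voice guidance when the "normal" (non-concentrating) state persists.
        if max(concentration_probs, key=concentration_probs.get) == "normal":
            self.normal_since = self.normal_since or now
            if now - self.normal_since >= CARE_AFTER_SECONDS:
                actions.append("voice_guidance")
        else:
            self.normal_since = None

        return actions
```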
  • non-face-to-face image information obtained by photographing a learner's face in real time may be managed in real time.
  • managing the non-face-to-face image information may include tracking detected face data or human data in real time and managing non-face-to-face training image information.
  • the managing of the non-face-to-face education image information may include recognizing the detected face data or person data as being away and transmitting a warning alarm when the detection state changes over a predetermined time period.
  • the control logic may provide an away-from-seat alarm through the chatbot message UI.
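  • A compact sketch of the away-from-seat check described above: if neither face data nor person data has been detected for longer than a configurable period, a warning alarm is raised. The 10-second threshold is an assumption.

```python
import time

AWAY_THRESHOLD_SECONDS = 10.0   # assumed "predetermined period of time"

class PresenceMonitor:
    def __init__(self, threshold: float = AWAY_THRESHOLD_SECONDS):
        self.threshold = threshold
        self.last_seen = time.time()

    def update(self, face_detected: bool, person_detected: bool) -> bool:
        """Return True when a warning alarm should be sent."""
        if face_detected or person_detected:
            self.last_seen = time.time()
            return False
        return (time.time() - self.last_seen) > self.threshold
```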
  • the facial expression recognition deep learning module based on AI deep learning technology is a model trained using an image dataset of Korean elementary school students (5 to 13 years old) as input data.
  • FIG. 7 describes a transfer learning module and is an exemplary diagram for explaining the architecture and training of the deep learning module according to the present invention.
  • pre-processing to a predetermined resolution and size may be performed on an image dataset of 5,000 to 10,000 images of Korean elementary school students.
  • the preprocessed image dataset is fed into a first convolutional network, and at this time the size of the first convolutional network may be determined as n by n, where n may be 24 by way of example.
  • after max pooling, the network may be n/2 by n/2, illustratively a 12 by 12 network, but is not limited thereto.
  • the max-pooled data may be fed into a second convolutional network.
  • the size of the second convolutional network may be determined as m by m, where m may be 8 by way of example, but is not limited thereto.
  • the second convolution can be max-pooled again.
  • a Rectified Linear Unit (ReLU), which rectifies the final network, can be inserted at the front of the neural network for deep learning as an activation function.
  • ReLu Rectified Linear Unit
  • the deep learning model can be trained to derive whether an input image corresponds to any one of a plurality of emotional states or a plurality of concentration states.
  • the present invention can derive quantitative indicators for the emotional state, learning state, and concentration state of Korean students by training a deep learning model on these images of Korean elementary school students through the transfer learning architecture.
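  • The sketch below shows a convolution, max-pooling, convolution, max-pooling, ReLU pipeline of the kind described above; the input resolution, channel counts, kernel sizes, and class count are assumptions chosen so that the intermediate feature maps match the 24 by 24, 12 by 12, and 8 by 8 sizes mentioned in the text.

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 24x24 feature maps (first convolution)
            nn.MaxPool2d(2),                             # 24x24 -> 12x12
            nn.Conv2d(16, 32, kernel_size=5),            # 12x12 -> 8x8 (second convolution)
            nn.MaxPool2d(2),                             # 8x8 -> 4x4
            nn.ReLU(),                                   # rectification before the classifier
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.flatten(self.features(x), 1)
        return torch.softmax(self.classifier(x), dim=1)  # per-emotion probabilities

# One preprocessed 24x24 grayscale face crop (batch of 1).
probs = EmotionCNN()(torch.randn(1, 1, 24, 24))
print(probs.sum())  # ~1.0, i.e. a closed probability space
```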
  • the emotion recognition derivation model according to the present invention may be composed of the following three detailed models.
  • the first emotion recognition model may perform facial expression analysis by applying the AI deep learning module after detecting a face from an input image, and calculate probability values for each of three emotion types (positive, negative, and neutral).
  • a quantitative emotional index may be derived based on the probability value.
  • the second emotion recognition model performs facial expression analysis by applying the above-described AI deep learning module after detecting a face from an input image, and probability values for each of the seven emotion types (joy, surprise, sadness, anger, fear, displeasure, calmness) can be calculated.
  • a quantitative emotional index may be derived based on the probability value.
  • the first emotion recognition model and the second emotion recognition model may be applied alternatively or complementary to each other, so that a more precise emotion index may be calculated.
  • the concentration recognition model may derive heart rate and heart rate variability after face detection from an input image.
  • heart rate and heart rate variability can be measured and analyzed by applying rPPG (remote photoplethysmography) technology.
  • the concentration recognition model calculates a probability value for each of the three stages of concentration (normal → concentration → immersion), and derives the concentration state index.
  • a concentration index can be derived based on the concentration state indicator.
  • the concentration state index can be used as a basic value for deriving a learning index.
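  • The sketch below shows a typical rPPG starting point for estimating heart rate and heart rate variability from the mean green-channel signal of a detected face region; the final mapping from these values to the three concentration stages is a placeholder heuristic, since the actual mapping would come from the trained concentration recognition model.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def heart_rate_and_hrv(green_means: np.ndarray, fps: float = 30.0):
    """Estimate HR (bpm) and HRV (SDNN, ms) from a mean green-channel time series."""
    # Band-pass around plausible heart-rate frequencies (0.7-4 Hz ~ 42-240 bpm).
    b, a = butter(3, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")
    pulse = filtfilt(b, a, green_means - green_means.mean())

    # Peaks of the pulse wave give beat times.
    peaks, _ = find_peaks(pulse, distance=fps * 0.4)
    ibi = np.diff(peaks / fps)              # inter-beat intervals in seconds
    heart_rate = 60.0 / ibi.mean()          # beats per minute
    hrv_sdnn = ibi.std() * 1000.0           # HRV as SDNN in milliseconds
    return heart_rate, hrv_sdnn

# Placeholder mapping from HR/HRV to the three concentration stages; the real
# mapping would come from the trained concentration recognition model.
def concentration_probabilities(heart_rate: float, hrv_sdnn: float) -> dict:
    scores = np.array([1.0, heart_rate / 100.0, hrv_sdnn / 100.0])
    probs = np.exp(scores) / np.exp(scores).sum()
    return dict(zip(["normal", "concentration", "immersion"], probs))
```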
  • the emotion recognition derivation model may derive a learning index based on quantitative index values derived from the first emotion recognition model, the second emotion recognition model, and the concentration recognition model.
  • the learning index is an indicator for quantitatively grasping the user's learning status, and more precise calculation may be possible by additionally utilizing student data and data generated in the learning situation, such as learning time, number of questions, and absence status.
  • FIG. 9 is a flowchart illustrating an emotion index and concentration index acquisition according to the present invention.
  • the education method may include receiving image data from an external electronic device (S110), detecting face data from the received image data (S210), obtaining an emotion index by inputting the face data into an emotion recognition model based on facial expression recognition technology (S310), obtaining a concentration index by inputting the face data into an rPPG model based on changes in optical blood flow (S410), and calculating a learning index based on the emotion index and the concentration index (S510).
  • Step S110 is a step of receiving image data including a face captured or transmitted in real time from the user's electronic device 110 .
  • Step S210 is a step of extracting a user's facial expression from the received image data and extracting a data set for deriving emotion and concentration based on the facial expression.
  • Step S310 may be a step of calculating, based on the first emotion recognition module and the second emotion recognition module, probability values for each of three emotion types (positive, negative, neutral) or for each of seven emotion types (joy, surprise, sadness, anger, fear, displeasure, calmness).
  • in step S410, heart rate and heart rate variability may be derived after face detection from the input image.
  • This step measures and analyzes heart rate and heart rate variability by applying rPPG (remote photoplethysmography) technology to the data derived from the detected face.
  • rPPG remote photoplethysmography
  • Step S510 is a step of calculating a learning index based on the concentration index and the emotion index.
  • Emotional index and learning index may be built into a database.
  • the database may store the emotional index and learning index in the form of a time series to enable comprehensive analysis.
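  • A simple sketch of such time-series storage is shown below; the SQLite schema and column names are assumptions, as this document does not specify a storage format.

```python
import sqlite3
import time

conn = sqlite3.connect("learning_indices.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS indices (
           ts REAL, user_id TEXT,
           emotion_index REAL, concentration_index REAL, learning_index REAL)"""
)

def record(user_id: str, emotion: float, concentration: float, learning: float):
    """Append one timestamped row of indices for a learner."""
    conn.execute(
        "INSERT INTO indices VALUES (?, ?, ?, ?, ?)",
        (time.time(), user_id, emotion, concentration, learning),
    )
    conn.commit()

# A time-ordered history then supports comprehensive (e.g., trend) analysis.
rows = conn.execute(
    "SELECT ts, learning_index FROM indices WHERE user_id = ? ORDER BY ts",
    ("student-1",),
).fetchall()
```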
  • in a hardware implementation, the above-described method may be implemented using at least one of application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), and field-programmable gate arrays (FPGAs).
  • the above-described method can be written as a program that can be executed on a computer, and can be implemented in a general-purpose digital computer that operates the program using a computer-readable medium.
  • the structure of data used in the above-described method may be recorded on a computer-readable storage medium through various means.
  • Program storage devices, which may be used to describe storage devices containing executable computer code for performing the various methods of the present invention, should not be construed as including transitory objects such as carrier waves or signals.
  • the computer-readable storage media includes storage media such as magnetic storage media (eg, ROM, floppy disk, hard disk, etc.) and optical reading media (eg, CD-ROM, DVD, etc.).
  • the education method according to the present invention supports user learning in a mobile environment, because real-time face recognition and person recognition are possible even with image data captured by a general RGB camera using AI deep learning-based object recognition technology, and therefore has the potential to be widely used in the education industry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to an education method comprising the following steps: receiving image data from an external electronic device; primarily detecting face data from the received image data; secondarily detecting person data when no face data is detected; and managing non-face-to-face image information on the basis of the detected face data or person data.
PCT/KR2022/007400 2021-11-30 2022-05-25 Method for managing non-face-to-face education using deep learning-based person recognition WO2023101117A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020210168671A KR20230081013A (ko) 2021-11-30 2021-11-30 Deep learning-based person recognition method and non-face-to-face education management method
KR10-2021-0168671 2021-11-30

Publications (1)

Publication Number Publication Date
WO2023101117A1 (fr)

Family

ID=86612430

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/007400 WO2023101117A1 (fr) 2021-11-30 2022-05-25 Method for managing non-face-to-face education using deep learning-based person recognition

Country Status (2)

Country Link
KR (1) KR20230081013A (fr)
WO (1) WO2023101117A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040137414A1 (en) * 1996-08-13 2004-07-15 Ho Chi Fai Learning method and system that consider a student's concentration level
KR20210058757A (ko) * 2020-06-16 2021-05-24 Baidu Online Network Technology (Beijing) Co., Ltd. Method, apparatus, device, and storage medium for determining key learning content
KR102266476B1 (ko) * 2021-01-12 2021-06-17 (주)이루미에듀테크 Method, apparatus, and system for improving online learning ability using eye-tracking technology
JP6906820B1 (ja) * 2020-07-07 2021-07-21 Assest株式会社 Concentration level determination program
CN113646838A (zh) * 2019-04-05 2021-11-12 Huawei Technologies Co., Ltd. Method and system for providing emotion modification during a video chat

Also Published As

Publication number Publication date
KR20230081013A (ko) 2023-06-07

Similar Documents

Publication Publication Date Title
WO2018135696A1 (fr) Plate-forme d'intelligence artificielle utilisant une technologie d'apprentissage auto-adaptative basée sur apprentissage profond
WO2023151289A1 (fr) Procédé d'identification d'émotion, procédé d'apprentissage, appareil, dispositif, support de stockage et produit
CN112699774B (zh) 视频中人物的情绪识别方法及装置、计算机设备及介质
Del Duchetto et al. Are you still with me? Continuous engagement assessment from a robot's point of view
CN109934097A (zh) 一种基于人工智能的表情和心理健康管理系统
WO2019190076A1 (fr) Procédé de suivi des yeux et terminal permettant la mise en œuvre dudit procédé
Yahaya et al. Gesture recognition intermediary robot for abnormality detection in human activities
Liu et al. Predicting engagement breakdown in HRI using thin-slices of facial expressions
CN114359976A (zh) 一种基于人物识别的智能安防方法与装置
WO2023101117A1 (fr) Procédé de gestion d'enseignement en distanciel à l'aide d'une reconnaissance de personne basée sur l'apprentissage profond
CN111666829A (zh) 多场景多主体身份行为情绪识别分析方法及智能监管系统
Villegas-Ch et al. Identification of emotions from facial gestures in a teaching environment with the use of machine learning techniques
CN113282840A (zh) 一种训练采集综合管理平台
Rosatelli et al. Detecting f-formations & roles in crowded social scenes with wearables: Combining proxemics & dynamics using lstms
Zhang et al. Falling detection of lonely elderly people based on NAO humanoid robot
WO2020101121A1 (fr) Procédé d'analyse d'image basée sur l'apprentissage profond, système et terminal portable
CN111782039A (zh) 一种适用于手势识别的方法及系统
WO2022080666A1 (fr) Dispositif de suivi des connaissances d'un utilisateur basé sur l'apprentissage par intelligence artificielle, système, et procédé de commande de celui-ci
CN110705413A (zh) 基于视线方向和lstm神经网络的情感预测方法及系统
CN116313127A (zh) 一种基于院前急救大数据的决策支持系统
WO2022181907A1 (fr) Procédé, appareil et système pour la fourniture d'informations nutritionnelles sur la base d'une analyse d'image de selles
CN111563465B (zh) 一种动物行为学自动分析系统
WO2021049700A1 (fr) Application et serveur pour la gestion de personnels de service
Belgiovine et al. HRI Framework for Continual Learning in Face Recognition
CN106778537B (zh) 一种基于图像处理的动物社交网络结构采集及分析系统及其方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22899626

Country of ref document: EP

Kind code of ref document: A1