WO2024148794A1 - Head action recognition method and device - Google Patents
Head action recognition method and device
- Publication number
- WO2024148794A1 (PCT/CN2023/110256)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- posture data
- head posture
- user
- autocorrelation function
- head
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/012—Head tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Definitions
- the present invention relates to the field of computer technology, and in particular to a head action recognition method and device.
- the embodiments of the present invention provide a method and device for head movement recognition, which can effectively reduce the error of the estimated frequency of head posture data through the autocorrelation function, thereby improving the accuracy of head movement recognition.
- an embodiment of the present invention provides a head action recognition method, comprising:
- determining whether the user performs a preset action based on the estimated frequency includes:
- determining, according to the estimated frequency, a center frequency band region that meets the periodic signal condition in the frequency domain distribution of the head posture data;
- determining whether the user performs a preset action according to the amplitude and the power proportion of the center frequency band region.
- determining a center frequency band region that meets periodic signal conditions in the frequency domain distribution of the head posture data according to the estimated frequency includes:
- a region at least including the center frequency band is determined as the center frequency band region.
- determining whether the user performs a preset action based on the estimated frequency includes:
- determining, according to the estimated frequency, a center frequency band region that meets the periodic signal condition in the frequency domain distribution of the head posture data;
- determining whether the user performs a preset action according to at least one of the amplitude and the power proportion of the center frequency band region.
- determining whether the user performs a preset action according to at least one of the amplitude and the power proportion of the center frequency band region includes:
- determining the estimated frequency of the head posture data according to the autocorrelation function includes:
- the inverse of the estimated period is determined as the estimated frequency.
- determining the estimated period of the head posture data according to the trough and the peak of the autocorrelation function includes:
- the preset action includes: periodic nodding or periodic shaking of the head.
- before determining the autocorrelation function of the head posture data and determining the estimated frequency of the head posture data according to the autocorrelation function, the method further includes:
- before determining whether the user performs a preset action based on the estimated frequency, the method further includes:
- if the first value is greater than a sixth threshold, it is determined that the user has not performed the preset action.
- the step of obtaining the head posture data of the first user within a first time period includes:
- the head posture data is collected from the acquired image data of the user based on the sampling frequency.
- before determining the autocorrelation function of the head posture data, the method further includes:
- the number of head posture data after interpolation processing is N.
- before determining the autocorrelation function of the head posture data, the method further includes:
- Gaussian filtering is performed on the head posture data.
- the method further includes:
- M head posture data from the head of the recognition queue are popped out of the recognition queue, and the newly collected M head posture data are sent to the tail of the recognition queue;
- M is an integer less than or equal to N.
- the head posture data includes: lateral rotation angle data or longitudinal rotation angle data of the user's head.
- an embodiment of the present invention provides a head action recognition device, comprising:
- An acquisition module used to acquire the head posture data of the user within a first time length
- a processing module used to determine the autocorrelation function of the head posture data, and determine the estimated frequency of the head posture data according to the time domain distribution of the autocorrelation function;
- the processing module is further used to determine whether the user performs a preset action based on the estimated frequency.
- an embodiment of the present invention provides a vehicle, the vehicle comprising:
- a vehicle-mounted device which can implement the method provided in the first aspect when running.
- an embodiment of the present invention provides an electronic chip, including:
- the processor is coupled to at least one memory, wherein:
- the memory stores program instructions that can be executed by the processor, and the processor calls the program instructions to execute the method provided in the first aspect.
- an embodiment of the present invention provides a computer program product, the computer program product comprising a computer program, the computer program being executed by a processor to implement the method provided in the first aspect.
- an embodiment of the present invention provides a computer-readable storage medium, wherein the computer-readable storage medium includes a stored program, wherein when the program is running, the device where the computer-readable storage medium is located is controlled to execute the method provided in the first aspect.
- the head posture data of the user within a first time length is first obtained; then the autocorrelation function of the head posture data is determined, and the estimated frequency of the head posture data is determined according to the autocorrelation function; then, whether the user performs a preset action is determined based on the estimated frequency.
- This method can effectively reduce the error of the estimated frequency of the head posture data through the autocorrelation function, thereby improving the accuracy of head action recognition.
- FIG. 1 is a flow chart of a head action recognition method provided by an embodiment of the present invention.
- FIG. 2 is a flow chart of another head action recognition method provided by an embodiment of the present invention.
- FIG. 3 is a schematic diagram of the structure of a head action recognition device provided by an embodiment of the present invention.
- FIG. 4 is a schematic diagram of the structure of an electronic device provided by an embodiment of the present invention.
- An embodiment of the present invention provides a head movement recognition method that can improve the accuracy of the recognition result.
- FIG. 1 is a flow chart of a head action recognition method provided by an embodiment of the present invention. The method can be applied to a vehicle-mounted device and, as shown in FIG. 1, may include:
- Step 101: obtain the user's head posture data within a first time length.
- the vehicle-mounted device can obtain the head posture data of the user through the vehicle-mounted camera device.
- the head posture data mainly includes the lateral rotation angle data or the longitudinal rotation angle data of the user's head.
- the head posture data can be displayed in a two-dimensional coordinate system, the horizontal coordinate is time, the vertical coordinate is the lateral rotation angle or the longitudinal rotation angle, and the value of the rotation angle can be between negative 180 degrees and positive 180 degrees. For example, when the driver looks straight ahead, the lateral rotation angle and the longitudinal rotation angle are both 0. When the driver turns his head to the left, the lateral rotation angle can become negative 90 degrees, and when the driver looks up, the longitudinal rotation angle can become positive 90 degrees.
- For example, if the driver nods repeatedly within a short time, the longitudinal rotation angle can be regarded as a continuous periodic signal; if the driver shakes his head 3 times in 1 second, the lateral rotation angle can be regarded as a continuous periodic signal.
- the first time length is a pre-set reasonable value. The time required for the user to complete periodic nodding or periodic shaking of the head is usually less than the first time length. For example, the first time length can be 2.5 seconds or 3 seconds or other reasonable values.
- Step 102: determine the autocorrelation function of the head posture data, and determine the estimated frequency of the head posture data according to the autocorrelation function.
- the vehicle-mounted device may first determine the autocorrelation function of the head posture data, and then determine the estimated period of the head posture data based on the first adjacent trough and peak of the autocorrelation function in the positive direction of the origin, and the inverse of the estimated period is the estimated frequency.
- This step may specifically include: first determining the coordinates (lo, lowest) of the first trough of the autocorrelation function in the positive direction of the coordinate origin, where lo is the horizontal coordinate of the first trough and lowest is its vertical coordinate; then determining the coordinates (hi, highest) of the peak that is adjacent to the first trough and located in its positive direction, where hi is the horizontal coordinate of the peak and highest is its vertical coordinate.
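The trough and peak search described above can be sketched as follows. The text as reproduced here does not give the formula that turns (lo, lowest) and (hi, highest) into the estimated period, so this sketch assumes the period equals the lag hi of the first peak after the first trough. The function names, the normalization by the zero-lag energy, and the `sample_rate_hz` parameter are illustrative assumptions, not part of the source.

```python
def autocorrelation(samples):
    """Normalized autocorrelation of the mean-removed samples for lags 0..n-1."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return [0.0] * n  # constant input: no periodic structure at all
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def first_trough_and_peak(acf):
    """Return ((lo, lowest), (hi, highest)), or None if either is missing."""
    lo = next((k for k in range(1, len(acf) - 1)
               if acf[k] < acf[k - 1] and acf[k] <= acf[k + 1]), None)
    if lo is None:
        return None
    hi = next((k for k in range(lo + 1, len(acf) - 1)
               if acf[k] > acf[k - 1] and acf[k] >= acf[k + 1]), None)
    if hi is None:
        return None
    return (lo, acf[lo]), (hi, acf[hi])


def estimate_frequency(samples, sample_rate_hz):
    """Estimated frequency in Hz, or None when no trough/peak pair exists.

    Assumption (not stated in the source): estimated period = hi samples.
    """
    found = first_trough_and_peak(autocorrelation(samples))
    if found is None:
        return None
    _, (hi, _) = found
    return sample_rate_hz / hi
```

For 64 samples collected over 2.5 seconds (25.6 samples per second), a head rotation that repeats every 16 samples would be estimated at 25.6 / 16 = 1.6 Hz under this assumption.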
- Step 103: determine whether the user performs a preset action based on the estimated frequency.
- the vehicle-mounted equipment can determine the frequency domain distribution of the head posture data through Fourier transform, and then determine the central frequency band area that meets the periodic signal conditions in the frequency domain distribution of the head posture data based on the estimated frequency; and then determine whether the user makes a preset action based on at least one of the amplitude and power proportion of the central frequency band area.
- After the vehicle-mounted device determines the center frequency band area, it determines the maximum amplitude in the center frequency band area or the proportion of the power of the center frequency band area in the total power of the entire frequency band.
- the vehicle-mounted device can determine whether the user has performed a preset action based only on the maximum amplitude or the power ratio, or it can determine based on both at the same time to improve accuracy.
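A minimal sketch of the amplitude and power ratio check is given below. It uses a plain discrete Fourier transform from the Python standard library. The width of the center band (`half_width_bins`), the threshold values, and the `require_both` switch are hypothetical parameters introduced for illustration; the source only says that the region contains the center frequency band and that the thresholds are preset.

```python
import cmath
import math


def dft_magnitudes(samples):
    """One-sided DFT magnitudes (bins 0..n//2) of the mean-removed samples."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    return [
        abs(sum(c * cmath.exp(-2j * math.pi * k * i / n)
                for i, c in enumerate(centered)))
        for k in range(n // 2 + 1)
    ]


def center_band_metrics(samples, sample_rate_hz, estimated_hz, half_width_bins=1):
    """Return (maximum amplitude, power ratio) of the band around estimated_hz."""
    mags = dft_magnitudes(samples)
    n = len(samples)
    # DFT bin closest to the estimated frequency, clamped to the valid range.
    center = min(max(round(estimated_hz * n / sample_rate_hz), 0), len(mags) - 1)
    lo = max(center - half_width_bins, 0)
    hi = min(center + half_width_bins, len(mags) - 1)
    band = mags[lo:hi + 1]
    total_power = sum(m * m for m in mags)
    band_power = sum(m * m for m in band)
    ratio = band_power / total_power if total_power > 0 else 0.0
    return max(band), ratio


def is_preset_action(max_amplitude, power_ratio,
                     first_threshold, second_threshold, require_both=False):
    """Decide from one criterion (default) or from both criteria together."""
    amp_ok = max_amplitude > first_threshold
    ratio_ok = power_ratio > second_threshold
    return (amp_ok and ratio_ok) if require_both else (amp_ok or ratio_ok)
```

Requiring both criteria trades some sensitivity for fewer false detections, which matches the remark that using both values at the same time improves accuracy.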
- After acquiring the head posture data, the vehicle-mounted device determines the estimated frequency of the head posture data according to the autocorrelation function, and then determines whether the user has made a preset action based on the estimated frequency.
- This method can effectively reduce the error of the estimated frequency through the autocorrelation function, thereby improving the accuracy of head action recognition.
- After the vehicle-mounted device obtains the head posture data, it can traverse all of the head posture data and determine the maximum and minimum values; if the result of subtracting the minimum value from the maximum value is less than the third threshold, it is determined that the user has not made the preset action.
- When the user makes a preset action, the head posture data appears as a periodic signal with obvious peaks (maximum values) and troughs (minimum values). Usually, the difference between the peaks and troughs will be greater than the third threshold. Therefore, when the difference between the maximum and minimum values of the head posture data is less than the third threshold, the vehicle-mounted device can preliminarily determine that the user has not made the preset action.
- After the vehicle-mounted device determines the autocorrelation function of the head posture data, it also determines the horizontal coordinate of the first trough of the autocorrelation function in the positive direction of the coordinate origin; if the horizontal coordinate of the first trough is greater than the fourth threshold, it is determined that the user has not made a preset action.
- the horizontal coordinate of the first trough is greater than the fourth threshold, which means that the period of the signal is too long, and some non-periodic signals may be mistakenly identified as periodic signals with a longer period. Under normal circumstances, the period of the signal is less than the first time length, and the fourth threshold can be reasonably set according to the size of the first time length.
- If the vehicle-mounted device detects that no peak is found in the autocorrelation function, or that the ordinates of the peaks are all less than the fifth threshold, it is determined that the user has not made a preset action. If the signal has no peak or the peak is not obvious, the periodicity of the signal is not strong and does not meet the judgment standard of the periodic signal in the embodiment of the present invention.
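The three early-exit checks (third, fourth and fifth thresholds) can be combined into one guard, sketched below. The function and parameter names are hypothetical. Two details are assumptions not stated in the source: the fourth threshold is expressed as a lag in samples, and a missing trough is treated the same way as a trough that lies too far out. The lag-0 value of the autocorrelation function is not counted as a peak.

```python
def local_extrema(acf):
    """Return (trough lags, peak lags) of the autocorrelation, excluding lag 0."""
    troughs, peaks = [], []
    for k in range(1, len(acf) - 1):
        if acf[k] < acf[k - 1] and acf[k] <= acf[k + 1]:
            troughs.append(k)
        elif acf[k] > acf[k - 1] and acf[k] >= acf[k + 1]:
            peaks.append(k)
    return troughs, peaks


def passes_prechecks(samples, acf,
                     third_threshold, fourth_threshold, fifth_threshold):
    """Return False as soon as one check rules out the preset action."""
    # Check 1: the swing between maximum and minimum must be large enough.
    if max(samples) - min(samples) < third_threshold:
        return False
    troughs, peaks = local_extrema(acf)
    # Check 2: the first trough must not lie beyond the fourth threshold
    # (a missing trough is also rejected; this is an assumption).
    if not troughs or troughs[0] > fourth_threshold:
        return False
    # Check 3: at least one peak must reach the fifth threshold.
    if not peaks or all(acf[k] < fifth_threshold for k in peaks):
        return False
    return True
```

Running these inexpensive checks first means the Fourier transform only has to be computed for windows that already look periodic.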
- the vehicle-mounted device can also set other preset actions, and determine whether the user has performed the preset actions based on the posture data of other parts of the user.
- the vehicle-mounted device calculates the maximum amplitude or power ratio of the central frequency band area mainly to determine whether the head posture data contains a periodic signal. In addition to periodic nodding or shaking the head, a user's rapid waving left and right or up and down in a short period of time can also be identified as a periodic signal.
- the vehicle-mounted device can also obtain the user's hand posture data through the vehicle-mounted camera device, and determine whether the user has performed the preset action based on the above-mentioned head posture data judgment process. Other actions that can be judged as periodic signals can also be used as preset actions of embodiments of the present invention.
- the process of the vehicle-mounted device acquiring the head posture data of the user through the vehicle-mounted camera device mainly includes: acquiring the head posture data from the acquired image data of the user based on the sampling frequency.
- After the vehicle-mounted device determines the head posture data, it sends the collected head posture data to the recognition queue. If it is detected that the number of head posture data in the recognition queue is not N, the head posture data is interpolated, wherein the number of head posture data after interpolation is N.
- the first time length is set to 2.5 seconds.
- the vehicle-mounted camera device can obtain 64 frames of image data within 2.5 seconds, and the vehicle-mounted device can collect 64 head posture data. In actual collection, since the frame rate may be unstable, the vehicle-mounted camera device may obtain only 60 frames of image data, for example, in which case the vehicle-mounted device can collect 60 head posture data. When collecting head posture data, the collection time of each head posture data is recorded.
- the vehicle-mounted device can divide the 2.5-second collection time evenly into 64 parts (numbered 1 to 64), each corresponding to a time point and one head posture data. Then, according to the collection times of the 60 head posture data actually collected, the 4 time points that were missed due to the unstable frame rate are determined, and the head posture data corresponding to those 4 time points are completed.
- the vehicle-mounted device can use linear interpolation to complete the missing data. For example, suppose the head posture data corresponding to time point 50 is missing, the head posture data corresponding to time point 49 is positive 50 degrees, and the head posture data corresponding to time point 51 is positive 54 degrees.
- the vehicle-mounted device can then determine the mean of the two as the head posture data corresponding to time point 50, that is, positive 52 degrees.
- For a missing time point at the end of the window, such as time point 64, the vehicle-mounted device can determine the head posture data as, for example, negative 40 degrees, so that three head posture data lie on the same straight line in the two-dimensional coordinate system.
- If more than 64 head posture data are actually collected, interpolation processing is also required. For example, if time point 60 and time point 61 each have corresponding head posture data and another head posture data falls between time point 60 and time point 61, the vehicle-mounted device can delete the redundant head posture data.
- As another example, time point 50 and time point 52 each have corresponding head posture data, while time point 51 corresponds to two head posture data: positive 30 degrees and positive 32 degrees.
- the vehicle-mounted device can determine the average of the two as the head posture data corresponding to time point 51, that is, positive 31 degrees.
- the embodiment of the present invention can effectively reduce the impact of an unstable frame rate of the vehicle-mounted camera device by performing interpolation processing on the head posture data, and thereby indirectly improve the accuracy of preset action recognition.
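The completion of missing time points can be sketched as a resampling step onto N evenly spaced slots. This is an illustration under stated assumptions, not the patented procedure: slots are numbered from 0 rather than 1, each sample is assigned to its nearest slot, samples that share a slot are averaged (the text also mentions simply deleting redundant samples), and gaps at either end are extended along the line through the two nearest known slots. The name `resample_to_grid` and its parameters are hypothetical.

```python
def resample_to_grid(timestamps, values, duration_s, n):
    """Return n values on an even time grid built from irregular samples."""
    step = duration_s / n
    buckets = [[] for _ in range(n)]
    for t, v in zip(timestamps, values):
        slot = min(max(int(round(t / step)), 0), n - 1)
        buckets[slot].append(v)
    # Average samples that landed in the same slot; None marks a gap.
    grid = [sum(b) / len(b) if b else None for b in buckets]
    known = [i for i, g in enumerate(grid) if g is not None]
    if not known:
        raise ValueError("no samples to resample")
    result = list(grid)
    for i in range(n):
        if grid[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is not None and right is not None:
            # Interior gap: linear interpolation between the two neighbors.
            w = (i - left) / (right - left)
            result[i] = grid[left] + w * (grid[right] - grid[left])
        elif len(known) == 1:
            result[i] = grid[known[0]]
        else:
            # Edge gap: extend the line through the two nearest known slots.
            k0, k1 = (known[0], known[1]) if left is None else (known[-2], known[-1])
            slope = (grid[k1] - grid[k0]) / (k1 - k0)
            result[i] = grid[k1] + slope * (i - k1)
    return result
```

With neighbors of 50 and 54 degrees the gap is filled with 52 degrees, and two samples of 30 and 32 degrees in one slot become 31 degrees, matching the worked values in the text.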
- the vehicle-mounted device also performs Gaussian filtering on the head posture data after interpolation processing, to eliminate the influence of noise signals on the head posture data.
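A one-dimensional Gaussian filter of the kind mentioned above can be sketched in a few lines. The source does not specify the kernel width, so `sigma`, the three-sigma kernel radius, and the edge handling (repeating the first and last sample) are all assumptions chosen for illustration.

```python
import math


def gaussian_filter(samples, sigma=1.0, radius=None):
    """Smooth samples with a normalized Gaussian kernel (edges are replicated)."""
    if radius is None:
        radius = max(1, int(3 * sigma + 0.5))
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    norm = sum(kernel)
    kernel = [w / norm for w in kernel]  # weights sum to 1
    n = len(samples)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp at the edges
            acc += w * samples[idx]
        smoothed.append(acc)
    return smoothed
```

A larger `sigma` suppresses more noise but also flattens fast head movements, so in practice it would be tuned against the expected nodding or shaking frequency.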
- After the recognition of the N head posture data in the recognition queue is completed, the vehicle-mounted device pops M head posture data from the head of the recognition queue and sends M newly collected head posture data to the tail of the recognition queue, where M is an integer less than or equal to N.
- For example, the vehicle-mounted device first recognizes the 64 head posture data in the recognition queue; after the recognition, it pops the first 4 head posture data from the recognition queue and appends the 4 newly collected head posture data after the remaining 60, forming a new set of 64 head posture data, which the vehicle-mounted device then continues to recognize.
- the vehicle-mounted device can repeat the above operation in a loop.
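The sliding recognition queue can be sketched with a double-ended queue from the Python standard library. The helper name `slide_window` is hypothetical; M is taken to be the number of newly collected samples passed in.

```python
from collections import deque


def slide_window(queue, new_samples):
    """Pop M items from the head and append the M new items at the tail."""
    m = len(new_samples)
    if m > len(queue):
        raise ValueError("M must be less than or equal to N")
    for _ in range(m):
        queue.popleft()
    queue.extend(new_samples)
    return queue
```

For example, starting from a queue of 64 samples and sliding in 4 new ones leaves 60 old samples followed by the 4 new ones. A `deque` created with `maxlen=N` gives the same behavior implicitly, because appending to a full bounded deque discards items from the opposite end.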
- FIG. 2 is a flow chart of another head action recognition method provided by an embodiment of the present invention. As shown in FIG. 2, the method may include:
- Step 201: collect head posture data.
- the vehicle-mounted device obtains the user's image data in real time through the vehicle-mounted camera device, and collects the head posture data from the image data.
- Step 202: interpolation processing.
- the vehicle-mounted device sends the collected head posture data to the recognition queue. If it is detected that the number of head posture data in the recognition queue is different from the preset number, interpolation processing is performed.
- Step 203: Gaussian filtering.
- the vehicle-mounted device performs Gaussian filtering on the head posture data to eliminate noise interference.
- Step 204: determine an estimated frequency according to the autocorrelation function.
- the vehicle-mounted device first determines the autocorrelation function of the head posture data, and then determines the frequency of the autocorrelation function.
- the head posture data and its autocorrelation function have the same frequency. Therefore, the frequency of the head posture data can be determined more accurately by calculating the autocorrelation function.
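The claim that the signal and its autocorrelation function share one frequency can be checked for an idealized case. Assuming, purely for illustration, that the rotation angle is a single sinusoid with amplitude A, frequency f and phase φ (symbols not used in the source), the time-averaged autocorrelation is:

```latex
x(t) = A\cos(2\pi f t + \varphi)
\quad\Longrightarrow\quad
R_x(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} x(t)\,x(t+\tau)\,dt
          = \frac{A^{2}}{2}\cos(2\pi f \tau)
```

The result follows from the product-to-sum identity: the term that depends on t averages to zero, leaving a cosine in τ with the same frequency f and no dependence on the phase φ. Uncorrelated noise contributes mainly near τ = 0, which is why troughs and peaks at positive lags give a cleaner period estimate than the raw samples.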
- Step 205: determine the frequency domain distribution.
- the vehicle-mounted device determines the frequency domain distribution of the head posture data through Fourier transform.
- Step 206: determine the center frequency band area.
- the vehicle-mounted device determines the center frequency band based on the estimated frequency determined in step 204 and the formula, and determines an area centered on the center frequency band as the center frequency band area.
- the area range can be reasonably set.
- Step 207: determine the maximum amplitude.
- the vehicle-mounted device determines the maximum amplitude in the center frequency band area.
- Step 208: determine whether the maximum amplitude is greater than a first threshold.
- the vehicle-mounted device determines whether the maximum amplitude determined in step 207 is greater than the first threshold. If so, the process proceeds to step 211; otherwise, the entire process ends.
- Step 209: determine the power ratio.
- the vehicle-mounted device determines the ratio of the power of the center frequency band area to the total power of all frequency bands.
- Step 210: determine whether the power ratio is greater than a second threshold.
- the vehicle-mounted device determines whether the power ratio determined in step 209 is greater than the second threshold. If so, the process proceeds to step 211; otherwise, the entire process ends.
- Step 211: determine that the user has performed a preset action and perform corresponding processing.
- After the vehicle-mounted device determines that the user has performed a preset action, it performs the task associated with the preset action. For example, when it detects that the user periodically shakes his head, it switches the song being played, and when it detects that the user periodically nods, it turns on the air conditioner in the car.
- the vehicle-mounted device can detect the head movements of the vehicle user in real time, and perform corresponding processing when preset movements occur, thereby improving user experience and driving safety.
- Fig. 3 is a schematic diagram of the structure of a device provided by an embodiment of the present invention.
- the device can be deployed in a vehicle-mounted device, as shown in Fig. 3, and can include: an acquisition module 310 and a processing module 320.
- the acquisition module 310 is used to acquire the head posture data of the user within a first time length.
- the processing module 320 is used to determine the autocorrelation function of the head posture data, and determine the estimated frequency of the head posture data according to the time domain distribution of the autocorrelation function.
- the processing module 320 is further configured to determine whether the user performs a preset action based on the estimated frequency.
- the embodiment of the present invention also provides a vehicle, which can be equipped with a vehicle-mounted device that, when running, can implement the head action recognition method provided by the embodiment of the present invention.
- Fig. 4 is a schematic diagram of the structure of an electronic device provided by an embodiment of the present invention.
- the electronic device shown in Fig. 4 is only an example and should not bring any limitation to the functions and application scope of the embodiment of the present invention.
- the electronic device is in the form of a general-purpose computing device.
- the components of the electronic device may include, but are not limited to: one or more processors 410, memory 430, and a communication bus 440 connecting different system components (including memory 430 and processor 410).
- the communication bus 440 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a graphics acceleration port, a processor or a local bus using any of a variety of bus structures.
- these architectures include but are not limited to the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus and the Peripheral Component Interconnect (PCI) bus.
- Electronic devices typically include a variety of computer system readable media. These media can be any available media that can be accessed by the electronic device, including volatile and non-volatile media, removable and non-removable media.
- the memory 430 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) and/or cache memory.
- the electronic device may further include other removable/non-removable, volatile/non-volatile computer system storage media.
- a disk drive for reading and writing removable non-volatile disks (such as "floppy disks") and an optical disk drive for reading and writing removable non-volatile optical disks (such as a compact disc read-only memory (CD-ROM), a digital versatile disc read-only memory (DVD-ROM), or other optical media) may be provided.
- each drive can be connected to the communication bus 440 via one or more data medium interfaces.
- the memory 430 may include at least one program product, the program product having a set (e.g., at least one) of program modules that are configured to execute the functions of various embodiments of the present invention.
- a program/utility having a set (at least one) of program modules may be stored in memory 430, such program modules including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which or some combination may include an implementation of a network environment.
- the program modules generally perform the functions and/or methods of the embodiments described herein.
- the electronic device may also communicate with one or more external devices, may communicate with one or more devices that enable a user to interact with the electronic device, and/or may communicate with any device (e.g., a network card, a modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may be performed through the communication interface 420.
- the electronic device may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through a network adapter (not shown in FIG. 4 ), and the network adapter may communicate with other modules of the electronic device through the communication bus 440.
- the processor 410 executes various functional applications and data processing by running the programs stored in the memory 430, such as implementing the head action recognition method provided in the embodiment of the present invention.
- An embodiment of the present invention further provides a computer program product, which includes a computer program.
- When the computer program is executed by a processor, the head action recognition method provided by the embodiment of the present invention is implemented.
- An embodiment of the present invention further provides a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions enable the computer to execute the head action recognition method provided by the embodiment of the present invention.
- the computer-readable storage medium may be any combination of one or more computer-readable media.
- a computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
- a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
- Computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof.
- a computer-readable storage medium may be any tangible medium containing or storing a program that may be used by or in conjunction with an instruction execution system, apparatus, or device.
- Computer-readable signal media may include a data signal propagated in baseband or as part of a carrier wave, which carries a computer-readable program code. Such propagated data signals may take a variety of forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the above. Computer-readable signal media may also be any computer-readable medium other than a computer-readable storage medium, which may send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
- The terms “first” and “second” are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, a feature defined as “first” or “second” may explicitly or implicitly include at least one such feature. In the description of the present invention, “plurality” means at least two, such as two, three, etc., unless otherwise clearly and specifically defined.
- Any process or method description in a flowchart or otherwise described herein may be understood to represent a module, segment or portion of code comprising one or more executable instructions for implementing the steps of a custom logical function or process, and the scope of the preferred embodiments of the present invention includes alternative implementations in which functions may not be performed in the order shown or discussed, including performing functions in a substantially simultaneous manner or in reverse order depending on the functions involved, which should be understood by technicians in the technical field to which the embodiments of the present invention belong.
- the disclosed systems, devices and methods can be implemented in other ways.
- the device embodiments described above are only schematic.
- the division of the units is only a logical function division. There may be other division methods in actual implementation.
- multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
- In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Psychiatry (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- User Interface Of Digital Computer (AREA)
- Image Analysis (AREA)
- Character Discrimination (AREA)
Abstract
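The present invention relates to the field of computer technology, and in particular to a head action recognition method and apparatus. Head posture data of a user within a first time length is first acquired; an autocorrelation function of the head posture data is then determined, and an estimated frequency of the head posture data is determined according to the autocorrelation function; whether the user has made a preset action is then determined based on the estimated frequency. By using the autocorrelation function, the method can effectively reduce the error of the estimated frequency of the head posture data, which improves the accuracy of head action recognition.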
Description
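The present invention claims priority to Chinese patent application No. 202310064924.7, entitled “Head action recognition method and apparatus”, filed with the China Patent Office on January 13, 2023, the entire contents of which are incorporated herein by reference.
The present invention relates to the field of computer technology, and in particular to a head action recognition method and apparatus.
With the development of computer technology, its influence has spread to every area of life, making other fields increasingly intelligent. In the automotive field, for example, vehicles are usually fitted with in-vehicle devices (such as in-vehicle televisions and in-vehicle audio systems) that enrich the driving experience. However, drivers sometimes operate these devices while the vehicle is moving, for example to change the navigation destination or to switch the song being played. Doing so takes one of the driver's hands off the steering wheel and distracts the driver, which may lead to a traffic accident.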
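Summary of the Invention
Embodiments of the present invention provide a head action recognition method and apparatus. By using an autocorrelation function, the error of the estimated frequency of the head posture data can be effectively reduced, which improves the accuracy of head action recognition.
In a first aspect, an embodiment of the present invention provides a head action recognition method, including:
acquiring head posture data of a user within a first time length;
determining an autocorrelation function of the head posture data, and determining an estimated frequency of the head posture data according to the autocorrelation function; and
determining, based on the estimated frequency, whether the user has made a preset action.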
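In one implementation, determining, based on the estimated frequency, whether the user has made a preset action includes:
determining, according to the estimated frequency, a center frequency band region that satisfies a periodic signal condition in the frequency-domain distribution of the head posture data; and
determining whether the user has made the preset action according to the amplitude and the power ratio of the center frequency band region.
In one implementation, determining, according to the estimated frequency, the center frequency band region that satisfies the periodic signal condition in the frequency-domain distribution of the head posture data includes:
determining the frequency-domain distribution of the head posture data based on a Fourier transform;
determining a center frequency band of the frequency-domain distribution according to the formula C=f*L, where C is the center frequency band, f is the estimated frequency, and L is the first time length; and
determining, in the frequency-domain distribution of the head posture data, a region that contains at least the center frequency band as the center frequency band region.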
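In one implementation, determining, based on the estimated frequency, whether the user has made a preset action includes:
determining, according to the estimated frequency, a center frequency band region that satisfies a periodic signal condition in the frequency-domain distribution of the head posture data; and
determining whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region.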
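In one implementation, determining whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region includes:
determining the maximum amplitude in the center frequency band region; and
if the maximum amplitude is greater than a first threshold, determining that the user has made the preset action.
In one implementation, determining whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region includes:
determining the ratio of the power of the center frequency band region to the total power of the entire frequency band; and
if the power ratio is greater than a second threshold, determining that the user has made the preset action.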
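In one implementation, determining the estimated frequency of the head posture data according to the autocorrelation function includes:
determining an estimated period of the head posture data according to a trough and a peak of the autocorrelation function; and
determining the reciprocal of the estimated period as the estimated frequency.
In one implementation, determining the estimated period of the head posture data according to the trough and the peak of the autocorrelation function includes:
determining the coordinates (lo, lowest) of the first trough of the autocorrelation function in the positive direction from the coordinate origin, where lo is the abscissa of the first trough and lowest is its ordinate;
determining the coordinates (hi, highest) of the peak that is adjacent to the first trough and located in the positive direction from it, where hi is the abscissa of the peak and highest is its ordinate; and
determining the estimated period of the head posture data according to the formula T=2*(hi-lo)*L/N, where T is the estimated period, L is the first time length, and N is the number of head posture data items to be collected within the first time length.
In one implementation, the preset action includes periodic nodding or periodic head shaking.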
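In one implementation, before determining the autocorrelation function of the head posture data and determining the estimated frequency of the head posture data according to the autocorrelation function, the method further includes:
determining a maximum value and a minimum value of the head posture data; and
if the result of subtracting the minimum value from the maximum value is less than a third threshold, determining that the user has not made the preset action.
In one implementation, before determining, based on the estimated frequency, whether the user has made the preset action, the method further includes:
determining the abscissa of the first trough of the autocorrelation function in the positive direction from the coordinate origin; and
if the abscissa of the first trough is greater than a fourth threshold, determining that the user has not made the preset action.
In one implementation, before determining, based on the estimated frequency, whether the user has made the preset action, the method further includes:
if no peak is found in the autocorrelation function, or the ordinates of all peaks are less than a fifth threshold, determining that the user has not made the preset action.
In one implementation, before determining, based on the estimated frequency, whether the user has made the preset action, the method further includes:
determining the abscissa lo of the first trough of the autocorrelation function in the positive direction from the coordinate origin;
determining the abscissa hi of the peak that is adjacent to the first trough and located in the positive direction from it;
calculating a first value according to the formula A=|2*lo-hi|, where A is the first value; and
if the first value is greater than a sixth threshold, determining that the user has not made the preset action.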
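In one implementation, acquiring the head posture data of the user within the first time length includes:
determining a sampling frequency of the head posture data according to the formula F=N/L, where F is the sampling frequency, L is the first time length, and N is the number of head posture data items to be collected within the first time length; and
collecting the head posture data from already acquired image data of the user based on the sampling frequency.
In one implementation, before determining the autocorrelation function of the head posture data, the method further includes:
sending the head posture data collected within the first time length to a recognition queue, and, if the number of head posture data items in the recognition queue is detected not to be N, performing interpolation on the head posture data,
where the number of head posture data items after interpolation is N.
In one implementation, before determining the autocorrelation function of the head posture data, the method further includes:
performing Gaussian filtering on the head posture data.
In one implementation, the method further includes:
after recognition of the N head posture data items in the recognition queue is completed, popping M head posture data items from the head of the recognition queue and sending M newly collected head posture data items to the tail of the recognition queue,
where M is an integer less than or equal to N.
In one implementation, the head posture data includes horizontal rotation angle data or vertical rotation angle data of the user's head.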
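In a second aspect, an embodiment of the present invention provides a head action recognition apparatus, including:
an acquisition module, configured to acquire head posture data of a user within a first time length; and
a processing module, configured to determine an autocorrelation function of the head posture data and to determine an estimated frequency of the head posture data according to the time-domain distribution of the autocorrelation function;
the processing module being further configured to determine, based on the estimated frequency, whether the user has made a preset action.
In a third aspect, an embodiment of the present invention provides a vehicle, the vehicle including:
an in-vehicle device that, when running, is capable of implementing the method provided in the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic chip, including:
at least one processor;
the processor being coupled to at least one memory, where:
the memory stores program instructions executable by the processor, and the processor is capable of executing the method provided in the first aspect by invoking the program instructions.
In a fifth aspect, an embodiment of the present invention provides a computer program product, the computer program product including a computer program that, when executed by a processor, implements the method provided in the first aspect.
In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium, the computer-readable storage medium including a stored program, where, when the program runs, the device on which the computer-readable storage medium is located is controlled to execute the method provided in the first aspect.
In the embodiments of the present invention, head posture data of a user within a first time length is first acquired; an autocorrelation function of the head posture data is then determined, and an estimated frequency of the head posture data is determined according to the autocorrelation function; whether the user has made a preset action is then determined based on the estimated frequency. By using the autocorrelation function, the method can effectively reduce the error of the estimated frequency of the head posture data, which improves the accuracy of head action recognition.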
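To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a head action recognition method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another head action recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a head action recognition apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.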
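For a better understanding of the technical solutions of this specification, the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
It should be clear that the described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort fall within the protection scope of this specification.
The terms used in the embodiments of the present invention are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms “a”, “said”, and “the” used in the embodiments of the present invention and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
While a driver is driving a vehicle, the in-vehicle device can determine the driver's intention by recognizing the driver's head actions. However, because of interference signals in this process, the accuracy of the recognition result is difficult to guarantee. An embodiment of the present invention provides a head action recognition method that can improve the accuracy of the recognition result.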
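FIG. 1 is a flowchart of a head action recognition method according to an embodiment of the present invention. The method may be applied to an in-vehicle device and, as shown in FIG. 1, may include:
Step 101: acquire head posture data of a user within a first time length.
In this embodiment of the present invention, the in-vehicle device may acquire the user's head posture data through an in-vehicle camera. The head posture data mainly includes horizontal rotation angle data or vertical rotation angle data of the user's head. The head posture data may be plotted in a two-dimensional coordinate system whose abscissa is time and whose ordinate is the horizontal or vertical rotation angle; the rotation angle may range from -180 degrees to +180 degrees. For example, when the driver looks straight ahead, both the horizontal and vertical rotation angles are 0. When the driver turns the head to the left, the horizontal rotation angle may become -90 degrees; when the driver looks upward, the vertical rotation angle may become +90 degrees. If the driver nods 3 times within 1 second, the vertical rotation angle can be regarded as a continuous periodic signal; if the driver shakes the head 3 times within 1 second, the horizontal rotation angle can be regarded as a continuous periodic signal. The first time length is a preset reasonable value, and the time a user needs to complete periodic nodding or periodic head shaking is usually less than the first time length. For example, the first time length may be 2.5 seconds, 3 seconds, or another reasonable value.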
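Step 102: determine an autocorrelation function of the head posture data, and determine an estimated frequency of the head posture data according to the autocorrelation function.
In an optional embodiment, the in-vehicle device may first determine the autocorrelation function of the head posture data, and then determine an estimated period of the head posture data from the first adjacent trough and peak of the autocorrelation function in the positive direction from the origin; the reciprocal of the estimated period is the estimated frequency. Specifically, this step may include: determining the coordinates (lo, lowest) of the first trough of the autocorrelation function in the positive direction from the coordinate origin, where lo is the abscissa of the first trough and lowest is its ordinate; determining the coordinates (hi, highest) of the peak that is adjacent to the first trough and located in the positive direction from it, where hi is the abscissa of the peak and highest is its ordinate; and determining the estimated period of the head posture data according to the formula T=2*(hi-lo)*L/N, where T is the estimated period, L is the first time length, and N is the number of head posture data items to be collected within the first time length.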
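The period estimate in step 102 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the mean is removed before correlating, and the trough and peak are taken as simple local extrema of an unnormalized autocorrelation, none of which the description specifies.

```python
import math


def autocorrelation(samples):
    """Autocorrelation of the mean-removed sequence for lags 0..N-1 (assumed form)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


def first_trough_and_peak(acf):
    """Return (lo, hi): lag of the first local minimum and of the next local maximum."""
    lo = next((k for k in range(1, len(acf) - 1)
               if acf[k - 1] > acf[k] <= acf[k + 1]), None)
    if lo is None:
        return None, None
    hi = next((k for k in range(lo + 1, len(acf) - 1)
               if acf[k - 1] < acf[k] >= acf[k + 1]), None)
    return lo, hi


def estimate_period(samples, window_seconds):
    """T = 2 * (hi - lo) * L / N; None when no trough/peak pair is found."""
    lo, hi = first_trough_and_peak(autocorrelation(samples))
    if lo is None or hi is None:
        return None
    return 2 * (hi - lo) * window_seconds / len(samples)


# Synthetic check: N = 64 samples over L = 2.5 s, four full oscillations.
angles = [30.0 * math.sin(2 * math.pi * i / 16) for i in range(64)]
period = estimate_period(angles, 2.5)
frequency = 1.0 / period  # estimated frequency is the reciprocal of the period
```

For this synthetic input the trough falls at lag 8 and the peak at lag 16, so the lag difference is half a period in samples and the factor 2 in the formula restores the full period; L/N converts samples to seconds.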
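Step 103: determine, based on the estimated frequency, whether the user has made a preset action.
The in-vehicle device may determine the frequency-domain distribution of the head posture data through a Fourier transform, then determine, according to the estimated frequency, a center frequency band region that satisfies a periodic signal condition in the frequency-domain distribution, and then determine whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region.
In an optional embodiment, the specific steps by which the in-vehicle device determines the center frequency band region that satisfies the periodic signal condition may include: determining the frequency-domain distribution of the head posture data based on a Fourier transform; determining a center frequency band of the frequency-domain distribution according to the formula C=f*L, where C is the center frequency band, f is the estimated frequency, and L is the first time length; and determining, in the frequency-domain distribution of the head posture data, a region that contains at least the center frequency band as the center frequency band region. After determining the center frequency band region, the in-vehicle device determines the maximum amplitude in the center frequency band region, or the ratio of the power of the center frequency band region to the total power of the entire frequency band. If the maximum amplitude is detected to be greater than the first threshold, or the power ratio is detected to be greater than the second threshold, it is determined that the user has made the preset action. The in-vehicle device may judge whether the user has made the preset action from the maximum amplitude alone or from the power ratio alone, or from both together to improve accuracy.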
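The frequency-domain features of step 103 can be sketched as below. The sketch assumes a plain discrete Fourier transform over the one-sided spectrum, rounds C=f*L to the nearest bin, and takes the region as that bin plus one neighbour on each side; the region width, the mean removal, and the 2/N amplitude scaling are illustrative assumptions, and all function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N//2 of the mean-removed sequence."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(centered))
        mags.append(abs(coeff))
    return mags


def center_band_bins(estimated_hz, window_seconds, n_bins, half_width=1):
    """Bins around C = f * L; half_width is an assumed region size."""
    center = round(estimated_hz * window_seconds)
    first = max(center - half_width, 1)
    last = min(center + half_width, n_bins - 1)
    return range(first, last + 1)


def band_features(samples, estimated_hz, window_seconds, half_width=1):
    """Return (max amplitude in the region, region power / total power)."""
    mags = dft_magnitudes(samples)
    band = center_band_bins(estimated_hz, window_seconds, len(mags), half_width)
    if len(band) == 0:
        return 0.0, 0.0
    # Scale so that a pure sinusoid of amplitude A yields roughly A.
    max_amplitude = max(mags[k] for k in band) * 2 / len(samples)
    power = [m * m for m in mags]
    total = sum(power)
    ratio = sum(power[k] for k in band) / total if total > 0 else 0.0
    return max_amplitude, ratio


# 64 samples over 2.5 s containing a 1.6 Hz oscillation of 30 degrees.
angles = [30.0 * math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
amplitude, ratio = band_features(angles, 1.6, 2.5)
```

Because the bin spacing of a window of length L is 1/L, a frequency f lands in bin f*L, which is why the formula multiplies the estimated frequency by the first time length.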
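In this embodiment of the present invention, after acquiring the head posture data, the in-vehicle device determines the estimated frequency of the head posture data according to the autocorrelation function, and then determines, based on the estimated frequency, whether the user has made the preset action. By using the autocorrelation function, the method can effectively reduce the error of the estimated frequency, which improves the accuracy of head action recognition.
In an optional embodiment, after acquiring the head posture data, the in-vehicle device may traverse all of the head posture data and determine its maximum and minimum values; if the result of subtracting the minimum value from the maximum value is detected to be less than the third threshold, it is determined that the user has not made the preset action. When the user makes the preset action, the head posture data appears as a periodic signal with distinct peaks (maximum values) and troughs (minimum values), and the difference between a peak and a trough is normally greater than the third threshold. Therefore, when the difference between the maximum and minimum values of the head posture data is less than the third threshold, the in-vehicle device can make a preliminary determination that the user has not made the preset action.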
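In an optional embodiment, after determining the autocorrelation function of the head posture data, the in-vehicle device also determines the abscissa of the first trough of the autocorrelation function in the positive direction from the coordinate origin; if the abscissa of the first trough is detected to be greater than the fourth threshold, it is determined that the user has not made the preset action. A first-trough abscissa greater than the fourth threshold means that the period of the signal is too long, in which case some non-periodic signals might be wrongly recognized as periodic signals with a long period. Normally the period of the signal is less than the first time length, and the fourth threshold may be set reasonably according to the first time length.
In an optional embodiment, if the in-vehicle device detects that no peak is found in the autocorrelation function, or that the ordinates of all peaks are less than the fifth threshold, it determines that the user has not made the preset action. If the signal has no peak or its peaks are not distinct, the periodicity of the signal is weak and does not meet the criterion for a periodic signal in the embodiments of the present invention.
In an optional embodiment, after determining the autocorrelation function of the head posture data, the in-vehicle device also calculates a first value according to the formula A=|2*lo-hi|, where A is the first value; if the first value is greater than the sixth threshold, it is determined that the user has not made the preset action. It can be understood that, for a periodic signal, twice the abscissa of the first trough should be very close to the abscissa of the adjacent peak in the positive direction.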
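The four rejection conditions described above (third to sixth thresholds) can be collected into one pre-screening function, sketched below. The `Thresholds` container, its field names, and the example values are hypothetical; the description only says the thresholds are set reasonably. The sketch also assumes the autocorrelation has been normalized so that peak heights are comparable across windows, and it evaluates all conditions together even though the range check is described as happening before the autocorrelation is computed.

```python
import math
from dataclasses import dataclass


@dataclass
class Thresholds:
    min_range: float         # third threshold: minimum (max - min) of the data
    max_trough_lag: int      # fourth threshold: largest allowed first-trough lag
    min_peak_height: float   # fifth threshold: minimum autocorrelation peak height
    max_lag_mismatch: int    # sixth threshold: largest allowed |2*lo - hi|


def local_peak_values(acf):
    """Ordinates of all local maxima of the autocorrelation at lags >= 1."""
    return [acf[k] for k in range(1, len(acf) - 1)
            if acf[k - 1] < acf[k] >= acf[k + 1]]


def passes_prescreen(samples, acf, lo, hi, limits):
    """False as soon as any rejection condition holds."""
    if max(samples) - min(samples) < limits.min_range:
        return False                      # movement too small
    if lo is None or lo > limits.max_trough_lag:
        return False                      # apparent period too long
    peaks = local_peak_values(acf)
    if not peaks or max(peaks) < limits.min_peak_height:
        return False                      # no distinct peak, weak periodicity
    if hi is None or abs(2 * lo - hi) > limits.max_lag_mismatch:
        return False                      # trough and peak lags inconsistent
    return True


# Idealized example: period of 16 samples, so lo = 8 and hi = 16.
angles = [30.0 * math.sin(2 * math.pi * i / 16) for i in range(64)]
ideal_acf = [math.cos(2 * math.pi * k / 16) for k in range(64)]
limits = Thresholds(min_range=10.0, max_trough_lag=20,
                    min_peak_height=0.3, max_lag_mismatch=2)
accepted = passes_prescreen(angles, ideal_acf, 8, 16, limits)
```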
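In an optional embodiment, the in-vehicle device may also set other preset actions and judge whether the user has made a preset action according to posture data of other parts of the user's body. The in-vehicle device calculates the maximum amplitude or power ratio of the center frequency band region mainly to judge whether the head posture data contains a periodic signal. Besides periodic nodding or periodic head shaking, a user quickly waving a hand left and right or up and down within a short time can also be recognized as a periodic signal. The in-vehicle device can likewise acquire the user's hand posture data through the in-vehicle camera and determine whether the user has made a preset action by following the judgment procedure described above for head posture data. Other actions that can be judged to be periodic signals may also serve as preset actions in the embodiments of the present invention.
In an optional embodiment, the process by which the in-vehicle device acquires the user's head posture data through the in-vehicle camera mainly includes: collecting the head posture data from already acquired image data of the user based on a sampling frequency. The sampling frequency can be determined according to the formula F=N/L, where F is the sampling frequency, L is the first time length, and N is the number of head posture data items to be collected within the first time length.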
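As a worked example of the sampling formula, assuming the values this description uses in its interpolation example (N = 64 items in L = 2.5 seconds), the sampling frequency and the spacing between consecutive samples are:

```latex
F = \frac{N}{L} = \frac{64}{2.5\,\text{s}} = 25.6\,\text{Hz},
\qquad
\Delta t = \frac{L}{N} = \frac{2.5\,\text{s}}{64} \approx 39.06\,\text{ms}
```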
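In an optional embodiment, after determining the head posture data, the in-vehicle device sends the head posture data collected within the first time length to a recognition queue. If the number of head posture data items in the recognition queue is detected not to be N, interpolation is performed on the head posture data so that the number of items after interpolation is N. For example, the first time length is set to 2.5 seconds. Under normal conditions the in-vehicle camera can capture 64 frames of image data in 2.5 seconds, and the in-vehicle device can collect 64 head posture data items. In actual collection the frame rate may be unstable; suppose the in-vehicle camera captures 60 frames of image data, so that the in-vehicle device can collect only 60 items. The collection time of each item is recorded when it is collected. The in-vehicle device may divide the 2.5-second collection time evenly into 64 portions (1 to 64), each corresponding to one time point and one head posture data item. It then uses the collection times of the 60 items actually collected to identify the 4 time points missed because of the unstable frame rate, and fills in the head posture data for those 4 time points. The in-vehicle device may fill in the missing data by linear interpolation. For example, if the item for time point 50 is missing, the item for time point 49 is +50 degrees, and the item for time point 51 is +54 degrees, the in-vehicle device may take the mean of the two, +52 degrees, as the item for time point 50. If the missing item lies at either end of the collection time, for example the item for time point 64 is missing, the item for time point 63 is -30 degrees, and the item for time point 62 is -20 degrees, the in-vehicle device sets the item for time point 64 to -40 degrees, so that the three items lie on one straight line in the two-dimensional coordinate system. If more than 64 items are actually collected, interpolation processing is also required. For example, if time points 60 and 61 each have their own item and there is one additional item between them, the in-vehicle device may delete the extra item. As another example, if time points 50 and 52 each have their own item and time point 51 corresponds to two items, +30 degrees and +32 degrees, the in-vehicle device may take the mean of the two, +31 degrees, as the item for time point 51.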
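The gap-filling rules in the example above can be sketched as follows. The function name, the (timestamp, angle) record format, and the slot assignment by integer division are assumptions made for illustration; the sketch averages items that fall in the same slot, interpolates linearly inside the window, and extends the line through the two nearest known items at either end.

```python
def resample_to_slots(records, window_seconds, n_slots):
    """Map (timestamp, angle) records onto n_slots evenly spaced time points."""
    slot_len = window_seconds / n_slots
    buckets = [[] for _ in range(n_slots)]
    for timestamp, angle in records:
        index = min(int(timestamp / slot_len), n_slots - 1)
        buckets[index].append(angle)

    # Several items in one slot are averaged; empty slots stay None for now.
    values = [sum(b) / len(b) if b else None for b in buckets]
    known = [i for i, v in enumerate(values) if v is not None]
    if len(known) < 2:
        raise ValueError("at least two occupied slots are needed")

    for i in range(n_slots):
        if values[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: extend first two known items
            a, b = known[0], known[1]
        elif right is None:         # gap at the end: extend last two known items
            a, b = known[-2], known[-1]
        else:                       # interior gap: interpolate between neighbours
            a, b = left, right
        slope = (values[b] - values[a]) / (b - a)
        values[i] = values[a] + slope * (i - a)
    return values


# Six one-second slots; slot 1 and slot 5 are empty, slot 3 holds two items.
records = [(0.5, 50.0), (2.5, 54.0), (3.2, 30.0), (3.7, 32.0), (4.5, 20.0)]
filled = resample_to_slots(records, 6.0, 6)
```

Only the originally occupied slots are used as interpolation anchors, so a filled value never feeds into another filled value.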
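By performing interpolation on the head posture data, this embodiment of the present invention can effectively reduce the impact of an unstable frame rate of the in-vehicle camera and thus indirectly improve the accuracy of preset action recognition.
In an optional embodiment, the in-vehicle device also applies Gaussian filtering to the interpolated head posture data to eliminate the influence of noise on the head posture data.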
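A one-dimensional Gaussian filter of the kind mentioned above can be sketched as below. The kernel width (`sigma`), the truncation radius, and the edge handling by repeating the end samples are assumptions; the description does not give filter parameters.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(samples, sigma=1.0, radius=3):
    """Smooth a sequence, repeating the end samples beyond the edges."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(samples)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for j, weight in enumerate(kernel):
            index = min(max(i + j - radius, 0), n - 1)
            acc += weight * samples[index]
        smoothed.append(acc)
    return smoothed


# A single noisy spike is spread out and reduced in height.
spike = [0.0] * 5 + [10.0] + [0.0] * 5
smoothed_spike = gaussian_smooth(spike)
```

Keeping the output the same length as the input matters here because the later formulas use N, the number of items in the window.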
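In an optional embodiment, after recognition of the N head posture data items in the recognition queue is completed, the in-vehicle device pops M head posture data items from the head of the recognition queue and sends M newly collected head posture data items to the tail of the recognition queue, where M is an integer less than or equal to N. For example, the in-vehicle device first performs recognition on the 64 head posture data items in the recognition queue. After recognition, it pops the first 4 items from the recognition queue and sends 4 newly collected items to the recognition queue, placing them after the remaining 60 items to form a new set of 64 items, on which the in-vehicle device continues to perform recognition. The in-vehicle device may repeat the above operations in a loop.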
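The queue update described above amounts to a sliding window with step M. A minimal sketch using Python's `collections.deque` follows; the function name is hypothetical.

```python
from collections import deque


def slide_window(queue, new_items):
    """Pop len(new_items) entries from the head and append the new ones at the tail."""
    m = len(new_items)
    if m > len(queue):
        raise ValueError("M must not exceed the queue length N")
    for _ in range(m):
        queue.popleft()
    queue.extend(new_items)


# N = 64 items already recognized, M = 4 newly collected items.
recognition_queue = deque(float(i) for i in range(64))
slide_window(recognition_queue, [100.0, 101.0, 102.0, 103.0])
```

A smaller M makes consecutive windows overlap more, so an action is checked more often at the cost of more recognition passes; M = N gives non-overlapping windows.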
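FIG. 2 is a flowchart of another head action recognition method according to an embodiment of the present invention. As shown in FIG. 2, the method may include:
Step 201: collect head posture data.
After the vehicle starts, the in-vehicle device acquires image data of the user in real time through the in-vehicle camera and collects head posture data from the image data.
Step 202: interpolation processing.
The in-vehicle device sends the collected head posture data to the recognition queue; if the number of head posture data items in the recognition queue is detected to differ from the preset number, interpolation is performed.
Step 203: Gaussian filtering.
The in-vehicle device applies Gaussian filtering to the head posture data to eliminate noise interference.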
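Step 204: determine the estimated frequency according to the autocorrelation function.
The in-vehicle device first determines the autocorrelation function of the head posture data and then determines the frequency of the autocorrelation function. Because the head posture data and its autocorrelation function have the same frequency, computing the autocorrelation function allows the frequency of the head posture data to be determined relatively accurately.
Step 205: determine the frequency-domain distribution.
The in-vehicle device determines the frequency-domain distribution of the head posture data through a Fourier transform.
Step 206: determine the center frequency band region.
The in-vehicle device determines the center frequency band from the estimated frequency determined in step 204 and the formula C=f*L, and takes a region centered on the center frequency band as the center frequency band region; the extent of the region may be set reasonably.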
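Step 207: determine the maximum amplitude.
The in-vehicle device determines the maximum amplitude in the center frequency band region.
Step 208: judge whether it is greater than the first threshold.
The in-vehicle device judges whether the maximum amplitude determined in step 207 is greater than the first threshold; if so, the procedure proceeds to step 211; otherwise, the whole procedure ends.
Step 209: determine the power ratio.
The in-vehicle device determines the ratio of the power of the center frequency band region to the total power of all frequency bands.
Step 210: judge whether it is greater than the second threshold.
The in-vehicle device judges whether the power ratio determined in step 209 is greater than the second threshold; if so, the procedure proceeds to step 211; otherwise, the whole procedure ends.
Step 211: determine that the user has made the preset action, and perform the corresponding processing.
After determining that the user has made the preset action, the in-vehicle device performs the task associated with the preset action. For example, when periodic head shaking is detected, it switches the song being played; when periodic nodding is detected, it turns on the in-vehicle air conditioner.
Through the above procedure, the in-vehicle device can detect the head actions of the vehicle user in real time and perform the corresponding processing when a preset action occurs, which improves user experience and driving safety.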
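The threshold decisions of steps 208 and 210 and the task dispatch of step 211 can be sketched as follows. The `require_both` flag reflects the statement that either criterion may be used alone or both together; the axis labels, task names, and threshold values are hypothetical placeholders for whatever the in-vehicle device actually uses.

```python
def preset_action_detected(max_amplitude, power_ratio,
                           first_threshold, second_threshold,
                           require_both=False):
    """Apply the amplitude test, the power-ratio test, or both."""
    amplitude_ok = max_amplitude > first_threshold
    ratio_ok = power_ratio > second_threshold
    if require_both:
        return amplitude_ok and ratio_ok
    return amplitude_ok or ratio_ok


# Example associations taken from the description: head shaking (horizontal
# angle) switches the song, nodding (vertical angle) turns on the air conditioner.
TASK_FOR_AXIS = {
    "horizontal": "switch_song",
    "vertical": "turn_on_air_conditioner",
}


def task_for(axis, detected):
    """Return the task name to run, or None when no preset action was detected."""
    return TASK_FOR_AXIS.get(axis) if detected else None


detected = preset_action_detected(12.0, 0.2, first_threshold=10.0, second_threshold=0.6)
task = task_for("horizontal", detected)
```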
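FIG. 3 is a schematic structural diagram of a head action recognition apparatus according to an embodiment of the present invention. The apparatus may be deployed in an in-vehicle device and, as shown in FIG. 3, may include an acquisition module 310 and a processing module 320.
The acquisition module 310 is configured to acquire head posture data of a user within a first time length.
The processing module 320 is configured to determine an autocorrelation function of the head posture data and to determine an estimated frequency of the head posture data according to the time-domain distribution of the autocorrelation function.
The processing module 320 is further configured to determine, based on the estimated frequency, whether the user has made a preset action.
An embodiment of the present invention further provides a vehicle. The vehicle may be equipped with an in-vehicle device that, when running, is capable of implementing the head action recognition method of the embodiments of the present invention.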
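FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device shown in FIG. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in FIG. 4, the electronic device takes the form of a general-purpose computing device. The components of the electronic device may include, but are not limited to: one or more processors 410, a memory 430, and a communication bus 440 connecting different system components (including the memory 430 and the processor 410).
The communication bus 440 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The electronic device typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the electronic device, including volatile and non-volatile media, and removable and non-removable media.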
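The memory 430 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) and/or cache memory. The electronic device may further include other removable/non-removable, volatile/non-volatile computer system storage media. Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (for example, a “floppy disk”) and an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example, a Compact Disc Read Only Memory (CD-ROM), a Digital Versatile Disc Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the communication bus 440 through one or more data media interfaces. The memory 430 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility having a set of (at least one) program modules may be stored in the memory 430. Such program modules include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules generally perform the functions and/or methods of the embodiments described in the present invention.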
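The electronic device may also communicate with one or more external devices, with one or more devices that enable a user to interact with the electronic device, and/or with any device (such as a network card or a modem) that enables the electronic device to communicate with one or more other computing devices. Such communication may take place through the communication interface 420. In addition, the electronic device may communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter (not shown in FIG. 4), and the network adapter may communicate with other modules of the electronic device through the communication bus 440. It should be understood that, although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with the electronic device, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Drives (RAID) systems, tape drives, and data backup storage systems.
The processor 410 executes various functional applications and data processing by running the programs stored in the memory 430, for example implementing the head action recognition method provided in the embodiments of the present invention.
An embodiment of the present invention further provides a computer program product. The computer program product includes a computer program that, when executed by a processor, implements the head action recognition method provided in the embodiments of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium. The computer-readable storage medium stores computer instructions, and the computer instructions cause the computer to execute the head action recognition method provided in the embodiments of the present invention.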
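The computer-readable storage medium may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in conjunction with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that medium may send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.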
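In the description of this specification, reference to the terms “one embodiment”, “some embodiments”, “an example”, “a specific example”, or “some examples” means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, where they do not contradict each other, a person skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms “first” and “second” are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, a feature defined as “first” or “second” may explicitly or implicitly include at least one such feature. In the description of the present invention, “plurality” means at least two, for example two or three, unless otherwise clearly and specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are only schematic. The division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.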
Claims (17)
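- A head action recognition method, characterized by comprising: acquiring head posture data of a user within a first time length; determining an autocorrelation function of the head posture data, and determining an estimated frequency of the head posture data according to the autocorrelation function; and determining, based on the estimated frequency, whether the user has made a preset action.
- The method according to claim 1, characterized in that determining, based on the estimated frequency, whether the user has made a preset action comprises: determining, according to the estimated frequency, a center frequency band region that satisfies a periodic signal condition in the frequency-domain distribution of the head posture data; and determining whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region.
- The method according to claim 2, characterized in that determining, according to the estimated frequency, the center frequency band region that satisfies the periodic signal condition in the frequency-domain distribution of the head posture data comprises: determining the frequency-domain distribution of the head posture data based on a Fourier transform; determining a center frequency band of the frequency-domain distribution according to the formula C=f*L, where C is the center frequency band, f is the estimated frequency, and L is the first time length; and determining, in the frequency-domain distribution of the head posture data, a region that contains at least the center frequency band as the center frequency band region.
- The method according to claim 2, characterized in that determining whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region comprises: determining the maximum amplitude in the center frequency band region; and if the maximum amplitude is greater than a first threshold, determining that the user has made the preset action.
- The method according to claim 2, characterized in that determining whether the user has made the preset action according to at least one of the amplitude and the power ratio of the center frequency band region comprises: determining the ratio of the power of the center frequency band region to the total power of the entire frequency band; and if the power ratio is greater than a second threshold, determining that the user has made the preset action.
- The method according to claim 1, characterized in that determining the estimated frequency of the head posture data according to the autocorrelation function comprises: determining an estimated period of the head posture data according to a trough and a peak of the autocorrelation function; and determining the reciprocal of the estimated period as the estimated frequency.
- The method according to claim 6, characterized in that determining the estimated period of the head posture data according to the trough and the peak of the autocorrelation function comprises: determining the coordinates (lo, lowest) of the first trough of the autocorrelation function in the positive direction from the coordinate origin, where lo is the abscissa of the first trough and lowest is its ordinate; determining the coordinates (hi, highest) of the peak that is adjacent to the first trough and located in the positive direction from it, where hi is the abscissa of the peak and highest is its ordinate; and determining the estimated period of the head posture data according to the formula T=2*(hi-lo)*L/N, where T is the estimated period, L is the first time length, and N is the number of head posture data items to be collected within the first time length.
- The method according to claim 1, characterized in that, before determining the autocorrelation function of the head posture data and determining the estimated frequency of the head posture data according to the autocorrelation function, the method further comprises: determining a maximum value and a minimum value of the head posture data; and if the result of subtracting the minimum value from the maximum value is less than a third threshold, determining that the user has not made the preset action.
- The method according to claim 1, characterized in that, before determining, based on the estimated frequency, whether the user has made the preset action, the method further comprises: determining the abscissa of the first trough of the autocorrelation function in the positive direction from the coordinate origin; and if the abscissa of the first trough is greater than a fourth threshold, determining that the user has not made the preset action.
- The method according to claim 1, characterized in that, before determining, based on the estimated frequency, whether the user has made the preset action, the method further comprises: if no peak is found in the autocorrelation function, or the ordinates of all peaks are less than a fifth threshold, determining that the user has not made the preset action.
- The method according to claim 1, characterized in that, before determining, based on the estimated frequency, whether the user has made the preset action, the method further comprises: determining the abscissa lo of the first trough of the autocorrelation function in the positive direction from the coordinate origin; determining the abscissa hi of the peak that is adjacent to the first trough and located in the positive direction from it; calculating a first value according to the formula A=|2*lo-hi|, where A is the first value; and if the first value is greater than a sixth threshold, determining that the user has not made the preset action.
- The method according to any one of claims 1 to 11, characterized in that the head posture data comprises horizontal rotation angle data or vertical rotation angle data of the user's head.
- A head action recognition apparatus, characterized by comprising: an acquisition module, configured to acquire head posture data of a user within a first time length; and a processing module, configured to determine an autocorrelation function of the head posture data and to determine an estimated frequency of the head posture data according to the time-domain distribution of the autocorrelation function; the processing module being further configured to determine, based on the estimated frequency, whether the user has made a preset action.
- A vehicle, characterized in that the vehicle comprises an in-vehicle device that, when running, is capable of implementing the method according to any one of claims 1 to 12.
- An electronic device, characterized by comprising: at least one processor; the processor being coupled to at least one memory, wherein the memory stores program instructions executable by the processor, and the processor is capable of executing the method according to any one of claims 1 to 12 by invoking the program instructions.
- A computer program product, the computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 12.
- A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 12.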
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310064924.7 | 2023-01-13 | ||
CN202310064924.7A CN118349100A (zh) | 2023-01-13 | 2023-01-13 | Head action recognition method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024148794A1 (zh) | 2024-07-18 |
Family
ID=91819986
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/110256 WO2024148794A1 (zh) | Head action recognition method and apparatus | 2023-01-13 | 2023-07-31 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118349100A (zh) |
WO (1) | WO2024148794A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106725381A (zh) * | 2016-12-19 | 2017-05-31 | 华南农业大学 | Intelligent fitness sports wristband |
CN110084353A (zh) * | 2019-04-30 | 2019-08-02 | 深圳市大白牛文化科技有限公司 | Kowtow counter |
CN110163329A (zh) * | 2019-04-30 | 2019-08-23 | 深圳市大白牛文化科技有限公司 | Kowtow counter |
CN113449836A (zh) * | 2021-07-21 | 2021-09-28 | 温州亿通自动化设备有限公司 | Kowtow counting method and apparatus |
-
2023
- 2023-01-13 CN CN202310064924.7A patent/CN118349100A/zh active Pending
- 2023-07-31 WO PCT/CN2023/110256 patent/WO2024148794A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN118349100A (zh) | 2024-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
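CN107591151B (zh) | Far-field voice wake-up method, apparatus and terminal device | |
WO2021017329A1 (zh) | Method and apparatus for detecting driver distraction | |
WO2024148793A1 (zh) | Head gesture recognition method and apparatus | |
CN110070866A (zh) | Speech recognition method and apparatus | |
US20140168068A1 (en) | System and method for manipulating user interface using wrist angle in vehicle | |
CN109637148B (zh) | Vehicle-mounted horn monitoring system, method, storage medium and device | |
CN115720253B (zh) | Video processing method and apparatus, vehicle and storage medium | |
CN116279746A (zh) | Control method and apparatus for in-vehicle system | |
CN112083795A (zh) | Object control method and apparatus, storage medium and electronic device | |
CN113053368A (zh) | Speech enhancement method, electronic device and storage medium | |
CN108924461A (zh) | Video image processing method and apparatus | |
CN110723135A (zh) | Parking adjustment method and apparatus based on automatic parking assist system, and storage medium | |
CN114103944B (zh) | Time headway adjustment method, apparatus and device | |
WO2024148794A1 (zh) | Head action recognition method and apparatus | |
CN111985417A (zh) | Functional component recognition method, apparatus, device and storage medium | |
KR20170061453A (ko) | Driving assistance apparatus and method | |
CN109733285B (zh) | Vehicle driving state display method, device and system | |
CN117333837A (zh) | Driving safety assistance method, electronic device and storage medium | |
CN114333404A (zh) | Method and apparatus for finding a vehicle in a parking lot, vehicle and storage medium | |
CN115973194A (zh) | Intelligent vehicle control method, apparatus, device and medium | |
US11983328B2 (en) | Apparatus for recognizing gesture in vehicle and method thereof | |
CN115083404A (zh) | In-vehicle voice noise reduction method and apparatus, electronic device and storage medium | |
CN117333836A (zh) | Driving safety assistance method, electronic device and storage medium | |
CN209980327U (zh) | Intelligent dashcam based on YOLO object recognition, system and vehicle | |
CN114596862A (zh) | Method and apparatus for determining speech recognition engine, and computer device |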
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23915556; Country of ref document: EP; Kind code of ref document: A1 |