CN107170466B - Mopping sound detection method based on audio

Publication number: CN107170466B (granted); earlier published as CN107170466A
Application number: CN201710242995.6A
Original language: Chinese (zh)
Inventors: 王成 (Wang Cheng), 龙舟 (Long Zhou), 钱跃良 (Qian Yueliang), 王向东 (Wang Xiangdong), 袁静 (Yuan Jing), 李锦涛 (Li Jintao)
Original and current assignee: Institute of Computing Technology of CAS
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)

Classifications

    • G10L25/51: Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L25/66: Speech or voice analysis techniques for extracting parameters related to health condition
    • A61B5/112: Measuring movement of the body for diagnostic purposes; gait analysis
    • G10L25/03: Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L25/45: Speech or voice analysis techniques characterised by the type of analysis window


Abstract

The invention provides an audio-based method for detecting mopping (foot-dragging) sounds during walking. The method comprises the following steps: framing the collected two-channel (left-foot and right-foot) audio data to obtain corresponding audio frames; taking the feature vector extracted from each audio frame as input to a classifier to obtain the probability that the frame belongs to mopping sound and the probability that it belongs to normal footstep sound, where the classifier is obtained by training on samples comprising positive samples identifying normal footstep sounds, mopping samples identifying mopping sounds, and negative samples identifying sounds other than footsteps; and obtaining the time intervals corresponding to mopping sounds from the per-frame probabilities of mopping sound and normal footstep sound. The method accurately detects mopping sounds produced during walking and supports gait analysis, fall early warning, and similar applications.

Description

Mopping sound detection method based on audio
Technical Field
The invention relates to the field of computer applications, and in particular to a method for detecting mopping (foot-dragging) sounds from audio information.
Background
Gait analysis is a technique for obtaining and analyzing gait parameters by observing or recording the posture of the human body while walking. Gait parameters generally include spatial parameters (e.g., stride length, step length, step width), temporal parameters (e.g., cadence, walking speed), the left-right symmetry of these parameters, the stability of long-term data, and so on. Gait analysis is widely applied and studied in physical exercise, medical rehabilitation, and related fields.
In gait analysis, whether the foot drags on the ground relates to what is medically called foot clearance. A normal person's steps are relatively stable during landing and lift-off, with sufficient ground clearance during the swing phase. A patient, by contrast, raises and lands the foot with difficulty at the start and end of a step, so the foot may scrape the ground; during the swing phase, an audible mopping sound can also be produced because the foot is not raised high enough. Detecting foot clearance is therefore important for rehabilitation medicine, gait monitoring, fall early warning, and so on.
However, gait analysis in the prior art is usually based on video images, pressure sensors, electromyography, and similar modalities. These devices are intrusive for the patient, and mopping events in particular are difficult to determine directly from motion sensors. Although an audio-based footstep detection method exists (e.g., Chinese patent application No. 201610971951.2 by Wang Cheng et al., entitled "Two-channel-based footstep detection method"), it does not include a scheme for determining whether a foot drags on the floor, and the prior art offers no general, effective foot-dragging detection mechanism.
Disclosure of Invention
It is therefore an object of the present invention to overcome the above drawbacks of the prior art and to provide a method capable of accurately detecting footstep-mopping sounds from audio. The method comprises the following steps:
step 1: framing the collected two-channel (left-foot and right-foot) audio data to obtain corresponding audio frames;
step 2: taking the feature vector extracted from each audio frame as input, using a classifier to obtain the probability that the frame belongs to mopping sound and the probability that it belongs to normal footstep sound, wherein the classifier is obtained by training and the training samples comprise positive samples identifying normal footstep sounds, mopping samples identifying mopping sounds, and negative samples identifying sounds other than footsteps;
step 3: obtaining the time intervals corresponding to mopping sounds from the obtained per-frame probabilities of mopping sound and normal footstep sound.
Preferably, the positive samples include audio frames labeled heel strike and audio frames labeled forefoot strike under normal gait.
Preferably, in a known normal gait, the positive samples include three audio frames centered at each labeled heel-strike position and three audio frames centered at each labeled forefoot-strike position in the left-foot channel audio data, and three audio frames centered at each labeled heel-strike position and three audio frames centered at each labeled forefoot-strike position in the right-foot channel audio data.
Preferably, the mopping sample comprises an audio frame labeled heel strike and an audio frame labeled forefoot strike in a mopping gait.
Preferably, in a known mopping gait, the mopping samples include three audio frames centered at each labeled heel-strike position and three audio frames centered at each labeled forefoot-strike position in the left-foot channel audio data, and three audio frames centered at each labeled heel-strike position and three audio frames centered at each labeled forefoot-strike position in the right-foot channel audio data.
Preferably, the negative samples include nine audio frames between the forefoot strike of a preceding step and the heel strike of the following step in the left-foot channel audio data, and nine audio frames between the forefoot strike of a preceding step and the heel strike of the following step in the right-foot channel audio data.
Preferably, in step 2, the audio frames of the left-foot channel and their mopping-sound probabilities form the mopping-sound probability curve of the left-foot channel, and the audio frames of the right-foot channel and their mopping-sound probabilities form the mopping-sound probability curve of the right-foot channel; likewise, the audio frames of each channel and their normal-footstep-sound probabilities form the normal-footstep-sound probability curve of that channel. In step 3, the mopping-sound probability curves of the left and right channels are fused into a combined mopping-sound probability curve, the normal-footstep-sound probability curves of the left and right channels are fused into a combined normal-footstep-sound probability curve, and the time intervals corresponding to normal footstep sounds and to mopping sounds are obtained based on preset probability thresholds.
Preferably, the time interval corresponding to normal footstep sound is the interval in which the combined normal-footstep-sound probability curve is less than 0.5, and the time interval corresponding to mopping sound is the interval in which the combined mopping-sound probability curve is greater than 0.35.
Compared with the prior art, the invention can accurately determine from two-channel audio data whether an audio frame contains mopping sound and/or normal footstep sound; moreover, being based on machine learning, the method adapts to many different scenarios and is highly general.
Drawings
The invention is illustrated and described by way of example, and without limiting its scope, with reference to the following drawings, in which:
FIG. 1 shows a flow diagram of a method of training a classifier for detecting mopping sounds according to one embodiment of the invention;
FIG. 2 illustrates a flow diagram of a method of detecting mopping sounds and normal footsteps according to one embodiment of the present invention;
FIG. 3 illustrates a framing approach according to one embodiment of the present invention;
FIG. 4 illustrates an example of a binaural data annotation according to an embodiment of the invention;
FIG. 5 shows an example of detection results for mopping sound and normal footstep sound according to one embodiment of the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, the audio-based footstep-mopping detection method according to the present invention will now be described in detail with reference to the accompanying drawings.
For a clear understanding of the present invention, the following patents (or patent applications) are incorporated herein by reference in their entirety:
1. Chinese patent application No. 201610971951.2 by Wang Cheng et al., entitled "Two-channel-based footstep detection method".
2. Chinese patent application No. 201610517381.X by Wang Cheng et al., entitled "Method for establishing a gait data set and gait analysis method".
FIG. 1 shows a schematic flow diagram of an audio-based mopping detection method according to one embodiment of the present invention. The method specifically comprises the following steps:
step S110, collecting audio data
By arranging a wearable gait data acquisition device at each of the left and right feet and collecting the sound signals produced while a person walks, two-channel audio data can be obtained.
In one embodiment, the wearable gait data acquisition device comprises a microphone unit for capturing sound signals. The device consists of a left-foot and a right-foot gait data acquisition node; each node comprises a storage unit, a microprocessor, a power supply unit, a wireless transceiver unit, a signal acquisition unit, and a signal transmitter. During data collection, the signal collector (e.g., a microphone) captures sound signals and sends them to the microprocessor for processing.
In one embodiment, the two-channel data are acquired as follows. The left-foot and right-foot gait acquisition nodes are fixed on the left and right feet of the subject, respectively, and used simultaneously: the left-foot node records the left foot's audio and the right-foot node records the right foot's audio, forming two channels. Analyzing and fusing the data of both feet yields more accurate information than a single-foot measurement. The gait acquisition nodes may be worn at various locations on the shoe, such as the front, lateral, or rear side of the upper, or on the sole near the ball of the foot, the midfoot, or the heel. Preferably, the left-foot and right-foot nodes are worn at symmetric positions on the two feet.
For the specific method of collecting audio data during walking, refer to Chinese patent application CN201610517381.X, "Method for establishing a gait data set and gait analysis method".
Step S120: data slicing
The collected two-channel audio data are framed and windowed to obtain a series of audio frames. As shown in Fig. 3, at an audio sampling rate of 8000 Hz each audio frame contains 200 samples, and adjacent frames overlap by 120 samples. Because the spectral characteristics and physical parameters of audio remain essentially stationary over 10-30 ms, the window length is generally chosen in that range (200 samples at 8000 Hz correspond to 25 ms). After framing, a Hamming window is applied to each frame to reduce signal discontinuities at the frame boundaries; in effect a sliding window moves over the audio data, and the resulting windowed frames serve as the basic units of analysis in this embodiment.
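The framing scheme just described can be sketched as follows (a minimal illustration assuming a mono NumPy signal; the function name is not from the patent):

```python
import numpy as np

def frame_signal(x, frame_len=200, overlap=120):
    """Split a mono signal into overlapping, Hamming-windowed frames.
    At 8000 Hz, 200 samples = 25 ms (within the 10-30 ms range) and a
    120-sample overlap gives an 80-sample (10 ms) hop between frames."""
    hop = frame_len - overlap
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)   # tapers frame edges to reduce discontinuities
    return np.stack([x[i * hop : i * hop + frame_len] * window
                     for i in range(n_frames)])

signal = np.random.randn(8000)       # one second of audio at 8000 Hz
frames = frame_signal(signal)        # shape (98, 200)
```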
Step S130: extracting and selecting audio features
Feature extraction is performed on each audio frame to obtain its feature vector. According to one embodiment of the invention, the feature vector comprises: an autocorrelation coefficient, sub-band energy (0 to 4 kHz) features, the zero-crossing rate, linear prediction cepstral coefficients (LPCC), and Mel-frequency cepstral coefficient (MFCC) features. Table 1 shows the composition of the 36-dimensional feature vector in one embodiment: 1-dimensional autocorrelation coefficient, 10-dimensional sub-band energy features, 1-dimensional zero-crossing rate, 12-dimensional LPCC, and 12-dimensional MFCC.
TABLE 1

Feature                       Dimensions
Autocorrelation coefficient   1
Sub-band energy (0 to 4 kHz)  10
Zero-crossing rate            1
LPCC                          12
MFCC                          12
It should be understood that the dimensions of the feature vectors and the specific combination of feature vectors described above are not exclusive. In other embodiments, the feature vector may be a free combination of some or all of the above features or other feature combinations that can better characterize the information implied by the audio frame.
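Part of this feature vector can be sketched with plain NumPy (a hedged illustration: the autocorrelation coefficient, sub-band energies, and zero-crossing rate are computed directly, while the 12-dimensional MFCC and 12-dimensional LPCC features would in practice come from a speech-DSP library and are omitted here):

```python
import numpy as np

def frame_features(frame, n_bands=10):
    # lag-1 autocorrelation coefficient, normalized by frame energy
    energy = np.dot(frame, frame) + 1e-12
    autocorr = np.dot(frame[:-1], frame[1:]) / energy

    # power spectrum up to the Nyquist frequency (4 kHz at 8000 Hz sampling),
    # split into 10 roughly equal-width sub-bands
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    band_energy = np.array([b.sum() for b in np.array_split(spectrum, n_bands)])

    # zero-crossing rate: fraction of adjacent sample pairs that change sign
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2

    return np.concatenate(([autocorr], band_energy, [zcr]))

feat = frame_features(np.random.randn(200))   # 12 of the 36 dimensions
```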
Step S140: selecting training samples
A typical footstep sound has two ground contacts, the heel strike and the forefoot strike. The audio acquisition devices on both feet pick up the corresponding contact signals, but the signal is stronger on the side of the stepping foot. Therefore, during manual annotation, the positions of the two sounds of each step (the heel-strike sound and the forefoot-strike sound) are labeled in sequence on the audio channel of the corresponding foot, as shown in Fig. 4.
In this embodiment, to detect the three categories of normal footstep sound, mopping sound, and other sounds, the training samples include positive samples identifying normal footstep sounds, mopping samples identifying mopping sounds, and negative samples identifying sounds other than footsteps.
Preferably, under normal gait, 3 frames centered at each labeled position are taken as positive samples on each of the two audio channels, so that in a single channel (corresponding to the left foot or the right foot) each step yields 6 positive samples. Then, 9 consecutive frames at the midpoint between two adjacent steps (between the second sound of the preceding foot and the first sound of the following foot) are taken as negative samples, giving 18 negative samples between each pair of steps across the two channels.
It will be appreciated by those skilled in the art that other audio frames may be selected as long as they are distinguishable from the normal footstep sound.
The mopping samples can be collected by having healthy subjects simulate a mopping gait; the labeling method is the same as for normal footsteps. For example, in the mopping gait, 3 frames centered at each labeled position are taken on each of the two audio channels, so that each step corresponds to 6 mopping samples in a single channel.
In one embodiment, the collected training samples include 3264 positive audio frames, 4026 negative audio frames, and 463 mopping audio frames.
It should be understood that the number of positive samples, the number of negative samples, and the number of mopping samples per step and the number of audio frames per sample may be determined by considering the training time and the obtained model accuracy, and is not limited to the specific values listed herein.
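The sampling scheme above can be sketched as follows (the function and its arguments are illustrative, not from the patent; for simplicity the sketch takes negatives around every mid-point between labels, whereas the scheme described above takes them only between the second sound of one step and the first sound of the next):

```python
import numpy as np

def collect_samples(frames, strike_positions, half=1, neg_len=9):
    """For each annotated strike position, take the 3 frames centered on it
    (positive or mopping samples); between successive labels, take 9
    consecutive frames around the midpoint (negative samples)."""
    positives = []
    for p in strike_positions:
        positives.extend(frames[p - half : p + half + 1])   # 3 frames per label

    negatives = []
    for a, b in zip(strike_positions[:-1], strike_positions[1:]):
        mid = (a + b) // 2
        negatives.extend(frames[mid - neg_len // 2 : mid + neg_len // 2 + 1])
    return np.array(positives), np.array(negatives)

frames = np.random.randn(100, 200)        # pretend framed audio of one channel
pos, neg = collect_samples(frames, [10, 40, 80])
```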
Step S150: training classifier model
The positive, negative, and mopping samples form a sample library from which a classifier, such as a support vector machine (SVM), weighted support vector machine, extreme learning machine, or weighted extreme learning machine, can be trained by machine learning. The classifier's input is the feature vector extracted from an audio frame; its output is the probability that the frame is mopping sound, normal footstep sound, or another sound, and for each frame the three probabilities sum to 1.
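A minimal sketch of this training stage, using scikit-learn's SVC as one possible SVM implementation (the feature vectors below are random stand-ins for the 36-dimensional features of Table 1):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 36))      # 60 training frames, 36-dim feature vectors
y = np.repeat([0, 1, 2], 20)       # 0 = normal footstep, 1 = mopping, 2 = other

# probability=True enables Platt-scaled per-class probabilities
clf = SVC(probability=True, random_state=0).fit(X, y)

probs = clf.predict_proba(rng.normal(size=(5, 36)))
# Each row holds (P(normal), P(mopping), P(other)) and sums to 1.
```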
The trained classifier can then be used to detect mopping sounds. As shown in Fig. 2, in this embodiment the detection method comprises the following steps:
step S210, collecting audio data.
In this step, an audio frame of the two-channel audio data to be detected is obtained according to the method of step S110 shown in fig. 1.
Step S220: and obtaining the probability that the audio frame belongs to normal footstep sound and mopping sound by using the trained classifier.
Each audio frame of the two-channel audio data under test is classified with the trained classifier to obtain its probability of belonging to mopping sound, and a corresponding probability curve is built. The abscissa of the curve is the audio frame number (or the time the frame represents) and the ordinate is the probability that the frame belongs to mopping sound; the left-foot and right-foot audio data yield two such curves. Similarly, probability curves for normal footstep sound and for other sounds are obtained for the left and right feet. Fig. 5 shows the normal-footstep-sound and mopping-sound probability curves.
Step S230: smoothing the probability curve and identifying mopping and normal footstep sounds
In this step, the left-foot and right-foot mopping-sound probability curves, and likewise the normal-footstep-sound probability curves, are fused and smoothed. For example, in one embodiment the left and right curves are first merged by summation; then, to suppress the instability and noise in the merged curve, it is smoothed with a low-pass filter (relative cut-off frequency 0.1). The smoothed curve exhibits clear high-probability intervals: in the mopping-sound probability curve of Fig. 5, for example, intervals above 0.35 are evident. Intervals that continuously exceed a preset threshold are therefore located and judged to be mopping-sound intervals.
In another embodiment, the intervals are determined by a two-channel probability-maximization method. The channel on the side of the stepping foot generally yields a higher mopping-sound probability, so it can be relied on more heavily, while the other channel plays a complementary role. For each pair of candidate audio frames (the left- and right-channel frames at the same time position), the larger probability is selected to represent that position in the combined probability curve, yielding a curve that integrates the left- and right-foot audio data. Intervals of the combined curve that continuously exceed a preset probability threshold are then judged to be mopping-sound intervals.
The specific fusion and smoothing procedure for the probability curves is also described in patent application No. CN201610971951.2 (two-channel-based footstep detection method).
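The two fusion strategies and the subsequent smoothing and thresholding can be sketched as follows (a hedged illustration: a moving-average filter stands in for the low-pass filter with relative cut-off 0.1, the mean stands in for the summation-based merge so that the result stays a probability, and the function name is not from the patent):

```python
import numpy as np

def mop_intervals(p_left, p_right, threshold=0.35, win=15, fuse="max"):
    """Fuse the two per-channel probability curves (elementwise max, or the
    mean as a normalized stand-in for summation), smooth with a moving
    average, and return (start, end) frame intervals where the smoothed
    curve stays above the threshold."""
    fused = np.maximum(p_left, p_right) if fuse == "max" else (p_left + p_right) / 2
    smooth = np.convolve(fused, np.ones(win) / win, mode="same")

    above = smooth > threshold
    edges = np.diff(above.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    ends = np.flatnonzero(edges == -1) + 1
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        ends = np.r_[ends, len(above)]
    return list(zip(starts, ends))

p_left = np.zeros(100)
p_left[30:50] = 0.9                       # a burst of high mopping probability
ivals = mop_intervals(p_left, np.zeros(100))
```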
In tests, the inventors found that a patient's dragging steps are assigned a high mopping probability, that normal footstep sounds obtain a high probability of their own, and that positions counted as normal footstep sound can also contain mopping sound. Therefore, to distinguish the two accurately, the present invention does not count mopping sound in time intervals already judged to be normal footstep sound. For example, in one decision procedure, intervals where the normal-footstep-sound probability curve is below the preset threshold of 0.5 are regarded as normal footstep sound, while intervals where the normal-footstep-sound curve exceeds 0.5 and the mopping-sound curve exceeds 0.35 are regarded as mopping-sound intervals. In this way, mopping sounds are to some extent prevented from being masked by normal footsteps. Of course, depending on the application, normal footstep sound and mopping sound can also be counted independently: intervals where the normal-footstep-sound curve is below the preset threshold of 0.5 are regarded as normal footstep sound, and intervals where the mopping-sound curve exceeds 0.35 are regarded as mopping-sound intervals.
Fig. 5 shows the detection results for normal footstep sound and mopping sound; the abscissa is the audio frame number, and the ordinates are the normal-footstep-sound and mopping-sound probability curves. The original normal-footstep-sound probability curve is the curve after fusing the left- and right-foot audio data but before smoothing; the normal-footstep-sound and mopping-sound probability curves are those after both fusion and smoothing. In Fig. 5, the decision rule for mopping sound is probability greater than 0.35, and for normal footstep sound probability less than 0.5.
The audio-based mopping-sound detection method of the invention misses neither normal footstep sounds nor mopping sounds, and achieves high recall and accuracy.
To further verify the technical effect of the invention, the inventors ran tests with the classifier model of the invention. The test data comprise 3 healthy subjects and 2 patients with abnormal gait, each walking back and forth 4 times over a 5-meter distance. The results are shown in Table 2: no mopping was detected for the healthy subjects, while the patients exhibited frequent mopping.
TABLE 2

Subject            Healthy 1  Healthy 2  Healthy 3  Patient 1  Patient 2
Mopping detected   0          0          0          6 times    8 times
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may include, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (9)

1. An audio-based mopping sound detection method, comprising the following steps:
step 1: performing framing on the collected two-channel (left foot and right foot) audio data to obtain corresponding audio frames;
step 2: taking the feature vector extracted from each audio frame as input, using a classifier to obtain the probability that the audio frame belongs to mopping sound and the probability that it belongs to normal footstep sound; the audio frames of the left foot channel and their mopping sound probabilities form the mopping sound probability curve of the left foot channel, the audio frames of the right foot channel and their mopping sound probabilities form the mopping sound probability curve of the right foot channel, the audio frames of the left foot channel and their normal footstep sound probabilities form the normal footstep sound probability curve of the left foot channel, and the audio frames of the right foot channel and their normal footstep sound probabilities form the normal footstep sound probability curve of the right foot channel; the classifier is obtained by training, and the training samples comprise positive samples identifying normal footstep sounds, mopping samples identifying mopping sounds, and negative samples identifying sounds other than footsteps;
step 3: fusing the mopping sound probability curves of the left and right foot channels into an integrated mopping sound probability curve, fusing the normal footstep sound probability curves of the left and right foot channels into an integrated normal footstep sound probability curve, and obtaining the time interval corresponding to normal footstep sound and the time interval corresponding to mopping sound based on preset probability thresholds.
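Outside the claim language itself, the three-step pipeline of claim 1 can be sketched as follows. This is an illustrative reading, not the patented implementation: the frame length, hop size, and per-frame maximum as the channel-fusion rule are all assumptions (the claim does not fix a fusion rule), and `classify` is a stand-in for the trained classifier of step 2.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a mono signal into overlapping frames (step 1)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def mop_probability_curves(left, right, classify, frame_len=1024, hop=512):
    """Steps 2-3: per-frame class probabilities for each foot channel,
    fused into integrated mopping / normal-footstep probability curves.
    `classify(frame)` stands in for the trained classifier and returns
    (p_mopping, p_normal) for one frame's feature vector."""
    curves = {}
    for name, chan in (("left", left), ("right", right)):
        probs = np.array([classify(f) for f in frame_signal(chan, frame_len, hop)])
        curves[name] = probs  # column 0: mopping, column 1: normal footstep
    # Fuse the two channels; a per-frame maximum is one plausible choice,
    # since either foot can produce the detected sound.
    fused = np.maximum(curves["left"], curves["right"])
    return fused[:, 0], fused[:, 1]  # (mopping curve, normal-footstep curve)
```

In practice the feature extraction (e.g. MFCCs) would run between framing and classification; it is omitted here so the sketch stays focused on the probability-curve construction.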
2. The method of claim 1, wherein the positive samples comprise audio frames labeled heel strike and audio frames labeled forefoot strike in a normal gait.
3. The method of claim 2, wherein the positive samples comprise, in the left foot channel audio data of a known normal gait, three audio frames centered at each location labeled heel strike and three audio frames centered at each location labeled forefoot strike, and, in the right foot channel audio data of a known normal gait, three audio frames centered at each location labeled heel strike and three audio frames centered at each location labeled forefoot strike.
4. The method of claim 1, wherein the mopping samples comprise audio frames labeled heel strike and audio frames labeled forefoot strike in a mopping gait.
5. The method of claim 4, wherein the mopping samples comprise, in the left foot channel audio data of a known mopping gait, three audio frames centered at each location labeled heel strike and three audio frames centered at each location labeled forefoot strike, and, in the right foot channel audio data of a known mopping gait, three audio frames centered at each location labeled heel strike and three audio frames centered at each location labeled forefoot strike.
6. The method of claim 1, wherein the negative samples comprise the nine audio frames between the forefoot strike of a preceding step and the heel strike of a succeeding step in the left foot channel audio data, and the nine audio frames between the forefoot strike of a preceding step and the heel strike of a succeeding step in the right foot channel audio data.
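The training-sample construction of claims 2 through 6 can be sketched for one channel as follows. The function names and the frame-index representation are illustrative; in particular, centering the nine negative frames on the midpoint between a forefoot strike and the next heel strike is an assumption, since the claim only says the frames lie between the two events.

```python
def frames_around(center, k=1):
    """k frames on either side of a labeled strike frame -> 2k+1 frames
    (three frames total for k=1, as in claims 3 and 5)."""
    return list(range(center - k, center + k + 1))

def build_sample_indices(heel_idx, forefoot_idx):
    """Collect training-frame indices for one channel of one gait class:
    three frames centered on every labeled heel strike and forefoot
    strike (positive or mopping samples, claims 3 and 5), and nine
    frames between each forefoot strike and the following heel strike
    (negative samples, claim 6)."""
    positive = []
    for c in heel_idx + forefoot_idx:
        positive += frames_around(c, k=1)
    negative = []
    for ff, hs in zip(forefoot_idx, heel_idx[1:]):
        mid = (ff + hs) // 2
        negative += frames_around(mid, k=4)  # nine frames between steps
    return positive, negative
```

Running the same construction on left- and right-channel label sets yields the per-channel samples the claims enumerate.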
7. The method according to any one of claims 1 to 6, wherein the time interval corresponding to normal footstep sound is a time interval in which the integrated normal footstep sound probability curve is greater than 0.5, and the time interval corresponding to mopping sound is a time interval in which the integrated mopping sound probability curve is greater than 0.35.
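The thresholding of claim 7 reduces to scanning the integrated probability curve for runs above a threshold (0.5 for normal footstep sound, 0.35 for mopping sound) and converting frame indices back to time. A minimal sketch, assuming a fixed hop size and sample rate that are not specified in the claims:

```python
def probability_intervals(curve, threshold, hop, sr):
    """Return (start, end) times in seconds of every run where the
    integrated probability curve exceeds `threshold`."""
    intervals, start = [], None
    for i, p in enumerate(curve):
        if p > threshold and start is None:
            start = i                                  # run begins
        elif p <= threshold and start is not None:
            intervals.append((start * hop / sr, i * hop / sr))
            start = None                               # run ends
    if start is not None:                              # run reaches the end
        intervals.append((start * hop / sr, len(curve) * hop / sr))
    return intervals
```

Applying it twice, once per curve with its own threshold, yields the normal-footstep and mopping-sound time intervals of step 3.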
8. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
9. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method of any one of claims 1 to 7.
CN201710242995.6A 2017-04-14 2017-04-14 Mopping sound detection method based on audio Active CN107170466B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710242995.6A CN107170466B (en) 2017-04-14 2017-04-14 Mopping sound detection method based on audio


Publications (2)

Publication Number Publication Date
CN107170466A CN107170466A (en) 2017-09-15
CN107170466B true CN107170466B (en) 2020-12-29

Family

ID=59849006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710242995.6A Active CN107170466B (en) 2017-04-14 2017-04-14 Mopping sound detection method based on audio

Country Status (1)

Country Link
CN (1) CN107170466B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110364141B (en) * 2019-06-04 2021-09-28 杭州电子科技大学 Elevator typical abnormal sound alarm method based on depth single classifier
GB2607561B (en) * 2021-04-08 2023-07-19 Miicare Ltd Mobility analysis
CN113679300A (en) * 2021-09-13 2021-11-23 李侃 Electronic book recommendation method and device for floor sweeping robot
CN114550395B (en) * 2022-04-28 2022-07-19 常州分音塔科技有限公司 Sound alarm detection method and device

Citations (6)

Publication number Priority date Publication date Assignee Title
JP2002197437A (en) * 2000-12-27 2002-07-12 Sony Corp Walking detection system, walking detector, device and walking detecting method
CN101393660A (en) * 2008-10-15 2009-03-25 中山大学 Intelligent gate inhibition system based on footstep recognition
CN102799899A (en) * 2012-06-29 2012-11-28 北京理工大学 Special audio event layered and generalized identification method based on SVM (Support Vector Machine) and GMM (Gaussian Mixture Model)
CN104200815A (en) * 2014-07-16 2014-12-10 电子科技大学 Audio noise real-time detection method based on correlation analysis
CN106175778A (en) * 2016-07-04 2016-12-07 中国科学院计算技术研究所 A kind of method setting up gait data collection and gait analysis method
CN106531186A (en) * 2016-10-28 2017-03-22 中国科学院计算技术研究所 Footstep detecting method according to acceleration and audio information

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US7190775B2 (en) * 2003-10-29 2007-03-13 Broadcom Corporation High quality audio conferencing with adaptive beamforming
CN103730124A (en) * 2013-12-31 2014-04-16 上海交通大学无锡研究院 Noise robustness endpoint detection method based on likelihood ratio test
CN105215542B (en) * 2015-10-14 2017-10-10 西北工业大学 Underwater Acoustic channels method in friction welding (FW) welding process
CN106166071B (en) * 2016-07-04 2018-11-30 中国科学院计算技术研究所 A kind of acquisition method and equipment of gait parameter



Similar Documents

Publication Publication Date Title
CN107170466B (en) Mopping sound detection method based on audio
Dey et al. InstaBP: cuff-less blood pressure monitoring on smartphone using single PPG sensor
US11033205B2 (en) System and method for analyzing gait and postural balance of a person
Yancheva et al. Using linguistic features longitudinally to predict clinical scores for Alzheimer’s disease and related dementias
CN106653058B (en) Dual-track-based step detection method
Altaf et al. Acoustic gaits: Gait analysis with footstep sounds
JP2017504440A (en) Improved gait detection in user movement measurements
CN106531186B (en) Merge the step detection method of acceleration and audio-frequency information
CN108742637B (en) Body state detection method and detection system based on gait recognition device
Apte et al. A sensor fusion approach to the estimation of instantaneous velocity using single wearable sensor during sprint
Salvi et al. An optimised algorithm for accurate steps counting from smart-phone accelerometry
Kong et al. Comparison of gait event detection from shanks and feet in single-task and multi-task walking of healthy older adults
US10426426B2 (en) Methods and apparatus for performing dynamic respiratory classification and tracking
JP6479447B2 (en) Walking state determination method, walking state determination device, program, and storage medium
Wang et al. DeepDDK: A deep learning based oral-diadochokinesis analysis software
Aubol et al. Tibial acceleration reliability and minimal detectable difference during overground and treadmill running
JP5485924B2 (en) Walking sound analyzer, method, and program
US11918346B2 (en) Methods and systems for pulmonary condition assessment
Boutaayamou et al. Validated extraction of gait events from 3D accelerometer recordings
JP7489729B2 (en) Method for preventing falls and device for carrying out such method
EP3991157B1 (en) Evaluating movement of a subject
Zaeni et al. Classification of the Stride Length based on IMU Sensor using the Decision Tree
Zaeni et al. Detection of the Imbalance Step Length using the Decision Tree
Summoogum et al. Passive Tracking of Gait Biomarkers in Older Adults: Feasibility of an Acoustic Based Approach for Non-Intrusive gait Analysis
Ismail Gait and postural sway analysis, A multi-modal system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20170915

Assignee: Beijing Zhongke Huicheng Technology Co., Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2018110000005

Denomination of invention: Shuffling sound detection method based on voice frequency

License type: Common License

Record date: 20180222

EC01 Cancellation of recordation of patent licensing contract

Assignee: Beijing Zhongke Huicheng Technology Co., Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2018110000005

Date of cancellation: 20180309

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20170915

Assignee: Luoyang Zhongke Huicheng Technology Co., Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2018110000009

Denomination of invention: Shuffling sound detection method based on voice frequency

License type: Common License

Record date: 20180319

GR01 Patent grant