WO2022195657A1 - Muscle sound extraction method, muscle sound extraction device, and program - Google Patents


Info

Publication number
WO2022195657A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
muscle
time width
extraction
abnormal
Prior art date
Application number
PCT/JP2021/010332
Other languages
French (fr)
Japanese (ja)
Inventor
有信 新島
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社 (Nippon Telegraph and Telephone Corporation)
Priority to PCT/JP2021/010332
Publication of WO2022195657A1

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B7/00: Instruments for auscultation
    • A61B7/02: Stethoscopes
    • A61B7/04: Electric stethoscopes

Definitions

  • FIG. 5 shows the result of applying the method to a measurement of only forearm muscle sound, and FIG. 6 shows the result of applying it to a measurement of only microphone contact noise.
  • As shown in FIG. 5, the muscle sound signal is hardly attenuated, while, as shown in FIG. 6, the microphone contact noise is almost entirely attenuated. The muscle sound extraction device 10 according to this embodiment can therefore attenuate only the noise while leaving the muscle sound nearly intact, which makes it possible to extract muscle sound with a high SN ratio.

Abstract

In the muscle sound extraction method according to an embodiment, a computer executes: a measurement procedure for measuring a sound that contains muscle sounds; a feature extraction procedure for extracting a feature amount from the sound for each prescribed time width; a determination procedure for determining, for each time width, whether the sound in that time width is an abnormal sound, using an abnormal sound detection model learned in advance and the feature amount of that time width; and a correction procedure for applying, to sound determined to be abnormal, a correction that attenuates the sound.

Description

Muscle sound extraction method, muscle sound extraction device, and program
The present invention relates to a muscle sound extraction method, a muscle sound extraction device, and a program.
Muscle sounds, which are minute vibrations generated when muscles contract, can be measured easily with a condenser microphone or the like, as described in Non-Patent Document 1, for example. In this case, a signal with a high SN ratio (signal-to-noise ratio) can be obtained by measuring muscle sounds in an environment free of noise, such as a laboratory.
However, when muscle sounds are measured with a microphone in a general environment outside the laboratory, various noises may be mixed in, such as environmental sounds, conversational voices, and microphone contact noise. Because muscle sounds occupy a low frequency band, the signal can be extracted with a bandpass filter of about 1-250 Hz; however, since such noise also contains frequency components in the same band, applying a filter alone also extracts signals other than muscle sounds, which lowers the SN ratio.
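As a reference point, the conventional bandpass filtering described above can be sketched as follows. This is an illustrative SciPy implementation rather than code from the patent; the sampling rate, filter order, and function name are assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_muscle_sound(signal, fs, low=1.0, high=250.0, order=4):
    """Keep only the 1-250 Hz band in which muscle sounds are concentrated."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)  # zero-phase filtering

# Example: a 50 Hz muscle-like component passes through,
# while a 1 kHz noise component is strongly attenuated.
fs = 4000  # sampling rate in Hz (an assumption for illustration)
t = np.arange(0, 1.0, 1.0 / fs)
muscle_like = np.sin(2 * np.pi * 50 * t)
noisy = muscle_like + 0.5 * np.sin(2 * np.pi * 1000 * t)
filtered = bandpass_muscle_sound(noisy, fs)
```

As the text notes, such a filter cannot reject noise whose frequency content overlaps the 1-250 Hz band, which is what motivates the model-based approach below.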
An embodiment of the present invention has been made in view of the above points, and aims to enable muscle sound extraction with a high SN ratio.
To achieve this object, in a muscle sound extraction method according to an embodiment, a computer executes: a measurement procedure for measuring a sound that contains muscle sounds; a feature extraction procedure for extracting a feature amount from the sound for each predetermined time width; a determination procedure for determining, for each time width, whether the sound in that time width is an abnormal sound, using an abnormal sound detection model learned in advance and the feature amount of that time width; and a correction procedure for applying, to sound determined to be abnormal, a correction that attenuates the sound.
This makes it possible to extract muscle sounds with a high SN ratio.
FIG. 1 is a diagram showing an example of the hardware configuration of the muscle sound extraction device according to this embodiment. FIG. 2 is a diagram showing an example of its functional configuration. FIG. 3 is a flowchart showing an example of the model learning process according to this embodiment. FIG. 4 is a flowchart showing an example of the muscle sound extraction process according to this embodiment. FIG. 5 is a diagram showing an example of the application results for muscle sounds. FIG. 6 is a diagram showing an example of the application results for microphone contact noise.
An embodiment of the present invention will be described below. This embodiment describes a muscle sound extraction device 10 that enables muscle sound extraction with a high SN ratio even in a general environment where noise may be mixed in. The muscle sound extraction device 10 according to this embodiment applies a machine learning algorithm for abnormal sound detection to sound that contains muscle sounds, for each predetermined window width, and determines whether the sound in that window is muscle sound or noise. The amplitude of sound determined to be noise is then corrected and reduced, which makes it possible to extract muscle sounds with a high SN ratio.
The muscle sound extraction device 10 according to this embodiment operates in two phases: model learning, in which a model for detecting abnormal sounds (hereinafter, the "abnormal sound detection model") is learned by a machine learning algorithm, and muscle sound extraction, in which muscle sounds with a high SN ratio are extracted using this abnormal sound detection model. The following description assumes that the same muscle sound extraction device 10 performs both model learning and muscle sound extraction, but they may be performed by different devices; in that case, the device that performs model learning may be called a "learning device", a "model learning device", or the like.
<Hardware Configuration of Muscle Sound Extraction Device 10>
First, the hardware configuration of the muscle sound extraction device 10 according to this embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of the hardware configuration of the muscle sound extraction device 10 according to this embodiment.
As shown in FIG. 1, the muscle sound extraction device 10 according to this embodiment includes an input device 101, a display device 102, an external I/F 103, a communication I/F 104, a processor 105, and a memory device 106. These pieces of hardware are communicably connected to one another via a bus 107.
The input device 101 is, for example, a keyboard, a mouse, or a touch panel. The display device 102 is, for example, a display. The muscle sound extraction device 10 need not have one or both of the input device 101 and the display device 102.
The external I/F 103 is an interface to various external devices such as a microphone 108. Via the external I/F 103, the muscle sound extraction device 10 can measure sound as a signal with the microphone 108. Examples of external devices other than the microphone 108 include various recording media, such as a CD (Compact Disc), a DVD (Digital Versatile Disk), an SD (Secure Digital) memory card, and a USB (Universal Serial Bus) memory card.
The communication I/F 104 is an interface for connecting the muscle sound extraction device 10 to a communication network. The processor 105 is, for example, an arithmetic unit such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit). The memory device 106 is, for example, a storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), RAM (Random Access Memory), ROM (Read Only Memory), or flash memory.
With the hardware configuration shown in FIG. 1, the muscle sound extraction device 10 according to this embodiment can realize the various types of processing described later. The hardware configuration shown in FIG. 1 is merely an example, and the muscle sound extraction device 10 may have a different configuration; for example, it may have multiple processors 105 or multiple memory devices 106.
<Functional Configuration of Muscle Sound Extraction Device 10>
Next, the functional configuration of the muscle sound extraction device 10 according to this embodiment will be described with reference to FIG. 2. FIG. 2 is a diagram showing an example of the functional configuration of the muscle sound extraction device 10 according to this embodiment.
As shown in FIG. 2, the muscle sound extraction device 10 according to this embodiment has a measurement unit 201, a feature extraction unit 202, a model learning unit 203, an abnormal sound detection unit 204, a sound correction unit 205, and an output unit 206. These units are realized, for example, by processing that one or more programs installed in the muscle sound extraction device 10 cause the processor 105 to execute.
The muscle sound extraction device 10 according to this embodiment also has a model storage unit 207. The model storage unit 207 is realized by, for example, the memory device 106, but it may instead be realized by, for example, a database server connected to the muscle sound extraction device 10 via a communication network.
The measurement unit 201 measures sound as a signal using the microphone 108. The feature extraction unit 202 extracts feature amounts from the signal measured by the measurement unit 201. During model learning, the model learning unit 203 learns the parameters of the abnormal sound detection model using the feature amounts extracted by the feature extraction unit 202; the learned parameters (hereinafter, "learned model parameters") are stored in the model storage unit 207. During muscle sound extraction, the abnormal sound detection unit 204 uses the learned model parameters stored in the model storage unit 207 and the feature amounts extracted by the feature extraction unit 202 to determine, for each window, whether the sound in that window is muscle sound or noise (abnormal sound). The sound correction unit 205 corrects the amplitude of sound determined to be noise by the abnormal sound detection unit 204. The output unit 206 outputs the signal of the sound in each window (the corrected sound, if corrected by the sound correction unit 205) to a predetermined output destination. This output destination can be any predetermined destination: for example, the sound signal may be stored in the memory device 106, transmitted to another device connected via a communication network, displayed on the display device 102 as a waveform, or output as sound waves from a speaker.
<Model Learning Process>
Next, the model learning process for learning the abnormal sound detection model will be described with reference to FIG. 3. FIG. 3 is a flowchart showing an example of the model learning process according to this embodiment.
First, the measurement unit 201 measures muscle sound as a signal using the microphone 108 (step S101). In step S101, the muscle sound is measured in an environment free of noise, such as a laboratory.
Next, the feature extraction unit 202 extracts feature amounts from the signal measured in step S101, for each predetermined window width (step S102). Specifically, the feature extraction unit 202 applies a Hanning window for each predetermined window width (for example, a time width such as 500 ms) and then performs a fast Fourier transform (FFT) on each window. For each window, it then forms a feature vector whose elements are the powers of the respective frequency bands, each normalized to a value between 0 and 1. For example, let the number of windows be N, the frequency bands be f_1, ..., f_M, and the power of frequency band f_m (1 ≤ m ≤ M) in the n-th window (1 ≤ n ≤ N) be f_m^(n); then the feature amounts (f_1^(1), ..., f_M^(1)), ..., (f_1^(N), ..., f_M^(N)) are obtained for the N windows. Since these N feature amounts represent the characteristics of muscle sound, they can be regarded as positive training data.
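The windowing and feature extraction of step S102 can be sketched as follows, assuming a NumPy implementation. The 500 ms window width comes from the text; the per-window min-max scaling is an assumption, since the text only states that band powers are normalized to between 0 and 1, and the function name is illustrative.

```python
import numpy as np

def extract_features(signal, fs, window_ms=500):
    """Split the signal into fixed windows, apply a Hanning window,
    FFT each window, and normalize the band powers to [0, 1].

    Returns an (N, M) array: one M-dimensional feature vector per window.
    """
    win_len = int(fs * window_ms / 1000)
    n_windows = len(signal) // win_len
    hann = np.hanning(win_len)
    features = []
    for n in range(n_windows):
        frame = signal[n * win_len:(n + 1) * win_len] * hann
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Normalization scheme is not specified in the text;
        # per-window min-max scaling to [0, 1] is assumed here.
        span = power.max() - power.min()
        features.append((power - power.min()) / span if span > 0 else power * 0)
    return np.array(features)
```

The same routine serves both phases: during model learning it is applied to clean muscle sound, and during extraction it is applied to sound that may contain noise.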
The model learning unit 203 then learns the parameters of the abnormal sound detection model by unsupervised learning, using the feature amounts extracted in step S102 (that is, the positive training data) (step S103). As the abnormal sound detection model, a machine learning model trained by an unsupervised learning method, such as a One-class SVM, may be used. The learned model parameters of the abnormal sound detection model are stored in the model storage unit 207.
As described above, during model learning the muscle sound extraction device 10 measures only muscle sound and then learns the abnormal sound detection model by an unsupervised learning method, using the feature amounts extracted from that muscle sound.
<Muscle Sound Extraction Process>
Next, the muscle sound extraction process, which extracts muscle sound using the abnormal sound detection model learned in the model learning process, will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the muscle sound extraction process according to this embodiment. It is assumed that the learned model parameters are stored in the model storage unit 207.
First, the measurement unit 201 measures sound that contains muscle sound as a signal, using the microphone 108 (step S201). Step S201 corresponds, for example, to measuring muscle sound in a general environment, including environments where noise may be mixed in.
Next, the feature extraction unit 202 extracts feature amounts from the signal measured in step S201, for each predetermined window width (step S202). It does this in the same way as in step S102 of FIG. 3: it applies a Hanning window for each predetermined window width (for example, a time width such as 500 ms), performs a fast Fourier transform on each window, and forms, for each window, a feature vector whose elements are the powers of the respective frequency bands normalized to values between 0 and 1. For example, let the number of windows be N', the frequency bands be f_1, ..., f_M, and the power of frequency band f_m (1 ≤ m ≤ M) in the n-th window (1 ≤ n ≤ N') be f_m^(n); then the feature amounts (f_1^(1), ..., f_M^(1)), ..., (f_1^(N'), ..., f_M^(N')) are obtained for the N' windows.
Steps S203 to S206 below are executed repeatedly for each window. The following describes steps S203 to S206 for a single window.
The abnormal sound detection unit 204 determines whether the sound in the window is a muscle sound or an abnormal sound (that is, noise), using the abnormal sound detection model configured with the learned model parameters stored in the model storage unit 207 together with the feature value of the window (step S203).
Then, if the sound in the window is determined to be an abnormal sound in step S203 (NO in step S204), the sound correction unit 205 corrects the amplitude of the sound in the window to 0 (step S205). On the other hand, if the sound in the window is determined to be a muscle sound in step S203 (YES in step S204), nothing is done (that is, no correction is applied to the sound in the window).
Then, the output unit 206 outputs the signal of the sound in the window (the corrected sound, if it was corrected in step S205) to a predetermined output destination (step S206).
As described above, during muscle sound extraction the muscle sound extraction device 10 uses the feature values extracted from the sound including muscle sound to determine, for each predetermined window width, whether the sound is muscle sound or noise with the abnormal sound detection model. If the sound is determined to be noise, the amplitude of the sound in that window is corrected to 0. This removes (attenuates) the noise from the sound including muscle sound, enabling extraction of muscle sound with a high SN ratio.
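The per-window judge-and-correct loop (steps S203 to S205) can be sketched as follows. The detector is left abstract here because the embodiment does not fix a particular model; any one-class model with a scikit-learn-style predict() returning +1 for normal and -1 for abnormal (for example, a one-class SVM trained on muscle-sound features) would fit, and that interface is an assumption of this sketch.

```python
import numpy as np

def suppress_abnormal_windows(signal, features, model, win_len):
    """Per-window judgment and correction: ask the abnormal sound detection
    model whether each window's feature vector looks like muscle sound (+1)
    or an abnormal sound (-1), and set the amplitude of windows judged
    abnormal to 0 (steps S203 to S205)."""
    out = signal.astype(float).copy()
    labels = model.predict(features)         # one label per window
    for n, label in enumerate(labels):
        if label == -1:                      # abnormal sound -> zero the window
            out[n * win_len:(n + 1) * win_len] = 0.0
    return out                               # corrected signal for output (step S206)
```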
<Effects>
As an example, the results of applying the muscle sound extraction device 10 to sound measured for 10 seconds at a 48 kHz sampling rate with a smartphone microphone will be described. FIG. 5 shows the result of measuring only the muscle sound of the forearm, and FIG. 6 shows the result of measuring only the contact noise of the microphone.
As shown in FIG. 5, the muscle sound signal is hardly attenuated at all. On the other hand, as shown in FIG. 6, the microphone contact noise is almost entirely attenuated. Therefore, the muscle sound extraction device 10 according to the present embodiment can attenuate only the noise while hardly attenuating the muscle sound, enabling extraction of muscle sound with a high SN ratio.
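The attenuation behavior in FIGS. 5 and 6 can be quantified with a simple retained-energy ratio; this metric is an illustration, not one used in the document. It is close to 1 when muscle sound passes through unattenuated and close to 0 when noise windows are zeroed out.

```python
import numpy as np

def retained_energy_ratio(original, corrected):
    """Fraction of the original signal energy that survives the correction:
    near 1 for muscle sound (FIG. 5), near 0 for pure contact noise (FIG. 6)."""
    denom = float(np.sum(original ** 2))
    return float(np.sum(corrected ** 2)) / denom if denom > 0 else 0.0
```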
The present invention is not limited to the specifically disclosed embodiments described above, and various modifications, alterations, combinations with known techniques, and the like are possible without departing from the scope of the claims.
10 muscle sound extraction device
101 input device
102 display device
103 external I/F
104 communication I/F
105 processor
106 memory device
107 bus
108 microphone
201 measurement unit
202 feature extraction unit
203 model learning unit
204 abnormal sound detection unit
205 sound correction unit
206 output unit
207 model storage unit

Claims (6)

  1.  A muscle sound extraction method in which a computer executes:
      a measurement procedure of measuring a sound containing muscle sound;
      a feature extraction procedure of extracting a feature value from the sound for each predetermined time width;
      a determination procedure of determining, for each time width, whether or not the sound in the time width is an abnormal sound, using an abnormal sound detection model learned in advance and the feature value of the time width; and
      a correction procedure of performing a correction for attenuating a sound determined to be the abnormal sound.
  2.  The muscle sound extraction method according to claim 1, wherein the abnormal sound detection model is a machine learning model trained by an unsupervised learning method using feature values extracted only from muscle sounds as positive examples.
  3.  The muscle sound extraction method according to claim 1 or 2, wherein the feature extraction procedure applies a fast Fourier transform to the sound for each time width and extracts the feature value as a vector whose elements are values obtained by normalizing the power of each frequency band.
  4.  The muscle sound extraction method according to any one of claims 1 to 3, wherein the correction procedure performs a correction that sets the amplitude of a sound determined to be the abnormal sound to 0.
  5.  A muscle sound extraction device comprising:
      a measurement unit that measures a sound containing muscle sound;
      a feature extraction unit that extracts a feature value from the sound for each predetermined time width;
      a determination unit that determines, for each time width, whether or not the sound in the time width is an abnormal sound, using an abnormal sound detection model learned in advance and the feature value of the time width; and
      a correction unit that performs a correction for attenuating a sound determined to be the abnormal sound.
  6.  A program that causes a computer to execute the muscle sound extraction method according to any one of claims 1 to 4.
PCT/JP2021/010332 2021-03-15 2021-03-15 Muscle sound extraction method, muscle sound extraction device, and program WO2022195657A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/010332 WO2022195657A1 (en) 2021-03-15 2021-03-15 Muscle sound extraction method, muscle sound extraction device, and program


Publications (1)

Publication Number Publication Date
WO2022195657A1 true WO2022195657A1 (en) 2022-09-22

Family

ID=83320078



Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002224070A (en) * 2001-01-30 2002-08-13 National Institute Of Advanced Industrial & Technology Myoelectric characteristic pattern discriminating method and device
WO2011007569A1 (en) * 2009-07-15 2011-01-20 国立大学法人筑波大学 Classification estimating system and classification estimating program
US20150178631A1 (en) * 2013-09-04 2015-06-25 Neural Id Llc Pattern recognition system
US20170049400A1 (en) * 2014-02-19 2017-02-23 Institut National De La Recherche Scientifique (Inrs) Method and system for evaluating a noise level of a biosignal
US20190156200A1 (en) * 2017-11-17 2019-05-23 Aivitae LLC System and method for anomaly detection via a multi-prediction-model architecture
JP2020058806A (en) * 2018-10-12 2020-04-16 デピュイ・シンセス・プロダクツ・インコーポレイテッド Neuromuscular sensing device having multi-sensing array
CN111110268A (en) * 2019-11-26 2020-05-08 中国科学院合肥物质科学研究院 Human body muscle sound signal prediction method based on random vector function connection network technology



Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21931413; Country of ref document: EP; Kind code of ref document: A1)
122 Ep: PCT application non-entry in European phase (Ref document number: 21931413; Country of ref document: EP; Kind code of ref document: A1)