JP2023091528A

JP2023091528A - Signal processing apparatus and signal processing method

Info

Publication number: JP2023091528A
Application number: JP2021206313A
Authority: JP
Inventors: 太郎笠原; Taro Kasahara; 光渡部; Hikaru Watabe; 洋志吉越; Hiroshi Yoshikoshi
Original assignee: Ono Sokki Co Ltd
Current assignee: Ono Sokki Co Ltd
Priority date: 2021-12-20
Filing date: 2021-12-20
Publication date: 2023-06-30

Abstract

PROBLEM TO BE SOLVED: To perform learning so as to properly separate an input physical quantity into a noise component and an extraction signal. SOLUTION: A signal processing apparatus 10 includes: a noise-reduced input physical quantity generation unit 15 which generates a noise-reduced input physical quantity 90A obtained by reducing a noise component (noise 91b) included in an input physical quantity (engine peripheral sound 90); a selection unit 16 which selects one or both of the input physical quantity and the noise-reduced input physical quantity; and a learning unit 21 which learns a weight of a neural network for generating a noise reducing mask α for reducing the noise component, from the input physical quantity and/or the noise-reduced input physical quantity selected by the selection unit. SELECTED DRAWING: Figure 3A

Description

The present invention relates to a signal processing apparatus and a signal processing method that perform learning for satisfactorily separating an input physical quantity into a noise component and an extracted signal from which the noise component has been removed.

For example, the ignition timing of an internal combustion engine such as a gasoline engine is generally advanced as far as possible within the range of crank angles in which knocking does not occur, in order to improve output torque. Therefore, in the process of adjusting the ignition timing, a tester or a knocking determination device determines whether knocking is occurring. An example of such a knocking determination device is described in Patent Document 1.

The knocking determination device described in Patent Document 1 selects the target signal, which is compared with the determination signal for which the presence or absence of knocking is determined, based on a condition defined by its relationship to the determination signal (for example, a temporal relationship or a relationship in operating conditions).

The knocking determination device described in Patent Document 1 can present only the result of determining the presence or absence of knocking to the outside, which makes it difficult to corroborate that result. The inventors of the present invention therefore proposed the device described in Patent Document 2 or Patent Document 3 as a device capable of estimating a desired physical quantity that supports the determination result.

The device described in Patent Document 2 is "a learning device for removing a noise component contained in an input physical quantity and estimating a desired physical quantity from the input physical quantity from which the noise component has been removed, wherein the noise component is noise other than the knocking sound generated in an internal combustion engine, the input physical quantity is the sound pressure of the internal combustion engine, which contains the noise and the knocking sound, and the desired physical quantity is the in-cylinder pressure of the internal combustion engine when knocking occurs, the learning device comprising a learning unit that learns, by means of a neural network, the weights of a neural network that generates a mask for removing the noise and extracting the knocking sound, and/or a transfer function that converts the knocking sound extracted by the mask into the in-cylinder pressure of the internal combustion engine when the knocking occurs."

The device described in Patent Document 3 is "a knocking determination device that removes noise other than the knocking sound generated in an internal combustion engine and estimates the knocking sound from which the noise has been removed, the knocking determination device comprising: a learning unit that learns, by means of a neural network, the weights of a neural network that generates a mask for removing the noise and extracting the knocking sound, and a transfer function that converts the knocking sound extracted by the mask into the in-cylinder pressure of the internal combustion engine when knocking occurs; and a second estimation unit that estimates, by means of a neural network and using the mask, the knocking sound from which the noise has been removed from the knocking sound containing the noise."

Patent Document 1: JP 2017-44148 A
Patent Document 2: Japanese Patent No. 6605170
Patent Document 3: Japanese Patent No. 6651040

However, with the conventional techniques described in Patent Document 2 and Patent Document 3, when the amount of the noise component (noise amount) contained in the input physical quantity is large (in particular, when the sound pressure of the noise is relatively large compared with the signal to be extracted from the input physical quantity), the noise component after separation sometimes contained a component related to the teacher signal. As a result, the input physical quantity could not always be separated satisfactorily into the noise component and the extracted signal.

The present invention has been made to solve the above problem, and its main object is to provide a signal processing apparatus and a signal processing method that perform learning for satisfactorily separating an input physical quantity into a noise component and an extracted signal.

To solve the above problem, the present invention provides a signal processing apparatus comprising: a noise-reduced input physical quantity generation unit that generates a noise-reduced input physical quantity in which a noise component contained in an input physical quantity is reduced; a selection unit that selects one or both of the input physical quantity and the noise-reduced input physical quantity; and a learning unit that learns the weights of a neural network that generates a noise removal mask for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit has selected.

The present invention also provides a signal processing method comprising: a noise-reduced input physical quantity generation step of generating a noise-reduced input physical quantity in which a noise component contained in an input physical quantity is reduced; a selection step of selecting one or both of the input physical quantity and the noise-reduced input physical quantity; and a learning step of learning the weights of a neural network that generates a noise removal mask for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity was selected in the selection step.
Other means will be described later.

According to the present invention, learning can be performed for satisfactorily separating an input physical quantity into a noise component and an extracted signal.

The drawings are listed below in their order of appearance; unless otherwise noted, "the signal processing apparatus" means the signal processing apparatus (estimation device) according to the first embodiment.
A block diagram showing the overall configuration of a signal processing system including the signal processing apparatus.
A block diagram showing the configuration of the signal processing apparatus.
An explanatory diagram of the learning mode.
An explanatory diagram of the operation of the signal processing apparatus in the learning mode.
An explanatory diagram of the threshold calculation mode.
An explanatory diagram of the operation of the signal processing apparatus in the threshold calculation mode.
An explanatory diagram of the determination mode.
An explanatory diagram of the operation of the signal processing apparatus in the determination mode.
An explanatory diagram of the separation mode.
An explanatory diagram of the operation of the signal processing apparatus in the separation mode.
An explanatory diagram of the sensory test mode.
An explanatory diagram of the operation of the signal processing apparatus in the sensory test mode.
An explanatory diagram showing the relationship between the noise component and the teacher signal during learning.
An explanatory diagram showing the relationship between the extracted signal and the teacher signal during learning.
An explanatory diagram showing the relationship between the extracted signal and the estimated signal during learning.
An explanatory diagram of learning the weights of the neural network that generates the noise removal mask in the first embodiment.
An explanatory diagram of a calculation example in which phase components of pressure, vibration, sound, and the like are not considered.
An explanatory diagram of a calculation example in which phase components of pressure, vibration, sound, and the like are considered.
Explanatory diagrams (1) to (6) of signal processing in the sensory test (six drawings).
A flowchart showing the data collection processing of the signal processing apparatus in the first embodiment.
A flowchart showing the learning processing of the signal processing apparatus in the first embodiment.
A flowchart showing a subroutine of the learning processing.
A flowchart showing a modified example of the subroutine of the learning processing.
A flowchart showing the threshold calculation processing of the signal processing apparatus in the first embodiment.
A flowchart showing the threshold calculation processing of the signal processing apparatus performed after the separation processing of FIG. 17 and the sensory test processing of FIG. 18 in the first embodiment.
A flowchart showing the determination processing of the signal processing apparatus in the first embodiment.
A flowchart showing the separation processing of the signal processing apparatus in the first embodiment.
A flowchart showing the sensory test processing of the signal processing apparatus in the first embodiment.
A block diagram showing the configuration of a signal processing apparatus (estimation device) according to a comparative example.
An explanatory diagram of the learning mode of the signal processing apparatus according to the comparative example.
An explanatory diagram of the operation of the signal processing apparatus according to the comparative example in the learning mode.
A block diagram showing the overall configuration of a signal processing system for realizing a change of use of the signal processing apparatus according to a modification of the first embodiment.
An explanatory diagram of the learning mode of the signal processing apparatus according to the first embodiment after the change of use.
Explanatory diagrams (1) to (4) of an unfavorable example occurring in the signal processing apparatus according to the comparative example, in which the signal to be extracted cannot be cleanly separated from the input physical quantity (four drawings).
An explanatory diagram (1) of a favorable example realized in the signal processing apparatus, in which the signal to be extracted can be cleanly separated from the input physical quantity.
An explanatory diagram (2) of the noise-reduced input physical quantity obtained by reducing noise in the spectrogram of the signal shown in FIG. 23A using the processing mask realized in the signal processing apparatus.
Explanatory diagrams (3) and (4) of the favorable example realized in the signal processing apparatus (two drawings).
An explanatory diagram (5) of the signal that was missed in the favorable example.
A schematic explanatory diagram of the method of generating the processing mask.
An explanatory diagram of the method of generating the processing mask.
Two explanatory diagrams of the method of generating the noise-reduced input physical quantity.
An explanatory diagram of the shuffle processing in the learning mode of the signal processing apparatus.
An explanatory diagram of a modified example of the operation of the signal processing apparatus in the learning mode.
A block diagram showing the configuration of a signal processing apparatus (estimation device) according to a second embodiment.

An embodiment of the present invention (hereinafter referred to as "the present embodiment") will be described in detail with reference to the drawings. Each drawing is only a schematic illustration, sufficient for the present invention to be fully understood. The present invention is therefore not limited to the illustrated examples. In the drawings, common or similar components are denoted by the same reference numerals, and duplicate descriptions of them are omitted.

The conventional techniques described in Patent Documents 2 and 3 do not evaluate whether the noise component after separation contains a component related to the teacher signal. Such conventional techniques cannot satisfactorily separate an input physical quantity into a noise component and an extracted signal. The present invention is therefore also intended to provide a signal processing apparatus that evaluates whether the noise component after separation contains a component related to the teacher signal, and that performs learning for satisfactorily separating the input physical quantity into the noise component and the extracted signal.

Furthermore, the conventional techniques described in Patent Documents 2 and 3 are configured to separate an input physical quantity into a noise component and an extracted signal without considering the phase component associated with the input physical quantity. Consequently, even when the input physical quantity was separated into a noise component and an extracted signal, part of the extracted signal leaked into the noise component, and the extracted signal could not be separated accurately. The present invention is therefore also intended to provide a signal processing apparatus that, by adopting a configuration that takes the phase component into account, performs learning for satisfactorily separating an input physical quantity into a noise component and an extracted signal.

In addition, whereas the conventional techniques of Patent Documents 2 and 3 estimate the desired physical quantity using a transfer function, the present invention may or may not use a transfer function.

[First embodiment]
The first embodiment provides a signal processing apparatus comprising: a noise-reduced input physical quantity generation unit that generates a noise-reduced input physical quantity in which a noise component contained in an input physical quantity is reduced; a selection unit that selects one or both of the input physical quantity and the noise-reduced input physical quantity; and a learning unit that learns the weights of a neural network that generates a noise removal mask for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit has selected.
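The role of the selection unit can be illustrated with a minimal Python sketch. It is not taken from the publication: the names `Selection` and `select_training_inputs` and the sample values are hypothetical, and the sketch only shows the idea that one or both of the raw and noise-reduced signals are passed on for learning.

```python
from enum import Enum
from typing import List, Sequence


class Selection(Enum):
    """Hypothetical labels for the three choices a selection unit could make."""
    RAW = "raw"
    NOISE_REDUCED = "noise_reduced"
    BOTH = "both"


def select_training_inputs(
    raw: Sequence[float],
    noise_reduced: Sequence[float],
    choice: Selection,
) -> List[Sequence[float]]:
    """Return the signals that would be handed to the learning unit."""
    if choice is Selection.RAW:
        return [raw]
    if choice is Selection.NOISE_REDUCED:
        return [noise_reduced]
    return [raw, noise_reduced]


# Made-up samples standing in for the input physical quantity and its
# noise-reduced version.
raw_sound = [0.9, -0.4, 0.7]
reduced_sound = [0.5, -0.1, 0.3]

selected = select_training_inputs(raw_sound, reduced_sound, Selection.BOTH)
```

With `Selection.BOTH`, the learning step would see both versions of the signal; the other two choices pass only one of them.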

<Overall Configuration of Signal Processing System Including Signal Processing Apparatus (Estimation Device)>
The overall configuration of a signal processing system 100 including a signal processing apparatus 10 (estimation device) according to the first embodiment will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the overall configuration of the signal processing system 100 including the signal processing apparatus 10.

The signal processing apparatus 10 is an apparatus that performs various kinds of processing on signals. In the present embodiment, the signal processing apparatus 10 is described as functioning as an estimation device that multiplies the extracted signal, from which the noise component has been removed, by a transfer function to estimate an estimated signal having the same dimension (unit) as the teacher signal. Also in the present embodiment, the signal processing apparatus 10 is described as being used to verify whether knocking is occurring in an internal combustion engine such as a gasoline engine. However, the signal processing apparatus 10 is not limited to this use and can be used for various purposes. For example, as described later, the signal processing apparatus 10 can take one of a plurality of electronic components mounted in an electronic device as the verification target and be used to verify what kind of operating sound that electronic component generates.

As shown in FIG. 1, the signal processing system 100 determines the presence or absence of knocking in an engine 1 under test, and includes a sound pressure sensor 4, an in-cylinder pressure sensor 5, a data collection device 6, a monitor 7, headphones 8, a level designation unit 9, and the signal processing apparatus 10.

As shown in FIG. 1, the engine 1 under test is mounted on a vehicle 3. The engine 1 under test may also be used without being mounted on the vehicle 3, for example on its own. An engine ECU (Electronic Control Unit) 2 that controls the driving of the engine 1 is connected to the engine 1. The engine ECU 2 is composed of a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), other storage devices, and the like. By having the CPU execute programs stored in the ROM or a storage device, the engine ECU 2 controls the driving of the engine 1 while acquiring, from outside the engine ECU 2, the various kinds of information needed for that control.

The engine ECU 2 outputs angle information representing the current rotation angle of the engine 1 to the data collection device 6. The angle information includes, for example, a reference angle pulse and a crank angle pulse. The reference angle pulse is a reference pulse output at a reference angular position within one rotation of the crankshaft; for example, one pulse is output per rotation of the crankshaft. The crank angle pulse is a pulse output each time the crankshaft rotates by a fixed angle. For example, if one pulse is output per 1°, then in a four-stroke engine, in which the four strokes of intake, compression, combustion, and exhaust form one cycle, 720 pulses are output during the two crankshaft rotations that make up one cycle. As long as the angle information is input to the data collection device 6, it need not pass through the engine ECU 2: angle information from an origin sensor that detects the origin of the crankshaft rotation angle, or from an angle sensor that detects the crankshaft rotation angle, may be input to the data collection device 6.
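The pulse count in this example follows from simple arithmetic, which the short Python sketch below reproduces. The function name `crank_pulses_per_cycle` and its parameters are illustrative only and do not appear in the publication.

```python
def crank_pulses_per_cycle(degrees_per_pulse: float,
                           revolutions_per_cycle: int = 2) -> int:
    """Number of crank angle pulses emitted during one engine cycle.

    A four-stroke engine completes one cycle in two crankshaft revolutions,
    which is why revolutions_per_cycle defaults to 2.
    """
    pulses_per_revolution = 360.0 / degrees_per_pulse
    return round(pulses_per_revolution * revolutions_per_cycle)


# One pulse per 1 degree over the two revolutions of a four-stroke cycle.
pulses = crank_pulses_per_cycle(1.0)  # 720
```

Under the same assumptions, the reference angle pulse (one per crankshaft revolution) would occur twice per four-stroke cycle.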

A sound pressure sensor 4 is installed near the engine 1. The sound pressure sensor 4 detects the sound generated by the engine 1 and outputs a sound pressure signal based on the detected sound to the data collection device 6. More specifically, the sound pressure sensor 4 detects sound pressure, which is one example of a physical quantity based on the in-cylinder pressure fluctuations occurring in the engine 1, and generates a sound pressure signal indicating the magnitude of the detected sound pressure. Accordingly, when knocking is not occurring in the engine 1, the sound pressure signal output from the sound pressure sensor 4 contains no sound correlated with knocking. When knocking is occurring in the engine 1, on the other hand, the sound pressure signal output from the sound pressure sensor 4 contains sound correlated with knocking.

An in-cylinder pressure sensor 5 that detects the in-cylinder pressure of the engine 1 is attached to the engine 1. The in-cylinder pressure sensor 5 outputs, to the data collection device 6, an in-cylinder pressure signal containing waveform components that correspond to the combustion gas vibration in the cylinder. When knocking is not occurring in the engine 1, the in-cylinder pressure signal output from the in-cylinder pressure sensor 5 contains no component correlated with knocking. When knocking is occurring in the engine 1, on the other hand, the in-cylinder pressure signal contains a component correlated with knocking. The in-cylinder pressure sensor 5 may be integrated with the spark plug or may be configured separately from it.

The data collection device 6 receives the sound pressure signal from the sound pressure sensor 4 and performs A/D conversion on it. At the timing when the sound pressure signal is input, the data collection device 6 also acquires the current angle information from the engine ECU 2. Based on the angle information, the data collection device 6 then acquires the sound pressure signal for one cycle of the engine 1. Taking a four-stroke single-cylinder engine as an example, the data collection device 6 therefore generates a number of sound pressure signals that depends on the rotational speed of the engine 1 per unit time; for example, at a rotational speed of 3000 [r/min] it generates 1500 sound pressure signals per minute. The data collection device 6 also associates angle information with some or all of the sound pressure signals, and outputs the sound pressure signals associated with angle information to the signal processing apparatus 10. The data collection device 6 may temporarily hold or store the sound pressure signals before outputting them to the signal processing apparatus 10. It may also associate time information with the sound pressure signals.
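The figure of 1500 signals per minute, and the idea of cutting the sound pressure record into one-cycle segments, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the segmentation assumes that the sample index of each reference angle pulse (one per crankshaft revolution) is already known, which the publication does not specify.

```python
def cycles_per_minute(rpm: float, revolutions_per_cycle: int = 2) -> float:
    """Engine cycles per minute for a single cylinder.

    A four-stroke engine needs two crankshaft revolutions per cycle.
    """
    return rpm / revolutions_per_cycle


def split_by_reference_pulses(samples, pulse_indices, revolutions_per_cycle=2):
    """Cut a sampled signal into one-cycle segments.

    pulse_indices holds the (assumed known) sample index of each reference
    angle pulse. Every revolutions_per_cycle-th pulse marks a cycle boundary.
    """
    boundaries = pulse_indices[::revolutions_per_cycle]
    return [samples[start:end] for start, end in zip(boundaries, boundaries[1:])]


signals_per_minute = cycles_per_minute(3000)  # 1500.0

# Toy record: 12 samples, one reference pulse every 3 samples.
toy_samples = list(range(12))
toy_pulses = [0, 3, 6, 9, 12]
segments = split_by_reference_pulses(toy_samples, toy_pulses)
```

In the toy record each segment spans two revolutions (six samples), mirroring how one sound pressure signal corresponds to one engine cycle.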

The monitor 7 displays the knocking sound of the engine 1 extracted by the signal processing apparatus 10, the in-cylinder pressure of the engine 1 estimated by the signal processing apparatus 10, and the result of the determination of the presence or absence of knocking by the signal processing apparatus 10. One example of the monitor 7 is a general flat panel display.

The headphones 8 are a sound emitting unit that emits sound. The headphones 8 are worn on the head of an inspector when the condition of the test object (the engine 1 in the present embodiment) is inspected in the sensory test mode described later.

The level designation unit 9 is an input unit that receives level designation information designating a rise level or a fall level for a signal. The level designation unit 9 is composed of, for example, a touch panel display, a numeric keypad, or dedicated switches. In the present embodiment, when the level of the extracted signal (extracted knocking sound 91a (FIG. 11B)) is changed to generate a level-changed extracted signal (level-changed knocking sound 91aa (FIG. 12A)), the level designation unit 9 is operated by the inspector or another person nearby to designate (input) the rise level or fall level of the extracted signal.
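As a rough illustration of what raising or lowering the level of an extracted signal could look like, the Python sketch below scales a sampled signal by a gain given in decibels. Both the use of decibels and the name `change_level` are assumptions made here for illustration; the passage above does not state the unit of the designated level.

```python
def change_level(samples, level_change_db):
    """Scale a sampled signal by a level change given in decibels.

    Positive values raise the level and negative values lower it.
    The amplitude gain is 10 ** (dB / 20).
    """
    gain = 10.0 ** (level_change_db / 20.0)
    return [gain * s for s in samples]


# Made-up samples standing in for an extracted knocking sound.
extracted_knock = [0.2, -0.1, 0.05]

louder = change_level(extracted_knock, 6.0)    # roughly doubles the amplitude
quieter = change_level(extracted_knock, -6.0)  # roughly halves the amplitude
```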

The signal processing device 10 learns the weights W (FIG. 3B) of the neural network 94 (FIG. 3A) that generates the noise removal mask α, and the transfer function H (FIG. 3A). Hereinafter, the "neural network 94 that generates the noise removal mask α" may be referred to as the "mask generation network 94A (FIG. 3A)", and the "weights W of the mask generation network 94A" may be referred to as the "weights W of the neural network". Here, the noise removal mask α is a real or complex matrix for removing the noise component from an input physical quantity that contains the noise component. The noise removal mask α changes according to the input physical quantity. The transfer function H is a complex weight vector and is interpreted as the reciprocal of the structural damping correction amount of the engine 1. The structural damping correction amount is the transmission characteristic with which vibration caused by the in-cylinder pressure during combustion passes through the engine 1 and reaches the sound pressure sensor as sound. The combustion noise level of the engine 1 is obtained by multiplying the frequency components of the in-cylinder pressure of the engine 1 by the structural damping correction amount.

そのため、音圧であるエンジン１の近傍音に含まれる燃焼騒音に構造減衰補正量の逆数を乗算すれば、エンジン１の筒内圧が求められる。したがって、エンジン１の近傍音からエンジン１の筒内圧を推定することができる。 Therefore, the in-cylinder pressure of the engine 1 can be obtained by multiplying the combustion noise included in the near-field sound of the engine 1, which is the sound pressure, by the reciprocal of the structural damping correction amount. Therefore, the in-cylinder pressure of the engine 1 can be estimated from the near sound of the engine 1 .
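The relationship above (mask the near-field sound, then multiply by the reciprocal of the structural damping correction amount) can be sketched with NumPy. This is a minimal sketch under assumptions: the source states only that α is a real or complex matrix and H is a complex weight vector, so the element-wise product, the array layout (frequency bins by time frames), and the function name `estimate_cylinder_pressure` are illustrative choices, not the patent's stated implementation.

```python
import numpy as np


def estimate_cylinder_pressure(Y, alpha, H):
    """Sketch: estimate the in-cylinder pressure spectrogram.

    Y     : complex spectrogram of the near-field sound, shape (bins, frames)
    alpha : real or complex mask with the same shape as Y (assumed element-wise)
    H     : complex vector of length `bins`, assumed to act per frequency bin
    """
    extracted = alpha * Y                  # remove the noise component
    return H[:, np.newaxis] * extracted    # apply H to every time frame


# Tiny example with 3 frequency bins and 2 time frames.
Y = np.ones((3, 2), dtype=complex)
alpha = np.full((3, 2), 0.5)
H = np.array([2.0, 2.0j, 0.0])
X_hat = estimate_cylinder_pressure(Y, alpha, H)
```

Because H is complex, it can change both the amplitude and the phase of each frequency bin, which matches the source's emphasis on taking phase into account.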

The principle of estimating the in-cylinder pressure of the engine 1 from the near-field sound of the engine 1 is described, for example, in Patent Document 2 mentioned above. Patent Document 2 states: "First, the near-field sound y of the engine 1 and, as teacher data (teacher signal), the actually measured in-cylinder pressure x of the engine 1 are collected. (Omitted.) Using the near-field sound y of the engine 1 and the measured in-cylinder pressure x of the engine 1, the weights of the neural network that generates the mask, and the transfer function, which are unknowns, are learned by the neural network. Then, using the mask generated by the trained neural network and the transfer function, the in-cylinder pressure x of the engine 1 is estimated from the near-field sound y of the engine 1 measured during the test."

信号処理装置１０は、学習したニューラルネットワーク９４（図３Ａ）が生成したノイズ除去マスクα及び伝達関数Ｈを用いて、エンジン１の近傍音からノッキング音を抽出し、エンジン１のノッキング筒内圧を推定し、抽出したノッキング音に基づいて、ノッキングの有無を判定する。そして、信号処理装置１０は、抽出したノッキング音や推定したエンジン１のノッキング筒内圧、ノッキングの有無の判定結果をモニタ７に表示する。 The signal processing device 10 extracts the knocking sound from the near sound of the engine 1 using the noise removal mask α and the transfer function H generated by the learned neural network 94 (FIG. 3A), and estimates the knocking cylinder pressure of the engine 1. Then, the presence or absence of knocking is determined based on the extracted knocking sound. Then, the signal processing device 10 displays on the monitor 7 the extracted knocking sound, the estimated knocking cylinder internal pressure of the engine 1, and the determination result of the presence or absence of knocking.

ここで、信号処理装置１０は、ＣＰＵ、ＲＯＭ、ＲＡＭ、その他の記憶装置等で構成されている。信号処理装置１０は、ＲＯＭや記憶装置に記憶されているプログラムをＣＰＵで演算処理する。なお、信号処理装置１０は、以下の処理を実行するプログラムを有するパーソナルコンピュータ（ＰＣ）等であってもよい。 Here, the signal processing device 10 is composed of a CPU, a ROM, a RAM, other storage devices, and the like. The signal processing device 10 performs arithmetic processing on a program stored in a ROM or a storage device using a CPU. Note that the signal processing device 10 may be a personal computer (PC) or the like having a program for executing the following processes.

In this embodiment, the signal processing device 10 operates in five operation modes: a learning mode, a threshold calculation mode, a determination mode, a separation mode, and a sensory test mode. The first, the learning mode, is an operation mode for learning the weights W (FIG. 3B) of the neural network that generates the noise removal mask α (FIG. 3A), and the transfer function H (FIG. 3A). The second, the threshold calculation mode, is an operation mode for calculating, after the weights W and the transfer function H have been learned, the threshold used to determine the presence or absence of knocking. The third, the determination mode, is an operation mode in which the noise removal mask α generated by inputting the input physical quantity to the trained neural network 94 (FIG. 3A) is used to extract the extracted signal (the extracted knocking sound 91a in this embodiment) from the input physical quantity (the near-engine sound 90 in this embodiment), and the presence or absence of knocking is determined. The fourth, the separation mode, is an operation mode for separating the input physical quantity into the extracted signal and the noise component (the noise 91b in this embodiment). The fifth, the sensory test mode, is an operation mode in which the inspector listens to a processed sound described later and, based on the inspector's auditory impression, determines the data to be used in the threshold calculation mode for the target sound to be inspected (the knocking sound in this embodiment). In the sensory test mode, the threshold is calculated from, for example, sounds judged to be outside the allowable range.

これら５つの動作モードは、任意に切り替えることができる。例えば、図示を省略した管理装置により、ＣＡＮ（Controller Area Network）を介して、信号処理装置１０の動作モードを切り替えることができる。また、図示を省略したマウス、キーボード等の操作手段を用いて、信号処理装置１０の動作モードを切り替えてもよい。 These five operation modes can be switched arbitrarily. For example, a management device (not shown) can switch the operation mode of the signal processing device 10 via CAN (Controller Area Network). Further, the operation mode of the signal processing device 10 may be switched using operation means such as a mouse and a keyboard (not shown).

学習モードの場合、ノッキングが発生する運転条件、及び、ノッキングが発生しない運転条件でそれぞれエンジン１を運転し、データ収集装置６が、教師データ（教師信号）として、筒内圧信号を収集する。このとき、データ収集装置６は、筒内圧センサ５からの筒内圧信号を入力してＡ／Ｄ変換し、これを音圧信号に関連付けておく。 In the learning mode, the engine 1 is operated under operating conditions in which knocking occurs and under operating conditions in which knocking does not occur, and the data collection device 6 collects cylinder pressure signals as teaching data (teaching signals). At this time, the data collection device 6 inputs the in-cylinder pressure signal from the in-cylinder pressure sensor 5, performs A/D conversion, and associates this with the sound pressure signal.

閾値算出モードの場合、ノッキングが発生しない運転条件でエンジン１を運転し、データ収集装置６が、音圧信号を収集する。 In the threshold calculation mode, the engine 1 is operated under operating conditions in which knocking does not occur, and the data collection device 6 collects sound pressure signals.

In the determination mode, the extracted knocking sound 91a (extracted signal) is extracted from the near-engine sound 90 (input physical quantity) using the noise removal mask α generated by the trained neural network, and the signal processing device 10 determines the presence or absence of knocking based on the threshold.

なお、閾値算出モード又は判定モードの場合、データ収集装置６が、筒内圧信号を収集する必要はない。つまり、学習モードの場合、音圧センサ４と筒内圧センサ５の双方が動作するが、閾値算出モード又は判定モードの場合、音圧センサ４のみが動作する。 Note that in the case of the threshold calculation mode or determination mode, the data collection device 6 does not need to collect the in-cylinder pressure signal. That is, both the sound pressure sensor 4 and the in-cylinder pressure sensor 5 operate in the learning mode, but only the sound pressure sensor 4 operates in the threshold calculation mode or determination mode.

In the separation mode, the engine 1 is operated, and the signal processing device 10 inputs the near-engine sound 90 (input physical quantity) to the trained neural network to generate the noise removal mask α, and separates the near-engine sound 90 into the noise 91b (noise component) and the extracted knocking sound 91a (extracted signal). The signal processing device 10 then stores the noise 91b in the noise component storage unit 26c and the extracted knocking sound 91a in the extracted signal storage unit 26d. In the present invention, separation performance is improved by learning the weights W of the neural network that generates the noise removal mask α, and the transfer function H, while taking into account the amplitude and phase associated with the input physical quantity. The noise removal mask α is implemented as real or complex numbers, and the transfer function H as complex numbers. The signal processing device 10 can thereby perform learning that separates the input physical quantity well into the noise component and the extracted signal.

In the sensory test mode, the signal processing device 10 receives level designation information from the level designation unit 9 and, based on that information, generates a processed sound using the noise 91b (noise component) and the extracted knocking sound 91a (extracted signal).

Even when two signals have the same amplitude spectrum, their waveforms differ greatly depending on the phase spectrum, which strongly affects the auditory impression. To generate a high-quality processed sound, it is therefore important to take phase into account when obtaining the extracted signal (the extracted knocking sound 91a in this embodiment) used for the processed sound. In other words, by learning the weights W of the neural network that generates the noise removal mask α, and the transfer function H, while taking into account the amplitude and phase associated with the input physical quantity, the signal processing device 10 improves separation performance and can generate a high-quality processed sound (one in which the knocking sound is perceived as natural).

<Configuration of Signal Processing Device (Estimation Device)>
The configuration of the signal processing device 10 (estimation device) will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the signal processing device 10. As shown in FIG. 2, the signal processing device 10 includes a signal extraction unit 11, a spectrogram calculation unit 12, a signal storage unit 13, a switch 14, a learning processing unit 20 (learning device), a determination processing unit 30, a separation unit 40, and a signal synthesis unit 50. The signal processing device 10 receives, from the data collection device 6, the sound pressure signal and the angle information associated with that sound pressure signal. In the learning mode, the signal processing device 10 additionally receives the in-cylinder pressure signal from the data collection device 6.

Based on the angle information input from the data collection device 6, the signal extraction unit 11 cuts out, from the input sound pressure signal, the sound pressure signal within a predetermined cut-out angle range. For example, the cut-out angle range is approximately -10° to 90° ATDC (After Top Dead Center). In this embodiment, the signal is cut out with TDC (Top Dead Center) as the reference, so the cut-out angle range stays fixed even when the ignition timing is changed; however, the cut-out angle range may instead be changed according to the change in ignition timing. After cutting out the sound pressure signal, the signal extraction unit 11 outputs it to the spectrogram calculation unit 12.
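The cut-out step can be sketched as a filter over samples tagged with crank angle. This is only an illustration: the function name `crop_by_crank_angle`, the per-sample angle list, and the inclusive bounds are assumptions; the source specifies only the approximate range of -10° to 90° ATDC.

```python
def crop_by_crank_angle(samples, angles_atdc, start_deg=-10.0, end_deg=90.0):
    """Keep the samples whose crank angle (degrees ATDC) lies in the range.

    samples     : sound pressure samples for one engine cycle
    angles_atdc : crank angle of each sample, in degrees after TDC
                  (negative values are before TDC)
    """
    if len(samples) != len(angles_atdc):
        raise ValueError("samples and angles_atdc must have the same length")
    return [s for s, a in zip(samples, angles_atdc) if start_deg <= a <= end_deg]


# Five samples; only those between -10 and 90 degrees ATDC are kept.
cropped = crop_by_crank_angle([0.0, 1.0, 2.0, 3.0, 4.0],
                              [-20.0, -10.0, 0.0, 90.0, 100.0])
```

Because the range is expressed relative to TDC, the same call works unchanged when the ignition timing moves, which is the fixed-range behaviour described above.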

The spectrogram calculation unit 12 performs a short-time Fourier transform (STFT) on the sound pressure signal cut out by the signal extraction unit 11 and calculates the spectrogram of the sound pressure signal. The short-time Fourier transform is carried out using, for example, the fast Fourier transform (FFT), which computes the discrete Fourier transform at high speed. The spectrogram calculation unit 12 then writes the spectrogram of the sound pressure signal to the signal storage unit 13.
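A short-time Fourier transform of this kind can be sketched with NumPy's FFT routines. The frame length, hop size, and Hann window below are assumptions chosen for illustration; the source does not state the analysis parameters. The result is kept complex so that both amplitude and phase remain available.

```python
import numpy as np


def stft(signal, frame_len=256, hop=128):
    """Sketch of an STFT: returns a complex array of shape (bins, frames).

    Assumes len(signal) >= frame_len. Window and sizes are illustrative.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft computes the DFT of each real-valued frame via the FFT algorithm.
    return np.fft.rfft(frames, axis=1).T


# A sine with exactly 32 periods per frame should peak in frequency bin 32.
test_signal = np.sin(2 * np.pi * 32 * np.arange(1024) / 256)
spectrogram = stft(test_signal)
```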

信号記憶部１３は、スペクトログラム算出部１２が変換した音圧信号のスペクトログラムを記憶するメモリ、ＨＤＤ（Hard Disk Drive）、ＳＳＤ（Solid State Drive）等の記憶装置である。なお、学習モードの場合、信号記憶部１３は、データ収集装置６から入力された筒内圧信号（観測ノッキング筒内圧９３（図３Ａ））を、教師データ（教師信号）として記憶する。ここで、「ノッキング筒内圧」とは、筒内圧信号に重畳したノッキング成分を表す。ノッキング筒内圧は、筒内圧をノッキングの周波数成分以上の周波数帯を通過させるハイパスフィルタ処理することで得られる。 The signal storage unit 13 is a storage device such as a memory, a HDD (Hard Disk Drive), or an SSD (Solid State Drive) that stores the spectrogram of the sound pressure signal converted by the spectrogram calculation unit 12 . In the learning mode, the signal storage unit 13 stores the in-cylinder pressure signal (observed knocking in-cylinder pressure 93 (FIG. 3A)) input from the data collection device 6 as teacher data (teacher signal). Here, the "knocking in-cylinder pressure" represents the knocking component superimposed on the in-cylinder pressure signal. The knocking in-cylinder pressure is obtained by subjecting the in-cylinder pressure to a high-pass filter that passes a frequency band equal to or higher than the knocking frequency component.

The switch 14 can be switched arbitrarily among a learning mode connection M1, a threshold calculation mode connection M2, a determination mode connection M3, a separation mode connection M4, and a sensory test mode connection M5, which correspond to the five operation modes described above. By switching the switch 14 according to its operation mode, the signal processing device 10 can output the signals stored in the signal storage unit 13 to the appropriate destination.

For example, in the learning mode, the signal processing device 10 connects the switch 14 to the learning mode connection M1 and outputs the spectrogram of the sound pressure signal and the in-cylinder pressure signal stored in the signal storage unit 13 to the noise-reduced input physical quantity generation unit 15 and the selection unit 16, both described later. In the threshold calculation mode, the signal processing device 10 connects the switch 14 to the threshold calculation mode connection M2 and outputs the spectrogram of the sound pressure signal stored in the signal storage unit 13 to the first estimation unit 23 described later. In the determination mode, the signal processing device 10 connects the switch 14 to the determination mode connection M3 and outputs the spectrogram of the sound pressure signal stored in the signal storage unit 13 to the second estimation unit 31 (extracted signal estimation unit) described later. In the separation mode, the signal processing device 10 connects the switch 14 to the separation mode connection M4 and outputs the spectrogram of the sound pressure signal stored in the signal storage unit 13 to the separation unit 40 described later. In the sensory test mode, the signal processing device 10 connects the switch 14 to the sensory test mode connection M5 and outputs a sensory test mode execution instruction signal, generated in advance by a signal generation unit (not shown) and stored in the signal storage unit 13, to the signal adjustment unit 51 described later. The description here assumes that the execution instruction signal has been generated in advance by the signal generation unit (not shown) and stored in the signal storage unit 13. However, the execution instruction signal may instead be generated by the signal generation unit (not shown) when the switch 14 is connected to the sensory test mode connection M5, and output from that signal generation unit to the signal adjustment unit 51 without passing through the signal storage unit 13.
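The routing performed by the switch 14 amounts to a lookup from operation mode to destination unit. A minimal sketch follows; the enum `Mode`, the table `DESTINATIONS`, and the function `route` are hypothetical names introduced for illustration, while the connection labels M1 to M5 and the destination units are taken from the description.

```python
from enum import Enum


class Mode(Enum):
    """Operation modes, valued by the switch connection they select."""
    LEARNING = "M1"
    THRESHOLD_CALCULATION = "M2"
    DETERMINATION = "M3"
    SEPARATION = "M4"
    SENSORY_TEST = "M5"


# Destination unit(s) that receive the stored signal in each mode.
DESTINATIONS = {
    Mode.LEARNING: ("noise-reduced input physical quantity generation unit 15",
                    "selection unit 16"),
    Mode.THRESHOLD_CALCULATION: ("first estimation unit 23",),
    Mode.DETERMINATION: ("second estimation unit 31",),
    Mode.SEPARATION: ("separation unit 40",),
    Mode.SENSORY_TEST: ("signal adjustment unit 51",),
}


def route(mode: Mode) -> tuple:
    """Return the destination unit(s) for the given operation mode."""
    return DESTINATIONS[mode]
```

Keeping the routing in a single table makes it easy to check that every mode has exactly one entry, mirroring the one-to-one correspondence between modes and connections.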

<Learning processing unit>
In the learning mode, the learning processing unit 20 performs learning processing (FIG. 14A) that learns the weights W of the neural network 94 (FIG. 3A) that generates the noise removal mask α, and the transfer function H. In the threshold calculation mode, it performs threshold calculation processing (FIG. 15A) that calculates the threshold used for threshold determination of the knocking sound extracted for the engine 1. As shown in FIG. 2, the learning processing unit 20 includes a noise-reduced input physical quantity generation unit 15, a selection unit 16, a learning unit 21, a learned parameter storage unit 22, a first estimation unit 23, a threshold calculation unit 24, a threshold storage unit 25, a teacher signal storage unit 26a, an estimated signal storage unit 26b, a noise component storage unit 26c, and an extracted signal storage unit 26d.

The noise-reduced input physical quantity generation unit 15 generates a noise-reduced input physical quantity 90A (FIG. 3A), in which the noise component (the noise 91b (FIG. 3A) in this embodiment) contained in the input physical quantity (the near-engine sound 90 (FIG. 3A) in this embodiment) has been reduced. The noise-reduced input physical quantity generation unit 15 has a function of generating a processing mask β for generating the noise-reduced input physical quantity 90A from the input physical quantity. The processing mask β is obtained by filtering the noise removal mask α, and is described later.

The selection unit 16 has a shuffle function that supplies the learning unit 21 with a data set in which the input physical quantity (near-engine sound 90 (FIG. 3A)) and the noise-reduced input physical quantity 90A (FIG. 3A) are shuffled. Here, "shuffle" means mixing the items into a random order. The selection unit 16 selects either or both of the input physical quantity and the noise-reduced input physical quantity and supplies the selection to the learning unit 21.
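The select-then-shuffle behaviour can be sketched as follows. The function name `build_training_set`, the boolean flags, the source tags, and the fixed seed are assumptions made for illustration; the source states only that one or both kinds of data are selected and that they are mixed into a random order.

```python
import random


def build_training_set(inputs, noise_reduced,
                       use_inputs=True, use_noise_reduced=True, seed=0):
    """Select one or both data sources and return them in shuffled order.

    Each item is tagged with its origin so the mix can be inspected.
    A seeded generator is used only to make the sketch reproducible.
    """
    pool = []
    if use_inputs:
        pool.extend(("input", x) for x in inputs)
    if use_noise_reduced:
        pool.extend(("noise_reduced", x) for x in noise_reduced)
    random.Random(seed).shuffle(pool)
    return pool


mixed = build_training_set(["y1", "y2"], ["y1_nr", "y2_nr"])
only_inputs = build_training_set(["y1", "y2"], ["y1_nr", "y2_nr"],
                                 use_noise_reduced=False)
```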

In the learning mode, the learning unit 21 learns the weights of the neural network 94 that generates the noise removal mask α for removing the noise component (noise 91b (FIG. 3A)) from whichever of the input physical quantity (near-engine sound 90 (FIG. 3A)) and the noise-reduced input physical quantity 90A (FIG. 3A) the selection unit 16 has selected. The learning unit 21 also learns the transfer function H (FIG. 3A) that converts the extracted signal (extracted knocking sound 91a (FIG. 3A)) extracted with the generated noise removal mask α into the estimated knocking in-cylinder pressure 92 of the engine 1 at the time knocking occurs. In this embodiment, the learning unit 21 learns the weights W of the neural network 94 that generates the noise removal mask α, and the transfer function H, using the spectrogram of the sound pressure signal and the in-cylinder pressure signal input from the signal storage unit 13. The learning unit 21 then writes the learned weights W and transfer function H to the learned parameter storage unit 22.

学習済みパラメータ記憶部２２は、学習済みのパラメータ（ノイズ除去マスクαを生成するニューラルネットワークの重みＷ及び伝達関数Ｈ）を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The learned parameter storage unit 22 is a storage device such as a memory, HDD, SSD, etc. that stores learned parameters (weight W and transfer function H of the neural network that generates the noise removal mask α).

In the threshold calculation mode, the first estimation unit 23 uses the noise removal mask α generated by the neural network 94 to extract the extracted knocking sound 91a of the engine 1 from the spectrogram of the sound pressure signal. The extracted knocking sound 91a extracted by the first estimation unit 23 is used when calculating the threshold described later. Specifically, the first estimation unit 23 inputs the spectrogram of the sound pressure signal received from the signal storage unit 13 to the neural network 94, in which the weights W stored in the learned parameter storage unit 22 have been applied. The noise removal mask α generated by the trained neural network then extracts the extracted knocking sound 91a from the spectrogram of the sound pressure signal. The first estimation unit 23 then outputs the extracted knocking sound 91a to the threshold calculation unit 24.

The threshold calculation unit 24 calculates the threshold based on the extracted knocking sound 91a of the engine 1 extracted by the first estimation unit 23. Specifically, the threshold calculation unit 24 sums the absolute values of the spectrogram of the extracted knocking sound 91a for each predetermined period (for example, one cycle of the engine 1). For example, for an engine 1 having a plurality of cylinders, summing the extracted knocking sound 91a yields one score per cylinder. The threshold calculation unit 24 then calculates the median of the sums obtained for each predetermined period; for example, it calculates the median of the sums of the extracted knocking sound 91a over all cycles. The threshold calculation unit 24 adds a preset margin of arbitrary value to the median and uses the result as the threshold. The threshold calculation unit 24 may sum the extracted knocking sound 91a and take the median for each cylinder to calculate a per-cylinder threshold. Alternatively, it may sum the extracted knocking sound 91a for each cylinder, take the median over all cylinders, and calculate a threshold common to all cylinders. As a further alternative, the sum of the extracted knocking sound 91a may be obtained for a processed sound 91c that the inspector judged unacceptable in the sensory test described later, and any value at or below that sum may be used as the threshold. The threshold calculation unit 24 then writes the calculated threshold to the threshold storage unit 25.
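The median-plus-margin rule described above can be sketched with the standard library. The function names `cycle_score` and `compute_threshold`, and the nested-list representation of a spectrogram, are illustrative assumptions; the margin value is arbitrary, as in the source.

```python
from statistics import median


def cycle_score(spectrogram):
    """Sum of absolute values of one cycle's extracted-knock spectrogram.

    `spectrogram` is a nested list of real or complex values.
    """
    return sum(abs(v) for row in spectrogram for v in row)


def compute_threshold(cycle_spectrograms, margin):
    """Median of the per-cycle scores plus a preset margin."""
    scores = [cycle_score(s) for s in cycle_spectrograms]
    return median(scores) + margin


# Three cycles recorded without knocking; the scores are 3, 3 and 10.
cycles = [
    [[1 + 0j, -2]],
    [[3j]],
    [[10]],
]
threshold = compute_threshold(cycles, margin=0.5)
```

Using the median rather than the mean keeps a single unusually loud cycle (the third one here) from inflating the threshold.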

閾値記憶部２５は、閾値算出部２４が算出した閾値を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The threshold storage unit 25 is a storage device such as a memory, HDD, SSD, etc. that stores the threshold calculated by the threshold calculation unit 24 .

教師信号記憶部２６ａは、教師信号（本実施形態では、観測ノッキング筒内圧９３（図３Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The teacher signal storage unit 26a is a storage device such as a memory, an HDD, or an SSD that stores a teacher signal (in this embodiment, the observed knocking in-cylinder pressure 93 (FIG. 3A)).

推定信号記憶部２６ｂは、推定信号（本実施形態では、推定ノッキング筒内圧９２（図３Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The estimated signal storage unit 26b is a storage device such as a memory, HDD, or SSD that stores an estimated signal (in this embodiment, the estimated knocking in-cylinder pressure 92 (FIG. 3A)).

ノイズ成分記憶部２６ｃは、入力物理量に含まれているノイズ成分（本実施形態では、雑音９１ｂ（図６Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The noise component storage unit 26c is a storage device such as a memory, HDD, or SSD that stores noise components (in this embodiment, noise 91b (FIG. 6A)) included in the input physical quantity.

The extracted signal storage unit 26d is a storage device such as a memory, HDD, or SSD that stores the extracted signal (in this embodiment, the extracted knocking sound 91a (FIG. 6A)) obtained by removing the noise component from the input physical quantity containing the noise component.

<Determination processing unit>
In the determination mode, the determination processing unit 30 performs determination processing (FIG. 16) that extracts the knocking sound of the engine 1 from the spectrogram of the sound pressure signal and determines the presence or absence of knocking. As shown in FIG. 2, the determination processing unit 30 includes a second estimation unit 31 (extracted signal estimation unit) and a threshold determination unit 32.

In the determination mode, the second estimation unit 31 (extracted signal estimation unit) uses the noise removal mask α generated by the neural network 94 to extract the extracted knocking sound 91a (extracted signal) from the spectrogram of the sound pressure signal. The extracted knocking sound 91a extracted by the second estimation unit 31 is used for the threshold determination described later. The processing of the second estimation unit 31 is the same as that of the first estimation unit 23, so its description is omitted. The second estimation unit 31 then outputs the extracted knocking sound 91a to the threshold determination unit 32.

In the determination mode, the threshold determination unit 32 determines the presence or absence of knocking by comparing the extracted knocking sound 91a extracted by the second estimation unit 31 against the threshold stored in the threshold storage unit 25. Specifically, like the threshold calculation unit 24, the threshold determination unit 32 sums the absolute values of the spectrogram of the extracted knocking sound 91a for each predetermined period. The threshold determination unit 32 then compares the sum with the threshold: if the sum exceeds the threshold, it determines that knocking is present, and if the sum is at or below the threshold, it determines that knocking is absent. The threshold determination unit 32 then outputs the determination result and the extracted knocking sound 91a input from the second estimation unit 31 to the monitor 7 (FIG. 1).
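The comparison rule can be sketched in a few lines. The function name `has_knocking` and the nested-list spectrogram are illustrative assumptions; the strict inequality follows the description (knocking only when the sum exceeds the threshold).

```python
def has_knocking(spectrogram, threshold):
    """Return True when the summed absolute spectrogram exceeds the threshold.

    A sum equal to the threshold counts as no knocking.
    """
    score = sum(abs(v) for row in spectrogram for v in row)
    return score > threshold


at_threshold = has_knocking([[3 + 4j]], threshold=5.0)        # score is 5.0
above_threshold = has_knocking([[3 + 4j, 1]], threshold=5.0)  # score is 6.0
```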

<Separation unit>
The separation unit 40 uses the noise removal mask α to separate the input physical quantity into the noise component and the extracted signal. In this embodiment, as shown in FIG. 6A, the separation unit 40 separates the near-engine sound 90, which is the input physical quantity, into the noise 91b, which is the noise component, and the extracted knocking sound 91a, which is the extracted signal.
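One way to picture the separation is as a masked part and a residual. The sketch below assumes the mask is applied element-wise and that the noise component is whatever remains after the extracted signal is subtracted; the source does not spell out how the noise component is computed, so this residual form and the name `separate` are assumptions.

```python
import numpy as np


def separate(Y, alpha):
    """Split a spectrogram into (extracted signal, noise component).

    Y     : complex spectrogram of the input physical quantity
    alpha : real or complex mask with the same shape as Y
    Assumption: noise is the residual Y - alpha * Y.
    """
    extracted = alpha * Y
    noise = Y - extracted
    return extracted, noise


Y = np.array([[1 + 1j, 2.0], [0.5j, -1.0]])
alpha = np.array([[1.0, 0.0], [0.5, 0.25]])
extracted, noise = separate(Y, alpha)
```

Under this assumption the two parts always add back up to the original input, which is convenient when the parts are later recombined at a different level.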

<Signal synthesis unit>
The signal synthesis unit 50 synthesizes various signals to generate a processed sound. In the present embodiment, as shown in FIGS. 11A to 12C, the signal synthesis unit 50 changes the level of the extracted knocking sound 91 (extracted signal) and combines it with the noise 91b (noise component) separated from the near-engine sound 90 (input physical quantity) to generate the processed sound. The signal synthesis unit 50 has a signal adjustment unit 51 and a signal output unit 52. The signal adjustment unit 51 changes the level of the extracted knocking sound 91 according to the level increase or decrease designated (input) through the level designation unit 9, and combines the level-changed extracted knocking sound 91 (level-changed extracted signal) with the noise component to generate the processed sound. The signal output unit 52 outputs the processed sound (signal) to the sound emitting unit (headphones 8) and causes the sound emitting unit to emit it.

<Operation in learning mode>
The operation of the signal processing device 10 in the learning mode will be described with reference to FIGS. 3A, 3B, 8A, 8B, 8C, and 9. FIG. 3A is an explanatory diagram of the learning mode. FIG. 3B is an explanatory diagram of the operation of the signal processing device 10 in the learning mode. FIG. 8A is an explanatory diagram showing the relationship between the noise component and the teacher signal during learning. FIG. 8B is an explanatory diagram showing the relationship between the extracted signal and the teacher signal during learning. FIG. 8C is an explanatory diagram showing the relationship between the extracted signal and the estimated signal during learning. FIG. 9 is an explanatory diagram of the learning of the weight W of the neural network that generates the noise removal mask α in the first embodiment.

As shown in FIG. 3A, in the learning mode the near-engine sound 90 (input physical quantity) is supplied to the noise-reduced input physical quantity generation unit 15 and the selection unit 16 of the signal processing device 10. In addition, as shown in FIG. 3B, the observed knocking in-cylinder pressure 93 (teacher signal) is supplied to the noise-reduced input physical quantity generation unit 15 and the selection unit 16.

The noise-reduced input physical quantity generation unit 15 uses a processing mask β, obtained by filtering the noise removal mask α, to generate from the near-engine sound 90 a noise-reduced input physical quantity 90A in which the noise 91b contained in the near-engine sound 90 is reduced, and supplies it to the selection unit 16. While the learning unit 21 is learning the weight W of the neural network, the noise removal mask α is generated using the weight W each time one or both of the near-engine sound 90 (input physical quantity) and the noise-reduced input physical quantity 90A are input to the learning unit 21.

The selection unit 16 selects one or both of the near-engine sound 90 and the noise-reduced input physical quantity 90A generated by the noise-reduced input physical quantity generation unit 15, and supplies the selection to the learning unit 21.

The learning unit 21 learns the weight W of the neural network that generates the noise removal mask α for removing the noise 91b from whichever of the near-engine sound 90 and the noise-reduced input physical quantity 90A the selection unit 16 has selected (hereinafter, the selected input). In doing so, the learning unit 21 separates the selected input into the noise 91b and the extracted knocking sound 91a: it multiplies the selected input by the noise removal mask α to obtain the extracted knocking sound 91a, and subtracts the extracted knocking sound 91a from the selected input to obtain the noise 91b. In the present embodiment, the noise removal mask α represents the proportion of knocking sound contained in the selected input and a phase component (phase correction amount). The phase component is described later with reference to FIGS. 10A and 10B.
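The multiply-then-subtract separation can be illustrated as follows. This is a sketch under stated assumptions: the mask is taken to be complex-valued and applied element by element to a complex spectrogram (so its magnitude scales a bin and its angle shifts the phase), and the function name `separate` and the nested-list layout are hypothetical.

```python
def separate(spectrogram, mask):
    """Split a complex spectrogram into (extracted, noise), bin by bin.

    extracted = mask * input, and noise = input - extracted, so the two
    parts always add back up to the input.
    """
    extracted = [
        [m * x for m, x in zip(mask_row, in_row)]
        for mask_row, in_row in zip(mask, spectrogram)
    ]
    noise = [
        [x - e for x, e in zip(in_row, ex_row)]
        for in_row, ex_row in zip(spectrogram, extracted)
    ]
    return extracted, noise


X = [[2 + 0j, 0 + 4j]]            # one frame, two frequency bins
alpha = [[0.5 + 0j, 0 + 0.25j]]   # second entry also rotates the phase by 90 degrees
knock, rest = separate(X, alpha)
print(knock, rest)
```

Because the noise is defined as the remainder, no energy is lost in the split; any knocking sound the mask misses necessarily ends up in the noise part, which is what the coherence criterion described later penalizes.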

While learning the weight W of the neural network, the learning unit 21 obtains the noise removal mask α and supplies it to the noise-reduced input physical quantity generation unit 15. The noise-reduced input physical quantity generation unit 15 filters the noise removal mask α to obtain the processing mask β. It then uses the processing mask β to generate the noise-reduced input physical quantity 90A from the near-engine sound 90 and supplies it to the selection unit 16.

In addition to the weight W of the neural network, the learning unit 21 learns a transfer function H for converting, with phase taken into account, the extracted knocking sound 91a (extracted signal) into an estimated knocking in-cylinder pressure 92 (estimated signal) that has the same dimension (unit) as the observed knocking in-cylinder pressure 93 (teacher signal). In other words, when learning the weight W and the transfer function H, the learning unit 21 takes into account, for both the noise removal mask α and the transfer function H, the amplitude and phase components related to the near-engine sound 90 (input physical quantity). In the present embodiment, the learning unit 21 converts the extracted knocking sound 91a into the estimated knocking in-cylinder pressure 92 by applying an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted knocking sound 91a, multiplying the result by the transfer function H, and then applying an inverse fast Fourier transform (IFFT) and a short-time Fourier transform (STFT). In the present embodiment, the transfer function H consists of the amplitude (gain) and phase components for converting the extracted knocking sound 91a into the estimated knocking in-cylinder pressure 92. The phase component is described later with reference to FIGS. 10A and 10B.
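The middle of this chain (FFT, multiplication by H, IFFT) can be sketched as below. The example is only illustrative: it starts from a time-domain waveform, that is, after the ISTFT step, omits the final STFT, uses a naive DFT instead of an FFT to stay dependency-free, and the function names and the sample H (a gain of 2 combined with a one-sample delay) are assumptions rather than values from the source.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (stands in for the FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform (stands in for the IFFT)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def apply_transfer_function(waveform, H):
    """Multiply each frequency bin by H[k] and return the real time signal."""
    shaped = [h * xk for h, xk in zip(H, dft(waveform))]
    return [v.real for v in idft(shaped)]


n = 4
# Hypothetical H: magnitude 2 (gain) and a linear phase (one-sample delay).
H = [2 * cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]
sound = [1.0, 0.0, 0.0, 0.0]
pressure = apply_transfer_function(sound, H)
print([round(v, 6) for v in pressure])
```

The impulse comes out doubled and shifted by one sample, which shows how a single complex H carries both the amplitude (gain) and the phase component mentioned in the text.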

In the present embodiment, as shown in FIG. 3A, the learning unit 21 learns the weight of the neural network so that the relationship between the noise 91b (noise component) and the teacher signal (observed knocking in-cylinder pressure 93) becomes weaker while the relationship between the extracted knocking sound 91a (extracted signal) and the teacher signal becomes stronger. Specifically, the learning unit 21 learns the weight W of the neural network that generates the noise removal mask α so that the coherence between a first signal Sfi, obtained by applying an inverse short-time Fourier transform (ISTFT) to the spectrogram of the noise 91b (noise component), and the observed knocking in-cylinder pressure 93 (teacher signal) decreases, while the coherence between a second signal Sse, obtained by applying an ISTFT to the spectrogram of the extracted knocking sound 91a (extracted signal), and the observed knocking in-cylinder pressure 93 increases. As a result, the signal processing device 10 can remove sound originating from the in-cylinder pressure from the noise 91b.
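One way to read this training goal, offered only as a sketch, is as a single scalar objective to be minimized. The averaging over a set of frequency bins F, the symbols L and p (the teacher signal), and the equal weighting of the two terms are assumptions for illustration and are not stated in the source:

$$L(W) = \frac{1}{|F|}\sum_{f \in F} \gamma^2_{S_{fi},\,p}(f) \;-\; \frac{1}{|F|}\sum_{f \in F} \gamma^2_{S_{se},\,p}(f)$$

Here $\gamma^2_{a,b}(f)$ denotes the coherence between signals a and b at frequency f. Lowering L pushes the coherence between the noise and the teacher signal down and the coherence between the extracted signal and the teacher signal up, which is the behavior the paragraph describes.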

FIG. 3B shows the operation of the signal processing device 10 in the learning mode. The bold frames in FIG. 3B indicate the components that operate in the learning mode, and the bold arrows indicate the signals output in the learning mode.

As shown in FIG. 3B, in the learning mode the signal processing device 10 connects the switch 14 to the learning mode connection M1 and outputs the spectrogram of the near-engine sound 90 and the observed knocking in-cylinder pressure 93 stored in the signal storage unit 13 to the noise-reduced input physical quantity generation unit 15 and the selection unit 16.

In response, the noise-reduced input physical quantity generation unit 15 uses the processing mask β to generate the noise-reduced input physical quantity 90A from the near-engine sound 90 and supplies it to the selection unit 16. The selection unit 16 selects one or both of the near-engine sound 90 and the noise-reduced input physical quantity 90A generated by the noise-reduced input physical quantity generation unit 15, and supplies the selection to the learning unit 21. The learning unit 21 learns the weight W of the neural network using whichever of the near-engine sound 90 and the noise-reduced input physical quantity 90A the selection unit 16 has selected. The learning unit 21 also learns the transfer function H that converts the extracted knocking sound 91a, extracted with the generated noise removal mask α, into the estimated knocking in-cylinder pressure 92 of the engine 1 at the time knocking occurs. Each time a new input selected by the selection unit 16 from the near-engine sound 90 and the noise-reduced input physical quantity 90A arrives, the learning unit 21 generates the noise removal mask α corresponding to that input.

The learning unit 21 then stores the learned parameters (the weight W of the neural network that generates the noise removal mask α, and the transfer function H) in the learned parameter storage unit 22. The learning unit 21 also stores the observed knocking in-cylinder pressure 93 in the teacher signal storage unit 26a and the estimated knocking in-cylinder pressure 92 in the estimated signal storage unit 26b.

<Operation in threshold calculation mode>
The operation of the signal processing device 10 in the threshold calculation mode will be described with reference to FIGS. 4A and 4B. FIG. 4A is an explanatory diagram of the threshold calculation mode. FIG. 4B is an explanatory diagram of the operation of the signal processing device 10 in the threshold calculation mode.

As shown in FIG. 4B, in the threshold calculation mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the first estimation unit 23 and the threshold calculation unit 24 of the learning processing unit 20 operate to calculate the threshold T used for the threshold determination of the extracted knocking sound 91a of the engine 1. The bold frames in FIG. 4B indicate the components that operate in the threshold calculation mode, and the bold arrows indicate the signals output in that mode.

As shown in FIG. 4B, in the threshold calculation mode the signal processing device 10 connects the switch 14 to the threshold calculation mode connection M2 and outputs the spectrogram of the near-engine sound 90 stored in the signal storage unit 13 to the first estimation unit 23.

In response, the first estimation unit 23 obtains, from the learned parameter storage unit 22, the weight W of the neural network that generates the noise removal mask α. The first estimation unit 23 then extracts the extracted knocking sound 91a of the engine 1 from the near-engine sound 90 using the noise removal mask α generated by the neural network 94 (FIG. 4A). FIG. 4A outlines the operation of the first estimation unit 23 at this time. After that, the first estimation unit 23 outputs the extracted knocking sound 91a to the threshold calculation unit 24.

In response, the threshold calculation unit 24 sums the absolute values of the spectrogram of the extracted knocking sound 91a for each cycle of the engine 1 and adds a preset margin to calculate the threshold T. The threshold calculation unit 24 then stores the threshold T in the threshold storage unit 25.
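A minimal sketch of this calculation follows. The source does not say how the per-cycle sums are combined into one value before the margin is added, so taking the largest per-cycle sum is an assumption made here purely for illustration; the names `cycle_sum` and `compute_threshold` and the data layout are likewise hypothetical.

```python
def cycle_sum(cycle_spectrogram):
    """Sum of |bin| over every frame and frequency bin of one engine cycle."""
    return sum(abs(b) for frame in cycle_spectrogram for b in frame)


def compute_threshold(cycles, margin):
    """Largest per-cycle sum plus a preset margin.

    Using the maximum over cycles is an assumption, not taken from the source.
    """
    return max(cycle_sum(c) for c in cycles) + margin


cycles = [
    [[3 + 4j]],           # cycle 1: one frame, one bin, magnitude 5.0
    [[1 + 0j, 0 + 1j]],   # cycle 2: one frame, two bins, magnitudes 1.0 and 1.0
]
threshold_T = compute_threshold(cycles, margin=0.5)
print(threshold_T)
```

With these inputs the per-cycle sums are 5.0 and 2.0, so the sketch yields a threshold of 5.5.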

<Operation in determination mode>
The operation of the signal processing device 10 in the determination mode will be described with reference to FIGS. 5A and 5B. FIG. 5A is an explanatory diagram of the determination mode. FIG. 5B is an explanatory diagram of the operation of the signal processing device 10 in the determination mode.

As shown in FIG. 5B, in the determination mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the second estimation unit 31 (extracted signal estimation unit) and the threshold determination unit 32 of the determination processing unit 30 operate to determine whether knocking is present. The bold frames in FIG. 5B indicate the components that operate in the determination mode, and the bold arrows indicate the signals output in that mode.

As shown in FIG. 5B, in the determination mode the signal processing device 10 connects the switch 14 to the determination mode connection M3 and outputs the spectrogram of the near-engine sound 90 stored in the signal storage unit 13 to the second estimation unit 31.

In response, the second estimation unit 31 obtains, from the learned parameter storage unit 22, the weight W of the neural network that generates the noise removal mask α. The second estimation unit 31 then extracts the extracted knocking sound 91a from the near-engine sound 90 using the noise removal mask α generated by the neural network 94 (FIG. 5A). FIG. 5A outlines the operation of the second estimation unit 31 at this time. After that, the second estimation unit 31 outputs the extracted knocking sound 91a to the threshold determination unit 32.

In response, the threshold determination unit 32 obtains the threshold T from the threshold storage unit 25 and compares the sum of the absolute values of the extracted knocking sound 91a with the threshold T to determine whether knocking is present. The threshold determination unit 32 then outputs, for example, the determination result and a waveform diagram showing the relationship between the extracted knocking sound 91a and the threshold T to the monitor 7 for display.

<Operation in separation mode>
The operation of the signal processing device 10 in the separation mode will be described with reference to FIGS. 6A and 6B. FIG. 6A is an explanatory diagram of the separation mode. FIG. 6B is an explanatory diagram of the operation of the signal processing device 10 in the separation mode.

As shown in FIG. 6B, in the separation mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the separation unit 40 operates to separate the near-engine sound 90 into the extracted knocking sound 91a and the noise 91b. The bold frames in FIG. 6B indicate the components that operate in the separation mode, and the bold arrows indicate the signals output in that mode.

As shown in FIG. 6B, in the separation mode the signal processing device 10 connects the switch 14 to the separation mode connection M4 and outputs the spectrogram of the near-engine sound 90 stored in the signal storage unit 13 to the separation unit 40.

In response, the separation unit 40 obtains, from the learned parameter storage unit 22, the weight W of the neural network that generates the noise removal mask α. Using the noise removal mask α generated by the neural network 94 (FIG. 6A), the separation unit 40 separates the near-engine sound 90 into the extracted knocking sound 91a and the noise 91b. FIG. 6A outlines the operation of the separation unit 40 at this time. After that, the separation unit 40 stores the extracted knocking sound 91a (extracted signal) in the extracted signal storage unit 26d and the noise 91b in the noise component storage unit 26c.

<Operation in sensory test mode>
The operation of the signal processing device 10 in the sensory test mode will be described with reference to FIGS. 7A and 7B. FIG. 7A is an explanatory diagram of the sensory test mode. FIG. 7B is an explanatory diagram of the operation of the signal processing device 10 in the sensory test mode. The sensory test mode is a mode for calculating the threshold T based on the examiner's hearing.

As shown in FIG. 7B, in the sensory test mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the signal adjustment unit 51 and the signal output unit 52 of the signal synthesis unit 50, together with the first estimation unit 23 and the threshold calculation unit 24 of the learning processing unit 20, operate to calculate the threshold T based on the examiner's hearing. The bold frames in FIG. 7B indicate the components that operate in the sensory test mode, and the bold arrows indicate the signals output in that mode.

As shown in FIG. 7B, in the sensory test mode the signal processing device 10 connects the switch 14 to the sensory test mode connection M5 and thereby outputs to the signal adjustment unit 51 an execution instruction signal for the sensory test mode that was generated in advance by a signal generation unit (not shown) and stored in the signal storage unit 13. Alternatively, the execution instruction signal may be generated by the signal generation unit (not shown) when the switch 14 is connected to the sensory test mode connection M5, and output from the signal generation unit to the signal adjustment unit 51 without passing through the signal storage unit 13.

In response, the signal adjustment unit 51 receives level designation information from the level designation unit 9 and obtains the extracted knocking sound 91a stored in the extracted signal storage unit 26d and the noise 91b stored in the noise component storage unit 26c. The signal adjustment unit 51 then generates a processed sound 91c from the extracted knocking sound 91a and the noise 91b based on the level designation information. Specifically, it raises or lowers the level (magnitude) of the extracted knocking sound 91a by the amount designated in the level designation information, combines the result with the noise 91b to generate the processed sound 91c, and outputs the processed sound 91c to the signal output unit 52. The signal output unit 52 outputs the processed sound 91c to the headphones 8 (sound emitting unit) to be emitted.
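The level change and mixing step can be sketched as follows. The source only says that the level is raised or lowered by a designated amount; expressing that amount in decibels and mixing by simple sample-wise addition are assumptions of this example, as is the function name `make_processed_sound`.

```python
def make_processed_sound(knock, noise, level_db):
    """Scale the extracted knocking sound by level_db decibels, then add the noise.

    A positive level_db raises the knocking sound; a negative value lowers it.
    The decibel convention is an assumption, not taken from the source.
    """
    gain = 10 ** (level_db / 20)
    return [gain * k + n for k, n in zip(knock, noise)]


knock = [0.0, 1.0, -1.0]   # extracted knocking sound samples
noise = [0.5, 0.5, 0.5]    # separated noise samples
processed = make_processed_sound(knock, noise, level_db=20.0)
print(processed)
```

A level of +20 dB corresponds to a gain of 10, and a level of 0 dB returns the plain sum of the two parts, that is, the original near-engine sound.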

The signal adjustment unit 51 also outputs the processed sound 91c to the first estimation unit 23. The first estimation unit 23 obtains, from the learned parameter storage unit 22, the weight W of the neural network that generates the noise removal mask α. The first estimation unit 23 then extracts the extracted knocking sound 91a from the processed sound 91c using the noise removal mask α generated by the neural network 94 (FIG. 7A). FIG. 7A outlines the operation of the first estimation unit 23 at this time. After that, the first estimation unit 23 outputs the extracted knocking sound 91a to the threshold calculation unit 24. The threshold calculation unit 24 sums the absolute values of the spectrogram of the extracted knocking sound 91a for each cycle of the engine 1 and sets an arbitrary value equal to or less than the sum as the threshold T. The threshold calculation unit 24 then stores the threshold T in the threshold storage unit 25.

The conventional techniques described in Patent Documents 2 and 3 do not include, in the objective function used when training the neural network, a function that measures the degree of sound separation. As a result, knocking sound (target sound) could be mixed into the noise (sound other than knocking sound, that is, background sound) removed from the near-engine sound (input physical quantity). In other words, because these techniques only minimize the squared error between the estimated in-cylinder pressure and the measured in-cylinder pressure (teacher data) during training, they do not monitor whether the "noise-removed engine sound" is one from which only the noise has been properly removed. For example, the technique of Patent Document 3 extracts the "noise-removed engine sound", that is, the knocking sound (corresponding to the "extracted signal" of the present embodiment), by multiplying the near-engine sound by a noise removal mask α that does not take into account the phase component related to the near-engine sound (input physical quantity). Because that technique does not use a function that measures the degree of sound separation, knocking sound (target sound) that should not be removed could be removed from the near-engine sound together with the noise. Consequently, the technique of Patent Document 3 could degrade the performance of evaluating whether knocking is present.

In contrast, the signal processing device 10 according to the present embodiment is configured to evaluate, during learning, whether knocking sound (target sound) is mixed into the noise removed from the near-engine sound (input physical quantity). To this end, the signal processing device 10 learns so that the relationship between the noise component (noise 91b) and the teacher signal (observed knocking in-cylinder pressure 93) becomes weaker. Specifically, as shown in FIG. 3A, it learns so that the coherence between the first signal Sfi, obtained by applying an inverse short-time Fourier transform (ISTFT) to the spectrogram of the noise component (noise 91b), and the teacher signal decreases. The signal processing device 10 also learns so that the relationship between the extracted knocking sound 91a (extracted signal) and the teacher signal (observed knocking in-cylinder pressure 93) becomes stronger. Specifically, as shown in FIG. 3A, it learns so that the coherence between the second signal Sse, obtained by applying an ISTFT to the spectrogram of the extracted knocking sound 91a (extracted signal), and the teacher signal increases. The signal processing device 10 according to the present embodiment can therefore evaluate the presence or absence of knocking more accurately than the conventional techniques described in Patent Documents 2 and 3.

FIG. 8A shows a spectrogram F11 of the noise 91b (noise component), a signal waveform F12 of the first signal Sfi (FIG. 3A) obtained by applying an inverse short-time Fourier transform (ISTFT) to the spectrogram F11, and a signal waveform F93 of the observed knocking in-cylinder pressure 93 (teacher signal). FIG. 8A also shows the coherence F13 between the signal waveform F12 and the signal waveform F93, which is minimized in the course of learning.

FIG. 8B shows a spectrogram F21 of the extracted knocking sound 91a (extracted signal), a signal waveform F22 of the second signal Sse (FIG. 3A) obtained by applying an inverse short-time Fourier transform (ISTFT) to the spectrogram F21, and the signal waveform F93 of the observed knocking in-cylinder pressure 93 (teacher signal). FIG. 8B also shows the coherence F23 between the signal waveform F22 and the signal waveform F93, which is maximized in the course of learning.

Coherence is defined by Equation (1) below. The coherence function γ² indicates the degree of relatedness between the input and the output of a system. As shown in Equation (1), it is the squared magnitude of the cross spectrum divided by the product of the power spectra of the measured input and the system output.

$$\gamma^2(f) = \frac{|W_{xy}(f)|^2}{W_{xx}(f)\,W_{yy}(f)} \qquad (1)$$

Here, Wxy is the averaged cross spectrum, Wxx is the averaged power spectrum of x, and Wyy is the averaged power spectrum of y. The coherence function γ² takes values from 0 to 1. When γ²(f) is 1, the output of the system at frequency f is attributable entirely to the measured input. When γ²(f) is 0, the output of the system at frequency f is completely unrelated to the measured input. When 0 < γ²(f) < 1, the output is considered to be affected by signals unrelated to the measurement, noise generated inside the system, nonlinearity of the system, or the like.
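The coherence function described above can be sketched numerically as follows. This is a minimal illustration, not the embodiment's implementation: the segment length, hop size, Hann window, and the function names `averaged_spectra` and `coherence` are assumptions introduced here, and the averaging is a simple Welch-style mean over overlapping segments.

```python
import numpy as np


def averaged_spectra(x, y, seg_len=256, hop=128):
    """Return segment-averaged Wxx, Wyy and Wxy (one value per frequency bin)."""
    window = np.hanning(seg_len)
    starts = range(0, len(x) - seg_len + 1, hop)
    X = np.array([np.fft.rfft(window * x[s:s + seg_len]) for s in starts])
    Y = np.array([np.fft.rfft(window * y[s:s + seg_len]) for s in starts])
    Wxx = np.mean(np.abs(X) ** 2, axis=0)   # averaged power spectrum of x
    Wyy = np.mean(np.abs(Y) ** 2, axis=0)   # averaged power spectrum of y
    Wxy = np.mean(np.conj(X) * Y, axis=0)   # averaged cross spectrum
    return Wxx, Wyy, Wxy


def coherence(x, y, seg_len=256, hop=128, eps=1e-12):
    """Coherence function gamma^2(f) = |Wxy|^2 / (Wxx * Wyy)."""
    Wxx, Wyy, Wxy = averaged_spectra(x, y, seg_len, hop)
    return np.abs(Wxy) ** 2 / (Wxx * Wyy + eps)   # eps only guards against 0/0


rng = np.random.default_rng(0)
x = rng.standard_normal(8192)
gamma2_related = coherence(x, 2.0 * x)                      # output fully caused by input
gamma2_unrelated = coherence(x, rng.standard_normal(8192))  # independent signals
```

With enough averaged segments, `gamma2_related` stays close to 1 in every bin and `gamma2_unrelated` stays close to 0, which matches the two limiting cases described in the text.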

Coherent output power may also be used to evaluate whether the knocking sound (target sound) has leaked into the noise removed from the near-engine sound (input physical quantity). The learning unit 21 learns the weight W of the neural network that generates the noise removal mask α, and the transfer function H, such that the coherence or coherent output power between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by performing an inverse short-time Fourier transform (ISTFT) on the noise component (noise 91b) becomes small, and such that the coherence or coherent output power between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by performing an ISTFT on the extracted signal (extracted knocking sound 91a) becomes large. This allows the signal processing device 10 to further improve its performance in evaluating the presence or absence of knocking.

Regarding coherence and coherent output power, when the evaluation targets are the two pairs "I" and "II" below and the evaluation methods are the two methods "A" and "B" below, the signal processing device 10 supports the following four evaluation patterns.
(Evaluation targets)
I: Noise component and teacher signal
II: Extracted signal and teacher signal
(Evaluation methods)
A: Coherence
B: Coherent output power
(Evaluation patterns)
1: I is evaluated with A, and II is evaluated with A.
2: I is evaluated with B, and II is evaluated with B.
3: I is evaluated with A, and II is evaluated with B.
4: I is evaluated with B, and II is evaluated with A.

The coherent output power is defined as the product of the coherence function γ² and the averaged power spectrum Wyy of y. It indicates the portion of the power in the system output that is attributable to the input.
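As a small worked example of this definition, the sketch below computes the coherent output power from already-averaged spectra. The function name `coherent_output_power` and the numerical values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np


def coherent_output_power(Wxx, Wyy, Wxy, eps=1e-12):
    """Coherent output power: gamma^2(f) * Wyy(f), per frequency bin."""
    gamma2 = np.abs(Wxy) ** 2 / (Wxx * Wyy + eps)   # coherence function
    return gamma2 * Wyy


# Two frequency bins with made-up averaged spectra.
Wxx = np.array([4.0, 1.0])
Wyy = np.array([9.0, 4.0])
Wxy = np.array([6.0 + 0.0j, 1.0 + 0.0j])

cop = coherent_output_power(Wxx, Wyy, Wxy)
```

In the first bin the coherence is 36 / (4 x 9) = 1, so all of the output power 9 is attributed to the input. In the second bin the coherence is 1 / (1 x 4) = 0.25, so only 1 of the output power 4 is attributed to the input.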

As shown in FIG. 3A, the learning unit 21 also uses the neural network 94 to learn the weight W of the neural network that generates the noise removal mask α, and the transfer function H, such that the error between the estimated knocking in-cylinder pressure 92 (estimated signal) and the spectrogram obtained by performing a short-time Fourier transform (STFT) on the observed knocking in-cylinder pressure 93 (teacher signal) is minimized. This allows the signal processing device 10 to improve the accuracy with which the weight W of the neural network that generates the noise removal mask α and the transfer function H are learned.

The learning unit 21 may also use the neural network 94 to learn the weight W of the neural network that generates the noise removal mask α, and the transfer function H, such that the error between the observed knocking in-cylinder pressure 93 (teacher signal) and the signal waveform of the estimated knocking in-cylinder pressure 92 (estimated signal) is minimized. Here, the signal waveform of the estimated knocking in-cylinder pressure 92 is obtained by performing an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) on the extracted signal (extracted knocking sound 91a), multiplying the result by the transfer function H, and performing an inverse fast Fourier transform (IFFT). This allows the signal processing device 10 to further improve the accuracy with which the weight W of the neural network that generates the noise removal mask α and the transfer function H are learned.

FIG. 8C shows an example of converting the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal). FIG. 8C shows a spectrogram F31 of the extracted knocking sound 91a (extracted signal), a signal waveform F32 obtained by performing an inverse short-time Fourier transform (ISTFT) on the spectrogram F31, and a spectrum F33 obtained by further performing a fast Fourier transform (FFT). FIG. 8C also shows a spectrum F35 of the estimated knocking in-cylinder pressure obtained by multiplying the spectrum F33 by the transfer function H. In FIG. 8C, a frequency response characteristic F34 indicating the correspondence between frequencies and coefficients is shown as an example of the transfer function H. FIG. 8C further shows a signal waveform F36 obtained by performing an inverse fast Fourier transform (IFFT) on the spectrum F35, and a spectrogram F37 of the estimated knocking in-cylinder pressure 92 (estimated signal) obtained by further performing a short-time Fourier transform (STFT).
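The middle of the FIG. 8C chain (waveform F32, spectrum F33, multiplication by H, spectrum F35, waveform F36) can be sketched as below. The ISTFT and STFT stages at either end are omitted, and representing H as one complex coefficient per one-sided FFT bin is an assumption made here; the function name `estimate_pressure_waveform` is hypothetical.

```python
import numpy as np


def estimate_pressure_waveform(knock_waveform, H):
    """Apply a per-frequency transfer function H to a real waveform.

    knock_waveform: real 1-D array (corresponds to waveform F32).
    H: complex or real array with len(knock_waveform) // 2 + 1 coefficients.
    """
    knock_waveform = np.asarray(knock_waveform, dtype=float)
    H = np.asarray(H)
    spectrum = np.fft.rfft(knock_waveform)             # F33: one-sided spectrum
    if H.shape != spectrum.shape:
        raise ValueError("H must have one coefficient per rfft bin")
    pressure_spectrum = H * spectrum                    # F35: scaled and phase-shifted
    return np.fft.irfft(pressure_spectrum, n=len(knock_waveform))  # F36


# A flat H of 2.0 simply doubles the waveform.
waveform = np.sin(2 * np.pi * np.arange(64) / 16)
doubled = estimate_pressure_waveform(waveform, np.full(33, 2.0))
```

Because H is complex in general, it can change both the amplitude and the phase of each frequency component, which is relevant to the phase discussion later in the description.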

In the present embodiment, the learning unit 21 learns the weight W of the neural network that generates the noise removal mask α by means of a convolutional neural network (CNN). The learning unit 21 further learns, as the transfer function H, the weights by which the spectrum F33 of the extracted signal is multiplied. The learning unit 21 is not limited to a CNN; it may adopt another configuration such as a recurrent neural network (RNN) or a Transformer, or may use another machine learning technique.

As shown in FIG. 9, a U-Net 95 is one example of the neural network 94 that learns the weight W of the neural network that generates the noise removal mask α. The U-Net 95 is a type of Encoder-Decoder model and is a deep learning technique used for image recognition and sound separation. In sound separation, the U-Net 95 performs convolution (with a stride of 2 or more) in a downward path 96 (Encoder) and extracts features of the sound as the layers 97 become deeper. In an upward path 98 (Decoder), the U-Net 95 generates the noise removal mask α by performing deconvolution and upsampling (expansion) on the extracted sound features. Up to this point, the configuration is that of a general Encoder-Decoder model. In addition, the U-Net 95 merges the output 99 from each convolutional layer of the Encoder into the corresponding convolutional layer of the Decoder. As a result, the U-Net 95 can generate a more accurate noise removal mask α than a general Encoder-Decoder model. The signal processing device 10 generates the noise removal mask α and multiplies the near-engine sound 90 by it to obtain the extracted signal (extracted knocking sound 91a). The noise removal mask α is a mask over the space spanned by frequency and time.
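Applying a time-frequency mask can be sketched as an element-wise product with the spectrogram. The sketch below assumes that α takes values between 0 and 1 and that the noise component is the complementary part (1 - α) of the input; neither assumption is stated in this passage, and the function name `apply_mask` is hypothetical. The U-Net that would produce α is not modelled here.

```python
import numpy as np


def apply_mask(spectrogram, alpha):
    """Split a (frequency x time) spectrogram with a mask alpha of the same shape."""
    spectrogram = np.asarray(spectrogram)
    alpha = np.clip(np.asarray(alpha, dtype=float), 0.0, 1.0)  # assumed mask range
    if alpha.shape != spectrogram.shape:
        raise ValueError("mask and spectrogram must have the same shape")
    extracted = alpha * spectrogram           # extracted signal (knocking sound)
    noise = (1.0 - alpha) * spectrogram       # assumed: remainder is the noise component
    return extracted, noise


# Toy spectrogram with 2 frequency bins and 3 time frames.
near_engine = np.array([[1 + 1j, 2 + 0j, 0 + 3j],
                        [4 + 0j, 0 + 0j, 1 - 1j]])
mask = np.array([[1.0, 0.5, 0.0],
                 [0.0, 0.5, 1.0]])
knock_spec, noise_spec = apply_mask(near_engine, mask)
```

Under these assumptions the two parts always add back up to the input, so no energy is discarded by the separation itself.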

Because the learning unit 21 learns, with the U-Net 95, the weight W of the neural network that generates the noise removal mask α, the trained neural network generates a noise removal mask α appropriate to the near-engine sound of the engine 1 that is input. The signal processing device 10 can therefore accurately estimate the knocking in-cylinder pressure of the engine 1 and satisfactorily extract the knocking sound of the engine 1.

<Influence of Phase Components>
The influence of phase components will be described with reference to FIGS. 10A and 10B. FIG. 10A is an explanatory diagram of a calculation example in which the phase components of pressure, vibration, sound, and the like are not considered. FIG. 10B is an explanatory diagram of a calculation example in which the phase components of pressure, vibration, sound, and the like are considered. In FIGS. 10A and 10B, the arrows inside the circles represent the phase components contained in pressure, vibration, sound, and the like.

In both of the examples shown in FIGS. 10A and 10B, the following condition is assumed.
・The system is a linear time-invariant system (a system in which the transfer characteristic between input and output is linear and does not change with time).

In the example of FIG. 10A, when the engine 1 is driven, the observed knocking in-cylinder pressure 93 is observed by the in-cylinder pressure sensor 5 (FIG. 1). The knocking sound 82 is emitted as the observed knocking in-cylinder pressure 93 propagates through the engine 1. The knocking sound 82 then combines with mechanical noise 83 (noise component), which is background sound, and the sound pressure sensor 4 observes the near-engine sound 90 (input physical quantity).

FIG. 10A shows an example of a schematic calculation of the observed knocking in-cylinder pressure 93, the knocking vibration 81, the knocking sound 82, the mechanical noise 83, and the near-engine sound 90, as follows, on the assumption that the phase φ does not change.
Observed knocking in-cylinder pressure 93 = A sin(Ωt + φ)
Knocking vibration 81 = AB sin(Ωt + φ)
Knocking sound 82 = ABC sin(Ωt + φ)
Mechanical noise 83 = N sin(Ωt + φN)
Near-engine sound 90 = (ABC + N) sin(Ωt + φ)

In contrast, FIG. 10B is an example that takes phase into account. It shows an example of a schematic calculation of the observed knocking in-cylinder pressure 93, the knocking vibration 81, the knocking sound 82, the mechanical noise 83, and the near-engine sound 90, as follows, taking into account that the phase φ changes each time the dimension (unit) changes.
Observed knocking in-cylinder pressure 93 = A sin(Ωt + φA)
Knocking vibration 81 = AB sin(Ωt + φA + φB)
Knocking sound 82 = ABC sin(Ωt + φA + φB + φC)
Mechanical noise 83 = N sin(Ωt + φN)
Near-engine sound 90 = ABC sin(Ωt + φA + φB + φC) + N sin(Ωt + φN)
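The difference between the two calculations can be checked numerically by treating each sinusoid as a phasor. The gains and phase shifts below are hypothetical values chosen only for illustration; the point is that the FIG. 10B sum generally has a smaller amplitude than the ABC + N of FIG. 10A, and a phase that equals neither input phase.

```python
import cmath
import math


def phasor(amplitude, phase):
    """Complex phasor representing amplitude * sin(Omega * t + phase)."""
    return cmath.rect(amplitude, phase)


# Hypothetical gains and phase shifts (radians), for illustration only.
A, B, C, N = 1.0, 0.5, 0.8, 0.3
phi_A, phi_B, phi_C, phi_N = 0.2, 0.4, 0.6, 2.0

# FIG. 10A: phase changes are ignored, so the amplitudes are simply added.
amplitude_10a = A * B * C + N

# FIG. 10B: each stage adds its own phase shift, so the terms are summed as phasors.
knock = phasor(A * B * C, phi_A + phi_B + phi_C)
noise = phasor(N, phi_N)
total = knock + noise
amplitude_10b = abs(total)
phase_10b = cmath.phase(total)


def near_engine_sound_10b(omega_t):
    """Direct evaluation of the FIG. 10B expression at the angle Omega * t."""
    return (A * B * C * math.sin(omega_t + phi_A + phi_B + phi_C)
            + N * math.sin(omega_t + phi_N))
```

The single sinusoid `amplitude_10b * sin(Omega * t + phase_10b)` reproduces `near_engine_sound_10b` at every angle, whereas the amplitude of FIG. 10A overestimates it whenever the knocking sound and the mechanical noise are not in phase.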

Because the example shown in FIG. 10B takes into account the change in phase that occurs when pressure becomes vibration and is then radiated as sound, it allows the signal to be analyzed better than the example shown in FIG. 10A. Accordingly, as shown in FIG. 3A, the signal processing device 10 is configured to take phase into account when learning the weight W of the neural network that generates the noise removal mask α and the transfer function H. Such a signal processing device 10 can satisfactorily separate the near-engine sound 90 (input physical quantity) into the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component). In other words, the signal processing device 10 can separate the near-engine sound 90 into the extracted knocking sound 91a and the noise 91b such that the knocking sound (target sound) does not leak into the noise. Such a signal processing device 10 can evaluate the presence or absence of knocking better than the conventional techniques described in Patent Documents 2 and 3. It also enables a good sensory test to be performed.

(Overview of the Sensory Test)
An overview of the sensory test will be described with reference to FIGS. 11A to 12C. FIGS. 11A to 12C are explanatory diagrams of the signal processing in the sensory test.

As shown in FIG. 2, the pressure and sound signals are collected by the data collection device 6 and output to the signal processing device 10. In the signal processing device 10, the signal extraction unit 11 cuts out each signal, the spectrogram calculation unit 12 calculates a spectrogram, and the spectrogram is stored in the signal storage unit 13.

The signal processing device 10 according to the present embodiment can execute the separation mode and the sensory test mode described above. In the separation mode, the signal processing device 10 separates the near-engine sound 90 (input physical quantity) into the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component). In the sensory test mode, the signal processing device 10 changes the level of the extracted knocking sound 91a to generate a level-changed extracted signal (level-changed knocking sound 91aa (FIG. 12A)) and combines it with the noise 91b to generate a processed sound 91c (FIG. 12C).

In the sensory test mode, the inspector wears headphones 8 (FIG. 1). The inspector, or a separate operator who specifies the level, announces the acceptable level to those nearby and operates the level designation unit 9 to input level designation information to the signal processing device 10. The signal processing device 10 changes the level of the extracted knocking sound 91a based on the input level designation information.

FIG. 11A shows the near-engine sound 90 (input physical quantity), and FIGS. 11B and 11C show the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component), respectively, separated from the near-engine sound 90 in the separation mode. FIG. 12A shows the level-changed knocking sound 91aa (level-changed extracted signal) generated in the sensory test mode by raising the level of the extracted knocking sound 91a (extracted signal) by 3 dB. FIG. 12B shows the noise 91b that is combined with the level-changed knocking sound 91aa. FIG. 12C shows the processed sound 91c obtained by combining the level-changed knocking sound 91aa and the noise 91b. By generating the processed sound 91c, the signal processing device 10 can make the target sound to be inspected (knocking sound) easier to distinguish. Such a signal processing device 10 allows the inspector to determine the presence or absence of the target sound (knocking sound) with high accuracy and can thus improve inspection performance. The signal processing device 10 also computes the sum of the absolute values of the spectrogram of a processed sound 91c that the inspector has judged to be unacceptable, and writes an arbitrary value less than or equal to that sum into the threshold storage unit 25. The threshold T thereby becomes a value that matches the inspector's perception.
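The level change, the synthesis of the processed sound, and the threshold candidate can be sketched as follows. This sketch assumes that the level in dB refers to amplitude (a factor of 10^(dB/20)) and that "any value less than or equal to the sum" is expressed through a hypothetical `margin` between 0 and 1; the function names are also hypothetical.

```python
import numpy as np


def make_processed_sound(knock, noise, level_db):
    """Scale the extracted knocking sound by level_db and add the noise back."""
    gain = 10.0 ** (level_db / 20.0)   # assumed amplitude gain; +3 dB is about 1.41
    return gain * np.asarray(knock) + np.asarray(noise)


def threshold_from_unacceptable(spectrogram, margin=1.0):
    """Sum of |spectrogram| for a sound judged unacceptable, scaled by margin."""
    if not 0.0 <= margin <= 1.0:
        raise ValueError("margin must lie in [0, 1] so the result does not exceed the sum")
    return margin * float(np.sum(np.abs(spectrogram)))


# Toy waveforms standing in for the extracted knocking sound 91a and the noise 91b.
knock_91a = np.array([0.0, 1.0, -1.0, 0.5])
noise_91b = np.array([0.2, -0.1, 0.1, 0.0])
processed_91c = make_processed_sound(knock_91a, noise_91b, level_db=3.0)
```

Because the operation is linear, the same scaling and addition could equally be applied to complex spectrograms rather than waveforms.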

Depending on how it is operated, the signal processing device 10 may, in the sensory test mode, multiply the level-changed extracted signal (level-changed knocking sound 91aa) by a transfer function He (not shown) from the measurement target to the listener's ear position, and then combine the result with the noise component at the listener's ear position to generate the processed sound.

<Operation of the Signal Processing Device (Estimation Device)>
The operation of the signal processing device 10 will be described with reference to FIGS. 13 to 18. FIG. 13 is a flowchart showing the data collection process of the signal processing device 10. FIG. 14A is a flowchart showing the learning process of the signal processing device 10. FIG. 14B is a flowchart showing a subroutine of the learning process. In the learning mode, the signal processing device 10 performs the data collection process of FIG. 13 and then the learning process of FIGS. 14A and 14B. FIG. 14C is a flowchart showing a modification of the subroutine of the learning process. FIG. 15A is a flowchart showing a threshold calculation process of the signal processing device 10. In the threshold calculation mode, the signal processing device 10 performs the data collection process of FIG. 13 and then the threshold calculation process of FIG. 15A. FIG. 15B is a flowchart showing a threshold calculation process. When the separation process has been performed in the separation mode and when the sensory test process has been performed in the sensory test mode, the signal processing device 10 performs the threshold calculation process of FIG. 15B. FIG. 16 is a flowchart showing the determination process of the signal processing device 10. In the determination mode, the signal processing device 10 performs the data collection process of FIG. 13 and then the determination process of FIG. 16. However, the signal processing device 10 may instead perform the determination process of FIG. 16 in real time while performing the data collection process of FIG. 13. FIG. 17 is a flowchart showing the separation process of the signal processing device 10. In the separation mode, the signal processing device 10 performs the data collection process of FIG. 13 and then the separation process of FIG. 17. FIG. 18 is a flowchart showing the sensory test process of the signal processing device 10. In the sensory test mode, the signal processing device 10 performs the sensory test process of FIG. 18 before performing the threshold calculation process of FIG. 15B.

(Data Collection Process)
The data collection process will be described with reference to FIG. 13.
As shown in FIG. 13, in step S20, the data collection device 6 inputs the sound pressure signal, the in-cylinder pressure signal (teacher signal), and the angle information to the signal extraction unit 11. In the separation mode, the threshold calculation mode, or the determination mode, the in-cylinder pressure signal need not be input in step S20.
In step S21, the signal extraction unit 11 calculates the combustion stroke timing of each cylinder from the angle information.
In step S22, the signal extraction unit 11 cuts out the sound pressure signal and the in-cylinder pressure signal near TDC in accordance with the combustion stroke timing calculated in step S21.
In step S23, the spectrogram calculation unit 12 performs a short-time Fourier transform (STFT) on the sound pressure signal cut out in step S22 to calculate the spectrogram of the sound pressure signal.
In step S24, the spectrogram calculation unit 12 writes the spectrogram of the sound pressure signal to the signal storage unit 13. If an in-cylinder pressure signal is present, the signal extraction unit 11 writes the in-cylinder pressure signal (teacher signal) to the signal storage unit 13.
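Steps S22 and S23 can be sketched as a window cut around a TDC sample index followed by a short-time Fourier transform. How the TDC index is derived from the angle information is not modelled here, and the names `cut_around_tdc` and `stft`, as well as the frame length, hop size, and Hann window, are assumptions for illustration.

```python
import numpy as np


def cut_around_tdc(signal, tdc_index, half_width):
    """Step S22 (sketch): keep the samples within half_width of the TDC index."""
    start = max(tdc_index - half_width, 0)
    stop = min(tdc_index + half_width, len(signal))
    return np.asarray(signal)[start:stop]


def stft(segment, frame_len=64, hop=16):
    """Step S23 (sketch): complex spectrogram with shape (frequency, time)."""
    window = np.hanning(frame_len)
    starts = range(0, len(segment) - frame_len + 1, hop)
    frames = [np.fft.rfft(window * segment[s:s + frame_len]) for s in starts]
    return np.array(frames).T


# Synthetic sound pressure signal: a tone with exactly 8 cycles per 64 samples.
n = np.arange(1000)
sound_pressure = np.sin(2 * np.pi * 8 * n / 64)
segment = cut_around_tdc(sound_pressure, tdc_index=500, half_width=128)
spectrogram = stft(segment)
```

The resulting array has one row per frequency bin and one column per frame, which is the frequency-time layout on which the noise removal mask α operates.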

(Learning Process)
The learning process executed in the learning mode will be described with reference to FIGS. 14A to 14C.
As shown in FIG. 14A, in step S30, the signal processing device 10 reads the spectrogram of the sound pressure signal and the in-cylinder pressure signal from the signal storage unit 13 and inputs them to the learning unit 21.
In step S31, the learning unit 21 uses the spectrogram of the sound pressure signal and the in-cylinder pressure signal input in step S30 to learn, with the neural network 94, the weight W of the neural network that generates the noise removal mask α and the transfer function H, taking into account the amplitude and phase components related to the input physical quantity. The learning unit 21 learns the weight W of the neural network that generates the noise removal mask α by means of a convolutional neural network (for example, a U-Net) and learns, as the transfer function H, the weights by which the spectrum of the extracted signal is multiplied.

In step S32, the learning unit 21 writes the learned weight W of the neural network that generates the noise removal mask α and the learned transfer function H to the learned parameter storage unit 22.

The process of step S31 is performed, for example, by the series of processes shown in FIG. 14B.
As shown in FIG. 14B, in step S31a, the learning processing unit 20 generates the noise removal mask α with the neural network 94.
In step S101, the learning unit 21 uses the noise removal mask α generated in step S31a to separate the input physical quantity (near-engine sound 90) into the noise component (noise 91b) and the extracted signal (extracted knocking sound 91a).
In step S102, the learning unit 21 converts the extracted signal (extracted knocking sound 91a) into the estimated signal (estimated knocking in-cylinder pressure 92) by multiplying it by the transfer function H.
In step S103, the learning unit 21 updates the weight W of the neural network that generates the noise removal mask α and the transfer function H. Specifically, the learning unit 21 updates the weight W and the transfer function H such that the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the first signal Sfi (FIG. 3A), obtained by performing an inverse short-time Fourier transform (ISTFT) on the spectrogram of the noise component (noise 91b), becomes small; such that the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the second signal Sse (FIG. 3A), obtained by performing an ISTFT on the spectrogram of the extracted signal (extracted knocking sound 91a), becomes large; and such that the error between the estimated signal (estimated knocking in-cylinder pressure 92) and the spectrogram obtained by performing a short-time Fourier transform (STFT) on the teacher signal (observed knocking in-cylinder pressure 93) becomes small.

In step S104, the learning unit 21 determines whether the coherence and the error have converged. If it determines that they have converged ("Yes"), it ends the process of step S31. If it determines that they have not converged ("No"), it performs the process of step S105.

In step S105, the learning unit 21 generates the noise-reduced input physical quantity 90A and shuffles the near-engine sound 90 and the noise-reduced input physical quantity 90A. After step S105, the learning unit 21 repeats the processes from step S31a onward. In step S104, the state in which the coherence and the error have converged is a state in which the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by performing an inverse short-time Fourier transform (ISTFT) on the noise component (noise 91b) is close to "0", the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by performing an ISTFT on the extracted signal (extracted knocking sound 91a) is close to "1", and the error of the estimated signal (estimated knocking in-cylinder pressure 92) with respect to the spectrogram obtained by performing a short-time Fourier transform (STFT) on the teacher signal (observed knocking in-cylinder pressure 93) has converged to a value close to "0".
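One way to express the three objectives of step S103 as a single quantity is sketched below. Combining them as a weighted sum, averaging the coherence over frequency, and using a mean squared error between magnitude spectrograms are all assumptions made for this illustration; the source does not state how the terms are combined. The sketch only evaluates the quantity with NumPy, whereas actual training would need a differentiable framework.

```python
import numpy as np


def mean_coherence(x, y, seg_len=128, hop=64, eps=1e-12):
    """Frequency-averaged coherence between two waveforms (Welch-style averaging)."""
    window = np.hanning(seg_len)
    starts = range(0, len(x) - seg_len + 1, hop)
    X = np.array([np.fft.rfft(window * x[s:s + seg_len]) for s in starts])
    Y = np.array([np.fft.rfft(window * y[s:s + seg_len]) for s in starts])
    Wxy = np.mean(np.conj(X) * Y, axis=0)
    Wxx = np.mean(np.abs(X) ** 2, axis=0)
    Wyy = np.mean(np.abs(Y) ** 2, axis=0)
    return float(np.mean(np.abs(Wxy) ** 2 / (Wxx * Wyy + eps)))


def training_loss(s_fi, s_se, teacher, est_spec, teacher_spec, weights=(1.0, 1.0, 1.0)):
    """Hypothetical combined loss: small when all three conditions of S103 hold."""
    noise_term = mean_coherence(s_fi, teacher)            # should approach 0
    signal_term = 1.0 - mean_coherence(s_se, teacher)     # coherence should approach 1
    error_term = float(np.mean(np.abs(est_spec - teacher_spec) ** 2))  # should approach 0
    w_noise, w_signal, w_error = weights
    return w_noise * noise_term + w_signal * signal_term + w_error * error_term


rng = np.random.default_rng(1)
teacher = rng.standard_normal(4096)        # stands in for the in-cylinder pressure 93
unrelated = rng.standard_normal(4096)      # stands in for well-separated noise
spec = np.abs(rng.standard_normal((8, 8)))
good = training_loss(unrelated, teacher, teacher, spec, spec)
bad = training_loss(teacher, unrelated, teacher, spec, spec)
```

A convergence test in the spirit of step S104 could then compare successive values of this quantity, although the criterion actually used is not specified in the source.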

なお、ステップＳ３１の処理は、図１４Ｂの代わりに、例えば図１４Ｃに示す処理を行うようにしてもよい。
図１４Ｃに示す処理は、図１４Ｂに示す処理と比べると、ステップＳ１０３の処理の代わりに、ステップＳ１０３ａの処理を行う点で相違している。
ステップＳ１０３ａにおいて、学習部２１は、ステップＳ１０３の条件に加え、さらに、推定信号（推定ノッキング筒内圧９２）に対して逆短時間フーリエ変換（ＩＳＴＦＴ）を行い求めた信号波形の最大値、及び、最大値と最小値との差（ノッキングインテンシティ）と、教師信号（観測ノッキング筒内圧９３）の最大値、及び、最大値と最小値との差（ノッキングインテンシティ）を算出し、前者の最大値と後者の最大値との誤差、及び、前者の最大値と最小値との差（ノッキングインテンシティ）と後者の最大値と最小値との差（ノッキングインテンシティ）における誤差がそれぞれ小さくなるようにするという条件を満たすように、ノイズ除去マスクαを生成するニューラルネットワークの重みＷ、及び、伝達関数Ｈの更新を行う。すなわち、ステップＳ１０３ａでは、学習部２１は、ノイズ成分（雑音９１ｂ）のスペクトログラムに対して逆短時間フーリエ変換（ＩＳＴＦＴ）を行うことにより取得された第１信号Ｓｆｉ（図３Ａ）と教師信号（観測ノッキング筒内圧９３）とのコヒーレンスが小さくなるとともに、抽出信号（抽出ノッキング音９１ａ）のスペクトログラムに対して逆短時間フーリエ変換（ＩＳＴＦＴ）を行うことにより取得された第２信号Ｓｓｅ（図３Ａ）と教師信号（観測ノッキング筒内圧９３）とのコヒーレンスが大きくなるように、また、推定信号（推定ノッキング筒内圧９２）と教師信号（観測ノッキング筒内圧９３）に対して短時間フーリエ変換（ＳＴＦＴ）を行い求めたスペクトログラムとの誤差が小さくなるように、さらに、推定信号（推定ノッキング筒内圧９２）に対して逆短時間フーリエ変換（ＩＳＴＦＴ）を行い求めた信号波形の最大値、及び、最大値と最小値との差（ノッキングインテンシティ）と、教師信号（観測ノッキング筒内圧９３）の最大値、及び、最大値と最小値との差（ノッキングインテンシティ）を算出し、前者の最大値と後者の最大値との誤差、及び、前者の最大値と最小値との差（ノッキングインテンシティ）と後者の最大値と最小値との差（ノッキングインテンシティ）における誤差がそれぞれ小さくなるように、ノイズ除去マスクαを生成するニューラルネットワークの重みＷ、及び、伝達関数Ｈを更新する。 It should be noted that the process of step S31 may be performed, for example, as shown in FIG. 14C instead of FIG. 14B.
The process shown in FIG. 14C is different from the process shown in FIG. 14B in that the process of step S103a is performed instead of the process of step S103.
In step S103a, in addition to the conditions of step S103, the learning unit 21 applies an inverse short-time Fourier transform (ISTFT) to the estimated signal (estimated knocking in-cylinder pressure 92) and calculates the maximum value of the resulting signal waveform and the difference between its maximum and minimum values (knocking intensity). It likewise calculates the maximum value of the teacher signal (observed knocking in-cylinder pressure 93) and the difference between its maximum and minimum values (knocking intensity). The learning unit 21 then updates the weight W of the neural network that generates the noise removal mask α, and the transfer function H, so as to satisfy the additional condition that the error between the two maximum values and the error between the two knocking intensities each become small.
That is, in step S103a, the learning unit 21 updates the weight W of the neural network that generates the noise removal mask α, and the transfer function H, so that: the coherence between the first signal Sfi (FIG. 3A), obtained by applying the ISTFT to the spectrogram of the noise component (noise 91b), and the teacher signal (observed knocking in-cylinder pressure 93) becomes small; the coherence between the second signal Sse (FIG. 3A), obtained by applying the ISTFT to the spectrogram of the extracted signal (extracted knocking sound 91a), and the teacher signal (observed knocking in-cylinder pressure 93) becomes large; the error between the estimated signal (estimated knocking in-cylinder pressure 92) and the spectrogram obtained by applying a short-time Fourier transform (STFT) to the teacher signal (observed knocking in-cylinder pressure 93) becomes small; and the error between the maximum values and the error between the knocking intensities of the estimated signal waveform and the teacher signal each become small.
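The two additional error terms of step S103a can be illustrated with a minimal sketch. It assumes that both signals are already available as time-domain waveforms (the estimated signal after the ISTFT) and that the errors are plain absolute differences; the function name `peak_and_intensity_error` is hypothetical, and the description does not state how these terms are weighted against the coherence and spectrogram terms.

```python
import numpy as np

def peak_and_intensity_error(estimated, teacher):
    """Return (peak_error, intensity_error) for two time-domain waveforms.

    Knocking intensity is taken as max - min of a waveform, as in the text.
    Using absolute differences as the error measure is an assumption.
    """
    estimated = np.asarray(estimated, dtype=float)
    teacher = np.asarray(teacher, dtype=float)
    # Error between the maximum values of the two waveforms.
    peak_error = abs(estimated.max() - teacher.max())
    # Knocking intensity of each waveform (maximum minus minimum).
    estimated_intensity = estimated.max() - estimated.min()
    teacher_intensity = teacher.max() - teacher.min()
    intensity_error = abs(estimated_intensity - teacher_intensity)
    return float(peak_error), float(intensity_error)
```

In a training loop, both values would be driven toward zero together with the other conditions of step S103.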

なお、図１４Ｂ及び図１４Ｃに示すフローは、ステップＳ１０５を初回（ステップＳ３１の後）に行うようにしてもよい。また、図１４Ｂ及び図１４Ｃに示すフローは、ステップＳ１０５を抜いて一度目の処理を行い、学習が終了した後に、ステップＳ１０５を含んで二度目の処理を行うようにしてもよい。 Note that in the flows shown in FIGS. 14B and 14C, step S105 may be performed for the first time (after step S31). Further, in the flows shown in FIGS. 14B and 14C, step S105 may be skipped and the first process may be performed, and after the learning is completed, the second process including step S105 may be performed.

（閾値算出処理）
図１５Ａを参照して、閾値算出モードで実行される閾値算出処理を説明する。
図１５Ａに示すように、ステップＳ４０において、学習処理部２０は、ニューラルネットワーク９４によりノイズ除去マスクαを生成する。
ステップＳ４１において、第１推定部２３は、ニューラルネットワーク９４により生成したノイズ除去マスクαを用いて、音圧信号のスペクトログラムからエンジン１のノッキング音を抽出する。
ステップＳ４２において、閾値算出部２４は、エンジン１の各サイクルでノッキング音のスペクトログラムの絶対値を総和する。 (Threshold calculation process)
The threshold calculation process executed in the threshold calculation mode will be described with reference to FIG. 15A.
As shown in FIG. 15A, in step S40, the learning processing unit 20 uses the neural network 94 to generate a noise removal mask α.
In step S41, the first estimation unit 23 uses the noise removal mask α generated by the neural network 94 to extract the knocking sound of the engine 1 from the spectrogram of the sound pressure signal.
In step S42, the threshold calculation unit 24 sums the absolute values of the spectrogram of the knocking sound for each cycle of the engine 1.

ステップＳ４３において、閾値算出部２４は、エンジン１の全サイクルにおけるノッキング音のスペクトログラムの中央値を算出する。
ステップＳ４４において、閾値算出部２４は、中央値に任意のマージンを加算し、閾値とする。
ステップＳ４５において、閾値算出部２４は、算出した閾値を閾値記憶部２５に書き込む。 In step S43 , the threshold calculator 24 calculates the median value of the spectrogram of the knocking sound in all cycles of the engine 1 .
In step S44, the threshold calculation unit 24 adds an arbitrary margin to the median to obtain a threshold.
In step S45, the threshold calculation unit 24 writes the calculated threshold to the threshold storage unit 25.
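Steps S42 to S44 can be sketched as follows. The sketch assumes that the knocking-sound spectrograms are stacked in an array of shape (cycles, frequency bins, time frames) and that the median in step S43 is taken over the per-cycle sums from step S42; the function names and the array layout are hypothetical.

```python
import numpy as np

def cycle_sums(knock_spectrograms):
    """Sum |spectrogram| over frequency and time for each cycle (cf. step S42)."""
    spec = np.asarray(knock_spectrograms)
    return np.abs(spec).sum(axis=(1, 2))

def median_threshold(knock_spectrograms, margin):
    """Median of the per-cycle sums plus a margin (cf. steps S43 and S44)."""
    return float(np.median(cycle_sums(knock_spectrograms)) + margin)
```

The margin is left as a free parameter because the description only calls it arbitrary.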

なお、信号処理装置１０は、図１７の分離処理及び図１８の官能試験処理の後に、図１５Ｂの閾値算出処理を実行する機能を有している。以下、図１５Ｂを参照して、図１７の分離処理及び図１８の官能試験処理の後に行われる閾値算出処理を説明する。
図１５Ｂに示すように、ステップＳ１４０において、信号調整部５１は、ノイズ成分記憶部２６ｃと抽出信号記憶部２６ｄから、官能試験で許容不可となった信号を取得し、第１推定部２３に供給する。
ステップＳ１４１において、学習処理部２０は、ニューラルネットワーク９４によりノイズ除去マスクαを生成する。
ステップＳ１４２において、第１推定部２３は、ニューラルネットワーク９４により生成したノイズ除去マスクαを用いて、官能試験で許容不可となった信号のスペクトログラムからノッキング音のスペクトログラムを取得し、閾値算出部２４に供給する。
ステップＳ１４３において、閾値算出部２４は、第１推定部２３から供給されたノッキング音のスペクトログラムの絶対値を総和する。 Note that the signal processing device 10 has a function of executing the threshold calculation process of FIG. 15B after the separation process of FIG. 17 and the sensory test process of FIG. 18. The threshold calculation process performed after the separation process of FIG. 17 and the sensory test process of FIG. 18 is described below with reference to FIG. 15B.
As shown in FIG. 15B, in step S140, the signal adjustment unit 51 acquires the signals judged unacceptable in the sensory test from the noise component storage unit 26c and the extracted signal storage unit 26d, and supplies them to the first estimation unit 23.
In step S141, the learning processing unit 20 uses the neural network 94 to generate a noise removal mask α.
In step S142, the first estimation unit 23 uses the noise removal mask α generated by the neural network 94 to obtain the spectrogram of the knocking sound from the spectrogram of the signal judged unacceptable in the sensory test, and supplies it to the threshold calculation unit 24.
In step S143, the threshold calculation unit 24 sums the absolute values of the spectrogram of the knocking sound supplied from the first estimation unit 23.

ステップＳ１４４において、閾値算出部２４は、総和値以下の任意の値を閾値とする。
ステップＳ１４５において、閾値算出部２４は、算出した閾値を閾値記憶部２５に書き込む。 In step S144, the threshold calculation unit 24 sets an arbitrary value equal to or less than the total sum as a threshold.
In step S145, the threshold calculation unit 24 writes the calculated threshold to the threshold storage unit 25.
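Steps S143 and S144 can be sketched in a few lines. The description only says that the threshold is an arbitrary value not exceeding the summed value; expressing it as a fraction `ratio` of the sum is an assumption made here for illustration, and the function name is hypothetical.

```python
import numpy as np

def threshold_from_rejected(knock_spectrogram, ratio=0.9):
    """Threshold derived from a signal judged unacceptable in the sensory test.

    The summed absolute spectrogram (cf. step S143) is scaled by `ratio`,
    which must lie in (0, 1] so that the result never exceeds the sum
    (cf. step S144). The ratio parameterisation is an assumption.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    total = float(np.abs(np.asarray(knock_spectrogram)).sum())
    return ratio * total
```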

（判定処理）
図１６を参照して、判定モードで実行される判定処理を説明する。
図１６に示すように、ステップＳ５０において、学習処理部２０は、ニューラルネットワーク９４によりノイズ除去マスクαを生成する。
ステップＳ５１において、第２推定部３１は、ニューラルネットワーク９４により生成したノイズ除去マスクαを用いて、信号記憶部１３より入力された音圧信号のスペクトログラムからエンジン１のノッキング音を抽出する。 (Determination process)
The determination process executed in the determination mode will be described with reference to FIG. 16.
As shown in FIG. 16, in step S50, the learning processing unit 20 generates a noise removal mask α using the neural network 94.
In step S51, the second estimation unit 31 uses the noise removal mask α generated by the neural network 94 to extract the knocking sound of the engine 1 from the spectrogram of the sound pressure signal input from the signal storage unit 13.

ステップＳ５２において、閾値判定部３２は、エンジン１の各サイクルでノッキング音のスペクトログラムの絶対値を総和する。
ステップＳ５３において、閾値判定部３２は、閾値記憶部２５に記憶済みの閾値と、総和したノッキング音とを比較し、ノッキング音が閾値を超えているか否かを判定する。
総和したノッキング音が閾値を超えている場合（ステップＳ５３でＹｅｓ）、閾値判定部３２は、ノッキング有りと判定する（ステップＳ５４）。 In step S52 , the threshold determination unit 32 sums the absolute values of the spectrogram of the knocking sound in each cycle of the engine 1 .
In step S53, the threshold determination unit 32 compares the threshold stored in the threshold storage unit 25 with the summed knocking sound, and determines whether or not the summed knocking sound exceeds the threshold.
If the summed knocking sound exceeds the threshold (Yes in step S53), the threshold determination unit 32 determines that knocking is present (step S54).

総和したノッキング音が閾値以下の場合、（ステップＳ５３でＮｏ）、閾値判定部３２は、ノッキング無しと判定する（ステップＳ５５）。
ステップＳ５６において、閾値判定部３２は、閾値判定の結果及びノッキング音をモニタ７に出力する。 If the total knocking sound is equal to or less than the threshold (No in step S53), the threshold determination unit 32 determines that there is no knocking (step S55).
In step S56, the threshold determination unit 32 outputs the result of the threshold determination and the knocking sound to the monitor 7.
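The comparison in steps S52 to S55 can be sketched as below, assuming the extracted knocking-sound spectrograms are stacked as (cycles, frequency bins, time frames). The function name `detect_knocking` and the array layout are hypothetical.

```python
import numpy as np

def detect_knocking(knock_spectrograms, threshold):
    """Return one boolean per cycle: True when knocking is judged present.

    Each cycle's spectrogram is reduced to the sum of its absolute values
    (cf. step S52) and compared with the stored threshold (cf. step S53).
    A sum equal to the threshold counts as no knocking (cf. step S55).
    """
    sums = np.abs(np.asarray(knock_spectrograms)).sum(axis=(1, 2))
    return [bool(s > threshold) for s in sums]
```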

（分離処理）
図１７を参照して、分離モードで実行される分離処理を説明する。
図１７に示すように、ステップＳ６０において、学習処理部２０は、ニューラルネットワーク９４によりノイズ除去マスクαを生成する。
ステップＳ６１において、分離部４０は、ニューラルネットワーク９４により生成されたノイズ除去マスクαを用いて、入力物理量をノイズ成分と抽出信号とに分離する。
ステップＳ６２において、学習処理部２０は、分離したノイズ成分と抽出信号を、それぞれに対応するノイズ成分記憶部２６ｃと抽出信号記憶部２６ｄに記憶する。 (separation processing)
The separation process executed in the separation mode will be described with reference to FIG. 17.
As shown in FIG. 17, in step S60, the learning processing unit 20 generates a noise removal mask α using the neural network 94.
In step S61, the separation unit 40 separates the input physical quantity into a noise component and an extracted signal using the noise removal mask α generated by the neural network 94.
In step S62, the learning processing unit 20 stores the separated noise component and extracted signal in the noise component storage unit 26c and the extracted signal storage unit 26d, respectively.

（官能試験処理）
図１８を参照して、官能試験モードで実行される官能試験処理を説明する。
図１８に示すように、ステップＳ７０において、信号合成部５０の信号調整部５１は、レベル指定部９からのレベル指定情報の入力を受け付ける。
ステップＳ７１において、信号調整部５１は、抽出信号記憶部２６ｄから抽出信号を取得するとともに、ノイズ成分記憶部２６ｃからノイズ成分を取得し、レベル指定情報に応じて抽出信号のレベルを変更してレベル変更抽出信号を生成する。 (Sensory test treatment)
The sensory test process executed in the sensory test mode will be described with reference to FIG. 18.
As shown in FIG. 18, in step S70, the signal adjustment unit 51 of the signal synthesis unit 50 receives input of level designation information from the level designation unit 9.
In step S71, the signal adjustment unit 51 acquires the extracted signal from the extracted signal storage unit 26d, acquires the noise component from the noise component storage unit 26c, and changes the level of the extracted signal according to the level designation information to generate a level-changed extraction signal.

ステップＳ７２において、信号調整部５１は、レベル変更抽出信号とノイズ成分とを合成して加工音を生成する。
ステップＳ７３において、信号合成部５０の信号出力部５２は、加工音を放音部（ヘッドホン８）に出力して、放音部から加工音を放音させる。
ステップＳ７４において、信号合成部５０は、レベル指定部９からの終了の指示があったか否かを判定し、終了の指示があったと判定された場合（“Ｙｅｓ”の場合）に、加工音を第１推定部２３に出力し、閾値算出部２４により閾値を算出したのち、閾値記憶部２５に書き込み、図１８の処理を終了する。一方、終了の指示がないと判定された場合（“Ｎｏ”の場合）に、ステップＳ７５において、信号合成部５０の信号調整部５１は、レベル指定部９からのレベル指定情報の変更を受け付け、ステップＳ７１以降の処理を繰り返す。 In step S72, the signal adjustment unit 51 synthesizes the level-changed extraction signal and the noise component to generate processed sound.
In step S73, the signal output unit 52 of the signal synthesis unit 50 outputs the processed sound to the sound emitting unit (headphones 8) and causes the sound emitting unit to emit the processed sound.
In step S74, the signal synthesis unit 50 determines whether or not an end instruction has been received from the level designation unit 9. If it determines that an end instruction has been received ("Yes"), it outputs the processed sound to the first estimation unit 23, the threshold calculation unit 24 calculates a threshold and writes it to the threshold storage unit 25, and the process of FIG. 18 ends. If it determines that no end instruction has been received ("No"), then in step S75 the signal adjustment unit 51 of the signal synthesis unit 50 accepts a change to the level designation information from the level designation unit 9, and the processing from step S71 onward is repeated.
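The level change and synthesis of steps S71 and S72 can be sketched as follows. The sketch assumes time-domain signals of equal length and a level designation expressed as a gain in decibels; the description does not specify the unit of the level designation information, and the function name is hypothetical.

```python
import numpy as np

def make_processed_sound(extracted, noise, level_db):
    """Scale the extracted signal and add the noise component.

    `level_db` stands in for the level designation information and is
    interpreted as an amplitude gain in dB (an assumption).
    """
    gain = 10.0 ** (level_db / 20.0)
    level_changed = gain * np.asarray(extracted, dtype=float)  # cf. step S71
    return level_changed + np.asarray(noise, dtype=float)      # cf. step S72
```

Repeating the call with different `level_db` values corresponds to the loop through step S75, in which the listener adjusts the level until giving the end instruction.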

なお、図１７の分離処理及び図１８の官能試験処理の後に、信号処理装置１０は、官能試験（図１８）で許容不可となった信号に対して図１５Ｂに示す閾値算出処理を行う。 After the separation processing in FIG. 17 and the sensory test processing in FIG. 18, the signal processing device 10 performs the threshold value calculation processing shown in FIG. 15B on the signal that is unacceptable in the sensory test (FIG. 18).

なお、本実施形態では、関連性を表す要素としてコヒーレンスを用いる場合を説明したが、関連性を表す要素としてコヒーレントアウトプットパワーを用いるようにしてもよい。 In this embodiment, the case of using coherence as an element representing relevance has been described, but coherent output power may be used as an element representing relevance.

＜ノイズ低減入力物理量生成機能とシャッフル機能＞
本第１実施形態に係る信号処理装置１０は、学習モードにおいて、ノイズ低減入力物理量生成機能と、シャッフル機能を有している。ノイズ低減入力物理量生成機能は、入力物理量（エンジン近傍音９０（図３Ａ））よりもノイズが少ないノイズ低減入力物理量９０Ａを生成する機能である。シャッフル機能は、入力物理量（エンジン近傍音９０（図３Ａ））とノイズ低減入力物理量９０Ａとをシャッフルする機能である。 <Noise reduction input physical quantity generation function and shuffle function>
The signal processing apparatus 10 according to the first embodiment has a noise reduction input physical quantity generation function and a shuffle function in the learning mode. The noise reduction input physical quantity generation function is a function for generating a noise reduction input physical quantity 90A having less noise than the input physical quantity (the near-engine sound 90 (FIG. 3A)). The shuffle function is a function for shuffling the input physical quantity (near engine sound 90 (FIG. 3A)) and the noise reduction input physical quantity 90A.

The noise reduction input physical quantity generation function and the shuffle function will be described below. To make these functions easier to understand, the configuration and operation of a signal processing device 10Z according to a comparative example are described first, followed by the operation of the signal processing device 10 according to the first embodiment. The signal processing device 10Z according to the comparative example does not have the noise reduction input physical quantity generation function or the shuffle function.

まず、図１９Ａから図１９Ｃを参照して、比較例に係る信号処理装置１０Ｚの構成及び動作について説明する。図１９Ａは、比較例に係る信号処理装置１０Ｚの構成を示すブロック図である。図１９Ｂは、比較例に係る信号処理装置１０Ｚの学習モードの説明図である。図１９Ｃは、比較例に係る信号処理装置１０Ｚの学習モード時の動作説明図である。 First, the configuration and operation of a signal processing device 10Z according to a comparative example will be described with reference to FIGS. 19A to 19C. FIG. 19A is a block diagram showing the configuration of a signal processing device 10Z according to a comparative example. FIG. 19B is an explanatory diagram of the learning mode of the signal processing device 10Z according to the comparative example. FIG. 19C is an explanatory diagram of operation in the learning mode of the signal processing device 10Z according to the comparative example.

Unlike the signal processing system 100 shown in FIG. 1, the signal processing device 10Z of FIG. 19A embodies a signal processing system 100Z of a comparative example that is not equipped with the noise reduction input physical quantity generation function or the shuffle function. As shown in FIG. 19A, the signal processing device 10Z according to the comparative example differs from the signal processing device 10 according to the first embodiment (see FIG. 2) in that it has a learning processing unit 20Z instead of the learning processing unit 20 (see FIG. 2). The learning processing unit 20Z differs from the learning processing unit 20 (see FIG. 2) in that it does not have the noise reduction input physical quantity generation unit 15 or the selection unit 16, and in that it has a learning unit 21Z instead of the learning unit 21 (see FIG. 2). The learning unit 21Z differs from the learning unit 21 (see FIG. 2) in that, in the learning mode, it learns the weights of the neural network 94 that generates the noise removal mask α for removing the noise component (noise 91b (FIG. 19B)) from the input physical quantity (near-engine sound 90 (FIG. 19B)).

図１９Ｂに示すように、学習モード時において、比較例に係る信号処理装置１０Ｚでは、エンジン近傍音９０が学習部２１Ｚに供給される。これに応答して、信号処理装置１０Ｚの学習部２１Ｚは、エンジン近傍音９０を雑音９１ｂ（ノイズ成分）と抽出ノッキング音９１ａ（抽出信号）とに分離する。その際に、信号処理装置１０Ｚの学習部２１Ｚは、エンジン近傍音９０にノイズ除去マスクαを掛け合わせて抽出ノッキング音９１ａ（抽出信号）を取得する。また、信号処理装置１０Ｚの学習部２１Ｚは、エンジン近傍音９０から抽出ノッキング音９１ａ（抽出信号）を差し引くことで、雑音９１ｂ（ノイズ成分）を取得する。 As shown in FIG. 19B, in the learning mode, in the signal processing device 10Z according to the comparative example, the near-engine sound 90 is supplied to the learning section 21Z. In response to this, learning unit 21Z of signal processing device 10Z separates near-engine sound 90 into noise 91b (noise component) and extracted knocking sound 91a (extracted signal). At this time, the learning unit 21Z of the signal processing device 10Z multiplies the near-engine sound 90 by the noise removal mask α to obtain an extracted knocking sound 91a (extracted signal). Further, the learning unit 21Z of the signal processing device 10Z subtracts the extracted knocking sound 91a (extracted signal) from the near-engine sound 90 to obtain noise 91b (noise component).
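The mask-based separation described above (multiply by the mask to obtain the extracted signal, subtract to obtain the noise component) can be sketched as below. It assumes the mask α has the same shape as the input spectrogram and is applied element-wise; the function name `separate` is hypothetical.

```python
import numpy as np

def separate(input_spec, mask):
    """Split an input spectrogram into (extracted, noise) with a mask.

    extracted = mask * input (element-wise), noise = input - extracted,
    so the two parts always add back up to the input.
    """
    input_spec = np.asarray(input_spec)
    mask = np.asarray(mask)
    extracted = mask * input_spec
    noise = input_spec - extracted
    return extracted, noise
```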

The learning unit 21Z of the signal processing device 10Z learns the weight W of the neural network that generates the noise removal mask α, and a transfer function H for converting the extracted knocking sound 91a (extracted signal), taking phase into account, into the estimated knocking in-cylinder pressure 92 (estimated signal), which has the same dimension (unit) as the observed knocking in-cylinder pressure 93 (teacher signal). In the comparative example, the learning unit 21Z converts the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal) by applying an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted knocking sound 91a, multiplying the result by the transfer function H, and then applying an inverse fast Fourier transform (IFFT) and a short-time Fourier transform (STFT). The transfer function H consists of the amplitude (gain) and phase components used to convert the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal).
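The conversion chain (ISTFT, FFT, multiplication by H, IFFT, STFT) can be sketched with SciPy. This is an illustration under assumptions: H is represented as one complex value per one-sided FFT bin of the reconstructed waveform, the STFT uses SciPy's default Hann window with 50% overlap, and the function name and parameters are hypothetical.

```python
import numpy as np
from scipy import signal

def apply_transfer_function(extracted_spec, H, fs=1.0, nperseg=256):
    """Convert an extracted-signal spectrogram into an estimated-signal spectrogram.

    H must have len(waveform) // 2 + 1 complex entries (gain and phase per bin).
    """
    # ISTFT: spectrogram -> time-domain waveform.
    _, waveform = signal.istft(extracted_spec, fs=fs, nperseg=nperseg)
    # FFT of the whole waveform, then multiply by the transfer function.
    spectrum = np.fft.rfft(waveform)
    estimated_waveform = np.fft.irfft(spectrum * H, n=len(waveform))
    # STFT: back to a spectrogram for comparison with the teacher signal.
    _, _, estimated_spec = signal.stft(estimated_waveform, fs=fs, nperseg=nperseg)
    return estimated_spec
```

With H set to all ones the chain returns the input spectrogram (up to numerical error), which is a convenient sanity check before H is learned.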

Further, the learning unit 21Z of the signal processing device 10Z learns the weights of the neural network so that the relevance between the first signal Sfi, obtained by applying arbitrary processing to the noise 91b (noise component), and the teacher signal (observed knocking in-cylinder pressure 93) becomes small, and the relevance between the second signal Sse, obtained by applying arbitrary processing to the extracted knocking sound 91a (extracted signal), and the teacher signal (observed knocking in-cylinder pressure 93) becomes large. Specifically, the learning unit 21Z learns the weight W of the neural network that generates the noise removal mask α, and the transfer function H, so that the coherence between the first signal Sfi, obtained by applying an inverse short-time Fourier transform (ISTFT) to the spectrogram of the noise 91b (noise component), and the observed knocking in-cylinder pressure 93 (teacher signal) becomes small, and the coherence between the second signal Sse, obtained by applying the ISTFT to the spectrogram of the extracted knocking sound 91a (extracted signal), and the observed knocking in-cylinder pressure 93 (teacher signal) becomes large. This allows the signal processing device 10Z to remove sound attributable to the in-cylinder pressure from the noise 91b.
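The coherence condition can be turned into a scalar objective in many ways; the sketch below is one possibility, not the method of the description. It assumes time-domain signals, uses SciPy's Welch-based magnitude-squared coherence, and combines the two conditions as mean coherence of the noise side plus one minus mean coherence of the extracted side. The function name and the combination rule are hypothetical.

```python
import numpy as np
from scipy import signal

def coherence_objective(first_signal, second_signal, teacher, fs=1.0, nperseg=256):
    """Smaller is better: low noise/teacher coherence, high extracted/teacher coherence.

    first_signal  : waveform derived from the noise component (Sfi)
    second_signal : waveform derived from the extracted signal (Sse)
    """
    _, c_noise = signal.coherence(first_signal, teacher, fs=fs, nperseg=nperseg)
    _, c_extracted = signal.coherence(second_signal, teacher, fs=fs, nperseg=nperseg)
    return float(np.mean(c_noise) + (1.0 - np.mean(c_extracted)))
```

A separation whose noise part is unrelated to the teacher signal scores lower than one in which teacher-related sound leaks into the noise part.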

図１９Ｃに、比較例に係る信号処理装置１０Ｚの学習モード時の動作を示す。図１９Ｃの太枠は、学習モード時の作動する構成要素を示している。また、図１９Ｃの太線矢印は、学習モード時に出力される信号を示している。 FIG. 19C shows the operation in the learning mode of the signal processing device 10Z according to the comparative example. The bold boxes in FIG. 19C indicate the active components during the learn mode. Also, the bold arrows in FIG. 19C indicate signals output during the learning mode.

As shown in FIG. 19C, in the learning mode, the signal processing device 10Z according to the comparative example connects the switch 14 to the learning mode connection portion M1, and outputs the spectrogram of the near-engine sound 90 stored in the signal storage unit 13 and the observed knocking in-cylinder pressure 93 (teacher signal) to the learning unit 21Z.

In response, the learning unit 21Z of the signal processing device 10Z learns the weight W of the neural network 94 (FIG. 19B), which generates the noise removal mask α for removing the noise 91b (noise component) from the near-engine sound 90, and the transfer function H, which converts the extracted knocking sound 91a (extracted signal) extracted with the generated noise removal mask α into the estimated knocking in-cylinder pressure 92 (estimated signal) of the engine 1 at the time of knocking. The learning unit 21Z generates a noise removal mask α corresponding to the near-engine sound 90 each time a new near-engine sound 90 is input.

The learning unit 21Z of the signal processing device 10Z then stores the learned parameters (the weight W of the neural network that generates the noise removal mask α, and the transfer function H) in the learned parameter storage unit 22. The learning unit 21Z also stores the observed knocking in-cylinder pressure 93 (teacher signal) in the teacher signal storage unit 26a and the estimated knocking in-cylinder pressure 92 (estimated signal) in the estimated signal storage unit 26b.

＜用途変更例＞
ところで、本実施形態に係る信号処理装置１０は、前記した用途に限らず、様々な用途に用いることができる。例えば、本実施形態に係る信号処理装置１０は、検証対象を電子装置に搭載された複数の電子部品のうちの１つとして、この検証対象の電子部品でどのような動作音が発生しているかを検証する用途にも用いることができる。この場合の入力物理量は、この電子部品が搭載された電子装置近傍音となる。電子装置近傍音には、この電子装置に搭載された全ての電子部品の過渡的な動作音と、その他のノイズが重畳した音が含まれており、検証対象の電子部品の動作音はその一部である。ノイズ成分には、この電子装置に搭載された電子部品のうち、検証対象の電子部品を除く他の電子部品の動作音と、その他のノイズが含まれる。電子装置近傍音は、過渡的なノイズ成分が大きく、エンジン近傍音９０よりも目的音の取り逃しが発生し易い信号である。また、電子装置近傍音は、定常的なノイズ成分が小さいため、取り逃した音が目立ち易い信号である。 <Example of change of use>
The signal processing device 10 according to the present embodiment is not limited to the use described above and can be used for various purposes. For example, the signal processing device 10 can be used to verify what kind of operation sound is generated by one of a plurality of electronic components mounted in an electronic device, with that component as the verification target. The input physical quantity in this case is the electronic device near sound, that is, the sound near the electronic device in which the electronic component is mounted. The electronic device near sound contains the transient operation sounds of all the electronic components mounted in the electronic device, with other noise superimposed, and the operation sound of the electronic component to be verified is only part of it. The noise component includes the operation sounds of the electronic components other than the one to be verified, and other noise. The electronic device near sound has a large transient noise component, so the target sound is more likely to be missed than with the near-engine sound 90. In addition, because its stationary noise component is small, any missed sound tends to be conspicuous.

The overall configuration of a signal processing system 100 suited to this modified use is described below with reference to FIG. 20, and the processing in the learning mode of the signal processing device 10 in this modified use is then described with reference to FIG. 21. FIG. 20 is a block diagram showing the overall configuration of the signal processing system 100 for realizing the modified use of the signal processing device 10. FIG. 21 is an explanatory diagram of the learning mode in the signal processing device 10 whose use has been changed.

As shown in FIG. 20, the signal processing device 10 according to the present embodiment can be used, for example, to verify what kind of operation sound is generated by the electronic component 101, which is one of a plurality of electronic components 101, 102 mounted in an electronic device 103 and is taken as the verification target. The input physical quantity in this case is the electronic device near sound. The electronic device near sound contains the transient operation sounds of the electronic components 101, 102 mounted in the electronic device 103, with other noise superimposed, and the operation sound of the electronic component 101 to be verified is part of it. The noise component includes the operation sound of the other electronic component 102, which is not a verification target, and other noise.

In the example shown in FIG. 20, the signal processing system 100 includes, as in FIG. 1, a sound pressure sensor 4, a data collection device 6, a monitor 7, headphones 8, a level designation unit 9, and the signal processing device 10, and includes an acceleration sensor 105 in place of the in-cylinder pressure sensor 5 of FIG. 1. The sound pressure sensor 4 acquires the operation sound of the electronic device 103, which includes the operation sounds of all the electronic components 101, 102 mounted in the electronic device 103 and other noise, converts it into a sound pressure signal, and outputs the signal to the data collection device 6.

The acceleration sensor 105 is attached so as to be in contact with the electronic component 101 to be verified. It acquires the vibration acceleration of the electronic component 101, converts it into a vibration acceleration signal, and outputs the signal to the data collection device 6. When the electronic component 101 is not generating an operation sound, the vibration acceleration signal output from the acceleration sensor 105 contains no component correlated with the operation sound. When the electronic component 101 is generating an operation sound, the vibration acceleration signal output from the acceleration sensor 105 contains a component correlated with the operation sound.

データ収集装置６は、電子装置１０３の音圧信号と、検証対象の電子部品１０１の振動加速度信号をＡ／Ｄ変換して信号処理装置１０に出力する。信号処理装置１０は、ノイズ除去マスクαを生成するニューラルネットワーク９４（図２１）の重みＷ（図２１）及び伝達関数Ｈ（図２１）を学習する。モニタ７とヘッドホン８とレベル指定部９と信号処理装置１０については、図１で前記しているので、ここでは説明を省略する。 The data collection device 6 A/D-converts the sound pressure signal of the electronic device 103 and the vibration acceleration signal of the electronic component 101 to be verified, and outputs them to the signal processing device 10 . The signal processing device 10 learns the weight W (FIG. 21) and transfer function H (FIG. 21) of the neural network 94 (FIG. 21) that generates the noise removal mask α. Since the monitor 7, the headphone 8, the level designator 9, and the signal processing device 10 have already been described in FIG. 1, their description is omitted here.

つまり、エンジンのノッキング音の処理におけるエンジン近傍音は、電子装置１０３の動作音の処理における受聴者の耳位置の音に対応し、よって電子装置近傍音Ａ９０に対応する。エンジンのノッキング音の処理における抽出ノッキング音は、検証対象の電子部品１０１の動作音に対応する。エンジンのノッキング音の処理における推定ノッキング筒内圧は、検証対象の電子部品１０１の振動加速度を推定した値に対応する。エンジンのノッキング音の処理における観測ノッキング筒内圧は、検証対象の電子部品１０１の振動加速度を観測した値に対応する。つまり、電子部品１０１の動作音の処理における教師信号は、検証対象の電子部品１０１の振動加速度を観測した値である。 That is, the near-engine sound in the processing of the knocking sound of the engine corresponds to the sound at the listener's ear position in the processing of the operation sound of the electronic device 103, and thus corresponds to the near-electronic device sound A90. The extracted knocking sound in the engine knocking sound processing corresponds to the operation sound of the electronic component 101 to be verified. The estimated knocking in-cylinder pressure in processing the engine knocking sound corresponds to the value obtained by estimating the vibration acceleration of the electronic component 101 to be verified. The observed knocking in-cylinder pressure in processing the engine knocking sound corresponds to the value obtained by observing the vibration acceleration of the electronic component 101 to be verified. In other words, the teacher signal in processing the operation sound of the electronic component 101 is a value obtained by observing the vibration acceleration of the electronic component 101 to be verified.

図１の信号処理装置１０をこの用途変更例に用いる場合において、図２１に示す用途変更例の学習モードでの信号処理装置１０の動作は、図３Ａに示す用途例の学習モードでの信号処理装置１０の動作と比較すると、以下の点で相違する。
（１）入力物理量としてエンジン近傍音９０（図３Ａ）の代わりに電子装置近傍音Ａ９０がノイズ低減入力物理量生成部１５と選択部１６に入力される点。
（２）ノイズ低減入力物理量生成部１５が、ノイズ低減入力物理量９０Ａ（図３Ａ）の代わりに、入力物理量（用途変更例では、電子装置近傍音Ａ９０）に含まれるノイズ成分（用途変更例では、雑音Ａ９１ｂ）を低減したノイズ低減入力物理量Ａ９０Ａを生成する点。
（３）選択部１６が、入力物理量である電子装置近傍音Ａ９０とノイズ低減入力物理量Ａ９０Ａのいずれか一方又は双方を選択して学習部２１に供給する点。
（４）学習部２１は、信号記憶部１３から入力された音圧信号のスペクトログラム及び振動加速度信号を用いて、入力物理量である電子装置近傍音Ａ９０とノイズ低減入力物理量Ａ９０Ａの中の選択部１６が選択したものからノイズ成分（雑音Ａ９１ｂ）を除去するためのノイズ除去マスクαを生成するニューラルネットワーク９４の重みＷを学習する点。
（５）学習部２１は、信号記憶部１３から入力された音圧信号のスペクトログラム及び振動加速度信号を用いて、生成されたノイズ除去マスクαで抽出した抽出信号を電子部品１０１の推定振動加速度Ａ９２に変換する伝達関数Ｈを学習する点。
（６）学習部２１は、抽出信号として、抽出ノッキング音９１ａ（図３Ａ）の代わりに、抽出動作音Ａ９１ａを抽出する点。
（７）学習部２１は、ノイズ成分である雑音Ａ９１ｂに対して任意の処理を行うことにより取得された第１信号Ｓｆｉと教師信号である観測振動加速度Ａ９３との関連性が小さくなるとともに、抽出信号である抽出動作音Ａ９１ａに対して任意の処理を行うことにより取得された第２信号Ｓｓｅと教師信号である観測振動加速度Ａ９３との関連性が大きくなるように、ニューラルネットワークの重みを学習する点。具体的には、学習部２１は、雑音Ａ９１ｂのスペクトログラムに対して逆短時間フーリエ変換（ＩＳＴＦＴ）を行うことにより取得された第１信号Ｓｆｉと観測振動加速度Ａ９３とのコヒーレンスが小さくなるとともに、抽出動作音Ａ９１ａのスペクトログラムに対して逆短時間フーリエ変換（ＩＳＴＦＴ）を行うことにより取得された第２信号Ｓｓｅと観測振動加速度Ａ９３とのコヒーレンスが大きくなるように、ノイズ除去マスクαを生成するニューラルネットワークの重みＷ、及び、伝達関数Ｈを学習する点。
（８）さらに、学習部２１は、教師信号である観測振動加速度Ａ９３に対して短時間フーリエ変換（ＳＴＦＴ）を行い求めたスペクトログラムと推定信号である推定振動加速度Ａ９２との誤差が最小となるように、ニューラルネットワーク９４により、ノイズ除去マスクαを生成するニューラルネットワークの重みＷ、及び、伝達関数Ｈを学習する点。 When the signal processing device 10 of FIG. 1 is used in this application example change, the operation of the signal processing device 10 in the learning mode of the application example change shown in FIG. Compared with the operation of the device 10, the following points are different.
(1) The noise reduction input physical quantity generating unit 15 and the selecting unit 16 are input with the electronic device near sound A90 instead of the engine near sound 90 (FIG. 3A) as the input physical quantity.
(2) The noise reduction input physical quantity generation unit 15 generates a noise component (in the application change example, Generating a noise-reduced input physical quantity A90A in which the noise A91b) is reduced.
(3) The selection unit 16 selects either one or both of the electronic device near-field sound A90 and the noise reduction input physical quantity A90A, which are input physical quantities, and supplies them to the learning unit 21 .
(4) Using the spectrogram of the sound pressure signal and the vibration acceleration signal input from the signal storage unit 13, the learning unit 21 selects the input physical quantity of the electronic device near sound A90 and the noise reduction input physical quantity A90A. learning the weights W of a neural network 94 that produces a denoising mask α for removing the noise component (noise A 91b) from the selection of .
(5) Using the spectrogram of the sound pressure signal and the vibration acceleration signal input from the signal storage unit 13, the learning unit 21 converts the extraction signal extracted by the generated noise removal mask α into the estimated vibration acceleration A92 of the electronic component 101. The point of learning a transfer function H that transforms into .
(6) The learning unit 21 extracts an extracted operation sound A91a as an extracted signal instead of the extracted knocking sound 91a (FIG. 3A).
(7) The learning unit 21 learns the weight W of the neural network so that the relevance between a first signal Sfi, obtained by applying arbitrary processing to the noise A91b (the noise component), and the observed vibration acceleration A93 (the teacher signal) decreases, and so that the relevance between a second signal Sse, obtained by applying arbitrary processing to the extracted operation sound A91a (the extracted signal), and the observed vibration acceleration A93 increases. Specifically, the learning unit 21 learns the weight W of the neural network that generates the noise removal mask α, together with the transfer function H, so as to lower the coherence between the first signal Sfi, obtained by applying an inverse short-time Fourier transform (ISTFT) to the spectrogram of the noise A91b, and the observed vibration acceleration A93, and to raise the coherence between the second signal Sse, obtained by applying an ISTFT to the spectrogram of the extracted operation sound A91a, and the observed vibration acceleration A93.
(8) Further, the learning unit 21 learns the weight W of the neural network 94 that generates the noise removal mask α, together with the transfer function H, so as to minimize the error between the spectrogram obtained by applying a short-time Fourier transform (STFT) to the observed vibration acceleration A93 (the teacher signal) and the estimated vibration acceleration A92 (the estimated signal).
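
The objective described in items (7) and (8) can be summarized with the standard magnitude-squared coherence. The following is only a sketch: the weighting factor \lambda, the averaging over frequency (the overline), and the way the three terms are summed are assumptions made for illustration, since the text states only the direction in which each quantity is driven. Here y denotes the observed vibration acceleration A93 and \hat{Y} the spectrogram of the estimated vibration acceleration A92.

```latex
% Magnitude-squared coherence between signals x and y
% (P_{xy}: cross-spectral density, P_{xx}, P_{yy}: power spectral densities)
\gamma^2_{xy}(f) = \frac{|P_{xy}(f)|^2}{P_{xx}(f)\,P_{yy}(f)}, \qquad 0 \le \gamma^2_{xy}(f) \le 1

% Hypothetical combined loss: lower coherence for the noise side (S_{fi}),
% higher coherence for the extracted side (S_{se}), smaller estimation error
\mathcal{L}(W, H) = \overline{\gamma^2_{S_{fi}\,y}} - \overline{\gamma^2_{S_{se}\,y}}
  + \lambda \,\bigl\lVert \mathrm{STFT}(y) - \hat{Y} \bigr\rVert^2
```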

In this application change example, the signal processing device 10 treats the electronic device near sound A90 (FIG. 21), which is the sound at the listener's ear position, as the input physical quantity instead of the engine near sound 90 (FIG. 3A). Instead of the extracted knocking sound 91a (FIG. 3A), it treats the extracted operation sound A91a (FIG. 21), which is the operation sound of the electronic component 101 under verification, as the extracted signal. Instead of the noise 91b (FIG. 3A), it treats the noise A91b (FIG. 21) as the noise component. Instead of the estimated knocking in-cylinder pressure 92 (FIG. 3A), it treats the estimated vibration acceleration A92 (FIG. 21) of the electronic component 101 as the estimated signal. Instead of the observed knocking in-cylinder pressure 93 (FIG. 3A), it treats the observed vibration acceleration A93 (FIG. 21) of the electronic component 101 as the teacher signal. Using these signals, the signal processing device 10 executes the learning mode, threshold calculation mode, determination mode, separation mode, and sensory test mode. In the sensory test mode, the signal processing device 10 can thus verify what kind of operation sound the electronic component 101 generates.

The signal processing device 10Z according to the comparative example described above can also be used in the application change example in which one of the plurality of electronic components mounted in an electronic device is the verification target and the operation sound generated by that component is verified. However, when the signal processing device 10Z is used in this way and the noise component contained in the input physical quantity is relatively large in the learning mode, a missed signal SG13 (a signal that failed to be extracted) (FIG. 22C) may arise when the noise component is removed from the input physical quantity to extract the extracted signal, and the extracted signal may not be separated accurately. In other words, in the learning mode, the signal processing device 10Z may be unable to cleanly separate the signal to be extracted (the desired signal SG12 (FIG. 22B)) from the input physical quantity when the noise component contained in the input physical quantity is relatively large.

FIGS. 22A to 22D are explanatory diagrams of an inappropriate example occurring in the signal processing device 10Z according to the comparative example (an example in which the desired signal SG11 (the signal to be extracted) cannot be cleanly separated from the input physical quantity). The description here assumes that the signal processing device 10Z is used in the application change example in which one of the plurality of electronic components mounted in an electronic device is the verification target and the operation sound generated by that component is verified.

FIG. 22A is an example of a spectrogram of the input physical quantity. The desired signal SG11 (the signal to be extracted) and noise NO11 are mixed in the input physical quantity.

FIG. 22B shows a spectrogram of the extracted signal obtained by extracting the desired signal SG12 (the signal to be extracted) from the input physical quantity shown in FIG. 22A.

FIG. 22C is a spectrogram of the noise component that remains after the extracted signal has been extracted from the input physical quantity shown in FIG. 22A. The noise component contains both the missed signal SG13 (the signal that failed to be extracted) and the noise NO11. Because the missed signal SG13 has a lower sound pressure than the noise NO11, the coherence between the noise component and the teacher signal is low (FIG. 22D).

FIG. 22D shows the coherence between the noise component and the teacher signal in the inappropriate example. As shown in FIG. 22D, the coherence between the noise component containing the missed signal SG13 and the teacher signal is low in the inappropriate example. In this case, the learning of the neural network weight W and the transfer function H by the learning unit 21 stops even though the noise component still contains the missed signal SG13, so part of the signal is missed. By contrast, if the noise NO11 were absent (if the sound pressure of the noise were small) (FIG. 23C), the coherence between the noise component containing the missed signal SG13 and the teacher signal would be high (FIG. 23D). The signal processing device 10 according to the first embodiment is an improvement that raises the coherence between the noise component containing the missed signal SG13 and the teacher signal, as shown in FIG. 23D, so that the learning of the neural network weight W and the transfer function H does not stop.
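
The effect described for FIGS. 22D and 23D can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the procedure of the embodiment: the signals are synthetic white noise, the amplitudes 0.1, 1.0, and 0.01 are arbitrary, and `mean_coherence` is a hypothetical helper that computes a Welch-style magnitude-squared coherence with NumPy only.

```python
import numpy as np

def mean_coherence(x, y, nperseg=256):
    """Welch-style magnitude-squared coherence, averaged over frequency."""
    n_seg = len(x) // nperseg
    win = np.hanning(nperseg)
    # Split into non-overlapping windowed segments and transform each one.
    X = np.fft.rfft(x[:n_seg * nperseg].reshape(n_seg, nperseg) * win, axis=1)
    Y = np.fft.rfft(y[:n_seg * nperseg].reshape(n_seg, nperseg) * win, axis=1)
    pxy = np.mean(X * np.conj(Y), axis=0)      # cross-spectral density
    pxx = np.mean(np.abs(X) ** 2, axis=0)      # power spectral density of x
    pyy = np.mean(np.abs(Y) ** 2, axis=0)      # power spectral density of y
    return float(np.mean(np.abs(pxy) ** 2 / (pxx * pyy + 1e-12)))

rng = np.random.default_rng(0)
n = 256 * 64
teacher = rng.standard_normal(n)                  # stands in for the teacher signal
missed = 0.1 * teacher                            # missed signal left in the noise component
loud = missed + 1.0 * rng.standard_normal(n)      # like FIG. 22C: noise far above the missed signal
quiet = missed + 0.01 * rng.standard_normal(n)    # like FIG. 23C: noise far below the missed signal

coh_loud = mean_coherence(loud, teacher)
coh_quiet = mean_coherence(quiet, teacher)
print(coh_loud, coh_quiet)
```

With loud noise the coherence stays near zero even though the missed signal is present, so a coherence-minimizing learner sees nothing left to remove; with reduced noise the same missed signal yields a coherence close to one.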

FIGS. 23A to 23E are explanatory diagrams of a preferred example realized by the signal processing device 10 according to the first embodiment (an example in which the desired signal SG21 (the signal to be extracted) can be cleanly separated from the input physical quantity).

FIG. 23A is an example of a spectrogram of the input physical quantity in the preferred example. The desired signal SG21 (the signal to be extracted) and noise NO21 are mixed in the input physical quantity.

FIG. 23B shows a spectrogram of the noise-reduced input physical quantity, which contains the desired signal SG22 (the signal to be extracted), after the noise NO21 has been reduced from the input physical quantity shown in FIG. 23A using the processing mask. The noise-reduced input physical quantity shown in FIG. 23B contains the desired signal SG22 and the noise NO22 reduced by the processing mask.

FIG. 23C is a spectrogram of the noise component that remains after the extracted signal has been extracted from the noise-reduced input physical quantity shown in FIG. 23B, partway through the learning of the neural network weight W and the transfer function H by the learning unit 21. The noise component in the example of FIG. 23C contains the missed signal SG23, and the sound pressure of the noise NO23 is lower than that of the missed signal SG23. Consequently, the coherence between this noise component and the teacher signal is high (FIG. 23D).

FIG. 23D shows the coherence between the noise component and the teacher signal in the preferred example. As shown in FIG. 23D, the coherence between the noise component containing the missed signal SG23 and the teacher signal is high in the preferred example. By raising this coherence, the signal processing device 10 according to the first embodiment keeps the learning of the neural network weight W and the transfer function H from stopping while the noise component still contains the missed signal SG23. In the inappropriate case shown in FIG. 22D, the learning by the learning unit 21 stops even though the noise component contains the missed signal SG13, so part of the signal is missed. The signal processing device 10 according to the first embodiment can suppress such misses.

FIG. 23E is a spectrogram of the noise component that remains after the extracted signal has been extracted from the spectrogram of the input physical quantity in the preferred example of FIG. 23A. The sound pressure of the missed signal SG24 in the preferred example of FIG. 23E is lower than that of the missed signal SG13 in the inappropriate example of FIG. 22C. In this way, the signal processing device 10 according to the first embodiment can reduce the missed signal SG24.

To keep the learning from stopping partway, the signal processing device 10 according to the first embodiment learns the neural network weight W and the transfer function H while generating the noise-reduced input physical quantity A90A, which is data obtained by reducing the noise in the noise-dominant portions of the electronic device near sound A90 (the input physical quantity). In the learning mode, if the sound pressure of the noise contained in the input physical quantity is relatively large, the desired signal (the signal to be extracted) can no longer be cleanly separated from the input physical quantity. To improve the separation performance for the desired signal, the signal processing device 10 according to the first embodiment is therefore provided with a noise-reduced input physical quantity generation function and a shuffle function. The noise-reduced input physical quantity generation function is implemented by the noise-reduced input physical quantity generation unit 15 (FIGS. 3A and 3B), and the shuffle function is implemented by the selection unit 16 (FIGS. 3A and 3B).

FIG. 24 is a schematic explanatory diagram of a method of generating the processing mask β. The processing mask β passes the signal in the portions of the spectrogram of the electronic device near sound A90 where noise is not dominant and reduces the noise in the portions where noise is dominant. The processing mask β is generated by filtering the noise removal mask α. FIG. 24 shows an example in which the processing mask β is generated by filtering the noise removal mask α with a method that uses max pooling and upsampling.

FIG. 25 is an explanatory diagram of the method of generating the processing mask β and outlines the method that uses max pooling and upsampling. As shown in FIG. 25, this method replaces every element within a specified region with the maximum-valued element of that region. A maximum value filter, a binarization technique, or a combination of these may be used instead. The portions where noise is not dominant are thereby expanded by a predetermined region, so that the signal to be extracted is not reduced by mistake.
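
The max pooling and upsampling step can be sketched in NumPy as follows. This is a minimal illustration that assumes a real-valued, non-negative noise removal mask and a 2 x 2 region; the function name `processing_mask`, the region size, and the zero padding at the edges are assumptions, not details given in the text.

```python
import numpy as np

def processing_mask(alpha, pool=(2, 2)):
    """Block-wise max pooling followed by nearest-neighbour upsampling."""
    ph, pw = pool
    h, w = alpha.shape
    # Pad with zeros so both dimensions are divisible by the region size
    # (safe only because the mask is assumed to be non-negative).
    pad_h, pad_w = (-h) % ph, (-w) % pw
    padded = np.pad(alpha, ((0, pad_h), (0, pad_w)), constant_values=0.0)
    rows, cols = padded.shape
    # Max pooling: one maximum per (ph x pw) region.
    pooled = padded.reshape(rows // ph, ph, cols // pw, pw).max(axis=(1, 3))
    # Upsampling: write that maximum back to every element of its region.
    up = np.repeat(np.repeat(pooled, ph, axis=0), pw, axis=1)
    return up[:h, :w]

alpha = np.array([[0.0, 0.9, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.1, 0.0],
                  [0.0, 0.0, 0.0, 0.2]])
beta = processing_mask(alpha)
print(beta)
```

The single large value in the top-left region spreads to the whole region, which is how the pass-through area is widened around bins that the mask already lets through.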

The noise-reduced input physical quantity generation unit 15 of the signal processing device 10 uses these techniques to generate the processing mask β from the noise removal mask α. These techniques are only examples; the noise-reduced input physical quantity generation unit 15 may generate the processing mask β from the noise removal mask α by other techniques.

FIGS. 26 and 27 are explanatory diagrams of the method of generating the noise-reduced input physical quantity A90A. They show that the noise-reduced input physical quantity generation unit 15 generates the noise-reduced input physical quantity A90A by computing the element-wise product of the electronic device near sound A90 and the processing mask β. In this way, the unit can generate a spectrogram of the noise-reduced input physical quantity A90A in which the noise-dominant portions of the electronic device near sound A90 are reduced by the processing mask β. The generated noise-reduced input physical quantity A90A is supplied to the selection unit 16 as a data set associated with the observed vibration acceleration A93, which is the teacher signal.
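
As a small illustration of the element-wise product, the sketch below applies a mask to a complex spectrogram. The array sizes, the random spectrogram, and the attenuation value 0.1 are all hypothetical; only the operation itself (element-wise multiplication of the spectrogram by the processing mask) comes from the description.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical complex spectrogram (frequency bins x time frames) of the near sound A90.
spec_a90 = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))

# Hypothetical processing mask: 1 passes a bin, values below 1 attenuate it.
beta = np.ones((4, 6))
beta[2:, :] = 0.1   # bins assumed here to be noise-dominant

# Element-wise (Hadamard) product gives the noise-reduced spectrogram.
spec_a90a = beta * spec_a90
```

Because the mask in this sketch is real and non-negative, only the magnitude of each bin changes and its phase is left as it was.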

FIG. 28 is an explanatory diagram of the shuffle processing in the learning mode of the signal processing device 10 according to the first embodiment. FIG. 28 shows that a data set DS90, in which a plurality of electronic device near sounds A90 are associated with the corresponding observed vibration accelerations A93 (teacher signals), is prepared, and that a data set DS90A, in which a plurality of noise-reduced input physical quantities A90A are associated with the corresponding observed vibration accelerations A93, is also prepared. FIG. 28 further shows that the selection unit 16 shuffles the data set DS90 and the data set DS90A and supplies them to the learning unit 21. The observed vibration acceleration A93 of the data set DS90 and that of the data set DS90A are the same.
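
The shuffle processing can be pictured as merging the two data sets and randomizing their order while each input stays paired with its teacher signal. The sketch below uses string placeholders instead of real spectrograms; the function name `shuffled_training_pairs`, the tags, and the seed are illustrative assumptions.

```python
import random

def shuffled_training_pairs(ds90, ds90a, seed=None):
    """Merge two lists of (input, teacher) pairs and shuffle their order."""
    # Tag each pair with its origin so the mix remains traceable.
    merged = [("A90", x, y) for x, y in ds90] + [("A90A", x, y) for x, y in ds90a]
    random.Random(seed).shuffle(merged)
    return merged

# Both data sets share the same teacher signals, as stated for FIG. 28.
teachers = ["A93_0", "A93_1", "A93_2"]
ds90 = [(f"A90_{i}", t) for i, t in enumerate(teachers)]
ds90a = [(f"A90A_{i}", t) for i, t in enumerate(teachers)]

batch_order = shuffled_training_pairs(ds90, ds90a, seed=0)
for origin, x, y in batch_order:
    print(origin, x, y)
```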

The learning unit 21 learns the neural network weight W and the transfer function H based on whichever of the data set DS90 and the data set DS90A the selection unit 16 has selected. Because the noise in the noise-dominant portions of the spectrogram of the noise-reduced input physical quantity A90A contained in the data set DS90A has been reduced by the processing mask β, the coherence between the desired signal SG22 and the teacher signal can be made high, as shown in FIG. 23D. When learning the neural network weight W and the transfer function H, the learning unit 21 learns so that this coherence decreases.

Even when the sound pressure of the noise is relatively large compared with the desired signal (the signal to be extracted) in the input physical quantity, the signal processing device 10 according to the first embodiment avoids the situation in which the degree of association (for example, the coherence) between the noise component and the teacher signal drops and learning stops although the noise component still contains a missed signal. The signal processing device 10 according to the present embodiment can therefore suppress the occurrence of missed signals in the learning of the neural network weight W and the transfer function H by the learning unit 21.

As shown in FIG. 29, the learning unit 21 of the signal processing device 10 according to the first embodiment may be configured to learn the neural network weight W in the learning mode without using the transfer function H. In other words, the learning unit 21 may be configured to skip part of the learning optimization in the learning mode. FIG. 29 is an explanatory diagram of a modified example of the operation of the signal processing device 10 according to the first embodiment in the learning mode. As shown in FIG. 29, the learning unit 21 of the modified example differs from the learning unit 21 of the example shown in FIG. 3A in that it omits the determination process for the estimated vibration acceleration A92 (the estimated signal) and the observed vibration acceleration A93 (the teacher signal), that is, the determination process for minimizing the error. In the learning mode, the learning unit 21 of the modified example therefore performs only the determination process that reduces the relevance between the noise A91b (the noise component) and the observed vibration acceleration A93 (the teacher signal) (the coherence minimization determination process) and the determination process that increases the relevance between the extracted operation sound A91a (the extracted signal) and the observed vibration acceleration A93 (the teacher signal) (the coherence maximization determination process).

<Main Features of Signal Processing Device (Estimation Device)>
The signal processing device 10 according to the present embodiment mainly has the following features. In the following description, the "input physical quantity" means, for example, the "engine near sound 90" in the application example shown in FIG. 3A and the "electronic device near sound A90" in the application change example shown in FIG. 21. The "noise-reduced input physical quantity" means the "noise-reduced input physical quantity 90A" in the application example shown in FIG. 3A and the "noise-reduced input physical quantity A90A" in the application change example shown in FIG. 21. The "extracted signal" means the "extracted knocking sound 91a" in the application example shown in FIG. 3A and the "extracted operation sound A91a" in the application change example shown in FIG. 21. The "noise component" means the "noise 91b" in the application example shown in FIG. 3A and the "noise A91b" in the application change example shown in FIG. 21. The "estimated signal" means the "estimated knocking in-cylinder pressure 92" in the application example shown in FIG. 3A and the "estimated vibration acceleration A92" in the application change example shown in FIG. 21. The "teacher signal" means the "observed knocking in-cylinder pressure 93" in the application example shown in FIG. 3A and the "observed vibration acceleration A93" in the application change example shown in FIG. 21.

(1) As shown in FIG. 2, the signal processing device 10 according to the present embodiment includes a noise-reduced input physical quantity generation unit 15, a selection unit 16, and a learning unit 21. As shown in FIGS. 3A and 21, the noise-reduced input physical quantity generation unit 15 generates a noise-reduced input physical quantity in which the noise component contained in the input physical quantity is reduced. That is, the signal processing device 10 generates the noise-reduced input physical quantity as low-noise data. The selection unit 16 selects one or both of the input physical quantity and the noise-reduced input physical quantity and supplies the selection to the learning unit 21. The learning unit 21 learns the weight W of a neural network that generates a noise removal mask α for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected.

Here, the input selected by the selection unit 16 from among the input physical quantity and the noise-reduced input physical quantity specifically means the data set selected by the selection unit 16 from the data set DS90 (FIG. 28), in which a plurality of electronic device near sounds A90 (input physical quantities) are associated with the corresponding observed vibration accelerations A93 (teacher signals), and the data set DS90A (FIG. 28), in which a plurality of noise-reduced input physical quantities A90A are associated with the corresponding observed vibration accelerations A93 (teacher signals).

The signal processing device 10 according to the present embodiment can thus realize a signal processing method characterized by including: a noise-reduced input physical quantity generation step of generating a noise-reduced input physical quantity in which a noise component contained in an input physical quantity is reduced; a selection step of selecting one or both of the input physical quantity and the noise-reduced input physical quantity; and a learning step of learning the weight of a neural network that generates a noise removal mask for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity was selected in the selection step.

Even when the amount of noise is large (in particular, when the sound pressure of the noise is relatively large compared with the desired signal (the signal to be extracted) in the input physical quantity), the signal processing device 10 according to the present embodiment avoids the situation in which the degree of association (for example, the coherence) between the noise component and the teacher signal drops and learning stops although the noise component still contains a missed signal. The signal processing device 10 can therefore suppress the occurrence of missed signals in the learning of the neural network weight W and the transfer function H by the learning unit 21. As a result, even when the amount of noise is large, the signal processing device 10 can perform learning for satisfactorily separating the input physical quantity into the noise component and the extracted signal.

(2) As shown in FIGS. 3A and 21, the noise-reduced input physical quantity generation unit 15 of the signal processing device 10 according to the present embodiment preferably generates data in which the noise component is reduced from the input physical quantity (the noise-reduced input physical quantity) by using a processing mask β obtained by filtering the noise removal mask α.

Such a signal processing device 10 raises the coherence between the noise component containing the missed signal and the teacher signal and, when learning the neural network weight W, learns so that this coherence decreases, which allows the neural network weight W to be learned well.

(3) As shown in FIGS. 3A and 21, the learning unit 21 of the signal processing device 10 according to the present embodiment preferably learns the neural network weight W so that the relevance between the noise component and the teacher signal decreases and the relevance between the teacher signal and the extracted signal, obtained by removing the noise component from the input physical quantity or the noise-reduced input physical quantity, increases.

Such a signal processing device 10 can evaluate whether the noise component after separation contains any component related to the teacher signal. Accordingly, by having the noise-reduced input physical quantity generation unit 15 generate a noise-reduced input physical quantity in which the amount of noise contained in the input physical quantity (input signal) is reduced, and having the learning unit 21 evaluate whether the noise component after separation contains any component related to the teacher signal, the signal processing device 10 can perform learning for satisfactorily separating the input physical quantity into the noise component and the extracted signal.

(4) As shown in FIGS. 3A and 21, the learning unit 21 of the signal processing device 10 according to the present embodiment preferably uses the noise removal mask α to generate the extracted signal, in which the noise component is removed from the input physical quantity or the noise-reduced input physical quantity, and learns a transfer function H for converting the extracted signal into an estimated signal of the same dimension as the teacher signal. In doing so, the learning unit 21 preferably learns the neural network weight W and the transfer function H so that the error of the estimated signal with respect to the teacher signal becomes small.

By learning the neural network weight W and the transfer function H so that the error of the estimated signal with respect to the teacher signal becomes small, such a signal processing device 10 can learn an optimal neural network weight W and transfer function H.
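
To make the role of the transfer function H concrete, the sketch below fits one complex gain per frequency bin by closed-form least squares. This is a swapped-in stand-in for illustration: the description learns H together with the weight W and does not specify a closed-form fit, and the function name `fit_transfer_function`, the per-bin form of H, and the synthetic data are all assumptions.

```python
import numpy as np

def fit_transfer_function(extracted, teacher, eps=1e-12):
    """Per-frequency least squares: minimise sum over time of |teacher - H * extracted|^2."""
    num = np.sum(teacher * np.conj(extracted), axis=1)
    den = np.sum(np.abs(extracted) ** 2, axis=1) + eps
    return num / den   # one complex gain per frequency bin

rng = np.random.default_rng(1)
# Hypothetical extracted-signal spectrogram (5 frequency bins x 200 frames).
extracted = rng.standard_normal((5, 200)) + 1j * rng.standard_normal((5, 200))
# Synthetic teacher spectrogram produced by a known per-bin gain.
true_h = np.array([1.0, 0.5j, -2.0, 0.3 + 0.4j, 1.5])
teacher = true_h[:, None] * extracted

h = fit_transfer_function(extracted, teacher)
estimate = h[:, None] * extracted                  # estimated signal in the teacher's domain
error = np.mean(np.abs(teacher - estimate) ** 2)   # quantity the learning drives down
print(h, error)
```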

(5) As shown in FIGS. 3A and 21, the learning unit 21 of the signal processing device 10 according to the present embodiment learns the weight W of the neural network that generates the noise removal mask α for removing the noise component, taking phase into account, from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected. The learning unit 21 also learns the transfer function H for converting, taking phase into account, the extracted signal, in which the noise component has been removed from the selected quantity, into an estimated signal of the same dimension (unit) as the teacher signal. In this configuration, as shown in FIGS. 3A and 21, the learning unit 21 learns the weight W of the neural network 94 so that the relevance between the noise component and the teacher signal decreases and the relevance between the extracted signal and the teacher signal increases.

When learning the weights W of the neural network 94, the signal processing device 10 according to the present embodiment generates a noise-reduced input physical quantity in which the amount of noise contained in the input physical quantity (input signal) is reduced, and evaluates whether the separated noise component contains any component related to the teacher signal. This allows it to perform learning that separates the input physical quantity well into a noise component and an extracted signal. Such a signal processing device 10 can separate the input well into a noise component and a signal suited to level changes in a sensory test. The signal processing device 10 also makes it easier to evaluate whether the target sound (a knocking sound or the operating sound of the electronic component under verification) is mixed into the background sound. The signal processing device 10 can therefore evaluate knocking sounds and the operating sounds of electronic components under verification better than the conventional techniques described in, for example, Patent Documents 2 and 3. In addition, the signal processing device 10 supports a good sensory test, and by determining the threshold from data judged unacceptable in the sensory test, it can make judgments close to those of an inspector.

(6) As shown in FIGS. 3A and 21, when the learning unit 21 of the signal processing device 10 according to the present embodiment learns the weights W of the neural network that generates the noise removal mask α and the transfer function H, it takes into account both the amplitude and the phase components related to the input physical quantity.

The signal processing device 10 according to the present embodiment can separate the input physical quantity into an extracted signal and a noise component in such a way that the knocking sound or the operating sound of the electronic component under verification (the target sound) is not mixed into the noise. Such a signal processing device 10 can evaluate knocking sounds and the operating sounds of electronic components under verification better than the conventional techniques described in Patent Documents 2 and 3. Such a signal processing device 10 also supports a good sensory test.

(7) The learning unit 21 of the signal processing device 10 according to the present embodiment learns the neural network weights W and the transfer function H so that the relevance between the noise component and the teacher signal becomes small and the relevance between the extracted signal and the teacher signal becomes large. In doing so, for example, as shown in FIGS. 3A and 21, the learning unit 21 preferably uses the coherence between the noise component and the teacher signal as a measure of the relevance between them. Likewise, the learning unit 21 preferably uses the coherence between the extracted signal and the teacher signal as a measure of the relevance between them. Specifically, as shown in FIGS. 3A and 21, the learning unit 21 preferably learns the neural network weights W and the transfer function H so that the coherence (or coherent output power) between the teacher signal and the signal waveform obtained by applying an inverse short-time Fourier transform (ISTFT) to the noise component becomes small, and the coherence (or coherent output power) between the teacher signal and the signal waveform obtained by applying an ISTFT to the extracted signal becomes large.

The signal processing device 10 according to the present embodiment can thereby remove, from the noise component, sound attributable to the knocking in-cylinder pressure or to the operation of the electronic component under verification.
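The coherence criterion in item (7) can be illustrated with a small NumPy sketch. This is not the implementation described in the publication: the function names `msc` and `coherence_loss`, the segment length, the Hann window, and the use of a plain difference of band-averaged coherences as the loss are all assumptions made for illustration. The inputs are assumed to be time-domain waveforms, that is, the noise component and the extracted signal after the ISTFT.

```python
import numpy as np

def msc(x, y, nperseg=256):
    """Magnitude-squared coherence of x and y, averaged over
    non-overlapping Hann-windowed segments (hypothetical helper)."""
    n_seg = len(x) // nperseg
    win = np.hanning(nperseg)
    xs = x[:n_seg * nperseg].reshape(n_seg, nperseg) * win
    ys = y[:n_seg * nperseg].reshape(n_seg, nperseg) * win
    X = np.fft.rfft(xs, axis=1)
    Y = np.fft.rfft(ys, axis=1)
    pxy = np.mean(np.conj(X) * Y, axis=0)      # cross-spectrum
    pxx = np.mean(np.abs(X) ** 2, axis=0)      # auto-spectrum of x
    pyy = np.mean(np.abs(Y) ** 2, axis=0)      # auto-spectrum of y
    return np.abs(pxy) ** 2 / (pxx * pyy + 1e-12)

def coherence_loss(noise_wave, extracted_wave, teacher_wave):
    """Assumed loss: small when the noise component is unrelated to the
    teacher signal and the extracted signal is strongly related to it."""
    noise_term = np.mean(msc(noise_wave, teacher_wave))
    extracted_term = np.mean(msc(extracted_wave, teacher_wave))
    return float(noise_term - extracted_term)
```

With this definition, a separation that leaves target sound in the noise component is penalized twice: the noise term rises and the extracted term falls. In an actual training loop the same quantity would have to be written in a differentiable framework; NumPy is used here only to show the arithmetic.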

(8) As shown in FIGS. 3A and 21, when the learning unit 21 of the signal processing device 10 according to the present embodiment learns the neural network weights W and the transfer function H, it learns so that the error of the estimated signal with respect to the teacher signal becomes small. Specifically, the learning unit 21 learns so that the error of the estimated signal with respect to the spectrogram obtained by applying a short-time Fourier transform (STFT) to the teacher signal becomes small. Alternatively, the learning unit 21 learns so that the error between the teacher signal and the signal waveform of the estimated signal becomes small, where the estimated signal is obtained by applying an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted signal, multiplying the result by the transfer function H, and applying an inverse fast Fourier transform (IFFT).

The signal processing device 10 according to the present embodiment can thus learn the neural network weights W and the transfer function H so that the error of the estimated signal with respect to the spectrogram obtained by applying an STFT to the teacher signal becomes small. Alternatively, the signal processing device 10 can learn the neural network weights W and the transfer function H so that the error between the teacher signal and the signal waveform of the estimated signal becomes small, where the estimated signal is obtained by applying an ISTFT and an FFT to the extracted signal, multiplying the result by the transfer function H, and applying an IFFT.
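The waveform-domain variant of item (8) can be sketched as follows. The sketch assumes the extracted signal has already been brought back to a time-domain waveform by the ISTFT, that H is sampled on the one-sided FFT grid of that waveform, and that the error is a mean squared error; the names `estimate_signal` and `waveform_error` are hypothetical.

```python
import numpy as np

def estimate_signal(extracted_wave, H):
    """FFT the extracted waveform, multiply by the transfer function H
    (one complex value per rfft bin), and IFFT back to a waveform."""
    spectrum = np.fft.rfft(extracted_wave)
    return np.fft.irfft(spectrum * H, n=len(extracted_wave))

def waveform_error(teacher_wave, extracted_wave, H):
    """Assumed error measure: mean squared difference between the teacher
    signal and the estimated signal, both as time-domain waveforms."""
    estimated = estimate_signal(extracted_wave, H)
    return float(np.mean((teacher_wave - estimated) ** 2))
```

Because H is complex, it can change phase as well as amplitude. For example, an H of unit magnitude with linearly decreasing phase delays the extracted waveform, so a teacher signal that lags the extracted signal can still be matched closely.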

(9) As shown in FIG. 7B, the signal processing device 10 according to the present embodiment includes a signal synthesis unit 50 that changes the level of the extracted signal and combines it with the noise component separated from the input physical quantity to generate a processed sound.

The signal processing device 10 according to the present embodiment can make the target sound to be inspected (the knocking sound or the operating sound of the electronic component under verification) easy to distinguish. Such a signal processing device 10 lets the inspector determine with high accuracy whether the target sound is present, and by determining the threshold from data judged unacceptable in the sensory test, it can make judgments close to those of the inspector.

(10) As shown in FIG. 7B, the signal synthesis unit 50 of the signal processing device 10 according to the present embodiment has a signal adjustment unit 51 that accepts a level for the extracted signal specified by the inspector or operator.

The signal processing device 10 according to the present embodiment can therefore change the level of the extracted signal arbitrarily and in fine steps.
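Items (9) and (10) amount to rescaling the extracted signal and adding the separated noise component back. A minimal sketch follows, assuming time-domain waveforms of equal length and a level specified in decibels relative to the original extracted signal. The decibel convention and the name `synthesize_processed_sound` are assumptions; the publication only states that a level is specified.

```python
import numpy as np

def synthesize_processed_sound(extracted_wave, noise_wave, level_db=0.0):
    """Scale the extracted signal by the operator-specified level
    (assumed here to be in dB) and add the separated noise component."""
    gain = 10.0 ** (level_db / 20.0)   # dB to linear amplitude
    return gain * extracted_wave + noise_wave
```

At 0 dB the processed sound is the sum of the two separated components; positive levels emphasize the target sound against an unchanged background, and negative levels attenuate it.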

(11) As shown in FIGS. 14A and 17, the signal processing device 10 according to the present embodiment can implement the following signal processing method. The signal processing method according to the present embodiment includes a learning step (steps S30 to S32 in FIG. 14A) and a separation step (steps S60 to S62 in FIG. 17). In the learning step, the weights W of the neural network 94 that generates the noise removal mask α, which removes the noise component from an input physical quantity containing the noise component, are learned. The weights W of the neural network 94 are learned so that the relevance between the noise component and the teacher signal becomes small and the relevance between the extracted signal and the teacher signal becomes large. In the learning step it is also preferable to learn both the weights W of the neural network 94 that generates a noise removal mask α that removes the noise component from the input physical quantity with phase taken into account, and the transfer function H that converts the extracted signal (the input physical quantity with the noise component removed) into an estimated signal of the same dimension (unit) as the teacher signal with phase taken into account. In the separation step, the noise removal mask α is used to separate the input physical quantity into the noise component and the extracted signal from which the noise component has been removed.

The signal processing method according to the present embodiment can separate the input physical quantity well into a noise component and an extracted signal. In particular, it can separate the input into a noise component and a signal suited to level changes in a sensory test.
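The separation step of item (11) can be sketched as an element-wise mask applied to a complex spectrogram. Two points are assumptions rather than statements from the publication: the mask is taken to be complex-valued (one way of taking phase into account), and the noise component is taken to be the residual of the input after the extracted signal is subtracted. The name `separate` is hypothetical.

```python
import numpy as np

def separate(spectrogram, mask):
    """Split a complex spectrogram (frequency x time) into an extracted
    signal and a noise component using a noise removal mask."""
    extracted = mask * spectrogram     # element-wise masking
    noise = spectrogram - extracted    # assumed: noise is the residual
    return extracted, noise
```

Defining the noise component as the residual guarantees that the two outputs always sum back to the input, so whatever the mask fails to extract shows up in the noise component, where it can be checked against the teacher signal.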

(12) As shown in FIGS. 3A and 21, the signal processing device 10 according to the present embodiment is an estimation device that multiplies the extracted signal, obtained by removing the noise component from the input physical quantity, by the transfer function H to estimate an estimated signal of the same dimension (unit) as the teacher signal. The learning unit 21 of the signal processing device 10 learns the transfer function H and the weights W of the neural network 94 that generates the noise removal mask α for removing the noise component from the input physical quantity and extracting the extracted signal.

When learning the neural network weights W and the transfer function H, the signal processing device 10 according to the present embodiment can take into account, for both the noise removal mask α and the transfer function H, the amplitude and phase components related to the input physical quantity.

(13) The signal processing device 10 according to the present embodiment can implement the following estimation method. The estimation method according to the present embodiment multiplies the extracted signal, obtained by removing the noise component from the input physical quantity, by the transfer function H to estimate an estimated signal of the same dimension (unit) as the teacher signal. As shown in FIG. 14B or FIG. 14C, the estimation method includes a learning step (step S103 or step S103a) and an estimated signal estimation step (step S102). In the learning step (step S103 or step S103a), the transfer function H and the weights W of the neural network 94 that generates the noise removal mask α for removing the noise component from the input physical quantity and extracting the extracted signal are learned. In the estimated signal estimation step (step S102), the noise removal mask α generated by the neural network 94 is used to obtain the extracted signal from which the noise component has been removed, and the extracted signal is converted into the estimated signal by multiplying it by the transfer function H. In the learning step (step S103 or step S103a), when the neural network weights W and the transfer function H are learned, the amplitude and phase components related to the input physical quantity are taken into account for both the noise removal mask α and the transfer function H.

When learning the neural network weights W and the transfer function H, the estimation method according to the present embodiment can take into account, for both the noise removal mask α and the transfer function H, the amplitude and phase components related to the input physical quantity.

As described above, even when the amount of noise is large (in particular, when the sound pressure of the noise is relatively large compared with the signal to be extracted from the input physical quantity), the signal processing device 10 according to the present embodiment avoids the situation in which learning stops because the coherence between the noise component and the teacher signal has fallen even though a missed signal is still contained in the noise component. The signal processing device 10 can therefore suppress the occurrence of missed signals while the learning unit 21 learns the neural network weights W and the transfer function H. As a result, even when the amount of noise is large, the signal processing device 10 can perform learning that separates the input physical quantity well into a noise component and an extracted signal.

Further, when learning the weights W of the neural network 94, the signal processing device 10 according to the first embodiment generates a noise-reduced input physical quantity in which the amount of noise contained in the input physical quantity (input signal) is reduced, evaluates whether the separated noise component contains any component related to the teacher signal, and adds the evaluation result to the training data. This allows it to perform learning that separates the input physical quantity well into a noise component and an extracted signal. The signal processing device 10 according to the first embodiment can also separate the input physical quantity well into a noise component and an extracted signal. In particular, it can separate the input into a noise component and a signal suited to level changes in a sensory test. Such a signal processing device 10 makes it easier to evaluate whether the target sound (the knocking sound or the operating sound of the electronic component under verification) is mixed into the background sound. The signal processing device 10 can therefore improve the evaluation of knocking sounds and of the operating sounds of electronic components under verification. The signal processing device 10 also supports a good sensory test.

[Second Embodiment]
The configuration of the signal processing device 10A (estimation device) according to the second embodiment will be described with reference to FIG. 30. FIG. 30 is a block diagram showing the configuration of the signal processing device 10A (estimation device) according to the second embodiment.

The signal processing system 100A of FIG. 30 is realized by the signal processing device 10A. As shown in FIG. 30, the signal processing device 10A (estimation device) according to the second embodiment differs from the signal processing device 10 according to the first embodiment (see FIG. 2) in that the signal synthesis unit 50 has the following function. The signal synthesis unit 50 multiplies the teacher signal by the reciprocal of the transfer function H and combines the result with the noise component separated from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected, thereby generating a processed sound.

Like the signal processing device 10 according to the first embodiment, the signal processing device 10A (estimation device) according to the second embodiment lets the inspector determine with high accuracy whether the target sound (such as the knocking sound of the engine 1 or the operating sound of the electronic component 101) is present, and can improve inspection performance.

The signal synthesis unit 50 may also have a function of multiplying a level-changed teacher signal, obtained by changing the level (magnitude) of the teacher signal, by the reciprocal of the transfer function H and combining the result with the noise component separated from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected, thereby generating a processed sound.

The signal synthesis unit 50 may also have a function of multiplying, instead of the teacher signal, an arbitrary signal of the same dimension as the teacher signal (for example, an arbitrary knocking in-cylinder pressure signal or vibration acceleration) by the reciprocal of the transfer function H and combining the result with the noise component separated from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected, thereby generating a processed sound.

The signal synthesis unit 50 may also multiply the teacher signal by the reciprocal of a modified transfer function Hc, obtained by changing the values of the transfer function H, and combine the result with the noise component separated from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected, thereby generating a processed sound.

The signal synthesis unit 50 may also have a function of multiplying a level-changed teacher signal, obtained by changing the level (magnitude) of the teacher signal, by the reciprocal of the modified transfer function Hc, obtained by changing the values of the transfer function H, and combining the result with the noise component separated from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit 16 has selected, thereby generating a processed sound.

Here, for example, the modified transfer function Hc is the transfer function H with its amplitude increased or decreased, and/or its phase changed, in a certain frequency band.
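The processed-sound generation of the second embodiment can be sketched in NumPy as follows. The sketch assumes that H is sampled on the one-sided FFT grid of the teacher waveform, that the level change is a simple linear gain, and that near-zero values of H are clamped before inversion; the function names, the `eps` guard, and the band-edge convention are hypothetical and not taken from the publication.

```python
import numpy as np

def modify_transfer_function(H, freqs, band, gain=1.0, phase_shift=0.0):
    """Return a modified transfer function Hc: inside the frequency band
    (low, high) the amplitude is scaled by `gain` and the phase is
    shifted by `phase_shift` radians; other bins are left unchanged."""
    Hc = np.array(H, dtype=complex)            # copy, do not alter H
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    Hc[in_band] *= gain * np.exp(1j * phase_shift)
    return Hc

def processed_sound(teacher_wave, noise_wave, H, level=1.0, eps=1e-12):
    """Multiply the (optionally level-changed) teacher signal by 1/H in
    the frequency domain and add the separated noise component."""
    safe_H = np.where(np.abs(H) < eps, eps, H)  # avoid division by zero
    spectrum = np.fft.rfft(level * teacher_wave)
    target = np.fft.irfft(spectrum / safe_H, n=len(teacher_wave))
    return target + noise_wave
```

Passing the output of `modify_transfer_function` to `processed_sound` in place of H gives the Hc variants described above, and setting `level` to a value other than 1 gives the level-changed teacher signal variants.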

The present invention is not limited to the embodiments described above, and various changes and modifications can be made without departing from the gist of the present invention.

For example, the embodiments above have been described in detail in order to explain the gist of the present invention clearly. The present invention is therefore not necessarily limited to configurations that include every component described. In the present invention, other components can be added to a given component, and some components can be replaced with other components. Some components can also be omitted.

1 ... Engine
2 ... Engine ECU
3 ... Vehicle
4 ... Sound pressure sensor
5 ... In-cylinder pressure sensor
6 ... Data collection device
7 ... Monitor
8 ... Headphones (sound emitting unit)
9 ... Level designation unit
10, 10A, 10Z ... Signal processing device (estimation device)
11 ... Signal cutout unit
12 ... Spectrogram calculation unit
13 ... Signal storage unit
14 ... Switch
15 ... Noise-reduced input physical quantity generation unit
16 ... Selection unit
20, 20Z ... Learning processing unit
21, 21Z ... Learning unit
22 ... Learned parameter storage unit
23 ... First estimation unit
24 ... Threshold calculation unit
25 ... Threshold storage unit
26a ... Teacher signal storage unit
26b ... Estimated signal storage unit
26c ... Noise component storage unit
26d ... Extracted signal storage unit
30 ... Determination processing unit
31 ... Second estimation unit (extracted signal estimation unit)
32 ... Threshold determination unit
40 ... Separation unit
50 ... Signal synthesis unit
51 ... Signal adjustment unit
52 ... Signal output unit
81 ... Knocking vibration
82 ... Knocking sound
83 ... Mechanical noise (noise component)
90 ... Engine peripheral sound (input physical quantity)
90A ... Noise-reduced input physical quantity
91a ... Extracted knocking sound (extracted signal)
91aa ... Level-changed knocking sound (level-changed extracted signal)
91b ... Noise (noise component)
91c ... Processed sound
92 ... Estimated knocking in-cylinder pressure (estimated signal)
93 ... Observed knocking in-cylinder pressure (teacher signal)
94 ... Neural network
94A ... Mask generation network
95 ... U-Net
96 ... Downward path (encoder)
97 ... Layer
98 ... Upward path (decoder)
99 ... Output
100, 100A, 100Z ... Signal processing system
101 ... Electronic component (verification target)
102 ... Electronic component
103 ... Electronic device
105 ... Acceleration sensor
α ... Noise removal mask
β ... Processing mask
A90 ... Sound near the electronic device (input physical quantity)
A90A ... Noise-reduced input physical quantity
A91a ... Extracted operating sound (extracted signal)
A91b ... Noise (noise component)
A92 ... Estimated vibration acceleration (estimated signal)
A93 ... Observed vibration acceleration (teacher signal)
DS90, DS90A ... Data set
F11, F21, F31, F37 ... Spectrogram
F12, F22, F32, F36, F93 ... Signal waveform
F13, F23 ... Coherence
F33, F35 ... Spectrum
F34 ... Frequency response characteristic
H, He ... Transfer function
Hc ... Modified transfer function
M1 ... Learning mode connection
M2 ... Threshold calculation mode connection
M3 ... Determination mode connection
M4 ... Separation mode connection
M5 ... Sensory test mode connection
NO11, NO21, NO22 ... Noise
SG11, SG12, SG21, SG22 ... Signal to be extracted
SG13, SG23 ... Missed signal (signal that failed to be extracted)
W ... Weight
T ... Threshold

Claims

A signal processing device comprising:
a noise-reduced input physical quantity generation unit that generates a noise-reduced input physical quantity in which a noise component included in an input physical quantity is reduced;
a selection unit that selects one or both of the input physical quantity and the noise-reduced input physical quantity; and
a learning unit that learns weights of a neural network that generates a noise removal mask for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity the selection unit has selected.

The signal processing device according to claim 1,
wherein the noise-reduced input physical quantity generation unit generates the noise-reduced input physical quantity by reducing the noise component from the input physical quantity using a processing mask obtained by filtering the noise removal mask.

The signal processing device according to claim 1 or claim 2,
wherein the learning unit learns the weights of the neural network so that the relevance between the noise component and a teacher signal becomes small and the relevance between the teacher signal and an extracted signal, obtained by removing the noise component from the input physical quantity or the noise-reduced input physical quantity, becomes large.

The signal processing device according to claim 3,
wherein the learning unit uses the noise removal mask to generate the extracted signal by removing the noise component from the input physical quantity or the noise-reduced input physical quantity, and learns a transfer function for converting the extracted signal into an estimated signal of the same dimension as the teacher signal.

The signal processing device according to claim 4,
wherein the learning unit learns the weights of the neural network and the transfer function so that an error of the estimated signal with respect to the teacher signal becomes small.

The signal processing device according to any one of claims 1 to 5,
wherein the selection unit supplies the learning unit with a data set obtained by shuffling the input physical quantity and the noise-reduced input physical quantity.

A signal processing method comprising:
a noise-reduced input physical quantity generation step of generating a noise-reduced input physical quantity in which a noise component included in an input physical quantity is reduced;
a selection step of selecting one or both of the input physical quantity and the noise-reduced input physical quantity; and
a learning step of learning weights of a neural network that generates a noise removal mask for removing the noise component from whichever of the input physical quantity and the noise-reduced input physical quantity was selected in the selection step.