JP6959420B1

JP6959420B1 - Signal processing device and signal processing method

Info

Publication number: JP6959420B1
Application number: JP2020170469A
Authority: JP
Inventors: 太郎笠原; 光渡部; 洋志吉越
Original assignee: Ono Sokki Co Ltd
Current assignee: Ono Sokki Co Ltd
Priority date: 2020-10-08
Filing date: 2020-10-08
Publication date: 2021-11-02
Anticipated expiration: 2040-10-08
Also published as: JP2022062452A

Abstract

Problem to be solved: To satisfactorily separate an input physical quantity into a noise component and an extracted signal. Solution: A signal processing device 10 includes a learning unit 21 and a separation unit 40. Using a neural network 94, the learning unit 21 learns the weights of a network that generates a mask α for removing, with phase taken into account, a noise component (noise 91b) from an input physical quantity (engine vicinity sound 90) that contains the noise component. The learning unit 21 also learns a transfer function H for converting, with phase taken into account, the extracted signal (extracted knocking sound 91a), obtained by removing the noise component from the input physical quantity, into an estimated signal (estimated knocking in-cylinder pressure 92) having the same dimension (unit) as a teacher signal (observed knocking in-cylinder pressure 93). The separation unit 40 separates the input physical quantity into the noise component and the extracted signal. Selected drawing: FIG. 3A

Description

The present invention relates to a signal processing device and a signal processing method that satisfactorily separate an input physical quantity into a noise component and an extracted signal from which the noise component has been removed.

For example, the ignition timing of an internal combustion engine such as a gasoline engine is generally advanced as far as possible, within the range of crank angles at which knocking does not occur, in order to improve output torque. Accordingly, in the process of adjusting the ignition timing, a tester or a knocking determination device determines whether knocking is occurring. An example of such a knocking determination device is described in Patent Document 1.

The knocking determination device described in Patent Document 1 selects the target signals to be compared with a determination signal, for which the presence or absence of knocking is determined, based on conditions defined by their relationship to the determination signal (for example, a temporal relationship or a relationship in operating conditions).

The knocking determination device described in Patent Document 1 can present only the result of determining the presence or absence of knocking, which makes it difficult to corroborate that result. The inventors of the present invention therefore proposed the devices described in Patent Document 2 and Patent Document 3 as devices capable of estimating a desired physical quantity that corroborates the determination result.

The device described in Patent Document 2 is "a learning device for removing a noise component contained in an input physical quantity and estimating a desired physical quantity from the input physical quantity from which the noise component has been removed, wherein the noise component is noise other than a knocking sound generated in an internal combustion engine, the input physical quantity is a sound pressure of the internal combustion engine containing the noise and the knocking sound, the desired physical quantity is an in-cylinder pressure of the internal combustion engine when knocking occurs, and the learning device comprises a learning unit that uses a neural network to learn the weights of a network that generates a mask for removing the noise and extracting the knocking sound, and/or a transfer function for converting the knocking sound extracted by the mask into the in-cylinder pressure of the internal combustion engine when the knocking occurs." The device described in Patent Document 2 includes a learning unit that uses a neural network to learn the weights of a network that generates a mask for removing a noise component contained in an input physical quantity, and/or a transfer function for converting the input physical quantity, from which the noise component has been removed by the mask, into a desired physical quantity. The device removes the noise component contained in the input physical quantity, estimates the desired physical quantity from the input physical quantity from which the noise component has been removed, and determines the presence or absence of knocking.

The device described in Patent Document 3 is "a knocking determination device that removes noise other than a knocking sound generated in an internal combustion engine and estimates the knocking sound from which the noise has been removed, comprising: a learning unit that uses a neural network to learn the weights of a network that generates a mask for removing the noise and extracting the knocking sound, and a transfer function for converting the knocking sound extracted by the mask into an in-cylinder pressure of the internal combustion engine when knocking occurs; and a second estimation unit that uses a neural network and the mask to estimate, from a knocking sound containing the noise, the knocking sound from which the noise has been removed." The device described in Patent Document 3 uses a neural network and a mask to estimate, from a knocking sound containing noise, the knocking sound from which the noise has been removed.

Patent Document 1: Japanese Unexamined Patent Publication No. 2017-44148
Patent Document 2: Japanese Patent No. 6605170
Patent Document 3: Japanese Patent No. 6651040

However, the prior art described in Patent Documents 2 and 3 is configured to separate the input physical quantity into a noise component and an extracted signal without considering the phase component related to the input physical quantity. As a result, even when the prior art separates the input physical quantity into a noise component and an extracted signal, part of the extracted signal is mixed into the noise component, and the extracted signal cannot be separated accurately.
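The effect of ignoring phase can be illustrated with a single time-frequency bin. The following is a minimal sketch and not the method of Patent Documents 2 or 3: the values `s` and `n` and both mask definitions (a magnitude-ratio mask and a complex ratio mask) are assumptions chosen only for illustration.

```python
import cmath

# Hypothetical complex values of one time-frequency bin (illustrative only).
s = cmath.rect(1.0, 0.0)                  # component to be extracted
n = cmath.rect(0.8, 2.0 * cmath.pi / 3)   # noise component, 120 degrees out of phase
y = s + n                                 # observed input physical quantity

# Real-valued mask built from magnitudes only: it rescales y but keeps the phase of y.
real_mask = abs(s) / (abs(s) + abs(n))
s_real = real_mask * y

# Complex-valued mask: it can both rescale and rotate y.
complex_mask = s / y
s_complex = complex_mask * y

err_real = abs(s_real - s)
err_complex = abs(s_complex - s)
print(f"magnitude-only mask error: {err_real:.3f}")
print(f"complex mask error:        {err_complex:.3e}")
```

In this toy case the magnitude-only mask leaves a large residual because the phase of the observation differs from the phase of the component being extracted, whereas the complex mask recovers it up to rounding error.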

The present invention has been made to solve the above problem. Its main object is to provide a signal processing device and a signal processing method that satisfactorily separate an input physical quantity into a noise component and an extracted signal by realizing a configuration that takes the phase component into account.

To solve the above problem, the present invention provides a signal processing device comprising: a learning unit that uses a neural network to learn the weights of a network that generates a mask for removing a noise component from an input physical quantity containing the noise component, and to learn a transfer function for converting an extracted signal, obtained by removing the noise component from the input physical quantity, into an estimated signal of the same dimension as a teacher signal; and a separation unit that separates the input physical quantity into the noise component and the extracted signal. The learning unit learns the weights of the network that generates the mask while taking into account a phase component related to the input physical quantity, and learns the transfer function while taking into account the phase component related to the input physical quantity.

The present invention also provides a signal processing method comprising: a learning step of using a neural network to learn the weights of a network that generates a mask for removing a noise component from an input physical quantity containing the noise component, and to learn a transfer function for converting an extracted signal, obtained by removing the noise component from the input physical quantity, into an estimated signal of the same dimension as a teacher signal; and a separation step of separating the input physical quantity into the noise component and the extracted signal. In the learning step, the weights of the network that generates the mask are learned while taking into account a phase component related to the input physical quantity, and the transfer function is learned while taking into account the phase component related to the input physical quantity.
Other means will be described later.

According to the present invention, an input physical quantity can be satisfactorily separated into a noise component and an extracted signal.

A block diagram showing the overall configuration of a signal processing system including the signal processing device (estimation device) according to the first embodiment.
A block diagram showing the configuration of the signal processing device (estimation device) according to the first embodiment.
An explanatory diagram of the learning mode.
An explanatory diagram of the operation of the signal processing device (estimation device) according to the first embodiment in the learning mode.
An explanatory diagram of the threshold calculation mode.
An explanatory diagram of the operation of the signal processing device (estimation device) according to the first embodiment in the threshold calculation mode.
An explanatory diagram of the determination mode.
An explanatory diagram of the operation of the signal processing device (estimation device) according to the first embodiment in the determination mode.
An explanatory diagram of the separation mode.
An explanatory diagram of the operation of the signal processing device (estimation device) according to the first embodiment in the separation mode.
An explanatory diagram of the sensory test mode.
An explanatory diagram of the operation of the signal processing device (estimation device) according to the first embodiment in the sensory test mode.
An explanatory diagram showing the relationship between the noise component and the teacher signal during learning.
An explanatory diagram showing the relationship between the extracted signal and the teacher signal during learning.
An explanatory diagram showing the relationship between the extracted signal and the estimated signal during learning.
An explanatory diagram of the neural network that learns the weights of the network that generates the mask in the first embodiment.
An explanatory diagram of a calculation example in which the phase component of pressure, vibration, sound, or the like is not considered.
An explanatory diagram of a calculation example in which the phase component of pressure, vibration, sound, or the like is considered.
Explanatory diagram (1) of signal processing in the sensory test.
Explanatory diagram (2) of signal processing in the sensory test.
Explanatory diagram (3) of signal processing in the sensory test.
Explanatory diagram (4) of signal processing in the sensory test.
Explanatory diagram (5) of signal processing in the sensory test.
Explanatory diagram (6) of signal processing in the sensory test.
A flowchart showing the data collection processing of the signal processing device (estimation device) in the first embodiment.
A flowchart showing the learning processing of the signal processing device (estimation device) in the first embodiment.
A flowchart showing a subroutine of the learning processing.
A flowchart showing a modified example of the subroutine of the learning processing.
A flowchart showing the threshold calculation processing of the signal processing device (estimation device) in the first embodiment.
A flowchart showing the threshold calculation processing of the signal processing device (estimation device) performed after the separation processing of FIG. 17 and the sensory test processing of FIG. 18 in the first embodiment.
A flowchart showing the determination processing of the signal processing device (estimation device) in the first embodiment.
A flowchart showing the separation processing of the signal processing device (estimation device) in the first embodiment.
A flowchart showing the sensory test processing of the signal processing device (estimation device) in the first embodiment.
A block diagram showing the configuration of the signal processing device (estimation device) according to the second embodiment.
An explanatory diagram of the first modification.
An explanatory diagram of learning of the weights of the network that generates the mask and of the transfer function in the first modification.
An explanatory diagram of the second modification.
An explanatory diagram of learning of the weights of the network that generates the mask and of the transfer function in the second modification.
An explanatory diagram of learning of the weights of the network that generates the mask and of the transfer function in the third modification.

Hereinafter, embodiments of the present invention (hereinafter referred to as "the present embodiment") will be described in detail with reference to the drawings. Each figure is only a schematic illustration, drawn to the extent needed for the present invention to be fully understood. The present invention is therefore not limited to the illustrated examples. In the figures, common or similar components are given the same reference numerals, and duplicate descriptions of them are omitted.

[First Embodiment]
<Overall configuration of the signal processing system including the signal processing device (estimation device)>
Hereinafter, the overall configuration of the signal processing system 100 including the signal processing device 10 (estimation device) according to the first embodiment will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the overall configuration of the signal processing system 100 including the signal processing device 10 (estimation device).

The signal processing device 10 is a device that performs various kinds of processing on signals. In the present embodiment, the signal processing device 10 is described as functioning as an estimation device that estimates an estimated signal of the same dimension (unit) as the teacher signal by multiplying the extracted signal, from which the noise component has been removed, by a transfer function. The present embodiment also assumes that the signal processing device 10 is used to verify whether knocking is occurring in an internal combustion engine such as a gasoline engine. However, the signal processing device 10 is not limited to this application and can be used for various purposes.

As shown in FIG. 1, the signal processing system 100 determines the presence or absence of knocking in the engine 1 under test, and includes a sound pressure sensor 4, an in-cylinder pressure sensor 5, a data collection device 6, a monitor 7, headphones 8, a level designation unit 9, and the signal processing device 10.

As shown in FIG. 1, the engine 1 under test is mounted on a vehicle 3. The engine 1 under test may also be used without being mounted on the vehicle 3, for example as a standalone unit. An engine ECU (Electronic Control Unit) 2 that controls the drive of the engine 1 is connected to the engine 1. The engine ECU 2 is composed of a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), other storage devices, and the like. The CPU of the engine ECU 2 executes a program stored in the ROM or a storage device, so that the engine ECU 2 controls the drive of the engine 1 while acquiring, from outside the engine ECU 2, the various information needed for that control.

The engine ECU 2 outputs angle information representing the current rotation angle of the engine 1 to the data collection device 6. The angle information includes, for example, a rotation pulse and a crank angle pulse. The rotation pulse is a signal that indicates the absolute angle of the crankshaft; for example, one pulse is output for each revolution of the crankshaft. The crank angle pulse is a signal that is output each time the rotation angle of the crankshaft advances by a unit angle. For example, when one pulse is output per 1°, a four-stroke engine, in which one cycle consists of the four strokes of intake, compression, combustion, and exhaust, outputs 720 pulses during the two crankshaft revolutions that make up one cycle. As long as the angle information is input to the data collection device 6, it need not pass through the engine ECU 2: angle information from an origin sensor that detects the origin of the crankshaft rotation angle, or from an angle sensor that detects the crankshaft rotation angle, may be input to the data collection device 6 directly.
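The pulse arithmetic above can be sketched as follows. This is a minimal illustration, assuming a four-stroke engine; the function names `crank_pulses_per_cycle` and `crank_angle` are hypothetical and do not appear in the source.

```python
REVOLUTIONS_PER_CYCLE = 2  # a four-stroke cycle spans two crankshaft revolutions


def crank_pulses_per_cycle(degrees_per_pulse: float) -> int:
    """Number of crank angle pulses output during one engine cycle."""
    return round(REVOLUTIONS_PER_CYCLE * 360 / degrees_per_pulse)


def crank_angle(pulse_count: int, degrees_per_pulse: float) -> float:
    """Crank angle within the cycle (0 to 720 degrees) after pulse_count pulses."""
    return (pulse_count * degrees_per_pulse) % (REVOLUTIONS_PER_CYCLE * 360)


print(crank_pulses_per_cycle(1.0))  # 720
print(crank_angle(900, 1.0))        # 180.0
```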

A sound pressure sensor 4 is installed near the engine 1. The sound pressure sensor 4 detects the sound generated by the engine 1 and outputs a sound pressure signal based on the detected sound to the data collection device 6. More specifically, the sound pressure sensor 4 detects sound pressure, which is one example of a physical quantity based on the in-cylinder pressure fluctuation occurring in the engine 1, and generates a sound pressure signal indicating the magnitude of the detected sound pressure. Therefore, when knocking is not occurring in the engine 1, the sound pressure signal output from the sound pressure sensor 4 contains no sound correlated with knocking. When knocking is occurring in the engine 1, on the other hand, the sound pressure signal output from the sound pressure sensor 4 contains sound correlated with knocking.

An in-cylinder pressure sensor 5 that detects the in-cylinder pressure of the engine 1 is attached to the engine 1. The in-cylinder pressure sensor 5 outputs to the data collection device 6 an in-cylinder pressure signal containing a waveform component corresponding to the vibration of the combustion gas in the cylinder. When knocking is not occurring in the engine 1, the in-cylinder pressure signal output from the in-cylinder pressure sensor 5 contains no component correlated with knocking. When knocking is occurring in the engine 1, on the other hand, the in-cylinder pressure signal output from the in-cylinder pressure sensor 5 contains a component correlated with knocking. The in-cylinder pressure sensor 5 may be integrated with the spark plug or may be configured separately from it.

The data collection device 6 receives the sound pressure signal from the sound pressure sensor 4 and performs A/D conversion on it. At the timing at which the sound pressure signal is input, the data collection device 6 also acquires the current angle information from the engine ECU 2. Based on the angle information, the data collection device 6 then acquires the sound pressure signal for one cycle of the engine 1. Taking a four-stroke single-cylinder engine as an example, the data collection device 6 generates a number of sound pressure signals corresponding to the rotation speed of the engine 1 per unit time; for example, at a rotation speed of 3000 r/min, it generates 1500 sound pressure signals per minute. The data collection device 6 also associates angle information with some or all of the sound pressure signals, and outputs the sound pressure signals with the associated angle information to the signal processing device 10. The data collection device 6 may temporarily hold or store the sound pressure signals before outputting them to the signal processing device 10. The data collection device 6 may also associate time information with the sound pressure signals.
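The per-cycle acquisition described above can be sketched as follows. This is an illustration only: the helper names and the idea of representing cycle boundaries as a list of sample indices (`cycle_start_indices`, assumed to be derived from the angle information) are assumptions, not details given in the source.

```python
def cycles_per_minute(rpm: float, revolutions_per_cycle: int = 2) -> float:
    """Engine cycles per minute; a four-stroke cycle takes two revolutions."""
    return rpm / revolutions_per_cycle


def split_into_cycles(samples, cycle_start_indices):
    """Cut a sampled sound pressure signal into one segment per engine cycle.

    cycle_start_indices is a hypothetical list of sample indices at which a new
    cycle begins. Samples after the last boundary are discarded because that
    cycle is incomplete.
    """
    pairs = zip(cycle_start_indices, cycle_start_indices[1:])
    return [samples[start:end] for start, end in pairs]


print(cycles_per_minute(3000))  # 1500.0

samples = list(range(10))       # stand-in for A/D-converted sound pressure
segments = split_into_cycles(samples, [0, 4, 8])
print(segments)                 # [[0, 1, 2, 3], [4, 5, 6, 7]]
```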

The monitor 7 displays the in-cylinder pressure of the engine 1 estimated by the signal processing device 10 and the result of the knocking determination made by the signal processing device 10. One example of the monitor 7 is a general flat panel display.

The headphones 8 are a sound emitting unit that emits sound. The headphones 8 are worn on the head of the operator of the signal processing device 10 when the condition of the test object (the engine 1 in the present embodiment) is inspected in the sensory test mode described later.

The level designation unit 9 is an input unit that receives level designation information specifying a level increase or a level decrease for a signal. The level designation unit 9 is composed of, for example, a touch panel display, a numeric keypad, or a dedicated switch. In the present embodiment, when the level of the extracted signal (extracted knocking sound 91a (FIG. 11B)) is changed to generate a level-changed extracted signal (level-changed knocking sound 91aa (FIG. 12A)), the operator of the signal processing device 10, or a person nearby, operates the level designation unit 9 to specify (input) the level increase or level decrease of the extracted signal.
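A level change of this kind can be sketched as a simple gain applied to the extracted signal. The source does not state the unit of the level designation information; the sketch below assumes it is given in decibels, and the function name `change_level` is hypothetical.

```python
def change_level(signal, level_db):
    """Raise (positive level_db) or lower (negative level_db) the signal level.

    Assumes the designated level is an amplitude change in decibels.
    """
    gain = 10.0 ** (level_db / 20.0)
    return [gain * x for x in signal]


extracted = [0.0, 0.5, -0.25]            # stand-in for an extracted knocking sound
raised = change_level(extracted, 6.0)    # about twice the amplitude
lowered = change_level(extracted, -6.0)  # about half the amplitude
print(raised)
print(lowered)
```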

The signal processing device 10 learns the weights W (FIG. 3B) of the network that generates the mask α, and the transfer function H (FIG. 3A). Here, the mask α is a real-valued or complex-valued matrix for removing the noise component from an input physical quantity containing the noise component. The mask α changes according to the input physical quantity. The transfer function H is a complex-valued weight vector and is interpreted as the reciprocal of the structural attenuation correction amount of the engine 1. The structural attenuation correction amount is the transfer characteristic of the path along which vibration caused by the in-cylinder pressure during combustion passes through the engine 1 and reaches the sound pressure sensor as sound. The combustion noise level of the engine 1 is obtained by multiplying the frequency components of the in-cylinder pressure of the engine 1 by the structural attenuation correction amount.

Therefore, the in-cylinder pressure of the engine 1 is obtained by multiplying the combustion noise contained in the sound in the vicinity of the engine 1, which is a sound pressure, by the reciprocal of the structural attenuation correction amount. The in-cylinder pressure of the engine 1 can thus be estimated from the sound in the vicinity of the engine 1.
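The relationship between the structural attenuation correction amount and the transfer function H can be sketched per frequency bin. The numerical values below are hypothetical and in arbitrary units; the sketch treats both quantities as complex values multiplied bin by bin, which is an assumption about the representation rather than a detail stated in the source.

```python
# Hypothetical per-frequency values (complex, arbitrary units).
pressure = [2.0 + 1.0j, 0.5 - 0.5j, 1.0 + 0.0j]       # in-cylinder pressure spectrum
attenuation = [0.1 + 0.05j, 0.2 - 0.1j, 0.05 + 0.0j]  # structural attenuation correction

# Forward path: in-cylinder pressure -> combustion noise reaching the sensor.
combustion_noise = [a * p for a, p in zip(attenuation, pressure)]

# Transfer function H interpreted as the reciprocal of the attenuation.
H = [1.0 / a for a in attenuation]

# Inverse path: combustion noise -> estimated in-cylinder pressure.
estimated = [h * c for h, c in zip(H, combustion_noise)]

max_error = max(abs(e - p) for e, p in zip(estimated, pressure))
print(f"largest reconstruction error: {max_error:.3e}")
```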

The principle of estimating the in-cylinder pressure of the engine 1 from the sound in the vicinity of the engine 1 is described, for example, in Patent Document 2 cited above. Patent Document 2 states: "First, the vicinity sound y of the engine 1 and, as teacher data (teacher signal), the measured in-cylinder pressure x of the engine 1 are collected. ... Using the vicinity sound y of the engine 1 and the measured in-cylinder pressure x of the engine 1, the weights of the network that generates the mask α and the transfer function H, which are unknowns, are learned by a neural network. Then, using the mask α generated by the learned network and the transfer function H, the in-cylinder pressure x of the engine 1 is estimated from the vicinity sound y of the engine 1 measured at the time of testing."

Using the mask α generated by the learned network and the transfer function H, the signal processing device 10 extracts the knocking sound from the sound in the vicinity of the engine 1 and estimates the knocking in-cylinder pressure of the engine 1. It then determines the presence or absence of knocking based on the extracted knocking sound and the estimated knocking in-cylinder pressure of the engine 1. The signal processing device 10 displays the extracted knocking sound, the estimated knocking in-cylinder pressure of the engine 1, and the result of the knocking determination on the monitor 7.
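The flow from vicinity sound to determination result can be sketched as below. This is a simplified illustration under several assumptions: the mask is given as fixed values rather than generated by a learned network, the signals are short lists of complex frequency-bin values, and the determination compares the peak magnitude of the estimate with a threshold. The source does not specify the determination feature at this point, and all function names are hypothetical.

```python
def extract(mask, y):
    """Apply the mask element-wise to the input physical quantity."""
    return [m * v for m, v in zip(mask, y)]


def estimate_pressure(H, extracted):
    """Convert the extracted signal to the teacher-signal dimension using H."""
    return [h * e for h, e in zip(H, extracted)]


def is_knocking(estimated, threshold):
    """Assumed rule: knocking if the peak magnitude exceeds the threshold."""
    return max(abs(v) for v in estimated) > threshold


# Hypothetical values for three frequency bins.
y = [1.0 + 1.0j, 0.2 - 0.1j, 3.0 + 0.0j]     # sound in the vicinity of the engine
mask = [0.0 + 0.0j, 0.5 + 0.5j, 1.0 + 0.0j]  # stand-in for the generated mask
H = [2.0 + 0.0j, 2.0 + 0.0j, 0.0 + 1.0j]     # stand-in for the learned transfer function

extracted = extract(mask, y)
estimated = estimate_pressure(H, extracted)
result = is_knocking(estimated, threshold=2.5)
print(result)
```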

The signal processing device 10 is composed of a CPU, a ROM, a RAM, other storage devices, and the like. The CPU of the signal processing device 10 executes a program stored in the ROM or a storage device. The signal processing device 10 may be a personal computer (PC) or the like having a program that executes the processing described below.

本実施形態では、信号処理装置１０は、学習モード、閾値算出モード、判定モード、分離モード、官能試験モードという５つの動作モードで動作する。１つ目の学習モードは、マスクα（図３Ａ）を生成するネットワークの重みＷ（図３Ｂ）及び伝達関数Ｈ（図３Ａ）を学習する動作モードである。２つ目の閾値算出モードは、マスクαを生成するネットワークの重みＷ及び伝達関数Ｈの学習後、ノッキングの有無を閾値判定するときの閾値を算出する動作モードである。３つ目の判定モードは、学習したネットワークに入力物理量を入力することで生成されたマスクαを用いて入力物理量（本実施形態では、エンジン近傍音９０）から抽出信号（本実施形態では、抽出ノッキング音９１ａ）を抽出し、ノッキングの有無を判定する動作モードである。４つ目の分離モードは、入力物理量を抽出信号とノイズ成分（本実施形態では、雑音９１ｂ）とに分離する動作モードである。５つ目の官能試験モードは、検査者が後記する加工音を聞き取り、検査者の聴感によって検査すべき目的音（本実施形態では、ノッキング音）における閾値算出モードで使用するデータを決定するための動作モードである。５つ目の官能試験モードでは、例えば許容範囲外となった音から閾値を算出する。 In the present embodiment, the signal processing device 10 operates in five operation modes: a learning mode, a threshold value calculation mode, a determination mode, a separation mode, and a sensory test mode. The first, the learning mode, is an operation mode for learning the weight W (FIG. 3B) of the network that generates the mask α (FIG. 3A) and the transfer function H (FIG. 3A). The second, the threshold value calculation mode, is an operation mode for calculating, after the weight W of the network that generates the mask α and the transfer function H have been learned, the threshold value used to determine the presence or absence of knocking. The third, the determination mode, is an operation mode in which the extracted signal (the extracted knocking sound 91a in the present embodiment) is extracted from the input physical quantity (the engine proximity sound 90 in the present embodiment) using the mask α generated by inputting the input physical quantity into the trained network, and the presence or absence of knocking is determined. The fourth, the separation mode, is an operation mode in which the input physical quantity is separated into the extracted signal and a noise component (the noise 91b in the present embodiment). The fifth, the sensory test mode, is an operation mode in which an inspector listens to the processed sound described later and, based on the inspector's auditory impression, the data to be used in the threshold value calculation mode for the target sound to be inspected (the knocking sound in the present embodiment) is determined. In the sensory test mode, the threshold value is calculated, for example, from sounds judged to be outside the permissible range.

これら５つの動作モードは、任意に切り替えることができる。例えば、図示を省略した管理装置により、ＣＡＮ（Controller Area Network）を介して、信号処理装置１０の動作モードを切り替えることができる。また、図示を省略したマウス、キーボード等の操作手段を用いて、信号処理装置１０の動作モードを切り替えてもよい。 These five operation modes can be arbitrarily switched. For example, the operation mode of the signal processing device 10 can be switched via the CAN (Controller Area Network) by a management device (not shown). Further, the operation mode of the signal processing device 10 may be switched by using an operating means such as a mouse or a keyboard (not shown).

学習モードの場合、ノッキングが発生する運転条件、及び、ノッキングが発生しない運転条件でそれぞれエンジン１を運転し、データ収集装置６が、教師データ（教師信号）として、筒内圧信号を収集する。このとき、データ収集装置６は、筒内圧センサ５からの筒内圧信号を入力してＡ／Ｄ変換し、これを音圧信号に関連付けておく。 In the learning mode, the engine 1 is operated under the operating conditions where knocking occurs and the operating conditions where knocking does not occur, respectively, and the data collection device 6 collects the in-cylinder pressure signal as teacher data (teacher signal). At this time, the data collection device 6 inputs the in-cylinder pressure signal from the in-cylinder pressure sensor 5, performs A / D conversion, and associates this with the sound pressure signal.

閾値算出モードの場合、ノッキングが発生しない運転条件でエンジン１を運転し、データ収集装置６が、音圧信号を収集する。 In the threshold value calculation mode, the engine 1 is operated under operating conditions where knocking does not occur, and the data collection device 6 collects the sound pressure signal.

判定モードの場合、学習したネットワークにより生成されたマスクαを用いてエンジン近傍音９０（入力物理量）から抽出ノッキング音９１ａ（抽出信号）を抽出し、信号処理装置１０が、閾値に基づいてノッキングの有無を判定する。 In the determination mode, the extracted knocking sound 91a (extracted signal) is extracted from the engine proximity sound 90 (input physical quantity) using the mask α generated by the trained network, and the signal processing device 10 determines the presence or absence of knocking based on the threshold value.

なお、閾値算出モード又は判定モードの場合、データ収集装置６が、筒内圧信号を収集する必要はない。また、学習モードの場合、音圧センサ４と筒内圧センサ５の双方が動作するが、判定モードの場合、音圧センサ４のみが動作する。 In the threshold calculation mode or the determination mode, the data collection device 6 does not need to collect the in-cylinder pressure signal. Further, in the learning mode, both the sound pressure sensor 4 and the in-cylinder pressure sensor 5 operate, but in the determination mode, only the sound pressure sensor 4 operates.

分離モードの場合、エンジン１を運転し、信号処理装置１０が、エンジン近傍音９０（入力物理量）を学習したネットワークに入力してマスクαを生成し、雑音９１ｂ（ノイズ成分）と抽出ノッキング音９１ａ（抽出信号）とに分離する。その際に、信号処理装置１０は、雑音９１ｂ（ノイズ成分）をノイズ成分記憶部２６ｃに、抽出ノッキング音９１ａ（抽出信号）を抽出信号記憶部２６ｄに、それぞれ記憶する。本発明では、入力物理量に関連する振幅と位相を考慮して、マスクαを生成するネットワークの重みＷ及び伝達関数Ｈを学習することにより分離性能が向上している。マスクαは実数又は複素数、伝達関数Ｈは複素数で実装する。これにより、信号処理装置１０は、入力物理量をノイズ成分と抽出信号とに良好に分離することができる。 In the separation mode, the engine 1 is operated, and the signal processing device 10 inputs the engine proximity sound 90 (input physical quantity) into the trained network to generate the mask α and separates the sound into the noise 91b (noise component) and the extracted knocking sound 91a (extracted signal). At that time, the signal processing device 10 stores the noise 91b (noise component) in the noise component storage unit 26c and the extracted knocking sound 91a (extracted signal) in the extraction signal storage unit 26d. In the present invention, the separation performance is improved by learning the weight W of the network that generates the mask α and the transfer function H in consideration of the amplitude and phase related to the input physical quantity. The mask α is implemented as a real or complex number, and the transfer function H is implemented as a complex number. As a result, the signal processing device 10 can satisfactorily separate the input physical quantity into the noise component and the extracted signal.

官能試験モードの場合、信号処理装置１０が、レベル指定部９からレベル指定情報を受け付け、レベル指定情報に基づいて、雑音９１ｂ（ノイズ成分）と抽出ノッキング音９１ａ（抽出信号）とを用いて加工音を生成する。 In the sensory test mode, the signal processing device 10 receives level designation information from the level designation unit 9 and, based on that information, generates a processed sound using the noise 91b (noise component) and the extracted knocking sound 91a (extracted signal).

仮に振幅スペクトルが同じだとしても、位相スペクトルによって信号波形は大きく変わるため、聴感印象に多大な影響を及ぼす。そのため、品質の良い加工音を生成するには、位相を考慮して、加工音に用いる抽出信号（本実施形態では、抽出ノッキング音９１ａ）を取得することが重要である。換言すると、信号処理装置１０は、入力物理量に関連する振幅と位相を考慮して、マスクαを生成するネットワークの重みＷ及び伝達関数Ｈを学習することにより分離性能が向上し、品質の良い加工音（聴感上、ノッキング音が自然な加工音）を生成することができる。 Even if the amplitude spectra are identical, the signal waveform varies greatly with the phase spectrum, which strongly affects the auditory impression. Therefore, to generate a high-quality processed sound, it is important to take the phase into account when obtaining the extracted signal (the extracted knocking sound 91a in the present embodiment) used for the processed sound. In other words, by learning the weight W of the network that generates the mask α and the transfer function H in consideration of the amplitude and phase related to the input physical quantity, the signal processing device 10 improves the separation performance and can generate a high-quality processed sound (a processed sound in which the knocking sound is perceived as natural).

＜信号処理装置（推定装置）の構成＞ <Configuration of signal processing device (estimation device)>
以下、図２を参照して、信号処理装置１０（推定装置）の構成について説明する。図２は、信号処理装置１０（推定装置）の構成を示すブロック図である。図２に示すように、信号処理装置１０は、信号切出部１１と、スペクトログラム算出部１２と、信号記憶部１３と、スイッチ１４と、学習処理部２０（学習装置）と、判定処理部３０と、分離部４０と、信号合成部５０と、を備える。ここで、信号処理装置１０は、データ収集装置６から、音圧信号と、この音圧信号に関連付けられた角度情報とが入力される。さらに、学習モードの場合、信号処理装置１０は、データ収集装置６から筒内圧信号が入力される。 Hereinafter, the configuration of the signal processing device 10 (estimation device) will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the signal processing device 10 (estimation device). As shown in FIG. 2, the signal processing device 10 includes a signal cutting unit 11, a spectrogram calculation unit 12, a signal storage unit 13, a switch 14, a learning processing unit 20 (learning device), a determination processing unit 30, a separation unit 40, and a signal synthesis unit 50. The signal processing device 10 receives, from the data collection device 6, the sound pressure signal and the angle information associated with that sound pressure signal. In the learning mode, the signal processing device 10 additionally receives the in-cylinder pressure signal from the data collection device 6.

信号切出部１１は、データ収集装置６から入力された角度情報に基づいて、入力された音圧信号から所定の切出角度範囲の音圧信号を切り出す。例えば、切出角度範囲はＡＴＤＣ（After Top Dead Center）の約−１０〜９０°の角度範囲である。本実施形態では、ＴＤＣ（Top Dead Center）を基準として切り出されているため、点火タイミングが変更されても切出角度範囲は固定されたままであるが、点火タイミングの変更に応じて切出角度範囲を変更してもよい。信号切出部１１は、音圧信号を切り出すと、切り出した音圧信号をスペクトログラム算出部１２に出力する。 The signal cutting unit 11 cuts out a sound pressure signal in a predetermined cutting angle range from the input sound pressure signal based on the angle information input from the data collecting device 6. For example, the cutout angle range is an angle range of about −10 to 90 ° of ATDC (After Top Dead Center). In the present embodiment, since the cutting is performed with reference to the TDC (Top Dead Center), the cutting angle range remains fixed even if the ignition timing is changed, but the cutting angle range is changed according to the change in the ignition timing. May be changed. When the signal cutting unit 11 cuts out the sound pressure signal, the signal cutting unit 11 outputs the cut out sound pressure signal to the spectrogram calculation unit 12.
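The angle-based cutout described above can be illustrated with a minimal sketch. It assumes that each sound pressure sample already carries a crank angle in degrees ATDC; the function name cut_by_crank_angle and the sample data are hypothetical and are not taken from this document.

```python
def cut_by_crank_angle(samples, angles_deg, start_deg=-10.0, end_deg=90.0):
    """Keep the samples whose crank angle (deg ATDC) lies in [start_deg, end_deg]."""
    if len(samples) != len(angles_deg):
        raise ValueError("samples and angles_deg must have the same length")
    return [s for s, a in zip(samples, angles_deg) if start_deg <= a <= end_deg]


# Hypothetical data: one sample every 10 degrees from -180 to +170 deg ATDC.
angles = [float(a) for a in range(-180, 180, 10)]
pressure = [0.01 * i for i in range(len(angles))]

# Default range corresponds to the "about -10 to 90 deg ATDC" example in the text.
segment = cut_by_crank_angle(pressure, angles)
```

Because the range is expressed relative to TDC, the same call works unchanged when the ignition timing changes; shifting start_deg and end_deg would correspond to the optional variant in which the cutout range follows the ignition timing.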

スペクトログラム算出部１２は、信号切出部１１が切り出した音圧信号に対して短時間フーリエ変換（ＳＴＦＴ：Short Time Fourier Transform）を行い、音圧信号のスペクトログラムを算出する。短時間フーリエ変換は、例えば、離散フーリエ変換を高速に計算する高速フーリエ変換（ＦＦＴ：Fast Fourier Transform）により行われる。その後、スペクトログラム算出部１２は、音圧信号のスペクトログラムを信号記憶部１３に書き込む。 The spectrogram calculation unit 12 performs a short-time Fourier transform (STFT) on the sound pressure signal cut out by the signal cutting unit 11 to calculate a spectrogram of the sound pressure signal. The short-time Fourier transform is performed, for example, by a fast Fourier transform (FFT), which computes the discrete Fourier transform at high speed. After that, the spectrogram calculation unit 12 writes the spectrogram of the sound pressure signal into the signal storage unit 13.
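As an illustration of the spectrogram calculation, the following is a minimal sketch of a short-time Fourier transform that keeps the complex bins, so that phase remains available to later stages. The frame length, hop size, and Hann window are assumptions chosen for the example, and a naive DFT stands in for the FFT to keep the code short.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Return a list of frames; each frame is a list of complex DFT bins (0..frame_len/2)."""
    # Periodic Hann window (an assumption; the text does not name a window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT for clarity; a practical implementation would use an FFT.
        bins = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len) for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(bins)
    return frames


# Hypothetical test tone lying exactly on bin 2 of an 8-point frame.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = stft(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: abs(spec[0][k]))
```

Each entry of spec is a complex number whose magnitude and angle give the amplitude and phase of one time-frequency bin.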

信号記憶部１３は、スペクトログラム算出部１２が変換した音圧信号のスペクトログラムを記憶するメモリ、ＨＤＤ（Hard Disk Drive）、ＳＳＤ（Solid State Drive）等の記憶装置である。なお、学習モードの場合、信号記憶部１３は、データ収集装置６から入力された筒内圧信号（観測ノッキング筒内圧９３（図３Ａ））を、教師データ（教師信号）として記憶する。ここで、「ノッキング筒内圧」とは、筒内圧信号に重畳したノッキング成分を表す。ノッキング筒内圧は、筒内圧をノッキングの周波数成分以上の周波数帯を通過させるハイパスフィルタ処理することで得られる。 The signal storage unit 13 is a storage device such as a memory, HDD (Hard Disk Drive), SSD (Solid State Drive), etc. that stores the spectrogram of the sound pressure signal converted by the spectrogram calculation unit 12. In the learning mode, the signal storage unit 13 stores the in-cylinder pressure signal (observation knocking in-cylinder pressure 93 (FIG. 3A)) input from the data collection device 6 as teacher data (teacher signal). Here, the "knocking in-cylinder pressure" represents a knocking component superimposed on the in-cylinder pressure signal. The knocking in-cylinder pressure is obtained by performing a high-pass filter process in which the in-cylinder pressure is passed through a frequency band equal to or higher than the knocking frequency component.

スイッチ１４は、前記した５つの動作モードに対応する学習モード用接続部Ｍ１、閾値算出モード用接続部Ｍ２、判定モード用接続部Ｍ３、分離モード用接続部Ｍ４、官能試験モード用接続部Ｍ５に任意に切り替えることができる。信号処理装置１０は、信号処理装置１０の動作モードに応じてスイッチ１４を切り替えることにより、信号記憶部１３に記憶されている信号を任意の出力先に出力することができる。 The switch 14 can be switched arbitrarily among the learning mode connection unit M1, the threshold value calculation mode connection unit M2, the determination mode connection unit M3, the separation mode connection unit M4, and the sensory test mode connection unit M5, which correspond to the five operation modes described above. By switching the switch 14 according to its operation mode, the signal processing device 10 can output the signal stored in the signal storage unit 13 to an arbitrary output destination.

例えば学習モードの場合、信号処理装置１０は、スイッチ１４を学習モード用接続部Ｍ１に接続して、信号記憶部１３に記憶されている音圧信号のスペクトログラム及び筒内圧信号を後記する学習部２１に出力する。また、例えば閾値算出モードの場合、信号処理装置１０は、スイッチ１４を閾値算出モード用接続部Ｍ２に接続して信号記憶部１３に記憶されている音圧信号のスペクトログラムを後記する第１推定部２３に出力する。また、例えば判定モードの場合、信号処理装置１０は、スイッチ１４を判定モード用接続部Ｍ３に接続して、信号記憶部１３に記憶されている音圧信号のスペクトログラムを後記する第２推定部３１（抽出信号推定部）に出力する。また、例えば分離モードの場合、信号処理装置１０は、スイッチ１４を分離モード用接続部Ｍ４に接続して、信号記憶部１３に記憶されている音圧信号のスペクトログラムを後記する分離部４０に出力する。また、例えば官能試験モードの場合、信号処理装置１０は、スイッチ１４を官能試験モード用接続部Ｍ５に接続して、官能試験モードの実行指示信号を後記する信号調整部５１に出力する。 For example, in the learning mode, the signal processing device 10 connects the switch 14 to the learning mode connection unit M1 and outputs the spectrogram of the sound pressure signal and the in-cylinder pressure signal stored in the signal storage unit 13 to the learning unit 21 described later. In the threshold value calculation mode, the signal processing device 10 connects the switch 14 to the threshold value calculation mode connection unit M2 and outputs the spectrogram of the sound pressure signal stored in the signal storage unit 13 to the first estimation unit 23 described later. In the determination mode, the signal processing device 10 connects the switch 14 to the determination mode connection unit M3 and outputs the spectrogram of the sound pressure signal stored in the signal storage unit 13 to the second estimation unit 31 (extraction signal estimation unit) described later. In the separation mode, the signal processing device 10 connects the switch 14 to the separation mode connection unit M4 and outputs the spectrogram of the sound pressure signal stored in the signal storage unit 13 to the separation unit 40 described later. In the sensory test mode, the signal processing device 10 connects the switch 14 to the sensory test mode connection unit M5 and outputs an execution instruction signal for the sensory test mode to the signal adjustment unit 51 described later.

＜学習処理部＞ <Learning processing unit>
学習処理部２０は、学習モードにおいて、マスクαを生成するネットワークの重みＷ及び伝達関数Ｈを学習する学習処理（図１４Ａ）を行い、閾値算出モードにおいて、エンジン１の抽出したノッキング音の閾値判定に用いる閾値を算出する閾値算出処理（図１５Ａ）を行う。図２に示すように、学習処理部２０は、学習部２１と、学習済みパラメータ記憶部２２と、第１推定部２３と、閾値算出部２４と、閾値記憶部２５と、教師信号記憶部２６ａと、推定信号記憶部２６ｂと、ノイズ成分記憶部２６ｃと、抽出信号記憶部２６ｄと、を備える。 In the learning mode, the learning processing unit 20 performs a learning process (FIG. 14A) for learning the weight W of the network that generates the mask α and the transfer function H. In the threshold value calculation mode, it performs a threshold value calculation process (FIG. 15A) for calculating the threshold value used in the threshold determination of the extracted knocking sound of the engine 1. As shown in FIG. 2, the learning processing unit 20 includes a learning unit 21, a learned parameter storage unit 22, a first estimation unit 23, a threshold value calculation unit 24, a threshold value storage unit 25, a teacher signal storage unit 26a, an estimated signal storage unit 26b, a noise component storage unit 26c, and an extraction signal storage unit 26d.

学習部２１は、学習モードにおいて、ニューラルネットワーク９４（図３Ａ）により、エンジン近傍音９０（入力物理量）から雑音９１ｂを除去し、かつ、抽出ノッキング音９１ａ（抽出信号）を抽出するマスクαを生成するネットワークの重みＷと、生成されたマスクαで抽出した抽出ノッキング音９１ａ（抽出信号）をノッキング発生時のエンジン１の推定ノッキング筒内圧９２に変換する伝達関数Ｈとを学習する。本実施形態では、学習部２１は、信号記憶部１３から入力された音圧信号のスペクトログラム及び筒内圧信号を用いて、マスクαを生成するネットワークの重みＷ及び伝達関数Ｈを学習する。その後、学習部２１は、学習したマスクαを生成するネットワークの重みＷ及び伝達関数Ｈを学習済みパラメータ記憶部２２に書き込む。 In the learning mode, the learning unit 21 uses the neural network 94 (FIG. 3A) to learn the weight W of the network that generates the mask α, which removes the noise 91b from the engine proximity sound 90 (input physical quantity) and extracts the extracted knocking sound 91a (extracted signal), and the transfer function H, which converts the extracted knocking sound 91a (extracted signal) extracted with the generated mask α into the estimated knocking in-cylinder pressure 92 of the engine 1 at the time of knocking. In the present embodiment, the learning unit 21 learns the weight W of the network that generates the mask α and the transfer function H by using the spectrogram of the sound pressure signal and the in-cylinder pressure signal input from the signal storage unit 13. After that, the learning unit 21 writes the learned weight W of the network that generates the mask α and the learned transfer function H into the learned parameter storage unit 22.

学習済みパラメータ記憶部２２は、学習済みのパラメータ（マスクαを生成するネットワークの重みＷ及び伝達関数Ｈ）を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The learned parameter storage unit 22 is a storage device, such as a memory, HDD, or SSD, that stores the learned parameters (the weight W of the network that generates the mask α and the transfer function H).

第１推定部２３は、閾値算出モードにおいて、ニューラルネットワーク９４により生成されたマスクαを用いて、音圧信号のスペクトログラムからエンジン１のノッキング音（抽出信号）を抽出する。この第１推定部２３が抽出したノッキング音（抽出信号）は、後記する閾値を算出するときに利用される。具体的には、第１推定部２３は、学習済みパラメータ記憶部２２のマスクαを生成するネットワークの重みＷが反映されたニューラルネットワーク９４に、信号記憶部１３から入力された音圧信号のスペクトログラムを入力する。すると、学習したネットワークにより生成されたマスクαが音圧信号のスペクトログラムからノッキング音（抽出信号）を抽出する。その後、第１推定部２３は、ノッキング音（抽出信号）を閾値算出部２４に出力する。 In the threshold value calculation mode, the first estimation unit 23 extracts the knocking sound (extracted signal) of the engine 1 from the spectrogram of the sound pressure signal by using the mask α generated by the neural network 94. The knocking sound (extracted signal) extracted by the first estimation unit 23 is used when calculating the threshold value described later. Specifically, the first estimation unit 23 inputs the spectrogram of the sound pressure signal received from the signal storage unit 13 into the neural network 94, which reflects the weight W of the network that generates the mask α stored in the learned parameter storage unit 22. The mask α generated by the trained network then extracts the knocking sound (extracted signal) from the spectrogram of the sound pressure signal. After that, the first estimation unit 23 outputs the knocking sound (extracted signal) to the threshold value calculation unit 24.

閾値算出部２４は、第１推定部２３が抽出したエンジン１のノッキング音（抽出信号）に基づいて閾値を算出する。具体的には、閾値算出部２４は、ノッキング音（抽出信号）のスペクトログラムの絶対値を所定時間（例えば、エンジン１の１サイクル）毎に総和する。例えば、複数の気筒を有するエンジン１では、ノッキング音（抽出信号）を総和すると、気筒別に１つのスコアが求められる。続いて、閾値算出部２４は、所定時間毎に総和したノッキング音（抽出信号）の中央値を算出する。例えば、閾値算出部２４は、全てのサイクルについて、ノッキング音（抽出信号）の総和の中央値を算出する。このとき、閾値算出部２４は、任意の値で予め設定したマージンを中央値に加算し、閾値とする。なお、閾値算出部２４は、気筒毎にノッキング音（抽出信号）を総和して中央値を求め、気筒毎の閾値を算出してもよい。一方、閾値算出部２４は、各気筒でノッキング音（抽出信号）を総和し、全気筒で中央値を求め、全気筒で共通の閾値を算出してもよい。また、後述する官能試験にて、検査者により許容不可能と判断された加工音９１ｃのノッキング音（抽出信号）の総和を求め、総和値以下の任意の値を閾値としてもよい。その後、閾値算出部２４は、算出した閾値を閾値記憶部２５に書き込む。 The threshold value calculation unit 24 calculates the threshold value based on the knocking sound (extracted signal) of the engine 1 extracted by the first estimation unit 23. Specifically, the threshold value calculation unit 24 sums the absolute values of the spectrogram of the knocking sound (extracted signal) for each predetermined time (for example, one cycle of the engine 1). For example, in an engine 1 having a plurality of cylinders, summing the knocking sound (extracted signal) yields one score per cylinder. Subsequently, the threshold value calculation unit 24 calculates the median of the sums obtained for each predetermined time. For example, the threshold value calculation unit 24 calculates the median of the summed knocking sound (extracted signal) over all cycles. The threshold value calculation unit 24 then adds a preset margin of an arbitrary value to the median to obtain the threshold value. The threshold value calculation unit 24 may sum the knocking sound (extracted signal) for each cylinder, obtain the median for each cylinder, and calculate a threshold value per cylinder. Alternatively, the threshold value calculation unit 24 may sum the knocking sound (extracted signal) in each cylinder, obtain the median over all cylinders, and calculate a threshold value common to all cylinders. Further, the sum of the knocking sound (extracted signal) of a processed sound 91c judged unacceptable by the inspector in the sensory test described later may be obtained, and an arbitrary value equal to or less than that sum may be used as the threshold value. After that, the threshold value calculation unit 24 writes the calculated threshold value into the threshold value storage unit 25.
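The threshold calculation above (per-cycle sum of absolute values, median over cycles, plus a margin) can be sketched as follows. This is a minimal illustration under the assumption that each cycle's extracted signal is available as a small complex spectrogram; the function names, the sample values, and the margin of 0.5 are hypothetical.

```python
from statistics import median


def cycle_score(spectrogram):
    """Sum of absolute values over every time-frequency bin of one cycle."""
    return sum(abs(v) for frame in spectrogram for v in frame)


def compute_threshold(cycle_spectrograms, margin):
    """Median of the per-cycle scores plus a preset margin."""
    scores = [cycle_score(s) for s in cycle_spectrograms]
    return median(scores) + margin


# Three hypothetical non-knocking cycles, each a 2x2 complex spectrogram.
cycles = [
    [[1 + 0j, 0 + 1j], [0j, 0j]],  # score 2.0
    [[3 + 4j, 0j], [0j, 0j]],      # score 5.0
    [[0j, 3 + 0j], [0j, 0j]],      # score 3.0
]
threshold = compute_threshold(cycles, margin=0.5)
```

Using the median rather than the mean keeps a single unusually loud cycle (the 5.0 above) from pulling the threshold upward. A per-cylinder threshold would be obtained by calling compute_threshold once per cylinder with that cylinder's cycles.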

閾値記憶部２５は、閾値算出部２４が算出した閾値を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The threshold value storage unit 25 is a storage device such as a memory, HDD, SSD, etc. that stores the threshold value calculated by the threshold value calculation unit 24.

教師信号記憶部２６ａは、教師信号（本実施形態では、観測ノッキング筒内圧９３（図３Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The teacher signal storage unit 26a is a storage device such as a memory, HDD, SSD, etc. that stores a teacher signal (in this embodiment, the observation knocking cylinder internal pressure 93 (FIG. 3A)).

推定信号記憶部２６ｂは、推定信号（本実施形態では、推定ノッキング筒内圧９２（図３Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The estimated signal storage unit 26b is a storage device, such as a memory, HDD, or SSD, that stores the estimated signal (in this embodiment, the estimated knocking in-cylinder pressure 92 (FIG. 3A)).

ノイズ成分記憶部２６ｃは、入力物理量に含まれているノイズ成分（本実施形態では、雑音９１ｂ（図６Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The noise component storage unit 26c is a storage device, such as a memory, HDD, or SSD, that stores the noise component included in the input physical quantity (in this embodiment, the noise 91b (FIG. 6A)).

抽出信号記憶部２６ｄは、ノイズ成分が含まれている入力物理量からノイズ成分を除去した抽出信号（本実施形態では、抽出ノッキング音９１ａ（図６Ａ））を記憶するメモリ、ＨＤＤ、ＳＳＤ等の記憶装置である。 The extraction signal storage unit 26d is a storage device, such as a memory, HDD, or SSD, that stores the extracted signal obtained by removing the noise component from the input physical quantity containing the noise component (in this embodiment, the extracted knocking sound 91a (FIG. 6A)).

＜判定処理部＞ <Determination processing unit>
判定処理部３０は、判定モードにおいて、音圧信号のスペクトログラムからエンジン１のノッキング音を抽出し、ノッキングの有無を判定する判定処理（図１６）を行う。図２に示すように、判定処理部３０は、第２推定部３１（抽出信号推定部）と、閾値判定部３２とを備える。 In the determination mode, the determination processing unit 30 extracts the knocking sound of the engine 1 from the spectrogram of the sound pressure signal and performs a determination process (FIG. 16) for determining the presence or absence of knocking. As shown in FIG. 2, the determination processing unit 30 includes a second estimation unit 31 (extraction signal estimation unit) and a threshold value determination unit 32.

第２推定部３１（抽出信号推定部）は、判定モードにおいて、ニューラルネットワーク９４により生成されたマスクαを用いて、音圧信号のスペクトログラムから抽出ノッキング音９１ａ（抽出信号）を抽出する。この第２推定部３１が抽出した抽出ノッキング音９１ａ（抽出信号）は、後記する閾値判定に利用される。なお、第２推定部３１の処理内容は、第１推定部２３と同様のため、説明を省略する。その後、第２推定部３１は、抽出した抽出ノッキング音９１ａ（抽出信号）を閾値判定部３２に出力する。 In the determination mode, the second estimation unit 31 (extraction signal estimation unit) extracts the extracted knocking sound 91a (extracted signal) from the spectrogram of the sound pressure signal by using the mask α generated by the neural network 94. The extracted knocking sound 91a (extracted signal) extracted by the second estimation unit 31 is used for the threshold determination described later. Since the processing of the second estimation unit 31 is the same as that of the first estimation unit 23, its description is omitted. After that, the second estimation unit 31 outputs the extracted knocking sound 91a (extracted signal) to the threshold value determination unit 32.

閾値判定部３２は、判定モードにおいて、閾値記憶部２５に記憶されている閾値と、第２推定部３１が抽出したノッキング音（抽出信号）との閾値判定により、ノッキングの有無を判定する。具体的には、閾値判定部３２は、閾値算出部２４と同様、ノッキング音（抽出信号）のスペクトログラムの絶対値を所定時間毎に総和する。そして、閾値判定部３２は、総和したノッキング音（抽出信号）と閾値とを比較し、総和したノッキング音（抽出信号）が閾値を超える場合にはノッキング有りと判定し、総和したノッキング音（抽出信号）が閾値以下の場合にはノッキング無しと判定する。その後、閾値判定部３２は、ノッキング有無の判定結果と、第２推定部３１から入力されたノッキング音（抽出信号）をモニタ７（図１）に出力する。 In the determination mode, the threshold value determination unit 32 determines the presence or absence of knocking by determining the threshold value stored in the threshold value storage unit 25 and the knocking sound (extracted signal) extracted by the second estimation unit 31. Specifically, the threshold value determination unit 32 sums the absolute values of the spectrograms of the knocking sounds (extracted signals) at predetermined time intervals, as in the threshold value calculation unit 24. Then, the threshold value determination unit 32 compares the totaled knocking sound (extracted signal) with the threshold value, and if the totaled knocking sound (extracted signal) exceeds the threshold value, determines that there is knocking, and the totaled knocking sound (extracted). If the signal) is less than or equal to the threshold value, it is determined that there is no knocking. After that, the threshold value determination unit 32 outputs the knocking presence / absence determination result and the knocking sound (extracted signal) input from the second estimation unit 31 to the monitor 7 (FIG. 1).
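The threshold determination can be sketched in a few lines. The sketch assumes the same per-cycle score as the threshold calculation (sum of absolute spectrogram values); the function name has_knocking and the sample values are hypothetical.

```python
def has_knocking(spectrogram, threshold):
    """Return True only when the summed magnitude strictly exceeds the threshold."""
    score = sum(abs(v) for frame in spectrogram for v in frame)
    return score > threshold


# Hypothetical extracted signals for two cycles and a stored threshold of 5.0.
quiet_cycle = [[3 + 4j, 0j], [0j, 0j]]      # score 5.0, equal to the threshold
loud_cycle = [[3 + 4j, 1 + 0j], [0j, 0j]]   # score 6.0
knock_quiet = has_knocking(quiet_cycle, 5.0)
knock_loud = has_knocking(loud_cycle, 5.0)
```

The strict comparison mirrors the text: a score above the threshold means knocking is present, and a score equal to or below it means no knocking.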

＜分離部＞ <Separation unit>
分離部４０は、入力物理量をノイズ成分と抽出信号とに分離する。本実施形態では、図６Ａに示すように、分離部４０は、入力物理量としてのエンジン近傍音９０をノイズ成分である雑音９１ｂと抽出信号である抽出ノッキング音９１ａとに分離する。 The separation unit 40 separates the input physical quantity into a noise component and an extracted signal. In the present embodiment, as shown in FIG. 6A, the separation unit 40 separates the engine proximity sound 90, which is the input physical quantity, into the noise 91b, which is the noise component, and the extracted knocking sound 91a, which is the extracted signal.

＜信号合成部＞ <Signal synthesis unit>
信号合成部５０は、各種の信号を合成して加工音を生成する。本実施形態では、図１１Ａ乃至図１２Ｃに示すように、信号合成部５０は、抽出信号（抽出ノッキング音９１ａ）のレベルを変更して、入力物理量（エンジン近傍音９０）から分離されたノイズ成分（雑音９１ｂ）と合成して加工音を生成する。信号合成部５０は、信号調整部５１と、信号出力部５２とを有している。信号調整部５１は、レベル指定部９で指定（入力）された信号の上昇レベル又は下降レベルに応じて信号のレベルを変更し、レベル変更信号とノイズ成分とを合成して加工音を生成する。信号出力部５２は、加工音（信号）を放音部（ヘッドホン８）に出力して、放音部に加工音を放音させる。 The signal synthesis unit 50 synthesizes various signals to generate a processed sound. In the present embodiment, as shown in FIGS. 11A to 12C, the signal synthesis unit 50 changes the level of the extracted signal (extracted knocking sound 91a) and combines it with the noise component (noise 91b) separated from the input physical quantity (engine proximity sound 90) to generate the processed sound. The signal synthesis unit 50 includes a signal adjustment unit 51 and a signal output unit 52. The signal adjustment unit 51 changes the level of the signal according to the increase level or decrease level designated (input) by the level designation unit 9, and combines the level-changed signal with the noise component to generate the processed sound. The signal output unit 52 outputs the processed sound (signal) to the sound emitting unit (headphones 8) and causes the sound emitting unit to emit the processed sound.
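The level change and mixing performed by the signal adjustment unit 51 can be illustrated with a minimal sketch. It assumes time-domain waveforms of equal length and a level change expressed in decibels; the text does not state the unit of the designated level, so the dB convention, the function name, and the sample values are all assumptions.

```python
def make_processed_sound(extracted, noise, level_change_db):
    """Scale the extracted signal by a dB gain and add the separated noise component."""
    if len(extracted) != len(noise):
        raise ValueError("extracted and noise must have the same length")
    gain = 10.0 ** (level_change_db / 20.0)  # amplitude gain for a level change in dB
    return [gain * e + n for e, n in zip(extracted, noise)]


# Hypothetical waveforms: raise the extracted knocking sound by 20 dB (gain of 10).
extracted = [0.1, -0.2, 0.0]
noise = [1.0, 1.0, 1.0]
louder = make_processed_sound(extracted, noise, 20.0)
unchanged = make_processed_sound(extracted, noise, 0.0)
```

With a level change of 0 dB the output is simply the sum of the two separated components, which under the subtraction-based separation described for the learning mode corresponds to the original input sound.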

＜学習モード時の動作＞ <Operation in learning mode>
以下、図３Ａ、図３Ｂ、図８Ａ、図８Ｂ、図８Ｃ、及び図９を参照して、信号処理装置１０の学習モード時の動作について説明する。図３Ａは、学習モードの説明図である。図３Ｂは、信号処理装置１０の学習モード時の動作説明図である。図８Ａは、学習時におけるノイズ成分と教師信号との関係を表す説明図である。図８Ｂは、学習時における抽出信号と教師信号との関係を表す説明図である。図８Ｃは、学習時における抽出信号と推定信号との関係を表す説明図である。図９は、第１実施形態において、マスクαを生成するネットワークの重みＷを学習するニューラルネットワーク９４の説明図である。 Hereinafter, the operation of the signal processing device 10 in the learning mode will be described with reference to FIGS. 3A, 3B, 8A, 8B, 8C, and 9. FIG. 3A is an explanatory diagram of the learning mode. FIG. 3B is an explanatory diagram of the operation of the signal processing device 10 in the learning mode. FIG. 8A is an explanatory diagram showing the relationship between the noise component and the teacher signal during learning. FIG. 8B is an explanatory diagram showing the relationship between the extracted signal and the teacher signal during learning. FIG. 8C is an explanatory diagram showing the relationship between the extracted signal and the estimated signal during learning. FIG. 9 is an explanatory diagram of the neural network 94 that learns the weight W of the network that generates the mask α in the first embodiment.

図３Ａに示すように、学習モード時において、信号処理装置１０の学習部２１は、エンジン近傍音９０（入力物理量）を雑音９１ｂ（ノイズ成分）と抽出ノッキング音９１ａ（抽出信号）とに分離する。その際に、学習部２１は、エンジン近傍音９０（入力物理量）にマスクαを掛け合わせて抽出ノッキング音９１ａ（抽出信号）を取得する。また、学習部２１は、エンジン近傍音９０（入力物理量）から抽出ノッキング音９１ａ（抽出信号）を差し引くことで、雑音９１ｂ（ノイズ成分）を取得する。本実施形態では、マスクαは、エンジン近傍音９０（入力物理量）に含まれるノッキング音の割合と位相成分（位相の修正量）を表す。位相成分については、図１０Ａ及び図１０Ｂを用いて後記する。 As shown in FIG. 3A, in the learning mode, the learning unit 21 of the signal processing device 10 separates the engine vicinity sound 90 (input physical quantity) into the noise 91b (noise component) and the extraction knocking sound 91a (extraction signal). .. At that time, the learning unit 21 acquires the extraction knocking sound 91a (extraction signal) by multiplying the engine proximity sound 90 (input physical quantity) by the mask α. Further, the learning unit 21 acquires the noise 91b (noise component) by subtracting the extraction knocking sound 91a (extraction signal) from the engine vicinity sound 90 (input physical quantity). In the present embodiment, the mask α represents the ratio of the knocking sound included in the engine near sound 90 (input physical quantity) and the phase component (phase correction amount). The phase component will be described later with reference to FIGS. 10A and 10B.
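The two operations described above, multiplying the input by the mask α and subtracting the result from the input, can be sketched as follows. The sketch assumes a complex spectrogram stored as nested lists and a complex mask of the same shape whose magnitude acts as the knocking-sound ratio and whose angle acts as the phase correction; the function name and the sample values are hypothetical.

```python
def separate(mixture, mask):
    """Split a complex spectrogram into (extracted, noise) using a per-bin mask."""
    extracted = [
        [m * y for m, y in zip(mask_row, mix_row)]
        for mask_row, mix_row in zip(mask, mixture)
    ]
    noise = [
        [y - s for y, s in zip(mix_row, ext_row)]
        for mix_row, ext_row in zip(mixture, extracted)
    ]
    return extracted, noise


# Hypothetical one-frame, two-bin example.
mixture = [[2 + 0j, 0 + 4j]]
mask = [[0.5j, 1 + 0j]]  # bin 0: ratio 0.5 with a +90 degree phase shift; bin 1: pass through
extracted, noise = separate(mixture, mask)
```

Because the noise is defined as the remainder, the two outputs always add back to the input bin by bin, regardless of the mask values.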

信号処理装置１０の学習部２１は、マスクαを生成するネットワークの重みＷ、及び、抽出ノッキング音９１ａ（抽出信号）を観測ノッキング筒内圧９３（教師信号）と同じ次元（単位）の推定ノッキング筒内圧９２（推定信号）に位相を加味して変換するための伝達関数Ｈを学習する。換言すると、学習部２１は、ネットワークの重みＷ、及び、伝達関数Ｈを学習する際に、マスクα、及び、伝達関数Ｈに対して、入力物理量（エンジン近傍音９０）に関連する振幅と位相成分を加味して学習する。本実施形態では、学習部２１は、抽出ノッキング音９１ａ（抽出信号）に対して、逆短時間フーリエ変換（ＩＳＴＦＴ）と高速フーリエ変換（ＦＦＴ）とを行い、伝達関数Ｈを掛け、逆高速フーリエ変換（ＩＦＦＴ）と短時間フーリエ変換（ＳＴＦＴ）とを行うことで、抽出ノッキング音９１ａ（抽出信号）を推定ノッキング筒内圧９２（推定信号）に変換している。なお、本実施形態では、伝達関数Ｈは、抽出ノッキング音９１ａ（抽出信号）を推定ノッキング筒内圧９２（推定信号）に変換するための振幅（ゲイン）と位相成分である。位相成分については、図１０Ａ及び図１０Ｂを用いて後記する。 The learning unit 21 of the signal processing device 10 learns the weight W of the network that generates the mask α, and the transfer function H for converting, with the phase taken into account, the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal), which has the same dimension (unit) as the observed knocking in-cylinder pressure 93 (teacher signal). In other words, when learning the network weight W and the transfer function H, the learning unit 21 takes into account, for both the mask α and the transfer function H, the amplitude and phase components related to the input physical quantity (engine proximity sound 90). In the present embodiment, the learning unit 21 applies an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted knocking sound 91a (extracted signal), multiplies the result by the transfer function H, and then applies an inverse fast Fourier transform (IFFT) and a short-time Fourier transform (STFT), thereby converting the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal). In the present embodiment, the transfer function H consists of the amplitude (gain) and phase components for converting the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal). The phase component will be described later with reference to FIGS. 10A and 10B.
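The middle of the conversion chain (FFT, multiplication by the complex transfer function H, IFFT) can be illustrated with a minimal sketch. It assumes the ISTFT has already produced a time-domain waveform and omits the final STFT; a naive DFT stands in for the FFT, and the function names and the values chosen for H are hypothetical rather than learned.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def apply_transfer_function(waveform, h):
    """Multiply each frequency bin by the complex gain h[k] and return a real waveform."""
    spectrum = dft(waveform)
    converted = [hk * xk for hk, xk in zip(h, spectrum)]
    return [v.real for v in idft(converted)]


# Hypothetical H: gain 2 and a -90 degree phase shift at bin 1 (conjugate at bin n-1).
n = 8
sound = [math.cos(2 * math.pi * t / n) for t in range(n)]
h = [1 + 0j] * n
h[1] = -2j
h[n - 1] = 2j
pressure = apply_transfer_function(sound, h)  # expected to equal 2 * sin(2*pi*t/n)
```

A real-valued H could only rescale each frequency component; the complex value lets the same multiplication also shift its phase, which is why the cosine input comes out as a doubled sine.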

本実施形態では、図３Ａに示すように、学習部２１は、ノイズ成分（雑音９１ｂ）を逆短時間フーリエ変換（ＩＳＴＦＴ）して求めた信号波形と教師信号（観測ノッキング筒内圧９３）とのコヒーレンスが小さくなるとともに、抽出信号（抽出ノッキング音９１ａ）を逆短時間フーリエ変換（ＩＳＴＦＴ）して求めた信号波形と教師信号（観測ノッキング筒内圧９３）とのコヒーレンスが大きくなるように、マスクαを生成するネットワークの重みＷ、及び、伝達関数Ｈを学習する。これにより、信号処理装置１０は、雑音９１ｂ（ノイズ成分）から筒内圧に起因する音を除去することができる。 In the present embodiment, as shown in FIG. 3A, the learning unit 21 learns the weight W of the network that generates the mask α and the transfer function H so that the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by applying the inverse short-time Fourier transform (ISTFT) to the noise component (noise 91b) becomes small, while the coherence between the teacher signal and the signal waveform obtained by applying the ISTFT to the extracted signal (extracted knocking sound 91a) becomes large. As a result, the signal processing device 10 can remove the sound caused by the in-cylinder pressure from the noise 91b (noise component).
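One way to picture the coherence objective is the sketch below. The passage does not give the loss formula, so everything here is an assumption: coherence is estimated as the magnitude-squared coherence averaged over segment spectra of each waveform, and the loss is the mean coherence of the noise with the teacher signal minus the mean coherence of the extracted signal with the teacher signal. The function names and the sample spectra are hypothetical.

```python
def magnitude_squared_coherence(frames_a, frames_b):
    """Per-bin coherence |Sab|^2 / (Saa * Sbb), averaged over segment spectra."""
    n_bins = len(frames_a[0])
    result = []
    for k in range(n_bins):
        sab = sum(a[k] * b[k].conjugate() for a, b in zip(frames_a, frames_b))
        saa = sum(abs(a[k]) ** 2 for a in frames_a)
        sbb = sum(abs(b[k]) ** 2 for b in frames_b)
        result.append(abs(sab) ** 2 / (saa * sbb) if saa > 0 and sbb > 0 else 0.0)
    return result


def coherence_loss(noise_frames, extracted_frames, teacher_frames):
    """Small when noise is incoherent and the extracted signal is coherent with the teacher."""
    coh_noise = magnitude_squared_coherence(noise_frames, teacher_frames)
    coh_extracted = magnitude_squared_coherence(extracted_frames, teacher_frames)
    return sum(coh_noise) / len(coh_noise) - sum(coh_extracted) / len(coh_extracted)


# Hypothetical single-bin segment spectra (two segments each).
teacher = [[1 + 0j], [1 + 0j]]
extracted = [[2 + 0j], [2 + 0j]]  # scaled copy of the teacher: coherence 1
noise = [[1 + 0j], [-1 + 0j]]     # sign flips between segments: coherence 0
loss = coherence_loss(noise, extracted, teacher)
```

Minimizing a quantity of this shape pushes pressure-related content out of the noise estimate and into the extracted signal, which matches the stated effect of removing sound caused by the in-cylinder pressure from the noise 91b.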

FIG. 3B shows the operation of the signal processing device 10 in the learning mode. The thick frames in FIG. 3B indicate the components that operate in the learning mode, and the thick arrows indicate the signals that are output in the learning mode.

As shown in FIG. 3B, in the learning mode, the signal processing device 10 connects the switch 14 to the learning mode connection unit M1 and outputs the spectrogram of the engine vicinity sound 90 (input physical quantity) and the observed knocking in-cylinder pressure 93 (teacher signal), both stored in the signal storage unit 13, to the learning unit 21.

In response, the learning unit 21 uses the neural network 94 (FIG. 3A) to learn the weights W of the network that generates the mask α for removing the noise 91b (noise component) from the engine vicinity sound 90 (input physical quantity), and the transfer function H that converts the extracted knocking sound 91a (extracted signal), extracted with the generated mask α, into the estimated knocking in-cylinder pressure 92 (estimated signal) of the engine 1 at the time knocking occurs. Each time a new engine vicinity sound 90 (input physical quantity) is input, the learning unit 21 generates a mask α corresponding to that engine vicinity sound 90.

The learning unit 21 then stores the learned parameters (the weights W of the network that generates the mask α, and the transfer function H) in the learned parameter storage unit 22. The learning unit 21 also stores the observed knocking in-cylinder pressure 93 (teacher signal) in the teacher signal storage unit 26a and the estimated knocking in-cylinder pressure 92 (estimated signal) in the estimated signal storage unit 26b.

<Operation in threshold calculation mode>
The operation of the signal processing device 10 in the threshold calculation mode is described below with reference to FIGS. 4A and 4B. FIG. 4A is an explanatory diagram of the threshold calculation mode. FIG. 4B is an explanatory diagram of the operation of the signal processing device 10 in the threshold calculation mode.

As shown in FIG. 4B, in the threshold calculation mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the first estimation unit 23 and the threshold calculation unit 24 of the learning processing unit 20 operate to calculate the threshold T used for the threshold determination of the extracted knocking sound (extracted signal) of the engine 1. The thick frames in FIG. 4B indicate the components that operate in the threshold calculation mode, and the thick arrows indicate the signals that are output in the threshold calculation mode.

As shown in FIG. 4B, in the threshold calculation mode, the signal processing device 10 connects the switch 14 to the threshold calculation mode connection unit M2 and outputs the spectrogram of the engine vicinity sound 90 (input physical quantity) stored in the signal storage unit 13 to the first estimation unit 23.

In response, the first estimation unit 23 acquires the weights W of the network that generates the mask α from the learned parameter storage unit 22. The first estimation unit 23 then extracts the extracted knocking sound 91a (extracted signal) of the engine 1 from the engine vicinity sound 90 (input physical quantity) using the mask α generated by the neural network 94 (FIG. 4A). FIG. 4A outlines the operation of the first estimation unit 23 at this time. The first estimation unit 23 then outputs the extracted knocking sound 91a (extracted signal) to the threshold calculation unit 24.

In response, the threshold calculation unit 24 sums, for each cycle of the engine 1, the absolute values of the spectrogram of the extracted knocking sound 91a (extracted signal), and adds a preset margin to calculate the threshold T. The threshold calculation unit 24 then stores the threshold T in the threshold storage unit 25.
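The threshold calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `cycle_sums` and `compute_threshold` are hypothetical, and the choice of reducing the per-cycle sums with `max()` before adding the margin is an assumption, since the text only states that the absolute values are summed per cycle and a preset margin is added.

```python
import numpy as np


def cycle_sums(spectrograms):
    """Return the sum of absolute spectrogram values for each engine cycle.

    `spectrograms` is a sequence of 2-D arrays (frequency x time), one per
    cycle, holding the extracted knocking sound (real or complex values).
    """
    return np.array([np.abs(s).sum() for s in spectrograms])


def compute_threshold(spectrograms, margin):
    """Threshold T = (largest per-cycle sum) + preset margin.

    Assumption: the per-cycle sums are combined with max(); the source
    does not say how the cycles are combined.
    """
    return float(cycle_sums(spectrograms).max() + margin)
```

For example, two cycles whose sums are 6 and 12, combined with a margin of 1, would give a threshold of 13 under this assumption.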

<Operation in determination mode>
The operation of the signal processing device 10 in the determination mode is described below with reference to FIGS. 5A and 5B. FIG. 5A is an explanatory diagram of the determination mode. FIG. 5B is an explanatory diagram of the operation of the signal processing device 10 in the determination mode.

As shown in FIG. 5B, in the determination mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the second estimation unit 31 (extracted signal estimation unit) and the threshold determination unit 32 of the determination processing unit 30 operate to determine whether knocking has occurred. The thick frames in FIG. 5B indicate the components that operate in the determination mode, and the thick arrows indicate the signals that are output in the determination mode.

As shown in FIG. 5B, in the determination mode, the signal processing device 10 connects the switch 14 to the determination mode connection unit M3 and outputs the spectrogram of the engine vicinity sound 90 (input physical quantity) stored in the signal storage unit 13 to the second estimation unit 31 (extracted signal estimation unit).

In response, the second estimation unit 31 acquires the weights W of the network that generates the mask α from the learned parameter storage unit 22. The second estimation unit 31 then extracts the extracted knocking sound 91a (extracted signal) from the engine vicinity sound 90 (input physical quantity) using the mask α generated by the neural network 94 (FIG. 5A). FIG. 5A outlines the operation of the second estimation unit 31 at this time. The second estimation unit 31 then outputs the extracted knocking sound 91a (extracted signal) to the threshold determination unit 32.

In response, the threshold determination unit 32 acquires the threshold T from the threshold storage unit 25 and compares the sum of the absolute values of the extracted knocking sound 91a (extracted signal) with the threshold T to determine whether knocking has occurred. The threshold determination unit 32 then outputs, for example, the determination result and a waveform diagram showing the relationship between the extracted knocking sound 91a (extracted signal) and the threshold T to the monitor 7 for display.
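The comparison performed by the threshold determination unit 32 can be illustrated with a short sketch. The function name `judge_knocking` is hypothetical, and treating a sum that exceeds T as "knocking present" is an assumption; the text only says that the two values are compared.

```python
import numpy as np


def judge_knocking(extracted_spectrogram, threshold):
    """Compare the summed magnitude of the extracted signal with threshold T.

    Returns (knocking_detected, level). Assumption: knocking is reported
    when the level is strictly greater than the threshold.
    """
    level = float(np.abs(extracted_spectrogram).sum())
    return level > threshold, level
```

Returning the level together with the decision makes it straightforward to plot the extracted signal level against T, as the display on the monitor 7 is described as doing.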

<Operation in separation mode>
The operation of the signal processing device 10 in the separation mode is described below with reference to FIGS. 6A and 6B. FIG. 6A is an explanatory diagram of the separation mode. FIG. 6B is an explanatory diagram of the operation of the signal processing device 10 in the separation mode.

As shown in FIG. 6B, in the separation mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the separation unit 40 operates to separate the engine vicinity sound 90 (input physical quantity) into the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component). The thick frames in FIG. 6B indicate the components that operate in the separation mode, and the thick arrows indicate the signals that are output in the separation mode.

As shown in FIG. 6B, in the separation mode, the signal processing device 10 connects the switch 14 to the separation mode connection unit M4 and outputs the spectrogram of the engine vicinity sound 90 (input physical quantity) stored in the signal storage unit 13 to the separation unit 40.

In response, the separation unit 40 acquires the weights W of the network that generates the mask α from the learned parameter storage unit 22. The separation unit 40 then uses the mask α generated by the neural network 94 (FIG. 6A) to separate the engine vicinity sound 90 (input physical quantity) into the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component). FIG. 6A outlines the operation of the separation unit 40 at this time. The separation unit 40 then stores the extracted knocking sound 91a (extracted signal) in the extracted signal storage unit 26d and the noise 91b (noise component) in the noise component storage unit 26c.
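As a rough illustration of mask-based separation, the sketch below multiplies a spectrogram by a mask and treats the remainder as noise. The function name `separate` is hypothetical. Two points are assumptions rather than statements from the source: that the mask is a (possibly complex, phase-aware) array of the same shape as the spectrogram, and that the noise component is the residual left after the extracted signal is subtracted.

```python
import numpy as np


def separate(spectrogram, mask):
    """Split a spectrogram into an extracted signal and a noise component.

    extracted = mask * spectrogram (element-wise).
    noise     = spectrogram - extracted (assumed residual definition).
    A complex-valued mask can change phase as well as magnitude.
    """
    extracted = mask * spectrogram
    noise = spectrogram - extracted
    return extracted, noise
```

With this residual definition the two outputs always add back up to the input spectrogram, so nothing is lost between the two storage units.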

<Operation in sensory test mode>
The operation of the signal processing device 10 in the sensory test mode is described below with reference to FIGS. 7A and 7B. FIG. 7A is an explanatory diagram of the sensory test mode. FIG. 7B is an explanatory diagram of the operation of the signal processing device 10 in the sensory test mode. The sensory test mode is a mode for calculating a threshold T based on the auditory perception of an inspector.

As shown in FIG. 7B, in the sensory test mode, unlike the learning mode of FIG. 3B, the learning unit 21 of the signal processing device 10 is stopped. Instead, the signal adjustment unit 51 and the signal output unit 52 of the signal synthesis unit 50, together with the first estimation unit 23 and the threshold calculation unit 24 of the learning processing unit 20, operate to calculate the threshold T based on the inspector's auditory perception. The thick frames in FIG. 7B indicate the components that operate in the sensory test mode, and the thick arrows indicate the signals that are output in the sensory test mode.

As shown in FIG. 7B, in the sensory test mode, the signal processing device 10 connects the switch 14 to the sensory test mode connection unit M5, thereby outputting an execution instruction signal for the sensory test mode to the signal adjustment unit 51.

In response, the signal adjustment unit 51 receives level designation information from the level designation unit 9 and acquires the extracted knocking sound 91a (extracted signal) stored in the extracted signal storage unit 26d and the noise 91b (noise component) stored in the noise component storage unit 26c. Based on the level designation information, the signal adjustment unit 51 generates a processed sound 91c from the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component). Specifically, the signal adjustment unit 51 raises or lowers the level (magnitude) of the extracted knocking sound 91a (extracted signal) by the amount specified in the level designation information, combines the result with the noise 91b (noise component) to generate the processed sound 91c, and outputs it to the signal output unit 52. The signal output unit 52 outputs the processed sound 91c to the headphones 8 (sound emitting unit), which emit it.
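The level adjustment and mixing step can be sketched as below. The function name `make_processed_sound` is hypothetical, and two details are assumptions: that the level designation is expressed in decibels and applied as an amplitude gain, and that both inputs are time-domain waveforms of equal length.

```python
import numpy as np


def make_processed_sound(extracted, noise, level_db):
    """Scale the extracted knocking sound and mix it with the noise.

    level_db > 0 raises the knocking sound, level_db < 0 lowers it.
    Assumption: the level is given in dB and maps to an amplitude gain
    of 10 ** (level_db / 20).
    """
    gain = 10.0 ** (level_db / 20.0)
    return gain * np.asarray(extracted) + np.asarray(noise)
```

A level of 0 dB reproduces the plain sum of the two separated components, which gives a convenient reference point when the inspector steps the level up or down.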

The signal adjustment unit 51 also outputs the processed sound 91c to the first estimation unit 23. The first estimation unit 23 acquires the weights W of the network that generates the mask α from the learned parameter storage unit 22, and extracts the extracted knocking sound 91a (extracted signal) of the engine 1 from the processed sound 91c using the mask α generated by the neural network 94 (FIG. 7A). FIG. 7A outlines the operation of the first estimation unit 23 at this time. The first estimation unit 23 then outputs the extracted knocking sound 91a (extracted signal) to the threshold calculation unit 24. The threshold calculation unit 24 sums, for each cycle of the engine 1, the absolute values of the spectrogram of the extracted knocking sound 91a (extracted signal), and sets an arbitrary value equal to or less than that sum as the threshold T. The threshold calculation unit 24 then stores the threshold T in the threshold storage unit 25.

In the prior art described in Patent Document 2 and Patent Document 3, the objective function used when training the neural network does not include a function that measures the degree of sound separation. Consequently, knocking sound (the target sound) could be mixed into the noise (sound other than knocking sound, i.e. background sound) that is removed from the engine vicinity sound (input physical quantity). That is, because this prior art only minimizes the squared error between the estimated in-cylinder pressure and the measured in-cylinder pressure (teacher data) during training, it does not monitor whether the "engine sound from which noise has been removed" is one from which only the noise has been properly removed. For example, the prior art described in Patent Document 3 extracts the "engine sound from which noise has been removed", that is, the knocking sound (corresponding to the "extracted signal" of the present embodiment), by applying to the engine vicinity sound a mask α that does not consider the phase component associated with the engine vicinity sound (input physical quantity). Because no function measuring the degree of sound separation is used, knocking sound (target sound) that should not be removed could be removed from the engine vicinity sound together with the noise. The prior art described in Patent Document 3 could therefore degrade the performance of evaluating whether knocking has occurred.

In contrast, the signal processing device 10 according to the present embodiment is configured to evaluate, during training, whether knocking sound (target sound) is mixed into the noise (sound other than knocking sound, i.e. background sound) removed from the engine vicinity sound (input physical quantity). To this end, the signal processing device 10 is trained so that the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by applying the inverse short-time Fourier transform (ISTFT) to the noise component (noise 91b) becomes small. The signal processing device 10 according to the present embodiment can thus evaluate the presence or absence of knocking more accurately than the prior art described in Patent Documents 2 and 3.

FIG. 8A shows the spectrogram F11 of the noise 91b (noise component), the signal waveform F12 obtained by applying the inverse short-time Fourier transform (ISTFT) to the spectrogram F11, and the signal waveform F93 of the observed knocking in-cylinder pressure 93 (teacher signal). FIG. 8A also shows the coherence F13 between the signal waveform F12 and the signal waveform F93, which is minimized in the course of learning.

FIG. 8B shows the spectrogram F21 of the extracted knocking sound 91a (extracted signal), the signal waveform F22 obtained by applying the inverse short-time Fourier transform (ISTFT) to the spectrogram F21, and the signal waveform F93 of the observed knocking in-cylinder pressure 93 (teacher signal). FIG. 8B also shows the coherence F23 between the signal waveform F22 and the signal waveform F93, which is maximized in the course of learning.

Coherence is defined by the coherence function γ² below. The coherence function γ² indicates the degree of association between the input and the output of a system. It is obtained by dividing the squared absolute value of the cross spectrum by the product of the power spectrum of the measured input and the power spectrum of the system output.

$$\gamma^2(f) = \frac{|W_{xy}(f)|^2}{W_{xx}(f)\,W_{yy}(f)}$$

Here, Wxy is the averaged cross spectrum, Wxx is the averaged power spectrum of x, and Wyy is the averaged power spectrum of y. The coherence function γ² takes values from 0 to 1. When γ²(f) is 1, the output of the system at frequency f is entirely attributable to the measured input. When γ²(f) is 0, the output of the system at frequency f is completely unrelated to the measured input. When 0 < γ²(f) < 1, the output is considered to contain signals unrelated to the measurement, noise generated inside the system, nonlinearity of the system, and the like.
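To make the behaviour of γ² concrete, the following sketch estimates it with `scipy.signal.coherence`, which computes the magnitude-squared coherence from Welch-averaged spectra. The signals are synthetic stand-ins chosen for illustration only (a 1 kHz tone plus noise as the "teacher", a delayed and attenuated copy as a related signal, and independent white noise as an unrelated signal); none of the numeric settings come from the source.

```python
import numpy as np
from scipy.signal import coherence

fs = 10_000                      # sampling rate in Hz (illustrative)
rng = np.random.default_rng(0)
t = np.arange(fs) / fs           # one second of samples

# Stand-in "teacher" signal: 1 kHz tone with a little measurement noise.
teacher = np.sin(2 * np.pi * 1000 * t) + 0.1 * rng.standard_normal(t.size)
# A signal linearly related to the teacher (scaled, delayed, plus noise).
related = 0.5 * np.roll(teacher, 3) + 0.1 * rng.standard_normal(t.size)
# A signal with no relation to the teacher.
unrelated = rng.standard_normal(t.size)

# Magnitude-squared coherence, averaged over overlapping segments.
f, coh_related = coherence(related, teacher, fs=fs, nperseg=256)
_, coh_unrelated = coherence(unrelated, teacher, fs=fs, nperseg=256)

k = int(np.argmin(np.abs(f - 1000)))   # frequency bin nearest 1 kHz
print(f"gamma^2 near 1 kHz, related:   {coh_related[k]:.3f}")
print(f"gamma^2 near 1 kHz, unrelated: {coh_unrelated[k]:.3f}")
```

Near 1 kHz the related signal yields a value close to 1 and the unrelated signal a value close to 0, which is the contrast the training objective relies on when it pushes the noise component toward low coherence and the extracted signal toward high coherence.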

Coherent output power may also be used as a way to evaluate whether knocking sound (target sound) is mixed into the noise (sound other than knocking sound, i.e. background sound) removed from the engine vicinity sound (input physical quantity). In that case, the learning unit 21 learns the weights W of the network that generates the mask α and the transfer function H so that the coherence and/or coherent output power between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by applying the inverse short-time Fourier transform (ISTFT) to the noise component (noise 91b) becomes small, while the coherence and/or coherent output power between the teacher signal and the signal waveform obtained by applying the ISTFT to the extracted signal (extracted knocking sound 91a) becomes large. This allows the signal processing device 10 to further improve the performance of evaluating whether knocking has occurred.

The coherent output power is defined as the product of the coherence function γ² and the averaged power spectrum Wyy of y, that is, γ²(f)·Wyy(f). It indicates the portion of the power contained in the system output that is attributable to the input.
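The product γ²·Wyy can be computed directly from standard spectral estimates. The sketch below is one possible way to do so with `scipy.signal.coherence` and `scipy.signal.welch`; the function name `coherent_output_power` and the segment length are illustrative assumptions, and both estimators are called with the same segment length so that their frequency grids match.

```python
import numpy as np
from scipy.signal import coherence, welch


def coherent_output_power(x, y, fs, nperseg=256):
    """Return (f, gamma^2(f) * Wyy(f)) for input x and output y.

    gamma^2 is the magnitude-squared coherence and Wyy the Welch-averaged
    power spectrum of y, both estimated with identical segmenting.
    """
    f, gamma2 = coherence(x, y, fs=fs, nperseg=nperseg)
    _, wyy = welch(y, fs=fs, nperseg=nperseg)
    return f, gamma2 * wyy
```

Because γ² never exceeds 1, the result is bounded above by Wyy at every frequency, and it equals Wyy when the output is fully explained by the input.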

As shown in FIG. 3A, the learning unit 21 also uses the neural network 94 to learn the weights W of the network that generates the mask α and the transfer function H so that the error between the estimated knocking in-cylinder pressure 92 (estimated signal) and the spectrogram obtained by applying the short-time Fourier transform (STFT) to the observed knocking in-cylinder pressure 93 (teacher signal) is minimized. This allows the signal processing device 10 to improve the accuracy with which the network weights W and the transfer function H are learned.

The learning unit 21 may also use the neural network 94 to learn the weights W of the network that generates the mask α and the transfer function H so that the error between the observed knocking in-cylinder pressure 93 (teacher signal) and the signal waveform of the estimated knocking in-cylinder pressure 92 (estimated signal) is minimized, where that waveform is obtained by applying the inverse short-time Fourier transform (ISTFT) and the fast Fourier transform (FFT) to the extracted signal (extracted knocking sound 91a), multiplying by the transfer function H, and applying the inverse fast Fourier transform (IFFT). This allows the signal processing device 10 to further improve the accuracy with which the network weights W and the transfer function H are learned.

FIG. 8C shows an example of converting the extracted knocking sound 91a (extracted signal) into the estimated knocking in-cylinder pressure 92 (estimated signal). FIG. 8C shows the spectrogram F31 of the extracted knocking sound 91a (extracted signal), the signal waveform F32 obtained by applying the inverse short-time Fourier transform (ISTFT) to the spectrogram F31, and the spectrum F33 obtained by further applying the fast Fourier transform (FFT). FIG. 8C also shows the spectrum F35 of the estimated knocking in-cylinder pressure, obtained by multiplying the spectrum F33 by the transfer function H. As an example of the transfer function H, FIG. 8C shows a frequency response characteristic F34 that gives the correspondence between frequency and coefficient. Finally, FIG. 8C shows the signal waveform F36 obtained by applying the inverse fast Fourier transform (IFFT) to the spectrum F35, and the spectrogram F37 of the estimated knocking in-cylinder pressure 92 (estimated signal) obtained by further applying the short-time Fourier transform (STFT).
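The chain F31 → F32 → F33 → F35 → F36 → F37 can be sketched with SciPy and NumPy as follows. This is an illustration under assumptions, not the embodiment's implementation: the function name `sound_to_pressure_spectrogram` is hypothetical, the STFT settings (Hann window, 256-sample segments, SciPy defaults otherwise) are arbitrary, a one-sided FFT (`rfft`) is used because the waveform is real, and H is supplied as a fixed complex array rather than learned.

```python
import numpy as np
from scipy.signal import stft, istft


def sound_to_pressure_spectrogram(extracted_spec, H, fs, nperseg=256):
    """Convert an extracted-sound spectrogram into an estimated-pressure one.

    H is a complex frequency response (gain and phase) with one value per
    rfft bin of the reconstructed waveform, i.e. length n // 2 + 1.
    """
    _, waveform = istft(extracted_spec, fs=fs, nperseg=nperseg)        # F31 -> F32
    spectrum = np.fft.rfft(waveform)                                   # F32 -> F33
    pressure_spectrum = H * spectrum                                   # F33 x F34 -> F35
    pressure_wave = np.fft.irfft(pressure_spectrum, n=waveform.size)   # F35 -> F36
    _, _, pressure_spec = stft(pressure_wave, fs=fs, nperseg=nperseg)  # F36 -> F37
    return pressure_wave, pressure_spec
```

Applying H to the spectrum of the whole reconstructed waveform, rather than frame by frame on the spectrogram, lets a single gain and phase value per frequency act on the entire signal, which matches the description of H as a frequency response characteristic.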

In the present embodiment, the learning unit 21 learns the weights W of the network that generates the mask α using a convolutional neural network (CNN). In addition, the learning unit 21 learns, as the transfer function H, the weights by which the spectrum F33 of the extracted signal is multiplied.

As shown in FIG. 9, one example of the neural network 94 that learns the weights W of the network generating the mask α is the U-Net 95. The U-Net 95 is a type of encoder-decoder model and is a deep learning technique used for image recognition and sound separation. For sound separation, the U-Net 95 performs convolutions (with a stride of 2 or more) along the downward path 96 (encoder), extracting sound features as the layers 97 become deeper. Along the upward path 98 (decoder), it generates the mask α from the extracted sound features by deconvolution and upsampling (expansion). Up to this point the configuration is that of a general encoder-decoder model. In addition, the U-Net 95 merges the output 99 of each encoder convolution layer into the corresponding decoder convolution layer. This allows the U-Net 95 to generate a more accurate mask α than a general encoder-decoder model. The signal processing device 10 can obtain the extracted signal (extracted knocking sound 91a) by generating the mask α and multiplying the engine vicinity sound 90 by it.

Because the learning unit 21 learns the weights W of the mask-generating network with the U-Net 95 in this way, the trained network generates a mask α appropriate to the engine vicinity sound of the engine 1 that is input. This allows the signal processing device 10 to estimate the knocking in-cylinder pressure of the engine 1 accurately.

<Effect of phase component>
The effect of the phase component is described below with reference to FIGS. 10A and 10B. FIG. 10A is an explanatory diagram of a calculation example in which the phase components of pressure, vibration, sound, and the like are not considered. FIG. 10B is an explanatory diagram of a calculation example in which these phase components are considered. In FIGS. 10A and 10B, the arrows inside the circles represent the phase components contained in the pressure, vibration, sound, and the like.

The examples shown in FIGS. 10A and 10B both assume the following condition.
- The system is a linear time-invariant system (a system in which the transfer characteristic between input and output is linear and does not change with time).

In the example shown in FIG. 10A, when the engine 1 is driven, the in-cylinder pressure sensor 5 (FIG. 1) observes the observed knocking in-cylinder pressure 93. As the observed knocking in-cylinder pressure 93 propagates through the engine 1, the knocking sound 82 is emitted. The knocking sound 82 then combines with the mechanical noise 83 (noise component), which is the background sound, and the sound pressure sensor 4 observes the result as the engine vicinity sound 90 (input physical quantity).

FIG. 10A shows an example of a schematic calculation of the observed knocking in-cylinder pressure 93, the knocking vibration 81, the knocking sound 82, the mechanical noise 83, and the engine vicinity sound 90, as follows, on the assumption that the phase φ does not change.
Observed knocking in-cylinder pressure 93 = A sin(Ωt + φ)
Knocking vibration 81 = AB sin(Ωt + φ)
Knocking sound 82 = ABC sin(Ωt + φ)
Mechanical noise 83 = N sin(Ωt + φN)
Engine vicinity sound 90 = (ABC + N) sin(Ωt + φ)

In contrast, FIG. 10B is an example in which the phase is taken into account. It shows an example of a schematic calculation of the observed knocking in-cylinder pressure 93, the knocking vibration 81, the knocking sound 82, the mechanical noise 83, and the engine vicinity sound 90, as follows, taking into account that the phase φ changes each time the dimension (unit) changes.
Observed knocking in-cylinder pressure 93 = A sin(Ωt + φA)
Knocking vibration 81 = AB sin(Ωt + φA + φB)
Knocking sound 82 = ABC sin(Ωt + φA + φB + φC)
Mechanical noise 83 = N sin(Ωt + φN)
Engine vicinity sound 90 = ABC sin(Ωt + φA + φB + φC) + N sin(Ωt + φN)
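
The difference between the two calculation examples can be checked numerically. The following is a minimal sketch using phasors, not part of the disclosed embodiment. The amplitude and phase values, and the helper name `phasor`, are arbitrary assumptions chosen only for illustration.

```python
import cmath

def phasor(amplitude, phase):
    """Complex phasor representing amplitude * sin(Ωt + phase)."""
    return amplitude * cmath.exp(1j * phase)

# Hypothetical values (assumptions, not taken from the patent).
A, B, C, N = 1.0, 0.5, 0.8, 0.3
phi_A, phi_B, phi_C, phi_N = 0.1, 0.7, 1.2, 2.5

# FIG. 10A style: phases are ignored, so the amplitudes simply add.
amp_no_phase = A * B * C + N

# FIG. 10B style: each stage adds its own phase shift and the
# mechanical noise keeps its own phase, so the sum is a phasor sum.
knock = phasor(A * B * C, phi_A + phi_B + phi_C)
noise = phasor(N, phi_N)
amp_with_phase = abs(knock + noise)

print(amp_no_phase, amp_with_phase)
```

With these assumed values the phase-aware amplitude is smaller than the simple sum of amplitudes, which is why a model that ignores phase can misjudge how much of the observed sound belongs to the knocking sound.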

The example shown in FIG. 10B takes into account the change in the phase component that occurs when pressure becomes vibration and is then radiated as sound, so it allows the signal to be analyzed better than the example shown in FIG. 10A. Accordingly, as shown in FIG. 3A, the signal processing device 10 is configured to take the phase into account when learning the weights W of the network that generates the mask α and the transfer function H. Such a signal processing device 10 can satisfactorily separate the engine vicinity sound 90 (input physical quantity) into the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component). That is, the signal processing device 10 can perform this separation so that the knocking sound (target sound) does not leak into the noise (sounds other than the knocking sound, that is, the background sound). Such a signal processing device 10 can improve the performance of evaluating the presence or absence of knocking compared with the conventional techniques described in Patent Documents 2 and 3. It can also carry out a good sensory test.

(Outline of sensory test)
The outline of the sensory test is described below with reference to FIGS. 11A to 12C. FIGS. 11A to 12C each illustrate signal processing in the sensory test.

As shown in FIG. 2, the pressure and sound signals are collected by the data collection device 6 and output to the signal processing device 10. In the signal processing device 10, the signal cutout unit 11 cuts out each signal, the spectrogram calculation unit 12 calculates a spectrogram, and the spectrogram is stored in the signal storage unit 13.

The signal processing device 10 according to the present embodiment can execute the separation mode and the sensory test mode described above. In the separation mode, the signal processing device 10 separates the input physical quantity (engine vicinity sound 90) into the extracted signal (extracted knocking sound 91a) and the noise component (noise 91b). In the sensory test mode, the signal processing device 10 changes the level of the extracted signal (extracted knocking sound 91a) to generate a level-changed extracted signal (level-changed knocking sound 91aa (FIG. 12A)), and combines it with the noise component (noise 91b) to generate the processed sound 91c (FIG. 12C).

In the sensory test mode, the inspector wears the headphones 8 (FIG. 1) and stands by, ready to operate the level designation unit 9. The operator then announces the acceptable level to those nearby and operates the level designation unit 9 to input level designation information to the signal processing device 10. The signal processing device 10 changes the level of the extracted signal (extracted knocking sound 91a) based on the input level designation information.

FIG. 11A shows the input physical quantity (engine vicinity sound 90). FIGS. 11B and 11C show the extracted signal (extracted knocking sound 91a) and the noise component (noise 91b) separated from the input physical quantity (engine vicinity sound 90) in the separation mode. FIG. 12A shows the level-changed extracted signal (level-changed knocking sound 91aa) generated in the sensory test mode by raising the level of the extracted signal (extracted knocking sound 91a) by 3 dB. FIG. 12B shows the noise component (noise 91b) to be combined with the level-changed extracted signal (level-changed knocking sound 91aa). FIG. 12C shows the processed sound 91c obtained by combining the level-changed extracted signal (level-changed knocking sound 91aa) with the noise component (noise 91b). By generating the processed sound 91c, the signal processing device 10 can make the target sound to be inspected (knocking sound) easier to distinguish. Such a signal processing device 10 allows the inspector to recognize the presence or absence of the target sound (knocking sound) with high accuracy, which improves inspection performance. In addition, the signal processing device 10 obtains the sum of the absolute values of the spectrogram of a processed sound 91c that the inspector has judged unacceptable, and writes an arbitrary value equal to or less than that sum to the threshold storage unit 25. As a result, the threshold T becomes a value that matches the inspector's sensory judgment.
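
The level change and mixing used to build the processed sound can be sketched as follows. This is a minimal illustration under stated assumptions: the signals are time-domain amplitude waveforms, the level change uses the amplitude convention of 20·log10, and the sample values and function names (`change_level`, `mix`) are hypothetical. The source does not specify these details.

```python
def change_level(samples, gain_db):
    """Scale an amplitude waveform by gain_db decibels (20*log10 convention, assumed)."""
    gain = 10.0 ** (gain_db / 20.0)
    return [gain * s for s in samples]

def mix(extracted, noise):
    """Add the level-changed extracted signal and the noise component sample by sample."""
    return [e + n for e, n in zip(extracted, noise)]

# Hypothetical sample values, for illustration only.
extracted_knock = [0.0, 0.2, -0.1, 0.05]
noise = [0.01, -0.02, 0.03, 0.0]

level_changed = change_level(extracted_knock, 3.0)  # raise the knocking sound by 3 dB
processed = mix(level_changed, noise)               # processed sound for listening
```

Because only the extracted signal is scaled, the background noise heard by the inspector stays the same while the knocking sound becomes more or less prominent.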

Depending on how the device is operated, in the sensory test mode the signal processing device 10 may multiply the level-changed extracted signal (level-changed knocking sound 91aa) by a transfer function He (not shown) from the measurement target to the listener's ear position, and then combine the result with the noise component at the listener's ear position to generate the processed sound.

<Operation of signal processing device (estimation device)>
The operation of the signal processing device 10 is described below with reference to FIGS. 13 to 18. FIG. 13 is a flowchart showing the data collection process of the signal processing device 10 (estimation device). FIG. 14A is a flowchart showing the learning process of the signal processing device 10 (estimation device). FIG. 14B is a flowchart showing a subroutine of the learning process. FIG. 14C is a flowchart showing a modified example of the subroutine of the learning process. FIG. 15A is a flowchart showing the threshold calculation process of the signal processing device 10 (estimation device). FIG. 15B is a flowchart showing the threshold calculation process that the signal processing device 10 (estimation device) performs after the separation process of FIG. 17 and the sensory test process of FIG. 18. FIG. 16 is a flowchart showing the determination process of the signal processing device 10 (estimation device). FIG. 17 is a flowchart showing the separation process of the signal processing device 10 (estimation device). FIG. 18 is a flowchart showing the sensory test process of the signal processing device 10 (estimation device).

In the learning mode, the signal processing device 10 performs the data collection process of FIG. 13 and then the learning process of FIGS. 14A and 14B. In the threshold calculation mode, it performs the data collection process of FIG. 13 and then the threshold calculation process of FIG. 15A. When the separation process has been performed in the separation mode and the sensory test process has been performed in the sensory test mode, it performs the threshold calculation process of FIG. 15B after those two processes. In the determination mode, it performs the data collection process of FIG. 13 and then the determination process of FIG. 16; however, it may instead perform the determination process of FIG. 16 in real time while performing the data collection process of FIG. 13. In the separation mode, it performs the data collection process of FIG. 13 and then the separation process of FIG. 17. In the sensory test mode, it performs the sensory test process of FIG. 18 before performing the threshold calculation process of FIG. 15B.
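
The correspondence between modes and flowcharts described above can be summarized as a lookup table. This is only an illustrative sketch; the dictionary name, the mode keys, and the helper `processes_for` are hypothetical and do not appear in the source.

```python
# Mode -> ordered list of processes, as described in the text.
# The key names are hypothetical labels chosen for this sketch.
MODE_SEQUENCES = {
    "learning": ["FIG. 13 data collection", "FIG. 14A/14B learning"],
    "threshold_calculation": ["FIG. 13 data collection", "FIG. 15A threshold calculation"],
    "determination": ["FIG. 13 data collection", "FIG. 16 determination"],
    "separation": ["FIG. 13 data collection", "FIG. 17 separation"],
    "sensory_test": ["FIG. 18 sensory test", "FIG. 15B threshold calculation"],
}

def processes_for(mode):
    """Return the ordered processes for a mode; raises KeyError for unknown modes."""
    return MODE_SEQUENCES[mode]

for name in MODE_SEQUENCES:
    print(name, "->", " then ".join(processes_for(name)))
```

The table does not capture the real-time variant of the determination mode, in which the determination process runs while data collection is still in progress.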

(Data collection process)
The data collection process is described below with reference to FIG. 13.
As shown in FIG. 13, in step S20, the data collection device 6 inputs the sound pressure signal, the in-cylinder pressure signal (teacher signal), and the angle information to the signal cutout unit 11. In the separation mode, the threshold calculation mode, or the determination mode, the in-cylinder pressure signal does not need to be input in step S20.
In step S21, the signal cutout unit 11 calculates the combustion stroke timing of each cylinder from the angle information.

In step S22, the signal cutout unit 11 cuts out the sound pressure signal and the in-cylinder pressure signal near top dead center (TDC), in accordance with the combustion stroke timing calculated in step S21.
In step S23, the spectrogram calculation unit 12 applies a short-time Fourier transform (STFT) to the sound pressure signal cut out in step S22 and calculates the spectrogram of the sound pressure signal.
In step S24, the spectrogram calculation unit 12 writes the spectrogram of the sound pressure signal to the signal storage unit 13. The signal cutout unit 11 also writes the in-cylinder pressure signal (teacher signal) to the signal storage unit 13.
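
Steps S22 and S23 can be sketched as follows. This is a minimal, standard-library illustration rather than the device's actual implementation. The Hann window, frame length, hop size, test tone, and the function names `cut_segment` and `stft` are all assumptions; the source only states that a segment near TDC is cut out and that an STFT is applied.

```python
import cmath
import math

def cut_segment(signal, center, half_width):
    """Cut samples around a given index (for example, a TDC sample index)."""
    start = max(0, center - half_width)
    return signal[start:center + half_width]

def stft(signal, frame_len, hop):
    """Naive STFT with a Hann window: a list of frames, each a list of complex bins."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(bins)
    return frames

# Hypothetical test signal: an 8 Hz tone sampled at 64 Hz.
fs = 64
sound = [math.sin(2 * math.pi * 8 * n / fs) for n in range(256)]

segment = cut_segment(sound, center=128, half_width=32)  # 64 samples around the center
spectrogram = stft(segment, frame_len=16, hop=8)         # complex values keep the phase
```

The complex bins retain both amplitude and phase, which matters here because the learning described in the text takes the phase into account.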

(Learning process)
The learning process executed in the learning mode is described below with reference to FIGS. 14A to 14C.
As shown in FIG. 14A, in step S30, the signal processing device 10 reads the spectrogram of the sound pressure signal and the in-cylinder pressure signal from the signal storage unit 13 and inputs them to the learning unit 21.
In step S31, the learning unit 21 uses the spectrogram of the sound pressure signal and the in-cylinder pressure signal input in step S30 to learn, with the neural network 94, the weights W of the network that generates the mask α and the transfer function H, taking into account the amplitude and phase components related to the input physical quantity. The learning unit 21 learns the weights W of the network that generates the mask α with a convolutional neural network (for example, a U-Net), and learns, as the transfer function H, the weights by which the spectrum of the extracted signal is multiplied.

In step S32, the learning unit 21 writes the learned weights W of the network that generates the mask α and the learned transfer function H to the trained parameter storage unit 22.

The process of step S31 is carried out by, for example, the series of routines shown in FIG. 14B.
As shown in FIG. 14B, in step S101, the learning unit 21 inputs the input physical quantity to a convolutional neural network whose weights W have been initialized in advance with random numbers, and uses the mask generated in this way to separate the input physical quantity into the noise component and the extracted signal.

In step S102, the learning unit 21 multiplies the extracted signal (extracted knocking sound 91a) by the transfer function H to convert it into the estimated signal (estimated knocking in-cylinder pressure 92).
In step S103, the learning unit 21 updates the weights W of the network that generates the mask α and the transfer function H. Specifically, the learning unit 21 updates W and H so that the coherence between the signal waveform of the noise component (noise 91b) obtained by the inverse short-time Fourier transform (ISTFT) and the teacher signal (observed knocking in-cylinder pressure 93) becomes small, the coherence between the signal waveform of the extracted signal (extracted knocking sound 91a) obtained by the ISTFT and the teacher signal (observed knocking in-cylinder pressure 93) becomes large, and the error between the estimated signal (estimated knocking in-cylinder pressure 92) and the spectrogram obtained by applying the short-time Fourier transform (STFT) to the teacher signal (observed knocking in-cylinder pressure 93) becomes small.

In step S104, the learning unit 21 determines whether the coherence and the error have converged. If it determines that they have converged ("Yes"), it ends the process of step S31; if it determines that they have not converged ("No"), it repeats the process from step S101. In step S104, the coherence and the error are regarded as converged when the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by applying the inverse short-time Fourier transform (ISTFT) to the noise component (noise 91b) is close to 0, the coherence between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by applying the ISTFT to the extracted signal (extracted knocking sound 91a) is close to 1, and the error of the estimated signal (estimated knocking in-cylinder pressure 92) with respect to the spectrogram obtained by applying the short-time Fourier transform (STFT) to the teacher signal (observed knocking in-cylinder pressure 93) has converged to a value close to 0.
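
The three conditions of steps S103 and S104 can be combined into a single objective and a convergence check, sketched below. The source does not disclose how the terms are weighted or what counts as "close to"; the equal weighting, the tolerance `eps`, and the function names `total_loss` and `is_converged` are assumptions made for illustration. The coherence values and the spectrogram error are taken as already-computed numbers.

```python
def total_loss(coh_noise_teacher, coh_extracted_teacher, spectrogram_error):
    """Objective that decreases as the three conditions of step S103 are met.

    Equal weighting of the three terms is an assumption of this sketch.
    """
    return coh_noise_teacher + (1.0 - coh_extracted_teacher) + spectrogram_error

def is_converged(coh_noise_teacher, coh_extracted_teacher, spectrogram_error, eps=0.05):
    """Step S104 style check: noise coherence near 0, extracted coherence near 1, error near 0."""
    return (
        coh_noise_teacher < eps
        and coh_extracted_teacher > 1.0 - eps
        and spectrogram_error < eps
    )

# Hypothetical values from one training iteration.
print(total_loss(0.30, 0.70, 0.20))    # early iteration: large loss
print(is_converged(0.01, 0.99, 0.02))  # late iteration: all three conditions hold
```

In a training loop, steps S101 to S103 would be repeated until `is_converged` returns true.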

The process of step S31 may be carried out by, for example, the process shown in FIG. 14C instead of that of FIG. 14B.
The process shown in FIG. 14C differs from the process shown in FIG. 14B in that step S103a is performed instead of step S103.
In step S103a, the learning unit 21 updates the weights W of the network that generates the mask α and the transfer function H so as to satisfy a further condition in addition to the conditions of step S103. For this further condition, the learning unit 21 calculates the maximum value, and the difference between the maximum and minimum values (knocking intensity), of the signal waveform obtained by applying the inverse short-time Fourier transform (ISTFT) to the estimated signal (estimated knocking in-cylinder pressure 92), and likewise the maximum value and the knocking intensity of the teacher signal (observed knocking in-cylinder pressure 93). The condition is that the error between the two maximum values and the error between the two knocking intensities each become small. That is, in step S103a, the learning unit 21 updates the weights W and the transfer function H so that (a) the coherence between the signal waveform of the noise component (noise 91b) obtained by the ISTFT and the teacher signal (observed knocking in-cylinder pressure 93) becomes small, (b) the coherence between the signal waveform of the extracted signal (extracted knocking sound 91a) obtained by the ISTFT and the teacher signal becomes large, (c) the error between the estimated signal (estimated knocking in-cylinder pressure 92) and the spectrogram obtained by applying the short-time Fourier transform (STFT) to the teacher signal becomes small, and (d) the error in the maximum value and the error in the knocking intensity between the ISTFT waveform of the estimated signal and the teacher signal each become small.
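
The additional condition of step S103a compares two scalar features of the waveforms: the maximum value and the knocking intensity (maximum minus minimum). A minimal sketch follows. The use of absolute differences as the error measure, the sample waveforms, and the function names are assumptions; the source only says that the errors should become small.

```python
def peak_and_intensity(waveform):
    """Return (maximum value, knocking intensity = maximum - minimum)."""
    peak = max(waveform)
    return peak, peak - min(waveform)

def peak_errors(estimated, teacher):
    """Absolute errors in maximum value and knocking intensity (error measure assumed)."""
    est_peak, est_ki = peak_and_intensity(estimated)
    ref_peak, ref_ki = peak_and_intensity(teacher)
    return abs(est_peak - ref_peak), abs(est_ki - ref_ki)

# Hypothetical waveforms: ISTFT of the estimated signal and the observed teacher signal.
estimated = [0.0, 1.8, -0.9, 0.4]
teacher = [0.0, 2.0, -1.0, 0.5]

peak_err, ki_err = peak_errors(estimated, teacher)
print(peak_err, ki_err)
```

Adding these two terms to the objective pushes the estimated in-cylinder pressure to match the teacher signal in peak level and peak-to-peak swing, not only in spectral shape.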

(Threshold calculation process)
The threshold calculation process executed in the threshold calculation mode is described below with reference to FIG. 15A.
As shown in FIG. 15A, in step S40, the first estimation unit 23 uses the mask α generated by the neural network 94 to extract the knocking sound (extracted signal) of the engine 1 from the spectrogram of the sound pressure signal.
In step S41, the threshold calculation unit 24 sums the absolute values of the spectrogram of the knocking sound (extracted signal) for each cycle of the engine 1.

In step S42, the threshold calculation unit 24 calculates the median of the spectrogram of the knocking sound (extracted signal) over all cycles of the engine 1.
In step S43, the threshold calculation unit 24 adds an arbitrary margin to the median and uses the result as the threshold.
In step S44, the threshold calculation unit 24 writes the calculated threshold to the threshold storage unit 25.
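
Steps S41 to S43 can be sketched as follows. This sketch assumes that the median in step S42 is taken over the per-cycle sums computed in step S41, which is one reading of the text and not stated explicitly. The spectrogram values, the margin, and the function names are hypothetical.

```python
import statistics

def cycle_sum(spectrogram):
    """Sum of absolute values over all time-frequency bins of one cycle (step S41)."""
    return sum(abs(v) for frame in spectrogram for v in frame)

def threshold_from_cycles(spectrograms, margin):
    """Median of the per-cycle sums plus a margin (steps S42 and S43, as assumed here)."""
    sums = [cycle_sum(s) for s in spectrograms]
    return statistics.median(sums) + margin

# Hypothetical extracted-signal spectrograms for three engine cycles.
cycles = [
    [[1 + 0j, 0.5j], [0.5 + 0j, 0j]],   # sum of absolute values: 2.0
    [[3 + 4j, 0j], [0j, 0j]],           # sum of absolute values: 5.0
    [[1 + 0j, 1 + 0j], [1 + 0j, 0j]],   # sum of absolute values: 3.0
]

threshold = threshold_from_cycles(cycles, margin=0.5)
print(threshold)
```

Using the median rather than the mean keeps a few cycles with strong knocking from pulling the threshold upward.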

The signal processing device 10 also has a function of executing the threshold calculation process of FIG. 15B after the separation process of FIG. 17 and the sensory test process of FIG. 18. This threshold calculation process is described below with reference to FIG. 15B.
As shown in FIG. 15B, in step S140, the signal adjustment unit 51 acquires, from the noise component storage unit 26c and the extracted signal storage unit 26d, the signal that was judged unacceptable in the sensory test, and supplies it to the first estimation unit 23.
In step S141, the first estimation unit 23 uses the mask α generated by the neural network 94 to obtain the spectrogram of the knocking sound (extracted signal) from the spectrogram of the signal judged unacceptable in the sensory test, and supplies it to the threshold calculation unit 24.
In step S142, the threshold calculation unit 24 sums the absolute values of the spectrogram of the knocking sound (extracted signal) supplied from the first estimation unit 23.

In step S143, the threshold calculation unit 24 sets an arbitrary value equal to or less than the sum as the threshold.
In step S144, the threshold calculation unit 24 writes the calculated threshold to the threshold storage unit 25.
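
Steps S142 and S143 leave the choice of "an arbitrary value equal to or less than the sum" open. One simple way to make that choice, shown here purely as an assumption, is to scale the sum by a ratio between 0 and 1. The function name and the `ratio` parameter are hypothetical.

```python
def threshold_from_rejected(spectrogram, ratio=0.9):
    """Threshold derived from a signal judged unacceptable in the sensory test.

    The sum of absolute spectrogram values is scaled by `ratio` (an assumed
    parameter) so that the result never exceeds the sum.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    total = sum(abs(v) for frame in spectrogram for v in frame)
    return ratio * total

# Hypothetical extracted-signal spectrogram of a rejected processed sound.
rejected = [[3 + 4j, 0j]]
print(threshold_from_rejected(rejected, ratio=0.8))
```

Keeping the threshold at or below the sum means a sound as strong as the one the inspector rejected reaches the threshold; with a ratio below 1 it exceeds the threshold and is judged as knocking.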

(Determination process)
The determination process executed in the determination mode is described below with reference to FIG. 16.
As shown in FIG. 16, in step S50, the second estimation unit 31 uses the mask α generated by the neural network 94 to extract the knocking sound (extracted signal) of the engine 1 from the spectrogram of the sound pressure signal input from the signal storage unit 13.

In step S51, the threshold determination unit 32 sums the absolute values of the spectrogram of the knocking sound (extracted signal) for each cycle of the engine 1.
In step S52, the threshold determination unit 32 compares the threshold stored in the threshold storage unit 25 with the summed knocking sound (extracted signal) and determines whether the summed knocking sound (extracted signal) exceeds the threshold.
If the summed knocking sound (extracted signal) exceeds the threshold (Yes in step S52), the threshold determination unit 32 determines that knocking is present (step S53).

If the summed knocking sound (extracted signal) is equal to or less than the threshold (No in step S52), the threshold determination unit 32 determines that knocking is absent (step S54).
In step S55, the threshold determination unit 32 outputs the result of the threshold determination and the knocking sound (extracted signal) to the monitor 7.
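
The comparison in steps S51 to S54 can be sketched in a few lines. The function name `detect_knocking` and the sample values are hypothetical; the strict "greater than" comparison follows the text, in which a sum equal to the threshold is judged as no knocking.

```python
def detect_knocking(spectrogram, threshold):
    """Return (knocking_present, summed_value) for one engine cycle.

    Knocking is judged present only when the sum of absolute spectrogram
    values strictly exceeds the threshold (steps S52 to S54).
    """
    total = sum(abs(v) for frame in spectrogram for v in frame)
    return total > threshold, total

# Hypothetical extracted-signal spectrogram of one cycle.
cycle = [[3 + 4j, 0j], [0j, 0j]]

print(detect_knocking(cycle, threshold=4.0))  # sum exceeds the threshold
print(detect_knocking(cycle, threshold=5.0))  # sum equals the threshold
```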

(Separation process)
The separation process executed in the separation mode is described below with reference to FIG. 17.
As shown in FIG. 17, in step S60, the separation unit 40 uses the mask α generated by the neural network 94 to separate the input physical quantity into the noise component and the extracted signal.
In step S61, the learning processing unit 20 stores the separated noise component and extracted signal in the corresponding noise component storage unit 26c and extracted signal storage unit 26d, respectively.

(Sensory test process)
The sensory test process executed in the sensory test mode is described below with reference to FIG. 18.
As shown in FIG. 18, in step S70, the signal adjustment unit 51 of the signal synthesis unit 50 receives level designation information input from the level designation unit 9.
In step S71, the signal adjustment unit 51 acquires the extracted signal from the extracted signal storage unit 26d and the noise component from the noise component storage unit 26c, and changes the level of the extracted signal according to the level designation information to generate the level-changed extracted signal.

In step S72, the signal adjustment unit 51 combines the level-changed extracted signal with the noise component to generate the processed sound.
In step S73, the signal output unit 52 of the signal synthesis unit 50 outputs the processed sound to the sound emitting unit (headphones 8), which emits the processed sound.
In step S74, the signal synthesis unit 50 determines whether an end instruction has been received from the level designation unit 9. If it determines that an end instruction has been received ("Yes"), it outputs the processed sound to the first estimation unit 23, the threshold calculation unit 24 calculates the threshold and writes it to the threshold storage unit 25, and the process of FIG. 18 ends. If it determines that no end instruction has been received ("No"), then in step S75 the signal adjustment unit 51 of the signal synthesis unit 50 accepts a change to the level designation information from the level designation unit 9, and the process from step S71 is repeated.

After the separation process of FIG. 17 and the sensory test process of FIG. 18, the signal processing device 10 performs the threshold calculation process shown in FIG. 15B.

<Main features of signal processing device (estimation device)>
(1) As shown in FIG. 3B, the signal processing device 10 (estimation device) according to the present embodiment includes the learning unit 21 and the separation unit 40. Using the neural network 94, the learning unit 21 learns the weights W of the network that generates the mask α for removing, with the phase taken into account, the noise component (noise 91b) from the input physical quantity (engine vicinity sound 90) that contains it. The learning unit 21 also learns the transfer function H for converting, with the phase taken into account, the extracted signal (extracted knocking sound 91a), which is the input physical quantity (engine vicinity sound 90) with the noise component removed, into the estimated signal (estimated knocking in-cylinder pressure 92) having the same dimension (unit) as the teacher signal (observed knocking in-cylinder pressure 93). The separation unit 40 separates the input physical quantity (engine vicinity sound 90) into the noise component and the extracted signal (extracted knocking sound 91a).

The signal processing device 10 (estimation device) according to the present embodiment can satisfactorily separate the input physical quantity into the noise component and the extracted signal. In particular, it can separate the input into the noise component and a signal suited to level changes in the sensory test. This makes it easier to evaluate whether the target sound (knocking sound) is mixed into the background sound. The signal processing device 10 (estimation device) can therefore evaluate the presence or absence of knocking better than the conventional techniques described in, for example, Patent Document 2 and Patent Document 3. In addition, the signal processing device 10 can carry out a good sensory test, and by determining the threshold from the data judged unacceptable in the sensory test, it can make judgments close to those of an inspector.

(2) As shown in FIG. 3A, when learning the weights W of the network that generates the mask α and the transfer function H, the learning unit 21 of the signal processing device 10 (estimation device) according to the present embodiment takes into account the amplitude and phase components related to the input physical quantity (engine vicinity sound 90).

The signal processing device 10 (estimation device) according to the present embodiment can separate the engine vicinity sound 90 (input physical quantity) into the extracted knocking sound 91a (extracted signal) and the noise 91b (noise component) so that the knocking sound (target sound) does not leak into the noise (sounds other than the knocking sound, that is, the background sound). Such a signal processing device 10 can evaluate the presence or absence of knocking better than the conventional techniques described in Patent Document 2 and Patent Document 3. It can also carry out a good sensory test.

(3) The learning unit 21 of the signal processing device 10 (estimation device) according to the present embodiment learns the network weights W and the transfer function H so that the relationship between the noise component (noise 91b) and the teacher signal (observed knocking in-cylinder pressure 93) becomes weaker while the relationship between the extracted signal (extracted knocking sound 91a) and the teacher signal (observed knocking in-cylinder pressure 93) becomes stronger. Specifically, as shown in FIG. 3A, the learning unit 21 learns the network weights W and the transfer function H so that the coherence (and/or coherent output power) between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform obtained by applying an inverse short-time Fourier transform (ISTFT) to the noise component (noise 91b) becomes small, while the coherence (and/or coherent output power) between the teacher signal and the signal waveform obtained by applying an ISTFT to the extracted signal (extracted knocking sound 91a) becomes large.

The signal processing device 10 (estimation device) according to the present embodiment can thereby exclude sound caused by the knocking in-cylinder pressure from the noise 91b (noise component).
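The coherence criterion in (3) can be sketched as a single score that is low when the noise waveform is unrelated to the teacher signal and the extracted waveform is strongly related to it. The sketch below uses the magnitude-squared coherence from `scipy.signal.coherence`, averaged over frequency; the averaging, the equal weighting of the two terms, and the name `coherence_objective` are assumptions made for illustration. It is an evaluation aid only, since training a network on such a term would need a differentiable implementation.

```python
import numpy as np
from scipy.signal import coherence

def coherence_objective(extracted_wave, noise_wave, teacher_wave, fs, nperseg=256):
    """Lower is better: noise incoherent with, and extracted coherent with, the teacher."""
    _, c_ext = coherence(extracted_wave, teacher_wave, fs=fs, nperseg=nperseg)
    _, c_noise = coherence(noise_wave, teacher_wave, fs=fs, nperseg=nperseg)
    # Assumption: frequency-averaged coherences, weighted equally.
    return float(np.mean(c_noise) - np.mean(c_ext))

rng = np.random.default_rng(0)
fs = 8000
teacher = rng.standard_normal(fs)       # stand-in for the in-cylinder pressure
extracted = np.convolve(teacher, [0.6, 0.3, 0.1], mode="same")  # linearly related
noise = rng.standard_normal(fs)         # independent of the teacher

good = coherence_objective(extracted, noise, teacher, fs)
bad = coherence_objective(noise, extracted, teacher, fs)  # roles swapped
```

A good separation yields a clearly negative score, and swapping the two waveforms flips the sign.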

(4) As shown in FIG. 3A, when learning the network weights W and the transfer function H, the learning unit 21 of the signal processing device 10 (estimation device) according to the present embodiment learns so that the error of the estimated signal (estimated knocking in-cylinder pressure 92) with respect to the teacher signal (observed knocking in-cylinder pressure 93) becomes small. Specifically, the learning unit 21 learns so that the error of the estimated signal (estimated knocking in-cylinder pressure 92) with respect to the spectrogram obtained by applying a short-time Fourier transform (STFT) to the teacher signal (observed knocking in-cylinder pressure 93) becomes small. Alternatively, the learning unit 21 learns so that the error between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform of the estimated signal (estimated knocking in-cylinder pressure 92) becomes small, where that waveform is obtained by applying an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted signal (extracted knocking sound 91a), multiplying the result by the transfer function H, and applying an inverse fast Fourier transform (IFFT).

The signal processing device 10 (estimation device) according to the present embodiment can thus learn the network weights W and the transfer function H so that the error of the estimated signal (estimated knocking in-cylinder pressure 92) with respect to the spectrogram obtained by applying a short-time Fourier transform (STFT) to the teacher signal (observed knocking in-cylinder pressure 93) becomes small. Alternatively, it can learn the network weights W and the transfer function H so that the error between the teacher signal (observed knocking in-cylinder pressure 93) and the signal waveform of the estimated signal (estimated knocking in-cylinder pressure 92), obtained by applying an ISTFT and an FFT to the extracted signal (extracted knocking sound 91a), multiplying by the transfer function H, and applying an IFFT, becomes small.
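The waveform-domain chain in (4), ISTFT, then FFT, then multiplication by H, then IFFT, can be sketched as follows. The sketch assumes H is sampled on the one-sided FFT grid of the full waveform (length n // 2 + 1) and uses a mean squared error; the function names, the STFT parameters, and the pure gain of 2 used as H in the demonstration are all hypothetical.

```python
import numpy as np
from scipy.signal import stft, istft

def estimate_from_extracted(S_extracted, H, fs, n, nperseg=256):
    """ISTFT -> FFT -> multiply by H -> IFFT, returning an n-sample waveform."""
    _, wave = istft(S_extracted, fs=fs, nperseg=nperseg)  # back to the time domain
    wave = wave[:n]                                       # drop any STFT padding
    spectrum = np.fft.rfft(wave, n=n)                     # one FFT over the whole record
    return np.fft.irfft(spectrum * H, n=n)                # apply H, return to time domain

def waveform_error(estimate, teacher):
    """Mean squared error between two waveforms (assumed error measure)."""
    m = min(len(estimate), len(teacher))
    return float(np.mean((estimate[:m] - teacher[:m]) ** 2))

fs = 8000
x = np.sin(2 * np.pi * 440 * np.arange(2048) / fs)  # stand-in extracted waveform
_, _, S = stft(x, fs=fs, nperseg=256)               # its spectrogram
H = np.full(len(x) // 2 + 1, 2.0)                   # hypothetical H: gain of 2, no phase shift
estimate = estimate_from_extracted(S, H, fs, n=len(x))
```

With this trivial H the estimate is twice the input waveform, which makes the chain easy to check before a learned, frequency-dependent H is substituted.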

(5) As shown in FIG. 2, the signal processing device 10 (estimation device) according to the present embodiment includes a signal synthesis unit 50 that changes the level of the extracted signal (extracted knocking sound 91a) and combines it with the noise component separated from the input physical quantity (engine vicinity sound 90) to generate a processed sound.

The signal processing device 10 (estimation device) according to the present embodiment can make the target sound to be inspected (knocking sound) easy to pick out by ear. Such a signal processing device 10 lets the inspector judge the presence or absence of the target sound (knocking sound) with high accuracy, and by determining the threshold from the data judged unacceptable in the sensory test, it can make judgments close to those of an inspector.

(6) As shown in FIG. 2, the signal synthesis unit 50 of the signal processing device 10 (estimation device) according to the present embodiment has a signal adjustment unit 51 that accepts the operator's designation of the level of the extracted signal (extracted knocking sound 91a).

The signal processing device 10 (estimation device) according to the present embodiment can therefore change the level of the extracted signal (extracted knocking sound 91a) freely and in fine steps.
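The level change and synthesis described in (5) and (6) amount to scaling the extracted signal and adding the separated noise back. The sketch below assumes the operator specifies the level as an amplitude gain in decibels; the text only says that a level is designated, so the dB convention and the name `make_processed_sound` are illustrative.

```python
import numpy as np

def make_processed_sound(extracted, noise, level_db):
    """Scale the extracted signal by level_db (amplitude dB) and add the noise."""
    gain = 10.0 ** (level_db / 20.0)   # assumption: level given in amplitude dB
    return gain * np.asarray(extracted, dtype=float) + np.asarray(noise, dtype=float)

extracted = np.array([0.0, 0.1, -0.2, 0.05])
noise = np.array([0.3, -0.1, 0.2, 0.0])

unchanged = make_processed_sound(extracted, noise, 0.0)            # original mixture
boosted = make_processed_sound(extracted, np.zeros(4), 20.0)       # knocking alone, +20 dB
```

At 0 dB the processed sound equals the original mixture, so an inspector can step the level up or down from a known reference.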

(7) As shown in FIGS. 14A and 17, the signal processing device 10 (estimation device) according to the present embodiment can implement the following signal processing method. That is, the signal processing method according to the present embodiment includes a learning step (steps S30 to S32 in FIG. 14A) and a separation step (steps S60 to S61 in FIG. 17). In the learning step, the neural network 94 is used to learn the weights W of the network that generates a mask α for removing, with phase taken into account, the noise component (noise 91b) from the input physical quantity (engine vicinity sound 90) that contains it, and a transfer function H for converting, with phase taken into account, the extracted signal (extracted knocking sound 91a), obtained by removing the noise component from the input physical quantity (engine vicinity sound 90), into an estimated signal (estimated knocking in-cylinder pressure 92) of the same dimension (unit) as the teacher signal (observed knocking in-cylinder pressure 93). In the separation step, the input physical quantity (engine vicinity sound 90) is separated into the noise component and the extracted signal (extracted knocking sound 91a).

The signal processing method according to the present embodiment can satisfactorily separate the input physical quantity into the noise component and the extracted signal. In particular, it can separate the input into the noise component and a signal suited to level changes in the sensory test.

(8) As shown in FIG. 3A, the signal processing device 10 (estimation device) according to the present embodiment is an estimation device that multiplies the extracted signal (extracted knocking sound 91a), obtained by removing the noise component (noise 91b) from the input physical quantity (engine vicinity sound 90), by the transfer function H to estimate an estimated signal (estimated knocking in-cylinder pressure 92) of the same dimension (unit) as the teacher signal (observed knocking in-cylinder pressure 93). The signal processing device 10 (estimation device) includes the learning unit 21 and the separation unit 40 described above. Using the neural network 94, the learning unit 21 learns the transfer function H and the weights W of the network that generates the mask α for removing the noise component from the input physical quantity (engine vicinity sound 90) and extracting the extracted signal (extracted knocking sound 91a). The extracted signal estimation unit (second estimation unit 31) uses the mask α to obtain the extracted signal (extracted knocking sound 91a) from which the noise component has been removed. When learning the network weights W and the transfer function H, the learning unit 21 takes into account, for the mask α and the transfer function H, the amplitude and phase components related to the input physical quantity (engine vicinity sound 90).

When learning the network weights W and the transfer function H, the signal processing device 10 (estimation device) according to the present embodiment can thus take into account, for the mask α and the transfer function H, the amplitude and phase components related to the input physical quantity (engine vicinity sound 90).

(9) The signal processing device 10 (estimation device) according to the present embodiment can implement the following estimation method. That is, the estimation method according to the present embodiment multiplies the extracted signal (extracted knocking sound 91a), obtained by removing the noise component (noise 91b) from the input physical quantity (engine vicinity sound 90), by the transfer function H to estimate an estimated signal (estimated knocking in-cylinder pressure 92) of the same dimension (unit) as the teacher signal (observed knocking in-cylinder pressure 93). As shown in FIG. 14B or FIG. 14C, the estimation method includes a learning step (step S103) and an estimated signal estimation step (step S102). In the learning step (step S103), the neural network 94 is used to learn the transfer function H and the weights W of the network that generates the mask α for removing the noise component from the input physical quantity (engine vicinity sound 90) and extracting the extracted signal (extracted knocking sound 91a). In the estimated signal estimation step (step S102), the neural network 94 is used to obtain, with the mask α, the extracted signal (extracted knocking sound 91a) from which the noise component has been removed, and the extracted signal is converted into the estimated signal by multiplying it by the transfer function H. In the learning step (step S103), when the network weights W and the transfer function H are learned, the amplitude and phase components related to the input physical quantity (engine vicinity sound 90) are taken into account for the mask α and the transfer function H.

When learning the network weights W and the transfer function H, the estimation method according to the present embodiment can thus take into account, for the mask α and the transfer function H, the amplitude and phase components related to the input physical quantity (engine vicinity sound 90).

As described above, the signal processing device 10 (estimation device) according to the first embodiment can satisfactorily separate the input physical quantity into the noise component and the extracted signal. In particular, it can separate the input into the noise component and a signal suited to level changes in the sensory test. This makes it easier to evaluate whether the target sound (knocking sound) is mixed into the background sound. The signal processing device 10 (estimation device) can therefore improve the evaluation of the presence or absence of knocking. It can also carry out a good sensory test.

[Second Embodiment]
The configuration of the signal processing device 10A (estimation device) according to the second embodiment is described below with reference to FIG. 19. FIG. 19 is a block diagram showing the configuration of the signal processing device 10A (estimation device) according to the second embodiment.

As shown in FIG. 19, the signal processing device 10A (estimation device) according to the second embodiment differs from the signal processing device 10 (estimation device) according to the first embodiment (see FIG. 2) in that the signal synthesis unit 50 has the following function: it multiplies the teacher signal (observed knocking in-cylinder pressure 93) by the reciprocal of the transfer function H and combines the result with the noise component separated from the input physical quantity (engine vicinity sound 90) to generate a processed sound.
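Multiplying by the reciprocal of H maps the pressure-dimension teacher signal back to a sound-dimension signal. A minimal frequency-domain sketch is given below. It assumes H is sampled on the one-sided FFT grid of the signal, and it adds a small regularization term `eps` to avoid division by near-zero bins; that term, the function names, and the two-tap filter used as H in the demonstration are assumptions, not details from the text.

```python
import numpy as np

def pressure_to_sound(teacher, H, eps=1e-10):
    """Apply a regularized 1/H to the teacher signal in the frequency domain."""
    H = np.asarray(H, dtype=complex)
    spectrum = np.fft.rfft(teacher)
    inv_H = np.conj(H) / (np.abs(H) ** 2 + eps)   # equals 1/H when eps is negligible
    return np.fft.irfft(spectrum * inv_H, n=len(teacher))

def synthesize_processed_sound(teacher, H, noise):
    """Combine the inverse-filtered teacher signal with the separated noise."""
    return pressure_to_sound(teacher, H) + np.asarray(noise, dtype=float)

rng = np.random.default_rng(1)
n = 512
sound = rng.standard_normal(n)                    # stand-in knocking sound
H = np.fft.rfft([1.0, 0.5], n=n)                  # hypothetical H from a 2-tap filter
pressure = np.fft.irfft(np.fft.rfft(sound) * H, n=n)  # forward path: sound -> pressure
recovered = pressure_to_sound(pressure, H)        # inverse path: pressure -> sound
processed = synthesize_processed_sound(pressure, H, np.zeros(n))
```

The round trip recovers the original sound, which is the property the second embodiment relies on when it substitutes a measured or arbitrary pressure signal.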

Like the signal processing device 10 according to the first embodiment, the signal processing device 10A (estimation device) according to the second embodiment lets the inspector judge the presence or absence of the target sound (knocking sound) with high accuracy and can thereby improve inspection performance.

The signal synthesis unit 50 may also have a function of multiplying a level-changed teacher signal, obtained by changing the level (magnitude) of the teacher signal (observed knocking in-cylinder pressure 93), by the reciprocal of the transfer function H and combining the result with the noise component separated from the input physical quantity (engine vicinity sound 90) to generate a processed sound.

The signal synthesis unit 50 may also have a function of multiplying, instead of the teacher signal (observed knocking in-cylinder pressure 93), any signal of the same dimension as the teacher signal (for example, an arbitrary knocking in-cylinder pressure signal) by the reciprocal of the transfer function H and combining the result with the noise component separated from the input physical quantity (engine vicinity sound 90) to generate a processed sound.

The signal synthesis unit 50 may also multiply the teacher signal (observed knocking in-cylinder pressure 93) by the reciprocal of a modified transfer function Hc, obtained by changing the values of the transfer function H, and combine the result with the noise component separated from the input physical quantity (engine vicinity sound 90) to generate a processed sound.

The signal synthesis unit 50 may also have a function of multiplying a level-changed teacher signal, obtained by changing the level (magnitude) of the teacher signal (observed knocking in-cylinder pressure 93), by the reciprocal of the modified transfer function Hc and combining the result with the noise component separated from the input physical quantity (engine vicinity sound 90) to generate a processed sound.

Here, the modified transfer function Hc is, for example, the transfer function H with its amplitude and/or phase increased or decreased in a certain frequency band.
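One way to build such a modified transfer function Hc is to multiply the bins of H that fall inside the chosen band by a gain and a phase rotation. The sketch below assumes H is an array sampled at the frequencies in `freqs`; the function name, the hard band edges, and the parameter names are hypothetical.

```python
import numpy as np

def modify_transfer_function(H, freqs, f_lo, f_hi, gain=1.0, phase_shift=0.0):
    """Return Hc: H with amplitude scaled and phase shifted for f_lo <= f <= f_hi."""
    H = np.asarray(H, dtype=complex)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    Hc = H.copy()
    # gain changes the amplitude; exp(1j * phase_shift) rotates the phase (radians).
    Hc[band] = H[band] * gain * np.exp(1j * phase_shift)
    return Hc

freqs = np.fft.rfftfreq(16, d=1 / 1600)   # 0, 100, ..., 800 Hz
H = np.ones(len(freqs), dtype=complex)    # flat stand-in for a learned H
Hc = modify_transfer_function(H, freqs, 150.0, 450.0, gain=2.0, phase_shift=np.pi / 2)
```

Bins outside the band are left untouched, so the change in the processed sound can be attributed to the selected frequency range.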

The signal synthesis unit 50 may also have a function of multiplying the extracted signal by a transfer function He (not shown) from the measurement target to the listener's ear position, measured in an acoustic excitation experiment, and combining the result with the noise component at the listener's ear position measured while the actual vehicle is running, to generate a processed sound.

The signal synthesis unit 50 may also have a function of obtaining an estimated knocking sound by multiplying the teacher signal (observed knocking in-cylinder pressure 93) by the reciprocal of the transfer function H estimated by the neural network 94, then multiplying it by a transfer function He (not shown) from the measurement target to the ear position of the listener (for a vehicle, an occupant, in particular the driver) obtained by acoustic excitation or simulation, and combining the result with the noise component at the listener's ear position measured while the actual vehicle is running, to generate a processed sound. Such a signal synthesis unit 50 can, for example, combine the knocking sound at the listener's ear position, derived from the estimated knocking sound, with the noise component observed at the listener's ear position during actual driving (actual running sound), and thereby simulate the knocking sound heard during actual driving.

The present invention is not limited to the embodiments described above, and various changes and modifications can be made without departing from the gist of the present invention.

For example, the embodiments above are described in detail in order to explain the gist of the present invention clearly. The present invention is therefore not necessarily limited to configurations that include all of the components described. In the present invention, other components can be added to a given component, some components can be replaced with other components, and some components can be omitted.

For example, as described below, the signal processing device 10 (estimation device) can be applied to any article as the test target. The input physical quantity and the teacher signal can likewise be changed to any signals.

(1) For example, in the environment shown in FIG. 20, the signal processing device 10 can be modified and used as in the first modification shown in FIG. 21. FIG. 20 is an explanatory diagram of the first modification. FIG. 21 is an explanatory diagram of learning the weights of the mask-generating network and the transfer function in the first modification.

FIG. 20 shows an example of evaluating, in the cabin 3a of the vehicle 3, the vibration sound of the instrument panel 3b detected at the ear position of the listener (a vehicle occupant, in particular the driver). In the example shown in FIG. 20, the listener ear-position running sound 190 (the cabin sound heard at the listener's ear position) is the input physical quantity, and the observed instrument panel vibration acceleration 193 is the teacher signal. As shown in FIG. 21, the learning unit 21 of the signal processing device 10 uses the mask α to separate the input physical quantity (listener ear-position running sound 190) into the extracted signal (extracted instrument panel vibration sound 191a) and the noise component (noise 191b). The mask α and the transfer function H include the amplitude and phase components related to the input physical quantity.

As shown in FIG. 21, the learning unit 21 of the signal processing device 10 learns the network weights W and the transfer function H so that the coherence between the teacher signal (observed instrument panel vibration acceleration 193) and the signal waveform obtained by applying an inverse short-time Fourier transform (ISTFT) to the noise component (noise 191b) becomes small, while the coherence between the teacher signal and the signal waveform obtained by applying an ISTFT to the extracted signal (extracted instrument panel vibration sound 191a) becomes large. As a result, the signal processing device 10 can separate the input physical quantity (listener ear-position running sound 190) into the extracted signal (extracted instrument panel vibration sound 191a) and the noise component (noise 191b) so that the target sound (sound caused by vibration of the instrument panel 3b) does not leak into the noise (sound not caused by vibration of the instrument panel 3b, that is, the background sound).

As shown in FIG. 21, the learning unit 21 of the signal processing device 10 converts the extracted signal (extracted instrument panel vibration sound 191a) into the estimated signal (estimated instrument panel vibration acceleration 192) by applying an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted signal, multiplying the result by the transfer function H, and then applying an inverse fast Fourier transform (IFFT) and a short-time Fourier transform (STFT). The learning unit 21 then uses the neural network 94 to learn the weights W of the network that generates the mask α and the transfer function H so that the error between the estimated signal (estimated instrument panel vibration acceleration 192) and the spectrogram obtained by applying an STFT to the teacher signal (observed instrument panel vibration acceleration 193) is minimized.
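The spectrogram-domain error used here can be sketched as the mean squared difference between the STFT of the estimated signal and the STFT of the teacher signal. Comparing complex STFT values, so that phase differences count as error, is an assumption consistent with the emphasis on phase but not spelled out in this passage; the function name and the STFT parameters are hypothetical.

```python
import numpy as np
from scipy.signal import stft

def spectrogram_error(estimate, teacher, fs, nperseg=256):
    """Mean squared difference between the complex STFTs of two waveforms."""
    _, _, E = stft(estimate, fs=fs, nperseg=nperseg)
    _, _, T = stft(teacher, fs=fs, nperseg=nperseg)
    # Assumption: complex difference, so amplitude and phase errors both contribute.
    return float(np.mean(np.abs(E - T) ** 2))

fs = 8000
t = np.arange(4096) / fs
teacher = np.sin(2 * np.pi * 300 * t)                 # stand-in observed acceleration

err_same = spectrogram_error(teacher, teacher, fs)         # perfect estimate
err_half = spectrogram_error(0.5 * teacher, teacher, fs)   # estimate at half amplitude
err_zero = spectrogram_error(np.zeros_like(teacher), teacher, fs)
```

Because the STFT is linear, an estimate at half the amplitude gives one quarter of the error of an all-zero estimate, which is a quick sanity check on the measure.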

Such a signal processing device 10 can, for example during a sensory test, make it easier for the inspector to judge how loud the sound caused by vibration of the instrument panel 3b is within the listener ear-position running sound 190 (the cabin sound heard at the listener's ear position). For example, when the sound caused by vibration of the instrument panel 3b is relatively loud, the signal processing device 10 can generate a processed sound with the signal adjustment unit 51 of the signal synthesis unit 50 and, through subjective evaluation by the inspector's hearing, set a target value for reducing the vibration of the instrument panel 3b.

(2) In the environment shown in FIG. 22, the signal processing device 10 can also be modified and used as in the second modification shown in FIG. 23. FIG. 22 is an explanatory diagram of the second modification. FIG. 23 is an explanatory diagram of learning the weights of the mask-generating network and the transfer function in the second modification.

FIG. 22 shows an example of evaluating the vibration sound of the compressor in the vicinity of the refrigerator 203. In the example shown in FIG. 22, the refrigerator sound 290 heard at the listener's ear position is the input physical quantity, and the observed compressor vibration acceleration 293 is the teacher signal. As shown in FIG. 23, the learning unit 21 of the signal processing device 10 uses the mask α to separate the input physical quantity (refrigerator sound 290) into the extracted signal (extracted compressor sound 291a) and the noise component (noise 291b). The mask α and the transfer function H include the amplitude and phase components related to the input physical quantity.

As shown in FIG. 23, the learning unit 21 of the signal processing device 10 learns the network weights W and the transfer function H so that the coherence between the teacher signal (observed compressor vibration acceleration 293) and the signal waveform obtained by applying an inverse short-time Fourier transform (ISTFT) to the noise component (noise 291b) becomes small, while the coherence between the teacher signal (observed compressor vibration acceleration 293) and the signal waveform obtained by applying an ISTFT to the extracted signal (extracted compressor sound 291a) becomes large. The signal processing device 10 can thereby separate the input physical quantity (refrigerator sound 290) into the extracted signal (extracted compressor sound 291a) and the noise component (noise 291b) in such a way that the target sound (sound caused by the vibration of the compressor) does not leak into the noise (background sound not caused by the vibration of the compressor).
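The coherence criterion described above can be illustrated with a minimal sketch. It assumes magnitude-squared coherence averaged over non-overlapping segments, a naive DFT written out so that the example needs only the standard library (an FFT routine would normally be used), synthetic random data in place of measured signals, and a hypothetical scalar objective `separation_loss`; none of these details are specified in the source.

```python
import cmath
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short segments."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def coherence(x, y, seg_len):
    """Magnitude-squared coherence per frequency bin, averaged over
    non-overlapping segments (values lie between 0 and 1)."""
    n_seg = min(len(x), len(y)) // seg_len
    n_bins = seg_len // 2 + 1
    pxx = [0.0] * n_bins
    pyy = [0.0] * n_bins
    pxy = [0j] * n_bins
    for s in range(n_seg):
        xs = dft(x[s * seg_len:(s + 1) * seg_len])
        ys = dft(y[s * seg_len:(s + 1) * seg_len])
        for k in range(n_bins):
            pxx[k] += abs(xs[k]) ** 2
            pyy[k] += abs(ys[k]) ** 2
            pxy[k] += xs[k].conjugate() * ys[k]
    eps = 1e-12  # guards against division by zero in empty bins
    return [abs(pxy[k]) ** 2 / (pxx[k] * pyy[k] + eps) for k in range(n_bins)]


def mean(values):
    return sum(values) / len(values)


# Synthetic stand-ins (assumed data, not measurements from the source).
rng = random.Random(0)
teacher = [rng.gauss(0.0, 1.0) for _ in range(512)]
extracted = [0.5 * t + rng.gauss(0.0, 0.05) for t in teacher]
noise = [rng.gauss(0.0, 1.0) for _ in range(512)]

coh_extracted = mean(coherence(extracted, teacher, seg_len=32))
coh_noise = mean(coherence(noise, teacher, seg_len=32))

# Hypothetical objective: small when the noise is unrelated to the teacher
# and the extracted signal is strongly related to it.
separation_loss = coh_noise - coh_extracted
print(coh_extracted, coh_noise, separation_loss)
```

In this sketch a well-separated pair gives a coherence close to 1 for the extracted signal and close to 0 for the noise, so the hypothetical objective is negative; how the device actually combines the two coherence terms during learning is not stated in this passage.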

As shown in FIG. 23, the learning unit 21 of the signal processing device 10 also converts the extracted signal (extracted compressor sound 291a) into an estimated signal (estimated compressor vibration acceleration 292) by applying an inverse short-time Fourier transform (ISTFT) and a fast Fourier transform (FFT) to the extracted signal, multiplying the result by the transfer function H, and then applying an inverse fast Fourier transform (IFFT) and a short-time Fourier transform (STFT). The learning unit 21 then uses the neural network 94 to learn the weights W of the network that generates the mask α and the transfer function H so that the error between the estimated signal (estimated compressor vibration acceleration 292) and the spectrogram obtained by applying an STFT to the teacher signal (observed compressor vibration acceleration 293) is minimized.
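The conversion and the spectrogram error described above can be sketched as follows. This is a minimal illustration under stated assumptions: it starts from an already reconstructed waveform (the ISTFT step is omitted), uses a naive DFT in place of an FFT so that only the standard library is needed, uses a Hann-windowed magnitude spectrogram, and takes a mean squared difference as the error. The helper names (`apply_transfer`, `spectrogram_error`) and the toy data are hypothetical and do not appear in the source.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def apply_transfer(x, h_half):
    """Multiply the spectrum of the real waveform x by H, given on bins
    0..n//2, and return the real output waveform."""
    n = len(x)
    spec = dft(x)
    out = []
    for k in range(n):
        # Mirror H with conjugate symmetry so that the output stays real.
        h = h_half[k] if k <= n // 2 else h_half[n - k].conjugate()
        out.append(spec[k] * h)
    return [v.real for v in idft(out)]


def spectrogram(x, frame, hop):
    """Magnitude STFT with a Hann window: one list of bin magnitudes per frame."""
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame) for i in range(frame)]
    frames = []
    for start in range(0, len(x) - frame + 1, hop):
        seg = [x[start + i] * win[i] for i in range(frame)]
        frames.append([abs(v) for v in dft(seg)[:frame // 2 + 1]])
    return frames


def spectrogram_error(a, b, frame=16, hop=8):
    """Mean squared difference between two magnitude spectrograms."""
    sa, sb = spectrogram(a, frame, hop), spectrogram(b, frame, hop)
    total = sum((p - q) ** 2 for fa, fb in zip(sa, sb) for p, q in zip(fa, fb))
    return total / (len(sa) * len(sa[0]))


# Toy data (assumed): the teacher is the extracted waveform scaled by 2 and
# circularly delayed by 3 samples.
n, gain, delay = 64, 2.0, 3
extracted = [math.sin(2 * math.pi * 5 * t / n) + 0.5 * math.sin(2 * math.pi * 11 * t / n)
             for t in range(n)]
teacher = [gain * extracted[(t - delay) % n] for t in range(n)]

# An H with constant gain and linear phase reproduces that scaling and delay.
h_fit = [gain * cmath.exp(-2j * cmath.pi * k * delay / n) for k in range(n // 2 + 1)]
h_identity = [1 + 0j] * (n // 2 + 1)

estimated = apply_transfer(extracted, h_fit)
err_fit = spectrogram_error(estimated, teacher)
err_identity = spectrogram_error(apply_transfer(extracted, h_identity), teacher)
print(err_fit, err_identity)
```

With the matching H the error is numerically zero, whereas the identity H leaves a clearly larger error. This is the sense in which a transfer function carrying both gain and phase can map the extracted signal onto a signal of the same dimension as the teacher signal.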

Such a signal processing device 10 makes it easier for an inspector, for example during a sensory test, to grasp how loud the sound caused by the vibration of the compressor is within the refrigerator sound 290 heard at the listener's ear position. For example, when the sound caused by the vibration of the compressor is relatively loud, the signal processing device 10 generates a processed sound with the signal adjustment unit 51 of the signal synthesis unit 50, and the inspector evaluates it subjectively by ear, so that a target value for reducing the vibration of the compressor can be set.

(3) The signal processing device 10 can also be modified as in the third modification shown in FIG. 24. FIG. 24 is an explanatory diagram of the learning of the weights of the mask-generating network and of the transfer function in the third modification.

The example shown in FIG. 24 concerns evaluating, in the passenger compartment 3a of the vehicle 3, the combustion sound contained in the radiated sound of the engine 1 (FIG. 1) as detected at the ear position of a listener (an occupant of the vehicle, in particular the driver). In this example, the in-cylinder pressure 390 of the engine 1 (FIG. 1) is observed as the teacher signal, the radiated sound 390a and the vibration 390b are observed as input physical quantities, and a mask and a transfer function are learned for each input physical quantity.

The signal processing device 10 uses the mask α1 to obtain the combustion sound 391aa (extracted signal) from the radiated sound 390a (input physical quantity), and also obtains the noise 391ab (noise component). The noise 391ab (noise component) is obtained by removing the combustion sound 391aa (extracted signal) from the radiated sound 390a (input physical quantity).
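The separation step described above can be sketched per time-frequency bin. This minimal illustration assumes that the mask is a complex number in each bin, whose magnitude is the share of the target signal and whose angle is the phase correction, and that it is applied by multiplication; the function name `separate` and the toy values are hypothetical.

```python
import cmath


def separate(spec, mask):
    """Apply a complex mask bin by bin and return (extracted, noise).

    The noise is what remains after the extracted signal is removed from
    the input, so extracted + noise reproduces the input in every bin.
    """
    extracted = [[a * x for a, x in zip(mask_row, spec_row)]
                 for mask_row, spec_row in zip(mask, spec)]
    noise = [[x - e for x, e in zip(spec_row, ext_row)]
             for spec_row, ext_row in zip(spec, extracted)]
    return extracted, noise


# Toy spectrogram with 2 frames and 2 bins (assumed values).
radiated = [[1 + 1j, 2 - 1j], [0.5j, -1 + 0j]]
# Mask entries: magnitude = assumed share of the target, angle = phase correction.
mask = [[cmath.rect(0.8, 0.1), cmath.rect(0.2, -0.3)],
        [cmath.rect(0.5, 0.0), cmath.rect(1.0, 0.2)]]

combustion, noise = separate(radiated, mask)
print(combustion[0][0], noise[0][0])
```

Because the noise is defined as the remainder, the two outputs always sum back to the input; the quality of the separation depends entirely on the learned mask.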

The learning unit 21 of the signal processing device 10 uses the neural network 94 to learn the weights W1 (not shown) of the network that generates the mask α1 and the transfer function H1 so that the coherence between the noise component (noise 391ab) and the teacher signal (in-cylinder pressure 390) becomes small, the coherence between the extracted signal (combustion sound 391aa) and the teacher signal (in-cylinder pressure 390) becomes large, and the error between the estimated signal (estimated in-cylinder pressure 392a) and the spectrogram obtained by applying a short-time Fourier transform (STFT) to the teacher signal (in-cylinder pressure 390) is minimized. The mask α1 represents the proportion of combustion sound contained in the radiated sound 390a (input physical quantity) together with a phase component (phase correction amount). The transfer function H1 consists of the amplitude (gain) and phase components for converting the combustion sound 391aa (extracted signal) into the estimated in-cylinder pressure 392a (estimated signal).

The signal processing device 10 also uses the mask α2 to obtain the combustion vibration 391ba (extracted signal) from the vibration 390b (input physical quantity), and obtains the noise vibration 391bb (noise component). The noise vibration 391bb (noise component) is obtained by removing the combustion vibration 391ba (extracted signal) from the vibration 390b (input physical quantity).

The learning unit 21 of the signal processing device 10 uses the neural network 94 to learn the weights W2 (not shown) of the network that generates the mask α2 and the transfer function H2 so that the coherence between the noise component (noise vibration 391bb) and the teacher signal (in-cylinder pressure 390) becomes small, the coherence between the extracted signal (combustion vibration 391ba) and the teacher signal (in-cylinder pressure 390) becomes large, and the error between the estimated signal (estimated in-cylinder pressure 392b) and the spectrogram obtained by applying a short-time Fourier transform (STFT) to the teacher signal (in-cylinder pressure 390) is minimized. The mask α2 represents the proportion of combustion vibration contained in the vibration 390b (input physical quantity) together with a phase component (phase correction amount). The transfer function H2 consists of the amplitude (gain) and phase components for converting the combustion vibration 391ba (extracted signal) into the estimated in-cylinder pressure 392b (estimated signal).

Furthermore, the signal processing device 10 multiplies the combustion sound 391aa (extracted signal) by a transfer function H1a between the sound in the engine compartment and the sound at the listener's ear position, obtained by numerical simulation or an excitation experiment, to generate an estimated air-borne sound 393a (estimated signal). The signal processing device 10 likewise multiplies the combustion vibration 391ba (extracted signal) by a transfer function H2b between the engine mount vibration and the sound at the listener's ear position, obtained by numerical simulation or an excitation experiment, to generate an estimated structure-borne sound 393b (estimated signal). The signal processing device 10 then combines the estimated air-borne sound 393a (estimated signal) and the estimated structure-borne sound 393b (estimated signal) to generate the estimated listener ear position sound 393 (combined estimated signal).
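The combination described above can be sketched in the frequency domain. This minimal illustration assumes that each transfer function is a complex value per frequency bin, that multiplication is done bin by bin, and that the two estimated sounds are combined by simple addition; the function name and the toy values are hypothetical.

```python
def synthesize_ear_sound(sound_spec, vib_spec, h1a, h2b):
    """Return (air_borne, structure_borne, combined) spectra at the ear position."""
    air_borne = [h * s for h, s in zip(h1a, sound_spec)]
    structure_borne = [h * v for h, v in zip(h2b, vib_spec)]
    combined = [a + b for a, b in zip(air_borne, structure_borne)]
    return air_borne, structure_borne, combined


# Toy two-bin spectra and transfer functions (assumed values).
combustion_sound = [1 + 0j, 2j]
combustion_vibration = [0.5 + 0j, 1 + 0j]
h1a = [0.5 + 0j, 0.5 + 0j]  # engine compartment sound -> ear position sound
h2b = [2 + 0j, -1j]         # engine mount vibration -> ear position sound

air, structure, ear = synthesize_ear_sound(
    combustion_sound, combustion_vibration, h1a, h2b)
print(ear)
```

In the second bin of this toy example the two paths have equal magnitude and opposite phase and cancel, which shows why the phase of the transfer functions matters when the two contributions are combined.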

Such a signal processing device 10 can simulate, for example during a sensory test, the combustion sound detected at the ear position of a listener (an occupant of the vehicle, in particular the driver).

1 ... Engine
2 ... Engine ECU
3 ... Vehicle
3a ... Passenger compartment
3b ... Instrument panel
4 ... Sound pressure sensor
5 ... In-cylinder pressure sensor
6 ... Data collection device
7 ... Monitor
8 ... Headphones (sound emitting unit)
9 ... Level designation unit
10, 10A ... Signal processing device (estimation device)
11 ... Signal cutout unit
12 ... Spectrogram calculation unit
13 ... Signal storage unit
14 ... Switch
20 ... Learning processing unit
21 ... Learning unit
22 ... Learned parameter storage unit
23 ... First estimation unit
24 ... Threshold calculation unit
25 ... Threshold storage unit
26a ... Teacher signal storage unit
26b ... Estimated signal storage unit
26c ... Noise component storage unit
26d ... Extracted signal storage unit
30 ... Determination processing unit
31 ... Second estimation unit (extracted signal estimation unit)
32 ... Threshold determination unit
40 ... Separation unit
50 ... Signal synthesis unit
51 ... Signal adjustment unit
52 ... Signal output unit
81 ... Knocking vibration
82 ... Knocking sound
83 ... Mechanical noise (noise component)
90 ... Sound near the engine (input physical quantity)
91a ... Extracted knocking sound (extracted signal)
91aa ... Level-changed knocking sound (level-changed extracted signal)
91b ... Noise (noise component)
91c ... Processed sound
92 ... Estimated knocking in-cylinder pressure (estimated signal)
93 ... Observed knocking in-cylinder pressure (teacher signal)
94 ... Neural network
94A ... Mask generation network
95 ... U-Net
96 ... Downward path (Encoder)
97 ... Hierarchy level
98 ... Upward path (Decoder)
99 ... Output
100 ... Signal processing system
190 ... Listener's ear position running sound (input physical quantity)
191a ... Extracted instrument panel vibration sound (extracted signal)
191b ... Noise (noise component)
192 ... Estimated instrument panel vibration acceleration (estimated signal)
193 ... Observed instrument panel vibration acceleration (teacher signal)
203 ... Refrigerator
290 ... Refrigerator sound (input physical quantity)
291a ... Extracted compressor sound (extracted signal)
291b ... Noise (noise component)
292 ... Estimated compressor vibration acceleration (estimated signal)
293 ... Observed compressor vibration acceleration (teacher signal)
390 ... In-cylinder pressure (teacher signal)
390a ... Radiated sound (input physical quantity)
390b ... Vibration (input physical quantity)
391aa ... Combustion sound (extracted signal)
391ab ... Noise (noise component)
391ba ... Combustion vibration (extracted signal)
391bb ... Noise vibration (noise component)
392a, 392b ... Estimated in-cylinder pressure (estimated signal)
393 ... Estimated listener ear position sound (combined estimated signal)
393a ... Estimated air-borne sound (estimated signal)
393b ... Estimated structure-borne sound (estimated signal)
α, α1, α2 ... Mask
F11, F21, F31, F37 ... Spectrogram
F12, F22, F32, F36, F93 ... Signal waveform
F13, F23 ... Coherence
F33, F35 ... Spectrum
F34 ... Frequency response characteristic
H, H1, H1a, H2, H2b, He ... Transfer function
Hc ... Modified transfer function
M1 ... Connection for learning mode
M2 ... Connection for threshold calculation mode
M3 ... Connection for determination mode
M4 ... Connection for separation mode
M5 ... Connection for sensory test mode
W ... Network weights
T ... Threshold

Claims

A signal processing device comprising: a learning unit that uses a neural network to learn the weights of a network that generates a mask for removing a noise component from an input physical quantity containing the noise component, and to learn a transfer function for converting an extracted signal, obtained by removing the noise component from the input physical quantity, into an estimated signal having the same dimension as a teacher signal; and
a separation unit that separates the input physical quantity into the noise component and the extracted signal,
wherein the learning unit learns the weights of the network that generates the mask taking into account a phase component related to the input physical quantity, and learns the transfer function taking into account the phase component related to the input physical quantity.

The signal processing device according to claim 1,
wherein, when learning the weights of the network that generates the mask and the transfer function, the learning unit learns the weights of the network that generates the mask taking into account, in addition to the phase component, an amplitude related to the input physical quantity, and learns the transfer function taking into account, in addition to the phase component, the amplitude related to the input physical quantity.

The signal processing device according to claim 2,
wherein the learning unit learns the weights of the network that generates the mask and the transfer function so that the relationship between the noise component and the teacher signal becomes weaker and the relationship between the extracted signal and the teacher signal becomes stronger.

The signal processing device according to claim 3,
wherein, when learning the weights of the network that generates the mask and the transfer function, the learning unit further learns so that the error of the estimated signal with respect to the teacher signal becomes small.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit changes the magnitude of the extracted signal and combines it with the noise component separated from the input physical quantity to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying the teacher signal by the reciprocal of the transfer function and combining the result with the noise component separated from the input physical quantity to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying a changed teacher signal, whose magnitude has been changed, by the reciprocal of the transfer function and combining the result with the noise component separated from the input physical quantity to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying an arbitrary signal having the same dimension as the teacher signal by the reciprocal of the transfer function and combining the result with the noise component separated from the input physical quantity to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying the teacher signal by the reciprocal of a modified transfer function, obtained by changing the value of the transfer function, and combining the result with the noise component separated from the input physical quantity to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying a changed teacher signal, whose magnitude has been changed, by the reciprocal of a modified transfer function, obtained by changing the value of the transfer function, and combining the result with the noise component separated from the input physical quantity to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying the extracted signal by a transfer function from the measurement target to the listener's ear position and combining the result with a noise component at the listener's ear position to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit has a function of multiplying the teacher signal by the reciprocal of the transfer function, further multiplying the result by a transfer function from the measurement target to the listener's ear position, and combining the result with a noise component at the listener's ear position to generate the processed sound.

The signal processing device according to any one of claims 1 to 4,
further comprising a signal synthesis unit that synthesizes signals to generate a processed sound,
wherein the signal synthesis unit includes a signal adjustment unit that receives, from an operator, a designation of the magnitude of the extracted signal.

A signal processing method comprising: a learning step of using a neural network to learn the weights of a network that generates a mask for removing a noise component from an input physical quantity containing the noise component, and to learn a transfer function for converting an extracted signal, obtained by removing the noise component from the input physical quantity, into an estimated signal having the same dimension as a teacher signal; and
a separation step of separating the input physical quantity into the noise component and the extracted signal,
wherein, in the learning step, the weights of the network that generates the mask are learned taking into account a phase component related to the input physical quantity, and the transfer function is learned taking into account the phase component related to the input physical quantity.