WO2020230184A1 - Signal switching device, signal switching method, and recording medium - Google Patents

Signal switching device, signal switching method, and recording medium

Info

Publication number
WO2020230184A1
Authority
WO
WIPO (PCT)
Prior art keywords
likelihood
signal
unit
model
input
Prior art date
Application number
PCT/JP2019/018697
Other languages
French (fr)
Japanese (ja)
Inventor
玲史 近藤
裕子 太田
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to JP2021519030A (JPWO2020230184A1)
Priority to PCT/JP2019/018697 (WO2020230184A1)
Publication of WO2020230184A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Definitions

  • the present invention relates to a signal switching device, a signal switching method, and a recording medium.
  • Patent Document 1 discloses a technique of generating learning data using the feature amount of the content and specifying the type of the scene of the content by clustering the above learning data.
  • Patent Document 2 discloses a technique that generates a model from learning information including a set of time-series acoustic signal sequences and acoustic event information representing the acoustic events corresponding to those sequences, so that new data can be estimated accurately without the model overfitting the data used to compute it.
  • Patent Document 3 discloses a technique that extracts the feature amount of a content's image together with the feature amount of the text describing that image, and annotates the content using a model generated from them.
  • JP-A-2018-141854; Japanese Unexamined Patent Publication No. 2014-048522; Japanese Unexamined Patent Publication No. 2012-038240
  • the signal related to the partially silent content or the signal related to the content on which noise is superimposed is changed to the signal related to the other content.
  • An object of the present invention is to provide a signal switching device, a signal switching method, and a recording medium that solve the above-mentioned problems.
  • The signal switching device includes: a model storage unit that stores a trained model that has learned the feature amount of the signal related to the content; a likelihood specifying unit that uses the trained model to specify the likelihood of the input signals input from a plurality of channels with respect to the signal related to the content; a selection unit that uses the likelihood to select an output signal from among the input signals input from the plurality of channels; and a signal output unit that outputs the selected output signal.
  • The signal switching method includes: a step of storing a trained model that has learned the feature amount of the signal related to the content; a step of specifying, using the trained model, the likelihood of the input signals input from a plurality of channels with respect to the signal related to the content; a step of selecting an output signal from among the input signals using the likelihood; and a step of outputting the selected output signal.
  • The program stored in the recording medium causes a computer to function as: a model storage unit that stores a trained model that has learned the feature amount of the signal related to the content; a likelihood specifying unit that uses the trained model to specify the likelihood of the input signals input from a plurality of channels with respect to the signal related to the content; a selection unit that uses the likelihood to select an output signal from among the input signals input from the plurality of channels; and a signal output unit that outputs the selected output signal.
  • The signal switching device uses a model generated from the feature amount of the signal related to the content to output the likelihood of the input signal with respect to that signal, and switches the input signal accordingly. This reduces the manual work the user performs when switching input signals because of silence or noise.
  • FIG. 1 is a schematic block diagram showing a configuration of a signal switching device 100 according to the first embodiment.
  • The signal switching device 100 includes an input unit 110, a model learning unit 120, a model storage unit 130, a likelihood specifying unit 140, a likelihood accumulation storage unit 170, a selection unit 180, and a signal output unit 190.
  • the input unit 110 accepts the input of the learning signal, and also accepts the input of the output signal via the plurality of channels.
  • Examples of signals include audio, a combination of audio and image, and a compressed version of these.
  • When the signal is compressed, the input unit 110 decompresses it, converts it into audio or a combination of audio and image, and accepts it. For example, when the signal is encoded with AAC (Advanced Audio Coding), the input unit 110 decodes and decompresses the signal and accepts it as audio.
  • The signal may be a combination of audio streams with different sampling frequencies. For example, a combination of audio sampled at 16 kHz and audio sampled at 48 kHz is also included in the above signal.
  • The audio may be monaural or stereo.
  • the model learning unit 120 learns the model by using the learning signals input from the plurality of channels received by the input unit 110.
  • the above learning signal is a signal related to the content.
  • The model is represented by a probability density function, generated by learning, that reflects the relationship between the feature amount converted from the signal related to the content and the content input to the input unit 110. Content is a creative expression that conveys information, ideas, or emotions. Examples include news programs, variety programs, music programs, dramas, in-house broadcasts, rock songs, pop songs, country songs, classical performance songs, rogue songs, statements, and Internet broadcasts.
  • FIG. 2 is a flowchart showing the operation of the model learning unit 120 of the signal switching device 100 according to the first embodiment.
  • the model learning unit 120 extracts a feature amount for each unit time from the learning signal received by the input unit 110 (step S1).
  • The unit time is a period of predetermined length (for example, several hundred milliseconds).
  • Examples of feature quantities include MFCC (Mel-Frequency Cepstral Coefficients), short-time frequency spectrum, log frequency spectrum, wavelet transform spectrum, frequency components obtained by orthogonal transform, non-negative values decomposed by NMF (Non-negative Matrix Factorization), principal components decomposed by PCA (Principal Component Analysis), waveform power, characteristic frequency intensity, and combinations of these features.
  • For example, the model learning unit 120 extracts, as the feature quantity, a 28-dimensional vector consisting of a 14-dimensional MFCC vector and a 14-dimensional vector of MFCC differential (delta) coefficients from the learning signal received by the input unit 110.
  • the model learning unit 120 cuts out the waveform of the received signal, for example, every 100 ms, performs Fourier transform to obtain a logarithmic amplitude spectrum, obtains a mel frequency spectrum, and extracts MFCC by discrete cosine transform.
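  • The extraction just described can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in the publication: the function names, the 16 kHz sampling rate, the Hamming window, the 26 mel filters, and the simple frame-to-frame difference used for the delta coefficients are all hypothetical choices, and the log is taken after the mel filter bank, which is the common ordering.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed design)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def extract_mfcc(signal, sample_rate=16000, frame_ms=100, n_filters=26, n_coeffs=14):
    """Cut the waveform into frames and return one MFCC vector per frame."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[: n_frames * frame_len], (n_frames, frame_len))
    frames = frames * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2           # Fourier transform
    fbank = mel_filterbank(n_filters, frame_len, sample_rate)
    log_mel = np.log(power @ fbank.T + 1e-10)                  # log mel spectrum
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))  # DCT-II basis
    return log_mel @ dct_basis.T

def add_deltas(mfcc):
    """Append frame-to-frame differences, giving 2 * n_coeffs dimensions."""
    delta = np.diff(mfcc, axis=0, prepend=mfcc[:1])
    return np.hstack([mfcc, delta])
```

  • With the defaults above, one second of 16 kHz audio yields ten frames, and `add_deltas(extract_mfcc(signal))` returns one 28-dimensional vector per frame.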
  • the model learning unit 120 learns a model representing the probability density function generated to reflect the relationship with the content input to the input unit 110 by using the feature amount extracted in step S1 (step S2).
  • An example of a model learned by the model learning unit 120 is a GMM (Gaussian Mixture Model).
  • the model learning unit 120 learns a model representing a probability density function that inputs the N-dimensional features extracted in step S1 and outputs the distribution probability of the signal related to the content. That is, in the model learning unit 120 according to the first embodiment, the model is learned by unsupervised learning.
  • For example, the model learning unit 120 uses a GMM with a preset number of mixture components n as the model, and trains the GMM on the MFCC features converted in step S1 so that it reflects the relationship with the content input to the input unit 110.
  • As the number of mixture components n, a value optimized for the characteristics and quantity of the learning signal received by the input unit 110 is used.
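  • As a minimal sketch of unsupervised GMM training, the following fits a diagonal-covariance mixture with the EM algorithm. The publication does not state the covariance structure, the initialization, or the fitting algorithm, so those choices and the names `fit_gmm` and `gmm_log_density` are assumptions made for illustration.

```python
import numpy as np

def log_gaussians(X, means, variances):
    """Log density of every sample under every diagonal Gaussian, shape (T, n)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def fit_gmm(X, n_components, n_iter=50, seed=0):
    """Fit mixture weights, means, and diagonal variances to features X (T, D)."""
    rng = np.random.default_rng(seed)
    T = X.shape[0]
    means = X[rng.choice(T, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        log_p = np.log(weights) + log_gaussians(X, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate the parameters from the responsibilities
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def gmm_log_density(X, weights, means, variances):
    """Log of the mixture density for each row of X."""
    log_p = np.log(weights) + log_gaussians(X, means, variances)
    return np.logaddexp.reduce(log_p, axis=1)
```

  • Feature vectors that resemble the training signal receive a higher log density than vectors far from it, which is the property the likelihood specifying unit relies on.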
  • the model storage unit 130 stores the learned model that the model learning unit 120 has learned by using the learning signal received by the input unit 110.
  • For each channel to which an output signal is input, the likelihood specifying unit 140 extracts a feature amount from the output signal received by the input unit 110 and uses the model stored in the model storage unit 130 to specify the likelihood of that output signal.
  • the likelihood indicates the likelihood that the model representing the distribution of the output signal is the model stored in the model storage unit 130. That is, the likelihood is represented by a value obtained by evaluating the distribution of the output signal with the model stored in the model storage unit 130.
  • An example of a value representing the likelihood is a real number of 0 or more and 1 or less.
  • The likelihood specifying unit 140 is not limited to real numbers between 0 and 1 as the value representing the likelihood; it can also use a range whose upper and lower limits are real numbers other than 1 and 0. The likelihood specifying unit 140 can also use a negative logarithm as the value representing the likelihood. Ordinarily, the higher the likelihood, the closer the value is to the upper limit of its range, and the lower the likelihood, the closer the value is to the lower limit.
  • When the likelihood specifying unit 140 uses a negative logarithm as the value representing the likelihood, the relationship is reversed: the higher the likelihood, the closer the value is to the lower limit of its range, and the lower the likelihood, the closer the value is to the upper limit.
  • FIG. 3 is a flowchart showing the operation of the likelihood specifying unit 140.
  • For each unit time, the likelihood specifying unit 140 extracts a feature amount from the output signal of each of the plurality of input channels (step S5).
  • the above unit time is the same as the unit time in the feature amount extraction of the model learning unit 120.
  • the above-mentioned feature amount extraction is the extraction of the feature amount applied to the model stored in the model storage unit 130.
  • the likelihood specifying unit 140 evaluates the feature amount for each unit time extracted in step S5 using the model stored in the model storage unit 130. That is, the observation probability of the feature quantity for each unit time in the model is obtained (step S6).
  • The likelihood specifying unit 140 specifies the likelihood by taking the product of the observation probabilities obtained in step S6 (step S7). Alternatively, in step S7 the likelihood specifying unit 140 can specify the likelihood as an average of the observation probabilities obtained in step S6 weighted by a forgetting factor, instead of taking their product.
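  • The two ways of combining observation probabilities in step S7, and the negative-logarithm representation mentioned earlier, might look like the sketch below. The function names and the exponential form of the forgetting-factor weighting are assumptions; the publication does not give a formula. The inputs are assumed to be positive probabilities ordered from oldest to newest.

```python
import math

def likelihood_product(probs):
    """Product of observation probabilities, computed as a sum of logs."""
    return math.exp(sum(math.log(p) for p in probs))

def negative_log_likelihood(probs):
    """Negative-log representation: a smaller value means a higher likelihood."""
    return -sum(math.log(p) for p in probs)

def likelihood_forgetting(probs, forgetting=0.9):
    """Average weighted by a forgetting factor, so older values count less."""
    num = den = 0.0
    for p in probs:                      # oldest first
        num = forgetting * num + p
        den = forgetting * den + 1.0
    return num / den
```

  • Summing logarithms avoids the numerical underflow that a long product of small probabilities would cause, which is one practical reason to prefer the negative-logarithm form.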
  • As an example, consider the case where the learning signal received by the input unit 110 is content related to TV programs, the model stored in the model storage unit 130 is a model of MFCC features, and the output signal is input from two channels.
  • The likelihood specifying unit 140 extracts the MFCC from the output signal received by the input unit 110 for each of the two input channels every unit time. For each of the two channels, the likelihood specifying unit 140 then obtains the observation probability of the MFCC under the model of news programs and variety programs stored in the model storage unit 130.
  • The likelihood specifying unit 140 specifies the likelihood for each of the two channels using these MFCC observation probabilities.
  • the likelihood in this example represents the TV program-likeness of the output signal. The higher the likelihood value in this example, the more likely the output signal is to be a television program.
  • The likelihood specifying unit 140 may output the likelihood by changing the feature conversion method for each signal format. For example, when the sampling frequencies of the output signals differ, the likelihood specifying unit 140 emphasizes the low-frequency region of each output signal, converts it into the feature amount, and outputs the likelihood using the model.
  • The likelihood accumulation storage unit 170 accumulates and stores, for each channel, the likelihoods output by the likelihood specifying unit 140 over a certain period. For example, when the period is set to 10 seconds, the likelihood accumulation storage unit 170 accumulates and stores 10 seconds of output-signal likelihoods for each channel.
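  • One simple way to hold a fixed period of likelihoods per channel is a bounded queue, sketched below. The class name `LikelihoodStore`, its methods, and the 0.1-second unit time are hypothetical; the publication only states that likelihoods are kept per channel for a certain period.

```python
from collections import deque

class LikelihoodStore:
    """Keeps the most recent `period_s` seconds of likelihoods per channel."""

    def __init__(self, n_channels, period_s=10.0, unit_s=0.1):
        maxlen = max(1, round(period_s / unit_s))   # values kept per channel
        self._buffers = [deque(maxlen=maxlen) for _ in range(n_channels)]

    def push(self, channel, likelihood):
        # The oldest value is dropped automatically once the buffer is full.
        self._buffers[channel].append(likelihood)

    def window(self, channel):
        return list(self._buffers[channel])
```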
  • The selection unit 180 selects an output signal using a value calculated, for each channel, from the likelihoods accumulated and stored by the likelihood accumulation storage unit 170.
  • Examples of likelihood-based selection include the following. For each channel whose likelihoods the likelihood accumulation storage unit 170 has accumulated and stored, the selection unit 180 obtains the minimum likelihood over a certain period and selects the output signal of the channel with the higher minimum as the output signal. Alternatively, the selection unit 180 may obtain the average likelihood of each channel over a certain period and select the output signal of the channel with the higher average.
  • The selection unit 180 may also obtain the deviation value of the likelihood of each channel over a certain period and select the output signal of the channel with the lower deviation value.
  • When all of the likelihoods stored by the likelihood accumulation storage unit 170 for all channels are equal to or less than a preset likelihood threshold, the selection unit 180 continues to select the output signal of the previously selected channel as the output signal.
  • When not all of those likelihoods are equal to or less than the preset likelihood threshold, the selection unit 180 selects the output signal using any of the minimum likelihood, the average likelihood, or the deviation value of the likelihood described above.
  • When a lower value represents a higher likelihood, as with the negative logarithm described above, the selection unit 180 obtains the maximum value of the likelihood of each channel over a certain period and selects the output signal of the channel with the lower maximum. In that case the selection unit 180 may also obtain the average value of the likelihood of each channel over a certain period and select the output signal of the channel with the lower average.
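  • The selection rules above can be sketched as a single function. This assumes that a higher value means a higher likelihood, that every channel window is non-empty, and that the "deviation value" is the population standard deviation; the function name and the `rule` parameter are hypothetical.

```python
from statistics import mean, pstdev

def select_channel(windows, previous, threshold, rule="min"):
    """windows maps channel -> list of recent likelihoods (higher is better)."""
    # Every likelihood on every channel is at or below the threshold:
    # keep the previously selected channel.
    if all(v <= threshold for w in windows.values() for v in w):
        return previous
    if rule == "min":        # channel with the higher minimum likelihood
        return max(windows, key=lambda ch: min(windows[ch]))
    if rule == "mean":       # channel with the higher average likelihood
        return max(windows, key=lambda ch: mean(windows[ch]))
    if rule == "deviation":  # channel with the lower spread of likelihoods
        return min(windows, key=lambda ch: pstdev(windows[ch]))
    raise ValueError(f"unknown rule: {rule}")
```

  • The minimum rule penalizes a channel for even a brief dropout within the window, while the average rule tolerates short gaps, so the two can pick different channels for the same data.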
  • the signal output unit 190 outputs the output signal selected by the selection unit 180.
  • FIG. 4 is a flowchart showing an operation related to model learning of the signal switching device 100.
  • the input unit 110 of the signal switching device 100 receives the input learning signal (step S11).
  • the model learning unit 120 extracts a feature amount from the learning signal received in step S11 (step S12).
  • the model learning unit 120 learns a model representing the probability density function generated to reflect the relationship with the content input to the input unit 110 by using the feature amount extracted in step S12 (step S13).
  • the model storage unit 130 stores the model learned in step S13 (step S14). Steps S11 to S14 are initial settings of the signal switching device 100.
  • FIG. 5 is a flowchart showing an operation related to signal switching of the signal switching device 100.
  • the input unit 110 of the signal switching device 100 receives the output signals input from the plurality of channels (step S15).
  • The likelihood specifying unit 140 extracts a feature amount from the output signal received in step S15, obtains an observation probability using the model stored in the model storage unit 130, and specifies the likelihood (step S16).
  • The likelihood accumulation storage unit 170 accumulates and stores, for each channel and for a certain period, the likelihood specified by the likelihood specifying unit 140 in step S16 (step S17).
  • The selection unit 180 selects an output signal using the value calculated for each channel from the likelihoods accumulated and stored in step S17 (step S18).
  • The signal output unit 190 outputs the output signal selected in step S18 (step S19).
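  • Steps S15 to S19 can be tied together in a small loop such as the one below. It is only a sketch: `score` is a hypothetical callable standing in for feature extraction plus model evaluation, the window length is counted in unit times, and the higher-minimum rule is used for selection.

```python
from collections import deque

def switch_signals(frames_by_time, score, period=3):
    """Yield (channel, frame) for each unit time.

    frames_by_time: iterable of lists holding one frame per channel.
    score: hypothetical callable mapping a frame to a likelihood.
    """
    history = None
    for frames in frames_by_time:                        # S15: receive signals
        if history is None:
            history = [deque(maxlen=period) for _ in frames]
        for ch, frame in enumerate(frames):
            history[ch].append(score(frame))             # S16, S17: specify, store
        chosen = max(range(len(frames)),
                     key=lambda ch: min(history[ch]))    # S18: select
        yield chosen, frames[chosen]                     # S19: output
```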
  • The signal switching device 100 includes a model storage unit 130 that stores a model that has learned the feature amount of the signal related to the content, and a likelihood specifying unit 140 that outputs the likelihood of the output signal with respect to the signal related to the content.
  • The signal switching device 100 also includes a selection unit 180 that selects an output signal using the likelihood output by the likelihood specifying unit 140, and a signal output unit 190 that outputs the output signal selected by the selection unit 180.
  • As a result, when the channel on which the signal related to the content is received changes intermittently, for example because of noise or failure, the signal switching device 100 can automatically switch to the output signal of the channel that is receiving the signal related to the content.
  • The signal switching device 100 learns a model by accepting a waveform or a combination of a waveform and an image as a signal related to the content. As a result, the signal switching device 100 can reduce the manual work of switching the input signal even if the output signal is a waveform or a combination of a waveform and an image. Further, according to the first embodiment, the signal switching device 100 includes a likelihood accumulation storage unit 170 that accumulates and stores, for each channel related to the output signal, the likelihoods output over a certain period. As a result, the signal switching device 100 can switch signals automatically while preventing the output signal from fluctuating excessively.
  • FIG. 6 is a schematic block diagram showing the configuration of the signal switching device 100 according to the second embodiment.
  • the configuration of the signal switching device 100 according to the second embodiment is the same as the configuration of the signal switching device 100 according to the first embodiment.
  • the input unit 110 accepts the inputs of the learning signal and the output signal, and further accepts the user's selection of the output signal.
  • the above output signal selection is, for example, a user's channel selection.
  • the selection unit 180 selects an output signal based on the information of the output signal selected by the user received by the input unit 110.
  • FIG. 7 is a flowchart showing the operation of the selection unit 180.
  • When at least one likelihood for the channel selected by the user is not equal to or greater than a preset likelihood threshold (step S21: NO), the selection unit 180 selects the output signal using the value calculated for each channel from the likelihoods accumulated and stored by the likelihood accumulation storage unit 170 (step S23).
  • the signal switching device 100 receives the information of the output signal selected by the user and selects the output signal. As a result, the signal switching device 100 can select the output signal not only by the likelihood of the output signal but also by the user's direct selection.
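  • A sketch of the check in step S21 follows. The text above only describes the NO branch; honoring the user's channel when every stored likelihood reaches the threshold is an assumption about the YES branch, and the higher-minimum fallback is one of the rules from the first embodiment. The function name is hypothetical.

```python
def select_with_user_choice(windows, user_channel, threshold):
    """windows maps channel -> list of recent likelihoods (higher is better)."""
    # S21: are all likelihoods on the user's channel at or above the threshold?
    if all(v >= threshold for v in windows[user_channel]):
        return user_channel                               # assumed YES branch
    # S23: otherwise fall back to likelihood-based selection.
    return max(windows, key=lambda ch: min(windows[ch]))
```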
  • FIG. 8 is a schematic block diagram showing the configuration of the signal switching device 100 according to the third embodiment.
  • the configuration of the signal switching device 100 according to the third embodiment is a configuration in which the model update unit 210 is added to the configuration of the signal switching device 100 according to the first embodiment.
  • the input unit 110 receives the input of the learning signal and the output signal via the plurality of channels, and further receives the information of the cycle in which the model update unit 210 updates the model.
  • The model update unit 210 accumulates and stores the feature amount of the output signal selected by the selection unit 180 and, at each update cycle received by the input unit 110, uses the accumulated feature amounts to update the model stored in the model storage unit 130.
  • The signal switching device 100 includes a model update unit 210 that uses the feature amount of the output signal selected by the selection unit 180 to update the model stored in the model storage unit 130. As a result, the signal switching device 100 updates the model with the signal selected by the selection unit 180, so the output signal can be selected using the latest information.
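  • Periodic updating might be organized as below. The publication does not say whether the model is retrained from scratch or adapted, nor how the cycle is measured, so this sketch counts the cycle in accumulated feature vectors and delegates to a hypothetical `fit` callable; the class name is also hypothetical.

```python
class ModelUpdater:
    """Collects features of the selected signal and refits once per cycle."""

    def __init__(self, fit, cycle):
        self._fit = fit          # hypothetical: list of features -> new model
        self._cycle = cycle      # number of features per update cycle
        self._features = []

    def observe(self, feature, model):
        """Store one feature; return the new model when a cycle completes."""
        self._features.append(feature)
        if len(self._features) < self._cycle:
            return model         # cycle not finished: keep the current model
        new_model = self._fit(self._features)
        self._features = []      # start accumulating for the next cycle
        return new_model
```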
  • a fourth embodiment of the present invention will be described in detail with reference to the drawings.
  • FIG. 9 is a schematic block diagram showing the configuration of the signal switching device 100 according to the fourth embodiment.
  • the signal switching device 100 according to the fourth embodiment includes a display unit 160 and a determination unit 150 in addition to the configuration of the signal switching device 100 according to the first embodiment.
  • the model learning unit 120 extracts the feature amount of the learning signal received by the input unit 110 for each content type of the learning signal, and learns the model for each content type. For example, when the learning signal received by the input unit 110 is a signal related to the content types of the variety program and the news program, the model learning unit 120 learns a model for each of the variety program and the news program.
  • the model storage unit 130 stores the models learned for each content type.
  • The likelihood specifying unit 140 extracts a feature amount from the output signal for each channel, and uses the model for each content type stored in the model storage unit 130 to specify the likelihood of the output signal received by the input unit 110 for each content type. That is, the likelihood specifying unit 140 of the first embodiment specifies the likelihood of the output signal for each channel, while the likelihood specifying unit 140 of the fourth embodiment specifies the likelihood of the output signal for each channel and each content type.
  • The determination unit 150 determines whether or not the output signal received by the input unit 110 is content, using the value calculated for each channel from the likelihoods accumulated and stored by the likelihood accumulation storage unit 170.
  • the determination unit 150 uses a likelihood threshold set in advance for each content type in the above determination.
  • For example, suppose that the model storage unit 130 stores models for the contents of variety programs and news programs, that the likelihood threshold for the variety program model is 0.7, and that the likelihood threshold for the news program model is 0.8.
  • When the output signal received by the input unit 110 has a likelihood of 0.8 under the variety program model and 0.6 under the news program model, the variety program likelihood reaches its threshold of 0.7 while the news program likelihood does not reach its threshold of 0.8, so the determination unit 150 determines that the output signal is the content of a variety program.
  • In another case, the determination unit 150 determines that the output signal is the content of a variety program, which has the higher likelihood, and that it is not the content of a news program.
  • Conversely, when the news program has the higher likelihood, the determination unit 150 determines that the output signal is the content of a news program and that it is not the content of a variety program.
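  • The determination can be sketched as a threshold check per content type followed by a comparison of the likelihoods that pass. How ties and the case where several types pass are resolved is not fully recoverable from the text, so picking the highest passing likelihood is an assumption, as is the function name.

```python
def determine_content(likelihoods, thresholds):
    """Return the content type the signal is judged to be, or None.

    likelihoods, thresholds: dicts keyed by content type.
    """
    passing = {t: v for t, v in likelihoods.items() if v >= thresholds[t]}
    if not passing:
        return None                        # judged not to be any known content
    return max(passing, key=passing.get)   # highest likelihood among passing types
```

  • With thresholds of 0.7 for variety programs and 0.8 for news programs, likelihoods of 0.8 and 0.6 respectively give a result of variety program, matching the numerical example above.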
  • the display unit 160 displays the determination result of the determination unit 150.
  • the signal switching device 100 learns a signal model for a plurality of contents for each content type. Further, the signal switching device 100 includes a determination unit 150 that specifies the likelihood of the input signal for each content type and determines whether or not the input signal is content for each content type. As a result, the signal switching device 100 displays whether or not the input signal is content, so that the user can know which content type the input signal corresponds to among the plurality of content types.
  • the model learning unit 120 of the signal switching device 100 may learn a model for each frequency and format.
  • the likelihood specifying unit 140 of the signal switching device 100 may specify the likelihood for each frequency and format of the output signal.
  • The model may be trained using a data set in which a signal is the input sample and a label indicating whether or not the signal is content is the output sample. In this case, the likelihood is given by the label value obtained by inputting a signal to the model. Learning and inference for the signal switching device 100 may be performed by another device.
  • FIG. 10 is a schematic block diagram showing a basic configuration of the signal switching device 100 according to the present invention.
  • The configuration shown in FIG. 1 has been described as an embodiment of the signal switching device 100 according to the present invention, but the basic configuration of the signal switching device 100 according to the present invention is as shown in FIG. 10. That is, according to the present invention, the model storage unit 130, the likelihood specifying unit 140, the selection unit 180, and the signal output unit 190 form the basic configuration.
  • the model storage unit 130 stores the model used by the likelihood specifying unit 140.
  • the model stored in the model storage unit 130 is not limited to the model learned by the model learning unit 120, but may be, for example, a model given in advance.
  • the selection unit 180 selects an output signal using the likelihood for output specified by the likelihood specifying unit 140 for each channel.
  • The signal switching device 100 specifies the likelihood of the input signal with respect to the signal related to the content by using a model that has learned the feature amount of the signal related to the content. As a result, when the channel on which the signal related to the content is received changes intermittently because of noise or failure, the signal switching device 100 automatically switches to the output signal of the channel that is receiving the signal related to the content.
  • FIG. 11 is a schematic block diagram showing the configuration of a computer according to at least one embodiment.
  • the computer 1100 includes a processor 1110, a main memory 1120, a storage 1130, and an interface 1140.
  • the signal switching device 100 described above is mounted on the computer 1100.
  • the operation of each processing unit described above is stored in the storage 1130 in the form of a program.
  • the processor 1110 reads a program from the storage 1130, expands it into the main memory 1120, and executes the above processing according to the program. Further, the processor 1110 secures a storage area corresponding to each of the above-mentioned storage units in the main memory 1120 according to the program.
  • the program may be for realizing a part of the functions exerted on the computer 1100.
  • the program may exert its function in combination with another program already stored in the storage 1130, or in combination with another program mounted on another device.
  • the computer 1100 may be provided with a custom LSI (Large Scale Integrated Circuit) such as a PLD (Programmable Logic Device) in addition to or in place of the above configuration.
  • PLDs include PAL (Programmable Array Logic), GAL (Generic Array Logic), CPLD (Complex Programmable Logic Device), and FPGA (Field Programmable Gate Array).
  • Examples of the storage 1130 include magnetic disks, magneto-optical disks, semiconductor memories, and the like.
  • the storage 1130 may be internal media directly connected to the bus of computer 1100, or external media connected to the computer via interface 1140 or a communication line.
  • When this program is distributed to the computer 1100 via a communication line, the computer 1100 that received it may expand the program in the main memory 1120 and execute the above processing.
  • The storage 1130 is a non-transitory tangible storage medium.
  • the program may be for realizing a part of the above-mentioned functions. Further, the program may be a so-called difference file (difference program) that realizes the above-mentioned function in combination with another program already stored in the storage 1130.
  • The signal switching device uses a model generated from the features of content-related signals to output the likelihood of an input signal being a content-related signal, and switches the input signal accordingly. The user's manual work in switching input signals because of silence or noise can therefore be reduced.

Abstract

A signal switching device (100) according to the present invention outputs the likelihood of an input signal being a content-related signal by using a model generated using features of content-related signals, and switches input signals. Thus, it is possible to reduce the amount of manual labor performed by a user when switching input signals due to silence or noise.

Description

Signal switching device, signal switching method, and recording medium
The present invention relates to a signal switching device, a signal switching method, and a recording medium.
Patent Document 1 discloses a technique for generating learning data from the features of content and identifying the scene type of the content by clustering that learning data.
Patent Document 2 discloses a technique in which a model is generated from learning information that includes a set of time-series acoustic signal sequences and acoustic event information representing the acoustic events corresponding to those sequences, so that the generated model can estimate new data accurately without overfitting the data used to compute the model.
Patent Document 3 discloses a technique for annotating content with a model generated by extracting features of an image of the content together with features of text that describes that image.
Japanese Unexamined Patent Application Publication No. 2018-141854
Japanese Unexamined Patent Application Publication No. 2014-048522
Japanese Unexamined Patent Application Publication No. 2012-038240
When a content-related signal is output, its quality must be maintained, so a signal that has become partially silent or on which noise is superimposed has to be switched to another content-related signal. Some so-called automatic audio monitoring devices, which are used for signal switching and compare a plurality of audio signals to determine whether they differ, do so by full-wave rectifying the waveforms of the content-related signals and comparing the energy of each signal.
However, although models generated from content features are used for identifying scene types and for annotating content, as in the techniques described in Patent Documents 1 to 3, such models are not used for switching content-related signals. Moreover, when a content-related signal becomes partially silent its energy decreases, whereas when noise is superimposed on it its energy increases. An automatic audio monitoring device that compares energy therefore cannot determine which signal is affected by silence or noise. As a result, switching to a signal that is free of silence and noise requires manual work by the user.
An object of the present invention is to provide a signal switching device, a signal switching method, and a recording medium that solve the above-mentioned problem.
According to a first aspect of the present invention, a signal switching device includes: a model storage unit that stores a trained model that has learned features of content-related signals; a likelihood specifying unit that uses the trained model to specify the likelihood, with respect to the content-related signals, of input signals input from a plurality of channels; a selection unit that uses the likelihood to select an output signal from among the input signals input from the plurality of channels; and a signal output unit that outputs the selected output signal.
According to a second aspect of the present invention, a signal switching method includes: a step of storing a trained model that has learned features of content-related signals; a step of using the trained model to specify the likelihood, with respect to the content-related signals, of input signals input from a plurality of channels; a step of using the likelihood to select an output signal from among the input signals input from the plurality of channels; and a step of outputting the selected output signal.
According to a third aspect of the present invention, a program stored in a recording medium causes a computer to function as: a model storage unit that stores a trained model that has learned features of content-related signals; a likelihood specifying unit that uses the trained model to specify the likelihood, with respect to the content-related signals, of input signals input from a plurality of channels; a selection unit that uses the likelihood to select an output signal from among the input signals input from the plurality of channels; and a signal output unit that outputs the selected output signal.
According to at least one of the above aspects, the signal switching device uses a model generated from the features of content-related signals to output the likelihood of an input signal with respect to the content-related signals, and switches the input signal. The user's manual work in switching input signals because of silence or noise can therefore be reduced.
A schematic block diagram showing the configuration of the signal switching device according to the first embodiment.
A flowchart showing the operation of the model learning unit of the signal switching device according to the first embodiment.
A flowchart showing the operation of the likelihood specifying unit of the signal switching device according to the first embodiment.
A flowchart showing the model learning operation of the signal switching device according to the first embodiment.
A flowchart showing the signal switching operation of the signal switching device according to the first embodiment.
A schematic block diagram showing the configuration of the signal switching device according to the second embodiment.
A flowchart showing the operation of the selection unit of the signal switching device according to the second embodiment.
A schematic block diagram showing the configuration of the signal switching device according to the third embodiment.
A schematic block diagram showing the configuration of the signal switching device according to the fourth embodiment.
A schematic block diagram showing the signal switching device according to the basic configuration.
A schematic block diagram showing the configuration of a computer according to at least one embodiment.
<First Embodiment>
Hereinafter, the first embodiment of the present invention will be described in detail with reference to the drawings.
<< Configuration of signal switching device >>
FIG. 1 is a schematic block diagram showing the configuration of the signal switching device 100 according to the first embodiment. The signal switching device 100 includes an input unit 110, a model learning unit 120, a model storage unit 130, a likelihood specifying unit 140, a likelihood accumulation storage unit 170, a selection unit 180, and a signal output unit 190.
The input unit 110 accepts the input of a learning signal and also accepts the input of signals for output via a plurality of channels. Examples of the signals include audio, a combination of audio and images, and compressed versions of these.
When an input signal is compressed, the input unit 110 decompresses it and accepts it after converting it into audio or a combination of audio and images. For example, when the signal input to the input unit 110 is a bitstream compressed with AAC (Advanced Audio Coding), the input unit 110 decodes the signal to decompress it and accepts it as audio.
The above signal may also be a combination of audio with different frequencies. For example, a combination of audio with a sampling frequency of 16 kHz and audio with a sampling frequency of 48 kHz is also included in the above signals. The audio may be monaural or stereo.
The model learning unit 120 learns a model using the learning signals, input from a plurality of channels, that the input unit 110 has accepted. The learning signals are content-related signals. The model is represented by a probability density function generated by learning so as to reflect the relationship between the features converted from the content-related signals and the content input to the input unit 110.
Examples of the content include news programs, variety programs, music programs, dramas, in-house announcements, rock music, pop vocal music, country music, classical music performances, rokyoku (narrative singing), shomyo (Buddhist chant), and Internet broadcasts; that is, creative expressions for conveying information, ideas, and emotions.
FIG. 2 is a flowchart showing the operation of the model learning unit 120 of the signal switching device 100 according to the first embodiment.
The model learning unit 120 extracts features for each unit time from the learning signal accepted by the input unit 110 (step S1). A unit time is a period of a predetermined length (for example, several hundred milliseconds).
Examples of the features include MFCC (Mel-Frequency Cepstrum Coefficients), the short-time frequency spectrum, the log-frequency spectrum, the wavelet transform spectrum, frequency components obtained by an orthogonal transform, non-negative values obtained by NMF (Non-negative Matrix Factorization), principal components obtained by PCA (Principal Component Analysis), waveform power, characteristic frequency intensity, and combinations of two or more of these.
For example, the model learning unit 120 extracts, as the feature, a 28-dimensional vector consisting of a 14-dimensional MFCC vector and a 14-dimensional vector of the MFCC differential coefficients from the learning signal accepted by the input unit 110. The model learning unit 120 cuts out the waveform of the accepted signal, for example, every 100 ms, applies a Fourier transform to obtain the logarithmic amplitude spectrum, obtains the mel-frequency spectrum, and extracts the MFCC by a discrete cosine transform.
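The 28-dimensional feature described above can be sketched as follows. This is only an illustration under stated assumptions: the MFCC computation itself is not reproduced, the 14-dimensional MFCC frames are placeholder numbers, and the differential coefficients are approximated by a simple first difference between consecutive frames, which the description does not specify. The names `add_deltas` and `frame_length` are hypothetical.

```python
from typing import List


def frame_length(sample_rate_hz: int, unit_time_ms: int = 100) -> int:
    """Number of samples in one unit-time frame (100 ms in the example)."""
    return sample_rate_hz * unit_time_ms // 1000


def add_deltas(frames: List[List[float]]) -> List[List[float]]:
    """Append a first-difference delta vector to every static feature vector.

    Assumption: the delta of the first frame is zero because it has no
    predecessor. Other delta definitions are equally possible.
    """
    stacked = []
    for t, current in enumerate(frames):
        previous = frames[t - 1] if t > 0 else current
        delta = [c - p for c, p in zip(current, previous)]
        stacked.append(list(current) + delta)
    return stacked


# Three hypothetical 14-dimensional MFCC frames (placeholder values).
mfcc_frames = [[0.1 * (t + k) for k in range(14)] for t in range(3)]
features = add_deltas(mfcc_frames)  # each vector now has 14 + 14 = 28 dimensions
```

With a 16 kHz signal, `frame_length(16000)` gives the number of samples that one 100 ms frame would cover before the Fourier transform step.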
Using the features extracted in step S1, the model learning unit 120 learns a model representing a probability density function generated to reflect the relationship with the content input to the input unit 110 (step S2). An example of the model learned by the model learning unit 120 is a GMM (Gaussian Mixture Model).
For example, the model learning unit 120 learns a model representing a probability density function that takes each N-dimensional feature extracted in step S1 as input and outputs the distribution probability of the content-related signal. That is, the model learning unit 120 according to the first embodiment trains the model by unsupervised learning.
For example, the model learning unit 120 uses a GMM with a preset number of mixture components n as the model and, with the MFCC features obtained in step S1, learns a GMM that reflects the relationship with the content input to the input unit 110. The number of mixture components n is a value optimized according to the characteristics and amount of the learning signals accepted by the input unit 110.
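As a rough illustration of unsupervised GMM learning in step S2, the sketch below fits a one-dimensional mixture with the EM algorithm. It is a simplification under stated assumptions: the description uses N-dimensional features, while this example uses scalar toy data; the initialization, iteration count, and variance floor are arbitrary choices; and the names `fit_gmm_1d` and `gmm_density` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data: Sequence[float], n_components: int, n_iter: int = 50,
               var_floor: float = 1e-3) -> Tuple[List[float], List[float], List[float]]:
    """Fit mixture weights, means, and variances by EM (no labels needed)."""
    lo, hi = min(data), max(data)
    if n_components == 1:
        means = [(lo + hi) / 2.0]
    else:  # spread the initial means evenly over the data range
        means = [lo + (hi - lo) * k / (n_components - 1) for k in range(n_components)]
    variances = [1.0] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([j / total for j in joint])
        # M-step: re-estimate the parameters from the responsibilities.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_floor, var)
    return weights, means, variances


def gmm_density(x: float, weights, means, variances) -> float:
    """Value of the learned probability density function at x."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


toy_data = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
weights, means, variances = fit_gmm_1d(toy_data, n_components=2)
```

The learned density is high near the two clusters of the toy data and low between them, which is the property the likelihood specifying unit relies on when it later evaluates new features.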
The model storage unit 130 stores the trained model that the model learning unit 120 has learned using the learning signals accepted by the input unit 110.
For each channel on which a signal for output is input, the likelihood specifying unit 140 extracts features from the signal for output accepted by the input unit 110 and specifies the likelihood of that signal using the model stored in the model storage unit 130.
The likelihood indicates how plausible it is that the model representing the distribution of the signal for output is the model stored in the model storage unit 130. In other words, the likelihood is the value obtained by evaluating the distribution of the signal for output with the model stored in the model storage unit 130. An example of a value representing the likelihood is a real number from 0 to 1 inclusive. The likelihood specifying unit 140 may also use a real number in a range whose upper and lower limits are real numbers other than 0 and 1. Further, the likelihood specifying unit 140 may use a negative logarithm as the value representing the likelihood.
The higher the plausibility, the closer the value representing the likelihood is to its upper limit, and the lower the plausibility, the closer the value is to its lower limit. However, when the likelihood specifying unit 140 uses a negative logarithm as the value representing the likelihood, the relationship is reversed: the higher the plausibility, the closer the value is to its lower limit, and the lower the plausibility, the closer the value is to its upper limit.
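The reversal of ordering under the negative logarithm can be checked with a small sketch. The channel names and probability values are made up for illustration.

```python
import math

# Hypothetical likelihoods expressed as values between 0 and 1.
probabilities = {"channel_1": 0.8, "channel_2": 0.2}

# With plain probabilities, the most plausible channel has the largest value.
best_by_probability = max(probabilities, key=probabilities.get)

# With negative logarithms, the most plausible channel has the smallest value.
neg_logs = {channel: -math.log(p) for channel, p in probabilities.items()}
best_by_neg_log = min(neg_logs, key=neg_logs.get)
```

Both representations identify the same channel; only the direction of the comparison changes.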
FIG. 3 is a flowchart showing the operation of the likelihood specifying unit 140.
For the signals for output input from the plurality of input channels, the likelihood specifying unit 140 extracts features for each unit time and for each input channel (step S5). This unit time is the same as the unit time used for feature extraction by the model learning unit 120. The features extracted here are the features on which the model stored in the model storage unit 130 is based.
The likelihood specifying unit 140 evaluates the features for each unit time extracted in step S5 using the model stored in the model storage unit 130. That is, it obtains the observation probability, under the model, of the features for each unit time (step S6).
The likelihood specifying unit 140 specifies the likelihood by taking the product of the plurality of observation probabilities obtained in step S6 (step S7). Instead of taking the product, the likelihood specifying unit 140 may specify the likelihood in step S7 as an average of the observation probabilities obtained in step S6 weighted by a forgetting factor.
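The two ways of combining observation probabilities in step S7 might look like the sketch below. The exact weighting scheme is not given in the description, so the exponential decay used here (the newest observation has weight 1 and each older one is multiplied by the forgetting factor once more) is an assumption, as are the function names.

```python
import math
from typing import Sequence


def likelihood_product(probs: Sequence[float]) -> float:
    """Product of observation probabilities, computed via a sum of logarithms
    to limit floating-point underflow on long sequences."""
    if any(p <= 0.0 for p in probs):
        return 0.0
    return math.exp(sum(math.log(p) for p in probs))


def likelihood_forgetting_average(probs: Sequence[float],
                                  forgetting_factor: float = 0.9) -> float:
    """Weighted average in which older observations count less.

    Assumption: probs is ordered from oldest to newest and must not be empty.
    """
    n = len(probs)
    weights = [forgetting_factor ** (n - 1 - i) for i in range(n)]
    return sum(w * p for w, p in zip(weights, probs)) / sum(weights)


product_value = likelihood_product([0.5, 0.5, 0.4])
average_value = likelihood_forgetting_average([0.2, 0.8], forgetting_factor=0.5)
```

The product penalizes any single poor frame heavily, whereas the weighted average reacts more gradually and favors recent frames.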
A specific example is as follows. Suppose that the learning signal accepted by the input unit 110 is content of television programs, the model stored in the model storage unit 130 is an MFCC-based model, and the signals for output are input from two channels. In this case, the likelihood specifying unit 140 extracts the MFCC for each unit time, separately for the two input channels, from the signals for output accepted by the input unit 110. The likelihood specifying unit 140 then obtains, for each of the two channels, the observation probability of the MFCC under the model for news programs and variety programs stored in the model storage unit 130. Using these observation probabilities, the likelihood specifying unit 140 specifies the likelihood for each of the two channels. The likelihood in this example represents how much the signal for output resembles a television program: the higher the likelihood value, the more the signal resembles a television program.
When the signals for output accepted by the input unit 110 are a combination of audio with different frequencies, the likelihood specifying unit 140 outputs the likelihood by changing the feature conversion method for each format. For example, when the sampling frequencies of the signals for output differ, the likelihood specifying unit 140 emphasizes the low-frequency range of each signal, converts it into features, and outputs the likelihood using the model.
The likelihood accumulation storage unit 170 accumulates and stores, for each channel and for a fixed period, the likelihoods output by the likelihood specifying unit 140. For example, when the fixed period is set to 10 seconds, the likelihood accumulation storage unit 170 accumulates and stores 10 seconds of likelihoods of the signals for output for each channel.
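One possible way to hold a fixed period of likelihoods per channel is a bounded queue, sketched below. The class name `LikelihoodBuffer` is hypothetical, and the sketch assumes that one likelihood arrives per unit time of 100 ms, so that 10 seconds correspond to 100 stored values.

```python
from collections import deque
from typing import Deque, Dict, Hashable, List


class LikelihoodBuffer:
    """Keeps only the most recent likelihoods of each channel."""

    def __init__(self, period_s: float = 10.0, unit_time_s: float = 0.1) -> None:
        # Number of likelihood values that fit into the fixed period.
        self.capacity = max(1, round(period_s / unit_time_s))
        self._buffers: Dict[Hashable, Deque[float]] = {}

    def add(self, channel: Hashable, likelihood: float) -> None:
        buffer = self._buffers.get(channel)
        if buffer is None:
            buffer = self._buffers[channel] = deque(maxlen=self.capacity)
        buffer.append(likelihood)  # the oldest value is dropped when full

    def values(self, channel: Hashable) -> List[float]:
        return list(self._buffers.get(channel, ()))


buffer = LikelihoodBuffer(period_s=0.3, unit_time_s=0.1)  # room for 3 values
for value in [0.1, 0.2, 0.3, 0.4, 0.5]:
    buffer.add("channel_1", value)
```

After the loop, only the three most recent values remain for `channel_1`, which mirrors the sliding fixed period described above.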
The selection unit 180 selects the output signal using values computed, for each channel of the signals for output, from the likelihoods accumulated and stored by the likelihood accumulation storage unit 170.
Examples of likelihood-based selection include the following.
For each channel whose likelihoods are accumulated in the likelihood accumulation storage unit 170, the selection unit 180 obtains the minimum likelihood over the fixed period and selects, as the output signal, the signal for output on the channel with the higher minimum. Alternatively, the selection unit 180 may obtain the average likelihood of each channel over the fixed period and select, as the output signal, the signal for output on the channel with the higher average.
The selection unit 180 may also obtain the deviation value of the likelihood for each channel whose likelihoods are accumulated over the fixed period and select, as the output signal, the signal for output on the channel with the lower deviation value.
When all the likelihoods of all the channels accumulated in the likelihood accumulation storage unit 170 are less than or equal to a preset likelihood threshold, the selection unit 180 continues to select, as the output signal, the signal for output on the previously selected channel. Otherwise, the selection unit 180 selects the output signal using any of the minimum likelihood, the average likelihood, and the deviation value of the likelihood described above.
When the likelihood specifying unit 140 uses a negative logarithm as the value representing the likelihood, the selection unit 180 obtains, for each channel whose likelihoods are accumulated in the likelihood accumulation storage unit 170, the maximum value over the fixed period and selects, as the output signal, the signal for output on the channel with the lower maximum. Alternatively, in that case the selection unit 180 may obtain the average value of each channel over the fixed period and select, as the output signal, the signal for output on the channel with the lower average.
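The selection rules above can be sketched for the case where likelihoods are values for which higher means more plausible. Two points are assumptions rather than statements from the description: the deviation value is interpreted here as the population standard deviation, and the function name `select_channel` and the `rule` parameter are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, Hashable, List


def select_channel(history: Dict[Hashable, List[float]], previous: Hashable,
                   threshold: float, rule: str = "min") -> Hashable:
    """Pick the channel whose accumulated likelihoods look best."""
    all_values = [v for values in history.values() for v in values]
    # Every likelihood on every channel is at or below the threshold:
    # keep the previously selected channel.
    if all(v <= threshold for v in all_values):
        return previous
    if rule == "min":        # higher minimum wins
        return max(history, key=lambda ch: min(history[ch]))
    if rule == "mean":       # higher average wins
        return max(history, key=lambda ch: mean(history[ch]))
    if rule == "deviation":  # lower spread wins (assumed: standard deviation)
        return min(history, key=lambda ch: pstdev(history[ch]))
    raise ValueError(f"unknown rule: {rule}")


history = {1: [0.6, 0.6, 0.6], 2: [0.9, 0.5, 0.9]}
by_min = select_channel(history, previous=2, threshold=0.3, rule="min")
by_mean = select_channel(history, previous=2, threshold=0.3, rule="mean")
by_deviation = select_channel(history, previous=2, threshold=0.3, rule="deviation")
held = select_channel({1: [0.1, 0.2], 2: [0.05, 0.3]}, previous=2, threshold=0.3)
```

The rules can disagree: channel 2 has the higher average here, but its single poor frame makes channel 1 the choice under the minimum and deviation rules. For negative-logarithm values, `max` and `min` would be swapped as the text describes.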
The signal output unit 190 outputs the output signal selected by the selection unit 180.
<< Operation of signal switching device >>
Next, the operation of the signal switching device 100 will be described.
FIG. 4 is a flowchart showing the model learning operation of the signal switching device 100.
The input unit 110 of the signal switching device 100 accepts the input learning signal (step S11).
The model learning unit 120 extracts features from the learning signal accepted in step S11 (step S12).
Using the features extracted in step S12, the model learning unit 120 learns a model representing a probability density function generated to reflect the relationship with the content input to the input unit 110 (step S13).
The model storage unit 130 stores the model learned in step S13 (step S14).
Steps S11 to S14 constitute the initial setup of the signal switching device 100.
FIG. 5 is a flowchart showing the signal switching operation of the signal switching device 100.
After the initial setup described above, the input unit 110 of the signal switching device 100 accepts the signals for output input from the plurality of channels (step S15).
The likelihood specifying unit 140 extracts features from the signals for output accepted in step S15, obtains the observation probabilities using the model stored in the model storage unit 130, and specifies the likelihood (step S16).
The likelihood accumulation storage unit 170 accumulates and stores, for each channel and for a fixed period, the likelihoods specified by the likelihood specifying unit 140 in step S16 (step S17).
The selection unit 180 selects the output signal using values computed from the likelihoods accumulated and stored for each channel in step S17 (step S18).
The signal output unit 190 outputs the output signal selected in step S18 (step S19).
《Action / Effect》
As described above, according to the first embodiment, the signal switching device 100 includes a model storage unit that stores a model trained on the feature amounts of a signal related to content, and a likelihood specifying unit 140 that outputs the likelihood of a signal with respect to the signal related to the content. The signal switching device 100 also includes a selection unit 180 that selects an output signal using the likelihood output by the likelihood specifying unit 140, and a signal output unit 190 that outputs the output signal selected by the selection unit 180. As a result, when the channel on which the signal related to the content is received changes intermittently, for example because of noise or a failure, the signal switching device 100 can automatically switch to the output signal of the channel that is receiving the signal related to the content.
Further, according to the first embodiment, the signal switching device 100 accepts a waveform, or a combination of a waveform and an image, as the signal related to the content and learns the model from it. As a result, the signal switching device 100 can reduce the manual work involved in switching the input signal even when the signal for output is a waveform or a combination of a waveform and an image.
Further, according to the first embodiment, the signal switching device 100 includes a likelihood accumulation storage unit 170 that accumulates and stores, for each channel of the signals for output, the likelihoods output over a certain period. As a result, the signal switching device 100 can prevent frequent, troublesome fluctuations of the output signal while still switching signals automatically.
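The stabilizing effect of accumulating likelihoods per channel can be illustrated with a minimal sketch. The class name `LikelihoodHistory`, the fixed-length window, and the use of the mean as the per-channel computed value are assumptions made for illustration; the text only states that likelihoods output over a certain period are accumulated per channel and that a value computed from them is used for selection.

```python
from collections import deque


class LikelihoodHistory:
    """Hypothetical stand-in for the likelihood accumulation storage unit 170."""

    def __init__(self, window):
        # Number of most recent likelihoods kept per channel (assumed window form).
        self.window = window
        self.store = {}

    def add(self, channel, value):
        # Append the newest likelihood; the oldest one falls out of the window.
        self.store.setdefault(channel, deque(maxlen=self.window)).append(value)

    def best_channel(self):
        # Per-channel statistic: the mean over the window (an assumption).
        return max(
            self.store,
            key=lambda ch: sum(self.store[ch]) / len(self.store[ch]),
        )


history = LikelihoodHistory(window=3)
for value in (0.9, 0.9, 0.2):   # channel A dips on the latest frame only
    history.add("A", value)
for value in (0.5, 0.5, 0.5):   # channel B is steady but lower on average
    history.add("B", value)
print(history.best_channel())
```

With this windowed mean, a single low likelihood on channel A does not trigger a switch; the selection changes only once the low values dominate the window.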
<Second embodiment>
Hereinafter, the second embodiment of the present invention will be described in detail with reference to the drawings.
<< Configuration of signal switching device >>
FIG. 6 is a schematic block diagram showing the configuration of the signal switching device 100 according to the second embodiment. The configuration of the signal switching device 100 according to the second embodiment is the same as that of the signal switching device 100 according to the first embodiment.
The input unit 110 accepts input of the learning signal and the signals for output, and further accepts the user's selection of an output signal. This selection is, for example, the user's selection of a channel.
The selection unit 180 selects the output signal based on the information, accepted by the input unit 110, about the output signal selected by the user.
FIG. 7 is a flowchart showing the operation of the selection unit 180.
When all of the likelihoods accumulated in the likelihood accumulation storage unit 170 for the channel selected by the user are equal to or greater than a preset likelihood threshold (step S21: YES), the selection unit 180 selects the signal for output on the user-selected channel as the output signal (step S22). When at least one of the likelihoods for the user-selected channel is below the preset likelihood threshold (step S21: NO), the selection unit 180 selects the output signal using the values computed for each channel from the likelihoods accumulated in the likelihood accumulation storage unit 170 (step S23).
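Steps S21 to S23 can be sketched as follows. The function name `select_channel`, the dictionary layout of the accumulated likelihoods, and the use of the mean as the per-channel computed value in step S23 are assumptions for illustration and are not specified in the text.

```python
from statistics import mean


def select_channel(history, user_channel, threshold):
    """Return the channel whose signal becomes the output signal.

    history maps a channel id to the list of likelihoods accumulated for it
    (hypothetical layout for the contents of storage unit 170).
    """
    user_values = history.get(user_channel, [])
    # Step S21: every accumulated likelihood on the user's channel must
    # reach the preset threshold.
    if user_values and all(v >= threshold for v in user_values):
        return user_channel  # Step S22: honor the user's selection.
    # Step S23: fall back to a per-channel statistic (mean is an assumption).
    candidates = {ch: vals for ch, vals in history.items() if vals}
    return max(candidates, key=lambda ch: mean(candidates[ch]))


history = {"A": [0.9, 0.4, 0.8], "B": [0.7, 0.75, 0.8]}
print(select_channel(history, "A", threshold=0.6))
print(select_channel(history, "B", threshold=0.6))
```

In the first call the user's channel A contains one likelihood below the threshold, so the selection falls back to the per-channel statistic; in the second call channel B passes the check and the user's selection is kept.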
《Action / Effect》
As described above, according to the second embodiment, the signal switching device 100 accepts information on the output signal selected by the user and selects the output signal. As a result, the signal switching device 100 can select the output signal based not only on the likelihoods of the signals for output but also on the user's direct selection.
<Third embodiment>
Hereinafter, the third embodiment of the present invention will be described in detail with reference to the drawings.
<< Configuration of signal switching device >>
FIG. 8 is a schematic block diagram showing the configuration of the signal switching device 100 according to the third embodiment.
The signal switching device 100 according to the third embodiment has the configuration of the signal switching device 100 according to the first embodiment with a model update unit 210 added.
The input unit 110 accepts input of the learning signal and of the signals for output via a plurality of channels, and further accepts information on the period at which the model update unit 210 updates the model.
The model update unit 210 accumulates and stores the feature amounts of the output signal selected by the selection unit 180 and, at each update period accepted by the input unit 110, updates the model stored in the model storage unit 130 using the accumulated feature amounts.
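The accumulate-then-update behavior of the model update unit 210 can be sketched as below. The class name `ModelUpdater`, the running-mean model form, and the choice to count the update period in observations are all assumptions for illustration; the text does not specify the model type or the unit of the period.

```python
class ModelUpdater:
    """Hypothetical sketch of the model update unit 210.

    Buffers feature vectors of the selected output signal and folds them
    into a running-mean model once `period` observations have accumulated.
    """

    def __init__(self, mean, count, period):
        self.mean = list(mean)   # current model: mean feature vector (assumed form)
        self.count = count       # number of samples behind the current mean
        self.period = period     # update period, counted in observations (assumed)
        self.buffer = []

    def observe(self, feature):
        # Accumulate the feature amounts of the selected output signal.
        self.buffer.append(list(feature))
        if len(self.buffer) >= self.period:
            self._update()

    def _update(self):
        # Weighted merge of the old mean with the buffered features.
        total = self.count + len(self.buffer)
        for i in range(len(self.mean)):
            batch_sum = sum(f[i] for f in self.buffer)
            self.mean[i] = (self.mean[i] * self.count + batch_sum) / total
        self.count = total
        self.buffer.clear()


updater = ModelUpdater(mean=[0.0, 0.0], count=2, period=2)
updater.observe([2.0, 4.0])   # buffered only, model unchanged
updater.observe([4.0, 8.0])   # period reached, model updated
print(updater.mean, updater.count)
```

Updating only once per period keeps the model stable between updates while still letting it follow the signals that the selection unit actually chose.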
《Action / Effect》
As described above, according to the third embodiment, the signal switching device 100 includes a model update unit 210 that updates the model stored in the model storage unit 130 using the feature amounts of the output signal selected by the selection unit 180. Because the model is updated with the signal selected by the selection unit 180, the signal switching device 100 can select the output signal using the latest information.
<Fourth embodiment>
Hereinafter, the fourth embodiment of the present invention will be described in detail with reference to the drawings.
<< Configuration of signal switching device >>
FIG. 9 is a schematic block diagram showing the configuration of the signal switching device 100 according to the fourth embodiment.
The signal switching device 100 according to the fourth embodiment includes a display unit 160 and a determination unit 150 in addition to the configuration of the signal switching device 100 according to the first embodiment.
The model learning unit 120 extracts feature amounts from the learning signal accepted by the input unit 110 separately for each content type of the learning signal, and learns a model for each content type.
For example, when the learning signal accepted by the input unit 110 covers the content types of variety programs and news programs, the model learning unit 120 learns one model for variety programs and one for news programs.
The model storage unit 130 stores the models learned for each content type.
The likelihood specifying unit 140 extracts feature amounts from the signal for output on each channel and, using the per-content-type models stored in the model storage unit 130, specifies the likelihood of each signal for output accepted by the input unit 110 for each content type. That is, whereas the likelihood specifying unit 140 of the first embodiment specifies the likelihood of the signal for output per channel, the likelihood specifying unit 140 of the fourth embodiment specifies it per channel and per content type.
The determination unit 150 uses the values computed for each channel from the likelihoods accumulated in the likelihood accumulation storage unit 170 to determine, for each content type, whether the signal for output accepted by the input unit 110 is content. In this determination, the determination unit 150 uses a likelihood threshold set in advance for each content type.
For example, suppose the model storage unit 130 stores models for variety-program content and news-program content, with a likelihood threshold of 0.7 for the variety-program model and 0.8 for the news-program model. If a signal for output accepted by the input unit 110 has a likelihood of 0.8 under the variety-program model and 0.6 under the news-program model, the determination unit 150 determines that the signal is variety-program content and is not news-program content. If the variety-program likelihood is 0.9 and the news-program likelihood is 0.85, both are at or above their respective thresholds, so the determination unit 150 determines that the signal is content of the type with the higher likelihood, the variety program, and is not news-program content. If the variety-program likelihood is 0.6 and the news-program likelihood is 0.75, neither reaches its threshold, so the determination unit 150 determines that the signal is content of the type with the higher likelihood, the news program, and is not variety-program content.
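The three cases in the example above follow one rule: prefer the content types whose likelihood reaches their own threshold, and among the candidates pick the highest likelihood; if no type reaches its threshold, pick the highest likelihood overall. A minimal sketch of that rule, in which the function name `judge_content_type` and the dictionary inputs are assumptions for illustration:

```python
def judge_content_type(likelihoods, thresholds):
    """Return the content type the signal is judged to belong to.

    likelihoods and thresholds both map a content-type name to a float
    (hypothetical layout; the text only gives the numeric examples).
    """
    # Content types whose likelihood is at or above their own threshold.
    passing = [t for t, value in likelihoods.items() if value >= thresholds[t]]
    # If none pass, every type stays a candidate, as in the third example case.
    candidates = passing if passing else list(likelihoods)
    return max(candidates, key=lambda t: likelihoods[t])


thresholds = {"variety": 0.7, "news": 0.8}
print(judge_content_type({"variety": 0.8, "news": 0.6}, thresholds))
print(judge_content_type({"variety": 0.9, "news": 0.85}, thresholds))
print(judge_content_type({"variety": 0.6, "news": 0.75}, thresholds))
```

The three calls reproduce the three cases of the example: variety, variety, and news, respectively.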
The display unit 160 displays the result of the determination made by the determination unit 150.
《Action / Effect》
As described above, according to the fourth embodiment, the signal switching device 100 learns models of the signals related to a plurality of contents, one for each content type. The signal switching device 100 also specifies the likelihood of an input signal for each content type and includes a determination unit 150 that determines, for each content type, whether the input signal is content. Because the signal switching device 100 displays whether the input signal is content, the user can tell which of the plurality of content types the input signal corresponds to.
<Other embodiments>
Although embodiments have been described in detail above with reference to the drawings, the specific configuration is not limited to those described, and various design changes and the like can be made.
For example, the model learning unit 120 of the signal switching device 100 may learn a model for each frequency and format. Likewise, the likelihood specifying unit 140 of the signal switching device 100 may specify the likelihood for each frequency and format of the signal for output.
Further, the model may be learned not only by unsupervised learning but also by supervised learning. That is, the model may be trained on a data set in which a signal is the input sample and a label indicating whether the signal is content is the output sample. In this case, the likelihood is calculated from the label value obtained by inputting a signal into the trained model.
The learning and the inference of the signal switching device 100 may be performed by separate devices.
<< Basic configuration >>
FIG. 10 is a schematic block diagram showing the basic configuration of the signal switching device 100 according to the present invention.
In the embodiments described above, the configuration shown in FIG. 1 was described as one embodiment of the signal switching device 100 according to the present invention; the basic configuration of the signal switching device 100 according to the present invention is as shown in FIG. 10.
That is, the signal switching device 100 according to the present invention has, as its basic configuration, the model storage unit 130, the likelihood specifying unit 140, the selection unit 180, and the signal output unit 190.
The model storage unit 130 stores the model used by the likelihood specifying unit 140. The model stored in the model storage unit 130 is not limited to a model learned by the model learning unit 120 and may be, for example, a model given in advance.
The selection unit 180 selects the output signal using the likelihoods that the likelihood specifying unit 140 has specified for the signals for output on each channel.
According to the signal switching device 100 of the basic configuration, the signal switching device 100 uses a model trained on the feature amounts of a signal related to content to specify the likelihood of each input signal with respect to the signal related to the content. As a result, when the channel on which the signal related to the content is received changes intermittently because of noise or a failure, the signal switching device 100 automatically switches to the output signal of the channel that is receiving the signal related to the content.
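The basic configuration (stored model, likelihood per channel, selection, output) can be sketched end to end as follows. Everything concrete here is an assumption for illustration: the single RMS-level feature, the Gaussian-shaped score used as the likelihood, and the names `rms`, `likelihood`, and `switch` do not come from the text, which leaves the feature amounts and model type open.

```python
import math


def rms(samples):
    # Assumed feature amount: root-mean-square level of one frame.
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def likelihood(model, samples):
    # Assumed model: mean and standard deviation of the RMS level of content.
    # The score is an unnormalized Gaussian, so it lies in (0, 1].
    z = (rms(samples) - model["mean"]) / model["std"]
    return math.exp(-0.5 * z * z)


def switch(model, channel_frames):
    """Pick the channel whose frame is most content-like and return its signal."""
    scores = {ch: likelihood(model, frame) for ch, frame in channel_frames.items()}
    best = max(scores, key=scores.get)
    return best, channel_frames[best]


model = {"mean": 0.5, "std": 0.1}            # stands in for the model storage unit 130
frames = {
    "ch1": [0.0, 0.0, 0.0, 0.0],             # silence, e.g. a failed feed
    "ch2": [0.5, -0.5, 0.5, -0.5],           # level typical of the content
}
channel, output_signal = switch(model, frames)
print(channel)
```

In this sketch the silent channel scores close to zero, so the channel carrying content-like signal is selected without manual intervention.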
FIG. 11 is a schematic block diagram showing the configuration of a computer according to at least one embodiment.
The computer 1100 includes a processor 1110, a main memory 1120, a storage 1130, and an interface 1140.
The signal switching device 100 described above is implemented on the computer 1100. The operation of each processing unit described above is stored in the storage 1130 in the form of a program. The processor 1110 reads the program from the storage 1130, loads it into the main memory 1120, and executes the above processing according to the program. The processor 1110 also reserves, in the main memory 1120, storage areas corresponding to each of the storage units described above, according to the program.
The program may realize only part of the functions to be performed by the computer 1100. For example, the program may perform its functions in combination with another program already stored in the storage 1130, or in combination with another program implemented on another device. In other embodiments, the computer 1100 may include a custom LSI (Large Scale Integrated Circuit) such as a PLD (Programmable Logic Device) in addition to, or in place of, the above configuration. Examples of PLDs include PAL (Programmable Array Logic), GAL (Generic Array Logic), CPLD (Complex Programmable Logic Device), and FPGA (Field Programmable Gate Array). In this case, some or all of the functions realized by the processor 1110 may be realized by the integrated circuit.
Examples of the storage 1130 include magnetic disks, magneto-optical disks, and semiconductor memories. The storage 1130 may be internal media directly connected to the bus of the computer 1100, or external media connected to the computer 1100 via the interface 1140 or a communication line. When the program is distributed to the computer 1100 over a communication line, the computer 1100 that receives it may load the program into the main memory 1120 and execute the above processing. In at least one embodiment, the storage 1130 is a non-transitory tangible storage medium.
The program may also realize only part of the functions described above. Further, the program may be a so-called difference file (difference program) that realizes the functions described above in combination with another program already stored in the storage 1130.
The signal switching device uses a model generated from the feature amounts of a signal related to content to output the likelihood of an input signal with respect to the signal related to the content, and switches the input signal accordingly. The user can therefore reduce the manual work of switching the input signal when silence or noise occurs.
 100 Signal switching device
 110 Input unit
 120 Model learning unit
 130 Model storage unit
 140 Likelihood specifying unit
 150 Determination unit
 160 Display unit
 170 Likelihood accumulation storage unit
 180 Selection unit
 190 Signal output unit
 210 Model update unit
 1100 Computer
 1110 Processor
 1120 Main memory
 1130 Storage
 1140 Interface

Claims (7)

  1.  A signal switching device comprising:
     a model storage unit that stores a trained model that has learned the feature amounts of a signal related to content;
     a likelihood specifying unit that uses the trained model to specify, for input signals input from a plurality of channels, a likelihood with respect to the signal related to the content;
     a selection unit that uses the likelihood to select an output signal from among the input signals input from the plurality of channels; and
     a signal output unit that outputs the selected output signal.
  2.  The signal switching device according to claim 1, further comprising a likelihood accumulation storage unit that accumulates and stores, for each of the channels, the likelihoods output over a certain period,
     wherein the selection unit selects the output signal using a value computed for each of the channels from the likelihoods accumulated and stored by the likelihood accumulation storage unit.
  3.  The signal switching device according to claim 1 or claim 2, wherein the selection unit selects the output signal from among the input signals input from the plurality of channels using the likelihood and information on a selection of the channel by a user.
  4.  The signal switching device according to any one of claims 1 to 3, further comprising a model update unit that updates the trained model stored in the model storage unit using the feature amounts of the output signal selected by the selection unit.
  5.  The signal switching device according to claim 1, wherein
     the model storage unit stores a trained model that has learned, for each content type, the feature amounts of signals related to a plurality of the contents,
     the likelihood specifying unit uses the trained model to specify, for each content type, the likelihood of each of the input signals input from the plurality of channels, and
     the signal switching device further comprises a determination unit that uses the likelihood to determine, for each content type, whether each of the input signals input from the plurality of channels is a signal related to content.
  6.  A signal switching method comprising:
     a step of storing a trained model that has learned the feature amounts of a signal related to content;
     a step of using the trained model to specify, for input signals input from a plurality of channels, a likelihood with respect to the signal related to the content;
     a step of using the likelihood to select an output signal from among the input signals input from the plurality of channels; and
     a step of outputting the selected output signal.
  7.  A recording medium storing a program for causing a computer to function as:
     a model storage unit that stores a trained model that has learned the feature amounts of a signal related to content;
     a likelihood specifying unit that uses the trained model to specify, for input signals input from a plurality of channels, a likelihood with respect to the signal related to the content;
     a selection unit that uses the likelihood to select an output signal from among the input signals input from the plurality of channels; and
     a signal output unit that outputs the selected output signal.
PCT/JP2019/018697 2019-05-10 2019-05-10 Signal switching device, signal switching method, and recording medium WO2020230184A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021519030A JPWO2020230184A1 (en) 2019-05-10 2019-05-10
PCT/JP2019/018697 WO2020230184A1 (en) 2019-05-10 2019-05-10 Signal switching device, signal switching method, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/018697 WO2020230184A1 (en) 2019-05-10 2019-05-10 Signal switching device, signal switching method, and recording medium

Publications (1)

Publication Number Publication Date
WO2020230184A1 true WO2020230184A1 (en) 2020-11-19

Family

ID=73289838

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/018697 WO2020230184A1 (en) 2019-05-10 2019-05-10 Signal switching device, signal switching method, and recording medium

Country Status (2)

Country Link
JP (1) JPWO2020230184A1 (en)
WO (1) WO2020230184A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006333426A (en) * 2004-07-09 2006-12-07 Victor Co Of Japan Ltd Automatic program selecting apparatus, automatic program selecting method, and automatic program selecting program
JP2007325117A (en) * 2006-06-02 2007-12-13 Sony Corp Receiving apparatus and method
JP2010062653A (en) * 2008-09-01 2010-03-18 Panasonic Corp Content selection device, program selection device, channel selection device, and content selection method
JP2011166252A (en) * 2010-02-05 2011-08-25 Funai Electric Co Ltd Television receiver

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008252667A (en) * 2007-03-30 2008-10-16 Matsushita Electric Ind Co Ltd System for detecting event in moving image
JP2011013383A (en) * 2009-06-30 2011-01-20 Toshiba Corp Audio signal correction device and audio signal correction method
JP6758890B2 (en) * 2016-04-07 2020-09-23 キヤノン株式会社 Voice discrimination device, voice discrimination method, computer program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROACH, MATTHEW ET AL.: "Video Genre Verification using both Acoustic and Visual Modes", IEEE WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, December 2002 (2002-12-01), pages 157 - 160, XP010642536 *

Also Published As

Publication number Publication date
JPWO2020230184A1 (en) 2020-11-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19928360

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021519030

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19928360

Country of ref document: EP

Kind code of ref document: A1