WO2022054414A1 - Sound signal processing system and sound signal processing method - Google Patents
Sound signal processing system and sound signal processing method
- Publication number
- WO2022054414A1 (PCT/JP2021/027054)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- sound signal
- signal processing
- unit
- processing system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/36—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using chaos theory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for
- G10K15/02—Synthesis of acoustic waves
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
Definitions
- the present disclosure relates to a sound signal processing system and a sound signal processing method.
- Patent Document 1 discloses a synthetic sound generator capable of generating a synthetic sound in which deterioration of quality is suppressed.
- the present disclosure provides a sound signal processing system capable of outputting a new sound signal that reflects the characteristics of another sound to one sound, and a sound signal processing method.
- The sound signal processing system includes a first acquisition unit that acquires recurrence plot information indicating the characteristics of a first sound, a second acquisition unit that acquires a sound signal of a second sound different from the first sound, a generation unit that generates, based on the recurrence plot information acquired by the first acquisition unit, a sound signal in which the characteristics of the first sound are reflected in the sound signal of the second sound acquired by the second acquisition unit, and an output unit that outputs the generated sound signal.
- the sound signal processing system and the sound signal processing method according to one aspect of the present disclosure can output a new sound signal that reflects the characteristics of another sound in one sound.
- FIG. 1 is a block diagram showing a functional configuration of a sound signal processing system according to an embodiment.
- FIG. 2 is a diagram showing an example of recurrence plot information when a sine wave is associated with the vertical axis and the horizontal axis.
- FIG. 3 is a diagram showing an example of recurrence plot information when white noise is associated with the vertical axis and the horizontal axis.
- FIG. 4 is a diagram for explaining the characteristics of the time series data indicated by the Recurrence plot information.
- FIG. 5 is a diagram showing a method of generating UpperRP from natural sounds.
- FIG. 6 is a diagram showing UpperRP stored in the storage unit.
- FIG. 7 is a diagram showing a sound signal of a seed sound stored in a storage unit.
- FIG. 8 is a sequence diagram of an operation example 1 of the sound signal processing system according to the embodiment.
- FIG. 9 is a diagram showing an example of a screen for selecting a seed sound and a sensibility word.
- FIG. 10 is a flowchart of a sound signal generation method.
- FIG. 11 is a sequence diagram of operation example 2 of the sound signal processing system according to the embodiment.
- FIG. 12 is a diagram showing an example of a seed sound and a selection screen of a natural sound having characteristics desired to be given to the seed sound.
- FIG. 1 is a block diagram showing a functional configuration of a sound signal processing system according to an embodiment.
- The sound signal processing system 10 includes a server device 20 and an information terminal 30. The sound signal processing system 10 can output a natural sound desired by the user with the characteristics (in other words, the regularity) of another natural sound reflected in it. Here, a natural sound means a sound generated in the natural world, such as the sound of water, the chirping of insects, or the cries of animals.
- the server device 20 includes a communication unit 21, a signal processing unit 22, and a storage unit 23.
- the communication unit 21 is a communication circuit (in other words, a communication module) for the server device 20 to communicate with the information terminal 30 via a wide area communication network 40 such as the Internet.
- the communication standard for communication performed by the communication unit 21 is not particularly limited.
- By processing sound signals, the signal processing unit 22 generates and outputs a sound signal in which the characteristics of another natural sound are reflected in the natural sound desired by the user.
- the signal processing unit 22 is realized by, for example, a microcomputer, but may be realized by a processor such as a DSP (Digital Signal Processor).
- the signal processing unit 22 includes a first acquisition unit 24, a second acquisition unit 25, a third acquisition unit 26, a generation unit 27, and an output unit 28 as functional components.
- The functions of the first acquisition unit 24, the second acquisition unit 25, the third acquisition unit 26, the generation unit 27, and the output unit 28 are realized by the microcomputer or the like constituting the signal processing unit 22 executing a computer program stored in the storage unit 23. The detailed functions of these components will be described later.
- the storage unit 23 is a storage device (memory) in which various information necessary for the signal processing unit 22 to process a sound signal, a computer program, and the like are stored.
- the storage unit 23 is an example of a first storage unit and a second storage unit.
- the storage unit 23 is realized by, for example, an HDD (Hard Disk Drive), but may be realized by a semiconductor memory.
- the information terminal 30 is an information terminal operated by the user to access the server device 20.
- the information terminal 30 is, for example, a portable information terminal such as a notebook personal computer, a smartphone, and a tablet terminal, but may be a stationary information terminal such as a desktop personal computer.
- the information terminal 30 includes a UI (User Interface) unit 31, a speaker 32, an information processing unit 33, and a storage unit 34.
- the UI unit 31 is a user interface device that accepts user operations and presents images to the user.
- the UI unit 31 is realized by an operation reception unit such as a touch panel or a keyboard, and a display unit such as a display panel.
- the speaker 32 is a sound output device that reproduces (that is, outputs sound) a sound signal provided by the server device 20.
- the information processing unit 33 performs information processing related to displaying an image on the display unit and outputting sound from the speaker 32.
- the information processing unit 33 is realized by, for example, a microcomputer, but may be realized by a processor.
- the image display function, the sound output function, and the like are realized by executing a computer program stored in the storage unit 34 by a microcomputer or the like constituting the information processing unit 33.
- the storage unit 34 is a storage device (memory) in which various information necessary for the information processing unit 33 to process a sound signal, a computer program, and the like are stored.
- the storage unit 34 is realized by, for example, a semiconductor memory.
- the sound signal processing system 10 can output the natural sound desired by the user by reflecting the characteristics of other natural sounds.
- a recurrence plot is used as a means for processing the sound signal in this way.
- Recurrence plot is one of the methods of nonlinear time series analysis, and the recurrence plot information obtained by recurrence plot is represented by a plan view. Recurrence plot information can be said to be two-dimensional array information.
- FIG. 2 is a diagram showing an example of recurrence plot information when a sine wave is associated with the vertical axis and the horizontal axis, and FIG. 3 is a diagram showing an example of recurrence plot information when white noise is associated with the vertical axis and the horizontal axis.
- In FIGS. 2 and 3, the black portions correspond to the presence of a plot, and the white portions correspond to the absence of a plot.
- FIG. 4 is a diagram for explaining the characteristics of the time series data indicated by the Recurrence plot information.
- the recurrence plot information indicates that the time series data has periodicity when the lines parallel to the center line (Line Of Identity) are lined up.
- the distance to the center line (width W1, width W2, etc. in FIG. 4) indicates a period.
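The relationship between periodicity and lines parallel to the Line Of Identity can be illustrated with a minimal sketch. The function name `recurrence_plot`, the absolute-difference distance, the threshold of 0.1, and the 8-sample period are all illustrative assumptions rather than values taken from the disclosure.

```python
import math

def recurrence_plot(series, threshold):
    """Return a square 0/1 matrix with 1 where |x[i] - x[j]| < threshold."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) < threshold else 0
             for j in range(n)] for i in range(n)]

# A sine wave with a period of 8 samples (illustrative value).
period = 8
sine = [math.sin(2 * math.pi * k / period) for k in range(32)]
rp = recurrence_plot(sine, threshold=0.1)

# Samples exactly one period apart recur, so they form a line parallel
# to the Line Of Identity at an offset equal to the period.
print(all(rp[i][i + period] == 1 for i in range(len(sine) - period)))
```

In this sketch the offset of the parallel line from the main diagonal equals the period in samples, which corresponds to the widths W1 and W2 described for FIG. 4.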
- The hierarchical recurrence plot is described in Non-Patent Document 1 (Fukino, Miwa, et al. "Coarse-Graining Time Series Data: Recurrence Plot of Recurrence Plots and Its Application for Music." Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 2, no. 26, 2016, pp. 0-12, doi: 10.1063/1.4941371). The contents of Non-Patent Document 1 are also included in the present disclosure.
- The sound signal processing system 10 uses UpperRP (Recurrence Plot), which is recurrence plot information obtained by a hierarchical recurrence plot, to output a natural sound desired by the user with the characteristics of another natural sound reflected in it. UpperRP is generated based on the other natural sound (hereinafter, also simply referred to as a natural sound).
- FIG. 5 is a diagram showing a method of generating UpperRP from natural sounds.
- the generation unit 27 of the server device 20 performs a process of generating UpperRP from natural sounds, but this process may be performed by a device other than the server device 20.
- a plurality of UpperRPs may be stored in the storage unit 23.
- the generation unit 27 divides the sound signal (time waveform of sound) of natural sound into n processing units defined by the window width T1 and the shift width T2 ((a) in FIG. 5).
- the window width T1 is, for example, 2.0 sec
- the shift width T2 is, for example, 0.5 sec
- the number n of the processing units is, for example, about several tens to several hundreds.
- the window width T1, the shift width T2, and the specific numerical values of the number n of the processing units are not particularly limited.
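The division into processing units can be sketched as a sliding window. The function name `split_into_units` and the 10 Hz sample rate are assumptions chosen to keep the example small; only the window width of 2.0 sec and the shift width of 0.5 sec come from the text above.

```python
def split_into_units(signal, sample_rate, window_sec=2.0, shift_sec=0.5):
    """Split a sampled signal into overlapping processing units.

    window_sec corresponds to T1 and shift_sec to T2.
    """
    window = int(round(window_sec * sample_rate))
    shift = int(round(shift_sec * sample_rate))
    if window <= 0 or shift <= 0:
        raise ValueError("window and shift must cover at least one sample")
    units = []
    start = 0
    while start + window <= len(signal):
        units.append(signal[start:start + window])
        start += shift
    return units

# 10 seconds of dummy samples at an assumed 10 Hz sample rate.
signal = list(range(100))
units = split_into_units(signal, sample_rate=10)
print(len(units), len(units[0]))
```

With these assumed values each unit holds 20 samples and consecutive units start 5 samples apart, so the number n of processing units grows with the length of the recording.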
- the generation unit 27 generates a Short term RP (hereinafter also referred to as SRP) from each of the n processing units ((b) in FIG. 5).
- When the time-series data of the sound signal corresponding to one processing unit is associated with both the vertical axis and the horizontal axis, the i-th state (specifically, the amplitude of the sound signal) on the vertical axis is denoted s(i), the j-th state on the horizontal axis is denoted s(j), and d is a function indicating distance, the SRP is expressed as follows.
- $\mathrm{SRP}(i, j) = d(s(i), s(j))$, where $1 \le i, j \le m$ (m is a natural number of 2 or more).
- The SRP is, for example, matrix data composed of $m \times m$ elements. In (b) of FIG. 5, the SRP is schematically illustrated in grayscale.
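As a minimal sketch of the SRP computation, the following builds the m × m matrix for one processing unit. The choice of the absolute difference of amplitudes as the distance and the function name `short_term_rp` are assumptions; the text does not fix a specific distance function for this step.

```python
def short_term_rp(unit):
    """Return the m x m matrix SRP(i, j) = |s(i) - s(j)| for one unit.

    The absolute amplitude difference is an assumed distance function.
    """
    m = len(unit)
    return [[abs(unit[i] - unit[j]) for j in range(m)] for i in range(m)]

# Three amplitude samples standing in for one processing unit.
srp = short_term_rp([0.0, 0.5, -1.0])
for row in srp:
    print(row)
```

The result is symmetric with a zero diagonal, since each sample has zero distance to itself.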
- the generation unit 27 generates UpperRP by associating n SRPs generated from each of the n processing units with the vertical axis and the horizontal axis ((c) in FIG. 5).
- UpperRP is, for example, $n \times n$ matrix data.
- the UpperRP is schematically illustrated in grayscale.
- $\mathrm{URP}(i, j) = D(\mathrm{SRP}(i), \mathrm{SRP}(j))$, where $1 \le i, j \le n$ (n is a natural number of 2 or more). D is a function indicating distance, for example, a function that obtains the Euclidean distance between SRPs (that is, between matrices).
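The two-level construction can be sketched as follows, using the Euclidean distance between matrices for D as mentioned above. The helper names and the absolute-difference distance used inside each SRP are assumptions for illustration.

```python
import math

def short_term_rp(unit):
    """SRP(i, j) = |s(i) - s(j)| (assumed distance) for one processing unit."""
    m = len(unit)
    return [[abs(unit[i] - unit[j]) for j in range(m)] for i in range(m)]

def matrix_distance(a, b):
    """Euclidean distance between two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def upper_rp(units):
    """URP(i, j) = D(SRP(i), SRP(j)) over all n processing units."""
    srps = [short_term_rp(u) for u in units]
    n = len(srps)
    return [[matrix_distance(srps[i], srps[j]) for j in range(n)]
            for i in range(n)]

# Three tiny processing units; the first two are identical.
urp = upper_rp([[0, 1], [0, 1], [0, 3]])
print(urp)
```

Identical units yield a distance of zero, so repeated structure in the original sound shows up as small values away from the diagonal of the UpperRP.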
- the generation unit 27 can also generate the UpperRP after the threshold processing by performing the threshold processing on the UpperRP ((d) in FIG. 5).
- The signal processing unit 22 plots the position of each of the $n \times n$ elements of the original UpperRP data when the element is less than the threshold value, and does not plot the position when the element is equal to or greater than the threshold value. As a result, the UpperRP after threshold processing is generated.
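The threshold processing amounts to an elementwise comparison. In this sketch the function name `threshold_rp` and the use of 1 for a plotted position and 0 for an unplotted one are assumptions.

```python
def threshold_rp(matrix, threshold):
    """Plot (1) where an element is below the threshold, otherwise 0."""
    return [[1 if value < threshold else 0 for value in row]
            for row in matrix]

binary = threshold_rp([[0.0, 2.0, 1.0],
                       [2.0, 0.0, 0.4],
                       [1.0, 0.4, 0.0]], threshold=1.0)
print(binary)
```

An element exactly equal to the threshold is not plotted, matching the "equal to or greater than" condition in the text.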
- FIG. 6 is a diagram showing UpperRP stored in the storage unit 23.
- a plurality of UpperRPs are stored in the storage unit 23, and each of the plurality of UpperRPs is associated with a sensibility word for the natural sound that is the source of the UpperRP.
- Kansei words (sensibility words) are words (for example, modifiers) for expressing the impression of a sound, such as "intense" and "gentle"; in other words, they are impression words. Kansei words are defined based on subjective evaluation results (for example, questionnaire results) from a plurality of users who have heard the natural sound.
- The storage unit 23 may store only UpperRPs based on natural sounds judged to be comfortable in the subjective evaluation results of a plurality of users.
- each of the plurality of UpperRPs may be associated with the sound signal of the natural sound that is the source of the UpperRP.
- the time length of this sound signal is relatively short, for example, about several seconds to 10 seconds.
- FIG. 7 is a diagram showing a sound signal of a seed sound stored in the storage unit 23.
- the seed sound is an example of the second sound, and means the sound (sound source) that is the source (seed) of the sound that is finally output.
- the seed sound is, for example, a natural sound.
- The time length of the sound signal of the seed sound is relatively short, for example, about several seconds to 10 seconds. Labels (identification information) such as "wind sound" and "bird song" are attached to the sound signal of the seed sound.
- Using the UpperRPs and the sound signals of the seed sounds stored in the storage unit 23 of the server device 20, the sound signal processing system 10 can output a sound in which the characteristics indicated by an UpperRP (that is, the characteristics of the natural sound from which the UpperRP was generated) are reflected in the seed sound desired by the user.
- operation example 1 of such a sound signal processing system 10 will be described.
- FIG. 8 is a sequence diagram of an operation example 1 of the sound signal processing system 10.
- The information processing unit 33 of the information terminal 30 displays a screen for selecting a seed sound and a Kansei word on the UI unit 31 (display unit) (S10).
- FIG. 9 is a diagram showing an example of a screen for selecting a seed sound and a sensibility word.
- When such a selection screen is displayed, the user performs an operation of selecting a desired seed sound and an operation of selecting a desired Kansei word on the UI unit 31 of the information terminal 30, and the UI unit 31 accepts these operations (S11).
- the information terminal 30 transmits a request for a sound signal to the server device 20 (S12).
- the request for the sound signal includes seed sound information indicating the seed sound selected by the user and Kansei word information indicating the Kansei word selected by the user.
- the communication unit 21 of the server device 20 receives the request for the sound signal.
- Based on the seed sound information included in the received request, the second acquisition unit 25 acquires, from the storage unit 23, the sound signal of the seed sound indicated by the seed sound information (that is, the seed sound selected by the user) from among the sound signals of the plurality of seed sounds (FIG. 7) (S13).
- The third acquisition unit 26 acquires the Kansei word information included in the received request (S14). Based on the Kansei word information acquired by the third acquisition unit 26, the first acquisition unit 24 acquires, from the storage unit 23, the UpperRP (recurrence plot information) associated with the Kansei word indicated by the Kansei word information from among the plurality of UpperRPs (FIG. 6) (S15).
- Based on the UpperRP acquired by the first acquisition unit 24, the generation unit 27 generates a sound signal in which the characteristics of the sound from which that UpperRP was generated are reflected in the sound signal of the seed sound acquired by the second acquisition unit 25 (S16). The details of the sound signal generation method will be described later.
- the output unit 28 outputs the generated sound signal (S17).
- the communication unit 21 transmits the output sound signal to the information terminal 30 (S18).
- the information terminal 30 receives a sound signal.
- the information processing unit 33 reproduces a sound signal using the speaker 32 (S19).
- As a result, the speaker 32 outputs a sound in which the characteristics of the Kansei word selected by the user (more specifically, the characteristics of the natural sound associated with that Kansei word) are reflected in the seed sound desired by the user.
- FIG. 10 is a flowchart of a sound signal generation method.
- The generation unit 27 normalizes each element of the UpperRP acquired by the first acquisition unit 24 in step S15 by the maximum value in the UpperRP, and generates an array NU whose elements are (1 - normalized UpperRP element) (S16a).
- the array NU means a dissimilarity matrix.
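Step S16a can be sketched as an elementwise transformation. The function name `normalized_complement` and the handling of an all-zero input are assumptions; the sketch only computes (1 - element / maximum) and does not interpret the result further.

```python
def normalized_complement(matrix):
    """Return the array whose elements are 1 - (element / maximum element)."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        # Assumed convention: an all-zero matrix maps to all ones.
        return [[1.0 for _ in row] for row in matrix]
    return [[1.0 - value / peak for value in row] for row in matrix]

nu = normalized_complement([[0.0, 2.0],
                            [4.0, 0.0]])
print(nu)
```

After this step every element lies between 0 and 1, with the largest original element mapped to 0.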
- The generation unit 27 performs dimensionality reduction on the array NU by multidimensional scaling or the like, and calculates YU(1), YU(2), ..., YU(n), which indicate the increase/decrease relationship of the total value of the elements of SRP(1), SRP(2), ..., SRP(n) included in the UpperRP at each time (S16b).
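The text does not say which variant of multidimensional scaling is used. As one possible reading, the sketch below applies classical (Torgerson) multidimensional scaling to reduce a symmetric dissimilarity matrix to one coordinate per time index. The function name `classical_mds_1d` is hypothetical, and power iteration is swapped in for a full eigendecomposition to keep the example dependency-free; it assumes the dominant eigenvalue of the centered matrix is positive.

```python
import math

def classical_mds_1d(dissimilarity, iterations=200):
    """One-dimensional classical MDS of a symmetric dissimilarity matrix."""
    n = len(dissimilarity)
    sq = [[dissimilarity[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenpair of B.
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n
        v = [x / norm for x in w]
        eigenvalue = norm
    return [math.sqrt(eigenvalue) * x for x in v]

# Pairwise distances of three points located at 0, 1 and 3 on a line.
distances = [[0.0, 1.0, 3.0],
             [1.0, 0.0, 2.0],
             [3.0, 2.0, 0.0]]
y = classical_mds_1d(distances)
print([round(abs(y[i] - y[j]), 6) for i, j in ((0, 1), (1, 2), (0, 2))])
```

The recovered coordinates are only defined up to sign and translation, so what carries over to later steps is the pattern of increases and decreases between consecutive values, not their absolute level.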
- The generation unit 27 generates at least one SRP (hereinafter, also referred to as a seed sound SRP) based on the sound signal of the seed sound acquired by the second acquisition unit 25 in step S13 (S16c).
- The method of generating the SRP from the sound signal is the same as the method described with reference to (a) and (b) in FIG. 5. For example, one SRP is generated.
- The generation unit 27 generates seed sounds SRP(1) to (n) based on the generated seed sound SRP and YU(1), YU(2), ..., YU(n) (S16d). For example, the generation unit 27 regards the generated seed sound SRP as corresponding to SRP(1) of the UpperRP (that is, the generated seed sound SRP is seed sound SRP(1)), and generates seed sound SRP(2) by changing each element of seed sound SRP(1) so that the change from seed sound SRP(1) to seed sound SRP(2) matches the change from YU(1) to YU(2).
- Various methods are conceivable for generating seed sound SRP(2) from seed sound SRP(1) (that is, for changing each element of seed sound SRP(1) to match the change from YU(1) to YU(2)). For example, all the elements of seed sound SRP(1) may be increased or decreased in the same manner, or only the values of the elements at positions where a diagonal line exists in seed sound SRP(1) may be increased or decreased. It is also conceivable to create, in advance, a group of templates for the increase/decrease method from comfortable sounds obtained by subjective evaluation, and to increase or decrease the elements by referring to these templates.
- the generation unit 27 also generates the seed sounds SRP (3) to (n) in the same manner as the seed sound SRP (2).
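One of the options mentioned above, changing all elements in the same manner, can be sketched as follows. The exact update rule is not given in the text, so the additive offset proportional to YU(k) - YU(k-1), the `gain` parameter, the clamping at zero, and the function name `generate_seed_srps` are all hypothetical choices made for illustration.

```python
def generate_seed_srps(seed_srp, yu, gain=1.0):
    """Derive seed sound SRP(1)..SRP(n) from one seed SRP and YU(1)..YU(n).

    Assumed rule: every element is shifted by gain * (YU(k) - YU(k-1)),
    and negative results are clamped to zero.
    """
    srps = [[list(row) for row in seed_srp]]
    for k in range(1, len(yu)):
        delta = gain * (yu[k] - yu[k - 1])
        previous = srps[-1]
        srps.append([[max(0.0, value + delta) for value in row]
                     for row in previous])
    return srps

seed = [[0.0, 1.0],
        [1.0, 0.0]]
seed_srps = generate_seed_srps(seed, yu=[0.0, 0.5, 0.2])
print(len(seed_srps), seed_srps[1])
```

Other rules mentioned in the text, such as changing only the elements on diagonal lines or consulting templates, would replace the inner update while keeping the same loop over k.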
- the generation unit 27 generates a sound signal based on the generated seed sounds SRP (1) to (n) (S16e).
- The generation unit 27 also uses multidimensional scaling or the like in generating the sound signal (waveform synthesis).
- Specifically, for each of seed sounds SRP(1) to (n), the generation unit 27 normalizes each element by the maximum value in that seed sound SRP, and generates arrays NS(1) to (n) whose elements are (1 - normalized seed sound SRP element).
- The array NS means a dissimilarity matrix.
- The generation unit 27 performs dimensionality reduction on the arrays NS(1) to (n) by multidimensional scaling or the like, and calculates YN(1) to (n), which are time-series data of the sound waveforms of seed sounds SRP(1) to (n) at each time.
- The generation unit 27 then connects adjacent YNs smoothly by performing fade-in/fade-out processing, averaging processing, or the like. In this way, the generation unit 27 can generate a sound signal from seed sounds SRP(1) to (n).
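Connecting adjacent waveform segments with a fade-out and fade-in over a shared overlap can be sketched as a linear crossfade. The function name `crossfade_concat`, the linear weights, and the overlap length are assumptions; the text only states that fade-in/fade-out or averaging processing is used.

```python
def crossfade_concat(segments, overlap):
    """Join waveform segments, linearly crossfading `overlap` samples."""
    out = list(segments[0])
    for segment in segments[1:]:
        if overlap > len(out) or overlap > len(segment):
            raise ValueError("overlap is longer than a segment")
        head = out[:len(out) - overlap]
        tail = out[len(out) - overlap:]
        mixed = []
        for k in range(overlap):
            weight = (k + 1) / (overlap + 1)  # fade-in weight of the new segment
            mixed.append((1.0 - weight) * tail[k] + weight * segment[k])
        out = head + mixed + list(segment[overlap:])
    return out

# Two constant segments make the crossfade region easy to inspect.
joined = crossfade_concat([[1.0, 1.0, 1.0, 1.0],
                           [0.0, 0.0, 0.0, 0.0]], overlap=2)
print(joined)
```

Each join shortens the total length by the overlap, so the overlap would need to be taken into account when choosing how the segments are spaced in time.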
- Next, operation example 2 of the sound signal processing system 10 will be described with reference to FIG. 11. The information processing unit 33 of the information terminal 30 displays, on the UI unit 31 (display unit), a screen for selecting a seed sound and a natural sound having the characteristics to be given to the seed sound (S20).
- FIG. 12 is a diagram showing an example of a seed sound and a selection screen of a natural sound having characteristics desired to be given to the seed sound. In operation example 2, it is not necessary for each of the plurality of UpperRPs to be associated with the Kansei word.
- When such a selection screen is displayed, the user performs an operation of selecting a desired seed sound and an operation of selecting a desired natural sound on the UI unit 31 of the information terminal 30, and the UI unit 31 accepts these operations (S21).
- the information terminal 30 transmits a request for a sound signal to the server device 20 (S22).
- the request for the sound signal includes seed sound information indicating the seed sound selected by the user and natural sound information indicating the natural sound selected by the user.
- the communication unit 21 of the server device 20 receives the request for the sound signal.
- Based on the seed sound information included in the received request, the second acquisition unit 25 acquires, from the storage unit 23, the sound signal of the seed sound indicated by the seed sound information (that is, the seed sound selected by the user) from among the sound signals of the plurality of seed sounds (FIG. 7) (S23).
- the third acquisition unit 26 acquires the natural sound information included in the request for the received sound signal (S24).
- Based on the natural sound information acquired by the third acquisition unit 26, the first acquisition unit 24 acquires, from the storage unit 23, the UpperRP (recurrence plot information) associated with the natural sound indicated by the natural sound information from among the plurality of UpperRPs (FIG. 6) (S25).
- Based on the UpperRP acquired by the first acquisition unit 24, the generation unit 27 generates a sound signal in which the characteristics of the sound from which that UpperRP was generated are reflected in the sound signal of the seed sound acquired by the second acquisition unit 25 (S26).
- the output unit 28 outputs the generated sound signal (S27).
- the communication unit 21 transmits the output sound signal to the information terminal 30 (S28).
- the information terminal 30 receives a sound signal.
- the information processing unit 33 reproduces a sound signal using the speaker 32 (S29).
- the speaker 32 outputs a sound in which the characteristics (characteristics) of the natural sound selected by the user are reflected in the seed sound desired by the user.
- the seed sound is a natural sound, but may be an artificial sound (musical piece, electronic sound, etc.), or may include both a natural sound and an artificial sound.
- the sound that is the source of the plurality of UpperRPs is a natural sound, but it may be an artificial sound, or may include both a natural sound and an artificial sound.
- the seed sound and the sound that is the source of the plurality of UpperRP do not have to be the same kind of sound.
- The seed sound may be a natural sound and the sound that is the source of the plurality of UpperRPs may be an artificial sound.
- UpperRP was used as recurrence plot information.
- the recurrence plot information was generated by a two-tiered hierarchical recurrence plot.
- the recurrence plot information may be generated by a one-layer recurrence plot (that is, a non-layered recurrence plot) or by a multi-level recurrence plot of two or more layers.
- The sound signal processing system 10 includes: a first acquisition unit 24 that acquires recurrence plot information indicating the characteristics of a first sound; a second acquisition unit 25 that acquires a sound signal of a second sound different from the first sound; a generation unit 27 that generates, based on the recurrence plot information acquired by the first acquisition unit 24, a sound signal in which the characteristics of the first sound are reflected in the sound signal of the second sound acquired by the second acquisition unit 25; and an output unit 28 that outputs the generated sound signal.
- the recurrence plot information is UpperRP
- the first sound is the sound that is the source of UpperRP
- the second sound is the seed sound.
- Such a sound signal processing system 10 can output a new sound signal that reflects the characteristics of the first sound in the second sound by using the recurrence plot information. For example, a method of generating a new song using a machine learning model that has learned a large number of songs (see, for example, https://openai.com/blog/jukebox/) is known. The method requires a huge amount of learning data, and it takes time to build a machine learning model. In addition, the amount of information processing when generating music is large. On the other hand, the sound signal processing system 10 can output a new sound signal while reducing the amount of information processing by using the recurrence plot information.
- the sound signal processing system 10 further includes a storage unit 23 in which a plurality of sound signals of the second sound are stored.
- the second acquisition unit 25 acquires the sound signal of the second sound selected by the user from the sound signals of the plurality of second sounds from the storage unit 23.
- the storage unit 23 in this case is an example of the first storage unit.
- Such a sound signal processing system 10 can output a new sound signal that reflects the characteristics of the first sound in the second sound desired by the user.
- The sound signal processing system 10 further includes a storage unit 23 in which a plurality of pieces of recurrence plot information are stored in association with Kansei words, and a third acquisition unit 26 that acquires Kansei word information indicating the Kansei word designated by the user.
- the first acquisition unit 24 acquires the recurrence plot information associated with the Kansei word indicated by the Kansei word information acquired by the third acquisition unit 26 from the storage unit 23 among the plurality of recurrence plot information.
- the storage unit 23 in this case is an example of the second storage unit.
- Such a sound signal processing system 10 can output a new sound signal that reflects the characteristics of the first sound corresponding to the user's desired sensibility word for the second sound.
- the recurrence plot information is information obtained by hierarchically recurrence plotting the sound signal of the first sound.
- Such a sound signal processing system 10 reflects the characteristics of the first sound in the second sound by using the recurrence plot information obtained by hierarchically recurrence plotting the sound signal of the first sound. It is possible to output the sound signal of the new sound.
- the first sound is a natural sound.
- Such a sound signal processing system 10 can output a new sound signal that reflects the characteristics of a natural sound different from the second sound in the second sound.
- the second sound is a natural sound.
- Such a sound signal processing system 10 can output a new sound signal that reflects the characteristics of the first sound in the natural sound.
- The sound signal processing system 10 includes a microcomputer and a memory. By executing a computer program stored in the memory, the microcomputer acquires recurrence plot information indicating the characteristics of a first sound, acquires a sound signal of a second sound different from the first sound, generates, based on the acquired recurrence plot information, a sound signal in which the characteristics of the first sound are reflected in the acquired sound signal of the second sound, and outputs the generated sound signal.
- Such a sound signal processing system 10 can output a new sound signal that reflects the characteristics of the first sound in the second sound by using the recurrence plot information.
- The sound signal processing method executed by a computer such as the sound signal processing system 10 includes: a first acquisition step S15 of acquiring recurrence plot information indicating the characteristics of a first sound; a second acquisition step S13 of acquiring a sound signal of a second sound different from the first sound; a generation step S16 of generating, based on the recurrence plot information acquired in the first acquisition step S15, a sound signal in which the characteristics of the first sound are reflected in the sound signal of the second sound acquired in the second acquisition step S13; and an output step S17 of outputting the generated sound signal.
- Such a sound signal processing method can output a new sound signal that reflects the characteristics of the first sound in the second sound by using the recurrence plot information.
- the recurrence plot information is generated based on the sound signal, but may be generated based on the time series data other than the sound signal.
- recurrence plot information may be generated based on stock price fluctuation data, temperature fluctuation data, and the like.
- the sound signal processing system is realized by a plurality of devices, but may be realized as a single device.
- the sound signal processing system may be realized as a single device corresponding to an information terminal, or may be realized as a single device corresponding to a server device.
- the functional components of the sound signal processing system may be distributed to the plurality of devices in any way.
- the information terminal may include some or all of the functional components included in the server device.
- the communication method between the devices in the above embodiment is not particularly limited.
- a relay device (not shown) may be interposed between the two devices.
- the order of processing described in the above embodiment is an example.
- the order of the plurality of processes may be changed, or the plurality of processes may be executed in parallel.
- Another processing unit may execute the processing that is executed by a specific processing unit.
- each component may be realized by executing a software program suitable for each component.
- Each component may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.
- each component may be realized by hardware.
- each component may be a circuit (or an integrated circuit). These circuits may form one circuit as a whole, or may be separate circuits from each other. Further, each of these circuits may be a general-purpose circuit or a dedicated circuit.
- The general or specific aspects of the present disclosure may be realized by a system, an apparatus, a method, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM. Further, they may be realized by any combination of a system, an apparatus, a method, an integrated circuit, a computer program, and a recording medium.
- the present disclosure may be executed as a sound signal processing method executed by a computer such as a sound signal processing system, or may be realized as a program for causing a computer to execute such a sound signal processing method.
- The present disclosure may be realized as a computer-readable non-transitory recording medium on which such a program is recorded.
- the program here includes an application program for making a general-purpose information terminal function as the information terminal of the above embodiment.
- the sound signal processing system of the present disclosure is useful as a system capable of outputting a new sound signal that reflects the characteristics of another sound in one sound.
- 10 Sound signal processing system
- 20 Server device
- 21 Communication unit
- 22 Signal processing unit
- 23 Storage unit (first storage unit, second storage unit)
- 24 First acquisition unit
- 25 Second acquisition unit
- 26 Third acquisition unit
- 27 Generation unit
- 28 Output unit
- 30 Information terminal
- 31 UI unit
- 32 Speaker
- 33 Information processing unit
- 34 Storage unit
- 40 Wide area communication network
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Acoustics & Sound (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Stereophonic System (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022547424A JP7394411B2 (ja) | 2020-09-08 | 2021-07-19 | 音信号処理システム、及び、音信号処理方法 |
| US18/046,062 US12511098B2 (en) | 2020-09-08 | 2022-10-12 | Sound signal processing system and sound signal processing method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020150215 | 2020-09-08 | | |
| JP2020-150215 | 2020-09-08 | | |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/046,062 Continuation US12511098B2 (en) | 2020-09-08 | 2022-10-12 | Sound signal processing system and sound signal processing method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022054414A1 (ja) | 2022-03-17 |
Family
ID=80632516
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2021/027054 Ceased WO2022054414A1 (ja) | 2020-09-08 | 2021-07-19 | 音信号処理システム、及び、音信号処理方法 |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US12511098B2 |
| JP (1) | JP7394411B2 |
| WO (1) | WO2022054414A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2026009783A1 (ja) * | 2024-07-05 | 2026-01-08 | パナソニックIpマネジメント株式会社 | 装着型デバイス、再生方法、及びプログラム |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2022054414A1 (ja) * | 2020-09-08 | 2022-03-17 | パナソニックIpマネジメント株式会社 | 音信号処理システム、及び、音信号処理方法 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH1097267A (ja) * | 1996-09-24 | 1998-04-14 | Hitachi Ltd | 声質変換方法および装置 |
| JP2008116588A (ja) * | 2006-11-01 | 2008-05-22 | National Institute Of Advanced Industrial & Technology | 特徴抽出装置及び方法並びにプログラム |
| WO2008102594A1 (ja) * | 2007-02-19 | 2008-08-28 | Panasonic Corporation | 力み変換装置、音声変換装置、音声合成装置、音声変換方法、音声合成方法およびプログラム |
| WO2008149547A1 (ja) * | 2007-06-06 | 2008-12-11 | Panasonic Corporation | 声質編集装置および声質編集方法 |
| EP3200188A1 (en) * | 2016-01-27 | 2017-08-02 | Telefonica Digital España, S.L.U. | Computer implemented methods for assessing a disease through voice analysis and computer programs thereof |
| JP6474518B1 (ja) * | 2018-07-02 | 2019-02-27 | 坪井 純子 | 簡易操作声質変換システム |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7139628B2 (ja) * | 2018-03-09 | 2022-09-21 | ヤマハ株式会社 | 音処理方法および音処理装置 |
| JP7183556B2 (ja) | 2018-03-26 | 2022-12-06 | カシオ計算機株式会社 | 合成音生成装置、方法、及びプログラム |
| KR102093822B1 (ko) * | 2018-11-12 | 2020-03-26 | 한국과학기술연구원 | 음원 분리 장치 |
| US10573312B1 (en) * | 2018-12-04 | 2020-02-25 | Sorenson Ip Holdings, Llc | Transcription generation from multiple speech recognition systems |
| JP7496514B2 (ja) * | 2019-06-06 | 2024-06-07 | パナソニックIpマネジメント株式会社 | コンテンツ選択方法、コンテンツ選択装置及びコンテンツ選択プログラム |
| EP4009629A4 (en) * | 2019-08-02 | 2022-09-21 | NEC Corporation | SPEECH PROCESSING DEVICE, SPEECH PROCESSING METHOD AND RECORDING MEDIA |
| US11562761B2 (en) * | 2020-07-31 | 2023-01-24 | Zoom Video Communications, Inc. | Methods and apparatus for enhancing musical sound during a networked conference |
| WO2022054414A1 (ja) * | 2020-09-08 | 2022-03-17 | パナソニックIpマネジメント株式会社 | 音信号処理システム、及び、音信号処理方法 |
| US11723568B2 (en) * | 2020-09-10 | 2023-08-15 | Frictionless Systems, LLC | Mental state monitoring system |
| US11432094B2 (en) * | 2020-10-08 | 2022-08-30 | Sony Group Corporation | Three-dimensional (3D) audio notification for vehicle |
| JP7619375B2 (ja) * | 2020-11-25 | 2025-01-22 | ヤマハ株式会社 | 音響処理方法、音響処理システム、電子楽器およびプログラム |
| US11133023B1 (en) * | 2021-03-10 | 2021-09-28 | V5 Systems, Inc. | Robust detection of impulsive acoustic event onsets in an audio stream |
| KR20240001831A (ko) * | 2022-06-28 | 2024-01-04 | 현대모비스 주식회사 | Asd 시스템의 적응형 음량 제어 시스템 및 그 방법 |
2021
- 2021-07-19 WO PCT/JP2021/027054 patent/WO2022054414A1/ja not_active Ceased
- 2021-07-19 JP JP2022547424A patent/JP7394411B2/ja active Active
2022
- 2022-10-12 US US18/046,062 patent/US12511098B2/en active Active
Non-Patent Citations (1)
| Title |
|---|
| KEN’ICHI SAWAI , YOSHITO HIRATA , KAZUYUKI AIHARA: "Music Similarity Measures using a Time Series Analysis Technique", INTERNATIONAL SYMPOSIUM ON NONLINEAR THEORY AND ITS APPLICATIONS, 18 October 2009 (2009-10-18), pages 643 - 646, XP055911417, DOI: 10.34385/proc.43.C3L-A3 * |
Also Published As
| Publication number | Publication date |
|---|---|
| US20230057629A1 (en) | 2023-02-23 |
| JP7394411B2 (ja) | 2023-12-08 |
| JPWO2022054414A1 | 2022-03-17 |
| US12511098B2 (en) | 2025-12-30 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12051439B2 (en) | Method and system for learning and using latent-space representations of audio signals for audio content-based retrieval | |
| CN110189741B (zh) | 音频合成方法、装置、存储介质和计算机设备 | |
| CN109920409A (zh) | 一种声音检索方法、装置、系统及存储介质 | |
| JP7394411B2 (ja) | 音信号処理システム、及び、音信号処理方法 | |
| Howard et al. | Four-part choral synthesis system for investigating intonation in a cappella choral singing | |
| WO2023181570A1 (ja) | 情報処理方法、情報処理システムおよびプログラム | |
| Thio et al. | A minimal template for interactive web-based demonstrations of musical machine learning | |
| WO2021187395A1 (ja) | パラメータ推論方法、パラメータ推論システム、及びパラメータ推論プログラム | |
| JP2025172909A (ja) | コード推定装置およびコード推定方法 | |
| US20230419932A1 (en) | Information processing device and control method thereof | |
| WO2022264461A1 (ja) | 情報処理システム及び情報処理方法 | |
| CN1130686C (zh) | 具有演奏风格变换装置的卡拉ok装置 | |
| Delgado et al. | A state of the art on computational music performance | |
| US9293124B2 (en) | Tempo-adaptive pattern velocity synthesis | |
| JP4799333B2 (ja) | 楽曲分類方法、楽曲分類装置及びコンピュータプログラム | |
| CN120077430A (zh) | 用于同步通信的音频合成 | |
| Cosma et al. | Automatic music generation using machine learning | |
| CN118658247B (zh) | 生成提示音的方法、装置、计算机设备和计算机程序产品 | |
| JP2020184092A5 (ja) | 情報処理方法、情報処理システムおよびプログラム | |
| US12367010B1 (en) | Systems and methods for facilitating audio processing | |
| Lin et al. | VocalistMirror: A Singer Support Interface for Avoiding Undesirable Facial Expressions | |
| FRATTICIOLI | MambaTransfer: raw audio musical timbre transfer using selective state-space models | |
| JP2019168620A (ja) | 合成音生成装置、方法、及びプログラム | |
| WO2025126548A1 (ja) | 情報処理システム、情報処理方法および情報処理プログラム | |
| JP2018106212A (ja) | 音響解析方法および音響解析装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21866374; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022547424; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 21866374; Country of ref document: EP; Kind code of ref document: A1 |