GB2552330B - Method and system for isolating and separating contributions in a composite signal - Google Patents

Publication number
GB2552330B
Authority
GB
United Kingdom
Prior art keywords
signal
artefact
cwt
interest
frequency spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
GB1612430.7A
Other versions
GB2552330A (en)
GB201612430D0 (en)
Inventor
Phol Cavalier Paul
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Cape Town
Original Assignee
University of Cape Town
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Cape Town
Priority to GB1612430.7A
Publication of GB201612430D0
Priority to PCT/IB2017/054303 (WO2018015867A1)
Publication of GB2552330A
Application granted
Publication of GB2552330B
Expired - Fee Related
Anticipated expiration

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
    • G10L19/0216Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation using wavelet decomposition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/148Wavelet transforms
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272Voice signal separating
    • G10L21/028Voice signal separating using properties of sound source
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02Preprocessing
    • G06F2218/04Denoising
    • G06F2218/06Denoising by applying a scale-space analysis, e.g. using wavelet analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/056Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; Identification or separation of instrumental parts by their characteristic voices or timbres
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/215Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
    • G10H2250/251Wavelet transform, i.e. transform with both frequency and temporal resolution, e.g. for compression of percussion sounds; Discrete Wavelet Transform [DWT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Auxiliary Devices For Music (AREA)

Description

METHOD AND SYSTEM FOR ISOLATING AND SEPARATING CONTRIBUTIONS IN A COMPOSITE SIGNAL
FIELD OF THE INVENTION
This invention relates to the field of digital signal processing. More specifically, but not exclusively, the invention relates to the isolation and separation of source contributions in a composite electronic signal utilising digital signal processing techniques.
BACKGROUND TO THE INVENTION
Compared to the Fourier transform, the theory relative to the wavelet transform is very young and its use is presently expanding across most of the domains where signal processing appears. The use of wavelets has however found a number of applications in the field of music. Most notably, wavelets have been proposed to be used to obtain signal transformation, equalization, pitch shifting, pitch detecting within a music signal and even decomposition of a music signal for later reconstruction. More recently, wavelet analysis has been used to de-noise signals. Wavelet transforms are similar in some ways to Fourier transforms, but differ in that the signal decomposition is done using a wavelet base or “mother” function over the plurality of time-versus-frequency spans, each span having a different scale.
Wavelet analysis of musical signals is useful because it shows features in both time and frequency simultaneously, so that the benefits of both domains can be exploited. Having access only to the signal in the time domain and to its Fourier transform in the frequency domain is of limited use unless there is a way to relate the two.
Previous methods for decomposition of music signals have proposed the use of bank filtering techniques, sometimes involving Discrete Wavelet Transforms (DWT), and comparison with databases of spectral signatures. All of these so-called Dictionary Based Methods (DBM), however, involve external information, parameters and/or functions that have no direct relation to the signal being analysed. Because musical recordings invariably include noise associated with the environments in which they were recorded and because sounds behave differently depending on the physical characteristics of the recording environment, commonly referred to as the “acoustics” of the recording environment, Dictionary Based Methods of music signal decomposition are inherently inaccurate as it is extremely unlikely that the dictionaries used would compensate for such recording specific parameters. Any matching of signatures in predefined databases to a music signal under investigation will therefore have inherently limited accuracy unless the samples used to create the signatures were recorded under exactly the same conditions, and include the same noise factors, as the music signal under investigation, a situation which is highly unlikely.
The continuous wavelet transform is similar to the Fourier transform (FT). However, since the wavelets are localized in time and frequency, while the sines and cosines have infinite lengths, the wavelets have to be shifted in time to transform the whole space. Thus, the coefficients are defined on a time/frequency space (while the FT coefficients are only on the frequency axis) by the projection of the signal y(t) on the contracted/dilated shifted wavelets:
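The expression that followed is not reproduced in this text. As a reference point only, the standard textbook form of the CWT coefficient is given here. The symbols a (scale), b (time shift) and ψ (base wavelet) are assumptions introduced for illustration, since the text names only the signal y(t):

```latex
C(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} y(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) \mathrm{d}t
```

Here the asterisk denotes complex conjugation, a > 0 contracts or dilates the wavelet, and b shifts it along the time axis.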
In CWT, the coefficients are highly redundant. The signal can be rebuilt from these coefficients with a double integration on time and frequency axes.
The preceding discussion of the background to the invention is intended only to facilitate an understanding of the present invention. It should be appreciated that the discussion is not an acknowledgment or admission that any of the material referred to was part of the common general knowledge in the art as at the priority date of the application.
SUMMARY OF THE INVENTION
In accordance with the invention there is provided a computer-implemented method of isolating and separating a source of interest contribution created by an artefact of interest from a composite electronic signal, the method conducted at a computing device including a processor and a memory component for storing computer-executable instructions and comprising the steps of: receiving the composite electronic signal and an identifier of the artefact of interest from a requesting entity; performing a windowed frequency domain transform on the electronic signal to produce a signal short-time frequency spectrum of the electronic signal; selecting a portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest; generating a base wavelet from the selected portion of the signal short-time frequency spectrum; performing a Continuous Wavelet Transform (CWT) on the composite electronic signal utilising the base wavelet to produce a CWT diagram including a set of coefficients which indicate levels of similarity between the composite electronic signal and scaled versions of the base wavelet; eliminating coefficients indicating relatively lower levels of similarity from the CWT diagram thus enhancing coefficients indicating relatively higher levels of similarity to produce an artefact dominant CWT diagram; constructing an artefact contribution signal from the artefact dominant CWT diagram; and transmitting the artefact contribution signal to the requesting entity.
Further features provide for the step of performing the windowed frequency domain transform on the electronic signal to include performing a Short Time Fourier Transform (STFT) on the entire electronic signal; and for the step of selecting the portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest to preferably include selecting a portion of the signal short-time frequency spectrum containing content generated by the artefact of interest only.
Still further features provide for the step of generating the base wavelet from the selected portion of the signal short-time frequency spectrum to include: selecting main peaks from the selected portion of the signal; creating a Dirac comb utilising the main peaks; convolving the selected main peaks with a Gaussian envelope or other envelope with a similar characteristic shape; and performing an Inverse Fast Fourier Transform (IFFT) of the convolved signal.
Yet further features provide for the step of constructing the artefact contribution signal from the artefact dominant CWT diagram to include: utilising a frequency spectrum of the base wavelet to construct a matrix of scaled spectra; and convolving the matrix with the artefact dominant CWT diagram to provide an artefact of interest total spectrogram.
Further features provide for the method to include one or more of the steps of: subtracting the artefact of interest spectrogram from the signal short-time frequency spectrum; repeating the method to isolate and separate contributions created by other artefacts of interest from the composite electronic signal; and performing an Inverse STFT on the artefact of interest spectrogram to revert to a time domain artefact contribution signal to the composite electronic signal or, alternatively, perform an Inverse CWT on the artefact dominant CWT diagram utilising the base wavelet to revert to the time domain artefact contribution signal. A still further feature provides for the step of selecting the portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest to be automated.
The invention extends to a system for isolating and separating a source of interest contribution created by an artefact of interest from a composite electronic signal, the system comprising a computing device including a memory component for storing computer-executable instructions and a processor for executing the computer-executable instructions, the computing device including: a receiving component for receiving the composite electronic signal and an identifier of the artefact of interest from a requesting entity; a frequency spectrum producing component for performing a windowed frequency domain transform on the electronic signal to produce a signal short-time frequency spectrum of the electronic signal; a selection component for selecting a portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest; a wavelet generating component for generating a base wavelet from the selected portion of the signal short-time frequency spectrum; a CWT diagram producing component for performing a Continuous Wavelet Transform (CWT) on the composite electronic signal utilising the base wavelet to produce a CWT diagram including a set of coefficients which indicate levels of similarity between the composite electronic signal and scaled versions of the base wavelet; an eliminating component for eliminating coefficients indicating relatively lower levels of similarity from the CWT diagram thus enhancing coefficients indicating relatively higher levels of similarity to produce an artefact dominant CWT diagram; a signal constructing component for constructing an artefact contribution signal from the artefact dominant CWT diagram; and a transmitting component for transmitting the artefact contribution signal to the requesting entity.
Further features provide for the frequency spectrum producing component to perform a Short Time Fourier Transform (STFT) on the entire electronic signal; and for the selection component to preferably select a portion of the signal short-time frequency spectrum containing content generated by the artefact of interest only.
Still further features provide for the wavelet generating component to: select main peaks from the selected portion of the signal; create a Dirac comb utilising the main peaks; convolve the selected main peaks with a Gaussian envelope or other envelope with a similar characteristic shape; and perform an Inverse Fast Fourier Transform (IFFT) of the convolved signal.
Yet further features provide for the signal constructing component to: utilise a frequency spectrum of the base wavelet to construct a matrix of scaled spectra; and convolve the matrix with the artefact dominant CWT diagram to provide an artefact of interest total spectrogram.
Further features provide for the system to include: a subtracting component for subtracting the artefact of interest spectrogram from the signal short-time frequency spectrum; and a signal reconstructing component for performing an Inverse STFT on the artefact of interest spectrogram to revert to a time domain artefact contribution signal to the composite electronic signal or, alternatively, perform an Inverse CWT on the artefact dominant CWT diagram utilising the base wavelet to revert to the time domain artefact contribution signal. A still further feature provides for the selection component to automatically select the portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest.
The invention extends to a computer program product for isolating and separating a source of interest contribution created by an artefact of interest from a composite electronic signal, the computer program product comprising a computer-readable medium having stored computer-readable program code for performing the steps of: receiving the composite electronic signal and an identifier of the artefact of interest from a requesting entity; performing a windowed frequency domain transform on the electronic signal to produce a signal short-time frequency spectrum of the electronic signal; selecting a portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest; generating a base wavelet from the selected portion of the signal short-time frequency spectrum; performing a Continuous Wavelet Transform (CWT) on the composite electronic signal utilising the base wavelet to produce a CWT diagram including a set of coefficients which indicate levels of similarity between the composite electronic signal and scaled versions of the base wavelet; eliminating coefficients indicating relatively lower levels of similarity from the CWT diagram thus enhancing coefficients indicating relatively higher levels of similarity to produce an artefact dominant CWT diagram; constructing an artefact contribution signal from the artefact dominant CWT diagram; and transmitting the artefact contribution signal to the requesting entity.
Further features provide for the computer-readable medium to be a non-transitory computer-readable medium and for the computer-readable program code to be executable by a processing circuit.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings:
Figure 1 is a schematic diagram illustrating an embodiment of a system for isolating and separating a source of interest contribution created by an artefact of interest from a composite electronic signal according to the invention;
Figure 2 is a block diagram illustrating components of a processor used in a system according to the invention;
Figure 3 is a functional block diagram illustrating an implementation of a method according to the invention;
Figure 4 is a functional block diagram illustrating a step of the block diagram of Figure 3 in more detail;
Figure 5 is a block diagram illustrating a computing device in which various aspects of the disclosure may be implemented; and
Figures 6A-6I are exemplary graphical representations of signals that may be obtained at various steps in a method according to the invention.
DETAILED DESCRIPTION WITH REFERENCE TO THE DRAWINGS
A system and method for isolating and separating a source of interest contribution created by an artefact of interest, from a composite electronic signal are provided. In operation, a request for isolating and separating the source of interest contribution from the composite electronic signal is received at a processor of a computing device from a requesting entity. It should be apparent that the requesting entity may be a module or component operating on the computing device or a separate or remote entity in data communication with the computing device. The request may include the composite electronic signal under investigation as well as an identifier of the artefact of interest. The computing device then performs the isolation and separation of the source of interest contribution from the composite electronic signal using the artefact of interest identifier, constructs an artefact contribution signal for which the artefact of interest was responsible and transmits the artefact contribution signal back to the requesting entity.
Figure 1 illustrates an embodiment of a system (100) for isolating and separating a contribution by an artefact of interest to a composite electronic signal, from the electronic signal. The system (100) comprises a computing device (102) including a memory component (104) and a processor (106) in data communication with it. The processor (106) is configured to execute computer-executable instructions stored on the memory component (104) and is capable of communicating with a requesting entity (108) over a communications network (110). In one embodiment the composite electronic signal is a digitised music recording (112) containing source contributions from a variety of artefacts which could, for example, include instruments, vocal contributions, percussion and even noise or other acoustic contributions of the environment in which the music recording (112) was recorded. In the example that follows as it relates to the isolation and extraction of artefact contributions to a music recording, a composite electronic signal may also be referred to as a “track”, as this term is commonly used among those skilled in the art.
As shown in more detail in the block diagram illustrated in Figure 2, the computing device (102) includes a receiving component (202) for receiving the composite electronic signal and an identifier of the artefact of interest from the requesting entity (108), a frequency spectrum producing component (204) for producing a spectrogram of the composite electronic signal, a selection component (206) for selecting a portion of the spectrogram containing frequency content for which the artefact of interest is predominantly responsible, a wavelet generating component (208) for generating a base wavelet from the selected portion of the spectrogram, a Continuous Wavelet Transform (CWT) component (210) for performing a CWT transform on the composite electronic signal utilising the base wavelet, an eliminating and/or clean-up component (212) for eliminating coefficients indicating relatively lower levels of similarity to the wavelet from the CWT diagram produced by the CWT component (210), a signal constructing component (214) for constructing an artefact contribution signal from the artefact dominant CWT diagram, and a transmitting component (216) for transmitting the artefact contribution signal back to the requesting entity (108).
Figure 3 illustrates a block diagram showing an implementation of a method (300) according to the invention in more detail. The method is applied to a composite music signal or “track” in an attempt to isolate and separate the contribution of a single artefact of interest, in this embodiment a single musical instrument, from the track. The track contains contributions by multiple musical instruments. The track is received (302) at the computing device from the requesting entity, which in this embodiment is a computer which is in communication with the computing device over the Internet, preferably together with an identification of the instrument whose contribution is required to be isolated and separated from the track. The identification may include any one or more of a variety of elements including, but not limited to, a name of the instrument, an identification of a time in the track at which the instrument is playing in isolation or a frequency signature of the instrument, to name but a few. It will be appreciated by those skilled in the art that numerous ways of identifying an instrument of interest may be presented. A spectrogram of the track is then extracted (304) by the frequency spectrum producing component. This can be done in a variety of known ways including, but not limited to, conducting a Short-Time Fourier Transform (STFT) of the track. In practice any windowed frequency domain transform may be used. The spectrogram is in essence a short-time frequency spectrum representing the track in the frequency domain. Once the spectrogram has been extracted (304), the selection component selects (306), preferably automatically although some user interaction may be required, a part of the spectrogram containing frequency content generated at least partially by the instrument. Ideally the selected portion should contain only frequency content generated by the instrument of interest although in practice this may be difficult.
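The spectrogram extraction step (304) can be sketched as follows. This is a minimal illustration, assuming NumPy, a Hann window, and hypothetical parameter names (`frame_len`, `hop`) that do not appear in the specification; any windowed frequency domain transform could stand in for it.

```python
import numpy as np

def stft_spectrogram(track, frame_len=1024, hop=256):
    """Return the magnitude short-time spectrum, shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(track) - frame_len) // hop
    # Cut the track into overlapping frames and taper each one with the window.
    frames = np.stack([track[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One real FFT per frame gives the short-time frequency spectrum.
    return np.abs(np.fft.rfft(frames, axis=1))

# Illustrative input (an assumption): one second of a 440 Hz tone at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
track = np.sin(2 * np.pi * 440 * t)

spec = stft_spectrogram(track)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sr / 1024  # centre frequency of the strongest bin
```

For this test tone the strongest bin lies within one bin width (about 7.8 Hz) of 440 Hz, which is the resolution limit of a 1024-sample frame at this sample rate.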
Typically, a part of the spectrogram containing the fewest harmonics will represent the instrument playing in isolation. This selection will also represent what is commonly referred to as the “timbre” of the instrument. The selection process may work well on the assumption that the part of the spectrogram with the fewest harmonics corresponds to a single instrument playing a single note (not a chord of notes). A wavelet is then generated (308) by the wavelet generating component using the selected portion of the spectrogram. This wavelet forms the base wavelet representing the instrument of interest and which is used for further analysis of the track. Once the base wavelet has been generated, a CWT is performed (310) on the original track using the base wavelet to produce a CWT diagram containing a set of coefficients which indicate levels of similarity between the track and scaled versions of the base wavelet, stronger or relatively higher coefficients indicating a high level of similarity to the base wavelet and weaker or relatively lower coefficients indicating a low level of similarity to the base wavelet. In general, the strongest coefficients occur at the fundamental frequencies or scales where the instrument under consideration is playing. Indeed, one of the core ideas around which the invention is based is that the timbre of an instrument is scaled according to the fundamental frequency played by the instrument, and does not vary from one note to another (the relative strengths of harmonics are constant notwithstanding the note played, which gives recognisability to the instrument). Similarly, the wavelet’s spectrum dilates or contracts according to its scale. The transform coefficient is the product of signal and wavelet spectra. In this way, the strongest transform coefficient occurs when the spectrum of the track matches the spectrum of the wavelet most closely.
The stronger coefficients occur principally at the scale corresponding to the strongest spectral component of the instrument. The time-scale representation can be converted to time-frequency representation using the central frequency of the wavelet.
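The CWT step (310) can be illustrated with a direct time-domain sketch: the track is correlated with dilated copies of the base wavelet, one row of coefficients per scale. This assumes NumPy, a real-valued base wavelet, and linear-interpolation resampling; the function names and the synthetic wavelet are hypothetical and are not taken from the specification.

```python
import numpy as np

def scaled_wavelet(base, scale):
    """Resample the base wavelet so that it is `scale` times longer (dilated)."""
    n_out = max(2, int(round(len(base) * scale)))
    x_old = np.linspace(0.0, 1.0, len(base))
    x_new = np.linspace(0.0, 1.0, n_out)
    # The 1/sqrt(scale) factor is the usual CWT energy normalisation.
    return np.interp(x_new, x_old, base) / np.sqrt(scale)

def cwt(track, base, scales):
    """Return coefficients of shape (len(scales), len(track))."""
    rows = []
    for s in scales:
        w = scaled_wavelet(base, s)
        # Correlating with a real wavelet equals convolving with its reverse.
        rows.append(np.convolve(track, w[::-1], mode="same"))
    return np.array(rows)

# Illustrative data (assumptions): a Gaussian-windowed cosine as the base
# wavelet, and a track at half the wavelet's frequency.
n = 256
k = np.arange(n) - n / 2
base = np.exp(-0.5 * (k / (n / 6)) ** 2) * np.cos(2 * np.pi * 8 * k / n)
track = np.cos(2 * np.pi * (4 / n) * np.arange(2048))

scales = np.array([0.5, 1.0, 2.0, 4.0])
coeffs = cwt(track, base, scales)
best_scale = scales[np.abs(coeffs).mean(axis=1).argmax()]
```

Because the track oscillates at half the base wavelet's frequency, the strongest row is the one at scale 2, where the dilated wavelet matches the track. Dividing the wavelet's central frequency by that scale converts it back to a frequency.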
Weaker coefficients indicating relatively lower levels of similarity between the track and the base wavelet may then be eliminated (312) from the CWT diagram through techniques such as thresholding in order to keep only the stronger coefficients. This in turn highlights the strongest convolution results and produces an instrument or artefact dominant CWT diagram which predominantly represents the contribution to the track of the instrument of interest.
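A simple form of the thresholding mentioned above is sketched here, assuming NumPy and a hard threshold set as a fraction of the largest coefficient magnitude. The `fraction` parameter and its default are hypothetical; the specification does not state how the threshold is chosen.

```python
import numpy as np

def keep_strong(coeffs, fraction=0.5):
    """Zero every coefficient whose magnitude is below `fraction` of the maximum."""
    magnitude = np.abs(coeffs)
    threshold = fraction * magnitude.max()
    return np.where(magnitude >= threshold, coeffs, 0.0)

# Toy CWT diagram: 2 scales by 3 time steps.
coeffs = np.array([[0.1, 0.9, -0.2],
                   [-1.0, 0.4, 0.6]])
dominant = keep_strong(coeffs)
```

With the toy input, only 0.9, -1.0 and 0.6 survive, since each has a magnitude of at least half the maximum. The sign of each kept coefficient is preserved so that the result can still be inverted.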
The signal constructing component then constructs (314) an instrument contribution signal from the instrument dominant CWT diagram, which is in turn returned (316) to the requesting entity by the transmitting component.
It should be apparent that the instrument contribution signal may be returned to the requesting entity as the artefact dominant CWT diagram, a spectrogram in the frequency domain or as a time domain signal. In the case of a spectrogram being returned, the signal constructing component may utilise a frequency spectrum of the wavelet to construct a matrix of scaled spectra which it then convolves with the instrument dominant CWT diagram to provide the spectrogram of the instrument’s total contribution to the track. In the case of a time domain signal being returned, the instrument of interest’s total spectrogram may be subjected to an Inverse STFT or, alternatively, an Inverse CWT may be performed on the artefact dominant CWT diagram, again utilising the base wavelet, to produce the time domain instrument contribution signal. It should be noted that by performing the Inverse CWT with an alternative wavelet other than the base wavelet previously calculated, the alternative wavelet perhaps representing a different instrument, the sound characteristics of the alternative instrument may in fact be injected into the original instrument’s contribution to the track.
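One possible reading of the spectrogram construction is sketched below. It assumes NumPy and interprets the combination of the matrix of scaled spectra with the CWT diagram as a weighted sum over scales at each time step. The specification says only that the two are convolved and gives no formula, so this interpretation and all function names are assumptions.

```python
import numpy as np

def scaled_spectra(base_spectrum, scales):
    """Row i holds the base spectrum compressed along frequency by scales[i]."""
    bins = np.arange(len(base_spectrum))
    # Dilating a wavelet by s in time compresses its spectrum by s: W_s(f) = W(s * f).
    return np.array([np.interp(bins * s, bins, base_spectrum, right=0.0)
                     for s in scales])

def artefact_spectrogram(dominant_coeffs, spectra):
    """Weight each scaled spectrum by the coefficient magnitude at each time step."""
    return np.abs(dominant_coeffs).T @ spectra  # shape (n_times, n_bins)

# Toy data: a base spectrum with a single line at bin 8, and two scales.
base_spectrum = np.zeros(32)
base_spectrum[8] = 1.0
spectra = scaled_spectra(base_spectrum, scales=[1.0, 2.0])

# Dominant CWT diagram, shape (2 scales, 3 time steps).
dominant = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.0, 2.0]])
spec = artefact_spectrogram(dominant, spectra)
```

In the toy case the first time step reproduces the line at bin 8, the second is silent, and the third places a line of twice the strength at bin 4, because scale 2 halves every frequency.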
Once the instrument of interest’s spectrogram has been obtained it may be subtracted (314) from the spectrogram of the original track. The process may then be repeated to isolate and separate the contributions of other instruments of interest from the track.
The block diagram illustrated in Figure 4 shows a more detailed process (400) by which the base wavelet may be generated by the wavelet generating component (208). The selected portion of the track spectrogram may first be subjected to a peak finding algorithm (402) during which main peaks in the signal may be selected. A Dirac comb may then be created (404) from these peaks which may in turn be convolved with a Gaussian envelope (406) or other envelope with similar characteristics, after which an Inverse FFT of the convolved signal may be calculated (408).
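The four steps of process (400) can be sketched as follows, assuming NumPy and a one-sided magnitude spectrum as input. The peak finder (local maxima, strongest `n_peaks` kept), the Gaussian width `sigma_bins`, the zero-phase assumption and the use of `irfft` as the inverse FFT are all illustrative choices that the specification does not fix.

```python
import numpy as np

def base_wavelet(spectrum_slice, n_peaks=5, sigma_bins=2.0):
    """Build a real base wavelet from a one-sided magnitude spectrum."""
    mag = np.asarray(spectrum_slice, dtype=float)

    # (402) Peak finding: local maxima, then keep the strongest n_peaks.
    interior = np.arange(1, len(mag) - 1)
    is_peak = (mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])
    peaks = interior[is_peak]
    peaks = peaks[np.argsort(mag[peaks])[::-1][:n_peaks]]

    # (404) Dirac comb: one impulse per peak, weighted by the peak height.
    comb = np.zeros_like(mag)
    comb[peaks] = mag[peaks]

    # (406) Convolve the comb with a Gaussian envelope.
    half = int(np.ceil(4 * sigma_bins))
    k = np.arange(-half, half + 1)
    gauss = np.exp(-0.5 * (k / sigma_bins) ** 2)
    smooth = np.convolve(comb, gauss, mode="same")

    # (408) Inverse FFT with zero phase, shifted so the wavelet is centred.
    wavelet = np.fft.fftshift(np.fft.irfft(smooth))
    return wavelet / np.max(np.abs(wavelet)), np.sort(peaks)

# Toy spectrum: a fundamental at bin 10 with two weaker harmonics.
spectrum = np.zeros(129)
spectrum[[10, 20, 30]] = [1.0, 0.5, 0.25]
wavelet, peaks = base_wavelet(spectrum)
```

With a 129-bin one-sided spectrum the wavelet has 256 samples and, because the spectrum is non-negative with zero phase, its largest value sits at the centre sample. The relative heights of the three impulses carry the timbre information into the wavelet.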
Figures 6A to 6I show exemplary graphical representations of signals that may be obtained at various steps of a method disclosed in this specification. Figure 6A, for example, shows a CWT diagram resulting from performing a CWT on a composite electric signal utilising a base wavelet. Figure 6A shows the CWT diagram in the real, time domain according to the wavelet scale and Figure 6B shows the same diagram but with scales converted to their equivalent frequencies.
Figure 6C in turn shows the diagram of Figure 6B smoothed by short time window averaging. Figure 6D may be obtained after weaker coefficients of Figure 6C are removed, thus showing only the relatively higher levels of similarity to the base wavelet.
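The smoothing and elimination steps behind Figures 6C and 6D may be sketched as follows. This is an example only: the moving average over coefficient magnitudes, the window length and the relative threshold are assumptions, as the specification leaves the exact elimination strategy open, and the names `smooth_in_time` and `eliminate_weak` are hypothetical.

```python
def smooth_in_time(diagram, window=3):
    """Moving average of coefficient magnitudes along time, for every scale row."""
    half = window // 2
    smoothed = []
    for row in diagram:
        mags = [abs(c) for c in row]
        out = []
        for i in range(len(mags)):
            lo, hi = max(0, i - half), min(len(mags), i + half + 1)
            out.append(sum(mags[lo:hi]) / (hi - lo))
        smoothed.append(out)
    return smoothed


def eliminate_weak(diagram, rel_threshold=0.6):
    """Zero every coefficient below rel_threshold times the largest coefficient."""
    peak = max(max(row) for row in diagram)
    return [[c if c >= rel_threshold * peak else 0.0 for c in row] for row in diagram]


# Toy CWT diagram: rows are scales, columns are time samples.
cwt_diagram = [[0.1, -0.2, 0.1, 0.2, -0.1, 0.1],
               [0.2, 1.0, -1.2, 1.1, 0.9, 0.3]]
dominant = eliminate_weak(smooth_in_time(cwt_diagram))
```

Here the first scale is removed entirely, while the second retains only the frames in which its smoothed similarity is relatively high.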
Figures 6E and 6F show examples of a base wavelet, in the real (6E) and frequency (6F) domains respectively, which may be used to perform an initial CWT on a composite electronic signal, but also to reconstruct a time domain artefact contribution signal from an artefact of interest spectrogram.
Figure 6G shows a resampled matrix of scaled spectra, which could be used to reconstruct an artefact contribution signal and Figures 6H and 6I show a reconstructed artefact contribution signal and a reconstructed artefact contribution signal with time averages, respectively.
In one embodiment of the invention the computing device may receive a single composite electronic signal and identifiers of multiple artefacts of interest that contributed to the composite signal and return multiple contribution signals, one for each of the artefacts of interest, to the requesting entity. A key development presented by this invention is the use of very specific wavelets that are automatically generated from the artefact characteristics extracted from the original composite electronic signal. This allows case-specific, tailored solutions to isolation and extraction of artefact contributions from the original composite signal. Such an approach is then easily transposable to various contexts where an artefact with a given spectral fingerprint needs to be highlighted, in single or multiple dimensions.
It should be noted that the use of wavelet formalism is not necessarily required and can be replaced by scaling, in the spectral domain, of a given filter bank. This scaling occurs naturally in the case of the CWT, and so scaling a filter bank is equivalent to performing a wavelet analysis, and can be described as such.
The techniques presented use continuous wavelet transforms using wavelets generated from sections within the original composite signal to be analysed, particularly from sections of the original composite signal where one source is predominant. The approach applies particularly to situations where a given source is present in a signal, but with a spectrum that varies in scale (contraction or dilation). This happens for instance in music, where an instrument has a given timbre that is contracted or dilated according to the fundamental frequency (pitch) played. Similarly in remote tracking and identification of objects, a target with a certain spectrum can see its spectrum contracted or dilated according to its distance from a receiver. In both cases, the method described allows the spectral signature of the instrument or other target to be isolated, despite its variation in scale throughout the signal. Consequently this methodology may be extended to all distance-decaying phenomena and quantities, in all dimensions (thus including 2D image processing).
Specifically, one of the key steps resides in isolating the signal of interest from the composite signal once and generating a wavelet from its spectrum. The CWT will then inherently perform the convolution of the entire signal with scaled versions of that wavelet, ultimately showing a diagram of similarity between the signal and the scaled wavelets. This similarity highlights the presence of the particular component of interest, with its scale, within the entire signal. A method has accordingly been developed that allows following a signal with a given signature that can vary in scale, using continuous wavelet transforms. As a wavelet is created from an example of that signature within the original composite signal, the transform automatically highlights the presence of that signature, possibly varying in scale, throughout the composite signal. This ultimately allows one source to be isolated from the signal, which permits signal decomposition and manipulation according to the composite signal’s source components.
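The similarity diagram described above may be illustrated with a small sketch that correlates a signal directly with scaled and shifted copies of a wavelet. The wavelet used here (a fundamental and one harmonic under a Gaussian envelope), its parameters `SIGMA` and `F0`, the function names and the list of scales are all assumptions chosen for the example; in the method described, the wavelet would instead be generated from the composite signal itself.

```python
import math

SIGMA = 8.0   # envelope width of the unscaled wavelet, in samples (assumed)
F0 = 0.125    # fundamental of the unscaled wavelet, in cycles per sample (assumed)


def base_wavelet(t):
    """Hypothetical base wavelet: a fundamental plus one harmonic, Gaussian envelope."""
    envelope = math.exp(-0.5 * (t / SIGMA) ** 2)
    return envelope * (math.cos(2 * math.pi * F0 * t)
                       + 0.5 * math.cos(4 * math.pi * F0 * t))


def scaled_wavelet(t, scale, shift):
    """Wavelet dilated by `scale`, centred on `shift`, with constant energy."""
    return base_wavelet((t - shift) / scale) / math.sqrt(scale)


def cwt(signal, scales):
    """Return {scale: [coefficient at each shift]} by direct correlation."""
    n = len(signal)
    diagram = {}
    for a in scales:
        reach = int(4 * SIGMA * a)  # the wavelet is negligible beyond this distance
        row = []
        for b in range(n):
            lo, hi = max(0, b - reach), min(n, b + reach + 1)
            row.append(sum(signal[t] * scaled_wavelet(t, a, b) for t in range(lo, hi)))
        diagram[a] = row
    return diagram


# Test signal: the same signature, dilated by a factor of 2 and centred on sample 128.
signal = [scaled_wavelet(t, 2.0, 128) for t in range(256)]
scales = [1.0, 1.5, 2.0, 3.0]
diagram = cwt(signal, scales)

# Locate the coefficient indicating the highest level of similarity.
best_scale, best_shift = max(
    ((a, b) for a in scales for b in range(len(signal))),
    key=lambda ab: abs(diagram[ab[0]][ab[1]]),
)
```

The strongest coefficient falls at the scale and position at which the signature was placed, which is the behaviour the diagram of similarity relies on. A practical implementation would more likely evaluate the transform in the frequency domain for speed.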
As described in the above example, one possible application of the invention may be the extraction of an instrument partition from a soundtrack containing contributions from several instruments. Another application may be the identification and classification of interference artefacts in signals, such as radio-frequency interference in radio-telescope arrays.
The proposed technique involves only information contained within the original composite signal and realizes an inherently adaptive filtering, unlike classical filtering techniques that operate in fixed spectral ranges.
An important difference of the current invention when compared to existing DWT techniques, as is for example discussed in United States patent number US6182018, is the generation of case-tailored wavelets, instead of resorting to conventional wavelets or wavelet databases existing in the literature. Furthermore, the DWT techniques are focused on the signal decomposition, and further statistical analysis of resulting coefficients to classify the coefficient sets, which are not necessarily involved here.
The method described may be applicable to all cases involving spectral scaling (dilation or contraction) of the artefact of interest. The founding ideas have been applied in potential field (gravity and magnetic) source characterization to recover distance and type of source in geological contexts. Potentially, this may apply to all distance-decaying phenomena and quantities where such spectral scaling occurs. The ideas proposed here cover musical instrument separation within a composite soundtrack, or the remote sensing and tracking of a target. Furthermore, multi-dimensional applications may be imagined. For instance 2D wavelets may be designed to match a particular shape (or fingerprint). The base concept may lead to localizing a given shape/fingerprint within a picture, give its size and even count the number of such shapes within the picture (for all sizes of interest). It may also provide an alternative to existing facial recognition techniques.
The current invention therefore relies on the use of the scale-adaptive spectral filtering of a composite signal, occurring inherently in the CWT, with wavelets generated from an artefact of interest as occurring in the actual composite signal. The analysing wavelet may act as a matching filter in the spectral domain, highlighting spectral signatures similar to itself (thus to the artefact of interest), with the additional ability of detecting these signatures no matter their scale.
As mentioned before with reference to Figure 1, the computing device (102) may include a processor (104) for executing the functions of components described, which may be provided by hardware or by software units executing on the computing device (102). The software units may be stored in a memory component (106) and instructions may be provided to the processor (104) to carry out the functionality of the described components. In some cases, for example in a cloud computing implementation, software units arranged to manage and/or process data on behalf of the computing device may be provided remotely. Some or all of the components may be provided by a software application downloadable onto and executable on the computing device.
Figure 5 illustrates an example of a computing device (500) in which various aspects of the disclosure may be implemented. The computing device (500) may be suitable for storing and executing computer program code. The various participants and elements in the previously described system diagrams may use any suitable number of subsystems or components of the computing device (500) to facilitate the functions described herein. The computing device (500) may include subsystems or components interconnected via a communication infrastructure (505) (for example, a communications bus, a cross-over bar device, or a network). The computing device (500) may include one or more central processors (510) and at least one memory component in the form of computer-readable media. In some configurations, a number of processors may be provided and may be arranged to carry out calculations simultaneously. In some implementations, a number of computing devices (500) may be provided in a distributed, cluster or cloud-based computing configuration and may provide software units arranged to manage and/or process data on behalf of remote devices.
The memory components may include system memory (515), which may include read only memory (ROM) and random access memory (RAM). A basic input/output system (BIOS) may be stored in ROM. System software may be stored in the system memory (515) including operating system software. The memory components may also include secondary memory (520). The secondary memory (520) may include a fixed disk (521), such as a hard disk drive, and, optionally, one or more removable-storage interfaces (522) for removable-storage components (523). The removable-storage interfaces (522) may be in the form of removable-storage drives (for example, magnetic tape drives, optical disk drives, etc.) for corresponding removable storage-components (for example, a magnetic tape, an optical disk, etc.), which may be written to and read by the removable-storage drive. The removable-storage interfaces (522) may also be in the form of ports or sockets for interfacing with other forms of removable-storage components (523) such as a flash memory drive, external hard drive, or removable memory chip, etc.
The computing device (500) may include an external communications interface (530) for operation of the computing device (500) in a networked environment enabling transfer of data between multiple computing devices (500). Data transferred via the external communications interface (530) may be in the form of signals, which may be electronic, electromagnetic, optical, radio, or other types of signal. The external communications interface (530) may enable communication of data between the computing device (500) and other computing devices including servers and external storage facilities. Web services may be accessible by the computing device (500) via the communications interface (530). The external communications interface (530) may also enable other forms of communication to and from the computing device (500) including voice communication, near field communication, radio frequency communications such as Bluetooth™, etc.
The computer-readable media in the form of the various memory components may provide storage of computer-executable instructions, data structures, program modules, software units and other data. A computer program product may be provided by a computer-readable medium having stored computer-readable program code executable by the central processor (510). A computer program product may be provided by a non-transient computer-readable medium, or may be provided via a signal or other transient means via the communications interface (530).
Interconnection via the communication infrastructure (505) allows the central processor (510) to communicate with each subsystem or component and to control the execution of instructions from the memory components, as well as the exchange of information between subsystems or components. Peripherals (such as printers, scanners, cameras, or the like) and input/output (I/O) devices (such as a mouse, touchpad, keyboard, microphone, and the like) may couple to the computing device (500) either directly or via an I/O controller (535). These components may be connected to the computing device (500) by any number of means known in the art, such as a serial port. One or more monitors (545) may be coupled via a display or video adapter (540) to the computing device (500).
The foregoing description has been presented for the purpose of illustration; it is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Persons skilled in the relevant art can appreciate that many modifications and variations are possible in light of the above disclosure.
Any of the steps, operations, components or processes described herein may be performed or implemented with one or more hardware or software units, alone or in combination with other devices. In one embodiment, a software unit is implemented with a computer program product comprising a non-transient computer-readable medium containing computer program code, which can be executed by a processor for performing any or all of the steps, operations, or processes described. Software units or functions described in this application may be implemented as computer program code using any suitable computer language such as, for example, Java™, C++, or Perl™ using, for example, conventional or object-oriented techniques. The computer program code may be stored as a series of instructions, or commands on a non-transitory computer-readable medium, such as a random access memory (RAM), a read-only memory (ROM), a magnetic medium such as a hard-drive, or an optical medium such as a CD-ROM. Any such computer-readable medium may also reside on or within a single computational apparatus, and may be present on or within different computational apparatuses within a system or network.
Flowchart illustrations and block diagrams of methods, systems, and computer program products according to embodiments are used herein. Each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may provide functions which may be implemented by computer readable program instructions. In some alternative implementations, the functions identified by the blocks may take place in a different order to that shown in the flowchart illustrations.
Some portions of this description describe the embodiments of the invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are commonly used by those skilled in the data processing arts to convey the substance of their work effectively to others skilled in the art. These operations, while described functionally, computationally, or logically, are understood to be implemented by computer programs or equivalent electrical circuits, microcode, or the like. The described operations may be embodied in software, firmware, hardware, or any combinations thereof.
It will be appreciated by those skilled in the art that the above description is by way of example only and that numerous changes and modifications may be made to the embodiments of the invention described without departing from the scope of the invention. For example, it will be apparent to those skilled in the art that different strategies may be employed to eliminate weaker coefficients from the CWT diagram as discussed above. One example of such an alternative strategy may require an additional iterative amplitude-matching step within the CWT. This may imply that the CWT employed may not be a traditional CWT by definition. Such a step may also be employed as a refinement to the already described steps and may involve using the scaled filter matrix that is currently used for reconstruction, but for the ‘de-construction’ part. The strategy would be to perform the STFT on the entire signal, giving the signal spectrogram. The spectrogram would then be filtered at each instant with the scaled filter bank, while checking that all peaks (fundamental as well as harmonics) match their expected amplitudes (the harmonic amplitudes would depend on the amplitude of the fundamental note/frequency). Consequently, when harmonic peaks do not match, the analysed peak may be set to zero (as not corresponding to the signature that is being sought). The result may be to keep only the fundamental frequencies for which all harmonics correspond to the signature that is being sought, thus producing the partition of the instrument. This strategy may not necessitate traditional Wavelet terminology, even though it may be inspired by it. The fundamental difference is that the resulting coefficient here would be obtained from a step-wise multiplication of a Wavelet signal in the frequency domain, whereas in classical CWT it is obtained from a Wavelet signal multiplication on the entire frequency domain.
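The amplitude-matching strategy outlined above may be sketched for a single spectrogram frame as follows. The sketch is illustrative only: it assumes that harmonics fall on integer multiples of the fundamental bin, that the signature is expressed as harmonic-to-fundamental amplitude ratios, and that a relative tolerance decides whether a harmonic matches. The function name `match_signature`, the parameter `tolerance` and the toy frame are hypothetical.

```python
def match_signature(frame, ratios, tolerance=0.25):
    """Keep a fundamental bin only if every harmonic has the expected amplitude.

    `ratios[i]` is the expected amplitude of harmonic (i + 2) relative to the
    fundamental. Bins whose harmonics do not match are set to zero.
    """
    kept = [0.0] * len(frame)
    for f in range(1, len(frame)):
        amp = frame[f]
        if amp <= 0.0:
            continue
        for h, ratio in enumerate(ratios, start=2):
            k = h * f
            if k >= len(frame):
                break  # harmonic out of range: cannot be verified, so reject
            expected = ratio * amp
            if abs(frame[k] - expected) > tolerance * expected:
                break  # harmonic amplitude does not fit the signature
        else:
            kept[f] = amp  # every harmonic matched
    return kept


# One frame containing two sources. The signature sought has harmonics at
# 0.5 and 0.25 times the fundamental amplitude.
frame = [0.0] * 32
frame[4], frame[8], frame[12] = 1.0, 0.5, 0.25     # matches the signature
frame[5], frame[10], frame[15] = 0.75, 0.125, 0.5  # different timbre

partition = match_signature(frame, ratios=[0.5, 0.25])
```

Only the fundamental at bin 4 survives in this example; the harmonics of the matching source and all peaks of the second source are set to zero, leaving what the description calls the partition of the instrument for that frame.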
It is also anticipated that the methods disclosed herein may find application in so-called “Blind Source Separation” or “Blind Signal Separation” (“BSS”) techniques.
The language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the disclosure of the embodiments of the invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.
Finally, throughout the specification and claims, unless the context requires otherwise, the word ‘comprise’ or variations such as ‘comprises’ or ‘comprising’ will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

Claims (15)

CLAIMS:
1. A computer-implemented method of isolating and separating a source-of-interest contribution created by an artefact of interest from a composite electronic signal, the method conducted at a computing device including a processor and a memory component for storing computer-executable instructions and comprising the steps of: receiving the composite electronic signal and an identifier of the artefact of interest from a requesting entity; performing a windowed frequency domain transform on the electronic signal to produce a signal short-time frequency spectrum of the electronic signal; selecting a portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest; generating a base wavelet from the selected portion of the signal short-time frequency spectrum; performing a Continuous Wavelet Transform (CWT) on the composite electronic signal utilising the base wavelet to produce a CWT diagram including a set of coefficients which indicate levels of similarity between the composite electronic signal and scaled versions of the base wavelet; eliminating coefficients indicating relatively lower levels of similarity from the CWT diagram thus enhancing coefficients indicating relatively higher levels of similarity to produce an artefact dominant CWT diagram; constructing an artefact contribution signal from the artefact dominant CWT diagram; and transmitting the artefact contribution signal to the requesting entity.
2. A method as claimed in claim 1, wherein the step of performing the windowed frequency domain transform on the electronic signal includes performing a Short Time Fourier Transform (STFT) on the entire electronic signal.
3. A method as claimed in claim 1 or claim 2, wherein the step of selecting the portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact-of-interest includes selecting a portion of the signal short-time frequency spectrum containing content generated by the artefact-of-interest only.
4. A method as claimed in any one of the preceding claims, wherein the step of generating the base wavelet from the selected portion of the signal short-time frequency spectrum includes one or more of the steps of: selecting main peaks from the selected portion of the signal; creating a Dirac comb utilising the main peaks; convolving the selected main peaks with a Gaussian envelope or other envelope with a similar characteristic shape; and performing an Inverse Fast Fourier Transform (IFFT) of the convolved signal.
5. A method as claimed in any one of the preceding claims, wherein the step of constructing the artefact contribution signal from the artefact dominant CWT diagram includes one or both of the steps of: utilising a frequency spectrum of the base wavelet to construct a matrix of scaled spectra; and convolving the matrix with the artefact dominant CWT diagram to provide an artefact of interest total spectrogram.
6. A method as claimed in any one of the preceding claims, which includes the step of subtracting the artefact-of-interest spectrogram from the signal short-time frequency spectrum.
7. A method as claimed in any one of the preceding claims, which includes repeating the method to isolate and separate contributions created by other artefacts-of-interest from the composite electronic signal.
8. A method as claimed in any one of the preceding claims, which includes the step of performing an Inverse STFT on the artefact of interest spectrogram to revert to a time domain artefact contribution signal to the composite electronic signal or perform an Inverse CWT on the artefact dominant CWT diagram utilising the base wavelet to revert to the time domain artefact contribution signal.
9. A method as claimed in any one of the preceding claims, wherein the step of selecting the portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest is automated.
10. A system for isolating and separating a source-of-interest contribution created by an artefact of interest from a composite electronic signal, the system comprising a computing device including a memory component for storing computer-executable instructions and a processor for executing the computer-executable instructions, the computing device including: a receiving component for receiving the composite electronic signal and an identifier of the artefact of interest from a requesting entity; a frequency spectrum producing component for performing a windowed frequency domain transform on the electronic signal to produce a signal short-time frequency spectrum of the electronic signal; a selection component for selecting a portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest; a wavelet generating component for generating a base wavelet from the selected portion of the signal short-time frequency spectrum; a CWT diagram producing component for performing a Continuous Wavelet Transform (CWT) on the composite electronic signal utilising the base wavelet to produce a CWT diagram including a set of coefficients which indicate levels of similarity between the composite electronic signal and scaled versions of the base wavelet; an eliminating component for eliminating coefficients indicating relatively lower levels of similarity from the CWT diagram thus enhancing coefficients indicating relatively higher levels of similarity to produce an artefact dominant CWT diagram; and a signal constructing component for constructing an artefact contribution signal from the artefact dominant CWT diagram; and a transmitting component for transmitting the artefact contribution signal to the requesting entity.
11. A system as claimed in claim 10, wherein the frequency spectrum producing component performs a Short Time Fourier Transform (STFT) on the entire electronic signal and the selection component selects a portion of the signal short-time frequency spectrum containing content generated by the artefact-of-interest only.
12. A system as claimed in claim 10 or claim 11, wherein the wavelet generating component: selects main peaks from the selected portion of the signal; creates a Dirac comb utilising the main peaks; convolves the selected main peaks with a Gaussian envelope or other envelope with a similar characteristic shape; and/or performs an Inverse Fast Fourier Transform (IFFT) of the convolved signal.
13. A system as claimed in any one of claims 10 to 12, wherein the signal constructing component: utilises a frequency spectrum of the base wavelet to construct a matrix of scaled spectra; and convolves the matrix with the artefact dominant CWT diagram to provide an artefact of interest total spectrogram.
14. A system as claimed in any one of claims 10 to 13, which includes: a subtracting component for subtracting the artefact-of-interest spectrogram from the signal short-time frequency spectrum; and/or a signal reconstructing component for performing an Inverse STFT on the artefact-of-interest spectrogram to revert to a time domain artefact contribution signal to the composite electronic signal or perform an Inverse CWT on the artefact dominant CWT diagram utilising the base wavelet to revert to the time domain artefact contribution signal.
15. A computer program product for isolating and separating a source-of-interest contribution created by an artefact of interest from a composite electronic signal, the computer program product comprising a computer-readable medium having stored computer-readable program code for performing the steps of: receiving the composite electronic signal and an identifier of the artefact of interest from a requesting entity; performing a windowed frequency domain transform on the electronic signal to produce a signal short-time frequency spectrum of the electronic signal; selecting a portion of the signal short-time frequency spectrum containing frequency content generated at least partially by the artefact of interest; generating a base wavelet from the selected portion of the signal short-time frequency spectrum; performing a Continuous Wavelet Transform (CWT) on the composite electronic signal utilising the base wavelet to produce a CWT diagram including a set of coefficients which indicate levels of similarity between the composite electronic signal and scaled versions of the base wavelet; eliminating coefficients indicating relatively lower levels of similarity from the CWT diagram thus enhancing coefficients indicating relatively higher levels of similarity to produce an artefact dominant CWT diagram; constructing an artefact contribution signal from the artefact dominant CWT diagram; and transmitting the artefact contribution signal to the requesting entity.

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB1612430.7A GB2552330B (en) 2016-07-18 2016-07-18 Method and system for isolating and separating contributions in a composite signal
PCT/IB2017/054303 WO2018015867A1 (en) 2016-07-18 2017-07-17 Method and system for isolating and separating contributions in a composite signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1612430.7A GB2552330B (en) 2016-07-18 2016-07-18 Method and system for isolating and separating contributions in a composite signal

Publications (3)

Publication Number Publication Date
GB201612430D0 GB201612430D0 (en) 2016-08-31
GB2552330A GB2552330A (en) 2018-01-24
GB2552330B true GB2552330B (en) 2019-07-31

Family

ID=56890711

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1612430.7A Expired - Fee Related GB2552330B (en) 2016-07-18 2016-07-18 Method and system for isolating and separating contributions in a composite signal

Country Status (2)

Country Link
GB (1) GB2552330B (en)
WO (1) WO2018015867A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117092206A (en) * 2023-08-09 2023-11-21 国网四川省电力公司电力科学研究院 Defect detection method for cable lead sealing area, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6182018B1 (en) * 1998-08-25 2001-01-30 Ford Global Technologies, Inc. Method and apparatus for identifying sound in a composite sound signal
US20110071376A1 (en) * 2009-09-24 2011-03-24 Nellcor Puritan Bennett Llc Determination Of A Physiological Parameter

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8954173B1 (en) * 2008-09-03 2015-02-10 Mark Fischer Method and apparatus for profiling and identifying the source of a signal

Also Published As

Publication number Publication date
WO2018015867A1 (en) 2018-01-25
GB2552330A (en) 2018-01-24
GB201612430D0 (en) 2016-08-31

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20200718