WO2020249870A1 - A method for processing a music performance - Google Patents

A method for processing a music performance

Info

Publication number
WO2020249870A1
Authority
WO
WIPO (PCT)
Prior art keywords
music
signal
score
performance
background
Prior art date
Application number
PCT/FI2020/050418
Other languages
French (fr)
Inventor
Tommi Ilmonen
Original Assignee
Tadadaa Oy
Priority date
Filing date
Publication date
Priority to FI20195503
Application filed by Tadadaa Oy filed Critical Tadadaa Oy
Publication of WO2020249870A1 publication Critical patent/WO2020249870A1/en

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H 1/0008 — Details of electrophonic musical instruments; associated control or indicating means
    • G10H 1/125 — Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour, by filtering complex waveforms using a digital filter
    • G10H 1/368 — Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems, displaying animated or moving pictures synchronized with the music or audio part
    • G10H 2210/056 — Musical analysis for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; identification or separation of instrumental parts by their characteristic voices or timbres
    • G10H 2210/091 — Musical analysis for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance
    • G10H 2220/015 — Musical staff, tablature or score displays, e.g. for score reading during a performance
    • G10H 2250/055 — Filters for musical processing or musical effects; filter responses, filter architecture, filter coefficients or control parameters therefor

Abstract

A method for processing a music performance is provided. The method comprises the steps of providing a background music to accompany a music performance, selectively filtering the signal of said background music to significantly reduce the signal energy relating to at least one instrument score at certain limited frequency bands, providing a live music performance, playing said background music through loudspeakers so that a human performer can hear the background music while simultaneously performing music. The combined sounds of the background music and the music performance are captured through at least one microphone to form a combined sound signal, and the combined sound signal is analyzed within said limited frequency bands to determine the score of the music performance and to compare it to the filtered score.

Description

TITLE OF THE INVENTION
A METHOD FOR PROCESSING A MUSIC PERFORMANCE
FIELD OF THE INVENTION
The present invention relates to a music performance system and method thereof. In particular, the invention provides a method and system with capabilities such as music extraction, music creation, providing feedback to a user and performance scoring.
BACKGROUND OF THE INVENTION
Various music learning and karaoke systems are known, where the performance of a musician or a singer that sings along is mixed with prerecorded musical accompaniment. The prerecorded music may be processed by extraction of features related to melody, pitch, rhythm, loudness, and timbre for transcribing music data. Indeed, most karaoke systems use re-created or re-recorded background music. Some systems also eliminate the main or lead voice from the original track.
In karaoke systems, the performance is usually visualized in order to aid or score the performance of the user in some way. Display of required metadata or characteristics, such as time-aligned lyrics and melody information, is a typical functionality of any karaoke system.
Computer implemented methods for generating real-time performance feedback for a user who is playing an instrument are known. Such methods may be based on comparison of single notes or chords, or of the harmonic user content and the harmonic music track parameters. Often these systems use background music - an audio signal that is played to the human performer through headphones or loudspeakers as a musical reference. In its most simple form, background music can contain only “click” information that indicates the rhythm of the music. More elaborate background music can contain selected instruments from the music, or even the full musical performance.
Known systems either merely mix the vocals or music tune that is to be provided by the user to the background track, or attempt to eliminate or cancel out from the background a music score to be replaced by the score to be performed by the user.
Such methods encounter several challenges:
- it is difficult to identify only the instrument/voice of the human performer; other instruments may be at least partially present in the captured signal as well, as several instruments and/or vocals have frequency ranges that overlap;
- headphones are needed to get a sufficient level of separation between the background music and the human performer;
- feedback to the user regarding the quality of his or her performance is produced afterwards, or by visualization in some way. In either case, there is little real-time guidance to the user on how close to the original removed background music the user is playing, as any kind of intuitive feedback is lacking. For example, the evaluation of the pitch, which is often the single most important musical attribute, is usually restricted to musical notes;
- visualization of the real-time performance of a music instrument is not easy to implement in a meaningful way, as it would distract the player, and because prior art methods do not process the background music data and mix it with a user-played score in a manner that lends itself to producing an easily recognizable visualization in real time.
There has thus been a persistent need for music performance systems which are capable of processing a playing performance and a background music and mixing them in real time, to give perceivable feedback to the player on his or her playing proficiency in comparison with a benchmark. An easy-to-follow display feedback about the playing proficiency has also not been disclosed in the prior art so far.
It is an aim of the present invention to provide a simple and efficient method and a system for processing a music performance which eliminates or alleviates the aforementioned problems. One aspect of the invention concerns a method for processing a music performance, the method comprising the steps of:
- providing a background music to accompany a music performance;
- selectively filtering the signal of said background music to significantly reduce the signal energy relating to at least one instrument score at certain limited frequency bands;
- providing a live music performance;
- playing said background music through loudspeakers so that a human performer can hear the background music while simultaneously performing music;
- capturing the combined sounds of the background music and the music performance through at least one microphone to form a combined sound signal;
- analyzing said combined sound signal within said limited frequency bands to determine the score of said music performance and to compare it to said filtered score.
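The claimed steps can be illustrated end-to-end with a short sketch. The following is a purely illustrative toy example, not part of the disclosed embodiments: the background music is synthetic, the selective filtering is realized by FFT masking (one of several options the description mentions), and the reserved range of 550-2200 Hz is taken from the detailed description.

```python
import numpy as np

SR = 44100  # sample rate (Hz), as used in the detailed description

def make_tone(freq, dur=0.5, amp=1.0):
    """Synthesize a sine tone as a stand-in for one instrument part."""
    t = np.arange(int(SR * dur)) / SR
    return amp * np.sin(2 * np.pi * freq * t)

def filter_reserved_range(signal, lo, hi, atten_db=20):
    """Attenuate the reserved band [lo, hi] Hz by FFT masking - one simple
    realization of the 'selectively filtering' step (an assumption; the
    description also mentions band-stop filters and controlled synthesis)."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / SR)
    band = (freqs >= lo) & (freqs <= hi)
    spec[band] *= 10 ** (-atten_db / 20)
    return np.fft.irfft(spec, n=len(signal))

def dominant_freq_in_band(signal, lo, hi):
    """Analyze the combined signal only within the reserved band."""
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1 / SR)
    band = (freqs >= lo) & (freqs <= hi)
    return freqs[band][np.argmax(spec[band])]

# Background: a 440 Hz accompaniment plus a 660 Hz part the performer replaces.
background = make_tone(440) + make_tone(660)
filtered_bg = filter_reserved_range(background, 550, 2200)

# The microphone hears the filtered background plus the performer (650 Hz).
combined = filtered_bg + make_tone(650)
detected = dominant_freq_in_band(combined, 550, 2200)
print(detected)  # 650.0 - the performer's pitch, despite the backing track
```

Because the background energy inside the reserved band has been attenuated before playback, the strongest in-band component of the microphone signal is the performer's, which is what makes the comparison against the filtered score possible.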
In some embodiments, the selectively filtered background signal relates to the score of one instrument, and the music performance consists of the score of one instrument played by a human.
In some embodiments, the analysis of said combined sound signal is done in a frequency domain or in a time domain, or both in frequency- and time domains.
In some embodiments, the analysis is performed with the aid of a band-pass filter bank.
In further embodiments of the invention, various embodiments of the inventive method may include one or several of the following bulleted features:
• the limited frequency bands are selected in real-time to match an expected frequency band of the signal of said musical performance;
• the background music source is a background soundtrack;
• the background music is generated by audio synthesis;
• audio synthesis is controlled to filter out sounds in certain limited frequency bands;
    • in the background music signal, at least some of the frequencies of the harmonic overtones of a fundamental note of said at least one instrument score are identified within said limited frequency bands, and are then at least partly filtered from said background music;
    • the sampling of said background music comprises overlapping sine or cosine half-wave sampling sequences.
In some embodiments, the method comprises the additional step of displaying said music performance graphically on a display, where deviations in timing and/or pitch of the score of the musical performance from said background music are indicated in the context of each note played.
The invention offers considerable advantages. A human performer may listen to a background music and play along with it with her instrument, trying to keep the same tempo and tune. Immediate sound feedback, and in some embodiments visual feedback, is provided in real time, without disrupting the player performing the music.
The various embodiments of the invention are characterized by what is stated in the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates an arrangement capable of carrying out at least some embodiments of the present invention;
Fig. 2 shows a signal of the sound of a single instrument;
Fig. 3 shows a signal of the sound of a multi-instrument ensemble;
Fig. 4 shows a frequency spectrum of an instrument signal;
Fig. 5 shows a spectrogram of a recording of a monophonic melody;
Fig. 6 shows samples of the spectrum from Fig. 5;
Fig. 7 shows the spectrum of Fig. 6 with markers added;
Fig. 8 shows the first few seconds of a viola playing a melody accompanied by a backing track;
Fig. 9 shows samples of the spectrum from Fig. 8 at the same time as in Fig. 6;
Fig. 10 shows the same spectrum as in Fig 9 with markers;
Fig. 11 shows the effect of attenuating frequencies in the spectrogram of Fig. 8;
Fig. 12 shows samples of the spectrum of Fig. 11;
Fig. 13 shows the combined sound of the human performer and the backing track;
Fig. 14 shows samples of the spectrum with a filtered backing track combined with a viola sound;
Fig. 15 shows the detected harmonics peaks of Fig. 14 with added marks;
Fig. 16 shows a signal from a piano recording;
Fig. 17 shows a short segment sampled from the signal of Fig. 16;
Fig. 18 shows the segment of Fig. 17 with a raised cosine weighting function applied;
Fig. 19 shows a series of segments of Fig. 16 after windowing and weighting;
Fig. 20 shows a set of weighted analysis windows of Fig. 19 with an offset;
Fig. 21 illustrates an example apparatus capable of supporting at least some embodiments of the present invention;
Fig. 22 shows a feedback system according to some embodiments of the invention.
DETAILED DESCRIPTION OF EMBODIMENTS
In Fig. 1 is shown an arrangement capable of carrying out at least some embodiments of the present invention. A computing unit 11 (e.g. a laptop, mobile phone, tablet, etc.) is connected to internal or external loudspeakers 12 that play a background music to accompany a music performance. A human performer 10 is playing along with an instrument, such as a violin or viola 14, for example. The player 10 listens to the background music and plays along with it, while trying to keep the same tempo and tune. The sound is captured by a microphone 13, which may be an external one as shown, or it may be an internal microphone of the computing unit 11. The recorded audio signal is transferred to the computing unit 11, which performs analysis on the signal to detect the pitch, timing, volume and other musical parameters from the acoustic signal.
In Fig. 2 is shown an exemplary signal 20 of the sound of a single instrument, like the violin. This signal is picked up by a microphone and played as such or with only modest processing through the loudspeaker.
In Fig. 3 is shown an exemplary signal 30 of the sound of a multi-instrument ensemble. This signal will be processed as explained in connection with Figs. 4 and 5, and played as the background music through the loudspeaker.
Fig. 4 shows a frequency spectrum of an instrument signal. The spectrum of an ensemble signal would have frequency peaks across a broader bandwidth, as is obvious to one skilled in the art. For clarity, only a few peaks relating to one instrument are shown. A limited frequency band 40 between 200 and 1600 Hz is selected to be filtered out. This band may contain the fundamental note and all or some harmonic overtones (multiples) of the instrument, such as a guitar, violin, flute, etc. The frequency band spectrum may be created by a Fast Fourier Transform (FFT) of the original sound signal. An FFT of 4096 samples at a 44.1 kHz sampling rate will produce about 11 transforms per second. Not all significant frequency components need to be filtered out. For training purposes, or in order to be able to better follow the rhythm or beat of that particular instrument, it may be beneficial for the player of the instrument to have some remnants or “humming” of the original score of her instrument left in the background music.
Referring now to Figs. 16-20, in Fig. 16 is shown an exemplary real-world signal from a piano recording. To perform analysis on the signal, it is split into short segments. In Fig. 17 a short segment (also called a window) of the signal is sampled to create a spectral view of the recording. The segment has been created by taking 4096 samples of the original digitized signal. In Fig. 18 a raised cosine window weighting function has been applied to the segment of Fig. 17, to smooth out artifacts caused by the abrupt endings of the plain segment.
In Fig. 19 is shown a series of segments, after the windowing and weighting. From Fig. 19 it becomes clear that portions of the original signal are effectively ignored, due to the weighting functions. This happens, for example, shortly before the 0.2-second time mark in Fig. 19.
In Fig. 20, the problem of ignored signal parts is solved by using overlapping windows. This figure shows a second set of weighted analysis windows which are offset compared to Fig. 19. By using overlapping analysis windows, none of the original signal gets ignored, and the temporal accuracy of the analysis is also increased. By calculating the spectrum contents of each of the segments, we get a spectrogram of the signal, as shown in Fig. 5.
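The windowing scheme of Figs. 17-20 can be sketched as follows. This is an illustration only; the Hann ("raised cosine") window, the 4096-sample segment length and the 50 % overlap are parameter assumptions consistent with, but not mandated by, the description:

```python
import numpy as np

def spectrogram(signal, n=4096, hop=None):
    """Spectrogram from raised-cosine-weighted, overlapping windows,
    as sketched for Figs. 17-20 (50 % overlap assumed)."""
    hop = hop or n // 2  # 50 % overlap so no part of the signal is ignored
    win = np.hanning(n)  # raised cosine weighting function
    frames = []
    for start in range(0, len(signal) - n + 1, hop):
        seg = signal[start:start + n] * win
        frames.append(np.abs(np.fft.rfft(seg)))
    return np.array(frames)  # shape: (num_windows, n // 2 + 1)

sr = 44100
t = np.arange(sr) / sr              # one second of audio
sig = np.sin(2 * np.pi * 656 * t)   # viola-like fundamental
S = spectrogram(sig)
peak_bin = S[0].argmax()
print(S.shape, peak_bin * sr / 4096)  # peak within one FFT bin of 656 Hz
```

Stacking the rows of `S` over time gives exactly the kind of spectrogram view shown in Fig. 5.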
Fig. 5 shows a spectrogram of the first 5 seconds of a recording, with only a viola playing a monophonic melody. The spectrogram shows the energy values from 0 Hz to 5 kHz. The spectrum is calculated with a 4096-sample fast Fourier transform from a 44.1 kHz audio signal (0.093 seconds of audio data per Fourier window). There are 33 Fourier transforms per second (0.030 seconds between successive transforms). In the figure one can see the successive notes as horizontal lines, with the overtones of the viola at multiples of the base frequency.
Fig. 6 shows a 4096-sample-long snapshot of the spectrum from Fig. 5 at 4 seconds. For the sake of clarity, only the frequencies in the range 0-5000 Hz are included in Fig. 6; frequencies above 5000 Hz (i.e. the range 5000-22050 Hz) are not shown. In this picture the strongest peaks are related to the overtone series of the current note. Other peaks are related to resonance from open strings and/or reverberation of the previous notes. The same 4096-sample analysis window is used in the following figures.
Fig. 7 shows the same spectrum as in Fig. 6, but with markers added at three equally spaced overtones. The 3rd overtone is at 1967 Hz and the base frequency is roughly 656 Hz. This shows that it is relatively easy to determine the base frequency from a spectral view, if the overtones are clearly visible and they stand out from the rest of the spectrum.
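The reasoning applied to the markers of Fig. 7 can be written out as a tiny fit: if the k-th marked peak is expected at k times the base frequency, a least-squares estimate of the base frequency follows directly. The intermediate 2nd-overtone value below is an illustrative assumption (only the 1967 Hz peak and the roughly 656 Hz base are given in the text):

```python
import numpy as np

def base_from_overtones(peak_freqs):
    """Least-squares fit of a base frequency f0 to equally spaced
    overtone peaks, assuming the k-th marked peak sits at k * f0."""
    peaks = np.sort(np.asarray(peak_freqs, dtype=float))
    k = np.arange(1, len(peaks) + 1)
    # Minimize sum_k (peaks[k] - k*f0)^2  =>  f0 = (k . peaks) / (k . k)
    return float(np.dot(k, peaks) / np.dot(k, k))

# Markers roughly as in Fig. 7: 3rd overtone at 1967 Hz.
print(round(base_from_overtones([656, 1312, 1967])))  # -> 656
```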
Fig. 8 shows the first 5 seconds of a music performance, when the viola plays the same melody as in Fig. 5 and is accompanied by a backing track. In this case the view is much more complex, as there are more voices playing.
Fig. 9 shows a 4096-sample-long snapshot of the spectrum from Fig. 8 at 4 seconds, i.e. at the same time as in Fig. 6. The signal from the viola is also present in the spectrum, but it is not clearly distinguishable.
Fig. 10 contains the same spectrum as in Fig. 9, with markers that show where the viola overtones are. However, due to the presence of all the other musical components, the overtones do not stand out and it is very difficult to determine peaks that would correspond to specific notes.
Fig. 11 shows the effect of attenuating the frequencies in the spectrogram of Fig. 8 that are within the range of the first three harmonics that the human performer 10 is expected to play. Frequencies within the white band have been attenuated with a dynamic band-stop filter, with 20 dB attenuation.
Fig. 12 shows a 4096-sample-long snapshot of the spectrum at 4 seconds of Fig. 11. This is substantially the same picture as in Fig. 9, but the frequencies in the range 550-2200 Hz have been heavily attenuated. The attenuation has been done by digitally filtering the original audio signal, before the sound is played from the loudspeaker 12. The attenuated frequency area is called the “reserved range”, since it is reserved for the sound coming from the human performer. Digital filtering techniques may be used for attenuating frequencies in the reserved range. Other methods can be used as well. One alternative to digital filtering is the use of an analog band-stop filter. Another form of filtering is to use additive synthesis to create the background music and configure the synthesis engine to not produce any signals (or to heavily attenuate signals) in the reserved frequency range. There can be multiple stop-bands (i.e. reserved ranges), depending on the pitch-tracking algorithm, the instrument in use and other parameters.
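As one concrete realization of such a digital band-stop filter, a windowed-sinc band-pass can be spectrally inverted. This is a sketch under stated assumptions: the tap count, the Hamming window and the test tones below are illustrative choices, not taken from the description.

```python
import numpy as np

SR = 44100

def fir_bandstop(lo, hi, numtaps=1025):
    """FIR band-stop over [lo, hi] Hz by spectral inversion of a
    windowed-sinc band-pass (one possible digital realization of the
    band-stop filtering described for Fig. 11)."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    def lowpass(fc):
        # Ideal low-pass impulse response, tapered with a Hamming window.
        return np.sinc(2 * fc / SR * n) * (2 * fc / SR) * np.hamming(numtaps)
    bandpass = lowpass(hi) - lowpass(lo)
    bandstop = -bandpass
    bandstop[(numtaps - 1) // 2] += 1.0  # delta minus band-pass
    return bandstop

h = fir_bandstop(550, 2200)
t = np.arange(SR) / SR
inside = np.sin(2 * np.pi * 1000 * t)   # tone inside the reserved range
outside = np.sin(2 * np.pi * 300 * t)   # accompaniment outside it
att_in = np.sqrt(np.mean(np.convolve(inside, h, 'same') ** 2))
att_out = np.sqrt(np.mean(np.convolve(outside, h, 'same') ** 2))
print(att_in < 0.15, att_out > 0.6)  # reserved band suppressed, rest intact
```

A "dynamic" band-stop, as mentioned for Fig. 11, would simply redesign `h` (new `lo`/`hi`) as the expected note of the performer changes.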
Fig. 13 shows the combined sound of the human performer and the backing track. The sound from the viola 14 fills the frequency range between 1st and 3rd harmonic.
Fig. 14 shows a 4096-sample-long snapshot of the spectrum at 4 seconds, with the filtered backing track combined with the viola sound from the human performer 10. In this case the first three harmonics are visible, just like in Fig. 6.
Fig. 15 shows the detected harmonic peaks of Fig. 14 with added marks. This shows how the pitch tracking system can work within the limited frequency scope (550-2200 Hz) and detect the fundamental frequency. The human performer can perform with the filtered backing track, which gives the sense of tempo and tune to the performer. The algorithms that detect pitch, dynamics, timbre and other musical characteristics can work within this reserved frequency band, and ignore frequencies outside this range.
In the following, two approaches to detecting the pitch of the human performer within a reserved range are outlined.
According to a first exemplary pitch detection method, detection is done in the frequency domain and is based on the Fourier transform (usually implemented with the Fast Fourier transform, or FFT). In this case the computational unit uses a window function to sample segments of the combined sound from the microphone, for example 4096-sample-long segments, and performs an FFT on those segments. For each segment, the algorithm calculates the energy in each frequency bin (2048 bins for a 4096-sample window). The algorithm then evaluates the energy spectrum in the reserved range to find peaks in the signal (such as the ones visible in Fig. 15).
The system can test how well different base frequencies would explain the strongest energy peaks and choose the base frequency with the best overall match. In the case of Fig. 15 this would be 655 Hz.
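The candidate-testing step of this first method can be sketched as follows. The harmonic amplitudes, the 5 Hz search grid and the magnitude-sum score are illustrative assumptions; the 4096-sample window and the reserved range follow the description:

```python
import numpy as np

SR, N = 44100, 4096

def score_f0(spectrum, f0, n_harm=3):
    """Sum spectral magnitude at the first harmonics of a candidate f0."""
    bins = [int(round(k * f0 * N / SR)) for k in range(1, n_harm + 1)]
    return sum(spectrum[b] for b in bins if b < len(spectrum))

def best_f0(spectrum, candidates):
    """Pick the candidate base frequency that best explains the peaks."""
    return max(candidates, key=lambda f: score_f0(spectrum, f))

# A synthetic 4096-sample window with harmonics at 655, 1310 and 1965 Hz.
t = np.arange(N) / SR
sig = sum(np.sin(2 * np.pi * 655 * k * t) / k for k in (1, 2, 3))
spectrum = np.abs(np.fft.rfft(sig * np.hanning(N)))
candidates = np.arange(550, 900, 5)  # search grid inside the reserved range
print(best_f0(spectrum, candidates))  # -> 655, as in the Fig. 15 example
```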
According to a second exemplary pitch detection method, detection is based on a filter-bank approach and it is done in the time domain: In this case the algorithm would have a collection of band-pass filters with narrow pass-bands (for example 10 Hz wide). The system would feed the combined signal from the microphone into the filter-bank and then analyze the average energy in each band. The purpose of each filter is to capture a single overtone from the instrument. For bands that have higher energy, the system would assume that the signal is sinusoidal (which is true for a periodic signal like instrument overtones) and calculate the base frequency by calculating the average cycle length of the signal for each of the high-energy frequency bands. Again, the system can evaluate how different base frequencies would explain the different sinusoids and select the base frequency with the best fit.
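A toy version of this second, filter-bank-based method can be sketched as follows. As an implementation assumption, each narrow band-pass is realized here by FFT masking rather than by a true time-domain filter; the cycle-length estimate via rising zero crossings follows the description:

```python
import numpy as np

SR = 44100

def bandpass(signal, lo, hi):
    """Narrow band-pass, realized by FFT masking for brevity (a stand-in
    for the band-pass filter bank of the description)."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / SR)
    spec[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(spec, n=len(signal))

def cycle_freq(band_signal):
    """Estimate frequency from the average cycle length, counted via
    rising zero crossings of the near-sinusoidal band output."""
    s = np.sign(band_signal)
    rising = np.where((s[:-1] <= 0) & (s[1:] > 0))[0]
    cycles = len(rising) - 1
    span = (rising[-1] - rising[0]) / SR  # seconds from first to last crossing
    return cycles / span

t = np.arange(SR) / SR
combined = np.sin(2 * np.pi * 655 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)

# Scan 10 Hz-wide bands over the reserved range; keep the most energetic one.
bands = [(f, f + 10) for f in range(550, 2200, 10)]
energies = [np.mean(bandpass(combined, lo, hi) ** 2) for lo, hi in bands]
lo, hi = bands[int(np.argmax(energies))]
est = cycle_freq(bandpass(combined, lo, hi))
print(lo, hi, round(est))  # the 440 Hz accompaniment is ignored entirely
```

Note how the 440 Hz component never enters the analysis, because no band of the reserved range covers it - exactly the separation the reserved-range principle provides.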
It is noted that there are many variations of these pitch tracking methods, and they can also be used together - for example, by using the frequency-domain approach for determining the approximate pitch(es) of the performer and a filter-bank to get a more accurate estimate of the base frequency or frequencies of the performer. Further, these examples only illustrate some of the different ways in which pitch tracking can work within a limited frequency range. One skilled in the art can create other methods or use other known methods, based on the general principle of using a reserved frequency range to detect pitch and other characteristics of the performer.
Fig. 21 illustrates an example apparatus capable of supporting at least some embodiments of the present invention. Illustrated is device 20, which may comprise, for example, a mobile communication device, a tablet or a laptop computer such as device 11 of Fig. 1. Comprised in device 20 is processor 21, which may comprise, for example, a single- or multi-core processor, wherein a single-core processor comprises one processing core and a multi-core processor comprises more than one processing core. Processor 21 may comprise, in general, a control device. Processor 21 may comprise more than one processor. A processing core may comprise, for example, a Cortex-A8 processing core manufactured by ARM Holdings or a Steamroller processing core designed by Advanced Micro Devices Corporation. Processor 21 may comprise at least one Qualcomm Snapdragon and/or Intel Atom processor. Processor 21 may comprise at least one application-specific integrated circuit, ASIC. Processor 21 may comprise at least one field-programmable gate array, FPGA. Processor 21 may be means for performing method steps in device 20. Processor 21 may be configured, at least in part by computer instructions, to perform actions.
A processor may comprise circuitry, or be constituted as circuitry or circuitries, the circuitry or circuitries being configured to perform phases of methods in accordance with embodiments described herein. As used in this application, the term “circuitry” may refer to one or more or all of the following: (a) hardware-only circuit implementations, such as implementations in only analog and/or digital circuitry, (b) combinations of hardware circuits and software, such as, as applicable: (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions, and (c) hardware circuit(s) and/or processor(s), such as a microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g., firmware) for operation, but the software may not be present when it is not needed for operation.
This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in server, a cellular network device, or other computing or network device.
Device 20 may comprise memory 22. Memory 22 may comprise random-access memory and/or permanent memory. Memory 22 may comprise at least one RAM chip. Memory 22 may comprise solid-state, magnetic, optical and/or holographic memory, for example. Memory 22 may be at least in part accessible to processor 21. Memory 22 may be at least in part comprised in processor 21. Memory 22 may be means for storing information. Memory 22 may comprise computer instructions that processor 21 is configured to execute. When computer instructions configured to cause processor 21 to perform certain actions are stored in memory 22, and device 20 overall is configured to run under the direction of processor 21 using computer instructions from memory 22, processor 21 and/or its at least one processing core may be considered to be configured to perform said certain actions. Memory 22 may be at least in part comprised in processor 21. Memory 22 may be at least in part external to device 20 but accessible to device 20. Device 20 may comprise a transmitter 23. Device 20 may comprise a receiver 24. Transmitter 23 and receiver 24 may be configured to transmit and receive, respectively, information in accordance with at least one cellular or non-cellular standard. Transmitter 23 may comprise more than one transmitter. Receiver 24 may comprise more than one receiver. Transmitter 23 and/or receiver 24 may be configured to operate in accordance with global system for mobile communication, GSM, wideband code division multiple access, WCDMA, 5G, long term evolution, LTE, IS-95, wireless local area network, WLAN, Ethernet and/or worldwide interoperability for microwave access, WiMAX, standards, for example.
Device 20 may comprise a near-field communication, NFC, transceiver 25. NFC transceiver 25 may support at least one NFC technology, such as NFC, Bluetooth, Wibree or similar technologies.
Device 20 may comprise user interface, UI, 26. UI 26 may comprise at least one of a display, a keyboard, a touchscreen, a vibrator arranged to signal to a user by causing device 20 to vibrate, a speaker and a microphone. A user may be able to operate device 20 via UI 26, for example to accept incoming telephone calls, to originate telephone calls or video calls, to access location- based services, to manage digital files stored in memory 22 or on a cloud accessible via transmitter 23 and receiver 24, or via NFC transceiver 25, and/or to play games.
Device 20 may comprise or be arranged to accept a user identity module 27. User identity module 27 may comprise, for example, a subscriber identity module, SIM, card installable in device 20. A user identity module 27 may comprise information identifying a subscription of a user of device 20. A user identity module 27 may comprise cryptographic information usable to verify the identity of a user of device 20 and/or to facilitate encryption of communicated information and billing of the user of device 20 for communication effected via device 20.
Processor 21 may be furnished with a transmitter arranged to output information from processor 21, via electrical leads internal to device 20, to other devices comprised in device 20. Such a transmitter may comprise a serial bus transmitter arranged to, for example, output information via at least one electrical lead to memory 22 for storage therein. Alternatively to a serial bus, the transmitter may comprise a parallel bus transmitter. Likewise processor 21 may comprise a receiver arranged to receive information in processor 21, via electrical leads internal to device 20, from other devices comprised in device 20. Such a receiver may comprise a serial bus receiver arranged to, for example, receive information via at least one electrical lead from receiver 24 for processing in processor 21. Alternatively to a serial bus, the receiver may comprise a parallel bus receiver. Device 20 may comprise further devices not illustrated in Fig. 21. For example, where device 20 comprises a smartphone, it may comprise at least one microphone 13. In some embodiments, device 20 lacks at least one device described above. For example, some devices 20 may lack an NFC transceiver 25 and/or user identity module 27.
Processor 21, memory 22, transmitter 23, receiver 24, NFC transceiver 25, UI 26 and/or user identity module 27 may be interconnected by electrical leads internal to device 20 in a multitude of different ways. For example, each of the aforementioned devices may be separately connected to a master bus internal to device 20, to allow for the devices to exchange information. However, as the skilled person will appreciate, this is only one example and depending on the embodiment various ways of interconnecting at least two of the aforementioned devices may be selected without departing from the scope of the present invention.
Fig. 22 shows an example of an inventive feedback system. An exemplary display arrangement comprises a stave 30 with a head 31 that indicates how far the performer has progressed in the musical score. The next notes to play are to the right of the head, and the already played notes are to its left. The pointers 32 - 34 next to the notes indicate mistakes in playing. For example, a horizontal pointer 32 pointing to the left indicates that the note was played too late. A vertical pointer 34 above the note indicates that the pitch was too high, and a vertical pointer 33 below the note indicates that the pitch was too low.
In this way, feedback to the player 10 is efficiently provided in real time on the display of the computing unit 11. It is clear to one skilled in the art that deviations of the score of the musical performance from a target score may be transformed into any kind of graphical representation of the deviations, without departing from the inventive concept.
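As one illustrative sketch of such a graphical representation, the mapping from measured deviations to the pointer glyphs of Fig. 22 could be expressed as follows. The function name, tolerance values and glyph labels are assumptions made for demonstration and are not taken from the specification:

```python
# Hypothetical sketch: map a played note's timing and pitch deviation to the
# pointer glyphs of Fig. 22. Tolerances are illustrative assumptions.

def feedback_pointers(timing_error_s: float, pitch_error_cents: float,
                      timing_tol: float = 0.05, pitch_tol: float = 20.0) -> list:
    """Return the pointer symbols to draw next to a played note."""
    pointers = []
    if timing_error_s > timing_tol:          # note sounded after its score time
        pointers.append("late")              # horizontal pointer 32, pointing left
    elif timing_error_s < -timing_tol:
        pointers.append("early")
    if pitch_error_cents > pitch_tol:
        pointers.append("sharp")             # vertical pointer 34, above the note
    elif pitch_error_cents < -pitch_tol:
        pointers.append("flat")              # vertical pointer 33, below the note
    return pointers
```

A note within both tolerances yields no pointers, so correctly played notes remain unmarked on the stave.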
It is to be understood that the embodiments of the invention disclosed are not limited to the particular structures, process steps, or materials disclosed herein, but are extended to equivalents thereof as would be recognized by those ordinarily skilled in the relevant arts. It should also be understood that terminology employed herein is used for the purpose of describing particular embodiments only and is not intended to be limiting.
Reference throughout this specification to one embodiment or an embodiment means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Where reference is made to a numerical value using a term such as, for example, about or substantially, the exact numerical value is also disclosed.
As used herein, a plurality of items, structural elements, compositional elements, and/or materials may be presented in a common list for convenience. However, these lists should be construed as though each member of the list is individually identified as a separate and unique member. Thus, no individual member of such list should be construed as a de facto equivalent of any other member of the same list solely based on their presentation in a common group without indications to the contrary. In addition, various embodiments and examples of the present invention may be referred to herein along with alternatives for the various components thereof. It is understood that such embodiments, examples, and alternatives are not to be construed as de facto equivalents of one another, but are to be considered as separate and autonomous representations of the present invention.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the preceding description, numerous specific details are provided, such as examples of lengths, widths, shapes, etc., to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
While the foregoing examples are illustrative of the principles of the present invention in one or more particular applications, it will be apparent to those of ordinary skill in the art that numerous modifications in form, usage and details of implementation can be made without the exercise of inventive faculty, and without departing from the principles and concepts of the invention. Accordingly, it is not intended that the invention be limited, except as by the claims set forth below.
The verbs “to comprise” and “to include” are used in this document as open limitations that neither exclude nor require the existence of also unrecited features. The features recited in dependent claims are mutually freely combinable unless otherwise explicitly stated. Furthermore, it is to be understood that the use of "a" or "an", that is, a singular form, throughout this document does not exclude a plurality.

INDUSTRIAL APPLICABILITY
At least some embodiments of the present invention find industrial application at least in processing music performances for training, learning and gaming purposes.
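The core signal path of the method claimed below can be sketched in a simplified form: the background signal is suppressed inside limited frequency bands, and the performer's contribution is then measured in exactly those bands of the microphone mix. This is an illustrative sketch only, not the patented implementation; the sample rate, band edges and the FFT-mask filter are assumptions made for demonstration:

```python
# Illustrative sketch of selective band suppression and band-limited analysis.
# Assumed parameters: 8 kHz sample rate, bands given as (low, high) pairs in Hz.

import numpy as np

FS = 8000  # sample rate in Hz (assumed)

def suppress_bands(signal, bands):
    """Zero the spectrum of `signal` inside each (low, high) band in Hz."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / FS)
    for low, high in bands:
        spectrum[(freqs >= low) & (freqs <= high)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def band_energy(signal, bands):
    """Total spectral energy of `signal` inside the analysis bands."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / FS)
    mask = np.zeros_like(freqs, dtype=bool)
    for low, high in bands:
        mask |= (freqs >= low) & (freqs <= high)
    return float(spectrum[mask].sum())
```

Because the background is near-silent in the analysis bands after suppression, any energy the microphone picks up there can be attributed to the live performance, which makes the comparison against the expected instrument score tractable even though the loudspeaker output and the performance arrive mixed at the microphone.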

Claims

1. A method for processing a music performance, the method comprising the steps of:
- providing background music to accompany a music performance;
- selectively filtering the signal of said background music to significantly reduce the signal energy relating to at least one instrument score within certain limited frequency bands;
- providing a live music performance;
- playing said background music through loudspeakers so that a human performer can hear the background music while simultaneously performing music;
- capturing the combined sounds of the background music and the music performance through at least one microphone to form a combined sound signal;
- analyzing said combined sound signal within said limited frequency bands to determine the score of said music performance and to compare it to said filtered score.
2. A method according to claim 1, wherein the selectively filtered background signal relates to the score of one instrument, and the music performance consists of the score of one instrument played by a human.
3. A method according to claim 1 or 2, wherein the analysis of said combined sound signal is done in the frequency domain.
4. A method according to claim 1 or 2, wherein the analysis of said combined sound signal is done in the time domain.
5. A method according to claim 1 or 2, wherein the analysis of said combined sound signal is done in both the frequency and time domains.
6. A method according to claim 1 or 2, wherein the analysis is performed with the aid of a band-pass filter bank.
7. A method according to any of claims 1 - 6, wherein said limited frequency bands are selected in real-time to match an expected frequency band of the signal of said musical performance.
8. A method according to any of claims 1 - 7, wherein said background music is provided as a background soundtrack.
9. A method according to any of claims 1 - 7, wherein said background music is generated by audio synthesis.
10. A method according to claim 9, wherein the audio synthesis is controlled to filter out sounds in certain limited frequency bands.
11. A method according to any of claims 1 - 9, wherein at least some of the frequencies of harmonic overtones of a fundamental note of said at least one instrument score are identified from said background music signal in said limited frequency bands, and are then at least partly filtered from said background music.
12. A method according to any of claims 1 - 11, wherein sampling of said background score comprises overlapping sine or cosine half-wave sampling sequences.
13. A method according to any of claims 1 - 12, comprising the additional step of displaying said music performance graphically on a display, where deviations in timing and/or pitch of the score of the musical performance from said background music are indicated in the context of each note played.
14. A computer program configured to cause a method in accordance with at least one of claims 1 - 13 to be performed, when run.
PCT/FI2020/050418 2019-06-12 2020-06-12 A method for processing a music performance WO2020249870A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
FI20195503 2019-06-12
FI20195503 2019-06-12

Publications (1)

Publication Number Publication Date
WO2020249870A1 true WO2020249870A1 (en) 2020-12-17

Family

ID=71620470

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2020/050418 WO2020249870A1 (en) 2019-06-12 2020-06-12 A method for processing a music performance

Country Status (1)

Country Link
WO (1) WO2020249870A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5567162A (en) * 1993-11-09 1996-10-22 Daewoo Electronics Co., Ltd. Karaoke system capable of scoring singing of a singer on accompaniment thereof
US20080295672A1 (en) * 2007-06-01 2008-12-04 Compton James M Portable sound processing device
US20090038467A1 (en) * 2007-08-10 2009-02-12 Sonicjam, Inc. Interactive music training and entertainment system
GB2484084A (en) * 2010-09-28 2012-04-04 Edward Hartley Portable karaoke system for use with a motor vehicles sound system
EP2447944A2 (en) * 2010-10-28 2012-05-02 Yamaha Corporation Technique for suppressing particular audio component
US20190005934A1 (en) * 2017-06-28 2019-01-03 Abu Dhabi University System and Method for improving singing voice separation from monaural music recordings



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 20740669
Country of ref document: EP
Kind code of ref document: A1