WO2018231185A1 - Method of synchronizing sound signals - Google Patents


Info

Publication number
WO2018231185A1
Authority
WO
WIPO (PCT)
Prior art keywords
synchronization
audio
file
sound
method
Prior art date
Application number
PCT/UA2017/000089
Other languages
French (fr)
Russian (ru)
Inventor
Василий Васильевич ДУМА
Роман Викторович КУЛИНИЧ
Дмитрий Константинович ХАНЧОПУЛО
Original Assignee
Василий Васильевич ДУМА
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to UAa201706097 (critical)
Priority to UA2017006097
Application filed by Василий Васильевич ДУМА
Publication of WO2018231185A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04Time compression or expansion
    • G10L21/055Time compression or expansion for synchronising with other signals, e.g. video signals

Abstract

The invention relates to the processing of sound signals, in particular to a method of processing dynamic audio properties using a mechanism or sequence of tuning operations so as to quickly adapt to changes in the content of a sound signal. The method uses synchronization maps of sound signals recorded from a microphone for synchronizing a rendering of an original or other sound track using a mobile client device and a mechanism for generating a synchronization map and saving same in a digital file, wherein the synchronization map is generated beforehand on a server and transmitted remotely or locally to a user device, the synchronization map data being encrypted beforehand. The method provides the capability of synchronizing a converted original audio file comprising audio recorded from a microphone for reproduction of the same audio file, a similar audio file or a different audio file.

Description

METHOD OF SYNCHRONIZING SOUND SIGNALS

The invention relates to the processing of audio signals, in particular to a method for processing the dynamic properties of audio using a tuning mechanism or sequence of operations that adapts quickly to changes in the content of an audio signal, as well as to computer programs that implement such methods in practice.

A tuning signal can be generated by analyzing the audio signal itself, or tuning can be triggered by an external event, such as a channel change on a television receiver or a change of input selection on an audio/video receiver. In the case of an external audio signal, one or more indications of the state of the dynamic properties of the current sound source can be stored and associated with that sound source before switching to a new sound source. Then, if the system switches back to the first sound source, the dynamic processor can be restored to the previously saved state, or to an approximation of it.

A known method mixes two input audio signals into a single composite audio signal while maintaining the perceived sound level of the composite signal. The method includes the steps of: accepting a main input audio signal; accepting an associated input audio signal that is linked to the main input audio signal; accepting mixing metadata containing scaling information for scaling the main input audio signal and determining how the main and associated input signals should be mixed to generate a composite audio signal at a perceived sound level, wherein the scaling information comprises a metadata scale factor for scaling the main input audio signal relative to the associated input audio signal; weighting the main and associated input audio signals in the composite audio signal as defined in the mixing metadata; identifying the predominant signal as either the main input audio signal or the associated input audio signal, based on the scaling information provided by the mixing metadata and on a mixing balance input, the other input signal then being identified as the non-predominant signal, and the predominant signal being identified by comparing the mixing balance input with the metadata scale factor for the main input audio signal; scaling the non-predominant signal relative to the predominant signal; and combining the scaled non-predominant signal with the predominant signal to generate the composite audio signal [UA 105590, H03G 3/00, 2014].

However, the user may wish to deviate from the settings provided by the manufacturer and dictated by the metadata transmitted along with the associated signal. For example, a user who activates the director's commentary while watching a movie may decide at some point during playback that he would rather hear the original dialogue, which the manufacturer indicated in the metadata as subject to attenuation during mixing so that it does not prevail over the commentary.

Therefore, there is a need for a control that would allow the user to adjust the mixing of the input audio signals while providing a favorable user experience by preserving the perceived sound level of the composite signal. There is also a need for a mixing control that maintains a consistent sound level of the composite signal even when the scaling information from the metadata and the external user input vary over time, so that no additional adjustment of the composite signal level is required.

Closest to the claimed invention is a method of processing an audio signal using a tuning, in which the dynamic properties of the audio signal are changed in accordance with a sequence of dynamic-property adjustment operations; an event is detected in the temporal evolution of the audio signal in which the audio signal level decreases by more than a threshold amount (Ldrop) within a time interval no longer than a second time threshold (tdrop), the decrease in signal level being detected in a plurality of frequency bands; and the sequence of dynamic-property adjustment operations is reconfigured in response to said detection [UA № 94968, H03G 3/00, H03G 7/00, 2011].

However, this method, like the previous analogue, does not effectively support synchronizing a converted original audio file with audio recorded from a microphone in order to play back the same, a similar, or a different audio file.

The invention is based on the task of creating a method for synchronizing audio signals that effectively supports synchronizing a converted original audio file with audio recorded from a microphone in order to play back the same, a similar, or a different audio file.

The problem is solved in that, in a method for synchronizing audio signals in which the dynamic properties of an audio signal are changed in accordance with a sequence of dynamic-property adjustments, according to the invention synchronization maps of sound signals recorded from a microphone are used to synchronize the rendering of an original or other audio track using a client's mobile device (mobile phone, smartphone, smart TV, laptop, netbook or tablet); a mechanism is used for generating a synchronization map and storing it in a digital file; and the synchronization map and its data are generated and encrypted in advance on the server and transmitted to the user device remotely or locally.

In addition, in the method of synchronizing audio signals, when preparing a synchronization map file, the sound is converted into the frequency domain and filtering and extraction methods are used.

A mobile phone, smartphone, smart TV, laptop, netbook or tablet is used as the mobile device.

The inventive method provides the ability to synchronize a converted original audio file with audio recorded from a microphone in order to play back the same, a similar, or a different audio file.

The invention is illustrated by the drawings:

 figure 1 shows a diagram of a device for implementing the method;

 figure 2 is a sequence diagram of the method.

The method is implemented as follows.

Mobile device 2 (a mobile phone, smartphone, smart TV, laptop, netbook or tablet) is used to record an audio signal or sounds in an open or closed space 1, using an incoming sound data source 4 (for example, a microphone).

The device uses the recorded sound in synchronization block 12 to synchronize with another audio track that is currently playing, using a reference synchronization map prepared earlier and received on the device (phone, smartphone or tablet) via a wireless or other network.

Synchronization is carried out in real time, and the offset within the original synchronization file is determined from the recorded audio segment (from 4 to 15 seconds).
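The patent does not specify how the recorded segment is matched at this step; the following is a minimal sketch of offset estimation by cross-correlation, a common technique for aligning a short recording against a longer reference, assuming both are mono float arrays at the same sample rate:

```python
import numpy as np

def estimate_offset(recording: np.ndarray, reference: np.ndarray, rate: int) -> float:
    """Locate the recorded segment (4-15 s) inside the reference track and
    return its offset in seconds, via the peak of the cross-correlation."""
    corr = np.correlate(reference, recording, mode="valid")
    return float(np.argmax(corr)) / rate
```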

Synchronization is performed by synchronization unit 12 on the converted and filtered (digitized) data in media buffer 5, against the synchronization map.

In addition, an algorithm for accelerating or decelerating the track recorded from the microphone can be used.
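A minimal sketch of such speed adjustment by simple resampling; note that plain resampling also shifts pitch, and the patent does not say whether a pitch-preserving time-stretch is used, so this is only an assumption for illustration:

```python
import numpy as np

def change_speed(samples: np.ndarray, factor: float) -> np.ndarray:
    """Accelerate (factor > 1) or decelerate (factor < 1) a recorded track
    by linear-interpolation resampling."""
    n_out = int(len(samples) / factor)
    positions = np.linspace(0, len(samples) - 1, n_out)
    return np.interp(positions, np.arange(len(samples)), samples)
```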

As a result of synchronization, another audio file can be played back, taking into account the time offset obtained when synchronizing with the first original audio file converted in block 11.

Standard mechanisms (map conversion unit 6) are used to convert the audio signals into another coordinate system, the frequency domain (for example the fast Fourier transform, although other methods are also possible).

Great emphasis is placed on filtering and extracting the RMS frequency maxima, or values close to the peak (but not the frequency peaks themselves), with values of at least 50%, 75% or more of the maximum.

The frequencies can be broken down into 5 to 14 ranges at a given time.
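Taken together, these three steps can be sketched as follows. This is a hedged illustration only: the patent does not fix the window, the band edges, or the exact near-peak rule, so the `band_features` name and the equal-width bands are assumptions:

```python
import numpy as np

def band_features(frame: np.ndarray, n_bands: int = 8, threshold: float = 0.75):
    """FFT a windowed frame, split the magnitude spectrum into n_bands
    ranges (5 to 14 per the text), and keep only values of at least
    `threshold` times each band's maximum (near-peak, not just the peak)."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    features = []
    for band in np.array_split(spectrum, n_bands):
        peak = band.max()
        features.append([float(v) for v in band if peak > 0 and v >= threshold * peak])
    return features
```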

An additional search refinement algorithm is also used (synchronization block 12).

The refinement algorithm consists of the following steps:

 - compilation of vector maps (VMP);

- construction of the intersection of the vector maps;

- selection of track sections likely to correspond to the desired fragment;

- selection of the section that best corresponds to the VMP of the desired fragment. The selected sections are analyzed and the one with the highest VMP similarity to the desired fragment is chosen according to the following criteria: the length of the section along the time axis, the number of vectors in the section, and the number of vectors having common points.

For the selected section, ts is calculated. The algorithm takes the found ts to be the start of the desired fragment.
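A minimal sketch of the selection and ts steps under stated assumptions: a vector map (VMP) is modelled here as a set of hashable vector keys carried alongside a section start time, and candidates are scored by the size of the intersection with the fragment's VMP; the names and the simplified scoring are hypothetical, since the patent does not define the vector representation:

```python
def best_section(fragment_vmp: set, sections: list) -> float:
    """Each section is a (ts, vmp_set) pair. Score candidates by the
    number of shared vectors (the text also weighs section length along
    the time axis and common points; those criteria are omitted here)
    and return the ts of the best match."""
    best_ts, _ = max(sections, key=lambda section: len(fragment_vmp & section[1]))
    return best_ts
```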

To determine the accuracy of the match, the entire recorded and converted audio fragment is divided into separate subbands, and a summing function is applied to each of the fragments. The differences in displacement between the fragments are then used. The number of fragments can be from 4 to 10.

 Comparative Formula 1:

|x4 - x3| < |x2 - x1|
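A minimal sketch of the resulting consistency check, assuming each of the 4 to 10 subband fragments has already yielded a displacement estimate (for example with the cross-correlation sketch above); the acceptance rule mirrors Comparative Formula 1 by requiring the per-fragment displacements to agree with one another:

```python
def offsets_consistent(displacements: list, tolerance: float = 0.05) -> bool:
    """Accept the match only if the displacement found for every subband
    fragment agrees with the others within `tolerance` seconds."""
    return max(displacements) - min(displacements) <= tolerance
```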

During one of the stages, an array is created that stores pairs of vectors: a vector from the fragment VMP and the corresponding vector from the track VMP.

The correspondence criterion is the vector key. If several vectors from the track VMP correspond to one vector of the fragment VMP, then several elements are created in the array that have the same vector from the fragment VMP but different vectors from the track VMP.
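A minimal sketch of this pairing array, assuming each vector is represented as a (key, data) pair with the key as the correspondence criterion; the one-to-many case produces one array element per matching track vector, as described:

```python
from collections import defaultdict

def pair_vectors(fragment_vectors: list, track_vectors: list) -> list:
    """Each vector is a (key, data) pair; the key is the correspondence
    criterion. Builds the array of (fragment vector, track vector) pairs;
    one fragment vector may pair with several track vectors."""
    track_by_key = defaultdict(list)
    for key, data in track_vectors:
        track_by_key[key].append((key, data))
    return [((key, frag), track_vec)
            for key, frag in fragment_vectors
            for track_vec in track_by_key[key]]
```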

When working with the converted original audio file, decryption unit 10 is used, which additionally decrypts the converted original audio file (the synchronization map) in memory.

After decrypting part of the audio file, the client device 2 can play back the original fragment of the audio track, taking into account all delays incurred during the operation of the algorithm.

To download synchronization files, the user first logs in through authorization block 7 and, after gaining access, can download the encrypted files: synchronization maps and audio tracks for playback from the encrypted synchronization map block 8 and media block 9, which pass through decryption block 10 into synchronization block 12.

To download data for synchronization from the server (cloud) 3, the user goes via the Internet to authorization unit 13 and then to content delivery unit 14, where, through encryption unit 15, data is received from the map database 16 and the audio database 17. A working model has been created for various audio files and synchronization data, using synchronization maps of sound signals recorded from a microphone to synchronize the rendering of the original or another sound track using a client's mobile device (mobile phone, smartphone, smart TV, laptop, netbook).

To get started, the client presses a button on the keyboard or on the touch screen, or starts the process in any other way.

To implement this, a mechanism is used for generating a synchronization map and saving it in a digital file. The synchronization map file is generated on the server in advance and delivered to the user's device remotely or locally. The data is encrypted in advance; both symmetric and asymmetric algorithms are used for encryption. When preparing a synchronization map file, methods are used to convert the sound to the frequency domain, and various filtering and extraction methods can be applied.
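For the symmetric case, a minimal sketch using the Fernet recipe from the Python cryptography package; the patent names no specific cipher, key-distribution scheme, or file format, so the choice of Fernet and the file name are assumptions for illustration:

```python
from cryptography.fernet import Fernet

# Server side: encrypt the serialized synchronization map before delivery.
key = Fernet.generate_key()      # must reach the client over a secure channel
cipher = Fernet(key)
with open("sync_map.bin", "rb") as f:    # hypothetical map file
    encrypted_map = cipher.encrypt(f.read())

# Client side (cf. decryption unit 10): restore the map in memory only.
sync_map_bytes = cipher.decrypt(encrypted_map)
```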

Claims

1. A method for synchronizing audio signals in which the dynamic properties of an audio signal are changed in accordance with a sequence of dynamic-property adjustments, characterized in that synchronization maps of sound signals recorded from a microphone are used to synchronize the rendering of an original or other audio track using a client's mobile device, together with a mechanism for generating a synchronization map and storing it in a digital file, wherein the synchronization map is generated on the server in advance and transmitted to the user device remotely or locally, and the synchronization map data is encrypted in advance.
2. The method according to claim 1, characterized in that, when preparing the synchronization map file, the sound is converted into the frequency domain and filtering and extraction methods are used.
3. The method according to claim 1, characterized in that a mobile phone, or smartphone, or smart TV, or laptop, or netbook, or tablet is used as the mobile device.
PCT/UA2017/000089 2017-06-16 2017-09-05 Method of synchronizing sound signals WO2018231185A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
UAa201706097 2017-06-16
UA2017006097 2017-06-16

Publications (1)

Publication Number Publication Date
WO2018231185A1 (en) 2018-12-20

Family

ID=64659688

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/UA2017/000089 WO2018231185A1 (en) 2017-06-16 2017-09-05 Method of synchronizing sound signals

Country Status (1)

Country Link
WO (1) WO2018231185A1 (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080059160A1 (en) * 2000-03-02 2008-03-06 Akiba Electronics Institute Llc Techniques for accommodating primary content (pure voice) audio and secondary content remaining audio capability in the digital audio production process
US20100198377A1 (en) * 2006-10-20 2010-08-05 Alan Jeffrey Seefeldt Audio Dynamics Processing Using A Reset
US8325944B1 (en) * 2008-11-07 2012-12-04 Adobe Systems Incorporated Audio mixes for listening environments
WO2010141504A1 (en) * 2009-06-01 2010-12-09 Music Mastermind, LLC System and method of receiving, analyzing, and editing audio to create musical compositions
US20130170672A1 (en) * 2010-09-22 2013-07-04 Dolby International Ab Audio stream mixing with dialog level normalization
US20160021476A1 (en) * 2011-07-01 2016-01-21 Dolby Laboratories Licensing Corporation System and Method for Adaptive Audio Signal Generation, Coding and Rendering

Similar Documents

Publication Publication Date Title
US6263507B1 (en) Browser for use in navigating a body of information, with particular application to browsing information represented by audiovisual data
CA2798093C (en) Methods and systems for processing a sample of a media stream
US9503781B2 (en) Commercial detection based on audio fingerprinting
EP1417584B1 (en) Playlist generation method and apparatus
Haitsma et al. A highly robust audio fingerprinting system.
US9258459B2 (en) System and method for compiling and playing a multi-channel video
EP1464172B1 (en) Captioning system
US9159338B2 (en) Systems and methods of rendering a textual animation
US9274673B2 (en) Methods, systems, and media for rewinding media content based on detected audio events
CN102177726B (en) Feature optimization and reliability estimation for audio and video signature generation and detection
Haitsma et al. A highly robust audio fingerprinting system with an efficient search strategy
US20120089911A1 (en) Bookmarking System
CN104023247B (en) Method and apparatus for acquiring and pushing information, and information exchange system
CA2895964C (en) Method and system for performing an audio information collection and query
US9436689B2 (en) Distributed and tiered architecture for content search and content monitoring
JP2006528859A (en) Fingerprint generation and detection method and apparatus for synchronizing audio and video
US20120315014A1 (en) Audio fingerprinting to bookmark a location within a video
US8996380B2 (en) Methods and systems for synchronizing media
ES2512640T3 (en) Methods and apparatus for generating signatures
KR20150113991A (en) Methods and systems for performing comparisons of received data and providing a follow-on service based on the comparisons
US20130036442A1 (en) System and method for visual selection of elements in video content
EP2628047B1 (en) Alternative audio for smartphones in a movie theater.
CN101548294A Extracting features of video & audio signal content to provide reliable identification of the signals
CN1973536A (en) Video-audio synchronization
JP2002142175A (en) Recording playback apparatus capable of extracting and searching index information at once

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17913953

Country of ref document: EP

Kind code of ref document: A1