WO2022160669A1 - Audio processing method and audio processing apparatus

Audio processing method and audio processing apparatus

Info

Publication number
WO2022160669A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
audio
background audio
background
time
Prior art date
Application number
PCT/CN2021/113086
Other languages
English (en)
Chinese (zh)
Inventor
邢文浩
张晨
Original Assignee
北京达佳互联信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司 filed Critical 北京达佳互联信息技术有限公司
Publication of WO2022160669A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/36 Accompaniment arrangements
    • G10H1/361 Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/005 Musical accompaniment, i.e. complete instrumental rhythm synthesis added to a performed melody, e.g. as output by drum machines
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/171 Transmission of musical instrument data, control or status information; Transmission, remote access or control of music data for electrophonic musical instruments

Definitions

  • the present disclosure relates to the field of signal processing, and in particular, to an audio processing method, an apparatus, an electronic device, and a storage medium.
  • Online KTV chorus means that two people (for example, A and B) choose the same song to chorus. At this time, both A and B can hear each other's singing and their own accompaniment, just like offline KTV chorus.
  • the present disclosure provides an audio processing method, apparatus, electronic device and storage medium.
  • an audio processing method is provided, comprising: receiving an audio segment of a first user collected during singing and a background audio playback time of the first user's background audio corresponding to the audio segment; and adjusting the playback position of the second user's background audio according to the background audio playback time, so that the adjusted background audio of the second user is aligned with the received audio segment of the first user, wherein the background audio of the second user is the same as the background audio of the first user.
  • the background audio playback moment is obtained by subtracting a time delay due to audio capture from the current playback moment of the background audio of the first user.
  • the adjusting of the playback position of the second user's background audio according to the background audio playback moment includes: determining the playback position of the second user's background audio when the first user's audio segment is received; and, when that playback position is within the time interval from the end of the second user's singing to the beginning of the first user's singing, or within the time interval from the start of the second user's background audio to the beginning of the first user's singing, adjusting the playback position to correspond to the received background audio playback time.
  • the adjusting of the playback position of the second user's background audio according to the background audio playback moment includes: determining the sub-interval with the smallest average audio energy within the time interval from the end of the second user's singing to the beginning of the first user's singing, or within the time interval from the start of the second user's background audio to the beginning of the first user's singing; and, within that sub-interval, adjusting the playback position of the second user's background audio according to the background audio playback time.
  • the adjusting of the playback position of the second user's background audio within the sub-interval according to the background audio playback moment includes: determining the playback position of the second user's background audio when the first user's audio segment is received; and, in response to that playback position being within the sub-interval, adjusting it to correspond to the received background audio playback time.
  • the determining of the sub-interval with the smallest average audio energy includes: calculating the average audio energy of each sub-interval in the time interval according to the following formula, and determining the sub-interval with the smallest average audio energy from the calculated values:
  • E(ab) = ( Σ_{i=TSa+1}^{TSb} S(i)² ) / (TSb − TSa), where:
  • E(ab) is the average energy of sub-interval ab;
  • TSb is the number of sampling points up to time b;
  • TSa is the number of sampling points up to time a;
  • TSb − TSa is the number of sampling points in interval ab;
  • S(i) is the amplitude of the i-th sampling point.
  • the audio processing method further includes: sending to the first user the audio segment of the second user collected during singing and the background audio playback time of the second user's background audio corresponding to the second user's audio segment.
  • the audio processing method further includes: establishing a communication connection with the first user; playing the background audio, and playing the received audio clip of the first user.
  • the receiving of the first user's audio segment collected during singing and of the background audio playback time of the first user's background audio corresponding to the segment includes: receiving, at predetermined time intervals, the first user's audio segment collected during singing and the background audio playback time of the first user's background audio corresponding to the segment.
  • an audio processing apparatus is provided, comprising: a receiving unit configured to receive an audio segment of a first user collected during singing and a background audio playback time of the first user's background audio corresponding to the audio segment; and an adjusting unit configured to adjust the playback position of the second user's background audio according to the background audio playback time, so that the adjusted background audio of the second user is aligned with the received audio segment of the first user, wherein the background audio of the second user is the same as the background audio of the first user.
  • the background audio playback moment is obtained by subtracting a time delay due to audio capture from the current playback moment of the background audio of the first user.
  • the adjusting of the playback position of the second user's background audio according to the background audio playback moment includes: determining the playback position of the second user's background audio when the first user's audio segment is received; and, when that playback position is within the time interval from the end of the second user's singing to the beginning of the first user's singing, or within the time interval from the start of the second user's background audio to the beginning of the first user's singing, adjusting the playback position to correspond to the received background audio playback time.
  • the adjusting of the playback position of the second user's background audio according to the background audio playback time includes: determining the sub-interval with the smallest average audio energy within the time interval from the end of the second user's singing to the beginning of the first user's singing, or within the time interval from the start of the second user's background audio to the beginning of the first user's singing; and, within that sub-interval, adjusting the playback position of the second user's background audio according to the background audio playback time.
  • the adjusting of the playback position of the second user's background audio within the sub-interval according to the background audio playback moment includes: determining the playback position of the second user's background audio when the first user's audio segment is received; and, in response to that playback position being within the sub-interval, adjusting it to correspond to the received background audio playback time.
  • the determining of the sub-interval with the smallest average audio energy includes: calculating the average audio energy of each sub-interval in the time interval according to the following formula, and determining the sub-interval with the smallest average audio energy from the calculated values:
  • E(ab) = ( Σ_{i=TSa+1}^{TSb} S(i)² ) / (TSb − TSa), where:
  • E(ab) is the average energy of sub-interval ab;
  • TSb is the number of sampling points up to time b;
  • TSa is the number of sampling points up to time a;
  • TSb − TSa is the number of sampling points in interval ab;
  • S(i) is the amplitude of the i-th sampling point.
  • the audio processing apparatus further includes: a sending unit configured to send to the first user the audio segment of the second user collected during singing and the background audio playback time of the second user's background audio corresponding to the second user's audio segment.
  • the audio processing apparatus further includes: a communication unit configured to establish a communication connection with the first user; and an audio playback unit configured to play the background audio and to play the received audio segment of the first user.
  • the receiving unit receives the audio segment of the first user collected during singing and the background audio playing time of the background audio of the first user corresponding to the audio segment at predetermined time intervals.
  • an electronic device is provided, comprising: at least one processor; and at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the audio processing method described above.
  • a computer-readable storage medium is provided, storing instructions that, when executed by at least one processor, cause the at least one processor to perform the audio processing method described above.
  • a computer program product comprising computer instructions, which when executed by a processor implement the audio processing method as described above.
  • the embodiments of the present disclosure adjust the playback position of the second user's background audio according to the background audio playback time of the first user's background audio corresponding to the first user's audio segment, so that the adjusted background audio of the second user is aligned with the received audio segment of the first user. This prevents the first user's audio segments from becoming misaligned, due to transmission delay, with the background audio played locally by the second user, which would otherwise degrade the chorus experience.
  • the embodiments of the present disclosure can also reduce the impact on the sense of hearing when adjusting the playing position of the background audio of the second user.
  • FIG. 1 is an exemplary system architecture in which exemplary embodiments of the present disclosure may be applied;
  • FIG. 2 is a flowchart of an audio processing method of an exemplary embodiment of the present disclosure
  • FIG. 3 is a schematic diagram illustrating acquiring a background audio playback moment corresponding to an audio segment according to an exemplary embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of adjusting the playback position of background audio according to an exemplary embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of an application scenario of the audio processing method according to an exemplary embodiment of the present disclosure
  • FIG. 6 is a block diagram of an audio processing apparatus of an exemplary embodiment of the present disclosure.
  • FIG. 7 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 in which exemplary embodiments of the present disclosure may be applied.
  • the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 .
  • the network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 .
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the user can use the terminal devices 101 , 102 and 103 to interact with the server 105 through the network 104 to receive or send messages (eg, audio and video data upload requests, audio and video data acquisition requests) and the like.
  • Various communication client applications may be installed on the terminal devices 101 , 102 and 103 , such as singing applications, audio and video recording software, audio and video players, instant communication tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • when the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices with a display screen and capable of audio and video playback and recording, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.
  • when the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above, and can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module.
  • the terminal devices 101, 102, and 103 may be installed with image capture devices (eg, cameras) to capture video data.
  • the smallest visual unit that composes a video is a frame.
  • Each frame is a static image.
  • a dynamic video is formed by synthesizing a sequence of temporally consecutive frames together.
  • the terminal devices 101, 102, 103 may also be installed with components for converting electrical signals into sounds (such as speakers) to play sounds, and may also be installed with devices for converting analog audio signals into digital audio signals (for example, microphone) to capture sound.
  • the server 105 may be a server that provides various services, such as a background server that provides support for multimedia applications installed on the terminal devices 101 , 102 , and 103 .
  • the background server can parse and store the received audio and video data upload requests and other data, and can also receive audio and video data acquisition requests sent by the terminal devices 101, 102, and 103 and feed the audio and video data indicated by those requests back to the terminal devices 101, 102, and 103.
  • the server 105 may, in response to a user's query request (eg, song query request), feed back information (eg, song information) corresponding to the query request to the terminal devices 101 , 102 , and 103 .
  • the server may be hardware or software.
  • when the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • when the server is software, it can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module.
  • the audio processing methods provided by the embodiments of the present disclosure are generally executed by the terminal devices 101 , 102 , and 103 , and correspondingly, the audio processing apparatuses are generally set in the terminal devices 101 , 102 , and 103 .
  • terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 2 is a flowchart of an audio processing method of an exemplary embodiment of the present disclosure.
  • the method shown in FIG. 2 can be performed by any electronic device with audio processing function.
  • the electronic device may be a PC computer, a tablet device, a personal digital assistant, a smart phone, or other devices capable of executing the above set of instructions.
  • in step S201, the audio segment of the first user collected during singing and the background audio playback time of the first user's background audio corresponding to the audio segment are received.
  • the background audio may be background music or accompaniment when the user sings a song.
  • the background audio of the first user is the background audio played when the first user sings.
  • the audio clip of the first user collected during singing and the background audio playing time of the background audio of the first user corresponding to the audio clip may be received at predetermined time intervals.
  • the predetermined time interval may be a user-defined time interval, such as 20ms, but is not limited thereto.
  • the above-mentioned background audio playback moment corresponding to the audio segment (hereinafter denoted as T1) is obtained by subtracting a time delay due to audio capture from the current playback moment of the first user's background audio.
  • FIG. 3 is a schematic diagram illustrating acquiring a background audio playback moment corresponding to an audio segment according to an exemplary embodiment of the present disclosure. As shown in FIG. 3 , in the case where the user sings along with the background audio after the background audio is played, the current playing time (which may be represented as T0 ) of the background audio played locally by the user may be acquired.
  • because of the time delay due to audio capture (i.e., the time difference between when a sound, such as the user's singing, is emitted and when it is captured by a capture device such as a microphone; this delay may be denoted as Tr), the playback moment of the background audio corresponding to the user's audio segment is not T0 but an earlier moment, namely T0 − Tr.
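As a minimal illustration of the timestamping above, the sketch below computes T1 = T0 − Tr and bundles it with the captured segment. The function and field names are hypothetical; the disclosure does not prescribe a concrete API.

```python
def background_audio_playback_time(t0_ms: float, capture_delay_ms: float) -> float:
    """Return T1, the background-audio moment corresponding to the captured
    segment: the current local playback moment T0 minus the capture delay Tr."""
    return t0_ms - capture_delay_ms


def package_segment(segment: bytes, t0_ms: float, capture_delay_ms: float) -> dict:
    """Bundle a captured audio segment with its timestamp T1, ready to be
    sent to the other chorus participant."""
    return {
        "segment": segment,
        "t1_ms": background_audio_playback_time(t0_ms, capture_delay_ms),
    }
```

For example, with T0 = 12000 ms and Tr = 40 ms, the segment would be stamped with T1 = 11960 ms.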
  • in step S202, the playback position of the second user's background audio may be adjusted according to the background audio playback time, so that the adjusted background audio of the second user is aligned with the received audio segment of the first user.
  • the background audio of the second user is the same as the background audio of the first user.
  • the background audio of the second user refers to the background audio played locally by the second user. Alignment of the second user's adjusted background audio with the received audio segment of the first user means that there is no deviation between the background audio played locally by the second user and the received audio segment of the first user; in short, the first user's singing sounds in harmony with the accompaniment played locally by the second user.
  • the adjustment of the background audio playback position is performed because there is a transmission delay when the audio clip of the first user is transmitted to the user equipment of the second user.
  • the playback position of the second user's background audio may first be determined when the first user's audio segment is received; then, if that playback position falls within the time interval from the end of the second user's singing to the beginning of the first user's singing, or within the time interval from the start of the second user's background audio to the beginning of the first user's singing, the playback position is adjusted to correspond to the received background audio playback time.
  • FIG. 4 is a schematic diagram of adjusting the playback position of background audio according to an exemplary embodiment of the present disclosure.
  • due to the transmission delay, B actually hears a sentence sung by A only after a period of time. At this point, B perceives that A's singing and the accompaniment played locally by B are misaligned (A's singing lags behind B's own accompaniment). For example, when B receives the singing that A sang at time T1, the background audio played locally by B has already advanced to time T2, where T2 = T1 + transmission delay Td. In this case, the playback position of the background audio played locally by B can be adjusted according to T1.
  • B can rewind his accompaniment from time T2 back to time T1 and resume playing from there, after which B's accompaniment is aligned with A's singing.
  • the rollback operation makes the user feel that the music has jumped backwards, which affects the listening experience. Performing the adjustment of the background audio playback position within an interval in which the second user is not singing reduces the impact that the adjustment has on the user's listening experience.
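The rollback decision above can be sketched as follows. The helper assumes, hypothetically, that the intervals in which the local user is not singing are known (for example, from the lyrics file), and it seeks back only when the current position falls inside such an interval.

```python
def adjust_playback_position(t2_ms, t1_ms, non_singing_intervals):
    """Return the new local playback position.

    t2_ms: current local background-audio position (T2 = T1 + Td).
    t1_ms: background-audio playback time received with the remote segment.
    non_singing_intervals: (start_ms, end_ms) spans where the local user is
    silent, e.g. between the end of B's part and the start of A's part.
    """
    for start_ms, end_ms in non_singing_intervals:
        if start_ms <= t2_ms <= end_ms:
            return t1_ms  # rollback: realign the local accompaniment to T1
    return t2_ms  # seeking now would be audible, so leave playback alone
```

For instance, with a non-singing interval of (10000, 12000) ms, a local position of 11500 ms would be rolled back to the received T1, while a position of 9000 ms would be left untouched.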
  • the adjustment of the background audio playback position of the second user may also be performed in other time intervals.
  • in step S202, the sub-interval with the smallest average audio energy may first be determined within the time interval from the end of the second user's singing to the beginning of the first user's singing, or within the time interval from the start of the second user's background audio to the beginning of the first user's singing; then, within that sub-interval, the playback position of the second user's background audio is adjusted according to the background audio playback time. Since the playback position is adjusted within the sub-interval where the average audio energy is smallest, the influence of the adjustment on the user's listening experience can be minimized.
  • the playback position of the second user's background audio may first be determined when the first user's audio segment is received; then, in response to that playback position being within the above-mentioned sub-interval, the playback position is adjusted to correspond to the received background audio playback time.
  • the average audio energy of each sub-interval in the time interval can be calculated according to the following formula, and the sub-interval with the smallest average energy is determined from the calculated values:
  • E(ab) = ( Σ_{i=TSa+1}^{TSb} S(i)² ) / (TSb − TSa), where:
  • E(ab) is the average energy of sub-interval ab;
  • TSb is the number of sampling points up to time b;
  • TSa is the number of sampling points up to time a;
  • TSb − TSa is the number of sampling points in interval ab;
  • S(i) is the amplitude of the i-th sampling point.
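A minimal sketch of this selection, assuming the average energy of a sub-interval is the mean squared sample amplitude (the exact energy definition and the function names are assumptions, not prescribed by the disclosure):

```python
def average_energy(samples, ts_a, ts_b):
    """E(ab): mean squared amplitude of the samples with indices in (TSa, TSb]."""
    return sum(s * s for s in samples[ts_a:ts_b]) / (ts_b - ts_a)


def quietest_subinterval(samples, subintervals):
    """Return the (TSa, TSb) pair whose average energy E(ab) is smallest."""
    return min(subintervals, key=lambda ab: average_energy(samples, ab[0], ab[1]))
```

The playback-position adjustment would then be confined to the sub-interval this function returns.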
  • before the audio segment and the background audio playback time are received, a communication connection can be established with the first user; the background audio is then played, and the received audio segment of the first user is played.
  • the first user and the second user may connect microphones first, and then select a song to sing, and then both start to play the same background music at the same time.
  • the second user can adjust the playback position of the background audio played locally by the second user when receiving the audio clip of the first user.
  • symmetrically, the audio segment of the second user collected during singing and the background audio playback time corresponding to the second user's audio segment can be sent to the first user, so that the first user can adjust the playback position of the first user's background audio according to the received background audio playback time, and the adjusted background audio of the first user is aligned with the received audio segment of the second user.
  • the audio processing method shown in FIG. 2 may further include: sending to the first user the audio segment of the second user collected during singing and the background audio playback time of the second user's background audio corresponding to that segment.
  • the audio segment of the second user collected during singing and the background audio playback time of the second user's background audio corresponding to the audio segment of the second user may be sent to the first user at predetermined time intervals.
  • the time interval for transmitting the audio segment of the second user may be the same as or different from the time interval for receiving the audio segment of the first user.
  • the audio processing method according to the exemplary embodiment of the present disclosure has been described above with reference to FIGS. 2 to 4 . According to the above audio processing method, deviations between the audio segment sent by the other party and the local background audio due to transmission delay can be avoided. In addition, the embodiments of the present disclosure can also reduce the impact on the sense of hearing when adjusting the playback position of the background audio.
  • FIG. 5 is a schematic diagram of an application scenario of an audio processing method according to an exemplary embodiment of the present disclosure.
  • FIG. 5 shows that in the online KTV scene, when the first user and the second user perform a K-song chorus, the two users jointly sing a song "The Girl by the Bridge".
  • the devices of the first user (A) and the second user (B) can display the lyrics corresponding to the background music; the lyrics file marks, for each sentence, whether A or B sings it, and A and B take turns singing their own sentences.
  • at the switch from B to A (i.e., B has finished singing and A starts singing), or when A sings the first sentence of the song, B needs to perform a rollback operation (i.e., the operation of adjusting the playback position of the background audio described above with reference to FIG. 2 to FIG. 4) according to the playback time T1 of the background audio received from A, so that the background music is played from position T1.
  • the rollback operation may be performed according to the playing time T1 of the background audio received from the other party.
  • FIG. 6 is a block diagram of an audio processing apparatus of an exemplary embodiment of the present disclosure.
  • the audio processing apparatus 600 may include a receiving unit 601 and an adjusting unit 602 .
  • the receiving unit 601 may be configured to receive an audio clip of the first user collected during singing and a background audio playback moment of the first user's background audio corresponding to the audio clip.
  • the adjustment unit 602 may be configured to adjust the playback position of the background audio of the second user according to the background audio playback time, so that the adjusted background audio of the second user is aligned with the received audio segment of the first user.
  • the background audio of the second user is the same as the background audio of the first user.
  • the audio processing apparatus 600 may further include a sending unit (not shown), which may send to the first user the audio segment of the second user collected during singing and the background audio playback time of the second user's background audio corresponding to the second user's audio segment.
  • the audio processing apparatus 600 may further include a communication unit (not shown) and an audio playback unit (not shown).
  • the communication unit may establish a communication connection with the first user before receiving the audio segment and the background audio playing time.
  • the audio playing unit may play the background audio and play the received audio clip of the first user.
  • the audio playing unit may also play the collected audio of the second user.
  • the audio processing method shown in FIG. 2 can be performed by the audio processing apparatus 600 shown in FIG. 6, and the receiving unit 601 and the adjusting unit 602 can respectively perform the operations corresponding to steps S201 and S202 in FIG. 2.
  • although the audio processing apparatus 600 is described above as being divided into units that each perform the corresponding processing, it is clear to those skilled in the art that the processing performed by the above units can also be performed by the apparatus 600 without any specific unit division or clear demarcation between the units.
  • the audio processing apparatus 600 may further include other units, for example, a storage unit.
  • FIG. 7 is a block diagram of an electronic device 700 according to an embodiment of the present disclosure.
  • an electronic device 700 may include at least one memory 701 and at least one processor 702. The at least one memory stores a set of computer-executable instructions that, when executed by the at least one processor, cause the audio processing method according to the embodiments of the present disclosure to be performed.
  • the electronic device may be a PC, a tablet device, a personal digital assistant, a smartphone, or another device capable of executing the above set of instructions.
  • the electronic device does not have to be a single electronic device, but can also be any set of devices or circuits that can individually or jointly execute the above-mentioned instructions (or instruction sets).
  • the electronic device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
  • a processor may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor.
  • processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
  • the processor may execute instructions or code stored in memory, which may also store data. Instructions and data may also be sent and received over a network via a network interface device, which may employ any known transport protocol.
  • the memory may be integrated with the processor, e.g., RAM or flash memory arranged within an integrated circuit microprocessor or the like. Additionally, the memory may comprise a separate device such as an external disk drive, a storage array, or any other storage device that may be used by a database system.
  • the memory and the processor may be operatively coupled, or may communicate with each other, e.g., through I/O ports, network connections, etc., to enable the processor to read files stored in the memory.
  • the electronic device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device can be connected to each other via a bus and/or a network.
  • a computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform the audio processing method according to an exemplary embodiment of the present disclosure.
  • Examples of the computer-readable storage medium herein include: Read Only Memory (ROM), Random Access Programmable Read Only Memory (PROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, Hard Disk Drive (HDD), Solid State Drive (SSD), card memory (such as a multimedia card or a Secure Digital (SD) card), etc.
  • the computer program in the above-mentioned computer readable storage medium can be executed in an environment deployed in a computer device such as a client, a host, a proxy device, a server, etc.
  • the computer program and any associated data, data files, and data structures may be distributed over networked computer systems so that they are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
  • a computer program product may also be provided, including computer instructions, which when executed by a processor implement the audio processing method according to an exemplary embodiment of the present disclosure.
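The adjustment performed by the adjusting unit 602 (step 202) can be illustrated with a minimal sketch. The class name, the seek threshold, and the delay-compensation formula below are illustrative assumptions for explanation only, not the implementation prescribed by the present disclosure:

```python
class BackgroundAudioAligner:
    """Illustrative sketch: keeps the second user's background audio aligned
    with the first user's incoming audio clips, given the background audio
    playback time received along with each clip."""

    def __init__(self, max_drift_ms=50):
        # Small drift is tolerated so local playback is not constantly reseeked.
        self.max_drift_ms = max_drift_ms
        self.local_position_ms = 0  # playback position of the local background audio

    def on_clip_received(self, remote_playback_ms, transit_delay_ms=0):
        # remote_playback_ms: background audio playback time received with the clip.
        # transit_delay_ms:   estimated network delay of the clip, if known.
        target_ms = remote_playback_ms + transit_delay_ms
        drift_ms = target_ms - self.local_position_ms
        if abs(drift_ms) > self.max_drift_ms:
            # Seek the local background audio so both users hear the same
            # accompaniment position under the first user's audio clip.
            self.local_position_ms = target_ms
        return self.local_position_ms
```

For example, if the local accompaniment is at 10 000 ms and a clip arrives stamped 10 200 ms with a 40 ms estimated transit delay, the 240 ms drift exceeds the threshold and the local playback seeks to 10 240 ms, whereas a 10 ms drift would be left untouched.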

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present disclosure relates to an audio processing method, comprising: receiving an audio clip of a first user collected during singing and a background audio playback time, of the first user's background audio, corresponding to the audio clip (S201); and adjusting a playback position of a second user's background audio according to the background audio playback time, so that the adjusted background audio of the second user is aligned with the received audio clip of the first user (S202), the second user's background audio being the same as the first user's background audio. The present disclosure further relates to an audio processing apparatus, an electronic device, and a storage medium.
PCT/CN2021/113086 2021-01-26 2021-08-17 Audio processing method and audio processing apparatus WO2022160669A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110110917.7A CN112927666B (zh) 2021-01-26 2021-01-26 Audio processing method and apparatus, electronic device, and storage medium
CN202110110917.7 2021-01-26

Publications (1)

Publication Number Publication Date
WO2022160669A1 true WO2022160669A1 (fr) 2022-08-04

Family ID: 76166954

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/113086 WO2022160669A1 (fr) 2021-01-26 2021-08-17 Audio processing method and audio processing apparatus

Country Status (2)

Country Link
CN (1) CN112927666B (fr)
WO (1) WO2022160669A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112927666B (zh) * 2021-01-26 2023-11-28 北京达佳互联信息技术有限公司 Audio processing method and apparatus, electronic device, and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002251194A (ja) * 2001-10-05 2002-09-06 Yamaha Corp Karaoke apparatus
CN111261133A (zh) * 2020-01-15 2020-06-09 腾讯科技(深圳)有限公司 Singing processing method and apparatus, electronic device, and storage medium
CN111383669A (zh) * 2020-03-19 2020-07-07 杭州网易云音乐科技有限公司 Multimedia file uploading method, apparatus, device, and computer-readable storage medium
CN111524494A (zh) * 2020-04-27 2020-08-11 腾讯音乐娱乐科技(深圳)有限公司 Remote real-time chorus method and apparatus, and storage medium
CN112017622A (zh) * 2020-09-04 2020-12-01 广州趣丸网络科技有限公司 Audio data alignment method, apparatus, device, and storage medium
CN112118062A (zh) * 2019-06-19 2020-12-22 华为技术有限公司 Multi-terminal multimedia data communication method and system
CN112148248A (zh) * 2020-09-28 2020-12-29 腾讯音乐娱乐科技(深圳)有限公司 Online karaoke room implementation method, electronic device, and computer-readable storage medium
CN112927666A (zh) * 2021-01-26 2021-06-08 北京达佳互联信息技术有限公司 Audio processing method and apparatus, electronic device, and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9406341B2 (en) * 2011-10-01 2016-08-02 Google Inc. Audio file processing to reduce latencies in play start times for cloud served audio files
CN107093419B (zh) * 2016-02-17 2020-04-24 广州酷狗计算机科技有限公司 Dynamic accompaniment singing method and apparatus
KR101987473B1 (ko) * 2017-12-12 2019-06-10 미디어스코프 주식회사 System for synchronizing accompaniment and singing voice in an online karaoke service, and apparatus for performing the same
CN109033335B (zh) * 2018-07-20 2021-03-26 广州酷狗计算机科技有限公司 Audio recording method, apparatus, terminal, and storage medium
CN110267081B (zh) * 2019-04-02 2021-01-22 北京达佳互联信息技术有限公司 Live stream processing method, apparatus, system, electronic device, and storage medium
CN110491358B (zh) * 2019-08-15 2023-06-27 广州酷狗计算机科技有限公司 Method, apparatus, device, system, and storage medium for audio recording
CN111028818B (zh) * 2019-11-14 2022-11-22 北京达佳互联信息技术有限公司 Chorus method and apparatus, electronic device, and storage medium


Also Published As

Publication number Publication date
CN112927666A (zh) 2021-06-08
CN112927666B (zh) 2023-11-28

Similar Documents

Publication Publication Date Title
US10043504B2 (en) Karaoke processing method, apparatus and system
WO2020253806A1 (fr) Procédé et appareil de génération d'une vidéo d'affichage, dispositif et support de stockage
JP2018519538A (ja) カラオケ処理方法およびシステム
US11120782B1 (en) System, method, and non-transitory computer-readable storage medium for collaborating on a musical composition over a communication network
JP6785904B2 (ja) 情報プッシュ方法及び装置
WO2017177621A1 (fr) Procédé de synchronisation de données dans un réseau local, et appareil et terminal utilisateur associés
JP2020174339A (ja) 段落と映像を整列させるための方法、装置、サーバー、コンピュータ可読記憶媒体およびコンピュータプログラム
WO2022142619A1 (fr) Procédé et dispositif d'appel audio ou vidéo privé
US20220047954A1 (en) Game playing method and system based on a multimedia file
CN110312162A (zh) 精选片段处理方法、装置、电子设备及可读介质
US20170092253A1 (en) Karaoke system
WO2022110943A1 (fr) Procédé et appareil de prévisualisation de la parole
US9405501B2 (en) System and method for automatic synchronization of audio layers
WO2022160669A1 (fr) Procédé de traitement audio et appareil de traitement audio
CN112365868B (zh) 声音处理方法、装置、电子设备及存储介质
CN112687247B (zh) 音频对齐方法、装置、电子设备及存储介质
US20160307551A1 (en) Multifunctional Media Players
US11862187B2 (en) Systems and methods for jointly estimating sound sources and frequencies from audio
WO2022227625A1 (fr) Procédé et appareil de traitement de signaux
JP6170692B2 (ja) 通信障害時にデュエット歌唱を継続可能な通信カラオケシステム
US11297368B1 (en) Methods, systems, and apparatuses and live audio capture
US20220303152A1 (en) Recordation of video conference based on bandwidth issue(s)
JP4981631B2 (ja) コンテンツ送信装置、コンテンツ送信方法及びコンピュータ・プログラム
CN116450256A (zh) 音频特效的编辑方法、装置、设备及存储介质
US11522936B2 (en) Synchronization of live streams from web-based clients

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21922264

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06.11.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21922264

Country of ref document: EP

Kind code of ref document: A1