CN115242757B - Data processing method and device, electronic equipment and storage medium - Google Patents

Publication number
CN115242757B
Authority
CN
China
Prior art keywords
video conference
conference terminal
original audio
video
audio signal
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211147902.9A
Other languages
Chinese (zh)
Other versions
CN115242757A (en)
Inventor
袁磊
赵卫东
王守帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Weitai Shixin Technology Co ltd
Original Assignee
Beijing Weitai Shixin Technology Co ltd
Application filed by Beijing Weitai Shixin Technology Co ltd
Priority to CN202211147902.9A
Publication of CN115242757A
Application granted
Publication of CN115242757B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application provides a data processing method, a device, electronic equipment and a storage medium. The method comprises the following steps: each video conference terminal accessing the same video conference encodes the audio signal it has currently acquired to obtain an audio code stream; the video conference terminal sends the audio code stream to every reference video conference terminal, that is, every terminal other than itself; when the video conference terminal receives reference audio code streams sent by the reference video conference terminals, it decodes each reference audio code stream to obtain a reference audio signal; if there are at least two reference audio signals, the video conference terminal mixes them to obtain a mixed audio signal; and the video conference terminal plays the mixed audio signal. A video conference can thus be held without configuring an MCU, which reduces the up-front cost of the video conference.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a data processing method, a data processing device, an electronic device, and a storage medium.
Background
In a video conference system, a multipoint control unit (MCU) is required. The MCU is a multimedia information exchanger used for multipoint calling and connection; it implements functions such as video broadcasting, video selection, audio mixing, and data broadcasting, and completes the aggregation and switching of signals among terminals. Specifically, each participant (i.e., terminal) in the same video conference encodes its local video and audio and transmits the encoded audio/video code stream to the MCU through a transmission protocol. After receiving the audio and video signals of each participant, the MCU switches the video according to application requirements and sends the appropriate video to each participant; it also decodes the received audio signals, mixes them, and sends the mixed audio to each participant. In this way, each participant can see the video of the other participants and hear the audio of each participant at the same time.
MCUs are powerful, and some high-performance MCUs can even process more than 300 audio and video channels simultaneously (that is, allow more than 300 terminals to participate in the same video conference). However, the more powerful the MCU, the higher its cost. Configuring an MCU is practical for large-scale video conferences, but for small-scale video conferences, such as 3-party or 5-party conferences, configuring an MCU greatly raises the up-front cost of the video conference.
Disclosure of Invention
In view of the foregoing, an object of the present application is to provide a data processing method, apparatus, electronic device, and storage medium with which a video conference can be held without configuring an MCU, thereby reducing the up-front cost of the video conference.
In a first aspect, an embodiment of the present application provides a data processing method, where the method includes:
for each video conference terminal in a plurality of video conference terminals accessing the same video conference, the video conference terminal encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream;
the video conference terminal sends the audio code stream to each reference video conference terminal except the video conference terminal;
under the condition that the video conference terminal receives at least one reference audio code stream sent by the reference video conference terminal, decoding the reference audio code stream by the video conference terminal for each reference audio code stream to obtain a reference original audio signal corresponding to the reference audio code stream;
if the reference original audio signals comprise at least two, the video conference terminal carries out audio mixing processing on the reference original audio signals to obtain mixed audio signals;
the video conference terminal plays the mixed audio signal.
In one possible implementation manner, before the video conference terminal encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream, the method further includes:
the video conference terminal judges whether the volume of the original audio signal reaches a preset threshold value or not;
the step in which the video conference terminal encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream comprises:
if the volume of the original audio signal reaches the preset threshold, the video conference terminal encodes the original audio signal to obtain the audio code stream.
In a possible implementation manner, each video conference terminal is assigned a priority; when the number of the reference original audio signals exceeds a preset number, the preset number being larger than 2, the step in which the video conference terminal performs audio mixing processing on the reference original audio signals to obtain the mixed audio signal comprises:
the video conference terminal performs audio mixing processing on target reference original audio signals to obtain the mixed audio signal, wherein the target reference original audio signals comprise first target reference original audio signals and a second target reference original audio signal; the first target reference original audio signals are the (preset number minus one) reference original audio signals with the highest volume; the second target reference original audio signal is the reference original audio signal with the highest priority among all the reference original audio signals other than the first target reference original audio signals; and, for each reference original audio signal, the corresponding priority is the priority of the reference video conference terminal that sent the reference audio code stream corresponding to that reference original audio signal to the video conference terminal.
In one possible implementation, for any two target reference original audio signals with different priorities, the third target reference original audio signal is given a larger weight in the mix than the fourth target reference original audio signal, wherein the third target reference original audio signal is the one of the two with the higher priority, and the fourth target reference original audio signal is the one of the two with the lower priority.
In one possible embodiment, the method further comprises:
the video conference terminal encodes an original video signal which is currently acquired by the video conference terminal, and a video code stream is obtained;
the video conference terminal sends the video code stream to each reference video conference terminal;
if the video conference terminal receives at least one reference video code stream sent by the reference video conference terminal, decoding the reference video code stream by the video conference terminal for each reference video code stream to obtain a reference original video signal corresponding to the reference video code stream;
the video conference terminal allocates a playing area for each reference original video signal on a screen display area of the video conference terminal, wherein the playing area allocated to a target reference original video signal is larger than the playing area allocated to a candidate reference original video signal, the target reference original video signal is a reference original video signal with the highest corresponding priority, the candidate reference original video signal is a reference original video signal except for the target reference original video signal in all the reference original video signals, and for each reference original video signal, the priority corresponding to the reference original video signal is the priority of a reference video conference terminal which sends a reference video code stream corresponding to the reference original video signal to the video conference terminal;
for each reference original video signal, the video conference terminal plays the reference original video signal on the playing area allocated to the reference original video signal.
In a second aspect, embodiments of the present application further provide a data processing apparatus, where the apparatus includes:
the audio coding module is used for coding an original audio signal currently acquired by each video conference terminal in a plurality of video conference terminals accessed to the same video conference to obtain an audio code stream;
an audio transmitting module, configured to transmit the audio code stream to each reference videoconference terminal of the plurality of videoconference terminals except the videoconference terminal;
the audio decoding module is used for decoding each reference audio code stream under the condition that the video conference terminal receives at least one reference audio code stream sent by the reference video conference terminal, so as to obtain a reference original audio signal corresponding to the reference audio code stream;
the audio mixing module is used for carrying out audio mixing processing on the reference original audio signals if the reference original audio signals comprise at least two audio signals, so as to obtain mixed audio signals;
and the audio playing module is used for playing the mixed audio signal.
In one possible embodiment, the apparatus further comprises:
the judging module is used for judging whether the volume of the original audio signal reaches a preset threshold before the audio coding module codes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream;
the audio coding module is specifically configured to:
and if the volume of the original audio signal reaches the preset threshold, encoding the original audio signal to obtain the audio code stream.
In a possible implementation manner, each video conference terminal is assigned a priority; when the number of the reference original audio signals exceeds a preset number, the preset number being larger than 2, the audio mixing module is specifically configured to:
mix target reference original audio signals to obtain the mixed audio signal, wherein the target reference original audio signals comprise first target reference original audio signals and a second target reference original audio signal; the first target reference original audio signals are the (preset number minus one) reference original audio signals with the highest volume; the second target reference original audio signal is the reference original audio signal with the highest priority among all the reference original audio signals other than the first target reference original audio signals; and, for each reference original audio signal, the corresponding priority is the priority of the reference video conference terminal that sent the reference audio code stream corresponding to that reference original audio signal to the video conference terminal.
In one possible implementation, for any two target reference original audio signals with different priorities, the third target reference original audio signal is given a larger weight in the mix than the fourth target reference original audio signal, wherein the third target reference original audio signal is the one of the two with the higher priority, and the fourth target reference original audio signal is the one of the two with the lower priority.
In one possible embodiment, the apparatus further comprises:
the video coding module is used for coding the original video signal currently acquired by the video conference terminal to obtain a video code stream;
the video sending module is used for sending the video code stream to each reference video conference terminal;
the video decoding module is used for decoding each reference video code stream to obtain a reference original video signal corresponding to the reference video code stream if the video conference terminal receives at least one reference video code stream sent by the reference video conference terminal;
the allocation module is configured to allocate a play area for each reference original video signal on a screen display area of the video conference terminal, where the play area allocated to a target reference original video signal is greater than the play area allocated to a candidate reference original video signal, the target reference original video signal is a reference original video signal with the highest corresponding priority, the candidate reference original video signal is a reference original video signal except for the target reference original video signal in all the reference original video signals, and for each reference original video signal, the priority corresponding to the reference original video signal is the priority of a reference video conference terminal that sends a reference video code stream corresponding to the reference original video signal to the video conference terminal;
and the video playing module is used for playing the reference original video signal on a playing area allocated for the reference original video signal for each reference original video signal.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over a bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the data processing method of any of the first aspects.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data processing method according to any of the first aspects.
According to the data processing method, device, electronic equipment and storage medium provided by the embodiments of the present application, a video conference can be held without configuring an MCU, which reduces the up-front cost of the video conference.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a flowchart of a data processing method provided in an embodiment of the present application;
FIG. 2 shows a flowchart of another data processing method provided in an embodiment of the present application;
FIG. 3 shows a schematic structural diagram of a data processing apparatus provided in an embodiment of the present application;
FIG. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. It should be understood that the accompanying drawings are for illustration and description only and are not intended to limit the protection scope of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this application, illustrates operations implemented according to some embodiments of the present application. It should be understood that the operations of a flowchart need not be performed in the order shown, and that steps with no logical dependency on one another may be performed in reverse order or concurrently. Moreover, those skilled in the art may, under the guidance of this application, add one or more other operations to a flowchart or remove one or more operations from it.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the presence of the features stated hereinafter, but not to exclude the addition of other features.
For the sake of understanding the present embodiment, a data processing method, apparatus, electronic device, and storage medium provided in the embodiments of the present application are described in detail.
Referring to fig. 1, a flowchart of a data processing method according to an embodiment of the present application is shown, where the method includes:
S101, for each video conference terminal in a plurality of video conference terminals accessing the same video conference, the video conference terminal encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream.
For example, the number of video conference terminals may be 2 to 5. If too many video conference terminals access the same video conference, the terminals are prone to stuttering (depending on the performance of the video conference terminals).
S102, the video conference terminal sends the audio code stream to each reference video conference terminal except the video conference terminal.
For example, there are three video conference terminals, including: video conference terminal a, video conference terminal B, video conference terminal C.
Then, for the video conference terminal A, the video conference terminal A sends an audio code stream o, obtained by encoding the original audio signal a currently acquired by the video conference terminal A, to the video conference terminal B and the video conference terminal C;
for the video conference terminal B, the video conference terminal B sends an audio code stream p, obtained by encoding the original audio signal b currently acquired by the video conference terminal B, to the video conference terminal A and the video conference terminal C;
for the video conference terminal C, the video conference terminal C sends an audio code stream q, obtained by encoding the original audio signal c currently acquired by the video conference terminal C, to the video conference terminal A and the video conference terminal B.
For each video conference terminal, the video conference terminal can receive at most n-1 reference audio code streams, wherein n is the number of the video conference terminals.
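The fan-out in the three-terminal example can be sketched as follows. This is a minimal illustration only; the function name `recipients` and the use of plain strings as terminal identifiers are assumptions made for the sketch and do not come from the application.

```python
def recipients(terminals, sender):
    """Return every terminal in the conference except the sender (hypothetical helper)."""
    return [t for t in terminals if t != sender]


terminals = ["A", "B", "C"]

# Each terminal sends its own audio code stream to the n - 1 other terminals,
# so each terminal can in turn receive at most n - 1 reference code streams.
fan_out = {t: recipients(terminals, t) for t in terminals}
max_incoming = len(terminals) - 1
```

With n terminals, each terminal therefore handles up to n - 1 outgoing and n - 1 incoming streams, which is consistent with the note above that terminal performance limits the practical conference size.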
S103, under the condition that the video conference terminal receives at least one reference audio code stream sent by the reference video conference terminal, for each reference audio code stream, the video conference terminal decodes the reference audio code stream to obtain a reference original audio signal corresponding to the reference audio code stream.
S104, if there are at least two reference original audio signals, the video conference terminal performs audio mixing processing on the reference original audio signals to obtain a mixed audio signal.
That is, in the present application the mixing work that would otherwise be performed by the MCU is performed locally by the video conference terminal.
S105, the video conference terminal plays the mixed audio signal.
If the reference original audio signal only includes one, the video conference terminal directly plays the reference original audio signal.
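Local mixing in steps S104 and S105 can be sketched as follows. The application does not specify a mixing algorithm, so this sketch assumes a simple sum-and-clip mix over equal-length frames of 16-bit PCM samples; the function name `mix` and the sample format are assumptions.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(frames):
    """Mix equal-length frames of 16-bit PCM samples by summing and clipping.

    Assumed approach: the source does not say how the mix is computed.
    """
    if not frames:
        return []
    if len(frames) == 1:
        # Only one reference signal: it is played directly, without mixing.
        return list(frames[0])
    # Sum the samples at each position, then clip to the 16-bit range.
    return [max(INT16_MIN, min(INT16_MAX, sum(column))) for column in zip(*frames)]
```

For instance, `mix([[1, 2], [3, 4]])` yields `[4, 6]`, and sums outside the 16-bit range are clipped rather than allowed to wrap around.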
In one possible implementation manner, before the video conference terminal encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream, the method further includes:
the video conference terminal judges whether the volume of the original audio signal reaches a preset threshold value.
The step in which the video conference terminal encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream comprises:
if the volume of the original audio signal reaches the preset threshold, the video conference terminal encodes the original audio signal to obtain the audio code stream.
That is, the video conference terminal performs the step of encoding the original audio signal only when the volume of the original audio signal reaches a certain level (i.e., a preset threshold) or more, and does not perform the step of encoding the original audio signal if the volume of the original audio signal does not reach the preset threshold.
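The volume gate can be sketched as follows. The application does not say how volume is measured, so this sketch assumes the root-mean-square (RMS) amplitude of one frame of PCM samples; the names `rms` and `should_encode` are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples (assumed volume measure)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def should_encode(frame, threshold):
    """Encode only when the frame's volume reaches the preset threshold."""
    return rms(frame) >= threshold
```

A frame that fails the check is not encoded, so no audio code stream is produced for it.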
In a possible implementation manner, each video conference terminal is assigned a priority; when the number of the reference original audio signals exceeds a preset number, the preset number being larger than 2, the step in which the video conference terminal performs audio mixing processing on the reference original audio signals to obtain the mixed audio signal comprises:
the video conference terminal performs audio mixing processing on target reference original audio signals to obtain the mixed audio signal, wherein the target reference original audio signals comprise first target reference original audio signals and a second target reference original audio signal; the first target reference original audio signals are the (preset number minus one) reference original audio signals with the highest volume; the second target reference original audio signal is the reference original audio signal with the highest priority among all the reference original audio signals other than the first target reference original audio signals; and, for each reference original audio signal, the corresponding priority is the priority of the reference video conference terminal that sent the reference audio code stream corresponding to that reference original audio signal to the video conference terminal.
Illustratively, the number of reference original audio signals is 5 and the preset number is 4. The priorities, in order from high to low, are class A, class B, and class C. The reference original audio signals are: reference original audio signal 1 (priority class C, volume 100 dB), reference original audio signal 2 (priority class B, volume 80 dB), reference original audio signal 3 (priority class A, volume 60 dB), reference original audio signal 4 (priority class B, volume 70 dB), and reference original audio signal 5 (priority class C, volume 120 dB).
Then the first target reference original audio signals, i.e., the three (the preset number 4 minus one) reference original audio signals with the highest volume, are: reference original audio signal 5 (highest volume, 120 dB), reference original audio signal 1 (second highest volume, 100 dB), and reference original audio signal 2 (third highest volume, 80 dB).
The second target reference original audio signal is the reference original audio signal 3, which has the highest priority (class A) among the remaining reference original audio signals 3 and 4.
In some cases, two or more of the remaining reference original audio signals share the highest priority. The one with the highest volume among them is then selected as the second target reference original audio signal for mixing. That is, only one second target reference original audio signal takes part in the mix, so exactly the preset number of reference original audio signals are mixed.
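The selection rule and the worked example above can be sketched as follows. The `Signal` type, the `PRIORITY_RANK` table, and the function name `select_targets` are assumptions introduced for illustration; only the rule itself (the loudest preset-number-minus-one signals, plus the highest-priority remaining signal, with ties broken by volume) is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    ident: int
    priority: str   # "A" is the highest class, "C" the lowest
    volume: float   # volume in dB, as in the example


PRIORITY_RANK = {"A": 0, "B": 1, "C": 2}  # lower rank means higher priority


def select_targets(signals, preset_number):
    """Pick the signals to mix when there are more than preset_number of them."""
    if len(signals) <= preset_number:
        return list(signals)
    by_volume = sorted(signals, key=lambda s: s.volume, reverse=True)
    first = by_volume[:preset_number - 1]   # first target signals: the loudest ones
    rest = by_volume[preset_number - 1:]
    # Second target signal: highest priority among the rest; ties go to the louder one.
    second = min(rest, key=lambda s: (PRIORITY_RANK[s.priority], -s.volume))
    return first + [second]


example = [
    Signal(1, "C", 100), Signal(2, "B", 80), Signal(3, "A", 60),
    Signal(4, "B", 70), Signal(5, "C", 120),
]
chosen = [s.ident for s in select_targets(example, preset_number=4)]
```

For the example data, `chosen` contains signals 5, 1 and 2 (by volume) followed by signal 3 (by priority), matching the selection described above.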
Specifically, so that the video conference terminal knows the priority corresponding to each reference original audio signal, for each reference original audio signal the first target reference video conference terminal may, before encoding the reference original audio signal, add to it an identifier characterizing the priority of the first target reference video conference terminal. Here, the first target reference video conference terminal is the reference video conference terminal that sends the reference audio code stream corresponding to that reference original audio signal to the video conference terminal.
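One hypothetical way to carry such an identifier is sketched below. The application does not define a format for the identifier, so the one-byte header, the `PRIORITY_CODE` table, and the helper names `tag` and `untag` are all assumptions made for illustration.

```python
import struct

PRIORITY_CODE = {"A": 0, "B": 1, "C": 2}            # assumed numeric codes
CODE_PRIORITY = {code: name for name, code in PRIORITY_CODE.items()}


def tag(priority, payload: bytes) -> bytes:
    """Prepend a one-byte priority identifier to the sender's audio data."""
    return struct.pack("!B", PRIORITY_CODE[priority]) + payload


def untag(packet: bytes):
    """Split a tagged packet back into (priority, audio data) on the receiver."""
    (code,) = struct.unpack("!B", packet[:1])
    return CODE_PRIORITY[code], packet[1:]
```

The receiving terminal can then read the sender's priority without any separate lookup, for example `untag(tag("B", data))` returns `("B", data)`.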
In this way, even if an important participant (i.e., a video conference terminal with a higher priority) speaks at a low volume, that participant's audio still takes part in the mix, so the remaining participants (i.e., the remaining video conference terminals) can hear it.
In one possible implementation, for any two target reference original audio signals with different priorities, the third target reference original audio signal is given a larger weight in the mix than the fourth target reference original audio signal, wherein the third target reference original audio signal is the one of the two with the higher priority, and the fourth target reference original audio signal is the one of the two with the lower priority.
Any two target reference original audio signals with the same priority may be given the same weight in the mix.
In this way, even if an important participant (i.e., a video conference terminal with a higher priority) speaks at a low volume, that participant's audio still takes part in the mix and, because of its larger weight, is heard more clearly by the remaining participants (i.e., the remaining video conference terminals).
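Priority-weighted mixing can be sketched as follows. The application only requires that a higher-priority signal receive a larger weight and allows equal weights for equal priorities; the concrete weights in `PRIORITY_WEIGHT`, the normalization by the total weight, and the function name `weighted_mix` are assumptions.

```python
PRIORITY_WEIGHT = {"A": 3.0, "B": 2.0, "C": 1.0}  # hypothetical weights, higher priority is heavier


def weighted_mix(streams):
    """Mix (priority, samples) pairs as a weighted average of equal-length frames."""
    if not streams:
        return []
    weights = [PRIORITY_WEIGHT[priority] for priority, _ in streams]
    total = sum(weights)
    frames = [samples for _, samples in streams]
    # Each output sample is the weight-normalized sum of the input samples.
    return [
        sum(w * s for w, s in zip(weights, column)) / total
        for column in zip(*frames)
    ]
```

With one class A and one class C stream, the class A samples contribute three quarters of each output sample under these assumed weights.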
Referring to FIG. 2, a flowchart of another data processing method according to an embodiment of the present application is shown. In one possible implementation, the method further includes:
S201, the video conference terminal encodes the original video signal that it has currently acquired, to obtain a video code stream.
S202, the video conference terminal sends the video code stream to each reference video conference terminal.
S203, if the video conference terminal receives at least one reference video code stream sent by the reference video conference terminal, for each reference video code stream, the video conference terminal decodes the reference video code stream to obtain a reference original video signal corresponding to the reference video code stream.
S204, the video conference terminal allocates a playing area for each reference original video signal on a screen display area of the video conference terminal, wherein the playing area allocated to a target reference original video signal is larger than the playing area allocated to a candidate reference original video signal, the target reference original video signal is the reference original video signal with the highest corresponding priority, the candidate reference original video signal is the reference original video signal except for the target reference original video signal in all the reference original video signals, and for each reference original video signal, the priority corresponding to the reference original video signal is the priority of the reference video conference terminal which sends the reference video code stream corresponding to the reference original video signal to the video conference terminal.
That is, the play area allocated to the reference original video signal having the highest priority corresponding thereto is the largest.
The size of the play area allocated to each candidate reference original video signal may be the same or different.
Specifically, in order to enable the video conference terminal to know the priority corresponding to each reference original video signal, for each reference original video signal, before encoding the reference original video signal, the second target reference video conference terminal may add, to the reference original video signal, an identifier for characterizing the priority of the second target reference video conference terminal, where the second target reference video conference terminal is the reference video conference terminal that sends the reference video code stream corresponding to the reference original video signal to the video conference terminal.
S205, for each of the reference original video signals, the video conference terminal plays the reference original video signal on the play area allocated for the reference original video signal.
In addition, the play area allocated to the target reference original video signal may also be highlighted.
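One possible way to allocate play areas as in S204 is sketched below. The specific layout is an assumption chosen for illustration: the highest-priority stream takes the left three quarters of the screen and the remaining streams are stacked as equal tiles in the right quarter. The description only requires that the target stream's area be larger than each candidate's area. A larger priority number is assumed to mean higher priority, and `allocate_play_areas` is a hypothetical name.

```python
from typing import Dict, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


def allocate_play_areas(priorities: Dict[str, int], width: int, height: int) -> Dict[str, Rect]:
    """Give the highest-priority stream the largest rectangle on the screen."""
    if not priorities:
        return {}
    target = max(priorities, key=priorities.get)
    others = [name for name in priorities if name != target]
    if not others:
        return {target: (0, 0, width, height)}
    main_width = (width * 3) // 4
    layout = {target: (0, 0, main_width, height)}
    tile_height = height // len(others)
    for index, name in enumerate(others):
        # Candidate streams share the right-hand column as equal tiles.
        layout[name] = (main_width, index * tile_height, width - main_width, tile_height)
    return layout
```

Because every candidate tile is at most a quarter of the screen width, the target area is always the largest in this layout.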
With the above data processing method, a video conference can be held without configuring an MCU, which reduces the upfront cost of the video conference.
Referring to FIG. 3, a schematic structural diagram of a data processing apparatus according to an embodiment of the present application is shown, where the apparatus includes:
an audio encoding module 301, configured to encode, for each of a plurality of video conference terminals accessing the same video conference, an original audio signal currently acquired by the video conference terminal, to obtain an audio code stream;
an audio transmitting module 302, configured to transmit the audio code stream to each reference videoconferencing terminal of the plurality of videoconferencing terminals except the videoconferencing terminal;
an audio decoding module 303, configured to, when the video conference terminal receives at least one reference audio code stream sent by the reference video conference terminal, decode the reference audio code stream for each reference audio code stream, and obtain a reference original audio signal corresponding to the reference audio code stream;
a mixing module 304, configured to perform mixing processing on the reference original audio signal if the reference original audio signal includes at least two reference original audio signals, so as to obtain a mixed audio signal;
an audio playing module 305, configured to play the mixed audio signal.
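The cooperation of modules 301 to 305 can be sketched as a single round of an in-memory simulation in which every terminal encodes its own audio, sends it to every other terminal, and decodes and mixes what it receives, with no central MCU. Everything here is an assumption for illustration: `zlib` stands in for a real audio codec, samples are 16-bit integers, mixing is a plain clipped sum, a single received stream is played as-is, and the names `encode`, `decode`, `mix` and `run_round` are hypothetical.

```python
import struct
import zlib
from typing import Dict, List


def encode(samples: List[int]) -> bytes:
    """Stand-in for the audio encoder: pack 16-bit samples and compress."""
    return zlib.compress(struct.pack(f"<{len(samples)}h", *samples))


def decode(stream: bytes) -> List[int]:
    """Stand-in for the audio decoder: inverse of encode()."""
    raw = zlib.decompress(stream)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


def mix(signals: List[List[int]]) -> List[int]:
    """Sum the signals sample by sample and clip to the 16-bit range."""
    length = min(len(s) for s in signals)
    return [max(-32768, min(32767, sum(s[i] for s in signals))) for i in range(length)]


def run_round(captured: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Return what each terminal plays after one exchange of audio frames."""
    streams = {name: encode(samples) for name, samples in captured.items()}
    played = {}
    for name in captured:
        # Each terminal receives the streams of all other terminals directly.
        refs = [decode(stream) for other, stream in streams.items() if other != name]
        if len(refs) >= 2:
            played[name] = mix(refs)
        elif refs:
            played[name] = refs[0]
        else:
            played[name] = []
    return played
```

Each terminal hears the mix of all other terminals but not its own signal, which is why the sending step excludes the terminal itself.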
In one possible embodiment, the apparatus further comprises:
the judging module is configured to judge whether the volume of the original audio signal reaches a preset threshold before the audio encoding module 301 encodes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream;
the audio encoding module 301 is specifically configured to:
and if the volume of the original audio signal reaches the preset threshold, encoding the original audio signal to obtain the audio code stream.
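The volume check performed by the judging module can be sketched as a simple gate in front of the encoder. The description does not say how volume is measured; this example assumes root-mean-square (RMS) amplitude over a frame of float samples in [-1.0, 1.0], and both the default threshold value and the names `rms` and `should_encode` are hypothetical.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a frame; 0.0 for an empty frame."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_encode(samples: Sequence[float], threshold: float = 0.02) -> bool:
    """Encode and send the frame only if its volume reaches the threshold."""
    return rms(samples) >= threshold
```

Frames that fail the check are simply not encoded or sent, which reduces the number of streams each receiving terminal has to decode and mix.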
In a possible implementation manner, each video conference terminal is provided with a priority; when the number of the reference original audio signals exceeds a preset number, the preset number is greater than 2, and the audio mixing module 304 is specifically configured to:
mixing the target reference original audio signals to obtain the mixed audio signal, wherein the target reference original audio signals comprise first target reference original audio signals and a second target reference original audio signal; the first target reference original audio signals are the (preset number minus one) reference original audio signals with the highest volume; the second target reference original audio signal is the reference original audio signal with the highest priority among all the reference original audio signals other than the first target reference original audio signals; and, for each reference original audio signal, the corresponding priority is the priority of the reference video conference terminal that sends the reference audio code stream corresponding to that reference original audio signal to the video conference terminal.
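The selection of the signals that take part in the mixing can be sketched as follows, reading the rule as: keep the (preset number minus one) loudest signals, then add the highest-priority signal among those left over. The `RefSignal` type, the `select_targets` name, and the convention that a larger priority number means higher priority are assumptions for this example.

```python
from typing import List, NamedTuple


class RefSignal(NamedTuple):
    terminal: str
    volume: float
    priority: int


def select_targets(signals: List[RefSignal], preset_number: int) -> List[RefSignal]:
    """Pick the signals to mix when more than preset_number are received."""
    if preset_number <= 2:
        raise ValueError("preset_number must be greater than 2")
    if len(signals) <= preset_number:
        # The rule only applies when the count exceeds the preset number.
        return list(signals)
    by_volume = sorted(signals, key=lambda s: s.volume, reverse=True)
    first_targets = by_volume[:preset_number - 1]
    remaining = by_volume[preset_number - 1:]
    second_target = max(remaining, key=lambda s: s.priority)
    return first_targets + [second_target]
```

For example, with a preset number of 3 and four received signals, the two loudest are kept and the quiet but high-priority one is added, so exactly three signals are mixed.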
In one possible implementation, for every two target reference original audio signals whose corresponding priorities differ, the weight of the third target reference original audio signal in the mixing is larger than the weight of the fourth target reference original audio signal, where the third target reference original audio signal is the one of the two with the higher corresponding priority, and the fourth target reference original audio signal is the one of the two with the lower corresponding priority.
In one possible embodiment, the apparatus further comprises:
the video coding module is used for coding the original video signal currently acquired by the video conference terminal to obtain a video code stream;
the video sending module is used for sending the video code stream to each reference video conference terminal;
the video decoding module is used for decoding each reference video code stream to obtain a reference original video signal corresponding to the reference video code stream if the video conference terminal receives at least one reference video code stream sent by the reference video conference terminal;
the allocation module is configured to allocate a play area for each reference original video signal on a screen display area of the video conference terminal, where the play area allocated to a target reference original video signal is greater than the play area allocated to a candidate reference original video signal, the target reference original video signal is a reference original video signal with the highest corresponding priority, the candidate reference original video signal is a reference original video signal except for the target reference original video signal in all the reference original video signals, and for each reference original video signal, the priority corresponding to the reference original video signal is the priority of a reference video conference terminal that sends a reference video code stream corresponding to the reference original video signal to the video conference terminal;
and the video playing module is used for playing the reference original video signal on a playing area allocated for the reference original video signal for each reference original video signal.
With the above data processing apparatus, a video conference can be held without configuring an MCU, which reduces the upfront cost of the video conference.
Referring to FIG. 4, an electronic device 400 provided in an embodiment of the present application includes: a processor 401, a memory 402 and a bus, the memory 402 storing machine-readable instructions executable by the processor 401; when the electronic device is running, the processor 401 communicates with the memory 402 via the bus, and the processor 401 executes the machine-readable instructions to perform the steps of the data processing method described above.
In particular, the memory 402 and the processor 401 can be general-purpose memories and processors, and are not particularly limited herein, and the method of data processing described above can be performed when the processor 401 runs a computer program stored in the memory 402.
Corresponding to the above data processing method, the embodiments of the present application further provide a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, performs the steps of the above data processing method.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working procedures of the above-described system and apparatus may refer to the corresponding procedures in the method embodiments, and are not described in detail in this application. In the several embodiments provided herein, it should be understood that the disclosed systems and methods may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division of the modules is merely a division by logical function, and other divisions are possible in an actual implementation; multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, or the like.
The foregoing is merely a specific embodiment of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (5)

1. A method of data processing, the method comprising:
for each video conference terminal in a plurality of video conference terminals accessing the same video conference, the video conference terminal codes an original audio signal acquired by the video conference terminal at present to obtain an audio code stream;
the video conference terminal sends the audio code stream to each reference video conference terminal except the video conference terminal;
under the condition that the video conference terminal receives at least one reference audio code stream sent by the reference video conference terminal, decoding the reference audio code stream by the video conference terminal for each reference audio code stream to obtain a reference original audio signal corresponding to the reference audio code stream;
if the reference original audio signals comprise at least two, the video conference terminal carries out audio mixing processing on the reference original audio signals to obtain mixed audio signals;
the video conference terminal plays the mixed audio signal;
before the video conference terminal encodes the original audio signal which is currently acquired by the video conference terminal to obtain an audio code stream, the method further comprises the following steps:
the video conference terminal judges whether the volume of the original audio signal reaches a preset threshold value or not;
the video conference terminal encoding the original audio signal currently acquired by the video conference terminal to obtain an audio code stream comprises:
if the volume of the original audio signal reaches the preset threshold, the video conference terminal encodes the original audio signal to obtain the audio code stream;
each video conference terminal is provided with a priority; when the number of the reference original audio signals exceeds a preset number, the preset number is larger than 2, and the video conference terminal performs audio mixing processing on the reference original audio signals to obtain mixed audio signals, wherein the audio mixing processing comprises the following steps:
the video conference terminal performs audio mixing processing on target reference original audio signals to obtain the mixed audio signal, wherein the target reference original audio signals comprise first target reference original audio signals and a second target reference original audio signal; the first target reference original audio signals are the (preset number minus one) reference original audio signals with the highest volume; the second target reference original audio signal is the reference original audio signal with the highest priority among all the reference original audio signals other than the first target reference original audio signals; and, for each reference original audio signal, the corresponding priority is the priority of the reference video conference terminal that sends the reference audio code stream corresponding to that reference original audio signal to the video conference terminal;
for every two corresponding target reference original audio signals with different priorities, the weight occupied by a third target reference original audio signal when mixing is larger than the weight occupied by a fourth target reference original audio signal when mixing, wherein the third target reference original audio signal is a target reference original audio signal with higher priority corresponding to the two corresponding target reference original audio signals with different priorities, and the fourth target reference original audio signal is a target reference original audio signal with lower priority corresponding to the two corresponding target reference original audio signals with different priorities.
2. The data processing method of claim 1, wherein the method further comprises:
the video conference terminal encodes an original video signal which is currently acquired by the video conference terminal, and a video code stream is obtained;
the video conference terminal sends the video code stream to each reference video conference terminal;
if the video conference terminal receives at least one reference video code stream sent by the reference video conference terminal, decoding the reference video code stream by the video conference terminal for each reference video code stream to obtain a reference original video signal corresponding to the reference video code stream;
the video conference terminal allocates a playing area for each reference original video signal on a screen display area of the video conference terminal, wherein the playing area allocated to a target reference original video signal is larger than the playing area allocated to a candidate reference original video signal, the target reference original video signal is a reference original video signal with the highest corresponding priority, the candidate reference original video signal is a reference original video signal except for the target reference original video signal in all the reference original video signals, and for each reference original video signal, the priority corresponding to the reference original video signal is the priority of a reference video conference terminal which sends a reference video code stream corresponding to the reference original video signal to the video conference terminal;
for each of the reference raw video signals, the videoconferencing endpoint plays the reference raw video signal on the play area allocated for the reference raw video signal.
3. A data processing apparatus, the apparatus comprising:
the audio coding module is used for coding an original audio signal currently acquired by each video conference terminal in a plurality of video conference terminals accessed to the same video conference to obtain an audio code stream;
an audio transmitting module, configured to transmit the audio code stream to each reference videoconference terminal of the plurality of videoconference terminals except the videoconference terminal;
the audio decoding module is used for decoding each reference audio code stream under the condition that the video conference terminal receives at least one reference audio code stream sent by the reference video conference terminal, so as to obtain a reference original audio signal corresponding to the reference audio code stream;
the audio mixing module is used for carrying out audio mixing processing on the reference original audio signals if the reference original audio signals comprise at least two audio signals, so as to obtain mixed audio signals;
the audio playing module is used for playing the mixed audio signal;
the apparatus further comprises:
the judging module is used for judging whether the volume of the original audio signal reaches a preset threshold before the audio coding module codes the original audio signal currently acquired by the video conference terminal to obtain an audio code stream;
the audio coding module is specifically configured to:
if the volume of the original audio signal reaches the preset threshold, encoding the original audio signal to obtain the audio code stream;
each video conference terminal is provided with a priority; when the number of the reference original audio signals exceeds a preset number, the preset number is larger than 2, and the audio mixing module is specifically configured to:
mixing target reference original audio signals to obtain the mixed audio signal, wherein the target reference original audio signals comprise first target reference original audio signals and a second target reference original audio signal; the first target reference original audio signals are the (preset number minus one) reference original audio signals with the highest volume; the second target reference original audio signal is the reference original audio signal with the highest priority among all the reference original audio signals other than the first target reference original audio signals; and, for each reference original audio signal, the corresponding priority is the priority of the reference video conference terminal that sends the reference audio code stream corresponding to that reference original audio signal to the video conference terminal;
for every two corresponding target reference original audio signals with different priorities, the weight occupied by a third target reference original audio signal when mixing is larger than the weight occupied by a fourth target reference original audio signal when mixing, wherein the third target reference original audio signal is a target reference original audio signal with higher priority corresponding to the two corresponding target reference original audio signals with different priorities, and the fourth target reference original audio signal is a target reference original audio signal with lower priority corresponding to the two corresponding target reference original audio signals with different priorities.
4. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the data processing method according to any one of claims 1 to 2.
5. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, performs the steps of the data processing method according to any one of claims 1 to 2.
CN202211147902.9A 2022-09-21 2022-09-21 Data processing method and device, electronic equipment and storage medium Active CN115242757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211147902.9A CN115242757B (en) 2022-09-21 2022-09-21 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211147902.9A CN115242757B (en) 2022-09-21 2022-09-21 Data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115242757A CN115242757A (en) 2022-10-25
CN115242757B true CN115242757B (en) 2023-05-26

Family

ID=83680739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211147902.9A Active CN115242757B (en) 2022-09-21 2022-09-21 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115242757B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015180330A1 (en) * 2014-05-30 2015-12-03 中兴通讯股份有限公司 Volume adjustment method and device, and multipoint control unit
CN113973103A (en) * 2021-10-26 2022-01-25 北京达佳互联信息技术有限公司 Audio processing method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8704870B2 (en) * 2010-05-13 2014-04-22 Lifesize Communications, Inc. Multiway telepresence without a hardware MCU
CN102938833B (en) * 2012-07-25 2016-10-12 苏州科达科技股份有限公司 Method and device, multipoint control unit and video conferencing system in video conference
CN112272281B (en) * 2020-10-09 2024-05-31 上海晨驭信息科技有限公司 Regional distributed video conference system

Also Published As

Publication number Publication date
CN115242757A (en) 2022-10-25

Similar Documents

Publication Publication Date Title
US9031849B2 (en) System, method and multipoint control unit for providing multi-language conference
US8805928B2 (en) Control unit for multipoint multimedia/audio system
CN111583942B (en) Method and device for controlling coding rate of voice session and computer equipment
CN102648584B (en) Use the system of forward error correction inspection available bandwidth, method and medium
US20130100239A1 (en) Method, apparatus, and system for processing cascade conference sites in cascade conference
US20150237086A1 (en) Local Media Rendering
CN112055166B (en) Audio data processing method, device, conference system and storage medium
CN112911383A (en) Multipath screen projection method, device and system under local area network
CN105611219A (en) Method and device for processing video conference
CN104469032A (en) Sound mixing processing method and system
CN115242757B (en) Data processing method and device, electronic equipment and storage medium
US20200329083A1 (en) Video conference transmission method and apparatus, and mcu
CN110971862B (en) Video conference broadcasting method and device
CN107124575A (en) A kind of media processing method, device and media server
US11431855B1 (en) Encoder pools for conferenced communications
CN101742220B (en) System and method for realizing multi-picture based on serial differential switch
CN102833520A (en) Video conference signal processing method, video conference server and video conference system
CN113038183B (en) Video processing method, system, device and medium based on multiprocessor system
CN113301293A (en) Multi-screen bidirectional 4K communication method and system for video conference
CN114650387A (en) Method, device and equipment for small program conference based on TRTC (true radio frequency communication) protocol
CN111554312A (en) Method, device and system for controlling audio coding type
CN117793071A (en) Media stream processing method and device
CN115334058A (en) Media file playing system and method thereof
CN116156099A (en) Network transmission method, device and system
CN116647539A (en) Voice data transmission method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant