WO2012174931A1 - Parameter control method and device - Google Patents

Parameter control method and device

Info

Publication number
WO2012174931A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
parameter
audio
picture
stream
Prior art date
Application number
PCT/CN2012/074031
Other languages
French (fr)
Chinese (zh)
Inventor
刘军莉
陈军
佟鑫
王福
张良平
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Publication of WO2012174931A1 publication Critical patent/WO2012174931A1/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/485 End-user interface for client configuration
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • the present invention relates to the field of communications, and in particular, to a parameter control method and apparatus.
  • BACKGROUND For services that require a terminal and a server to cooperate (for example, conference television), if the terminal wishes to change certain service-related parameters, it needs to send a dedicated instruction to the server; the instruction is carried in a control message. Therefore, for each additional control method, the terminal needs to be upgraded so that it can send the commands required for that control. Since the number of terminals is large and they are widely distributed, upgrading the terminals is not easy.
  • The following takes a video conference as an example.
  • The existing implementation of a video conference is to add several terminals to a conference by setting the relevant configuration of the conference.
  • After the conference starts, the Multipoint Conference Unit (MCU) first performs signaling interaction with each terminal to determine the format of each terminal, and then each terminal sends its own image compressed code stream to the MCU according to the negotiated format; the MCU decodes the code streams transmitted by the terminals, encodes and compresses the synthesized image, and sends it back to each terminal, so that the image received by every terminal is the same. In order to enable each terminal to customize its number of multi-pictures according to its own needs, a universal port ("universe port") function needs to be added.
  • On the terminal side, in addition to its previous functions, each terminal needs to add signaling processing for acquiring and handling multi-picture-number adjustments: it must receive the user's request in real time and transmit the multi-picture-number information set by the user to the MCU through the signaling channel. On the MCU side, the Multipoint Processor (MP) receives the multi-picture adjustment signaling of each terminal and then adjusts the transfer relationships between the decoding nodes and the encoding nodes; after the corresponding multi-picture synthesis is performed according to the new relationships, the image is encoded and sent back to each terminal.
  • Multipoint Processor (MP)
  • Embodiments of the present invention provide a parameter control method and apparatus to solve at least the above problems.
  • According to an aspect of an embodiment of the present invention, a parameter control method is provided, comprising the steps of: parsing audio data and/or video data from a terminal; determining that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter; and adjusting the parameter according to the audio and/or picture for indicating adjustment of the parameter.
  • It is determined, within a preset period of time, that the audio in the audio data and/or the picture in the video data includes the audio and/or picture for indicating adjustment of a parameter.
  • In the case that the parameter is a parameter used to send a media stream to the terminal, after the parameter is adjusted according to the audio and/or picture for indicating adjustment of the parameter, the media stream is sent to the terminal using the adjusted parameter.
  • When the media stream is a video stream, the parameter includes at least one of the following: the number of terminal pictures included in the video stream, the layout of the video stream displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream.
  • An image linear code is obtained after parsing the video data from the terminal; it is determined that the image linear code includes a picture for indicating adjustment of the parameter.
  • According to another aspect of an embodiment of the present invention, a parameter control apparatus is provided, comprising: a parsing module configured to parse audio data and/or video data from a terminal; a judging module configured to determine that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter; and an adjustment module configured to adjust the parameter according to the audio and/or picture for indicating adjustment of the parameter.
  • The judging module is configured to determine, within a preset period of time, that the audio in the audio data and/or the picture in the video data includes the audio and/or picture for indicating adjustment of a parameter.
  • In the case that the parameter is a parameter used to send a media stream to the terminal, the adjustment module is configured to, after adjusting the parameter according to the audio and/or picture for indicating adjustment of the parameter, send the media stream to the terminal using the adjusted parameter.
  • When the media stream is a video stream, the parameter includes at least one of the following: the number of terminal pictures included in the video stream, the layout of the video stream displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream.
  • The parsing module is configured to parse the video data from the terminal to obtain an image linear code; the judging module is configured to determine that the image linear code includes a picture for indicating adjustment of the parameter.
  • By means of the embodiments of the present invention, the audio data and/or video data from the terminal are parsed; it is determined that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter; and the parameter is adjusted according to that audio and/or picture. This solves the prior-art problem that the terminal must be upgraded in order to add a control function, and thereby achieves the effect of expanding the control functions of the terminal without modifying the terminal.
  • FIG. 1 is a flowchart of a parameter control method according to an embodiment of the present invention;
  • FIG. 2 is a structural block diagram of a parameter control apparatus according to an embodiment of the present invention;
  • FIG. 3 is a flowchart of the whole-system processing of a media stream processing method according to a preferred embodiment of the present invention;
  • FIG. 4 is a flowchart of the MCU side of a media stream processing method according to a preferred embodiment of the present invention;
  • FIG. 5 shows the downward and upward multi-picture-number adjustment gestures of a gesture-recognition multi-picture adjustment method according to an embodiment of the present invention;
  • FIG. 6 is a three-picture layout diagram of a gesture-recognition multi-picture adjustment method according to an embodiment of the present invention;
  • FIG. 7 is a four-picture layout diagram of a gesture-recognition multi-picture adjustment method according to an embodiment of the present invention.
  • As shown in FIG. 1, the method includes the following steps. Step S102: The audio data and/or video data from a terminal are parsed. Step S104: It is determined that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter. Step S106: The parameter is adjusted according to the audio and/or picture for indicating adjustment of the parameter.
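Steps S102-S106 can be sketched as a minimal server-side loop. This is an illustrative sketch only: `detect_adjustment` is a hypothetical stand-in for whatever audio/picture recognition the server performs, not an API named by the patent.

```python
def detect_adjustment(media):
    # Hypothetical recognizer: returns an adjustment request or None.
    # Here it simply looks for a tagged control entry in the parsed media.
    return media.get("control")  # e.g. {"param": "pictures", "value": 3}

def control_parameters(media, params):
    """S102: parse media; S104: judge for an adjustment cue; S106: adjust."""
    adjustment = detect_adjustment(media)      # S102 + S104
    if adjustment is not None:                 # S106: apply the indicated change
        params[adjustment["param"]] = adjustment["value"]
    return params

params = {"pictures": 4, "frame_rate": 30}
frame = {"control": {"param": "pictures", "value": 3}}
print(control_parameters(frame, params))  # {'pictures': 3, 'frame_rate': 30}
```

When no control cue is present, the parameters pass through unchanged, which matches the branch in step S104.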
  • Through the above steps, only the server needs to be adjusted. Since the terminal can transmit video through its camera and audio through its microphone, the terminal does not need any changes: the user simply makes the indicated gesture in front of the camera or speaks the indicated command into the microphone.
  • Compared with signaling interaction in the prior art, this relieves the burden on the signaling channel and expands the control functions of the terminal to some extent.
  • In implementation, if all audio and video data were parsed, the burden on the server might increase slightly (actual tests show the impact is small). To avoid even this possibility, only the audio and video data of an agreed period may be parsed. For example, it can be agreed in advance that, of every 10 minutes of audio and video data, the first 5 minutes are not parsed and the last 5 minutes are parsed; thus, within each hour, the user can make a controlling action or voice command during minutes 5-10, 15-20, 25-30, 35-40, 45-50, and 55-60.
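The agreed parsing windows above reduce to a simple modular check. The sketch below (function name and parameters are the author's illustration, not from the patent) tests whether a moment within the hour falls in a parsed window under the example agreement of 10-minute periods with only the last 5 minutes parsed:

```python
def in_parse_window(second_of_hour, period=600, parsed_tail=300):
    """True if this second of the hour falls in an agreed parsing window.

    With 10-minute periods (600 s) and the last 5 minutes (300 s)
    parsed, the windows are minutes 5-10, 15-20, ..., 55-60.
    """
    return (second_of_hour % period) >= (period - parsed_tail)

assert not in_parse_window(2 * 60)    # minute 2: first half, not parsed
assert in_parse_window(7 * 60)        # minute 7: inside the 5-10 window
assert not in_parse_window(12 * 60)   # minute 12: first half of next period
assert in_parse_window(56 * 60)       # minute 56: inside the 55-60 window
```

Shortening `period` or lengthening `parsed_tail` corresponds to the faster or slower conference-update settings mentioned below.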
  • In a further embodiment, all audio and video data can be parsed at the beginning, until a picture or audio indicating a change of parsing mode is recognized.
  • The preset time can also be set in advance according to the specific situation of the conference: if faster conference updates are desired, a shorter interval can be set; if slower updates are acceptable, the judgment period can be set slightly longer.
  • The following takes the case where the parameter is one used to send a media stream to the terminal as an example. After the server adjusts the parameter according to the audio and/or picture for indicating adjustment of the parameter, the server sends the media stream (for example, an audio stream or a video stream) to the terminal using the adjusted parameter.
  • For example, control of the picture received by the terminal may be implemented by using audio and/or a picture.
  • When the media stream is a video stream, the parameter includes at least one of the following: the number of terminal pictures included in the video stream, the layout of the video stream displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream.
  • For parsing the video data, this embodiment provides a preferred manner: an image linear code is obtained after parsing the video data from the terminal, and it is determined that the image linear code includes a picture for indicating adjustment of the parameter. Because recognition of the linear code is relatively easy, the judgment process is relatively simple.
  • In this embodiment, a parameter control device is also provided, which is used to implement the above embodiment and its preferred implementations; what has already been described will not be repeated. The modules involved in the device are described below.
  • FIG. 2 is a structural block diagram of a parameter control apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes: a parsing module 22, a judging module 24, and an adjustment module 26. The structure is described below.
  • The parsing module 22 is configured to parse the audio data and/or video data from the terminal. The judging module 24, connected to the parsing module 22, is configured to determine that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter. For example, when the media stream is a video stream, the foregoing parameter includes at least one of the following: the number of terminal pictures included in the video stream, the layout of the video stream displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream.
  • The adjustment module 26, connected to the judging module 24, is configured to adjust the parameter according to the audio and/or picture for indicating adjustment of the parameter.
  • Preferably, the judging module 24 is configured to determine, within a preset period of time, that the audio in the audio data and/or the picture in the video data includes the audio and/or picture for indicating adjustment of the parameter.
  • In the case that the parameter is a parameter used to send a media stream to the terminal, the adjustment module 26 is configured to, after adjusting the parameter according to the audio and/or picture for indicating adjustment of the parameter, send the media stream to the terminal using the adjusted parameter.
  • Preferably, the parsing module 22 is configured to parse the video data from the terminal to obtain an image linear code, and the judging module 24 is configured to determine that the image linear code includes a picture for indicating adjustment of the parameter.
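The cooperation of the three modules can be outlined as follows. This is a structural sketch only: the recognition logic inside the judging module is hypothetical, and the class and method names are the author's illustration of the module division, not interfaces defined by the patent.

```python
class ParsingModule:
    """Module 22: decode the terminal's stream into analyzable form."""
    def parse(self, data):
        # For video, a real implementation would yield the image linear code.
        return data

class JudgingModule:
    """Module 24: decide whether the parsed data carries an adjustment cue."""
    def judge(self, parsed):
        # Hypothetical: return the indicated adjustment, or None if absent.
        return parsed.get("gesture")

class AdjustmentModule:
    """Module 26: apply the indicated adjustment to the parameters."""
    def __init__(self, params):
        self.params = params
    def adjust(self, adjustment):
        if adjustment:
            self.params.update(adjustment)
        return self.params

parser, judger, adjuster = ParsingModule(), JudgingModule(), AdjustmentModule({"pictures": 4})
cue = judger.judge(parser.parse({"gesture": {"pictures": 3}}))
print(adjuster.adjust(cue))  # {'pictures': 3}
```

The chain mirrors the connections in FIG. 2: parsing module feeds the judging module, whose result drives the adjustment module.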
  • The following describes a preferred embodiment, taking a conference television system as an example. In this preferred embodiment, a method is provided for acquiring a gesture image using the existing camera of a terminal in a conference television system and adjusting the multi-picture number on the MCU side through gesture or other audio and/or video recognition.
  • In this embodiment, when implementing the universe port function, no modification of the terminal is needed; a gesture recognition module is simply added to the Video Processing Unit (VPU) on the MCU side to recognize multi-picture adjustments in the terminal images, and the adjusted multi-picture information is transmitted to the MP, which, upon receiving the signaling, adjusts the transfer relationships of the decoding and encoding nodes.
  • The functions involved in this technical solution include: (1) the conference control interface sets the conference-related configuration; (2) the MCU and the terminals perform signaling interaction on the encoding format; (3) each terminal sends its image compressed code stream to the MCU; (4) the MCU decodes the received code stream and obtains the image linear code; (5) the MCU performs multi-picture synthesis and re-encoding on the decoded image of each terminal; (6) the MCU finally sends the encoded code stream back to the terminals; (7) the VPU module in the MCU performs gesture recognition on the decoded linear code; (8) if the VPU module recognizes multi-picture-number adjustment information, it sends this signaling to the MP module of the MCU; (9) the MP module receives this signaling and adjusts the corresponding codec node relationships.
  • FIG. 3 shows the whole-system flow, which includes the following steps. Step S302: A user sets the conference-related configuration through a WEB interface, including how many terminals join the current conference, the format and bit rate of each terminal, the number of multi-pictures of the conference, and so on. Step S304: According to the conference settings, the MCU performs signaling interaction on the encoding format with each terminal.
  • Step S306: If the image-format signaling interaction is unsuccessful, the terminal does not join the conference. If the interaction is successful, in step S3062 the terminal and the MCU agree on an output image format, and in step S3064 the terminal sends its image compressed code stream to the MCU in the corresponding format. Step S308: The VPU module in the MCU separately decodes the received image compressed code stream of each terminal and obtains the image linear code of each terminal.
  • Step S310: The VPU separately performs gesture recognition on the image linear code decoded from each terminal. Step S312: It is determined whether there is multi-picture adjustment information; if there is no multi-picture adjustment gesture, the process jumps directly to step S316, and if there is an adjustment gesture, step S314 is performed.
  • Step S314: The corresponding multi-picture-number adjustment information is acquired; the VPU module sends the recognized multi-picture-number adjustment information to the MP module of the MCU, and the MP module receives the signaling and adjusts the corresponding codec node relationships. Step S316: The MCU performs multi-picture synthesis on the decoded image of each terminal according to the codec correspondence.
  • FIG. 4 shows the MCU-side flow, which includes the following steps. Step S402: The MP obtains the relevant configuration of the conference from the conference control interface, including the number of terminals in the conference, the format and bit rate of each terminal, and the number of multi-pictures of the conference.
  • Step S404: According to the conference settings, the MP performs signaling interaction on the encoding format with each terminal. Step S406: If the image-format signaling interaction is unsuccessful, the terminal does not attend; if the interaction is successful, the MP receives the image compressed code stream sent by the terminal in the corresponding format and passes it to the DSP. Step S4062: The DSP receives the terminal code stream. Step S408: The VPU module in the DSP separately decodes the received image compressed code stream of each terminal and obtains the image linear code of each terminal.
  • Step S410: The VPU separately performs gesture recognition on the image linear code decoded from each terminal. Step S412: It is determined whether the image contains multi-picture adjustment information; if there is no multi-picture adjustment gesture, the process jumps to step S420, and if there is an adjustment gesture, step S414 is performed. Step S414: It is determined whether the gesture is legal; if not, the process jumps directly to step S420, and if it is legal, the multi-picture-number adjustment information is sent to the MP module. Step S416: The MP receives the corresponding multi-picture-number adjustment information sent by the DSP. Step S418: The correspondence between the codec nodes is adjusted. Step S420: The DSP performs multi-picture synthesis and re-encoding on the decoded image of each terminal according to the codec correspondence.
  • Step S422: The encoded code stream is sent back to each terminal.
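The per-terminal loop of steps S408-S420 can be condensed into a short sketch. The function names here (`recognize_gesture`, `is_legal`, the dictionary-based `codec_map`) are hypothetical stand-ins for the DSP's recognition, the legality check of step S414, and the MP's codec-node bookkeeping:

```python
def decode(stream):
    # Stand-in decoder: the real DSP would produce an image linear code here.
    return stream

def mcu_side_pass(terminal_streams, codec_map, recognize_gesture, is_legal):
    """Condensed S408-S420: decode, recognize, validate, adjust routing."""
    linear_codes = {t: decode(s) for t, s in terminal_streams.items()}  # S408
    for terminal, code in linear_codes.items():                         # S410
        gesture = recognize_gesture(code)                               # S412
        if gesture is not None and is_legal(gesture):                   # S414
            codec_map[terminal] = gesture["pictures"]                   # S416-S418
    return codec_map    # S420 synthesizes multi-pictures per codec_map

codec_map = {"A": 4, "B": 4}
streams = {"A": {"gesture": {"pictures": 3}}, "B": {}}
result = mcu_side_pass(streams, codec_map,
                       recognize_gesture=lambda c: c.get("gesture"),
                       is_legal=lambda g: g["pictures"] in (1, 2, 3, 4))
print(result)  # {'A': 3, 'B': 4}
```

An illegal gesture falls through to step S420 unchanged, exactly as the flow prescribes.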
  • The system used in the following example is a conference television system; the example is described with reference to the above embodiments and preferred embodiments, and the processing follows the flowcharts above.
  • Suppose the user first holds a four-picture 720P (1280x720) 30-frame conference with four terminals A, B, C, and D.
  • The conference control interface sends this information to the MP module of the MCU. After obtaining this signaling, the MP module performs 720P (1280x720) 30-frame signaling interaction with each terminal.
  • the interaction is successful.
  • The four terminals A, B, C, and D send 720P (1280x720) 30-frame compressed code streams to the MCU; assume that no user has yet controlled the multi-picture by gesture.
  • After receiving the code streams, the MCU sends the four image compressed code streams to the VPU module on the MCU side.
  • The VPU uses four decoding nodes for decoding and performs gesture recognition after decoding. No information for adjusting the number of multi-pictures is found, so the four decoding nodes send their respective image linear codes to the same encoding node a; encoding node a then reduces each image to a quarter of its size, performs multi-picture synthesis according to FIG. 7, encodes the result, and sends it back to each terminal.
  • Each terminal receives the multi-picture compressed code stream sent by the MCU, decodes it, and displays it; the four terminal users then see the four-picture image shown in FIG. 7.
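The quarter-size reduction in this example follows from simple geometry: halving a 1280x720 image in both dimensions gives a 640x360 sub-picture covering a quarter of the original area, and four such sub-pictures tile a 720p canvas in a 2x2 grid. A sketch of the arithmetic (the coordinate convention is the author's illustration; the patent's FIG. 7 layout may order the cells differently):

```python
FULL_W, FULL_H = 1280, 720

def four_picture_layout(width=FULL_W, height=FULL_H):
    """2x2 grid: each cell is (x, y, w, h), the source scaled by 1/2 in
    each dimension, i.e. a quarter of the original area."""
    sw, sh = width // 2, height // 2
    return [(c * sw, r * sh, sw, sh) for r in range(2) for c in range(2)]

cells = four_picture_layout()
assert cells == [(0, 0, 640, 360), (640, 0, 640, 360),
                 (0, 360, 640, 360), (640, 360, 640, 360)]
# Four quarter-area cells exactly cover the 1280x720 canvas.
assert 4 * 640 * 360 == FULL_W * FULL_H
```

The same arithmetic explains why the encoded output can remain a single 720P stream regardless of how many terminals contribute sub-pictures.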
  • When a user wants to adjust the multi-picture display, the gesture shown in FIG. 5 is presented to the camera (either of the two gestures may be used), and the terminal sends the image containing the gesture to the MCU.
  • After receiving the code stream containing the gesture information, the MCU sends the four image compressed code streams to the VPU module on the MCU side.
  • The VPU uses four decoding nodes for decoding, and each decoding node performs gesture recognition after decoding.
  • Decoding nodes B, C, and D find no information for adjusting the number of multi-pictures.
  • Decoding node A finds that terminal A needs to adjust to a three-picture image; decoding node A sends this multi-picture-number information to the MP, and the MP adds one encoding node b.
  • The decoding nodes A, B, and C also need to send their image linear codes to encoding node b for multi-picture synthesis.
  • Encoding node a reduces each image to a quarter of its size, synthesizes the multi-picture according to FIG. 7, encodes the image linear code, and sends it back to terminals B, C, and D.
  • Encoding node b reduces each image to a quarter of its size, synthesizes according to the three-picture layout of FIG. 6, encodes the image linear code, and returns it to terminal A.
  • Terminal A receives the compressed code stream of encoding node b and, after decoding and display, sees the three-picture image shown in FIG. 6; terminals B, C, and D receive the compressed code stream of encoding node a and, after decoding and display, see the four-picture image shown in FIG. 7.
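The re-routing the MP performs in this walkthrough amounts to maintaining a fan-out table from decoding nodes to encoding nodes. The sketch below is a hypothetical data-structure illustration of that bookkeeping, not the MCU's actual internals:

```python
# Before the gesture: all four decoding nodes feed encoding node "a".
routes = {"A": ["a"], "B": ["a"], "C": ["a"], "D": ["a"]}

def add_encode_node(routes, node, sources):
    """MP adds an encoding node and subscribes it to the given decoding nodes."""
    for src in sources:
        routes.setdefault(src, []).append(node)
    return routes

# Terminal A requested a three-picture view: the MP adds node "b",
# fed by decoding nodes A, B, and C (as in the example above).
add_encode_node(routes, "b", ["A", "B", "C"])
assert routes["A"] == ["a", "b"]   # A's linear code now fans out to a and b
assert routes["D"] == ["a"]        # D still feeds only the four-picture node
```

Removing a customized view would be the symmetric operation: unsubscribe the encoding node and delete it when its source list is empty.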
  • Obviously, the above modules or steps of the present invention can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; alternatively, they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention provides a parameter control method and device. The method comprises the following steps: analyzing audio data and/or video data from a terminal; determining that the audio in the audio data and/or the picture in the video data comprises an audio and/or picture for instructing adjustment of a parameter; and adjusting the parameter according to the audio and/or picture for instructing adjustment of the parameter. By means of the present invention, the effect of expanding the control functions on a terminal without modifying the terminal is achieved.

Description

参数控制方法及装置 (Parameter control method and device) TECHNICAL FIELD The present invention relates to the field of communications, and in particular, to a parameter control method and apparatus. BACKGROUND For services that require a terminal and a server to cooperate (for example, conference television), if the terminal wishes to change certain service-related parameters, it needs to send a dedicated instruction to the server; the instruction is carried in a control message. Therefore, for each additional control method, the terminal needs to be upgraded so that it can send the commands required for that control. Since the number of terminals is large and they are widely distributed, upgrading the terminals is not easy. The following takes a video conference as an example. The existing implementation of a video conference is to add several terminals to a conference by setting the relevant configuration of the conference. After the conference starts, the Multipoint Conference Unit (MCU) first performs signaling interaction with each terminal to determine the format of each terminal, and then each terminal sends its own image compressed code stream to the MCU according to the negotiated format; the MCU decodes the code streams transmitted by the terminals, encodes and compresses the synthesized image, and sends it back to each terminal, so that the image received by every terminal is the same.
In order to enable each terminal to customize the number of multi-pictures according to their own needs, it is necessary to add a universal port.
(universe port) function. The existing implementation method is as follows. On the terminal side, in addition to its previous functions, each terminal needs to add signaling processing for acquiring and handling multi-picture-number adjustments: it must receive the user's request in real time and transmit the multi-picture-number information set by the user to the MCU through the signaling channel. On the MCU side, the Multipoint Processor (MP) receives the multi-picture adjustment signaling of each terminal and then adjusts the transfer relationships between the decoding nodes and the encoding nodes; after the corresponding multi-picture synthesis is performed according to the new relationships, the image is encoded and sent back to each terminal. This completes the universe port function: the image received by each terminal can be a customized image and can differ between terminals. However, this approach also requires upgrading the terminals to support the function, which brings a series of problems: for example, all previously sold terminals would need large-scale modification and upgrading; signaling interaction between the terminal and the MCU adds an extra burden on the signaling channel; and terminals from other manufacturers cannot implement this function.
SUMMARY OF THE INVENTION Embodiments of the present invention provide a parameter control method and apparatus to solve at least the above problems. According to an aspect of an embodiment of the present invention, a parameter control method is provided, comprising the steps of: parsing audio data and/or video data from a terminal; determining that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter; and adjusting the parameter according to the audio and/or picture for indicating adjustment of the parameter. It is determined, within a preset period of time, that the audio in the audio data and/or the picture in the video data includes the audio and/or picture for indicating adjustment of a parameter. In the case that the parameter is a parameter used to send a media stream to the terminal, after the parameter is adjusted according to the audio and/or picture for indicating adjustment of the parameter, the media stream is sent to the terminal using the adjusted parameter. When the media stream is a video stream, the parameter includes at least one of the following: the number of terminal pictures included in the video stream, the layout of the video stream displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream. An image linear code is obtained after parsing the video data from the terminal; it is determined that the image linear code includes a picture for indicating adjustment of the parameter.
According to another aspect of an embodiment of the present invention, a parameter control apparatus is provided, comprising: a parsing module configured to parse audio data and/or video data from a terminal; a judging module configured to determine that the audio in the audio data and/or the picture in the video data includes an audio and/or picture for indicating adjustment of a parameter; and an adjustment module configured to adjust the parameter according to the audio and/or picture for indicating adjustment of the parameter. The judging module is configured to determine, within a preset period of time, that the audio in the audio data and/or the picture in the video data includes the audio and/or picture for indicating adjustment of a parameter. In the case that the parameter is a parameter used to send a media stream to the terminal, the adjustment module is configured to, after adjusting the parameter according to the audio and/or picture for indicating adjustment of the parameter, send the media stream to the terminal using the adjusted parameter. When the media stream is a video stream, the parameter includes at least one of the following: the number of terminal pictures included in the video stream, the layout of the video stream displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream. The parsing module is configured to parse the video data from the terminal to obtain an image linear code; the judging module is configured to determine that the image linear code includes a picture for indicating adjustment of the parameter.
Through the embodiments of the present invention, audio data and/or video data from the terminal are parsed; it is determined that the audio data and/or the pictures in the video data contain audio and/or a picture indicating that a parameter is to be adjusted; and the parameter is adjusted accordingly. This solves the prior-art problem that the terminal itself must be upgraded whenever a control function is to be added, and achieves the effect of extending the control functions available to the terminal without modifying the terminal. BRIEF DESCRIPTION OF THE DRAWINGS The accompanying drawings are included to provide a further understanding of the present invention and constitute a part of this application; the exemplary embodiments of the present invention and their description serve to explain the present invention and do not unduly limit it. In the drawings: FIG. 1 is a flowchart of a parameter control method according to an embodiment of the present invention; FIG. 2 is a structural block diagram of a parameter control apparatus according to an embodiment of the present invention; FIG. 3 is a flowchart of the overall system processing of a media stream processing method according to a preferred embodiment of the present invention; FIG. 4 is a flowchart of the MCU side of a media stream processing method according to a preferred embodiment of the present invention; FIG. 5 shows the downward and upward gestures of a gesture-recognition multi-picture method according to an embodiment of the present invention; FIG. 6 is a three-picture layout diagram of a gesture-recognition multi-picture method according to an embodiment of the present invention; and FIG. 7 is a four-picture layout diagram of a gesture-recognition multi-picture method according to an embodiment of the present invention. BEST MODE FOR CARRYING OUT THE INVENTION Hereinafter, the present invention will be described in detail with reference to the accompanying drawings and in conjunction with the embodiments. It should be noted that, in the absence of conflict, the embodiments in the present application and the features in the embodiments may be combined with each other.
For the server, control of a given service is reflected in the adjustment of parameters; that is, control is achieved by adjusting different parameters. In this embodiment, a parameter control method is provided. FIG. 1 is a flowchart of a parameter control method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps: Step S102, parsing audio data and/or video data from a terminal; Step S104, determining that the audio data and/or the pictures in the video data contain audio and/or a picture indicating that a parameter is to be adjusted; Step S106, adjusting the parameter according to that audio and/or picture. With the above steps, only the server needs to change: the terminal can already transmit video through its camera and audio through its microphone, so the terminal requires no modification at all — the user merely performs an action in front of the camera or gives a voice instruction through the microphone. Compared with the prior art, this avoids extra signaling interaction and to some extent extends the control functions available to the terminal. In implementation, parsing all of the audio and video data may slightly increase the load on the server. Although actual tests show the impact is small, to avoid even this possibility, only the audio and video data within certain agreed time periods may be parsed. For example, it can be agreed in advance that, out of every 10 minutes of audio and video data, the first 5 minutes are not parsed and the last 5 minutes are; within each hour the user can then issue control actions or voice commands during minutes 5-10, 15-20, 25-30, 35-40, 45-50, and 55-60.
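The alternating parse window agreed above (skip the first 5 minutes of every 10, parse the last 5) can be sketched as a simple gate. The constants and function name below are illustrative assumptions, not part of the patent:

```python
CYCLE_SECONDS = 600   # agreed 10-minute cycle
PARSE_OFFSET = 300    # recognition runs only in the second half of each cycle

def in_parse_window(elapsed_seconds):
    """Return True when the server should parse media data for control
    gestures or voice commands, per the agreed schedule."""
    return (elapsed_seconds % CYCLE_SECONDS) >= PARSE_OFFSET
```

For a conference clock starting at 0, minutes 5-10, 15-20, and so on fall in the window, matching the schedule described in the text.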
In a further preferred embodiment, all audio and video data can be parsed at the start, until a picture or audio indicating a change of parsing mode is recognized. The preset period can of course also be chosen in advance according to the specifics of the conference: a shorter interval if fast updates are wanted, a longer one if not. The following takes as an example the case where the parameter is one used for sending a media stream to the terminal. After the server adjusts the parameter according to the audio and/or picture indicating the adjustment, it sends the media stream (for example, an audio stream or a video stream) to the terminal using the adjusted parameter. In this way, for example, control over the picture received by the terminal can be implemented through audio and/or pictures. When the media stream is a video stream, the parameter includes at least one of: the number of terminal pictures included in the video stream, the layout in which the video stream is displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream. For parsing the video data, a preferred manner is provided in this embodiment: an image linear code is obtained after parsing the video data from the terminal, and it is determined that the image linear code contains a picture indicating that the parameter is to be adjusted. Because linear codes are relatively easy to identify, the judging process is simple. This embodiment also provides a parameter control apparatus for implementing the above embodiments and preferred implementations; what has already been described is not repeated. FIG. 2 is a structural block diagram of a parameter control apparatus according to an embodiment of the present invention.
As shown in FIG. 2, the apparatus includes a parsing module 22, a judging module 24, and an adjustment module 26, described below. The parsing module 22 is configured to parse the audio data and/or video data from the terminal. The judging module 24 is connected to the parsing module 22 and is configured to determine that the audio data and/or the pictures in the video data contain audio and/or a picture indicating that a parameter is to be adjusted. The adjustment module 26 is connected to the judging module 24 and is configured to adjust the parameter according to that audio and/or picture. Preferably, the judging module 24 is configured to make this determination within a preset time period. Where the parameter is one used for sending a media stream to the terminal, the adjustment module 26 is configured to send the media stream to the terminal using the adjusted parameter after adjusting the parameter according to the indicating audio and/or picture. When the media stream is a video stream, the parameter includes at least one of: the number of terminal pictures included in the video stream, the layout in which the video stream is displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream. Preferably, the parsing module 22 is configured to obtain an image linear code by parsing the video data from the terminal, and the judging module 24 is configured to determine that the image linear code contains a picture indicating that the parameter is to be adjusted. A conference television system is described below as an example in conjunction with a preferred embodiment.
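Before turning to that example, the cooperation of the three modules can be sketched as a minimal parse → judge → adjust pipeline. All names and the gesture-to-parameter mapping here are assumptions for illustration, not from the patent; the recognizer is stubbed as a lookup so that only the control flow is shown:

```python
# Stubbed recognition result: a decoded frame may carry a control gesture.
GESTURE_TO_PICTURE_COUNT = {"three_picture": 3, "four_picture": 4}

def parse_frame(frame):
    """Parsing module (22): extract a candidate control gesture, if any."""
    return frame.get("gesture")

def is_adjustment(gesture):
    """Judging module (24): is this a recognized parameter-adjustment gesture?"""
    return gesture in GESTURE_TO_PICTURE_COUNT

def adjust_params(params, gesture):
    """Adjustment module (26): apply the requested change to the stream parameters."""
    updated = dict(params)
    updated["picture_count"] = GESTURE_TO_PICTURE_COUNT[gesture]
    return updated

def handle_frame(params, frame):
    gesture = parse_frame(frame)
    if is_adjustment(gesture):
        return adjust_params(params, gesture)
    return params
```

A frame with no recognized gesture leaves the parameters untouched; a recognized gesture produces an adjusted copy of the parameters.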
The preferred embodiment provides a method in which, in a conference television system, a gesture image is acquired with the terminal's existing camera and the number of pictures in the multi-picture layout is adjusted on the MCU side through gesture recognition (or other audio and/or video recognition). When this function is implemented, no modification to the terminal is required; a gesture recognition module is merely added to the video processing unit (VPU) on the MCU side to recognize, from the terminal images, the requested multi-picture adjustment. The adjusted multi-picture information is then transmitted to the MP, which, on receiving the signaling, adjusts the transmission relationships between the decode nodes and the encode nodes. The functions involved in the technical solution of this embodiment are: (1) the conference control interface sets the conference-related configuration; (2) the MCU and the terminals perform signaling interaction on the coding format; (3) each terminal sends its compressed image code stream to the MCU; (4) the MCU decodes the received code streams to obtain image linear codes; (5) the MCU performs multi-picture synthesis and re-encoding on the decoded images of the terminals; (6) the MCU sends the encoded code stream back to the terminals; (7) the VPU module in the MCU performs gesture recognition on the decoded linear codes; (8) if the VPU module recognizes multi-picture adjustment information, it sends the corresponding signaling to the MP module of the MCU; (9) the MP module receives the signaling and adjusts the corresponding codec-node relationships. FIG. 3 is a flowchart of the overall system processing of the media stream processing method according to a preferred embodiment of the present invention. As shown in FIG. 3, the process includes the following steps: Step S302, a user sets the conference-related configuration through a WEB interface, including how many terminals will join the conference, each terminal's format and bit rate, the number of pictures in the conference's multi-picture layout, and so on; Step S304, the MCU performs signaling interaction with each terminal on its encoding format according to the conference settings;
Step S306, if the image format signaling interaction is unsuccessful, the terminal does not join the conference; if the interaction is successful, Step S3062, the terminal and the MCU agree on an image format; Step S3064, the terminal sends its compressed image code stream to the MCU in the corresponding format; Step S308, the VPU module in the MCU decodes the received compressed image code streams of the terminals to obtain each terminal's image linear code; Step S310, the VPU performs gesture recognition on each terminal's decoded image linear code; Step S312, it is judged whether there is multi-picture adjustment information: if there is no multi-picture adjustment gesture, the process jumps directly to Step S316; if there is an adjustment gesture, Step S314 is performed.
Step S314, the corresponding multi-picture adjustment information is acquired; the VPU module sends the recognized multi-picture adjustment information to the MP module of the MCU, which receives the signaling and adjusts the corresponding codec-node relationships; Step S316, the MCU performs multi-picture synthesis and re-encoding on the decoded images of the terminals according to the codec relationships; Step S318, the encoded code stream is sent back to the terminals; Step S3182, each terminal receives the code stream, decodes it, and displays it. With the method of this preferred embodiment, the terminal code does not need to be modified at all, no burden is added to the signaling channel between the terminal and the MCU, no large-scale terminal upgrade is needed, and other terminals can provide the same function — none of which the prior art can achieve. In addition, the following extensions are possible: (1) the multi-picture layout, the frame rate, the bit rate, and the image format can all be adjusted; (2) any future extension of the signaling interaction between the terminal and the MCU can be controlled by designing a corresponding gesture; (3) besides gestures, voice control can be implemented in the same way; (4) more generally, any control based on some form of signal recognition can be implemented in this way. FIG. 4 is a flowchart of the MCU side of the media stream processing method according to a preferred embodiment of the present invention. As shown in FIG. 4, the MCU side includes an MP and a digital sound processor (DSP), and the process includes the following steps: Step S402, the MP obtains the conference-related configuration from the conference control interface, including how many terminals will join the conference, each terminal's format and bit rate, the number of pictures in the conference's multi-picture layout, and so on; Step S404, the MP performs signaling interaction with each terminal on its encoding format according to the conference settings; Step S406, if the image format signaling interaction is unsuccessful, the terminal does not join the conference; if the interaction is successful, the MP receives the compressed image code stream sent by the terminal in the corresponding format and forwards it to the DSP; Step S4062, the DSP receives the terminal code streams; Step S408, the VPU module in the DSP decodes the received compressed image code streams of the terminals to obtain each terminal's image linear code; Step S410, the VPU performs gesture recognition on each terminal's decoded image linear code; Step S412, it is judged whether the image linear code contains a multi-picture adjustment: if there is no multi-picture adjustment gesture, the process jumps directly to Step S420; if there is an adjustment gesture, Step S414 is performed; Step S414, it is judged whether the gesture is legal: if not, the process goes directly to Step S420; if it is, the recognized multi-picture adjustment information is sent to the MP module; Step S416, the MP receives the corresponding multi-picture adjustment information sent by the DSP; Step S418, the corresponding relationships between the codec nodes are adjusted; Step S420, the DSP performs multi-picture synthesis and re-encoding on the decoded images of the terminals according to the codec relationships; Step S422, the encoded code stream is sent back to each terminal.
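Steps S410-S418 amount to a recognize → validate → forward loop on the DSP side, with the MP consuming the result. The following sketch stubs the recognizer and assumes a legality rule of picture counts 1-4; all names and that rule are illustrative assumptions, not from the patent:

```python
LEGAL_PICTURE_COUNTS = {1, 2, 3, 4}   # assumed legality rule for gestures

def recognize_gesture(decoded_image):
    """Stand-in for gesture recognition on the decoded linear code: the image
    may carry a requested picture count, or None if no gesture is present."""
    return decoded_image.get("requested_pictures")

def process_terminal(decoded_image, mp_inbox):
    """S410-S414: recognize, validate, and forward adjustment info to the MP."""
    requested = recognize_gesture(decoded_image)
    if requested is None:                        # S412: no adjustment gesture
        return False
    if requested not in LEGAL_PICTURE_COUNTS:    # S414: illegal gesture, ignore
        return False
    mp_inbox.append(requested)                   # S414 -> S416: notify the MP
    return True

def mp_apply(mp_inbox, node_relations, terminal_id):
    """S416-S418: the MP consumes the adjustment and updates node relations."""
    if mp_inbox:
        node_relations[terminal_id] = mp_inbox.pop(0)
    return node_relations
```

Only a legal gesture reaches the MP; illegal or absent gestures fall through to the unchanged synthesis step (S420), as in the flowchart.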
The method of adjusting the multi-picture layout through gesture recognition in a conference television system is illustrated below with the gestures captured by a terminal and the pictures presented to the terminals. The system used in this preferred embodiment is a conference television system processed according to the flowcharts of the embodiments above. The user first convenes a four-picture 720P (1280x720), 30-frame conference of four terminals A, B, C, and D. The conference control interface sends this information to the MP module on the MCU side, which then performs 720P (1280x720), 30-frame signaling interaction with the terminals. If all four terminals support this format, the interaction succeeds, and terminals A, B, C, and D send 720P (1280x720), 30-frame compressed image code streams to the MCU; assume that at this point no user has yet controlled the multi-picture layout by gesture. On receiving the code streams, the MCU sends all four compressed image streams to the VPU module on the MCU side. The VPU decodes them with four decode nodes and performs gesture recognition on each; no multi-picture adjustment information is found, so the four decode nodes send their image linear codes to the same encode node a. Encode node a scales each image down to one quarter, composes the multi-picture layout of FIG. 7, encodes the composed image linear code, and returns it to the terminals. Each terminal decodes and displays the received multi-picture compressed code stream, so all four users see the four-picture image of FIG. 7. Now suppose the user at terminal A wants to see the images of only three terminals, i.e., a three-picture layout. The user makes one of the two gestures shown in FIG. 5 in front of the camera (either will do), and the terminal sends the image containing the gesture to the MCU. On receiving the code stream containing the gesture information, the MCU again sends all four compressed image streams to the VPU module, which decodes them with four decode nodes and performs gesture recognition on each. Decode nodes B, C, and D find no multi-picture adjustment information, but decode node A finds that terminal A requests a three-picture adjustment and sends this information to the MP. The MP adds an encode node b to encode the image required by terminal A and adjusts the codec-node relationships: besides sending their image linear codes to encode node a for multi-picture synthesis, decode nodes A, B, and C now also send their image linear codes to encode node b. After synthesis, encode node a scales each image down to one quarter, composes the layout of FIG. 7, encodes the image linear code, and returns it to terminals B, C, and D; encode node b scales each image down to one quarter, composes the layout of FIG. 6, encodes the image linear code, and returns it to terminal A. Terminal A receives the compressed code stream from encode node b and, after decoding and display, sees the three-picture image of FIG. 6; terminals B, C, and D receive the compressed code stream from encode node a and see the four-picture image of FIG. 7. This completes one operation of reducing the number of pictures through gesture recognition.
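The encode-node fan-out in the worked example — everyone on node a until terminal A's gesture adds a node b fed by decode nodes A, B, and C — can be sketched as a small routing computation. Node names follow the example; the data structures and function name are illustrative assumptions:

```python
def build_relations(adjustments):
    """Map each encode node to the decode nodes it composes, and each terminal
    to the encode node whose stream it receives.

    `adjustments` maps a requesting terminal to the source terminals it wants
    composed (empty -> everyone stays on the default four-picture node "a").
    """
    relations = {"a": ["A", "B", "C", "D"]}        # default four-picture layout
    routing = {t: "a" for t in ["A", "B", "C", "D"]}
    next_node = "b"                                 # new nodes: b, c, ...
    for terminal, sources in adjustments.items():
        relations[next_node] = sources              # new encode node's inputs
        routing[terminal] = next_node               # only the requester switches
        next_node = chr(ord(next_node) + 1)
    return relations, routing

# Terminal A requests a three-picture layout composed from A, B, and C.
relations, routing = build_relations({"A": ["A", "B", "C"]})
```

With this adjustment, decode nodes A, B, and C feed both encode nodes, terminal A receives node b's three-picture stream, and B, C, and D keep receiving node a's four-picture stream — matching the example above.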
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they may be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they may be fabricated as individual integrated circuit modules, or multiple of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software. The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims

1. A parameter control method, comprising the following steps:
parsing audio data and/or video data from a terminal;
determining that the audio data and/or the pictures in the video data contain audio and/or a picture indicating that a parameter is to be adjusted;
adjusting the parameter according to the audio and/or picture indicating that the parameter is to be adjusted.
2. The method according to claim 1, wherein it is determined within a preset time period that the audio data and/or the pictures in the video data contain the audio and/or picture indicating that the parameter is to be adjusted.
3. The method according to claim 1, wherein, in the case that the parameter is one used for sending a media stream to the terminal, after the parameter is adjusted according to the audio and/or picture indicating that the parameter is to be adjusted, the media stream is sent to the terminal using the adjusted parameter.
4. The method according to claim 3, wherein, when the media stream is a video stream, the parameter comprises at least one of: the number of terminal pictures included in the video stream, the layout in which the video stream is displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream.
5. The method according to any one of claims 1 to 4, wherein an image linear code is obtained after parsing the video data from the terminal, and it is determined that the image linear code contains a picture indicating that the parameter is to be adjusted.
6. A parameter control apparatus, comprising:
a parsing module configured to parse audio data and/or video data from a terminal;
a judging module configured to determine that the audio data and/or the pictures in the video data contain audio and/or a picture indicating that a parameter is to be adjusted;
an adjustment module configured to adjust the parameter according to the audio and/or picture indicating that the parameter is to be adjusted.
7. The apparatus according to claim 6, wherein the judging module is configured to determine, within a preset time period, that the audio data and/or the pictures in the video data contain the audio and/or picture indicating that the parameter is to be adjusted.
8. The apparatus according to claim 6, wherein, in the case that the parameter is one used for sending a media stream to the terminal, the adjustment module is configured to send the media stream to the terminal using the adjusted parameter after adjusting the parameter according to the audio and/or picture indicating that the parameter is to be adjusted.
9. The apparatus according to claim 8, wherein, when the media stream is a video stream, the parameter comprises at least one of: the number of terminal pictures included in the video stream, the layout in which the video stream is displayed on the terminal, the frame rate of the video stream, the bit rate of the video stream, and the format of the video stream.
10. The apparatus according to any one of claims 6 to 9, wherein the parsing module is configured to obtain an image linear code after parsing the video data from the terminal, and the judging module is configured to determine that the image linear code contains a picture indicating that the parameter is to be adjusted.
PCT/CN2012/074031 2011-06-20 2012-04-13 Parameter control method and device WO2012174931A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2011101656743A CN102256099A (en) 2011-06-20 2011-06-20 Parameter control method and device
CN201110165674.3 2011-06-20

Publications (1)

Publication Number Publication Date
WO2012174931A1 true WO2012174931A1 (en) 2012-12-27



Also Published As

Publication number Publication date
CN102256099A (en) 2011-11-23

