WO2014190811A1 - Method and apparatus for remote display endpoint capability exchange, and data flow - Google Patents

Method and apparatus for remote display endpoint capability exchange, and data flow

Info

Publication number
WO2014190811A1
Authority
WO
WIPO (PCT)
Prior art keywords
endpoint
remote presentation
remote
request
video
Prior art date
Application number
PCT/CN2014/075201
Other languages
French (fr)
Chinese (zh)
Inventor
王亮
叶小阳
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2014190811A1 publication Critical patent/WO2014190811A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/14 Systems for two-way working
    • H04N 7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N 7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals

Definitions

  • the present invention relates to the field of communications, and in particular to a capability interaction method and apparatus for a remote presentation endpoint, and a data stream.
  • in the related art, capability negotiation is performed via the H.245 protocol.
  • only the capabilities of traditional video conferencing endpoints can be negotiated; the capabilities of a remote presentation endpoint cannot be negotiated.
  • specifically, when capabilities are exchanged, only the codec capabilities of traditional video conferencing endpoints can be exchanged, while the new media stream receiving and sending capabilities of remote presentation endpoints cannot be exchanged.
  • when a logical channel is opened, only the codec attribute of the logical channel can be specified.
  • the present invention provides a capability interaction method and apparatus for a remote presentation endpoint, and a data stream, to address at least the above problems.
  • a capability interaction method for a remote presentation endpoint, including: performing a capability interaction between a first remote presentation endpoint and a second remote presentation endpoint, where the capability interaction message carries a remote presentation endpoint capability set, and the capability set includes a remote presentation endpoint transmission capability set; the first remote presentation endpoint receives a mode request message of the second remote presentation endpoint; and the first remote presentation endpoint opens a logical channel between the first remote presentation endpoint and the second remote presentation endpoint according to the result of the capability interaction and the received mode request message.
  • performing the capability interaction between the first remote presentation endpoint and the second remote presentation endpoint comprises: sending, by the first remote presentation endpoint, a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries a first transmission capability set of the first remote presentation endpoint;
  • the first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries the second transmission capability set of the second remote presentation endpoint;
  • the first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, where the first mode request message carries the sending parameter of the first remote presentation endpoint;
  • the first remote presentation endpoint sends a second mode request message to the second remote presentation endpoint, where the second mode request message carries the sending parameter of the second remote presentation endpoint; and the first remote presentation endpoint, according to a result of the mode request process corresponding to the second mode request message, sends a first logical channel request to the second remote presentation endpoint, where the first logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint.
  • performing the capability interaction between the first remote presentation endpoint and the second remote presentation endpoint comprises: the first remote presentation endpoint sends a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first transmission capability set of the first remote presentation endpoint; the first remote presentation endpoint receives a third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the sending parameter of the first remote presentation endpoint; the first remote presentation endpoint receives a fourth capability set interaction request sent by the second remote presentation endpoint, where the fourth capability set interaction request carries a second transmission capability set of the second remote presentation endpoint; the first remote presentation endpoint sends a fourth mode request message to the second remote presentation endpoint, where the fourth mode request message carries the sending parameter of the second remote presentation endpoint; and the first remote presentation endpoint, according to a result of the mode request process corresponding to the fourth mode request message, sends a third logical channel request to the second remote presentation endpoint, where the third logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint.
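To make the two claimed orderings easier to follow, the sketch below lists them as message sequences between a first endpoint A and a second endpoint B. This is an illustrative reading only; the message names follow H.245 conventions (TerminalCapabilitySet, RequestMode, OpenLogicalChannel) and are assumptions, since the application itself does not prescribe a concrete encoding.

```python
# Illustrative message sequences for the two claimed capability-interaction orderings.
# Endpoint A is the "first" remote presentation endpoint, endpoint B the "second".
# Message names follow H.245 conventions; the exact encoding is not specified here.

MANNER_1 = [
    ("A->B", "TerminalCapabilitySet", "carries A's transmission capability set (1st request)"),
    ("B->A", "TerminalCapabilitySet", "carries B's transmission capability set (2nd request)"),
    ("B->A", "RequestMode",           "1st mode request: the parameters B asks A to send"),
    ("A->B", "RequestMode",           "2nd mode request: the parameters A asks B to send"),
    ("A->B", "OpenLogicalChannel",    "1st channel request: A opens its forward channel"),
    ("B->A", "OpenLogicalChannel",    "2nd channel request: B opens its forward channel"),
]

MANNER_2 = [
    ("A->B", "TerminalCapabilitySet", "carries A's transmission capability set (3rd request)"),
    ("B->A", "RequestMode",           "3rd mode request: the parameters B asks A to send"),
    ("B->A", "TerminalCapabilitySet", "carries B's transmission capability set (4th request)"),
    ("A->B", "RequestMode",           "4th mode request: the parameters A asks B to send"),
    ("A->B", "OpenLogicalChannel",    "3rd channel request: A opens its forward channel"),
    ("B->A", "OpenLogicalChannel",    "4th channel request: B opens its forward channel"),
]
```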
  • after the first remote presentation endpoint sends the first capability set interaction request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the first capability set interaction request; after the first remote presentation endpoint sends the third capability set interaction request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the third capability set interaction request.
  • after the first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the second capability set interaction request to the second remote presentation endpoint; after the first remote presentation endpoint receives the fourth capability set interaction request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint.
  • after the first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the first mode request message to the second remote presentation endpoint; after the first remote presentation endpoint receives the third mode request message sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the third mode request message to the second remote presentation endpoint.
  • after the first remote presentation endpoint sends the second mode request message to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the second mode request message to the second remote presentation endpoint; after the first remote presentation endpoint sends the fourth mode request message to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth mode request message to the second remote presentation endpoint.
  • after the first remote presentation endpoint sends the first logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the second mode request message, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the first logical channel request.
  • after the first remote presentation endpoint sends the third logical channel request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the third logical channel request.
  • after the first remote presentation endpoint receives the second logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the second logical channel request to the second remote presentation endpoint.
  • after the first remote presentation endpoint receives the fourth logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth logical channel request to the second remote presentation endpoint.
  • the remote presentation endpoint transmission capability set includes a capture parameter, wherein the capture parameter includes a universal parameter, a video parameter, and/or an audio parameter.
  • the general parameters include media capture content, scene description, scene switching policy, general spatial information, and/or general coding information; the media capture content indicates the purpose of the media capture; the scene description is used to provide a description of the overall scene.
  • the scene switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a site switching policy and a partial switching policy; the site switching policy is used to indicate that all captures are switched at the same time, to ensure that the captures come together from the same endpoint site; the partial switching policy is used to indicate that different captures can be switched at different times and can come from the same and/or different remote presentation endpoints.
  • the general space information includes a scene area and/or an area scale parameter, where the scene area parameter is used to indicate a range of an overall scene related to the endpoint, and the area scale indicates a scale used by the spatial information parameter.
  • the general coding information includes a total maximum bandwidth, a total maximum number of pixels per second, and/or a total maximum number of macroblocks per second, wherein the total maximum bandwidth indicates the maximum number of bits per second of all code streams of a preset type sent by the endpoint; the total maximum number of pixels per second indicates the maximum number of pixels per second that can be independently coded across all coding groups; and the total maximum number of macroblocks per second indicates the maximum number of macroblocks per second of all video streams sent by the endpoint.
  • the video parameters include: a video capture quantity, video capture space information, and/or video capture coding information; the video capture quantity is used to indicate the number of video captures.
  • said video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein said capture area is for indicating a spatial location of the video capture in the overall captured scene; A capture point, used to indicate the location of the video capture in the captured scene; a point on the capture line that describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point.
  • the video capture coding information includes a maximum video bandwidth, a maximum number of pixels per second, a width of a maximum video resolution, a height of a maximum video resolution, and/or a maximum video frame rate; the maximum video bandwidth indicates the maximum number of bits per second of a single video encoding; the maximum number of pixels per second indicates the maximum number of pixels per second of a single video encoding; the width of the maximum video resolution indicates the width, in pixels, of the maximum video resolution; the height of the maximum video resolution indicates the height, in pixels, of the maximum video resolution; and the maximum video frame rate indicates the maximum video frame rate.
  • the audio parameters include an audio capture amount, audio capture space information, and/or audio capture coding information; the audio capture number is used to indicate the number of audio captures.
  • the audio capture space information includes: a capture area, and/or a capture point, where the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; the capture point is used to indicate Captures the location of the audio capture in the scene.
  • the audio capture coding information includes: an audio channel format and/or a maximum audio bandwidth; the audio channel format is used to indicate an attribute of an audio channel; and the maximum audio bandwidth is used to indicate the maximum number of bits per second of a single audio encoding.
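The transmission capability set described above can be pictured as a small data structure. The following Python sketch is purely illustrative: the application does not define a concrete encoding, and every field name here is an assumption chosen to mirror the parameters listed in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class GeneralParameters:
    """General (universal) capture parameters."""
    media_capture_content: Optional[str] = None    # purpose of the media capture
    scene_description: Optional[str] = None        # description of the overall scene
    scene_switching_policy: Optional[str] = None   # "site" (switch all captures together) or "partial"
    scene_area: Optional[Tuple[float, ...]] = None  # range of the overall scene related to the endpoint
    area_scale: Optional[str] = None               # scale used by the spatial information
    max_total_bandwidth_bps: Optional[int] = None  # bits/s over all code streams of a preset type
    max_total_pixels_per_sec: Optional[int] = None
    max_total_macroblocks_per_sec: Optional[int] = None

@dataclass
class VideoCapture:
    """Per-capture video parameters (spatial and coding information)."""
    capture_area: Optional[Tuple[float, ...]] = None             # location in the overall captured scene
    capture_point: Optional[Tuple[float, float, float]] = None
    point_on_capture_line: Optional[Tuple[float, float, float]] = None  # 2nd point on the optical axis
    max_video_bandwidth_bps: Optional[int] = None  # bits/s of a single video encoding
    max_pixels_per_sec: Optional[int] = None
    max_width_px: Optional[int] = None
    max_height_px: Optional[int] = None
    max_frame_rate: Optional[float] = None

@dataclass
class AudioCapture:
    """Per-capture audio parameters (spatial and coding information)."""
    capture_area: Optional[Tuple[float, ...]] = None
    capture_point: Optional[Tuple[float, float, float]] = None
    channel_format: Optional[str] = None           # attribute of the audio channel, e.g. "mono"
    max_audio_bandwidth_bps: Optional[int] = None  # bits/s of a single audio encoding

@dataclass
class TransmissionCapabilitySet:
    """Remote presentation endpoint transmission capability set (capture parameters)."""
    general: GeneralParameters = field(default_factory=GeneralParameters)
    video_captures: List[VideoCapture] = field(default_factory=list)  # length = video capture quantity
    audio_captures: List[AudioCapture] = field(default_factory=list)  # length = audio capture quantity
```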
  • a capability interaction device for a remote presentation endpoint, applied to a first remote presentation endpoint, including: an interaction module, configured to perform a capability interaction with a second remote presentation endpoint, where the capability interaction message carries a remote presentation endpoint capability set, and the capability set includes a remote presentation endpoint transmission capability set; a first receiving module, configured to receive a mode request message of the second remote presentation endpoint; and a processing module, configured to open a logical channel between the first remote presentation endpoint and the second remote presentation endpoint according to the result of the capability interaction and the received mode request message.
  • the interaction module includes: a first sending module, configured to send a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries a first transmission capability set of the first remote presentation endpoint; a second receiving module, configured to receive a second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries a second transmission capability set of the second remote presentation endpoint; a third receiving module, configured to receive a first mode request message sent by the second remote presentation endpoint, where the first mode request message carries the sending parameter of the first remote presentation endpoint; a second sending module, configured to send a second mode request message to the second remote presentation endpoint, where the second mode request message carries the sending parameter of the second remote presentation endpoint; and a third sending module, configured to send a first logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the second mode request message, where the first logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint.
  • the interaction module includes: a fourth sending module, configured to send a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first transmission capability set of the first remote presentation endpoint; a fifth receiving module, configured to receive a third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the sending parameter of the first remote presentation endpoint; a sixth receiving module, configured to receive a fourth capability set interaction request sent by the second remote presentation endpoint, where the fourth capability set interaction request carries the second transmission capability set of the second remote presentation endpoint; a fifth sending module, configured to send a fourth mode request message to the second remote presentation endpoint, where the fourth mode request message carries the sending parameter of the second remote presentation endpoint; and a sixth sending module, configured to send a third logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the fourth mode request message, where the third logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint.
  • the apparatus further includes: an eighth receiving module, configured to receive, after the first sending module sends the first capability set interaction request to the second remote presentation endpoint, a response message corresponding to the first capability set interaction request sent by the second remote presentation endpoint; and a ninth receiving module, configured to receive, after the fourth sending module sends the third capability set interaction request to the second remote presentation endpoint, a response message corresponding to the third capability set interaction request sent by the second remote presentation endpoint.
  • the apparatus further includes: a seventh sending module, configured to send, after the second receiving module receives the second capability set interaction request sent by the second remote presentation endpoint, a response message corresponding to the second capability set interaction request to the second remote presentation endpoint; and an eighth sending module, configured to send, after the sixth receiving module receives the fourth capability set interaction request sent by the second remote presentation endpoint, a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint.
  • the apparatus further includes: a ninth sending module, configured to send, after the third receiving module receives the first mode request message sent by the second remote presentation endpoint, a response message corresponding to the first mode request message to the second remote presentation endpoint; and a tenth sending module, configured to send, after the fifth receiving module receives the third mode request message sent by the second remote presentation endpoint, a response message corresponding to the third mode request message to the second remote presentation endpoint.
  • the apparatus further includes: an eleventh sending module, configured to send, after the second sending module sends the second mode request message to the second remote presentation endpoint, a response message corresponding to the second mode request message to the second remote presentation endpoint; and a twelfth sending module, configured to send, after the fifth sending module sends the fourth mode request message to the second remote presentation endpoint, a response message corresponding to the fourth mode request message to the second remote presentation endpoint.
  • the apparatus further includes: a tenth receiving module, configured to receive, after the third sending module sends the first logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the second mode request message, a response message corresponding to the first logical channel request sent by the second remote presentation endpoint; and an eleventh receiving module, configured to receive, after the sixth sending module sends the third logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the fourth mode request message, a response message corresponding to the third logical channel request sent by the second remote presentation endpoint.
  • the apparatus further includes: a thirteenth sending module, configured to send, after the fourth receiving module receives the second logical channel request sent by the second remote presentation endpoint, a response message corresponding to the second logical channel request to the second remote presentation endpoint; and a fourteenth sending module, configured to send, after the seventh receiving module receives the fourth logical channel request sent by the second remote presentation endpoint, a response message corresponding to the fourth logical channel request to the second remote presentation endpoint.
  • according to another aspect of the present invention, a data stream is provided, comprising: a remote presentation endpoint capability set, wherein the remote presentation endpoint capability set includes a transmission capability set, the transmission capability set includes a capture parameter, and the capture parameter includes: a general parameter, a video parameter, and/or an audio parameter.
  • the general parameters include media capture content, scene description, scene switching policy, general spatial information, and/or general coding information; the media capture content indicates the purpose of the media capture; the scene description is used to provide a description of the overall scene.
  • the scene switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a site switching policy and a partial switching policy; the site switching policy is used to indicate that all captures are switched at the same time, to ensure that the captures come together from the same endpoint site; the partial switching policy is used to indicate that different captures can be switched at different times and can come from the same and/or different remote presentation endpoints.
  • the general spatial information includes a scene area and/or an area scale parameter, where the scene area parameter is used to indicate the range of the overall scene related to the endpoint, and the area scale indicates the type of scale used by the spatial information parameters.
  • the general coding information includes a total maximum bandwidth, a total maximum number of pixels per second, and/or a total maximum number of macroblocks per second, wherein the total maximum bandwidth indicates the maximum number of bits per second of all code streams of a preset type sent by the endpoint.
  • the video parameters include: a video capture quantity, video capture space information, and/or video capture coding information; the video capture quantity is used to indicate the number of video captures.
  • said video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein said capture area is for indicating a spatial location of the video capture in the overall captured scene;
  • a capture point used to indicate the location of the video capture in the captured scene;
  • a point on the capture line that describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point.
  • the video capture coding information includes a maximum video bandwidth, a maximum number of pixels per second, a width of a maximum video resolution, a height of a maximum video resolution, and/or a maximum video frame rate; the maximum video bandwidth indicates the maximum number of bits per second of a single video encoding; the maximum number of pixels per second indicates the maximum number of pixels per second of a single video encoding; the width of the maximum video resolution indicates the width, in pixels, of the maximum video resolution; the height of the maximum video resolution indicates the height, in pixels, of the maximum video resolution; and the maximum video frame rate indicates the maximum video frame rate.
  • the audio parameters include an audio capture amount, audio capture space information, and/or audio capture coding information; the audio capture number is used to indicate the number of audio captures.
  • the audio capture space information includes: a capture area, and/or a capture point, where the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; the capture point is used to indicate Captures the location of the audio capture in the scene.
  • the audio capture coding information includes: an audio channel format and/or a maximum audio bandwidth; the audio channel format is used to indicate an attribute of an audio channel; and the maximum audio bandwidth is used to indicate the maximum number of bits per second of a single audio encoding.
  • FIG. 1 is a flowchart of a capability interaction method of a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 2 is a structural block diagram of a capability interaction device for remotely presenting an endpoint according to an embodiment of the present invention
  • FIG. 4 is a flowchart 1 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 6 is a flowchart 3 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 7 is flowchart 4 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 8 is flowchart 5 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 9 is a flowchart 6 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 10 is flowchart 7 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 11 is flowchart 8 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 12 is flowchart 9 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 13 is flowchart 10 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 14 is flowchart 11 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention
  • FIG. 15 is a flowchart 12 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention.
  • FIG. 17 is a schematic diagram of a remote presentation endpoint capability set in accordance with an embodiment of the present invention.
  • BEST MODE FOR CARRYING OUT THE INVENTION: The present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.
  • the present embodiment provides a capability interaction method for a remote presentation endpoint.
  • FIG. 1 is a flowchart of a capability interaction method for a remote presentation endpoint according to an embodiment of the present invention.
  • the method includes the following steps S102 to S106.
  • Step S102: A capability interaction is performed between the first remote presentation endpoint and the second remote presentation endpoint, where the capability interaction message carries a remote presentation endpoint capability set, and the capability set includes the remote presentation endpoint transmission capability set.
  • Step S104 The first telepresence endpoint receives a mode request message of the second telepresence endpoint.
  • Step S106 The first telepresence endpoint opens a logical channel between the first telepresence endpoint and the second telepresence endpoint according to the result of the capability interaction and the received mode request information.
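A minimal sketch of steps S102 to S106 from the first endpoint's point of view is shown below. The transport object, helper methods and H.245-style message names are assumptions made for illustration; they are not prescribed by the embodiment.

```python
# Hedged sketch of steps S102-S106 for the first remote presentation endpoint.
# `endpoint` is an assumed object wrapping the signalling transport.

def capability_interaction(endpoint, peer):
    # S102: capability interaction - both sides exchange capability messages that
    # carry the remote presentation endpoint (transmission) capability set.
    local_caps = endpoint.transmission_capability_set()
    endpoint.send(peer, "TerminalCapabilitySet", local_caps)
    remote_caps = endpoint.receive(peer, "TerminalCapabilitySet")

    # S104: receive the peer's mode request, i.e. the sending parameters the peer
    # asks this endpoint to use.
    mode_request = endpoint.receive(peer, "RequestMode")

    # S106: open the logical channel(s) based on the capability interaction result
    # and the received mode request.
    channel_params = endpoint.select_channel_parameters(local_caps, remote_caps, mode_request)
    endpoint.send(peer, "OpenLogicalChannel", channel_params)
```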
  • step S102 can be implemented in the following two manners.
  • Manner 1: This manner includes the following sub-steps S1 to S11.
  • Step S1 The first remote presentation endpoint sends a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries the first transmission capability set of the first remote presentation endpoint.
  • Step S3: The first remote presentation endpoint receives a second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries the second transmission capability set of the second remote presentation endpoint;
  • Step S5 The first remote presentation endpoint receives a first mode request message sent by the second remote presentation endpoint, where the first mode request message carries a sending parameter of the first remote presentation endpoint;
  • Step S7: The first remote presentation endpoint sends a second mode request message to the second remote presentation endpoint, where the second mode request message carries the sending parameter of the second remote presentation endpoint;
  • Step S9: According to the result of the mode request process corresponding to the second mode request message, the first remote presentation endpoint sends a first logical channel request to the second remote presentation endpoint, where the first logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint;
  • Step S11: The first remote presentation endpoint receives a second logical channel request sent by the second remote presentation endpoint, where the second logical channel request is used to request the second remote presentation endpoint to open a forward logical channel between the second remote presentation endpoint and the first remote presentation endpoint.
  • Manner 2: This manner includes the following sub-steps S2 to S12.
  • Step S2 The first remote presentation endpoint sends a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first transmission capability set of the first remote presentation endpoint.
  • Step S4: The first remote presentation endpoint receives a third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the sending parameter of the first remote presentation endpoint;
  • Step S6: The first remote presentation endpoint receives a fourth capability set interaction request sent by the second remote presentation endpoint, where the fourth capability set interaction request carries the second transmission capability set of the second remote presentation endpoint;
  • Step S8: The first remote presentation endpoint sends a fourth mode request message to the second remote presentation endpoint, where the fourth mode request message carries the sending parameter of the second remote presentation endpoint;
  • Step S10: According to the result of the mode request process corresponding to the fourth mode request message, the first remote presentation endpoint sends a third logical channel request to the second remote presentation endpoint, where the third logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint;
  • Step S12: The first remote presentation endpoint receives a fourth logical channel request sent by the second remote presentation endpoint, where the fourth logical channel request is used to request the second remote presentation endpoint to open a forward logical channel between the second remote presentation endpoint and the first remote presentation endpoint.
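The following sketch walks through Manner 2 imperatively from the first endpoint's side. As above, the transport object, helper names and H.245-style message names are illustrative assumptions, not part of the embodiment.

```python
# Hedged sketch of Manner 2 (sub-steps S2 to S12) for the first remote presentation
# endpoint; message and helper names are assumed for illustration.

def negotiate_manner_2(endpoint, peer):
    endpoint.send(peer, "TerminalCapabilitySet", endpoint.transmission_capability_set())  # S2
    mode_request_for_us = endpoint.receive(peer, "RequestMode")   # S4: peer's 3rd mode request
    peer_caps = endpoint.receive(peer, "TerminalCapabilitySet")   # S6: peer's capability set
    endpoint.send(peer, "RequestMode",
                  endpoint.choose_sending_parameters(peer_caps))  # S8: 4th mode request
    # S10: open our forward logical channel, reflecting what the peer asked us to send
    endpoint.send(peer, "OpenLogicalChannel",
                  endpoint.forward_channel_parameters(mode_request_for_us))
    # S12: the peer opens its forward logical channel towards us
    incoming = endpoint.receive(peer, "OpenLogicalChannel")
    endpoint.send(peer, "OpenLogicalChannelAck", incoming)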
  • after the first remote presentation endpoint sends the first capability set interaction request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the first capability set interaction request.
  • after the first remote presentation endpoint sends the third capability set interaction request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the third capability set interaction request.
  • after the first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the second capability set interaction request to the second remote presentation endpoint.
  • after the first remote presentation endpoint receives the fourth capability set interaction request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint.
  • after the first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the first mode request message to the second remote presentation endpoint; after the first remote presentation endpoint receives the third mode request message sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the third mode request message to the second remote presentation endpoint.
  • after the first remote presentation endpoint sends the second mode request message to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the second mode request message to the second remote presentation endpoint; after the first remote presentation endpoint sends the fourth mode request message to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth mode request message to the second remote presentation endpoint.
  • after the first remote presentation endpoint sends the first logical channel request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the first logical channel request; after the first remote presentation endpoint sends the third logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the fourth mode request message, the method further includes: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the third logical channel request.
  • after the first remote presentation endpoint receives the second logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the second logical channel request to the second remote presentation endpoint.
  • the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth logical channel request to the second remote presentation endpoint.
  • the telepresence endpoint transmission capability set includes a capture parameter, where the capture parameter includes a universal parameter, a video parameter, and/or an audio parameter.
  • the general parameters include media capture content, scene description, scene switching policy, general spatial information, and/or general coding information; the media capture content indicates the purpose of the media capture; the scene description is used to provide a description of the overall scene; preferably, the scene switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a site switching policy and a partial switching policy; the site switching policy is used to indicate that all captures are switched at the same time, to ensure that the captures come together from the same endpoint site, and the partial switching policy is used to indicate that different captures can be switched at different times and can come from the same and/or different remote presentation endpoints.
  • the general spatial information includes a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene associated with the endpoint, and the area scale indicates the type of scale used by the spatial information parameter.
  • the general coding information includes a total maximum bandwidth, a total maximum number of pixels per second, and/or a total maximum number of macroblocks per second, wherein the total maximum bandwidth indicates the maximum number of bits per second of all code streams of a preset type sent by the endpoint; the total maximum number of pixels per second indicates the maximum number of pixels per second that can be independently coded across all coding groups; and the total maximum number of macroblocks per second indicates the maximum number of macroblocks per second of all video streams sent by the endpoint.
  • the video parameters include: video capture number, video capture spatial information, and/or video capture encoded information; the number of video captures is used to indicate the number of video captures.
  • the video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein the capture area is used to indicate a spatial location of the video capture in the overall captured scene; a capture point, used to indicate In the captured scene, the location of the video capture; the point on the capture line, describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point.
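As an aside, the capture point together with the point on the capture line fully determines the direction in which a capture faces. The short sketch below makes this concrete; the coordinate convention and function name are assumptions for illustration only.

```python
# Illustrative use of the video capture spatial information: the capture point and
# the "point on the capture line" (a second point on the capture device's optical
# axis) together define the direction the capture is facing.

def capture_direction(capture_point, point_on_capture_line):
    """Unit vector along the optical axis, from the capture point towards the scene."""
    dx = point_on_capture_line[0] - capture_point[0]
    dy = point_on_capture_line[1] - capture_point[1]
    dz = point_on_capture_line[2] - capture_point[2]
    norm = (dx * dx + dy * dy + dz * dz) ** 0.5
    return (dx / norm, dy / norm, dz / norm)

# e.g. capture_direction((0, 0, 1000), (0, 2000, 1000)) -> (0.0, 1.0, 0.0)
```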
  • the video capture coding information includes the maximum video bandwidth, the maximum number of pixels per second, the width of the maximum video resolution, the height of the maximum video resolution, and/or the maximum video frame rate; the maximum video bandwidth indicates the maximum number of bits per second of a single video encoding; the maximum number of pixels per second indicates the maximum number of pixels per second of a single video encoding; the width of the maximum video resolution indicates the width, in pixels, of the maximum video resolution; the height of the maximum video resolution indicates the height, in pixels, of the maximum video resolution; and the maximum video frame rate indicates the maximum video frame rate.
  • the audio parameters include an audio capture amount, audio capture spatial information, and/or audio capture encoded information; the number of audio captures is used to indicate the number of audio captures.
  • the audio capture spatial information comprises: a capture area and/or a capture point, wherein the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; and a capture point is used to represent the audio capture in the captured scene s position.
  • the audio capture encoded information comprises: an audio channel format and/or a maximum audio bandwidth; an audio channel format for indicating an attribute of the audio channel; a maximum audio bandwidth for indicating a maximum number of bits per second for a single audio encoding.
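A common use of these coding limits is to check whether a concrete encoding fits within what the peer advertised. The sketch below assumes the illustrative field names used earlier in this document; it is not part of the described method.

```python
# Illustrative check (field names assumed): does a requested video encoding fit
# within the advertised video capture coding information?

def encoding_fits(caps, width, height, frame_rate, bitrate_bps):
    pixels_per_sec = width * height * frame_rate
    return (
        (caps.max_video_bandwidth_bps is None or bitrate_bps <= caps.max_video_bandwidth_bps)
        and (caps.max_pixels_per_sec is None or pixels_per_sec <= caps.max_pixels_per_sec)
        and (caps.max_width_px is None or width <= caps.max_width_px)
        and (caps.max_height_px is None or height <= caps.max_height_px)
        and (caps.max_frame_rate is None or frame_rate <= caps.max_frame_rate)
    )

# e.g. encoding_fits(video_caps, 1920, 1080, 30, 4_000_000)
```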
  • capability interaction software for a remote presentation endpoint is provided, configured to execute the technical solutions described in the above embodiments and preferred implementations.
  • a storage medium is provided, the storage medium storing the above capability interaction software for the remote presentation endpoint, including but not limited to: an optical disk, a floppy disk, a hard disk, a rewritable memory, and the like.
  • the embodiment of the present invention further provides a capability interaction device for a remote presentation endpoint, which is applicable to a first remote presentation endpoint, and the capability interaction device of the remote presentation endpoint may be used to implement the capability interaction method and a preferred implementation manner of the remote presentation endpoint.
  • the modules involved in the capability interaction device of the remote presentation endpoint will be described below.
  • the term "module" can achieve a predetermined function a combination of software and / or hardware.
  • the systems and methods described in the following embodiments are preferably implemented in software, hardware, or a combination of software and hardware, is also possible and contemplated.
  • FIG. 2 is a structural block diagram of a capability interaction device for a remote presentation endpoint according to an embodiment of the present invention.
  • the device includes: an interaction module 22, a first receiving module 24, and a processing module 26. The foregoing structure is described below.
  • the interaction module 22 is configured to perform a capability interaction with the second remote presentation endpoint, where the capability interaction message carries a remote presentation endpoint capability set, and the capability set includes the remote presentation endpoint transmission capability set; the first receiving module 24 is configured to receive a mode request message of the second remote presentation endpoint; the processing module 26, connected to the interaction module 22 and the first receiving module 24, is configured to open a logical channel between the first remote presentation endpoint and the second remote presentation endpoint according to the result of the capability interaction and the received mode request message.
  • the interaction module 22 includes: a first sending module 220, configured to send a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries a first transmission capability set of the first remote presentation endpoint; a second receiving module 221, configured to receive a second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries a second transmission capability set of the second remote presentation endpoint; a third receiving module 222, configured to receive a first mode request message sent by the second remote presentation endpoint, where the first mode request message carries the sending parameter of the first remote presentation endpoint; a second sending module 223, configured to send a second mode request message to the second remote presentation endpoint, where the second mode request message carries the sending parameter of the second remote presentation endpoint; and a third sending module 224, configured to send a first logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the second mode request message, where the first logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint.
  • the interaction module 22 includes: a fourth sending module 226, configured to send a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first transmission capability set of the first remote presentation endpoint;
  • a fifth receiving module 227, configured to receive a third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the sending parameter of the first remote presentation endpoint;
  • a sixth receiving module 228, configured to receive a fourth capability set interaction request sent by the second remote presentation endpoint, where the fourth capability set interaction request carries the second transmission capability set of the second remote presentation endpoint; a fifth sending module 229, configured to send a fourth mode request message to the second remote presentation endpoint, where the fourth mode request message carries the sending parameter of the second remote presentation endpoint; and a sixth sending module 230, configured to send a third logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the fourth mode request message, where the third logical channel request is used to request the first remote presentation endpoint to open a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint;
  • a seventh receiving module 231, configured to receive a fourth logical channel request sent by the second remote presentation endpoint, where the fourth logical channel request is determined by the second remote presentation endpoint according to a result of the corresponding mode request process, and is used to request the second remote presentation endpoint to open a forward logical channel between the second remote presentation endpoint and the first remote presentation endpoint.
  • the foregoing apparatus further includes: an eighth receiving module 31, connected to the first sending module 220 and configured to receive, after the first sending module 220 sends the first capability set interaction request to the second remote presentation endpoint, a response message corresponding to the first capability set interaction request sent by the second remote presentation endpoint; and a ninth receiving module 32, connected to the fourth sending module 226 and configured to receive, after the fourth sending module 226 sends the third capability set interaction request to the second remote presentation endpoint, a response message corresponding to the third capability set interaction request sent by the second remote presentation endpoint.
  • the foregoing apparatus further includes: a seventh sending module 33, connected to the second receiving module 221 and configured to send, after the second receiving module 221 receives the second capability set interaction request sent by the second remote presentation endpoint, a response message corresponding to the second capability set interaction request to the second remote presentation endpoint;
  • and an eighth sending module 34, connected to the sixth receiving module 228 and configured to send, after the sixth receiving module 228 receives the fourth capability set interaction request sent by the second remote presentation endpoint, a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint.
  • the foregoing apparatus further includes: a ninth sending module 35, connected to the third receiving module 222 and configured to send, after the third receiving module 222 receives the first mode request message sent by the second remote presentation endpoint, a response message corresponding to the first mode request message to the second remote presentation endpoint; and a tenth sending module 36, connected to the fifth receiving module 227 and configured to send, after the fifth receiving module 227 receives the third mode request message sent by the second remote presentation endpoint, a response message corresponding to the third mode request message to the second remote presentation endpoint.
  • the foregoing apparatus further includes: an eleventh sending module 37, connected to the second sending module 223 and configured to send, after the second sending module 223 sends the second mode request message to the second remote presentation endpoint, a response message corresponding to the second mode request message to the second remote presentation endpoint; and a twelfth sending module 38, connected to the fifth sending module 229 and configured to send, after the fifth sending module 229 sends the fourth mode request message to the second remote presentation endpoint, a response message corresponding to the fourth mode request message to the second remote presentation endpoint.
  • the foregoing apparatus further includes: a tenth receiving module 39, connected to the third sending module 224 and configured to receive, after the third sending module 224 sends the first logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the second mode request message, a response message corresponding to the first logical channel request sent by the second remote presentation endpoint; and an eleventh receiving module 40, connected to the sixth sending module 230 and configured to receive, after the sixth sending module 230 sends the third logical channel request to the second remote presentation endpoint according to a result of the mode request process corresponding to the fourth mode request message, a response message corresponding to the third logical channel request sent by the second remote presentation endpoint.
  • the foregoing apparatus further includes: a thirteenth sending module 41, connected to the fourth receiving module 225 and configured to send, after the fourth receiving module 225 receives the second logical channel request sent by the second remote presentation endpoint, a response message corresponding to the second logical channel request to the second remote presentation endpoint;
  • and a fourteenth sending module 42, connected to the seventh receiving module 231 and configured to send, after the seventh receiving module 231 receives the fourth logical channel request sent by the second remote presentation endpoint, a response message corresponding to the fourth logical channel request to the second remote presentation endpoint.
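For readers who find the module enumeration hard to follow, the skeleton below maps the main Manner 1 modules onto methods of a single class. The class, transport interface and method names are illustrative assumptions; the module numbering follows the description above.

```python
# Illustrative skeleton of the capability interaction device (Manner 1 modules).
# `transport` is an assumed object that can send and receive signalling messages.

class CapabilityInteractionDevice:
    def __init__(self, transport):
        self.transport = transport

    # interaction module 22: first sending module 220 + second receiving module 221
    def exchange_capability_sets(self, local_transmission_caps):
        self.transport.send("TerminalCapabilitySet", local_transmission_caps)
        return self.transport.receive("TerminalCapabilitySet")

    # first receiving module 24 / third receiving module 222: peer's mode request
    def receive_mode_request(self):
        return self.transport.receive("RequestMode")

    # second sending module 223: mode request towards the peer
    def send_mode_request(self, requested_peer_sending_params):
        self.transport.send("RequestMode", requested_peer_sending_params)

    # processing module 26 / third sending module 224: open the forward logical channel
    def open_forward_channel(self, channel_params):
        self.transport.send("OpenLogicalChannel", channel_params)
        return self.transport.receive("OpenLogicalChannelAck")  # cf. tenth receiving module
```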
  • the present embodiment provides a data stream, which includes a remote presentation endpoint capability set, where the remote presentation endpoint capability set includes a transmission capability set, the transmission capability set includes a capture parameter, and the capture parameter includes: a general parameter, a video parameter, and/or an audio parameter.
  • the general parameters include media capture content, scene description, scene switching policy, general spatial information, and/or general coding information; the media capture content indicates the purpose of the media capture, and its attributes include the perspective of the media capture, the role represented by the media, whether the media is auxiliary stream content, and the language related to the media; the scene description is used to provide a description of the overall scene, for example a text description.
  • the scene switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a site switching policy and a partial switching policy; the site switching policy is used to indicate that all captures are switched at the same time, to ensure that the captures come together from the same endpoint site; the partial switching policy is used to indicate that different captures can be switched at different times and can come from the same and/or different remote presentation endpoints.
  • the general spatial information includes a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene associated with the endpoint, and the area scale indicates the type of scale used by the spatial information parameter.
  • the general coding information includes a total maximum bandwidth, a total maximum number of pixels per second, and/or a total maximum number of macroblocks per second, wherein the total maximum bandwidth indicates the maximum number of bits per second of all code streams of a preset type sent by the endpoint.
  • the video parameters include: video capture number, video capture spatial information, and/or video capture encoded information; the number of video captures is used to indicate the number of video captures.
  • the video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein the capture area is used to indicate a spatial location of the video capture in the overall captured scene; a capture point, used to indicate In the captured scene, the location of the video capture; the point on the capture line, describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point.
  • the video capture coding information includes the maximum video bandwidth, the maximum number of pixels per second, the width of the maximum video resolution, the height of the maximum video resolution, and/or the maximum video frame rate; the maximum video bandwidth indicates the maximum number of bits per second of a single video encoding; the maximum number of pixels per second indicates the maximum number of pixels per second of a single video encoding; the width of the maximum video resolution indicates the width, in pixels, of the maximum video resolution; the height of the maximum video resolution indicates the height, in pixels, of the maximum video resolution; and the maximum video frame rate indicates the maximum video frame rate.
  • the audio parameters include an audio capture amount, audio capture spatial information, and/or audio capture encoded information; the number of audio captures is used to indicate the number of audio captures.
  • the audio capture spatial information comprises: a capture area and/or a capture point, wherein the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; and a capture point is used to represent the audio capture in the captured scene s position.
  • the audio capture encoded information comprises: an audio channel format and/or a maximum audio bandwidth; an audio channel format for indicating an attribute of the audio channel; a maximum audio bandwidth for indicating a maximum number of bits per second for a single audio encoding.
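Purely by way of illustration, a transmission capability set carried in such a data stream might look like the instance below. All field names and values are assumptions chosen to match the parameters listed above; the application does not mandate any particular encoding.

```python
# Illustrative instance of a transmission capability set; field names and values
# are assumptions, not an encoding defined by the application.

example_transmission_capability_set = {
    "general": {
        "media_capture_content": "main video of the conference room",
        "scene_description": "three-camera telepresence room",
        "scene_switching_policy": "site",          # switch all captures together
        "scene_area": {"width_mm": 6000, "depth_mm": 3000},
        "area_scale": "millimetres",
        "max_total_bandwidth_bps": 12_000_000,
    },
    "video": {
        "capture_quantity": 3,
        "per_capture": {
            "max_video_bandwidth_bps": 4_000_000,
            "max_width_px": 1920,
            "max_height_px": 1080,
            "max_frame_rate": 30,
        },
    },
    "audio": {
        "capture_quantity": 3,
        "channel_format": "mono",
        "max_audio_bandwidth_bps": 64_000,
    },
}
```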
  • the telepresence endpoint capability set comprises: a telepresence endpoint symmetric capability set
  • the telepresence endpoint symmetric capability set comprises: capture rendering parameters
  • the capture rendering parameters comprise: general parameters, video parameters, and/or audio parameters.
  • the general parameters include media capture rendering content, scene description, scene switching policy, general spatial information, and/or general coding information; the media capture rendering content represents the purpose of the media capture and/or rendering, and its attributes include the perspective of the media capture, the role represented by the media, whether the media is auxiliary stream content, and the language related to the media; the scene description is used to provide a description of the overall scene; preferably, the scene switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a site switching policy and a partial switching policy; the site switching policy is used to indicate that all capture renderings are switched at the same time, to ensure that the capture renderings come together from the same endpoint site, and the partial switching policy is used to indicate that different capture renderings can be switched at different times.
  • the general spatial information includes a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene associated with the endpoint, and the area scale indicates the type of scale used by the spatial information parameter.
  • the general coding information includes a total maximum bandwidth, a total maximum number of pixels per second, and/or a total maximum number of macroblocks per second, wherein the total maximum bandwidth indicates the maximum number of bits per second of all code streams of a preset type sent and/or received by the capture rendering endpoint; the total maximum number of pixels per second indicates the maximum number of pixels per second that can be independently coded in the coding groups sent and/or received by the endpoint; and the total maximum number of macroblocks per second indicates the maximum number of macroblocks per second of all video streams sent and/or received by the endpoint.
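These "total" limits bound the aggregate of all streams rather than any single encoding. The small check below illustrates that reading; the function and field names are assumptions made for this sketch.

```python
# Illustrative check of the aggregate ("total") coding limits across all streams.

def within_total_limits(streams, max_total_bandwidth_bps, max_total_macroblocks_per_sec):
    """streams: iterable of (bitrate_bps, macroblocks_per_sec) for every video stream."""
    total_bitrate = sum(bitrate for bitrate, _ in streams)
    total_macroblocks = sum(mb for _, mb in streams)
    return (total_bitrate <= max_total_bandwidth_bps
            and total_macroblocks <= max_total_macroblocks_per_sec)

# e.g. three 1080p30 streams at 4 Mbit/s each (8160 macroblocks/frame * 30 fps = 244_800):
# within_total_limits([(4_000_000, 244_800)] * 3, 15_000_000, 1_000_000) -> True
```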
  • the video parameters include: video capture rendering amount, video capture rendering space information, and/or video capture rendering encoding information; the number of video capture renderings is used to indicate the number of video captures and/or renderings.
  • the video capture/rendering spatial information comprises a capture/rendering area, a capture/rendering point, and/or a point on the capture/rendering line, where the capture/rendering area indicates the spatial location of the video capture/rendering within the overall captured and/or rendered scene; the capture/rendering point indicates the position of the video capture and/or rendering in the captured and/or rendered scene; the point on the capture/rendering line describes the spatial position of a second point on the optical axis of the capture and/or rendering device, the first point being the capture and/or rendering point.
  • the video capture/rendering encoded information includes the maximum video bandwidth, the maximum number of pixels per second, the maximum video resolution width, the maximum video resolution height, and/or the maximum video frame rate; the maximum video bandwidth indicates the maximum number of bits per second for a single video encoding; the maximum number of pixels per second indicates the maximum number of pixels per second for a single video encoding; the maximum video resolution width and height indicate the width and height of the maximum video resolution in pixels; the maximum video frame rate indicates the maximum video frame rate.
  • the audio parameters include the number of audio capture/renderings, audio capture/rendering spatial information, and/or audio capture/rendering encoded information; the number of audio capture/renderings indicates how many audio capture/renderings there are.
  • the audio capture/rendering spatial information comprises a capture/rendering area and/or a capture/rendering point, where the capture/rendering area indicates the spatial location of the audio capture and/or rendering within the overall captured and/or rendered scene, and the capture/rendering point indicates the position of the audio capture and/or rendering in the captured and/or rendered scene.
  • the audio capture/rendering encoded information comprises an audio channel format and/or a maximum audio bandwidth; the audio channel format indicates the attributes of the audio channel; the maximum audio bandwidth indicates the maximum number of bits per second for a single audio encoding.
  • as a preferred implementation, the telepresence endpoint capability set includes a telepresence endpoint reception capability set, which includes rendering parameters, where the rendering parameters include general parameters, video parameters, and/or audio parameters.
  • the general parameters include media rendering content, a scene description, a scene switching policy, general spatial information, and/or general encoding information, where the media rendering content represents the attributes of the captured content required by the rendering endpoint, such as the properties of the required media capture;
  • the scene description provides a description of the overall scene;
  • the scene switching policy indicates the supported media switching policies and preferably includes a location switching policy and/or a partial switching policy, where the location switching policy switches all renderings at the same time so that the renderings come together from the same endpoint site, and the partial switching policy switches different renderings at different times, from the same and/or different endpoints.
  • the general spatial information includes: a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene related to the endpoint, and the area scale is used to indicate the type of the scale used by the spatial information parameter.
  • the general encoding information includes the total maximum bandwidth, the total maximum number of pixels per second, and/or the total maximum number of macroblocks per second, where the total maximum bandwidth indicates the maximum number of bits per second over all streams of the preset type received by the rendering endpoint;
  • the total maximum number of pixels per second indicates the maximum number of pixels per second processed over all independently encoded streams in the encoding group; the total maximum number of macroblocks per second indicates the maximum number of macroblocks per second over all video streams received by the endpoint.
  • the video parameters include: the number of video renderings, video rendering spatial information, and/or video rendering encoding information, where the number of video renderings indicates how many video renderings there are, and the video rendering spatial information indicates which portion of the overall rendered scene the video rendering represents.
  • the video rendering encoding information includes the maximum video bandwidth, the maximum number of pixels per second, the maximum video resolution width, the maximum video resolution height, and/or the maximum video frame rate; the maximum video bandwidth indicates the maximum number of bits per second for a single video encoding; the maximum number of pixels per second indicates the maximum number of pixels per second for a single video encoding; the maximum video resolution width and height indicate the width and height of the maximum video resolution in pixels; the maximum video frame rate indicates the maximum video frame rate.
  • the audio parameters include: the number of audio renderings, audio rendering spatial information, and/or audio rendering encoding information, where the number of audio renderings indicates how many audio renderings there are, and the audio rendering spatial information indicates the position of the audio rendering in the overall rendered scene.
  • the audio rendering encoded information comprises: an audio channel format and/or a maximum audio bandwidth, wherein the audio channel format is used to represent an attribute of the audio channel; and the maximum audio bandwidth is used to represent a maximum number of bits per second for a single audio encoding.
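  • The reception capability set parameters enumerated above can be modeled as a simple data structure. The following is a minimal Python sketch; all class and field names (for example GeneralParams, VideoRendering) are illustrative choices made here and are not identifiers defined by this application.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class GeneralParams:
    # General parameters of the reception capability set
    scene_description: str = ""
    scene_switching_policy: str = "location"  # "location" (all renderings switch together) or "partial"
    scene_area: Optional[tuple] = None        # range of the overall scene related to the endpoint
    area_scale: str = "mm"                    # kind of scale used by the spatial parameters
    max_bandwidth_bps: int = 0                # max bits/s over all streams of the preset type
    max_pixels_per_second: int = 0            # max pixels/s over all independently encoded streams
    max_macroblocks_per_second: int = 0       # max macroblocks/s over all received video streams

@dataclass
class VideoRendering:
    # One video rendering and its decoding limits
    rendering_id: str                         # e.g. "VR0"
    spatial_position: str                     # e.g. "left", "center", "right"
    content_attribute: str                    # e.g. "main", "panoramic", "VIP"
    max_video_bandwidth_bps: int
    max_pixels_per_second: int = 0
    max_resolution: tuple = (0, 0)            # (width, height) in pixels
    max_frame_rate: float = 0.0

@dataclass
class AudioRendering:
    rendering_id: str                         # e.g. "AR0"
    content_attribute: str                    # e.g. "main"
    channel_format: str = "stereo"
    max_audio_bandwidth_bps: int = 0

@dataclass
class ReceptionCapabilitySet:
    general: GeneralParams
    video_renderings: List[VideoRendering] = field(default_factory=list)
    audio_renderings: List[AudioRendering] = field(default_factory=list)
```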
  • Preferred Embodiment 1: this preferred embodiment provides a capability negotiation method that enables remote presentation endpoints to negotiate scene descriptions, codec capabilities, and multiplexed channels carrying multiple media streams.
  • FIG. 4 is a flowchart 1 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention.
  • the negotiation procedure includes a capability set interaction phase and a logical channel open phase.
  • the method includes the following steps S401 and S402. Step S401, capability set interaction: the two telepresence endpoints exchange capabilities, with the messages carrying the telepresence endpoint capability set. Step S402, logical channel open: a logical channel is opened between the two telepresence endpoints, and the negotiated channel attributes are specified.
  • a negotiation mode A-1 is provided.
  • FIG. 5 is a flowchart 2 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention, as shown in FIG. 5.
  • the capability set carried in the negotiation mode capability set interaction message is a receiving capability set, and the receiving capability set includes the endpoint receiving capability related parameter, and the following steps S501 and S502 are included.
  • Step S501 capability set interaction: capability set interaction between two remote presentation endpoints, where the message carries a remote presentation endpoint reception capability set parameter.
  • Step S502 Logical channel open: Open a logical channel between two remote rendering endpoints, and specify the channel attribute after negotiation.
  • FIG. 6 is a flowchart 3 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention.
  • the capability set interaction and the logical channel open procedure in the negotiation manner shown in FIG. 6 may include the following steps S601 to S608.
  • Step S601 The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the reception capability set of the endpoint A.
  • Step S602 Endpoint B replies to the endpoint A capability set interaction response message;
  • Step S603 The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the reception capability set of the endpoint B.
  • Step S604 The endpoint A replies to the endpoint B capability set interaction response message;
  • Step S605: According to the receiving capability of endpoint A, combined with its own sending capability, endpoint B sends an Open Logical Channel Request message to endpoint A.
  • Step S606: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S607: According to the receiving capability of endpoint B, combined with its own sending capability, endpoint A sends an Open Logical Channel Request message to endpoint B.
  • Step S608: Endpoint B replies to endpoint A with an open logical channel response message.
  • Each request/response pair is ordered in time, and the first pair of capability interaction messages is sent before the first pair of logical channel open messages.
  • Subject to this rule, the messages described in FIG. 3 may also be sent in the order 1, 3, 2, 4, 5, 6, 7, 8, where endpoints A and B first complete the capability set interaction with each other before opening the logical channels, or in the order 1, 2, 5, 6, 3, 4, 7, 8, where the logical channel on one side is opened as soon as the capability interaction on that side is completed.
  • Multiple messages sent from one endpoint may also be carried together in a single transmission, for example 1, 2+3, 4, 5, 6+7, 8.
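  • The eight-message exchange of negotiation mode A-1 can be sketched as two endpoints that each announce a reception capability set, acknowledge the peer's announcement, and then open a logical channel toward the peer. The Python sketch below only illustrates the message ordering constraint described above; the Endpoint class, message names, and capability contents are illustrative assumptions, not structures defined by this application.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Endpoint:
    name: str
    receive_capability_set: dict
    log: List[str] = field(default_factory=list)

    def send(self, peer: "Endpoint", message: str, body=None) -> None:
        # Record the message at both ends so the overall order can be inspected.
        record = f"{self.name} -> {peer.name}: {message}"
        self.log.append(record)
        peer.log.append(record)

def negotiate_mode_a1(a: Endpoint, b: Endpoint) -> None:
    # Capability set interaction: each side announces its reception capability set.
    a.send(b, "CapabilitySetRequest", a.receive_capability_set)   # message 1
    b.send(a, "CapabilitySetResponse")                            # message 2
    b.send(a, "CapabilitySetRequest", b.receive_capability_set)   # message 3
    a.send(b, "CapabilitySetResponse")                            # message 4
    # Logical channel open: each sender matches its own sending capability
    # against the peer's announced reception capability set.
    b.send(a, "OpenLogicalChannelRequest")                        # message 5
    a.send(b, "OpenLogicalChannelResponse")                       # message 6
    a.send(b, "OpenLogicalChannelRequest")                        # message 7
    b.send(a, "OpenLogicalChannelResponse")                       # message 8

if __name__ == "__main__":
    a = Endpoint("A", {"scenes": ["left", "center", "right"]})
    b = Endpoint("B", {"scenes": ["panorama"]})
    negotiate_mode_a1(a, b)
    print("\n".join(a.log))
```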
  • FIG. 7 is a flowchart 4 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 7, the method includes step S701 and step S702.
  • Step S701 Capability set interaction: The capability set interaction is performed between two remote rendering endpoints, and the message carries the parameters of the remote rendering endpoint symmetric capability set.
  • Step S702 The logical channel is opened: Open a logical channel between two remote rendering endpoints, and specify the channel attribute after negotiation.
  • FIG. 8 is a flowchart of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 8, the capability set interaction and the logical channel open procedure in the negotiation manner include the following steps S801 to S808.
  • Step S801: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the symmetric capability set of endpoint A.
  • Step S802: Endpoint B replies to endpoint A with a capability set interaction response message.
  • Step S803: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the symmetric capability set of endpoint B.
  • Step S804: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S805: According to the symmetric capability of endpoint A, combined with its own capability, endpoint B sends an Open Logical Channel Request message to endpoint A.
  • Step S806: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S807: According to the symmetric capability of endpoint B, combined with its own capability, endpoint A sends an Open Logical Channel Request message to endpoint B.
  • Step S808: Endpoint B replies to endpoint A with an open logical channel response message.
  • Each request/response pair is ordered in time, and the first pair of capability interaction messages is sent before the first pair of logical channel open messages. Subject to this rule, the messages described in FIG. 5 may also be sent in the order 1, 3, 2, 4, 5, 6, 7, 8, where endpoints A and B first complete the capability set interaction with each other before opening the logical channels, or in the order 1, 2, 5, 6, 3, 4, 7, 8, where the logical channel on one side is opened as soon as the capability interaction on that side is completed.
  • Multiple messages sent from one endpoint may also be carried together in a single transmission, for example 1, 2+3, 4, 5, 6+7, 8, where 2+3 indicates that endpoint B sends endpoint A a single piece of information that carries both message 2 and message 3.
  • Preferred Embodiment 3: this preferred embodiment provides a capability negotiation method that enables remote presentation endpoints to negotiate scene descriptions, codec capabilities, and multiplexed channels carrying multiple media streams.
  • FIG. 9 is a flowchart 6 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 9, the method includes steps S901 and S902.
  • Step S901 capability set interaction: a capability set interaction is performed between two remote presentation endpoints, and the message carries a remote presentation endpoint transmission capability set parameter.
  • Step S902 The logical channel is opened: Open the logical channel between the two telepresence endpoints, and specify the channel attribute after negotiation. The following description will be made with reference to examples.
  • Example 1 This example provides a negotiation mode A-3-1.
  • FIG. 10 is a flowchart 7 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 10, in this mode the selected parameters are returned in the capability interaction response message, and the method includes the following steps S101 to S108. Step S101: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of endpoint A. Step S102: Endpoint B replies to endpoint A with a capability set interaction response message carrying the parameters that B selected from A's transmission capability set.
  • Step S103 The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the transmission capability set of the endpoint B.
  • Step S104: Endpoint A replies to endpoint B with a capability set interaction response message carrying the parameters that A selected from B's transmission capability set.
  • Step S105: According to the capabilities selected by endpoint B in the capability set interaction response, endpoint A sends an Open Logical Channel Request message to endpoint B.
  • Step S106: Endpoint B replies to endpoint A with an open logical channel response message.
  • Step S107: According to the capabilities selected by endpoint A in the capability set interaction response, endpoint B sends an Open Logical Channel Request message to endpoint A.
  • Step S108: Endpoint A replies to endpoint B with an open logical channel response message.
  • Each request/response pair is ordered in time, and the first pair of capability interaction messages is sent before the first pair of logical channel open messages.
  • Subject to this rule, the messages described in FIG. 7 may also be sent in the order 1, 3, 2, 4, 5, 6, 7, 8, where endpoints A and B first complete the capability set interaction with each other before opening the logical channels, or in the order 1, 2, 5, 6, 3, 4, 7, 8, where the logical channel on one side is opened as soon as the capability interaction on that side is completed.
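  • In mode A-3-1 the response message returns the parameters that the receiver selected from the sender's transmission capability set. A minimal selection routine might look like the following Python sketch; the filtering criterion (keeping only the scenes whose bandwidth the receiver can decode) is an assumed example, not a rule stated in this application.

```python
def select_from_send_capability(send_capability_scenes, max_receive_bandwidth_bps):
    """Pick the scenes of the peer's transmission capability set that the
    local endpoint is able (and willing) to receive and decode."""
    selected = []
    for scene in send_capability_scenes:
        if scene["max_bandwidth_bps"] <= max_receive_bandwidth_bps:
            selected.append(scene["scene_id"])
    return selected

# Example: endpoint B selects from endpoint A's transmission capability set.
a_send_scenes = [
    {"scene_id": 1, "max_bandwidth_bps": 12_000_000},  # left/center/right video
    {"scene_id": 2, "max_bandwidth_bps": 4_000_000},   # panoramic video
    {"scene_id": 3, "max_bandwidth_bps": 128_000},     # main audio
]
print(select_from_send_capability(a_send_scenes, max_receive_bandwidth_bps=4_000_000))
# -> [2, 3], which B would carry back in its capability set interaction response
```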
  • Example 2 This example provides negotiation mode A-3-2. As shown in FIG. 11, the mode does not carry the selection parameter in the capability set interaction response message, and the reverse logical channel is requested to be opened in the open logical channel request. The following steps S111 to S118 are included.
  • Step S111 The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of the endpoint A.
  • Step S112 the endpoint B replies to the endpoint A capability set interaction response message;
  • Step S113: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the transmission capability set of endpoint B.
  • Step S114: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S115: According to the sending capability of endpoint A, combined with its own receiving capability, endpoint B sends an Open Logical Channel Request message to endpoint A.
  • Step S116: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S117: According to the sending capability of endpoint B, combined with its own receiving capability, endpoint A sends an Open Logical Channel Request message to endpoint B.
  • Step S118 Endpoint B replies to Endpoint A to open the logical channel response message.
  • Each request/response pair is ordered in time, and the first pair of capability interaction messages is sent before the first pair of logical channel open messages; subject to this rule, the messages in this mode may also be sent in other orders, as in the preceding modes.
  • this preferred embodiment provides a capability negotiation method that enables remote presentation endpoints to negotiate scene descriptions, codec capabilities, and multiplexed channels carrying multiple media streams.
  • the preferred embodiment provides a negotiation mode A-4, and the capability set carried in the negotiation mode capability set interaction message is a reception capability set and a transmission capability set.
  • the receiving capability set includes endpoint receiving capability related parameters
  • the sending capability set includes endpoint sending capability related parameters.
  • Step S1201 Capability set interaction: The capability set interaction is performed between two telepresence endpoints, and the message carries the telepresence endpoint receiving capability set and the sending capability set parameter.
  • Step S1202 Logical channel open: Open the logical channel between two telepresence endpoints, and specify the channel properties after negotiation.
  • Example 1: this example describes negotiation mode A-4-1. This negotiation mode carries the receiving capability set and the sending capability set together in the capability set interaction, and includes the following steps S1301 to S1308.
  • Step S1301 The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the reception capability set and the transmission capability set of the endpoint A.
  • Step S1302 The endpoint B replies to the endpoint A capability set interaction response message;
  • Step S1303: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the reception capability set and the transmission capability set of endpoint B.
  • Step S1304: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S1305: According to the capabilities of endpoint A, in conjunction with its own capabilities, endpoint B sends an Open Logical Channel Request message to endpoint A.
  • Step S1306: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S1307: According to the capabilities of endpoint B, in conjunction with its own capabilities, endpoint A sends an Open Logical Channel Request message to endpoint B, specifies the channel attributes, and requests to open the forward logical channel from A to B.
  • Step S1308: Endpoint B replies to endpoint A with an open logical channel response message.
  • Each request/response pair is ordered in time, and the first pair of capability interaction messages is sent before the first pair of logical channel open messages.
  • Subject to this rule, the messages described in FIG. 10 may also be sent in the order 1, 3, 2, 4, 5, 6, 7, 8, where endpoints A and B first complete the capability set interaction with each other and then open the logical channels on both sides, or in the order 1, 2, 5, 6, 3, 4, 7, 8, where the logical channel on one side is opened as soon as the capability interaction on that side is completed.
  • Multiple messages sent from one endpoint may also be carried together in a single transmission, for example 1, 2+3, 4, 5, 6+7, 8.
  • FIG. 14 is a flowchart 11 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention.
  • a negotiation mode A-4-2 is described.
  • In this mode, the receiving capability set carried by the receiving end in the capability set interaction request message is formed on the basis of the transmitting end's transmission capability set; the mode includes the following steps S1401 to S1412.
  • Step S1401: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of endpoint A.
  • Step S1402: Endpoint B replies to endpoint A with a capability set interaction response message.
  • Step S1403: According to the sending capability of endpoint A and its own receiving needs, endpoint B initiates a capability interaction request message to endpoint A, and the message carries the receiving capability set of endpoint B.
  • Step S1404: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S1405: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the transmission capability set of endpoint B.
  • Step S1406: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S1407: According to the sending capability of endpoint B and its own receiving needs, endpoint A initiates a capability interaction request message to endpoint B, and the message carries the receiving capability set of endpoint A.
  • Step S1408: Endpoint B replies to endpoint A with a capability set interaction response message.
  • Step S1409: According to the receiving capability set of endpoint B, endpoint A sends an Open Logical Channel Request message to endpoint B, specifies the channel attributes, and requests to open the forward logical channel from A to B.
  • Step S1410: Endpoint B replies to endpoint A with an open logical channel response message.
  • Step S1411: According to the receiving capability set of endpoint A, endpoint B sends an Open Logical Channel Request message to endpoint A, specifies the channel attributes, and requests to open the forward logical channel from B to A.
  • Step S1412: Endpoint A replies to endpoint B with an open logical channel response message.
  • Each request/response pair is ordered in time, and the first two pairs of sending and receiving capability exchange messages are sent before the first pair of logical channel open messages.
  • Subject to this rule, the messages described in FIG. 11 may also be sent in the order 1, 3, 5, 7, 2, 4, 6, 8, where endpoints A and B first complete the capability set interaction with each other before opening the logical channels, or in the order 1, 2, 3, 4, 9, 10, 5, 6, 7, 8, 11, 12, where the logical channel on one side is opened as soon as the capability interaction on that side is completed.
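  • In mode A-4-2 the receiving capability set announced in steps S1403 and S1407 is derived from the peer's previously announced transmission capability set combined with the endpoint's own receiving needs. A possible derivation is sketched below in Python; the intersection rule (clamping each offered capture to the local decoding limits) is an assumption used only for illustration.

```python
def derive_receive_capability(peer_send_captures, local_decode_limits):
    """Build a receiving capability set by intersecting the peer's offered
    captures with the local endpoint's decoding limits."""
    receive_set = []
    for capture in peer_send_captures:
        if capture["media"] not in local_decode_limits:
            continue  # the local endpoint cannot render this media type at all
        limit = local_decode_limits[capture["media"]]
        receive_set.append({
            "capture_id": capture["capture_id"],
            "media": capture["media"],
            # Announce at most what can actually be decoded locally.
            "max_bandwidth_bps": min(capture["max_bandwidth_bps"], limit),
        })
    return receive_set

peer_offer = [
    {"capture_id": "VR0", "media": "video", "max_bandwidth_bps": 4_000_000},
    {"capture_id": "AR0", "media": "audio", "max_bandwidth_bps": 128_000},
]
print(derive_receive_capability(peer_offer, {"video": 2_000_000, "audio": 128_000}))
```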
  • Step S1501, capability set interaction: the capability set interaction is performed between the two remote presentation endpoints, and the message carries the telepresence endpoint receiving capability set and sending capability set parameters.
  • Step S1502 Mode request: The receiving end requests a specific transmission mode according to the sending capability set of the transmitting end.
  • Step S1503 Logical channel open: Open the logical channel between the two telepresence endpoints, and specify the negotiated channel attributes. The following is a detailed description by way of example.
  • FIG. 16 is a flowchart 13 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 16, the method includes the following steps S1601 to S1612.
  • Step S1601 The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of the endpoint A.
  • Step S1602 The endpoint B replies to the endpoint A capability set interaction response message.
  • Step S1603 The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the transmission capability set of the endpoint B.
  • Step S1604 Endpoint A replies to the Endpoint B Capability Set Interactivity Response message.
  • Step S1605: The remote presentation endpoint B initiates a mode request message to the remote presentation endpoint A, requesting a specific transmission mode; the request carries the transmission parameters requested of endpoint A.
  • Step S1606: Endpoint A replies to endpoint B with a mode request response message.
  • Step S1607: The remote presentation endpoint A initiates a mode request message to the remote presentation endpoint B, requesting a specific transmission mode; the request carries the transmission parameters requested of endpoint B.
  • Step S1608 Endpoint B replies to the endpoint A mode request response message.
  • Step S1609 Endpoint A sends an Open Logical Channel Request message to Endpoint B according to the result of the mode request process. Specify the channel properties and request to open the forward logical channel from A to B.
  • Step S1610 Endpoint B replies to Endpoint A to open the logical channel response message;
  • Step S1611 Endpoint B sends an Open Logical Channel Request message to Endpoint A according to the result of the mode request process.
  • Step S1612 Endpoint A replies to Endpoint B with an open logical channel response message.
  • Each request/response pair is ordered in time: the first pair of capability interaction messages is sent before the first pair of mode request messages, and the first pair of mode request messages is sent before the first pair of logical channel open messages.
  • the order of message transmission described in FIG. 13 may also be 1, 2, 5, 6, 3, 4, 7, 8, 9, 10, 11, 12, or 1, 2, 5, 6, 9, 10, 3, 4, 7, 8, 11, 12.
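  • The mode request of steps S1605 and S1607 asks the peer for a specific transmission mode chosen from the peer's transmission capability set. A minimal sketch of building such a request is given below in Python; the message fields are illustrative and are not defined by this application.

```python
def build_mode_request(peer_send_capability, wanted_scene_ids):
    """Request that the peer transmit the listed scenes, echoing the
    transmission parameters taken from the peer's send capability set."""
    by_id = {scene["scene_id"]: scene for scene in peer_send_capability}
    requested = []
    for scene_id in wanted_scene_ids:
        if scene_id not in by_id:
            raise ValueError(f"scene {scene_id} was not offered by the peer")
        requested.append(by_id[scene_id])
    return {"message": "ModeRequest", "requested_transmissions": requested}

peer_send_capability = [
    {"scene_id": 1, "description": "left/center/right video", "codec": "H.264"},
    {"scene_id": 3, "description": "main audio", "codec": "G.711"},
]
print(build_mode_request(peer_send_capability, wanted_scene_ids=[1, 3]))
```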
  • FIG. 17 is a schematic diagram of a remote presentation endpoint capability set according to an embodiment of the present invention. As shown in FIG. 17, the capability set is divided into a transmission capability set and a reception capability set. A detailed description is given below.
  • the transmission capability set mainly includes capture related parameters, and these parameters include general parameters, video parameters, and audio parameters. Other parameters of the transmission capability set can be used to include coding standards in the related art, such as H.263, H.264, and the like.
  • the general parameters are used to describe scene related parameters and general coding related parameters.
  • the scene related parameters include a scene description, a scene switching policy, a scene area, and metric information.
  • the general encoding related parameters include a maximum bandwidth, a maximum number of macroblocks per second, and an encoding standard.
  • the video parameters are used to describe the attributes of the individual videos that make up the captured scene.
  • the video parameters mainly include video capture space information, video capture and encoding information, number of video captures, video content attributes, video switching strategies, video combination strategies and other parameters.
  • the video capture spatial information includes a capture area, a capture point, and a point on the capture line; the video capture encoded information includes a maximum video bandwidth captured by the video, a maximum number of macroblocks per second, a maximum video resolution width, a maximum video resolution height, Maximum video frame rate.
  • the audio parameters are used to describe the attributes of the individual audios that make up the captured scene.
  • the audio parameters mainly include parameters such as audio capture space information, audio capture coding information, and number of audio captures.
  • the audio capture space information includes a capture area, a capture point, and a point on the capture line; the audio capture encoded information includes an audio channel format of the audio capture, and a maximum audio bandwidth.
  • the receiving capability set corresponds to the sending capability set
  • the rendering related parameter corresponds to the capturing related parameter.
  • the receiving capability set mainly includes rendering related parameters
  • the rendering related parameters include general parameters, video parameters and audio parameters.
  • Other parameters of the receive capability set can be used to include decoding standards such as H.263, H.264, and the like.
  • the general parameters are used to describe scene related parameters and general decoding related parameters.
  • the scene related parameters include a scene description, a scene switching policy, a scene area, and metric information.
  • the general decoding related parameters include a maximum bandwidth, a maximum number of macroblocks per second, and a decoding standard.
  • the video parameters are used to describe the attributes of the individual videos that make up the rendered scene.
  • Video parameters mainly include video rendering space information, video rendering and decoding information, number of video renderings, content attributes, automatic switching strategies, combined strategies and other parameters.
  • the video rendering space information includes a rendering area, a rendering point, and a point on the rendering line;
  • the video rendering decoding information includes a maximum video bandwidth of the video rendering, a maximum number of macroblocks per second, a maximum video resolution width, and a maximum video resolution height. Maximum video frame rate.
  • the audio parameters are used to describe the properties of the individual audios that make up the rendered scene.
  • the audio parameters mainly include parameters such as audio rendering space information, audio rendering decoding information, and number of audio renderings.
  • the audio rendering space information includes a rendering area, a rendering point, and a point on the rendering line; the audio rendering decoding information includes an audio channel format of the audio rendering, and a maximum audio bandwidth.
  • the logical channel attribute specified by the remote presentation terminal mainly includes media related information, codec information, and the like used by the channel, and if the logical channel needs to be multiplexed, channel multiplexing information needs to be specified.
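  • A logical channel open request therefore carries media-related information, codec information and, when several streams share one channel, multiplexing information. The structure below is a minimal Python sketch of such a request; the class and field names are illustrative assumptions, not definitions from this application.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class LogicalChannelAttributes:
    media_info: List[str]            # capture/rendering identifiers carried, e.g. ["VR0", "VR1", "VR2"]
    codec: str                       # e.g. "H.264" or "G.711"
    max_bandwidth_bps: int
    # Multiplexing info: mapping from capture/rendering identifier to the value
    # used on the wire (e.g. an RTP header extension identifier) so that several
    # RTP streams can be told apart inside one logical channel.
    multiplexing: Dict[str, int] = field(default_factory=dict)

@dataclass
class OpenLogicalChannelRequest:
    direction: str                   # e.g. "B->A" for the forward channel from B to A
    channels: List[LogicalChannelAttributes] = field(default_factory=list)

request = OpenLogicalChannelRequest(
    direction="B->A",
    channels=[
        LogicalChannelAttributes(["VR0", "VR1", "VR2"], "H.264", 12_000_000,
                                 multiplexing={"VR0": 0, "VR1": 1, "VR2": 2}),
        LogicalChannelAttributes(["AR0"], "G.711", 128_000),
    ],
)
print(request)
```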
  • Preferred Embodiment 7: this preferred embodiment is one of the preferred implementations of negotiation mode A-1, and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B, where endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker, and endpoint B has 3 cameras, 3 displays, 1 microphone, and 1 speaker.
  • Endpoint A or / and Endpoint B can also be MCU devices.
  • the method includes the following steps S1702 to S1716.
  • Step S1702: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the reception capability set of endpoint A.
  • the video parameters in the rendering related parameters included in the receiving capability set are: the video rendering spatial information with the rendering identifier VR0 is represented as left, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the video rendering space information of VR1 is represented as medium, the video content attribute is the main video, and the maximum video bandwidth in the video rendering and decoding information is 4M;
  • the video rendering space information with the rendering identifier VR2 is represented as the right, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the video rendering spatial information with the rendering identifier VR3 is represented as left, the video content attribute is the largest-speaker video, the maximum video bandwidth in the video rendering decoding information is 4M, and the automatic switching attribute is YES;
  • the video rendering spatial information with the rendering identifier VR4 is represented as center, the video content attribute is the panoramic video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the video rendering spatial information with the rendering identifier VR5 is represented as right, the video content attribute is the VIP video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the audio parameters in the rendering related parameters included in the receiving capability set are: the audio content with the rendering identifier AR0 is the main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo.
  • the common parameters in the rendering related parameters included in the receiving capability set are:
  • the scene with scene identifier 1 is composed of VR0, VR1, and VR2; the scene is described as rendering the left, center, and right video;
  • the codec standard is H.264, and the maximum bandwidth is 12M.
  • the scene with scene identifier 2 is composed of VR3, VR4, and VR5;
  • the scene is described as rendering the largest-speaker, panoramic, and VIP video, and the codec standard is H.264;
  • the maximum bandwidth is 12M.
  • the scene with the scene ID of 3 is composed of AR0.
  • the scene is described as rendering the main audio.
  • the codec standard is G711 and the maximum bandwidth is 128K.
  • Step S1704: Endpoint B replies to endpoint A with a capability set interaction response message.
  • Step S1706: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the reception capability set of endpoint B.
  • the video rendering spatial information with the rendering identifier VR0 is represented as left, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the video rendering spatial information with the rendering identifier VR1 is represented as center, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the video rendering spatial information with the rendering identifier VR2 is represented as right, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M;
  • the audio content with the rendering identifier AR0 is the main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo.
  • the common parameters in the rendering related parameters included in the receiving capability set are:
  • the scene with the scene identifier 1 consists of VR0, VR1, and VR2, and the scene is described as rendering left, center, and right video.
  • the decoding standard is H.264, and the maximum bandwidth is 12M.
  • the scene with the scene ID of 2 is composed of AR0. The scene is described as rendering the main audio.
  • Step S1708: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S1710: According to the receiving capability of endpoint A, combined with its own sending capability, endpoint B determines to send scene 1 and scene 3 of endpoint A's receiving capability set. Endpoint B sends an Open Logical Channel Request message to endpoint A, specifies the channel attributes, and requests to open the forward logical channel from B to A.
  • the specified logical channel attributes are: One video logical channel transmits three RTP video streams, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is that the VR0, VR1, and VR2 video is transmitted; the encoding related information is that H.264 encoding is used and the maximum bandwidth is 12M; the multiplexing information is that VR0, VR1, and VR2 correspond to the stream identifiers 0, 1, and 2 respectively, which are used to distinguish different RTP streams in the same logical channel.
  • the audio channel media related information is sent AR0 audio, the encoding related information is G711, and the maximum bandwidth is 128K.
  • Step S1712: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S1714: Endpoint A sends an Open Logical Channel Request message to endpoint B, specifies the channel attributes, and requests to open the forward logical channel from A to B.
  • the specified logical channel attributes are: One video logical channel transmits three RTP video streams, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is that the VR0, VR1, and VR2 video is transmitted; the encoding related information is that H.264 encoding is used and the maximum bandwidth is 12M; the multiplexing information is that VR0, VR1, and VR2 correspond to the stream identifiers 0, 1, and 2 respectively, which are used to distinguish different RTP streams in the same logical channel.
  • the audio channel media related information is sent AR0 audio, the encoding related information is G711, and the maximum bandwidth is 128K.
  • Step S1716: Endpoint B replies to endpoint A with an open logical channel response message.
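  • Using the illustrative structures sketched earlier, the exchange of Preferred Embodiment 7 can be written out concretely: endpoint A announces three video renderings (VR0–VR2) and one audio rendering (AR0) grouped into scenes 1 and 3, and endpoint B answers by opening one multiplexed video channel and one audio channel. The values below are copied from the embodiment; the data layout itself is only an illustrative Python sketch, not a message format defined by this application.

```python
# Endpoint A's reception capability set (values from Preferred Embodiment 7).
endpoint_a_receive = {
    "video_renderings": {
        "VR0": {"position": "left",   "content": "main", "max_bw_bps": 4_000_000},
        "VR1": {"position": "center", "content": "main", "max_bw_bps": 4_000_000},
        "VR2": {"position": "right",  "content": "main", "max_bw_bps": 4_000_000},
    },
    "audio_renderings": {
        "AR0": {"content": "main", "channel_format": "stereo", "max_bw_bps": 128_000},
    },
    "scenes": {
        1: {"members": ["VR0", "VR1", "VR2"], "codec": "H.264", "max_bw_bps": 12_000_000},
        3: {"members": ["AR0"],               "codec": "G711",  "max_bw_bps": 128_000},
    },
}

# Endpoint B's open-logical-channel request derived from that capability set:
# one video channel carrying three multiplexed RTP streams, one audio channel.
open_logical_channel_request = {
    "direction": "B->A",
    "video_channel": {
        "media": ["VR0", "VR1", "VR2"],
        "codec": "H.264",
        "max_bw_bps": 12_000_000,
        "rtp_stream_ids": {"VR0": 0, "VR1": 1, "VR2": 2},  # distinguishes streams in one channel
    },
    "audio_channel": {"media": ["AR0"], "codec": "G711", "max_bw_bps": 128_000},
}

for name, channel in (("video", open_logical_channel_request["video_channel"]),
                      ("audio", open_logical_channel_request["audio_channel"])):
    print(name, channel["media"], channel["codec"], channel["max_bw_bps"])
```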
  • Preferred Embodiment 8: this preferred embodiment is one of the preferred implementations of negotiation mode A-1, and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B.
  • Endpoint A and Endpoint B are telepresence video conferencing endpoints.
  • Endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker.
  • Endpoint B has 1 camera, 1 display, 1 microphone, and 1 speaker.
  • Endpoint A or / and Endpoint B can also be MCU devices.
  • the method includes the following steps S1802 to S1816.
  • Step S1802 The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the reception capability set of the endpoint A.
  • the video parameters in the rendering related parameters included in the receiving capability set are: the video rendering spatial information with the rendering identifier VR0 is represented as the left, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M.
  • the video rendering space information with the rendering identifier VR1 is represented as medium, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M.
  • the video rendering space information with the rendering identifier VR2 is represented as the right, the video content attribute is the main video, and the maximum video bandwidth in the video rendering decoding information is 4M.
  • the video rendering space information with the rendering identifier VR3 is represented as medium, the video content attribute is panoramic video, and the maximum video bandwidth in the video rendering decoding information is 4M.
  • the audio parameters in the rendering related parameters included in the receiving capability set are: rendering the audio content identified as AR0 as the main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo.
  • the common parameters in the rendering-related parameters included in the receiving capability set are:
  • the scene with the scene identifier of 1 is composed of VR0, VR1, and VR2, and the scene is described as rendering left, center, and right video.
  • the codec standard is H.264, and the maximum bandwidth is 12M.
  • the scene with the scene ID of 2 is composed of VR3.
  • the scene is described as rendering panoramic video.
  • the codec standard is H.264, and the maximum bandwidth is 4M.
  • the scene with the scene ID of 3 is composed of AR0.
  • the scene is described as rendering the main audio.
  • the codec standard is G711, and the maximum bandwidth is 128K.
  • Step S1806: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the reception capability set of endpoint B.
  • the video rendering with the rendering identifier VR0 is described as rendering the panoramic video, the codec standard is H.264, and the maximum bandwidth is 4M;
  • the audio content with the rendering identifier AR0 is the main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo.
  • the common parameters in the rendering-related parameters included in the receiving capability set are: The scene with the scene identifier of 1 is composed of VR0, and the scene is described as rendering panoramic video.
  • the codec standard is H.264, and the maximum bandwidth is 4M.
  • the scene with the scene ID of 2 is composed of AR0. The scene is described as rendering the main audio.
  • the codec standard is G711, and the maximum bandwidth is 128K.
  • Step S1808: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S1810: According to the receiving capability of endpoint A, combined with its own sending capability, endpoint B determines to send scene 2 and scene 3 of endpoint A's receiving capability set. Endpoint B sends an Open Logical Channel Request message to endpoint A, specifies the channel attributes, and requests to open the forward logical channel from B to A.
  • the specified logical channel attributes are: One video logical channel transmits one RTP video stream, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is sent VR3 video; the encoding related information is encoded by H.264, and the maximum bandwidth is 4M; no multiplexing information.
  • the audio channel media related information is AR0 audio, the encoding related information is G711, and the maximum bandwidth is 128K.
  • Step S1812: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S1814 Based on the receiving capability of the endpoint B, and in conjunction with its own sending capability, the endpoint A determines to send the scenario 1 and scenario 2 of the endpoint B receiving capability set. Endpoint A sends an Open Logical Channel Request message to Endpoint B, specifies the channel properties, and requests to open the forward logical channel from A to B.
  • the specified logical channel attributes are: One video logical channel transmits one RTP video stream, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is the VR0 video transmission; the encoding related information is encoded by H.264, and the maximum bandwidth is 12M; no multiplexing information.
  • the audio channel media related information is AR0 audio
  • the encoding related information is G711
  • the maximum bandwidth is 128K.
  • Step S1816: Endpoint B replies to endpoint A with an open logical channel response message.
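  • Preferred Embodiment 8 illustrates matching between asymmetric endpoints: each sender picks, from the peer's announced reception scenes, the ones it can actually serve with its own capture capability. A possible matching routine is sketched below in Python; the matching rule (content, codec, and bandwidth compatibility) is an assumption used only for illustration, not a rule stated in this application.

```python
def choose_scenes_to_send(peer_receive_scenes, own_send_scenes):
    """Return the identifiers of the peer's reception scenes that the local
    endpoint can serve, i.e. for which it offers a compatible capture scene."""
    chosen = []
    for rx in peer_receive_scenes:
        for tx in own_send_scenes:
            if (rx["codec"] == tx["codec"]
                    and rx["content"] == tx["content"]
                    and tx["max_bw_bps"] <= rx["max_bw_bps"]):
                chosen.append(rx["scene_id"])
                break
    return chosen

# Endpoint B (one camera, one microphone) matched against endpoint A's reception scenes.
a_receive = [
    {"scene_id": 1, "content": "left/center/right", "codec": "H.264", "max_bw_bps": 12_000_000},
    {"scene_id": 2, "content": "panorama",          "codec": "H.264", "max_bw_bps": 4_000_000},
    {"scene_id": 3, "content": "main audio",        "codec": "G711",  "max_bw_bps": 128_000},
]
b_send = [
    {"content": "panorama",   "codec": "H.264", "max_bw_bps": 4_000_000},
    {"content": "main audio", "codec": "G711",  "max_bw_bps": 128_000},
]
print(choose_scenes_to_send(a_receive, b_send))  # -> [2, 3], as in steps S1810/S1812
```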
  • Preferred Embodiment 9 This preferred embodiment is one of the preferred embodiments of the negotiation mode A-2, and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B.
  • Endpoint A and Endpoint B are telepresence video conferencing endpoints.
  • Endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker.
  • Endpoint B has 3 cameras, 3 displays, 1 microphone, and 1 speaker.
  • Endpoint A or / and Endpoint B can also be MCU devices.
  • the preferred embodiment includes the following steps S1902 to S1916.
  • Step S1902: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the symmetric capability set of endpoint A.
  • the video parameters in the rendering/capture related parameters included in the symmetric capability set are: the video rendering/capture spatial information with the rendering/capture identifier VR0 is represented as left, the video content attribute is the main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M;
  • the video rendering/capture spatial information with the rendering/capture identifier VR1 is represented as center, the video content attribute is the main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M;
  • the video rendering/capture spatial information with the rendering/capture identifier VR2 is represented as right, the video content attribute is the main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M;
  • the audio parameters in the rendering/capture related parameters included in the symmetric capability set are: the audio content with the rendering/capture identifier AR0 is the main audio, and the maximum bandwidth in the audio rendering/capture codec information is 128K;
  • the common parameters in the rendering/capture related parameters included in the symmetric capability set are: the scene with scene identifier 1 is composed of VR0, VR1, and VR2; the scene is described as rendering left, center, and right video;
  • the codec standard is H.264, and the maximum bandwidth is 12M.
  • the scene with the scene ID of 2 is composed of VR3.
  • the scene is described as rendering panoramic video.
  • the codec standard is H.264, and the maximum bandwidth is 4M.
  • the scene with scene ID 3 is composed of AR0.
  • the scene is described as rendering the main audio, the codec standard is G711, and the maximum bandwidth is 128K.
  • Step S1904: Endpoint B replies to endpoint A with a capability set interaction response message.
  • Step S1906: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the symmetric capability set of endpoint B.
  • the video rendering/capture spatial information with the rendering/capture identifier VR0 is represented as left, the video content attribute is the main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M;
  • the video rendering/capture spatial information with the rendering/capture identifier VR1 is represented as center, the video content attribute is the main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M;
  • the video rendering/capture spatial information with the rendering/capture identifier VR2 is represented as right, the video content attribute is the main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M;
  • the audio content with the rendering/capture identifier AR0 is the main audio, the maximum bandwidth in the audio rendering/capture codec information is 128K, and the audio channel format is stereo;
  • the common parameters in the rendering/capture related parameters included in the symmetric capability set are:
  • the scene with the scene ID is 1 consists of VR0, VR1, and VR2.
  • the scene is described as rendering/capturing the left, middle, and right videos.
  • the codec standard is H.264, and the maximum bandwidth is 12M.
  • the scene with the scene identifier of 2 is composed of AR0.
  • the scene is described as rendering/capturing the main audio, the codec standard is G711, and the maximum bandwidth is 128K.
  • Step S1908: Endpoint A replies to endpoint B with a capability set interaction response message.
  • Step S1910: According to the symmetric capability of endpoint A, combined with its own capability, endpoint B sends an Open Logical Channel Request message to endpoint A, specifies the channel attributes, and requests to open the forward logical channel from B to A.
  • the specified logical channel attributes are: One video logical channel transmits three RTP video streams, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is that the VR0, VR1, and VR2 video is transmitted; the encoding related information is that H.264 encoding is used and the maximum bandwidth is 12M;
  • the multiplexing information is that the values of the RTP header extension identifiers corresponding to VR0, VR1, and VR2 are 0, 1, and 2 respectively, which are used to distinguish different RTP streams in the same logical channel.
  • the audio channel media related information is sent AR0 audio, the encoding related information is G711, and the maximum bandwidth is 128K.
  • Step S1912: Endpoint A replies to endpoint B with an open logical channel response message.
  • Step S1914: According to the symmetric capability of endpoint B and its own sending capability, endpoint A determines to send scene 1 and scene 2 of endpoint B's symmetric capability set. Endpoint A sends an Open Logical Channel Request message to endpoint B, specifies the channel attributes, and requests to open the forward logical channel from A to B.
  • the specified logical channel attributes are: One video logical channel transmits three RTP video streams, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is that the VR0, VR1, and VR2 video is transmitted; the encoding related information is that H.264 encoding is used and the maximum bandwidth is 12M; the multiplexing information is that VR0, VR1, and VR2 correspond to the RTP header extension identifier values 0, 1, and 2 respectively, which are used to distinguish different RTP streams in the same logical channel.
  • the audio channel media related information is sent AR0 audio, the encoding related information is G711, and the maximum bandwidth is 128K.
  • Step S1916: Endpoint B replies to endpoint A with an open logical channel response message.
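  • In Preferred Embodiment 9 three RTP video streams share one logical channel and are told apart by an RTP header extension identifier (0, 1, 2 for VR0, VR1, VR2). The Python sketch below shows how a receiver could demultiplex packets on that basis; the packet representation is a simplified assumption for illustration, not an implementation of any particular RTP stack.

```python
from collections import defaultdict

# Negotiated multiplexing info from the open logical channel request.
EXTENSION_ID_TO_CAPTURE = {0: "VR0", 1: "VR1", 2: "VR2"}

def demultiplex(packets):
    """Group packets of one logical channel by the capture/rendering they
    belong to, using the negotiated RTP header extension identifier."""
    streams = defaultdict(list)
    for packet in packets:
        capture = EXTENSION_ID_TO_CAPTURE.get(packet["ext_id"])
        if capture is None:
            continue  # unknown stream identifier: not part of the negotiation
        streams[capture].append(packet["payload"])
    return dict(streams)

packets = [
    {"ext_id": 0, "payload": b"left frame"},
    {"ext_id": 1, "payload": b"center frame"},
    {"ext_id": 2, "payload": b"right frame"},
    {"ext_id": 0, "payload": b"left frame 2"},
]
print({k: len(v) for k, v in demultiplex(packets).items()})
# -> {'VR0': 2, 'VR1': 1, 'VR2': 1}
```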
  • the preferred embodiment is one of the preferred embodiments of the negotiation mode A-3-1, and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B.
  • Endpoint A and Endpoint B are telepresence video conferencing endpoints.
  • Endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker.
  • Endpoint B has 1 camera, 1 display, 1 microphone, and 1 speaker.
  • Endpoint A or / and Endpoint B can also be MCU devices.
  • the preferred embodiment includes the following steps S2002 to S2016.
  • Step S2002: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of endpoint A.
  • the video parameters in the capture related parameters included in the transmission capability set are: the video capture spatial information with the capture identifier VR0 is represented as left, the video content attribute is the main video, and the maximum video bandwidth in the video capture encoded information is 4M;
  • the video capture spatial information with the capture identifier VR1 is represented as center, the video content attribute is the main video, and the maximum video bandwidth in the video capture encoded information is 4M;
  • the video capture spatial information with the capture identifier VR2 is represented as right, the video content attribute is the main video, and the maximum video bandwidth in the video capture encoded information is 4M;
  • the video content attribute of the capture with the identifier VR3 is the panoramic video, and the maximum video bandwidth in the video capture encoded information is 4M;
  • the audio parameters in the capture related parameters included in the transmission capability set are: the audio content with the capture identifier AR0 is the main audio, the maximum bandwidth in the audio capture encoded information is 128K, and the audio channel format is stereo;
  • the common parameters in the capture related parameters included in the transmission capability set are:
  • the scene with the scene ID is 1 is composed of VR0, VR1, and VR2.
  • the scene is described as capturing left, center, and right video.
  • the encoding standard is H.264, and the maximum bandwidth is 12M.
  • the scene with scene identifier 2 is composed of VR3.
  • the scene is described as capturing panoramic video.
  • the encoding standard is H.264, and the maximum bandwidth is 4M.
  • the scene with scene ID 3 is composed of AR0.
  • the scene is described as capturing main audio.
  • the encoding standard is G711, and the maximum bandwidth is 128K.
  • Step S2004: Endpoint B replies to endpoint A with a capability set interaction response message carrying the parameters that B selected from A's transmission capability set: the media in scene 2 and scene 3.
  • Step S2006: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the transmission capability set of endpoint B.
  • the video parameters in the acquisition related parameters included in the transmission capability set are:
  • the scene with the capture identifier VR0 is described as capturing the panoramic video, the coding standard is H.264, and the maximum bandwidth is 4M; captures the audio content identified as AR0 as the main audio, the maximum bandwidth in the audio capture encoding information is 128K, and the audio channel format is stereo;
  • the common parameters in the capture related parameters included in the transmission capability set are: the scene with scene identifier 1 is composed of VR0.
  • Step S2008: Endpoint A replies to endpoint B with a capability set interaction response message carrying the parameters that A selected from B's transmission capability set: the media in scene 1 and scene 2. Step S2010: According to the capabilities selected by endpoint B in the capability set interaction response, endpoint A sends an Open Logical Channel Request message to endpoint B, specifies the channel attributes, and requests to open the forward logical channel from A to B.
  • the specified channel attributes are: One video logical channel transmits one RTP video stream, and one audio logical channel transmits one RTP audio stream.
  • the video channel media related information is sent VR3 video; the encoding related information is encoded by H.264, and the maximum bandwidth is 4M; no multiplexing information.
  • the audio channel media related information is AR0 audio, the encoding related information is G711, and the maximum bandwidth is 128K.
  • Step S2012: Endpoint B replies to endpoint A with an open logical channel response message.
  • Step S2014: According to the capabilities selected by endpoint A in the capability set interaction response, endpoint B sends an Open Logical Channel Request message to endpoint A.
  • Step S2016: Endpoint A replies to endpoint B with an open logical channel response message. It should be noted that the embodiments corresponding to the other methods are similar to the embodiments described above.
  • The modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in an order different from the order described herein; or they may be separately fabricated into individual integrated circuit modules, or a plurality of the modules or steps may be fabricated into a single integrated circuit module.
  • Thus, the present invention is not limited to any specific combination of hardware and software.
  • The above is only the preferred embodiments of the present invention and is not intended to limit the present invention; various modifications and changes can be made to the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

Disclosed are a method and apparatus for remote display endpoint capability exchange, and a data flow. The method comprises: a capability exchange is carried out between a first remote display endpoint and a second remote display endpoint, with the capability sets of the remote display endpoints, and in particular their send capability sets, carried in the capability exchange messages; the first remote display endpoint receives a mode request message from the second remote display endpoint; and, on the basis of the capability exchange results and the received mode request information, the first remote display endpoint opens a logical channel between the first remote display endpoint and the second remote display endpoint. The present invention allows capability exchange between remote display terminals, thereby improving the user experience.

Preferably, performing the capability interaction between the first remote presentation endpoint and the second remote presentation endpoint comprises: sending, by the first remote presentation endpoint, the first capability set interaction request to the second remote presentation endpoint, where the first The first set of transmission capabilities of the first remote presentation endpoint is carried in the capability set interaction request;  The first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries the second transmission capability set of the second remote presentation endpoint; The first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, where the first mode request message carries the sending parameter of the first remote presentation endpoint; Presenting, by the endpoint, a second mode request message sent by the endpoint to the second remote presentation endpoint, where the second mode request message carries a sending parameter of the second remote presentation endpoint; As a result of the mode request process corresponding to the second mode request message, sending a first logical channel request to the second remote presentation endpoint, where the first logical channel request is used to request the first remote rendering endpoint to open a forward logical channel between the first telepresence endpoint and the second telepresence endpoint; the first telepresence Receiving, by the point, the second logical channel request sent by the second remote presentation endpoint, where the second logical channel request is determined by the second remote presentation endpoint according to a result of a mode request process corresponding to the first mode request The second logical channel request is for requesting the second remote rendering endpoint to open a forward logical channel between the second remote rendering endpoint and the first remote rendering endpoint. 
Preferably, performing the capability interaction between the first remote presentation endpoint and the second remote presentation endpoint comprises: sending, by the first remote presentation endpoint, a third capability set interaction request to the second remote presentation endpoint, wherein the third The capability set interaction request carries a first transmission capability set of the first remote presentation endpoint; the first remote presentation endpoint receives a three-mode request message sent by the second remote presentation endpoint, where the third mode The request message carries the sending parameter of the first remote rendering endpoint; the first remote rendering endpoint receives the fourth capability set interaction request sent by the second remote rendering endpoint, where the fourth capability set interaction request is carried a second transmission capability set of the second remote presentation endpoint; a fourth mode request message sent by the first remote presentation endpoint to the second remote presentation endpoint, where the fourth mode request message carries Transmitting parameters of the second remote presentation endpoint; the first remote presentation endpoint according to the mode corresponding to the fourth mode request message As a result of the request process, sending a third logical channel request to the second remote presentation endpoint, wherein the third logical channel request is for requesting the first remote presentation endpoint to open the first remote presentation endpoint to a forward logical channel between the second telepresence endpoints;  Receiving, by the first remote presentation endpoint, a fourth logical channel request sent by the second remote presentation endpoint, where the fourth logical channel request is a mode corresponding to the first remote mode endpoint according to the first mode request As a result of the requesting process, the fourth logical channel request is for requesting the second telepresence endpoint to open a forward logical channel between the second telepresence endpoint and the first telepresence endpoint. Preferably, after the first remote presentation endpoint sends the first capability set interaction request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving, by the second remote presentation endpoint, the corresponding The first capability set interaction request corresponding response message; after the first remote presentation endpoint sends the third capability set interaction request to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving And a response message corresponding to the third capability set interaction request sent by the second remote presentation endpoint. 
Preferably, after the first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, the method further includes: sending, by the first remote presentation endpoint, the second remote presentation endpoint to the second remote presentation endpoint The second capability set interaction request corresponding to the response message; after the first remote presentation endpoint receives the fourth capability set interaction request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint to the second The telepresence endpoint sends a response message corresponding to the fourth capability set interaction request. Preferably, after the first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending the corresponding location to the second remote presentation endpoint a response message corresponding to the first mode request message; after the first remote presentation endpoint receives the third mode request message sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint to the The remote presentation endpoint sends a response message corresponding to the third mode request message. After the second remote presentation endpoint sends the second mode request message to the second remote presentation endpoint, the method further includes: sending, by the first remote presentation endpoint, the second remote presentation endpoint to the second remote presentation endpoint a response message of the second mode request message; after the fourth mode request message sent by the first remote presentation endpoint to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint to the second The telepresence endpoint sends a response message corresponding to the fourth mode request message.  Preferably, after the first remote presentation endpoint sends the first logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the second mode request message, the method further includes: The telepresence endpoint receives a response message corresponding to the first logical channel request sent by the second remote presentation endpoint; and the result of the mode request process corresponding to the second remote presentation endpoint according to the second mode request message After the third logical channel request is sent to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint receiving, by the second remote presentation endpoint, a response message corresponding to the third logical channel request . 
Preferably, after the first remote presentation endpoint receives the second logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending the first to the second remote presentation endpoint After the first remote presentation endpoint receives the fourth logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint to the second remote The presentation endpoint sends a response message corresponding to the fourth logical channel request. Preferably, the remote presentation endpoint transmission capability set includes a capture parameter, wherein the capture parameter includes a universal parameter, a video parameter, and/or an audio parameter. Preferably, the universal parameters include media capture content, scene description, scene switching policy, general space information, and/or general encoding information; the media capturing content indicates a purpose of media capturing; and the scene description is used to provide an overall scenario. Preferably, the scenario switching policy is used to indicate a supported media switching policy, where the supported media switching policy includes a location switching policy and a partial switching policy, where the location switching policy is used to indicate simultaneous handover All captures are taken to ensure that the captures come together from the same endpoint location, which is used to indicate that different captures can be switched at different times and from the same and/or different remote presentation endpoints. Preferably, the general space information includes a scene area and/or an area scale parameter, where the scene area parameter is used to indicate a range of an overall scene related to the endpoint, and the area scale indicates a scale used by the spatial information parameter. kind. Preferably, the universal coding information includes all the maximum bandwidth, all the maximum number of pixels per second, and/or all the maximum number of macroblocks per second, wherein the total maximum bandwidth is used to indicate the preset type issued by the terminal. The maximum number of bitrates per second of all codestreams; the total number of pixels per second is used to represent all of the coding groups The maximum number of pixels per second independently coded; the total number of macroblocks per second represents the maximum number of macroblocks per second for all video streams sent by the endpoint. Preferably, the video parameters include: a video capture quantity, video capture space information, and/or video capture coding information; the video capture quantity is used to indicate the number of video captures. Advantageously, said video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein said capture area is for indicating a spatial location of the video capture in the overall captured scene; A capture point, used to indicate the location of the video capture in the captured scene; a point on the capture line that describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point. 
Preferably, the video capture coding information includes a maximum video bandwidth, a maximum number of pixels per second, a width of a maximum video resolution, a height of a maximum video resolution, and/or a maximum video frame rate; wherein, the maximum video bandwidth, The maximum number of bits per second for indicating a single video encoding; the maximum number of pixels per second, the parameter is used to represent the maximum number of pixels per second for a single video encoding; the width of the maximum video resolution, the parameter is used to represent The width of the maximum video resolution in pixels; the height of the maximum video resolution, which is used to represent the height of the maximum video resolution in pixels; the maximum video frame rate, which indicates the maximum Video frame rate. Preferably, the audio parameters include an audio capture amount, audio capture space information, and/or audio capture coding information; the audio capture number is used to indicate the number of audio captures. Preferably, the audio capture space information includes: a capture area, and/or a capture point, where the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; the capture point is used to indicate Captures the location of the audio capture in the scene. Preferably, the audio capture coding information includes: an audio channel format and/or a maximum audio bandwidth; the audio channel format is used to indicate an attribute of an audio channel; and the maximum audio bandwidth is used to indicate a single audio code per second. The maximum number of bits. According to another aspect of the present invention, a capability interaction device for remotely presenting an endpoint is provided, which is applied to a first remote presentation endpoint, and includes: an interaction module configured to perform capability interaction with a second remote presentation endpoint, where The capability interaction message carries a remote presentation endpoint capability set, where the capability interaction message carries the remote presentation endpoint transmission capability set; and the first receiving module is configured to receive the second remote presentation endpoint mode. a request message, the processing module, configured to open a logical channel between the first telepresence endpoint and the second telepresence endpoint according to the result of the capability interaction and the received mode request information.  
Preferably, the interaction module includes: a first sending module, configured to send a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries the first remote Presenting a first set of transmission capabilities of the endpoint; the second receiving module is configured to receive the second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries the second remote presentation a second sending capability set of the endpoint; the third receiving module is configured to receive the first mode request message sent by the second remote rendering endpoint, where the first mode request message carries the first remote rendering endpoint The second sending module is configured to send the second mode request message to the second remote presentation endpoint, where the second mode request message carries the sending parameter of the second remote rendering endpoint; a third sending module, configured to send, according to a result of the mode request process corresponding to the second mode request message, to the second remote The presentation endpoint sends a first logical channel request, wherein the first logical channel request is used to request the first remote presentation endpoint to open forward logic between the first remote presentation endpoint and the second remote presentation endpoint a fourth receiving module, configured to receive a second logical channel request sent by the second remote rendering endpoint, where the second logical channel request is that the second remote rendering endpoint responds according to the first mode request The result of the mode request process is determined, the second logical channel requesting to request the second telepresence endpoint to open a forward logical channel between the second telepresence endpoint and the first telepresence endpoint. 
Preferably, the interaction module includes: a fourth sending module, configured to send a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first remote Presenting a first set of transmission capabilities of the endpoint; the fifth receiving module is configured to receive the third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the first remote presentation a sending parameter of the endpoint; a sixth receiving module, configured to receive a fourth capability set interaction request sent by the second remote presentation endpoint, where the fourth capability set interaction request carries the second remote presentation endpoint a sending capability set; a fifth sending module, configured to send the fourth mode request message to the second remote rendering endpoint, where the fourth mode request message carries the sending parameter of the second remote rendering endpoint;  a sixth sending module, configured to send a third logical channel request to the second remote rendering endpoint according to a result of the mode request process corresponding to the fourth mode request message, where the third logical channel request is used for the request The first remote presentation endpoint opens a forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint; the seventh receiving module is configured to receive the fourth sent by the second remote presentation endpoint a logical channel request, where the fourth logical channel request is determined by the second remote presentation endpoint according to a result of a mode request process corresponding to the first mode request, where the fourth logical channel request is used to request the A second telepresence endpoint opens a forward logical channel between the second telepresence endpoint and the first telepresence endpoint. Preferably, the method further includes: an eighth receiving module, configured to receive, after the first sending module sends the first capability set interaction request to the second remote presentation endpoint, the second remote rendering endpoint sends the corresponding a response message corresponding to the capability set interaction request; the ninth receiving module is configured to: after the fourth sending module sends the third capability set interaction request to the second remote presentation endpoint, receive the correspondence sent by the second remote rendering endpoint Corresponding to the response message corresponding to the third capability set. Preferably, the method further includes: a seventh sending module, configured to send, after the second receiving module receives the second capability set interaction request sent by the second remote rendering endpoint, to the second remote rendering endpoint, corresponding to the first And the eighth sending module is configured to: after the fourth receiving module receives the fourth capability set interaction request sent by the second remote rendering endpoint, the method further includes: the first remote rendering endpoint Sending a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint. 
Preferably, the method further includes: a ninth sending module, configured to send, after the third receiving module receives the first mode request message sent by the second remote rendering endpoint, to the second remote rendering endpoint, corresponding to the a response message corresponding to the first mode request message;  a tenth sending module, configured to send, after the fifth receiving module receives the third mode request message sent by the second remote rendering endpoint, to the second remote rendering endpoint, corresponding to the third mode request message Response message. Preferably, the method further includes: an eleventh sending module, configured to send, after the second sending module sends the second mode request message to the second remote rendering endpoint, to the second remote rendering endpoint, corresponding to the a response message of the second mode request message; the twelfth sending module is configured to: after the fourth mode request message sent by the fifth sending module to the second remote presentation endpoint, to the second remote presentation endpoint A response message corresponding to the fourth mode request message is transmitted. Preferably, the method further includes: a tenth receiving module, configured to send, by the third sending module, a first logical channel request to the second remote rendering endpoint according to a result of a mode request process corresponding to the second mode request message After receiving the response message corresponding to the first logical channel request sent by the second remote presentation endpoint, the eleventh receiving module is configured to correspond to the second mode request message according to the second sending module. As a result of the mode request process, after sending the third logical channel request to the second remote presentation endpoint, receiving a response message corresponding to the third logical channel request sent by the second remote presentation endpoint. Preferably, the method further includes: a thirteenth sending module, configured to: after the first remote rendering endpoint of the fourth receiving module receives the second logical channel request sent by the second remote rendering endpoint, to the second The telepresence endpoint sends a response message corresponding to the second logical channel request; the fourteenth sending module is configured to: after the seventh receiving module receives the fourth logical channel request sent by the second remote rendering endpoint, The second telepresence endpoint sends a response message corresponding to the fourth logical channel request. According to another aspect of the present invention, a capability interaction apparatus for remotely presenting an endpoint is provided, comprising: according to another aspect of the present invention, a data stream is provided, comprising: a remote presentation endpoint capability set, wherein the remote The presentation endpoint capability set includes: a transmission capability set, where the transmission capability set includes: a capture parameter, where the capture parameter includes: a general parameter, a video parameter, and/or an audio parameter.  Preferably, the universal parameters include media capture content, scene description, scene switching policy, general space information, and/or general encoding information; the media capturing content indicates a purpose of media capturing; and the scene description is used to provide an overall scenario. 
Preferably, the scenario switching policy is used to indicate a supported media switching policy, where the supported media switching policy includes a location switching policy and a partial switching policy, where the location switching policy is used to indicate simultaneous handover All captures are taken to ensure that the captures come together from the same endpoint location, which is used to indicate that different captures can be switched at different times and from the same and/or different remote presentation endpoints. Preferably, the general space information includes a scene area and/or an area scale parameter, where the scene area parameter is used to indicate a range of an overall scene related to the endpoint, and the area scale indicates a scale used by the spatial information parameter. kind. Preferably, the universal coding information includes all the maximum bandwidth, all the maximum number of pixels per second, and/or all the maximum number of macroblocks per second, wherein the total maximum bandwidth is used to indicate the preset type issued by the terminal. The maximum number of bits per second of all code streams; the total number of pixels per second is used to represent the maximum number of pixels per second independently coded in the code group; the maximum number of macroblocks per second is represented by The maximum number of macroblocks per second for all video streams sent by the endpoint. Preferably, the video parameters include: a video capture quantity, video capture space information, and/or video capture coding information; the video capture quantity is used to indicate the number of video captures. Advantageously, said video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein said capture area is for indicating a spatial location of the video capture in the overall captured scene; A capture point, used to indicate the location of the video capture in the captured scene; a point on the capture line that describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point. Preferably, the video capture coding information includes a maximum video bandwidth, a maximum number of pixels per second, a width of a maximum video resolution, a height of a maximum video resolution, and/or a maximum video frame rate; wherein, the maximum video bandwidth, The maximum number of bits per second for indicating a single video encoding; the maximum number of pixels per second, the parameter is used to represent the maximum number of pixels per second for a single video encoding; the width of the maximum video resolution, the parameter is used to represent The width of the maximum video resolution in pixels; the height of the maximum video resolution, which is used to represent the height of the maximum video resolution in pixels; the maximum video frame rate, which indicates the maximum Video frame rate. Preferably, the audio parameters include an audio capture amount, audio capture space information, and/or audio capture coding information; the audio capture number is used to indicate the number of audio captures.  Preferably, the audio capture space information includes: a capture area, and/or a capture point, where the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; the capture point is used to indicate Captures the location of the audio capture in the scene. 
Preferably, the audio capture coding information includes: an audio channel format and/or a maximum audio bandwidth; the audio channel format is used to indicate an attribute of an audio channel; and the maximum audio bandwidth is used to indicate a single audio code per second. The maximum number of bits. Through the invention, the capability interaction of the telepresence terminal is realized, and the user experience is improved. BRIEF DESCRIPTION OF THE DRAWINGS The accompanying drawings, which are set to illustrate,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, In the drawings: FIG. 1 is a flowchart of a capability interaction method of a remote presentation endpoint according to an embodiment of the present invention; FIG. 2 is a structural block diagram of a capability interaction device for remotely presenting an endpoint according to an embodiment of the present invention; FIG. 4 is a flowchart 1 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 4 is a flowchart 1 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 6 is a flowchart 3 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 7 is a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 8 is a flowchart of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 9 is a flowchart 6 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. FIG. 11 is a flowchart of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 11 is a flowchart 8 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; 12 is a flowchart 9 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 13 is a flowchart 10 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention;  14 is a flowchart 11 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; FIG. 15 is a flowchart 12 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention; Flowchart 13 of a method for negotiating a remote presentation endpoint of an embodiment; FIG. 17 is a schematic diagram of a remote presentation endpoint capability set in accordance with an embodiment of the present invention. BEST MODE FOR CARRYING OUT THE INVENTION Hereinafter, the present invention will be described in detail with reference to the accompanying drawings. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict. The present embodiment provides a method for interacting with a remote presentation endpoint. FIG. 1 is a flowchart of a method for interacting with a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps S102 to S106. 
Step S102: Perform a capability interaction between the first remote presentation endpoint and the second remote presentation endpoint, where the capability interaction message carries a remote presentation endpoint capability set, where the capability interaction message carries the remote presentation endpoint transmission capability set . Step S104: The first telepresence endpoint receives a mode request message of the second telepresence endpoint. Step S106: The first telepresence endpoint opens a logical channel between the first telepresence endpoint and the second telepresence endpoint according to the result of the capability interaction and the received mode request information. As a preferred implementation, step S102 can be implemented in the following two manners. Method 1 The method includes the following sub-steps S1 to S11. Step S1: The first remote presentation endpoint sends a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries the first transmission capability set of the first remote presentation endpoint. Step S3: The remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, where the second capability set interaction request carries the second transmission capability set of the second remote presentation endpoint; Step S5: The first remote presentation endpoint receives a first mode request message sent by the second remote presentation endpoint, where the first mode request message carries a sending parameter of the first remote presentation endpoint;  Step S7: The second mode request message sent by the first remote presentation endpoint to the second remote presentation endpoint, where the second mode request message carries the sending parameter of the second remote presentation endpoint; Step S9: the first remote presentation endpoint is configured according to As a result of the mode request process corresponding to the second mode request message, sending a first logical channel request to the second remote presentation endpoint, where the first logical channel request is used to request the first remote presentation endpoint to open the first remote presentation endpoint to the second Remotely presenting a forward logical channel between the endpoints; Step S11: The first remote presentation endpoint receives the second logical channel request sent by the second remote presentation endpoint, where the second logical channel request is the second remote presentation endpoint according to the first mode Requesting a result of the corresponding mode request procedure, the second logical channel request is for requesting the second telepresence endpoint to open the forward logical channel between the second telepresence endpoint and the first telepresence endpoint. Manner 2: The mode includes the following sub-steps S2 to S12. Step S2: The first remote presentation endpoint sends a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first transmission capability set of the first remote presentation endpoint. 
Step S4: The first remote presentation endpoint receives the third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the sending parameters of the first remote presentation endpoint; Step S6: the first remote presentation endpoint receives the fourth capability set interaction request sent by the second remote presentation endpoint, where the fourth capability set interaction request carries the second transmission capability set of the second remote presentation endpoint; Step S8: the first remote presentation endpoint sends a fourth mode request message to the second remote presentation endpoint, where the fourth mode request message carries the sending parameters of the second remote presentation endpoint; Step S10: the first remote presentation endpoint sends a third logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the fourth mode request message, where the third logical channel request is used to request the first remote presentation endpoint to open the forward logical channel between the first remote presentation endpoint and the second remote presentation endpoint; Step S12: the first remote presentation endpoint receives a fourth logical channel request sent by the second remote presentation endpoint, where the fourth logical channel request is determined by the second remote presentation endpoint according to the result of the mode request process corresponding to the first mode request, and the fourth logical channel request is used to request the second remote presentation endpoint to open the forward logical channel between the second remote presentation endpoint and the first remote presentation endpoint.  As a preferred implementation, after the first remote presentation endpoint sends the first capability set interaction request to the second remote presentation endpoint, the method further includes: receiving, by the first remote presentation endpoint, a response message corresponding to the first capability set interaction request sent by the second remote presentation endpoint. After the first remote presentation endpoint sends the third capability set interaction request to the second remote presentation endpoint, the method further includes: receiving, by the first remote presentation endpoint, a response message corresponding to the third capability set interaction request sent by the second remote presentation endpoint. As a preferred implementation, after the first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, the method further includes: sending, by the first remote presentation endpoint, a response message corresponding to the second capability set interaction request to the second remote presentation endpoint. After the first remote presentation endpoint receives the fourth capability set interaction request sent by the second remote presentation endpoint, the method further includes: sending, by the first remote presentation endpoint, a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint.
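To make the message sequence described above more concrete, the following is a minimal sketch, in Python, of the Manner 1 exchange (steps S1 to S11: capability set interaction, mode request, and logical channel open). The class names, the dictionary-based message representation, the example parameter values, and the way a particular mode-request result is tied to a particular channel are purely illustrative assumptions and are not defined by the present application.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CapabilitySet:
    """Placeholder for a remote presentation endpoint transmission capability set."""
    parameters: dict = field(default_factory=dict)


@dataclass
class Endpoint:
    name: str
    tx_caps: CapabilitySet
    peer_tx_caps: Optional[CapabilitySet] = None
    forward_channel_open: bool = False

    # Capability set interaction (steps S1/S3): each side sends its own
    # transmission capability set and the peer stores what it received.
    def send_capability_request(self, peer: "Endpoint") -> None:
        peer.peer_tx_caps = self.tx_caps

    # Mode request (steps S5/S7): the requesting side asks the peer for the
    # sending parameters it wants, chosen from the peer's capability set.
    def send_mode_request(self, peer: "Endpoint", wanted: dict) -> dict:
        return {k: v for k, v in wanted.items() if k in peer.tx_caps.parameters}

    # Open logical channel (steps S9/S11): the sending side opens its forward
    # channel using the result of the mode request procedure.
    def open_forward_channel(self, granted: dict) -> None:
        self.forward_channel_open = bool(granted)


a = Endpoint("A", CapabilitySet({"video": "1080p60", "audio": "stereo"}))
b = Endpoint("B", CapabilitySet({"video": "720p30", "audio": "mono"}))

a.send_capability_request(b)                                 # S1: A -> B capability set
b.send_capability_request(a)                                 # S3: B -> A capability set
params_for_a = b.send_mode_request(a, {"video": "1080p60"})  # S5: B requests A's sending mode
params_for_b = a.send_mode_request(b, {"video": "720p30"})   # S7: A requests B's sending mode
a.open_forward_channel(params_for_a)                         # S9: A opens its forward channel A -> B
b.open_forward_channel(params_for_b)                         # S11: B opens its forward channel B -> A
print(a.forward_channel_open, b.forward_channel_open)
```

The sketch only mirrors the ordering of the messages; real endpoints would carry these exchanges over a control protocol and would negotiate far richer parameter sets than the single video entry shown here.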
As a preferred implementation, after the first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, the method further includes: sending, by the first remote presentation endpoint, the first mode request to the second remote presentation endpoint The response message corresponding to the message; after the first remote presentation endpoint receives the third mode request message sent by the second remote presentation endpoint, the method further includes: transmitting, by the first remote presentation endpoint, to the second remote presentation endpoint, corresponding to the third mode request message Response message. As a preferred implementation manner, after the first remote presentation endpoint sends the second mode request message to the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending the second mode request to the second remote presentation endpoint a response message of the message, after the first remote presentation endpoint sends the fourth mode request message to the second remote presentation endpoint, the method further comprising: the first remote presentation endpoint transmitting a response corresponding to the fourth mode request message to the second remote presentation endpoint Message. As a preferred implementation manner, after the first remote presentation endpoint sends the first logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the second mode request message, the method further includes: the first remote presentation endpoint Receiving, by the second telepresence endpoint, a response message corresponding to the first logical channel request;  After the first remote presentation endpoint sends the third logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the second mode request message, the method further includes: receiving, by the first remote presentation endpoint, the second remote presentation endpoint Corresponding to the third logical channel request corresponding response message. As a preferred implementation, after the first remote presentation endpoint receives the second logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending the second logical channel request corresponding to the second remote presentation endpoint After the first remote presentation endpoint receives the fourth logical channel request sent by the second remote presentation endpoint, the method further includes: the first remote presentation endpoint sending a response message corresponding to the fourth logical channel request to the second remote presentation endpoint. As a preferred implementation manner, the telepresence endpoint transmission capability set includes a capture parameter, where the capture parameter includes a universal parameter, a video parameter, and/or an audio parameter. 
Preferably, the universal parameters include media capture content, scene description, scene switching policy, general space information and/or general encoding information; media capture content indicates use of media capture; scene description is used to provide description of the overall scene; preferably, the scene The switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a location switching policy and a partial switching policy, where the location switching policy is used to indicate that all the capturing is simultaneously switched to ensure that the capturing is from the same endpoint together A place, partial handoff strategy is used to indicate that different captures can be switched at different times and from the same and/or different remote presentation endpoints. Preferably, the general spatial information includes a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene associated with the endpoint, and the area scale indicates the type of scale used by the spatial information parameter. Preferably, the universal coding information includes all the maximum bandwidth, the total number of maximum pixels per second, and/or the maximum number of macroblocks per second, wherein all the maximum bandwidth is used to indicate all the code streams of the preset type sent by the terminal. The maximum number of bits per second; the maximum number of pixels per second is used to represent the maximum number of pixels per second independently coded in the code group; the maximum number of macroblocks per second represents the total number of video streams sent by the endpoint. The maximum number of macroblocks in seconds. Preferably, the video parameters include: video capture number, video capture spatial information, and/or video capture encoded information; the number of video captures is used to indicate the number of video captures. Preferably, the video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein the capture area is used to indicate a spatial location of the video capture in the overall captured scene; a capture point, used to indicate In the captured scene, the location of the video capture; the point on the capture line, describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point. Preferably, the video captures the encoded information, including the maximum video bandwidth, the maximum number of pixels per second, the width of the maximum video resolution, the height of the maximum video resolution, and/or the maximum video frame rate; wherein, the maximum video bandwidth is used to indicate a single The maximum number of bits per second for video encoding; the maximum number of pixels per second, which is used to represent the maximum number of pixels per second for a single video encoding; the maximum video resolution width, which is used to represent the maximum video resolution in pixels. The width of the rate; the height of the maximum video resolution, which is used to indicate the height of the maximum video resolution in pixels; the maximum video frame rate, which indicates the maximum video frame rate. 
Preferably, the audio parameters include an audio capture amount, audio capture spatial information, and/or audio capture encoded information; the number of audio captures is used to indicate the number of audio captures. Preferably, the audio capture spatial information comprises: a capture area and/or a capture point, wherein the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; and a capture point is used to represent the audio capture in the captured scene s position. Preferably, the audio capture encoded information comprises: an audio channel format and/or a maximum audio bandwidth; an audio channel format for indicating an attribute of the audio channel; a maximum audio bandwidth for indicating a maximum number of bits per second for a single audio encoding. It should be noted that the steps shown in the flowchart of the accompanying drawings may be performed in a computer system such as a set of computer executable instructions, and, although the logical order is shown in the flowchart, in some cases, The steps shown or described may be performed in an order different than that herein. In another embodiment, a capability interactive software for remotely presenting endpoints is provided for performing the technical solutions described in the above embodiments and preferred embodiments. In another embodiment, a storage medium is provided, the storage medium having the capability of interacting with the remote presentation endpoints, including but not limited to: an optical disk, a floppy disk, a hard disk, a rewritable memory, and the like. The embodiment of the present invention further provides a capability interaction device for a remote presentation endpoint, which is applicable to a first remote presentation endpoint, and the capability interaction device of the remote presentation endpoint may be used to implement the capability interaction method and a preferred implementation manner of the remote presentation endpoint. Having already explained, it will not be described again, and the modules involved in the capability interaction device of the remote presentation endpoint will be described below. As used below, the term "module" can achieve a predetermined function a combination of software and / or hardware. Although the systems and methods described in the following embodiments are preferably implemented in software, hardware, or a combination of software and hardware, is also possible and contemplated. 2 is a structural block diagram of a capability interaction device for remotely presenting an endpoint according to an embodiment of the present invention. As shown in FIG. 2, the device includes: an interaction module 22, a first receiving module 24, a processing module 26, and the foregoing structure. Carry out a detailed description. 
The interaction module 22 is configured to perform a capability interaction with the second remote presentation endpoint, where the capability interaction message carries a remote presentation endpoint capability set, where the capability interaction message carries the remote presentation endpoint transmission capability set; a receiving module 24, configured to receive a mode request message of the second telepresence endpoint; the processing module 26, connected to the interaction module 22 and the processing module 26, configured to open the first remote according to the result of the capability interaction and the received mode request information A logical channel between the endpoint and the second telepresence endpoint is rendered. FIG. 3 is a block diagram of a preferred structure of a device for remotely presenting an endpoint according to an embodiment of the present invention. As shown in FIG. 3, a preferred structure of the device is as follows: The interaction module 22 includes: a first sending module 220, configured to Sending a first capability set interaction request to the second remote presentation endpoint, where the first capability set interaction request carries a first transmission capability set of the first remote presentation endpoint; the second receiving module 221 is configured to receive the second remote presentation a second capability set interaction request sent by the endpoint, where the second capability set interaction request carries a second transmission capability set of the second remote presentation endpoint, and the third receiving module 222 is configured to receive the second remote presentation endpoint a mode request message, where the first mode request message carries the transmission parameter of the first remote presentation endpoint; the second sending module 223 is configured to send the second mode request message to the second remote presentation endpoint, where the second The mode request message carries the sending parameter of the second remote rendering endpoint; the third sending module 224, And sending a first logical channel request to the second remote rendering endpoint according to a result of the mode request process corresponding to the second mode request message, where the first logical channel request is used to request the first remote rendering endpoint to open the first remote rendering endpoint a second logical channel request sent by the second remote presentation endpoint, wherein the second logical channel request is a second remote presentation endpoint according to the forward logical channel between the second remote presentation endpoints The first mode request determines the result of the corresponding mode request process, and the second logical channel request is for requesting the second telepresence endpoint to open the forward logical channel between the second telepresence endpoint and the first telepresence endpoint.  
Preferably, the interaction module 22 includes: a fourth sending module 226, configured to send a third capability set interaction request to the second remote presentation endpoint, where the third capability set interaction request carries the first transmission of the first remote presentation endpoint The fifth receiving module 227 is configured to receive the third mode request message sent by the second remote presentation endpoint, where the third mode request message carries the sending parameter of the first remote rendering endpoint; the sixth receiving module 228, The fourth capability set interaction request sent by the second remote presentation endpoint is received, where the fourth capability set interaction request carries the second transmission capability set of the second remote presentation endpoint, and the fifth sending module 229 is set to the first a fourth mode request message sent by the remote presentation endpoint, where the fourth mode request message carries the sending parameter of the second remote presentation endpoint, and the sixth sending module 230 is configured to select the mode request process corresponding to the fourth mode request message. As a result, sending a third logical channel request to the second telepresence endpoint, where The third logical channel request is for requesting the first remote rendering endpoint to open the forward logical channel between the first remote rendering endpoint and the second remote rendering endpoint; the seventh receiving module 231 is configured to receive the second remote endpoint a fourth logical channel request, wherein the fourth logical channel request is determined by the second remote rendering endpoint according to a result of the mode request process corresponding to the first mode request, and the fourth logical channel request is used to request the second remote rendering endpoint to open the second remote A forward logical channel between the endpoint and the first telepresence endpoint is presented. Preferably, the foregoing apparatus further includes: an eighth receiving module 31, connected to the first sending module 220, after the first sending module sends the first capability set interaction request to the second remote rendering endpoint, receiving the second remote rendering endpoint sending Corresponding to the response message corresponding to the first capability set interaction request; the ninth receiving module 32 is connected to the fourth sending module 226, and after the fourth sending module 226 sends the third capability set interaction request to the second remote rendering endpoint, The response message corresponding to the third capability set interaction request sent by the remote presentation endpoint is sent. Preferably, the foregoing apparatus further includes: a seventh sending module 33, connected to the second receiving module 221, after the second receiving module 221 receives the second capability set interaction request sent by the second remote rendering endpoint, to the second remote presentation endpoint Sending a response message corresponding to the second capability set interaction request;  The eighth sending module 34 is connected to the sixth receiving module 228. 
After the sixth receiving module 228 receives the fourth capability set interaction request sent by the second remote rendering endpoint, the method further includes: the first remote rendering endpoint to the second remote rendering endpoint A response message corresponding to the fourth capability set interaction request is sent. Preferably, the foregoing apparatus further includes: a ninth sending module 35, connected to the third receiving module 222, after the third receiving module 222 receives the first mode request message sent by the second remote rendering endpoint, sending the message to the second remote rendering endpoint Corresponding to the response message corresponding to the first mode request message; the tenth sending module 36 is connected to the fifth receiving module 227, and after the fifth receiving module 227 receives the third mode request message sent by the second remote rendering endpoint, to the second The telepresence endpoint sends a response message corresponding to the third mode request message. Preferably, the foregoing apparatus further includes: an eleventh sending module 37, connected to the second sending module 223, configured to send to the second remote mode after the second sending module 223 sends the second mode request message to the second remote rendering endpoint The presentation endpoint sends a response message corresponding to the second mode request message; the twelfth sending module 38 is connected to the fifth sending module 229, and is set as the fourth mode request message sent by the fifth sending module 229 to the second remote presentation endpoint. Thereafter, a response message corresponding to the fourth mode request message is sent to the second remote presentation endpoint. Preferably, the foregoing apparatus further includes: a tenth receiving module 39, connected to the third sending module 224, configured to send the second remote presentation endpoint to the second remote sending module 224 according to a result of the mode request process corresponding to the second mode request message After the first logical channel request is sent, the response message corresponding to the first logical channel request sent by the second remote presentation endpoint is received; the eleventh receiving module 40 is connected to the sixth sending module 230, and is configured to be in the sixth sending module. 230. After transmitting the third logical channel request to the second remote presentation endpoint according to the result of the mode request process corresponding to the second mode request message, receiving a response message corresponding to the third logical channel request sent by the second remote presentation endpoint. The thirteenth sending module 41 is connected to the fourth receiving module 225, and the fourth receiving module is configured to: after the first remote rendering endpoint of the fourth receiving module 225 receives the second logical channel request sent by the second remote rendering endpoint, to the second The telepresence endpoint sends a response message corresponding to the second logical channel request;  The fourteenth sending module 42 is connected to the seventh receiving module 231, and configured to send the fourth logical channel request to the second remote rendering endpoint after the seventh receiving module 231 receives the fourth logical channel request sent by the second remote rendering endpoint Corresponding response message. 
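For readers who prefer code to block diagrams, the following is a minimal structural sketch of the three core modules of FIG. 2 (the interaction module 22, the first receiving module 24, and the processing module 26). The method names, the dictionary-based message representation, and the way the negotiation results are combined are hypothetical simplifications and are not part of the claimed apparatus.

```python
class InteractionModule:
    """Sketch of interaction module 22: performs the capability interaction."""

    def exchange(self, local_tx_caps: dict, peer_tx_caps: dict) -> dict:
        # Illustrative "result of the capability interaction": the parameters
        # that appear in both endpoints' capability sets.
        return {k: v for k, v in local_tx_caps.items() if k in peer_tx_caps}


class FirstReceivingModule:
    """Sketch of first receiving module 24: receives the peer's mode request."""

    def receive(self, mode_request: dict) -> dict:
        return dict(mode_request)


class ProcessingModule:
    """Sketch of processing module 26: opens the logical channel from both results."""

    def open_channel(self, capability_result: dict, mode_request: dict) -> dict:
        attributes = {**capability_result, **mode_request}
        return {"open": bool(attributes), "attributes": attributes}


# Hypothetical usage of the three modules together:
caps = InteractionModule().exchange({"video": "H.264", "audio": "AAC"},
                                    {"video": "H.264"})
mode = FirstReceivingModule().receive({"resolution": "720p"})
print(ProcessingModule().open_channel(caps, mode))
```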
The embodiment provides a data stream, a remote presentation endpoint capability set, where the remote presentation endpoint capability set includes: a transmission capability set, where the transmission capability set includes: a capture parameter, where the capture parameters include: a general parameter, a video parameter, and/or an audio parameter. . Preferably, the universal parameters include media capture content, scene description, scene switching policy, general space information, and/or general encoding information; media capture content indicates use of media capture, the attribute includes a media capture perspective, a role of the representation of the media, the media Whether it is auxiliary stream content, media related language; scene description is used to provide a description of the overall scene, such as a text description. Preferably, the scenario switching policy is used to indicate the supported media switching policy, where the supported media switching policy includes a location switching policy and a partial switching policy, where the location switching policy is used to indicate that all the capturing is simultaneously switched to ensure that the capturing is performed together. From the same endpoint location, a partial handover policy is used to indicate that different acquisitions can be switched at different times and from the same and/or different remote presentation endpoints. Preferably, the general spatial information includes a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene associated with the endpoint, and the area scale indicates the type of scale used by the spatial information parameter. Preferably, the universal coding information includes all the maximum bandwidth, the total number of maximum pixels per second, and/or the maximum number of macroblocks per second, wherein all the maximum bandwidth is used to indicate all the code streams of the preset type sent by the terminal. The maximum number of bits per second; the maximum number of pixels per second is used to represent the maximum number of pixels per second independently coded in the code group; the maximum number of macroblocks per second represents the total number of video streams sent by the endpoint. The maximum number of macroblocks in seconds. Preferably, the video parameters include: video capture number, video capture spatial information, and/or video capture encoded information; the number of video captures is used to indicate the number of video captures. Preferably, the video capture spatial information comprises a capture area, a capture point and/or a point on the capture line, wherein the capture area is used to indicate a spatial location of the video capture in the overall captured scene; a capture point, used to indicate In the captured scene, the location of the video capture; the point on the capture line, describes the spatial location of the second point on the optical axis of the capture device, and the first point is the capture point.  
Preferably, the video captures the encoded information, including the maximum video bandwidth, the maximum number of pixels per second, the width of the maximum video resolution, the height of the maximum video resolution, and/or the maximum video frame rate; wherein, the maximum video bandwidth is used to indicate a single The maximum number of bits per second for video encoding; the maximum number of pixels per second, which is used to represent the maximum number of pixels per second for a single video encoding; the maximum video resolution width, which is used to represent the maximum video resolution in pixels. The width of the rate; the height of the maximum video resolution, which is used to indicate the height of the maximum video resolution in pixels; the maximum video frame rate, which indicates the maximum video frame rate. Preferably, the audio parameters include an audio capture amount, audio capture spatial information, and/or audio capture encoded information; the number of audio captures is used to indicate the number of audio captures. Preferably, the audio capture spatial information comprises: a capture area and/or a capture point, wherein the capture area is used to indicate a spatial location where the audio capture is located in the overall captured scene; and a capture point is used to represent the audio capture in the captured scene s position. Preferably, the audio capture encoded information comprises: an audio channel format and/or a maximum audio bandwidth; an audio channel format for indicating an attribute of the audio channel; a maximum audio bandwidth for indicating a maximum number of bits per second for a single audio encoding. Preferably, the telepresence endpoint capability set comprises: a telepresence endpoint symmetric capability set, and the telepresence endpoint symmetric capability set comprises: capturing rendering parameters, and capturing rendering parameters comprises: general parameters, video parameters and/or audio parameters. Preferably, the universal parameters include media capture rendering content, scene description, scene switching strategy, general space information, and/or general encoding information; media capture rendering content represents media capture and/or rendering purposes, including media capture perspective, media The representation of the role, whether the media is the auxiliary stream content, the media related language; the scenario description is used to provide a description of the overall scenario; preferably, the scenario switching policy is used to indicate the supported media switching policy, wherein the supported media switching policy includes a place switching policy and a partial switching policy, wherein the place switching policy is used to indicate that all the captured renderings are simultaneously switched to ensure that the captured renderings come together from the same endpoint location, and the partial switching strategy is used to indicate that different capturing renderings can be switched at different times. And from the same and/or different telepresence endpoints. Preferably, the general spatial information includes a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene associated with the endpoint, and the area scale indicates the type of scale used by the spatial information parameter.  
Preferably, the universal coding information includes all the maximum bandwidth, all the maximum number of pixels per second and/or the maximum number of macroblocks per second, wherein all the maximum bandwidth is used to indicate the pre-send and/or received pre-received by the capture rendering endpoint. Set the maximum number of bitrates per second for all streams of the type; the maximum number of pixels per second is used to represent the maximum number of pixels per second independently encoded in the encoded group sent and/or received by the endpoint; all per second The maximum number of macroblocks represents the maximum number of macroblocks per second for all video streams sent and/or received by the endpoint. Preferably, the video parameters include: video capture rendering amount, video capture rendering space information, and/or video capture rendering encoding information; the number of video capture renderings is used to indicate the number of video captures and/or renderings. Preferably, the video capture rendering spatial information comprises capturing a rendering area, capturing a rendering point, and/or capturing a point on the rendering line, wherein the capturing rendering area is used to indicate where the video capture rendering is in the overall captured and/or rendered scene Spatial position; capture rendering point, used to indicate the location of video capture and/or rendering in the captured and/or rendered scene; capture points on the rendering line, describing the second on the optical axis of the capture and/or rendering device The spatial position of the points, and the first point is the capture and / or render point. Preferably, the video capture renders the encoded information, including the maximum video bandwidth, the maximum number of pixels per second, the width of the maximum video resolution, the height of the maximum video resolution, and/or the maximum video frame rate; wherein the maximum video bandwidth is used to indicate The maximum number of bits per second for a single video encoding; the maximum number of pixels per second, which is used to represent the maximum number of pixels per second for a single video encoding; the maximum video resolution width, which is used to represent the largest video in pixels. The width of the resolution; the height of the maximum video resolution, which is used to indicate the height of the maximum video resolution in pixels; the maximum video frame rate, which indicates the maximum video frame rate. Preferably, the audio parameters include an audio capture rendering amount, audio capture rendering space information, and/or audio capture rendering encoding information; an audio capture rendering number is used to indicate the number of audio capture renderings. Preferably, the audio capture rendering spatial information comprises: capturing a rendering area and/or capturing a rendering point, wherein the rendering area is used to represent an audio capture and/or rendering a spatial location where the overall captured and/or rendered scene is located; Render point, used to indicate the location of audio capture and/or rendering in the captured and/or rendered scene. Preferably, the audio capture rendering encoded information comprises: an audio channel format and/or a maximum audio bandwidth; an audio channel format for indicating an attribute of the audio channel; and a maximum audio bandwidth for indicating a maximum number of bits per second involved in a single audio encoding. 
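As an illustration only, the transmission capability set parameters enumerated above can be pictured as a nested data structure. The following Python sketch uses hypothetical field and class names; the application defines the parameters themselves, but no concrete encoding or naming, so none of the identifiers below should be read as normative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GeneralParameters:
    media_capture_content: Optional[str] = None    # purpose/attributes of the capture
    scene_description: Optional[str] = None        # text description of the overall scene
    scene_switching_policy: Optional[str] = None   # e.g. "location" or "partial" switching
    scene_area: Optional[tuple] = None             # range of the overall scene
    area_scale: Optional[str] = None               # type of scale used by spatial parameters
    max_total_bandwidth_bps: Optional[int] = None
    max_total_pixels_per_second: Optional[int] = None
    max_total_macroblocks_per_second: Optional[int] = None


@dataclass
class VideoCapture:
    capture_area: Optional[tuple] = None
    capture_point: Optional[tuple] = None
    point_on_capture_line: Optional[tuple] = None
    max_video_bandwidth_bps: Optional[int] = None
    max_pixels_per_second: Optional[int] = None
    max_width_px: Optional[int] = None
    max_height_px: Optional[int] = None
    max_frame_rate: Optional[float] = None


@dataclass
class AudioCapture:
    capture_area: Optional[tuple] = None
    capture_point: Optional[tuple] = None
    channel_format: Optional[str] = None           # e.g. "mono" or "stereo"
    max_audio_bandwidth_bps: Optional[int] = None


@dataclass
class TransmissionCapabilitySet:
    general: GeneralParameters = field(default_factory=GeneralParameters)
    video_captures: List[VideoCapture] = field(default_factory=list)
    audio_captures: List[AudioCapture] = field(default_factory=list)

    @property
    def video_capture_count(self) -> int:          # "number of video captures"
        return len(self.video_captures)


caps = TransmissionCapabilitySet()
caps.video_captures.append(VideoCapture(max_width_px=1920, max_height_px=1080,
                                        max_frame_rate=60.0))
print(caps.video_capture_count)
```

The symmetric and reception capability sets described in this embodiment could be modeled the same way, with rendering fields replacing or accompanying the capture fields.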
Preferably, the telepresence endpoint capability set includes: As a preferred implementation, the telepresence endpoint capability set includes: a telepresence endpoint reception capability set, including: a rendering parameter, where the rendering parameters include: a general parameter, a video parameter, and/or Audio parameters.  Preferably, the universal parameters include: media rendering content, scene description, scene switching strategy, general space information, and/or general encoding information, wherein the media rendering content is used to represent an attribute of the captured content required by the rendering endpoint, the attribute including media capture The perspective, the role of the media representation, whether the media is the auxiliary stream content, the media related language; the scenario description, used to provide a description of the overall scenario; the scenario switching policy, used to indicate the supported media switching policy, preferably, the scenario switching policy The method includes a location switching policy and/or a partial switching policy, where the location switching policy is to switch all the renderings at the same time, to ensure that the renderings come together from the same endpoint location, and the partial switching strategy switches for different renderings at different times, from the same and / or different endpoints. Preferably, the general spatial information includes: a scene area and/or an area scale parameter, wherein the scene area parameter is used to indicate a range of the overall scene related to the endpoint, and the area scale is used to indicate the type of the scale used by the spatial information parameter. Preferably, the universally encoded information includes all of the maximum bandwidth, all of the maximum number of pixels per second, and/or all of the maximum number of macroblocks per second, wherein all of the maximum bandwidth represents all of the preset types of streams received by the rendering endpoint. The maximum number of bits per second; the maximum number of pixels per second represents the maximum number of pixels processed per second independently encoded in the code group; all the maximum number of macroblocks per second represents all video streams received by the endpoint The maximum number of macroblocks per second. Preferably, the video parameters include: a number of video renderings, video rendering space information, and/or video rendering encoding information, where the number of video renderings is used to indicate the number of video renderings; the video rendering spatial information is used to indicate that the video rendering representation is a whole Render a portion of the scene. Preferably, the video rendering coding information includes a maximum video bandwidth, a maximum number of pixels per second, a width of a maximum video resolution, a height of a maximum video resolution, and/or a maximum video frame rate; wherein, a maximum video bandwidth, the parameter is used to represent The maximum number of bits per second for a single video encoding; the maximum number of pixels per second, which is used to represent the maximum number of pixels per second for a single video encoding; the maximum video resolution width, which is used to represent the largest video in pixels. 
The width of the resolution; the height of the maximum video resolution, which is used to indicate the height of the maximum video resolution in pixels; the maximum video frame rate, which is used to represent the maximum video frame rate. Preferably, the audio parameters include: an audio rendering amount, audio rendering space information, and/or audio rendering encoding information, where the audio rendering number is used to represent the number of audio renderings; the audio rendering space information is used to indicate that the audio rendering is in the overall rendering scene. The spatial information in which it is located.  Preferably, the audio rendering encoded information comprises: an audio channel format and/or a maximum audio bandwidth, wherein the audio channel format is used to represent an attribute of the audio channel; and the maximum audio bandwidth is used to represent a maximum number of bits per second for a single audio encoding. Preferred Embodiment 1 The preferred embodiment provides a capability negotiation method, which can ensure scene description, codec capability, and negotiation of multiplexing channels of multiple media between remote presentation endpoints. In this embodiment, a negotiation mode A is provided. FIG. 4 is a flowchart 1 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 4, the negotiation mode includes a capability set interaction and a logical channel open. section. The method includes the following steps S401 and S402: Step S401: Capability set interaction: capability interaction between two telepresence endpoints, carrying a telepresence endpoint capability set. Step S402: The logical channel is opened: Open a logical channel between two remote rendering endpoints, and specify the channel attribute after negotiation. The following description will be made with reference to examples. In this embodiment, a negotiation mode A-1 is provided. FIG. 5 is a flowchart 2 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention, as shown in FIG. 5. The capability set carried in the negotiation mode capability set interaction message is a receiving capability set, and the receiving capability set includes the endpoint receiving capability related parameter, and the following steps S501 and S502 are included. Step S501: capability set interaction: capability set interaction between two remote presentation endpoints, where the message carries a remote presentation endpoint reception capability set parameter. Step S502: Logical channel open: Open a logical channel between two remote rendering endpoints, and specify the channel attribute after negotiation. FIG. 6 is a flowchart 3 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. The capability set interaction and the logical channel open procedure in the negotiation manner shown in FIG. 6 may include the following steps S601 to S608. Step S601: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the reception capability set of the endpoint A. 
Step S602: Endpoint B replies to the endpoint A capability set interaction response message;  Step S603: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the reception capability set of the endpoint B. Step S604: The endpoint A replies to the endpoint B capability set interaction response message; Step S605: The endpoint B according to the endpoint A The receiving capability, combined with its own sending capability, sends an Open Logical Channel Request message to Endpoint A. Specifying the channel attribute, requesting to open the forward logical channel of B to A; Step S606: Endpoint A replies to the Endpoint B to open the logical channel response message; Step S607: Endpoint A according to the receiving capability of the endpoint B, combined with its own sending capability, to the endpoint B sends an open logical channel request message. Specify the channel attribute, request to open the forward logical channel of A to B; Step S608: Endpoint B replies to Endpoint A to open the logical channel response message. One pair of request and response messages has a chronological order, and the first pair of capability interaction messages are sent before the first pair of logical channel open messages. According to this rule, the order of sending the messages described in FIG. 3 may also be 1, 3, 2, 4, 5, 6, 7, 8, and the endpoints A and B first transmit the response set after interacting with each other, or 1 After the 2, 5, 6, 3, 4, 7, and 8 capabilities are completed, first open the logical channel on one side. In the process of information interaction from endpoint A to endpoint B, as long as the length of the underlying packet is allowed, multiple messages sent from one endpoint may be carried at a time, such as 1, 2+3, 4, 5, 6+7, 8 . 2+3 indicates that end point B is sent to the endpoint A-strip information, and the interaction contains two messages and two messages. Preferred Embodiment 2 The preferred embodiment provides a capability negotiation method, which can ensure scene description, codec capability, and negotiation of multiplexing channels of multiple media between remote presentation endpoints. The preferred embodiment provides a negotiation mode A-2, and the capability set carried in the negotiation mode capability set interaction message is a symmetric capability set, and the symmetric capability set indicates that the receiving capability set and the transmission capability set of the endpoint are consistent. FIG. 7 is a flowchart 4 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 7, the method includes step S701 and step S702. Step S701: Capability set interaction: The capability set interaction is performed between two remote rendering endpoints, and the message carries the parameters of the remote rendering endpoint symmetric capability set. Step S702: The logical channel is opened: Open a logical channel between two remote rendering endpoints, and specify the channel attribute after negotiation.  FIG. 8 is a flowchart of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 8, the capability set interaction and the logical channel open procedure in the negotiation manner include the following steps S801 to S808. 
Step S801: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, where the message carries the symmetric capability set of endpoint A. Step S802: Endpoint B replies to endpoint A with a capability set interaction response message. Step S803: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, where the message carries the symmetric capability set of endpoint B. Step S804: Endpoint A replies to endpoint B with a capability set interaction response message. Step S805: Endpoint B, according to the symmetric capability of endpoint A and combined with its own capability, sends an open logical channel request message to endpoint A, specifying the channel attributes and requesting to open the forward logical channel from B to A. Step S806: Endpoint A replies to endpoint B with an open logical channel response message. Step S807: Endpoint A, according to the symmetric capability of endpoint B and combined with its own capability, sends an open logical channel request message to endpoint B, specifying the channel attributes and requesting to open the forward logical channel from A to B. Step S808: Endpoint B replies to endpoint A with an open logical channel response message. Each pair of request and response messages has a chronological order, and the first pair of capability interaction messages is sent before the first pair of logical channel open messages. According to this rule, the order in which the messages described in FIG. 5 are sent may also be 1, 3, 2, 4, 5, 6, 7, 8, where endpoints A and B first exchange capability sets with each other and then send the response messages, or 1, 2, 5, 6, 3, 4, 7, 8, where the logical channel on one side is opened first after part of the capability interaction has been completed. During one information exchange from endpoint A to endpoint B, as long as the underlying packet length allows, multiple messages sent from one endpoint may be carried at a time, for example 1, 2+3, 4, 5, 6+7, 8, where 2+3 indicates that endpoint B sends endpoint A one piece of information that contains the two messages 2 and 3. Preferred Embodiment 3 The preferred embodiment provides a capability negotiation method, which can ensure the scene description, codec capability, and multiplexing channel negotiation of multiple media streams between remote presentation endpoints. The preferred embodiment provides a negotiation mode A-3, in which the capability set carried in the capability set interaction message is a transmission capability set, and the transmission capability set includes endpoint transmission capability related parameters. FIG. 9 is flowchart 6 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 9, the method includes steps S901 and S902. Step S901: Capability set interaction: a capability set interaction is performed between two remote presentation endpoints, and the message carries the remote presentation endpoint transmission capability set parameters. Step S902: Logical channel open: open the logical channel between the two remote presentation endpoints, and specify the negotiated channel attributes. The following description will be made with reference to examples. Example 1: This example provides a negotiation mode A-3-1. FIG.
10 is a flowchart 7 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 10, the method is in a capability interaction response message. Returning the selected parameters includes the following steps S101 to S108. Step S101: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of the endpoint A. Step S102: The endpoint B replies to the endpoint A capability set interaction response message, and carries the transmission capability of the B from the A. Step S103: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the transmission capability set of the endpoint B. Step S104: The endpoint A replies to the endpoint B capability set interaction response message, carrying the A slave B. The sending capability concentrates the selected parameters; Step S105: Endpoint A sends an Open Logical Channel Request message to Endpoint B according to the capability of Endpoint B to select in the capability set interaction response. Specifying the channel attribute, requesting to open the forward logical channel of A to B; Step S106: Endpoint B replies to Endpoint A to open the logical channel response message; Step S107: Endpoint B according to the capability of Endpoint A in the capability set interactive response selection, to Endpoint A Send Open logical channel request message. Specify the channel attribute, request to open the forward logical channel of B to A; Step S108: Endpoint A replies to Endpoint B with the open logical channel response message.  One pair of request and response messages has a chronological order, and the first pair of capability interaction messages are sent before the first pair of logical channel open messages. According to this rule, the order of sending the messages described in FIG. 7 may also be 1, 3, 2, 4, 5, 6, 7, 8, and the endpoints A and B first transmit the response set and then send the response message, or 1 After the 2, 5, 6, 3, 4, 7, and 8 capabilities are completed, first open the logical channel on one side. In the process of information interaction from endpoint A to endpoint B, as long as the length of the underlying packet is allowed, multiple messages sent from one endpoint may be carried at a time, such as 1, 2+3, 4, 5, 6+7, 8 . 2+3 indicates that end point B is sent to the endpoint A-strip information, and the interaction contains two messages and two messages. Example 2: This example provides negotiation mode A-3-2. As shown in FIG. 11, the mode does not carry the selection parameter in the capability set interaction response message, and the reverse logical channel is requested to be opened in the open logical channel request. The following steps S111 to S118 are included. Step S111: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the transmission capability set of the endpoint A. Step S112: the endpoint B replies to the endpoint A capability set interaction response message; Step S113: remotely renders the endpoint B direction The remote presentation endpoint A initiates the capability set interaction request, and the message carries the transmission capability set of the endpoint B. 
Step S114: The endpoint A replies to the endpoint B capability set interaction response message; Step S115: the endpoint B combines its own reception according to the sending capability of the endpoint A. Capabilities, sends an open logical channel request message to endpoint A. Specifying the channel attribute, requesting to open the reverse logical channel of B to A; Step S116: Endpoint A replies to the Endpoint B to open the logical channel response message; Step S117: Endpoint A according to the sending capability of the endpoint B, combined with its own receiving capability, to the endpoint B sends an open logical channel request message. Specify the channel attribute, request to open the reverse logical channel from A to B; Step S118: Endpoint B replies to Endpoint A to open the logical channel response message. One pair of request and response messages has a chronological order, and the first pair of capability interaction messages are sent before the first pair of logical channel open messages. According to this rule, the order of sending the messages described in FIG. 8 may also be 1, 3, 2, 4, 5, 6, 7, 8, and the endpoints A and B first transmit the response set after interacting with each other, or 1 After the 2, 5, 6, 3, 4, 7, and 8 capabilities are completed, first open the logical channel on one side.  In the process of information interaction from endpoint A to endpoint B, as long as the length of the underlying packet is allowed, multiple messages sent from one endpoint may be carried at a time, such as 1, 2+3, 4, 5, 6+7, 8 . 2+3 indicates that end point B is sent to the endpoint A-strip information, and the interaction contains two messages and two messages. Preferred Embodiment 4 The preferred embodiment provides a capability negotiation method, which can ensure scene description, codec capability and negotiation of multiplexing channels of multiple media between remote presentation endpoints. The preferred embodiment provides a negotiation mode A-4, and the capability set carried in the negotiation mode capability set interaction message is a reception capability set and a transmission capability set. The receiving capability set includes endpoint receiving capability related parameters, and the sending capability set includes endpoint sending capability related parameters. As shown in FIG. 12, the following steps S1201 and S1202 are included. Step S1201: Capability set interaction: The capability set interaction is performed between two telepresence endpoints, and the message carries the telepresence endpoint receiving capability set and the sending capability set parameter. Step S1202: Logical channel open: Open the logical channel between two telepresence endpoints, and specify the channel properties after negotiation. The following is an example to illustrate. Instance 1 This example describes the negotiation mode A-4-1. As shown in FIG. 13, the negotiation mode carries the receiving capability set and the sending capability set simultaneously in the capability set interaction, and includes the following steps S1301 to S1308. Step S1301: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B, and the message carries the reception capability set and the transmission capability set of the endpoint A. 
Step S1302: The endpoint B replies to the endpoint A capability set interaction response message; Step S1303: Remote The presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A, and the message carries the reception capability set and the transmission capability set of the endpoint B. Step S1304: The endpoint A replies to the endpoint B capability set interaction response message; Step S1305: The endpoint B according to the endpoint A The ability to, in conjunction with its own capabilities, sends an Open Logical Channel Request message to Endpoint A. Specify the channel attribute, request to open the forward logical channel of B to A; Step S1306: Endpoint A replies to Endpoint B to open the logical channel response message;  Step S1307: Endpoint A sends an Open Logical Channel Request message to Endpoint B according to the capabilities of Endpoint B and its own capabilities. Specify the channel attribute, request to open the forward logical channel from A to B; Step S1308: Endpoint B replies to Endpoint A to open the logical channel response message. One pair of request and response messages has a chronological order, and the first pair of capability interaction messages are sent before the first pair of logical channel open messages. According to this rule, the order of sending the messages described in FIG. 10 may also be 1, 3, 2, 4, 5, 6, 7, 8, and the endpoints A and B first interact with each other and then open the logical channels on both sides. Or 1, 2, 5, 6, 3, 4, 7, 8, after the ability to complete a part of the interaction, first open the logical channel on one side. In the process of information interaction from endpoint A to endpoint B, as long as the length of the underlying packet is allowed, multiple messages sent from one endpoint may be carried at a time, such as 1, 2+3, 4, 5, 6+7, 8 . 2+3 indicates that end point B is sent to the endpoint A-strip information, and the interaction contains two messages and two messages. Example 2 FIG. 14 is a flowchart 11 of a method for negotiating a remote presentation endpoint according to an embodiment of the present invention. In this embodiment, a negotiation mode A-4-2 is described. As shown in FIG. 14, the mode is in a capability set. The receiving capability set as the receiving end in the interaction request message is formed by the transmission capability set as the transmitting end, and includes the following steps S1401 to S1412.
Example 2: FIG. 14 is flowchart eleven of a negotiation method for a remote presentation endpoint according to an embodiment of the present invention and describes negotiation mode A-4-2. As shown in FIG. 14, in this mode the receiving capability set that an endpoint, as receiving end, carries in its capability set interaction request message is formed according to the transmission capability set received from the transmitting end. The mode includes the following steps S1401 to S1412. S1401: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B; the message carries the transmission capability set of endpoint A.
S1402: Endpoint B replies to endpoint A with a capability set interaction response message.
S1403: Based on the sending capability of endpoint A and its own receiving needs, endpoint B initiates a capability interaction request message to endpoint A; the message carries the receiving capability set of endpoint B.
S1404: Endpoint A replies to endpoint B with a capability set interaction response message.
S1405: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A; the message carries the transmission capability set of endpoint B.
S1406: Endpoint A replies to endpoint B with a capability set interaction response message. S1407: Based on the sending capability of endpoint B and its own receiving needs, endpoint A initiates a capability interaction request message to endpoint B; the message carries the receiving capability set of endpoint A. S1408: Endpoint B replies to endpoint A with a capability set interaction response message.
S1409: Endpoint A sends an open logical channel request message to endpoint B according to the receiving capability set of endpoint B, specifying the channel attributes and requesting to open the forward logical channel from A to B.
S1410: Endpoint B replies to endpoint A with an open logical channel response message. S1411: Endpoint B sends an open logical channel request message to endpoint A according to the receiving capability set of endpoint A, specifying the channel attributes and requesting to open the forward logical channel from B to A.
S1412: Endpoint A replies to endpoint B with an open logical channel response message. Each pair of request and response messages has a chronological order, and the first two pairs of sending-capability and receiving-capability interaction messages are sent before the first pair of logical channel open messages. Under this rule, the messages described in FIG. 11 may also be sent in the order 1, 3, 5, 7, 2, 4, 6, 8, in which endpoints A and B first exchange capability sets with each other and then send the response messages, or in the order 1, 2, 3, 4, 9, 10, 5, 6, 7, 8, 11, 12, in which the logical channel on one side is opened first after part of the capability interaction has been completed. During one information exchange from endpoint A to endpoint B, as long as the length of the underlying packet allows, several messages sent from one endpoint may be carried at a time, for example 1, 2+3, 4, 5, 6+7, 8, 9, 10+11, 12, where 2+3 indicates that endpoint B sends endpoint A a single message that contains both message 2 and message 3.
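In mode A-4-2, the receiving side builds its receiving capability set, and later the attributes of the channels it requests, from the peer's transmission capability set combined with its own receiving needs. A minimal sketch of that intersection step is given below; the field names (stream_id, codec, max_bandwidth) are illustrative assumptions and not the parameter names used by this application.

    from dataclasses import dataclass

    @dataclass
    class StreamCapability:
        stream_id: str       # e.g. "VR0"
        codec: str           # e.g. "H.264"
        max_bandwidth: int   # in kbit/s

    def derive_receive_set(peer_send_set, local_decoders, local_max_bandwidth):
        """Keep only the peer's advertised streams that the local endpoint can
        decode, clamping each stream's bandwidth to the local limit."""
        accepted = []
        for cap in peer_send_set:
            if cap.codec in local_decoders:
                accepted.append(StreamCapability(
                    stream_id=cap.stream_id,
                    codec=cap.codec,
                    max_bandwidth=min(cap.max_bandwidth, local_max_bandwidth),
                ))
        return accepted

    if __name__ == "__main__":
        peer = [StreamCapability("VR0", "H.264", 4000),
                StreamCapability("AR0", "G.711", 128)]
        print(derive_receive_set(peer, local_decoders={"H.264", "G.711"},
                                 local_max_bandwidth=2000))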
Preferred Embodiment 5: This preferred embodiment provides a capability negotiation method that can ensure the negotiation of scene descriptions, codec capabilities, and multiplexed channels for multiple media streams between remote presentation endpoints. In this embodiment, the capability set carried in the capability set interaction message of the negotiation mode is a transmission capability set. As shown in FIG. 15, the mode comprises the following three steps. Step S1501: Capability set interaction: a capability set interaction is performed between the two remote presentation endpoints, and the messages carry the telepresence endpoint receiving capability set and sending capability set parameters. Step S1502: Mode request: the receiving end requests a specific transmission mode according to the sending capability set of the transmitting end. Step S1503: Logical channel open: the logical channels between the two telepresence endpoints are opened, and the negotiated channel attributes are specified. This is described in detail below by way of example. Example 1: FIG. 16 is flowchart thirteen of a negotiation method for a remote presentation endpoint according to an embodiment of the present invention. As shown in FIG. 16, the method includes the following steps S1601 to S1612. Step S1601: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B; the message carries the transmission capability set of endpoint A. Step S1602: Endpoint B replies to endpoint A with a capability set interaction response message. Step S1603: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A; the message carries the transmission capability set of endpoint B. Step S1604: Endpoint A replies to endpoint B with a capability set interaction response message. Step S1605: The remote presentation endpoint B initiates a mode request message to the remote presentation endpoint A, requesting a specific transmission mode; the request carries the transmission parameters for endpoint A. Step S1606: Endpoint A replies to endpoint B with a mode request response message. Step S1607: The remote presentation endpoint A initiates a mode request message to the remote presentation endpoint B, requesting a specific transmission mode; the request carries the transmission parameters for endpoint B. Step S1608: Endpoint B replies to endpoint A with a mode request response message. Step S1609: Endpoint A sends an open logical channel request message to endpoint B according to the result of the mode request process, specifying the channel attributes and requesting to open the forward logical channel from A to B. Step S1610: Endpoint B replies to endpoint A with an open logical channel response message. Step S1611: Endpoint B sends an open logical channel request message to endpoint A according to the result of the mode request process, specifying the channel attributes and requesting to open the forward logical channel from B to A. Step S1612: Endpoint A replies to endpoint B with an open logical channel response message. Preferably, each pair of request and response messages has a chronological order: the first pair of capability interaction messages is sent before the first pair of mode request messages, and the first pair of mode request messages is sent before the first pair of logical channel open messages. Under this rule, the messages described in FIG. 13 may also be sent in the order 1, 2, 5, 6, 3, 4, 7, 8, 9, 10, 11, 12, or 1, 2, 5, 6, 9, 10, 3, 4, 7, 8, 11, 12. During one information exchange from endpoint A to endpoint B, as long as the length of the underlying packet allows, several messages sent from one endpoint may be carried at a time, for example 1, 2+3, 4, 5, 6+7, 8, 9, 10+11, 12, where 2+3 indicates that endpoint B sends endpoint A a single message that contains both message 2 and message 3.
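The three phases of this negotiation mode (capability set interaction, then mode request, then logical channel open) can be summarised as a small driver. The sketch below only models the ordering constraint described above; the phase names are placeholders and nothing here reflects the actual message contents.

    from enum import Enum, auto

    class Phase(Enum):
        CAPABILITY_EXCHANGE = auto()
        MODE_REQUEST = auto()
        OPEN_LOGICAL_CHANNEL = auto()
        ESTABLISHED = auto()

    # Allowed forward transitions: capabilities first, then the mode request,
    # then the logical channel open, mirroring the ordering rule above.
    _NEXT = {
        Phase.CAPABILITY_EXCHANGE: Phase.MODE_REQUEST,
        Phase.MODE_REQUEST: Phase.OPEN_LOGICAL_CHANNEL,
        Phase.OPEN_LOGICAL_CHANNEL: Phase.ESTABLISHED,
    }

    class NegotiationSession:
        def __init__(self):
            self.phase = Phase.CAPABILITY_EXCHANGE

        def complete_phase(self, phase):
            if phase is not self.phase:
                raise RuntimeError(f"{phase.name} completed out of order")
            self.phase = _NEXT[self.phase]

    if __name__ == "__main__":
        session = NegotiationSession()
        for step in (Phase.CAPABILITY_EXCHANGE, Phase.MODE_REQUEST,
                     Phase.OPEN_LOGICAL_CHANNEL):
            session.complete_phase(step)
        assert session.phase is Phase.ESTABLISHED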
Preferred Embodiment 6: This preferred embodiment provides remote presentation endpoint capability set parameters. FIG. 17 is a schematic diagram of a remote presentation endpoint capability set according to an embodiment of the present invention; as shown in FIG. 17, the capability set is divided into a transmission capability set and a reception capability set, described in detail below. It should be noted that the parameters listed in the figure and in the above embodiments are only the more important parameters and do not represent all of the parameters. The parameters carried in a capability interaction message need not include all of the parameters below; they can be combined as actually required. The transmission capability set mainly includes capture-related parameters, and the capture-related parameters include general parameters, video parameters, and audio parameters. Other parameters in the transmission capability set can be used to carry coding standards of the related art, such as
H.263 and H.264. Preferably, the general parameters describe scene-related parameters and general coding-related parameters. The scene-related parameters include a scene description, a scene switching policy, a scene area, and metric information; the general coding-related parameters include a maximum bandwidth, a maximum number of macroblocks per second, and a coding standard. Preferably, the video parameters describe the attributes of each independent video that makes up the captured scene. The video parameters mainly include video capture spatial information, video capture coding information, the number of video captures, video content attributes, video switching policies, video combination policies, and other parameters. The video capture spatial information includes a capture area, a capture point, or a point on a capture line; the video capture coding information includes the maximum video bandwidth of the capture, the maximum number of macroblocks per second, the maximum video resolution width, the maximum video resolution height, and the maximum video frame rate. Preferably, the audio parameters describe the attributes of each independent audio that makes up the captured scene. The audio parameters mainly include audio capture spatial information, audio capture coding information, the number of audio captures, and other parameters. The audio capture spatial information includes a capture area, a capture point, or a point on a capture line; the audio capture coding information includes the audio channel format of the capture and the maximum audio bandwidth. Preferably, the receiving capability set corresponds to the sending capability set, and the rendering-related parameters correspond to the capture-related parameters. Preferably, the receiving capability set mainly includes rendering-related parameters, and the rendering-related parameters include general parameters, video parameters, and audio parameters. Other parameters in the receiving capability set can be used to carry decoding standards such as H.263 and H.264. Preferably, the general parameters describe scene-related parameters and general decoding-related parameters. The scene-related parameters include a scene description, a scene switching policy, a scene area, and metric information; the general decoding-related parameters include a maximum bandwidth, a maximum number of macroblocks per second, and a decoding standard. Preferably, the video parameters describe the attributes of each independent video that makes up the rendered scene.
The video parameters mainly include video rendering spatial information, video rendering decoding information, the number of video renders, content attributes, automatic switching policies, combination policies, and other parameters. The video rendering spatial information includes a rendering area, a rendering point, or a point on a rendering line; the video rendering decoding information includes the maximum video bandwidth of the render, the maximum number of macroblocks per second, the maximum video resolution width, the maximum video resolution height, and the maximum video frame rate. Preferably, the audio parameters describe the attributes of each independent audio that makes up the rendered scene. The audio parameters mainly include audio rendering spatial information, audio rendering decoding information, the number of audio renders, and other parameters. The audio rendering spatial information includes a rendering area, a rendering point, or a point on a rendering line; the audio rendering decoding information includes the audio channel format of the render and the maximum audio bandwidth. Preferably, the logical channel attributes specified by a remote presentation endpoint mainly include the media-related information and codec information of the streams the channel is to carry; if the logical channel is to be multiplexed, channel multiplexing information also needs to be specified. Preferred Embodiment 7: This embodiment is one of the preferred implementations of negotiation mode A-1 and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B, where endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker, and endpoint B has 3 cameras, 3 displays, 1 microphone, and 1 speaker. Endpoint A and/or endpoint B may also be an MCU device. The method includes the following steps S1702 to S1716. Step S1702: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B; the message carries the reception capability set of endpoint A. The video parameters in the rendering-related parameters of the reception capability set are: the video rendering spatial information with rendering identifier VR0 is left, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR1 is middle, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR2 is right, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR3 is left, the video content attribute is loudest speaker, the maximum video bandwidth in the video rendering decoding information is 4M, and the automatic switching attribute is YES;
the video rendering spatial information with rendering identifier VR4 is middle, the video content attribute is panoramic video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR5 is right, the video content attribute is VIP, and the maximum video bandwidth in the video rendering decoding information is 4M. The audio parameter in the rendering-related parameters of the reception capability set is: the audio content with rendering identifier AR0 is main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo. The general parameters in the rendering-related parameters of the reception capability set are: the scene with scene identifier 1 consists of VR0, VR1, and VR2, the scene description is rendering left, middle, and right video, the codec standard is H.264, and the maximum bandwidth is 12M; the scene with scene identifier 2 consists of VR3, VR4, and VR5, the scene description is rendering loudest speaker, panorama, and VIP video, the codec standard is H.264, and the maximum bandwidth is 12M; the scene with scene identifier 3 consists of AR0, the scene description is rendering main audio, the codec standard is G.711, and the maximum bandwidth is 128K. Step S1704: Endpoint B replies to endpoint A with a capability set interaction response message. Step S1706: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A; the message carries the reception capability set of endpoint B: the video rendering spatial information with rendering identifier VR0 is left, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR1 is middle, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR2 is right, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the audio content with rendering identifier AR0 is main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo. The general parameters in the rendering-related parameters of the reception capability set are: the scene with scene identifier 1 consists of VR0, VR1, and VR2, the scene description is rendering left, middle, and right video, the codec standard is H.264, and the maximum bandwidth is 12M; the scene with scene identifier 2 consists of AR0, the scene description is rendering main audio, the codec standard is G.711, and the maximum bandwidth is 128K. Step S1708: Endpoint A replies to endpoint B with a capability set interaction response message. Step S1710: Based on the receiving capability of endpoint A and its own sending capability, endpoint B decides to send scene 1 and scene 3 of the reception capability set of endpoint A.
Endpoint B sends an open logical channel request message to endpoint A, specifying the channel attributes and requesting to open the forward logical channel from B to A. The specified logical channel attributes are: one video logical channel carries three RTP video streams, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR0, VR1, and VR2 video; the coding-related information is H.264 encoding with a maximum bandwidth of 12M; the multiplexing information is that the RTP header extension identifier values corresponding to VR0, VR1, and VR2 are 0, 1, and 2 respectively, used to distinguish the different RTP streams within the same logical channel. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S1712: Endpoint A replies to endpoint B with an open logical channel response message. Step S1714: Based on the receiving capability of endpoint B and its own sending capability, endpoint A decides to send scene 1 and scene 2 of the reception capability set of endpoint B. Endpoint A sends an open logical channel request message to endpoint B, specifying the channel attributes and requesting to open the forward logical channel from A to B. The specified logical channel attributes are: one video logical channel carries three RTP video streams, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR0, VR1, and VR2 video; the coding-related information is H.264 encoding with a maximum bandwidth of 12M; the multiplexing information is that the RTP header extension identifier values corresponding to VR0, VR1, and VR2 are 0, 1, and 2 respectively, used to distinguish the different RTP streams within the same logical channel. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S1716: Endpoint B replies to endpoint A with an open logical channel response message. Preferred Embodiment 8: This preferred embodiment is one of the preferred implementations of negotiation mode A-1 and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B. Endpoint A and endpoint B are telepresence video conferencing endpoints; endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker, and endpoint B has 1 camera, 1 display, 1 microphone, and 1 speaker. Endpoint A and/or endpoint B may also be an MCU device. The method includes the following steps S1802 to S1816. Step S1802: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B; the message carries the reception capability set of endpoint A. Preferably, the video parameters in the rendering-related parameters of the reception capability set are: the video rendering spatial information with rendering identifier VR0 is left, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR1 is middle, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M;
the video rendering spatial information with rendering identifier VR2 is right, the video content attribute is main video, and the maximum video bandwidth in the video rendering decoding information is 4M; the video rendering spatial information with rendering identifier VR3 is middle, the video content attribute is panoramic video, and the maximum video bandwidth in the video rendering decoding information is 4M. Preferably, the audio parameter in the rendering-related parameters of the reception capability set is: the audio content with rendering identifier AR0 is main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo. The general parameters in the rendering-related parameters of the reception capability set are: the scene with scene identifier 1 consists of VR0, VR1, and VR2, the scene description is rendering left, middle, and right video, the codec standard is H.264, and the maximum bandwidth is 12M; the scene with scene identifier 2 consists of VR3, the scene description is rendering panoramic video, the codec standard is H.264, and the maximum bandwidth is 4M; the scene with scene identifier 3 consists of AR0, the scene description is rendering main audio, the codec standard is G.711, and the maximum bandwidth is 128K. Step S1804: Endpoint B replies to endpoint A with a capability set interaction response message. Step S1806: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A; the message carries the reception capability set of endpoint B: the video rendering with rendering identifier VR0 is described as rendering panoramic video, the codec standard is H.264, and the maximum bandwidth is 4M.
The audio content with rendering identifier AR0 is main audio, the maximum bandwidth in the audio rendering decoding information is 128K, and the audio channel format is stereo. The general parameters in the rendering-related parameters of the reception capability set are: the scene with scene identifier 1 consists of VR0, the scene description is rendering panoramic video, the codec standard is H.264, and the maximum bandwidth is 4M; the scene with scene identifier 2 consists of AR0, the scene description is rendering main audio, the codec standard is G.711, and the maximum bandwidth is 128K. Step S1808: Endpoint A replies to endpoint B with a capability set interaction response message. Step S1810: Based on the receiving capability of endpoint A and its own sending capability, endpoint B decides to send scene 2 and scene 3 of the reception capability set of endpoint A. Endpoint B sends an open logical channel request message to endpoint A, specifying the channel attributes and requesting to open the forward logical channel from B to A. The specified logical channel attributes are: one video logical channel carries one RTP video stream, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR3 video; the coding-related information is H.264 encoding with a maximum bandwidth of 4M; there is no multiplexing information. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S1812: Endpoint A replies to endpoint B with an open logical channel response message. Step S1814: Based on the receiving capability of endpoint B and its own sending capability, endpoint A decides to send scene 1 and scene 2 of the reception capability set of endpoint B. Endpoint A sends an open logical channel request message to endpoint B, specifying the channel attributes and requesting to open the forward logical channel from A to B. The specified logical channel attributes are: one video logical channel carries one RTP video stream, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR0 video; the coding-related information is H.264 encoding with a maximum bandwidth of 12M; there is no multiplexing information. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S1816: Endpoint B replies to endpoint A with an open logical channel response message.
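The concrete capability sets exchanged in Preferred Embodiments 7 and 8 group individual video and audio renders into scenes, each render carrying a spatial position, a content attribute, and a maximum bandwidth. A rough data-model sketch of such a reception capability set is given below; the class and field names are illustrative assumptions, not the encoding used by this application, and the position values are placeholders.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Render:
        render_id: str        # e.g. "VR0" or "AR0"
        kind: str             # "video" or "audio"
        position: str         # e.g. "left", "middle", "right"
        content: str          # e.g. "main video", "panorama", "main audio"
        max_bandwidth_kbps: int

    @dataclass
    class Scene:
        scene_id: int
        description: str
        codec: str            # e.g. "H.264" or "G.711"
        max_bandwidth_kbps: int
        renders: List[Render] = field(default_factory=list)

    # A reception capability set loosely modelled on endpoint B in Preferred
    # Embodiment 8; the render positions are assumptions for illustration.
    endpoint_b_receive_set = [
        Scene(1, "render panoramic video", "H.264", 4000,
              [Render("VR0", "video", "middle", "panorama", 4000)]),
        Scene(2, "render main audio", "G.711", 128,
              [Render("AR0", "audio", "middle", "main audio", 128)]),
    ]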
Preferred Embodiment 9: This preferred embodiment is one of the preferred implementations of negotiation mode A-2 and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B. Endpoint A and endpoint B are telepresence video conferencing endpoints; endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker, and endpoint B has 3 cameras, 3 displays, 1 microphone, and 1 speaker. Endpoint A and/or endpoint B may also be an MCU device. The preferred embodiment includes the following steps S1902 to S1916. Step S1902: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B; the message carries the symmetric capability set of endpoint A. The video parameters in the rendering/capture-related parameters of the symmetric capability set are: the video rendering spatial information with rendering/capture identifier VR0 is left, the video content attribute is main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M; the video rendering spatial information with rendering/capture identifier VR1 is middle, the video content attribute is main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M; the video rendering spatial information with rendering/capture identifier VR2 is right, the video content attribute is main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M. The audio parameter in the rendering/capture-related parameters is: the audio content with rendering/capture identifier AR0 is main audio, the maximum bandwidth in the audio rendering/capture codec information is 128K, and the audio channel format is stereo. The general parameters in the rendering/capture-related parameters of the symmetric capability set are: the scene with scene identifier 1 consists of VR0, VR1, and VR2, the scene description is rendering left, middle, and right video, the codec standard is H.264, and the maximum bandwidth is 12M; the scene with scene identifier 2 consists of VR3, the scene description is rendering panoramic video, the codec standard is H.264, and the maximum bandwidth is 4M; the scene with scene identifier 3 consists of AR0, the scene description is rendering main audio, the codec standard is G.711, and the maximum bandwidth is 128K. Step S1904: Endpoint B replies to endpoint A with a capability set interaction response message. Step S1906: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A; the message carries the symmetric capability set of endpoint B: the video rendering/capture spatial information with rendering/capture identifier VR0 is left, the video content attribute is main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M; the video rendering/capture spatial information with rendering/capture identifier VR1 is middle, the video content attribute is main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M; the video rendering/capture spatial information with rendering/capture identifier VR2 is right, the video content attribute is main video, and the maximum video bandwidth in the video rendering/capture codec information is 4M; the audio content with rendering/capture identifier AR0 is main audio, the maximum bandwidth in the audio rendering/capture codec information is 128K, and the audio channel format is stereo. The general parameters in the rendering/capture-related parameters are: the scene with scene identifier 1 consists of VR0, VR1, and VR2, the scene description is rendering/capturing left, middle, and right video, the codec standard is H.264, and the maximum bandwidth is 12M.
The scene with scene identifier 2 consists of AR0, the scene description is rendering/capturing main audio, the codec standard is G.711, and the maximum bandwidth is 128K. Step S1908: Endpoint A replies to endpoint B with a capability set interaction response message. Step S1910: Based on the symmetric capability of endpoint A and its own capabilities, endpoint B decides to send scene 1 and scene 3 of the reception capability set of endpoint A. Endpoint B sends an open logical channel request message to endpoint A, specifying the channel attributes and requesting to open the forward logical channel from B to A. The specified logical channel attributes are: one video logical channel carries three RTP video streams, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR0, VR1, and VR2 video; the coding-related information is H.264 encoding with a maximum bandwidth of 12M; the multiplexing information is that the RTP header extension identifier values corresponding to VR0, VR1, and VR2 are 0, 1, and 2 respectively, used to distinguish the different RTP streams within the same logical channel. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S1912: Endpoint A replies to endpoint B with an open logical channel response message. Step S1914: Based on the symmetric capability of endpoint B and its own sending capability, endpoint A decides to send scene 1 and scene 2 of the reception capability set of endpoint B. Endpoint A sends an open logical channel request message to endpoint B, specifying the channel attributes and requesting to open the forward logical channel from A to B. The specified logical channel attributes are: one video logical channel carries three RTP video streams, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR0, VR1, and VR2 video; the coding-related information is H.264 encoding with a maximum bandwidth of 12M; the multiplexing information is that the RTP header extension identifier values corresponding to VR0, VR1, and VR2 are 0, 1, and 2 respectively, used to distinguish the different RTP streams within the same logical channel. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S1916: Endpoint B replies to endpoint A with an open logical channel response message.
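Several of the channel attributes above multiplex three RTP video streams into a single logical channel and distinguish them by RTP header extension identifier values 0, 1, and 2 for VR0, VR1, and VR2. The sketch below shows one way a receiver might demultiplex on such an identifier; the packet representation with an "ext_id" field is an assumption for illustration, not the actual wire format.

    # Map from RTP header-extension identifier value to the render it carries,
    # as specified in the open logical channel request above.
    STREAM_BY_EXTENSION_ID = {0: "VR0", 1: "VR1", 2: "VR2"}

    def demultiplex(packets):
        """Group packets of one logical channel by the stream identifier carried
        in each packet's (illustrative) 'ext_id' field."""
        streams = {name: [] for name in STREAM_BY_EXTENSION_ID.values()}
        for pkt in packets:
            name = STREAM_BY_EXTENSION_ID.get(pkt["ext_id"])
            if name is not None:
                streams[name].append(pkt["payload"])
        return streams

    if __name__ == "__main__":
        channel = [{"ext_id": 0, "payload": b"left frame"},
                   {"ext_id": 2, "payload": b"right frame"},
                   {"ext_id": 1, "payload": b"middle frame"}]
        assert demultiplex(channel)["VR2"] == [b"right frame"]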
Preferred Embodiment 10: This preferred embodiment is one of the preferred implementations of negotiation mode A-3-1 and describes a specific capability negotiation procedure between the remote presentation endpoint A and the remote presentation endpoint B. Endpoint A and endpoint B are telepresence video conferencing endpoints; endpoint A has 3 cameras, 3 displays, 1 microphone, and 1 speaker, and endpoint B has 1 camera, 1 display, 1 microphone, and 1 speaker. Endpoint A and/or endpoint B may also be an MCU device. The preferred embodiment includes the following steps S2002 to S2016. Step S2002: The remote presentation endpoint A initiates a capability set interaction request to the remote presentation endpoint B; the message carries the transmission capability set of endpoint A. The video parameters in the capture-related parameters of the transmission capability set are: the video capture spatial information with capture identifier VR0 is left, the video content attribute is main video, and the maximum video bandwidth in the video capture coding information is 4M; the video capture spatial information with capture identifier VR1 is middle, the video content attribute is main video, and the maximum video bandwidth in the video capture coding information is 4M; the video capture spatial information with capture identifier VR2 is right, the video content attribute is main video, and the maximum video bandwidth in the video capture coding information is 4M; the video content attribute of the capture with capture identifier VR3 is panorama, and the maximum video bandwidth in the video capture coding information is 4M.
The audio parameter in the capture-related parameters is: the audio content with capture identifier AR0 is main audio, the maximum bandwidth in the audio capture coding information is 128K, and the audio channel format is stereo. The general parameters in the capture-related parameters of the transmission capability set are: the scene with scene identifier 1 consists of VR0, VR1, and VR2, the scene description is capturing left, middle, and right video, the coding standard is H.264, and the maximum bandwidth is 12M; the scene with scene identifier 2 consists of VR3, the scene description is capturing panoramic video, the coding standard is H.264, and the maximum bandwidth is 4M; the scene with scene identifier 3 consists of AR0, the scene description is capturing main audio, the coding standard is G.711, and the maximum bandwidth is 128K.
Step S2004: Endpoint B replies to endpoint A with a capability set interaction response message, carrying the parameters that B has selected from the transmission capability set of A: the media in scene 2 and scene 3. Step S2006: The remote presentation endpoint B initiates a capability set interaction request to the remote presentation endpoint A; the message carries the transmission capability set of endpoint B. The video parameter in the capture-related parameters of the transmission capability set is: the capture with capture identifier VR0 is described as capturing panoramic video, the coding standard is H.264, and the maximum bandwidth is 4M; the audio content with capture identifier AR0 is main audio, the maximum bandwidth in the audio capture coding information is 128K, and the audio channel format is stereo. The general parameters in the capture-related parameters of the transmission capability set are: the scene with scene identifier 1 consists of VR0, the scene description is capturing panoramic video, the coding standard is H.264, and the maximum bandwidth is 4M; the scene with scene identifier 2 consists of AR0, the scene description is capturing main audio, the coding standard is G.711, and the maximum bandwidth is 128K. Step S2008: Endpoint A replies to endpoint B with a capability set interaction response message, carrying the parameters that A has selected from the transmission capability set of B: the media in scene 1 and scene 2. Step S2010: According to the capabilities selected by endpoint B in the capability set interaction response, endpoint A sends an open logical channel request message to endpoint B, specifying the channel attributes and requesting to open the forward logical channel from A to B. The specified channel attributes are: one video logical channel carries one RTP video stream, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR3 video; the coding-related information is H.264 encoding with a maximum bandwidth of 4M; there is no multiplexing information. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S2012: Endpoint B replies to endpoint A with an open logical channel response message. Step S2014: According to the capabilities selected by endpoint A in the capability set interaction response, endpoint B sends an open logical channel request message to endpoint A, specifying the channel attributes and requesting to open the forward logical channel from B to A: one video logical channel carries one RTP video stream, and one audio logical channel carries one RTP audio stream. The media-related information of the video channel is that it sends the VR0 video; the coding-related information is H.264 encoding with a maximum bandwidth of 12M; there is no multiplexing information. The media-related information of the audio channel is that it sends the AR0 audio; the coding-related information is G.711 with a maximum bandwidth of 128K. Step S2016: Endpoint A replies to endpoint B with an open logical channel response message. It should be noted that corresponding embodiments of the other methods are similar to the embodiments described above.
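In this last embodiment the responder picks whole scenes out of the offerer's transmission capability set and returns that selection in the capability set interaction response; logical channels are then opened only for the selected scenes. A compact sketch of such a selection step follows; the scene representation and the bandwidth-budget policy are illustrative assumptions, not the selection rule defined by this application.

    def select_scenes(offered_scenes, wanted_content, bandwidth_budget_kbps):
        """Choose scenes from the peer's transmission capability set whose
        description matches the wanted content, within a bandwidth budget."""
        chosen, remaining = [], bandwidth_budget_kbps
        for scene_id, (description, bandwidth) in sorted(offered_scenes.items()):
            if description in wanted_content and bandwidth <= remaining:
                chosen.append(scene_id)
                remaining -= bandwidth
        return chosen

    if __name__ == "__main__":
        # Offer loosely modelled on endpoint A in the embodiment above:
        # scene 1 = left/middle/right video, scene 2 = panorama, scene 3 = main audio.
        offer = {1: ("capture left, middle and right video", 12000),
                 2: ("capture panoramic video", 4000),
                 3: ("capture main audio", 128)}
        print(select_scenes(offer,
                            {"capture panoramic video", "capture main audio"},
                            bandwidth_budget_kbps=8000))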
Obviously, those skilled in the art should understand that the above modules or steps of the present invention may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in an order different from that given herein; alternatively, they may be fabricated into individual integrated circuit modules, or multiple modules or steps among them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software. The above is only the preferred embodiments of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included within the protection scope of the present invention. INDUSTRIAL APPLICABILITY: The technical solutions provided by the embodiments of the present invention can be applied to the field of communications, implement capability interaction between remote presentation terminals, and improve the user experience.

Claims

1. A capability interaction method for a remote presentation endpoint, comprising:
performing a capability interaction between a first remote presentation endpoint and a second remote presentation endpoint, wherein the message of the capability interaction carries a remote presentation endpoint capability set, and wherein the message of the capability interaction carries a transmission capability set of the remote presentation endpoint;
the first remote presentation endpoint receiving a mode request message from the second remote presentation endpoint; and the first remote presentation endpoint opening a logical channel between the first remote presentation endpoint and the second remote presentation endpoint according to a result of the capability interaction and the received mode request information.
2. The method according to claim 1, wherein performing the capability interaction between the first remote presentation endpoint and the second remote presentation endpoint comprises:
the first remote presentation endpoint sending a first capability set interaction request to the second remote presentation endpoint, wherein the first capability set interaction request carries a first transmission capability set of the first remote presentation endpoint;
the first remote presentation endpoint receiving a second capability set interaction request sent by the second remote presentation endpoint, wherein the second capability set interaction request carries a second transmission capability set of the second remote presentation endpoint;
the first remote presentation endpoint receiving a first mode request message sent by the second remote presentation endpoint, wherein the first mode request message carries transmission parameters of the first remote presentation endpoint; the first remote presentation endpoint sending a second mode request message to the second remote presentation endpoint, wherein the second mode request message carries transmission parameters of the second remote presentation endpoint; the first remote presentation endpoint sending, according to a result of the mode request process corresponding to the second mode request message, a first logical channel request to the second remote presentation endpoint, wherein the first logical channel request is used to request the first remote presentation endpoint to open a forward logical channel from the first remote presentation endpoint to the second remote presentation endpoint;
the first remote presentation endpoint receiving a second logical channel request sent by the second remote presentation endpoint, wherein the second logical channel request is determined by the second remote presentation endpoint according to a result of the mode request process corresponding to the first mode request message, and the second logical channel request is used to request the second remote presentation endpoint to open a forward logical channel from the second remote presentation endpoint to the first remote presentation endpoint.
3. The method according to claim 1, wherein performing the capability interaction between the first remote presentation endpoint and the second remote presentation endpoint comprises:
the first remote presentation endpoint sending a third capability set interaction request to the second remote presentation endpoint, wherein the third capability set interaction request carries the first transmission capability set of the first remote presentation endpoint;
the first remote presentation endpoint receiving a third mode request message sent by the second remote presentation endpoint, wherein the third mode request message carries transmission parameters of the first remote presentation endpoint; the first remote presentation endpoint receiving a fourth capability set interaction request sent by the second remote presentation endpoint, wherein the fourth capability set interaction request carries the second transmission capability set of the second remote presentation endpoint;
the first remote presentation endpoint sending a fourth mode request message to the second remote presentation endpoint, wherein the fourth mode request message carries transmission parameters of the second remote presentation endpoint; the first remote presentation endpoint sending, according to a result of the mode request process corresponding to the fourth mode request message, a third logical channel request to the second remote presentation endpoint, wherein the third logical channel request is used to request the first remote presentation endpoint to open a forward logical channel from the first remote presentation endpoint to the second remote presentation endpoint;
the first remote presentation endpoint receiving a fourth logical channel request sent by the second remote presentation endpoint, wherein the fourth logical channel request is determined by the second remote presentation endpoint according to a result of the mode request process corresponding to the first mode request message, and the fourth logical channel request is used to request the second remote presentation endpoint to open a forward logical channel from the second remote presentation endpoint to the first remote presentation endpoint.
4. The method according to claim 2 or 3, wherein after the first remote presentation endpoint sends the first capability set interaction request to the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the first capability set interaction request;

and after the first remote presentation endpoint sends the third capability set interaction request to the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the third capability set interaction request.

5. The method according to claim 2 or 3, wherein after the first remote presentation endpoint receives the second capability set interaction request sent by the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the second capability set interaction request;

and after the first remote presentation endpoint receives the fourth capability set interaction request sent by the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the fourth capability set interaction request.
6. The method according to claim 2 or 3, wherein after the first remote presentation endpoint receives the first mode request message sent by the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the first mode request message;

and after the first remote presentation endpoint receives the third mode request message sent by the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the third mode request message.
7. The method according to claim 2 or 3, wherein after the first remote presentation endpoint sends the second mode request message to the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the second mode request message;

and after the first remote presentation endpoint sends the fourth mode request message to the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the fourth mode request message.
8. The method according to claim 2 or 3, wherein after the first remote presentation endpoint sends, according to the result of the mode request procedure corresponding to the second mode request message, the first logical channel request to the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the first logical channel request;

and after the first remote presentation endpoint sends, according to the result of the mode request procedure corresponding to the fourth mode request message, the third logical channel request to the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint receiving a response message, sent by the second remote presentation endpoint, corresponding to the third logical channel request.
9. The method according to claim 2 or 3, wherein after the first remote presentation endpoint receives the second logical channel request sent by the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the second logical channel request;

and after the first remote presentation endpoint receives the fourth logical channel request sent by the second remote presentation endpoint, the method further comprises: the first remote presentation endpoint sending, to the second remote presentation endpoint, a response message corresponding to the fourth logical channel request.
10. The method according to any one of claims 1 to 3, wherein the remote presentation endpoint transmission capability set comprises capture parameters, and the capture parameters comprise general parameters, video parameters and/or audio parameters.
11. The method according to claim 10, wherein the general parameters comprise media capture content, a scene description, a scene switching policy, general spatial information and/or general encoding information, wherein the media capture content indicates the purpose of a media capture, and the scene description is used to provide a description of the overall scene.
12. The method according to claim 11, wherein the scene switching policy is used to indicate the supported media switching policies, wherein the supported media switching policies comprise a site switching policy and a partial switching policy, wherein the site switching policy is used to indicate that all captures are switched simultaneously, so as to ensure that the captures come from the same endpoint site, and the partial switching policy is used to indicate that different captures can be switched at different times and may come from the same and/or different remote presentation endpoints.
13. The method according to claim 11, wherein the general spatial information comprises a scene area parameter and/or an area scale parameter, wherein the scene area parameter is used to indicate the extent of the overall scene associated with an endpoint, and the area scale parameter indicates the kind of scale used by the spatial information parameters.
14. The method according to claim 11, wherein the general encoding information comprises an overall maximum bandwidth, an overall maximum number of pixels per second and/or an overall maximum number of macroblocks per second, wherein the overall maximum bandwidth is used to indicate the maximum number of bits per second of all streams of a preset type sent by the endpoint, the overall maximum number of pixels per second is used to indicate the maximum number of pixels per second of all independent encodings in an encoding group, and the overall maximum number of macroblocks per second indicates the maximum number of macroblocks per second of all video streams sent by the endpoint.
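As a worked illustration of how the aggregate limits of claim 14 might be checked in practice, the sketch below compares a proposed set of outgoing video streams against an overall maximum bandwidth, pixel rate and macroblock rate. The GeneralEncodingInfo and VideoStreamPlan names, the 16x16-pixel macroblock assumption and all numeric values are assumptions made for this example, not values taken from the application.

```python
from dataclasses import dataclass

@dataclass
class GeneralEncodingInfo:              # hypothetical container for the claim-14 parameters
    max_total_bandwidth_bps: int        # overall maximum bandwidth, bits per second
    max_total_pixels_per_sec: int       # overall maximum number of pixels per second
    max_total_macroblocks_per_sec: int  # overall maximum number of macroblocks per second

@dataclass
class VideoStreamPlan:
    bandwidth_bps: int
    width: int
    height: int
    frame_rate: float

def fits_aggregate_limits(streams, limits):
    """Return True if the combined streams stay within the advertised aggregate limits."""
    total_bw = sum(s.bandwidth_bps for s in streams)
    total_px = sum(s.width * s.height * s.frame_rate for s in streams)
    # A macroblock is taken here as a 16x16 pixel block (an assumption of this sketch).
    total_mb = sum((s.width // 16) * (s.height // 16) * s.frame_rate for s in streams)
    return (total_bw <= limits.max_total_bandwidth_bps
            and total_px <= limits.max_total_pixels_per_sec
            and total_mb <= limits.max_total_macroblocks_per_sec)

# Example: three 1080p30 streams checked against assumed limits.
limits = GeneralEncodingInfo(12_000_000, 200_000_000, 800_000)
plan = [VideoStreamPlan(4_000_000, 1920, 1080, 30.0)] * 3
print(fits_aggregate_limits(plan, limits))   # True under these assumed numbers
```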
15. The method according to claim 10, wherein the video parameters comprise: a video capture quantity, video capture spatial information and/or video capture encoding information, wherein the video capture quantity is used to indicate the number of video captures.
16. The method according to claim 15, wherein the video capture spatial information comprises a capture area, a capture point and/or a point on a capture line, wherein the capture area is used to indicate the spatial position of the video capture within the overall capture scene, the capture point is used to indicate the position of the video capture in the capture scene, and the point on the capture line describes the spatial position of a second point on the optical axis of the capture device, the first point being the capture point.
17. The method according to claim 15, wherein the video capture encoding information comprises a maximum video bandwidth, a maximum number of pixels per second, a maximum video resolution width, a maximum video resolution height and/or a maximum video frame rate, wherein the maximum video bandwidth is used to indicate the maximum number of bits per second of a single video encoding, the maximum number of pixels per second is used to indicate the maximum number of pixels per second of a single video encoding, the maximum video resolution width is used to indicate the width of the maximum video resolution in pixels, the maximum video resolution height is used to indicate the height of the maximum video resolution in pixels, and the maximum video frame rate indicates the maximum video frame rate.
18. The method according to claim 10, wherein the audio parameters comprise an audio capture quantity, audio capture spatial information and/or audio capture encoding information, wherein the audio capture quantity is used to indicate the number of audio captures.
19. The method according to claim 18, wherein the audio capture spatial information comprises: a capture area and/or a capture point, wherein the capture area is used to indicate the spatial position of the audio capture within the overall capture scene, and the capture point is used to indicate the position of the audio capture in the capture scene.
20. The method according to claim 18, wherein the audio capture encoding information comprises: an audio channel format and/or a maximum audio bandwidth, wherein the audio channel format is used to indicate an attribute of an audio channel, and the maximum audio bandwidth is used to indicate the maximum number of bits per second of a single audio encoding.
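The parameter hierarchy recited in claims 10 to 20 can be pictured as the nested structure sketched below. The class and field names are illustrative only; the claims do not prescribe any particular data layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class GeneralParameters:
    media_capture_content: str            # purpose of the media capture
    scene_description: str                # description of the overall scene
    scene_switching_policy: str           # "site" (all captures switch together) or "partial"
    scene_area: Optional[dict] = None     # extent of the overall scene associated with the endpoint
    area_scale: Optional[str] = None      # kind of scale used by the spatial parameters
    encoding_info: Optional[dict] = None  # overall bandwidth / pixel / macroblock limits

@dataclass
class VideoCapture:
    capture_area: dict                    # spatial position within the overall capture scene
    capture_point: dict                   # position of the capture in the scene
    point_on_capture_line: dict           # second point on the capture device's optical axis
    encoding_info: dict                   # max bandwidth, pixels/s, resolution, frame rate

@dataclass
class AudioCapture:
    capture_area: dict
    capture_point: dict
    channel_format: str                   # attribute of the audio channel, e.g. "mono" or "stereo"
    max_audio_bandwidth_bps: int          # max bits per second of a single audio encoding

@dataclass
class CaptureParameters:
    general: GeneralParameters
    video_captures: List[VideoCapture] = field(default_factory=list)   # length = video capture quantity
    audio_captures: List[AudioCapture] = field(default_factory=list)   # length = audio capture quantity

@dataclass
class TransmissionCapabilitySet:
    capture_parameters: CaptureParameters
```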
21. A capability interaction apparatus for a remote presentation endpoint, applied to a first remote presentation endpoint, comprising:

an interaction module, configured to perform capability interaction with a second remote presentation endpoint, wherein a message of the capability interaction carries a remote presentation endpoint capability set, and the message of the capability interaction carries a remote presentation endpoint transmission capability set;

a first receiving module, configured to receive a mode request message of the second remote presentation endpoint; and a processing module, configured to open a logical channel between the first remote presentation endpoint and the second remote presentation endpoint according to a result of the capability interaction and the received mode request information.
22. The apparatus according to claim 21, wherein the interaction module comprises: a first sending module, configured to send a first capability set interaction request to the second remote presentation endpoint, wherein the first capability set interaction request carries a first transmission capability set of the first remote presentation endpoint; a second receiving module, configured to receive a second capability set interaction request sent by the second remote presentation endpoint, wherein the second capability set interaction request carries a second transmission capability set of the second remote presentation endpoint;

a third receiving module, configured to receive a first mode request message sent by the second remote presentation endpoint, wherein the first mode request message carries transmission parameters of the first remote presentation endpoint; a second sending module, configured to send a second mode request message to the second remote presentation endpoint, wherein the second mode request message carries transmission parameters of the second remote presentation endpoint; a third sending module, configured to send, according to a result of the mode request procedure corresponding to the second mode request message, a first logical channel request to the second remote presentation endpoint, wherein the first logical channel request is used to request that the first remote presentation endpoint open a forward logical channel from the first remote presentation endpoint to the second remote presentation endpoint;

and a fourth receiving module, configured to receive a second logical channel request sent by the second remote presentation endpoint, wherein the second logical channel request is determined by the second remote presentation endpoint according to a result of the mode request procedure corresponding to the first mode request message, and the second logical channel request is used to request that the second remote presentation endpoint open a forward logical channel from the second remote presentation endpoint to the first remote presentation endpoint.
23. The apparatus according to claim 21, wherein the interaction module comprises:

a fourth sending module, configured to send a third capability set interaction request to the second remote presentation endpoint, wherein the third capability set interaction request carries a first transmission capability set of the first remote presentation endpoint;

a fifth receiving module, configured to receive a third mode request message sent by the second remote presentation endpoint, wherein the third mode request message carries transmission parameters of the first remote presentation endpoint; a sixth receiving module, configured to receive a fourth capability set interaction request sent by the second remote presentation endpoint, wherein the fourth capability set interaction request carries a second transmission capability set of the second remote presentation endpoint;

a fifth sending module, configured to send a fourth mode request message to the second remote presentation endpoint, wherein the fourth mode request message carries transmission parameters of the second remote presentation endpoint; a sixth sending module, configured to send, according to a result of the mode request procedure corresponding to the fourth mode request message, a third logical channel request to the second remote presentation endpoint, wherein the third logical channel request is used to request that the first remote presentation endpoint open a forward logical channel from the first remote presentation endpoint to the second remote presentation endpoint; and a seventh receiving module, configured to receive a fourth logical channel request sent by the second remote presentation endpoint, wherein the fourth logical channel request is determined by the second remote presentation endpoint according to a result of the mode request procedure corresponding to the third mode request message, and the fourth logical channel request is used to request that the second remote presentation endpoint open a forward logical channel from the second remote presentation endpoint to the first remote presentation endpoint.
24. The apparatus according to claim 22 or 23, further comprising: an eighth receiving module, configured to receive, after the first sending module sends the first capability set interaction request to the second remote presentation endpoint, a response message, sent by the second remote presentation endpoint, corresponding to the first capability set interaction request;

and a ninth receiving module, configured to receive, after the fourth sending module sends the third capability set interaction request to the second remote presentation endpoint, a response message, sent by the second remote presentation endpoint, corresponding to the third capability set interaction request.
25. The apparatus according to claim 22 or 23, further comprising:

a seventh sending module, configured to send, after the second receiving module receives the second capability set interaction request sent by the second remote presentation endpoint, a response message corresponding to the second capability set interaction request to the second remote presentation endpoint;

and an eighth sending module, configured to send, after the sixth receiving module receives the fourth capability set interaction request sent by the second remote presentation endpoint, a response message corresponding to the fourth capability set interaction request to the second remote presentation endpoint.
26. The apparatus according to claim 22 or 23, further comprising:

a ninth sending module, configured to send, after the third receiving module receives the first mode request message sent by the second remote presentation endpoint, a response message corresponding to the first mode request message to the second remote presentation endpoint;

and a tenth sending module, configured to send, after the fifth receiving module receives the third mode request message sent by the second remote presentation endpoint, a response message corresponding to the third mode request message to the second remote presentation endpoint.
27. The apparatus according to claim 22 or 23, further comprising:

an eleventh sending module, configured to send, after the second sending module sends the second mode request message to the second remote presentation endpoint, a response message corresponding to the second mode request message to the second remote presentation endpoint; and a twelfth sending module, configured to send, after the fifth sending module sends the fourth mode request message to the second remote presentation endpoint, a response message corresponding to the fourth mode request message to the second remote presentation endpoint.
28. The apparatus according to claim 22 or 23, further comprising: a tenth receiving module, configured to receive, after the third sending module sends, according to the result of the mode request procedure corresponding to the second mode request message, the first logical channel request to the second remote presentation endpoint, a response message, sent by the second remote presentation endpoint, corresponding to the first logical channel request;

and an eleventh receiving module, configured to receive, after the sixth sending module sends, according to the result of the mode request procedure corresponding to the fourth mode request message, the third logical channel request to the second remote presentation endpoint, a response message, sent by the second remote presentation endpoint, corresponding to the third logical channel request.
29. The apparatus according to claim 22 or 23, further comprising:

a thirteenth sending module, configured to send, after the fourth receiving module receives the second logical channel request sent by the second remote presentation endpoint, a response message corresponding to the second logical channel request to the second remote presentation endpoint;

and a fourteenth sending module, configured to send, after the seventh receiving module receives the fourth logical channel request sent by the second remote presentation endpoint, a response message corresponding to the fourth logical channel request to the second remote presentation endpoint.
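Purely as an implementation sketch, the module decomposition of claims 21 and 22 could be mirrored in software roughly as follows. The class and method names, the transport object and the dictionary-based message keys are hypothetical; the claims leave the transport and the message encoding unspecified.

```python
class InteractionModule:
    """Sketch of the interaction module of claims 21-22 (capability set exchange)."""
    def __init__(self, transport, own_capability_set):
        self.transport = transport                 # assumed send()/receive() transport
        self.own_capability_set = own_capability_set
        self.peer_capability_set = None

    def exchange_capability_sets(self):
        # First sending module: send this endpoint's transmission capability set.
        self.transport.send({"capabilitySetRequest": self.own_capability_set})
        # Second receiving module: receive the peer's transmission capability set.
        self.peer_capability_set = self.transport.receive("capabilitySetRequest")
        return self.peer_capability_set

class CapabilityInteractionApparatus:
    """Sketch of the claim-21 apparatus applied to the first remote presentation endpoint."""
    def __init__(self, transport, own_capability_set):
        self.transport = transport
        self.interaction = InteractionModule(transport, own_capability_set)

    def run(self):
        peer_caps = self.interaction.exchange_capability_sets()   # interaction module
        mode_request = self.transport.receive("modeRequest")      # first receiving module
        # Processing module: open the logical channel according to the result of the
        # capability interaction and the received mode request information.
        self.transport.send({"openLogicalChannel": {
            "capabilities": peer_caps, "mode": mode_request}})
```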
30. A data stream, comprising: a remote presentation endpoint capability set, wherein the remote presentation endpoint capability set comprises a transmission capability set, the transmission capability set comprises capture parameters, and the capture parameters comprise general parameters, video parameters and/or audio parameters.
31. The data stream according to claim 30, wherein the general parameters comprise media capture content, a scene description, a scene switching policy, general spatial information and/or general encoding information, wherein the media capture content indicates the purpose of a media capture, and the scene description is used to provide a description of the overall scene.

32. The data stream according to claim 31, wherein the scene switching policy is used to indicate the supported media switching policies, wherein the supported media switching policies comprise a site switching policy and a partial switching policy, wherein the site switching policy is used to indicate that all captures are switched simultaneously, so as to ensure that the captures come from the same endpoint site, and the partial switching policy is used to indicate that different captures can be switched at different times and may come from the same and/or different remote presentation endpoints.

33. The data stream according to claim 31, wherein the general spatial information comprises a scene area parameter and/or an area scale parameter, wherein the scene area parameter is used to indicate the extent of the overall scene associated with an endpoint, and the area scale parameter indicates the kind of scale used by the spatial information parameters.

34. The data stream according to claim 31, wherein the general encoding information comprises an overall maximum bandwidth, an overall maximum number of pixels per second and/or an overall maximum number of macroblocks per second, wherein the overall maximum bandwidth is used to indicate the maximum number of bits per second of all streams of a preset type sent by the endpoint, the overall maximum number of pixels per second is used to indicate the maximum number of pixels per second of all independent encodings in an encoding group, and the overall maximum number of macroblocks per second indicates the maximum number of macroblocks per second of all video streams sent by the endpoint.

35. The data stream according to claim 30, wherein the video parameters comprise: a video capture quantity, video capture spatial information and/or video capture encoding information, wherein the video capture quantity is used to indicate the number of video captures.

36. The data stream according to claim 35, wherein the video capture spatial information comprises a capture area, a capture point and/or a point on a capture line, wherein the capture area is used to indicate the spatial position of the video capture within the overall capture scene, the capture point is used to indicate the position of the video capture in the capture scene, and the point on the capture line describes the spatial position of a second point on the optical axis of the capture device, the first point being the capture point.

37. The data stream according to claim 35, wherein the video capture encoding information comprises a maximum video bandwidth, a maximum number of pixels per second, a maximum video resolution width, a maximum video resolution height and/or a maximum video frame rate, wherein the maximum video bandwidth is used to indicate the maximum number of bits per second of a single video encoding, the maximum number of pixels per second is used to indicate the maximum number of pixels per second of a single video encoding, the maximum video resolution width is used to indicate the width of the maximum video resolution in pixels, the maximum video resolution height is used to indicate the height of the maximum video resolution in pixels, and the maximum video frame rate indicates the maximum video frame rate.

38. The data stream according to claim 30, wherein the audio parameters comprise an audio capture quantity, audio capture spatial information and/or audio capture encoding information, wherein the audio capture quantity is used to indicate the number of audio captures.

39. The data stream according to claim 38, wherein the audio capture spatial information comprises: a capture area and/or a capture point, wherein the capture area is used to indicate the spatial position of the audio capture within the overall capture scene, and the capture point is used to indicate the position of the audio capture in the capture scene.
40. The data stream according to claim 38, wherein the audio capture encoding information comprises: an audio channel format and/or a maximum audio bandwidth, wherein the audio channel format is used to indicate an attribute of an audio channel, and the maximum audio bandwidth is used to indicate the maximum number of bits per second of a single audio encoding.
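As one possible concrete rendering of the data stream of claims 30 to 40, the capability set could be serialized as shown below. The JSON layout and every key name and value are assumptions made for illustration; the claims do not prescribe any particular encoding.

```python
import json

# Illustrative serialization of a remote presentation endpoint capability set.
# The structure mirrors claims 30-40; none of the key names are normative.
capability_set = {
    "transmissionCapabilitySet": {
        "captureParameters": {
            "general": {
                "mediaCaptureContent": "main",
                "sceneDescription": "three-camera telepresence room",
                "sceneSwitchingPolicy": "site",          # or "partial"
                "sceneArea": {"width_m": 6.0, "depth_m": 3.0},
                "areaScale": "millimeters",
                "encodingInfo": {
                    "maxTotalBandwidthBps": 12_000_000,
                    "maxTotalPixelsPerSec": 200_000_000,
                    "maxTotalMacroblocksPerSec": 800_000,
                },
            },
            "videoCaptures": [
                {
                    "captureArea": {"left_mm": 0, "right_mm": 2000},
                    "capturePoint": {"x_mm": 1000, "y_mm": 0, "z_mm": 1200},
                    "pointOnCaptureLine": {"x_mm": 1000, "y_mm": 500, "z_mm": 1200},
                    "encodingInfo": {
                        "maxVideoBandwidthBps": 4_000_000,
                        "maxPixelsPerSec": 62_208_000,
                        "maxWidth": 1920, "maxHeight": 1080, "maxFrameRate": 30,
                    },
                }
            ],
            "audioCaptures": [
                {
                    "captureArea": {"left_mm": 0, "right_mm": 6000},
                    "capturePoint": {"x_mm": 3000, "y_mm": 0, "z_mm": 1000},
                    "channelFormat": "stereo",
                    "maxAudioBandwidthBps": 128_000,
                }
            ],
        }
    }
}

print(json.dumps(capability_set, indent=2))
```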
PCT/CN2014/075201 2013-06-01 2014-04-11 Method and apparatus for remote display endpoint capability exchange, and data flow WO2014190811A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310213813.4 2013-06-01
CN201310213813.4A CN104219483B (en) 2013-06-01 2013-06-01 Method and apparatus for capability interaction of a telepresence endpoint

Publications (1)

Publication Number Publication Date
WO2014190811A1 (en) 2014-12-04

Family

ID=51987969

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/075201 WO2014190811A1 (en) 2013-06-01 2014-04-11 Method and apparatus for remote display endpoint capability exchange, and data flow

Country Status (2)

Country Link
CN (1) CN104219483B (en)
WO (1) WO2014190811A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09214561A (en) * 1996-01-31 1997-08-15 Canon Inc Multimedia communication system, inter-work equipment and multimedia communication terminal equipment used for the system
CN1620133A (en) * 2003-11-21 2005-05-25 华为技术有限公司 Method of implementing switching of single picture and multi-pictures in conference television system
CN101218579A (en) * 2005-07-11 2008-07-09 派克维迪奥公司 System and method for transferring data
CN101594512A (en) 2009-06-30 2009-12-02 中兴通讯股份有限公司 Terminal, multipoint control unit, system and method for realizing high-definition multi-picture
CN102868873A (en) * 2011-07-08 2013-01-09 中兴通讯股份有限公司 Remote presenting method, terminal and system
CN102883131A (en) * 2011-07-15 2013-01-16 中兴通讯股份有限公司 Signaling interaction method and device based on tele-presence system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NO20071401L (en) * 2007-03-16 2008-09-17 Tandberg Telecom As System and arrangement for lifelike video communication


Also Published As

Publication number Publication date
CN104219483A (en) 2014-12-17
CN104219483B (en) 2019-10-25

Similar Documents

Publication Publication Date Title
CN102868873B (en) A kind of remote presentation method, terminal and system
US9215416B2 (en) Method and system for switching between video streams in a continuous presence conference
KR100880150B1 (en) Multi-point video conference system and media processing method thereof
JP5345081B2 (en) Method and system for conducting resident meetings
KR101555855B1 (en) Method and system for conducting video conferences of diverse participating devices
US9344475B2 (en) Media transmission method and system based on telepresence
WO2010034254A1 (en) Video and audio processing method, multi-point control unit and video conference system
CN102204244A (en) Systems,methods, and media for providing cascaded multi-point video conferencing units
WO2007140668A1 (en) Method and apparatus for realizing remote monitoring in conference television system
CN110267064A (en) Audio broadcast state processing method, device, equipment and storage medium
WO2012175025A1 (en) Remotely presented conference system, method for recording and playing back remotely presented conference
WO2015127799A1 (en) Method and device for negotiating on media capability
WO2014176965A1 (en) Capability negotiation processing method and device, and telepresence endpoint
EP2557780A2 (en) Method and system for switching between video streams in a continuous presence conference
CN101489090B (en) Method, apparatus and system for multipath media stream transmission and reception
WO2014190809A1 (en) Method and apparatus for remote display endpoint capability exchange, and data flow
WO2014190811A1 (en) Method and apparatus for remote display endpoint capability exchange, and data flow
WO2014190808A1 (en) Method and apparatus for remote display endpoint capability exchange, and data flow
WO2014190812A1 (en) Method and apparatus for remote display endpoint capability exchange, and data flow
WO2014190810A1 (en) Method and apparatus for remote display endpoint capability exchange, and data flow
CN104519305A (en) Endpoint information interactive processing method, endpoint information interactive processing device and remote rendering endpoint
CN104270655A (en) Multi-point video converging system
WO2015018216A1 (en) Multi-content media communication method, device and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14804121

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14804121

Country of ref document: EP

Kind code of ref document: A1