CN115102927A - SIP (Session initiation protocol) talkback method, system and storage device for keeping video clear - Google Patents

Info

Publication number
CN115102927A
Authority
CN
China
Prior art keywords
video
receiving end
sip
host
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210468252.1A
Other languages
Chinese (zh)
Other versions
CN115102927B (en)
Inventor
庄宗辉
叶智鑫
卢刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Leelen Technology Co Ltd
Original Assignee
Xiamen Leelen Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Leelen Technology Co Ltd
Priority to CN202210468252.1A (patent CN115102927B)
Priority to PCT/CN2022/117773 (patent WO2023206910A1)
Publication of CN115102927A
Application granted
Publication of CN115102927B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2425 Traffic characterised by specific attributes, e.g. priority or QoS for supporting services specification, e.g. SLA
    • H04L47/2433 Allocation of priorities to traffic types
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1069 Session establishment or de-establishment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Security & Cryptography (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to an SIP (Session Initiation Protocol) talkback method, system and storage device for keeping video clear. After communication between an extension or mobile terminal and the host is established, the method performs the following steps: the sending end encodes and sends an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image; when the receiving end detects that RTP sequence numbers are discontinuous, or that two consecutive frames fail to decode during video decoding, it requests the sending end to retransmit the last IDR frame. When the receiving end detects packet loss, it notifies the sending end to encode the previous frame as an IDR frame and send it as the next IDR frame. This avoids the traditional approach of repeatedly requesting retransmission of the individual lost RTP packets; the receiving end can decode and play the IDR frame completely, so the user obtains a clear image.

Description

SIP (Session initiation protocol) talkback method, system and storage device for keeping video clear
Technical Field
The invention relates to the field of SIP (session initiation protocol) talkback, in particular to a SIP talkback method, a system and a storage device for keeping a video clear.
Background
In a building intercom system, multiple hosts and multiple extensions are installed in a local area network, SIP intercom software is installed on the user's mobile phone, and a host can call the extensions or the mobile phone over the network to realize SIP intercom. In practice, the host can call several extensions and several mobile phones in the user's home at the same time, and it needs to send early video to these extensions and mobile phones while they ring, so that each ringing extension or mobile phone receives and displays the host video.
In existing building intercom systems, network congestion causes snow, mosaic artifacts and similar defects in the user's video. The reason is that the receiving end sends RTCP NACK messages to the sending end to request retransmission of the corresponding RTP packets; when congestion is severe, the large number of retransmission requests aggravates the congestion and produces more snow or mosaic artifacts, as shown in Fig. 1.
The invention aims to design an SIP talkback method, a system and a storage device for keeping a video clear aiming at the problems in the prior art.
Disclosure of Invention
In view of the problems in the prior art, the present invention provides an SIP intercom method, system and storage device for keeping video clear, which can effectively solve the problems in the prior art.
The technical scheme of the invention is as follows:
an SIP talkback method for keeping video clear under the condition of network congestion, which comprises the following steps:
after communication between the extension or mobile terminal and the host, performing:
the sending end encodes and sends IDR frames to the receiving end every GOP time, and the receiving end decodes the IDR frames and outputs corresponding images;
when the receiving end detects that the RTP sequence number is discontinuous or two continuous frames fail to decode in the video decoding, the receiving end requests the sending end to retransmit the last IDR frame.
Further, the requesting the sending end to retransmit the last IDR frame includes:
the receiving end sends a packet loss retransmission message to the sending end;
and the sending end defines the last IDR frame as a next frame and sends the next frame to the receiving end.
Further, after the request for retransmission of the last IDR frame is sent to the sending end, the following steps are performed:
and after delaying the second time, the sending end codes again at intervals of GOP time and sends the IDR frame to the receiving end.
Further, after the communication between the extension or the mobile terminal and the host, performing:
the priority of transmitting audio is set to be greater than the priority of transmitting video.
Further, the setting of the priority of the transmission audio to be greater than the priority of the transmission video includes:
defining audio as a main axis and video as an auxiliary axis, and caching the audio and the video by a receiving end according to a first-in first-out principle;
and when the receiving end obtains the audio and/or the video every time, calculating the time difference value of the time stamp of the audio and the time stamp of the video in the current frame, and adjusting the playing speed of the video in the current frame according to the time difference value.
Further, the adjusting the playing speed of the video in the current frame according to the time difference value includes:
if the time difference is less than or equal to 250ms, the playing speed of the video in the current frame is not adjusted;
if the time difference value is greater than 250ms, the video is faster than the audio, and the buffer amount of the video is less than a first amount, slowing down or pausing the playing speed of the video in the current frame until the time difference value is less than or equal to 250 ms;
if the time difference value is greater than 250ms, the video is faster than the audio, and the buffer number of the video is greater than a first number, the playing speed of the video in the current frame is not adjusted;
if the time difference value is larger than 250ms and the video is slower than the audio, the playing speed of the video in the current frame is increased until the time difference value is smaller than or equal to 250 ms.
Further, the method is based on a one-call multi-SIP system that merges a local area network and a wide area network, characterized in that the system comprises a host, a plurality of extensions and a plurality of mobile terminals, wherein the host is connected with the extensions through the local area network, and the host is connected with the mobile terminals through the wide area network.
Further, before the communication between the extension or the mobile terminal and the host, performing:
establishing connection between a host and a plurality of extension sets through a local area network, establishing connection between the host and a plurality of mobile terminals through a wide area network, and registering the host and the mobile terminals to an SIP server;
when any extension or any mobile terminal initiates answering, the host sends a call hang-up instruction to other extensions or other mobile terminals, and communication between the corresponding extension or mobile terminal and the host is established.
Further provides an SIP intercom system for keeping clear video under the condition of network congestion, which comprises the following modules:
the encoding and sending module is used for encoding every GOP time by the sending end and sending the IDR frame to the receiving end, and the receiving end decodes the IDR frame and outputs a corresponding image;
and the request module is used for requesting the sending end to retransmit the last IDR frame when the receiving end detects that the RTP sequence number is discontinuous or the decoding of two continuous frames fails in the video decoding.
There is further provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the SIP talkback method for keeping video clear in case of network congestion.
Accordingly, the present invention provides the following effects and/or advantages:
When the receiving end detects packet loss, it notifies the sending end to encode the previous frame as an IDR frame and send it as the next IDR frame. This avoids the traditional approach of repeatedly requesting retransmission of the individual lost RTP packets; the receiving end can decode and play the IDR frame completely, so the user obtains a clear image.
Under network congestion, after retransmission of the last IDR frame has been requested, the sending end waits for the second time before it resumes encoding and sending an IDR frame to the receiving end every GOP time. This prevents the receiving end from aggravating the congestion by continuously requesting IDR retransmission. Because less data is sent and received during the second-time delay, the congestion is also relieved.
In the method, the priority of transmitting the audio is set to be greater than the priority of transmitting the video. Because the audio data is often smaller than the video data, the audio can be played directly and the video can be slowly adjusted to match the timestamps of the two.
The method is based on an SIP talkback system consisting of a local area network and a wide area network, only the host and the mobile terminal are registered to an SIP server, and the extension is not registered to the SIP server. The terminal in the local area network directly carries out data communication without passing through the SIP server, so that the communication pressure of the SIP server can be reduced, and the video conversation quality is improved.
It is to be understood that both the foregoing general description and the following detailed description of the present invention are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
Drawings
Fig. 1 is a video screenshot of a prior art SIP talkback method in the case of network congestion.
Fig. 2 is a flow chart of a method provided herein.
Fig. 3 is a video screenshot of the SIP talkback method provided by the present invention under the condition of network congestion.
Fig. 4 is a block diagram of an SIP intercom system provided by the present application.
Fig. 5 is a logic/timing diagram of the SIP intercom method provided in the present application.
Fig. 6 is a system block diagram of a conventional SIP intercom.
Fig. 7 is a logic/timing diagram of a conventional SIP intercom.
Detailed Description
To facilitate understanding for those skilled in the art, the present invention will now be described in further detail with reference to the following drawings: it should be understood that the steps mentioned in the embodiment, unless the order is specifically stated, may be performed in any order, or may even be performed simultaneously or partially simultaneously.
Referring to Fig. 2, an SIP talkback method for keeping video clear under network congestion includes the following steps:
S1, a host establishes connections with a plurality of extensions through a local area network and with a plurality of mobile terminals through a wide area network, and the host and the mobile terminals are registered to an SIP server;
SIP is an IETF standard protocol modeled on protocols such as SMTP and HTTP. It is used to set up, change and terminate calls between users of an IP-based network. To provide telephony services it must be combined with other standards and protocols: in particular, it is necessary to ensure transport, signaling interconnection with the existing telephone network, voice quality, directory services, user authentication, and so on. A host is a terminal that initiates video or voice communication to multiple other users, terminals or extensions.
In this embodiment, only the host and the mobile terminal are registered to the SIP server; the extension is not registered to the SIP server. Registration means that the host or the mobile terminal periodically sends a registration request (REGISTER) to the network, reporting information such as its current IP address and user name. Thereafter, the SIP server keeps the information of the host or the mobile terminal.
In this embodiment, when a user initiates a call through the host, the host uses the SIP protocol to send a call request to the extensions in the local area network and, at the same time, to the mobile terminals in the wide area network. During ringing, if the user of any extension or mobile terminal does not want to answer or does not want the ringing to continue, the user hangs up on that device, and the device sends a call cancel instruction (Cancel) to the host. That device then stops ringing, and the host stops sending instructions and media data to it; the other extensions and mobile terminals are not affected and keep ringing. When any extension or any mobile terminal answers, the host sends a call hang-up instruction to the other extensions and mobile terminals, and communication between the answering extension or mobile terminal and the host is established.
In this step, once the user answers the host's call on one extension or mobile terminal, the other extensions and mobile terminals no longer need to establish communication with the host, so the host sends them a call hang-up instruction and they stop ringing. Meanwhile, communication is established between the answering extension or mobile terminal and the host, and the user starts a voice or video conversation with the host.
Through the above steps, one extension or mobile terminal can hang up during ringing without affecting the ringing state of the other devices, and when any called device answers, the host hangs up the other calls and carries out audio and video talkback with that device. In addition, in the prior art both the extension and the mobile phone are registered to the SIP server, and the extension's local area network has an outlet to the wide area network; even when only extensions are involved, communication data between the extension and the host has to detour through the SIP server and return to the local area network, which increases communication complexity and reduces communication quality. In this embodiment, the host and the extension are connected through the local area network and no SIP server is needed to relay data inside the local area network, so the extension's information does not need to be registered to the SIP server: the host calls an extension in the local area network directly, and calls a mobile terminal in the wide area network through the SIP server.
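The ringing behavior described above can be illustrated with a minimal sketch. It models only the per-callee state on the host side; the class name `ForkedCall`, the state labels and the method names are hypothetical and are not taken from the patent, and no real SIP messages are sent.

```python
class ForkedCall:
    """Host-side view of one call placed to several callees at once (illustrative only)."""

    def __init__(self, callees):
        # Every extension or mobile terminal starts in the ringing state.
        self.state = {callee: "ringing" for callee in callees}

    def cancel(self, callee):
        """The callee hung up while ringing; the other callees keep ringing."""
        if self.state.get(callee) == "ringing":
            self.state[callee] = "cancelled"

    def answer(self, callee):
        """The callee answered; the host hangs up every other ringing callee."""
        if self.state.get(callee) != "ringing":
            return False
        for other, status in self.state.items():
            if other != callee and status == "ringing":
                self.state[other] = "hung_up"
        self.state[callee] = "connected"
        return True


call = ForkedCall(["extension-1", "extension-2", "phone-1"])
call.cancel("extension-2")   # one device declines, the rest keep ringing
call.answer("phone-1")       # first answer wins, remaining devices are hung up
```

After these two calls, `extension-2` is marked cancelled, `extension-1` is hung up by the host, and `phone-1` is connected.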
Some optimization directions of the present embodiment are described below.
S2, the sending end encodes and sends IDR frames to the receiving end every GOP time, and the receiving end decodes the IDR frames and outputs corresponding images;
S3, when the receiving end detects that RTP sequence numbers are discontinuous or that two consecutive frames fail to decode during video decoding, the receiving end requests the sending end to retransmit the last IDR frame.
In this embodiment, in H.264, I frames are divided into ordinary I frames and IDR frames (a special kind of I frame). An IDR frame blocks the accumulation of errors: frames following an IDR frame cannot reference frames preceding it, whereas an ordinary I frame does not block error accumulation. An IDR frame must be an I frame, but an I frame is not necessarily an IDR frame. In H.264, pictures are organized in units of sequences; a sequence is a segment of the coded picture data stream that starts with an I frame and ends at the next I frame. The first picture of a sequence is called an IDR picture, and when the decoder receives an IDR frame it discards all reference frame queues. "GOP time" means the interval between two I frames. According to the requirements of video communication, the GOP time is generally 1 to 2 seconds; this embodiment adopts 1 second.
In step S2, an IDR frame is encoded and sent to the receiving end every GOP time, so that the receiving end can parse the IDR frame first and output it directly, and the screen of the receiving end obtains a picture immediately, as shown in Fig. 3. When network congestion occurs in the prior art, the receiving end sends an RTCP NACK message to the sending end, and the sending end retransmits the corresponding RTP packet; when the congestion is severe, a picture full of mosaic artifacts such as that in Fig. 1 may appear. To avoid this, this embodiment performs step S3, which abandons the approach of sending RTCP NACK messages to request retransmission of the corresponding RTP packets and instead requests the sending end to retransmit the last IDR frame. One IDR frame contains a complete image, so when the receiving end detects that RTP sequence numbers are discontinuous or that two consecutive frames fail to decode during video decoding, this step delivers a complete image again: the receiving end decodes the retransmitted IDR frame, as shown in Fig. 3, and snow, mosaic artifacts and the like do not appear.
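The two trigger conditions of step S3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `LossDetector` and its methods are hypothetical, and the sketch relies only on the fact that RTP sequence numbers are 16-bit values that wrap around from 65535 to 0.

```python
class LossDetector:
    """Receiver-side check for when to request the last IDR frame (illustrative only)."""

    def __init__(self, max_decode_failures=2):
        self.expected_seq = None          # next RTP sequence number we expect
        self.decode_failures = 0          # consecutive frames that failed to decode
        self.max_decode_failures = max_decode_failures

    def on_rtp_packet(self, seq):
        """Return True if this packet's sequence number is discontinuous."""
        gap = self.expected_seq is not None and seq != self.expected_seq
        # RTP sequence numbers are 16 bits wide, so wrap at 65536.
        self.expected_seq = (seq + 1) % 65536
        return gap

    def on_decode_result(self, ok):
        """Return True once two consecutive frames have failed to decode."""
        if ok:
            self.decode_failures = 0
            return False
        self.decode_failures += 1
        if self.decode_failures >= self.max_decode_failures:
            self.decode_failures = 0      # start counting again after the request
            return True
        return False
```

When either method returns `True`, the receiving end would send its retransmission request for the last IDR frame instead of a per-packet NACK.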
Within 3 seconds after it starts sending video, the sending end sends one I frame every 1 s, so that the receiving end can quickly output a stable video image. Within these 3 seconds, FIR requests sent by the receiving end are not processed.
After these 3 seconds, when the sending end receives an RTCP PLI or FIR message from the receiving end, it notifies the application layer through an RTCP event so that the next frame is sent directly as an IDR frame, subject to a minimum interval of 300 ms between IDR frames. If an I frame has just been sent and an I frame retransmission request from the other side arrives within 300 ms, while P (B) frames are being sent, the sending end waits until the 300 ms have elapsed, then stops sending P (B) frames and sends the I frame. However many I frame retransmission requests are received during this period, they are handled together after the 300 ms and the I frame is sent only once.
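The request handling just described (ignore requests during the first 3 seconds, keep at least 300 ms between IDR frames, and merge repeated requests into one IDR frame) can be sketched as a small scheduler. All names here (`IdrRequestThrottle`, `note_idr_sent`, `poll`) are hypothetical, times are integer milliseconds supplied by the caller, and the sketch does not parse real RTCP packets.

```python
class IdrRequestThrottle:
    """Sender-side handling of PLI/FIR requests (illustrative only)."""

    def __init__(self, startup_ms=3000, min_interval_ms=300):
        self.startup_ms = startup_ms
        self.min_interval_ms = min_interval_ms
        self.start_ms = None      # when video sending started
        self.last_idr_ms = None   # when the most recent IDR frame was sent
        self.pending_ms = None    # when the requested IDR frame is due, if any

    def start_stream(self, now_ms):
        self.start_ms = now_ms
        self.last_idr_ms = now_ms  # assumption: the first frame sent is an IDR frame

    def note_idr_sent(self, now_ms):
        """Record an IDR frame sent by the regular per-GOP schedule."""
        self.last_idr_ms = now_ms

    def on_request(self, now_ms):
        """Handle a PLI/FIR request; return the due time, or None if it is ignored."""
        if now_ms - self.start_ms < self.startup_ms:
            return None            # requests in the start-up window are not processed
        if self.pending_ms is None:
            self.pending_ms = max(now_ms, self.last_idr_ms + self.min_interval_ms)
        return self.pending_ms     # repeated requests share one pending IDR frame

    def poll(self, now_ms):
        """Return True if the requested IDR frame should be sent now."""
        if self.pending_ms is not None and now_ms >= self.pending_ms:
            self.last_idr_ms = now_ms
            self.pending_ms = None
            return True
        return False
```

For example, with an IDR frame sent at 3000 ms, requests arriving at 3100 ms and 3200 ms are both answered by a single IDR frame at 3300 ms.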
Specifically, the requesting the sending end to retransmit the last IDR frame includes:
S3.1, the receiving end sends a packet loss retransmission message to the sending end;
S3.2, the sending end defines the last IDR frame as the next frame and sends it to the receiving end. In the prior art, a receiving end deletes all previously buffered frames after receiving an IDR frame; in this step, the sending end redefines the previous IDR frame as the next IDR frame, so that it can send the previous IDR frame again at the request of the receiving end.
Further, after requesting the sending end to retransmit the last IDR frame, the following steps are performed:
S4, after delaying for the second time, the sending end again encodes and sends an IDR frame to the receiving end every GOP time.
In this step, the receiving end has detected packet loss in step S3 and has notified the sending end through RTCP to encode the previous frame as an IDR frame and send it as the next IDR frame. Because the network is congested, the sending end can lower the frequency at which it sends subsequent IDR frames once S3.2 has been executed, that is, extend the GOP time, which prevents the receiving end from causing further congestion by continuously requesting IDR retransmission. The second time may be set to 10 to 15 s; this embodiment adopts 10 s. In other words, step S4 waits 10 seconds before the sending end resumes step S2, encoding and sending an IDR frame to the receiving end every GOP time, with the receiving end decoding the IDR frames and outputting the corresponding pictures.
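One possible reading of step S4 is sketched below as a timeline generator: periodic IDR frames are sent every GOP time, and after an IDR frame sent on request the next periodic one is postponed by the second time. The function name `idr_schedule` and its parameters are hypothetical, and the interpretation that the delay is measured from the requested IDR frame is an assumption rather than something the text states explicitly.

```python
def idr_schedule(duration_ms, requested_at_ms, gop_ms=1000, second_time_ms=10_000):
    """List (kind, time_ms) for every IDR frame sent before duration_ms (illustrative only)."""
    requested = sorted(requested_at_ms)
    timeline = []
    next_periodic = 0
    i = 0
    while True:
        if i < len(requested) and requested[i] <= next_periodic:
            # An IDR frame sent on request replaces the upcoming periodic one.
            t = requested[i]
            i += 1
            if t >= duration_ms:
                break
            timeline.append(("requested", t))
            next_periodic = t + second_time_ms   # back off for the second time
        else:
            t = next_periodic
            if t >= duration_ms:
                break
            timeline.append(("periodic", t))
            next_periodic = t + gop_ms           # normal per-GOP cadence
    return timeline
```

With a 1 s GOP time and a 10 s second time, a request at 2500 ms yields periodic IDR frames at 0, 1000 and 2000 ms, the requested one at 2500 ms, and the next periodic ones at 12500 ms and 13500 ms.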
Further, under network congestion the video and audio parsed by the receiving end can fall out of sync, and the user perceives that lip movements in the video do not match the sound during video communication.
After the communication between the corresponding extension or mobile terminal and the host is established, the following steps are executed:
S5, the priority of transmitting audio is set to be greater than the priority of transmitting video.
In this step, the priority is audio > video. Because audio is guaranteed to be transmitted normally and with priority, the receiving end plays audio data directly when it receives it, whereas video data must be played with reference to the audio timestamp. If video were used as the timestamp reference instead, audio playback would be delayed throughout the call to match the video whenever the network is poor.
Specifically, the setting of the priority of the transmission audio to be higher than the priority of the transmission video includes:
S5.1, audio is defined as the main axis and video as the auxiliary axis, and the receiving end caches audio and video on a first-in first-out basis. In this embodiment, audio and video are first buffered for 0.6 second, and the maximum buffer is set to 2 seconds, that is, 100 audio data items and 50 video data items; the earliest data are deleted when the proportion of video data exceeds 10 percent of the buffer area.
S5.2, when the receiving end obtains the audio and/or the video every time, calculating the time difference value of the audio time stamp and the video time stamp in the current frame, and adjusting the playing speed of the video in the current frame according to the time difference value.
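The first-in first-out caching of step S5.1 can be sketched with two bounded queues. This is a simplified illustration: the class name `MediaBuffer` is hypothetical, the capacities of 100 audio items and 50 video items come from the embodiment, and the 0.6 second pre-buffering and the 10 percent deletion rule are deliberately not modeled. When a queue is full, the oldest item is dropped.

```python
from collections import deque


class MediaBuffer:
    """Bounded first-in first-out cache for audio and video items (illustrative only)."""

    def __init__(self, audio_capacity=100, video_capacity=50):
        # A deque with maxlen discards its oldest entry when a new one is appended.
        self.audio = deque(maxlen=audio_capacity)
        self.video = deque(maxlen=video_capacity)

    def _queue(self, kind):
        return self.audio if kind == "audio" else self.video

    def push(self, kind, timestamp_ms, payload):
        self._queue(kind).append((timestamp_ms, payload))

    def pop(self, kind):
        """Return the oldest (timestamp_ms, payload) pair, or None if the queue is empty."""
        queue = self._queue(kind)
        return queue.popleft() if queue else None
```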
Specifically, the adjusting the playing speed of the video in the current frame according to the time difference includes:
If the time difference is less than or equal to 250 ms, the playing speed of the video in the current frame is not adjusted. This is because users do not strongly perceive a lip-sync error between video and audio when the time difference is 250 ms or less; the time difference may be computed as the video timestamp minus the audio timestamp, or as the audio timestamp minus the video timestamp.
If the time difference is greater than 250 ms, the video is faster than the audio, and the number of buffered video frames is less than a first number, the playing of the video in the current frame is slowed down or paused until the time difference is less than or equal to 250 ms. In this case the video is ahead of the audio, so the video must be played slowly or paused to bring audio and video back into sync.
If the time difference is greater than 250 ms, the video is faster than the audio, and the number of buffered video frames is greater than the first number, the playing speed of the video in the current frame is not adjusted. In this case the buffer holds more video than audio, and the video buffer is likely to shrink because video packets are lost under network congestion, so the video is neither slowed down nor sped up and is left to resynchronize naturally as the congestion continues.
If the time difference is greater than 250 ms and the video is slower than the audio, the playing speed of the video in the current frame is increased until the time difference is less than or equal to 250 ms. When network congestion causes video and audio data to arrive out of step, the number of buffered video frames may exceed the preset number, so the video needs to be output and played quickly.
In this embodiment, the first number may be 50; in other embodiments it may be any number from 10 to 100.
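The four cases above amount to a small decision function, sketched here under stated assumptions. The function name `video_speed_action` and the returned labels are hypothetical; the 250 ms threshold and the first number of 50 come from the embodiment. The text does not say what happens when the buffered count equals the first number, so this sketch treats that case as "no adjustment".

```python
def video_speed_action(audio_ts_ms, video_ts_ms, video_buffered,
                       first_number=50, threshold_ms=250):
    """Decide how to adjust video playback against the audio main axis (illustrative only)."""
    diff = video_ts_ms - audio_ts_ms      # positive means video is ahead of audio
    if abs(diff) <= threshold_ms:
        return "keep"                     # within tolerance, no adjustment
    if diff > 0:
        if video_buffered < first_number:
            return "slow_or_pause"        # video ahead with a small buffer
        return "keep"                     # video ahead with a large buffer
    return "speed_up"                     # video behind audio


# Video is 400 ms ahead of audio and only 10 frames are buffered.
action = video_speed_action(audio_ts_ms=1000, video_ts_ms=1400, video_buffered=10)
```

Here `action` is `"slow_or_pause"`; the caller would keep applying it on each new audio or video item until the difference falls back to 250 ms or less.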
Referring to fig. 4, the method is based on a one-call multi-SIP system that merges a local area network and a wide area network, and includes a host, a plurality of extensions, and a plurality of mobile terminals, where the host establishes a connection with the plurality of extensions through the local area network, and the host establishes a connection with the plurality of mobile terminals through the wide area network. The workflow of a one-call multi-SIP system merging local area network and wide area network can be referred to fig. 5. Existing SIP systems and their workflow refer to fig. 6-7.
By comparison, in this system the host communicates with the extension directly, without passing through the SIP server. This removes the communication link in which host data is first transmitted to the SIP server and then to the extension; data goes directly from the host to the extension, which further improves the clarity and smoothness of the video.
There is further provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the SIP talkback method for keeping video clear in case of network congestion.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
In the description of the present invention, it is to be understood that the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the present invention, unless otherwise explicitly stated or limited, the terms "mounted," "connected," "fixed," and the like are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be an internal connection between two elements or any other relationship between them. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, one skilled in the art can combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Claims (10)

1. A SIP talkback method for keeping video clear under network congestion, characterized in that the method comprises the following steps:
after communication is established between the extension or mobile terminal and the host, performing:
the sending end encodes and sends an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image;
when, during video decoding, the receiving end detects that the RTP sequence numbers are discontinuous or that two consecutive frames fail to decode, the receiving end requests the sending end to retransmit the last IDR frame.
2. The SIP talkback method for keeping video clear under network congestion according to claim 1, characterized in that requesting the sending end to retransmit the last IDR frame comprises:
the receiving end sends a packet loss retransmission message to the sending end;
the sending end defines the last IDR frame as the next frame and sends that next frame to the receiving end.
3. The SIP talkback method for keeping video clear under network congestion according to claim 1, characterized in that, after the sending end is requested to retransmit the last IDR frame, the following is performed:
after a delay of a second time, the sending end again encodes and sends an IDR frame to the receiving end every GOP time.
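As an illustration only, the trigger conditions of claims 1 and 2 can be sketched as follows. This is a minimal sketch, not the patented implementation: the class and method names (`LossDetector`, `IdrSender`, and so on) are hypothetical, any out-of-order packet is treated as a discontinuity, and the delay described in claim 3 is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide and wrap around


@dataclass
class LossDetector:
    """Receiver-side trigger for an IDR retransmission request (hypothetical helper)."""
    last_seq: Optional[int] = None
    consecutive_decode_failures: int = 0

    def on_rtp_packet(self, seq: int) -> bool:
        """Return True when seq is not the expected successor of the previous packet."""
        gap = self.last_seq is not None and seq != (self.last_seq + 1) % SEQ_MOD
        self.last_seq = seq
        return gap

    def on_decode_result(self, ok: bool) -> bool:
        """Return True once two consecutive frames have failed to decode."""
        if ok:
            self.consecutive_decode_failures = 0
            return False
        self.consecutive_decode_failures += 1
        if self.consecutive_decode_failures >= 2:
            self.consecutive_decode_failures = 0  # assumption: restart the count after a request
            return True
        return False


class IdrSender:
    """Sender-side cache of the last IDR frame (hypothetical helper)."""

    def __init__(self) -> None:
        self.last_idr: Optional[bytes] = None
        self.resend_pending = False

    def on_idr_encoded(self, frame: bytes) -> None:
        # Remember the most recent IDR frame so it can be sent again on request.
        self.last_idr = frame

    def on_retransmit_request(self) -> None:
        # A packet loss retransmission message arrived from the receiving end.
        self.resend_pending = self.last_idr is not None

    def next_frame(self, fresh_frame: bytes) -> bytes:
        """Send the cached IDR frame as the next frame if a resend is pending."""
        if self.resend_pending and self.last_idr is not None:
            self.resend_pending = False
            return self.last_idr
        return fresh_frame
```

In this sketch the receiving end would send its packet loss retransmission message whenever either detector method returns True, and the sending end would call `on_retransmit_request()` on receipt.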
4. The SIP talkback method for keeping video clear under network congestion according to claim 1, characterized in that, after communication is established between the extension or mobile terminal and the host, the following is performed:
the priority of audio transmission is set higher than the priority of video transmission.
5. The SIP talkback method for keeping video clear under network congestion according to claim 4, characterized in that setting the priority of audio transmission higher than the priority of video transmission comprises:
defining audio as the main axis and video as the auxiliary axis, the receiving end caching the audio and the video on a first-in first-out basis;
each time the receiving end obtains audio and/or video, calculating the time difference between the audio timestamp and the video timestamp of the current frame, and adjusting the playing speed of the video in the current frame according to the time difference.
6. The SIP talkback method for keeping video clear under network congestion according to claim 5, characterized in that adjusting the playing speed of the video in the current frame according to the time difference comprises:
if the time difference is less than or equal to 250 ms, the playing speed of the video in the current frame is not adjusted;
if the time difference is greater than 250 ms, the video is ahead of the audio, and the amount of buffered video is less than a first amount, playback of the video in the current frame is slowed down or paused until the time difference is less than or equal to 250 ms;
if the time difference is greater than 250 ms, the video is ahead of the audio, and the amount of buffered video is greater than the first amount, the playing speed of the video in the current frame is not adjusted;
if the time difference is greater than 250 ms and the video is behind the audio, the playing speed of the video in the current frame is increased until the time difference is less than or equal to 250 ms.
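The four cases of claim 6 can be read as a single decision function. The following is a minimal sketch under stated assumptions: the names `video_action`, `Action`, `video_buffered` and `first_amount` are hypothetical, timestamps are taken to be in milliseconds, and the case where the buffered amount exactly equals the first amount, which the claim does not specify, is treated here as "no adjustment".

```python
from enum import Enum

SYNC_THRESHOLD_MS = 250  # threshold stated in claim 6


class Action(Enum):
    KEEP = "keep"                    # do not adjust the playing speed
    SLOW_OR_PAUSE = "slow_or_pause"  # hold the video back
    SPEED_UP = "speed_up"            # let the video catch up with the audio


def video_action(audio_ts_ms: int, video_ts_ms: int,
                 video_buffered: int, first_amount: int) -> Action:
    """Decide how to adjust video playback, with audio as the main axis."""
    diff = video_ts_ms - audio_ts_ms  # positive: video is ahead of audio
    if abs(diff) <= SYNC_THRESHOLD_MS:
        return Action.KEEP
    if diff > 0:
        # Video is ahead: slow it down only when the video buffer is running low.
        # Assumption: a buffer exactly equal to first_amount is left unadjusted.
        if video_buffered < first_amount:
            return Action.SLOW_OR_PAUSE
        return Action.KEEP
    # Video is behind the audio by more than the threshold.
    return Action.SPEED_UP
```

A player loop would call such a function each time it takes audio or video from the receive buffer and would keep applying the returned action until the difference falls back to 250 ms or less.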
7. The SIP talkback method for keeping video clear under network congestion according to claim 1, characterized in that the method is based on a one-call-multiple SIP system that fuses a local area network and a wide area network, the system comprising a host, a plurality of extensions and a plurality of mobile terminals, wherein the host is connected with the extensions through the local area network and with the mobile terminals through the wide area network.
8. The SIP talkback method for keeping video clear under network congestion according to claim 7, characterized in that, before communication between an extension or a mobile terminal and the host, the following is performed:
establishing connections between the host and the plurality of extensions through the local area network, establishing connections between the host and the plurality of mobile terminals through the wide area network, and registering the host and the mobile terminals with a SIP server;
when any extension or any mobile terminal answers, the host sends a call hang-up instruction to the other extensions and mobile terminals, and communication between the answering extension or mobile terminal and the host is established.
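The answer-and-hang-up step of claim 8 can be illustrated with a small state model. This is a sketch under assumptions, not the patented implementation: the `Host` class, its state strings and the `("HANGUP", endpoint)` tuples are hypothetical stand-ins for the real signaling, and no actual SIP messages are built or sent.

```python
from typing import Dict, List, Optional, Tuple


class Host:
    """Toy model of a host calling several extensions and mobile terminals at once."""

    def __init__(self, endpoints: List[str]) -> None:
        # Every called endpoint starts out ringing.
        self.state: Dict[str, str] = {e: "ringing" for e in endpoints}
        self.active: Optional[str] = None
        self.sent: List[Tuple[str, str]] = []  # record of hang-up instructions issued

    def on_answer(self, endpoint: str) -> bool:
        """Handle an answer; only the first endpoint to answer gets the call."""
        if self.active is not None or endpoint not in self.state:
            return False
        self.active = endpoint
        self.state[endpoint] = "connected"
        for other in self.state:
            if other != endpoint:
                # Tell every other endpoint to stop ringing.
                self.state[other] = "hung_up"
                self.sent.append(("HANGUP", other))
        return True
```

For example, with endpoints `ext1`, `ext2` and `mobile1`, a call to `on_answer("ext2")` leaves `ext2` connected and records a hang-up instruction for each of the other two.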
9. A SIP talkback system for keeping video clear under network congestion, characterized in that the system comprises the following modules:
an encoding and sending module, by which the sending end encodes and sends an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image;
a request module, by which the receiving end requests the sending end to retransmit the last IDR frame when, during video decoding, the receiving end detects that the RTP sequence numbers are discontinuous or that two consecutive frames fail to decode.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the SIP talkback method for keeping video clear under network congestion according to any of claims 1 to 9.
CN202210468252.1A 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear Active CN115102927B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210468252.1A CN115102927B (en) 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear
PCT/CN2022/117773 WO2023206910A1 (en) 2022-04-29 2022-09-08 Sip intercom method and system based on local area network and wide area network, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210468252.1A CN115102927B (en) 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear

Publications (2)

Publication Number Publication Date
CN115102927A true CN115102927A (en) 2022-09-23
CN115102927B CN115102927B (en) 2023-10-27

Family

ID=83287499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210468252.1A Active CN115102927B (en) 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear

Country Status (1)

Country Link
CN (1) CN115102927B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010003342A1 (en) * 2008-07-07 2010-01-14 华为技术有限公司 Method, device and system for identifying frame type of rtp packet
JP2010177933A (en) * 2009-01-28 2010-08-12 Aiphone Co Ltd Intercom system
CN104618786A (en) * 2014-12-22 2015-05-13 深圳市腾讯计算机系统有限公司 Audio/video synchronization method and device
CN204362246U (en) * 2014-11-20 2015-05-27 深圳市华百安智能技术有限公司 The non-viewable numbers intercom system of a kind of across a network
CN107231328A (en) * 2016-03-23 2017-10-03 福建星网锐捷通讯股份有限公司 Method for real-time video transmission, device, equipment and system
CN110012363A (en) * 2019-04-18 2019-07-12 浙江工业大学 A kind of video chat system based on Session Initiation Protocol
US20200221147A1 (en) * 2019-01-04 2020-07-09 Gainspan Corporation Intelligent video frame dropping for improved digital video flow control over a crowded wireless network
CN112995214A (en) * 2021-04-26 2021-06-18 安心智能(武汉)信息技术有限公司 Real-time video transmission system, method and computer readable storage medium

Also Published As

Publication number Publication date
CN115102927B (en) 2023-10-27

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Address after: Unit 403-12, 4th Floor, No. 56, Chengyi North Street, Phase III, Software Park, Torch High-tech Zone, Xiamen, Fujian 361000

Applicant after: XIAMEN LEELEN TECHNOLOGY Co.,Ltd.

Address before: 2-5 / F, 780 Tieshan Road, Guankou Town, Jimei District, Xiamen City, Fujian Province 361021

Applicant before: XIAMEN LEELEN TECHNOLOGY Co.,Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant