CN115102927B - SIP intercom method, system and storage device for keeping video clear - Google Patents

Info

Publication number
CN115102927B
CN115102927B CN202210468252.1A CN202210468252A CN115102927B CN 115102927 B CN115102927 B CN 115102927B CN 202210468252 A CN202210468252 A CN 202210468252A CN 115102927 B CN115102927 B CN 115102927B
Authority
CN
China
Prior art keywords
video
frame
receiving end
host
sip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210468252.1A
Other languages
Chinese (zh)
Other versions
CN115102927A (en)
Inventor
庄宗辉
叶智鑫
卢刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Leelen Technology Co Ltd
Original Assignee
Xiamen Leelen Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Leelen Technology Co Ltd
Priority to CN202210468252.1A (CN115102927B)
Priority to PCT/CN2022/117773 (WO2023206910A1)
Publication of CN115102927A
Application granted
Publication of CN115102927B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2425 Traffic characterised by specific attributes, e.g. priority or QoS for supporting services specification, e.g. SLA
    • H04L47/2433 Allocation of priorities to traffic types
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1069 Session establishment or de-establishment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Security & Cryptography (AREA)
  • Telephonic Communication Services (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The application relates to a SIP intercom method, system and storage device for keeping video clear. The method comprises the following steps, performed after communication is established between an extension or mobile terminal and the host: the transmitting end encodes and transmits an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image; and when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, it requests the transmitting end to retransmit the last IDR frame. When the receiving end detects packet loss, it notifies the transmitting end to encode the previous frame as an IDR frame and send it as the next frame. This avoids the traditional working mode, in which the receiving end repeatedly requests retransmission of the lost RTP packets, and lets the receiving end play one complete IDR frame directly, so that the user obtains a clear image.

Description

SIP intercom method, system and storage device for keeping video clear
Technical Field
The application relates to the field of SIP (session initiation protocol) intercom, in particular to an SIP intercom method, an SIP intercom system and a SIP intercom storage device for keeping video clear.
Background
In the building intercom system, a plurality of hosts and a plurality of extensions are installed in a local area network, SIP intercom software is installed on a mobile phone of a user, and the hosts can call the extensions or the mobile phone through a network to realize SIP intercom. In practical applications, the host can call several extensions and several handsets in the user's home at the same time, and it is necessary to send early video to these extensions and handsets when the host rings, and the extensions and handsets ring to receive and display the host video.
In existing building intercom systems, network congestion can cause snow, mosaics and similar artifacts to appear in the user's video. The reason is that the receiving end sends RTCP NACK messages to the transmitting end to request retransmission of the corresponding RTP packets; when network congestion is severe, the large number of retransmission requests aggravates the congestion, and more snow or mosaics are generated, as shown in fig. 1.
Aiming at the problems in the prior art, the application designs a SIP intercom method, a system and a storage device for keeping video clear.
Disclosure of Invention
Aiming at the problems in the prior art, the application aims to provide a SIP intercom method, a system and a storage device for keeping video clear, which can effectively solve the problems in the prior art.
The technical scheme of the application is as follows:
a SIP intercom method for maintaining video clarity in the event of network congestion, said method comprising the steps of:
after communication between the extension or mobile terminal and the host, performing:
the transmitting end encodes and transmits the IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs a corresponding image;
and when the discontinuous RTP sequence number or the continuous two-frame decoding failure is detected in the video decoding of the receiving end, requesting the transmitting end to retransmit the last IDR frame.
Further, the requesting retransmission of the last IDR frame from the transmitting end includes:
the receiving end sends a packet loss retransmission message to the sending end;
the transmitting end defines the last IDR frame as the next frame and transmits the next frame to the receiving end.
Further, after the request for retransmission of the last IDR frame from the transmitting end, the following steps are performed:
after the second time delay, the transmitting end encodes and transmits the IDR frame to the receiving end again every GOP time.
Further, after communication between the extension or the mobile terminal and the host, performing:
the priority of transmitting audio is set to be greater than the priority of transmitting video.
Further, the setting the priority of transmitting the audio to be greater than the priority of transmitting the video includes:
defining audio as a main shaft and video as an auxiliary shaft, and caching the audio and the video by a receiving end according to a first-in first-out principle;
and when the receiving end obtains the audio and/or the video each time, calculating a time difference value between the time stamp of the audio and the time stamp of the video in the current frame, and adjusting the playing speed of the video in the current frame according to the time difference value.
Further, the adjusting the playing speed of the video in the current frame according to the time difference value includes:
if the time difference is less than or equal to 250ms, not adjusting the playing speed of the video in the current frame;
if the time difference is greater than 250ms, the video is faster than the audio and the number of caches of the video is smaller than the first number, slowing down or suspending the playing speed of the video in the current frame until the time difference is less than or equal to 250ms;
if the time difference is greater than 250ms, the video is faster than the audio and the number of buffers of the video is greater than a first number, not adjusting the playing speed of the video in the current frame;
and if the time difference is greater than 250ms and the video is slower than the audio, the playing speed of the video in the current frame is adjusted until the time difference is less than or equal to 250ms.
Further, the method is based on a one-call-multiple SIP system integrating a local area network and a wide area network, and is characterized in that: the system comprises a host, a plurality of extensions and a plurality of mobile terminals, wherein the host establishes connection with the extensions through a local area network, and the host establishes connection with the mobile terminals through a wide area network.
Further, before the communication between the extension or the mobile terminal and the host, performing:
establishing connection between a host and a plurality of extensions through a local area network, establishing connection between the host and a plurality of mobile terminals through a wide area network, and registering the host and the mobile terminals to an SIP server;
when any extension or any mobile terminal initiates answering, the host sends a call hang-up instruction to other extensions or other mobile terminals, and communication between the corresponding extension or mobile terminal and the host is established.
Further provided is a SIP intercom system for keeping video clear in case of network congestion, comprising the following modules:
the coding and transmitting module is used for the transmitting end to code and transmit the IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs a corresponding image;
and the request module is used for requesting the sender to resend the last IDR frame when the discontinuous RTP sequence number or the continuous two-frame decoding failure is detected in the video decoding of the receiving end.
There is further provided a computer readable storage medium storing a computer program which when executed by a processor implements the SIP intercom method of maintaining video clarity in the event of network congestion.
Accordingly, the present application provides the following effects and/or advantages:
When the receiving end detects packet loss, it notifies the transmitting end to encode the previous frame as an IDR frame and send it as the next frame. This avoids the traditional working mode, in which the receiving end repeatedly requests retransmission of the lost RTP packets, and lets the receiving end play one complete IDR frame directly, so that the user obtains a clear image.
In the method, because the network may be congested, after the last IDR frame has been requested the transmitting end waits for the second time before it again encodes and transmits an IDR frame to the receiving end every GOP time. This prevents the receiving end from continuously requesting retransmission of the IDR frame and causing further congestion. Moreover, because the network is currently congested, less data is sent and received during the second-time delay, which relieves the congestion.
In the method, the priority of transmitting audio is set to be greater than the priority of transmitting video. Because audio data is often smaller than video data, audio can be played directly while video playback is slowly adjusted to match the time stamps of the two.
The method is based on an SIP intercom system formed by a local area network and a wide area network, only the host and the mobile terminal are registered to an SIP server, and the extension is not registered to the SIP server. The terminal in the local area network directly performs data communication without passing through the SIP server, so that the communication pressure of the SIP server can be reduced, and the video dialogue quality is improved.
It is to be understood that both the foregoing general description and the following detailed description of the present application are exemplary and explanatory and are intended to provide further explanation of the application as claimed.
Drawings
Fig. 1 is a video screenshot of the prior-art SIP intercom method mentioned in the Background, under the condition of network congestion.
Fig. 2 is a flow chart of the method provided by the application.
Fig. 3 is a video screenshot of the SIP intercom method provided by the present application under the condition of network congestion.
Fig. 4 is a block diagram of a SIP intercom system provided by the present application.
Fig. 5 is a logic/timing diagram of the SIP intercom method provided by the present application.
Fig. 6 is a system block diagram of a conventional SIP intercom.
Fig. 7 is a logic/timing diagram of a conventional SIP intercom.
Detailed Description
To facilitate understanding by those skilled in the art, the present application will now be described in further detail with reference to the accompanying drawings. It should be understood that, unless specifically stated otherwise, the steps mentioned in this embodiment may be performed in the stated order or in a different order, and may be performed simultaneously or partially simultaneously.
Referring to fig. 2, a SIP intercom method for keeping video clear in case of network congestion includes the following steps:
S1, a host establishes connection with a plurality of extensions through a local area network, the host establishes connection with a plurality of mobile terminals through a wide area network, and the host and the mobile terminals are registered to an SIP server;
SIP is an IETF standard protocol modeled on protocols such as SMTP and HTTP. It is used to set up, change and terminate calls between users over an IP network. To provide telephony services it must be combined with other standards and protocols: in particular, these are needed to ensure transport, to interconnect signaling with the existing telephone network, to guarantee voice quality, to provide directories, to authenticate users, and so on. A host refers to a terminal that initiates video or voice communication to multiple other users, terminals or extensions.
In this embodiment, only the host and the mobile terminal are registered with the SIP server, and the extension is not registered with the SIP server. Registering with the SIP server means that the host or the mobile terminal periodically transmits a registration request (REGISTER) to the network, reporting its current IP address, user name and other information. After that, the SIP server keeps the information of the host or the mobile terminal on record.
In this case, when a user initiates a call, the host simultaneously transmits a call request instruction to the extensions and the mobile terminals based on the SIP protocol; that is, the host calls the extensions through the local area network and, at the same time, calls the mobile terminals through the wide area network. During ringing, when any extension or mobile terminal initiates a hang-up, a call cancellation instruction is sent to the host by that extension or mobile terminal; that is, if an extension or mobile terminal does not want to answer or to keep ringing, it stops ringing and sends a call cancel instruction (CANCEL) to the host. The extension or mobile terminal then stops ringing, and the host stops sending instructions and media data to it; the other extensions and mobile terminals are not affected and continue ringing. When any extension or mobile terminal answers, the host sends a call hang-up instruction to the other extensions and mobile terminals, and communication is established between the answering extension or mobile terminal and the host.
In this step, when the user answers the host's call on one extension or mobile terminal, the other extensions and mobile terminals no longer need to establish communication with the host, so the host sends them a call hang-up instruction and they stop ringing. At the same time, communication is established between the answering extension or mobile terminal and the host, and the user starts an audio or video conversation with the host.
Through the above steps, a given extension or mobile terminal can hang up while ringing without affecting the ringing of the other devices, and when any called device answers, the host hangs up the other call legs and carries out audio and video intercom with that device. In the prior art, by contrast, both the extension and the mobile phone are registered to the SIP server, and the extension's local area network must have an outlet to the wide area network; even when only the extension is used, communication data between the extension and the host must detour through the SIP server and return to the local area network, which increases the complexity of communication between the extension and the host and reduces its quality. In the present application, the host and the extension are connected through the local area network, and the SIP server is not needed to relay data within the local area network, so the extension's information does not need to be registered to the SIP server: the host calls local area network extensions directly, and calls wide area network mobile terminals through the SIP server.
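The one-call-multiple behaviour described above can be illustrated with a minimal sketch. It is not the patented implementation: the class name `OneCallMultiple`, its methods and the message labels `"INVITE"` and `"HANGUP"` are hypothetical, and signaling transport is reduced to a list that records what the host would send.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class OneCallMultiple:
    """Tracks which called devices are still ringing for one host call (illustrative only)."""
    ringing: Set[str] = field(default_factory=set)
    answered_by: Optional[str] = None
    sent: List[Tuple[str, str]] = field(default_factory=list)  # (message label, device)

    def call(self, devices):
        # The host rings every extension (LAN) and mobile terminal (WAN) at once.
        for device in devices:
            self.ringing.add(device)
            self.sent.append(("INVITE", device))

    def on_cancel(self, device):
        # One device declines: only that device stops ringing, the others keep ringing.
        self.ringing.discard(device)

    def on_answer(self, device):
        # The first device to answer wins; the host hangs up every other ringing leg.
        if self.answered_by is not None or device not in self.ringing:
            return False
        self.answered_by = device
        for other in sorted(self.ringing - {device}):
            self.sent.append(("HANGUP", other))
        self.ringing.clear()
        return True
```

For example, after `call(["ext1", "ext2", "phone1"])` and `on_cancel("ext2")`, an `on_answer("phone1")` records a hang-up only for `ext1`, since `ext2` had already stopped ringing.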
The following describes some of the optimization directions of this embodiment.
S2, the transmitting end encodes and transmits the IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs a corresponding image;
S3, when a discontinuous RTP sequence number or two consecutive frame decoding failures are detected in the video decoding of the receiving end, the transmitting end is requested to retransmit the last IDR frame.
In H.264, I frames are divided into ordinary I frames and IDR frames (a special kind of I frame). An IDR frame stops the accumulation of errors, because no frame after an IDR frame may reference a frame before it; an ordinary I frame does not stop error accumulation. Every IDR frame is an I frame, but not every I frame is an IDR frame. H.264 organizes pictures into sequences: a sequence is the stream of pictures that begins with one IDR picture and ends just before the next, and when the decoder receives an IDR frame it discards its entire reference frame queue. "GOP time" refers to the interval between two I frames. Depending on the video communication requirement, GOP time is generally 1-2 seconds; 1 second is used in this embodiment.
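As a small illustration of how a receiver can tell an IDR frame from other frames, the sketch below reads the `nal_unit_type` field from the first byte of an H.264 NAL unit (type 5 is an IDR slice, type 1 a non-IDR slice). The function names are hypothetical, and the sketch assumes the caller has already extracted a single NAL unit without its start code or RTP header.

```python
NAL_TYPE_NON_IDR_SLICE = 1  # coded slice of a non-IDR picture
NAL_TYPE_IDR_SLICE = 5      # coded slice of an IDR picture


def nal_unit_type(nal: bytes) -> int:
    """Return the 5-bit nal_unit_type from the first byte of an H.264 NAL unit."""
    if not nal:
        raise ValueError("empty NAL unit")
    return nal[0] & 0x1F


def is_idr(nal: bytes) -> bool:
    """True if the NAL unit carries a slice of an IDR picture."""
    return nal_unit_type(nal) == NAL_TYPE_IDR_SLICE
```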
In step S2, the transmitting end encodes and sends an IDR frame to the receiving end every GOP time; the receiving end can parse the IDR frame with priority and output it directly, so its screen obtains a picture immediately, as shown in fig. 3. In the prior art, when network congestion occurs, the receiving end sends an RTCP NACK message to the transmitting end to request retransmission of the corresponding RTP packets, and when the congestion is severe the picture fills with mosaics and the like, as shown in fig. 1. To avoid this, the present embodiment abandons the NACK-based request for retransmission of individual RTP packets and instead requests the transmitting end to retransmit the last IDR frame (step S3). One IDR frame contains a complete image, so when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, it requests a complete image again and decodes the retransmitted IDR frame, as shown in fig. 3, so that snow, mosaics and the like do not appear.
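The two trigger conditions of step S3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's code: the class `LossDetector` and its method names are hypothetical, RTP sequence numbers are treated as 16-bit values that wrap around, and a reordered packet is simply counted as a discontinuity.

```python
class LossDetector:
    """Receiver-side triggers for requesting an IDR frame (illustrative sketch)."""

    def __init__(self, max_decode_failures=2):
        self.expected_seq = None          # next RTP sequence number we expect
        self.failures = 0                 # consecutive decode failures so far
        self.max_decode_failures = max_decode_failures

    def on_rtp(self, seq):
        """Return True if this packet's sequence number is discontinuous."""
        gap = self.expected_seq is not None and seq != self.expected_seq
        self.expected_seq = (seq + 1) & 0xFFFF  # 16-bit wrap-around
        return gap

    def on_decode(self, ok):
        """Return True once two consecutive frames have failed to decode."""
        self.failures = 0 if ok else self.failures + 1
        if self.failures >= self.max_decode_failures:
            self.failures = 0
            return True
        return False
```

Whenever either method returns True, the receiving end would ask the transmitting end for the last IDR frame instead of asking for individual RTP packets.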
During the first 3 seconds after video transmission starts, the transmitting end sends one I frame every 1 s, so that the receiving end can quickly output a stable video image. During these 3 seconds, FIR requests sent by the receiving end are not processed.
After the 3 seconds, when the transmitting end receives an RTCP PLI or FIR message from the receiving end, it notifies the application layer through an RTCP event, which triggers the next frame to be sent directly as an IDR frame, subject to a minimum interval of 300 ms between IDR frames. If an I frame has just been sent and an I-frame retransmission request from the peer arrives within 300 ms, while P (B) frames are still being sent, the request is held and no I frame is sent until the 300 ms have elapsed. No matter how many I-frame retransmission requests are received in this period, they are handled together after the 300 ms, and the I frame is transmitted only once.
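The request handling described above (ignore requests during the first 3 seconds, enforce a 300 ms minimum IDR interval, and coalesce multiple requests into one IDR frame) might be sketched as below. The class `IdrRequestThrottle` and its method names are assumptions made for illustration, and times are plain integer milliseconds supplied by the caller.

```python
class IdrRequestThrottle:
    """Sender-side handling of PLI/FIR requests (illustrative sketch)."""

    def __init__(self, start_ms, startup_ms=3000, min_interval_ms=300):
        self.start_ms = start_ms              # when video transmission started
        self.startup_ms = startup_ms          # requests are ignored in this window
        self.min_interval_ms = min_interval_ms
        self.last_idr_ms = None
        self.pending = False                  # at most one coalesced request

    def on_idr_sent(self, now_ms):
        self.last_idr_ms = now_ms
        self.pending = False

    def on_request(self, now_ms):
        """Record a PLI/FIR request unless we are still in the startup window."""
        if now_ms - self.start_ms < self.startup_ms:
            return
        self.pending = True                   # repeated requests collapse into one

    def should_send_idr(self, now_ms):
        """Ask before encoding each frame whether it must be an IDR frame."""
        if not self.pending:
            return False
        if self.last_idr_ms is not None and now_ms - self.last_idr_ms < self.min_interval_ms:
            return False
        return True
```

Because `pending` is a single flag, any number of requests received inside the 300 ms window result in exactly one IDR frame once the window has passed.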
Specifically, the requesting the sender to retransmit the last IDR frame includes:
S3.1, the receiving end sends a packet loss retransmission message to the transmitting end;
S3.2, the transmitting end defines the last IDR frame as the next frame and transmits it to the receiving end. In the prior art, after the receiving end receives an IDR frame, all previously buffered frames are deleted; in this step, the transmitting end redefines the previous IDR frame as the next frame, so that it can be transmitted again at the receiving end's request.
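The text does not specify the wire format of the message in S3.1. Assuming it is the RTCP PLI (Picture Loss Indication) mentioned in this embodiment, a minimal sketch of building such a packet is shown below; the function name `build_pli` is hypothetical, and the layout follows the generic payload-specific feedback format (version 2, FMT 1, packet type 206, two SSRC fields, no feedback control information).

```python
import struct

RTCP_PT_PSFB = 206  # payload-specific feedback packet type
FMT_PLI = 1         # feedback message type for Picture Loss Indication


def build_pli(sender_ssrc: int, media_ssrc: int) -> bytes:
    """Build a 12-byte RTCP PLI packet (sketch; no compound-packet wrapping)."""
    first_byte = (2 << 6) | FMT_PLI   # V=2, P=0, FMT=1
    length = 2                        # packet length in 32-bit words, minus one
    return struct.pack("!BBHII", first_byte, RTCP_PT_PSFB, length,
                       sender_ssrc, media_ssrc)
```

In a real stack this packet would normally be sent as part of a compound RTCP packet; that wrapping is omitted here.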
Further, after the request for retransmission of the last IDR frame from the transmitting end, the following steps are performed:
S4, after a delay of the second time, the transmitting end again encodes and transmits an IDR frame to the receiving end every GOP time.
In this step, the receiving end has detected packet loss in step S3 and has used RTCP to notify the transmitting end to encode the previous frame as an IDR frame and send it as the next frame, which indicates that the network is congested. After completing S3.2, the transmitting end can therefore reduce the frequency at which it sends subsequent IDR frames, i.e. extend the GOP time, to prevent the receiving end from continuously requesting retransmission of the IDR frame and causing further congestion. The second time may be set to 10-15 s; 10 s is used in this embodiment. That is, step S4 waits 10 seconds before returning to step S2, in which the transmitting end encodes and transmits an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image.
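One possible reading of step S4 is a scheduler that emits a periodic IDR frame every GOP time and, after serving a retransmission request, holds off for the second time before resuming. The sketch below follows that reading; the class `IdrScheduler`, its method names and the use of integer milliseconds are assumptions for illustration only.

```python
class IdrScheduler:
    """Periodic IDR timing with a hold-off after a requested IDR (illustrative sketch)."""

    def __init__(self, gop_ms=1000, hold_off_ms=10_000):
        self.gop_ms = gop_ms              # GOP time: 1 s in this embodiment
        self.hold_off_ms = hold_off_ms    # "second time": 10 s in this embodiment
        self.next_periodic_ms = 0

    def on_requested_idr_sent(self, now_ms):
        # S4: after serving a retransmission request, delay the periodic IDR frames.
        self.next_periodic_ms = now_ms + self.hold_off_ms

    def periodic_idr_due(self, now_ms):
        """Return True when a periodic IDR frame should be encoded now."""
        if now_ms < self.next_periodic_ms:
            return False
        self.next_periodic_ms = now_ms + self.gop_ms
        return True
```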
Further, in the case of network congestion, the video and the audio analyzed by the receiving end may be unsynchronized, and the user may consider that the lip action in the video is unsynchronized with the sound in the video communication process, so as to alleviate or avoid the situation, the present application proposes a solution in the following steps.
After the communication between the extension or the mobile terminal and the host is established, the following steps are executed:
S5, setting the priority of transmitting audio to be greater than the priority of transmitting video.
In this step, audio has priority over video. Because normal transmission of audio is guaranteed first, the receiving end plays audio data directly when it receives it, and plays video data with reference to the audio time stamp. If video were used as the time reference and the network deteriorated, audio playback would be delayed throughout in order to match the video.
Specifically, the setting the priority of the transmission audio to be greater than the priority of the transmission video includes:
S5.1, defining audio as the main axis and video as the auxiliary axis, the receiving end caching the audio and the video on a first-in first-out basis. In this embodiment, audio and video are buffered for 0.6 seconds and the maximum buffer is set to 2 seconds, i.e. 100 audio entries and 50 video entries; the earliest data is deleted if the cache exceeds this limit by 10 percent;
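A first-in first-out cache with the capacities given in this embodiment (100 audio entries, 50 video entries) can be sketched with bounded deques. The class `MediaCache` and its methods are hypothetical, and the sketch simplifies the overflow rule: it drops the earliest entry as soon as a queue is full and does not model the 10 percent margin.

```python
from collections import deque


class MediaCache:
    """Bounded FIFO caches for audio and video (illustrative sketch)."""

    def __init__(self, audio_capacity=100, video_capacity=50):
        # A deque with maxlen silently discards the oldest entry when full.
        self.audio = deque(maxlen=audio_capacity)
        self.video = deque(maxlen=video_capacity)

    def push_audio(self, ts_ms, payload):
        self.audio.append((ts_ms, payload))

    def push_video(self, ts_ms, payload):
        self.video.append((ts_ms, payload))

    def pop_audio(self):
        return self.audio.popleft() if self.audio else None

    def pop_video(self):
        return self.video.popleft() if self.video else None
```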
S5.2, when the receiving end obtains the audio and/or the video each time, calculating a time difference value between the time stamp of the audio and the time stamp of the video in the current frame, and adjusting the playing speed of the video in the current frame according to the time difference value.
Specifically, the adjusting the playing speed of the video in the current frame according to the time difference value includes:
If the time difference is less than or equal to 250ms, the playing speed of the video in the current frame is not adjusted. This is because users do not strongly perceive a lip-sync error when video and audio differ by 250ms or less; the time difference may be computed as video time stamp minus audio time stamp or as audio time stamp minus video time stamp.
If the time difference is greater than 250ms, the video is faster than the audio and the number of cached video frames is smaller than the first number, the playing of the video in the current frame is slowed down or paused until the time difference is less than or equal to 250ms. In this situation the video is ahead of the audio, so the video must be played slowly or paused for the audio and the video to synchronize.
If the time difference is greater than 250ms, the video is faster than the audio and the number of cached video frames is greater than the first number, the playing speed of the video in the current frame is not adjusted. In this case there is more video than audio in the cache, and the video cache may shrink on its own because video packets are lost under network congestion, so the video is neither slowed down nor sped up, and synchronization recovers naturally in the subsequent process.
If the time difference is greater than 250ms and the video is slower than the audio, the playing speed of the video in the current frame is adjusted until the time difference is less than or equal to 250ms. When network congestion causes video and audio data to be received out of step and the number of cached video frames exceeds the preset number, the video needs to be output and played quickly.
In this embodiment, the first number may be 50. In other embodiments any number from 10-100 is possible.
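The four cases above can be condensed into a single decision function. This is a sketch under stated assumptions: the function name `video_action` and its string return values are hypothetical, time stamps are taken to be in milliseconds on a common clock, "video faster than audio" is read as the video time stamp being ahead of the audio time stamp, and the case where the cache count exactly equals the first number (not specified in the text) is treated as "no adjustment".

```python
def video_action(audio_ts_ms, video_ts_ms, video_cached,
                 threshold_ms=250, first_number=50):
    """Decide how to play the current video frame relative to audio (sketch)."""
    diff = video_ts_ms - audio_ts_ms
    if abs(diff) <= threshold_ms:
        return "normal"            # within tolerance: do not adjust
    if diff > 0:                   # video is ahead of audio
        if video_cached < first_number:
            return "slow_or_pause"
        return "normal"            # large video cache: let it resynchronize
    return "speed_up"              # video is behind audio


# Example: video 400 ms ahead with only 10 cached frames -> slow down or pause.
decision = video_action(audio_ts_ms=1000, video_ts_ms=1400, video_cached=10)
```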
Referring to fig. 4, the method is based on a one-call-multiple SIP system integrating a local area network and a wide area network, and includes a host, a plurality of extensions, and a plurality of mobile terminals, where the host establishes a connection with the extensions through the local area network, and the host establishes a connection with the plurality of mobile terminals through the wide area network. A workflow of a one-call-multiple SIP system that merges a local area network and a wide area network may refer to fig. 5. Existing SIP systems and their workflow refer to fig. 6-7.
Compared with the prior art, in the system, the host can directly communicate with the extension without passing through the SIP server, so that the communication link that data of the host are transmitted to the SIP server and then transmitted to the extension is reduced, and the data are directly transmitted from the host to the extension, thereby further improving the definition and fluency of video.
There is further provided a computer readable storage medium storing a computer program which when executed by a processor implements the SIP intercom method of maintaining video clarity in the event of network congestion.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
In the description of the present application, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of the technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present application, "a plurality" means two or more, unless explicitly defined otherwise.
In the present application, unless otherwise explicitly specified and limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction relationship between two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms should not be understood as necessarily being directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not contradict one another.

Claims (7)

1. A SIP intercom method for keeping video clear under network congestion, characterized in that the method comprises the following steps:
after communication is established between the extension or mobile terminal and the host, performing:
the transmitting end encodes and transmits an IDR frame to the receiving end once every GOP period, and the receiving end decodes the IDR frame and outputs the corresponding image;
when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, requesting the transmitting end to retransmit the last IDR frame;
wherein requesting the transmitting end to retransmit the last IDR frame comprises:
the receiving end sends a packet loss retransmission message to the transmitting end;
the transmitting end defines the last IDR frame as the next frame and transmits that frame to the receiving end;
the method is based on a one-call-multiple SIP system integrating a local area network and a wide area network, wherein the system comprises a host, a plurality of extensions and a plurality of mobile terminals, the host establishes connections with the extensions through the local area network, and the host establishes connections with the mobile terminals through the wide area network;
before communication is established between an extension or mobile terminal and the host, performing:
establishing connections between the host and the plurality of extensions through the local area network, establishing connections between the host and the plurality of mobile terminals through the wide area network, and registering the host and the mobile terminals with a SIP server;
when any extension or any mobile terminal answers the call, the host sends a call hang-up instruction to the other extensions and the other mobile terminals, and communication between the answering extension or mobile terminal and the host is established.
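The two trigger conditions in claim 1 (a discontinuous RTP sequence number, or two consecutive frame decoding failures) can be illustrated with a minimal receiver-side sketch. This is not the patented implementation: the class name `LossDetector`, its method names, and the choice to reset the failure counter after a trigger are assumptions made for illustration only. The only external fact relied on is that RTP sequence numbers are 16-bit and wrap around.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16-bit and wrap around


@dataclass
class LossDetector:
    """Hypothetical helper tracking RTP continuity and decode failures."""
    expected_seq: Optional[int] = None
    consecutive_decode_failures: int = 0

    def on_rtp_packet(self, seq: int) -> bool:
        """Return True if this packet's sequence number is discontinuous."""
        gap = self.expected_seq is not None and seq != self.expected_seq
        # Resynchronise on the packet just seen, whether or not there was a gap.
        self.expected_seq = (seq + 1) % SEQ_MOD
        return gap

    def on_frame_decoded(self, ok: bool) -> bool:
        """Return True once two consecutive frames have failed to decode."""
        if ok:
            self.consecutive_decode_failures = 0
            return False
        self.consecutive_decode_failures += 1
        if self.consecutive_decode_failures >= 2:
            # Assumption: start counting afresh after a request is triggered.
            self.consecutive_decode_failures = 0
            return True
        return False
```

Whenever either method returns True, the receiving end would send the packet loss retransmission message described in the claim; how that message is encoded is not specified here.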
2. The SIP intercom method for keeping video clear under network congestion according to claim 1, wherein, after requesting the transmitting end to retransmit the last IDR frame, the following is performed:
after a second time delay, the transmitting end resumes encoding and transmitting an IDR frame to the receiving end every GOP period.
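The transmitting-end behaviour in claims 1 and 2 (send an IDR frame every GOP, and on a retransmission request make the last IDR frame the next frame sent) can be sketched as follows. This is only an illustration under stated assumptions: `IdrSender`, `gop_frames`, and the `encode_idr` / `encode_p` callbacks are hypothetical names, the GOP is measured in frames rather than time, and the second time delay of claim 2 is not modelled; the sketch simply continues the periodic schedule after the resend.

```python
from typing import Callable, Optional, Tuple


class IdrSender:
    """Hypothetical transmitting-end scheduler for periodic and resent IDR frames."""

    def __init__(self, gop_frames: int) -> None:
        self.gop_frames = gop_frames          # assumed: GOP length in frames
        self.frame_index = 0
        self.last_idr: Optional[bytes] = None
        self.resend_pending = False

    def on_retransmit_request(self) -> None:
        """Called when the packet loss retransmission message arrives."""
        if self.last_idr is not None:
            self.resend_pending = True

    def next_frame(
        self,
        encode_idr: Callable[[], bytes],
        encode_p: Callable[[], bytes],
    ) -> Tuple[str, bytes]:
        """Pick the next frame to transmit."""
        if self.resend_pending and self.last_idr is not None:
            # The last IDR frame is defined as the next frame to send.
            self.resend_pending = False
            return ("IDR-resend", self.last_idr)
        if self.frame_index % self.gop_frames == 0:
            self.last_idr = encode_idr()
            frame = ("IDR", self.last_idr)
        else:
            frame = ("P", encode_p())
        self.frame_index += 1
        return frame
```

Resending the cached IDR frame does not advance `frame_index`, so the regular GOP cadence is unaffected by a retransmission in this sketch.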
3. The SIP intercom method for keeping video clear under network congestion according to claim 1, wherein, after communication is established between the extension or mobile terminal and the host, the following is performed:
the priority of transmitting audio is set higher than the priority of transmitting video.
4. The SIP intercom method for keeping video clear under network congestion according to claim 3, wherein setting the priority of transmitting audio higher than the priority of transmitting video comprises:
defining audio as the main axis and video as the auxiliary axis, the receiving end caching the audio and the video on a first-in first-out basis;
each time the receiving end obtains audio and/or video, calculating the time difference between the audio timestamp and the video timestamp of the current frame, and adjusting the playing speed of the video of the current frame according to the time difference.
5. The SIP intercom method for keeping video clear under network congestion according to claim 4, wherein adjusting the playing speed of the video of the current frame according to the time difference comprises:
if the time difference is less than or equal to 250 ms, not adjusting the playing speed of the video of the current frame;
if the time difference is greater than 250 ms, the video is faster than the audio, and the number of cached video frames is smaller than a first number, slowing down or pausing the playback of the video of the current frame until the time difference is less than or equal to 250 ms;
if the time difference is greater than 250 ms, the video is faster than the audio, and the number of cached video frames is greater than the first number, not adjusting the playing speed of the video of the current frame;
and if the time difference is greater than 250 ms and the video is slower than the audio, adjusting the playing speed of the video of the current frame until the time difference is less than or equal to 250 ms.
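The four branches of claim 5 can be written as a small decision function. This is a sketch under assumptions, not the claimed implementation: the function and parameter names are hypothetical, "video is faster than audio" is interpreted as the video timestamp being ahead of the audio timestamp, and the case where the cached count equals the first number (which the claim leaves unspecified) is treated as "do not adjust".

```python
SYNC_THRESHOLD_MS = 250  # threshold stated in claim 5


def video_action(
    audio_ts_ms: int,
    video_ts_ms: int,
    cached_video_frames: int,
    first_number: int,
) -> str:
    """Return the playback action for the current video frame."""
    diff = video_ts_ms - audio_ts_ms  # assumed: positive means video is ahead
    if abs(diff) <= SYNC_THRESHOLD_MS:
        return "keep"
    if diff > 0:
        # Video is ahead of audio by more than the threshold.
        if cached_video_frames < first_number:
            return "slow_or_pause"
        # Assumption: an equal count is handled like "greater than".
        return "keep"
    # Video lags audio by more than the threshold.
    return "adjust"
```

For example, `video_action(1000, 1400, 5, 10)` falls into the second branch of the claim: the video is 400 ms ahead and fewer than the first number of frames are cached, so playback is slowed or paused.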
6. A SIP intercom system for keeping video clear under network congestion, characterized in that the system comprises the following modules:
an encoding and transmitting module, used for the transmitting end to encode and transmit an IDR frame to the receiving end once every GOP period, the receiving end decoding the IDR frame and outputting the corresponding image;
and a request module, used for requesting the transmitting end to retransmit the last IDR frame when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding.
7. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the SIP intercom method for keeping video clear under network congestion as claimed in any one of claims 1 to 5.
CN202210468252.1A 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear Active CN115102927B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210468252.1A CN115102927B (en) 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear
PCT/CN2022/117773 WO2023206910A1 (en) 2022-04-29 2022-09-08 Sip intercom method and system based on local area network and wide area network, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210468252.1A CN115102927B (en) 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear

Publications (2)

Publication Number Publication Date
CN115102927A CN115102927A (en) 2022-09-23
CN115102927B true CN115102927B (en) 2023-10-27

Family

ID=83287499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210468252.1A Active CN115102927B (en) 2022-04-29 2022-04-29 SIP intercom method, system and storage device for keeping video clear

Country Status (1)

Country Link
CN (1) CN115102927B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010003342A1 (en) * 2008-07-07 2010-01-14 华为技术有限公司 Method, device and system for identifying frame type of rtp packet
JP2010177933A (en) * 2009-01-28 2010-08-12 Aiphone Co Ltd Intercom system
CN104618786A (en) * 2014-12-22 2015-05-13 深圳市腾讯计算机系统有限公司 Audio/video synchronization method and device
CN204362246U (en) * 2014-11-20 2015-05-27 深圳市华百安智能技术有限公司 The non-viewable numbers intercom system of a kind of across a network
CN107231328A (en) * 2016-03-23 2017-10-03 福建星网锐捷通讯股份有限公司 Method for real-time video transmission, device, equipment and system
CN110012363A (en) * 2019-04-18 2019-07-12 浙江工业大学 A kind of video chat system based on Session Initiation Protocol
CN112995214A (en) * 2021-04-26 2021-06-18 安心智能(武汉)信息技术有限公司 Real-time video transmission system, method and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11350142B2 (en) * 2019-01-04 2022-05-31 Gainspan Corporation Intelligent video frame dropping for improved digital video flow control over a crowded wireless network

Also Published As

Publication number Publication date
CN115102927A (en) 2022-09-23

Similar Documents

Publication Publication Date Title
US6697097B1 (en) Synchronizing voice and video transmitted over separate channels
US8526048B1 (en) Systems and methods for the reliable transmission of facsimiles over packet networks
JP4479650B2 (en) Communication system, terminal device and computer program
CN105704580B (en) A kind of video transmission method
US7290058B2 (en) Video mail server with reduced frame loss
JP5421346B2 (en) High-speed transmission method and apparatus for unicast stream in high-speed channel change
JP2018529261A (en) Sender video phone downgrade
JP2009163734A (en) Method and system for fast session establishment between equipment using h.324 and related telecommunications protocol, and h.324 similar terminal
WO2010066135A1 (en) Channel switching method, device and system
MX2011012652A (en) Method, apparatus and system for reducing media delay.
WO2011022977A1 (en) Video data reception and transmission system and video data processing method for videophone
US8339439B2 (en) Method of speeding up video recovery of videotelephony after an interruption and mobile terminal and system using the same
JP3707369B2 (en) Video phone equipment
CN108366044B (en) VoIP remote audio/video sharing method
CN110012363B (en) Video chat system based on SIP protocol
Schierl et al. 3GPP compliant adaptive wireless video streaming using H. 264/AVC
US10085029B2 (en) Switching display devices in video telephony
CN114979080B (en) SIP intercom method, system and storage device integrating local area network and wide area network
CN115102927B (en) SIP intercom method, system and storage device for keeping video clear
CN109274980A (en) A kind of data transmission method for being quickly broadcast live
EP2512161A1 (en) Method, service terminal and server for mobile video communicating
WO2023206910A1 (en) Sip intercom method and system based on local area network and wide area network, and storage medium
JP3969155B2 (en) Multimedia communication transfer method, multimedia communication terminal, exchange, management device
JP2005210160A (en) Video receiving terminal having communication state display
KR101462222B1 (en) Image currency connection method between different kind terminal that video format differs

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Address after: Unit 403-12, 4th Floor, No. 56, Chengyi North Street, Phase III, Software Park, Torch High-tech Zone, Xiamen, Fujian 361000

Applicant after: XIAMEN LEELEN TECHNOLOGY Co.,Ltd.

Address before: 2-5 / F, 780 Tieshan Road, Guankou Town, Jimei District, Xiamen City, Fujian Province 361021

Applicant before: XIAMEN LEELEN TECHNOLOGY Co.,Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant