CN115102927B - SIP intercom method, system and storage device for keeping video clear - Google Patents
- Publication number
- CN115102927B, CN202210468252.1A
- Authority
- CN
- China
- Prior art keywords
- video
- frame
- receiving end
- host
- sip
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2425—Traffic characterised by specific attributes, e.g. priority or QoS for supporting services specification, e.g. SLA
- H04L47/2433—Allocation of priorities to traffic types
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/1066—Session management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/1066—Session management
- H04L65/1069—Session establishment or de-establishment
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/80—Responding to QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Computer Security & Cryptography (AREA)
- Mobile Radio Communication Systems (AREA)
- Telephonic Communication Services (AREA)
Abstract
The application relates to a SIP intercom method, system and storage device for keeping video clear. After communication is established between an extension or mobile terminal and the host, the method performs the following: the transmitting end encodes and transmits an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image; when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, it requests the transmitting end to retransmit the last IDR frame. When the receiving end detects packet loss, it informs the sender to encode the previous frame as an IDR frame and send it as the next IDR frame. This avoids the traditional approach of repeatedly requesting retransmission of the lost RTP packets from the sending end, and lets the receiving end play one complete IDR frame directly, so that the user obtains a clear image.
Description
Technical Field
The application relates to the field of SIP (Session Initiation Protocol) intercom, and in particular to a SIP intercom method, system and storage device for keeping video clear.
Background
In a building intercom system, multiple hosts and multiple extensions are installed in a local area network, and SIP intercom software is installed on the user's mobile phone. A host can call the extensions or the mobile phone over the network to carry out SIP intercom. In practice, the host may call several extensions and several mobile phones in the user's home at the same time, and it must send early video to these extensions and mobile phones while the call is ringing, so that they can receive and display the host's video while they ring.
Existing building intercom systems suffer from video snow, mosaics and similar artifacts during video calls when the network is congested. The reason is that the receiving end sends RTCP NACK messages to the transmitting end to request retransmission of the corresponding RTP packets; when congestion is severe, the large number of retransmission requests aggravates the congestion and produces even more snow or mosaics, as shown in fig. 1.
Aiming at the problems in the prior art, the application designs a SIP intercom method, a system and a storage device for keeping video clear.
Disclosure of Invention
Aiming at the problems in the prior art, the application aims to provide a SIP intercom method, a system and a storage device for keeping video clear, which can effectively solve the problems in the prior art.
The technical scheme of the application is as follows:
A SIP intercom method for keeping video clear in the event of network congestion, the method comprising the following steps:
after communication is established between the extension or mobile terminal and the host, performing:
the transmitting end encodes and transmits an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image;
and when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, it requests the transmitting end to retransmit the last IDR frame.
Further, requesting the transmitting end to retransmit the last IDR frame includes:
the receiving end sends a packet loss retransmission message to the transmitting end;
the transmitting end designates the last IDR frame as the next frame and transmits it to the receiving end.
Further, after the transmitting end has been requested to retransmit the last IDR frame, the following is performed:
after a delay of a second time, the transmitting end again encodes and transmits an IDR frame to the receiving end every GOP time.
Further, after communication is established between the extension or the mobile terminal and the host, performing:
setting the priority of audio transmission higher than the priority of video transmission.
Further, setting the priority of audio transmission higher than the priority of video transmission includes:
defining audio as the main axis (the timing reference) and video as the auxiliary axis, with the receiving end caching the audio and the video on a first-in, first-out basis;
and each time the receiving end obtains audio and/or video, calculating the time difference between the audio timestamp and the video timestamp of the current frame, and adjusting the playing speed of the video in the current frame according to the time difference.
Further, adjusting the playing speed of the video in the current frame according to the time difference includes:
if the time difference is less than or equal to 250 ms, not adjusting the playing speed of the video in the current frame;
if the time difference is greater than 250 ms, the video is faster than the audio, and the number of buffered video frames is smaller than a first number, slowing down or pausing playback of the video in the current frame until the time difference is less than or equal to 250 ms;
if the time difference is greater than 250 ms, the video is faster than the audio, and the number of buffered video frames is greater than the first number, not adjusting the playing speed of the video in the current frame;
and if the time difference is greater than 250 ms and the video is slower than the audio, adjusting the playing speed of the video in the current frame until the time difference is less than or equal to 250 ms.
Further, the method is based on a one-call-multiple SIP system integrating a local area network and a wide area network, and is characterized in that: the system comprises a host, a plurality of extensions and a plurality of mobile terminals, wherein the host establishes connection with the extensions through a local area network, and the host establishes connection with the mobile terminals through a wide area network.
Further, before the communication between the extension or the mobile terminal and the host, performing:
establishing connection between a host and a plurality of extensions through a local area network, establishing connection between the host and a plurality of mobile terminals through a wide area network, and registering the host and the mobile terminals to an SIP server;
when any extension or any mobile terminal initiates answering, the host sends a call hang-up instruction to other extensions or other mobile terminals, and communication between the corresponding extension or mobile terminal and the host is established.
Further provided is a SIP intercom system for keeping video clear in case of network congestion, comprising the following modules:
the coding and transmitting module is used for the transmitting end to code and transmit the IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs a corresponding image;
and the request module is used for requesting the sender to resend the last IDR frame when the discontinuous RTP sequence number or the continuous two-frame decoding failure is detected in the video decoding of the receiving end.
There is further provided a computer readable storage medium storing a computer program which when executed by a processor implements the SIP intercom method of maintaining video clarity in the event of network congestion.
Accordingly, the present application provides the following effects and/or advantages:
When the receiving end detects packet loss, it informs the sender to encode the previous frame as an IDR frame and send it as the next IDR frame. This avoids the traditional approach of repeatedly requesting retransmission of the lost RTP packets from the sending end, and lets the receiving end play one complete IDR frame directly, so that the user obtains a clear image.
In the method, because of network congestion and similar conditions, after retransmission of the last IDR frame has been requested, the transmitting end waits for the second time before it again encodes and transmits an IDR frame to the receiving end every GOP time. This prevents the receiving end from continuously requesting IDR retransmission and causing further congestion while the network is already congested. In addition, because the network is currently congested, less data is sent and received during the delay, which relieves the congestion.
In the method, the priority of audio transmission is set higher than the priority of video transmission. Because audio data is usually smaller than video data, audio can be played directly while video playback is gradually adjusted until the timestamps of the two match.
The method is based on a SIP intercom system formed by a local area network and a wide area network, in which only the host and the mobile terminals are registered to the SIP server and the extensions are not. Terminals inside the local area network exchange data directly without passing through the SIP server, which reduces the communication load on the SIP server and improves the quality of the video conversation.
It is to be understood that both the foregoing general description and the following detailed description of the present application are exemplary and explanatory and are intended to provide further explanation of the application as claimed.
Drawings
Fig. 1 is a video screenshot of the prior-art SIP intercom method mentioned in the Background under the condition of network congestion.
Fig. 2 is a flow chart of the method provided by the application.
Fig. 3 is a video screenshot of the SIP intercom method provided by the present application under the condition of network congestion.
Fig. 4 is a block diagram of a SIP intercom system provided by the present application.
Fig. 5 is a logic/timing diagram of the SIP intercom method provided by the present application.
Fig. 6 is a system block diagram of a conventional SIP intercom.
Fig. 7 is a logic/timing diagram of a conventional SIP intercom.
Detailed Description
To facilitate understanding by those skilled in the art, the present application will now be described in further detail with reference to the accompanying drawings. It should be understood that, unless specifically stated otherwise, the steps mentioned in this embodiment may be performed sequentially, or simultaneously or partially simultaneously.
Referring to fig. 2, a SIP intercom method for keeping video clear in case of network congestion includes the following steps:
S1, the host establishes connection with a plurality of extensions through a local area network, the host establishes connection with a plurality of mobile terminals through a wide area network, and the host and the mobile terminals are registered to a SIP server;
SIP is an IETF standards-track protocol modeled on protocols such as SMTP and HTTP. It is used to set up, modify and terminate calls between users over an IP network. Providing telephony services also requires combining it with other standards and protocols: in particular, to ensure transport, to interwork signaling with the existing telephone network, to guarantee voice quality, to provide directories, to authenticate users, and so on. A host is a terminal that initiates video or voice communication to multiple other users, terminals or extensions.
In this embodiment, only the host and the mobile terminals are registered with the SIP server; the extensions are not. Registering with the SIP server means that the host or mobile terminal periodically sends a registration request (REGISTER) to the network, reporting its current IP address, user name and other information. The SIP server then keeps the information of the host or mobile terminal on record.
In this embodiment, when a user initiates a call at the host, the host uses the SIP protocol to call the extensions over the local area network and the mobile terminals over the wide area network at the same time. During ringing, if any extension or mobile terminal does not want to answer or to keep ringing, it hangs up and sends a call cancel instruction (CANCEL) to the host. That extension or mobile terminal then stops ringing, and the host stops sending instructions and media data to it; the other extensions and mobile terminals are unaffected and keep ringing. When any extension or mobile terminal answers, the host sends a call hang-up instruction to the other extensions and mobile terminals, and communication is established between the answering extension or mobile terminal and the host.
In this step, when the user answers the host's call on an extension or mobile terminal, the other extensions and mobile terminals no longer need to establish communication with the host, so the host sends them a call hang-up instruction and they stop ringing. At the same time, communication is established between the answering extension or mobile terminal and the host, and the user starts a voice or video conversation with the host.
Through the above steps, a given extension or mobile terminal can hang up while ringing without affecting the ringing of the other devices; when any called device answers, the host hangs up the other call legs and carries out audio and video intercom with that device. By contrast, in the prior art both the extensions and the mobile phones are registered to the SIP server, and the extension's local area network needs an outlet to the wide area network; even when only extensions are used, communication data between extension and host must detour through the SIP server and back into the local area network, which increases the complexity of communication between extension and host and reduces its quality. In the present application the host and the extensions are connected through the local area network, and no SIP server is needed to relay data inside the local area network, so the extensions need not be registered to the SIP server: the host calls local area network extensions directly and calls wide area network mobile terminals through the SIP server.
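The one-call-multiple ringing behaviour described above can be illustrated with a small state sketch. This is only an illustration under stated assumptions: the class `RingGroup`, its method names and the device labels are hypothetical and do not come from the application, and no SIP signaling is actually performed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class RingGroup:
    """Hypothetical model of one host call to several devices."""
    ringing: Set[str] = field(default_factory=set)
    connected: Optional[str] = None
    hangups_sent: List[str] = field(default_factory=list)

    def call(self, devices):
        # The host calls every extension (LAN) and mobile terminal (WAN) at once.
        self.ringing = set(devices)

    def on_cancel(self, device):
        # One device declines: it stops ringing, the others keep ringing.
        self.ringing.discard(device)

    def on_answer(self, device):
        # One device answers: hang up every other ringing device, then connect.
        if device not in self.ringing or self.connected is not None:
            return
        self.hangups_sent = sorted(self.ringing - {device})
        self.ringing = set()
        self.connected = device
```

For example, after `call(["ext1", "ext2", "phone1"])`, a cancel from `ext2` leaves `ext1` and `phone1` ringing, and an answer from `phone1` results in a hang-up being recorded for `ext1` only.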
The following describes some of the optimization directions of this embodiment.
S2, the transmitting end encodes and transmits an IDR frame to the receiving end every GOP time, and the receiving end decodes the IDR frame and outputs the corresponding image;
S3, when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, it requests the sending end to resend the last IDR frame.
In this embodiment, H.264 I frames are divided into ordinary I frames and IDR frames (special I frames). An IDR frame stops the accumulation of errors: no frame after an IDR frame can reference a frame before it, whereas an ordinary I frame does not stop the accumulation of errors. An IDR frame is always an I frame, but an I frame is not necessarily an IDR frame. In H.264 the pictures are organized in sequences, one sequence being the stream of pictures encoded from one I frame up to the next I frame. The first picture of a sequence is called the IDR picture, and when the decoder receives an IDR frame it discards all frames in its reference frame queue. "GOP time" refers to the interval between two I frames. Depending on the video communication requirement, GOP time is generally 1-2 seconds; 1 second is used in this embodiment.
In step S2, an IDR frame is encoded and sent to the receiving end every GOP time, so the receiving end can parse the IDR frame first and output it directly, and the screen of the receiving end shows a picture immediately, as shown in fig. 3. When network congestion occurs in the prior art, the receiving end sends an RTCP NACK message to the sending end to request retransmission of the corresponding RTP packets, and when the congestion is serious the picture shown in fig. 1 appears, full of mosaics and the like. To avoid this, in step S3 the present embodiment abandons sending NACK messages to request retransmission of the corresponding RTP packets and instead requests the sending end to retransmit the last IDR frame. Since one IDR frame contains a complete image, when the receiving end detects a discontinuous RTP sequence number or two consecutive frame decoding failures during video decoding, it requests a complete image again; the receiving end then decodes the retransmitted IDR frame, as shown in fig. 3, so that snow, mosaics and similar artifacts do not occur.
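The two trigger conditions of step S3 can be sketched as follows. This is a minimal sketch, not the application's implementation: the class `LossDetector` and its method names are hypothetical, and for simplicity any packet that arrives out of the expected order is treated as a gap, which a real receiver with a reordering buffer would not necessarily do. The only external fact relied on is that RTP sequence numbers are 16 bits wide and wrap around.

```python
class LossDetector:
    """Hypothetical receiver-side check for the two S3 trigger conditions."""

    def __init__(self):
        self.expected_seq = None   # next RTP sequence number we expect
        self.decode_failures = 0   # consecutive frames that failed to decode

    def on_rtp_packet(self, seq):
        """Return True if the RTP sequence numbers are discontinuous."""
        gap = self.expected_seq is not None and seq != self.expected_seq
        self.expected_seq = (seq + 1) & 0xFFFF  # 16-bit counter wraps to 0
        return gap

    def on_frame_decoded(self, ok):
        """Return True once two consecutive frames have failed to decode."""
        self.decode_failures = 0 if ok else self.decode_failures + 1
        if self.decode_failures >= 2:
            self.decode_failures = 0  # start counting again after a request
            return True
        return False
```

Whenever either method returns `True`, the receiving end would send its request for the last IDR frame instead of a NACK for the individual missing packets.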
During the first 3 seconds after video transmission starts, the sending end sends one I frame every 1 s, so that the receiving end can quickly display a stable video image. During these 3 seconds, FIR requests sent from the receiving end are not processed.
After the 3 seconds, when the sending end receives an RTCP PLI or FIR message from the receiving end, it notifies the application layer through an RTCP event and triggers the next frame to be transmitted directly as an IDR frame, while keeping a minimum interval of 300 ms between IDR frames. If an I frame has just been sent and an I frame retransmission request from the other side arrives within 300 ms, while P (B) frames are being sent, the I frame is not sent until the 300 ms have passed. However many I frame retransmission requests are received during this period, they are handled together after the 300 ms and the I frame is transmitted only once.
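The request handling described above (requests ignored during the first 3 seconds, then coalesced with a 300 ms minimum spacing between IDR frames) can be sketched as a small throttle. The class `IdrThrottle` and its method names are hypothetical; times are integer milliseconds to keep the comparisons exact. The encoder is assumed to call `should_send_idr` before encoding each frame and `on_idr_sent` after emitting an IDR frame.

```python
class IdrThrottle:
    """Hypothetical sender-side gate for PLI/FIR-triggered IDR frames."""

    STARTUP_MS = 3000       # requests are not processed during this window
    MIN_INTERVAL_MS = 300   # minimum spacing between two IDR frames

    def __init__(self, start_ms):
        self.start_ms = start_ms
        self.last_idr_ms = None
        self.pending = False

    def on_idr_sent(self, now_ms):
        # Any IDR frame (periodic or requested) satisfies pending requests.
        self.last_idr_ms = now_ms
        self.pending = False

    def on_request(self, now_ms):
        # A PLI or FIR arrived; ignore it during the start-up window.
        if now_ms - self.start_ms < self.STARTUP_MS:
            return
        self.pending = True  # several requests collapse into one flag

    def should_send_idr(self, now_ms):
        # True if the next frame should be encoded as an IDR frame.
        if not self.pending:
            return False
        if (self.last_idr_ms is not None
                and now_ms - self.last_idr_ms < self.MIN_INTERVAL_MS):
            return False
        return True
```

Using a single `pending` flag is one simple way to make sure that any number of requests inside the 300 ms window produces exactly one IDR frame.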
Specifically, requesting the sending end to retransmit the last IDR frame includes:
S3.1, the receiving end sends a packet loss retransmission message to the sending end;
S3.2, the sending end designates the last IDR frame as the next frame and transmits it to the receiving end. In the prior art, after the receiving end receives an IDR frame, all previously buffered frames are deleted; in this step, the sending end redefines the previous IDR frame as the next frame, so that it can be transmitted again at the request of the receiving end.
Further, after the sending end has been requested to retransmit the last IDR frame, the following is performed:
S4, after a delay of the second time, the sending end again encodes and sends an IDR frame to the receiving end every GOP time.
In this step, the receiving end has detected packet loss and, as in step S3.2, has used RTCP to tell the sender to encode the previous frame as an IDR frame and send it as the next IDR frame. Because the network is congested, once the sending end has completed S3.2 it can lower the frequency at which subsequent IDR frames are sent, that is, lengthen the GOP time, to prevent the receiving end from continuously requesting IDR retransmission and causing further congestion. The second time may be set to 10-15 s; 10 s is used in this embodiment. In other words, step S4 waits 10 seconds before returning to step S2, in which the transmitting end encodes and transmits an IDR frame to the receiving end every GOP time and the receiving end decodes the IDR frame and outputs the corresponding image.
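The timing of step S4 amounts to simple arithmetic, sketched below with the values of this embodiment (GOP time 1 s, second time 10 s). The function name `next_periodic_idr_ms` and the flag `was_requested` are hypothetical, chosen only for illustration.

```python
GOP_MS = 1000           # GOP time used in this embodiment
SECOND_TIME_MS = 10000  # "second time" delay used in this embodiment


def next_periodic_idr_ms(last_idr_ms, was_requested):
    """Return when the next periodic IDR frame is due.

    If the last IDR frame was a retransmission requested by the receiving
    end (S3), the periodic schedule resumes only after the second time;
    otherwise the next IDR frame follows one GOP time later (S2).
    """
    delay = SECOND_TIME_MS if was_requested else GOP_MS
    return last_idr_ms + delay
```

With these values, an IDR frame sent at 5000 ms is normally followed by the next one at 6000 ms, but if it was sent in response to a request, the periodic schedule resumes at 15000 ms.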
Further, under network congestion the video and the audio parsed by the receiving end may fall out of sync, and the user may notice during video communication that the lip movement in the video does not match the sound. To alleviate or avoid this, the present application proposes the following steps.
After communication between the extension or the mobile terminal and the host is established, the following steps are executed:
S5, setting the priority of audio transmission higher than the priority of video transmission.
In this step, the priority is audio > video. Because audio is given priority for normal transmission, the receiving end plays audio data directly as it arrives, whereas video data is played with reference to the audio timestamp. If video were taken as the timing reference and network conditions deteriorated, audio playback would have to be delayed throughout to match the video.
Specifically, setting the priority of audio transmission higher than the priority of video transmission includes:
S5.1, defining audio as the main axis (the timing reference) and video as the auxiliary axis, with the receiving end caching the audio and the video on a first-in, first-out basis; in this embodiment, the audio and video are buffered for 0.6 seconds, and the largest buffer is set to 2 seconds, i.e. 100 audio entries and 50 video entries, and the first data is deleted if the buffer exceeds 10 percent of the buffer;
S5.2, each time the receiving end obtains audio and/or video, calculating the time difference between the audio timestamp and the video timestamp of the current frame, and adjusting the playing speed of the video in the current frame according to the time difference.
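The first-in, first-out caching of S5.1 can be sketched with a bounded queue. This is a simplified illustration: the class `MediaFifo` is hypothetical, the capacities are the 2-second maxima of this embodiment, and the oldest entry is dropped as soon as the capacity is exceeded. The 10 percent rule mentioned in S5.1 is not modeled, because its exact meaning is not clear from the text.

```python
from collections import deque

AUDIO_CAPACITY = 100  # 2 s of audio entries in this embodiment
VIDEO_CAPACITY = 50   # 2 s of video entries in this embodiment


class MediaFifo:
    """Hypothetical bounded FIFO of (timestamp_ms, payload) entries."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently discards the oldest entry when full.
        self._frames = deque(maxlen=capacity)

    def push(self, timestamp_ms, payload):
        self._frames.append((timestamp_ms, payload))

    def pop(self):
        # Return the oldest entry, or None if the buffer is empty.
        return self._frames.popleft() if self._frames else None

    def __len__(self):
        return len(self._frames)
```

`len(buffer)` gives the number of buffered entries, which is the quantity compared against the first number when the video playing speed is adjusted.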
Specifically, adjusting the playing speed of the video in the current frame according to the time difference includes:
If the time difference is less than or equal to 250 ms, the playing speed of the video in the current frame is not adjusted. This is because the user does not strongly perceive a lip-sync mismatch between video and audio when the time difference is 250 ms or less; the time difference may be either the video timestamp minus the audio timestamp or the audio timestamp minus the video timestamp.
If the time difference is greater than 250 ms, the video is faster than the audio, and the number of buffered video frames is smaller than the first number, playback of the video in the current frame is slowed down or paused until the time difference is less than or equal to 250 ms. In this situation the video is faster than the audio, so the video must be played slowly or paused so that the audio and the video are synchronized.
If the time difference is greater than 250 ms, the video is faster than the audio, and the number of buffered video frames is greater than the first number, the playing speed of the video in the current frame is not adjusted. In this case there is more video than audio in the buffer, and under network congestion the video buffer may shrink on its own as video packets are lost; the video is therefore neither slowed down nor sped up, and playback re-synchronizes naturally as the congestion continues.
If the time difference is greater than 250 ms and the video is slower than the audio, the playing speed of the video in the current frame is adjusted until the time difference is less than or equal to 250 ms. This applies when network congestion causes video and audio data to arrive out of step and the number of buffered video frames exceeds the preset number; the video then needs to be output and played faster.
In this embodiment, the first number may be 50. In other embodiments it may be any number from 10 to 100.
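The four cases above can be summarized as a single decision function. This is a sketch under stated assumptions: the function name `video_action` and its string return values are hypothetical, "video faster than audio" is taken to mean that the video timestamp is ahead of the audio timestamp, and a buffer count exactly equal to the first number (a case the text does not cover) is treated as "no adjustment".

```python
SYNC_THRESHOLD_MS = 250  # tolerated audio/video time difference
FIRST_NUMBER = 50        # first number used in this embodiment


def video_action(audio_ts_ms, video_ts_ms, video_buffered):
    """Decide how to play the current video frame relative to the audio."""
    diff = video_ts_ms - audio_ts_ms  # positive: video ahead of audio
    if abs(diff) <= SYNC_THRESHOLD_MS:
        return "keep"                 # within tolerance, no adjustment
    if diff > 0:
        if video_buffered < FIRST_NUMBER:
            return "slow_or_pause"    # video ahead, small buffer
        return "keep"                 # video ahead, large buffer
    return "speed_up"                 # video behind the audio
```

For instance, with audio at 1000 ms and video at 1400 ms, the result is `"slow_or_pause"` when 10 video frames are buffered and `"keep"` when 60 are buffered.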
Referring to fig. 4, the method is based on a one-call-multiple SIP system integrating a local area network and a wide area network. The system includes a host, a plurality of extensions and a plurality of mobile terminals; the host establishes connection with the extensions through the local area network and with the mobile terminals through the wide area network. For the workflow of this one-call-multiple SIP system, refer to fig. 5. For existing SIP systems and their workflow, refer to figs. 6-7.
Compared with the prior art, in this system the host communicates with the extensions directly without passing through the SIP server. The link in which host data is first sent to the SIP server and then forwarded to the extension is eliminated, and data goes directly from the host to the extension, which further improves the clarity and fluency of the video.
There is further provided a computer readable storage medium storing a computer program which when executed by a processor implements the SIP intercom method of maintaining video clarity in the event of network congestion.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
In the description of the present application, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present application, "a plurality" means two or more, unless explicitly defined otherwise.
In the present application, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be internal communication between two elements or an interaction between two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
In this specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. Schematic uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Those skilled in the art may also combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not contradict one another.
Claims (7)
1. A SIP intercom method for keeping video clear under network congestion, characterized in that the method comprises the following steps:
after communication between the extension or mobile terminal and the host, performing:
the transmitting end encodes an IDR frame and transmits it to the receiving end once every GOP time, and the receiving end decodes the IDR frame and outputs a corresponding image;
when a discontinuous RTP sequence number or a decoding failure of two consecutive frames is detected during video decoding at the receiving end, requesting the transmitting end to retransmit the last IDR frame;
the requesting retransmission of the last IDR frame from the transmitting end includes:
the receiving end sends a packet loss retransmission message to the sending end;
the transmitting end defines the last IDR frame as the next frame and transmits the next frame to the receiving end;
the method is based on a one-call-multiple SIP system integrating a local area network and a wide area network, and is characterized in that: the system comprises a host, a plurality of extensions and a plurality of mobile terminals, wherein the host establishes connection with the extensions through a local area network, and the host establishes connection with the mobile terminals through a wide area network;
before communication between an extension or mobile terminal and a host, performing:
establishing connection between a host and a plurality of extensions through a local area network, establishing connection between the host and a plurality of mobile terminals through a wide area network, and registering the host and the mobile terminals to an SIP server;
when any extension or any mobile terminal initiates answering, the host sends a call hang-up instruction to other extensions or other mobile terminals, and communication between the corresponding extension or mobile terminal and the host is established.
2. The SIP intercom method for keeping video clear in case of network congestion according to claim 1, wherein: after requesting the transmitting end to retransmit the last IDR frame, the following step is performed:
after the second time delay, the transmitting end encodes and transmits the IDR frame to the receiving end again every GOP time.
3. The SIP intercom method for keeping video clear in case of network congestion according to claim 1, wherein: after communication between the extension or mobile terminal and the host is established, the following step is performed:
the priority of transmitting audio is set to be greater than the priority of transmitting video.
4. A SIP intercom method for keeping video clear in case of network congestion as claimed in claim 3, wherein: the setting of the priority of transmitting audio to be greater than the priority of transmitting video includes:
defining audio as the main axis (the reference timeline) and video as the auxiliary axis, and caching the audio and the video at the receiving end according to a first-in first-out principle;
and when the receiving end obtains the audio and/or the video each time, calculating a time difference value between the time stamp of the audio and the time stamp of the video in the current frame, and adjusting the playing speed of the video in the current frame according to the time difference value.
5. The SIP intercom method for keeping video clear in case of network congestion according to claim 4, wherein: the adjusting the playing speed of the video in the current frame according to the time difference value comprises the following steps:
if the time difference is less than or equal to 250ms, not adjusting the playing speed of the video in the current frame;
if the time difference is greater than 250ms, the video is faster than the audio and the number of video buffers is smaller than a first number, slowing down or pausing the playback of the video in the current frame until the time difference is less than or equal to 250ms;
if the time difference is greater than 250ms, the video is faster than the audio and the number of video buffers is greater than the first number, not adjusting the playing speed of the video in the current frame;
and if the time difference is greater than 250ms and the video is slower than the audio, adjusting the playing speed of the video in the current frame until the time difference is less than or equal to 250ms.
6. A SIP intercom system for keeping video clear under network congestion, characterized in that the system comprises the following modules:
an encoding and transmitting module, used for the transmitting end to encode an IDR frame and transmit it to the receiving end once every GOP time, the receiving end decoding the IDR frame and outputting a corresponding image;
and a request module, used for requesting the transmitting end to retransmit the last IDR frame when a discontinuous RTP sequence number or a decoding failure of two consecutive frames is detected during video decoding at the receiving end.
7. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements a SIP intercom method of keeping video clear in case of network congestion as claimed in any one of claims 1 to 5.
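As an illustrative sketch only, the receiving-end trigger recited in claims 1 and 6 (a discontinuous RTP sequence number, or two consecutive frames failing to decode) might be detected as follows in Python. The class `LossDetector` and its method names are hypothetical and not part of the claims; the only external fact relied on is that RTP sequence numbers are 16-bit and wrap around after 65535. How the retransmission request itself is sent is left out.

```python
RTP_SEQ_MODULUS = 1 << 16  # RTP sequence numbers are 16-bit and wrap around


class LossDetector:
    """Tracks the two conditions that trigger an IDR retransmission request."""

    def __init__(self) -> None:
        self.last_seq = None          # last RTP sequence number seen
        self.consecutive_failures = 0  # frames that failed to decode in a row

    def on_rtp_packet(self, seq: int) -> bool:
        """Return True if this packet's sequence number is discontinuous."""
        gap = (self.last_seq is not None
               and seq != (self.last_seq + 1) % RTP_SEQ_MODULUS)
        self.last_seq = seq
        return gap

    def on_decode_result(self, ok: bool) -> bool:
        """Return True once two consecutive frames have failed to decode."""
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures >= 2
```

A caller would request retransmission of the last IDR frame whenever either method returns `True`. Note that this simple check also flags reordered packets as discontinuities; whether that is desirable depends on the deployment.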
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210468252.1A CN115102927B (en) | 2022-04-29 | 2022-04-29 | SIP intercom method, system and storage device for keeping video clear |
PCT/CN2022/117773 WO2023206910A1 (en) | 2022-04-29 | 2022-09-08 | Sip intercom method and system based on local area network and wide area network, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210468252.1A CN115102927B (en) | 2022-04-29 | 2022-04-29 | SIP intercom method, system and storage device for keeping video clear |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115102927A (en) | 2022-09-23 |
CN115102927B (en) | 2023-10-27 |
Family
ID=83287499
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210468252.1A Active CN115102927B (en) | 2022-04-29 | 2022-04-29 | SIP intercom method, system and storage device for keeping video clear |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115102927B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010003342A1 (en) * | 2008-07-07 | 2010-01-14 | 华为技术有限公司 | Method, device and system for identifying frame type of rtp packet |
JP2010177933A (en) * | 2009-01-28 | 2010-08-12 | Aiphone Co Ltd | Intercom system |
CN104618786A (en) * | 2014-12-22 | 2015-05-13 | 深圳市腾讯计算机系统有限公司 | Audio/video synchronization method and device |
CN204362246U (en) * | 2014-11-20 | 2015-05-27 | 深圳市华百安智能技术有限公司 | The non-viewable numbers intercom system of a kind of across a network |
CN107231328A (en) * | 2016-03-23 | 2017-10-03 | 福建星网锐捷通讯股份有限公司 | Method for real-time video transmission, device, equipment and system |
CN110012363A (en) * | 2019-04-18 | 2019-07-12 | 浙江工业大学 | A kind of video chat system based on Session Initiation Protocol |
CN112995214A (en) * | 2021-04-26 | 2021-06-18 | 安心智能(武汉)信息技术有限公司 | Real-time video transmission system, method and computer readable storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11350142B2 (en) * | 2019-01-04 | 2022-05-31 | Gainspan Corporation | Intelligent video frame dropping for improved digital video flow control over a crowded wireless network |
Also Published As
Publication number | Publication date |
---|---|
CN115102927A (en) | 2022-09-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US6697097B1 (en) | Synchronizing voice and video transmitted over separate channels | |
US8526048B1 (en) | Systems and methods for the reliable transmission of facsimiles over packet networks | |
JP4479650B2 (en) | Communication system, terminal device and computer program | |
CN105704580B (en) | A kind of video transmission method | |
US7290058B2 (en) | Video mail server with reduced frame loss | |
JP5421346B2 (en) | High-speed transmission method and apparatus for unicast stream in high-speed channel change | |
JP2018529261A (en) | Sender video phone downgrade | |
JP2009163734A (en) | Method and system for fast session establishment between equipment using h.324 and related telecommunications protocol, and h.324 similar terminal | |
WO2010066135A1 (en) | Channel switching method, device and system | |
MX2011012652A (en) | Method, apparatus and system for reducing media delay. | |
WO2011022977A1 (en) | Video data reception and transmission system and video data processing method for videophone | |
US8339439B2 (en) | Method of speeding up video recovery of videotelephony after an interruption and mobile terminal and system using the same | |
JP3707369B2 (en) | Video phone equipment | |
CN108366044B (en) | VoIP remote audio/video sharing method | |
CN110012363B (en) | Video chat system based on SIP protocol | |
Schierl et al. | 3GPP compliant adaptive wireless video streaming using H. 264/AVC | |
US10085029B2 (en) | Switching display devices in video telephony | |
CN114979080B (en) | SIP intercom method, system and storage device integrating local area network and wide area network | |
CN115102927B (en) | SIP intercom method, system and storage device for keeping video clear | |
CN109274980A (en) | A kind of data transmission method for being quickly broadcast live | |
EP2512161A1 (en) | Method, service terminal and server for mobile video communicating | |
WO2023206910A1 (en) | Sip intercom method and system based on local area network and wide area network, and storage medium | |
WO2010117644A1 (en) | Method and apparatus for asynchronous video transmission over a communication network | |
JP3969155B2 (en) | Multimedia communication transfer method, multimedia communication terminal, exchange, management device | |
JP2005210160A (en) | Video receiving terminal having communication state display |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
CB02 | Change of applicant information | Address after: Unit 403-12, 4th Floor, No. 56, Chengyi North Street, Phase III, Software Park, Torch High-tech Zone, Xiamen, Fujian 361000. Applicant after: XIAMEN LEELEN TECHNOLOGY Co.,Ltd. Address before: 2-5 / F, 780 Tieshan Road, Guankou Town, Jimei District, Xiamen City, Fujian Province 361021. Applicant before: XIAMEN LEELEN TECHNOLOGY Co.,Ltd. |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |