CN115914178A - VOIP real-time audio and video call method, system and device - Google Patents

Info

Publication number
CN115914178A
Authority
CN
China
Prior art keywords
stream
identity authentication
terminal
gateway server
media data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310032604.3A
Other languages
Chinese (zh)
Other versions
CN115914178B (en)
Inventor
陈炫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Xunhong Network Technology Co ltd
Original Assignee
Guangzhou Xunhong Network Technology Co ltd
Application filed by Guangzhou Xunhong Network Technology Co ltd filed Critical Guangzhou Xunhong Network Technology Co ltd
Priority to CN202310032604.3A priority Critical patent/CN115914178B/en
Publication of CN115914178A publication Critical patent/CN115914178A/en
Application granted granted Critical
Publication of CN115914178B publication Critical patent/CN115914178B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The embodiment of the invention discloses a VOIP real-time audio and video call method, system and device. The method comprises: receiving a request instruction from an initiating terminal, performing identity authentication on the initiating terminal based on the request instruction, and returning a pass instruction to the initiating terminal after the identity authentication is passed; sending a video call request to a VOIP gateway server, receiving response information from the VOIP gateway server when a called terminal goes off-hook, and establishing a call; detecting whether a first media data stream from the initiating terminal or a second media data stream from the VOIP gateway server is received; when the first media data stream is received, transcoding it and forwarding it to the VOIP gateway server, which sends it to the called terminal; and when the second media data stream is received, forwarding it to the initiating terminal. By arranging an RTMP gateway server and a VOIP gateway server to interface with the initiating terminal and the called terminal respectively, audio and video intercommunication between an intelligent terminal and a SIP terminal is realized, with stronger compatibility.

Description

VOIP real-time audio and video call method, system and device
Technical Field
The invention relates to the technical field of computer audio and video communication, in particular to a VOIP real-time audio and video call method, system and device.
Background
RTMP is the Real-Time Messaging Protocol. It is based on TCP and is a protocol family that comprises the basic RTMP protocol and variants such as RTMPT, RTMPS and RTMPE. RTMP is a network protocol designed for real-time data communication, and is mainly used for audio, video and data communication between a Flash/AIR platform and a streaming media/interaction server that supports RTMP.
VOIP (Voice over Internet Protocol) is a technology in which analog voice signals are compressed and packetized, and then transmitted as data packets over an IP network.
In the field of real-time audio and video calls, several implementations exist: the traditional softphone implementation based on SIP + RTP, and the web page implementation based on WSS + WebRTC, which enables real-time audio and video calls through a browser. Browser vendors have added WebRTC support one after another, but differences between their implementations cause multiple compatibility problems, and the user experience is also poor.
Disclosure of Invention
In view of the above defects, the embodiment of the invention discloses a VOIP real-time audio and video call method, system and device, which can solve the compatibility problem of traditional audio and video calls.
The first aspect of the embodiment of the invention discloses a VOIP real-time audio and video call method, which comprises the following steps:
receiving a request instruction from an initiating terminal, performing identity authentication on the initiating terminal based on the request instruction, and returning a pass instruction to the initiating terminal after the identity authentication is passed;
sending a video call request to a VOIP gateway server, receiving ringing response information from the VOIP gateway server when a called terminal is off-hook, playing ringing audio and video to an initiating terminal, and establishing a call;
detecting whether a first media data stream from an initiating terminal or a second media data stream which is from a VOIP gateway server and is transcoded by a media coding and decoding module is received, and when the first media data stream or the second media data stream is received, transcoding the first media data stream and forwarding the first media data stream to the VOIP gateway server so that the VOIP gateway server sends the first media data stream to a called terminal; and forwarding the second media data stream to an initiating end.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the initiating terminal includes a stream pushing end and a stream pulling end.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the receiving a request instruction from an initiating terminal, and performing identity authentication on the initiating terminal based on the request instruction includes:
receiving a call request instruction from the stream pushing end, performing stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the stream pushing end identity authentication is passed;
and receiving a data request instruction from the stream pulling end, performing stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end data connection response instruction after the stream pulling end identity authentication is passed.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the receiving a call request instruction from the stream pushing end, performing stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the stream pushing end identity authentication is passed includes:
receiving a handshake and connection request from the stream pushing end, performing first stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the first stream pushing end identity authentication is passed;
receiving a data interaction request from the stream pushing end, performing second stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end interaction response instruction after the second stream pushing end identity authentication is passed;
and receiving a publishing request from the stream pushing end, performing third stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end publishing response instruction after the third stream pushing end identity authentication is passed.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the receiving a data request instruction from the stream pulling end, performing stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end data connection response instruction after the stream pulling end identity authentication is passed includes:
receiving a handshake and connection request from the stream pulling end, performing first stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end connection success response instruction after the first stream pulling end identity authentication is passed;
receiving a data interaction request from the stream pulling end, performing second stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end interaction response instruction after the second stream pulling end identity authentication is passed;
and receiving a play request from the stream pulling end, performing third stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end play response instruction after the third stream pulling end identity authentication is passed.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, when the first media data stream is not received, the session is controlled to sleep for a preset duration, repeated up to a preset number of times, while continuing to detect whether the first media data stream is received; when the first media data stream has still not been detected after the preset duration or the preset number of times is exceeded, the call is ended.
The second aspect of the embodiment of the invention discloses a VOIP real-time audio and video call system, which comprises a stream pushing terminal, a stream pulling terminal, an RTMP gateway server, a media coding and decoding module and a VOIP gateway server, wherein the stream pushing terminal, the stream pulling terminal, the media coding and decoding module and the VOIP gateway server are all connected with the RTMP gateway server, the media coding and decoding module is connected with the VOIP gateway server, the stream pushing terminal is used for initiating a call request instruction, the stream pulling terminal is used for initiating a data request instruction, the RTMP gateway server is used for carrying out identity authentication on the stream pushing terminal based on the call request instruction, carrying out identity authentication on the stream pulling terminal based on the data request instruction, sending a video call request to the VOIP gateway server, sending a first media data stream from the stream pushing terminal to the media coding and decoding module for coding and decoding, and then forwarding the first media data stream to the VOIP gateway server, and forwarding a second media data stream from the VOIP gateway server and coded and decoded by the media coding and decoding module to the stream pulling terminal.
The third aspect of the embodiments of the present invention discloses a VOIP real-time audio/video call device, including:
an instruction receiving module, used for receiving a request instruction from an initiating terminal, performing identity authentication on the initiating terminal based on the request instruction, and returning a pass instruction to the initiating terminal after the identity authentication is passed;
a call establishment module, used for sending a video call request to the VOIP gateway server, receiving ringing response information from the VOIP gateway server when a called terminal is off-hook, playing ringing audio and video to the initiating terminal, and establishing a call;
a data interaction module, used for detecting whether a first media data stream from the initiating terminal or a second media data stream from the VOIP gateway server and transcoded by the media codec module is received; when the first media data stream or the second media data stream is received, transcoding the first media data stream and forwarding it to the VOIP gateway server so that the VOIP gateway server sends the first media data stream to the called terminal; and forwarding the second media data stream to the initiating terminal.
A fourth aspect of the embodiments of the present invention discloses an electronic device, including: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory to execute the VOIP real-time audio and video call method disclosed by the first aspect of the embodiment of the invention.
A fifth aspect of the embodiments of the present invention discloses a computer-readable storage medium storing a computer program, where the computer program enables a computer to execute the VOIP real-time audio/video call method disclosed in the first aspect of the embodiments of the present invention.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
By arranging the RTMP gateway server and the VOIP gateway server to interface with the initiating terminal and the called terminal respectively, the embodiment of the invention realizes audio and video intercommunication between an intelligent terminal and a SIP terminal, with stronger compatibility. In addition, a media server is provided to encode and decode the data without caching it, which gives low delay, and whether the media data stream is received is monitored during the call, so that effective real-time intercommunication of media data streams can be realized.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic flow chart of a VOIP real-time audio/video call method disclosed in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the module structure of a VOIP real-time audio/video call system disclosed in the embodiment of the present invention;
Fig. 3 is a schematic flow diagram of a VOIP real-time audio/video call system according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a VOIP real-time audio/video call device according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
It should be noted that the terms "first", "second", "third", "fourth", and the like in the description and the claims of the present invention are used for distinguishing different objects, and are not used for describing a specific order. The terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the invention discloses a VOIP real-time audio and video call method, system, device, electronic equipment and storage medium. An RTMP gateway server and a VOIP gateway server are arranged to interface with the initiating terminal and the called terminal respectively, so that audio and video intercommunication between an intelligent terminal and a SIP terminal is realized with stronger compatibility. A media server is additionally provided to encode and decode the data without caching it, giving low delay, and whether a media data stream is received is monitored during the call, so that effective real-time intercommunication of media data streams can be realized.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart of a VOIP real-time audio/video call method disclosed in an embodiment of the present invention. The execution main body of the method described in the embodiment of the present invention is an execution main body composed of software or/and hardware, and the execution main body can receive related information in a wired or/and wireless manner and can send a certain instruction. Of course, it may also have certain processing and storage functions. The execution subject may control a plurality of devices, for example, a remote physical server or a cloud server and related software, or may be a local host or a server and related software for performing related operations on a device installed somewhere. In some scenarios, multiple storage devices may also be controlled, which may be co-located with the device or located in a different location. As shown in fig. 1, the VOIP real-time audio/video call method includes the following steps:
101. Receiving a request instruction from the initiating terminal, performing identity authentication on the initiating terminal based on the request instruction, and returning a pass instruction to the initiating terminal after the identity authentication is passed.
The initiating terminal in this embodiment comprises a stream pushing end and a stream pulling end. The stream pushing end refers to a channel that captures audio and video data, and includes a mobile phone/tablet APP (application), a Web terminal and a PC (personal computer) terminal; the stream pulling end refers to a channel that receives audio and video data, and likewise includes a mobile phone/tablet APP, a Web terminal and a PC terminal.
Specifically, receiving a request instruction from the initiating terminal, and performing identity authentication on the initiating terminal based on the request instruction includes: receiving a call request instruction from the stream pushing end, performing stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the stream pushing end identity authentication is passed; and receiving a data request instruction from the stream pulling end, performing stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end data connection response instruction after the stream pulling end identity authentication is passed.
For the stream pushing end, specifically: receiving a handshake and connection request from the stream pushing end, performing first stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the first stream pushing end identity authentication is passed; receiving a data interaction request from the stream pushing end, performing second stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end interaction response instruction after the second stream pushing end identity authentication is passed; and receiving a publishing request from the stream pushing end, performing third stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end publishing response instruction after the third stream pushing end identity authentication is passed.
For the stream pulling end, specifically: receiving a handshake and connection request from the stream pulling end, performing first stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end connection success response instruction after the first stream pulling end identity authentication is passed; receiving a data interaction request from the stream pulling end, performing second stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end interaction response instruction after the second stream pulling end identity authentication is passed; and receiving a play request from the stream pulling end, performing third stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end play response instruction after the third stream pulling end identity authentication is passed.
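The three-stage authentication described above can be illustrated with a minimal Python sketch. It assumes that each request carries a credential and that requests must arrive in the stated order; the names `Session`, `authenticate`, `handle_request`, the stage labels and the `"valid-token"` credential are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field

# Request order for each role, following the text: handshake/connection,
# data interaction, then publish (push end) or play (pull end).
STAGES = {
    "push": ["connect", "interact", "publish"],
    "pull": ["connect", "interact", "play"],
}


@dataclass
class Session:
    role: str                      # "push" or "pull"
    token: str                     # hypothetical credential sent with each request
    completed: list = field(default_factory=list)


def authenticate(token: str) -> bool:
    """Placeholder identity check; a real gateway would consult its own user store."""
    return token == "valid-token"


def handle_request(session: Session, request: str) -> str:
    """Authenticate one request and return the matching response instruction."""
    expected = STAGES[session.role]
    if len(session.completed) >= len(expected):
        return "rejected: session already established"
    want = expected[len(session.completed)]
    if request != want:
        return f"rejected: expected {want}"
    # Identity authentication is repeated at every stage, as in the text.
    if not authenticate(session.token):
        return f"rejected: authentication failed at {request}"
    session.completed.append(request)
    return f"{session.role} {request} ok"
```

Re-checking the credential at every stage, rather than once per connection, mirrors the first, second and third authentications in the description; whether a real deployment reuses the same credential for all three is not stated in the source.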
102. Sending a video call request to the VOIP gateway server, receiving ringing response information from the VOIP gateway server when the called terminal is off-hook, playing ringing audio and video to the initiating terminal, and establishing a call.
The called terminal refers to terminal equipment and software application programs supporting the VOIP gateway server, such as a physical phone, a soft phone, a web phone, a mobile app, and the like.
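As a rough illustration of step 102, the call setup can be modelled as a small state machine driven by events from the VOIP gateway server. This is only a sketch: the event names `"ringing"` and `"answer"`, the `CallState` values and the `CallSession` class are assumptions made for the example, and the side effects are recorded as strings rather than performed.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    CALLING = "calling"
    RINGING = "ringing"
    ESTABLISHED = "established"


class CallSession:
    """Tracks one call between the initiating terminal and the called terminal."""

    def __init__(self) -> None:
        self.state = CallState.IDLE
        self.actions: list[str] = []   # side effects, recorded for illustration

    def start_call(self) -> None:
        if self.state is not CallState.IDLE:
            raise RuntimeError("call already started")
        self.actions.append("send video call request to VOIP gateway server")
        self.state = CallState.CALLING

    def on_gateway_event(self, event: str) -> None:
        if event == "ringing" and self.state is CallState.CALLING:
            # Let the caller know that the called terminal is ringing.
            self.actions.append("play ringing audio and video to initiating terminal")
            self.state = CallState.RINGING
        elif event == "answer" and self.state in (CallState.CALLING, CallState.RINGING):
            # The called terminal went off-hook.
            self.actions.append("establish call")
            self.state = CallState.ESTABLISHED
        # Any other event is ignored in this sketch.
```

Accepting `"answer"` directly from the calling state is a design choice of the sketch, so that a missing ringing event does not block call establishment.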
103. Detecting whether a first media data stream from an initiating terminal or a second media data stream from a VOIP gateway server and subjected to transcoding by a media coding and decoding module is received, and when the first media data stream or the second media data stream is received, transcoding the first media data stream and then forwarding the transcoded first media data stream to the VOIP gateway server so that the VOIP gateway server sends the first media data stream to a called terminal; and forwarding the second media data stream to an initiating end.
Further, when the first media data stream is not received, the session is controlled to sleep for a preset duration, repeated up to a preset number of times, while continuing to detect whether the first media data stream is received; when the first media data stream has still not been detected after the preset duration or the preset number of times is exceeded, the call is ended.
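One possible reading of this sleep-and-count check is a bounded polling loop, sketched below in Python. The function name `wait_for_first_stream` and its parameters are hypothetical; the source only states that the session sleeps for a preset duration and counts up to a preset number of times.

```python
import time
from typing import Callable


def wait_for_first_stream(
    stream_received: Callable[[], bool],
    sleep_seconds: float,
    max_checks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the stream is seen, or False after max_checks failed re-checks."""
    if stream_received():
        return True
    for _ in range(max_checks):
        sleep(sleep_seconds)          # session sleeps for the preset duration
        if stream_received():         # then checks again
            return True
    return False                      # caller should end the call
```

The `sleep` parameter is injectable only so that the loop can be exercised without real delays; a caller would normally leave it at its default and end the call when the function returns `False`.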
Example two
Please refer to Fig. 2 and Fig. 3. Fig. 2 is a schematic diagram of the module structure of a VOIP real-time audio/video call system disclosed in the embodiment of the present invention, and Fig. 3 is a schematic flow diagram of the working principle of the system. The system includes a stream pushing terminal, a stream pulling terminal, an RTMP gateway server, a media codec module and a VOIP gateway server. The stream pushing terminal, the stream pulling terminal, the media codec module and the VOIP gateway server are all connected to the RTMP gateway server, and the media codec module is connected to the VOIP gateway server. The stream pushing terminal is used for initiating a call request instruction, and the stream pulling terminal is used for initiating a data request instruction. The RTMP gateway server is used for performing identity authentication on the stream pushing terminal based on the call request instruction, performing identity authentication on the stream pulling terminal based on the data request instruction, sending a video call request to the VOIP gateway server, sending the first media data stream from the stream pushing terminal to the media codec module for encoding and decoding and then forwarding it to the VOIP gateway server, and forwarding the second media data stream, which comes from the VOIP gateway server and has been encoded and decoded by the media codec module, to the stream pulling terminal.
Specifically, the stream pushing terminal and the stream pulling terminal each send a handshake and connection request to the RTMP gateway server to apply for authentication, and a connection success response is returned when the authentication passes. The stream pushing terminal then sends a data interaction request to the RTMP gateway server, and the RTMP gateway server returns a stream pushing terminal interaction response instruction. The stream pushing terminal initiates a publishing request, and the RTMP gateway server returns a publish success response and begins receiving media stream data from the stream pushing terminal. The stream pulling terminal initiates a handshake and connection request, a data interaction request and a play request in sequence, and the RTMP gateway server returns a success response to each. Meanwhile, the RTMP gateway server sends signaling to the VOIP gateway server to request a video call. When the called terminal rings, the VOIP gateway server returns a ringing event, and the RTMP gateway server plays ringing audio and video to the stream pulling terminal to inform it that the called terminal is ringing. When the called terminal goes off-hook, the VOIP gateway server returns an answer event, and the RTMP gateway server checks whether the current session has received the media data stream of the stream pushing terminal. If no data has been received, the RTMP gateway server sleeps for M seconds and checks again, counting up to N times. If the check fails, the call is ended; if it succeeds, the RTMP media stream data is passed to the media codec module, transcoded into RTP media stream data, and forwarded to the VOIP gateway server.
Similarly, RTP media stream data from the VOIP gateway server is received, transcoding processing is carried out through the media coding and decoding module, and the RTP media stream data is transcoded into RTMP media stream and then forwarded to the stream pulling end. The calling party and the called party see the image of the opposite party and hear the sound of the opposite party in real time, and real-time intercommunication of media stream data for video calling based on RTMP is realized.
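The two forwarding directions can be sketched as a relay that transcodes each packet and passes it on immediately, without buffering. This is an illustration under simplifying assumptions: packets are modelled as `(format, payload)` tuples, and `to_rtp` and `to_rtmp` are hypothetical stand-ins for the media codec module that only relabel the packet rather than perform real transcoding.

```python
from typing import Callable, Tuple

Packet = Tuple[str, bytes]   # (format label, payload); illustrative only


def to_rtp(packet: Packet) -> Packet:
    """Stand-in for the media codec module, RTMP to RTP direction."""
    fmt, payload = packet
    if fmt != "rtmp":
        raise ValueError(f"expected rtmp packet, got {fmt}")
    return ("rtp", payload)


def to_rtmp(packet: Packet) -> Packet:
    """Stand-in for the media codec module, RTP to RTMP direction."""
    fmt, payload = packet
    if fmt != "rtp":
        raise ValueError(f"expected rtp packet, got {fmt}")
    return ("rtmp", payload)


class Relay:
    """Forwards media in both directions as soon as each packet arrives."""

    def __init__(
        self,
        send_to_voip_gateway: Callable[[Packet], None],
        send_to_pull_end: Callable[[Packet], None],
    ) -> None:
        self.send_to_voip_gateway = send_to_voip_gateway
        self.send_to_pull_end = send_to_pull_end

    def on_push_media(self, packet: Packet) -> None:
        # First media data stream: stream pushing end -> VOIP gateway server.
        self.send_to_voip_gateway(to_rtp(packet))

    def on_gateway_media(self, packet: Packet) -> None:
        # Second media data stream: VOIP gateway server -> stream pulling end.
        self.send_to_pull_end(to_rtmp(packet))
```

Handling each packet on arrival, with no intermediate queue, corresponds to the low-delay, no-caching property claimed for the media server.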
The audio codec formats of this embodiment include one or more of G711A, G711U, AAC and SPEEX; the video codec formats include one or more of H263, H264 and VP8.
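The source does not describe how a codec is chosen for a given call. As a hedged illustration only, the supported formats listed above could be matched against a peer's offer as follows; the function names `pick_codec` and `needs_transcoding` and the first-match rule are assumptions, not part of the patent.

```python
from typing import Iterable, Optional

# Formats listed in the embodiment.
AUDIO_CODECS = ("G711A", "G711U", "AAC", "SPEEX")
VIDEO_CODECS = ("H263", "H264", "VP8")


def pick_codec(offered: Iterable[str], supported: Iterable[str]) -> Optional[str]:
    """Return the first offered codec that is also supported, or None."""
    supported_upper = {name.upper() for name in supported}
    for name in offered:
        if name.upper() in supported_upper:
            return name.upper()
    return None


def needs_transcoding(source_codec: str, target_codec: str) -> bool:
    """Transcoding is needed only when the two sides use different codecs."""
    return source_codec.upper() != target_codec.upper()
```

For example, `pick_codec(["opus", "g711a"], AUDIO_CODECS)` would select `"G711A"`, since Opus is not among the listed formats.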
Example three
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a VOIP real-time audio/video call device disclosed in the embodiment of the present invention. As shown in Fig. 4, the VOIP real-time audio/video call device may include an instruction receiving module 401, a call establishment module 402 and a data interaction module 403. The instruction receiving module 401 is used for receiving a request instruction from an initiating terminal, performing identity authentication on the initiating terminal based on the request instruction, and returning a pass instruction to the initiating terminal after the identity authentication is passed. The call establishment module 402 is used for sending a video call request to the VOIP gateway server, receiving ringing response information from the VOIP gateway server when the called terminal is off-hook, playing ringing audio and video to the initiating terminal, and establishing a call. The data interaction module 403 is used for detecting whether a first media data stream from the initiating terminal or a second media data stream from the VOIP gateway server and transcoded by the media codec module is received; when the first media data stream or the second media data stream is received, transcoding the first media data stream and forwarding it to the VOIP gateway server so that the VOIP gateway server sends the first media data stream to the called terminal; and forwarding the second media data stream to the initiating terminal.
In the above, the initiating terminal comprises a stream pushing end and a stream pulling end. The receiving a request instruction from the initiating terminal, and performing identity authentication on the initiating terminal based on the request instruction includes: receiving a call request instruction from the stream pushing end, performing stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the stream pushing end identity authentication is passed; and receiving a data request instruction from the stream pulling end, performing stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end data connection response instruction after the stream pulling end identity authentication is passed.
The data interaction module 403 is further used for, when the first media data stream is not received, controlling the session to sleep for a preset duration, repeated up to a preset number of times, while continuing to detect whether the first media data stream is received; when the first media data stream has still not been detected after the preset duration or the preset number of times is exceeded, the call is ended.
Example four
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. The electronic device may be a computer, a server, or the like, and may also be an intelligent device such as a mobile phone, a tablet computer, a monitoring terminal, or the like, and an image acquisition device having a processing function. As shown in fig. 5, the electronic device may include:
a memory 501 in which executable program code is stored;
a processor 502 coupled to a memory 501;
the processor 502 calls the executable program code stored in the memory 501 to execute part or all of the steps in the VOIP real-time audio/video call method in the first embodiment.
The embodiment of the invention discloses a computer readable storage medium which stores a computer program, wherein the computer program enables a computer to execute part or all of the steps in the VOIP real-time audio and video call method in the first embodiment.
The embodiment of the invention also discloses a computer program product, wherein when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps in the VOIP real-time audio and video call method in the first embodiment.
The embodiment of the invention also discloses an application release platform, wherein the application release platform is used for releasing the computer program product, and when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps in the VOIP real-time audio and video call method in the first embodiment.
In various embodiments of the present invention, it should be understood that the sequence numbers of the processes do not mean the execution sequence necessarily in order, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-accessible memory. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the methods of the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Those of ordinary skill in the art will appreciate that some or all of the steps of the methods of the embodiments may be implemented by associated hardware instructed by a program. The program may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other memory capable of storing data, magnetic tape, or any other medium capable of carrying computer data.
The VOIP real-time audio and video call method, apparatus, electronic device and storage medium disclosed in the embodiments of the present invention are described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and core ideas of the present invention. Meanwhile, a person skilled in the art may, according to the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as a limitation on the present invention.
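As an informal illustration of the detection-and-forwarding step summarized above, together with the sleep-and-retry variant recited in claim 6, the following Python sketch relays two media streams and ends the call when the first stream stays silent. It is a minimal sketch under stated assumptions, not the patented implementation: the function name `relay_call`, the callback parameters, the polling style, and the default values of `sleep_seconds` and `max_idle_polls` are all hypothetical, and the sketch counts idle polls only rather than also tracking elapsed time.

```python
import time
from typing import Callable, Optional

# Hypothetical aliases; the source text does not define a programming interface.
Packet = bytes
Receiver = Callable[[], Optional[Packet]]  # returns None when nothing is pending
Sender = Callable[[Packet], None]


def relay_call(
    recv_first: Receiver,            # media from the initiating terminal
    recv_second: Receiver,           # media from the VOIP gateway server
    transcode: Callable[[Packet], Packet],
    send_to_voip_gateway: Sender,
    send_to_initiator: Sender,
    sleep_seconds: float = 0.02,     # assumed "preset time" per sleep
    max_idle_polls: int = 3,         # assumed "preset number of times"
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Relay media until the first stream is missing for too many polls."""
    idle_polls = 0
    while True:
        first = recv_first()
        second = recv_second()
        if first is not None:
            # First stream: transcode, then forward toward the called terminal.
            idle_polls = 0
            send_to_voip_gateway(transcode(first))
        if second is not None:
            # Second stream: forward to the initiating terminal as received.
            send_to_initiator(second)
        if first is None:
            # Nothing from the initiating terminal: count, then sleep and retry.
            idle_polls += 1
            if idle_polls > max_idle_polls:
                return "ended: first media stream not received"
            sleep(sleep_seconds)
```

Passing the receive, transcode, and send operations as callbacks keeps the sketch independent of any particular RTMP or SIP library, so the routing rule can be exercised with in-memory queues.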

Claims (10)

1. A VOIP real-time audio and video call method is characterized by comprising the following steps:
receiving a request instruction from an initiating terminal, performing identity authentication on the initiating terminal based on the request instruction, and returning a pass instruction to the initiating terminal after the identity authentication is passed;
sending a video call request to a VOIP gateway server, receiving ringing response information from the VOIP gateway server when a called terminal is off-hook, playing ringing audio and video to the initiating terminal, and establishing a call;
detecting whether a first media data stream from the initiating terminal, or a second media data stream from the VOIP gateway server that has been transcoded by a media coding and decoding module, is received, and when the first media data stream or the second media data stream is received, transcoding the first media data stream and forwarding the transcoded first media data stream to the VOIP gateway server, so that the VOIP gateway server sends the first media data stream to the called terminal, and forwarding the second media data stream to the initiating terminal.
2. The VOIP real-time audio/video call method of claim 1, wherein the initiating terminal comprises a stream pushing end and a stream pulling end.
3. The VOIP real-time audio/video call method according to claim 2, wherein the receiving a request instruction from an initiating terminal, and performing identity authentication on the initiating terminal based on the request instruction comprises:
receiving a call request instruction from the stream pushing end, performing stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the stream pushing end identity authentication is passed;
and receiving a data request instruction from the stream pulling end, performing stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end data connection response instruction after the stream pulling end identity authentication is passed.
4. The VOIP real-time audio/video call method according to claim 3, wherein the receiving a call request instruction from the stream pushing end, performing stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the stream pushing end identity authentication is passed comprises:
receiving a handshake and connection request from the stream pushing end, performing a first stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end connection success response instruction after the first stream pushing end identity authentication is passed;
receiving a data interaction request from the stream pushing end, performing a second stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end interaction response instruction after the second stream pushing end identity authentication is passed;
and receiving a publishing request from the stream pushing end, performing a third stream pushing end identity authentication on the stream pushing end, and returning a stream pushing end publishing response instruction after the third stream pushing end identity authentication is passed.
5. The VOIP real-time audio/video call method according to claim 3, wherein the receiving a data request instruction from the stream pulling end, performing stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end data connection response instruction after the stream pulling end identity authentication is passed comprises:
receiving a handshake and connection request from the stream pulling end, performing a first stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end connection success response instruction after the first stream pulling end identity authentication is passed;
receiving a data interaction request from the stream pulling end, performing a second stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end interaction response instruction after the second stream pulling end identity authentication is passed;
and receiving a playing request from the stream pulling end, performing a third stream pulling end identity authentication on the stream pulling end, and returning a stream pulling end playing response instruction after the third stream pulling end identity authentication is passed.
6. The VOIP real-time audio/video call method according to claim 1, further comprising: when the first media data stream is not received, controlling the session to sleep for a preset time and counting the sleeps up to a preset number of times; continuing to detect, within the preset time and the preset number of times, whether the first media data stream is received; and ending the call when the first media data stream is still not received after the preset time or the preset number of times is exceeded.
7. A VOIP real-time audio and video call system, characterized by comprising a stream pushing end, a stream pulling end, an RTMP gateway server, a media coding and decoding module and a VOIP gateway server, wherein: the stream pushing end, the stream pulling end, the media coding and decoding module and the VOIP gateway server are all connected with the RTMP gateway server; the media coding and decoding module is connected with the VOIP gateway server; the stream pushing end is used for initiating a call request instruction; the stream pulling end is used for initiating a data request instruction; and the RTMP gateway server is used for performing identity authentication on the stream pushing end based on the call request instruction, performing identity authentication on the stream pulling end based on the data request instruction, sending a video call request to the VOIP gateway server, sending a first media data stream from the stream pushing end to the media coding and decoding module for coding and decoding and then forwarding it to the VOIP gateway server, and forwarding a second media data stream that has been coded and decoded by the media coding and decoding module to the stream pulling end.
8. A VOIP real-time audio and video calling device is characterized by comprising:
an instruction receiving module, configured to receive a request instruction from an initiating terminal, perform identity authentication on the initiating terminal based on the request instruction, and return a pass instruction to the initiating terminal after the identity authentication is passed;
a call establishing module, configured to send a video call request to a VOIP gateway server, receive ringing response information from the VOIP gateway server when a called terminal is off-hook, play ringing audio and video to the initiating terminal, and establish a call;
a data interaction module, configured to detect whether a first media data stream from the initiating terminal, or a second media data stream from the VOIP gateway server that has been transcoded by a media coding and decoding module, is received, and when the first media data stream or the second media data stream is received, transcode the first media data stream and forward the transcoded first media data stream to the VOIP gateway server, so that the VOIP gateway server sends the first media data stream to the called terminal, and forward the second media data stream to the initiating terminal.
9. An electronic device, comprising: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory for executing the VOIP real-time audio/video call method of any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to execute the VOIP real-time audio video call method of any one of claims 1 to 7.
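The three-stage identity authentication recited in claims 4 and 5 can be pictured as a small per-connection state machine in which each request is authenticated before its response instruction is returned. The Python sketch below is illustrative only and is not part of the claims: the class `StagedSession`, the stage names, the `authenticate` callback, and the rejection messages are hypothetical, and the strict ordering of requests is an assumption, since the claims do not state how out-of-order or failed requests are handled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stage names for the three requests of each end.
PUSH_STAGES = ("connect", "interact", "publish")  # stream pushing end, claim 4
PULL_STAGES = ("connect", "interact", "play")     # stream pulling end, claim 5


@dataclass
class StagedSession:
    """Authenticates each request of one end before answering it."""
    stages: Tuple[str, ...]
    authenticate: Callable[[str, str], bool]  # (stage, credential) -> passed?
    completed: List[str] = field(default_factory=list)

    def handle(self, request: str, credential: str) -> str:
        index = len(self.completed)
        if index >= len(self.stages):
            return "rejected: session already established"
        expected = self.stages[index]
        if request != expected:
            # Assumed behavior: requests must arrive in the recited order.
            return f"rejected: expected {expected}"
        if not self.authenticate(request, credential):
            return f"rejected: {request} authentication failed"
        self.completed.append(request)
        return f"{request} response"

    @property
    def ready(self) -> bool:
        """True once all three authentications have passed."""
        return len(self.completed) == len(self.stages)
```

A gateway built along these lines would keep one `StagedSession` per stream pushing end and one per stream pulling end, and would start relaying media only once `ready` is true for the relevant end.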
CN202310032604.3A 2023-01-10 2023-01-10 VOIP real-time audio and video call method, system and device Active CN115914178B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310032604.3A CN115914178B (en) 2023-01-10 2023-01-10 VOIP real-time audio and video call method, system and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310032604.3A CN115914178B (en) 2023-01-10 2023-01-10 VOIP real-time audio and video call method, system and device

Publications (2)

Publication Number Publication Date
CN115914178A true CN115914178A (en) 2023-04-04
CN115914178B CN115914178B (en) 2023-05-02

Family

ID=85740850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310032604.3A Active CN115914178B (en) 2023-01-10 2023-01-10 VOIP real-time audio and video call method, system and device

Country Status (1)

Country Link
CN (1) CN115914178B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050201363A1 (en) * 2004-02-25 2005-09-15 Rod Gilchrist Method and apparatus for controlling unsolicited messaging in real time messaging networks
US20120124227A1 (en) * 2010-11-15 2012-05-17 Nabil Al-Khowaiter Browser-based voip service method and system
CN102790710A (en) * 2011-05-16 2012-11-21 北京新媒传信科技有限公司 Method and device for audio and video communication between PC (personal computer) terminal and cell phone
CN106941629A (en) * 2017-04-05 2017-07-11 深圳进门财经科技股份有限公司 Real-time live broadcast method based on SIP+RTP Yu RTMP protocol interconnections
CN107819725A (en) * 2016-09-12 2018-03-20 山东量子科学技术研究院有限公司 Method and mobile terminal based on VoIP calls
CN112533006A (en) * 2020-11-05 2021-03-19 深圳市咪码科技有限公司 Communication method and device for live broadcast platform and VOIP terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
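陈天骢 (Chen Tiancong): "Research and Design of a Visual Interaction System Based on RTMP and SIP" (基于RTMP与SIP的可视化交互系统研究与设计), CNKI Excellent Master's Theses Full-text Database *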

Also Published As

Publication number Publication date
CN115914178B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN102857729B (en) Set top box based video conversation method and system
CN101909192B (en) Television terminal and communication method thereof
JP2012514367A5 (en)
RU2007125542A (en) METHOD FOR MONITORING VIDEO TELEPHONE SERVICES AND INTENDED FOR THIS SYSTEM
KR20110003491A (en) Method and apparatus for video services
EP2884750A1 (en) Monitoring method and internet protocol television set top box
CN104253696B (en) Police handheld speech talkback method and system Internet-based
JP2008153782A (en) Call managing method, call management system, and message processing server system
US11856149B2 (en) Method for establishing call connection, first terminal, server, and storage medium
CN114710473A (en) Method and system for realizing audio-video interaction between applet and SIP contact center
CN103327380A (en) Set top box and method for achieving conversation on set top box
CN115914178B (en) VOIP real-time audio and video call method, system and device
CN103684970A (en) Transmission method and thin terminals for media data streams
WO2010124499A1 (en) Method and terminal for synchronously recording sounds and images of opposite ends based on circuit domain video telephone
JP5423534B2 (en) Intercom system, center device, and noise removal method
JP5163750B2 (en) Multimedia service
JP2006270558A (en) Originating method and program of ip telephone device which reproduce content during originating
US20110051718A1 (en) Methods and apparatus for delivering audio content to a caller placed on hold
KR102109607B1 (en) System for reducing delay of transmission and reception in communication network, and apparatus thereof
CN108184033A (en) A kind of method, terminal and the system of teleengineering support VoIP
CN108650425B (en) Monitoring method and monitoring system
JP2008060752A (en) Calling method of communication terminal
CN111865878A (en) Call method, monitoring device, cloud platform and monitoring system
KR20150050241A (en) Terminal Equipment and echo cancellation method for mVoIP
CN113645454B (en) Air-to-ground video communication method and device under satellite link

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant