CN117395231A - Multi-terminal same-screen interactive display method

Info

Publication number: CN117395231A
Application number: CN202311111633.5A
Authority: CN
Inventors: 葛成; 陈市伟; 马项; 薛唯; 翟浩
Original assignee: Guolian Life Insurance Co ltd
Current assignee: Guolian Life Insurance Co ltd
Priority date: 2023-08-31
Filing date: 2023-08-31
Publication date: 2024-01-12

Abstract

The invention belongs to the technical field of computers, and particularly relates to a multi-terminal same-screen interactive display method. In the method, operation instructions are transmitted between an agent end and a client end through an IM channel. The agent end and the client end transmit data through a video data channel RTC, a whiteboard data channel WS and a same-screen data channel WS to carry out multi-terminal same-screen interactive display. The agent end desensitizes sensitive information in the data before sending the data to the client end. The front-end program of the agent end and the front-end program of the client end each record the vector data of their whiteboard and convert the vector data into image data by a vector redrawing method, and the server relays the data in both directions. The method realizes one-way desensitization during same-screen operation in an encrypted manner during transmission, uses the whiteboard to realize interactive annotation during screen sharing, and uses a digital person to explain and display the content of the insurance plan, so that the viewer at the client end can understand the insurance content intuitively.

Description

Multi-terminal same-screen interactive display method

Technical Field

The invention belongs to the technical field of computers, and particularly relates to a multi-terminal same-screen interactive display method.

Background

With the continuous development of technology, the insurance industry needs to keep pace with the times and provide professional and attentive services to customers remotely, without in-person contact, for example by explaining the content of an insurance plan to the customer.

The same-screen demonstration schemes common in the prior art share screen content through a mobile phone App: a user can use the same-screen function only after installing the corresponding App, and no same-screen demonstration scheme is implemented at the H5 level on the mobile phone end. Software currently used in the industry can only display content to the other end in one direction, and only the party displaying the content can operate on it. One-way desensitization during screen sharing is also not achieved: other screen sharing software shares the screen as video, so the pictures seen by both sides are identical, and if the shared screen is desensitized, neither the presenter nor the viewer can see the sensitive data, which makes it difficult for the presenter to operate during the demonstration. Existing Apps provide only same-screen and real-time audio and video functions, and offer no way for users to express their meaning on the displayed screen by means such as annotation and drawing.

Although the enterprise WeChat terminal also supports audio and video calls, it cannot perform or save screen sharing and multi-terminal drawing board operations. When the customer is served remotely without in-person contact, the salesperson cannot display and explain the content of the plan to the customer in real time, which seriously affects the quality of remote customer service: the customer cannot gain a comprehensive and detailed understanding of the plan, and cannot view the plan while listening to the salesperson explain it and operate the page according to its content.

In summary, current multi-terminal same-screen demonstration methods have the following disadvantages:

1. the same-screen display function is not realized at the H5 level of the mobile phone end;

2. two-way operation of H5 cannot be realized;

3. one-way desensitization during same-screen operation cannot be realized;

4. users cannot express their meaning by means such as annotation and drawing while watching the same-screen explanation;

5. the insurance plan can only be presented by voice broadcast.

Disclosure of Invention

The invention provides a multi-terminal same-screen interactive display method, which solves the problems in the prior art that multi-terminal same-screen operation can only be carried out through an App, that real-time desensitization cannot be performed during same-screen operation, and that operations such as annotation and drawing cannot be performed on the shared screen.

The technical scheme of the invention is realized as follows:

A multi-terminal same-screen interactive display method comprises the following steps:

first step, operation instructions are transmitted between the agent end and the client end through an IM channel: the back-end program creates a room, the agent end and the client end join the room after authentication by the server, and operation instructions are then transmitted between the agent end and the client end through the IM channel;

second step, data is transmitted between the agent end and the client end through a video data channel RTC, a whiteboard data channel WS and a same-screen data channel WS to carry out multi-terminal same-screen interactive display, wherein the data in the video data channel RTC, the whiteboard data channel WS and the same-screen data channel WS is mixed and assembled into integrated video data;

third step, video is shared between the agent end and the client end through a P2P connection; the video of the agent end goes through audio and video capture and audio and video stream pushing in sequence, bypass recording is finally carried out, and the recorded video is sent to the client end in a playback manner;

fourth step, the agent end and the client end record page data as snapshots for H5 screen sharing, and the page data of the agent end and of the client end is relayed through a server, wherein the desensitization operation is carried out before the page data of the agent end is uploaded to the server, so that the data received by the client end contains no sensitive data;

fifth step, the agent end and the client end perform H5 bidirectional operation through whiteboard sharing; the front-end program of the agent end and the front-end program of the client end each record the vector data of their whiteboard and convert the vector data into image data by a vector redrawing method, and the server relays the data in both directions;

sixth step, a virtual digital person image is displayed on the H5 page of the client end through an embedded digital person frame; speech is generated at the nodes where interaction with the customer is needed, and is presented to the customer through the voice driving algorithm of the digital person.

Through the above technical scheme, the agent end and the client end transmit operation instructions to each other through the IM channel, such as opening a room, making a call, and opening or closing the whiteboard, so that the exchange of information between the two parties is faster and more stable.

The agent end and the client end transmit data through the video data channel RTC, the whiteboard data channel WS and the same-screen data channel WS. The channels differ in life cycle and in the form of the data they carry. RTC is suited to transmitting large data such as video point-to-point, without passing through the server. The whiteboard and the same screen transmit less data per second, but that data is saved by the server. RTC transmission of large data occupies few resources, but data may be lost; WS transmission for the whiteboard and the same screen is stable but occupies more resources. The resources here include network, storage and memory. The three channels are combined to convey data, and they are independent of one another, so they do not interfere with each other's data.
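The independence of the channels can be pictured as a dispatcher that routes each message only to the handlers of its own channel. The following TypeScript is a minimal sketch under assumed names: `ChannelKind`, `Envelope` and `ChannelRouter` are hypothetical and do not come from the patent, and the sketch models routing only, not the actual IM, RTC or WebSocket transports.

```typescript
// Hypothetical channel tags mirroring the channels described above.
type ChannelKind = "im" | "rtc-video" | "ws-whiteboard" | "ws-screen";

interface Envelope {
  channel: ChannelKind;
  payload: unknown;
}

type Handler = (payload: unknown) => void;

// Routes each envelope to the handlers of its own channel only,
// so traffic on one channel never reaches another channel's handlers.
class ChannelRouter {
  private handlers: Record<ChannelKind, Handler[]> = {
    "im": [],
    "rtc-video": [],
    "ws-whiteboard": [],
    "ws-screen": [],
  };

  on(channel: ChannelKind, handler: Handler): void {
    this.handlers[channel].push(handler);
  }

  // Returns the number of handlers that received the payload.
  dispatch(envelope: Envelope): number {
    const list = this.handlers[envelope.channel];
    list.forEach((handler) => handler(envelope.payload));
    return list.length;
  }
}

const router = new ChannelRouter();
const received: string[] = [];
router.on("im", (p) => received.push(`im:${String(p)}`));
router.on("ws-whiteboard", (p) => received.push(`wb:${String(p)}`));

router.dispatch({ channel: "im", payload: "open-whiteboard" });
router.dispatch({ channel: "ws-whiteboard", payload: "stroke-1" });
router.dispatch({ channel: "rtc-video", payload: "frame" }); // no handler registered
```

In this sketch an instruction such as opening the whiteboard travels on the `im` tag while stroke data travels on `ws-whiteboard`, so a handler registered for one never sees the other.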

The agent end and the client end share video through a P2P connection, which realizes point-to-point data transmission between them without passing through a server, avoids the limitation of server bandwidth, and reduces the construction cost of the overall same-screen interaction scheme.

The agent end and the client end record page data as snapshots for H5 screen sharing. The data of the agent end must be desensitized before it is uploaded to the server: sensitive information in the agent end's page is removed and replaced, so that one-way desensitization is achieved during screen sharing and the page data conveyed by the agent end contains no sensitive information.

The agent end and the client end perform H5 bidirectional operation through whiteboard sharing, so that both parties can annotate their own pages and see each other's operations in real time. This realizes bidirectional interactive annotation, makes the explanation and display of the insurance plan more intuitive, and improves its efficiency.

The content of the insurance plan is explained and displayed to the customer by the digital person, which makes the explanation more vivid and helps the customer understand the content of the insurance plan more intuitively.

Optionally, H5 screen sharing includes the steps of:

S1, the front-end program of the agent end records user operations on the insurance plan page; when the user explains the insurance plan, the front-end program of the agent end first records the current snapshot of the DOM nodes, then monitors the user's interactive operations in the browser and records them;

S2, the front-end program of the agent end converts the user operations into an event stream;

S3, the front-end program of the agent end serializes the event stream into JSON format, stores it locally, and uploads it to the server after the desensitization operation;

S4, the server reads the serialized event stream, plays back the user operations according to the timestamps of the events, reproduces the user's interactive operations in the browser, and sends them to the client end, so that the viewer at the client end can see the interactive operations of the agent end's user in the browser.

Through the above technical scheme, the agent end records the current snapshot of the DOM nodes, so that after the desensitization operation it can send the node data to the server for page reconstruction, which facilitates one-way desensitization during screen sharing. The event stream is stored locally in JSON format; when the server forwards the event stream sent by the client end to the agent end, the client's event stream data is fused and superimposed with the locally stored data to form new page data, which is then sent to the client end. Serializing the event stream into JSON offers good readability, broad protocol compatibility, small data volume and cross-platform compatibility.
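Steps S3 and S4 can be illustrated with a short TypeScript sketch of serializing an event stream to JSON and replaying it in timestamp order. The `RecordedEvent` shape, its field names and the `apply` callback are assumptions made for illustration; the patent does not specify them. The sketch only orders the events and does not reproduce the time gaps between them or touch a real DOM.

```typescript
// Hypothetical shape of one recorded operation (field names are assumptions).
interface RecordedEvent {
  type: "click" | "scroll" | "input";
  timestamp: number; // milliseconds since recording started
  targetId: string;  // identifier of the target element
  data?: string;     // data produced by the operation, e.g. typed text
}

// S3: serialize the event stream to JSON for local storage and upload.
function serializeEvents(events: RecordedEvent[]): string {
  return JSON.stringify(events);
}

// S4: parse the stream and replay the events in timestamp order.
// `apply` stands in for whatever reproduces the operation on the page.
function playback(json: string, apply: (e: RecordedEvent) => void): void {
  const events = JSON.parse(json) as RecordedEvent[];
  const ordered = events.slice().sort((a, b) => a.timestamp - b.timestamp);
  for (const recordedEvent of ordered) apply(recordedEvent);
}

const recorded: RecordedEvent[] = [
  { type: "input", timestamp: 300, targetId: "age", data: "35" },
  { type: "click", timestamp: 100, targetId: "plan-tab" },
  { type: "scroll", timestamp: 200, targetId: "page", data: "480" },
];

const replayLog: string[] = [];
playback(serializeEvents(recorded), (e) => replayLog.push(`${e.timestamp}:${e.type}`));
```

Even though the events are stored out of order here, the replay visits them as click, scroll, input, because ordering is driven by the timestamp rather than by arrival order.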

Optionally, the H5 bi-directional operation includes the steps of:

D1, the server reconstructs the page from the user operations and sends it to the client end in a playback manner, and the viewer at the client end then operates on the page;

D2, the front-end program of the client end generates an event stream from the client's operations, serializes it into JSON data and uploads it to the server;

D3, the agent end receives the client's operation data sent by the server, fuses it with the local operation data to form a new page, and sends the new page to the server after the desensitization operation;

D4, the server transmits the page in which the agent end has fused the client's operations to the client end, thereby realizing same-screen bidirectional operation between the agent end and the client end.

Through the above technical scheme, same-screen bidirectional operation lets the agent end and the client end operate and share in real time, which improves communication efficiency and makes the whole display process more intuitive and efficient.
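The fusion in step D3 is not detailed in the text. One plausible reading, sketched below in TypeScript, is a timestamp-ordered merge of the agent end's local event stream with the stream received from the client end. The `OpEvent` shape and the tie-breaking rule are assumptions for illustration only.

```typescript
// Hypothetical event shape; `source` marks which end produced the event.
interface OpEvent {
  timestamp: number;
  source: "agent" | "client";
  action: string;
}

// Merge two streams that are each already sorted by timestamp into one
// ordered stream. On equal timestamps the local event goes first, which is
// an arbitrary choice made for this sketch.
function fuseStreams(local: OpEvent[], remote: OpEvent[]): OpEvent[] {
  const fused: OpEvent[] = [];
  let i = 0;
  let j = 0;
  while (i < local.length && j < remote.length) {
    if (local[i].timestamp <= remote[j].timestamp) {
      fused.push(local[i++]);
    } else {
      fused.push(remote[j++]);
    }
  }
  while (i < local.length) fused.push(local[i++]);
  while (j < remote.length) fused.push(remote[j++]);
  return fused;
}

const agentEvents: OpEvent[] = [
  { timestamp: 10, source: "agent", action: "open-clause" },
  { timestamp: 30, source: "agent", action: "scroll" },
];
const clientEvents: OpEvent[] = [
  { timestamp: 20, source: "client", action: "mark" },
  { timestamp: 30, source: "client", action: "mark" },
  { timestamp: 40, source: "client", action: "mark" },
];

const fusedStream = fuseStreams(agentEvents, clientEvents);
const fusedOrder = fusedStream.map((e) => `${e.timestamp}${e.source[0]}`);
```

Replaying `fusedStream` against the agent end's page would then produce the new page state that is desensitized and sent back through the server.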

Optionally, the interactive operations in step S1 are recorded by monitoring DOM changes using node comparison and by monitoring browser events (the browser's mouse and keyboard operation events).

Through the above technical scheme, node comparison is used to find the incremental data, and a monitored DOM change is the trigger point for node comparison; transmitting the full data every time would place a heavy burden on the network.
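Node comparison can be illustrated by diffing two snapshots and keeping only what changed. In the TypeScript sketch below a snapshot is simplified to a flat map from node id to node text, which is an assumption for brevity; a real DOM snapshot is a tree, and the `Increment` shape is hypothetical.

```typescript
// A snapshot is modelled as a flat map from node id to node text (assumption).
type Snapshot = Record<string, string>;

interface Increment {
  upserts: Array<[string, string]>; // nodes added or changed
  removals: string[];               // ids of nodes that disappeared
}

// Compare the previous and the next snapshot and return only the difference.
function diffSnapshots(prev: Snapshot, next: Snapshot): Increment {
  const has = (s: Snapshot, id: string): boolean =>
    Object.prototype.hasOwnProperty.call(s, id);
  const upserts: Array<[string, string]> = [];
  const removals: string[] = [];
  for (const id of Object.keys(next)) {
    if (!has(prev, id) || prev[id] !== next[id]) upserts.push([id, next[id]]);
  }
  for (const id of Object.keys(prev)) {
    if (!has(next, id)) removals.push(id);
  }
  return { upserts, removals };
}

const before: Snapshot = { title: "Plan A", premium: "1200", note: "draft" };
const after: Snapshot = { title: "Plan A", premium: "1500", rider: "added" };

const increment = diffSnapshots(before, after);
```

Here the unchanged `title` node is not transmitted at all; only the changed `premium`, the new `rider` and the removed `note` make up the increment.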

Optionally, the event stream includes the type of operation, a timestamp, a body of the target element, and data resulting from the operation.

Through the above technical scheme, the content contained in the event stream enables the client end to play back the operations in order.

Optionally, the desensitization operation comprises removing and replacing the node data that contains sensitive information after the agent end records the DOM snapshot.

Through the above technical scheme, the desensitization operation is carried out at the agent end, so that the data transmitted from the agent end to the client end contains no sensitive information. One-way desensitization during screen sharing is thereby realized: the user at the agent end can explain the content of the insurance plan to the viewer at the client end without revealing sensitive information.
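A minimal TypeScript sketch of one-way desensitization on a recorded snapshot follows. The `SnapshotNode` shape, the per-node `sensitive` flag and the `***` replacement text are assumptions for illustration; the patent only states that sensitive node data is removed and replaced according to configuration. The key point shown is that the function returns a cleaned copy for upload while the agent end's own snapshot stays unchanged.

```typescript
// Hypothetical snapshot node; `sensitive` is an assumed per-node configuration.
interface SnapshotNode {
  tag: string;
  text?: string;
  sensitive?: "remove" | "mask";
  children: SnapshotNode[];
}

// Returns a desensitized deep copy. The input snapshot is left untouched,
// so the agent end still sees the original data while the client end does not.
function desensitize(node: SnapshotNode): SnapshotNode | null {
  if (node.sensitive === "remove") return null;
  const children: SnapshotNode[] = [];
  for (const child of node.children) {
    const cleaned = desensitize(child);
    if (cleaned !== null) children.push(cleaned);
  }
  const copy: SnapshotNode = { tag: node.tag, children };
  if (node.text !== undefined) {
    copy.text = node.sensitive === "mask" ? "***" : node.text;
  }
  return copy;
}

const agentSnapshot: SnapshotNode = {
  tag: "div",
  children: [
    { tag: "span", text: "ID 1234567890", sensitive: "mask", children: [] },
    { tag: "span", text: "internal commission note", sensitive: "remove", children: [] },
    { tag: "p", text: "Annual premium: 1200", children: [] },
  ],
};

const uploadSnapshot = desensitize(agentSnapshot);
```

The copy in `uploadSnapshot` has two children, with the first one masked, while `agentSnapshot` still holds all three original nodes.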

Optionally, in step D1, after the client end receives the reconstructed page sent by the server, a complete DOM tree structure is also formed when the viewer operates on the reconstructed page; the front-end program of the client end monitors and records the viewer's operation events on nodes but does not record the DOM tree structure, so that only the user at the agent end can operate on the content of the insurance plan.

Through the above technical scheme, the front-end program of the client end only monitors the viewer's node operations and does not record the DOM tree structure. The viewer at the client end can therefore only annotate the page transmitted by the agent end and cannot change its content, while the client's page operations and annotations are transmitted to the agent end in full.

Optionally, in step D3, the server reconstructs the page from the client's serialized event stream data and sends it to the agent end in a playback manner; the front-end program of the agent end fuses the client's event stream data with the local event stream data, operates on the fused page DOM, and, after desensitization, sends the resulting page DOM data to the server as a mirror snapshot and incremental snapshots.

Through the above technical scheme, the server transmits the client's event stream data to the agent end, and the agent end fuses it with the locally stored event stream data to form a new page. After the new page is operated on, it is converted into new DOM data, sensitive information is removed and replaced, and the result is transmitted to the server as a mirror snapshot and incremental snapshots. The mirror snapshot is the full data; building the data in a full-plus-increment manner keeps the data volume down while still allowing the user's operation process to be reproduced.
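The full-plus-increment idea can be sketched as rebuilding every intermediate page state from one mirror (full) snapshot and a list of incremental snapshots. The TypeScript below uses a flat id-to-text map as the page state and a hypothetical `IncrementalSnapshot` shape; both are simplifying assumptions, not the patent's data format.

```typescript
// Page state simplified to a flat map from node id to node text (assumption).
type PageState = Record<string, string>;

// Hypothetical incremental snapshot: what changed since the previous state.
interface IncrementalSnapshot {
  timestamp: number;
  set: Record<string, string>; // nodes added or changed
  remove: string[];            // ids of nodes removed
}

// Rebuild each intermediate state from the full snapshot plus the increments,
// applied in timestamp order. The full snapshot itself is not modified.
function rebuild(full: PageState, increments: IncrementalSnapshot[]): PageState[] {
  const ordered = increments.slice().sort((a, b) => a.timestamp - b.timestamp);
  const states: PageState[] = [];
  let current: PageState = { ...full };
  for (const inc of ordered) {
    const next: PageState = { ...current, ...inc.set };
    for (const id of inc.remove) delete next[id];
    states.push(next);
    current = next;
  }
  return states;
}

const mirrorSnapshot: PageState = { title: "Plan A", premium: "1200" };
const pageStates = rebuild(mirrorSnapshot, [
  { timestamp: 2, set: { rider: "added" }, remove: [] },
  { timestamp: 1, set: { premium: "1500" }, remove: [] },
  { timestamp: 3, set: {}, remove: ["title"] },
]);
```

Only the mirror snapshot carries the whole page; each increment is small, yet the sequence in `pageStates` reproduces the operation process step by step.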

Optionally, the front-end program of the agent end and the front-end program of the client end each collect the vector tracks of their whiteboard; the vector tracks are synchronized with the picture, reconstructed through data stream pushing, converted into image data by vector redrawing and uploaded to the server, and the server relays the whiteboard data of the agent end and the client end to each other.

Through the above technical scheme, the vector tracks of each whiteboard are collected, synchronized with the picture, reconstructed through data stream pushing, converted into image data by vector redrawing and uploaded to the server. Because the whiteboard interaction data is transmitted only as vectors, the data volume is greatly reduced and the transmission speed is improved.
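The text does not define the vector redrawing method. As an assumed stand-in, the TypeScript sketch below rasterizes whiteboard strokes onto a pixel grid with Bresenham's line algorithm; the `Point` and `Stroke` types and the grid representation are hypothetical, and a browser implementation would more likely draw onto a canvas.

```typescript
interface Point { x: number; y: number; }
type Stroke = Point[]; // one whiteboard stroke as an ordered vector track

// Bresenham's line algorithm; both endpoints must have integer coordinates.
function drawLine(a: Point, b: Point, plot: (x: number, y: number) => void): void {
  let x = a.x;
  let y = a.y;
  const dx = Math.abs(b.x - a.x);
  const dy = -Math.abs(b.y - a.y);
  const sx = a.x < b.x ? 1 : -1;
  const sy = a.y < b.y ? 1 : -1;
  let err = dx + dy;
  while (true) {
    plot(x, y);
    if (x === b.x && y === b.y) break;
    const e2 = 2 * err;
    if (e2 >= dy) { err += dy; x += sx; }
    if (e2 <= dx) { err += dx; y += sy; }
  }
}

// Redraw vector strokes as image data: a height x width grid, 1 = inked pixel.
function redraw(strokes: Stroke[], width: number, height: number): number[][] {
  const grid: number[][] = [];
  for (let y = 0; y < height; y++) {
    const row: number[] = [];
    for (let x = 0; x < width; x++) row.push(0);
    grid.push(row);
  }
  const plot = (x: number, y: number): void => {
    if (x >= 0 && x < width && y >= 0 && y < height) grid[y][x] = 1;
  };
  // Round coordinates so that the integer-only line routine always terminates.
  const snap = (p: Point): Point => ({ x: Math.round(p.x), y: Math.round(p.y) });
  for (const stroke of strokes) {
    if (stroke.length === 1) plot(snap(stroke[0]).x, snap(stroke[0]).y);
    for (let i = 1; i < stroke.length; i++) {
      drawLine(snap(stroke[i - 1]), snap(stroke[i]), plot);
    }
  }
  return grid;
}

const strokeGrid = redraw([[{ x: 0, y: 0 }, { x: 3, y: 3 }]], 4, 4);
const inkedCount = strokeGrid.reduce((sum, row) => sum + row.reduce((s, v) => s + v, 0), 0);
```

The stroke itself is just two points, while the redrawn image is a full grid, which reflects why transmitting vectors is cheaper than transmitting pixels.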

After the technical scheme is adopted, the invention has the beneficial effects that:

The invention realizes same-screen explanation by operating on mobile terminal devices, with the content of the insurance plan displayed by the digital person, and supports mutual sharing of files, audio and video, and whiteboard content during operation. The method can mix the real-time audio and video, the picture displayed on the shared screen, and the data drawn by users on the whiteboard into a single video, which is provided to the viewer at the client end for watching.

The invention realizes same-screen display of H5 pages by means of H5 node snapshots, without third-party software. Bidirectional operation is realized in H5: through H5 node snapshots and H5 event collection and communication, the agent end and the client end can control a unified interface. Because snapshot nodes are what is sent between the agent end and the client end during communication, the node data can be desensitized before being sent to the server for page reconstruction, and the server then sends the reconstructed page to the client end in a playback manner.

Drawings

In order to illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic diagram of screen sharing between an agent and a client in an embodiment;

FIG. 2 is a schematic diagram of communication between an agent and a client in an embodiment;

FIG. 3 is a schematic diagram of a digital person system in an embodiment.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The embodiment of the application discloses a multi-terminal same-screen interactive display method.

Examples

As shown in FIGS. 1 to 3, the multi-terminal same-screen interactive display method comprises the following steps:

The same-screen display on the mobile phone end H5 comprises the following steps:

S1, recording user operations: when the user explains the insurance plan, the front-end program of the agent end first records the current snapshot of the DOM nodes, then monitors and records the user's interactive operations in the browser, such as mouse clicks and keyboard input. DOM changes are monitored by node comparison, and browser events (mouse clicks, scrolling, key presses and the like) are monitored as well. The front-end program runs in the browser.

S2, the agent end generates an event stream: the front-end program of the agent end converts the recorded user operations into a series of events, called an event stream. The event stream includes the type of user operation, a timestamp, a body of the target element, the data caused by the operation (input data), and the like. The type of operation refers to a mouse click, double click, right click, movement, scrolling, keyboard event, and so on.

S3, serializing the event stream: the front-end program of the agent end serializes the event stream into JSON format, stores it locally, and uploads it to the server after the desensitization operation.

S4, the server reads the serialized event stream and performs the playback operation: during playback, the back-end program of the same-screen application reads the serialized event stream and parses it into a series of events. It then plays back the events in order of their timestamps, reproduces the user's interactive operations in the browser, and sends them to the client end, so that the viewer at the client end can see the interactive operations of the agent end's user in the browser.

The multi-terminal same-screen bidirectional operation on the mobile phone end H5 comprises the following steps:

D1, the client end plays back the reconstructed page: when the viewer views the reconstructed page produced by the playback operation and sent by the server, a complete DOM tree structure is also formed at the client end. The front-end program of the client end, running in the browser, likewise monitors and records the viewer's operation events on nodes, but does not record the DOM tree structure, so that only the user at the agent end can operate on the insurance plan and the viewer cannot; that is, the agent end can perform modification operations while the viewer cannot. Operation events on nodes refer to operations such as mouse clicks and keyboard input.

D2, the client end generates an event stream: the client end converts the recorded viewer operations into a series of events, called an event stream. The front-end program of the client end, running in the browser, uploads the recorded event stream to the server as JSON data.

D3, the agent end receives the viewer's operation data returned by the client end: the server sends the client's event stream data to the agent end, and the agent end fuses the events transmitted by the client end with its local events and applies them to the page DOM.

D4, after the user at the agent end operates on the insurance plan page, the resulting DOM mirror snapshot or incremental snapshot is transmitted to the viewer, so that the viewer and the agent end see a consistent result. Combining its locally stored data, the agent end conveys the new page data, which fuses the local operations with the client's operations, through the server to the viewer at the client end in the form of DOM mirror snapshots and DOM incremental snapshots.

Desensitization during same-screen operation: after the agent end records the DOM snapshot, the front-end program of the agent end performs desensitization operations such as removal and replacement on the corresponding node data according to configuration, and then sends the desensitized data to the viewer at the client end. The agent end must monitor the data it sends to the server, and must remove and replace the sensitive content before uploading the data to the server. The viewer at the client end can only receive the desensitized reconstructed page data sent by the agent end.

Integrated electronic whiteboard functionality: the server connects the SDK of the interactive whiteboard integrated with the messenger cloud to a unified channel, and commands such as starting and closing the whiteboard are transmitted through this channel. Meanwhile, data is transmitted through WebSocket; the front-end program in the browser converts the recorded whiteboard vector data into image data by the vector redrawing method, and finally the image data, the real-time audio and video, and the screen sharing data are mixed and assembled into integrated video data.

The display and broadcasting of the insurance plan by the digital person comprises the following:

(1) Write the display content of the insurance plan. A digital person display script is written according to the content of the insurance plan; the script includes the voice broadcast content, facial expressions, gesture actions and the like.

(2) Generate speech using speech synthesis technology. According to the display script, speech synthesis technology is used to generate the speech of the digital person. Speech synthesis may convert text into natural-sounding speech through deep learning or similar techniques.

(3) Implement face recognition and expression recognition. Face recognition technology can identify the facial expressions of the user, and the expression changes of the digital person are then realized according to the display script. Expression recognition may use deep learning or similar techniques to recognize facial expressions.

(4) Implement gesture recognition and action control. Gesture recognition technology can recognize the gesture actions of the user, and the action control of the digital person is then realized according to the display script. Gesture recognition may use deep learning or similar techniques to recognize gesture actions.

(5) Integrate the technologies to realize the display and broadcasting of the digital person. According to the display script, the speech, expression and gesture technologies are integrated to realize the display and broadcasting of the digital person.
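The display script described in items (1) to (5) can be pictured as a list of timed cues that combine speech, expression and gesture. The TypeScript below is a data-structure sketch only: the `ScriptCue` and `TimelineAction` shapes and all cue values are hypothetical, and no speech synthesis or recognition is performed.

```typescript
// Hypothetical cue in a digital person display script.
interface ScriptCue {
  atMs: number;        // when the cue starts, relative to the start of the script
  speech?: string;     // text that would be handed to a speech synthesizer
  expression?: string; // for example "smile"
  gesture?: string;    // for example "point-left"
}

interface TimelineAction {
  atMs: number;
  kind: "speech" | "expression" | "gesture";
  value: string;
}

// Flatten the cues into one time-ordered list of actions, which is the
// integration step: speech, expression and gesture share a single timeline.
function buildTimeline(cues: ScriptCue[]): TimelineAction[] {
  const actions: TimelineAction[] = [];
  for (const cue of cues) {
    if (cue.speech !== undefined) actions.push({ atMs: cue.atMs, kind: "speech", value: cue.speech });
    if (cue.expression !== undefined) actions.push({ atMs: cue.atMs, kind: "expression", value: cue.expression });
    if (cue.gesture !== undefined) actions.push({ atMs: cue.atMs, kind: "gesture", value: cue.gesture });
  }
  return actions.sort((a, b) => a.atMs - b.atMs);
}

const planTimeline = buildTimeline([
  { atMs: 4000, speech: "This clause covers hospitalization.", gesture: "point-left" },
  { atMs: 0, speech: "Welcome to your insurance plan.", expression: "smile" },
]);
```

A renderer could walk `planTimeline` and trigger the voice, face and hand animation of the digital person at the listed times.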

The DOM is the document object model of HTML5. It organizes document content as a tree structure so that developers can manipulate the content and structure of the document, including elements, nodes, attributes and text nodes, through JavaScript.

In the method, the mobile phone end H5 realizes mobile phone screen sharing and bidirectional operation by means of DOM snapshots, realizes one-way desensitization during same-screen operation in an encrypted manner during transmission, uses the whiteboard to realize interactive annotation during screen sharing, and uses a digital person to explain and display the content of the insurance plan, so that the viewer at the client end can understand the insurance content in detail.

The method of the invention performs same-screen demonstration using H5 on the mobile phone end, so the user does not need to install any third-party App on the phone: same-screen operation is realized directly in environments such as the built-in browser, WeChat and enterprise WeChat. Both screen content and audio and video can be shared, and the operator can annotate and analyze on the whiteboard while explaining, which lets the operator express the intended meaning better and lets the viewer see the annotation and analysis process. The same-screen demonstration supports desensitization during data transmission, so the situation in which neither the operator nor the viewer can see the sensitive data because it was desensitized directly on the interface does not arise. Existing same-screen schemes only support broadcasting the content of the insurance plan as speech via TTS, whereas the invention, by incorporating a digital person, conveys the content of the insurance plan in a friendlier way.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims

1. The multi-terminal same-screen interactive display method is characterized by comprising the following steps of:

2. The multi-terminal on-screen interactive display method according to claim 1, wherein the H5 screen sharing comprises the steps of:

3. The multi-terminal on-screen interactive display method according to claim 2, wherein the H5 bi-directional operation comprises the steps of:

4. The multi-terminal on-screen interactive presentation method according to claim 2, wherein the recording of the interactive operation in step S1 comprises monitoring the DOM for changes using node comparison, and monitoring for browser events.

5. The multi-terminal on-screen interactive presentation method of claim 3, wherein the event stream comprises a type of operation, a time stamp, a body of the target element, and data resulting from the operation.

6. The multi-terminal on-screen interactive display method according to claim 3, wherein the desensitization operation comprises removing and replacing corresponding node data containing sensitive information after the agent terminal records the DOM snapshot.

7. The multi-terminal on-screen interactive display method according to claim 3, wherein in step D1, after receiving the reconstructed page sent by the server, the client side also forms a complete DOM tree structure when the viewer operates on the reconstructed page, the front-end program of the client side monitors and records the operation event of the viewer on the node, and the front-end program of the client side does not record the DOM tree structure, so that only the user at the agent side can operate the content of the insurance plan.

8. The multi-terminal on-screen interactive display method according to claim 3, wherein in step D3, the server performs page reconstruction on the serialized event stream data of the client, and sends the event stream data to the agent in a playback manner, the front-end program of the agent merges the event stream data of the client with the local event stream data, operates the merged page DOM, and sends the page DOM data after the operation to the server in a mirror snapshot and incremental snapshot manner after the desensitization processing.

9. The multi-terminal same-screen interactive display method of claim 1, wherein the front-end program of the agent side and the front-end program of the client side respectively acquire vector tracks of the whiteboards, the vector tracks are synchronous with pictures, the vector tracks are reconstructed through data push flow, the vector tracks are converted into image data through a vector redrawing mode and are uploaded to a server, and the server mutually conveys the white board data of the agent side and the client side.