CN116055666A - Display device and conference summary generation method

Info

Publication number: CN116055666A
Application number: CN202111261746.4A
Authority: CN
Inventors: 张宏波; 王金童
Original assignee: Qingdao Jukanyun Technology Co ltd
Current assignee: Qingdao Jukanyun Technology Co ltd
Priority date: 2021-10-28
Filing date: 2021-10-28
Publication date: 2023-05-02

Abstract

Embodiments of the application provide a display device and a method for generating a conference summary. The display device comprises a display and a controller in communication with the display. The controller is configured to: record a multimedia file of the conference in real time after the conference is started; receive a triggering operation of a conference summary control input by a user; in response to the triggering operation, generate and display on a conference interface a floating layer for recording the conference summary, and acquire the summary data input on the conference recording floating layer; when the user exits the conference recording floating layer, generate a conference record containing an initial recording time and the summary data, where the initial recording time is the recording time of the multimedia file when the floating layer is generated; and send the conference record to a server, so that the server generates a conference summary according to all conference records of the conference. The application improves the efficiency of generating conference summaries.

Description

Display device and conference summary generation method

Technical Field

The application relates to the technical field of servers, in particular to a display device and a method for generating a meeting summary.

Background

A meeting summary is a formal document that records and conveys the basic circumstances, main spirit, and agreed matters of a meeting. To make meeting summaries easier to prepare, some meeting applications in the related art provide audio and video recording functions that generate audio files and video files of the meeting. After the meeting, the user can compile the meeting summary from these files. However, when the meeting is long, the audio and video files are large, so locating the content needed for the summary is laborious, and the meeting summary is generated inefficiently.

Disclosure of Invention

In order to solve the technical problem of low efficiency of generating the meeting summary, the application provides display equipment and a method for generating the meeting summary.

In a first aspect, the present application provides a display device, comprising:

a display;

a controller in communication with the display, the controller configured to:

after a conference is started, recording multimedia files of the conference in real time;

receiving triggering operation of a meeting summary control input by a user;

responding to the triggering operation of the conference summary control, generating and displaying a floating layer for recording the conference summary on a conference interface, and acquiring summary data input on the conference recording floating layer;

when the user exits the conference recording floating layer, generating a conference record containing an initial recording time and the summary data, wherein the initial recording time is the recording time of the multimedia file when the floating layer is generated;

and sending the conference record to a server, so that the server generates a conference summary according to all conference records of the conference.

In some embodiments, acquiring the summary data input on the conference recording floating layer includes one of the following:

acquiring a coordinate area frame-selected by the user on the conference interface, performing text recognition on the image in the coordinate area, determining the recognized text as the summary data input on the conference recording floating layer, and adding the recognized text to an input box of the conference recording floating layer;

acquiring text data input by the user in an input box of the conference recording floating layer, and determining that text data as the summary data input on the conference recording floating layer;

acquiring voice data input by the user in an input box of the conference recording floating layer, converting the voice data into text data, and determining the converted text data as the summary data input on the conference recording floating layer; or

acquiring text data that originates from another display device and is sent by the server, and determining that text data as the summary data input on the conference recording floating layer.

In some embodiments, generating a meeting record containing a start recording time and the summary data includes:

generating a meeting record containing a summary type, the start recording time, and the summary data, wherein the summary type is one of a main point record type, a backlog record type, and a question-answer record type.

In some embodiments, the meeting summary control is a main point record control, a backlog record control, or a question-answer record control, and the summary type is obtained from the control data of the meeting summary control: the control data of the main point record control specifies the main point record type, the control data of the backlog record control specifies the backlog record type, and the control data of the question-answer record control specifies the question-answer record type.

In some embodiments, the controller is further configured to:

generate a hyperlink according to the start recording time, the hyperlink being configured to jump to the start recording time of the multimedia file.

In a second aspect, the present application provides a method for generating a meeting summary, where the method includes:

receiving triggering operation of a meeting summary control input by a user;

The display device and the conference summary generation method have the beneficial effects that:

In the present application, a floating layer for recording the meeting summary is generated during the meeting, so that the user can input summary data while the meeting is in progress, and the initial recording time at which the summary data is input is determined. After the meeting ends, the meeting summary can be generated quickly from the summary data input by the user and the initial recording time. If the generated meeting summary needs to be edited, the position of the summary data in the multimedia file of the meeting can be located quickly from the initial recording time, without reviewing the multimedia file from the beginning. This improves the efficiency of generating the meeting summary.

Drawings

In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.

FIG. 1 illustrates an operational scenario between a display device and a control apparatus according to some embodiments;

FIG. 2 illustrates a schematic view of a scenario of a video conference according to some embodiments;

FIG. 3 illustrates a flow diagram of a conference summary generation method according to some embodiments;

FIG. 4 illustrates a partial timing diagram of a conference summary generation method according to some embodiments;

FIG. 5 illustrates an interface schematic diagram of a conferencing application according to some embodiments;

FIG. 6 illustrates a partial timing diagram of a conference summary generation method according to some embodiments;

FIG. 7 illustrates an interface schematic diagram of a conferencing application according to some embodiments;

FIG. 8 illustrates a partial timing diagram of a conference summary generation method according to some embodiments;

FIG. 9 illustrates an interface schematic diagram of a conferencing application according to some embodiments;

FIG. 10 illustrates a partial timing diagram of a conference summary generation method according to some embodiments;

FIG. 11 illustrates an interface schematic diagram of a conferencing application according to some embodiments;

FIG. 12 illustrates a partial timing diagram of a conference summary generation method according to some embodiments;

FIG. 13 illustrates an interface diagram of a conference summary according to some embodiments.

Detailed Description

To make the purposes and implementations of the present application clearer, exemplary implementations of the present application are described clearly and completely below with reference to the accompanying drawings, in which those exemplary implementations are illustrated. The exemplary implementations described are only some, not all, of the embodiments of the present application.

It should be noted that the brief description of the terms in the present application is only for convenience in understanding the embodiments described below, and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.

The terms "first," "second," "third," and the like in the description, in the claims, and in the above-described figures are used to distinguish between similar objects or entities and do not necessarily imply a particular order or sequence, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.

The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements explicitly listed, but may include other elements not expressly listed or inherent to such product or apparatus.

The display device provided in the embodiments of the application may take various forms, for example a television, a smart television, a laser projection device, a display (monitor), an electronic whiteboard (electronic bulletin board), or an electronic desktop (electronic table). Fig. 1 shows a specific embodiment of a display device of the present application.

Fig. 1 is a schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment. As shown in fig. 1, a user may operate the display device 200 through the smart device 300 or the control apparatus 100.

In some embodiments, the control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes infrared protocol communication, Bluetooth protocol communication, and other short-range communication modes, and the display device 200 is controlled wirelessly or by wire. The user may control the display device 200 by inputting user instructions through keys on the remote controller, voice input, control panel input, and the like.

In some embodiments, a smart device 300 (e.g., mobile terminal, tablet, computer, notebook, etc.) may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device.

In some embodiments, instead of receiving instructions through the smart device or control apparatus described above, the display device may be controlled by the user through touch, gestures, or the like.

In some embodiments, the display device 200 may also be controlled in ways other than through the control apparatus 100 and the smart device 300. For example, the user's voice commands may be received directly by a voice-acquisition module configured inside the display device 200, or by a voice control device configured outside the display device 200.

In some embodiments, the display device 200 is also in data communication with a server 400. The display device 200 may establish communication connections via a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. The server 400 may be one cluster or multiple clusters, and may include one or more types of servers.

In some embodiments, a conferencing application may be installed on display device 200 and a user may conduct a video conference with users of other devices that are installed with the conferencing application via display device 200.

In some embodiments, the conference application need not be installed on the display device 200. The display device 200 only needs a wired or wireless connection to a device on which the conference application is installed, so that it can display the video picture and play the audio of the conference application.

Referring to fig. 2, a schematic view of a video conference scene according to some embodiments is shown. As shown in fig. 2, the people participating in the video conference may include a presenter, a live audience, and an online audience. The device used by the presenter to participate in the video conference is device A, which may be a smart television, that is, the display device 200 in the above-described embodiments. Assuming there are three online audience members, the devices they use to participate in the video conference are device B1, device B2, and device B3, where device B1 is a notebook computer, device B2 is a video conference device such as a mobile phone or a tablet, and device B3 is a desktop computer.

In some embodiments, device A may be a display device supporting touch operations, such as a touch television.

In some embodiments, device A may be a display device that supports voice operation, such as a voice television.

In some embodiments, device A supports both touch and voice operations, and also supports control by terminal devices such as remote controllers and smartphones.

In some embodiments, after the presenter finishes the video conference on device A, the conference summary may be compiled from the conference video and audio recorded by device A. However, generating the conference summary this way is inefficient.

To solve the technical problem that conference summaries are generated inefficiently, the embodiments of the application provide a method for generating a conference summary. Referring to fig. 3, the method may include the following steps:

Step S101: after the conference is started, recording the multimedia file of the conference in real time.

In some embodiments, after a presenter starts a conference, the conference application may automatically record the multimedia file of the conference in real time. The multimedia file may include a video file of the display screen of the presenter's device A after the conference is started, an audio file, and a subtitle file, where the subtitle file may be generated by performing voice recognition on the audio file.

In some embodiments, the server creates a virtual room for the conference, to which different terminals all join, and through which audio and video transmissions and exchanges take place. In some embodiments, the terminal may parse and display the audio and video data of the corresponding user pulled out from the virtual room according to the user identifier corresponding to the window in the display interface.

In some embodiments, the server may perform speech recognition on the audio uploaded by each terminal to convert it into text, and combine the texts corresponding to different terminals according to their time points to form a subtitle file. For example, the subtitle at time 1 may include both the speech of user 1 and the speech of user 2, and the subtitle at time 2 may include both the speech of user 1 and the speech of user 3.

In some embodiments, when the texts corresponding to different terminals are combined to form a subtitle file, the text corresponding to each terminal is treated as one text entry, and the identifier corresponding to that terminal is added in front of the entry. The arrangement of the different text entries forms the subtitle file at that moment. In some embodiments, the arrangement may include an arrangement of display positions and/or an arrangement of display order.
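As an illustrative sketch only, the merging of per-terminal text into time-aligned subtitle entries could look like the following. The `Utterance` type, the `merge_subtitles` function, and the `[terminal_id]` prefix format are assumptions made for this example; the application does not specify a data format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One recognized piece of speech from one terminal (hypothetical type)."""
    time_point: int    # time point of the speech, e.g. seconds since the conference started
    terminal_id: str   # identifier of the terminal that uploaded the audio
    text: str          # text produced by speech recognition


def merge_subtitles(utterances):
    """Group utterances by time point; each entry is prefixed with its terminal identifier."""
    by_time = defaultdict(list)
    for u in utterances:
        by_time[u.time_point].append(f"[{u.terminal_id}] {u.text}")
    # Time points are ordered chronologically; entries within one time point keep arrival order.
    return {t: by_time[t] for t in sorted(by_time)}


subtitles = merge_subtitles([
    Utterance(2, "user1", "Any questions?"),
    Utterance(1, "user1", "Let us begin."),
    Utterance(1, "user2", "Ready."),
    Utterance(2, "user3", "One question."),
])
```

Here the subtitle at time 1 holds the entries of user 1 and user 2, and the subtitle at time 2 holds those of user 1 and user 3, mirroring the example in the text.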

In some embodiments, since the conference application is running on a terminal, the frames recorded by the terminal may be the frames displayed on the terminal, and in some embodiments, the frames displayed by different terminals of the conference may be different.

Step S102: receiving a triggering operation of the meeting summary control input by the user.

In some embodiments, the meeting summary control may be limited to being displayed on device A of the presenter, and not on devices of participants other than the presenter.

In some embodiments, the meeting summary control may be displayed on the device of any participant.

Taking the case where the conference summary control is displayed only on the presenter's device A as an example, in some embodiments, after the conference is started, the conference summary control may be displayed on the interface of the presenter's conference application, and after the presenter clicks the conference summary control, device A may generate a triggering operation of the conference summary control.

In some embodiments, the meeting summary control may be a single-function control. For example, the meeting summary control may be a meeting gist record control, a backlog record control, or an interactive question record control. Device A may display several meeting summary controls with different functions, and after the user clicks one of them, device A may generate a triggering operation of that control.

In some embodiments, the meeting summary control may be an integrated-function control. After the user clicks it, device A may pop up several single-function controls, such as a meeting gist record control, a backlog record control, and an interactive question record control, for the user to choose from.

If the meeting summary control is a single-function control, the user may trigger different meeting summary controls multiple times during the meeting to generate multiple pieces of meeting summary data. If the meeting summary control is an integrated-function control, the user may trigger it multiple times during the meeting, selecting different single-function controls, to generate multiple pieces of meeting summary data. Of course, the user may also trigger the meeting summary control only once during the meeting, in which case only one piece of meeting summary data is generated.

Step S103: in response to the triggering operation of the conference summary control, generating and displaying a floating layer for recording the conference summary on a conference interface, and acquiring summary data input on the conference recording floating layer.

Taking a single-function meeting summary control as an example, in some embodiments, device A may, in response to the triggering operation of the meeting summary control, generate a floating layer for recording the meeting summary and display it above the meeting interface of device A. When the floating layer is generated, the current recording time of the multimedia file is obtained and used as the initial recording time of the meeting summary, which may also be called the start recording time. Illustratively, the start recording time is the 10th minute of the meeting.

In some embodiments, the content in the conference recording float layer is a combination of the current interface and the subtitle file, without manual entry by the user. The combination of the current interface and the caption file can be generated by screenshot of the video window layer and the caption floating layer, or can be generated by combining the screenshot of the video window layer and the text of the caption file.

In some embodiments, the conference recording floating layer requires the user to record the content, and the user can input summary data in the conference recording floating layer, where the summary data can be data copied or captured by the user from the display interface, or can be data input by the user through voice or touch.

Step S104: when the user exits the conference recording floating layer, generating a conference record containing an initial recording time and the summary data, wherein the initial recording time is the recording time of the multimedia file when the floating layer is generated.

In some embodiments, the user may exit the conference recording floating layer using a control in that layer (e.g., a save control), and upon exit, device A may automatically save the data entered by the user.
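The relationship between the recording position, the floating layer, and the resulting conference record in steps S103 and S104 can be sketched as follows. This is a minimal illustration under assumptions: `Recording`, `SummaryFloatLayer`, and `open_float_layer` are hypothetical names, the recording position is simulated with a counter instead of a real recorder, and joining inputs with a newline is an arbitrary choice.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """Stand-in for the multimedia recording; tracks the recorded duration in seconds."""
    elapsed_seconds: int = 0

    def advance(self, seconds):
        self.elapsed_seconds += seconds


@dataclass
class SummaryFloatLayer:
    """Hypothetical floating layer that remembers when it was generated."""
    start_recording_time: int
    entries: list = field(default_factory=list)

    def add_input(self, text):
        self.entries.append(text)

    def close(self):
        # On exit, produce a conference record with the start time and the summary data.
        return {"time": self.start_recording_time, "text": "\n".join(self.entries)}


def open_float_layer(recording):
    # The start recording time is the recording position at the moment the layer is generated.
    return SummaryFloatLayer(start_recording_time=recording.elapsed_seconds)


recording = Recording()
recording.advance(600)      # 10 minutes into the conference
layer = open_float_layer(recording)
layer.add_input("AAAA")
recording.advance(45)       # the recording continues while the user enters data
layer.add_input("BBBB")
record = layer.close()
```

The record keeps the 600-second start position even though the recording has moved on by the time the user exits, which is what later allows the summary entry to be located in the multimedia file.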

In some embodiments, after the user exits the conference recording float, a hyperlink may be generated according to the start recording time, where the hyperlink is configured to enable the user to access the multimedia file generated by recording the conference after selecting the hyperlink, jump to the start recording time of the multimedia file corresponding to the conference recording float, and highlight the data of the start recording time.
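One possible way to encode the start recording time in a hyperlink is a query parameter that a player reads when the link is opened. The sketch below is an assumption for illustration: the URL layout, the `conference` and `t` parameter names, and the `example.com` address are hypothetical and are not described in the application.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_jump_link(base_url, conference_id, start_seconds):
    """Build a link that asks a player to open the recording at start_seconds."""
    query = urlencode({"conference": conference_id, "t": start_seconds})
    return f"{base_url}?{query}"


def seek_position(link):
    """Read the start recording time back out of a link, as a player might do."""
    return int(parse_qs(urlparse(link).query)["t"][0])


link = build_jump_link("https://example.com/recording", "conf-001", 600)
```

A player that receives such a link could call `seek_position(link)` and jump to that offset in the multimedia file.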

Step S105: sending the conference record to a server, so that the server generates a conference summary according to all conference records of the conference.

In some embodiments, at the end of a meeting, if the user has triggered the meeting summary control only once, device A may generate a meeting summary containing one piece of meeting summary data, and if the user has triggered the meeting summary control multiple times, device A may generate a meeting summary containing multiple pieces of meeting summary data.

In this way, providing the conference summary control in the conference application lets the user generate conference summary data in real time during the conference. Recording the time at which the conference summary floating layer is generated and the time at which it is exited makes it easy for the user to later locate the corresponding position in the multimedia file, which in turn makes it convenient to edit the conference summary and to review the conference from the multimedia file.

In order to further describe the method for generating the conference summary, the process for generating the conference summary is described below in conjunction with some timing diagrams of the process for generating the conference summary and some interface diagrams of the conference application.

Referring to fig. 4, a partial timing diagram of a method of conference summary generation in accordance with some embodiments is shown. In fig. 4, a single online audience member is taken as an example, and device B may be any device participating in the conference, such as device B1, device B2, or device B3. After the presenter and the online audience successfully enter the same conference, a join message for the conference may be sent to the server. The join message may include the conference ID and the user ID of the presenter or online audience member in the conference, so that the server can determine the participant devices for that conference ID.
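How a server might track participants from join messages can be sketched as follows. The `ConferenceRegistry` class and the `conference_id` and `user_id` field names are hypothetical; the text only states that the join message carries a user ID and a conference ID.

```python
from collections import defaultdict


class ConferenceRegistry:
    """Hypothetical server-side map from conference ID to the users who joined it."""

    def __init__(self):
        self._participants = defaultdict(set)

    def handle_join(self, message):
        # A join message carries the user ID and the conference ID.
        self._participants[message["conference_id"]].add(message["user_id"])

    def participants(self, conference_id):
        return sorted(self._participants.get(conference_id, set()))


registry = ConferenceRegistry()
registry.handle_join({"conference_id": "conf-001", "user_id": "presenter"})
registry.handle_join({"conference_id": "conf-001", "user_id": "audience-1"})
registry.handle_join({"conference_id": "conf-002", "user_id": "audience-9"})
```

Looking up `registry.participants("conf-001")` then yields the participants of that conference only.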

In some embodiments, the interface of the conferencing application may be provided with the following operational controls: a screen casting control, a microphone control, a camera control, a conference member control, and a conference summary control.

As shown in fig. 4, after the conference is started and the presenter opens a piece of lecture material, such as a PPT or Word file, on device A, the presenter can click the screen casting control on device A and then operate the microphone control on device A to start lecturing.

In some embodiments, after receiving the triggering operation of the screen casting control, device A may transmit its display screen to the server in real time. After receiving the triggering operation of the microphone control, device A may turn on the microphone to record the presenter's speech audio and transmit the speech audio to the server in real time. Along with the display screen and the speech audio, device A also sends the conference ID to the server.

In some embodiments, after receiving the display screen sent by device A, the server may send the display screen, according to the conference ID, to the participant devices corresponding to that conference ID other than device A, such as device B. After receiving the speech audio sent by device A, the server converts the speech into caption text and sends the speech audio and the caption text to the participant devices corresponding to the conference ID other than device A, such as device B. After receiving the video, the audio, and the captions, device B plays the video and the audio and displays the captions.
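The forwarding rule described above, in which the server relays data to every participant device of the conference except the sender, can be sketched as follows. The function names and the device identifiers are assumptions for illustration, and actual network delivery is left out.

```python
def forward_targets(participants, sender):
    """Return every participant device of the conference except the sender."""
    return [device for device in participants if device != sender]


def relay(participants_by_conference, conference_id, sender, payload):
    """Pair each receiving device with the payload it should be sent."""
    participants = participants_by_conference.get(conference_id, [])
    return [(device, payload) for device in forward_targets(participants, sender)]


rooms = {"conf-001": ["device-A", "device-B1", "device-B2"]}
deliveries = relay(rooms, "conf-001", "device-A", {"caption": "Hello"})
```

Device A is excluded from its own deliveries, and an unknown conference ID produces no deliveries at all.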

In some embodiments, referring to fig. 5, during a meeting, device A may display the following meeting summary controls: a "gist record" control, a "backlog record" control, and a "question-answer record" control.

In some embodiments, device A may display other controls in addition to the controls shown in fig. 5, such as controls corresponding to each of the participants. During the conference, the controls on device A are hidden automatically so that the lecture content is not blocked. If the presenter needs to operate a control, the presenter can call it up with a preset instruction, for example by operating a preset key such as an exit-full-screen key.

The generation process of the meeting summary is described below, taking the presenter operating the "gist record" control, the "backlog record" control, and the "question-answer record" control as examples. Whichever meeting summary control the presenter operates, device A can generate the corresponding meeting summary floating layer.
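Deriving the summary type from the control data of the triggered control can be sketched as a simple lookup. The control identifiers and the `"todo"` and `"qa"` values are assumptions made for this example; only `"highlights"` follows the record format given later in this description.

```python
from enum import Enum


class SummaryType(Enum):
    GIST = "highlights"        # value borrowed from the record format in this description
    BACKLOG = "todo"           # assumed value
    QUESTION_ANSWER = "qa"     # assumed value


# Hypothetical control data: each control carries the summary type it produces.
CONTROL_DATA = {
    "gist_record": {"summary_type": SummaryType.GIST},
    "backlog_record": {"summary_type": SummaryType.BACKLOG},
    "question_answer_record": {"summary_type": SummaryType.QUESTION_ANSWER},
}


def summary_type_for(control_id):
    """Look up the summary type from the control data of the triggered control."""
    return CONTROL_DATA[control_id]["summary_type"]


triggered = summary_type_for("backlog_record")
```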

Referring to fig. 6, a partial timing diagram of a method of conference summary generation in accordance with some embodiments is shown.

As shown in fig. 6, a presenter may operate the "gist record" control on device A. If the presenter operates the "gist record" control in fig. 5, device A may, in response to the control being triggered, generate a gist recording floating layer and display it on the current interface of device A. The floating layer may be adjustable in size and position and may be provided with an input box. Device A also records the recorded duration of the multimedia file at the moment the presenter operates the gist record control, and determines from that duration the time at which the presenter started the gist record. For example, if the recorded duration is 10 minutes, the 10th minute is the start recording time of the conference summary.

In some embodiments, the presenter may select text on device A that is the gist of the meeting. Referring to fig. 7, the text first selected by the presenter may include "AAAA". In some embodiments, the text selected a second time by the presenter is "BBBB".

In some embodiments, if device A supports touch operation, the presenter may select the text that forms the gist of the meeting as follows. The presenter long-presses device A until device A selects the line of text where the presenter's touch point is located, or displays a text selection prompt, and device A takes this touch point as the starting point. The presenter then moves the touch position, during which device A may enlarge the selected position, and finally releases the touch. Device A takes the release position as the end point and treats the rectangular area between the starting point and the end point as the area frame-selected by the user; this selected area is the coordinate area. The text in the selected area, which may be referred to as the gist text, is recognized and/or copied into the gist recording floating layer, and is stored in device A as the summary data input by the user, that is, the presenter. In fig. 7, the text selected by the presenter is the text located in the selected area 501, which is "AAAA"; the display area of the recording floating layer is area 502, and area 502 may be an input box.
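Computing the rectangular coordinate area from the touch starting point and the release point can be sketched as follows. The `Rect` type and the pixel coordinates are assumptions for illustration; the normalization step simply ensures a valid rectangle whichever direction the user drags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def selection_rect(start, end):
    """Rectangle spanned by the touch-down point and the touch-release point."""
    (x1, y1), (x2, y2) = start, end
    # Normalize so the rectangle is valid even when the user drags up or to the left.
    return Rect(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


area = selection_rect(start=(400, 300), end=(120, 80))
```

Text whose on-screen position falls inside `area` would then be the candidate for copying or recognition.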

In some embodiments, if the selected area does not support text copying, for example because the selected area is in a picture format, the text within the selected area may be recognized by an OCR (Optical Character Recognition) algorithm and then copied into area 502.

In some embodiments, the gist recording floating layer supports size and position adjustment. It may be configured to become position-adjustable after receiving a long-press operation: the presenter can drag the floating layer and, on releasing the touch, its position adjustment is complete. It may further be configured to become size-adjustable after receiving a double-click operation: the presenter can slide a corner of the floating layer to adjust its size. Alternatively, the gist recording floating layer may be configured to pop up a control menu after receiving the long-press operation and to display several controls in that menu, such as a move control for moving the floating layer and a size control for adjusting its size.

In some embodiments, after the presenter copies the text in one selected area to the gist recording floating layer, if the current interface contains more text to be copied, the presenter selects a second area. Device A then adds the text in the newly selected area to the gist recording floating layer, where it may be placed below the previously selected text. If the previously selected text already fills the gist recording floating layer, the floating layer may move that text upward, so that part or all of it moves beyond the upper boundary of the floating layer, freeing a display area for the text in the newly selected area and producing a scrolling effect for the text.

In some embodiments, the gist recording floating layer may be provided with a save control, and after the presenter clicks the save control, device A may record the recorded duration of the multimedia file at that moment, so as to determine the end time of the meeting summary.

In some embodiments, after the presenter clicks the save control, device A determines that this recording session has ended and transmits the cached summary data input by the user, together with the start recording time and the end time, to the server as one summary record.

Illustratively, the format of the gist record generated by device A is: highlights: { "text": "AAAA\BBBBBB", "time": t1}. Here, highlights indicates that the type of the meeting summary is a gist record, text is the text selected by the user, and time is the start recording time of the meeting summary. That is, t1 is a moment such as 10:00, meaning that the start recording time of the meeting summary is the 10th minute of the conference.
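Illustratively, the way device A might turn the recorded duration into a start recording time and wrap it in a record can be sketched as follows. This is a minimal sketch under stated assumptions: the function names format_elapsed and build_record are hypothetical, and the MM:SS string representation of the time field is assumed from the "10:00" example rather than specified by the embodiments.

```python
import json


def format_elapsed(seconds: int) -> str:
    """Format a recorded duration in seconds as MM:SS (assumed representation)."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def build_record(summary_type: str, text: str, elapsed_at_open: int) -> dict:
    """Wrap the summary text under its type key together with the start recording time.

    elapsed_at_open is the recorded duration of the multimedia file, in seconds,
    at the moment the floating layer was generated.
    """
    return {summary_type: {"text": text, "time": format_elapsed(elapsed_at_open)}}


# Floating layer opened 600 seconds (10 minutes) into the recording.
record = build_record("highlights", "AAAA", 600)
print(json.dumps(record))
```

The same helper would apply to the other record types by passing a different type key.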

In some embodiments, after receiving a gist record, the server may determine whether the gist record is the first meeting summary record corresponding to the conference ID. If so, the server generates a meeting_minutes list and stores the gist record in it; if not, the server adds the gist record to the existing meeting_minutes list.

Illustratively, the format in which the server stores a gist record is: [ { "type": "highlights", "text": "AAAA\BBBBBB", "time": t1} ].
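Illustratively, the server-side handling described above (create the list on the first record of a conference, append afterwards, and flatten the type key into a "type" field) might look like the following sketch. The in-memory dictionary store and the function name store_record are assumptions made for illustration; the embodiments do not specify how the server persists records.

```python
def store_record(store: dict, conference_id: str, record: dict) -> list:
    """Store one record of the form {type: {"text": ..., "time": ...}} for a conference."""
    # The incoming record has exactly one key, the summary type.
    (summary_type, body), = record.items()
    stored = {"type": summary_type, **body}
    if conference_id not in store:
        # First record for this conference ID: create the list.
        store[conference_id] = [stored]
    else:
        # Later records are appended to the existing list.
        store[conference_id].append(stored)
    return store[conference_id]


store = {}  # hypothetical store: conference ID -> list of records
store_record(store, "12345", {"highlights": {"text": "AAAA", "time": "10:00"}})
store_record(store, "12345", {"todo": {"text": "Complete xx item", "time": "20:00"}})
print(store["12345"])
```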

Referring to fig. 8, a partial timing diagram of a conference summary generation method according to some embodiments is shown.

As shown in fig. 8, a presenter may operate a "to-do record" control on device A. If the presenter operates the "to-do record" control in fig. 5, device A may, in response to the control being triggered, generate a to-do record floating layer and display it on the current interface of device A. The floating layer may be adjustable in size and position. Device A also records the recorded duration of the multimedia file at the moment the presenter operates the to-do record control, and determines from that duration the time at which the presenter starts recording the to-do item. For example, if the recorded duration is 20 minutes, the 20th minute is the start recording time of the meeting summary.

Referring to fig. 9, the display area of the to-do record floating layer is area 503, which may be an input box. The presenter may input a to-do item in area 503 by voice, by touch, or through a computer connected to device A. If the user inputs voice data, the voice data is converted into text data, and that text data is used as the summary data input by the user; if the user inputs text data in the input box, the text data is used directly as the summary data input by the user.
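Illustratively, the branching between voice input and text input might be sketched as follows. The speech recognizer is passed in as a callable because the embodiments do not name a recognition engine; to_summary_text and fake_transcribe are hypothetical names, and representing voice data as bytes is an assumption.

```python
from typing import Callable, Union


def to_summary_text(user_input: Union[str, bytes],
                    transcribe: Callable[[bytes], str]) -> str:
    """Return the summary text for one user input.

    Voice data (assumed here to arrive as bytes) is converted to text first;
    text typed into the input box is used directly.
    """
    if isinstance(user_input, bytes):
        return transcribe(user_input)
    return user_input


def fake_transcribe(audio: bytes) -> str:
    """Stand-in recognizer used only for this illustration."""
    return "Complete xx item"


print(to_summary_text("Responsible person: xx", fake_transcribe))
print(to_summary_text(b"\x00\x01", fake_transcribe))
```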

Illustratively, the to-do item input by the user has the format: complete xx item; responsible person: xx; x year x month x day.

After receiving the to-do item input by the user, device A displays the to-do item.

In some embodiments, the to-do record floating layer may be provided with a save control. After the presenter clicks the save control, device A may record the recorded duration of the multimedia file at that moment to determine the end time of the meeting summary.

Illustratively, the format of the to-do record generated by device A is: todo: { "text": "Complete xx item. Responsible person: xx; x month x day", "time": t2}. Here, todo indicates that the type of the meeting summary is a to-do record, text is the to-do item input by the user, and time is the start recording time of the meeting summary. That is, t2 is a moment such as 20:00, meaning that the start recording time of the meeting summary is the 20th minute of the conference.

In some embodiments, after receiving a to-do record, the server may determine whether the to-do record is the first meeting summary record corresponding to the conference ID. If so, the server generates a meeting_minutes list and stores the to-do record in it; if not, the server adds the to-do record to the existing meeting_minutes list.

Illustratively, the format in which the server stores a to-do record is: [ { "type": "todo", "text": "Complete xx item. Responsible person: xx; x month x day", "time": t2} ].

Referring to fig. 10, a partial timing diagram of a method of conference summary generation in accordance with some embodiments is shown.

As shown in fig. 10, a presenter may operate a "question and answer record" control on device A. If the presenter operates the "question and answer record" control in fig. 5, device A may, in response to the control being triggered, generate a question and answer record floating layer and display it on the current interface of device A. The floating layer may be adjustable in size and position. Device A also records the recorded duration of the multimedia file at the moment the presenter operates the question and answer record control, and determines from that duration the time at which the presenter starts recording the question and answer. For example, if the recorded duration is 30 minutes, the 30th minute is the start recording time of the meeting summary.

Referring to fig. 11, the display area of the question and answer recording float includes an area 504 and an area 505, wherein the area 504 is used to display the question contents of the audience and the area 505 is used to display the answer contents of the presenter.

In some embodiments, regions 504 and 505 may also be two separate floating layers, facilitating the adjustment of the position and size of the two regions by the presenter, respectively.

In some embodiments, after operating the "question and answer record" control, the presenter may announce the interactive session to let the audience know that questions can now be asked. After the presenter's speech audio is transmitted to device B through the server, device B can play it, and the audience, having heard it, can ask questions.

In some embodiments, if an online audience member wants to ask a question, the audience member may operate a "question" control on his or her device, such as device B. After receiving the trigger instruction of the "question" control, device B may generate a question request, which includes the conference ID and the user ID of device B, and send it to the server; the server may forward the question request to device A. After receiving the question request, device A may display a raised-hand icon on the audience control corresponding to the request, so that the presenter knows that this audience member wants to ask a question. The presenter can click the raised-hand icon; after receiving this click operation, device A can generate response data agreeing to the question, which includes the user ID of device B, and send it to the server. The server may send the response data to device B according to the user ID. Device B is configured to cancel the mute state of its microphone and update the microphone icon to the recording state after receiving the response data, where device B is set to the mute state by default after entering the conference. After seeing the microphone state change, the audience member can ask the question, which may be voice or text.
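Illustratively, the question request and approval exchange can be modeled with two small classes. This is a simplified sketch: the server relay is replaced by direct method calls, and the class names, method names, and message field names (conference_id, user_id) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AudienceDevice:
    user_id: str
    muted: bool = True  # devices are muted by default after entering the conference

    def request_question(self, conference_id: str) -> dict:
        """Build the question request sent when the "question" control is operated."""
        return {"conference_id": conference_id, "user_id": self.user_id}

    def on_response(self, response: dict) -> None:
        """Cancel the mute state only if the response is addressed to this device."""
        if response.get("user_id") == self.user_id:
            self.muted = False


@dataclass
class PresenterDevice:
    raised_hands: set = field(default_factory=set)

    def on_question_request(self, request: dict) -> None:
        """Show a raised-hand icon for the requesting audience member."""
        self.raised_hands.add(request["user_id"])

    def approve(self, user_id: str) -> dict:
        """Presenter clicks the raised-hand icon; build the response data."""
        self.raised_hands.discard(user_id)
        return {"user_id": user_id}


device_a = PresenterDevice()
device_b = AudienceDevice("user_b")
device_c = AudienceDevice("user_c")

device_a.on_question_request(device_b.request_question("12345"))
response = device_a.approve("user_b")
for device in (device_b, device_c):
    device.on_response(response)

print(device_b.muted, device_c.muted)
```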

In some embodiments, during the questioning session, other devices are muted, except for the presenter and the questioner's devices.

If the audience question received by device B is voice, device B may transmit the voice to device A through the server, so that device A plays the voice.

Further, the server performs voice recognition on the voice sent by device B to obtain the question text, encapsulates data such as the storage address on the server of the voice sent by device B, the question text, the user ID of device B, and the user nickname into a data packet, and sends the data packet to device A, so that device A displays the question content from device B in area 504. In other words, device A can also use text data from another display device, sent by the server, as summary data input on the conference recording floating layer.

Illustratively, when the questions received by device B are voice, the format of the server-encapsulated data packets is as follows:

query: { "audio": "audience 1", "id": "xxx", "voice": "xxx", "text": "query 1" }. Here, "query 1" is the question text, and "voice" is the storage address on the server of the voice of the audience question.

If the audience question received by device B is text, device B may package the text, the user ID of device B, the user nickname, and other data into a data packet and send the data packet to device A, so that device A displays the question content from device B in area 504.

Illustratively, when the questions received by device B are text, the format of the server-encapsulated data packet is as follows:

query: { "audio": "audience 1", "id": "xxx", "text": "query 1" }.

Referring to fig. 11, upon receiving the server-encapsulated data packet, device A may extract the audience nickname and the question text and display them in area 504.
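Illustratively, building the question packet for the voice and text cases and extracting the display text on device A might be sketched as follows. The key names are copied from the example packets above as they appear in this text; the function names and the "nickname: question" display format are assumptions.

```python
from typing import Optional


def build_question_packet(nickname: str, user_id: str, text: str,
                          voice_url: Optional[str] = None) -> dict:
    """Build a question packet; the "voice" field is present only for voice questions."""
    body = {"audio": nickname, "id": user_id, "text": text}
    if voice_url is not None:
        # Storage address on the server of the recorded voice question.
        body["voice"] = voice_url
    return {"query": body}


def display_line(packet: dict) -> str:
    """Extract the nickname and question text for display in area 504 (assumed layout)."""
    body = packet["query"]
    return f'{body["audio"]}: {body["text"]}'


text_packet = build_question_packet("audience 1", "xxx", "query 1")
voice_packet = build_question_packet("audience 1", "xxx", "query 1", voice_url="xxx")
print(display_line(text_packet))
print("voice" in voice_packet["query"])
```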

In some embodiments, after seeing the question text displayed in area 504, the presenter may reply to the question.

Upon receiving the presenter's reply data, such as reply audio, device A may package the reply data into the following format:

answer: {"text":"xxx","voice":"xxx"}.

In some embodiments, device A may send the packaged question data, the answer data, and the answer audio to the server, and the server may send them to device B, so that device B plays the presenter's answer audio and displays a floating layer as shown on the right side of fig. 11, in which device B can display its own question data and the presenter's answer data.

In some embodiments, after the presenter clicks the close button in area 504, device A determines that the current audience question and answer has ended, and may generate a question and answer record for the current audience member in the following format:

After generating a question and answer record, device A may clear the display data in regions 504 and 505, and the presenter may start the next question and answer.

In some embodiments, the question-answer recording floating layer may be provided with a save control. After the presenter clicks the save control, the plurality of question-answer records and the start recording time of the question and answer are sent to the server. The server stores the question-answer records in the following format:

Here, time is the start recording time of the question and answer. That is, t3 is a moment such as 30:00, meaning that the start recording time is the 30th minute of the conference.

According to the above embodiments, during the conference the presenter can create a plurality of conference records of different summary types or of the same summary type. The process by which the server organizes these conference records into a meeting summary is shown in fig. 12, a partial timing diagram of a meeting summary generation method according to some embodiments.

As shown in fig. 12, after the presenter clicks to exit the conference, device A, upon receiving the exit operation, may generate a conference summary generation request, which may include the conference ID, and send it to the server.

In some embodiments, after receiving the meeting summary generation request, the server may obtain a plurality of meeting records corresponding to the meeting ID, and generate a hyperlink of a multimedia file of the meeting ID according to a start record time in each meeting record, where the hyperlink may be a link capable of jumping to the meeting application, and the jump position is the corresponding start record time.

In some embodiments, the server may further arrange conference records of the same type adjacent to one another according to the summary type in each conference record. The summary types include a gist record type, a to-do record type, and a question-answer record type: the gist record type is represented by highlights, the to-do record type is represented by todo, and the question-answer record type is represented by qa_record.
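Illustratively, arranging records of the same type adjacent to one another can be done with a stable sort keyed on the summary type. This is only a sketch: the display order of the types and the function name arrange_by_type are assumptions, and each record is assumed to carry a top-level "type" field as in the stored formats shown earlier.

```python
# Assumed display order of the three summary types.
TYPE_ORDER = ["highlights", "todo", "qa_record"]


def arrange_by_type(records: list) -> list:
    """Return the records with same-type entries adjacent.

    sorted() is stable, so records of the same type keep their arrival order.
    """
    return sorted(records, key=lambda r: TYPE_ORDER.index(r["type"]))


records = [
    {"type": "highlights", "text": "AAAA", "time": "10:00"},
    {"type": "todo", "text": "Complete xx item", "time": "20:00"},
    {"type": "highlights", "text": "CCCC", "time": "25:00"},
]
for r in arrange_by_type(records):
    print(r["type"], r["time"])
```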

Referring to fig. 13, an interface diagram of a meeting summary according to some embodiments is shown. In fig. 13, the conference subject, time, presenter, and participants may be set by the presenter before the conference starts.

As shown in fig. 13, the content of the gist records is generated from the records with type = "highlights" in meeting_minutes. The text of each gist is taken from the text field, and the hyperlink is a jump link to the video conference app, i.e. the conference application, generated from the time field and the conference number, such as video of://conference_id=12345&type=highlights&time=t1.

The content of the to-do records is generated from the records with type = "todo" in meeting_minutes. The text of each record is taken from the text field, and the hyperlink is a jump link to the video conference app generated from the time field and the conference number, such as videocon://conference_id=12345&type=todo&time=t2.

The content of the question-answer records is generated from the records with type = "qa_record" in meeting_minutes. The text of each question-answer record is taken from the qa_record.record.query.text and qa_record.answer.text fields, and the hyperlink is a jump link to the video conference app generated from the qa_record.time field and the conference number, such as video_record://conference_id=12345&type=qa_record&time=t3.

As shown in fig. 13, conference records of some summary types, such as the question-answer record type, may also be presented without hyperlinks.
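Illustratively, generating the jump links and rendering summary entries with or without a hyperlink might be sketched as follows. The scheme name "confapp" is a placeholder chosen for this sketch, not the scheme used by the conference application; the function names, the rendered line layout, and the choice of which types receive links are likewise assumptions.

```python
from urllib.parse import urlencode


def build_jump_link(scheme: str, conference_id: str,
                    summary_type: str, time: str) -> str:
    """Build a jump link carrying the conference number, record type, and start time."""
    query = urlencode(
        {"conference_id": conference_id, "type": summary_type, "time": time},
        safe=":",  # keep the colon in MM:SS readable
    )
    return f"{scheme}://{query}"


def render_entry(record: dict, conference_id: str, scheme: str = "confapp",
                 linked_types=("highlights", "todo")) -> str:
    """Render one summary entry; types outside linked_types get no hyperlink."""
    line = record["text"]
    if record["type"] in linked_types:
        link = build_jump_link(scheme, conference_id, record["type"], record["time"])
        line = f"{line} [{link}]"
    return line


print(render_entry({"type": "highlights", "text": "AAAA", "time": "10:00"}, "12345"))
print(render_entry({"type": "qa_record", "text": "query 1", "time": "30:00"}, "12345"))
```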

In some embodiments, after confirming that the content is correct, the presenter logs in to the video conference back end and clicks confirm to send, and the server sends an email to all audience members who participated in the conference. On a device on which the video conferencing application is installed, an audience member can click a hyperlink in the email body to jump to the multimedia file, for example to the video playback of the presentation at the specified time, that is, the start recording time at which the conference gist was recorded.
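Illustratively, the conference application receiving such a link would need to recover the conference number and the playback offset. The sketch below assumes links of the form scheme://conference_id=...&type=...&time=MM:SS with a placeholder scheme "confapp"; parse_jump_link is a hypothetical name, and how the application actually registers and handles its link scheme is not described in the embodiments.

```python
from urllib.parse import parse_qs


def parse_jump_link(link: str) -> dict:
    """Split a jump link into its parameters and compute the seek offset in seconds."""
    # Everything after "://" is treated as a query string.
    _, _, query = link.partition("://")
    params = {key: values[0] for key, values in parse_qs(query).items()}
    minutes, seconds = params["time"].split(":")
    params["offset_seconds"] = int(minutes) * 60 + int(seconds)
    return params


target = parse_jump_link("confapp://conference_id=12345&type=todo&time=20:00")
print(target["conference_id"], target["offset_seconds"])
```

A player could then open the multimedia file of the given conference and seek to offset_seconds.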

According to the embodiments of the present application, a floating layer for recording the meeting summary is generated during the conference, so that the user can input summary data during the conference and the start recording time at which the summary data is input is determined. After the conference ends, the meeting summary can be generated quickly from the summary data input by the user and the start recording times. If the generated meeting summary needs to be edited, the position of the summary data within the multimedia file of the conference can be located quickly from the start recording time, so the multimedia file does not need to be reviewed from the beginning, which improves the efficiency of generating the meeting summary.

Because the foregoing embodiments are described with reference to and in combination with other embodiments, different embodiments share common parts. For the same or similar parts, the embodiments in this specification may be referred to one another, and the details are not repeated here.

It should be noted that in this specification, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a circuit structure, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such circuit structure, article, or apparatus. Without further limitation, an element introduced by the phrase "comprises a ..." does not exclude the presence of other identical elements in the circuit structure, article, or apparatus that comprises the element.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure of the invention herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

The above embodiments of the present application are not intended to limit the scope of the present application.

Claims

1. A display device, characterized by comprising:

a display;

a controller in communication with the display, the controller configured to:

receiving triggering operation of a meeting summary control input by a user;

2. The display device of claim 1, wherein obtaining summary data entered on the conference recording float comprises:

3. The display device of claim 1, wherein obtaining summary data entered on the conference recording float comprises:

4. The display device of claim 1, wherein obtaining summary data entered on the conference recording float comprises:

5. The display device of claim 1, wherein obtaining summary data entered on the conference recording float comprises:

6. The display device of claim 1, wherein generating a meeting record containing a start record time and the summary data comprises:

7. The display device of claim 6, wherein the meeting summary control is a summary record control or a to-do record control or a question-answer record control, the summary type is obtained from control data of the meeting summary control, the summary type in the control data of the summary record control is a summary record type, the summary type in the control data of the to-do record control is a to-do record type, and the summary type in the control data of the question-answer record control is a question-answer record type.

8. The display device of claim 1, wherein the controller is further configured to:

9. A method for generating a conference summary, characterized by comprising:

receiving triggering operation of a meeting summary control input by a user;

10. The method for generating a meeting summary according to claim 9, wherein obtaining summary data entered on the meeting record floating layer comprises: