WO2018069580A1 - Interactive collaboration tool - Google Patents
- Publication number
- WO2018069580A1 (PCT/FI2017/050719)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- visualisation
- attendees
- meeting
- key term
- key
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/02—Details
- H04L12/16—Arrangements for providing special services to substations
- H04L12/18—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
- H04L12/1813—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
- H04L12/1831—Tracking arrangements for later retrieval, e.g. recording contents, participants activities or behavior, network status
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/06—Message adaptation to terminal or network requirements
- H04L51/066—Format adaptation, e.g. format conversion or compression
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
Definitions
- Fig. 2 shows a flow chart illustrating a method of an embodiment of the invention.
- the forming of the dynamic key term visualisation is preferably performed repeatedly during the meeting, for example during pauses in speech of the participants, with given intervals or when a given amount of speech has been converted to text.
- the attendees can be provided with a visualization that illustrates the main items discussed in the meeting.
- Such an illustration of the main items facilitates understanding and developing the topic of the meeting.
- the dynamic development of the key term visualisation facilitates collaboration by the attendees by enabling on-line and/or offline development of the illustration.
- a user is enabled to combine key terms into a combined expression.
- the combined expression can be usable, for example, for defining action points and/or summarizing the meeting.
- the method comprises in an embodiment associating the combined expression with a given person or group of persons.
- the associating comprises, for example, forming a calendar entry using the combined expression for the given person or group of persons.
- the method comprises in an embodiment comparing the dynamic key term visualisation with earlier formed key term visualisations, for example to identify earlier work relating to the topic of the meeting, to detect developments made since that earlier work, and/or to identify the contribution of the attendants to the progress of the meeting or to the development since the earlier work.
- the modifying of the key terms comprises, for example, adding or removing key terms.
- the modifying of the key terms may comprise editing the key terms.
- the method comprises detecting the attendee whose speech is being received; and/or presenting the identified key terms together with an indication of the related attendee or attendees.
- the method comprises in an embodiment detecting from received speech such attendees whose speech has been received for less than a set minimum proportion of the time, and prompting comments from such attendees.
- the detecting of the attendee whose speech is being received can be performed, for example, by identifying an individual channel used by the attendee or by use of voice recognition.
- the method comprises in an embodiment receiving a topic of the meeting and maintaining the text associated with the topic.
- the method preferably comprises storing the dynamic key term visualisation in a repository accessible by the attendees.
- the method may comprise reporting the key term visualisation to one or more persons associated with the attendees.
- the speech is converted to text using a browser operated web service.
- the method is implemented in an embodiment with support for mobile devices and applications configured to interface with the user 110.
- the user 110 logs into a service of an embodiment and creates a title for a topic. If the topic is new, the user is prompted to create a new session. The user may be prompted, for example, to give a title name and date and time for the session and to add desired participants for this specific session.
- the user is shown a dashboard view of all the sessions the user has created or participated in, wherein one topic may comprise plural sessions.
- Fig. 4 illustrates a dashboard view of an embodiment.
- during a session, all the spoken audio of the participants is recorded.
- the participants are preferably able to edit and add text while the recording is paused, see Fig. 5.
- a canvas is shown (e.g. to the participants and optionally other authorized people) to enable further editing of the session outcomes.
- This editing can be performed by allowing the users to position key terms on the canvas as they like.
- Content may also be added to or removed from the canvas, e.g. by typing new content into a text box after clicking on a desired position on the canvas, or by dragging out or clicking an existing entry.
- a view and further processing of the session results can be arranged by enabling users to work with a word list shown with associated scores in which key terms can be selected for editing their text and/or score (e.g. estimated relevance to the topic of the session).
- the text converted from speech can be split into key terms based on various probabilistic models. For example, frequency of various words or phrases can be compared to a reference corpus to determine how much their use frequency differs either being greater or smaller than in the reference corpus.
- the reference corpus can be selected from the same or associated topic to reduce significance of likely trivial items.
- a further key term visualisation is presented based on a search on results from outside of the present session. For example, a user may search in other recorded sessions regarding the same or another topic, or in another source such as the Internet, and the search results can be presented as another key term visualisation for comparison.
- the method is preferably performed automatically so that the key term visualisation of a session is built and updated as an ongoing process while the session progresses, so that the participants can obtain an instantaneous and dynamically developing graphical presentation of the progress of the content of the session.
- the method may help to remove need for manual summarizing of topics and recording of future action points and reduce the need to communicate between the attendees or other persons who should be aware of the progress or results of the session.
- the method may further enable combining massive data sets interactively with the spoken contribution of the attendees, thus providing a new kind of search and interaction tool. For example, during the session, thousands of comparisons and decisions may be performed per second while the key term visualisation is updated. Such fast computation may be particularly useful for updating the visualization for the participants during natural pauses in speech, and thus avoids forcing people to interrupt their normal interaction.
- Fig. 3 shows a block diagram of an apparatus 300 suited for implementing an embodiment of the invention.
- the apparatus 300 can be used, depending on implementation, as a user terminal and/or computer server for implementing at least some parts of the method of Fig. 2. Notice that it is not necessary to run any part of the method of Fig. 2 as a network based service but instead, in some embodiments the functionalities are implemented locally.
- the apparatus 300 comprises a communication interface or input/output 310 for communicating with other entities via, for example, a local area network (LAN) port or mobile communication networks (e.g. UMTS, CDMA-2000, GSM); a processor 320; a user interface 330; and a memory 340.
- the memory 340 comprises a work memory 342 and a non-volatile memory 344 comprising computer program code 346 to be executed by the processor 320 in place and/or within the work memory 342.
- the non-volatile memory 344 can be used for storing additionally other long-lasting data such as user settings, database data for storing, for example, key term visualisation data.
- the processor 320 is, for example, formed of one or more of: a master control unit (MCU); a microprocessor; a digital signal processor (DSP); an application specific integrated circuit (ASIC); a field programmable gate array; a microcontroller.
- the processor 320 is capable of, for example, controlling the operation of the apparatus 300 using the computer program code 346.
- in an embodiment, video image is received in addition to the recorded speech, so that the video image, or still images taken from it, can be stored and presented when displaying any derivative information based on the recorded speech.
- the words of the word cloud or other visualization may be associated with a respective portion of speech.
- a respective portion of received speech can be replayed.
- video image or still images of the video can be presented on accessing the word of the visualization.
- the key terms of the visualization are associated with respective portions of recorded speech or video and replayed on accessing the key terms e.g. through the visualization.
- the key terms of the visualization are used as search terms on accessing of the key terms.
- a supplementary information search is automatically performed from the Internet or an inter-organisation data repository.
- the supplementary search is performed in advance for use of contemporary material and the search results are then presented on accessing of the respective key term.
- key terms or phrases can be stored to an idea bank for subsequent use and stored key terms or phrases can be automatically searched and retrieved from the idea bank through associated key terms.
- the user may be provided with related information.
- the related information may comprise any of a key term visualization; recording of speech; recording of video; and written note.
- productivity of different people and groups of people is automatically measured by computing the amount of or relevance of the produced key term visualisations to subsequent work within the organization.
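The association of key terms with respective portions of recorded speech or video, described above, could be realised with a simple inverted index from terms to time offsets. The following Node-style JavaScript is a minimal sketch, not the patent's implementation; the segment shape (`{ startMs, endMs, text }`) and the whitespace tokenisation are assumptions of this example.

```javascript
// Sketch: index key terms by the recorded speech segments in which they
// occur, so that accessing a term in the visualisation can replay the
// matching portions of the recording.
function buildReplayIndex(segments) {
  const index = {};
  segments.forEach((seg, i) => {
    // Naive tokenisation; a real system would reuse the NLP pipeline.
    const words = seg.text.toLowerCase().split(/\W+/).filter(Boolean);
    for (const w of new Set(words)) {
      (index[w] = index[w] || []).push({
        segment: i,
        startMs: seg.startMs,
        endMs: seg.endMs,
      });
    }
  });
  return index;
}

// Accessing a key term then yields the offsets to replay:
// buildReplayIndex(segments)['budget'] -> [{ segment, startMs, endMs }, ...]
```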
Abstract
A method and computer program for receiving (210) recorded speech of plural attendees of a meeting; converting (220) recorded speech to text during the meeting; enabling editing (230) of the text by the attendees during the meeting; identifying (240) key terms from the edited text; forming (250) a dynamic key term visualisation from the identified key terms; and enabling modifying (260) of the key terms by the attendees after the forming of the key term visualisation and correspondingly updating the dynamic key term visualisation.
Description
INTERACTIVE COLLABORATION TOOL
TECHNICAL FIELD
The present invention generally relates to an interactive collaboration tool.
BACKGROUND ART
This section illustrates useful background information without admission that any technique described herein is representative of the state of the art. Collaboration of people typically involves a bidirectional exchange of information in which thoughts and ideas are jointly developed by the participants to facilitate assessment of the objectives and/or of the solutions or approaches to reach those objectives. This is often implemented by holding a meeting in which the attendants hold discussions based on an agenda, and someone takes and afterwards circulates minutes of the meeting. Action points may be written in the minutes to designate who should take care of the various work items that were decided on.
Tools exist to facilitate the forming of the minutes, and various techniques further exist to facilitate quick access to information in the minutes. For example, US 9,035,996 B1 discloses that recordings made with participants' computing devices can be processed into a transcript in which each participant is identified. The processed text from the transcript can be displayed in a word cloud using a variety of formats and techniques, including alphabetization, font size differences for emphasis and color differences for emphasis, or other formats that provide distinction to the text within the word cloud. The word cloud can be displayed as a summary of the transcript and used to pique the interest of a user seeking to join a multi-device video communication session, and a user can search recorded content by selecting a text element within the word cloud. Existing tools such as that of US 9,035,996 B1 may facilitate visualization of a meeting, and particularly access to the content of its recording, but they do not particularly help the collaboration while a meeting is in progress or after the meeting is over. Typically, meetings involve the presence of plural people and initiate thought processes that do not stop when the meeting ends. Sometimes some of the participants may discuss further after the meeting, informally at other occasions when in touch, but such discussions would not be conveyed to the other attendants. Moreover, the automatic formation of the word cloud can only make an effort to summarize the meeting by drawing on the minutes using computerized techniques such as frequency analysis. Such summaries may better identify new business jargon terms than point to the actually important topics, because people tend to paraphrase others rather than echo the same words, both to avoid wasting time and to indicate having understood previous speakers.
Meetings as such may be useful for the dissemination of information, but often some further action should be taken and some co-ordination of work is needed. Traditionally, the agenda of the next meeting includes verification of the progress on the action points recorded in the minutes of the previous meeting, and the attendants may also use the minutes as a reminder of any tasks assigned to them. However, the tools that may help to visualize and summarize the discussions during meetings do not offer particular support for post-meeting collaboration.
It is an object of the present invention to avoid or mitigate the aforementioned problems or at least to provide new technical alternatives to the state of the art.
SUMMARY
According to a first aspect of the invention there is provided a method comprising: receiving recorded speech of plural attendees of a meeting;
converting recorded speech to text during the meeting;
enabling editing of the text by the attendees during the meeting;
identifying key terms from the edited text;
forming a dynamic key term visualisation from the identified key terms;
enabling modifying of the key terms by the attendees after the forming of the key term visualisation and correspondingly updating the dynamic key term visualisation.
The key term visualization may comprise or be any of a graphic significance presentation of a sub-set of the key terms; and a numeric chart of a sub-set of the key terms; and a word cloud. The sub-set of key terms may comprise key terms ranked by descending score. The word cloud may be formed with words and/or phrases.
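As a sketch of how a ranked sub-set of key terms could drive a word cloud, the following Node-style JavaScript maps the top-N key terms, ranked by descending score, to font sizes. The top-N cut-off and the pixel range are illustrative choices of this example, not details disclosed in the patent.

```javascript
// Sketch: rank key terms by descending score, take a top-N sub-set, and
// scale each term's score linearly into a font size for a word cloud.
function wordCloudLayout(scores, n = 3, minPx = 12, maxPx = 48) {
  const ranked = Object.entries(scores)
    .sort((a, b) => b[1] - a[1]) // descending score
    .slice(0, n);
  const top = ranked[0][1];
  const bottom = ranked[ranked.length - 1][1];
  const span = (top - bottom) || 1; // avoid division by zero for equal scores
  return ranked.map(([term, score]) => ({
    term,
    sizePx: Math.round(minPx + ((score - bottom) / span) * (maxPx - minPx)),
  }));
}
```

The same ranked list could equally feed the numeric-chart form of the visualization mentioned above.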
The forming of the dynamic key term visualisation may be performed repeatedly during the meeting. The forming of the dynamic key term visualisation may be performed repeatedly during pauses in speech of the participants. The attendees may be enabled to modify the key terms in real-time during the meeting. The meeting may be an on-line meeting in which at least one attendee participates over a data connection. Alternatively, the meeting may be a face-to-face meeting in which all participants physically meet each other. The key terms may comprise keywords. The key terms may comprise key phrases.
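One way the repeated forming during pauses could be triggered is by watching the gaps between timestamped speech-to-text chunks. This is a minimal sketch under assumptions of this example (the chunk shape `{ startMs, endMs }` and the 2-second threshold), not a detail from the patent.

```javascript
// Sketch: detect pauses in the incoming speech by finding gaps between
// consecutive transcribed chunks; each detected pause is an opportunity
// to rebuild the key term visualisation without interrupting anyone.
function findPauses(chunks, minGapMs = 2000) {
  const pauses = [];
  for (let i = 1; i < chunks.length; i++) {
    const gap = chunks[i].startMs - chunks[i - 1].endMs;
    if (gap >= minGapMs) {
      pauses.push({ afterChunk: i - 1, gapMs: gap });
    }
  }
  return pauses;
}

// A caller could rebuild the visualisation whenever a pause is found:
// if (findPauses(recentChunks).length > 0) rebuildVisualisation();
```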
By forming a dynamic key term visualisation based on the recorded speech and enabling modifying of the dynamic key term visualisation by the attendees, the attendees may be provided with a visualization that illustrates the main items discussed in the meeting. The illustration of the main items may facilitate understanding and developing the topic of the meeting. Moreover, the dynamic development of the key term visualisation may facilitate collaboration by the attendees by enabling on-line and/or offline development of the illustration. The method may comprise enabling a user to combine key terms into a combined expression. The combined expression may be usable for defining action points and/or summarizing the meeting.
The method may comprise associating the combined expression with a given person or group of persons. The associating may comprise forming a calendar entry using the combined expression for the given person or group of persons.
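The calendar entry mentioned above could, for example, be emitted as a minimal iCalendar (RFC 5545) VEVENT. The field values and the UID scheme below are illustrative assumptions; a production tool would likely use a calendar library rather than assembling the text by hand.

```javascript
// Sketch: form a minimal iCalendar VEVENT from a combined expression and
// the persons it is associated with.
function toCalendarEntry(expression, attendees, startUtc, uid) {
  return [
    'BEGIN:VEVENT',
    `UID:${uid}`,
    `DTSTART:${startUtc}`,
    `SUMMARY:${expression}`, // the combined expression becomes the title
    ...attendees.map(mail => `ATTENDEE:mailto:${mail}`),
    'END:VEVENT',
  ].join('\r\n');
}
```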
The method may comprise comparing the dynamic key term visualisation with earlier formed key term visualisations. The comparison may be used to identify earlier work relating to the topic of the meeting. The comparison may be used to detect developments made since earlier work relating to the topic. The comparison may be used to identify the contribution of the attendants to the propagation of the meeting and / or to the development since the earlier work relating to the topic.
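The comparison with earlier formed key term visualisations could, for instance, be a cosine similarity over their term-score maps. Treating a visualisation as a plain `{ term: score }` object is an assumption made here for illustration; the patent does not specify a comparison metric.

```javascript
// Sketch: cosine similarity between two key term visualisations, each
// represented as a map from term to score. A high similarity suggests
// earlier work on the same topic; a low one suggests new developments.
function visualisationSimilarity(a, b) {
  const terms = new Set([...Object.keys(a), ...Object.keys(b)]);
  let dot = 0, normA = 0, normB = 0;
  for (const t of terms) {
    const x = a[t] || 0, y = b[t] || 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```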
The modifying of the key terms may comprise adding or removing key terms. Alternatively or additionally, the modifying of the key terms may comprise editing the key terms.
The method may comprise detecting the attendee whose speech is being received. The method may comprise presenting the identified key terms together with an indication of the related attendee or attendees. The method may comprise detecting from received speech such attendees whose speech has been received for less than a set minimum proportion and prompting comments from such attendees.
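Detecting attendees whose speech falls below a set minimum proportion could be sketched as follows, assuming per-attendee speech durations are tracked (e.g. per channel or via voice recognition). The bookkeeping shape and the 10% default threshold are assumptions of this example.

```javascript
// Sketch: find attendees whose share of the received speech is below a
// minimum proportion, so the tool can prompt them for comments.
function quietAttendees(speechMsByAttendee, minShare = 0.1) {
  const total = Object.values(speechMsByAttendee).reduce((s, v) => s + v, 0);
  if (total === 0) return [];
  return Object.keys(speechMsByAttendee)
    .filter(name => speechMsByAttendee[name] / total < minShare);
}
```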
The detecting of the attendee whose speech is being received may be performed by identifying an individual channel used by the attendee or by use of voice recognition.
The method may comprise receiving a topic of the meeting and maintaining the text associated with the topic.
The method may further comprise storing the dynamic key term visualisation in a repository accessible by the attendees. The method may comprise reporting the key term visualisation to one or more persons associated with the attendees.
The speech may be converted to text using a browser operated web service. The service may be implemented using mobile devices and dedicated application(s).
The method may be performed in a network based service. The network based service may be implemented using a NodeJS backend. The network based service may be run by a dedicated server and/or a cloud computing system.
The method may comprise providing a browser based user interface for the attendees. The browser based user interface may be implemented using WebRTC.
According to a second aspect of the invention there is provided a computer program comprising computer executable program code which when executed by at least one processor causes an apparatus to perform the method of the first aspect.
According to a third aspect of the invention there is provided an apparatus comprising a memory configured to store the computer program of the second aspect and a processor configured to control operation of the apparatus according to the computer program.
According to a fourth aspect of the invention there is provided a computer program product comprising a non-transitory computer readable medium having the computer program of the second aspect stored thereon. Any foregoing memory medium may comprise a digital data storage such as a data disc or diskette, optical storage, magnetic storage, or opto-magnetic storage. The memory medium may be formed into a device without other substantial functions than storing memory or it may be formed as part of a device with other functions, including but not limited to a memory of a computer, a chip set, and a sub assembly of an electronic device.
Different non-binding aspects and embodiments of the present invention have been illustrated in the foregoing. The embodiments in the foregoing are used merely to explain selected aspects or steps that may be utilized in implementations of the present invention. Some embodiments may be presented only with reference to certain aspects of the invention. It should be appreciated that corresponding embodiments may apply to other aspects as well.
BRIEF DESCRIPTION OF DRAWINGS
Some embodiments of the invention will be described with reference to the accompanying drawings, in which:
Fig. 1 shows a schematic picture of a system according to an embodiment of the invention;
Fig. 2 shows a flow chart illustrating a method of an embodiment of the invention;
Fig. 3 shows a block diagram of an apparatus suited for implementing an embodiment of the invention;
Fig. 4 illustrates a dashboard view of an embodiment; and
Fig. 5 shows a modification view allowing participants to edit and add text.
DETAILED DESCRIPTION
In the following description, like reference signs denote like elements or steps.
Fig. 1 shows a schematic picture of a system 100 according to an embodiment of the invention. The system has plural users 110, a web browser user interface 120 and an application interface 130 for use with a browser or dedicated applications or apps, such as tablet computer or mobile phone apps. The web browser user interface 120 and the application interface 130 are connected to a speech recognition web service 150 (using WebRTC, for example) and to a NodeJS backend 140, which are connected to an information extraction and scoring engine 160 that extracts and scores key terms (e.g. keywords and/or key phrases) based on natural language processing with scalable information matching.
The architecture of Fig. 1 can be implemented using discrete elements such as separate servers or cloud services that provide the different functionalities. On the other hand, some or all of the services used by the users 110 can be provided in common by one or more entities.
Fig. 2 shows a flow chart illustrating a method of an embodiment of the invention.
The method comprises:
210. Receiving recorded speech of plural attendees of a meeting;
220. Converting recorded speech to text during the meeting;
230. Enabling editing of the text by the attendees during the meeting;
240. Identifying key terms from the edited text;
250. Forming a dynamic key term visualisation from the identified key terms; and
260. Enabling modifying of the key terms by the attendees after the forming of the key term visualisation and correspondingly updating the dynamic key term visualisation.
The forming of the dynamic key term visualisation is preferably performed repeatedly during the meeting, for example during pauses in the speech of the participants, at given intervals, or when a given amount of speech has been converted to text.
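The forming step can be sketched in code. The following is a minimal, illustrative sketch (not from the patent; function and variable names are the editor's own): a term-frequency model of the converted text is rebuilt whenever new text arrives, producing weighted terms that a renderer could lay out word-cloud style.

```javascript
// Sketch: forming a simple dynamic key term visualisation model.
// buildVisualisation() can be re-run whenever newly converted text
// arrives, e.g. during pauses in speech or at given intervals.
const STOP_WORDS = new Set(['the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'is', 'we']);

function buildVisualisation(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z]+/g) || []) {
    if (STOP_WORDS.has(word)) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // Sort by frequency so a renderer can scale font sizes accordingly.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([term, weight]) => ({ term, weight }));
}

// Repeated forming during the meeting: append new text, rebuild.
let transcript = 'budget review and budget planning';
let visualisation = buildVisualisation(transcript);
transcript += ' planning of the next budget round';
visualisation = buildVisualisation(transcript);
```

A production system would use the natural language processing engine 160 rather than plain frequencies; this only illustrates the rebuild-on-update flow.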
By forming a dynamic key term visualisation based on the recorded speech and enabling modifying of the dynamic key term visualisation by the attendees, the attendees can be provided with a visualisation that illustrates the main items discussed in the meeting. Such an illustration of the main items facilitates understanding and developing the topic of the meeting. Moreover, the dynamic development of the key term visualisation facilitates collaboration by the attendees by enabling on-line and/or off-line development of the illustration.
In an embodiment, a user is enabled to combine key terms into a combined expression. The combined expression can be used, for example, for defining action points and/or summarizing the meeting.
The method comprises in an embodiment associating the combined expression with a given person or group of persons. The associating comprises, for example, forming a calendar entry using the combined expression for the given person or group of persons.
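As an illustrative sketch of the combining and associating described above (all names and the entry shape are the editor's assumptions, not from the patent), key terms can be joined into a combined expression and attached to a person as a calendar-style action point:

```javascript
// Sketch: combining key terms into a combined expression and
// associating it with a given person, e.g. as a calendar entry.
function combineKeyTerms(terms) {
  return terms.join(' ');
}

function toCalendarEntry(combinedExpression, person, dueDate) {
  return {
    title: combinedExpression,
    assignee: person,
    due: dueDate,
    kind: 'action-point',
  };
}

const expression = combineKeyTerms(['review', 'budget', 'draft']);
const entry = toCalendarEntry(expression, 'Janet Hawkins', '2017-11-01');
```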
The method comprises in an embodiment comparing the dynamic key term visualisation with earlier formed key term visualisations. The comparison can be used, for example, for any of: identifying earlier work relating to the topic of the meeting; detecting developments made since earlier work relating to the topic; and identifying the contribution of the attendees to the progress of the meeting and/or to the development since the earlier work relating to the topic.
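One way such a comparison could work (a sketch under the editor's assumptions; the patent does not specify a similarity measure) is to model each visualisation as a term-to-weight map, score overlap with an earlier visualisation, and list terms that are new since the earlier session:

```javascript
// Sketch: comparing a current key term visualisation with an earlier one.
// A high cosine similarity suggests related earlier work; the set
// difference lists developments made since that work.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const [term, w] of a) {
    na += w * w;
    if (b.has(term)) dot += w * b.get(term);
  }
  for (const w of b.values()) nb += w * w;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

function newTerms(current, earlier) {
  return [...current.keys()].filter((t) => !earlier.has(t));
}

const earlier = new Map([['budget', 3], ['planning', 2]]);
const current = new Map([['budget', 2], ['planning', 1], ['hiring', 4]]);
const similarity = cosineSimilarity(current, earlier);
const developments = newTerms(current, earlier);
```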
The modifying of the key terms comprises, for example, adding or removing key terms. Alternatively or additionally, the modifying of the key terms may comprise editing the key terms.
In an embodiment, the method comprises detecting the attendee whose speech is being received and/or presenting the identified key terms together with an indication of the related attendee or attendees. The method comprises in an embodiment detecting from the received speech attendees whose speech has been received for less than a set minimum proportion of the time and prompting comments from such attendees.
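A minimal sketch of the minimum-proportion check (names and the data shape are illustrative assumptions): accumulate per-attendee talk time, e.g. per audio channel, and flag anyone below the threshold so the tool can prompt them.

```javascript
// Sketch: detecting attendees whose share of the received speech falls
// below a set minimum proportion of the total talk time.
function quietAttendees(speechSeconds, minProportion) {
  const total = Object.values(speechSeconds).reduce((s, v) => s + v, 0);
  if (total === 0) return Object.keys(speechSeconds);
  return Object.keys(speechSeconds).filter(
    (name) => speechSeconds[name] / total < minProportion
  );
}

// Per-attendee talk time in seconds, e.g. accumulated per channel.
const talkTime = { Cynthia: 300, Keith: 250, Kathryn: 20 };
const toPrompt = quietAttendees(talkTime, 0.1);
```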
The detecting of the attendee whose speech is being received can be performed, for example, by identifying an individual channel used by the attendee or by use of voice recognition.
The method comprises in an embodiment receiving a topic of the meeting and maintaining the text associated with the topic.
The method preferably comprises storing the dynamic key term visualisation in a repository accessible by the attendees. The method may comprise reporting the key term visualisation to one or more persons associated with the attendees.
In an embodiment, the speech is converted to text using a browser operated web service.
The method is implemented in an embodiment with support for mobile devices and applications configured to interface with the user 110.
An example use case is described next. First, the user 110 logs into a service of an embodiment and creates a title for a topic. If the topic is new, the user is prompted to create a new session. The user may be prompted, for example, to give a title, a date and a time for the session and to add desired participants for this specific session. The user is shown a dashboard view of all the sessions the user has created or participated in, wherein one topic may comprise plural sessions. Fig. 4 illustrates a dashboard view of an embodiment.
During a session, all the spoken audio of the participants is recorded. The participants are preferably able to edit and add text while the recording is paused, see Fig. 5. After the session ends, the results of the session are processed and a canvas is shown (e.g. to the participants and optionally other authorized people) to enable further editing of the session outcomes. This editing can be performed by allowing the users to position key terms on the canvas as they like. Content may also be added to or removed from the canvas, e.g. by typing new content into a text box after clicking on a desired position on the canvas, or by dragging out or clicking an existing entry. Alternatively or additionally to canvas based modification, a view for further processing of the session results can be arranged by enabling users to work with a word list shown with associated scores, in which key terms can be selected for editing their text and/or score (e.g. estimated relevance to the topic of the session).
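The canvas and word list described above can be sketched as a small data model (an illustrative sketch; the class and method names are the editor's own, not an API from the patent): entries carry a position and a score, and participants can add, move, re-score and remove them.

```javascript
// Sketch: a minimal canvas model for post-session editing.
class Canvas {
  constructor() {
    this.entries = new Map(); // term -> { x, y, score }
  }
  add(term, x, y, score = 1) {
    this.entries.set(term, { x, y, score });
  }
  move(term, x, y) {
    const e = this.entries.get(term);
    if (e) Object.assign(e, { x, y });
  }
  setScore(term, score) {
    const e = this.entries.get(term);
    if (e) e.score = score;
  }
  remove(term) {
    this.entries.delete(term);
  }
  // Word list view: terms with their scores, highest relevance first.
  wordList() {
    return [...this.entries.entries()]
      .map(([term, { score }]) => ({ term, score }))
      .sort((a, b) => b.score - a.score);
  }
}

const canvas = new Canvas();
canvas.add('budget', 10, 20, 0.8);
canvas.add('hiring', 40, 60, 0.5);
canvas.setScore('hiring', 0.9);
canvas.remove('budget');
```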
The text converted from speech can be split into key terms based on various probabilistic models. For example, the frequency of various words or phrases can be compared to a reference corpus to determine how much their frequency of use differs, being either greater or smaller than in the reference corpus. The reference corpus can be selected from the same or an associated topic to reduce the significance of likely trivial items.
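The reference corpus comparison can be sketched as a frequency-ratio score (a sketch under the editor's assumptions; the patent names no particular statistic): each candidate term is scored by the ratio of its relative frequency in the session text to its relative frequency in the reference corpus, with add-one smoothing for terms the corpus has not seen. Common words score near or below 1; session-specific terms score much higher.

```javascript
// Sketch: scoring candidate key terms against a reference corpus.
function relativeFrequencies(words) {
  const counts = new Map();
  for (const w of words) counts.set(w, (counts.get(w) || 0) + 1);
  return { counts, total: words.length };
}

function scoreAgainstCorpus(sessionWords, corpusWords) {
  const session = relativeFrequencies(sessionWords);
  const corpus = relativeFrequencies(corpusWords);
  const scores = new Map();
  for (const [word, count] of session.counts) {
    const pSession = count / session.total;
    // Add-one smoothing so unseen words do not divide by zero.
    const pCorpus = ((corpus.counts.get(word) || 0) + 1) / (corpus.total + 1);
    scores.set(word, pSession / pCorpus);
  }
  return scores;
}

const sessionWords = ['the', 'roadmap', 'roadmap', 'the', 'launch'];
const corpusWords = ['the', 'the', 'the', 'of', 'and', 'a', 'to', 'the'];
const scores = scoreAgainstCorpus(sessionWords, corpusWords);
```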
In an embodiment, a further key term visualisation is presented based on a search on results from outside of the present session. For example, a user may search other recorded sessions regarding the same or another topic, or another source such as the Internet, and the search results can be presented as another key term visualisation for comparison.
The method is preferably performed automatically so that the key term visualisation of a session is built and updated as an ongoing process while the session progresses, so that the participants can obtain an instantaneous and dynamically developing graphical presentation of the progress of the content of the session. The method may help to remove the need for manual summarizing of topics and recording of future action points, and reduce the need to communicate between the attendees or other persons who should be aware of the progress or results of the session. The method may further enable combining massive data sets interactively with the spoken contribution of the attendees, thus providing a new search and interaction tool. For example, during the session, thousands of comparisons and decisions may be performed per second while the key term visualisation is updated. Such fast computation may be particularly useful for updating the visualisation for the participants during natural pauses in speech, thus avoiding the need for people to interrupt their normal interaction.
Fig. 3 shows a block diagram of an apparatus 300 suited for implementing an embodiment of the invention. The apparatus 300 can be used, depending on implementation, as a user terminal and/or computer server for implementing at least some parts of the method of Fig. 2. Notice that it is not necessary to run any part of the method of Fig. 2 as a network based service but instead, in some embodiments the functionalities are implemented locally.
The apparatus 300 comprises a communication interface or input/output 310 for communicating with other entities via, for example, a local area network (LAN) port or mobile communication networks (e.g. UMTS, CDMA-2000, GSM), a processor 320, a user interface 330 and a memory 340. The memory 340 comprises a work memory 342 and a non-volatile memory 344 comprising computer program code 346 to be executed by the processor 320 in place and/or within the work memory 342. The non-volatile memory 344 can additionally be used for storing other long-lasting data such as user settings and database data, for example key term visualisation data.
The processor 320 is, for example, formed of one or more of: a master control unit (MCU); a microprocessor; a digital signal processor (DSP); an application specific integrated circuit (ASIC); a field programmable gate array; a microcontroller. The processor 320 is capable of, for example, controlling the operation of the apparatus 300 using the computer program code 346.
Various embodiments have been presented. It should be appreciated that in this document, the words "comprise", "include" and "contain" are each used as open-ended expressions with no intended exclusivity.
For example, a video image is received in an embodiment in addition to the recorded speech, so that video images or still images of the video can also be stored and presented when displaying any derivative information based on the recorded speech. For example, the words of the word cloud or other visualisation may be associated with a respective portion of the speech. On accessing a word of the visualisation, a respective portion of the received speech can be replayed. Alternatively or additionally, video images or still images of the video can be presented on accessing the word of the visualisation. In an embodiment, the key terms of the visualisation are associated with respective portions of the recorded speech or video and replayed on accessing the key terms, e.g. through the visualisation.
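The association between key terms and recording portions can be sketched as an inverted index from term to time ranges (an illustrative sketch; the segment format and function names are the editor's assumptions): accessing a term then yields the ranges to replay.

```javascript
// Sketch: associating key terms with portions of recorded speech so
// that accessing a term in the visualisation replays the segment.
const segments = [
  { start: 0, end: 12, text: 'we should revisit the budget' },
  { start: 12, end: 30, text: 'hiring plan for next quarter' },
];

// Build an index: term -> list of [start, end] ranges it was spoken in.
function indexTerms(segs) {
  const index = new Map();
  for (const { start, end, text } of segs) {
    for (const word of text.split(/\s+/)) {
      if (!index.has(word)) index.set(word, []);
      index.get(word).push([start, end]);
    }
  }
  return index;
}

// Called when a user accesses (e.g. clicks) a term in the visualisation.
function portionsFor(index, term) {
  return index.get(term) || [];
}

const index = indexTerms(segments);
const replayRanges = portionsFor(index, 'budget');
```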
In one example, the key terms of the visualization are used as search terms on accessing of the key terms. For example, on clicking a key term of the visualization, a supplementary information search is automatically performed from the Internet or an inter-organisation data repository. In another example, the supplementary search is performed in advance for use of contemporary material and the search results are then presented on accessing of the respective key term.
In an example, key terms or phrases can be stored in an idea bank for subsequent use, and stored key terms or phrases can be automatically searched and retrieved from the idea bank through associated key terms. For example, on accessing a visualized key term, the user may be provided with related information. The related information may comprise any of: a key term visualisation; a recording of speech; a recording of video; and a written note. Thanks to the speed of computers, the automatic searching and retrieving of the stored key terms or phrases can be performed even during normal breathing pauses or changes of speaker turn in a normal meeting, in a manner that would be impossible to implement manually.
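The idea bank lookup can be sketched as follows (an illustrative sketch; the class, entry shapes and identifiers are the editor's inventions): entries are stored with their associated key terms and retrieved through any one of them.

```javascript
// Sketch: an idea bank whose entries are retrieved via associated
// key terms, so accessing a visualised term surfaces related material.
class IdeaBank {
  constructor() {
    this.entries = [];
  }
  store(keyTerms, related) {
    // 'related' may reference a visualisation, a recording, or a note.
    this.entries.push({ keyTerms: new Set(keyTerms), related });
  }
  retrieve(term) {
    return this.entries
      .filter((e) => e.keyTerms.has(term))
      .map((e) => e.related);
  }
}

const bank = new IdeaBank();
bank.store(['budget', 'planning'], { kind: 'note', text: 'Q3 budget draft' });
bank.store(['hiring'], { kind: 'recording', id: 'rec-42' });
const related = bank.retrieve('budget');
```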
In an example, productivity of different people and groups of people is automatically measured by computing the amount of or relevance of the produced key term visualisations to subsequent work within the organization. By automatically computing the subsequent use of work of earlier people and teams, thousands of different word cloud combinations can be compared and adaptively scored unlike with any existing manual methods.
The foregoing description has provided by way of non-limiting examples of particular implementations and embodiments of the invention a full and informative description of the best mode presently contemplated by the inventors for carrying out the invention. It is however clear to a person skilled in the art that the invention is not restricted to details of the embodiments presented in the foregoing, but that it can be implemented in other embodiments using equivalent means or in different combinations of embodiments without deviating from the characteristics of the invention.
Furthermore, some of the features of the afore-disclosed embodiments of this invention may be used to advantage without the corresponding use of other features. As such, the foregoing description shall be considered as merely illustrative of the principles of the present invention, and not in limitation thereof. Hence, the scope of the invention is only restricted by the appended patent claims.
Claims
1. A method comprising:
receiving (210) recorded speech of plural attendees of a meeting;
converting (220) recorded speech to text during the meeting;
enabling editing (230) of the text by the attendees during the meeting;
identifying (240) key terms from the edited text;
forming (250) a dynamic key term visualisation to the attendees from the identified key terms;
enabling modifying (260) of the key terms by the attendees after the forming of the key term visualisation and correspondingly updating the dynamic key term visualisation.
2. The method of claim 1, characterized in that the key term visualization comprises a word cloud.
3. The method of claim 1 or 2, characterized in that the forming of the dynamic key term visualisation is performed repeatedly during the meeting.
4. The method of any one of preceding claims, characterized in that the forming of the dynamic key term visualisation is performed repeatedly during pauses in speech of the participants.
5. The method of any one of preceding claims, characterized in that the method further comprises enabling a user to combine key terms to a combined expression.
6. The method of claim 5, characterized in that the method further comprises associating the combined expression with a given person or group of persons.
7. The method of any one of preceding claims, characterized in that the method further comprises comparing the dynamic key term visualisation with earlier formed key term visualisations.
8. The method of any one of preceding claims, characterized in that the method further comprises storing the dynamic key term visualisation in a repository accessible by the attendees.
9. The method of any one of preceding claims, characterized in that speech is converted to text using a browser operated web service or mobile devices with dedicated application support.
10. The method of any one of preceding claims, characterized in that the method is performed in a network based service using a NodeJS backend and a browser based user interface for the attendees that is implemented using WebRTC.
11. The method of any one of preceding claims, characterized in that the key terms of the visualization are associated to respective passages of the recorded speech.
12. The method of any one of preceding claims, characterized in that the key terms of the visualization are associated to respective video images or still images.
13. The method of any one of preceding claims, characterized in that the key terms of the visualization are associated to respective data available in the Internet or an inter-organisation data repository.
14. The method of any one of preceding claims, characterized in that productivity of different people and groups of people is automatically measured by computing the amount of or relevance of the produced key term visualisations to subsequent work within the organization.
15. A computer program comprising computer executable program code which when executed by at least one processor causes an apparatus to perform the method of any one of the preceding claims.
[Drawing pages 1/5 to 5/5 of the original publication: Fig. 2 is a flow chart with steps 210-260 as recited in the description; Figs. 4a and 4b are dashboard views showing session cards with title, author, date, description and participants (Cynthia Foster, Keith Johnston, Kathryn Day, Philip Watson); Fig. 5 is a modification view with participant list, comment field and a note that users should be able to edit and add text while the recording is paused.]
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FI20165781 | 2016-10-13 | ||
FI20165781 | 2016-10-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018069580A1 true WO2018069580A1 (en) | 2018-04-19 |
Family
ID=60201607
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FI2017/050719 WO2018069580A1 (en) | 2016-10-13 | 2017-10-13 | Interactive collaboration tool |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018069580A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113113017A (en) * | 2021-04-08 | 2021-07-13 | 百度在线网络技术(北京)有限公司 | Audio processing method and device |
CN113129895A (en) * | 2021-04-20 | 2021-07-16 | 上海仙剑文化传媒股份有限公司 | Voice detection processing system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100161619A1 (en) * | 2008-12-18 | 2010-06-24 | Lamere Paul B | Method and Apparatus for Generating Recommendations From Descriptive Information |
EP2288105A1 (en) * | 2009-08-17 | 2011-02-23 | Avaya Inc. | Word cloud audio navigation |
CA2692314A1 (en) * | 2010-02-08 | 2011-08-08 | Yellowpages.Com Llc | Systems and methods to provide search based on social graphs and affinity groups |
US20120179465A1 (en) * | 2011-01-10 | 2012-07-12 | International Business Machines Corporation | Real time generation of audio content summaries |
WO2012175556A2 (en) * | 2011-06-20 | 2012-12-27 | Koemei Sa | Method for preparing a transcript of a conversation |
US9035996B1 (en) | 2012-04-16 | 2015-05-19 | Google Inc. | Multi-device video communication session |
US20160171090A1 (en) * | 2014-12-11 | 2016-06-16 | University Of Connecticut | Systems and Methods for Collaborative Project Analysis |
Non-Patent Citations (1)
Title |
---|
"Learning Python with Raspberry Pi", 29 January 2014, ISBN: 978-1-118-71705-9, article BRADBURY & EVERARD: "Learning Python with Raspberry Pi", pages: 180 - 180, XP055430713 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17792108; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 17792108; Country of ref document: EP; Kind code of ref document: A1 |