WO2023053940A1 - Information processing system, program, and information processing method - Google Patents

Information processing system, program, and information processing method

Info

Publication number
WO2023053940A1
Authority
WO
WIPO (PCT)
Prior art keywords
script
information processing
area
processing system
data
Prior art date
Application number
PCT/JP2022/034197
Other languages
English (en)
Japanese (ja)
Inventor
崇義 松田
麻里 笹山
司 入日
Original Assignee
エピックベース株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by エピックベース株式会社 filed Critical エピックベース株式会社
Publication of WO2023053940A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0484: Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F 3/04842: Selection of displayed objects or displayed text elements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16: Sound input; Sound output
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/10: Text processing
    • G06F 40/166: Editing, e.g. inserting or deleting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00: Administration; Management
    • G06Q 10/10: Office automation; Time management
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L 51/04: Real-time or near real-time messaging, e.g. instant messaging [IM]

Definitions

  • the present invention relates to an information processing system, a program, and an information processing method.
  • Patent Document 1 discloses a prior art that recognizes voice and assists in creating minutes.
  • the present invention provides an information processing device and the like capable of recognizing characters and easily extracting summaries, important parts, and the like from proceedings.
  • an information processing system according to one embodiment includes a control unit.
  • the control unit is configured to execute each of the following steps.
  • in the display control step, an input screen for creating the minutes of a meeting is displayed, including a document creation area and a script display area.
  • the document creation area is configured to be able to display the results of non-voice input by a user during a meeting conducted by one or more users.
  • the script display area is configured to be able to display a script of the meeting generated based on the audio of the meeting.
  • in the specifying step, when a selection of the non-voice input displayed in the document creation area is received, the portion of the script corresponding to the time at which the non-voice input was entered is specified.
  • in the transfer step, when a selection of the script displayed in the script display area is received, the selected portion is transferred to the document creation area.
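As a concrete illustration of the three steps above, the following is a minimal sketch of the underlying data flow in Python. All names here (ScriptSegment, MinutesDocument, specify, transfer) are hypothetical and introduced only for illustration; the publication does not prescribe any particular implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ScriptSegment:
    start: float  # elapsed seconds from the start of the meeting
    end: float
    text: str     # transcribed speech for this time span

@dataclass
class MinutesDocument:
    # (elapsed_seconds, text) pairs entered by hand in the document creation area
    notes: list = field(default_factory=list)

def specify(script: list, note_time: float):
    """Specifying step: return the script portion covering the time a note was typed."""
    for seg in script:
        if seg.start <= note_time < seg.end:
            return seg
    return None

def transfer(doc: MinutesDocument, seg: ScriptSegment) -> None:
    """Transfer step: copy the selected script portion into the document creation area."""
    doc.notes.append((seg.start, seg.text))
```

For example, a note typed 76 seconds into the meeting is matched by specify() to the segment whose span contains t = 76, and transfer() appends that segment's text to the minutes.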
  • FIG. 1 is an example of a configuration diagram representing an information processing system 1.
  • FIG. 2 is an example of a block diagram showing a hardware configuration of a server 2;
  • FIG. 3 is an example of a block diagram showing a hardware configuration of a user terminal 3;
  • FIG. 4 is an example of a block diagram showing functions realized by a control unit 23 of the server 2;
  • FIG. 5 is an example of an activity diagram showing an outline of the information processing executed by the information processing system 1.
  • FIG. 6 is an example of minutes displayed on the display unit 34 of the user terminal 3 immediately after the end of the meeting and before speech recognition in the first embodiment.
  • FIG. 7 is an example of minutes displayed on the display unit 34 of the user terminal 3 after speech recognition in the first embodiment.
  • FIG. 8 is an example of minutes displayed on the display unit 34 of the user terminal 3 when the script is specified in the first embodiment.
  • FIG. 9 is an example of minutes displayed on the display unit 34 of the user terminal 3 when the script is transferred in the first embodiment.
  • FIG. 10 is an example of minutes displayed on the display unit 34 of the user terminal 3 in the second embodiment.
  • the program for realizing the software appearing in this embodiment may be provided as a non-transitory computer-readable medium, may be provided so as to be downloadable from an external server, or may be provided so that the program is started on an external computer and its functions are realized on a client terminal (so-called cloud computing).
  • the term “unit” may include, for example, a combination of hardware resources implemented by circuits in a broad sense and software information processing that can be specifically realized by these hardware resources.
  • various information is handled in the present embodiment. This information is represented, for example, by the physical values of signal values representing voltage and current, by the high and low of signal values as binary bit aggregates composed of 0s and 1s, or by quantum superposition (so-called quantum bits), and communication and operations can be performed on a circuit in a broad sense.
  • a circuit in a broad sense is a circuit realized by appropriately combining at least circuits, circuitry, processors, memories, and the like.
  • that is, it includes an application-specific integrated circuit (ASIC) and programmable logic devices (for example, a simple programmable logic device (SPLD), a complex programmable logic device (CPLD), and a field-programmable gate array (FPGA)).
  • Hardware Configuration: this section describes the hardware configuration of the first embodiment.
  • FIG. 1 is an example of a configuration diagram showing an information processing system 1.
  • the information processing system 1 includes a server 2, user terminals 3 (e.g., user terminal 3-1, user terminal 3-2, ..., user terminal 3-n), and sound collectors 4 (e.g., microphone 4-1, microphone 4-2, ..., microphone 4-n), which are connected through a network. Each of these components is described further below.
  • the system exemplified by the information processing system 1 consists of one or more devices or components.
  • FIG. 2 is an example of a block diagram showing the hardware configuration of the server 2. The server 2 has a communication unit 21, a storage unit 22, and a control unit 23, and these components are electrically connected via a communication bus 20 inside the server 2. Each component is described further below.
  • the communication unit 21 is preferably a wired communication means such as USB, IEEE1394, Thunderbolt (registered trademark), or wired LAN network communication, but may include wireless LAN network communication, mobile communication such as LTE/3G, Bluetooth (registered trademark) communication, and the like as needed; that is, it is more preferable to implement it as a set of these communication means. The server 2 communicates various information with the sound collector 4 and the user terminal 3 over the network via the communication unit 21. In particular, the server 2 is configured to receive audio data containing the user's speech from the sound collector 4. Details of these will be described later.
  • the storage unit 22 stores various information defined by the above description. It can be implemented, for example, as a storage device such as a solid-state drive (SSD) that stores various programs related to the server 2 executed by the control unit 23, or as a memory such as a random-access memory (RAM) that temporarily stores various information (arguments, arrays, etc.) related to program computation. A combination of these may also be used. In particular, the storage unit 22 stores voice data as recorded data. The storage unit 22 also stores various other programs related to the server 2 that are executed by the control unit 23.
  • the control unit 23 processes and controls overall operations related to the server 2 .
  • the control unit 23 is, for example, a central processing unit (CPU) (not shown).
  • the control unit 23 implements various functions related to the server 2 by reading a predetermined program stored in the storage unit 22 . That is, information processing by software stored in the storage unit 22 can be specifically realized by the control unit 23 which is an example of hardware, and can be executed as each functional unit included in the control unit 23 . These are further detailed in the next section.
  • the control unit 23 is not limited to a single unit, and may be implemented as a plurality of control units 23, one for each function. A combination of these is also possible.
  • FIG. 3 is an example of a block diagram showing the hardware configuration of the user terminal 3. The user terminal 3 has a communication unit 31, a storage unit 32, a control unit 33, a display unit 34, and an input unit 35, and these components are electrically connected via a communication bus 30 inside the user terminal 3. Descriptions of the communication unit 31, the storage unit 32, and the control unit 33 are omitted because they are substantially the same as those of the communication unit 21, the storage unit 22, and the control unit 23 of the server 2.
  • the display unit 34 may be included in the housing of the user terminal 3 or may be externally attached.
  • the display unit 34 displays a screen of a graphical user interface (GUI) that can be operated by the user.
  • the display unit 34 will be described as being included in the housing of the user terminal 3 .
  • the input unit 35 may be included in the housing of the user terminal 3 or may be externally attached.
  • the input unit 35 may be integrated with the display unit 34 and implemented as a touch panel. With a touch panel, the user can input a tap operation, a swipe operation, or the like.
  • a switch button, a mouse, a QWERTY keyboard, or the like may be employed instead of the touch panel. That is, the input unit 35 receives an operation input made by the user. The input is transferred as a command signal to the control unit 33 via the communication bus 30, and the control unit 33 can perform predetermined control or calculation as necessary.
  • the sound collector 4 is a so-called microphone that is configured to convert external sounds into signals.
  • the sound collector 4 may be directly connected to the server 2, but is, for example, provided in or connected to the user terminal 3.
  • the sound collector 4 is configured to generate voice data by collecting the user's utterance. Note that the voice data may be temporarily stored in the memory within the user terminal 3 and may not be stored in the storage unit 32 in a non-volatile manner. Audio data generated by the sound collector 4 is configured to be transferable to the server 2 via a network.
  • the sound collector 4 collects at least sounds in the human audible range (frequencies between 20 Hz and 20,000 Hz) and converts them into electrical signals.
  • Audio can be mono or stereo recordings.
  • sampling rates for digitally processing audio data include, for example, 48,000 Hz, 44,100 Hz, 32,000 Hz, 22,050 Hz, 16,000 Hz, 11,025 Hz, 11,000 Hz, and 8,000 Hz; any of the values exemplified here may be used. Increasing the sampling rate discretizes the temporal timing of speech more finely and improves the accuracy of speech recognition.
  • the data collected by the sound collector 4 may be appropriately compressed by the control unit 33 of the user terminal 3.
  • the compression format at this time may be MP3, AAC, WMA, Vorbis, AC3, MP2, FLAC, TAK, or the like. Compression reduces the communication traffic caused by the data transfer from the user terminal 3 to the server 2.
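To make the traffic reduction concrete, here is a back-of-the-envelope comparison in Python; the 128 kbps compressed bit rate is an illustrative assumption, not a value given in the description.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> int:
    """Data rate of raw (uncompressed) PCM audio."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

raw = pcm_bytes_per_second(44_100)   # 88,200 bytes/s for 16-bit mono at 44.1 kHz
compressed = 128_000 // 8            # a 128 kbps MP3/AAC stream is 16,000 bytes/s
print(f"compression reduces traffic by about {raw / compressed:.1f}x")  # ~5.5x
```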
  • the imaging device 5 is a so-called camera configured to be able to capture information about the outside world as images.
  • the imaging device 5 may be directly connected to the server 2, but is, for example, provided in or connected to the user terminal 3.
  • the imaging device 5 is configured to generate moving image data by collecting a sequence of individual images.
  • the moving image data generated by the imaging device 5 can be transferred to the server 2 via the network.
  • FIG. 4 is an example of a block diagram showing functions realized by the control unit 23 of the server 2.
  • the control unit 23 of the server 2, which is an example of the information processing system 1, includes a reception unit 231, a display control unit 232, an output unit 233, a time recording unit 234, a recording unit 235, a speech recognition unit 236, a specifying unit 237, a transfer unit 238, and a reproduction unit 239.
  • the reception unit 231 receives data such as instructions, audio data, and video data transmitted from the user terminal 3 via the network. Note that the voice data is temporarily stored in the memory within the server 2 and is not recorded in the storage section 22 .
  • the display control unit 232 controls screen data displayed on the display unit 34 of the user terminal 3.
  • the screen data may be visual information such as screens, images, icons, and texts generated so as to be visually recognizable by the user, or it may be rendering information for displaying them.
  • the output unit 233 performs processing related to output such as transmission of data to the user terminal 3 and storage of minutes in the storage unit 22 via the network.
  • the time recording unit 234 associates the time data of the voice data with the time data of the script data.
  • the time data is data representing the time counted from the start of the conference. In other embodiments, the time data may simply be data representing the current time.
  • the recording unit 235 causes the storage unit 22 to store the voice data acquired via the sound collector 4 as recorded data.
  • the speech recognition unit 236 recognizes the recorded data and converts the recorded data into script data.
  • the script data is data obtained by transcribing the recorded data. That is, the script data is data obtained by transcribing voice data during the conference.
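The description does not name a particular recognizer. As one hedged sketch, the open-source Whisper model (via the openai-whisper package, used here purely as a stand-in and not specified by this publication) returns time-aligned segments of exactly the shape that script data needs, namely text associated with start and end times:

```python
import whisper  # pip install openai-whisper; an illustrative stand-in recognizer

model = whisper.load_model("base")
result = model.transcribe("recorded_meeting.wav", language="ja")

# Each segment carries start/end times in seconds plus the transcribed text,
# which is the association between time data and script data described above.
script_data = [
    {"start": seg["start"], "end": seg["end"], "text": seg["text"]}
    for seg in result["segments"]
]
```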
  • the specifying unit 237 specifies part of the information in the document creation area and the script display area, which will be described later.
  • the transfer unit 238 transfers document data in the document creation area, which will be described later, to the script display area. The transfer unit 238 also transfers script data in the script display area to the document creation area.
  • the reproduction unit 239 reproduces recorded data recorded by the recording unit 235.
  • Information Processing Method: this section describes an overview of the information processing method of the information processing system 1 described above.
  • FIG. 5 is an example of an activity diagram outlining the information processing executed by the information processing system 1.
  • the control unit 33 of the user terminal 3 accepts the input of the user ID and password and the participation request through the operation of the input unit 35 by the user.
  • upon receiving the participation request, the control unit 33 transmits the user ID, password, and participation request to the server 2 via the communication unit 31 of the user terminal 3 and the network.
  • the reception unit 231 of the server 2 acquires the user ID, password, and participation request from the user terminal 3 via the network and the communication unit 21 of the server 2.
  • the reception unit 231 determines whether or not the user ID and password match the account information stored in the storage unit 22, and if they match, performs login authentication.
  • a user who has successfully logged in is registered as a conference participant.
  • the display control unit 232 of the server 2 controls the screen data so that the display unit 34 displays that the login authentication has succeeded and that the user can participate in the conference.
  • the output unit 233 of the server 2 transmits the screen data to the user terminal 3 via the communication unit 21 and the network.
  • the control unit 33 of the user terminal 3 receives a meeting start request from the host user via the input unit 35 operated by the host user. Upon receiving the conference start request, the control unit 33 transmits the conference start request to the server 2 via the communication unit 31 of the user terminal 3 of the host user and the network.
  • the reception unit 231 of the server 2 receives a meeting start request from the user terminal 3 of the host user via the network and the communication unit 21 of the server 2.
  • when the reception unit 231 receives the conference start request, processing for starting the online conference is performed.
  • the reception unit 231 puts data such as audio data and video data transmitted from each user terminal 3 into a state in which it can be received.
  • the display control unit 232 of the server 2 controls the screen data so as to display an input screen for creating the minutes of the meeting, including a document creation area 62 and a script display area 63 (for the document creation area 62 and the script display area 63, see FIG. 6 below).
  • the document creation area 62 is configured to be able to display non-voice input document data by a user during a meeting conducted by one or more users.
  • the script display area 63 is configured to be able to display meeting script data 635 generated based on the voice of the meeting (for the script data 635, see FIG. 7 later).
  • the output unit 233 of the server 2 puts data such as screen data, audio data, and video data into a state in which they can be transmitted to each user terminal 3 .
  • the control unit 23 of the server 2 starts the conference. Specifically, the output unit 233 starts transmitting screen data controlled by the display control unit 232 of the server 2 to all users via the communication unit 21 of the server 2 and the network. In addition, transmission of audio data or moving image data acquired by one user's sound collector 4 or imaging device 5 to other users is started.
  • the control unit 33 of the user terminal 3 of each user receives character input to the minutes through the user's operation on the input unit 35 during the conference. Also, the control unit 33 acquires voice data from each user via the sound collector 4 . Furthermore, the control unit 33 acquires moving image data via the imaging device 5 . The control unit 33 transmits character input data, audio data, and video data to the server 2 via the communication unit 31 and the network.
  • the reception unit 231 of the server 2 receives character input data, voice data, and video data from each user terminal 3 via the network and the communication unit 21 of the server 2.
  • the time recording unit 234 and the display control unit 232 of the server 2 appropriately process the character input data in a document input area 622 and a time stamp display area 623, which will be described later.
  • the recording unit 235 of the server 2 stores the acquired voice data in the storage unit 22 as recording data.
  • the output unit 233 of the server 2 may transmit the screen data controlled by the display control unit 232 to all users via the communication unit 21 and the network, and may transmit the audio data or video data acquired by one user's sound collector 4 or imaging device 5 to the other users.
  • the control unit 33 of the user terminal 3 receives a voice recognition request from the user via the input unit 35.
  • upon receiving the voice recognition request, the control unit 33 transmits the voice recognition request to the server 2 via the communication unit 31 of the user terminal 3 and the network.
  • the reception unit 231 of the server 2 acquires the voice recognition request via the network and the communication unit 21 of the server 2.
  • the speech recognition unit 236 of the server 2 that has received the speech recognition request recognizes the recorded data during the conference.
  • the speech recognition unit 236 of the server 2 outputs script data 635 based on the recognized recorded data.
  • the display control unit 232 controls screen data so that the output script data 635 is displayed on the display unit 34 .
  • the output unit 233 transmits screen data to the user terminal 3 via the communication unit 21 of the server 2 and the network. A9 and A10 will be described with reference to FIGS. 6 and 7.
  • the control unit 33 of the user terminal 3 receives screen data via the network and the communication unit 31 of the user terminal 3.
  • the control unit 33 causes the display unit 34 to display the screen data.
  • the control unit 33 of the user terminal 3 receives character input or area operation through the user's operation of the input unit 35.
  • the control unit 33 transmits information on the character input or area operation to the server 2 via the communication unit 31 and the network.
  • the reception unit 231 of the server 2 receives information regarding character input or area operation via the network and the communication unit 21 of the server 2. A13 will be described with reference to FIGS. 8 and 9.
  • the control unit 33 of the user terminal 3 transmits a save instruction to the server 2 via the communication unit 31 and the network.
  • the control unit 23 of the server 2 stores the minutes in the storage unit 22 via the network and the communication unit 21 of the server 2.
  • for A15, refer to the processing related to the save button 602 and the completion button 603 in FIG. 6.
  • FIG. 6 is an example of minutes displayed on the display unit 34 of the user terminal 3 immediately after the end of the conference and before speech recognition in the first embodiment.
  • a minutes area 6 is displayed on the screen.
  • the minutes area 6 includes an agenda area 601, a save button 602, a completion button 603, a layout button 604, a proceedings summary display area 61, a document creation area 62, a script display area 63, and a playback area 64.
  • An agenda area 601 is an area in which an agenda of a meeting such as "regular meeting of sales headquarters" is displayed.
  • a save button 602 is a button for temporarily saving the minutes. That is, the receiving unit 231 of the server 2 receives selection of the save button 602 . Upon receiving the selection of the save button 602, the output unit 233 of the server 2 causes the storage unit 22 to store the current minutes. As a result, it is possible to temporarily store the minutes that are being created.
  • a completion button 603 is a button for making the minutes a completed version. That is, the accepting unit 231 accepts selection of the completion button 603 .
  • the output unit 233 of the server 2 converts the current minutes into an arbitrary format, such as PDF, and stores them in a designated storage destination in the storage unit 22.
  • the output unit 233 transmits the minutes to the user terminals 3 of related members of the conference, such as participants and absentees, via the communication unit 21 and the network.
  • the completed minutes can be saved in any storage location, formatted in any format, and shared with any member.
  • a layout button 604 is a button for changing the layout of the minutes area 6. That is, the reception unit 231 receives an operation of the layout button 604.
  • the display control unit 232 controls screen data so as to change the layout of the minutes area 6 for display on the display unit 34 of the user terminal 3 . This makes it possible to freely change the layout of the minutes.
  • the proceedings summary display area 61 is an area in which a summary of the proceedings is displayed.
  • the summary of the proceedings includes data such as the date and time of the meeting (e.g., "2022/4/23 10:00-11:00"), the URL of the storage location of the materials used in the meeting (e.g., "https://meeting"), attendees, absentees, and the minutes creator. That is, the reception unit 231 receives input to the proceedings summary display area 61.
  • the display control unit 232 controls the screen data so as to change the description in the proceedings summary display area 61. This allows the summary of the minutes to be edited arbitrarily.
  • the document creation area 62 also includes a document creation auxiliary area 621, a document input area 622, a time stamp display area 623, a specific button 624, and the like.
  • the document creation auxiliary area 621 is configured so that the kind of data displayed in the document creation area 62 can be indicated in a single word, such as "Agenda".
  • the document input area 622 is configured to be able to display the results of character input by a user before, during, and after a meeting conducted by one or more users.
  • character input is an example of non-voice input. That is, the reception unit 231 accepts character input to the document input area 622.
  • the display control unit 232 controls the screen data so that the user terminal 3 displays the character input as document data. Thereby, the minutes can be edited arbitrarily.
  • the time stamp display area 623 is configured to be able to display time data regarding the elapsed time of the meeting at which characters were input in the document input area 622. That is, the reception unit 231 accepts character input to the document input area 622.
  • the time recording unit 234 records time data in the time stamp display area 623 for the time at which characters are input in the document input area 622 .
  • the display control unit 232 controls the screen data to be displayed on the user terminal 3. For example, assume that the reception unit 231 receives a character input in the document input area 622 when 1 minute and 16 seconds have passed since the start of the conference. In this case, the time recording unit 234 records the time data "1:16" in the time stamp display area 623, next to the character input. As a result, the time at which characters were input into the minutes can be recorded.
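A minimal sketch of producing such a stamp from the elapsed meeting time, matching the "1:16" example above (the helper name is hypothetical):

```python
def time_stamp(elapsed_seconds: int) -> str:
    """Format elapsed meeting time as m:ss, e.g. 76 seconds -> '1:16'."""
    minutes, seconds = divmod(elapsed_seconds, 60)
    return f"{minutes}:{seconds:02d}"

assert time_stamp(76) == "1:16"
```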
  • a specific button 624 is a button for reflecting part of the document input in the document input area 622 in specific items such as decision items and ToDo items. That is, the reception unit 231 accepts selection of document data in the document input area 622 by dragging with a cursor or the like. The specifying unit 237 specifies the selected document data. Furthermore, the reception unit 231 accepts selection of the specific button 624. Upon receiving these, the display control unit 232 controls the screen data to be displayed on the user terminal 3 so that the specified document data is recognizable as a specific item such as a decision item or a ToDo item. As a result, arbitrary document data can be designated as a specific item.
  • the script display area 63 includes a script display auxiliary area 631 , a script area 632 , a search area 633 and a specific button 634 .
  • the script display auxiliary area 631 is configured so that the kind of data displayed in the script area 632 can be indicated in a single word, such as "Transcription".
  • the script area 632 is configured to be able to display conference script data 635 generated based on the audio of the conference. At least one of the document input area 622 and the script area 632 may be displayed as a pop-up. Processing performed in the script area 632 will be described in detail with reference to FIG. 7.
  • a search area 633 is an area for searching the script area 632 for an arbitrary keyword. That is, the reception unit 231 accepts input of a keyword into the search area 633. The output unit 233 determines whether or not the keyword exists in the script; if it does, the display control unit 232 highlights the keyword in the script area 632. This makes it easy to find arbitrary data in the script, as sketched below.
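One way to realize the keyword search, sketched with Python's standard library; the function name is illustrative, and the returned character spans are what the display control unit would highlight:

```python
import re

def find_keyword_spans(script_text: str, keyword: str) -> list:
    """Return (start, end) offsets of each case-insensitive occurrence of keyword."""
    return [m.span() for m in re.finditer(re.escape(keyword), script_text, re.IGNORECASE)]

spans = find_keyword_spans("New project kickoff. The new project budget...", "new project")
# -> [(0, 11), (25, 36)]
```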
  • the specific button 634 is a button for reflecting part of the document input in the script area 632 in decision items, ToDo items, and the like. Processing related to the specific button 634 will be described in detail with reference to FIG. 9.
  • in the playback area 64, objects for playing back recorded data, such as play and stop, are displayed.
  • FIG. 7 is an example of minutes displayed on the display unit 34 of the user terminal 3 after speech recognition in the first embodiment. Compared to FIG. 6, it differs in that transcription has been completed and script data 635 has been generated.
  • the script data 635 is data obtained by transcribing the recorded data.
  • the reception unit 231 receives audio data of the conference.
  • the recording unit 235 stores (records) voice data of the conference in the storage unit 22 as recorded data.
  • the reception unit 231 of the server 2 acquires the voice recognition request via the network and the communication unit 21 of the server 2 .
  • the speech recognition unit 236 of the server 2 that has received the speech recognition request recognizes the recorded data during the conference.
  • the speech recognition unit 236 outputs script data 635 to the script area 632 based on the recognized recorded data.
  • the display control unit 232 controls the script area 632 to display the output script data 635 . That is, script area 632 is configured to display script data 635 generated based on recorded data. As a result, the recorded data can be extracted as script data.
  • FIG. 8 is an example of minutes displayed on the display unit 34 of the user terminal 3 when the script data 635 is specified according to the first embodiment. Compared to FIG. 7, it differs in that a time highlight area 625 and a script highlight area 636 are displayed.
  • Time highlight area 625 is displayed by selecting data in time stamp display area 623 . For example, when "11:32" is selected, the time highlight area 625 is displayed in a different color.
  • Script highlight area 636 is displayed by selecting data in time stamp display area 623 .
  • in the script highlight area 636, when "11:32" is selected, the script corresponding to that time is displayed in a different color.
  • the seek bar in the playback area 64 can be configured to move to "11:32", which is the start time of the agenda, in accordance with the selection of data in the time stamp display area 623 .
  • the reception unit 231 receives selection of information displayed in the document input area 622 or the time stamp display area 623 .
  • the specifying unit 237 specifies the script data 635 in the script area 632 corresponding to the time at which the document data or time data was input.
  • document data or time data is an example of non-voice input.
  • the script data 635 is specified as a certain range.
  • the display control unit 232 controls the screen data so that the selected document data in the document input area 622 or the time data in the time stamp display area 623 and the corresponding script data 635 are highlighted and displayed. This makes it possible to easily refer to the document data or the script data corresponding to the time stamp display area.
  • an object for reproducing the recorded data may be provided in the minutes area 6 . That is, the reception unit 231 receives selection of information displayed in the document input area 622 or the time stamp display area 623 .
  • the specifying unit 237 specifies script data 635 in the script area 632 corresponding to the time when the document data or time data was input.
  • the receiving unit 231 receives an object operation for reproduction.
  • the reproducing unit 239 reproduces the recorded data corresponding to the certain range of the script data 635 specified by the specifying unit 237 .
  • the conference recording data corresponding to the specified script data can be reproduced.
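Assuming the recorded data is stored as a plain WAV file, the reproduction of one specified range reduces to seeking by start time and reading frames; the file name and times below are hypothetical:

```python
import wave

def read_range(path: str, start_s: float, end_s: float) -> bytes:
    """Return the raw PCM frames covering one specified script range."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        wav.setpos(int(start_s * rate))                       # seek to the range start
        return wav.readframes(int((end_s - start_s) * rate))  # read only that range

frames = read_range("recorded_meeting.wav", 692.0, 715.5)  # e.g. a segment starting at 11:32
```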
  • FIG. 9 is an example of minutes displayed on the display unit 34 of the user terminal 3 when the script data 635 is transferred according to the first embodiment. It differs from FIG. 7 in that a document highlight area 626 and a script highlight area 637 are displayed.
  • the script highlight area 637 is displayed by dragging data in the script data 635 with a cursor. For example, in the script highlight area 637, by dragging the portion "Ask attendees to give feedback on the new project" with the cursor, the color of the portion is changed and displayed.
  • the document highlight area 626 is displayed by operating the specific button 634 while the script highlight area 637 is being displayed.
  • the seek bar in the reproduction area 64 can be configured to move to "11:32", which is the start time of the agenda, in accordance with the selection of the script data 635 .
  • the accepting unit 231 accepts selection of a script in the script display area 63 .
  • the transfer unit 238 transfers the selected part into the document creation area 62 .
  • the display control unit 232 controls the screen data so as to display the transferred data. As a result, specific items such as determined items can be easily set for the transcribed script data.
  • Embodiment 2: in the first embodiment, an example in which a creator creates the minutes was described. Embodiment 2 describes an example in which minutes are created in a chat format. Since the hardware configuration and functional configuration are the same as those of the first embodiment, their description is omitted. This chat may also be realized by cooperating with an existing chat service or chat application.
  • FIG. 10 is an example of minutes displayed on the display unit 34 of the user terminal 3 according to the second embodiment.
  • the minutes area 7 is displayed in FIG.
  • the minutes area 7 includes an agenda summary area 71 and a summary area 72.
  • the minutes area 7 may include the title of the minutes, the summary of the minutes, the decisions of the meeting, and the to-do items of the meeting.
  • the agenda summary area 71 includes an agenda information area 711 and a playback area 712 .
  • the agenda information area 711 displays the number of posts, the number of decided items, the number of ToDo items, and the time required for each agenda item. Also, in the playback area 712, objects for playing back recorded data such as play and stop are displayed. That is, the reception unit 231 receives an operation on the agenda information area 711 . When the agenda information area 711 is operated, the specifying unit 237 specifies recorded data corresponding to the agenda. Furthermore, the accepting unit 231 accepts an operation on the playback area 712 . When the operation of the reproduction area 712 is accepted, the reproduction unit 239 reproduces the recorded data corresponding to the selected agenda information area 711 . As a result, it is possible to reproduce the recorded data of the meeting corresponding to the specified agenda.
  • the summary area 72 includes an important data area 721 and a header data area 722 .
  • Important data area 721 is configured to be able to display the results of inputs entered via chat by users during a meeting conducted by one or more users.
  • the header data area 722 is configured to be able to display part of the script data corresponding to each topic of the conference, generated by the speech recognition unit 236 based on the speech of the conference.
  • the important data area 721 and the header data area 722 are examples of the document creation area and the script display area.
  • the input input via chat is an example of non-voice input.
  • Header data area 722 may be selected to display all of the script information corresponding to each topic. That is, the reception unit 231 receives selection of any agenda item in the header data area 722 . When accepting the selection of the agenda, the display control unit 232 controls the screen data so as to display all the script data corresponding to the agenda. This makes it possible to smoothly refer to script data.
  • in the second embodiment as well, specification and transfer may be performed.
  • the display control unit 232 controls the screen data so as to display an input screen for creating the minutes of the meeting, including the important data area 721 and the header data area 722 .
  • the specifying unit 237 specifies the header data area 722 corresponding to the time when the important data area 721 was input, and expands the script data.
  • the transfer unit 238 transfers the selected portion to the important data area 721 when a selection of the expanded script data is received. As a result, specific items such as decision items can easily be set for the transferred script data.
  • in this way, minutes can be created more efficiently. That is, it is possible to perform character recognition and easily extract summaries, important parts, and the like from the proceedings.
  • the sound collector 4 may be directly connected to the communication unit 21 of the server 2 via a network without going through the user terminal 3, and may be configured to be able to transfer collected sound data to the server 2. In this case, it is preferable to record which user the sound data collected from each sound collector 4 is associated with. Voice data can be associated with a plurality of users, but if each user has one sound collector 4, it is possible to record which user uttered the voice data. The same applies to the imaging device 5.
  • An information processing system comprising a control unit, the control unit being configured to execute each of the following steps: in a display control step, an input screen for creating the minutes of a meeting is displayed, including a document creation area and a script display area, wherein the document creation area is configured to be able to display results of non-voice input by a user during a meeting conducted by one or more users, and the script display area is configured to be able to display a script of the meeting generated based on the voice of the meeting; in a specifying step, when a selection of the non-voice input displayed in the document creation area is accepted, the portion of the script corresponding to the time at which the non-voice input was entered is specified; and in a transfer step, when a selection of the script displayed in the script display area is accepted, the selected portion is transferred to the document creation area.
  • The information processing system, wherein the control unit is configured to further execute a recording step, in which the audio of the conference is recorded as recorded data, and wherein a script generated based on the recorded data is displayed in the script display area.
  • The information processing system, wherein the control unit is configured to further execute a reproduction step, in which recorded data corresponding to the certain range of the script specified in the specifying step can be reproduced.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • User Interface Of Digital Computer (AREA)
  • Document Processing Apparatus (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The problem addressed by the present invention is to provide an information processing device or the like that can recognize characters and easily extract summaries, important parts, and the like from proceedings. The solution according to one embodiment of the present invention is an information processing system. This information processing system comprises a control unit. The control unit is configured to execute each of the following steps. In a display control step, an input screen for creating the minutes of a meeting is displayed, including a document creation area and a script display area. The document creation area is configured to be able to display the results of non-voice input made by a user during a meeting conducted by one or more users. The script display area is configured to be able to display a meeting script generated based on the audio of the meeting. In a specifying step, the portion of the script corresponding to the time at which the non-voice input was entered is specified when a selection relating to the non-voice input displayed in the document creation area is received. In a transfer step, when a selection relating to the script displayed in the script display area is received, the selected portion is transferred to the document creation area.
PCT/JP2022/034197 2021-09-30 2022-09-13 Système de traitement d'informations, programme et procédé de traitement d'informations WO2023053940A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2021-161664 2021-09-30
JP2021161664 2021-09-30
JP2021-198172 2021-12-07
JP2021198172A JP7215770B1 (ja) 2021-09-30 2021-12-07 情報処理システム、プログラム及び情報処理方法

Publications (1)

Publication Number Publication Date
WO2023053940A1 (fr) 2023-04-06

Family

ID=85111699

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/034197 WO2023053940A1 (fr) 2021-09-30 2022-09-13 Système de traitement d'informations, programme et procédé de traitement d'informations

Country Status (2)

Country Link
JP (3) JP7215770B1 (fr)
WO (1) WO2023053940A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015015632A (ja) * 2013-07-05 2015-01-22 株式会社リコー 議事録生成装置、議事録生成方法、議事録生成プログラムおよび通信会議システム
WO2016043110A1 (fr) * 2014-09-16 2016-03-24 株式会社東芝 Dispositif, procédé et programme d'accumulation d'informations de conférence
WO2016163028A1 (fr) * 2015-04-10 2016-10-13 株式会社東芝 Dispositif de présentation d'énoncé, procédé de présentation d'énoncé et programme
JP2018092365A (ja) * 2016-12-02 2018-06-14 株式会社アドバンスト・メディア 情報処理システム、情報処理装置、情報処理方法及びプログラム
JP2021067830A (ja) * 2019-10-24 2021-04-30 日本金銭機械株式会社 議事録作成システム

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008172582A (ja) 2007-01-12 2008-07-24 Ricoh Co Ltd 議事録作成再生装置
JP6280312B2 (ja) 2013-05-13 2018-02-14 キヤノン株式会社 議事録記録装置、議事録記録方法及びプログラム
JP6165913B1 (ja) * 2016-03-24 2017-07-19 株式会社東芝 情報処理装置、情報処理方法およびプログラム
US10742695B1 (en) 2018-08-01 2020-08-11 Salesloft, Inc. Methods and systems of recording information related to an electronic conference system
KR102530669B1 (ko) 2020-10-07 2023-05-09 네이버 주식회사 앱과 웹의 연동을 통해 음성 파일에 대한 메모를 작성하는 방법, 시스템, 및 컴퓨터 판독가능한 기록 매체

Also Published As

Publication number Publication date
JP2023164835A (ja) 2023-11-14
JP7337415B2 (ja) 2023-09-04
JP2023051950A (ja) 2023-04-11
JP2023051656A (ja) 2023-04-11
JP7215770B1 (ja) 2023-01-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22875812

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE