WO2016042600A1 - Information providing system - Google Patents
- Publication number
- WO2016042600A1 (PCT/JP2014/074412)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target word
- recognition target
- recognition
- heading
- candidate
- Prior art date
Classifications
- G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/10: Speech classification or search using distance or distortion measures between unknown speech and reference templates
- G10L15/1815: Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
- G10L15/26: Speech to text systems
- G10L2015/088: Word spotting
- G10L2015/223: Execution procedure of a spoken command
Definitions
- Information providing apparatuses are known in which a headline such as a news headline is displayed on a display or output as voice from a speaker, and when the user utters a keyword included in the headline, the corresponding information is read out and displayed.
- Patent Document 1 describes a speech recognition / speech synthesis apparatus that reads out text described in HTML or the like, uses keywords included in link character strings representing the contents of linked information as speech recognition targets, and, when such a keyword is uttered, acquires and reads out the corresponding link destination information (content).
- However, in Patent Document 1, when keywords extracted from a plurality of link character strings are duplicated and the user utters a duplicated keyword, only a notification that there are a plurality of options is provided. A method for easily selecting the information of one link destination by voice operation is not disclosed, so there is a problem that the user cannot easily acquire the information (content) of the desired link destination.
- The present invention has been made to solve the above problem, and it is an object thereof to provide an information providing system in which a desired heading can be easily selected from a plurality of headings by the user's voice operation and the information (content) corresponding to the selected heading can be presented.
- In order to achieve the above object, the present invention provides an information providing system that presents a plurality of headings, selects one of the headings, and presents the content corresponding to the selected heading.
- The system includes: an extraction unit that extracts a recognition target word candidate from each of the acquired headings;
- a target word determination unit that acquires the recognition target word candidate of each heading from the extraction unit and generates a recognition target word for each heading based on the acquired candidates;
- and a control unit that, when a speech recognition result of an utterance matches a recognition target word generated by the target word determination unit, instructs presentation of the content corresponding to the heading whose recognition target word matched. The target word determination unit is characterized in that, when overlapping recognition target word candidates exist among the recognition target word candidates of the acquired headings, it dynamically generates the recognition target words of the headings so that the recognition target words of the headings differ from each other.
- According to the information providing system of the present invention, since the recognition target words for selecting one headline from a plurality of headlines are not duplicated, a headline is uniquely selected by the user's utterance and the information (content) corresponding to the selected headline can be presented, which improves convenience for the user.
- FIG. 1 is an explanatory diagram for explaining the outline of the operation of the information providing system according to Embodiment 1.
- FIG. 2 is a diagram illustrating an example of a display screen on which news headlines are displayed on the display by the information providing system according to Embodiment 1.
- FIG. 3 is a schematic configuration diagram illustrating the main hardware configuration of the information providing system according to Embodiment 1.
- FIG. 4 is a block diagram illustrating an example of the information providing system according to Embodiment 1.
- FIG. 5 is a table showing an example of the headlines acquired by the analysis unit and the content (news text) corresponding to each headline.
- FIG. 6 is a table showing the result of the extraction unit in Embodiment 1 extracting a recognition target word candidate from each heading shown in FIG. 5.
- FIG. 7 is a table showing an example of the recognition target word for each headline, each headline, and the content (news text) corresponding to each headline stored in the storage unit in Embodiment 1.
- FIG. 8 is a flowchart illustrating the operation up to generation of the speech recognition dictionary in the information providing system according to Embodiment 1.
- FIG. 9 is a flowchart illustrating the operations of outputting a headline and storing content and the like in the storage unit in the information providing system according to Embodiment 1.
- FIG. 10 is a flowchart illustrating the operation of presenting content in the information providing system according to Embodiment 1.
- FIG. 11 is a diagram showing the content information acquired by the acquisition unit via the network.
- FIG. 12 is a table showing the result of the extraction unit in Embodiment 2 extracting a plurality of recognition target word candidates from each heading shown in FIG. 5 and associating the recognition target word candidates with each heading.
- FIG. 13 is a table showing an example of the recognition target word for each headline, each headline, and the content (news text) corresponding to each headline, in which recognition target words are re-determined and stored in the storage unit in Embodiment 2.
- FIG. 14 is a flowchart illustrating the operation up to generation of the speech recognition dictionary in the information providing system according to Embodiment 2.
- FIG. 15 is a flowchart illustrating the operations of storing / updating content and the like in the storage unit and presenting content in the information providing system according to Embodiment 2.
- the information providing system of the present invention presents a plurality of headings, selects one heading from the plurality of headings, and presents information (content) corresponding to the selected heading.
- FIG. 1 is an explanatory diagram for explaining the outline of the operation of the information providing system according to Embodiment 1 of the present invention.
- When content information is acquired via the network 2, the information providing system 1 outputs an instruction to the display 3, the speaker 4, and the like (hereinafter referred to as the "display and the like") so that a headline of the content included in the acquired content information is presented by display or audio output.
- When the user utters a recognition target word included in a headline, the information providing system 1 selects that headline and outputs an instruction to the display and the like so that the content corresponding to the selected headline is presented (displayed or output as audio).
- the information providing system 1 acquires news information as content information via the network 2 and displays the headline of the news text (content) included in the news information (content information).
- The headline here is a summary sentence summarizing the content of the news, but any headline may be used as long as it represents the substance of the content.
- FIG. 2 is a diagram illustrating an example of a display screen on which news headlines are displayed on the display 3 by the information providing system according to Embodiment 1. When the user utters a recognition target word included in a headline, the information providing system 1 selects the headline including the recognition target word and outputs an instruction so that the news text corresponding to the selected headline is presented.
- the “content” included in the content information is described as a news body that is text information.
- the present invention is not limited to this.
- The "content" is not limited to text information and may be an image (a still image, a moving image, a moving image including sound, etc.).
- When the content is a moving image including sound, the sound included in the moving image may be output. The same applies to the following embodiments.
- the “heading” of the content is described as a heading corresponding to the news body (content) that is text information.
- the present invention is not limited to this.
- the heading is text information including a recognition target word uttered by the user. The same applies to the following embodiments.
- FIG. 3 is a schematic configuration diagram showing a main hardware configuration of the information providing system 1 according to the first embodiment.
- As shown in FIG. 3, the information providing system 1 includes a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, an input device 104, a communication device 105, an HDD (Hard Disk Drive) 106, and an output device 107.
- the CPU 101 reads out and executes various programs stored in the ROM 102 and the HDD 106, thereby realizing various functions in cooperation with each hardware.
- the RAM 103 is a memory used when executing the program.
- the input device 104 receives user input and is a microphone, a remote controller, a touch sensor, or the like.
- the HDD 106 is an example of an external storage device.
- Examples of the external storage device include, in addition to the HDD, discs such as a CD and a DVD, and storage employing flash memory such as a USB memory and an SD card.
- The output device 107 includes a speaker, a liquid crystal display, an organic EL display, and the like.
- FIG. 4 is a block diagram showing an example of the information providing system 1 according to the first embodiment of the present invention.
- The information providing system 1 includes an acquisition unit 10, an analysis unit 11, an extraction unit 12, a target word determination unit 13, a dictionary generation unit 14, a speech recognition dictionary 15, a control unit 16, a storage unit 17, a speech recognition unit 18, and a speech synthesis unit 19.
- these components may be distributed to a server on a network, a mobile terminal such as a smartphone, and an in-vehicle device.
- the acquisition unit 10 acquires content information described in HTML (HyperText Markup Language) or XML (eXtensible Markup Language) format via the network 2.
- As the network 2, for example, a public line such as the Internet or a mobile phone network can be used.
- the analysis unit 11 analyzes the content information acquired by the acquisition unit 10 and acquires the content and the heading of the content. When a plurality of contents are included, all contents and corresponding headings are acquired.
- FIG. 5 is a table showing an example of a plurality of headings acquired by the analysis unit 11 and contents (news texts) corresponding to the headings.
- For example, the analysis unit 11 analyzes the content information acquired from the acquisition unit 10 and acquires the headline "Rakuten Nihon Series first advance" and the corresponding content (news text) "Rakuten won today's game 1-0, and its first advance to the Nihon Series was decided."; the headline "Darvish misses no-hit no-run" and the corresponding content (news text) "Darvish was hit and missed the no-hit no-run."; and the headline "Rakuten Nakata wants to transfer to the Major League" and the corresponding content (news text) "At a press conference this afternoon, Rakuten's Nakata announced that he would like to transfer to the Major League."
- the extraction unit 12 acquires a plurality of headings and contents corresponding to the headings from the analysis unit 11. Then, a reading of a word or the like that is a speech recognition target word candidate (hereinafter referred to as “recognition target word candidate”) is extracted from each acquired heading.
- the extracted recognition target word candidate is associated with the extraction source heading.
- any method may be used as the extraction method.
- For example, a headline may be divided into words by morphological analysis, and the reading of the first word or word string (hereinafter, "word or the like") may be extracted as the recognition target word candidate.
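The extraction step can be sketched as follows. This is a minimal illustration, not the patent's implementation: a real system would use a morphological analyzer (such as MeCab for Japanese), while here a whitespace tokenizer stands in, and the English headline strings are stand-ins for the readings in the example figures.

```python
def tokenize(heading: str) -> list[str]:
    """Stand-in for morphological analysis: split the heading on whitespace."""
    return heading.split()

def extract_candidate(heading: str) -> str:
    """Extract the first word of the heading as the recognition target word candidate."""
    words = tokenize(heading)
    return words[0] if words else ""

headings = [
    "Rakuten Nihon Series first advance",
    "Darvish misses no-hit no-run",
    "Rakuten Nakata wants to transfer to the Major League",
]
# Candidate per heading; note the first and third collide on "Rakuten".
candidates = {h: extract_candidate(h) for h in headings}
```

With these headings, the first and third candidates both come out as "Rakuten", which is exactly the duplicate situation the target word determination unit must resolve.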
- FIG. 6 is a table showing a result of the extraction unit 12 extracting recognition target word candidates from the headings shown in FIG. 5 and associating the recognition target word candidates with the headings.
- In FIG. 6, the recognition target word candidate of the headline "Rakuten Nihon Series first advance" is "Rakuten",
- the recognition target word candidate of the headline "Darvish misses no-hit no-run" is "Darubi",
- and the recognition target word candidate of the headline "Rakuten Nakata wants to transfer to the Major League" is "Rakuten".
- The target word determination unit 13 acquires a plurality of headings, the content corresponding to each heading, and the recognition target word candidate of each heading from the extraction unit 12. Then, it determines whether or not overlapping recognition target word candidates exist among the candidates of the acquired headings, and dynamically generates a recognition target word for each heading based on the recognition target word candidates.
- Specifically, when overlapping recognition target word candidates exist, the recognition target words of the headings are dynamically generated so that the recognition target words of the headings differ from each other.
- At least one recognition target word candidate among overlapping recognition target word candidates is processed or replaced so that the recognition target word candidates are different from each other.
- For example, when the candidates of the first heading and the third heading overlap, the recognition target word candidate of the first heading is processed into "Ichiban no Rakuten" (the first Rakuten) and that of the third heading into "Sanban no Rakuten" (the third Rakuten).
- the target word determination unit 13 determines the recognition target word candidate after processing or replacement as a speech recognition target word (hereinafter, referred to as “recognition target word”) corresponding to the heading. On the other hand, when there are no overlapping recognition target word candidates, the target word determination unit 13 determines the recognition target word candidate acquired from the extraction unit 12 as it is as a recognition target word corresponding to the headline.
- The target word determination unit 13 may also hold a list of recognition vocabularies related to operations other than the selection of headings (for example, a list of operation commands for operating a navigation device, other in-vehicle devices, and the like),
- and may process the recognition target word candidates so that the recognition target words do not match or resemble the words and the like included in that list. That is, the target word determination unit 13 may dynamically generate the recognition target words so as to differ from the operation commands of the devices.
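The determination step described above can be sketched as follows. The ordinal wording ("number 1 ...") and the operation-command set are illustrative assumptions, not the patent's exact phrasing; the point is only that colliding candidates, and candidates that clash with device commands, get processed so every resulting target word is distinct.

```python
from collections import Counter

# Hypothetical device operation commands the targets must not collide with.
OPERATION_COMMANDS = {"next", "back", "stop"}

def determine_targets(candidates: list[str]) -> list[str]:
    """Generate one distinct recognition target word per heading.

    Candidates are given in heading display order (1-based position).
    A candidate is processed (position prefix added) if it duplicates
    another heading's candidate or matches an operation command.
    """
    counts = Counter(candidates)
    targets = []
    for pos, cand in enumerate(candidates, start=1):
        if counts[cand] > 1 or cand in OPERATION_COMMANDS:
            targets.append(f"number {pos} {cand}")  # processed candidate
        else:
            targets.append(cand)                    # adopted as-is
    return targets

targets = determine_targets(["Rakuten", "Darvish", "Rakuten"])
```

Applied to the running example, the duplicated "Rakuten" candidates become "number 1 Rakuten" and "number 3 Rakuten", while the unique "Darvish" passes through unchanged.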
- the dictionary generation unit 14 generates a speech recognition dictionary 15 having the recognition target word determined by the target word determination unit 13 as a recognition vocabulary.
- the voice recognition unit 18 recognizes the voice collected by the microphone 5 with reference to the voice recognition dictionary 15 and outputs a recognition result character string.
- Since a known technique may be used for the speech recognition method, a description thereof is omitted.
- the control unit 16 acquires a heading, a recognition target word corresponding to the heading, and content corresponding to the heading from the target word determination unit 13 and stores them in the storage unit 17.
- FIG. 7 is a table showing an example of recognition target words for each headline, each headline, and content (news text) corresponding to each headline stored in the storage unit 17 in the first embodiment.
- FIG. 7 shows the state stored in the storage unit 17 after the target word determination unit 13 has processed the recognition target word candidate "Rakuten" of the first headline into "Ichiban no Rakuten", processed the candidate "Rakuten" of the third headline into "Sanban no Rakuten", and determined them as recognition target words.
- control unit 16 outputs an instruction to the display or the like so as to present a plurality of headings acquired from the target word determination unit 13. Specifically, an instruction is output to the display 3 to display the headline acquired from the target word determination unit 13. Alternatively, after an instruction is output to the speech synthesizer 19 to generate a synthesized speech corresponding to the heading, an instruction is output to the speaker 4 to output the synthesized speech generated by the speech synthesizer 19.
- As a method of outputting an instruction to present a headline from the control unit 16, at least one of display output and audio output may be used, or both display and audio may be used. Moreover, since the method of speech synthesis by the speech synthesizer 19 may be a known technique, a description thereof is omitted.
- control unit 16 searches the storage unit 17 using the recognition result character string output by the voice recognition unit 18 as a search key. Then, the heading corresponding to the recognition target word that matches the search key is selected, and the content corresponding to the heading is acquired. That is, when the speech recognition result by speech matches the recognition target word generated by the target word determination unit 13, the control unit 16 selects a heading that uses the matched speech recognition result as the recognition target word, and the selection The content corresponding to the headline is acquired.
- an instruction is output to the display 3 to display the acquired content.
- an instruction is output to the voice synthesizer 19 to generate a synthesized voice using the acquired content, and an instruction is output to the speaker 4 to output the synthesized voice generated by the voice synthesizer 19.
- As a method of outputting an instruction to present content from the control unit 16, at least one of display output and audio output may be used, or both display and audio may be used.
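The control unit's selection step, which uses the recognition result character string as a search key into the storage unit, can be sketched as follows. The store contents (target words, truncated news texts) are illustrative stand-ins for the FIG. 7 table, not actual data from the patent.

```python
# Storage unit sketch: recognition target word -> (headline, content).
store = {
    "number 1 Rakuten": ("Rakuten Nihon Series first advance",
                         "Rakuten won today's game 1-0."),
    "Darvish":          ("Darvish misses no-hit no-run",
                         "Darvish was hit and missed the no-hit no-run."),
    "number 3 Rakuten": ("Rakuten Nakata wants to transfer to the Major League",
                         "Nakata announced he would like to transfer."),
}

def present_content(recognition_result: str):
    """Select the headline whose target word matches the recognition result.

    Returns the content to present, or None when the utterance matched
    no recognition target word.
    """
    entry = store.get(recognition_result)
    if entry is None:
        return None
    heading, content = entry
    return content  # the control unit would instruct the display/speaker here
```

For example, a recognized utterance of "Darvish" selects the second headline and yields its news text, while an unmatched string yields nothing to present.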
- the target word determination unit 13 processes at least one of the overlapping recognition target word candidates when there are overlapping recognition target word candidates among the recognition target word candidates of each heading acquired from the extraction unit 12, By determining the recognition target word candidates after processing as recognition target words corresponding to the headings, the recognition target words are dynamically generated so that the recognition target words of the headings are different from each other.
- The target word determination unit 13 identifies the presentation position or presentation order of each headline including an overlapping recognition target word candidate, and processes the candidate based on the identified position or order. That is, for each headline from which an overlapping candidate was extracted, a word or the like indicating the position or order in which the headline is displayed on the display 3 or output from the speaker 4 is added in front of the overlapping candidate.
- For example, it is assumed that the headlines acquired by the analysis unit 11 are "Rakuten Nihon Series first advance", "Darvish misses no-hit no-run", and "Rakuten Nakata wants to transfer to the Major League", that the recognition target word candidates corresponding to the headlines extracted by the extraction unit 12 are "Rakuten", "Darubi", and "Rakuten", and that the headlines are displayed on the display 3 as shown in FIG. 2.
- In this case, since the candidates of the first headline and the third headline overlap in "Rakuten", the target word determination unit 13 processes the candidate "Rakuten" corresponding to the first headline "Rakuten Nihon Series first advance" into "Ichiban no Rakuten" (the first Rakuten), and the candidate "Rakuten" corresponding to the third headline "Rakuten Nakata wants to transfer to the Major League" into "Sanban no Rakuten" (the third Rakuten).
- Alternatively, the target word determination unit 13 may add a word or the like representing the relative presentation position or presentation order (relative display position or audio output order) of the headings from which the overlapping candidates were extracted in front of each overlapping recognition target word candidate.
- For example, the candidate "Rakuten" corresponding to the first heading "Rakuten Nihon Series first advance" is processed into a form such as "the upper Rakuten" or "the former Rakuten", and the candidate "Rakuten" corresponding to the last (third, relatively second) heading "Rakuten Nakata wants to transfer to the Major League" is processed into a form such as "the lower Rakuten" or "the latter Rakuten".
- the target word determination unit 13 may process the recognition target word candidate by adding another word or the like (word or word string) to the overlapping recognition target word candidate. For example, a process of adding a word adjacent to the recognition target word candidate in the heading to the overlapping recognition target word candidate may be performed. For the adjacent word or the like, a result analyzed by the extraction unit 12 may be used.
- For example, for the candidate "Rakuten" corresponding to the first heading "Rakuten Nihon Series first advance", the adjacent word string "Nihon Series" is added to process it into "Rakuten Nihon Series", and for the candidate "Rakuten" corresponding to the third heading "Rakuten Nakata wants to transfer to the Major League", the adjacent word "Nakata" is added to process it into "Rakuten Nakata".
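The adjacent-word variant can be sketched as follows. As before, whitespace splitting is an assumed stand-in for the extraction unit's morphological analysis, and only the word immediately after the colliding candidate is appended, which matches the simple case in the example.

```python
from collections import Counter

def disambiguate_with_neighbor(headings: list[str]) -> list[str]:
    """Resolve colliding first-word candidates by appending the word
    adjacent to the candidate within its own heading."""
    firsts = [h.split()[0] for h in headings]
    counts = Counter(firsts)
    targets = []
    for heading, cand in zip(headings, firsts):
        words = heading.split()
        if counts[cand] > 1 and len(words) > 1:
            targets.append(f"{cand} {words[1]}")  # candidate + adjacent word
        else:
            targets.append(cand)
    return targets

targets = disambiguate_with_neighbor([
    "Rakuten Nihon Series first advance",
    "Darvish misses no-hit no-run",
    "Rakuten Nakata wants to transfer to the Major League",
])
```

Here the duplicated "Rakuten" candidates become "Rakuten Nihon" and "Rakuten Nakata"; a fuller implementation might append a longer adjacent word string (e.g. "Nihon Series") using the extraction unit's analysis result.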
- Alternatively, the target word determination unit 13 may replace at least one of the overlapping recognition target word candidates with another paraphrase word so that the recognition target words of the headings differ from each other; in this way as well, the recognition target words are dynamically generated.
- Specifically, for at least one of the overlapping recognition target word candidates, the target word determination unit 13 acquires from the extraction unit 12 another recognition target word candidate that does not overlap with the candidates corresponding to the other headings, and replaces the overlapping candidate with the acquired one. That is, the target word determination unit 13 replaces the candidate with another recognition target word candidate included in the heading containing the overlapping candidate.
- For example, since the candidates of the first heading and the third heading both overlap in "Rakuten", "Nihon Series", which is another recognition target word candidate of the first heading "Rakuten Nihon Series first advance" and does not overlap with the candidates corresponding to the other headings, is acquired from the extraction unit 12 and used for replacement. That is, instead of the candidate "Rakuten", "Nihon Series" is set as the recognition target word candidate and determined as the recognition target word.
- Similarly, "Nakata", which is another recognition target word candidate of the third heading "Rakuten Nakata wants to transfer to the Major League" and does not overlap with the candidates corresponding to the other headings, may be acquired from the extraction unit 12 and used for replacement.
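This replacement strategy can be sketched as follows. It is a simplified illustration: whitespace splitting again stands in for morphological analysis, and the first non-colliding later word of the heading is taken as the replacement, without the extra checks a real implementation would need (e.g. avoiding collisions among the replacements themselves).

```python
from collections import Counter

def replace_with_alternative(headings: list[str]) -> list[str]:
    """For a colliding first-word candidate, substitute another word from
    the same heading that no other heading uses as its candidate."""
    candidates = [h.split()[0] for h in headings]
    counts = Counter(candidates)
    taken = set(candidates)
    targets = []
    for heading, cand in zip(headings, candidates):
        if counts[cand] > 1:
            # First later word in this heading not used as any heading's candidate.
            alt = next((w for w in heading.split()[1:] if w not in taken), cand)
            targets.append(alt)
        else:
            targets.append(cand)
    return targets

targets = replace_with_alternative([
    "Rakuten Nihon Series first advance",
    "Darvish misses no-hit no-run",
    "Rakuten Nakata wants to transfer to the Major League",
])
```

With the running example this yields "Nihon" and "Nakata" for the two colliding headings, mirroring the text's replacement of "Rakuten" with "Nihon Series" and "Nakata".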
- Further, a headline from which overlapping recognition target word candidates were extracted may be analyzed, the category to which the headline belongs may be specified, and a new word or the like representing the category may be generated. The overlapping candidate replaced with the generated word or the like may then be determined as the recognition target word.
- That is, the target word determination unit 13 analyzes the meaning of each headline including an overlapping recognition target word candidate, sets a term corresponding to the meaning as another recognition target word candidate, and replaces at least one of the overlapping candidates with it.
- For example, it is assumed that the headlines acquired by the analysis unit 11 are "Rakuten announces the arrival of the travel season", "Darvish misses no-hit no-run", and "Rakuten Nakata wants to transfer to the Major League".
- Although the candidates of the first heading and the third heading both overlap in "Rakuten", by a well-known intent understanding technique
- the category to which the first heading "Rakuten announces the arrival of the travel season" belongs is specified as "travel", and the category to which the third heading "Rakuten Nakata wants to transfer to the Major League" belongs is specified as "baseball".
- Then, the candidate "Rakuten" corresponding to the first heading may be replaced with a recognition target word candidate such as "the travel one", and the candidate "Rakuten" corresponding to the third heading may likewise be replaced with the recognition target word candidate "Yakyu no Ho" (the baseball one).
- the word is not limited to the category to which the headline belongs, and may be any word as long as it represents the intent of the headline.
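The category-based replacement can be sketched as follows. The keyword table is a deliberately naive stand-in for the "well-known intent understanding technique" the text mentions, and its entries are invented for illustration.

```python
# Hypothetical keyword -> category table standing in for intent understanding.
CATEGORY_KEYWORDS = {
    "season": "travel",
    "League": "baseball",
}

def category_target(heading: str, fallback: str) -> str:
    """Return a category-derived target word ("the <category> one") for the
    heading, or the original candidate when no category is recognized."""
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in heading:
            return f"the {category} one"
    return fallback
```

Applied to the example headings, "Rakuten announces the arrival of the travel season" maps to "the travel one" and "Rakuten Nakata wants to transfer to the Major League" maps to "the baseball one", so the two colliding "Rakuten" candidates become distinct.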
- FIG. 8 is a flowchart showing operations up to the generation of the speech recognition dictionary in the information providing system 1 according to the first embodiment.
- FIG. 9 is a flowchart showing an operation of outputting a headline and storing contents in the storage unit 17 in the information providing system 1 according to the first embodiment.
- FIG. 10 is a flowchart showing an operation of presenting content in the information providing system 1 according to the first embodiment.
- FIG. 11 is a diagram showing content information acquired by the acquisition unit 10 via the network 2.
- description will be made assuming that the text described in the HTML format as shown in FIG. 11 is the processing target of the information providing system 1.
- the information providing system 1 will be described on the assumption that the headline is displayed on the display and the content corresponding to the headline selected by the user utterance is output as audio. This is in consideration of a case where the user cannot read the content even if a detailed news text is displayed on the display, for example, during driving of the vehicle.
- On the other hand, when the display has a considerably large screen, or when there is no situation in which the user needs to concentrate on something else, such as driving a vehicle, the content corresponding to the headline selected by the user's utterance may be displayed on the display. Further, both display and audio output may be performed.
- the acquisition unit 10 acquires content information via the network 2 (step ST01).
- Here, it is assumed that news information consisting of text information as shown in FIG. 11 is acquired as the content information.
- Next, the analysis unit 11 analyzes the acquired content information. Specifically, the character string enclosed by an <A> tag is acquired as a headline, the HTML file designated by the "HREF" attribute is acquired via the acquisition unit 10, and the news text (content) described in that file is acquired as the content corresponding to the headline.
- The headline and news text (content) are processed in this way for the content designated by all the <A> tags.
- An example of each headline and content (news text) acquired by the analysis unit 11 is as shown in FIG.
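The analysis of FIG. 11-style markup can be sketched with the standard-library HTML parser: each `<A>` tag's text is taken as a headline and its `HREF` as the link from which the content would then be fetched. The markup below is invented for illustration, not the actual FIG. 11 text.

```python
from html.parser import HTMLParser

class HeadlineParser(HTMLParser):
    """Collect (href, headline) pairs from every <A> tag in the markup."""

    def __init__(self):
        super().__init__()
        self.links = []       # (href, headline) pairs, in document order
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":        # html.parser lowercases tag and attribute names
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

html = ('<ul><li><a href="news1.html">Rakuten Nihon Series first advance</a></li>'
        '<li><a href="news2.html">Darvish misses no-hit no-run</a></li></ul>')
parser = HeadlineParser()
parser.feed(html)
```

After parsing, `parser.links` pairs each headline with the file the acquisition unit would fetch to obtain the corresponding news text.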
- the extraction unit 12 extracts recognition target word candidates from all headings acquired by the analysis unit 11 (step ST03).
- the extraction method any method may be used as described in the description of the extraction unit 12 described above.
- In the following, description will be given assuming that each heading is divided into words by morphological analysis and the reading of the first word is extracted as the recognition target word candidate for the heading.
- Next, the target word determination unit 13 determines whether or not overlapping recognition target word candidates exist among the recognition target word candidates extracted by the extraction unit 12 (step ST04).
- When overlapping candidates exist, at least one recognition target word candidate among them is processed or replaced by the above-described processing method or replacement method (step ST05), and the candidate after processing or replacement is determined as a speech recognition target word (step ST06).
- the recognition target word candidates extracted by the extraction unit 12 are determined as speech recognition target words as they are (step ST06).
- here, since the recognition target word candidate “Rakudai” is duplicated, the above-described processing method adds, to the front of each overlapping candidate, a word representing the relative display position or voice output order of the heading from which it was extracted: the candidate for the heading “Rakudai Nihon Series First Advance” becomes “Ichibanme no Rakudai” (the first Rakudai), the candidate for the heading “Rakudai's Nakata wants to transfer to the major leagues” becomes “Nibanme no Rakudai” (the second Rakudai), and each is determined as a recognition target word.
- the dictionary generation unit 14 generates a speech recognition dictionary 15 that uses the recognition target word determined by the target word determination unit 13 as a recognition vocabulary (step ST07).
- here, the speech recognition dictionary 15 is generated with “Ichibanme no Rakudai”, “Darubi”, and “Nibanme no Rakudai” as the recognition vocabulary.
- a rule may be determined in advance, such as adding a word representing the relative display position or voice output order to the front of each candidate. When such a rule is used, voice guidance may also be provided, for example asking the user which one should be read. As a result, the user can be informed that “Ichibanme no Rakudai” and “Nibanme no Rakudai” are the recognition target words.
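- the ordinal-prefix disambiguation described above can be illustrated with a short sketch (a minimal illustration only; the function name, the Latin-script readings, and the fixed prefix list are assumptions, not part of the embodiment):

```python
# Hypothetical sketch of the target word determination step in Embodiment 1:
# when candidate readings extracted from headings overlap, a word giving the
# relative display order ("ichibanme no" = first, "nibanme no" = second, ...)
# is prepended so that every recognition target word becomes unique.
from collections import Counter

ORDINALS = ["ichibanme no", "nibanme no", "sanbanme no"]  # assumed prefixes

def determine_target_words(candidates):
    """candidates: list of candidate readings, one per heading (in display order)."""
    counts = Counter(candidates)
    seen = Counter()
    result = []
    for word in candidates:
        if counts[word] > 1:  # duplicated candidate -> process it
            prefix = ORDINALS[seen[word]]
            seen[word] += 1
            result.append(f"{prefix} {word}")
        else:  # unique candidate is used as-is
            result.append(word)
    return result

print(determine_target_words(["rakudai", "darubi", "rakudai"]))
# -> ['ichibanme no rakudai', 'darubi', 'nibanme no rakudai']
```

- the unique words returned here would then become the recognition vocabulary of the speech recognition dictionary 15.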
- when duplicated recognition target word candidates are replaced with paraphrase words or the like, the paraphrase words may, for example, be displayed in a different color or highlighted in order to present to the user which words are the recognition target words.
- for example, if the recognition target word candidate for the heading “Rakudai Nihon Series First Advance” is replaced with “Nihonshirizu” and the candidate for the heading “Rakudai's Nakata wants to transfer to the major leagues” is replaced with “Nakata”, the words “Nihon Series” and “Nakata” in the headings displayed on the display 3 may be displayed in different colors or highlighted.
- the control unit 16 acquires a recognition target word, a headline, and contents from the target word determination unit 13 (step ST11), and stores them in the storage unit 17 (step ST12).
- the recognition target word corresponding to each headline, each headline, and the content (news text) corresponding to each headline stored in the storage unit 17 in this way are as shown in FIG.
- the control unit 16 outputs an instruction to the display 3 so as to display (present) the headline acquired from the target word determination unit 13 (step ST13).
- the headline is displayed on the display 3 as shown in FIG.
- the headings may also be presented by audio output, or by both display and audio output.
- the voice recognition unit 18 recognizes the voice acquired by the microphone 5 and outputs a recognition result character string to the control unit 16 (step ST21). For example, when the user utters “Ichibanme no Rakudai”, the voice recognition unit 18 performs recognition processing with reference to the speech recognition dictionary 15 and outputs the character string “Ichibanme no Rakudai” as the recognition result.
- control unit 16 acquires the recognition result character string output by the voice recognition unit 18, and acquires content information corresponding to the acquired recognition result character string (step ST22). That is, the storage unit 17 is searched using the acquired recognition result character string as a search key. Then, the heading corresponding to the recognition target word that matches the search key is selected, and the content corresponding to the heading is acquired.
- here, the control unit 16 acquires from the storage unit 17 the character string of the content (news text) corresponding to the recognition result character string “Ichibanme no Rakudai” output by the voice recognition unit 18: “Rakudai won today's eastern game 1 to 0, so its first advance to the Japan Series has been decided.”
- the control unit 16 outputs an instruction to the speech synthesizer 19 to generate synthesized speech from the character string of the acquired content (news text) (step ST23), acquires the synthesized speech generated by the speech synthesizer 19, and outputs an instruction to the speaker 4 to output (present) the content (news text) as the synthesized speech (step ST24).
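- steps ST21 to ST24 amount to a table lookup keyed by the recognition result. A minimal sketch follows (the dictionary contents and function name are illustrative assumptions):

```python
# Hypothetical sketch of steps ST21-ST24: the recognition result character
# string is used as a search key into the stored table (the storage unit 17),
# and the matching content would then be passed to speech synthesis.
store = {  # target word -> (heading, content); entries are illustrative
    "ichibanme no rakudai": ("Rakudai Nihon Series First Advance",
                             "Rakudai won today's game 1 to 0 ..."),
    "darubi": ("Dalbi misses a no-hit no-run",
               "Darubi missed a no-hit no-run ..."),
}

def present_content(recognition_result):
    # select the heading whose recognition target word matches the result,
    # then return the content associated with that heading
    heading, content = store[recognition_result]
    return content
```

- in the actual system, the returned character string would be handed to the speech synthesizer 19 rather than returned to the caller.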
- as described above, the information providing system 1, which presents a plurality of headings, selects one heading from among them, and presents the information (content) corresponding to the selected heading, determines the recognition target words by processing them so that the recognition target words for selecting a heading do not overlap; the user can therefore uniquely select a heading by utterance and acquire the information (content) corresponding to the selected heading.
- that is, the reading of the first word of each heading is extracted as the recognition target word candidate for that heading, and overlapping candidates are processed so that they no longer overlap. As a result, a heading is uniquely selected by the user's utterance, the information (content) corresponding to the selected heading can be presented, and convenience for the user is improved.
- Embodiment 2.
- in the first embodiment, when recognition target word candidates for selecting a heading overlap, candidates processed in advance so as to differ from one another are used as the new recognition target words for selecting a heading, and the overlapping recognition target word candidates are treated as not being recognition target words. Therefore, when the user utters an overlapping recognition target word candidate, misrecognition occurs.
- for example, as shown in FIG. 6, since the recognition target word candidate “Rakudai” is duplicated, the recognition target word for the first heading becomes “Ichibanme no Rakudai” and the recognition target word for the third heading becomes “Nibanme no Rakudai”; hence, even if the user utters “Rakudai”, it is a character string not included in the speech recognition dictionary 15 and the utterance ends without being recognized.
- in the information providing system 20, when a recognition target word existing in a plurality of headings is uttered by the user, the headings including that recognition target word are first displayed in a display mode different from the other headings, clearly indicating to the user that the candidates have been narrowed down to the headings including that recognition target word. As a result, the user can be informed that the recognition target words are duplicated.
- then, when the user utters a recognition target word that does not overlap among the narrowed-down headings, the information providing system 20 selects the heading that includes the uttered recognition target word. Thereafter, an instruction is output to the display or the like to present the content corresponding to the selected heading by display or audio output.
- for example, suppose three headings are displayed on the display 3: “Rakudai Nihon Series First Advance”, “Dalbi misses a no-hit no-run”, and “Rakudai's Nakata wants to transfer to the major leagues”.
- when the user utters “Rakudai”, the display 3 is instructed to gray out the heading “Dalbi misses a no-hit no-run”, which does not include “Rakudai”. As a result, the second heading is grayed out and becomes less conspicuous, while the headings “Rakudai Nihon Series First Advance” and “Rakudai's Nakata wants to transfer to the major leagues” remain displayed brightly, clearly showing that the candidates have been narrowed down to these two headings.
- FIG. 12 is a diagram showing an example of a display screen displayed on the display in a state where news headlines are narrowed down by the information providing system according to the second embodiment.
- when the user subsequently utters “Nakata” in this state, the news text corresponding to the heading “Rakudai's Nakata wants to transfer to the major leagues” is output from the speaker 4 or displayed on the display 3.
- FIG. 13 is a block diagram showing an example of an information providing system according to Embodiment 2 of the present invention.
- the same reference numerals are attached to the same components as in the first embodiment, and their description is omitted.
- the extraction unit 22 extracts a plurality of recognition target word candidates for each headline acquired from the analysis unit 11.
- FIG. 14 is a table showing a result of the extraction unit 22 extracting a plurality of recognition target word candidates from each heading shown in FIG. 5 and associating the recognition target word candidates with each heading.
- the heading “Rakudai Nihon Series First Advance” is associated with “Rakudai” as the first recognition target word candidate and “Nihonshirizu” as the second recognition target word candidate.
- the heading “Dalbi misses a no-hit no-run” is associated with “Darubi” as the first recognition target word candidate and “No-Hitto” as the second recognition target word candidate.
- the heading “Rakudai Nakata wants to move to the major league” is associated with “Rakudai” as the first recognition target word candidate and “Nakata” as the second recognition target word candidate.
- the second recognition target word candidate is not limited to a word contained in the heading; for example, the second candidate for the first heading “Rakudai Nihon Series First Advance” may be a word representing its relative position, such as “Ichibanme” (the first) or “Ue no hou” (the upper one), and the second candidate for the third heading “Rakudai Nakata wants to move to the major league” may be “Sanbanme” (the third) or the like.
- alternatively, a word representing the category to which each heading belongs may be used as the second recognition target word candidate. For example, when the headings acquired by the analysis unit 11 include “Notice from Rakudai of the arrival of the holiday season”, “Dalbi misses a no-hit no-run”, and “Rakudai Nakata wants to move to the major league”, a word corresponding to the category “travel” to which the first heading belongs may be used as its second recognition target word candidate, and a word corresponding to the category “baseball” to which the third heading belongs may be used as its second recognition target word candidate. Note that the word is not limited to the category to which the heading belongs, and may be any word that represents the intent of the heading.
- for simplicity of explanation, the following description assumes that the extraction unit 22 extracts two recognition target word candidates per heading, but the present invention is not limited to this.
- the target word determination unit 23 acquires from the extraction unit 22 a plurality of headings, the content corresponding to each heading, and the recognition target word candidates for each heading, and dynamically generates the recognition target word for each heading.
- when the target word determination unit 23 receives an instruction from the control unit 26 to re-determine (regenerate) the recognition target word corresponding to a heading (selected heading) specified by the control unit 26, it determines, for at least one such heading, the second recognition target word candidate associated with that heading as the new recognition target word. That is, the recognition target word of each heading is dynamically regenerated so that the recognition target words of the headings differ from one another.
- the target word determination unit 23 may also hold a list of recognition vocabulary related to operations other than heading selection (for example, a list of operation commands for operating a navigation device or other in-vehicle devices), and determine a new recognition target word so that it does not match or resemble the words included in that list.
- for example, when the extraction unit 22 extracts “Rakudai” as the first recognition target word candidate and “Eakon” (air conditioner) as the second candidate for a heading, and “Eakon” matches an operation command of an in-vehicle device, the target word determination unit 23 does not determine such a word as the new recognition target word for this heading.
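- the command-collision check just described can be sketched as follows (the reserved-command set, readings, and function name are assumptions for illustration):

```python
# Hypothetical sketch: when re-determining a heading's recognition target
# word, a candidate is rejected if it matches a reserved operation command
# (e.g. commands for a navigation device or other in-vehicle equipment).
RESERVED_COMMANDS = {"eakon", "audio", "navi"}  # assumed command vocabulary

def pick_new_target(candidates):
    """candidates: ordered later candidates (second, third, ...) for one heading."""
    for word in candidates:
        if word not in RESERVED_COMMANDS:
            return word  # first candidate that does not collide with a command
    return None  # no usable candidate for this heading
```

- a fuzzy-similarity test could replace the exact set membership here, matching the "does not match or resemble" wording above.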
- the dictionary generation unit 14 generates a speech recognition dictionary 15 that uses the recognition target word determined by the target word determination unit 23 as a recognition vocabulary.
- the control unit 26 acquires a heading, a recognition target word corresponding to the heading, and content corresponding to the heading from the target word determination unit 23, and stores them in the storage unit 17.
- FIG. 15 is a table showing an example of recognition target words for each headline, each headline, and content (news text) corresponding to each headline stored in the storage unit 17 in the second embodiment.
- FIG. 15 shows the case where the target word determination unit 23 first determines the first recognition target word candidate of each heading as the recognition target word, outputs it to the control unit 26, and stores it in the storage unit 17. That is, the recognition target word of the heading “Rakudai Nihon Series First Advance” is “Rakudai”, that of the heading “Dalbi misses a no-hit no-run” is “Darubi”, and that of the heading “Rakudai Nakata wants to move to the major league” is “Rakudai”.
- control unit 26 outputs an instruction to the display or the like so as to present the headline acquired from the target word determination unit 23. Specifically, an instruction is output to the display 3 so that the headline acquired from the target word determination unit 23 is displayed. Alternatively, after an instruction is output to the speech synthesizer 19 to generate a synthesized speech corresponding to the heading, an instruction is output to the speaker 4 to output the synthesized speech generated by the speech synthesizer 19.
- the instruction from the control unit 26 to present the headings may be performed by at least one of display output and audio output, or by both. Moreover, since a known technique may be used for the speech synthesis method of the speech synthesizer 19, a description thereof is omitted.
- control unit 26 searches the storage unit 17 using the recognition result character string output by the voice recognition unit 18 as a search key.
- as a result, when there are a plurality of recognition target words that match the search key, the target word determination unit 23 is instructed to re-determine the recognition target words for the headings corresponding to those matching words. That is, when the speech recognition result matches a plurality of the recognition target words generated by the target word determination unit 23, the control unit 26 narrows down to the two or more headings having the same recognition target word and instructs the target word determination unit 23 to regenerate, for the narrowed-down headings, another recognition target word candidate (the second candidate) different from the overlapping candidate (the first candidate) as the recognition target word.
- for example, when the user utters “Rakudai”, the voice recognition unit 18 outputs “Rakudai” as the recognition result character string, and the storage unit 17 is searched using this “Rakudai” as the search key. When the table shown in FIG. 15 is searched, the recognition target word of the first heading matches that of the third heading, so the control unit 26 outputs an instruction to the target word determination unit 23 to re-determine the recognition target words.
- as a result, the target word determination unit 23 re-determines the recognition target words: the recognition target word of the heading “Rakudai Nihon Series First Advance” is re-determined as the second candidate “Nihonshirizu”, the recognition target word of the heading “Dalbi misses a no-hit no-run” remains “Darubi”, and the recognition target word of the heading “Rakudai Nakata wants to move to the major league” is re-determined as the second candidate “Nakata”; these are output to the control unit 26 and the dictionary generation unit 14.
- the dictionary generation unit 14 again generates the speech recognition dictionary 15 that uses the recognition target word determined by the target word determination unit 23 as the recognition vocabulary.
- the control unit 26 acquires the recognition target word redetermined by the target word determination unit 23 and updates the recognition target word for each heading stored in the storage unit 17 to the redetermined recognition target word.
- FIG. 16 shows an example of the recognition target word for each headline stored in the storage unit 17, each headline, and the content (news text) corresponding to each headline after the recognition target word is determined again in the second embodiment. It is a table.
- the control unit 26 also outputs an instruction to the display 3 so that the headings corresponding to the recognition target words matching the search key are displayed in a display mode different from the other headings. As a result, a display screen in which the news headings are narrowed down, as shown in FIG. 12, is displayed on the display 3.
- when the control unit 26 searches the storage unit 17 using the recognition result character string output by the voice recognition unit 18 as the search key and finds exactly one recognition target word that matches the search key (not a plurality), it selects the heading corresponding to that recognition target word and acquires the content corresponding to the heading. That is, when the speech recognition result matches a single recognition target word generated by the target word determination unit 23, the control unit 26 selects the heading whose recognition target word is the matched recognition result and acquires the content corresponding to the selected heading.
- an instruction is output to the display 3 to display the acquired content.
- an instruction is output to the voice synthesizer 19 to generate a synthesized voice using the acquired content, and an instruction is output to the speaker 4 to output the synthesized voice generated by the voice synthesizer 19.
- the instruction from the control unit 26 to present the content may be performed by at least one of display output and audio output, or by both.
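- the control flow of Embodiment 2 (narrow down on a multiple match, select on a unique match) can be sketched as follows (the data layout and names are assumptions; in the actual system this table corresponds to the storage unit 17):

```python
# Hypothetical sketch of the Embodiment 2 control flow: if the recognized
# word matches several headings' target words, narrow down and promote those
# headings to their second candidates; if it matches exactly one, select it.

def handle_recognition(result, table):
    """table: list of dicts {'heading', 'content', 'target', 'second'}."""
    matches = [row for row in table if row["target"] == result]
    if len(matches) > 1:
        for row in matches:  # re-determine: promote the second candidates
            row["target"] = row["second"]
        return ("narrowed", [row["heading"] for row in matches])
    elif matches:
        return ("selected", matches[0]["content"])
    return ("unrecognized", None)
```

- a first call with an overlapping word returns the narrowed-down headings (which would be highlighted on the display 3), and a second call with a now-unique word returns the selected content.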
- FIG. 17 is a flowchart showing operations up to the generation of the speech recognition dictionary in the information providing system 20 according to the second embodiment.
- FIG. 18 is a flowchart showing operations for storing / updating content and the like in the storage unit 17 and presenting the content in the information providing system 20 according to the second embodiment.
- the information providing system 20 will be described on the assumption that the headline is displayed on the display and the content corresponding to the headline selected by the user utterance is output as audio. This is in consideration of a case where the user cannot read the content even if a detailed news text is displayed on the display, for example, during driving of the vehicle.
- when the display has a considerably large screen, or when the user is not in a situation that demands attention elsewhere (such as driving a vehicle), the content corresponding to the headline selected by the user's utterance may be displayed on the display. Both display and audio output may also be performed.
- the acquisition unit 10 acquires content information via the network 2 (step ST31).
- content information (news information consisting of text information) as shown in FIG. 11 is acquired.
- the HTML file designated by the “HREF” attribute is acquired via the acquisition unit 10 and the news text (content) described in the file is acquired.
- the headline and news text (content) are acquired in the same manner for the content specified by all <A> tags.
- An example of each headline and content (news text) acquired by the analysis unit 11 is as shown in FIG.
- the extraction unit 22 extracts a plurality of recognition target word candidates from all headings acquired by the analysis unit 11 (step ST33).
- as a result, “Rakudai” and “Nihonshirizu” are extracted from the heading “Rakudai Nihon Series First Advance”, “Darubi” and “Nohitto” from the heading “Dalbi misses a no-hit no-run”, and “Rakudai” and “Nakata” from the heading “Rakudai Nakata wants to move to the major league”, each as recognition target word candidates.
- next, the target word determination unit 23 acquires the plurality of recognition target word candidates, the headings, and the contents extracted by the extraction unit 22, and, regardless of whether there are overlapping recognition target word candidates, determines the first candidate for each heading as the speech recognition target word (step ST34).
- that is, “Rakudai” for the heading “Rakudai Nihon Series First Advance”, “Darubi” for the heading “Dalbi misses a no-hit no-run”, and “Rakudai” for the heading “Rakudai Nakata wants to move to the major league” are determined as the recognition target words corresponding to the respective headings.
- the dictionary generation unit 14 generates a speech recognition dictionary 15 that uses the recognition target word determined by the target word determination unit 23 as a recognition vocabulary (step ST35).
- in this case, the speech recognition dictionary 15 having “Rakudai” and “Darubi” as its recognition vocabulary is generated.
- the voice recognition unit 18 recognizes the voice acquired by the microphone 5 and outputs a recognition result character string to the control unit 26 (step ST41). For example, when the user utters “Rakudai”, the voice recognition unit 18 performs recognition processing with reference to the speech recognition dictionary 15 and outputs the character string “Rakudai” as the recognition result.
- control unit 26 acquires the recognition result character string output by the voice recognition unit 18 and determines whether or not there are a plurality of recognition target words that match the acquired recognition result character string in the storage unit 17. (Step ST42). That is, the storage unit 17 is searched using the acquired recognition result character string as a search key.
- when there are a plurality of recognition target words that match the search key (“YES” in step ST42), an instruction is output to the display 3 so that the headings corresponding to those recognition target words are displayed in a display mode different from the other headings (step ST43). As a result, the headings are displayed on the display 3 as shown in FIG. 12.
- the control unit 26 also instructs the target word determination unit 23 to re-determine the recognition target words for the headings corresponding to the recognition target words that match the search key. In response to the instruction from the control unit 26, the target word determination unit 23 newly determines the second candidate among the plurality of recognition target word candidates associated with each heading as the recognition target word (step ST44).
- the control unit 26 acquires the recognition target words re-determined by the target word determination unit 23 and updates the recognition target words for the headings stored in the storage unit 17 (step ST45). As a result, the contents of the storage unit 17 are updated from those shown in FIG. 15 to those shown in FIG. 16.
- the dictionary generation unit 14 generates a speech recognition dictionary 15 that uses the recognition target word determined by the target word determination unit 23 as a recognition vocabulary (step ST46), and ends the process.
- as a result, the speech recognition dictionary 15 having “Nihonshirizu”, “Darubi”, and “Nakata” as its recognition vocabulary is generated.
- at this time, the words “Nippon Series” and “Nakata” in the headings displayed on the display 3 may be displayed in different colors or highlighted, so that the user can be informed that the recognition target word of the first heading is “Nihonshirizu” and that of the third heading is “Nakata”.
- the control unit 26 selects the heading corresponding to the recognition target word that matches the search key and acquires the content corresponding to that heading. That is, the content information corresponding to the recognition result character string output by the voice recognition unit 18 is acquired (step ST47).
- for example, when the user utters “Nakata”, the voice recognition unit 18 performs recognition processing with reference to the speech recognition dictionary 15 and outputs the recognition result character string “Nakata”. Since there are not a plurality of recognition target words matching “Nakata” in the storage unit 17, the control unit 26 acquires from the storage unit 17 the content (news text) corresponding to the recognition result character string “Nakata”: “Rakudai's Nakata announced at a press conference this afternoon that he hopes to transfer to the major leagues.”
- the control unit 26 outputs an instruction to the speech synthesizer 19 to generate synthesized speech from the character string of the acquired content (news text) (step ST48), acquires the synthesized speech generated by the speech synthesizer 19, and outputs an instruction to the speaker 4 to output (present) the content (news text) as the synthesized speech (step ST49).
- as described above, the information providing system 20, which presents a plurality of headings, selects one heading from among them, and presents the information (content) corresponding to the selected heading, determines, when the recognition target words for selecting a heading overlap, new non-overlapping recognition target words for the headings after recognizing the user's utterance of the overlapping recognition target word. The user can therefore acquire the information (content) corresponding to the heading he or she selects.
- in the second embodiment, it has been described that a plurality of recognition target word candidates are extracted first. However, only the first candidate may be extracted initially to determine the recognition target words, and when a re-determination instruction is received from the control unit 26, the next candidate (second candidate) may be extracted and determined as the recognition target word.
- further, when the re-determined recognition target words still overlap, the recognition target words of the headings may again be dynamically generated so that the recognition target words of the headings differ from one another.
- for example, suppose that the headings have been narrowed down to five, and that the target word determination unit 23 determines the second recognition target word candidate associated with each of the five narrowed-down headings as the recognition target word for that heading, regardless of whether those second candidates overlap; as a result, suppose the recognition target word “B” overlaps between two headings. In this case, when the user utters “B”, the control unit 26 narrows down to the two headings including the recognition target word “B” and instructs the target word determination unit 23 to regenerate the recognition target words. The target word determination unit 23 that has received the instruction then dynamically generates the recognition target words of the headings so that, among the candidates associated with the narrowed-down headings, only the duplicated ones are replaced and the recognition target words of the headings differ from one another.
- in the above description, the target word determination unit 23 dynamically generates the recognition target words of the headings so that they differ from one another whenever the recognition target word candidates overlap and only the duplicated candidates are regenerated; however, the recognition target words of the headings may instead be dynamically generated so that they differ from one another only when narrowing down has been performed a predetermined number of times or more (when the recognition target word regeneration instruction from the control unit 26 has been issued a predetermined number of times or more).
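- the threshold variant above can be sketched like this (the threshold value, names, and the ordinal-suffix fallback are assumptions for illustration):

```python
# Hypothetical sketch: a counter tracks how many times the control unit has
# requested regeneration; only at or beyond a threshold are the target words
# forced to be mutually distinct (here by appending an ordinal-style suffix).
REGEN_THRESHOLD = 2  # assumed "predetermined number" of narrowing operations

def regenerate(words, regen_count):
    """words: next-candidate readings for the narrowed-down headings."""
    if regen_count < REGEN_THRESHOLD:
        return words  # duplicates tolerated; another narrowing may follow
    unique, seen = [], set()
    for i, w in enumerate(words, start=1):
        if w in seen:
            w = f"{i}banme no {w}"  # force distinctness on later duplicates
        seen.add(w)
        unique.append(w)
    return unique

print(regenerate(["b", "b"], 1))  # -> ['b', 'b']
print(regenerate(["b", "b"], 2))  # -> ['b', '2banme no b']
```

- below the threshold the system simply narrows down again; at the threshold every remaining heading receives a distinct word, guaranteeing termination of the selection dialog.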
- the information providing system of the present invention may be applied to any device or apparatus that presents a plurality of headings, selects one heading from among the plurality of headings, and presents the information (content) corresponding to the selected heading.
- the component of this information provision system may be distributed to a server on a network, a mobile terminal such as a smartphone, an in-vehicle device, and the like.
- 1, 20 information providing system, 2 network, 3 display, 4 speaker, 5 microphone, 10 acquisition unit, 11 analysis unit, 12, 22 extraction unit, 13, 23 target word determination unit, 14 dictionary generation unit, 15 speech recognition dictionary, 16, 26 control unit, 17 storage unit, 18 speech recognition unit, 19 speech synthesis unit, 100 bus, 101 CPU, 102 ROM, 103 RAM, 104 input device, 105 communication device, 106 HDD, 107 output device.
Description
The information providing system of the present invention presents a plurality of headings, selects one heading from among the plurality of headings, and presents information (content) corresponding to the selected heading.
FIG. 1 is an explanatory diagram for outlining the operation of the information providing system according to Embodiment 1 of the present invention. The information providing system 1 acquires information (content information) via the network 2, and outputs an instruction to the display 3 or the speaker 4 (hereinafter referred to as "the display or the like") to present the headings of the content included in the acquired content information by display or audio output.
Then, when a recognition target word included in a heading is uttered by the user, the information providing system 1 selects the heading including that recognition target word and outputs an instruction to the display or the like to present the news text corresponding to the selected heading.
The RAM 103 is a memory used during program execution.
The input device 104 receives user input, and is, for example, a microphone, a remote controller, or a touch sensor.
The output device 107 includes a speaker, a liquid crystal display, an organic EL display, and the like.
FIG. 5 is a table showing an example of the plurality of headings acquired by the analysis unit 11 and the content (news text) corresponding to each heading.
Note that other extraction methods may be used, for example, extracting the readings of frequently uttered words and the like using the user's utterance history. The same applies to the following embodiments.
As shown in FIG. 6, here the recognition target word candidates are extracted as follows: the candidate for the heading "Rakudai Nihon Series First Advance" is "Rakudai", the candidate for the heading "Dalbi misses a no-hit no-run" is "Darubi", and the candidate for the heading "Rakudai's Nakata wants to transfer to the major leagues" is "Rakudai".
On the other hand, when there are no overlapping recognition target word candidates, the target word determination unit 13 determines the recognition target word candidates acquired from the extraction unit 12, as they are, as the recognition target words corresponding to the headings.
The voice recognition unit 18 recognizes the voice collected by the microphone 5 with reference to the speech recognition dictionary 15 and outputs a recognition result character string. Since a known technique may be used for the voice recognition method of the voice recognition unit 18, a description thereof is omitted.
FIG. 7 is a table showing an example of the recognition target word of each heading, each heading, and the content (news text) corresponding to each heading stored in the storage unit 17 in Embodiment 1.
Since a known technique may be used for the speech synthesis method of the speech synthesizer 19, a description thereof is omitted.
First, the processing method of recognition target word candidates by the target word determination unit 13 will be described.
When overlapping recognition target word candidates exist among the recognition target word candidates of the headings acquired from the extraction unit 12, the target word determination unit 13 processes at least one of the overlapping candidates and determines the processed candidate as the recognition target word corresponding to its heading, thereby dynamically generating the recognition target words so that the recognition target words of the headings differ from one another.
When overlapping recognition target word candidates exist among the recognition target word candidates of the headings acquired from the extraction unit 12, the target word determination unit 13 replaces at least one of the overlapping candidates with another paraphrase word (another recognition target word candidate) and determines the replaced candidate as the recognition target word corresponding to its heading, thereby dynamically generating the recognition target words so that the recognition target words of the headings differ from one another.
Note that the word is not limited to the category to which the heading belongs, and may be any word that represents the intent of the heading.
First, the acquisition unit 10 acquires content information via the network 2 (step ST01). Here, it is assumed that content information (news information consisting of text information) as shown in FIG. 11 is acquired.
Specifically, the analysis unit 11 analyzes the structure of the content information (news information consisting of text information) acquired by the acquisition unit 10, and acquires "Rakudai Nihon Series First Advance" as the heading of this news (content) from "<A HREF="news1.html">Rakudai Nihon Series First Advance</A>" specified by the <A> tags.
An example of each heading and content (news text) acquired by the analysis unit 11 is as shown in FIG. 5.
The result of extracting the recognition target word candidates from each heading in this way is as shown in FIG. 6.
When there are overlapping recognition target word candidates ("YES" in step ST04), at least one of the overlapping recognition target word candidates is processed or replaced by the above-described processing method or replacement method (step ST05), and the processed or replaced recognition target word candidate is determined as the speech recognition target word (step ST06).
In this case, the speech recognition dictionary 15 having "Ichibanme no Rakudai", "Darubi", and "Nibanme no Rakudai" as its recognition vocabulary is generated.
First, the control unit 16 acquires the recognition target words, the headings, and the contents from the target word determination unit 13 (step ST11) and stores them in the storage unit 17 (step ST12).
The recognition target words corresponding to the headings, the headings, and the contents (news texts) corresponding to the headings stored in the storage unit 17 in this way are as shown in FIG. 7.
As a result, the headings are displayed on the display 3 as shown in FIG. 2.
Note that, although the description here assumes that the headings are presented by display, the headings may be presented by audio output, or by both display and audio output.
First, the voice recognition unit 18 recognizes the voice acquired by the microphone 5 and outputs a recognition result character string to the control unit 16 (step ST21).
For example, when the user utters "Ichibanme no Rakudai", the voice recognition unit 18 performs recognition processing with reference to the speech recognition dictionary 15 and outputs the character string "Ichibanme no Rakudai" as the recognition result.
As described above, the description here assumes that the character string of the content (news text) corresponding to the heading selected by the user is output as audio; however, based on the content information acquired by the control unit 16 in step ST22, the character string of the acquired content (news text) may be displayed on the display 3. Both display and audio output may also be performed.
実施の形態1においては、見出しを選択するための認識対象語候補が重複する場合、あらかじめ各認識対象語候補が異なるように加工等したものを、見出しを選択するための新たな認識対象語とし、重複する認識対象語候補は、認識対象語ではないものとして扱われる。そのため、ユーザが当該重複する認識対象語候補を発話した場合、誤認識となってしまう。
この状態において、続いてユーザにより「なかた」と発話されると、見出し「楽大の中田が大リーグへ移籍を希望」に対応するニュース本文が、スピーカ4から音声出力されたり、ディスプレイ3に表示されたりする。
FIG. 14 is a table showing the result of the extracting unit 22 extracting a plurality of recognition target word candidates from each headline shown in FIG. 5 and associating those candidates with each headline.
Note that, as before, the candidates are not limited to the category to which the headline belongs; any word or the like that expresses the intent of the headline may be used.
To simplify the description, it is assumed hereinafter that the extracting unit 22 extracts two recognition target word candidates per headline, but the number is not limited to two.
In Embodiment 2, before dynamically generating the recognition target words of the headlines so that they differ from one another, the target word determining unit 23 first determines the first of the acquired recognition target word candidates as the recognition target word, regardless of whether duplicate candidates exist among the recognition target word candidates of the headlines acquired from the extracting unit 22.
The control unit 26 acquires, from the target word determining unit 23, each headline, the recognition target word corresponding to that headline, and the content likewise corresponding to that headline, and stores them in the storage unit 17.
FIG. 15 is a table showing an example of the recognition target word of each headline, each headline, and the content (news body) corresponding to each headline, as stored in the storage unit 17 in Embodiment 2.
Likewise, since a known technique may be used as the speech synthesis method of the speech synthesis unit 19, its description is omitted here.
The control unit 26 also acquires the recognition target words re-determined by the target word determining unit 23, and updates the recognition target word for each headline stored in the storage unit 17 to the re-determined recognition target word.
FIG. 16 is a table showing an example of the recognition target word of each headline, each headline, and the content (news body) corresponding to each headline, as stored in the storage unit 17 after the recognition target words have been re-determined in Embodiment 2.
As a result, a display screen on which the news headlines have been narrowed down, as shown in FIG. 12, is displayed on the display 3.
The news bodies, which constitute the contents, are described in "news1.html", "news2.html", and "news3.html"; as shown in FIG. 5, they read 「楽大が今日の東部戦に1対0で勝利したため、初の日本シリーズ進出が決定しました。」, 「ダルビは9回ツーアウトの状況で4番マークにヒットを打たれノーヒットノーランを逃しました。」, and 「楽大の中田が今日の午後、記者会見で大リーグへの移籍を希望していると発表しました。」, respectively.
First, the acquiring unit 10 acquires content information via the network 2 (step ST31). Here, it is assumed that content information (news information consisting of text information) as shown in FIG. 11 is acquired.
Specifically, the analyzing unit 11 analyzes the structure of the content information (news information consisting of text information) acquired by the acquiring unit 10, and from 「<A HREF="news1.html">楽大日本シリーズ初進出</A>」 delimited by the <A> and </A> tags, acquires 「楽大日本シリーズ初進出」 as the headline of this news item (content).
FIG. 5 shows an example of the headlines and contents (news bodies) acquired by the analyzing unit 11.
As a result, as shown in FIG. 14, 「らくだい」 and 「にほんしりーず」 are extracted as recognition target word candidates from the headline 「楽大日本シリーズ初進出」, 「だるび」 and 「のーひっと」 from the headline 「ダルビがノーヒットノーラン逃す」, and 「らくだい」 and 「なかた」 from the headline 「楽大の中田が大リーグへ移籍を希望」.
In this case, a speech recognition dictionary 15 whose recognition vocabulary is 「らくだい」 and 「だるび」 is generated.
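The first-candidate selection described here can be sketched as follows, using the FIG. 14 example data. `initial_vocabulary` is a hypothetical helper; duplicated first candidates simply collapse into a single vocabulary entry, which is why the initial dictionary holds only two words for three headlines.

```python
# Each headline keeps two candidate readings (the FIG. 14 example data).
candidates = {
    "楽大日本シリーズ初進出": ["らくだい", "にほんしりーず"],
    "ダルビがノーヒットノーラン逃す": ["だるび", "のーひっと"],
    "楽大の中田が大リーグへ移籍を希望": ["らくだい", "なかた"],
}

def initial_vocabulary(candidates):
    """Adopt each headline's first candidate as its recognition target word,
    regardless of duplication; duplicates yield one vocabulary entry."""
    vocab = []
    for cands in candidates.values():
        if cands[0] not in vocab:
            vocab.append(cands[0])
    return vocab

print(initial_vocabulary(candidates))  # ['らくだい', 'だるび']
```

Keeping the duplicated first candidate in the vocabulary is the point of Embodiment 2: the user can say the natural word 「らくだい」, and disambiguation is deferred to the narrowing step.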
As a result of performing the processing according to the flowchart shown in FIG. 9, the recognition target word corresponding to each headline, each headline, and the content (news body) corresponding to each headline are stored in the storage unit 17 as shown in FIG. 15.
First, the speech recognition unit 18 recognizes the speech acquired by the microphone 5, and outputs a recognition result character string to the control unit 26 (step ST41).
For example, when the user utters 「らくだい」, the speech recognition unit 18 performs recognition processing with reference to the speech recognition dictionary 15 and outputs the character string 「らくだい」 as the recognition result.
As a result, the headlines are displayed on the display 3 as shown in FIG. 12.
As a result, the information stored in the storage unit 17 is updated from the contents shown in FIG. 15 to the contents shown in FIG. 16.
As a result, a speech recognition dictionary 15 whose recognition vocabulary is 「にほんしりーず」, 「だるび」, and 「なかた」 is generated.
As described above, although the character string of the content (news body) corresponding to the headline selected by the user is output by voice in this explanation, the control unit 26 may instead cause the display 3 to display the character string of the content (news body) acquired in step ST47, on the basis of the acquired content information. Both display and voice output may also be performed.
Thereafter, having received this instruction, the target word determining unit 23 dynamically generates the recognition target words of the headlines so that they differ from one another, since the narrowed-down headlines are now only those whose first recognition target word candidates were duplicated, and their second candidates are used instead.
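The narrow-and-regenerate step of Embodiment 2 can be sketched as follows, again with the FIG. 14 example data. `narrow_and_regenerate` is a hypothetical helper: when the recognized word was the first candidate of two or more headlines, it returns the re-determined vocabulary corresponding to FIG. 16 (second candidates for the narrowed headlines, first candidates for the rest).

```python
# Two candidate readings per headline (the FIG. 14 example data).
candidates = {
    "楽大日本シリーズ初進出": ["らくだい", "にほんしりーず"],
    "ダルビがノーヒットノーラン逃す": ["だるび", "のーひっと"],
    "楽大の中田が大リーグへ移籍を希望": ["らくだい", "なかた"],
}

def narrow_and_regenerate(candidates, recognized):
    """If the recognized word is the first candidate of two or more headlines,
    narrow to those headlines and rebuild the headline-to-word mapping:
    second candidates for the narrowed headlines, first candidates for the
    rest. Returns None when the match was already unique."""
    matched = [h for h, c in candidates.items() if c[0] == recognized]
    if len(matched) <= 1:
        return None  # unique match: the content can be presented directly
    return {h: (c[1] if h in matched else c[0]) for h, c in candidates.items()}

print(narrow_and_regenerate(candidates, "らくだい"))
# Yields the FIG. 16 vocabulary: にほんしりーず / だるび / なかた
```

Uttering 「だるび」 matches only one headline, so the helper returns None and no regeneration is needed; uttering 「らくだい」 triggers the narrowed display of FIG. 12 and the re-determined dictionary.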
Claims (9)
- 1. An information providing system that presents a plurality of headlines, selects one of the plurality of headlines, and presents content corresponding to the selected headline, the information providing system comprising: an extracting unit that acquires the plurality of headlines and extracts recognition target word candidates from each of the acquired headlines; a target word determining unit that acquires the recognition target word candidates of each headline from the extracting unit and generates a recognition target word of each headline on the basis of the acquired recognition target word candidates; and a control unit that, when a speech recognition result of an utterance matches a recognition target word generated by the target word determining unit, issues an instruction to present the content corresponding to the headline whose recognition target word is the matched speech recognition result, wherein, when duplicate recognition target word candidates exist among the acquired recognition target word candidates of the headlines, the target word determining unit dynamically generates the recognition target words of the headlines so that the recognition target words of the headlines differ from one another.
- 2. The information providing system according to claim 1, wherein, when duplicate recognition target word candidates exist among the acquired recognition target word candidates of the headlines, the target word determining unit processes at least one of the duplicate recognition target word candidates to dynamically generate the recognition target words so that the recognition target words of the headlines differ from one another.
- 3. The information providing system according to claim 2, wherein the target word determining unit identifies the presentation position or presentation order of the headline containing the duplicate recognition target word candidate, and processes the recognition target word candidate on the basis of the identified presentation position or presentation order.
- 4. The information providing system according to claim 2, wherein the target word determining unit processes the recognition target word candidate by adding another word or word string to the duplicate recognition target word candidate.
- 5. The information providing system according to claim 1, wherein, when duplicate recognition target word candidates exist among the acquired recognition target word candidates of the headlines, the target word determining unit replaces at least one of the duplicate recognition target word candidates with another recognition target word candidate.
- 6. The information providing system according to claim 5, wherein the target word determining unit analyzes the meaning of the headline containing the duplicate recognition target word candidate, and replaces at least one of the duplicate recognition target word candidates with a term corresponding to that meaning as another recognition target word candidate.
- 7. The information providing system according to claim 5, wherein the target word determining unit replaces the recognition target word candidate with another recognition target word candidate contained in the headline containing the duplicate recognition target word candidate.
- 8. The information providing system according to claim 1, wherein, before dynamically generating the recognition target words of the headlines so that they differ from one another, the target word determining unit determines the recognition target word candidates as the recognition target words of the headlines regardless of whether duplicate recognition target word candidates exist among the acquired recognition target word candidates of the headlines, and wherein, when a speech recognition result of an utterance matches a plurality of recognition target words determined by the target word determining unit, the control unit narrows the headlines down to the two or more headlines having that same recognition target word, and instructs the target word determining unit to regenerate, as the recognition target word for each of the narrowed-down headlines, a recognition target word candidate different from the duplicate recognition target word candidate.
- 9. The information providing system according to claim 1, wherein the target word determining unit dynamically generates the recognition target words so that they differ from operation commands of a device.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2014/074412 WO2016042600A1 (ja) | 2014-09-16 | 2014-09-16 | Information providing system |
DE112014006957.4T DE112014006957B4 (de) | 2014-09-16 | 2014-09-16 | Informations-Bereitstellsystem |
CN201480081948.2A CN106688036B (zh) | 2014-09-16 | 2014-09-16 | 信息提供系统 |
US15/315,506 US9978368B2 (en) | 2014-09-16 | 2014-09-16 | Information providing system |
JP2016548463A JP6022138B2 (ja) | 2014-09-16 | 2014-09-16 | Information providing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2014/074412 WO2016042600A1 (ja) | 2014-09-16 | 2014-09-16 | Information providing system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016042600A1 true WO2016042600A1 (ja) | 2016-03-24 |
Family
ID=55532671
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2014/074412 WO2016042600A1 (ja) | 2014-09-16 | 2014-09-16 | Information providing system |
Country Status (5)
Country | Link |
---|---|
US (1) | US9978368B2 (ja) |
JP (1) | JP6022138B2 (ja) |
CN (1) | CN106688036B (ja) |
DE (1) | DE112014006957B4 (ja) |
WO (1) | WO2016042600A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021168013A (ja) * | 2020-04-09 | 2021-10-21 | 日鉄エンジニアリング株式会社 | Information output device, information output system, information output method, program, server device, and data output method |
JP2021167933A (ja) * | 2020-04-09 | 2021-10-21 | 日鉄エンジニアリング株式会社 | Information output device, information output system, information output method, program, server device, and data output method |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113626715A (zh) * | 2021-08-26 | 2021-11-09 | 北京字跳网络技术有限公司 | Query result display method, apparatus, medium, and electronic device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08137883A (ja) * | 1994-11-08 | 1996-05-31 | Oki Electric Ind Co Ltd | Dictionary device |
JP2003099089A (ja) * | 2001-09-20 | 2003-04-04 | Sharp Corp | Speech recognition/synthesis apparatus and method |
JP2003141117A (ja) * | 2001-11-05 | 2003-05-16 | Casio Comput Co Ltd | Dialogue device and program, and information providing device and program |
JP2004070876A (ja) * | 2002-08-09 | 2004-03-04 | Casio Comput Co Ltd | Conversation system and conversation processing program |
JP2004234389A (ja) * | 2003-01-30 | 2004-08-19 | Ricoh Co Ltd | Identification information display control device, identification information display method, presentation data creation device, program, and recording medium |
JP2012043046A (ja) * | 2010-08-16 | 2012-03-01 | Konica Minolta Business Technologies Inc | Minutes creation system and program |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5680511A (en) * | 1995-06-07 | 1997-10-21 | Dragon Systems, Inc. | Systems and methods for word recognition |
US5924068A (en) * | 1997-02-04 | 1999-07-13 | Matsushita Electric Industrial Co. Ltd. | Electronic news reception apparatus that selectively retains sections and searches by keyword or index for text to speech conversion |
US6470307B1 (en) * | 1997-06-23 | 2002-10-22 | National Research Council Of Canada | Method and apparatus for automatically identifying keywords within a document |
EP1083545A3 (en) * | 1999-09-09 | 2001-09-26 | Xanavi Informatics Corporation | Voice recognition of proper names in a navigation apparatus |
US6937986B2 (en) * | 2000-12-28 | 2005-08-30 | Comverse, Inc. | Automatic dynamic speech recognition vocabulary based on external sources of information |
SE0202058D0 (sv) | 2002-07-02 | 2002-07-02 | Ericsson Telefon Ab L M | Voice browsing architecture based on adaptive keyword spotting |
GB2407657B (en) * | 2003-10-30 | 2006-08-23 | Vox Generation Ltd | Automated grammar generator (AGG) |
US7624019B2 (en) * | 2005-10-17 | 2009-11-24 | Microsoft Corporation | Raising the visibility of a voice-activated user interface |
US8612230B2 (en) * | 2007-01-03 | 2013-12-17 | Nuance Communications, Inc. | Automatic speech recognition with a selection list |
US7729899B2 (en) * | 2007-02-06 | 2010-06-01 | Basis Technology Corporation | Data cleansing system and method |
US8538757B2 (en) * | 2007-05-17 | 2013-09-17 | Redstart Systems, Inc. | System and method of a list commands utility for a speech recognition command system |
US8494944B2 (en) * | 2007-06-06 | 2013-07-23 | O2 Media, LLC | System, report, and method for generating natural language news-based stories |
JP2010154397A (ja) * | 2008-12-26 | 2010-07-08 | Sony Corp | Data processing device, data processing method, and program |
JPWO2011013177A1 (ja) * | 2009-07-31 | 2013-01-07 | 三菱電機株式会社 | Facility search device |
CN102770910B (zh) * | 2010-03-30 | 2015-10-21 | 三菱电机株式会社 | Voice recognition device |
US9691381B2 (en) * | 2012-02-21 | 2017-06-27 | Mediatek Inc. | Voice command recognition method and related electronic device and computer-readable medium |
US9858038B2 (en) * | 2013-02-01 | 2018-01-02 | Nuance Communications, Inc. | Correction menu enrichment with alternate choices and generation of choice lists in multi-pass recognition systems |
KR102247533B1 (ko) * | 2014-07-30 | 2021-05-03 | 삼성전자주식회사 | Speech recognition apparatus and control method therefor |
- 2014
- 2014-09-16 DE DE112014006957.4T patent/DE112014006957B4/de not_active Expired - Fee Related
- 2014-09-16 WO PCT/JP2014/074412 patent/WO2016042600A1/ja active Application Filing
- 2014-09-16 CN CN201480081948.2A patent/CN106688036B/zh not_active Expired - Fee Related
- 2014-09-16 JP JP2016548463A patent/JP6022138B2/ja not_active Expired - Fee Related
- 2014-09-16 US US15/315,506 patent/US9978368B2/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
US9978368B2 (en) | 2018-05-22 |
JP6022138B2 (ja) | 2016-11-09 |
DE112014006957T5 (de) | 2017-06-01 |
US20170200448A1 (en) | 2017-07-13 |
JPWO2016042600A1 (ja) | 2017-04-27 |
DE112014006957B4 (de) | 2018-06-28 |
CN106688036B (zh) | 2017-12-22 |
CN106688036A (zh) | 2017-05-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14901873 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2016548463 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15315506 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 112014006957 Country of ref document: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14901873 Country of ref document: EP Kind code of ref document: A1 |