WO2021056347A1 - Method for retrieving information about pronunciation associated with logogram - Google Patents
- Publication number
- WO2021056347A1 (PCT/CN2019/108234)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- logogram
- information
- logograms
- pronunciation
- identified
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
- G06F40/129—Handling non-Latin characters, e.g. kana-to-kanji conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/018—Input/output arrangements for oriental characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/53—Processing of non-Latin text
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B19/00—Teaching not covered by other main groups of this subclass
- G09B19/06—Foreign languages
Abstract
A method is provided to help people pronounce uncommonly known logograms correctly. In logogram-based languages such as Chinese or Japanese, the shape of a logogram gives no indication as to how to pronounce it. While reading a text comprising logograms, a reader may come upon one or more logograms that are rarely used and whose pronunciation they do not know. There are existing solutions for providing readers with information about the pronunciation of logograms. However, those solutions are not user-friendly. In the proposed method, at least one pronunciation associated with a logogram is retrieved automatically based on information indicating whether the logogram is commonly used or not. When a logogram is uncommon, information about how to pronounce it is retrieved and displayed alongside the logogram, which makes the displayed text clearer, since only a few logograms are displayed alongside information about their pronunciation.
Description
The present invention relates to a solution for helping people pronounce uncommonly known logograms correctly. More particularly, the invention concerns a method for identifying uncommonly known logograms in a text to be displayed and for retrieving information enabling a reader to pronounce the identified logograms.
In logogram-based languages such as Chinese or Japanese, the shape of a logogram gives no indication as to how to pronounce it. For example, in Chinese, the logogram 我 is pronounced “wǒ” and means “me”. The same logogram in Japanese is pronounced either “われ” (“ware”) or “わ” (“wa”). Thus, there is no link between the pronunciation of a logogram, “wǒ” or “ware”, and its shape 我.
While reading a text comprising logograms, a reader may come upon one or more logograms that are uncommon, i.e. logograms that are rarely used or that relate to a very specific field such as science or law, and whose pronunciation they do not know.
For example, in the sentence “本次获得一等奖的同学是王懿童”, which means “The student who won the first prize is WANG Yitong.” in English, the logograms 王懿童 represent the name of the student who won the prize. The second logogram of the name, 懿, is uncommon in Chinese; consequently, a reader may not know how to pronounce the name of the student, which is inconvenient.
There are existing solutions for providing readers with information about the pronunciation of logograms.
A first solution consists in retrieving information about the pronunciation of all the logograms of a text and displaying the logograms alongside their pronunciation. According to this first solution, the sentence “本次获得一等奖的同学是王懿童” is displayed on a screen in the following way: 本 (ben) 次 (ci) 获 (huo) 得 (de) 一 (yi) 等 (deng) 奖 (jiang) 的 (de) 同 (tong) 学 (xue) 是 (shi) 王 (wang) 懿 (yi) 童 (tong). Although this first solution provides the reader with information about the pronunciation of the uncommon logogram 懿, it also provides unnecessary information about the pronunciation of common logograms, making the whole sentence difficult to read due to the number of characters displayed.
A second solution consists in the reader selecting, in a text, at least one logogram whose pronunciation is unknown to him/her, for example by using a mouse to click on the portion of the screen on which the logogram is displayed. Information about the pronunciation of the selected logogram is then retrieved and displayed on the screen alongside the logogram. However, as the number of logograms for which the reader requires pronunciation information increases, so does the number of actions the reader must perform to select these logograms, which can be cumbersome.
It would hence be desirable to provide a solution for retrieving information about the pronunciation of a logogram which is user-friendly.
3. Summary
In a first aspect, the invention concerns a method for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed on a display device, said method being implemented by an electronic terminal and comprising :
- identifying among the plurality of logograms to be displayed, and based on information indicating whether a logogram is commonly used, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided,
- retrieving said at least one information about at least one pronunciation associated with said identified logogram,
- providing said at least one information about at least one pronunciation associated with said identified logogram to said display device, in order to display said at least one information alongside said identified logogram on the display device.
In such a solution, at least one pronunciation associated with a logogram is retrieved automatically, i.e. without any action from a reader, based on information indicating whether a logogram is commonly used.
Thus, depending on whether a logogram is commonly used, i.e. a majority of readers know its pronunciation, or not, information about how to pronounce such a logogram is retrieved and then provided to a display device.
Then, the retrieved information is displayed alongside the logogram making the text displayed on the display device clearer since only a few logograms are displayed alongside information about their pronunciation.
The information indicating whether a logogram is commonly used can be set by the reader depending on his/her degree of proficiency. When the reader is a non-native speaker, he/she can select the common set of logograms whose pronunciation he/she knows. For example, the reader can indicate his/her HSK (Hanyu Shuiping Kaoshi, or Chinese Proficiency Test) level number or his/her JLPT (Japanese-Language Proficiency Test) level number, since each of these levels corresponds to a well-defined set of logograms that the reader is supposed to know.
In an embodiment of the method according to the invention, identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided comprises performing a look-up of a standard logogram database, a logogram being identified as commonly used when it is stored in said standard logogram database.
The database in which a look-up is performed can be a standard database for native speakers or a database corresponding to an HSK or JLPT level number.
In this embodiment, when a logogram is not stored in the database, i.e. when a logogram is not found in the database, the logogram is considered as not being commonly used. Consequently, information about its pronunciation is to be retrieved, for example in another database.
In an embodiment of the method according to the invention, identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided comprises performing a look-up of a logogram database in which logograms are stored in association with an indicator about their use in a language.
In this embodiment, each logogram is associated with an indicator about its use, common or uncommon. Consequently, information about the pronunciation of a logogram is provided to the display device when the associated indicator indicates that the logogram is uncommonly used.
In another embodiment of the method according to the invention, identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided comprises comparing an occurrence score associated with said at least one logogram with a threshold.
An occurrence score represents a frequency at which a given logogram appears in different texts. Such an occurrence score is computed based on statistics performed on different types of media such as press articles, e-documents available on the internet, public chats, etc.
An occurrence score can evolve in time depending on the frequency at which a logogram appears.
An occurrence score is easier to update than a standard database which may not be updated frequently.
The threshold may be updated as well in order to reflect an evolution of the language.
According to a feature of the method according to the other embodiment of the invention, the occurrence score associated with a logogram is updated based on statistical analysis of media contents comprising logograms.
According to a feature of the method according to the invention, an information about a translation of said identified logogram is retrieved and is provided together with said at least one information about at least one pronunciation associated with said identified logogram to said display device to be displayed alongside said identified logogram.
This feature is particularly interesting for non-native speakers who, besides not knowing the pronunciation of an uncommon logogram, may not know its meaning either.
The invention also relates to an electronic terminal for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed on a display device, said electronic terminal comprising at least one processor configured to:
- Identify, among the plurality of logograms to be displayed, and based on information indicating whether a logogram is commonly used, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided,
- retrieve said at least one information about at least one pronunciation associated with said identified logogram,
- provide said at least one information about at least one pronunciation associated with said identified logogram to said display device, in order to display said at least one information alongside said identified logogram on the display device.
Such an electronic terminal may be a smartphone or a computing terminal connected to a screen.
In an embodiment of the electronic terminal according to the invention, identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided consists in the processor being further configured to perform a look-up of said at least one logogram in a standard logogram database.
In a second embodiment of the electronic terminal according to the invention, identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided consists in the processor being further configured to perform a look-up of a logogram database in which logograms are stored in association with an indicator about their use in a language.
In another embodiment of the electronic terminal according to the invention, identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided consists in the processor being further configured to compare an occurrence score associated with said at least one logogram with a threshold.
According to a feature of the electronic terminal according to the other embodiment of the invention, the occurrence score associated with a logogram is updated based on statistical analysis of media contents comprising logograms.
In another embodiment of the electronic terminal according to the invention, the at least one processor is further configured to retrieve an information about a translation of said identified logogram and to provide to said display device said information about a translation together with said at least one information about at least one pronunciation associated with said identified logogram, in order to display this information on said display device.
Another object of the invention is a system for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms, said system comprising at least one electronic terminal and one display device on which said plurality of logograms is to be displayed, said system being characterized in that said at least one electronic terminal comprises at least one processor configured to:
- Identify, among the plurality of logograms to be displayed, and based on information indicating whether a logogram is commonly used, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided,
- retrieve said at least one information about at least one pronunciation associated with said identified logogram,
- provide said at least one information about at least one pronunciation associated with said identified logogram to said at least one display device, in order to display said at least one information alongside said identified logogram on the at least one display device.
The present disclosure also concerns a computer program product downloadable from a communication network and/or recorded on a medium readable by a computer and/or executable by a processor, comprising program code instructions for implementing the method as described previously in this document.
The present disclosure also concerns a non-transitory computer-readable medium comprising a computer program product recorded thereon and capable of being run by a processor, including program code instructions for implementing the method as described previously in this document.
Such computer programs may be stored on a computer readable storage medium. A computer readable storage medium as used herein is considered a non-transitory storage medium given the inherent capability to store the information therein as well as the inherent capability to provide retrieval of the information therefrom. A computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. It is to be appreciated that the following, while providing more specific examples of computer readable storage mediums to which the present principles can be applied, is merely an illustrative and not exhaustive listing as is readily appreciated by one of ordinary skill in the art: a portable computer diskette; a hard disk; a read-only memory (ROM); an erasable programmable read-only memory (EPROM or Flash memory); a portable compact disc read-only memory (CD-ROM); an optical storage device; a magnetic storage device; or any suitable combination of the foregoing.
The present disclosure can be better understood with reference to the following description and drawings, given by way of example and not limiting the scope of protection, and in which:
Figure 1 represents a system in which the method according to the invention can be implemented,
Figure 2 represents a flowchart of the steps of the method for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed according to the invention, and
Figure 3 represents a detailed view of the electronic terminal according to an embodiment of the invention.
The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
Figure 1 represents a system 1 in which the method according to the invention can be implemented.
Such a system 1 comprises an electronic terminal 10 and a display device 11. The electronic terminal 10 and the display device 11 can communicate with each other in order to exchange data.
The electronic terminal 10 can be a computer. The display device 11 can be a TV screen, an advertising board, a computer screen, etc.
In some embodiments, the system 1 can be fully integrated into a single piece of equipment such as a smartphone, a laptop, an e-reader, or a tablet. In such embodiments, the electronic terminal 10 corresponds to the hardware processor, storage unit and input device of such equipment, while the display device 11 corresponds to its screen.
Figure 2 represents a flowchart of the steps of the method for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed on the display device 11 when the method is executed by the electronic terminal 10.
In a step E1, a user of the system 1 selects, through a user interface of the system 1, a text comprising a plurality of logograms to be displayed on the display device 11. Such a text can be stored in a memory of the electronic terminal 10, after having been input by a user, or can be retrieved, for example, from the Internet.
For example, the text to be displayed on the display device 11 is “每天上班出门前我都要好好捯饬一番”, which means “I would like to dress up carefully before I go to work every day”.
In a step E2, the electronic terminal 10 identifies, among the plurality of logograms constituting the text to be displayed, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided.
Such identification of the logograms for which information about at least one pronunciation is to be provided is based on information indicating whether a logogram is commonly used or not.
In a first embodiment of the method, a standard database of common logograms is used for this identification. Such a standard database is for example the Table of General Standard Chinese Characters (通用规范汉字表; Tōngyòng Guīfàn Hànzì Biǎo) which is a standard list of 8105 simplified (and unchanged) Chinese logograms.
In this first embodiment, a logogram can be considered to be of uncommon use when it cannot be found in the standard database of common logograms when performing a look-up of said standard database.
Thus, in this first embodiment of the method, a look-up of the standard database is performed for each logogram of the text to be displayed. If a logogram of the text to be displayed is not found in the standard database, it is identified as an uncommonly used logogram for which information about at least one pronunciation is to be retrieved. This information is typically retrieved from a remote database (for instance located in a cloud) to which the electronic terminal 10 sends a request. Such a remote database stores each of the uncommon logograms, i.e. the logograms not stored in the standard database, in association with information about at least one pronunciation of the logogram. This embodiment is memory efficient in that only a standard database of common logograms needs to be stored locally in the electronic terminal 10.
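The look-up of this first embodiment can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: `STANDARD_DB` is a tiny stand-in for the standard table, and `fetch_remote_pronunciation` and `_REMOTE_DB` are hypothetical placeholders for the request sent to the remote database.

```python
# Tiny stand-in for the local standard database of common logograms
# (the real table holds 8105 entries; these few are illustrative only).
STANDARD_DB = frozenset("本次获得一等奖的同学是王童")

# Hypothetical stand-in for the remote database of uncommon logograms.
_REMOTE_DB = {"懿": ["yi"]}


def fetch_remote_pronunciation(logogram: str) -> list[str]:
    """Placeholder for a request to the remote database (assumed interface)."""
    return _REMOTE_DB.get(logogram, [])


def identify_uncommon(text: str) -> list[str]:
    """Return, in order of appearance, the logograms absent from the standard database."""
    uncommon = []
    for ch in text:
        if ch not in STANDARD_DB and ch not in uncommon:
            uncommon.append(ch)
    return uncommon


uncommon = identify_uncommon("本次获得一等奖的同学是王懿童")
pronunciations = {ch: fetch_remote_pronunciation(ch) for ch in uncommon}
```

A fuller version would presumably also skip punctuation and other non-logogram characters before the look-up, since they would otherwise be reported as uncommon.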
Alternatively to this first embodiment, the database to be looked up may be a database containing all possible logograms in a specific language, i.e. not only the most common ones as in the previous embodiment, but also the uncommon ones, each logogram being stored in association with an indicator about its use in this language, i.e. an indication as to whether this logogram is commonly used or not in this language, as well as an information about at least one pronunciation of the logogram when the logogram is indicated as being uncommon.
In such an alternative, a logogram can be considered to be of uncommon use when its associated indicator indicates that it is uncommon, and the information about at least one pronunciation of this logogram can be directly retrieved from the database itself, i.e. without having to send a request to a remote database to retrieve this information.
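One possible shape for such a database is sketched below. The `LogogramEntry` type, the `LOGOGRAM_DB` mapping and the function name are assumptions introduced for illustration; the description does not specify a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogogramEntry:
    common: bool                           # indicator about the logogram's use
    pronunciations: tuple[str, ...] = ()   # stored only for uncommon logograms


# Illustrative entries only; a real database would cover the whole language.
LOGOGRAM_DB = {
    "我": LogogramEntry(common=True),
    "捯": LogogramEntry(common=False, pronunciations=("dao",)),
    "饬": LogogramEntry(common=False, pronunciations=("chi",)),
}


def pronunciations_to_display(text: str) -> dict[str, tuple[str, ...]]:
    """Map each uncommon logogram in `text` to its locally stored pronunciations."""
    result = {}
    for ch in text:
        entry = LOGOGRAM_DB.get(ch)
        # Characters missing from the database are skipped in this sketch.
        if entry is not None and not entry.common:
            result[ch] = entry.pronunciations
    return result
```

Because the pronunciation is stored next to the indicator, a single local look-up both identifies the logogram and retrieves what is to be displayed.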
The above-mentioned databases to be looked up here, be it the above-mentioned standard database or the database containing all possible logograms, can typically be implemented as a local database (i.e. a database stored in the electronic terminal 10) in order to improve the translation performance. Such a local database may then be a cache database of a remote database located in a cloud system, and may be periodically synchronized with such a remote database, so that any change in an indicator about the common or uncommon use of a logogram may be applied first in the remote database, then in the local database or databases synchronizing with this remote database.
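The relationship between the local cache and the remote database might look like the sketch below. The class name and the `fetch_remote` callable are hypothetical, and the periodic scheduling of `synchronize` is deliberately left out.

```python
class LocalIndicatorCache:
    """Local copy of the common/uncommon indicators, refreshed from a remote source."""

    def __init__(self, fetch_remote):
        # `fetch_remote` is a hypothetical callable returning {logogram: is_common}.
        self._fetch_remote = fetch_remote
        self._indicators = {}

    def synchronize(self) -> None:
        """Replace the local indicators with the remote database's current state."""
        self._indicators = dict(self._fetch_remote())

    def is_common(self, logogram: str) -> bool:
        """Look up locally; logograms unknown to the cache are treated as uncommon."""
        return self._indicators.get(logogram, False)


remote = {"我": True, "懿": False}      # stand-in for the remote database
cache = LocalIndicatorCache(lambda: remote)
cache.synchronize()
before = cache.is_common("懿")
remote["懿"] = True                     # the indicator changes remotely first
stale = cache.is_common("懿")           # local copy unchanged until the next sync
cache.synchronize()
after = cache.is_common("懿")
```

The look-ups themselves never touch the network in this design; only `synchronize` does, which is the point of keeping a local copy.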
In a second embodiment of the method, each logogram is associated with an occurrence score representing a frequency at which a given logogram appears in different texts. Such an occurrence score can be computed based on statistics performed on different types of media such as press articles, e-documents available on the internet, public chats, etc. The occurrence score can evolve in time depending on the frequency at which a logogram appears.
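The description does not fix a formula for the occurrence score. As one possible reading, the sketch below scores each logogram by the fraction of documents in which it appears at least once; the function name and the three-document corpus are illustrative assumptions.

```python
from collections import Counter


def occurrence_scores(documents: list[str]) -> dict[str, float]:
    """Score each character by the fraction of documents in which it appears."""
    if not documents:
        return {}
    doc_counts = Counter()
    for doc in documents:
        doc_counts.update(set(doc))   # count each character once per document
    return {ch: n / len(documents) for ch, n in doc_counts.items()}


# Toy corpus for illustration; real statistics would come from large media collections.
corpus = ["我是学生", "我的同学", "王懿童是我的同学"]
scores = occurrence_scores(corpus)
```

Re-running the computation on a newer corpus is all it takes to let the scores evolve over time.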
In order to identify the logograms of the text to be displayed for which information about at least one pronunciation is to be retrieved, the occurrence score associated with each of the logograms of the text to be displayed is compared with a threshold T. Depending on the result of this comparison, a logogram is considered uncommonly used or not.
For example, the threshold T is set to 0.05. When a logogram is associated with an occurrence score lower than or equal to the threshold T, this logogram is considered an uncommonly used logogram for which information about at least one pronunciation is to be retrieved.
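The comparison with the threshold T can be sketched in a few lines. The scores below are invented for illustration, and treating a logogram with no recorded score as having a score of 0 is an assumption of this sketch.

```python
THRESHOLD_T = 0.05  # example value given in the description


def is_uncommon(logogram: str, scores: dict[str, float],
                threshold: float = THRESHOLD_T) -> bool:
    """A logogram is uncommon when its score is lower than or equal to the threshold."""
    # Assumption: a logogram with no recorded score is treated as score 0.0.
    return scores.get(logogram, 0.0) <= threshold


# Hypothetical scores, for illustration only.
example_scores = {"我": 0.93, "童": 0.05, "懿": 0.002}
flagged = [ch for ch in "我童懿" if is_uncommon(ch, example_scores)]
```

Note that a score exactly equal to T counts as uncommon, in line with the "lower than or equal to" wording.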
The knowledge of a logogram and its pronunciation depends on the level of proficiency of a reader. Thus, a child or a non-native speaker typically has a lower level of proficiency than a native speaker and is expected to know fewer logograms.
In this case, the standard database, when in the first embodiment, or the occurrence score associated with a logogram, when in the second embodiment, can be selected/modified depending on the degree of proficiency of the reader. Indeed, a common logogram for a native speaker might be considered an uncommonly used logogram by a non-native speaker.
In order to cope with such a situation, an optional step E0 may be executed before the step E1, in which the user selects, through the user interface of the system 1, a common set of logograms whose pronunciation a reader should know. For example, the user can indicate an HSK (Hanyu Shuiping Kaoshi, or Chinese Proficiency Test) level number or a JLPT (Japanese-Language Proficiency Test) level number, since each of these levels corresponds to a well-defined set of logograms that a reader is supposed to know.
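Step E0 can be pictured as choosing which set of known logograms the identification step works against. In the sketch below, `LEVEL_SETS` holds placeholder characters rather than the official HSK or JLPT lists, and the assumption that a level includes all lower levels is ours.

```python
# Placeholder character sets per level; NOT the official HSK or JLPT lists.
LEVEL_SETS = {
    1: set("我是的"),
    2: set("学同"),
    3: set("奖获"),
}


def known_logograms(level: int) -> set[str]:
    """Union of the sets for all levels up to and including `level` (assumed cumulative)."""
    known = set()
    for lvl, chars in LEVEL_SETS.items():
        if lvl <= level:
            known |= chars
    return known


def needs_pronunciation(text: str, level: int) -> list[str]:
    """Logograms of `text` outside the reader's known set, in order of appearance."""
    known = known_logograms(level)
    return [ch for ch in dict.fromkeys(text) if ch not in known]
```

For instance, `needs_pronunciation("我获奖", 1)` flags 获 and 奖 for a level-1 reader, whereas the same call with level 3 flags nothing.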
In a step E3, the electronic terminal 10 retrieves at least one information about at least one pronunciation associated with said identified logograms. In an embodiment of the invention, other information, such as a translation of the logogram into another language, can be retrieved as well.
Such information can be stored in databases which can be embedded either in the electronic terminal 10 or in remote equipment, for instance a server located in a cloud service. These databases can be the standard databases used in step E2 or other databases. An example of such a database can be found at the following URL:
https://sourceforge.net/p/pinyin4j/code/HEAD/tree/data/trunk/unicode_to_hanyu_pinyin.txt
This database makes it possible to convert a Chinese logogram into pinyin, which gives one or more pronunciations for a given logogram.
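Loading a table of this kind could be sketched as below. The line layout handled here, a hexadecimal code point followed by comma-separated readings in parentheses, is an assumption about the file rather than something stated in this document, and the sample lines are illustrative, not copied from the file.

```python
import re

# Assumed line layout: "<hex code point> (<reading>[,<reading>...])",
# for example "4E07 (wan4,mo4)". Adjust the pattern if the real file differs.
_LINE = re.compile(r"^([0-9A-Fa-f]+)\s+\(([^)]*)\)\s*$")


def parse_pinyin_table(lines) -> dict[str, list[str]]:
    """Build a {logogram: [readings]} mapping from lines in the assumed layout."""
    table = {}
    for line in lines:
        match = _LINE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        code_point, readings = match.groups()
        table[chr(int(code_point, 16))] = [r for r in readings.split(",") if r]
    return table


sample = ["4E00 (yi1)", "4E07 (wan4,mo4)", ""]
table = parse_pinyin_table(sample)
```

A logogram with several readings, as in the second sample line, maps to a list, which matches the "one or more pronunciations" mentioned above.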
Once information about at least one pronunciation, and possibly a translation, of a logogram has been retrieved, the electronic terminal 10 provides the text to be displayed to the display device 11 together with, for the logograms identified as being uncommonly used during step E2, the information retrieved during step E3 and a set of display instructions indicating that this retrieved information is to be displayed alongside the corresponding logograms.
In a step E4, the display device 11 displays the text to be displayed together with, for the logograms identified as being uncommonly used during step E2, the information retrieved during step E3.
To illustrate this, going back to the text to be displayed, “每天上班出门前我都要好好捯饬一番”, assume that the logograms 捯 and 饬 are identified as being uncommonly used in step E2. Consequently, information about at least one pronunciation of these two logograms is retrieved in step E3: a first pronunciation string “dao” for the uncommon logogram 捯 and a second pronunciation string “chi” for the uncommon logogram 饬. The following text is then displayed by the display device 11 at step E4: 每天上班出门前我都要好好捯 (dao) 饬 (chi) 一番.
Figure 3 represents a detailed view of the electronic terminal 10 according to an embodiment of the invention.
An electronic terminal 10 may comprise at least one hardware processor 301, a storage unit 302, an input device 303, an interface unit 304, and a network interface 305, which are typically connected by a data bus 306. Of course, constituent elements of the electronic terminal 10 may be connected by a connection other than a data bus connection.
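The rendering performed at step E4 in this example can be reproduced with a short sketch. The `annotate` function and the parenthesised inline format are illustrative choices; the description only requires that the pronunciation be displayed alongside the logogram.

```python
def annotate(text: str, pronunciations: dict[str, str]) -> str:
    """Insert "(pronunciation)" after each logogram that has an entry."""
    parts = []
    for ch in text:
        parts.append(ch)
        if ch in pronunciations:
            parts.append(f"({pronunciations[ch]})")
    return "".join(parts)


retrieved = {"捯": "dao", "饬": "chi"}   # outcome of steps E2 and E3 in the example
displayed = annotate("每天上班出门前我都要好好捯饬一番", retrieved)
print(displayed)  # 每天上班出门前我都要好好捯(dao)饬(chi)一番
```

Only the two uncommon logograms are annotated, so the rest of the sentence stays uncluttered.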
The hardware processor(s) 301 controls operations of the electronic terminal 10. The storage unit 302 stores at least one program capable of retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed to be executed by the processor 301, and various data, such as parameters used by computations performed by the processor 301, intermediate data of computations performed by the processor 301, and so on. The processor 301 may be formed by any known and suitable hardware, or software, or a combination of hardware and software. For example, the processor 301 may be formed by dedicated hardware such as a processing circuit, or by a programmable processing unit such as a CPU (Central Processing Unit) that executes a program stored in a memory thereof.
The storage unit 302 may be formed by any suitable storage or means capable of storing the program, data, or the like in a computer-readable manner. Examples of the storage unit 302 include non-transitory computer-readable storage media such as semiconductor memory devices, and magnetic, optical, or magneto-optical recording media loaded into a read and write unit. The program causes the processor 301 to perform a process according to an embodiment of the present invention as described with reference to figure 2.
The input device 303 may be formed by a keyboard, a pointing device such as a mouse, or the like for use by the user to input commands, for example to make user's selections of parameters used for selecting the level of proficiency of the reader in reading logograms.
The interface unit 304 provides an interface between the electronic terminal 10 and an external apparatus such as the display device 11. The interface unit 304 may communicate with the external apparatus via cable or wireless communication. The display device 11 is capable of displaying, for example, a Graphical User Interface (GUI). The input device 303 of the electronic terminal 10 and the display device 11 may be formed integrally by a touchscreen panel, for example.
A network interface 305 provides a connection between the electronic terminal 10 and remote equipment via a backbone network (not shown in the figures), such as the Internet. The network interface 305 may provide, depending on its nature, a wired or a wireless connection to the backbone network.
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications will be apparent to a skilled person in the art which lie within the scope of the present invention.
Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
Claims (14)
- A method for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed on a display device, said method being implemented by an electronic terminal and comprising:
  - identifying, among the plurality of logograms to be displayed, and based on information indicating whether a logogram is commonly used, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided,
  - retrieving said at least one information about at least one pronunciation associated with said identified logogram,
  - providing said at least one information about at least one pronunciation associated with said identified logogram to said display device, in order to display said at least one information alongside said identified logogram on the display device.
- The method according to claim 1, wherein identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided comprises performing a look-up of a standard logogram database, a logogram being identified as commonly used when it is stored in said standard logogram database.
- The method according to claim 1, wherein identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided comprises performing a look-up of a logogram database in which logograms are stored in association with an indicator about their use in a language.
- The method according to claim 1, wherein identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided comprises comparing an occurrence score associated with said at least one logogram with a threshold.
- The method according to claim 4, wherein the occurrence score associated with a logogram is updated based on statistical analysis of media contents comprising logograms.
- The method according to any one of claims 1 to 5, wherein an information about a translation of said identified logogram is retrieved and is provided together with said at least one information about at least one pronunciation associated with said identified logogram to said display device to be displayed alongside said identified logogram.
- An electronic terminal for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed on a display device, said electronic terminal comprising at least one processor configured to:
  - identify, among the plurality of logograms to be displayed, and based on information indicating whether a logogram is commonly used, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided,
  - retrieve said at least one information about at least one pronunciation associated with said identified logogram,
  - provide said at least one information about at least one pronunciation associated with said identified logogram to said display device, in order to display said at least one information alongside said identified logogram on the display device.
- The electronic terminal according to claim 7, wherein identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided consists in the processor being further configured to perform a look-up of said at least one logogram in a standard logogram database, a logogram being identified as commonly used when it is stored in said standard logogram database.
- The electronic terminal according to claim 7, wherein identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided consists in the processor being further configured to perform a look-up of a logogram database in which logograms are stored in association with an indicator about their use in a language.
- The electronic terminal according to claim 7, wherein identifying at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided consists in the processor being further configured to compare an occurrence score associated with said at least one logogram with a threshold.
- The electronic terminal according to claim 10, wherein the occurrence score associated with a logogram is updated based on statistical analysis of media contents comprising logograms.
- The electronic terminal according to any one of claims 7 to 11, wherein the at least one processor is further configured to retrieve an information about a translation of said identified logogram and to provide to said display device said information about a translation together with said at least one information about at least one pronunciation associated with said identified logogram, in order to display this information on said display device.
- A system for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms, said system comprising at least one electronic terminal and one display device on which said plurality of logograms is to be displayed, said system being characterized in that said at least one electronic terminal comprises at least one processor configured to:
  - identify, among the plurality of logograms to be displayed, and based on information indicating whether a logogram is commonly used, at least one logogram for which at least one information about at least one pronunciation associated with said logogram is to be provided,
  - retrieve said at least one information about at least one pronunciation associated with said identified logogram,
  - provide said at least one information about at least one pronunciation associated with said identified logogram to said at least one display device, in order to display said at least one information alongside said identified logogram on the at least one display device.
- A computer program characterized in that it comprises program code instructions for the implementation of the method for retrieving information about at least one pronunciation associated with at least one logogram among a plurality of logograms to be displayed on a display device according to any of claims 1 to 6 when the program is executed by a processor.
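The identify, retrieve and provide steps recited in the claims above can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that pronunciation is provided for logograms that are not commonly used, which the claims leave open. The names `STANDARD_DB`, `PRONUNCIATIONS`, `occurrence_scores`, `is_commonly_used` and `annotate` are hypothetical, as is the sample data, and the relative-frequency score is only one possible form of "statistical analysis of media contents".

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical sample data, for illustration only.
STANDARD_DB: Set[str] = {"我", "你", "好"}  # stands in for a standard logogram database
PRONUNCIATIONS: Dict[str, str] = {"我": "wǒ", "好": "hǎo", "饕": "tāo", "餮": "tiè"}


def occurrence_scores(media_texts: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each character across the given media texts."""
    counts = Counter(ch for text in media_texts for ch in text if not ch.isspace())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def is_commonly_used(
    logogram: str,
    standard_db: Set[str],
    scores: Optional[Dict[str, float]] = None,
    threshold: float = 0.0,
) -> bool:
    """Use the score/threshold comparison when scores are given, else a database look-up."""
    if scores is not None:
        return scores.get(logogram, 0.0) >= threshold
    return logogram in standard_db


def annotate(
    logograms: Iterable[str],
    standard_db: Set[str],
    pronunciations: Dict[str, str],
    scores: Optional[Dict[str, float]] = None,
    threshold: float = 0.0,
) -> List[Tuple[str, Optional[str]]]:
    """Pair each logogram with its pronunciation, or None when no annotation is needed.

    Assumption: only logograms that are NOT commonly used are annotated.
    """
    annotated: List[Tuple[str, Optional[str]]] = []
    for logogram in logograms:
        if is_commonly_used(logogram, standard_db, scores, threshold):
            annotated.append((logogram, None))
        else:
            # Retrieval step: None is returned if no pronunciation is known.
            annotated.append((logogram, pronunciations.get(logogram)))
    return annotated
```

Calling `annotate("我好饕餮", STANDARD_DB, PRONUNCIATIONS)` leaves 我 and 好 unannotated and pairs 饕 and 餮 with "tāo" and "tiè". A display layer could then render each non-empty pronunciation alongside its logogram. Passing `scores` from `occurrence_scores` together with a `threshold` switches the identification step from the database look-up to the score comparison.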
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/108234 WO2021056347A1 (en) | 2019-09-26 | 2019-09-26 | Method for retrieving information about pronunciation associated with logogram |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/108234 WO2021056347A1 (en) | 2019-09-26 | 2019-09-26 | Method for retrieving information about pronunciation associated with logogram |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021056347A1 (en) | 2021-04-01 |
Family
ID=75165521
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/108234 WO2021056347A1 (en) | 2019-09-26 | 2019-09-26 | Method for retrieving information about pronunciation associated with logogram |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2021056347A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1741007A (en) * | 2004-08-27 | 2006-03-01 | 英业达股份有限公司 | System for automatic notating Japanese kana and notating method thereof |
CN105138498A (en) * | 2015-08-03 | 2015-12-09 | 小米科技有限责任公司 | Character information output method and apparatus |
US20180047395A1 (en) * | 2016-08-12 | 2018-02-15 | Magic Leap, Inc. | Word flow annotation |
2019-09-26: WO application PCT/CN2019/108234 (WO2021056347A1), status: active, Application Filing
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19946802; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 19946802; Country of ref document: EP; Kind code of ref document: A1 |