GB2514753A - Subtitle processing - Google Patents

Subtitle processing

Info

Publication number
GB2514753A
GB2514753A
Authority
GB
United Kingdom
Prior art keywords
user
subtitle
word
subtitle element
displayed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1304616.4A
Other versions
GB201304616D0 (en)
Inventor
Dmitri Golubentsev
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BUZZMYWORDS Ltd
Original Assignee
BUZZMYWORDS Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BUZZMYWORDS Ltd filed Critical BUZZMYWORDS Ltd
Priority to GB1304616.4A
Publication of GB201304616D0
Priority to PCT/GB2014/050796
Publication of GB2514753A
Legal status: Withdrawn (current)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00Teaching not covered by other main groups of this subclass
    • G09B19/06Foreign languages
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/06Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/475End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H04N21/4755End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data for defining user preferences, e.g. favourite actors or genre
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration
    • H04N21/4856End-user interface for client configuration for language selection, e.g. for the menu or subtitles

Abstract

Subtitle data associated with a video including a number of subtitle elements (e.g. word or group of words) is received. A determination is made as to whether or not an element of the number of elements is to be displayed, and if so, the subtitle element is outputted for display. The determination may be based on a likelihood that an element (word) is known to a viewer, e.g. using a database. Additional information, e.g. semantic meaning, etymology, phonetic transcription, usage examples, can be displayed with the subtitle data. The element may also be translated, e.g. from a first to a second language, before the display determination is made. A viewer who has a little understanding of a language, but who is not fluent, may be spared the irritating display of more common words as subtitles, but may be assisted in understanding rarer or unfamiliar words.

Description

SUBTITLE PROCESSING
Field of Invention
This disclosure relates to a method and corresponding apparatus arranged to process subtitles.
More specifically but not exclusively, the invention provides a method for processing an incoming stream of subtitle data in accordance with a filtering function to provide relevant subtitle information to a user.
Background to the Invention
Video data streams, which are either stored on a storage medium such as a DVD or streamed directly to a user, usually comprise a visual element and an audio element, which are played in synchronisation. It is also common to provide subtitles with the video and audio data. The subtitles may be in the language of the film or television program associated with the video data (such subtitles are often used by users with hearing impairment), or in a language other than that of the film or television program, for users who do not understand the language or dialect of the content of the video data.
Many people across the world may be fluent in one or more languages and also have a working knowledge of another language. For those users with a basic or working knowledge of a language, using subtitles can be a distraction and an annoyance because they already know the frequently used words. Furthermore, if the user is concentrating on the subtitles, vital visual information in the video may be missed. However, if the user turns the subtitles off then they miss crucial words that they do not understand, which is also an annoyance because it may mean that the user does not fully understand the content of the video.
One solution that many users have taken in the event of these frustrations with subtitles is to attempt to learn the language in question better. However, improving understanding of a language can be a long and laborious process.
Summary of Invention
In accordance with an aspect of the invention there is provided a method for processing of subtitles associated with a video. The method comprises receiving subtitle data associated with a video, the subtitle data including a number of subtitle elements. The method further comprises determining whether or not a subtitle element of the number of subtitle elements is to be displayed. Furthermore, the method comprises outputting the subtitle element for display if it is determined that the subtitle element is to be displayed. Such processing of subtitle information provides more useful information to a user.
The determination as to whether or not the subtitle element is to be displayed may comprise determining a likelihood of whether or not the subtitle element is known by a user. The system therefore provides an intelligent system so that the user is presented with useful information.
The determination as to the likelihood of whether or not the subtitle element is known by a user may comprise determining from a database including information indicative of whether or not subtitle elements are known by a user, wherein if the subtitle element is identified in the database as being known by the user the subtitle element is not output for display. This provides a fast and effective system for identifying whether or not a subtitle element is known by a user.
The database may comprise one or more subtitle elements stored therein, the determination as to whether or not a subtitle element is known to a user further comprises comparing the subtitle element with the subtitle elements stored in the database.
The one or more subtitle elements stored in the database may correspond to words known to the user. The subtitle elements stored in the database may correspond to words not known by the user.
The method may further comprise determining if the subtitle element corresponds to a word known by the user from the database. If the subtitle element corresponds to a word known by the user, output of the subtitle element may be prevented. If the subtitle element does not correspond to a word known by the user, it may be determined if the subtitle element corresponds to a word not known by the user from the database. Consequently, the subtitle element may be output if the subtitle element is determined to not be known by the user.
The method may further comprise updating the information in the database responsive to inputs from a user indicative of whether or not a subtitle element is known by the user. This process means that the system is updated and can therefore operate more effectively.
The determination as to whether or not the subtitle element is to be displayed may comprise determining the probability of the subtitle element being known by a user. It may be determined that the subtitle element is to be displayed if the determined probability is greater than a threshold probability.
The determination as to whether or not the subtitle element is to be displayed may comprise determining the probability of the subtitle element being known to a user only if whether or not the subtitle element is known by the user cannot be determined from the database. This approach means that the accurate information provided by the deterministic method is used first; only if there is no conclusive information regarding whether or not the user knows the subtitle element is an estimation made. This approach is also particularly efficient.
The probability may be determined based on one or more factors associated with the user.
The probability may be determined based on one or more factors associated with the subtitle element.
The probability, P, of a user knowing the word may be determined in accordance with the following logit function: P(x) = 1 / (1 + e^(-x)), wherein x is a combining factor indicative of a likelihood of a user knowing a word. In this equation, x may be defined by: x = x_0 + Σ_{i=1..n} x_i f_i, wherein x_0 is a calibrated offset constant, i is an index over all factors, f_i, available, n is the number of factors, x_i is a calibrated weight allocated to the ith factor, and f_i is the ith factor for the user.
The method may further comprise calibrating the values of x_0 and x_i by minimising the following equation: W = Σ_k ( P(x_0 + Σ_i x_i f_{i,k}) − P_k )^2, wherein k represents an index over all users of the system used for the calibration process, P_k is 0 if the subtitle element is unknown and 1 if the subtitle element is known, i is an index over all factors available, x_i is a calibrated weight allocated to the ith factor, and f_{i,k} is the value of the ith factor for the kth user.
The factor, f_i, may be one or more of a user's age, a user's gender, a level of the user's language skills, a user's academic background, a rarity of the word, or a user's primary language.
The method may further comprise identifying additional information to be displayed with the filtered subtitle data, the additional information being associated with the subtitle data. The additional information may include one or more of semantic meaning, etymology, phonetic transcription, and/or usage examples of the subtitle element with which the additional information is associated.
The method may further comprise displaying the subtitle element.
The method may further comprise translating the subtitle element from a first format to a second format. This process of translating may be carried out before performing the determination as to whether or not the subtitle element of the number of subtitle elements is to be displayed. The performing the determination as to whether or not the subtitle element of the number of subtitle elements is to be displayed may be based on the subtitle element in the second format. The first format may be a first language and the second format is a second language different to the first language. The first format may be a word in a first language and the second format may be a synonym of the word in the first language. The method may further comprise receiving feedback from a user regarding the translation of the subtitle element. The feedback may include information identifying a preferred translation of the subtitle element.
The subtitle element may be a word. The subtitle element may be a group of words.
According to another aspect of the invention there is provided a system arranged to process subtitle data comprising a processor arranged to perform any process as described herein, and a memory arranged to store information for processing the subtitle data. The memory may be arranged to store the database, for example.
According to yet another aspect of the invention there is provided a computer readable medium comprising computer readable code operable, in use, to instruct a computer to perform any method as described herein.
Embodiments of the invention relate to a multimedia system and method for adaptation of video material for users who may wish to improve their language knowledge or understanding of presented material.
Embodiments of the invention allow users to pay maximum attention to the video and audio components of video material, while at the same time bringing unknown or unfamiliar linguistic elements to the user's attention. Unknown or unfamiliar linguistic elements may be unknown or unfamiliar words such as technical terms, idioms, expressions or sentences unknown to the user.
In certain embodiments of the invention, a system that is arranged to display subtitle information in multiple languages at the same time is provided. For example, the subtitles may be provided in the language of the video and the user's first language at the same time.
The system of at least some embodiments of the invention provides databases of words that the user does or does not understand. The database may be created and updated according to the user's ability in the target language. Consequently, the distraction of subtitles is minimised and only linguistic elements that are new or unfamiliar to the user are highlighted. This enables the user to enjoy the video material while enhancing his/her operational knowledge of the target language at the same time.
Embodiments of the invention provide a natural-language processing system. Such a system may include a registration-candidate storage section that stores registration-candidate dictionary data. The system may also comprise a judgment means that compares input data with the registration-candidate dictionary data to determine whether or not the input data includes a word corresponding to the registration-candidate dictionary data. In addition, the system may comprise an inquiry means that requests a user to input information indicative of whether or not corresponding dictionary data is to be registered in a dictionary storage section of the system.
Furthermore, if a user's instruction indicates that a corresponding word exists, a dictionary registration means may be provided which is arranged to register the corresponding dictionary data in the dictionary storage section based on the input instruction. The system may also comprise a natural language processing means that executes a natural-language processing onto the input data by using the dictionary data registered in the dictionary storage section.
Embodiments of the invention provide a personalised multimedia interactive system and method of video material adaptation for language comprehension and acquisition suitable for all age groups.
Embodiments of the invention provide a system and method that provides subtitles in the language of the associated video to improve a user's understanding of the material, to some extent, due to information provided on the spelling of words.
Embodiments of the invention provide a filtering scheme that provides users with relevant subtitle information. In such embodiments the subtitles are useful and no longer a distraction.
Embodiments of the invention provide users with information on the spelling of the words in a subtitle stream. The spelling may be provided in the original language of the subtitles. The spelling of such words may also be provided in another language selected by the user. The spelling provided in another language may be provided by means of an automatic translation operation.
Embodiments of the invention propose a system which, by means of maintaining a personalised dictionary-type database, can address the problem of overburdening the user with visual information by amending the subtitles of the video material the user is watching in real time, so as to present the user only with the words, etc., unknown to him/her, together with their definition and any other related information according to the user's preferences. The system may identify the words that are estimated as unknown to the user according to the user's level of proficiency, which may be initially set up by the user and then tailored during use of the system to closely represent the user's operational vocabulary.
The system according to embodiments of the invention may create a database of words in a target language that are presumed to be well known to a user. Any word, technical term, idiom or expression occurring in the video material selected by the user that is not found in the initial database may be flashed up by the system and displayed on the screen in a subtitle mode, together with its translation and/or semantic meaning, and/or etymology, and/or phonetic transcription, and/or usage examples, depending on the user's preferences. Any linguistic element can either be added by the user to his/her personal dictionary-type database together with the related multimedia content, or marked as well known, as the user is watching the video material.
The user database can be considered to be an adaptive filter, which stores a list of words to be filtered out of the input subtitle stream, the components of the filter, i.e. the words to be filtered from the data stream, being updated by the user on each use or iteration of the system.
Brief Description of the Drawings
Exemplary embodiments of the invention shall now be described with reference to the drawings in which: Figure 1 shows a video processing system for performing subtitle processing in order to provide more relevant subtitle information to a user; and Figure 2 illustrates the subtitle processing procedure performed by the system of Figure 1.
Throughout the description and the drawings, like reference numerals refer to like parts.
Specific Description
Figure 1 shows a video processing system for performing subtitle processing in order to provide more relevant subtitle information to a user. The apparatus that is arranged to provide this functionality shall therefore firstly be described.
The video processing system 10 is integrated within a television in the exemplary arrangement shown in Figure 1. A data stream 11 including video data, audio data and subtitle data is received by a central processing unit (CPU) 12. The CPU may be any suitable processor capable of performing the required processing and is therefore also referred to herein as the processor. The CPU 12 is arranged to separate the data stream 11 into its constituent parts. In particular, the CPU 12 extracts the subtitle data from the data stream 11 so that it can perform processing on the subtitle data. The subtitle data comprises a stream of subtitle elements associated with the audio of the data stream. Each subtitle element may be a word or a group of associated words. The CPU 12 may also be arranged to perform separate processing on the video and audio data in parallel with the subtitle processing, if required.
The CPU 12 then filters the subtitle data in accordance with a filtering process. This filtering process uses information stored in memory 13. More specifically, the memory 13 stores information identifying which subtitle elements, i.e. words, are known and which are not known to the user of the system 10 and therefore filters the subtitle elements that are known to the user out of the stream of subtitle elements. This deterministic filtering procedure is complemented by a probabilistic filtering procedure, which is used when a word is not identified as either known or unknown to a user. As will be discussed in more detail, the probabilistic filtering procedure uses information about the user and/or the specific word to determine the probability of the user knowing the word. The memory 13 may also store other information relating to the words stored within the memory and information relating to the user. Furthermore, the memory 13 stores the subtitle processing application, which is run by the processor 12.
Once the subtitle data has been filtered to obtain the processed subtitle data, the processed subtitle data is transmitted to a display 14 along with the video and audio data to be displayed.
Further processing is then performed by the processor in order to combine the video data and subtitle data into a single video stream for display on the display 14.
User interface 15 is provided to enable the user to control the operation of the video processing system 10. For example, user interface 15 enables the user to turn the subtitles ON and OFF.
In addition, the user interface 15 allows the user to control the specific details of the filtering process carried out by the video processing system 10 on the subtitle data. In particular, the user is able to indicate words that are displayed that s/he knows, identify words that s/he hears but that are not displayed and that s/he does not know, and input information useful for the probabilistic filtering procedure.
The subtitle processing procedure shall now be described in detail with reference to Figure 2.
When the user uses the subtitle functionality on the television, an automated check 30 is carried out to determine whether an initial set-up 20 has already taken place. This can be determined by checking whether information obtained during the initial set-up procedure is stored in memory 13. If the user has used the subtitle functionality before and the initial set-up procedure 20 has taken place, the system can play the video and start running the subtitle processing functionality 31. If the initial set-up procedure 20 has not been carried out before, the initial set-up procedure begins.
In the initial subtitle set-up procedure 20 the processor firstly carries out a system set-up 21.
The system set-up comprises a series of questions that are presented to the user via a pop-up box displayed on the display 14. Firstly, the user is presented with a choice of languages for the subtitles. At this point the user selects a primary language for the display of subtitles. The user is then presented with a choice of additional information to be displayed. For example, the user can decide whether or not they would like to view the subtitles in an additional language to the primary language selected. For example, if the user is French and the video being watched is in English, the user may select the primary language to be French, but also select that the subtitles are provided in English so that they can see the English spelling of the words being displayed. Other options presented to the user include whether or not they wish to see semantic meaning, etymology, phonetic transcription, and/or usage examples. Alternatively, weblinks to an online dictionary, Wikipedia or any other suitable resource could be provided.
The system therefore stores the relevant information in memory 13 to provide such functionality.
Once the user has selected what they would like to see displayed, the system moves to the next stage in the set-up process.
Next a user set-up procedure 22 takes place. The user set-up procedure may take various forms, but the primary aim is to gather sufficient information to be used by the probabilistic filtering procedure to determine the probability of the user knowing a word. In one exemplary system the user is presented with a multiple choice test used to determine the user's proficiency in the target language, i.e. the primary language for the display of subtitles as selected by the user. From this test, a user's language ability is determined, which is then used in the probabilistic subtitle filtering. In alternative systems, the user selects their own proficiency level, for example from a choice of level 1 to level 10, where level 1 is not very proficient and level 10 is very proficient. If the user selected subtitles to be displayed in multiple languages then a language proficiency for each language may be provided.
During the user set-up procedure various other information about the user may be obtained, such as their age, gender, academic background, and any other factors that may affect the user's vocabulary and therefore the information that they may wish to be displayed. For example, the user may include information regarding their proficiency in other languages, which could be indicative of the likelihood of the user knowing certain words, or at least provide an indication of the speed at which the user may pick up the language being watched.
The user is also asked to enter username and password information to allow them to log in. This means that the same settings can be used when watching television or films on other televisions or other devices, and also enables a single device to be used by multiple users.
During the user set-up procedure a word bank structure is either set up for, or assigned to, the user in memory 13. This word bank structure enables the system to store a list of words specifically indicated as being known by the user and a list of words specifically indicated as not being known by the user. Upon initial set-up, both of these lists are empty, but the data structure is provided to allow the user to start generating these lists. If multiple languages are selected by the user then multiple word bank structures are set up and associated with their account, one for each language, as in the sketch below.
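A minimal sketch of such a word bank structure in Python, assuming one known/unknown pair of sets per language (the class and method names are illustrative, not taken from the patent):

```python
from dataclasses import dataclass, field

@dataclass
class WordBank:
    """Per-user, per-language store of words the user has classified.

    Both lists start empty on initial set-up, as described above.
    """
    language: str
    known: set = field(default_factory=set)
    unknown: set = field(default_factory=set)

    def mark_known(self, word: str) -> None:
        # A word the user flags as known leaves the unknown list.
        self.known.add(word)
        self.unknown.discard(word)

    def mark_unknown(self, word: str) -> None:
        self.unknown.add(word)
        self.known.discard(word)

# One structure per selected subtitle language, keyed by language code.
user_banks = {"fr": WordBank("fr"), "en": WordBank("en")}
```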
In response to the user selecting one or more languages for the display of subtitles, associated dictionaries for translating the subtitles from the input subtitle source language to the target subtitle language are downloaded and stored on the system. Dictionaries for translating between some or all common languages may already be stored in memory.
Once the initial set-up procedure is complete, the video to be watched can begin playing 31.
The CPU will process or filter the input subtitle data stream in real-time in accordance with the following method.
The first step of the method is to translate the subtitles from the source language to the target language. The translation is performed utilising the dictionary associated with the target language stored in memory. The translation process may include complex processing in order to determine the correct contextual translation.
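As a rough illustration only, a naive word-for-word lookup might look as follows; the contextual processing mentioned above would be considerably more involved, and the dictionary entries here are invented:

```python
def translate_subtitle(words, dictionary):
    """Translate a subtitle element word by word using a stored dictionary.

    Words absent from the dictionary pass through unchanged; correct
    contextual translation, as noted above, needs more complex processing.
    """
    return [dictionary.get(w.lower(), w) for w in words]

fr_to_en = {"bonjour": "hello", "nouveau": "new"}   # invented sample entries
print(translate_subtitle(["Bonjour", "tout", "le", "monde"], fr_to_en))
# ['hello', 'tout', 'le', 'monde']
```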
The linguistic elements in the subtitle data stream are determined. In general, each linguistic element is an individual word, but may also be a phrasal verb, idiom or technical term. For ease of explanation, linguistic elements shall primarily be referred to as words hereinafter, even though it should be appreciated that the same functionality could be applied to linguistic elements other than individual words.
Each word is individually processed to determine whether or not it should be displayed. In certain systems, it may be possible to use parallel processing techniques to process multiple words simultaneously. The procedure works by firstly carrying out the deterministic filtering procedure. Each word in the subtitle stream is compared with the words in the word banks. If the word is in the word bank listing words known by the user then the word is not displayed; if the word is in the word bank listing words not known by the user then the word is displayed. If the word is found in one of the word banks and a determination is made regarding whether or not to display the word, then the filtering procedure can move on to the next word. If no determination is made from the deterministic filtering procedure then the method moves on to the probabilistic method. Obviously, after the initial set-up no words are in the word banks and the probabilistic method is used for all words. Until the user starts to insert words into the word banks, as will be discussed, the probabilistic method is used repeatedly.
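The two-stage decision might be sketched as below, assuming the WordBank structure from earlier and a probability function implementing Equation 1, described in the next paragraph; the 0.5 threshold follows the description:

```python
def should_display(word, bank, prob_known, threshold=0.5):
    """Deterministic word-bank check first, probabilistic fallback second.

    Returns True when the word should be shown as a subtitle.
    """
    if word in bank.known:
        return False                 # known word: filter it out
    if word in bank.unknown:
        return True                  # flagged unknown: always show
    # Neither bank decides, so fall back to the probabilistic estimate:
    # show the word when the probability of the user knowing it is <= 0.5.
    return prob_known(word) <= threshold
```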
The probabilistic filtering procedure determines a probability, P, of the user knowing a word by using the following logit function: P(x) = 1 / (1 + e^(-x)) (Equation 1). If P is greater than 0.5 then the word is determined to be known to the user and is therefore not displayed. If P is less than or equal to 0.5 then the word is determined not to be known by the user and is displayed. More complex systems may be provided to determine whether or not to display words for which P is near to 0.5. For example, if P is in a range of 0.4 to 0.6 the system may determine whether or not to display the word based on another criterion, such as the number of words previously and currently being displayed, to avoid overburdening the user.
In Equation 1, x is a variable indicative of the likelihood of the user knowing the word, determined according to Equation 2: x = x_0 + Σ_{i=1..n} x_i f_i (Equation 2). In Equation 2, f_i corresponds to the ith of n factors affecting whether a user may know a word. For example, this may be a factor such as the user's age, the user's gender, the level of the user's language skills, the user's academic background, the rarity of the word, or the user's primary language. The user's primary language could be of relevance where there is a relationship between the word in the subtitle stream and the user's primary language. For example, the idiom nouveau riche is commonly used in English to refer to people who have recently become wealthy; if the person's primary language is French then the words are almost certainly known to the user, while if the user's primary language is Spanish they may still require the words to be displayed. A total of n factors are utilised.
The factors specific to the user are gathered during the initial set-up procedure. The factors specific to the word can be stored in a word bank specific to those factors, which may have a value associated with the word indicative of the factor. For example, the rarity of a word could be expressed on a scale of 0 to 1, where a value close to 0 indicates that the word is very rare and a value close to 1 indicates that the word is very common. Alternatively, such information could be downloaded on a word-by-word basis.
In Equation 2, x_i is a weighting parameter associated with the ith factor f_i of a total of n factors used in the calculation. Hence, each separate factor f_i has an associated weighting x_i.
Consequently, the importance of a user's age, compared to the user's primary language, for example, can be weighted in terms of its level of importance for the purpose of determining whether or not a word should be displayed. On initial set-up, the weighting parameters are set at predetermined levels of importance for each associated factor. However, as will be discussed, the weighting parameters can be adjusted over time in order to more accurately predict whether or not a user knows a word.
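A direct transcription of Equations 1 and 2 in Python, with invented factor values purely for illustration:

```python
import math

def know_probability(factors, weights, offset):
    """Equations 1 and 2: P(x) = 1 / (1 + e^(-x)), x = x_0 + sum(x_i * f_i)."""
    x = offset + sum(w * f for w, f in zip(weights, factors))
    return 1.0 / (1.0 + math.exp(-x))

# Invented example: proficiency factor 0.7, word-commonness factor 0.9,
# with calibrated weights 2.0 and 3.0 and offset x_0 = -2.5.
p = know_probability([0.7, 0.9], weights=[2.0, 3.0], offset=-2.5)
print(round(p, 3))   # 0.832 -> above 0.5, so the word would be suppressed
```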
The subtitles are therefore displayed when it is determined that the word is not known to the user by the deterministic or probabilistic filtering procedure.
If the user has selected other information, or linked entities, such as semantic meaning, etymology, phonetic transcription, and/or usage examples to be displayed, then, when it is determined that a word in the subtitle stream is to be displayed, the CPU looks up that word in a relevant database stored within the memory to obtain the additional information that the user wants to be displayed. The processor then combines this information, synchronises the subtitle information with the video and audio data, and presents this information on the display.
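One way this look-up could work, assuming a word-keyed database of extras and a preference list recorded at set-up (all names and entries here are hypothetical):

```python
def additional_info(word, info_db, preferences):
    """Return only the categories of extra information the user selected."""
    entry = info_db.get(word, {})
    return {category: entry[category]
            for category in preferences if category in entry}

info_db = {"serendipity": {"meaning": "a happy accidental discovery",
                           "phonetic": "/ser-uhn-DIP-i-tee/"}}
print(additional_info("serendipity", info_db, ["meaning", "phonetic"]))
```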
While the video is playing, the user is able to identify words that are displayed that are known to the user. For example, the user can identify known words via functionality provided on the user interface 15. If a word is identified as known this word is added to the user's word bank listing known words. Such words will not be displayed if present in a future subtitle stream.
Furthermore, if the user hears a word that was not displayed and that the user does not know then this can be added to the list of words that the user does not know. This can be achieved by the user requesting the system to display the original version of the subtitles and selecting the unknown word; the system is then automatically updated accordingly. The user word banks are therefore constantly being updated and evolving to provide more accurate and relevant subtitle information to the user.
In addition to updating the word banks during playback, the user is also able to update the word banks after playback, so that the viewing experience is not unduly affected by updating the filtering process. This is achieved by a word recognition process performed by the user after the video has finished playing. This process enables the user to select words that were displayed, because the system determined that they were not known by the user, but that the user does in fact know. This process may take place instead of, or in addition to, the in-play updating process. When each word is displayed after the video has finished, it may be displayed in the context of the video stream, for example by showing a picture or even a video clip related to the word. Furthermore, this post-playing procedure can be used as a learning process wherein the user can re-review the word along with the video clip in order to contextualise the meaning of the word that they did not know, helping them learn its meaning.
As part of the process, the user can select whether or not they wish for the word to be displayed the next time that it appears in a video stream that they are watching.
A calibration process is provided by the system in order to constantly improve the quality of the probabilistic filtering. In particular, the weighting factors of Equation 2 can be adjusted using the calibration process. The calibration process involves minimising the following equation for each word, utilising information gathered from the users: W = Σ_k ( P(x_0 + Σ_i x_i f_{i,k}) − P_k )^2 (Equation 3). In this case, P_k is 0 if the word is unknown to the kth user, i.e. in the kth user's word bank of unknown words, and 1 if the word is known, i.e. in the kth user's word bank of known words.
Hence, the calibration process aims to adjust the x weightings so that W is as small as possible.
Hence, the feedback from all users regarding words that they do and do not know, Pk, is used to update the x weightings and therefore improve the accuracy of the probabilistic filtering.
Utilising information gathered from multiple users allows for an improvement in accuracy.
The calibration process can be carried out at a central server. The central server can be provided with information regarding each user's word banks and then perform the calculations.
The central server can then send out details regarding the updated x values to each piece of associated user equipment.
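The patent does not prescribe how Equation 3 is minimised; one reasonable sketch is plain gradient descent on the squared error, as below (the learning rate and step count are arbitrary illustrative choices):

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def calibrate(samples, n_factors, lr=0.05, steps=1000):
    """Fit the offset x_0 and weights x_i by gradient descent on Equation 3.

    `samples` is a list of (factors, known) pairs pooled from all users'
    word banks, where known is 1 if the word is in that user's known
    bank and 0 if it is in the unknown bank.
    """
    w = [0.0] * (n_factors + 1)            # w[0] plays the role of x_0
    for _ in range(steps):
        grad = [0.0] * len(w)
        for factors, known in samples:
            fs = [1.0] + list(factors)     # leading 1 multiplies the offset
            p = logistic(sum(wi * fi for wi, fi in zip(w, fs)))
            common = 2.0 * (p - known) * p * (1.0 - p)  # d/dx of (P - P_k)^2
            for j, fj in enumerate(fs):
                grad[j] += common * fj
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w
```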
The system also provides for a tuning process in which the user is provided with a plurality of words and asked to indicate whether or not they know the word. This information is then used to add more words to the user's word banks, which in turn are used to more accurately refine the x parameters of Equation 3. The tuning process could be provided as part of the initial set-up procedure so that the probabilistic determination is more accurate from when the user first uses the system.
As mentioned previously, the system is also arranged to enable different users to use the same device. Consequently, each user has a different user word bank or database, which they set up and then update individually. Each user then identifies themselves when they begin watching a video.
The system is also arranged to enable multiple users to watch a video at the same time. When playing a video the system runs the filtering process described above for all users and if it is determined that one or more of the users does not know the word then the word is displayed. If all users are determined to know the word then the word is not displayed.
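Reusing the should_display sketch from earlier, the group rule reduces to a single any() over the active viewers (a sketch, not the patent's wording):

```python
def display_for_group(word, viewers):
    """Show the word if at least one viewer may not know it.

    `viewers` is a list of (word_bank, prob_known_fn) pairs, one per
    active user; the word is suppressed only when every viewer passes
    the filter as knowing it.
    """
    return any(should_display(word, bank, fn) for bank, fn in viewers)
```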
The system provides for some specific educational uses in order to assist users in learning languages. For example, a spaced repetition procedure can be used for words that are identified as not being known. Spaced repetition is a process wherein a word is repeated at increasingly spaced intervals in order to assist the user in remembering it.
The user may have an option to enable the spaced repetition process in the main system. For example, the spaced repetition process can be applied to all words identified as not known during the playing of future videos. As the user acquires the word to the extent that he/she retains it in memory over a sufficient period of time, either the system or the user can move the word into the word bank of words known to the user. Alternatively, the spaced repetition process can be used purely as a post-viewing process.
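The description says only that the intervals grow; a simple doubling schedule is one common choice, sketched here with invented defaults:

```python
from datetime import date, timedelta

def next_review(interval_days):
    """Schedule the next repetition after a successful recall.

    Doubles the interval each time, one simple spaced-repetition rule;
    the patent does not fix a particular schedule.
    """
    new_interval = max(1, interval_days * 2)
    return new_interval, date.today() + timedelta(days=new_interval)

interval, due = next_review(3)
print(interval, due)   # 6 and a date six days from today
```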
The system also allows for users to indicate if any subtitle information is incorrect. For example, if a translation looks to be incorrect a user can correct the translation. This correction can be sent to a central server and the central server can universally update the translation, if required.
For example, if a predetermined number of corrections are received regarding a particular word then the system may determine that the translation needs correcting.
While the above description focuses primarily on subtitles that are in a different language to the video being played, the system can also function with subtitles in the same language. This can be particularly useful for children learning their first language. Furthermore, the system may allow for subtitles for colloquialisms, regional dialects, or old dialects. For example, dictionaries for such information may be downloaded to the system. In these circumstances, for words that the user does not know, synonyms can be displayed to help the user understand the meaning of the words in the video. Hence, using a bank of known words and the probabilistic determination method, the system can select a synonym most familiar to the user. Obviously, in such circumstances the step of translating the subtitles is not required.
In alternative systems only the deterministic filtering process is used. In such circumstances, the user is required to indicate which words they do and do not know. This could be achieved if, at the end of each program watched by the user, the user is asked to sort known and unknown words. With further iterations of programs being watched, the number of words that the user needs to sort will significantly reduce. Such a post-viewing word test can be applied to other systems disclosed herein.
In alternative systems, only the probabilistic filtering process is used. In such systems the user can adjust certain functions to improve the accuracy of the determination.
While the system described above is described as being used for processing a data stream at a TV, it will be appreciated that the processing system could be provided partly within a set-top box and partly within a television. Alternatively, the video processing system could be provided within a computer, smart phone, or within or distributed across any other device(s) capable of performing the functionality described herein. The data may also be received by other means.
For example, the received data may be part of a web stream rather than a standard digital TV data stream. In such circumstances, the video, audio and subtitle data may be received as part of a single data file such as an MPEG-4 file.
The various methods described above may be implemented by a computer program. The computer program may include computer code arranged to instruct a computer to perform the functions of one or more of the various methods described above. The computer program and/or the code for performing such methods may be provided to an apparatus, such as a computer, on a computer readable medium. The computer readable medium could be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, or a propagation medium for data transmission, for example for downloading the code over the Internet. Non-limiting examples of a physical computer readable medium include semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disc, and an optical disk, such as a CD-ROM, CD-R/W or DVD.
An apparatus such as a computer may be configured in accordance with such computer code to perform one or more processes in accordance with the various methods discussed above.

Claims (28)

1. A method for processing of subtitles associated with a video, comprising: receiving subtitle data associated with a video, the subtitle data including a number of subtitle elements; determining whether or not a subtitle element of the number of subtitle elements is to be displayed; and outputting the subtitle element for display if it is determined that the subtitle element is to be displayed.
2. The method according to claim 1, wherein the determination as to whether or not the subtitle element is to be displayed comprises determining a likelihood of whether or not the subtitle element is known by a user.
3. The method according to claim 2, wherein the determination as to the likelihood of whether or not the subtitle element is known by a user comprises determining from a database including information indicative of whether or not subtitle elements are known by a user, wherein if the subtitle element is identified in the database as being known by the user the subtitle element is not output for display.
4. The method according to claim 3, wherein the database comprises one or more subtitle elements stored therein, the determination as to whether or not a subtitle element is known to a user further comprises comparing the subtitle element with the subtitle elements stored in the database.
5. The method according to claim 4, wherein the one or more subtitle elements stored in the database correspond to words known to the user.
6. The method according to claim 4 or claim 5, wherein the subtitle elements stored in the database correspond to words not known by the user.
7. The method according to claim 6 when dependent on claim 5, wherein the method comprises: determining if the subtitle element corresponds to a word known by the user from the database, wherein if the subtitle element corresponds to a word known by the user, preventing the output of the subtitle element, and if the subtitle element does not correspond to a word known by the user, determining if the subtitle element corresponds to a word not known by the user from the database and outputting the subtitle element if the subtitle element is determined to not be known by the user.
8. The method according to any one of claims 3 to 7, further comprising updating the information in the database responsive to inputs from a user indicative of whether or not a subtitle element is known by the user.
9. The method according to any preceding claim, wherein the determination as to whether or not the subtitle element is to be displayed comprises determining the probability of the subtitle element being known by a user, it being determined that the subtitle element is to be displayed if the determined probability is greater than a threshold probability.
10. The method according to claim 7 when dependent on any one of claims 3 to 7, wherein the determination as to whether or not the subtitle element is to be displayed comprises determining the probability of the subtitle element being known to a user only if whether or not the subtitle element is known by a user cannot be determined from the database.
11. The method according to claim 9 or claim 10, wherein the probability is determined based on one or more factors associated with the user.
12. The method according to claim 9, 10 or 11, wherein the probability is determined based on one or more factors associated with the subtitle element.
13. The method according to any one of claims 9 to 12, wherein the probability, P, of a user knowing the word is determined in accordance with the following logit function: P(x) = 1 / (1 + e^(-x)), wherein x is a combining factor indicative of a likelihood of a user knowing a word.
14. The method according to claim 13, wherein x is defined by: x = x_0 + Σ_{i=1..n} x_i f_i, wherein x_0 is a calibrated offset constant, i is an index over all factors, f_i, available, n is the number of factors, x_i is a calibrated weight allocated to the ith factor, and f_i is the ith factor for the user.
15. The method according to claim 14, further comprising calibrating the values of x_0 and x_i by minimising the following equation: W = Σ_k ( P(x_0 + Σ_i x_i f_{i,k}) − P_k )^2, wherein k represents an index over all users of the system used for the calibration process, P_k is 0 if the subtitle element is unknown and 1 if the subtitle element is known, i is an index over all factors available, x_i is a calibrated weight allocated to the ith factor, and f_{i,k} is the value of the ith factor for the kth user.
16. The method according to claim 14 or 15, wherein the factor, f_i, is one or more of a user's age, a user's gender, a level of user's language skills, a user's academic background, a rarity of the word, or a user's primary language.
17. The method according to any preceding claim, further comprising: identifying additional information to be displayed with the filtered subtitle data, the additional information being associated with the subtitle data.
18. The method according to claim 17, wherein the additional information includes one or more of semantic meaning, etymology, phonetic transcription, and/or usage examples of the subtitle element with which the additional information is associated.
19. The method according to any preceding claim, further comprising displaying the subtitle element.
20. The method according to any preceding claim, further comprising translating the subtitle element from a first format to a second format before performing the determination as to whether or not the subtitle element of the number of subtitle elements is to be displayed, wherein the performing the determination as to whether or not the subtitle element of the number of subtitle elements is to be displayed is based on the subtitle element in the second format.
21. The method according to claim 20, wherein the first format is a first language and the second format is a second language different to the first language.
22. The method according to claim 20, wherein the first format is a word in a first language and the second format is a synonym of the word in the first language.
23. The method according to any one of claims 20 to 22, further comprising receiving feedback from a user regarding the translation of the subtitle element.
24. The method according to claim 23, wherein the feedback includes information identifying a preferred translation of the subtitle element.
25. The method according to any preceding claim, wherein the subtitle element is a word.
26. The method according to any one of claims 1 to 24, wherein the subtitle element is a group of words.
27. A system arranged to process subtitle data comprising: a processor arranged to perform the process of any one of claims 1 to 26; and a memory arranged to store information for processing the subtitle data.
28. A computer readable medium comprising computer readable code operable, in use, to instruct a computer to perform the method of any one of claims 1 to 26.
GB1304616.4A 2013-03-14 2013-03-14 Subtitle processing Withdrawn GB2514753A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB1304616.4A GB2514753A (en) 2013-03-14 2013-03-14 Subtitle processing
PCT/GB2014/050796 WO2014140617A1 (en) 2013-03-14 2014-03-14 Subtitle processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1304616.4A GB2514753A (en) 2013-03-14 2013-03-14 Subtitle processing

Publications (2)

Publication Number Publication Date
GB201304616D0 GB201304616D0 (en) 2013-05-01
GB2514753A (en) 2014-12-10

Family

ID=48226330

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1304616.4A Withdrawn GB2514753A (en) 2013-03-14 2013-03-14 Subtitle processing

Country Status (2)

Country Link
GB (1) GB2514753A (en)
WO (1) WO2014140617A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3807862A4 (en) * 2018-06-17 2022-03-23 Langa Ltd Method and system for teaching language via multimedia content

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN108763521B (en) * 2018-05-25 2022-02-25 腾讯音乐娱乐科技(深圳)有限公司 Method and device for storing lyric phonetic notation
US20210185405A1 (en) * 2019-12-17 2021-06-17 Rovi Guides, Inc. Providing enhanced content with identified complex content segments


Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US20080281579A1 (en) * 2007-05-10 2008-11-13 Omron Advanced Systems, Inc. Method and System for Facilitating The Learning of A Language
US20090162818A1 (en) * 2007-12-21 2009-06-25 Martin Kosakowski Method for the determination of supplementary content in an electronic device
US20100273138A1 (en) * 2009-04-28 2010-10-28 Philip Glenny Edmonds Apparatus and method for automatic generation of personalized learning and diagnostic exercises
US20100332214A1 (en) * 2009-06-30 2010-12-30 Shpalter Shahar System and method for network transmision of subtitles
KR20110083544A (en) * 2010-01-12 2011-07-20 굿파이낸셜 주식회사 Apparatus and method for learning language using growing type personal word database system
US20130196292A1 (en) * 2012-01-30 2013-08-01 Sharp Kabushiki Kaisha Method and system for multimedia-based language-learning, and computer program therefor

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
KR20040052066A (en) * 2002-12-13 2004-06-19 엘지전자 주식회사 Apparatus for intercepting slang of television
US20090249185A1 (en) * 2006-12-22 2009-10-01 Google Inc. Annotation Framework For Video

Non-Patent Citations (2)

Title
http://clwbmalucachu.co.uk/cmc/features/beginners.htm *
http://en.wikipedia.org/wiki/S4C *


Also Published As

Publication number Publication date
GB201304616D0 (en) 2013-05-01
WO2014140617A1 (en) 2014-09-18

Similar Documents

Publication Publication Date Title
US9414040B2 (en) Method and system for assisting language learning
US20130196292A1 (en) Method and system for multimedia-based language-learning, and computer program therefor
US20170017642A1 (en) Second language acquisition systems, methods, and devices
US20180061274A1 (en) Systems and methods for generating and delivering training scenarios
US9984689B1 (en) Apparatus and method for correcting pronunciation by contextual recognition
US20090162818A1 (en) Method for the determination of supplementary content in an electronic device
CN114143479B (en) Video abstract generation method, device, equipment and storage medium
US20230267152A1 (en) Systems and methods for providing personalized answers with learned user vocabulary for user queries
TW201407563A (en) Interactive type language learning platform
GB2532174A (en) Information processing device, control method therefor, and computer program
US10430522B2 (en) Dynamic suggestions for content translation
WO2014140617A1 (en) Subtitle processing
US20170330482A1 (en) User-controlled video language learning tool
US20070250307A1 (en) System, method, and computer readable medium thereof for language learning and displaying possible terms
Sakunkoo et al. Gliflix: Using movie subtitles for language learning
Nair et al. Understanding bilingual word learning: the role of phonotactic probability and phonological neighborhood density
KR20210068790A (en) Sign language interpretation system
US11568139B2 (en) Determining and utilizing secondary language proficiency measure
KR20140122807A (en) Apparatus and method of providing language learning data
US20210158723A1 (en) Method and System for Teaching Language via Multimedia Content
AU2020262442A1 (en) Augmentative and alternative communication (AAC) reading system
JP6555583B2 (en) Signal processing apparatus and signal processing system
KR20090074607A (en) Method for controlling display for vocabulary learning with caption and apparatus thereof
Shen et al. The time course of morphological processing during spoken word recognition in Chinese
Akita et al. Language model adaptation for academic lectures using character recognition result of presentation slides

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)