FR3051092A1

FR3051092A1 - METHOD AND DEVICE FOR SYNCHRONIZING SUBTITLES

Info

Publication number: FR3051092A1
Application number: FR1654016A
Authority: FR
Inventors: Nicolas Bellardie; Romain Carbou
Original assignee: Orange SA
Current assignee: Orange SA
Priority date: 2016-05-03
Filing date: 2016-05-03
Publication date: 2017-11-10
Also published as: WO2017191397A1

Abstract

The invention relates to a method for synchronizing a subtitle (ST), included in an ordered set (S) of subtitles associated with a multimedia content (V), with an audio segment (AC) of the multimedia content (V). The method is characterized in that it comprises: a step of automatically converting a current audio segment (AC) of the multimedia content (V) into text (TXT); a step of searching said ordered set (S) of subtitles for a subtitle (ST) corresponding to said text (TXT); and, if a subtitle (ST) corresponding to said text (TXT) is identified, a step of synchronizing said identified subtitle (ST) with said current audio segment (AC).

Description

Background of the invention

The invention relates to the general field of multimedia content management.

It relates more particularly to the synchronization of a subtitle, included in a subtitle file associated with a multimedia content (e.g. film, television series, news, video, radio program, song, etc.), with an audio segment (e.g. speech, dialogue, sound effect, ambient sound, etc.) of that multimedia content.

Today, multimedia systems are widespread, for example in the form of TV decoders (set-top boxes), connected televisions, and media centers (personal computers that distribute media within the home).

These systems can associate subtitles with audio or audiovisual works and display them in synchronization with the multimedia stream. Many websites now offer subtitles produced by volunteers and made freely available to the public in a number of languages.

However, synchronization is delicate, because it most often relies on a timestamp attached to each subtitle line. When the rate of the stream is modified, or when part of the work's content is cut, for example in edits specific to certain countries, the subtitles are irretrievably shifted.

In the current state of the art, there are means for realigning subtitles, or even for changing the overall rate of a subtitle file. Nevertheless, these methods are difficult to use and do not achieve satisfactory synchronization.

One method used to resolve an offset between an audio stream and a subtitle is as follows: a viewer watches the subtitles displayed over the work while listening to its speech or dialogue; at some point the viewer detects an offset between the sound and the subtitle; the viewer then interrupts playback of the work to search a subtitle file for the correct subtitle in order to realign it.

This cumbersome and inefficient method requires the viewer to hear and understand the speech or dialogue in order to detect the offset and identify the correct subtitle in a subtitle file. Such a method cannot be used by a viewer who is deaf or who does not have a sufficient command of the language of the multimedia stream.

Object and summary of the invention

The invention aims in particular to improve this situation by proposing, according to a first aspect of the invention, a method for synchronizing a subtitle, included in an ordered set of subtitles associated with a multimedia content, with an audio segment of the multimedia content. This method comprises: a step of automatically converting a current audio segment of the multimedia content into text; a step of searching the ordered set of subtitles for a subtitle corresponding to the text; and, if a subtitle corresponding to the text is identified, a step of synchronizing the identified subtitle with the current audio segment.
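The three steps of the method (conversion, search, synchronization) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation described by the invention: the `Subtitle` record, the `synchronize` function, and the two callables `speech_to_text` and `find_match` are hypothetical names introduced here. The speech recognizer is passed in as a parameter because the text only states that the technique is known per se, and synchronization is assumed to mean moving the subtitle's timestamps to the start of the current audio segment while keeping its duration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Subtitle:
    text: str
    start: float  # entry timestamp, in seconds
    end: float    # exit timestamp, in seconds


def synchronize(
    segment_audio: bytes,
    segment_start: float,
    subtitles: Sequence[Subtitle],
    speech_to_text: Callable[[bytes], str],
    find_match: Callable[[str, Sequence[Subtitle]], Optional[Subtitle]],
) -> Optional[Subtitle]:
    """Run the three steps for one audio segment (hypothetical sketch)."""
    # Step 1: automatic conversion of the current audio segment into text.
    text = speech_to_text(segment_audio)
    # Step 2: search the ordered set for a subtitle corresponding to the text.
    match = find_match(text, subtitles)
    if match is None:
        return None
    # Step 3 (assumed meaning): realign the identified subtitle on the
    # current audio segment, preserving its original duration.
    duration = match.end - match.start
    match.start = segment_start
    match.end = segment_start + duration
    return match
```

Passing the recognizer and the matcher as callables keeps the sketch independent of any particular speech recognition engine or matching rule.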

According to a second aspect, the invention also relates to a device for synchronizing a subtitle, included in an ordered set of subtitles associated with a multimedia content, with an audio segment of the multimedia content, the device comprising: a module for automatically converting a current audio segment of the multimedia content into text; a module for searching the ordered set of subtitles for a subtitle corresponding to the text; and a module, activated if a subtitle corresponding to the text is identified, for synchronizing the identified subtitle with the current audio segment.

The invention thus advantageously proposes using automatic audio-to-text conversion to facilitate the identification, in an ordered set of subtitles, of a subtitle suited to an audio segment (e.g. speech, dialogue, singing, etc.) of a multimedia content (e.g. audio, video, etc.). Performed by a speech recognition ("speech to text") technique known per se, this automatic conversion analyzes the human voice in order to transcribe it as text usable by a computer. The ordered set of subtitles may consist of a subtitle file. As a variant, the subtitles may be included in the multimedia stream, either grouped together (for example at the beginning of the stream) or distributed throughout the stream.

Through the automatic conversion of the current audio segment into text (the current audio segment being the audio segment under analysis according to the method of the invention), the invention facilitates and speeds up the identification of the subtitle corresponding to the current audio segment.

Advantageously, the invention no longer requires, as in the current state of the art, hearing and understanding an audio segment in order to detect an offset between the sound and the subtitle and to identify a subtitle corresponding to the audio segment.

In addition, the use of automatic audio-to-text conversion allows the operations of the synchronization process (e.g. searching for the appropriate subtitle and synchronizing the identified subtitle) to be automated, that is, carried out without human intervention. The invention avoids, as is required in the current state of the art, interrupting playback of the content and manually searching an ordered set of subtitles for the correct subtitle in order to realign it. Automating the synchronization process increases the efficiency of synchronization while improving the experience of the user of the multimedia content.

In a particular embodiment, the steps of the synchronization method are carried out before the multimedia content is played back. The synchronization method can thus consist of a preprocessing pass performed before playback of the multimedia content starts. In this case, the current audio segment being analyzed is not played to the user. The method can thus be implemented without any sound audible to the user.

This embodiment makes it possible to prepare a corrected multimedia content (e.g. one that integrates synchronized subtitles, or one associated with an ordered set of subtitles in which subtitles have been synchronized, etc.) for later playback.

In a particular embodiment, the multimedia content is momentarily interrupted in order to identify the subtitle corresponding to the text. This gives the process additional time to perform the search. This embodiment is useful in particular when the consumer of the multimedia content is not an end user but a person or an automated system that in turn performs an editing operation on the resulting content.

In a particular embodiment, the steps of the synchronization method are carried out on the fly during continuous playback of the multimedia content.

In other words, the steps of converting the current audio segment, identifying the appropriate subtitle (the one corresponding to the text), and synchronizing the identified subtitle with the current audio segment are completed automatically before the audio segment that follows the current audio segment is played, during continuous playback of the multimedia content (e.g. video stream, audio stream rendered in real time, etc.). In this case, the current audio segment is the one at the playback point.

This embodiment allows automatic and imperceptible synchronization during real-time playback of the content. In the current state of the art this is not feasible, since the real-time constraint does not allow the content to be paused momentarily in order to realign a subtitle. This synchronization ensures a seamless multimedia experience.

In a particular embodiment, the method further comprises: a step of detecting that the language in which the subtitles of the ordered set of subtitles are written differs from a determined language in which the subtitles are to be rendered; and a step of translating the subtitles of the ordered set of subtitles into the determined language, the identified subtitle being a subtitle translated into that determined language.

This embodiment concerns in particular the case where a viewer wants to watch a multimedia content with subtitles in a language for which no subtitles are available in the multimedia content. Accordingly, this embodiment proposes translating the subtitles into a determined language, specified for example by the viewer, and then identifying a translated subtitle (corresponding to the text obtained by conversion). This embodiment makes it possible to change the language of the subtitles flexibly.

In a particular embodiment, the method further comprises: a step of detecting that the language in which the subtitles of the ordered set of subtitles are written differs from that of the multimedia content; and a step of translating the text into the language of the subtitles, the identified subtitle corresponding to the text translated into the language of the subtitles.

In other words, before the ordered set of subtitles is searched for a subtitle corresponding to the current audio segment, it is detected that the language in which the subtitles (or most of the subtitles) of the ordered set are written differs from that of the speech or dialogue (or most of it) in the content. In this case, since the language of the text obtained by conversion differs from that of the subtitles, the text is translated into the language of the subtitles so that the search step of the method can be carried out.

As in the prior art, the language of the subtitles is generally identified directly from the subtitles themselves, and that of the multimedia content is specified in the content's metadata.

This embodiment thus allows foreign-language subtitles (that is, subtitles written in a language different from that of the content's audio) to be identified for synchronization.
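The language check that precedes the search can be sketched as below. This is a minimal sketch under assumptions: `text_for_search` and the `translate` callable are hypothetical names, the two languages are assumed to be already known (for example from the subtitles and from the content metadata), and no particular translation engine is implied.

```python
from typing import Callable


def text_for_search(
    text: str,
    audio_language: str,
    subtitle_language: str,
    translate: Callable[[str, str, str], str],
) -> str:
    """Return the text to use in the search step (hypothetical sketch).

    `translate(text, source, target)` stands in for any machine
    translation service; it is an assumption, not a real library call.
    """
    if audio_language == subtitle_language:
        # Same language: the converted text can be compared directly.
        return text
    # Different languages: translate the text into the subtitle language.
    return translate(text, audio_language, subtitle_language)
```

The translated text is then compared with the subtitles in their own language, so the search step itself does not need to change.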

Note that the steps of this embodiment may follow the subtitle translation step of the preceding embodiment. In other words, the subtitles included in the multimedia content or in a subtitle file are translated into a determined language, and it is then detected that this determined language (that of the translated subtitles) differs from that of the multimedia content. The text is then translated into this determined language and used in the search step of the method to identify an appropriate subtitle (i.e. one corresponding to the translated text), the identified subtitle being a subtitle translated into the determined language.

In a particular embodiment, during the search step of the method, the identification of the subtitle corresponding to the text comprises: a step of searching for at least one word of the text in at least one target subtitle of the ordered set of subtitles; and a step of computing a rate of at least partially recognized words for each target subtitle; the subtitle corresponding to the text being chosen from among the at least one target subtitle, this subtitle having a rate of recognized words above a predetermined threshold and/or having the highest rate of recognized words.

In other words, to identify the appropriate subtitle (the one corresponding to the text obtained by converting the current audio segment), this embodiment proposes comparing at least one target subtitle of the ordered set of subtitles (e.g. a subtitle associated with the current or previous segment, written in an original language indicated by the multimedia content or by an external subtitle file and, where applicable, translated into a determined language) with the text obtained by conversion from the current audio segment (and possibly translated into the language of the subtitles or into the determined language), in order to try to recognize one or more words of the text in each target subtitle.

Next, a rate of at least partially recognized words (e.g. 4 words recognized out of 5, 1 word recognized out of 2) is computed for each target subtitle. The target subtitle whose rate of recognized words is the highest, or is above a predetermined threshold (e.g. 50%, 80%), is then determined to be the subtitle corresponding to the text.

This embodiment brings flexibility to the step of searching for the appropriate subtitle, because it does not require every word of the text to be recognized.
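The rate computation and the selection rule can be illustrated with a short sketch. Several choices here are assumptions rather than details taken from the text: the function names are hypothetical, a word counts as recognized only on an exact, case-insensitive match (the text also allows partial recognition), the rate is taken over the words of the converted text, and the two criteria are combined by keeping the highest rate only if it exceeds the threshold.

```python
import re
from typing import List, Optional, Sequence


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into words, dropping punctuation."""
    return re.findall(r"[^\W_]+", sentence.lower())


def recognition_rate(text: str, subtitle: str) -> float:
    """Fraction of the words of `text` that are found in `subtitle`."""
    text_words = tokenize(text)
    if not text_words:
        return 0.0
    subtitle_words = set(tokenize(subtitle))
    hits = sum(1 for word in text_words if word in subtitle_words)
    return hits / len(text_words)


def best_target(
    text: str, targets: Sequence[str], threshold: float = 0.5
) -> Optional[int]:
    """Index of the target with the highest rate, if above the threshold."""
    best_index, best_rate = None, 0.0
    for index, target in enumerate(targets):
        rate = recognition_rate(text, target)
        if rate > best_rate:
            best_index, best_rate = index, rate
    return best_index if best_rate > threshold else None
```

With the text "we leave at dawn tomorrow" and the target "We leave at dawn today.", four of the five words are recognized, which corresponds to the "4 words recognized out of 5" example.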

In a particular embodiment, at least one word of the text is at least partially recognized in the target subtitle by taking into account at least one variant of the word. In this embodiment, the word in the text is not only compared directly with a word in the target subtitle; it is also replaced, for example when it is not identified by direct comparison, by at least one variant of the word such as a synonym, a homophone, an idiomatic expression, etc., each variant being compared with a word of the target subtitle. In one embodiment, the variants (synonym, homophone, idiomatic expression) of the word of the text are supplied during translation of the text by means of a digital semantic dictionary.

This embodiment makes it possible to broaden the search.
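The fallback to variants can be sketched as follows. This is a hedged illustration: `is_recognized` and the `variants` mapping are hypothetical, the mapping stands in for whatever semantic dictionary supplies synonyms or homophones, and only single-word variants are handled (multi-word idiomatic expressions are left out of the sketch).

```python
from typing import Iterable, Mapping, Set


def is_recognized(
    word: str,
    subtitle_words: Set[str],
    variants: Mapping[str, Iterable[str]],
) -> bool:
    """Check a word of the text against a target subtitle's words."""
    # Direct comparison first.
    if word in subtitle_words:
        return True
    # Otherwise try each known variant (synonym, homophone, ...).
    return any(v in subtitle_words for v in variants.get(word, ()))
```

For example, with `variants = {"film": {"movie", "picture"}}`, the word "film" is recognized in a subtitle containing "movie" even though a direct comparison fails.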

In a particular embodiment, if there is doubt about the correspondence between a subtitle and the text, this doubt can be resolved by analyzing the following audio segment. Indeed, if a correspondence can be established between the text of the following audio segment and a subtitle, the doubt can be resolved positively.

In one embodiment, the target subtitle is the subtitle associated with an audio segment preceding the current audio segment. In a variant, the target subtitle is the subtitle associated with the current audio segment (for example if there is no previous audio segment). In another variant, the target subtitles are the previous subtitle and at least one subtitle following the previous subtitle (in other words, the current subtitle and possibly further subsequent ones). In yet another variant, the target subtitles are the current subtitle and at least one subtitle following the current subtitle.

It is recalled that, in examples from the prior art, a subtitle in a subtitle file (e.g. a file in SubRip format with the extension .srt, or in sub, ssa, txt format, etc.) may comprise a subtitle number (e.g. 1, 2, 3, etc.), the subtitle text, an entry timestamp indicating the start of the subtitle, and an exit timestamp indicating the end of the subtitle, the difference between these two timestamps defining the duration of the subtitle (e.g. 01:03:27:000 / 01:03:29:015, the duration being 2 seconds 15 milliseconds).
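The duration in the example above can be checked with a short sketch. The function name `to_milliseconds` is hypothetical. The parser accepts the colon-separated form used in the example and, as an added convenience, a comma or dot before the milliseconds, since SubRip files write timestamps as `HH:MM:SS,mmm`.

```python
import re

# hours:minutes:seconds, then a separator and three millisecond digits
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[:,.](\d{3})")


def to_milliseconds(stamp: str) -> int:
    """Convert a timestamp such as 01:03:27:000 into milliseconds."""
    match = _TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


entry = to_milliseconds("01:03:27:000")
exit_ = to_milliseconds("01:03:29:015")
duration_ms = exit_ - entry  # 2015 ms, i.e. 2 seconds 15 milliseconds
```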

The previous subtitle here means a subtitle whose entry (or playback) timestamp substantially coincides with that of the previous audio segment, in other words the one that is to be rendered during or immediately after playback of the previous audio segment. The subtitle following the previous subtitle then denotes the one positioned, in the ordered set of subtitles, after the previous subtitle.

Indeed, in practice it is very likely that the subtitle suited to the current audio segment is positioned after the previous subtitle. Thus, using at least one target subtitle starting from the previous subtitle allows the appropriate subtitle to be identified more efficiently. As a variant, at least one target subtitle may be determined in another way. For example, a subtitle is selected as a target subtitle if its entry timestamp falls within a determined period (e.g. 5 seconds, 10 seconds, etc.) around the current audio segment.
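Two of the ways of choosing target subtitles described above can be sketched as follows. The names `targets_from_previous` and `targets_in_window`, the `(entry_time, text)` tuple layout, and the default values are assumptions made for illustration; in particular, the "determined period around the current audio segment" is interpreted here as a symmetric window in seconds.

```python
from typing import List, Sequence, Tuple

Entry = Tuple[float, str]  # (entry timestamp in seconds, subtitle text)


def targets_from_previous(
    entries: Sequence[Entry], previous_index: int, following: int = 2
) -> List[Entry]:
    """The previous subtitle plus a few subtitles that follow it."""
    return list(entries[previous_index : previous_index + 1 + following])


def targets_in_window(
    entries: Sequence[Entry], segment_time: float, window: float = 5.0
) -> List[Entry]:
    """Subtitles whose entry timestamp lies within `window` seconds
    of the current audio segment."""
    return [e for e in entries if abs(e[0] - segment_time) <= window]
```

Restricting the comparison to a handful of candidates keeps the search cheap enough to run for every audio segment.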

In a particular embodiment, the method further comprises a step of adjusting the timestamps of all the subtitles following the identified subtitle. For example, this step is performed if it is detected that the subtitles match the content but run faster or slower than the multimedia content, with a constant speed ratio relative to it. This embodiment avoids carrying out the steps of the method for every subtitle and instead synchronizes all the remaining subtitles in one go. This increases the efficiency of subtitle synchronization.
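One possible way of adjusting all following timestamps in a single pass is sketched below. It assumes timestamps in integer milliseconds, an identified subtitle moved to a new timestamp, and a constant speed ratio that has already been estimated by some other means; the function name and the way the ratio is applied are assumptions, not details given in the description.

```python
def retime_following(timestamps, index, new_time, ratio=1.0):
    """Move the subtitle at `index` to `new_time` and rescale the offsets of
    every later subtitle by `ratio` (hypothetical helper; times in ms)."""
    anchor = timestamps[index]
    adjusted = list(timestamps[:index])  # earlier subtitles are left untouched
    for stamp in timestamps[index:]:
        adjusted.append(new_time + round((stamp - anchor) * ratio))
    return adjusted


# Pure offset (ratio 1.0): every following subtitle is shifted by 500 ms.
print(retime_following([0, 1000, 2000, 4000], index=1, new_time=1500))
```

A ratio of 1.0 corresponds to a simple shift, as after a cut in the stream; a ratio other than 1.0 corresponds to subtitles running faster or slower than the content.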

Note that it is not necessary to carry out the operations of the method for every subtitle if the process detects a high degree of agreement between the content and the subtitles to be rendered. The method may instead be carried out at selected points only.

In a particular embodiment, if no subtitle corresponding to the text of the current audio segment is identified, the method is carried out again for the next audio segment. In other words, it automatically converts the next audio segment into text and searches the ordered set of subtitles for a subtitle corresponding to the text of that next audio segment. This embodiment is particularly advantageous for synchronization during continuous playback of the multimedia content (e.g. a real-time video stream). As a variant, playback of the multimedia content is briefly interrupted in order to carry out a more thorough search (e.g. by using a larger number of target subtitles, or by considering more synonyms, homophones, idiomatic expressions, etc.).
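The fallback to the next audio segment can be sketched as a simple loop. The sketch assumes that the audio segments have already been converted to text and are supplied as (timestamp, text) pairs, and that the search is provided as a callable returning a subtitle index or None; these names and representations are hypothetical.

```python
def synchronize_stream(segments, find_subtitle):
    """Map each matched subtitle index to the timestamp of its audio segment.
    Segments with no matching subtitle are skipped and the next one is tried."""
    new_timestamps = {}
    for timestamp, text in segments:
        index = find_subtitle(text)
        if index is None:
            continue  # no match: carry on with the next audio segment
        new_timestamps[index] = timestamp
    return new_timestamps


lookup = {"Hello": 0, "Thanks": 3}
segments = [(1.0, "Hello"), (4.0, "Hmm"), (9.0, "Thanks")]
print(synchronize_stream(segments, lookup.get))
```

Skipping unmatched segments keeps playback continuous, which is the situation of the real-time stream mentioned above.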

According to a third aspect, the invention relates to an equipment comprising: means for obtaining a multimedia content and an ordered set of subtitles; a decoder comprising a synchronization device according to the invention; and means for conveying the multimedia content and the ordered set of subtitles to the decoder.

The particular advantages and features of the synchronization device and of the system according to the invention are identical to those of the method described above and will not be repeated here.

It may also be envisaged, in other embodiments, that the synchronization method, the synchronization device and the system according to the invention have, in combination, all or some of the aforementioned features.

In a particular embodiment, the various steps of the synchronization method are determined by instructions of a computer program.

Consequently, the invention also relates to a computer program on an information medium, this program being capable of being run on a computer in order to carry out the steps of the synchronization method according to the invention, as briefly described above.

This program may use any programming language and may take the form of source code, object code, or code intermediate between source code and object code, such as a partially compiled form, or any other desirable form. The invention also relates to a computer-readable information medium comprising instructions of the computer program mentioned above.

The information medium may be any entity or device capable of storing the program. For example, the medium may comprise a storage means, such as a ROM, for example a CD-ROM or a microelectronic circuit ROM, or a magnetic recording means, for example a floppy disk, a hard disk or a USB key. Alternatively, the information medium may be a transmissible medium such as an electrical or optical signal, which may be conveyed via an electrical or optical cable, by radio or by other means. The programs according to the invention may in particular be downloaded over an Internet-type network.

Alternatively, the information medium may consist of integrated circuits in which the program is incorporated, the circuits being adapted to execute, or to be used in the execution of, the method in question.

Brief description of the drawings

Other features and advantages of the present invention will emerge from the description given below, with reference to the accompanying drawings, which illustrate a non-limiting example embodiment. In the figures:
- Figure 1 shows an equipment EQ according to the invention in a particular embodiment of the invention;
- Figure 2 shows the hardware architecture of a synchronization device according to the invention in a particular embodiment of the invention;
- Figure 3 shows the main steps of a synchronization method implemented by the synchronization device according to the invention in order to synchronize a subtitle included in an ordered set of subtitles associated with a multimedia content with an audio segment of the multimedia content, in a particular embodiment of the invention;
- Figure 4 illustrates an example of a video stream associated with an ordered set of subtitles before and after a synchronization according to the invention.

Detailed description of the invention

Figure 1 shows, in its environment, an equipment EQ according to the invention in a particular embodiment of the invention. This equipment EQ (e.g. telephone, set-top box, computer, multimedia player, etc.) comprises a decoder DEC provided with a synchronization device DS according to the invention.

In the embodiment described here, the synchronization device DS is implemented within the decoder DEC. As a variant, the synchronization device DS is implemented in an equipment external to the decoder DEC. The equipment EQ also comprises means MO (e.g. antenna, input/output interface) for obtaining a multimedia content V and an ordered set of subtitles S, and means MA for conveying the multimedia content V and the set S to the decoder DEC.

In the embodiment described here, the equipment EQ is connected to an external screen. As a variant, the equipment EQ comprises a screen.

In the example of Figure 1, the equipment EQ enables the synchronization device DS integrated in the decoder DEC to synchronize a subtitle ST (not illustrated in Figure 1), included in an ordered set S of subtitles contained in the multimedia content, here a video stream V, with an audio segment AC (not illustrated in Figure 1) of this multimedia content, so that the audio segment AC and the subtitle ST are rendered in a synchronized manner on the screen.

In the embodiment described here, the equipment EQ obtains, via its means MO, the ordered set of subtitles S from the video stream V received from a communications network NW. As a variant, the equipment EQ obtains, via its means MO, the set S from an external subtitle file obtained from a local or external memory.

According to the invention, the synchronization of the subtitle ST implemented by the synchronization device DS comprises a step of automatically converting a current audio segment AC into text TXT and an identification of the subtitle ST corresponding to the text TXT. As mentioned previously, "current audio segment" means the audio segment currently being analyzed by the synchronization device DS.

In the embodiment described here, the equipment EQ communicates, via its means MO, with the communications network NW in order to receive the video stream V. No limitation is attached to the nature of the communications network NW. It may equally be a fixed, mobile, wireless, optical, wired, etc. telecommunications network. As a variant, the equipment EQ obtains a video file V locally via its means MO. No limitation is attached to the nature of the multimedia content V. It may equally be a stream or a file of a film, a television series, a news programme, a video, a radio programme, a song, etc.

Likewise, no limitation is attached to the nature of the ordered set of subtitles S. It may equally be an ordered set of subtitles included (in a distributed or grouped manner) in a multimedia stream or file, or stored in an external subtitle file in the srt, sub, ssa, txt, etc. format.

In the embodiment described here, the synchronization device DS has the hardware architecture of a computer, as shown schematically in Figure 2.

With reference to Figure 2, the synchronization device DS comprises in particular a processor 10, a rewritable non-volatile memory 11, a read-only memory (ROM) 12, a random-access memory (RAM) 13 and a reading module ML.

The reading module ML enables the synchronization device DS to read the ordered set S of subtitles and the video stream V obtained by the decoder DEC in order to perform the synchronization according to the invention.

The read-only memory 12 of the synchronization device DS constitutes a recording medium according to the invention, readable by the processor 10 and on which is recorded a computer program PG according to the invention, comprising instructions for executing the steps of a synchronization method according to the invention as implemented by the synchronization device DS, the steps of which are detailed later with reference to Figure 3.

This computer program PG equivalently defines functional modules of the synchronization device DS (here, software modules), and in particular a module MC for automatically converting the current audio segment AC of the multimedia content V, a module MR for searching for a subtitle ST and a module MS for synchronizing the subtitle ST.

The functions of these software modules are detailed later with reference to the steps of the synchronization method according to the invention.

We will now describe, with reference to Figure 3, the main steps of a synchronization method implemented by the synchronization device DS of Figure 2 in order to synchronize a subtitle ST, included in the ordered set of subtitles S associated with the video stream V, with a dialogue line AC of this stream V, in a particular embodiment of the invention.

According to the invention, the subtitle ST corresponding to the current audio segment AC is identified by means of a text TXT obtained by automatic conversion of the current audio segment AC. By way of example, and with reference to Figure 4, an original video stream comprises five audio segments, here five dialogue lines, each having a timestamp Hi (i=1, 2, ..., 5) representing the instant at which the dialogue line is played. The ordered set S of subtitles associated with this original video stream comprises five subtitles STi (i=1, 2, ..., 5) corresponding to the five dialogue lines, each also having a timestamp Hi (i=1, 2, ..., 5) representing the instant at which the subtitle STi is rendered.

In the example presented here, a part containing the third dialogue line ("Très bien") has been cut from the original video stream, resulting in the video stream V. This cut causes a desynchronization, or offset, between the sound and the subtitles from the third dialogue line onward. In particular, if the invention were not implemented, the subtitle ST3 "Fine" would be rendered at the timestamp H3, against the audio "Merci".

We will now describe in detail, with reference to Figure 3 and Figure 4, the steps of the method implemented by the synchronization device DS in a particular embodiment of the invention.

In this embodiment, the synchronization method comprises a preliminary phase E10 comprising the steps E12 to E16 described below, in order to check whether the subtitles included in the ordered set S of subtitles must be translated and whether a text obtained by converting a dialogue line of the video stream V must be translated. As a variant, this phase E10 comprises only the steps E14 to E16, in order to check whether a text obtained by converting a dialogue line of the video stream V needs to be translated. In another embodiment, this phase E10 is not performed.

The synchronization device DS detects (E12) whether the language LS, here English, in which the subtitles STi of the ordered set S of subtitles are written differs from a determined language LD in which the subtitles STi are to be rendered. If the language LS differs from the determined language LD, the subtitles STi of the ordered set S are translated (E13) into the determined language LD.

If not, as is assumed in this embodiment, the synchronization device DS detects whether the language LS, here English, of the subtitles STi of the ordered set S of subtitles differs from the language LV, here French, of the video stream V. This language LV of the video stream V is here indicated, for example, in the metadata included in the video stream V.

If the synchronization device DS detects (E14) that the language LS of the subtitles STi is identical to the language LV of the video stream V, it determines and stores (E15) that a text obtained by converting a dialogue line need not be translated into the language LS of the subtitles STi.

Otherwise, it determines and stores (E16) that the text must be translated into the language LS of the subtitles STi. This is the case in the example described here.
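The decisions of the preliminary phase E10 can be summarized in a short sketch. Language codes are plain strings here, and the function name is hypothetical. The description does not state which subtitle language is compared with LV once the subtitles have been translated in step E13; the sketch assumes, as a labelled guess, that the comparison then uses LD.

```python
def preliminary_phase(ls, ld, lv):
    """Return (translate_subtitles, translate_text) for subtitle language ls,
    rendering language ld and stream language lv."""
    translate_subtitles = ls != ld  # E12: if True, subtitles are translated (E13)
    # Assumption: after E13 the subtitles are in ld, so ld is compared with lv.
    subtitle_language = ld if translate_subtitles else ls
    translate_text = subtitle_language != lv  # E14: False -> E15, True -> E16
    return translate_subtitles, translate_text


# Example of the description: English subtitles rendered in English, French stream.
print(preliminary_phase("en", "en", "fr"))
```

In the example of the description, the subtitles are left as they are and the converted text must be translated, which corresponds to step E16.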

Next, the synchronization device DS analyzes an audio segment of the video stream V in order to synchronize it with a subtitle STi included in the ordered set S. Note that this analysis may be applied starting from the first audio segment of the video stream V or from any audio segment of the video stream V. In other words, the method according to the invention may be carried out at selected points, rather than over the whole of the video stream V.

It is assumed here that the current audio segment AC, that is, the one at the playback point, is "Merci", the third dialogue line of the video stream V. This current dialogue line AC now has the timestamp H3, representing the instant at which it is played during playback of the video stream V.

The synchronization device DS obtains (E20) this current dialogue line AC via its reading module ML.

Next, it automatically converts (E30), via its conversion module MC, the current dialogue line AC "Merci" into the text TXT3 "Merci" illustrated in Figure 4. This text TXT3 has a timestamp H3 corresponding to that of the current dialogue line AC. As mentioned above, this automatic conversion is performed using a speech recognition ("speech to text") technique, which analyzes the human voice and transcribes it as text usable by a computer.

As previously mentioned, it was determined and stored, during step E16 of the preliminary phase E10, that the text obtained by converting a dialogue line must be translated into the language LS (English in this example) of the subtitles STi of the ordered set S.

Consequently, the synchronization device DS translates (E40) the text TXT3, obtained by converting the current dialogue line AC, into the language LS of the subtitles STi. This translation step E40 here yields a translated text TXT3', "Thanks" in English, which keeps the timestamp H3 of the text TXT3.

Next, the synchronization device DS carries out, via its search module MR, a phase E50 of searching the ordered set S of subtitles for a subtitle corresponding to the translated text TXT3'. This search phase E50 here comprises the following steps E51 to E54.

The synchronization device DS determines (E51) at least one target subtitle STp from the ordered set S of subtitles. In this embodiment, these target subtitles STp are determined from a subtitle associated with an audio segment preceding the current audio segment AC, that is, from a previous subtitle (here the subtitle ST2). It is assumed here that the target subtitles STp comprise ST2, ST3 and ST4. As a variant, at least one target subtitle may be determined in another way. For example, a subtitle is selected as a target subtitle if its playback timestamp is found to fall within a determined period that includes the timestamp of the current audio segment AC.
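Step E51 can be sketched as taking a run of consecutive subtitles starting at the previous subtitle. The number of targets (three) is taken from the example above and is otherwise an assumption, as are the function name and the list representation of the ordered set S.

```python
def targets_from_previous(subtitles, previous_index, count=3):
    """Return `count` consecutive subtitles starting at the previous subtitle
    (count=3 mirrors the ST2, ST3, ST4 example; it is not prescribed)."""
    return subtitles[previous_index:previous_index + count]


ordered_set = ["ST1", "ST2", "ST3", "ST4", "ST5"]
# The previous subtitle is ST2 (index 1), as in the example.
print(targets_from_previous(ordered_set, previous_index=1))
```

Near the end of the ordered set, the slice simply returns fewer targets.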

Once the target subtitles STp have been determined, the synchronization device DS attempts to recognize (E52) at least one word Mi of the text TXT3' in each target subtitle STp (here ST2, ST3 and ST4). As illustrated in Figure 4, it searches for the word Mi "Thanks" of the text TXT3' in the subtitles ST2 ("How are you"), ST3 ("Fine") and ST4 ("Thank you").

In this embodiment, the word Mi "Thanks" is recognized in the subtitle ST4, taking into account the idiomatic expression "Thank you".

Next, the synchronization device DS calculates (E53) a rate R of at least partially recognized words for each target subtitle STp. It here obtains rates R of recognized words of 0%, 0% and 100% for the target subtitles ST2, ST3 and ST4 respectively.

Following the calculation of the rates R, the synchronization device DS determines (E54) whether there exists, among the target subtitles STp, a subtitle ST corresponding to the text TXT obtained by conversion of the current dialogue line AC, here the text TXT3', which has additionally been translated into the language LS of the subtitles of the subtitle file S.

In this embodiment, it determines (E54) that the subtitle ST4, whose rate R of recognized words is the highest, is the subtitle ST corresponding to the text TXT3'. Alternatively or additionally, it may determine that a target subtitle STp whose rate R of recognized words reaches a predetermined threshold Tr (e.g. 50%, 80%) is the subtitle corresponding to the text. If it is determined (E54) that no target subtitle STp corresponds to the text TXT, the synchronization method may move on to a following audio segment, here the dialogue line "Au revoir" at timestamp H4, or it may momentarily interrupt the video stream V in order to carry out a more thorough search, for example by using more target subtitles or by taking into account more synonyms, homophones, idiomatic expressions, etc.

It is assumed here that it is determined (E54) that there exists, among the target subtitles STp, the subtitle ST4 corresponding to the text TXT3', because this subtitle ST4 has the highest rate R of recognized words. Alternatively or additionally, the device may determine that a subtitle STp corresponding to the text TXT exists if that subtitle has a rate R of recognized words equal to or greater than a predetermined threshold.

Consequently, at the end of the search phase E50, the synchronization device DS here identifies (E60) the subtitle ST4 as the subtitle ST corresponding to the text TXT.
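Steps E52 to E54 and the identification E60 can be illustrated by the sketch below. It is one possible reading, under stated assumptions: the rate R is taken as the fraction of words of the text that are at least partially recognized in the subtitle, and "partial recognition" is approximated by a shared prefix of at least four letters, which is a simplification standing in for the handling of variants and idiomatic expressions. The function names and the `min_prefix` and `threshold` parameters are hypothetical.

```python
import re


def words(text):
    """Lower-case word list of a text."""
    return re.findall(r"[a-z']+", text.lower())


def partially_matches(a, b, min_prefix=4):
    """Assumed criterion: identical words, or a common prefix of min_prefix letters."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return a == b or common >= min_prefix


def recognition_rate(text, subtitle):
    """Rate R: share of the text's words at least partially found in the subtitle."""
    text_words = words(text)
    sub_words = words(subtitle)
    if not text_words:
        return 0.0
    hits = sum(any(partially_matches(w, s) for s in sub_words) for w in text_words)
    return hits / len(text_words)


def best_match(text, targets, threshold=0.5):
    """E54/E60: keep the target with the highest rate R, if it reaches the threshold Tr."""
    rates = {name: recognition_rate(text, sub) for name, sub in targets.items()}
    if not rates:
        return None, rates
    name = max(rates, key=rates.get)
    return (name if rates[name] >= threshold else None), rates


targets = {"ST2": "How are you", "ST3": "Fine", "ST4": "Thank you"}
identified, rates = best_match("Thanks", targets)
print(identified, rates)
```

On the example of the description, the rates are 0, 0 and 1 for ST2, ST3 and ST4, so ST4 is identified. When no target reaches the threshold, `best_match` returns `None`, which corresponds to the case where the method moves on to the following audio segment or carries out a more thorough search.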

In another embodiment, the multimedia content is processed by the synchronization device DS prior to playback of the content (that is to say, it is not analyzed during real-time playback). If there is a doubt as to the correspondence between a subtitle and the text of the current audio segment, this doubt can be resolved by analyzing the following audio segment. Indeed, if a correspondence can be established between the text of the following audio segment and a subtitle, the doubt can be resolved in favor of the correspondence.

The synchronization device DS then synchronizes (E70), by means of its synchronization module MS, the identified subtitle ST, here ST4, with the current dialogue line AC ("Merci"). In other words, it brings the identified subtitle ST forward to the timestamp H3 of the current dialogue line AC.

The synchronization device DS thus synchronizes the subtitle ST4 with the current dialogue line AC. That is to say, the timestamp of the subtitle ST4 is no longer H4 but is aligned with the timestamp H3 of the current dialogue line AC. In the same way, the synchronization device DS can progressively apply the method to the remaining dialogue lines of the video stream V.

In this embodiment, following the synchronization of the current dialogue line AC, the synchronization device DS adjusts (E80) the timestamps of all the subtitles STr following the identified subtitle, here the timestamp of the subtitle ST5.
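Steps E70 and E80 can be pictured with the short sketch below. The description does not specify how the following timestamps are adjusted; the sketch assumes, for illustration only, that the offset applied to the identified subtitle is applied unchanged to every following subtitle. The function name `resync` and all numeric values (including H3 = 38.5 s and H4 = 40.0 s) are invented.

```python
def resync(timestamps, identified_pos, audio_time):
    """Move the identified subtitle to the audio timestamp (E70) and shift
    every following subtitle by the same offset (E80, assumed behaviour)."""
    offset = audio_time - timestamps[identified_pos]
    unchanged = timestamps[:identified_pos]
    shifted = [t + offset for t in timestamps[identified_pos:]]
    return unchanged + shifted


# Illustrative timestamps of ST1..ST5 in seconds; ST4 (position 3) is at H4 = 40.0.
subtitle_times = [10.0, 20.0, 30.0, 40.0, 50.0]
# The current dialogue line AC is assumed to be at H3 = 38.5.
adjusted = resync(subtitle_times, identified_pos=3, audio_time=38.5)
print(adjusted)  # [10.0, 20.0, 30.0, 38.5, 48.5]
```

The subtitles preceding the identified subtitle are left untouched, since they have already been rendered or synchronized.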

As a variant, it carries out this adjustment later, after having synchronized several subtitles and when it detects a high degree of conformity between the content and the subtitles to be rendered, or when it detects that the subtitles conform in content but run faster or slower than the video stream, with a constant speed ratio relative to it.
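For the case of a constant speed ratio, one way to picture the later adjustment is to estimate a linear relation between subtitle timestamps and audio timestamps from subtitles that have already been synchronized, and to apply it to the remaining ones. The sketch below is an assumption-laden illustration and is not taken from the description: it estimates the ratio and offset from the first and last synchronized pairs only, and the function names and numeric values are hypothetical.

```python
def fit_linear_drift(pairs):
    """Estimate audio_time = ratio * subtitle_time + offset from the first and
    last (subtitle_time, audio_time) pairs of already synchronized subtitles."""
    if len(pairs) < 2:
        raise ValueError("at least two synchronized subtitles are needed")
    (s0, a0), (s1, a1) = pairs[0], pairs[-1]
    if s1 == s0:
        raise ValueError("subtitle timestamps must differ")
    ratio = (a1 - a0) / (s1 - s0)
    offset = a0 - ratio * s0
    return ratio, offset


def rescale(timestamps, ratio, offset):
    """Apply the estimated relation to the timestamps of the remaining subtitles."""
    return [ratio * t + offset for t in timestamps]


# Invented (subtitle timestamp, audio timestamp) pairs, in seconds.
pairs = [(8.0, 12.0), (16.0, 22.0), (24.0, 32.0)]
ratio, offset = fit_linear_drift(pairs)
remaining = rescale([32.0, 40.0], ratio, offset)
print(ratio, offset, remaining)
```

A least-squares fit over all synchronized pairs would be more robust to an isolated mismatch; the two-point estimate is used here only to keep the example short.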

It is noted that, in this embodiment, the steps of the synchronization method are carried out on the fly during real-time playback of the video stream V.

In other words, these steps are completed before the following dialogue line is played, during real-time playback of the video stream V.

This allows automatic synchronization at the pace of real-time playback of the stream, while guaranteeing a seamless viewing experience.

In another embodiment, the steps of the synchronization method are carried out prior to playback of a video file.

Claims

1. A method of synchronizing a subtitle (ST) included in an ordered set (S) of subtitles associated with a multimedia content (V), with an audio segment (AC) of the multimedia content (V), the method being characterized in that it comprises: a step (E30) of automatic conversion, into text (TXT), of a current audio segment (AC) of the multimedia content (V); a step (E50) of searching, in said ordered set (S) of subtitles (STi), for a subtitle (ST) corresponding to said text (TXT); if a subtitle (ST) corresponding to said text (TXT) is identified (E60), a step (E70) of synchronizing said identified subtitle (ST) with said current audio segment (AC).

2. The method of claim 1 further comprising: a step (E12) of detecting that the language (LS) in which the subtitles (STi) of said ordered set (S) of subtitles are written is different from a determined language (LD) in which said subtitles (STi) are to be rendered; a step (E13) of translating the subtitles (STi) of said ordered set (S) of subtitles into said determined language (LD), said identified subtitle (ST) being a subtitle translated into that determined language (LD).

3. The method of claim 1 or 2 further comprising: a step (E14) of detecting that the language (LS) in which the subtitles (STi) of said ordered set (S) of subtitles are written is different from that (LV) of the multimedia content (V); and a step (E40) of translating said text (TXT) into the language (LS) of the subtitles (STi), said identified subtitle (ST, ST4) corresponding to the translated text (TXT3') in said language (LS) of the subtitles (STi).

4. Method according to any one of claims 1 to 3 wherein, during said search step (E50), the identification of said subtitle (ST) corresponding to said text (TXT) comprises: a step (E52) of searching for at least one word (Mi) of said text (TXT, TXT3') in at least one target subtitle (STp) of said ordered set of subtitles (S); and a step (E53) of calculating a rate (R) of at least partially recognized words for each target subtitle (STp); said subtitle (ST, ST4) corresponding to said text (TXT, TXT3') being chosen from said at least one target subtitle (STp), this subtitle (ST, ST4) having a rate (R) of recognized words greater than a predetermined threshold (Tr) and/or having the highest rate (R) of recognized words.

5. The method of claim 4 wherein a said target subtitle (STp) is the subtitle (STi) associated with an audio segment preceding the current audio segment (AC).

6. Method according to claim 4 or 5 wherein said step (E52) of searching for a word (Mi) is performed taking into account at least one variant of said at least one word (Mi) of said text (TXT, TXT3').

7. Method according to any one of claims 1 to 6 wherein the steps of the synchronization method are performed prior to playback of the multimedia content (V).

8. Method according to any one of claims 1 to 7 wherein said multimedia content (V) is interrupted momentarily to identify the subtitle (ST, ST4) corresponding to the text (TXT, TXT3').

9. Method according to any one of claims 1 to 6 wherein the steps of the method are performed on the fly during continuous playback of the multimedia content (V).

10. Method according to any one of claims 1 to 9 wherein the method further comprises a step (E80) of adjusting the timestamps of all the subtitles (STr, ST5) following said identified subtitle (ST, ST4).

11. Method according to any one of claims 1 to 10 wherein the method starts again for a next audio segment if no subtitle corresponding to the text (TXT, TXT3') is identified.

12. Device for synchronizing a subtitle (ST) included in an ordered set (S) of subtitles associated with a multimedia content (V), with an audio segment (AC) of the multimedia content (V), the device being characterized in that it comprises: a module (MC) for automatic conversion, into text (TXT), of a current audio segment (AC) of the multimedia content (V); a module (MR) for searching, in said ordered set (S) of subtitles, for a subtitle (ST) corresponding to the text (TXT); a module (MS) for synchronizing the identified subtitle with the current audio segment (AC), said module (MS) being activated upon identification of the subtitle (ST) corresponding to the text (TXT).

13. Equipment (EQ) comprising: means (MO) for obtaining a multimedia content (V) and an ordered set of subtitles (S); a decoder (DEC) comprising a synchronization device (DS) according to claim 10; and means (MA) for conveying said multimedia content (V) and said set of subtitles (S) to said decoder (DEC).

14. A computer program (PG) comprising instructions for executing the steps of a method according to any one of claims 1 to 11 when said program (PG) is executed by a processor.

15. A computer-readable recording medium on which a computer program (PG) according to claim 14 is recorded.