CN110769265A

CN110769265A - Simultaneous caption translation method, smart television and storage medium

Info

Publication number: CN110769265A
Application number: CN201910950069.3A
Authority: CN
Inventors: 邓声扬; 孙雷
Original assignee: Shenzhen Chuangwei RGB Electronics Co Ltd
Current assignee: Shenzhen Skyworth RGB Electronics Co Ltd; Shenzhen Chuangwei RGB Electronics Co Ltd
Priority date: 2019-10-08
Filing date: 2019-10-08
Publication date: 2020-02-07
Also published as: WO2021068558A1

Abstract

The invention discloses a simultaneous subtitle translation method, a smart television and a storage medium. The method comprises: receiving an operation instruction by which a user selects a target language, collecting voice information from a playing resource, and sending the target language and the voice information to a cloud server; receiving a target subtitle returned by the cloud server, wherein the cloud server translates the voice information into the target subtitle corresponding to the target language; and matching the target subtitle with the time axis of the playing resource in real time and displaying the target subtitle synchronously during playback. By collecting the voice information from the playing resource, sending it to the cloud server together with the subtitle language the user requires, having it translated into the target subtitle in the language the user selected, and displaying that subtitle synchronously during playback, the invention lets the user easily understand audio and video content in different languages and makes viewing and learning more convenient.

Description

Simultaneous caption translation method, smart television and storage medium

Technical Field

The invention relates to the technical field of smart televisions, and in particular to a simultaneous subtitle translation method, a smart television and a storage medium.

Background

As society becomes increasingly international, users encounter more and more situations that involve communicating with other countries, so demand for language learning and for following the news keeps growing. Large-screen smart display devices such as televisions offer a convenient environment for this. As a large-screen display device, the television has distinct advantages for watching film and television content and for remote education, both now and as smart devices continue to develop. As society develops, people are becoming part of a globalized world ever more quickly. For the general public, language is the biggest obstacle to taking part in international society and understanding international culture, especially with certain distinctive, high-quality media materials, where a user's limited foreign-language ability seriously hinders learning and cultural exchange.

At present, artificial intelligence technology continues to develop: real-time speech translation and full-text translation are gradually maturing, recognition accuracy keeps improving, and network technology is moving toward higher speed and lower latency. However, audio and video content exists in a large number of different languages, and viewers who are not native speakers of a given language have difficulty understanding it. Users therefore cannot get past the language barrier, which greatly hinders their use of audio and video resources in languages other than their own.

Accordingly, the prior art still needs to be improved and developed.

Disclosure of Invention

The main object of the invention is to provide a simultaneous subtitle translation method, a smart television and a storage medium, in order to solve the prior-art problem that users face a language barrier when using high-quality audio and video resources.

To achieve the above object, the present invention provides a simultaneous subtitle translation method, which comprises the following steps:

receiving an operation instruction of a user for selecting a target language, collecting voice information in a playing resource, and sending the target language and the voice information to a cloud server;

receiving a target subtitle returned by the cloud server, wherein the cloud server is used for translating the voice information into the target subtitle corresponding to the target language;

and matching the target subtitles with the time axis of the playing resources in real time, and synchronously displaying the target subtitles in the playing process.

Optionally, in the simultaneous subtitle translation method, the receiving of an operation instruction by which a user selects a target language, the collecting of voice information from a playing resource, and the sending of the target language and the voice information to a cloud server specifically include:

receiving an operation instruction of selecting a target language by a user through a remote controller key or a touch screen touch menu, wherein the target language comprises a plurality of pre-stored languages;

and collecting the voice information in the playing resources which need to be played currently, and sending the compressed voice information and the target language to the cloud server.

Optionally, in the simultaneous subtitle translation method, the collecting of the voice information from the playing resource specifically includes:

when the playing resources are audio resources, directly acquiring voice information in the audio resources;

and when the playing resources are video resources, identifying and separating the voice information in the video resources.

Optionally, in the simultaneous subtitle translation method, after the receiving of an operation instruction by which a user selects a target language, the method further includes:

and receiving a subtitle style selected by a user to display the target subtitle on a display interface, wherein the parameters of the subtitle style comprise subtitle color, subtitle transparency, display position and font size.

Optionally, in the simultaneous subtitle translation method, the matching of the target subtitle with the time axis of the playing resource in real time and the synchronous display of the target subtitle during playback specifically include:

acquiring the target subtitle sent by the cloud server, and matching the target subtitle with the time axis of the playing resource in real time, wherein the matching comprises synchronous matching of pictures, voice and subtitles;

and after the target subtitle is matched with the time axis of the playing resource, synchronously displaying the target subtitle on a display interface in the playing process of the playing resource.

Optionally, in the simultaneous subtitle translation method, the synchronous display of the target subtitle during playback further includes:

during playback of the playing resource, displaying the original subtitle and the target subtitle together for comparison on the display interface, so that the contrast between the two assists the user in learning the language of the original subtitle.

Optionally, in the simultaneous subtitle translation method, the translating, by the cloud server, of the voice information into the target subtitle corresponding to the target language specifically includes:

the cloud server receives the voice information and the target language;

the cloud server identifies and translates the voice information according to the target language and generates the target caption corresponding to the target language;

and the cloud server transmits the target subtitles back to the smart television.

In addition, to achieve the above object, the present invention further provides a smart television, wherein the smart television includes: a memory, a processor, and a simultaneous subtitle translation program stored on the memory and executable on the processor, wherein the simultaneous subtitle translation program, when executed by the processor, implements the steps of the simultaneous subtitle translation method described above.

In addition, to achieve the above object, the present invention further provides a simultaneous subtitle translation system, wherein the system includes the smart television described above and a cloud server in communication connection with the smart television; the smart television is configured to receive an operation instruction by which a user selects a target language, collect voice information from a playing resource, and send the voice information to the cloud server; the smart television is further configured to receive the target subtitle returned by the cloud server, match the target subtitle with the playing time axis of the playing resource in real time, and display the target subtitle synchronously during playback; the cloud server is configured to translate the voice information into the target subtitle corresponding to the target language and send the target subtitle to the smart television.

In addition, to achieve the above object, the present invention further provides a storage medium, wherein the storage medium stores a simultaneous subtitle translation program, and the simultaneous subtitle translation program implements the steps of the simultaneous subtitle translation method when executed by a processor.

In the method of the invention, an operation instruction by which a user selects a target language is received, voice information is collected from a playing resource, and the target language and the voice information are sent to a cloud server; a target subtitle returned by the cloud server is received, wherein the cloud server translates the voice information into the target subtitle corresponding to the target language; and the target subtitle is matched with the time axis of the playing resource in real time and displayed synchronously during playback. By collecting the voice information from the playing resource, sending it to the cloud server together with the subtitle language the user requires, having it translated into the target subtitle in the language the user selected, and displaying that subtitle synchronously during playback, the invention meets the user's need for subtitles in various languages, lets the user easily understand audio and video content in different languages, and makes viewing and learning more convenient.

Drawings

FIG. 1 is a flow chart of a preferred embodiment of the simultaneous subtitle translation method of the present invention;

FIG. 2 is a flowchart of step S10 in the preferred embodiment of the simultaneous subtitle translation method according to the present invention;

FIG. 3 is a flowchart of step S30 in the preferred embodiment of the simultaneous subtitle translation method according to the present invention;

FIG. 4 is a flowchart illustrating the entire process of performing simultaneous translation of a target caption in accordance with the preferred embodiment of the present invention;

FIG. 5 is a schematic diagram of the operating environment of a smart television according to a preferred embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer and clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

As shown in FIG. 1, the simultaneous subtitle translation method according to the preferred embodiment of the present invention includes the following steps:

Step S10: receiving an operation instruction by which a user selects a target language, collecting voice information from a playing resource, and sending the target language and the voice information to a cloud server.

Please refer to FIG. 2, which is a flowchart of step S10 in the simultaneous subtitle translation method according to the present invention.

As shown in FIG. 2, step S10 includes:

S11: receiving an operation instruction by which a user selects a target language through a remote controller key or a touch-screen menu, wherein the target language is selected from a plurality of pre-stored languages;

S12: collecting the voice information from the playing resource that currently needs to be played, and sending the compressed voice information and the target language to the cloud server.

Specifically, to find out which subtitle language the user wants to see, an operation instruction by which the user selects a target language is received first. The user may issue this instruction in various ways, for example through a remote controller key or a touch-screen menu.
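As an illustration only, step S11 might be sketched in Python as follows. The language codes, the shape of the instruction dictionary and the function name `select_target_language` are assumptions made for this sketch; the disclosure specifies only that the selection arrives via a remote controller key or a touch-screen menu and is drawn from pre-stored languages.

```python
# Hypothetical set of pre-stored target languages; the disclosure names no codes.
PRESTORED_LANGUAGES = {"zh": "Chinese", "en": "English", "fr": "French"}


def select_target_language(instruction: dict) -> str:
    """Return the language code carried by a remote-key or touch-menu instruction."""
    source = instruction.get("source")
    if source not in ("remote_key", "touch_menu"):
        raise ValueError(f"unsupported input source: {source!r}")
    code = instruction.get("language")
    if code not in PRESTORED_LANGUAGES:
        raise ValueError(f"language is not pre-stored: {code!r}")
    return code


target = select_target_language({"source": "remote_key", "language": "zh"})
```

Rejecting unknown languages at this point keeps an unsupported request from ever reaching the cloud server.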

Further, after the operation instruction selecting the target language is received, a subtitle style that the user selects for displaying the target subtitle on the display interface may also be received. The parameters of the subtitle style include subtitle color, subtitle transparency, display position and font size. For example, the user may choose red as the color of the target subtitle that is finally output (if the original subtitle is black, giving the target subtitle a different color makes the two easy to compare); the subtitle transparency may be chosen according to the user's needs; the display position may be above or below the original subtitle; and the font size may likewise be chosen according to the user's needs. Other subtitle parameters or related settings may also be selected.
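The four style parameters named above could be held in a small record such as the following Python sketch. The class name `SubtitleStyle`, the default values, the 0.0 to 1.0 transparency scale and the pixel unit for font size are all assumptions; the disclosure lists only the parameter names and the red/black example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtitleStyle:
    """User-selected display style for the target subtitle (illustrative only)."""

    color: str = "red"          # example color from the description
    transparency: float = 0.0   # assumed scale: 0.0 opaque, 1.0 fully transparent
    position: str = "below"     # placement relative to the original subtitle
    font_size: int = 32         # assumed unit: pixels

    def __post_init__(self) -> None:
        # Validate the values once, when the style is created.
        if not 0.0 <= self.transparency <= 1.0:
            raise ValueError("transparency must be between 0.0 and 1.0")
        if self.position not in ("above", "below"):
            raise ValueError("position must be 'above' or 'below'")
        if self.font_size <= 0:
            raise ValueError("font_size must be positive")


style = SubtitleStyle(color="red", transparency=0.2, position="above", font_size=40)
```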

After receiving the operation instruction by which the user selects a target language, the television or touch device recognizes the instruction and executes the corresponding function; that is, once the target language has been selected, it collects the voice information from the playing resource that is about to be played (or is currently being played). Collecting the voice information from the playing resource specifically includes the following. When the playing resource is an audio resource, the voice information in the audio resource is acquired directly. (When audio is played on a television there is no changing picture, but there is still a display interface, which may be a playing interface that stays the same throughout audio playback.) When the playing resource is a video resource, which contains both audio information and video information, the voice information in the video resource must be identified and separated, because only the voice information is needed to generate the target subtitle.
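The two collection cases can be illustrated with the Python sketch below. It models a playing resource as a list of tagged packets, which is a simplifying assumption standing in for a real demultiplexed media container; the names `PlayingResource` and `collect_voice_information` are likewise hypothetical. A real device would also need to isolate speech from music and effects, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (stream type, payload): a stand-in for packets read from a media container.
Packet = Tuple[str, bytes]


@dataclass
class PlayingResource:
    kind: str              # "audio" or "video"
    packets: List[Packet]


def collect_voice_information(resource: PlayingResource) -> bytes:
    """Return the audio payload that will be sent on for subtitle generation."""
    if resource.kind == "audio":
        # Audio resource: every packet is audio, so take it directly.
        return b"".join(payload for _, payload in resource.packets)
    if resource.kind == "video":
        # Video resource: keep only the audio stream and drop the picture data.
        return b"".join(
            payload for stream, payload in resource.packets if stream == "audio"
        )
    raise ValueError(f"unknown resource kind: {resource.kind!r}")


video = PlayingResource(
    kind="video",
    packets=[("video", b"V1"), ("audio", b"a1"), ("video", b"V2"), ("audio", b"a2")],
)
voice = collect_voice_information(video)
```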

The collected voice information is then compressed to facilitate data transmission, and the compressed voice information and the target language are finally sent to the cloud server, so that the cloud server can process the voice data according to the user's requirements.
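A minimal sketch of this packaging step is given below. The disclosure does not say which compression scheme or message format is used; generic lossless `zlib` compression and a JSON envelope with hypothetical field names are assumed here purely for illustration, and a practical system would more likely use an audio codec.

```python
import base64
import json
import zlib
from typing import Tuple


def build_request(voice: bytes, target_language: str) -> bytes:
    """Compress the voice data and bundle it with the target language."""
    compressed = zlib.compress(voice)
    payload = {
        "target_language": target_language,
        # Base64 keeps the binary data safe inside a JSON string.
        "voice": base64.b64encode(compressed).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")


def parse_request(body: bytes) -> Tuple[bytes, str]:
    """Server-side inverse of build_request."""
    payload = json.loads(body.decode("utf-8"))
    voice = zlib.decompress(base64.b64decode(payload["voice"]))
    return voice, payload["target_language"]


samples = b"\x00\x01" * 2000          # placeholder for collected audio samples
request = build_request(samples, "zh")
restored, language = parse_request(request)
```

Because the compression is lossless, the server recovers exactly the bytes that the television collected.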

Step S20: receiving the target subtitle returned by the cloud server, wherein the cloud server translates the voice information into the target subtitle corresponding to the target language.

Specifically, after the television or touch device sends the target language and the voice information to the cloud server, the cloud server performs online, real-time translation of the voice information according to the target language. For example, the cloud server may perform speech recognition and translation by means of artificial intelligence (AI) to generate the target subtitle that the user wants (the target subtitle is mainly in the user's native language). At the same time, training on massive audio and video resources and serving users' varied demands can promote the rapid development of machine language processing in the cloud server's AI technology.

After the cloud server receives the voice information and the target language, it recognizes and translates the voice information according to the target language and generates the target subtitle corresponding to the target language. For example, if the original audio or video is a French resource and the user needs Chinese, then the target language is Chinese, the original subtitle is French, and a Chinese subtitle is generated. The cloud server then transmits the target subtitle back to the television (for example, a smart television) or touch device.
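The server-side sequence of receiving, recognizing, translating and returning might be organized as in the following Python sketch. The recognizer and translator are stubs with a one-entry lookup table, not real AI services, and the cue layout with `start`, `end`, `original` and `text` fields is an assumption; the disclosure does not define the format of the returned subtitle.

```python
from typing import Callable, Dict, List

Cue = Dict[str, object]


def make_cloud_handler(
    recognize: Callable[[bytes], List[Cue]],
    translate: Callable[[str, str], str],
) -> Callable[[bytes, str], List[Cue]]:
    """Combine a speech recognizer and a translator into one request handler."""

    def handle(voice: bytes, target_language: str) -> List[Cue]:
        cues = []
        for segment in recognize(voice):
            cues.append({
                "start": segment["start"],   # seconds on the resource's time axis
                "end": segment["end"],
                "original": segment["text"],
                "text": translate(str(segment["text"]), target_language),
            })
        return cues

    return handle


def stub_recognize(voice: bytes) -> List[Cue]:
    # Stub: pretends the audio contains one French utterance.
    return [{"start": 0.0, "end": 1.5, "text": "Bonjour"}]


def stub_translate(text: str, target_language: str) -> str:
    # Stub: tiny lookup table instead of a translation model.
    table = {("Bonjour", "zh"): "你好"}
    return table.get((text, target_language), text)


handler = make_cloud_handler(stub_recognize, stub_translate)
target_subtitle = handler(b"<audio bytes>", "zh")
```

Keeping the original text next to the translation in each cue is one way to support the comparative display of original and target subtitles on the television.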

Furthermore, the cloud server translates online and in real time, so the user's viewing is not affected. In addition, depending on the user's network conditions, the television or touch device can choose, as needed, either to play while translating or to play after translating, providing the user with a coherent and smooth viewing experience.
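The choice between the two modes could be reduced to a simple rule such as the one sketched below. The use of measured round-trip time as the indicator of network conditions and the 500 ms threshold are assumptions for illustration; the disclosure does not state how the network environment is assessed.

```python
def choose_playback_mode(round_trip_ms: float, threshold_ms: float = 500.0) -> str:
    """Pick a playback mode from a measured round-trip time to the cloud server.

    The threshold is a hypothetical tuning value, not taken from the disclosure.
    """
    if round_trip_ms < 0:
        raise ValueError("round_trip_ms must be non-negative")
    if round_trip_ms <= threshold_ms:
        return "play_while_translating"
    return "play_after_translating"


fast_network_mode = choose_playback_mode(120.0)
slow_network_mode = choose_playback_mode(1800.0)
```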

Step S30: matching the target subtitle with the time axis of the playing resource in real time, and displaying the target subtitle synchronously during playback.

Please refer to FIG. 3, which is a flowchart of step S30 in the simultaneous subtitle translation method according to the present invention.

As shown in FIG. 3, step S30 includes:

S31: acquiring the target subtitle sent by the cloud server, and matching the target subtitle with the time axis of the playing resource in real time, wherein the matching comprises synchronizing the picture, the voice and the subtitle;

S32: after the target subtitle has been matched with the time axis of the playing resource, displaying the target subtitle synchronously on a display interface during playback of the playing resource.

Specifically, after the cloud server has recognized and translated the voice information according to the target language, generated the corresponding target subtitle, and transmitted it back to the television or touch device, the television or touch device matches the target subtitle with the picture and outputs them synchronously along the time axis. That is, the target subtitle is matched with the time axis of the playing resource in real time, so that the picture, the voice and the target subtitle are synchronized. After the time-axis matching is completed, the target subtitle is displayed synchronously on the display interface (i.e. the display screen) during audio or video playback. For example, during playback the original subtitle and the target subtitle may be displayed together for comparison on the display screen (for example, with the original subtitle above or below the target subtitle). This contrast assists in learning the language of the original subtitle: the original sound is preserved while the intended meaning is also conveyed. A user who wants to learn the language of the original subtitle can thus be helped by the comparative display.
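One possible way to perform the time-axis matching is sketched below in Python. It assumes that each subtitle cue carries a start and end time in seconds and that cues are sorted by start time and do not overlap; none of this is specified in the disclosure, and the function names are hypothetical. A binary search finds the cue that covers the current playback position.

```python
from bisect import bisect_right
from typing import Dict, List, Optional

Cue = Dict[str, object]


def active_cue(cues: List[Cue], position: float) -> Optional[Cue]:
    """Return the cue whose [start, end) interval contains the playback position."""
    starts = [cue["start"] for cue in cues]
    index = bisect_right(starts, position) - 1   # last cue starting at or before position
    if index >= 0 and position < cues[index]["end"]:
        return cues[index]
    return None


def render_lines(cue: Optional[Cue], target_position: str = "below") -> List[str]:
    """Order the original and target subtitle lines for comparative display."""
    if cue is None:
        return []
    pair = [str(cue["original"]), str(cue["text"])]
    return pair if target_position == "below" else pair[::-1]


cues = [
    {"start": 0.0, "end": 2.0, "original": "Bonjour", "text": "你好"},
    {"start": 2.5, "end": 4.0, "original": "Merci", "text": "谢谢"},
]
lines_at_one_second = render_lines(active_cue(cues, 1.0))
```

A player would call `active_cue` with the current playback position on every display refresh, so the subtitle stays aligned with picture and sound even after seeking.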

In the simultaneous subtitle translation method of the invention, decodable audio and video resources are decoded and their key information is sampled; the sampled key information is transmitted to the cloud server together with the user's language requirement, translated by the cloud server, and returned to the smart display device, where it is output as subtitles alongside the original language and presented to the user. The user can thus easily understand audio and video content in different languages, which makes communication and learning more convenient. The user is given the real context and at the same time immediately understands what the audio or video is expressing. The method has high processing efficiency and a wide range of applications.

Further, as shown in FIG. 4, the overall process of simultaneous target-subtitle translation in the present invention is as follows:

Step S0: the user selects a target language for the subtitles that are finally output (taking a smart television as an example of the terminal device, the smart television receives the user's operation instruction);

Step S1: the smart television collects voice information from the playing resource (such as audio or video);

Step S2: the smart television sends the target language and the voice information to a cloud server;

Step S3: the cloud server translates the voice information into the target subtitle corresponding to the target language and transmits the target subtitle back to the smart television;

Step S4: the smart television receives the target subtitle sent by the cloud server and matches it with the time axis of the playing resource in real time, so that the picture, the voice and the subtitle are output synchronously (step S4 and step S1 can be performed concurrently in the background, without affecting the user's viewing);

Step S5: after the target subtitle has been matched with the time axis of the playing resource, the target subtitle is displayed synchronously on the display interface of the smart television during playback of the playing resource.
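Steps S0 to S5 can be strung together in a single-process Python sketch like the one below. Everything in it is a stand-in: the "audio" is UTF-8 text with phrases separated by `|`, the cloud server is a local function with a two-entry lookup table, each phrase is assumed to last one second, and `zlib` is an assumed compression scheme. The sketch is meant only to show the order of the steps, not how a real television or server would implement them.

```python
import zlib
from typing import Dict, List, Optional, Tuple

Cue = Dict[str, object]


def fake_cloud_server(compressed_voice: bytes, target_language: str) -> List[Cue]:
    """S3: stand-in for cloud recognition and translation, returning timed cues."""
    phrases = zlib.decompress(compressed_voice).decode("utf-8").split("|")
    table = {("Bonjour", "zh"): "你好", ("Merci", "zh"): "谢谢"}
    return [
        {
            "start": float(index),          # assumed: one second per phrase
            "end": float(index + 1),
            "text": table.get((phrase, target_language), phrase),
        }
        for index, phrase in enumerate(phrases)
    ]


def run_pipeline(
    target_language: str,                   # S0: language selected by the user
    audio_track: bytes,
    positions: List[float],
) -> List[Tuple[float, Optional[str]]]:
    voice = audio_track                                     # S1: collect voice information
    compressed = zlib.compress(voice)                       # S2: compress and send
    cues = fake_cloud_server(compressed, target_language)   # S3: translate in the cloud
    shown = []
    for position in positions:                              # S4: match with the time axis
        text = next(
            (str(c["text"]) for c in cues if c["start"] <= position < c["end"]),
            None,
        )
        shown.append((position, text))                      # S5: display in sync
    return shown


displayed = run_pipeline("zh", "Bonjour|Merci".encode("utf-8"), [0.5, 1.5, 2.5])
```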

The invention brings the following technical advantages:

(1) A new function is added at no extra cost: a user who already owns a smart display terminal does not need to purchase any other hardware, which saves hardware cost.

(2) The user gains a wider range of usage scenarios. With this function the user can understand the information conveyed by audio and video resources in different languages, such as news or courses. Because native-language and foreign-language subtitles are output synchronously for comparison, the function can also serve as an aid for learning foreign languages, allowing the user to improve in the context of real scenes.

(3) Manufacturers of large-screen smart display devices can gain more potential customers through this function, increase large-screen usage time and customer stickiness, and obtain more commercial value.

(4) For the AI technology of the cloud server, training on massive audio and video resources and the growth of customers' varied demands can promote the rapid development of machine language processing.

Further, as shown in FIG. 5, based on the above simultaneous subtitle translation method, the present invention also provides a smart television, which includes a processor 10, a memory 20, and a display 30. FIG. 5 shows only some of the components of the smart television; it should be understood that not all of the components shown are required, and that more or fewer components may be implemented instead.

In some embodiments, the memory 20 may be an internal storage unit of the smart television, for example a hard disk or internal memory of the smart television. In other embodiments, the memory 20 may be an external storage device of the smart television, such as a plug-in hard disk provided on the smart television, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card). Further, the memory 20 may include both an internal storage unit and an external storage device of the smart television. The memory 20 is used for storing application software installed on the smart television and various types of data, such as program code installed on the smart television. The memory 20 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 stores a simultaneous subtitle translation program 40, which can be executed by the processor 10 to implement the simultaneous subtitle translation method of the present application.

The processor 10 may be, in some embodiments, a Central Processing Unit (CPU), a microprocessor or other data Processing chip, and is configured to execute the program codes stored in the memory 20 or process data, such as executing the simultaneous subtitle translation method.

The display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display 30 is used for displaying information on the smart television and for displaying a visual user interface. The components 10-30 of the smart television communicate with each other via a system bus.

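In one embodiment, when the processor 10 executes the simultaneous subtitle translation program 40 in the memory 20, the following steps are implemented:

receiving an operation instruction by which a user selects a target language, collecting voice information from a playing resource, and sending the target language and the voice information to a cloud server;

receiving a target subtitle returned by the cloud server, wherein the cloud server translates the voice information into the target subtitle corresponding to the target language;

and matching the target subtitle with the time axis of the playing resource in real time, and displaying the target subtitle synchronously during playback.

The receiving of the operation instruction, the collecting of the voice information and the sending of the target language and the voice information to the cloud server specifically include: receiving an operation instruction by which the user selects a target language through a remote controller key or a touch-screen menu, wherein the target language is selected from a plurality of pre-stored languages; and collecting the voice information from the playing resource that currently needs to be played, and sending the compressed voice information and the target language to the cloud server.

The collecting of the voice information from the playing resource specifically includes: when the playing resource is an audio resource, directly acquiring the voice information in the audio resource; and when the playing resource is a video resource, identifying and separating the voice information in the video resource.

After the receiving of the operation instruction by which the user selects the target language, the steps further include: receiving a subtitle style selected by the user for displaying the target subtitle on a display interface, wherein the parameters of the subtitle style comprise subtitle color, subtitle transparency, display position and font size.

The matching of the target subtitle with the time axis of the playing resource in real time and the synchronous display of the target subtitle during playback specifically include: acquiring the target subtitle sent by the cloud server, and matching the target subtitle with the time axis of the playing resource in real time, wherein the matching comprises synchronizing the picture, the voice and the subtitle; and after the target subtitle has been matched with the time axis of the playing resource, displaying the target subtitle synchronously on the display interface during playback of the playing resource.

The synchronous display of the target subtitle during playback further includes: during playback of the playing resource, displaying the original subtitle and the target subtitle together for comparison on the display interface, so that the contrast between the two assists the user in learning the language of the original subtitle.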

The invention also provides a storage medium, wherein the storage medium stores a simultaneous subtitle translation program, and the simultaneous subtitle translation program realizes the steps of the simultaneous subtitle translation method when being executed by a processor.

In summary, the present invention provides a simultaneous subtitle translation method, a smart television and a storage medium, where the method includes: receiving an operation instruction by which a user selects a target language, collecting voice information from a playing resource, and sending the target language and the voice information to a cloud server; receiving a target subtitle returned by the cloud server, wherein the cloud server translates the voice information into the target subtitle corresponding to the target language; and matching the target subtitle with the time axis of the playing resource in real time and displaying the target subtitle synchronously during playback. By collecting the voice information from the playing resource, sending it to the cloud server together with the subtitle language the user requires, having it translated into the target subtitle in the language the user selected, and displaying that subtitle synchronously during playback, the invention meets the user's need for subtitles in various languages, lets the user easily understand audio and video content in different languages, and makes viewing and learning more convenient.

Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.

It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims

1. A method for simultaneous subtitle translation, comprising:

2. The simultaneous subtitle translation method according to claim 1, wherein the receiving an operation instruction of a user selecting a target language, collecting voice information in a playing resource, and sending the target language and the voice information to a cloud server specifically includes:

3. The simultaneous subtitle translation method according to claim 2, wherein the collecting voice information in a playing resource specifically includes:

4. The method for simultaneous subtitle translation according to claim 1, wherein the receiving of an operation instruction of a user selecting a target language further comprises:

5. The method for simultaneous subtitle translation according to claim 1, wherein the matching of the target subtitles with the time axis of the playing resources in real time and the synchronous display of the target subtitles during the playing process specifically include:

6. The method for simultaneous subtitle translation according to claim 5, wherein the synchronously displaying the target subtitles during playing further comprises:

7. The simultaneous subtitle translation method according to claim 1 or 2, wherein the cloud server is configured to translate the voice information into the target subtitle corresponding to the target language, and specifically includes:

the cloud server receives the voice information and the target language;

8. A smart television, characterized in that the smart television comprises: a memory, a processor, and a simultaneous subtitle translation program stored on the memory and executable on the processor, the simultaneous subtitle translation program, when executed by the processor, implementing the steps of the simultaneous subtitle translation method of any one of claims 1-6.

9. A simultaneous subtitle translation system, comprising the smart television of claim 8, and a cloud server in communication with the smart television;

the smart television is configured to receive an operation instruction by which a user selects a target language, collect voice information from a playing resource, and send the voice information to the cloud server; the smart television is further configured to receive the target subtitle returned by the cloud server, match the target subtitle with the playing time axis of the playing resource in real time, and display the target subtitle synchronously during playback;

the cloud server is used for translating the voice information into the target subtitle corresponding to the target language and sending the target subtitle to the smart television.

10. A storage medium storing a simultaneous subtitle translation program that, when executed by a processor, implements the steps of the simultaneous subtitle translation method of any one of claims 1-6.