CN113127701A - Subtitle language identification method and device, computer equipment and computer readable medium - Google Patents


Info

Publication number
CN113127701A
CN113127701A (application CN201911416584.XA)
Authority
CN
China
Prior art keywords
language
caption
determining
character
closed caption
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911416584.XA
Other languages
Chinese (zh)
Inventor
洪冲
王伟
Current Assignee
ZTE Corp
Original Assignee
ZTE Corp
Priority date
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN201911416584.XA priority Critical patent/CN113127701A/en
Priority to PCT/CN2020/139479 priority patent/WO2021136096A1/en
Publication of CN113127701A publication Critical patent/CN113127701A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream

Abstract

The invention provides a caption language identification method, which includes: acquiring a video stream in a preset coding format from a code stream; acquiring network abstraction layer data of the supplemental enhancement information type from the video stream; if a closed caption is obtained from that network abstraction layer data, determining the character codes of the closed caption; and determining the language of the closed caption according to the character codes and a preset mapping relationship between character codes and languages. Embodiments of the disclosure can quickly and accurately identify the language of a closed caption when the original code stream lacks caption language information. The disclosure also provides a caption language identification device, a computer device, and a computer-readable medium.

Description

Subtitle language identification method and device, computer equipment and computer readable medium
Technical Field
The disclosure relates to the technical field of multimedia, in particular to a method and a device for recognizing caption languages, computer equipment and a computer readable medium.
Background
Many broadcast and multicast video signals now include text that can be displayed on a television or other display device, and CC (Closed Caption) text is a typical example. CC text is a transcript of the spoken portion of a video and sometimes also describes background sounds in the soundtrack. Initially, CC text was provided as a convenience for hearing-impaired viewers; later it also came to be used in environments where the audio portion of the signal is hard to hear because the ambient noise level is too high or too low, such as bars, restaurants, airports, and medical facilities. There are two types of CC text: EIA (Electronic Industries Association) 608, which conforms to the NTSC (National Television Standards Committee) standard, and EIA 708, which conforms to the ATSC (Advanced Television Systems Committee) standard. EIA 608 text supports six languages: English, French, Spanish, Danish, German, and Portuguese. EIA 608 supports these six languages through three types of character sets: a standard character set, a special character set, and an extended character set.
A video signal may also carry other text services relating to programs, electronic program guides, news, sports, emergency announcements, and many other types of information, including teletext services such as TeleText, Ceefax, and Oracle. Most text services are currently encoded in the VBI (vertical blanking interval) of a video signal; a small portion are instead carried alongside the audio and video portions of digital video signals, such as MPEG-2 (Moving Picture Experts Group) and MPEG-4 encoded signals.
The amount of text that any video signal can carry is limited by the encoding system, and video signals using VBI encoding have only limited capacity for carrying text. Because CC text must be carried entirely on line 21 of the VBI, the number of characters that can be encoded into each frame is limited. In particular, the original code stream lacks caption language information, so the language of the caption cannot be read directly from the code stream data.
Disclosure of Invention
In view of the above-mentioned shortcomings in the prior art, the present disclosure provides a method, an apparatus, a computer device and a computer readable medium for recognizing a caption language.
In a first aspect, an embodiment of the present disclosure provides a method for recognizing a caption language, where the method includes:
acquiring a video stream with a preset coding format in a code stream;
acquiring network abstraction layer data of a supplementary enhancement information type from the video stream;
if the closed caption is obtained from the network abstraction layer data of the supplementary enhancement information type, determining the character code of the closed caption;
and determining the language of the closed caption according to the character code and a preset mapping relation between the character code and the language.
Further, the determining the character encoding of the closed caption includes:
and converting every preset number of bytes of the closed caption into one character code in a preset base.
Further, the determining the language of the closed caption according to the character code and a preset mapping relationship between the character code and the language includes:
selecting a character code;
and if a first determination result comprising a language is obtained according to the character codes and the mapping relation, determining the language as the language of the closed caption.
Further, the determining the language of the closed caption according to the character code and a preset mapping relationship between the character code and the language further includes:
and if a second determination result comprising a plurality of languages is obtained according to the character codes and the mapping relation, processing the second determination result and a previous processing result to determine the same language in the second determination result and the previous processing result, and if the same language is one, determining the same language as the language of the closed caption.
Further, the determining the language of the closed caption according to the character code and a preset mapping relationship between the character code and the language further includes:
if the same language is multiple, selecting other character codes, and determining the language of the closed caption according to the selected character code and the mapping relation.
Further, after acquiring the closed caption from the network abstraction layer data of the supplemental enhancement information type, and before determining the language of the closed caption according to the character code and the preset mapping relationship between the character code and the language, the method further includes: determining a transmission channel of the closed captions;
the method further comprises the following steps:
and if the second determination result corresponding to the last character code of the closed caption and the previous processing result share multiple languages, and the closed caption has only one transmission channel, determining that the language of the closed caption is English.
In another aspect, an embodiment of the present disclosure further provides a device for recognizing a caption language, including: the device comprises a first acquisition module, a second acquisition module, a first determination module and a second determination module;
the first acquisition module is used for acquiring a video stream with a preset coding format in a code stream;
the second obtaining module is configured to obtain, from the video stream, network abstraction layer data of a supplemental enhancement information type;
the first determining module is configured to determine a character code of the closed caption if the closed caption is acquired from the network abstraction layer data of the supplemental enhancement information type;
and the second determining module is used for determining the language of the closed caption according to the character code and the mapping relation between the preset character code and the language.
In another aspect, an embodiment of the present disclosure further provides a computer device, including: one or more processors and storage; the storage device stores one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the method for recognizing the caption language according to the foregoing embodiments.
The embodiments of the present disclosure also provide a computer readable medium, on which a computer program is stored, where the computer program is executed to implement the method for recognizing the caption language according to the foregoing embodiments.
The caption language identification method provided by the embodiment of the present disclosure acquires a video stream in a preset coding format from a code stream, acquires network abstraction layer data of the supplemental enhancement information type from the video stream, determines the character codes of a closed caption if the closed caption is obtained from that network abstraction layer data, and determines the language of the closed caption according to the character codes and a preset mapping relationship between character codes and languages. Embodiments of the disclosure can quickly and accurately identify the language of a closed caption when the original code stream lacks caption language information.
Drawings
Fig. 1 is a flowchart of a caption language identification method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating the closed caption language determination according to character encoding and mapping according to another embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a caption language identification device according to yet another embodiment of the present disclosure.
Detailed Description
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings, but they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Embodiments described herein may be described with reference to plan and/or cross-sectional views in light of idealized schematic illustrations of the disclosure. Accordingly, the example illustrations can be modified in accordance with manufacturing techniques and/or tolerances. Accordingly, the embodiments are not limited to the embodiments shown in the drawings, but include modifications of configurations formed based on a manufacturing process. Thus, the regions illustrated in the figures have schematic properties, and the shapes of the regions shown in the figures illustrate specific shapes of regions of elements, but are not intended to be limiting.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
An embodiment of the present disclosure provides a method for recognizing a caption language, as shown in fig. 1, the method may include the following steps:
and step 11, acquiring a video stream with a preset coding format in the code stream.
In the embodiment of the present disclosure, the preset encoding format may be the AVC/H.264 compression format (Advanced Video Coding, MPEG-4 Part 10). The code stream may be a channel code stream containing encoded text received on any broadcast or multicast channel, for example a UDP (User Datagram Protocol) MPEG (Moving Picture Experts Group) TS (Transport Stream) code stream, which may be obtained by joining a UDP multicast group or by binding a UDP unicast address.
In this step, taking a UDP MPEG TS code stream as an example, the subtitle language identification apparatus may first find the TS packets (the basic unit of a UDP MPEG TS code stream) whose PID (Packet Identifier) is 0. A TS packet with PID 0 generally carries the PAT (Program Association Table); the PAT lists the PID of each PMT (Program Map Table), from which the PMT can be found, and after parsing the PMT the apparatus can locate the TS packets in the code stream whose PIDs match the elementary streams listed in the PMT. The caption language identification device can then parse those TS packets, determine from each packet's stream_type value whether it belongs to a video stream, and if so further determine whether the video stream uses the AVC/H.264 compression format; if it does, the video stream in the preset coding format has been obtained.
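The PAT-to-PMT-to-stream_type chain above can be sketched as follows. This is an illustrative outline, not the patent's implementation: the helper names and the injected `parse_pat`/`parse_pmt` callbacks are assumptions, and a real demuxer must also handle adaptation fields, section syntax, and tables spanning multiple packets. The stream_type value 0x1B for AVC/H.264 comes from the MPEG-2 systems specification.

```python
# Sketch of locating the H.264 video stream in a UDP MPEG TS code stream
# by following PID 0 (PAT) -> PMT PIDs -> stream_type, as described above.

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
STREAM_TYPE_H264 = 0x1B  # AVC/H.264 per ISO/IEC 13818-1

def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from a 188-byte TS packet header."""
    return ((packet[1] & 0x1F) << 8) | packet[2]

def split_ts_packets(stream: bytes):
    """Yield aligned 188-byte TS packets from a captured byte stream."""
    for off in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = stream[off:off + TS_PACKET_SIZE]
        if pkt[0] == SYNC_BYTE:
            yield pkt

def find_h264_pid(packets, parse_pat, parse_pmt):
    """PID 0 carries the PAT; the PAT lists PMT PIDs; each PMT lists
    elementary-stream PIDs with their stream_type values."""
    pmt_pids, video_pid = set(), None
    for pkt in packets:
        pid = packet_pid(pkt)
        if pid == 0:
            pmt_pids.update(parse_pat(pkt))          # PMT PIDs from the PAT
        elif pid in pmt_pids:
            for es_pid, stream_type in parse_pmt(pkt):
                if stream_type == STREAM_TYPE_H264:  # AVC/H.264 video stream
                    video_pid = es_pid
    return video_pid
```

The table-parsing callbacks are left abstract because the patent only describes the lookup order, not the section parsing itself.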
And step 12, acquiring the network abstraction layer data of the type of the supplementary enhancement information from the video stream.
In this step, a video stream in the AVC/H.264 compression format generally consists of NAL (Network Abstraction Layer) units. The subtitle language identification apparatus may first obtain the NAL units from the video stream and then determine whether each unit's type is SEI (Supplemental Enhancement Information); if so, the SEI NAL data is obtained.
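The SEI filtering step can be sketched as below, assuming an Annex-B byte stream where NAL units are separated by 0x000001 start codes and the low five bits of the first payload byte give the NAL type (6 = SEI in H.264). This is a simplified sketch: a complete parser must also strip emulation-prevention bytes (0x03) from the payload.

```python
# Sketch: split an H.264 Annex-B byte stream on start codes and keep
# only SEI NAL units (nal_unit_type == 6), as described above.

NAL_TYPE_SEI = 6

def split_nal_units(data: bytes):
    """Split on 0x000001 start codes, yielding raw NAL unit bytes."""
    units, i = [], 0
    while True:
        start = data.find(b"\x00\x00\x01", i)
        if start == -1:
            break
        start += 3
        end = data.find(b"\x00\x00\x01", start)
        payload = data[start:end if end != -1 else len(data)]
        # drop the extra zero left by a 4-byte start code on the next unit
        units.append(payload.rstrip(b"\x00") if end != -1 else payload)
        if end == -1:
            break
        i = end
    return units

def sei_nal_units(data: bytes):
    """Keep units whose header's low 5 bits say nal_unit_type == SEI."""
    return [u for u in split_nal_units(data) if u and (u[0] & 0x1F) == NAL_TYPE_SEI]
```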
And step 13, if the closed caption is obtained from the network abstraction layer data of the supplementary enhancement information type, determining the character coding of the closed caption.
In this step, the caption language identification device may first parse the SEI NAL data to find the data area whose payloadType value is 4 (i.e., the user_data_registered_itu_t_t35 data area), then parse that data area and determine whether it contains the four consecutive bytes whose ASCII value is "GA94". If so, it may be determined that the SEI NAL data contains CC captions of the EIA (Electronic Industries Association) 608 type, and the caption language identification device may obtain the CC captions from the SEI NAL data. After acquiring the CC captions in the SEI NAL data, the caption language identification apparatus may determine the character encoding of the CC captions.
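The payloadType-4 / "GA94" check above can be sketched as follows. The payloadType and payloadSize fields in an SEI body use 0xFF run-length extension coding, and 0x80 marks the RBSP trailing bits; the sample payload bytes in the test are illustrative, not taken from a real stream.

```python
# Sketch of scanning an SEI body for the registered ITU-T T.35 user data
# that carries EIA-608 CC captions: payloadType 4 plus the four ASCII
# bytes "GA94" (0x47 0x41 0x39 0x34), as described above.

SEI_USER_DATA_REGISTERED = 4
ATSC_IDENTIFIER = b"GA94"

def parse_sei_payloads(sei_body: bytes):
    """Yield (payloadType, payload) pairs; type and size use 0xFF coding."""
    i = 0
    while i < len(sei_body) and sei_body[i] != 0x80:  # 0x80 = trailing bits
        ptype = 0
        while sei_body[i] == 0xFF:
            ptype += 255
            i += 1
        ptype += sei_body[i]; i += 1
        size = 0
        while sei_body[i] == 0xFF:
            size += 255
            i += 1
        size += sei_body[i]; i += 1
        yield ptype, sei_body[i:i + size]
        i += size

def contains_cc_caption(sei_body: bytes) -> bool:
    """True if a payloadType-4 payload contains the ATSC 'GA94' marker."""
    return any(ptype == SEI_USER_DATA_REGISTERED and ATSC_IDENTIFIER in payload
               for ptype, payload in parse_sei_payloads(sei_body))
```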
And step 14, determining the language of the closed caption according to the character code and the preset mapping relation between the character code and the language.
The mapping relationship between character codes and languages is shown in table 1.
TABLE 1
Character encoding | Language 1 | Language 2 | Language 3 | Language 4 | Language 5
…… | …… | …… | …… | …… | ……
The mapping relationship between the character codes and the languages (i.e. table 1) reflects the correspondence relationship between the character codes and the languages supported by the CC caption text, and indicates in which languages the characters corresponding to the character codes may appear. In this step, the language of the CC caption corresponding to the character code can be determined by querying the mapping relationship using the character code as an index.
In embodiments of the present disclosure, the languages include French, Spanish, Danish, German, and Portuguese.
It should be noted that Table 1 may further include columns for the character set (Character Sets), the closed caption (Closed Caption), the display form (Display), and the character annotation (Unicode Name).
As can be seen from steps 11 to 14, the caption language identification method provided in the embodiment of the present disclosure acquires a video stream in a preset coding format from a code stream, acquires network abstraction layer data of the supplemental enhancement information type from the video stream, determines the character codes of a closed caption if the closed caption is obtained from that data, and determines the language of the closed caption according to the character codes and the preset mapping relationship between character codes and languages. The embodiment of the disclosure can quickly and accurately identify the language of a closed caption when the original code stream lacks caption language information.
In some embodiments, the determining the character encoding of the closed caption may include: converting every preset number of bytes of the closed caption into one character code in a preset base. In this step, the caption language identification device may determine, according to the character encoding scheme used by the CC captions, how many bytes are converted into one character code; for example, if the CC captions use UTF-16 (a 16-bit Unicode encoding), every two bytes of CC caption data may be converted into one UTF-16 character code. In this step, the caption language identification device may convert all the CC captions in the SEI NAL data into character codes.
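The grouping step above can be sketched in a few lines; two bytes per code matches the worked example later in the document, where codes such as c3a7 appear as four-hex-digit strings. The function name is illustrative.

```python
# Sketch: group the CC caption bytes into a preset number of bytes
# (two here) and render each group as one hexadecimal character code.
# Trailing bytes that do not fill a whole group are ignored.

def caption_to_char_codes(caption: bytes, group_size: int = 2):
    """Convert every `group_size` bytes of caption data into one hex code."""
    return [caption[i:i + group_size].hex()
            for i in range(0, len(caption) - group_size + 1, group_size)]
```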
In some embodiments, as shown in fig. 2, the determining the language of the closed caption according to the character encoding and the preset mapping relationship between the character encoding and the language may include the following steps:
step 21, selecting a character code.
In this step, the caption language identification device selects a character code as an index to inquire the mapping relation.
Step 22, determining the language of the closed caption according to the character coding and mapping relationship, if a first determination result including one language is obtained, executing step 23, and if a second determination result including a plurality of languages is obtained, executing step 24.
In this step, if the query returns only one language, the character represented by the character code is specific to that language and does not appear in any other language; the result of such a query is the first determination result. If the query returns multiple languages, the character represented by the character code exists in several languages and is not specific to any one of them; the result of such a query is the second determination result. The caption language identification device may determine whether the language in the determination result is unique, execute step 23 if it is unique, and execute step 24 if it is not.
The determination result obtained by querying the mapping relationship with a character code may be represented as a binary string whose number of digits equals the number of languages in Table 1. Taking 5 languages as an example, a 5-digit binary determination result is obtained, ordered as the languages are ordered in Table 1. For example, in the determination result 10001, a 1 indicates that the character corresponding to the character code exists in the corresponding language (languages 1 and 5), and a 0 indicates that it does not (languages 2, 3, and 4). The caption language identification device may judge whether the result is the first or the second determination result by counting the 1s in it: when there is exactly one 1, step 22 has obtained a first determination result including one language; when there are multiple 1s, step 22 has obtained a second determination result including multiple languages.
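The binary representation above can be sketched with a toy mapping. The two entries below mirror the codes used in the document's worked example, but the mapping itself is hypothetical, not the patent's Table 1; the language order follows the list given earlier (French, Spanish, Danish, German, Portuguese).

```python
# Sketch of the 5-bit determination result: one bit per language in the
# table's order, 1 meaning the character can occur in that language.

LANGUAGES = ["French", "Spanish", "Danish", "German", "Portuguese"]

# Hypothetical mapping: character code -> 5-digit binary determination result.
CHAR_LANGUAGE_MAP = {
    "c3a7": "10000",  # character appears only in language 1 (French)
    "c3a9": "11101",  # character appears in languages 1, 2, 3 and 5
}

def lookup(char_code: str) -> str:
    """Query the mapping; an unknown code matches no language."""
    return CHAR_LANGUAGE_MAP.get(char_code, "00000")

def languages_in(result: str):
    """Names of the languages whose bit is set in a determination result."""
    return [name for bit, name in zip(result, LANGUAGES) if bit == "1"]

def is_first_determination(result: str) -> bool:
    """A first determination result contains exactly one set bit."""
    return result.count("1") == 1
```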
And step 23, determining the language as the language of the closed caption.
In this step, the caption language identification means may directly determine the only language in the first determination result as the language of the CC caption.
And 24, processing the second determination result and the previous processing result to determine the same language in the second determination result and the previous processing result.
The caption language identification device performs the current processing according to the second determination result and the previous processing result: it determines the languages common to the multiple languages in the second determination result and the multiple languages in the previous processing result. If common languages exist, it judges whether the common language is unique; if it is, that unique common language is determined to be the language of the CC caption.
The processing operation in this step may be a bitwise AND operation. For example, when the second determination result is 10001 and the previous processing result is 11100, ANDing 10001 and 11100 gives the current processing result 10000, in which a 1 appears in the first position, indicating that the language shared by the current second determination result and the previous processing result is language 1. When the second determination result is 10001 and the previous processing result is 11001, ANDing them gives 10001, with 1s in the first and fifth positions, indicating that the shared languages are language 1 and language 5.
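The AND step can be sketched directly on the binary strings the text uses; the function name is illustrative.

```python
# Sketch of the AND step: intersect the current second determination
# result with the previous processing result, using the same 5-digit
# binary strings as the text (e.g. 10001 AND 11100 -> 10000).

def and_results(second: str, previous: str) -> str:
    """Bitwise AND of two binary determination-result strings."""
    width = len(second)
    value = int(second, 2) & int(previous, 2)
    return format(value, f"0{width}b")
```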
And 25, judging whether the same language is one or not, if so, executing a step 26, otherwise, executing a step 27.
In this step, whether the same language is unique can be judged by counting the 1s in the current processing result. For example, when the current processing result is 10000, there is only one 1, indicating that the second determination result and the previous processing result share exactly one language, and step 26 may be executed. When the current processing result is 10011, there are three 1s, indicating that the second determination result and the previous processing result share multiple languages, and step 27 may be executed.
And step 26, determining the same language as the language of the closed caption.
In this step, when the same language is unique (i.e. only one of the second determination result obtained by the processing in step 24 and the previous processing result is the same language), the caption language identification device may directly determine the same language as the language of the CC caption. That is, the caption language identification device may determine the language corresponding to "1" in the current processing result according to table 1, and determine the language as the language of the closed caption.
And 27, selecting other character codes, and determining the language of the closed caption according to the selected character code and the mapping relation.
In this step, when the same language is not unique (i.e. the same language in the second determination result obtained by the processing in step 24 and the previous processing result is multiple), the caption language identification apparatus needs to select another character code as an index to query the mapping relationship to obtain a determination result, and continue to determine the language of the CC caption according to the determination result, i.e. return to step 22.
It should be noted that, the first time the language of the CC caption is determined from a character code and the mapping relationship, the previous processing result is empty; if a second determination result including multiple languages is obtained, another character code is selected directly, and the language of the CC caption is determined from the newly selected character code and the mapping relationship. The second time the language is determined, the previous processing result is still empty, so if a second determination result including multiple languages is again obtained, it is processed together with the second determination result of the previous character code to determine the shared languages.
It should be noted that, when the same language does not exist in the processing result (i.e. when 0 "1" exists in the processing result), the caption language identification device may directly select another character code and determine the language of the CC caption according to the currently selected character code and the mapping relationship.
Since English has no special characters, and the non-special English characters also appear in the other languages, English is not included in the mapping relationship between character codes and languages (i.e., Table 1). In the embodiment of the present disclosure, whether a CC caption is English can instead be determined from the CC caption's transmission channel.
In some embodiments, after acquiring the closed caption from the network abstraction layer data of the supplemental enhancement information type and before determining the language of the closed caption according to the character code and the preset mapping relationship between the character code and the language, the caption language identification method provided by the embodiment of the present disclosure may further include: a transmission channel for closed captions is determined.
The method for recognizing the language of the subtitle provided by the embodiment of the present disclosure may further include: if the second determination result corresponding to the last character code of the closed caption and the previous processing result share multiple languages, and the closed caption has only one transmission channel, determining that the language of the closed caption is English.
That is to say, in the caption language identification method provided by the embodiment of the present disclosure, after the CC captions in the SEI NAL data are acquired, the transmission channel of the CC captions may also be determined. If the language of the CC caption still cannot be determined after the last character code has been processed, the apparatus may judge whether the CC caption's transmission channel is unique; if it is, the language of the CC caption may be directly determined to be English.
It should be noted that if the channel of the CC caption is not unique, the language of the CC caption cannot be determined, and the channel information is output instead. Considering that the language within one channel may change over time, the caption language identification device may, in real time, take the most recently determined unique language as the final language of the CC captions in that channel.
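The overall narrowing procedure, including the English fallback, can be tied together in one sketch. This is a reading of the steps above, not the patent's code: the `lookup` callback is any character-code-to-binary-result function, and the choice to keep the previous candidates when an AND yields no shared language is an assumption (the text only says another character code is selected in that case). Returning `None` stands in for "output the channel information".

```python
# Sketch: narrow candidate languages by AND-ing successive determination
# results; if the last character code still leaves several candidates and
# the caption has a single transmission channel, fall back to English.

def identify_language(char_codes, lookup, languages, channel_count):
    candidates = None  # previous processing result; empty at the start
    for code in char_codes:
        result = lookup(code)
        ones = result.count("1")
        if ones == 1:                             # first determination result
            return languages[result.index("1")]
        if ones == 0:                             # no language matched;
            continue                              # try the next character code
        if candidates is None:                    # nothing to AND against yet
            candidates = result
            continue
        merged = format(int(candidates, 2) & int(result, 2), f"0{len(result)}b")
        if merged.count("1") == 1:                # a unique shared language
            return languages[merged.index("1")]
        if merged.count("1") > 1:                 # still ambiguous; keep going
            candidates = merged
        # assumption: if no shared language, keep the previous candidates
    # exhausted all character codes without finding a unique language
    return "English" if channel_count == 1 else None
```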
The following briefly describes the caption language identification method provided by the embodiment of the present disclosure with reference to a specific embodiment. The caption language type identification device acquires a video stream with a preset coding format in a code stream, acquires SEI NAL data from the video stream, acquires CC captions in the SEI NAL data, determines character codes of the CC captions, and determines the language type of the CC captions according to the character codes and the mapping relation. The mapping relationship between the character code (i.e. hexadecimal converted value) of CC caption and the language is shown in table 2:
TABLE 2
[Table 2 is reproduced as images in the original publication; it maps each hexadecimal character code to a determination-result bitmap with one bit per candidate language.]
If the selected character code is c3a7, querying the mapping relationship yields the determination result 10000, in which "1" appears only in the first position, indicating that the character corresponding to c3a7 exists only in French. A first determination result containing one language (French) is therefore obtained, and French is determined to be the language of the CC caption. If the character code is c3a9, querying the mapping relationship yields the determination result 11101, in which "1" appears in the first, second, third, and fifth positions, indicating that the character corresponding to c3a9 exists in four languages: French, Spanish, Danish, and Portuguese. A second determination result containing these four languages is obtained. The caption language identification device then processes the second determination result together with the previous processing result. If the previous processing result is 11011, an AND operation on the second determination result 11101 and the previous processing result 11011 yields 11001, in which "1" appears in the first, second, and fifth positions, so the shared languages are French, Spanish, and Portuguese. Since more than one shared language remains, the caption language identification device selects another character code and determines the language of the CC caption from the currently selected character code and the mapping relationship.
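The narrowing step in this example reduces to a bitwise AND over per-character language bitmaps. Since Table 2 is reproduced only as images, the two map entries below are taken from the worked example in the text, and the "German" label for the fourth bit position is an assumption (the excerpt never names that language):

```python
# Leftmost bit first; the fourth position's language is a guessed placeholder.
LANGS = ["French", "Spanish", "Danish", "German", "Portuguese"]

# Illustrative excerpt of the code-to-bitmap mapping (Table 2).
CHAR_LANG_MAP = {
    "c3a7": 0b10000,  # 'ç': French only
    "c3a9": 0b11101,  # 'é': French, Spanish, Danish, Portuguese
}

def narrow(prev_mask, char_code):
    """AND the bitmap for char_code into the running mask; return the new
    mask and the names of the languages still possible."""
    mask = prev_mask & CHAR_LANG_MAP.get(char_code, 0b11111)
    langs = [name for i, name in enumerate(LANGS) if mask & (1 << (4 - i))]
    return mask, langs

mask, langs = narrow(0b11011, "c3a9")
# 11011 AND 11101 -> 11001: French, Spanish, and Portuguese remain,
# so another character code must be selected.
```

Scanning stops as soon as a call returns a single language; if every character code is exhausted with several languages left, the single-channel English fallback described later applies.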
Based on the same technical concept, an embodiment of the present disclosure further provides a caption language identification device. As shown in fig. 3, the device may include: a first obtaining module 301, a second obtaining module 302, a first determining module 303, and a second determining module 304.
The first obtaining module 301 is configured to obtain a video stream in a preset encoding format in a code stream.
The second obtaining module 302 is configured to obtain network abstraction layer data of the supplemental enhancement information type from the video stream.
The first determining module 303 is configured to determine character encoding of the closed caption if the closed caption is acquired from the network abstraction layer data of the supplemental enhancement information type.
The second determining module 304 is configured to determine the language of the closed caption according to the character code and a preset mapping relationship between the character code and the language.
In some embodiments, the first determining module 303 is configured to convert a preset number of bytes of the closed caption into a character code in a preset radix.
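As a rough illustration of that conversion (a sketch, not the patent's implementation: the byte count and radix are configurable, and two bytes with hexadecimal output are assumed here to match the c3a7/c3a9 values used elsewhere in the text):

```python
def to_char_code(caption_bytes: bytes) -> str:
    """Convert a preset number of caption bytes (assumed: 2, e.g. the UTF-8
    sequence 0xC3 0xA7 for 'ç') into the hexadecimal character code used to
    query the mapping table."""
    return caption_bytes.hex()

code = to_char_code(bytes([0xC3, 0xA7]))  # -> "c3a7"
```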
In some embodiments, the second determining module 304 is configured to select a character code and, if a first determination result containing one language is obtained from the character code and the mapping relationship, determine that language to be the language of the closed caption.
In some embodiments, the second determining module 304 is configured to, if a second determination result containing multiple languages is obtained from the character code and the mapping relationship, process the second determination result together with the previous processing result to determine the languages common to both, and, if only one common language remains, determine that language to be the language of the closed caption.
In some embodiments, the second determining module 304 is configured to, if multiple common languages remain, select another character code and determine the language of the closed caption from the currently selected character code and the mapping relationship.
In some embodiments, the first determining module 303 is further configured to determine the transmission channel of the closed caption.
The second determining module 304 is further configured to determine that the language of the closed caption is English if the second determination result corresponding to the last character code of the closed caption and the previous processing result still share multiple languages and the closed caption has only one transmission channel.
An embodiment of the present disclosure further provides a computer device, including: one or more processors and a storage device. The storage device stores one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the caption language identification method of the foregoing embodiments.
The embodiments of the present disclosure also provide a computer-readable medium on which a computer program is stored, where the computer program, when executed, implements the caption language identification method of the foregoing embodiments.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods disclosed above, functional modules/units in the apparatus, may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise, as would be apparent to one skilled in the art. It will, therefore, be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims (9)

1. A caption language identification method comprises the following steps:
acquiring a video stream with a preset coding format in a code stream;
acquiring network abstraction layer data of a supplementary enhancement information type from the video stream;
if the closed caption is obtained from the network abstraction layer data of the supplementary enhancement information type, determining the character code of the closed caption;
and determining the language of the closed caption according to the character code and a preset mapping relation between the character code and the language.
2. The method of claim 1, wherein said determining the character code of the closed caption comprises:
converting a preset number of bytes of the closed caption into a character code in a preset radix.
3. The method according to claim 1 or 2, wherein said determining the language of the closed caption according to the character code and the preset mapping relationship between the character code and the language comprises:
selecting a character code;
and if a first determination result comprising one language is obtained according to the character code and the mapping relationship, determining the language as the language of the closed caption.
4. The method of claim 3, wherein said determining the language of the closed caption according to the character encoding and a preset mapping relationship between the character encoding and the language further comprises:
and if a second determination result comprising a plurality of languages is obtained according to the character code and the mapping relationship, processing the second determination result and a previous processing result to determine the languages that are the same in the second determination result and the previous processing result, and if there is one same language, determining the same language as the language of the closed caption.
5. The method of claim 4, wherein said determining the language of the closed caption according to the character encoding and a preset mapping relationship between the character encoding and the language further comprises:
if there are a plurality of same languages, selecting another character code, and determining the language of the closed caption according to the currently selected character code and the mapping relationship.
6. The method of claim 5, wherein after acquiring the closed caption from the network abstraction layer data of the supplemental enhancement information type, before determining the language of the closed caption according to the character code and a preset mapping relationship between the character code and the language, the method further comprises: determining a transmission channel of the closed captions;
the method further comprises the following steps:
and if the second determination result corresponding to the last character code of the closed caption and the previous processing result have a plurality of same languages, and the closed caption has one transmission channel, determining that the language of the closed caption is English.
7. A caption language identification device, comprising: a first acquisition module, a second acquisition module, a first determination module, and a second determination module;
the first acquisition module is used for acquiring a video stream with a preset coding format in a code stream;
the second obtaining module is configured to obtain, from the video stream, network abstraction layer data of a supplemental enhancement information type;
the first determining module is configured to determine a character code of the closed caption if the closed caption is acquired from the network abstraction layer data of the supplemental enhancement information type;
and the second determining module is configured to determine the language of the closed caption according to the character code and a preset mapping relationship between the character code and the language.
8. A computer device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the caption language identification method of any one of claims 1-6.
9. A computer-readable medium on which a computer program is stored, wherein the program, when executed, implements the caption language identification method according to any one of claims 1 to 6.
CN201911416584.XA 2019-12-31 2019-12-31 Subtitle language identification method and device, computer equipment and computer readable medium Pending CN113127701A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911416584.XA CN113127701A (en) 2019-12-31 2019-12-31 Subtitle language identification method and device, computer equipment and computer readable medium
PCT/CN2020/139479 WO2021136096A1 (en) 2019-12-31 2020-12-25 Caption language identification method and apparatus, computer device, and computer-readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911416584.XA CN113127701A (en) 2019-12-31 2019-12-31 Subtitle language identification method and device, computer equipment and computer readable medium

Publications (1)

Publication Number Publication Date
CN113127701A true CN113127701A (en) 2021-07-16

Family

ID=76686486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911416584.XA Pending CN113127701A (en) 2019-12-31 2019-12-31 Subtitle language identification method and device, computer equipment and computer readable medium

Country Status (2)

Country Link
CN (1) CN113127701A (en)
WO (1) WO2021136096A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102333242A (en) * 2011-09-29 2012-01-25 深圳市万兴软件有限公司 Device and method for matching streaming media language information
US10944572B2 (en) * 2017-01-02 2021-03-09 Western Digital Technologies, Inc. Decryption and variant processing
KR101894889B1 (en) * 2017-04-06 2018-09-04 에스케이브로드밴드주식회사 Method and apparatus for providing video on demand service
CN108924600A (en) * 2018-06-28 2018-11-30 乐蜜有限公司 Sending and receiving methods, device and the electronic equipment of live data
CN108989876B (en) * 2018-07-27 2021-07-30 青岛海信传媒网络技术有限公司 Subtitle display method and device

Also Published As

Publication number Publication date
WO2021136096A1 (en) 2021-07-08

Similar Documents

Publication Publication Date Title
CN104137555B (en) Non-concealed caption data transmission in standard caption service
US11516495B2 (en) Broadcast system with a watermark payload
CN1599436B (en) Digital broadcast receiver and method for processing captionthereof
US8645983B2 (en) System and method for audible channel announce
US20120144447A1 (en) Digital television signal, digital television receiver, and method of processing digital television signal
CN107852526B (en) Method and receiver for processing a data stream
US10887669B2 (en) Broadcast system with a URI message watermark payload
US10341631B2 (en) Controlling modes of sub-title presentation
CA2795191A1 (en) Method and apparatus for processing non-real-time broadcast service and content transmitted by broadcast signal
US20040237123A1 (en) Apparatus and method for operating closed caption of digital TV
US8788693B2 (en) Apparatus and method for generating a data stream and apparatus and method for reading a data stream
US10972808B2 (en) Extensible watermark associated information retrieval
CN113127701A (en) Subtitle language identification method and device, computer equipment and computer readable medium
EP2555540A1 (en) Method for auto-detecting audio language name and television using the same
KR20070052169A (en) A broadcasting signal for use in a digital television receiver and method and apparatus of decoding psip table
EP2552135A1 (en) Method, system and terminal for processing text information in mobile multimedia broadcast
KR100755839B1 (en) Broadcasting system and method for supporting sound multiplex
CN106454547B (en) real-time caption broadcasting method and system
KR20110022015A (en) Digital television transmitter, digital television receiver and method for processing a broadcast signal
US11102499B2 (en) Emergency messages in watermarks
KR101227497B1 (en) Digital broadcast signal and apparatus and method of processing the signal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination