CN111539427A - Method and system for extracting video subtitles - Google Patents

Method and system for extracting video subtitles

Info

Publication number
CN111539427A
CN111539427A
Authority
CN
China
Prior art keywords: gray, area, caption, picture, subtitle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010356689.7A
Other languages
Chinese (zh)
Other versions
CN111539427B (en)
Inventor
李钦
王正航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Yimantianxia Technology Co ltd
Original Assignee
Wuhan Yimantianxia Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Yimantianxia Technology Co ltd
Priority to CN202010356689.7A
Publication of CN111539427A
Application granted
Publication of CN111539427B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/60: Type of objects
    • G06V 20/62: Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V 20/635: Overlay text, e.g. embedded captions in a TV program
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47: End-user applications
    • H04N 21/488: Data services, e.g. news ticker
    • H04N 21/4884: Data services, e.g. news ticker for displaying subtitles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition

Abstract

The invention discloses a method and a system for extracting video subtitles, relating to the field of image processing. The method comprises: selecting a specific area of the video picture as the subtitle recognition area and selecting the subtitle color in the video picture; cropping each frame of the video to the determined subtitle recognition area and, using an image recognition algorithm, judging whether the subtitle recognition area of each frame contains a subtitle and whether the subtitle recognition areas of two adjacent frames are similar; based on the judgment results, grouping adjacent frames that contain the same subtitle and recording the timestamps of the first and last frames of each group; and performing OCR on the subtitle recognition area of the first frame of each group to obtain the subtitle, taking the timestamps of the first and last frames of the group as the start and end timestamps of that subtitle, and generating a subtitle file. The invention effectively reduces the time needed to extract video subtitles.

Description

Method and system for extracting video subtitles
Technical Field
The invention relates to the field of image processing, in particular to a method and a system for extracting video subtitles.
Background
Subtitles are text that presents non-image content such as dialogue in television, film and stage works, and also text added to video and film works in post-production. Commentary and other text appearing at the bottom of cinema screens, television screens and the like, such as the title, credits, lyrics, dialogue, explanatory words, and introductions of people, place names and years, are collectively called subtitles.
In practical applications, subtitles often need to be extracted from a video. Existing subtitle extraction methods, however, have the following disadvantages: 1. they are slow, with extraction typically taking 5-10 times the duration of the original video; 2. the generated subtitle timeline does not enter and exit on the same frames as the original video subtitles; 3. considerable manual work is required to handle repeated frames.
Disclosure of Invention
In view of these defects in the prior art, the invention aims to provide a method and a system for extracting video subtitles that effectively reduce the extraction time.
In order to achieve the above object, the present invention provides a method for extracting video subtitles, comprising the following steps:
selecting a specific area in a video picture as a subtitle identification area, and selecting the color of a subtitle in the video picture;
cropping each frame of the video to the determined subtitle recognition area, and using an image recognition algorithm to judge whether the subtitle recognition area of each frame contains a subtitle and whether the subtitle recognition areas of two adjacent frames are similar;
based on the judgment results, grouping adjacent frames that contain the same subtitle, and recording the timestamps of the first and last frames of each group;
and performing OCR on the subtitle recognition area of the first frame of each group to obtain the subtitle, taking the timestamps of the first and last frames of the group as the start and end timestamps of the obtained subtitle, and generating a subtitle file.
On the basis of the above technical solution,
whether the subtitle recognition area of each frame contains a subtitle is judged using either a global judgment mode or a local judgment mode;
the global judgment mode comprises the following steps:
converting the caption identification area of the current frame picture into a gray image;
reading the gray image pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15], where gray is a preset gray value in the range 0-255;
based on the obtained count: if it is greater than 3 × h, the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where h is the height of the gray image;
the local judgment mode comprises the following steps:
cutting the subtitle recognition area of the current frame picture by using a preset cutting area to obtain a cut image;
converting the cropped image into a gray image and reading it pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15];
based on the obtained count: if it falls within [cw, cw × ch/2], the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where cw is the width and ch is the height of the cropped image.
On the basis of the above technical solution, the subtitle recognition area of the current frame is cropped using a preset cropping area, which is determined as follows:
horizontally segmenting the subtitle recognition area of the first frame of each group into several identical square unit areas and storing them in arrays, each array holding the number of valid pixels in the unit areas of the subtitle recognition area of one frame;
for each unit area in a single subtitle recognition area, judging its number of valid pixels: if the count of the current unit area falls within [h1, h1 × h/2], the weight value of the current unit area is the weight value of the previous unit area plus 1; otherwise its weight value equals that of the previous unit area, where valid pixels are pixels whose gray values fall within [gray-15, gray+15] and h1 is the side length of a unit area;
dividing all unit areas of the subtitle recognition area of the current frame into a left half and a right half, computing the sum of weight values of each half, and judging whether |left - right| / min{left, right} is greater than 0.1: if so, the subtitle of the current frame is left-aligned, otherwise it is center-aligned, where left is the sum of the weight values of the left half and right is the sum of the weight values of the right half;
for a frame with a left-aligned subtitle, finding the unit area with the largest weight value in the subtitle recognition area and the adjacent unit area after it, and merging these two unit areas into the preset cropping area; for a frame with a center-aligned subtitle, finding the unit area with the largest weight value and the adjacent unit areas before and after it, and merging these three unit areas into the preset cropping area.
On the basis of the above technical solution, whether the subtitle recognition areas of two adjacent frames are similar is judged as follows:
converting the subtitle recognition areas of the two adjacent frames into gray images to obtain two gray images;
reading both gray images pixel by pixel to count, in each, the pixels whose gray values fall within [gray-15, gray+15];
based on the obtained counts:
if the number of pixels whose gray values fall within [gray-15, gray+15] is 0 in the two gray images, the subtitle recognition areas of the two adjacent frames are not similar;
if diff / (valid1 + valid2) < 0.3, the subtitle recognition areas of the two adjacent frames are similar, where valid1 is the number of pixels whose gray values fall within [gray-15, gray+15] in one gray image, valid2 is that number in the other gray image, and diff is the number of positions at which the two gray images do not agree on whether the pixel is valid or invalid; valid pixels are pixels whose gray values fall within [gray-15, gray+15], and invalid pixels are pixels whose gray values fall outside [gray-15, gray+15];
if diff / (valid1 + valid2) ≥ 0.3, the subtitle recognition areas of the two adjacent frames are not similar.
On the basis of the above technical solution, OCR is performed on the subtitle recognition area of the first frame of each group to obtain the subtitle, specifically:
vertically stitching the subtitle recognition areas of the first frames of all groups in chronological order to form a stitched picture, and drawing, above each subtitle recognition area in the stitched picture, the timestamps of the first and last frames of its group;
performing OCR on the stitched picture, and merging the recognized text in chronological order to form a text;
and parsing the text to obtain all subtitles and the start timestamp of each subtitle.
The invention also provides a video subtitle extraction system, comprising:
the selection module is used for selecting a specific area in a video picture as a subtitle identification area and selecting the color of a subtitle in the video picture;
the judging module is used for cutting each frame of picture of the video based on the determined caption identification area, identifying the caption identification area of each frame of picture based on an image identification algorithm so as to judge whether the caption identification area of each frame of picture contains a caption or not and judge whether the caption identification areas of two adjacent frames of pictures are similar or not;
the classification module is used for classifying adjacent frames containing the same subtitles in the video into a group based on the judgment result and recording timestamps of head and tail frames in each group;
and the recognition module is used for performing OCR on the subtitle recognition area of the first frame of each group to obtain the subtitle, taking the timestamps of the first and last frames of the group as the start and end timestamps of the obtained subtitle, and generating the subtitle file.
On the basis of the above technical solution,
whether the subtitle recognition area of each frame contains a subtitle is judged using either a global judgment mode or a local judgment mode;
the global judgment mode comprises the following processes:
converting the caption identification area of the current frame picture into a gray image;
reading the gray image pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15], where gray is a preset gray value in the range 0-255;
based on the obtained count: if it is greater than 3 × h, the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where h is the height of the gray image;
the local judgment mode comprises the following processes:
cutting the subtitle recognition area of the current frame picture by using a preset cutting area to obtain a cut image;
converting the cropped image into a gray image and reading it pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15];
based on the obtained count: if it falls within [cw, cw × ch/2], the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where cw is the width and ch is the height of the cropped image.
On the basis of the above technical solution, the subtitle recognition area of the current frame is cropped using a preset cropping area, which is determined as follows:
horizontally segmenting the subtitle recognition area of the first frame of each group into several identical square unit areas and storing them in arrays, each array holding the number of valid pixels in the unit areas of the subtitle recognition area of one frame;
for each unit area in a single subtitle recognition area, judging its number of valid pixels: if the count of the current unit area falls within [h1, h1 × h/2], the weight value of the current unit area is the weight value of the previous unit area plus 1; otherwise its weight value equals that of the previous unit area, where valid pixels are pixels whose gray values fall within [gray-15, gray+15] and h1 is the side length of a unit area;
dividing all unit areas of the subtitle recognition area of the current frame into a left half and a right half, computing the sum of weight values of each half, and judging whether |left - right| / min{left, right} is greater than 0.1: if so, the subtitle of the current frame is left-aligned, otherwise it is center-aligned, where left is the sum of the weight values of the left half and right is the sum of the weight values of the right half;
for a frame with a left-aligned subtitle, finding the unit area with the largest weight value in the subtitle recognition area and the adjacent unit area after it, and merging these two unit areas into the preset cropping area; for a frame with a center-aligned subtitle, finding the unit area with the largest weight value and the adjacent unit areas before and after it, and merging these three unit areas into the preset cropping area.
On the basis of the above technical solution, whether the subtitle recognition areas of two adjacent frames are similar is judged as follows:
converting the subtitle recognition areas of the two adjacent frames into gray images to obtain two gray images;
reading both gray images pixel by pixel to count, in each, the pixels whose gray values fall within [gray-15, gray+15];
based on the obtained counts:
if the number of pixels whose gray values fall within [gray-15, gray+15] is 0 in the two gray images, the subtitle recognition areas of the two adjacent frames are not similar;
if diff / (valid1 + valid2) < 0.3, the subtitle recognition areas of the two adjacent frames are similar, where valid1 is the number of pixels whose gray values fall within [gray-15, gray+15] in one gray image, valid2 is that number in the other gray image, and diff is the number of positions at which the two gray images do not agree on whether the pixel is valid or invalid; valid pixels are pixels whose gray values fall within [gray-15, gray+15], and invalid pixels are pixels whose gray values fall outside [gray-15, gray+15];
if diff / (valid1 + valid2) ≥ 0.3, the subtitle recognition areas of the two adjacent frames are not similar.
On the basis of the above technical solution, OCR is performed on the subtitle recognition area of the first frame of each group to obtain the subtitle, specifically:
vertically stitching the subtitle recognition areas of the first frames of all groups in chronological order to form a stitched picture, and drawing, above each subtitle recognition area in the stitched picture, the timestamps of the first and last frames of its group;
performing OCR on the stitched picture, and merging the recognized text in chronological order to form a text;
and parsing the text to obtain all subtitles and the start timestamp of each subtitle.
Compared with the prior art, the invention has the following advantages: a specific area of the video picture is selected as the subtitle recognition area, which reduces the area to be recognized and thus effectively reduces the subtitle extraction time; manual intervention is minimal, since only the subtitle recognition area and the subtitle color need to be selected manually; and the timestamps of the frames containing subtitles are recorded during extraction, so the generated subtitle timeline enters and exits on the same frames as the original video subtitles.
Drawings
Fig. 1 is a flowchart of a method for extracting video subtitles according to an embodiment of the present invention.
Detailed Description
An embodiment of the invention provides a method for extracting video subtitles that reduces the recognition range by selecting a specific area of the video picture as the subtitle recognition area, thereby effectively improving the subtitle extraction speed. An embodiment of the invention correspondingly provides a video subtitle extraction system. The invention is described in further detail below with reference to the accompanying drawing and embodiments.
Referring to fig. 1, a method for extracting video subtitles according to an embodiment of the invention includes the following steps:
S1: selecting a specific area of the video picture as the subtitle recognition area, and selecting the color of the subtitle in the video picture. In this embodiment, both the specific area and the subtitle color can be selected manually. The area where subtitles appear in a video is generally fixed, and subtitles always appear in that fixed area as playback progresses, so selecting a subtitle recognition area effectively reduces the picture range that must be recognized.
S2: cropping each frame of the video to the determined subtitle recognition area, and using an image recognition algorithm to judge whether the subtitle recognition area of each frame contains a subtitle and whether the subtitle recognition areas of two adjacent frames are similar.
In this embodiment, whether the subtitle recognition area of each frame contains a subtitle is judged using either a global judgment mode or a local judgment mode.
The global judgment mode comprises the following steps:
S201: converting the subtitle recognition area of the current frame into a gray image;
S202: reading the gray image pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15], where gray is a preset gray value in the range 0-255;
S203: based on the obtained count: if it is greater than 3 × h, the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where h is the height of the gray image and × denotes multiplication. A code sketch of this check is given below.
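The following is a minimal sketch of the global judgment mode, assuming the frames are handled with OpenCV and NumPy; the function name and the tolerance parameter are illustrative assumptions, not part of the patent.

```python
import cv2
import numpy as np

def contains_subtitle_global(region_bgr, gray_value, tolerance=15):
    """Global judgment: count pixels near the preset subtitle gray value."""
    gray_img = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    h = gray_img.shape[0]
    lo, hi = gray_value - tolerance, gray_value + tolerance
    # Number of pixels whose gray value falls within [gray-15, gray+15].
    count = int(np.count_nonzero((gray_img >= lo) & (gray_img <= hi)))
    return count > 3 * h
```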
The local judgment mode comprises the following steps:
S211: cropping the subtitle recognition area of the current frame with a preset cropping area to obtain a cropped image;
S212: converting the cropped image into a gray image and reading it pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15];
S213: based on the obtained count: if it falls within [cw, cw × ch/2], the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where cw is the width and ch is the height of the cropped image. A corresponding sketch follows.
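A minimal sketch of the local judgment mode, under the same OpenCV/NumPy assumptions as above; representing the preset cropping area as an (x, y, w, h) tuple is an assumption made here for illustration.

```python
import cv2
import numpy as np

def contains_subtitle_local(region_bgr, crop_box, gray_value, tolerance=15):
    """Local judgment: sample valid pixels only inside the preset cropping area."""
    x, y, w, h = crop_box                      # assumed (x, y, w, h) rectangle
    crop = region_bgr[y:y + h, x:x + w]
    gray_img = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    ch, cw = gray_img.shape                    # ch: crop height, cw: crop width
    lo, hi = gray_value - tolerance, gray_value + tolerance
    count = int(np.count_nonzero((gray_img >= lo) & (gray_img <= hi)))
    return cw <= count <= cw * ch / 2
```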
In this embodiment, the preset cropping area used to crop the subtitle recognition area of the current frame is determined as follows:
A: horizontally segmenting the subtitle recognition area of the first frame of each group into several identical square unit areas and storing them in arrays, each array holding the number of valid pixels in the unit areas of the subtitle recognition area of one frame. The side length of each unit area equals the height of the subtitle recognition area.
B: for each unit area in a single subtitle recognition area, judging its number of valid pixels: if the count of the current unit area falls within [h1, h1 × h/2], the weight value of the current unit area is the weight value of the previous unit area plus 1; otherwise its weight value equals that of the previous unit area, where valid pixels are pixels whose gray values fall within [gray-15, gray+15] and h1 is the side length of a unit area.
For example, suppose a single subtitle recognition area is horizontally segmented into 4 unit areas a, b, c and d in order. If the number of valid pixels in unit area a falls within [h1, h1 × h/2], its weight value is 1; if the count in unit area b also falls within [h1, h1 × h/2], its weight value is 2; if the count in unit area c does not, its weight value stays 2; and if the count in unit area d falls within [h1, h1 × h/2], its weight value is 3.
C: dividing all unit areas of the subtitle recognition area of the current frame into a left half and a right half, computing the sum of weight values of each half, and judging whether |left - right| / min{left, right} is greater than 0.1: if so, the subtitle of the current frame is left-aligned, otherwise it is center-aligned, where left is the sum of the weight values of the left half and right is the sum of the weight values of the right half.
For example, if a frame's subtitle recognition area contains 4 unit areas a, b, c and d in order, then after the division the left half contains unit areas a and b and the right half contains unit areas c and d; left is the sum of the weight values of a and b, and right is the sum of the weight values of c and d.
D: for a frame with a left-aligned subtitle, finding the unit area with the largest weight value in the subtitle recognition area and the adjacent unit area after it, and merging these two unit areas into the preset cropping area; for a frame with a center-aligned subtitle, finding the unit area with the largest weight value and the adjacent unit areas before and after it, and merging these three unit areas into the preset cropping area.
For example, if a frame's subtitle recognition area contains 4 unit areas a, b, c and d in order and unit area c has the largest weight value, then for a left-aligned subtitle the preset cropping area is the merger of unit areas c and d, and for a center-aligned subtitle it is the merger of unit areas b, c and d.
The preset cropping area is determined so that whether a frame contains text can be judged more accurately: whether the subtitle consists of one character or many, the text falls inside the preset cropping area, so valid-pixel sampling is needed only within the preset cropping area rather than over the whole subtitle recognition area, which effectively reduces the influence of background noise on the sampling. A sketch of this determination is given below.
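A minimal sketch of steps A-D, assuming the unit-area side length equals the region height as stated above; the function name and the returned (x, y, w, h) rectangle are illustrative assumptions.

```python
import cv2
import numpy as np

def preset_crop_area(region_bgr, gray_value, tolerance=15):
    """Determine the preset cropping area from unit-area weights (steps A-D)."""
    gray_img = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    h, w = gray_img.shape
    h1 = h                                   # unit side length equals region height
    lo, hi = gray_value - tolerance, gray_value + tolerance
    n_units = w // h1

    weights, weight = [], 0
    for i in range(n_units):                 # step B: accumulate weights left to right
        unit = gray_img[:, i * h1:(i + 1) * h1]
        valid = int(np.count_nonzero((unit >= lo) & (unit <= hi)))
        if h1 <= valid <= h1 * h / 2:
            weight += 1
        weights.append(weight)

    left = sum(weights[:n_units // 2])       # step C: decide alignment
    right = sum(weights[n_units // 2:])
    left_aligned = abs(left - right) / max(min(left, right), 1) > 0.1

    k = int(np.argmax(weights))              # step D: merge the heaviest unit
    first = k if left_aligned else max(k - 1, 0)   # with its neighbour(s)
    last = min(k + 1, n_units - 1)
    return (first * h1, 0, (last - first + 1) * h1, h)
```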
In this embodiment, whether the subtitle recognition areas of two adjacent frames are similar is judged as follows:
S231: converting the subtitle recognition areas of the two adjacent frames into gray images to obtain two gray images;
S232: reading both gray images pixel by pixel to count, in each, the pixels whose gray values fall within [gray-15, gray+15];
S233: based on the obtained counts:
if the number of pixels whose gray values fall within [gray-15, gray+15] is 0 in the two gray images, the subtitle recognition areas of the two adjacent frames are not similar;
if diff / (valid1 + valid2) < 0.3, the subtitle recognition areas of the two adjacent frames are similar, where valid1 is the number of pixels whose gray values fall within [gray-15, gray+15] in one gray image, valid2 is that number in the other gray image, and diff is the number of positions at which the two gray images do not agree on whether the pixel is valid or invalid; valid pixels are pixels whose gray values fall within [gray-15, gray+15], and invalid pixels are pixels whose gray values fall outside [gray-15, gray+15];
if diff / (valid1 + valid2) ≥ 0.3, the subtitle recognition areas of the two adjacent frames are not similar. A sketch of this comparison follows.
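A minimal sketch of the similarity judgment under the same assumptions as the earlier sketches; the function name is illustrative.

```python
import cv2
import numpy as np

def regions_similar(region_a, region_b, gray_value, tolerance=15):
    """Compare the subtitle recognition areas of two adjacent frames."""
    lo, hi = gray_value - tolerance, gray_value + tolerance
    a = cv2.cvtColor(region_a, cv2.COLOR_BGR2GRAY)
    b = cv2.cvtColor(region_b, cv2.COLOR_BGR2GRAY)
    mask_a = (a >= lo) & (a <= hi)                  # valid pixels, frame A
    mask_b = (b >= lo) & (b <= hi)                  # valid pixels, frame B
    valid1, valid2 = int(mask_a.sum()), int(mask_b.sum())
    if valid1 + valid2 == 0:                        # no subtitle-coloured pixels at all
        return False
    diff = int(np.count_nonzero(mask_a != mask_b))  # positions that disagree
    return diff / (valid1 + valid2) < 0.3
```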
S3: based on the judgment results, grouping adjacent frames of the video that contain the same subtitle, and recording the timestamps of the first and last frames of each group.
In this embodiment, adjacent frames containing the same subtitle are grouped and the timestamps of the first and last frames of each group are recorded as follows: the subtitle recognition area of each frame is examined in turn; if the current subtitle recognition area contains a subtitle, it is recorded together with the timestamp of the current frame, and the subtitle recognition area of the next frame is then examined to judge whether it contains text and whether it is similar to the subtitle recognition area of the previous frame:
if it contains text and is similar, the subtitle recognition area of the next frame is examined; if it contains text and is not similar, the subtitle recognition area of the current frame and the timestamp of the current frame are recorded; if it contains no text, the timestamp of the current frame is recorded; and so on, until all adjacent frames containing the same subtitle are grouped. Frames whose subtitle recognition areas contain the same subtitle contain the same text. A sketch of this grouping loop is given below.
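A minimal sketch of the grouping loop, reusing the contains_subtitle_global and regions_similar helpers sketched above; the (timestamp, region) input format is an assumption made for illustration.

```python
def group_frames(frames, gray_value):
    """Group consecutive frames that show the same subtitle.

    `frames` is assumed to yield (timestamp_ms, subtitle_region) pairs already
    cropped to the subtitle recognition area.
    """
    groups = []            # each entry: (start_ts, end_ts, first_region)
    current = None         # [start_ts, last_ts, first_region] of the open group
    prev_region = None
    for ts, region in frames:
        has_text = contains_subtitle_global(region, gray_value)
        if (has_text and current is not None
                and regions_similar(prev_region, region, gray_value)):
            current[1] = ts                       # same subtitle continues
        else:
            if current is not None:
                groups.append(tuple(current))     # close the previous group
            current = [ts, ts, region] if has_text else None
        prev_region = region
    if current is not None:
        groups.append(tuple(current))
    return groups
```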
S4: performing OCR (optical character recognition) on the subtitle recognition area of the first frame of each group to obtain the subtitle, taking the timestamps of the first and last frames of the group as the start and end timestamps of the obtained subtitle, and generating a subtitle file.
Performing OCR on the subtitle recognition area of the first frame of each group to obtain the subtitle specifically comprises the following steps:
S401: vertically stitching the subtitle recognition areas of the first frames of all groups in chronological order to form a stitched picture, and drawing, above each subtitle recognition area in the stitched picture, the timestamps of the first and last frames of its group;
S402: performing OCR on the stitched picture, and merging the recognized text in chronological order to form a text;
S403: parsing the text to obtain all subtitles and the start timestamp of each subtitle, and outputting a subtitle file in SRT format. A sketch of the SRT output is given below.
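A minimal sketch of turning the groups into an SRT file; for brevity it runs an OCR callback per group rather than stitching one large picture as described above, and the helper names are illustrative assumptions.

```python
def srt_timestamp(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    s, ms = divmod(int(ms), 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def groups_to_srt(groups, ocr):
    """Build SRT text from group_frames output; `ocr` maps an image to its text."""
    entries, idx = [], 0
    for start_ms, end_ms, region in groups:
        text = ocr(region).strip()
        if not text:
            continue
        idx += 1
        entries.append(f"{idx}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n")
    return "\n".join(entries)
```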
In the method for extracting video subtitles described above, a specific area of the video picture is selected as the subtitle recognition area, which reduces the area to be recognized and thus effectively reduces the subtitle extraction time; manual intervention is minimal, since only the subtitle recognition area and the subtitle color need to be selected manually; and the timestamps of the frames containing subtitles are recorded during extraction, so the generated subtitle timeline enters and exits on the same frames as the original video subtitles.
The video subtitle extraction system provided by the embodiment of the invention comprises a selection module, a judgment module, a classification module and an identification module.
The selection module is used to select a specific area of the video picture as the subtitle recognition area and to select the color of the subtitle in the video picture; the judgment module is used to crop each frame of the video to the determined subtitle recognition area and, using an image recognition algorithm, to judge whether the subtitle recognition area of each frame contains a subtitle and whether the subtitle recognition areas of two adjacent frames are similar; the classification module is used to group adjacent frames of the video that contain the same subtitle based on the judgment results and to record the timestamps of the first and last frames of each group; and the recognition module is used to perform OCR on the subtitle recognition area of the first frame of each group to obtain the subtitle, to take the timestamps of the first and last frames of the group as the start and end timestamps of the obtained subtitle, and to generate the subtitle file.
In this embodiment, whether the subtitle recognition area of each frame contains a subtitle is judged using either a global judgment mode or a local judgment mode;
the global judgment mode comprises the following processes:
converting the caption identification area of the current frame picture into a gray image;
reading the gray image pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15], where gray is a preset gray value in the range 0-255;
based on the obtained count: if it is greater than 3 × h, the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where h is the height of the gray image;
the local judgment mode comprises the following processes:
cutting the subtitle recognition area of the current frame picture by using a preset cutting area to obtain a cut image;
converting the cropped image into a gray image and reading it pixel by pixel to count the pixels whose gray values fall within [gray-15, gray+15];
based on the obtained count: if it falls within [cw, cw × ch/2], the subtitle recognition area of the current frame contains a subtitle; otherwise it does not, where cw is the width and ch is the height of the cropped image.
In this embodiment, the preset cropping area used to crop the subtitle recognition area of the current frame is determined as follows:
horizontally segmenting the subtitle recognition area of the first frame of each group into several identical square unit areas and storing them in arrays, each array holding the number of valid pixels in the unit areas of the subtitle recognition area of one frame;
for each unit area in a single subtitle recognition area, judging its number of valid pixels: if the count of the current unit area falls within [h1, h1 × h/2], the weight value of the current unit area is the weight value of the previous unit area plus 1; otherwise its weight value equals that of the previous unit area, where valid pixels are pixels whose gray values fall within [gray-15, gray+15] and h1 is the side length of a unit area;
dividing all unit areas of the subtitle recognition area of the current frame into a left half and a right half, computing the sum of weight values of each half, and judging whether |left - right| / min{left, right} is greater than 0.1: if so, the subtitle of the current frame is left-aligned, otherwise it is center-aligned, where left is the sum of the weight values of the left half and right is the sum of the weight values of the right half;
for a frame with a left-aligned subtitle, finding the unit area with the largest weight value in the subtitle recognition area and the adjacent unit area after it, and merging these two unit areas into the preset cropping area; for a frame with a center-aligned subtitle, finding the unit area with the largest weight value and the adjacent unit areas before and after it, and merging these three unit areas into the preset cropping area.
In this embodiment, whether the subtitle recognition areas of two adjacent frames are similar is judged as follows:
converting the subtitle recognition areas of the two adjacent frames into gray images to obtain two gray images;
reading both gray images pixel by pixel to count, in each, the pixels whose gray values fall within [gray-15, gray+15];
based on the obtained counts:
if the number of pixels whose gray values fall within [gray-15, gray+15] is 0 in the two gray images, the subtitle recognition areas of the two adjacent frames are not similar;
if diff / (valid1 + valid2) < 0.3, the subtitle recognition areas of the two adjacent frames are similar, where valid1 is the number of pixels whose gray values fall within [gray-15, gray+15] in one gray image, valid2 is that number in the other gray image, and diff is the number of positions at which the two gray images do not agree on whether the pixel is valid or invalid; valid pixels are pixels whose gray values fall within [gray-15, gray+15], and invalid pixels are pixels whose gray values fall outside [gray-15, gray+15];
if diff / (valid1 + valid2) ≥ 0.3, the subtitle recognition areas of the two adjacent frames are not similar.
In this embodiment, OCR is performed on the subtitle recognition area of the first frame of each group to obtain the subtitle, specifically:
vertically stitching the subtitle recognition areas of the first frames of all groups in chronological order to form a stitched picture, and drawing, above each subtitle recognition area in the stitched picture, the timestamps of the first and last frames of its group;
performing OCR on the stitched picture, and merging the recognized text in chronological order to form a text;
and parsing the text to obtain all subtitles and the start timestamp of each subtitle.
The present invention is not limited to the above-described embodiments, and it will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements are also considered to be within the scope of the present invention. Those not described in detail in this specification are within the skill of the art.

Claims (10)

1. A method for extracting video subtitles is characterized by comprising the following steps:
selecting a specific area in a video picture as a subtitle identification area, and selecting the color of a subtitle in the video picture;
based on the determined caption identification area, cutting each frame of picture of the video, and based on an image identification algorithm, identifying the caption identification area of each frame of picture to judge whether the caption identification area of each frame of picture contains a caption or not and judge whether the caption identification areas of two adjacent frames of pictures are similar or not;
based on the judgment result, grouping adjacent frames containing the same caption into a group, and recording the time stamps of the head and tail frames in each group;
and performing OCR on the subtitle recognition area of the first frame picture in each group to obtain the subtitles, wherein the time stamps of the first frame and the last frame of the current group are the start time stamp and the end time stamp of the currently obtained subtitles, and generating a subtitle file.
2. The method for extracting a video subtitle as claimed in claim 1, wherein:
judging whether the caption identification area of each frame of picture contains the caption or not, wherein the judging mode comprises a global judging mode and a local judging mode;
the global judgment mode comprises the following steps:
converting the caption identification area of the current frame picture into a gray image;
reading the gray image pixel by pixel to obtain the number of pixels of which the gray values belong to [gray-15, gray+15] in the gray image, wherein gray is a preset gray value and the value range is 0-255;
based on the obtained number, if the obtained number is more than 3 × h, the caption identification area of the current frame picture contains the caption, otherwise, the caption identification area of the current frame picture does not contain the caption, wherein h is the height of the gray image;
the local judgment mode comprises the following steps:
cutting the subtitle recognition area of the current frame picture by using a preset cutting area to obtain a cut image;
converting the cut image into a gray image, and reading the gray image pixel by pixel to obtain the number of pixels of which the gray values belong to [gray-15, gray+15] in the gray image;
and based on the obtained number, if the obtained number belongs to [cw, cw × ch/2], indicating that the subtitle identification region of the current frame picture contains the subtitle, and otherwise, indicating that the subtitle identification region of the current frame picture does not contain the subtitle, wherein cw represents the width of the cropped image, and ch represents the height of the cropped image.
3. The method for extracting video subtitles of claim 2, wherein the step of cropping the subtitle recognition area of the current frame picture by using a preset cropping area comprises:
transversely segmenting the caption identification area of the first frame of picture in each group to obtain a plurality of unit areas which are identical in shape and are square, storing the unit areas by using arrays, and storing the number of effective pixel points in the unit area of the caption identification area of one frame of picture by each array;
judging the number of effective pixels of each unit region in a single subtitle identification region, if the number of effective pixels of the current unit region meets [h1, h1 × h/2], adding 1 to the weight value of the current unit region compared with the weight value of the previous unit region, if the number of effective pixels of the current unit region does not meet [h1, h1 × h/2], keeping the weight value of the current unit region consistent with the weight value of the previous unit region, wherein the effective pixels refer to pixels of which the gray values belong to [gray-15, gray+15], and h1 is the side length of the unit region;
dividing all unit areas of a caption identification area of a current frame picture into a left part and a right part, calculating the sum of weights of each part of unit areas, and then judging whether |left - right| / min{left, right} is greater than 0.1, if so, the current frame picture is a left-aligned caption, otherwise, the current frame picture is a center-aligned caption, wherein left represents the sum of weight values of the left part of unit areas, and right represents the sum of weight values of the right part of unit areas;
for the frame picture of the left-aligned caption, finding out a unit area with the maximum weight value in the single caption identification area and a next unit area adjacent to the unit area, and combining the two found unit areas to obtain an area which is a preset clipping area; and for the frame picture of the centered aligned caption, finding out a unit area with the maximum weight value in the single caption identification area and a front unit area and a rear unit area which are adjacent to the unit area, and combining the three found unit areas to obtain an area which is a preset clipping area.
4. The method for extracting a video subtitle as claimed in claim 1, wherein: the method for judging whether the caption identification areas of the two adjacent frames of pictures are similar comprises the following specific judgment process:
converting the caption identification areas of two adjacent frames of pictures into gray level images to obtain two gray level images;
reading the two gray images pixel by pixel to obtain the number of pixels with gray values belonging to [gray-15, gray+15] in the two gray images;
based on the obtained numbers,
if the number of pixels with gray values belonging to [gray-15, gray+15] in the two gray images is 0, the caption identification areas of the two adjacent frames of pictures are not similar;
if diff / (valid1 + valid2) < 0.3, the subtitle recognition regions of the two adjacent frames of pictures are similar, wherein valid1 represents the number of pixels with gray values belonging to [gray-15, gray+15] in one gray image, valid2 represents the number of pixels with gray values belonging to [gray-15, gray+15] in the other gray image, diff represents the number of positions at which the pixels of the two gray images are not both valid pixels or both invalid pixels, the valid pixels refer to pixels with gray values belonging to [gray-15, gray+15], and the invalid pixels refer to pixels with gray values not belonging to [gray-15, gray+15];
if diff / (valid1 + valid2) ≥ 0.3, the subtitle recognition areas of the current two adjacent frames are dissimilar.
5. The method for extracting video subtitles according to claim 1, wherein the OCR is performed on the subtitle recognition area of the first frame of picture in each group to obtain subtitles, and the specific steps include:
longitudinally splicing the subtitle identification areas of the first frame of picture in each group according to the time sequence to form a spliced picture, and drawing a timestamp of the first frame and the last frame of the group where the subtitle identification areas are located above each subtitle identification area in the spliced picture;
OCR is carried out on the spliced pictures, and the obtained character contents are combined according to the time sequence to form a text;
and analyzing the text to obtain all the subtitles and the start time stamp of each subtitle.
6. A video subtitle extraction system, comprising:
the selection module is used for selecting a specific area in a video picture as a subtitle identification area and selecting the color of a subtitle in the video picture;
the judging module is used for cutting each frame of picture of the video based on the determined caption identification area, identifying the caption identification area of each frame of picture based on an image identification algorithm so as to judge whether the caption identification area of each frame of picture contains a caption or not and judge whether the caption identification areas of two adjacent frames of pictures are similar or not;
the classification module is used for classifying adjacent frames containing the same subtitles in the video into a group based on the judgment result and recording timestamps of head and tail frames in each group;
and the recognition module is used for performing OCR on the subtitle recognition area of the first frame picture in each group to obtain the subtitles, and the time stamps of the first frame and the last frame of the current group are the starting time stamp and the ending time stamp of the currently obtained subtitles to generate the subtitle file.
7. The system for extracting a video subtitle of claim 6, wherein:
judging whether the caption identification area of each frame of picture contains the caption or not, wherein the judging mode comprises a global judging mode and a local judging mode;
the global judgment mode comprises the following processes:
converting the caption identification area of the current frame picture into a gray image;
reading the gray image pixel by pixel to obtain the number of pixels of which the gray values belong to [gray-15, gray+15] in the gray image, wherein gray is a preset gray value and the value range is 0-255;
based on the obtained number, if the obtained number is more than 3 × h, the caption identification area of the current frame picture contains the caption, otherwise, the caption identification area of the current frame picture does not contain the caption, wherein h is the height of the gray image;
the local judgment mode comprises the following processes:
cutting the subtitle recognition area of the current frame picture by using a preset cutting area to obtain a cut image;
converting the cut image into a gray image, and reading the gray image pixel by pixel to obtain the number of pixels of which the gray values belong to [gray-15, gray+15] in the gray image;
and based on the obtained number, if the obtained number belongs to [cw, cw × ch/2], indicating that the subtitle identification region of the current frame picture contains the subtitle, and otherwise, indicating that the subtitle identification region of the current frame picture does not contain the subtitle, wherein cw represents the width of the cropped image, and ch represents the height of the cropped image.
8. The system for extracting video subtitles of claim 7, wherein the clipping is performed on the subtitle recognition area of the current frame picture using a preset clipping area, wherein the determining of the preset clipping area comprises:
transversely segmenting the caption identification area of the first frame of picture in each group to obtain a plurality of unit areas which are identical in shape and are square, storing the unit areas by using arrays, and storing the number of effective pixel points in the unit area of the caption identification area of one frame of picture by each array;
judging the number of effective pixels of each unit region in a single subtitle identification region, if the number of effective pixels of the current unit region meets [h1, h1 × h/2], adding 1 to the weight value of the current unit region compared with the weight value of the previous unit region, if the number of effective pixels of the current unit region does not meet [h1, h1 × h/2], keeping the weight value of the current unit region consistent with the weight value of the previous unit region, wherein the effective pixels refer to pixels of which the gray values belong to [gray-15, gray+15], and h1 is the side length of the unit region;
dividing all unit areas of a caption identification area of a current frame picture into a left part and a right part, calculating the sum of weights of each part of unit areas, and then judging whether |left - right| / min{left, right} is greater than 0.1, if so, the current frame picture is a left-aligned caption, otherwise, the current frame picture is a center-aligned caption, wherein left represents the sum of weight values of the left part of unit areas, and right represents the sum of weight values of the right part of unit areas;
for the frame picture of the left-aligned caption, finding out a unit area with the maximum weight value in the single caption identification area and a next unit area adjacent to the unit area, and combining the two found unit areas to obtain an area which is a preset clipping area; and for the frame picture of the centered aligned caption, finding out a unit area with the maximum weight value in the single caption identification area and a front unit area and a rear unit area which are adjacent to the unit area, and combining the three found unit areas to obtain an area which is a preset clipping area.
9. The system for extracting video subtitles of claim 6, wherein the process of judging whether the subtitle recognition areas of two adjacent frame pictures are similar is as follows:
converting the subtitle recognition areas of the two adjacent frame pictures into gray level images to obtain two gray level images;
reading the two gray level images pixel by pixel to obtain, for each image, the number of pixels whose gray values fall within [gray-15, gray+15];
based on the obtained numbers:
if the number of pixels whose gray values fall within [gray-15, gray+15] is 0 in both gray level images, the subtitle recognition areas of the two adjacent frame pictures are not similar;
if diff/(valid1 + valid2) < 0.3, the subtitle recognition areas of the two adjacent frame pictures are similar, wherein valid1 denotes the number of pixels whose gray values fall within [gray-15, gray+15] in one gray level image, valid2 denotes that number in the other gray level image, and diff denotes the number of positions at which the pixels of the two gray level images are not both valid pixels or both invalid pixels, a valid pixel being a pixel whose gray value falls within [gray-15, gray+15] and an invalid pixel being a pixel whose gray value does not;
if diff/(valid1 + valid2) ≥ 0.3, the subtitle recognition areas of the two adjacent frame pictures are not similar.
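A minimal sketch of this similarity test, assuming the two subtitle recognition areas are grayscale arrays of identical size; the function name and the NumPy-based implementation are illustrative only.

```python
import numpy as np

def regions_similar(region_a, region_b, gray, tol=15, threshold=0.3):
    """Return True when two grayscale subtitle recognition areas are judged similar."""
    valid_a = (region_a >= gray - tol) & (region_a <= gray + tol)
    valid_b = (region_b >= gray - tol) & (region_b <= gray + tol)
    valid1, valid2 = int(valid_a.sum()), int(valid_b.sum())
    if valid1 == 0 and valid2 == 0:
        return False                          # neither area contains subtitle-colored pixels
    diff = int((valid_a != valid_b).sum())    # positions not both valid or both invalid
    return diff / (valid1 + valid2) < threshold
```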
10. The system for extracting video subtitles of claim 6, wherein performing OCR on the subtitle recognition area of the first frame picture in each group to obtain the subtitles specifically comprises:
vertically splicing the subtitle recognition areas of the first frame picture of each group in chronological order to form a spliced picture, and drawing, above each subtitle recognition area in the spliced picture, the timestamps of the first and last frames of the group to which that area belongs;
performing OCR on the spliced picture, and combining the recognized text contents in chronological order to form a text;
parsing the text to obtain all the subtitles and the start timestamp of each subtitle.
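A rough illustration of the splicing-and-OCR step; the claims do not name an OCR engine, so `pytesseract` (with OpenCV and NumPy) is used here purely as an example, and the input format of `groups` is an assumption.

```python
import cv2
import numpy as np
import pytesseract

def ocr_spliced_regions(groups):
    """groups: list of (region_bgr, start_ts, end_ts) for the first frame of each
    group, already in chronological order and of identical width."""
    strips = []
    for region, start_ts, end_ts in groups:
        # draw the group's first/last-frame timestamps on a white banner above the region
        banner = np.full((30, region.shape[1], 3), 255, dtype=np.uint8)
        cv2.putText(banner, f"{start_ts} --> {end_ts}", (5, 22),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 0), 1)
        strips.append(np.vstack([banner, region]))
    spliced = np.vstack(strips)                      # one tall picture, top to bottom in time order
    # language packs are assumed to be installed; adjust to the subtitle language
    text = pytesseract.image_to_string(spliced, lang="chi_sim+eng")
    return text  # parse line by line: a timestamp banner followed by its subtitle text
```

Splicing all first-frame regions into one tall picture means the OCR engine is invoked once per video rather than once per subtitle group, which is where the claimed saving in extraction time comes from.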
CN202010356689.7A 2020-04-29 2020-04-29 Video subtitle extraction method and system Active CN111539427B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010356689.7A CN111539427B (en) 2020-04-29 2020-04-29 Video subtitle extraction method and system

Publications (2)

Publication Number Publication Date
CN111539427A true CN111539427A (en) 2020-08-14
CN111539427B CN111539427B (en) 2023-07-21

Family

ID=71967604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010356689.7A Active CN111539427B (en) 2020-04-29 2020-04-29 Video subtitle extraction method and system

Country Status (1)

Country Link
CN (1) CN111539427B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110044662A1 (en) * 2002-11-15 2011-02-24 Thomson Licensing S.A. Method and apparatus for composition of subtitles
CN102332096A (en) * 2011-10-17 2012-01-25 中国科学院自动化研究所 Video caption text extraction and identification method
US20190114486A1 (en) * 2016-08-08 2019-04-18 Tencent Technology (Shenzhen) Company Limited Subtitle extraction method and device, storage medium
CN109729420A (en) * 2017-10-27 2019-05-07 腾讯科技(深圳)有限公司 Image processing method and device, mobile terminal and computer readable storage medium
CN110210299A (en) * 2019-04-26 2019-09-06 平安科技(深圳)有限公司 Voice training data creation method, device, equipment and readable storage medium storing program for executing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
RAINER LIENHART et al.: "Automatic Text Segmentation and Text Recognition for Video Indexing" *
王智慧 et al.: "Two-Stage Video Subtitle Detection and Extraction Algorithm" *
赵义武: "Video Subtitle Localization and Subtitle Tracking Method Based on Edge Features" *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112218142A (en) * 2020-08-27 2021-01-12 厦门快商通科技股份有限公司 Method and device for separating voice from video with subtitles, storage medium and electronic equipment
CN113435438A (en) * 2021-06-28 2021-09-24 中国兵器装备集团自动化研究所有限公司 Video screen board extraction and video segmentation method for image and subtitle fusion
CN113343986A (en) * 2021-06-29 2021-09-03 北京奇艺世纪科技有限公司 Subtitle time interval determining method and device, electronic equipment and readable storage medium
CN113343986B (en) * 2021-06-29 2023-08-25 北京奇艺世纪科技有限公司 Subtitle time interval determining method and device, electronic equipment and readable storage medium
CN116886996A (en) * 2023-09-06 2023-10-13 浙江富控创联技术有限公司 Digital village multimedia display screen broadcasting system
CN116886996B (en) * 2023-09-06 2023-12-01 浙江富控创联技术有限公司 Digital village multimedia display screen broadcasting system

Also Published As

Publication number Publication date
CN111539427B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN111539427B (en) Video subtitle extraction method and system
US6101274A (en) Method and apparatus for detecting and interpreting textual captions in digital video signals
KR100746641B1 (en) Image code based on moving picture, apparatus for generating/decoding image code based on moving picture and method therefor
EP3016025B1 (en) Image processing device, image processing method, poi information creation system, warning system, and guidance system
US6937766B1 (en) Method of indexing and searching images of text in video
US7403657B2 (en) Method and apparatus for character string search in image
US6704029B1 (en) Method and apparatus for specifying scene information in a moving picture
CN110287949B (en) Video clip extraction method, device, equipment and storage medium
US8401303B2 (en) Method and apparatus for identifying character areas in a document image
US20070253680A1 (en) Caption display control apparatus
EP3096264A1 (en) Object detection system, object detection method, poi information creation system, warning system, and guidance system
US20040017579A1 (en) Method and apparatus for enhancement of digital image quality
US8629918B2 (en) Image processing apparatus, image processing method and program
US8437542B2 (en) Image processing apparatus, method, and program
KR101276056B1 (en) Image processing apparatus and image processing method thereof
CN100593792C (en) Text tracking and multi-frame reinforcing method in video
CN105657514A (en) Method and apparatus for playing video key information on mobile device browser
EP1074926A2 (en) Method of and apparatus for retrieving text data from a video signal
JP4573957B2 (en) Image control apparatus, image control method, and television receiver
CN111428590B (en) Video clustering segmentation method and system
JP2009130899A (en) Image playback apparatus
JP3655110B2 (en) Video processing method and apparatus, and recording medium recording video processing procedure
Ghorpade et al. Extracting text from video
JPH11136637A (en) Representative image generating device
JP3435334B2 (en) Apparatus and method for extracting character area in video and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230614

Address after: 518000, 1603, Zone A, Huayi Building, No. 9 Pingji Avenue, Xialilang Community, Nanwan Street, Longgang District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Youyou Brand Communication Co.,Ltd.

Address before: 430000 2007, building B, Optics Valley New World t+ office building, No. 355, Guanshan Avenue, East Lake New Technology Development Zone, Wuhan, Hubei Province

Applicant before: Wuhan yimantianxia Technology Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231115

Address after: 430000 office 7, 20 / F, building B, office building, block a, Optics Valley New World Center, Donghu New Technology Development Zone, Wuhan, Hubei Province

Patentee after: Wuhan yimantianxia Technology Co.,Ltd.

Address before: 518000, 1603, Zone A, Huayi Building, No. 9 Pingji Avenue, Xialilang Community, Nanwan Street, Longgang District, Shenzhen City, Guangdong Province

Patentee before: Shenzhen Youyou Brand Communication Co.,Ltd.