CN116886996A - Digital village multimedia display screen broadcasting system

Digital village multimedia display screen broadcasting system

Info

Publication number
CN116886996A
Authority
CN
China
Prior art keywords
caption
feature
characteristic
frame
feature point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311140360.7A
Other languages
Chinese (zh)
Other versions
CN116886996B (en)
Inventor
袁凌 (Yuan Ling)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Fukong Chuanglian Technology Co ltd
Original Assignee
Zhejiang Fukong Chuanglian Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Fukong Chuanglian Technology Co ltd filed Critical Zhejiang Fukong Chuanglian Technology Co ltd
Priority to CN202311140360.7A priority Critical patent/CN116886996B/en
Publication of CN116886996A publication Critical patent/CN116886996A/en
Application granted granted Critical
Publication of CN116886996B publication Critical patent/CN116886996B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/61Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/611Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for multicast or broadcast
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80Responding to QoS
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Television Systems (AREA)

Abstract

The invention relates to the technical field of image communication and provides a digital village multimedia display screen broadcasting system, comprising: acquiring a video to be played on a multimedia display screen; extracting caption feature points from each frame's broadcast gray map according to the scrolling characteristics of the captions in the video to be played; acquiring the feature point aggregation degree of the caption feature points according to the matching degree of caption feature points at the same position in two adjacent broadcast gray maps; acquiring feature point weights of the caption feature points according to the caption feature points and their matching feature points in the adjacent frame; and acquiring a broadcast frame-insertion image between every two adjacent frames according to the feature point weights, inserting all broadcast frame-insertion images into the video to be played in time order, and transmitting the video to the multimedia display screen. The invention uses the scrolling behavior of captions to determine the different types of pixel points in the inserted frame based on the feature point weights of caption feature points at different positions, and solves the problem of text defects caused by the inaccurate optical flow estimation of the traditional LK optical flow frame interpolation algorithm.

Description

Digital village multimedia display screen broadcasting system
Technical Field
The invention relates to the technical field of image transmission, in particular to a digital village multimedia display screen broadcasting system.
Background
The digital village is a product of applying networking, informatization and digitalization to agricultural and rural economic and social development, and of the modernization of agriculture and rural areas driven by the improving information literacy of farmers. A multimedia display screen broadcasting system is an innovative communication system integrating digital technology and multimedia elements. It aims to provide rural areas with information dissemination, publicity and promotion, cultural inheritance and similar functions, and meets the information needs of rural residents by playing news, publicity films, cultural programs and the like, thereby promoting rural development and cultural exchange.
When video is played on a digital rural multimedia display, the network environment is relatively unstable and limited by transmission bandwidth and storage capacity; in many cases the acquired video has a low frame rate, which greatly harms the viewing experience of rural residents, so frame interpolation is usually required. Most videos played on a digital rural multimedia display screen are local content, publicity films and the like, and contain a large amount of text such as subtitles. When such video is interpolated with a traditional optical flow frame interpolation algorithm, inaccurate optical flow estimation easily produces incomplete text, and incomplete text in the video may prevent rural residents from obtaining true and complete information.
Disclosure of Invention
The invention provides a digital village multimedia display screen broadcasting system, aiming to solve the problem that text defects are caused by inaccurate optical flow estimation when a traditional optical flow frame interpolation algorithm is used, and adopts the following technical scheme:
one embodiment of the invention is a digital village multimedia display screen broadcasting system, which comprises the following modules:
the video data acquisition module acquires a video to be played on the multimedia display screen;
the subtitle feature point extraction module is used for acquiring a display edge image corresponding to each frame of broadcast gray image in the video to be played by utilizing an edge detection algorithm, and acquiring a stroke matching vector of each feature point in the display edge image by utilizing a feature point matching algorithm; taking feature points corresponding to stroke matching vectors with elements larger than a preset threshold as subtitle feature points;
the feature aggregation degree calculation module is used for obtaining the feature aggregation degree of each caption feature point according to the significance degree of elements in the stroke matching vectors of the feature points in different directions in the detection window taken by each caption feature point;
the feature weight calculation module is used for obtaining matching feature points of the caption feature points in the current frame according to stroke matching vectors and feature point aggregation degrees of the caption feature points in the current frame and the caption feature points in the same position in the next frame of the current frame; acquiring feature weights of caption feature points in the current frame and the next frame of the current frame according to the caption feature points in the current frame and the matched feature points;
the broadcast video frame inserting module acquires a characteristic gray scale image between two adjacent frame display edge images according to the characteristic weight in the two adjacent frame display edge images; acquiring a broadcast frame inserting image between two adjacent frames according to a characteristic gray level image between two adjacent frames; and inserting all the broadcasting frame inserting images into the video to be played according to the time sequence, and transmitting the video to the multimedia display screen.
Preferably, the method for obtaining the stroke matching vector of each feature point in the display edge map by using the feature point matching algorithm comprises the following steps:
and taking the edge map of each frame display and the edge map of the preset stroke as inputs of a characteristic point matching algorithm, and taking a vector formed by the matching degree of each characteristic point obtained by the characteristic point matching algorithm in the edge map of each frame display and the preset stroke as a stroke matching vector of each characteristic point.
Preferably, the method for obtaining the feature aggregation degree of each caption feature point according to the significance degree of the elements in the stroke matching vectors of the feature points in different directions in the detection window taken by each caption feature point comprises the following steps:
acquiring an aggregation index of each caption feature point according to the distribution characteristics of stroke matching degree vectors in different directions in the detection window taken by each caption feature point;
taking the sum of the aggregation indexes of all feature points in each direction within the detection window taken by each caption feature point as the numerator, taking the number of all feature points in the detection window as the denominator, and taking the ratio of the numerator to the denominator as the feature saliency of each caption feature point in that direction;
and taking the maximum value of the feature saliency in different directions in the detection window taken by each caption feature point as the feature aggregation degree of each caption feature point.
Preferably, the method for obtaining the aggregation index of each caption feature point according to the distribution characteristics of the stroke matching degree vectors in different directions in the detection window taken by each caption feature point comprises the following steps:
taking the maximum value of all elements in the stroke matching degree vector of each feature point in each direction within the detection window taken by each caption feature point as the target index of that feature point, taking the natural constant as the base and the negative of the target index as the exponent, and taking the resulting exponential as the salient value of each caption feature point;
and taking the inverse of the sum of the salient value of each subtitle characteristic point and the preset parameter as the aggregation index of each subtitle characteristic point.
Preferably, the method for obtaining the matching feature points of the caption feature points in the current frame according to the stroke matching vector and the feature point aggregation degree of the caption feature points in the current frame and the caption feature points in the same position in the next frame of the current frame comprises the following steps:
acquiring, in the display edge map of the next frame of the current frame, the feature window located at the same position as the caption feature point in the display edge map of the current frame;
acquiring the decision distance between the caption feature point in the current frame and each caption feature point in the feature window according to their differences;
and taking the caption characteristic point corresponding to the minimum decision distance value in the characteristic window as a matching characteristic point of the caption characteristic point in the current frame.
Preferably, the method for obtaining the decision distance of each caption feature point according to the difference between each caption feature point and the caption feature point in the feature window comprises the following steps:
taking the absolute value of the difference value between the caption characteristic points in the display edge map of the current frame and the characteristic aggregation degree of each caption characteristic point in the characteristic window as a first product factor;
taking the measurement distance between the caption characteristic points in the display edge map of the current frame and the stroke matching degree vector of each caption characteristic point in the characteristic window as a second product factor;
the decision distance of the caption feature points consists of a first product factor and a second product factor, wherein the decision distance is in direct proportion to the first product factor and the second product factor.
Preferably, the method for obtaining the feature weights of the caption feature points in the current frame and the next frame of the current frame according to the caption feature points in the current frame and the matched feature points thereof comprises the following steps:
taking a normalization result of the inverse of the decision distance between each caption feature point in the current frame display edge map and the matched feature point as the feature weight of each caption feature point in the current frame display edge map;
and taking the difference value between the preset parameter and the characteristic weight of each caption characteristic point in the current frame display edge map as the characteristic weight of the matched characteristic point of the caption characteristic point.
Preferably, the method for obtaining the feature gray scale map between the two adjacent frame display edge maps according to the feature weights in the two adjacent frame display edge maps comprises the following steps:
acquiring the frame interpolation gray values of the feature points in the frame interpolation images between the two adjacent frame display edge images according to the feature weights in the two adjacent frame display edge images;
and traversing all the characteristic points on the display edge graph of the current frame, and taking an image formed by using the interpolated frame gray values of all the characteristic points as a characteristic gray graph between the display edge graphs of two adjacent frames.
Preferably, the method for obtaining the interpolated gray value of the feature point in the interpolated image between the two adjacent frame display edge images according to the feature weights in the two adjacent frame display edge images includes:
and taking the characteristic weights in the adjacent two frames of display edge graphs and the gray values of the characteristic points in the adjacent display edge graphs as inputs of a bilinear interpolation algorithm, and acquiring the frame interpolation gray values of the characteristic points on the frame interpolation graph between the adjacent two frames by using the bilinear interpolation algorithm.
Preferably, the method for obtaining the broadcast frame inserting image between two adjacent frames according to the characteristic gray scale image between the display edge images of the two adjacent frames comprises the following steps:
acquiring the interpolated gray values of the non-feature points on the interpolated image between two adjacent broadcast gray maps by using an optical flow frame interpolation algorithm, traversing all the non-feature points on the current frame broadcast gray map, and taking the image formed by the interpolated gray values of all the non-feature points as the non-feature gray map between the two adjacent broadcast gray maps;
and taking the linear weighting result of the characteristic gray scale image and the non-characteristic gray scale image between the adjacent two frames of display edge images as a broadcast frame inserting image between the adjacent two frames of broadcast gray scale images.
The beneficial effects of the invention are as follows: the invention obtains the intermediate frame by processing the captions and the background of the video to be played on the multimedia display screen separately. For the caption part, a stroke matching degree vector is built from character features and reflects how well each feature point of a video frame matches a character stroke; a caption feature point aggregation degree is built by combining the motion and position characteristics of captions in video, comprehensively reflecting the character features in each frame. The weight of each caption feature point in two adjacent frames is built by analyzing the stroke matching degree vectors and aggregation degrees of the caption feature points in the two frames, and reflects the influence of the preceding and following frames on the intermediate frame; a weighted bilinear interpolation is then used to calculate the gray value of each caption feature point of the intermediate frame, while the gray values of the remaining pixel points are obtained by the traditional LK optical flow frame interpolation algorithm. The gray values of the intermediate-frame pixels are thus obtained from both caption feature points and non-caption points, which solves the text-defect problem caused by inaccurate optical flow estimation when the traditional LK optical flow frame interpolation algorithm is used to play video in a digital rural broadcasting system.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
Fig. 1 is a schematic flow chart of a digital rural multimedia display broadcasting system according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a detection window according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a flowchart of a digital rural multimedia display broadcasting system according to an embodiment of the present invention is shown, where the system includes: the system comprises a video data acquisition module, a subtitle characteristic point extraction module, a characteristic aggregation degree calculation module, a characteristic weight calculation module and a broadcast video frame inserting module.
The video data acquisition module acquires the videos to be played by the digital rural multimedia display screen broadcasting system from related video sources such as a digital rural media library, a community broadcasting platform and a rural news video library. Because the acquired videos may come from multiple sources, different videos may have different formats and resolutions, so the acquired video files need to be preprocessed: each video file is converted into a uniform MP4 format, and its resolution is adjusted to the resolution of the digital rural multimedia display screen. The preprocessed video is recorded as the video to be played. Video format conversion is a known technique, so the detailed process is not repeated.
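As a rough illustration of this preprocessing step, the sketch below transcodes a source file to MP4 at the screen resolution; the file names, the 1920×1080 target resolution and the use of ffmpeg with libx264 are assumptions made for the example, not details given in the patent.

```python
import subprocess

SCREEN_W, SCREEN_H = 1920, 1080  # hypothetical display-screen resolution

def preprocess(src_path: str, dst_path: str) -> None:
    # Transcode an arbitrary source video to MP4 at the screen resolution.
    subprocess.run(
        ["ffmpeg", "-y", "-i", src_path,
         "-vf", f"scale={SCREEN_W}:{SCREEN_H}",
         "-c:v", "libx264", dst_path],
        check=True,
    )

preprocess("source_clip.avi", "to_play.mp4")  # hypothetical file names
```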
So far, the video to be played on the digital rural multimedia display screen has been obtained, for the subsequent extraction of scrolling captions.
The subtitle feature point extraction module: when videos such as news, publicity films and cultural programs to be played on a rural multimedia display screen are interpolated with the LK optical flow frame interpolation algorithm, inaccurate optical flow estimation easily causes text defects. When text sits on an almost static background, the motion of the background biases the optical flow estimate at the text position, making it inaccurate; the text becomes defective, the quality of the synthesized intermediate frame drops, and the viewer experience suffers.
Firstly, each frame of the video to be played is grayed to obtain the broadcast gray map corresponding to each frame. Secondly, edge extraction is performed on each broadcast gray map using the Canny edge detection technique, and the edge detection result of the i-th broadcast gray map is recorded as the i-th display edge map E_i. Canny edge detection is a well-known technique, and the detailed process is not repeated.
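For concreteness, a minimal sketch of this graying and edge-extraction step using OpenCV is given below; the Canny thresholds (100, 200) are assumed values, since the patent does not specify them.

```python
import cv2

def display_edge_map(frame_bgr):
    # Gray the frame, then extract its Canny edge map.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)  # broadcast gray map
    edges = cv2.Canny(gray, 100, 200)                   # display edge map E_i
    return gray, edges
```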
Videos to be played such as news and publicity films usually contain both scrolling captions and fixed captions. A scrolling caption usually moves at a constant speed along one direction, i.e. the scrolling captions in two adjacent broadcast gray maps are strongly correlated. A fixed caption, for example a dialogue subtitle, generally disappears after the speaker finishes and changes to the next sentence to be expressed; that is, the caption stays unchanged for several seconds of the video and then switches to the next caption, following a periodic change-then-hold pattern. The caption mutation between two adjacent broadcast gray maps spans a short interval, generally completed within 2-5 frames, and two adjacent captions are generally different.
The invention uses the Canny edge detection technique to extract edge maps of the eight basic strokes of the character 永 (the "Eight Principles of Yong"): dot, horizontal, vertical, left-falling, right-falling, turning, rising and hook. The edge map of each stroke and each display edge map are used as inputs of the scale-invariant feature transform (SIFT) feature point matching algorithm; the SIFT algorithm yields a fixed-dimension feature vector for each feature point in the display edge map, and the Pearson correlation coefficient is then calculated between the fixed-dimension feature vector of each feature point in the display edge map and the fixed-dimension feature vector of each stroke edge map. The SIFT feature point matching algorithm is known in the prior art, so the specific process is not repeated.
Based on the obtained Pearson correlation coefficients, a stroke matching degree vector is constructed for each feature point in each display edge map. For the feature point at position (x, y) in the display edge map E_i, the stroke matching degree vector is recorded as:

V_(x,y) = (ρ_1, ρ_2, ..., ρ_8)

where ρ_k is the Pearson correlation coefficient between the SIFT feature vector of the feature point at (x, y) in E_i and the SIFT feature vector of the k-th stroke edge map. The Pearson correlation coefficient is a known technique, and the specific process is not repeated.
Following the above steps, the stroke matching degree vector of each feature point in each frame's display edge map can be calculated. Feature points whose stroke matching degree vector contains an element higher than the stroke matching degree threshold are taken as caption feature points; given the standardized rendering of Chinese characters in video subtitles, the threshold takes the empirical value 0.8.
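The sketch below illustrates how the stroke matching degree vectors and the 0.8 threshold could be realized with OpenCV's SIFT and NumPy. Representing each stroke template by the descriptor of its strongest keypoint is an assumption made for the example; the patent only states that fixed-dimension SIFT feature vectors are compared via the Pearson correlation coefficient.

```python
import cv2
import numpy as np

sift = cv2.SIFT_create()

def descriptors(edge_map):
    # 128-D SIFT descriptors of the feature points of an edge map.
    keypoints, desc = sift.detectAndCompute(edge_map, None)
    return keypoints, desc

def caption_feature_points(frame_edges, stroke_edge_maps, tau=0.8):
    kps, desc = descriptors(frame_edges)
    if desc is None:
        return [], []
    # One representative descriptor per stroke template; assumes each
    # template yields at least one keypoint.
    stroke_desc = [descriptors(tpl)[1][0] for tpl in stroke_edge_maps]
    pts, vecs = [], []
    for kp, f in zip(kps, desc):
        # Pearson correlation of the point's descriptor with each stroke.
        v = np.array([np.corrcoef(f, s)[0, 1] for s in stroke_desc])
        if v.max() > tau:  # empirical stroke matching threshold 0.8
            pts.append((int(round(kp.pt[0])), int(round(kp.pt[1]))))
            vecs.append(v)
    return pts, vecs
```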
So far, the caption feature points in each frame of the video can be extracted.
The feature aggregation degree calculation module avoids the matching errors caused by scene structures in the broadcast gray map whose edges resemble strokes. Text in videos such as news and publicity films played on a rural multimedia display screen is arranged horizontally or vertically, and the characters of a caption generally do not appear in isolation. Therefore, if a caption feature point P is actually a subtitle pixel, one or more feature points within its caption aggregation window will always have large stroke matching degree vectors; if P merely resembles a stroke because of a coincidental scene layout, only the point itself will have a large stroke matching degree, while the stroke matching degrees of the remaining feature points in the window stay low.
In the display edge map E_i, a detection window W of size m1 × m2 is constructed with the caption feature point (x, y) as the midpoint of the rectangle's lower edge, as shown in Fig. 2; m1 and m2 take the empirical values 21 and 61, respectively. The stroke matching degree vectors of the feature points lying in different directions within the detection window are collected, and a feature aggregation degree is built from the layout of the caption feature points in those directions so as to re-evaluate each caption feature point. The feature aggregation degree D_(x,y) of the caption feature point (x, y) in E_i is computed as:

γ_d(x_j, y_j) = 1 / ( e^( −max(V_(x_j,y_j)) ) + ε )

F_d = ( Σ_{j=1..N_d} γ_d(x_j, y_j) ) / N

D_(x,y) = max_{d∈Φ} F_d

where γ_d(x_j, y_j) is the aggregation index, in direction d, of the j-th feature point in the detection window W; (x_j, y_j) are the abscissa and ordinate of the j-th feature point in W; max(V_(x_j,y_j)) is the maximum value of all elements in the stroke matching degree vector of the feature point (x_j, y_j) in the display edge map E_i, taken as the target index; e^(·) is the exponential function with the natural constant e as base; and ε is a preset parameter that prevents division by zero;

F_d is the feature saliency of the caption feature point (x, y) in direction d, N_d is the number of feature points lying in direction d within the detection window W, and N is the number of all feature points in W;

D_(x,y) is the feature aggregation degree of the caption feature point (x, y), Φ is the set of directions taken within the detection window, and max(·) is the maximum function.

The greater the likelihood that other caption feature points exist in direction d of the caption feature point (x, y) in E_i, the larger the target index max(V_(x_j,y_j)) and thus the larger the aggregation index γ_d(x_j, y_j); the denser the subtitle strokes in direction d, the larger the feature saliency F_d, and the larger the feature aggregation degree D_(x,y).
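A small sketch of the feature aggregation degree follows, under stated assumptions: the window is partitioned into three coarse directions (left, right, up) and ε = 1e-6, since the patent fixes neither the direction set Φ nor the value of the preset parameter.

```python
import numpy as np

EPS = 1e-6  # preset parameter epsilon (assumed value)

def aggregation_degree(pt, caption_pts, vectors, m1=21, m2=61):
    # Detection window W: `pt` is the midpoint of the lower edge of an
    # m1 x m2 rectangle (empirical sizes 21 and 61).
    x, y = pt
    in_w = [(p, v) for p, v in zip(caption_pts, vectors)
            if abs(p[0] - x) <= m2 // 2 and 0 <= y - p[1] < m1]
    if not in_w:
        return 0.0
    n = len(in_w)
    sums = {"left": 0.0, "right": 0.0, "up": 0.0}
    for (px, py), v in in_w:
        gamma = 1.0 / (np.exp(-v.max()) + EPS)  # aggregation index
        dx, dy = px - x, y - py                 # dy >= 0 inside the window
        if dx < 0 and dy <= -dx:
            sums["left"] += gamma
        elif dx > 0 and dy <= dx:
            sums["right"] += gamma
        else:
            sums["up"] += gamma
    # Feature saliency per direction, then the maximum over directions.
    return max(s / n for s in sums.values())
```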
So far, the feature aggregation degree of each caption feature point in each frame's display edge map is obtained, for use in the subsequent feature weight calculation.
The feature weight calculation module: the feature aggregation degree of each caption feature point in the i-th frame display edge map is obtained by the above steps, and the steps are repeated to obtain the feature aggregation degree of each caption feature point in the (i+1)-th frame display edge map. In videos such as news and publicity films, a scrolling caption usually scrolls slowly so that the audience can read it. In the display edge map E_i, the window of size m × m centered on the caption feature point (x, y) is taken as the feature window of that point, denoted W1_(x,y); in the invention, m takes the empirical value 31. The feature window taken at the same position in the (i+1)-th frame display edge map is denoted W2_(x,y). According to the feature aggregation degree and stroke matching degree vector of the caption feature point (x, y) in E_i, and the feature aggregation degrees and stroke matching degree vectors of the caption feature points inside the feature window W2_(x,y), the decision distance between the caption feature point (x, y) in E_i and each caption feature point in W2_(x,y) is constructed.
The decision distance between the caption feature point (x, y) in the display edge map E_i and the a-th caption feature point in the feature window W2_(x,y) is computed as:

J_a = | D_(x,y) − D_a | · DTW( V_(x,y), V_a )

where J_a is the decision distance between the caption feature point (x, y) and the a-th caption feature point; | D_(x,y) − D_a | is the absolute difference between their feature aggregation degrees; V_(x,y) and V_a are the stroke matching degree vectors of the caption feature point (x, y) and of the a-th caption feature point; and DTW(·,·) is the dynamic time warping (DTW) distance between the two vectors. The DTW distance is a known technique, and the specific process is not repeated.
The larger the distribution difference of the caption feature points at the same position in the display edge maps of adjacent frames, the larger the first product factor | D_(x,y) − D_a |; the faster the caption scrolls between adjacent frames, the larger the change of the stroke characteristics at the same position, and the larger the second product factor DTW(V_(x,y), V_a); hence the larger the decision distance J_a.
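The decision distance can be sketched as follows; the plain dynamic-programming DTW below is a textbook implementation, since the patent treats the DTW distance as a known technique and does not prescribe a particular variant.

```python
import numpy as np

def dtw(a, b):
    # Plain dynamic-programming DTW distance between two 1-D sequences.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def decision_distance(D_xy, V_xy, D_a, V_a):
    # J_a = |D_(x,y) - D_a| * DTW(V_(x,y), V_a), as in the formula above.
    return abs(D_xy - D_a) * dtw(V_xy, V_a)
```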
The decision distances between the caption feature point (x, y) in the display edge map E_i and every caption feature point in the feature window W2_(x,y) are obtained, and the caption feature point with the minimum decision distance in W2_(x,y) is taken as the matching feature point of the caption feature point (x, y). Further, the feature weight of the caption feature point (x, y) in E_i is constructed from its matching feature point:
w^i_(x,y) = Norm( 1 / ( DTW(V_(x,y), V′) · | D_(x,y) − D′ | + ε ) )

w^(i+1) = 1 − w^i_(x,y)

where w^i_(x,y) is the feature weight of the caption feature point (x, y) in the display edge map E_i; Norm(·) is a normalization function; DTW(V_(x,y), V′) is the DTW distance between the stroke matching degree vector of the caption feature point (x, y) and that of its matching feature point; | D_(x,y) − D′ | is the absolute difference between the feature aggregation degree of the caption feature point (x, y) and that of its matching feature point; ε is a preset parameter that prevents division by zero;

w^(i+1) is the feature weight of the matching feature point in the display edge map of the (i+1)-th frame.
The larger the difference between the stroke matching degree vector of the caption feature point (x, y) in E_i and that of its matching feature point, the larger DTW(V_(x,y), V′); the larger the difference between the feature aggregation degree of the caption feature point (x, y) and that of its matching feature point, the larger | D_(x,y) − D′ |; and the lower the degree of matching between the point and its matching point, the more likely the caption feature point (x, y) appearing in E_i is absent from the (i+1)-th display edge map. That is, the caption feature point (x, y) may belong to the end of the previous sentence of a fixed caption that is about to change to the next sentence. In that case, to avoid residual-caption defects, when the intermediate frame is obtained by bilinear interpolation of the two display edge maps, the weight of the caption feature point (x, y) in E_i should be small, and the weight of its matching feature point should be relatively large.
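A sketch of the matching and weighting steps follows, reusing decision_distance from the previous sketch. Replacing the unspecified Norm(·) function by the squashing 1/(1 + J + ε) is an assumption made so the weights fall in (0, 1]; the patent only names a normalization function.

```python
import numpy as np

EPS = 1e-6  # preset parameter (assumed value)

def match_and_weight(pt, D_pt, V_pt, next_pts, next_D, next_V, m=31):
    # Search the (i+1)-th frame feature window W2 (m x m, empirical m = 31)
    # for the caption feature point with the minimum decision distance.
    x, y = pt
    half = m // 2
    best_j, best_J = None, np.inf
    for j, (px, py) in enumerate(next_pts):
        if abs(px - x) <= half and abs(py - y) <= half:  # inside W2
            J = decision_distance(D_pt, V_pt, next_D[j], next_V[j])
            if J < best_J:
                best_j, best_J = j, J
    if best_j is None:
        return None, 0.5, 0.5  # no candidate in W2: fall back to equal weights
    w_i = 1.0 / (1.0 + best_J + EPS)  # stand-in for Norm(1/(J + eps))
    return best_j, w_i, 1.0 - w_i     # matching point, w^i, w^(i+1)
```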
So far, the feature weights of the caption feature points at the same position in two adjacent display edge maps are obtained, for constructing the subsequent inserted frames.
The broadcast video frame interpolation module calculates the gray value of each caption feature point in the intermediate interpolated frame by weighted bilinear interpolation, using the feature weights of the caption feature points in the two adjacent display edge maps obtained above; the calculation formula is:

g_mid(x, y) = w^i_(x,y) · g_i(x, y) + ( 1 − w^i_(x,y) ) · g_(i+1)(x, y)

where g_mid(x, y) is the interpolated gray value of the caption feature point (x, y) on the inserted frame between the i-th and (i+1)-th display edge maps; w^i_(x,y) is the feature weight of the caption feature point (x, y) in the i-th display edge map; and g_i(x, y) and g_(i+1)(x, y) are the gray values of the caption feature point (x, y) in the display edge maps of the i-th and (i+1)-th frames.
The caption feature points in the i-th display edge map are traversed, and the edge map constructed from the interpolated gray values of all caption feature points according to their position information is used as the feature gray map between the i-th and (i+1)-th display edge maps.
For the non-caption feature points in the i-th broadcast gray map, the gray values of the intermediate frame are calculated with the traditional LK optical flow frame interpolation algorithm: its inputs are the i-th and (i+1)-th broadcast gray maps of the video, and its output is an interpolated image between them, recorded as the non-feature gray map. The result of linearly fusing the feature gray map and the non-feature gray map between the two adjacent display edge maps is taken as the broadcast frame-insertion image between the two adjacent broadcast gray maps. To make the transition between the inserted image and the broadcast gray maps of the preceding and following frames smoother, the invention processes the broadcast frame-insertion image with morphological erosion and dilation operations; morphological erosion and dilation are known techniques, and the detailed process is not repeated.
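The sketch below approximates this module with OpenCV: sparse LK optical flow (cv2.calcOpticalFlowPyrLK) moves each non-feature point halfway toward its position in the next frame, the two gray maps are fused linearly, and a morphological opening smooths the result. The 0.5 fusion weight and the 3×3 kernel are assumed values, not specified in the patent.

```python
import cv2
import numpy as np

def broadcast_insert_frame(gray_i, gray_ip1, feat_map, feat_mask, alpha=0.5):
    h, w = gray_i.shape
    non_feat = gray_i.copy()
    ys, xs = np.where(~feat_mask)  # non-caption feature points
    if len(xs) > 0:
        pts = np.stack([xs, ys], axis=1).astype(np.float32).reshape(-1, 1, 2)
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(gray_i, gray_ip1, pts, None)
        for p0, p1, ok in zip(pts.reshape(-1, 2), nxt.reshape(-1, 2),
                              status.ravel()):
            if ok:  # move the pixel half-way along its estimated flow
                mx = int(round((p0[0] + p1[0]) / 2.0))
                my = int(round((p0[1] + p1[1]) / 2.0))
                if 0 <= mx < w and 0 <= my < h:
                    non_feat[my, mx] = gray_i[int(p0[1]), int(p0[0])]
    # Linear fusion, then a morphological opening (erosion, then dilation).
    fused = cv2.addWeighted(feat_map, alpha, non_feat, 1.0 - alpha, 0)
    kernel = np.ones((3, 3), np.uint8)
    return cv2.dilate(cv2.erode(fused, kernel), kernel)
```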
Further, the broadcast frame-insertion image between every two adjacent broadcast gray maps in the video to be played is obtained, and the video to be played on the digital rural multimedia display screen is frame-interpolated with these images, yielding an interpolated video with a higher frame rate and clear text.
The interpolated video is then uploaded to the content management system of the broadcasting system for storage; according to the type and duration of the stored interpolated videos, an administrator organizes, classifies, tags and schedules them, deciding when each item is played and in what order and frequency, so as to meet the information needs of village residents with high quality and to promote village development and cultural exchange.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the invention, but any modifications, equivalent substitutions, improvements, etc. within the principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A digital rural multimedia display broadcast system, comprising the following modules:
the video data acquisition module acquires a video to be played on the multimedia display screen;
the subtitle feature point extraction module is used for acquiring a display edge image corresponding to each frame of broadcast gray image in the video to be played by utilizing an edge detection algorithm, and acquiring a stroke matching vector of each feature point in the display edge image by utilizing a feature point matching algorithm; taking feature points corresponding to stroke matching vectors with elements larger than a preset threshold as subtitle feature points;
the feature aggregation degree calculation module is used for obtaining the feature aggregation degree of each caption feature point according to the significance degree of elements in the stroke matching vectors of the feature points in different directions in the detection window taken by each caption feature point;
the feature weight calculation module is used for obtaining matching feature points of the caption feature points in the current frame according to stroke matching vectors and feature point aggregation degrees of the caption feature points in the current frame and the caption feature points in the same position in the next frame of the current frame; acquiring feature weights of caption feature points in the current frame and the next frame of the current frame according to the caption feature points in the current frame and the matched feature points;
the broadcast video frame inserting module acquires a characteristic gray scale image between two adjacent frame display edge images according to the characteristic weight in the two adjacent frame display edge images; acquiring a broadcast frame inserting image between two adjacent frames according to a characteristic gray level image between two adjacent frames; and inserting all the broadcasting frame inserting images into the video to be played according to the time sequence, and transmitting the video to the multimedia display screen.
2. The broadcasting system of claim 1, wherein the method for obtaining the stroke matching vector of each feature point in the display edge map by using the feature point matching algorithm is as follows:
and taking the edge map of each frame display and the edge map of the preset stroke as inputs of a characteristic point matching algorithm, and taking a vector formed by the matching degree of each characteristic point obtained by the characteristic point matching algorithm in the edge map of each frame display and the preset stroke as a stroke matching vector of each characteristic point.
3. The broadcasting system of claim 1, wherein the method for obtaining the feature aggregation degree of each caption feature point according to the significance degree of the elements in the stroke matching vectors of the feature points in different directions within the detection window taken by each caption feature point is as follows:
acquiring an aggregation index of each caption feature point according to the distribution characteristics of stroke matching degree vectors in different directions in the detection window taken by each caption feature point;
taking the sum of the aggregation indexes of all feature points in each direction within the detection window taken by each caption feature point as the numerator, taking the number of all feature points in the detection window as the denominator, and taking the ratio of the numerator to the denominator as the feature saliency of each caption feature point in that direction;
and taking the maximum value of the feature saliency in different directions in the detection window taken by each caption feature point as the feature aggregation degree of each caption feature point.
4. A digital rural multimedia display broadcasting system according to claim 3, wherein the method for obtaining the aggregation index of each caption feature point according to the distribution characteristics of the stroke matching degree vectors in different directions within the detection window taken by each caption feature point comprises the following steps:
taking the maximum value of all elements in the stroke matching degree vector of each feature point in each direction within the detection window taken by each caption feature point as the target index of that feature point, taking the natural constant as the base and the negative of the target index as the exponent, and taking the resulting exponential as the salient value of each caption feature point;
and taking the inverse of the sum of the salient value of each subtitle characteristic point and the preset parameter as the aggregation index of each subtitle characteristic point.
5. The broadcasting system of claim 1, wherein the method for obtaining the matching feature points of the caption feature points in the current frame according to the stroke matching vector and the feature point aggregation degree of the caption feature points in the current frame and the caption feature points in the same position in the next frame of the current frame comprises the following steps:
acquiring, in the display edge map of the next frame of the current frame, the feature window located at the same position as the caption feature point in the display edge map of the current frame;
acquiring the decision distance between the caption feature point in the current frame and each caption feature point in the feature window according to their differences;
and taking the caption characteristic point corresponding to the minimum decision distance value in the characteristic window as a matching characteristic point of the caption characteristic point in the current frame.
6. The digital village multimedia display broadcasting system according to claim 5, wherein the method for obtaining the decision distance of each caption feature point according to the difference between each caption feature point and the caption feature point in the feature window comprises:
taking the absolute value of the difference value between the caption characteristic points in the display edge map of the current frame and the characteristic aggregation degree of each caption characteristic point in the characteristic window as a first product factor;
taking the measurement distance between the caption characteristic points in the display edge map of the current frame and the stroke matching degree vector of each caption characteristic point in the characteristic window as a second product factor;
the decision distance of the caption feature points consists of a first product factor and a second product factor, wherein the decision distance is in direct proportion to the first product factor and the second product factor.
7. The broadcasting system of claim 1, wherein the method for obtaining the feature weights of the caption feature points in the current frame and the caption feature points in the next frame of the current frame according to the caption feature points in the current frame and the matching feature points thereof comprises the following steps:
taking a normalization result of the inverse of the decision distance between each caption feature point in the current frame display edge map and the matched feature point as the feature weight of each caption feature point in the current frame display edge map;
and taking the difference value between the preset parameter and the characteristic weight of each caption characteristic point in the current frame display edge map as the characteristic weight of the matched characteristic point of the caption characteristic point.
8. The broadcasting system of claim 1, wherein the method for obtaining the feature gray scale map between two adjacent frame display edge maps according to the feature weights in the two adjacent frame display edge maps comprises:
acquiring the frame interpolation gray values of the feature points in the frame interpolation images between the two adjacent frame display edge images according to the feature weights in the two adjacent frame display edge images;
and traversing all the characteristic points on the display edge graph of the current frame, and taking an image formed by using the interpolated frame gray values of all the characteristic points as a characteristic gray graph between the display edge graphs of two adjacent frames.
9. The broadcasting system of claim 8, wherein the method for obtaining the interpolated gray value of the feature point in the interpolated image between two adjacent frame display edge maps according to the feature weights in the two adjacent frame display edge maps comprises:
and taking the characteristic weights in the adjacent two frames of display edge graphs and the gray values of the characteristic points in the adjacent display edge graphs as inputs of a bilinear interpolation algorithm, and acquiring the frame interpolation gray values of the characteristic points on the frame interpolation graph between the adjacent two frames by using the bilinear interpolation algorithm.
10. The broadcasting system of claim 1, wherein the method for acquiring the broadcasting frame-inserted image between two adjacent frames according to the characteristic gray scale image between the display edge images of two adjacent frames comprises the following steps:
acquiring the interpolated gray values of the non-feature points on the interpolated image between two adjacent broadcast gray maps by using an optical flow frame interpolation algorithm, traversing all the non-feature points on the current frame broadcast gray map, and taking the image formed by the interpolated gray values of all the non-feature points as the non-feature gray map between the two adjacent broadcast gray maps;
and taking the linear weighting result of the characteristic gray scale image and the non-characteristic gray scale image between the adjacent two frames of display edge images as a broadcast frame inserting image between the adjacent two frames of broadcast gray scale images.
CN202311140360.7A 2023-09-06 2023-09-06 Digital village multimedia display screen broadcasting system Active CN116886996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311140360.7A CN116886996B (en) 2023-09-06 2023-09-06 Digital village multimedia display screen broadcasting system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311140360.7A CN116886996B (en) 2023-09-06 2023-09-06 Digital village multimedia display screen broadcasting system

Publications (2)

Publication Number Publication Date
CN116886996A true CN116886996A (en) 2023-10-13
CN116886996B CN116886996B (en) 2023-12-01

Family

ID=88271850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311140360.7A Active CN116886996B (en) 2023-09-06 2023-09-06 Digital village multimedia display screen broadcasting system

Country Status (1)

Country Link
CN (1) CN116886996B (en)

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6275612B1 (en) * 1997-06-09 2001-08-14 International Business Machines Corporation Character data input apparatus and method thereof
JP2000194845A (en) * 1998-12-25 2000-07-14 Canon Inc Image processor, image processing method and image processing system
US20080211968A1 (en) * 2006-12-19 2008-09-04 Tomokazu Murakami Image Processor and Image Display Apparatus Comprising the Same
CN101216948A (en) * 2008-01-14 2008-07-09 浙江大学 Cartoon animation fabrication method based on video extracting and reusing
JP2011130133A (en) * 2009-12-16 2011-06-30 Canon Inc Stereoscopic video processing apparatus, and method of controlling the same
JP2011135288A (en) * 2009-12-24 2011-07-07 Canon Inc Video processing device and video processing method
KR20140134906A (en) * 2013-05-15 2014-11-25 주식회사 칩스앤미디어 An apparatus for motion compensated frame interpolation of non-moving caption region and a method thereof
CN103248797A (en) * 2013-05-30 2013-08-14 北京志光伯元科技有限公司 Video resolution enhancing method and module based on FPGA (field programmable gate array)
US20160366366A1 (en) * 2015-06-12 2016-12-15 Sharp Laboratories Of America, Inc. Frame rate conversion system
CN111539427A (en) * 2020-04-29 2020-08-14 武汉译满天下科技有限公司 Method and system for extracting video subtitles
WO2022037251A1 (en) * 2020-08-21 2022-02-24 Oppo广东移动通信有限公司 Video data processing method and apparatus
CN112184779A (en) * 2020-09-17 2021-01-05 无锡安科迪智能技术有限公司 Method and device for processing interpolation image
CN114007135A (en) * 2021-10-29 2022-02-01 广州华多网络科技有限公司 Video frame insertion method and device, equipment, medium and product thereof
CN115334335A (en) * 2022-07-13 2022-11-11 北京优酷科技有限公司 Video frame insertion method and device
CN116170650A (en) * 2022-12-21 2023-05-26 上海哔哩哔哩科技有限公司 Video frame inserting method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FENG HONG-CAI: "A Shot Boundary Detection Method Based on Color Space", 2010 International Conference on E-Business and E-Government *
李琼 (LI Qiong): "Research on the extraction of news video caption regions based on color analysis", Journal of Anhui Vocational College of Electronics & Information Technology, no. 03 *
王刚 (WANG Gang): "Automatic extraction and recognition of news video captions", China Master's Theses Full-text Database *

Also Published As

Publication number Publication date
CN116886996B (en) 2023-12-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant