CN112511860B - Picture transmission method with clear character area - Google Patents

Info

Publication number
CN112511860B
Authority
CN
China
Prior art keywords
picture
character
data
current
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011338605.3A
Other languages
Chinese (zh)
Other versions
CN112511860A (en)
Inventor
张浪
孙利杰
欧阳殷朝
陈松政
刘文清
杨涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Qilin Xin'an Technology Co ltd
Original Assignee
Hunan Qilin Xin'an Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Qilin Xin'an Technology Co ltd filed Critical Hunan Qilin Xin'an Technology Co ltd
Priority to CN202011338605.3A priority Critical patent/CN112511860B/en
Publication of CN112511860A publication Critical patent/CN112511860A/en
Application granted granted Critical
Publication of CN112511860B publication Critical patent/CN112511860B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234345Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Abstract

The invention discloses a picture transmission method with clear character areas, comprising a step in which a server compresses and encodes screen image data and a step in which a client decompresses and decodes the screen image data. The step in which the server compresses and encodes the screen image data comprises: capturing a current picture P_i and obtaining a character recognition area from the unit blocks that meet the conditions; transcoding picture P_i into a picture P_i1 in YUV format; performing character recognition on the Y component of the character recognition area of picture P_i1 according to a character recognition algorithm to obtain a character area; performing h264 encoding on picture P_i1 to obtain h264 data and a reconstructed picture P_i2; performing a YUV data difference calculation on the character areas of picture P_i1 and picture P_i2 to obtain character difference data; and compressing the character difference data according to a compression algorithm to obtain a character differential compressed data packet, packing and compressing the h264 data and the character differential compressed data packet into a picture compressed data packet, and sending the picture compressed data packet to the client. The invention reduces bandwidth consumption, keeps character areas clear, and improves the user experience.

Description

Picture transmission method with clear character area
Technical Field
The invention relates to the field of cloud desktop image transmission, in particular to an image transmission method with clear text areas.
Background
Computer screen transmission technology plays an important role in cloud desktops, network teaching systems, and video conference systems. The general method is to capture the computer screen image, compress and encode it as video, and transmit it over the network to a client for display. To reduce the network bandwidth used during transmission (especially transmission across the public network), a lossy compression algorithm with a relatively large compression ratio is generally adopted for video encoding. When the client displays the image, the whole image can become blurred because of the lossy compression, and the larger the compression ratio, the more blurred the image. As a result, sensitive areas of some images, especially text areas, cannot be recognized.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: aiming at the technical problems in the prior art, the invention provides a picture transmission method with clear text areas, which can ensure that the consumed bandwidth is small, and can also ensure that sensitive areas such as texts are clear, so as to improve the user experience.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
A picture transmission method with clear character areas comprises a step in which a server compresses and encodes screen image data, specifically comprising:
A1) capturing a screen picture according to a preset time as the current picture P_i, and adding to the current character recognition area set A the position information of each unit block that is located in a changed area of the previous picture P_(i-1) and in an unchanged area of the current picture P_i, wherein a unit block is an area obtained by dividing the screen into rows and columns;
A2) transcoding the current picture P_i into a picture P_i1 in YUV format;
A3) acquiring the unit blocks to be recognized in picture P_i1 according to the elements of the current character recognition area set A, performing character recognition on the Y component of each unit block to be recognized according to a character recognition algorithm, and adding the position information of each successfully recognized unit block to the current character area set B;
A4) performing h264 encoding on picture P_i1 to obtain encoded h264 data and a reconstructed picture P_i2;
A5) acquiring the mutually corresponding character unit blocks in picture P_i1 and picture P_i2 according to the elements of the current character area set B, performing a difference calculation on the YUV data of each character unit block in picture P_i1 and of the corresponding character unit block in picture P_i2 to obtain the corresponding character difference data, and adding the position information of the character unit block and the corresponding character difference data to the current character area detail set C;
A6) compressing the current character area detail set C according to a compression algorithm to obtain a character differential compressed data packet, packing and compressing the encoded h264 data and the character differential compressed data packet into a picture compressed data packet, and sending the picture compressed data packet to the client.
Further, the method comprises a step in which the client decompresses and decodes the screen image data, specifically comprising:
B1) acquiring the picture compressed data packet sent by the server, and decompressing the picture compressed data packet;
B2) if the decompressed content includes a character differential compressed data packet, decompressing the character differential compressed data packet to obtain the character area detail set C, decoding the decompressed h264 data to obtain the reconstructed picture P_i2, synthesizing the character difference data in the character area detail set C with picture P_i2 to obtain a picture P_i3 with clear characters, and taking picture P_i3 as the final picture; otherwise, decoding the decompressed h264 data to obtain the reconstructed picture P_i2 and taking picture P_i2 as the final picture; wherein synthesizing the character difference data in the character area detail set C with picture P_i2 to obtain the picture P_i3 with clear characters specifically comprises: acquiring the character unit blocks in picture P_i2 according to the position information in the character area detail set C, matching each character unit block against the character area detail set C to obtain the corresponding character difference data, and adding the YUV data of the character unit block to the corresponding character difference data to obtain the new YUV data of the character unit block.
Further, step A1) is preceded by a step of dividing the unit blocks, specifically comprising: dividing the screen into nw rows and nh columns of unit blocks of the same size according to a preset unit length w and a preset unit width h, defining a flag set flag[nw][nh] for all the unit blocks, and setting every flag in flag[nw][nh] to 0.
Further, step A1) specifically comprises: taking all unit blocks corresponding to the area of the current picture P_i that has changed relative to the previous picture P_(i-1) as first unit blocks, and taking all unit blocks corresponding to the area of the current picture P_i that has not changed relative to the previous picture P_(i-1) as second unit blocks; setting the flags in flag[nw][nh] that correspond to the first unit blocks to 1; and matching each second unit block against flag[nw][nh], and, if the flag corresponding to the second unit block is 1, adding the position information of the second unit block to the current character recognition area set A and setting the flag corresponding to that second unit block in flag[nw][nh] to 0.
Further, capturing the screen picture according to the preset time as the current picture P_i in step A1) specifically comprises: judging whether the screen picture has changed within the preset time; if so, capturing the current screen picture as the current picture P_i; otherwise, taking the previous picture P_(i-1) as the current picture P_i.
Further, step A1) also includes a processing step for when the current character recognition area set A is empty: if the current character recognition area set A is empty, transcoding the current picture P_i into a picture P_i1 in YUV format, performing h264 encoding to obtain encoded h264 data, compressing the encoded h264 data into a picture compressed data packet, and sending the picture compressed data packet to the client.
Further, a network judgment step is also included before step A5), specifically comprising:
C1) judging whether the network condition meets a preset condition; if so, jumping to step A5); if not, proceeding to step C2);
C2) acquiring the mutually corresponding character unit blocks in picture P_i1 and picture P_i2 according to the elements of the character area set B, performing a difference calculation on the Y component data of each character unit block in picture P_i1 and of the corresponding character unit block in picture P_i2 to obtain the corresponding character difference data, adding the position information of the character unit block and the corresponding character difference data to the character area detail set C, and jumping to step A6).
Further, a network judgment step is also included before step A6), specifically comprising:
D1) judging whether the network condition meets a preset condition; if so, jumping to step A6); if not, proceeding to step D2);
D2) compressing the encoded h264 data into a picture compressed data packet, sending the picture compressed data packet to the client, and returning to step A1).
Further, the character recognition algorithm in step A3) is the maximally stable extremal regions (MSER) algorithm.
Further, the compression algorithm in step A6) is a run-length encoding (RLE) compression algorithm or the zlib compression algorithm.
Compared with the prior art, the invention has the advantages that:
1. The screen is divided into unit blocks, so that character recognition only needs to cover the areas of certain unit blocks rather than the whole picture, which reduces CPU consumption;
2. The method does not perform character recognition on areas where the picture is changing, and recognizes an area that has stopped changing only once, which reduces the number of character recognition passes and thus the CPU consumption they cause;
3. The method extracts the details of the character areas that are lost to h264 lossy compression while keeping the high compression ratio of h264, and compresses the detail data before transmission, which reduces bandwidth consumption;
4. The method performs character recognition on the Y component, without first converting the image to grayscale, which improves processing efficiency and reduces CPU consumption.
Drawings
FIG. 1 is a diagram illustrating steps of encoding and compressing screen image data according to various embodiments of the present invention.
FIG. 2 is a flow chart of encoding and compressing screen image data according to various embodiments of the present invention.
FIG. 3 is a diagram illustrating steps for decoding decompressed screen image data according to various embodiments of the present invention.
FIG. 4 is a flow chart of decoding decompressed screen image data in accordance with various embodiments of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and specific preferred embodiments of the description, without thereby limiting the scope of protection of the invention.
Before the method is carried out, the screen is first divided into unit blocks by rows and columns. Assuming the screen has width `width` and height `height`, the screen is divided according to a preset unit length w and a preset unit width h into unit blocks of size w × h; that is, every unit block has the same size and covers an area of w × h on the screen. The smaller the unit length w and the unit width h, the finer the character recognition in the subsequent steps, but the higher the CPU consumption; the specific values of w and h can be adjusted according to actual needs. The block counts are then:
Number of unit block rows: nw = (width + w - 1) / w
Number of unit block columns: nh = (height + h - 1) / h
The form of these expressions implies integer division, which rounds each count up. The screen can thus be divided into a total of nw rows and nh columns of unit blocks of the same size.
A flag set flag[nw][nh] is then defined for all the unit blocks, with the flags in flag[nw][nh] corresponding one-to-one to the unit blocks, and every flag in flag[nw][nh] is set to 0, that is, flag[nw][nh] = {0}.
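As an illustration only, the block division and flag initialization described above can be sketched in Python. The function names, the clipping of edge blocks to the screen boundary, and the 1920 × 1080 screen with 64 × 64 blocks are assumptions made for this example; they do not come from the patent.

```python
def grid_shape(width, height, w, h):
    """Return (nw, nh): the block counts along the width and along the height."""
    nw = (width + w - 1) // w    # integer division that rounds up
    nh = (height + h - 1) // h
    return nw, nh


def new_flags(nw, nh):
    """Build flag[nw][nh] with every entry cleared to 0."""
    return [[0] * nh for _ in range(nw)]


def block_rect(ix, iy, width, height, w, h):
    """Pixel rectangle (x, y, bw, bh) of block (ix, iy).

    Assumption: blocks on the right and bottom edges are clipped to the screen.
    """
    x, y = ix * w, iy * h
    return x, y, min(w, width - x), min(h, height - y)


# Hypothetical screen and block sizes, chosen only for the example.
nw, nh = grid_shape(1920, 1080, 64, 64)
flags = new_flags(nw, nh)
```

Because 1080 is not a multiple of 64, the rounded-up count adds one partial block along the height, which is why the sketch clips the last block instead of reading past the screen edge.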
Example one
As shown in FIG. 1 and FIG. 2, the picture transmission method with clear character areas of this embodiment includes a step in which the server compresses and encodes screen image data, specifically comprising:
A1) capturing a screen picture according to a preset time as the current picture P_i, and adding to the current character recognition area set A, where A = {c0 ... cn}, the position information of each unit block that is located in a changed area of the previous picture P_(i-1) and in an unchanged area of the current picture P_i, wherein a unit block is an area obtained by dividing the screen into rows and columns; the screen capture program can call interfaces such as NVIDIA NVFBC, AMD RapidFire, Windows DXGI, QXL, and Mirror Driver, and these API interfaces can obtain both the whole screen picture and the changed area of the screen;
A2) transcoding the current picture P_i into a picture P_i1 in YUV format;
A3) acquiring the unit blocks to be recognized in picture P_i1 according to the elements of the current character recognition area set A, performing character recognition on the Y component of each unit block to be recognized according to a character recognition algorithm, and adding the position information of each successfully recognized unit block to the current character area set B, where B = {k0 ... km};
A4) performing h264 lossy encoding on picture P_i1; with an existing x264 encoding interface, two pieces of data can be obtained during encoding: the encoded h264 data and the reconstructed picture P_i2. Taking the open-source x264 encoding interface as an example:
X264_API int x264_encoder_encode(x264_t*,x264_nal_t**pp_nal,int*pi_nal,x264_picture_t*pic_in,x264_picture_t*pic_out);
x264_picture_t *pic_in: the original YUV picture P_i1 is passed in here;
x264_nal_t **pp_nal: the encoded h264 picture is obtained here;
x264_picture_t *pic_out: the reconstructed picture P_i2 is obtained here.
The YUV data of picture P_i2 is the YUV data of picture P_i1 after h264 lossy encoding and decoding, so the YUV data of picture P_i2 has lost much detail compared with the original YUV data of picture P_i1, which makes the picture blurred;
A5) acquiring the mutually corresponding character unit blocks in picture P_i1 and picture P_i2 according to the elements of the current character area set B, performing a difference calculation on the YUV data of each character unit block in picture P_i1 and of the corresponding character unit block in picture P_i2 to obtain the corresponding character difference data, and adding the position information of the character unit block and the corresponding character difference data to the current character area detail set C, where C = {g0 ... gm};
A6) compressing the current character area detail set C according to a compression algorithm to obtain a character differential compressed data packet, packing and compressing the encoded h264 data and the character differential compressed data packet into a picture compressed data packet, and sending the picture compressed data packet to the client.
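Steps A5) and A6) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the patent does not specify how differences are stored or how the packet is laid out, so the modulo-256 storage of each 8-bit difference, the record layout (block column, block row, byte length, difference bytes), the length-prefixed outer framing, and all function names are hypothetical. zlib is used because the description names it as one possible compression algorithm.

```python
import struct
import zlib


def block_difference(original, reconstructed):
    """Per-sample difference (original - reconstructed) for one unit block.

    Both arguments are equal-length byte strings of 8-bit samples.
    Assumption: the difference is stored modulo 256 so it fits in one byte.
    """
    return bytes((o - r) & 0xFF for o, r in zip(original, reconstructed))


def pack_detail_set(detail_set):
    """Serialise set C as (ix, iy, length, diff bytes) records, then zlib them."""
    raw = b"".join(
        struct.pack("<HHI", ix, iy, len(diff)) + diff
        for (ix, iy), diff in detail_set
    )
    return zlib.compress(raw)


def build_picture_packet(h264_data, detail_packet=b""):
    """Frame the h264 data and the optional character differential packet.

    Assumption: a header of two little-endian 32-bit lengths; a zero second
    length means no character difference data was sent.
    """
    header = struct.pack("<II", len(h264_data), len(detail_packet))
    return header + h264_data + detail_packet
```

With modulo-256 storage the client can recover the original samples exactly by adding the difference back modulo 256, which is the property that keeps the character areas clear.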
In this embodiment, character recognition is performed on a unit block only when the block is located in a changed area of the previous picture and in an unchanged area of the current picture, where a changed area is an area in which a picture differs from the picture before it, and an unchanged area is an area in which it does not. If no unit block satisfies this condition, the picture is changing continuously and the current character recognition area set A is empty. Step A1) of this embodiment therefore also includes the following processing step: if the current character recognition area set A is empty, the current picture P_i is transcoded into a picture P_i1 in YUV format, h264 lossy encoding is performed to obtain encoded h264 data, and the encoded h264 data is compressed into a picture compressed data packet and sent to the client. In other words, when no unit block satisfies the condition, the current picture P_i is transcoded and h264-encoded directly and the h264 data is compressed and sent to the client; character recognition is skipped for a continuously changing picture, which reduces the CPU consumption caused by character recognition.
In step A1) of this embodiment, the preset time is the time taken for characters to change from blurred to clear: the smaller the preset time, the faster characters become clear, but the higher the CPU consumption, so the value can be adjusted according to actual needs. If the screen picture has not changed after the preset time, the areas of all unit blocks are unchanged. Capturing the screen picture according to the preset time as the current picture P_i in step A1) therefore specifically comprises: judging whether the screen picture has changed within the preset time; if so, capturing the current screen picture as the current picture P_i; otherwise, taking the previous picture P_(i-1) as the current picture P_i. When the screen has not changed, this embodiment uses the last captured screen picture for the subsequent processing, so as to reduce resource consumption.
The specific steps of step A1) of this embodiment are: taking all unit blocks corresponding to the area of the current picture P_i that has changed relative to the previous picture P_(i-1) as first unit blocks, and taking all unit blocks corresponding to the area of the current picture P_i that has not changed relative to the previous picture P_(i-1) as second unit blocks; setting the flags in flag[nw][nh] that correspond to the first unit blocks to 1; and matching each second unit block against flag[nw][nh], and, if the flag corresponding to the second unit block is 1, adding the position information of the second unit block to the current character recognition area set A and setting the flag corresponding to that second unit block in flag[nw][nh] to 0. Through these steps, character recognition is performed only once for a unit block, at the point where it goes from a changed area to an unchanged area, which further reduces the CPU consumption caused by character recognition.
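The flag handling of step A1) can be sketched as a single pass over the flag set. In this sketch the changed area is assumed to arrive as a set of (ix, iy) block positions, and the function name is hypothetical; how the capture interface reports changed areas is not specified at this level of detail.

```python
def update_recognition_set(flags, changed):
    """One pass of step A1) over flag[nw][nh].

    flags   -- 2D list of 0/1 flags, indexed flags[ix][iy]
    changed -- set of (ix, iy) positions that changed relative to the
               previous picture (the "first unit blocks")
    Returns the current character recognition area set A as a list.
    """
    set_a = []
    for ix, column in enumerate(flags):
        for iy, flag in enumerate(column):
            if (ix, iy) in changed:
                column[iy] = 1              # first unit block: mark as changing
            elif flag == 1:
                set_a.append((ix, iy))      # second unit block that just settled
                column[iy] = 0              # clear so it is recognized only once
    return set_a
```

Calling the function on three consecutive pictures, with block (0, 0) changing only in the first, returns an empty set A, then [(0, 0)], then an empty set again, so the block is submitted for character recognition exactly once.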
In this embodiment, the character recognition algorithm in step A3) is the maximally stable extremal regions (MSER) algorithm. The YUV format comprises three components, Y, U, and V: the Y component represents brightness, so a picture containing only the Y component is a colorless black, white, and gray picture, while the U and V components represent color. The MSER algorithm can perform character recognition on the Y component alone; if characters are recognized, the recognition succeeds, otherwise it fails.
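To show why no separate grayscale conversion is needed, the following sketch extracts the Y samples of one unit block directly from a frame buffer. It assumes a planar I420 layout (all Y samples first, followed by the subsampled U and V planes); the patent only says "YUV format", so the layout and the function name are assumptions. MSER itself is not implemented here; the returned rows could be handed to an existing detector, for example the MSER implementation in OpenCV.

```python
def y_block(frame, width, height, x, y, bw, bh):
    """Return the Y samples of the rectangle (x, y, bw, bh) as a list of rows.

    frame is assumed to be a planar I420 buffer: width * height Y bytes,
    then the U and V planes, which this function never reads.
    """
    rows = []
    for row in range(y, min(y + bh, height)):
        start = row * width + x
        rows.append(frame[start:start + min(bw, width - x)])
    return rows


# Tiny 4 x 2 frame: 8 Y samples followed by 2 U and 2 V samples.
frame = bytes(range(8)) + b"\x80" * 4
block = y_block(frame, 4, 2, 2, 0, 2, 2)
```

The Y plane is already a grayscale image of the screen, so slicing it per unit block is the only preparation the recognition step requires under this layout.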
The compression algorithm in step A6) of this embodiment is a conventional compression algorithm, such as the run-length encoding (RLE) algorithm or the zlib compression algorithm.
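For reference, a minimal byte-oriented run-length codec is sketched below. The (count, value) pair format with counts limited to 255 is one common RLE variant chosen for the example; the patent does not say which variant is used.

```python
def rle_encode(data):
    """Encode bytes as (count, value) pairs, with count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded):
    """Inverse of rle_encode: expand each (count, value) pair."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        out += bytes([encoded[i + 1]]) * encoded[i]
    return bytes(out)
```

RLE suits difference data that is mostly zero, since long runs of identical bytes collapse into a single pair; where the data is less regular, a general-purpose compressor such as zlib would usually be the safer choice.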
As shown in FIG. 3 and FIG. 4, the picture transmission method with clear character areas further includes a step in which the client decompresses and decodes the screen image data, specifically comprising:
B1) acquiring the picture compressed data packet sent by the server, and decompressing the picture compressed data packet;
B2) if the decompressed content includes a character differential compressed data packet, decompressing the character differential compressed data packet to obtain the character area detail set C, decoding the decompressed h264 data to obtain the reconstructed picture P_i2, synthesizing the character difference data in the character area detail set C with picture P_i2 to obtain a picture P_i3 with clear characters, and taking picture P_i3 as the final picture; otherwise, decoding the decompressed h264 data to obtain the reconstructed picture P_i2 and taking picture P_i2 as the final picture.
Synthesizing the character difference data in the character area detail set C with picture P_i2 to obtain the picture P_i3 with clear characters specifically comprises: acquiring the character unit blocks in picture P_i2 according to the position information in the character area detail set C, matching each character unit block against the character area detail set C to obtain the corresponding character difference data, and adding the YUV data of the character unit block to the corresponding character difference data to obtain the new YUV data of the character unit block.
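The client-side synthesis can be sketched as follows. The sketch assumes a hypothetical wire format that the patent does not define: set C arrives as zlib-compressed records of (block column, block row, byte length, difference bytes), and each difference byte was stored modulo 256. The decoded picture P_i2 is represented as a dictionary from block position to that block's sample bytes, purely to keep the example short.

```python
import struct
import zlib


def unpack_detail_set(detail_packet):
    """Parse the assumed record layout (ix, iy, length, diff bytes) after zlib."""
    raw = zlib.decompress(detail_packet)
    records, offset = [], 0
    while offset < len(raw):
        ix, iy, length = struct.unpack_from("<HHI", raw, offset)
        offset += 8                                   # size of the "<HHI" header
        records.append(((ix, iy), raw[offset:offset + length]))
        offset += length
    return records


def apply_difference(reconstructed, diff):
    """Add the difference back to the decoded samples, modulo 256."""
    return bytes((r + d) & 0xFF for r, d in zip(reconstructed, diff))


def synthesize(blocks_p2, detail_packet):
    """Return the block map of P_i3: P_i2 with the character blocks restored."""
    blocks_p3 = dict(blocks_p2)
    for pos, diff in unpack_detail_set(detail_packet):
        blocks_p3[pos] = apply_difference(blocks_p2[pos], diff)
    return blocks_p3
```

Blocks that have no entry in set C are copied through unchanged, so only the character unit blocks differ between P_i2 and P_i3.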
Thus, in the method of this embodiment, the screen is divided into unit blocks and the server performs character recognition per unit block, which reduces CPU consumption. Character recognition is performed only once, on unit blocks that have gone from a changed area to an unchanged area, and never on a changed area, which reduces CPU consumption further. Finally, a difference calculation between the original YUV data of the captured screen picture and the YUV data of the picture reconstructed after h264 encoding extracts the character difference data of the character areas, and the character difference data is packed and compressed together with the encoded h264 data and sent to the client. The client only needs to synthesize the character difference data with the reconstructed picture to obtain a picture with clear characters, so the display quality of characters is guaranteed while network bandwidth is saved.
Example two
This embodiment is basically the same as the first embodiment, except that a network judgment step is further included before step A5), which specifically comprises:
C1) judging whether the network condition meets a preset condition; if so, jumping to step A5); otherwise, proceeding to step C2);
C2) acquiring, according to the elements in the character area set B, the mutually corresponding character unit blocks in the picture Pi1 and the picture Pi2, performing difference calculation on the Y component data of the character unit block in the picture Pi1 and of the corresponding character unit block in the picture Pi2 to obtain the corresponding character difference data, adding the position information of the character unit block and the corresponding character difference data to the character area detail set C, and jumping to step A6).
Correspondingly, in the step in which the client decompresses and decodes the screen image data, step B3) specifically comprises: locating the corresponding character unit block in the picture Pi2 according to the position information in the character area detail set C, looking up the character difference data for that character unit block in the character area detail set C, and adding the YUV data or the Y component data of the character unit block to the corresponding character difference data to obtain the new YUV data of the character unit block.
Through the above steps, when the network condition is poor, data transmission between the server and the client uses less network bandwidth, and the client's picture can still display clear characters.
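The saving from step C2) can be seen in a small Python sketch of the server-side difference calculation. The plane layout (full-resolution nested lists per component), the (column, row) block index and the helper names are illustrative assumptions, not details taken from the patent.

```python
def block_diff(original, reconstructed, pos, block_w, block_h,
               components=("Y", "U", "V")):
    """Difference (original minus reconstructed) for one character unit block."""
    col, row = pos
    x0, y0 = col * block_w, row * block_h  # top-left pixel of the unit block
    diff = {}
    for comp in components:
        diff[comp] = [
            [original[comp][y][x] - reconstructed[comp][y][x]
             for x in range(x0, x0 + block_w)]
            for y in range(y0, y0 + block_h)
        ]
    return {"pos": pos, "diff": diff}


def payload_size(entry):
    """Number of difference values carried by one detail-set element."""
    return sum(len(r) for plane in entry["diff"].values() for r in plane)


p_i1 = {"Y": [[200, 16], [16, 200]],
        "U": [[128, 128], [128, 128]],
        "V": [[128, 128], [128, 128]]}
p_i2 = {"Y": [[190, 30], [20, 195]],
        "U": [[127, 128], [128, 129]],
        "V": [[128, 128], [128, 128]]}

full = block_diff(p_i1, p_i2, (0, 0), 2, 2)                       # step A5): Y, U and V
y_only = block_diff(p_i1, p_i2, (0, 0), 2, 2, components=("Y",))  # step C2): Y only
print(payload_size(full), payload_size(y_only))  # 12 4
```

With full-resolution planes the Y-only element carries one third of the values; with subsampled chroma the ratio would differ.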
Example three
This embodiment is basically the same as the second embodiment, except that a network judgment step is further included before step A6), which specifically comprises:
D1) judging whether the network condition meets a preset condition; if so, jumping to step A6); otherwise, proceeding to step D2);
D2) compressing the encoded h264 data into a picture compression data packet, sending the picture compression data packet to the client, and returning to step A1).
Through the above steps, on the basis of the second embodiment, this embodiment sends only the encoded h264 data under even worse network conditions, which keeps the client's picture smooth, and resumes transmission of the character differential compression data packet when the network condition improves.
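The combined decision flow of the second and third embodiments, together with the packing step, might look like the following Python sketch. The two boolean inputs stand in for the unspecified "preset conditions" of steps C1) and D1), and the packet layout (two 4-byte length fields followed by the payloads) is an assumption made for illustration; zlib is used because claim 10 names it as one possible compression algorithm.

```python
import struct
import zlib
from typing import Optional, Tuple


def plan_transmission(c1_ok: bool, d1_ok: bool) -> Tuple[Tuple[str, ...], bool]:
    """Return (components to difference, whether to send the difference packet)."""
    components = ("Y", "U", "V") if c1_ok else ("Y",)  # C1): full YUV or Y only
    return components, d1_ok                           # D1): send differences at all?


def pack_picture(h264: bytes, detail: Optional[bytes]) -> bytes:
    """Pack h264 data and an optional serialized detail set C into one packet."""
    text_packet = zlib.compress(detail) if detail is not None else b""
    header = struct.pack(">II", len(h264), len(text_packet))  # hypothetical layout
    return zlib.compress(header + h264 + text_packet)


def unpack_picture(packet: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Inverse of pack_picture; the detail set is None if none was sent."""
    body = zlib.decompress(packet)
    n_h264, n_text = struct.unpack(">II", body[:8])
    h264 = body[8:8 + n_h264]
    text_packet = body[8 + n_h264:8 + n_h264 + n_text]
    detail = zlib.decompress(text_packet) if n_text else None
    return h264, detail


components, send_diff = plan_transmission(c1_ok=False, d1_ok=True)
packet = pack_picture(b"\x00\x00\x01fake-h264", b"detail-set-C" if send_diff else None)
print(components, unpack_picture(packet))
```

On the client, a missing detail set corresponds to the "otherwise" branch of step B2): the decoded picture is displayed as is.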
The foregoing is merely a preferred embodiment of the invention and does not limit the invention in any form. Although the invention has been disclosed above with reference to the preferred embodiments, they are not intended to limit it. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the invention, without departing from the content of the technical solution of the invention, shall fall within the protection scope of the technical solution of the invention.

Claims (10)

1. A picture transmission method with clear text areas, characterized by comprising a step in which a server compresses and encodes screen image data, specifically comprising:
A1) capturing a screen picture as the current picture Pi according to a preset time, and adding the position information of each unit block that was located in a change area of the previous picture Pi-1 and is located in a non-change area of the current picture Pi to the current character recognition area set A, wherein the unit blocks are areas into which the screen is divided by rows and columns;
A2) transcoding the current picture Pi into a picture Pi1 in YUV format;
A3) acquiring the unit blocks to be recognized in the picture Pi1 according to the elements in the current character recognition area set A, performing character recognition on the Y component of each unit block to be recognized according to a character recognition algorithm, and adding the position information of each successfully recognized unit block to the current character area set B;
A4) performing h264 encoding on the picture Pi1 to obtain encoded h264 data and a reconstructed picture Pi2;
A5) acquiring, according to the elements in the current character area set B, the mutually corresponding character unit blocks in the picture Pi1 and the picture Pi2, performing difference calculation on the YUV data of the character unit block in the picture Pi1 and of the corresponding character unit block in the picture Pi2 to obtain the corresponding character difference data, and adding the position information of the character unit block and the corresponding character difference data to the current character area detail set C;
A6) compressing the current character area detail set C according to a compression algorithm to obtain a character differential compression data packet, packing and compressing the encoded h264 data and the character differential compression data packet into a picture compression data packet, and sending the picture compression data packet to the client.
2. The method for transmitting pictures with clear text areas according to claim 1, further comprising a step in which the client decompresses and decodes the screen image data, specifically comprising:
B1) acquiring the picture compression data packet sent by the server, and decompressing the picture compression data packet;
B2) if the decompressed content includes a character differential compression data packet, decompressing that packet to obtain the character area detail set C, decoding the decompressed h264 data to obtain a reconstructed picture Pi2, synthesizing the character difference data in the character area detail set C with the picture Pi2 to obtain a picture Pi3 with clear characters, and taking the picture Pi3 as the final picture; otherwise, decoding the decompressed h264 data to obtain the reconstructed picture Pi2 and taking the picture Pi2 as the final picture; wherein synthesizing the character difference data in the character area detail set C with the picture Pi2 to obtain the picture Pi3 with clear characters specifically comprises: locating the corresponding character unit block in the picture Pi2 according to the position information in the character area detail set C, looking up the character difference data for that character unit block in the character area detail set C, and adding the YUV data of the character unit block to the corresponding character difference data to obtain the new YUV data of the character unit block.
3. The method for transmitting pictures with clear text areas according to claim 1, wherein step A1) is preceded by a step of dividing the unit blocks, specifically comprising: dividing the screen into nw rows and nh columns of equally sized unit blocks according to a preset unit length w and a preset unit width h, defining a flag set flag[nw][nh] covering all the unit blocks, and setting all flags in the flag set flag[nw][nh] to 0.
4. The method for transmitting pictures with clear text areas according to claim 3, wherein step A1) specifically comprises: taking all unit blocks corresponding to the area of the current picture Pi that has changed relative to the previous picture Pi-1 as first unit blocks, and taking all unit blocks corresponding to the area of the current picture Pi that has not changed relative to the previous picture Pi-1 as second unit blocks; setting the flag corresponding to each first unit block in the flag set flag[nw][nh] to 1; and matching each second unit block against the flag set flag[nw][nh], and, if the flag corresponding to the second unit block is 1, adding the position information of the second unit block to the current character recognition area set A and setting the flag corresponding to the second unit block in the flag set flag[nw][nh] to 0.
5. The method for transmitting pictures with clear text areas according to claim 1, wherein in step A1), capturing the screen picture as the current picture Pi according to the preset time specifically comprises: judging whether the screen picture changes within the preset time; if so, capturing the current screen picture as the current picture Pi; otherwise, taking the previous picture Pi-1 as the current picture Pi.
6. The method for transmitting pictures with clear text areas according to claim 1, wherein step A1) further comprises a processing step for the case where the current character recognition area set A is empty: if the current character recognition area set A is empty, transcoding the current picture Pi into the picture Pi1 in YUV format, performing h264 encoding to obtain encoded h264 data, compressing the encoded h264 data into a picture compression data packet, and sending the picture compression data packet to the client.
7. The method for transmitting pictures with clear text areas according to claim 1, wherein step A5) is preceded by a network judgment step, which specifically comprises:
C1) judging whether the network condition meets a preset condition; if so, jumping to step A5); otherwise, proceeding to step C2);
C2) acquiring, according to the elements in the character area set B, the mutually corresponding character unit blocks in the picture Pi1 and the picture Pi2, performing difference calculation on the Y component data of the character unit block in the picture Pi1 and of the corresponding character unit block in the picture Pi2 to obtain the corresponding character difference data, adding the position information of the character unit block and the corresponding character difference data to the character area detail set C, and jumping to step A6).
8. The method for transmitting pictures with clear text areas according to claim 1, wherein step A6) is preceded by a network judgment step, which specifically comprises:
D1) judging whether the network condition meets a preset condition; if so, jumping to step A6); otherwise, proceeding to step D2);
D2) compressing the encoded h264 data into a picture compression data packet, sending the picture compression data packet to the client, and returning to step A1).
9. The method for transmitting pictures with clear text areas according to claim 1, wherein the character recognition algorithm in step A3) is the maximally stable extremal regions (MSER) algorithm.
10. The method for transmitting pictures with clear text areas according to claim 1, wherein the compression algorithm in step A6) is a run-length compression algorithm or a zlib compression algorithm.
CN202011338605.3A 2020-11-25 2020-11-25 Picture transmission method with clear character area Active CN112511860B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011338605.3A CN112511860B (en) 2020-11-25 2020-11-25 Picture transmission method with clear character area

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011338605.3A CN112511860B (en) 2020-11-25 2020-11-25 Picture transmission method with clear character area

Publications (2)

Publication Number Publication Date
CN112511860A CN112511860A (en) 2021-03-16
CN112511860B true CN112511860B (en) 2022-05-24

Family

ID=74958584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011338605.3A Active CN112511860B (en) 2020-11-25 2020-11-25 Picture transmission method with clear character area

Country Status (1)

Country Link
CN (1) CN112511860B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254160A (en) * 2011-07-12 2011-11-23 央视国际网络有限公司 Video score detecting and recognizing method and device
CN102630043A (en) * 2012-04-01 2012-08-08 北京捷成世纪科技股份有限公司 Object-based video transcoding method and device
CN110351564A (en) * 2019-08-08 2019-10-18 上海纽菲斯信息科技有限公司 A kind of text clearly video compress transmission method and system
CN111918065A (en) * 2019-05-08 2020-11-10 中兴通讯股份有限公司 Information compression/decompression method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6892625B2 (en) * 2016-07-29 2021-06-23 ブラザー工業株式会社 Data processing equipment and computer programs

Also Published As

Publication number Publication date
CN112511860A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant