CN113891145A - Super high definition video preprocessing main visual angle roaming playing system and mobile terminal - Google Patents

Super high definition video preprocessing main visual angle roaming playing system and mobile terminal

Info

Publication number
CN113891145A
CN113891145A (application CN202111341472.XA)
Authority
CN
China
Prior art keywords
video
target
target object
unit
mobile terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111341472.XA
Other languages
Chinese (zh)
Other versions
CN113891145B (en)
Inventor
张宏
王付生
鲁泳
王立光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhonghe Ultra Hd Collaborative Technology Center Co ltd
Original Assignee
Beijing Zhonghe Ultra Hd Collaborative Technology Center Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhonghe Ultra Hd Collaborative Technology Center Co ltd filed Critical Beijing Zhonghe Ultra Hd Collaborative Technology Center Co ltd
Priority to CN202111341472.XA
Publication of CN113891145A
Application granted
Publication of CN113891145B
Current legal status: Active

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 - Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 - Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 - Monitoring of end-user related data
    • H04N21/44222 - Analytics of user selections, e.g. selection of programs or purchase activity
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 - End-user applications
    • H04N21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks

Abstract

The invention provides an ultra-high-definition video preprocessing main-view roaming playback system and a mobile terminal. The system comprises a first video processing server and a data sending server, both of which can access a shared storage in which a target video is stored. The first video processing server determines, based on a preset target object and the target video, the position parameter of the target object in each frame of the target video. The data sending server sends the target video; if an operation request for a target object of the target video is received, it sends the target video together with the position parameter of the target object in each frame of the target video. The invention enables a user to automatically track and watch an object of interest, improving the viewing experience and meeting the user's demand for high-quality viewing.

Description

Super high definition video preprocessing main visual angle roaming playing system and mobile terminal
Technical Field
The invention relates to the technical field of video playback, and in particular to an ultra-high-definition video server preprocessing main-view roaming playback system and a mobile terminal.
Background
At present, many video sources shot by front-end equipment have resolutions of 4K or 8K, and front-end equipment capable of shooting 16K video is under development. However, the resolution of small playback devices such as mobile terminals (mobile phones, tablet computers, and the like) does not reach 4K, 8K, or 16K, so original ultra-high-definition video at 4K, 8K, or 16K cannot be displayed point-to-point on them.
When the resolutions of the video source and the playback device do not match, the video source is usually scaled down to the resolution of the mobile terminal for display, that is, high resolution is converted into low resolution for viewing. This conventional approach only makes the video content viewable; it cannot provide pixel-level viewing of a specific object the user wants to watch.
In the related art, when playing a video, a user's mobile terminal can only play a down-converted, reduced picture of the ultra-high-definition video, or play a local region in ultra-high definition according to an operation instruction from the user. It cannot automatically track an object of interest, which results in a poor viewing experience and fails to meet the user's demand for high-quality viewing.
Meanwhile, when a sports event is rebroadcast, multiple cameras are usually used to shoot the venue, but viewers can only see the partial pictures selected by broadcast switching: at any moment a viewer sees only the picture from a particular camera at a specific viewing angle, covering only part of the playing field. If the athlete the viewer cares about is not in that picture, the viewer cannot see the athlete even if another camera is shooting them.
Because ultra-high-definition cameras have very high resolution, a single ultra-high-definition camera could potentially replace several ordinary cameras to cover the rebroadcast of the whole field. However, the prior art offers no convenient way for different viewers to each see a clear picture of the player they care about.
Disclosure of Invention
In view of this, the present invention provides an ultra-high-definition video server preprocessing main-view roaming playback system that automatically tracks the object a user wants to watch, improving the user's viewing experience and thereby satisfying the user's demand for high-quality viewing.
In a first aspect, an embodiment of the present invention provides an ultra-high-definition video server preprocessing main-view roaming playback system, where the system includes a first video processing server and a data sending server; the first video processing server and the data sending server can access a shared storage in which a target video is stored; the first video processing server is used for determining, based on a preset target object and the target video, the position parameter of the target object in each frame of the target video; the data sending server is used for sending the target video and, if an operation request for a target object of the target video is received, for sending the target video together with the position parameter of the target object in each frame of the target video.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation manner of the first aspect, where the first video processing server includes a target identification unit, a comparison unit, and a target tracking unit; the target identification unit is used for determining metadata of a target object; the metadata indicates image features of the target object; the comparison unit is used for comparing the image characteristics of the target object with the image information of each picture frame in the target video to obtain a comparison result; determining the parameter information of the target object in each picture frame according to the comparison result; the target tracking unit is used for determining the position parameter of the target object in each picture frame according to the parameter information; the target object is included in the picture position indicated by the position parameter.
With reference to the first possible implementation manner of the first aspect, an embodiment of the present invention provides a second possible implementation manner of the first aspect, where the parameter information includes an outline and a center position of the target object in the picture frame; the target tracking unit comprises a central determination subunit; the location parameter includes a zoom center; for each picture frame, the center determining subunit is configured to determine a zoom center of the picture frame based on a center position of the target object in the picture frame.
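As a rough illustration of the center determining subunit described above, the following Python sketch derives a zoom center from the target object's per-frame center position; it is an assumption for illustration only, not the patent's specified implementation, and all names are hypothetical.

    # Hypothetical sketch: turn the target object's per-frame center position
    # into a zoom center, clamped so a crop window of the requested size
    # never extends beyond the picture frame.
    def zoom_center(center_xy, frame_w, frame_h, crop_w, crop_h):
        cx, cy = center_xy
        half_w, half_h = crop_w / 2.0, crop_h / 2.0
        cx = min(max(cx, half_w), frame_w - half_w)
        cy = min(max(cy, half_h), frame_h - half_h)
        return cx, cy

    # Example: an object centred at (100, 50) in an 8K frame, with a 1920x1080
    # crop, yields a zoom center clamped to (960.0, 540.0).
    print(zoom_center((100, 50), 7680, 4320, 1920, 1080))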
In a second aspect, an embodiment of the present invention provides an ultra-high-definition video preprocessing main-view roaming mobile terminal; the mobile terminal comprises a second video processing module, a video playing module and a communication module; the second video processing module is used for receiving a touch operation request input by a user for the target video and, if the touch operation request contains an operation instruction for the target object, for processing the target video based on the operation instruction and the position parameter of the target object in each frame of the target video to obtain processed video data corresponding to the touch operation; if a frame of the target video contains the target object, the corresponding frame of the processed video data also contains the target object; the video playing module plays the processed video data corresponding to the touch operation; the communication module is used for sending the operation request for the target video.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation manner of the second aspect, where the second video processing module includes an operation information parsing storage unit and a video/audio processing module; the operation information analysis storage unit is used for storing touch operation aiming at the target video input by a user, forming an operation instruction according to the touch operation analysis and sending the operation instruction to the video and audio processing module; the video and audio processing module is used for processing the target video based on the touch operation and the position parameter of the target object in the target video to obtain video data corresponding to the touch operation.
With reference to the first possible implementation manner of the second aspect, an embodiment of the present invention provides a second possible implementation manner of the second aspect, where the video and audio processing module includes a target object determining unit and a video scaling unit; the position parameters comprise the zooming positions of a plurality of target objects in each picture frame of the target video; the target object determination unit is used for determining a selected target object in a plurality of target objects of the target video based on touch operation; the video zooming unit is used for processing the target video according to the zooming position of the selected target object and the zooming ratio of the current video playing to obtain video data corresponding to the zooming position and the zooming ratio.
With reference to the first possible implementation manner of the second aspect, an embodiment of the present invention provides a third possible implementation manner of the second aspect, where the mobile terminal further includes a buffering module; the buffer module comprises a roaming image buffer unit and a display buffer unit; a roaming image buffer unit for storing video data including a target object and transmitting the video data including the target object to the display buffer unit; the display buffer unit is used for playing the video data.
With reference to the first possible implementation manner of the second aspect, the embodiment of the present invention provides a fourth possible implementation manner of the second aspect, wherein the touch operation further includes a region-of-interest switching operation; the video and audio processing module is further used for, when the region-of-interest switching operation is identified, processing the video data of the selected region of interest corresponding to the current video data according to a preset scaling ratio and sending it to the display buffer unit, so that the display buffer unit plays the video data of the currently selected region of interest.
With reference to the first possible implementation manner of the second aspect, an embodiment of the present invention provides a fifth possible implementation manner of the second aspect, where the mobile terminal further includes a down-conversion unit and an aliasing unit; the operating instructions further include an aliasing instruction; the down-conversion unit is used for performing down-conversion processing on a complete video picture of a video to be played to generate a base map video; and the aliasing unit is used for performing aliasing processing on the base map video and the video picture corresponding to the operation request to obtain video data containing the floating window.
With reference to the first possible implementation manner of the second aspect, an embodiment of the present invention provides a sixth possible implementation manner of the second aspect, where the mobile terminal further includes an audio adjusting unit; and the audio adjusting unit is used for converting the panoramic audio corresponding to the video to be played into the stereo audio corresponding to the target object.
The embodiment of the invention has the following beneficial effects:
the invention provides an ultra-high-definition video server preprocessing main-view roaming playback system and a mobile terminal. The system comprises a first video processing server and a data sending server, both of which can access a shared storage in which a target video is stored. The first video processing server determines, based on a preset target object and the target video, the position parameter of the target object in each frame of the target video. The data sending server sends the target video; if an operation request for a target object of the target video is received, it sends the target video together with the position parameter of the target object in each frame of the target video. The system enables a user to automatically track and watch an object of interest, improves the viewing experience, and meets the user's demand for high-quality viewing.
With the ultra-high-definition video server preprocessing main-view roaming playback system provided by the invention, a viewer can conveniently zoom in or drag the picture on his or her own mobile terminal and select a region of interest to watch. One shooting can therefore serve many viewing versions. The invention also provides a way for a viewer to select an athlete of interest and always display on the mobile terminal an enlarged, clear picture centered on that athlete, which is equivalent to offering different audiences free viewing from multiple angles using a single ultra-high-definition video source. Because the picture displayed on the mobile terminal moves with the athlete's movement across the original ultra-high-definition picture, the viewing picture automatically "roams".
After server preprocessing, several preprocessed videos of different definition versions, each centered on a key object, can be generated and stored directly; when a user wants to watch a video centered on a particular key object, the video of that athlete at the corresponding definition is retrieved and transmitted to the user's mobile terminal for playback. The invention thus provides users with multi-view video centered on a commonly followed key object (such as a particular star player) while greatly saving transmission bandwidth. Moreover, because the server preprocesses and generates the data once, the information of all key objects can be identified in a single pass and stored separately, avoiding repeated identification whenever a different user requests a different key object and greatly reducing the computing load of the whole system. In addition, several ultra-high-definition cameras capable of shooting the complete picture of the whole field can be provided; when a mobile terminal generates a video centered on a certain key object, it requests from the server a video source containing the front-facing image of the player of interest and automatically switches to that source to generate the final processed video, so that the viewer obtains a better viewing experience.
Additional features and advantages of the disclosure will be set forth in the description which follows, or in part may be learned by practice of the techniques of the disclosure.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic structural diagram of an ultra high definition video preprocessing main view roaming playing system according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of another ultra high definition video preprocessing main view roaming mobile terminal according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of another ultrahigh-definition video preprocessing main view roaming playing system according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, many video sources shot by front-end equipment have resolutions of 4K or 8K, and front-end equipment capable of shooting 16K video is under development. However, the resolution of small playback devices such as mobile terminals (mobile phones, tablet computers, and the like) does not reach 4K, 8K, or 16K, so original ultra-high-definition video at 4K, 8K, or 16K cannot be displayed point-to-point on them.
When the resolutions of the video source and the playback device do not match, the video source is usually scaled down to the resolution of the mobile terminal for display, that is, high resolution is converted into low resolution for viewing. This conventional approach only makes the video content viewable; it cannot provide pixel-level viewing of a specific object the user wants to watch.
Based on this, the ultra-high-definition video preprocessing main-view roaming playback system and mobile terminal for recorded and broadcast video provided by the embodiments of the invention allow the object a user wants to watch to be automatically tracked and viewed in ultra-high definition, improving the user experience and meeting the user's demand for high-quality viewing.
To facilitate understanding of the embodiment, a detailed description is first given to the ultrahigh-definition video server preprocessing main view roaming playing system disclosed in the embodiment of the present invention.
The embodiment of the invention provides an ultra-high-definition video preprocessing main-view roaming playback system. As shown in fig. 1, the system includes a first video processing server 10 and a data transmission server 20 connected in communication. The servers can read a pre-stored target video; the target video may be stored in a video server but is usually stored in a shared storage device, which the servers read over a network. The first video processing server and the data transmission server can both access the shared storage in which the target video is stored.
In the working process of the system, the first video processing server is used for determining the position parameter of the target object in each frame of image of the target video based on a preset target object and the target video; the data sending server is used for sending the target video, and if an operation request aiming at a target object of the target video is received, the target video and the position parameter of the target object in each frame image of the target video are sent.
The system can also be communicatively connected with a preset mobile terminal. The mobile terminal comprises a second video processing module, a video playing module and a communication module. The second video processing module is used for receiving a touch operation request input by the user for the target video and, if the touch operation request contains an operation instruction for the target object, for processing the target video based on the operation instruction and the position parameter of the target object in each frame of the target video to obtain processed video data corresponding to the touch operation; the position parameter of the target object in each frame of the target video is sent to the mobile terminal by the playback system. If a frame of the target video contains the target object, the corresponding frame of the processed video data also contains the target object. The video playing module plays the processed video data corresponding to the touch operation, and the communication module is used for sending the operation request for the target video.
Specifically, the first video processing server comprises a target identification unit, a comparison unit and a target tracking unit. The target identification unit is generally used for determining the metadata of the target object; the metadata indicates image features of the target object. The comparison unit is generally configured to compare the image features of the target object with the image information of each picture frame in the target video to obtain a comparison result, and to determine the parameter information of the target object in each picture frame according to the comparison result. The target tracking unit is generally configured to determine, from the parameter information, the position parameter of the target object in each frame; the picture position indicated by the position parameter contains the target object.
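The patent does not fix a particular comparison algorithm; as one hedged illustration, the Python sketch below uses OpenCV template matching to locate a preset target in every frame and record its position parameters. All function and field names are assumptions, and template matching is only one possible realization of the comparison unit.

    # Illustrative only: per-frame template matching as one possible way the
    # comparison unit could match the target object's image features against
    # each picture frame and record position parameters for the tracker.
    import cv2

    def locate_target_per_frame(video_path, target_template_gray, threshold=0.7):
        th, tw = target_template_gray.shape[:2]
        positions = []                           # one entry per picture frame
        cap = cv2.VideoCapture(video_path)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            res = cv2.matchTemplate(gray, target_template_gray, cv2.TM_CCOEFF_NORMED)
            _, score, _, top_left = cv2.minMaxLoc(res)
            if score >= threshold:               # comparison result: target present
                x, y = top_left
                positions.append({"bbox": (x, y, tw, th),
                                  "center": (x + tw / 2.0, y + th / 2.0)})
            else:
                positions.append(None)           # target absent from this frame
        cap.release()
        return positions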
Further, in order to determine the position information of the target object, the parameter information may include the contour and center position of the target object in the picture frame. The second video processing module of the mobile terminal generally comprises an operation information storage unit and a video and audio processing module. In a specific implementation, after storing the touch operation input by the user for the target video, the operation information storage and analysis unit parses the touch operation into an operation instruction and sends the instruction to the video and audio processing module; the video and audio processing module can then process the target video based on the touch operation and the position parameter of the target object in the target video to obtain video data corresponding to the touch operation.
Furthermore, the video and audio processing module also comprises a target object determining unit and a video zooming unit. The position parameters comprise the position information of a plurality of target objects in each picture frame of the target video. Specifically, the target object determining unit determines, based on the touch operation, which of the several target objects of the target video has been selected. Generally, once the selected target object is determined, its contour is superimposed on the currently played video image according to the position and contour information of that object, for example by marking the contour with a green line, so as to give the user feedback that the target object has been selected. The target video is then processed according to the user's operation instruction and the zoom ratio of the current playback to obtain video data corresponding to the position of the selected target object and the zoom ratio.
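A minimal sketch of what the video zooming unit could do with the selected object's zoom center and the current zoom ratio; this is an illustrative assumption, not code prescribed by the patent.

    # Hypothetical video zooming step: crop a window around the selected
    # target's zoom center at the current zoom ratio and resize it to the
    # mobile terminal's display resolution.
    import cv2

    def zoom_to_target(frame, center_xy, zoom_ratio, out_w, out_h):
        fh, fw = frame.shape[:2]
        crop_w, crop_h = int(fw / zoom_ratio), int(fh / zoom_ratio)
        cx, cy = center_xy
        # keep the crop window inside the original ultra-high-definition frame
        x0 = int(min(max(cx - crop_w / 2.0, 0), fw - crop_w))
        y0 = int(min(max(cy - crop_h / 2.0, 0), fh - crop_h))
        crop = frame[y0:y0 + crop_h, x0:x0 + crop_w]
        return cv2.resize(crop, (out_w, out_h))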
Furthermore, the mobile terminal also comprises a buffer module; the buffer module may include a roaming image buffer unit and a display buffer unit; specifically, the roaming image buffer unit is used for storing video data including a target object and sending the video data including the target object to the display buffer unit; the display buffer unit is used for playing the video data.
Further, the touch operation also includes a region-of-interest switching operation. Specifically, when the region-of-interest switching operation is identified, the video and audio processing module processes the video data of the selected region of interest corresponding to the current video data according to a preset scaling ratio and sends it to the display buffer unit, so that the display buffer unit plays the video data of the currently selected region of interest. In a sports live broadcast, certain regions of interest are usually preset, such as the goal area, the corner area, or the penalty area of a football match; when a user clicks a region of interest, that region is automatically enlarged and displayed.
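To make the region-of-interest switching concrete, here is a hedged sketch with hypothetical preset regions; the region names, coordinates and preset scale are invented for illustration only.

    # Illustrative preset regions of interest for a football broadcast,
    # expressed as (x, y, w, h) rectangles in source-frame coordinates.
    import cv2

    PRESET_ROIS = {
        "left_goal_area": (0, 1200, 1600, 1800),      # hypothetical values
        "right_corner":   (6000, 0, 1680, 1680),
    }

    def switch_to_roi(frame, roi_name, preset_scale=2.0):
        x, y, w, h = PRESET_ROIS[roi_name]
        crop = frame[y:y + h, x:x + w]
        # enlarge by the preset scaling ratio before handing to the display buffer
        return cv2.resize(crop, (int(w * preset_scale), int(h * preset_scale)))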
Further, the mobile terminal also comprises a down-conversion unit and an aliasing unit; the operating instructions further include an aliasing instruction; specifically, the down-conversion unit is used for performing down-conversion processing on a complete video picture of a video to be played to generate a base map video; and the aliasing unit is used for performing aliasing processing on the base map video and the video picture corresponding to the operation request to obtain video data containing the floating window.
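The down-conversion and aliasing described above amount to compositing a reduced "base map" of the full picture over the zoomed view as a floating window. The sketch below assumes simple alpha blending with invented window size, position and transparency values; it is only one plausible realization.

    # Hedged sketch: down-convert the full picture to a small base map and
    # alpha-blend it onto the zoomed frame as a floating window.
    import cv2

    def overlay_floating_window(zoomed_frame, full_frame,
                                win_w=480, win_h=270, margin=20, alpha=0.7):
        base = cv2.resize(full_frame, (win_w, win_h))               # down-conversion
        h, w = zoomed_frame.shape[:2]
        x0, y0 = w - win_w - margin, margin                         # top-right corner
        roi = zoomed_frame[y0:y0 + win_h, x0:x0 + win_w]
        blended = cv2.addWeighted(base, alpha, roi, 1.0 - alpha, 0) # aliasing
        zoomed_frame[y0:y0 + win_h, x0:x0 + win_w] = blended
        return zoomed_frame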
In addition, the mobile terminal also comprises an audio adjusting unit; and the audio adjusting unit is used for converting the panoramic audio corresponding to the video to be played into the stereo audio corresponding to the target object.
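The patent does not state the format of the panoramic audio. Purely as an illustrative assumption, if the panoramic audio were first-order ambisonics (B-format W/X/Y channels), the audio adjusting unit could derive a stereo pair steered toward the target object roughly as follows; every detail here is an assumption, not the patent's method.

    # Assumption-laden sketch: two virtual cardioid microphones, decoded from
    # first-order ambisonic channels and pointed +/-30 degrees around the
    # target object's azimuth, give a target-centred stereo downmix.
    import numpy as np

    def panoramic_to_target_stereo(W, X, Y, target_azimuth_rad, spread=np.pi / 6):
        def virtual_cardioid(theta):
            return 0.5 * (np.sqrt(2.0) * W + X * np.cos(theta) + Y * np.sin(theta))
        left = virtual_cardioid(target_azimuth_rad + spread)
        right = virtual_cardioid(target_azimuth_rad - spread)
        return np.stack([left, right], axis=-1)   # shape: (samples, 2)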
The invention provides an ultra-high-definition video server preprocessing main-view roaming playback system comprising a first video processing server and a data sending server, both of which can access a shared storage in which a target video is stored. The first video processing server determines, based on a preset target object and the target video, the position parameter of the target object in each frame of the target video. The data sending server sends the target video; if an operation request for a target object of the target video is received, it sends the target video together with the position parameter of the target object in each frame of the target video. This enables the user to automatically track and watch the object of interest, improves the viewing experience, and meets the user's demand for high-quality viewing.
The embodiment of the invention also provides another ultra-high-definition video server preprocessing main-view roaming playback system, implemented on the basis of the system shown in fig. 1.
The mobile terminal identifies the user's operation information and obtains from the server the video picture containing the key object, processed according to the scaling ratio, for playback, so that the user can watch the video containing the object of interest in a personalized, clearer and more detailed way, achieving the goal of roaming. The structure of the mobile terminal of the system is shown in fig. 2, and the schematic diagram of the interaction between the mobile terminal and the server is shown in fig. 3.
In this system, the server performs certain processing on the video source in advance: key objects are identified in the video source, the key object information contained in the video is sent to the mobile terminal, and the corresponding video and audio information is sent to the mobile terminal for playback.
Because the server identifies all key object information in one pass, the key objects need not be identified on the mobile terminal, and the identified key object information can be reused by other mobile terminal users, greatly reducing the computational complexity of the whole system. The mobile terminal identifies the user's operation information and, assisted by the key object information identified in advance by the server, processes the video to be played to obtain a scaled picture containing the key object, so that the user can view the video containing the object of interest in a personalized, clearer and more detailed manner and always watch the key object of interest through a clear picture.
The mobile terminal comprises at least the modules shown in fig. 2, described in detail as follows:
The mobile terminal includes at least an operation information storage module, a video and audio processing module, a communication module and a buffer module.
The operation information storage module is connected with the touch screen of the mobile terminal; it receives the user's operation information and sends it to the video and audio processing module. The video and audio processing module identifies the user's instruction or instruction parameters from the operation information, and the communication module communicates with the server to obtain the ultra-high-definition video the user wants to watch.
As shown in fig. 3, the server includes:
and the adjusting control unit is used for calling the unit corresponding to the instruction to process the image based on the parameter identified by the video and audio processing module from the operation information.
And the target tracking unit is used for tracking the key object in the video based on the calling instruction of the adjusting control unit so as to realize the purpose that the user watches by taking the key object as the center.
The mobile terminal includes:
and the video zooming unit is used for zooming the original video image based on the acquired position parameter of the target object and the touch operation request aiming at the target video input by the user so as to realize the purpose of zooming and watching by the user.
And the metadata unit is used for calling metadata from the acquired data of the communication module so as to achieve the purpose of displaying the information of the key object.
And the down-conversion unit is used for performing down-conversion processing on the whole image corresponding to the video image of the ultra-high definition video signal based on the calling instruction of the adjusting control unit to obtain the whole image corresponding to the video image with reduced resolution, so that the purpose that a user displays the complete video in a suspended window mode is realized, or the purpose that the reduced resolution is played on a mobile phone is realized.
And the aliasing unit is used for superposing the picture of the region of interest and the complete picture based on the calling instruction of the adjusting control unit so as to realize the purpose that the user displays the complete video in the form of a floating window.
And the audio adjusting unit is used for adjusting the audio based on the calling instruction of the adjusting control unit so as to realize the purpose of switching the sound effect.
And the roaming control unit is used for selecting a roaming area based on the calling instruction of the adjusting control unit, or controlling the current picture to play by taking the selected target object as the center in real time according to the position information of the selected key object so as to realize the purpose of roaming watching.
The video processed by the video and audio processing module is buffered in the buffering module, then the buffering module displays the video on a display screen of the mobile terminal through the display driving circuit, and the buffering module plays audio through the loudspeaker driving circuit.
The buffer module further includes the following units (an illustrative data layout for the key object information is sketched after this list):
The key object information buffer unit is used for buffering information about the key objects, for example the position and contour of each key object in each frame of the video source;
The metadata buffer unit is used for buffering metadata;
The region-of-interest image buffer unit is used for buffering the video image of the region of interest and is specifically connected with the video zooming unit;
The original video frame image buffer unit is used for buffering the original video image;
The audio buffer unit is used for buffering audio and is specifically connected with the audio adjusting unit;
The base map buffer unit is used for buffering the complete image at reduced resolution and is specifically connected with the down-conversion unit and the aliasing unit in the video and audio processing module;
The roaming image buffer unit is used for buffering images of the roaming area and is specifically connected with the roaming control unit in the video and audio processing module;
The audio output buffer unit is used for transmitting the audio to be played to the loudspeaker driving circuit, which drives the loudspeaker, and is specifically connected with the audio buffer unit;
The display buffer unit is used for transmitting one or more of the original video image, the video image of the region of interest, the video image of the roaming area and the metadata to the display driving circuit, which drives the display screen, and is specifically connected with the metadata buffer unit, the region-of-interest image buffer unit, the original video frame image buffer unit and the base map buffer unit.
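As referenced above, an illustrative data layout for the per-frame key object information held in the key object information buffer unit might look like the following; the field names and types are hypothetical and not specified by the patent.

    # Hypothetical layout only: per frame, per key object, the contour and
    # position recorded by the server and received via the communication module.
    from dataclasses import dataclass, field
    from typing import Dict, List, Tuple

    @dataclass
    class KeyObjectInfo:
        object_id: str                       # e.g. a player's name or number
        center: Tuple[float, float]          # zoom center in source pixels
        contour: List[Tuple[int, int]]       # outline points in source pixels

    @dataclass
    class FrameKeyObjects:
        frame_index: int
        objects: Dict[str, KeyObjectInfo] = field(default_factory=dict)

    # The buffer is then an ordered list of FrameKeyObjects entries.
    key_object_buffer: List[FrameKeyObjects] = []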
The communication module is connected with the server and is used for receiving the video information and other information provided by the server, such as the position information of the key object.
The following is specifically described:
the operation information storage module stores operation information of a user, and the source of the operation information comprises but is not limited to (1) operation information of the user on a touch screen; (2) through the operation of the mobile phone keys; (3) the user can sense the operation through the mobile phone (such as gravity sensing). As long as the user operates the mobile terminal, the operation information may be recorded as operation information, and the operation information obtained from the touch panel will be described below as an example.
The operation information storage and analysis module receives and stores the user's operation information; after storing a touch operation input by the user for the target video, it parses the touch operation into an operation instruction and sends it to the video and audio processing module. The video and audio processing module can process the target video based on the touch operation and the position parameter of the target object in the target video to obtain video data corresponding to the touch operation, and sends the processed video to the roaming image buffer unit of the buffer module; the roaming image buffer unit sends the video to the display buffer unit for playback.
Specifically, the above target tracking is an auxiliary means in the key-person identification stage: when motion is continuous and there is no shot switching, the position and contour of a key person identified in the previous frame can be found relatively easily near the previous position, which reduces the difficulty and amount of computation of identification. This is the recognition phase.
After the roaming control unit specifies that a certain key person is to be the center of roaming, that person's center (which may be the center of the face or the body's center of gravity) is set as the center of the picture to be displayed, and the picture to be displayed is extracted from the original image at the current display scale. Because the key person is at the center of every displayed picture and the person is moving, the end result is as if the displayed picture were roaming within the original picture.
1. The video and audio processing module can identify the intention to zoom and the scaling ratio from the operation information. The mobile terminal can (1) support stepless magnification (between the resolution of the mobile terminal and that of the video source); (2) support multi-touch magnification, up to a magnification at which the displayed area has the same resolution as the video source, i.e. its pixels correspond one-to-one with the video source. Other magnification methods are also supported by the mobile terminal and are not described here.
2. The video and audio processing module can automatically identify key objects in the video source, or the video source received by the mobile terminal may already include preset key objects. A key object may be a key person, a key item, or the like. The mobile terminal can recognize from the operation information which key object the user has selected.
Specifically, the server may compare each frame of the video source with pre-stored image information of the key objects by similarity, mark the key objects in the image, and record and store the contour and position of each key object in each frame. When a user chooses to watch a video, this key object information (at least the contour and position of each key object) is stored in the key object information buffer unit so that the video and audio processing module can use it for identification. The video and audio processing module identifies the key object selected by the user based on the operation information (including, for example, the click position) and the key object information buffer unit.
The key objects can be determined according to users' viewing habits and the popularity of the people and main items in the video, such as referees, players and the ball. For example, when the video and audio processing module determines that the user's click position falls within the contour of a key object, it identifies that key object as the one selected by the user.
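A hedged sketch of the click-within-contour check described above, using OpenCV's point-in-polygon test; the function name and data structure are assumptions for illustration.

    # Illustrative selection test: return the key object whose buffered contour
    # for the current frame contains the user's click position.
    import numpy as np
    import cv2

    def selected_key_object(click_xy, frame_contours):
        """frame_contours: mapping object_id -> list of (x, y) contour points."""
        pt = (float(click_xy[0]), float(click_xy[1]))
        for object_id, contour in frame_contours.items():
            pts = np.array(contour, dtype=np.int32).reshape(-1, 1, 2)
            if cv2.pointPolygonTest(pts, pt, False) >= 0:   # inside or on the edge
                return object_id
        return None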
After the key object and the scaling ratio are identified as in points 1 and 2 above, they are sent to the server.
On the server, local pictures containing the key object's activity are extracted from the video source in advance and processed at a preset scaling ratio to obtain data packets of the video to be played; after receiving a request from the mobile terminal, the server sends the corresponding data packet to the mobile terminal. After the video and audio processing module of the mobile terminal performs decoding and related operations, the video to be played is sent to the buffer module, the buffer module drives the display driving circuit, and the display driving circuit drives the display screen for playback.
Specifically, the mobile terminal can recognize from the operation information that the user wants to change the playback view so that the played picture contains a key object. For example, after receiving operation information in which the user tracks a key object (for example, by clicking on it), the mobile terminal sends the key object to the server, and the server sends a pre-made video packet to the mobile terminal for playback.
Taking a football match as an example, many viewers may like Ronaldo (C罗), and Ronaldo may be preset as a key object. The operation information receiving module of the mobile terminal receives the operation information and sends it to the video and audio processing module; the video and audio processing module identifies the key object as Ronaldo because the click position falls within Ronaldo's contour, and identifies a magnification of 2 from the operation information that the two touch points slid apart to twice their original distance. This information is sent to the server, and the server sends the pre-made local video image, magnified by a factor of 2 and centered on Ronaldo, to the mobile terminal for playback.
The football itself can also be preset as a key object, so that when the mobile terminal receives a click on the football during playback, it sends the identified information to the server, and the server sends a pre-made data packet of the enlarged video image centered on the football to the mobile terminal.
The position of the key object in the enlarged video image may default to the center of the picture in the processing module, or several preset positions may be provided for the user to choose from.
Zooming methods include, but are not limited to, multi-touch zooming and click zooming. For example, the program in the video and audio processing module may be configured to zoom in when it receives a multi-touch gesture, that is, operation information in which the user touches the screen at two points and the two contact points gradually move apart. The video and audio processing module evaluates the sliding distance and sliding speed of the two contact points to perform the zoom (for example, on receiving the operation information that the two contact points slide outwards, it takes the midpoint of the two contact points as the zoom center and the ratio of the new distance to the original distance between the two points as the zoom ratio). As another example, the program may be configured to zoom in one step each time the user double-clicks. If the resolution of the mobile terminal is high definition (1920 x 1080) and the resolution of the video source is 8K, the video source has 16 times as many pixels as the mobile terminal can display, so the picture can be enlarged to the maximum with 4 double-clicks, meeting the user's viewing needs to the greatest extent. "Maximum enlargement" here means that the user views the most detailed, unblurred video image possible at the mobile terminal's resolution, i.e. point-to-point viewing (the maximum enlargement may also be set differently if desired, for example to 2 times the mobile terminal's native resolution).
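The zoom-ratio bookkeeping above can be summarised in a short sketch; the helper name and clamping policy are assumptions, while the 8K/1080p figures follow the example in the text.

    # Hedged sketch: a pinch gesture scales the current zoom ratio by the change
    # in distance between the two touch points; maximum zoom is point-to-point
    # display of source pixels.
    def pinch_zoom_ratio(old_dist, new_dist, current_ratio, src_wh, display_wh):
        max_ratio = min(src_wh[0] / display_wh[0], src_wh[1] / display_wh[1])
        ratio = current_ratio * (new_dist / old_dist)
        return max(1.0, min(ratio, max_ratio))

    # Example: an 8K (7680x4320) source on a 1920x1080 terminal allows up to 4x
    # linear magnification, i.e. 16x in pixel count, consistent with the
    # "maximum after 4 double-clicks" example if each double-click doubles the
    # displayed pixel count.
    print(pinch_zoom_ratio(100.0, 500.0, 1.0, (7680, 4320), (1920, 1080)))  # 4.0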
The system also provides that the video and audio processing module identifies the key object and sends it to the server. The server extracts the metadata corresponding to the key object in advance through the metadata unit and stores it in the data packet of the video to be played. After receiving the data packet, the video and audio processing module stores the metadata in the metadata buffer unit, which sends it to the display buffer unit so that the metadata of the key object is shown on the display screen to inform the user about the key object. Metadata is information about a key object, for example a person's basic information (name, age, score, number of goals, etc.) or an item's basic information (history, model, etc.).
The system also provides that the video and audio processing module can identify the magnification factor and the image roaming path from the operation information (roaming means changing, within the full frame of the video source, the position of the currently played view). The adjustment control unit of the mobile terminal extracts the video to be played from the video source based on the roaming path, forms a data packet, and sends it to the roaming image buffer unit, which sends the video to the display buffer unit for playback. The user's operation information may include a sliding gesture on the mobile terminal's screen, and the video and audio processing module recognizes the sliding path as the roaming path.
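A brief sketch of roaming playback along either a user-drawn path or the tracked key object's centers, reusing the zoom_to_target helper sketched earlier; this is illustrative only and not the patent's mandated procedure.

    # Hypothetical roaming loop: each output frame is a crop centred on the next
    # point of the roaming path (or the tracked object's center), extracted at
    # the current magnification.
    def roam(frames, centers_along_path, zoom_ratio, out_w, out_h):
        clips = []
        for frame, center in zip(frames, centers_along_path):
            if center is None:                     # object absent: keep frame center
                h, w = frame.shape[:2]
                center = (w / 2.0, h / 2.0)
            clips.append(zoom_to_target(frame, center, zoom_ratio, out_w, out_h))
        return clips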
The system also supports displaying the full picture of the video source in a small floating window while the mobile terminal plays an enlarged view of each frame of the video source.
Specifically, the adjusting control unit of the mobile terminal calls the down-conversion unit to perform down-conversion processing on each frame image of the video source, and caches each processed frame image to the base image buffer unit, and calls the aliasing unit to perform aliasing processing on each down-converted frame image and each amplified frame image to obtain the video to be played.
In this system the operation information also includes operations on the floating window, and the operation instructions identified by the video and audio processing module include, but are not limited to: (1) a transparency setting instruction (for example, a transparency settable between 0 and 90%); (2) an instruction to restore full-picture playback (for example, after the user double-clicks the floating window, playback switches from the local picture of the region of interest to the full video picture in full screen).
The system also provides that the video and audio processing module identifies a picture switching instruction from the operation information, and the adjustment control unit controls playback of the local video image of the previous region of interest held in the buffer module. For example, after receiving a click or double-click on a preset region, the mobile terminal switches back to the previous region of interest; if no previous region of interest exists, it can be configured to play an enlarged video image at the clicked position.
The system also comprises a communication module of the mobile terminal, which receives the image information (video source) and audio information.
The mobile terminal supports at least the playing of stereo sound and panoramic sound.
The mobile terminal not only amplifies the video image of the region of interest as described above, but also switches the sound of the complete video image to the sound effect of the region of interest, for example, switches the panoramic sound to the stereo sound of the corresponding region of interest through the audio adjusting unit, and stores the stereo sound in the data packet.
Preferably, each frame of the video source in this system is an ultra-high-definition image obtained by shooting a panoramic picture of the playing field (covering the entire field with nothing missed), and each frame records the performance of every key object on the field, so that when a key object is tracked and played, the local picture containing the key object has no temporal gaps and the user is guaranteed to see the key object's performance over the whole match.
The resolution of the mobile terminal refers to the resolution of its active picture area, i.e. the area of the phone screen that actually plays the video. For example, if there are blank areas above, below, or to the left and right of the phone screen where no video is played, those areas are not part of the active picture area; only in full-screen display (when the video occupies the entire phone display) does the resolution of the active picture area equal the resolution of the phone.
The first video processing server, the data transmission server, and the shared storage medium storing the target video may be implemented independently by a plurality of servers, or may be integrated in one server.
The computer program product provided in the embodiment of the present invention includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute the method described in the foregoing method embodiment, and specific implementation may refer to the method embodiment, which is not described herein again.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the system described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted", "connected", and "coupled" are to be construed broadly, e.g., as a fixed connection, a removable connection, or an integral connection; as a mechanical or an electrical connection; as a direct connection, an indirect connection through an intermediate medium, or internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art in specific cases.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present invention, intended to illustrate rather than limit its technical solutions, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope of the present disclosure, modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention and shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A super high definition video preprocessing main visual angle roaming playing system, characterized by comprising a first video processing server and a data sending server, wherein the first video processing server and the data sending server can both access a shared storage medium storing a target video;
the first video processing server is used for determining a position parameter of a target object in each frame of image of the target video based on a preset target object and the target video;
the data sending server is used for sending the target video, and if an operation request for a target object of the target video is received, the data sending server sends the target video and the position parameter of the target object in each frame of image of the target video.
2. The playing system of claim 1, wherein the first video processing server comprises a target identification unit, a comparison unit, and a target tracking unit;
the target identification unit is used for determining metadata of the target object; the metadata indicates image features of the target object;
the comparison unit is used for comparing the image characteristics of the target object with the image information of each picture frame in the target video to obtain a comparison result; determining parameter information of the target object in each picture frame according to the comparison result;
the target tracking unit is used for determining the position parameter of the target object in each picture frame according to the parameter information; the target object is included in the picture position indicated by the position parameter.
3. The playing system according to claim 2, wherein the parameter information includes an outline and a center position of the target object in a picture frame; the target tracking unit comprises a center determining subunit; and the position parameter comprises a zoom center;
for each picture frame, the center determining subunit is configured to determine the zoom center of the picture frame based on the center position of the target object in that picture frame.
4. A super high definition video preprocessing main visual angle roaming mobile terminal, characterized by comprising a second video processing module, a video playing module and a communication module;
the second video processing module is used for receiving a touch operation request, input by a user, for a target video, and if the touch operation request contains an operation instruction for a target object, processing the target video based on the operation instruction and a position parameter of the target object in each frame of image of the target video to obtain processed video data corresponding to the touch operation; if a frame of image of the target video contains the target object, the current frame of the processed video data also comprises the target object;
the video playing module is used for playing the processed video data corresponding to the touch operation;
the communication module is used for sending an operation request for the target video.
5. The mobile terminal of claim 4, wherein the second video processing module comprises an operation information storage and analysis unit and a video and audio processing module;
the operation information storage and analysis unit is used for storing the touch operation, input by the user, for the target video, analyzing the touch operation to form an operation instruction, and sending the operation instruction to the video and audio processing module;
the video and audio processing module is used for processing the target video based on the operation instruction and the position parameter of the target object in the target video to obtain video data corresponding to the touch operation.
6. The mobile terminal of claim 5, wherein the video and audio processing module comprises a target object determination unit and a video scaling unit; the position parameters comprise zooming positions of a plurality of target objects in each picture frame of the target video;
the target object determination unit is used for determining a selected target object among the plurality of target objects of the target video based on the touch operation;
and the video scaling unit is used for processing the target video according to the zooming position of the selected target object and the zoom ratio of the current video playback to obtain video data corresponding to the zooming position and the zoom ratio.
7. The mobile terminal of claim 5, wherein the mobile terminal further comprises a buffer module; the buffer module comprises a roaming image buffer unit and a display buffer unit;
the roaming image buffer unit is used for storing the video data comprising the target object and sending the video data comprising the target object to the display buffer unit; the display buffer unit is used for playing the video data.
8. The mobile terminal of claim 5, wherein the touch operation further comprises a region-of-interest switching operation;
and the video and audio processing module is further used for, when the region-of-interest switching operation is identified, processing the video data of the selected region of interest corresponding to the current video data according to a preset scaling and sending it to the display buffer unit, so that the display buffer unit plays the video data of the currently selected region of interest.
9. The mobile terminal of claim 5, wherein the mobile terminal further comprises a down-conversion unit and an aliasing unit; the operation instruction further comprises an aliasing instruction;
the down-conversion unit is used for performing down-conversion processing on a complete video picture of a video to be played to generate a base map video;
and the aliasing unit is used for performing aliasing processing on the base map video and the video picture corresponding to the operation request to obtain video data containing a floating window.
10. The mobile terminal of claim 5, wherein the mobile terminal further comprises an audio adjustment unit;
and the audio adjusting unit is used for converting the panoramic audio corresponding to the video to be played into the stereo audio corresponding to the target object.
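As an illustrative aside (not part of the claims), the sketches below show one possible way of realizing two of the mechanisms recited above; all function names, parameters, and resolutions are assumptions introduced here, and the claims do not prescribe these implementations.

The first sketch derives a crop window from the zoom center of the selected target object (claim 3) and the zoom ratio of the current playback (claim 6), clamping the window to the frame so the target object remains inside the played picture:

def crop_window(frame_w, frame_h, zoom_center, zoom_ratio):
    # zoom_center: (cx, cy) pixel position of the target object in the full frame
    # zoom_ratio:  values greater than 1 zoom in; the crop is 1/zoom_ratio of the frame
    w, h = int(frame_w / zoom_ratio), int(frame_h / zoom_ratio)
    cx, cy = zoom_center
    x = min(max(int(cx - w / 2), 0), frame_w - w)
    y = min(max(int(cy - h / 2), 0), frame_h - h)
    return x, y, w, h

# An 8K (7680 x 4320) frame, target near the left edge, 4x zoom:
print(crop_window(7680, 4320, (300, 2000), 4.0))  # -> (0, 1460, 1920, 1080)

The second sketch illustrates the floating-window idea of claim 9: a down-converted base map frame is combined with a close-up picture by writing the close-up into a rectangular region of the base map (assuming both frames are arrays of the same pixel format and the window fits inside the base map):

import numpy as np

def compose_floating_window(base, closeup, top_left):
    # Overlay the close-up picture onto a copy of the base map frame at top_left = (row, col).
    out = base.copy()
    y, x = top_left
    h, w = closeup.shape[:2]
    out[y:y + h, x:x + w] = closeup
    return out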
CN202111341472.XA 2021-11-12 2021-11-12 Super-high definition video preprocessing main visual angle roaming playing system and mobile terminal Active CN113891145B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111341472.XA CN113891145B (en) 2021-11-12 2021-11-12 Super-high definition video preprocessing main visual angle roaming playing system and mobile terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111341472.XA CN113891145B (en) 2021-11-12 2021-11-12 Super-high definition video preprocessing main visual angle roaming playing system and mobile terminal

Publications (2)

Publication Number Publication Date
CN113891145A true CN113891145A (en) 2022-01-04
CN113891145B CN113891145B (en) 2024-01-30

Family

ID=79017977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111341472.XA Active CN113891145B (en) 2021-11-12 2021-11-12 Super-high definition video preprocessing main visual angle roaming playing system and mobile terminal

Country Status (1)

Country Link
CN (1) CN113891145B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247397A (en) * 2017-06-28 2017-10-13 中铁第四勘察设计院集团有限公司 The method that the holography of generation three-dimensional circuits both wings roams projection source in real time
JP2019110545A (en) * 2019-02-04 2019-07-04 ヴィド スケール インコーポレイテッド Video playback method, terminal and system
CN111031398A (en) * 2019-12-10 2020-04-17 维沃移动通信有限公司 Video control method and electronic equipment
CN111741274A (en) * 2020-08-25 2020-10-02 北京中联合超高清协同技术中心有限公司 Ultrahigh-definition video monitoring method supporting local amplification and roaming of picture
CN111818312A (en) * 2020-08-25 2020-10-23 北京中联合超高清协同技术中心有限公司 Ultra-high-definition video monitoring conversion device and system with variable vision field
CN113573122A (en) * 2021-07-23 2021-10-29 杭州海康威视数字技术股份有限公司 Audio and video playing method and device

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022683A (en) * 2022-05-27 2022-09-06 咪咕文化科技有限公司 Video processing method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN113891145B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
US8665374B2 (en) Interactive video insertions, and applications thereof
CN107872731B (en) Panoramic video playing method and device
CN109416931A (en) Device and method for eye tracking
KR102246305B1 (en) Augmented media service providing method, apparatus thereof, and system thereof
US20070291134A1 (en) Image editing method and apparatus
CN111757137A (en) Multi-channel close-up playing method and device based on single-shot live video
CN113923486B (en) Pre-generated multi-stream ultra-high definition video playing system and method
US10764493B2 (en) Display method and electronic device
US20170225077A1 (en) Special video generation system for game play situation
CN110324641B (en) Method and device for keeping interest target moment display in panoramic video
CN113891145B (en) Super-high definition video preprocessing main visual angle roaming playing system and mobile terminal
CN111757138A (en) Close-up display method and device based on single-shot live video
CN114449252A (en) Method, device, equipment, system and medium for dynamically adjusting live video based on explication audio
WO2020206647A1 (en) Method and apparatus for controlling, by means of following motion of user, playing of video content
JP7207913B2 (en) Information processing device, information processing method and program
CN110798692A (en) Video live broadcast method, server and storage medium
US8483435B2 (en) Information processing device, information processing system, information processing method, and information storage medium
CN114143561B (en) Multi-view roaming playing method for ultra-high definition video
JP2016012351A (en) Method, system, and device for navigating in ultra-high resolution video content using client device
WO2020017354A1 (en) Information processing device, information processing method, and program
CN113938713A (en) Multi-path ultrahigh-definition video multi-view roaming playing method
US20220224958A1 (en) Automatic generation of augmented reality media
KR102063495B1 (en) Apparatus and control method for playing multimedia contents
JP2009181043A (en) Video signal processor, image signal processing method, program and recording medium
EP4120687A1 (en) An object or region of interest video processing system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant