CN112969099A - Camera device, first display equipment, second display equipment and video interaction method - Google Patents

Camera device, first display equipment, second display equipment and video interaction method

Info

Publication number
CN112969099A
Authority
CN
China
Prior art keywords
user
target
video
resolution
display device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110182830.0A
Other languages
Chinese (zh)
Inventor
孟卫明 (Meng Weiming)
Current Assignee
Hisense Group Holding Co Ltd
Original Assignee
Hisense Group Holding Co Ltd
Priority date
Filing date
Publication date
Application filed by Hisense Group Holding Co Ltd filed Critical Hisense Group Holding Co Ltd
Priority to CN202110182830.0A priority Critical patent/CN112969099A/en
Publication of CN112969099A publication Critical patent/CN112969099A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration

Abstract

The invention relates to a camera device, a first display device, a second display device, and a video interaction method in the technical field of wireless communication. The camera device converts the original position information of the region to be enlarged, selected by a second user at the resolution of the second display device, into target position information at the resolution of the camera device, according to the resolution of the camera device, the target resolution, and the resolution of the second display device; crops the target area from each frame of the video containing the first user according to the target position information; and sends the cropped target area to the second display device. During the video call between the first user and the second user, a local video area can thus be obtained at the resolution corresponding to the network bandwidth, so the bandwidth-related resolution requirement is met while the clarity of the enlarged image is improved by using the video of the target area as the code stream.

Description

Camera device, first display equipment, second display equipment and video interaction method
Technical Field
The invention relates to the technical field of wireless communication, and in particular to a camera device, a first display device, a second display device, and a video interaction method.
Background
In existing remote video interaction, interaction is mostly built on top of a video call. If two remote users, say user A and user B, are on a video call and user A wants to view an enlarged local picture of user B, user B has to photograph that enlarged local picture and send it through a separate transmission channel, while user A has to open another application during the call to view it. Alternatively, user B can move closer to the camera during the call so that it captures a close-range video.
In short, with conventional methods, switching from a video of the whole picture to a video of a local area of that picture is cumbersome.
Disclosure of Invention
The invention provides a camera device, a first display device, a second display device, and a video interaction method that can crop a target area from a video and enlarge the cropped target image as a base code stream, so that the full-picture video can be switched directly to a video of a local area, simplifying user operation.
In a first aspect, an embodiment of the present invention provides an imaging apparatus, including: the device comprises an acquisition unit, a communication unit and a processor;
the acquisition unit is used for capturing the video of the first user during the video call between the first user and the second user;
the processor is used for cropping a target area from the video containing the first user, according to the position information of the region to be enlarged selected by the second user, if a zoom-in instruction triggered by the second user is received during the video call; wherein the zoom-in instruction is triggered by the second user on a second display device, and the second display device is the display device used by the second user for the video call;
the communication unit is configured to receive the zoom-in instruction sent by a first display device and to forward the target area to the second display device through the first display device, where the first display device is the display device used by the first user for the video call.
With the above image pickup device, during the video call between the first user and the second user, the target area can be cropped from the video containing the first user according to the position information of the region to be enlarged selected by the second user, so that the second display device can display the video of that region; the full-picture video can thus be switched directly to a video of a local area, simplifying user operation.
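As a concrete illustration of the cropping step described above, the sketch below clamps a hypothetical (x, y, w, h) selection to the frame bounds before slicing. The function name, the rectangle encoding, and the NumPy array representation of a frame are all illustrative assumptions, not part of the patent.

```python
import numpy as np

def crop_target_area(frame: np.ndarray, x: int, y: int, w: int, h: int) -> np.ndarray:
    """Crop the selected region from one video frame (H x W x C array).

    Coordinates are clamped to the frame bounds so a selection that
    touches the edge of the picture still yields a valid sub-image.
    """
    fh, fw = frame.shape[:2]
    x0 = max(0, min(x, fw))
    y0 = max(0, min(y, fh))
    x1 = max(x0, min(x + w, fw))
    y1 = max(y0, min(y + h, fh))
    return frame[y0:y1, x0:x1]

# Example: crop a 640x360 window out of a 1080p frame.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
region = crop_target_area(frame, 600, 400, 640, 360)
```

Applying the same crop to every subsequent frame yields the local-area video that is forwarded to the second display device.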
In a second aspect, an embodiment of the present invention provides an imaging apparatus, including: the device comprises an acquisition unit, a communication unit and a processor;
the acquisition unit is used for capturing the video of the first user during the video call between the first user and the second user;
the processor is used for determining a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device, if a zoom-in instruction is received during the video call between the first user and the second user; wherein the zoom-in instruction is triggered on the second display device by the second user; the second display device is the display device used by the second user for the video call; and the target resolution is the resolution corresponding to the network bandwidth of the connection between the second display device and the first display device used by the first user for the video call;
converting the original position information of the region to be enlarged, selected by the second user at the resolution of the second display device, into target position information at the resolution of the camera device according to the target conversion relation;
cropping a target area, according to the target position information, from each frame following the target image frame in the video containing the first user; the target image frame being the frame in the video at the moment the second user triggers the zoom-in instruction;
the communication unit is configured to receive the zoom-in instruction sent by the first display device and to forward the target area to the second display device through the first display device.
With the above camera device, during the video call the target area is cropped from the video containing the first user taking into account the resolution of the camera device, the target resolution, and the resolution of the second display device, and is then sent to the second display device; the second display device therefore obtains the local-area video directly while the resolution requirement imposed by the network bandwidth is still met.
In one possible implementation, the original position information is original pixel coordinates at the resolution of the second display device; the processor is specifically configured to:
converting the original pixel coordinates into pixel coordinates of the resolution of the camera device according to a multiple relation between the resolution of the camera device and the resolution of the second display device;
intercepting a candidate area from an image frame of a video acquired by the camera device according to the pixel coordinate of the resolution of the camera device;
if the number of pixels in the horizontal coordinate direction of the candidate area is within a first preset range and the number of pixels in the vertical coordinate direction is within a second preset range, taking a resolution conversion relation corresponding to the first preset range and the second preset range as a target conversion relation, wherein the first preset range and the second preset range are both determined according to the target resolution.
With the above camera device, the candidate area can be obtained from the ratio between the resolution of the camera device and the resolution of the second display device; a resolution conversion relation can then be selected for the candidate area using the preset ranges determined by the target resolution, and the target area with the preset resolution can be obtained through that target conversion relation.
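A minimal sketch of the two steps above: scaling the display-resolution pixel coordinates by the ratio ("multiple relation") between the camera and display resolutions, then selecting a conversion whose preset width and height ranges contain the candidate area. The function names, the tuple encodings, and the example ranges are assumptions for illustration only.

```python
def display_to_camera_coords(x_disp, y_disp, cam_res, disp_res):
    """Scale a pixel coordinate given at the second display's resolution
    up to the camera's resolution using the ratio between the two."""
    sx = cam_res[0] / disp_res[0]
    sy = cam_res[1] / disp_res[1]
    return round(x_disp * sx), round(y_disp * sy)

def pick_target_conversion(cand_w, cand_h, conversions):
    """Select the resolution conversion whose preset width range and height
    range (both derived from the target resolution) contain the candidate
    area's pixel dimensions; returns None when no range matches."""
    for (w_lo, w_hi), (h_lo, h_hi), conv in conversions:
        if w_lo <= cand_w <= w_hi and h_lo <= cand_h <= h_hi:
            return conv
    return None

# Example: a 4K camera paired with a 1080p display doubles every coordinate.
cam_point = display_to_camera_coords(960, 540, (3840, 2160), (1920, 1080))
```

The `conversions` table here is hypothetical; the patent only requires that each preset range be associated with one resolution conversion relation.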
In a possible implementation manner, the image frame of the video acquired by the camera device includes at least one first fixed area and at least one first variable area in the abscissa direction; the image frame of the video collected by the camera device comprises at least one second fixed area and at least one second variable area in the vertical coordinate direction;
the processor is specifically configured to:
if the position abscissa is in the first fixed area, taking a preset pixel abscissa in the target conversion relation as the pixel abscissa in the target position information; wherein the position abscissa is determined from the pixel abscissa in the original pixel coordinates;
if the position abscissa is in the first variable area, taking as the pixel abscissa in the target position information the value obtained by feeding the pixel abscissa of the original pixel coordinates into the abscissa conversion of the target conversion relation;
if the position ordinate is in the second fixed area, taking a preset pixel ordinate in the target conversion relation as the pixel ordinate in the target position information; wherein the position ordinate is determined from the pixel ordinate in the original pixel coordinates;
if the position ordinate is in the second variable area, taking as the pixel ordinate in the target position information the value obtained by feeding the pixel ordinate of the original pixel coordinates into the ordinate conversion of the target conversion relation;
and forming target position information by using the horizontal coordinates of the pixels in the target position information and the vertical coordinates of the pixels in the target position information.
In general, locating pixels in the edge area of an image is troublesome for the image pickup device. By first determining whether a position coordinate falls in a fixed area or a variable area of the video frame, the device either takes the preset pixel coordinate or applies the coordinate conversion in the target conversion relation to obtain the target position information, which increases processing speed.
In a third aspect, a first display apparatus provided in an embodiment of the present invention includes a controller, a communicator, an external device interface, and a display;
the external device interface is connected with the camera device and used for receiving the video which is acquired by the camera device and contains the first user;
the controller is used for determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display device if an amplification instruction is received in the process of carrying out video by the first user and the second user; wherein the zoom-in instruction is triggered on the second display device by the second user; the second display device is a display device used by the second user for video; the target resolution is a resolution corresponding to a network bandwidth of a network connected between the first display device and the second display device;
converting original position information of a region needing to be enlarged under the resolution of the second display equipment selected by the second user into target position information under the resolution of the camera device according to the target conversion relation; intercepting a target area from each frame after a target image frame in a video containing a first user according to the target position information; the target image frame is an image frame in a video when the second user triggers the amplification instruction;
the communicator is used for receiving the amplification instruction sent by the second display equipment, receiving the video containing the second user sent by the second display equipment and sending the target area to the second display equipment;
the display is used for displaying the video containing the second user.
With the above first display device, during the video call between the first user and the second user, the video containing the first user captured by the camera device is obtained through the external device interface; the target area is cropped from that video taking into account the resolution of the camera device, the target resolution, and the resolution of the second display device, and is then sent to the second display device connected to the first display device over the network. The second display device therefore obtains the local-area video directly while the resolution requirement imposed by the network bandwidth is still met.
In a possible implementation manner, the display is further used for displaying an area needing to be enlarged in the target area in the form of a small window.
When the first display device displays the video containing the second user, it can also display its own local area in a small window for the first user to watch.
In a fourth aspect, an embodiment of the present invention provides a second display device, including: a communication unit, a processor and a display;
the communication unit is used for sending an amplification instruction to first display equipment used by a first user for video, and receiving a target area sent by the first display equipment; the target area is obtained by intercepting a video containing a first user according to the position information of an area needing to be amplified selected by a second user;
the display is used for displaying an area needing to be amplified in the target area;
the processor is configured to, in a process of performing video by a first user and a second user, control the communication unit to send an amplification instruction to a second display device and control the display to display an area to be amplified in the target area if the amplification instruction triggered by the second user is received.
After the second user triggers the zoom-in, the target area can be cropped from the video containing the first user and transmitted to the second display device.
In one possible implementation, the processor is specifically configured to:
determining the magnification of the target area according to the original position information and the target position information;
and controlling the display to display the amplified region to be amplified in the target region by taking the center of the region to be amplified in the target region as the display center of the second display device.
The second display device can determine the magnification of the region to be enlarged in the target area and, after enlargement, display that region to the second user with its center placed at the display center of the second display device. Because the video stream of the target area is used as the base code stream for enlargement, the clarity of the enlarged image is improved.
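A hedged sketch of how the magnification and display center might be derived from the original and target position information. The (x, y, w, h) rectangle encoding and the assumption of a preserved aspect ratio are illustrative; the patent only states that both pieces of position information are used.

```python
def magnification_and_center(orig_box, target_box):
    """Derive the magnification factor and display centre.

    orig_box is the (x, y, w, h) rectangle the second user selected at the
    display's resolution; target_box is the (x, y, w, h) rectangle of the
    cropped target area.  The magnification is the size ratio between the
    two, and the centre of the selected region becomes the display centre
    of the second display device.
    """
    ox, oy, ow, oh = orig_box
    tx, ty, tw, th = target_box
    scale = tw / ow                      # assumes the aspect ratio is kept
    center = (ox + ow / 2, oy + oh / 2)  # centre of the region to enlarge
    return scale, center

result = magnification_and_center((100, 100, 200, 100), (0, 0, 400, 200))
```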
In a fifth aspect, a video interaction method provided in an embodiment of the present invention is applied to a camera device for capturing a video of a first user, and includes:
in the process of carrying out video by a first user and a second user, if an amplification instruction is received, determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display equipment; wherein the zoom-in instruction is triggered on the second display device by the second user; the target resolution is a resolution corresponding to a network bandwidth of a network connected between the first display device and the second display device; the second display device is a display device used by the second user for video; the first display device is a display device used by the first user for video;
converting original position information of a region needing to be enlarged under the resolution of the second display equipment selected by the second user into target position information under the resolution of the camera device according to the target conversion relation;
cropping a target area, according to the target position information, from each frame after the target image frame in the video containing the first user; the target image frame being the frame in the video at the moment the second user triggers the zoom-in instruction;
and sending the intercepted target area to the second display equipment through the first display equipment so that the second display equipment displays the area needing to be amplified in the target area.
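The camera-side flow of the method above can be sketched end to end: convert the region selected at the second display's resolution into camera coordinates, crop it from every frame after the zoom-in instruction, and hand each crop off for forwarding. All names are illustrative, and `send` stands in for the transfer through the first display device.

```python
import numpy as np

def camera_zoom_pipeline(frames, selection, cam_res, disp_res, send):
    """Convert the (x, y, w, h) region selected at the second display's
    resolution into camera coordinates, crop it from every frame after the
    zoom-in instruction, and pass each crop to `send`, which stands in for
    forwarding through the first display device."""
    sx = cam_res[0] / disp_res[0]
    sy = cam_res[1] / disp_res[1]
    x, y, w, h = selection
    cx, cy = round(x * sx), round(y * sy)   # top-left in camera pixels
    cw, ch = round(w * sx), round(h * sy)   # size in camera pixels
    for frame in frames:
        send(frame[cy:cy + ch, cx:cx + cw])

# Example: two 4K frames, a 1080p display, a 640x360 selection.
sent = []
frames = [np.zeros((2160, 3840, 3), dtype=np.uint8) for _ in range(2)]
camera_zoom_pipeline(frames, (100, 50, 640, 360), (3840, 2160), (1920, 1080), sent.append)
```

In the patent the conversion can be the piecewise fixed/variable-area relation rather than this uniform scaling; the structure of the loop is the same either way.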
In a sixth aspect, a video interaction method provided in an embodiment of the present invention is applied to a first display device used by a first user for video, and includes:
in the process of carrying out video by a first user and a second user, if an amplification instruction is received, determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display equipment; wherein the zoom-in instruction is triggered on the second display device by the second user; the target resolution is a resolution corresponding to a network bandwidth of a network connected between the first display device and the second display device; the second display device is a display device used by the second user for video; the first display device is a display device used by the first user for video;
converting original position information of a region needing to be enlarged under the resolution of the second display equipment selected by the second user into target position information under the resolution of the camera device according to the target conversion relation;
cropping a target area, according to the target position information, from each frame after the target image frame in the video containing the first user; the target image frame being the frame in the video at the moment the second user triggers the zoom-in instruction;
and sending the intercepted target area to the second display device, so that the second display device displays an area needing to be enlarged in the target area.
In a seventh aspect, a video interaction method provided in an embodiment of the present invention is applied to a second display device used by a second user for video, and includes:
in the process of carrying out video by a first user and a second user, if an amplification instruction triggered by the second user is received, the amplification instruction is sent to first display equipment; the first display device is used by the first user for video;
receiving a target area; the target area is obtained by intercepting a video containing a first user according to the position information of an area needing to be amplified selected by a second user;
and displaying the region needing to be enlarged in the target region.
In an eighth aspect, the present invention further provides a computer storage medium having a computer program stored thereon, which when executed by a processing unit, performs the steps of the video interaction method of the fifth to seventh aspects.
In addition, for the technical effects of any implementation of the eighth aspect when performing the steps of the video interaction method of the fifth aspect, refer to the technical effects of the corresponding implementations of the second aspect; for the steps of the method of the sixth aspect, refer to the third aspect; and for the steps of the method of the seventh aspect, refer to the fourth aspect. Details are not repeated here.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention and are not to be construed as limiting the invention.
Fig. 1 is a block diagram of a video interaction system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a display device interacting with a user according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a second display device interacting with a user during an ongoing video process according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a second display device and a display device that are performing video after intercepting a target area according to an embodiment of the present invention;
fig. 5 is a flowchart of information interaction of a video interaction system according to an embodiment of the present invention;
FIG. 6 is a flow chart of information interaction of another video interaction system provided by the embodiment of the invention;
fig. 7 is a schematic diagram of intercepting an area to be enlarged from a video frame and intercepting the area to be enlarged again from the area to be enlarged according to an embodiment of the present invention;
fig. 8 is a flowchart of a method applied to video interaction of a camera device according to an embodiment of the present invention;
fig. 9 is a flowchart of a method applied to video interaction of a first display device according to an embodiment of the present invention;
fig. 10 is a flowchart of a method applied to video interaction of a second display device according to an embodiment of the present invention;
fig. 11 is a structural diagram of an image pickup apparatus according to an embodiment of the present invention;
fig. 12 is a structural diagram of a second display device provided in an embodiment of the present invention;
fig. 13 is a structural diagram of a first display device according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings.
At present, during a video call between a first user and a second user, if the second user wants a clear enlarged view of a local region of the first user in a video frame, the first user can only photograph such a local enlarged image and send it to the second user outside the video call, which makes the process cumbersome.
In view of the above, embodiments of the present invention provide an image capturing apparatus, a first display device, a second display device, and a video interaction method, where a target area is cut from a video including a first user according to position information of an area to be enlarged selected by a second user, so that the second display device can display the video of the area to be enlarged in the target area, and thus the video can be directly changed from the video to a video in a local area, which simplifies user operations.
The embodiment of the invention provides a system for video interaction, which comprises: the video display system comprises a first display device used by a first user for video, a second display device used by a second user for video and a camera device; the first display device and the second display device are connected through a network, and an external device Interface in the first display device is connected with the camera device, wherein the external device Interface is a Universal Serial Bus (USB) or a Mobile Industry Processor Interface (MIPI).
The system for video interaction can realize the following functions: the video of the first user and the video of the second user are captured during their video call; if a zoom-in instruction triggered by the second user on the second display device is received, a target area is cropped from the video containing the first user according to the position information of the region to be enlarged selected by the second user. The first display device displays the video containing the second user; the second display device displays the video containing the first user until the zoom-in instruction is issued, and displays the region to be enlarged within the target area afterwards.
In view of the resolution limit imposed by the network bandwidth of the connection between the first display device and the second display device, the system may additionally: determine a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device, where the target resolution is the resolution corresponding to that network bandwidth; convert the original position information of the region to be enlarged, selected by the second user at the resolution of the second display device, into target position information at the resolution of the camera device according to the target conversion relation; and crop a target area, according to the target position information, from each frame following the target image frame in the video containing the first user, the target image frame being the frame at the moment the second user triggers the zoom-in instruction.
For example, in the system, a video stream may be forwarded between the first display device and the second display device through the server, for example, the first display device forwards a video including the first user to the second display device through the server, the first display device forwards the target area to the second display device through the server, and the server has an identity authentication function in addition to a video forwarding function. For example, a first user inputs a user name and a password of the first user through the display device, a second user inputs a user name and a password of the second user through the second display device, and the display device and the second display device interact with the server to realize user login authentication and user type identification operations. And after the first user and the second user successfully log in, the token sent by the server is received. The token is used for carrying out token verification by the server in the video call process, and further responding to user operation and realizing video stream forwarding operation.
Of course, the function of video forwarding and the function of identity authentication may also be implemented on two separate servers, with a first server responsible for video forwarding and a second server responsible for identity authentication; the present invention is not particularly limited in this respect.
The first display device can be a television with the camera device mounted on it, and the second display device can be a computer, a tablet, a mobile phone, or another display device.
Referring to fig. 1, the system for video interaction includes a television 100 for video use by a first user, i.e., a patient, a camera 101 installed on the television, a login authentication server 102, a video inquiry server 103, and a second display device for video use by a second user, i.e., a doctor, such as a computer 104, a tablet computer 105, and a mobile phone 106; the television 100 is connected with the computer 104, the tablet computer 105 and the mobile phone 106 through the network respectively.
The patient inputs a user name and password through the television 100, and the doctor inputs a user name and password through the computer 104. The entered credentials are sent to the login verification server 102, which performs identity verification. After verification passes, the login verification server 102 sends the patient and the doctor each a page matching his or her identity; at the same time, each client receives a token issued by the login verification server, which the video inquiry server uses for token verification during the video call, so that it can respond to user operations and carry out the video stream forwarding operations.
Referring to fig. 2, the patient requests the list of online doctors from the login verification server 102, and the login verification server 102 sends the list of online doctors, such as Dr. Wang, Dr. Li and Dr. Zhang, to the television 100; the patient selects one online doctor on the page, such as Dr. Zhang, and initiates a video inquiry request to that doctor.
For example, the second display device used by the doctor for video is the computer 104, and the doctor receives the video inquiry request on the computer 104.
The camera 101 collects a video of the first user, compresses the video data, and transmits it to the television 100; the television 100 then sends the video to the computer 104 through the video inquiry server 103.
Referring to fig. 3, the doctor selects a region to be enlarged on the computer 104 and then clicks the circled "+", whereupon the computer 104 generates an enlargement instruction carrying the position information of the region to be enlarged.
After the computer 104 generates the enlargement instruction, it sends the instruction to the television 100 through the video inquiry server 103, and the target area can be intercepted by either the television 100 or the camera 101. Referring to fig. 4, the computer 104 then displays the region to be enlarged within the target area.
Illustratively, when the above functions are implemented in a system for video interaction, the process of interaction performed by each device in the system is as follows:
if the process of capturing the target area is implemented in the image capturing device, the working process of each element in the system, as shown in fig. 5, is:
S500: the camera device collects a video of the first user;
S501: the camera device sends the video containing the first user to the first display device;
S502: the first display device sends the video containing the first user to the second display device;
S503: the second display device responds to an enlargement instruction triggered by the second user;
the original position information of the region to be enlarged at the resolution of the second display device of the second user-selected region to be enlarged is represented by the coordinates of the upper left vertex and the lower right vertex of the region to be enlarged, which are (x1, y1), (x2, y2), respectively, and these two original pixel coordinates. The coordinates are sent to the first display device by the video call custom RTP protocol.
The vertex coordinates are encapsulated into the padding bytes or the extension header of an RTP packet; for example, the header extension is enabled by setting the X bit of the RTP header. The RTP header format, shown in table 1, includes the sequence number, the timestamp, the synchronization source (SSRC) identifier, and the contributing source (CSRC) identifiers.
TABLE 1
| V | P | X | CC | M | PT | sequence number |
| timestamp |
| synchronization source (SSRC) identifier |
| contributing source (CSRC) identifiers |
At the X bit position in table 1, the extension header of the RTP header is enabled. The extension header, shown in table 2, includes: a profile-defined field, a length field, and the header extension itself.
TABLE 2
| defined by profile | length |
| header extension |
The contents of the encapsulation field are shown in table 3 below:
TABLE 3
| startx | starty | endx | endy |
| x1 | y1 | x2 | y2 |
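As a sketch only, the encapsulation described above can be written in Python; the layout (a 16-bit profile field, a 16-bit length counted in 32-bit words, then the four 16-bit coordinates of table 3) follows the generic RTP header extension, while the profile value 0xABAC and the function names are assumptions for illustration, not part of the patent.

```python
import struct

def pack_coord_extension(x1, y1, x2, y2, profile=0xABAC):
    """Pack the selection-box vertices into an RTP header extension.

    Payload is startx, starty, endx, endy as four 16-bit big-endian
    integers (table 3), preceded by the 16-bit profile field and the
    16-bit length field in 32-bit words (table 2)."""
    payload = struct.pack("!4H", x1, y1, x2, y2)          # 8 bytes = 2 words
    return struct.pack("!HH", profile, len(payload) // 4) + payload

def unpack_coord_extension(data):
    """Recover the vertex coordinates from the extension bytes."""
    profile, length = struct.unpack("!HH", data[:4])
    x1, y1, x2, y2 = struct.unpack("!4H", data[4:4 + length * 4])
    return x1, y1, x2, y2
```

The receiving side (the first display device or camera device) would parse the extension with `unpack_coord_extension` before converting the coordinates.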
S504: the second display device sends the enlargement instruction to the first display device;
S505: after receiving the enlargement instruction, the first display device sends it to the camera device;
S506: the camera device intercepts the target area from each frame after the target image frame in the video containing the first user, according to the position information of the region to be enlarged selected by the second user;
S507: the camera device sends the target area to the first display device;
S508: the first display device sends the target area to the second display device;
S509: the second display device displays the region to be enlarged within the target area.
If the process of intercepting the target area is implemented in the first display device, the working process of each element in the system, as shown in fig. 6, is:
S600: the camera device collects a video of the first user;
S601: the camera device sends the video containing the first user to the first display device;
S602: the first display device sends the video containing the first user to the second display device;
S603: the second display device responds to an enlargement instruction triggered by the second user;
S604: the second display device sends the enlargement instruction to the first display device;
S605: after receiving the enlargement instruction, the first display device intercepts the target area from each frame after the target image frame in the video containing the first user, according to the position information of the region to be enlarged selected by the second user;
S606: the first display device sends the target area to the second display device;
S607: the second display device displays the region to be enlarged within the target area.
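The per-frame interception of steps S605 (or S506, when performed in the camera device) can be sketched minimally as follows; `crop_frame` is a hypothetical helper operating on a frame stored as a list of pixel rows, not the patent's implementation, and a real device would slice a numpy array or use the encoder's cropping facility instead.

```python
def crop_frame(frame, box):
    """Cut the target area out of one camera frame.

    `frame` is a row-major list of pixel rows; `box` is the target
    position information (x1, y1, x2, y2), already converted to pixel
    coordinates at the camera device's resolution."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]
```

Applied to every frame after the target image frame, this yields the stream of target areas forwarded to the second display device.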
The second display device and the camera device differ in resolution, and the intercepted target area is transmitted over the network, so the transmission is constrained by the resolution corresponding to the network bandwidth. To this end, the present invention proposes:
the camera device is further configured to: if an enlargement instruction is received during the video call between the first user and the second user, determine a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device, the target resolution being the resolution corresponding to the network bandwidth of the network connecting the first display device and the second display device; convert the original position information of the region to be enlarged, selected by the second user at the resolution of the second display device, into target position information at the resolution of the camera device according to the target conversion relation; and intercept the target area, according to the target position information, from each frame after the target image frame in the video containing the first user, the target image frame being the image frame in the video at the moment the second user triggers the enlargement instruction;
or the first display device is further configured to: if an enlargement instruction is received during the video call between the first user and the second user, determine a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device; convert the original position information of the region to be enlarged, selected by the second user at the resolution of the second display device, into target position information at the resolution of the camera device according to the target conversion relation; and intercept the target area, according to the target position information, from each frame after the target image frame in the video containing the first user.
The original position information of the region to be enlarged consists of original pixel coordinates at the resolution of the second display device used by the second user, for example the pixel coordinates of the top-left vertex and the bottom-right vertex of the region. However, the resolution of the second display device differs from the resolution of the camera device, so the region to be enlarged cannot be cut directly out of the image frames collected by the camera device using this position information; at the same time, the resolution corresponding to the network bandwidth of the network between the first display device and the second display device must be considered. The target conversion relation is therefore determined as follows:
converting the original pixel coordinates into pixel coordinates at the resolution of the camera device according to the multiple relation between the resolution of the camera device and the resolution of the second display device;
intercepting a candidate area from an image frame of the video collected by the camera device according to the pixel coordinates at the camera device's resolution;
and if the number of pixels of the candidate area in the abscissa direction is within a first preset range and the number of pixels in the ordinate direction is within a second preset range, taking the resolution conversion relation corresponding to the first and second preset ranges as the target conversion relation, both preset ranges being determined according to the target resolution.
The original pixel coordinates are converted into pixel coordinates at the resolution of the camera device. For example, if the resolution of the camera device has 4 times the pixels of the resolution of the second display device, each coordinate is doubled when converting: 2 times the abscissa of an original pixel coordinate gives the abscissa at the camera device's resolution, and 2 times the ordinate gives the ordinate at the camera device's resolution.
For example, if the resolution of the camera device is 4k and the resolution of the second display device is 1080p, then 1080p has 1/4 the pixels of 4k, so 4k has 2 times as many pixels as 1080p in the abscissa direction and 2 times as many in the ordinate direction; accordingly, when the original position information is (x1, y1) and (x2, y2), the pixel coordinates at the camera device's resolution are (2x1, 2y1) and (2x2, 2y2).
If the resolution of the camera device is 8k and the resolution of the second display device is 1080p, then 1080p has 1/16 the pixels of 8k, so 8k has 4 times as many pixels as 1080p in the abscissa direction and 4 times as many in the ordinate direction; accordingly, when the original pixel coordinates are (x1, y1) and (x2, y2), the pixel coordinates at the camera device's resolution are (4x1, 4y1) and (4x2, 4y2).
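Under the stated multiple relations (4k doubles, 8k quadruples the 1080p coordinates along each axis), the conversion can be sketched as below; the table of scale factors and the function name are illustrative assumptions, not the patent's API.

```python
# Per-axis scale factors for the resolution pairs discussed above.
AXIS_SCALE = {("4k", "1080p"): 2, ("8k", "1080p"): 4}

def to_camera_coords(points, cam="4k", disp="1080p"):
    """Map original pixel coordinates at the second display device's
    resolution to pixel coordinates at the camera device's resolution,
    by multiplying each coordinate by the per-axis scale factor."""
    s = AXIS_SCALE[(cam, disp)]
    return [(s * x, s * y) for x, y in points]
```

For instance, the original coordinates (400, 400) and (600, 500) become (800, 800) and (1200, 1000) under a 4k camera device.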
The first preset range and the second preset range are determined according to the resolution corresponding to the network bandwidth of the network connected between the first display device and the second display device.
For example, the first preset range and the second preset range are determined according to the resolution corresponding to the network bandwidth, or a multiple of that resolution; the first preset range bounds the number of pixels in the abscissa direction and the second preset range bounds the number of pixels in the ordinate direction. For example, when the resolution of the camera device is 8k and the resolution corresponding to the network bandwidth is 1080p, the first preset ranges are bounded by 1920 and 3840, and the second preset ranges are bounded by 1080 and 2160. When the resolution of the camera device is 4k and the resolution corresponding to the network bandwidth is 1080p, the first preset range is bounded by 1920 and the second preset range is bounded by 1080.
If the resolution of the target area intercepted under the corresponding target conversion relation falls within the corresponding preset ranges, the resolution of the target area is thereby fixed relative to the resolution corresponding to the network bandwidth: its resolution in the abscissa direction is the bandwidth resolution's pixel count in that direction, or a multiple of it, and likewise in the ordinate direction. For example, when the resolution of the camera device is 8k and the resolution corresponding to the network bandwidth is 1080p, the target area may have a resolution of 1080p or 4k; when the resolution of the camera device is 4k and the resolution corresponding to the network bandwidth is 1080p, the target area may have a resolution of 1080p.
As for the specific process of determining the target conversion relation: for example, the resolution of the camera device is 4k and the resolution of the second display device is 1080p; according to the 4-times multiple relation between 4k and 1080p, the original pixel coordinates (x1, y1), (x2, y2) become pixel coordinates (2x1, 2y1), (2x2, 2y2) at 4k, and a candidate region of 2|x2-x1| × 2|y2-y1| pixels is then cut out of the 4k image frame.
Since the resolution of the camera device is 4k and the resolution corresponding to the network bandwidth is 1080p, the first preset range is "smaller than 1920" and the second preset range is "smaller than 1080":
namely 2|x2-x1| < 1920 and 2|y2-y1| < 1080, where (x1, y1) and (x2, y2) are the original pixel coordinates; for example, (x1, y1) = (400, 400) and (x2, y2) = (600, 500).
Here 2|x2-x1| = 2|600-400| = 400 < 1920 and 2|y2-y1| = 2|500-400| = 200 < 1080, so the resolution of the candidate region is smaller than 1920 × 1080, and the resolution conversion relation corresponding to these first and second preset ranges may be taken as the target conversion relation.
Illustratively, if the resolution of the camera device is 8k and the resolution of the second display device is 1080p, then according to the 16-times multiple relation between 8k and 1080p, the original pixel coordinates (x1, y1), (x2, y2) become pixel coordinates (4x1, 4y1), (4x2, 4y2) at 8k, and a candidate region of 4|x2-x1| × 4|y2-y1| pixels is then cut out of the 8k image frame.
Since the resolution of the camera device is 8k and the resolution corresponding to the network bandwidth is 1080p, the preset ranges may be: no more than 1920 in the abscissa direction with no more than 1080 in the ordinate direction; between 1920 and 3840 in the abscissa direction with no more than 2160 in the ordinate direction; or between 1080 and 2160 in the ordinate direction with no more than 3840 in the abscissa direction;
judging whether the number of pixels of the candidate region in the abscissa direction is no more than 1920, between 1920 and 3840, or more than 3840, that is, determining the range to which the number of pixels in the abscissa direction belongs;
judging whether the number of pixels of the candidate region in the ordinate direction is no more than 1080, between 1080 and 2160, or more than 2160, that is, determining the range to which the number of pixels in the ordinate direction belongs;
for example, 4|x2-x1| ≤ 1920 and 4|y2-y1| ≤ 1080 indicate that the resolution of the candidate region is no more than 1920 × 1080;
1920 < 4|x2-x1| ≤ 3840 and 4|y2-y1| ≤ 2160, or 1080 < 4|y2-y1| ≤ 2160 and 4|x2-x1| ≤ 3840, indicate that the resolution of the candidate region is greater than 1080p but no more than 4k;
if 4|x2-x1| ≤ 1920 and 4|y2-y1| ≤ 1080, the resolution conversion relation corresponding to that pair of ranges is taken as the target conversion relation;
if 1920 < 4|x2-x1| ≤ 3840 and 4|y2-y1| ≤ 2160, or 1080 < 4|y2-y1| ≤ 2160 and 4|x2-x1| ≤ 3840, the resolution conversion relation corresponding to those ranges is taken as the target conversion relation.
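The range test above, which decides which conversion relation applies for an 8k camera device with a 1080p bandwidth resolution, can be sketched as a small classifier; the function name and string labels are ours for illustration.

```python
def classify_8k(x1, y1, x2, y2):
    """Classify the candidate region for an 8k camera device.

    The candidate region is 4|x2-x1| pixels wide and 4|y2-y1| pixels
    tall at 8k. Returns "1080p" when it fits within 1920x1080, "4k"
    when it exceeds 1080p but fits within 3840x2160, and "full" when
    it exceeds 4k (no crop; the whole frame is compressed instead)."""
    w, h = 4 * abs(x2 - x1), 4 * abs(y2 - y1)
    if w <= 1920 and h <= 1080:
        return "1080p"
    if w <= 3840 and h <= 2160:
        return "4k"
    return "full"
```

The label returned selects which of the piecewise conversion relations listed further below is applied to the original pixel coordinates.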
In the abscissa direction, an image frame of the video collected by the camera device comprises at least one first fixed area and at least one first change area; in the ordinate direction, it comprises at least one second fixed area and at least one second change area;
converting original position information of a region to be enlarged selected by a second user at a resolution of the second display device into target position information at a resolution of the image pickup apparatus according to a target conversion relationship, including:
if the position abscissa is in a first fixed area, the preset pixel abscissa in the target conversion relation is taken as the pixel abscissa in the target position information; the position abscissa is determined from the pixel abscissas in the original pixel coordinates;
if the position abscissa is in a first change area, the pixel abscissa in the target position information is obtained by substituting the pixel abscissas of the original pixel coordinates into the abscissa conversion formula of the target conversion relation;
if the position ordinate is in a second fixed area, the preset pixel ordinate in the target conversion relation is taken as the pixel ordinate in the target position information; the position ordinate is determined from the pixel ordinates in the original pixel coordinates;
if the position ordinate is in a second change area, the pixel ordinate in the target position information is obtained by substituting the pixel ordinates of the original pixel coordinates into the ordinate conversion formula of the target conversion relation;
and the target position information is formed from the resulting pixel abscissas and pixel ordinates.
Exemplarily, under 4k of the camera device, the conversion relation corresponding to 2|x2-x1| < 1920 and 2|y2-y1| < 1080 is:
if x1 + x2 < 960, then a = 0 and c = 1919;
if x1 + x2 > 2880, then a = 1920 and c = 3839;
otherwise, a = x1 + x2 - 960 and c = x1 + x2 + 960;
if y1 + y2 < 540, then b = 0 and d = 1079;
if y1 + y2 > 1620, then b = 1080 and d = 2159;
otherwise, b = y1 + y2 - 540 and d = y1 + y2 + 540;
the unit in the above relationship is a pixel.
It can be seen that the target conversion relation maps the original pixel coordinates to the target position information, and that the resolution of the target area intercepted according to the target position information is the resolution corresponding to the network bandwidth, or a multiple of it.
In this case the image frame of the video collected by the camera device comprises two first fixed areas in the abscissa direction: one where the position abscissa is smaller than 960 and one where it is larger than 2880; and two second fixed areas in the ordinate direction: one where the position ordinate is smaller than 540 and one where it is larger than 1620;
in detail, the present invention takes the position abscissa to be the sum of the pixel abscissas of the two original pixel coordinates, and the position ordinate to be the sum of their pixel ordinates. If the position abscissa is smaller than a first threshold, here 960, it lies in one first fixed area, and the preset values corresponding to that threshold in the target conversion relation, namely a = 0 and c = 1919, are taken as the pixel abscissas in the target position information. If the position abscissa is larger than a second threshold, here 2880, it lies in the other first fixed area, and the preset values a = 1920 and c = 3839 are taken as the pixel abscissas in the target position information. If the position abscissa is neither smaller than the first threshold nor larger than the second threshold, it lies in the first change area, and the abscissa conversion formulas a = x1 + x2 - 960 and c = x1 + x2 + 960 give the pixel abscissas in the target position information. Similarly, if the position ordinate is smaller than a third threshold, here 540, it lies in one second fixed area, and the preset values b = 0 and d = 1079 are taken as the pixel ordinates; if the position ordinate is larger than a fourth threshold, here 1620, it lies in the other second fixed area, and the preset values b = 1080 and d = 2159 are taken as the pixel ordinates; otherwise it lies in the second change area, and the ordinate conversion formulas b = y1 + y2 - 540 and d = y1 + y2 + 540 give the pixel ordinates in the target position information.
Taking (x1, y1) = (400, 400) and (x2, y2) = (600, 500) as an example, when converting:
since 400 + 600 = 1000 > 960 and < 2880, a = x1 + x2 - 960 = 1000 - 960 = 40 and c = x1 + x2 + 960 = 1000 + 960 = 1960;
since 400 + 500 = 900 > 540 and < 1620, b = y1 + y2 - 540 = 900 - 540 = 360 and d = y1 + y2 + 540 = 900 + 540 = 1440;
the target position information is (40, 360), (1960, 1440).
Taking (x1, y1) = (100, 50) and (x2, y2) = (600, 500) as an example, when converting:
since 100 + 600 = 700 < 960, a = 0 and c = 1919;
since 50 + 500 = 550 > 540 and < 1620, b = y1 + y2 - 540 = 550 - 540 = 10 and d = y1 + y2 + 540 = 550 + 540 = 1090;
the target position information is (0, 10), (1919, 1090).
Illustratively, under 8k of the camera device, the conversion relation corresponding to 1920 < 4|x2-x1| ≤ 3840 and 4|y2-y1| ≤ 2160, or 1080 < 4|y2-y1| ≤ 2160 and 4|x2-x1| ≤ 3840, is:
if x1 + x2 < 960, then a = 0 and c = 3839;
if x1 + x2 > 2880, then a = 3840 and c = 7679;
otherwise, a = 2(x1 + x2) - 1920 and c = 2(x1 + x2) + 1920;
if y1 + y2 < 540, then b = 0 and d = 2159;
if y1 + y2 > 1620, then b = 2160 and d = 4319;
otherwise, b = 2(y1 + y2) - 1080 and d = 2(y1 + y2) + 1080;
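The 8k relation for candidate regions between 1080p and 4k can be sketched in the same way; again the function name is an illustrative assumption.

```python
def convert_8k_4k(x1, y1, x2, y2):
    """Target conversion relation for an 8k camera device when the
    candidate region is larger than 1080p but no larger than 4k.
    Returns ((a, b), (c, d)): a 4k-sized window at the camera's 8k
    resolution, clamped at the frame edges."""
    sx, sy = x1 + x2, y1 + y2          # position abscissa / ordinate
    if sx < 960:
        a, c = 0, 3839
    elif sx > 2880:
        a, c = 3840, 7679
    else:
        a, c = 2 * sx - 1920, 2 * sx + 1920
    if sy < 540:
        b, d = 0, 2159
    elif sy > 1620:
        b, d = 2160, 4319
    else:
        b, d = 2 * sy - 1080, 2 * sy + 1080
    return (a, b), (c, d)
```

Applying it to the worked coordinates below reproduces the target position information derived there.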
in this case the image frame of the video collected by the camera device comprises two first fixed areas in the abscissa direction: one where the position abscissa is smaller than 960 and one where it is larger than 2880; and two second fixed areas in the ordinate direction: one where the position ordinate is smaller than 540 and one where it is larger than 1620;
specifically, if the position abscissa is smaller than the first threshold, here 960, it lies in one first fixed area, and the preset values corresponding to that threshold in the target conversion relation, namely a = 0 and c = 3839, are taken as the pixel abscissas in the target position information. If the position abscissa is larger than the second threshold, here 2880, it lies in the other first fixed area, and the preset values a = 3840 and c = 7679 are taken as the pixel abscissas. If the position abscissa satisfies neither condition, it lies in the first change area, and the abscissa conversion formulas a = 2(x1 + x2) - 1920 and c = 2(x1 + x2) + 1920 give the pixel abscissas. Similarly, if the position ordinate is smaller than the third threshold, here 540, it lies in one second fixed area, and the preset values b = 0 and d = 2159 are taken as the pixel ordinates; if the position ordinate is larger than the fourth threshold, here 1620, it lies in the other second fixed area, and the preset values b = 2160 and d = 4319 are taken as the pixel ordinates; otherwise it lies in the second change area, and the ordinate conversion formulas b = 2(y1 + y2) - 1080 and d = 2(y1 + y2) + 1080 give the pixel ordinates in the target position information.
Take (x1, y1) = (300, 600) and (x2, y2) = (1000, 1000) as an example.
Since |x2 - x1| = |1000 - 300| = 700 and 700 × 4 = 2800, we have 1920 < 2800 ≤ 3840; since |y2 - y1| = |1000 - 600| = 400 and 400 × 4 = 1600, we have 1600 ≤ 2160. The coordinates therefore satisfy 1920 < 4|x2-x1| ≤ 3840 and 4|y2-y1| ≤ 2160, so the corresponding conversion relation is used to determine the target position information.
Under 8k of the camera device, applying that conversion relation:
since 300 + 1000 = 1300 > 960 and < 2880, a = 2(x1 + x2) - 1920 = 2 × 1300 - 1920 = 680 and c = 2(x1 + x2) + 1920 = 2 × 1300 + 1920 = 4520;
since 600 + 1000 = 1600 > 540 and < 1620, b = 2(y1 + y2) - 1080 = 2 × 1600 - 1080 = 2120 and d = 2(y1 + y2) + 1080 = 2 × 1600 + 1080 = 4280;
the target position information is (680, 2120), (4520, 4280).
For example, under 8k of the camera device, the conversion relation corresponding to 4|x2-x1| ≤ 1920 and 4|y2-y1| ≤ 1080 is:
if 2(x1 + x2) < 960, then a = 0 and c = 1919;
if 2(x1 + x2) > 6720, then a = 5760 and c = 7679;
otherwise, a = 2(x1 + x2) - 960 and c = 2(x1 + x2) + 960;
if 2(y1 + y2) < 540, then b = 0 and d = 1079;
if 2(y1 + y2) > 3780, then b = 3240 and d = 4319;
otherwise, b = 2(y1 + y2) - 540 and d = 2(y1 + y2) + 540;
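The 8k relation for candidate regions that fit within 1080p can be sketched likewise; the function name is an illustrative assumption, and the results follow directly from the piecewise formulas just listed.

```python
def convert_8k_1080p(x1, y1, x2, y2):
    """Target conversion relation for an 8k camera device when the
    candidate region fits within 1080p (4|x2-x1| <= 1920 and
    4|y2-y1| <= 1080). Returns ((a, b), (c, d)): a 1080p-sized window
    at the camera's 8k resolution, clamped at the frame edges."""
    sx, sy = x1 + x2, y1 + y2          # position abscissa / ordinate
    if 2 * sx < 960:
        a, c = 0, 1919
    elif 2 * sx > 6720:
        a, c = 5760, 7679
    else:
        a, c = 2 * sx - 960, 2 * sx + 960
    if 2 * sy < 540:
        b, d = 0, 1079
    elif 2 * sy > 3780:
        b, d = 3240, 4319
    else:
        b, d = 2 * sy - 540, 2 * sy + 540
    return (a, b), (c, d)
```

For example, for original coordinates (1000, 1000) and (600, 800) the formulas place both sums in the change areas and yield the window (2240, 3060), (4160, 4140).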
in this case the image frame of the video collected by the camera device comprises two first fixed areas in the abscissa direction: one where the position abscissa is smaller than 480 and one where it is larger than 3360; and two second fixed areas in the ordinate direction: one where the position ordinate is smaller than 270 and one where it is larger than 1890;
specifically, if the position abscissa is smaller than the first threshold, here 480, it lies in one first fixed area, and the preset values corresponding to that threshold in the target conversion relation, namely a = 0 and c = 1919, are taken as the pixel abscissas in the target position information. If the position abscissa is larger than the second threshold, here 3360, it lies in the other first fixed area, and the preset values a = 5760 and c = 7679 are taken as the pixel abscissas. If the position abscissa satisfies neither condition, it lies in the first change area, and the abscissa conversion formulas a = 2(x1 + x2) - 960 and c = 2(x1 + x2) + 960 give the pixel abscissas. Similarly, if the position ordinate is smaller than the third threshold, here 270, it lies in one second fixed area, and the preset values b = 0 and d = 1079 are taken as the pixel ordinates; if the position ordinate is larger than the fourth threshold, here 1890, it lies in the other second fixed area, and the preset values b = 3240 and d = 4319 are taken as the pixel ordinates; otherwise it lies in the second change area, and the ordinate conversion formulas b = 2(y1 + y2) - 540 and d = 2(y1 + y2) + 540 give the pixel ordinates in the target position information.
Take (x1, y1) = (1000, 1000) and (x2, y2) = (600, 800) as an example.
Since |x2 - x1| = |600 - 1000| = 400 and 400 × 4 = 1600 ≤ 1920, and |y2 - y1| = |800 - 1000| = 200 and 200 × 4 = 800 ≤ 1080, the target position information is determined by using the resolution conversion relationship corresponding to 4|x2 - x1| ≤ 1920 and 4|y2 - y1| ≤ 1080.
With an 8k image pickup device, the resolution conversion relationship corresponding to 4|x2 - x1| ≤ 1920 and 4|y2 - y1| ≤ 1080 is applied:
since x1 + x2 = 600 + 1000 = 1600, which is larger than 480 and smaller than 3360, the abscissa conversion relationship applies: a = 2(x1 + x2) - 960 = 2240, c = 2(x1 + x2) + 960 = 4160;
since y1 + y2 = 800 + 1000 = 1800, which is larger than 270 and smaller than 1890, b = 2(y1 + y2) - 540 = 2(1000 + 800) - 540 = 3060, d = 2(y1 + y2) + 540 = 2(1000 + 800) + 540 = 4140;
the target position information is therefore (2240, 3060), (4160, 4140).
When the image pickup device is 4k, if 2|x2 - x1| > 1920 or 2|y2 - y1| > 1080, the resolution of the clipped candidate image would be larger than the resolution 1080p corresponding to the network bandwidth; even if the resolution scale factor corresponding to the network bandwidth is increased, the resolution of the full frame is only 4k, so the target area does not need to be clipped, the original image only needs to be compressed to the resolution corresponding to the network bandwidth and then transmitted, and the zoom-in instruction is not executed.
When the image pickup device is 8k, if 4|x2 - x1| > 3840 or 4|y2 - y1| > 2160, the resolution of the clipped candidate image would be larger than 16 times the resolution 1080p corresponding to the network bandwidth; even if the resolution scale factor corresponding to the network bandwidth is increased, the resolution of the full frame is only 8k, so the target area does not need to be clipped, the original image only needs to be compressed to the resolution corresponding to the network bandwidth and then transmitted, and the zoom-in instruction is not executed.
In an actual application process, if the resolution of the target area is different from the resolution corresponding to the network bandwidth, in order to ensure normal transmission of the video stream, the embodiment of the present invention further provides the following manner:
The resolution of the target area is compressed into the target resolution, and the converted target area is then sent to the second display device, which displays it.
For example, if the resolution of the target area is 4k and the resolution corresponding to the network bandwidth is 1080p, the 4k needs to be compressed into 1080p and then transmitted to the second display device.
Since picture distortion may occur when converting between different resolutions, especially when a small resolution is upscaled to a large resolution, the resolution of the second display device may generally be set to be the same as the target resolution in order to improve the definition of the image.
Illustratively, when displaying on the second display device side: determining the magnification factor of the target area according to the original position information and the target position information;
and displaying the region needing to be magnified in the magnified target region by taking the center of the region needing to be magnified in the target region as the display center of the second display device.
Wherein the magnification of the video picture, e.g. 1, 2, 4, etc., is preset at the second display device (presetting the magnification helps reduce the amount of computation). The video picture is magnified with the target area as the center. FFmpeg, a general-purpose tool, may be used for real-time video scaling. The video zoom magnification is determined by the following rules:
First, the target position information (a, b), (c, d) at the resolution of the image pickup device is converted into target position information (a1, b1), (c1, d1) at the same resolution as the original position information, that is, at the resolution of the second display device.
For example, if the resolution of the image pickup device is 4k and the resolution of the second display device is 1080p, the abscissa of the target position information at the resolution of the image pickup device is between 0 and 3840 and the ordinate is between 0 and 2160, while after conversion to the resolution of the second display device the abscissa is between 0 and 1920 and the ordinate is between 0 and 1080. Because of the 4-fold relationship between 4k and 1080p (a 2-fold relationship in each coordinate direction), a1 = a/2, b1 = b/2, c1 = c/2, and d1 = d/2.
If the resolution of the image pickup device is 8k and the resolution of the second display device is 1080p, because of the 16-fold relationship between 8k and 1080p (a 4-fold relationship in each coordinate direction), a1 = a/4, b1 = b/4, c1 = c/4, and d1 = d/4.
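The per-axis division described above can be sketched as follows, assuming a 1080p second display device; the function name is illustrative only:

```python
def to_display_resolution(a, b, c, d, camera='4k'):
    """Convert target position information from the camera resolution to the
    1080p display resolution; 4k is 2x per axis, 8k is 4x per axis."""
    k = 2 if camera == '4k' else 4   # per-axis multiple relative to 1080p
    return (a // k, b // k), (c // k, d // k)
```

For example, the 8k target position information (2240, 3060), (4160, 4140) converts to (560, 765), (1040, 1035) at 1080p.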
After the conversion is completed, the magnification is obtained from the target position information (a1, b1), (c1, d1) at the resolution of the second display device and the original position information (x1, y1), (x2, y2) of the region to be enlarged at the resolution of the second display device:
if |c1 - a1|/|x2 - x1| < 2 or |d1 - b1|/|y2 - y1| < 2, the zoom factor is 1 and the picture is not zoomed;
if 2 ≤ |c1 - a1|/|x2 - x1| < 4 and |d1 - b1|/|y2 - y1| ≥ 2, or 2 ≤ |d1 - b1|/|y2 - y1| < 4 and |c1 - a1|/|x2 - x1| ≥ 2, the zoom factor is 2 and the picture is magnified to twice its size;
if |c1 - a1|/|x2 - x1| ≥ 4 and |d1 - b1|/|y2 - y1| ≥ 4, the zoom factor is 4 and the picture is magnified to four times its size.
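The three zoom rules above can be sketched as a single function; this is a minimal illustration, with an illustrative name:

```python
def zoom_factor(a1, b1, c1, d1, x1, y1, x2, y2):
    """Zoom factor from the converted target position information (a1,b1),(c1,d1)
    and the original position information (x1,y1),(x2,y2)."""
    rx = abs(c1 - a1) / abs(x2 - x1)   # per-axis ratio, abscissa
    ry = abs(d1 - b1) / abs(y2 - y1)   # per-axis ratio, ordinate
    if rx < 2 or ry < 2:
        return 1        # picture is not zoomed
    if rx >= 4 and ry >= 4:
        return 4        # picture magnified to four times its size
    return 2            # both ratios >= 2 but not both >= 4
```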
After the target area is intercepted according to the target conversion relationship, the target position information, the magnification factor and the resolution information of the image pickup device can be transmitted to the second display device together with the target area, and the second display device determines the magnification after receiving them.
The magnification factor indicates the relationship between the resolution of the intercepted target area and the resolution of the image pickup device: when the target area cannot be intercepted, the magnification factor is 0; when the resolution of the target area is 1080p and the resolution of the image pickup device is 8k, the magnification factor is 4; when the resolution of the target area is 1080p and the resolution of the image pickup device is 4k, the magnification factor is 2; and when the resolution of the target area is 4k and the resolution of the image pickup device is 8k, the magnification factor is 2.
Specifically, when the image pickup device is 4k and the candidate area satisfies 2|x2 - x1| ≤ 1920 and 2|y2 - y1| ≤ 1080, the magnification factor is 2; when the candidate area does not satisfy this condition, the magnification factor is 0. When the image pickup device is 8k and the candidate area satisfies 1920 < 4|x2 - x1| ≤ 3840 and 4|y2 - y1| ≤ 2160, or 1080 < 4|y2 - y1| ≤ 2160 and 4|x2 - x1| ≤ 3840, the magnification factor is 2; when the image pickup device is 8k and the candidate area satisfies 4|x2 - x1| ≤ 1920 and 4|y2 - y1| ≤ 1080, the magnification factor is 4; when the candidate area does not satisfy the 8k conditions, that is, 4|x2 - x1| > 3840 or 4|y2 - y1| > 2160, the magnification factor is 0.
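The magnification-factor conditions above can be sketched as follows, using ≤ throughout for the boundary cases; the function name and the camera labels are illustrative:

```python
def magnification_factor(camera, x1, y1, x2, y2):
    """Magnification factor of the intercepted target area relative to the
    camera resolution, per the candidate-area conditions above."""
    w, h = abs(x2 - x1), abs(y2 - y1)
    if camera == '4k':
        return 2 if 2 * w <= 1920 and 2 * h <= 1080 else 0
    # 8k camera
    if 4 * w > 3840 or 4 * h > 2160:
        return 0        # candidate exceeds even a 4k window: no crop
    if 4 * w <= 1920 and 4 * h <= 1080:
        return 4        # 1080p window cut from an 8k frame
    return 2            # 4k window cut from an 8k frame
```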
When in network communication with the second display device, the target position information, the magnification factor and the target area can be transmitted together, so that after the second display device receives them, the magnification factor is determined directly rather than through the above algorithm.
When the magnification factor is 0, the target position information is the position information of the entire image area of the video frame acquired by the image pickup device: if the image pickup device is 4k, the target position information is (0, 2159), (3839, 0); if the image pickup device is 8k, the target position information is (0, 4319), (7679, 0). The target position information uses the same pixel corners as the original position information; for example, if the pixel coordinates in the original position information are the upper-left corner and the lower-right corner, the pixel coordinates in the target position information are also the upper-left corner and the lower-right corner.
The display center of the second display device may be the center of the screen of the second display device, or the center of the display area selected by the user.
When the target resolution is the same as the resolution of the second display device, for example both are 1080p, the resolution of the target area acquired by the second display device is 1080p. Since the area of the target area is larger than the area of the region to be enlarged, the resolution of the region to be enlarged within the target area is not 1080p, and the embodiment of the present invention provides the following display manner:
If the aspect ratio of the region to be enlarged is not smaller than the display scale of the second display device, that is, the number of pixels of the region to be enlarged in the ordinate direction is relatively small, the center of the region to be enlarged in the target area is taken as the display center of the second display device, the region to be enlarged in the target area is enlarged and displayed normally in the ordinate direction, and enlarged and displayed according to a first preset relationship in the abscissa direction.
If the aspect ratio of the region to be enlarged is smaller than the display scale of the second display device, that is, the number of pixels of the region to be enlarged in the abscissa direction is relatively small, the center of the region to be enlarged in the target area is taken as the display center of the second display device, the region to be enlarged in the target area is enlarged and displayed normally in the abscissa direction, and enlarged and displayed according to a second preset relationship in the ordinate direction.
Illustratively, when the display scale is 1920 × 1080, if |x2 - x1| : |y2 - y1| ≥ 1920 : 1080, the center of the region to be enlarged in the target area is taken as the display center of the second display device, the region is displayed normally in the ordinate direction with the enlarged length y' = 1080 in the ordinate direction, and displayed in the abscissa direction according to |x2 - x1| × 1080/|y2 - y1|, with the enlarged length x' = |x2 - x1| × 1080/|y2 - y1| in the abscissa direction. The coordinates of the region to be enlarged after enlarged display are (960 - x'/2, 540 - y'/2), (960 + x'/2, 540 + y'/2).
If |x2 - x1| : |y2 - y1| < 1920 : 1080, the center of the region to be enlarged in the target area is taken as the display center of the second display device, the region is displayed normally in the abscissa direction with the enlarged length x' = 1920 in the abscissa direction, and displayed in the ordinate direction according to |y2 - y1| × 1920/|x2 - x1|, with the enlarged length y' = |y2 - y1| × 1920/|x2 - x1| in the ordinate direction. The coordinates of the region to be enlarged after enlarged display are (960 - x'/2, 540 - y'/2), (960 + x'/2, 540 + y'/2).
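The two display cases above can be sketched as follows, assuming a 1920 × 1080 screen; the function name is illustrative, and the returned coordinates follow the centered placement described above:

```python
def enlarged_display(x1, y1, x2, y2, sw=1920, sh=1080):
    """Compute the enlarged size (x', y') and the display coordinates of the
    region to be enlarged, centered on a sw x sh screen."""
    w, h = abs(x2 - x1), abs(y2 - y1)
    if w * sh >= h * sw:       # |x2-x1| : |y2-y1| >= 1920 : 1080
        yp = sh                # normal display in the ordinate direction
        xp = w * sh / h        # first preset relationship for the abscissa
    else:                      # |x2-x1| : |y2-y1| < 1920 : 1080
        xp = sw                # normal display in the abscissa direction
        yp = h * sw / w        # second preset relationship for the ordinate
    top_left = (sw / 2 - xp / 2, sh / 2 - yp / 2)
    bottom_right = (sw / 2 + xp / 2, sh / 2 + yp / 2)
    return (xp, yp), top_left, bottom_right
```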
For example, after displaying in the above manner, the frame selects the image of the computer 604 in fig. 4; this image is the region to be enlarged in the target area, and it has a relatively small number of pixels in the ordinate direction, so the center of the region to be enlarged in the target area is taken as the display center of the second display device, the screen is covered in the ordinate direction, and the abscissa direction is displayed according to |x2 - x1| × 1080/|y2 - y1|, so that the screen is covered in the ordinate direction and a little of the target area outside the region to be enlarged is displayed in the abscissa direction.
When the second user performs enlargement processing again on the region to be enlarged in the target area, the region framed this time is selected within the enlarged display, whose coordinates differ from those at the resolution of the second display device; the second display device would otherwise need to record which area each piece of original position information was framed in, which is troublesome to operate.
Therefore, when the second user performs enlargement processing again on the region to be enlarged in the target area, the newly framed position information needs to be converted into pixel coordinates at the resolution of the second display device.
Referring to fig. 7, the original position information of the region to be enlarged in the video frame is (x1, y1), (x2, y2), and the coordinates of the region after enlarged display are (960 - x'/2, 540 - y'/2), (960 + x'/2, 540 + y'/2). The user again frames an enlarged region within the region displayed after enlargement, where the pixel coordinates of the upper-left vertex of the newly framed region are (m, n) and those of the lower-right vertex are (α, β). The embodiment of the present invention converts (m, n), (α, β) into pixel coordinates at the resolution of the second display device.
Taking the resolution of the second display device as 1080p as an example:
if 960 - x'/2 = 0, (m, n) is converted into the pixel coordinates (m(x2 - x1)/1920 + x1, y1 + (n - (540 - y'/2))(x2 - x1)/1920) at the resolution of the second display device;
likewise, (α, β) is converted into the pixel coordinates (α(x2 - x1)/1920 + x1, y1 + (β - (540 - y'/2))(x2 - x1)/1920) at the resolution of the second display device;
if 540 - y'/2 = 0, (m, n) is converted into the pixel coordinates (x1 + (m - (960 - x'/2))(y2 - y1)/1080, y1 + n(y2 - y1)/1080) at the resolution of the second display device;
likewise, (α, β) is converted into the pixel coordinates (x1 + (α - (960 - x'/2))(y2 - y1)/1080, y1 + β(y2 - y1)/1080) at the resolution of the second display device;
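This inverse conversion can be sketched as follows, assuming a 1080p second display device and the enlarged-display size (x', y') derived earlier; the function name is illustrative:

```python
def reframed_to_display_coords(m, n, x1, y1, x2, y2, xp, yp):
    """Convert a point (m, n) framed inside the enlarged display back to pixel
    coordinates at the 1080p resolution of the second display device."""
    if 960 - xp / 2 == 0:                 # enlarged region fills the screen width
        s = (x2 - x1) / 1920
        return (m * s + x1, y1 + (n - (540 - yp / 2)) * s)
    else:                                 # 540 - yp/2 == 0: fills the screen height
        s = (y2 - y1) / 1080
        return (x1 + (m - (960 - xp / 2)) * s, y1 + n * s)
```

For instance, when the region (0, 0), (960, 540) is enlarged to fill the full screen (x' = 1920, y' = 1080), the screen corner (1920, 1080) maps back to (960, 540).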
After the new pixel coordinates are obtained, for example, the converted coordinates of (m, n) and (α, β) are sent to the camera, and the camera intercepts the corresponding area according to the new coordinates and sends it to the doctor's end.
When the camera is 4k, a new target area is intercepted from the 4k image according to the converted pixel coordinates of (m, n) and (α, β).
When the camera is 8k, a new target area is intercepted from the 8k image according to the converted pixel coordinates of (m, n) and (α, β).
The difference from the 4k camera lies in the 8k camera's higher-definition output capability. For example, when the second user first frames a region to be enlarged, the camera can intercept a 4k area, compress it to 1080p and send it to the second user; after the second user frames an enlarged region a second time, the camera can still intercept a 4k area according to the new coordinates, compress it to 1080p and send it to the second user.
Based on the above description, an embodiment of the present invention provides a video interaction method applied to a camera device for capturing a video of a first user, which is described with reference to fig. 8, and includes:
s800: in the process of carrying out video by a first user and a second user, if an amplification instruction is received, determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display equipment;
wherein the zoom-in instruction is triggered by a second user on a second display device; the target resolution is a resolution corresponding to a network bandwidth of a network connected between the first display device and the second display device; the second display device is used by a second user for video; the first display device is used by a first user for video;
s801: converting original position information of a region needing to be enlarged under the resolution of the second display equipment selected by a second user into target position information under the resolution of the camera device according to the target conversion relation;
s802: intercepting a target area from each frame after a target image frame in a video containing a first user according to target position information;
the target image frame is an image frame in the video when a second user triggers an amplification instruction;
s803: and sending the intercepted target area to second display equipment through the first display equipment so that the second display equipment displays the area needing to be amplified in the target area.
Optionally, the original position information is an original pixel coordinate at the resolution of the second display device; determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display equipment, wherein the target conversion relation comprises the following steps:
converting the original pixel coordinates into pixel coordinates of the resolution of the camera device according to a multiple relation between the resolution of the camera device and the resolution of the second display device;
intercepting a candidate area from an image frame of a video acquired by the camera device according to the pixel coordinate of the resolution of the camera device;
if the number of pixels in the horizontal coordinate direction of the candidate area is within a first preset range and the number of pixels in the vertical coordinate direction is within a second preset range, taking a resolution conversion relation corresponding to the first preset range and the second preset range as a target conversion relation, wherein the first preset range and the second preset range are both determined according to a target resolution.
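The steps above for choosing the target conversion relationship can be sketched as follows, assuming the target resolution corresponding to the network bandwidth is 1080p; the function name and the returned labels are illustrative only:

```python
def select_target_conversion(camera, x1, y1, x2, y2):
    """Pick which resolution conversion relationship applies, based on the
    candidate area's pixel counts per axis at the camera resolution."""
    k = 2 if camera == '4k' else 4       # per-axis multiple camera vs. 1080p display
    w, h = k * abs(x2 - x1), k * abs(y2 - y1)
    if w <= 1920 and h <= 1080:
        return '1080p window'            # intercept a 1920x1080 target area
    if camera == '8k' and w <= 3840 and h <= 2160:
        return '4k window'               # intercept a 3840x2160 target area
    return 'no crop'                     # compress the whole frame instead
```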
Optionally, the image frame of the video captured by the camera device includes at least one first fixed region and at least one first variable region in the abscissa direction; the image frame of the video collected by the camera device comprises at least one second fixed area and at least one second variable area in the vertical coordinate direction;
converting original position information of a region to be enlarged at the resolution of the second display device selected by the second user into target position information at the resolution of the image pickup apparatus according to the target conversion relationship, including:
if the position abscissa is in the first fixed area, taking a preset pixel abscissa in the target conversion relation as a pixel abscissa in target position information; wherein the position abscissa is determined according to the pixel abscissa in the original pixel coordinates;
if the position abscissa is in the first change area, inputting the pixel abscissa in the original pixel coordinate, and taking a value obtained after the abscissa in the target conversion relation is converted as the pixel abscissa in the target position information;
if the position ordinate is in the second fixed area, taking a preset pixel ordinate in the target conversion relation as a pixel ordinate in target position information; wherein the position ordinate is determined according to the pixel ordinate in the original pixel coordinate;
if the position ordinate is in the second change area, inputting the pixel ordinate in the original pixel coordinate, and taking a value obtained after the ordinate in the target conversion relation is converted as the pixel ordinate in the target position information;
and forming target position information by using the horizontal coordinates of the pixels in the target position information and the vertical coordinates of the pixels in the target position information.
Based on the above description, an embodiment of the present invention provides a video interaction method applied to a first display device for video use by a first user, which is described in conjunction with fig. 9, and includes:
s900: in the process of carrying out video by a first user and a second user, if an amplification instruction is received, determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display equipment;
wherein the zoom-in instruction is triggered on the second display device by the second user; the target resolution is a resolution corresponding to a network bandwidth of a network connected between the first display device and the second display device; the second display device is a display device used by the second user for video; the first display device is a display device used by the first user for video;
s901: converting original position information of a region needing to be enlarged under the resolution of the second display equipment selected by a second user into target position information under the resolution of the camera device according to the target conversion relation;
s902: intercepting a target area from each frame after a target image frame in a video containing a first user according to target position information;
the target image frame is an image frame in the video when a second user triggers an amplification instruction;
s903: and sending the intercepted target area to second display equipment so that the second display equipment displays the area needing to be amplified in the target area.
Optionally, the original position information is an original pixel coordinate at the resolution of the second display device; determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display equipment, wherein the target conversion relation comprises the following steps:
converting the original pixel coordinates into pixel coordinates of the resolution of the camera device according to a multiple relation between the resolution of the camera device and the resolution of the second display device;
intercepting a candidate area from an image frame of a video acquired by the camera device according to the pixel coordinate of the resolution of the camera device;
if the number of pixels in the horizontal coordinate direction of the candidate area is within a first preset range and the number of pixels in the vertical coordinate direction is within a second preset range, taking a resolution conversion relation corresponding to the first preset range and the second preset range as a target conversion relation, wherein the first preset range and the second preset range are both determined according to a target resolution.
Optionally, the image frame of the video captured by the camera device includes at least one first fixed region and at least one first variable region in the abscissa direction; the image frame of the video collected by the camera device comprises at least one second fixed area and at least one second variable area in the vertical coordinate direction;
converting original position information of a region to be enlarged at the resolution of the second display device selected by the second user into target position information at the resolution of the image pickup apparatus according to the target conversion relationship, including:
if the position abscissa is in the first fixed area, taking a preset pixel abscissa in the target conversion relation as a pixel abscissa in target position information; wherein the position abscissa is determined according to the pixel abscissa in the original pixel coordinates;
if the position abscissa is in the first change area, inputting the pixel abscissa in the original pixel coordinate, and taking a value obtained after the abscissa in the target conversion relation is converted as the pixel abscissa in the target position information;
if the position ordinate is in the second fixed area, taking a preset pixel ordinate in the target conversion relation as a pixel ordinate in target position information; wherein the position ordinate is determined according to the pixel ordinate in the original pixel coordinate;
if the position ordinate is in the second change area, inputting the pixel ordinate in the original pixel coordinate, and taking a value obtained after the ordinate in the target conversion relation is converted as the pixel ordinate in the target position information;
and forming target position information by using the horizontal coordinates of the pixels in the target position information and the vertical coordinates of the pixels in the target position information.
Referring to fig. 10, another video interaction method provided in an embodiment of the present invention is applied to a second display device used by a second user for video, and includes:
s1000: in the process of a video between a first user and a second user, if an amplification instruction triggered by the second user is received, the amplification instruction is sent to the first display device;
s1001: receiving a target area; the target area is obtained by intercepting a video containing a first user according to the position information of an area needing to be amplified selected by a second user;
s1002: and displaying the area needing to be magnified in the target area.
Optionally, displaying a region of the target region that needs to be enlarged includes:
determining the magnification of the target area according to the original position information and the target position information;
and displaying the amplified region to be amplified in the target region by taking the center of the region to be amplified in the target region as the display center of the second display device.
An embodiment of the present invention provides an image pickup apparatus, as shown in fig. 11, including: an acquisition unit 1100, a communication unit 1101, and a processor 1102;
the acquisition unit 1100 is configured to acquire a video of a first user during a video process performed by the first user and a second user;
the processor 1102 is configured to, in a process of performing a video by a first user and a second user, if an amplification instruction triggered by the second user is received, intercept a target area from a video including the first user according to location information of an area to be amplified selected by the second user; wherein the zoom-in instruction is triggered by the second user on a second display device; the second display device is used by a second user for video;
the communication unit 1101 is configured to receive an enlargement instruction sent by a first display device, and forward the target area to the second display device through the first display device, where the first display device is a display device used by a first user for video.
An embodiment of the present invention provides another imaging apparatus, similarly, including: the device comprises an acquisition unit, a communication unit and a processor;
the acquisition unit is used for acquiring the video of the first user in the video process of the first user and the second user;
the processor is used for determining a target conversion relation according to the resolution of the camera device, the target resolution and the resolution of the second display device if an amplification instruction is received in the process of carrying out video by the first user and the second user; wherein the zoom-in instruction is triggered on the second display device by the second user; the second display device is a display device used by the second user for video; the target resolution is a resolution corresponding to a network bandwidth of a network connected between first display equipment and second display equipment used by the first user for video;
converting original position information of a region needing to be enlarged under the resolution of the second display equipment selected by the second user into target position information under the resolution of the camera device according to the target conversion relation;
intercepting a target area from each frame after a target image frame in a video containing a first user according to the target position information; the target image frame is an image frame in a video when the second user triggers the amplification instruction;
the communication unit is configured to receive an amplification instruction sent by the first display device, and forward the target area to the second display device through the first display device.
Optionally, the original position information is an original pixel coordinate at the resolution of the second display device; the processor is specifically configured to:
converting the original pixel coordinates into pixel coordinates of the resolution of the camera device according to a multiple relation between the resolution of the camera device and the resolution of the second display device;
intercepting a candidate area from an image frame of a video acquired by the camera device according to the pixel coordinate of the resolution of the camera device;
if the number of pixels in the horizontal coordinate direction of the candidate area is within a first preset range and the number of pixels in the vertical coordinate direction is within a second preset range, taking a resolution conversion relation corresponding to the first preset range and the second preset range as a target conversion relation, wherein the first preset range and the second preset range are both determined according to the target resolution.
Optionally, an image frame of the video captured by the camera device includes at least one first fixed region and at least one first variable region in the abscissa direction, and at least one second fixed region and at least one second variable region in the ordinate direction;
the processor is specifically configured to:
if the position abscissa falls in a first fixed region, take a preset pixel abscissa in the target conversion relation as the pixel abscissa in the target position information; wherein the position abscissa is determined according to the pixel abscissa of the original pixel coordinates;
if the position abscissa falls in a first variable region, take the value obtained by converting the pixel abscissa of the original pixel coordinates through the abscissa conversion of the target conversion relation as the pixel abscissa in the target position information;
if the position ordinate falls in a second fixed region, take a preset pixel ordinate in the target conversion relation as the pixel ordinate in the target position information; wherein the position ordinate is determined according to the pixel ordinate of the original pixel coordinates;
if the position ordinate falls in a second variable region, take the value obtained by converting the pixel ordinate of the original pixel coordinates through the ordinate conversion of the target conversion relation as the pixel ordinate in the target position information;
and form the target position information from the resulting pixel abscissa and pixel ordinate.
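A minimal sketch of the fixed/variable-region mapping just described, handled one coordinate at a time. The region boundaries, preset values, and the conversion callable below are placeholders assumed for illustration only:

```python
def map_coordinate(pos, fixed_regions, variable_regions, preset, convert):
    """If pos lies in a fixed region, use the preset value from the target
    conversion relation; if it lies in a variable region, feed pos through
    the relation's conversion. Regions are (low, high) inclusive ranges."""
    for lo, hi in fixed_regions:
        if lo <= pos <= hi:
            return preset
    for lo, hi in variable_regions:
        if lo <= pos <= hi:
            return convert(pos)
    raise ValueError("coordinate lies outside every region")

def to_target_position(orig_x, orig_y, x_spec, y_spec):
    """Combine the mapped abscissa and ordinate into target position info.
    Each spec is (fixed_regions, variable_regions, preset, convert)."""
    return (map_coordinate(orig_x, *x_spec), map_coordinate(orig_y, *y_spec))
```

The abscissa and ordinate are mapped independently, matching the four cases in the text: fixed/abscissa, variable/abscissa, fixed/ordinate, variable/ordinate.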
An embodiment of the present invention provides a second display device, which is shown in fig. 12, and includes: a communication unit 1200, a processor 1201, and a display 1202;
the communication unit 1200 is configured to send a zoom-in instruction to a first display device used by a first user for the video call, and to receive a target area sent by the first display device; the target area is intercepted from the video containing the first user according to the position information of the area, selected by a second user, that needs to be enlarged;
the display 1202 is configured to display a region of the target region that needs to be enlarged;
the processor 1201 is configured to, during a video call between the first user and the second user, if a zoom-in instruction triggered by the second user is received, control the communication unit 1200 to send the zoom-in instruction to the first display device and control the display 1202 to display the area that needs to be enlarged within the target area.
An embodiment of the present invention provides a first display device, which includes a controller, a communicator, an external device interface, and a display, wherein the communicator is connected with the controller;
the external device interface is connected with the camera device and used for receiving the video which is acquired by the camera device and contains the first user;
the controller is configured to, during a video call between the first user and the second user, if a zoom-in instruction is received, determine a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device; wherein the zoom-in instruction is triggered on the second display device by the second user; the second display device is the display device used by the second user for the video call; the target resolution is the resolution corresponding to the network bandwidth of the network connecting the first display device and the second display device;
converting, according to the target conversion relation, the original position information, at the resolution of the second display device, of the area selected by the second user that needs to be enlarged into target position information at the resolution of the camera device; and intercepting a target area from each frame after the target image frame in the video containing the first user according to the target position information; the target image frame is the image frame in the video at the moment the second user triggers the zoom-in instruction;
the communicator is used for receiving the amplification instruction sent by the second display equipment, receiving the video containing the second user sent by the second display equipment and sending the target area to the second display equipment;
the display is used for displaying the video containing the second user.
Optionally, the display is further configured to display an area of the target area that needs to be enlarged in a form of a small window.
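The per-frame cropping that the controller performs after the target image frame can be sketched as below. NumPy-style frames and the (x, y, w, h) box convention are assumptions for illustration; the patent itself does not prescribe a data representation:

```python
import numpy as np

def crop_after_target(frames, target_index, box):
    """Yield the target area cropped from every frame after the target
    image frame; box = (x, y, w, h) in camera-resolution pixel coordinates."""
    x, y, w, h = box
    for frame in frames[target_index + 1:]:
        yield frame[y:y + h, x:x + w]  # rows are ordinates, columns abscissas
```

Cropping only frames after the target image frame matches the text: the target image frame is the frame at the moment the zoom-in instruction was triggered, and enlargement applies from that point onward.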
When the first display device is a television, another possible structure of the first display device is shown in fig. 13.
In some embodiments, the first display apparatus includes at least one of a tuner demodulator 1310, a communicator 1320, a detector 1330, an external device interface 1340, a controller 1350, a display 1360, an audio output interface 1370, a memory, a power supply, a user interface.
In some embodiments the controller comprises a central processor, a video processor, an audio processor, a graphics processor, a RAM, a ROM, a first interface to an nth interface for input/output.
In some embodiments, the display 1360 includes a display screen component for presenting pictures and a driving component for driving image display, and is used to receive image signals output by the controller and to display video content, image content, menu manipulation interfaces, user manipulation UI interfaces, and the like.
In some embodiments, the display 1360 may be at least one of a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
In some embodiments, the tuner demodulator 1310 receives broadcast television signals via wired or wireless reception, and demodulates the audio/video signals, as well as ancillary data such as EPG data signals, from the plurality of wireless or wired broadcast television signals.
In some embodiments, communicator 1320 is a component for communicating with external devices or servers according to various communication protocol types. For example: the communicator may include at least one of a Wifi module, a bluetooth module, a wired ethernet module, and other network communication protocol chips or near field communication protocol chips, and an infrared receiver. The first display device may establish transmission and reception of a control signal and a data signal with the second display device through the communicator 1320.
In some embodiments, detector 1330 is used to collect signals of the external environment or interaction with the outside. For example, detector 1330 includes a light receiver, a sensor for collecting the intensity of ambient light; alternatively, the detector 1330 includes an image collector, such as a camera, which can be used to collect external environment scenes, attributes of the user, or user interaction gestures, or alternatively, the detector 1330 includes a sound collector, such as a microphone, which is used to receive external sounds.
In some embodiments, external device interface 1340 may include, but is not limited to, the following: high Definition Multimedia Interface (HDMI), analog or data high definition component input interface (component), composite video input interface (CVBS), USB input interface (USB), RGB port, and the like. The interface may be a composite input/output interface formed by the plurality of interfaces.
In some embodiments, the controller 1350 and the tuner demodulator 1310 may be located in separate devices; that is, the tuner demodulator 1310 may also be located in a device external to the main device containing the controller 1350, such as an external set-top box.
In some embodiments, the controller 1350 controls the overall operation of the first display device and responds to user operations through various software control programs stored in the memory. For example, in response to receiving a user command for selecting a UI object displayed on the display 1360, the controller 1350 may perform the operation related to the object selected by the user command.
In some embodiments, the object may be any one of selectable objects, such as a hyperlink, an icon, or other actionable control. The operations related to the selected object are: displaying an operation connected to a hyperlink page, document, image, or the like, or performing an operation of a program corresponding to the icon.
In some embodiments, the controller comprises at least one of a Central Processing Unit (CPU), a video processor, an audio processor, a Graphics Processing Unit (GPU), a Random Access Memory (RAM), a Read-Only Memory (ROM), first through nth interfaces for input/output, a communication bus (Bus), and the like.
The CPU is used to execute operating system and application program instructions stored in the memory, and to execute various applications, data, and content in response to interactive instructions received from external input, so as to ultimately display and play the various audio and video contents. The CPU may include multiple processors, for example a main processor and one or more sub-processors.
In some embodiments, the graphics processor is used to generate various graphics objects, such as at least one of icons, operation menus, and graphics displayed for user input instructions. The graphics processor includes an arithmetic unit, which operates on the various interactive instructions input by the user and displays objects according to their display attributes, and a renderer, which renders the objects produced by the arithmetic unit for display on the display.
In some embodiments, the video processor is configured to receive an external video signal, and perform at least one of video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to a standard codec protocol of the input signal, so as to obtain a signal directly displayable or playable on the first display device.
In some embodiments, the video processor includes at least one of a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like. The demultiplexing module demultiplexes the input audio/video data stream. The video decoding module processes the demultiplexed video signal, including decoding, scaling, and the like. The image synthesis module superimposes and mixes the GUI signal, input by the user or generated by the graphics generator, with the scaled video image to generate an image signal for display. The frame rate conversion module converts the frame rate of the input video. The display formatting module converts the frame-rate-converted video output signal into a signal conforming to the display format, such as an output RGB data signal.
In some embodiments, the audio processor is configured to receive an external audio signal, decompress and decode the received audio signal according to a standard codec protocol of the input signal, and perform at least one of noise reduction, digital-to-analog conversion, and amplification processing to obtain a sound signal that can be played in the speaker.
In some embodiments, the user may enter user commands on a Graphical User Interface (GUI) displayed on the display 1360, and the user input interface receives the user input commands through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
In some embodiments, a "user interface" is a media interface for interaction and information exchange between an application or operating system and a user that enables conversion between an internal form of information and a form that is acceptable to the user. A commonly used presentation form of the User Interface is a Graphical User Interface (GUI), which refers to a User Interface related to computer operations and displayed in a graphical manner. It may be an interface element such as an icon, a window, a control, etc. displayed in the display screen of the first display device, where the control may include at least one of an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, a Widget, etc. visual interface elements.
In some embodiments, the user interface 1380 is an interface that can be used to receive control inputs (e.g., physical buttons on the first display device body, or the like).
In some embodiments, the system of the first display device may include a kernel (Kernel), a command parser (shell), a file system, and application programs. The kernel, shell, and file system together make up the basic operating system structure that allows users to manage files, run programs, and use the system. After power-on, the kernel starts, activates kernel space, abstracts the hardware, initializes hardware parameters, and then runs and maintains virtual memory, the scheduler, signals, and inter-process communication (IPC). After the kernel starts, the shell and user applications are loaded. After startup, an application program is loaded into memory as machine code and forms a process.
In an exemplary embodiment, a storage medium comprising instructions, such as a memory comprising instructions, executable by a processor to perform the above-described method of video interaction is also provided. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
An embodiment of the present invention further provides a first computer program product which, when run on an image capturing apparatus, causes the image capturing apparatus to perform any one of the video interaction methods described above in the embodiments of the present invention.
An embodiment of the present invention further provides a second computer program product which, when run on a first display device, causes the first display device to perform any one of the video interaction methods described above in the embodiments of the present invention.
An embodiment of the present invention further provides a third computer program product which, when run on a second display device, causes the second display device to perform any one of the video interaction methods described above in the embodiments of the present invention.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. An image pickup apparatus, comprising: an acquisition unit, a communication unit, and a processor;
the acquisition unit is configured to capture the video of the first user during a video call between the first user and a second user;
the processor is configured to, during the video call between the first user and the second user, if a zoom-in instruction triggered by the second user is received, intercept a target area from the video containing the first user according to the position information of the area, selected by the second user, that needs to be enlarged; wherein the zoom-in instruction is triggered by the second user on a second display device; the second display device is the display device used by the second user for the video call;
the communication unit is configured to receive the zoom-in instruction sent by a first display device, and to forward the target area to the second display device through the first display device, where the first display device is the display device used by the first user for the video call.
2. An image pickup apparatus, comprising: an acquisition unit, a communication unit, and a processor;
the acquisition unit is configured to capture the video of the first user during a video call between the first user and a second user;
the processor is configured to, during the video call between the first user and the second user, if a zoom-in instruction is received, determine a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device; wherein the zoom-in instruction is triggered on the second display device by the second user; the second display device is the display device used by the second user for the video call; the target resolution is the resolution corresponding to the network bandwidth of the network connecting the second display device and a first display device used by the first user for the video call;
converting, according to the target conversion relation, the original position information, at the resolution of the second display device, of the area selected by the second user that needs to be enlarged into target position information at the resolution of the camera device;
intercepting a target area from each frame after the target image frame in the video containing the first user according to the target position information; wherein the target image frame is the image frame in the video at the moment the second user triggers the zoom-in instruction;
the communication unit is configured to receive an amplification instruction sent by the first display device, and forward the target area to the second display device through the first display device.
3. The image pickup apparatus according to claim 2, wherein the original position information is original pixel coordinates at the resolution of the second display device; the processor is specifically configured to:
converting the original pixel coordinates into pixel coordinates at the resolution of the camera device according to the multiple relation between the resolution of the camera device and the resolution of the second display device;
intercepting a candidate area from an image frame of the video captured by the camera device according to the pixel coordinates at the resolution of the camera device;
if the number of pixels of the candidate area in the abscissa direction is within a first preset range and the number of pixels in the ordinate direction is within a second preset range, taking the resolution conversion relation corresponding to the first preset range and the second preset range as the target conversion relation, wherein the first preset range and the second preset range are both determined according to the target resolution.
4. The image pickup apparatus according to claim 3, wherein an image frame of the video captured by the camera device includes at least one first fixed region and at least one first variable region in the abscissa direction, and at least one second fixed region and at least one second variable region in the ordinate direction;
the processor is specifically configured to:
if the position abscissa falls in a first fixed region, take a preset pixel abscissa in the target conversion relation as the pixel abscissa in the target position information; wherein the position abscissa is determined according to the pixel abscissa of the original pixel coordinates;
if the position abscissa falls in a first variable region, take the value obtained by converting the pixel abscissa of the original pixel coordinates through the abscissa conversion of the target conversion relation as the pixel abscissa in the target position information;
if the position ordinate falls in a second fixed region, take a preset pixel ordinate in the target conversion relation as the pixel ordinate in the target position information; wherein the position ordinate is determined according to the pixel ordinate of the original pixel coordinates;
if the position ordinate falls in a second variable region, take the value obtained by converting the pixel ordinate of the original pixel coordinates through the ordinate conversion of the target conversion relation as the pixel ordinate in the target position information;
and form the target position information from the resulting pixel abscissa and pixel ordinate.
5. A first display device, comprising: a controller, a communicator, an external device interface, and a display;
the external device interface is connected with the camera device and used for receiving the video which is acquired by the camera device and contains the first user;
the controller is configured to, during a video call between the first user and a second user, if a zoom-in instruction is received, determine a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of the second display device; wherein the zoom-in instruction is triggered on the second display device by the second user; the second display device is the display device used by the second user for the video call; the target resolution is the resolution corresponding to the network bandwidth of the network connecting the first display device and the second display device;
converting, according to the target conversion relation, the original position information, at the resolution of the second display device, of the area selected by the second user that needs to be enlarged into target position information at the resolution of the camera device; and intercepting a target area from each frame after the target image frame in the video containing the first user according to the target position information; wherein the target image frame is the image frame in the video at the moment the second user triggers the zoom-in instruction;
the communicator is used for receiving the amplification instruction sent by the second display equipment, receiving the video containing the second user sent by the second display equipment and sending the target area to the second display equipment;
the display is used for displaying the video containing the second user.
6. The first display device of claim 5, wherein the display is further configured to display an area of the target area that needs to be enlarged in the form of a small window.
7. A second display device, comprising: a communication unit, a processor and a display;
the communication unit is used for sending an amplification instruction to first display equipment used by a first user for video, and receiving a target area sent by the first display equipment; the target area is obtained by intercepting a video containing a first user according to the position information of an area needing to be amplified selected by a second user;
the display is used for displaying an area needing to be amplified in the target area;
the processor is configured to, during a video call between the first user and the second user, if a zoom-in instruction triggered by the second user is received, control the communication unit to send the zoom-in instruction to the first display device and control the display to display the area that needs to be enlarged within the target area.
8. A video interaction method is applied to a camera device for acquiring a video of a first user, and comprises the following steps:
during a video call between a first user and a second user, if a zoom-in instruction is received, determining a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of a second display device; wherein the zoom-in instruction is triggered on the second display device by the second user; the target resolution is the resolution corresponding to the network bandwidth of the network connecting a first display device and the second display device; the second display device is the display device used by the second user for the video call; the first display device is the display device used by the first user for the video call;
converting, according to the target conversion relation, the original position information, at the resolution of the second display device, of the area selected by the second user that needs to be enlarged into target position information at the resolution of the camera device;
intercepting a target area from each frame after the target image frame in the video containing the first user according to the target position information; wherein the target image frame is the image frame in the video at the moment the second user triggers the zoom-in instruction;
and sending, through the first display device, the intercepted target area to the second display device, so that the second display device displays the area that needs to be enlarged within the target area.
9. A video interaction method is applied to a first display device for video use of a first user, and comprises the following steps:
during a video call between a first user and a second user, if a zoom-in instruction is received, determining a target conversion relation according to the resolution of the camera device, the target resolution, and the resolution of a second display device; wherein the zoom-in instruction is triggered on the second display device by the second user; the target resolution is the resolution corresponding to the network bandwidth of the network connecting the first display device and the second display device; the second display device is the display device used by the second user for the video call; the first display device is the display device used by the first user for the video call;
converting, according to the target conversion relation, the original position information, at the resolution of the second display device, of the area selected by the second user that needs to be enlarged into target position information at the resolution of the camera device;
intercepting a target area from each frame after the target image frame in the video containing the first user according to the target position information; wherein the target image frame is the image frame in the video at the moment the second user triggers the zoom-in instruction;
and sending the intercepted target area to the second display device, so that the second display device displays the area that needs to be enlarged within the target area.
10. A video interaction method is applied to a second display device for video use of a second user, and comprises the following steps:
during a video call between a first user and a second user, if a zoom-in instruction triggered by the second user is received, sending the zoom-in instruction to a first display device; wherein the first display device is the display device used by the first user for the video call;
receiving a target area; wherein the target area is intercepted from the video containing the first user according to the position information of the area, selected by the second user, that needs to be enlarged;
and displaying the region needing to be enlarged in the target region.
CN202110182830.0A 2021-02-07 2021-02-07 Camera device, first display equipment, second display equipment and video interaction method Pending CN112969099A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110182830.0A CN112969099A (en) 2021-02-07 2021-02-07 Camera device, first display equipment, second display equipment and video interaction method

Publications (1)

Publication Number Publication Date
CN112969099A true CN112969099A (en) 2021-06-15

Family

ID=76284752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110182830.0A Pending CN112969099A (en) 2021-02-07 2021-02-07 Camera device, first display equipment, second display equipment and video interaction method

Country Status (1)

Country Link
CN (1) CN112969099A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114466212A (en) * 2022-02-07 2022-05-10 百度在线网络技术(北京)有限公司 Live broadcast method, device, electronic equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103414945A (en) * 2013-07-17 2013-11-27 深圳Tcl新技术有限公司 Method and device for automatically clipping and displaying target image
CN103546716A (en) * 2012-07-17 2014-01-29 三星电子株式会社 System and method for providing image
CN106792092A (en) * 2016-12-19 2017-05-31 广州虎牙信息科技有限公司 Live video flow point mirror display control method and its corresponding device
US20170302719A1 (en) * 2016-04-18 2017-10-19 Qualcomm Incorporated Methods and systems for auto-zoom based adaptive video streaming
CN110944186A (en) * 2019-12-10 2020-03-31 杭州当虹科技股份有限公司 High-quality viewing method for local area of video
CN111601066A (en) * 2020-05-26 2020-08-28 维沃移动通信有限公司 Information acquisition method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210615