CN116916172B - Remote control method and related device - Google Patents
- Publication number
- CN116916172B (application CN202311165582.4A)
- Authority
- CN
- China
- Prior art keywords
- video frames
- video
- splicing
- control instruction
- unit
- Prior art date
- Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/95—Computational photography systems, e.g. light-field imaging systems
- H04N23/951—Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/12—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
- H04L67/125—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks involving control of end-device applications over a network
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/265—Mixing
Abstract
The embodiments of the present application disclose a remote control method and a related device, which can be applied to basic technologies such as image processing, face processing, cloud technology, security, and big data. A first device acquires a plurality of video frames and the importance parameters of those frames, determines a splicing mode for the frames based on the importance parameters, splices the frames according to that mode to obtain a large video picture, and sends the large video picture to a second device. The second device obtains a control instruction based on the large video picture and sends it to the first device, which executes the corresponding operation, thereby achieving remote control of the first device by the second device. Because the first device does not send the video frames of the multiple video channels separately, but splices them into one large video picture sent as a whole, the probability of delay is reduced, end-to-end delay is lowered, and the real-time performance and safety of remote control are improved.
Description
Technical Field
The present disclosure relates to the field of industrial control technologies, and in particular, to a remote control method and a related device.
Background
Remote control is suited to environments where prolonged human work is unsafe or impractical, such as underwater, nuclear or magnetic environments, mines, and waste yards, where machines must be remotely controlled to enter and work in place of humans. For example, in an autonomous driving scenario, a stranded vehicle may be rescued remotely by remotely controlling the autonomous vehicle.
In the related art, a controlled machine is typically equipped with multiple cameras that capture video of the surrounding environment. The multiple video channels are then transmitted separately to a remote control end, which stitches the videos to reconstruct the machine's surroundings and issues control instructions based on them, thereby controlling the machine.
However, this remote control approach often suffers from delay, which degrades the real-time performance of remote control and, in application scenarios with high real-time requirements, affects its safety.
Disclosure of Invention
To solve the above technical problems, the present application provides a remote control method and a related device for reducing end-to-end delay and improving the safety of remote control.
The embodiment of the application discloses the following technical scheme:
In a first aspect, an embodiment of the present application provides a remote control method applied to a first device, the method comprising:
acquiring a plurality of video frames at the same moment and importance parameters of the video frames, wherein the video frames describe environment information of the first device at different shooting angles, and each importance parameter describes the importance degree of the corresponding video frame among the plurality of video frames;
determining a splicing mode of the plurality of video frames according to the importance parameters;
splicing the plurality of video frames according to the splicing mode to obtain a large video picture, wherein the greater the importance indicated by an importance parameter, the higher the image resolution of the corresponding video frame in the large video picture;
sending the large video picture to a second device;
acquiring a control instruction from the second device, wherein the control instruction is obtained according to the large video picture;
and executing the operation corresponding to the control instruction.
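The steps above can be illustrated with a minimal pure-Python sketch. This is not the patent's implementation: the function names, the list-of-rows frame representation, and the particular layout (full-resolution most-important frame on the left, halved frames stacked in a column on the right) are hypothetical choices of one possible splicing mode.

```python
def determine_layout(importance):
    """Rank frame indices by importance, most important first (illustrative)."""
    return sorted(range(len(importance)), key=lambda i: -importance[i])

def downscale(frame):
    """Halve resolution by keeping every second row and column."""
    return [row[::2] for row in frame[::2]]

def stitch(frames, importance):
    """Pack frames into one 'large video picture' (a 2-D grid of pixels):
    the most important frame keeps full resolution on the left, the
    remaining frames are halved and stacked in a column on the right."""
    order = determine_layout(importance)
    main = frames[order[0]]
    side_rows = [row for i in order[1:] for row in downscale(frames[i])]
    side_width = len(side_rows[0]) if side_rows else 0
    big = []
    for y, row in enumerate(main):
        side = side_rows[y] if y < len(side_rows) else [0] * side_width
        big.append(row + side)  # concatenate the two columns horizontally
    return big, order
```

A real system would operate on encoded camera frames and negotiate the layout with the second device; the point here is only that the packed picture gives more pixels, and therefore more retained information, to the more important view.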
In a second aspect, an embodiment of the present application provides a remote control method applied to a second device, the method comprising:
acquiring a large video picture from a first device, wherein the large video picture is obtained by splicing a plurality of video frames in a splicing mode, the video frames describe environment information of the first device at different shooting angles, the splicing mode is determined according to importance parameters of the video frames, and each importance parameter describes the importance degree of the corresponding video frame among the plurality of video frames;
splitting the large video picture according to a splitting mode corresponding to the splicing mode to obtain the plurality of video frames;
obtaining a control instruction according to the plurality of video frames;
and sending the control instruction to the first device.
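The second device's splitting step can be sketched as the inverse of a packed layout like the one above, again as a hypothetical pure-Python illustration: the `split` signature and the pixel-repetition upscale are assumptions, and a receiver would in practice learn `order`, `h`, and `w` from negotiated metadata rather than hard-coded arguments.

```python
def split(big, order, h, w):
    """Invert the illustrative packing: the left w columns are the
    full-resolution main frame; the right column holds the half-resolution
    frames, restored here by pixel repetition (a naive upscale)."""
    main = [row[:w] for row in big[:h]]
    recovered = {order[0]: main}
    y = 0
    for idx in order[1:]:
        small = [row[w:w + w // 2] for row in big[y:y + h // 2]]
        # repeat each pixel 2x horizontally and each row 2x vertically
        up = []
        for row in small:
            wide = [p for p in row for _ in (0, 1)]
            up.append(wide)
            up.append(list(wide))
        recovered[idx] = up
        y += h // 2
    return [recovered[i] for i in sorted(recovered)]
```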
In a third aspect, an embodiment of the present application provides a remote control apparatus built into a first device, the apparatus comprising: an acquisition unit, a determination unit, a splicing unit, a sending unit, and an execution unit;
the acquisition unit is configured to acquire a plurality of video frames at the same moment and importance parameters of the video frames, wherein the video frames describe environment information of the first device at different shooting angles, and each importance parameter describes the importance degree of the corresponding video frame among the plurality of video frames;
the determination unit is configured to determine a splicing mode of the plurality of video frames according to the importance parameters;
the splicing unit is configured to splice the plurality of video frames according to the splicing mode to obtain a large video picture, wherein the greater the importance indicated by an importance parameter, the higher the image resolution of the corresponding video frame in the large video picture;
the sending unit is configured to send the large video picture to the second device;
the acquisition unit is further configured to acquire a control instruction from the second device, wherein the control instruction is obtained according to the large video picture;
and the execution unit is configured to execute the operation corresponding to the control instruction.
In a fourth aspect, an embodiment of the present application provides a remote control apparatus built into a second device, the apparatus comprising: an acquisition unit, a splitting unit, and a sending unit;
the acquisition unit is configured to acquire a large video picture from a first device, wherein the large video picture is obtained by splicing a plurality of video frames in a splicing mode, the video frames describe environment information of the first device at different shooting angles, the splicing mode is determined according to importance parameters of the video frames, and each importance parameter describes the importance degree of the corresponding video frame among the plurality of video frames;
the splitting unit is configured to split the large video picture according to a splitting mode corresponding to the splicing mode to obtain the plurality of video frames;
the acquisition unit is further configured to obtain a control instruction according to the plurality of video frames;
and the sending unit is configured to send the control instruction to the first device.
In a fifth aspect, embodiments of the present application provide a remote control system, the system including a first device and a second device;
the first device is configured to perform the method described in the first aspect above;
the second device is configured to perform the method described in the second aspect above.
In a sixth aspect, embodiments of the present application provide a computer device comprising a processor and a memory:
the memory is used for storing a computer program and transmitting the computer program to the processor;
the processor is configured to perform the method of the above aspect according to instructions in the computer program.
In a seventh aspect, embodiments of the present application provide a computer readable storage medium for storing a computer program for performing the method of the above aspect.
In an eighth aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method described in the above aspect.
According to the above technical solution, the first device is remotely controlled by the second device and continuously collects surrounding environmental information, for example by shooting video, so that the second device can issue control instructions. Because the shooting angle of a single video frame is limited and provides little of the environment around the first device, a plurality of video frames describing the environment information of the first device at different shooting angles are acquired, together with an importance parameter for each frame describing its importance degree among the plurality of frames. A splicing mode for the frames is determined based on the importance parameters, and the frames are spliced accordingly to obtain a large video picture in which the greater the importance indicated by an importance parameter, the higher the image resolution of the corresponding video frame, so that more important frames include more information content, or lose less of it. After the large video picture is sent to the second device, the second device can obtain a control instruction based on it and send the instruction to the first device, which performs the corresponding operation, thereby achieving remote control of the first device by the second device.
In this way, the first device does not send the video frames of the multiple video channels separately; instead, they are spliced into one large video picture and sent together to the second device, which reduces the probability of delay, lowers end-to-end delay, and improves the real-time performance and safety of remote control. In addition, because the video frames included in the large video picture are generated at the same moment, synchronization among the multiple video channels is guaranteed; and because the video stitching operation is completed on the second device, the resource demands that stitching places on the first device are reduced, which can prolong the first device's service time and service life.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an application scenario schematic diagram of a remote control method provided in an embodiment of the present application;
fig. 2 is a signaling interaction diagram of a remote control system according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a splicing manner according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a splicing manner according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a splicing manner according to an embodiment of the present disclosure;
fig. 6 is an application scenario schematic diagram of a remote control method according to an embodiment of the present application;
fig. 7 is a schematic architecture diagram of a remote control system according to an embodiment of the present application;
fig. 8 is an application scenario schematic diagram of a remote control method according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a remote control device according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a remote control device according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a server according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the accompanying drawings.
In the related-art remote control approach, the multiple video channels collected by the controlled machine are sent separately; to ensure synchronization among the channels and achieve a good video stitching result, stitching can only be performed after the synchronized transmission of all channels has completed. Consequently, if any one of the channels is delayed, the end-to-end delay increases, which affects the real-time performance of remote control and, for application scenarios with high real-time requirements, affects its safety.
Based on this, the embodiments of the present application provide a remote control method and a related device in which the first device does not send the video frames of the multiple video channels separately, but splices them into one large video picture sent together to the second device, reducing the probability of delay, lowering end-to-end delay, and improving the real-time performance and safety of remote control. In addition, because the video frames included in the large video picture are generated at the same moment, synchronization among the multiple video channels is guaranteed; and because the video stitching operation is completed on the second device, the resource demands of stitching on the first device are reduced, which can prolong the first device's service time and service life.
The remote control method provided by the embodiments of the present application can be used to remotely control machines, for example industrial mobile devices and other machines, from a remote site or the cloud. The embodiments can be applied to a variety of scenarios including, but not limited to, cloud technology, artificial intelligence, digital humans, virtual humans, games, virtual reality, and augmented reality; specific application scenarios include remote driving, remote robots, remotely controlled port cranes, remotely controlled mining machinery, and the like, enabling wireless communication and control interaction between robots and remote control equipment or other control centers.
It is understood that the remote control method provided in the present application may be applied to a computer device having remote control capability, such as a terminal device, a server. The terminal device may be a desktop computer, a notebook computer, a mobile phone, a tablet computer, an internet of things device, a portable wearable device, an aircraft, an industrial mobile device and the like, the internet of things device may be an intelligent sound box, an intelligent television, an intelligent air conditioner, an intelligent vehicle-mounted device and the like, the intelligent vehicle-mounted device may be a vehicle-mounted navigation terminal, a vehicle-mounted computer and the like, and the portable wearable device may be an intelligent watch, an intelligent bracelet, a head-mounted device and the like, but is not limited thereto; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server or a server cluster for providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery network, CDN), basic cloud computing services such as big data and artificial intelligent platforms, and the like. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
To facilitate understanding of the remote control method provided by the embodiments of the present application, an application scenario of the method is described below, taking a server as an example of the execution body.
Referring to fig. 1, the figure is an application scenario schematic diagram of a remote control method provided by an embodiment of the present application. As shown in fig. 1, the application scenario includes a first device 110 and a second device 120. The first device 110 is the controlled device and may be a terminal device as described above; in fig. 1 it is an intelligent vehicle-mounted device. The second device 120 is the device that issues control instructions and may likewise be a terminal device as described above; the first device 110 and the second device 120 may communicate through a network.
The first device 110 may continuously collect environmental information around itself, for example by shooting video, and provide it to the second device so that the second device can issue control commands. Because the shooting angle of a single video frame is limited, the first device 110 acquires multiple video frames, which describe the environment information of the first device at different shooting angles. As shown in fig. 1, cameras are positioned on the front, left, and right sides of the autonomous vehicle, and the resulting video frames are a front side view, a left side view, and a right side view of the vehicle, respectively.
The first device 110 acquires not only the video frames but also an importance parameter for each frame, which describes the importance degree of the corresponding frame among the plurality of frames. As shown in fig. 1, when the autonomous vehicle is driving forward, the importance of the front side view is greater than that of both the left side view and the right side view.
The first device 110 determines a splicing manner for the video frames based on the importance parameters and splices the frames accordingly to obtain a large video picture. As shown in fig. 1, because the front side view is determined to be the most important based on the importance parameters, it has the largest image resolution in the spliced large video picture.
The first device 110 sends the large video picture to the second device 120. The second device 120 splits the large video picture according to the splitting manner corresponding to the splicing manner to obtain the original video frames, and performs video stitching on them to obtain an environment image, which describes the environment information near the first device. As shown in fig. 1, the second device 120 may split the large video picture into the front side view, the left side view, and the right side view, and stitch them together into an environment image that uniformly describes the environment on the front, left, and right sides of the first device.
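The final stitching step on the second device, arranging the recovered per-direction views into one panorama-like environment image, might look like the following pure-Python sketch. `compose_environment`, the fixed left | front | right layout, and the list-of-pixel-rows frame representation are illustrative assumptions, not the patent's method.

```python
def compose_environment(views):
    """Stitch per-direction frames (equal height, each a list of pixel rows)
    left-to-right into one environment image: left | front | right."""
    order = ["left", "front", "right"]
    present = [views[k] for k in order if k in views]
    # concatenate the y-th row of every present view into one wide row
    return [sum((frame[y] for frame in present), []) for y in range(len(present[0]))]
```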
The second device 120 determines a control instruction for controlling the first device 110 based on the large video picture, and transmits the control instruction to the first device 110 so that the first device realizes remote control according to the control instruction.
In this way, the first device does not send the video frames of the multiple video channels separately; instead, they are spliced into one large video picture and sent together to the second device, which reduces the probability of delay, lowers end-to-end delay, and improves the real-time performance and safety of remote control. In addition, because the video frames included in the large video picture are generated at the same moment, synchronization among the multiple video channels is guaranteed; and because the video stitching operation is completed on the second device, the resource demands that stitching places on the first device are reduced, which can prolong the first device's service time and service life.
The remote control method provided by the embodiments of the present application may be executed by a terminal device. In other embodiments, however, a server may have similar functionality to the terminal device and execute the method, or the terminal device and the server may execute it jointly; the embodiments of the present application are not limited in this respect.
In connection with the above description, the remote control system provided by the present application is described below; the system includes a first device and a second device. The second device is the device that issues control instructions and may be the terminal device or server described above; as one possible implementation, it may be a cloud server. The first device is the device remotely controlled by the second device and may be an autonomous vehicle in a remote driving scenario, a robot in a remote robot scenario, a port crane in a remote port-crane scenario, a mining machine in a remote mining-machinery scenario, or the like. As one possible implementation, the first device may be an industrial personal computer deployed on an industrial mobile device.
Referring to fig. 2, the diagram is a signaling interaction diagram of a remote control system provided in an embodiment of the present application.
S201: the first device acquires a plurality of video frames at the same time and important parameters of the plurality of video frames.
In practical applications, the first device continuously collects nearby environmental information so that the second device can obtain control instructions from it and remotely control the first device based on those instructions. As one possible implementation, the first device may continuously collect nearby environmental information by shooting video; each video frame of the video is an image, and each frame describes the environment information near the first device at a particular moment.
Because the shooting angle of a single video frame is limited and provides little of the environment around the first device, a plurality of video frames are acquired, which describe the environment information of the first device at different shooting angles. For example, a plurality of shooting devices may be installed on the first device, each obtaining the environment information of the first device at a different angle, such as the front side view, left side view, and right side view, thereby yielding a plurality of video frames.
In addition, because the environment around the first device may change continuously, a plurality of video frames at the same moment may be acquired based on the time information of the first device in order to ensure synchronization. For example, the plurality of shooting devices installed on the first device capture videos in different directions around the first device, each video comprising a plurality of video frames, and the frames at the same moment can be extracted from the plurality of videos.
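Selecting same-moment frames across several camera streams might be implemented as below. This is a hypothetical sketch: `frames_at`, the `(timestamp, frame_id)` stream representation, and the tolerance check are assumptions, not the patent's method.

```python
from bisect import bisect_left

def frames_at(streams, t, tol):
    """From each stream -- a time-sorted list of (timestamp, frame_id) --
    pick the frame closest to moment t; return None if any stream has no
    frame within tol, since a synchronized set cannot then be formed."""
    picked = []
    for stream in streams:
        ts = [entry[0] for entry in stream]
        i = bisect_left(ts, t)
        # the closest frame is either just before or just after position i
        candidates = [j for j in (i - 1, i) if 0 <= j < len(stream)]
        best = min(candidates, key=lambda j: abs(ts[j] - t))
        if abs(ts[best] - t) > tol:
            return None
        picked.append(stream[best][1])
    return picked
```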
The first device acquires the importance parameters of the plurality of video frames at the same moment; an importance parameter describes the importance degree of the corresponding video frame among the plurality of video frames, that is, the relative importance of each frame after the frames are compared with one another. The embodiments of the present application do not specifically limit how the importance parameters are acquired; those skilled in the art may set this according to actual needs. For example, predetermined importance parameters may be stored in the first device. Two manners are described below.
In a first manner, the working state of the first device is acquired, and the importance parameters of the plurality of video frames are determined according to the working state. The working state describes the current motion of the first device, such as moving forward or backward, and different working states place different demands on environmental information in different directions: while the first device moves forward, it needs more information about its front side; while it reverses, it needs more information about its rear side. Importance parameters determined autonomously from the first device's working state therefore better match its actual needs in the current environment, and avoid the situation in which more important video frames lose too much environmental information during subsequent processing, degrading the accuracy of control instructions.
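The first manner, deriving importance parameters from the working state, can be illustrated with a hypothetical state-to-weight table; the states, view names, and weight values below are invented for illustration, and a real system would calibrate them.

```python
# Hypothetical mapping from the first device's working state to per-view
# importance weights (states, view names, and values are invented examples).
STATE_WEIGHTS = {
    "forward": {"front": 0.6, "left": 0.2, "right": 0.2},
    "backward": {"rear": 0.6, "left": 0.2, "right": 0.2},
    "turn_left": {"front": 0.4, "left": 0.4, "right": 0.2},
}

def importance_for(state, views):
    """Look up per-view weights for the current working state; unknown
    states or views fall back to a uniform weight. Normalised to sum to 1."""
    table = STATE_WEIGHTS.get(state, {})
    default = 1.0 / len(views)
    raw = [table.get(v, default) for v in views]
    total = sum(raw)
    return [r / total for r in raw]
```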
In a second mode, an important parameter acquisition request is sent to the second device, so that the second device sends the important parameters of the plurality of video frames to the first device. In this way, the second device, which remotely controls the first device, can acquire the important parameters of the plurality of video frames through dynamic negotiation with the first device in advance, which enhances the second device's control over the first device and improves the safety of data transmission. As a possible implementation manner, the important parameters may be dynamically negotiated before each capture of the plurality of video frames, or may be dynamically negotiated at intervals.
It will be appreciated that, in the specific embodiments of the present application, where data such as video frames may relate to user data, individual permission or individual consent of the user may be required when the above embodiments are applied to specific products or technologies, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant country and region.
S202: the first device determines a splicing mode of the plurality of video frames according to the important parameters.
S203: the first device splices the plurality of video frames according to the splicing mode to obtain a large video picture.
As can be seen from the foregoing, in the related art, multiple video frames collected by a controlled machine (i.e., a first device) are sent separately, or multiple video frames at the same time are sent separately, for example through multiple video streams. This increases the end-to-end delay, which affects the real-time performance of the remote control and, for application scenarios with high real-time requirements, may also affect the safety of the remote control.
Based on this, the embodiment of the application does not send the plurality of video frames separately, but sends them to the second device together, thereby reducing the probability of delay caused by separately sending multiple paths of video, reducing the end-to-end delay, and improving the real-time performance and safety of remote control. The manner of splicing the plurality of video frames is described below.
In addition, although the plurality of video frames describe the environmental information of the first device at different shooting angles, in order to ensure that the photographing devices cover the surroundings of the first device, there is usually an overlapping area between their fields of view. In this case, when the multiple video frames are displayed independently, the video frames seen by the user have overlapping content, and the user experience is poor. As shown in fig. 1, there is a certain overlap area between the left side view and the front side view. In the related art, the first device may therefore eliminate the overlapping area while stitching the plurality of video frames into one image, but eliminating the overlapping area has high computational complexity and high resource requirements on the computing device, and the resulting high power consumption may affect the service time and service life of the first device.
The splicing means that each video frame is simply placed together, without other operations, to obtain a large video picture. For example, as shown in fig. 3, the 3 video frames after splicing are adjacent to each other. As another example, as shown in fig. 4, the 3 video frames after splicing are not completely adjacent to each other. As another example, as shown in fig. 5, there may be overlap between the 3 video frames after splicing. It will be appreciated that there may be blank areas in the large video picture, such as the area enclosed by the bold lines in fig. 4: the 3 video frames do not completely cover the large video picture. The large video picture may also be of an irregular shape; in fig. 5, the region formed by the bold lines is the large video picture.
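The "simply placed together" operation can be sketched as follows, using plain 2-D lists as stand-in images (the function name and pixel representation are assumptions for illustration): each frame is copied onto a larger canvas at its assigned offset, with no overlap elimination, and uncovered cells remain blank.

```python
def splice(frames, layout, width, height, blank=0):
    """Place each video frame onto a large canvas at its (x, y) offset.

    frames: {name: 2-D list of pixel values}
    layout: {name: (x, y)} top-left positions in the large picture
    Cells not covered by any frame keep the `blank` value.
    """
    canvas = [[blank] * width for _ in range(height)]
    for name, (x, y) in layout.items():
        for r, row in enumerate(frames[name]):
            for c, pixel in enumerate(row):
                canvas[y + r][x + c] = pixel
    return canvas

# Two 2x2 frames placed side by side form a 4x2 large picture.
big = splice({"a": [[1, 1], [1, 1]], "b": [[2, 2], [2, 2]]},
             {"a": (0, 0), "b": (2, 0)}, width=4, height=2)
```

Because no overlap elimination or blending is performed, the cost of this step is a plain memory copy, which is the point of performing it on the resource-constrained first device.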
It should be noted that, in the embodiment of the present application, since the importance degrees of the plurality of video frames differ, the splicing mode used to splice the plurality of video frames is determined based on the important parameters. As a result, in the large video picture obtained by splicing in this mode, the greater the importance degree indicated by an important parameter, the greater the image resolution of the corresponding video frame. Since an image with greater resolution carries more information, the more important video frames include more environmental information, and the current environment of the first device is determined more accurately. As shown in fig. 1, the front side view of the autonomous vehicle is more important, so the resolution of the front side view is the largest in the large video picture, and the large video picture thus includes more environmental information on the front side of the autonomous vehicle.
The size of the large video frame is not particularly limited, and can be set by a person skilled in the art according to actual needs. For example, the large video frames obtained by each splicing may be uniform in size, or the large video frames obtained by each splicing may be non-uniform.
As a possible implementation manner, the number of the plurality of video frames may be acquired, and the splicing mode of the plurality of video frames is determined according to the number and the important parameters. It can be understood that as the size of the large video picture increases, more information about the surroundings of the first device is included, but the pressure on subsequent transmission also grows, which may cause problems such as long transmission time. A smaller size may therefore be determined, for example only slightly larger than a single video frame, although in that case the multiple video frames may not all be mutually adjacent. For example, if the number of video frames is 3, the most important video frame may be placed on the left side and the two less important video frames on the right side, as in the large video picture of fig. 1. For another example, if the number of video frames is 4 and the importance of the four video frames is comparable, the splicing mode may arrange the four video frames in a 2×2 grid.
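A minimal sketch of such a splicing-mode decision, assuming the 3-frame layout of fig. 1 and the 2×2 arrangement just described (coordinates are expressed as fractions of the large picture; the frame names and exact proportions are hypothetical):

```python
def splicing_mode(frames):
    """frames: list of (name, importance). Returns {name: (x, y, w, h)}
    as fractions of the large video picture."""
    ordered = sorted(frames, key=lambda f: f[1], reverse=True)
    names = [name for name, _ in ordered]
    if len(names) == 3:
        # Most important frame keeps full height on the left; the two less
        # important frames share (and are downscaled into) the right half.
        return {names[0]: (0.0, 0.0, 0.5, 1.0),
                names[1]: (0.5, 0.0, 0.5, 0.5),
                names[2]: (0.5, 0.5, 0.5, 0.5)}
    if len(names) == 4:
        # Comparable importance: arrange in a 2x2 grid of equal cells.
        return {n: (0.5 * (i % 2), 0.5 * (i // 2), 0.5, 0.5)
                for i, n in enumerate(names)}
    raise ValueError("no layout defined for %d frames" % len(names))
```

The key property is the one the text requires: the frame with the greatest importance receives the largest area, hence the largest image resolution after splicing.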
Therefore, with the splicing mode determined based on the important parameters and the number of the plurality of video frames, in the large video picture obtained by splicing the plurality of video frames in this mode, the more important environmental information occupies more of the limited arrangement space; that is, the greater the importance degree indicated by an important parameter, the greater the image resolution of the corresponding video frame in the large video picture, which improves the experience of the user.
S204: the first device sends a large video picture to the second device.
As a possible implementation manner, the first device may encrypt the large video picture and then send the encrypted large video picture to the second device. The encryption mode may be sent along with the encrypted large video picture, sent independently, negotiated in advance, or dynamically negotiated; it is not particularly limited here and can be set by a person skilled in the art according to actual needs.
As a possible implementation manner, since the large video picture is large, direct transmission takes more bandwidth, increases transmission time, and places higher requirements on the device parameters of the second device. Taking a size of 1920×1080 as an example, transmitting one large video picture requires 1920×1080×8×3 bits, that is, about 47 Mb, where 3 represents the red, green, and blue channels (Red, Green, Blue, RGB). If large video pictures are to be transmitted at 30 frames per second, each frame being such a picture, about 1.4 Gb of data must be transmitted per second, so the required bandwidth is about 1.4 Gbps, which is a great demand. Thus, the large video picture can be encoded: the first device encodes the large video picture to obtain an encoded video and sends the encoded video to the second device.
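The figures above can be checked directly (binary Mb/Gb, i.e. 2^20 and 2^30 bits, are assumed, since that is the convention under which 1920×1080×8×3 bits comes to roughly 47 Mb):

```python
# Raw (unencoded) bandwidth of a 1920x1080 RGB large video picture.
bits_per_frame = 1920 * 1080 * 8 * 3         # 8 bits per channel, 3 channels (RGB)
mb_per_frame = bits_per_frame / 2**20        # per-frame size in binary megabits
gb_per_second = bits_per_frame * 30 / 2**30  # raw bitrate at 30 fps, binary gigabits
```

Running this gives about 47.5 Mb per frame and about 1.39 Gb per second, consistent with the "47 Mb" and "1.4 Gbps" figures in the text, and motivating the encoding step.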
In addition, in the related art, since the first device must independently encode each path of video when multiple paths of video are transmitted separately, excessive computing resources of the first device are occupied. Furthermore, since video is encoded according to a video transmission specification, information about the first device must be added to each encoded video, generating excessive redundant data such as the header (overhead) of each video encoding stream and further burdening data transmission. In the embodiment of the application, however, the plurality of video frames generated by the multiple paths of video are spliced into one large video picture, so the multiple paths of video can be transmitted with a single encoding, which reduces the resource occupation of the first device, reduces the burden of data transmission, and reduces the end-to-end delay. In particular, when data transmission is performed based on the fifth generation wireless cellular technology (5th generation mobile networks or 5th generation wireless systems, 5G), the burden on the uplink traffic of the 5G network can be reduced, and the safety of remote control is improved for industrial mobile devices with high real-time requirements.
S205: the second device splits the large video picture according to a splitting mode corresponding to the splicing mode to obtain a plurality of video frames.
It should be noted that each splicing mode has a corresponding splitting mode, and based on the splitting mode, the large video picture can be restored into the plurality of video frames. For example, if the plurality of video frames are spliced into a large video picture in the manner shown in fig. 3, and the splicing mode records the video frame corresponding to each position, the splitting mode may split the large video picture into the plurality of video frames according to those positions.
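Splitting is the inverse of the placement operation: given the positions and sizes recorded by the splicing mode, each video frame is recovered by cropping the large picture. A sketch with plain 2-D lists as stand-in images (names and data layout are assumptions for illustration):

```python
def split(big, layout, sizes):
    """Recover the individual video frames from the large video picture.

    big:    2-D list of pixel values (the large picture)
    layout: {name: (x, y)} top-left position of each frame
    sizes:  {name: (w, h)} width and height of each frame
    """
    frames = {}
    for name, (x, y) in layout.items():
        w, h = sizes[name]
        frames[name] = [row[x:x + w] for row in big[y:y + h]]
    return frames
```

As long as the second device knows the same layout and sizes the first device used, the crop exactly undoes the placement (any resolution loss comes only from downscaling done before placement, not from the split itself).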
It will be appreciated that, since video frames may be compressed in some splicing modes, such as the left and right side views in the large video picture shown in fig. 1, the plurality of video frames obtained by splitting based on the splitting mode may have some loss in resolution and the like compared with the plurality of video frames before splicing. However, since the splicing mode is determined based on the important parameters, even if there is a loss, it is within a range acceptable to the user. In addition, restoration can be performed by artificial intelligence or other techniques, which is not particularly limited in this application and can be implemented by those skilled in the art according to actual needs.
The embodiment of the application does not specifically limit the acquisition mode of the splitting mode, and a person skilled in the art can set it according to actual needs. For example, if the splicing mode is stored in the first device in advance, the corresponding splitting mode may be stored in the second device in advance. As another example, if the splicing mode is negotiated in advance, the corresponding splitting mode may be stored in the second device once the splicing mode is determined. For another example, the second device may identify the large video picture and then determine the splitting mode corresponding to the splicing mode.
As one possible implementation manner, if the first device encodes the large video picture to obtain an encoded video and sends the encoded video to the second device, the second device receives and decodes the encoded video to obtain the large video picture, and then splits the large video picture according to the splitting mode corresponding to the splicing mode to obtain the plurality of video frames. Compared with decoding multiple paths of video, this reduces the resource occupation of the second device during decoding and saves computing resources. Moreover, since the plurality of video frames describe the environmental information of the first device at different shooting angles at the same moment, the real-time performance and safety of remote control are improved for industrial mobile devices with high real-time requirements.
S206: and the second equipment acquires the control instruction according to the plurality of video frames.
For example, the second device may display a plurality of video frames to the user, and the user may view the plurality of video frames, and since the plurality of video frames can describe environmental information around the first device, a control instruction may be sent to the first device through the second device in a targeted manner. For another example, the second device may also autonomously identify a plurality of video frames, such as identifying a hazard location, etc., and determine the control command. The control instruction is used to control the first device, such as instructing the first device to perform a forward operation, a backward operation, or the like.
As can be seen from the foregoing, the plurality of video frames may include overlapping areas, i.e. the same area for the surrounding environment of the first device, and the plurality of photographing devices may all photograph the video frames. To enhance the user experience, overlapping areas present in multiple video frames may be eliminated.
As a possible implementation manner, the plurality of video frames can be stitched by a video stitching technology to obtain an environment image, and the control instruction is obtained according to the environment image. Stitching here means combining the plurality of video frames into one environment image that has no overlapping area. The environment image is an image obtained from the plurality of video frames, with the overlapping area removed, that describes the environmental information around the first device, so that the environment around the first device can be observed or recognized better based on it.
It should be noted that video stitching (Video Stitching) is a technology that seamlessly stitches the content of multiple paths of video through computer vision. It mainly addresses the limited viewing angle of a single camera, and the poor viewing experience caused by displaying multiple cameras in parallel with overlapping video content. It is commonly used in live-video scenes to provide the user with a larger viewing angle and a better video presentation. Based on this, if an overlapping area exists among the plurality of video frames, the overlapping area can be eliminated by the video stitching technology, obtaining an environment image with a larger visual angle and better experience. In addition, since the environment image is obtained by stitching a plurality of video frames captured at the same time, the synchronism of the multiple paths of video is ensured, the robustness of the video stitching algorithm is ensured, the stitching effect is greatly improved, and a better experience is brought to the end user.
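As a toy illustration of overlap elimination (real video stitching estimates the overlap with feature matching and blends the seam; here the overlap width is assumed to be known, and the function is hypothetical), the duplicated columns shared by two horizontally adjacent views can simply be kept once:

```python
def merge_overlap(left, right, overlap):
    """Join two views whose rightmost / leftmost `overlap` columns show
    the same scene, keeping the shared columns only once."""
    return [l_row + r_row[overlap:] for l_row, r_row in zip(left, right)]

# The column of 3s is captured by both views and appears once in the result.
merged = merge_overlap([[1, 2, 3]], [[3, 4, 5]], overlap=1)
```

This also shows why the step belongs on the second device: estimating and removing overlap is the computationally expensive part, while the simple placement done on the first device is not.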
As a possible implementation manner, the overlapping area can be eliminated through an artificial intelligence technology, that is, a plurality of video frames are spliced through the artificial intelligence technology to obtain an environment image, and a control instruction is obtained according to the environment image.
It should be noted that artificial intelligence (Artificial Intelligence, AI) is a theory, method, technique, and application system that simulates, extends, and expands human intelligence, senses environment, obtains knowledge, and uses knowledge to obtain optimal results using a digital computer or a machine controlled by a digital computer. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and other directions.
In the embodiments of the present application, the artificial intelligence technology mainly includes the directions of the computer vision technology and the like. Computer Vision (CV) is a science of studying how to "look" a machine, and more specifically, to replace a human eye with a camera and a Computer to perform machine Vision such as recognition and measurement on a target, and further perform graphic processing to make the Computer process an image more suitable for human eye observation or transmission to an instrument for detection. As a scientific discipline, computer vision research-related theory and technology has attempted to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, optical character recognition (Optical Character Recognition, OCR), video processing, video semantic understanding (video semantic understanding, VSU), video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, synchronous positioning and mapping, autopilot, intelligent transportation, etc., as well as common biometric technologies such as face recognition, fingerprint recognition, etc.
S207: the second device sends a control instruction to the first device.
S208: and the first equipment executes the operation corresponding to the control instruction according to the control instruction.
Therefore, the first device acquires the control instruction from the second device, and the first device executes the operation corresponding to the control instruction according to the control instruction, so that the second device remotely controls the first device.
According to the technical scheme, the first device controlled by the second device continuously acquires surrounding environmental information, for example by shooting video, so that the second device can send control instructions. Because the shooting angle of one video frame is limited and provides little environmental information around the first device, a plurality of video frames describing the environmental information of the first device at different shooting angles are acquired, together with important parameters of each video frame, where the important parameters describe the importance degree of the corresponding video frame among the plurality of video frames. A splicing mode of the plurality of video frames is determined based on the important parameters, and the plurality of video frames are spliced based on the splicing mode to obtain a large video picture, in which the greater the importance degree indicated by an important parameter, the greater the image resolution of the corresponding video frame; the more important video frames therefore include more information, or lose less information. After the large video picture is sent to the second device, the second device can obtain a control instruction based on the large video picture and send it to the first device, and the first device performs the corresponding operation according to the control instruction, realizing remote control of the first device by the second device.
Therefore, the first device does not respectively send a plurality of video frames from multiple paths of videos, but is spliced into a large video picture to be jointly sent to the second device, the probability of delay is reduced, the delay from end to end is reduced, and the real-time performance and the safety of remote control are improved. In addition, a plurality of video frames included in the large video picture are generated at the same time, so that the synchronism among multiple paths of videos can be ensured, the video splicing operation is completed in the second device, the requirement of the video splicing on the resources of the first device is reduced, and the service time and the service life of the first device can be prolonged.
In order to facilitate further understanding of the technical solutions provided in the embodiments of the present application, the following describes a remote control method in an overall exemplary manner.
Referring to fig. 6, the application scenario of the remote control method provided in the embodiment of the present application is schematically shown. In fig. 6, the first device is an industrial mobile device and the second device is a remote control device.
The first device can realize functions such as video acquisition, video splicing, and video encoding. Specifically, the first device may capture video through a plurality of photographing devices, continuously collecting surrounding environmental information to obtain multiple paths of video (video acquisition). Then, the first device splices together the video frames at the same time from the multiple paths of video to obtain a large video picture (video splicing). Finally, the first device encodes the large video picture to obtain an encoded video (video encoding) and transmits the encoded video to the second device.
The second device may implement video decoding, video splitting, video stitching, etc. Specifically, the second device, after receiving the encoded video, decodes the encoded video to obtain a large video picture (video decoding). Then, the second device splits the large video frame to obtain a plurality of video frames (video splitting). Finally, the second device splices the plurality of video frames to obtain an environment image (video splicing), so as to acquire a control instruction for controlling the first device and send the control instruction to the first device, thereby realizing remote control of the second device on the first device.
Since remote control is generally applied in industry, taking the first device as an industrial mobile device as an example, the implementation of remote control of the industrial mobile device in an industrial application scenario will be described below.
As a possible implementation manner, in an industrial application scenario there is a high requirement on the real-time performance of remote control, and to avoid the clutter caused by wired lines, data transmission can be implemented on a fifth generation wireless cellular technology (5th generation mobile networks or 5th generation wireless systems, 5G) architecture. Specifically, the first device sends the large video picture to the streaming media server via 5G so that the second device pulls the large video picture from the streaming media server.
The industrial mobile device as the first device may be a 5G terminal or a wired terminal, which is not particularly limited in this application. For example, if the industrial mobile device is a 5G terminal, it may directly access the 5G network to send the large video picture to the streaming media server. For another example, if the industrial mobile device is a wired terminal, customer premise equipment (Customer Premise Equipment, CPE) needs to be deployed, for example an industrial personal computer inside the industrial mobile device; the industrial personal computer and the 5G CPE are then connected through a wired network to access the 5G network and send the large video picture to the streaming media server. The 5G network and the streaming media server are general-purpose components, which are not particularly limited in this application and can be selected by those skilled in the art according to actual needs.
Referring to fig. 7, a schematic architecture diagram of a remote control system according to an embodiment of the present application is provided. In fig. 7, the remote control system includes an industrial mobile device, a 5G base station, a 5G core network, a streaming server, and a remote control device, respectively, which will be described below.
(1) The industrial mobile equipment is first equipment and comprises the industrial personal computer, wherein the industrial personal computer is responsible for controlling 3 cameras to acquire 3 paths of videos, then 3 video frames of each path of video at the same moment are spliced to obtain a large video picture, then the large video picture is encoded to obtain an encoded video, and the encoded video is sent to a streaming media server of a cloud end through a 5G network.
(2) The 5G network includes a 5G base station and a 5G core network. The method is responsible for transmitting uplink and downlink data, transmitting coded video sent by the industrial mobile device to the streaming media server in the uplink direction, and transmitting control instructions sent by the streaming media server to the industrial mobile device in the downlink direction.
(3) The streaming media server belongs to a cloud server and is responsible for connecting the remote control device and the industrial mobile device. As a possible implementation manner, the streaming media server provides a multimedia transmission channel and a control data transmission channel. The multimedia transmission channel is used for the transmission and distribution of multimedia data such as video and audio; for example, the large video picture is sent to the streaming media server over the 5G network through this channel. The control data transmission channel is used for the transmission and distribution of lightweight data, such as control instructions and state data, between the industrial mobile device and the remote control device; for example, the splicing mode or a control instruction is sent to the streaming media server through this channel. Because multimedia data requires higher bandwidth and has a larger data volume, packet loss and long transmission times often occur, while control instructions are of higher importance; if the two shared one channel, important control instruction packets might be lost or delayed. Transmitting them over two separate channels improves the stability of data transmission and the safety of remote control, and improves the user's experience.
(4) The remote control device is a second device, and is responsible for pulling a video stream from the streaming media server, then decoding the encoded video to obtain a large video picture, splitting the large video picture into a plurality of video frames, then performing video stitching to obtain an environment image, and displaying the environment image to a user in real time so that the user can issue a control instruction for the industrial mobile device. In addition, the control instruction issued by the user can be issued to the industrial mobile equipment through the streaming media server and the 5G network.
As a possible implementation manner, the functions of video decoding, video splitting, and video stitching may be completed by the streaming media server, so that the remote control device only needs a display function. For example, after the streaming media server encodes the environment image into one path of video stream, the remote control device pulls the video stream, decodes it to obtain the environment image, and displays the environment image to the user.
In order to facilitate further understanding of the technical solution provided in the embodiments of the present application, an architecture shown in fig. 7 is taken as an example, and an overall exemplary description is given to the remote control method provided in the embodiments of the present application.
Referring to fig. 8, a schematic flowchart of the remote control method provided in the embodiment of the present application is shown.
S801: the industrial mobile device obtains a plurality of video frames at the same time and important parameters of the plurality of video frames.
S802: the industrial mobile device determines a splicing mode of a plurality of video frames according to the important parameters.
S803: and the industrial mobile equipment performs splicing on the plurality of video frames according to the splicing mode to obtain a large video picture.
S804: the industrial mobile device encodes the large video picture to obtain an encoded video.
S805: the industrial mobile device sends the encoded video to a streaming server.
S806: the remote control device pulls the encoded video from the streaming server.
S807: the remote control device decodes the encoded video to obtain a large video picture.
S808: the remote control device splits the large video frames according to the splitting mode corresponding to the splitting mode to obtain a plurality of video frames.
S809: the remote control device splices a plurality of video frames through a video splicing technology to obtain an environment image.
S810: and the remote control equipment acquires a control instruction according to the environment image.
S811: the remote control device sends a control instruction to the streaming media server.
S812: the streaming media server sends control instructions to the industrial mobile device.
S813: and the industrial mobile equipment executes the operation corresponding to the control instruction according to the control instruction.
The application further provides a remote control device corresponding to the first equipment for the remote control method, so that the remote control method is practically applied and realized.
Referring to fig. 9, the structure of a remote control device according to an embodiment of the present application is shown. As shown in fig. 9, the remote control apparatus 900 is built in a first device, and includes: an acquisition unit 901, a determination unit 902, a splicing unit 903, a transmission unit 904, and an execution unit 905;
the acquiring unit 901 is configured to acquire a plurality of video frames at the same time and important parameters of the plurality of video frames, where the plurality of video frames are used to describe environmental information of the first device at different shooting angles, and the important parameters are used to describe importance degrees of corresponding video frames in the plurality of video frames;
the determining unit 902 is configured to determine a splicing manner of the plurality of video frames according to the important parameter;
the splicing unit 903 is configured to splice the plurality of video frames according to the splicing manner to obtain a large video picture, where, in the large video picture, the greater the importance degree identified by the important parameter, the greater the image resolution of the corresponding video frame;
The sending unit 904 is configured to send the large video frame to a second device;
the acquiring unit 901 is further configured to acquire a control instruction from the second device, where the control instruction is obtained according to the large video frame;
the execution unit 905 is configured to execute an operation corresponding to the control instruction according to the control instruction.
In the technical solution described above, the second device controls the first device, and the first device continuously collects surrounding environment information, for example by shooting video, and provides it to the second device so that the second device can issue control instructions. Because the shooting angle of a single video frame is limited and provides little environmental information around the first device, a plurality of video frames describing the environmental information of the first device at different shooting angles are acquired, together with an important parameter of each video frame, where the important parameter describes the importance degree of the corresponding video frame among the plurality of video frames. A splicing manner of the plurality of video frames is then determined based on the important parameters, and the plurality of video frames are spliced based on the splicing manner to obtain a large video picture. In the large video picture, the greater the importance degree identified by an important parameter, the greater the image resolution of the corresponding video frame, so that a more important video frame carries more information, or loses less information. After the large video picture is sent to the second device, the second device can obtain a control instruction based on the large video picture and send the control instruction to the first device, so that the first device performs the corresponding operation according to the control instruction, thereby realizing remote control of the first device by the second device.
Therefore, instead of sending a plurality of video frames from multiple video channels separately, the first device splices them into one large video picture and sends them together to the second device, which reduces the probability of delay, lowers the end-to-end latency, and improves the real-time performance and safety of remote control. In addition, because the plurality of video frames included in the large video picture are generated at the same time, synchronization among the multiple video channels is guaranteed; and because the video stitching operation is completed on the second device, the resource demand that video stitching places on the first device is reduced, which can prolong the service time and service life of the first device.
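The rule "the greater the importance degree, the greater the image resolution" can be sketched before splicing as follows; the integer-stride subsampling used here as the importance-to-resolution mapping is an assumption, since the embodiment does not fix a concrete mapping:

```python
import numpy as np

def scale_by_importance(frame, importance, max_importance):
    """Subsample a frame so that less important frames keep a lower
    resolution in the large video picture.

    The stride-based mapping is a stand-in for whatever mapping the
    splicing mode actually uses; stride 1 means full resolution.
    """
    stride = 1 + (max_importance - importance)
    return frame[::stride, ::stride]

frames = [np.zeros((8, 8, 3), dtype=np.uint8)] * 3
importances = [3, 2, 1]  # higher value = more important video frame
scaled = [scale_by_importance(f, imp, max(importances))
          for f, imp in zip(frames, importances)]
# The most important frame keeps full resolution; less important ones shrink
assert scaled[0].shape == (8, 8, 3)
assert scaled[1].shape == (4, 4, 3)
assert scaled[2].shape == (3, 3, 3)
```

This is one concrete way the more important frames can contribute more information content to the large video picture while bounding its total size.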
As a possible implementation manner, the obtaining unit 901 is further configured to obtain an operating state of the first device;
the determining unit 902 is further configured to determine important parameters of a plurality of video frames according to the working state.
As a possible implementation manner, the sending unit 904 is further configured to send an important parameter obtaining request to the second device;
the obtaining unit 901 is further configured to obtain important parameters of a plurality of video frames from the second device.
As a possible implementation manner, the obtaining unit 901 is further configured to obtain the number of the plurality of video frames;
The determining unit 902 is specifically configured to determine a splicing manner of a plurality of video frames according to the number and the important parameters.
As a possible implementation manner, the remote control device 900 further includes an encoding unit, configured to encode the large video frame to obtain an encoded video;
the sending unit 904 is specifically configured to send the encoded video to the second device.
As a possible implementation manner, if the first device is an industrial mobile device, the sending unit 904 is specifically configured to send the large video frame to a streaming media server through a fifth generation wireless cellular technology, so that the second device pulls the large video frame from the streaming media server.
As a possible implementation manner, the streaming media server includes a multimedia transmission channel and a control data transmission channel, and the sending unit 904 is specifically configured to send, through the fifth generation wireless cellular technology, the large video picture to the streaming media server by using the multimedia transmission channel, and send the splicing mode or the control instruction to the streaming media server by using the control data transmission channel.
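The two transmission channels described above can be sketched as two independent logical queues; the class and method names are illustrative, and a real deployment would map these onto the streaming media server's actual transport:

```python
from queue import Queue

class StreamingChannels:
    """Two logical channels toward the streaming media server: one for
    large video pictures, one for splicing modes and control
    instructions, so small control data is never queued behind bulky
    video data."""

    def __init__(self):
        self.multimedia = Queue()
        self.control = Queue()

    def send_picture(self, picture):
        self.multimedia.put(("picture", picture))

    def send_control(self, payload):
        self.control.put(("control", payload))

channels = StreamingChannels()
channels.send_picture(b"<encoded large video picture>")
channels.send_control({"splicing_mode": "3x1-grid"})
channels.send_control({"instruction": "move_forward"})
assert channels.multimedia.qsize() == 1
assert channels.control.qsize() == 2
```

Separating the channels keeps the latency of control instructions independent of the size of the video payload, which matters for the real-time safety goals stated above.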
The present application further provides a remote control apparatus corresponding to the second device for the foregoing remote control method, so that the remote control method can be applied and realized in practice.
Referring to fig. 10, a schematic structural diagram of a remote control device according to an embodiment of the present application is shown. As shown in fig. 10, the remote control apparatus 1000 is built in a second device, and includes: an acquisition unit 1001, a splitting unit 1002, and a transmission unit 1003;
the obtaining unit 1001 is configured to obtain a large video picture from a first device, where the large video picture is obtained by splicing a plurality of video frames in a splicing manner, the plurality of video frames are used to describe environmental information of the first device at different shooting angles, the splicing manner is determined according to important parameters of the plurality of video frames, and the important parameters are used to describe the importance degree of the corresponding video frame among the plurality of video frames;
the splitting unit 1002 is configured to split the large video picture according to a splitting manner corresponding to the splicing manner, so as to obtain a plurality of video frames;
the obtaining unit 1001 is further configured to obtain a control instruction according to a plurality of video frames;
The sending unit 1003 is configured to send the control instruction to the first device.
In the technical solution described above, the second device controls the first device, and the first device continuously collects surrounding environment information, for example by shooting video, and provides it to the second device so that the second device can issue control instructions. Because the shooting angle of a single video frame is limited and provides little environmental information around the first device, a plurality of video frames describing the environmental information of the first device at different shooting angles are acquired, together with an important parameter of each video frame, where the important parameter describes the importance degree of the corresponding video frame among the plurality of video frames. A splicing manner of the plurality of video frames is then determined based on the important parameters, and the plurality of video frames are spliced based on the splicing manner to obtain a large video picture. In the large video picture, the greater the importance degree identified by an important parameter, the greater the image resolution of the corresponding video frame, so that a more important video frame carries more information, or loses less information. After the large video picture is sent to the second device, the second device can obtain a control instruction based on the large video picture and send the control instruction to the first device, so that the first device performs the corresponding operation according to the control instruction, thereby realizing remote control of the first device by the second device.
Therefore, instead of sending a plurality of video frames from multiple video channels separately, the first device splices them into one large video picture and sends them together to the second device, which reduces the probability of delay, lowers the end-to-end latency, and improves the real-time performance and safety of remote control. In addition, because the plurality of video frames included in the large video picture are generated at the same time, synchronization among the multiple video channels is guaranteed; and because the video stitching operation is completed on the second device, the resource demand that video stitching places on the first device is reduced, which can prolong the service time and service life of the first device.
As a possible implementation manner, if there is an overlapping area between the plurality of video frames, the remote control apparatus 1000 further includes a splicing unit, configured to splice the plurality of video frames by using a video splicing technology, so as to obtain an environmental image, where the environmental image is used to describe environmental information near the first device;
the acquiring unit 1001 is specifically configured to acquire a control instruction according to the environmental image.
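The overlap elimination performed by the splicing unit can be sketched as follows. A real video splicing technology estimates the overlap via feature matching and blends the seam; here, for illustration, a fixed overlap width is assumed to be known, and the function name is illustrative:

```python
import numpy as np

def stitch_with_overlap(frames, overlap):
    """Stitch horizontally adjacent frames into an environment image,
    keeping the overlapping columns of each pair of frames only once.

    `overlap` is the assumed-known number of shared columns between
    neighbouring shooting angles.
    """
    panorama = frames[0]
    for frame in frames[1:]:
        # Drop the duplicated leading columns of each subsequent frame
        panorama = np.concatenate([panorama, frame[:, overlap:]], axis=1)
    return panorama

left = np.zeros((4, 6, 3), dtype=np.uint8)
right = np.ones((4, 6, 3), dtype=np.uint8)
env_image = stitch_with_overlap([left, right], overlap=2)
# 6 + (6 - 2) columns: the overlapping region appears only once
assert env_image.shape == (4, 10, 3)
```

This is why the earlier concatenation-style splicing on the first device deliberately performs no cancellation of overlaps: the heavier overlap-eliminating stitch is deferred to the second device, which has more resources.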
As a possible implementation manner, the obtaining unit 1001 is further configured to obtain, from the first device, an encoded video, where the encoded video is obtained by encoding the large video picture;
The remote control device 1000 further includes a decoding unit, configured to decode the encoded video to obtain the large video frame.
The embodiment of the present application further provides a computer device, which may be a server or a terminal device; the remote control apparatus described above may be built into the server or the terminal device. The computer device provided in the embodiment of the present application is described below from the perspective of hardware. Fig. 11 is a schematic structural diagram of a server, and fig. 12 is a schematic structural diagram of a terminal device.
Referring to fig. 11, which is a schematic diagram of a server structure provided in an embodiment of the present application, the server 1400 may vary considerably in configuration or performance, and may include one or more processors 1422 (e.g., central processing units, CPU), memory 1432, and one or more storage media 1430 (e.g., one or more mass storage devices) storing application programs 1442 or data 1444. The memory 1432 and the storage medium 1430 may be transitory or persistent storage. The program stored in the storage medium 1430 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Further, the processor 1422 may be configured to communicate with the storage medium 1430 to execute, on the server 1400, the series of instruction operations in the storage medium 1430.
Server 1400 may also include one or more power supplies 1426, one or more wired or wireless network interfaces 1450, one or more input/output interfaces 1458, and/or one or more operating systems 1441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 11.
Wherein, the CPU 1422 is configured to perform the following steps:
acquiring a plurality of video frames at the same moment and important parameters of the video frames, wherein the video frames are used for describing environment information of the first equipment at different shooting visual angles, and the important parameters are used for describing importance degrees of corresponding video frames in the video frames;
determining a splicing mode of a plurality of video frames according to the important parameters;
splicing a plurality of video frames according to the splicing mode to obtain a large video picture, wherein the greater the importance degree of the important parameter identification is, the greater the image resolution of the corresponding video frame is;
transmitting the large video frame to a second device;
acquiring a control instruction from the second equipment, wherein the control instruction is obtained according to the large video picture;
And executing the operation corresponding to the control instruction according to the control instruction.
Alternatively, the CPU 1422 is configured to perform the following steps:
acquiring a large video picture from a first device, wherein the large video picture is obtained by splicing a plurality of video frames in a splicing mode, the video frames are used for describing environmental information of the first device at different shooting angles, the splicing mode is determined according to important parameters of the video frames, and the important parameters are used for describing the importance degree of the corresponding video frames in the video frames;
splitting the large video picture according to a splitting mode corresponding to the splicing mode to obtain a plurality of video frames;
acquiring a control instruction according to a plurality of video frames;
and sending the control instruction to the first equipment.
Optionally, the CPU 1422 may further perform method steps of any specific implementation of the remote control method in the embodiments of the present application.
Referring to fig. 12, the structure of a terminal device provided in an embodiment of the present application is shown. Fig. 12 is a block diagram illustrating a part of a structure of a smart phone related to a terminal device provided in an embodiment of the present application, where the smart phone includes: radio Frequency (RF) circuitry 1510, memory 1520, input unit 1530, display unit 1540, sensor 1550, audio circuitry 1560, wireless fidelity (WiFi) module 1570, processor 1580, power supply 1590, and the like. Those skilled in the art will appreciate that the smartphone structure shown in fig. 12 is not limiting of the smartphone and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
The following describes the components of the smart phone in detail with reference to fig. 12:
the RF circuit 1510 may be used for receiving and transmitting signals during a message or a call; in particular, after downlink information of a base station is received, it is handed to the processor 1580 for processing, and in addition, uplink data is sent to the base station.
The memory 1520 may be used to store software programs and modules, and the processor 1580 implements various functional applications and data processing of the smartphone by running the software programs and modules stored in the memory 1520.
The input unit 1530 may be used to receive input numerical or character information and generate key signal inputs related to user settings and function control of the smart phone. In particular, the input unit 1530 may include a touch panel 1531 and other input devices 1532. The touch panel 1531, also referred to as a touch screen, may collect touch operations by the user on or near it, and drive the corresponding connection device according to a preset program. In addition to the touch panel 1531, the input unit 1530 may include other input devices 1532, which may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick.
The display unit 1540 may be used to display information input by a user or information provided to the user and various menus of the smart phone. The display unit 1540 may include a display panel 1541, and optionally, the display panel 1541 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like.
The smartphone may also include at least one sensor 1550, such as a light sensor, a motion sensor, or other sensors. Other sensors that may also be configured on the smart phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described in detail herein.
Audio circuitry 1560, speaker 1561, and microphone 1562 may provide an audio interface between a user and the smart phone. On one hand, the audio circuit 1560 may transmit an electrical signal, converted from received audio data, to the speaker 1561, which converts it into a sound signal for output; on the other hand, the microphone 1562 converts collected sound signals into electrical signals, which are received by the audio circuit 1560 and converted into audio data. The audio data is then processed by the processor 1580 and, for example, sent to another smart phone via the RF circuit 1510, or output to the memory 1520 for further processing.
Processor 1580 is a control center of the smartphone, connects various parts of the entire smartphone with various interfaces and lines, performs various functions of the smartphone and processes data by running or executing software programs and/or modules stored in memory 1520, and invoking data stored in memory 1520. In the alternative, processor 1580 may include one or more processing units.
The smart phone also includes a power source 1590 (e.g., a battery) for powering the various components. The power source may be logically connected to the processor 1580 via a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system.
Although not shown, the smart phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In an embodiment of the present application, the memory 1520 included in the smart phone may store program codes and transmit the program codes to the processor.
The processor 1580 included in the smart phone may execute the remote control method provided in the foregoing embodiment according to the instructions in the program code.
The present application also provides a computer-readable storage medium storing a computer program for executing the remote control method provided by the above embodiment.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the remote control method provided in the various alternative implementations of the above aspects.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, where the above program may be stored in a computer readable storage medium, and when the program is executed, the program performs steps including the above method embodiments; and the aforementioned storage medium may be at least one of the following media: read-Only Memory (ROM), RAM, magnetic disk or optical disk, etc.
It should be noted that the embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
The foregoing is merely one specific embodiment of the present application, but the protection scope of the present application is not limited thereto; any changes or substitutions that can be readily conceived by those skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Further combinations may be made on the basis of the implementations provided in the above aspects to provide further implementations. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (19)
1. A remote control method, the method being applied to a first device, the method comprising:
acquiring a plurality of video frames at the same moment and important parameters of the video frames, wherein the video frames are used for describing environment information of the first equipment at different shooting visual angles, and the important parameters are used for describing importance degrees of corresponding video frames in the video frames;
determining a splicing mode of a plurality of video frames according to the important parameters;
splicing a plurality of video frames according to the splicing mode to obtain a large video picture, wherein the greater the importance degree of the important parameter identification is, the greater the image resolution of the corresponding video frame is; the splicing refers to putting together a plurality of video frames, and does not perform a cancellation operation on the overlapping area of the video frames;
The large video picture is sent to a second device, so that the second device splits the large video picture according to a splitting mode corresponding to the splicing mode to obtain a plurality of video frames; if the plurality of video frames have overlapping areas, the plurality of video frames are spliced by a video splicing technology to eliminate the overlapping areas of the plurality of video frames, and an environment image is obtained, where the environment image is used for describing environment information near the first device; a control instruction is acquired according to the environment image, and the control instruction is sent to the first device;
acquiring a control instruction from the second equipment, wherein the control instruction is obtained according to the large video picture;
and executing the operation corresponding to the control instruction according to the control instruction.
2. The method according to claim 1, wherein the method further comprises:
acquiring the working state of the first equipment;
and determining important parameters of a plurality of video frames according to the working state.
3. The method according to claim 1, wherein the method further comprises:
sending an important parameter acquisition request to the second equipment;
Important parameters of a plurality of the video frames are acquired from the second device.
4. The method according to claim 1, wherein the method further comprises:
acquiring the number of a plurality of video frames;
the determining a splicing mode of a plurality of video frames according to the important parameters comprises the following steps:
and determining the splicing mode of a plurality of video frames according to the number and the important parameters.
5. The method of claim 1, wherein the sending the large video picture to the second device comprises:
coding the large video picture to obtain a coded video;
and transmitting the encoded video to the second device.
6. The method of claim 1, wherein if the first device is an industrial mobile device, the sending the large video frame to the second device comprises:
and sending the large video picture to a streaming media server through a fifth generation wireless cellular technology so that the second equipment can pull the large video picture from the streaming media server.
7. The method of claim 6, wherein the streaming server comprises a multimedia transmission channel and a control data transmission channel, wherein the sending the large video frames to the streaming server via fifth generation wireless cellular technology comprises:
Transmitting the large video picture to the streaming media server by utilizing the multimedia transmission channel through the fifth generation wireless cellular technology;
the method further comprises the steps of:
and sending the splicing mode or the control instruction to the streaming media server by utilizing the control data transmission channel.
8. A remote control method, the method being applied to a second device, the method comprising:
acquiring a large video picture from a first device, wherein the large video picture is obtained by splicing a plurality of video frames at the same moment in a splicing way, the video frames are used for describing environment information of the first device at different shooting visual angles, the splicing way is determined according to important parameters of the video frames, and the important parameters are used for describing the importance degree of the corresponding video frame in the video frames; the splicing refers to putting together a plurality of the video frames without performing a cancel operation on an overlapping area of the plurality of the video frames;
splitting the large video picture according to a splitting mode corresponding to the splicing mode to obtain a plurality of video frames;
if the plurality of video frames have overlapping areas, splicing the plurality of video frames through a video splicing technology to eliminate the overlapping areas of the plurality of video frames, so as to obtain an environment image, wherein the environment image is used for describing environment information near the first equipment;
Acquiring a control instruction according to the environment image;
and sending the control instruction to the first equipment.
9. A remote control apparatus, the apparatus being built into a first device, the apparatus comprising: the device comprises an acquisition unit, a determination unit, a splicing unit, a sending unit and an execution unit;
the acquisition unit is used for acquiring a plurality of video frames at the same moment and important parameters of the video frames, wherein the video frames are used for describing environment information of the first equipment at different shooting visual angles, and the important parameters are used for describing importance degrees of the corresponding video frames in the video frames;
the determining unit is used for determining the splicing mode of a plurality of video frames according to the important parameters;
the splicing unit is used for splicing the plurality of video frames according to the splicing mode to obtain a large video picture, wherein the greater the importance degree of the important parameter identification is, the greater the image resolution of the corresponding video frame is in the large video picture; the splicing refers to putting together a plurality of video frames, and does not perform a cancellation operation on the overlapping area of the video frames;
the sending unit is configured to send the large video picture to a second device, so that the second device splits the large video picture according to a splitting mode corresponding to the splicing mode to obtain a plurality of video frames; if the plurality of video frames have overlapping areas, the plurality of video frames are spliced through a video splicing technology to eliminate the overlapping areas of the plurality of video frames, and an environmental image is obtained, where the environmental image is used for describing environmental information near the first device; a control instruction is acquired according to the environmental image, and the control instruction is sent to the first device;
The acquisition unit is further used for acquiring a control instruction from the second device, wherein the control instruction is obtained according to the large video picture;
the execution unit is used for executing the operation corresponding to the control instruction according to the control instruction.
10. The apparatus according to claim 9, wherein:
the acquisition unit is further used for acquiring the working state of the first equipment;
the determining unit is further configured to determine important parameters of the plurality of video frames according to the working state.
11. The apparatus according to claim 9, wherein:
the sending unit is further configured to send an important parameter acquisition request to the second device;
the obtaining unit is further configured to obtain important parameters of the plurality of video frames from the second device.
12. The apparatus according to claim 9, wherein:
the acquisition unit is further used for acquiring the number of the plurality of video frames;
the determining unit is specifically configured to determine a splicing manner of the plurality of video frames according to the number and the important parameter.
13. The apparatus of claim 9, wherein the apparatus further comprises:
The coding unit is used for coding the large video picture to obtain a coded video;
the transmitting unit is specifically configured to transmit the encoded video to the second device.
14. The apparatus according to claim 9, wherein if the first device is an industrial mobile device, the sending unit is specifically configured to send the large video frame to a streaming server through a fifth generation wireless cellular technology, so that the second device pulls the large video frame from the streaming server.
15. The apparatus according to claim 14, wherein the streaming media server comprises a multimedia transmission channel and a control data transmission channel, and the sending unit is specifically configured to send, through the fifth generation wireless cellular technology, the large video picture to the streaming media server by using the multimedia transmission channel, and send the splicing mode or the control instruction to the streaming media server by using the control data transmission channel.
16. A remote control apparatus, built into a second device, the apparatus comprising: an acquisition unit, a splitting unit, a sending unit, and a splicing unit;
the acquisition unit is configured to acquire a large video picture from a first device, wherein the large video picture is obtained by splicing, in a splicing manner, a plurality of video frames captured at the same moment, the plurality of video frames describe environmental information of the first device at different shooting angles of view, the splicing manner is determined according to importance parameters of the plurality of video frames, and the importance parameters describe the degree of importance of corresponding video frames among the plurality of video frames; the splicing refers to putting the plurality of video frames together without performing an elimination operation on an overlapping area of the plurality of video frames;
the splitting unit is configured to split the large video picture in a splitting manner corresponding to the splicing manner to obtain the plurality of video frames;
the splicing unit is configured to, if the plurality of video frames have an overlapping area, splice the plurality of video frames through a video splicing technology to eliminate the overlapping area of the plurality of video frames and obtain an environmental image, wherein the environmental image describes environmental information near the first device;
the acquisition unit is further configured to obtain a control instruction according to the environmental image;
the sending unit is configured to send the control instruction to the first device.
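The splice / split / eliminate-overlap pipeline of claim 16 can be sketched with plain 2D lists standing in for frames (a toy illustration under assumed names; a real system would use an actual video stitching library): splicing concatenates frames without touching the overlap, splitting inverts it using the known splicing manner, and stitching blends the shared columns away:

```python
def splice(frames):
    """Put frames side by side without removing overlap (the 'splicing'
    of the claims). Each frame is a list of equal-length pixel rows."""
    return [sum(rows, []) for rows in zip(*frames)]

def split(big, widths):
    """Inverse of splice: cut the large picture back into frames using
    the splitting manner (here, the known per-frame widths)."""
    frames, x = [], 0
    for w in widths:
        frames.append([row[x:x + w] for row in big])
        x += w
    return frames

def stitch(left, right, overlap):
    """Toy overlap elimination standing in for a real video splicing
    technology: average the shared columns, keep the rest."""
    out = []
    for lr, rr in zip(left, right):
        shared = [(a + b) // 2 for a, b in zip(lr[-overlap:], rr[:overlap])]
        out.append(lr[:-overlap] + shared + rr[overlap:])
    return out
```

Round-tripping two frames through `splice` and `split`, then applying `stitch`, yields a single environmental image with the duplicated column removed.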
17. A remote control system, the system comprising a first device and a second device;
the first device being configured to perform the method of any one of claims 1-7;
the second device being configured to perform the method of claim 8.
18. A computer device, the computer device comprising a processor and a memory:
the memory being configured to store a computer program and to transmit the computer program to the processor;
the processor being configured to perform, according to the computer program, the method of any one of claims 1-7 or the method of claim 8.
19. A computer-readable storage medium for storing a computer program, wherein the computer program is configured to perform the method of any one of claims 1-7 or the method of claim 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311165582.4A CN116916172B (en) | 2023-09-11 | 2023-09-11 | Remote control method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116916172A (en) | 2023-10-20
CN116916172B (en) | 2024-01-09
Family
ID=88367140
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311165582.4A Active CN116916172B (en) | 2023-09-11 | 2023-09-11 | Remote control method and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116916172B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112188136A (en) * | 2020-09-24 | 2021-01-05 | 高新兴科技集团股份有限公司 | Method, system, storage medium and equipment for splicing and recording videos in real time in all-in-one mode |
CN112671935A (en) * | 2021-03-17 | 2021-04-16 | 中智行科技有限公司 | Method for remotely controlling vehicle, road side equipment and cloud platform |
CN112965503A (en) * | 2020-05-15 | 2021-06-15 | 东风柳州汽车有限公司 | Multi-path camera fusion splicing method, device, equipment and storage medium |
CN115359688A (en) * | 2022-08-31 | 2022-11-18 | 矩数(西安)智能科技有限公司 | Remote operation guidance and teaching system and method |
CN115567661A (en) * | 2022-09-23 | 2023-01-03 | 上海微创医疗机器人(集团)股份有限公司 | Video data processing method, system, computer device and storage medium |
CN116248836A (en) * | 2023-01-09 | 2023-06-09 | 山东新一代信息产业技术研究院有限公司 | Video transmission method, device and medium for remote driving |
WO2023142127A1 (en) * | 2022-01-30 | 2023-08-03 | 浙江大学 | Coding and decoding methods and apparatuses, device, and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10764585B2 (en) | Reprojecting holographic video to enhance streaming bandwidth/quality | |
US11348283B2 (en) | Point cloud compression via color smoothing of point cloud prior to texture video generation | |
US10997943B2 (en) | Portable compute case for storing and wirelessly communicating with an eyewear device | |
WO2019074313A1 (en) | Method and apparatus for rendering three-dimensional content | |
EP3364273A1 (en) | Electronic device and method for transmitting and receiving image data in the electronic device | |
US10944991B2 (en) | Prediction for matched patch index coding | |
US20150244984A1 (en) | Information processing method and device | |
TW202137758A (en) | 3D display device, method and terminal | |
CN113409468B (en) | Image processing method and device, electronic equipment and storage medium | |
CN116916172B (en) | Remote control method and related device | |
US11875539B2 (en) | Partial access metadata for video-based point cloud compression data | |
KR101410837B1 (en) | Apparatus for managing image by monitoring video memory | |
CN114513512B (en) | Interface rendering method and device | |
KR20160067798A (en) | Method and device for post processing of a video stream | |
Hasper et al. | Remote execution vs. simplification for mobile real-time computer vision | |
EP4156704A1 (en) | Method and device for transmitting image content using edge computing service | |
CN112650596B (en) | Cross-process sharing method, device and equipment for target data and storage medium | |
CN110677723B (en) | Information processing method, device and system | |
CN112733575B (en) | Image processing method, device, electronic equipment and storage medium | |
CN209344483U (en) | Connecting line, equipment to be powered and electronic equipment | |
CN115516431A (en) | Testing method and device of intelligent camera | |
CN111263124A (en) | High-safety power station operation and maintenance virtual display image generation method, storage medium, virtual display server and virtual display system | |
CN112565800B (en) | Video positioning method, device, equipment, system and storage medium | |
WO2023014094A1 (en) | Method and apparatus for supporting 360 video | |
JP2017098921A (en) | Transmission device, transmission method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||