CN111629269A - Method for automatically shooting and generating mobile terminal short video advertisement based on mechanical arm


Info

Publication number
CN111629269A
CN111629269A (application CN202010452022.7A)
Authority
CN
China
Prior art keywords
lens
script
video
mechanical arm
motion
Prior art date
Legal status
Granted
Application number
CN202010452022.7A
Other languages
Chinese (zh)
Other versions
CN111629269B (en)
Inventor
佘莹莹
李杰峰
林琳
Current Assignee
Xiamen University
Original Assignee
Xiamen University
Priority date
Filing date
Publication date
Application filed by Xiamen University
Priority to CN202010452022.7A
Publication of CN111629269A
Application granted
Publication of CN111629269B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/472: End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205: End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81: Monomedia components thereof
    • H04N21/812: Monomedia components thereof involving advertisement data

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Processing Or Creating Images (AREA)
  • Studio Devices (AREA)

Abstract

The invention provides a method and a system for automatically shooting and generating a mobile-end short video advertisement based on a mechanical arm. The method comprises the following steps: (1) the video style and commodity information are determined through user interaction, and the system generates a video script in combination with a video script model; (2) the system plans the motion path of the mechanical arm for automatic shooting and interactively guides the user to arrange the scene through a wireframe diagram; (3) the video to be edited is analyzed and labeled based on a salient-region algorithm, and the video is edited, synthesized, and rendered according to the labels in combination with the script. The invention explores a new human-machine collaboration mode for generating short video advertisements, significantly improves the efficiency of short video advertisement production, and optimizes the quality of the resulting advertisements.

Description

Method for automatically shooting and generating mobile terminal short video advertisement based on mechanical arm
Technical Field
The invention belongs to the field of video generation, and particularly relates to a method and a system for automatically shooting and generating a mobile-end short video advertisement based on a mechanical arm.
Background
With the rise of short video applications, short video advertising is developing rapidly. For small and micro enterprises, short video advertisements placed on social media achieve considerable conversion rates and can significantly increase commodity sales. However, for these businesses, making short video advertisements is an extremely time-, labor-, and cost-intensive task: among billions of goods, fewer than 1% have short video advertisements.
Short video generation can be roughly divided into three steps: script design, shooting, and post-production editing. For script design, existing short video production tools such as TopVid and Animoto rely on fixed video templates with a single structure; these tools typically offer only broad, limited template categories and do not capture user requirements for the shooting stage.
For shooting, although a series of interaction modes has been explored to guide non-professional users, most of this work stops at the level of still photography. Video shooting involves more dimensions, including lens movement direction, lens movement speed, and dynamic composition, and is therefore more complex; interactive guidance in the video domain is very rare. Professional shooting aids such as handheld stabilizers (for example, from DJI) can reduce camera shake and help non-professional users improve video quality to some extent, but they still require the user to design the movement path and speed of the camera. For non-professional users this carries a high learning cost, and shooting high-quality short videos remains difficult.
Existing video editing is still mostly done by hand. Related work has explored automatic video editing, an important branch of computer vision research. Current automatic editing methods mainly use visual techniques to detect human features (such as faces) or edit based on preset information (such as video scripts and dialogue content). However, these methods are limited to scenes containing people and cannot automatically edit videos, such as short video advertisements, in which most scenes contain no actors.
Disclosure of Invention
The invention provides a method and a system for automatically shooting and generating a mobile terminal short video advertisement based on a mechanical arm. The system generates a video script from user requirements, plans the motion path of the mechanical arm accordingly, and shoots automatically; after shooting, the video is automatically edited by an algorithm based on salient-region analysis. The invention explores a new human-machine collaboration mode for generating short video advertisements, significantly improves the efficiency of making them, and optimizes their quality. The automatic generation of the video mainly comprises the following steps:
1. the system generates a video script based on user requirements and a script generation model
On the front end, the system provides an interactive interface for acquiring user requirements; on the back end, it generates the video script from the acquired requirements in combination with the short video advertisement script model. In the requirement acquisition interface, the user selects the video style and commodity category by tapping, and these selections serve as input parameters. The script model contains design rules for the visual effects of advertising videos. The video script finally generated by the system encodes the visual effect of the video and describes its specific content, including the advertisement duration, the shot breakdown, the dominant color tone, and so on.
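The script-generation step above can be sketched as follows. The rule table, field names, and shot list are hypothetical illustrations of the idea, not the patent's actual model:

```python
# Hypothetical sketch: user choices (style, commodity category) are looked up
# in a rule table and expanded into a concrete video script. All rule values
# below are invented for illustration.
STYLE_RULES = {
    "fresh": {"duration_s": 15, "tempo_bpm": 96, "saturation": "high"},
    "antique": {"duration_s": 20, "tempo_bpm": 72, "saturation": "low"},
}

def generate_script(style: str, category: str) -> dict:
    rule = STYLE_RULES[style]
    shots = [
        {"type": "empty", "material_query": category},            # opening mood shot
        {"type": "commodity", "move": "push_in", "speed": "slow"},
        {"type": "commodity", "move": "orbit", "speed": "medium"},
    ]
    return {"style": style, "category": category,
            "duration_s": rule["duration_s"], "tempo_bpm": rule["tempo_bpm"],
            "shots": shots}
```

The resulting dictionary plays the role of the "video script" that the later planning and editing stages consume.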
2. The system analyzes the video script, plans the motion path of the mechanical arm and guides the user to perform scene setting
According to the visual effect of each shot defined in the script, the system calculates the movement speed and movement curve of the main object (i.e., the commodity) in the two-dimensional camera plane; from these, it calculates the movement speed and movement curve of the camera in three-dimensional space; and from the camera's motion in three-dimensional space, it calculates the position of each degree of freedom of the mechanical arm in the arm's coordinate system using an inverse kinematics algorithm, thereby planning the complete motion path of the mechanical arm for shooting all commodity shots.
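A minimal sketch of the inverse-kinematics step in this pipeline, using a planar two-link arm as a stand-in for the six-degree-of-freedom arm (the patent does not specify its solver, so this analytic two-link solution is purely illustrative):

```python
import math

# Given a target camera position (x, y) in the arm's plane, solve the two
# joint angles of a 2-link arm with link lengths l1, l2 analytically.
def ik_2link(x: float, y: float, l1: float, l2: float):
    d2 = x * x + y * y
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_t2 <= 1.0:
        raise ValueError("target out of reach")
    t2 = math.acos(cos_t2)                      # elbow angle
    # shoulder angle: direction to target minus the elbow's contribution
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2),
                                       l1 + l2 * math.cos(t2))
    return t1, t2
```

Sampling the planned camera trajectory and solving each sample this way yields the per-joint position sequence the text describes.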
Meanwhile, the system reads the position of the commodity in the first commodity shot, calculates the spatial relationship among the mechanical arm, the camera, and the commodity, and generates a corresponding wireframe diagram that guides the user in placing the commodity and props in the shooting scene. The wireframe diagram includes the position of the commodity, the positions of the props, and the angle of the camera. The user arranges the scene under the guidance of the wireframe diagram and fixes the camera at the end of the mechanical arm. The mechanical arm takes this as the initial state for shooting.
3. Salient-region analysis of the video to be edited using a saliency algorithm, followed by compositing and rendering
The mechanical arm shoots one continuous video to be edited. This video comprises: segments moving at constant speed; transition segments between shots; and segments in which the mechanical arm changes direction and accelerates, or decelerates to a stop. The system keeps or discards video segments based on the video script. The saliency algorithm analyzes the video frame by frame to obtain the size and position of the salient region in each frame. To improve the robustness of the algorithm, the results are averaged over every five frames.
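The five-frame averaging can be sketched as follows; the per-frame record layout (center x, center y, radius) is an assumption:

```python
# Average per-frame saliency measurements (cx, cy, radius) over
# non-overlapping windows of five frames to smooth out detector noise.
def smooth_saliency(frames, window=5):
    out = []
    for i in range(0, len(frames) - window + 1, window):
        chunk = frames[i:i + window]
        n = len(chunk)
        out.append(tuple(sum(f[k] for f in chunk) / n for k in range(3)))
    return out
```

Each smoothed tuple then stands in for its window in the key-frame labeling that follows.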
Key frames are selected every five frames and labeled. For each key frame, the positional distance and angular offset relative to several adjacent key frames are analyzed; the motion state of the object in the frame is judged from these offsets, and the key frame is labeled accordingly. After labeling, the video to be edited is cut and composited in combination with the visual feature information defined by the script, finally yielding the advertisement film.
Compared with prior methods, the innovations of the invention are:
1. The professional and abstract process of short video advertisement production is converted into a data model in a computer system, so that the complicated production workflow is replaced by simple operation steps, lowering the barrier to use in a human-machine collaboration system for short video advertisement generation.
2. The shooting of the short video advertisement is completed automatically with the aid of the mechanical arm; the user only needs to select the desired advertisement style and product category on a mobile phone interface and arrange the shooting scene under guidance. Problems that may occur in manual shooting, such as poor composition, uneven speed, and shake, are avoided, improving shooting efficiency and video quality.
3. A salient-region analysis algorithm is used to analyze and edit the video, replacing the time-consuming and laborious steps of manual editing, improving editing efficiency and accuracy.
Drawings
FIG. 1 is a flow chart of the automatic video generation method of the present invention
FIG. 2 is a flow chart of a video script generation method of the present invention
FIG. 3 is a user request acquisition interface in accordance with the present invention
FIG. 4 is a flow chart of a robot arm shooting module of the present invention
FIG. 5 is the scene-arrangement guide wireframe diagram of the present invention
FIG. 6 is an exemplary communication protocol format sent to a robotic arm in accordance with the present invention
FIG. 7 is a diagram illustrating an example of the relationship between the positions of the robot arm, the mobile phone and the object to be photographed
FIG. 8 is a flow chart of an automatic clipping method of the present invention
FIG. 9 is a flowchart of a method for automatically capturing and generating a short video advertisement in accordance with the present invention
Detailed Description
In order to make the objects, technical processes and technical innovation points of the present invention more clearly illustrated, the present invention is further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
In order to achieve the aim, the invention provides a method for automatically shooting and generating a mobile-end short video advertisement based on a mechanical arm. The main process is shown in fig. 1, and the method comprises the following steps:
Step S01: the user interactively determines the video style and commodity information, and a video script is generated. Referring to fig. 2, the specific procedure may be as follows:
On the requirement acquisition interface, the user selects the video style and commodity category as input (see fig. 3). The video styles include fresh style and antique style; the commodity categories include jewelry, tableware, food, daily necessities, and stationery;
the script generator analyzes the user input by combining the formulated video advertisement script model;
The script generator integrates the analysis results and writes the information of each shot into the script based on the video type. For each shot: if it is a commodity shot, its visual features are written into the video script for shooting; if it is an empty shot, the material database is searched for matching empty-shot material. The system traverses the material database, and if material matching the shot's visual features is found, its index is written into the script for rendering.
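The empty-shot material lookup might look like the following sketch; the database layout and tag-matching rule are assumptions:

```python
# Traverse a material database (a list of clip records) and return the
# index of the first clip whose tags cover all the visual features the
# script requests, or None if no clip matches.
def find_material(db, wanted_tags):
    for idx, clip in enumerate(db):
        if set(wanted_tags) <= set(clip["tags"]):
            return idx
    return None  # no match: the empty shot stays unresolved in the script
```

The returned index is what would be written into the script for the rendering stage.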
The script generator's analysis of user input in combination with the video advertisement script model proceeds specifically as follows:
The video advertisement script model encodes the production process of professional short video advertisements as a series of rules; the rules describe visual features appropriate for short video advertisements of different styles and different goods. Based on the model, the script generator defines the advertisement duration, music, number of shots, and rhythm of the short video advertisement; further, based on the model and the overall advertisement information (duration, music, etc.), it defines each shot's type (empty shot or commodity shot), composition, camera-movement mode, and color characteristics (saturation, contrast, etc.).
Step S02: the system analyzes the video script, plans the motion path of the mechanical arm, and guides the user in scene setting. Referring to fig. 4, the specific procedure may be as follows:
The system reads the script and, for each commodity shot in it, converts the shot's visual features into a mechanical arm motion path plan;
the motion path is converted into the corresponding instruction format according to the system's communication protocol;
and a wireframe diagram is displayed on the mobile phone interface, based on the arm's initial shooting position, to guide the user in arranging the shooting scene. The wireframe diagram contains the position of the commodity, the positions of the props, and the placement angle of the camera. The user arranges the scene under its guidance and fixes the camera at the end of the mechanical arm; the arm takes the resulting state as the initial state of the shooting process.
The conversion of the shots' visual features into a motion path plan is specifically as follows:
for each shot in the script, if it is a commodity shot, the system analyzes its visual features; otherwise it skips to the next shot, until all shots have been read;
the shot analysis can be divided into two-dimensional camera-plane analysis, three-dimensional shooting-space analysis, and mechanical arm motion path planning analysis:
and analyzing the visual effect of each lens defined in the script by the two-dimensional camera screen plane, and calculating the motion speed and the motion track of the main object (namely the commodity) in the two-dimensional camera screen plane. The system first reads the resolution of the camera view, as well as the visual characteristics of the current shot in the script. The visual features include: the moving path of the commodity in the picture and the moving speed of the commodity in the picture. Wherein, the commodity motion path in the picture comprises picture composition and commodity motion direction; establishing a two-dimensional camera plane coordinate system by taking the upper left corner of a camera plane as an origin and the horizontal direction and the vertical direction as an x axis and a y axis respectively; according to the visual characteristics of the lens in the script, the abstract visual characteristics are quantized into the commodity motion states in the two-dimensional camera screen plane in the initial state and the termination state of the lens, wherein the commodity motion states comprise the coordinates of the center of the commodity in a two-dimensional camera plane coordinate system and the radius of an circumscribed circle presented by the commodity in a picture; according to the visual characteristics of the lens in the script, the position relation of the main object (namely the commodity) and the set object is defined. Finally, transmitting the analysis result to a three-dimensional shooting space for analysis;
and the three-dimensional shooting space analysis calculates and obtains the motion speed and the motion trail of the camera in the three-dimensional space according to the motion speed and the motion trail of the main object in the plane of the two-dimensional camera screen. The system firstly reads the focal length of a camera and the motion information of a commodity in a current lens picture; and establishing a three-dimensional space coordinate system. Calculating motion information of the camera in the three-dimensional space of the initial state and the termination state of the lens by combining the focal length of the camera according to the sizes and the positions of the commodity and the susceptor and through a projection theorem; finally, the analysis result is transmitted to the mechanical arm movement path planning analysis;
and planning and analyzing the motion path of the mechanical arm, calculating and obtaining the position information of each degree of freedom of the mechanical arm in the mechanical arm coordinate based on a reverse dynamics algorithm according to the motion speed and the motion track of the camera in the three-dimensional space, and planning the global motion track of the mechanical arm. The system firstly reads the motion information of the current lens camera; based on the inverse dynamics algorithm, the positions of the mechanical arms with the starting points and the ending points are calculated according to the positions of the cameras (namely the tail ends of the mechanical arms) with the current lens starting points and the current lens ending points. Finally, according to the communication protocol of the system, the motion path of the mechanical arm is converted into a corresponding instruction format, and fig. 6 is a communication protocol example, specifically:
and calculating the sequence numbers of the initial point and the initial point of the current lens in the motion action sequence of the mechanical arm according to the lens sequence numbers. N001 in fig. 6 represents the first motion of the robot arm in the motion sequence. L01 is action ID; the delay waiting means that after the lens finishes a certain action, the lens stays for a plurality of times and then enters the next action. And a delay waiting is arranged between actions, so that the shaking phenomenon of the mechanical arm can be effectively relieved. X, Y, Z, A, B, C correspond to the robot arms with six degrees of freedom, all with an origin of 0000.000. And coding the mechanical arm state information of the current lens starting point and the current lens ending point according to the format, and writing the mechanical arm state information into a mechanical arm script.
Step S03: the motion state of the object in the frames of the video to be edited is analyzed using a salient-region analysis algorithm, and the video is labeled. Based on the labels and in combination with the script, the video is cut and composited. Specifically:
The saliency algorithm performs salient-region analysis on the video frame by frame; every five frames are averaged to obtain the final salient region and salient-region center for those five frames;
Starting from the sixth frame, with a step of five frames, the center distance and deflection angle between the salient regions of the nth and (n-5)th frames are calculated; the shot number k is initialized to 1;
If the deflection angles between the salient-region centers of the nth and (n-5)th frames and of the (n-5)th and (n-10)th frames are both above the threshold, the label "shot cut" is recorded for the nth frame; if the previous "shot cut" frame does not carry a "commodity shot k" label, the label "commodity shot k" is recorded for the current frame and k is incremented by 1;
If the center distance between the salient regions of the (n-5)th and (n-10)th frames is below the threshold while the center distance between the salient regions of the nth and (n-5)th frames is above it, the label "commodity shot k acceleration starts" is recorded for the nth frame;
If the center distance between the salient regions of the (n-5)th and (n-10)th frames is above the threshold while the center distance between the salient regions of the nth and (n-5)th frames is below it, the label "commodity shot k deceleration starts" is recorded for the nth frame;
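The labeling rules above can be sketched as a small classifier; the threshold values and label strings are assumptions, and the patent's own rule statements are partly garbled, so this is one plausible reading:

```python
# Classify key frames from (center distance, deflection angle) pairs
# between consecutive key frames: large angles on both sides of a key
# frame mark a shot cut; a distance jumping above / dropping below the
# threshold marks acceleration / deceleration of the current shot k.
ANGLE_T, DIST_T = 30.0, 40.0

def label_keyframes(pairs, k=1):
    # pairs[i] = (center_distance, deflection_angle) between key frame i
    # and key frame i-1
    labels = {}
    for i in range(1, len(pairs)):
        (d_prev, a_prev), (d_cur, a_cur) = pairs[i - 1], pairs[i]
        if a_prev > ANGLE_T and a_cur > ANGLE_T:
            labels[i] = f"shot cut; commodity shot {k}"
            k += 1
        elif d_prev < DIST_T <= d_cur:
            labels[i] = f"commodity shot {k} acceleration starts"
        elif d_cur < DIST_T <= d_prev:
            labels[i] = f"commodity shot {k} deceleration starts"
    return labels
```

The labeled frames then drive which segments are kept, cut, or treated as transitions.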
Finally, the video is composited and rendered according to the script. For an empty shot, material is retrieved from the database according to the material index in the script; for a commodity shot, cutting and compositing are performed according to the script information and the labels. A simplified flow chart of the above steps is shown in fig. 9.
Multiple mechanical arms can also cooperate to complete highly complex shooting actions, including: labeling the arms a1, a2, ..., aN in order, where N is a natural number of at least 2;
planning the three-dimensional position each arm a1, a2, ..., aN should reach at each moment and which arms start shooting at each moment, where arms not currently shooting can move ahead to the position required at the next key moment, enabling zero-gap switching for the next shot;
planning the movement speeds, trajectories, and interactions of the arms in three-dimensional space so that they cooperate without interfering with one another;
the arms thereby jointly complete multi-angle shooting of the item at the same moment, or special-trajectory shooting at different moments, performing shooting actions that a single arm cannot.
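A toy scheduler illustrating the zero-gap hand-over idea: while one arm shoots, the idle arms are directed to pre-position for their next shot. The round-robin assignment and data layout are assumptions:

```python
# Build a per-moment plan: shots are assigned to arms in round-robin
# order; at each moment the active arm holds its shooting pose while
# every other arm is sent to "pre-position" for an upcoming shot.
def schedule(arms, shots):
    # shots: list of (time, pose) key moments
    plan = []
    for i, (t, pose) in enumerate(shots):
        active = arms[i % len(arms)]
        for arm in arms:
            plan.append((t, arm,
                         pose if arm == active else "pre-position",
                         arm == active))
    return plan
```

A real planner would also check the arms' workspaces for collisions, which this sketch omits.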
As another aspect, the present application further provides a system for automatically shooting and generating a mobile-end short video advertisement based on a mechanical arm, the system comprising a processor and a memory storing a mobile terminal program executable on the processor; when the processor runs the program, it implements the method described in the embodiments of the present application.
In another aspect, the present application further provides a computer-readable storage medium storing instructions executable by a mobile-end device, the instructions being configured to execute the method described in the embodiments of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented by software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit with logic gates implementing logic functions on data signals, an application-specific integrated circuit (ASIC) with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (14)

1. A method for automatically shooting and generating a mobile-end short video advertisement based on a mechanical arm comprises the following steps:
determining the video style and commodity information through user interaction, and generating a video script by combining a script generation model based on user requirements through the system;
the system analyzes the video script, plans the motion path of the mechanical arm and guides the user to set a scene;
and carrying out significance region analysis on the video to be clipped by utilizing a significance algorithm and synthesizing and rendering.
2. The method of claim 1, wherein the user interaction determines a video style and commodity information, and the system generates a video script based on the user demand in combination with a script generation model, comprising:
selecting a video style and a commodity type on a demand acquisition interface by a user as input;
the script generator analyzes the user input by combining the formulated video advertisement script model;
and the script generator integrates the analysis results and writes the information of each shot into the script based on different video types.
3. The method of claim 1, wherein the system analyzes the video script, plans the motion path of the mechanical arm, and guides the user through scene setting, comprising:
the system reads the script and, for each commodity shot in it, converts the shot's visual features into a mechanical arm motion path plan;
converting the motion path of the mechanical arm into a corresponding instruction format according to a communication protocol of the system;
displaying a wireframe diagram on the mobile phone interface, based on the arm's initial shooting position, to guide the user in arranging the shooting scene, the wireframe diagram comprising the position of the commodity, the positions of the props, and the placement angle of the camera; the user arranges the scene under the guidance of the wireframe diagram and fixes the camera at the end of the mechanical arm, and the arm takes the resulting state as the initial state of the shooting process.
4. The method of claim 1, wherein performing saliency-region analysis and synthesis rendering on the video to be edited by using the saliency algorithm comprises:
the saliency algorithm analyzing the saliency regions of the video to be edited frame by frame, and averaging over every five frames to obtain a saliency region and a saliency-region center for each group of five frames;
starting from the sixth frame, calculating, with a step of five frames, the center distance and deflection angle between the saliency regions of the nth frame and the (n-5)th frame, and setting the shot serial number k with an initial value of 1;
if the saliency-region center deflection angle between the nth and (n-5)th frames and that between the (n-5)th and (n-10)th frames are higher than a threshold, recording the label "shot switch" for the nth frame; if the previous "shot switch" frame does not carry a "kth commodity shot" label, recording the label "kth commodity shot" for the current frame and adding 1 to k;
if the saliency-region center deflection angles between the nth and (n-5)th frames and between the (n-5)th and (n-10)th frames are lower than the threshold, and the saliency-region center distances between those frame pairs are higher than the threshold, then recording a label for the nth frame;
if the saliency-region center deflection angle between the nth and (n-5)th frames and the saliency-region center distance between the (n-5)th and (n-10)th frames are higher than the threshold, recording the label "deceleration start of the kth commodity shot" for the nth frame;
performing final synthesis rendering on the video according to the script: if a shot is an empty shot, retrieving material from the database according to the material index in the script; and if it is a commodity shot, editing and composing according to the script information and the labels.
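The frame-labelling rules of claim 4 can be sketched as follows. This is an illustrative, simplified version: the saliency detector itself is assumed (each frame is already reduced to an averaged saliency-region centre), the threshold value is invented, and the claim's conditional "kth commodity shot" bookkeeping is collapsed into a plain counter. Names such as `label_shot_switches` do not come from the patent.

```python
import math

STEP = 5                # claim 4 compares frame n with frame n-5
ANGLE_THRESHOLD = 30.0  # degrees; assumed value, the patent does not fix it

def deflection_angle(c1, c2):
    """Direction (degrees) of the displacement between two saliency centres."""
    return math.degrees(math.atan2(c2[1] - c1[1], c2[0] - c1[0]))

def label_shot_switches(centres):
    """Label frame n as a shot switch when the motion direction of the
    saliency centre changes sharply between (n-10, n-5) and (n-5, n).
    centres[n] is the averaged saliency centre (x, y) of frame n."""
    labels = {}
    k = 1
    for n in range(2 * STEP, len(centres)):
        a_now = deflection_angle(centres[n - STEP], centres[n])
        a_prev = deflection_angle(centres[n - 2 * STEP], centres[n - STEP])
        if abs(a_now - a_prev) > ANGLE_THRESHOLD:
            labels[n] = ["shot switch", f"commodity shot {k}"]
            k += 1
    return labels
```

A centre path that moves in a straight line produces no labels; a path that bends sharply produces "shot switch" labels near the bend.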
5. The method of claim 2, wherein the script generator analyzing the user input in combination with the video advertisement script model comprises:
the video advertisement script model encodes the production process of professional short-video advertisements as a set of rules; the rules describe the visual characteristics suited to short-video advertisements of different styles and different commodities;
the script generator defining the advertisement duration, music, number of shots and tempo of the short video advertisement based on the video advertisement script model;
and the script generator defining the shot type, composition, camera movement and color characteristics of each shot of the short video advertisement based on the video advertisement script model and the overall advertisement information.
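One plausible shape for the rule set described in claim 5 is a lookup table keyed by (style, commodity type), expanded into a per-shot script. The concrete rule values and field names below are invented for demonstration; the patent only states that such rules exist.

```python
# Hypothetical rule table: (style, commodity) -> advertisement-level and
# shot-level visual characteristics. All values are illustrative.
RULES = {
    ("fresh", "cosmetics"): {
        "duration_s": 15,
        "music": "light_pop",
        "tempo": "fast",
        "shots": [
            {"type": "empty", "composition": "centered",
             "camera_move": "push_in", "tone": "warm"},
            {"type": "commodity", "composition": "rule_of_thirds",
             "camera_move": "orbit", "tone": "warm"},
        ],
    },
}

def generate_script(style, commodity):
    """Expand the matching rule into a script with overall and per-shot info."""
    rule = RULES[(style, commodity)]
    return {
        "duration_s": rule["duration_s"],
        "music": rule["music"],
        "tempo": rule["tempo"],
        "shots": [dict(shot) for shot in rule["shots"]],
    }
```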
6. The method of claim 2, wherein the script generator integrating the analysis results comprises:
for each shot, if the shot is a commodity shot, writing the visual characteristics of the shot into the video script to be shot;
and for each shot, if the shot is an empty shot, searching the material database for empty-shot material: the system traverses the material database and, if material matching the visual characteristics of the shot is found, writes the index of the material into the script for use in rendering.
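The material lookup of claim 6 amounts to a linear scan over the material database. The sketch below assumes the database is a list of feature dictionaries and that matching means feature equality; both assumptions, and the function names, are ours.

```python
def find_material(material_db, wanted_features):
    """Return the index of the first material whose features all match,
    or None if the database contains no match."""
    for index, features in enumerate(material_db):
        if all(features.get(key) == value
               for key, value in wanted_features.items()):
            return index
    return None

def fill_empty_shots(shots, material_db):
    """Write the matched material index into each empty shot of the script."""
    for shot in shots:
        if shot["type"] == "empty":
            shot["material_index"] = find_material(material_db, shot["features"])
    return shots
```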
7. The method of claim 3, wherein the system converting the visual characteristics of the shots in the script into the mechanical-arm motion path plan comprises:
for each shot in the script, if the shot is a commodity shot, the system performing shooting analysis on the visual characteristics of the shot; otherwise skipping to the next shot, until all shots have been read;
the shooting analysis is divided into two-dimensional camera-plane analysis, three-dimensional shooting-space analysis and mechanical-arm motion-path planning analysis:
the two-dimensional camera-plane analysis interprets the visual effect defined in the script for each shot and calculates the motion speed and motion track of the subject article in the two-dimensional camera plane;
the three-dimensional shooting-space analysis calculates the motion speed and motion track of the camera in three-dimensional space from the motion speed and motion track of the subject article in the two-dimensional camera plane;
and the mechanical-arm motion analysis calculates, based on an inverse dynamics algorithm, the position of each degree of freedom of the mechanical arm in the mechanical-arm coordinate system from the motion speed and motion track of the camera in three-dimensional space, and plans the overall motion track of the mechanical arm.
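The per-shot dispatch of claim 7 can be sketched as a loop that sends only commodity shots through the analysis chain and skips empty shots, which use library material and need no arm motion. `plan_all_shots` and the `analyze` callback are hypothetical stand-ins for the three analysis stages.

```python
def plan_all_shots(shots, analyze):
    """Run the shooting analysis on each commodity shot in script order.

    shots:   list of shot dictionaries with at least a "type" key
    analyze: callable implementing the 2D -> 3D -> arm-path chain
    """
    plans = []
    for shot in shots:
        if shot["type"] != "commodity":
            continue  # empty shots are filled from the material database
        plans.append(analyze(shot))
    return plans
```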
8. The method of claim 3, wherein converting the mechanical-arm motion path into the corresponding instruction format according to the communication protocol of the system comprises:
calculating, from the shot serial number, the sequence numbers of the start point and end point of the current shot in the motion sequence of the mechanical arm, wherein N001 denotes the first motion of the mechanical arm in the motion sequence, and L01 denotes a motion ID;
delay waiting means that, after the shot completes an action, the arm dwells for a period before the next action begins; setting delay waits between actions effectively alleviates jitter of the mechanical arm;
X, Y, Z, A, B, C respectively correspond to the six degrees of freedom of the mechanical arm, with all origin values being 0000.000;
and encoding the mechanical-arm state information of the start point and end point of the current shot in this format, and writing it into the mechanical-arm script.
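A plausible encoder for the instruction format of claim 8 is sketched below. The exact wire format of the arm controller is not given in the text, so the field layout (an Nxxx sequence number, an Lxx motion ID, six axis values printed in the 0000.000 style, and an optional dwell field) is an assumption modelled on the examples N001 and L01.

```python
def encode_motion(seq_no, motion_id, axes, delay_ms=0):
    """Encode one arm motion as a text command.

    seq_no:    position of this motion in the motion sequence (1 -> "N001")
    motion_id: motion identifier (1 -> "L01")
    axes:      (X, Y, Z, A, B, C) pose of the six-DOF arm; origin is all zeros
    delay_ms:  optional dwell after the motion, used to damp arm vibration
    """
    fields = [f"N{seq_no:03d}", f"L{motion_id:02d}"]
    fields += [f"{value:08.3f}" for value in axes]  # 0000.000-style axis values
    if delay_ms:
        fields.append(f"D{delay_ms}")  # hypothetical dwell field
    return " ".join(fields)
```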
9. The method of claim 7, wherein the two-dimensional camera-plane analysis interpreting the visual effect defined in the script for each shot and calculating the motion information of the article in the two-dimensional camera plane comprises:
reading the visual characteristics of the current shot in the script, the visual characteristics comprising the in-frame commodity motion path and the in-frame commodity motion speed, wherein the in-frame commodity motion path comprises the frame composition and the commodity motion direction;
reading the resolution of the camera frame;
establishing a two-dimensional camera-plane coordinate system with the upper-left corner of the camera plane as the origin and the horizontal and vertical directions as the x axis and y axis respectively;
according to the visual characteristics of the shot in the script, quantizing the abstract visual characteristics into the commodity motion states in the two-dimensional camera plane at the initial and final states of the shot, the commodity motion states comprising the coordinates of the commodity center in the two-dimensional camera-plane coordinate system and the radius of the circumscribed circle of the commodity in the frame;
defining the positional relation between the subject article and the props according to the visual characteristics of the shot in the script;
and passing the analysis result to the three-dimensional shooting-space analysis.
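The quantization step of claim 9 can be illustrated as a mapping from an abstract composition keyword to concrete start/end product states (centre coordinate plus circumscribed-circle radius) in a top-left-origin camera plane. The keyword-to-position table and the scale parameters are assumptions; the claim only requires that some such mapping exists.

```python
# Hypothetical composition table: keyword -> fractional (x, y) frame position.
COMPOSITIONS = {
    "centered":       (0.5, 0.5),
    "rule_of_thirds": (1 / 3, 1 / 3),
}

def quantize_shot(width, height, composition, start_scale, end_scale):
    """Quantize an abstract composition into initial/final commodity states.

    width, height: camera resolution in pixels (origin at the top-left corner)
    start_scale, end_scale: fraction of the maximum on-screen radius the
        commodity's circumscribed circle occupies at shot start and end
        (e.g. growing from 0.4 to 0.8 reads as a push-in).
    """
    fx, fy = COMPOSITIONS[composition]
    centre = (width * fx, height * fy)
    max_radius = min(width, height) / 2
    start = {"centre": centre, "radius": max_radius * start_scale}
    end = {"centre": centre, "radius": max_radius * end_scale}
    return start, end
```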
10. The method of claim 7, wherein the three-dimensional shooting-space analysis calculating the motion information of the camera in three-dimensional space from the motion information of the subject article in the two-dimensional camera plane comprises:
reading the commodity motion information in the current shot frame;
reading the focal length of the camera;
establishing a three-dimensional coordinate system and, from the sizes and positions of the commodity and the props and in combination with the focal length of the camera, calculating by the projection theorem the motion information of the camera in three-dimensional space at the initial and final states of the shot;
and passing the analysis result to the mechanical-arm motion-path planning analysis.
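The projection step of claim 10 can be made concrete under a simple pinhole model: by similar triangles, a product of physical radius R appears with image radius r when the camera stands at distance d = f·R/r. The pinhole assumption and the unit choices are ours; the claim only says "projection theorem".

```python
def camera_distance(focal_length_px, product_radius_m, image_radius_px):
    """Camera-to-product distance (metres) that yields the desired image size,
    assuming a pinhole camera with focal length expressed in pixels."""
    return focal_length_px * product_radius_m / image_radius_px

def camera_track(focal_length_px, product_radius_m, start_r_px, end_r_px):
    """Start and end camera distances for a shot; a growing image radius
    (push-in) corresponds to a shrinking camera distance."""
    return (camera_distance(focal_length_px, product_radius_m, start_r_px),
            camera_distance(focal_length_px, product_radius_m, end_r_px))
```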
11. The method of claim 7, wherein the mechanical-arm motion-path planning analysis calculating and planning the overall motion track of the mechanical arm based on the inverse dynamics algorithm comprises:
reading the camera motion information of the current shot;
and, based on the inverse dynamics algorithm, calculating the mechanical-arm positions at the start point and end point from the camera positions, i.e. the positions of the end of the mechanical arm, at the start point and end point of the current shot.
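The inverse step of claim 11 (joint configuration from end-effector position, since the camera sits at the arm's end) can be illustrated with the standard closed-form solution for a planar two-link arm; the patent's six-DOF arm would need a full solver, so this is a deliberately reduced example.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Joint angles (elbow-down) of a planar two-link arm reaching (x, y)."""
    d2 = x * x + y * y
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    t2 = math.acos(max(-1.0, min(1.0, cos_t2)))  # clamp against rounding
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2),
                                       l1 + l2 * math.cos(t2))
    return t1, t2

def two_link_fk(t1, t2, l1, l2):
    """Forward kinematics: end-effector position from joint angles."""
    x = l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    y = l1 * math.sin(t1) + l2 * math.sin(t1 + t2)
    return x, y
```

Running the forward model on the recovered angles returns the commanded camera position, which is the usual sanity check for an IK solution.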
12. The method of claim 7, wherein a plurality of mechanical arms cooperate to perform highly complex shooting actions, comprising: labeling the plurality of mechanical arms, in order, a1, a2, ..., aN, wherein N is a natural number of at least 2;
planning the three-dimensional position that each of the mechanical arms a1, a2, ..., aN should reach at each moment, and which mechanical arms start shooting actions at each moment; a mechanical arm that has not started its shooting action moves in advance to the position required at the next key moment, so that the switch to the next group of shots happens with zero time difference;
planning the motion speeds, motion tracks and interactive actions of the plurality of mechanical arms in three-dimensional space, ensuring that the mechanical arms cooperate with, and do not interfere with, each other;
and the plurality of mechanical arms together accomplishing multi-angle shooting of the article at the same moment, or special-track shooting of the article at different moments, performing shooting actions that a single mechanical arm cannot accomplish.
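The multi-arm scheduling idea of claim 12 — at each key moment some arms shoot while the idle arms pre-move to their next required pose, so the hand-over between groups happens with zero time difference — can be sketched as below. The keyframe and plan formats are invented for illustration.

```python
def plan_keyframes(keyframes):
    """Build a per-arm action list from a sequence of key moments.

    keyframes: list of {arm_name: pose} dicts; an arm present in a keyframe
               shoots at that moment, an absent arm is idle.
    Returns {arm_name: [("shoot" | "premove", moment_index, pose), ...]}.
    """
    arms = sorted({arm for frame in keyframes for arm in frame})
    plan = {arm: [] for arm in arms}
    for i, frame in enumerate(keyframes):
        for arm in arms:
            if arm in frame:
                plan[arm].append(("shoot", i, frame[arm]))
            else:
                # Idle arm: pre-position for its next shooting pose, if any,
                # so the next group can switch with zero time difference.
                nxt = next((f[arm] for f in keyframes[i + 1:] if arm in f), None)
                if nxt is not None:
                    plan[arm].append(("premove", i, nxt))
    return plan
```

Collision avoidance between the arms (the non-interference requirement of the claim) is a separate planning problem not addressed by this sketch.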
13. A system for automatically shooting and generating a mobile-terminal short video advertisement based on a mechanical arm, characterized by comprising: a generator, and a memory for storing a mobile-terminal program capable of running on the generator; wherein the generator is configured to execute, when running the mobile-terminal program, the method according to any one of claims 1 to 12.
14. A mobile-terminal-device-readable storage medium having stored thereon mobile-terminal-device-executable instructions, wherein the mobile-terminal-device-executable instructions are configured to perform the method according to any one of claims 1 to 12.
CN202010452022.7A 2020-05-25 2020-05-25 Method for automatically shooting and generating mobile terminal short video advertisement based on mechanical arm Expired - Fee Related CN111629269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010452022.7A CN111629269B (en) 2020-05-25 2020-05-25 Method for automatically shooting and generating mobile terminal short video advertisement based on mechanical arm

Publications (2)

Publication Number Publication Date
CN111629269A true CN111629269A (en) 2020-09-04
CN111629269B CN111629269B (en) 2021-05-25

Family

ID=72261198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010452022.7A Expired - Fee Related CN111629269B (en) 2020-05-25 2020-05-25 Method for automatically shooting and generating mobile terminal short video advertisement based on mechanical arm

Country Status (1)

Country Link
CN (1) CN111629269B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202433318U (en) * 2012-01-06 2012-09-12 成都旭双太阳能科技有限公司 Visual detection device for amorphous silicon solar thin film battery
CN103660514A (en) * 2013-12-06 2014-03-26 苏州逸美德自动化科技有限公司 Glue punching, film pasting and glue adding integration machine
CN104808590A (en) * 2015-02-14 2015-07-29 浙江大学 Mobile robot visual servo control method based on key frame strategy
CN108833782A (en) * 2018-06-20 2018-11-16 广州长鹏光电科技有限公司 A kind of positioning device and method based on video auto-tracking shooting
CN109002857A (en) * 2018-07-23 2018-12-14 厦门大学 A kind of transformation of video style and automatic generation method and system based on deep learning
CN109631793A (en) * 2018-12-12 2019-04-16 上海卫星装备研究所 Type face digital photography automatic measurement method
CN110414446A (en) * 2019-07-31 2019-11-05 广东工业大学 The operational order sequence generating method and device of robot
CN110557568A (en) * 2019-09-05 2019-12-10 昭世(北京)科技有限公司 structure and method of module combined photographic equipment based on artificial intelligence

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112422831A (en) * 2020-11-20 2021-02-26 广州太平洋电脑信息咨询有限公司 Video generation method and device, computer equipment and storage medium
CN112862861A (en) * 2021-02-08 2021-05-28 广州富港生活智能科技有限公司 Camera motion path determining method and device and shooting system
CN112862861B (en) * 2021-02-08 2024-05-07 广州富港生活智能科技有限公司 Camera motion path determining method, determining device and shooting system
CN113242470A (en) * 2021-06-15 2021-08-10 广州聚焦网络技术有限公司 Video publishing method and device applied to foreign trade marketing
CN113590247B (en) * 2021-07-21 2024-04-05 杭州阿里云飞天信息技术有限公司 Text creation method and computer program product
CN113590247A (en) * 2021-07-21 2021-11-02 阿里巴巴达摩院(杭州)科技有限公司 Text creation method and computer program product
CN113596283A (en) * 2021-07-28 2021-11-02 杭州更火数字科技有限公司 Video customization method and system and electronic equipment
CN113660526A (en) * 2021-10-18 2021-11-16 阿里巴巴达摩院(杭州)科技有限公司 Script generation method, system, computer storage medium and computer program product
CN113660526B (en) * 2021-10-18 2022-03-11 阿里巴巴达摩院(杭州)科技有限公司 Script generation method, system, computer storage medium and computer program product
CN114363549A (en) * 2022-01-12 2022-04-15 关晓辉 Intelligent script walking show recording processing method, device and system
CN114501061A (en) * 2022-01-25 2022-05-13 上海影谱科技有限公司 Video frame alignment method and system based on object detection
CN114501061B (en) * 2022-01-25 2024-03-15 上海影谱科技有限公司 Video frame alignment method and system based on object detection
CN114827456B (en) * 2022-04-09 2023-08-22 王可 Control system for intelligent film studio
CN114827456A (en) * 2022-04-09 2022-07-29 王可 Control system for intelligent studio
CN117336539A (en) * 2023-09-28 2024-01-02 北京风平智能科技有限公司 Video script production method and system for short video IP (Internet protocol) construction
CN117336539B (en) * 2023-09-28 2024-05-14 北京风平智能科技有限公司 Video script production method and system for short video IP (Internet protocol) construction

Also Published As

Publication number Publication date
CN111629269B (en) 2021-05-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210525