CN108268847B - Method and system for analyzing movie montage language - Google Patents

Method and system for analyzing movie montage language

Info

Publication number
CN108268847B
CN108268847B
Authority
CN
China
Prior art keywords
frame picture
feature point
zero
video
acquiring
Prior art date
Legal status
Active
Application number
CN201810048953.3A
Other languages
Chinese (zh)
Other versions
CN108268847A (en)
Inventor
逄泽沐风
Current Assignee
Pang Zewenyue
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to CN201810048953.3A
Publication of CN108268847A
Application granted
Publication of CN108268847B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F 16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Studio Devices (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The application discloses a method and a system for analyzing a movie montage language. The method comprises the following steps: loading video information; cutting the video information by shot to obtain at least one video segment; arranging the video segments in playing order and assigning numbers to them; acquiring the number and playing duration of each video segment; acquiring the scene type of each video segment; acquiring the shooting technique of each video segment; and summarizing the numbers, playing durations, scene types and shooting methods of the video segments as the montage language of the video information. In the method and system for analyzing the movie montage language provided by the invention, no step requires personnel with film-making experience, which solves the earlier problems of high demands on personal experience, long time consumption and low efficiency.

Description

Method and system for analyzing movie montage language
Technical Field
The invention relates to the field of movie analysis, in particular to a method and a system for movie montage language analysis.
Background
With the development of society and the improvement of living standards, people consume more and more film and television content, and watching moving images such as movies and television has become an important form of entertainment in daily life.
A montage language is the basic means of organically combining shots, pictures and sound according to a specific creative purpose and a series of artistic rules. Through these means, a film work achieves completeness and unity of space and time, completes the depiction of characters, environments and events, expresses thoughts and emotions with an internal logic, and creates a harmonious rhythm and style. Montage is therefore not only an editing rule but also a method of thinking and of creation. Excellent movies, television and other moving images often use many different montage languages, and by analyzing and studying the montage language in such works, film practitioners can learn a director's shooting techniques and means of expression and thereby improve their own ability. In the prior art, film practitioners mainly learn by watching a director's shooting techniques and skills on set, but opportunities to watch and learn on a shooting site are limited, so more often they watch moving images such as movies and television shot by the director and analyze the montage language in them. This approach is time-consuming and inefficient, and because each person's understanding of the film montage language is biased, mistakes are easily made.
Providing a method and a system for analyzing the montage language of a movie is therefore a problem to be solved in the art.
Disclosure of Invention
In view of this, the present invention provides a method and a system for analyzing a movie montage language, so as to solve the problems of low efficiency and easy error in recognizing the movie montage language in the prior art.
In order to solve the above technical problem, the present application provides a method for analyzing a movie montage language, comprising:
loading video information;
cutting the video information by shot to obtain at least one video segment;
arranging the video clips according to a playing sequence and giving numbers to the video clips;
acquiring the serial number and the playing time length of the video clip;
acquiring the scene of the video clip, further comprising:
acquiring a frame picture of the video clip, acquiring sampling points on the frame picture, matching the sampling points with a human body sample feature library to acquire matched sampling point numerical values, comparing the matched sampling point numerical values with a predefined matching threshold value,
when the value of the matched sampling point is smaller than the predefined matching threshold, the frame picture contains neither a person nor a local close-up of a person, and the scene of the frame picture is a long scene,
when the value of the matched sampling point is greater than or equal to the predefined matching threshold, the frame picture contains a person or a local close-up of a person, and the ratio of the human head area in the frame picture to the display screen area is obtained,
when the proportion of the human head area in the frame picture to the screen is greater than or equal to 1/6, the scene of the frame picture is a close-up,
when the proportion of the human head area in the frame picture to the screen is less than 1/6 and greater than or equal to 1/48, the scene of the frame picture is a close scene,
when the proportion of the human head area in the frame picture to the screen is less than 1/48 and greater than or equal to 1/85, the scene of the frame picture is a medium scene,
when the proportion of the human head area in the frame picture to the screen is less than 1/85, the scene of the frame picture is a panorama,
acquiring the scene of the frame picture in the video clip as the scene of the video clip;
acquiring shooting skills of the video clip, and further comprising:
acquiring feature points on the frame pictures of the video clip, wherein the number of the feature points on any frame picture is a fixed value,
establishing a coordinate system in the plane of the frame picture, the coordinate system comprising a transverse coordinate axis and a longitudinal coordinate axis, wherein the origin of the coordinate system is located at the geometric center of the frame picture, the positive direction of the transverse coordinate axis is any direction in the plane of the frame picture, and the positive direction of the longitudinal coordinate axis intersects the positive direction of the transverse coordinate axis,
calculating coordinate values of the feature points in the coordinate system, wherein the coordinate values comprise an abscissa and an ordinate,
acquiring any frame picture in the video clip as a target frame picture,
acquiring a contrast frame picture of the target frame picture, wherein the number of the contrast frame picture is greater than the number of the target frame picture,
acquiring the feature points shared by the target frame picture and the comparison frame picture as shared feature points,
and subtracting the coordinate value of the common feature point in the target frame picture from the coordinate value of the common feature point in the comparison frame picture to obtain a coordinate difference value of the common feature point, wherein the coordinate difference value comprises: the horizontal coordinate difference value and the vertical coordinate difference value,
obtaining the feature point having the abscissa difference value larger than zero and the ordinate difference value larger than zero from the common feature points as a first feature point,
obtaining, as a second feature point, the feature point whose abscissa difference is smaller than zero and whose ordinate difference is larger than zero from the common feature points,
obtaining the feature point of which the abscissa difference is smaller than zero and the ordinate difference is smaller than zero from the common feature points as a third feature point,
obtaining the feature point with the abscissa difference value larger than zero and the ordinate difference value smaller than zero from the common feature points as a fourth feature point,
when the numbers of the first feature points, the second feature points, the third feature points and the fourth feature points are all greater than zero, and the abscissas of the first and fourth feature points in the comparison frame picture are greater than the abscissas of the second and third feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a push,
when the numbers of the first feature points, the second feature points, the third feature points and the fourth feature points are all greater than zero, and the abscissas of the second and third feature points in the comparison frame picture are greater than the abscissas of the first and fourth feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a pull,
when the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference of a common feature point is not monotonic with respect to the absolute value of its abscissa in the target frame picture, or when the ordinate differences of the common feature points are all greater than zero or all less than zero and the ordinate difference of a common feature point is not monotonic with respect to the absolute value of its ordinate in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a track (moving shot),
when the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference of a common feature point is monotonic with respect to the absolute value of its abscissa in the target frame picture, or when the ordinate differences of the common feature points are all greater than zero or all less than zero and the ordinate difference of a common feature point is monotonic with respect to the absolute value of its ordinate in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a pan (shaking shot),
obtaining the feature points of which the horizontal coordinate difference and the vertical coordinate difference are both equal to zero from the common feature points as static feature points, wherein when the number of the static feature points is greater than a first threshold, the shooting technique adopted by the target frame picture and the comparison frame picture is fixed shooting, wherein the ratio of the first threshold to the number of the feature points of any one frame picture is greater than 10%,
acquiring the shooting skills adopted by the target frame picture and the contrast frame picture in the video clip as the shooting skills of the video clip;
summarizing the serial numbers, the playing duration, the scenes and the shooting methods of the video clips to be used as a montage language of the video information.
Optionally, the video information is cut by shot to obtain at least one video segment, and further:
acquiring any one frame picture from the video information as a current frame picture,
acquiring a next frame picture adjacent to the current frame picture from the video information,
acquiring the characteristic points of the current frame picture and the next frame picture,
calculating the number of the feature points shared by the current frame picture and the next frame picture as the continuity,
when the continuity is greater than or equal to a second threshold, the current frame picture and the next frame picture belong to the same video segment, wherein: the ratio of the second threshold value to the number of the feature points of any one frame picture is more than 50%,
and when the continuity is smaller than a second threshold value, the current frame picture and the next frame picture do not belong to the same video segment, and cutting is carried out between the current frame picture and the next frame picture.
Optionally, the comparison frame picture and the target frame picture belong to the same video segment;
the contrast frame picture and the target frame picture are adjacent in the video segment.
Optionally, the coordinate system is a rectangular coordinate system, and the transverse coordinate axis and the longitudinal coordinate axis are perpendicular to each other.
Optionally, the obtaining of the feature point on the frame picture further includes:
and acquiring a static scene from the frame picture, and setting at least two feature points on the static scene.
The present application further provides a system for analyzing a movie montage language, the system comprising: a video reading module, a scene acquisition module, a shooting skill acquisition module and a calculation module;
the video reading module is connected with the scene acquisition module and the calculation module and is used for loading video information, cutting the video information by shot to obtain at least one video clip, arranging the video clips in playing order and assigning numbers to them, and acquiring the numbers and playing durations of the video clips;
the scene acquiring module is connected to the video reading module, the shooting skill acquiring module and the calculating module, and configured to acquire the scene of the video segment, where the acquiring the scene of the video segment further includes:
acquiring a frame picture of the video clip, acquiring sampling points on the frame picture, matching the sampling points with a human body sample feature library to acquire matched sampling point numerical values, comparing the matched sampling point numerical values with a predefined matching threshold value,
when the value of the matched sampling point is smaller than the predefined matching threshold, the frame picture contains neither a person nor a local close-up of a person, and the scene of the frame picture is a long scene,
when the value of the matched sampling point is greater than or equal to the predefined matching threshold, the frame picture contains a person or a local close-up of a person, and the ratio of the human head area in the frame picture to the display screen area is obtained,
when the proportion of the human head area in the frame picture to the screen is greater than or equal to 1/6, the scene of the frame picture is a close-up,
when the proportion of the human head area in the frame picture to the screen is less than 1/6 and greater than or equal to 1/48, the scene of the frame picture is a close scene,
when the proportion of the human head area in the frame picture to the screen is less than 1/48 and greater than or equal to 1/85, the scene of the frame picture is a medium scene,
when the proportion of the human head area in the frame picture to the screen is less than 1/85, the scene of the frame picture is a panorama,
acquiring the scene of the frame picture in the video clip as the scene of the video clip;
the shooting skill obtaining module, connected to the scene obtaining module and the calculating module, is configured to obtain the shooting skill of the video segment, where the obtaining the shooting skill of the video segment further includes:
acquiring feature points on the frame picture of the video clip, wherein: the number of the characteristic points on any frame picture is a fixed value,
establishing a coordinate system in the plane of the frame picture, the coordinate system comprising a transverse coordinate axis and a longitudinal coordinate axis, wherein the origin of the coordinate system is located at the geometric center of the frame picture, the positive direction of the transverse coordinate axis is any direction in the plane of the frame picture, and the positive direction of the longitudinal coordinate axis intersects the positive direction of the transverse coordinate axis,
calculating coordinate values of the feature points in the coordinate system, wherein the coordinate values comprise an abscissa and an ordinate,
acquiring any frame picture in the video clip as a target frame picture,
acquiring a contrast frame picture of the target frame picture, wherein the number of the contrast frame picture is greater than the number of the target frame picture,
acquiring the feature points shared by the target frame picture and the comparison frame picture as shared feature points,
and subtracting the coordinate value of the common feature point in the target frame picture from the coordinate value of the common feature point in the comparison frame picture to obtain a coordinate difference value of the common feature point, wherein the coordinate difference value comprises: the horizontal coordinate difference value and the vertical coordinate difference value,
obtaining the feature point having the abscissa difference value larger than zero and the ordinate difference value larger than zero from the common feature points as a first feature point,
obtaining, as a second feature point, the feature point whose abscissa difference is smaller than zero and whose ordinate difference is larger than zero from the common feature points,
obtaining the feature point of which the abscissa difference is smaller than zero and the ordinate difference is smaller than zero from the common feature points as a third feature point,
obtaining the feature point with the abscissa difference value larger than zero and the ordinate difference value smaller than zero from the common feature points as a fourth feature point,
when the numbers of the first feature points, the second feature points, the third feature points and the fourth feature points are all greater than zero, and the abscissas of the first and fourth feature points in the comparison frame picture are greater than the abscissas of the second and third feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a push,
when the numbers of the first feature points, the second feature points, the third feature points and the fourth feature points are all greater than zero, and the abscissas of the second and third feature points in the comparison frame picture are greater than the abscissas of the first and fourth feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a pull,
when the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference of a common feature point is not monotonic with respect to the absolute value of its abscissa in the target frame picture, or when the ordinate differences of the common feature points are all greater than zero or all less than zero and the ordinate difference of a common feature point is not monotonic with respect to the absolute value of its ordinate in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a track (moving shot),
when the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference of a common feature point is monotonic with respect to the absolute value of its abscissa in the target frame picture, or when the ordinate differences of the common feature points are all greater than zero or all less than zero and the ordinate difference of a common feature point is monotonic with respect to the absolute value of its ordinate in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a pan (shaking shot),
obtaining the feature points of which the horizontal coordinate difference and the vertical coordinate difference are both equal to zero from the common feature points as static feature points, wherein when the number of the static feature points is greater than a first threshold value, the shooting technique adopted by the target frame picture and the comparison frame picture is fixed shooting, wherein the ratio of the first threshold value to the number of the feature points of any one frame picture is greater than 10%,
acquiring the shooting skills adopted by the target frame picture and the contrast frame picture in the video clip as the shooting skills of the video clip;
the calculation module is respectively connected with the video reading module, the scene acquisition module and the shooting skill acquisition module and is used for summarizing the serial numbers, the playing durations, the scenes and the shooting methods of the video clips as the montage language of the video information.
Optionally, the video reading module includes: a video cutting unit to:
acquiring any one frame picture from the video information as a current frame picture,
acquiring a next frame picture adjacent to the current frame picture from the video information,
acquiring the characteristic points of the current frame picture and the next frame picture,
calculating the number of the feature points shared by the current frame picture and the next frame picture as the continuity,
when the continuity is greater than or equal to a second threshold, the current frame picture and the next frame picture belong to the same video segment, wherein: the ratio of the second threshold to the number of the feature points of any one of the frame pictures is greater than 50%,
and when the continuity is smaller than a second threshold value, the current frame picture and the next frame picture do not belong to the same video segment, and cutting is carried out between the current frame picture and the next frame picture.
Compared with the prior art, the method and the system for analyzing the film montage language provided by the invention have the beneficial effects that:
1) In the method and system for analyzing the movie montage language provided by the invention, no step requires personnel with film-making experience, which solves the earlier problems of high demands on personal experience, long time consumption and low efficiency;
2) the method and system for analyzing the film montage language provided by the invention reduce dependence on manual work, can quickly identify the montage language in moving images such as movies and television, and provide abundant material for film practitioners to study film shooting techniques.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow chart of a method of language analysis for a movie montage in accordance with the present invention;
FIG. 2 is a flow chart of a method for capturing a scene of a video clip according to the present invention;
FIG. 3 is a flow chart of a method for capturing camera skills of a video segment according to the present invention;
FIG. 4 is a system for language analysis of a movie montage according to the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
Fig. 1 is a flowchart of a method for analyzing a movie montage language according to the present invention, and as shown in fig. 1, the method for analyzing a movie montage language according to the present invention includes:
s101: and loading video information, cutting the video information according to the shot to obtain at least 1 video clip, arranging the video clips according to the playing sequence and giving numbers.
Specifically, when the video information is played at normal speed, its playing time should not be too short and should generally be longer than 10 seconds. The video information read in may be black-and-white or color movie, television or other moving images, and may contain both images and audio or images only. Such video information consists of frame pictures, usually 24 frame pictures per second; during playback, the afterimage of the previous frame on the human eye prevents the viewer from perceiving the interval between two frame pictures. When acquiring frame pictures from the video information, all frame pictures may be acquired, or one frame picture may be extracted at fixed intervals. Because the change between two adjacent frame pictures is often small, it is difficult to find the differences between them, which easily causes recognition problems; therefore, optionally, for ordinary movie or television video, one frame picture is extracted per second, that is, one frame picture is extracted every 23 frame pictures. All extracted frame pictures are collected and numbered in playing order, the numbers increasing as natural numbers starting from 1. The collected frame pictures are cut by shot, the corresponding cutting positions are then located in the video information, and the video information is cut at those positions to obtain the video clips.
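As a rough illustration of the frame extraction described above, the sketch below samples one frame picture per second, assuming roughly 24 frame pictures per second. OpenCV is used only as an example; the patent does not prescribe a particular library, and the function and variable names are illustrative.

    # Hypothetical sketch: keep one frame every `step` frames (about one per second at 24 fps).
    import cv2

    def sample_frames(video_path, step=24):
        """Return (frame_number, image) pairs sampled from the video."""
        cap = cv2.VideoCapture(video_path)
        frames = []
        index = 0
        while True:
            ok, image = cap.read()
            if not ok:                      # end of video
                break
            if index % step == 0:           # one frame picture per sampling interval
                frames.append((index, image))
            index += 1
        cap.release()
        return frames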
S102: and acquiring the number and the playing time of the video clip.
Specifically, the video segments are arranged in the order of the numbers from small to large, and the playing time of each video segment is respectively obtained.
S103: and acquiring the scene of the video clip.
Fig. 2 is a flowchart of a method for obtaining a scene of a video clip according to the present invention, and as shown in fig. 2, obtaining the scene of the video clip further includes: S1031-S1038.
S1031: the method comprises the steps of obtaining a frame picture of a video clip, obtaining sampling points on the frame picture, matching the sampling points with a human body sample feature library, obtaining matched sampling point numerical values, and comparing the matched sampling point numerical values with a predefined matching threshold value.
S1032: when the value of the matched sampling point is smaller than the predefined matching threshold, the frame picture contains neither a person nor a local close-up of a person, and the scene of the frame picture is a long scene.
Specifically, when the frame picture contains a person image, the value of the matched sampling point should be greater than or equal to the predefined matching threshold; when the frame picture does not contain a person image, the value of the matched sampling point should be less than the predefined matching threshold, and the scene is then a long scene.
S1033: when the value of the matched sampling point is greater than or equal to the predefined matching threshold, the frame picture contains a person or a local close-up of a person, and the ratio of the human head area in the frame picture to the display screen area is obtained.
Specifically, the scenes can be classified into close-up, close scene, medium scene, panorama and long scene, and the image of a person occupies a different proportion of the picture in each scene. The head of a person usually occupies about 1/8 of the body height, and the head width occupies about 1/2 of the shoulder width. Because the extent of the human body appearing in the frame picture differs from scene to scene, the proportion of the human head in the frame picture also differs. Therefore, the ratio of the human head area to the display screen area is used as a basis for judging the scene.
S1034: when the proportion of the human head area in the frame picture to the screen is greater than or equal to 1/6, the scene of the frame picture is close-up.
S1035: when the proportion of the human head area in the frame picture to the screen is smaller than 1/6 and greater than or equal to 1/48, the scene of the frame picture is a close scene.
S1036: when the proportion of the human head area in the frame picture to the screen is smaller than 1/48 and greater than or equal to 1/85, the scene of the frame picture is a medium scene.
S1037: when the proportion of the human head area in the frame picture to the screen is smaller than 1/85, the scene of the frame picture is a panorama.
S1038: and acquiring the scene of the frame picture in the video clip as the scene of the video clip.
Specifically, for any video clip, the scene of every frame picture in the clip must be acquired; when the clip contains frame pictures with different scenes, the scenes of the video clip include all of those scene types. For example, if a video clip is composed of close-scene frame pictures and medium-scene frame pictures, the scenes of the video clip are the close scene and the medium scene.
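A minimal sketch of the scene rules above, assuming the person-matching value and the head-area ratio have already been computed for each frame picture; the scene names and the 1/6, 1/48 and 1/85 thresholds follow the text, while the function names are illustrative.

    # Illustrative sketch only; inputs are assumed to be computed elsewhere.
    def classify_scene(matched_value, matching_threshold, head_area_ratio):
        """Return the scene of one frame picture according to the rules above."""
        if matched_value < matching_threshold:
            return "long scene"            # no person and no local close-up of a person
        if head_area_ratio >= 1 / 6:
            return "close-up"
        if head_area_ratio >= 1 / 48:
            return "close scene"
        if head_area_ratio >= 1 / 85:
            return "medium scene"
        return "panorama"

    def clip_scenes(frame_scenes):
        """The scenes of a video clip are all scene types found in its frame pictures."""
        return set(frame_scenes)           # e.g. {"close scene", "medium scene"}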
S104: and acquiring shooting skills of the video clip.
Fig. 3 is a flowchart of a method for obtaining a shooting skill of a video segment according to the present invention, and as shown in fig. 3, the method for obtaining a shooting skill of a video segment includes: S1041-S1049.
S1041: obtaining feature points on the frame pictures of the video segments obtained by cutting the video information by shot, establishing a coordinate system in the plane of the frame picture, and calculating coordinate values of the feature points in the coordinate system.
Specifically, the number of feature points on any frame picture is a fixed value, that is, the number of feature points on one frame picture equals the number of feature points on any other frame picture. Further, in some optional embodiments, obtaining the feature points on the frame picture further includes acquiring a static scene from the frame picture and setting at least two feature points on the static scene. After the video information is cut by shot, all frame pictures in any one video clip come from the same shot. The origin of the coordinate system is located at the geometric center of the frame picture. In some optional embodiments the coordinate system is a rectangular coordinate system: the frame picture is a rectangle, and the transverse coordinate axis and the longitudinal coordinate axis are respectively parallel to the transverse and longitudinal sides of the frame picture. In other cases, a non-rectangular coordinate system may be established according to specific conditions and requirements. A coordinate value in the coordinate system includes an abscissa and an ordinate.
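The patent does not name a feature detector, so the sketch below assumes an off-the-shelf detector (ORB in OpenCV) purely to show how per-frame feature points could be expressed in the centred coordinate system described above; the fixed per-frame count and the centred origin follow the text, the rest is an assumption.

    # Hypothetical sketch: detect up to n_points feature points and centre their coordinates.
    import cv2

    def centred_feature_points(image, n_points=500):
        orb = cv2.ORB_create(nfeatures=n_points)   # detector choice is an assumption
        keypoints = orb.detect(image, None)        # may return fewer than n_points
        h, w = image.shape[:2]
        cx, cy = w / 2.0, h / 2.0
        # Shift pixel coordinates so the origin sits at the geometric centre of the frame;
        # the y axis is flipped so that "up" is the positive ordinate direction.
        return [(kp.pt[0] - cx, cy - kp.pt[1]) for kp in keypoints]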
S1042: any frame picture in the video clip is obtained as a target frame picture, a comparison frame picture of the target frame picture is obtained, and a feature point shared by the target frame picture and the comparison frame picture is obtained as a shared feature point.
Specifically, when obtaining target frame pictures, the first frame picture of the video clip is generally taken first as the target frame picture and its comparison frame picture is obtained; the subsequent frame pictures are then used in turn as target frame pictures, each with its corresponding comparison frame picture, and the number of the comparison frame picture is greater than the number of the target frame picture. The acquired common feature points are feature points that exist in both the target frame picture and the comparison frame picture, and the orientation relationship between these feature points does not change. For example, if in the target frame picture feature point A and feature point B are located on the left and right sides of feature point C, then in the comparison frame picture feature point A and feature point B are still located on those sides of feature point C; however, the distance relationship between them may change, that is, the distance between two feature points may change.
S1043: subtracting the coordinate value of the common feature point in the target frame picture from the coordinate value of the common feature point in the comparison frame picture to obtain the coordinate difference of the common feature point, wherein the coordinate difference comprises an abscissa difference and an ordinate difference, and acquiring a first feature point, a second feature point, a third feature point and a fourth feature point from the common feature points according to the abscissa difference and the ordinate difference.
Specifically, the comparison frame picture should be the same size as the target frame picture, and the coordinate systems established on the two pictures should be identical and located at the same position, so that the difference operation can be performed on the coordinate values.
Specifically, feature points with horizontal coordinate difference values larger than zero and vertical coordinate difference values larger than zero are obtained from the common feature points and serve as first feature points; obtaining a characteristic point with a horizontal coordinate difference value smaller than zero and a vertical coordinate difference value larger than zero from the common characteristic points as a second characteristic point; obtaining a characteristic point with a horizontal coordinate difference value smaller than zero and a vertical coordinate difference value smaller than zero from the common characteristic points as a third characteristic point; obtaining a feature point with a horizontal coordinate difference value larger than zero and a vertical coordinate difference value smaller than zero from the common feature points as a fourth feature point; when the established coordinate system is a cartesian rectangular coordinate system, the right direction represents the positive direction of the abscissa, the upper direction represents the positive direction of the ordinate, the first feature point refers to a feature point moving in the upper right direction, the second feature point refers to a feature point moving in the upper left direction, the third feature point refers to a feature point moving in the lower left direction, and the fourth feature point refers to a feature point moving in the lower right direction.
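A minimal sketch of the quadrant grouping just described, assuming each common feature point is given as its (x, y) coordinates in the target frame picture paired with its coordinates in the comparison frame picture; names are illustrative.

    # Illustrative sketch: group common feature points by the signs of their coordinate differences
    # (comparison frame minus target frame).
    def group_by_displacement(common_points):
        """common_points: list of ((x_t, y_t), (x_c, y_c)) target/comparison pairs."""
        first, second, third, fourth = [], [], [], []
        for (xt, yt), (xc, yc) in common_points:
            dx, dy = xc - xt, yc - yt
            if dx > 0 and dy > 0:
                first.append(((xt, yt), (xc, yc)))    # moved up-right
            elif dx < 0 and dy > 0:
                second.append(((xt, yt), (xc, yc)))   # moved up-left
            elif dx < 0 and dy < 0:
                third.append(((xt, yt), (xc, yc)))    # moved down-left
            elif dx > 0 and dy < 0:
                fourth.append(((xt, yt), (xc, yc)))   # moved down-right
        return first, second, third, fourth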
S1044: when the numbers of the first feature points, the second feature points, the third feature points and the fourth feature points are all greater than zero, and the abscissas of the first and fourth feature points in the comparison frame picture are greater than the abscissas of the second and third feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a push.
S1045: when the numbers of the first feature points, the second feature points, the third feature points and the fourth feature points are all greater than zero, and the abscissas of the second and third feature points in the comparison frame picture are greater than the abscissas of the first and fourth feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a pull.
Specifically, taking the established coordinate system as a Cartesian rectangular coordinate system as an example, with right as the positive abscissa direction and up as the positive ordinate direction, the numbers of the first, second, third and fourth feature points all being greater than zero indicates that the comparison frame picture is pushed or pulled relative to the target frame picture. In the comparison frame picture, when the abscissas of the first and fourth feature points are greater than the abscissas of the second and third feature points, the feature points on the right side move further right and the feature points on the left side move further left, that is, part of the content of the target frame picture is enlarged in the comparison frame picture; in this case the ordinates of the first and second feature points in the comparison frame picture are also greater than the ordinates of the third and fourth feature points. Similarly, in the comparison frame picture, when the abscissas of the second and third feature points are greater than the abscissas of the first and fourth feature points, the feature points on the right side move toward the left and the feature points on the left side move toward the right, that is, the content of the target frame picture is reduced in the comparison frame picture.
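Building on the grouping sketch above, the following is one hedged reading of the push/pull rule: all four groups must be non-empty, and "the abscissas of one pair of groups are greater than those of the other" is taken here as a strict min/max comparison; this interpretation is an assumption, not wording from the patent.

    # Illustrative sketch of the push/pull test.
    def push_or_pull(first, second, third, fourth):
        if not (first and second and third and fourth):
            return None                                   # not a push/pull pattern
        right_xs = [xc for (_, (xc, _)) in first + fourth]   # up-right / down-right points
        left_xs = [xc for (_, (xc, _)) in second + third]    # up-left / down-left points
        if min(right_xs) > max(left_xs):
            return "push"   # right-side points move right, left-side points move left: content enlarged
        if min(left_xs) > max(right_xs):
            return "pull"   # the picture content shrinks in the comparison frame
        return None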
S1046: when the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference of a common feature point is not monotonic with respect to the absolute value of its abscissa in the target frame picture, or when the ordinate differences of the common feature points are all greater than zero or all less than zero and the ordinate difference of a common feature point is not monotonic with respect to the absolute value of its ordinate in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a track (moving shot).
Specifically, when the abscissa differences or the ordinate differences of the common feature points are all greater than zero or all less than zero, the shooting camera is moving, but not back and forth, that is, not along the lens axis. When, in addition, the abscissa and ordinate differences of the common feature points are not monotonic with respect to the absolute values of the abscissas and ordinates of the common feature points in the target frame picture, the distance each common feature point moves is independent of its position in the target frame picture; this corresponds to a translation of the camera other than forward or backward, that is, a track (move) is used.
S1047: when the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference of a common feature point is monotonic with respect to the absolute value of its abscissa in the target frame picture, or when the ordinate differences of the common feature points are all greater than zero or all less than zero and the ordinate difference of a common feature point is monotonic with respect to the absolute value of its ordinate in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is a pan (shaking shot).
Specifically, when the abscissa differences or the ordinate differences of the common feature points are all greater than zero or all less than zero, the shooting camera is moving, but not back and forth. When, in addition, the abscissa or ordinate differences of the common feature points are monotonic with respect to the absolute values of the abscissas or ordinates of the common feature points in the target frame picture, the distance each common feature point moves is related to its position in the target frame picture, that is, a pan (shake) is used.
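The pan/track distinction above hinges on whether the per-point displacement varies monotonically with the point's distance from the centre. The sketch below is one way to test that, assuming the common feature points are given as (target, comparison) coordinate pairs; sorting by |coordinate| and checking for a non-decreasing or non-increasing difference sequence stands in for the patent's "monotonicity", which is an interpretation.

    # Illustrative sketch of the monotonicity-based pan/track test.
    def _monotonic(pairs):
        """pairs: (|coordinate in target frame|, coordinate difference)."""
        diffs = [d for _, d in sorted(pairs)]
        non_decreasing = all(a <= b for a, b in zip(diffs, diffs[1:]))
        non_increasing = all(a >= b for a, b in zip(diffs, diffs[1:]))
        return non_decreasing or non_increasing

    def pan_or_track(common_points):
        """common_points: list of ((x_t, y_t), (x_c, y_c)) pairs."""
        for axis in (0, 1):                               # 0: abscissa, 1: ordinate
            pairs = [(abs(t[axis]), c[axis] - t[axis]) for t, c in common_points]
            same_sign = all(d > 0 for _, d in pairs) or all(d < 0 for _, d in pairs)
            if same_sign:
                return "pan" if _monotonic(pairs) else "track"
        return None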
S1048: obtaining, from the common feature points, the feature points whose abscissa difference and ordinate difference are both equal to zero as static feature points, wherein when the number of static feature points is greater than a first threshold, the shooting technique adopted by the target frame picture and the comparison frame picture is fixed shooting.
Specifically, the ratio of the first threshold to the number of feature points of any one frame picture is greater than 10%. During fixed shooting the camera does not move, so landmark scenery such as a building does not change position between the target frame picture and the comparison frame picture; that is, there are many common feature points whose coordinate values do not change. However, because of influences such as vibration and light refraction during actual shooting, an object that does not move may appear to move and a moving object may appear stationary, so a first threshold needs to be set, and only when the number of static feature points is greater than the first threshold can the adopted shooting technique be regarded as fixed shooting.
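A small sketch of the fixed-shot test above, with the first threshold taken as 10% of the fixed per-frame feature-point count (the patent only requires the ratio to exceed 10%); exact coordinate equality stands in for "did not move", which in practice would need a small tolerance.

    # Illustrative sketch of the fixed-shooting test.
    def is_fixed_shot(common_points, points_per_frame, ratio=0.10, tol=0.0):
        """common_points: list of ((x_t, y_t), (x_c, y_c)) pairs."""
        first_threshold = ratio * points_per_frame
        static = sum(
            1 for (xt, yt), (xc, yc) in common_points
            if abs(xc - xt) <= tol and abs(yc - yt) <= tol    # static feature point
        )
        return static > first_threshold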
S1049: and acquiring shooting skills adopted by the target frame picture and the contrast frame picture in the video clip as the shooting skills of the video clip.
S105: summarizing the serial numbers, the playing durations, the scenes and the shooting methods of the video clips as the montage language of the video information.
Specifically, the obtained montage language includes each video clip, the number and playing duration of the clip, the scenes used in each clip, and the shooting methods used for shooting the clip.
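One possible shape for the summarised montage language of a clip; the field names below are illustrative and not taken from the patent.

    # Illustrative data structure for one clip's entry in the montage language.
    from dataclasses import dataclass, field
    from typing import List, Set

    @dataclass
    class ClipMontage:
        number: int                                          # playing-order number of the clip
        duration_s: float                                    # playing duration in seconds
        scenes: Set[str] = field(default_factory=set)        # e.g. {"close scene", "medium scene"}
        techniques: List[str] = field(default_factory=list)  # e.g. ["push", "fixed shooting"]

    # The montage language of the whole video is then the ordered list of clips, e.g.:
    # montage_language = [ClipMontage(1, 12.5, {"long scene"}, ["fixed shooting"]), ...]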
Further, in some optional embodiments, cutting the video information by shot to obtain at least one video segment further comprises: acquiring any one frame picture from the video information as a current frame picture; acquiring the next frame picture adjacent to the current frame picture from the video information; acquiring the feature points of the current frame picture and the next frame picture; and calculating the number of feature points shared by the current frame picture and the next frame picture as the continuity. When the continuity is greater than or equal to a second threshold, the current frame picture and the next frame picture belong to the same video segment, the ratio of the second threshold to the number of feature points of any one frame picture being greater than 50%; when the continuity is less than the second threshold, the current frame picture and the next frame picture do not belong to the same video segment, and a cut is made between them.
Specifically, the number of feature points of every frame picture is the same, that is, a fixed value. When judging whether the current frame picture and the next frame picture belong to the same video segment, the number of feature points the two frame pictures have in common is used as the continuity and serves as the basis for the judgment, because frame pictures taken within the same shot should share more identical feature points; the two frame pictures are judged to belong to the same segment when the continuity is greater than or equal to the second threshold, whose ratio to the number of feature points of any one frame picture is greater than 50%.
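A sketch of the continuity-based cutting rule just described; the matcher that counts shared feature points is passed in as a function because the patent does not specify one, and the 50% figure is the lower bound the text gives for the second threshold.

    # Illustrative sketch: split sampled frames into shot-level segments by continuity.
    def cut_into_segments(frames, count_shared_points, points_per_frame, ratio=0.5):
        """`count_shared_points(a, b)` returns the number of feature points two frames share."""
        second_threshold = ratio * points_per_frame     # patent requires the ratio to exceed 50%
        segments, current = [], [frames[0]]             # assumes at least one frame
        for prev, nxt in zip(frames, frames[1:]):
            continuity = count_shared_points(prev, nxt)
            if continuity >= second_threshold:
                current.append(nxt)                      # same shot, keep accumulating
            else:
                segments.append(current)                 # cut between prev and nxt
                current = [nxt]
        segments.append(current)
        return segments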
Further, in some alternative embodiments, the comparison frame picture and the target frame picture belong to the same video segment; the contrast frame picture and the target frame picture are adjacent in the video clip; the number of the comparison frame picture is larger than that of the target frame picture, namely the comparison frame picture is a frame picture next to the target frame picture.
The invention provides a method for analyzing the film montage language that can directly obtain the montage language in moving images such as movies and television, including the scene and shooting method of each video segment. No step of the method places work-experience requirements on film practitioners, which reduces the cost of identification and improves its efficiency; the method can also provide learning material and guidance for film and television practitioners and helps to improve their shooting skills.
The present invention also provides a system for analyzing a movie montage language. FIG. 4 shows the system for analyzing a movie montage language in the present invention; as shown in FIG. 4, the system includes: a video reading module 201, a scene acquisition module 202, a shooting skill acquisition module 203 and a calculation module 204;
the video reading module 201 is connected with the scene acquisition module and the calculation module, and is configured to load video information, cut the video information by shot to obtain at least one video clip, arrange the video clips in playing order and assign numbers to them, and acquire the numbers and playing durations of the video clips.
Specifically, the read video information may be black-and-white or color moving images such as movies and television, and may contain both images and audio or images only. Such moving images consist of frame pictures, usually 24 per second. When acquiring frame pictures from the video information, all frame pictures may be acquired, or one frame picture may be extracted at fixed intervals; because the change between two adjacent frame pictures is often small, it is difficult to find the differences between them, which easily causes recognition problems. All extracted frame pictures are collected and numbered in playing order, with numbers increasing from 1. The collected frame pictures are cut by shot, the corresponding positions in the video information are located, and the video information is cut at those positions to obtain at least one video segment. The playing duration of each video segment is calculated, and the number and playing duration of each segment are transmitted to the calculation module 204.
Further, in some optional embodiments, the video reading module 201 includes a video cutting unit configured to: acquire any one frame picture from the video information as a current frame picture; acquire the next frame picture adjacent to the current frame picture from the video information; acquire the feature points of the current frame picture and the next frame picture; and calculate the number of feature points shared by the current frame picture and the next frame picture as the continuity. When the continuity is greater than or equal to a second threshold, the current frame picture and the next frame picture belong to the same video segment, the ratio of the second threshold to the number of feature points of any one frame picture being greater than 50%; when the continuity is less than the second threshold, the two frame pictures do not belong to the same video segment and a cut is made between them.
Specifically, the number of feature points of every frame picture is the same, that is, a fixed value. When judging whether the current frame picture and the next frame picture belong to the same video segment, the number of feature points the two frame pictures have in common is used as the continuity and serves as the basis for the judgment, because frame pictures taken within the same shot should share more identical feature points; the two belong to the same segment when the continuity is greater than or equal to the second threshold, whose ratio to the number of feature points of any one frame picture is greater than 50%.
The scene acquisition module 202 is connected to the video reading module 201, the shooting skill acquisition module 203 and the calculation module 204, and is configured to acquire the scenes of the video segments; after the video reading module 201 processes the read video information, the cut video segments are transmitted to the scene acquisition module 202.
When the scene acquisition module 202 acquires the scene of a video clip, it further: acquires a frame picture of the clip, acquires sampling points on the frame picture, matches the sampling points with the human body sample feature library to obtain the matched sampling point values, and compares these values with the predefined matching threshold. When the matched sampling point value is smaller than the predefined matching threshold, the frame picture contains neither a person nor a local close-up of a person, and the scene of the frame picture is a long scene. When the matched sampling point value is greater than or equal to the predefined matching threshold, the frame picture contains a person or a local close-up of a person, and the ratio of the human head area in the frame picture to the display screen area is obtained: when this proportion is greater than or equal to 1/6, the scene of the frame picture is a close-up; when it is less than 1/6 and greater than or equal to 1/48, the scene is a close scene; when it is less than 1/48 and greater than or equal to 1/85, the scene is a medium scene; and when it is less than 1/85, the scene is a panorama. The scenes of the frame pictures in the video clip are acquired as the scene of the video clip.
Specifically, after the scene acquisition module 202 acquires the scene of each video clip, the acquired scene and the corresponding video number are sent to the calculation module 204, and the calculation module 204 stores the acquired data.
The shooting skill acquisition module 203 is connected to the scene acquisition module 202 and the calculation module 204 and is configured to acquire the shooting technique of the video segment. In some optional embodiments, the shooting skill acquisition module may also be connected to the video reading module 201, so that a copy of the video segments processed by the video reading module 201 can be sent directly to the shooting skill acquisition module 203, allowing it to process the video segments synchronously with the scene acquisition module 202.
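For orientation only, the following sketch shows one way the four modules of FIG. 4 could be wired together; the class and method names are assumptions and simply delegate to per-clip analysis routines such as those sketched earlier in this description.

    # Illustrative sketch of the module arrangement; not the patent's implementation.
    class MontageAnalyzer:
        def __init__(self, video_reader, scene_module, skill_module, calculator):
            self.video_reader = video_reader
            self.scene_module = scene_module
            self.skill_module = skill_module
            self.calculator = calculator

        def analyze(self, video_path):
            clips = self.video_reader.load_and_cut(video_path)        # numbered clips with durations
            for clip in clips:
                clip.scenes = self.scene_module.scene_types(clip)      # scenes per clip
                clip.techniques = self.skill_module.techniques(clip)   # shooting techniques per clip
            return self.calculator.summarize(clips)                    # the montage language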
The shooting skill acquisition module 203 acquires the shooting techniques of the video clip as follows. Feature points are acquired on the frame pictures of the video clip, the number of feature points being the same for any two frame pictures. A coordinate system is established in the plane of the frame picture, comprising a transverse coordinate axis and a longitudinal coordinate axis: the origin of the coordinate system is located at the geometric center of the frame picture, the positive direction of the transverse coordinate axis is any direction in that plane, and the positive direction of the longitudinal coordinate axis intersects the positive direction of the transverse coordinate axis. The coordinate values of the feature points in this coordinate system, consisting of an abscissa and an ordinate, are calculated. Any frame picture of the video clip is taken as a target frame picture, and a comparison frame picture whose number is greater than that of the target frame picture is acquired. The feature points shared by the target frame picture and the comparison frame picture are taken as common feature points, and the coordinate value of each common feature point in the target frame picture is subtracted from its coordinate value in the comparison frame picture to obtain a coordinate difference value, consisting of an abscissa difference and an ordinate difference.
Among the common feature points, those whose abscissa difference and ordinate difference are both greater than zero are first feature points; those whose abscissa difference is less than zero and ordinate difference is greater than zero are second feature points; those whose abscissa difference and ordinate difference are both less than zero are third feature points; and those whose abscissa difference is greater than zero and ordinate difference is less than zero are fourth feature points. When the numbers of the first, second, third and fourth feature points are all not less than zero and the abscissas of the first and fourth feature points in the comparison frame picture are greater than the abscissas of the second and third feature points in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is "push"; when the numbers of the first, second, third and fourth feature points are all not less than zero and the abscissas of the second and third feature points in the comparison frame picture are greater than the abscissas of the first and fourth feature points in the comparison frame picture, the shooting technique is "pull". When the abscissa differences of the common feature points are all greater than zero or all less than zero and the abscissa difference has no monotonic relation with the absolute value of the abscissa of the common feature point in the target frame picture, or when the ordinate differences are all greater than zero or all less than zero and the ordinate difference has no monotonic relation with the absolute value of the ordinate in the target frame picture, the shooting technique is "move"; when the corresponding differences all share one sign and do have such a monotonic relation, the shooting technique is "shake". Common feature points whose abscissa difference and ordinate difference are both equal to zero are taken as static feature points; when the number of static feature points is greater than a first threshold, the shooting technique is fixed shooting, the ratio of the first threshold to the number of feature points of any frame picture being greater than 10%. The shooting techniques adopted by the target frame pictures and comparison frame pictures in the video clip are acquired as the shooting techniques of the video clip.
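Purely for illustration and not as part of the claimed method, the decision rules above can be summarized in a short Python sketch; the function name classify_technique, the dictionary layout of common_points and the choice of Python are assumptions of this example, and the ordinate branch of the move/shake test is only indicated in a comment. The first threshold passed in would be chosen as more than 10% of the number of feature points of a frame picture, as stated above.

def classify_technique(common_points, first_threshold):
    # common_points: list of dicts, each with 'target' and 'compare' (x, y)
    # coordinates of one common feature point in the centred coordinate system.
    diffs = [(p['compare'][0] - p['target'][0],
              p['compare'][1] - p['target'][1]) for p in common_points]

    # Static feature points: both coordinate differences equal to zero.
    static = sum(1 for dx, dy in diffs if dx == 0 and dy == 0)
    if static > first_threshold:
        return 'fixed shooting'

    # Sign pattern of the differences -> first/second/third/fourth feature points.
    q1 = [p for p, (dx, dy) in zip(common_points, diffs) if dx > 0 and dy > 0]
    q2 = [p for p, (dx, dy) in zip(common_points, diffs) if dx < 0 and dy > 0]
    q3 = [p for p, (dx, dy) in zip(common_points, diffs) if dx < 0 and dy < 0]
    q4 = [p for p, (dx, dy) in zip(common_points, diffs) if dx > 0 and dy < 0]

    if q1 and q2 and q3 and q4:
        # Push: first/fourth points lie to the right of second/third points in
        # the comparison frame picture; pull is the mirrored case.
        if min(p['compare'][0] for p in q1 + q4) > max(p['compare'][0] for p in q2 + q3):
            return 'push'
        if min(p['compare'][0] for p in q2 + q3) > max(p['compare'][0] for p in q1 + q4):
            return 'pull'

    # All abscissa differences share one sign: 'shake' if |difference| grows
    # monotonically with |abscissa| in the target frame picture, else 'move'
    # (the ordinate differences would be tested in the same way).
    dxs = [dx for dx, _ in diffs]
    if dxs and (all(dx > 0 for dx in dxs) or all(dx < 0 for dx in dxs)):
        pairs = sorted((abs(p['target'][0]), abs(dx))
                       for p, dx in zip(common_points, dxs))
        monotonic = all(a[1] <= b[1] for a, b in zip(pairs, pairs[1:]))
        return 'shake' if monotonic else 'move'

    return 'undetermined'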
Specifically, the shooting techniques are divided into push, pull, shake, move and fixed shooting. Within one video clip a single camera shot is used, but one or more shooting techniques may be employed at the same time, so a video clip may have more than one shooting technique. When acquiring the shooting techniques, the system may record only which techniques are used, or it may additionally record the duration for which each technique is used and compute the ratio of that duration to the playing duration of the video clip, thereby obtaining the proportion of each shooting technique within the clip. For example, if "push" is used for 30% of the time and fixed shooting for 70% of the time, the shooting techniques of the video clip are recorded as 30% push and 70% fixed shooting.
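As a minimal sketch of the duration statistics described above (the names technique_per_frame and technique_proportions are illustrative assumptions; one classification result per frame pair of the clip is assumed):

from collections import Counter

def technique_proportions(technique_per_frame):
    # technique_per_frame: one technique label per frame pair of the clip.
    if not technique_per_frame:
        return {}
    counts = Counter(technique_per_frame)
    total = sum(counts.values())
    return {tech: n / total for tech, n in counts.items()}

# 3 frame pairs classified as push and 7 as fixed shooting
# -> {'push': 0.3, 'fixed shooting': 0.7}, i.e. 30% push and 70% fixed shooting.
print(technique_proportions(['push'] * 3 + ['fixed shooting'] * 7))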
And the calculation module 204 is respectively connected with the video reading module 201, the view acquisition module 202 and the shooting skill acquisition module 203, and is configured to summarize the number, the playing duration, the scene and the shooting techniques of each video clip as the montage language of the video information.
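For illustration only (the ClipMontage dataclass and its field names are assumptions of this example and are not defined by the invention), the summary produced by the calculation module 204 could be represented as one record per video clip:

from dataclasses import dataclass, asdict
from typing import Dict, List

@dataclass
class ClipMontage:
    number: int                   # position of the video clip in playing order
    duration_s: float             # playing duration of the video clip in seconds
    scene: str                    # long scene / panorama / medium / close / close-up
    techniques: Dict[str, float]  # shooting technique -> share of playing time

def montage_language(clips: List[ClipMontage]) -> List[dict]:
    # The montage language of the video information is the list of clip records.
    return [asdict(c) for c in clips]

print(montage_language([
    ClipMontage(1, 12.4, 'panorama', {'push': 0.3, 'fixed shooting': 0.7}),
    ClipMontage(2, 3.8, 'close-up', {'fixed shooting': 1.0}),
]))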
1) With the method and system for analyzing the movie montage language provided by the invention, no step requires personnel with professional film-making experience, which overcomes the previous problems of high demands on personal experience, long processing time and low efficiency;
2) the method and system for analyzing the movie montage language provided by the invention reduce the dependence on manual work, can quickly identify the montage language in moving images such as films and television programs, and provide abundant data for film practitioners studying film shooting techniques.
Although some specific embodiments of the present invention have been described in detail by way of examples, it should be understood by those skilled in the art that the above examples are for illustrative purposes only and are not intended to limit the scope of the present invention. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (7)

1. A method for language analysis of a movie montage, comprising:
loading video information;
cutting the video information according to shots to obtain at least 1 video segment;
arranging the video clips according to a playing sequence and giving numbers to the video clips;
acquiring the serial number and the playing time length of the video clip;
acquiring the scene of the video clip, further comprising:
acquiring a frame picture of the video clip, acquiring sampling points on the frame picture, matching the sampling points with a human body sample feature library, acquiring matched sampling point numerical values, comparing the matched sampling point numerical values with a predefined matching threshold value,
when the matched sampling point numerical value is smaller than the predefined matching threshold value, the frame picture contains neither a person nor a close-up of a part of a person, and the scene of the frame picture is a long scene,
when the matched sampling point numerical value is larger than or equal to the predefined matching threshold value, the frame picture contains a person or a close-up of a part of a person, and the ratio of the head area of the human body in the frame picture to the display screen area is acquired,
when the proportion of the head area of the human body in the frame picture to the screen is greater than or equal to 1/6, the scene of the frame picture is a close-up,
when the proportion of the head area of the human body in the frame picture to the screen is less than 1/6 and greater than or equal to 1/48, the scene of the frame picture is a close scene,
when the proportion of the head area of the human body in the frame picture to the screen is less than 1/48 and greater than or equal to 1/85, the scene of the frame picture is a medium scene,
when the proportion of the head area of the human body in the frame picture to the screen is less than 1/85, the scene of the frame picture is a panorama,
acquiring the scene of the frame picture in the video clip as the scene of the video clip; acquiring shooting skills of the video clip, and further comprising:
acquiring feature points on the frame pictures of the video clip, wherein the number of the feature points on any frame picture is a fixed value,
establishing a coordinate system on the plane where the frame picture is located, wherein the coordinate system comprises a transverse coordinate axis and a longitudinal coordinate axis, the origin of the coordinate system is located at the geometric center of the frame picture, the positive direction of the transverse coordinate axis is any direction in the plane of the frame picture, and the positive direction of the longitudinal coordinate axis intersects the positive direction of the transverse coordinate axis,
calculating coordinate values of the feature points in the coordinate system, wherein the coordinate values comprise: an abscissa and an ordinate,
acquiring any frame picture in the video clip as a target frame picture,
acquiring a comparison frame picture of the target frame picture, wherein the number of the comparison frame picture is greater than the number of the target frame picture,
acquiring the feature points shared by the target frame picture and the comparison frame picture as shared feature points,
and subtracting the coordinate value of the common feature point in the target frame picture from the coordinate value of the common feature point in the comparison frame picture to obtain a coordinate difference value of the common feature point, wherein the coordinate difference value comprises: the horizontal coordinate difference value and the vertical coordinate difference value,
obtaining the feature point having the abscissa difference value larger than zero and the ordinate difference value larger than zero from the common feature points as a first feature point,
obtaining, as a second feature point, the feature point whose abscissa difference is smaller than zero and whose ordinate difference is larger than zero from the common feature points,
obtaining the feature point of which the abscissa difference is smaller than zero and the ordinate difference is smaller than zero from the common feature points as a third feature point,
obtaining the feature point with the abscissa difference value larger than zero and the ordinate difference value smaller than zero from the common feature points as a fourth feature point,
when the numbers of the first feature point, the second feature point, the third feature point and the fourth feature point are all not less than zero, and the abscissas of the first feature point and the fourth feature point in the comparison frame picture are greater than the abscissas of the second feature point and the third feature point in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is push,
when the numbers of the first feature point, the second feature point, the third feature point and the fourth feature point are all not less than zero, and the abscissa of the second feature point and the third feature point in the comparison frame picture is greater than the abscissa of the first feature point and the fourth feature point in the comparison frame picture, the shooting skill adopted by the target frame picture and the comparison frame picture is pull,
when the abscissa difference values of the common feature points are all greater than zero or all less than zero, and the abscissa difference value of the common feature point has no monotonic relation with the absolute value of the abscissa of the common feature point in the target frame picture; or when the ordinate difference values of the common feature points are all greater than zero or all less than zero, and the ordinate difference value of the common feature point has no monotonic relation with the absolute value of the ordinate of the common feature point in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is move,
when the abscissa difference values of the common feature points are all greater than zero or all less than zero, and the abscissa difference value of the common feature point has a monotonic relation with the absolute value of the abscissa of the common feature point in the target frame picture; or when the ordinate difference values of the common feature points are all greater than zero or all less than zero, and the ordinate difference value of the common feature point has a monotonic relation with the absolute value of the ordinate of the common feature point in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is shake,
obtaining the feature points of which the horizontal coordinate difference and the vertical coordinate difference are both equal to zero from the common feature points as static feature points, wherein when the number of the static feature points is greater than a first threshold, the shooting technique adopted by the target frame picture and the comparison frame picture is fixed shooting, wherein the ratio of the first threshold to the number of the feature points of any one frame picture is greater than 10%,
acquiring the shooting skills adopted by the target frame picture and the comparison frame picture in the video clip as the shooting skills of the video clip; and summarizing the serial numbers, the playing duration, the scenes and the shooting methods of the video clips as the montage language of the video information.
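As a non-limiting illustration outside the claims (the function name classify_scene and its arguments are assumptions of this example), the scene classification of claim 1 amounts to comparing the head-to-screen area ratio against the thresholds 1/6, 1/48 and 1/85:

def classify_scene(matched_value, match_threshold, head_area, screen_area):
    # Scene classification of one frame picture per the thresholds of claim 1.
    if matched_value < match_threshold:
        return 'long scene'              # no person or partial close-up in the frame
    ratio = head_area / screen_area      # share of the screen taken by the head
    if ratio >= 1 / 6:
        return 'close-up'
    if ratio >= 1 / 48:
        return 'close scene'
    if ratio >= 1 / 85:
        return 'medium scene'
    return 'panorama'

# A head occupying 1/30 of the screen lies between 1/48 and 1/6 -> close scene.
print(classify_scene(matched_value=0.9, match_threshold=0.5,
                     head_area=1.0, screen_area=30.0))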
2. The method of claim 1, wherein the video information is cut into at least 1 video segment according to the shots, and further comprising:
acquiring any one frame picture from the video information as a current frame picture,
acquiring a next frame picture adjacent to the current frame picture from the video information,
acquiring the characteristic points of the current frame picture and the next frame picture,
calculating the number of the feature points shared by the current frame picture and the next frame picture as the continuity,
when the continuity is greater than or equal to a second threshold, the current frame picture and the next frame picture belong to the same video segment, wherein: the ratio of the second threshold value to the number of the feature points of any one frame picture is more than 50%,
and when the continuity is smaller than a second threshold value, the current frame picture and the next frame picture do not belong to the same video segment, and cutting is carried out between the current frame picture and the next frame picture.
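As a non-limiting illustration outside the claims, the cutting rule of claim 2 can be sketched as follows; the use of OpenCV's ORB detector and brute-force matcher to obtain and match the feature points is an assumption of this example, since the claim does not prescribe a particular detector, and the second threshold would be chosen as more than 50% of the feature point count of a frame picture.

import cv2

orb = cv2.ORB_create(nfeatures=500)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def shared_feature_count(frame_a, frame_b):
    # Continuity: number of feature points shared by two adjacent frame pictures.
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    _, des_a = orb.detectAndCompute(gray_a, None)
    _, des_b = orb.detectAndCompute(gray_b, None)
    if des_a is None or des_b is None:
        return 0
    return len(matcher.match(des_a, des_b))

def cut_points(frames, second_threshold):
    # Indices i such that a cut is made between frames[i] and frames[i + 1].
    cuts = []
    for i in range(len(frames) - 1):
        if shared_feature_count(frames[i], frames[i + 1]) < second_threshold:
            cuts.append(i)   # continuity below the second threshold -> new segment
    return cuts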
3. The method of claim 2, wherein:
the comparison frame picture and the target frame picture belong to the same video clip;
the comparison frame picture and the target frame picture are adjacent in the video clip.
4. The method of claim 1, wherein the coordinate system is a rectangular coordinate system, and the horizontal coordinate axis and the vertical coordinate axis are perpendicular to each other.
5. The method of claim 1, wherein the acquiring of the feature points on the frame picture of the video clip comprises:
and acquiring a static scene from the frame picture, and setting at least two feature points on the static scene.
6. A system for language analysis of a movie montage, comprising: the system comprises a video reading module, a scene acquisition module, a shooting skill acquisition module and a calculation module;
the video reading module is connected with the scene acquisition module and the calculation module and is used for loading video information, cutting the video information according to shots to obtain at least 1 video clip, arranging the video clips according to a playing sequence and giving numbers to the video clips, and acquiring the numbers and the playing duration of the video clips;
the scene acquisition module is connected to the video reading module, the shooting skill acquisition module and the calculation module, and is configured to acquire the scene of the video clip, wherein the acquiring of the scene of the video clip further includes:
acquiring a frame picture of the video clip, acquiring sampling points on the frame picture, matching the sampling points with a human body sample feature library, acquiring matched sampling point numerical values, comparing the matched sampling point numerical values with a predefined matching threshold value,
when the matched sampling point numerical value is smaller than the predefined matching threshold value, the frame picture contains neither a person nor a close-up of a part of a person, and the scene of the frame picture is a long scene,
when the matched sampling point numerical value is larger than or equal to the predefined matching threshold value, the frame picture contains a person or a close-up of a part of a person, and the ratio of the head area of the human body in the frame picture to the display screen area is acquired,
when the proportion of the head area of the human body in the frame picture to the screen is greater than or equal to 1/6, the scene of the frame picture is a close-up,
when the proportion of the head area of the human body in the frame picture to the screen is less than 1/6 and greater than or equal to 1/48, the scene of the frame picture is a close scene,
when the proportion of the head area of the human body in the frame picture to the screen is less than 1/48 and greater than or equal to 1/85, the scene of the frame picture is a medium scene,
when the proportion of the head area of the human body in the frame picture to the screen is less than 1/85, the scene of the frame picture is a panorama,
acquiring the scene of the frame picture in the video clip as the scene of the video clip; the shooting skill acquisition module, connected to the scene acquisition module and the calculation module, is configured to acquire the shooting skill of the video clip, wherein the acquiring of the shooting skill of the video clip further includes:
acquiring feature points on the frame picture of the video clip, wherein: the number of the feature points on any frame picture is a fixed value,
establishing a coordinate system on the plane where the frame picture is located, wherein the coordinate system comprises a transverse coordinate axis and a longitudinal coordinate axis, the origin of the coordinate system is located at the geometric center of the frame picture, the positive direction of the transverse coordinate axis is any direction in the plane of the frame picture, and the positive direction of the longitudinal coordinate axis intersects the positive direction of the transverse coordinate axis,
calculating coordinate values of the feature points in the coordinate system, wherein the coordinate values comprise: an abscissa and an ordinate,
acquiring any frame picture in the video clip as a target frame picture,
acquiring a comparison frame picture of the target frame picture, wherein the number of the comparison frame picture is greater than the number of the target frame picture,
acquiring the feature points shared by the target frame picture and the comparison frame picture as shared feature points,
and subtracting the coordinate value of the common feature point in the target frame picture from the coordinate value of the common feature point in the comparison frame picture to obtain a coordinate difference value of the common feature point, wherein the coordinate difference value comprises: the horizontal coordinate difference value and the vertical coordinate difference value,
obtaining the feature point having the abscissa difference value larger than zero and the ordinate difference value larger than zero from the common feature points as a first feature point,
obtaining, as a second feature point, the feature point whose abscissa difference is smaller than zero and whose ordinate difference is larger than zero from the common feature points,
obtaining the feature point of which the abscissa difference is smaller than zero and the ordinate difference is smaller than zero from the common feature points as a third feature point,
obtaining the feature point with the abscissa difference value larger than zero and the ordinate difference value smaller than zero from the common feature points as a fourth feature point,
when the numbers of the first feature point, the second feature point, the third feature point and the fourth feature point are all not less than zero, and the abscissas of the first feature point and the fourth feature point in the comparison frame picture are greater than the abscissas of the second feature point and the third feature point in the comparison frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is push,
when the numbers of the first feature point, the second feature point, the third feature point and the fourth feature point are all not less than zero, and the abscissa of the second feature point and the third feature point in the comparison frame picture is greater than the abscissa of the first feature point and the fourth feature point in the comparison frame picture, the shooting skill adopted by the target frame picture and the comparison frame picture is pull,
when the abscissa difference values of the common feature points are all greater than zero or all less than zero, and the abscissa difference value of the common feature point has no monotonic relation with the absolute value of the abscissa of the common feature point in the target frame picture; or when the ordinate difference values of the common feature points are all greater than zero or all less than zero, and the ordinate difference value of the common feature point has no monotonic relation with the absolute value of the ordinate of the common feature point in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is move,
when the abscissa difference values of the common feature points are all greater than zero or all less than zero, and the abscissa difference value of the common feature point has a monotonic relation with the absolute value of the abscissa of the common feature point in the target frame picture; or when the ordinate difference values of the common feature points are all greater than zero or all less than zero, and the ordinate difference value of the common feature point has a monotonic relation with the absolute value of the ordinate of the common feature point in the target frame picture, the shooting technique adopted by the target frame picture and the comparison frame picture is shake,
obtaining the feature points of which the horizontal coordinate difference and the vertical coordinate difference are both equal to zero from the common feature points as static feature points, wherein when the number of the static feature points is greater than a first threshold value, the shooting technique adopted by the target frame picture and the comparison frame picture is fixed shooting, wherein the ratio of the first threshold value to the number of the feature points of any one frame picture is greater than 10%,
acquiring the shooting skills adopted by the target frame picture and the comparison frame picture in the video clip as the shooting skills of the video clip; the calculation module is respectively connected with the video reading module, the scene acquisition module and the shooting skill acquisition module and is used for summarizing the serial numbers, the playing duration, the scenes and the shooting methods of the video clips as the montage language of the video information.
7. The system of claim 6, wherein the video reading module comprises: a video cutting unit to:
acquiring any one frame picture from the video information as a current frame picture,
acquiring a next frame picture adjacent to the current frame picture from the video information,
acquiring the characteristic points of the current frame picture and the next frame picture,
calculating the number of the feature points shared by the current frame picture and the next frame picture as the continuity,
when the continuity is greater than or equal to a second threshold, the current frame picture and the next frame picture belong to the same video segment, wherein: the ratio of the second threshold to the number of the feature points of any one of the frame pictures is greater than 50%,
and when the continuity is smaller than a second threshold value, the current frame picture and the next frame picture do not belong to the same video segment, and cutting is carried out between the current frame picture and the next frame picture.
CN201810048953.3A 2018-01-18 2018-01-18 Method and system for analyzing movie montage language Active CN108268847B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810048953.3A CN108268847B (en) 2018-01-18 2018-01-18 Method and system for analyzing movie montage language

Publications (2)

Publication Number Publication Date
CN108268847A CN108268847A (en) 2018-07-10
CN108268847B true CN108268847B (en) 2020-04-21

Family

ID=62776021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810048953.3A Active CN108268847B (en) 2018-01-18 2018-01-18 Method and system for analyzing movie montage language

Country Status (1)

Country Link
CN (1) CN108268847B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165557A (en) * 2018-07-25 2019-01-08 曹清 Scene classification judging system and scene classification judging method
CN109508668A (en) * 2018-11-09 2019-03-22 北京奇艺世纪科技有限公司 A kind of lens type information identifying method and device
CN113347082A (en) * 2021-08-06 2021-09-03 深圳康易世佳科技有限公司 Method and device for intelligently displaying shared messages containing short videos

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5316118B2 (en) * 2009-03-12 2013-10-16 オムロン株式会社 3D visual sensor

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004297297A (en) * 2003-03-26 2004-10-21 Fuji Photo Film Co Ltd Digital camera
CN101771815A (en) * 2008-12-30 2010-07-07 华晶科技股份有限公司 Digital image shooting method
CN103246440A (en) * 2012-02-03 2013-08-14 瀚宇彩晶股份有限公司 Method for rotating picture through coordinate axis proportion and difference
CN106162158A (en) * 2015-04-02 2016-11-23 无锡天脉聚源传媒科技有限公司 A kind of method and device identifying lens shooting mode
CN106713726A (en) * 2015-07-14 2017-05-24 无锡天脉聚源传媒科技有限公司 Method and apparatus for recognizing photographing way
CN107437076A (en) * 2017-08-02 2017-12-05 陈雷 The method and system that scape based on video analysis does not divide

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Video-based human body posture detection and motion recognition method; Li Diping; CNKI China Academic Literature Network Publishing Database; 20121215 (No. 12); full text *

Also Published As

Publication number Publication date
CN108268847A (en) 2018-07-10

Similar Documents

Publication Publication Date Title
US11995530B2 (en) Systems and methods for providing feedback for artificial intelligence-based image capture devices
US10706892B2 (en) Method and apparatus for finding and using video portions that are relevant to adjacent still images
US9774896B2 (en) Network synchronized camera settings
US10074015B1 (en) Methods, systems, and media for generating a summarized video with video thumbnails
US9628837B2 (en) Systems and methods for providing synchronized content
US20180077452A1 (en) Devices, systems, methods, and media for detecting, indexing, and comparing video signals from a video display in a background scene using a camera-enabled device
CN108268847B (en) Method and system for analyzing movie montage language
CN112954450B (en) Video processing method and device, electronic equipment and storage medium
CN106961568B (en) Picture switching method, device and system
CN105898583B (en) Image recommendation method and electronic equipment
WO2008118604A1 (en) Automatic detection, removal, replacement and tagging of flash frames in a video
CN108289247B (en) Video analysis-based automatic identification method for video picture shooting skill
CN103096008A (en) Method Of Processing Video Frames, Method Of Playing Video Frames And Apparatus For Recording Video Frames
CN107066488B (en) Video bridge segment automatic segmentation method based on video content semantic analysis
CN114257757B (en) Automatic video clipping and switching method and system, video player and storage medium
CN108174112B (en) Processing method and device in camera shooting
WO2018124794A1 (en) Camerawork-based image synthesis system and image synthesis method
CN116527828A (en) Image processing method and device, electronic equipment and readable storage medium
CN114598810A (en) Method for automatically clipping panoramic video, panoramic camera, computer program product, and readable storage medium
CN110708594B (en) Content image generation method and system
KR102571677B1 (en) AI Studio for Online Lectures
JPH07319886A (en) Retrieving device for drawing image interlinked with dynamic image and retrieving method for drawing image interlinked with dynamic image
WO2015173828A1 (en) Methods, circuits, devices, systems and associated computer executable code for composing composite content
CN112672093A (en) Video display method and device, electronic equipment and computer storage medium
Lin et al. Intelligent projector system based on computer vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220223

Address after: 100102 Room 303, unit 5, building 210, Huigu sunshine, Wangjing garden, Chaoyang District, Beijing

Patentee after: Pang Zewenyue

Address before: Room 308, building 3, courtyard 4, Chegongzhuang street, Xicheng District, Beijing 100044

Patentee before: Pang Zemufeng