CN116962816B - Method and device for setting implantation identification, electronic equipment and storage medium


Info

Publication number
CN116962816B
Authority
CN
China
Prior art keywords
implantation
target
implanted
target video
frame
Prior art date
Legal status
Active
Application number
CN202311214501.5A
Other languages
Chinese (zh)
Other versions
CN116962816A (en)
Inventor
纪智辉
Current Assignee
4u Beijing Technology Co ltd
Original Assignee
4u Beijing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by 4u Beijing Technology Co ltd filed Critical 4u Beijing Technology Co ltd
Priority to CN202311214501.5A priority Critical patent/CN116962816B/en
Publication of CN116962816A publication Critical patent/CN116962816A/en
Application granted granted Critical
Publication of CN116962816B publication Critical patent/CN116962816B/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/225Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on a marking or identifier characterising the area
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a method, an apparatus, an electronic device, and a storage medium for setting an implantation identifier. The method includes: acquiring a target video frame in a target video; identifying a plurality of candidate edges in the target video frame and identifying a bounding box containing a target object in the target video frame; and screening out, from the plurality of candidate edges, target edges meeting a preset condition, and correcting the bounding box based on the target edges to obtain an implantation position of an object to be implanted in the target video frame, wherein the preset condition is that the candidate edges can be connected to form a polygon whose similarity to the bounding box is greater than a preset similarity threshold. The application thereby addresses the technical problem in the prior art that an inaccurately placed implantation identifier leads to a poor implantation effect for the object to be implanted.

Description

Method and device for setting implantation identification, electronic equipment and storage medium
Technical Field
The present application relates to the field of multimedia technologies, and in particular, to a method, an apparatus, an electronic device, and a storage medium for setting an implantation identifier.
Background
With the rapid development of computer technology and the internet, video content has become an integral part of the digital age, and video applications of all kinds have flourished. In this diversified video application market, embedding multimedia files (for example, advertising material) into video has become an important means of dissemination and promotion.
However, as market demands keep changing, problems related to multimedia material selection, implantation positions, and market shifts have become increasingly prominent. The general flow of multimedia file implantation includes detecting an implantation area for the multimedia file in a video, tracking the movement of that area, determining an implantation position, and setting an implantation identifier at that position so that the multimedia material can be projected onto it.
At present, detection of the implantation area mainly relies on image detection and segmentation techniques. These methods, however, struggle with complicated situations such as edge shadows, occlusion, and reflections around the implantation area, so the detection accuracy is not high enough, the implantation identifier is placed inaccurately, and the implantation effect of the object to be implanted suffers.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiments of the invention provide a method, an apparatus, an electronic device, and a storage medium for setting an implantation identifier, which at least solve the technical problem in the prior art that an inaccurately placed implantation identifier leads to a poor implantation effect for the object to be implanted.
According to one aspect of the embodiments of the present invention, there is provided a method for setting an implantation identifier, including: acquiring a target video frame in a target video; identifying a plurality of candidate edges in the target video frame and identifying a bounding box containing a target object in the target video frame; screening out, from the plurality of candidate edges, target edges meeting a preset condition, and correcting the bounding box based on the target edges to obtain an implantation position of an object to be implanted in the target video frame, wherein the preset condition is that the candidate edges can be connected to form a polygon whose similarity to the bounding box is greater than a preset similarity threshold; and setting an implantation identifier at the implantation position, wherein the implantation identifier is used for marking the implantation position of the object to be implanted in the target video frame.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for setting an implantation identifier, including: an acquisition module configured to acquire a target video frame in a target video; an identification module configured to identify a plurality of candidate edges in the target video frame and to identify a bounding box containing a target object in the target video frame; a correction module configured to screen out, from the plurality of candidate edges, target edges meeting a preset condition and to correct the bounding box based on the target edges to obtain an implantation position of an object to be implanted in the target video frame, wherein the preset condition is that the candidate edges can be connected to form a polygon whose similarity to the bounding box is greater than a preset similarity threshold; and a setting module configured to set an implantation identifier at the implantation position, wherein the implantation identifier is used for marking the implantation position of the object to be implanted in the target video frame.
In the embodiments of the invention, a target video frame in a target video is acquired; a plurality of candidate edges in the target video frame are identified, as is a bounding box containing a target object; target edges meeting the preset condition are screened out from the candidate edges, and the bounding box is corrected based on the target edges to obtain the implantation position of the object to be implanted in the target video frame; an implantation identifier is then set at the implantation position to mark where the object to be implanted is placed. This scheme addresses the technical problem in the prior art that an inaccurately placed implantation identifier leads to a poor implantation effect for the object to be implanted.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a flow chart of a method of setting an implant identification according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of implanting based on a set implant identification according to an embodiment of the present application;
FIG. 3 is a flow chart of another method of setting an implant identification according to an embodiment of the present application;
FIG. 4 is a flow chart of a method of identifying implant identifiers according to an embodiment of the present application;
FIG. 5 is a flow chart of a method of determining an object to be implanted based on identity information according to an embodiment of the application;
FIG. 6 is a flow chart of a method of generating a pose transformation matrix for an object to be implanted based on pose information according to an embodiment of the application;
FIG. 7 is a flow chart of another method of implantation based on an implant identification, according to an embodiment of the present application;
FIG. 8 is a schematic structural view of a device for setting an implant marker according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description. Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail. In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Example 1
The embodiment of the application provides a method for setting an implantation mark, as shown in fig. 1, comprising the following steps:
step S102, obtaining a target video frame in the target video.
The target video frame may be any frame of the target video or any one of a set of key frame images in the target video. Key frame images can be divided into three types: a first frame image, a last frame image, and a specified frame image.
The first frame image is the first frame of the target video. It plays a key role in the introduction of video content, typically for previews, thumbnails or pictures before the video starts. The first frame image may convey important information about video content and subject matter and is thus widely used in the fields of video sharing platforms, advertising, movies, and the like.
The last frame image is the final frame of the target video. It plays a key role at the end of the video and typically contains the closing scene, closing credits, brand identification, or other important information. The last frame image helps the audience remember the video content and provides opportunities for sharing and promoting the video. In movies and advertisements, last frame images are typically used to present the production team, production company, or brand information.
Specified frame images are explicitly designated key frames in the video; they may be neither the first nor the last frame, but they have special significance for video editing and analysis. These frames are typically selected because they contain important scenes, critical information, or specific actions. The selection of a specified frame image may be based on a timecode, content analysis, subject-matter relevance, or other criteria. In video analysis and processing, these images may be used for applications such as object detection, emotion analysis, advertisement positioning, and video summary generation.
Step S104, a plurality of candidate edges in the target video frame are identified, and a boundary box containing a target object in the target video frame is identified.
First, candidate edges are identified.
A line segment detection model may be invoked to extract feature information from the target frame image and to identify line segments in the image based on that information. The feature information may include the gray value, position, and pixel value of each pixel in the target frame image. Line segment detection may employ different techniques, including conventional Hough-transform-based methods and neural-network-based methods.
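As a concrete illustration of the conventional Hough-transform route mentioned above, the following Python sketch detects candidate line segments with OpenCV; the function name and all parameter values are illustrative assumptions rather than values taken from this disclosure.

    import cv2
    import numpy as np

    def detect_candidate_edges(frame_bgr):
        """Detect candidate line segments in a video frame (Hough-based sketch)."""
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        # An edge map feeds the probabilistic Hough transform.
        edges = cv2.Canny(gray, threshold1=50, threshold2=150)
        segments = cv2.HoughLinesP(
            edges,
            rho=1,                # distance resolution in pixels
            theta=np.pi / 180,    # angular resolution in radians
            threshold=80,         # minimum number of accumulator votes
            minLineLength=40,     # discard very short segments
            maxLineGap=5,         # bridge small gaps between collinear segments
        )
        if segments is None:
            return []
        # Each entry is (x1, y1, x2, y2) for one candidate edge.
        return [tuple(s[0]) for s in segments]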
The network structure of a neural-network-based line segment detection model may include four main modules: a backbone module, a connection point prediction module, a line segment sampling module, and a line segment correction module. The backbone module is responsible for feature extraction: it takes the input image and provides shared convolutional feature maps for the subsequent modules. These feature maps contain a high-level representation of the image and help the later modules understand its content. The connection point prediction module outputs candidate connection points, i.e., image locations that may contain line segments; it predicts their positions from the features extracted by the backbone. The line segment sampling module receives the connection point information output by the connection point prediction module and combines the connection points into candidate line segments. The line segment correction module classifies the candidate line segments to determine which are actually straight line segments in the image; it includes a pooling layer that extracts segment features for each candidate. By combining these with the convolutional feature maps from the backbone, the correction module determines which candidates are valid and outputs the straight-line information, such as endpoint coordinates. This modular structure identifies line segments in the image effectively and helps improve the accuracy and efficiency of line segment detection.
Next, the bounding box is identified. A dataset is prepared comprising images of the target object together with accurate bounding box annotations of the target object in each image. These annotations are typically provided as rectangular boxes given by the coordinates of the upper-left and lower-right corners. A target detection model appropriate for the task is then selected; many models are available in the object detection field, such as YOLO, Fast R-CNN, and SSD. The selected model is trained on the annotated data, during which it learns how to locate the target object in an image and generate the corresponding bounding box. Once training is complete, the model can be applied to the target video frame: the frame is fed into the model, which performs inference, analyzes the image, and outputs the bounding boxes of the detected target objects along with related information such as confidence scores. In some cases, post-processing the bounding boxes output by the model can improve accuracy. Post-processing operations may include removing overlapping bounding boxes, filtering out low-confidence boxes, or merging similar boxes using non-maximum suppression (NMS). This post-processing improves the accuracy and usability of the detection result and ensures that only the most relevant bounding boxes are retained.
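The non-maximum suppression step mentioned above can be sketched as follows; the box format ([x1, y1, x2, y2] with a confidence score) and the IoU threshold are assumptions made for illustration only.

    import numpy as np

    def nms(boxes, scores, iou_threshold=0.5):
        """Keep the highest-scoring boxes and drop boxes that overlap them too much."""
        boxes = np.asarray(boxes, dtype=float)   # shape (N, 4) as [x1, y1, x2, y2]
        scores = np.asarray(scores, dtype=float)
        order = scores.argsort()[::-1]           # indices sorted by descending confidence
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(int(i))
            # Intersection of the selected box with all remaining boxes.
            x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
            y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
            x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
            y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
            inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
            iou = inter / (area_i + areas - inter)
            # Keep only boxes whose overlap with the selected box is below the threshold.
            order = order[1:][iou < iou_threshold]
        return keep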
Step S106, screening target edges meeting a preset condition from the plurality of candidate edges, and correcting the bounding box based on the target edges to obtain the implantation position of the object to be implanted in the target video frame, wherein the preset condition is that the candidate edges can be connected to form a polygon whose similarity to the bounding box is greater than a preset similarity threshold.
First, the target edges are screened out. Specifically, connectivity among the plurality of candidate edges is detected, and edges that can be connected to form a polygon are selected; the similarity between the polygon and the bounding box is then calculated, and if that similarity is greater than the preset similarity threshold, the edges of the polygon are taken as the target edges. This helps reduce false detections and improves the accuracy of the implantation position, especially in complex scenes.
In some embodiments, the similarity may be calculated as follows: the overlapping area is calculated from the contour functions of the polygon and the bounding box; the degree of overlap is calculated from the overlapping area and the distance between the center points of the polygon and the bounding box; the area difference between the polygon and the bounding box is calculated and normalized to obtain a relative size value; and a spatial relationship value is calculated from the depth values of the polygon and the bounding box and the distance between their center points. Once the overlapping area, degree of overlap, relative size value, and spatial relationship value have been obtained, the similarity between the polygon and the bounding box is computed from them.
For example, the similarity can be calculated with the formula: similarity = w1 × IoU + w2 × (1 − relative size value) + w3 × spatial relationship value, where IoU denotes the degree of overlap (Intersection over Union), i.e., the ratio of the overlapping area of the polygon and bounding box contours to the area of their union. The relative size value is the normalized area difference between the polygon and the bounding box, so 1 minus the relative size value measures how similar their sizes are. The spatial relationship value captures information such as the depth values of the polygon and the bounding box and the distance between their center points. Here w1, w2, and w3 are preset weights.
In some embodiments, the degree of overlap may be calculated using the following method: and finding out the intersection point inside the polygon by calculating the intersection of the boundary point of the polygon and the boundary point of the boundary box. These intersections are connected to form a new polygon representing the intersection of the polygon with the bounding box. Next, the area of the intersecting polygon is calculated by employing a polygon area calculation algorithm. Then, the areas of the polygon and the bounding box are calculated separately, and finally the union area is calculated, i.e. the area of the polygon plus the area of the bounding box minus the area of the intersection polygon. This results in an intersection area and a union area, which can be used to calculate IoU, i.e., the intersection area divided by the union area. This IoU computation method more accurately accounts for complex interactions between polygons and bounding boxes, and is particularly useful in situations where complex shape matching and overlap metrics need to be handled.
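A minimal sketch of this intersection-over-union computation, here delegating the polygon clipping to the Shapely geometry library rather than implementing the intersection routine described above by hand; the function and variable names are assumptions.

    from shapely.geometry import Polygon, box

    def polygon_bbox_iou(polygon_points, bbox):
        """IoU between an arbitrary polygon and an axis-aligned bounding box.

        polygon_points: [(x, y), ...] vertices of the candidate polygon
        bbox: (x1, y1, x2, y2) corners of the detected bounding box
        """
        poly = Polygon(polygon_points)
        rect = box(*bbox)                          # rectangle built from the bbox corners
        inter_area = poly.intersection(rect).area  # area of the intersection polygon
        union_area = poly.area + rect.area - inter_area
        return inter_area / union_area if union_area > 0 else 0.0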
In some embodiments, the relative size value may be computed as: relative size value = (|area of polygon − area of bounding box| / max(area of polygon, area of bounding box))². Squaring the normalized area difference makes the contribution of the relative size value to the similarity more pronounced.
In some embodiments, the spatial relationship value may be computed as: spatial relationship value = (1 − distance/maximum distance) × (1 − degree of overlap) × (1 − depth value), where distance is the distance between the center points of the polygon and the bounding box, and maximum distance is the largest spatial separation between them, typically the farthest distance from a point of the polygon to the bounding box or from a point of the bounding box to the polygon. Introducing depth values allows the spatial relationship between the polygon and the bounding box to be considered more fully: their relative positions can be measured with depth information, which further improves the accuracy of the spatial relationship value. In this way, the embodiment accounts for distance, degree of overlap, and depth together, and measures the spatial relationship between the polygon and the bounding box more accurately.
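Putting the preceding quantities together, the weighted similarity can be sketched as follows; the weight values and the assumption that depth is supplied as a normalized difference in [0, 1] are illustrative choices, not part of the disclosure.

    def relative_size_value(poly_area, bbox_area):
        # Normalized area difference, squared so that a size mismatch weighs more heavily.
        return (abs(poly_area - bbox_area) / max(poly_area, bbox_area)) ** 2

    def spatial_relation_value(center_dist, max_dist, overlap_degree, depth_diff):
        # center_dist/max_dist normalizes the distance; overlap_degree and depth_diff
        # are assumed to lie in [0, 1].
        return (1 - center_dist / max_dist) * (1 - overlap_degree) * (1 - depth_diff)

    def similarity(iou, rel_size, spatial, w1=0.5, w2=0.3, w3=0.2):
        # Weighted combination of overlap, size similarity and spatial relationship.
        return w1 * iou + w2 * (1 - rel_size) + w3 * spatial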
The bounding box is then corrected based on the target edges. For example, geometric features of the target edge are identified, including its length, angle, and curvature; the relative position between the target edge and the bounding box is analyzed based on these features; and the position and shape of the bounding box are adjusted according to that relative position. By identifying geometric characteristics such as length, angle, and curvature of the target edge, the system obtains a more complete picture of the target's shape and location, which helps capture the appearance of the target object accurately, particularly in complex scenes or for irregular shapes. Based on the analysis of these geometric features, the relative positional relationship between the target edge and the existing bounding box can be studied in depth, and the position and shape of the bounding box can finally be adjusted so that it contains the target object better, reducing possible deviations and errors. This fine adjustment of the bounding box makes target detection more accurate.
Specifically, when the relative position indicates that the target edge intersects the bounding box, the intersection angle between them is detected: if the angle is greater than a preset angle threshold, the bounding box is shrunk so that the target edge no longer intersects it; if the angle is smaller than the threshold, the position of the bounding box is reset by computing the intersection between the line through the bounding box center and the target edge. This helps remove redundant portions of the bounding box and ensures that it better conforms to the shape of the target object, improving its accuracy. When the relative position indicates that the target edge does not intersect the bounding box, the gap distance between them is detected: if the gap is smaller than a preset gap threshold, the sides of the bounding box are translated towards the target edge so that the box moves closer to it; if the gap is larger than the threshold, the width and height of the bounding box are increased so that it covers the target object better while reducing the gap. In this way, the gap between the target edge and the bounding box is reduced, the box surrounds the target object more tightly, and its adaptability is improved.
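The angle- and gap-based correction rules above can be outlined as a simple decision routine; the threshold values, the uniform shrink/expand step, and the way the offset towards the edge is supplied are all assumptions for illustration.

    def correct_bbox(bbox, intersects, intersection_angle_deg, gap_px, toward_edge,
                     angle_threshold=30.0, gap_threshold=10.0, step=0.05):
        """Adjust a bounding box (x1, y1, x2, y2) against a screened target edge.

        toward_edge: (dx, dy) vector from the box center to the closest point of the
        edge, used when the box has to be moved towards a non-intersecting edge.
        """
        x1, y1, x2, y2 = bbox
        w, h = x2 - x1, y2 - y1
        if intersects:
            if intersection_angle_deg > angle_threshold:
                # Steep crossing: shrink the box so the edge no longer cuts through it.
                return (x1 + w * step, y1 + h * step, x2 - w * step, y2 - h * step)
            # Shallow crossing: re-center the box towards the edge (half the offset).
            dx, dy = toward_edge
            return (x1 + dx / 2, y1 + dy / 2, x2 + dx / 2, y2 + dy / 2)
        if gap_px < gap_threshold:
            # Small gap: translate the whole box towards the edge.
            dx, dy = toward_edge
            return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
        # Large gap: enlarge the box so it covers the target object and shortens the gap.
        return (x1 - w * step, y1 - h * step, x2 + w * step, y2 + h * step)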
Step S108, setting an implantation mark at the implantation position, wherein the implantation mark is used for marking the implantation position of the object to be implanted in the target video frame.
In order to identify the position of the object to be implanted more accurately when it is implanted into the target video frame, an implantation identifier may be set at the implantation position. The implantation identifier not only helps identify the location of the object to be implanted but can also provide other useful information and functions.
First, the implant identifier may be used as a unique marker to accurately locate the position of the object to be implanted in the target video frame. By arranging the implantation mark at the implantation position, the object to be implanted can be easily identified and tracked in subsequent processing and analysis, and information such as visual characteristics or colors is not needed to be relied on.
Second, the implant identifier may also contain additional metadata, such as a timestamp, object attributes, or other related information. This information may be used for further analysis, retrieval or tagging to facilitate a more comprehensive understanding of the context and nature of the object to be implanted in the target video frame.
In addition, the implant identifier may be used for interaction with the implanted object in subsequent processing. For example, it may serve as a trigger point for user interaction, enabling a user to click, drag, or otherwise operate on the object to be implanted, thereby enhancing the interactivity and engagement of the video content.
In summary, setting the implant identifier in the target video frame not only provides more accurate location information, but also provides more possibilities for subsequent processing and application, making the implantation of the object to be implanted more flexible and intelligent. This may play an important role in the fields of advertising, entertainment, education, virtual reality, etc.
Example 2
The embodiment of the application provides a method for implanting based on a set implantation mark, as shown in fig. 2, comprising the following steps:
step S202, detecting an implantation position.
The specific detection method refers to steps S102 to S106 in embodiment 1, and will not be described here again.
Step S204, setting an implantation mark at the implantation position.
The implantation mark can carry identity information, pose information and the like to ensure accurate implantation operation, wherein the pose information comprises information such as orientation, position, size and the like, and the identity information comprises an identity identification code mark and a color combination.
As shown in fig. 3, setting the implant identification may include the steps of:
step S2042, identity information and pose information are set.
The core elements of the implant identifier include a rectangular identification frame, orientation marks, an identity identification code mark, and a color combination. First, the rectangular identification frame serves as the outermost marker that clearly indicates the implantation position. It is typically composed of an adjoining black outer frame and a red inner frame, a design that helps keep the identifier stable under complex environmental and lighting conditions. The size of the rectangular identification frame may vary to indicate the size of the implantation site. Inside the rectangular identification frame, orientation marks represent the rotation angle or direction of the object to be implanted. These orientation marks may be three rectangular point marks arranged at three fixed points forming an isosceles right triangle, so as to provide angle information.
The identity identification code mark is located inside the rectangular identification frame and takes the form of a color combination. The arrangement of the colors carries the identity information, and each color combination corresponds to a different identity or object to be implanted. When the computer recognizes these colors and their arrangement, the resulting command information determines the specific implantation operation, such as the kind of product, the type of mask to implant, and the compositing manner.
The present embodiment can generate a large number of different commands by using an arrangement combination of four colors to satisfy various implantation demands. This design not only allows for more flexibility in identification and operation, but also allows for ease of handling a wide variety of implant requirements. This embodiment keeps its design simple and straightforward in order to ensure that the implant identification can be easily identified in all cases. Even in the case of low resolution or limited pixels, the basic symbol or shape, such as a square, can be clearly recognized. This helps to maintain the clarity and accuracy of the implant identification, making it a critical role in video implant applications. With this improved implant identification design, the implantation procedure can be more flexibly controlled while ensuring the stability and identifiability of the identification.
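One possible way to represent such an identifier in code is a small data structure; the field names, the four-color palette, and the command encoding below are purely illustrative assumptions.

    from dataclasses import dataclass
    from itertools import product
    from typing import List, Tuple

    PALETTE = ("red", "green", "blue", "yellow")   # assumed four-color alphabet

    @dataclass
    class ImplantMarker:
        frame: Tuple[int, int, int, int]           # rectangular identification frame (x, y, w, h)
        orientation_points: List[Tuple[int, int]]  # three points forming an isosceles right angle
        color_code: Tuple[str, ...]                # ordered color combination carrying identity

        def command_id(self) -> int:
            """Map the color permutation to a command index (illustrative encoding)."""
            codes = list(product(PALETTE, repeat=len(self.color_code)))
            return codes.index(self.color_code)    # raises ValueError for unknown colors

With four positions over a four-color palette this encoding already distinguishes 256 commands, which is in line with the large command space described above.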
In step S2044, a shape recognition flag, a color mode flag, and an action control flag are set.
In still other embodiments, the implant identification may further include a shape identification flag, a color mode flag, and an action control flag. The shape identification mark is used for indicating the implantation position; a color mode flag for indicating a specific attribute or effect of the implant object; the action control flags are used to indicate the action and behavior of the implant. When these flags are detected, command information will be given which directly determines the specific operation of implantation, such as the type of product, the type of mask implanted, the manner of composition, etc.
Shape recognition markers are a form of identification that represents the implantation site in a graphical shape. These shapes may take a variety of different geometric shapes, such as square, circular, star, triangle, etc. These shapes are typically drawn at specific locations or areas in the video frame to indicate the location of the implant. The arrangement of the shape recognition marks may be varied as needed. For example, it may be selected to arrange a plurality of shapes into a particular geometric pattern, such as an isosceles right triangle, square grid, or the like.
A color mode flag is a way of identifying a particular attribute or effect of an implanted object by color coding. Each color or combination of colors may correspond to a different attribute or effect so that the implant system can take a corresponding action according to the identified color pattern.
Color mode flags are typically composed of one or more colors that may be arranged or combined according to a particular rule. For example, different color arrangements may represent different implant effects, such as mask type, transparency, color filtering, etc. The selection and coding rules of these color combinations should be determined according to the specific application. Different colors or combinations of colors may represent different meanings. For example, red may represent a shade effect, green may represent a transparency effect, blue may represent a color filter effect, etc. The implantation system may perform corresponding operations based on the interpretation of the color mode indicator, thereby achieving different implantation effects.
The action control flag is an identification means for indicating the action and behavior of the implanted object. It may take the form of text, symbols, graphics, etc. to represent different actions such as panning, rotating, zooming, fading, etc. The action control markers may include text labels or symbols to explicitly indicate the specific action of the implanted object. For example, a "pan" symbol may represent a panning action of an object, and a "fade-in" symbol may represent a fade-in effect. The action control markers may also use graphical examples, such as arrow icons, circle icons, etc., to more visually represent actions. Different graphical examples may correspond to different action types. Multiple motion control markers may be combined together to achieve a complex sequence of motions.
Step S206, identifying the implantation mark.
The method for identifying the implantation mark is shown in fig. 4, and comprises the following steps:
step S2062, the target video is processed.
A preset implantation identifier is acquired from the target video. The target video is divided into a plurality of video frames, and each video frame is converted into a gray-scale image for subsequent processing.
Step S2064, Gaussian blur and thresholding.
Gaussian blur with a preset kernel size is applied to the gray-scale image of each video frame, and the blurred image is then thresholded to convert it into a binary image. This helps reduce noise in the image and highlights the outline of the implantation identifier.
Step S2066, extracting contours and performing image recognition.
The contour corresponding to the identifier is extracted from the binary image according to a preset tracking pattern. Based on the image information within the contour, image recognition is performed to parse each part of the implantation identifier, including the identity information and the pose information.
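Steps S2062 to S2066 map naturally onto standard OpenCV calls, as in the following sketch; the blur kernel size and threshold value are assumed defaults rather than values prescribed here.

    import cv2

    def extract_marker_contours(frame_bgr, blur_kernel=(5, 5), thresh=127):
        """Grayscale -> Gaussian blur -> threshold -> contour extraction for one frame."""
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        blurred = cv2.GaussianBlur(gray, blur_kernel, 0)          # suppress noise
        _, binary = cv2.threshold(blurred, thresh, 255, cv2.THRESH_BINARY)
        # RETR_EXTERNAL keeps only outer contours; the marker outline is matched
        # afterwards against the preset tracking pattern.
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return binary, contours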
In some embodiments, shape recognition flags, color mode flags, and action control flags may also be identified. This information is not necessary and may or may not be carried in the implanted identification.
The shape recognition mark is a mark way for representing the implantation position by a graph shape. In video frames, these shapes are typically drawn in specific geometric shapes (e.g., square, circular, star, triangle, etc.) at specific locations or areas. To identify these shapes, a shape detection algorithm may be used that is capable of detecting boundaries or features of a particular shape. Once the shape is identified, the system may determine the location of the implantation. The arrangement mode of the shape identification marks can be specifically set according to application requirements.
The color mode flag is an identification means for representing a specific attribute or effect of the implant object by color coding. Each color or combination of colors may correspond to a different attribute or effect. To identify the color pattern flags, a color identification algorithm may be used to detect a particular color or combination of colors in the image. From the interpretation of the color mode flag, the system may determine the properties or effects of the implanted object. Color mode flags are typically composed of one or more colors that may be arranged or combined according to a particular rule.
The action control flag is used to indicate the action and behavior of the implanted subject. It may take the form of text, symbols or graphics. Text or graphic recognition algorithms may be used to detect text labels, symbols, or graphic examples. Based on the identified motion control markers, the system may perform a corresponding implantation motion. In some embodiments, multiple motion control markers may be combined together to achieve a complex sequence of motions.
Step S208, determining the object to be implanted based on the identity information.
As shown in fig. 5, the method of determining an object to be implanted based on identity information includes the steps of:
in step S2080, a plurality of tracking points are extracted.
A plurality of tracking points are extracted from within the outline of the implant marker. These tracking points are typically located at key locations within the contour, which may be characteristic points that are landmark. By locating and extracting these tracking points, the system is able to obtain key information within the profile. The tracking points are located at specific locations within the marker and may form a specific geometry, such as triangles, rectangles, etc.
Step S2082, calculating a center point.
The center point of the image within the contour is calculated using these extracted tracking points. By measuring the position of these points relative to the contour, the coordinates of the center point can be accurately calculated. The location information of this central point is important for subsequent identity information resolution.
Step S2084, calculating an angle.
By calculating the angles of these tracking points with respect to the center point, a set of angle information can be obtained. These angle information describe the arrangement and relative orientation of the tracking points and can help determine identity information. For example, if the tracking points are arranged in an equilateral triangle, the relevant angle information will indicate the direction of the triangle.
Step S2086, color combination analysis.
The identity identification code mark typically contains a combination of different colors inside. The computer can resolve the identity information by detecting the colors and analyzing their arrangement and combination. Different color combinations may correspond to different identity information, which can be used to distinguish between different objects to be implanted.
In step S2088, the object to be implanted is determined.
By integrating the information obtained in the above steps, including the position of the tracking point, the position of the center point, the angle information and the color combination, the implantation system can accurately determine the identity of the object to be implanted.
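A condensed sketch of steps S2080 to S2088: compute the centroid of the tracking points, derive their angles relative to it, and map the ordered color combination to an identity. The color lookup table and the angle-based ordering rule are placeholder assumptions.

    import math

    # Hypothetical lookup table: color combination -> identity of the object to be implanted.
    IDENTITY_TABLE = {("red", "blue", "green", "yellow"): "product_A"}

    def resolve_identity(tracking_points, point_colors):
        """tracking_points: [(x, y), ...]; point_colors: color sampled near each point."""
        # Center point of the region enclosed by the contour (centroid of tracking points).
        cx = sum(x for x, _ in tracking_points) / len(tracking_points)
        cy = sum(y for _, y in tracking_points) / len(tracking_points)
        # Angle of every tracking point relative to the center describes the arrangement.
        angles = [math.degrees(math.atan2(y - cy, x - cx)) for x, y in tracking_points]
        # Order the colors by angle so the combination is read in a stable way.
        ordered = tuple(c for _, c in sorted(zip(angles, point_colors)))
        return IDENTITY_TABLE.get(ordered), (cx, cy), angles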
The present embodiment determines the identity of the object to be implanted in a highly accurate manner, which helps to reduce the risk of false identifications, thereby improving the reliability and safety of the system. Second, the method is very flexible and adaptable to different types of markers and implant markers, as it relies on feature points and color combinations within the outline. Finally, the embodiment provides more comprehensive identity information by integrating a plurality of information sources such as positions, angles and colors, and is helpful for more accurately determining the identity of the object. The method has wide applicability, can be applied to the fields of computer vision, image processing and automatic identification, and is beneficial to improving the efficiency and reliability of the identification information identification task in various fields.
Step S210, generating a pose transformation matrix of the object to be implanted based on the pose information.
As shown in fig. 6, the method of generating a pose transformation matrix of an object to be implanted based on pose information includes the steps of:
In step S2102, a translation transformation matrix is generated.
Based on the position information of the object to be implanted, a translation transformation matrix for translating the object to the implantation position may be generated. The translation transformation matrix describes the distance that the object should move in the horizontal and vertical directions.
In step S2104, a rotation transformation matrix is generated.
Based on the rotation angle of the object to be implanted, a rotation transformation matrix for rotating the object by a specified angle may be generated. The rotation transformation matrix describes how to rotate the object around some center point to match the target angle.
In step S2106, a scaling transformation matrix is generated.
Based on the size information of the object to be implanted, a scaling transformation matrix for scaling the object by a specified size may be generated. The scaling transformation matrix describes the scale to which the object should scale in the horizontal and vertical directions.
Step S2108, combining the pose transformation matrices.
The generated translation, rotation, and scaling transformation matrices are combined together to form a complete pose transformation matrix. The pose transformation matrix contains all transformation information needed to be performed on the object to be implanted, so that the transformation information matches the implantation position, rotation and size.
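The three matrices and their combination can be written as 3×3 homogeneous transforms, as in the sketch below; the composition order (scale, then rotate, then translate) is one common convention and is assumed here.

    import numpy as np

    def translation(tx, ty):
        return np.array([[1, 0, tx],
                         [0, 1, ty],
                         [0, 0, 1]], dtype=float)

    def rotation(angle_deg):
        a = np.radians(angle_deg)
        return np.array([[np.cos(a), -np.sin(a), 0],
                         [np.sin(a),  np.cos(a), 0],
                         [0,          0,         1]], dtype=float)

    def scaling(s):
        return np.array([[s, 0, 0],
                         [0, s, 0],
                         [0, 0, 1]], dtype=float)

    def pose_matrix(tx, ty, angle_deg, s):
        # Applied right-to-left to a column vector: scale first, then rotate, then translate.
        return translation(tx, ty) @ rotation(angle_deg) @ scaling(s)

    # Example: move the object to (320, 240), rotate it by 15 degrees, enlarge it by 20 %.
    M = pose_matrix(320, 240, 15, 1.2)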
Step S212, adjusting the object to be implanted based on the pose transformation matrix, and implanting the adjusted object to be implanted into the target video.
And adjusting the object to be implanted by using the generated pose transformation matrix. Specifically, according to the description of the pose transformation matrix, the object to be implanted is subjected to translation, rotation and scaling operations so as to adapt to the designated position and pose in the target video. And then, implanting the adjusted object to be implanted into the target video. For example, the object to be implanted is superimposed on the target video frame, and accurate positioning is performed according to the position information of the implantation identifier. This step may be repeated on each video frame to achieve multiple implant effects.
After implantation, a fusion process may also be performed. For example, a target video frame with the implantation identification in the plurality of video frames is acquired; and carrying out fusion processing on the target video frame and the adjusted object to be implanted so as to implant the adjusted object to be implanted into the target video. For example, obtaining perspective transformation information based on the target video frame and the adjusted object to be implanted, and performing perspective transformation on the adjusted object to be implanted based on the perspective transformation information; superposing the object to be implanted after perspective transformation on an implantation position indicated by the implantation identification in the target video frame, and adjusting the transparency of the superposed object to be implanted based on the target video frame; and carrying out edge smoothing treatment on the boundary of the superimposed object to be implanted.
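A sketch of this fusion step: the adjusted object is warped into the quadrilateral indicated by the implantation identifier, alpha-blended with the frame, and the seam is softened; the fixed transparency value and feathering size are assumptions.

    import cv2
    import numpy as np

    def implant_object(frame, obj, dst_quad, alpha=0.85, feather=7):
        """frame: target video frame; obj: adjusted object image; dst_quad: the four corner
        points of the implantation position (top-left, top-right, bottom-right, bottom-left)."""
        h, w = obj.shape[:2]
        src_quad = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        H = cv2.getPerspectiveTransform(src_quad, np.float32(dst_quad))
        warped = cv2.warpPerspective(obj, H, (frame.shape[1], frame.shape[0]))
        # Soft mask of the warped region so the boundary blends smoothly into the frame.
        mask = cv2.warpPerspective(np.full((h, w), 255, np.uint8), H,
                                   (frame.shape[1], frame.shape[0]))
        mask = cv2.GaussianBlur(mask, (feather, feather), 0).astype(np.float32) / 255.0
        mask = (mask * alpha)[..., None]             # per-pixel blending weight
        out = frame.astype(np.float32) * (1 - mask) + warped.astype(np.float32) * mask
        return out.astype(np.uint8)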
Through the steps, the video implantation system can generate a proper pose transformation matrix according to pose information in the implantation mark, ensure that an object to be implanted is accurately implanted into the target video, and match implantation positions, rotation and sizes.
Example 3
An embodiment of the present application provides another video implantation method, as shown in fig. 7, including the following steps:
step S700, detecting an implantation position in the target video, and setting an implantation mark.
1) The target video is divided into a plurality of video frames.
2) And carrying out target detection on target video frames in the plurality of video frames, and determining a boundary box containing a target object in the target video frames.
Target detection is performed on the target video frame; a previous video frame of the target video frame is acquired and target detection is performed on it as well; the bounding box containing the target object in the target video frame is then determined based on the detection results of both the previous video frame and the target video frame.
For example, estimating a displacement of each pixel of a previous video frame to the target video frame in the target video frame using dense optical flow; and deducing the shape change and the motion state of the target object through the displacement, and carrying out target detection based on the shape change and the motion state.
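The dense optical flow referred to above can be estimated, for example, with Farnebäck's algorithm in OpenCV, as sketched below; the parameter values are commonly used defaults and are assumed for illustration.

    import cv2

    def pixel_displacements(prev_frame_bgr, curr_frame_bgr):
        """Per-pixel displacement from the previous frame to the target frame."""
        prev_gray = cv2.cvtColor(prev_frame_bgr, cv2.COLOR_BGR2GRAY)
        curr_gray = cv2.cvtColor(curr_frame_bgr, cv2.COLOR_BGR2GRAY)
        # flow[y, x] = (dx, dy): where the pixel at (x, y) in the previous frame moved to.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                            pyr_scale=0.5, levels=3, winsize=15,
                                            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        return flow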
The present embodiment can improve the accuracy of target detection by performing target detection at two different time points (the current frame and the previous frame). The multi-frame detection strategy can reduce false detection or missed detection caused by factors such as shielding, illumination change or noise in a single frame. By comparing the detection results of the two frames, the bounding box of the target object can be determined more reliably.
3) Straight lines corresponding to the sides of the bounding box are determined, and whether an implantation position for an object to be implanted exists in the target video frame is detected based on the line equation parameters of those straight lines.
Firstly, the linear equation parameters are converted into a parameter matrix, wherein the parameter matrix is used for describing the positions of all pixel points in the boundary box. For example, substituting the coordinates of each pixel point into a linear equation of the straight line, and calculating the coordinates of each pixel point mapped onto the straight line to obtain the position information of each pixel point relative to the straight line; and constructing the parameter matrix based on the position information, wherein elements in the parameter matrix correspond to the position information of each pixel point.
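A sketch of this mapping: with a side modelled as ax + by + c = 0, each pixel inside the bounding box is described by its signed distance to the line and the coordinates of its perpendicular projection onto it; stacking these values yields a parameter matrix. The exact layout of the matrix is an assumption.

    import numpy as np

    def line_parameter_matrix(bbox, line):
        """bbox: (x1, y1, x2, y2); line: coefficients (a, b, c) of ax + by + c = 0."""
        x1, y1, x2, y2 = bbox
        a, b, c = line
        xs, ys = np.meshgrid(np.arange(x1, x2), np.arange(y1, y2))
        norm = np.hypot(a, b)
        signed_dist = (a * xs + b * ys + c) / norm     # position of each pixel relative to the line
        # Foot of the perpendicular: the pixel's projection onto the line.
        foot_x = xs - a * (a * xs + b * ys + c) / norm**2
        foot_y = ys - b * (a * xs + b * ys + c) / norm**2
        # One parameter vector per pixel: (x, y, signed distance, projected x, projected y).
        return np.stack([xs, ys, signed_dist, foot_x, foot_y], axis=-1)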
The present embodiment can more accurately determine the implantation position by modeling the border of the bounding box as a straight line and calculating the straight line equation parameters. Thus, the position of the object to be implanted can be more accurately matched, so that the quality and the sense of reality of the implantation effect are improved. Furthermore, by converting the linear equation parameters into a parameter matrix, the system can adapt to implant location detection in different scenarios. This can help the system work in a variety of contexts and environmental conditions and accommodate a variety of bounding box shapes and sizes.
Then, whether an implantation position for the object to be implanted exists in the target video frame is detected based on the parameter matrix. For example, the pixel points in the parameter matrix are screened, and those satisfying a preset condition are marked as candidate positions; feature matching is then performed between the candidate positions and the object to be implanted, specifically by matching the pixel positions of the object to be implanted against the candidate positions; where the positions match, feature descriptors are extracted for the object to be implanted and for the pixels of the candidate position respectively, and whether their feature attributes match is determined from these descriptors. On this basis, it is determined whether an implantation position for the object to be implanted exists in the target video frame.
According to the embodiment, the pixel points can be screened based on the implantation position detection of the parameter matrix, and the pixel points which only meet the preset condition are marked as candidate positions, so that false alarms can be reduced, and the reliability of the system is improved. In addition, the matching degree can be more reliably determined by matching the pixel position of the object to be implanted with the candidate position and extracting the characteristic descriptors of the pixel position and the candidate position, so that the matching of the implantation position and the characteristic attribute of the object to be implanted is facilitated, and the implantation reality is improved. Finally, the embodiment is not only suitable for different types of objects to be implanted, but also suitable for implantation under various background conditions. It can be applied to the implantation of virtual objects, such as virtual characters, objects or effects, as well as a variety of different video contexts.
Step S702, obtaining a preset implantation identifier from a target video, where the implantation identifier is used to identify an implantation position of an object to be implanted in the target video.
Dividing the target video into a plurality of video frames, and converting each video frame of the plurality of video frames into a gray scale map; carrying out Gaussian blur processing on the gray level map based on a preset blur kernel size, and carrying out thresholding processing on the gray level map after Gaussian blur processing so as to convert the gray level map after Gaussian blur processing into a binary image; and extracting a contour corresponding to the tracking graph from the binary image based on a preset tracking graph, and identifying an implantation identifier based on an image in the contour.
According to the embodiment, the target video is divided into a plurality of video frames and converted into the gray level image, and then Gaussian blur and thresholding are carried out, so that noise and interference in the video frames can be reduced, and extraction of the implantation identification is more stable. And extracting the outline from the binary image based on a preset tracking graph, thereby being beneficial to identifying the implantation mark.
Step S704, determining the object to be implanted based on the identity information carried in the implantation identifier, and generating a pose transformation matrix of the object to be implanted based on the pose information carried in the implantation identifier.
Firstly, determining the object to be implanted based on the identity information carried in the implantation identification. For example, extracting a plurality of tracking points from the image within the contour, and calculating a center point of the image within the contour using positions of the plurality of tracking points in the image within the contour; connecting the plurality of tracking points with the center point based on the angles of the plurality of tracking points and the angles of the center point to obtain a region to be filled; the identity information is determined based on a combination of the geometry of the region to be filled and the color of the center point, and the object to be implanted is determined based on the identity information.
The embodiment considers the geometric characteristics and the color characteristics of the implantation mark, which is helpful for accurately identifying the identity of the object to be implanted, ensures that the selected object is consistent with the video environment, and improves the authenticity of the implantation effect. Second, the present embodiment also helps to precisely locate the position of the object to be implanted. By calculating the geometric shape of the region to be filled and combining the color information of the center point, the position of the object can be determined more accurately, the object to be implanted is ensured to be accurately placed at the expected position in the video, and the problems of position deviation and incompatibility are avoided. In summary, the present embodiment contributes to improvement of the sense of realism of the implantation effect. By comprehensively considering information such as color, shape, angle and the like, the object to be implanted is ensured to be consistent with the visual characteristics of the video environment, so that the implanted object looks more natural and real, the implantation effect is enhanced, and the implanted object is better integrated into the video scene.
Next, the pose transformation matrix of the object to be implanted is generated based on the pose information carried in the implantation identifier. For example, the position, rotation angle, and size of the object to be implanted are determined from the pose information, and the pose transformation matrix is determined from that position, rotation angle, and size.
By jointly considering position, rotation angle, and size, this embodiment precisely defines the appearance and placement of the object to be implanted in the target video, ensuring that the implanted object is consistent with the video environment and free of uncoordinated, distorted, or unnatural appearance. Furthermore, generating the pose transformation matrix is a programmable and controllable process: by modifying the position, rotation angle, and size parameters, the implanted object can be adjusted at any time to suit different video scenes or creative requirements without recreating or re-editing the object. This provides a high degree of customization and flexibility, allowing the appearance and placement of the implanted object to be tuned as needed without reprocessing the entire implantation procedure. In summary, generating the pose transformation matrix from the pose information is accurate, customizable, and amenable to automated processing, which helps keep the implanted object coordinated with the target video while improving processing efficiency and controllability.
Specifically, determining the pose transformation matrix based on the position, rotation angle, and size of the object to be implanted may include: generating a translation transformation matrix for translating the object to be implanted to the position based on the position of the object to be implanted; generating a rotation transformation matrix for rotating the object to be implanted according to the rotation angle based on the rotation angle of the object to be implanted; generating a scaling transformation matrix for scaling the object to be implanted according to the size based on the size of the object to be implanted; wherein the pose transformation matrix comprises the translation transformation matrix, the rotation transformation matrix and the scaling transformation matrix.
By using a pose transformation matrix, this embodiment provides precise control of the object to be implanted, covering translation, rotation, and scaling transformations. These exact transformations keep the implanted object coordinated with the video environment and improve the realism of the implantation effect. The matrix representation not only makes the mathematical computation efficient but also provides flexibility and customizability, allowing quick adjustment to different requirements. At the same time, composing the pose transformation matrix from these component matrices keeps the transformation process controllable and predictable, improving processing accuracy and consistency. This embodiment therefore offers significant advantages in the coordination, accuracy, and processing efficiency of the implantation effect.
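A compact sketch of composing such a matrix in 2D homogeneous coordinates follows; the T·R·S multiplication order and the optional anisotropic scale are conventions chosen for illustration, and the 3D case is analogous with 4x4 matrices.

    import numpy as np

    def pose_matrix(position, angle_deg, scale):
        """Compose translation, rotation and scaling into a single 3x3 homogeneous matrix."""
        tx, ty = position
        sx, sy = (scale, scale) if np.isscalar(scale) else scale
        theta = np.deg2rad(angle_deg)
        T = np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])    # translation
        R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                      [np.sin(theta),  np.cos(theta), 0.0],
                      [0.0, 0.0, 1.0]])                                    # rotation
        S = np.array([[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]])    # scaling (possibly anisotropic)
        return T @ R @ S   # applied to column vectors [x, y, 1]^T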
In some embodiments, different portions of the object to be implanted may need to have different dimensions, including local scaling or deformation of the object. To achieve this effect, the scaling component of the pose transformation matrix may be made non-uniform, allowing different scale factors to be applied on different axes; for example, anisotropic scaling factors may be introduced. It is sometimes also necessary to warp the object to be implanted so that it adapts to a particular scene or shape. By applying a warping transformation, non-linear deformation can be introduced on the surface of the object to be implanted to match the target scene. This is useful for cases such as muscle simulation of a virtual character or morphological adjustment of a deformable object. The warping may be achieved by introducing a non-linear transformation, such as a Bézier curve or B-spline curve, to produce local shape variations on different parts of the object to be implanted.
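As a toy illustration of such Bézier-based warping, the sketch below displaces a set of 2D vertices along a quadratic Bézier curve; parameterizing the curve by horizontal extent and the strength factor are assumptions made only for this example.

    import numpy as np

    def bezier_warp(vertices, p0, p1, p2, strength=1.0):
        """Displace 2D vertices along a quadratic Bezier curve spanning their x extent."""
        v = np.asarray(vertices, dtype=float)
        p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
        span = max(v[:, 0].max() - v[:, 0].min(), 1e-9)
        t = (v[:, 0] - v[:, 0].min()) / span                 # map x position to t in [0, 1]
        tt = t[:, None]
        curve = (1 - tt) ** 2 * p0 + 2 * (1 - tt) * tt * p1 + tt ** 2 * p2
        chord = (1 - tt) * p0 + tt * p2                      # straight line between the endpoints
        return v + strength * (curve - chord)                # add only the bending offset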
In addition, changes in the material and texture of the object to be implanted should be considered during pose transformation: beyond the geometric changes, material properties such as color, transparency, and reflectivity may also change. Combining these texture attributes with the pose transformation when generating the pose transformation matrix helps achieve a realistic appearance.
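For illustration only, the following sketch shows one way pose and material attributes might be carried together; the attribute names and the brightness-based opacity coupling are assumptions, not part of the described method.

    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class ImplantState:
        """Pose transformation plus material attributes carried together for rendering."""
        pose: np.ndarray = field(default_factory=lambda: np.eye(3))  # 3x3 homogeneous pose
        color: tuple = (255, 255, 255)   # base color (B, G, R)
        opacity: float = 1.0             # 0 = fully transparent, 1 = fully opaque
        reflectivity: float = 0.0        # 0 = matte, 1 = mirror-like

        def opacity_for(self, scene_brightness: float) -> float:
            # Example coupling: darker scenes make the implant slightly more transparent.
            return float(np.clip(self.opacity * (0.5 + 0.5 * scene_brightness), 0.0, 1.0))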
In some embodiments, an adaptive pose transformation approach may be adopted, depending on the requirements of the scene. The pose transformation matrix can be adjusted according to the interaction and constraints between the object to be implanted and its surroundings; for example, when a virtual character interacts with real-world objects, the pose may be adjusted dynamically to better simulate the interaction. If changes in camera viewing angle are involved, camera parameters such as the intrinsic and extrinsic parameters also need to be considered; these parameters can be combined with the pose transformation matrix to ensure realistic rendering of the object to be implanted at different viewing angles.
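A hedged sketch of combining camera intrinsics and extrinsics with the pose transformation matrix is shown below; the pinhole projection model and the parameter names are assumptions for illustration.

    import numpy as np

    def project_points(points_3d, K, R, t, model_pose=None):
        """Project object-space 3D points into pixel coordinates via pose, extrinsics, intrinsics."""
        pose = np.eye(4) if model_pose is None else model_pose
        P = np.asarray(points_3d, dtype=float)
        hom = np.hstack([P, np.ones((len(P), 1))])                    # homogeneous object coordinates
        world = (pose @ hom.T)[:3]                                    # apply the pose transformation matrix
        cam = R @ world + np.asarray(t, dtype=float).reshape(3, 1)    # extrinsics: world -> camera
        img = K @ cam                                                 # intrinsics: camera -> image plane
        return (img[:2] / img[2]).T                                   # perspective divide -> (N, 2) pixels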
Step S706, adjusting the object to be implanted based on the pose transformation matrix, and implanting the adjusted object to be implanted into the target video.
A target video frame containing the implantation identifier is acquired from the plurality of video frames, and the target video frame is fused with the adjusted object to be implanted so as to implant it into the target video. For example, perspective transformation information is obtained from the target video frame and the adjusted object to be implanted, and a perspective transformation is applied to the adjusted object based on that information; the perspective-transformed object is then superimposed at the implantation position indicated by the implantation identifier in the target video frame, and the transparency of the superimposed object is adjusted based on the target video frame; finally, edge smoothing is applied to the boundary of the superimposed object.
In this embodiment, acquiring the target video frame carrying the implantation identifier allows the implanted object to be placed precisely at the intended position in the video, keeping it coordinated with the scene and enhancing the realism of the implantation effect. The introduction of perspective transformation information accounts for deformation and projection effects at different viewing angles, so the projection of the implanted object in the video appears more realistic and coordinated. Adjusting the transparency ensures the implanted object blends naturally with the background without looking out of place, and the edge smoothing removes hard boundaries between the implanted object and the background, making the transition smoother. Together these effects improve the overall consistency between the implanted object and the target video environment and enhance the implantation effect.
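A possible fusion routine is sketched below, assuming OpenCV; the fixed global alpha value and the Gaussian feathering width stand in for the frame-dependent transparency adjustment and edge smoothing described above.

    import cv2
    import numpy as np

    def blend_object(frame, obj_rgba, marker_quad, alpha=0.9, feather=7):
        """Warp the object onto the marker quad, then alpha-blend it with feathered edges."""
        h, w = obj_rgba.shape[:2]
        src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        H = cv2.getPerspectiveTransform(src, np.float32(marker_quad))   # perspective transformation info
        warped = cv2.warpPerspective(obj_rgba, H, (frame.shape[1], frame.shape[0]))
        mask = warped[:, :, 3].astype(np.float32) / 255.0               # object alpha channel
        mask = cv2.GaussianBlur(mask, (feather, feather), 0) * alpha    # edge smoothing + transparency
        mask = mask[:, :, None]
        out = frame.astype(np.float32) * (1 - mask) + warped[:, :, :3].astype(np.float32) * mask
        return out.astype(np.uint8)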
Example 4
An embodiment of the present application provides a device for setting an implantation mark, as shown in fig. 8, including an acquisition module 82, an identification module 84, a correction module 86, and a setting module 88.
The acquisition module 82 is configured to acquire a target video frame in a target video; the identification module 84 is configured to identify a plurality of candidate edges in the target video frame and to identify a bounding box containing a target object in the target video frame; the correction module 86 is configured to screen out, from the plurality of candidate edges, a target edge line meeting a preset condition and to correct the bounding box based on the target edge line to obtain an implantation position of an object to be implanted in the target video frame, where the preset condition is that the target edge lines can be connected to form a polygon and the similarity between the polygon and the bounding box is greater than a preset similarity threshold; the setting module 88 is configured to set an implantation identifier at the implantation position, the implantation identifier identifying the implantation position of the object to be implanted in the target video frame.
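For illustration only, a loose structural sketch of how these four modules might be composed is given below; the class name, the callable parameters, and their composition are assumptions and do not correspond to the actual implementation of fig. 8.

    class ImplantIdentifierSetter:
        """Loose sketch of the module layout: acquisition, identification, correction, setting."""

        def __init__(self, acquire, identify, correct, set_identifier):
            self.acquire = acquire                # acquisition module 82
            self.identify = identify              # identification module 84
            self.correct = correct                # correction module 86
            self.set_identifier = set_identifier  # setting module 88

        def run(self, target_video):
            frame = self.acquire(target_video)                     # target video frame
            candidate_edges, bounding_box = self.identify(frame)   # edges + bounding box
            implant_position = self.correct(candidate_edges, bounding_box)
            return self.set_identifier(frame, implant_position)    # write the implantation identifier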
It should be noted that the device for setting the implantation identifier provided in the above embodiment is described using the above division of functional modules only as an example; in practice, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the device for setting the implantation identifier provided in the above embodiment and the method embodiments for setting the implantation identifier belong to the same concept; for the detailed implementation, refer to the method embodiments, which are not repeated here.
Example 5
Fig. 9 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure. It should be noted that the electronic device shown in fig. 9 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present disclosure.
As shown in fig. 9, the electronic apparatus includes a Central Processing Unit (CPU) 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data required for system operation are also stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card or a modem. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read from it is installed into the storage section 1008 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 1009, and/or installed from the removable medium 1011. When executed by the Central Processing Unit (CPU) 1001, the computer program performs the various functions defined in the method and apparatus of the present application. In some embodiments, the electronic device may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
It should be noted that the computer readable medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by software or by hardware, and the described units may also be provided in a processor; in some cases, the names of the units do not constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments. For example, the electronic device may implement the steps of the method embodiments described above.
The integrated units in the above embodiments may be stored in the above computer-readable storage medium if implemented in the form of software functional units and sold or used as independent products. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
In the foregoing embodiments of the present application, the description of each embodiment has its own emphasis; for portions not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal device may be implemented in other manners. The apparatus embodiments described above are merely exemplary; for example, the division of the units is merely a logical function division, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is merely a preferred embodiment of the present application. It should be noted that a person skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations shall also fall within the scope of protection of the present application.

Claims (9)

1. A method of setting an implant marker, comprising:
acquiring a target video frame in a target video;
identifying a plurality of candidate edges in the target video frame and identifying a bounding box in the target video frame that contains a target object;
screening out, from the plurality of candidate edges, a target edge line meeting a preset condition, and correcting the bounding box based on the target edge line to obtain an implantation position of an object to be implanted in the target video frame, wherein the preset condition is that the target edge lines can be connected to form a polygon and a similarity between the polygon and the bounding box is greater than a preset similarity threshold;
setting an implantation identifier at the implantation position, wherein the implantation identifier is used for identifying the implantation position of the object to be implanted in the target video frame;
after setting the implantation identifier, the method further comprises: acquiring the implantation identifier corresponding to the implantation position from the target video; determining the object to be implanted based on identity information carried in the implantation identifier, and generating a pose transformation matrix of the object to be implanted based on pose information carried in the implantation identifier; and adjusting the object to be implanted based on the pose transformation matrix, and implanting the adjusted object to be implanted into the target video;
wherein acquiring the implantation identifier corresponding to the implantation position from the target video comprises: dividing the target video into a plurality of video frames, and converting each of the plurality of video frames into a grayscale image; performing Gaussian blur processing on the grayscale image based on a preset blur kernel size, and thresholding the blurred grayscale image to convert it into a binary image; and extracting, from the binary image, a contour corresponding to a preset tracking graphic, and identifying the implantation identifier based on an image within the contour.
2. The method of claim 1, wherein screening out the target edge line meeting the preset condition from the plurality of candidate edges comprises:
detecting connectivity among the plurality of candidate edges, and screening out edge lines that can be connected to form a polygon;
and calculating the similarity between the polygon and the bounding box, and taking the edge lines of the polygon as the target edge line when the similarity is greater than the preset similarity threshold.
3. The method of claim 2, wherein calculating the similarity of the polygon to the bounding box comprises:
calculating an overlap area, an overlap degree, a relative size value, and a spatial relationship value of the polygon and the bounding box;
and calculating the similarity of the polygon to the bounding box based on the overlap area, the overlap degree, the relative size value, and the spatial relationship value.
4. The method of claim 3, wherein calculating the overlap area, the overlap degree, the relative size value, and the spatial relationship value of the polygon and the bounding box comprises:
calculating the overlap area based on contour functions of the polygon and the bounding box;
calculating the overlap degree based on the distance between the center points of the polygon and the bounding box and on the overlap area;
calculating an area difference between the polygon and the bounding box, and normalizing the area difference to obtain the relative size value;
and calculating the spatial relationship value based on depth values of the polygon and the bounding box and the distance between the center points of the polygon and the bounding box.
5. The method of claim 1, wherein correcting the bounding box based on the target edge line comprises:
identifying geometric features of the target edge line, the geometric features including a length, an angle, and a curvature of the target edge line;
analyzing a relative position between the target edge line and the bounding box based on the geometric features;
and adjusting a position and a shape of the bounding box based on the relative position so as to correct the bounding box.
6. The method of claim 5, wherein adjusting the position and shape of the bounding box based on the relative position comprises:
detecting an intersection angle between the target edge line and the bounding box when the relative position indicates that the target edge line intersects the bounding box, and shrinking the bounding box when the intersection angle is greater than a preset angle threshold, so as to prevent the target edge line from intersecting the bounding box;
and detecting a gap distance between the target edge line and the bounding box when the relative position indicates that the target edge line does not intersect the bounding box, and translating an edge of the bounding box toward the target edge line when the gap distance is smaller than a preset gap threshold, so that the bounding box is closer to the target edge line.
7. A device for setting an implant marker, comprising:
an acquisition module configured to acquire a target video frame in a target video;
an identification module configured to identify a plurality of candidate edges in the target video frame and to identify a bounding box in the target video frame containing a target object;
a correction module configured to screen out, from the plurality of candidate edges, a target edge line meeting a preset condition, and to correct the bounding box based on the target edge line to obtain an implantation position of an object to be implanted in the target video frame, wherein the preset condition is that the target edge lines can be connected to form a polygon and a similarity between the polygon and the bounding box is greater than a preset similarity threshold;
a setting module configured to set an implantation identifier at the implantation position, the implantation identifier being used to identify an implantation position of the object to be implanted in the target video frame;
wherein the device for setting the implantation identifier is further configured to: acquire the implantation identifier corresponding to the implantation position from the target video; determine the object to be implanted based on identity information carried in the implantation identifier, and generate a pose transformation matrix of the object to be implanted based on pose information carried in the implantation identifier; and adjust the object to be implanted based on the pose transformation matrix, and implant the adjusted object to be implanted into the target video;
wherein the device for setting the implantation identifier is further configured to: divide the target video into a plurality of video frames, and convert each of the plurality of video frames into a grayscale image; perform Gaussian blur processing on the grayscale image based on a preset blur kernel size, and threshold the blurred grayscale image to convert it into a binary image; and extract, from the binary image, a contour corresponding to a preset tracking graphic, and identify the implantation identifier based on an image within the contour.
8. An electronic device, comprising:
A memory configured to store a computer program;
a processor configured to cause a computer to perform the method of any one of claims 1 to 6 when the program is run.
9. A computer-readable storage medium, on which a program is stored, characterized in that the program, when run, causes a computer to perform the method of any one of claims 1 to 6.
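As an aid to reading claims 3 and 4 above, the sketch below prototypes one possible similarity score as a weighted blend of the four recited quantities; the Shapely library, the weights, the IoU-style overlap degree, and the omission of the depth values mentioned in claim 4 are all illustrative assumptions rather than the claimed computation.

    from shapely.geometry import Polygon, box as rect

    def similarity(poly_pts, bbox, weights=(0.4, 0.3, 0.15, 0.15)):
        """Blend overlap area, overlap degree, relative size and spatial relation into one score."""
        poly = Polygon(poly_pts)
        bb = rect(*bbox)                                            # bbox = (xmin, ymin, xmax, ymax)
        overlap_area = poly.intersection(bb).area
        overlap_deg = overlap_area / max(poly.union(bb).area, 1e-9) # IoU-style stand-in for overlap degree
        rel_size = 1.0 - abs(poly.area - bb.area) / max(poly.area + bb.area, 1e-9)
        center_dist = poly.centroid.distance(bb.centroid)
        spatial = 1.0 / (1.0 + center_dist)                         # closer centers -> higher value
        norm_overlap = overlap_area / max(bb.area, 1e-9)
        w1, w2, w3, w4 = weights
        return w1 * norm_overlap + w2 * overlap_deg + w3 * rel_size + w4 * spatial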
CN202311214501.5A 2023-09-20 2023-09-20 Method and device for setting implantation identification, electronic equipment and storage medium Active CN116962816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311214501.5A CN116962816B (en) 2023-09-20 2023-09-20 Method and device for setting implantation identification, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116962816A CN116962816A (en) 2023-10-27
CN116962816B (en) 2023-12-12

Family

ID=88442871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311214501.5A Active CN116962816B (en) 2023-09-20 2023-09-20 Method and device for setting implantation identification, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116962816B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101984465A (en) * 2010-10-19 2011-03-09 浙江大学 Image seamless copying method
CN108334878A (en) * 2018-02-07 2018-07-27 北京影谱科技股份有限公司 Video images detection method and apparatus
CN108665495A (en) * 2017-03-30 2018-10-16 展讯通信(上海)有限公司 Image processing method and device, mobile terminal
CN110381369A (en) * 2019-07-19 2019-10-25 腾讯科技(深圳)有限公司 Determination method, apparatus, equipment and the storage medium of recommendation information implantation position
CN111556336A (en) * 2020-05-12 2020-08-18 腾讯科技(深圳)有限公司 Multimedia file processing method, device, terminal equipment and medium
WO2021036280A1 (en) * 2019-08-28 2021-03-04 昆山国显光电有限公司 Positioning method and apparatus, and storage medium

Similar Documents

Publication Publication Date Title
US11595737B2 (en) Method for embedding advertisement in video and computer device
CN110188760B (en) Image processing model training method, image processing method and electronic equipment
US5751852A (en) Image structure map data structure for spatially indexing an imgage
CN109771951B (en) Game map generation method, device, storage medium and electronic equipment
US8780131B2 (en) Systems and methods for text-based personalization of images
US20220108422A1 (en) Facial Model Mapping with a Neural Network Trained on Varying Levels of Detail of Facial Scans
CN112272295B (en) Method for generating video with three-dimensional effect, method for playing video, device and equipment
WO2022075858A1 (en) Method and user interface for generating tangent vector fields usable for generating computer-generated imagery
Lawonn et al. Stylized image triangulation
US20210272345A1 (en) Method for Efficiently Computing and Specifying Level Sets for Use in Computer Simulations, Computer Graphics and Other Purposes
CN117422851A (en) Virtual clothes changing method and device and electronic equipment
CN116962816B (en) Method and device for setting implantation identification, electronic equipment and storage medium
CN113628349B (en) AR navigation method, device and readable storage medium based on scene content adaptation
CN116939293B (en) Implantation position detection method and device, storage medium and electronic equipment
CN113766147B (en) Method for embedding image in video, and method and device for acquiring plane prediction model
CN116939294B (en) Video implantation method and device, storage medium and electronic equipment
CN116962817B (en) Video processing method, device, electronic equipment and storage medium
CN115115699A (en) Attitude estimation method and device, related equipment and computer product
CN112991498B (en) System and method for rapidly generating lens animation
Gao et al. Multi-target 3d reconstruction from rgb-d data
CN111462294A (en) Image processing method, electronic equipment and computer readable storage medium
US11455780B2 (en) Graphical user interface for creating data structures used for computing simulated surfaces for animation generation and other purpose
US20230215040A1 (en) Method and system for tracking a cad model in real time based on particle filters
CN117173381A (en) Image correction method and device and digital person automatic modeling method and device
AU2023210625A1 (en) Shadowmapping of depth estimated scenes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant