CN118261874A

CN118261874A - Fishing rod fishing performance analysis method based on image segmentation large model and multi-element high-order regression fitting

Info

Publication number: CN118261874A
Application number: CN202410359722.XA
Authority: CN
Inventors: 苏统华; 万庭雷; 乔智栋; 曲明成; 涂志莹; 王忠杰; 徐汉川; 初佃辉; 陈鄞
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Filing date: 2024-03-27
Publication date: 2024-06-28

Abstract

The invention discloses a fishing rod fishing performance analysis method based on an image segmentation large model and multi-element high-order regression fitting, which comprises the following steps: step one, cutting an image; step two, pre-identifying the fishing rod; step three, image recognition; and step four, fitting a fishing rod curve. The method automatically identifies the fishing rod in the image and generates the coordinates of the fishing rod in the original image so that the fishing performance of the fishing rod can be measured. It supports segmentation of images taken in dim light and achieves higher efficiency and better segmentation results when processing large images.

Description

Fishing rod fishing performance analysis method based on image segmentation large model and multi-element high-order regression fitting

Technical Field

The invention relates to a fishing rod fishing performance analysis method, in particular to a fishing rod fishing performance analysis method based on an image segmentation large model and multi-element high-order regression fitting.

Background

In the existing method for analyzing the fishing performance of a fishing rod, a certain number of points along the bent fishing rod are marked by hand against coordinate axes to obtain coordinate data, and the data are then analyzed manually. This method has the following problems:

(1) Lighting at the place where the fishing rod is photographed is relatively poor, and the captured images suffer from ghosting, blurring and similar problems.

(2) The fishing rod is thin and carries many rings, which poses challenges for the image segmentation task.

(3) The background contains many patterns, and the lines marking the coordinate axes differ in thickness and color, which interferes with segmenting the fishing rod.

(4) Because measurement is manual, the reference differs from one measurement to the next and the error is large; the data are also prone to recording errors.

(5) Because manual measurement is error-prone, many points must be marked for calibration, which takes a long time and is inefficient.

(6) Obtaining an accurate fishing performance analysis result by manual measurement places high demands on the mathematical background of the staff.

Disclosure of Invention

To address the problems in the prior art, the invention provides a fishing rod fishing performance analysis method based on an image segmentation large model and multi-element high-order regression fitting. The method automatically identifies the fishing rod in the image and generates the coordinates of the fishing rod in the original image so that the fishing performance of the fishing rod can be measured. It supports segmentation of images taken in dim light and achieves higher efficiency and better segmentation results when processing large images.

The object of the invention is achieved through the following technical scheme:

A fishing rod fishing performance analysis system based on an image segmentation large model and multi-element high-order regression fitting comprises an image cutting module, a fishing rod pre-recognition module, an image recognition module and a fishing rod curve fitting module, wherein:

The image cutting module is used for cutting out the part of the image that contains the whole fishing rod and its background plate;

The fishing rod pre-recognition module is used for recognizing the position and shape of the fishing rod in the image and transmitting them to the image recognition module;

The image recognition module is used for recognizing the fishing rod with the Segment Anything large model, guided by the curve information provided by the fishing rod pre-recognition module, so as to obtain the region image of the fishing rod;

The fishing rod curve fitting module takes the fishing rod region image obtained by the image recognition module, applies image inversion, dilation, erosion and filtering, and fits the pixel coordinates of the resulting fishing rod region.

A method for realizing fishing rod fishing performance analysis based on an image segmentation large model and a multi-element high-order regression fit by using the system comprises the following steps:

Step one, cutting an image:

Step 1.1, identifying all dots in the background plate by using a blob detector, and storing the coordinates and areas of all identified dots locally in matrix form;

Step 1.2, sorting the dots identified in Step 1.1 by area;

Step 1.3, determining the coordinates of the cutting points from the first five dots in the ordering of Step 1.2, and cutting the image;

Step two, pre-identifying the fishing rod:

Step 2.1, dividing the original image by using all the dot coordinates obtained in Step 1.1, and extracting a partial image of the middle section of the fishing rod;

Step 2.2, setting the pixel values of the areas where all the dots are located to 0;

Step 2.3, converting the middle-section partial image of the fishing rod into a binary image to separate the fishing rod from the background;

Step 2.4, performing a Gaussian smoothing operation, and then detecting the edges of the fishing rod in the image by using the Canny edge detection algorithm;

Step 2.5, performing a morphological dilation operation to connect the curve edges and obtain an accurate contour of the middle section of the fishing rod;

Step 2.6, finding contours in the image processed in Step 2.5 and drawing them on the original image, so that the shape and position of the fishing rod are displayed intuitively;

Step 2.7, fitting a curve to the contour to better describe the shape of the fishing rod and obtain a smoother, more continuous contour line;

Step 2.8, selecting two points on the fitted curve as indication points for the image recognition module, converting their coordinates in the middle-section partial image into coordinates in the original full image, and inputting these full-image coordinates into the image recognition module for subsequent recognition and analysis;

Step three, image recognition:

Step 3.1, after the image recognition module receives the indication points transmitted by the fishing rod pre-recognition module, the image encoder of the Segment Anything large model converts the input image into an image embedding vector that captures the semantic and feature information of the image, and the prompt encoder converts the indication points into corresponding prompt embedding vectors;

Step 3.2, the mask decoder combines the image embedding and the prompt embedding to predict the segmentation mask of the fishing rod in the image and to segment the different objects in the image;

Step 3.3, a segmentation mask is generated from the output of the Segment Anything large model, so that the fishing rod region in the image is accurately separated from the background; the mask clearly delineates the outline and shape of the fishing rod, and the recognition result is returned;

Step four, fitting a fishing rod curve:

Step 4.1, the fishing rod curve fitting module converts the fishing rod region image obtained by the image recognition module into a binary image and inverts it, so that the black area becomes 255 and the white area becomes 0;

Step 4.2, performing dilation and erosion on the image with a 5 × 5 kernel to fill disconnected areas: the inverted image is dilated to fill holes, and the dilated image is then eroded to restore the image;

Step 4.3, filtering out the pendant area by using region position and size information: extracting the fishing rod curve image and the pendant region image with the Canny edge detection algorithm; determining which region is the pendant by calculating the contour area and center point coordinates of the two regions; and excluding regions that are too small according to a predefined threshold, which removes the pixels of the pendant region;

Step 4.4, filtering out the influence of the rings by using gradient information: extracting the fishing rod curve image with the Canny edge detection algorithm and computing a gradient image from it; obtaining the gradient magnitude and direction at each pixel with the Sobel operator; and, in the gradient image of the fishing rod curve, screening out and removing the pixels that the gradient direction and a threshold mark as affected by the rings;

Step 4.5, acquiring the coordinates of the pixels whose value is 255: traversing each pixel of the image, checking whether its value is 255, and recording the corresponding row and column coordinates, so as to obtain the coordinates of all pixels with a value of 255;

Step 4.6, fitting all pixels with a value of 255 to obtain the fishing rod curve.

Compared with the prior art, the invention has the following advantages:

1. The invention addresses the large errors of manual measurement and achieves better results.

2. To address the low efficiency of measuring a fishing rod by hand, the invention uses a large model for identification, so that the fishing rod in a picture can be identified within ten seconds, which greatly improves efficiency.

Drawings

FIG. 1 is a block diagram of the fishing rod fishing performance analysis system based on an image segmentation large model and multi-element high-order regression fitting;

FIG. 2 is a design drawing of the novel fishing rod measurement background plate;

FIG. 3 is a diagram of the cutting effect;

FIG. 4 is a diagram of the dot inversion effect;

FIG. 5 is a diagram of the fishing rod pre-recognition effect;

FIG. 6 is a diagram of the image recognition effect;

FIG. 7 is a diagram of the fishing rod curve fitting effect;

FIG. 8 is a diagram of the fishing rod recognition effect with a starting point y coordinate of 50 cm;

FIG. 9 is a diagram of the fishing rod recognition effect with a starting point y coordinate of 2 cm;

FIG. 10 is a diagram of the fishing rod recognition effect with a starting point y coordinate of 0 cm.

Detailed Description

The invention is further described below with reference to the accompanying drawings, but is not limited to this description; any modification or equivalent substitution of the technical scheme that does not depart from the spirit and scope of the invention shall fall within its scope of protection.

The invention provides a fishing rod fishing performance analysis system based on an image segmentation large model and multi-element high-order regression fitting. As shown in figure 1, the system comprises four modules: image cutting, fishing rod pre-recognition, image recognition and fishing rod curve fitting. The detailed process of each module is as follows:

(1) Image cutting module

In order to improve the recognition and segmentation precision of the image and the pre-recognition precision of the fishing rod, the invention designs an image cutting module for cutting out the part of the image that contains the whole fishing rod and its background plate. The module can be divided into three parts: a dot identification module, a dot ordering module and an image cutting module, wherein:

The dot identification module identifies all dots on the background plate by using a blob detector. The background plate is a novel fishing rod measurement background plate, shown in figure 2. The dots are circular marker points of two different radii: the large marker points have a radius of 5 cm and the small marker points a radius of 2.5 cm. The small marker points are pasted at intervals of 20 cm, and the large marker points are pasted along the peripheral edge of the background plate. The world coordinates of all the small marker points are determined from the positions of the large marker points on the background plate, which realizes coordinate positioning on the background plate. Arranging dots on the background plate maximizes the robustness of marker identification, and because the large circles are placed on the coordinate axes they do not interfere with manual measurement. The coordinates and areas of all identified dots are stored locally in matrix form. To recover real coordinates on the background plate grid from a long distance, the invention analyzes the size and arrangement of the marker points. The marker size is the minimum that can still be identified reliably over the range of distances normally used for photographing the background plate, so that interference with any manual reading is kept as small as possible. With the marker sizes given here, the background plate can be identified with the camera 1 to 8 meters from a standard background plate, which suits most fishing rod fishing performance analysis scenes; for greater distances, the radius of the marker points is increased accordingly;

The dot ordering module reads the file stored locally by the dot identification module and sorts the dots by area;

The image cutting module determines the coordinates of the cutting points from the first five dots in the ordering produced by the dot ordering module and cuts the image; the cutting effect is shown in fig. 3.
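The sorting and cutting described above can be illustrated with a minimal sketch. It assumes the stored dot matrix has one row per dot in the form (x, y, area), and it assumes that the cutting points are the corners of the bounding box around the centers of the five largest dots; the source does not state how the cutting points are derived, and the function name `crop_by_largest_dots` is hypothetical.

```python
import numpy as np

def crop_by_largest_dots(image, dots, n_largest=5):
    """Cut `image` to the bounding box of the centers of the largest dots.

    `dots` is an (N, 3) array of rows (x, y, area). Using the bounding box
    of the five largest dots as the cutting region is an assumption of this
    sketch, not a detail given in the source.
    """
    dots = np.asarray(dots, dtype=float)
    # Sort by area, largest first, and keep the first n_largest rows.
    order = np.argsort(-dots[:, 2], kind="stable")
    largest = dots[order[:n_largest]]
    x_min, y_min = np.floor(largest[:, :2].min(axis=0)).astype(int)
    x_max, y_max = np.ceil(largest[:, :2].max(axis=0)).astype(int)
    # Clamp the cutting points to the image bounds.
    h, w = image.shape[:2]
    x_min, y_min = max(int(x_min), 0), max(int(y_min), 0)
    x_max, y_max = min(int(x_max), w - 1), min(int(y_max), h - 1)
    cropped = image[y_min:y_max + 1, x_min:x_max + 1]
    # The origin is returned so later steps can map back to the full image.
    return cropped, (x_min, y_min)
```

Returning the top-left corner of the cut region alongside the cut image keeps the offset available for converting coordinates back to the original picture in later steps.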

(2) Fishing rod pre-identification module

The fishing rod pre-recognition module recognizes the position and shape of the fishing rod in the image. This pre-recognition process is critical to the subsequent image recognition module because it provides the basic information for locating and segmenting the fishing rod.

By means of the fishing rod pre-recognition module, the rough coordinates of the fishing rod in the image can be obtained. This information is passed to the image recognition module for further analysis and processing. Two key points are selected as inputs that describe the fishing rod in more detail, so that the image recognition module can perform image segmentation and recognition more accurately.

Before image recognition can be performed, the challenges presented by taking photographs from different angles need to be addressed. In particular, when the image is cut, parts of the background plate dots may remain and interfere with image recognition. To overcome this problem, the invention takes the following preprocessing steps. First, the original image is divided by using all the dot coordinates obtained by the dot identification module to extract the middle-section partial image of the fishing rod. Next, the pixel values of the areas where all the dots are located are set to 0, so that the influence of the dots is minimized when the image is converted into a binary image, which improves the accuracy and stability of image recognition. After this preprocessing, the contour of the fishing rod can be segmented better, so that the image recognition module can recognize the characteristics and attributes of the fishing rod more accurately. The dot inversion effect is shown in figure 4.
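Setting the dot areas to 0 can be sketched as follows. The sketch assumes each dot is blanked with a filled disk around its detected center; the `radius` parameter and the function name `zero_out_dots` are hypothetical, and in practice the radius could be derived from the stored dot area as sqrt(area / pi) plus a small margin.

```python
import numpy as np

def zero_out_dots(gray, dot_centers, radius):
    """Return a copy of `gray` with a disk around every dot center set to 0.

    `dot_centers` is an iterable of (x, y) pixel coordinates. The disk
    radius is an assumption of this sketch.
    """
    out = gray.copy()
    rows, cols = np.ogrid[:gray.shape[0], :gray.shape[1]]
    for cx, cy in dot_centers:
        # Boolean mask of all pixels within `radius` of the dot center.
        disk = (cols - cx) ** 2 + (rows - cy) ** 2 <= radius ** 2
        out[disk] = 0
    return out
```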

After the middle-section partial image of the fishing rod is obtained, a series of image processing operations is carried out on it to extract more accurate contour information. First, the partial image is converted into a binary image, so that the fishing rod can be separated from the background more clearly. Next, a morphological closing operation is performed, which fills and connects voids and gaps in the image. This step ensures that the resulting contour is more complete and free of breaks or voids.

In preparation for edge detection, a Gaussian smoothing operation is performed, which reduces noise and makes the image smoother. The smoothed image provides a better input for the subsequent edge detection algorithm. The Canny edge detection algorithm is then used to detect the fishing rod edges in the image. The Canny algorithm identifies edges effectively and thereby locates the shape of the fishing rod more accurately.
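As an illustration of the smoothing step only, the following sketch applies a separable Gaussian filter with NumPy. The standard deviation `sigma`, the kernel radius of about three standard deviations, and the edge-replicating border are assumptions; the source does not give the smoothing parameters. Canny edge detection itself is not reproduced here.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1-D Gaussian kernel; radius defaults to about 3 sigma."""
    if radius is None:
        radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_smooth(image, sigma=1.0):
    """Smooth a 2-D image with a separable Gaussian (rows, then columns)."""
    kernel = gaussian_kernel_1d(sigma)
    r = len(kernel) // 2
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image.astype(float), r, mode="edge")
    tmp = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, tmp)
```

Because the kernel is normalized to sum to 1, flat regions keep their value and only local variations such as noise are attenuated.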

Next, a morphological dilation operation is performed to join the curvilinear edges. This step helps to fill the gaps between the edges, making the resulting fishing rod profile more continuous and smooth. By these processing steps, an accurate profile of the middle section of the fishing rod can be obtained.

Contours are then found in the processed image and drawn on the original image. By doing so, the shape and position of the fishing rod can be intuitively displayed. These contours are then curve fitted to better describe the shape of the fishing rod. By fitting a curve, a smoother and more continuous profile can be obtained.

After the curve is fitted, two points are selected on it as key points of the fishing rod. To restore these two points to the full image, their coordinates in the middle-section image are converted into coordinates in the original full image. Finally, the coordinates of the two key points are input into the image recognition module for subsequent recognition and analysis. The pre-recognition effect of the fishing rod is shown in fig. 5.
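The selection of two points and their conversion to full-image coordinates can be sketched as below. Several details are assumptions, since the source does not specify them: the contour is fitted as y = p(x) with a low-degree polynomial (`degree=2`), the two points are taken at fixed fractions of the horizontal extent (`fractions`), and `crop_origin` is the (x, y) position of the partial image's top-left corner in the full image. The function name `indication_points` is hypothetical.

```python
import numpy as np

def indication_points(contour_xy, crop_origin, degree=2, fractions=(0.25, 0.75)):
    """Fit y = p(x) to contour points given in partial-image coordinates and
    return points on the fitted curve in full-image pixel coordinates."""
    pts = np.asarray(contour_xy, dtype=float)
    coeffs = np.polyfit(pts[:, 0], pts[:, 1], degree)
    x_lo, x_hi = pts[:, 0].min(), pts[:, 0].max()
    # Pick x positions at the requested fractions of the contour's x range.
    xs = np.array([x_lo + f * (x_hi - x_lo) for f in fractions])
    ys = np.polyval(coeffs, xs)
    # Shift by the offset of the partial image inside the full image.
    ox, oy = crop_origin
    return np.column_stack([xs + ox, ys + oy])
```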

(3) Image recognition module

The image recognition module uses the Segment Anything large model (SAM) to recognize the fishing rod. Using the curve information provided by the fishing rod pre-recognition module, several points on the curve are selected arbitrarily as indication points that guide the model to segment the selected area, so that the region image of the fishing rod is recognized accurately.

Specifically, the image encoder of Segment Anything converts the input image into an image embedding vector that captures the semantic and feature information of the image. Meanwhile, the prompt encoder converts the indication points into corresponding prompt embedding vectors. By combining the image embedding with the prompt embedding, the mask decoder predicts the segmentation mask of the fishing rod in the image, achieving accurate segmentation of different objects.

The decoder plays an important role in this process: it uses prompt self-attention and cross-attention mechanisms to continually update the embedding vectors and makes adjustments to the model weights based on the mask. This mechanism enables the model to be gradually optimized and to adapt better to the characteristics of the fishing rod in the image.

A segmentation mask is generated from the output of the Segment Anything large model, and the fishing rod region in the image is accurately separated from the background. The mask clearly delineates the outline and shape of the fishing rod, and the recognition result is returned.
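A point-prompted call to Segment Anything might look like the sketch below. It assumes the publicly released `segment_anything` Python package (`sam_model_registry`, `SamPredictor`); the source does not say which implementation or checkpoint is used. The helper names `build_point_prompt` and `segment_rod`, the `checkpoint_path` argument, the `vit_h` model type and the choice of the highest-scoring mask are all assumptions of this sketch.

```python
import numpy as np

def build_point_prompt(points_xy):
    """Turn indication points into the coordinate and label arrays used as
    point prompts; every point is marked as foreground (label 1)."""
    coords = np.asarray(points_xy, dtype=np.float32).reshape(-1, 2)
    labels = np.ones(len(coords), dtype=np.int32)
    return coords, labels

def segment_rod(image_rgb, points_xy, checkpoint_path, model_type="vit_h"):
    """Return a boolean H x W mask for the object under the indication points.

    `image_rgb` is an H x W x 3 uint8 RGB array. Requires the
    `segment_anything` package and a downloaded checkpoint file.
    """
    # Imported lazily so build_point_prompt works without the package.
    from segment_anything import sam_model_registry, SamPredictor

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)  # runs the image encoder once
    coords, labels = build_point_prompt(points_xy)
    masks, scores, _ = predictor.predict(
        point_coords=coords, point_labels=labels, multimask_output=True)
    # Keeping the highest-scoring candidate is an assumption of this sketch.
    return masks[int(np.argmax(scores))]
```

The two indication points from the pre-recognition stage would be passed as `points_xy` in full-image (x, y) pixel coordinates.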

Through the image recognition module, an accurate fishing rod region image can be obtained, and a reliable basis is provided for subsequent analysis and processing. Such recognition effects will help to improve overall system performance and effectiveness, making it better suited for a variety of complex scenarios and conditions. The image recognition effect is shown in fig. 6.

(4) Fishing rod curve fitting module

The fishing rod curve fitting module converts the fishing rod region map obtained by the image recognition module into a binary image. The binary image is then inverted so that the black area becomes 255 and the white area becomes 0.
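The conversion and inversion can be sketched in a few lines, assuming the segmentation result arrives as a boolean mask in which True marks the fishing rod. The function name `mask_to_inverted_binary` is hypothetical.

```python
import numpy as np

def mask_to_inverted_binary(mask):
    """Map a boolean mask to a 0/255 uint8 image and return it together
    with its inverse (black becomes 255, white becomes 0)."""
    binary = np.where(np.asarray(mask, dtype=bool), 255, 0).astype(np.uint8)
    inverted = 255 - binary
    return binary, inverted
```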

Dilation and erosion are applied to the image with a 5 × 5 kernel to fill disconnected areas: the inverted image is first dilated to fill holes, and the dilated image is then eroded to restore the image. The image is then inverted again to recover the original color representation, Gaussian smoothing is applied in preparation for edge detection, and edges are detected with the Canny edge detection algorithm.
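Dilation followed by erosion with the same kernel is a morphological closing. The sketch below implements both operations with NumPy for a 0/255 image and a square kernel; the function names are hypothetical and the border handling (background padding for dilation, foreground padding for erosion) is an assumption of this sketch.

```python
import numpy as np

def _sliding_extreme(binary, ksize, reducer, pad_value):
    """Apply `reducer` (max or min) over every ksize x ksize neighborhood."""
    r = ksize // 2
    padded = np.pad(binary, r, mode="constant", constant_values=pad_value)
    h, w = binary.shape
    # One shifted view of the image per kernel position.
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(ksize) for dx in range(ksize)]
    return reducer(np.stack(shifted), axis=0)

def dilate(binary, ksize=5):
    """A pixel becomes 255 if any pixel in its neighborhood is 255."""
    return _sliding_extreme(binary, ksize, np.max, 0)

def erode(binary, ksize=5):
    """A pixel stays 255 only if its whole neighborhood is 255."""
    return _sliding_extreme(binary, ksize, np.min, 255)

def close_gaps(binary, ksize=5):
    """Dilation followed by erosion: fills gaps narrower than the kernel."""
    return erode(dilate(binary, ksize), ksize)
```

With a 5 × 5 kernel, a break of up to four pixels in a thin line is bridged while the line keeps its original thickness elsewhere.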

To eliminate the influence of the pendant on the fishing rod on curve fitting, the pendant area is filtered out using region position and size information. The fishing rod curve image and the pendant region image are extracted by Canny edge detection; by calculating the contour area and center point coordinates of the two regions, it can be determined which one is the pendant region. Regions that are too small are excluded according to a predefined threshold, which removes the pixels of the pendant region.
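The area-based filtering can be illustrated with a small connected-region sketch. It substitutes a pixel count per 8-connected region for the contour area named in the text, which is an assumption, and the threshold `min_area` and the function name `remove_small_regions` are hypothetical. The center point of each region is also reported, although this sketch filters on area alone.

```python
from collections import deque
import numpy as np

def remove_small_regions(binary, min_area):
    """Keep only 8-connected regions of 255-pixels with at least `min_area`
    pixels. Returns the filtered image and a list of (area, (cy, cx))."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    out = np.zeros_like(binary)
    regions = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] != 255 or seen[sy, sx]:
                continue
            # Breadth-first search over one connected region.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if binary[ny, nx] == 255 and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            ys, xs = zip(*pixels)
            area = len(pixels)
            regions.append((area, (sum(ys) / area, sum(xs) / area)))
            if area >= min_area:
                out[list(ys), list(xs)] = 255
    return out, regions
```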

Because the guide rings on the fishing rod can affect the subsequent fitting of the fishing rod curve, their influence is filtered out using gradient information. The fishing rod curve image is extracted by Canny edge detection, and a gradient image is computed from it; the Sobel operator gives the gradient magnitude and direction at each pixel. In the gradient image of the fishing rod curve, pixels that the gradient direction and a threshold mark as likely affected by a ring are screened out and removed.
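The gradient computation can be sketched with the standard 3 × 3 Sobel kernels. The filtering rule shown afterwards is only one possible reading of the text: it keeps edge pixels whose gradient orientation lies within a tolerance of a reference orientation and discards the rest. The parameters `ref_deg` and `tol_deg` and all function names are hypothetical, and the source does not specify how the threshold is applied.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def correlate3x3(image, kernel):
    """Correlate a 2-D image with a 3 x 3 kernel, replicating the border."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_gradient(image):
    """Return gradient magnitude and direction (degrees) for every pixel."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return np.hypot(gx, gy), np.degrees(np.arctan2(gy, gx))

def filter_by_direction(edges, magnitude, direction, ref_deg, tol_deg):
    """Keep 255-pixels whose gradient orientation is within `tol_deg` of
    `ref_deg`; orientations are compared modulo 180 degrees."""
    diff = np.abs((direction - ref_deg + 90.0) % 180.0 - 90.0)
    keep = (edges == 255) & (magnitude > 0) & (diff <= tol_deg)
    return np.where(keep, 255, 0).astype(np.uint8)
```

For a strongly bent rod a single reference orientation would be too coarse, so a practical version would likely compare each pixel against the local direction of the rod instead.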

To perform polynomial fitting, the coordinates of the pixels whose value is 255 must first be acquired. Each pixel of the image is traversed, its value is checked against 255, and the corresponding row and column coordinates are recorded. This gives the coordinates of all pixels with a value of 255.

Next, the row and column coordinates of these points are stored for polynomial fitting. Polynomial fitting is a mathematical method that approximates a given set of data points with a fitted curve. Here, a fifth-order polynomial fit is chosen.

The fitting is carried out with a mathematical library or a fitting algorithm. The invention uses the least squares method, which finds the best fit by minimizing the sum of squares of the residuals between the fitted curve and the data points. For the fifth-order fit, a fifth-order polynomial function is constructed, containing fifth-order, fourth-order, third-order, second-order, first-order and constant terms. The coefficients of the function are then adjusted so that the residuals between the fitted curve and the data points are minimized.
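The least squares problem described above can be written out explicitly. The symbols are introduced here for illustration and do not appear in the source: (x_i, y_i), i = 1, ..., N, are the coordinates of the pixels with value 255, and a_0, ..., a_5 are the polynomial coefficients.

```latex
\[
p(x) = \sum_{k=0}^{5} a_k x^{k}, \qquad
S(a_0,\dots,a_5) = \sum_{i=1}^{N} \bigl( y_i - p(x_i) \bigr)^{2}
\]
\[
\text{minimizing } S \text{ gives the normal equations} \quad
V^{\top} V \, a = V^{\top} y, \qquad
V_{ik} = x_i^{\,k}, \quad a = (a_0,\dots,a_5)^{\top}, \quad y = (y_1,\dots,y_N)^{\top}
\]
```

Here V is the N × 6 Vandermonde matrix of the x coordinates, so the fit reduces to solving a 6 × 6 linear system.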

During fitting, a curve fitting algorithm calculates the coefficients of the best fit function. These coefficients determine the shape of the fitted curve and how closely it follows the data; they are optimized with the least squares method so that the fitted curve is as close as possible to the data points.

After the polynomial fitting is complete, a fitting function is obtained that best describes the distribution of the points. With this function, the positions of other, unknown points can be predicted, and a smooth curve is obtained that best represents the distribution of the pixels with a value of 255, i.e. the pixels of the fishing rod. The effect of the fishing rod curve fitting is shown in figure 7.
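The coordinate extraction and the fifth-order fit can be sketched together with NumPy. Treating the column coordinate as the independent variable and the row coordinate as the dependent variable is an assumption, as is the use of `numpy.polynomial.Polynomial.fit`, which performs a least squares fit on a rescaled domain for better numerical conditioning. The function name `fit_rod_curve` is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def fit_rod_curve(binary, degree=5):
    """Fit row = p(column) through all pixels of `binary` equal to 255.

    Returns a Polynomial object that can be called with column
    coordinates to predict row coordinates.
    """
    # Row and column indices of every pixel whose value is 255.
    rows, cols = np.nonzero(binary == 255)
    # Least squares fit; x is rescaled internally to improve conditioning.
    return Polynomial.fit(cols, rows, deg=degree)
```

A usage example: `curve = fit_rod_curve(binary)` followed by `curve(np.arange(binary.shape[1]))` evaluates the fitted rod curve at every column of the image.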

Examples:

The system was developed on the Windows 10 operating system with one NVIDIA GeForce GTX TITAN X GPU; the program is written in Python 3.8 and uses the PyTorch 1.12 framework.

The following mainly describes the recognition capability of the algorithm on the fishing rod in a natural scene.

The fishing rod picture is input into the system. The system first automatically identifies the dots on the background plate and stores the pixel coordinates and areas of all dots; it then reads them back and sorts the dots according to the Euclidean distance between each pair of feature points to determine the positions of all the large circles.

After the positions of the large circles are determined, the fishing rod picture is cut so that only the background plate part is kept. This picture is then cut again: the original image is divided by using all the dot coordinates obtained by dot identification to obtain a partial image of the fishing rod. The fishing rod pre-recognition module sets the pixel values of the areas where all dots are located to 0, converts the image into a binary image, performs a morphological closing operation and a Gaussian smoothing operation, and detects edges with the Canny edge detection algorithm. It then performs a morphological dilation operation, finds contours in the image and draws them on the original image, fits a curve, and takes two points on the curve. The coordinates of these two points in the middle-section partial image are converted back into coordinates in the full image and input into the image recognition module.

The image recognition module uses the Segment Anything large model (SAM) for recognition. When it receives the indication points transmitted by the fishing rod pre-recognition module, the image encoder of Segment Anything converts the input image into an image embedding vector and captures the semantic and feature information of the image. The prompt encoder converts the indication points into embedding vectors. The mask decoder predicts a segmentation mask from the image embedding and the prompt embedding to segment the different objects in the image. The decoder block updates the embeddings with prompt self-attention and cross-attention mechanisms and updates the model weights according to the mask. SAM generates a segmentation mask, segments the object in the image, and returns the result.

The fishing rod curve fitting module converts the fishing rod region map obtained by the image recognition module into a binary image, mapping True to 255 and False to 0. The binary image is then inverted so that the black area becomes 255 and the white area becomes 0.

Dilation and erosion are performed on the inverted image; the image is inverted again to recover the original color representation; Gaussian smoothing is applied and edges are detected; the coordinates of the pixels with a value of 255 are acquired and their row and column coordinates stored; and all points are fitted with a fifth-order polynomial to obtain the final fishing rod curve. The fishing rod recognition effect is shown in figs. 8 to 10.

Claims

1. The fishing rod fishing performance analysis system based on the image segmentation large model and the multi-element high-order regression fitting is characterized by comprising an image cutting module, a fishing rod pre-recognition module, an image recognition module and a fishing rod curve fitting module, wherein:

2. The fishing rod fishing performance analysis system based on the image segmentation large model and the multi-element high-order regression fitting according to claim 1, wherein the image cutting module is divided into three parts: a dot identification module, a dot ordering module and an image cutting module, wherein: the dot identification module is used for identifying all dots in the background plate by using a blob detector, wherein the dots comprise circular marker points of two different radii, and the world coordinates of all the small circle marker points are determined from the positions of the large circle marker points on the background plate, so that coordinate positioning on the background plate is realized; the dot ordering module reads the file stored locally by the dot identification module and sorts the dots by area; the image cutting module determines the coordinates of the cutting points from the first five dots in the ordering produced by the dot ordering module and cuts the image.

3. The fishing rod fishing performance analysis system based on the image segmentation large model and the multi-element high-order regression fitting according to claim 2, wherein the radius of the large circle marker points is 5 cm, the radius of the small circle marker points is 2.5 cm, the small circle marker points are pasted at intervals of 20 cm, and the large circle marker points are pasted on the peripheral edge of the background plate.

4. A method of using the fishing rod fishing performance analysis system of any one of claims 1-3 to perform fishing rod fishing performance analysis based on an image segmentation large model and multi-element high-order regression fitting, the method comprising the steps of:

Step one, cutting an image:

Step two, pre-identifying the fishing rod:

Step three, image recognition:

Step four, fitting a fishing rod curve: