CN116385313B - Infant interpersonal communication jigsaw training system and method based on image processing - Google Patents

Infant interpersonal communication jigsaw training system and method based on image processing

Info

Publication number
CN116385313B
CN116385313B CN202310538346.6A CN202310538346A
Authority
CN
China
Prior art keywords
image
gray level
frequency region
low
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310538346.6A
Other languages
Chinese (zh)
Other versions
CN116385313A (en)
Inventor
刘国雄
陈庆荣
程平
晏阳
李风华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Normal University
Original Assignee
Nanjing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Normal University filed Critical Nanjing Normal University
Priority to CN202310538346.6A priority Critical patent/CN116385313B/en
Publication of CN116385313A publication Critical patent/CN116385313A/en
Application granted granted Critical
Publication of CN116385313B publication Critical patent/CN116385313B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention relates to the technical field of image processing, and in particular to an infant interpersonal communication jigsaw training system and method based on image processing. The method collects static frame images of an infant during the jigsaw process to obtain corresponding gray images; takes a gray image with motion blur as the target gray image, obtains the spectrum image of the target gray image, and centers the spectrum image to obtain the target spectrum image; divides the target spectrum image into a low-frequency region and a high-frequency region; obtains a first low-frequency region and a first high-frequency region based on the low-frequency region; obtains the blur coefficient of the blur kernel according to the low-frequency region, the high-frequency region, the first low-frequency region and the first high-frequency region; deblurs the target gray image with the blur coefficient to obtain a deblurred static frame image; and performs the infant's jigsaw training based on the deblurred static frame image. The invention improves the practicability and stability of the intelligent companion-play system by relaxing the quality requirements on the monitoring image.

Description

Infant interpersonal communication jigsaw training system and method based on image processing
Technical Field
The invention relates to the technical field of image processing, in particular to an infant interpersonal communication jigsaw training system and method based on image processing.
Background
Jigsaw puzzles develop infants' multiple intelligences and improve their coordination, social skills, and abilities of exploration, observation and discrimination. Intelligent games on mobile phones and tablet computers are now common, but such electronic products harm children's bodies and vision to a certain extent and do not cultivate hands-on ability. Hence the application that combines puzzle toys with artificial intelligence for accompanied play, namely the intelligent companion-play system: the system issues scene instructions, and the child gradually places the target puzzle pieces into the matching scene puzzle according to the system prompts, cultivating the child's communication and comprehension abilities and easing the child-raising difficulties of parents who are short of time, communicate poorly, or lack awareness of how to guide.
The intelligent companion-play system needs to recognize the child's behavior during the puzzle and the puzzle result in order to guide, motivate and correct the child. Real-time monitoring requires high recognition capability for different jigsaw toys and infant behaviors so that the system can guide and motivate in a timely and accurate manner. Conventional jigsaw recognition uses image edge matching, but home monitors, mobile phones and similar devices have limited hardware; when the infant grabs and splices puzzle pieces, factors such as poor lighting, camera stutter and shaking of the desktop bracket can cause motion blur in the images, so that target information cannot be accurately recognized from the collected images.
At present, the blur kernel is conventionally estimated with a variational Bayesian algorithm, and the estimated kernel is used to deblur the motion-blurred image to obtain a sharp image. However, this method relies on maximizing the posterior probability of the estimated blur kernel and requires a large amount of iterative computation, so its timeliness and computational cost are clearly unsuitable for real-time monitoring of the infant's jigsaw play. As a result, the intelligent companion-play system cannot provide timely and accurate jigsaw guidance to the infant, and its practicability is poor.
Disclosure of Invention
In order to solve the problem that deblurring a blurred image with a blur kernel estimated by the variational Bayesian algorithm is too slow, so that the infant cannot be given timely and accurate jigsaw guidance, the invention provides an infant interpersonal communication jigsaw training system and method based on image processing, with the following technical scheme:
In a first aspect, an embodiment of the present invention provides an infant interpersonal communication jigsaw training method based on image processing, including:
collecting static frame images of infants in the jigsaw process to obtain corresponding gray images;
acquiring a gray level image with motion blur as a target gray level image, acquiring a spectrum image of the target gray level image, and centering the spectrum image to obtain a target spectrum image; dividing the target spectrum image into a low frequency region and a high frequency region by a segmentation threshold;
connecting each edge point on the low-frequency region with the center point of the target spectrum image, acquiring a segmentation node corresponding to the connection line according to the brightness of each point on the connection line, connecting the segmentation nodes of all the connection lines to obtain a closed edge, and taking the region corresponding to the closed edge as a first low-frequency region; taking the area except the first low-frequency area in the target frequency spectrum image as a first high-frequency area;
acquiring a blur coefficient of a blur kernel according to the low-frequency region, the high-frequency region, the first low-frequency region and the first high-frequency region; based on the gray-level difference between each pixel point in the target gray image and its surrounding pixel points, deblurring the target gray image by using the blur coefficient to obtain a deblurred static frame image;
and performing jigsaw training of the infant based on the deblurred static frame image.
Further, the method for obtaining the segmentation node of the corresponding connecting line according to the brightness of each point on the connecting line comprises the following steps:
respectively performing curve fitting at least twice on the brightness of the points on the connecting line to obtain corresponding brightness change curve functions; obtaining the fitting brightness of each point on the connecting line by using each brightness change curve function; calculating the squared difference between the brightness of each point on the connecting line and the corresponding fitting brightness, and summing these squared differences; taking the brightness change curve function corresponding to the minimum sum as the optimal brightness change curve function; and obtaining the segmentation node of the corresponding connecting line by using the Lagrange mean value theorem based on the optimal brightness change curve function of the connecting line.
Further, the method for acquiring the blur coefficient of the blur kernel according to the low-frequency region, the high-frequency region, the first low-frequency region and the first high-frequency region comprises the following steps:
respectively obtaining the areas of the low-frequency region, the high-frequency region, the first low-frequency region and the first high-frequency region; taking the area of the first low-frequency region as the numerator and the area of the first high-frequency region as the denominator to obtain a corresponding ratio as the first ratio; taking the area of the high-frequency region as the numerator and the area of the low-frequency region as the denominator to obtain a corresponding ratio as the second ratio; and taking the product of the first ratio and the second ratio as the blur coefficient.
Further, the method for deblurring the target gray image by using the blur coefficient, based on the gray-level difference between each pixel point in the target gray image and its surrounding pixel points, to obtain a deblurred static frame image comprises the following steps:
for any pixel point in the target gray image, setting a window of preset size centered on that pixel point, counting the gray levels present in the window, calculating for each gray level the proportion of its pixel count to the total number of pixels in the window, calculating an entropy value from these proportions, normalizing the entropy value to obtain a normalized entropy value, taking the product of the normalized entropy value and the blur coefficient as the denominator and the gray value of the pixel point as the numerator to obtain a corresponding ratio, and taking this ratio as the deblurred gray value of the pixel point;
and obtaining a deblurring gray value of each pixel point in the target gray image to obtain a corresponding deblurring static frame image.
Further, the method for acquiring the gray image with motion blur as the target gray image comprises the following steps:
and performing convolution operation on the gray level image by using the Laplace convolution check gray level image with the set size, calculating variance of the result of each convolution operation, normalizing the variance to obtain normalized variance, and confirming that motion blur exists in the gray level image when the normalized variance is smaller than a variance threshold value, wherein the gray level image is taken as a target gray level image.
In a second aspect, an embodiment of the present invention provides an image processing-based training system for inter-personal communication puzzles for infants, the system comprising: a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of any of the methods described above when executing the computer program.
The invention has the following beneficial effects:
according to the invention, the corresponding gray level image is obtained by collecting the static frame image of the infant in the jigsaw process, so that the behavior of the infant and the jigsaw information cannot be identified because the monitoring image is possibly blurred during monitoring, and further the correction and the guide cannot be timely and accurately performed, firstly, the gray level image with motion blur is obtained as a target gray level image, the frequency spectrum image of the target gray level image is obtained, and the frequency spectrum image is centered to obtain the target frequency spectrum image so as to be converted into a frequency domain from a space domain, so that the motion blur condition can be more intuitively reflected; considering that motion blur weakens the outline of an original clear edge, so that part of weakened high-frequency information is changed into low-frequency information, and then the low-frequency information and the original low-frequency information in the original clear image are both generated in a segmented low-frequency region in a target frequency spectrum image, dividing the target frequency spectrum image into a low-frequency region and a high-frequency region, and further acquiring a first low-frequency region and a first high-frequency region according to the brightness of each point on a connecting line of the edge point of the low-frequency region and the center point of the target frequency spectrum image, wherein the brightness of each point is used for representing the distribution situation of the actually corresponding high-frequency region and the low-frequency region in the image when the motion blur does not occur; and then, based on the change before and after the image motion blur, obtaining the blur coefficient of a blur kernel according to a low-frequency region, a high-frequency region, a first low-frequency region and a first high-frequency region, and further, in order to 
improve the deblurring effect, deblurring the target gray image by using the blur coefficient based on the gray difference between each pixel point in the target gray image and surrounding pixel points, so as to obtain a deblurred static frame image.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart showing steps of an image processing-based training method for inter-personal communication puzzles for infants according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a fitted curve provided in an embodiment of the present invention.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following detailed description is given below of specific implementation, structure, characteristics and effects of the infant interpersonal communication jigsaw training system and method based on image processing according to the invention by combining the accompanying drawings and the preferred embodiment. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention addresses the following situation: the intelligent companion-play system needs to recognize the infant's behavior during the jigsaw process and the jigsaw result in order to guide, motivate and correct errors, and it generally connects to home monitoring, a mobile phone or similar equipment to monitor the jigsaw process in real time. Real-time monitoring requires high recognition capability regardless of the jigsaw toy or the infant's behavior so that guidance and motivation are timely and accurate. Conventional jigsaw recognition uses image edge matching, but home monitors, mobile phones and similar devices have limited hardware; when the infant grabs and splices puzzle pieces, factors such as poor lighting, camera stutter and shaking of the desktop bracket may cause motion blur in the images, so that target information cannot be recognized from the collected images and correct guidance cannot be given.
The invention provides a specific scheme of an infant interpersonal communication jigsaw training system and method based on image processing, which is specifically described below with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of the steps of an infant interpersonal communication jigsaw training method based on image processing is shown, the method comprising:
and S001, collecting a static frame image of the infant in the jigsaw process, and obtaining a corresponding gray image.
Specifically, the intelligent companion-play system is matched with a jigsaw puzzle box; the system installed on a mobile phone can directly obtain camera permission or connect to a monitor to observe the infant. After the phone is fixed on a bracket and adjusted to a suitable shooting angle, the intelligent companion-play system issues a puzzle instruction, for example: "Xiaoming's toy is in the kitchen". After hearing the instruction, the child needs to find the kitchen scene card and insert it into the puzzle base plate, then find the puzzle pieces for Xiaoming and the toy and attach them to the background board. Instructions involve characters, emotions, actions, objects, speech bubbles and so on, each with corresponding pieces in the puzzle box. The child must understand the meaning of the instruction and then carry it out, while the intelligent companion-play system monitors the child grabbing and selecting puzzle pieces and splicing and attaching them on the puzzle board, providing corresponding guidance, motivation and error correction.
A static frame image of the infant executing the instruction is collected and converted to grayscale to obtain the corresponding gray image, which better captures image edge information and reduces the amount of computation. Grayscale conversion is a known technique and is not described in detail here.
Step S002, a gray level image with motion blur is obtained as a target gray level image, a spectrum image of the target gray level image is obtained, and the spectrum image is centered to obtain a target spectrum image; the target spectrum image is divided into a low frequency region and a high frequency region by a division threshold.
Specifically, at night or in poor lighting the camera preferentially adapts to the surrounding light when processing images, so during shooting, phone shake, movement of the target within the lens, or slight occlusion of the light can cause motion blur in the collected images; this is a normal phenomenon. However, when the intelligent companion-play system monitors the infant's jigsaw behavior, good lighting cannot be guaranteed at all times and high-end camera equipment is not always available, so adding a deblurring module to the system's image recognition and analysis is of great significance for improving the practicability of intelligent companion-play software and the timeliness of the artificial-intelligence guidance.
First, whether motion blur exists in a collected image is confirmed, and the image with motion blur is taken as the target image for subsequent deblurring. The acquisition method of the target image is as follows: convolve the gray image with a Laplacian convolution kernel of a set size, calculate the variance of each convolution result, and normalize the variance to obtain a normalized variance; when the normalized variance is smaller than a variance threshold, confirm that motion blur exists in the gray image and take that gray image as the target gray image.
As an example, a 3×3 Laplacian convolution kernel is convolved with the gray image. The Laplacian operator highlights regions of the gray image where the gradient changes rapidly; if motion blur occurs, the gray image contains fewer edges, so the variance of each Laplacian convolution result is calculated. A low variance means the gray image has almost no edges, so the variance threshold is set to 0.3; the variance is normalized to obtain a normalized variance, and when the normalized variance is smaller than the variance threshold it is determined that motion blur exists in the gray image. Normalization is a known technique and is not described in detail here.
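A minimal sketch of this blur check, assuming Python with NumPy; the v/(v+1) squashing used to normalize the variance and the function name are illustrative guesses, since the patent only says the variance is normalized.

```python
import numpy as np

def is_motion_blurred(gray, var_threshold=0.3):
    """Convolve the gray image with a 3x3 Laplacian kernel and flag
    motion blur when the normalized variance of the response is low
    (few edges survive in a blurred image)."""
    lap = np.array([[0, 1, 0],
                    [1, -4, 1],
                    [0, 1, 0]], dtype=np.float64)
    g = gray.astype(np.float64)
    h, w = g.shape
    # valid-mode 2-D convolution built from shifted slices (the kernel is
    # symmetric, so convolution and correlation coincide)
    resp = sum(lap[i, j] * g[i:h - 2 + i, j:w - 2 + j]
               for i in range(3) for j in range(3))
    var = resp.var()
    norm_var = var / (var + 1.0)  # squash into [0, 1) -- assumed scheme
    return bool(norm_var < var_threshold)
```

A flat (edge-free) image yields a response variance of zero and is flagged as blurred, while a strongly textured image yields a normalized variance near 1 and passes the check.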
The motion-blur removal process is a deconvolution: assuming the current motion-blurred image was formed by convolving the original image with a blur kernel, the motion blur can be removed by deconvolving the current image once the blur kernel is estimated. The method therefore performs deblurring based on the spectrum image of the target gray image. First, the target gray image is converted from the spatial domain to the frequency domain with the Fourier transform, and the bright and dark regions are then separated with the maximum inter-class variance method, specifically: perform a discrete Fourier transform on the target gray image to obtain a spectrum image; center the spectrum image to obtain the target spectrum image, that is, translate the spectrum so that the low frequencies are shifted to the center position. The brightness of each point in the spectrum image describes the energy in the target gray image: pixels with smaller gradients in the spatial domain carry more energy and appear brighter and at lower frequency in the spectrum, while pixels with larger gradients have more energy attenuation and appear darker and at higher frequency. Motion blur weakens the original edge information, so a motion-blurred image necessarily has more low-frequency information. Therefore, the brightness of each point in the target spectrum image is normalized, the maximum inter-class variance method is applied to the normalized brightness to obtain the segmentation threshold that maximizes the inter-class variance between the low-frequency and high-frequency brightness, the points of the target spectrum image are divided into two classes by this threshold, and the target spectrum image is thereby divided into a high-frequency region and a low-frequency region.
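The spectrum construction and the maximum inter-class variance (Otsu) split described above can be sketched as follows, assuming Python with NumPy; the log-magnitude used as spectrum brightness and the 256-bin quantization are common conventions, not details stated by the patent.

```python
import numpy as np

def split_spectrum(gray, levels=256):
    """FFT the image, center the spectrum, normalize the log-magnitude
    to [0, 1], then split the points into a low-frequency (bright) class
    and a high-frequency (dark) class with an Otsu-style threshold.
    Returns a boolean mask that is True on the low-frequency region."""
    f = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    mag = np.log1p(np.abs(f))                       # spectrum brightness
    norm = (mag - mag.min()) / (np.ptp(mag) + 1e-12)
    # maximum inter-class variance (Otsu) over quantized brightness
    hist, edges = np.histogram(norm, bins=levels, range=(0.0, 1.0))
    p = hist / hist.sum()
    omega = np.cumsum(p)                            # class-0 probability
    mids = (edges[:-1] + edges[1:]) / 2
    mu = np.cumsum(p * mids)                        # class-0 mean mass
    mu_t = mu[-1]
    sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega) + 1e-12)
    t = mids[np.argmax(sigma_b)]
    return norm >= t                                # True = low frequency
```

The centered DC component is the brightest spectrum point, so it always lands in the low-frequency mask.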
Step S003, connecting each edge point on the low-frequency area with the center point of the target spectrum image, obtaining the segmentation nodes of the corresponding connection lines according to the brightness of each point on the connection lines, and connecting the segmentation nodes of all the connection lines to obtain a closed edge, wherein the area corresponding to the closed edge is used as a first low-frequency area; the region other than the first low frequency region in the target spectrum image is taken as a first high frequency region.
Specifically, motion blur arises when desktop shake or the speed of a person's movement exceeds what the camera's exposure time can freeze, so the edges of people and objects are smeared: the originally sharp edge contours are weakened, and the partially weakened high-frequency information becomes low-frequency information. Both this converted low-frequency information and the low-frequency information originally in the image therefore fall inside the segmented low-frequency region of the target spectrum image. To estimate the blur kernel, the proportion of the low-frequency region that was formed from weakened high-frequency information must first be estimated on the target spectrum image, specifically: connect each edge point of the low-frequency region with the center point of the target spectrum image; perform curve fitting at least twice on the brightness of the points on each connecting line to obtain corresponding brightness change curve functions; obtain the fitting brightness of each point on the connecting line with each function; calculate the squared difference between the brightness of each point and the corresponding fitting brightness, and sum these squared differences; take the brightness change curve function with the minimum sum as the optimal brightness change curve function; obtain the segmentation node of each connecting line with the Lagrange mean value theorem based on its optimal brightness change curve function; and connect the segmentation nodes of all connecting lines to obtain a closed edge, taking the region enclosed by this edge as the first low-frequency region.
As an example, the energy at the center point of the target spectrum image is maximal and the brightness decreases from the center outward. A connecting line is drawn from each edge point on the edge of the low-frequency region to the center point of the target spectrum image, and the brightness of every point on each line from edge point to center point is obtained. For any connecting line, curve fitting is performed on the brightness of all points on the line, and the optimal brightness change curve function is obtained from the fitting results as:
$$F(x)=\underset{f}{\arg\min}\sum_{c=1}^{m}\left(L_{c}-\hat{L}_{c}\right)^{2}$$

wherein $F(x)$ is the optimal brightness change curve function; $\arg\min$ selects the fitted curve with the minimum value; $L_{c}$ is the brightness of the $c$-th point on the connecting line; $\hat{L}_{c}$ is the brightness of the $c$-th point on the corresponding fitted curve, namely the fitting brightness; $m$ is the number of points on the connecting line.
It should be noted that $\left(L_{c}-\hat{L}_{c}\right)^{2}$ is the square of the brightness residual of the $c$-th point on the connecting line, that is, of the difference between the actual brightness and the fitting brightness; the larger the difference, the less suitable the corresponding fitted curve. The sum of the squared brightness residuals over all points on the connecting line, $\sum_{c=1}^{m}\left(L_{c}-\hat{L}_{c}\right)^{2}$, is computed; the smaller this value, the more suitable the fitted curve. The fitted curve corresponding to the minimum value is therefore the best fitted curve, and its brightness change curve function is the optimal brightness change curve function.
And obtaining the optimal brightness change curve function corresponding to each connecting line by utilizing the formula of the optimal brightness change curve function.
Although motion blur caused by dithering converts part of the high-frequency information into low-frequency information, it only attenuates the original gradients in the image rather than erasing them into a completely uniform region. The low-frequency region should therefore be further divided into the original low-frequency region and the low-frequency region formed by the attenuation of part of the high-frequency information. The latter is distributed almost entirely near the boundary between the high-frequency and low-frequency regions, so the optimal brightness change curve function of each connecting line is differentiated, and the segmentation node on each line is obtained according to the Lagrange mean value theorem. The calculation formula of the segmentation node is:
$$F'(\varepsilon)=\frac{L_{O}-L_{e}}{m}$$

wherein $F'(\varepsilon)$ is the derivative of the optimal brightness change curve function of the connecting line at the $\varepsilon$-th point; $L_{O}$ is the brightness of the center point of the target spectrum image; $L_{e}$ is the brightness of the edge point corresponding to the connecting line; $L_{O}-L_{e}$ is the brightness difference between the two end points of the connecting line; $m$ is the number of points on the connecting line.
It should be noted that $\frac{L_{O}-L_{e}}{m}$ is the slope of the secant line through the two end points of the fitted curve. By the Lagrange mean value theorem, when $F'(\varepsilon)=\frac{L_{O}-L_{e}}{m}$, the $\varepsilon$-th point is the node at which the trend of the fitted curve changes, as shown in fig. 2: before this node lies the part where weakened high-frequency information has become low-frequency information, and after it lies the original low-frequency information already present in the image, so the brightness of the front section changes quickly while that of the rear section changes smoothly and slowly. This change node is therefore taken as the segmentation node on the corresponding connecting line, separating the original low-frequency region from the low-frequency region formed after part of the high-frequency information was weakened.
The segmentation node on each connecting line is obtained with the above method, and the nodes are then connected to obtain a closed edge. This closed edge is the edge of the original low-frequency information of the original sharp image as estimated on the target spectrum image; the region it encloses is taken as the first low-frequency region, and the remaining region of the target spectrum image outside the first low-frequency region is taken as the first high-frequency region, namely the region corresponding to the original high-frequency information of the original sharp image as estimated on the target spectrum image.
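The per-line processing of step S003 can be sketched as below, assuming Python with NumPy: the brightness along one edge-to-center connecting line is fitted with several candidate polynomial degrees (the "at least two" fits), the fit with the smallest sum of squared residuals is kept, and the segmentation node is taken as the sample whose fitted derivative is closest to the secant slope of the line, per the Lagrange mean value theorem. The candidate degrees and the discrete nearest-match search are illustrative assumptions.

```python
import numpy as np

def segmentation_node(brightness, degrees=(2, 3, 4)):
    """Return the index of the segmentation node on one connecting line."""
    y = np.asarray(brightness, dtype=np.float64)
    x = np.arange(len(y), dtype=np.float64)
    # keep the polynomial fit with the smallest sum of squared residuals
    best_sse, best_coeffs = None, None
    for d in degrees:
        coeffs = np.polyfit(x, y, d)
        sse = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
        if best_sse is None or sse < best_sse:
            best_sse, best_coeffs = sse, coeffs
    # secant slope between the two end points (the patent writes this
    # as (L_O - L_e) / m with L_O the center and L_e the edge brightness)
    secant = (y[-1] - y[0]) / (len(y) - 1)
    deriv = np.polyval(np.polyder(best_coeffs), x)
    # Lagrange mean value theorem: some interior point's derivative equals
    # the secant slope; pick the closest sample index
    return int(np.argmin(np.abs(deriv - secant)))
```

Connecting the node indices of all lines then traces the closed edge of the first low-frequency region.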
Step S004: obtaining the blur coefficient of the blur kernel according to the low-frequency region, the high-frequency region, the first low-frequency region, and the first high-frequency region; and deblurring the target gray image with the blur coefficient based on the gray-level difference between each pixel in the target gray image and its surrounding pixels, thereby obtaining a deblurred static frame image.
Specifically, although points in the frequency domain do not correspond one-to-one with points in the spatial domain, the ratio between high-frequency and low-frequency information must be consistent. The blur coefficient of the blur kernel can therefore be obtained from the proportional relationship between the high-frequency and low-frequency regions of the target spectrum image obtained in step S002 and the proportional relationship between the first high-frequency and first low-frequency regions of the estimated original sharp image obtained in step S003. Specifically: the areas of the low-frequency region, the high-frequency region, the first low-frequency region, and the first high-frequency region are obtained respectively; the ratio with the area of the first low-frequency region as numerator and the area of the first high-frequency region as denominator is taken as the first ratio; the ratio with the area of the high-frequency region as numerator and the area of the low-frequency region as denominator is taken as the second ratio; and the product of the first ratio and the second ratio is taken as the blur coefficient.
As an example, since sharp image A ⊙ blur kernel = motion-blurred image B, where ⊙ denotes point-wise multiplication in the frequency domain, this can be rewritten as: (high-frequency to low-frequency ratio of sharp image A) × blur coefficient = (high-frequency to low-frequency ratio of motion-blurred image B). The calculation formula of the blur coefficient is:
$$\omega=\frac{B_{P}}{B_{Q}}\times\frac{A_{Q}}{A_{P}}$$
wherein $\omega$ is the blur coefficient; B denotes the motion-blurred image; A denotes the sharp image; P denotes high-frequency information; Q denotes low-frequency information; $B_{P}$ is the high-frequency content of the blurred image, i.e., the area of the high-frequency region; $B_{Q}$ is the low-frequency content of the blurred image, i.e., the area of the low-frequency region; $A_{Q}$ is the low-frequency area of the sharp image, i.e., the area of the first low-frequency region; $A_{P}$ is the high-frequency area of the sharp image, i.e., the area of the first high-frequency region.
It should be noted that $\frac{B_{P}}{B_{Q}}$ is the ratio of high-frequency to low-frequency content on the blurred image; similarly, $\frac{A_{Q}}{A_{P}}$ is the ratio of low-frequency to high-frequency content on the sharp image. Then $\frac{B_{P}}{B_{Q}}\times\frac{A_{Q}}{A_{P}}$ is the conversion relationship between the sharp image and the blurred image, i.e., the blur coefficient $\omega$ of the blur kernel.
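The blur-coefficient computation above reduces to four areas and two ratios. A minimal sketch (argument names are illustrative):

```python
def blur_coefficient(area_low, area_high, first_area_low, first_area_high):
    """Blur coefficient of the blur kernel (step S004).

    area_low, area_high: areas of the low/high-frequency regions of the
    blurred target spectrum image (B_Q, B_P).
    first_area_low, first_area_high: areas of the first low/high-frequency
    regions estimated for the original sharp image (A_Q, A_P)."""
    first_ratio = first_area_low / first_area_high   # A_Q / A_P
    second_ratio = area_high / area_low              # B_P / B_Q
    return first_ratio * second_ratio                # omega
```

When the blurred image's frequency split matches the estimated sharp one, the two ratios cancel and the coefficient is 1, i.e., no net blur.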
The blur coefficient $\omega$ can be regarded as the overall blur coefficient for the whole image, whereas an actual blur kernel blurs unevenly: the degree of blur is larger at edges and smaller in regions of uniform gray. On the basis of $\omega$, an adaptive local blur coefficient is therefore obtained for each pixel according to the texture complexity of different areas of the target gray image; that is, the blur kernel is separated into a window and a blur coefficient. The target gray image is then deblurred with the blur coefficient based on the gray-level difference between each pixel and its surrounding pixels, yielding the deblurred static frame image. The specific process is as follows: for any pixel in the target gray image, a window of preset size is centered on the pixel; the gray-level types in the window are counted; the ratio of the number of pixels under each gray-level type to the total number of pixels in the window is calculated; an entropy value is calculated from these ratios and normalized to obtain a normalized entropy value; the product of the normalized entropy value and the blur coefficient is taken as the denominator and the gray value of the pixel as the numerator, and the resulting ratio is taken as the deblurred gray value of the pixel. The deblurred gray value of every pixel in the target gray image is obtained in this way, giving the corresponding deblurred static frame image.
As one example, an $n \times n$ window is used to traverse the target gray image, where n is an odd number; in this scheme n is 9. A 9×9 window is obtained with the r-th pixel of the target gray image as its center, and the deblurred gray value of the r-th pixel is calculated from the gray value of each pixel in the window. The calculation formula of the deblurred gray value is:
$$G'_{r}=\frac{G_{r}}{\omega\cdot\operatorname{th}\!\left(-\sum_{v=1}^{V}\frac{m_{v}}{M}\ln\frac{m_{v}}{M}\right)}$$
wherein $G'_{r}$ is the deblurred gray value of the r-th pixel; $G_{r}$ is the gray value of the r-th pixel in the target gray image; th is the hyperbolic tangent function; $m_{v}$ is the number of pixels of the v-th gray-value type in the window; $M$ is the total number of pixels in the window; ln is the logarithm with the natural constant as base; $\omega$ is the blur coefficient; $V$ is the number of gray-value types in the window.
It should be noted that $\frac{m_{v}}{M}$ is the proportion of pixels of the v-th gray-value type among all pixels in the window, and $-\sum_{v=1}^{V}\frac{m_{v}}{M}\ln\frac{m_{v}}{M}$ is the gray information entropy within the window. The entropy reflects the degree of gray-level disorder in the window and characterizes the texture complexity of the local area around the window's center pixel: the higher the entropy, the more complex the local texture. The hyperbolic tangent function th normalizes the entropy proportionally; the larger the entropy, the closer the result is to 1 within the interval (0, 1). Then $\omega\cdot\operatorname{th}(\cdot)$ is the adaptive local blur coefficient: texture areas with different degrees of blur are given different adaptive weight coefficients for the blur coefficient. Assuming that the gray information of the r-th pixel in the original sharp image has been attenuated by the blur kernel to $G_{r}$, dividing $G_{r}$ by the adaptive local blur coefficient recovers an estimate of the original gray value, which is exactly the deblurred gray value $G'_{r}$.
The deblurred gray value of each pixel in the target gray image is obtained with the above calculation formula, realizing the deblurring processing and yielding the deblurred static frame image.
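The per-pixel computation can be sketched as a direct reading of the formula above. The reflect padding and the small floor that guards against a zero denominator in perfectly uniform windows (where the entropy, and hence th(entropy), is 0) are assumptions the patent does not specify.

```python
import numpy as np

def deblur(gray, omega, n=9):
    """Adaptive per-pixel deblurring: G'_r = G_r / (omega * th(entropy)).

    gray: uint8 target gray image; omega: blur coefficient; n: odd window
    size (9 in the patent's example)."""
    pad = n // 2
    padded = np.pad(gray.astype(np.float64), pad, mode='reflect')
    out = np.empty_like(gray, dtype=np.float64)
    h, w = gray.shape
    for i in range(h):
        for j in range(w):
            win = padded[i:i + n, j:j + n]
            # Count pixels per gray-value type and compute the entropy.
            _, counts = np.unique(win, return_counts=True)
            p = counts / win.size
            entropy = -np.sum(p * np.log(p))
            weight = np.tanh(entropy)            # normalise to (0, 1)
            denom = max(weight * omega, 1e-6)    # floor: an added safeguard
            out[i, j] = padded[i + pad, j + pad] / denom
    return np.clip(out, 0, 255).astype(np.uint8)
```

The double loop keeps the sketch readable; a vectorised or integral-histogram version would be preferred for real-time monitoring.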
And step S005, performing jigsaw training of the infant based on the deblurred static frame image.
Specifically, the deblurred static frame image is obtained through the deblurring processing, which guarantees to the greatest extent the sharpness of the images collected during monitoring and avoids motion blur caused by shaking of the person or of the phone or camera. When the intelligent companion-play system issues a scene instruction, it performs target recognition on the deblurred static frame image: the puzzle piece grasped by the infant is segmented using an existing human-action recognition module and edge segmentation, and the segmented image is template-matched against a pre-stored template library. Multi-angle template matching can be realized based on matchTemplate + rotation + image pyramid; template matching is a known technique and is not repeated in this scheme. If the puzzle piece grasped by the infant is consistent with the instructed piece, voice encouragement is given; if not, relevant guidance is provided. The specific guidance mode is not the focus of the present invention and is not described, and in this way the infant's jigsaw ability is trained.
As an example, since the intelligent companion-play system is supplied together with the jigsaw puzzle, it contains the puzzle templates itself, so verification can be performed by template matching. The template-matching method computes the structural similarity between the target image (the deblurred static frame image) and the template image. Structural similarity is a known algorithm whose value lies between 0 and 1; the larger the value, the more similar the images. Because image sharpness is guaranteed by the preprocessing stage, the only factors affecting the matching result are occlusion and viewing angle; the latter is handled by multi-angle template matching based on matchTemplate + rotation + image pyramid. Considering occlusion, the structural-similarity threshold is set to 0.7 to allow fault tolerance: a structural similarity greater than or equal to 0.7 is regarded as a successful match, and less than 0.7 as a failed match. The specific process of the infant jigsaw training is as follows:
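The 0.7 acceptance test can be sketched as a single-window structural similarity. This is a simplified whole-patch variant of the standard algorithm, which is normally computed over sliding local windows (e.g. scikit-image's `structural_similarity`); the constants use the conventional K1 = 0.01 and K2 = 0.03 with dynamic range L = 255.

```python
import numpy as np

def ssim_global(a, b, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Single-window SSIM between two equally sized gray patches."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return (((2 * mu_a * mu_b + c1) * (2 * cov + c2))
            / ((mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2)))

def matches_template(patch, template, threshold=0.7):
    """Fault-tolerant acceptance test with the patent's 0.7 threshold."""
    return ssim_global(patch, template) >= threshold
```

Identical patches score exactly 1.0, so occlusion or residual angle error can lower the score by up to 0.3 before a match is rejected.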
1. Human skeleton contour recognition is performed on the deblurred infant monitoring image (the deblurred static frame image) using an OpenPose network trained on the Human Pose Evaluator human-skeleton image database. The input of the network is the deblurred infant monitoring image, and the output is the image with the human skeleton marked. OpenPose is a commonly used pose-recognition network for single or multiple persons; its training process and loss function are well known and are not described here.
2. Since guidance must be given while the infant executes instructions, the running time of the intelligent companion-play system must be saved and real-time performance improved. The detection range box for the target object is therefore placed at the tail end of the vector corresponding to the forearm-hand connecting skeleton. The detection range box is circular, and its radius is set, based on empirical human body proportions, to 1/4 of the length of the marked line segment corresponding to the forearm. The target object is detected within this box by sliding the template corresponding to the instruction over the box, using the template-matching method described above. While the grasped target object is being moved to the background board, the grasped piece is confirmed as the target piece only if matching succeeds in more than 3 frames during monitoring.
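Placing the circular detection range box can be sketched from two skeleton key-points, assuming the pose output provides the two ends of the forearm segment (the point names and the choice of the forearm as the reference segment are assumptions):

```python
import numpy as np

def detection_circle(forearm_start, hand_end):
    """Centre and radius of the circular detection range box.

    The circle is anchored at the tail end of the forearm-to-hand vector,
    with radius 1/4 of the forearm segment length per the body-proportion
    rule above. Points are (row, col) or (x, y) pairs; any consistent
    convention works."""
    forearm_start = np.asarray(forearm_start, dtype=np.float64)
    hand_end = np.asarray(hand_end, dtype=np.float64)
    radius = np.linalg.norm(hand_end - forearm_start) / 4.0
    return hand_end, radius
```

Restricting template matching to this circle, instead of the whole frame, is what saves running time in step 2.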
3. Similarly, the background board is recognized in advance by template matching (same method as above). A known voice instruction corresponds to a picture-slot area on the background board, and that slot area is marked. After a successful grasp is confirmed, whether the infant places the grasped target piece into the correct target slot on the background board is determined from the extension direction of the vector corresponding to the forearm-hand connecting skeleton, so that the infant can be encouraged, corrected, and guided during the grasp-and-move process.
4. During grasping and moving, optical-flow field recognition is performed on consecutive frames using an optical-flow method. When the hand grasps the target piece and moves, an optical-flow field exists between adjacent frames, and the optical-flow field area within the detection range box of each pair of adjacent frames is marked dynamically. The optical-flow method is a known technique and is not repeated here.
5. When the optical-flow field area intersects the marked picture-slot area, i.e., the coordinates of some point in the two areas are identical and coincide, the marked slot area on the background board is re-identified after a delay of 2-3 seconds and template-matched against the splicing-result template (same matching method as above) to judge whether the splicing result is correct.
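The trigger in steps 4-5 can be sketched as follows. Dense optical flow (e.g. the Farneback method) would normally produce the motion region; simple frame differencing is substituted here to keep the sketch dependency-free, which is an assumption of this sketch, not the patent's method. The intersection test is a direct reading of step 5.

```python
import numpy as np

def motion_mask(prev_frame, next_frame, thresh=15):
    """Cheap stand-in for the dense optical-flow field between adjacent
    frames: pixels whose gray level changed by more than `thresh`."""
    diff = np.abs(next_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > thresh

def regions_intersect(flow_mask, slot_mask):
    """True when the dynamically marked motion region and the marked
    picture-slot region share at least one coordinate (step 5 trigger)."""
    return bool(np.any(flow_mask & slot_mask))
```

Once this returns True, the system waits 2-3 seconds and re-runs template matching on the slot area as described above.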
It should be noted that the training methods of the Human Pose Evaluator database and the OpenPose network described above are known techniques; only the volume of training data needs to be adjusted. The larger the training data, the more accurate the marking result of the OpenPose network; it can be adjusted as required and is not described further.
Based on the same inventive concept as the method embodiment, an embodiment of the present invention further provides an infant interpersonal communication jigsaw training system based on image processing, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the above embodiment of the image-processing-based infant interpersonal communication jigsaw training method are implemented, such as the steps shown in fig. 1. The method has already been described in detail in the above embodiments and is not repeated.
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
The foregoing description of the preferred embodiments of the present invention is not intended to be limiting, but rather, any modifications, equivalents, improvements, etc. that fall within the principles of the present invention are intended to be included within the scope of the present invention.

Claims (3)

1. The infant interpersonal communication jigsaw training method based on image processing is characterized by comprising the following steps of:
collecting static frame images of infants in the jigsaw process to obtain corresponding gray images;
acquiring a gray level image with motion blur as a target gray level image, acquiring a spectrum image of the target gray level image, and centering the spectrum image to obtain a target spectrum image; dividing the target spectrum image into a low frequency region and a high frequency region by a segmentation threshold;
connecting each edge point on the low-frequency region with the center point of the target spectrum image, acquiring a segmentation node corresponding to the connection line according to the brightness of each point on the connection line, connecting the segmentation nodes of all the connection lines to obtain a closed edge, and taking the region corresponding to the closed edge as a first low-frequency region; taking the area except the first low-frequency area in the target frequency spectrum image as a first high-frequency area;
acquiring a blur coefficient of a blur kernel according to the low frequency region, the high frequency region, the first low frequency region and the first high frequency region; based on the gray level difference between each pixel point in the target gray level image and surrounding pixel points, deblurring the target gray level image by using the blur coefficient to obtain a deblurred static frame image;
performing jigsaw training of the infant based on the deblurred static frame image;
the method for acquiring the split nodes of the corresponding connecting lines according to the brightness of each point on the connecting lines comprises the following steps:
respectively performing at least two curve fittings on the brightness of each point on the connecting line to obtain corresponding brightness change curve functions, obtaining the fitted brightness of each point on the connecting line by using each brightness change curve function, calculating the squared difference between the brightness of each point on the connecting line and the corresponding fitted brightness, summing the squared differences, and taking the brightness change curve function corresponding to the minimum sum as the optimal brightness change curve function; obtaining the segmentation node corresponding to the connecting line by utilizing the Lagrange mean value theorem based on the optimal brightness change curve function of the connecting line;
the method for acquiring the blur coefficient of the blur kernel according to the low frequency region, the high frequency region, the first low frequency region and the first high frequency region comprises the following steps:
respectively obtaining the areas of the low frequency region, the high frequency region, the first low frequency region and the first high frequency region; taking the area of the first high frequency region as a denominator and the area of the first low frequency region as a numerator to obtain a corresponding ratio as a first ratio; taking the area of the high frequency region as a numerator and the area of the low frequency region as a denominator to obtain a corresponding ratio as a second ratio; and taking the product of the first ratio and the second ratio as the blur coefficient;
the method for acquiring the deblurred static frame image comprises the following steps:
setting a window with a preset size centered on any pixel point in the target gray level image, counting the gray level types in the window, calculating the ratio of the number of pixel points under each gray level type to the total number of pixel points in the window, calculating an entropy value according to these ratios, normalizing the entropy value to obtain a normalized entropy value, taking the product of the normalized entropy value and the blur coefficient as a denominator and the gray value of the pixel point as a numerator to obtain a corresponding ratio, and taking the ratio as the deblurred gray value of the pixel point;
and obtaining a deblurring gray value of each pixel point in the target gray image to obtain a corresponding deblurring static frame image.
2. The image processing-based infant interpersonal communication jigsaw training method according to claim 1, wherein the method for acquiring the gray image with motion blur as the target gray image comprises the following steps:
performing a convolution operation on the gray level image by using a Laplacian convolution kernel of a set size, calculating the variance of the result of each convolution operation, normalizing the variance to obtain a normalized variance, and confirming that motion blur exists in the gray level image when the normalized variance is smaller than a variance threshold, in which case the gray level image is taken as the target gray level image.
3. An infant interpersonal communication jigsaw training system based on image processing, characterized by comprising: a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the image-processing-based infant interpersonal communication jigsaw training method of any one of claims 1-2 when executing the computer program.
CN202310538346.6A 2023-05-15 2023-05-15 Infant interpersonal communication jigsaw training system and method based on image processing Active CN116385313B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310538346.6A CN116385313B (en) 2023-05-15 2023-05-15 Infant interpersonal communication jigsaw training system and method based on image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310538346.6A CN116385313B (en) 2023-05-15 2023-05-15 Infant interpersonal communication jigsaw training system and method based on image processing

Publications (2)

Publication Number Publication Date
CN116385313A CN116385313A (en) 2023-07-04
CN116385313B true CN116385313B (en) 2023-08-25

Family

ID=86965880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310538346.6A Active CN116385313B (en) 2023-05-15 2023-05-15 Infant interpersonal communication jigsaw training system and method based on image processing

Country Status (1)

Country Link
CN (1) CN116385313B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116527407B (en) * 2023-07-04 2023-09-01 贵州毅丹恒瑞医药科技有限公司 Encryption transmission method for fundus image

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619647A (en) * 2019-09-16 2019-12-27 中山大学 Method for positioning fuzzy region of image based on combination of edge point frequency domain and spatial domain characteristics
CN110728630A (en) * 2019-09-03 2020-01-24 北京爱博同心医学科技有限公司 Internet image processing method based on augmented reality and augmented reality glasses
CN111080538A (en) * 2019-11-29 2020-04-28 中国电子科技集团公司第五十二研究所 Infrared fusion edge enhancement method
CN115797333A (en) * 2023-01-29 2023-03-14 成都中医药大学 Personalized customized intelligent vision training method
CN115908371A (en) * 2022-12-14 2023-04-04 南京信息工程大学 Plant leaf disease and insect pest degree detection method based on optimized segmentation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728630A (en) * 2019-09-03 2020-01-24 北京爱博同心医学科技有限公司 Internet image processing method based on augmented reality and augmented reality glasses
CN110619647A (en) * 2019-09-16 2019-12-27 中山大学 Method for positioning fuzzy region of image based on combination of edge point frequency domain and spatial domain characteristics
CN111080538A (en) * 2019-11-29 2020-04-28 中国电子科技集团公司第五十二研究所 Infrared fusion edge enhancement method
CN115908371A (en) * 2022-12-14 2023-04-04 南京信息工程大学 Plant leaf disease and insect pest degree detection method based on optimized segmentation
CN115797333A (en) * 2023-01-29 2023-03-14 成都中医药大学 Personalized customized intelligent vision training method

Also Published As

Publication number Publication date
CN116385313A (en) 2023-07-04

Similar Documents

Publication Publication Date Title
KR101871098B1 (en) Apparatus and method for image processing
US8948513B2 (en) Blurring based content recognizer
US20190213474A1 (en) Frame selection based on a trained neural network
US8908989B2 (en) Recursive conditional means image denoising
WO2022134971A1 (en) Noise reduction model training method and related apparatus
US11004179B2 (en) Image blurring methods and apparatuses, storage media, and electronic devices
JP2023515654A (en) Image optimization method and device, computer storage medium, computer program, and electronic equipment
CN111582150A (en) Method and device for evaluating face quality and computer storage medium
CN116385313B (en) Infant interpersonal communication jigsaw training system and method based on image processing
CN111598796B (en) Image processing method and device, electronic equipment and storage medium
WO2020029874A1 (en) Object tracking method and device, electronic device and storage medium
JP2022103003A (en) Image dehazing method and image dehazing apparatus using the same
CN114936979B (en) Model training method, image denoising method, device, equipment and storage medium
US20200296259A1 (en) Method and apparatus for determining depth value
US20220270266A1 (en) Foreground image acquisition method, foreground image acquisition apparatus, and electronic device
CN108241855B (en) Image generation method and device
CN114494347A (en) Single-camera multi-mode sight tracking method and device and electronic equipment
CN108010052A (en) Method for tracking target and system, storage medium and electric terminal in complex scene
CN116208586A (en) Low-delay medical image data transmission method and system
CN113112439B (en) Image fusion method, training method, device and equipment of image fusion model
CN113283319A (en) Method and device for evaluating face ambiguity, medium and electronic equipment
WO2023215371A1 (en) System and method for perceptually optimized image denoising and restoration
JP2004157778A (en) Nose position extraction method, program for operating it on computer, and nose position extraction device
CN116309158A (en) Training method, three-dimensional reconstruction method, device, equipment and medium of network model
EP4372671A1 (en) Blind image denoising method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant