WO2021004402A1 - Image recognition method and apparatus, storage medium, and processor - Google Patents


Info

Publication number
WO2021004402A1
WO2021004402A1 (PCT/CN2020/100247; CN2020100247W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
coordinates
pixel
value
Prior art date
Application number
PCT/CN2020/100247
Other languages
French (fr)
Chinese (zh)
Inventor
刘根
何炳塬
解春兰
孔甜
屈奇勋
沈凌浩
贡卓琳
张帆
郑汉城
Original Assignee
深圳数字生命研究院
深圳碳云智能数字生命健康管理有限公司
Priority date
Filing date
Publication date
Application filed by 深圳数字生命研究院 and 深圳碳云智能数字生命健康管理有限公司
Publication of WO2021004402A1 publication Critical patent/WO2021004402A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Definitions

  • This application relates to the field of image recognition, and specifically to an image recognition method and apparatus, a storage medium, and a processor.
  • Image recognition is an important field of artificial intelligence; it refers to identifying targets and objects of various patterns in images. Common recognition objects can be roughly divided into natural-scene objects and specific-scene objects. For natural-scene images, convolutional networks are used to train suitable models, whereas specific-scene objects require secondary development of particular network models and algorithms. For recognizing data in a picture, existing optical character recognition (OCR) technology is used, but OCR can only recognize digits displayed as characters in the image; values represented by dots, or by continuous or discontinuous curves, cannot be identified.
  • The embodiments of this application provide an image recognition method and apparatus, a storage medium, and a processor, which can at least solve the problem that current image recognition methods can only recognize numerical values in character format in an image and cannot automatically recognize a curve or discrete points as numerical values.
  • An image recognition method includes: obtaining a target image to be recognized; obtaining a target region in the target image, where the image in the target region is used to reflect information about a specified type of parameter; determining the coordinates of a selected pixel in the target region; and determining the parameter value corresponding to the selected pixel coordinates based on the correlation between the value of the specified type of parameter and pixel coordinates.
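  The mapping step above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name and the reference rows/values are hypothetical.

```python
# Minimal sketch (hypothetical names): once the linear relationship between
# pixel rows and parameter values is known, any selected pixel in the
# target region can be converted to a parameter value.

def pixel_to_value(pixel_y, y_ref_a, value_a, y_ref_b, value_b):
    """Map a pixel row to a parameter value using two reference points.

    y_ref_a / y_ref_b are the pixel rows of two known scale marks
    (e.g. gridlines of a blood-glucose axis) and value_a / value_b are
    their actual values; pixel_y is the selected pixel's row.
    """
    slope = (value_b - value_a) / (y_ref_b - y_ref_a)
    return value_a + slope * (pixel_y - y_ref_a)

# Example: axis marks at row 300 (4 mmol/L) and row 100 (12 mmol/L);
# a pixel halfway between them should map to 8 mmol/L.
print(pixel_to_value(200, 300, 4.0, 100, 12.0))  # 8.0
```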
  • The method further includes: separating a designated color channel from the target region, where the designated color channel is the one among the R, G, and B color channels that matches the color channel corresponding to the standard color band of the target region; performing image binarization on the image of the designated color channel, where binarization selects the set of pixels in that image whose values are greater than a preset threshold, to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the area where each pixel in the binarized image is located, and using the selected thresholds to segment the target region; performing reference-point pixel recognition on the segmented binarized image to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the correspondence between the actual values of the at least two reference points and their pixel coordinates, and, based on that correspondence, establishing a linear relationship between the values of the specified type of parameter and pixel coordinates, which serves as the correlation.
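  The channel-separation and binarization step can be sketched with NumPy. A single global threshold is used here for brevity (the patent selects per-region thresholds from a preset set); the synthetic image and threshold value are hypothetical.

```python
import numpy as np

def binarize_channel(img_rgb, channel=1, threshold=128):
    """Separate one color channel (0=R, 1=G, 2=B) and binarize it:
    pixels above the threshold become 1, all others become 0."""
    chan = img_rgb[..., channel]
    return (chan > threshold).astype(np.uint8)

# Tiny synthetic 2x2 RGB image; the green channel carries the signal.
img = np.array([[[10, 200, 0], [10, 50, 0]],
                [[10, 130, 0], [10, 120, 0]]], dtype=np.uint8)
binary = binarize_channel(img, channel=1, threshold=128)
print(binary.tolist())  # [[1, 0], [1, 0]]
```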
  • Determining the coordinates of a selected pixel in the target region includes: performing grayscale processing on the image in the target region to obtain a grayscale image; clustering the pixels in the grayscale image to obtain multiple clusters; and selecting a designated cluster from the multiple clusters and determining the selected pixel coordinates from the pixels in the designated cluster.
  • Selecting a designated cluster from the multiple clusters and determining the selected pixel coordinates from its pixels is specifically: selecting the cluster with the fewest pixels from the multiple clusters, and determining the selected pixel coordinates from that cluster.
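  The cluster-selection idea can be sketched with a simple 1-D k-means over grayscale intensities. The patent does not specify the clustering algorithm, so k-means is an assumption here, and the synthetic image is hypothetical; the premise is that the curve occupies far fewer pixels than the background.

```python
import numpy as np

def smallest_cluster_pixels(gray, k=2, iters=20):
    """Cluster grayscale intensities with 1-D k-means and return the
    (row, col) coordinates of pixels in the smallest non-empty cluster."""
    vals = gray.ravel().astype(float)
    centers = np.linspace(vals.min(), vals.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(vals[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = vals[labels == j].mean()
    counts = np.bincount(labels, minlength=k)
    counts = np.where(counts == 0, vals.size + 1, counts)  # skip empty clusters
    target = counts.argmin()
    ys, xs = np.unravel_index(np.where(labels == target)[0], gray.shape)
    return list(zip(ys.tolist(), xs.tolist()))

# Background is bright (200); the "curve" is two dark pixels.
gray = np.full((4, 4), 200, dtype=np.uint8)
gray[1, 2] = gray[2, 3] = 20
print(smallest_cluster_pixels(gray))  # [(1, 2), (2, 3)]
```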
  • The image in the target region includes a curve image in a coordinate system or a discrete-point image in a coordinate system, and the curve in the curve image, or the discrete points in the discrete-point image, are used to reflect the values of the specified type of parameter at different moments.
  • The method further includes: determining the target recording time corresponding to the pixel coordinates in the curve image or discrete-point image. Determining the parameter value corresponding to the selected pixel coordinates then includes: determining, based on the correlation between the value of the specified type of parameter and pixel coordinates, the parameter value of the selected pixel coordinates at the target recording time.
  • The target recording time is determined as follows: recognize the character information in the curve image and extract the time information of the specified type of parameter from it; divide the time span between any two adjacent recording moments in the time information into equal intervals according to the number of pixels, obtaining multiple time points; and determine, from those time points, the target recording time to which the selected pixel coordinates belong.
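  The equal-interval division between adjacent recorded times can be sketched as follows (times in minutes; the axis labels and pixel count are hypothetical):

```python
def pixel_times(t_start_min, t_end_min, n_pixels):
    """Divide the span between two adjacent recorded times into equal
    steps, one per pixel column, and return the time (in minutes)
    assigned to each column boundary."""
    step = (t_end_min - t_start_min) / n_pixels
    return [t_start_min + i * step for i in range(n_pixels + 1)]

# Axis labels at 8:00 (480 min) and 9:00 (540 min), 4 pixel columns between them.
print(pixel_times(480, 540, 4))  # [480.0, 495.0, 510.0, 525.0, 540.0]
```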
  • Acquiring the target region in the target image includes: semantically segmenting the target image to obtain a mask image and a foreground image of the target image; and determining a region of interest from the foreground image and taking the region of interest as the target region.
  • When the target image is of a first type, an efficient neural network model is used to perform semantic segmentation on the target image.
  • The efficient neural network model includes an initialization module and bottleneck modules, where each bottleneck module includes three convolutional layers: the first convolutional layer performs dimensionality reduction, the second performs dilated convolution, full convolution, or asymmetric convolution,
  • and the third convolutional layer performs dimensionality expansion.
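  The shape-preserving behavior of such a bottleneck can be checked with the standard convolution output-size formula; the kernel sizes and dilation rate below are illustrative choices, not values taken from the patent.

```python
def conv_out_size(n, kernel, stride=1, padding=1 - 1, dilation=1):
    """Spatial output size of a convolution (standard formula):
    floor((n + 2p - d*(k-1) - 1) / s) + 1."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# A bottleneck: 1x1 reduce -> 3x3 dilated (rate 2) -> 1x1 expand,
# with padding chosen so the spatial size is preserved throughout.
n = 64
n = conv_out_size(n, kernel=1)                          # 1x1 projection: 64
n = conv_out_size(n, kernel=3, padding=2, dilation=2)   # dilated 3x3:   64
n = conv_out_size(n, kernel=1)                          # 1x1 expansion: 64
print(n)  # 64
```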
  • When the target image is of a second type, a bilateral segmentation network model is adjusted, and the adjusted segmentation network model is used to perform semantic segmentation on the target image.
  • The adjustment of the segmentation network model is as follows: the bilateral segmentation network model includes a backbone network and an auxiliary network.
  • The backbone network is composed of two layers, each of which includes a convolutional layer, a batch normalization layer, and a nonlinear activation function, and the number of feature maps of the backbone network's output channels is reduced; the auxiliary network framework adopts a lightweight model.
  • The lightweight model includes one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet. The number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type.
  • Determining the region of interest from the foreground image includes: determining the characteristic region in the foreground image and the corner coordinates of the target geometric region, where the characteristic region is the region in the foreground image that contains the specified type of parameter information; calculating a projective transformation matrix from the corner coordinates; and applying the projective transformation to the pixels in the characteristic region to obtain the region of interest.
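  Computing a projective (homography) transform from four corner correspondences can be sketched with a direct linear transform solved via NumPy. The corner coordinates below are hypothetical, and this is a sketch of the general technique rather than the patent's exact procedure.

```python
import numpy as np

def homography(src, dst):
    """Solve the 3x3 projective transform mapping four src corners to
    four dst corners (direct linear transform with h33 fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    """Apply a homography to one point (homogeneous divide)."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Rectify a skewed quadrilateral onto a 100x50 rectangle.
src = [(12, 8), (210, 15), (205, 118), (9, 110)]
dst = [(0, 0), (100, 0), (100, 50), (0, 50)]
H = homography(src, dst)
u, v = warp_point(H, 12, 8)              # first src corner
print(abs(u) < 1e-6, abs(v) < 1e-6)      # True True
```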
  • Before acquiring the target image to be recognized, the method further includes: determining whether the image to be recognized is the target image; and when it is, determining to perform semantic segmentation on the target image.
  • the target image is the image with the target area, and the image in the target area is used to reflect the specified type of parameter information.
  • One embodiment of the present application is a method for recognizing images from an Abbott continuous blood glucose meter.
  • An image from the continuous blood glucose meter that reflects continuous changes in blood glucose is the target image; of course, if the method is used to identify the values of other curve images or discrete-point images, then the image containing the curve or discrete points that reflect the corresponding values is the target image.
  • The above method further includes: dividing the region of interest into a preset number of non-overlapping sliders; determining the feature value of each of the sliders to obtain the preset number of feature values; combining the feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the region of interest.
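  The slider-feature step can be sketched as follows. Block-mean intensity is used as the feature value here as an assumption (the patent does not specify the feature), and the SVM classification stage is omitted; the feature vector produced would be the classifier's input.

```python
import numpy as np

def block_features(img, n_rows=2, n_cols=2):
    """Split an image into non-overlapping blocks ("sliders"), take each
    block's mean intensity as its feature value, and concatenate the
    means into a feature vector for a downstream classifier."""
    h, w = img.shape
    bh, bw = h // n_rows, w // n_cols
    feats = [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].mean()
             for r in range(n_rows) for c in range(n_cols)]
    return np.array(feats)

img = np.arange(16, dtype=float).reshape(4, 4)
print(block_features(img).tolist())  # [2.5, 4.5, 10.5, 12.5]
```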
  • The specified type of parameter information includes curve information reflecting the trend of blood glucose data over time, or value information of discrete points in a coordinate system reflecting the trend of blood glucose data over time.
  • the above method further includes: displaying the parameter value corresponding to the selected pixel point coordinate.
  • determining the coordinate of the selected pixel in the target area includes: receiving a user's instruction for the target image; and determining the coordinate of the selected pixel according to the instruction.
  • the instruction is determined based on one of the following information: receiving touch position information of the user on the human-computer interaction interface where the target image is located; or receiving query information input by the user.
  • The method further includes: before determining the selected pixel coordinates based on the touch-point position, judging whether the touch-point position is located in the target region; and when the judgment result indicates that it is, triggering determination of the selected pixel coordinates.
  • A data display method includes: displaying a target image to be recognized; displaying a region of interest in the target image, where the image in the region of interest reflects how the specified type of parameter changes over time; displaying the coordinates of a selected pixel in the region of interest and the target recording time corresponding to those coordinates; and displaying the parameter value of the selected pixel coordinates at the target recording time, where the parameter value is determined based on the correlation between the value of the specified type of parameter and pixel coordinates.
  • The correlation is determined as follows: a designated color channel is separated from the region of interest, where the designated color channel is the one among the R, G, and B color channels that matches the color channel corresponding to the standard color band of the region of interest; image binarization is performed on the image of the designated color channel to obtain a binarized image; the threshold corresponding to the area where each pixel in the binarized image is located is selected from a preset threshold set, and the selected thresholds are used to segment the region of interest; reference-point pixel recognition is performed on the segmented binarized image to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and the correspondence between the actual values of the at least two reference points and their pixel coordinates is determined, and, based on that correspondence, a linear relationship between the values of the specified type of parameter and pixel coordinates is established, with the linear relationship serving as the correlation.
  • An image recognition method includes: detecting the position of a user's touch point in a target image; determining, based on the touch-point position, the selected pixel coordinates and the target recording time corresponding to those coordinates; determining, based on the correlation between the value of the specified type of parameter and pixel coordinates, the parameter value of the selected pixel coordinates at the target recording time; and outputting the parameter value.
  • Before determining the selected pixel coordinates based on the touch-point position, the method further includes: judging whether the touch-point position is located in the region of interest of the target image, where the image in the region of interest reflects how the specified type of parameter changes over time; and when the judgment result indicates that the touch point is located in the region of interest, triggering determination of the selected pixel coordinates.
  • The method further includes: separating a designated color channel from the region of interest, where the designated color channel is the one among the R, G, and B color channels that matches the color channel corresponding to the standard color band of the region of interest; performing image binarization on the image of the designated color channel to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the area where each pixel in the binarized image is located, and using the selected thresholds to segment the region of interest; performing reference-point pixel recognition on the segmented binarized image to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the correspondence between the actual values of the at least two reference points and their pixel coordinates, establishing, based on that correspondence, a linear relationship between the value of the specified type of parameter and pixel coordinates, and using the linear relationship as the correlation.
  • An image recognition method includes: detecting query information input by a user; determining, based on the query information, the coordinates of a selected pixel in the target image and the target recording time corresponding to those coordinates; determining the parameter value of the selected pixel coordinates at the target recording time based on the correlation between the value of the specified type of parameter and pixel coordinates; and outputting the parameter value.
  • Before determining the coordinates of the selected pixel in the target image based on the touch-point position, the method further includes: judging whether the touch-point position is located in the region of interest of the target image, where the image in the region of interest reflects how the specified type of parameter changes over time; and when the judgment result indicates that the touch-point position is located in the region of interest, triggering determination of the selected pixel coordinates.
  • Before determining the parameter value of the selected pixel coordinates at the target recording time based on the correlation between the values of the specified type of parameter and pixel coordinates, the method further includes: separating a designated color channel from the region of interest, where the designated color channel is the one among the R, G, and B color channels that matches the color channel corresponding to the standard color band of the region of interest; performing image binarization on the image of the designated color channel, where binarization selects the set of pixels in that image greater than a preset threshold to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the area where each pixel in the binarized image is located, and using the selected thresholds to segment the region of interest; performing reference-point pixel recognition on the segmented binarized image to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the correspondence between the actual values of the at least two reference points and their pixel coordinates, establishing, based on that correspondence, a linear relationship between the value of the specified type of parameter and pixel coordinates, and taking the linear relationship as the correlation.
  • An image recognition apparatus includes: a first acquisition module configured to acquire a target image to be recognized; a second acquisition module configured to acquire a target region in the target image, where the image in the target region reflects the specified type of parameter information; a first determining module configured to determine the coordinates of a selected pixel in the target region; and a second determining module configured to determine the parameter value corresponding to the selected pixel coordinates based on the correlation between the value of the specified type of parameter and pixel coordinates.
  • The apparatus further includes: a separation module configured to separate a designated color channel from the target region, where the designated color channel is the one among the R, G, and B color channels that matches the color channel corresponding to the standard color band of the target region;
  • a processing module configured to perform image binarization on the image of the designated color channel, where binarization selects the set of pixels in that image greater than a preset threshold to obtain a binarized image;
  • a selection module configured to select, from a preset threshold set, the threshold corresponding to the area where each pixel in the binarized image is located, and to use the selected thresholds to segment the target region;
  • a fitting module configured to perform reference-point pixel recognition on the segmented binarized image to obtain the pixel coordinates of at least two reference points of the standard color band in the image;
  • and a building module configured to determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, and to establish, based on that correspondence, a linear relationship between the values of the specified type of parameter and pixel coordinates.
  • The first determining module includes: a grayscale processing unit configured to perform grayscale processing on the image in the target region to obtain a grayscale image; a clustering unit configured to cluster the pixels in the grayscale image to obtain multiple clusters; and a selection unit configured to select a designated cluster from the multiple clusters and determine the selected pixel coordinates from the pixels in the designated cluster.
  • the selection unit is configured to select a cluster with the least number of pixels from the plurality of clusters, and determine the coordinates of the selected pixel from the cluster with the least number of pixels.
  • the image in the target area includes: a curve image in a coordinate system or a discrete point image in the coordinate system, and the curve in the curve image or the discrete point in the discrete point image is used to reflect a specified type parameter Values at different moments.
  • The first determining module is further configured to determine the target recording time corresponding to the pixel coordinates in the curve image; the second determining module is further configured to determine, based on the correlation between the value of the specified type of parameter and pixel coordinates, the parameter value of the selected pixel coordinates at the target recording time.
  • The first determining module further includes: a first recognition unit configured to recognize character information in the curve image and extract the time information of the specified type of parameter from it; a first dividing unit configured to divide the time span between any two adjacent recording moments in the time information into equal intervals according to the number of pixels, obtaining multiple time points; and a first determining unit configured to determine, from the multiple time points, the target recording time to which the selected pixel coordinates belong.
  • The second acquisition module includes: a segmentation unit configured to perform semantic segmentation on the target image to obtain a mask image and a foreground image of the target image; and a second determining unit configured to determine the region of interest from the foreground image and take the region of interest as the target region.
  • The segmentation unit is configured to use the efficient neural network model to perform semantic segmentation on the target image when the target image is of the first type.
  • The efficient neural network model includes an initialization module and bottleneck modules, where each bottleneck module includes three convolutional layers: the first performs dimensionality reduction, the second performs dilated convolution, full convolution, or asymmetric convolution, and the third performs dimensionality expansion.
  • The segmentation unit is also configured to adjust the bilateral segmentation network model when the target image is of the second type and to use the adjusted segmentation network model to perform semantic segmentation on the target image, where the adjustment is as follows: the bilateral segmentation network model includes a backbone network and an auxiliary network.
  • The backbone network is composed of two layers, each of which includes a convolutional layer, a batch normalization layer, and a nonlinear activation function, and the number of feature maps of the backbone network's output channels is reduced.
  • The auxiliary network framework adopts a lightweight model, which includes one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet; the number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type.
  • the device further includes: a third determining module configured to determine whether the image to be recognized is a target image; and when the image to be recognized is a target image, determine to perform semantic segmentation on the target image.
  • The third determining module is further configured to determine the type of the region of interest by: dividing the region of interest into a preset number of non-overlapping sliders; determining the feature value of each slider to obtain the preset number of feature values; combining the feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the region of interest.
  • The specified type of parameter information includes curve information reflecting the trend of blood glucose data over time, or value information of discrete points in a coordinate system reflecting the trend of blood glucose data over time.
  • the device further includes: a display module for displaying the parameter values corresponding to the selected pixel coordinates.
  • the first determining module is further configured to receive a user's instruction for the target image; determine the coordinates of the selected pixel according to the instruction.
  • the instruction is determined based on one of the following information: receiving position information of the user's touch point on the human-computer interaction interface where the target image is located; or receiving query information input by the user.
  • The apparatus further includes: a judging module configured, when the instruction is the user's touch-point position information on the human-computer interaction interface where the target image is located, to judge whether the touch-point position is located in the target region before the selected pixel coordinates are determined from the touch-point position; and a trigger module configured to trigger determination of the selected pixel coordinates when the judgment result indicates that the touch-point position is located in the target region.
  • A non-volatile storage medium includes a stored program, where when the program runs, the device where the non-volatile storage medium is located is controlled to execute the image recognition method described above.
  • a processor configured to run a program, wherein the image recognition method described above is executed when the program is running.
  • By determining the parameter value corresponding to the selected pixel coordinates according to the correlation between pixel coordinates in the target image and the value of the specified type of parameter, the parameter value corresponding to any pixel in the image can be recognized. This realizes recognition of parameter values represented by non-character information in an image and achieves the purpose of automatically recognizing pixels in the image as corresponding parameter values, solving the technical problem that current image recognition methods can only recognize character-format values in an image and cannot automatically recognize a curve or discrete points as values.
  • FIG. 1 is a flowchart of an optional method for identifying blood glucose data in an embodiment of the present application
  • FIG. 2 is a flowchart of an image recognition method in another embodiment of the present application.
  • FIG. 3 is an example diagram of an optional ROI region extraction according to an embodiment of the application.
  • FIGS. 4a-4d are exemplary diagrams of an optional blood glucose curve detection and segmentation process according to an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of an optional BiSeNet-Xception39 simplified model according to an embodiment of the application.
  • FIG. 6 is an optional 8-hour image R-square error distribution statistical result according to an embodiment of the application.
  • FIG. 7 is an optional 8-hour image error value distribution diagram according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of an optional 24-hour image R-square error distribution statistical result according to an embodiment of the application.
  • FIG. 9 is an optional 24-hour image error value distribution diagram according to an embodiment of the application.
  • FIG. 10 is a structural block diagram of an image recognition device according to an embodiment of the present application.
  • FIG. 11 is a structural block diagram of another optional image recognition device according to an embodiment of the present application.
  • FIG. 12 is a flowchart of a data display method according to an embodiment of the application.
  • FIG. 13 is a flowchart of another image recognition method according to an embodiment of the application.
  • FIG. 14 is a flowchart of another image recognition method according to an embodiment of the application.
  • the tester cannot grasp continuous blood glucose values at all times, and no basis can be provided for subsequent systematic blood glucose management and real-time pushing of intervention plans.
  • the existing OCR technology cannot identify the blood glucose value at any point in the blood glucose curve.
  • the correlation between pixel coordinates and actual parameter values is used to realize the recognition of the parameter value at any point in the curve, simplify the process of converting images into quantitative values, and store them in a database in a certain format, providing support for subsequent blood glucose analysis and intervention plan generation.
  • it can also facilitate the export and storage of blood glucose data, enable recording and managing blood glucose on a mobile phone, and improve the user experience.
  • the recognition of blood glucose data in a blood glucose image is taken as an example to illustrate how to identify corresponding parameter values at the pixel level. As shown in FIG. 1, the process includes the following steps:
  • Step S102: Receive an image uploaded by the user.
  • Step S104: Identify whether the image uploaded by the user is a blood glucose meter image to be processed.
  • the deep network model is used for image classification before calling the algorithm model, such as MobileNet, Xception, SqueezeNet and other image classification models.
  • the confidence level output by the classification model will be used to determine whether the current image is the image to be processed; for example, the confidence threshold is set to 0.85 to ensure the quality of the image uploaded by the user.
  • Step S106: Perform image segmentation using the semantic segmentation network. Specifically, segment the foreground information in the entire image, that is, the highlighted screen part of the blood glucose meter image.
  • the images uploaded by users contain various noises.
  • semantic segmentation models in deep learning are used in the image pre-segmentation part, such as BiSeNet, ICNet, PSPNet, ENet and other semantic segmentation models. Considering the data complexity and the speed requirements of the actual application, a network model with real-time segmentation characteristics is selected.
  • Step S108: Perform image correction. Perform quadrilateral fitting and corner detection according to the semantic segmentation results, and return the coordinate information of the four corners of the screen in an orderly manner. Use the corner information to calculate the projection transformation matrix and perform the image projection transformation. Image orientation judgment: rotate the image according to grayscale, color, texture and other information in the image, and return the positively oriented image of the blood glucose meter screen as the ROI (Region of Interest).
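As an illustrative sketch only (not part of the patent disclosure), returning the four screen corners "in an orderly manner" in Step S108 can be approximated with a common sum/difference heuristic; the helper name `order_corners` is hypothetical:

```python
def order_corners(pts):
    """Order four (x, y) corner points as top-left, top-right,
    bottom-right, bottom-left using a sum/difference heuristic."""
    tl = min(pts, key=lambda p: p[0] + p[1])  # smallest x + y
    br = max(pts, key=lambda p: p[0] + p[1])  # largest x + y
    tr = max(pts, key=lambda p: p[0] - p[1])  # largest x - y
    bl = min(pts, key=lambda p: p[0] - p[1])  # smallest x - y
    return [tl, tr, br, bl]

corners = [(10, 200), (15, 12), (220, 18), (210, 190)]
print(order_corners(corners))  # [(15, 12), (220, 18), (210, 190), (10, 200)]
```

The ordered corners could then be paired with the corners of the target rectangle to compute the projection transformation matrix, for example via OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective`.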
  • Step S110: Extract local standard deviation and color features to judge the image type, where the image types include N-hour and 24-hour images, and N is less than 24.
  • Step S112: Use the SVM classifier to classify the image, and then use the standard color band to detect and perform the segmentation of the blood glucose curve.
  • For the 24-hour blood glucose image, OCR is used to identify the date information; for the 8-hour blood glucose image, OCR is used to identify the start time and end time of the blood glucose device scan.
  • Step S114: Determine the mapping relationship between the blood glucose level and the image pixels.
  • Step S116: Calculate the blood glucose value of a given pixel using the above mapping relationship, and output the calculated blood glucose value.
  • the embodiments of the present application provide a method embodiment of an image recognition method. It should be noted that the steps shown in the flowchart of the accompanying drawings can be executed in a computer system, such as a set of computer-executable instructions, and, although a logical sequence is shown in the flowchart, in some cases the steps shown or described may be performed in a different order than here.
  • Fig. 2 is a flowchart of an image recognition method according to another embodiment of the present application. As shown in Fig. 2, the method includes the following steps:
  • Step S202: Acquire a target image to be recognized;
  • Step S204: Acquire a target area in the target image, where the image in the target area is used to reflect specified type parameter information;
  • Step S206: Determine the coordinates of the selected pixel in the target area;
  • Step S208: Determine the value of the parameter corresponding to the selected pixel point coordinates based on the correlation between the value of the specified type parameter and the pixel point coordinates.
  • the correlation between the pixel coordinates in the target image and the value of the specified type parameter is used to identify the parameter value corresponding to any pixel coordinate in the image, so that the recognition of parameter values represented by non-character information achieves the purpose of automatically recognizing a pixel in the image as the corresponding parameter value, and thus solves the problem that the current image recognition method can only recognize values in character format in the image.
  • the above-mentioned association relationship can be expressed in many ways, for example, it can be expressed as a mapping relationship, or as a linear function relationship.
  • the former can be implemented in the following way: before determining the value of the parameter corresponding to the selected pixel point coordinates based on the correlation between the value of the specified type parameter and the pixel point coordinates, a designated color channel is separated from the target area, wherein the designated color channel is the one among the R, G and B color channels that is the same as the color channel corresponding to the standard color band of the target area; image binarization is performed on the image of the designated color channel, the binarization process being to select the parts of the designated-channel image greater than a preset threshold, to obtain a binarized image; a threshold corresponding to the area where each pixel in the binarized image is located is selected from a preset threshold set, and the selected thresholds are used to perform image segmentation on the target area; reference point pixel identification is performed on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image.
  • the above-mentioned association relationship is expressed as a mapping relationship.
  • a standard color band is required.
  • This detection process is a way to effectively establish the relationship between the actual blood glucose value and the pixel coordinates of the blood glucose image, and its purpose is to find the corresponding linear relationship between the pixel coordinates on the blood glucose curve and the actual blood glucose value.
  • the R, G and B color channels are separated in the ROI area of the blood glucose image. Because the standard color band presents blue characteristics, the B channel is extracted for image processing.
  • the channel image is grayscaled and the gray values are normalized; then adaptive threshold segmentation is applied to the image, and image morphology processing is used to denoise the segmented image, completing the standard color band image segmentation operation; finally, horizontal straight-line fitting is performed on the binary image from the above process, and the fitting result is the pixel-coordinate height of the upper and lower lines of the standard color band in the image.
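A minimal sketch of recovering the band-edge pixel heights from the segmented binary image, assuming the horizontal straight-line fit can be approximated by scanning row sums (the function name `band_edges` and the toy mask are illustrative only, not the patent's implementation):

```python
def band_edges(mask):
    """Given a binary image (list of rows of 0/1), return the row indices
    of the first and last rows whose foreground count exceeds half the
    image width -- a crude stand-in for horizontal straight-line fitting."""
    width = len(mask[0])
    rows = [i for i, row in enumerate(mask) if sum(row) > width // 2]
    return rows[0], rows[-1]  # upper and lower edge pixel heights

# A toy 8x6 mask with a "band" spanning rows 2..5
mask = [[0] * 6, [0] * 6,
        [1] * 6, [1] * 6, [1] * 6, [1] * 6,
        [0] * 6, [0] * 6]
print(band_edges(mask))  # (2, 5)
```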
  • the blood glucose values of the upper and lower edges of the standard color band can be 3.9 and 7.8, respectively.
  • the linear relationship is established by the known actual blood glucose value on the actual standard color band and the corresponding pixel coordinate height:
  • line_rho: pixel height of the blood glucose curve in the image;
  • std_upper: pixel height of the upper edge of the standard color band;
  • std_lower: pixel height of the lower edge of the standard color band.
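The linear relationship itself is not written out above; a straightforward interpolation consistent with these variables would be the following sketch, where assigning 7.8 to the upper edge and 3.9 to the lower edge is an illustrative assumption that depends on the image's row-coordinate convention:

```python
def glucose_from_pixel(line_rho, std_upper, std_lower,
                       val_upper=7.8, val_lower=3.9):
    """Linearly interpolate the blood glucose value for the curve's
    pixel height (line_rho) between the standard color band edges."""
    # fraction of the way from the lower edge toward the upper edge
    t = (line_rho - std_lower) / (std_upper - std_lower)
    return val_lower + t * (val_upper - val_lower)

# Curve halfway between the band edges -> midpoint of 3.9 and 7.8
print(glucose_from_pixel(line_rho=80, std_upper=40, std_lower=120))
```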
  • the pixels can be determined by clustering, or the area of interest can be used as the target area.
  • the following method can be used to process the image in the target area: perform gray-scale processing to obtain a gray-scale image; perform clustering processing on each pixel in the gray-scale image to obtain multiple clusters; select a specified cluster from the multiple clusters, and determine the coordinates of the selected pixel from all pixels in the specified cluster.
  • the target image in this embodiment may be an original image generated from blood glucose meter data that needs to be displayed to the customer, for example a graph of blood glucose content values against time values, where the graph is drawn in a coordinate system formed by the time axis and the blood glucose content axis; this graph is the original image to be processed for blood glucose meter data analysis, that is, the target image.
  • the principle for selecting the designated cluster can be flexibly determined according to the actual situation; for example, the cluster with the least number of pixels is selected from the multiple clusters, and the coordinates of the selected pixel are determined from that cluster.
  • the input image is intercepted by the local-area rect operation, and the parameters are set as shown in Table 1:
  • the blood glucose curve in the image can be formed by connecting black and red (colors are not distinguished in the figure). Therefore, the color channel of the image is separated and the R channel is extracted.
  • the grayscale distribution of the image presents three distribution trends of black, gray and white, so the number of clustering centers is set to 3 (as shown in Figure 4c) and image grayscale clustering is performed by KMeans; the blood glucose curve occupies the least area in the image, so the category with the fewest members in the clustering result is extracted as the category of the blood glucose curve (as shown in FIG. 4d), and image post-processing of the category image is performed to complete the blood glucose curve segmentation and detection process.
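The clustering step can be illustrated with a toy one-dimensional k-means over grayscale values; in practice a library implementation such as scikit-learn's `KMeans` would likely be used, so this pure-Python version is only a sketch:

```python
def kmeans_1d(values, k=3, iters=50):
    """Tiny 1-D k-means: cluster grayscale values into k groups."""
    # crude initialization: spread centers over the sorted values
    centers = sorted(values)[::max(1, len(values) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Three grayscale trends: black (curve), gray, white (background)
pixels = [10, 12, 15] + [120, 125, 130, 128] * 5 + [240, 245, 250] * 8
centers, groups = kmeans_1d(pixels, k=3)
curve_cluster = min(groups, key=len)  # the curve occupies the least area
print(sorted(curve_cluster))  # [10, 12, 15]
```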
  • the target area can be determined according to the region of interest, specifically: semantically segment the target image to obtain the mask image and foreground image of the target image; determine the region of interest from the foreground image and use the region of interest as target area.
  • the region of interest can be determined from the foreground image by the following methods: determine the feature region in the foreground image, and the corner coordinates of the target geometric region, where the feature region is the region in the foreground image that contains the specified type of parameter information; Point coordinates calculate the projection transformation matrix; perform projection transformation on the pixel points in the characteristic area to obtain the region of interest.
  • the image rotation processing may also be included to ensure that the region of interest can be correctly identified.
  • the image in the above-mentioned target area includes: a curve image in a coordinate system, and the curve in the curve image is used to reflect the value of the specified type parameter at different times.
  • step S206 is determining the coordinates of the selected pixel; determining the parameter value corresponding to the selected pixel can be expressed as the following process: determine, based on the correlation between the value of the specified type parameter and the pixel point coordinates, the parameter value corresponding to the selected pixel point coordinates at the target recording time.
  • the target recording time can be determined by the following method: identify the character information in the curve image, and extract the time information of the specified type parameter from the character information; divide the length of time between any two adjacent recording times in the time information at equal intervals according to the number of pixels to obtain multiple time points; from the multiple time points, determine the target recording time to which the coordinates of the selected pixel point belong. For example, if the number of pixels in the horizontal direction between T1 and T2 is N, the time corresponding to the M-th pixel in the horizontal direction between T1 and T2 is T1+[M*(T2-T1)/N].
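The interpolation formula T1+[M*(T2-T1)/N] can be sketched directly (times expressed in minutes for simplicity):

```python
def pixel_time(t1, t2, m, n):
    """Time for the m-th of n horizontal pixels between recording
    times t1 and t2 (in minutes), per T1 + M*(T2-T1)/N."""
    return t1 + m * (t2 - t1) / n

# 60 pixels between 08:00 (480 min) and 10:00 (600 min)
print(pixel_time(480, 600, 15, 60))  # pixel 15 -> 510.0 min, i.e. 08:30
```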
  • OCR technology can be used to recognize the above-mentioned character information, but it is not limited to this.
  • the local area RECT is intercepted (RECT is the abbreviation of rectangular frame, which represents the rectangular area intercepted from the image) for character recognition. The RECT area parameter is set to (235, 10, 20, 225) for the 8-hour blood glucose image and to (205, 85, 20, 225) for the 24-hour blood glucose image. Taking the 8-hour parameters as an example: 235 represents the vertical coordinate of the starting point of the captured area in the 256*256 input image, 10 is the horizontal coordinate of the starting point of the captured area in the 256*256 input image, 20 represents the height of the captured area, and 225 represents the width of the captured area. Specific examples are shown in Table 2:
  • an efficient neural network model is used to perform semantic segmentation on the target image.
  • the efficient neural network model includes an initialization module and a bottleneck module, where each bottleneck module includes three convolutional layers: the first convolutional layer is used for dimensionality reduction, the second convolutional layer is used for dilated (hole) convolution, full convolution and asymmetric convolution, and the third convolutional layer is used for dimension raising; when the target image is of the second type, adjust the bilateral segmentation network model, and use the adjusted segmentation network model to perform semantic segmentation on the target image, where adjusting the segmentation network model includes:
  • the bilateral segmentation network model includes a backbone network and an auxiliary network.
  • the backbone network is composed of two layers, each of which includes a convolutional layer, a batch normalization layer and a nonlinear activation function, so as to reduce the number of feature maps of the backbone network output channel; the auxiliary network model framework adopts a lightweight model to reduce the number of feature maps of the backbone network output channel.
  • the lightweight model includes one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet; wherein the number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type.
  • the above-mentioned first type and second type respectively correspond to the images in the first data set and the images in the second data set, that is, the images in the first data set are the images of the first type, and the images in the second data set are the second Type of image.
  • the number of images in the first data set is less than the number of images in the second data set.
  • an efficient neural network (that is, the segmentation model used for the first type) or a segmentation network model obtained by adjusting the bilateral segmentation network model (that is, the one used for the second type) performs semantic segmentation on the target image.
  • the segmentation network model obtained by adjusting the bilateral segmentation network model is used to perform semantic segmentation on the target image; when processing large-scale data, it has the advantages of fast processing speed and the ability to protect the size of the receptive field while retaining a certain richness of spatial information.
  • the function of the above-mentioned backbone network is to retain abundant spatial information, and the function of the auxiliary network is to protect the size of the receptive field.
  • due to the strong processing capability of the adjusted bilateral segmentation network model, it can not only process large-scale data but also support small-scale data processing.
  • an efficient neural network model can be used to process the image, and when the data scale is large, it can also be manually or automatically switched to the adjusted bilateral segmentation network model for processing.
  • the size of the data is determined based on the number of specific query requests in the context of specific hardware capabilities, or based on the statistical number of query requests within a time period.
  • the query request can be used to request image processing (for example, Recognize images uploaded by users).
  • the data set can be composed of images to be processed.
  • the second data set is a data set that satisfies the following condition: within a specific period of time, the number of target images that need to be processed in the data set (for example, images to be recognized) is greater than the preset threshold;
  • the first data set is a data set that meets the following conditions: the number of target images (for example, images to be recognized) that need to be processed in the data set within a specific time period is less than the preset threshold.
  • when the number of target images is counted, it may be counted once every preset time length, and any time period over which the number of target images is counted is taken as the above-mentioned specific time period.
  • the starting time point of this action is taken as the starting point, and the preset time between it and the midpoint or end point (for example, 3ms, 5ms, 30ms, 50ms, 100ms, 500ms, 1s or 2s, etc.) is used as the specific time period.
  • any one of the 12 time periods is the above specific time period.
  • the data set corresponding to the certain time period is the second data set.
  • when the number of received target images or target images to be processed is less than a preset threshold, the data set corresponding to that time period is the first data set.
  • the aforementioned specific time period may be a predetermined fixed time period, for example, every 1 s, 1 min, 1 h, and so on.
  • the number of images in the first data set and the second data set can be dynamically changed.
  • based on the number of images in the first data set and the second data set, it can be determined whether to use the model corresponding to the first type or the model corresponding to the second type for semantic segmentation.
  • the size of the data is determined based on the number of specific query network requests in the context of specific hardware capabilities.
  • a data set in which the number of target images is greater than or equal to 50,000 is regarded as the second data set, and one with fewer than 50,000 as the first data set; for example, when using a GPU server with computing power equivalent to 24 NVIDIA V100 GPUs, the response time is less than 3 seconds.
  • the size of the data is determined based on the statistical number of query requests within time periods in the context of specific hardware capabilities.
  • the model corresponding to the second type can be used for semantic segmentation, while there are fewer users querying data at night from 00:00 to 08:00, so the model corresponding to the first type can be used for semantic segmentation during that period.
  • receive the target image uploaded by the user; determine the upload time of the target image; determine the time period corresponding to the upload time; determine, according to the time period, the segmentation network model for semantic segmentation of the target image, and perform semantic segmentation on the target image with the determined segmentation network model.
  • the type of the target image can also be determined; when the type is the preset type, it is determined to perform semantic segmentation on the target image.
  • determining the type of the target image can be expressed as the following process: divide the target image into a preset number of non-overlapping sliders; determine the feature values of the preset number of non-overlapping sliders to obtain a preset number of feature values; combine the preset number of feature values into a feature vector; input the feature vector into a support vector machine classifier for analysis to obtain the target image type.
  • the types of blood glucose images of Abbott's continuous blood glucose device include two types: 8 hours and 24 hours.
  • the input image of size 256*256 is divided into 256 non-overlapping 16*16 sliders, and the local variance and the average blue-channel pixel value of each relatively independent slider are calculated; the variance and the average blue-channel pixel value are combined into the feature value of the independent slider, and the 256 slider features are then combined into a feature vector with a dimension of 512; finally, an SVM (support vector machine) classifier realizes the binary classification of images and completes the classification of blood glucose images.
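The feature construction described above can be sketched as follows on a synthetic image; `tile_features` is a hypothetical helper name, and the SVM training itself (e.g. with scikit-learn's `SVC`) is omitted:

```python
def tile_features(gray, blue, tile=16):
    """Build the feature vector: for each non-overlapping tile x tile
    block, append (local variance of gray, mean blue-channel value)."""
    size = len(gray)  # assume a square size x size image
    feats = []
    for r in range(0, size, tile):
        for c in range(0, size, tile):
            g = [gray[i][j] for i in range(r, r + tile)
                 for j in range(c, c + tile)]
            b = [blue[i][j] for i in range(r, r + tile)
                 for j in range(c, c + tile)]
            mean_g = sum(g) / len(g)
            feats.append(sum((v - mean_g) ** 2 for v in g) / len(g))
            feats.append(sum(b) / len(b))
    return feats

gray = [[100] * 256 for _ in range(256)]  # synthetic flat image
blue = [[50] * 256 for _ in range(256)]
vec = tile_features(gray, blue)
print(len(vec))  # 256 tiles x 2 features = 512
```

The resulting 512-dimensional vectors could then be fed to a binary classifier for the 8-hour/24-hour decision.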
  • image segmentation is performed on the image that is determined to be recognized, so as to accurately extract the highlight part of the screen.
  • the network has the characteristics of few parameters, small model and high accuracy.
  • the basic implementation units of the pre-segmentation network ENET are: (1) the initialization module, (2) the bottleneck module designed based on the ResNet idea. Each module contains three convolutional layers.
  • the first convolutional layer achieves dimensionality reduction, the second convolutional layer implements dilated (hole) convolution, full convolution, asymmetric convolution, etc., and the third convolutional layer implements the dimensionality-raising function.
  • Each convolution kernel includes Batch Normalization and PReLU.
  • training set: 515 images; validation set: 65 images; test set: 64 images.
  • All collected images cover multiple angles, and all photos have uniform illumination distribution.
  • the initial learning rate is 0.005, and the learning rate decays once every 30 iterations.
  • the total number of iterations epoch is 300, but not limited to 300. All specific network parameters can be adjusted according to the actual data.
  • the specific training and test performance are shown in Table 2.
  • the test environment is: memory 16G, CPU model Intel(R)Core(TM)i5-7500CPU@3.40GHz.
  • the model performance is shown in the following table.
  • IoU (Intersection over Union): the final result is the intersection of GT and PR divided by the union of GT and PR; it is a common metric in object detection and segmentation.
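The IoU metric can be sketched as (binary masks represented here as flat 0/1 lists for brevity):

```python
def iou(gt, pr):
    """Intersection over Union between two binary masks:
    |GT intersect PR| / |GT union PR|."""
    inter = sum(1 for g, p in zip(gt, pr) if g and p)
    union = sum(1 for g, p in zip(gt, pr) if g or p)
    return inter / union if union else 1.0

gt = [1, 1, 1, 0, 0, 0]
pr = [1, 1, 0, 1, 0, 0]
print(iou(gt, pr))  # 2 / 4 = 0.5
```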
  • the performance of the ENET semantic segmentation network model is shown in Table 3.
  • the model performance of the network on large data sets does not meet further application requirements.
  • the total number of samples is 4912, which are divided into 4104 training sets, 608 verification sets, and 200 test sets.
  • the initial learning rate is 0.01, and the learning rate decays once every 30 iterations.
  • the total number of iterations epoch is 300. All network parameters include but are not limited to the above values. The specific data can be adjusted according to the actual data.
  • the model performance is shown in Table 4:
  • the original segmentation model BiSeNet has certain advantages in speed and accuracy on public data sets (data set Cityscapes, data set CamVid, data set COCO-Stuff, etc.).
  • the samples are relatively clean and less complex than the data in the public data sets; therefore, the BiSeNet semantic segmentation network is appropriately adjusted and simplified.
  • the adjustment ideas are mainly divided into: (1) the spatial information processing layer (Spatial Path), (2) the receptive field processing layer (Context Path), (3) the number of input-output channels (feature maps) between network layers, and (4) compressing the input image size.
  • the specific content of the simplified modification is as follows: (1) the Spatial Path part of the backbone network is reduced from the original 3-layer network (where each layer includes a common convolution layer conv, a batch normalization layer Batch Normalization, and a nonlinear activation function ReLU) to a 2-layer network, as shown by Layer1 and Layer2 in Figure 2. At the same time, the output channels are reduced from 128 feature maps to 64 feature maps, which greatly reduces the network parameters and effectively compresses the model size.
  • the above-mentioned designated type parameter information includes curve information used to reflect the change trend of blood glucose data or value information of discrete points in the coordinate system, wherein each discrete point corresponds to a blood glucose value at each sampling time.
  • the value of the parameter corresponding to the selected pixel point coordinate at the target recording time can also be displayed.
  • the coordinates of selected pixels in the target area can be determined by the following methods: detecting a user's instruction for the target image; determining the coordinates of the selected pixel according to the instruction.
  • the instruction is determined based on one of the following information: the user's touch position on the human-computer interaction interface where the target image is located; and the query information input by the user.
  • the following processing can also be performed: determine whether the touch point position is located in the target area; when the determination result indicates that the touch point position is located in the target area, trigger the determination of the selected pixel coordinates.
  • data analysis and result statistics are performed on 100 8-hour blood glucose images and 100 24-hour blood glucose images.
  • For the 8-hour images, 98 blood glucose images can be effectively identified (the image trend of the blood glucose values recognized by this method is consistent with the image trend of the blood glucose values recognized by the scanner), and the error range is about plus or minus 0.4, which is in line with the actual application scenario.
  • For the 24-hour images, all 100 blood glucose images can be effectively identified (the image trend of the blood glucose values recognized by this method is consistent with the image trend of the blood glucose values recognized by the scanner), and the error range is about -0.6 to 0.4, which meets the need of supplementing missing blood glucose values.
  • the method error is measured based on the quantitative indicators of R-Square, mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE).
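These four indicators can be sketched as follows (the sample values are illustrative only, not the patent's experimental data):

```python
import math

def regression_errors(y_true, y_pred):
    """Compute R-Square, MSE, RMSE and MAE between recognized
    and reference blood glucose values."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1 - mse * n / ss_tot if ss_tot else 1.0
    return r2, mse, rmse, mae

truth = [5.0, 6.0, 7.0, 8.0]
pred = [5.2, 5.8, 7.1, 8.3]
r2, mse, rmse, mae = regression_errors(truth, pred)
```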
  • An embodiment of the present application also provides an image recognition device, which is used to implement the method shown in FIG. 2, and as shown in FIG. 10, the device includes:
  • the first obtaining module 10 is configured to obtain a target image to be recognized
  • the second acquisition module 12 is configured to acquire a target area in a target image, where the image in the target area is used to reflect the specified type parameter information;
  • the first determining module 14 is configured to determine the coordinates of the selected pixel in the target area
  • the second determining module 16 is configured to determine the value of the parameter corresponding to the coordinate of the selected pixel based on the correlation between the value of the specified type parameter and the coordinate of the pixel point.
  • Utilizing the functions implemented by the above modules can also realize the recognition of parameter values represented by non-character information in the image, achieving the purpose of automatically identifying pixels in the image as the corresponding parameter values, thereby solving the technical problem that the current image recognition method can only recognize values in character format in the image and cannot automatically recognize a curve or discrete points as values.
  • the device further includes: a separation module 11 configured to separate a designated color channel from the target area, wherein the designated color channel is the one among the R, G and B color channels that is the same as the color channel corresponding to the standard color band of the target area; a processing module 13 configured to perform image binarization processing on the image of the designated color channel to obtain a binarized image; a selection module 15 configured to select, from a preset threshold set, the threshold corresponding to the area of each pixel in the binarized image, and use the selected thresholds to segment the target area; a fitting module 17 configured to perform reference point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and an establishment module 19 configured to determine the actual values of the at least two reference points and their corresponding relationship with the pixel coordinates, establish the linear relationship between the value of the specified type parameter and the pixel coordinates based on the corresponding relationship, and use the linear relationship as the correlation.
  • the first determining module 14 includes: a grayscale processing unit 140, which is configured to perform grayscale processing on the image in the target area to obtain a grayscale image; and the clustering unit 142 is configured to perform grayscale Each pixel in the image is clustered to obtain multiple clusters; the selection unit 144 is configured to select a specified cluster from the multiple clusters, and determine the selected pixel coordinates from all pixels in the specified cluster.
  • the selection unit 144 is further configured to select the cluster with the fewest pixels from the multiple clusters, and to determine the selected pixel coordinates from that cluster.
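The clustering-and-selection step can be sketched as below. This is an illustrative assumption, not the patent's stated algorithm: a simple 1-D k-means over grayscale intensities, keeping the cluster with the fewest members, on the premise that the thin curve occupies far fewer pixels than the background.

```python
# Cluster grayscale intensities, then keep the smallest cluster's coordinates.

def kmeans_1d(values, k=2, iters=20):
    """Tiny 1-D k-means; returns a cluster index for each value."""
    centers = [min(values) + i * (max(values) - min(values)) / (k - 1)
               for i in range(k)]
    assign = [0] * len(values)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign

def curve_pixel_coords(gray):
    """gray: 2-D list of intensities; returns (x, y) coords of the smallest cluster."""
    coords = [(x, y) for y, row in enumerate(gray) for x, _ in enumerate(row)]
    flat = [gray[y][x] for x, y in coords]
    assign = kmeans_1d(flat, k=2)
    counts = {c: assign.count(c) for c in set(assign)}
    smallest = min(counts, key=counts.get)
    return [xy for xy, a in zip(coords, assign) if a == smallest]

# 4x4 image: bright background (200) with two dark curve pixels (20).
gray = [[200] * 4 for _ in range(4)]
gray[1][2] = 20
gray[2][3] = 20
print(curve_pixel_coords(gray))  # -> [(2, 1), (3, 2)]
```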
  • the image in the target area includes a curve image in a coordinate system, where the curve reflects the values of the specified type parameter at different times.
  • the first determining module 14 is further configured to determine the target recording time corresponding to the pixel coordinates in the curve image; the second determining module 16 is further configured to determine, based on the correlation between the value of the specified type parameter and pixel coordinates, the parameter value corresponding to the selected pixel coordinates at the target recording time.
  • the above-mentioned target recording time is determined as follows: recognize the character information in the curve image and extract the time information of the specified type parameter from it; divide the span between any two adjacent recording moments in the time information into equal intervals according to the number of pixels, obtaining multiple time points; and determine, from these time points, the target recording time to which the selected pixel coordinates belong.
  • the first determining module further includes: a first recognition unit configured to recognize character information in the curve image and extract the time information of the specified type parameter from it; a first dividing unit configured to divide the span between any two adjacent recording moments in the time information into equal intervals according to the number of pixels, obtaining multiple time points; and a first determining unit configured to determine, from the multiple time points, the target recording time to which the selected pixel coordinates belong.
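The equal-interval time division can be sketched as below. This is a hedged illustration under assumed inputs: the two adjacent axis labels (read by the character-recognition step) and their pixel columns are given, and the span between them is split evenly across the intervening pixel columns.

```python
# Map a selected pixel column to a recording time by dividing the span
# between two adjacent recorded moments evenly across the pixel columns.

def pixel_to_time(px, x_left, x_right, t_left_min, t_right_min):
    """px: selected pixel column; x_left/x_right: columns of two adjacent
    axis labels; t_*_min: their times in minutes since midnight."""
    n = x_right - x_left                        # pixel columns between labels
    minutes_per_px = (t_right_min - t_left_min) / n
    t = t_left_min + (px - x_left) * minutes_per_px
    return "%02d:%02d" % divmod(round(t), 60)

# Axis labels 06:00 at column 100 and 09:00 at column 280.
print(pixel_to_time(190, 100, 280, 6 * 60, 9 * 60))  # -> 07:30
```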
  • the second acquisition module 12 includes: a segmentation unit 120 configured to perform semantic segmentation on the target image to obtain a mask image and a foreground image of the target image; and a second determining unit 122 configured to identify the target area in the foreground image.
  • the segmentation unit 120 is configured to use an efficient neural network model to perform semantic segmentation on the target image when the target image is of the first type.
  • the efficient neural network model includes an initialization module and bottleneck modules, where each bottleneck module includes three convolutional layers: the first is used for dimensionality reduction, the second for dilated, full, and asymmetric convolution, and the third for dimensionality restoration; the segmentation unit 120 is also configured, when the target image is of the second type, to adjust a bilateral segmentation network model and use the adjusted model to perform semantic segmentation on the target image, where adjusting the model includes at least one of the following: reducing the number of spatial-information processing layers in the segmentation network model; reducing the number of feature maps output by each network layer; compressing the input image of the bilateral segmentation network; and simplifying the receptive-field processing layer.
  • the segmentation unit 120 is further configured to simplify the receptive-field processing layer by replacing the residual neural network (ResNet) module in that layer with a channel-separation convolution (Xception39) module.
  • the above-mentioned apparatus may further include a third determining module 21 configured to determine the type of the target image and, when the type is a preset type, to decide to perform semantic segmentation on the target image.
  • the third determining module 21 is further configured to determine the type of the target image by: dividing the target image into a preset number of non-overlapping blocks; computing a feature value for each block, obtaining the preset number of feature values; combining the feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the target image.
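The feature-extraction step feeding the classifier can be sketched as below. The SVM itself is assumed to exist elsewhere (e.g. a pre-trained classifier), and using the mean intensity of each block as its feature value is an assumption of this sketch, not something the patent specifies.

```python
# Divide an image into non-overlapping blocks; each block contributes one
# component (here, its mean intensity) to the feature vector.

def block_feature_vector(gray, blocks_x, blocks_y):
    """gray: 2-D list of intensities; returns blocks_x*blocks_y features."""
    h, w = len(gray), len(gray[0])
    bh, bw = h // blocks_y, w // blocks_x
    features = []
    for by in range(blocks_y):
        for bx in range(blocks_x):
            vals = [gray[y][x]
                    for y in range(by * bh, (by + 1) * bh)
                    for x in range(bx * bw, (bx + 1) * bw)]
            features.append(sum(vals) / len(vals))
    return features

# 4x4 image split into 2x2 blocks -> a 4-component feature vector.
gray = [[0, 0, 8, 8],
        [0, 0, 8, 8],
        [4, 4, 2, 2],
        [4, 4, 2, 2]]
print(block_feature_vector(gray, 2, 2))  # -> [0.0, 8.0, 4.0, 2.0]
```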
  • the target area contains a curve image showing the change trend of blood glucose data.
  • the device further includes: a display module for displaying the parameter values corresponding to the selected pixel coordinates.
  • the first determining module is further configured to receive a user's instruction for the target image, and to determine the coordinates of the selected pixel according to the instruction.
  • the instruction is determined based on one of the following: position information of the user's touch point on the human-computer interaction interface where the target image is located; or query information input by the user.
  • the device further includes: a judging module configured, when the instruction is the user's touch point position information on the human-computer interaction interface where the target image is located, to judge whether the touch point position lies in the target area before the selected pixel coordinates are determined from the touch point position; and a trigger module configured to trigger determination of the selected pixel coordinates when the judgment result indicates that the touch point position lies in the target area.
  • the embodiment of the present application also provides a data display method. As shown in FIG. 12, the method includes:
  • Step S1202: display and acquire the target image to be recognized;
  • Step S1204: display the region of interest in the target image, where the image in the region of interest reflects how the specified type parameter changes over time;
  • Step S1206: display the coordinates of the selected pixel in the region of interest and the target recording time corresponding to those coordinates;
  • Step S1208: display the parameter value corresponding to the selected pixel coordinates at the target recording time, where the parameter value is determined based on the correlation between the value of the specified type parameter and pixel coordinates.
  • the execution subject of steps S1202 to S1208 includes but is not limited to a mobile terminal.
  • the above-mentioned association relationship is determined as follows: separate the designated color channel from the region of interest, where the designated color channel is the one of the R, G, B color channels that matches the color channel corresponding to the standard color band of the region of interest; perform image binarization on the image of the designated color channel to obtain a binarized image; select, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image is located, and segment the region of interest using the selected thresholds; perform reference-point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, establish a linear relationship between the value of the specified type parameter and pixel coordinates based on that correspondence, and use the linear relationship as the association relationship.
  • the embodiment of the present application also provides an image recognition method.
  • the method can determine the selected pixel based on the user's touch operation, thereby determining the parameter value corresponding to the pixel.
  • the method includes the following steps:
  • Step S1302: detect the position of the user's touch point in the target image;
  • Step S1304: determine the coordinates of the selected pixel based on the touch point position, and the target recording time corresponding to those coordinates;
  • Step S1306: determine, based on the correlation between the value of the specified type parameter and pixel coordinates, the parameter value corresponding to the selected pixel coordinates at the target recording time;
  • Step S1308: output the parameter value.
  • outputting the parameter value includes, but is not limited to, displaying the value to the user or sending it to an external device.
  • before determining the selected pixel coordinates from the touch point position, and in order to filter out invalid touch operations, it may also be judged whether the touch point position lies in the region of interest of the target image, where the image in the region of interest reflects how the specified type parameter changes over time; when the judgment result indicates that the touch point lies in the region of interest, determination of the selected pixel coordinates is triggered.
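This validity check can be sketched minimally as below. The region is assumed here to be an axis-aligned bounding box, which is a simplification: in practice the segmentation mask may define an arbitrary shape.

```python
# Reject touches outside the region of interest before looking up a value.

def in_region(px, py, box):
    """box: (x0, y0, x1, y1) bounding box of the region of interest."""
    x0, y0, x1, y1 = box
    return x0 <= px <= x1 and y0 <= py <= y1

roi = (50, 80, 300, 240)
print(in_region(120, 100, roi))  # -> True  (touch inside: proceed)
print(in_region(10, 10, roi))    # -> False (invalid touch: ignore)
```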
  • the following processing may also be performed: separate the designated color channel from the region of interest, where the designated color channel is the one of the R, G, B color channels that matches the color channel corresponding to the standard color band of the region of interest; perform image binarization on the image of the designated color channel to obtain a binarized image; select, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image is located, and segment the region of interest using the selected thresholds; perform reference-point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, establish a linear relationship between the value of the specified type parameter and pixel coordinates based on that correspondence, and use the linear relationship as the association relationship.
  • the embodiment of the present application also provides an image recognition method, which can determine the selected pixel based on user input, thereby determining the parameter value corresponding to the pixel. As shown in FIG. 14, the method includes:
  • Step S1402: detect query information input by the user, where the query information may be input through a human-computer interaction interface that includes a text input box for entering it;
  • Step S1404: determine, based on the query information, the coordinates of the selected pixel in the target image and the target recording time corresponding to those coordinates;
  • Step S1406: determine, based on the correlation between the value of the specified type parameter and pixel coordinates, the parameter value corresponding to the selected pixel coordinates at the target recording time;
  • Step S1408: output the parameter value.
  • the following process may also be performed: separate the designated color channel from the region of interest, where the designated color channel is the one of the R, G, B color channels that matches the color channel corresponding to the standard color band of the region of interest; perform image binarization on the image of the designated color channel to obtain a binarized image; select, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image is located, and segment the region of interest using the selected thresholds; perform reference-point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, establish a linear relationship between the value of the specified type parameter and pixel coordinates based on that correspondence, and use the linear relationship as the association relationship.
  • the embodiment of the present application also provides a non-volatile storage medium, the storage medium includes a stored program, wherein the device where the storage medium is located is controlled to execute the above image recognition method when the program runs.
  • the embodiment of the present application also provides a processor, which is configured to run a program, wherein the above image recognition method is executed when the program is running.
  • the described solutions determine the parameter value corresponding to the selected pixel coordinates according to the correlation between pixel coordinates in the target image and the value of the specified type parameter. This correlation makes it possible to recognize the parameter value corresponding to any pixel in the image, thereby recognizing parameter values represented by non-character information and automatically converting pixels in the image into the corresponding parameter values, which solves the technical problem that current image recognition methods can only recognize character-format values in an image and cannot automatically recognize a curve or discrete points as values.
  • the disclosed technical content can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of units may be a logical function division; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling, direct coupling, or communication connection may be through some interfaces; the indirect coupling or communication connection of units or modules may be electrical or in other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: USB flash drives, read-only memory (ROM), random access memory (RAM), removable hard disks, magnetic disks, optical disks, and other media that can store program code.
  • the solutions provided in the embodiments of the present application can be applied to the image recognition process, for example, can be applied to the image recognition process of blood glucose data.
  • the correlation between pixel coordinates in the target image and the values of the specified type parameter is used to identify the parameter value corresponding to any pixel coordinates in the image. This achieves recognition of parameter values represented by non-character information and automatic conversion of pixels in the image into the corresponding parameter values, further solving the problem that current image recognition methods can only recognize character-format values in an image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are an image recognition method and apparatus, a storage medium, and a processor. The image recognition method comprises: obtaining a target image to be recognized; obtaining a target region in said target image, wherein the image in the target region is used to reflect information of a parameter of a specified type; determining the coordinates of a selected pixel in the target region; and determining, on the basis of the association between the value of the specified-type parameter and pixel coordinates, the parameter value corresponding to the selected pixel coordinates.

Description

Image recognition method and apparatus, storage medium, and processor

Technical Field
This application relates to the field of image recognition, and in particular to an image recognition method and apparatus, a storage medium, and a processor.
Background Art
Image recognition technology is an important field of artificial intelligence; it refers to techniques that perform object recognition on images to identify targets and objects of various patterns. Common recognition objects can be roughly divided into natural-scene objects and specific-scene objects. For natural-scene images, a convolutional network is used to train a suitable model, while specific-scene objects require secondary development of network models and algorithms. For recognizing data in a picture, existing approaches use optical character recognition (OCR), but OCR can only recognize the digits displayed in the image; the values represented by points or by continuous or discontinuous curves cannot be recognized.
There are many blood glucose monitoring hardware devices, but blood glucose data can only be exported after the monitoring period (usually about 14 days) ends, by connecting the device to a computer with a data cable and exporting a blood glucose data table. Users therefore cannot follow changes in their personal blood glucose data in real time, let alone monitor how eating, exercise, and sleep in daily life affect their blood glucose.
It can be seen that current methods and systems for recording physiological monitoring parameters (such as fingertip blood glucose meter readings) based on image recognition rely on optical character recognition: they can only identify displayed numerical values and cannot automatically convert a physiological curve into values for subsequent use.
No effective solution to the above problems has yet been proposed.
Summary of the Invention
The embodiments of this application provide an image recognition method and apparatus, a storage medium, and a processor, to at least solve the technical problem that current image recognition methods can only recognize character-format values in an image and cannot automatically recognize a curve or discrete points as numerical values.
According to one aspect of the embodiments of this application, an image recognition method is provided, including: obtaining a target image to be recognized; obtaining a target area in the target image, where the image in the target area reflects specified type parameter information; determining the coordinates of a selected pixel in the target area; and determining the parameter value corresponding to the selected pixel coordinates based on the association relationship between the value of the specified type parameter and pixel coordinates.
Optionally, before determining the parameter value corresponding to the selected pixel coordinates based on the association relationship between the value of the specified type parameter and pixel coordinates, the method further includes: separating a designated color channel from the target area, where the designated color channel is the one of the R, G, B color channels that matches the color channel corresponding to the standard color band of the target area; performing image binarization on the image of the designated color channel, where binarization selects the set of pixels in that channel's image that exceed a preset threshold, to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image is located, and segmenting the target area using the selected thresholds; performing reference-point pixel recognition on the binarized image obtained after segmentation, to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the correspondence between the actual values of the at least two reference points and their pixel coordinates, establishing a linear relationship between the value of the specified type parameter and pixel coordinates based on that correspondence, and using the linear relationship as the association relationship.
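The binarization with region-specific thresholds can be sketched as follows. This is a minimal illustration under stated assumptions: the preset threshold set is modeled as a dictionary, and the mapping from a pixel to its region is passed in as a function, since the patent does not specify how regions are delimited.

```python
# Binarize one color channel using a per-region threshold looked up from a
# preset threshold set: pixels above the local threshold become 1, else 0.

def binarize(channel, thresholds, region_of):
    """channel: 2-D list of one color channel's values; region_of(x, y)
    returns the region index used to look up that pixel's threshold."""
    return [[1 if channel[y][x] > thresholds[region_of(x, y)] else 0
             for x in range(len(channel[0]))]
            for y in range(len(channel))]

# Two vertical regions with different thresholds (left: 100, right: 180).
ch = [[120, 120, 150, 150],
      [ 90,  90, 200, 200]]
out = binarize(ch, {0: 100, 1: 180}, lambda x, y: 0 if x < 2 else 1)
print(out)  # -> [[1, 1, 0, 0], [0, 0, 1, 1]]
```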
Optionally, determining the coordinates of the selected pixel in the target area includes: performing grayscale processing on the image in the target area to obtain a grayscale image; clustering each pixel in the grayscale image to obtain multiple clusters; and selecting a designated cluster from the multiple clusters and determining the selected pixel coordinates from all pixels in the designated cluster.
Optionally, selecting a designated cluster from the multiple clusters and determining the selected pixel coordinates from all pixels in it specifically means: selecting the cluster with the fewest pixels, and determining the selected pixel coordinates from that cluster.
Optionally, the image in the target area includes a curve image or a discrete-point image in a coordinate system, where the curve or the discrete points reflect the values of the specified type parameter at different moments.
Optionally, the method further includes: determining the target recording time corresponding to the pixel coordinates in the curve image or the discrete-point image. Determining the parameter value corresponding to the selected pixel coordinates based on the association relationship between the value of the specified type parameter and pixel coordinates then includes: determining, based on that association relationship, the parameter value corresponding to the selected pixel coordinates at the target recording time.
Optionally, the target recording time is determined as follows: recognize the character information in the curve image and extract the time information of the specified type parameter from it; divide the span between any two adjacent recording moments in the time information into equal intervals according to the number of pixels, obtaining multiple time points; and determine, from these time points, the target recording time to which the selected pixel coordinates belong.
Optionally, obtaining the target area in the target image includes: performing semantic segmentation on the target image to obtain a mask image and a foreground image of the target image; and determining a region of interest from the foreground image and using it as the target area.
Optionally, when the target image is of the first type, an efficient neural network model is used to perform semantic segmentation, where the model includes an initialization module and bottleneck modules, each bottleneck module containing three convolutional layers: the first for dimensionality reduction, the second for dilated, full, and asymmetric convolution, and the third for dimensionality restoration. When the target image is of the second type, a bilateral segmentation network model is adjusted, and the adjusted model is used to perform semantic segmentation on the target image. Adjusting the segmentation network model includes: the bilateral segmentation network consists of a backbone network and an auxiliary network; the backbone network has two layers, each comprising a convolutional layer, a batch normalization layer, and a nonlinear activation function, with the number of feature maps in its output channels reduced; the auxiliary network adopts a lightweight model framework, likewise reducing the number of output-channel feature maps, where the lightweight model is one of Xception39, SqueezeNet, Xception, MobileNet, and ShuffleNet. The first type corresponds to a first data set containing fewer images than the second data set corresponding to the second type.
Optionally, determining the region of interest from the foreground image includes: determining a characteristic region in the foreground image and the corner coordinates of a target geometric region, where the characteristic region is the region of the foreground image containing the specified type parameter information; computing a projection transformation matrix from the corner coordinates; and performing projection transformation on the pixels in the characteristic region to obtain the region of interest.
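The projection transformation applied to each pixel can be sketched as below. Estimating the 3x3 matrix from the four corner correspondences is assumed done elsewhere (e.g. by solving the standard eight-equation system); this sketch only shows how a pixel is mapped, in homogeneous coordinates, into the rectified region of interest.

```python
# Apply a 3x3 projection (homography) matrix H to a pixel (x, y).

def project(H, x, y):
    """Homogeneous-coordinate mapping: (x, y, 1) -> (xh, yh, w) -> (xh/w, yh/w)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

# A pure scaling homography: doubles both coordinates.
H = [[2, 0, 0],
     [0, 2, 0],
     [0, 0, 1]]
print(project(H, 10, 5))  # -> (20.0, 10.0)
```

Transforming every pixel of the characteristic region with such a matrix yields the rectified region of interest described above.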
Optionally, before obtaining the target image to be recognized, the method further includes: determining whether the image to be recognized is a target image and, if so, deciding to perform semantic segmentation on it. Note that a target image is an image with a target area whose content reflects the specified type parameter information. For example, one embodiment of this application is a method for recognizing images of an Abbott continuous glucose meter; in that case, a meter image containing the continuous blood glucose changes is the target image. Likewise, for scenarios that recognize values from other curve or discrete-point images, the image containing the curve or discrete points reflecting the corresponding values is the target image.
Optionally, the above method further includes: dividing the region of interest evenly into a preset number of non-overlapping blocks; determining a feature value for each of the preset number of non-overlapping blocks to obtain the preset number of feature values; combining the preset number of feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the region of interest.
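The block-based feature vector above can be sketched as follows. The source does not fix which feature value is computed per block; the mean grayscale intensity used here is an assumption, and the resulting vector would then be fed to an SVM classifier (e.g. `sklearn.svm.SVC`, not shown). The function name `block_feature_vector` is illustrative.

```python
def block_feature_vector(image, n_rows, n_cols):
    """image: 2-D list of grayscale values; divides it into n_rows * n_cols
    non-overlapping blocks and returns one feature (mean intensity) per block,
    concatenated row by row into a flat feature vector."""
    h, w = len(image), len(image[0])
    bh, bw = h // n_rows, w // n_cols   # block size (dimensions assumed divisible)
    features = []
    for br in range(n_rows):
        for bc in range(n_cols):
            block = [image[r][c]
                     for r in range(br * bh, (br + 1) * bh)
                     for c in range(bc * bw, (bc + 1) * bw)]
            features.append(sum(block) / len(block))
    return features
```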
Optionally, the specified type of parameter information includes curve information reflecting the trend of blood glucose data over time, or value information of discrete points in a coordinate system reflecting the trend of blood glucose data over time.
Optionally, the above method further includes: displaying the parameter value corresponding to the coordinates of the selected pixel.
Optionally, determining the coordinates of the selected pixel in the target area includes: receiving a user's instruction for the target image; and determining the coordinates of the selected pixel according to the instruction.
Optionally, the instruction is determined based on one of the following: receiving the user's touch position information on the human-computer interaction interface where the target image is located; or receiving query information input by the user.
Optionally, when the instruction is to receive the user's touch point position information on the human-computer interaction interface where the target image is located, the method further includes: before determining the coordinates of the selected pixel based on the touch point position, judging whether the touch point position is located in the target area; and when the judgment result indicates that the touch point position is located in the target area, triggering determination of the coordinates of the selected pixel.
According to another aspect of the embodiments of the present application, a data display method is provided, including: displaying an acquired target image to be recognized; displaying a region of interest in the target image, where the image in the region of interest reflects how a specified type of parameter changes over time; displaying the coordinates of a selected pixel in the region of interest and the target recording time corresponding to the pixel coordinates; and displaying the parameter value corresponding to the coordinates of the selected pixel at the target recording time, where the parameter value is determined based on an association relationship between values of the specified type of parameter and pixel coordinates.
Optionally, the association relationship is determined as follows: separating a designated color channel from the region of interest, where the designated color channel is the channel among the R, G, and B color channels that corresponds to the standard color band of the region of interest; performing image binarization on the image of the designated color channel to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the region where each pixel of the binarized image is located, and performing image segmentation on the region of interest using the selected thresholds; performing reference point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the actual values of the at least two reference points and their correspondence with the pixel coordinates, establishing, based on this correspondence, a linear relationship between values of the specified type of parameter and pixel coordinates, and taking the linear relationship as the association relationship.
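The final step above, fitting the linear association from two reference points of the standard color band, can be sketched as follows. The reference values used in the usage note (4.0 and 10.0 mmol/L) are illustrative only; the function name `fit_value_from_pixels` is not from the source.

```python
def fit_value_from_pixels(p1, v1, p2, v2):
    """(p1, v1), (p2, v2): pixel y-coordinate / actual-value pairs for two
    reference points of the standard color band. Returns a function mapping
    any pixel y-coordinate to a parameter value via value = a * y + b."""
    a = (v2 - v1) / (p2 - p1)   # slope of the linear association
    b = v1 - a * p1             # intercept
    return lambda pixel_y: a * pixel_y + b
```

For example, if the 4.0 mmol/L reference line is detected at pixel row 300 and the 10.0 mmol/L reference line at row 120, a selected curve pixel at row 210 maps to 7.0 mmol/L.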
Optionally, according to another aspect of the embodiments of the present application, an image recognition method is provided, including: detecting the position of a user's touch point in a target image; determining the coordinates of the selected pixel based on the touch point position, and the target recording time corresponding to the pixel coordinates; determining the parameter value corresponding to the coordinates of the selected pixel at the target recording time based on an association relationship between values of a specified type of parameter and pixel coordinates; and outputting the parameter value.
Optionally, before determining the coordinates of the selected pixel based on the touch point position, the method further includes: judging whether the touch point position is located in a region of interest of the target image, where the image in the region of interest reflects how a specified type of parameter changes over time; and when the judgment result indicates that the touch point position is located in the region of interest, triggering determination of the coordinates of the selected pixel.
Optionally, before the parameter value corresponding to the coordinates of the selected pixel at the target recording time is determined based on the association relationship between values of the specified type of parameter and pixel coordinates, the method further includes: separating a designated color channel from the region of interest, where the designated color channel is the channel among the R, G, and B color channels that corresponds to the standard color band of the region of interest; performing image binarization on the image of the designated color channel to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the region where each pixel of the binarized image is located, and performing image segmentation on the region of interest using the selected thresholds; performing reference point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the actual values of the at least two reference points and their correspondence with the pixel coordinates, establishing a linear relationship between values of the specified type of parameter and pixel coordinates based on the correspondence, and taking the linear relationship as the association relationship.
According to another aspect of the embodiments of the present application, an image recognition method is provided, including: detecting query information input by a user; determining the coordinates of a selected pixel in a target image based on the query information, and the target recording time corresponding to the pixel coordinates; determining the parameter value corresponding to the coordinates of the selected pixel at the target recording time based on an association relationship between values of a specified type of parameter and pixel coordinates; and outputting the parameter value.
Optionally, before determining the coordinates of the selected pixel in the target image based on the touch point position, the method further includes: judging whether the touch point position is located in a region of interest of the target image, where the image in the region of interest reflects how a specified type of parameter changes over time; and when the judgment result indicates that the touch point position is located in the region of interest, triggering determination of the coordinates of the selected pixel.
Optionally, before the parameter value corresponding to the coordinates of the selected pixel at the target recording time is determined based on the association relationship between values of the specified type of parameter and pixel coordinates, the method further includes: separating a designated color channel from the region of interest, where the designated color channel is the channel among the R, G, and B color channels that corresponds to the standard color band of the region of interest; performing image binarization on the image of the designated color channel, the binarization being to select the set of pixels in the image of the designated color channel whose values are greater than a preset threshold, to obtain a binarized image; selecting, from a preset threshold set, the threshold corresponding to the region where each pixel of the binarized image is located, and performing image segmentation on the region of interest using the selected thresholds; performing reference point pixel recognition on the binarized image obtained after segmentation to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and determining the actual values of the at least two reference points and their correspondence with the pixel coordinates, establishing a linear relationship between values of the specified type of parameter and pixel coordinates based on the correspondence, and taking the linear relationship as the association relationship.
According to still another aspect of the embodiments of the present application, an image recognition apparatus is provided, including: a first acquisition module, configured to acquire a target image to be recognized; a second acquisition module, configured to acquire a target area in the target image, where the image in the target area reflects specified type parameter information; a first determining module, configured to determine the coordinates of a selected pixel in the target area; and a second determining module, configured to determine the parameter value corresponding to the coordinates of the selected pixel based on an association relationship between values of the specified type of parameter and pixel coordinates.
Optionally, the apparatus further includes: a separation module, configured to separate a designated color channel from the target area, where the designated color channel is the channel among the R, G, and B color channels that corresponds to the standard color band of the target area; a processing module, configured to perform image binarization on the image of the designated color channel, the binarization being to select the set of pixels in the image of the designated color channel whose values are greater than a preset threshold, to obtain a binarized image; a selection module, configured to select, from a preset threshold set, the threshold corresponding to the region where each pixel of the binarized image is located, and to perform image segmentation on the target area using the selected thresholds; a fitting module, configured to perform reference point pixel recognition on the binarized image obtained after segmentation, to obtain the pixel coordinates of at least two reference points of the standard color band in the image; and an establishing module, configured to determine the actual values of the at least two reference points and their correspondence with the pixel coordinates, to establish a linear relationship between values of the specified type of parameter and pixel coordinates based on the correspondence, and to take the linear relationship as the association relationship.
Optionally, the first determining module includes: a grayscale processing unit, configured to perform grayscale processing on the image in the target area to obtain a grayscale image; a clustering unit, configured to perform clustering on the pixels of the grayscale image to obtain multiple clusters; and a selection unit, configured to select a designated cluster from the multiple clusters and to determine the coordinates of the selected pixel from all pixels in the designated cluster.
Optionally, the selection unit is configured to select, from the multiple clusters, the cluster with the fewest pixels, and to determine the coordinates of the selected pixel from the cluster with the fewest pixels.
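The clustering and smallest-cluster selection above can be sketched as follows. The source specifies only clustering the grayscale pixels and picking the cluster with the fewest pixels (the curve typically occupies far fewer pixels than the background); the use of a minimal 1-D k-means with k=2, and the initialisation from the min/max intensities, are assumptions for illustration.

```python
def smallest_cluster_pixels(gray, k=2, iters=20):
    """gray: dict mapping (x, y) -> grayscale value; clusters pixels by
    intensity and returns the coordinates in the cluster with the fewest
    pixels (taken here as the curve pixels)."""
    values = list(gray.values())
    centers = [min(values), max(values)]          # simple k=2 initialisation
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for coord, v in gray.items():
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[idx].append((coord, v))
        # Recompute each center as the mean of its cluster (keep old if empty).
        centers = [sum(v for _, v in c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    smallest = min((c for c in clusters if c), key=len)
    return [coord for coord, _ in smallest]
```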
Optionally, the image in the target area includes: a curve image in a coordinate system, or a discrete point image in a coordinate system, where the curve in the curve image or the discrete points in the discrete point image reflect the values of the specified type of parameter at different times.
Optionally, the first determining module is further configured to determine the target recording time corresponding to the pixel coordinates in the curve image; and the second determining module is further configured to determine the parameter value corresponding to the coordinates of the selected pixel at the target recording time based on the association relationship between values of the specified type of parameter and pixel coordinates.
Optionally, the first determining module further includes: a first recognition unit, configured to recognize character information in the curve image and to extract time information of the specified type of parameter from the character information; a first dividing unit, configured to divide the duration between any two adjacent recording times in the time information into equal intervals according to the number of pixels, to obtain multiple time points; and a first determining unit, configured to determine, from the multiple time points, the target recording time to which the coordinates of the selected pixel belong.
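The equal-interval time division above can be sketched as follows: between two adjacent recorded times (as recovered from the character information), the pixel columns spanning the curve are mapped to evenly spaced time points, so a selected pixel column directly yields its target recording time. Handling times in minutes, and the function name `time_at_pixel`, are illustrative choices.

```python
def time_at_pixel(x, x_start, x_end, t_start_min, t_end_min):
    """Map pixel column x (x_start <= x <= x_end) to a time in minutes by
    dividing [t_start_min, t_end_min] evenly over the pixel columns."""
    if x_end == x_start:
        return t_start_min
    fraction = (x - x_start) / (x_end - x_start)
    return t_start_min + fraction * (t_end_min - t_start_min)
```

For example, with an 8-hour window (0 to 480 minutes) spanning pixel columns 40 to 520, a selected pixel at column 280 corresponds to the 240-minute mark.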
Optionally, the second acquisition module includes: a segmentation unit, configured to perform semantic segmentation on the target image to obtain a mask image and a foreground image of the target image; and a second determining unit, configured to determine a region of interest from the foreground image and to take the region of interest as the target area.
Optionally, the segmentation unit is configured to, when the target image is of a first type, perform semantic segmentation on the target image using an efficient neural network model, where the efficient neural network model includes an initialization module and bottleneck modules, each bottleneck module including three convolutional layers, of which the first convolutional layer performs dimensionality reduction, the second convolutional layer performs dilated convolution, full convolution, and asymmetric convolution, and the third convolutional layer performs dimensionality increase. The segmentation unit is further configured to, when the target image is of a second type, adjust a bilateral segmentation network model and perform semantic segmentation on the target image using the adjusted segmentation network model, where adjusting the segmentation network model includes: the bilateral segmentation network model includes a backbone network and an auxiliary network; the backbone network consists of two layers, each layer including a convolutional layer, a batch normalization layer, and a nonlinear activation function, reducing the number of feature maps of the backbone network's output channels; and the auxiliary network adopts a lightweight model framework, reducing the number of feature maps of the backbone network's output channels, the lightweight model including one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet; where the number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type.
Optionally, the apparatus further includes: a third determining module, configured to determine whether the image to be recognized is a target image, and, when the image to be recognized is a target image, to determine to perform semantic segmentation on the target image.
Optionally, the third determining module is further configured to determine the type of the region of interest in the following manner: dividing the region of interest evenly into a preset number of non-overlapping blocks; determining a feature value for each of the preset number of non-overlapping blocks to obtain the preset number of feature values; combining the preset number of feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the region of interest.
Optionally, the specified type of parameter information includes curve information reflecting the trend of blood glucose data over time, or value information of discrete points in a coordinate system reflecting the trend of blood glucose data over time.
Optionally, the apparatus further includes: a display module, configured to display the parameter value corresponding to the coordinates of the selected pixel.
Optionally, the first determining module is further configured to receive a user's instruction for the target image, and to determine the coordinates of the selected pixel according to the instruction.
Optionally, the instruction is determined based on one of the following: receiving the user's touch point position information on the human-computer interaction interface where the target image is located; or receiving query information input by the user.
Optionally, the apparatus further includes: a judging module, configured to, when the instruction is to receive the user's touch point position information on the human-computer interaction interface where the target image is located, judge, before the coordinates of the selected pixel are determined based on the touch point position, whether the touch point position is located in the target area; and a triggering module, configured to trigger determination of the coordinates of the selected pixel when the judgment result indicates that the touch point position is located in the target area.
According to yet another aspect of the embodiments of the present application, a non-volatile storage medium is provided. The non-volatile storage medium includes a stored program, where, when the program runs, the device on which the non-volatile storage medium is located is controlled to execute the image recognition method described above.
According to yet another aspect of the embodiments of the present application, a processor is provided. The processor is configured to run a program, where the image recognition method described above is executed when the program runs.
In the embodiments of the present application, the parameter value corresponding to the coordinates of a selected pixel is determined according to the association relationship between pixel coordinates in the target image and values of a specified type of parameter. Because this association relationship is used to recognize the parameter value corresponding to any pixel coordinates in the image, recognition of parameter values represented by non-character information in the image is achieved, and the goal of automatically recognizing pixels in an image as corresponding parameter values is reached. This solves the technical problem that current image recognition methods can only recognize numerical values in character format in an image and cannot automatically recognize curves or discrete points as numerical values.
Description of the drawings
The drawings described here are used to provide a further understanding of the present application and constitute a part of the present application. The exemplary embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of the present application. In the drawings:
Fig. 1 is a flowchart of an optional method for recognizing blood glucose data in an embodiment of the present application;
Fig. 2 is a flowchart of an image recognition method in another embodiment of the present application;
Fig. 3 is an example diagram of optional ROI region extraction according to an embodiment of the present application;
Figs. 4a-4d are example diagrams of an optional blood glucose curve detection and segmentation process according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an optional simplified BiSeNet-Xception39 model according to an embodiment of the present application;
Fig. 6 shows optional statistical results of the R-squared error distribution for 8-hour images according to an embodiment of the present application;
Fig. 7 is an optional error value distribution diagram for 8-hour images according to an embodiment of the present application;
Fig. 8 is a schematic diagram of optional statistical results of the R-squared error distribution for 24-hour images according to an embodiment of the present application;
Fig. 9 is an optional error value distribution diagram for 24-hour images according to an embodiment of the present application;
Fig. 10 is a structural block diagram of an image recognition apparatus according to an embodiment of the present application;
Fig. 11 is a structural block diagram of another optional image recognition apparatus according to an embodiment of the present application;
Fig. 12 is a flowchart of a data display method according to an embodiment of the present application;
Fig. 13 is a flowchart of another image recognition method according to an embodiment of the present application;
Fig. 14 is a flowchart of yet another image recognition method according to an embodiment of the present application.
Detailed description
In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present application.
It should be noted that the terms "first", "second", etc. in the description and claims of the present application and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application described here can be implemented in an order other than those illustrated or described here. In addition, the terms "including" and "having", and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
In related technologies, when recognizing the data in an image, often only the parameter values corresponding to character information can be recognized, while the parameter values corresponding to non-character information cannot. Taking an Abbott continuous glucose monitoring device as an example: the flash scanner of the device can obtain a curve reflecting the change trend over an 8-hour period, but cannot accurately measure the precise blood glucose values within that 8-hour period. During a flash scan, only the blood glucose value at the scan time point can be obtained, or existing text recognition technology can recognize the fixed time points shown on the scanner and the blood glucose value at the current scan time. However, these recognition schemes do not allow a blood glucose tester to track continuous blood glucose values at every moment, let alone provide a basis for subsequent systematic blood glucose management and real-time pushing of intervention plans. In other words, existing OCR technology cannot recognize the blood glucose value at an arbitrary point on the blood glucose curve. The embodiments of the present application use the association relationship between pixel coordinates and actual parameter values to recognize the parameter value at any point on a curve, simplify the process of converting an image into quantitative values, and store the values in a database in a certain format, providing support for subsequent blood glucose analysis and intervention plan generation. At the same time, this also facilitates the export and storage of blood glucose data, enables blood glucose to be recorded and managed on a mobile phone, and improves the user experience.
In some embodiments of the present application, the recognition of blood glucose data in a blood glucose image is taken as an example to illustrate how to recognize corresponding parameter values at the pixel level. As shown in Fig. 1, the process includes the following steps:
Step S102: Receive an image uploaded by a user.
Step S104: Identify whether the image uploaded by the user is a blood glucose meter image to be processed. To ensure the completeness and standardization of the data for the blood glucose image recognition algorithm, a deep network model is used for image classification before the algorithm model is called, such as MobileNet, Xception, SqueezeNet, or another image classification model. The confidence output by the classification model is used to determine whether the current image is an image to be processed; for example, the confidence threshold is set to 0.85 to ensure the quality of images uploaded by users.
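The confidence gate in step S104 can be sketched as follows. The classifier itself (MobileNet, Xception, etc.) is stubbed out; the class name `glucose_meter` and the helper name `should_process` are illustrative, and only the 0.85 threshold comes from the example above.

```python
CONFIDENCE_THRESHOLD = 0.85  # example threshold from the embodiment

def should_process(class_probs, target_class="glucose_meter"):
    """class_probs: dict mapping class name -> predicted probability, as a
    classification model might return. The recognition pipeline only runs
    when the target class's confidence reaches the threshold."""
    return class_probs.get(target_class, 0.0) >= CONFIDENCE_THRESHOLD
```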
步骤S106,利用语义分割网络进行图像分割。具体地,分割整幅图像中前景信息,即血糖仪图像中高亮屏幕部分。用户上传图像包含各种噪声,为保证算法返回结果的精确性,在图像预分割部分选用深度学习中语义分割模型,如BiSeNet、ICNet、BSPNet、ENET等一切语义分割模型。从数据复杂程度及实际应用速度要求考虑,选择具有实时分割特性的网络模型。Step S106: Perform image segmentation using a semantic segmentation network. Specifically, the foreground information in the whole image, i.e., the highlighted screen portion of the blood glucose meter image, is segmented. Images uploaded by users contain various kinds of noise; to ensure the accuracy of the results returned by the algorithm, a deep-learning semantic segmentation model, such as BiSeNet, ICNet, BSPNet, ENET, or any other semantic segmentation model, is used in the image pre-segmentation stage. Considering data complexity and the speed requirements of practical applications, a network model with real-time segmentation characteristics is selected.
步骤S108,进行图像校正。根据语义分割结果进行四边形拟合、角点检测并有序返回屏幕四个角点坐标信息。利用角点信息计算投影变换矩阵并进行图像投影变换。图像方向判断——根据图像中灰度、颜色、纹理等信息,进行图像旋转,返回血糖仪屏幕正方向图像(ROI,Region of Interest,感兴趣区域)。Step S108: Perform image correction. Quadrilateral fitting and corner detection are performed according to the semantic segmentation result, and the coordinates of the four screen corners are returned in order. The corner information is used to compute the projection transformation matrix and perform the image projection transformation. Image orientation judgment: according to the grayscale, color, texture, and other information in the image, the image is rotated so that the upright image of the blood glucose meter screen (ROI, Region of Interest) is returned.
步骤S110,局部标准差及颜色特征提取——判断图像类型,其中图像类型包含N小时和24小时图像,其中,N小于24。Step S110: Extract local standard deviation and color features to judge the image type, where the image types include N-hour and 24-hour images, with N less than 24.
步骤S112,利用SVM分类器对图像进行分类,然后,利用标准色带检测,并进行血糖曲线的分割,其中,对于24小时的血糖图像利用OCR识别出其中的日期信息,对于8小时的血糖图像则利用OCR识别出血糖设备扫描的起始时间和终点时间。当然,识别图像中的具体数据的方案除了OCR之外,还可以采用数字图像处理(DIP,Digital Image Processing)技术。Step S112: Use an SVM classifier to classify the image, then detect the standard color band and segment the blood glucose curve. For a 24-hour blood glucose image, OCR is used to recognize the date information; for an 8-hour blood glucose image, OCR is used to recognize the start time and end time of the blood glucose device scan. Of course, besides OCR, digital image processing (DIP) techniques can also be adopted to recognize the specific data in the image.
步骤S114,确定血糖值与图像像素之间的映射关系。Step S114: Determine the mapping relationship between the blood glucose level and the image pixels.
步骤S116,利用上述映射关系计算某一像素点的血糖值,并输出计算得到的血糖值。Step S116: Calculate the blood glucose value of a certain pixel by using the above mapping relationship, and output the calculated blood glucose value.
基于上述实施例,本申请实施例提供了一种图像识别方法的方法实施例,需要说明的是,在附图的流程图示出的步骤可以在诸如一组计算机可执行指令的计算机系统中执行,并且,虽然在流程图中示出了逻辑顺序,但是在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤。Based on the above embodiments, the embodiments of the present application provide a method embodiment of an image recognition method. It should be noted that the steps shown in the flowchart of the accompanying drawings may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from the one here.
图2是根据本申请的另一个实施例的图像识别方法的流程图,如图2所示,该方法包括如下步骤:Fig. 2 is a flowchart of an image recognition method according to another embodiment of the present application. As shown in Fig. 2, the method includes the following steps:
步骤S202,获取待识别的目标图像;Step S202, acquiring a target image to be recognized;
步骤S204,获取所述目标图像中的目标区域,其中,该目标区域中的图像用于反映指定类型参数信息;Step S204: Acquire a target area in the target image, where the image in the target area is used to reflect specified type parameter information;
步骤S206,确定所述目标区域中被选定的像素点坐标;Step S206: Determine the coordinates of the selected pixel in the target area;
步骤S208,基于指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标所对应的参数取值。Step S208: Determine the value of the parameter corresponding to the selected pixel point coordinate based on the correlation between the value of the specified type parameter and the pixel point coordinate.
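As an illustrative sketch of step S208 (not a limitation of the claims), the association between pixel coordinates and parameter values can be a linear map built from two reference points whose actual values are known; the Python function below is a hypothetical example, with names chosen here for illustration only.

```python
def linear_association(ref1, ref2):
    """Build a linear pixel-coordinate -> parameter-value map from two
    reference points given as (pixel_y, actual_value) pairs."""
    (y1, v1), (y2, v2) = ref1, ref2
    slope = (v2 - v1) / (y2 - y1)
    # returned callable implements step S208 for any selected pixel coordinate
    return lambda pixel_y: v1 + slope * (pixel_y - y1)
```

For instance, with the standard color band edges as the two reference points, the returned callable maps any pixel height on the curve to its parameter value.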
通过上述实施例提供的方案,由于采用目标图像中像素点坐标和指定类型参数的取值之间的关联关系识别图像中任意像素点坐标对应的参数取值,因此,实现了对图像中非字符信息所表示的参数取值的识别,达到了将图像中的像素点自动识别为相应的参数取值的目的,进而解决了当前的图像识别方式只能识别出图像中的字符格式的数值,不能将曲线或离散点自动识别为数值的技术问题。Through the solution provided by the above embodiment, since the association between pixel coordinates in the target image and values of the specified type parameter is used to recognize the parameter value corresponding to any pixel coordinate in the image, recognition of parameter values represented by non-character information in the image is realized, achieving the purpose of automatically recognizing pixels in the image as corresponding parameter values, and thereby solving the technical problem that current image recognition methods can only recognize numerical values in character format in an image and cannot automatically recognize curves or discrete points as numerical values.
对于上述关联关系,其表现方式有多种,例如,可以表现为映射关系,也可以表现为线性函数关系等,其中,对于前者,可以通过以下方式实现:在基于指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标所对应的参数取值之前,从所述目标区域中分离出指定颜色通道,其中,所述指定颜色通道为R、G、B颜色通道中与所述目标区域的标准色带所对应颜色通道相同的颜色通道;对所述指定颜色通道的图像进行图像二值化处理,二值化处理为选择指定颜色通道的图像中大于预设阈值的像素点集合,得到二值化图像;从预设阈值集合中选择所述二值化图像中每个像素点所在区域所对应的阈值,并利用选择的阈值对所述目标区域进行图像分割;对分割后得到的二值化图像进行参考点像素识别,得到所述标准色带的至少两个参考点在图像中的像素点坐标;确定所述至少两个参考点的实际取值以及所述像素点坐标之间的对应关系,并基于所述对应关系建立所述指定类型参数的取值与像素点坐标的线性关系,并将所述线性关系作为所述关联关系。The above association can be expressed in many ways; for example, it may be expressed as a mapping relationship or as a linear functional relationship. The former can be implemented as follows: before determining the parameter value corresponding to the selected pixel coordinates based on the association between values of the specified type parameter and pixel coordinates, a designated color channel is separated from the target area, where the designated color channel is the one among the R, G, and B color channels that matches the color channel corresponding to the standard color band of the target area; image binarization is performed on the image of the designated color channel, the binarization consisting of selecting the set of pixels in the designated-channel image that are greater than a preset threshold, to obtain a binarized image; a threshold corresponding to the region where each pixel of the binarized image is located is selected from a preset threshold set, and the selected threshold is used to segment the target area; reference-point pixel recognition is performed on the binarized image obtained after segmentation, yielding the pixel coordinates of at least two reference points of the standard color band in the image; the correspondence between the actual values of the at least two reference points and their pixel coordinates is determined, a linear relationship between values of the specified type parameter and pixel coordinates is established based on that correspondence, and the linear relationship is taken as the association.
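A minimal sketch of the binarization and reference-point steps just described, assuming the standard color band shows up as the bright region of the designated channel (hypothetical helpers in pure Python; a real pipeline would operate on image arrays):

```python
def binarize(channel, threshold):
    # keep pixels of the designated channel whose value exceeds the threshold
    return [[1 if p > threshold else 0 for p in row] for row in channel]

def band_edge_rows(mask):
    # rows containing at least one band pixel; their min/max give the upper
    # and lower edge heights (std_upper, std_lower) used as reference points
    rows = [y for y, row in enumerate(mask) if any(row)]
    return min(rows), max(rows)
```

The two returned row heights, paired with the known band values, are exactly the "at least two reference points" from which the linear relationship is built.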
具体地,以血糖曲线中血糖数据的识别为例,上述关联关系表现为映射关系,此时,需要用到标准色带。该检测过程是有效建立血糖实际值与血糖图像像素坐标之间的关系的一种途径,其目的就是寻找血糖曲线上像素坐标与实际血糖值的对应线性关系。检测过程中,对血糖图像ROI区域进行颜色通道R,G,B分离,因标准色带呈现蓝色特征,故提取B通道进行图像处理,首先,对该通道图像进行灰度拉伸以及灰度值归一化处理,然后,对图像进行自适应阈值分割,同时,利用图像形态学处理完成分割图像的噪声处理,完成标准色带图像分割操作,最后,对上述过程中的二值图像进行横向直线拟合,拟合结果为标准色带上下线在图像中的像素坐标高度。常见血糖仪扫描仪(即血糖设备)中,标准色带的上下线血糖值分别可以为3.9和7.8。由实际标准色带上的已知实际血糖值以及其对应的像素坐标高度建立其线性关系:Specifically, taking the recognition of blood glucose data in a blood glucose curve as an example, the above association is expressed as a mapping relationship, and a standard color band is required. This detection process is one way to effectively establish the relationship between actual blood glucose values and pixel coordinates in the blood glucose image; its purpose is to find the linear correspondence between pixel coordinates on the blood glucose curve and actual blood glucose values. During detection, the R, G, and B color channels of the ROI area of the blood glucose image are separated; because the standard color band exhibits blue characteristics, the B channel is extracted for image processing. First, grayscale stretching and grayscale value normalization are performed on this channel image; then adaptive threshold segmentation is applied to the image, while image morphology operations are used to denoise the segmented image, completing the standard color band segmentation. Finally, horizontal straight-line fitting is performed on the binary image from the above process, and the fitting result gives the pixel-coordinate heights of the upper and lower lines of the standard color band in the image. In a common blood glucose meter scanner (i.e., blood glucose device), the blood glucose values of the upper and lower lines of the standard color band may be 3.9 and 7.8, respectively. The linear relationship is established from the known actual blood glucose values on the standard color band and their corresponding pixel-coordinate heights:
sValue = 3.9/(std_lower-std_upper)
rValue = 5.85 + 0.5*sValue*(std_lower+std_upper) - sValue*line_rho
其中,sValue:图像像素与实际血糖值比例系数 where sValue is the ratio coefficient between image pixels and actual blood glucose values;
rValue:图像处理返回的实际血糖值 rValue is the actual blood glucose value returned by image processing;
line_rho:图像中血糖曲线像素高度 line_rho is the pixel height of the blood glucose curve in the image;
std_upper:标准色带上边缘像素高度 std_upper is the pixel height of the upper edge of the standard color band;
std_lower:标准色带下边缘像素高度 std_lower is the pixel height of the lower edge of the standard color band.
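The two formulas above can be transcribed directly into code and checked numerically; the sketch below uses the band values 3.9 and 7.8 given in the text (note that 5.85 is their midpoint, and that in image coordinates the upper band edge, the smaller row index, maps to the larger glucose value):

```python
def pixel_to_glucose(line_rho, std_upper, std_lower):
    # sValue: ratio coefficient between image pixels and actual glucose values
    s_value = 3.9 / (std_lower - std_upper)
    # rValue: actual blood glucose value returned by image processing
    return 5.85 + 0.5 * s_value * (std_lower + std_upper) - s_value * line_rho
```

With std_upper = 100 and std_lower = 200, a curve point at the upper band edge evaluates to 7.8 and one at the lower edge to 3.9, consistent with the band values stated above.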
上述目标区域的确定方式有多种,例如,可以对像素点进行聚类的方式确定,也可以将感兴趣区域作为目标区域,对于前者,可以通过以下方式进行处理:对目标区域中的图像进行灰度化处理,得到灰度图像;对灰度图像中的各个像素点进行聚类处理,得到多个簇;从多个簇中选择指定簇,并从指定簇中的所有像素点中确定被选定的像素点坐标。There are multiple ways to determine the above target area; for example, it can be determined by clustering pixels, or the region of interest can be taken as the target area. For the former, the processing can be as follows: perform grayscale conversion on the image in the target area to obtain a grayscale image; perform clustering on the pixels of the grayscale image to obtain multiple clusters; select a designated cluster from the multiple clusters, and determine the coordinates of the selected pixel from all pixels in the designated cluster.
另外,本实施例中的目标图像可以是针对血糖仪数据生成的需要展示给客户的原始图像,例如,分析血液中血糖含量值与时间值的曲线图,其中该曲线图处于由时间坐标轴与血糖含量坐标轴构成的坐标系之中,那么这个曲线图便是该血糖仪数据分析的原始待处理图像,即目标图像。In addition, the target image in this embodiment may be an original image generated from blood glucose meter data that needs to be displayed to the customer, for example, a graph of blood glucose level against time, where the graph lies in a coordinate system formed by a time axis and a blood glucose level axis; that graph is then the original image to be processed for the blood glucose meter data analysis, i.e., the target image.
在本申请的一些实施例中,在从多个簇中选择指定簇,并从指定簇中的所有像素点中确定被选定的像素点坐标时,选择指定簇所依据的原则根据实际情况灵活确定,例如,从多个簇中选择像素点数量最少的簇,并从像素点数量最少的簇中确定被选定的像素点坐标。In some embodiments of the present application, when a designated cluster is selected from multiple clusters and the coordinates of the selected pixel are determined from all pixels in the designated cluster, the principle for selecting the designated cluster can be flexibly determined according to the actual situation; for example, the cluster with the fewest pixels is selected from the multiple clusters, and the coordinates of the selected pixel are determined from that cluster.
具体地,在对上述目标区域进行检测时,可以表现为对图像中的曲线进行检测,以血糖曲线为例,为尽可能减少图像噪点的存在,对输入图像进行截取局部区域rect操作,参数设置如表1所示:Specifically, detecting the above target area can take the form of detecting a curve in the image. Taking the blood glucose curve as an example, to minimize image noise, a local-region rect cropping operation is applied to the input image, with the parameters set as shown in Table 1:
表1血糖曲线检测RECT参数详情Table 1 Details of RECT parameters for blood glucose curve detection
Figure PCTCN2020100247-appb-000001
如图4a所示,图像中血糖曲线可以由黑色和红色(图中未区分颜色)共同连接而成,因此,对图像进行颜色通道分离并提取出R通道,此时,如图4b所示,图像灰度分布呈现黑色、灰色以及白色三种分布趋势,故设置聚类中心个数为3(如图4c所示),利用KMeans对图像进行图像灰度聚类,而血糖曲线在图像中所占区域最少,因此,提取分类结果中类别数量最少的类别作为血糖曲线的类别(如图4d所示),对该类别图像进行图像后处理操作完成血糖曲线分割与检测过程。As shown in Figure 4a, the blood glucose curve in the image may be formed by connected black and red segments (colors are not distinguished in the figure); therefore, the color channels of the image are separated and the R channel is extracted. At this point, as shown in Figure 4b, the grayscale distribution of the image presents three trends, black, gray, and white, so the number of cluster centers is set to 3 (as shown in Figure 4c) and KMeans is used to cluster the image grayscale values. Since the blood glucose curve occupies the smallest area in the image, the category with the fewest members in the clustering result is extracted as the blood glucose curve category (as shown in Figure 4d), and image post-processing on that category completes the blood glucose curve segmentation and detection process.
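The clustering step (three grayscale clusters, keeping the smallest as the curve) can be sketched in pure Python; a production system would use a library implementation of KMeans (e.g. from scikit-learn), so the crude 1-D version below is only an illustrative stand-in:

```python
def kmeans_1d(values, k=3, iters=20):
    # crude 1-D k-means: centres initialised evenly over the value range
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[min(range(k), key=lambda i: abs(v - centres[i]))].append(v)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters

def curve_mask(values, k=3):
    # the glucose curve occupies the fewest pixels, so keep the smallest cluster
    centres, clusters = kmeans_1d(values, k)
    target = min(range(k), key=lambda i: len(clusters[i]))
    return [min(range(k), key=lambda i: abs(v - centres[i])) == target
            for v in values]
```

For example, on a pixel population dominated by white background and gray text, the small population of dark curve pixels ends up in the smallest cluster and is selected.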
正如上面所述,可以依据感兴趣区域确定目标区域,具体地:对目标图像进行语义分割,得到目标图像的掩码图像和前景图像;从前景图像中确定感兴趣区域,并将感兴趣区域作为目标区域。As mentioned above, the target area can be determined according to the region of interest, specifically: perform semantic segmentation on the target image to obtain the mask image and foreground image of the target image; determine the region of interest from the foreground image, and take the region of interest as the target area.
其中,可以通过以下方式从前景图像中确定感兴趣区域:确定前景图像中的特征区域,以及目标几何区域的角点坐标,其中,特征区域为前景图像中包含指定类型参数信息的区域;基于角点坐标计算投影变换矩阵;对特征区域中的像素点进行投影变换,得到感兴趣区域。其中,在进行投影变换过程中,还可以包括对图像的旋转处理,以保证能够正确识别感兴趣区域。The region of interest can be determined from the foreground image as follows: determine the feature region in the foreground image and the corner coordinates of the target geometric region, where the feature region is the region in the foreground image that contains the specified type of parameter information; compute the projection transformation matrix based on the corner coordinates; and perform projection transformation on the pixels in the feature region to obtain the region of interest. The projection transformation process may also include rotating the image to ensure that the region of interest can be correctly recognized.
利用预分割掩码mask以及图像形态学处理进行四边形拟合,有序(逆时针)返回四边形角点坐标(左上,左下,右下,右上,包含但不局限于此顺序)。然后,由返回的角点坐标计算投影变换矩阵,对血糖图像的高亮区域进行投影变换得到ROI区域,其中,变换过程如图3所示。Use the pre-segmentation mask and image morphology processing to perform quadrilateral fitting, and return the quadrilateral corner coordinates (upper left, lower left, lower right, upper right, including but not limited to this order) in order (counterclockwise). Then, the projection transformation matrix is calculated from the returned corner coordinates, and the highlight area of the blood glucose image is projected and transformed to obtain the ROI area. The transformation process is shown in Figure 3.
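The projection-transform step can be illustrated by solving for the 3x3 homography that maps the four detected screen corners to an upright rectangle. The sketch below (pure Python, Gauss-Jordan elimination on the standard 8-equation system) is an assumed stand-in for library routines such as OpenCV's getPerspectiveTransform:

```python
def solve_homography(src, dst):
    # src, dst: four (x, y) corner pairs; returns H (3x3, H[2][2] = 1)
    # such that dst ~ H * src in homogeneous coordinates
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    n = 8
    M = [A[i] + [b[i]] for i in range(n)]
    for col in range(n):                      # Gauss-Jordan with pivoting
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    h = [M[i][8] / M[i][i] for i in range(n)] + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, pt):
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

Applying the resulting matrix to every pixel of the highlighted screen region yields the upright ROI; the corner ordering (upper-left, lower-left, lower-right, upper-right) follows the counterclockwise convention described above.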
在本申请的另一些可选的实施例中,上述目标区域中的图像包括:坐标系中的曲线图像,该曲线图像中的曲线用于反映指定类型参数在不同时刻的取值。In some other optional embodiments of the present application, the image in the above-mentioned target area includes: a curve image in a coordinate system, and the curve in the curve image is used to reflect the value of the specified type parameter at different times.
在对图像进行识别时,为方便用户查询具体参数值所对应的时刻,在确定像素点坐标时,还可以确定像素点坐标在曲线图像中对应的目标记录时间;此时,步骤S208在确定选择的像素点所对应的参数取值时,可以表现为以下处理过程:基于指定类型参数的取值与像素点坐标的关联关系确定被选定的像素点坐标在目标记录时间所对应的参数取值。When recognizing the image, to make it convenient for the user to query the moment corresponding to a specific parameter value, the target recording time corresponding to the pixel coordinates in the curve image can also be determined when determining the pixel coordinates. In this case, when step S208 determines the parameter value corresponding to the selected pixel, the processing can be as follows: based on the association between values of the specified type parameter and pixel coordinates, determine the parameter value corresponding to the selected pixel coordinates at the target recording time.
在本申请的一些实施例中,目标记录时间可以通过以下方式确定:识别曲线图像中的字符信息,从字符信息中提取指定类型参数的时间信息;对时间信息中任意的相邻两个记录时刻之间的时长按照像素点数量进行等间隔划分,得到多个时间点;从多个时间点中确定被选定的像素点坐标所属的目标记录时间,例如,在T1和T2时刻之间的横向像素数为N,则T1和T2之间横向上第M个像素点对应的时刻为T1+[M*(T2-T1)/N]。其中,可以采用OCR技术识别上述字符信息,但不限于此。In some embodiments of the present application, the target recording time can be determined as follows: recognize the character information in the curve image and extract the time information of the specified type parameter from the character information; divide the duration between any two adjacent recording moments in the time information into equal intervals according to the number of pixels, obtaining multiple time points; and determine, from the multiple time points, the target recording time to which the selected pixel coordinates belong. For example, if the number of horizontal pixels between moments T1 and T2 is N, the moment corresponding to the M-th horizontal pixel between T1 and T2 is T1+[M*(T2-T1)/N]. OCR technology can be used to recognize the above character information, but the recognition is not limited to OCR.
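The equal-interval interpolation T1+[M*(T2-T1)/N] can be written directly; the sketch below represents times as minutes since midnight, an assumption made here for illustration:

```python
def pixel_to_time(m, n, t1, t2):
    # moment of the m-th of n horizontal pixels between adjacent
    # recorded times t1 and t2 (times in minutes since midnight)
    return t1 + m * (t2 - t1) / n
```

For example, if T1 = 8:00 (480 min) and T2 = 16:00 (960 min) span 100 horizontal pixels, the 50th pixel corresponds to 12:00 (720 min).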
对血糖图像中的时间信息进行字符识别,准确记录瞬感血糖时间,为后续形成连续血糖数据的管理提供有力保障。对于应用程序接口而言,庞大的数据输入势必会增加数据传输时间,同时,接口处理数据时长也会随之增加。因此,为提升字符识别速率,识别过程中对输入数据进行一定的数据前处理,通过实验测试,调整图像尺寸可直接影响识别速率和准确度,在具体实验测试中,固定输入图像大小为256*256(包括但不仅仅局限于该图像尺寸大小,可根据数据复杂程度适当调整)时,字符识别准确率和速率可取得相对较优结果。在固定输入图像大小为256*256后,截取局部区域RECT(RECT为矩形框缩写,其代表在图像中所截取的矩形区域)进行字符识别,8小时RECT区域参数设置为(235,10,20,225),24小时血糖图像RECT区域参数设置为(205,85,20,225),具体参数说明以8小时示例,235代表截取区域图像在大小为256*256输入图像中起始点的纵坐标,10为截取区域图像在大小为256*256输入图像中起始点的横坐标,20代表该截取区域的高度,225则代表截取区域的宽度。具体实例如表2所示:Character recognition is performed on the time information in the blood glucose image to accurately record the instant-scan blood glucose time, providing a solid guarantee for the subsequent management of continuous blood glucose data. For the application program interface, large data inputs inevitably increase data transmission time, and the time the interface spends processing data increases accordingly. Therefore, to increase the character recognition rate, certain data pre-processing is applied to the input data during recognition. Experimental tests show that adjusting the image size directly affects recognition speed and accuracy; in the specific tests, when the input image size is fixed at 256*256 (including but not limited to this size, which can be adjusted appropriately according to data complexity), relatively good character recognition accuracy and speed are obtained. After the input image size is fixed at 256*256, a local region RECT (RECT is short for rectangle, representing the rectangular region cropped from the image) is extracted for character recognition. For the 8-hour image, the RECT region parameters are set to (235, 10, 20, 225); for the 24-hour blood glucose image, the RECT region parameters are set to (205, 85, 20, 225). Taking the 8-hour case as an example, 235 represents the vertical coordinate of the starting point of the cropped region in the 256*256 input image, 10 is the horizontal coordinate of that starting point, 20 represents the height of the cropped region, and 225 represents its width. Specific examples are shown in Table 2:
表2截取区域RECT参数详情及示例Table 2 Details and examples of RECT parameters in the interception area
Figure PCTCN2020100247-appb-000002
可选地,在目标图像为第一类型的情况下,采用高效神经网络模型对目标图像进行语义分割,其中,高效神经网络模型包括:初始化模块和瓶颈模块,其中,每个瓶颈模块包括三个卷积层,其中,三个卷积层中的第一卷积层用于进行降维处理,第二卷积层用于进行空洞卷积、全卷积和非对称卷积,第三卷积层用于进行升维处理;在目标图像为第二类型的情况下,对双边分割网络模型进行调整,并利用调整后的分割网络模型对目标图像进行语义分割,其中,对分割网络模型进行调整包括:所述双边分割网络模型包括主干网络和辅助网络,所述主干网络由两层构成,每层主干网络分别包括卷积层、批归一化层和非线性激活函数,降低主干网络输出通道特征图数;所述辅助网络模型框架采用轻量级模型,降低主干网络输出通道特征图数,所述轻量级模型包括以下之一:Xception39、SqueezeNet、Xception、MobileNet、ShuffleNet;其中,所述第一类型所对应第一数据集的图像数量小于第二类型所对应第二数据集中的图像数量。Optionally, when the target image is of the first type, an efficient neural network model is used for semantic segmentation of the target image, where the efficient neural network model includes an initialization module and bottleneck modules, each bottleneck module containing three convolutional layers: the first convolutional layer performs dimensionality reduction, the second performs dilated convolution, full convolution, and asymmetric convolution, and the third restores the dimensionality. When the target image is of the second type, the bilateral segmentation network model is adjusted, and the adjusted segmentation network model is used for semantic segmentation of the target image, where adjusting the segmentation network model includes: the bilateral segmentation network model comprises a backbone network and an auxiliary network; the backbone network consists of two layers, each containing a convolutional layer, a batch normalization layer, and a nonlinear activation function, and the number of feature maps in the backbone network's output channels is reduced; the auxiliary network adopts a lightweight model framework, reducing the number of feature maps in the backbone network's output channels, where the lightweight model includes one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet. The number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type. That is, the images in the first data set are of the first type and the images in the second data set are of the second type, and the first data set contains fewer images than the second. In one embodiment of this application, either the efficient neural network (i.e., the segmentation model used for the first type) or the segmentation network model obtained by adjusting the bilateral segmentation network model (i.e., the model used for the second type) may be chosen to perform semantic segmentation on the target image. In a preferred embodiment, the adjusted bilateral segmentation network model is used: when processing large-scale data it has the advantages of fast processing and of retaining a certain richness of spatial information while simultaneously preserving the size of the receptive field. In addition, the role of the above main network is to retain rich spatial information, while the role of the auxiliary network is to preserve the receptive field size.
在本申请的一些实施例中,由于调整后的双边分割网络模型处理能力强,因此,其不仅可处理大规模数据,也可以支持小规模数据的处理。同样,在数据规模较小时,可以采用高效神经网络模型对图像进行处理,在数据规模较大时,也可以手动或自动切换至调整后的双边分割网络模型进行处理。数据规模的大或小,在特定硬件能力的背景下,基于具体查询请求的数量,或基于时间段内的查询请求的统计数量决定,其中,查询请求可以用于请求对图像进行处理(例如,对用户上传的图像进行识别)。In some embodiments of the present application, due to the strong processing capability of the adjusted bilateral segmentation network model, it can process not only large-scale data but also small-scale data. Likewise, when the data scale is small, the efficient neural network model can be used to process images; when the data scale is large, processing can also be switched, manually or automatically, to the adjusted bilateral segmentation network model. Whether the data scale is large or small is decided, in the context of specific hardware capabilities, based on the number of specific query requests or on the statistical number of query requests within a time period, where a query request can be used to request image processing (for example, recognition of images uploaded by users).
正如上面所述,数据集可以由待处理的图像组成,在此基础上,上述第二数据集为满足以下条件的数据集:在特定时间段内数据集中需要处理的目标图像(例如待识别图像)的数量大于预设阈值;第一数据集为满足以下条件的数据集:在特定时间段内的数据集中需要处理的目标图像(例如待识别图像)的数量小于预设阈值。As mentioned above, a data set can consist of images to be processed. On this basis, the above second data set is a data set satisfying the following condition: the number of target images to be processed (for example, images to be recognized) in the data set within a specific time period is greater than a preset threshold. The first data set is a data set satisfying the following condition: the number of target images to be processed (for example, images to be recognized) in the data set within a specific time period is less than the preset threshold.
其中,上述特定时间段的确定方式有多种,例如:Among them, there are many ways to determine the above specific time period, for example:
在对目标图像的数量进行统计时,可以是每隔预设时长统计一次,将任意一个用于统计目标图像的数量的时间段作为上述特定时间段,例如,将在获取所述目标图像中的目标区域这一动作的起始时间点为起点、将其与中点或终点之间的预设时长(例如3ms、5ms、30ms、50ms、100ms、500ms、1s或2s等)作为所述特定时间段;例如,将每天的24小时平均划分为12个时间段,则12个时间段中的任意一个时间段为上述特定时间段。此时,如果某个时间段内接到的目标图像或待处理的目标图像的数量大于预设阈值,则该某个时间段对应的数据集为第二数据集,如果某个时间段内接到的目标图像或待处理的目标图像的数量小于预设阈值,则该某个时间段对应的数据集为第一数据集。When counting the number of target images, the count can be taken once every preset interval, and any time period used for counting the number of target images can serve as the above specific time period; for example, the specific time period can start at the time point when the action of acquiring the target area in the target image begins and last a preset duration (for example, 3 ms, 5 ms, 30 ms, 50 ms, 100 ms, 500 ms, 1 s, or 2 s). As another example, if the 24 hours of each day are evenly divided into 12 time periods, any one of the 12 time periods is the above specific time period. In this case, if the number of target images received or awaiting processing within a certain time period is greater than the preset threshold, the data set corresponding to that time period is the second data set; if the number of target images received or awaiting processing within a certain time period is less than the preset threshold, the data set corresponding to that time period is the first data set.
又例如:Another example:
上述特定时间段可以是预先设定的固定的时间段,例如,每1s、1min、1h等等。The aforementioned specific time period may be a predetermined fixed time period, for example, every 1 s, 1 min, 1 h, and so on.
本申请的一个实施方式中,第一数据集和第二数据集中的图像数量是可以动态变化的,在进行目标图像的语义分割时,可根据第一数据集和第二数据集中的目标图像的数量的多少确定选择采用第一类型对应的模型进行语义分割或采用第二类型对应的模型进行语义分割。本申请的一个具体实施例中,数据规模的大或小,在特定硬件能力的背景下,基于具体查询网络请求的数量决定,例如,如果使用CPU服务器,且其具备不少于10000个CPU核的计算能力,响应时间要求小于3秒,数据集中目标图像的数量大于等于50000被认为是第二数据集,小于50000被认为是第一数据集;又例如,使用GPU服务器,则其包含不少于等效于24张NVIDIA V100 GPU的计算能力,响应时间要求小于3秒,数据集中目标图像的数量大于等于50000被认为是第二数据集,小于50000被认为是第一数据集。本申请的另一个实施例中,数据规模的大或小,在特定硬件能力的背景下,基于时间段的统计数量决定。例如,在上午9-12点左右由于查询的用户比较多,可以采用第二类型对应的模型进行语义分割,在晚上00:00-08:00时段查询数据的用户比较少,可以采用第一类型对应的模型进行语义分割,在实际应用时:接收用户上传的目标图像;确定目标图像的上传时间;确定所述上传时间对应的时间段;依据该时间段确定对目标图像进行语义分割的分割网络模型,并依据确定的分割网络模型对目标图像进行语义分割。In an embodiment of the present application, the number of images in the first and second data sets can change dynamically; when performing semantic segmentation of the target image, whether to use the model corresponding to the first type or the model corresponding to the second type can be decided according to the number of target images in the first and second data sets. In a specific embodiment of this application, whether the data scale is large or small is decided, in the context of specific hardware capabilities, by the number of specific network query requests. For example, if a CPU server with computing capacity of no fewer than 10,000 CPU cores is used and the response time requirement is under 3 seconds, a data set containing 50,000 or more target images is regarded as the second data set and one with fewer than 50,000 as the first data set; likewise, with a GPU server containing computing capacity equivalent to no fewer than 24 NVIDIA V100 GPUs and a response-time requirement under 3 seconds, a data set with 50,000 or more target images is regarded as the second data set and one with fewer than 50,000 as the first data set. In another embodiment of the present application, whether the data scale is large or small is decided, in the context of specific hardware capabilities, by statistics over time periods. For example, because there are more querying users from about 9:00 to 12:00 in the morning, the model corresponding to the second type can be used for semantic segmentation; from 00:00 to 08:00 at night fewer users query data, so the model corresponding to the first type can be used. In actual application: receive the target image uploaded by the user; determine the upload time of the target image; determine the time period corresponding to the upload time; determine, according to that time period, the segmentation network model for semantic segmentation of the target image; and perform semantic segmentation on the target image according to the determined segmentation network model.
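Under the example figures given in the text (a 50,000-image threshold, given sufficient hardware and a sub-3-second response requirement), the model choice can be sketched as a simple gate; the string labels below are shorthand for the two segmentation models described above, not names from the source:

```python
def choose_segmentation_model(pending_images, threshold=50000):
    # second data set (large scale) -> adjusted bilateral network (BiSeNet-based)
    # first data set (small scale)  -> efficient network (ENet-based)
    return "adjusted-bisenet" if pending_images >= threshold else "enet"
```

The same gate could equally be keyed on the upload time period, as in the 9:00-12:00 versus 00:00-08:00 example above.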
其中,为避免运算资源的浪费,在对目标图像进行语义分割处理,得到目标图像的感兴趣区域之前,还可以确定目标图像的类型;在类型为预设类型时,确定对目标图像进行语义分割。在确定目标图像的类型时,可以表现为以下处理过程:将目标图像均分成预设数量个不重叠滑块;确定预设数量个不重叠滑块的特征值,得到预设数量个特征值;将预设数量个特征值组合成特征向量;将特征向量输入至支持向量机分类器进行分析,得到目标图像的类型。具体地,以血糖图像为例,比如雅培连续血糖设备的血糖图像的类型包括8小时以及24小时两种类型。首先提取图像局部方差特征以及局部颜色特征,具体而言,将大小为256*256的输入图像均分成256个16*16的不重叠滑块,计算各个相对独立滑块部分方差和图像蓝色通道像素值的平均值,将方差及蓝色通道平均像素值组合成该独立滑块的特征值,然后,将256个滑块特征组合成维度为512的特征向量,最后,结合SVM(support vector machine,支持向量机)分类器实现图像二分类,完成血糖图像分类。Among them, to avoid wasting computing resources, before performing semantic segmentation on the target image to obtain its region of interest, the type of the target image may also be determined; semantic segmentation of the target image is performed only when the type is a preset type. Determining the type of the target image can proceed as follows: divide the target image evenly into a preset number of non-overlapping blocks; determine the feature values of the preset number of non-overlapping blocks to obtain the preset number of feature values; combine the preset number of feature values into a feature vector; and input the feature vector into a support vector machine classifier for analysis to obtain the type of the target image. Specifically, taking blood glucose images as an example, the blood glucose images of the Abbott continuous blood glucose device include two types: 8-hour and 24-hour. First, local variance features and local color features of the image are extracted. Specifically, the 256*256 input image is divided evenly into 256 non-overlapping 16*16 blocks; the local variance of each relatively independent block and the average pixel value of the image's blue channel are computed, and the variance and the blue-channel average pixel value are combined into the feature values of that block. Then, the 256 block features are combined into a feature vector of dimension 512. Finally, an SVM (support vector machine) classifier performs binary classification of the image, completing the blood glucose image classification.
Taking a blood glucose image as an example, when recognizing the image data it contains, image segmentation is performed on the image to be recognized so that the highlighted screen area can be extracted accurately. First, an ENET network pre-segments the user-uploaded image and returns the foreground-area mask. This network has few parameters, a small model size, and high accuracy. The basic building units of the ENET pre-segmentation network are: (1) an initialization module, and (2) bottleneck modules designed following the ResNet idea. Each bottleneck module contains three convolutional layers: the first performs dimensionality reduction; the second implements dilated convolution, full convolution, asymmetric convolution, and the like; and the third restores the dimensionality. Each convolution is followed by Batch Normalization and PReLU. In the experiment, the total data set contained 644 samples, divided into 515 training images, 65 validation images, and 64 test images; the collected images covered multiple viewing angles, and all photos were evenly illuminated. During network training, the initial learning rate was 0.005, decayed once every 30 epochs, with a total of 300 epochs; the total is not limited to 300, and all network parameters can be adjusted according to the actual data. On this existing small data set, the trained blood glucose image segmentation model performed well; the training and test performance is shown in Table 2. The test environment was: 16 GB of memory, CPU model Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz. The model performance is shown in the table below, where IOU (Intersection over Union) is computed from the ground truth GT and the prediction PR: the final result is the intersection of GT and PR divided by the union of GT and PR, a metric commonly used in object detection and segmentation. The performance of the ENET semantic segmentation network model is shown in Table 3.
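The IOU metric described above can be computed directly from two binary masks. A minimal sketch (the toy masks are hypothetical):

```python
import numpy as np

def iou(gt_mask, pr_mask):
    """Intersection over Union between a ground-truth mask GT and a
    predicted mask PR: |GT ∩ PR| / |GT ∪ PR|."""
    gt = gt_mask.astype(bool)
    pr = pr_mask.astype(bool)
    inter = np.logical_and(gt, pr).sum()
    union = np.logical_or(gt, pr).sum()
    return inter / union if union else 1.0

gt = np.zeros((4, 4), dtype=int); gt[1:3, 1:3] = 1   # 4 foreground pixels
pr = np.zeros((4, 4), dtype=int); pr[1:3, 1:4] = 1   # 6 pixels, 4 overlap
print(iou(gt, pr))  # 4 / 6 ≈ 0.667
```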
Table 3 ENET semantic segmentation network model performance (small data set)
Figure PCTCN2020100247-appb-000003
As user data continued to grow, it was found during iteration of the segmentation network model that the ENET semantic segmentation network was no longer suitable for large data sets. As the data set grows and its complexity increases, ENET, in its pursuit of speed, cannot efficiently and reasonably balance the spatial information and the receptive field in the image; the network's performance on large data sets therefore no longer meets further application requirements. The new segmentation data set contained 4912 samples in total, divided into 4104 training images, 608 validation images, and 200 test images. During network training, the initial learning rate was 0.01, decayed once every 30 epochs, with a total of 300 epochs; all network parameters include but are not limited to the above values and can be adjusted according to the actual data. Under the same test environment, the model performance is shown in Table 4:
Table 4 ENET semantic segmentation network model performance (large data set)
Figure PCTCN2020100247-appb-000004
Therefore, to achieve adequate segmentation performance on the large training data set, a simplified BiSeNet model is proposed. The original BiSeNet segmentation model shows good speed and accuracy on public data sets (Cityscapes, CamVid, COCO-Stuff, etc.). The training data in the embodiments of this application are cleaner and less complex than those public data sets, so the BiSeNet semantic segmentation network is appropriately adjusted and simplified along four lines: (1) the spatial information processing layer (Spatial Path), (2) the receptive field processing layer (Context Path), (3) the number of input-output channels (feature maps) between network layers, and (4) the input image size. The specific simplifications are: (1) the Spatial Path of the backbone network is reduced from the original 3-layer network (each layer comprising the usual convolution layer conv, batch normalization layer Batch Normalization, and nonlinear activation function ReLU) to a 2-layer network, shown as Layer1 and Layer2 in Figure 2; at the same time, the output channels of this part are reduced from 128 feature maps to 64, which greatly reduces the network parameters and effectively compresses the model size, substantially increasing the segmentation speed while preserving segmentation accuracy; (2) the model framework of the auxiliary network Context Path is changed, replacing the original ResNet18/ResNet101 with the lighter Xception39 model, compressing the model size while effectively preserving the receptive field; (3) the number of feature maps (Feature Map) output by each network layer is reduced; (4) the model's input image size is compressed from the original 640*640 to 320*320; model training and testing showed that directly compressing the input image for segmentation still achieves adequate segmentation accuracy while markedly reducing the computational cost. The modified and simplified network structure is shown in Figure 5.
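The effect of trimming the Spatial Path to two conv-BN-ReLU layers with 64 output channels can be illustrated by tracing feature-map shapes. This is a sketch under assumptions: a 3x3 kernel, stride 2, and padding 1 per layer are assumed (the patent does not state these values), matching common BiSeNet implementations.

```python
def conv2d_out(size, kernel=3, stride=2, padding=1):
    """Output spatial size of one conv layer (square input assumed)."""
    return (size + 2 * padding - kernel) // stride + 1

def spatial_path_shapes(input_size=320, layers=2, channels=64):
    """Trace the feature-map shape through the trimmed Spatial Path:
    each layer is conv(3x3, stride 2) + BatchNorm + ReLU, 64 channels."""
    size = input_size
    shapes = []
    for _ in range(layers):
        size = conv2d_out(size)
        shapes.append((channels, size, size))
    return shapes

print(spatial_path_shapes())  # [(64, 160, 160), (64, 80, 80)]
```

With the compressed 320*320 input, two stride-2 layers leave an 80*80 map with 64 channels, versus three layers and 128 channels in the original design.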
Experimental results show that the segmentation model trained on the above data set of 4912 samples achieved better performance, meeting practical application requirements. In a test environment with 16 GB of memory and an Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz, the model performance is shown in Table 5:
Table 5 BiSeNet-Xception39 simplified segmentation model performance
Figure PCTCN2020100247-appb-000005
Optionally, the above specified type parameter information includes curve information reflecting the change trend of blood glucose data, or value information of discrete points in a coordinate system, where each discrete point corresponds to the blood glucose value at a sampling time.
After the above parameter value is determined, the parameter value corresponding to the coordinates of the selected pixel point at the target recording time may also be displayed.
In some embodiments of the present application, the coordinates of the selected pixel point in the target area may be determined as follows: detect a user instruction directed at the target image, and determine the coordinates of the selected pixel point according to the instruction.
Optionally, the instruction is determined based on one of the following: the user's touch position on the human-computer interaction interface where the target image is displayed; or query information input by the user. For the former, before the coordinates of the selected pixel point are determined based on the touch point position, the following processing may also be performed: judge whether the touch point position lies within the target area; and trigger determination of the coordinates of the selected pixel point when the judgment result indicates that the touch point position lies within the target area.
Based on the above image recognition method, data analysis and result statistics were performed on 100 8-hour blood glucose images and 100 24-hour blood glucose images. For the 8-hour images, 98 of the 100 could be effectively recognized (the trend of the blood glucose values recognized by this method is consistent with the trend of the values read by the scanner), with an error range of roughly ±0.4, which fits the practical application scenario. For the 24-hour images, all 100 could be effectively recognized (again, the trend of the recognized values is consistent with the scanner readings), with an error range of roughly -0.6 to 0.4, satisfying the need to back-fill missing blood glucose values. In addition, the method's error was measured with the quantitative indicators R-Square, mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE); the specific values are shown in Table 6. For the 8-hour images, the error distribution statistics are shown in Figure 6; the error values follow a normal distribution, with the detailed distribution shown in Figure 7, and are concentrated within ±0.4. For the 24-hour images, the error distribution statistics are shown in Figure 8; their error values also follow a normal distribution, with the detailed distribution shown in Figure 9.
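The four quantitative indicators above are standard regression error metrics. A minimal sketch of how they are computed; the scanner and recognized values below are hypothetical, not data from Table 6.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """R-Square, MSE, RMSE and MAE between reference and recognized values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    mse = np.mean(resid ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(resid))
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / ss_tot
    return {"R2": r2, "MSE": mse, "RMSE": rmse, "MAE": mae}

# hypothetical scanner readings vs. values recognized from the image
scanner = [5.0, 6.2, 7.1, 8.4, 6.9]
recognized = [5.1, 6.0, 7.3, 8.2, 7.0]
metrics = error_metrics(scanner, recognized)
```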
Table 6 Quantitative indicator results
Figure PCTCN2020100247-appb-000006
An embodiment of the present application further provides an image recognition device for implementing the method shown in Figure 2. As shown in Figure 10, the device includes:
a first acquisition module 10, configured to acquire a target image to be recognized;
a second acquisition module 12, configured to acquire a target area in the target image, where the image in the target area reflects specified type parameter information;
a first determining module 14, configured to determine the coordinates of a selected pixel point in the target area;
a second determining module 16, configured to determine the parameter value corresponding to the coordinates of the selected pixel point based on the association between the values of the specified type parameter and pixel point coordinates.
The functions implemented by the above modules likewise realize recognition of the parameter values represented by non-character information in an image, achieving the purpose of automatically recognizing pixel points in the image as the corresponding parameter values, and thereby solving the technical problem that current image recognition methods can only recognize numerical values in character format in an image and cannot automatically recognize a curve or discrete points as numerical values.
In some embodiments of the present application, as shown in Figure 11, the device further includes: a separation module 11, configured to separate a specified color channel from the target area, where the specified color channel is the one among the R, G, B color channels that matches the color channel corresponding to the standard color band of the target area; a processing module 13, configured to binarize the image of the specified color channel to obtain a binarized image; a selection module 15, configured to select, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image lies, and to segment the target area using the selected thresholds; a fitting module 17, configured to perform reference point pixel recognition on the binarized image obtained after segmentation, obtaining the pixel coordinates of at least two reference points of the standard color band in the image; and an establishment module 19, configured to determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, establish a linear relationship between the values of the specified type parameter and pixel coordinates based on this correspondence, and use the linear relationship as the association.
As shown in Figure 11, the first determining module 14 includes: a grayscale processing unit 140, configured to grayscale the image in the target area to obtain a grayscale image; a clustering unit 142, configured to cluster the pixel points of the grayscale image to obtain a plurality of clusters; and a selection unit 144, configured to select a specified cluster from the plurality of clusters and determine the coordinates of the selected pixel points from all pixel points in the specified cluster.
The selection unit 144 is further configured to select, from the plurality of clusters, the cluster with the fewest pixel points, and to determine the coordinates of the selected pixel points from that cluster.
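The cluster-and-select step can be sketched with a small 1-D k-means over grayscale values. This is an illustrative sketch under assumptions: centers are initialized from quantiles, and the smallest cluster is assumed to hold the sparse curve pixels; the toy image is hypothetical.

```python
import numpy as np

def smallest_cluster_coords(gray, k=2, iters=20):
    """Cluster grayscale pixel values with a tiny 1-D k-means and return
    the (row, col) coordinates of the cluster with the fewest pixels."""
    vals = gray.reshape(-1).astype(float)
    centers = np.quantile(vals, np.linspace(0, 1, k))  # spread initial centers
    for _ in range(iters):
        labels = np.argmin(np.abs(vals[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = vals[labels == j].mean()
    counts = np.bincount(labels, minlength=k)
    target = np.argmin(counts)          # fewest pixels -> candidate curve cluster
    rows, cols = np.divmod(np.flatnonzero(labels == target), gray.shape[1])
    return np.stack([rows, cols], axis=1)

# toy grayscale image: dark background with a few bright "curve" pixels
img = np.zeros((8, 8)); img[2, 1:5] = 200; img[0, 0] = 90
coords = smallest_cluster_coords(img, k=2)  # the four bright pixels on row 2
```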
Optionally, the image in the target area includes a curve image in a coordinate system, where the curve in the curve image reflects the values of the specified type parameter at different times.
In some embodiments of the present application, the first determining module 14 is further configured to determine the target recording time corresponding to the pixel point coordinates in the curve image; the second determining module 16 is further configured to determine, based on the association between the values of the specified type parameter and pixel point coordinates, the parameter value corresponding to the selected pixel point coordinates at the target recording time.
The above target recording time is determined as follows: recognize the character information in the curve image and extract the time information of the specified type parameter from it; divide the interval between any two adjacent recording moments in the time information into equal sub-intervals according to the number of pixel points, obtaining a plurality of time points; and determine, from the plurality of time points, the target recording time to which the coordinates of the selected pixel point belong.
Optionally, the first determining module further includes: a first recognition unit, configured to recognize character information in the curve image and extract the time information of the specified type parameter from the character information; a first division unit, configured to divide the interval between any two adjacent recording moments in the time information into equal sub-intervals according to the number of pixel points, obtaining a plurality of time points; and a first determining unit, configured to determine, from the plurality of time points, the target recording time to which the coordinates of the selected pixel point belong.
Optionally, as shown in Figure 11, the second acquisition module 12 includes: a segmentation unit 120, configured to perform semantic segmentation on the target image to obtain a mask image and a foreground image of the target image; and a second determining unit 122, configured to determine the target area from the foreground image. The segmentation unit 120 is configured to, when the target image is of a first type, perform semantic segmentation on the target image using an efficient neural network (ENET) model, where the model includes an initialization module and bottleneck modules, each bottleneck module comprising three convolutional layers: the first for dimensionality reduction; the second for dilated convolution, full convolution, and asymmetric convolution; and the third for dimensionality restoration. The segmentation unit 120 is further configured to, when the target image is of a second type, adjust a bilateral segmentation network (BiSeNet) model and perform semantic segmentation on the target image using the adjusted model, where adjusting the segmentation network model includes at least one of: reducing the number of spatial information processing layers in the model; reducing the number of feature maps output by each network layer; compressing the input image of the bilateral segmentation network model; and simplifying the receptive field processing layer.
The segmentation unit 120 is further configured to simplify the receptive field processing layer by replacing the residual neural network (ResNet) module in the receptive field processing layer with a channel-separated convolution (Xception39) module.
Optionally, as shown in Figure 11, the above device may further include: a third determining module 21, configured to determine the type of the target image, and, when the type is a preset type, to determine that semantic segmentation is to be performed on the target image. The third determining module 21 is further configured to determine the type of the target image as follows: divide the target image evenly into a preset number of non-overlapping blocks; determine the feature values of the preset number of non-overlapping blocks to obtain a preset number of feature values; combine the preset number of feature values into a feature vector; and input the feature vector into a support vector machine classifier for analysis to obtain the type of the target image.
Optionally, the target area contains a curve image of the change trend of blood glucose data.
Optionally, the device further includes: a display module, configured to display the parameter value corresponding to the coordinates of the selected pixel point.
Optionally, the first determining module is further configured to receive a user instruction directed at the target image, and to determine the coordinates of the selected pixel point according to the instruction.
Optionally, the instruction is determined based on one of the following: receiving position information of the user's touch point on the human-computer interaction interface where the target image is displayed; or receiving query information input by the user.
Optionally, the device further includes: a judgment module, configured to, when the instruction is receiving the position information of the user's touch point on the human-computer interaction interface where the target image is displayed, judge whether the touch point position lies within the target area before the coordinates of the selected pixel point are determined based on the touch point position; and a trigger module, configured to trigger determination of the coordinates of the selected pixel point when the judgment result indicates that the touch point position lies within the target area.
An embodiment of the present application further provides a data display method. As shown in Figure 12, the method includes:
Step S1202: display and acquire the target image to be recognized;
Step S1204: display the region of interest in the target image, where the image in the region of interest reflects how the specified type parameter changes over time;
Step S1206: display the coordinates of the selected pixel point in the region of interest, and the target recording time corresponding to those pixel coordinates;
Step S1208: display the parameter value corresponding to the coordinates of the selected pixel point at the target recording time, where the parameter value is determined based on the association between the values of the specified type parameter and pixel point coordinates.
It should be noted that the execution subject of steps S1202 to S1208 includes but is not limited to a mobile terminal.
In some embodiments of the present application, the above association is determined as follows: separate a specified color channel from the region of interest, where the specified color channel is the one among the R, G, B color channels that matches the color channel corresponding to the standard color band of the region of interest; binarize the image of the specified color channel to obtain a binarized image; select, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image lies, and segment the region of interest using the selected thresholds; perform reference point pixel recognition on the binarized image obtained after segmentation, obtaining the pixel coordinates of at least two reference points of the standard color band in the image; determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, establish a linear relationship between the values of the specified type parameter and pixel coordinates based on this correspondence, and use the linear relationship as the association.
It should be noted that, for preferred implementations of the embodiment shown in Figure 12, reference may be made to the related description of the embodiments shown in Figures 2 to 9, which will not be repeated here.
An embodiment of the present application further provides an image recognition method that can determine the selected pixel point based on a user's touch operation and thereby determine the parameter value corresponding to that pixel point. Specifically, as shown in Figure 13, the method includes:
Step S1302: detect the position of the user's touch point in the target image;
Step S1304: determine the coordinates of the selected pixel point based on the touch point position, and the target recording time corresponding to those pixel coordinates;
Step S1306: determine, based on the association between the values of the specified type parameter and pixel point coordinates, the parameter value corresponding to the coordinates of the selected pixel point at the target recording time;
Step S1308: output the parameter value. Outputting the parameter value includes, but is not limited to, displaying it to the user or sending it to an external device.
In some optional embodiments of the present application, before the coordinates of the selected pixel point are determined based on the touch point position, in order to prevent interference from invalid touch operations, it may also be judged whether the touch point position lies within the region of interest of the target image, where the image in the region of interest reflects how the specified type parameter changes over time; determination of the coordinates of the selected pixel point is triggered when the judgment result indicates that the touch point position lies within the region of interest.
In other embodiments of the present application, before the parameter value corresponding to the coordinates of the selected pixel point at the target recording time is determined based on the association between the values of the specified type parameter and pixel point coordinates, the following processing may also be performed: separate a specified color channel from the region of interest, where the specified color channel is the one among the R, G, B color channels that matches the color channel corresponding to the standard color band of the region of interest; binarize the image of the specified color channel to obtain a binarized image; select, from a preset threshold set, the threshold corresponding to the region in which each pixel of the binarized image lies, and segment the region of interest using the selected thresholds; perform reference point pixel recognition on the binarized image obtained after segmentation, obtaining the pixel coordinates of at least two reference points of the standard color band in the image; determine the correspondence between the actual values of the at least two reference points and their pixel coordinates, establish a linear relationship between the values of the specified type parameter and pixel coordinates based on this correspondence, and use the linear relationship as the association.
It should be noted that, for preferred implementations of the embodiment shown in Figure 13, reference may be made to the related description of the embodiments shown in Figures 2 to 9, which will not be repeated here.
An embodiment of the present application further provides an image recognition method that can determine the selected pixel point based on user input and thereby determine the parameter value corresponding to that pixel point. As shown in Figure 14, the method includes:
Step S1402: detect query information input by the user, where the query information may be input through a human-computer interaction interface that includes a text input box for entering query information;
Step S1404: determine, based on the query information, the coordinates of the selected pixel point in the target image and the target recording time corresponding to those pixel coordinates;
Step S1406: determine, based on the association between the values of the specified type parameter and pixel point coordinates, the parameter value corresponding to the coordinates of the selected pixel point at the target recording time;
Step S1408: output the parameter value.
在本申请的一些实施例中,在基于指定类型参数的取值与像素点坐标的关联关系确定被选定的像素点坐标在目标记录时间所对应的参数取值之前,还可以执行以下处理过程:从感兴趣区域中分离出指定颜色通道,其中,指定颜色通道为R、G、B颜色通道中与感兴趣区域的标准色带所对应颜色通道相同的颜色通道;对指定颜色通道的图像进行图像二值化处理,得到二值化图像;从预设阈值集合中选择二值化图像中每个像素点所在区域所对应的阈值,并利用选择的阈值对感兴趣区域进行图像分割;对分割后得到的二值化图像进行参考点像素识别,得到标准色带的至少两个参考点在图像中的像素点坐标;确定至少两个参考点的实际取值以及像素点坐标之间的对应关系,并基于对应关系建立指定类型参数的取值与像素点坐标的线性关系,并将线性关系作为关联关系。In some embodiments of the present application, before determining the value of the parameter corresponding to the selected pixel point coordinate at the target recording time based on the correlation between the value of the specified type parameter and the pixel point coordinate, the following process may also be performed : Separate the designated color channel from the region of interest, where the designated color channel is the same color channel in the R, G, and B color channels as the color channel corresponding to the standard color band of the region of interest; Image binarization processing to obtain a binarized image; select the threshold corresponding to the area of each pixel in the binarized image from the preset threshold set, and use the selected threshold to segment the region of interest; segmentation The binarized image obtained afterwards performs reference point pixel identification to obtain the pixel coordinates of at least two reference points of the standard color band in the image; determine the actual values of at least two reference points and the corresponding relationship between the pixel coordinates , And establish the linear relationship between the value of the specified type parameter and the pixel coordinate based on the corresponding relationship, and use the linear relationship as the association relationship.
需要说明的是，图14所示实施例的优选实施方式可以参见图2至9中所示实施例的相关描述，此处不再赘述。It should be noted that, for preferred implementations of the embodiment shown in FIG. 14, reference may be made to the related descriptions of the embodiments shown in FIGS. 2 to 9, which will not be repeated here.
本申请实施例还提供了一种非易失性存储介质,存储介质包括存储的程序,其中,在程序运行时控制存储介质所在设备执行以上的图像识别方法。The embodiment of the present application also provides a non-volatile storage medium, the storage medium includes a stored program, wherein the device where the storage medium is located is controlled to execute the above image recognition method when the program runs.
本申请实施例还提供了一种处理器,处理器设置为运行程序,其中,程序运行时执行以上的图像识别方法。The embodiment of the present application also provides a processor, which is configured to run a program, wherein the above image recognition method is executed when the program is running.
在本申请实施例中，采用依据目标图像中像素点坐标和指定类型参数的取值之间的关联关系确定被选定的像素点坐标所对应的参数取值的方式，由于采用目标图像中像素点坐标和指定类型参数的取值之间的关联关系识别图像中任意像素点坐标对应的参数取值，因此，实现了对图像中非字符信息所表示的参数取值的识别，达到了将图像中的像素点自动识别为相应的参数取值的目的，进而解决了当前的图像识别方式只能识别出图像中的字符格式的数值，不能将曲线或离散点自动识别为数值的技术问题。In the embodiments of the present application, the parameter value corresponding to the selected pixel coordinates is determined according to the association between pixel coordinates in the target image and the values of the specified type of parameter. Because this association is used to identify the parameter value corresponding to any pixel coordinates in the image, recognition of parameter values represented by non-character information in the image is realized, and the purpose of automatically recognizing pixels in the image as the corresponding parameter values is achieved, thereby solving the technical problem that current image recognition approaches can only recognize numerical values in character format in an image and cannot automatically recognize curves or discrete points as numerical values.
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。The serial numbers of the foregoing embodiments of the present application are only for description, and do not represent the advantages and disadvantages of the embodiments.
在本申请的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。In the above-mentioned embodiments of the present application, the description of each embodiment has its own focus. For parts that are not described in detail in an embodiment, reference may be made to related descriptions of other embodiments.
在本申请所提供的几个实施例中，应该理解到，所揭露的技术内容，可通过其它的方式实现。其中，以上所描述的装置实施例仅仅是示意性的，例如所述单元的划分，可以为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，单元或模块的间接耦合或通信连接，可以是电性或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units may be a division of logical functions, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units or modules, and may be in electrical or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。In addition, the functional units in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括：U盘、只读存储器(ROM，Read-Only Memory)、随机存取存储器(RAM，Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
以上所述仅是本申请的优选实施方式，应当指出，对于本技术领域的普通技术人员来说，在不脱离本申请原理的前提下，还可以做出若干改进和润饰，这些改进和润饰也应视为本申请的保护范围。The above are only preferred embodiments of the present application. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications shall also be regarded as falling within the protection scope of the present application.
工业实用性Industrial applicability
本申请实施例中提供的方案，可以应用于图像识别过程中，例如，可以应用于血糖数据的图像识别过程中。基于本申请实施例提供的方案，由于采用目标图像中像素点坐标和指定类型参数的取值之间的关联关系识别图像中任意像素点坐标对应的参数取值，因此，实现了对图像中非字符信息所表示的参数取值的识别，达到了将图像中的像素点自动识别为相应的参数取值的目的，进而解决了当前的图像识别方式只能识别出图像中的字符格式的数值，不能将曲线或离散点自动识别为数值的技术问题。The solutions provided in the embodiments of the present application can be applied to image recognition processes, for example, to the image recognition of blood glucose data. Based on the solutions provided in the embodiments of the present application, because the association between pixel coordinates in the target image and the values of the specified type of parameter is used to identify the parameter value corresponding to any pixel coordinates in the image, recognition of parameter values represented by non-character information in the image is realized, and the purpose of automatically recognizing pixels in the image as the corresponding parameter values is achieved, thereby solving the technical problem that current image recognition approaches can only recognize numerical values in character format in an image and cannot automatically recognize curves or discrete points as numerical values.

Claims (34)

  1. 一种图像识别方法,包括:An image recognition method, including:
    获取待识别的目标图像;Obtain the target image to be recognized;
    获取所述目标图像中的目标区域,其中,该目标区域中的图像用于反映指定类型参数信息;Acquiring a target area in the target image, where the image in the target area is used to reflect specified type parameter information;
    确定所述目标区域中被选定的像素点坐标;Determining the coordinates of selected pixels in the target area;
    基于指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标所对应的参数取值。The value of the parameter corresponding to the selected pixel point coordinate is determined based on the correlation between the value of the specified type parameter and the pixel point coordinate.
  2. 根据权利要求1所述的方法,其中,基于指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标所对应的参数取值之前,所述方法还包括The method according to claim 1, wherein before determining the value of the parameter corresponding to the selected pixel point coordinate based on the correlation between the value of the specified type parameter and the pixel point coordinate, the method further comprises
    从所述目标区域中分离出指定颜色通道,其中,所述指定颜色通道为R、G、B颜色通道中与所述目标区域的标准色带所对应颜色通道相同的颜色通道;Separating a designated color channel from the target area, where the designated color channel is the same color channel as the color channel corresponding to the standard color band of the target area among the R, G, and B color channels;
    对所述指定颜色通道的图像进行图像二值化处理,所述二值化处理为选择所述指定颜色通道的图像中大于预设阈值的像素点集合,得到二值化图像;Performing image binarization processing on the image of the designated color channel, where the binarization process is to select a set of pixel points greater than a preset threshold in the image of the designated color channel to obtain a binary image;
    对得到的二值化图像进行参考点像素识别,得到所述标准色带的至少两个参考点在图像中的像素点坐标高度;Performing reference point pixel identification on the obtained binarized image to obtain the pixel point coordinate height of at least two reference points of the standard color band in the image;
    确定所述至少两个参考点的实际取值以及所述像素点坐标之间的对应关系，并基于所述对应关系建立所述指定类型参数的取值与像素点坐标的线性关系，并将所述线性关系作为所述关联关系。Determining the correspondence between the actual values of the at least two reference points and the pixel coordinates, establishing a linear relationship between the values of the specified type of parameter and pixel coordinates based on the correspondence, and taking the linear relationship as the association relationship.
  3. 根据权利要求1所述的方法,其中,确定所述目标区域中被选定的像素点坐标,包括:The method according to claim 1, wherein determining the coordinates of the selected pixel in the target area comprises:
    对所述目标区域中的图像进行灰度化处理,得到灰度图像;Performing grayscale processing on the image in the target area to obtain a grayscale image;
    对所述灰度图像中的各个像素点进行聚类处理,得到多个簇;Performing clustering processing on each pixel in the grayscale image to obtain multiple clusters;
    从所述多个簇中选择指定簇,并从所述指定簇中的所有像素点中确定被选定的像素点坐标。A designated cluster is selected from the plurality of clusters, and the coordinates of the selected pixel point are determined from all the pixels in the designated cluster.
  4. 根据权利要求3所述的方法,其中,从所述多个簇中选择指定簇,并从所述指定簇中的所有像素点中确定被选定的像素点坐标具体为:The method according to claim 3, wherein selecting a designated cluster from the plurality of clusters, and determining the coordinates of the selected pixel point from all pixels in the designated cluster is specifically:
    从所述多个簇中选择像素点数量最少的簇,并从所述像素点数量最少的簇中确定被选定的像素点坐标。Select the cluster with the least number of pixels from the plurality of clusters, and determine the selected pixel coordinates from the cluster with the least number of pixels.
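The grayscale-clustering selection of claims 3 and 4 can be sketched as below. The 1-D k-means, its quantile-based initialization, and the function name are illustrative assumptions — the claims do not prescribe a particular clustering algorithm — but the selection rule matches claim 4: the curve or discrete points occupy far fewer pixels than the background, so the smallest cluster is taken.

```python
import numpy as np

def smallest_cluster_pixels(gray, k=2, iters=10):
    """Cluster the grayscale intensities with a minimal 1-D k-means and
    return the (row, col) coordinates of the pixels belonging to the
    smallest cluster."""
    vals = gray.reshape(-1).astype(float)
    # deterministic initialization: spread centers over the value range
    centers = np.quantile(vals, np.linspace(0.0, 1.0, k))
    labels = np.zeros(vals.shape[0], dtype=int)
    for _ in range(iters):
        labels = np.argmin(np.abs(vals[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = vals[labels == j].mean()
    counts = np.bincount(labels, minlength=k)
    # claim 4: keep the cluster with the fewest pixels
    mask = labels.reshape(gray.shape) == counts.argmin()
    ys, xs = np.nonzero(mask)
    return np.column_stack([ys, xs])
```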
  5. 根据权利要求1所述的方法,其中,The method of claim 1, wherein:
    所述目标区域中的图像包括：坐标系中的曲线图像或坐标系中离散点图像，所述曲线图像中的曲线或所述离散点图像中的离散点用于反映指定类型参数在不同时刻的取值；The image in the target area comprises: a curve image in a coordinate system or a discrete-point image in a coordinate system, where the curve in the curve image or the discrete points in the discrete-point image are used to reflect the values of the specified type of parameter at different moments;
    所述方法还包括:确定所述像素点坐标在所述曲线图像或所述离散点图像中对应的目标记录时间;The method further includes: determining the target recording time corresponding to the pixel point coordinate in the curve image or the discrete point image;
    所述基于指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标所对应的参数取值，包括：基于所述指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标在所述目标记录时间所对应的参数取值。The determining of the parameter value corresponding to the selected pixel coordinates based on the association between the values of the specified type of parameter and pixel coordinates includes: determining, based on the association between the values of the specified type of parameter and pixel coordinates, the parameter value corresponding to the selected pixel coordinates at the target recording time.
  6. 根据权利要求5所述的方法,其中,所述目标记录时间通过以下方式确定:The method according to claim 5, wherein the target recording time is determined in the following manner:
    识别所述曲线图像或所述离散点图像中的字符信息,从所述字符信息中提取所述指定类型参数的时间信息;Identifying character information in the curved image or the discrete point image, and extracting the time information of the specified type parameter from the character information;
    对所述时间信息中任意的相邻两个记录时刻之间的时长按照像素点数量进行等间隔划分,得到多个时间点;Divide the time length between any two adjacent recording moments in the time information at equal intervals according to the number of pixels to obtain multiple time points;
    从所述多个时间点中确定所述被选定的像素点坐标所属的目标记录时间。The target recording time to which the coordinates of the selected pixel point belongs is determined from the multiple time points.
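The equal-interval division of claim 6 amounts to linear interpolation between two recognized recording times along the pixel columns that separate them. The sketch below illustrates this; the function name and signature are illustrative assumptions.

```python
from datetime import datetime

def pixel_time(x, x_left, x_right, t_left, t_right):
    """Map a pixel column x to a time point, given that the recognized
    recording time t_left sits at column x_left and t_right at column
    x_right: the span between the two times is divided into equal
    intervals, one per pixel column."""
    frac = (x - x_left) / (x_right - x_left)
    return t_left + (t_right - t_left) * frac
```

For a selected pixel, the nearest such time point is then taken as the target recording time.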
  7. 根据权利要求1所述的方法,其中,所述获取所述目标图像中的目标区域,包括:The method according to claim 1, wherein said obtaining the target area in the target image comprises:
    对所述目标图像进行语义分割,得到所述目标图像的掩码图像和前景图像;Performing semantic segmentation on the target image to obtain a mask image and a foreground image of the target image;
    从所述前景图像中确定感兴趣区域,并将所述感兴趣区域作为所述目标区域。Determine a region of interest from the foreground image, and use the region of interest as the target region.
  8. 根据权利要求7所述的方法,其中,The method according to claim 7, wherein:
    在所述目标图像为第一类型的情况下，采用高效神经网络模型对所述目标图像进行语义分割，其中，所述高效神经网络模型包括：初始化模块和瓶颈模块，其中，每个瓶颈模块包括三个卷积层，其中，所述三个卷积层中的第一卷积层用于进行降维处理，第二卷积层用于进行空洞卷积、全卷积和非对称卷积，第三卷积层用于进行升维处理；When the target image is of the first type, performing semantic segmentation on the target image using an efficient neural network model, where the efficient neural network model includes an initialization module and bottleneck modules, each bottleneck module includes three convolutional layers, the first of the three convolutional layers is used for dimensionality reduction, the second convolutional layer is used for dilated convolution, full convolution and asymmetric convolution, and the third convolutional layer is used for dimensionality increase;
    在所述目标图像为第二类型的情况下,对双边分割网络模型进行调整,并利用调整后的分割网络模型对所述目标图像进行语义分割,其中,所述对双边分割网络模型进行调整包括:When the target image is of the second type, adjusting the bilateral segmentation network model, and using the adjusted segmentation network model to perform semantic segmentation on the target image, wherein the adjusting the bilateral segmentation network model includes :
    所述双边分割网络模型包括主干网络和辅助网络，所述主干网络由两层构成，每层主干网络分别包括卷积层、批归一化层和非线性激活函数，降低主干网络输出通道特征图数；所述辅助网络模型框架采用轻量级模型，降低主干网络输出通道特征图数，所述轻量级模型包括以下之一：Xception39、SqueezeNet、Xception、MobileNet、ShuffleNet；The bilateral segmentation network model includes a backbone network and an auxiliary network. The backbone network consists of two layers, each of which includes a convolutional layer, a batch normalization layer and a nonlinear activation function, reducing the number of feature maps of the backbone network output channels; the auxiliary network adopts a lightweight model framework, which also reduces the number of feature maps of the backbone network output channels, and the lightweight model includes one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet;
    其中,所述第一类型所对应第一数据集的图像数量小于第二类型所对应第二数据集中的图像数量。Wherein, the number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type.
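The asymmetric convolution named in claim 8 refers to factoring a 2-D kernel into a column kernel followed by a row kernel. The pure-NumPy sketch below is illustrative only (the claimed model uses learned kernels inside a neural network); it shows that applying a 5x1 kernel and then a 1x5 kernel reproduces the corresponding separable 5x5 convolution with 10 weights instead of 25.

```python
import numpy as np

def conv2d(img, k):
    """Minimal 'valid' 2-D correlation, used here only to demonstrate
    kernel separability."""
    kh, kw = k.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

# A separable 5x5 kernel factors into a 5x1 column kernel and a 1x5 row
# kernel (illustrative values); the full kernel is their outer product.
col = np.array([1.0, 4.0, 6.0, 4.0, 1.0]).reshape(5, 1)
row = np.array([1.0, 2.0, 1.0, 2.0, 1.0]).reshape(1, 5)
full = col @ row
```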
  9. 根据权利要求7所述的方法,其中,所述从所述前景图像中确定感兴趣区域,包括:The method according to claim 7, wherein said determining a region of interest from said foreground image comprises:
    确定所述前景图像中的特征区域,以及目标几何区域的角点坐标,其中,所述特征区域为所述前景图像中包含所述指定类型参数信息的区域;Determining the characteristic area in the foreground image and the corner point coordinates of the target geometric area, wherein the characteristic area is an area in the foreground image that contains the specified type parameter information;
    基于所述角点坐标计算投影变换矩阵;Calculating a projection transformation matrix based on the corner coordinates;
    对所述特征区域中的像素点进行投影变换,得到所述感兴趣区域。Performing projection transformation on the pixel points in the characteristic region to obtain the region of interest.
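The projection transformation of claim 9 can be sketched by solving for the 3x3 homography that maps the four corner coordinates of the target geometric area to the corners of the rectified region of interest. The direct linear solve below (with h33 fixed to 1) is an illustrative implementation, not one specified in the application.

```python
import numpy as np

def projection_matrix(src, dst):
    """Solve the 3x3 homography mapping four source corners to four
    destination corners (h33 fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    """Apply the homography to one pixel coordinate."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

Warping every pixel of the characteristic region with `warp_point` (or an equivalent image-warping routine) yields the region of interest.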
  10. 根据权利要求7所述的方法,其中,所述获取待识别的目标图像之前,所述方法还包括:The method according to claim 7, wherein, before said acquiring the target image to be recognized, the method further comprises:
    确定待识别的图像是否为目标图像;Determine whether the image to be recognized is the target image;
    在所述待识别的图像为目标图像时,确定对所述目标图像进行语义分割。When the image to be recognized is a target image, it is determined to perform semantic segmentation on the target image.
  11. 根据权利要求9所述的方法,其中,所述方法还包括:The method according to claim 9, wherein the method further comprises:
    将所述感兴趣区域均分成预设数量个不重叠滑块；Dividing the region of interest evenly into a preset number of non-overlapping blocks;
    确定所述预设数量个不重叠滑块的特征值，得到所述预设数量个特征值；将所述预设数量个特征值组合成特征向量；将所述特征向量输入至支持向量机分类器进行分析，得到所述感兴趣区域的类型。Determining the feature values of the preset number of non-overlapping blocks to obtain the preset number of feature values; combining the preset number of feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the region of interest.
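The block-wise feature extraction of claim 11 can be sketched as below. Using the mean intensity of each block as its feature value is an illustrative assumption (the claim leaves the feature unspecified), and the resulting vector would then be passed to the SVM classifier, which is not shown here.

```python
import numpy as np

def block_feature_vector(roi, grid=(4, 4)):
    """Split the region of interest into grid[0] x grid[1] non-overlapping
    blocks, take the mean intensity of each block as its feature value,
    and concatenate the values into one feature vector."""
    h, w = roi.shape
    gh, gw = grid
    bh, bw = h // gh, w // gw
    feats = [roi[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].mean()
             for i in range(gh) for j in range(gw)]
    return np.array(feats)
```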
  12. 根据权利要求1所述的方法，其中，所述指定类型参数信息中包含有用于反映血糖数据随时间变化的趋势的曲线信息，或用于反映血糖数据随时间变化的趋势的坐标系中离散点的取值信息。The method according to claim 1, wherein the specified type of parameter information includes curve information used to reflect the trend of blood glucose data over time, or value information of discrete points in a coordinate system used to reflect the trend of blood glucose data over time.
  13. 根据权利要求1所述的方法,其中,所述方法还包括:The method of claim 1, wherein the method further comprises:
    展示所述被选定的像素点坐标所对应的参数取值。Display the parameter values corresponding to the selected pixel coordinates.
  14. 根据权利要求1所述的方法,其中,确定所述目标区域中被选定的像素点坐标,包括:The method according to claim 1, wherein determining the coordinates of the selected pixel in the target area comprises:
    接收用户针对目标图像的指令;依据所述指令确定被选定的像素点坐标。Receive a user's instruction for the target image; determine the selected pixel coordinates according to the instruction.
  15. 根据权利要求14所述的方法,其中,所述指令基于以下之一信息确定:接收所述用户在所述目标图像所在人机交互界面的触摸点位置信息;或接收所述用户输入的查询信息。The method according to claim 14, wherein the instruction is determined based on one of the following information: receiving position information of the user's touch point on the human-computer interaction interface where the target image is located; or receiving query information input by the user .
  16. 根据权利要求15所述的方法,其中,当所述指令为接收所述用户在所述目标图像所在人机交互界面的触摸点位置信息时,所述方法还包括:The method according to claim 15, wherein when the instruction is to receive the position information of the user's touch point on the human-computer interaction interface where the target image is located, the method further comprises:
    基于所述触摸点位置确定被选定的像素点坐标之前,判断所述触摸点位置是否位于所述目标区域;Before determining the coordinates of the selected pixel point based on the touch point position, determining whether the touch point position is located in the target area;
    在判断结果指示所述触摸点位置位于所述目标区域时,触发确定所述被选定的像素点坐标。When the determination result indicates that the position of the touch point is located in the target area, triggering the determination of the selected pixel point coordinates.
  17. 一种图像识别装置,包括:An image recognition device includes:
    第一获取模块,设置为获取待识别的目标图像;The first obtaining module is configured to obtain the target image to be recognized;
    第二获取模块,设置为获取所述目标图像中的目标区域,其中,该目标区域中的图像用于反映指定类型参数信息;The second acquisition module is configured to acquire a target area in the target image, wherein the image in the target area is used to reflect specified type parameter information;
    第一确定模块,设置为确定所述目标区域中被选定的像素点坐标;The first determining module is configured to determine the coordinates of the selected pixel in the target area;
    第二确定模块,设置为基于所述指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标所对应的参数取值。The second determining module is configured to determine the value of the parameter corresponding to the coordinate of the selected pixel based on the correlation between the value of the specified type parameter and the coordinate of the pixel point.
  18. 根据权利要求17所述的装置,其中,所述装置还包括:The device according to claim 17, wherein the device further comprises:
    分离模块,设置为从所述目标区域中分离出指定颜色通道,其中,所述指定颜色通道为R、G、B颜色通道中与所述目标区域的标准色带所对应颜色通道相同的颜色通道;The separation module is configured to separate a designated color channel from the target area, wherein the designated color channel is the same color channel as the color channel corresponding to the standard color band of the target area among the R, G, and B color channels ;
    处理模块,设置为对所述指定颜色通道的图像进行图像二值化处理,所述二值化处理为选择所述指定颜色通道的图像中大于预设阈值的像素点集合,得到二值化图像;The processing module is configured to perform image binarization processing on the image of the specified color channel, and the binarization process is to select a set of pixel points greater than a preset threshold in the image of the specified color channel to obtain a binarized image ;
    拟合模块,设置为对得到的二值化图像进行参考点像素识别,得到所述标准色带的至少两个参考点在图像中的像素点坐标;A fitting module, configured to perform reference point pixel recognition on the obtained binarized image, and obtain the pixel point coordinates of at least two reference points of the standard color band in the image;
    建立模块，设置为确定所述至少两个参考点的实际取值以及所述像素点坐标之间的对应关系，并基于所述对应关系建立所述指定类型参数的取值与像素点坐标的线性关系，并将所述线性关系作为所述关联关系。An establishment module, configured to determine the correspondence between the actual values of the at least two reference points and the pixel coordinates, establish a linear relationship between the values of the specified type of parameter and pixel coordinates based on the correspondence, and take the linear relationship as the association relationship.
  19. 根据权利要求17所述的装置,其中,所述第一确定模块,包括:The apparatus according to claim 17, wherein the first determining module comprises:
    灰度处理单元,设置为对所述目标区域中的图像进行灰度化处理,得到灰度图像;A grayscale processing unit, configured to perform grayscale processing on the image in the target area to obtain a grayscale image;
    聚类单元,设置为对所述灰度图像中的各个像素点进行聚类处理,得到多个簇;The clustering unit is configured to perform clustering processing on each pixel in the grayscale image to obtain multiple clusters;
    选择单元,设置为从所述多个簇中选择指定簇,并从所述指定簇中的所有像素点中确定被选定的像素点坐标。The selection unit is configured to select a designated cluster from the plurality of clusters, and determine the coordinates of the selected pixel point from all the pixels in the designated cluster.
  20. 根据权利要求19所述的装置，其中，所述选择单元，设置为从所述多个簇中选择像素点数量最少的簇，并从所述像素点数量最少的簇中确定被选定的像素点坐标。The device according to claim 19, wherein the selection unit is configured to select the cluster with the smallest number of pixels from the plurality of clusters, and determine the selected pixel coordinates from the cluster with the smallest number of pixels.
  21. 根据权利要求17所述的装置,其中,The device of claim 17, wherein:
    所述目标区域中的图像包括：坐标系中的曲线图像或坐标系中离散点图像，所述曲线图像中的曲线或所述离散点图像中的离散点，用于反映指定类型参数在不同时刻的取值；The image in the target area comprises: a curve image in a coordinate system or a discrete-point image in a coordinate system, where the curve in the curve image or the discrete points in the discrete-point image are used to reflect the values of the specified type of parameter at different moments;
    所述第一确定模块,还设置为确定所述像素点坐标在所述曲线图像中对应的目标记录时间;The first determining module is further configured to determine the target recording time corresponding to the pixel point coordinates in the curve image;
    所述第二确定模块,设置为基于所述指定类型参数的取值与像素点坐标的关联关系确定所述被选定的像素点坐标在所述目标记录时间所对应的参数取值。The second determining module is configured to determine the value of the parameter corresponding to the selected pixel point coordinate at the target recording time based on the correlation between the value of the specified type parameter and the pixel point coordinate.
  22. 根据权利要求21所述的装置,其中,所述第一确定模块还包括:The apparatus according to claim 21, wherein the first determining module further comprises:
    第一识别单元,设置为识别所述曲线图像或所述离散点图像中的字符信息,从所述字符信息中提取所述指定类型参数的时间信息;The first recognition unit is configured to recognize character information in the curve image or the discrete point image, and extract the time information of the specified type parameter from the character information;
    第一划分单元,设置为对所述时间信息中任意的相邻两个记录时刻之间的时长按照像素点数量进行等间隔划分,得到多个时间点;The first dividing unit is configured to divide the time length between any two adjacent recording moments in the time information at equal intervals according to the number of pixels to obtain multiple time points;
    第一确定单元,设置为从所述多个时间点中确定所述被选定的像素点坐标所属的目标记录时间。The first determining unit is configured to determine the target recording time to which the coordinates of the selected pixel point belong from the multiple time points.
  23. 根据权利要求17所述的装置,其中,所述第二获取模块,包括:The apparatus according to claim 17, wherein the second acquisition module comprises:
    分割单元,设置为对所述目标图像进行语义分割,得到所述目标图像的掩码图像和前景图像;A segmentation unit, configured to perform semantic segmentation on the target image to obtain a mask image and a foreground image of the target image;
    第二确定单元,设置为从所述前景图像中确定感兴趣区域,并将所述感兴趣区域作为所述目标区域。The second determining unit is configured to determine a region of interest from the foreground image, and use the region of interest as the target region.
  24. 根据权利要求23所述的装置,其中,The device according to claim 23, wherein:
    所述分割单元，设置为在所述目标图像为第一类型的情况下，采用高效神经网络模型对所述目标图像进行语义分割，其中，所述高效神经网络模型包括：初始化模块和瓶颈模块，其中，每个瓶颈模块包括三个卷积层，其中，所述三个卷积层中的第一卷积层用于进行降维处理，第二卷积层用于进行空洞卷积、全卷积和非对称卷积，第三卷积层用于进行升维处理；The segmentation unit is configured to, when the target image is of the first type, perform semantic segmentation on the target image using an efficient neural network model, where the efficient neural network model includes an initialization module and bottleneck modules, each bottleneck module includes three convolutional layers, the first of the three convolutional layers is used for dimensionality reduction, the second convolutional layer is used for dilated convolution, full convolution and asymmetric convolution, and the third convolutional layer is used for dimensionality increase;
    所述分割单元，还设置为在所述目标图像为第二类型的情况下，对双边分割网络模型进行调整，并利用调整后的分割网络模型对所述目标图像进行语义分割，其中，对所述分割网络模型进行调整包括：所述双边分割网络模型包括主干网络和辅助网络，所述主干网络由两层构成，每层主干网络分别包括卷积层、批归一化层和非线性激活函数，降低主干网络输出通道特征图数；所述辅助网络模型框架采用轻量级模型，降低主干网络输出通道特征图数，所述轻量级模型包括以下之一：Xception39、SqueezeNet、Xception、MobileNet、ShuffleNet；其中，所述第一类型所对应第一数据集的图像数量小于第二类型所对应第二数据集中的图像数量。The segmentation unit is further configured to, when the target image is of the second type, adjust a bilateral segmentation network model and perform semantic segmentation on the target image using the adjusted segmentation network model, where adjusting the segmentation network model includes: the bilateral segmentation network model includes a backbone network and an auxiliary network, the backbone network consists of two layers, each of which includes a convolutional layer, a batch normalization layer and a nonlinear activation function, reducing the number of feature maps of the backbone network output channels; the auxiliary network adopts a lightweight model framework, which also reduces the number of feature maps of the backbone network output channels, and the lightweight model includes one of the following: Xception39, SqueezeNet, Xception, MobileNet, ShuffleNet; wherein the number of images in the first data set corresponding to the first type is smaller than the number of images in the second data set corresponding to the second type.
  25. 根据权利要求23所述的装置，其中，所述第二确定单元，还用于确定所述前景图像中的特征区域，以及目标几何区域的角点坐标，其中，所述特征区域为所述前景图像中包含所述指定类型参数信息的区域；基于所述角点坐标计算投影变换矩阵；对所述特征区域中的像素点进行投影变换，得到所述感兴趣区域。The device according to claim 23, wherein the second determining unit is further configured to determine the characteristic area in the foreground image and the corner coordinates of a target geometric area, where the characteristic area is the area in the foreground image that contains the specified type of parameter information; calculate a projection transformation matrix based on the corner coordinates; and perform projection transformation on the pixels in the characteristic area to obtain the region of interest.
  26. 根据权利要求23所述的装置,其中,所述装置还包括:The device according to claim 23, wherein the device further comprises:
    第三确定模块,设置为确定待识别的图像是否为目标图像;以及在所述待识别的图像为目标图像时,确定对所述目标图像进行语义分割。The third determining module is configured to determine whether the image to be recognized is a target image; and when the image to be recognized is a target image, determine to perform semantic segmentation on the target image.
  27. 根据权利要求26所述的装置，其中，所述第三确定模块，还设置为通过以下方式确定感兴趣区域的类型：将所述感兴趣区域均分成预设数量个不重叠滑块；确定所述预设数量个不重叠滑块的特征值，得到所述预设数量个特征值；将所述预设数量个特征值组合成特征向量；将所述特征向量输入至支持向量机分类器进行分析，得到所述感兴趣区域的类型。The device according to claim 26, wherein the third determining module is further configured to determine the type of the region of interest in the following manner: dividing the region of interest evenly into a preset number of non-overlapping blocks; determining the feature values of the preset number of non-overlapping blocks to obtain the preset number of feature values; combining the preset number of feature values into a feature vector; and inputting the feature vector into a support vector machine classifier for analysis to obtain the type of the region of interest.
  28. 根据权利要求17至27中任意一项所述的装置，其中，所述指定类型参数信息中包含有用于反映血糖数据随时间变化的趋势的曲线信息，或用于反映血糖数据随时间变化的趋势的坐标系中离散点的取值信息。The device according to any one of claims 17 to 27, wherein the specified type of parameter information includes curve information used to reflect the trend of blood glucose data over time, or value information of discrete points in a coordinate system used to reflect the trend of blood glucose data over time.
  29. 根据权利要求17所述的装置,其中,所述装置还包括:The device according to claim 17, wherein the device further comprises:
    展示模块,用于展示所述被选定的像素点坐标所对应的参数取值。The display module is used to display the parameter values corresponding to the selected pixel coordinates.
  30. 根据权利要求17所述的装置,其中,所述第一确定模块,还用于接收用户针对目标图像的指令;依据所述指令确定被选定的像素点坐标。The device according to claim 17, wherein the first determining module is further configured to receive a user's instruction for the target image; determine the coordinates of the selected pixel according to the instruction.
  31. 根据权利要求30所述的装置,其中,所述指令基于以下之一信息确定:接收所述用户在所述目标图像所在人机交互界面的触摸点位置信息;或接收所述用户输入的查询信息。The device according to claim 30, wherein the instruction is determined based on one of the following information: receiving position information of the user's touch point on the human-computer interaction interface where the target image is located; or receiving query information input by the user .
  32. 根据权利要求31所述的装置,其中,所述装置还包括:The device according to claim 31, wherein the device further comprises:
    判断模块，用于在所述指令为接收所述用户在所述目标图像所在人机交互界面的触摸点位置信息时，基于所述触摸点位置确定被选定的像素点坐标之前，判断所述触摸点位置是否位于所述目标区域；A judgment module, configured to, when the instruction is to receive the position information of the user's touch point on the human-computer interaction interface where the target image is located, judge whether the touch point position is located in the target area before the selected pixel coordinates are determined based on the touch point position;
    触发模块,用于在判断结果指示所述触摸点位置位于所述目标区域时,触发确定所述被选定的像素点坐标。The triggering module is used for triggering the determination of the coordinates of the selected pixel when the judgment result indicates that the position of the touch point is located in the target area.
  33. 一种非易失性存储介质，其中，所述非易失性存储介质包括存储的程序，其中，在所述程序运行时控制所述非易失性存储介质所在设备执行权利要求1至16中任意一项所述的图像识别方法。A non-volatile storage medium, wherein the non-volatile storage medium includes a stored program, and when the program runs, a device where the non-volatile storage medium is located is controlled to execute the image recognition method according to any one of claims 1 to 16.
  34. 一种处理器,其中,所述处理器设置为运行程序,其中,所述程序运行时执行权利要求1至16中任意一项所述的图像识别方法。A processor, wherein the processor is configured to run a program, wherein the image recognition method according to any one of claims 1 to 16 is executed when the program is running.
PCT/CN2020/100247 2019-07-05 2020-07-03 Image recognition method and apparatus, storage medium, and processor WO2021004402A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910606437.2A CN111881913A (en) 2019-07-05 2019-07-05 Image recognition method and device, storage medium and processor
CN201910606437.2 2019-07-05

Publications (1)

Publication Number Publication Date
WO2021004402A1 true WO2021004402A1 (en) 2021-01-14

Family

ID=73153889

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/100247 WO2021004402A1 (en) 2019-07-05 2020-07-03 Image recognition method and apparatus, storage medium, and processor

Country Status (2)

Country Link
CN (1) CN111881913A (en)
WO (1) WO2021004402A1 (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767359A (en) * 2021-01-21 2021-05-07 中南大学 Steel plate corner detection method and system under complex background
CN112861885A (en) * 2021-03-25 2021-05-28 北京百度网讯科技有限公司 Image recognition method and device, electronic equipment and storage medium
CN113077472A (en) * 2021-04-07 2021-07-06 华南理工大学 Paper electrocardiogram curve image segmentation method, system, device and medium
CN113221985A (en) * 2021-04-29 2021-08-06 大连海事大学 Method for extracting basic features based on fusion network of pyramid model
CN113239832A (en) * 2021-05-20 2021-08-10 河南中全科技有限公司 Hidden danger intelligent identification method and system based on image identification
CN113436171A (en) * 2021-06-28 2021-09-24 博奥生物集团有限公司 Processing method and device for canned image
CN113486898A (en) * 2021-07-08 2021-10-08 西安电子科技大学 Radar signal RD image interference identification method and system based on improved ShuffleNet
CN113533551A (en) * 2021-06-08 2021-10-22 广西科技大学 GC-IMS-based fragrant rice shared flavor fingerprint spectrum extraction method
CN113554008A (en) * 2021-09-18 2021-10-26 深圳市安软慧视科技有限公司 Method and device for detecting static object in area, electronic equipment and storage medium
CN113592889A (en) * 2021-07-22 2021-11-02 武汉工程大学 Method and system for detecting included angle of cotter pin and electronic equipment
CN113642609A (en) * 2021-07-15 2021-11-12 东华大学 Characterization method of dispersed phase morphology in polymer blend based on image recognition technology
CN113658132A (en) * 2021-08-16 2021-11-16 沭阳九鼎钢铁有限公司 Computer vision-based structural part weld joint detection method
CN113900418A (en) * 2021-09-30 2022-01-07 广西埃索凯循环科技有限公司 Intelligent production system of high-purity zinc sulfate monohydrate
CN114119976A (en) * 2021-11-30 2022-03-01 广州文远知行科技有限公司 Semantic segmentation model training method, semantic segmentation model training device, semantic segmentation method, semantic segmentation device and related equipment
CN114219813A (en) * 2021-12-16 2022-03-22 数坤(北京)网络科技股份有限公司 Image processing method, intelligent terminal and storage medium
CN114241407A (en) * 2021-12-10 2022-03-25 电子科技大学 Close-range screen monitoring method based on deep learning
CN114445483A (en) * 2022-01-28 2022-05-06 泗阳三江橡塑有限公司 Injection molding part quality analysis method based on image pyramid
CN114662594A (en) * 2022-03-25 2022-06-24 浙江省通信产业服务有限公司 Target feature recognition analysis system
CN114692202A (en) * 2022-03-31 2022-07-01 马上消费金融股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN114694186A (en) * 2022-06-01 2022-07-01 南京优牧大数据服务有限公司 Method and device for processing cattle face identification data
CN115063578A (en) * 2022-08-18 2022-09-16 杭州长川科技股份有限公司 Method and device for detecting and positioning target object in chip image and storage medium
CN115272298A (en) * 2022-09-19 2022-11-01 江苏网进科技股份有限公司 Urban road maintenance and supervision method and system based on road monitoring
CN115588099A (en) * 2022-11-02 2023-01-10 北京鹰之眼智能健康科技有限公司 Region-of-interest display method, electronic device and storage medium
WO2023098487A1 (en) * 2021-11-30 2023-06-08 西门子股份公司 Target detection method and apparatus, electronic device, and computer storage medium
CN116385706A (en) * 2023-06-06 2023-07-04 山东外事职业大学 Signal detection method and system based on image recognition technology
CN116611503A (en) * 2023-07-21 2023-08-18 浙江双元科技股份有限公司 Lightweight model construction method and device for multi-category flaw real-time detection
CN116612287A (en) * 2023-07-17 2023-08-18 腾讯科技(深圳)有限公司 Image recognition method, device, computer equipment and storage medium
CN116939906A (en) * 2023-07-26 2023-10-24 嘉兴市成泰镜业有限公司 Artificial intelligence-based LED mixed-color lamplight color calibration and adjustment method
CN117079147A (en) * 2023-10-17 2023-11-17 深圳市城市交通规划设计研究中心股份有限公司 Road interior disease identification method, electronic equipment and storage medium
CN117079218A (en) * 2023-09-20 2023-11-17 山东省地质矿产勘查开发局第一地质大队(山东省第一地质矿产勘查院) Dynamic monitoring method for rope position of passenger ropeway rope based on video monitoring
CN117437608A (en) * 2023-11-16 2024-01-23 元橡科技(北京)有限公司 All-terrain pavement type identification method and system
CN113486898B (en) * 2021-07-08 2024-05-31 西安电子科技大学 Radar signal RD image interference identification method and system based on improvement ShuffleNet

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528782B (en) * 2020-11-30 2024-02-23 北京农业信息技术研究中心 Underwater fish target detection method and device
CN114638774B (en) * 2020-12-01 2024-02-02 珠海碳云智能科技有限公司 Image data processing method and device and nonvolatile storage medium
CN112785640B (en) * 2020-12-28 2022-08-09 宁波江丰生物信息技术有限公司 Method and system for detecting position of internal slice of scanner
CN113065501B (en) * 2021-04-15 2024-03-22 黑龙江惠达科技股份有限公司 Seedling line identification method and device and agricultural machinery
CN113096119A (en) * 2021-04-30 2021-07-09 上海众壹云计算科技有限公司 Method and device for classifying wafer defects, electronic equipment and storage medium
CN113222963B (en) * 2021-05-27 2024-03-26 大连海事大学 Non-orthographic infrared monitoring sea surface oil spill area estimation method and system
CN113486892B (en) * 2021-07-02 2023-11-28 东北大学 Production information acquisition method and system based on smart phone image recognition
CN113379006B (en) * 2021-08-16 2021-11-02 北京国电通网络技术有限公司 Image recognition method and device, electronic equipment and computer readable medium
CN115578564B (en) * 2022-10-25 2023-05-23 北京医准智能科技有限公司 Training method and device for instance segmentation model, electronic equipment and storage medium
CN116664529A (en) * 2023-06-05 2023-08-29 青岛信驰电子科技有限公司 Electronic element flat cable calibration method based on feature recognition
CN116820561B (en) * 2023-08-29 2023-10-31 成都丰硕智能数字科技有限公司 Method for automatically generating interface codes based on interface design diagram

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5054091A (en) * 1989-04-18 1991-10-01 Sharp Kabushiki Kaisha Method for determining coordinates of circumscribed rectangular frame of each character for use in optical character reader
US5898795A (en) * 1995-12-08 1999-04-27 Ricoh Company, Ltd. Character recognition method using a method for deleting ruled lines
CN105938555A (en) * 2016-04-12 2016-09-14 常州市武进区半导体照明应用技术研究院 Extraction method for picture curve data
CN106228159A (en) * 2016-07-29 2016-12-14 深圳友讯达科技股份有限公司 A kind of gauge table meter copying device based on image recognition and method thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766898A (en) * 2018-12-26 2019-05-17 平安科技(深圳)有限公司 Image character recognition method, device, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DONG, YAN ET AL.: "A New Method of Data Extracting of Image Curve Based on MATLAB", DEVELOPMENT & INNOVATION OF MACHINERY & ELECTRICAL PRODUCTS, vol. 27, no. 2, 31 March 2014 (2014-03-31), DOI: 20200925155307X *
TAN, YANZHENG ET AL.: "Method of data extraction of complex curve image", ELECTRONIC MEASUREMENT TECHNOLOGY, no. 12, 31 December 2016 (2016-12-31), DOI: 20200925154910X *
XU, BOHONG: "The Research on the Vector about the Scanned Curve Drawing", INFORMATION & TECHNOLOGY, CHINA MASTER'S THESES FULL-TEXT DATABASE, no. 2, 15 February 2009 (2009-02-15), ISSN: 1674-024, DOI: 20200910121454X *

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767359A (en) * 2021-01-21 2021-05-07 中南大学 Steel plate corner detection method and system under complex background
CN112767359B (en) * 2021-01-21 2023-10-24 中南大学 Method and system for detecting corner points of steel plate under complex background
CN112861885A (en) * 2021-03-25 2021-05-28 北京百度网讯科技有限公司 Image recognition method and device, electronic equipment and storage medium
CN112861885B (en) * 2021-03-25 2023-09-22 北京百度网讯科技有限公司 Image recognition method, device, electronic equipment and storage medium
CN113077472A (en) * 2021-04-07 2021-07-06 华南理工大学 Paper electrocardiogram curve image segmentation method, system, device and medium
CN113221985A (en) * 2021-04-29 2021-08-06 大连海事大学 Method for extracting basic features based on fusion network of pyramid model
CN113221985B (en) * 2021-04-29 2024-04-05 大连海事大学 Method for extracting image basic features based on pyramid model fusion network
CN113239832A (en) * 2021-05-20 2021-08-10 河南中全科技有限公司 Hidden danger intelligent identification method and system based on image identification
CN113533551B (en) * 2021-06-08 2023-10-03 广西科技大学 GC-IMS-based extraction method of fragrant rice sharing flavor fingerprint spectrum
CN113533551A (en) * 2021-06-08 2021-10-22 广西科技大学 GC-IMS-based fragrant rice shared flavor fingerprint spectrum extraction method
CN113436171B (en) * 2021-06-28 2024-02-09 博奥生物集团有限公司 Processing method and device for can printing image
CN113436171A (en) * 2021-06-28 2021-09-24 博奥生物集团有限公司 Processing method and device for canned image
CN113486898B (en) * 2021-07-08 2024-05-31 西安电子科技大学 Radar signal RD image interference identification method and system based on improvement ShuffleNet
CN113486898A (en) * 2021-07-08 2021-10-08 西安电子科技大学 Radar signal RD image interference identification method and system based on improved ShuffleNet
CN113642609A (en) * 2021-07-15 2021-11-12 东华大学 Characterization method of dispersed phase morphology in polymer blend based on image recognition technology
CN113642609B (en) * 2021-07-15 2024-03-26 东华大学 Characterization method of dispersed phase morphology in polymer blend based on image recognition technology
CN113592889A (en) * 2021-07-22 2021-11-02 武汉工程大学 Method and system for detecting included angle of cotter pin and electronic equipment
CN113592889B (en) * 2021-07-22 2024-04-12 武汉工程大学 Method, system and electronic equipment for detecting included angle of cotter pin
CN113658132B (en) * 2021-08-16 2022-08-19 沭阳九鼎钢铁有限公司 Computer vision-based structural part weld joint detection method
CN113658132A (en) * 2021-08-16 2021-11-16 沭阳九鼎钢铁有限公司 Computer vision-based structural part weld joint detection method
CN113554008B (en) * 2021-09-18 2021-12-31 深圳市安软慧视科技有限公司 Method and device for detecting static object in area, electronic equipment and storage medium
CN113554008A (en) * 2021-09-18 2021-10-26 深圳市安软慧视科技有限公司 Method and device for detecting static object in area, electronic equipment and storage medium
CN113900418A (en) * 2021-09-30 2022-01-07 广西埃索凯循环科技有限公司 Intelligent production system of high-purity zinc sulfate monohydrate
CN113900418B (en) * 2021-09-30 2024-05-03 广西埃索凯循环科技有限公司 Intelligent production system of high-purity zinc sulfate monohydrate
WO2023098487A1 (en) * 2021-11-30 2023-06-08 西门子股份公司 Target detection method and apparatus, electronic device, and computer storage medium
CN114119976A (en) * 2021-11-30 2022-03-01 广州文远知行科技有限公司 Semantic segmentation model training method, semantic segmentation model training device, semantic segmentation method, semantic segmentation device and related equipment
CN114119976B (en) * 2021-11-30 2024-05-14 广州文远知行科技有限公司 Semantic segmentation model training method, semantic segmentation device and related equipment
CN114241407A (en) * 2021-12-10 2022-03-25 电子科技大学 Close-range screen monitoring method based on deep learning
CN114219813A (en) * 2021-12-16 2022-03-22 数坤(北京)网络科技股份有限公司 Image processing method, intelligent terminal and storage medium
CN114445483B (en) * 2022-01-28 2023-03-24 泗阳三江橡塑有限公司 Injection molding part quality analysis method based on image pyramid
CN114445483A (en) * 2022-01-28 2022-05-06 泗阳三江橡塑有限公司 Injection molding part quality analysis method based on image pyramid
CN114662594B (en) * 2022-03-25 2022-10-04 浙江省通信产业服务有限公司 Target feature recognition analysis system
CN114662594A (en) * 2022-03-25 2022-06-24 浙江省通信产业服务有限公司 Target feature recognition analysis system
CN114692202A (en) * 2022-03-31 2022-07-01 马上消费金融股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN114694186B (en) * 2022-06-01 2022-08-26 南京优牧大数据服务有限公司 Method and device for processing cattle face identification data
CN114694186A (en) * 2022-06-01 2022-07-01 南京优牧大数据服务有限公司 Method and device for processing cattle face identification data
CN115063578A (en) * 2022-08-18 2022-09-16 杭州长川科技股份有限公司 Method and device for detecting and positioning target object in chip image and storage medium
CN115272298A (en) * 2022-09-19 2022-11-01 江苏网进科技股份有限公司 Urban road maintenance and supervision method and system based on road monitoring
CN115588099A (en) * 2022-11-02 2023-01-10 北京鹰之眼智能健康科技有限公司 Region-of-interest display method, electronic device and storage medium
CN115588099B (en) * 2022-11-02 2023-05-30 北京鹰之眼智能健康科技有限公司 Region of interest display method, electronic device and storage medium
CN116385706A (en) * 2023-06-06 2023-07-04 山东外事职业大学 Signal detection method and system based on image recognition technology
CN116385706B (en) * 2023-06-06 2023-08-25 山东外事职业大学 Signal detection method and system based on image recognition technology
CN116612287B (en) * 2023-07-17 2023-09-22 腾讯科技(深圳)有限公司 Image recognition method, device, computer equipment and storage medium
CN116612287A (en) * 2023-07-17 2023-08-18 腾讯科技(深圳)有限公司 Image recognition method, device, computer equipment and storage medium
CN116611503A (en) * 2023-07-21 2023-08-18 浙江双元科技股份有限公司 Lightweight model construction method and device for multi-category flaw real-time detection
CN116611503B (en) * 2023-07-21 2023-09-22 浙江双元科技股份有限公司 Lightweight model construction method and device for multi-category flaw real-time detection
CN116939906A (en) * 2023-07-26 2023-10-24 嘉兴市成泰镜业有限公司 Artificial intelligence-based LED mixed-color lamplight color calibration and adjustment method
CN116939906B (en) * 2023-07-26 2024-04-19 嘉兴市成泰镜业有限公司 Artificial intelligence-based LED mixed-color lamplight color calibration and adjustment method
CN117079218A (en) * 2023-09-20 2023-11-17 山东省地质矿产勘查开发局第一地质大队(山东省第一地质矿产勘查院) Dynamic monitoring method for rope position of passenger ropeway rope based on video monitoring
CN117079218B (en) * 2023-09-20 2024-03-08 山东省地质矿产勘查开发局第一地质大队(山东省第一地质矿产勘查院) Dynamic monitoring method for rope position of passenger ropeway rope based on video monitoring
CN117079147B (en) * 2023-10-17 2024-02-27 深圳市城市交通规划设计研究中心股份有限公司 Road interior disease identification method, electronic equipment and storage medium
CN117079147A (en) * 2023-10-17 2023-11-17 深圳市城市交通规划设计研究中心股份有限公司 Road interior disease identification method, electronic equipment and storage medium
CN117437608A (en) * 2023-11-16 2024-01-23 元橡科技(北京)有限公司 All-terrain pavement type identification method and system

Also Published As

Publication number Publication date
CN111881913A (en) 2020-11-03

Similar Documents

Publication Publication Date Title
WO2021004402A1 (en) Image recognition method and apparatus, storage medium, and processor
CN109359575B (en) Face detection method, service processing method, device, terminal and medium
US11681418B2 (en) Multi-sample whole slide image processing in digital pathology via multi-resolution registration and machine learning
WO2020199931A1 (en) Face key point detection method and apparatus, and storage medium and electronic device
CN109151501B (en) Video key frame extraction method and device, terminal equipment and storage medium
CN108288075B (en) A kind of lightweight small target detecting method improving SSD
JP7413400B2 (en) Skin quality measurement method, skin quality classification method, skin quality measurement device, electronic equipment and storage medium
CN108717524B (en) Gesture recognition system based on double-camera mobile phone and artificial intelligence system
CN112381775B (en) Image tampering detection method, terminal device and storage medium
TWI395145B (en) Hand gesture recognition system and method
Liu et al. Real-time robust vision-based hand gesture recognition using stereo images
CN110728255B (en) Image processing method, image processing device, electronic equipment and storage medium
WO2019114036A1 (en) Face detection method and device, computer device, and computer readable storage medium
US11886492B2 (en) Method of matching image and apparatus thereof, device, medium and program product
CN112052186B (en) Target detection method, device, equipment and storage medium
WO2020206850A1 (en) Image annotation method and device employing high-dimensional image
WO2022041830A1 (en) Pedestrian re-identification method and device
WO2019080203A1 (en) Gesture recognition method and system for robot, and robot
WO2021164550A1 (en) Image classification method and apparatus
WO2021103868A1 (en) Method for structuring pedestrian information, device, apparatus and storage medium
CN109033935B (en) Head-up line detection method and device
CN109190456B (en) Multi-feature fusion overlook pedestrian detection method based on aggregated channel features and gray level co-occurrence matrix
CN112036284A (en) Image processing method, device, equipment and storage medium
WO2023035558A1 (en) Anchor point cut-based image processing method and apparatus, device, and medium
CN115082776A (en) Electric energy meter automatic detection system and method based on image recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20836201

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.06.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20836201

Country of ref document: EP

Kind code of ref document: A1