WO2022247005A1 - Method and apparatus for identifying target object in image, electronic device and storage medium


Info

Publication number
WO2022247005A1
WO2022247005A1 · PCT/CN2021/109479 · CN2021109479W
Authority
WO
WIPO (PCT)
Prior art keywords
image
training set
texture
recognition model
sample image
Prior art date
Application number
PCT/CN2021/109479
Other languages
French (fr)
Chinese (zh)
Inventor
王瑞
李君�
陈凌智
薛淑月
吕传峰
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2022247005A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/40 Analysis of texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to a method, device, electronic equipment, and computer-readable storage medium for object recognition in an image.
  • Because a new image generated by image expansion is derived from the features of the original image, the basic features of the image remain unchanged; this can easily cause the trained model to overfit, so that the accuracy of target recognition is not high.
  • A method for recognizing a target object in an image provided by the present application includes:
  • acquiring a sample image containing a target object, performing image amplification on the sample image, and obtaining a sample image set;
  • using noise images of the same type as the sample image, together with the sample image set, to construct a first training set and a test set, and using the first training set to train a pre-built original recognition model to obtain an initial recognition model;
  • using the test set to test the initial recognition model, selecting a preset type of error result from the test results to construct a second training set, and using the sample images to construct a third training set;
  • extracting first feature vectors of the images in the second training set, and extracting second feature vectors of the images in the third training set;
  • calculating a loss value between the first feature vector and the second feature vector, and updating the parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
  • acquiring the image to be recognized, and using the standard recognition model to perform target recognition on the image to be recognized to obtain the target recognition result in the image.
  • the present application also provides a device for identifying objects in an image, the device comprising:
  • An image amplification module configured to obtain a sample image containing a target object, perform image amplification on the sample image, and obtain a sample image set;
  • the first training module is configured to use noise images of the same type as the sample image and the sample image set to construct a first training set and a test set, and use the first training set to perform pre-built original recognition models. Training to get the initial recognition model;
  • a model testing module configured to use the test set to test the initial recognition model, select a preset type of error result in the test result to construct a second training set, and use the sample image to construct a third training set;
  • a feature extraction module configured to extract a first feature vector of images in the second training set, and extract a second feature vector of images in the third training set;
  • a second training module configured to calculate a loss value between the first feature vector and the second feature vector, and update parameters of the initial recognition model according to the loss value to obtain a standard recognition model
  • the image recognition module is used to obtain the image to be recognized, and use the standard recognition model to perform target recognition on the image to be recognized to obtain the target recognition result in the image.
  • the present application also provides an electronic device, the electronic device comprising:
  • a memory storing at least one instruction
  • a processor executing instructions stored in the memory to implement the following steps:
  • using the test set to test the initial recognition model, selecting a preset type of error result from the test results to construct a second training set, and using the sample images to construct a third training set;
  • acquiring the image to be recognized, and using the standard recognition model to perform target recognition on the image to be recognized to obtain the target recognition result in the image.
  • The present application also provides a computer-readable storage medium storing at least one instruction which, when executed by a processor of an electronic device, implements the steps of the above method for recognizing a target object in an image.
  • FIG. 1 is a schematic flow diagram of a method for recognizing objects in an image provided by an embodiment of the present application
  • FIG. 2 is a schematic flow diagram of generating a first feature vector provided by an embodiment of the present application
  • Fig. 3 is a functional module diagram of an object recognition device in an image provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an electronic device for implementing the method for recognizing an object in an image provided by an embodiment of the present application.
  • Artificial intelligence (AI) uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • An embodiment of the present application provides a method for recognizing an object in an image.
  • the executor of the method for object recognition in an image includes, but is not limited to, at least one of electronic devices such as a server and a terminal that can be configured to execute the method provided by the embodiment of the present application.
  • the method for identifying an object in an image can be executed by software or hardware installed on a terminal device or a server device, and the software can be a blockchain platform.
  • the server includes, but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.
  • The server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), big data, and artificial intelligence platforms.
  • FIG. 1 is a schematic flowchart of a method for recognizing a target object in an image provided by an embodiment of the present application.
  • the target recognition method in the image includes:
  • the sample image contains a specific target object and the real label corresponding to the target object.
  • For example, when the target object is an apple, the sample image is an image containing an apple and the real label is "apple"; or, when the target object is a lesion of a certain disease, the sample image is an image containing the lesion and the real label is the name of the disease corresponding to the lesion.
  • Specifically, the pre-stored sample images can be fetched from a pre-built blockchain node through a Java statement with a data-fetching function; the high data throughput of the blockchain can be used to improve the efficiency of obtaining the sample images.
  • the image amplification of the sample image can be realized by performing geometric transformation, color change, contrast adjustment, and partial occlusion on the sample image.
  • The sample image set can be generated by stretching the sample image by different amounts in the horizontal direction, the vertical direction, or both.
  • By changing the color of the sample image, a sample image set including sample images of multiple different colors is obtained.
  • By partially covering the sample image, multiple sample images in which different parts are covered are obtained. For example, covering the upper part of the target object yields a sample image in which only the lower part of the target object is visible, and covering the right half yields a sample image in which only the left half is visible; the sample images covering different areas are collected as a sample image set.
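As an illustrative sketch of the amplification operations described above (stretching, color change, and partial occlusion), assuming grayscale NumPy arrays as images; all function names and parameter values below are hypothetical, not from the patent:

```python
import numpy as np

def stretch(img, fx, fy):
    """Nearest-neighbour stretch by factors fx (width) and fy (height)."""
    h, w = img.shape[:2]
    rows = (np.arange(int(h * fy)) / fy).astype(int).clip(0, h - 1)
    cols = (np.arange(int(w * fx)) / fx).astype(int).clip(0, w - 1)
    return img[rows][:, cols]

def shift_color(img, delta):
    """Uniform brightness/colour shift, clipped to the valid pixel range."""
    return np.clip(img.astype(int) + delta, 0, 255).astype(np.uint8)

def occlude(img, top, bottom, left, right):
    """Cover a rectangular region, e.g. the upper half of the target object."""
    out = img.copy()
    out[top:bottom, left:right] = 0
    return out

sample = np.full((8, 8), 128, dtype=np.uint8)   # stand-in sample image
sample_set = [
    stretch(sample, 2.0, 1.0),       # widened copy
    stretch(sample, 1.0, 0.5),       # shortened copy
    shift_color(sample, 40),         # colour-changed copy
    occlude(sample, 0, 4, 0, 8),     # upper half covered
]
```

Each amplified copy keeps the same label as the original sample image.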
  • performing image amplification on the sample image to obtain a sample image set includes:
  • Image texture extraction algorithms such as the GLCM (gray-level co-occurrence matrix) method or LBP (local binary pattern) can be used to delineate the texture of the sample image and highlight the image texture in the sample image.
  • the random local deepening of the image texture to obtain a texture deepened image includes:
  • Image textures at a preset ratio are selected according to the number of textures, and the pixel values on the selected textures to be processed are adjusted (for example, adjusted toward the black range) to deepen the textures to be processed and obtain the texture-deepened image.
  • The step of randomly locally lightening the image texture to obtain a texture-lightened image is consistent with the texture-deepening step; for example, the pixel values on the texture to be processed are adjusted toward the white range to lighten it and obtain the texture-lightened image.
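The texture deepening and lightening steps can be sketched as below. This is an assumption-laden illustration: a simple gradient threshold stands in for the GLCM/LBP texture delineation the patent mentions, and the ratio, seed, and function names are invented for the example:

```python
import numpy as np

def texture_mask(img, thresh=20):
    """Crude texture delineation: mark pixels with strong local intensity
    change. (A gradient threshold stands in for GLCM/LBP here.)"""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy) > thresh

def adjust_texture(img, ratio=0.5, deepen=True, seed=0):
    """Randomly pick `ratio` of the texture pixels and push them toward
    black (deepening) or white (lightening)."""
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(texture_mask(img))
    n = int(len(ys) * ratio)
    pick = rng.choice(len(ys), size=n, replace=False)
    out = img.copy()
    out[ys[pick], xs[pick]] = 0 if deepen else 255
    return out

img = np.zeros((6, 6), dtype=np.uint8)
img[:, 3:] = 200                      # a vertical edge acts as "texture"
deepened = adjust_texture(img, ratio=1.0, deepen=True)
lightened = adjust_texture(img, ratio=1.0, deepen=False)
```

Only pixels on the delineated texture are changed; the rest of the image is untouched.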
  • The noise image is an image of the same type as the sample image, but the target object in the noise image differs from that in the sample image (for example, both are fruit images, but the noise image contains watermelons while the sample image contains apples); similarly, the noise image also carries the real label corresponding to it.
  • For example, the target object contained in the sample image is a lesion of a lung disease, while the noise image may contain a lesion of a liver disease.
  • the noise image and the sample image set are collected, and the collected sample image and noise image are divided into a first training set and a test set according to a preset division ratio.
  • both the first training set and the test set include sample images and noise images.
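A minimal sketch of pooling the amplified sample images with the noise images and dividing them by a preset ratio; the 80/20 ratio, the string labels, and the function name are illustrative assumptions:

```python
import random

def build_first_split(sample_set, noise_set, train_ratio=0.8, seed=42):
    """Pool the amplified sample images with same-type noise images,
    shuffle, and divide by a preset ratio into the first training set
    and the test set. Labels ride along so both subsets can contain
    sample and noise examples."""
    pool = [(img, "sample") for img in sample_set] + \
           [(img, "noise") for img in noise_set]
    random.Random(seed).shuffle(pool)
    cut = int(len(pool) * train_ratio)
    return pool[:cut], pool[cut:]

# 8 stand-in sample images and 4 stand-in noise images
train, test = build_first_split(list(range(8)), list(range(100, 104)))
```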
  • The original recognition model can use networks with image recognition capability such as VGG, GoogLeNet, and residual networks (ResNet).
  • Preferably, EfficientNet is used as the backbone network of the original recognition model; EfficientNet is a convolutional neural network with compound scaling across multiple dimensions, which helps improve the accuracy of image processing and thereby the accuracy of image recognition.
  • the embodiment of the present application uses the training set to train the original recognition model, so as to adjust the model parameters in the original recognition model, improve the accuracy of the original recognition model for image recognition, and obtain the initial recognition model.
  • Using the first training set to train the pre-built original recognition model to obtain an initial recognition model includes:
  • The recognition result is the original recognition model's prediction of the type of target contained in each image in the first training set. For example, the first training set contains sample image A, sample image B and noise image C, where the real label of sample image A and sample image B is apple (that is, the target object is an apple) and the real label of noise image C is watermelon; the recognition results obtained by the original recognition model are: sample image A is an apple, sample image B is a watermelon, and noise image C is an apple.
  • the preset first loss function can be used to calculate the loss value of the recognition result and the real label corresponding to each image in the first training set, and then adjust the parameters of the original recognition model according to the loss value to improve the accuracy of the original recognition model.
  • Specifically, vector conversion is performed on the real labels in the first training set to obtain real vectors, and on the recognition results to obtain recognition vectors; the loss value between the real vector and the recognition vector corresponding to each image in the first training set is calculated, and a preset optimization algorithm is then used to adjust the parameters of the original recognition model according to the loss value.
  • the optimization algorithm includes but is not limited to: batch gradient descent algorithm, stochastic gradient descent algorithm, and mini-batch gradient descent algorithm.
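The label vector conversion, loss calculation, and gradient-descent update can be sketched as below. The patent does not fix the first loss function, so softmax cross-entropy on a toy linear scoring layer is used purely for illustration; all names are hypothetical:

```python
import numpy as np

LABELS = ["apple", "watermelon"]

def one_hot(label):
    """Vector conversion of a real label, e.g. "apple" -> [1, 0]."""
    v = np.zeros(len(LABELS))
    v[LABELS.index(label)] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(true_vec, logits):
    """Loss value between the real vector and the recognition vector."""
    return -float(np.sum(true_vec * np.log(softmax(logits) + 1e-12)))

def sgd_step(W, x, label, lr=0.1):
    """One stochastic-gradient-descent step on a linear scoring layer W."""
    true_vec = one_hot(label)
    p = softmax(W @ x)
    grad = np.outer(p - true_vec, x)   # dL/dW for softmax + cross-entropy
    return W - lr * grad

x = np.array([1.0, 0.5])               # stand-in image feature
W = np.zeros((2, 2))
loss_before = cross_entropy(one_hot("apple"), W @ x)
for _ in range(50):
    W = sgd_step(W, x, "apple")
loss_after = cross_entropy(one_hot("apple"), W @ x)
```

The same loop with mini-batches corresponds to the mini-batch gradient descent variant mentioned above.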
  • Use the test set to test the initial recognition model, select preset types of error results from the test results to construct a second training set, and use the sample images to construct a third training set.
  • Specifically, the test set can be used to test the initial recognition model obtained in step S2; for example, the test set is input into the initial recognition model to obtain the model's test result for each image in the test set.
  • the preset types of error results include error results of noise images in the test set.
  • For example, the test set contains sample image D, sample image E, noise image F and noise image G, where the real label of sample image D and sample image E is apple, and the real label of noise image F and noise image G is watermelon.
  • The test results obtained are: sample image D and noise image F are apples, and sample image E and noise image G are watermelons. Therefore, noise image F and sample image E are the wrong results in the test results, and noise image F (the error result of a noise image) is selected into the second training set.
  • At least one sample image in the sample image set generated in step S1 is used as the third training set.
  • In this embodiment of the present application, the second training set is constructed from the wrong results in the test results, and the third training set is constructed from the sample images, so that the noise images easily misrecognized by the initial recognition model and the sample images containing the target object each form a separate training set; this facilitates further training of the initial recognition model and improves its accuracy.
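A sketch of selecting the preset type of error result (misrecognized noise images) for the second training set, mirroring the D/E/F/G example; the tuple layout and function name are illustrative assumptions:

```python
def build_retraining_sets(test_set, predictions, sample_set):
    """Select the preset type of error result -- misrecognized noise
    images -- as the second training set, and reuse sample images as
    the third training set."""
    second = [img for (img, kind, true_label), pred in zip(test_set, predictions)
              if kind == "noise" and pred != true_label]
    third = list(sample_set)
    return second, third

# Mirrors the example above: only noise image F is misrecognized as an apple.
test_set = [("D", "sample", "apple"), ("E", "sample", "apple"),
            ("F", "noise", "watermelon"), ("G", "noise", "watermelon")]
predictions = ["apple", "watermelon", "apple", "watermelon"]
second, third = build_retraining_sets(test_set, predictions, ["A1", "A2"])
```

Sample image E is also misrecognized, but only errors on noise images belong to the preset type, so only F enters the second training set.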
  • The first feature vectors of the images in the second training set may be extracted by using an image feature extraction algorithm such as the HOG algorithm, the LBP algorithm, or the Haar algorithm.
  • step of extracting the second feature vectors of the images in the third training set is the same as the step of extracting the first feature vectors of the images in the second training set, which will not be repeated here.
  • the extraction of the first feature vector of the images in the second training set includes:
  • Specifically, the first gradient operator and the second gradient operator are preset matrices; for example, the first gradient operator may be [-1, 0, 1] and the second gradient operator may be [1, 0, -1]. By convolving each image in the second training set with the first and second gradient operators, the horizontal gradient component and vertical gradient component corresponding to each image can be obtained.
  • the use of the preset first gradient operator to perform a horizontal convolution operation on the image in the second training set to obtain a horizontal gradient component includes:
  • The convolution step size refers to the pixel distance the first gradient operator moves after each convolution operation, and the convolution length refers to the length of each image in the second training set in the horizontal direction.
  • the calculating the first feature vector according to the horizontal gradient component and the vertical gradient component includes:
  • the horizontal normalized component and the vertical normalized component are squared and summed to obtain the first feature vector.
  • Specifically, the horizontal gradient component and the vertical gradient component can be normalized using a preset linear function, logarithmic function, arc-cotangent function, or the like.
  • The horizontal normalized component and the vertical normalized component can be squared and summed using the following square-summation formula to obtain the first feature vector: L = √(Gx′² + Gy′²), where L is the first feature vector, Gx′ is the horizontal normalized component, and Gy′ is the vertical normalized component.
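The gradient-operator convolution, normalization, and square-summation steps can be sketched as follows. Max-absolute normalization stands in for the unspecified normalization function, and the cropping used to align the two components is an implementation choice, not from the patent:

```python
import numpy as np

def conv1d_valid(row, kernel):
    """Slide the gradient operator along one row/column, step size 1."""
    k = len(kernel)
    return np.array([np.dot(row[i:i + k], kernel)
                     for i in range(len(row) - k + 1)])

def gradient_feature(img):
    img = img.astype(float)
    # Horizontal convolution with the first operator, vertical with the second.
    gx = np.array([conv1d_valid(r, [-1, 0, 1]) for r in img])
    gy = np.array([conv1d_valid(c, [1, 0, -1]) for c in img.T]).T
    gx, gy = gx[1:-1, :], gy[:, 1:-1]          # crop to a common shape
    norm = lambda g: g / (np.abs(g).max() + 1e-12)  # stand-in normalization
    gxn, gyn = norm(gx), norm(gy)
    return np.sqrt(gxn ** 2 + gyn ** 2)        # square, sum, root

img = np.tile(np.arange(5.0), (5, 1))          # intensity ramps left to right
feat = gradient_feature(img)
```

For this ramp image the vertical component is zero everywhere, so the feature map reduces to the normalized horizontal gradient.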
  • Specifically, a preset second loss function can be used to calculate the loss value between the first feature vector and the second feature vector, and the parameters of the initial recognition model are then adjusted according to the loss value to improve its accuracy.
  • the second loss function may be the same as or different from the first loss function in step S2.
  • The step of updating the parameters of the initial recognition model with the loss value to obtain a standard recognition model is the same as the step of adjusting the parameters of the original recognition model when training it with the first training set in step S2, and will not be repeated here.
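A sketch of the loss value between the first and second feature vectors. The patent leaves the second loss function open, so mean squared error between averaged feature vectors is assumed here; the function name and toy vectors are illustrative:

```python
import numpy as np

def feature_loss(first_vecs, second_vecs):
    """Mean squared distance between the average first feature vector
    (misrecognized-noise images) and the average second feature vector
    (sample images)."""
    f = np.mean(first_vecs, axis=0)
    s = np.mean(second_vecs, axis=0)
    return float(np.mean((f - s) ** 2))

first = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # second-training-set features
second = [np.array([1.0, 1.0])]                        # third-training-set features
loss = feature_loss(first, second)
```

This scalar loss would then drive the same gradient-based parameter update described for the first training stage.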
  • the image to be recognized may be an image that contains or does not contain the target object.
  • Specifically, the standard recognition model can be used to perform target recognition on the image to be recognized to obtain the recognition result of the type of target object contained in the image.
  • Through image expansion, the embodiment of the present application can enlarge an originally small number of samples, and training the model with both the expanded sample image set and noise images improves the model's accuracy and robustness. Testing the model, constructing a training set from the wrong results in the test results, and retraining with the sample images further improve the accuracy of target recognition. Therefore, the method for identifying objects in images proposed by the present application can solve the problem of low accuracy in identifying objects.
  • FIG. 3 is a functional block diagram of an apparatus for recognizing objects in images provided by an embodiment of the present application.
  • the apparatus 100 for recognizing objects in images described in this application can be installed in electronic equipment.
  • the device 100 for identifying objects in images may include an image augmentation module 101 , a first training module 102 , a model testing module 103 , a feature extraction module 104 , a second training module 105 and an image recognition module 106 .
  • the module described in this application can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of the electronic device and can complete fixed functions, and are stored in the memory of the electronic device.
  • each module/unit is as follows:
  • the image augmentation module 101 is configured to acquire a sample image containing a target object, perform image augmentation on the sample image, and obtain a sample image set;
  • the sample image contains a specific target object and the real label corresponding to the target object.
  • For example, when the target object is an apple, the sample image is an image containing an apple and the real label is "apple"; or, when the target object is a lesion of a certain disease, the sample image is an image containing the lesion and the real label is the name of the disease corresponding to the lesion.
  • Specifically, the pre-stored sample images can be fetched from a pre-built blockchain node through a Java statement with a data-fetching function; the high data throughput of the blockchain can be used to improve the efficiency of obtaining the sample images.
  • the image amplification of the sample image can be realized by performing geometric transformation, color change, contrast adjustment, and partial occlusion on the sample image.
  • The sample image set can be generated by stretching the sample image by different amounts in the horizontal direction, the vertical direction, or both.
  • By changing the color of the sample image, a sample image set including sample images of multiple different colors is obtained.
  • By partially covering the sample image, multiple sample images in which different parts are covered are obtained. For example, covering the upper part of the target object yields a sample image in which only the lower part of the target object is visible, and covering the right half yields a sample image in which only the left half is visible; the sample images covering different areas are collected as a sample image set.
  • the image augmentation module 101 is specifically used for:
  • Image texture extraction algorithms such as the GLCM (gray-level co-occurrence matrix) method or LBP (local binary pattern) can be used to delineate the texture of the sample image and highlight the image texture in the sample image.
  • the random local deepening of the image texture to obtain a texture deepened image includes:
  • Image textures at a preset ratio are selected according to the number of textures, and the pixel values on the selected textures to be processed are adjusted (for example, adjusted toward the black range) to deepen the textures to be processed and obtain the texture-deepened image.
  • The step of randomly locally lightening the image texture to obtain a texture-lightened image is consistent with the texture-deepening step; for example, the pixel values on the texture to be processed are adjusted toward the white range to lighten it and obtain the texture-lightened image.
  • The first training module 102 is configured to use noise images of the same type as the sample image, together with the sample image set, to construct a first training set and a test set, and to use the first training set to train the pre-built original recognition model to obtain an initial recognition model.
  • The noise image is an image of the same type as the sample image, but the target object in the noise image differs from that in the sample image (for example, both are fruit images, but the noise image contains watermelons while the sample image contains apples); similarly, the noise image also carries the real label corresponding to it.
  • For example, the target object contained in the sample image is a lesion of a lung disease, while the noise image may contain a lesion of a liver disease.
  • the noise image and the sample image set are collected, and the collected sample image and noise image are divided into a first training set and a test set according to a preset division ratio.
  • both the first training set and the test set include sample images and noise images.
  • The original recognition model can use networks with image recognition capability such as VGG, GoogLeNet, and residual networks (ResNet).
  • Preferably, EfficientNet is used as the backbone network of the original recognition model; EfficientNet is a convolutional neural network with compound scaling across multiple dimensions, which helps improve the accuracy of image processing and thereby the accuracy of image recognition.
  • the embodiment of the present application uses the training set to train the original recognition model, so as to adjust the model parameters in the original recognition model, improve the accuracy of the original recognition model for image recognition, and obtain the initial recognition model.
  • the first training module 102 is specifically used for:
  • The recognition result is the original recognition model's prediction of the type of target contained in each image in the first training set. For example, the first training set contains sample image A, sample image B and noise image C, where the real label of sample image A and sample image B is apple (that is, the target object is an apple) and the real label of noise image C is watermelon; the recognition results obtained by the original recognition model are: sample image A is an apple, sample image B is a watermelon, and noise image C is an apple.
  • the preset first loss function can be used to calculate the loss value of the recognition result and the real label corresponding to each image in the first training set, and then adjust the parameters of the original recognition model according to the loss value to improve the accuracy of the original recognition model.
  • Specifically, vector conversion is performed on the real labels in the first training set to obtain real vectors, and on the recognition results to obtain recognition vectors; the loss value between the real vector and the recognition vector corresponding to each image in the first training set is calculated, and a preset optimization algorithm is then used to adjust the parameters of the original recognition model according to the loss value.
  • the optimization algorithm includes but is not limited to: batch gradient descent algorithm, stochastic gradient descent algorithm, and mini-batch gradient descent algorithm.
  • the model testing module 103 is configured to use the test set to test the initial recognition model, select preset types of error results in the test results to construct a second training set, and use the sample images to construct a third training set ;
  • Specifically, the test set can be used to test the initial recognition model obtained in the first training module 102; for example, the test set is input into the initial recognition model to obtain the model's test result for each image in the test set.
  • the preset types of error results include error results of noise images in the test set.
  • For example, the test set contains sample image D, sample image E, noise image F and noise image G, where the real label of sample image D and sample image E is apple, and the real label of noise image F and noise image G is watermelon.
  • The test results obtained are: sample image D and noise image F are apples, and sample image E and noise image G are watermelons. Therefore, noise image F and sample image E are the wrong results in the test results, and noise image F (the error result of a noise image) is selected into the second training set.
  • At least one sample image in the sample image set generated in the image augmentation module 101 is used as the third training set.
  • In this embodiment of the present application, the second training set is constructed from the wrong results in the test results, and the third training set is constructed from the sample images, so that the noise images easily misrecognized by the initial recognition model and the sample images containing the target object each form a separate training set; this facilitates further training of the initial recognition model and improves its accuracy.
  • the feature extraction module 104 is configured to extract a first feature vector of images in the second training set, and extract a second feature vector of images in the third training set;
  • The first feature vectors of the images in the second training set may be extracted by using an image feature extraction algorithm such as the HOG algorithm, the LBP algorithm, or the Haar algorithm.
  • step of extracting the second feature vectors of the images in the third training set is the same as the step of extracting the first feature vectors of the images in the second training set, which will not be repeated here.
  • the feature extraction module 104 is specifically used for:
  • extracting the second feature vectors of the images in the third training set.
  • Specifically, the first gradient operator and the second gradient operator are preset matrices; for example, the first gradient operator may be [-1, 0, 1] and the second gradient operator may be [1, 0, -1]. By convolving each image in the second training set with the first and second gradient operators, the horizontal gradient component and vertical gradient component corresponding to each image can be obtained.
  • the use of the preset first gradient operator to perform a horizontal convolution operation on the image in the second training set to obtain a horizontal gradient component includes:
  • The convolution step size refers to the pixel distance the first gradient operator moves after each convolution operation, and the convolution length refers to the length of each image in the second training set in the horizontal direction.
  • the calculating the first feature vector according to the horizontal gradient component and the vertical gradient component includes:
  • the horizontal normalized component and the vertical normalized component are squared and summed to obtain the first feature vector.
  • the horizontal gradient component and the vertical gradient component can be normalized using a preset linear function, logarithmic function, inverse cotangent function, or the like.
  • the horizontal normalized component and the vertical normalized component can be squared and summed using the following square-summation formula to obtain the first feature vector:
  • where L is the first feature vector, and the remaining two symbols denote the horizontal normalized component and the vertical normalized component, respectively.
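The gradient-component computation and square-summation described above can be sketched as follows. The zero padding, the linear normalisation, and the function name are our illustrative choices; the text leaves the normalisation function open (linear, logarithmic, or inverse-cotangent variants are all permitted).

```python
import numpy as np

def first_feature_vector(img):
    """Sketch: horizontal/vertical gradients, normalisation, square-summation."""
    f = np.pad(img.astype(float), 1)       # zero padding at the borders (assumed)
    # correlate [-1, 0, 1] along each row -> horizontal gradient component
    gx = f[1:-1, 2:] - f[1:-1, :-2]
    # correlate [1, 0, -1] along each column -> vertical gradient component
    gy = f[:-2, 1:-1] - f[2:, 1:-1]
    # linear normalisation into [-1, 1] (one of the allowed variants)
    gx_n = gx / (np.abs(gx).max() or 1.0)
    gy_n = gy / (np.abs(gy).max() or 1.0)
    # square the normalised components and sum them element-wise
    return (gx_n ** 2 + gy_n ** 2).ravel()
```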
  • the second training module 105 is configured to calculate a loss value between the first feature vector and the second feature vector, and update the parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
  • the preset second loss function can be used to calculate the loss value between the recognition result and the real label corresponding to each image in the first training set, and the parameters of the initial recognition model are then adjusted to improve its accuracy.
  • the second loss function may be the same as or different from the first loss function in the first training module 102 .
  • the step of updating the parameters of the initial recognition model using the loss value to obtain a standard recognition model is the same as the step in the first training module 102 of training the pre-built original recognition model with the first training set to adjust the parameters of the original recognition model, and will not be repeated here.
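The loss-driven parameter update performed by this module can be illustrated with a toy numerical sketch. Here the model is reduced to a single linear map W, and MSE stands in for the unspecified second loss function; the names, shapes, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))   # stand-in inputs derived from the second training set
T = rng.normal(size=(8, 3))   # stand-in second feature vectors (targets)
W = np.zeros((4, 3))          # model parameters to be updated

def loss(W):
    # MSE between the model's first feature vectors X @ W and the targets
    return float(np.mean((X @ W - T) ** 2))

history = []
for _ in range(100):
    grad = 2 * X.T @ (X @ W - T) / X.shape[0]  # gradient of the loss w.r.t. W
    W -= 0.05 * grad                           # parameter update step
    history.append(loss(W))
```

In the embodiment itself, the same idea is carried out by backpropagating the loss through the initial recognition model rather than a single linear layer.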
  • the image recognition module 106 is configured to acquire an image to be recognized, and use the standard recognition model to perform object recognition on the image to be recognized to obtain a target object recognition result in the image.
  • the image to be recognized may or may not contain the target object.
  • the standard recognition model can be used to perform target recognition on the image to be recognized to obtain the recognition result of the target object type in the image.
  • the embodiment of the present application expands the original small number of samples through image augmentation and trains the model with the expanded sample image set and the noise images simultaneously, which helps improve the accuracy and robustness of the model; the initial model is then tested, the erroneous results among the test results are used to construct a training set, and the model is retrained together with the sample images, which further improves the accuracy of target object recognition. Therefore, the apparatus for identifying objects in images proposed by the present application can solve the problem of low accuracy in identifying target objects.
  • FIG. 4 is a schematic structural diagram of an electronic device for implementing a method for recognizing a target object in an image provided by an embodiment of the present application.
  • the electronic device 1 may include a processor 10 , a memory 11 and a bus, and may also include a computer program stored in the memory 11 and operable on the processor 10 , such as a program 12 for object recognition in an image.
  • the memory 11 includes at least one type of readable storage medium, including flash memory, mobile hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disc, etc.
  • the memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, such as a mobile hard disk of the electronic device 1 .
  • the memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the electronic device 1 .
  • the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • the memory 11 can not only be used to store application software and various data installed in the electronic device 1 , such as codes of the object recognition program 12 in images, but also can be used to temporarily store data that has been output or will be output.
  • the processor 10 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, and combinations of various control chips, etc.
  • the processor 10 is the control core (Control Unit) of the electronic device; it connects the components of the entire electronic device through various interfaces and lines, and executes the various functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (such as the object recognition program in images) and invoking the data stored in the memory 11 .
  • the bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the bus is configured to realize connection and communication between the memory 11 and at least one processor 10 and the like.
  • FIG. 4 only shows an electronic device with components. Those skilled in the art can understand that the structure shown in FIG. 4 does not constitute a limitation to the electronic device 1, and may include fewer or more components, or combinations of certain components, or different arrangements of components.
  • the electronic device 1 can also include a power supply (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device.
  • the power supply may also include one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other components.
  • the electronic device 1 may also include various sensors, bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the electronic device 1 may also include a network interface; optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is usually used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the electronic device 1 may further include a user interface, which may be a display (Display) or an input unit (such as a keyboard (Keyboard)).
  • the user interface may also be a standard wired interface or a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, and the like.
  • the display may also be appropriately called a display screen or a display unit, and is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.
  • the object recognition program 12 in images stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run by the processor 10, can realize:
  • using the test set to test the initial recognition model, selecting erroneous results of a preset type from the test results to construct a second training set, and using the sample images to construct a third training set;
  • acquiring the image to be recognized, and performing target object recognition on the image to be recognized using the standard recognition model to obtain a target object recognition result for the image.
  • if the integrated modules/units of the electronic device 1 are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the computer-readable storage medium may be volatile or non-volatile.
  • the computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), etc.
  • the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor of an electronic device, it can realize:
  • using the test set to test the initial recognition model, selecting erroneous results of a preset type from the test results to construct a second training set, and using the sample images to construct a third training set;
  • acquiring the image to be recognized, and performing target object recognition on the image to be recognized using the standard recognition model to obtain a target object recognition result for the image.
  • modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or in the form of hardware plus software function modules.
  • Blockchain is essentially a decentralized database, a series of data blocks associated with each other using cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity (anti-counterfeiting) of the information and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

Abstract

A method and apparatus for identifying a target object in an image, a device and a medium, relating to image processing technology, comprising: obtaining a sample image set by means of image augmentation; constructing a first training set and a test set using a noise image and the sample image set, and training an original recognition model using the first training set to obtain an initial recognition model; testing the initial recognition model using the test set, selecting erroneous test results to construct a second training set and generate first feature vectors, and constructing second feature vectors from the sample images; and calculating a loss value between the first feature vectors and the second feature vectors to update the parameters of the initial recognition model, thereby obtaining a standard recognition model that identifies an image to be recognized and yields the target object recognition result for the image. In addition, the method relates to blockchain technology, and the sample image may be stored in a node of a blockchain. The method can solve the problem of low accuracy in target object identification.

Description

Method and apparatus for identifying a target object in an image, electronic device and storage medium

This application claims priority to the Chinese patent application No. CN202110581184.5, titled "Method and apparatus for identifying a target object in an image, electronic device and storage medium", filed with the China Patent Office on May 27, 2021, the entire contents of which are incorporated into this application by reference.
Technical Field

The present application relates to the technical field of artificial intelligence, and in particular to a method and apparatus for identifying a target object in an image, an electronic device, and a computer-readable storage medium.
Background Art

With the rapid development of artificial intelligence technology, it has become increasingly common in daily life to use image recognition models to process images in order to identify information about the target objects they contain, for example, using an image recognition model to analyze disease images and determine the type of lesion in an image. However, because of the privacy of medical images, the number of images available for training such a model is very small, which results in low accuracy of the trained model.

The inventor realized that existing model training with a small number of samples often uses image augmentation to expand the original small set of samples into multiple images, so as to train the model extensively. However, since the new images generated by augmentation are all derived from the features of the original images, the basic image features do not change, which easily causes the trained model to overfit and, in turn, lowers the accuracy of target object recognition.
Summary of the Invention

A method for identifying a target object in an image provided by the present application includes:

acquiring a sample image containing the target object, and performing image augmentation on the sample image to obtain a sample image set;

constructing a first training set and a test set using noise images of the same type as the sample image together with the sample image set, and training a pre-built original recognition model using the first training set to obtain an initial recognition model;

testing the initial recognition model using the test set, selecting erroneous results of a preset type from the test results to construct a second training set, and constructing a third training set using the sample images;

extracting first feature vectors of the images in the second training set, and extracting second feature vectors of the images in the third training set;

calculating a loss value between the first feature vectors and the second feature vectors, and updating the parameters of the initial recognition model according to the loss value to obtain a standard recognition model;

acquiring an image to be recognized, and performing target object recognition on the image to be recognized using the standard recognition model to obtain a target object recognition result for the image.
The present application also provides an apparatus for identifying a target object in an image, the apparatus comprising:

an image augmentation module, configured to acquire a sample image containing a target object and perform image augmentation on the sample image to obtain a sample image set;

a first training module, configured to construct a first training set and a test set using noise images of the same type as the sample image together with the sample image set, and to train a pre-built original recognition model using the first training set to obtain an initial recognition model;

a model testing module, configured to test the initial recognition model using the test set, select erroneous results of a preset type from the test results to construct a second training set, and construct a third training set using the sample images;

a feature extraction module, configured to extract first feature vectors of the images in the second training set and second feature vectors of the images in the third training set;

a second training module, configured to calculate a loss value between the first feature vectors and the second feature vectors, and update the parameters of the initial recognition model according to the loss value to obtain a standard recognition model;

an image recognition module, configured to acquire an image to be recognized and perform target object recognition on it using the standard recognition model to obtain a target object recognition result for the image.
The present application also provides an electronic device, the electronic device comprising:

a memory storing at least one instruction; and

a processor that executes the instructions stored in the memory to implement the following steps:

acquiring a sample image containing the target object, and performing image augmentation on the sample image to obtain a sample image set;

constructing a first training set and a test set using noise images of the same type as the sample image together with the sample image set, and training a pre-built original recognition model using the first training set to obtain an initial recognition model;

testing the initial recognition model using the test set, selecting erroneous results of a preset type from the test results to construct a second training set, and constructing a third training set using the sample images;

extracting first feature vectors of the images in the second training set, and extracting second feature vectors of the images in the third training set;

calculating a loss value between the first feature vectors and the second feature vectors, and updating the parameters of the initial recognition model according to the loss value to obtain a standard recognition model;

acquiring an image to be recognized, and performing target object recognition on the image to be recognized using the standard recognition model to obtain a target object recognition result for the image.
The present application also provides a computer-readable storage medium storing at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement the following steps:

acquiring a sample image containing the target object, and performing image augmentation on the sample image to obtain a sample image set;

constructing a first training set and a test set using noise images of the same type as the sample image together with the sample image set, and training a pre-built original recognition model using the first training set to obtain an initial recognition model;

testing the initial recognition model using the test set, selecting erroneous results of a preset type from the test results to construct a second training set, and constructing a third training set using the sample images;

extracting first feature vectors of the images in the second training set, and extracting second feature vectors of the images in the third training set;

calculating a loss value between the first feature vectors and the second feature vectors, and updating the parameters of the initial recognition model according to the loss value to obtain a standard recognition model;

acquiring an image to be recognized, and performing target object recognition on the image to be recognized using the standard recognition model to obtain a target object recognition result for the image.
Brief Description of the Drawings

FIG. 1 is a schematic flowchart of a method for identifying a target object in an image provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of generating a first feature vector provided by an embodiment of the present application;

FIG. 3 is a functional module diagram of an apparatus for identifying a target object in an image provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device for implementing the method for identifying a target object in an image provided by an embodiment of the present application.

The realization of the objectives, functional features, and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description of Embodiments

It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.

The embodiments of the present application may acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.

Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning/deep learning.

An embodiment of the present application provides a method for identifying a target object in an image. The execution subject of the method includes, but is not limited to, at least one of the electronic devices, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the method may be executed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, a cloud server cluster, and the like. The server may be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.
Referring to FIG. 1, which is a schematic flowchart of a method for identifying a target object in an image provided by an embodiment of the present application. In this embodiment, the method for identifying a target object in an image includes:

S1. Acquire a sample image containing a target object, and perform image augmentation on the sample image to obtain a sample image set.

In the embodiment of the present application, the sample image contains a specific target object and the real label corresponding to that target object. For example, when the target object is an apple, the sample image is an image containing an apple and the real label is "apple"; or, when the target object is the lesion of a certain disease, the sample image is an image containing that lesion and the real label is the name of the corresponding disease.

In the embodiment of the present application, pre-stored sample images can be fetched from pre-built blockchain nodes through Java statements with a data-fetching function; the high data throughput of the blockchain improves the efficiency of acquiring the sample images.
In the embodiment of the present application, image augmentation of the sample image can be realized through geometric transformation, color change, contrast adjustment, partial occlusion, and the like.

In one embodiment of the present application, the sample image is stretched horizontally and vertically by different amounts, so as to obtain a sample image set produced by transforming the length, the width, or a combination of both.

Alternatively, the sample image is recolored so that its color is changed into multiple different colors, yielding a sample image set containing sample images in multiple different colors.

Alternatively, multiple sample images with different parts occluded are obtained by partially occluding the sample image. For example, the upper half of the target object in the sample image is occluded to obtain a sample image in which only the lower half of the target object is visible, and the right half of the target object is occluded to obtain a sample image in which only its left half is visible; the sample images with different occluded regions are then collected into a sample image set.
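The three augmentation variants described above (stretching, color change, partial occlusion) can be sketched in NumPy as follows. Nearest-neighbour resizing, fixed channel gains, and the function names are our illustrative simplifications, not the embodiment's exact operations.

```python
import numpy as np

def stretch(img, sy, sx):
    """Resize by repeating rows/columns (nearest-neighbour stretch)."""
    h, w = img.shape[:2]
    rows = (np.arange(int(h * sy)) / sy).astype(int)
    cols = (np.arange(int(w * sx)) / sx).astype(int)
    return img[rows][:, cols]

def recolor(img, gains=(1.2, 0.8, 1.0)):
    """Change the color by scaling the RGB channels."""
    return np.clip(img * np.array(gains), 0, 255).astype(img.dtype)

def occlude(img, top, left, bh, bw):
    """Mask a rectangular region (e.g. the upper half of the object)."""
    out = img.copy()
    out[top:top + bh, left:left + bw] = 0
    return out

def augment(sample):
    """Collect the transformed variants into a sample image set."""
    return [stretch(sample, 1.5, 1.0),
            recolor(sample),
            occlude(sample, 0, 0, sample.shape[0] // 2, sample.shape[1])]
```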
In one embodiment of the present application, performing image augmentation on the sample image to obtain a sample image set includes:

performing texture depiction on the sample image to obtain the image texture of the sample image;

performing random local deepening on the image texture to obtain a texture-deepened image;

performing random local lightening on the image texture to obtain a texture-lightened image;

collecting the texture-deepened image and the texture-lightened image into the sample image set.

In detail, image texture extraction algorithms such as the GLCM (gray-level co-occurrence matrix) method and LBP (local binary patterns) can be used to perform the texture depiction of the sample image, so as to highlight the image texture in the sample image.
Specifically, performing random local deepening on the image texture to obtain a texture-deepened image includes:

counting the number of textures in the image texture;

selecting a preset proportion of the image textures as textures to be processed according to the texture count;

adjusting the pixel values of the pixels on the textures to be processed to obtain a texture-deepened image.

In one application scenario of the present application, image textures of the preset proportion are selected according to the texture count, and the pixel values of the selected textures to be processed are adjusted (for example, into the black range) to deepen them and obtain a texture-deepened image.

Similarly, the step of performing random local lightening to obtain a texture-lightened image is consistent with the texture-deepening step; for example, the pixel values of the textures to be processed are adjusted into the white range to lighten them and obtain a texture-lightened image.
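A minimal sketch of the deepening/lightening steps follows, where "texture pixels" are approximated by pixels with a large local difference and a preset proportion of them is pushed toward black or white. The threshold, the random selection scheme, and the function names are our assumptions, not the embodiment's exact procedure.

```python
import numpy as np

def texture_mask(img, thresh=10):
    """Crude texture map: flag pixels whose horizontal or vertical
    difference to a neighbour exceeds the threshold."""
    g = img.astype(int)
    m = np.zeros(g.shape, dtype=bool)
    m[:, 1:] |= np.abs(np.diff(g, axis=1)) > thresh
    m[1:, :] |= np.abs(np.diff(g, axis=0)) > thresh
    return m

def adjust_texture(img, proportion=0.5, value=0, seed=0):
    """Set a random preset proportion of texture pixels to `value`
    (a value near 0 deepens the texture, near 255 lightens it)."""
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(texture_mask(img))
    n = int(len(ys) * proportion)            # preset proportion of textures
    pick = rng.choice(len(ys), size=n, replace=False)
    out = img.copy()
    out[ys[pick], xs[pick]] = value
    return out
```

The texture-deepened and texture-lightened variants would then be produced by calling `adjust_texture` with a black-range and a white-range value, respectively, and collected into the sample image set.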
S2. Construct a first training set and a test set from the sample image set together with noise images of the same type as the sample images, and train a pre-built original recognition model with the first training set to obtain an initial recognition model.
In the embodiments of the present application, a noise image is an image of the same type as the sample image but containing a different target object (for example, both the noise image and the sample image are fruit images, but the noise image contains a watermelon while the sample image contains an apple). Likewise, each noise image carries its own ground-truth label. As another example, the target object contained in the sample image may be a lesion of a lung disease, while the noise image contains a lesion of a liver disease.
In the embodiments of the present application, the noise images are pooled with the sample image set, and the pooled sample images and noise images are divided into a first training set and a test set according to a preset split ratio. Both the first training set and the test set contain sample images and noise images.
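The pooling and preset-ratio split can be sketched as below; the 80/20 split, the `(image, label)` pair representation, and the fixed shuffle seed are illustrative assumptions, not values fixed by the application.

```python
import random

def build_first_train_and_test(sample_images, noise_images, split=0.8, seed=42):
    """Pool sample and noise images, shuffle, and split by a preset ratio.
    Each item is a (image, label) pair; both resulting sets normally contain
    a mix of sample images and noise images."""
    pooled = list(sample_images) + list(noise_images)
    random.Random(seed).shuffle(pooled)
    cut = int(len(pooled) * split)
    return pooled[:cut], pooled[cut:]   # first training set, test set
```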
In the embodiments of the present application, the original recognition model may adopt a network with an image-recognition function such as a VGG network, GoogLeNet, or a Residual Network.
In one embodiment of the present application, EfficientNet is adopted as the backbone of the original recognition model. EfficientNet is a compound-scaled multi-dimensional convolutional neural network, which helps improve accuracy during image processing and thereby the accuracy of image recognition.
Further, the embodiments of the present application train the original recognition model with the training set so as to adjust the model parameters of the original recognition model and improve its image-recognition accuracy, obtaining the initial recognition model.
Specifically, training the pre-built original recognition model with the first training set to obtain the initial recognition model includes:
performing image recognition on the first training set with the original recognition model to obtain recognition results;
calculating a loss value between the recognition results and the ground-truth label of each image in the first training set; and
adjusting the parameters of the original recognition model according to the loss value to obtain the initial recognition model.
Specifically, the recognition result is the original recognition model's prediction of the type of target object contained in each image of the first training set. For example, the first training set contains sample image A, sample image B, and noise image C, where the ground-truth labels of sample images A and B are apple (i.e., the target object is an apple) and the ground-truth label of image C is watermelon, but the original recognition model predicts: sample image A is an apple, sample image B is a watermelon, and noise image C is an apple.
A preset first loss function may be used to calculate the loss value between the recognition results and the ground-truth label of each image in the first training set, and the parameters of the original recognition model are then adjusted according to the loss value to improve the accuracy of the original recognition model.
For example, the ground-truth labels in the first training set are vector-converted into ground-truth vectors, and the recognition results are vector-converted into recognition vectors; the loss value between the ground-truth vector and the recognition vector of each image in the first training set is calculated; and the parameters of the original recognition model are then adjusted according to the loss values with a preset optimization algorithm, including but not limited to batch gradient descent, stochastic gradient descent, and mini-batch gradient descent.
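The label-to-vector conversion, loss computation, and gradient-descent parameter adjustment can be sketched with a toy linear classifier standing in for the recognition network (the application itself uses VGG/GoogLeNet/EfficientNet-class models). The one-hot encoding, cross-entropy loss, and learning rate are illustrative choices, not mandated by the application.

```python
import numpy as np

def one_hot(label, classes):
    """Vector-convert a ground-truth label into a ground-truth vector."""
    v = np.zeros(len(classes))
    v[classes.index(label)] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sgd_step(W, x, label, classes, lr=0.1):
    """One stochastic-gradient-descent update on a linear classifier:
    compute the recognition vector, its loss against the ground-truth
    vector, and adjust the parameters W in place."""
    p = softmax(W @ x)                  # recognition vector
    y = one_hot(label, classes)         # ground-truth vector
    loss = -np.sum(y * np.log(p + 1e-12))
    W -= lr * np.outer(p - y, x)        # cross-entropy gradient w.r.t. W
    return loss
```

Repeating the step over the first training set drives the loss down, which is the parameter adjustment the paragraph above describes.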
S3. Test the initial recognition model with the test set, select error results of a preset type from the test results to construct a second training set, and construct a third training set from the sample images.
In the embodiments of the present application, the initial recognition model obtained in step S2 may be tested with the test set; for example, the test set is input into the initial recognition model to obtain the model's test result for each image in the test set.
In one embodiment of the present application, when the second training set is constructed from error results of a preset type in the test results, the error results of the preset type include error results on the noise images in the test set.
For example, the test set contains sample image D, sample image E, noise image F, and noise image G, where the ground-truth labels of sample images D and E are apple and the ground-truth labels of noise images F and G are watermelon. After the initial recognition model recognizes the test set, the test results are: sample image D and noise image F are apples, while sample image E and noise image G are watermelons. Noise image F and sample image E are therefore the error results in the test results, and noise image F is selected for the second training set.
Further, the embodiments of the present application take at least one sample image from the sample image set generated in step S1 as the third training set.
In the embodiments of the present application, constructing the second training set from the error results in the test results and the third training set from the sample images builds separate training sets from the noise images that the initial recognition model tends to misrecognize and from the sample images containing the target object, which facilitates further training of the initial recognition model and improves its accuracy.
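The construction of the second and third training sets from the example above can be sketched as follows; the `(image, true_label, is_noise)` triple representation is an assumption made for illustration.

```python
def build_second_and_third_sets(test_set, predictions, sample_image_set):
    """Keep only the misrecognized noise images (the preset error type) as
    the second training set, and take sample images as the third.
    test_set items are (image, true_label, is_noise) triples."""
    second = [img
              for (img, truth, is_noise), pred in zip(test_set, predictions)
              if is_noise and pred != truth]    # e.g. noise image F
    third = list(sample_image_set)              # at least one sample image
    return second, third
```

Note that a misrecognized sample image (such as sample image E above) is an error result but not of the preset type, so it is not placed in the second training set.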
S4. Extract first feature vectors of the images in the second training set, and extract second feature vectors of the images in the third training set.
In the embodiments of the present application, image-feature-extraction algorithms such as the HOG algorithm, the LBP algorithm, and the Haar algorithm may be used to extract the first feature vectors of the images in the second training set.
Further, the step of extracting the second feature vectors of the images in the third training set is consistent with the step of extracting the first feature vectors of the images in the second training set, and is not repeated here.
In one embodiment of the present application, as shown in FIG. 2, extracting the first feature vectors of the images in the second training set includes:
S21. performing a horizontal convolution operation on the images in the second training set with a preset first gradient operator to obtain horizontal gradient components;
S22. performing a vertical convolution operation on the images in the second training set with a preset second gradient operator to obtain vertical gradient components; and
S23. calculating the first feature vectors from the horizontal gradient components and the vertical gradient components.
Specifically, the first gradient operator and the second gradient operator are preset matrices; for example, the first gradient operator may be [-1, 0, 1] and the second gradient operator may be [1, 0, -1]. By convolving each image in the second training set with the first gradient operator and the second gradient operator respectively, the horizontal gradient component and the vertical gradient component of each image are obtained.
Specifically, performing the horizontal convolution operation on the images in the second training set with the preset first gradient operator to obtain the horizontal gradient components includes:
obtaining a convolution stride and a convolution length;
calculating the number of horizontal convolutions from the convolution stride and the convolution length; and
performing, with the first gradient operator, that number of convolution operations on each image in the second training set at the convolution stride to obtain the horizontal gradient component.
The convolution stride is the number of pixels the first gradient operator moves after each convolution operation, and the convolution length is the horizontal pixel length of each image in the second training set; dividing the convolution length by the convolution stride gives the number of horizontal convolutions required for each image in the second training set.
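A minimal sketch of the stride-based horizontal convolution with the [-1, 0, 1] operator follows. Treating the operator as a 1x3 row kernel applied along each image row, and the boundary handling (no padding), are assumptions made for illustration.

```python
import numpy as np

def horizontal_gradient(image, operator=(-1, 0, 1), stride=1):
    """Slide the first gradient operator across each row at the given
    stride; the number of horizontal convolutions per row follows from the
    horizontal pixel length and the stride."""
    op = np.asarray(operator, dtype=float)
    h, w = image.shape
    k = len(op)
    positions = range(0, w - k + 1, stride)   # horizontal convolution count
    out = np.empty((h, len(positions)))
    for i in range(h):
        for j, c in enumerate(positions):
            out[i, j] = float(np.dot(image[i, c:c + k], op))
    return out
```

The vertical gradient component is obtained the same way with the second gradient operator applied along columns.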
In one embodiment of the present application, calculating the first feature vector from the horizontal gradient component and the vertical gradient component includes:
normalizing the horizontal gradient component to obtain a horizontal normalized component;
normalizing the vertical gradient component to obtain a vertical normalized component; and
square-summing the horizontal normalized component and the vertical normalized component to obtain the first feature vector.
Specifically, a preset function whose range lies in (0, 1), such as a linear function, a logarithmic function, or an arccotangent function, may be applied to the horizontal gradient component and the vertical gradient component to normalize them.
Specifically, the horizontal normalized component and the vertical normalized component may be square-summed with the following formula to obtain the first feature vector:
L = α² + β²
where L is the first feature vector, α is the horizontal normalized component, and β is the vertical normalized component.
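The normalization into (0, 1) followed by square-summing can be sketched as below. The sigmoid is just one choice of function with range (0, 1); the patent only requires some such function (linear, logarithmic, or arccotangent are also named), so this choice is an assumption.

```python
import numpy as np

def first_feature_vector(gx, gy):
    """Normalize the horizontal and vertical gradient components into
    (0, 1), then square-sum them elementwise as L = alpha**2 + beta**2."""
    alpha = 1.0 / (1.0 + np.exp(-np.asarray(gx, dtype=float)))  # horizontal
    beta = 1.0 / (1.0 + np.exp(-np.asarray(gy, dtype=float)))   # vertical
    return alpha ** 2 + beta ** 2                               # L
```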
S5. Calculate a loss value between the first feature vectors and the second feature vectors, and update the parameters of the initial recognition model according to the loss value to obtain a standard recognition model.
In the embodiments of the present application, a preset second loss function may be used to calculate the loss value between the first feature vectors and the second feature vectors, and the parameters of the initial recognition model are then adjusted accordingly to improve the accuracy of the initial recognition model.
The second loss function may be the same as, or different from, the first loss function in step S2.
Specifically, the step of updating the parameters of the initial recognition model with the loss value to obtain the standard recognition model is consistent with the parameter-adjustment step performed when the pre-built original recognition model is trained with the first training set in step S2, and is not repeated here.
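One possible concrete choice for the second loss function in S5, the mean squared distance between the first and second feature vectors, can be sketched as follows; the application does not fix the loss function, so this choice is an assumption.

```python
import numpy as np

def feature_loss(first_vecs, second_vecs):
    """Mean squared distance between the first feature vectors (second
    training set) and the second feature vectors (third training set)."""
    first = np.asarray(first_vecs, dtype=float)
    second = np.asarray(second_vecs, dtype=float)
    return float(np.mean((first - second) ** 2))
```

The resulting scalar is then fed to the same gradient-based parameter update used in step S2.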
S6. Acquire an image to be recognized, and perform target-object recognition on the image to be recognized with the standard recognition model to obtain a target-object recognition result for the image.
In the embodiments of the present application, the image to be recognized may or may not contain a target object. Once the image to be recognized is acquired, target-object recognition may be performed on it with the standard recognition model to obtain a recognition result for the type of target object in the image.
In the embodiments of the present application, image augmentation expands an originally small number of samples, and training the model with both the augmented sample image set and the noise images helps improve the model's accuracy and robustness. The model is then tested, a training set is constructed from the error results in the test results, and the model is retrained together with the sample images, which further improves the accuracy with which the model recognizes target objects. Therefore, the method for recognizing a target object in an image proposed in the present application can solve the problem of low target-recognition accuracy.
FIG. 3 is a functional block diagram of an apparatus for recognizing a target object in an image provided by an embodiment of the present application.
The apparatus 100 for recognizing a target object in an image described in the present application may be installed in an electronic device. According to the implemented functions, the apparatus 100 may include an image augmentation module 101, a first training module 102, a model testing module 103, a feature extraction module 104, a second training module 105, and an image recognition module 106. A module described in the present application, which may also be called a unit, refers to a series of computer program segments that can be executed by a processor of the electronic device to complete a fixed function, and that are stored in a memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
The image augmentation module 101 is configured to acquire a sample image containing a target object and perform image augmentation on the sample image to obtain a sample image set.
In the embodiments of the present application, the sample image contains a specific target object and the ground-truth label corresponding to that target object. For example, when the target object is an apple, the sample image is an image containing an apple and the ground-truth label is "apple"; or, when the target object is a lesion of a certain disease, the sample image is an image containing that lesion and the ground-truth label is the name of the corresponding disease.
In the embodiments of the present application, pre-stored sample images may be fetched from a pre-built blockchain node through a Java statement with a data-fetching function; the high data throughput of the blockchain improves the efficiency of acquiring the sample images.
In the embodiments of the present application, image augmentation of the sample image may be achieved by geometric transformation, color change, contrast adjustment, partial occlusion, and the like.
In one embodiment of the present application, the sample image is stretched horizontally and vertically by different amounts, obtaining a sample image set produced by transforming the length, the width, or a combination of both.
Alternatively, the sample image is dyed so that its color is changed into multiple different colors, obtaining a sample image set containing sample images in multiple different colors.
Alternatively, multiple sample images with different parts masked are obtained by partially masking the sample image. For example, masking the upper half of the target object yields a sample image in which only the lower half of the target object is visible, and masking the right half of the target object yields a sample image in which only the left half is visible; the sample images with different masked regions are then collected into the sample image set.
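The stretching, color-change, and partial-masking augmentations described above can be sketched for a grayscale image as follows; the stretch factors, intensity-shift range, and masked regions are illustrative assumptions.

```python
import numpy as np

def augment(image, seed=0):
    """Produce stretched, recolored, and partially masked variants of a
    grayscale image (a minimal sketch of the augmentations above)."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    out = []
    # stretch: resample rows/cols with nearest-neighbour indexing
    for fy, fx in [(1.0, 1.5), (1.5, 1.0)]:
        ys = (np.arange(int(h * fy)) / fy).astype(int)
        xs = (np.arange(int(w * fx)) / fx).astype(int)
        out.append(image[np.ix_(ys, xs)])
    # color change: shift intensities, clipped to the valid range
    out.append(np.clip(image + rng.integers(-40, 40), 0, 255))
    # partial occlusion: mask the upper half, then the right half
    top = image.copy(); top[: h // 2, :] = 0
    right = image.copy(); right[:, w // 2:] = 0
    out += [top, right]
    return out
```

All variants are then pooled with the original into the sample image set.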
In one embodiment of the present application, the image augmentation module 101 is specifically configured to:
acquire a sample image containing a target object;
perform texture depiction on the sample image to obtain the image texture of the sample image;
perform random local deepening on the image texture to obtain a texture-deepened image;
perform random local lightening on the image texture to obtain a texture-lightened image; and
collect the texture-deepened image and the texture-lightened image into the sample image set.
Specifically, image-texture-extraction algorithms such as the GLCM (gray-level co-occurrence matrix) method and LBP (local binary patterns) may be used to perform texture depiction on the sample image, so as to highlight the image texture in the sample image.
Specifically, performing random local deepening on the image texture to obtain the texture-deepened image includes:
counting the number of textures in the image texture;
selecting a preset proportion of the image textures as textures to be processed according to the texture count; and
adjusting the pixel values of the pixels on the textures to be processed to obtain the texture-deepened image.
In one application scenario of the present application, a preset proportion of the image textures is selected according to the texture count, and the pixel values on the selected textures to be processed are adjusted (for example, adjusted into the black range) so as to deepen the textures to be processed, obtaining a texture-deepened image.
Similarly, the step of randomly and locally lightening the image texture to obtain a texture-lightened image is consistent with the texture-deepening step; for example, the pixel values on the textures to be processed are adjusted into the white range so as to lighten them, obtaining a texture-lightened image.
The first training module 102 is configured to construct a first training set and a test set from the sample image set together with noise images of the same type as the sample images, and to train a pre-built original recognition model with the first training set to obtain an initial recognition model.
In the embodiments of the present application, a noise image is an image of the same type as the sample image but containing a different target object (for example, both the noise image and the sample image are fruit images, but the noise image contains a watermelon while the sample image contains an apple). Likewise, each noise image carries its own ground-truth label. As another example, the target object contained in the sample image may be a lesion of a lung disease, while the noise image contains a lesion of a liver disease.
In the embodiments of the present application, the noise images are pooled with the sample image set, and the pooled sample images and noise images are divided into a first training set and a test set according to a preset split ratio. Both the first training set and the test set contain sample images and noise images.
In the embodiments of the present application, the original recognition model may adopt a network with an image-recognition function such as a VGG network, GoogLeNet, or a Residual Network.
In one embodiment of the present application, EfficientNet is adopted as the backbone of the original recognition model. EfficientNet is a compound-scaled multi-dimensional convolutional neural network, which helps improve accuracy during image processing and thereby the accuracy of image recognition.
Further, the embodiments of the present application train the original recognition model with the training set so as to adjust the model parameters of the original recognition model and improve its image-recognition accuracy, obtaining the initial recognition model.
Specifically, the first training module 102 is configured to:
perform image recognition on the first training set with the original recognition model to obtain recognition results;
calculate a loss value between the recognition results and the ground-truth label of each image in the first training set; and
adjust the parameters of the original recognition model according to the loss value to obtain the initial recognition model.
Specifically, the recognition result is the original recognition model's prediction of the type of target object contained in each image of the first training set. For example, the first training set contains sample image A, sample image B, and noise image C, where the ground-truth labels of sample images A and B are apple (i.e., the target object is an apple) and the ground-truth label of image C is watermelon, but the original recognition model predicts: sample image A is an apple, sample image B is a watermelon, and noise image C is an apple.
A preset first loss function may be used to calculate the loss value between the recognition results and the ground-truth label of each image in the first training set, and the parameters of the original recognition model are then adjusted according to the loss value to improve the accuracy of the original recognition model.
For example, the ground-truth labels in the first training set are vector-converted into ground-truth vectors, and the recognition results are vector-converted into recognition vectors; the loss value between the ground-truth vector and the recognition vector of each image in the first training set is calculated; and the parameters of the original recognition model are then adjusted according to the loss values with a preset optimization algorithm, including but not limited to batch gradient descent, stochastic gradient descent, and mini-batch gradient descent.
The model testing module 103 is configured to test the initial recognition model with the test set, select error results of a preset type from the test results to construct a second training set, and construct a third training set from the sample images.
In the embodiments of the present application, the initial recognition model obtained by the first training module 102 may be tested with the test set; for example, the test set is input into the initial recognition model to obtain the model's test result for each image in the test set.
In one embodiment of the present application, when the second training set is constructed from error results of a preset type in the test results, the error results of the preset type include error results on the noise images in the test set.
For example, the test set contains sample image D, sample image E, noise image F, and noise image G, where the ground-truth labels of sample images D and E are apple and the ground-truth labels of noise images F and G are watermelon. After the initial recognition model recognizes the test set, the test results are: sample image D and noise image F are apples, while sample image E and noise image G are watermelons. Noise image F and sample image E are therefore the error results in the test results, and noise image F is selected for the second training set.
Further, the embodiments of the present application take at least one sample image from the sample image set generated by the image augmentation module 101 as the third training set.
In the embodiments of the present application, constructing the second training set from the error results in the test results and the third training set from the sample images builds separate training sets from the noise images that the initial recognition model tends to misrecognize and from the sample images containing the target object, which facilitates further training of the initial recognition model and improves its accuracy.
所述特征提取模块104,用于提取所述第二训练集中图像的第一特征向量,提取所述第三训练集中图像的第二特征向量;The feature extraction module 104 is configured to extract a first feature vector of images in the second training set, and extract a second feature vector of images in the third training set;
本申请实施例中,可利用HOG算法、LBP算法及Harr算法等具有图像特征提取的算法来提取所述第二训练集中图像的第一特征向量。In the embodiment of the present application, the first feature vectors of the images in the second training set may be extracted by using an image feature extraction algorithm such as the HOG algorithm, the LBP algorithm, and the Harr algorithm.
进一步,所述提取所述第三训练集中图像的第二特征向量的步骤,与提取所述第二训练集中图像的第一特征向量的步骤一致,在此不做赘述。Further, the step of extracting the second feature vectors of the images in the third training set is the same as the step of extracting the first feature vectors of the images in the second training set, which will not be repeated here.
本申请其中一个实施例中,所述特征提取模块104具体用于:In one of the embodiments of the present application, the feature extraction module 104 is specifically used for:
利用预设的第一梯度算子对所述第二训练集中的图像进行水平卷积运算,得到水平梯度分量;performing a horizontal convolution operation on the images in the second training set by using a preset first gradient operator to obtain a horizontal gradient component;
利用预设的第二梯度算子对所述第二训练集中的图像进行垂直卷积运算,得到垂直梯度分量;performing a vertical convolution operation on the images in the second training set by using a preset second gradient operator to obtain a vertical gradient component;
根据所述水平梯度分量与所述垂直梯度分量计算所述第一特征向量;calculating the first feature vector according to the horizontal gradient component and the vertical gradient component;
提取所述第三训练集中图像的第二特征向量。Extracting second feature vectors of images in the third training set.
详细地,所述第一梯度算子与所述第二梯度算子为预设的矩阵,例如,所述第一梯度算子可以为[-1,0,1],所述第二梯度算子可以为[1,0,-1],通过将所述第一梯度算子与所述第二梯度算子分别和所述第二训练集中每张图像进行卷积运算,即可得到每张图像对应的水平梯度分量和垂直梯度分量。In detail, the first gradient operator and the second gradient operator are preset matrices, for example, the first gradient operator may be [-1, 0, 1], and the second gradient operator can be [1, 0, -1], and each image in the second training set can be obtained by convolving the first gradient operator and the second gradient operator with each image in the second training set The horizontal gradient component and vertical gradient component corresponding to the image.
具体地,所述利用预设的第一梯度算子对所述第二训练集中图的像进行水平卷积运算,得到水平梯度分量,包括:Specifically, the use of the preset first gradient operator to perform a horizontal convolution operation on the image in the second training set to obtain a horizontal gradient component includes:
获取卷积步长和卷积长度;Get the convolution step size and convolution length;
根据所述卷积步长与所述卷积长度计算水平卷积次数;calculating the number of horizontal convolutions according to the convolution step size and the convolution length;
利用所述第一梯度算子将所述第二训练集中每张图像按照所述卷积步长进行所述卷积次数的卷积运算,得到水平梯度分量。Using the first gradient operator to perform a convolution operation on each image in the second training set according to the convolution step size for the number of convolutions to obtain a horizontal gradient component.
The convolution step size refers to the number of pixels the first gradient operator moves after performing one convolution operation, and the convolution length refers to the pixel length of each image in the second training set in the horizontal direction. Dividing the convolution length by the convolution step size yields the number of horizontal convolutions to be performed on each image in the second training set in the horizontal direction.
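Under the common valid-convolution convention, the count works out as below (a sketch; the kernel length of 3 matches the example operator, and the `(W - K) // S + 1` formula is an assumption, since the text only states that the two quantities are divided):

```python
def horizontal_conv_count(conv_length, conv_step, kernel_len=3):
    """Number of positions a 1-D kernel of length `kernel_len` occupies
    when sliding over a row of `conv_length` pixels with stride `conv_step`."""
    return (conv_length - kernel_len) // conv_step + 1
```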
本申请其中一个实施例中,所述根据所述水平梯度分量与所述垂直梯度分量计算所述第一特征向量,包括:In one of the embodiments of the present application, the calculating the first feature vector according to the horizontal gradient component and the vertical gradient component includes:
将所述水平梯度分量进行归一化计算,得到水平归一化分量;performing normalized calculation on the horizontal gradient component to obtain a horizontal normalized component;
将所述垂直梯度分量进行归一化计算,得到垂直归一化分量;performing normalized calculation on the vertical gradient component to obtain a vertical normalized component;
将所述水平归一化分量与所述垂直归一化分量进行平方求和,得到所述第一特征向量。The horizontal normalized component and the vertical normalized component are squared and summed to obtain the first feature vector.
In detail, the horizontal gradient component and the vertical gradient component may be normalized by applying a preset function whose range lies in (0, 1), such as a linear function, a logarithmic function or an inverse cotangent function.
Specifically, the horizontal normalized component and the vertical normalized component may be squared and summed according to the following formula to obtain the first feature vector:
L = α² + β²
其中,L为所述第一特征向量,α为所述水平归一化分量,β为所述垂直归一化分量。Wherein, L is the first feature vector, α is the horizontal normalized component, and β is the vertical normalized component.
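Putting the two steps together (a sketch; the arctangent-based mapping is only one possible choice, since the text merely requires a normalization function whose range lies in (0, 1)):

```python
import math

def normalize(component):
    """Map each gradient value into [0, 1) with an arctangent-based function."""
    return [2 / math.pi * math.atan(abs(v)) for v in component]

def first_feature_vector(horizontal, vertical):
    alpha = normalize(horizontal)   # horizontal normalized component
    beta = normalize(vertical)      # vertical normalized component
    # L = alpha^2 + beta^2, the element-wise square-sum above
    return [a * a + b * b for a, b in zip(alpha, beta)]
```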
所述第二训练模块105,用于计算所述第一特征向量与所述第二特征向量之间的损失值,并根据所述损失值对所述初始识别模型进行参数更新,得到标准识别模型;The second training module 105 is configured to calculate a loss value between the first feature vector and the second feature vector, and update the parameters of the initial recognition model according to the loss value to obtain a standard recognition model ;
In this embodiment of the present application, a preset second loss function may be used to calculate the loss value between the recognition result and the real label corresponding to each image in the first training set, so that the parameters of the initial recognition model can be adjusted to improve its accuracy.
其中,所述第二损失函数与第一训练模块102中所述第一损失函数可以相同,也可以不同。Wherein, the second loss function may be the same as or different from the first loss function in the first training module 102 .
In detail, the step of updating the parameters of the initial recognition model by using the loss value to obtain the standard recognition model is consistent with the parameter-adjustment step performed on the original recognition model when the first training module 102 trains the pre-built original recognition model with the first training set, and is not repeated here.
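A minimal numeric sketch of this update step (assumptions: mean squared error as the loss and plain gradient descent as the update rule, since the text leaves both open):

```python
def mse_loss(first_vec, second_vec):
    """Loss value between the first and second feature vectors (MSE assumed)."""
    return sum((a - b) ** 2 for a, b in zip(first_vec, second_vec)) / len(first_vec)

def update_parameters(params, grads, lr=0.01):
    """One gradient-descent step on the model parameters (illustrative only)."""
    return [p - lr * g for p, g in zip(params, grads)]
```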
所述图像识别模块106,用于获取待识别图像,利用所述标准识别模型对所述待识别图像进行目标物识别,得到图像中目标物识别结果。The image recognition module 106 is configured to acquire an image to be recognized, and use the standard recognition model to perform object recognition on the image to be recognized to obtain a target object recognition result in the image.
In this embodiment of the present application, the image to be recognized may be an image that contains or does not contain the target object. After the image to be recognized is acquired, the standard recognition model may be used to perform target recognition on the image to be recognized, obtaining a recognition result of the target type in the image.
Through image augmentation, this embodiment of the present application expands an originally small number of samples, and training the model with both the expanded sample image set and the noise images helps improve the accuracy and robustness of the model. The model is then tested, a training set is constructed from the erroneous results of the preset type in the test results, and the model is trained again together with the sample images, which further improves the accuracy with which the model recognizes the target object. Therefore, the apparatus for recognizing a target object in an image proposed in the present application can solve the problem of low accuracy in target recognition.
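The data-set construction summarized above can be sketched as follows (the function name and the error-type tag are illustrative assumptions, not part of the embodiment):

```python
def build_training_sets(sample_set, noise_set, test_results, error_type="miss"):
    """First set: augmented samples plus same-type noise images.
    Second set: test images whose result matches the preset error type.
    Third set: the sample images themselves."""
    first_training_set = sample_set + noise_set
    second_training_set = [img for img, result in test_results
                           if result == error_type]
    third_training_set = list(sample_set)
    return first_training_set, second_training_set, third_training_set
```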
FIG. 4 is a schematic structural diagram of an electronic device implementing the method for recognizing a target object in an image according to an embodiment of the present application.
所述电子设备1可以包括处理器10、存储器11和总线,还可以包括存储在所述存储器11中并可在所述处理器10上运行的计算机程序,如图像中目标物识别程序12。The electronic device 1 may include a processor 10 , a memory 11 and a bus, and may also include a computer program stored in the memory 11 and operable on the processor 10 , such as a program 12 for object recognition in an image.
The memory 11 includes at least one type of readable storage medium, including a flash memory, a removable hard disk, a multimedia card, a card-type memory (for example, an SD or DX memory), a magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in removable hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed on the electronic device 1 and various data, such as the code of the program 12 for recognizing a target object in an image, but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit of the electronic device; it connects the components of the entire electronic device through various interfaces and lines, and executes the various functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (for example, the program for recognizing a target object in an image) and by calling the data stored in the memory 11.
The bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to implement connection and communication between the memory 11, the at least one processor 10, and other components.
FIG. 4 only shows an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 4 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to the components. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management apparatus, so that functions such as charge management, discharge management and power consumption management are implemented through the power management apparatus. The power supply may further include one or more DC or AC power sources, recharging apparatuses, power failure detection circuits, power converters or inverters, power status indicators, and any other components. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not repeated here.
Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), which is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further include a user interface. The user interface may be a display or an input unit (such as a keyboard); optionally, it may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (organic light-emitting diode) touch display, or the like. The display, which may also suitably be called a display screen or display unit, is used to display the information processed in the electronic device 1 and to display a visualized user interface.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The program 12 for recognizing a target object in an image stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run in the processor 10, can implement:
获取包含目标物的样本图像,对所述样本图像进行图像扩增,得到样本图像集;Acquiring a sample image containing the target object, performing image amplification on the sample image to obtain a sample image set;
利用与所述样本图像同类型的噪声图像,以及所述样本图像集构建第一训练集与测试集,并利用所述第一训练集对预构建的原始识别模型进行训练,得到初始识别模型;constructing a first training set and a test set by using a noise image of the same type as the sample image and the sample image set, and using the first training set to train a pre-built original recognition model to obtain an initial recognition model;
利用所述测试集对所述初始识别模型进行测试,并选取测试结果中预设类型的错误结果构建第二训练集,利用所述样本图像构建第三训练集;Using the test set to test the initial recognition model, and selecting preset types of error results in the test results to construct a second training set, and using the sample images to construct a third training set;
提取所述第二训练集中图像的第一特征向量,提取所述第三训练集中图像的第二特征 向量;Extract the first feature vector of the image in the second training set, extract the second feature vector of the image in the third training set;
计算所述第一特征向量与所述第二特征向量之间的损失值,并根据所述损失值对所述初始识别模型进行参数更新,得到标准识别模型;calculating a loss value between the first feature vector and the second feature vector, and updating parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
获取待识别图像,利用所述标准识别模型对所述待识别图像进行目标物识别,得到图像中目标物识别结果。The image to be recognized is acquired, and the standard recognition model is used to perform target recognition on the image to be recognized to obtain a target recognition result in the image.
具体地,所述处理器10对上述指令的具体实现方法可参考图1至图4对应实施例中相关步骤的描述,在此不赘述。Specifically, for the specific implementation method of the above instructions by the processor 10, reference may be made to the description of relevant steps in the embodiments corresponding to FIG. 1 to FIG. 4 , and details are not repeated here.
进一步地,所述电子设备1集成的模块/单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读存储介质中。所述计算机可读存储介质可以是易失性的,也可以是非易失性的。例如,所述计算机可读介质可以包括:能够携带所述计算机程序代码的任何实体或装置、记录介质、U盘、移动硬盘、磁碟、光盘、计算机存储器、只读存储器(ROM,Read-Only Memory)。Further, if the integrated modules/units of the electronic device 1 are realized in the form of software function units and sold or used as independent products, they can be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory).
本申请还提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序在被电子设备的处理器所执行时,可以实现:The present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor of an electronic device, it can realize:
获取包含目标物的样本图像,对所述样本图像进行图像扩增,得到样本图像集;Acquiring a sample image containing the target object, performing image amplification on the sample image to obtain a sample image set;
利用与所述样本图像同类型的噪声图像,以及所述样本图像集构建第一训练集与测试集,并利用所述第一训练集对预构建的原始识别模型进行训练,得到初始识别模型;constructing a first training set and a test set by using a noise image of the same type as the sample image and the sample image set, and using the first training set to train a pre-built original recognition model to obtain an initial recognition model;
利用所述测试集对所述初始识别模型进行测试,并选取测试结果中预设类型的错误结果构建第二训练集,利用所述样本图像构建第三训练集;Using the test set to test the initial recognition model, and selecting preset types of error results in the test results to construct a second training set, and using the sample images to construct a third training set;
提取所述第二训练集中图像的第一特征向量,提取所述第三训练集中图像的第二特征向量;extracting the first feature vector of the image in the second training set, and extracting the second feature vector of the image in the third training set;
计算所述第一特征向量与所述第二特征向量之间的损失值,并根据所述损失值对所述初始识别模型进行参数更新,得到标准识别模型;calculating a loss value between the first feature vector and the second feature vector, and updating parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
获取待识别图像,利用所述标准识别模型对所述待识别图像进行目标物识别,得到图像中目标物识别结果。The image to be recognized is acquired, and the standard recognition model is used to perform target recognition on the image to be recognized to obtain a target recognition result in the image.
在本申请所提供的几个实施例中,应该理解到,所揭露的设备,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。In the several embodiments provided in this application, it should be understood that the disclosed devices, devices and methods can be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.
所述作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
另外,在本申请各个实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能模块的形式实现。In addition, each functional module in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or in the form of hardware plus software function modules.
对于本领域技术人员而言,显然本申请不限于上述示范性实施例的细节,而且在不背离本申请的精神或基本特征的情况下,能够以其他的具体形式实现本申请。It will be apparent to those skilled in the art that the present application is not limited to the details of the exemplary embodiments described above, but that the present application can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application.
Therefore, the embodiments should be regarded as exemplary and non-restrictive in every respect, and the scope of the present application is defined by the appended claims rather than by the foregoing description; it is therefore intended that all changes falling within the meaning and range of equivalents of the claims be embraced in the present application. Any reference sign in a claim should not be construed as limiting the claim concerned.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association with one another using cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
In addition, it is obvious that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or apparatuses recited in the system claims may also be implemented by one unit or apparatus through software or hardware. Terms such as "second" are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements can be made to the technical solutions of the present application without departing from the spirit and scope of these technical solutions.

Claims (20)

  1. 一种图像中目标物识别方法,其中,所述方法包括:A method for object recognition in an image, wherein the method includes:
    获取包含目标物的样本图像,对所述样本图像进行图像扩增,得到样本图像集;Acquiring a sample image containing the target object, performing image amplification on the sample image to obtain a sample image set;
    利用与所述样本图像同类型的噪声图像,以及所述样本图像集构建第一训练集与测试集,并利用所述第一训练集对预构建的原始识别模型进行训练,得到初始识别模型;constructing a first training set and a test set by using a noise image of the same type as the sample image and the sample image set, and using the first training set to train a pre-built original recognition model to obtain an initial recognition model;
    利用所述测试集对所述初始识别模型进行测试,并选取测试结果中预设类型的错误结果构建第二训练集,利用所述样本图像构建第三训练集;Using the test set to test the initial recognition model, and selecting preset types of error results in the test results to construct a second training set, and using the sample images to construct a third training set;
    提取所述第二训练集中图像的第一特征向量,提取所述第三训练集中图像的第二特征向量;extracting the first feature vector of the image in the second training set, and extracting the second feature vector of the image in the third training set;
    计算所述第一特征向量与所述第二特征向量之间的损失值,并根据所述损失值对所述初始识别模型进行参数更新,得到标准识别模型;calculating a loss value between the first feature vector and the second feature vector, and updating parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
    获取待识别图像,利用所述标准识别模型对所述待识别图像进行目标物识别,得到图像中目标物识别结果。The image to be recognized is acquired, and the standard recognition model is used to perform target recognition on the image to be recognized to obtain a target recognition result in the image.
  2. 如权利要求1所述的图像中目标物识别方法,其中,所述对所述样本图像进行图像扩增,得到样本图像集,包括:The method for identifying objects in an image according to claim 1, wherein said performing image amplification on said sample image to obtain a sample image set comprises:
    对所述样本图像进行纹理描绘,得到所述样本图像的图像纹理;performing texture drawing on the sample image to obtain the image texture of the sample image;
    对所述图像纹理进行随机局部加深,得到纹理加深图像;performing random local deepening on the image texture to obtain a texture deepened image;
    performing random partial lightening on the image texture to obtain a texture-lightened image;
    将所述纹理加深图像与所述纹理淡化图像汇集为所述样本图像集。Collect the texture-enhanced image and the texture-lighten image into the sample image set.
  3. 如权利要求2所述的图像中目标物识别方法,其中,所述对所述图像纹理进行随机局部加深,得到纹理加深图像,包括:The method for identifying objects in an image according to claim 2, wherein said performing random local deepening on said image texture to obtain a texture-enhanced image comprises:
    统计所述图像纹理的纹理数量;Count the number of textures of the image texture;
    根据所述纹理数量选取预设比例的图像纹理为待处理纹理;Selecting an image texture with a preset ratio as the texture to be processed according to the number of textures;
    将所述待处理纹理上的像素进行像素值调整,得到纹理加深图像。Adjusting pixel values of the pixels on the texture to be processed to obtain a texture-enhanced image.
  4. 如权利要求1所述的图像中目标物识别方法,其中,所述利用所述第一训练集对预构建的原始识别模型进行训练,得到初始识别模型,包括:The method for recognizing an object in an image according to claim 1, wherein said using said first training set to train a pre-built original recognition model to obtain an initial recognition model, comprising:
    利用所述原始识别模型对所述第一训练集进行图像识别,得到识别结果;performing image recognition on the first training set by using the original recognition model to obtain a recognition result;
    计算所述识别结果与所述第一训练集中每张图像对应的真实标签的损失值;Calculate the loss value of the recognition result and the real label corresponding to each image in the first training set;
    根据所述损失值对所述原始识别模型进行参数调整,得到初始识别模型。Adjusting parameters of the original recognition model according to the loss value to obtain an initial recognition model.
  5. 如权利要求1至4中任一项所述的图像中目标物识别方法,其中,所述提取所述第二训练集中图像的第一特征向量,包括:The object recognition method in an image according to any one of claims 1 to 4, wherein said extracting the first feature vector of the image in the second training set comprises:
    利用预设的第一梯度算子对所述第二训练集中的图像进行水平卷积运算,得到水平梯度分量;performing a horizontal convolution operation on the images in the second training set by using a preset first gradient operator to obtain a horizontal gradient component;
    利用预设的第二梯度算子对所述第二训练集中的图像进行垂直卷积运算,得到垂直梯度分量;performing a vertical convolution operation on the images in the second training set by using a preset second gradient operator to obtain a vertical gradient component;
    根据所述水平梯度分量与所述垂直梯度分量计算所述第一特征向量。The first feature vector is calculated according to the horizontal gradient component and the vertical gradient component.
  6. 如权利要求5所述的图像中目标物识别方法,其中,所述利用预设的第一梯度算子对所述第二训练集中图的像进行水平卷积运算,得到水平梯度分量,包括:The method for identifying objects in an image according to claim 5, wherein the horizontal convolution operation is performed on the image in the second training set using the preset first gradient operator to obtain a horizontal gradient component, comprising:
    获取卷积步长和卷积长度;Get the convolution step size and convolution length;
    根据所述卷积步长与所述卷积长度计算水平卷积次数;calculating the number of horizontal convolutions according to the convolution step size and the convolution length;
    利用所述第一梯度算子将所述第二训练集中每张图像按照所述卷积步长进行所述卷积次数的卷积运算,得到水平梯度分量。Using the first gradient operator to perform a convolution operation on each image in the second training set according to the convolution step size for the number of convolutions to obtain a horizontal gradient component.
  7. 如权利要求5所述的图像中目标物识别方法,其中,所述根据所述水平梯度分量与所述垂直梯度分量计算所述第一特征向量,包括:The method for recognizing objects in an image according to claim 5, wherein said calculating said first feature vector according to said horizontal gradient component and said vertical gradient component comprises:
    将所述水平梯度分量进行归一化计算,得到水平归一化分量;performing normalized calculation on the horizontal gradient component to obtain a horizontal normalized component;
    将所述垂直梯度分量进行归一化计算,得到垂直归一化分量;performing normalized calculation on the vertical gradient component to obtain a vertical normalized component;
    将所述水平归一化分量与所述垂直归一化分量进行平方求和,得到所述第一特征向量。The horizontal normalized component and the vertical normalized component are squared and summed to obtain the first feature vector.
  8. 一种图像中目标物识别装置,其中,所述装置包括:A device for recognizing an object in an image, wherein the device includes:
    图像扩增模块,用于获取包含目标物的样本图像,对所述样本图像进行图像扩增,得到样本图像集;An image amplification module, configured to obtain a sample image containing a target object, perform image amplification on the sample image, and obtain a sample image set;
    第一训练模块,用于利用与所述样本图像同类型的噪声图像,以及所述样本图像集构建第一训练集与测试集,并利用所述第一训练集对预构建的原始识别模型进行训练,得到初始识别模型;The first training module is configured to use noise images of the same type as the sample image and the sample image set to construct a first training set and a test set, and use the first training set to perform pre-built original recognition models. Training to get the initial recognition model;
    模型测试模块,用于利用所述测试集对所述初始识别模型进行测试,并选取测试结果中预设类型的错误结果构建第二训练集,利用所述样本图像构建第三训练集;A model testing module, configured to use the test set to test the initial recognition model, select a preset type of error result in the test result to construct a second training set, and use the sample image to construct a third training set;
    特征提取模块,用于提取所述第二训练集中图像的第一特征向量,提取所述第三训练集中图像的第二特征向量;A feature extraction module, configured to extract a first feature vector of images in the second training set, and extract a second feature vector of images in the third training set;
    第二训练模块,用于计算所述第一特征向量与所述第二特征向量之间的损失值,并根据所述损失值对所述初始识别模型进行参数更新,得到标准识别模型;A second training module, configured to calculate a loss value between the first feature vector and the second feature vector, and update parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
    图像识别模块,用于获取待识别图像,利用所述标准识别模型对所述待识别图像进行目标物识别,得到图像中目标物识别结果。The image recognition module is used to obtain the image to be recognized, and use the standard recognition model to perform object recognition on the image to be recognized to obtain a target object recognition result in the image.
  9. 一种电子设备,其中,所述电子设备包括:An electronic device, wherein the electronic device includes:
    至少一个处理器;以及,at least one processor; and,
    与所述至少一个处理器通信连接的存储器;其中,a memory communicatively coupled to the at least one processor; wherein,
    所述存储器存储有可被所述至少一个处理器执行的指令,所述指令被所述至少一个处理器执行,以使所述至少一个处理器能够执行如下步骤:The memory stores instructions executable by the at least one processor, the instructions are executed by the at least one processor, so that the at least one processor can perform the following steps:
    获取包含目标物的样本图像,对所述样本图像进行图像扩增,得到样本图像集;Acquiring a sample image containing the target object, performing image amplification on the sample image to obtain a sample image set;
    利用与所述样本图像同类型的噪声图像,以及所述样本图像集构建第一训练集与测试集,并利用所述第一训练集对预构建的原始识别模型进行训练,得到初始识别模型;constructing a first training set and a test set using noise images of the same type as the sample image and the sample image set, and using the first training set to train a pre-built original recognition model to obtain an initial recognition model;
    利用所述测试集对所述初始识别模型进行测试,并选取测试结果中预设类型的错误结果构建第二训练集,利用所述样本图像构建第三训练集;Using the test set to test the initial recognition model, and selecting preset types of error results in the test results to construct a second training set, and using the sample images to construct a third training set;
    提取所述第二训练集中图像的第一特征向量,提取所述第三训练集中图像的第二特征向量;extracting the first feature vector of the image in the second training set, and extracting the second feature vector of the image in the third training set;
    计算所述第一特征向量与所述第二特征向量之间的损失值,并根据所述损失值对所述初始识别模型进行参数更新,得到标准识别模型;calculating a loss value between the first feature vector and the second feature vector, and updating parameters of the initial recognition model according to the loss value to obtain a standard recognition model;
    获取待识别图像,利用所述标准识别模型对所述待识别图像进行目标物识别,得到图像中目标物识别结果。The image to be recognized is acquired, and the standard recognition model is used to perform target recognition on the image to be recognized to obtain a target recognition result in the image.
  10. 如权利要求9所述的电子设备,其中,所述对所述样本图像进行图像扩增,得到样本图像集,包括:The electronic device according to claim 9, wherein said performing image amplification on said sample image to obtain a sample image set comprises:
    对所述样本图像进行纹理描绘,得到所述样本图像的图像纹理;performing texture drawing on the sample image to obtain the image texture of the sample image;
    对所述图像纹理进行随机局部加深,得到纹理加深图像;performing random local deepening on the image texture to obtain a texture deepened image;
    performing random partial lightening on the image texture to obtain a texture-lightened image;
    将所述纹理加深图像与所述纹理淡化图像汇集为所述样本图像集。Collect the texture-enhanced image and the texture-lighten image into the sample image set.
  11. 如权利要求10所述的电子设备,其中,所述对所述图像纹理进行随机局部加深,得到纹理加深图像,包括:The electronic device according to claim 10, wherein said random local deepening of said image texture to obtain a texture deepened image comprises:
    统计所述图像纹理的纹理数量;Count the number of textures of the image texture;
    根据所述纹理数量选取预设比例的图像纹理为待处理纹理;Selecting an image texture with a preset ratio as the texture to be processed according to the number of textures;
    将所述待处理纹理上的像素进行像素值调整,得到纹理加深图像。Adjusting pixel values of the pixels on the texture to be processed to obtain a texture-enhanced image.
  12. The electronic device according to claim 9, wherein training the pre-built original recognition model with the first training set to obtain the initial recognition model comprises:
    performing image recognition on the first training set by using the original recognition model to obtain recognition results;
    calculating a loss value between the recognition results and the true label corresponding to each image in the first training set; and
    adjusting parameters of the original recognition model according to the loss value to obtain the initial recognition model.
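Claim 12 does not name the loss between the recognition results and the true labels. A minimal sketch, assuming a softmax cross-entropy over class logits (one standard choice for classification-style recognition):

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy between predicted logits and true labels.

    logits: (N, C) raw scores from the recognition model.
    labels: (N,) integer class labels.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())
```

The resulting scalar would drive the parameter adjustment (e.g. a gradient step); the publication does not specify the optimizer.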
  13. The electronic device according to any one of claims 9 to 12, wherein extracting the first feature vectors of the images in the second training set comprises:
    performing a horizontal convolution operation on the images in the second training set by using a preset first gradient operator to obtain a horizontal gradient component;
    performing a vertical convolution operation on the images in the second training set by using a preset second gradient operator to obtain a vertical gradient component; and
    calculating the first feature vectors according to the horizontal gradient component and the vertical gradient component.
  14. The electronic device according to claim 13, wherein performing the horizontal convolution operation on the images in the second training set by using the preset first gradient operator to obtain the horizontal gradient component comprises:
    acquiring a convolution step size and a convolution length;
    calculating a number of horizontal convolutions according to the convolution step size and the convolution length; and
    performing, with the first gradient operator, that number of convolution operations on each image in the second training set according to the convolution step size to obtain the horizontal gradient component.
  15. The electronic device according to claim 13, wherein calculating the first feature vectors according to the horizontal gradient component and the vertical gradient component comprises:
    normalizing the horizontal gradient component to obtain a horizontal normalized component;
    normalizing the vertical gradient component to obtain a vertical normalized component; and
    summing the squares of the horizontal normalized component and the vertical normalized component to obtain the first feature vectors.
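Claims 13 to 15 together describe a gradient-based feature extraction. The following sketch uses Sobel kernels to stand in for the "preset" first and second gradient operators, and max-absolute-value normalization, neither of which the claims spell out:

```python
import numpy as np

# Sobel kernels as stand-ins for the preset gradient operators.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(image, kernel, step=1):
    """Valid 2-D convolution with a configurable step (claim 14's step size)."""
    kh, kw = kernel.shape
    h, w = image.shape
    flipped = kernel[::-1, ::-1]
    rows = range(0, h - kh + 1, step)
    cols = range(0, w - kw + 1, step)
    return np.array([[np.sum(image[r:r + kh, c:c + kw] * flipped)
                      for c in cols] for r in rows])

def first_feature_vector(image):
    """Horizontal/vertical gradients, normalization, then a squared sum
    of the two normalized components (claims 13 and 15)."""
    gx = conv2d(image, SOBEL_X)            # horizontal gradient component
    gy = conv2d(image, SOBEL_Y)            # vertical gradient component
    gx = gx / (np.abs(gx).max() or 1.0)    # normalize (one common choice)
    gy = gy / (np.abs(gy).max() or 1.0)
    return (gx ** 2 + gy ** 2).ravel()     # squared sum -> feature vector
```

The squared sum of the two normalized components is the normalized gradient magnitude (squared), which is what the claim language appears to describe.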
  16. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:
    acquiring a sample image containing a target object, and performing image augmentation on the sample image to obtain a sample image set;
    constructing a first training set and a test set from noise images of the same type as the sample image together with the sample image set, and training a pre-built original recognition model with the first training set to obtain an initial recognition model;
    testing the initial recognition model with the test set, constructing a second training set from error results of a preset type in the test results, and constructing a third training set from the sample images;
    extracting first feature vectors from the images in the second training set, and extracting second feature vectors from the images in the third training set;
    calculating a loss value between the first feature vectors and the second feature vectors, and updating parameters of the initial recognition model according to the loss value to obtain a standard recognition model; and
    acquiring an image to be recognized, and performing target object recognition on the image to be recognized by using the standard recognition model to obtain a target object recognition result for the image.
  17. The computer-readable storage medium according to claim 16, wherein performing image augmentation on the sample image to obtain a sample image set comprises:
    performing texture delineation on the sample image to obtain an image texture of the sample image;
    performing random local deepening on the image texture to obtain a texture-deepened image;
    performing random local lightening on the image texture to obtain a texture-lightened image; and
    collecting the texture-deepened image and the texture-lightened image into the sample image set.
  18. The computer-readable storage medium according to claim 17, wherein performing random local deepening on the image texture to obtain a texture-deepened image comprises:
    counting the number of textures in the image texture;
    selecting a preset proportion of the image textures as textures to be processed according to the number of textures; and
    adjusting pixel values of the pixels on the textures to be processed to obtain the texture-deepened image.
  19. The computer-readable storage medium according to claim 16, wherein training the pre-built original recognition model with the first training set to obtain the initial recognition model comprises:
    performing image recognition on the first training set by using the original recognition model to obtain recognition results;
    calculating a loss value between the recognition results and the true label corresponding to each image in the first training set; and
    adjusting parameters of the original recognition model according to the loss value to obtain the initial recognition model.
  20. The computer-readable storage medium according to any one of claims 16 to 19, wherein extracting the first feature vectors of the images in the second training set comprises:
    performing a horizontal convolution operation on the images in the second training set by using a preset first gradient operator to obtain a horizontal gradient component;
    performing a vertical convolution operation on the images in the second training set by using a preset second gradient operator to obtain a vertical gradient component; and
    calculating the first feature vectors according to the horizontal gradient component and the vertical gradient component.
PCT/CN2021/109479 2021-05-27 2021-07-30 Method and apparatus for identifying target object in image, electronic device and storage medium WO2022247005A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110581184.5 2021-05-27
CN202110581184.5A CN113283446B (en) 2021-05-27 2021-05-27 Method and device for identifying object in image, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022247005A1 true WO2022247005A1 (en) 2022-12-01

Family

ID=77281917

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109479 WO2022247005A1 (en) 2021-05-27 2021-07-30 Method and apparatus for identifying target object in image, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN113283446B (en)
WO (1) WO2022247005A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958503A (en) * 2023-09-19 2023-10-27 广东新泰隆环保集团有限公司 Image processing-based sludge drying grade identification method and system
CN117094966A (en) * 2023-08-21 2023-11-21 青岛美迪康数字工程有限公司 Tongue image identification method and device based on image amplification and computer equipment
CN117523345A (en) * 2024-01-08 2024-02-06 武汉理工大学 Target detection data balancing method and device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114549928A (en) * 2022-02-21 2022-05-27 平安科技(深圳)有限公司 Image enhancement processing method and device, computer equipment and storage medium
CN115546536A (en) * 2022-09-22 2022-12-30 南京森林警察学院 Ivory product identification method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147320A1 (en) * 2017-11-15 2019-05-16 Uber Technologies, Inc. "Matching Adversarial Networks"
CN110135514A (en) * 2019-05-22 2019-08-16 国信优易数据有限公司 A kind of workpiece classification method, device, equipment and medium
CN110796057A (en) * 2019-10-22 2020-02-14 上海交通大学 Pedestrian re-identification method and device and computer equipment
CN112101542A (en) * 2020-07-24 2020-12-18 北京沃东天骏信息技术有限公司 Training method and device of machine learning model, and face recognition method and device
CN112232384A (en) * 2020-09-27 2021-01-15 北京迈格威科技有限公司 Model training method, image feature extraction method, target detection method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7617103B2 (en) * 2006-08-25 2009-11-10 Microsoft Corporation Incrementally regulated discriminative margins in MCE training for speech recognition
US9652688B2 (en) * 2014-11-26 2017-05-16 Captricity, Inc. Analyzing content of digital images
CN111079785A (en) * 2019-11-11 2020-04-28 深圳云天励飞技术有限公司 Image identification method and device and terminal equipment
CN111639704A (en) * 2020-05-28 2020-09-08 深圳壹账通智能科技有限公司 Target identification method, device and computer readable storage medium
CN111914939B (en) * 2020-08-06 2023-07-28 平安科技(深圳)有限公司 Method, apparatus, device and computer readable storage medium for recognizing blurred image
CN112581522A (en) * 2020-11-30 2021-03-30 平安科技(深圳)有限公司 Method and device for detecting position of target object in image, electronic equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147320A1 (en) * 2017-11-15 2019-05-16 Uber Technologies, Inc. "Matching Adversarial Networks"
CN110135514A (en) * 2019-05-22 2019-08-16 国信优易数据有限公司 A kind of workpiece classification method, device, equipment and medium
CN110796057A (en) * 2019-10-22 2020-02-14 上海交通大学 Pedestrian re-identification method and device and computer equipment
CN112101542A (en) * 2020-07-24 2020-12-18 北京沃东天骏信息技术有限公司 Training method and device of machine learning model, and face recognition method and device
CN112232384A (en) * 2020-09-27 2021-01-15 北京迈格威科技有限公司 Model training method, image feature extraction method, target detection method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117094966A (en) * 2023-08-21 2023-11-21 青岛美迪康数字工程有限公司 Tongue image identification method and device based on image amplification and computer equipment
CN117094966B (en) * 2023-08-21 2024-04-05 青岛美迪康数字工程有限公司 Tongue image identification method and device based on image amplification and computer equipment
CN116958503A (en) * 2023-09-19 2023-10-27 广东新泰隆环保集团有限公司 Image processing-based sludge drying grade identification method and system
CN116958503B (en) * 2023-09-19 2024-03-12 广东新泰隆环保集团有限公司 Image processing-based sludge drying grade identification method and system
CN117523345A (en) * 2024-01-08 2024-02-06 武汉理工大学 Target detection data balancing method and device
CN117523345B (en) * 2024-01-08 2024-04-23 武汉理工大学 Target detection data balancing method and device

Also Published As

Publication number Publication date
CN113283446B (en) 2023-09-26
CN113283446A (en) 2021-08-20

Similar Documents

Publication Publication Date Title
WO2022247005A1 (en) Method and apparatus for identifying target object in image, electronic device and storage medium
CN110728209B (en) Gesture recognition method and device, electronic equipment and storage medium
CN107679466B (en) Information output method and device
WO2022105179A1 (en) Biological feature image recognition method and apparatus, and electronic device and readable storage medium
CN113705462B (en) Face recognition method, device, electronic equipment and computer readable storage medium
WO2023015935A1 (en) Method and apparatus for recommending physical examination item, device and medium
WO2019119396A1 (en) Facial expression recognition method and device
CN115374189B (en) Block chain-based food safety tracing method, device and equipment
CN112132812A (en) Certificate checking method and device, electronic equipment and medium
CN113705469A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN115690615B (en) Video stream-oriented deep learning target recognition method and system
CN113419951B (en) Artificial intelligent model optimization method and device, electronic equipment and storage medium
CN113887408B (en) Method, device, equipment and storage medium for detecting activated face video
CN113705686B (en) Image classification method, device, electronic equipment and readable storage medium
CN112233194B (en) Medical picture optimization method, device, equipment and computer readable storage medium
CN114049676A (en) Fatigue state detection method, device, equipment and storage medium
CN113505698A (en) Face recognition method, device, equipment and medium based on counterstudy
CN114240935B (en) Space-frequency domain feature fusion medical image feature identification method and device
Anggoro et al. Classification of Solo Batik patterns using deep learning convolutional neural networks algorithm
CN117593610B (en) Image recognition network training and deployment and recognition methods, devices, equipment and media
WO2022247006A1 (en) Target object cut-out method and apparatus based on multiple features, and device and storage medium
WO2023178798A1 (en) Image classification method and apparatus, and device and medium
CN116644801A (en) Medical visual question-answering method and device based on transfer learning and electronic equipment
CN117315369A (en) Fundus disease classification method and device based on neural network
CN116824011A (en) Animation generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21942574

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE