CN116863116A - Image recognition method, device, equipment and medium based on artificial intelligence

Info

Publication number: CN116863116A
Application number: CN202310798703.2A
Authority: CN
Inventors: 张倩
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2023-10-10

Abstract

The embodiments of the invention relate to the technical fields of artificial intelligence and intelligent medical treatment, and disclose an image recognition method, device, equipment and medium based on artificial intelligence. The method comprises: acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information; inputting the image information and the target tag set into a first preset model, and determining a similarity value between the image information and each tag in the target tag set to obtain a similarity set; inputting the image information into a second preset model to obtain the category number of the image to be identified, that is, the number of types of targets to be identified in it; and selecting target similarity values according to the category number to complete the identification of the image to be identified. Because the number of types of targets in each image is determined first and the number of target tags is set to match it, identification is tailored to each image to be identified, which makes the identification process more intelligent and improves its effect.

Description

Image recognition method, device, equipment and medium based on artificial intelligence

Technical Field

The invention relates to the technical field of artificial intelligence and intelligent medical treatment, in particular to an image identification method, device, equipment and medium based on artificial intelligence.

Background

With the development of artificial intelligence, picture recognition models are applied ever more widely, and more and more work that requires identifying images can be completed by artificial intelligence. For example, when the food materials contained in a picture of one or more foods (e.g., corn, rice) need to be identified, this can be done by artificial intelligence. As another example, in the field of intelligent medical treatment, when the medical devices contained in a picture of one or more medical devices (e.g., a scalpel, a tourniquet) need to be distinguished, the task of managing medical devices can be completed by artificial intelligence. As a further example, in the field of equipment maintenance, when the maintenance instruments contained in a picture of one or more maintenance instruments (e.g., a stylus, a screwdriver) need to be identified, the supervision and logistics of equipment maintenance can be completed by artificial intelligence. The current approach is to treat image recognition as a target detection process.

Current general object detection frameworks first generate candidate boxes that mark possible regions of interest (ROI), then delete and recombine the candidate boxes that contain objects so that each object is bounded by a single box, and finally extract features from the regions of interest and perform subsequent classification or regression through various neural networks. However, because features must be computed explicitly within each region, such object detection often places requirements on the resolution of the picture. Generating candidate regions is also time-consuming, so training and detection are slow and demand substantial computing resources. Moreover, generating candidate boxes and then deleting and recombining them tends to introduce a series of errors, and because ROI features characterize the image at region granularity, they may introduce some noise and loss.

Disclosure of Invention

In view of the above, the invention provides an image recognition method, device, equipment and medium based on artificial intelligence, which are used for solving the problems of complex recognition process, large calculation amount and inaccurate recognition in the prior art.

To achieve one or a part or all of the above or other objects, the present invention provides an image recognition method based on artificial intelligence, comprising: acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types;

inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm;

inputting the image information into a second preset model to obtain the number of types of targets to be identified in the images to be identified, wherein the second preset model comprises a first basic model for obtaining deep semantic information of the images to be identified and a second basic model for classifying;

selecting from the similarity set the largest similarity values, the number of selected values being equal to the category number (that is, the number of types obtained from the second preset model), taking the selected similarity values as target similarity values, taking the labels corresponding to the target similarity values as target labels, and completing the identification of the image to be identified based on the target labels.

In another aspect, the present application provides an artificial intelligence based image recognition apparatus, the apparatus comprising:

the data acquisition module is used for acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types;

the first computing module is used for inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, so as to obtain a similarity set, wherein the first preset model comprises a preset encoding algorithm, an image encoding algorithm and a similarity computing algorithm;

the second calculation module is used for inputting the image information into a second preset model to obtain the number of types of targets to be identified in the images to be identified, and the second preset model comprises a first basic model used for obtaining deep semantic information of the images to be identified and a second basic model used for classifying;

the identification module is used for selecting from the similarity set the largest similarity values, the number of selected values being equal to the category number (that is, the number of types obtained from the second preset model), taking the selected similarity values as target similarity values, taking the labels corresponding to the target similarity values as target labels, and completing the identification of the image to be identified based on the target labels.

In another aspect, the present application provides an electronic device, including: a processor, a memory, and a bus, the memory storing machine-readable instructions executable by the processor, the processor in communication with the memory via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing: acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types; inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; inputting the image information into a second preset model to obtain the number of types of targets to be identified in the images to be identified, wherein the second preset model comprises a first basic model for obtaining deep semantic information of the images to be identified and a second basic model for classifying; selecting the similarity value with the largest similarity value and the number of the similarity values being the category number from the similarity set, taking the selected similarity value as a target similarity value, taking a label corresponding to the target similarity value as a target label, and completing the identification of the image to be identified based on the target label.

In another aspect, the present application provides a computer readable storage medium having a computer program stored thereon, the computer program when executed by a processor performing: acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types; inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; inputting the image information into a second preset model to obtain the number of types of targets to be identified in the images to be identified, wherein the second preset model comprises a first basic model for obtaining deep semantic information of the images to be identified and a second basic model for classifying; selecting the similarity value with the largest similarity value and the number of the similarity values being the category number from the similarity set, taking the selected similarity value as a target similarity value, taking a label corresponding to the target similarity value as a target label, and completing the identification of the image to be identified based on the target label.

The implementation of the embodiment of the invention has the following beneficial effects:

By acquiring the image information and the source information of an image to be identified, a target image type of the image to be identified is determined based on the source information, and a target tag set of the image to be identified is determined according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types; the image information and the target tag set are input into a first preset model to determine the similarity value between the image information and each tag in the target tag set, obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; the image information is input into a second preset model to obtain the number of types of targets to be identified in the image to be identified (the category number), wherein the second preset model comprises a first basic model for obtaining deep semantic information of the image to be identified and a second basic model for classifying; the largest similarity values, equal in number to the category number, are selected from the similarity set as target similarity values, the labels corresponding to the target similarity values are taken as target labels, and the identification of the image to be identified is completed based on the target labels. Because the similarity value between the image to be identified and each preset label is obtained through encoding, complex interaction between the image and the preset labels is avoided; in addition, the number of types of targets to be identified in the image is determined, and the number of target labels is set according to that number of types.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Wherein:

FIG. 1 is an application scenario diagram of an image recognition method based on artificial intelligence provided by an embodiment of the present application;

FIG. 2 is a flow chart of an image recognition method based on artificial intelligence provided by an embodiment of the application;

FIG. 3 is a schematic structural diagram of an image recognition device based on artificial intelligence according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a computer device according to an embodiment of the present application;

FIG. 5 is a schematic diagram of another configuration of a computer device according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a storage medium according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

The image recognition method based on artificial intelligence provided by the embodiment of the invention can be applied to an application environment as shown in fig. 1, wherein a client communicates with a server through a network. The server can acquire image information and source information of an image to be identified, determine a target image type of the image to be identified based on the source information, and determine a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types; input the image information and the target tag set into a first preset model to determine the similarity value between the image information and each tag in the target tag set, obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; input the image information into a second preset model to obtain the number of types of targets to be identified in the image to be identified (the category number), wherein the second preset model comprises a first basic model for obtaining deep semantic information of the image to be identified and a second basic model for classifying; and select from the similarity set the largest similarity values, equal in number to the category number, as target similarity values, take the labels corresponding to the target similarity values as target labels, and complete the identification of the image to be identified based on the target labels. In the invention, the similarity value between the image to be identified and each preset label is obtained through encoding, so that complex interaction between the image and the preset labels is avoided; meanwhile, the number of types of targets to be identified in the image is determined, and the number of target labels is set according to that number of types. The client may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, or portable wearable device. The server may be implemented by a stand-alone server or by a server cluster formed by a plurality of servers. The present invention will be described in detail with reference to specific examples.

In order to reduce the calculation pressure of the server, the image recognition method based on artificial intelligence provided by the embodiment of the invention can also be applied to the client in fig. 1, namely, the image information and the source information of the image to be recognized are obtained, the target image type of the image to be recognized is determined based on the source information, and the target tag set of the image to be recognized is determined according to the target image type and preset recognition information, wherein the preset recognition information comprises different image types and preset tag sets corresponding to various image types; inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; inputting the image information into a second preset model to obtain the number of types of targets to be identified in the images to be identified, wherein the second preset model comprises a first basic model for obtaining deep semantic information of the images to be identified and a second basic model for classifying; selecting the similarity value with the largest similarity value and the number of the similarity values being the category number from the similarity set, taking the selected similarity value as a target similarity value, taking a label corresponding to the target similarity value as a target label, and completing the identification of the image to be identified based on the target label.

As shown in fig. 2, an embodiment of the present application provides an image recognition method based on artificial intelligence, including:

S101, acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types;

The image recognition method provided by the embodiment of the application can be applied to image recognition devices or image recognition engines in various scenes. The image recognition process is usually implemented by a server, which can exchange data with a user's client in real time. For example, after receiving image recognition request information from a client, the server obtains the image information of the image to be identified according to the request information and records the source information of the image. Identifying the image requires the labels that correspond to it, and images of different image types correspond to different labels, so the image type of the image to be identified is determined from its source information. For example, when the source information indicates the medical field, that is, when the image was captured by a camera installed in an area belonging to the medical field or retrieved from a database of the medical field, the target label set corresponding to the medical field is selected.

For example, the preset identification information is constructed according to different services and the scenes corresponding to those services: for the food material identification service, a first tag set is constructed according to the types of food materials; for the medical instrument identification service, a second tag set is constructed according to the types of medical instruments; for the maintenance tool identification service, a third tag set is constructed according to the types of maintenance tools; initial identification information is constructed from the first, second and third tag sets; and the initial identification information is associated with the scenes corresponding to the services to obtain the preset identification information.

The source information comprises, for example, identification information of the image acquisition device that captured the image to be identified or identification information of the database that stores the image to be identified. When the source information is the identification information of a database, the image acquisition device that captured the image to be identified can be determined according to the storage rule of the database.

The target image types of the image to be identified include, but are not limited to, food materials and equipment; correspondingly, the preset identification information includes a label set for food materials (the first label set) and a label set for equipment (the second label set).
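As a minimal sketch of the lookup in step S101, the preset identification information can be pictured as a mapping from image type to preset tag set. The dictionary contents and the names `PRESET_IDENTIFICATION_INFO` and `target_tag_set` are illustrative assumptions and do not come from the application; the tags reuse the examples given in the background.

```python
# Hypothetical preset identification information: image type -> preset tag set.
# The keys and tags are illustrative only.
PRESET_IDENTIFICATION_INFO = {
    "food_material": ["corn", "rice"],
    "medical_instrument": ["scalpel", "tourniquet"],
    "maintenance_tool": ["stylus", "screwdriver"],
}


def target_tag_set(target_image_type: str) -> list[str]:
    """Return a copy of the preset tag set associated with the image type."""
    if target_image_type not in PRESET_IDENTIFICATION_INFO:
        raise ValueError(f"no preset tag set for image type {target_image_type!r}")
    return list(PRESET_IDENTIFICATION_INFO[target_image_type])
```

Returning a copy keeps later steps from accidentally modifying the preset information.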

S102, inputting the image information and the target tag set into a first preset model to determine a similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm;

The image information of the image to be identified and the target tag set corresponding to the image are input into the first preset model. The first preset model represents the image information as first dimension data (an embedding) and represents each tag in the target tag set as second dimension data (an embedding), and then calculates the similarity between the first dimension data and the second dimension data. The similarity represents the degree of association between the two, that is, the degree to which the image information of the image to be identified corresponds to each tag in the target tag set.

The image information of the image to be identified is the actual picture, for example, a picture containing one or more foods or a picture containing one or more medical instruments. The preset coding algorithm may be a text coding algorithm or an image coding algorithm. The preset coding algorithm and the image coding algorithm are implemented by a preset encoder and an image encoder, respectively.

The similarity set includes the similarity value between the image information and each tag in the target tag set.
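The similarity calculation in step S102 can be sketched as follows. The application does not name the similarity measure in this passage, so cosine similarity is used here purely as an assumption; the embeddings are passed in as plain lists of floats, and the encoders that would produce them are not shown. The function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def similarity_set(image_embedding: list[float],
                   tag_embeddings: dict[str, list[float]]) -> dict[str, float]:
    """Map each tag to its similarity with the image embedding.

    Assumption: cosine similarity stands in for the similarity
    calculation algorithm of the first preset model.
    """
    return {tag: cosine_similarity(image_embedding, emb)
            for tag, emb in tag_embeddings.items()}
```

Because the image and each tag are encoded independently, the only interaction between them is this final vector comparison.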

S103, inputting the image information into a second preset model to obtain the number of types of objects to be identified in the image to be identified, wherein the second preset model comprises a first basic model for acquiring deep semantic information of the image to be identified and a second basic model for classifying;

Deep semantic information of the image to be identified is obtained through the first basic model in the second preset model, and the targets to be identified in the image are classified by the second basic model in the second preset model according to that deep semantic information, thereby obtaining the number of types of targets to be identified in the image.

Taking a picture containing food as an example, the category number represents the number of categories of food materials contained in the picture to be identified.

Shallow semantic information is the texture and color of the image content in the image to be identified, whereas the deep semantic information of the image to be identified is category information of the image content.
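The two-stage structure of the second preset model in step S103 can be sketched as a feature extractor followed by a classifier. Both stages are passed in as callables because the application does not specify the networks in this passage. The output encoding (one score per candidate count, with index i meaning i + 1 categories) and all names are assumptions made only for illustration.

```python
from typing import Callable, Sequence

# Stand-in for the first basic model: image -> deep semantic features.
FeatureExtractor = Callable[[object], Sequence[float]]
# Stand-in for the second basic model: features -> one score per candidate count.
CountClassifier = Callable[[Sequence[float]], Sequence[float]]


def category_number(image: object,
                    extract: FeatureExtractor,
                    classify: CountClassifier) -> int:
    """Run both stages and return the predicted number of categories.

    Assumption: scores[i] is the score for "i + 1 categories".
    """
    features = extract(image)
    scores = classify(features)
    if len(scores) == 0:
        raise ValueError("classifier returned no scores")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1
```

With stub stages such as `extract = lambda img: [0.2, 0.8]` and `classify = lambda f: [0.1, 0.7, 0.2]`, the highest score sits at index 1, so the function reports two categories.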

S104, selecting from the similarity set the largest similarity values, equal in number to the category number, as target similarity values, taking the labels corresponding to the target similarity values as target labels, and completing the identification of the image to be identified based on the target labels.

For example, a number of target tags equal to the number of types determined by the second preset model is selected from the similarity set.

For example, the similarity values in the similarity set may be sorted in descending order and the first values, equal in number to the category number, selected; if the category number is 4, the first 4 similarity values in the sorted set are selected. Alternatively, the target tags may be selected by repeatedly taking the maximum: the largest similarity value in the similarity set is selected as a target similarity value and then removed from the set, the largest value is selected again from the remaining set and removed in turn, and the process is repeated until the number of target similarity values equals the category number.

The similarity value of the image to be identified and the preset label is obtained through encoding, complex interaction between the image and the preset label is avoided, the number of types of objects to be identified in the image to be identified is determined, and the number of the object labels is determined according to the number of types.
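The two selection strategies described for step S104 (sorting in descending order, and repeatedly taking and removing the maximum) can be sketched as follows. The function names and the sample similarity values are illustrative assumptions.

```python
def select_by_sorting(similarities: dict[str, float],
                      category_number: int) -> list[str]:
    """Sort tags by similarity in descending order and keep the first ones."""
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:category_number]


def select_by_repeated_max(similarities: dict[str, float],
                           category_number: int) -> list[str]:
    """Repeatedly take the tag with the largest similarity and remove it."""
    remaining = dict(similarities)  # copy so the caller's set is untouched
    selected = []
    for _ in range(min(category_number, len(remaining))):
        best = max(remaining, key=remaining.get)
        selected.append(best)
        del remaining[best]
    return selected


# Illustrative similarity set; the values are made up.
sims = {"corn": 0.91, "rice": 0.85, "wheat": 0.12, "bean": 0.40}
print(select_by_sorting(sims, 2))       # ['corn', 'rice']
print(select_by_repeated_max(sims, 2))  # ['corn', 'rice']
```

Sorting costs more when the tag set is large and the category number is small, while repeated selection only scans the remaining values once per target tag.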

In a possible implementation manner, the step of determining the image type of the image to be identified based on the source information includes:

determining identification information of a target data acquisition device for acquiring the image to be identified based on the source information;

determining a data acquisition area corresponding to the target data acquisition device based on the identification information and the distribution data of the preset data acquisition device, and determining the area type of the data acquisition area according to the functional department to which the data acquisition area belongs;

and taking the region type as the image type of the image to be identified.

The image to be identified may be an image sent directly to the server by a data acquisition device, or an image extracted from the database. In the first case, the sending data acquisition device is the target data acquisition device, and its identification information is obtained directly. In the second case, attribute information of the image to be identified is obtained, and the identification information of the target data acquisition device that captured the image is determined from the attribute information.

The position of the target data acquisition device is determined based on the identification information and the distribution data of the preset data acquisition devices, and the data acquisition area corresponding to the target data acquisition device is then determined from that position. The distribution data of the preset data acquisition devices may be, for example, a distribution diagram or an installation planning diagram of the data acquisition devices.

The region type of the data acquisition region is determined according to the functional department to which the region belongs. For example, if that functional department is responsible for medical instrument management, the region type is the medical instrument type, and the image type of the image to be identified is therefore the medical instrument type; if the functional department is responsible for food material management, the region type is the food material type, and the image type of the image to be identified is therefore the food material type.
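The chain from device identification to image type can be sketched as three lookups. All identifiers, area names and department names below are hypothetical; in practice the distribution data might come from a distribution diagram or installation plan rather than a dictionary.

```python
# Hypothetical distribution data: device identifier -> data acquisition area.
DEVICE_DISTRIBUTION = {
    "cam-01": "instrument_room",
    "cam-02": "kitchen",
}

# Hypothetical assignment of areas to functional departments.
AREA_DEPARTMENT = {
    "instrument_room": "medical_instrument_management",
    "kitchen": "food_material_management",
}

# Hypothetical mapping from functional department to region type.
DEPARTMENT_REGION_TYPE = {
    "medical_instrument_management": "medical_instrument",
    "food_material_management": "food_material",
}


def image_type_from_device(device_id: str) -> str:
    """Resolve device -> area -> department -> region type (used as image type)."""
    area = DEVICE_DISTRIBUTION[device_id]
    department = AREA_DEPARTMENT[area]
    return DEPARTMENT_REGION_TYPE[department]
```

An unknown device or area raises `KeyError` in this sketch; a deployed system would need a defined fallback.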

In one possible implementation manner, the step of determining the region type of the data acquisition region according to the functional department to which the data acquisition region belongs includes:

When at least two functional departments to which the data acquisition area belongs exist, acquiring target request information, wherein the target request information is used for initiating the step of acquiring the image information and the source information of the image to be identified;

determining account information corresponding to the target request information according to the target request information and log data of an image recognition process;

determining a target function department to which the target request information belongs based on the account information;

and determining the region type of the data acquisition region according to the target functional department.

For example, when the data acquisition area belongs to at least two functional departments, such as a warehouse that stores both medical equipment and maintenance tools and therefore belongs to both the medical department and the maintenance department, the target request information that initiated the current image recognition process is acquired. When the target request information was initiated from an account of the medical department, the image type of the image to be identified captured in that data acquisition area is determined to be the medical equipment type. This avoids inaccurate identification caused by different departments interfering with one another when the same image is identified.

The log data is the procedural event record data generated by the recording system, and in this example, the log data is the record from the start of the identification of the image identification event generated by the recording system where the image identification process is located to the end of the identification.
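By way of illustration only, the department-resolution steps above can be sketched as follows. This is a minimal sketch and not the claimed implementation: the lookup tables, the log record fields `request_id` and `account`, and the function name `resolve_region_type` are all hypothetical, since the source does not specify how this information is stored.

```python
from typing import Dict, List, Optional

# Hypothetical lookup tables; the source does not say how these are stored.
DEPARTMENT_TO_REGION_TYPE: Dict[str, str] = {
    "medical": "medical equipment",
    "maintenance": "maintenance tool",
    "food": "food material",
}
ACCOUNT_TO_DEPARTMENT: Dict[str, str] = {
    "alice": "medical",
    "bob": "maintenance",
}


def resolve_region_type(
    area_departments: List[str],
    request_id: str,
    log_records: List[Dict[str, str]],
) -> Optional[str]:
    """Return the region type (used as the image type) of a data acquisition area."""
    if len(area_departments) == 1:
        # A single department owns the area, so no disambiguation is needed.
        return DEPARTMENT_TO_REGION_TYPE.get(area_departments[0])

    # Otherwise, look up in the log data the account that initiated the request.
    account = next(
        (rec["account"] for rec in log_records if rec.get("request_id") == request_id),
        None,
    )
    department = ACCOUNT_TO_DEPARTMENT.get(account) if account else None
    if department not in area_departments:
        return None  # The requester cannot be tied to a department of this area.
    return DEPARTMENT_TO_REGION_TYPE.get(department)
```

In this sketch a request that cannot be matched to a department of the area yields `None`; how such a case is handled in practice is not described in the source.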

In one possible implementation manner, before the step of inputting the image information and the target tag set into a first preset model to obtain the similarity value of each tag in the image information and the target tag set, the method further includes:

acquiring a data form of the target tag set;

and determining a preset coding algorithm in the first preset model aiming at the data form of the target tag set.

For example, the tag set constructed in practice may be in text form, image form, or the like, and the preset encoding algorithm in the first preset model is chosen to match that data form. When the tag set is in text form, a text encoding algorithm is selected as the preset encoding algorithm in the first preset model.

The applicability of the image recognition method of the present application is enhanced by selecting a preset encoding algorithm in the first preset model to adapt to the tag sets of different data forms.
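By way of illustration only, selecting the preset encoding algorithm according to the data form of the tag set can be sketched as a simple dispatch. The two encoder functions below are placeholders that stand in for real encoding models, and all names are hypothetical.

```python
from typing import Callable, Dict, List

Encoder = Callable[[str], List[float]]


def encode_text_label(label: str) -> List[float]:
    """Placeholder for a real text encoder; returns a dummy vector."""
    return [float(len(label))]


def encode_image_label(path: str) -> List[float]:
    """Placeholder for a real image encoder; returns a dummy vector."""
    return [float(len(path)), 1.0]


# Hypothetical registry mapping the data form of a tag set to its encoder.
PRESET_ENCODERS: Dict[str, Encoder] = {
    "text": encode_text_label,
    "image": encode_image_label,
}


def select_preset_encoder(data_form: str) -> Encoder:
    """Return the encoder matching the data form of the target tag set."""
    try:
        return PRESET_ENCODERS[data_form]
    except KeyError:
        raise ValueError(f"unsupported tag-set data form: {data_form!r}") from None
```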

In a possible implementation manner, the first preset model includes a preset encoding algorithm, an image encoding algorithm and a similarity calculation algorithm, and the step of inputting the image information and the target tag set into the first preset model to determine the similarity value between the image information and each tag in the target tag set, so as to obtain a similarity set, includes:

inputting the target tag set into a preset coding algorithm in the first preset model so that the preset coding algorithm carries out coding characterization on each tag in the target tag set to obtain a first low-dimensional vector;

inputting the image information into an image coding algorithm in the first preset model so that the image coding algorithm carries out coding characterization on the image information to obtain a second low-dimensional vector;

inputting the first low-dimensional vector and the second low-dimensional vector into a similarity calculation algorithm in the first preset model to calculate cosine similarity of the first low-dimensional vector and the second low-dimensional vector through the similarity calculation algorithm;

and taking the cosine similarity as a similarity value of the image information and the labels in the target label set.

The first preset model adopts a large-scale image-text pre-training model based on contrastive learning (CLIP). The text of the target tag set (labels) is encoded by the text encoder into a first low-dimensional vector;

the image information, i.e. the actual picture, is encoded by the image encoder into a second low-dimensional vector;

and the cosine similarity of the second low-dimensional vector and the first low-dimensional vector is calculated.

For example, the first low-dimensional vector A and the second low-dimensional vector B are each 512-dimensional. Their inner product is a single scalar value, and when both vectors are normalized to unit length this inner product is the cosine similarity. A three-dimensional illustration:

A = [0.3, 0.4, 0.5], B = [0.2, 0.1, 0.6]

The inner product of A and B is 0.3×0.2 + 0.4×0.1 + 0.5×0.6 = 0.4. These example vectors are not of unit length (|A| ≈ 0.707, |B| ≈ 0.640), so dividing the inner product by the product of their lengths gives a cosine similarity of approximately 0.883.
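By way of illustration only, the calculation for the example vectors can be checked with the short sketch below, which computes both the plain inner product and the cosine similarity (the inner product divided by the product of the vector lengths). The function names are illustrative and do not come from the source.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product divided by the product of the Euclidean lengths."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


A = [0.3, 0.4, 0.5]
B = [0.2, 0.1, 0.6]
inner = dot(A, B)                 # inner product of the raw example vectors
cosine = cosine_similarity(A, B)  # inner product after length normalization
```

The two quantities coincide only when both vectors already have unit length, which is why encoder outputs are typically normalized before the inner product is taken.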

The contrastive-learning-based large-scale image-text pre-training model is the open-source Chinese Taiyi multimodal model; the preset encoding algorithm is implemented by the text encoder with the model number Taiyi-326M, and the image encoding algorithm is implemented by the image encoder with the model number clip-vit-large-patch14.

In a possible implementation manner, the second preset model includes a first basic model for obtaining deep semantic information of the image to be identified and a second basic model for classification, and before the step of inputting the image information into the second preset model to obtain the number of types of objects to be identified in the image to be identified, the method further includes:

Constructing a second basic model for classification based on the full connection layer and a preset classifier;

connecting the full connection layer of the second basic model with a first basic model for acquiring deep semantic information of the image to be identified to obtain an initial model;

training the initial model according to a preset data set to obtain the second preset model.

Illustratively, two fully connected layers (MLP) are used, followed by a classifier (softmax) that implements the classification function; the fully connected layers use an activation function (ReLU), which gives the second basic model.

In the corresponding formula, W represents the weight, T represents the current round, x represents the input data, b represents the bias, and the subscripts 1 and 2 denote the first fully connected layer and the second fully connected layer.

The first basic model adopts a Vision Transformer (ViT), whose encoders can be trained in parallel and capture global information.
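By way of illustration only, a classification head built from two fully connected layers with a ReLU activation followed by softmax can be sketched in plain Python as below. The placement of ReLU between the two layers, the weight layout (one row per output unit) and all function names are assumptions, since the formula itself is not reproduced in this text.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per output unit (assumed layout)


def linear(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Fully connected layer: W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def relu(v: Vector) -> Vector:
    """Element-wise max(0, t)."""
    return [max(0.0, t) for t in v]


def softmax(v: Vector) -> Vector:
    """Normalized exponential, shifted by the maximum for numerical stability."""
    m = max(v)
    exps = [math.exp(t - m) for t in v]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(x: Vector, W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> Vector:
    """Two fully connected layers with ReLU in between, then softmax."""
    hidden = relu(linear(W1, x, b1))
    logits = linear(W2, hidden, b2)
    return softmax(logits)
```

The output is a probability distribution over the classes of the head; how those classes map to numbers of types is not specified in the source.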

In a possible implementation manner, the step of training the initial model according to a preset data set to obtain the second preset model includes:

acquiring first initial parameters of the first basic model and second initial parameters of the second basic model;

Acquiring data pairs obtained through different labeling modes according to a preset proportion, constructing a training set according to the data pairs, and training the second basic model in the initial model based on the training set to obtain second target parameters of the second basic model;

updating the second basic model according to the second target parameters, and keeping the first initial parameters of the first basic model unchanged;

and obtaining the second preset model based on the updated second basic model and the first basic model.

For example, when training the initial model for food material recognition, nearly 20000 pictures containing food are collected from the network, the tags of these pictures are labeled by Named Entity Recognition (NER) to obtain image-tag data pairs, and the number of tags of each picture is counted. In addition, 5000 further pictures are labeled manually to obtain their numbers of tags. The resulting 25000 pairs of a picture and its number of tags are used as the training set to train the initial model.
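By way of illustration only, combining data pairs from two labeling modes according to a preset proportion can be sketched as below. The interpretation of the preset proportion as the share of NER-labeled pairs in the training set (0.8 in the example of 20000 and 5000 pictures), the sampling strategy and all names are assumptions.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, int]  # (image identifier, number of tags)


def build_training_set(
    ner_pairs: List[Pair],
    manual_pairs: List[Pair],
    total: int,
    ner_share: float = 0.8,
    seed: int = 0,
) -> List[Pair]:
    """Sample pairs from both labeling modes so that ner_share of them are NER-labeled."""
    n_ner = round(total * ner_share)
    n_manual = total - n_ner
    if n_ner > len(ner_pairs) or n_manual > len(manual_pairs):
        raise ValueError("not enough data pairs for the requested proportion")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    training_set = rng.sample(ner_pairs, n_ner) + rng.sample(manual_pairs, n_manual)
    rng.shuffle(training_set)
    return training_set
```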

Illustratively, the Vision Transformer part is frozen during training, i.e., the first initial parameters of the first basic model are kept unchanged, and only the second basic model is trained and has its parameters updated.

Illustratively, after the two fully connected layers (MLP), the final score of each tag is obtained via a normalized exponential (softmax) function.
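By way of illustration only, the effect of freezing the first basic model can be sketched as a parameter update that skips any group marked as not trainable. The `ParamGroup` structure, the group names and the plain gradient-descent rule are assumptions made for this sketch; the source does not state which optimizer is used.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ParamGroup:
    values: List[float]
    trainable: bool = True


def sgd_step(params: Dict[str, ParamGroup], grads: Dict[str, List[float]], lr: float) -> None:
    """Apply one gradient-descent update, leaving frozen groups untouched."""
    for name, group in params.items():
        if not group.trainable:
            continue  # frozen: the initial parameters are kept unchanged
        group.values = [v - lr * g for v, g in zip(group.values, grads[name])]


params = {
    "first_basic_model": ParamGroup([0.5, -0.5], trainable=False),  # frozen ViT part
    "second_basic_model": ParamGroup([1.0, 2.0]),                   # trained head
}
grads = {"first_basic_model": [9.0, 9.0], "second_basic_model": [0.5, -0.5]}
sgd_step(params, grads, lr=0.1)
```

In frameworks such as PyTorch the same effect is commonly achieved by setting `requires_grad` to `False` on the frozen parameters.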

In one possible embodiment, as shown in fig. 3, the present application provides an artificial intelligence based image recognition apparatus, the apparatus comprising:

the data acquisition module 201 is configured to acquire image information and source information of an image to be identified, determine a target image type of the image to be identified based on the source information, and determine a target tag set of the image to be identified according to the target image type and preset identification information, where the preset identification information includes different image types and preset tag sets corresponding to the image types;

a first calculation module 202, configured to input the image information and the target tag set into a first preset model, to determine a similarity value of each tag in the image information and the target tag set, to obtain a similarity set, where the first preset model includes a preset encoding algorithm, an image encoding algorithm, and a similarity calculation algorithm;

the second computing module 203 is configured to input the image information into a second preset model to obtain the number of types of objects to be identified in the image to be identified, where the second preset model includes a first basic model for obtaining deep semantic information of the image to be identified and a second basic model for classifying the image to be identified;

The identifying module 204 is configured to select, from the similarity set, the largest similarity values, the number of selected values being equal to the number of types, use the selected similarity values as target similarity values, use the labels corresponding to the target similarity values as target labels, and complete identification of the image to be identified based on the target labels.
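By way of illustration only, the selection performed by the identifying module can be sketched as a top-k selection in which k is the number of types returned by the second preset model. The function name and the example labels and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def select_target_labels(similarities: Dict[str, float], num_types: int) -> List[Tuple[str, float]]:
    """Return the num_types labels with the largest similarity values, best first."""
    if num_types < 0:
        raise ValueError("num_types must be non-negative")
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:num_types]


# Hypothetical similarity set for a food material image.
similarity_set = {"tomato": 0.31, "egg": 0.27, "rice": 0.05, "beef": 0.12}
targets = select_target_labels(similarity_set, num_types=2)
```

Because the number of selected labels follows the predicted number of types, an image containing one object yields one target label and an image containing several yields several, without a fixed similarity threshold.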

In a possible implementation manner, the data acquisition module 201 is configured to:

and taking the region type as the image type of the image to be identified.

In a possible implementation manner, the first computing module 202 is configured to:

acquiring a data form of the target tag set;

In a possible implementation manner, the second computing module 203 is configured to:

obtaining the second preset model based on the updated second basic model and the first basic model.

The invention provides an image recognition device that obtains, by encoding, the similarity value between an image to be recognized and each preset label, which avoids complex interaction between the image and the preset labels; it also determines the number of types of objects to be recognized in the image to be recognized and determines the number of target labels according to that number of types.

For specific limitations of the image recognition apparatus, reference may be made to the above limitations of the image recognition method, and no further description is given here. The respective modules in the image recognition apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes non-volatile and/or volatile storage media and internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is for communicating with an external client via a network connection. The computer program, when executed by a processor, performs functions or steps of a server side of an image recognition method based on artificial intelligence.

In one embodiment, a computer device is provided, which may be a client, the internal structure of which may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is for communicating with an external server via a network connection. The computer program is executed by a processor to perform functions or steps of a client side of an artificial intelligence based image recognition method.

In one possible implementation, as shown in fig. 6, an embodiment of the present application provides an electronic device 300, including a memory 310, a processor 320 and a computer program 311 stored on the memory 310 and executable on the processor 320, the processor 320 implementing, when executing the computer program 311: acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types; inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; inputting the image information into a second preset model to obtain the number of types of targets to be identified in the image to be identified, wherein the second preset model comprises a first basic model for obtaining deep semantic information of the image to be identified and a second basic model for classifying; selecting, from the similarity set, the largest similarity values, the number of selected values being equal to the number of types, taking the selected similarity values as target similarity values, taking the labels corresponding to the target similarity values as target labels, and completing the identification of the image to be identified based on the target labels.

In one possible implementation, as shown in fig. 7, an embodiment of the present application provides a computer-readable storage medium 400 having a computer program 411 stored thereon, the computer program 411, when executed by a processor, implementing: acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types; inputting the image information and the target tag set into a first preset model to determine the similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, wherein the first preset model comprises a preset coding algorithm, an image coding algorithm and a similarity calculation algorithm; inputting the image information into a second preset model to obtain the number of types of targets to be identified in the image to be identified, wherein the second preset model comprises a first basic model for obtaining deep semantic information of the image to be identified and a second basic model for classifying; selecting, from the similarity set, the largest similarity values, the number of selected values being equal to the number of types, taking the selected similarity values as target similarity values, taking the labels corresponding to the target similarity values as target labels, and completing the identification of the image to be identified based on the target labels.

The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium may be, for example, but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

It will be appreciated by those of ordinary skill in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be centralized on a single computing device, or distributed over a network of computing devices, or they may alternatively be implemented in program code executable by a computer device, such that they are stored in a memory device and executed by the computing device, or they may be separately fabricated as individual integrated circuit modules, or multiple modules or steps within them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of numerous obvious changes, rearrangements and substitutions without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

The foregoing disclosure is illustrative of the present invention and is not to be construed as limiting the scope of the invention, which is defined by the appended claims.

Claims

1. An image recognition method based on artificial intelligence, comprising:

acquiring image information and source information of an image to be identified, determining a target image type of the image to be identified based on the source information, and determining a target tag set of the image to be identified according to the target image type and preset identification information, wherein the preset identification information comprises different image types and preset tag sets corresponding to the image types;

2. The artificial intelligence based image recognition method of claim 1, wherein the step of determining the image type of the image to be recognized based on the source information comprises:

and taking the region type as the image type of the image to be identified.

3. The image recognition method according to claim 2, wherein the step of determining the region type of the data acquisition region according to the functional department to which the data acquisition region belongs comprises:

4. The image recognition method based on artificial intelligence according to claim 1, further comprising, before the step of inputting the image information and the target tag set into a first preset model to obtain a similarity value between the image information and each tag in the target tag set:

acquiring a data form of the target tag set;

5. The image recognition method based on artificial intelligence according to claim 1, wherein the step of inputting the image information and the target tag set into a first preset model to determine a similarity value of each tag in the image information and the target tag set, and obtaining a similarity set, the first preset model including a preset encoding algorithm, an image encoding algorithm and a similarity calculating algorithm includes:

6. The image recognition method based on artificial intelligence according to claim 1, wherein before the step of inputting the image information into a second preset model to obtain the number of kinds of objects to be recognized in the image to be recognized, the second preset model includes a first basic model for obtaining deep semantic information of the image to be recognized and a second basic model for classification, the method further includes:

7. The artificial intelligence based image recognition method of claim 6, wherein the training the initial model according to a preset data set to obtain the second preset model comprises:

8. An artificial intelligence based image recognition device, the device comprising:

9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the artificial intelligence based image recognition method of any one of claims 1 to 7.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, performs the steps of the artificial intelligence based image recognition method according to any one of claims 1 to 7.