CN113110782A - Image recognition method and device, computer equipment and storage medium - Google Patents

Image recognition method and device, computer equipment and storage medium

Info

Publication number
CN113110782A
CN113110782A (application CN202110304037.3A)
Authority
CN
China
Prior art keywords
identified
determining
image
identification
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110304037.3A
Other languages
Chinese (zh)
Other versions
CN113110782B (en)
Inventor
刘俊启
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110304037.3A priority Critical patent/CN113110782B/en
Publication of CN113110782A publication Critical patent/CN113110782A/en
Application granted granted Critical
Publication of CN113110782B publication Critical patent/CN113110782B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image recognition method, an image recognition apparatus, a computer device, and a storage medium, relating to the technical field of image processing and in particular to artificial-intelligence fields such as intelligent retrieval, image recognition, and deep learning. The specific implementation scheme is as follows: in response to an acquired image selection operation, determine the target image the selection operation acts on and the trigger position of the selection operation; determine the distance between each object to be recognized in the target image and the trigger position; determine a target recognition object according to those distances; and recognize the target recognition object to generate a recognition result. In this way, the content most likely associated with the user's request can be determined from the position of the user's selection action, improving the speed and accuracy of image recognition, reducing resource consumption during recognition, improving the user's experience, making image search more accurate, and improving image search performance.

Description

Image recognition method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, in particular to artificial-intelligence fields such as intelligent retrieval, image recognition, and deep learning, and more particularly to an image recognition method and apparatus, a computer device, and a storage medium.
Background
With the rapid development of artificial intelligence, image recognition is applied in more and more scenarios, such as two-dimensional-code recognition, person recognition, and photo recognition. When a user wants to recognize a picture but the requirement is unclear, the content to be recognized carries a certain uncertainty. How to identify, in response to such an ambiguous requirement, the picture content most strongly associated with the user's behavior is therefore a problem that needs to be solved at present.
Disclosure of Invention
The disclosure provides an image identification method, an image identification device, computer equipment and a storage medium.
According to a first aspect of the present disclosure, there is provided an image recognition method, including:
responding to the acquired image selection operation, and determining a target image targeted by the selection operation and a trigger position of the selection operation;
determining the distance between each object to be identified in the target image and the trigger position;
determining a target identification object according to the distance between each object to be identified and the trigger position;
and identifying the target identification object to generate an identification result.
According to a second aspect of the present disclosure, there is provided an image recognition apparatus including:
the first determination module is used for responding to the acquired image selection operation and determining a target image aimed at by the selection operation and a trigger position of the selection operation;
the second determining module is used for determining the distance between each object to be identified in the target image and the trigger position;
the third determining module is used for determining a target identification object according to the distance between each object to be identified and the trigger position;
and the first generation module is used for identifying the target identification object so as to generate an identification result.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of image recognition as described in an embodiment of the above aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program for causing a computer to execute the method of recognizing an image according to the embodiment of the above aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of image recognition as described in an embodiment of the above aspect.
The image recognition method, apparatus, computer device, and storage medium of the present disclosure have the following beneficial effects:
the device firstly responds to the acquired image selection operation, determines a target image aimed at by the selection operation and a trigger position of the selection operation, then determines the distance between each object to be identified in the target image and the trigger position, then determines a target identification object according to the distance between each object to be identified and the trigger position, and finally identifies the target identification object to generate an identification result. Therefore, the content possibly associated with the user request can be determined through the position information of the behavior selected by the user, the speed and the accuracy of image recognition can be improved, the resource consumption in the image recognition process is reduced, the experience of the user in the image recognition process is improved, the image search is more accurate, and the image search performance is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flowchart of an image recognition method according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of another image recognition method according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of another image recognition method according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In order to facilitate understanding of the present disclosure, the following description is first briefly made to the technical field to which the present disclosure relates.
Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), covering technologies at both the hardware and software levels. Artificial-intelligence hardware technologies generally include sensors, dedicated artificial-intelligence chips, cloud computing, distributed storage, big-data processing, and the like; artificial-intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning, deep learning, big-data processing, and knowledge graph technologies.
Intelligent retrieval takes both the relevance and the importance of results into account. Relevance uses a weighted mix of indexes over all fields, making relevance analysis more accurate; importance refers to evaluating document quality through source-authority analysis, citation-relation analysis, and the like, making result ranking more accurate. The documents most relevant to what the user wants can thus be ranked at the top, improving retrieval efficiency.
Deep learning learns the intrinsic laws and representation levels of sample data; the information obtained in the learning process is of great help in interpreting data such as text, images, and sound. Its ultimate goal is to give machines an analytical and learning ability like that of humans, able to recognize data such as text, images, and sound. Deep learning is a complex machine-learning algorithm whose results in speech and image recognition far exceed those of earlier related techniques.
Image recognition refers to the technique of using computers to process, analyze, and understand images in order to recognize targets and objects in various patterns; it is a practical application of deep-learning algorithms. Current image recognition technology is generally divided into face recognition and commodity recognition: face recognition is mainly applied to security inspection, identity verification, and mobile payment, while commodity recognition is mainly applied to the commodity-circulation process, in particular to unmanned-retail settings such as unmanned shelves and intelligent retail cabinets.
In general, in an unspecified scenario where the user's needs are unclear, the content of image recognition carries a certain uncertainty. The related art typically recognizes all content in an image, which directly reduces overall performance and degrades the user's overall experience. To solve this problem, an embodiment of the present disclosure provides an image recognition method that obtains a recognition object and a recognition result more strongly associated with the user's recognition behavior, enabling accurate image search and improving user experience.
The image recognition method in the present disclosure may be executed by the image recognition apparatus provided in the present disclosure, or by an electronic device provided in the present disclosure, where the electronic device may be a server, or a device such as a desktop computer, a notebook computer, a smartphone, or a wearable device; the present disclosure does not limit this. The image recognition method provided by the present disclosure is described in detail below, taking as an example that it is executed by the image recognition apparatus provided by the present disclosure, hereinafter simply referred to as "the apparatus".
The identification method, apparatus, computer device, and storage medium of the image of the present disclosure are described with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 1, the image recognition method may include the steps of:
step 101, responding to the acquired image selection operation, determining a target image targeted by the selection operation and a trigger position of the selection operation.
The selection operation refers to an operation of selecting an image when a user determines a target image. The selection operation may be an operation of any selectable target image, for example, a click operation, a long-press operation, and the like, which is not limited in this disclosure.
For example, when a user browses a page, the user may click or long-press a certain position in a certain image in the page to select the image, and then when the device monitors a selection operation of the user, the image may be used as a target image. Alternatively, the user may perform a selection operation on a locally stored image, for example, a long-press operation, so that the apparatus may determine the image as the target image in response to the selection operation.
It will be appreciated that the apparatus may take the position of the user's click or long press on the picture as the trigger position, and perform further actions based on it.
And 102, determining the distance between each object to be identified in the target image and the trigger position.
It should be noted that there may be one or more objects to be recognized in the target image, and if there is only one object to be recognized, it may be directly determined that the object to be recognized is the target recognition object; if there are a plurality of objects to be identified, the distances between the plurality of objects to be identified and the trigger positions may be the same or different.
It will be appreciated that the trigger position may correspond to a certain coordinate in the target image, which the apparatus may have as a reference coordinate. Each object to be recognized may also correspond to a coordinate of the target image, where the coordinate may be a geometric coordinate center or a physical coordinate center of the object to be recognized, which is not limited by the present disclosure.
Furthermore, the device can determine the distance between each object to be recognized and the trigger position in the target image by calculating the distance between the reference coordinate, namely the coordinate of the trigger position, in the target image and the coordinate of each object to be recognized.
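The distance computation described above can be sketched as follows. This is a minimal illustration only; the function name, the dictionary-based interface, and the choice of Euclidean distance between the trigger coordinate and each object's center coordinate are assumptions consistent with the description, not details fixed by the disclosure.

```python
import math

def distances_to_trigger(trigger, object_centers):
    """Compute the Euclidean distance from the trigger position (the
    reference coordinate) to the center coordinate of each object to
    be recognized in the target image.

    trigger: (x, y) coordinate of the user's click or long press.
    object_centers: mapping of object id -> (x, y) center coordinate.
    Returns a mapping of object id -> distance.
    """
    tx, ty = trigger
    return {
        obj_id: math.hypot(cx - tx, cy - ty)
        for obj_id, (cx, cy) in object_centers.items()
    }
```

The object with the smallest distance can then be obtained with `min(distances, key=distances.get)`.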
And 103, determining the target recognition object according to the distance between each object to be recognized and the trigger position.
It can be understood that, when the user performs a selection operation on the image, the trigger position the user selects can be read as pointing at the target recognition object, that is, the object the user intends to recognize. Therefore, to resolve the target object the user may be pointing at, the apparatus in the present disclosure may determine the target recognition object according to the distance between each object to be recognized in the target image and the trigger position.
Optionally, if the distances between the multiple objects to be recognized in the target image and the trigger position differ, the apparatus may select the object to be recognized closest to the trigger position as the target recognition object.
Optionally, if the distance between at least two objects to be identified and the trigger position is the same and smaller than the distance between each of the remaining objects to be identified and the trigger position, the device may determine both of the at least two objects to be identified as the target object. Or, the device may further determine the candidate type of the target object according to the history image identification information by acquiring the history image identification information, so as to determine the object to be identified, which is matched with the candidate type, of the at least two objects to be identified as the target identification object.
The historical image-recognition information may be past image-recognition information stored in the apparatus, and may include, for each recognition, information such as the type of the recognized object and the recognition frequency of each type; the present disclosure does not limit this.
The candidate type may be an identification object that appears in the history image identification information with a high frequency, or may also be a type of an object identified by a previous image identification operation, and the like, which is not limited in the present disclosure.
For example, suppose a target image contains four objects to be recognized: a star photo, a commodity picture, a two-dimensional code, and a text signature, and the star photo and the commodity picture are the same distance from the trigger position, a distance smaller than that of the other two objects — that is, these two objects are closest to the trigger position. The apparatus may then determine the candidate type from the current user's historical image-recognition information. For example, if the current user recognizes person pictures with a probability of 64% and commodities with a probability of 20%, person pictures, having the higher probability, can serve as the candidate type of the target object. Since the star photo is the person picture among the two equally closest objects, the apparatus may take the star photo as the target recognition object.
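The history-based tie-breaking rule in the example above might be sketched as follows. The function signature and the use of raw per-type recognition counts are illustrative assumptions, not details taken from the disclosure.

```python
def pick_by_history(tied_objects, history_counts):
    """Break a distance tie using historical image-recognition information.

    tied_objects: mapping of object id -> type label, for the objects
        that are equally close to the trigger position.
    history_counts: mapping of type label -> how often the current user
        has recognized that type in the past.
    Returns the tied object whose type the user recognizes most often,
    i.e. the object matching the candidate type.
    """
    return max(
        tied_objects,
        key=lambda obj_id: history_counts.get(tied_objects[obj_id], 0),
    )
```

With the example from the description — a star photo (a person picture, history weight 64) tied with a commodity picture (weight 20) — the star photo would be selected.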
Alternatively, in an embodiment of the present disclosure, after determining the target image, the apparatus may first preprocess it to determine the type of each object to be recognized it contains, the recognition frame corresponding to each object, and so on. If the distances between at least two objects to be recognized and the trigger position are the same and smaller than those of the remaining objects, the apparatus may determine the sizes of the recognition frames corresponding to those objects and take the object with the larger recognition frame as the target recognition object.
For example, if a target image contains three objects to be recognized — a star photo, a star signature, and a brand mark — and the star photo and the star signature are the same distance from the trigger position, a distance smaller than that of the brand mark, the apparatus may determine the target recognition object according to the sizes of the recognition frames generated for the star photo and the star signature. If the recognition frame of the star photo is larger than that of the signature, the apparatus may determine the star photo to be the target recognition object.
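The frame-size tie-breaking rule might be sketched as follows; the axis-aligned `(x1, y1, x2, y2)` box representation and the function name are assumptions for illustration only.

```python
def pick_by_frame_area(tied_boxes):
    """Break a distance tie using recognition-frame size.

    tied_boxes: mapping of object id -> (x1, y1, x2, y2) recognition
        frame, for the objects equally close to the trigger position.
    Returns the object whose recognition frame has the largest area.
    """
    def area(box):
        x1, y1, x2, y2 = box
        return (x2 - x1) * (y2 - y1)

    return max(tied_boxes, key=lambda obj_id: area(tied_boxes[obj_id]))
```

In the example above, a 100x100 star-photo frame would win over a 40x20 signature frame.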
In the embodiment of the present disclosure, there may be many determination methods for a target recognition object, and the above embodiments are several reference implementation manners that can achieve the desired effect of the embodiment of the present disclosure, and the present disclosure does not limit this.
And 104, identifying the target identification object to generate an identification result.
Specifically, the device may identify the target identification object after determining the target identification object to generate an identification result corresponding to the target identification object.
The recognition results may be the same or different for different target recognition objects. For example, if the target recognition object is a person photo, the recognition result may be the person's name, age, other pictures containing the person, or pictures containing similar-looking persons; the present disclosure is not limited to this. Or, if the target recognition object is the two-dimensional code of a commodity, the recognition result may be the commodity's purchase link, flagship-store information, commodity information, and the like; the present disclosure does not limit this.
The recognition result obtained by recognizing the target recognition object may be multiple pieces of search information associated with its content. This improves the richness of image search, mines more of the user's potential needs, increases the number of times the target object's content is displayed, and raises the commercial value of the image search-and-recognition service.
In the image recognition method provided by the embodiment of the present disclosure, the apparatus first responds to an acquired image selection operation by determining the target image the operation acts on and the trigger position of the operation, then determines the distance between each object to be recognized in the target image and the trigger position, then determines a target recognition object according to those distances, and finally recognizes the target recognition object to generate a recognition result. In this way, the content most likely associated with the user's request can be determined from the position of the user's selection action, improving the speed and accuracy of image recognition, reducing resource consumption during recognition, improving the user's experience, making image search more accurate, and improving image search performance.
Through the analysis, the image recognition device can firstly screen each object to be recognized according to the distance between each object to be recognized and the selected position to determine the target recognition object, and then only recognize the target recognition object. In this disclosure, in a possible implementation scenario, each object to be recognized may also be recognized to further enrich the result of image recognition, which is described in detail below with reference to fig. 2.
Fig. 2 is a schematic flow chart of an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 2, the image recognition method may include the steps of:
step 201, in response to an acquired image selection operation, determining a target image targeted by the selection operation and a trigger position of the selection operation.
Step 202, determining the distance between each object to be identified in the target image and the trigger position.
Step 203, determining the priority of each object to be identified according to the distance between each object to be identified and the trigger position.
Specifically, the device may determine the priority of each object to be identified according to the distance between the object to be identified and the trigger position. For example, if a distance between an object to be identified and the trigger position is short, the priority of the object to be identified may be considered to be high. Alternatively, if the distances between two or more objects to be identified and the trigger position are the same, the priorities of the objects to be identified may be the same or different. For example, if the distances between the at least two objects to be recognized and the trigger position are the same, the device may determine the priorities of the at least two objects to be recognized according to the recognition frames or the historical image recognition information of the at least two objects to be recognized, which may specifically refer to step 103 in the foregoing embodiment, which is not limited by the present disclosure.
For example, in a certain target image, the object to be recognized A, B, C, D is included, and the distances between the object to be recognized A, B, C, D and the trigger position are a, b, c, d, respectively. If a < B < C < D, then the priority order of the object to be identified A, B, C, D may be A > B > C > D.
It can be understood that the priority may be a weight occupied by any object to be identified in all objects to be identified, and if the weight of the object to be identified is higher, it may be considered that the user has a higher identification requirement for the object to be identified.
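The distance-based prioritization of steps 202-203, including the weight interpretation just mentioned, can be sketched as follows. The function names and the inverse-distance weight normalization are illustrative assumptions; the disclosure only requires that a shorter distance imply a higher priority.

```python
def rank_by_distance(distances):
    """Order objects by ascending distance to the trigger position,
    i.e. by descending priority (closest object first)."""
    return sorted(distances, key=distances.get)

def priority_weights(distances):
    """Assign each object a weight inversely proportional to its
    distance; weights sum to 1 over all objects to be recognized.
    A small epsilon guards against division by zero when the trigger
    position falls exactly on an object's center."""
    eps = 1e-9
    inv = {k: 1.0 / (d + eps) for k, d in distances.items()}
    total = sum(inv.values())
    return {k: v / total for k, v in inv.items()}
```

With the example above (distances a < b < c < d for objects A, B, C, D), the ranking is A, B, C, D and A receives the largest weight.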
And 204, respectively identifying each object to be identified to obtain an identification result corresponding to each object to be identified.
Specifically, after determining the objects to be recognized, the device may recognize each object to be recognized to obtain a recognition result corresponding to each object to be recognized, where the recognition result may be a plurality of pieces of search information associated with the content of the target recognition object. For example, if the target recognition object is a person photograph, the recognition result of the person photograph may be the name, age, other pictures including the person, or other pictures including persons similar to the person, and the disclosure is not limited thereto.
And step 205, displaying the identification result corresponding to each object to be identified according to the priority of each object to be identified.
When displaying the recognition results corresponding to the objects to be recognized, the apparatus may order the display according to each object's priority — for example, the result of a high-priority object may be shown in greater number and/or in a more forward display position. In this way, the recognition objects carrying the greatest search weight for the user are displayed first, making selection easier and improving the user experience.
The apparatus in the embodiment of the present disclosure first responds to an acquired image selection operation by determining the target image the operation acts on and its trigger position, then determines the distance between each object to be recognized in the target image and the trigger position, and further determines the priority of each object according to that distance. Finally, it displays the recognition result corresponding to each object according to its priority. Displaying results by priority lets the objects the user most likely intends to recognize be shown first, enriching the recognition results, meeting varied user needs, and improving the user experience.
Fig. 3 is a flowchart illustrating an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 3, the image recognition method may include the steps of:
step 301, in response to the acquired image selection operation, determining a target image targeted by the selection operation and a trigger position of the selection operation.
Step 302, determining each object to be recognized and the type of the object to be recognized in the target image.
It should be noted that there may be one or more objects to be recognized in the target image acquired by the apparatus. If there are a plurality of objects to be identified, the types of the objects to be identified may be the same or different.
In the embodiment of the present disclosure, after determining the target image, the apparatus may perform preprocessing on the target image to determine the number of each object to be recognized and the type of the object to be recognized, such as a two-dimensional code, a person, a text, and the like, which are included in the target image, and the present disclosure does not limit this.
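The preprocessing step, which determines how many objects of each type the target image contains, might be summarized as below. The detector itself is left unspecified in the text, so the detection records here are stand-in data, not output of any particular model:

```python
from collections import Counter

def summarize_detections(detections):
    """Tally how many objects of each type the preprocessing step found.
    `detections` stands in for the output of whatever detector the
    device runs (object detector, QR-code scanner, text detector, etc.)."""
    return Counter(d["type"] for d in detections)

detections = [
    {"type": "qr_code", "box": (0, 0, 40, 40)},
    {"type": "person",  "box": (50, 10, 120, 200)},
    {"type": "text",    "box": (10, 210, 300, 240)},
    {"type": "person",  "box": (130, 15, 190, 195)},
]
print(summarize_detections(detections))
# Counter({'person': 2, 'qr_code': 1, 'text': 1})
```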
Step 303, determining a candidate identification mode corresponding to each object to be identified according to the type of each object to be identified.
Specifically, for different types of objects to be recognized, the device in the embodiment of the present disclosure may provide one or more corresponding candidate recognition modes, so that the user can choose a next action, thereby enriching the interaction between the device and the user.
For example, if the object to be identified is a commodity, the candidate identification modes corresponding to the commodity may include identifying the brand of the commodity, identifying the region to which the commodity belongs, identifying the price of the commodity, or finding a purchase link for the commodity, which is not limited in the present disclosure. Alternatively, if the object to be recognized is a photograph of a person, the corresponding candidate recognition modes may include recognizing the person's age, recognizing the person's name, finding images similar to the person, and the like, which is likewise not limited in the present disclosure.
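A minimal sketch of the type-to-modes lookup: the mode strings echo the examples in the text, but the concrete mapping and the fallback for unknown types are illustrative assumptions:

```python
# Hypothetical mapping from object type to candidate recognition modes.
CANDIDATE_MODES = {
    "commodity": ["identify brand", "identify region",
                  "identify price", "find purchase link"],
    "person":    ["estimate age", "identify name", "find similar images"],
}

def candidate_modes(obj_type):
    """Return the candidate recognition modes for an object type,
    falling back to a generic mode for types with no specific entry."""
    return CANDIDATE_MODES.get(obj_type, ["generic recognition"])

print(candidate_modes("person"))
# ['estimate age', 'identify name', 'find similar images']
```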
Step 304, displaying the candidate recognition modes corresponding to each object to be recognized in a distinguishing manner according to the distance between each object to be recognized and the trigger position.
The device may differentiate the display of the candidate recognition modes in many ways, such as by font, background color, or position, which is not limited in this disclosure.
It can be understood that, if the distance between a certain object to be recognized and the trigger position is short, the candidate recognition manner corresponding to the object to be recognized may be highlighted, for example, a font of the candidate recognition manner corresponding to the object to be recognized is enlarged, a background color of the candidate recognition manner corresponding to the object to be recognized is highlighted, and the disclosure does not limit this.
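The distinguishing display could be sketched as below. The specific style values (font sizes, colors) are illustrative assumptions; the patent only requires that the display be differentiated according to distance:

```python
def style_candidates(objects_with_distance):
    """Assign display styles so that the candidate modes of the object
    nearest the trigger position are emphasized (larger font,
    highlighted background), while the rest use a plain style."""
    nearest = min(objects_with_distance, key=lambda o: o["distance"])
    styled = []
    for obj in objects_with_distance:
        emphasized = obj is nearest
        styled.append({
            "object": obj["name"],
            "font_size": 18 if emphasized else 14,
            "background": "#fff3b0" if emphasized else "#ffffff",
        })
    return styled

styled = style_candidates([
    {"name": "commodity", "distance": 12.0},
    {"name": "person", "distance": 80.5},
])
print(styled[0])
# {'object': 'commodity', 'font_size': 18, 'background': '#fff3b0'}
```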
Step 305, in response to the fact that any candidate recognition mode corresponding to any object to be recognized is selected, recognizing any object to be recognized based on any candidate recognition mode to generate a recognition result.
Specifically, after the device displays the candidate identification modes corresponding to each object to be identified in a distinguishing manner, it can monitor the display interface. When it detects that the user clicks on, or otherwise triggers, a candidate identification mode of any object to be identified on the display interface, it can determine that the user has selected that candidate identification mode. The device can then identify the object to be identified according to the selected candidate identification mode to generate its identification result.
For example, if the user selects the candidate identification mode "identify the region to which the commodity belongs" for the object to be identified "commodity", the device may identify the commodity to obtain the region to which it belongs.
It should be noted that the process of generating a recognition result may differ across candidate recognition modes and objects to be recognized. For example, to recognize an object of the category "commodity" and determine information such as the region to which it belongs or a purchase link, the device must first recognize the object to determine its basic information, such as the commodity name and brand, and then search based on that basic information to generate the recognition result. The present disclosure does not limit the specific process or manner of generating the recognition result corresponding to a recognition object.
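The two-stage "recognize basic information, then search" flow for commodities can be sketched as follows. The helper functions `basic_info_of` and `search`, and the data they return, are hypothetical stand-ins for the device's recognizer and search backend:

```python
def basic_info_of(obj):
    """Hypothetical first-stage recognizer: returns basic commodity info."""
    return {"name": obj["name"], "brand": "BrandX"}

def search(basic_info, mode):
    """Hypothetical second-stage lookup keyed on the recognized basic info."""
    fake_index = {("BrandX", "identify region"): "Region A"}
    return fake_index.get((basic_info["brand"], mode), "not found")

def recognize(obj, mode):
    """Recognize basic info first; run the search stage only for modes
    that require an external lookup."""
    basic = basic_info_of(obj)
    if mode in ("identify region", "find purchase link"):
        return search(basic, mode)
    return basic

print(recognize({"name": "teapot"}, "identify region"))  # Region A
```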
In the image identification method provided by the embodiment of the disclosure, the device first determines, in response to the acquired image selection operation, the target image targeted by the selection operation and the trigger position of the selection operation. It then determines each object to be identified in the target image and its type, and determines the candidate identification modes corresponding to each object to be identified according to that type. Finally, in response to any candidate identification mode of any object to be identified being selected, the device identifies that object based on the selected mode and generates an identification result. Because the candidate identification modes are displayed in a distinguishing manner according to the distance between each object to be identified and the trigger position, the objects the user most likely intends to identify can be identified preferentially, no object in the target image is missed, and the diverse needs of users are satisfied.
In order to implement the above embodiments, the present disclosure also provides an image recognition apparatus.
Fig. 4 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present disclosure.
As shown in fig. 4, the image recognition apparatus 400 includes: a first determination module 410, a second determination module 420, a third determination module 430, and a first generation module 440.
A first determining module 410, configured to determine, in response to an acquired image selection operation, a target image targeted by the selection operation and a trigger position of the selection operation;
a second determining module 420, configured to determine a distance between each object to be identified in the target image and the trigger position;
a third determining module 430, configured to determine a target identification object according to a distance between each object to be identified and the trigger position;
the first generating module 440 is configured to identify the target identification object to generate an identification result.
As a possible implementation manner, the third determining module is specifically configured to:
responding to the fact that the distance between at least two objects to be identified and the trigger position is the same and smaller than the distance between each of the rest objects to be identified and the trigger position, and obtaining historical image identification information;
determining the candidate type of the target object according to the historical image identification information;
and determining the object to be recognized which is matched with the candidate type in the at least two objects to be recognized as the target recognition object.
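The history-based tie-break can be sketched as below. Treating the candidate type as the most frequent type in the historical image identification information is an illustrative reading; the patent does not fix how the candidate type is derived from the history:

```python
from collections import Counter

def pick_by_history(tied_objects, history_types):
    """Among objects equally close to the trigger position, prefer the
    one whose type the user has recognized most often before."""
    candidate_type = Counter(history_types).most_common(1)[0][0]
    for obj in tied_objects:
        if obj["type"] == candidate_type:
            return obj
    return tied_objects[0]  # fall back if no type matches the history

tied = [{"type": "person"}, {"type": "commodity"}]
history = ["commodity", "commodity", "text"]
print(pick_by_history(tied, history)["type"])  # commodity
```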
As a possible implementation manner, the third determining module is specifically configured to:
determining the sizes of the identification frames corresponding to the at least two objects to be identified respectively in response to the fact that the distances between the at least two objects to be identified and the trigger positions are the same and smaller than the distances between the rest objects to be identified and the trigger positions;
and determining the object to be recognized corresponding to the largest recognition frame as the target recognition object.
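A sketch of the frame-size tie-break, under the assumption that "determining the sizes of the identification frames" means selecting the object whose frame (bounding box) has the largest area:

```python
def pick_by_frame_size(tied_objects):
    """Among objects equally close to the trigger position, choose the
    one whose recognition frame covers the largest area."""
    def area(obj):
        x1, y1, x2, y2 = obj["box"]
        return (x2 - x1) * (y2 - y1)
    return max(tied_objects, key=area)

tied = [
    {"name": "cup",  "box": (0, 0, 30, 30)},     # area 900
    {"name": "book", "box": (40, 0, 140, 80)},   # area 8000
]
print(pick_by_frame_size(tied)["name"])  # book
```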
As a possible implementation manner, the apparatus further includes:
the fourth determining module is used for determining the priority of each object to be identified according to the distance between each object to be identified and the triggering position;
the acquisition module is used for respectively identifying the objects to be identified so as to acquire an identification result corresponding to each object to be identified;
and the display module is used for displaying the identification result corresponding to each object to be identified according to the priority of each object to be identified.
As a possible implementation manner, the apparatus further includes:
a fifth determining module, configured to determine, according to the type of each object to be identified, a candidate identification manner corresponding to each object to be identified;
the display module is used for displaying the candidate identification modes corresponding to the objects to be identified in a distinguishing manner according to the distance between the objects to be identified and the trigger position;
and the second generation module is used for responding to the fact that any candidate identification mode corresponding to any object to be identified is selected, identifying any object to be identified based on any candidate identification mode, and generating an identification result.
The image recognition device provided by the embodiment of the disclosure firstly responds to the acquired image selection operation, determines a target image targeted by the selection operation and a trigger position of the selection operation, then determines a distance between each object to be recognized in the target image and the trigger position, then determines a target recognition object according to the distance between each object to be recognized and the trigger position, and finally recognizes the target recognition object to generate a recognition result. Therefore, the content possibly associated with the user request can be determined through the position information of the behavior selected by the user, the speed and the accuracy of image recognition can be improved, the resource consumption in the image recognition process is reduced, the experience of the user in the image recognition process is improved, the image search is more accurate, and the image search performance is improved.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 5 illustrates a schematic block diagram of an example electronic device 500 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the device 500 includes a computing unit 501, which may perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or loaded from a storage unit 508 into a Random Access Memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 501 performs the methods and processes described above, such as the image recognition method. For example, in some embodiments, the image recognition method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the image recognition method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the image recognition method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special- or general-purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server (also called a cloud computing server or cloud host), a host product in a cloud computing service system that addresses the drawbacks of difficult management and weak service scalability found in traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (13)

1. An image recognition method, comprising:
responding to the acquired image selection operation, and determining a target image targeted by the selection operation and a trigger position of the selection operation;
determining the distance between each object to be identified in the target image and the trigger position;
determining a target identification object according to the distance between each object to be identified and the trigger position;
and identifying the target identification object to generate an identification result.
2. The method of claim 1, wherein the determining a target recognition object according to the distance between each object to be recognized and the trigger position comprises:
responding to the fact that the distance between at least two objects to be identified and the trigger position is the same and smaller than the distance between each of the rest objects to be identified and the trigger position, and obtaining historical image identification information;
determining the candidate type of the target object according to the historical image identification information;
and determining the object to be recognized which is matched with the candidate type in the at least two objects to be recognized as the target recognition object.
3. The method of claim 1, wherein the determining a target recognition object according to the distance between each object to be recognized and the trigger position comprises:
determining the sizes of the identification frames corresponding to the at least two objects to be identified respectively in response to the fact that the distances between the at least two objects to be identified and the trigger positions are the same and smaller than the distances between the rest objects to be identified and the trigger positions;
and determining the object to be recognized corresponding to the largest recognition frame as the target recognition object.
4. The method of claim 1, wherein after the determining the distance between each object to be recognized in the target image and the trigger position, further comprising:
determining the priority of each object to be identified according to the distance between each object to be identified and the trigger position;
respectively identifying the objects to be identified to obtain an identification result corresponding to each object to be identified;
and displaying the identification result corresponding to each object to be identified according to the priority of each object to be identified.
5. The method according to any one of claims 1-4, wherein after said determining the distance between each object to be identified in the target image and the trigger position, further comprising:
determining a candidate identification mode corresponding to each object to be identified according to the type of each object to be identified;
according to the distance between each object to be recognized and the trigger position, displaying the candidate recognition modes corresponding to each object to be recognized in a distinguishing manner;
in response to the fact that any candidate identification mode corresponding to any object to be identified is selected, any object to be identified is identified based on any candidate identification mode, and an identification result is generated.
6. An apparatus for recognizing an image, comprising:
the first determination module is used for responding to the acquired image selection operation and determining a target image aimed at by the selection operation and a trigger position of the selection operation;
the second determining module is used for determining the distance between each object to be identified in the target image and the trigger position;
the third determining module is used for determining a target identification object according to the distance between each object to be identified and the trigger position;
and the first generation module is used for identifying the target identification object so as to generate an identification result.
7. The apparatus of claim 6, wherein the third determining module is specifically configured to:
responding to the fact that the distance between at least two objects to be identified and the trigger position is the same and smaller than the distance between each of the rest objects to be identified and the trigger position, and obtaining historical image identification information;
determining the candidate type of the target object according to the historical image identification information;
and determining the object to be recognized which is matched with the candidate type in the at least two objects to be recognized as the target recognition object.
8. The apparatus of claim 6, wherein the third determining module is specifically configured to:
determining the sizes of the identification frames corresponding to the at least two objects to be identified respectively in response to the fact that the distances between the at least two objects to be identified and the trigger positions are the same and smaller than the distances between the rest objects to be identified and the trigger positions;
and determining the object to be recognized corresponding to the largest recognition frame as the target recognition object.
9. The apparatus of claim 6, further comprising:
the fourth determining module is used for determining the priority of each object to be identified according to the distance between each object to be identified and the triggering position;
the acquisition module is used for respectively identifying the objects to be identified so as to acquire an identification result corresponding to each object to be identified;
and the display module is used for displaying the identification result corresponding to each object to be identified according to the priority of each object to be identified.
10. The apparatus of any of claims 6-9, further comprising:
a fifth determining module, configured to determine, according to the type of each object to be identified, a candidate identification manner corresponding to each object to be identified;
the display module is used for displaying the candidate identification modes corresponding to the objects to be identified in a distinguishing manner according to the distance between the objects to be identified and the trigger position;
and the second generation module is used for responding to the fact that any candidate identification mode corresponding to any object to be identified is selected, identifying any object to be identified based on any candidate identification mode, and generating an identification result.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.
CN202110304037.3A 2021-03-22 2021-03-22 Image recognition method and device, computer equipment and storage medium Active CN113110782B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110304037.3A CN113110782B (en) 2021-03-22 2021-03-22 Image recognition method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113110782A true CN113110782A (en) 2021-07-13
CN113110782B CN113110782B (en) 2022-09-30

Family

ID=76712178


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023169049A1 (en) * 2022-03-09 2023-09-14 聚好看科技股份有限公司 Display device and server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140119609A1 (en) * 2012-10-31 2014-05-01 Electronics And Telecommunications Research Institute Image recognizing apparatus and method
CN109218982A (en) * 2018-07-23 2019-01-15 Oppo广东移动通信有限公司 Sight spot information acquisition methods, device, mobile terminal and storage medium
CN110909192A (en) * 2019-11-20 2020-03-24 腾讯科技(深圳)有限公司 Instant searching method, device, terminal and storage medium
CN111672118A (en) * 2020-06-05 2020-09-18 腾讯科技(深圳)有限公司 Virtual object aiming method, device, equipment and medium
CN112083872A (en) * 2020-09-16 2020-12-15 努比亚技术有限公司 Picture processing method, mobile terminal and computer storage medium



Also Published As

Publication number Publication date
CN113110782B (en) 2022-09-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant