WO2022166706A1 - Method, computer system and electronic device for object recognition - Google Patents

Method, computer system and electronic device for object recognition

Info

Publication number
WO2022166706A1
Authority
WO
WIPO (PCT)
Prior art keywords
classification
group
individuals
genus
recognition accuracy
Prior art date
Application number
PCT/CN2022/073987
Other languages
English (en)
French (fr)
Inventor
徐青松
李青
Original Assignee
杭州睿胜软件有限公司
Priority date
Filing date
Publication date
Application filed by 杭州睿胜软件有限公司
Publication of WO2022166706A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/45 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/483 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to a method, a computer system, and an electronic device for object recognition.
  • There are many applications (APPs) for identifying objects to be identified.
  • these applications usually receive images from users (including static images, dynamic images, and videos, etc.), and recognize objects to be recognized in the images based on an object recognition model established by artificial intelligence technology to obtain recognition results.
  • the recognition result obtained when the object is a biological object may be the biological classification of the object to be recognized as identified by the object recognition model; for example, the taxonomic unit may be family, genus, or species.
  • the recognition results output by the object recognition model may include one or more categories, usually sorted in descending order of confidence, and the category with the highest confidence can be considered as the category with the highest degree of matching with the features of the object to be identified presented in the image.
  • the recognition results output by the object recognition model may also include classifications similar to the classification with the highest confidence.
  • the image from the user usually includes at least a part of the object to be identified, for example, the image taken by the user includes stems, leaves, and flowers of the plant to be identified.
  • An object of the present disclosure is to provide a method, computer system and electronic device for object recognition.
  • a method for object recognition, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model identifying a classification of the recognized object based on a first image presenting at least a portion of the recognized object; in response to the first classification belonging to a first group, displaying a first screen, wherein the first screen includes the first classification; and in response to the first classification belonging to a second group, displaying a second screen, wherein the second screen does not include the first classification and includes a prompt requesting the user to input additional information about the recognized object, wherein the first group and the second group are established based on the statistical recognition accuracy of the object recognition model for the classifications of individuals in the targeted object group, wherein the first group includes classifications of individuals whose recognition accuracy satisfies a first condition, and the second group includes classifications of individuals whose recognition accuracy satisfies a second condition, wherein the first condition is that the recognition accuracy for the classification of the individual whose taxonomic unit is species is higher than a first threshold.
  • a method for object recognition, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model identifying a classification of the recognized object based on a first image presenting at least a part of the recognized object; and in response to the first classification belonging to a pre-established group, displaying information about the classification whose taxonomic unit is genus corresponding to the first classification, wherein the group is established based on the statistical recognition accuracy of the object recognition model for the classifications of individuals in the targeted object group, and the group includes classifications of individuals for which the recognition accuracy at the species level is lower than a first threshold and the recognition accuracy at the genus level is higher than a second threshold.
  • a method for object recognition, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model identifying a classification of the recognized object based on a first image presenting at least a portion of the recognized object; and in response to the first classification belonging to a pre-established group, not displaying the first classification and displaying a prompt requesting the user to input additional information about the recognized object, wherein the group is established based on the statistical recognition accuracy of the object recognition model for the classifications of individuals in the targeted object group, and the group includes classifications of individuals for which the recognition accuracy is lower than a threshold.
  • an electronic device comprising: one or more processors configured to cause the electronic device to perform any of the methods described above.
  • an apparatus for operating an electronic device comprising: one or more processors configured to cause the electronic device to perform any of the methods described above.
  • a computer system for object recognition, comprising: one or more processors; and one or more memories configured to store computer-executable instructions and computer-accessible data associated with the computer-executable instructions, the computer-executable instructions, when executed by the one or more processors, causing the computer system to perform any of the methods described above.
  • a non-transitory computer-readable storage medium having computer-executable instructions stored thereon, the computer-executable instructions, when executed by one or more computing devices, causing the one or more computing devices to perform any of the methods described above.
  • FIG. 1 is a flowchart schematically illustrating at least a portion of a method for object recognition according to an embodiment of the present disclosure.
  • 2 to 8 are diagrams schematically illustrating display screens of a method according to an embodiment of the present disclosure.
  • FIG. 9 is a block diagram schematically illustrating at least a portion of a computer system for object recognition according to an embodiment of the present disclosure.
  • FIG. 10 is a block diagram schematically illustrating at least a portion of a computer system for object recognition according to an embodiment of the present disclosure.
  • FIG. 1 is a flowchart schematically illustrating at least a portion of a method 100 for object recognition according to an embodiment of the present disclosure.
  • the method 100 includes: receiving a classification of the recognized object from the object recognition model (step S110); determining which group the classification indicates the recognized object belongs to (step S120); in response to the classification indicating that the recognized object belongs to the first group, displaying a screen including the classification (step S130); and in response to the classification indicating that the recognized object belongs to the second group, displaying a screen that does not include the classification and includes a prompt requesting the user to input additional information about the recognized object (step S140).
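The dispatch in steps S120 to S140 can be sketched as follows. This is a hypothetical illustration only, not the disclosed implementation; the group contents and helper name are invented for the example.

```python
# Hypothetical sketch of steps S120-S140: look up which group the Top 1
# classification belongs to and choose the screen accordingly. The group
# contents below are invented sample data, not from the disclosure.
GROUP_ONE = {"Baby rubber plant", "Jasmine"}   # species-level result reliable
GROUP_TWO = {"Unreliable species A"}           # result withheld, ask for more input

def choose_screen(top1_classification: str) -> str:
    """Return which screen to display for the Top 1 classification."""
    if top1_classification in GROUP_ONE:       # step S130
        return "screen with classification"
    if top1_classification in GROUP_TWO:       # step S140
        return "screen with prompt for additional information"
    return "other handling"
```

In a real application the group membership test would run against the sets built from the model's statistical accuracy, as described later in this document.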
  • a user inputs an image of all or a portion of an identified object (also referred to herein as a "first image") into an application that can perform object recognition in order to obtain information about the identified object.
  • the image may include any one or a combination of any of the roots, stems, leaves, flowers, fruits, and seeds of the plant to be recognized, and each of the items included Can be the whole or part of this item.
  • the image can be previously stored by the user, captured in real time, or downloaded from the Internet.
  • Imagery can include any form of visual presentation, such as still images, moving images, and video. Images can be captured using devices including cameras, such as mobile phones, tablet computers, etc.
  • An application capable of implementing method 100 may receive the imagery from the user and perform object recognition based on the imagery.
  • Recognition may include any known method of image-based object recognition.
  • a recognized object in an image may be recognized by a computing device and a pre-trained (or "trained") object recognition model to obtain a recognition result (e.g., including one or more recognized classifications).
  • the recognition model can be established based on a neural network such as a deep convolutional neural network (CNN) or a deep residual network (Resnet), etc.
  • the imagery can also be preprocessed before object recognition based on the imagery. Preprocessing may include normalization, brightness adjustment, or noise reduction, among others. Noise reduction can make the features in the image more distinct.
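As a rough sketch of such preprocessing (illustrative only; a real pipeline would operate on image arrays, e.g. with NumPy), normalization and brightness adjustment might look like:

```python
# Illustrative preprocessing: normalize 8-bit pixel values to [0, 1] and
# apply a simple brightness scaling, clamped to the valid range. Plain
# lists stand in for image arrays to keep the sketch self-contained.
def preprocess(pixels, brightness=1.0):
    normalized = [p / 255.0 for p in pixels]               # normalization
    return [min(1.0, v * brightness) for v in normalized]  # brightness adjustment

processed = preprocess([0, 128, 255], brightness=1.1)
```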
  • the recognition results provided by an object recognition model typically include one or more classifications of the recognized object.
  • One or more classifications are ranked from high to low by confidence (how close the classification is to the true classification).
  • Ranking first is the classification with the highest confidence, which is also referred to herein as the "Top 1 recognition result" and can be described as the "first classification" in at least some of the claims.
  • Ranking second is the classification whose confidence is second only to Top 1, which is also referred to herein as the "Top 2 recognition result".
  • Ranking third is called the "Top 3 recognition result", and so on.
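The Top 1 / Top 2 / Top 3 ordering described above amounts to sorting the model's (classification, confidence) pairs by confidence in descending order; the sample values here are made up for illustration:

```python
# Rank model outputs by confidence (descending) to obtain Top 1, Top 2, ...
# The (classification, confidence) pairs are illustrative sample data.
results = [("Forsythia", 0.12), ("Jasmine", 0.81), ("Cherry blossom", 0.05)]
ranked = sorted(results, key=lambda r: r[1], reverse=True)
top1 = ranked[0][0]  # highest-confidence classification
```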
  • the classification unit of one or more classifications included in the recognition result provided by the object recognition model is species.
  • in some embodiments, the taxonomic unit of each classification in the recognition result is genus.
  • the taxonomic units of one or more classifications included in the recognition result provided by the object recognition model are species and genus.
  • the classification in which the taxonomic unit is species is abbreviated as "species classification"
  • the classification in which the taxonomic unit is genus is abbreviated as "genus classification".
  • Objects that are similar to each other can have the same classification or different classifications.
  • Plant 1 and Plant 2 are similar plants to each other, Plant 1 and Plant 2 can have the same genus classification and have different species classifications, and can also have different genus classifications.
  • a classification of an individual having a morphology similar to that of an individual indicated by at least one of the one or more classifications described above may be derived for each recognition result; such a classification is also referred to herein as a "similar result".
  • similar results of the respective identification results can be obtained according to a pre-established rule database. Similar results can be provided by the object recognition model, or can be obtained from the recognition results obtained from the object recognition model.
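One way such a rule database could be realized is a simple mapping from a classification to morphologically similar classifications; the entries below are hypothetical, echoing the jasmine example given later in the text:

```python
# Hypothetical rule database of "similar results": classifications of
# individuals with similar morphology. Entries are illustrative only.
SIMILAR_RESULTS = {
    "Jasmine": ["Forsythia", "Peach blossom", "Cherry blossom"],
}

def similar_results(classification: str) -> list:
    """Look up similar results; empty list if none are recorded."""
    return SIMILAR_RESULTS.get(classification, [])
```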
  • Regarding the object group targeted by the object recognition model: if an object recognition model is used to identify plants, the object group it targets is plants, the individuals in the object group are the various species of plants, and an individual's classification is the classification of that species of plant (e.g., its species classification).
  • The word "species" used to define an individual generally refers to a classification whose taxonomic unit is species. The same applies when the object recognition model is used to recognize animals.
  • the object recognition model can also be used to identify certain plants (or animals).
  • the object recognition model is used to identify ferns
  • the object population it targets is ferns
  • the individuals in the object population refer to various species of ferns.
  • a group is a collection of individual classifications established based on the statistical recognition accuracy rates of the individual classifications in the targeted target group by the object recognition model, including the individual classifications whose recognition accuracy rates satisfy certain conditions.
  • the group of objects targeted by the object recognition model are plants.
  • a large amount of test data (for example, 10,000 sets of data) is used to calculate the recognition accuracy of the object recognition model for various types of plants, and the statistical results are shown in Table 1.
  • the recognition accuracy for a certain type of plant refers to the ratio of the number of samples in that plant's test data set that are correctly identified by the object recognition model to the total number of samples in that test data set.
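That definition reduces to a simple ratio; a sketch with made-up sample counts:

```python
# Recognition accuracy for one plant type: correctly identified samples
# divided by the total samples in that plant's test data set.
def recognition_accuracy(num_correct: int, num_total: int) -> float:
    return num_correct / num_total

# e.g. 930 of 1000 test images identified correctly -> 93% accuracy
acc = recognition_accuracy(930, 1000)
```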
  • the accuracy rate of the species classification corresponding to the Top 1 identification results provided by the object recognition model is higher than 85%, which means that the object recognition model identifies these plants almost correctly at the species level.
  • These kinds of plants can be classified into group one, which can be in the form of, for example, a collection of species classifications of these plants.
  • the accuracy rate of the species classification corresponding to the Top 1 recognition results provided by the object recognition model is about 51%, but the accuracy rate of the corresponding genus classification is about 93%, which means that the object recognition model identifies these plants almost correctly at the genus level, but may be incorrect at the species level.
  • the object recognition model cannot accurately distinguish between these relatively similar species classifications.
  • group two can be, for example, a collection of species classifications of these plants.
  • the accuracy rate of the species classification corresponding to the Top 1 recognition results provided by the object recognition model is about 51%, indicating that the recognition results of the species classification may be incorrect, but the accuracy rate of the genus classification is about 66%, indicating that the recognition results of the genus classification may be acceptable but not ideal.
  • group three can be, for example, a collection of species classifications of these plants.
  • the accuracy rate of the species classification corresponding to the Top 1 recognition result provided by the object recognition model is about 22% and the accuracy rate of the genus classification is about 29%, which means that the recognition results are almost wrong.
  • These kinds of plants can be classified into group four, which can be, for example, a collection of species classifications of these plants. It can be seen that there is no intersection between groups established in this way.
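A minimal sketch of how such disjoint groups could be derived from the statistical accuracies. The thresholds mirror the example percentages above but are illustrative choices, not limits claimed by the disclosure:

```python
# Assign a species to one of the four groups based on its statistical
# species-level and genus-level recognition accuracy. Thresholds are
# illustrative, chosen to match the example percentages in the text.
def assign_group(species_acc: float, genus_acc: float) -> int:
    if species_acc > 0.85:
        return 1  # species-level result almost always correct
    if genus_acc > 0.90:
        return 2  # genus reliable, species not
    if genus_acc > 0.50:
        return 3  # genus acceptable but not ideal
    return 4  # result considered unreliable
```

Because each species falls through the conditions in order, every species lands in exactly one group, matching the statement that the groups have no intersection.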
  • In step S110, the classification of the recognized object (that is, the Top 1 recognition result) is received from the object recognition model.
  • In step S120, it is determined which group the Top 1 recognition result belongs to. For example, if it is determined in step S120 that the classification of the Top 1 recognition result is included in group one, the recognition result can be considered accurate, and a screen including the Top 1 recognition result can be displayed in step S130.
  • If it is determined in step S120 that the classification of the Top 1 recognition result is included in group four, the recognition result may be considered unreliable and may not be displayed to the user; for example, in step S140, a screen that does not include the Top 1 recognition result and includes a prompt requesting the user to input additional information about the recognized object is displayed.
  • the additional information about the recognized object may include morphological information of the recognized object, growth environment information, recognition environment information, and the like.
  • the screen displayed in step S140 may include a prompt requesting the user to input such information. This information can be input in various forms, such as text, voice, and video.
  • the prompt requesting the user to input additional information about the identified object may include: a prompt area requesting the user to input one or more additional images; and/or a shooting guide telling the user to capture images from different angles and/or distances.
  • the user can input additional information about the recognized object in the form of images (referred to herein as "additional images"). The method according to this embodiment can drive the object recognition model to re-identify the classification of the recognized object based on the aforementioned first image and the additional images, or based only on the additional images, and obtain the re-identified result from the object recognition model.
  • the re-identified result may be directly displayed to the user without distinguishing the group (ie, without performing step S120 or similar operations), or the above-mentioned steps S130 to S140 may be performed for the re-identified result.
  • the recognition result of Top 1 received from the object recognition model belongs to group 1, that is, it can be considered that the recognition result of the species classification of the object recognition model is correct at this time, and the screen 10 shown in FIG. 2 can be displayed.
  • Screen 10 may include the recognition results of Top 1.
  • the recognition result of Top 1 is the category "Baby rubber plant" with the highest confidence of the recognized object recognized by the object recognition model.
  • In response to a user operation such as clicking or sliding (e.g., a specific operation such as swiping right), an additional page for case one may be displayed.
  • the additional page may include one or more of the following: shooting instructions; a prompt for changing the recognition result output by the method (i.e., the Top 1 recognition result displayed on screen 10); and classifications of individuals having morphologies similar to the recognized individual (i.e., similar results).
  • the additional page 20 includes shooting instructions for the user (displayed in screen 20 as "Tips for taking pictures", also known as shooting techniques or shooting methods), for example, "Focus the plant in the middle of the frame, and avoid dark or contaminated images".
  • the additional page 20 also includes, below the shooting guide, a prompt for changing the Top 1 recognition result (displayed as "Change the result" in screen 20), so that when the user thinks the Top 1 recognition result provided by the object recognition model is wrong, he or she can correct it.
  • additional pages may also include similar results to the top 1 identification results.
  • the top 1 recognition result provided by the object recognition model is displayed on the screen 10 as jasmine, and the additional page can also display the classification of individuals with similar shapes to jasmine, such as forsythia, peach and cherry blossoms.
  • the recognition result of Top 1 received from the object recognition model belongs to group 2, that is, it can be considered that the recognition result of the species classification of the object recognition model at this time is not accurate, but the corresponding genus classification is correct
  • the screen 30 shown in FIG. 4 can be displayed.
  • the screen 30 includes information on the genus classification corresponding to the Top 1 recognition result, which in screen 30 is the content displayed in area 31.
  • the screen 30 may further include the recognition result of Top 1, which is the content displayed in the area 32 in the screen 30.
  • the screen 30 may also include, for example in area 33, one or more classifications received from the object recognition model with a lower confidence than the Top 1 recognition result, such as the recognition results of Top 2, Top 3, etc. (in one embodiment, these are displayed in area 33 only when the genus classification corresponding to the recognition result of Top 2, Top 3, etc. is the same as the genus classification corresponding to the Top 1 recognition result); and/or similar results for the Top 1 recognition result.
  • the recognition results of Top 2, Top 3, etc. are not displayed again for classifications that also appear in the similar results.
  • for example, the screen 30 may include, in sequence, the recognition result of Top 2, the recognition result of Top 3, and the similar results with any classification duplicating the Top 2 (or Top 3) recognition result removed.
  • the Top 1 recognition result received from the object recognition model belongs to group three, that is to say, the recognition results of both the species classification and the genus classification of the object recognition model can be considered not very accurate at this time, and the screen 40 shown in FIG. 5 can be displayed.
  • the screen 40 may include the Top 1 recognition result (e.g., displayed in area 41) and one or more classifications received from the object recognition model with a confidence lower than the Top 1 recognition result (e.g., displayed in area 42), such as the recognition results of Top 2, Top 3, etc. (in one embodiment, these are displayed in area 42 only when the genus classification corresponding to the recognition result of Top 2, Top 3, etc. is the same as the genus classification corresponding to the Top 1 recognition result).
  • the screen 40 may also include similar results for the Top 1 recognition result (e.g., displayed in area 43). It should be noted that the recognition results of Top 2, Top 3, etc. are not displayed again for classifications that also appear in the similar results.
  • Screen 40 may also include a prompt (eg, displayed in area 44 ) requesting the user to enter additional information about the identified object.
  • the prompt requesting the user to input additional information about the recognized object may include: a prompt area requesting the user to input one or more additional images; and/or a shooting guide telling the user to take images from different angles and/or distances.
  • For more information on the prompt requesting the user to enter additional information about the identified object, reference may be made to the description below with respect to FIGS. 6 to 8.
  • the recognition result of Top 1 received from the object recognition model belongs to group 4, that is, it can be considered that the recognition result of the object recognition model is incorrect at this time, and the screen 50 shown in FIG. 6 can be displayed.
  • the screen 50 does not display the Top 1 recognition result but displays a prompt requesting the user to input additional information about the recognized object; for example, it may include a prompt area requesting the user to input one or more additional images and/or a shooting guide telling the user to capture images from different angles and/or distances.
  • screen 50 displays the prompt "Could you please try 'Multi-image' identification?" to request the user to enter one or more additional images about the identified object. Since the recognition result of Top 1 this time is incorrect, it is not displayed to the user in the screen 50, and it may not be saved in the history record of successful recognition.
  • the user may act in response to the prompt, such as clicking the button "Multi-image identification" in screen 50 to enter additional information about the identified object by entering one or more additional images.
  • screen 61 is displayed.
  • the screen 61 includes a shooting guide instructing the user to shoot images from different angles and/or distances.
  • Area 63 of screen 61 is located below the shooting frame and includes 3 small boxes that request the user to input 3 additional images of the identified object in order to re-identify the object.
  • the user may operate button 64 to capture one or more of the requested 3 additional images.
  • thumbnails of the selected images are also displayed within the small frame of the area 63 . These additional images can be displayed from left to right in the order entered.
  • When the input additional images reach a predetermined number, the recognition of the recognized object can be started automatically, or the user can operate the button 66 (that is, an operation instructing to start re-recognition) to initiate re-identification manually. If the input additional images do not reach the predetermined number, the user can manually initiate re-identification by operating the button 66.
  • the re-identification may be based only on the current input of one or more additional images, or may be based on the previously input first image and the current input of one or more additional images.
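That choice can be sketched as below; `object_recognition_model` is a stand-in placeholder for the pre-established model, not a real API:

```python
# Sketch of re-identification: recognize again using either only the newly
# input additional images, or the first image plus the additional images.
def object_recognition_model(images):
    # Placeholder: a real model would return ranked classifications.
    return {"num_images": len(images)}

def reidentify(first_image, additional_images, include_first_image=True):
    images = ([first_image] if include_first_image else []) + list(additional_images)
    return object_recognition_model(images)

result = reidentify("first.jpg", ["a1.jpg", "a2.jpg", "a3.jpg"])
```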
  • the thumbnail of each additional image includes a delete operation area (eg, an "X" symbol in its upper right corner), and the user can delete any additional image before re-identification begins.
  • Re-identification can be performed by the aforementioned object recognition model.
  • the object recognition model re-identifies the identified object based on the one or more additional images, or based on the one or more additional images and the first image, and provides the re-identified result (e.g., it may include only the one highest-confidence classification). After the result of this re-identification is obtained, a screen can be displayed to present the result to the user.
  • One or more of the additional images may be saved in the history of successful identification.
  • FIG. 9 is a block diagram schematically illustrating at least a portion of a computer system 700 for object recognition according to an embodiment of the present disclosure.
  • system 700 may include one or more storage devices 710 , one or more electronic devices 720 , and one or more computing devices 730 , which may be communicatively connected to each other through a network or bus 740 .
  • One or more storage devices 710 provide storage services for one or more electronic devices 720 , and one or more computing devices 730 .
  • Although one or more storage devices 710 are shown in system 700 as a block separate from one or more electronic devices 720 and one or more computing devices 730, it should be understood that the one or more storage devices 710 may actually be stored on any of the other entities 720, 730 included in the system 700.
  • Each of the one or more electronic devices 720 and the one or more computing devices 730 may be located at different nodes of the network or bus 740 and be capable of communicating directly or indirectly with other nodes of the network or bus 740 .
  • the system 700 may also include other devices not shown in FIG. 9 , where each different device is located at a different node of the network or bus 740 .
  • One or more storage devices 710 may be configured to store any of the data described above, including but not limited to: first imagery, additional images, object recognition models, individual sample sets/test data sets, recognition results, individual groups , application program files and other data.
  • One or more computing devices 730 may be configured to perform one or more of the above-described methods according to embodiments, and/or one or more steps of one or more methods according to embodiments.
  • One or more electronic devices 720 may be configured to provide services to the user, which may display screens 10 to 50 and 61, 62 as described above.
  • One or more electronic devices 720 may also be configured to perform one or more steps in a method according to an embodiment.
  • the network or bus 740 may be any wired or wireless network and may also include cables.
  • the network or bus 740 may be part of the Internet, the World Wide Web, a specific intranet, a wide area network, or a local area network.
  • the network or bus 740 may utilize standard communication protocols such as Ethernet, WiFi, and HTTP, protocols that are proprietary to one or more companies, and various combinations of the foregoing.
  • the network or bus 740 may also include, but is not limited to, an Industry Standard Architecture (ISA) bus, a Microchannel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and peripheral component interconnects (PCI) bus.
  • Each of the one or more electronic devices 720 and the one or more computing devices 730 may be configured similarly to the system 800 shown in FIG. 10 , i.e., having one or more processors 810 , one or more memories 820 , as well as instructions and data.
  • Each of the one or more electronic devices 720 and the one or more computing devices 730 may be a personal computing device intended for use by a user or a business computer device used by an enterprise, and may have all of the components normally used in connection with such devices, such as a central processing unit (CPU), memory storing data and instructions (e.g., RAM and internal hard drives), and one or more I/O devices such as a display (e.g., a monitor with a screen, a touch screen, a projector, a television, or another device operable to display information), a mouse, a keyboard, a touch screen, a microphone, speakers, and/or a network interface device.
  • the one or more electronic devices 720 may also include one or more cameras for capturing still images or recording video streams, as well as all components for connecting these elements to each other. While one or more of the electronic devices 720 may each include a full-sized personal computing device, they may alternatively include a mobile computing device capable of wirelessly exchanging data with a server over a network such as the Internet.
  • the one or more electronic devices 720 may be a mobile phone, or a device such as a PDA with wireless support, a tablet PC, or a netbook capable of obtaining information via the Internet.
  • the one or more electronic devices 720 may be a wearable computing system.
  • FIG. 10 is a block diagram schematically illustrating at least a portion of a computer system 800 for object recognition according to one embodiment of the present disclosure.
  • System 800 includes one or more processors 810, one or more memories 820, and other components (not shown) typically found in a computer or the like.
  • Each of the one or more memories 820 may store content accessible by the one or more processors 810 , including instructions 821 executable by the one or more processors 810 and data 822 that can be retrieved, manipulated, or stored by the one or more processors 810 .
  • Instructions 821 may be any set of instructions to be executed directly by one or more processors 810, such as machine code, or any set of instructions to be executed indirectly, such as scripts.
  • the terms “instructions,” “applications,” “processes,” “steps,” and “programs” are used interchangeably herein.
  • Instructions 821 may be stored in object code format for direct processing by one or more processors 810, or in any other computer language, including scripts or collections of stand-alone source code modules interpreted on demand or compiled ahead of time. Instructions 821 may include instructions that cause, for example, one or more processors 810 to function as various neural networks herein. The functions, methods, and routines of instructions 821 are explained in greater detail elsewhere herein.
  • the one or more memories 820 may be any temporary or non-transitory computer readable storage medium capable of storing content accessible by the one or more processors 810, such as hard drives, memory cards, ROM, RAM, DVD, CD, USB memory, writable memory and read-only memory, etc.
  • One or more of the one or more memories 820 may include a distributed storage system, where instructions 821 and/or data 822 may be stored on a plurality of different storage devices that may be physically located in the same or different geographic locations.
  • One or more of the one or more memories 820 may be connected to the one or more processors 810 via a network, and/or may be directly connected to or incorporated into any of the one or more processors 810 .
  • One or more processors 810 may retrieve, store or modify data 822 in accordance with instructions 821 .
  • the data 822 stored in the one or more memories 820 may include at least a portion of one or more of the items stored in the one or more storage devices 710 described above.
  • The data 822 may also be stored in a computer register (not shown), or stored in a relational database as a table or XML document having many different fields and records.
  • Data 822 may be formatted in any computing-device-readable format, such as, but not limited to, binary values, ASCII, or Unicode. Additionally, data 822 may include any information sufficient to identify the relevant information, such as numbers, descriptive text, proprietary codes, pointers, references to data stored in other memory (such as at other network locations), or information used by a function to compute the relevant data.
  • the one or more processors 810 may be any conventional processor, such as a commercially available central processing unit (CPU), graphics processing unit (GPU), or the like. Alternatively, the one or more processors 810 may also be special-purpose components, such as application specific integrated circuits (ASICs) or other hardware-based processors. Although not required, the one or more processors 810 may include specialized hardware components to perform certain computational processes faster or more efficiently, such as image processing of images, and the like.
  • Although FIG. 10 schematically shows the one or more processors 810 and the one or more memories 820 within the same box, the system 800 may actually include multiple processors or memories that may reside within the same physical enclosure or within different physical enclosures.
  • One of the one or more memories 820 may be a hard drive or other storage medium located in a housing different from that of each of the one or more computing devices (not shown) described above.
  • reference to a processor, computer, computing device or memory should be understood to include reference to a collection of processors, computers, computing devices or memories that may or may not operate in parallel.
  • References to "one embodiment" or "some embodiments" mean that a feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment or at least some embodiments of the present disclosure.
  • appearances of the phrases “in one embodiment” and “in some embodiments” in various places in this disclosure are not necessarily referring to the same embodiment or embodiments.
  • the features, structures or characteristics may be combined in any suitable combination and/or subcombination in one or more embodiments.
  • The word "exemplary" means "serving as an example, instance, or illustration," rather than serving as a "model" to be exactly reproduced. Any implementation illustratively described herein is not necessarily to be construed as preferred or advantageous over other implementations. Furthermore, the present disclosure is not bound by any expressed or implied theory presented in the preceding technical field, background, brief summary, or detailed description.
  • a component may be, but is not limited to, a process, object, executable state, thread of execution, and/or program, etc. running on a processor.
  • an application running on a server and the server can be one component.
  • One or more components may exist within an executing process and/or thread, and a component may be localized on one computer and/or distributed between two or more computers.

Abstract

A method, computer system, and electronic device for object recognition, relating to the field of computer technology. The method comprises: receiving a first classification of a recognized object from a pre-established object recognition model; in response to the first classification belonging to a first group, displaying a first screen, wherein the first screen includes the first classification; and in response to the first classification belonging to a second group, displaying a second screen, wherein the second screen does not include the first classification and includes a prompt requesting a user to input additional information about the recognized object, wherein the first condition of the first group is that the recognition accuracy for an individual classification at the taxonomic rank of species is at a first level, the second condition of the second group is that the recognition accuracy for an individual classification at the taxonomic rank of genus is at a second level, and the first level is higher than the second level.

Description

Method, computer system, and electronic device for object recognition. Technical Field
The present disclosure relates to the field of computer technology, and in particular to a method, a computer system, and an electronic device for object recognition.
Background
In the field of computer technology, there are various applications (apps) that recognize an object to be recognized, such as applications for identifying plants. These applications typically receive an image from a user (including still images, moving images, video, etc.) and recognize the object in the image based on an object recognition model built with artificial intelligence techniques to obtain a recognition result. For example, when the object is a living organism, the recognition result may be the biological classification of the object identified by the object recognition model; the taxonomic rank may be, for example, family, genus, or species. The recognition result output by the object recognition model may include one or more classifications, usually sorted by confidence from high to low; the classification with the highest confidence can be regarded as the one that best matches the features of the object presented in the image. In addition, the recognition result output by the object recognition model may also include classifications similar to the highest-confidence classification. The image from the user usually includes at least a part of the object to be recognized; for example, an image taken by the user may include the stem, leaves, and flowers of the plant to be identified.
Summary
An object of the present disclosure is to provide a method, a computer system, and an electronic device for object recognition.
According to a first aspect of the present disclosure, a method for object recognition is provided, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object; in response to the first classification belonging to a first group, displaying a first screen, wherein the first screen includes the first classification; and in response to the first classification belonging to a second group, displaying a second screen, wherein the second screen does not include the first classification and includes a prompt requesting a user to input additional information about the recognized object, wherein the first group and the second group are established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, the first group including individual classifications whose recognition accuracy satisfies a first condition and the second group including individual classifications whose recognition accuracy satisfies a second condition, wherein the first condition is that the recognition accuracy for an individual classification at the taxonomic rank of species is at a first level, the second condition is that the recognition accuracy for an individual classification at the taxonomic rank of genus is at a second level, and the first level is higher than the second level.
According to a second aspect of the present disclosure, a method for object recognition is provided, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object; in response to the first classification belonging to a first group, displaying a first screen, wherein the first screen includes the first classification; and in response to the first classification belonging to a second group, displaying a second screen, wherein the second screen does not include the first classification and includes a prompt requesting a user to input additional information about the recognized object, wherein the first group and the second group are established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, the first group including individual classifications whose recognition accuracy satisfies a first condition and the second group including individual classifications whose recognition accuracy satisfies a second condition, wherein the first condition is that the recognition accuracy for an individual classification at the rank of species is higher than a first threshold, the second condition is that the recognition accuracy for an individual classification at the rank of species is lower than a second threshold, and the first threshold is higher than the second threshold.
According to a third aspect of the present disclosure, a method for object recognition is provided, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object; and in response to the first classification belonging to a pre-established group, displaying information about the genus-rank classification corresponding to the first classification, wherein the group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, and the group includes individual classifications whose recognition accuracy at the rank of species is lower than a first threshold and whose recognition accuracy at the rank of genus is higher than a second threshold.
According to a fourth aspect of the present disclosure, a method for object recognition is provided, comprising: receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object; and in response to the first classification belonging to a pre-established group, not displaying the first classification and displaying a prompt requesting a user to input additional information about the recognized object, wherein the group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, and the group includes individual classifications whose recognition accuracy is lower than a threshold.
According to a fifth aspect of the present disclosure, an electronic device is provided, comprising: one or more processors configured to cause the electronic device to perform any of the methods described above.
According to a sixth aspect of the present disclosure, an apparatus for operating an electronic device is provided, comprising: one or more processors configured to cause the electronic device to perform any of the methods described above.
According to a seventh aspect of the present disclosure, a computer system for object recognition is provided, comprising: one or more processors; and one or more memories configured to store computer-executable instructions and computer-accessible data associated with the computer-executable instructions, wherein the computer-executable instructions, when executed by the one or more processors, cause the computer system to perform any of the methods described above.
According to an eighth aspect of the present disclosure, a non-transitory computer-readable storage medium is provided, having computer-executable instructions stored thereon which, when executed by one or more computing devices, cause the one or more computing devices to perform any of the methods described above.
Other features and advantages of the present disclosure will become apparent from the following detailed description of exemplary embodiments of the present disclosure with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which form a part of the specification, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart schematically illustrating at least a portion of a method for object recognition according to an embodiment of the present disclosure.
FIGS. 2 to 8 are schematic diagrams illustrating screens displayed by methods according to embodiments of the present disclosure.
FIG. 9 is a structural diagram schematically illustrating at least a portion of a computer system for object recognition according to an embodiment of the present disclosure.
FIG. 10 is a structural diagram schematically illustrating at least a portion of a computer system for object recognition according to an embodiment of the present disclosure.
Note that in the embodiments described below, the same reference numeral is sometimes shared between different drawings to denote the same part or parts having the same function, and its repeated description is omitted. In this specification, similar reference numerals and letters denote similar items; therefore, once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
Detailed Description
Various exemplary embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings. Note that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure. In the following description, numerous details are set forth in order to better explain the present disclosure; it will be understood, however, that the present disclosure may be practiced without these details.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the present disclosure or its application or uses. In all examples shown and discussed herein, any specific value should be interpreted as merely exemplary rather than as limiting.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but, where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
FIG. 1 is a flowchart schematically illustrating at least a portion of a method 100 for object recognition according to an embodiment of the present disclosure. The method 100 includes: receiving a classification of a recognized object from an object recognition model (step S110); determining which group the classification indicates the recognized object belongs to (step S120); in response to the classification indicating that the recognized object belongs to a first group, displaying a screen that includes the classification (step S130); and in response to the classification indicating that the recognized object belongs to a second group, displaying a screen that does not include the classification and that includes a prompt requesting the user to input additional information about the recognized object (step S140).
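The S110 to S140 dispatch described above can be sketched as follows. This is an illustrative reading rather than code from the patent: the group sets, the species names, and the screen identifiers are hypothetical placeholders.

```python
# Illustrative sketch of steps S110-S140: the Top-1 classification received
# from the model is routed to a display screen according to which
# pre-established group its species classification belongs to.
# GROUP_ONE / GROUP_FOUR and the screen names below are hypothetical.

GROUP_ONE = {"Peperomia obtusifolia"}   # species the model recognizes reliably
GROUP_FOUR = {"Some hard species"}      # species the model recognizes unreliably

def dispatch(top1_species: str) -> str:
    """Return which screen to display for a given Top-1 species classification."""
    if top1_species in GROUP_ONE:
        return f"screen_with_result:{top1_species}"      # step S130
    if top1_species in GROUP_FOUR:
        return "screen_requesting_additional_info"       # step S140
    return f"intermediate_screen:{top1_species}"         # other groups

print(dispatch("Peperomia obtusifolia"))
print(dispatch("Some hard species"))
```

The key design point is that the dispatch depends only on precomputed group membership, so no extra model inference is needed at display time.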
In some cases, a user inputs an image of all or part of an object to be recognized (also referred to herein as the "first image") into an application capable of object recognition, in order to obtain information about the object. For example, when the object to be recognized is a plant, the image may include any one or a combination of the plant's roots, stems, leaves, flowers, fruits, seeds, and so on, where each included item may be the whole or a part of that item. The image may be previously stored by the user, captured in real time, or downloaded from the network. The image may include any form of visual presentation, such as still images, moving images, and video. The image may be captured with a device that includes a camera, such as a mobile phone or tablet computer.
An application capable of implementing the method 100 may receive the image from the user and perform object recognition based on it. Recognition may include any known method of image-based object recognition. For example, the object to be recognized in the image may be recognized by a computing device and a pre-trained ("trained") object recognition model to obtain a recognition result (e.g., including one or more recognized classifications). The recognition model may be built on a neural network, such as a deep convolutional neural network (CNN) or a deep residual network (ResNet). For example, for each plant classification, a certain number of image samples labeled with that plant's classification name are obtained as a training sample set, and the neural network is trained with these image samples until its output accuracy meets the requirement. Before recognition, the image may also be preprocessed. Preprocessing may include normalization, brightness adjustment, or noise reduction, among others. Noise reduction can emphasize the feature portions of the image and make the features more distinct.
As noted above, the recognition result provided by the object recognition model usually includes one or more classifications of the recognized object, arranged from high to low confidence (the degree to which the classification is believed to be close to the true classification). The first-ranked, highest-confidence classification is also referred to herein as the "Top 1 recognition result," and in at least some of the claims is described as the "first classification." The second-ranked classification is referred to herein as the "Top 2 recognition result," the third-ranked as the "Top 3 recognition result," and so on. In one embodiment, the classifications included in the recognition result are at the rank of species; the genus-rank classification of each result can be obtained from the correspondence between species and genus. In another embodiment, the classifications included in the recognition result are at the ranks of species and genus. For brevity, a classification at the rank of species is hereinafter referred to simply as a "species classification," and a classification at the rank of genus as a "genus classification."
In reality, there are often multiple objects with similar morphology, whether in local features or overall appearance. Mutually similar objects may share the same classification or have different classifications. For example, if plant one and plant two are similar plants, they may share the same genus classification while having different species classifications, or they may have different genus classifications. In some embodiments, the classifications of individuals morphologically similar to the individual indicated by at least one of the above classifications, also referred to herein as "similar results," can be obtained from the recognition results, for example based on a pre-established rule database. Similar results may be provided by the object recognition model itself or derived from the recognition results obtained from it.
The terms "object population targeted by the object recognition model," "individual" in the object population, "individual classification," and "group" as used herein are explained below. In one example, if the object recognition model is used to identify plants, the object population it targets is plants, the individuals in the population are plants of each kind, and the individual classifications are the classifications (e.g., species classifications) of plants of each kind. Herein, unless otherwise stated, the "kind" used to define an individual generally refers to a classification at the rank of species. The same applies when the model identifies animals. The model may also target a particular subset of plants (or animals); in one example, a model for identifying ferns targets ferns as its object population, and the individuals are ferns of each kind. A group is a set of individual classifications established based on the model's statistical recognition accuracy for individual classifications in the targeted object population, and the recognition accuracy of the included individual classifications satisfies a particular condition.
The groups referred to herein are explained below with a specific example in which the object population targeted by the object recognition model is plants. The model's recognition accuracy for each kind of plant is measured using a large amount of test data (e.g., 10,000 sets), with the statistics shown in Table 1. The recognition accuracy for a given kind of plant is the ratio of the number of samples in that kind's test data set whose classification the model identifies correctly to the total number of samples in that test data set.
According to the statistics, for some kinds of plants, the accuracy of the species classification corresponding to the model's Top 1 recognition result is above 85%, meaning that for these plants the model's species-rank recognition is almost always correct. These kinds can be assigned to group one, which may take the form of the set of their species classifications. For some kinds of plants, the accuracy of the species classification corresponding to the Top 1 result is about 51%, but the accuracy of the corresponding genus classification is about 93%, meaning that the model's genus-rank recognition is almost always correct while its species-rank recognition may be incorrect. This is usually because the genus contains two or more fairly similar species classifications that the model cannot accurately distinguish. These kinds can be assigned to group two, likewise represented as the set of their species classifications. For some kinds of plants, the species-classification accuracy of the Top 1 result is about 51%, indicating that the species result may be incorrect, while the genus-classification accuracy is about 66%, indicating that the genus result may be acceptable but not ideal; these kinds can be assigned to group three, again as a set of species classifications. For still other kinds, the species-classification accuracy of the Top 1 result is about 22% and the genus-classification accuracy is about 29%, meaning that the recognition result is almost always wrong; these kinds can be assigned to group four, again as a set of species classifications. Groups established in this way are disjoint.
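A minimal sketch of how such groups could be derived from test statistics, assuming simple accuracy cutoffs approximating the example figures above (85% for species, 90% and 50% for genus); the function names and exact thresholds are illustrative, not taken from the patent.

```python
# Hedged sketch of the Table 1 grouping: per-kind accuracy is the fraction of
# correctly identified test samples, and kinds are then bucketed by their
# species-level and genus-level accuracies. Thresholds are assumptions.

def accuracy(results):
    """results: list of (predicted, truth) pairs for one kind's test set."""
    correct = sum(1 for pred, truth in results if pred == truth)
    return correct / len(results)

def assign_group(species_acc: float, genus_acc: float) -> int:
    if species_acc > 0.85:
        return 1    # species result trusted
    if genus_acc > 0.90:
        return 2    # genus trusted, species not
    if genus_acc > 0.50:
        return 3    # genus acceptable but not ideal
    return 4        # result considered unreliable

print(assign_group(0.90, 0.95))  # -> 1
print(assign_group(0.51, 0.93))  # -> 2
print(assign_group(0.51, 0.66))  # -> 3
print(assign_group(0.22, 0.29))  # -> 4
```

Because the conditions are checked in order, each kind lands in exactly one group, matching the text's observation that the groups are disjoint.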
Table 1: Groups and recognition accuracy
Figure PCTCN2022073987-appb-000001
In the method 100 described above, the classification of the recognized object, i.e., the Top 1 recognition result, is received from the object recognition model in step S110. In step S120, it is determined which group the Top 1 result belongs to. For example, if step S120 determines that the species classification of the Top 1 result is included in group one, the result can be considered accurate, and a screen including the Top 1 result may be displayed in step S130. If step S120 determines that the species classification of the Top 1 result is included in group four, the result can be considered unreliable and therefore may not be shown to the user; for example, step S140 displays a screen that does not include the Top 1 result and that includes a prompt requesting the user to input additional information about the recognized object.
The additional information about the recognized object may include its morphological information, growth-environment information, recognition-environment information, and so on. The screen displayed in step S140 may include prompts requesting the user to input such information, which may be entered in various forms such as text, speech, or images. In one embodiment, the prompt requesting additional information may include a prompt area requesting the user to input one or more additional images, and/or shooting guidance telling the user to capture images from different angles and/or distances. The user may input additional information about the recognized object in the form of images (referred to herein as "additional images"); the method according to this embodiment may drive the object recognition model to re-recognize the classification of the recognized object based on the aforementioned first image together with the additional images, or based on the additional images alone, and obtain the re-recognition result from the model. The re-recognition result may be displayed to the user directly without group discrimination (i.e., without performing step S120 or a similar operation), or steps S130 to S140 above may be performed on it.
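The re-recognition step described above can be sketched as follows. Here `model_predict` is a hypothetical stand-in for the object recognition model's interface, and the toy model used in the example is purely illustrative.

```python
# Illustrative sketch of re-recognition: the model is invoked again either
# with the first image plus the additional images, or with the additional
# images alone. `model_predict` stands in for the real model.

from typing import Callable, List, Optional

def rerecognize(model_predict: Callable[[List[str]], str],
                additional_images: List[str],
                first_image: Optional[str] = None,
                use_first_image: bool = True) -> str:
    images = list(additional_images)
    if use_first_image and first_image is not None:
        images = [first_image] + images
    return model_predict(images)

# Toy stand-in model: predicts "rose" when most image names contain "rose".
toy = lambda imgs: "rose" if sum("rose" in i for i in imgs) > len(imgs) / 2 else "unknown"
print(rerecognize(toy, ["rose_1.jpg", "rose_2.jpg"],
                  first_image="blurry.jpg", use_first_image=False))
```

Dropping the first image (as in the second branch of the text) can help when the original photo was the reason the first attempt failed.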
Those skilled in the art should understand that the division into the groups shown in Table 1 is merely illustrative; in other embodiments, the individual classifications corresponding to the individuals in the object population may be divided into fewer or more groups according to other division conditions. Those skilled in the art should also understand that the user interaction performed for each group in FIG. 1 (including the displayed screens) is likewise illustrative; in other embodiments, an appropriate interaction may be designed for each group according to other division conditions.
The interactions performed by methods according to embodiments of the present disclosure for the different groups are described below with reference to the specific examples of FIGS. 2 to 9.
Case one: the Top 1 recognition result belongs to group one
In case one, the Top 1 recognition result received from the object recognition model belongs to group one, i.e., the model's species-classification result can be considered correct, so screen 10 shown in FIG. 2 may be displayed. Screen 10 may include the Top 1 recognition result. In this specific example, the Top 1 result is the highest-confidence species classification of the recognized object identified by the model, "Baby rubber plant."
While screen 10 is displayed, user operations such as taps and swipes may be received. In response to a particular user operation while screen 10 is displayed (e.g., swiping right), an additional page for case one may be displayed. The additional page may include one or more of the following: shooting guidance; a prompt for changing the recognition result output by the method (i.e., the Top 1 result displayed on screen 10); and classifications of individuals morphologically similar to the individual indicated by the Top 1 result (i.e., similar results).
FIG. 3 shows additional page 20. Additional page 20 includes shooting guidance for the user (shown on screen 20 as "Tips for taking pictures," also called shooting tips, shooting methods, etc.), for example, "Focus the plant in the middle of the viewfinder and avoid dark or contaminated images." Below the shooting guidance, additional page 20 also includes a prompt for changing the Top 1 recognition result (shown on screen 20 as "Change the result"), so that the user can make a correction when they believe the Top 1 result provided by the model is wrong.
In one embodiment, although not shown in the drawings, the additional page may also include the similar results of the Top 1 recognition result. For example, if screen 10 shows the model's Top 1 result as winter jasmine, the additional page may also show classifications of individuals morphologically similar to winter jasmine, such as forsythia, peach blossom, and cherry blossom.
Case two: the Top 1 recognition result belongs to group two
In case two, the Top 1 recognition result received from the model belongs to group two, i.e., the model's species-classification result can be considered inaccurate while the corresponding genus classification is correct, so screen 30 shown in FIG. 4 may be displayed. Screen 30 includes information about the genus classification corresponding to the Top 1 result, shown in area 31 of screen 30. After that information, screen 30 may further include the Top 1 recognition result, shown in area 32.
After the Top 1 recognition result, screen 30 may further include, for example in area 33: one or more classifications received from the model whose confidence is lower than that of the Top 1 result, such as the Top 2 and Top 3 results (in one embodiment, the Top 2, Top 3, etc. results are displayed in area 33 only if their corresponding genus classifications are the same as that of the Top 1 result); and/or the similar results of the Top 1 result. Note that when both are displayed, classifications appearing in both the Top 2, Top 3, etc. results and the similar results are not displayed twice. For example, if the Top 2 result is identical to one of the similar results, then after the Top 1 result, screen 30 may include, in order, the Top 2 result, the Top 3 result, and the similar results excluding the Top 2 result.
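The non-duplicating display rule described above can be sketched as follows; the function and the example species identifiers are illustrative, not from the patent.

```python
# Sketch of the display-list rule: lower-confidence results (Top 2, Top 3, ...)
# are shown only if they share the Top-1 result's genus, and similar results
# already present in the list are not repeated.

def build_display_list(top1, lower_results, similar_results, genus_of):
    shown = [top1]
    for r in lower_results:
        if genus_of[r] == genus_of[top1]:   # same genus as Top 1 only
            shown.append(r)
    for s in similar_results:
        if s not in shown:                   # no duplicate entries
            shown.append(s)
    return shown

genus = {"sp_a": "G1", "sp_b": "G1", "sp_c": "G2", "sp_d": "G1"}
print(build_display_list("sp_a", ["sp_b", "sp_c"], ["sp_b", "sp_d"], genus))
```

In this example, `sp_c` is filtered out for having a different genus, and `sp_b` appears only once even though it is both the Top 2 result and a similar result.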
Case three: the Top 1 recognition result belongs to group three
In case three, the Top 1 recognition result received from the model belongs to group three, i.e., neither the model's species-classification result nor its genus-classification result can be considered very accurate, so screen 40 shown in FIG. 5 may be displayed. Screen 40 may include the Top 1 result (e.g., displayed in area 41) and one or more classifications received from the model whose confidence is lower than that of the Top 1 result (e.g., displayed in area 42), such as the Top 2 and Top 3 results (in one embodiment, displayed in area 42 only if their corresponding genus classifications are the same as that of the Top 1 result). After this information, screen 40 may further include the similar results of the Top 1 result (e.g., displayed in area 43); note that classifications appearing in both the Top 2, Top 3, etc. results and the similar results are not displayed twice. Screen 40 may also include a prompt requesting the user to input additional information about the recognized object (e.g., displayed in area 44). That prompt may include a prompt area requesting the user to input one or more additional images, and/or shooting guidance telling the user to capture images from different angles and/or distances. For more on this prompt, see the description of FIGS. 6 to 8 below.
Case four: the Top 1 recognition result belongs to group four
In case four, the Top 1 recognition result received from the model belongs to group four, i.e., the model's recognition result can be considered incorrect, so screen 50 shown in FIG. 6 may be displayed. Screen 50 does not display the Top 1 result but displays a prompt requesting the user to input additional information about the recognized object, which may include, for example, a prompt area requesting one or more additional images and/or shooting guidance telling the user to capture images from different angles and/or distances. In the example shown in FIG. 6, screen 50 displays the prompt "Could you please try 'Multi-image' identification?" to request one or more additional images of the recognized object. Since the Top 1 result is incorrect in this case, it is not shown to the user on screen 50, and it may also be excluded from the history of successful identifications.
The user may act on the prompt, for example by tapping the "Multi-image identification" button on screen 50 to input additional information about the recognized object by providing one or more additional images. In one example, in response to that button being tapped, screen 61 is displayed. In the illustrated example, screen 61 includes shooting guidance telling the user to capture images from different angles and/or distances. Area 63 of screen 61, located below the viewfinder, contains three small boxes requesting the user to input three additional images of the recognized object for re-recognition. The user can operate button 64 to capture one or more of the three requested additional images. After an image is captured, its thumbnail is displayed in a small box in area 63, optionally with an animation of the image shrinking into the box. For example, after the first additional image is captured, screen 62 may be displayed. The user can also operate button 65 to select one or more of the three requested additional images from the photo album; like captured images, thumbnails of selected images are displayed in the small boxes of area 63. The additional images may be displayed from left to right in the order in which they were input.
When the number of input additional images reaches a predetermined number (e.g., the requested number, three in this example), re-recognition of the recognized object may start automatically, or the user may start it manually by operating button 66 (i.e., an operation indicating the start of re-recognition). If the number of input additional images has not reached the predetermined number, the user can start re-recognition manually via button 66. Re-recognition may be based only on the one or more additional images input this time, or on the previously input first image together with the additional images input this time. The thumbnail of each additional image includes a delete area (e.g., an "×" symbol in its upper-right corner), and the user can delete any additional image before re-recognition starts. Re-recognition may be performed by the aforementioned object recognition model, which re-recognizes the object based on the one or more additional images, or based on the additional images and the first image, and provides the re-recognition result (which may, for example, include only the single highest-confidence classification). After the re-recognition result is obtained, a screen may be displayed to show it to the user. One or more of the additional images may be saved in the history of successful identifications.
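The start condition described above (automatic once the predetermined number of additional images has been input, manual otherwise) can be sketched as follows; the names and the count of three are taken from the example, but the function itself is illustrative.

```python
# Sketch of the re-recognition start condition: start automatically at the
# predetermined image count, or earlier if the user presses the start button.

PREDETERMINED_COUNT = 3  # the requested number in the example above

def should_start(num_additional_images: int, user_pressed_start: bool) -> bool:
    return num_additional_images >= PREDETERMINED_COUNT or user_pressed_start

print(should_start(3, False))  # automatic start
print(should_start(2, False))  # keep waiting for more images
print(should_start(2, True))   # manual start with fewer images
```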
FIG. 9 is a structural diagram schematically illustrating at least a portion of a computer system 700 for object recognition according to an embodiment of the present disclosure. Those skilled in the art will understand that system 700 is merely an example and should not be regarded as limiting the scope of the present disclosure or the features described herein. In this example, system 700 may include one or more storage devices 710, one or more electronic devices 720, and one or more computing devices 730, which may be communicatively connected to each other through a network or bus 740. The one or more storage devices 710 provide storage services for the one or more electronic devices 720 and the one or more computing devices 730. Although the one or more storage devices 710 are shown in system 700 as a block separate from the electronic devices 720 and computing devices 730, it should be understood that they may actually be implemented on any of the other entities 720, 730 included in system 700. Each of the electronic devices 720 and computing devices 730 may be located at a different node of the network or bus 740 and be capable of communicating directly or indirectly with other nodes of the network or bus 740. Those skilled in the art will understand that system 700 may also include other devices not shown in FIG. 9, each located at a different node of the network or bus 740.
The one or more storage devices 710 may be configured to store any of the data described above, including but not limited to: the first image, additional images, the object recognition model, the sample sets/test data sets, recognition results, the groups, application program files, and other data. The one or more computing devices 730 may be configured to perform one or more of the methods according to the embodiments described above, and/or one or more steps of one or more of those methods. The one or more electronic devices 720 may be configured to provide services to users and may display the screens 10 to 50 and 61, 62 described above. The one or more electronic devices 720 may also be configured to perform one or more steps of a method according to an embodiment.
The network or bus 740 may be any wired or wireless network and may also include cables. The network or bus 740 may be part of the Internet, the World Wide Web, a specific intranet, a wide area network, or a local area network. The network or bus 740 may utilize standard communication protocols such as Ethernet, WiFi, and HTTP, protocols proprietary to one or more companies, and various combinations of the foregoing. The network or bus 740 may also include, but is not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Each of the one or more electronic devices 720 and the one or more computing devices 730 may be configured similarly to the system 800 shown in FIG. 10, i.e., having one or more processors 810, one or more memories 820, as well as instructions and data. Each of the electronic devices 720 and computing devices 730 may be a personal computing device intended for use by a user or a business computer device used by an enterprise, and may have all of the components normally used in connection with such devices, such as a central processing unit (CPU), memory storing data and instructions (e.g., RAM and internal hard drives), and one or more I/O devices such as a display (e.g., a monitor with a screen, a touch screen, a projector, a television, or another device operable to display information), a mouse, a keyboard, a touch screen, a microphone, speakers, and/or a network interface device.
The one or more electronic devices 720 may also include one or more cameras for capturing still images or recording video streams, and all components for connecting these elements to each other. While the one or more electronic devices 720 may each include a full-sized personal computing device, they may alternatively include mobile computing devices capable of wirelessly exchanging data with a server over a network such as the Internet. For example, the one or more electronic devices 720 may be mobile phones, or devices such as a wireless-enabled PDA, a tablet PC, or a netbook capable of obtaining information via the Internet. In another example, the one or more electronic devices 720 may be wearable computing systems.
FIG. 10 is a structural diagram schematically illustrating at least a portion of a computer system 800 for object recognition according to an embodiment of the present disclosure. System 800 includes one or more processors 810, one or more memories 820, and other components (not shown) typically found in devices such as computers. Each of the one or more memories 820 may store content accessible by the one or more processors 810, including instructions 821 executable by the one or more processors 810 and data 822 that can be retrieved, manipulated, or stored by the one or more processors 810.
Instructions 821 may be any set of instructions to be executed directly by the one or more processors 810, such as machine code, or indirectly, such as scripts. The terms "instructions," "applications," "processes," "steps," and "programs" are used interchangeably herein. Instructions 821 may be stored in object code format for direct processing by the one or more processors 810, or in any other computer language, including scripts or collections of independent source code modules interpreted on demand or compiled ahead of time. Instructions 821 may include instructions that cause, for example, the one or more processors 810 to function as the various neural networks herein. The functions, methods, and routines of instructions 821 are explained in greater detail elsewhere herein.
The one or more memories 820 may be any temporary or non-transitory computer-readable storage medium capable of storing content accessible by the one or more processors 810, such as hard drives, memory cards, ROM, RAM, DVDs, CDs, USB memory, writable memory, and read-only memory. One or more of the memories 820 may include a distributed storage system, where instructions 821 and/or data 822 may be stored on multiple different storage devices that may be physically located at the same or different geographic locations. One or more of the memories 820 may be connected to the one or more processors 810 via a network, and/or may be directly connected to or incorporated into any of the one or more processors 810.
The one or more processors 810 may retrieve, store, or modify data 822 in accordance with instructions 821. The data 822 stored in the one or more memories 820 may include at least part of one or more of the items stored in the one or more storage devices 710 described above. For example, although the subject matter described herein is not limited by any particular data structure, data 822 may also be stored in computer registers (not shown), or stored in a relational database as a table or XML document having many different fields and records. Data 822 may be formatted in any computing-device-readable format, such as, but not limited to, binary values, ASCII, or Unicode. Furthermore, data 822 may include any information sufficient to identify the relevant information, such as numbers, descriptive text, proprietary codes, pointers, references to data stored in other memory (such as at other network locations), or information used by a function to compute the relevant data.
The one or more processors 810 may be any conventional processors, such as commercially available central processing units (CPUs) or graphics processing units (GPUs). Alternatively, the one or more processors 810 may be special-purpose components, such as application-specific integrated circuits (ASICs) or other hardware-based processors. Although not required, the one or more processors 810 may include specialized hardware components to perform certain computational processes faster or more efficiently, such as image processing of images.
Although FIG. 10 schematically shows the one or more processors 810 and the one or more memories 820 within the same box, the system 800 may actually include multiple processors or memories that may reside within the same physical enclosure or within different physical enclosures. For example, one of the one or more memories 820 may be a hard drive or other storage medium located in a housing different from that of each of the one or more computing devices (not shown) described above. Accordingly, reference to a processor, computer, computing device, or memory should be understood to include reference to a collection of processors, computers, computing devices, or memories that may or may not operate in parallel.
In the specification and claims, the expression "A or B" includes "A and B" and "A or B," and does not exclusively mean only "A" or only "B," unless specifically stated otherwise.
In the present disclosure, references to "one embodiment" or "some embodiments" mean that a feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment or at least some embodiments of the present disclosure. Therefore, appearances of the phrases "in one embodiment" and "in some embodiments" in various places in this disclosure are not necessarily referring to the same embodiment or embodiments. Furthermore, the features, structures, or characteristics may be combined in any suitable combination and/or subcombination in one or more embodiments.
As used herein, the word "exemplary" means "serving as an example, instance, or illustration," rather than serving as a "model" to be exactly reproduced. Any implementation illustratively described herein is not necessarily to be construed as preferred or advantageous over other implementations. Moreover, the present disclosure is not bound by any expressed or implied theory presented in the preceding technical field, background, summary, or detailed description.
In addition, certain terminology may be used in the following description for reference purposes only and is therefore not intended to be limiting. For example, the words "first," "second," and other such numerical terms referring to structures or elements do not imply a sequence or order unless clearly indicated by the context. It should also be understood that the term "comprise/include," when used herein, specifies the presence of the stated features, integers, steps, operations, units, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, units, and/or components, and/or combinations thereof.
In the present disclosure, the terms "component" and "system" are intended to refer to a computer-related entity: either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, an object, an executable, a thread of execution, and/or a program. By way of illustration, both an application running on a server and the server itself can be a component. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers.
Those skilled in the art should realize that the boundaries between the operations described above are merely illustrative. Multiple operations may be combined into a single operation, a single operation may be distributed among additional operations, and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments. Other modifications, variations, and substitutions are likewise possible. Accordingly, the specification and drawings should be regarded as illustrative rather than restrictive.
Although some specific embodiments of the present disclosure have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the present disclosure. The embodiments disclosed herein may be combined arbitrarily without departing from the spirit and scope of the present disclosure. Those skilled in the art should also understand that various modifications may be made to the embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (27)

  1. A method for object recognition, comprising:
    receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object;
    in response to the first classification belonging to a first group, displaying a first screen, wherein the first screen includes the first classification; and
    in response to the first classification belonging to a second group, displaying a second screen, wherein the second screen does not include the first classification and includes a prompt requesting a user to input additional information about the recognized object, wherein
    the first group and the second group are established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, wherein the first group includes individual classifications whose recognition accuracy satisfies a first condition and the second group includes individual classifications whose recognition accuracy satisfies a second condition, and wherein
    the first condition is that the recognition accuracy for an individual classification at the taxonomic rank of species is at a first level, the second condition is that the recognition accuracy for an individual classification at the taxonomic rank of genus is at a second level, and the first level is higher than the second level.
  2. The method according to claim 1, wherein the taxonomic rank of the first classification is species.
  3. The method according to claim 1, wherein the object recognition model provides one or more classifications of the recognized object, and the first classification is the classification with the highest confidence among the one or more classifications.
  4. The method according to claim 1, further comprising:
    in response to a first operation by the user while the first screen is displayed, displaying a first additional page, the first additional page including one or more of:
    shooting guidance;
    a prompt for changing the first classification; and
    classifications of individuals morphologically similar to the individual indicated by the first classification.
  5. The method according to claim 1, wherein the prompt requesting the user to input additional information about the recognized object comprises:
    a prompt area requesting the user to input one or more additional images; and/or
    shooting guidance telling the user to capture images from different angles and/or distances.
  6. The method according to claim 1, further comprising:
    in response to input of the additional information while the second screen is displayed, driving the object recognition model to re-recognize the classification of the recognized object based on the first image and the additional information, or based on the additional information;
    receiving a re-recognized second classification of the recognized object from the object recognition model; and
    displaying the second classification.
  7. The method according to claim 1, further comprising:
    in response to the first classification belonging to a third group, displaying a third screen, wherein
    the third group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, the third group including individual classifications whose recognition accuracy satisfies a third condition, wherein the third condition is that the recognition accuracy for an individual classification at the rank of species is at a third level and the recognition accuracy for the individual classification at the rank of genus is at the first level, the third level being lower than the first level and higher than the second level, and wherein
    the third screen includes information about the genus-rank classification corresponding to the first classification.
  8. The method according to claim 7, wherein, after the information about the genus-rank classification, the third screen further includes the first classification.
  9. The method according to claim 8, wherein, after the first classification, the third screen further includes:
    one or more classifications received from the object recognition model whose confidence is lower than that of the first classification, wherein the genus-rank classifications corresponding to the one or more classifications are the same as the genus-rank classification corresponding to the first classification; and/or
    classifications of individuals morphologically similar to the individual indicated by the first classification.
  10. The method according to claim 1, further comprising:
    in response to the first classification belonging to a fourth group, displaying a fourth screen, wherein
    the fourth group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, the fourth group including individual classifications whose recognition accuracy satisfies a fourth condition, the fourth condition being that the recognition accuracy for an individual classification at the rank of species is at a third level and the recognition accuracy for the individual classification at the rank of genus is at the third level, the third level being lower than the first level and higher than the second level, and wherein
    the fourth screen includes the first classification and at least one of:
    one or more classifications received from the object recognition model whose confidence is lower than that of the first classification, wherein the genus-rank classifications corresponding to the one or more classifications are the same as the genus-rank classification corresponding to the first classification; and
    classifications of individuals morphologically similar to the individual indicated by the first classification.
  11. The method according to claim 10, wherein the fourth screen further includes a prompt requesting the user to input additional information about the recognized object.
  12. A method for object recognition, comprising:
    receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object;
    in response to the first classification belonging to a first group, displaying a first screen, wherein the first screen includes the first classification; and
    in response to the first classification belonging to a second group, displaying a second screen, wherein the second screen does not include the first classification and includes a prompt requesting a user to input additional information about the recognized object, wherein
    the first group and the second group are established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, wherein the first group includes individual classifications whose recognition accuracy satisfies a first condition and the second group includes individual classifications whose recognition accuracy satisfies a second condition, and wherein
    the first condition is that the recognition accuracy for an individual classification at the rank of species is higher than a first threshold, the second condition is that the recognition accuracy for an individual classification at the rank of species is lower than a second threshold, and the first threshold is higher than the second threshold.
  13. The method according to claim 12, further comprising:
    in response to the first classification belonging to a third group, displaying a third screen, wherein
    the third group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, the third group including individual classifications whose recognition accuracy satisfies a third condition, wherein
    the third condition is that the recognition accuracy for an individual classification at the rank of species falls within a first range whose upper limit is lower than the first threshold and whose lower limit is higher than the second threshold, and
    the third screen includes the first classification, together with:
    classifications of individuals morphologically similar to the individual indicated by the first classification; and/or
    one or more classifications received from the object recognition model whose confidence is lower than that of the first classification, wherein the genus-rank classifications corresponding to the one or more classifications are the same as the genus-rank classification corresponding to the first classification.
  14. The method according to claim 13, wherein the third group includes a first subgroup and a second subgroup, the method further comprising:
    in response to the first classification belonging to the first subgroup, displaying a first sub-screen, and in response to the first classification belonging to the second subgroup, displaying a second sub-screen, wherein
    the first subgroup includes individual classifications whose recognition accuracy further satisfies a first sub-condition, and the second subgroup includes individual classifications whose recognition accuracy further satisfies a second sub-condition,
    the first sub-condition is that the recognition accuracy for an individual classification at the rank of genus is higher than the first threshold, and the second sub-condition is that the recognition accuracy for an individual classification at the rank of genus falls within the first range, and
    the first sub-screen includes information about the genus-rank classification corresponding to the first classification, while the second sub-screen does not include information about the genus-rank classification corresponding to the first classification.
  15. The method according to claim 12, wherein the first threshold is greater than or equal to 80%.
  16. The method according to claim 12, wherein the second threshold is less than or equal to 35%.
  17. The method according to claim 13, wherein the first range includes the numerical interval from 45% to 65%.
  18. A method for object recognition, comprising:
    receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object; and
    in response to the first classification belonging to a pre-established group, displaying information about the genus-rank classification corresponding to the first classification, wherein
    the group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, and the group includes individual classifications whose recognition accuracy at the rank of species is lower than a first threshold and whose recognition accuracy at the rank of genus is higher than a second threshold.
  19. The method according to claim 18, further comprising: displaying the first classification after the information about the genus-rank classification.
  20. The method according to claim 19, further comprising: displaying, after the first classification:
    one or more classifications received from the object recognition model whose confidence is lower than that of the first classification, wherein the genus-rank classifications corresponding to the one or more classifications are the same as the genus-rank classification corresponding to the first classification; and/or
    classifications of individuals morphologically similar to the individual indicated by the first classification.
  21. A method for object recognition, comprising:
    receiving a first classification of a recognized object from a pre-established object recognition model, the object recognition model recognizing the classification of the recognized object based on a first image presenting at least a part of the recognized object; and
    in response to the first classification belonging to a pre-established group, not displaying the first classification and displaying a prompt requesting a user to input additional information about the recognized object, wherein
    the group is established based on the statistical recognition accuracy of the object recognition model for individual classifications in the object population it targets, and the group includes individual classifications whose recognition accuracy is lower than a threshold.
  22. The method according to claim 21, wherein the prompt requesting the user to input additional information about the recognized object comprises:
    a prompt area requesting the user to input one or more additional images; and/or
    shooting guidance telling the user to capture images from different angles and/or distances.
  23. The method according to claim 22, further comprising:
    in response to input of a predetermined number of additional images, or in response to input of fewer than the predetermined number of additional images together with an operation indicating the start of re-recognition, driving the object recognition model to re-recognize the classification of the recognized object based on the first image and the additional images, or based on the additional images;
    receiving a re-recognized second classification of the recognized object from the object recognition model; and
    displaying the second classification.
  24. An electronic device, comprising:
    one or more processors configured to cause the electronic device to perform the method according to any one of claims 1-23.
  25. An apparatus for operating an electronic device, comprising:
    one or more processors configured to cause the electronic device to perform the method according to any one of claims 1-23.
  26. A computer system for object recognition, comprising:
    one or more processors; and
    one or more memories configured to store computer-executable instructions and computer-accessible data associated with the computer-executable instructions,
    wherein the computer-executable instructions, when executed by the one or more processors, cause the computer system to perform the method according to any one of claims 1-23.
  27. A non-transitory computer-readable storage medium having computer-executable instructions stored thereon which, when executed by one or more computer systems, cause the one or more computer systems to perform the method according to any one of claims 1-23.
PCT/CN2022/073987 2021-02-08 2022-01-26 用于对象识别的方法、计算机系统及电子设备 WO2022166706A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110171761.3A CN112784925A (zh) 2021-02-08 2021-02-08 用于对象识别的方法、计算机系统及电子设备
CN202110171761.3 2021-02-08

Publications (1)

Publication Number Publication Date
WO2022166706A1 true WO2022166706A1 (zh) 2022-08-11

Family

ID=75761265

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/073987 WO2022166706A1 (zh) 2021-02-08 2022-01-26 用于对象识别的方法、计算机系统及电子设备

Country Status (2)

Country Link
CN (1) CN112784925A (zh)
WO (1) WO2022166706A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784925A (zh) * 2021-02-08 2021-05-11 杭州睿胜软件有限公司 用于对象识别的方法、计算机系统及电子设备
CN113313193A (zh) * 2021-06-15 2021-08-27 杭州睿胜软件有限公司 植物图片识别方法、可读存储介质及电子设备
CN113298180A (zh) * 2021-06-15 2021-08-24 杭州睿胜软件有限公司 用于植物识别的方法和计算机系统
CN115203451A (zh) * 2022-08-03 2022-10-18 杭州睿胜软件有限公司 用于植物图像的识别处理方法、系统和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190114333A1 (en) * 2017-10-13 2019-04-18 International Business Machines Corporation System and method for species and object recognition
CN110490086A (zh) * 2019-07-25 2019-11-22 杭州睿琪软件有限公司 一种用于对象识别结果二次确认的方法及系统
CN110852376A (zh) * 2019-11-11 2020-02-28 杭州睿琪软件有限公司 用于识别生物种类的方法及系统
CN112270297A (zh) * 2020-11-13 2021-01-26 杭州睿琪软件有限公司 用于显示识别结果的方法和计算机系统
CN112784925A (zh) * 2021-02-08 2021-05-11 杭州睿胜软件有限公司 用于对象识别的方法、计算机系统及电子设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190114333A1 (en) * 2017-10-13 2019-04-18 International Business Machines Corporation System and method for species and object recognition
CN110490086A (zh) * 2019-07-25 2019-11-22 杭州睿琪软件有限公司 一种用于对象识别结果二次确认的方法及系统
CN110852376A (zh) * 2019-11-11 2020-02-28 杭州睿琪软件有限公司 用于识别生物种类的方法及系统
CN112270297A (zh) * 2020-11-13 2021-01-26 杭州睿琪软件有限公司 用于显示识别结果的方法和计算机系统
CN112784925A (zh) * 2021-02-08 2021-05-11 杭州睿胜软件有限公司 用于对象识别的方法、计算机系统及电子设备

Also Published As

Publication number Publication date
CN112784925A (zh) 2021-05-11

Similar Documents

Publication Publication Date Title
WO2022166706A1 (zh) 用于对象识别的方法、计算机系统及电子设备
US9721183B2 (en) Intelligent determination of aesthetic preferences based on user history and properties
US10242250B2 (en) Picture ranking method, and terminal
WO2020119350A1 (zh) 视频分类方法、装置、计算机设备和存储介质
US8121358B2 (en) Method of grouping images by face
US8750602B2 (en) Method and system for personalized advertisement push based on user interest learning
US10949702B2 (en) System and a method for semantic level image retrieval
US11461386B2 (en) Visual recognition using user tap locations
US20230162466A1 (en) Method and computer system for displaying identification result
CN104994426B (zh) 节目视频识别方法及系统
US10380461B1 (en) Object recognition
US9569498B2 (en) Using image features to extract viewports from images
US11853368B2 (en) Method and system for identifying and displaying an object
WO2021219117A1 (zh) 图像检索方法、图像检索装置、图像检索系统及图像显示系统
JP2016057901A (ja) 画像処理装置、画像処理方法、プログラムおよび記録媒体
WO2023138298A1 (zh) 判断植物的容器是否适合植物的养护的方法和装置
CN112101300A (zh) 药材识别方法、装置及电子设备
JP6314071B2 (ja) 情報処理装置、情報処理方法及びプログラム
KR20190091774A (ko) 딥러닝을 이용한 음식인식 및 식단관리 시스템
WO2021190412A1 (zh) 一种生成视频缩略图的方法、装置和电子设备
US20180189602A1 (en) Method of and system for determining and selecting media representing event diversity
JP6244887B2 (ja) 情報処理装置、画像探索方法、及びプログラム
CN110490027B (zh) 人脸特征提取训练方法及系统
US20180157666A1 (en) System and method for determining a social relativeness between entities depicted in multimedia content elements
CN111597906B (zh) 一种结合文字信息的快速绘本识别方法及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22748985

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22748985

Country of ref document: EP

Kind code of ref document: A1