CN117372858A - Article identification method and device, electronic equipment and storage medium - Google Patents

Article identification method and device, electronic equipment and storage medium

Info

Publication number
CN117372858A
CN117372858A CN202210784296.5A CN202210784296A
Authority
CN
China
Prior art keywords
preview image
preset object
preview
identification
target area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210784296.5A
Other languages
Chinese (zh)
Inventor
尹宏轶
钟铭
朱政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202210784296.5A priority Critical patent/CN117372858A/en
Publication of CN117372858A publication Critical patent/CN117372858A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/82 Arrangements using pattern recognition or machine learning using neural networks
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval of still image data
    • G06F 16/55 Clustering; Classification
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 Retrieval using metadata automatically derived from the content
    • G06F 16/70 Information retrieval of video data
    • G06F 16/75 Clustering; Classification
    • G06F 16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/783 Retrieval using metadata automatically derived from the content
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR SUCH PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0623 Item investigation
    • G06Q 30/0625 Directed, with specific intent or strategy
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/08 Learning methods

Abstract

The disclosure relates to an article identification method and device, an electronic device, and a storage medium. The method includes the following steps: invoking a camera to capture images and displaying the captured preview image in an image preview interface, the preview image containing a preset object; performing preset-object recognition on the preview image to determine the position of the preset object in the preview image, and determining a target area to be identified according to the determined position; and performing article identification on the target area to obtain an identification result.

Description

Article identification method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of terminals, and in particular relates to an article identification method and device, electronic equipment and a storage medium.
Background
Users often encounter situations in daily life where it is necessary to identify items.
To this end, the related art proposes a method of identifying an article through image capture: the user photographs the article to be identified with a terminal, and the terminal identifies the article from the captured image to obtain an identification result.
Disclosure of Invention
The disclosure provides an article identification method and device, an electronic device, and a storage medium, which can complete article identification of a photographed subject during the preview stage.
According to a first aspect of the present disclosure, there is provided an article identification method comprising:
invoking a camera to capture images and displaying the captured preview image in an image preview interface, the preview image containing a preset object;
performing preset-object recognition on the preview image to determine the position of the preset object in the preview image, and determining a target area to be identified according to the determined position;
and performing article identification on the target area to obtain an identification result.
According to a second aspect of the present disclosure, there is provided an article identification device comprising:
an acquisition unit, configured to invoke a camera to capture images and display the captured preview image in an image preview interface, the preview image containing a preset object;
a first identification unit, configured to perform preset-object recognition on the preview image to determine the position of the preset object in the preview image, and determine a target area to be identified according to the determined position;
and a second identification unit, configured to perform article identification on the target area to obtain an identification result.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor implements the method of the first aspect by executing the executable instructions.
According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method according to the first aspect.
In the technical solution of the present disclosure, after image capture, the captured preview image may be displayed in an image preview interface, where the preview image may contain a preset object. On this basis, preset-object recognition may be performed on the preview image to determine a target area to be identified according to the position of the preset object, and article identification may then be performed on the target area to obtain an identification result.
It should be appreciated that, since the present disclosure determines the target area to be identified according to the preset object, the user only needs to indicate the position of the article to be identified through the preset object; the area where the article is located can then be determined during the preview stage of the image, and article identification of that article completed there, thereby avoiding the related-art problem that article identification can be completed only after the image has been stored.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of an article identification method shown in an exemplary embodiment of the present disclosure;
FIG. 2 is a flow chart of another article identification method shown in an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a preview screen illustrating an exemplary embodiment of the present disclosure;
FIG. 4 is a block diagram of an article identification device shown in an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device in an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. The word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination", depending on the context.
In the related art, if a user needs to identify an article, an image of the article is generally captured with a terminal such as a mobile phone, so that the terminal can identify the article from the captured image.
In this process, the captured image generally contains background clutter, so the terminal cannot determine which article actually needs to be identified. Therefore, to ensure the accuracy of article identification, the captured image is not identified directly; instead, after the terminal stores the captured image, the user manually crops the stored image to obtain a cropped image containing only the article to be identified, and article identification is then performed on the cropped image to obtain the identification result.
Clearly, the article identification method in the related art must be performed after image storage is complete. In practice, however, the user usually only needs to identify the article and has no later use for the image; that is, the user does not want to store the image containing the article. When the related-art method is adopted, the user therefore usually has to delete the image manually, or the image used for article identification will occupy storage space.
Therefore, the present disclosure proposes an article identification method that can complete article identification during the image preview stage, thereby avoiding the related-art problem that article identification must be performed after image storage is complete, which occupies storage space or requires the user to perform an additional deletion operation.
Fig. 1 is a flow chart of an article identification method according to an exemplary embodiment of the present disclosure. As shown in Fig. 1, the method may include the following steps:
Step 102, a camera is invoked to capture images, and the captured preview image is displayed in an image preview interface, the preview image containing a preset object.
As is apparent from the above description, the article identification method in the related art must be performed after the image is stored, because in the related art the area where the article is located has to be determined by the user through manual cropping.
In view of this, the present disclosure does not rely on manual cropping by the user to determine the area where the article is located; instead, when the image is captured, the user indicates the location of the article to be identified through a preset object. On this basis, the terminal can recognize the preset object during the preview stage to determine its position, and then determine the target area requiring article identification according to that position.
It should be understood that the present disclosure effectively replaces the user's manual cropping with an indication via the preset object, that is, replaces a post-hoc cropping operation with an up-front indicating operation, so that the terminal can complete article identification during the image preview stage without performing subsequent operations such as storage, avoiding the related-art problem that article identification can be performed only after image capture and storage are complete.
It should be noted that what the article to be identified actually is generally depends on the user's needs, and the present disclosure does not limit it.
Step 104, preset-object recognition is performed on the preview image to determine the position of the preset object in the preview image, and a target area to be identified is determined according to the determined position.
In the present disclosure, to enable the terminal to accurately recognize, within a preview image that also contains background content, the preset object used to indicate the location of the article to be identified, an object containing several key points may be used as the preset object. On this basis, the terminal can perform key point detection on the preview image to determine the positions, in the preview image, of the key points contained in the preset object, and can then determine the position of the preset object in the preview image from the positions of those key points.
In the present disclosure, the key points of the preset object are points that reflect its structure. For example, when the preset object is a specific limb of the user, the key points may be the joint points of that limb. In other words, when the preset object is a specific limb, joint point detection may be performed on the preview image to determine the position of each joint point of the limb in the preview image, and the position of the preset object may then be determined from those joint point positions. For instance, the specific limb may be the user's two hands, in which case the finger joints, the wrist joint, and so on may serve as the joint points to be detected. Of course, the two hands are merely illustrative; the specific limb may be set by those skilled in the art according to actual circumstances, for example one hand, a sole, or a head, and the present disclosure is not limited in this respect.
In the present disclosure, objects other than specific limbs may also serve as the preset object. For example, the preset object may be stationery such as a ruler or a pen, or a material such as a wooden stick or a steel pipe. Which type of object is used as the preset object to indicate the position of the subject may be determined by those skilled in the art according to actual needs, and the present disclosure does not limit this.
In the present disclosure, key point detection for the preset object may be performed in various ways. For example, a technician may collect several sample images containing the preset object and annotate the positions of the preset object's key points in each image. The sample images can then be used as the model input and the annotated key point positions as the model output, so that a key point detection model can be obtained through training. Once the model is trained, a preview image can be fed to it, and the model will output the positions of all key points of the preset object in that image. In practice, the model may be trained as a neural network; this is, of course, only illustrative, and how the key point detection model is specifically trained can be determined by those skilled in the art according to the actual situation, which the present disclosure does not limit.
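As an illustration of the training procedure just described, a minimal sketch in Python/PyTorch follows. The network KeypointNet, the number of key points, the tensor shapes, and the synthetic data are all hypothetical stand-ins; the patent does not specify a model architecture.

```python
# Minimal sketch of training a key point detection model, assuming
# PyTorch; the architecture and the data below are illustrative only.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21  # assumed number of key points for one preset object

class KeypointNet(nn.Module):
    """Toy CNN that regresses (x, y) coordinates for each key point."""
    def __init__(self, num_keypoints: int = NUM_KEYPOINTS):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_keypoints * 2)
        self.num_keypoints = num_keypoints

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x).flatten(1)
        return self.head(feats).view(-1, self.num_keypoints, 2)

model = KeypointNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One training step: sample images are the model input, the annotated
# key point positions are the target output, as the text describes.
images = torch.rand(8, 3, 224, 224)          # a batch of sample images
targets = torch.rand(8, NUM_KEYPOINTS, 2)    # annotated key point positions
optimizer.zero_grad()
loss = loss_fn(model(images), targets)
loss.backward()
optimizer.step()

# At preview time, the preview image is fed to the trained model, which
# outputs the key point positions in that image.
with torch.no_grad():
    keypoints = model(torch.rand(1, 3, 224, 224))  # shape (1, NUM_KEYPOINTS, 2)
```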
In the present disclosure, after the position of the preset object is determined, the target area to be identified may be determined according to that position. In one possible embodiment, a key point belonging to a specific part of the preset object may be designated as a target key point, and the target area is then determined according to the position of the target key point. For example, when the preset object is the two hands, a specific finger of each hand may serve as a target key point, so that the target area is determined from the positions of those fingers; e.g., when the index fingers of both hands are designated as target key points, the target area can be further determined from the positions of the two index fingers once the positions of the hands have been determined. Of course, this example is merely illustrative, and how the target key points of the preset object are set can be determined by those skilled in the art according to actual needs, which the present disclosure does not limit.
In the present disclosure, the target area to be identified may be determined from the positions of the target key points in various ways. For example, after the positions of the target key points are determined, an area of a preset shape constructed from them may be taken as the target area to be identified. In other words, the target key points serve as position reference points for the target area, which in turn delimits the area where the article to be identified is located.
It should be emphasized that the above-mentioned preset shape may be any shape, for example a rectangle, a circle, a diamond, etc. Continuing the example in which the index fingers of both hands are the target key points: if the preset shape is a rectangle, the positions of the two index fingers may be taken as diagonal corners of the rectangle to determine the target area; if the preset shape is a circle, the line segment connecting the two index finger positions may be taken as the diameter, with its midpoint as the center, to determine the target area. This example is likewise illustrative; which shape the preset shape takes, and how the target area is determined from the preset shape and the target key points, can be decided by those skilled in the art according to actual needs, which the present disclosure does not limit.
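To make the rectangle and circle examples concrete, here is a small sketch (in Python, for consistency with the example above) of deriving a target area of a preset shape from two target key points. The function names and return formats are assumptions made for illustration.

```python
# Sketch: derive the target area of a preset shape from two target key
# points, following the rectangle and circle examples in the text.
from typing import Tuple

Point = Tuple[float, float]

def rect_from_diagonal(p1: Point, p2: Point) -> Tuple[float, float, float, float]:
    """Treat the two key points as diagonal corners; return (left, top, right, bottom)."""
    return (min(p1[0], p2[0]), min(p1[1], p2[1]),
            max(p1[0], p2[0]), max(p1[1], p2[1]))

def circle_from_diameter(p1: Point, p2: Point) -> Tuple[Point, float]:
    """Treat the segment between the key points as a diameter; return (center, radius)."""
    center = ((p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0)
    radius = (((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5) / 2.0
    return center, radius

# e.g. with the detected index finger tips of the left and right hand:
left_tip, right_tip = (120.0, 340.0), (560.0, 90.0)
print(rect_from_diagonal(left_tip, right_tip))   # (120.0, 90.0, 560.0, 340.0)
print(circle_from_diameter(left_tip, right_tip))
```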
In the present disclosure, besides key point detection on the preview image, the position of the preset object may be determined in other ways. For example, feature information may be preconfigured for the preset object according to its characteristics, and feature recognition may then be performed on the preview image based on that information to determine the position of the preset object; e.g., when the preset object is a palm, the palm's contour may be preset and used as the feature information. This example is merely illustrative, and how feature recognition is specifically performed can be determined by those skilled in the art according to actual needs, which the present disclosure does not limit.
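One possible realization of this contour-based feature recognition is sketched below with OpenCV shape matching; the binarization step, the matching threshold, and the idea of comparing every contour against a stored palm-contour template are assumptions, not details given by the patent.

```python
# Sketch: locate the preset object by matching preview-image contours
# against a preconfigured template contour (e.g. a palm outline).
import cv2
import numpy as np

def find_preset_object(preview_bgr: np.ndarray,
                       template_contour: np.ndarray,
                       threshold: float = 0.2):
    """Return the bounding box (x, y, w, h) of the best-matching contour,
    or None if no contour is similar enough to the template."""
    gray = cv2.cvtColor(preview_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    best, best_score = None, threshold
    for contour in contours:
        # matchShapes: a lower score means more similar shapes.
        score = cv2.matchShapes(template_contour, contour,
                                cv2.CONTOURS_MATCH_I1, 0.0)
        if score < best_score:
            best, best_score = contour, score
    return cv2.boundingRect(best) if best is not None else None
```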
Step 106, article identification is performed on the target area to obtain an identification result.
In the present disclosure, after determining a target area to be identified, article identification may be performed on the area to obtain an identification result.
In one embodiment, the user's purpose in identifying the article may be to learn its type; in other words, the article to be identified is unknown to the user. In this case, article identification may be performed on the target area to determine the type of the article contained in it, and that article type is taken as the identification result.
In another embodiment, the user's purpose in identifying the article may be data retrieval. For example, when the user sees a certain article and wants to purchase the same one, the article can be identified by the above method; likewise, when the user wants to learn information about a certain commodity, the commodity can be recognized in the above manner. In this embodiment, the picture content of the target area may be used as a search keyword to obtain several search results.
Of course, the above two implementations are merely illustrative, and specific how to perform the article identification to obtain the identification result may be determined by those skilled in the art according to actual needs, which is not limited in this disclosure.
It should be noted that the execution subject of the technical solution of the present disclosure may be any type of electronic device; for example, it may be a mobile terminal such as a smartphone or a tablet computer, or a fixed terminal such as a smart TV or a PC (personal computer). It should be understood that only an electronic device having an image capture function can serve as the electronic device in the present disclosure; which type of electronic device acts as the execution subject can be determined by those skilled in the art according to actual needs, and the present disclosure does not limit this.
According to the above technical solution, after image capture, the captured preview image can be displayed in the image preview interface, where the preview image may contain a preset object. On this basis, preset-object recognition can be performed on the preview image to determine the target area to be identified according to the position of the preset object, and article identification can then be performed on the target area to obtain an identification result.
It should be appreciated that, since the target area to be identified is determined according to the preset object, the user only needs to indicate the position of the article to be identified through the preset object; the area where the article is located can then be determined during the preview stage of the image, and article identification of that article completed there, avoiding the related-art problem that article identification can be completed only after the image has been stored.
Further, a specific limb of the user may serve as the preset object; for example, the specific limb may be the user's palm. In that case, the user only needs to indicate the approximate position of the subject with the palm, and article identification can be performed on the device's preview screen, which is simple for the user to operate.
In the following, the technical solution of the present disclosure will be described by taking the hands of the user as the preset object and taking the mobile phone as the execution subject.
FIG. 2 is a flow chart of another article identification method shown in an exemplary embodiment of the present disclosure. As shown in Fig. 2, the method may include the following steps:
step 201, the camera APP is started.
In this embodiment, when a user needs to identify an article, the camera APP pre-installed in the mobile phone may be started to capture an image of the article. In actual operation, several selectable operating modes, such as an article identification mode and an image shooting mode, may be displayed after the camera APP is started. In this embodiment, the user can select the article identification mode to identify the photographed subject.
It should be noted that, since both hands serve as the preset object in this embodiment, the user may fix the mobile phone before capturing images, or have another person aim the camera at the area containing both the article to be identified and the user's two hands.
Step 202, image acquisition is performed through a camera.
In this embodiment, after determining that the user has selected the article identification mode, the camera APP may invoke the camera to capture images of the article to be identified and display the captured images on the preview screen, so that the user can adjust the shooting angle and the mobile phone can identify the article conveniently.
In step 203, the preview image is displayed on the preview screen.
Step 204, the preview image is input into the joint point detection model.
In this embodiment, a pre-trained joint point detection model may be built into the mobile phone. During training, several sample images containing left and right hands serve as the model input, and the coordinates of each joint point of the left and right hands serve as the model output, so as to train the joint point detection model.
On this basis, the preview image can be used as the input of the joint point detection model, which outputs the coordinates of all joint points of both hands. In this embodiment, the finger joints, the wrist joint, and the fingertips serve as the joint points of a hand: each of the four fingers other than the thumb contributes 4 joint points, the thumb contributes 3, and the wrist contributes 1, so 20 joint point coordinates are output for a single hand, and 40 for both hands.
It should be noted that, during training, left-hand and right-hand labels can also be used as model outputs. In that case, after the preview image is input into the joint point detection model, the output contains not only the coordinates of the 40 joint points but also a left-hand or right-hand identifier for each joint point; in practical applications, the model may also output these identifiers together with the coordinates. The specific form of the identifiers can be determined by those skilled in the art; for example, a 1 may identify the left hand and a 0 the right hand, which this embodiment does not limit.
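A small sketch of consuming such model output follows: 40 joint point coordinates plus a per-point left/right identifier (1 for left, 0 for right, as in the example above). The list-and-dictionary layout is an assumption made for illustration.

```python
# Sketch: group the 40 output joint points into left and right hands
# using the per-point left/right identifiers (1 = left, 0 = right).
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def split_hands(coords: List[Point], hand_ids: List[int]) -> Dict[str, List[Point]]:
    assert len(coords) == len(hand_ids) == 40, "expect 20 joint points per hand"
    hands: Dict[str, List[Point]] = {"left": [], "right": []}
    for point, hand_id in zip(coords, hand_ids):
        hands["left" if hand_id == 1 else "right"].append(point)
    return hands
```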
Step 205, the joint point coordinates output by the joint point detection model are acquired.
Step 206, the index finger positions of both hands are determined according to the output joint point coordinates.
In this embodiment, the area where the subject is located is determined with the user's index fingers as the reference. Therefore, after the coordinates of each joint point of both hands are obtained, the index finger positions can be determined from those coordinates, and the area where the subject is located can then be determined from the index finger positions. For example, the index finger position can be determined from the relative positional relationship of the joint points: since the thumb has only 3 joint points and the index finger is adjacent to the thumb, the index finger position can be inferred indirectly from the position of the thumb. This is, of course, merely illustrative; how the index finger position is determined from the coordinates can be decided by those skilled in the art according to actual requirements, which this embodiment does not limit.
Of course, instead of deriving the index finger position from the joint point coordinates, the index finger position may itself be one of the model outputs used when training the joint point detection model. In that case the output of the joint point detection model already contains the index finger position, and no separate derivation from the output coordinates is needed. This, too, is illustrative; how the index finger position is specifically determined can be decided by those skilled in the art according to actual needs, which the present disclosure does not limit.
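If the model emits each hand's 20 joint points in a fixed, documented order, reading off the index finger tip reduces to an index lookup, as in this sketch; the slot INDEX_TIP in that order is hypothetical.

```python
# Sketch: pick the index finger tip out of a hand's joint point list,
# assuming the model outputs joints in a fixed order. INDEX_TIP is a
# hypothetical position of the index fingertip in that order.
from typing import List, Tuple

INDEX_TIP = 7  # assumed slot of the index fingertip among the 20 points

def index_finger_tip(hand_joints: List[Tuple[float, float]]) -> Tuple[float, float]:
    assert len(hand_joints) == 20
    return hand_joints[INDEX_TIP]
```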
In step 207, a rectangular area is determined based on the index finger positions of the hands.
In this embodiment, after the positions of the index fingers of both hands are determined, the two index finger positions can be taken as diagonal corners of a rectangle to determine a rectangular area, and this rectangular area serves as the target area to be identified.
For example, as shown in fig. 3, the preview screen includes a subject 31, a determined rectangular region 32, a left hand 33 of the user, a right hand 34 of the user, and a preview region frame 35.
Step 208, article identification is performed on the picture content in the rectangular area.
After the rectangular area is determined, article identification can be performed on the picture content within it to obtain the identification result; the specific identification methods have been described above and are not repeated here.
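Putting the last two steps together, a sketch follows: crop the rectangular target area out of the preview frame and hand only that crop to whatever recognition backend is used. The classify callable is purely a placeholder for such a backend.

```python
# Sketch: crop the rectangular target area from the preview frame and
# classify only the crop; `classify` is a placeholder for the actual
# article identification backend.
import numpy as np

def identify_article(preview: np.ndarray,
                     rect: tuple,          # (left, top, right, bottom)
                     classify) -> str:
    left, top, right, bottom = (int(v) for v in rect)
    crop = preview[top:bottom, left:right]  # picture content of the target area
    return classify(crop)                   # e.g. an article type such as "cup"
```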
According to the above, with the technical solution of this embodiment the user only needs to indicate the position of the photographed subject by hand, and the mobile phone can identify the article to be identified based on the preview picture. This avoids the related-art problem of having to complete the shooting operation for the article first and being able to identify the photographed subject only after the captured image has been stored.
It should be noted that the above manner is merely illustrative; in practice, a single hand of the user may equally serve as the specific limb, with one or more fingers of that hand as the target joint points. For example, the user's right hand may serve as the specific limb, with its thumb and index finger designated as target joint points, so that the rectangular area is determined from the coordinates of the right thumb and the right index finger. In this case, the user can hold the mobile phone with one hand for position adjustment while the other hand serves as the preset object to indicate the position of the article to be identified, so that a single person can complete the article identification operation.
Fig. 4 is a block diagram of an article identification device according to an exemplary embodiment of the present disclosure. Referring to fig. 4, the apparatus includes an acquisition unit 401, a first recognition unit 402, and a second recognition unit 403.
The acquisition unit 401 invokes a camera to acquire images so as to display acquired preview images in an image preview interface, wherein the preview images comprise preset objects;
the first recognition unit 402 performs preset object recognition on the preview image to determine the position of the preset object in the preview image, and determines a target area to be recognized according to the determined position;
the second identifying unit 403 identifies the object in the target area to obtain an identification result.
Optionally, the first identifying unit 402 is further configured to:
detecting key points of the preview image to determine the positions of the key points contained in the preset object in the preview image;
and determining the position of the preset object in the preview image according to the position of each key point in the preview image.
Optionally, the preset object is a specific limb of the user; the first recognition unit 402 is further configured to:
and detecting the joint points of the preview image to determine the positions of the joint points contained in the specific limb in the preview image.
Optionally, the first identifying unit 402 is further configured to:
and inputting the preview image into a key point detection model so that the key point detection model outputs the positions of the key points contained in the preset object in the preview image.
Optionally, the first identifying unit 402 is further configured to:
and determining target key points belonging to the specific part of the preset object, and determining a target area to be identified according to the positions of the target key points.
Optionally, the first identifying unit 402 is further configured to:
and determining the area with the preset shape determined according to the target key points as a target area to be identified.
Optionally, the first identifying unit 402 is further configured to:
and carrying out feature recognition on the preview image according to the pre-configured feature information of the preset object so as to determine the position of the preset object in the preview image.
Optionally, the second identifying unit 403 is further configured to:
performing item identification on the target area to determine the item type of the items contained in the target area; or,
and taking the picture content of the target area as a search keyword to search for a plurality of related search results.
Since the device embodiments substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for the relevant points. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over several network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the disclosed solution. Those of ordinary skill in the art can understand and implement them without creative effort.
Correspondingly, the disclosure also provides an article identification device, comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to implement the method for identifying an item according to any of the above embodiments, for example the method may comprise: invoking a camera to collect images so as to display collected preview images in an image preview interface, wherein the preview images comprise preset objects; carrying out preset object identification on the preview image to determine the position of the preset object in the preview image, and determining a target area to be identified according to the determined position; and carrying out object identification on the target area to obtain an identification result.
Accordingly, the present disclosure also provides an electronic device including a memory, and one or more programs, wherein the one or more programs are stored in the memory, and configured to be executed by the one or more processors, the one or more programs containing instructions for implementing the article identification method according to any of the above embodiments, for example, the method may include: invoking a camera to collect images so as to display collected preview images in an image preview interface, wherein the preview images comprise preset objects; carrying out preset object identification on the preview image to determine the position of the preset object in the preview image, and determining a target area to be identified according to the determined position; and carrying out object identification on the target area to obtain an identification result.
Fig. 5 is a block diagram illustrating an apparatus 500 for implementing a method of item identification, according to an example embodiment. For example, the apparatus 500 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, or the like.
Referring to fig. 5, an apparatus 500 may include one or more of the following components: a processing component 502, a memory 504, a power supply component 506, a multimedia component 508, an audio component 510, an input/output (I/O) interface 512, a sensor component 514, and a communication component 516.
The processing component 502 generally controls overall operation of the apparatus 500, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 502 may include one or more processors 520 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 502 can include one or more modules that facilitate interactions between the processing component 502 and other components. For example, the processing component 502 can include a multimedia module to facilitate interaction between the multimedia component 508 and the processing component 502.
The memory 504 is configured to store various types of data to support operations at the apparatus 500. Examples of such data include instructions for any application or method operating on the apparatus 500, contact data, phonebook data, messages, pictures, videos, and the like. The memory 504 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 506 provides power to the various components of the device 500. The power components 506 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 500.
The multimedia component 508 includes a screen between the device 500 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 508 includes a front-facing camera and/or a rear-facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the apparatus 500 is in an operational mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 510 is configured to output and/or input audio signals. For example, the audio component 510 includes a Microphone (MIC) configured to receive external audio signals when the device 500 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 504 or transmitted via the communication component 516. In some embodiments, the audio component 510 further comprises a speaker for outputting audio signals.
The I/O interface 512 provides an interface between the processing component 502 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 514 includes one or more sensors for providing status assessments of various aspects of the device 500. For example, the sensor assembly 514 may detect the on/off state of the device 500 and the relative positioning of components, such as the display and keypad of the device 500; it may also detect a change in the position of the device 500 or of one of its components, the presence or absence of user contact with the device 500, the orientation or acceleration/deceleration of the device 500, and a change in its temperature. The sensor assembly 514 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 514 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 514 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 516 is configured to facilitate communication between the device 500 and other devices in a wired or wireless manner. The device 500 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, 4G LTE, 5G NR (New Radio), or a combination thereof. In one exemplary embodiment, the communication component 516 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 516 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 500 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 504, including instructions executable by processor 520 of apparatus 500 to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles thereof and including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The foregoing description of the preferred embodiments of the present disclosure is not intended to limit the disclosure, but rather to cover all modifications, equivalents, improvements and alternatives falling within the spirit and principles of the present disclosure.

Claims (11)

1. An article identification method, comprising:
invoking a camera to collect images so as to display collected preview images in an image preview interface, wherein the preview images comprise preset objects;
carrying out preset object identification on the preview image to determine the position of the preset object in the preview image, and determining a target area to be identified according to the determined position;
and carrying out object identification on the target area to obtain an identification result.
2. The method of claim 1, wherein the performing the preset object recognition on the preview image to determine the position of the preset object in the preview image comprises:
detecting key points of the preview image to determine the positions of the key points contained in the preset object in the preview image;
and determining the position of the preset object in the preview image according to the position of each key point in the preview image.
3. The method of claim 2, wherein the preset object is a specific limb of the user; the detecting the key points of the preview image to determine the positions of the key points included in the preset object in the preview image includes:
and detecting the joint points of the preview image to determine the positions of the joint points contained in the specific limb in the preview image.
4. The method according to claim 2, wherein the performing keypoint detection on the preview image to determine the position of each keypoint included in the preset object in the preview image includes:
and inputting the preview image into a key point detection model so that the key point detection model outputs the positions of the key points contained in the preset object in the preview image.
5. The method of claim 2, wherein the determining the target area to be identified based on the determined location comprises:
and determining target key points belonging to the specific part of the preset object, and determining a target area to be identified according to the positions of the target key points.
6. The method of claim 5, wherein the determining the target area to be identified based on the location of the target keypoint comprises:
and determining the area with the preset shape determined according to the target key points as a target area to be identified.
7. The method of claim 1, wherein the performing the preset object recognition on the preview image to determine the position of the preset object in the preview image comprises:
and carrying out feature recognition on the preview image according to the pre-configured feature information of the preset object so as to determine the position of the preset object in the preview image.
8. The method of claim 1, wherein the identifying the object area to obtain the identification result comprises:
performing item identification on the target area to determine the item type of the items contained in the target area; or,
and taking the picture content of the target area as a search keyword to search for a plurality of related search results.
9. An article identification device, comprising:
the acquisition unit is used for calling the camera to acquire images so as to display acquired preview images in an image preview interface, wherein the preview images comprise preset objects;
the first identification unit is used for carrying out preset object identification on the preview image so as to determine the position of the preset object in the preview image and determine a target area to be identified according to the determined position;
and the second identification unit is used for carrying out article identification on the target area so as to obtain an identification result.
10. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any of claims 1-8 by executing the executable instructions.
11. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method according to any of claims 1-8.
CN202210784296.5A 2022-06-28 2022-06-28 Article identification method and device, electronic equipment and storage medium Pending CN117372858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210784296.5A CN117372858A (en) 2022-06-28 2022-06-28 Article identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210784296.5A CN117372858A (en) 2022-06-28 2022-06-28 Article identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117372858A 2024-01-09

Family

ID=89401075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210784296.5A Pending CN117372858A (en) 2022-06-28 2022-06-28 Article identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117372858A (en)

Similar Documents

Publication Publication Date Title
JP6310556B2 (en) Screen control method and apparatus
CN105845124B (en) Audio processing method and device
EP3179408A2 (en) Picture processing method and apparatus, computer program and recording medium
EP3125155A1 (en) Image-based communication method and device
EP2911051A1 (en) Input method and device
US20170344177A1 (en) Method and device for determining operation mode of terminal
US10313537B2 (en) Method, apparatus and medium for sharing photo
EP3232314A1 (en) Method and device for processing an operation
CN106126082B (en) Terminal control method and device and terminal
CN105069083A (en) Determination method and device of associated user
US10769743B2 (en) Method, device and non-transitory storage medium for processing clothes information
JP6609266B2 (en) Fingerprint identification method, apparatus, program, and recording medium
CN104850643B (en) Picture comparison method and device
CN110019897B (en) Method and device for displaying picture
CN108470321B (en) Method and device for beautifying photos and storage medium
CN105426904A (en) Photo processing method, apparatus and device
CN106447747B (en) Image processing method and device
CN104836880A (en) Method and device for processing contact person head portrait
CN113642551A (en) Nail key point detection method and device, electronic equipment and storage medium
US10198614B2 (en) Method and device for fingerprint recognition
CN108509863A (en) Information cuing method, device and electronic equipment
CN117372858A (en) Article identification method and device, electronic equipment and storage medium
CN114063876A (en) Virtual keyboard setting method, device and storage medium
CN109145151B (en) Video emotion classification acquisition method and device
CN111787215A (en) Shooting method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination