Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are used only to distinguish different devices, modules, or units; they neither define those devices, modules, or units as necessarily distinct nor limit the order or interdependence of the functions they perform.
It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will understand them as "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The present disclosure provides a method, an apparatus, an electronic device, and a computer-readable storage medium for identifying attribute classification of a target object, which aim to solve the above technical problems of the prior art.
The following describes the technical solutions of the present disclosure and how to solve the above technical problems in specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present disclosure will be described below with reference to the accompanying drawings.
In one embodiment, a method for identifying an attribute classification of a target object is provided. As shown in Fig. 1, the method includes:
step S101, when an application client preset in a terminal is in an attribute classification mode of a target object, acquiring an image;
In practical applications, the application client provides an attribute classification mode for target objects, for example, a garbage classification mode. When a user runs the application client and starts the garbage classification mode, garbage classification and identification can be performed on a target object.
The terminal may have the following features:
(1) In hardware architecture, the device has a central processing unit, a memory, an input unit, and an output unit; that is, it is often a microcomputer device with a communication function. It can also offer a variety of input modes, such as a keyboard, a mouse, a touch screen, a microphone, and a camera, which can be adjusted as needed. Likewise, the device often has several output modes, such as a receiver and a display screen, which can also be adjusted as needed;
(2) In software, the device must have an operating system, such as Windows Mobile, Symbian, Palm, Android, or iOS. These operating systems are increasingly open, and countless personalized applications have been developed on these open platforms, such as an address book, a calendar, a notepad, a calculator, and various games, meeting personalized user needs to a great extent;
(3) In communication capability, the device has flexible access modes and high-bandwidth communication performance, and can automatically adjust the communication mode according to the selected service and the environment, which is convenient for users. The device can support GSM (Global System for Mobile Communications), WCDMA (Wideband Code Division Multiple Access), CDMA2000 (Code Division Multiple Access 2000), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), Wi-Fi (Wireless Fidelity), WiMAX (Worldwide Interoperability for Microwave Access), and so on, adapting to networks of various standards and supporting not only voice services but also various wireless data services;
(4) In function and use, the device places greater emphasis on user-friendliness, personalization, and multi-functionality. With the development of computer technology, devices have moved from a device-centered mode to a human-centered mode, integrating embedded computing, control technology, artificial intelligence technology, biometric authentication technology, and the like, fully embodying a human-oriented design. Thanks to advances in software, the device can be adjusted and configured according to individual needs and is thus more personalized. Meanwhile, the device integrates numerous software and hardware components, and its functions grow ever more powerful.
In a preferred embodiment of the present disclosure, the step of acquiring an image comprises:
acquiring an image through an image acquisition device of the terminal;
or
acquiring, based on a selection instruction initiated by a user, an image stored in the terminal that corresponds to the selection instruction.
For example, a user may first click a shortcut of the application client in the terminal to run the application client, and then click a virtual or physical button for acquiring an image to initiate an image acquisition instruction. The terminal invokes the image acquisition device according to the instruction to capture a target image, which is used as the image to be identified. The captured target image may be a static image, such as a photo, or a dynamic image, such as a video.
Alternatively, after the application client is running, the user may acquire the target image without clicking any virtual or physical button, simply by aiming the image acquisition device at the target object; the image captured by the image acquisition device (i.e., the image the user sees in the terminal's visual interface) is then used as the image to be identified.
Moreover, after the application client is running, the user may also select a target image from the images stored in the terminal as the image to be identified and have it displayed in the application client; the image selected by the user may be a static image or a dynamic image.
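As a rough illustration of the acquisition paths described above, the following Python sketch shows capturing a frame from the terminal's camera and loading a stored image selected by the user. OpenCV is assumed purely for illustration; the disclosure does not prescribe any particular imaging library, and the function names are hypothetical.

```python
# Sketch of the two acquisition paths; assumes OpenCV, names are hypothetical.
import cv2

def acquire_from_camera(device_index: int = 0):
    """Capture one frame from the terminal's image acquisition device."""
    cap = cv2.VideoCapture(device_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("failed to capture an image from the camera")
    return frame  # used as the image to be identified

def acquire_from_storage(path: str):
    """Load an image already stored in the terminal, per a user's selection."""
    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(f"no image found at {path}")
    return image
```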
It should be noted that, besides the manners described above, the image to be recognized may also be acquired in other manners, which the embodiments of the present disclosure do not limit.
Further, in addition to acquiring the image to be recognized while the preset application client in the terminal is in the garbage classification mode, the garbage classification mode of the application client may also be started after the image to be recognized is acquired; the user may choose when to start the garbage classification mode according to actual needs.
Step S102, extracting the features of the image to obtain target features, determining the probability that the target features belong to each preset main object, and judging the main object with the maximum probability as a final target object;
step S103, determining the attribute category corresponding to the target object, and displaying the attribute category through the application client.
After the image to be recognized is obtained, the application client can recognize the target object in the image, that is, identify which subject the user wants a garbage classification for, and then determine the garbage classification result of that target object.
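Conceptually, steps S102 and S103 amount to a standard image-classification forward pass: extract features, score each preset subject object, and keep the most probable one. The sketch below illustrates this with a generic torchvision backbone; the network architecture, preprocessing, and class list are illustrative assumptions, since the disclosure does not fix them.

```python
# Illustrative forward pass for step S102: feature extraction,
# per-subject-object probabilities, argmax. Backbone, preprocessing, and
# the class list are assumptions, not fixed by the disclosure.
import torch
import torchvision.models as models
import torchvision.transforms as T

SUBJECT_OBJECTS = ["crab shell", "plastic bottle", "battery"]  # hypothetical

backbone = models.resnet18(weights=None)  # untrained stand-in for the model
backbone.fc = torch.nn.Linear(backbone.fc.in_features, len(SUBJECT_OBJECTS))
backbone.eval()

preprocess = T.Compose([T.ToPILImage(), T.Resize((224, 224)), T.ToTensor()])

def identify_target_object(image) -> str:
    x = preprocess(image).unsqueeze(0)        # shape: 1 x 3 x 224 x 224
    with torch.no_grad():
        logits = backbone(x)                  # target features -> class scores
        probs = torch.softmax(logits, dim=1)  # probability per subject object
    return SUBJECT_OBJECTS[int(probs.argmax(dim=1))]  # highest probability wins
```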
In a preferred embodiment of the present disclosure, the step of performing feature extraction on the image to obtain target features, determining the probability that the target features belong to each preset subject object, and determining the subject object with the highest probability as the final target object includes:
determining whether a marked region marked by a user exists in the image;
in response to the marked region existing in the image, performing feature extraction on the marked region by using a preset first neural network model to obtain target features, determining the probability that the target features belong to each preset subject object, and determining the subject object with the highest probability as the final target object.
Performing feature extraction on the image through a preset neural network model to obtain target features includes:
performing feature extraction on the image by using the first neural network model to recognize target objects, obtaining at least one target object, and displaying each target object.
Determining the attribute category corresponding to the target object includes:
determining the attribute category of the target object based on a preset correspondence between objects and attribute categories;
or
determining the attribute category of the target object by using a preset second neural network model.
Determining the attribute category of the target object based on the preset correspondence between objects and attribute categories includes:
when a selection instruction initiated by a user for any target object is received, determining the attribute category of that target object based on the preset correspondence between objects and attribute categories;
when no selection instruction is received, determining the attribute category of each target object based on the correspondence.
Determining the attribute category of the target object by using the preset second neural network model includes:
when a selection instruction initiated by a user for any target object is received, determining the attribute category of that target object by using the second neural network model;
when no selection instruction is received, determining the attribute category of each target object by using the second neural network model.
Specifically, multiple target objects may exist in the image to be recognized while the user only wants the garbage classification result of one of them. In this case, the user may mark that target object, for example, by circling its region, so that the circled region becomes the recognition region. The preset first neural network model then performs target object recognition on the recognition region: it extracts features from the image within the region, computes the probability that the features belong to each subject object, and determines the subject object with the highest probability as the final target object, thereby obtaining the target object in the recognition region. The garbage classification result corresponding to that target object is then obtained by querying a preset correspondence between target objects and garbage classification results.
In the embodiments of the present disclosure, images labeled with target objects may be input into the first neural network model in advance as sample data to train the first neural network model; the preset correspondence between target objects and garbage classification results records the garbage classification result corresponding to each target object.
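As a hedged illustration of this training setup, the loop below fine-tunes the backbone from the earlier sketch on a folder of images grouped by target object. The dataset layout, optimizer, and hyperparameters are assumptions, not specified by the disclosure.

```python
# Hedged training sketch: fit the backbone from the earlier sketch on images
# grouped into one folder per target object. Paths and hyperparameters are
# assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_set = datasets.ImageFolder(
    "samples/target_objects",  # hypothetical dataset root
    transform=transforms.Compose([transforms.Resize((224, 224)),
                                  transforms.ToTensor()]))
loader = DataLoader(train_set, batch_size=32, shuffle=True)
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

backbone.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(backbone(images), labels)  # supervised by object labels
        loss.backward()
        optimizer.step()
```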
For example, the first neural network model identifies the target object in the recognition region and determines that it is a "crab shell"; the correspondence between target objects and garbage classification results is then queried to find that the garbage classification result corresponding to "crab shell" is "wet garbage", as shown in Fig. 2.
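A minimal sketch of this marked-region flow follows, reusing the hypothetical identify_target_object from the earlier sketch: crop the circled region, recognize the target object within it, and look up the garbage classification in the preset correspondence. The table entries are examples only.

```python
# Marked-region flow: crop the circled region, recognize the object, look up
# its garbage class. Reuses identify_target_object from the earlier sketch;
# table entries are illustrative.
OBJECT_TO_GARBAGE_CLASS = {
    "crab shell": "wet garbage",
    "plastic bottle": "recyclable garbage",
    "battery": "hazardous garbage",
}

def classify_marked_region(image, region):
    """region is (x, y, width, height) of the user's circled mark."""
    x, y, w, h = region
    crop = image[y:y + h, x:x + w]            # the recognition region only
    target = identify_target_object(crop)     # first neural network model
    return target, OBJECT_TO_GARBAGE_CLASS.get(target, "unknown")
```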
Further, when a recognition region exists in the image to be identified, a preset second neural network model can be used to recognize the target object in the recognition region: features are extracted from the image within the region, the probability that the features belong to each subject object is computed, and the subject object with the highest probability is determined as the final target object, thereby obtaining the target object in the recognition region. After the target object in the recognition region is determined, a preset third neural network model performs garbage classification on the target object, computing the probability that the target object belongs to each garbage class and determining the garbage class with the highest probability as the final garbage classification result, thereby obtaining the garbage classification result corresponding to the target object.
Further, after the target object in the recognition region is determined, the second neural network model can continue to be used to perform garbage classification on the target object, thereby obtaining the garbage classification result corresponding to the target object.
It should be noted that, in practical applications, the third neural network model and the second neural network model may both be convolutional neural networks. They may be two different models, in which case the second neural network model may be the same model as the first neural network model; the third neural network model identifies the garbage classification of the target object and may be trained in advance on images labeled with the garbage classification corresponding to each target object as sample data. Alternatively, the third and second neural network models may be two parts of a jointly trained model: the second neural network model part recognizes the target object in the image, and the third neural network model part identifies the garbage classification of the target object. The second part may be trained with images pre-labeled with target objects as sample data, and the third part may be trained with images pre-labeled with the garbage classifications corresponding to the target objects as sample data.
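One plausible realization of the joint-training variant is a shared feature extractor with two heads, one per model part. The sketch below is an assumption about architecture and loss weighting, not the disclosure's design.

```python
# One way to picture the "joint training model" variant: a shared CNN trunk
# with an object head (second model part) and a garbage-class head (third
# model part). Architecture and loss weighting are illustrative assumptions.
import torch
import torch.nn as nn

class JointGarbageModel(nn.Module):
    def __init__(self, num_objects: int, num_garbage_classes: int):
        super().__init__()
        self.features = nn.Sequential(            # shared feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.object_head = nn.Linear(32, num_objects)           # second part
        self.garbage_head = nn.Linear(32, num_garbage_classes)  # third part

    def forward(self, x):
        feats = self.features(x)
        return self.object_head(feats), self.garbage_head(feats)

# Joint training would sum both losses on samples labeled with the target
# object and its garbage classification, e.g.:
# loss = ce(object_logits, object_label) + ce(garbage_logits, garbage_label)
```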
Further, when multiple target objects exist in the image to be recognized, the first neural network model can perform target object recognition on the entire image to obtain at least one target object and display each target object to the user. After the user selects any one of the target objects, the garbage classification result corresponding to that target object is obtained by querying the preset correspondence between target objects and garbage classification results.
If the user does not select any target object, then after a preset time, for example, 3 seconds, the garbage classification result corresponding to each target object is obtained by querying the preset correspondence between target objects and garbage classification results.
Further, when multiple target objects exist in the image to be recognized, the second neural network model can perform target object recognition on the entire image to obtain at least one target object and display each target object to the user. After the user selects any one of the target objects, the preset third neural network model performs garbage classification on that target object, thereby obtaining its corresponding garbage classification result.
If the user does not select any target object, then after a preset time, for example, 3 seconds, the preset third neural network model performs garbage classification on each target object, thereby obtaining the garbage classification result corresponding to each target object.
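The select-or-timeout behavior described in the last four paragraphs can be sketched as a small dispatch function; the 3-second window and the wait_for_selection callback are illustrative assumptions, not part of the disclosure.

```python
# Sketch of "classify the user's pick, or everything after a timeout".
# The timeout value and wait_for_selection helper are assumptions.
from typing import Callable, List, Optional

def resolve_targets(targets: List[str],
                    wait_for_selection: Callable[[float], Optional[str]],
                    classify: Callable[[str], str],
                    timeout_s: float = 3.0) -> dict:
    """Return {target: garbage_class} for the user's selection, or for all
    displayed targets if no selection arrives within timeout_s."""
    choice = wait_for_selection(timeout_s)   # None if the user did not pick
    chosen = [choice] if choice is not None else targets
    return {t: classify(t) for t in chosen}
```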
It should be noted that, in practical applications, the third neural network model and the second neural network model may be two different models, or may be two parts in a jointly trained model.
In a preferred embodiment of the present disclosure, the step of presenting the attribute category by the application client includes:
displaying the attribute category in an interactive interface of the application client; and
playing the attribute category by voice.
Specifically, after the garbage classification result corresponding to the target object is obtained, the result can be displayed in an interactive interface of the application client and, for the user's convenience, played by voice by default at the same time.
Furthermore, some users may want voice playback while others may not. To accommodate all users, a voice playback button may also be provided, and the garbage classification result is played by voice when the user clicks the button.
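A minimal sketch of presenting the result with optional voice playback follows, assuming the pyttsx3 text-to-speech library purely for illustration; the disclosure does not name a specific playback mechanism.

```python
# Sketch of display plus optional voice playback; pyttsx3 is an assumption,
# standing in for whatever TTS facility the terminal provides.
import pyttsx3

def present_result(target: str, garbage_class: str, voice_enabled: bool = True):
    text = f"{target}: {garbage_class}"
    print(text)                      # stands in for the interactive interface
    if voice_enabled:                # default on, or toggled by a button
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
```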
Further, in the embodiments of the present disclosure, if the target object identified from the image to be recognized, or the garbage classification result corresponding to it, is erroneous or missing, the user may correct the erroneous target object and its garbage classification result, or add the missing target object and its garbage classification result. The client then uses the target objects corrected or added by the user, together with their garbage classification results, as sample data to train the first, second, and third neural network models, so that the recognition rates of all three models improve.
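A simple way to picture this feedback loop is to accumulate corrected samples and periodically retrain; the sketch below is an assumption about how a client might store corrections, not the disclosure's mechanism.

```python
# Sketch of folding user corrections back into the training data. Storage
# format and retraining schedule are assumptions.
correction_samples = []  # accumulated (image, corrected_object, corrected_class)

def record_correction(image, corrected_object: str, corrected_class: str):
    """Store a user-corrected or user-added sample for later retraining."""
    correction_samples.append((image, corrected_object, corrected_class))

# Periodically, the client (or a server) could fine-tune the models on
# correction_samples with the same loop used for initial training, raising
# the recognition rate on the cases users actually encounter.
```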
In the embodiments of the present disclosure, when the application client preset in the terminal is in the attribute classification mode, an image is acquired; features are then extracted from the image through a preset neural network model to obtain target features; the probability that the target features belong to each preset subject object is determined; the subject object with the highest probability is determined as the final target object; and the attribute category corresponding to the target object is determined and displayed through the application client. In this way, the user can obtain the corresponding garbage classification result simply from an image containing the target object, which spares the user from searching the web and sifting through multiple search results to reach a final answer. This not only saves the user's search time with a simple and convenient operation process, but also improves the recognition rate and accuracy of garbage classification by using a trained model to recognize garbage classification results.
Fig. 3 is a schematic structural diagram of an apparatus for identifying attribute classification of a target object according to another embodiment of the present disclosure. As shown in Fig. 3, the apparatus of this embodiment may include:
an obtaining module 301, configured to obtain an image when an application client preset in a terminal is in an attribute classification mode;
the processing module 302 is configured to perform feature extraction on the image to obtain target features, determine probabilities that the target features belong to each preset subject object, and determine the subject object with the highest probability as a final target object;
a determining module 303, configured to determine an attribute category corresponding to the target object;
a display module 304, configured to display the attribute category through the application client.
In a preferred embodiment of the present disclosure, the obtaining module is specifically configured to:
acquire an image through an image acquisition device of the terminal;
or
acquire, based on a selection instruction initiated by a user, an image stored in the terminal that corresponds to the selection instruction.
In a preferred embodiment of the present disclosure, the processing module includes:
a determination submodule for determining whether a marked region marked by a user exists in the image;
a first processing submodule, configured to, in response to the marked region existing in the image, perform feature extraction on the marked region by using a preset first neural network model to obtain target features, determine the probability that the target features belong to each preset subject object, and determine the subject object with the highest probability as the final target object.
In a preferred embodiment of the present disclosure, the determining module includes:
a second processing submodule, configured to determine the attribute category of the target object based on a preset correspondence between objects and attribute categories;
a third processing submodule, configured to determine the attribute category of the target object by using a preset second neural network model.
In a preferred embodiment of the present disclosure, the processing module is specifically configured to:
perform feature extraction on the image by using the first neural network model to recognize target objects, obtain at least one target object, and display each target object.
In a preferred embodiment of the present disclosure, the second processing submodule is specifically configured to:
determine, when a selection instruction initiated by a user for any target object is received, the attribute category of that target object based on the preset correspondence between objects and attribute categories;
determine, when no selection instruction is received, the attribute category of each target object based on the correspondence;
the third processing submodule is specifically configured to:
determine, when a selection instruction initiated by a user for any target object is received, the attribute category of that target object by using the second neural network model;
determine, when no selection instruction is received, the attribute category of each target object by using the second neural network model.
In a preferred embodiment of the present disclosure, the display module comprises:
a display submodule, configured to display the attribute category in an interactive interface of the application client;
a playing submodule, configured to play the attribute category by voice.
The apparatus for identifying attribute classification of a target object in this embodiment can execute the method for identifying attribute classification of a target object shown in the first embodiment of this disclosure; their implementation principles are similar and are not repeated here.
In the embodiments of the present disclosure, when the application client preset in the terminal is in the attribute classification mode, an image is acquired; features are then extracted from the image through a preset neural network model to obtain target features; the probability that the target features belong to each preset subject object is determined; the subject object with the highest probability is determined as the final target object; and the attribute category corresponding to the target object is determined and displayed through the application client. In this way, the user can obtain the corresponding garbage classification result simply from an image containing the target object, which spares the user from searching the web and sifting through multiple search results to reach a final answer. This not only saves the user's search time with a simple and convenient operation process, but also improves the recognition rate and accuracy of garbage classification by using a trained model to recognize garbage classification results.
Referring now to Fig. 4, a block diagram of an electronic device 400 suitable for implementing embodiments of the present disclosure is shown. Electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in Fig. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
The electronic device includes a memory and a processor. The processor may be referred to as the processing device 401 described below, and the memory may include at least one of a read-only memory (ROM) 402, a random access memory (RAM) 403, and a storage device 408, described below.
As shown in Fig. 4, the electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes according to a program stored in the read-only memory (ROM) 402 or a program loaded from the storage device 408 into the random access memory (RAM) 403. The RAM 403 also stores various programs and data necessary for the operation of the electronic device 400. The processing device 401, the ROM 402, and the RAM 403 are connected to one another via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
In general, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, and the like; output devices 407 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, and the like; storage devices 408 including, for example, a magnetic tape, a hard disk, and the like; and a communication device 409, which may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 4 illustrates the electronic device 400 with various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 401.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communications network). Examples of communications networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire an image when an application client preset in a terminal is in an attribute classification mode; perform feature extraction on the image through a preset neural network model to obtain target features, determine the probability that the target features belong to each preset subject object, and determine the subject object with the highest probability as the final target object; and determine the attribute category corresponding to the target object, and display the attribute category through the application client.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules or units described in the embodiments of the present disclosure may be implemented by software or hardware.
For example, without limitation, exemplary types of hardware logic components that may be used include Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so forth.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, [Example One] there is provided a method for identifying attribute classification of a target object, including:
when an application client preset in a terminal is in an attribute classification mode, acquiring an image;
performing feature extraction on the image to obtain target features, determining the probability that the target features belong to each preset subject object, and determining the subject object with the highest probability as the final target object;
and determining the attribute category corresponding to the target object, and displaying the attribute category through the application client.
Preferably, the step of acquiring an image comprises:
acquiring the image through an image acquisition device of the terminal;
or
acquiring, based on a selection instruction initiated by a user, the image stored in the terminal that corresponds to the selection instruction.
Preferably, performing feature extraction on the image to obtain target features, determining the probability that the target features belong to each preset subject object, and determining the subject object with the highest probability as the final target object includes:
determining whether a marked region marked by a user exists in the image;
in response to the marked region existing in the image, performing feature extraction on the marked region by using a preset first neural network model to obtain target features, determining the probability that the target features belong to each preset subject object, and determining the subject object with the highest probability as the final target object.
Preferably, determining the attribute category corresponding to the target object includes:
determining the attribute category of the target object based on a preset correspondence between objects and attribute categories;
or
determining the attribute category of the target object by using a preset second neural network model.
Preferably, performing feature extraction on the image to obtain target features includes:
performing feature extraction on the image by using the first neural network model to recognize target objects, obtaining at least one target object, and displaying each target object.
Preferably, determining the attribute category of the target object based on the preset correspondence between objects and attribute categories includes:
when a selection instruction initiated by a user for any target object is received, determining the attribute category of that target object based on the preset correspondence between objects and attribute categories;
when no selection instruction is received, determining the attribute category of each target object based on the correspondence;
determining the attribute category of the target object by using the preset second neural network model includes:
when a selection instruction initiated by a user for any target object is received, determining the attribute category of that target object by using the second neural network model;
when no selection instruction is received, determining the attribute category of each target object by using the second neural network model.
Preferably, the step of presenting the attribute category by the application client includes:
displaying the attribute category in an interactive interface of the application client; and
playing the attribute category by voice.
According to one or more embodiments of the present disclosure, [Example Two] there is provided an apparatus for identifying attribute classification of a target object, including:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is used for acquiring an image when an application client preset in a terminal is in an attribute classification mode;
the processing module is used for extracting the features of the image through a preset neural network model to obtain target features, determining the probability that the target features belong to each preset main object, and judging the main object with the maximum probability as a final target object;
the determining module is used for determining the attribute category corresponding to the target object;
and the display module is used for displaying the attribute categories through the application client.
Preferably, the obtaining module is specifically configured to:
acquire an image through an image acquisition device of the terminal;
or
acquire, based on a selection instruction initiated by a user, an image stored in the terminal that corresponds to the selection instruction.
Preferably, the processing module comprises:
a determination submodule, configured to determine whether a marked region marked by a user exists in the image;
a first processing submodule, configured to, in response to the marked region existing in the image, perform feature extraction on the marked region by using a preset first neural network model to obtain target features, determine the probability that the target features belong to each preset subject object, and determine the subject object with the highest probability as the final target object.
Preferably, the determining module comprises:
a second processing submodule, configured to determine the attribute category of the target object based on a preset correspondence between objects and attribute categories;
a third processing submodule, configured to determine the attribute category of the target object by using a preset second neural network model.
Preferably, the processing module is specifically configured to:
perform feature extraction on the image by using the first neural network model to recognize target objects, obtain at least one target object, and display each target object.
Preferably, the second processing submodule is specifically configured to:
determine, when a selection instruction initiated by a user for any target object is received, the attribute category of that target object based on the preset correspondence between objects and attribute categories;
determine, when no selection instruction is received, the attribute category of each target object based on the correspondence;
the third processing submodule is specifically configured to:
determine, when a selection instruction initiated by a user for any target object is received, the attribute category of that target object by using the second neural network model;
determine, when no selection instruction is received, the attribute category of each target object by using the second neural network model.
Preferably, the display module comprises:
a display submodule, configured to display the attribute category in an interactive interface of the application client;
a playing submodule, configured to play the attribute category by voice.
The foregoing description is merely a description of the preferred embodiments of the present disclosure and of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to the particular combinations of features described above, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, a technical solution formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.