CN116361493A - LOGO identification method, LOGO identification system, storage medium and electronic equipment

Publication number
CN116361493A
Authority
CN
China
Prior art keywords
logo
image
similarity
feature
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310047905.3A
Other languages
Chinese (zh)
Inventor
请求不公布姓名
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mdata Information Technology Co ltd
Original Assignee
Shanghai Mdata Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Mdata Information Technology Co ltd
Priority to CN202310047905.3A
Publication of CN116361493A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Character Discrimination (AREA)

Abstract

The invention provides a LOGO identification method, a LOGO identification system, a storage medium and electronic equipment. The LOGO identification method comprises the following steps: constructing a LOGO feature database; constructing a LOGO text database; acquiring an image containing a LOGO; intercepting the LOGO image in the image; performing character recognition in the LOGO image; if characters are recognized, calculating the word error rate between the recognized characters and the LOGO characters of each entry in the LOGO text database, and when the word error rate is smaller than a first preset threshold, selecting the LOGO name corresponding to the LOGO characters with the minimum word error rate as the name of the LOGO contained in the image; if no characters are recognized, extracting image features of the LOGO image, calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and when the similarity is larger than a second preset threshold, selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image. The LOGO identification method, the LOGO identification system, the storage medium and the electronic equipment can accurately identify both graphic LOGOs and text LOGOs, and effectively improve the recall rate of LOGO identification.

Description

LOGO identification method, LOGO identification system, storage medium and electronic equipment
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a LOGO identification method, a LOGO identification system, a storage medium and electronic equipment.
Background
A LOGO is an emblem or trademark (the term is a foreign-language abbreviation). It identifies and promotes the company that owns it, and helps consumers remember the company and its brand culture. On the web, a logo is primarily a graphic mark that websites use to link to one another, representing a website or a section of a website.
In the prior art, a commonly used LOGO identification method comprises the following steps: constructing a LOGO feature retrieval library, extracting the features of the LOGO in an image, calculating the similarity between the extracted LOGO features and the LOGO features in the LOGO feature retrieval library, and selecting the category whose similarity is larger than a threshold as the identified LOGO category.
However, while the above method is effective for graphic LOGOs, it cannot accurately recognize text LOGOs. This is because the features of text LOGOs are mostly similar to one another; since the features are not distinctive, the false detection rate is high.
Disclosure of Invention
In view of the above-mentioned drawbacks of the prior art, the present invention aims to provide a LOGO recognition method, a system, a storage medium and an electronic device, which can accurately recognize graphic LOGO and text LOGO at the same time, and effectively improve recall rate of LOGO recognition.
In a first aspect, the present invention provides a LOGO recognition method comprising the steps of: constructing a LOGO feature database, wherein the LOGO feature database comprises LOGO names and LOGO features; constructing a LOGO text database, wherein the LOGO text database comprises LOGO names and LOGO characters; acquiring an image containing LOGO; intercepting LOGO images in the images; performing character recognition in the LOGO image; if the characters are recognized, calculating the word error rate of the recognized characters and each LOGO character in the LOGO text database, and when the word error rate is smaller than a first preset threshold value, selecting the LOGO name corresponding to the LOGO character with the minimum word error rate as the name of LOGO contained in the image; if the characters are not recognized, extracting the image features of the LOGO image; calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image when the similarity is larger than a second preset threshold.
In one implementation manner of the first aspect, capturing the LOGO image in the image includes the following steps:
acquiring coordinate information of the LOGO image in the image based on a DETR model;
intercepting the LOGO image from the image based on the coordinate information.
In one implementation manner of the first aspect, performing text recognition in the LOGO image includes the steps of:
identifying characters in the LOGO image based on a CRNN model.
In one implementation manner of the first aspect, extracting the image features of the LOGO image includes the following steps:
extracting the image features of the LOGO image based on a ViT model.
In an implementation manner of the first aspect, the first preset threshold value is 0.2.
In an implementation manner of the first aspect, the second preset threshold value is 0.8.
In an implementation manner of the first aspect, the similarity employs cosine similarity.
In a second aspect, the present invention provides a LOGO recognition system, the system comprising a first construction module, a second construction module, an acquisition module, an interception module, a recognition module, a word processing module, and an image processing module;
the first construction module is used for constructing a LOGO feature database, and the LOGO feature database comprises LOGO names and LOGO features;
the second construction module is used for constructing a LOGO text database, and the LOGO text database comprises LOGO names and LOGO characters;
the acquisition module is used for acquiring an image containing LOGO;
the intercepting module is used for intercepting LOGO images in the images;
the identification module is used for carrying out character identification in the LOGO image;
the word processing module is used for, if characters are recognized, calculating the word error rate between the recognized characters and the LOGO characters of each entry in the LOGO text database, and selecting the LOGO name corresponding to the LOGO characters with the minimum word error rate as the name of the LOGO contained in the image when the word error rate is smaller than a first preset threshold value;
the image processing module is used for extracting the image characteristics of the LOGO image if the characters are not recognized; calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image when the similarity is larger than a second preset threshold.
In a third aspect, the present invention provides an electronic device comprising: a processor and a memory;
the memory is used for storing a computer program;
the processor is configured to execute the computer program stored in the memory, so that the electronic device executes the LOGO recognition method.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program, wherein the program when executed by an electronic device implements the LOGO recognition method described above.
As described above, the LOGO identification method, the LOGO identification system, the storage medium and the electronic equipment have the following beneficial effects:
the LOGO identification method, the LOGO identification system, the storage medium and the electronic equipment can handle both graphic LOGO identification and text LOGO identification, which broadens the application scenarios; they effectively improve the accuracy of LOGO identification and improve the recall rate of LOGO identification.
Drawings
FIG. 1 is a schematic view of an electronic device according to an embodiment of the invention;
FIG. 2 is a flow chart of a LOGO identification method according to an embodiment of the invention;
FIG. 3 is a schematic diagram illustrating a LOGO identification system in accordance with an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Description of element reference numerals
11 mobile phone
12 tablet personal computer
13 notebook computer
31 first construction module
32 second construction module
33 acquisition module
34 intercept module
35 identification module
36 word processing module
37 image processing module
41 processing unit
42 memory
421 random access memory
422 cache memory
423 storage system
424 program/utility
4241 program modules
43 bus
44 input/output interface
45 network adapter
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied in other, different embodiments, and the details of the present description may be modified or varied on the basis of different viewpoints and applications without departing from the spirit and scope of the present invention. It should be noted that the following embodiments and the features in the embodiments may be combined with each other without conflict.
It should be noted that the illustrations provided in the following embodiments only schematically illustrate the basic concept of the present invention. The drawings show only the components related to the present invention and are not drawn according to the number, shape and size of the components in actual implementation; the form, number and proportion of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
The following embodiments of the present invention provide a LOGO recognition method that can be applied to an electronic device as shown in FIG. 1. The electronic device in the present invention may include a mobile phone 11, a tablet computer 12, a notebook computer 13, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA) and the like with a wireless charging function; the specific type of the electronic device is not limited in the embodiments of the present invention.
For example, the electronic device may be a station (ST) in a wireless-charging-enabled WLAN, a wireless-charging-enabled cellular telephone, a cordless telephone, a Session Initiation Protocol (SIP) telephone, a wireless local loop (WLL) station, a personal digital assistant (PDA) device, a wireless-charging-enabled handheld device, a computing device or other processing device, a computer, a laptop computer, a handheld communication device, a handheld computing device, and/or another device for communicating over a wireless system, as well as a device in a next generation communication system, such as a mobile terminal in a 5G network, a mobile terminal in a future evolved public land mobile network (PLMN), or a mobile terminal in a future evolved non-terrestrial network (NTN).
For example, the electronic device may communicate with networks and other devices via wireless communications. The wireless communications may use any communication standard or protocol including, but not limited to, Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), BT, GNSS, WLAN, NFC, FM, and/or IR techniques, among others. The GNSS may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS) and/or satellite based augmentation systems (SBAS).
The following describes the technical solution in the embodiment of the present invention in detail with reference to the drawings in the embodiment of the present invention.
As shown in fig. 2, in an embodiment, the LOGO recognition method of the present invention includes the following steps:
Step S1, constructing a LOGO feature database, wherein the LOGO feature database comprises LOGO names and LOGO features.
Specifically, the LOGO feature database is constructed from a sufficient number of LOGO samples. Each LOGO sample is named to obtain a LOGO name, and features are extracted from the LOGO sample to obtain LOGO features. In the LOGO feature database, each LOGO name is associated with its LOGO features. Preferably, the LOGO feature database adopts a dictionary format, with LOGO names as keys and 128-byte features as values.
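The dictionary layout described in step S1 can be sketched as follows. This is a minimal illustration only: the names `FeatureDB` and `add_logo_feature` are hypothetical, the features are shown as short lists of floats rather than the 128-byte features mentioned above, and the feature extractor itself is not modeled.

```python
from typing import Dict, List

# Hypothetical in-memory LOGO feature database: LOGO name (key) -> feature (value).
FeatureDB = Dict[str, List[float]]


def add_logo_feature(db: FeatureDB, name: str, feature: List[float]) -> None:
    """Associate a LOGO name with its feature vector.

    All stored features are required to have the same length so that they can
    later be compared with one another.
    """
    if db and len(feature) != len(next(iter(db.values()))):
        raise ValueError("all features must have the same length")
    db[name] = list(feature)


# Illustrative entries; the names and values are made up for the example.
feature_db: FeatureDB = {}
add_logo_feature(feature_db, "brand_a", [0.1, 0.9, 0.0])
add_logo_feature(feature_db, "brand_b", [0.7, 0.2, 0.1])
```

Keying the dictionary by LOGO name means that re-adding a name overwrites its earlier feature; an implementation that stores several samples per LOGO would need a list of features per key instead.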
Step S2, constructing a LOGO text database, wherein the LOGO text database comprises LOGO names and LOGO characters.
Specifically, the LOGO text database is constructed from a sufficient number of LOGO samples. Each LOGO sample is named to obtain a LOGO name, and the characters in the LOGO sample are extracted to obtain LOGO characters. In the LOGO text database, each LOGO name is associated with its LOGO characters. Preferably, the LOGO text database adopts a dictionary format, with the LOGO name as the key and the characters in the LOGO sample as the value.
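A comparable sketch for the LOGO text database of step S2 is given below. The helper names `normalize` and `build_text_db` are hypothetical, and the normalization step (trimming, collapsing whitespace, lowercasing) is an assumption added for the example; the description does not state whether the stored characters are normalized.

```python
from typing import Dict


def normalize(text: str) -> str:
    """Assumed normalization: trim, collapse runs of whitespace, lowercase."""
    return " ".join(text.split()).lower()


def build_text_db(samples: Dict[str, str]) -> Dict[str, str]:
    """Build a LOGO text database: LOGO name (key) -> LOGO characters (value)."""
    return {name: normalize(text) for name, text in samples.items()}


# Illustrative samples; the names and texts are made up for the example.
text_db = build_text_db({"brand_a": "  Brand  A ", "brand_b": "BRAND-B"})
```

If normalization is used, the same function would have to be applied to the recognized characters before they are compared with the database entries.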
Step S3, acquiring an image containing a LOGO.
Specifically, an image containing a LOGO is acquired from an image generating device, a network, or the like.
Step S4, intercepting the LOGO image in the image.
Specifically, intercepting the LOGO image in the image includes the following steps:
41) Acquiring coordinate information of the LOGO image in the image based on a DETR model.
The DETR model, i.e. the DEtection TRansformer model, converts the object detection task into a sequence (set) prediction task. It uses a Transformer encoder-decoder structure and a bipartite matching method to predict a sequence of results directly from the input image: a CNN first extracts features, which are then encoded and decoded to obtain the prediction output. Because object detection is performed end to end with Transformers, there is no NMS post-processing step and no anchors are used. DETR achieves accuracy and runtime performance comparable to a Faster R-CNN baseline.
In the invention, the image is input into the DETR model, which outputs the coordinate information of the LOGO image in the image.
42) Intercepting the LOGO image from the image based on the coordinate information.
That is, the LOGO image is cropped out of the image according to the coordinate information.
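Steps 41) and 42) can be sketched as below, with the detector itself left out. The sketch assumes that the coordinate information arrives as a normalized (center x, center y, width, height) box, which is the format used by the reference DETR implementation; a different detector wrapper may return pixel corners directly. The image is modeled as a plain list of pixel rows, and the names `to_pixel_box` and `crop` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def to_pixel_box(box: Sequence[float], width: int, height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to clamped pixel corners."""
    cx, cy, w, h = box
    left = round((cx - w / 2) * width)
    top = round((cy - h / 2) * height)
    right = round((cx + w / 2) * width)
    bottom = round((cy + h / 2) * height)
    # Clamp to the image bounds so that slicing never leaves the image.
    return (
        max(0, min(width, left)),
        max(0, min(height, top)),
        max(0, min(width, right)),
        max(0, min(height, bottom)),
    )


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the region described by a pixel box out of a row-major image."""
    left, top, right, bottom = box
    if left >= right or top >= bottom:
        raise ValueError("empty box")
    return [list(row[left:right]) for row in image[top:bottom]]


# A 4x4 toy image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(4)] for r in range(4)]
pixel_box = to_pixel_box((0.5, 0.5, 0.5, 0.5), width=4, height=4)
logo_image = crop(image, pixel_box)
```

Here `pixel_box` is `(1, 1, 3, 3)` and `logo_image` is the central 2x2 region. With a real image library the slicing would be replaced by that library's own crop call.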
Step S5, performing character recognition in the LOGO image.
Specifically, characters in the LOGO image are recognized based on a CRNN model. CRNN stands for Convolutional Recurrent Neural Network. It is mainly used for end-to-end recognition of text sequences of variable length: individual characters need not be segmented first, and text recognition is converted into a sequence learning problem that depends on order, namely image-based sequence recognition. The CRNN network structure comprises three parts:
a) CNN (convolutional layer): a deep CNN extracts features from the input image to obtain a feature map.
b) RNN (recurrent layer): a bidirectional RNN (BLSTM) makes predictions over the feature sequence, learning each feature vector in the sequence and outputting a predicted label (true value) distribution.
c) CTC loss (transcription layer): CTC converts the series of label distributions obtained from the recurrent layer into the final label sequence.
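How the transcription layer turns per-frame label distributions into a label sequence can be illustrated with greedy (best-path) CTC decoding: take the most likely label in each frame, merge consecutive repeats, then drop the blank label. The description does not say which decoding strategy is used, so greedy decoding is an assumption here, as are the function name `ctc_greedy_decode` and the convention that index 0 is the blank label.

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding.

    frame_probs[t][k] is the probability of label k at frame t, where label 0
    is the CTC blank and label k >= 1 stands for alphabet[k - 1].
    """
    # Most likely label index in each frame.
    best = [max(range(len(p)), key=lambda k: p[k]) for p in frame_probs]
    chars: List[str] = []
    prev: Optional[int] = None
    for idx in best:
        # Emit a character only when the label changes and is not the blank.
        if idx != prev and idx != 0:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


# Frame-wise best labels are 1, 1, 0, 1, 2, 2, 0 for the alphabet "ab".
frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.0, 0.3, 0.7],
    [0.8, 0.1, 0.1],
]
decoded = ctc_greedy_decode(frames, "ab")
```

The blank between the second and fourth frames is what lets the repeated "a" survive the merge step, so `decoded` is "aab". An empty result corresponds to the case where no characters are recognized.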
When character recognition is performed on the LOGO image, there are two possible outcomes: characters are recognized, or no characters are recognized. A text-based LOGO recognition algorithm is used in the first case and an image-based LOGO recognition algorithm in the second.
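The branching between the two cases can be sketched as a small dispatch function. The three callables are placeholders for the recognition and matching steps and are passed in as parameters; their names, and the choice to treat an empty string as "no characters recognized", are assumptions made for the example.

```python
from typing import Any, Callable, Optional


def identify_logo(
    logo_image: Any,
    recognize_text: Callable[[Any], str],
    match_by_text: Callable[[str], Optional[str]],
    match_by_feature: Callable[[Any], Optional[str]],
) -> Optional[str]:
    """Route a cropped LOGO image to the text branch or the image branch."""
    text = recognize_text(logo_image)
    if text:
        # Characters were recognized: match against the LOGO text database.
        return match_by_text(text)
    # No characters were recognized: match against the LOGO feature database.
    return match_by_feature(logo_image)


# Toy stand-ins for the real recognition and matching steps.
name_for_text_logo = identify_logo(
    "img", lambda im: "acme", lambda t: "name:" + t, lambda im: "by-feature"
)
name_for_graphic_logo = identify_logo(
    "img", lambda im: "", lambda t: "name:" + t, lambda im: "by-feature"
)
```

The sketch does not fall back to the image branch when text matching returns no name, because the description does not mention such a fallback.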
Step S6, if characters are recognized, calculating the word error rate between the recognized characters and the LOGO characters of each entry in the LOGO text database, and when the word error rate is smaller than a first preset threshold value, selecting the LOGO name corresponding to the LOGO characters with the minimum word error rate as the name of the LOGO contained in the image.
Specifically, when characters are recognized, the LOGO image is a text LOGO. The recognized characters are therefore matched against the LOGO characters of each entry in the LOGO text database by calculating the word error rate (WER) between the two. The word error rate is a key evaluation index in the field of speech recognition; the lower the WER, the higher the matching degree.
A first preset threshold is therefore set. When the word error rate is not smaller than the first preset threshold, the match between the two is not precise enough; when the word error rate is smaller than the first preset threshold, the match meets the precision requirement. When the precision requirement is met, the LOGO name corresponding to the LOGO characters with the minimum word error rate in the LOGO text database is selected as the LOGO name identified in the image. Preferably, the first preset threshold value is 0.2.
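The matching in step S6 can be sketched as follows, using the usual definition of the word error rate as the edit distance (substitutions, deletions and insertions) divided by the length of the reference. The function names are hypothetical. The description does not state the unit of comparison, so the sketch accepts any sequence: passing strings compares character by character, while passing `text.split()` lists compares word by word. The example compares characters, which is an assumption.

```python
from typing import Dict, Optional, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def error_rate(reference: Sequence, hypothesis: Sequence) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def match_by_text(recognized: str, text_db: Dict[str, str],
                  threshold: float = 0.2) -> Optional[str]:
    """Return the LOGO name with the lowest error rate, if below the threshold."""
    best_name: Optional[str] = None
    best_rate: Optional[float] = None
    for name, logo_text in text_db.items():
        rate = error_rate(logo_text, recognized)
        if best_rate is None or rate < best_rate:
            best_name, best_rate = name, rate
    if best_rate is not None and best_rate < threshold:
        return best_name
    return None


# Illustrative database; one misread character out of eight gives a rate of 0.125.
text_db = {"acme": "acmecorp", "globex": "globex"}
matched = match_by_text("acmec0rp", text_db)
```

With the preferred threshold of 0.2, `matched` is "acme", whereas a string unrelated to every entry returns `None`.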
Step S7, if no characters are recognized, extracting the image features of the LOGO image; calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image when the similarity is larger than a second preset threshold.
Specifically, when no characters are recognized, the LOGO image is a graphic LOGO. Therefore, the image features of the LOGO image are first extracted based on a ViT (Vision Transformer) model, and the image features are then matched against each LOGO feature in the LOGO feature database by calculating the similarity between the two. The higher the similarity, the higher the matching degree. Preferably, the similarity is the cosine similarity.
A second preset threshold is therefore set. When the similarity is not larger than the second preset threshold, the match between the two is not precise enough; when the similarity is larger than the second preset threshold, the match meets the precision requirement. When the precision requirement is met, the LOGO name corresponding to the LOGO feature with the largest similarity in the LOGO feature database is selected as the LOGO name identified in the image. Preferably, the second preset threshold value is 0.8.
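The feature matching in step S7 can be sketched with the cosine similarity, i.e. the dot product of two vectors divided by the product of their norms. Feature extraction is not shown; the query feature is assumed to have been produced already, and the function names and two-dimensional toy vectors are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as matching nothing.
    return dot / norm if norm else 0.0


def match_by_feature(query: Sequence[float],
                     feature_db: Dict[str, Sequence[float]],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the LOGO name with the highest similarity, if above the threshold."""
    best_name: Optional[str] = None
    best_sim = 0.0
    for name, feature in feature_db.items():
        sim = cosine_similarity(query, feature)
        if best_name is None or sim > best_sim:
            best_name, best_sim = name, sim
    if best_name is not None and best_sim > threshold:
        return best_name
    return None


# Illustrative database with two orthogonal toy features.
feature_db = {"brand_a": [1.0, 0.0], "brand_b": [0.0, 1.0]}
close_match = match_by_feature([0.9, 0.1], feature_db)
no_match = match_by_feature([1.0, 1.0], feature_db)
```

The first query lies almost along the "brand_a" feature and is accepted; the second sits halfway between both entries (similarity of about 0.71 to each), so it falls below the preferred threshold of 0.8 and no name is selected.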
The protection scope of the LOGO recognition method according to the embodiment of the present invention is not limited to the execution sequence of the steps listed in the embodiment, and all the schemes implemented by adding or removing steps and replacing steps according to the prior art according to the principles of the present invention are included in the protection scope of the present invention.
The embodiment of the invention also provides a LOGO identification system, which can realize the LOGO identification method of the invention, but the implementation device of the LOGO identification system comprises but is not limited to the structure of the LOGO identification system listed in the embodiment, and all structural modifications and substitutions of the prior art according to the principles of the invention are included in the protection scope of the invention.
As shown in FIG. 3, in one embodiment, the LOGO recognition system of the present invention includes a first construction module 31, a second construction module 32, an acquisition module 33, an interception module 34, a recognition module 35, a word processing module 36, and an image processing module 37.
The first construction module 31 is configured to construct a LOGO feature database, where the LOGO feature database includes a LOGO name and a LOGO feature.
Specifically, the LOGO feature database is constructed from a sufficient number of LOGO samples. Each LOGO sample is named to obtain a LOGO name, and features are extracted from the LOGO sample to obtain LOGO features. In the LOGO feature database, each LOGO name is associated with its LOGO features. Preferably, the LOGO feature database adopts a dictionary format, with LOGO names as keys and 128-byte features as values.
The second construction module 32 is connected to the first construction module 31, and is configured to construct a LOGO text database, where the LOGO text database includes a LOGO name and a LOGO text.
Specifically, the LOGO text database is constructed from a sufficient number of LOGO samples. Each LOGO sample is named to obtain a LOGO name, and the characters in the LOGO sample are extracted to obtain LOGO characters. In the LOGO text database, each LOGO name is associated with its LOGO characters. Preferably, the LOGO text database adopts a dictionary format, with the LOGO name as the key and the characters in the LOGO sample as the value.
The acquisition module 33 is connected to the second construction module 32 for acquiring an image containing LOGO.
Specifically, an image including LOGO provided by an image generating device, a network, or the like is acquired.
The intercepting module 34 is connected to the acquiring module 33, and is configured to intercept the LOGO image in the image.
Specifically, intercepting the LOGO image in the image includes the following steps:
41) Acquiring coordinate information of the LOGO image in the image based on a DETR model.
The DETR model, i.e. the DEtection TRansformer model, converts the object detection task into a sequence (set) prediction task. It uses a Transformer encoder-decoder structure and a bipartite matching method to predict a sequence of results directly from the input image: a CNN first extracts features, which are then encoded and decoded to obtain the prediction output. Because object detection is performed end to end with Transformers, there is no NMS post-processing step and no anchors are used. DETR achieves accuracy and runtime performance comparable to a Faster R-CNN baseline.
In the invention, the image is input into the DETR model, which outputs the coordinate information of the LOGO image in the image.
42) Intercepting the LOGO image from the image based on the coordinate information.
That is, the LOGO image is cropped out of the image according to the coordinate information.
The recognition module 35 is connected to the interception module 34, and is configured to perform text recognition in the LOGO image.
Specifically, characters in the LOGO image are recognized based on a CRNN model. CRNN stands for Convolutional Recurrent Neural Network. It is mainly used for end-to-end recognition of text sequences of variable length: individual characters need not be segmented first, and text recognition is converted into a sequence learning problem that depends on order, namely image-based sequence recognition. The CRNN network structure comprises three parts:
a) CNN (convolutional layer): a deep CNN extracts features from the input image to obtain a feature map.
b) RNN (recurrent layer): a bidirectional RNN (BLSTM) makes predictions over the feature sequence, learning each feature vector in the sequence and outputting a predicted label (true value) distribution.
c) CTC loss (transcription layer): CTC converts the series of label distributions obtained from the recurrent layer into the final label sequence.
When character recognition is performed on the LOGO image, there are two possible outcomes: characters are recognized, or no characters are recognized. A text-based LOGO recognition algorithm is used in the first case and an image-based LOGO recognition algorithm in the second.
The word processing module 36 is connected to the recognition module 35, and is configured to, if characters are recognized, calculate the word error rate between the recognized characters and the LOGO characters of each entry in the LOGO text database, and select the LOGO name corresponding to the LOGO characters with the smallest word error rate as the name of the LOGO contained in the image when the word error rate is smaller than a first preset threshold.
Specifically, when characters are recognized, the LOGO image is a text LOGO. The recognized characters are therefore matched against the LOGO characters of each entry in the LOGO text database by calculating the word error rate (WER) between the two. The word error rate is a key evaluation index in the field of speech recognition; the lower the WER, the higher the matching degree.
A first preset threshold is therefore set. When the word error rate is not smaller than the first preset threshold, the match between the two is not precise enough; when the word error rate is smaller than the first preset threshold, the match meets the precision requirement. When the precision requirement is met, the LOGO name corresponding to the LOGO characters with the minimum word error rate in the LOGO text database is selected as the LOGO name identified in the image. Preferably, the first preset threshold value is 0.2.
The image processing module 37 is connected to the recognition module 35, and is configured to extract image features of the LOGO image if no text is recognized; calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image when the similarity is larger than a second preset threshold.
Specifically, when no characters are recognized, the LOGO image is a graphic LOGO. Therefore, the image features of the LOGO image are first extracted based on a ViT (Vision Transformer) model, and the image features are then matched against each LOGO feature in the LOGO feature database by calculating the similarity between the two. The higher the similarity, the higher the matching degree. Preferably, the similarity is the cosine similarity.
A second preset threshold is therefore set. When the similarity is not larger than the second preset threshold, the match between the two is not precise enough; when the similarity is larger than the second preset threshold, the match meets the precision requirement. When the precision requirement is met, the LOGO name corresponding to the LOGO feature with the largest similarity in the LOGO feature database is selected as the LOGO name identified in the image. Preferably, the second preset threshold value is 0.8.
In the several embodiments provided in the present invention, it should be understood that the disclosed system, apparatus, or method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules/units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple modules or units may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules or units, which may be in electrical, mechanical or other forms.
The modules/units illustrated as separate components may or may not be physically separate, and components shown as modules/units may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules/units may be selected according to actual needs to achieve the objectives of the embodiments of the present invention. For example, functional modules/units in various embodiments of the invention may be integrated into one processing module, or each module/unit may exist alone physically, or two or more modules/units may be integrated into one module/unit.
Those of ordinary skill would further appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
An embodiment of the invention also provides a computer-readable storage medium. Those of ordinary skill in the art will appreciate that all or part of the steps of the methods in the above embodiments may be carried out by a program that instructs a processor. The program may be stored in a computer-readable storage medium, which is a non-transitory medium such as a random access memory, a read-only memory, a flash memory, a hard disk, a solid-state disk, a magnetic tape, a floppy disk, an optical disk, or any combination thereof. The storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid-state disk (SSD)).
An embodiment of the invention also provides an electronic device. The electronic device includes a processor and a memory.
The memory is used for storing a computer program.
The memory includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, a USB flash drive, a memory card, or an optical disk.
The processor is connected to the memory and executes the computer program stored in the memory, so that the electronic device performs the LOGO identification method described above.
Preferably, the processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
As shown in FIG. 4, the electronic device of the present invention is embodied in the form of a general-purpose computing device. Components of the electronic device may include, but are not limited to: one or more processors or processing units 41, a memory 42, and a bus 43 that connects the different system components (including the memory 42 and the processing unit 41).
Bus 43 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic devices typically include a variety of computer system readable media. Such media can be any available media that can be accessed by the electronic device and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 42 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 421 and/or cache memory 422. The electronic device may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, the storage system 423 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard disk drive"). Although not shown in fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be coupled to bus 43 through one or more data media interfaces. Memory 42 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.
A program/utility 424 having a set (at least one) of program modules 4241 may be stored in, for example, memory 42, such program modules 4241 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 4241 generally perform the functions and/or methodologies of the described embodiments of the invention.
The electronic device may also communicate with one or more external devices (e.g., a keyboard, pointing device, or display), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., a network card or modem) that enables the electronic device to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 44. The electronic device may also communicate with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet, via the network adapter 45. As shown in FIG. 4, the network adapter 45 communicates with the other modules of the electronic device over the bus 43. It should be appreciated that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or vary the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.

Claims (10)

1. A method of LOGO identification, said method comprising the steps of:
constructing a LOGO feature database, wherein the LOGO feature database comprises LOGO names and LOGO features;
constructing a LOGO text database, wherein the LOGO text database comprises LOGO names and LOGO characters;
acquiring an image containing LOGO;
intercepting LOGO images in the images;
performing character recognition in the LOGO image;
if characters are recognized, calculating the word error rate between the recognized characters and each set of LOGO characters in the LOGO text database, and, when the word error rate is smaller than a first preset threshold, selecting the LOGO name corresponding to the LOGO characters with the minimum word error rate as the name of the LOGO contained in the image;
if no characters are recognized, extracting the image features of the LOGO image, calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and, when the similarity is larger than a second preset threshold, selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image.
2. The LOGO recognition method as claimed in claim 1, wherein: intercepting a LOGO image in the image comprises the following steps:
acquiring coordinate information of the LOGO image in the image based on a DETR model;
intercepting the LOGO image from the image based on the coordinate information.
3. The LOGO recognition method as claimed in claim 1, wherein: performing text recognition in the LOGO image comprises the following steps:
identifying characters in the LOGO image based on a CRNN model.
4. The LOGO recognition method as claimed in claim 1, wherein: extracting image features of the LOGO image comprises the following steps:
extracting image features of the LOGO image based on a ViT model.
5. The LOGO recognition method as claimed in claim 1, wherein: the first preset threshold value is 0.2.
6. The LOGO recognition method as claimed in claim 1, wherein: the second preset threshold value is 0.8.
7. The LOGO recognition method as claimed in claim 1, wherein: the similarity adopts cosine similarity.
8. A LOGO identification system, characterized by comprising a first construction module, a second construction module, an acquisition module, an interception module, an identification module, a word processing module and an image processing module;
the first construction module is used for constructing a LOGO feature database, and the LOGO feature database comprises LOGO names and LOGO features;
the second construction module is used for constructing a LOGO text database, and the LOGO text database comprises LOGO names and LOGO characters;
the acquisition module is used for acquiring an image containing LOGO;
the intercepting module is used for intercepting LOGO images in the images;
the identification module is used for carrying out character identification in the LOGO image;
the word processing module is used for, if characters are recognized, calculating the word error rate between the recognized characters and each set of LOGO characters in the LOGO text database, and, when the word error rate is smaller than a first preset threshold, selecting the LOGO name corresponding to the LOGO characters with the minimum word error rate as the name of the LOGO contained in the image;
the image processing module is used for, if no characters are recognized, extracting the image features of the LOGO image, calculating the similarity between the image features and each LOGO feature in the LOGO feature database, and, when the similarity is larger than a second preset threshold, selecting the LOGO name corresponding to the LOGO feature with the largest similarity as the name of the LOGO contained in the image.
9. An electronic device, the electronic device comprising: a processor and a memory;
the memory is used for storing a computer program;
the processor is configured to execute the computer program stored in the memory, to cause the electronic device to perform the LOGO recognition method as claimed in any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the program, when executed by an electronic device, implements the LOGO recognition method as claimed in any one of claims 1 to 7.
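The two-branch decision recited in claims 1 and 5 to 7 can be sketched roughly as follows. This is a minimal illustration and not the applicant's implementation. It assumes the word error rate is a character-level Levenshtein distance divided by the length of the reference LOGO characters, and that both databases are plain dictionaries keyed by LOGO name. The function names, the dictionary layout, and the choice to return `None` when no candidate passes a threshold are all assumptions. Detection (DETR), text recognition (CRNN), and feature extraction (ViT) are outside the sketch; their outputs are passed in as arguments.

```python
import math
from typing import Dict, Optional, Sequence

WER_THRESHOLD = 0.2  # first preset threshold (claim 5)
SIM_THRESHOLD = 0.8  # second preset threshold (claim 6)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def error_rate(recognized: str, reference: str) -> float:
    """Assumed definition: edit distance divided by reference length."""
    if not reference:
        return math.inf
    return edit_distance(recognized, reference) / len(reference)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity (claim 7); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def identify(
    recognized_text: str,
    image_feature: Optional[Sequence[float]],
    text_db: Dict[str, str],
    feature_db: Dict[str, Sequence[float]],
) -> Optional[str]:
    """Return a LOGO name, or None if no candidate passes its threshold."""
    if recognized_text:
        # Text branch: pick the entry with the lowest error rate.
        name, rate = min(
            ((n, error_rate(recognized_text, t)) for n, t in text_db.items()),
            key=lambda pair: pair[1],
            default=(None, math.inf),
        )
        return name if rate < WER_THRESHOLD else None
    if image_feature is None:
        return None
    # Image branch: pick the entry with the highest cosine similarity.
    name, sim = max(
        ((n, cosine_similarity(image_feature, f)) for n, f in feature_db.items()),
        key=lambda pair: pair[1],
        default=(None, -math.inf),
    )
    return name if sim > SIM_THRESHOLD else None


# Toy databases with made-up entries, for illustration only.
text_db = {"Acme": "ACME", "Globex": "GLOBEX"}
feature_db = {"Acme": [1.0, 0.0], "Globex": [0.0, 1.0]}
by_text = identify("GLOBEY", None, text_db, feature_db)
by_image = identify("", [0.9, 0.1], text_db, feature_db)
```

The claims do not say what happens when text is recognized but no entry falls below the first threshold; the sketch simply returns `None` in that case rather than falling back to the image branch.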
CN202310047905.3A 2023-01-31 2023-01-31 LOGO identification method, LOGO identification system, storage medium and electronic equipment Pending CN116361493A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310047905.3A CN116361493A (en) 2023-01-31 2023-01-31 LOGO identification method, LOGO identification system, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN116361493A true CN116361493A (en) 2023-06-30

Family

ID=86940239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310047905.3A Pending CN116361493A (en) 2023-01-31 2023-01-31 LOGO identification method, LOGO identification system, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN116361493A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination