CN111914681A - Advertisement material content identification method and system - Google Patents



Publication number
CN111914681A
Authority
CN
China
Prior art keywords
recognition result
words
brand
commodity
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010669119.3A
Other languages
Chinese (zh)
Inventor
陈瑞
刘建辉
张淼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Minglue Zhaohui Technology Co Ltd
Original Assignee
Beijing Minglue Zhaohui Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Minglue Zhaohui Technology Co Ltd
Priority to CN202010669119.3A
Publication of CN111914681A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Abstract

The invention provides an advertisement material content identification method and system. The method comprises the following steps: recognizing the advertisement material using recognition technologies and outputting a plurality of recognition results; standardizing the plurality of recognition results according to an advertising resource metadata standard library; fusing the plurality of standardized recognition results; and outputting the content information of the advertisement material according to the fusion result. Because the method relies on the advertising resource metadata standard library and fuses a plurality of recognition results, it outputs standardized, uniform recognition results and improves the accuracy of advertisement material content identification.

Description

Advertisement material content identification method and system
Technical Field
The invention relates to the technical field of advertisement identification, in particular to an advertisement material content identification method and system.
Background
Advertisement material contains rich information, such as the commodities and brands that appear in it, together with the corresponding industry, advertiser, and advertisement spokesperson (referred to below as the speaker). Obtaining this information helps business analysis organizations learn about the operating status of enterprises, the state of market segments, and similar business information.
Advertising material generally takes the form of pictures and videos. To acquire the content of a picture or video, the prior art generally applies character recognition (OCR) and speech recognition (ASR) to the material to obtain information such as the commodities and brands that appear in it.
However, OCR and ASR alone cannot recognize the content information of advertisement material well. On the one hand, the recognized content contains a large amount of interfering information that reduces recognition accuracy: for example, common words such as "apple" and "boss", as well as some short strings of English letters, are also names of commodities or brands and therefore interfere with the recognition result. On the other hand, the recognized commodity and brand may not match, that is, the recognized commodity does not belong to the recognized brand, which makes the recognition results inconsistent.
Disclosure of Invention
To solve the technical problems in the prior art of low accuracy in recognizing the content of advertisement material and non-uniform recognition results, the invention provides an advertisement material content identification method.
The invention provides an advertisement material content identification method, which comprises the following steps:
S1: recognizing the advertisement material using recognition technologies and outputting a plurality of recognition results;
S2: standardizing the plurality of recognition results according to an advertising resource metadata standard library;
S3: fusing the plurality of standardized recognition results;
S4: outputting the content information of the advertisement material according to the fusion result of S3.
Further, before step S1, the method further includes:
S0: collecting monitoring information of the advertisement material; when the monitoring information is collected, outputting the content information of the advertisement material according to the monitoring information and the advertising resource metadata standard library; and when the monitoring information is not collected, executing steps S1-S4.
Further, in step S1, the recognition technologies include: character recognition technology, voice recognition technology, logo recognition technology and face recognition technology.
Further, step S2 specifically includes:
S21: standardizing the character recognition result and the voice recognition result according to the advertising resource metadata standard library;
S22: standardizing the logo recognition result according to the advertising resource metadata standard library;
S23: standardizing the face recognition result according to the advertising resource metadata standard library.
Further, in step S21, the method for standardizing the character recognition result and the voice recognition result specifically includes:
S211: determining the commodity words and/or brand words in the character recognition result and the voice recognition result whose confidence is greater than or equal to a first preset threshold;
S212: when the character recognition result and the voice recognition result contain commodity words, taking the commodity words and the brand words to which the commodity words belong;
S213: when the character recognition result and the voice recognition result contain only brand words, taking the brand words;
S214: when the character recognition result and the voice recognition result contain blacklist words, filtering out the blacklist words;
S215: when the character recognition result and the voice recognition result contain commodity aliases and/or brand aliases, taking the commodity words corresponding to the commodity aliases and/or the brand words corresponding to the brand aliases;
S216: when the character recognition result and the voice recognition result contain a plurality of commodity words and/or brand words, determining the commodity word and/or the brand word according to the phrase relationship, the commodity word frequency and the brand word frequency.
Further, in step S216, the method for determining the commodity word and/or the brand word specifically includes:
when the character recognition result and the voice recognition result contain a phrase relationship, taking the commodity word and/or the brand word contained in the phrase relationship;
when the character recognition result and the voice recognition result do not contain a phrase relationship, counting the number of occurrences of each commodity word and/or each brand word, and if the numbers of occurrences differ, taking the commodity word and/or the brand word that occurs most often;
and if the numbers of occurrences are the same, taking the commodity word and/or the brand word with the highest confidence.
Further, in step S22, the method for standardizing the logo recognition result specifically includes:
S221: mapping the logo recognition result to the advertising resource metadata standard library for matching;
S222: when the logo recognition result exactly matches an entry in the advertising resource metadata standard library, taking the mapped brand word.
Further, in step S23, the method for standardizing the face recognition result specifically includes:
S231: determining the speaker name in the face recognition result whose confidence is greater than or equal to a second preset threshold;
S232: mapping the speaker name to the advertising resource metadata standard library for matching;
S233: when the speaker name exactly matches an entry in the advertising resource metadata standard library, taking the mapped speaker name and the information of the speaker.
Further, in step S3, the method for fusing the plurality of standardized recognition results specifically includes:
S31: determining the trust relationship among the standardized character recognition result, voice recognition result and logo recognition result;
S32: when the standardized character recognition result, voice recognition result and logo recognition result are inconsistent, fusing the plurality of standardized recognition results according to the trust relationship.
The invention also provides an advertisement material content identification system, comprising:
a recognition unit for recognizing the advertisement material using recognition technologies and outputting a plurality of recognition results;
a processing unit for standardizing the plurality of recognition results according to an advertising resource metadata standard library;
a fusion unit for fusing the plurality of standardized recognition results;
and an advertisement content output unit for outputting the content information of the advertisement material according to the fusion result of the fusion unit.
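As a rough illustration of how the four units described above could be composed, the following Python sketch chains them as plain callables. The function name `run_system`, the `Result` alias and the parameter names are all hypothetical; the source does not prescribe any particular implementation of the units.

```python
from typing import Callable, Dict, List

# Hypothetical representation: each recognition result is a plain dictionary.
Result = Dict[str, object]


def run_system(
    material: str,
    recognize: Callable[[str], List[Result]],             # recognition unit
    standardize: Callable[[List[Result]], List[Result]],  # processing unit
    fuse: Callable[[List[Result]], Result],               # fusion unit
    output: Callable[[Result], Result],                   # content output unit
) -> Result:
    """Pass the material through the four units in the order described."""
    results = recognize(material)
    standardized = standardize(results)
    fused = fuse(standardized)
    return output(fused)
```

Passing the units in as callables keeps the sketch independent of any particular OCR, ASR, logo or face recognition vendor.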
The invention has the following technical effects and advantages:
(1) The invention provides an advertisement material content identification method that recognizes the advertisement material using recognition technologies, standardizes the plurality of recognition results according to an advertising resource metadata standard library, fuses the plurality of standardized recognition results, and outputs the content information of the advertisement material according to the fusion result. Because the method relies on the advertising resource metadata standard library and fuses a plurality of recognition results, it outputs standardized, uniform recognition results and improves the accuracy of advertisement material content identification.
(2) The advertisement material content identification method provided by the invention is based on the advertising resource metadata standard library and recognizes the advertisement material using character recognition, voice recognition, logo recognition and face recognition technologies. It can therefore not only recognize the commodity information, brand information and speaker information of the advertisement material, but also derive industry information, advertiser information and the like from the recognized commodity, brand and speaker information. This enriches the content information output for the advertisement material and helps extract the valuable information it contains.
(3) According to the advertisement material content identification method provided by the invention, before the advertisement material is identified by adopting an identification technology, the monitoring information of the advertisement material is collected, and when the monitoring information is collected, the content information of the advertisement material is output according to the monitoring information and the advertisement resource metadata standard library, so that the accuracy of the output result is further ensured.
Drawings
FIG. 1 is a flow chart of a method for identifying content of advertising material according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for normalizing logo recognition results according to an embodiment of the present invention;
FIG. 3 is a flowchart of normalizing a face recognition result according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an advertisement material content identification system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be understood that the terms "first," "second," "third," and the like in the description of the invention are used only to distinguish between the items described and do not indicate or imply relative importance. Those skilled in the art can understand the specific meanings of these terms in the present invention according to the specific context.
To solve the technical problems in the prior art of low accuracy in recognizing the content of advertisement material and non-uniform recognition results, the invention provides an advertisement material content identification method.
The technical solution of the present invention will be described in detail below with reference to the specific embodiments and the accompanying drawings.
Referring to FIG. 1, which is a flow chart of a method for identifying the content of advertisement material according to an embodiment of the present invention, the invention provides an advertisement material content identification method comprising the following steps:
S0: collecting monitoring information of the advertisement material; when the monitoring information is collected, outputting the content information of the advertisement material according to the monitoring information and the advertising resource metadata standard library; and when the monitoring information is not collected, executing steps S1-S4. In this embodiment, outputting the content information according to the monitoring information and the advertising resource metadata standard library ensures the accuracy of the output result.
The advertising resource metadata standard library is well known to those skilled in the art, and in this embodiment it may also be constructed and modified according to actual requirements. The library includes, but is not limited to, a commodity thesaurus, a brand thesaurus, a logo materials library, a speaker names library, a blacklist thesaurus, a commodity alias thesaurus, a brand alias thesaurus, an industry thesaurus, and an advertiser thesaurus. The monitoring information of the advertisement material is collected by a crawler. If monitoring information is collected, a campaign ID can be parsed from it, and the content information corresponding to the advertisement material can be obtained from the advertising resource metadata standard library according to the campaign ID. The content information of the advertisement material includes, but is not limited to, commodity information, brand information, industry information, speaker information and advertiser information.
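The S0 shortcut can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the monitoring information is a tracking URL that carries the campaign ID in a `campaign_id` query parameter, and that the standard library can be indexed by campaign ID. The URL shape, the parameter name, the `CAMPAIGN_INDEX` table and its entries are all hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical slice of the standard library, indexed by campaign ID.
CAMPAIGN_INDEX: Dict[str, Dict[str, str]] = {
    "c-1001": {
        "commodity": "ExampleCola",
        "brand": "ExampleBrand",
        "industry": "beverages",
        "advertiser": "Example Co.",
    },
}


def campaign_id_from_monitoring(monitoring_url: Optional[str]) -> Optional[str]:
    """Parse the campaign ID, assuming a `campaign_id` query parameter."""
    if not monitoring_url:
        return None
    values = parse_qs(urlparse(monitoring_url).query).get("campaign_id")
    return values[0] if values else None


def content_from_monitoring(monitoring_url: Optional[str]) -> Optional[Dict[str, str]]:
    """Return content information for a known campaign, otherwise None.

    A None return means the caller should fall back to steps S1-S4.
    """
    campaign_id = campaign_id_from_monitoring(monitoring_url)
    if campaign_id is None:
        return None
    return CAMPAIGN_INDEX.get(campaign_id)
```

Returning `None` rather than raising keeps the fallback to S1-S4 a simple conditional in the caller.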
S1: recognizing the advertisement material using recognition technologies and outputting a plurality of recognition results. The recognition technologies include character recognition technology, voice recognition technology, logo recognition technology and face recognition technology, and the recognition results include a character recognition result, a voice recognition result, a logo recognition result and a face recognition result.
The character recognition, voice recognition, logo recognition and face recognition technologies used in this embodiment are well known to those skilled in the art, who can select the corresponding technologies from different vendors according to actual requirements. Based on the advertising resource metadata standard library, this embodiment recognizes the advertisement material using character recognition, voice recognition, logo recognition and face recognition technologies. It can therefore not only recognize the commodity information, brand information and speaker information of the advertisement material, but also derive industry information, advertiser information and the like from the recognized commodity, brand and speaker information. This enriches the content information output for the advertisement material and helps extract the valuable information it contains.
It should be noted that the character recognition technology can recognize commodity words and brand words in the advertisement material, the voice recognition technology can also recognize commodity words and brand words, the logo recognition technology can recognize brand words, and the face recognition technology can recognize speaker names.
S2: standardizing the plurality of recognition results according to the advertising resource metadata standard library. Step S2 further comprises:
S21: standardizing the character recognition result and the voice recognition result according to the advertising resource metadata standard library;
S22: standardizing the logo recognition result according to the advertising resource metadata standard library;
S23: standardizing the face recognition result according to the advertising resource metadata standard library.
It should be noted that steps S21, S22 and S23 need not be performed in any particular order.
Because the character recognition result and the voice recognition result are similar in form, they are standardized by the same method. In step S21, the character recognition result and the voice recognition result are standardized according to the following steps:
S211: determining the commodity words and/or brand words in the character recognition result and the voice recognition result whose confidence is greater than or equal to a first preset threshold;
S212: when the character recognition result and the voice recognition result contain commodity words, taking the commodity words and the brand words to which the commodity words belong;
S213: when the character recognition result and the voice recognition result contain only brand words, taking the brand words;
S214: when the character recognition result and the voice recognition result contain blacklist words, filtering out the blacklist words;
S215: when the character recognition result and the voice recognition result contain commodity aliases and/or brand aliases, taking the commodity words corresponding to the commodity aliases and/or the brand words corresponding to the brand aliases;
S216: when the character recognition result and the voice recognition result contain a plurality of commodity words and/or brand words, determining the commodity word and/or the brand word according to the phrase relationship, the commodity word frequency and the brand word frequency.
It should be noted that steps S212, S213, S214, S215 and S216 need not be performed in any particular order.
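Steps S211 to S215 can be sketched in Python as a single filtering pass. This is only an illustration under stated assumptions: the `Word` type, the dictionary contents and the threshold value `0.8` are hypothetical (the source gives no value for the first preset threshold), and a real advertising resource metadata standard library would be far larger.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    kind: str          # "commodity" or "brand"
    confidence: float


# Hypothetical excerpts of the standard library.
BLACKLIST = {"apple", "boss"}
COMMODITY_ALIASES = {"iced tea": "ice black tea"}
BRAND_ALIASES = {"acme inc": "acme"}
COMMODITY_TO_BRAND = {"ice black tea": "acme"}
FIRST_THRESHOLD = 0.8  # assumed value; not specified in the source


def standardize_words(words: List[Word]) -> List[Word]:
    """Apply S211-S215 to the words of one OCR or ASR result."""
    kept: List[Word] = []
    for word in words:
        if word.confidence < FIRST_THRESHOLD:   # S211: confidence threshold
            continue
        if word.text in BLACKLIST:              # S214: drop blacklist words
            continue
        aliases = COMMODITY_ALIASES if word.kind == "commodity" else BRAND_ALIASES
        text = aliases.get(word.text, word.text)  # S215: alias to standard word
        kept.append(Word(text, word.kind, word.confidence))
        # S212: a commodity word also contributes the brand it belongs to.
        if word.kind == "commodity" and text in COMMODITY_TO_BRAND:
            kept.append(Word(COMMODITY_TO_BRAND[text], "brand", word.confidence))
    return kept
```

When only brand words survive the filters, the function simply returns them, which corresponds to S213.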
In step S216, the commodity word and/or the brand word is determined according to the following steps:
when the character recognition result and the voice recognition result contain a phrase relationship, taking the commodity word and/or the brand word contained in the phrase relationship;
when the character recognition result and the voice recognition result do not contain a phrase relationship, counting the number of occurrences of each commodity word and/or each brand word, and if the numbers of occurrences differ, taking the commodity word and/or the brand word that occurs most often;
and if the numbers of occurrences are the same, taking the commodity word and/or the brand word with the highest confidence.
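The selection rule of S216 can be sketched as below. The source does not define the phrase relationship precisely, so this sketch assumes it holds when one candidate word is contained in a longer candidate, in which case the longer word is taken. That interpretation and the function name `pick_word` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def pick_word(candidates: List[Tuple[str, float]]) -> Optional[str]:
    """Pick one word from (text, confidence) candidates of a single kind."""
    if not candidates:
        return None
    texts = {text for text, _ in candidates}
    # Assumed phrase relationship: a candidate that contains another candidate.
    longer = [t for t in texts if any(s != t and s in t for s in texts)]
    if longer:
        return max(sorted(longer), key=len)
    counts = Counter(text for text, _ in candidates)
    best_conf: Dict[str, float] = {}
    for text, conf in candidates:
        best_conf[text] = max(conf, best_conf.get(text, 0.0))
    # Most frequent word first; the highest confidence breaks frequency ties.
    return max(counts, key=lambda t: (counts[t], best_conf[t]))
```

Sorting by the pair (frequency, confidence) reproduces the two-stage rule in one comparison: confidence only matters when the frequencies are equal.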
Referring to FIG. 2, in step S22, the logo recognition result is standardized according to the following steps:
S221: mapping the logo recognition result to the advertising resource metadata standard library for matching;
S222: when the logo recognition result exactly matches an entry in the advertising resource metadata standard library, taking the mapped brand word.
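A minimal sketch of S221 and S222 follows, assuming the logo recognizer returns a label string and that the logo materials library can be represented as an exact-match mapping from label to brand word. The label and the mapping contents are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical excerpt of the logo materials library: logo label -> brand word.
LOGO_TO_BRAND: Dict[str, str] = {"acme_logo_v2": "acme"}


def standardize_logo(logo_label: Optional[str]) -> Optional[str]:
    """Return the mapped brand word only when the label matches exactly."""
    if logo_label is None:
        return None
    return LOGO_TO_BRAND.get(logo_label)
```

A partial or fuzzy match deliberately yields `None` here, since S222 takes the brand word only on a complete match.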
Referring to FIG. 3, in step S23, the face recognition result is standardized according to the following steps:
S231: determining the speaker name in the face recognition result whose confidence is greater than or equal to a second preset threshold, where the second preset threshold is 95 in this embodiment;
S232: mapping the speaker name to the advertising resource metadata standard library for matching;
S233: when the speaker name exactly matches an entry in the advertising resource metadata standard library, taking the mapped speaker name and the information of the speaker.
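Steps S231 to S233 can be sketched in the same style. The threshold of 95 comes from this embodiment; the assumption that confidence is reported on a 0 to 100 scale, the `SPEAKERS` table and its fields are hypothetical.

```python
from typing import Dict, Optional

SECOND_THRESHOLD = 95  # value stated for this embodiment (assumed 0-100 scale)

# Hypothetical excerpt of the speaker names library.
SPEAKERS: Dict[str, Dict[str, str]] = {
    "Jane Doe": {"name": "Jane Doe", "profession": "actor"},
}


def standardize_face(name: str, confidence: float) -> Optional[Dict[str, str]]:
    """Return the speaker record for a confident, exactly matching name."""
    if confidence < SECOND_THRESHOLD:   # S231: confidence threshold
        return None
    return SPEAKERS.get(name)           # S232-S233: exact match in the library
```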
S3: fusing the plurality of standardized recognition results. In step S3, the plurality of standardized recognition results are fused according to the following steps:
S31: determining the trust relationship among the standardized character recognition result, voice recognition result and logo recognition result;
S32: when the standardized character recognition result, voice recognition result and logo recognition result are inconsistent, fusing the plurality of standardized recognition results according to the trust relationship.
Because the standardized logo recognition result contains only brand words, it is more targeted than the standardized character and voice recognition results, and the standardized character recognition result is more accurate than the standardized voice recognition result. In this embodiment, the trust relationship is therefore as follows: the trust degree of the standardized logo recognition result is greater than that of the standardized character recognition result, which in turn is greater than that of the standardized voice recognition result.
Because the standardized character recognition result and the standardized voice recognition result may each include commodity words and brand words, while the standardized logo recognition result includes only brand words, the fusion of the plurality of standardized recognition results according to the trust relationship in step S32 may be performed according to the following steps:
S321: when the standardized character recognition result, voice recognition result and logo recognition result are all non-empty, if the brand word in the standardized logo recognition result is inconsistent with the brand word in the character recognition result, taking the brand word in the standardized logo recognition result; and if the commodity word in the standardized character recognition result is inconsistent with the commodity word in the voice recognition result, taking the commodity word in the standardized character recognition result;
S322: when the standardized logo recognition result is empty, if the commodity words and/or brand words in the standardized character recognition result are inconsistent with those in the voice recognition result, taking the commodity words and/or brand words in the standardized character recognition result;
S323: when the standardized character recognition result is empty, if the brand word in the standardized logo recognition result is inconsistent with the brand word in the voice recognition result, taking the brand word in the standardized logo recognition result and discarding the commodity word in the standardized voice recognition result;
S324: when the standardized voice recognition result is empty, if the brand word in the standardized logo recognition result is inconsistent with the brand word in the character recognition result, taking the brand word in the standardized logo recognition result and discarding the commodity word in the standardized character recognition result;
S325: when two of the standardized logo, voice and character recognition results are empty, taking the standardized recognition result that is not empty.
It should be noted that a standardized character recognition result or voice recognition result being empty means that the result includes neither a commodity word nor a brand word. In step S322, the standardized character recognition result is considered inconsistent with the standardized voice recognition result in any of the following cases: (1) the character recognition result includes a commodity word and the voice recognition result has none; (2) the character recognition result includes a brand word and the voice recognition result has none; (3) the character recognition result has no commodity word and the voice recognition result includes one; (4) the character recognition result has no brand word and the voice recognition result includes one; (5) both results include commodity words and brand words, but the commodity words or the brand words differ between the two.
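The fusion rules S321 to S325 can be sketched as one function that applies the trust order logo > character > voice. This is an illustrative reading rather than the definitive procedure: each standardized result is modeled as a dictionary with optional `commodity` and `brand` keys, the logo result as an optional brand word, and, where the text result lacks a field that the speech result has, the sketch falls back to the speech value. That fallback, like all names here, is an assumption.

```python
from typing import Dict, Optional

Result = Dict[str, Optional[str]]  # keys: "commodity", "brand"


def is_empty(result: Result) -> bool:
    """A result is empty when it has neither a commodity nor a brand word."""
    return not (result.get("commodity") or result.get("brand"))


def fuse(text: Result, speech: Result, logo_brand: Optional[str]) -> Result:
    """Fuse standardized OCR, ASR and logo results by trust order."""
    if logo_brand is None:
        if is_empty(speech):   # S325: at most the text result is non-empty
            return dict(text)
        if is_empty(text):     # S325: only the speech result is non-empty
            return dict(speech)
        # S322: text outranks speech (assumed fallback to speech for gaps).
        return {
            "commodity": text.get("commodity") or speech.get("commodity"),
            "brand": text.get("brand") or speech.get("brand"),
        }
    if is_empty(text) and is_empty(speech):   # S325: only the logo is non-empty
        return {"commodity": None, "brand": logo_brand}
    if not is_empty(text) and not is_empty(speech):
        # S321: the logo decides the brand, the text decides the commodity.
        return {
            "commodity": text.get("commodity") or speech.get("commodity"),
            "brand": logo_brand,
        }
    # S323 / S324: exactly one of text and speech is non-empty.
    remaining = text if not is_empty(text) else speech
    if remaining.get("brand") != logo_brand:
        # Brand conflict: keep the logo brand and discard the commodity word.
        return {"commodity": None, "brand": logo_brand}
    return dict(remaining)
```

Discarding the commodity word on a brand conflict in S323 and S324 avoids outputting a commodity that does not belong to the output brand, which is the mismatch problem described in the Background.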
S4: outputting the content information of the advertisement material according to the fusion result of S3. The output includes the commodity information, brand information, industry information, speaker information, advertiser information, and the like of the advertisement material.
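One way to picture S4 is as a lookup that enriches the fused commodity and brand words with industry and advertiser information from the standard library. The sketch below assumes those fields can be keyed by brand word; the `BRAND_METADATA` table, its entries and the function name are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical excerpt of the industry and advertiser thesauri, keyed by brand.
BRAND_METADATA: Dict[str, Dict[str, str]] = {
    "acme": {"industry": "tea drinks", "advertiser": "Acme Beverages"},
}


def build_content_info(
    fused: Dict[str, Optional[str]],
    speaker: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Combine the fused result with library metadata into the final output."""
    meta = BRAND_METADATA.get(fused.get("brand") or "", {})
    return {
        "commodity": fused.get("commodity"),
        "brand": fused.get("brand"),
        "industry": meta.get("industry"),
        "advertiser": meta.get("advertiser"),
        "speaker": speaker,
    }
```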
In the advertisement material content identification method described above, the advertisement material is recognized using recognition technologies, the plurality of recognition results are standardized according to the advertising resource metadata standard library, the standardized recognition results are fused, and the content information of the advertisement material is output according to the fusion result. Because the method relies on the advertising resource metadata standard library and fuses a plurality of recognition results, it outputs standardized, uniform recognition results and improves the accuracy of advertisement material content identification.
As an example, the following describes in detail how the content information of an advertisement material is identified.
In this example, no monitoring information for the advertisement material is collected by the crawler, and the advertisement material is provided at the following link.
Link to the advertisement material:
https://chuangyiku-1255521909.cos.ap-beijing.myqcloud.com/chuangyiku2/37ddc82b54371a5c1031c444c40c6922.mp4?q-sign-algorithm=sha1&q-ak=AKIDQdaLvJLdYJrALDMLejfMn14uWs1yxAQi&q-sign-time=1591574088;1749254088&q-key-time=1591574088;1749254088&q-header-list=&q-url-param-list=&q-signature=4c21ac32f008d4b7814cca792d7ce08b811ec4d8
The advertisement material is recognized using the character recognition technology, the voice recognition technology, the logo recognition technology and the face recognition technology, and a character recognition result, a voice recognition result, a logo recognition result and a face recognition result are output;
The commodity words in the character recognition result are "unified ice black tea" and "Beijing east supermarket", and the brand words in the character recognition result are "unified" and "Beijing east supermarket"; the commodity word in the voice recognition result is "unified ice black tea" and its brand word is "unified"; the brand word in the logo recognition result is "unified"; and the face recognition result is empty;
The character recognition result, the voice recognition result, the logo recognition result and the face recognition result are standardized according to the advertising resource metadata standard library. Under the standardization method for the character recognition result, "unified ice black tea" occurs more often in the character recognition result than "Beijing east supermarket", so the commodity word of the character recognition result is "unified ice black tea", and according to the advertising resource metadata standard library the brand word to which this commodity word belongs is "unified". Under the standardization method for the voice recognition result, the commodity word of the voice recognition result is "unified ice black tea" and the brand word is "unified". Under the standardization method for the logo recognition result, the brand word of the logo recognition result is "unified".
According to the fusion method, the character recognition result, the voice recognition result and the logo recognition result are all non-null, the character recognition result is consistent with the voice recognition result, and the character recognition result is consistent with the logo recognition result. The content information of the advertisement material is therefore output as commodity information: "unified ice black tea" and brand information: "unified". According to the advertising resource metadata standard library, the related industry information: "tea drink" and advertiser information: "unified" can also be output.
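The worked example above can be traced with a short sketch. It is illustrative only: `STANDARD_LIBRARY`, `most_frequent` and `fuse` are hypothetical names, the library excerpt contains just the one entry the example needs, and the occurrence counts (3 and 1) are invented because the text only states that one word occurs more often than the other. Only the branch in which all results agree is covered; the trust-degree-based handling of inconsistent results is left out.

```python
from collections import Counter

# Hypothetical excerpt of the advertising resource metadata standard library.
STANDARD_LIBRARY = {
    "unified ice black tea": {
        "brand": "unified",
        "industry": "tea drink",
        "advertiser": "unified",
    },
}


def most_frequent(words):
    """Pick the word that occurs most often (ties are not handled here)."""
    return Counter(words).most_common(1)[0][0]


def fuse(text_result, voice_result, logo_brand):
    """Return content information when the three standardized results agree."""
    consistent = text_result == voice_result and text_result["brand"] == logo_brand
    if not consistent:
        return None  # inconsistent results are outside the scope of this sketch
    entry = STANDARD_LIBRARY[text_result["commodity"]]
    return {
        "commodity information": text_result["commodity"],
        "brand information": text_result["brand"],
        "industry information": entry["industry"],
        "advertiser information": entry["advertiser"],
    }


# Character recognition: assumed occurrence counts, not taken from the text.
text_words = ["unified ice black tea"] * 3 + ["Beijing east supermarket"]
commodity = most_frequent(text_words)
text_result = {"commodity": commodity, "brand": STANDARD_LIBRARY[commodity]["brand"]}
voice_result = {"commodity": "unified ice black tea", "brand": "unified"}

content = fuse(text_result, voice_result, logo_brand="unified")
```

Looking up the brand, industry and advertiser from the commodity word mirrors the way the example derives them from the standard library rather than from the raw recognition output.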
An embodiment of the present invention further provides an advertisement material content identification system. Referring to fig. 4, the advertisement material content identification system includes:
an identification unit for identifying the advertisement material using an identification technology and outputting a plurality of identification results;
a processing unit for standardizing the plurality of identification results according to the advertising resource metadata standard library;
a fusion unit for fusing the plurality of standardized identification results;
and an advertisement content output unit for outputting the content information of the advertisement material according to the fusion result of the fusion unit.
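As a rough illustration of how the four units could be chained, the following sketch wires them together as interchangeable callables. The class name `AdContentIdentificationSystem`, its field names and the stub units in the usage example are all assumptions made for illustration; real units would wrap the recognition models and the standard library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Result = Dict[str, str]


@dataclass
class AdContentIdentificationSystem:
    identify: Callable[[str], List[Result]]               # identification unit
    standardize: Callable[[List[Result]], List[Result]]   # processing unit
    fuse: Callable[[List[Result]], Result]                # fusion unit
    output: Callable[[Result], Result]                    # advertisement content output unit

    def run(self, material: str) -> Result:
        """Pass the material through the four units in order."""
        results = self.identify(material)
        standardized = self.standardize(results)
        fused = self.fuse(standardized)
        return self.output(fused)


# Usage with placeholder units; each lambda merely stands in for a real unit.
system = AdContentIdentificationSystem(
    identify=lambda material: [{"brand": "Unified "}, {"brand": "unified"}],
    standardize=lambda results: [
        {key: value.strip().lower() for key, value in result.items()}
        for result in results
    ],
    fuse=lambda results: results[0],
    output=lambda fused: {"brand information": fused["brand"]},
)
info = system.run("ad.mp4")
```

Keeping each unit behind a plain callable makes it possible to replace one stage, for example the fusion rule, without touching the others.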
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method for identifying content of advertising material, comprising:
S1: identifying the advertisement material using an identification technology and outputting a plurality of identification results;
S2: standardizing the plurality of identification results according to an advertising resource metadata standard library;
S3: fusing the plurality of standardized identification results;
S4: outputting the content information of the advertisement material according to the fusion result of S3.
2. The advertising material content identification method as claimed in claim 1, further comprising, prior to step S1:
S0: collecting monitoring information of the advertisement material; when the monitoring information is collected, outputting the content information of the advertisement material according to the monitoring information and the advertising resource metadata standard library; and when the monitoring information is not collected, executing steps S1 to S4.
3. The method for identifying content of advertising material according to claim 1, wherein in step S1, the identification technology comprises: a character recognition technology, a voice recognition technology, a logo recognition technology and a face recognition technology.
4. The method for identifying content of advertising material as claimed in claim 3, wherein step S2 comprises:
S21: standardizing the character recognition result and the voice recognition result according to the advertising resource metadata standard library;
S22: standardizing the logo recognition result according to the advertising resource metadata standard library;
S23: standardizing the face recognition result according to the advertising resource metadata standard library.
5. The method for identifying content of advertising material as claimed in claim 4, wherein in step S21, the method for standardizing the character recognition result and the voice recognition result comprises:
S211: determining commodity words and/or brand words whose confidence is greater than or equal to a first preset threshold in the character recognition result and the voice recognition result;
S212: when the character recognition result and the voice recognition result contain a commodity word, taking the commodity word and the brand word to which the commodity word belongs;
S213: when the character recognition result and the voice recognition result contain only brand words, taking the brand words;
S214: when the character recognition result and the voice recognition result contain blacklist words, filtering out the blacklist words;
S215: when the character recognition result and the voice recognition result contain commodity aliases and/or brand aliases, taking the commodity words corresponding to the commodity aliases and/or the brand words corresponding to the brand aliases;
S216: when the character recognition result and the voice recognition result contain a plurality of commodity words and/or brand words, determining the commodity word and/or the brand word according to the phrase relationship, the commodity word frequency and the brand word frequency.
6. The method for identifying content of advertising material as claimed in claim 5, wherein in step S216, the method for determining the commodity word and/or the brand word specifically comprises:
when the character recognition result and the voice recognition result contain a phrase relationship, taking the commodity words and/or brand words contained in the phrase relationship;
when the character recognition result and the voice recognition result do not contain a phrase relationship, counting the number of occurrences of each commodity word and/or each brand word, and if the numbers of occurrences differ, taking the commodity word and/or brand word with the largest number of occurrences;
and if the numbers of occurrences are the same, taking the commodity word and/or brand word with the highest confidence.
7. The method for identifying content of advertising material according to claim 4, wherein in step S22, the method for standardizing the logo recognition result specifically comprises:
S221: mapping the logo recognition result to the advertising resource metadata standard library for matching;
S222: when the logo recognition result completely matches the advertising resource metadata standard library, taking the mapped brand word.
8. The method for identifying content of advertising material as claimed in claim 4, wherein in step S23, the method for standardizing the face recognition result specifically comprises:
S231: determining the speaker name whose confidence is greater than or equal to a second preset threshold in the face recognition result;
S232: mapping the speaker name to the advertising resource metadata standard library for matching;
S233: when the speaker name completely matches the advertising resource metadata standard library, taking the mapped speaker name and the information of the speaker.
9. The method for identifying content of advertising material according to any one of claims 4 to 8, wherein in step S3, the method for fusing the plurality of standardized identification results specifically comprises:
S31: determining the trust degree relationship among the standardized character recognition result, voice recognition result and logo recognition result;
S32: when the standardized character recognition result, voice recognition result and logo recognition result are inconsistent, fusing the plurality of standardized identification results according to the trust degree relationship.
10. An advertising material content identification system, comprising:
an identification unit for identifying the advertisement material using an identification technology and outputting a plurality of identification results;
a processing unit for standardizing the plurality of identification results according to an advertising resource metadata standard library;
a fusion unit for fusing the plurality of standardized identification results;
and an advertisement content output unit for outputting the content information of the advertisement material according to the fusion result of the fusion unit.
CN202010669119.3A 2020-07-13 2020-07-13 Advertisement material content identification method and system Pending CN111914681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010669119.3A CN111914681A (en) 2020-07-13 2020-07-13 Advertisement material content identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010669119.3A CN111914681A (en) 2020-07-13 2020-07-13 Advertisement material content identification method and system

Publications (1)

Publication Number Publication Date
CN111914681A true CN111914681A (en) 2020-11-10

Family

ID=73227957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010669119.3A Pending CN111914681A (en) 2020-07-13 2020-07-13 Advertisement material content identification method and system

Country Status (1)

Country Link
CN (1) CN111914681A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420146A (en) * 2021-06-09 2021-09-21 有米科技股份有限公司 Material brand identification method and device
CN113763054A (en) * 2021-09-24 2021-12-07 支付宝(杭州)信息技术有限公司 Advertisement verification method and device based on block chain and electronic equipment

Similar Documents

Publication Publication Date Title
CN107705066B (en) Information input method and electronic equipment during commodity warehousing
US9230547B2 (en) Metadata extraction of non-transcribed video and audio streams
CN107833082B (en) Commodity picture recommendation method and device
CN108388650B (en) Search processing method and device based on requirements and intelligent equipment
CN109446376B (en) Method and system for classifying voice through word segmentation
US10489637B2 (en) Method and device for obtaining similar face images and face image information
CN110569502A (en) Method and device for identifying forbidden slogans, computer equipment and storage medium
CN111914681A (en) Advertisement material content identification method and system
CN112085568B (en) Commodity and rich media aggregation display method and equipment, electronic equipment and medium
CN110532449B (en) Method, device, equipment and storage medium for processing service document
CN113051380A (en) Information generation method and device, electronic equipment and storage medium
CN110874534A (en) Data processing method and data processing device
CN108229285B (en) Object classification method, object classifier training method and device and electronic equipment
CN111782793A (en) Intelligent customer service processing method, system and equipment
CN110598008A (en) Data quality inspection method and device for recorded data and storage medium
CN109635125B (en) Vocabulary atlas building method and electronic equipment
CN110363206B (en) Clustering of data objects, data processing and data identification method
CN110737770B (en) Text data sensitivity identification method and device, electronic equipment and storage medium
CN109829033B (en) Data display method and terminal equipment
CN114254138A (en) Multimedia resource classification method and device, electronic equipment and storage medium
CN111382343B (en) Label system generation method and device
CN112966125A (en) Geographic position identification method, device and equipment
CN111143559A (en) Triple-based word cloud display method and device
CN113014591B (en) Method and device for detecting counterfeit public numbers, electronic equipment and medium
CN114697762B (en) Processing method, processing device, terminal equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination