CN109840445A - Method and system for identifying cheating videos - Google Patents

Method and system for identifying cheating videos

Info

Publication number
CN109840445A
Authority
CN
China
Prior art keywords
word set
feature
vocabulary
feature word
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711188045.6A
Other languages
Chinese (zh)
Other versions
CN109840445B (en)
Inventor
张深源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Youku Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Youku Network Technology Beijing Co Ltd
Priority to CN201711188045.6A
Publication of CN109840445A
Application granted
Publication of CN109840445B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application disclose a method and system for identifying cheating videos. The method includes: obtaining the title information of a target video and extracting the feature words in the title information; dividing the feature words into at least one feature word set according to the category to which each feature word belongs, where the feature words in the same feature word set belong to the same category; obtaining the identification threshold associated with the current feature word set and judging, based on that threshold, whether the current feature word set is an abnormal word set; and, if the current feature word set is an abnormal word set, determining that the target video is a cheating video. The technical solution provided by the present application can improve the accuracy with which cheating videos are identified.

Description

Method and system for identifying cheating videos
Technical field
The present application relates to the field of Internet technology, and in particular to a method and system for identifying cheating videos.
Background
With the continuous development of Internet technology, more and more video playback platforms have emerged. These platforms usually count the number of clicks each video receives, so that users can gauge how popular a video's content is from its click count and choose what to watch accordingly.
At present, in order to increase the click count of their cheating videos, some uploaders give those videos false titles. Such a false title may have nothing to do with the actual content of the video and simply piles up currently popular hot search words. When a user searches for a popular video, the falsely titled video then appears in the search results and gains clicks by deception. For example, if a false video title is "The Jinxing Show Super Boy The Voice of China Running Man latest episode", the title will appear in the search results whenever a user searches for "The Jinxing Show" or "The Voice of China".
To pick cheating videos out of the large number of videos, the number of hot search words appearing in a single video title can currently be limited. For example, the upper limit on hot search words in one title can be set to 3, so that a video is judged to be a cheating video as soon as 4 or more hot search words appear in its title. However, this existing method causes many normal videos to be misjudged as cheating videos. Consider a video titled "Deng Chao Zheng Kai Bao Beier Li Chen Happy Camp highlights". Five hot search words appear in this title, so the existing method would judge the video to be a cheating video. In fact, the celebrities named in the title all took part in the same variety show, so their names appear together as a legitimate enumeration rather than merely to pile up hot search words, and the video is not a cheating video. The prior-art method therefore cannot identify cheating videos accurately.
Summary of the invention
The purpose of the embodiments of the present application is to provide a method and system for identifying cheating videos that can improve the accuracy with which cheating videos are identified.
To achieve the above purpose, an embodiment of the present application provides a method for identifying cheating videos, the method including: obtaining the title information of a target video and extracting the feature words in the title information; dividing the feature words into at least one feature word set according to the category to which each feature word belongs, where the feature words in the same feature word set belong to the same category; obtaining the identification threshold associated with the current feature word set and judging, based on that threshold, whether the current feature word set is an abnormal word set; and, if the current feature word set is an abnormal word set, determining that the target video is a cheating video.
To achieve the above purpose, an embodiment of the present application also provides a system for identifying cheating videos. The system includes a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor the following steps are performed: obtaining the title information of a target video and extracting the feature words in the title information; dividing the feature words into at least one feature word set according to the category to which each feature word belongs, where the feature words in the same feature word set belong to the same category; obtaining the identification threshold associated with the current feature word set and judging, based on that threshold, whether the current feature word set is an abnormal word set; and, if the current feature word set is an abnormal word set, determining that the target video is a cheating video.
It can be seen that, when examining the title information of a target video, the technical solution provided by the present application first extracts the feature words in the title information. In practice, the feature words can be current hot search words. After the feature words are extracted, they can be classified to obtain at least one feature word set. Feature word sets of different categories can be associated with different identification thresholds, and each threshold can serve as the upper limit on the number of feature words contained in a feature word set of that category. If the number of feature words in a feature word set exceeds the associated identification threshold, the set is considered an abnormal word set, and the target video can be judged to be a cheating video. The judgment criteria can thus differ between feature word sets. For example, the identification threshold for a feature word set of entertainment celebrities can be somewhat larger, while the threshold for a feature word set of program names can be smaller. Specifically, the value of an identification threshold can be obtained by counting the feature words contained in the titles of normal videos. By applying different criteria to different categories of hot search words, the technical solution avoids the misjudgments caused by a single uniform criterion and so improves the accuracy with which cheating videos are identified.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a block diagram of the method for identifying cheating videos in an embodiment of the present application;
Fig. 2 is a flowchart of the method for identifying cheating videos in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of the system for identifying cheating videos in an embodiment of the present application.
Detailed description
To help those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present application without creative effort shall fall within the scope of protection of the present application.
The present application provides a method for identifying cheating videos, which can be applied in the server of a video playback website. Referring to Fig. 1 and Fig. 2, the method may include the following steps.
S1: obtain the title information of a target video, and extract the feature words in the title information.
In this embodiment, the target video can be a video to be identified. The target video can have title information, which can be the text that the uploader set for the target video. For example, the title information of the target video can be "The Jinxing Show Super Boy The Voice of China Running Man latest episode".
In this embodiment, the title information of the target video can be examined to judge whether the target video is a cheating video. On the server, the data of an uploaded video can be stored in association with information about the video, which may include the video's duration, title, type, and uploader user name. When obtaining the title information of the target video, the character string representing the video title can therefore be read from the video information associated with the target video.
In this embodiment, once the title information of the target video has been obtained, its content can be examined. Specifically, the feature words in the title information can be extracted. The feature words can be the words that are currently searched most often on the video playback website. In practice, the website can count the number of searches for each word within a specified time period and sort the searched words by search count in descending order. The top-ranked words can then serve as the feature words of the website. For example, the website can compile the top 100 hot search words of the past week and use them as its feature words.
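The ranking described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `top_hot_words`, the in-memory list standing in for the website's search log, and the sample queries are all assumptions.

```python
from collections import Counter

def top_hot_words(search_log, top_n=100):
    """Return the top_n most frequently searched terms as a set.

    search_log is assumed to hold one entry per search issued
    within the specified time period (for example, the past week).
    """
    counts = Counter(search_log)
    return {word for word, _ in counts.most_common(top_n)}

# Hypothetical search log: one list entry per search.
log = ["The Voice of China"] * 5 + ["Running Man"] * 3 + ["gardening tips"]
hot_words = top_hot_words(log, top_n=2)
```

In a real deployment the counts would come from the website's search statistics rather than a Python list, and `top_n` would be a tuning parameter such as the 100 mentioned above.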
In this embodiment, when extracting the feature words in the title information, the title information can be segmented to obtain the words it contains. During segmentation, a preset lexicon can be used to recognize the words in the title information. In practice, the title information can be segmented with any of various word segmenters, for example the friso segmenter, the Jcseg segmenter, or the MMSEG4J segmenter. Further, to improve the accuracy of segmenting video titles, the segmenter's dictionary can be built from words commonly used on the video playback website, so that the segmenter's output better matches the way words are used on the website.
In this embodiment, after segmentation yields a number of words, those words that belong to the hot search word set can be taken as the feature words of the title information. The hot search words in the hot search word set can be determined from their search counts within a specified time period. For example, the video playback website can compile the top 100 hot search words of the past week into a hot search word set. After the title information of the target video is segmented into words, the words that are in the hot search word set can then be extracted as feature words. The purpose of extracting feature words in this embodiment is that a cheating video is likely to pile up several current hot search words in its title information in order to gain user clicks by deception. The extracted feature words can therefore be analyzed subsequently to judge whether the target video is a cheating video.
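A rough sketch of step S1 follows. It does not use friso, Jcseg, or MMSEG4J; instead it assumes a simplified greedy forward maximum matching over whitespace-separated tokens as a stand-in for a real segmenter. The names `segment` and `extract_feature_words` and the sample lexicon are hypothetical.

```python
def segment(title, lexicon, max_len=4):
    """Greedy forward maximum matching over whitespace-separated tokens.

    At each position, try the longest phrase (up to max_len tokens)
    found in the lexicon; fall back to a single token otherwise.
    """
    tokens = title.split()
    words, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words

def extract_feature_words(title, lexicon, hot_words):
    """Keep only the segmented words that are in the hot search word set."""
    return [w for w in segment(title, lexicon) if w in hot_words]

# Hypothetical lexicon built from words common on the website.
lexicon = {"The Jinxing Show", "Super Boy", "The Voice of China", "Running Man"}
hot_words = set(lexicon)
title = "The Jinxing Show Super Boy The Voice of China Running Man latest episode"
features = extract_feature_words(title, lexicon, hot_words)
```

Here "latest" and "episode" are segmented as ordinary words and dropped because they are not hot search words.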
S3: divide the feature words into at least one feature word set according to the category to which each feature word belongs, where the feature words in the same feature word set belong to the same category.
In this embodiment, the feature words can be classified by category. The categories can be defined according to users' search intent. Specifically, the categories of feature words may include a program name class, a person class, a self-media class, and a sensitive word class. The program name class can contain the names of variety shows or abbreviations of those names, for example feature words such as "Hurry Up, Brother", "The Jinxing Show", and "The Voice of China". The person class can contain the names or nicknames of public figures, for example feature words such as "Li Chen", "Ma Yun", and "Buffett". The self-media class can contain the titles of PGC (Professionally Generated Content) on the video playback website or the names of uploaders, for example feature words such as "League of Legends", "Brother Tianyou", and "Yueyefeng". The sensitive word class can contain feature words that may steer viewers in an undesirable direction, for example "forced kiss", "explicit", and "love scene".
It should be noted that, in practical application scenarios, any of the above categories can be divided more finely to obtain multiple subclasses within it. For example, the person class may include subclasses such as entertainment figures, financial figures, and political figures.
In this embodiment, after the feature words are selected from the title information of the target video, they can be grouped by category. Feature words belonging to the same category are placed in the same feature word set. In this way at least one feature word set is obtained, and the feature words in the same feature word set belong to the same category. For example, the title "Running Man The Jinxing Show The Voice of China latest episode: watch Bao Beier and Li Chen talk about their ideals" can be divided into two feature word sets: "Running Man, The Jinxing Show, The Voice of China" and "Bao Beier, Li Chen".
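The grouping in step S3 can be sketched as below, assuming that a word-to-category lookup table already exists. The table `CATEGORY_OF`, the category labels, and the function name are illustrative assumptions; the source does not say how categories are assigned to words.

```python
from collections import defaultdict

# Hypothetical lookup from feature word to category.
CATEGORY_OF = {
    "Running Man": "program_name",
    "The Jinxing Show": "program_name",
    "The Voice of China": "program_name",
    "Bao Beier": "person",
    "Li Chen": "person",
}

def group_by_category(feature_words, category_of):
    """Place feature words of the same category into the same set."""
    groups = defaultdict(list)
    for word in feature_words:
        groups[category_of[word]].append(word)
    return dict(groups)

features = ["Running Man", "The Jinxing Show", "The Voice of China",
            "Bao Beier", "Li Chen"]
word_sets = group_by_category(features, CATEGORY_OF)
```

Each key of `word_sets` corresponds to one feature word set in the sense used above.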
S5: obtain the identification threshold associated with the current feature word set, and judge, based on the identification threshold, whether the current feature word set is an abnormal word set.
In general, the number of feature words that appear in the title of a normal video may differ between categories. For example, feature words of the program name class generally do not appear more than three times in one title, whereas feature words for entertainment figures generally do not appear more than five times in one title. To avoid misjudging normal videos as cheating videos, this embodiment can therefore adopt a different identification strategy for each category.
In this embodiment, an identification threshold can be determined in advance for each category of feature word set and used to judge whether the number of feature words in the set is normal. The identification threshold can serve as the upper limit on the number of feature words in the feature word set. If the number of feature words in a feature word set is greater than the identification threshold, the corresponding title information is suspected of piling up hot search words. Because different feature word sets can be associated with different identification thresholds, the threshold associated with the current feature word set is obtained first when that set is judged. Each identification threshold can be stored in the server of the video playback website in association with its category: the category of the feature words serves as the key and the associated identification threshold serves as the value, so the data can be stored as key-value pairs. Once the category of the current feature word set has been determined, the associated identification threshold can be read.
In this embodiment, the identification threshold can be obtained by statistical analysis of the title information of normal videos. Specifically, a preset number of non-cheating titles can be obtained in advance from non-cheating videos, and the maximum number of feature words of a specified category contained in any single non-cheating title can be counted. For example, the titles of 5000 non-cheating videos can be obtained, and for each title the number of feature words of the specified category, such as the program name class, can be counted. Comparing the counts then yields the maximum. This maximum can be taken as the upper limit on the number of feature words of the specified category in a non-cheating video, and so can be used as the identification threshold associated with feature word sets of that category. For example, if analysis of a large number of normal titles shows that a normal video title generally mentions at most 2 program names, the identification threshold for the program name class can be set to 2.
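The statistical derivation of a threshold can be sketched as follows, assuming each non-cheating title has already been reduced to its list of feature words. The function name `derive_threshold`, the lookup table, and the three sample titles are hypothetical; a real sample would be far larger, such as the 5000 titles mentioned above.

```python
def derive_threshold(normal_titles_features, category_of, category):
    """Largest number of feature words of `category` found in any
    single non-cheating title; 0 if the sample is empty."""
    return max(
        (sum(1 for w in words if category_of.get(w) == category)
         for words in normal_titles_features),
        default=0,
    )

# Hypothetical lookup table and sample of non-cheating titles.
category_of = {
    "Running Man": "program_name",
    "Happy Camp": "program_name",
    "Li Chen": "person",
    "Deng Chao": "person",
}
normal_titles_features = [
    ["Running Man", "Li Chen"],
    ["Running Man", "Happy Camp", "Deng Chao"],
    ["Li Chen"],
]
program_threshold = derive_threshold(normal_titles_features, category_of, "program_name")
```

The result could then be stored under the category's key in the key-value store described above.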
In this embodiment, after the identification threshold associated with the current feature word set is obtained, it can be used to judge whether the current feature word set is an abnormal word set. Specifically, if the number of feature words in the current feature word set is greater than the associated identification threshold, the current feature word set can be judged to be an abnormal word set. For example, the identification threshold associated with the program name class can be 2; if a feature word set of the program name class then contains more than 2 feature words, that set can be judged to be an abnormal word set. Conversely, if the number of feature words in the current feature word set is less than or equal to the associated identification threshold, the current feature word set can be judged not to be an abnormal word set.
S7: if the current feature word set is an abnormal word set, determine that the target video is a cheating video.
In this embodiment, if the current feature word set is an abnormal word set, the feature words in it are suspected of piling up hot search words. The title information of a target video can correspond to several feature word sets, and if any one of them is an abnormal word set, the target video can be judged to be a cheating video. For example, in the title "Running Man The Jinxing Show The Voice of China latest episode: watch Bao Beier and Li Chen talk about their ideals", the feature word set "Bao Beier, Li Chen" is a normal word set, but "Running Man, The Jinxing Show, The Voice of China" is an abnormal word set, so the video corresponding to this title can be judged to be a cheating video.
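Steps S5 and S7 can be sketched together as below. The threshold values follow the examples in the text (2 for program names, 5 for entertainment figures), while the dictionary standing in for the server's key-value store and the function names are assumptions.

```python
# Hypothetical key-value store: category (key) -> identification threshold (value).
THRESHOLDS = {"program_name": 2, "entertainment_person": 5}

def is_abnormal(category, words, thresholds):
    """A feature word set is abnormal when it holds more feature
    words than the threshold associated with its category."""
    return len(words) > thresholds[category]

def is_cheating(word_sets, thresholds):
    """The video is flagged if any one of its feature word sets is abnormal."""
    return any(is_abnormal(c, w, thresholds) for c, w in word_sets.items())

word_sets = {
    "program_name": ["Running Man", "The Jinxing Show", "The Voice of China"],
    "entertainment_person": ["Bao Beier", "Li Chen"],
}
flagged = is_cheating(word_sets, THRESHOLDS)
```

In this example the person set is within its limit, but the program name set holds three words against a threshold of 2, so the title is flagged.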
In one embodiment, if all the feature word sets obtained by the division are normal word sets, a further overall judgment can be made as to whether the target video is a cheating video. Specifically, the total number of feature word sets obtained from the title information of the target video can be counted. For example, the title "Running Man latest episode: watch Bao Beier and Li Chen talk about their ideals" contains two feature word sets, so the total number of feature word sets for this title is 2. If the counted total is greater than a specified quantity threshold, the target video can be judged to be a cheating video. The specified quantity threshold limits the number of feature word sets of different categories that may appear together in one title. In some cases, no feature word set in the title exceeds its associated identification threshold, yet the title contains many feature word sets of different categories; such a title should also be judged to be a cheating title. For example, the title "Running Man latest episode: watch Bao Beier and Li Chen talk about their ideals; League of Legends new season; Ma Yun and Buffett teach the way to make money" contains four feature word sets in total (the person class can be divided into entertainment figures and financial figures). The number of feature words in each set is normal, but because there are too many feature word sets, the video corresponding to this title can be judged to be a cheating video.
In this embodiment, the specified quantity threshold can also be obtained by statistical analysis of the title information of non-cheating videos. Specifically, a preset number of non-cheating titles can be obtained from non-cheating videos, and the maximum number of feature word categories contained in any single non-cheating title can be counted. That maximum can then be used as the specified quantity threshold.
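The set-count check and the derivation of its threshold can be sketched as follows. The function names and the small sample of non-cheating groupings are hypothetical; the sample happens to yield a quantity threshold of 3, which is an illustrative value not stated in the source.

```python
def derive_max_sets(normal_word_sets):
    """Largest number of feature word categories seen in any single
    non-cheating title; used as the specified quantity threshold."""
    return max((len(sets) for sets in normal_word_sets), default=0)

def has_too_many_sets(word_sets, max_sets):
    """True when a title mixes more categories than normal titles do."""
    return len(word_sets) > max_sets

# Hypothetical groupings taken from non-cheating titles.
normal_word_sets = [
    {"program_name": ["Running Man"]},
    {"program_name": ["Running Man"], "entertainment_person": ["Li Chen"]},
    {"program_name": ["Happy Camp"], "entertainment_person": ["Deng Chao"],
     "self_media": ["League of Legends"]},
]
max_sets = derive_max_sets(normal_word_sets)

# The four-set title from the example above.
word_sets = {
    "program_name": ["Running Man"],
    "entertainment_person": ["Bao Beier", "Li Chen"],
    "self_media": ["League of Legends"],
    "financial_person": ["Ma Yun", "Buffett"],
}
flagged = has_too_many_sets(word_sets, max_sets)
```

This check is meant to run only after every individual set has passed its own identification threshold.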
In one embodiment, a category can be divided more finely to obtain multiple subclasses within it, so that the feature words in the current feature word set can be assigned to multiple subclasses. For example, the person class may include subclasses such as entertainment figures, financial figures, and political figures. When obtaining the identification threshold associated with the current feature word set, the identification thresholds associated with each subclass in the set can then be obtained. When judging whether the set is abnormal, each subclass can be judged against its own identification threshold to decide whether it is an abnormal subclass. The way a subclass is judged is similar to the way an abnormal word set is judged in the embodiments above and is not repeated here. If the current feature word set contains at least one abnormal subclass, the current feature word set can be judged to be an abnormal word set.
In one embodiment, if all the subclasses in the current feature word set are normal subclasses, the total number of subclasses can likewise be used to judge further whether the current feature word set is an abnormal word set. Specifically, the total number of subclasses contained in the current feature word set can be counted; if it is greater than a specified class threshold, the current feature word set can be judged to be an abnormal word set. The specified class threshold can likewise be obtained by statistical analysis of the title information of non-cheating videos. For example, if the current feature word set contains a subset of entertainment figures, a subset of financial figures, and also a subset of political figures, the current feature word set can be judged to be an abnormal word set.
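The two subclass checks can be sketched as below. The subclass labels, the threshold values, the class threshold of 2, and the placeholder name "Politician A" are all illustrative assumptions.

```python
from collections import defaultdict

def is_abnormal_with_subclasses(words, subclass_of, subclass_thresholds,
                                max_subclasses):
    """Judge one feature word set using its subclasses.

    The set is abnormal if any subclass exceeds its own threshold,
    or if the set spans more subclasses than max_subclasses.
    """
    subsets = defaultdict(list)
    for word in words:
        subsets[subclass_of[word]].append(word)
    if any(len(ws) > subclass_thresholds[s] for s, ws in subsets.items()):
        return True
    return len(subsets) > max_subclasses

# Hypothetical subclass lookup and thresholds for the person class.
subclass_of = {
    "Li Chen": "entertainment",
    "Ma Yun": "financial",
    "Buffett": "financial",
    "Politician A": "political",  # placeholder name, not from the source
}
subclass_thresholds = {"entertainment": 5, "financial": 3, "political": 1}

two_subclasses = is_abnormal_with_subclasses(
    ["Li Chen", "Ma Yun", "Buffett"], subclass_of, subclass_thresholds, 2)
three_subclasses = is_abnormal_with_subclasses(
    ["Li Chen", "Ma Yun", "Politician A"], subclass_of, subclass_thresholds, 2)
```

The second call mirrors the example above: no subclass exceeds its own limit, but the set mixes entertainment, financial, and political figures.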
In one embodiment, if the category of the current feature word set is the self-media category, the identification threshold associated with the self-media category can be obtained by statistical analysis of the titles of videos uploaded by key PGC users. Specifically, multiple non-cheating videos uploaded by users in a designated user group can be obtained, and the title information of each of those videos can be extracted. The designated user group can be the key PGC users mentioned above. A key PGC user can be a PGC user whose number of uploaded videos reaches a specified quantity, or a PGC user in the self-media category who has been certified by the video playback website. The videos uploaded by these key PGC users are usually non-cheating videos, so the titles of their videos can be analyzed statistically to obtain the identification threshold for the self-media category. As in the embodiments above, the maximum number of feature words of the self-media category contained in any single non-cheating title can be counted, and that maximum can be used as the identification threshold associated with the current feature word set.
In one embodiment, a further judgment can be made for feature word sets of the sensitive word class. Specifically, if all the feature word sets obtained by the division are normal word sets, it can be judged whether they include a first feature word set representing sensitive words. If the first feature word set exists, it can further be judged whether the feature word sets other than the first feature word set include a second feature word set representing program names. If the second feature word set exists, the target video can be judged to be a cheating video. The rationale is that, if a title contains only a sensitive-word feature word set whose word count is within limits, the title should not be judged to be a cheating title, because the video may genuinely show content such as a "forced kiss", "explicit" scenes, or a "love scene", and the title of such a video involves no violation. However, if sensitive words and a program name are both edited into the title, the title is suspected of combining the program name with sensitive words to attract clicks. For example, the title "Shall I Compare You to a Spring Day Zhou Dongyu Zhang Yishan wall pin forced kiss passion explicit" contains both a program name and sensitive words, so the video corresponding to this title can be judged to be a cheating video.
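The sensitive-word rule can be sketched as follows, under the assumption that the title's feature word sets are held in a dictionary keyed by category and that every set has already passed its own threshold check. The category labels and the function name are hypothetical.

```python
def combines_sensitive_and_program(word_sets):
    """True when a title holds both a sensitive-word set (the first
    feature word set) and a program-name set (the second)."""
    return "sensitive_word" in word_sets and "program_name" in word_sets

# Title that pairs a program name with sensitive words.
mixed = {
    "program_name": ["Shall I Compare You to a Spring Day"],
    "person": ["Zhou Dongyu", "Zhang Yishan"],
    "sensitive_word": ["forced kiss", "explicit"],
}
# Title that only describes what the video actually shows.
sensitive_only = {"sensitive_word": ["forced kiss"]}

mixed_flagged = combines_sensitive_and_program(mixed)
sensitive_only_flagged = combines_sensitive_and_program(sensitive_only)
```

A title with sensitive words alone is left unflagged by this rule, matching the rationale given above.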
Referring to Fig. 3, the application also provides an identification system for cheating videos. The system comprises a memory and a processor. The memory stores a computer program which, when executed by the processor, implements the following steps.
S1: obtain the title information of a target video, and extract the feature words from the title information.
S3: according to the category to which each feature word belongs, divide the feature words into at least one feature word set, wherein the feature words in the same feature word set belong to the same category.
S5: obtain the recognition threshold associated with the current feature word set, and judge, based on the recognition threshold, whether the current feature word set is an abnormal word set.
S7: if the current feature word set is an abnormal word set, determine that the target video is a cheating video.
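Steps S1 to S7 can be sketched end to end as below. This is an illustrative sketch under stated assumptions: the `lexicon` mapping from feature word to category, the whitespace split that stands in for real word segmentation, and the rule that a category without a configured threshold is never flagged are all hypothetical. The "greater than" comparison follows claim 4.

```python
from collections import defaultdict
from typing import Dict, List


def extract_feature_words(title: str, lexicon: Dict[str, str]) -> List[str]:
    # S1: whitespace splitting stands in for real word segmentation (assumption).
    return [word for word in title.split() if word in lexicon]


def group_by_category(words: List[str], lexicon: Dict[str, str]) -> Dict[str, List[str]]:
    # S3: every word placed in a set shares the same category.
    word_sets: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        word_sets[lexicon[word]].append(word)
    return dict(word_sets)


def is_cheating_video(title: str, lexicon: Dict[str, str], thresholds: Dict[str, int]) -> bool:
    word_sets = group_by_category(extract_feature_words(title, lexicon), lexicon)
    for category, members in word_sets.items():
        # S5: a set is abnormal when it holds more words than its threshold.
        # Categories without a configured threshold are never flagged (assumption).
        threshold = thresholds.get(category, float("inf"))
        if len(members) > threshold:
            return True  # S7: one abnormal set marks the video as cheating.
    return False


lexicon = {"ActorA": "star", "ActorB": "star", "ActorC": "star", "ShowX": "program"}
thresholds = {"star": 2, "program": 1}
print(is_cheating_video("ActorA ActorB ActorC ShowX clip", lexicon, thresholds))  # True
print(is_cheating_video("ActorA ActorB ShowX", lexicon, thresholds))              # False
```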
In the present embodiment, the category of the current feature word set is the self-media category. Correspondingly, when the computer program is executed by the processor, the following steps are also performed:
Obtaining multiple non-cheating videos uploaded by users in a designated user group, and extracting the title information of each of the non-cheating videos;
Counting the maximum number of feature words belonging to the self-media category that appear in a single non-cheating title;
Taking the counted maximum number as the recognition threshold associated with the current feature word set.
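The threshold derivation for the self-media category can be sketched as follows. The record layout (an uploader ID paired with the already extracted title words), the set of designated uploaders, and the function name are assumptions for illustration only.

```python
from typing import Iterable, List, Set, Tuple


def self_media_threshold(
    videos: Iterable[Tuple[str, List[str]]],
    designated_users: Set[str],
    self_media_words: Set[str],
) -> int:
    """Largest number of self-media feature words found in any single title
    uploaded by a user in the designated group."""
    counts = [
        sum(1 for word in title_words if word in self_media_words)
        for uploader, title_words in videos
        if uploader in designated_users  # keep only the designated user group
    ]
    # With no qualifying title, fall back to 0 (assumption).
    return max(counts, default=0)


videos = [
    ("u1", ["vlog", "blogger", "cat"]),
    ("u2", ["vlog"]),
    ("u3", ["vlog", "blogger", "channel"]),  # u3 is outside the designated group
]
print(self_media_threshold(videos, {"u1", "u2"}, {"vlog", "blogger", "channel"}))  # 2
```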
In the present embodiment, when the computer program is executed by the processor, the following steps are also performed:
If the feature word sets obtained by the division are normal word sets, judging whether the divided feature word sets include a first feature word set that characterizes sensitive words;
If the first feature word set exists, judging whether the divided feature word sets, other than the first feature word set, include a second feature word set that characterizes program names;
If the second feature word set exists, determining that the target video is a cheating video.
In the present embodiment, the memory may include a physical device for storing information. Typically, the information is digitized and then stored in media using electrical, magnetic, or optical means. The memory described in the present embodiment may further include: devices that store information using electrical energy, such as RAM and ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, magnetic bubble memories, and USB flash disks; and devices that store information optically, such as CDs or DVDs. Of course, there are also other kinds of memory, such as quantum memories and graphene memories.
In the present embodiment, the processor can be implemented in any suitable manner. For example, the processor can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, a logic gate, a switch, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on.
For the identification system for cheating videos provided by the embodiments of this specification, the specific functions implemented by the memory and the processor can be explained with reference to the foregoing embodiments in this specification, and can achieve the technical effects of the foregoing embodiments, so they are not repeated here.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). However, with the development of technology, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented with hardware entity modules. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually making integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development. The source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog2 are the most commonly used at present. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can be readily obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
It is also known in the art that, besides implementing the identification system for cheating videos purely as computer-readable program code, the method steps can be logically programmed so that the system achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such an identification system for cheating videos can therefore be regarded as a hardware component, and the devices included in it for implementing various functions can also be regarded as structures within the hardware component. Alternatively, the devices for implementing various functions can be regarded both as software modules that implement the method and as structures within the hardware component.
From the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the embodiments of the identification system for cheating videos can be explained with reference to the description of the foregoing method embodiments.
The application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. The application can also be practiced in distributed computing environments, in which tasks are executed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
Although the application has been described through embodiments, those of ordinary skill in the art will appreciate that the application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the application.

Claims (13)

1. A method for identifying cheating videos, characterized in that the method comprises:
Obtaining the title information of a target video, and extracting the feature words from the title information;
Dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category;
Obtaining a recognition threshold associated with a current feature word set, and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set;
If the current feature word set is an abnormal word set, determining that the target video is a cheating video.
2. The method according to claim 1, characterized in that extracting the feature words from the title information comprises:
Segmenting the title information into words to obtain the multiple words contained in the title information;
Taking those of the multiple words that are in a hot-search word set as the feature words of the title information, wherein the hot-search words in the hot-search word set are determined according to their corresponding numbers of searches within a specified period.
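The extraction in claim 2 can be sketched as below. The claim only says that hot-search words are determined from search counts within a specified period; the minimum-count rule used here, the function names, and the assumption that the title has already been segmented into tokens are all hypothetical.

```python
from typing import Dict, List, Set


def build_hot_search_set(search_counts: Dict[str, int], min_searches: int) -> Set[str]:
    # Keep words searched at least `min_searches` times in the period.
    # A fixed cutoff is an assumption; a top-k rule would fit the claim equally well.
    return {word for word, count in search_counts.items() if count >= min_searches}


def extract_feature_words(tokens: List[str], hot_set: Set[str]) -> List[str]:
    # Feature words are the title tokens that also appear in the hot-search set.
    return [token for token in tokens if token in hot_set]


counts = {"ShowX": 12000, "ActorA": 8000, "weather": 40}
hot_set = build_hot_search_set(counts, min_searches=1000)
print(extract_feature_words(["ShowX", "episode", "ActorA", "weather"], hot_set))
# ['ShowX', 'ActorA']
```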
3. The method according to claim 1, characterized in that the recognition threshold is determined as follows:
Obtaining a preset number of non-cheating titles of non-cheating videos, and counting the maximum number of feature words of a specified category contained in a single non-cheating title;
Taking the counted maximum number as the recognition threshold associated with the feature word set of the specified category.
4. The method according to claim 3, characterized in that judging whether the current feature word set is an abnormal word set comprises:
If the number of feature words contained in the current feature word set is greater than the recognition threshold associated with the current feature word set, determining that the current feature word set is an abnormal word set;
If the number of feature words contained in the current feature word set is less than or equal to the recognition threshold associated with the current feature word set, determining that the current feature word set is not an abnormal word set.
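Claims 3 and 4 can be read together as "learn the largest count seen in known-good titles, then flag anything above it". The sketch below illustrates that reading; the `lexicon` mapping, the function names, and the fallback of 0 for an empty sample are assumptions.

```python
from typing import Dict, List


def recognition_threshold(
    non_cheating_titles: List[List[str]], lexicon: Dict[str, str], category: str
) -> int:
    # Claim 3: maximum number of words of `category` seen in a single non-cheating title.
    counts = [
        sum(1 for word in title if lexicon.get(word) == category)
        for title in non_cheating_titles
    ]
    return max(counts, default=0)


def is_abnormal_word_set(word_set: List[str], threshold: int) -> bool:
    # Claim 4: abnormal only when strictly greater than the threshold.
    return len(word_set) > threshold


lexicon = {"ActorA": "star", "ActorB": "star", "ActorC": "star", "ShowX": "program"}
good_titles = [["ActorA", "ActorB", "ShowX"], ["ActorA"], ["ShowX"]]
star_threshold = recognition_threshold(good_titles, lexicon, "star")
print(star_threshold)                                                         # 2
print(is_abnormal_word_set(["ActorA", "ActorB", "ActorC"], star_threshold))   # True
print(is_abnormal_word_set(["ActorA", "ActorB"], star_threshold))             # False
```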
5. The method according to claim 1, characterized in that, when the feature word sets obtained by the division are normal word sets, the method further comprises:
Counting the total number of feature word sets obtained by the division, and, if the counted total is greater than a specified quantity threshold, determining that the target video is a cheating video;
Wherein the specified quantity threshold is determined as follows:
Obtaining a preset number of non-cheating titles of non-cheating videos, and counting the maximum number of feature-word categories contained in a single non-cheating title;
Taking the counted maximum number as the specified quantity threshold.
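The category-count check in claim 5 can be sketched in the same style. As before, the `lexicon` mapping and the function names are hypothetical, and each title is assumed to be a list of already extracted words.

```python
from typing import Dict, List


def category_count_threshold(
    non_cheating_titles: List[List[str]], lexicon: Dict[str, str]
) -> int:
    # Maximum number of distinct feature-word categories seen in one non-cheating title.
    counts = [
        len({lexicon[word] for word in title if word in lexicon})
        for title in non_cheating_titles
    ]
    return max(counts, default=0)


def has_too_many_categories(word_sets: Dict[str, List[str]], threshold: int) -> bool:
    # Each key of `word_sets` is one feature word set, that is, one category.
    return len(word_sets) > threshold


lexicon = {"ActorA": "star", "ShowX": "program", "derby": "sports", "vlog": "self_media"}
good_titles = [["ActorA", "ShowX"], ["derby"], ["ActorA", "other"]]
limit = category_count_threshold(good_titles, lexicon)
print(limit)  # 2
print(has_too_many_categories({"star": ["ActorA"], "program": ["ShowX"], "sports": ["derby"]}, limit))  # True
```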
6. The method according to claim 1, characterized in that the feature words in the current feature word set are further divided into multiple subcategories; correspondingly, obtaining the recognition threshold associated with the current feature word set comprises:
Obtaining the recognition threshold associated with each subcategory in the current feature word set.
7. The method according to claim 6, characterized in that the method further comprises:
Judging, based on the recognition threshold associated with a subcategory, whether the subcategory is an abnormal subcategory;
If at least one abnormal subcategory exists in the current feature word set, determining that the current feature word set is an abnormal word set.
8. The method according to claim 6, characterized in that, if the subcategories in the current feature word set are normal subcategories, the method further comprises:
Counting the total number of subcategories contained in the current feature word set, and, if the counted total number of subcategories is greater than a specified category threshold, determining that the current feature word set is an abnormal word set.
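The subcategory checks of claims 6 to 8 can be combined into one function, sketched below. Claim 7 does not state how a subcategory is compared with its threshold; the "greater than" test is borrowed from claim 4 as an assumption, and the subcategory names and data layout are hypothetical.

```python
from typing import Dict, List


def is_abnormal_by_subcategory(
    subsets: Dict[str, List[str]],
    sub_thresholds: Dict[str, int],
    max_subcategories: int,
) -> bool:
    """`subsets` maps each subcategory of one feature word set to its words."""
    # Claim 7: a single abnormal subcategory makes the whole word set abnormal.
    for subcategory, words in subsets.items():
        if len(words) > sub_thresholds[subcategory]:
            return True
    # Claim 8: all subcategories are normal, so check how many of them there are.
    return len(subsets) > max_subcategories


sub_thresholds = {"football": 2, "basketball": 2, "tennis": 1}
crowded = {"football": ["a", "b", "c"]}
spread = {"football": ["a"], "basketball": ["b"], "tennis": ["c"]}
print(is_abnormal_by_subcategory(crowded, sub_thresholds, max_subcategories=2))  # True
print(is_abnormal_by_subcategory(spread, sub_thresholds, max_subcategories=2))   # True
print(is_abnormal_by_subcategory(spread, sub_thresholds, max_subcategories=3))   # False
```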
9. The method according to claim 1, characterized in that, if the category of the current feature word set is the self-media category, the recognition threshold associated with the current feature word set is determined as follows:
Obtaining multiple non-cheating videos uploaded by users in a designated user group, and extracting the title information of each of the non-cheating videos;
Counting the maximum number of feature words belonging to the self-media category that appear in a single non-cheating title;
Taking the counted maximum number as the recognition threshold associated with the current feature word set.
10. The method according to claim 1, characterized in that, when the feature word sets obtained by the division are normal word sets, the method further comprises:
Judging whether the divided feature word sets include a first feature word set that characterizes sensitive words;
If the first feature word set exists, judging whether the divided feature word sets, other than the first feature word set, include a second feature word set that characterizes program names;
If the second feature word set exists, determining that the target video is a cheating video.
11. A system for identifying cheating videos, characterized in that the system comprises a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor, the following steps are performed:
Obtaining the title information of a target video, and extracting the feature words from the title information;
Dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category;
Obtaining a recognition threshold associated with a current feature word set, and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set;
If the current feature word set is an abnormal word set, determining that the target video is a cheating video.
12. The system according to claim 11, characterized in that the category of the current feature word set is the self-media category; correspondingly, when the computer program is executed by the processor, the following steps are also performed:
Obtaining multiple non-cheating videos uploaded by users in a designated user group, and extracting the title information of each of the non-cheating videos;
Counting the maximum number of feature words belonging to the self-media category that appear in a single non-cheating title;
Taking the counted maximum number as the recognition threshold associated with the current feature word set.
13. The system according to claim 11, characterized in that, when the computer program is executed by the processor, the following steps are also performed:
If the feature word sets obtained by the division are normal word sets, judging whether the divided feature word sets include a first feature word set that characterizes sensitive words;
If the first feature word set exists, judging whether the divided feature word sets, other than the first feature word set, include a second feature word set that characterizes program names;
If the second feature word set exists, determining that the target video is a cheating video.
CN201711188045.6A 2017-11-24 2017-11-24 Method and system for identifying cheating videos Active CN109840445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711188045.6A CN109840445B (en) 2017-11-24 2017-11-24 Method and system for identifying cheating videos

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711188045.6A CN109840445B (en) 2017-11-24 2017-11-24 Method and system for identifying cheating videos

Publications (2)

Publication Number Publication Date
CN109840445A true CN109840445A (en) 2019-06-04
CN109840445B CN109840445B (en) 2021-10-01

Family

ID=66876321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711188045.6A Active CN109840445B (en) 2017-11-24 2017-11-24 Method and system for identifying cheating videos

Country Status (1)

Country Link
CN (1) CN109840445B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077172A (en) * 2011-10-26 2013-05-01 腾讯科技(深圳)有限公司 Method and device for mining cheating user
US8745056B1 (en) * 2008-03-31 2014-06-03 Google Inc. Spam detection for user-generated multimedia items based on concept clustering
US8752184B1 (en) * 2008-01-17 2014-06-10 Google Inc. Spam detection for user-generated multimedia items based on keyword stuffing
CN106202049A (en) * 2016-07-18 2016-12-07 合网络技术(北京)有限公司 A kind of hot word determines method and device
CN106326498A (en) * 2016-10-13 2017-01-11 合网络技术(北京)有限公司 Cheat video identification method and device
CN106326497A (en) * 2016-10-10 2017-01-11 合网络技术(北京)有限公司 Cheating video user identification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王庆福等: ""搜索引擎反作弊方法研究"", 《电脑知识与技术》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950360A (en) * 2020-07-06 2020-11-17 北京奇艺世纪科技有限公司 Method and device for identifying infringing user
CN111950360B (en) * 2020-07-06 2023-08-18 北京奇艺世纪科技有限公司 Method and device for identifying infringement user

Also Published As

Publication number Publication date
CN109840445B (en) 2021-10-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20200514
Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province
Applicant after: Alibaba (China) Co.,Ltd.
Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C
Applicant before: Youku network technology (Beijing) Co., Ltd
GR01 Patent grant