CN109840445A - Method and system for identifying cheating videos - Google Patents
- Publication number
- CN109840445A (application CN201711188045.6A)
- Authority
- CN
- China
- Prior art keywords
- word set
- feature
- vocabulary
- feature word
- video
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of this application disclose a method and system for identifying cheating videos. The method comprises: obtaining the title information of a target video and extracting the feature words in the title information; dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category; obtaining a recognition threshold associated with the current feature word set and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set; and, if the current feature word set is an abnormal word set, determining that the target video is a cheating video. The technical solution provided by this application can improve the accuracy of cheating video identification.
Description
Technical field
This application relates to the field of Internet technology, and in particular to a method and system for identifying cheating videos.
Background art
With the continuous development of Internet technology, more and more video playback platforms have emerged. At present, a video playback platform usually counts the number of clicks on each video. Users can then judge how popular a video's content is from its click count and choose which videos to watch accordingly.
At present, to increase the click count of cheating videos, some uploaders configure false titles for them. Such a false title may be unrelated to the actual content of the cheating video and simply piles up currently popular hot search words. When a user searches for a popular video, the false title then appears in the search results and fraudulently gains the user's click. For example, if a false video title is "The Jinxing Show Super Boy The Voice of China Running Man latest episode", it appears in the search results whenever a user searches for "The Jinxing Show" or "The Voice of China".
To pick out cheating videos from a large number of videos, the number of hot search words appearing in a single video title can currently be limited. For example, the upper limit on the number of hot search words in one video title can be set to 3, so that a video is judged to be a cheating video as soon as 4 or more hot search words appear in its title. However, this existing method causes many normal videos to be misjudged as cheating videos. Take a video titled "Deng Chao Zheng Kai Bao Bei'er Li Chen Happy Camp highlights". Five hot search words appear in this title, so the existing method would judge the video to be a cheating video. In fact, the celebrities named in the title all took part in the same variety show, so their names appear together as a normal enumeration rather than merely to pile up hot search words, and the video is not a cheating video. The prior-art method therefore cannot identify cheating videos accurately.
Summary of the invention
The purpose of the embodiments of this application is to provide a method and system for identifying cheating videos that can improve the accuracy of cheating video identification.
To achieve the above purpose, an embodiment of this application provides a method for identifying cheating videos, the method comprising: obtaining the title information of a target video and extracting the feature words in the title information; dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category; obtaining a recognition threshold associated with the current feature word set and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set; and, if the current feature word set is an abnormal word set, determining that the target video is a cheating video.
To achieve the above purpose, an embodiment of this application also provides a system for identifying cheating videos. The system comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the following steps: obtaining the title information of a target video and extracting the feature words in the title information; dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category; obtaining a recognition threshold associated with the current feature word set and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set; and, if the current feature word set is an abnormal word set, determining that the target video is a cheating video.
As can be seen from the above, when the title information of a target video is examined under the technical solution provided by this application, the feature words in the title information can first be extracted. In practice, the feature words can be current hot search words. After extraction, the feature words can be classified to obtain at least one feature word set. Feature word sets of different categories can be associated with different recognition thresholds, and a recognition threshold serves as the upper limit on the number of feature words contained in a feature word set of that category. If the number of feature words in a feature word set exceeds the associated recognition threshold, the set is considered an abnormal word set and the target video can be judged to be a cheating video. The judgment criteria can therefore differ between feature word sets. For example, the recognition threshold for a feature word set of entertainment celebrities can be somewhat larger, while the recognition threshold for a feature word set of program names can be smaller. The value of a recognition threshold can be obtained by counting the number of feature words contained in the titles of normal videos. By applying different criteria to different categories of hot search words, the technical solution avoids the misjudgments caused by a single uniform criterion and thereby improves the accuracy of cheating video identification.
Brief description of the drawings
To explain the embodiments of this application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a block diagram of the method for identifying cheating videos in an embodiment of this application;
Fig. 2 is a flowchart of the method for identifying cheating videos in an embodiment of this application;
Fig. 3 is a schematic structural diagram of the system for identifying cheating videos in an embodiment of this application.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solutions in this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort shall fall within the scope of protection of this application.
This application provides a method for identifying cheating videos. The method can be applied on a server of a video playback website. Referring to Fig. 1 and Fig. 2, the method may include the following steps.
S1: obtaining the title information of a target video, and extracting the feature words in the title information.
In this embodiment, the target video can be a video to be identified. The target video can have title information, which can be text that the video uploader sets for the target video. For example, the title information of the target video can be "The Jinxing Show Super Boy The Voice of China Running Man latest episode".
In this embodiment, the title information of the target video can be examined to judge whether the target video is a cheating video. On the server, the data of an uploaded video can be stored in association with information about that video. This information may include the video's duration, title, type, uploader user name, and so on. When the title information of the target video is needed, the character string representing the video title can thus be read from the video information associated with the target video.
In this embodiment, after the title information of the target video is obtained, its content can be examined. Specifically, the feature words in the title information can be extracted. A feature word can be a word that is currently searched for frequently on the video playback website. In practice, the website can count how many times each word is searched within a specified time period, sort the searched words in descending order of search count, and take the top-ranked words as the feature words of the website. For example, the website can compile the 100 most-searched hot search words of the past week and use them as its feature words.
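As an illustrative sketch only (the function name `build_hot_word_set`, the log format and the sample data are assumptions, not part of the disclosed embodiments), the ranking of searched words by search count within a time window could be written in Python as follows:

```python
from collections import Counter
from datetime import datetime, timedelta

def build_hot_word_set(search_log, now, window=timedelta(days=7), top_n=100):
    """Return the top_n most-searched terms within the window ending at `now`.

    search_log is assumed to be an iterable of (timestamp, term) pairs.
    """
    recent_terms = (term for when, term in search_log if now - window <= when <= now)
    counts = Counter(recent_terms)
    # most_common orders terms by search count, from most to least
    return {term for term, _ in counts.most_common(top_n)}

# Hypothetical search log
now = datetime(2017, 11, 23)
search_log = [
    (datetime(2017, 11, 22), "Running Man"),
    (datetime(2017, 11, 21), "Running Man"),
    (datetime(2017, 11, 20), "Li Chen"),
    (datetime(2017, 11, 20), "Li Chen"),
    (datetime(2017, 11, 19), "cooking tips"),
    (datetime(2017, 10, 1), "old trend"),  # outside the one-week window
    (datetime(2017, 10, 1), "old trend"),
    (datetime(2017, 10, 1), "old trend"),
]
hot_words = build_hot_word_set(search_log, now, top_n=2)
```

Here `top_n=2` is used only to keep the sample small; the text above suggests the top 100 words of the past week.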
In this embodiment, to extract the feature words, the title information can first be segmented into the words it contains. During segmentation, a preset lexicon can be used to recognize the words in the title information. In practice, various word segmenters can be used, for example the friso, Jcseg, or MMSEG4J segmenters. Further, to improve segmentation accuracy on video titles, the segmenter's dictionary can be built from words that are common on the video playback website, so that the segmenter's output better matches the language habits of that website.
In this embodiment, after segmentation yields multiple words, the words that are in the hot search word set can be taken as the feature words of the title information. The hot search words in the hot search word set can be determined from their search counts within a specified period. For example, the video playback website can compile the top 100 hot search words of the past week into a hot search word set. After the title information of the target video is segmented into multiple words, the words that appear in the hot search word set can then be extracted as feature words. Feature words are extracted because a cheating video is likely to pile up several current hot search words in its title information in order to gain user clicks fraudulently. The extracted feature words can therefore be analyzed subsequently to judge whether the target video is a cheating video.
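The two steps above, lexicon-based segmentation followed by filtering against the hot search word set, can be sketched as follows. This is a minimal stand-in, not the behavior of friso, Jcseg or MMSEG4J: it applies forward maximum matching over whitespace-separated tokens of an English rendering of the title, whereas a real deployment would segment Chinese text with a proper segmenter. All names and data are hypothetical.

```python
def segment(title, lexicon):
    """Forward maximum matching: at each position take the longest lexicon phrase."""
    tokens = title.split()
    max_len = max((len(entry.split()) for entry in lexicon), default=1)
    words, i = [], 0
    while i < len(tokens):
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + size])
            # A single token is always accepted so the scan cannot stall
            if size == 1 or phrase in lexicon:
                words.append(phrase)
                i += size
                break
    return words

def extract_feature_words(title, lexicon, hot_words):
    """Keep only the segmented words that are in the hot search word set."""
    return [word for word in segment(title, lexicon) if word in hot_words]

# Hypothetical lexicon and hot search word set
lexicon = {"Jinxing Show", "Super Boy", "The Voice of China", "Running Man", "latest episode"}
hot_words = {"Jinxing Show", "Super Boy", "The Voice of China", "Running Man"}
title = "Jinxing Show Super Boy The Voice of China Running Man latest episode"
feature_words = extract_feature_words(title, lexicon, hot_words)
```

The words "latest episode" are segmented correctly but dropped, because they are not hot search words.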
S3: dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category.
In this embodiment, the feature words can be classified according to the categories to which they belong. The categories can be defined according to users' search intent. Specifically, the categories of feature words can include a program name category, a person category, a self-media category, a sensitive word category, and others. The program name category can contain the names or abbreviated names of variety shows, for example feature words such as "Hurry Up, Brother", "The Jinxing Show", and "The Voice of China". The person category can contain the names or nicknames of public figures, for example "Li Chen", "Jack Ma", and "Buffett". The self-media category can contain the names of PGC (Professional Generated Content) producers or of uploaders on the video playback website, for example "League of Legends", "Brother Tianyou", and "Moonlit Night Maple". The sensitive word category can contain feature words with undesirable connotations, for example "forced kiss", "explicit", and "love scene".
It should be noted that, in practical application scenarios, any of the above categories can be divided more finely to obtain multiple subcategories within it. For example, the person category may include subcategories such as entertainment persons, finance persons, and political persons.
In this embodiment, after the feature words are picked out of the title information of the target video, they can be sorted by category. Feature words that belong to the same category can be placed in one feature word set. In this way at least one feature word set is obtained, and the feature words within one set share the same category. For example, the title information "Running Man The Jinxing Show The Voice of China latest episode: watch Bao Bei'er and Li Chen talk about their ideals" can be divided into two feature word sets: "Running Man, The Jinxing Show, The Voice of China" and "Bao Bei'er, Li Chen".
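The division into feature word sets can be sketched in Python as a simple grouping by category. The mapping `category_of` and the category labels are hypothetical; the text does not specify how the category of a word is looked up.

```python
from collections import defaultdict

def group_by_category(feature_words, category_of):
    """Divide feature words into one feature word set per category."""
    groups = defaultdict(list)
    for word in feature_words:
        groups[category_of[word]].append(word)
    return dict(groups)

# Hypothetical category lookup for the example title
category_of = {
    "Running Man": "program_name",
    "The Jinxing Show": "program_name",
    "The Voice of China": "program_name",
    "Bao Bei'er": "person",
    "Li Chen": "person",
}
feature_words = ["Running Man", "The Jinxing Show", "The Voice of China", "Bao Bei'er", "Li Chen"]
feature_sets = group_by_category(feature_words, category_of)
```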
S5: obtaining a recognition threshold associated with the current feature word set, and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set.
In general, the number of feature words that the title information of a normal video contains can differ from category to category. For example, generally no more than three feature words of the program name category appear in one piece of title information, whereas generally no more than five feature words for entertainment persons appear in one piece of title information. To avoid misjudging normal videos as cheating videos, this embodiment can therefore formulate a different recognition strategy for each category.
In this embodiment, a recognition threshold can be predetermined for each category of feature word set to judge whether the number of feature words in the set is normal. The recognition threshold serves as the upper limit on the number of feature words in the set. If the number of feature words in a feature word set is greater than the recognition threshold, the corresponding title information is suspected of piling up hot search words. Because different feature word sets can be associated with different recognition thresholds, the recognition threshold associated with the current feature word set is obtained first when that set is judged. Each recognition threshold can be stored on the server of the video playback website in association with its category: the category of the feature words serves as the key and the associated recognition threshold as the value, so the data can be stored as key-value pairs. Once the category of the current feature word set is determined, the associated recognition threshold can be read.
In this embodiment, the recognition threshold can be obtained by statistical analysis of the title information of normal videos. Specifically, a preset number of pieces of non-cheating title information can be obtained in advance from non-cheating videos, and the maximum number of feature words of a specified category contained in a single piece of non-cheating title information can be counted. For example, the title information of 5000 non-cheating videos can be obtained, and for each piece the number of feature words of the specified category, such as the program name category, can be counted. Comparing the counts yields their maximum. This maximum can serve as the upper limit on the number of feature words of the specified category in a non-cheating video, so it can be used as the recognition threshold associated with feature word sets of that category. For example, if analysis of a large amount of normal title information shows that the title information of a normal video generally mentions at most 2 program names, the recognition threshold for the program name category can be set to 2.
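A minimal Python sketch of this statistic, assuming each non-cheating title has already been reduced to its list of feature words (the function name and the sample data are hypothetical):

```python
def derive_recognition_threshold(normal_titles, category, category_of):
    """Maximum number of feature words of `category` found in any one non-cheating title.

    normal_titles holds one list of feature words per non-cheating title.
    """
    per_title_counts = (
        sum(1 for word in words if category_of.get(word) == category)
        for words in normal_titles
    )
    return max(per_title_counts, default=0)

# Hypothetical sample of non-cheating titles
category_of = {
    "Running Man": "program_name",
    "The Jinxing Show": "program_name",
    "The Voice of China": "program_name",
    "Happy Camp": "program_name",
    "Deng Chao": "person",
    "Zheng Kai": "person",
    "Li Chen": "person",
}
normal_titles = [
    ["Running Man", "Li Chen"],
    ["The Jinxing Show", "The Voice of China"],
    ["Deng Chao", "Zheng Kai", "Li Chen", "Happy Camp"],
]
thresholds = {
    category: derive_recognition_threshold(normal_titles, category, category_of)
    for category in ("program_name", "person")
}
```

The resulting dictionary doubles as the category-to-threshold key-value store described earlier.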
In this embodiment, after the recognition threshold associated with the current feature word set is obtained, whether the current feature word set is an abnormal word set can be judged based on that threshold. Specifically, if the number of feature words in the current feature word set is greater than the recognition threshold associated with it, the current feature word set can be judged to be an abnormal word set. For example, the recognition threshold associated with feature word sets of the program name category can be 2; if a feature word set of the program name category contains more than 2 feature words, that set can be judged to be an abnormal feature word set. Conversely, if the number of feature words in the current feature word set is less than or equal to the associated recognition threshold, the current feature word set can be judged not to be an abnormal word set.
S7: if the current feature word set is an abnormal word set, determining that the target video is a cheating video.
In this embodiment, if the current feature word set is an abnormal word set, the feature words in it are suspected of being piled-up hot search words. The title information of the target video can correspond to several feature word sets, and if any one of them is an abnormal word set, the target video can be judged to be a cheating video. For example, in the title information "Running Man The Jinxing Show The Voice of China latest episode: watch Bao Bei'er and Li Chen talk about their ideals", the feature word set "Bao Bei'er, Li Chen" is a normal word set, but "Running Man, The Jinxing Show, The Voice of China" is an abnormal word set, so the video corresponding to this title information can be judged to be a cheating video.
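Steps S5 and S7 can be sketched together in Python. The thresholds of 2 for program names and 5 for entertainment persons follow the examples in the text; the function names, category labels and dictionary layout are assumptions.

```python
def is_abnormal_set(category, words, thresholds):
    """S5: a set is abnormal when it holds more words than its category's threshold."""
    return len(words) > thresholds[category]

def is_cheating_title(feature_sets, thresholds):
    """S7: one abnormal feature word set is enough to flag the video."""
    return any(
        is_abnormal_set(category, words, thresholds)
        for category, words in feature_sets.items()
    )

# Category -> recognition threshold, stored as key-value pairs
thresholds = {"program_name": 2, "entertainment_person": 5}

feature_sets = {
    "program_name": ["Running Man", "The Jinxing Show", "The Voice of China"],
    "entertainment_person": ["Bao Bei'er", "Li Chen"],
}
flagged = is_cheating_title(feature_sets, thresholds)
```

A set whose size equals the threshold is still treated as normal, matching the "less than or equal to" condition stated above.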
In one embodiment, if all the feature word sets obtained by division are normal word sets, a further overall judgment can be made as to whether the target video is a cheating video. Specifically, the total number of feature word sets obtained from the title information of the target video can be counted. For example, the title information "Running Man latest episode: watch Bao Bei'er and Li Chen talk about their ideals" contains two feature word sets, so its total is 2. If the counted total is greater than a specified quantity threshold, the target video can be judged to be a cheating video. The specified quantity threshold limits the number of feature word sets of different categories that may appear in the same title information. In some cases, no feature word set in the title information exceeds its associated recognition threshold, but the title information contains many feature word sets of different categories; such title information should also be judged to be cheating title information. For example, the title "Running Man latest episode: watch Bao Bei'er and Li Chen talk about their ideals, League of Legends reveals new season, Jack Ma and Buffett teach the way to make money" contains four feature word sets in total (the person category being divided into entertainment persons and finance persons). The number of feature words in each set is normal, but because there are too many feature word sets, the video corresponding to this title information can be judged to be a cheating video.
In this embodiment, the specified quantity threshold can also be obtained by statistical analysis of the title information of non-cheating videos. Specifically, a preset number of pieces of non-cheating title information can be obtained from non-cheating videos, and the maximum number of feature word categories contained in a single piece of non-cheating title information can be counted. The counted maximum can then be used as the specified quantity threshold.
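A sketch of this second check, under the assumption that each title has already been divided into a dictionary mapping category to feature words. The sample titles and the resulting threshold of 3 are hypothetical; the text gives no concrete value.

```python
def derive_set_count_threshold(normal_titles):
    """Maximum number of feature word categories found in any one non-cheating title."""
    return max((len(feature_sets) for feature_sets in normal_titles), default=0)

def has_too_many_sets(feature_sets, set_count_threshold):
    """Flag a title whose number of feature word sets exceeds the quantity threshold."""
    return len(feature_sets) > set_count_threshold

# Hypothetical non-cheating titles, each already divided into feature word sets
normal_titles = [
    {"program_name": ["Running Man"], "entertainment_person": ["Bao Bei'er", "Li Chen"]},
    {"self_media": ["League of Legends"]},
    {"program_name": ["Happy Camp"], "entertainment_person": ["Deng Chao"], "self_media": ["League of Legends"]},
]
set_count_threshold = derive_set_count_threshold(normal_titles)

# The four-set example from the text: every set is small, but there are too many sets
suspect = {
    "program_name": ["Running Man"],
    "entertainment_person": ["Bao Bei'er", "Li Chen"],
    "self_media": ["League of Legends"],
    "finance_person": ["Jack Ma", "Buffett"],
}
flagged = has_too_many_sets(suspect, set_count_threshold)
```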
In one embodiment, a category can be divided more finely to obtain multiple subcategories within it, so that the feature words in the current feature word set can be assigned to several subcategories. For example, the person category may include subcategories such as entertainment persons, finance persons, and political persons. When the recognition threshold associated with the current feature word set is obtained, the recognition thresholds associated with each subcategory in the current feature word set can then be obtained. When the abnormal word set is subsequently judged, whether a subcategory is an abnormal subcategory can be judged based on the recognition threshold associated with that subcategory. The way of judging whether a subcategory is abnormal is similar to the way of judging an abnormal word set described in the above embodiment and is not repeated here. If the current feature word set contains at least one abnormal subcategory, the current feature word set can be judged to be an abnormal word set.
In one embodiment, if all the subcategories in the current feature word set are normal subcategories, whether the current feature word set is an abnormal feature word set can likewise be judged further from the total number of subcategories. Specifically, the total number of subcategories contained in the current feature word set can be counted; if this total is greater than a specified category threshold, the current feature word set can be judged to be an abnormal word set. The specified category threshold can likewise be obtained by statistical analysis of the title information of non-cheating videos. For example, if the current feature word set contains a subset of entertainment persons, a subset of finance persons, and also a subset of political persons, the current feature word set can be determined to be an abnormal word set.
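The two subcategory checks can be combined in one Python sketch. The per-subcategory thresholds other than the value 5 for entertainment persons, and the specified category threshold of 2, are hypothetical values chosen so that the three-subcategory example above is flagged.

```python
def is_abnormal_by_subcategory(subsets, sub_thresholds, max_subcategories):
    """Judge a feature word set that has been split into subcategories.

    The set is abnormal if any subcategory exceeds its own recognition threshold,
    or if the set spans more subcategories than the specified category threshold.
    """
    if any(len(words) > sub_thresholds[sub] for sub, words in subsets.items()):
        return True
    return len(subsets) > max_subcategories

# Hypothetical thresholds for subcategories of the person category
sub_thresholds = {"entertainment": 5, "finance": 3, "politics": 2}

person_set = {
    "entertainment": ["Bao Bei'er", "Li Chen"],
    "finance": ["Jack Ma", "Buffett"],
    "politics": ["(some political figure)"],  # placeholder entry
}
flagged = is_abnormal_by_subcategory(person_set, sub_thresholds, max_subcategories=2)
```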
In one embodiment, if the category of the current feature word set is the self-media category, the recognition threshold associated with the self-media category can be obtained by statistical analysis of the title information of videos uploaded by key PGC users. Specifically, multiple non-cheating videos uploaded by users in a designated user group can be obtained, and the title information of each of these non-cheating videos can be extracted. The designated user group can be the key PGC users mentioned above. A key PGC user can be a PGC user whose number of uploaded videos reaches a specified quantity, or a self-media PGC user certified by the video playback website. The videos uploaded by these key PGC users are usually non-cheating videos, so the title information of their uploads can be analyzed statistically to obtain the recognition threshold for the self-media category. As in the above embodiment, the maximum number of feature words of the self-media category in a single piece of non-cheating title information can be counted, and the counted maximum can be used as the recognition threshold associated with the current feature word set.
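A sketch of how the self-media threshold might be derived from key PGC users. The record fields (`uploads`, `certified`, `uploader`, `feature_words`), the upload cut-off of 100 and all sample data are assumptions made for illustration.

```python
def select_key_pgc_users(users, min_uploads):
    """Key PGC users: enough uploads, or certified by the video playback website."""
    return {u["name"] for u in users if u["certified"] or u["uploads"] >= min_uploads}

def self_media_threshold(videos, key_users, category_of):
    """Maximum number of self-media feature words in one title from a key PGC user."""
    counts = (
        sum(1 for word in video["feature_words"] if category_of.get(word) == "self_media")
        for video in videos
        if video["uploader"] in key_users
    )
    return max(counts, default=0)

# Hypothetical users and videos
users = [
    {"name": "studio_a", "uploads": 500, "certified": False},
    {"name": "studio_b", "uploads": 3, "certified": True},
    {"name": "casual_c", "uploads": 2, "certified": False},
]
category_of = {
    "League of Legends": "self_media",
    "Brother Tianyou": "self_media",
    "Moonlit Night Maple": "self_media",
    "Li Chen": "person",
}
videos = [
    {"uploader": "studio_a", "feature_words": ["League of Legends", "Li Chen"]},
    {"uploader": "studio_b", "feature_words": ["League of Legends", "Brother Tianyou"]},
    # Not a key PGC user, so this title is ignored when deriving the threshold
    {"uploader": "casual_c", "feature_words": ["League of Legends", "Brother Tianyou", "Moonlit Night Maple"]},
]
key_users = select_key_pgc_users(users, min_uploads=100)
threshold = self_media_threshold(videos, key_users, category_of)
```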
In one embodiment, a further judgment can be made for feature word sets of the sensitive word category. Specifically, if all the feature word sets obtained by division are normal word sets, it can be judged whether they include a first feature word set representing sensitive words. If the first feature word set exists, it can further be judged whether the feature word sets other than the first one include a second feature word set representing program names. If the second feature word set exists, the target video can be judged to be a cheating video. The rationale is as follows. If a piece of title information contains only a sensitive word feature set whose word count is within the threshold, it should not be judged to be cheating title information, because the video may really show content such as a "forced kiss", "explicit" material, or a "love scene", and the title information of such a video does not violate any rule. However, if sensitive words and program names are placed in the same title information, the title is suspected of combining program names with sensitive words to attract user clicks. For example, the title information "Shall I Compare You to a Spring Day Zhou Dongyu Zhang Yishan kabedon forced kiss passion explicit" contains both a program name and sensitive words, so the corresponding video can be judged to be a cheating video.
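This rule can be sketched in Python as a check applied after every feature word set has already passed its own threshold check. The category labels and the example data are hypothetical.

```python
def violates_sensitive_rule(feature_sets):
    """Apply the sensitive word rule to a title whose sets are all normal.

    Assumes every set in feature_sets has already passed its threshold check.
    """
    has_sensitive_set = "sensitive" in feature_sets
    # Look for a program name set among the sets other than the sensitive one
    others = {c: w for c, w in feature_sets.items() if c != "sensitive"}
    return has_sensitive_set and "program_name" in others

combined = {
    "program_name": ["Shall I Compare You to a Spring Day"],
    "entertainment_person": ["Zhou Dongyu", "Zhang Yishan"],
    "sensitive": ["forced kiss", "explicit"],
}
sensitive_only = {"sensitive": ["forced kiss", "explicit"]}

flagged_combined = violates_sensitive_rule(combined)
flagged_sensitive_only = violates_sensitive_rule(sensitive_only)
```

A title containing only sensitive words within the threshold is left alone, as the text explains, while the combination with a program name is flagged.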
Referring to Fig. 3, this application also provides a system for identifying cheating videos. The system comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the following steps.
S1: obtaining the title information of a target video, and extracting the feature words in the title information.
S3: dividing the feature words into at least one feature word set according to the category to which each feature word belongs, wherein the feature words in the same feature word set belong to the same category.
S5: obtaining a recognition threshold associated with the current feature word set, and judging, based on the recognition threshold, whether the current feature word set is an abnormal word set.
S7: if the current feature word set is an abnormal word set, determining that the target video is a cheating video.
In this embodiment, the category of the current feature word set is the self-media category; correspondingly, when the computer program is executed by the processor, the following steps are also implemented:
obtaining multiple non-cheating videos uploaded by users in a designated user group, and extracting the title information of each of the non-cheating videos;
counting the maximum number of feature words of the self-media category in a single piece of non-cheating title information;
using the counted maximum as the recognition threshold associated with the current feature word set.
In this embodiment, when the computer program is executed by the processor, the following steps are also implemented:
If the feature word finder that division obtains belongs to normal word finder, judging to divide in obtained feature word finder is
It is no to there is the fisrt feature word finder for characterizing sensitive vocabulary;
The fisrt feature word finder if it exists judges the feature divided in addition to the fisrt feature word finder
With the presence or absence of the second feature word finder of characterization programm name in word finder;
The second feature word finder if it exists determines the target video for video of practising fraud.
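The co-occurrence check above can be sketched as follows, assuming the feature vocabulary sets are held in a dictionary keyed by category. The category keys `"sensitive"` and `"program_name"` and the function name are hypothetical.

```python
def has_sensitive_and_program(groups, sensitive="sensitive", program="program_name"):
    """Return True when both a sensitive-word set and a program-name set exist."""
    # First look for a non-empty set that characterizes sensitive words.
    if not groups.get(sensitive):
        return False
    # Then look for a program-name set among the remaining sets.
    remaining = {cat: words for cat, words in groups.items() if cat != sensitive}
    return bool(remaining.get(program))


# Hypothetical grouped feature words for two titles.
flagged = has_sensitive_and_program(
    {"sensitive": {"explicit"}, "program_name": {"show_x"}}
)
not_flagged = has_sensitive_and_program({"program_name": {"show_x"}})
```

This check is meant to run only after every set has passed its own threshold test, so it catches titles whose individual sets look normal but whose combination is suspicious.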
In this embodiment, the memory may include a physical device for storing information, which typically digitizes the information and then stores it in a medium using electrical, magnetic, or optical means. The memory described in this embodiment may further include: devices that store information using electrical energy, such as RAM and ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, magnetic bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. Of course, there are also other types of memory, such as quantum memory and graphene memory.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on.
For the specific functions implemented by the memory and the processor of the cheating video identification system provided by the embodiments of this specification, reference may be made to the foregoing embodiments in this specification; the system can achieve the technical effects of those embodiments, which are not repeated here.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art will also appreciate that a hardware circuit implementing a logical method flow can be readily obtained simply by programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
Those skilled in the art also know that, in addition to implementing the cheating video identification system purely as computer-readable program code, the method steps can be logically programmed so that the system achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a cheating video identification system can therefore be regarded as a hardware component, and the devices included in it for implementing various functions can also be regarded as structures within the hardware component. Alternatively, the devices for implementing various functions can even be regarded both as software modules implementing the method and as structures within the hardware component.
As can be seen from the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the embodiment of the cheating video identification system can be explained with reference to the description of the foregoing method embodiment.
The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Although the present application has been described by way of embodiments, those of ordinary skill in the art will appreciate that many variations and changes can be made to it without departing from its spirit, and it is intended that the appended claims cover such variations and changes.
Claims (13)
1. A method for identifying a cheating video, characterized in that the method comprises:
obtaining title information of a target video, and extracting feature words from the title information;
dividing the feature words into at least one feature vocabulary set according to the category to which each feature word belongs, wherein the feature words in the same feature vocabulary set belong to the same category;
obtaining a recognition threshold associated with a current feature vocabulary set, and judging, based on the recognition threshold, whether the current feature vocabulary set is an abnormal vocabulary set;
if the current feature vocabulary set is an abnormal vocabulary set, determining that the target video is a cheating video.
2. The method according to claim 1, characterized in that extracting the feature words from the title information comprises:
performing word segmentation on the title information to obtain a plurality of words contained in the title information;
taking those of the plurality of words that are in a hot-search vocabulary set as the feature words of the title information, wherein the hot-search words in the hot-search vocabulary set are determined according to their corresponding numbers of searches within a specified period.
3. The method according to claim 1, characterized in that the recognition threshold is determined as follows:
obtaining a preset number of pieces of non-cheating title information of non-cheating videos, and counting the maximum number of feature words of a specified category contained in a single piece of non-cheating title information;
using the counted maximum number as the recognition threshold associated with the feature vocabulary set of the specified category.
4. The method according to claim 3, characterized in that judging whether the current feature vocabulary set is an abnormal vocabulary set comprises:
if the number of feature words contained in the current feature vocabulary set is greater than the recognition threshold associated with the current feature vocabulary set, determining that the current feature vocabulary set is an abnormal vocabulary set;
if the number of feature words contained in the current feature vocabulary set is less than or equal to the recognition threshold associated with the current feature vocabulary set, determining that the current feature vocabulary set is not an abnormal vocabulary set.
5. The method according to claim 1, characterized in that, if the feature vocabulary sets obtained by the division are all normal vocabulary sets, the method further comprises:
counting the total number of feature vocabulary sets obtained by the division, and if the counted total number is greater than a specified quantity threshold, determining that the target video is a cheating video;
wherein the specified quantity threshold is determined as follows:
obtaining a preset number of pieces of non-cheating title information of non-cheating videos, and counting the maximum number of categories of feature words contained in a single piece of non-cheating title information;
using the counted maximum number as the specified quantity threshold.
6. The method according to claim 1, characterized in that the feature words in the current feature vocabulary set are further divided into a plurality of subcategories; correspondingly, obtaining the recognition threshold associated with the current feature vocabulary set comprises:
obtaining the recognition thresholds respectively associated with the subcategories in the current feature vocabulary set.
7. The method according to claim 6, characterized in that the method further comprises:
judging, based on the recognition threshold associated with a subcategory, whether the subcategory is an abnormal subcategory;
if at least one abnormal subcategory exists in the current feature vocabulary set, determining that the current feature vocabulary set is an abnormal vocabulary set.
8. The method according to claim 6, characterized in that, if the subcategories in the current feature vocabulary set are all normal subcategories, the method further comprises:
counting the total number of subcategories contained in the current feature vocabulary set, and if the counted total number of subcategories is greater than a specified category threshold, determining that the current feature vocabulary set is an abnormal vocabulary set.
9. The method according to claim 1, characterized in that, if the category of the current feature vocabulary set is the self-media category, the recognition threshold associated with the current feature vocabulary set is determined as follows:
obtaining a plurality of non-cheating videos uploaded by users in a designated user group, and extracting the title information of each of the non-cheating videos;
counting the maximum number of feature words belonging to the self-media category that appear in a single piece of non-cheating title information;
using the counted maximum number as the recognition threshold associated with the current feature vocabulary set.
10. The method according to claim 1, characterized in that, if the feature vocabulary sets obtained by the division are all normal vocabulary sets, the method further comprises:
judging whether a first feature vocabulary set characterizing sensitive words exists among the feature vocabulary sets obtained by the division;
if the first feature vocabulary set exists, judging whether a second feature vocabulary set characterizing program names exists among the feature vocabulary sets other than the first feature vocabulary set;
if the second feature vocabulary set exists, determining that the target video is a cheating video.
11. A system for identifying a cheating video, characterized in that the system comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the following steps:
obtaining title information of a target video, and extracting feature words from the title information;
dividing the feature words into at least one feature vocabulary set according to the category to which each feature word belongs, wherein the feature words in the same feature vocabulary set belong to the same category;
obtaining a recognition threshold associated with a current feature vocabulary set, and judging, based on the recognition threshold, whether the current feature vocabulary set is an abnormal vocabulary set;
if the current feature vocabulary set is an abnormal vocabulary set, determining that the target video is a cheating video.
12. The system according to claim 11, characterized in that the category of the current feature vocabulary set is the self-media category; correspondingly, when executed by the processor, the computer program further implements the following steps:
obtaining a plurality of non-cheating videos uploaded by users in a designated user group, and extracting the title information of each of the non-cheating videos;
counting the maximum number of feature words belonging to the self-media category that appear in a single piece of non-cheating title information;
using the counted maximum number as the recognition threshold associated with the current feature vocabulary set.
13. The system according to claim 11, characterized in that, when executed by the processor, the computer program further implements the following steps:
if the feature vocabulary sets obtained by the division are all normal vocabulary sets, judging whether a first feature vocabulary set characterizing sensitive words exists among them;
if the first feature vocabulary set exists, judging whether a second feature vocabulary set characterizing program names exists among the feature vocabulary sets other than the first feature vocabulary set;
if the second feature vocabulary set exists, determining that the target video is a cheating video.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711188045.6A CN109840445B (en) | 2017-11-24 | 2017-11-24 | Method and system for identifying cheating videos |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711188045.6A CN109840445B (en) | 2017-11-24 | 2017-11-24 | Method and system for identifying cheating videos |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109840445A (en) | 2019-06-04 |
CN109840445B (en) | 2021-10-01 |
Family ID: 66876321
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711188045.6A Active CN109840445B (en) | 2017-11-24 | 2017-11-24 | Method and system for identifying cheating videos |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109840445B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103077172A (en) * | 2011-10-26 | 2013-05-01 | 腾讯科技(深圳)有限公司 | Method and device for mining cheating user |
US8745056B1 (en) * | 2008-03-31 | 2014-06-03 | Google Inc. | Spam detection for user-generated multimedia items based on concept clustering |
US8752184B1 (en) * | 2008-01-17 | 2014-06-10 | Google Inc. | Spam detection for user-generated multimedia items based on keyword stuffing |
CN106202049A (en) * | 2016-07-18 | 2016-12-07 | 合网络技术(北京)有限公司 | A kind of hot word determines method and device |
CN106326498A (en) * | 2016-10-13 | 2017-01-11 | 合网络技术(北京)有限公司 | Cheat video identification method and device |
CN106326497A (en) * | 2016-10-10 | 2017-01-11 | 合网络技术(北京)有限公司 | Cheating video user identification method and device |
Non-Patent Citations (1)
Title |
---|
王庆福等: ""搜索引擎反作弊方法研究"", 《电脑知识与技术》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111950360A (en) * | 2020-07-06 | 2020-11-17 | 北京奇艺世纪科技有限公司 | Method and device for identifying infringing user |
CN111950360B (en) * | 2020-07-06 | 2023-08-18 | 北京奇艺世纪科技有限公司 | Method and device for identifying infringement user |
Also Published As
Publication number | Publication date |
---|---|
CN109840445B (en) | 2021-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10867212B2 (en) | Learning highlights using event detection | |
Merler et al. | Automatic curation of sports highlights using multimodal excitement features | |
CN108197330B (en) | Data digging method and device based on social platform | |
Xie et al. | Structure analysis of soccer video with hidden Markov models | |
KR101816113B1 (en) | Estimating and displaying social interest in time-based media | |
CN109299271B (en) | Training sample generation method, text data method, public opinion event classification method and related equipment | |
WO2017096877A1 (en) | Recommendation method and device | |
CN108520046B (en) | Method and device for searching chat records | |
US9245035B2 (en) | Information processing system, information processing method, program, and non-transitory information storage medium | |
CN110019954A (en) | A kind of recognition methods and system of the user that practises fraud | |
CN113779381B (en) | Resource recommendation method, device, electronic equipment and storage medium | |
CN110035302B (en) | Information recommendation method and device, model training method and device, computing equipment and storage medium | |
US10104428B2 (en) | Video playing detection method and apparatus | |
CN111597446B (en) | Content pushing method and device based on artificial intelligence, server and storage medium | |
WO2018113673A1 (en) | Method and apparatus for pushing search result of variety show query | |
CN112883734B (en) | Block chain security event public opinion monitoring method and system | |
CN108733791A (en) | network event detection method | |
Merler et al. | Automatic curation of golf highlights using multimodal excitement features | |
CN115661302A (en) | Video editing method, device, equipment and storage medium | |
JP2008310626A (en) | Automatic tag impartment device, automatic tag impartment method, automatic tag impartment program and recording medium recording the program | |
CN109977735A (en) | Move the extracting method and device of wonderful | |
CN112995690B (en) | Live content category identification method, device, electronic equipment and readable storage medium | |
CN109840445A (en) | A kind of recognition methods and system of video of practising fraud | |
KR20170048736A (en) | Evnet information extraciton method for extracing the event information for text relay data, and user apparatus for perfromign the method | |
CN110232160B (en) | Method and device for detecting interest point transition event and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20200514 Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province Applicant after: Alibaba (China) Co.,Ltd. Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C Applicant before: Youku network technology (Beijing) Co., Ltd |
GR01 | Patent grant | ||