CN110163375A - Subject detection method and device - Google Patents

Publication number
CN110163375A
Authority
CN
China
Prior art keywords
target
medium
subject
media
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910282986.9A
Other languages
Chinese (zh)
Other versions
CN110163375B (en
Inventor
王萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910282986.9A priority Critical patent/CN110163375B/en
Publication of CN110163375A publication Critical patent/CN110163375A/en
Application granted granted Critical
Publication of CN110163375B publication Critical patent/CN110163375B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

This application discloses a subject detection method and device. The method comprises: obtaining the target media used by a detection subject, a target medium being a medium used by a target subject; obtaining the pre-trained medium score of each target medium; and judging, according to the medium score of each target medium, whether the detection subject is a target subject. The medium score of each target medium is generated in advance by the following training method: obtaining labeled training data, the labels identifying the target subjects and non-target subjects that use the target media; for each target medium, calculating the number of target subjects and the number of non-target subjects that use the target medium in the training data; obtaining the media type of the target medium; and calculating the medium score of the target medium according to the target subject number, the non-target subject number and the media type. The embodiments of the present application improve the accuracy of subject detection.

Description

Subject detection method and device
Technical Field
The present patent application is a divisional application of the Chinese application entitled "Subject detection method and apparatus", filed on 07/06/2016 with application number 2016105281392. The present application belongs to the technical field of information processing and, in particular, relates to a method and an apparatus for detecting a subject.
Background
A subject refers to a natural person, a group of natural persons, or an account corresponding to a natural person in a network.
In many business scenarios, a special subject needs to be discovered in a group of subjects; that is, subjects need to be detected in order to find the target subjects that satisfy a certain condition. One example is finding which users in a group are more likely to purchase the products or services of a company. Another is looking for people at risk of carrying out a terrorist attack among hundreds of millions of people. Yet another is finding risky payment accounts among a large number of payment accounts.
Subject detection is performed according to the media used by the subject. The media used by a subject may be of different media types, such as attributes of the subject (age, occupation, income, location, etc.) and behaviors of the subject (for example, browsing behavior of jumping from a search engine to a commodity page, or modifying a password in an insecure environment).
In the prior art, subject detection usually determines whether the subject uses a medium satisfying a target condition and, if so, determines the subject to be the target subject. However, prior-art subject detection is inaccurate, because one subject usually uses multiple media and media of different media types influence the subject to different degrees.
Disclosure of Invention
In view of this, the present application provides a method and an apparatus for detecting a subject, so as to improve accuracy of subject detection.
In order to solve the above technical problems, the present application discloses a subject detection method, comprising:
acquiring the target media used by a detection subject; a target medium is a medium used by a target subject;
acquiring the pre-trained medium score of each target medium;
judging whether the detection subject is a target subject according to the medium score of each target medium;
wherein the medium score of each target medium is generated in advance according to the following training mode:
acquiring training data carrying labels; the labels are used for identifying the target subjects and non-target subjects using a target medium;
for each target medium, calculating the number of target subjects and the number of non-target subjects that use the target medium in the training data;
acquiring the media type of the target medium;
and calculating and obtaining the medium score of the target medium according to the target subject number, the non-target subject number and the media type.
Preferably, the determining whether the detection subject is a target subject according to the medium scores of the respective target media includes:
summarizing the medium scores of all target media to obtain the subject score of the detection subject;
and judging whether the detection subject is a target subject or not according to the subject score.
Preferably, the obtaining of the medium score obtained by pre-training each target medium comprises:
establishing a medium level tree structure according to the sub-medium used by each target medium and the next level sub-medium used by each sub-medium; the target medium is used as a branch node or a leaf node;
for any branch node, acquiring the medium scores corresponding to the child nodes of the branch node, and summarizing the medium scores of the child nodes to obtain the scores serving as the medium scores of the branch nodes;
for any leaf node, acquiring a medium score obtained by pre-training, wherein the medium score of the leaf node is generated in advance according to a training mode of the medium score of a target medium by taking a parent node of the leaf node as a target subject and taking the leaf node as the target medium used by the target subject.
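The media-level tree described above can be sketched as follows. This is only an illustrative sketch: the class name `MediumNode`, the function `node_score`, the example node names, and the choice of a plain sum as the way of "summarizing" child scores are all assumptions, since the text does not fix the summarizing operation.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class MediumNode:
    """A node of the media-level tree (hypothetical structure)."""
    name: str
    pretrained_score: Optional[float] = None  # only meaningful for leaf nodes
    children: List["MediumNode"] = field(default_factory=list)


def node_score(node: MediumNode,
               combine: Callable[[Iterable[float]], float] = sum) -> float:
    """Leaf: return its pre-trained score. Branch: summarize the child scores."""
    if not node.children:
        if node.pretrained_score is None:
            raise ValueError(f"leaf node {node.name!r} has no pre-trained score")
        return node.pretrained_score
    return combine(node_score(child, combine) for child in node.children)


# Hypothetical tree: a device uses an IP and an account; the account uses two cards.
tree = MediumNode("device", children=[
    MediumNode("ip", pretrained_score=0.25),
    MediumNode("account", children=[
        MediumNode("card-1", pretrained_score=0.5),
        MediumNode("card-2", pretrained_score=0.125),
    ]),
])
root_score = node_score(tree)
```

A different summarizing rule (for example a product or a maximum) can be passed as `combine` without changing the traversal.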
Preferably, the medium score of the target medium is a probability that a subject using the target medium is a non-target subject;
the calculating and obtaining the medium score of the target medium according to the target subject number, the non-target subject number and the media type comprises:
according to the target subject number, the non-target subject number and the media type, calculating and obtaining the medium score of the target medium according to a first calculation formula;
wherein $A$ represents a target subject, $\bar{A}$ represents a non-target subject, and $x_i$ represents the i-th target medium used by a non-target subject; $m$ represents the number of non-target subjects using the target medium; $n$ represents the number of target subjects using the target medium; $F(m, n)$ represents the number of media that are used by the $m$ non-target subjects and the $n$ target subjects and belong to the media type.
Preferably, when the number of non-target subjects is less than a first threshold and the number of target subjects is less than a second threshold, $F(m, n)$ is statistically obtained from the training data;
when the number of non-target subjects is less than the first threshold and the number of target subjects is greater than the second threshold; or when the number of non-target subjects is greater than the first threshold and the number of target subjects is less than the second threshold:
wherein $\alpha_n$ and $\beta_n$ are respectively the slope and intercept obtained by fitting $F(m, n)$;
the first calculation formula is specifically as follows:
when the number of non-target subjects is greater than the first threshold and the number of target subjects is greater than the second threshold:
$F(m, n) \approx 1$;
the first calculation formula is specifically as follows:
Preferably, the summarizing the medium scores of the target media and obtaining the subject score of the detection subject includes:
summarizing the medium scores of all target media, and calculating to obtain the subject score of the detection subject according to the following second calculation formula;
wherein $k$ represents the total number of target media used by the detection subject; $X$ represents the number of non-target subjects in the training data, $Y$ represents the number of target subjects in the training data, and $P(\bar{A} \mid x_i)$ represents the medium score of target medium $x_i$.
A subject detection apparatus, comprising:
the pre-calculation module is used for acquiring training data carrying labels; the label is used for identifying a target subject and a non-target subject using a target medium; calculating the target subject number and the non-target subject number of the target medium used in the training data aiming at each target medium; acquiring the media type of the target media; calculating and obtaining a medium score of the target medium according to the target subject number, the non-target subject number and the medium type;
the medium acquisition module is used for acquiring the target media used by the detection subject; a target medium is a medium used by a target subject;
the score acquisition module is used for acquiring the pre-trained medium score of each target medium obtained by the pre-calculation module;
and the detection module is used for judging whether the detection subject is a target subject according to the medium score of each target medium.
Preferably, the detection module comprises:
the subject calculation unit is used for summarizing the medium scores of all target media to obtain the subject score of the detection subject;
and the detection unit is used for judging whether the detection subject is a target subject according to the subject score.
Preferably, the score obtaining module includes:
the structure establishing unit is used for establishing a medium level tree structure according to the sub-medium used by each target medium and the next level sub-medium used by each level of sub-medium; the target medium is used as a branch node or a leaf node;
the score acquisition unit is used for, for any branch node, acquiring the medium scores corresponding to the child nodes of the branch node, and taking the score obtained by summarizing the medium scores of the child nodes as the medium score of the branch node;
for any leaf node, acquiring a medium score obtained by pre-training, wherein the medium score of the leaf node is generated in advance according to a training mode of the medium score of a target medium by taking a parent node of the leaf node as a target subject and taking the leaf node as the target medium used by the target subject.
Preferably, the medium score of the target medium is a probability that a subject using the target medium is a non-target subject;
the pre-calculation module calculating and obtaining the medium score of the target medium according to the target subject number, the non-target subject number and the media type comprises:
according to the target subject number, the non-target subject number and the media type, calculating and obtaining the medium score of the target medium according to a first calculation formula;
wherein $A$ represents a target subject, $\bar{A}$ represents a non-target subject, and $x_i$ represents the i-th target medium used by a non-target subject; $m$ represents the number of non-target subjects using the target medium; $n$ represents the number of target subjects using the target medium; $F(m, n)$ represents the number of media that are used by the $m$ non-target subjects and the $n$ target subjects and belong to the media type.
Preferably, when the number of non-target subjects is less than a first threshold and the number of target subjects is less than a second threshold, $F(m, n)$ is statistically obtained from the training data;
when the number of non-target subjects is less than the first threshold and the number of target subjects is greater than the second threshold; or when the number of non-target subjects is greater than the first threshold and the number of target subjects is less than the second threshold:
wherein $\alpha_n$ and $\beta_n$ are respectively the slope and intercept obtained by fitting $F(m, n)$;
the first calculation formula is specifically as follows:
when the number of non-target subjects is greater than the first threshold and the number of target subjects is greater than the second threshold:
$F(m, n) \approx 1$;
the first calculation formula is specifically as follows:
Preferably, the subject calculation unit is specifically configured to:
summarize the medium scores of all target media, and calculate the subject score of the detection subject according to the following second calculation formula;
wherein $k$ represents the total number of target media used by the detection subject; $X$ represents the number of non-target subjects in the training data, $Y$ represents the number of target subjects in the training data, and $P(\bar{A} \mid x_i)$ represents the medium score of target medium $x_i$.
Compared with the prior art, the application can obtain the following technical effects:
The target media used by target subjects are trained in advance and scored to obtain their medium scores. Each medium score is calculated from the number of target subjects and the number of non-target subjects that use the target medium, in combination with the media type, so that the different degrees to which different media types influence whether a subject is a target subject are distinguished and the medium score is more accurate and reasonable. When a detection subject is detected, it is judged according to the medium scores of all of its target media. Because the medium scores accurately represent the probability that a subject using a medium is a target subject, the accuracy of subject detection is improved.
Of course, it is not necessary for any one product to practice the present application to achieve all of the above-described technical effects simultaneously.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application without limiting it. In the drawings:
FIG. 1 is a flow chart of one embodiment of a method for detecting a subject according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a media-level tree structure according to an embodiment of the present application;
FIG. 3 is a flow chart of yet another embodiment of a method for subject detection according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a subject detection apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a further embodiment of a subject detection apparatus according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below with reference to the drawings and examples, so that how the application applies technical means to solve technical problems and achieve technical effects can be fully understood and implemented.
In the embodiments of the present application, a subject refers to a natural person, a group of natural persons, or an account corresponding to a natural person in a network. A target subject is a subject that satisfies a condition, such as an account showing abnormal behavior, a person at risk, or a user with purchase potential.
A target medium is a medium used by a target subject; a medium not used by any target subject is a non-target medium. A subject that is not a target subject is a non-target subject.
In the prior art, whether a subject is a target subject is mainly determined according to whether the subject uses a medium satisfying a preset condition. The media satisfying the preset condition may be obtained by scoring the media, and the score is generally calculated from the historical hit rate. For example, if the media type is credit card number, a specific credit card number is a medium; if the credit card number is used by 3 accounts, 2 of which are target subjects, the historical hit rate is 2/3, which is the medium score of the credit card number. However, this scoring method is not accurate. For example, if the number of target subjects using a certain medium is 1 and the number of non-target subjects is 0, the medium score is 1; if the number of target subjects using the medium is 100 and the number of non-target subjects is 0, the medium score is still 1. Judged against the empirical distribution, such a medium score is inaccurate, which makes subject detection inaccurate. In addition, the prior art considers only the influence of a single medium on the subject and ignores that media of different types influence to different degrees whether the subject is a target subject, so subject detection is inaccurate.
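The weakness of the historical hit rate can be made concrete with a few lines of code. The sketch below only reproduces the arithmetic of the examples above; the function name `historical_hit_rate` is chosen here for illustration.

```python
def historical_hit_rate(n_target: int, m_non_target: int) -> float:
    """Share of the subjects using a medium that are target subjects: n / (m + n)."""
    total = n_target + m_non_target
    if total == 0:
        raise ValueError("the medium is not used by any subject")
    return n_target / total


# A credit card number used by 3 accounts, 2 of which are target subjects.
card_score = historical_hit_rate(n_target=2, m_non_target=1)

# One observation and one hundred observations give the same score,
# although the second case is backed by far more evidence.
weak_evidence = historical_hit_rate(n_target=1, m_non_target=0)
strong_evidence = historical_hit_rate(n_target=100, m_non_target=0)
```

Both `weak_evidence` and `strong_evidence` evaluate to 1, which is why the hit rate alone cannot distinguish the two situations.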
Based on the problems in the prior art, the inventor found that different media types influence the subject to different degrees. The medium scores of different media types may therefore differ even when the number of target subjects and the number of non-target subjects are the same. In the embodiments of the present application, the medium score is therefore calculated from the number of target subjects and the number of non-target subjects that use the target medium, in combination with the media type, so that the different degrees to which different media types influence whether a subject is a target subject are distinguished and the medium score is more accurate and reasonable. The medium score indicates the probability that a subject using the target medium is a target subject. When a detection subject is detected, it is judged according to the medium scores of all of its target media. Because the medium scores accurately represent the probability that a subject using a medium is a target subject, and the medium scores of all target media are considered together, the accuracy of subject detection is improved.
The technical solution of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of an embodiment of a method for detecting a subject according to an embodiment of the present application, where the method may include the following steps:
101: Acquire the target media used by the detection subject.
A target medium is a medium used by a target subject and can be determined from historical data.
102: Acquire the pre-trained medium score of each target medium.
The medium score may represent the probability that a subject using the target medium is a target subject.
Of course, the medium score may also represent the probability that a subject using the target medium is a non-target subject.
The medium score can therefore be used to determine whether a subject using the target medium is a target subject or a non-target subject.
103: Judge whether the detection subject is a target subject according to the medium score of each target medium.
In this embodiment, the detection subject is judged according to the medium scores of all of its target media rather than a single medium, so the influence of every target medium is considered together and the detection result is more accurate.
As another embodiment, the determining whether the detection subject is the target subject according to the media scores of the respective target media may be:
summarizing the medium scores of all target media to obtain the subject score of the detection subject;
and judging whether the detection subject is a target subject according to the subject score.
The medium scores of the target media may be summarized in various ways. For example, they may be summarized in combination with the media types of the target media: different media types are given different weights according to how strongly each type influences whether a subject is a target subject, and the medium scores of the target media and their corresponding weights are then combined by addition, multiplication or the like to obtain the subject score of the detection subject.
The subject score can then be used to detect whether the subject is a target subject.
The subject score may represent the probability that a subject using all of the target media is a target subject, so that the higher the score, the more likely the detection subject is a target subject.
Of course, the subject score may instead represent the probability that a subject using all of the target media is a non-target subject, in which case the lower the score, the more likely the detection subject is a target subject.
To make the judgment, a score threshold can be set according to the actual situation, and the subject score is compared with the score threshold to determine whether the detection subject is a target subject.
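The weighted summarizing and threshold comparison described above might look like the following sketch. The function names, the example media, the weights and the threshold are hypothetical; the sketch also assumes that a higher medium score means a higher probability of being a target subject.

```python
from math import prod
from typing import Dict


def subject_score(media_scores: Dict[str, float],
                  media_types: Dict[str, str],
                  type_weights: Dict[str, float],
                  mode: str = "add") -> float:
    """Weight each medium score by its media type, then add or multiply."""
    weighted = [type_weights.get(media_types[medium], 1.0) * score
                for medium, score in media_scores.items()]
    if mode == "add":
        return sum(weighted)
    if mode == "multiply":
        return prod(weighted)
    raise ValueError(f"unknown mode: {mode!r}")


def is_target_subject(score: float, threshold: float) -> bool:
    """Assumes a higher subject score means 'more likely a target subject'."""
    return score >= threshold


# Hypothetical detection subject using one phone number and one IP address.
scores = {"phone:a": 0.5, "ip:b": 0.25}
types = {"phone:a": "phone", "ip:b": "ip"}
weights = {"phone": 2.0, "ip": 1.0}  # assumed: phone numbers matter more

added = subject_score(scores, types, weights, mode="add")
multiplied = subject_score(scores, types, weights, mode="multiply")
flagged = is_target_subject(added, threshold=1.0)
```

If the scores instead express the probability of being a non-target subject, the comparison in `is_target_subject` would be reversed.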
The medium score of each target medium may be generated in advance according to the following training mode:
104: Obtain training data carrying labels.
The labels identify the target subjects and non-target subjects that use a target medium.
The training data may be a large amount of historical data, including the media used by target subjects, the media used by non-target subjects, and so on.
105: For each target medium, calculate the number of target subjects and the number of non-target subjects that use the target medium in the training data.
106: Acquire the media type of the target medium.
A medium is a specific value of a media type.
For example, if the media type of the target medium is mobile phone number, the target medium is a specific mobile phone number.
If the media type of the target medium is age, the target medium is a specific age value such as "15 years".
107: Calculate the medium score of the target medium according to the target subject number, the non-target subject number and the media type.
In practice, the medium scores of different media types may differ even when the target subject number and the non-target subject number are the same. That is, target media of different media types influence to different degrees whether the detection subject is a target subject, so the medium score of a target medium is calculated not only from the target subject number and non-target subject number of the target medium but also in combination with its media type.
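Steps 104 and 105 amount to a counting pass over the labeled training data. A minimal sketch follows, assuming each training record is a tuple of subject identifier, target label and the set of media used; this record layout and all names are illustrative and not prescribed by the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (subject id, is_target label, media used by the subject): an assumed layout
Record = Tuple[str, bool, Set[str]]


def count_subjects_per_medium(records: Iterable[Record]) -> Dict[str, Tuple[int, int]]:
    """Return {medium: (m, n)} with m non-target and n target subjects using it."""
    counts = defaultdict(lambda: [0, 0])  # index 0: non-target, index 1: target
    for _subject_id, is_target, media in records:
        for medium in media:
            counts[medium][1 if is_target else 0] += 1
    return {medium: (m, n) for medium, (m, n) in counts.items()}


records = [
    ("u1", True, {"phone:a", "ip:1"}),
    ("u2", True, {"phone:a"}),
    ("u3", False, {"phone:a", "ip:2"}),
]
counts = count_subjects_per_medium(records)

# Target media are the media used by at least one target subject.
target_media = {medium for medium, (m, n) in counts.items() if n > 0}
```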
Calculating the medium score of the target medium according to the media type may be, for example:
calculating the historical hit rate from the target subject number and the non-target subject number of the target medium. Different media types can be given different weight coefficients, and the product of the historical hit rate and the weight coefficient can be used as the medium score of the target medium; a media type that has a larger influence on whether the detection subject is a target subject is given a higher weight coefficient, so that the medium score of the target medium is calculated more accurately.
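As a sketch of this weighted variant, the following code multiplies the historical hit rate by a weight coefficient per media type. The weight values and type names are invented for illustration only.

```python
from typing import Dict


def weighted_hit_rate(n_target: int, m_non_target: int,
                      media_type: str, type_weights: Dict[str, float]) -> float:
    """Medium score = weight coefficient of the media type * historical hit rate."""
    hit_rate = n_target / (n_target + m_non_target)
    return type_weights[media_type] * hit_rate


# Assumed weights: a credit card number is taken to be more telling than an IP.
type_weights = {"credit_card": 1.0, "ip": 0.5}

# Same counts (2 target subjects, 1 non-target subject), different media types.
card_score = weighted_hit_rate(2, 1, "credit_card", type_weights)
ip_score = weighted_hit_rate(2, 1, "ip", type_weights)
```

With identical counts, the two media types now receive different medium scores, which is the effect the weighting is meant to achieve.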
Of course, other implementation manners may also be adopted for calculating the media score of the target media according to the target subject number, the non-target subject number and the media type, which will be described in detail in the following embodiments.
In this embodiment, the medium score of the target medium is calculated from the target subject number, the non-target subject number and the media type. Calculating from the target subject number and the non-target subject number, in combination with the media type, makes the medium score more accurate.
The medium score of the target medium may be calculated from the target subject number, the non-target subject number and the number of media of the media type. The number of media is, specifically, the number of media, among all media of the media type, that are used by that number of non-target subjects and that number of target subjects.
For example, if the media type is mobile phone number and the target medium is a specific mobile phone number, say "a", the number of target subjects using the mobile phone number "a" is n and the number of non-target subjects is m.
The number of media is then the number of mobile phone numbers used by m non-target subjects and n target subjects.
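One way to obtain this number of media from training data is a frequency-of-frequencies table. The sketch below assumes that the number of media for (m, n) counts the media of one type that are each used by exactly m non-target subjects and n target subjects; the function name and example data are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple


def media_count_table(counts_by_medium: Dict[str, Tuple[int, int]]) -> Counter:
    """Map (m, n) to the number of media of one type with exactly those counts.

    counts_by_medium holds, for every medium of a single media type,
    the pair (m non-target subjects, n target subjects) using it.
    """
    return Counter(counts_by_medium.values())


# Hypothetical counts for three mobile phone numbers.
phone_counts = {"a": (1, 2), "b": (1, 2), "c": (0, 1)}
table = media_count_table(phone_counts)

same_as_a = table[(1, 2)]   # phone numbers used by 1 non-target and 2 target subjects
unseen = table[(5, 5)]      # a Counter returns 0 for combinations never observed
```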
The medium score of the target medium may be calculated from the target subject number, the non-target subject number and the media type on the basis of an empirical probability.
Therefore, as still another embodiment, the medium score of the target medium may be calculated from the target subject number, the non-target subject number and the media type according to the following first calculation formula;
wherein $A$ represents a target subject, $\bar{A}$ represents a non-target subject, and $x_i$ represents the i-th target medium used by the target subject; $m$ represents the number of non-target subjects using the target medium; $n$ represents the number of target subjects using the target medium; $F(m, n)$ represents the number of media, among all media belonging to the media type of the target medium, that are used by the $m$ non-target subjects and the $n$ target subjects.
That is, the probability $P(\bar{A} \mid x_i)$ that a subject using the target medium $x_i$ is a non-target subject can be used as the medium score. Of course, $P(A \mid x_i)$, the probability that a subject using the target medium $x_i$ is the target subject, can also be calculated, wherein $P(A \mid x_i) = 1 - P(\bar{A} \mid x_i)$.
The first calculation formula is derived from the empirical distribution and holds when the empirical distribution is close to the actual distribution.
The derivation process is as follows:
First idea: the number of target subjects using the target medium $x_i$ is $n$ and the number of non-target subjects is $m$. If a new subject using the target medium $x_i$ appears, it is either a target subject or a non-target subject. If the new subject is a target subject, the number of target subjects using $x_i$ becomes $n + 1$; if the new subject is a non-target subject, the number of non-target subjects using $x_i$ becomes $m + 1$.
Second idea: $F(m, n)$ represents the number of media, among all media belonging to the media type of the target medium, that are used by the $m$ non-target subjects and the $n$ target subjects. For example, if there are 1000 IPs in total that are used by 5 non-target subjects and 3 target subjects, the number of media for the media type IP is $F(5, 3) = 1000$. A medium is used by $m$ non-target subjects and $n$ target subjects. Because the historical hit rate $n/(m + n)$ is an inaccurate medium score and is not generally applicable, the number of media of the same media type that are used by $m$ non-target subjects and $n$ target subjects can be looked up and combined into the calculation of the medium score, so that the medium score indicates more accurately whether a subject using the medium is a target subject or a non-target subject.
Combining the first idea and the second idea, the probability distribution for the case in which the new subject is a non-target subject is calculated, which yields the first calculation formula. In the extreme case, when $m$ and $n$ are both 0, it can be seen that the first calculation formula still holds and is consistent with the actual distribution.
The first calculation formula allows the medium score of the target medium to be calculated accurately and to express reasonably the probability that a subject using the target medium is the target subject, which makes subject detection possible.
Because the training data is limited, F(m, n) is often 0 when m and n are large. To improve the calculation accuracy, as a further embodiment:
when the number of non-target subjects is smaller than a first threshold and the number of target subjects is smaller than a second threshold, F(m, n) is obtained statistically from the training data;
when the number of non-target subjects is smaller than the first threshold and the number of target subjects is greater than the second threshold, or when the number of non-target subjects is greater than the first threshold and the number of target subjects is smaller than the second threshold, F(m, n) can be calculated by fitting. The fitting formula can take various forms; one possible implementation is:
where α_n and β_n are constants, namely the slope and the intercept obtained by fitting F(m, n); that is, F(m, n) is calculated using the fitted function.
The first calculation formula may then be:
when the number of non-target subjects is greater than the first threshold and the number of target subjects is greater than the second threshold, F(m, n) is often 0; to avoid a denominator of 0, F(m, n) may be set equal to 1, in which case the first calculation formula is specifically:
That is, when m and n are both large, the medium score may be represented by the historical hit rate.
The first threshold and the second threshold may be determined according to an actual situation, a subject type, and a data amount of the training data.
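The three regimes for F(m, n) can be sketched as a small selection function. This is only an illustration under assumptions: the fitted function is passed in as a callable because the fitting formula itself is not reproduced in this text, the name `select_f` is hypothetical, and the handling of counts exactly equal to a threshold (routed to the fitted branch here) is not specified by the source.

```python
def select_f(m, n, f_table, first_threshold, second_threshold, fitted_f):
    """Pick F(m, n) according to which regime (m, n) falls into.

    f_table: dict mapping (m, n) to the count observed in training data.
    fitted_f: callable (m, n) -> value, standing in for the fitted formula
              with slope and intercept; its form is an assumption.
    """
    if m < first_threshold and n < second_threshold:
        # Both counts small: use the statistic taken from the training data.
        return f_table.get((m, n), 0)
    if m > first_threshold and n > second_threshold:
        # Both counts large: F is usually 0 in the data, so use 1 to keep
        # the denominator of the first calculation formula non-zero.
        return 1
    # Mixed case (and, by assumption, counts equal to a threshold): use the fit.
    return fitted_f(m, n)
```

For example, with both thresholds set to 10, `select_f(1, 1, {(1, 1): 40}, 10, 10, fit)` returns the observed count 40, while `select_f(50, 60, {}, 10, 10, fit)` returns 1.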
As another embodiment, the medium score of the target medium may instead be expressed as the probability that the subject using the target medium is the target subject, that is:
where the complement of A denotes a non-target subject, and P(A/x_i), i.e. the medium score, indicates the probability that the subject using the target medium x_i is the target subject.
As another embodiment, summarizing the medium scores of the target media to obtain the subject score of the detection subject may be:
summarizing the medium scores of all target media, and calculating the subject score of the detection subject according to the following second calculation formula;
where k represents the total number of target media used by the detection subject, X represents the number of non-target subjects in the training data, and Y represents the number of target subjects in the training data.
The subject score indicates the probability that a subject using the media x_1, x_2, ..., x_k is a non-target subject.
The prior probability in the formula needs to be corrected.
Since there is less chance that one non-target subject will use different target media at the same time, it can be assumed that the events of using different target media by non-target subjects are independent of each other. The second calculation formula can thus be obtained according to the following derivation:
where the medium score of the target medium x_i can be obtained using the first calculation formula described above.
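To illustrate how per-medium probabilities and a prior of X/(X + Y) can be combined under an independence assumption, the sketch below uses a standard naive Bayes combination in log-odds form. It is not the second calculation formula of this application, which is not reproduced in this text; it additionally assumes that media use is independent for target subjects as well, which is stronger than the assumption stated above. The function name and the clipping constant `eps` are hypothetical.

```python
import math


def combine_non_target_scores(scores, num_non_target, num_target, eps=1e-9):
    """Combine per-medium non-target probabilities into one posterior.

    scores: per-medium probabilities that the subject is a non-target subject.
    num_non_target, num_target: X and Y, the subject counts in the training data.
    Naive Bayes in log-odds form: posterior logit = prior logit
    + sum over media of (per-medium logit - prior logit).
    """
    if num_non_target <= 0 or num_target <= 0:
        raise ValueError("both subject counts must be positive")
    prior = num_non_target / (num_non_target + num_target)
    prior_logit = math.log(prior / (1 - prior))
    logit = prior_logit
    for p in scores:
        p = min(max(p, eps), 1 - eps)  # keep the logarithm finite
        logit += math.log(p / (1 - p)) - prior_logit
    return 1 / (1 + math.exp(-logit))
```

With a single medium the result equals that medium's own score, and with no media it falls back to the prior, which is a quick sanity check on any aggregation rule of this kind.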
Of course, as yet another embodiment, when the medium score is expressed as P(A/x_i), the subject score may be:
where
P(A/x_1, x_2, ..., x_k) indicates the probability that the subject using the media x_1, x_2, ..., x_k is a non-target subject.
Since one target subject is likely to use different target media at the same time, whereas one non-target subject is unlikely to do so, the events of non-target subjects using different target media can be assumed to be mutually independent. The non-target probability can therefore be calculated first, and P(A/x_1, x_2, ..., x_k) can then be obtained from it, which ensures the accuracy of the calculated subject score.
In this embodiment, the medium scores of all target media are summarized to score the subject, so that the influence of different media types on whether the subject is a target subject is considered comprehensively, which improves the accuracy of subject detection.
Because one subject uses multiple target media, each target medium may include multiple sub-media, and each sub-medium may in turn include next-level sub-media, each target medium acts as the subject of its sub-media, so a medium hierarchy exists. Whether a subject is a target subject depends on the medium scores of its target media, and the medium score of a target medium depends on the medium scores of the sub-media it uses when the target medium is itself treated as a subject.
For example, the likelihood that a subject is bad depends on the likelihood that the mobile phone number it uses is a bad number, the likelihood that the mailbox it uses is a bad mailbox, and so on. Whether a mobile phone number is good or bad in turn depends on the media related to that number, such as the LBS information and the IP information of the mobile phone number.
Therefore, the media of a subject can be divided into levels to form a tree structure. As shown in fig. 2, the subject is the root node, and a target medium used by the subject may be a leaf node or a branch node; a branch node has child nodes, that is, the sub-media of the target medium or the next-level media of those sub-media. A branch node is the parent node of its child nodes. In fig. 2, leaf 1, leaf 2, leaf 3, leaf 4, leaf 5, and leaf 6 are leaf nodes, node 1, node 2, and node 3 are branch nodes, and the subject is the root node.
For example, as in fig. 2, leaf 1, node 1, and node 2 are the target media used by the principal.
The child media of the node 2 are the leaf 4 and the node 3, and the leaf 4 and the node 3 are the target media used by the node 2 as the main body.
The sub-media of the node 3 are the leaf 5 and the leaf 6, and the leaf 5 and the leaf 6 are the target media used by the node 3 as the main body.
When the target medium is a leaf node, the medium score of the target medium can be generated in advance according to the operations of the step 104 to the step 107;
when the target medium is a branch node, the medium score of the target medium is the score obtained by summarizing the medium scores of its child nodes, where the summarizing may be performed according to the method described in the above embodiment.
Therefore, as shown in fig. 3, another embodiment of the subject detection method provided in this application may include the following steps:
301: acquiring a target medium used by the detection subject.
The target medium is a medium used by a target subject and can be determined according to historical data.
302: a medium-level tree structure is created according to the sub medium used by each target medium and the next-level sub medium used by each sub medium.
Wherein the target medium is used as a branch node or a leaf node.
The tree structure may be as shown in fig. 2.
303: for any branch node, acquiring the medium scores corresponding to the sub-nodes of the branch node, and summarizing the medium scores of the sub-nodes to obtain the scores serving as the medium scores of the branch nodes;
304: for any leaf node, obtaining a medium score obtained by pre-training, wherein the medium score of the leaf node is generated in advance according to a training mode of the medium score of a target medium by taking a parent node of the leaf node as a target subject and taking the leaf node as the target medium used by the target subject.
The score of each target medium can be obtained by calculating the scores of the branch nodes and the leaf nodes.
That is, when the target medium includes sub-media, the medium score of each sub-medium is obtained; summarizing the medium scores of all the sub-mediums to obtain a score as the medium score of the target medium;
and when the target medium does not comprise the sub-medium, acquiring a medium score obtained by pre-training the target medium.
When the sub-medium does not comprise the next-level sub-medium, the medium score of the sub-medium is that the target medium is used as a target subject, and the sub-medium is used as the target medium used by the target subject and is generated in advance according to a training mode of the medium score of the target medium;
when the sub-media comprise the next-level sub-media, the scores of the sub-media are the media scores of the obtained next-level sub-media, and the scores obtained by summarizing the media scores of the various next-level sub-media are used as the media scores of the sub-media; and the media fraction of the next level of sub-media can be analogized in turn.
The training method of the medium score of the target medium may be the method described in step 104 to step 107 in fig. 1, and is not described herein again.
305: summarizing the medium scores of the target media to obtain the subject score of the detection subject.
Referring to fig. 2, the medium scores of the 6 leaf nodes are calculated first and then summarized layer by layer from bottom to top.
Leaf 5 and leaf 6 are summarized to obtain the medium score of node 3; leaf 2 and leaf 3 are summarized to obtain the medium score of node 1; node 3 and leaf 4 are summarized to obtain the medium score of node 2; and finally leaf 1, node 1, and node 2 are summarized to obtain the subject score of the subject.
The summarizing may be calculated according to the second calculation formula.
The medium score of leaf 1 is generated in advance in the training mode of step 104 to step 107, with the root node as the target subject and leaf 1, node 1, and node 2 as the target media.
The medium scores of the leaves 5 and 6 are generated in advance in the training method from step 104 to step 107, with the node 3 as the target subject and the leaves 5 and 6 as the target medium.
The medium scores of the leaves 2 and 3 are generated in advance in the training method from step 104 to step 107, with the node 1 as the target subject and the leaves 2 and 3 as the target medium.
The medium score of the leaf 4 is generated in advance in the training method from step 104 to step 107, with the node 2 as the target subject and the leaf 4 as the target medium.
306: judging whether the detection subject is a target subject according to the subject score.
In this embodiment, by performing media hierarchy division on the subject, the accuracy of subject detection can be further improved.
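The bottom-up summarizing over the tree of fig. 2 can be sketched as a short recursion. This is a minimal sketch under assumptions: the `MediumNode` class, the leaf score values, and the use of an arithmetic mean as the summarizing function are all hypothetical stand-ins; in the method described above the summarizing would follow the second calculation formula and the leaf scores would come from pre-training.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class MediumNode:
    name: str
    pretrained_score: Optional[float] = None  # set for leaf nodes only
    children: List["MediumNode"] = field(default_factory=list)


def node_score(node: MediumNode, aggregate: Callable[[List[float]], float]) -> float:
    """Leaf: return the pre-trained score. Branch or root: summarize the children."""
    if not node.children:
        if node.pretrained_score is None:
            raise ValueError(f"leaf {node.name} has no pre-trained score")
        return node.pretrained_score
    return aggregate([node_score(child, aggregate) for child in node.children])


def mean(values: List[float]) -> float:
    """Placeholder summarizing function, used only for illustration."""
    return sum(values) / len(values)


# Tree of fig. 2 with made-up leaf scores.
node3 = MediumNode("node 3", children=[MediumNode("leaf 5", 0.3), MediumNode("leaf 6", 0.5)])
node2 = MediumNode("node 2", children=[MediumNode("leaf 4", 0.1), node3])
node1 = MediumNode("node 1", children=[MediumNode("leaf 2", 0.4), MediumNode("leaf 3", 0.6)])
subject = MediumNode("subject", children=[MediumNode("leaf 1", 0.2), node1, node2])

subject_score = node_score(subject, mean)
```

Passing the summarizing function as a parameter keeps the tree traversal independent of the particular formula used at each level.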
Fig. 4 is a schematic structural diagram of an embodiment of a subject detection apparatus according to an embodiment of the present application, where the apparatus may include:
a pre-calculation module 401, configured to obtain training data carrying a label; the label is used for identifying a target subject and a non-target subject using a target medium; calculating, for each target medium, a target subject number and a non-target subject number of the training data using the target medium; acquiring the media type of the target media; and calculating and obtaining the medium fraction of the target medium according to the target subject number, the non-target subject number and the medium type.
In this embodiment, the medium score of the target medium is calculated from the target subject number, the non-target subject number, and the media type, rather than from the target subject number and the non-target subject number alone, so the medium score is more accurate.
The medium score of the target medium may be calculated from the target subject number, the non-target subject number, and the number of media belonging to the media type. The number of media may be the number of media that are used by the non-target subjects and the target subjects and belong to the media type.
A medium acquiring module 402, configured to acquire a target medium used by the detection subject.
Wherein the target media is media used by a target subject;
a score obtaining module 403, configured to obtain a medium score obtained by pre-training each target medium obtained by the pre-calculation module.
The medium score may represent the probability that the subject using the target medium is a target subject.
Of course, the medium score may also represent the probability that the subject using the target medium is a non-target subject.
The medium score can thus be used to determine whether the subject using the target medium is a target subject or a non-target subject.
The detection module 404 is configured to determine whether the detection subject is a target subject according to the media scores of the target media.
In this embodiment, the detection subject is judged according to the medium fraction of each target medium, instead of the single medium, and the influence of each target medium on the target subject is comprehensively considered, so that the detection result is more accurate.
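How modules 402 to 404 fit together can be sketched as one function. This is an illustration only: it assumes that the pre-calculation module 401 has already produced a lookup table of medium scores, that each score is the probability of being a target subject, and that media absent from the table are not target media. The function name, the table layout, and the choice of `max` or `min` as the summarizing function in the usage lines are hypothetical.

```python
def detect_subject(used_media, score_table, aggregate, score_threshold):
    """Return True if the detection subject is judged to be a target subject.

    used_media: identifiers of the media used by the detection subject.
    score_table: {medium_id: medium score} produced in advance (module 401).
    aggregate: function combining a list of medium scores into a subject score.
    """
    # Module 402: keep only the media known to be target media.
    target_media = [m for m in used_media if m in score_table]
    # Module 403: fetch the pre-trained medium score of each target medium.
    scores = [score_table[m] for m in target_media]
    if not scores:
        return False  # no target medium used, nothing to judge on
    # Module 404: summarize the scores and compare with the threshold.
    return aggregate(scores) >= score_threshold


table = {"ip:1.2.3.4": 0.9, "phone:555": 0.7}
flagged = detect_subject(["ip:1.2.3.4", "phone:555", "mail:x"], table, max, 0.8)
```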
As another embodiment, the pre-calculation module may calculate the medium score of the target medium from the target subject number, the non-target subject number, and the media type according to a first calculation formula;
where A represents a target subject, the complement of A represents a non-target subject, and x_i represents the i-th target medium used by a non-target subject; m represents the number of non-target subjects using the target medium; n represents the number of target subjects using the target medium; F(m, n) represents the number of media that are used by the m non-target subjects and the n target subjects and belong to the media type of the target medium.
The medium score given by the first calculation formula indicates the probability that the subject using the target medium x_i is a non-target subject.
The first calculation formula is derived from the empirical distribution, on the basis that the empirical distribution is close to the actual distribution.
Because the training data is limited, F(m, n) is often 0 when m and n are large. To improve the calculation accuracy, as a further embodiment:
when the number of non-target subjects is smaller than a first threshold and the number of target subjects is smaller than a second threshold, F(m, n) is obtained statistically from the training data;
when the number of non-target subjects is smaller than the first threshold and the number of target subjects is greater than the second threshold, or when the number of non-target subjects is greater than the first threshold and the number of target subjects is smaller than the second threshold:
where α_n and β_n are constants, namely the slope and the intercept obtained by fitting F(m, n); that is, F(m, n) is calculated using the fitted function.
The first calculation formula is then specifically:
when the number of non-target subjects is greater than the first threshold and the number of target subjects is greater than the second threshold:
F(m, n) ≈ 1;
to avoid a denominator of 0, F(m, n) may be set equal to 1, and the first calculation formula is specifically:
That is, when m and n are both large, the historical hit rate can be used to represent the medium score.
The first threshold and the second threshold may be determined according to an actual situation, a subject type, and a data amount of the training data.
As another embodiment, the medium score of the target medium may instead be expressed as the probability that the subject using the target medium is the target subject, that is:
where the complement of A denotes a non-target subject, and P(A/x_i), i.e. the medium score, indicates the probability that the subject using the target medium x_i is the target subject.
Because one subject uses multiple target media, each target medium may include multiple sub-media, and each sub-medium may in turn include next-level sub-media, each target medium acts as the subject of its sub-media, so a medium hierarchy exists. Whether a subject is a target subject depends on the medium scores of its target media, and the medium score of a target medium depends on the medium scores of the sub-media it uses when the target medium is itself treated as a subject. Thus, for one subject, the media can be divided into levels to form a tree structure.
Therefore, as a further embodiment, as shown in fig. 5, the difference from the embodiment shown in fig. 4 is that
the score obtaining module 403 includes:
a structure establishing unit 501, configured to establish a medium-level tree structure according to the sub medium used by each target medium and the next-level sub medium used by each sub medium; the target medium is used as a branch node or a leaf node;
a score obtaining unit 502, configured to obtain, for any branch node, a medium score corresponding to a child node of the branch node, and use a score obtained by summarizing the medium scores of the child nodes as the medium score of the branch node;
for any leaf node, acquiring a medium score obtained by pre-training, wherein the medium score of the leaf node is generated in advance by the pre-calculation module, with the parent node of the leaf node taken as the target subject and the leaf node taken as the target medium used by that target subject.
The score of each target medium can be obtained by calculating the scores of the branch nodes and the leaf nodes.
That is, when the target medium includes sub-media, the medium score of each sub-medium is obtained; summarizing the medium scores of all the sub-mediums to obtain a score as the medium score of the target medium;
and when the target medium does not comprise the sub-medium, acquiring a medium score obtained by pre-training the target medium.
When the sub-medium does not comprise the next-level sub-medium, the medium score of the sub-medium is that the target medium is used as a target subject, and the sub-medium is used as the target medium used by the target subject and is generated in advance according to a training mode of the medium score of the target medium;
when the sub-media comprise the next-level sub-media, the scores of the sub-media are the media scores of the obtained next-level sub-media, and the scores obtained by summarizing the media scores of the various next-level sub-media are used as the media scores of the sub-media; and the media fraction of the next level of sub-media can be analogized in turn.
The medium fraction of the target medium can be calculated by a pre-calculation module.
By performing media level division on the subject, the accuracy of subject detection can be further improved.
Further, as still another embodiment, as shown in fig. 5, the detection module 404 may include:
a subject calculation unit 503, configured to sum the media scores of the target media to obtain a subject score of the detection subject;
a detecting unit 504, configured to determine whether the detected subject is a target subject according to the subject score.
The medium scores of the target media may be summarized in various ways. For example, they may be summarized in combination with the media types of the target media: different weights are assigned to different media types according to how strongly each media type influences whether the subject is a target subject, and the medium scores of the target media are then combined with their corresponding weights, for example by addition or multiplication, to obtain the subject score of the detection subject.
The subject score can then be used to detect whether the subject is a target subject.
The subject score may indicate a probability that the subject using each target medium is the target subject, and the higher the score is, the higher the possibility that the detection subject becomes the target subject is.
Of course, the probability that the subject using each target medium is a non-target subject may be indicated, and the lower the score, the higher the probability that the detection subject becomes a target subject.
To achieve the determination, a score threshold may be set according to the actual situation, so that the subject score is compared with the score threshold to determine whether the detection subject is the target subject.
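The weighted summarizing and the threshold comparison can be sketched as follows. This is a minimal sketch, assuming an additive combination normalized by the total weight (the normalization is a choice made for this example, not something the text prescribes) and treating a score equal to the threshold as a hit. The function names and the weight values are hypothetical.

```python
def weighted_subject_score(media_scores, type_weights):
    """Weighted average of medium scores, one weight per media type.

    media_scores: list of (media_type, medium_score) pairs.
    type_weights: {media_type: weight}; larger weight means stronger influence.
    """
    total_weight = sum(type_weights[t] for t, _ in media_scores)
    return sum(type_weights[t] * s for t, s in media_scores) / total_weight


def is_target_subject(subject_score, score_threshold, score_is_target_probability=True):
    """Compare the subject score with the threshold in the right direction.

    If the score is the probability of being a target subject, a high score
    flags the subject; if it is the probability of being a non-target
    subject, a low score flags the subject.
    """
    if score_is_target_probability:
        return subject_score >= score_threshold
    return subject_score <= score_threshold


score = weighted_subject_score([("IP", 0.4), ("PHONE", 0.8)], {"IP": 1.0, "PHONE": 3.0})
```

In this example the phone number is weighted three times as heavily as the IP, so the subject score lands closer to the phone number's medium score.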
As another embodiment, the subject calculation unit may be specifically configured to:
summarizing the medium scores of all target media, and calculating the subject score of the detection subject according to the following second calculation formula;
where k represents the total number of target media used by the detection subject; X represents the number of non-target subjects in the training data, and Y represents the number of target subjects in the training data.
The subject score indicates the probability that a subject using the media x_1, x_2, ..., x_k is a non-target subject.
The prior probability in the formula needs to be corrected.
Since it is unlikely that one non-target subject uses different target media at the same time, the events of non-target subjects using different target media can be assumed to be mutually independent. The second calculation formula can therefore be obtained by the following derivation:
where the medium score of the target medium x_i can be obtained using the first calculation formula described above.
Of course, as yet another embodiment, when the medium score is expressed as P(A/x_i), the subject score may be:
where
the result indicates the probability that the subject using the media x_1, x_2, ..., x_k is a non-target subject.
According to the embodiment of the application, the medium fraction of the target medium is calculated by combining the medium types, the detection main body is judged according to the medium fractions of all the target media of the detection main body, the target media of different medium types are comprehensively considered instead of a single medium, and the accuracy of main body detection is improved.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As used in the specification and in the claims, certain terms are used to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This specification and the claims do not intend to distinguish between components that differ in name but not in function. In the following description and in the claims, the terms "include" and "comprise" are used in an open-ended fashion, and thus should be interpreted to mean "include, but not limited to." "Substantially" means that, within an acceptable error range, a person skilled in the art can solve the technical problem and substantially achieve the technical effect. Furthermore, the term "coupled" is intended to encompass any direct or indirect electrical coupling. Thus, if a first device couples to a second device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections. The description which follows is a preferred embodiment of the present application, but is made for the purpose of illustrating the general principles of the application and not for the purpose of limiting the scope of the application. The protection scope of the present application shall be subject to the definitions of the appended claims.
It is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a good or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such good or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a commodity or system that includes the element.
The foregoing description shows and describes several preferred embodiments of the application, but as aforementioned, it is to be understood that the application is not limited to the forms disclosed herein, but is not to be construed as excluding other embodiments and is capable of use in various other combinations, modifications, and environments and is capable of changes within the scope of the application as expressed herein, commensurate with the above teachings, or the skill or knowledge of the relevant art. And that modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the application, which is to be protected by the claims appended hereto.

Claims (14)

1. A method of obtaining a media fraction of a target media, comprising:
acquiring training data carrying a label; the label is used for identifying a target subject and a non-target subject using a target medium;
calculating the target subject number and the non-target subject number of the target medium used in the training data aiming at the target medium;
acquiring the media type of the target media;
and calculating and obtaining the medium fraction of the target medium according to the target subject number, the non-target subject number and the medium type.
2. The method of claim 1, further comprising, prior to said obtaining tag-bearing training data:
acquiring a target medium used by a detection main body; the target medium is a medium used by a target subject;
after the calculating and obtaining the media score of the target media according to the target subject number, the non-target subject number and the media type, the method further includes:
and judging whether the detection subject is the target subject or not according to the medium fraction of each target medium.
3. The method of claim 2, wherein the determining whether the detection subject is a target subject according to the media scores of the respective target media comprises:
summarizing the medium scores of all target media to obtain the main body score of the detection main body;
and judging whether the detection subject is a target subject or not according to the subject score.
4. The method according to any one of claims 1 to 3, further comprising:
establishing a medium level tree structure according to the sub-medium used by each target medium and the next level sub-medium used by each sub-medium; the target medium is used as a branch node or a leaf node;
for any branch node, acquiring the medium scores corresponding to the child nodes of the branch node, and summarizing the medium scores of the child nodes to obtain the scores serving as the medium scores of the branch nodes;
for any leaf node, acquiring a medium score obtained by pre-training, wherein the medium score of the leaf node is generated in advance according to a training mode of the medium score of a target medium by taking a parent node of the leaf node as a target subject and taking the leaf node as the target medium used by the target subject.
5. The method according to any one of claims 1 to 3, wherein the medium fraction of the target medium is a probability that a subject using the target medium is a non-target subject;
the calculating and obtaining the media score of the target media according to the target subject number, the non-target subject number and the media type comprises:
according to the target subject number, the non-target subject number and the media type, calculating and obtaining a media fraction of the target media according to a first calculation formula;
wherein A represents a target subject, the complement of A represents a non-target subject, and x_i represents the i-th target medium used by a non-target subject; m represents the number of non-target subjects using the target medium; n represents the number of target subjects using the target medium; F(m, n) represents the number of media that are used by the m non-target subjects and the n target subjects and belong to the media type.
6. The method of claim 5, wherein F (m, n) is statistically derived from the training data when the number of non-target subjects is less than a first threshold and the number of target subjects is less than a second threshold;
when the number of non-target subjects is less than the first threshold and the number of target subjects is greater than the second threshold, or when the number of non-target subjects is greater than the first threshold and the number of target subjects is less than the second threshold:
wherein α_n and β_n are respectively the slope and the intercept obtained by fitting F(m, n);
the first calculation formula is specifically as follows:
when the number of non-target subjects is greater than a first threshold and the number of target subjects is greater than a second threshold:
F(m,n)≈1;
the first calculation formula is specifically as follows:
7. The method of claim 3, wherein the summarizing the medium scores of the respective target media to obtain the subject score of the detection subject comprises:
summarizing the medium scores of all target media, and calculating the subject score of the detection subject according to the following second calculation formula;
wherein k represents the total number of target media used by the detection subject; X represents the number of non-target subjects in the training data, Y represents the number of target subjects in the training data, and the per-medium term in the formula represents the medium score of the target medium x_i.
8. An apparatus for obtaining a media score of a target media, comprising a pre-calculation module configured to:
acquiring training data carrying a label; the label is used for identifying a target subject and a non-target subject using a target medium;
calculating the target subject number and the non-target subject number of the target medium used in the training data aiming at the target medium;
acquiring the media type of the target media;
and calculating and obtaining the medium fraction of the target medium according to the target subject number, the non-target subject number and the medium type.
9. The apparatus of claim 7, further comprising:
the medium acquisition module is used for acquiring a target medium used by the detection main body; the target medium is a medium used by a target subject;
and the detection module is used for judging whether the detection main body is the target main body or not according to the medium fraction of each target medium.
10. The apparatus of claim 8, wherein the detection module comprises:
the main body calculating unit is used for summarizing the medium scores of all target media to obtain the main body score of the detection main body;
and the detection unit is used for judging whether the detection subject is a target subject or not according to the subject score.
11. The apparatus of any one of claims 8 to 10, further comprising:
the structure establishing unit is used for establishing a medium level tree structure according to the sub-medium used by each target medium and the next level sub-medium used by each level of sub-medium; the target medium is used as a branch node or a leaf node;
the score acquisition unit is used for acquiring the medium scores corresponding to the subnodes of any branch node, and taking the scores obtained by summarizing the medium scores of the subnodes as the medium scores of the branch nodes;
for any leaf node, acquiring a medium score obtained by pre-training, wherein the medium score of the leaf node is generated in advance according to a training mode of the medium score of a target medium by taking a parent node of the leaf node as a target subject and taking the leaf node as the target medium used by the target subject.
12. The apparatus according to any one of claims 8 to 10, wherein the medium score of the target medium is the probability that a subject using the target medium is a non-target subject;
the pre-calculation module calculating the medium score of the target medium according to the number of target subjects, the number of non-target subjects and the medium type comprises:
calculating the medium score of the target medium from the number of target subjects, the number of non-target subjects and the medium type according to a first calculation formula;
wherein A represents a target subject and the complement of A represents a non-target subject; x_i represents the i-th target medium used by a non-target subject; m represents the number of non-target subjects using the target medium; n represents the number of target subjects using the target medium; and F(m, n) represents the number of media of the medium type that are used by the m non-target subjects and the n target subjects.
13. The apparatus of claim 12, wherein, when the number of non-target subjects is less than a first threshold and the number of target subjects is less than a second threshold, F(m, n) is obtained statistically from the training data;
when the number of non-target subjects is less than the first threshold and the number of target subjects is greater than the second threshold, or when the number of non-target subjects is greater than the first threshold and the number of target subjects is less than the second threshold, F(m, n) is approximated by a fitted linear function,
wherein α_n and β_n are respectively the slope and the intercept obtained by fitting F(m, n);
the first calculation formula is specifically as follows: [formula omitted in source];
when the number of non-target subjects is greater than the first threshold and the number of target subjects is greater than the second threshold:
F(m, n) ≈ 1;
the first calculation formula is specifically as follows: [formula omitted in source].
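The three regimes for F(m, n) can be sketched as a piecewise lookup. The sketch below rests on several assumptions that the claim text does not settle: the function name `estimate_f`, the `table` and `linear_coeffs` containers, the linear form `alpha_n * m + beta_n` (the claim only names a slope and an intercept), and the handling of counts exactly equal to a threshold, which are treated here as "not less than".

```python
from typing import Dict, Tuple


def estimate_f(
    m: int,
    n: int,
    first_threshold: int,
    second_threshold: int,
    table: Dict[Tuple[int, int], float],
    linear_coeffs: Dict[int, Tuple[float, float]],
) -> float:
    """Estimate F(m, n) for m non-target and n target subjects.

    table:         F values counted directly from training data,
                   keyed by (m, n) (hypothetical container).
    linear_coeffs: per-n (slope, intercept) pairs from a linear fit
                   (hypothetical container; linear in m is an assumption).
    """
    few_non_target = m < first_threshold
    few_target = n < second_threshold

    if few_non_target and few_target:
        # Both counts are small: use the statistic from the training data.
        return table[(m, n)]
    if few_non_target or few_target:
        # Exactly one count is small: use the fitted line.
        alpha_n, beta_n = linear_coeffs[n]
        return alpha_n * m + beta_n
    # Both counts are large: F(m, n) is approximately 1.
    return 1.0


if __name__ == "__main__":
    stats = {(1, 1): 3.0}
    fits = {50: (0.5, 2.0)}
    print(estimate_f(1, 1, 10, 10, stats, fits))
    print(estimate_f(4, 50, 10, 10, stats, fits))
    print(estimate_f(20, 30, 10, 10, stats, fits))
```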
14. The apparatus of claim 10, wherein the subject calculating unit is specifically configured to:
aggregate the medium scores of all target media and calculate the subject score of the detection subject according to a second calculation formula: [formula omitted in source];
wherein k represents the total number of target media used by the detection subject; X represents the number of non-target subjects in the training data; Y represents the number of target subjects in the training data; and the remaining term represents the medium score of the target medium x_i.
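One way to combine k per-medium probabilities with priors taken from X and Y is a naive Bayes combination. The sketch below uses that technique as a stand-in: it is not the second calculation formula of the claim, which is not reproduced in the text. The function name `subject_score`, the clamping constant `eps`, and the conditional-independence assumption are all illustrative.

```python
import math
from typing import Sequence


def subject_score(
    medium_scores: Sequence[float],
    num_non_target: int,
    num_target: int,
    eps: float = 1e-9,
) -> float:
    """Combine medium scores into one probability that the subject is non-target.

    medium_scores:  per-medium probabilities that a user of the medium
                    is a non-target subject.
    num_non_target: X, number of non-target subjects in the training data.
    num_target:     Y, number of target subjects in the training data.

    Assumes the media are conditionally independent (naive Bayes).
    """
    if num_non_target <= 0 or num_target <= 0:
        raise ValueError("both subject counts must be positive")

    prior_non = num_non_target / (num_non_target + num_target)
    prior_tgt = 1.0 - prior_non

    # Work in log space so that many media do not underflow.
    log_non = math.log(prior_non)
    log_tgt = math.log(prior_tgt)
    for p in medium_scores:
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        log_non += math.log(p) - math.log(prior_non)
        log_tgt += math.log(1.0 - p) - math.log(prior_tgt)

    # Normalize the two unnormalized posteriors.
    top = max(log_non, log_tgt)
    w_non = math.exp(log_non - top)
    w_tgt = math.exp(log_tgt - top)
    return w_non / (w_non + w_tgt)


if __name__ == "__main__":
    print(subject_score([0.8, 0.8], num_non_target=50, num_target=50))
```

A detection unit could then compare the returned value with a chosen cutoff; the cutoff itself is not specified in the claims.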
CN201910282986.9A 2016-07-06 2016-07-06 Subject detection method and device Active CN110163375B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910282986.9A CN110163375B (en) 2016-07-06 2016-07-06 Subject detection method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610528139.2A CN106875016B (en) 2016-07-06 2016-07-06 Subject detection method and device
CN201910282986.9A CN110163375B (en) 2016-07-06 2016-07-06 Subject detection method and device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201610528139.2A Division CN106875016B (en) 2016-07-06 2016-07-06 Subject detection method and device

Publications (2)

Publication Number Publication Date
CN110163375A (en) 2019-08-23
CN110163375B (en) 2023-06-02

Family

ID=59238930

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910282986.9A Active CN110163375B (en) 2016-07-06 2016-07-06 Subject detection method and device
CN201610528139.2A Active CN106875016B (en) 2016-07-06 2016-07-06 Subject detection method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201610528139.2A Active CN106875016B (en) 2016-07-06 2016-07-06 Subject detection method and device

Country Status (1)

Country Link
CN (2) CN110163375B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101014935A (en) * 2004-05-24 2007-08-08 艾菲诺瓦公司 Determining design preferences of a group
CN102314447A (en) * 2010-07-05 2012-01-11 渥奇数位资讯股份有限公司 User data paring technology of community website
CN104090888A (en) * 2013-12-10 2014-10-08 深圳市腾讯计算机系统有限公司 Method and device for analyzing user behavior data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150285817A1 (en) * 2014-04-08 2015-10-08 Biodesix, Inc. Method for treating and identifying lung cancer patients likely to benefit from EGFR inhibitor and a monoclonal antibody HGF inhibitor combination therapy

Also Published As

Publication number Publication date
CN106875016B (en) 2019-04-23
CN106875016A (en) 2017-06-20
CN110163375B (en) 2023-06-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200923

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

Effective date of registration: 20200923

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman, British Islands

Applicant before: Advanced innovation technology Co.,Ltd.

GR01 Patent grant