CN109213513B - Method and device for determining share ratio of software and computer readable storage medium - Google Patents

Info

Publication number
CN109213513B
Authority
CN
China
Prior art keywords
software
target
probability
sample data
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710521670.1A
Other languages
Chinese (zh)
Other versions
CN109213513A (en
Inventor
谢毅
胡荣杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a method and a device for determining the share ratio of software and a computer readable storage medium, and belongs to the technical field of data processing. The method comprises the following steps: determining, from a plurality of specified software, the specified software having a suppressing or promoting effect on the installation probability of the target software; determining a joint probability between the target software and the determined specified software; counting first conditional probabilities that terminals corresponding to at least two dimensions install the target software under the condition that the determined specified software and the target software are installed; determining a joint probability between the determined specified software, the target software and the at least two dimensions based on the joint probability between the target software and the determined specified software and the first conditional probabilities; and obtaining the software share ratio of the target software based on the determined joint probability, a plurality of stored sample data and the dimension information of the plurality of sample data. Because no manual analysis by data statistics personnel with extensive industry background experience is required, the determination efficiency is improved.

Description

Method and device for determining share ratio of software and computer readable storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for determining a share ratio of software, and a computer-readable storage medium.
Background
At present, operators provide a great variety of software. To understand how software is used in the market and to formulate product strategies, operators usually want to know the software share ratio, which refers to the share of the software installed in the market. To determine the software share ratio, some data statistics providers offer a Software Development Kit (SDK) that is embedded in certain software, for example an application manager. When a terminal has such software installed, the SDK obtains the list of software installed on the terminal and reports it, so that the data statistics provider can determine the software share ratio of each software by sampling the software lists reported by the terminals.
In an actual application scenario, dimensions such as the terminal model, user gender, user age, and the promotion strategy of the data-collection SDK all affect the accuracy of the determined software share ratio. Taking the terminal model as an example, if terminals of a certain model do not support installing software A, and most of the sampled software lists are reported by terminals of that model, the counted software share ratio of software A is likely to be biased. For this reason, in the related art, data statistics personnel with sufficient industry background experience are generally required to analyze and correct the result. Such personnel typically know from experience which dimensions bias the software share ratio, and they resample along those dimensions and recount the software share ratio to achieve the correction.
In the process of implementing the present application, the inventor found that the related art has at least the following problem: because data statistics personnel with extensive industry background experience must analyze and correct the result manually, the software share ratio is determined inefficiently.
Disclosure of Invention
In order to solve the problem of low efficiency in determining the share ratio of software in the related art, embodiments of the present invention provide a method and an apparatus for determining the share ratio of software. The technical scheme is as follows:
In a first aspect, a method for determining a share ratio of software is provided, the method including:
determining, from a plurality of specified software, specified software having a suppressing or promoting effect on the installation probability of target software, wherein each specified software is software having a suppressing or promoting effect on the installation probability of a preset number of other software, and the target software is the software whose software share ratio is to be counted;
determining a joint probability between the target software and the determined specified software;
counting first conditional probabilities that terminals corresponding to at least two dimensions install the target software under the condition that the determined specified software and the target software are installed, wherein each of the at least two dimensions affects the software share ratio of the target software;
determining a joint probability between the determined specified software, the target software and the at least two dimensions based on the joint probability between the target software and the determined specified software and the first conditional probabilities;
obtaining the software share ratio of the target software based on the joint probability between the determined specified software, the target software and the at least two dimensions, a plurality of stored sample data and dimension information of the plurality of sample data.
In a second aspect, a method for determining a share ratio of software is provided, the method comprising:
acquiring the marginal probability that terminals corresponding to a target dimension install target software, wherein the target software is the software whose software share ratio is to be counted, and the target dimension is a dimension, among at least two dimensions affecting the software share ratio of the target software, whose marginal probability is accurate;
counting third conditional probabilities that the target software is installed by the terminals corresponding to the other dimensions in the at least two dimensions under the condition that the target software is already installed by the terminals corresponding to the target dimensions;
determining a joint probability between the at least two dimensions based on the marginal probability and the third conditional probability;
and obtaining the software share ratio of the target software based on the joint probability, a plurality of stored sample data and the dimension information of the sample data.
In a third aspect, an apparatus for determining a share ratio of software is provided, the apparatus comprising:
a first determining module, configured to determine, from a plurality of specified software, specified software having a suppressing or promoting effect on the installation probability of target software, wherein each specified software is software having a suppressing or promoting effect on the installation probability of a preset number of other software, and the target software is the software whose software share ratio is to be counted;
a second determining module for determining a joint probability between the target software and the determined specified software;
a first statistical module, configured to count first conditional probabilities that terminals corresponding to at least two dimensions install the target software under a condition that the determined specified software and the target software are installed, where each of the at least two dimensions affects a software share ratio of the target software;
a third determination module to determine a joint probability between the determined specified software, the target software, and the at least two dimensions based on the joint probability between the target software and the determined specified software and the first conditional probability;
a fourth determining module, configured to obtain the software share ratio of the target software based on the joint probability between the determined specified software, the target software and the at least two dimensions, a plurality of stored sample data, and dimension information of the plurality of sample data.
In a fourth aspect, an apparatus for determining a share ratio of software is provided, the apparatus comprising:
an acquisition module, configured to acquire the marginal probability that terminals corresponding to a target dimension install target software, wherein the target software is the software whose software share ratio is to be counted, and the target dimension is a dimension, among at least two dimensions affecting the software share ratio of the target software, whose marginal probability is accurate;
a first statistical module, configured to count third conditional probabilities that terminals corresponding to the other dimensions of the at least two dimensions install the target software under the condition that the terminals corresponding to the target dimension have already installed the target software;
a first determination module, configured to determine a joint probability between the at least two dimensions based on the marginal probability and the third conditional probabilities;
a second statistical module, configured to obtain the software share ratio of the target software based on the joint probability, a plurality of stored sample data and the dimension information of the plurality of sample data.
In a fifth aspect, a terminal is provided, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the instruction, the program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for determining a ratio of software shares according to the first aspect or the second aspect.
In a sixth aspect, there is provided a computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement the method for determining a software share ratio according to the first or second aspect.
The technical scheme provided by the embodiment of the invention has the following beneficial effects: specified software having a suppressing or promoting effect on the installation probability of target software is determined from a plurality of specified software, and a joint probability between the target software and the determined specified software is determined. Then, first conditional probabilities that terminals corresponding to at least two dimensions install the target software under the condition that the determined specified software and the target software are installed are counted; a joint probability between the determined specified software, the target software and the at least two dimensions is determined based on the joint probability between the target software and the determined specified software and the first conditional probabilities; and the software share ratio is determined based on that joint probability, a plurality of sample data and dimension information of the plurality of sample data. Because this process takes into account the conditional relationship between the specified software and the at least two dimensions and their joint-distribution influence on the target software, the accuracy of the determined software share ratio is ensured. In addition, no manual participation by data statistics personnel with extensive industry background experience is needed, so the determination efficiency is improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present invention;
FIG. 2A is a flowchart of a method for determining a share ratio of software according to an embodiment of the present invention;
FIG. 2B is a schematic diagram of a correction result shown in accordance with an exemplary embodiment;
FIG. 2C is a schematic diagram of a correction result shown in accordance with another exemplary embodiment;
FIG. 3A is a flowchart of another method for determining a share ratio of software provided by an embodiment of the present invention;
FIG. 3B is a flowchart illustrating a method for determining a software share ratio according to the embodiment of FIG. 3A;
FIG. 3C is a schematic diagram illustrating an effect of determining a software share ratio based on a terminal model according to an embodiment of the present invention;
FIG. 4A is a schematic structural diagram of a device for determining a share ratio of software according to an embodiment of the present invention;
FIG. 4B is a schematic structural diagram of another software share ratio determining apparatus provided by an embodiment of the present invention;
FIG. 5A is a schematic structural diagram of another software share ratio determining apparatus provided by an embodiment of the present invention;
FIG. 5B is a schematic structural diagram of another software share ratio determining apparatus provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a server of a device for determining a software share ratio according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Before explaining the embodiments of the present invention in detail, terms, application scenarios and implementation environments related to the embodiments of the present invention will be briefly described.
First, terms related to embodiments of the present invention will be described.
Software share ratio: refers to the share of the software installed in the market. From the software share ratio, an operator can learn how widely the software is installed in the market, and therefore how the software is used in the market.
KOL: (Key Opinion Leader) here can be understood as software that has a suppressing or promoting effect on the installation probability of other software. For example, if the software is Application Treasure (Yingyongbao), which usually promotes software such as QQ, then Application Treasure has a certain promoting effect on the installation probability of software such as QQ.
Gibbs sampling: an algorithm from Markov Chain Monte Carlo (MCMC) theory that is used to draw a sequence of samples approximately following a given multi-dimensional probability distribution (in embodiments of the present invention, the joint probability between multiple dimensions), so as to extract sample data.
Promotion degree (lift): generally indicates whether one software has a promoting or suppressing effect on the installation probability of another software. In practical application, consider installing a second software under the condition that a first software is installed: when the promotion degree between the first software and the second software is greater than 1, the first software has a promoting effect on the installation probability of the second software; when the promotion degree is less than 1, the first software has a suppressing effect on the installation probability of the second software.
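To illustrate the Gibbs sampling term defined above, the following minimal sketch draws samples from a joint distribution over two binary variables by alternately resampling each variable from its conditional distribution. The joint table, the function names and the burn-in length are illustrative assumptions and are not taken from the patent.

```python
import random
from collections import Counter

# Hypothetical joint distribution over two binary variables (x, y).
# The numbers are illustrative assumptions, not data from the source.
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def conditional_one(joint, axis, other_value):
    """Return P(variable on `axis` == 1 | the other variable == other_value)."""
    if axis == 0:
        p1, p0 = joint[(1, other_value)], joint[(0, other_value)]
    else:
        p1, p0 = joint[(other_value, 1)], joint[(other_value, 0)]
    return p1 / (p0 + p1)


def gibbs_sample(joint, n_samples, burn_in=500, seed=0):
    """Draw (x, y) pairs by alternately resampling x given y and y given x."""
    rng = random.Random(seed)
    x, y = 0, 0
    samples = []
    for step in range(burn_in + n_samples):
        x = 1 if rng.random() < conditional_one(joint, 0, y) else 0
        y = 1 if rng.random() < conditional_one(joint, 1, x) else 0
        if step >= burn_in:  # discard the early, not yet mixed, draws
            samples.append((x, y))
    return samples


samples = gibbs_sample(JOINT, n_samples=20000)
frequencies = {k: c / len(samples) for k, c in Counter(samples).items()}
```

With enough draws, the empirical frequencies should approach the entries of the assumed joint table, which is the property the definition above relies on.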
Next, an application scenario of the embodiment of the present invention will be described.
In an actual application scenario, when the software share ratio of the target software is counted, dimensions such as the terminal model, user gender, user age, and the promotion strategy of the data-collection SDK affect the accuracy of the determined software share ratio, and certain sibling or competing software products also affect it. For example, referring to FIG. 1, if some software has a promoting effect on the installation probability of the target software whose software share ratio is currently being counted, that software causes the counted software share ratio of the target software to be higher; conversely, if some software has a suppressing effect on the installation probability of the target software, that software causes the counted software share ratio of the target software to be lower. Such software is typically referred to as a KOL.
Therefore, when there is other software having an effect of increasing or suppressing the installation probability of the target software, the joint distribution between the target software and other software capable of affecting the target software needs to be considered in the process of counting the share ratio of the software by using the sampling method.
To this end, the embodiment of the present invention provides a method for determining a software share ratio that handles not only the case where only at least two dimensions bias the software share ratio, but also the case where, in addition to the at least two dimensions, software belonging to the KOL category biases the software share ratio of the target software.
Finally, an environment in which embodiments of the present invention are implemented is described.
The method for determining the proportion of the software shares provided by the embodiment of the invention can be realized by taking the server as an execution subject. In a specific implementation, the server may be one server or a server cluster composed of multiple servers, which is not limited in this embodiment of the present invention.
The server may receive sample data reported by a plurality of data sources, where the sample data may specifically be software lists. The plurality of data sources may include terminals such as mobile phones and tablet computers, on which software carrying the data statistics SDK may be installed; for example, the software may be an app or a software manager. The terminal can therefore obtain the list of its installed software through the SDK in the software and report the list to the server. The server may then determine the software share ratio of the software based on all the received sample data, using the method for determining the software share ratio provided by the embodiment of the present invention; for the specific implementation process, see the embodiments shown in FIG. 2A and FIG. 3A below.
As described above, in an actual application scenario, there are two cases when the software share ratio of the target software is counted. In the first case, the factors affecting the software share ratio include both the at least two dimensions and other software that has a certain influence on the installation probability of the target software. In the second case, the factors affecting the software share ratio include only the at least two dimensions, such as the terminal model, user gender, and user age. For ease of understanding, the specific implementation of the method for determining the software share ratio in these two cases is described in detail below through the two embodiments of FIG. 2A and FIG. 3A, respectively.
First, take the case where the factors affecting the software share ratio include both the at least two dimensions and other software that has a certain influence on the installation probability of the target software. Referring to FIG. 2A, FIG. 2A is a flowchart of a method for determining a software share ratio provided by an embodiment of the present invention, and the method is applied to a server. The method comprises the following steps:
Step 201: determining, from a plurality of specified software, the specified software having a suppressing or promoting effect on the installation probability of target software, wherein each specified software is software having a suppressing or promoting effect on the installation probability of a preset number of other software, and the target software is the software whose software share ratio is to be counted.
Determining, from the plurality of specified software, the specified software having a suppressing or promoting effect on the installation probability of the target software may include: determining the promotion degree between each of the plurality of specified software and the target software, and, if there is specified software whose promotion degree is larger than 1 or smaller than 1, determining it as the specified software having a suppressing or promoting effect on the installation probability of the target software.
Determining the promotion degree between each specified software and the target software may include: for each of the plurality of specified software, acquiring the marginal probability of the specified software, counting the conditional probability of installing the target software under the condition that the specified software is installed, and dividing the counted conditional probability by the acquired marginal probability of the specified software to obtain the promotion degree between the specified software and the target software.
In a possible implementation manner, the server stores the marginal probability of each specified software. In actual implementation, the marginal probability of each specified software may be obtained through a publishing channel such as the Ministry of Industry and Information Technology and then stored in the server, or it may be obtained through data exchange with a third party and then stored in the server, which is not limited in the embodiment of the present invention.
In addition, in practical implementation, the server may count, from the sample data reported by the plurality of data sources, the conditional probability of installing the target software under the condition that the specified software is installed. That is, the server counts the number of samples S1 in which the target software is installed under the condition that the specified software is installed, and counts the number of samples S2 in which the target software is installed. Dividing S1 by S2 then gives the conditional probability of installing the target software under the condition that the specified software is installed.
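The counting described above can be sketched as follows. This is a minimal illustration under stated assumptions: each sample is modelled as a set of software names, the function and variable names are hypothetical, and the marginal probability of the specified software is assumed to be supplied from outside the sample data. Read literally, S1 counts samples containing both software and S2 counts samples containing the target software, so S1 / S2 is the share of target-software samples that also contain the specified software; dividing it by the marginal probability of the specified software X gives P(X, Y) / (P(X) P(Y)), the symmetric form of the promotion degree.

```python
import math


def promotion_degree(samples, specified, target, specified_marginal):
    """Compute (S1 / S2) / P(specified) from reported software lists."""
    # S1: samples in which both the specified and the target software appear.
    s1 = sum(1 for apps in samples if specified in apps and target in apps)
    # S2: samples in which the target software appears.
    s2 = sum(1 for apps in samples if target in apps)
    if s2 == 0 or specified_marginal <= 0:
        raise ValueError("need target samples and a positive marginal probability")
    return (s1 / s2) / specified_marginal


def effect(degree):
    """Classify a promotion degree as promoting, suppressing, or none."""
    if math.isclose(degree, 1.0):
        return "none"
    return "promoting" if degree > 1.0 else "suppressing"


# Hypothetical reported software lists (one set per terminal).
samples = [{"X", "Y"}, {"X", "Y"}, {"Y"}, {"X"}, {"Z"}]
degree = promotion_degree(samples, "X", "Y", specified_marginal=0.5)
```

Here S1 = 2 and S2 = 3, so with the assumed marginal probability of 0.5 the promotion degree exceeds 1 and X would be treated as promoting Y.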
Further, before the specified software having a suppressing or promoting effect on the installation probability of the target software is determined from the plurality of specified software, the plurality of specified software need to be determined. In a specific implementation, this may be done through the following steps:
2011: determining the proportion of first software in the plurality of sample data, wherein the first software is any one of the plurality of software corresponding to the plurality of sample data.
In practical implementation, for any first software among the plurality of software corresponding to the plurality of sample data, the proportion of the first software in the plurality of sample data needs to be determined. If the proportion is small, the influence of the first software is also small and can be ignored; if the proportion is large, the first software has a large influence that cannot be ignored.
It should be noted that the embodiment of the present invention only takes determining the proportion of the first software in the plurality of sample data as an example. In another embodiment, if the number of sample data is small, the proportion of the first software in all the sample data may be determined instead, which is not limited in the embodiment of the present invention.
2012: if the proportion is greater than or equal to a preset ratio, determining the promotion degree between the first software and each of a plurality of second software, wherein the plurality of second software are the software other than the first software among the plurality of software, and the promotion degree indicates whether the first software has a suppressing or promoting effect on the installation probability of other software.
The preset proportion can be set by the server in a default mode, or can be set by technicians in a self-defined mode according to actual requirements, and the preset proportion is not limited in the embodiment of the invention.
If the determined proportion is greater than or equal to the preset ratio, the influence of the first software is large, and the number of other software affected by the first software needs to be determined further. To determine that number, it is necessary to determine which software the first software affects; to this end, the server determines the promotion degree between the first software and each of the plurality of second software.
Determining the promotion degree between the first software and each of the plurality of second software may include: for each second software, obtaining the marginal probability of the second software, determining the third conditional probability of installing the second software under the condition that the first software is installed, and dividing the third conditional probability by the marginal probability of the second software to obtain the promotion degree between the first software and the second software.
For example, if the first software is K and the second software is L, the marginal probability P(L) of the second software is obtained and the conditional probability P(L | K) is counted; the promotion degree between the first software K and the second software L is then P(L | K) / P(L).
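As a worked illustration with assumed numbers (they do not come from the patent), suppose the marginal probability of the second software is P(L) = 0.2 and the counted conditional probability is P(L | K) = 0.3:

```latex
\mathrm{lift}(K, L) = \frac{P(L \mid K)}{P(L)} = \frac{0.3}{0.2} = 1.5 > 1
```

Under these assumed values, K has a promoting effect on the installation probability of L. If the counted conditional probability were instead 0.1, the ratio would be 0.1 / 0.2 = 0.5 < 1, indicating a suppressing effect.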
It should be noted that the above description only takes as an example determining the promotion degree between each first software whose proportion is greater than or equal to the preset ratio and each of the plurality of second software. In actual implementation, the first software may instead be sorted by proportion in descending order, the top n first software may be selected, and the promotion degree between each of the n first software and each of the plurality of second software may be determined, which is not limited in the embodiment of the present invention.
2013: counting the number of software, among the plurality of second software, whose promotion degree with the first software is smaller than 1 or larger than 1.
As described above, if the promotion degree between a second software and the first software is smaller than 1 or larger than 1, the first software suppresses or promotes the installation probability of that second software. The server counts the number of second software whose promotion degree with the first software is smaller than 1 or larger than 1.
2014: when the counted number of software reaches the preset number, determining the first software as specified software.
The preset number may be set by the server by default, or may be customized by a technician according to actual requirements, which is not limited in the embodiment of the present invention.
If the counted number of software reaches the preset number, the first software has a promoting or suppressing effect on the installation probability of a large number of second software, and the first software is therefore determined to be specified software. The plurality of specified software can be determined in this manner.
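Steps 2011 to 2014 can be sketched together as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each sample is a set of software names, the marginal probabilities are assumed to be available in a dictionary, and the names `find_specified_software`, `min_share`, `min_affected` and `tolerance` are hypothetical. The `tolerance` parameter is an added assumption for deciding when a promotion degree counts as different from 1.

```python
def find_specified_software(samples, marginals, min_share, min_affected,
                            tolerance=1e-9):
    """Return software that affects at least `min_affected` other software."""
    if not samples:
        return []
    all_software = set().union(*samples)
    specified = []
    for first in sorted(all_software):
        with_first = [apps for apps in samples if first in apps]
        # 2011: proportion of the first software in the sample data.
        if len(with_first) / len(samples) < min_share:
            continue
        affected = 0
        for second in all_software - {first}:
            # 2012: promotion degree = P(second | first) / P(second).
            conditional = (sum(1 for apps in with_first if second in apps)
                           / len(with_first))
            degree = conditional / marginals[second]
            # 2013: count second software whose promotion degree is not 1.
            if abs(degree - 1.0) > tolerance:
                affected += 1
        # 2014: enough affected software makes `first` a specified software.
        if affected >= min_affected:
            specified.append(first)
    return specified


# Hypothetical sample data and marginal probabilities.
samples = [{"A", "B"}, {"A", "B"}, {"A", "B"}, {"A", "B"}, {"A", "C"},
           {"A", "C"}, {"A"}, {"A"}, {"C"}, {"C", "B"}]
marginals = {"A": 0.8, "B": 0.4, "C": 0.4}
result = find_specified_software(samples, marginals,
                                 min_share=0.6, min_affected=2)
```

In this assumed data only A passes the proportion check, and its promotion degrees with B (0.5 / 0.4) and C (0.25 / 0.4) both differ from 1, so A is returned as specified software.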
Step 202: a joint probability between the target software and the determined specified software is determined.
Because the specified software increases or suppresses the installation probability of the target software, it affects the determined software share ratio of the target software. Therefore, after the specified software that suppresses or increases the installation probability of the target software is determined, the joint distribution between the specified software and the target software needs to be considered; that is, the server determines the joint probability between the specified software and the target software.
In a specific implementation, the server obtains the marginal probability of the determined specified software, counts a second conditional probability of installing the target software under the condition that the determined specified software is installed, and determines the joint probability between the target software and the determined specified software based on that marginal probability and the second conditional probability.
For example, if the specified software that suppresses or increases the installation probability of the target software Y is software X, the marginal probability of the specified software is P(X), and the second conditional probability of installing the target software under the condition that the determined specified software is installed, obtained through statistics, is P(Y | X), then the joint probability between the target software and the determined specified software can be determined as P(X, Y) = P(Y | X) × P(X).
Because the marginal probability of the specified software is the true probability, and the second conditional probability of installing the target software under the condition that the determined specified software is installed is also taken into account, the joint probability determined on this basis is corrected relative to the case in which the marginal probability of the specified software is inaccurate; that is, the influence of the specified software on the target software is corrected.
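As a small worked illustration of P(X, Y) = P(Y | X) × P(X), the sketch below estimates P(Y | X) from install counts and multiplies it by a marginal probability that is assumed to be known accurately. The function name, parameter names and numbers are hypothetical.

```python
def joint_probability(marginal_x: float, count_x: int, count_x_and_y: int) -> float:
    """Return P(X, Y) = P(Y | X) * P(X).

    marginal_x     -- trusted marginal probability P(X) of the specified software
    count_x        -- number of samples in which X is installed
    count_x_and_y  -- number of those samples in which Y is also installed
    """
    if count_x == 0:
        raise ValueError("no sample has the specified software installed")
    p_y_given_x = count_x_and_y / count_x  # second conditional probability P(Y | X)
    return p_y_given_x * marginal_x


# Made-up numbers: P(X) = 0.4, and 50 of the 200 samples with X also have Y.
p_xy = joint_probability(marginal_x=0.4, count_x=200, count_x_and_y=50)
```

Here P(Y | X) is 50 / 200 = 0.25, so the joint probability is 0.25 × 0.4 = 0.1.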
Step 203: Count, under the condition that the determined specified software and the target software are installed, first conditional probabilities that the terminals corresponding to at least two dimensions install the target software, where each of the at least two dimensions affects the software share ratio of the target software.
Since the factors affecting the software share ratio of the target software further include at least two dimensions, in a practical implementation it is also necessary to count the first conditional probabilities that the terminals corresponding to the at least two dimensions install the target software. These statistics are gathered in a manner similar to that used for the conditional probability of installing the target software under the condition that the specified software is installed.
For example, if the at least two dimensions include the terminal model and the user gender, P(model | X, Y) and P(gender | X, Y, model) are counted. In a specific implementation, each of the at least two dimensions includes a plurality of pieces of dimension information; for example, the terminal model includes A and B, and the user gender includes male and female. The counted P(model | X, Y) and P(gender | X, Y, model) therefore each include a plurality of values. For example, the first conditional probabilities include P(A | X, Y), P(B | X, Y), P(male | X, Y, A), P(female | X, Y, A), P(male | X, Y, B), and P(female | X, Y, B).
Step 204: determining a joint probability between the determined specified software, the target software and the at least two dimensions based on the joint probability between the target software and the determined specified software and the first conditional probability.
In a practical implementation, the server determines the joint probability between the determined specified software, the target software and the at least two dimensions using the Gibbs sampling principle P(H, M) = P(H | M) × P(M), based on the joint probability between the target software and the determined specified software and on the first conditional probabilities.
Continuing with the above example, if the specified software that increases or suppresses the installation probability of the target software Y is the software X, the server uses the Gibbs sampling principle P(H, M) = P(H | M) × P(M), and the joint probability between the specified software X, the target software Y and the at least two dimensions can be determined as follows:
P(X, Y, A, male) = P(male | X, Y, A) × P(A | X, Y) × P(Y | X) × P(X);
P(X, Y, B, male) = P(male | X, Y, B) × P(B | X, Y) × P(Y | X) × P(X);
P(X, Y, A, female) = P(female | X, Y, A) × P(A | X, Y) × P(Y | X) × P(X);
P(X, Y, B, female) = P(female | X, Y, B) × P(B | X, Y) × P(Y | X) × P(X).
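The factorisation in step 204 is an application of the product rule P(H, M) = P(H | M) × P(M), applied once per dimension. The following sketch multiplies the factors for every (model, gender) combination. All probability values are made up, and the dictionary layout and function name are assumptions chosen for illustration.

```python
from itertools import product

# Made-up inputs: P(X), P(Y | X), P(model | X, Y) and P(gender | X, Y, model).
p_x = 0.4
p_y_given_x = 0.25
p_model_given_xy = {"A": 0.6, "B": 0.4}
p_gender_given_xy_model = {
    ("male", "A"): 0.7, ("female", "A"): 0.3,
    ("male", "B"): 0.5, ("female", "B"): 0.5,
}


def joint_with_dimensions(p_x, p_y_given_x, p_model, p_gender):
    """Return {(model, gender): P(X, Y, model, gender)} using the product rule."""
    joint = {}
    for model, gender in product(p_model, ("male", "female")):
        joint[(model, gender)] = (
            p_gender[(gender, model)]  # P(gender | X, Y, model)
            * p_model[model]           # P(model | X, Y)
            * p_y_given_x              # P(Y | X)
            * p_x                      # P(X)
        )
    return joint


joint_xy_dims = joint_with_dimensions(
    p_x, p_y_given_x, p_model_given_xy, p_gender_given_xy_model
)
```

A useful sanity check on such a table is that its entries sum to P(X, Y), which is 0.25 × 0.4 = 0.1 for these inputs.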
Step 205: Obtain the software share ratio of the target software based on the joint probability between the determined specified software, the target software and the at least two dimensions, the stored plurality of sample data, and the dimension information of the plurality of sample data.
In a specific implementation, the server extracts a specified number of sample data from the stored multiple sample data based on the joint probability and the dimension information of the multiple sample data, where the specified number may be set by a user in a customized manner according to actual needs, or may be set by the server in a default manner, which is not limited in the embodiment of the present invention.
For example, assuming that the specified number is m, the server extracts sample data from the plurality of sample data according to the dimension information of the plurality of sample data. For example, it extracts P(X, Y, A, male) × m pieces of sample data whose dimension information is A and male and which were reported by terminals on which the specified software X, which influences the target software Y, is installed; and it extracts P(X, Y, B, male) × m pieces of sample data whose dimension information is B and male and which were reported by terminals on which the specified software X is installed.
Then, the server can determine the software share ratio of the target software in the extracted m sample data. For example, if t sample data in the m sample data correspond to the target software, the software share ratio is (t/m) × 100%.
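The extraction and the (t/m) × 100% calculation can be sketched as below. This is only an illustration under stated assumptions: the per-cell probabilities are assumed to have been normalised so that they sum to 1 (so that the quotas add up to roughly m), each sample is a dictionary with the hypothetical keys `model`, `gender` and `has_target`, and random selection within a cell is one possible extraction strategy that the source does not prescribe.

```python
import random
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (terminal model, user gender)


def share_ratio_from_extraction(
    samples: List[dict],
    cell_probability: Dict[Cell, float],
    m: int,
    seed: int = 0,
) -> float:
    """Extract about probability * m samples per cell and return the share in percent."""
    rng = random.Random(seed)
    extracted = []
    for (model, gender), probability in cell_probability.items():
        quota = round(probability * m)
        pool = [s for s in samples if s["model"] == model and s["gender"] == gender]
        # Never ask for more samples than the cell actually contains.
        extracted.extend(rng.sample(pool, min(quota, len(pool))))
    if not extracted:
        return 0.0
    t = sum(1 for s in extracted if s["has_target"])  # samples with the target software
    return 100.0 * t / len(extracted)


# Made-up data: every (A, male) sample has the target software, no (B, male) sample does.
samples = (
    [{"model": "A", "gender": "male", "has_target": True} for _ in range(10)]
    + [{"model": "B", "gender": "male", "has_target": False} for _ in range(10)]
)
share = share_ratio_from_extraction(
    samples, {("A", "male"): 0.75, ("B", "male"): 0.25}, m=8
)
```

With m = 8 the quotas are 6 and 2, so t = 6 and the share ratio is 6 / 8 = 75%.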
Referring to fig. 2B, fig. 2B is a schematic diagram illustrating a correction result according to an exemplary embodiment. The determined specified software is X, the target software is Y, and X and Y are sibling software products; that is, the determined specified software X promotes the installation probability of the target software Y. As can be seen from fig. 2B, the overestimated software share ratio of Y caused by X is corrected by the above method.
Referring to fig. 2C, fig. 2C is a schematic diagram illustrating a correction result according to another exemplary embodiment. The determined specified software is X, the target software is Y, and X and Y are competing software products; that is, the determined specified software X suppresses the installation probability of the target software Y. As can be seen from fig. 2C, the underestimated software share ratio of Y caused by X is corrected by the above method.
Before the software share ratio of the target software is counted based on the joint probability between the determined specified software, the target software and the at least two dimensions, the stored plurality of sample data and the dimension information of the plurality of sample data, the plurality of sample data and their dimension information need to be determined.
In a specific implementation, the server receives sample data reported by a plurality of data sources together with the dimension information of the sample data, and determines the plurality of sample data and the dimension information of the plurality of sample data from all the received sample data according to the weights of the plurality of data sources.
For example, if the weight of a certain data source is greater, more sample data reported by the data source may be extracted from all the received sample data. On the contrary, if the weight of a certain data source is smaller, a smaller part of sample data reported by the data source can be extracted from all the received sample data. Thus, the plurality of sample data can be obtained.
The dimension information of each of the plurality of sample data can be obtained by the data source from social software such as QQ and WeChat and then reported to the server.
It should be noted that the weights of the multiple data sources may be set by a technician according to actual situations, for example, if a data source reports a data sample through a software manager, and the software manager has a large share in the market, the weight of the data source may be set to be larger. The embodiment of the present invention does not limit the specific setting rule of the weight of each data source.
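One way to read "determining the plurality of sample data according to the weights of the data sources" is proportional sampling, sketched below. The proportional rule, the function name and the data layout are assumptions made for illustration; the source explicitly leaves the weighting rule open.

```python
import random
from typing import Dict, List


def draw_by_source_weight(
    samples_by_source: Dict[str, List[object]],
    weights: Dict[str, float],
    total: int,
    seed: int = 0,
) -> List[object]:
    """Pick about `total` samples, taking more from data sources with larger weights."""
    rng = random.Random(seed)
    weight_sum = sum(weights[name] for name in samples_by_source)
    selected = []
    for name, source_samples in samples_by_source.items():
        # The share of the total allotted to a source is proportional to its weight.
        quota = int(total * weights[name] / weight_sum)
        selected.extend(rng.sample(source_samples, min(quota, len(source_samples))))
    return selected


# Made-up example: a software-manager source weighted three times higher than another.
samples_by_source = {
    "software_manager": list(range(0, 100)),
    "other_source": list(range(100, 200)),
}
selected = draw_by_source_weight(
    samples_by_source, {"software_manager": 3.0, "other_source": 1.0}, total=40
)
```

In this example 30 of the 40 selected samples come from the higher-weighted source and 10 from the other.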
In the embodiment of the present invention, the specified software that suppresses or increases the installation probability of the target software is determined from a plurality of specified software, and the joint probability between the target software and the determined specified software is determined. Then, under the condition that the determined specified software and the target software are installed, first conditional probabilities that the terminals corresponding to at least two dimensions install the target software are counted; the joint probability between the determined specified software, the target software and the at least two dimensions is determined based on the joint probability between the target software and the determined specified software and on the first conditional probabilities; and the software share ratio is determined based on that joint probability, the plurality of sample data, and the dimension information of the plurality of sample data. Because this process takes into account the conditional relationships between the specified software and the at least two dimensions, as well as their joint-distribution influence on the target software, the accuracy of the determined software share ratio is ensured. In addition, determining the software share ratio does not require manual participation by data statisticians with extensive industry experience, so the determination efficiency is improved.
Next, the case in which the software share ratio is affected by at least two dimensions, such as the terminal model, the user gender and the user age, is taken as an example. Referring to fig. 3A, fig. 3A is a flowchart of a method for determining the software share ratio provided by an embodiment of the present invention. The method is applied to a server and includes:
Step 301: Obtain the marginal probability that the terminal corresponding to a target dimension installs the target software, where the target software is the software whose software share ratio is to be counted, and the target dimension is a dimension, among the at least two dimensions affecting the software share ratio of the target software, whose marginal probability is accurate.
In an actual implementation, the target dimension may be preset by a technician; that is, the technician may determine the target dimension from the at least two dimensions according to how accurate the actually stored marginal probabilities are.
In a possible implementation, referring to fig. 3B, the server may store the marginal probability that the terminal corresponding to each dimension installs the target software. In this case, the server may obtain the marginal probability for the target dimension from among the stored marginal probabilities for each dimension.
For example, if the dimension is the terminal model, the dimension information corresponding to the dimension may include Xiaomi, Huawei, Samsung, and the like. Therefore, in a practical implementation, the marginal probability that needs to be obtained for the target dimension includes the marginal probabilities of the plurality of pieces of dimension information corresponding to the target dimension.
For example, assuming that the target dimension is the terminal model and the dimension information corresponding to the terminal model includes model A and model B, the server obtains the marginal probability P(A) that a terminal of model A installs the target software and the marginal probability P(B) that a terminal of model B installs the target software.
The above description only takes as an example the case in which the server stores the marginal probability that the terminal corresponding to each dimension installs the target software. In another embodiment, the server may also obtain the marginal probability for the target dimension from servers provided by third-party partners, which is not limited in the embodiment of the present invention.
Step 302: Count a fourth conditional probability that the terminals corresponding to the other dimensions of the at least two dimensions install the target software, under the condition that the terminal corresponding to the target dimension has already installed the target software.
In a practical implementation, the at least two dimensions are not independent of one another. The embodiment of the present invention therefore takes the relationship between the dimensions into account; that is, it counts the fourth conditional probability that the terminals corresponding to the other dimensions of the at least two dimensions install the target software under the condition that the terminal corresponding to the target dimension has already installed the target software.
In a specific implementation, when only two dimensions affect the software share ratio, for example the terminal model and the user gender, the server counts the fourth conditional probability that the terminal corresponding to the user gender dimension installs the target software under the condition that the terminal corresponding to the terminal model dimension has already installed the target software.
When more than two dimensions affect the software share ratio, for example the terminal model, the user gender and the user age, the server counts the fourth conditional probabilities that the terminals corresponding to the user gender and user age dimensions install the target software under the condition that the terminal corresponding to the terminal model dimension has already installed the target software. In a specific implementation, the fourth conditional probabilities include P(user gender | terminal model) and P(user age | user gender, terminal model).
For ease of understanding, a specific example is given below. Assuming that the dimension information corresponding to the user age dimension includes older than 30 and 30 or younger, in an actual implementation the fourth conditional probabilities that the server needs to count include P(male | A), P(female | A), P(>30 | male, A), P(≤30 | male, A), P(>30 | female, A), P(≤30 | female, A), P(male | B), P(female | B), P(>30 | male, B), P(≤30 | male, B), P(>30 | female, B), and P(≤30 | female, B).
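Counting such conditional probabilities from reported samples can be sketched as follows. The record layout (dictionaries with `model`, `gender` and `age` keys, one per terminal that has installed the target software) and the helper name are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Tuple


def conditional_probabilities(
    records: List[dict], target: str, given: Tuple[str, ...]
) -> Dict[tuple, float]:
    """Estimate P(target | given) by counting.

    Keys of the result are (target_value, given_value_1, given_value_2, ...).
    """
    joint_counts = Counter(
        (r[target],) + tuple(r[g] for g in given) for r in records
    )
    given_counts = Counter(tuple(r[g] for g in given) for r in records)
    # Divide each joint count by the count of its conditioning values.
    return {key: n / given_counts[key[1:]] for key, n in joint_counts.items()}


# Made-up records reported by terminals that have installed the target software.
records = [
    {"model": "A", "gender": "male", "age": ">30"},
    {"model": "A", "gender": "male", "age": "<=30"},
    {"model": "A", "gender": "female", "age": ">30"},
    {"model": "B", "gender": "female", "age": "<=30"},
]
p_gender_given_model = conditional_probabilities(records, "gender", ("model",))
p_age_given_gender_model = conditional_probabilities(records, "age", ("gender", "model"))
```

For instance, two of the three model-A records are male, so the estimate of P(male | A) is 2/3, and one of those two is older than 30, so the estimate of P(>30 | male, A) is 1/2.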
It should be noted that, in a specific implementation, the server may perform statistics on each conditional probability included in the fourth conditional probability based on all sample data reported by each data source.
It should be further noted that the above description only takes as an example determining the fourth conditional probability by statistics. In another embodiment, the fourth conditional probability that the terminals corresponding to the other dimensions of the at least two dimensions install the target software, under the condition that the terminal corresponding to the target dimension has already installed the target software, can also be obtained directly through another channel. For example, it can be obtained from published files by a technician and then stored in the server, which is not limited by the embodiment of the present invention.
Step 303: determining a joint probability between the at least two dimensions based on the edge probability and the fourth condition probability.
In a particular implementation, the server may determine a joint probability between the at least two dimensions using Gibbs sampling principles P (H, M) ═ P (H | M) × P (M) based on the edge probability and the fourth condition probability.
Continuing with the above example, that is, the edge probabilities of the target dimension corresponding to the terminal installing the target software include P (a) and P (B), and the fourth conditional probabilities include P (male | a), P (female | a), P (>30| male, a), P (≦ 30| male, a), P (>30| female, a), P (≦ 30| female, a), P (male | B), P (female | B), P (>30| male, B), P (≦ 30| male, B), P (>30| female, B), and P (≦ 30| female, B), then the server employs the above bbgis sampling principle, and can determine the joint probability between the dimensions including:
P(A, male, >30) = P(>30 | male, A) × P(male | A) × P(A);
P(A, female, >30) = P(>30 | female, A) × P(female | A) × P(A);
P(A, male, ≤30) = P(≤30 | male, A) × P(male | A) × P(A);
P(A, female, ≤30) = P(≤30 | female, A) × P(female | A) × P(A);
P(B, male, >30) = P(>30 | male, B) × P(male | B) × P(B);
P(B, female, >30) = P(>30 | female, B) × P(female | B) × P(B);
P(B, male, ≤30) = P(≤30 | male, B) × P(male | B) × P(B);
P(B, female, ≤30) = P(≤30 | female, B) × P(female | B) × P(B).
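The eight products of step 303 can be generated mechanically from the three probability tables. The sketch below uses made-up probabilities in which the model marginals sum to 1; the table layout and the function name are assumptions for illustration.

```python
# Made-up inputs: P(model), P(gender | model) and P(age | gender, model).
p_model = {"A": 0.6, "B": 0.4}
p_gender_given_model = {
    ("male", "A"): 0.5, ("female", "A"): 0.5,
    ("male", "B"): 0.25, ("female", "B"): 0.75,
}
p_age_given_gender_model = {
    (">30", "male", "A"): 0.5, ("<=30", "male", "A"): 0.5,
    (">30", "female", "A"): 0.2, ("<=30", "female", "A"): 0.8,
    (">30", "male", "B"): 0.4, ("<=30", "male", "B"): 0.6,
    (">30", "female", "B"): 0.1, ("<=30", "female", "B"): 0.9,
}


def joint_over_dimensions(p_model, p_gender, p_age):
    """Return {(model, gender, age): P(model, gender, age)} via the product rule."""
    joint = {}
    for (age, gender, model), p_age_value in p_age.items():
        joint[(model, gender, age)] = (
            p_age_value                # P(age | gender, model)
            * p_gender[(gender, model)]  # P(gender | model)
            * p_model[model]           # P(model)
        )
    return joint


joint_dims = joint_over_dimensions(
    p_model, p_gender_given_model, p_age_given_gender_model
)
```

Because each conditional table is normalised and the model marginals sum to 1 here, the eight joint probabilities also sum to 1; for example, P(A, male, >30) = 0.5 × 0.5 × 0.6 = 0.15.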
Because the marginal probability that the terminal installs the target software in the target dimension is accurate, and the conditional relationships between the dimensions are taken into account when the joint probability between the at least two dimensions is determined, the resulting joint probability is more accurate, which in turn ensures the accuracy of the subsequently determined software share ratio.
Step 304: Obtain the software share ratio of the target software based on the joint probability, the stored plurality of sample data and the dimension information of the plurality of sample data.
In a specific implementation, the server extracts a specified number of sample data from the stored multiple sample data based on the joint probability and the dimension information of the multiple sample data, where the specified number may be set by a user in a customized manner according to actual needs, or may be set by the server in a default manner, which is not limited in the embodiment of the present invention.
For example, assuming that the specified number is m, the server extracts the following from the plurality of sample data according to the dimension information of the plurality of sample data: P(A, male, >30) × m pieces of sample data reported by terminals whose dimension information is A, male and >30; P(A, female, >30) × m pieces of sample data reported by terminals whose dimension information is A, female and >30; P(A, male, ≤30) × m pieces of sample data reported by terminals whose dimension information is A, male and ≤30; P(A, female, ≤30) × m pieces of sample data reported by terminals whose dimension information is A, female and ≤30; P(B, male, >30) × m pieces of sample data reported by terminals whose dimension information is B, male and >30; P(B, female, >30) × m pieces of sample data reported by terminals whose dimension information is B, female and >30; P(B, male, ≤30) × m pieces of sample data reported by terminals whose dimension information is B, male and ≤30; and P(B, female, ≤30) × m pieces of sample data reported by terminals whose dimension information is B, female and ≤30.
Then, the server counts the share ratio of the target software in the extracted m sample data, i.e. obtains the software share ratio of the target software.
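In practice a product such as P(A, male, >30) × m is rarely a whole number, so some rounding rule is needed to decide how many samples to extract per combination. The source does not specify one; the sketch below uses the largest-remainder method, which is purely an assumption, so that the per-combination quotas add up to exactly m. The function name and example values are hypothetical.

```python
import math
from typing import Dict, Hashable


def allocate_quotas(joint: Dict[Hashable, float], m: int) -> Dict[Hashable, int]:
    """Turn joint probabilities (assumed to sum to 1) into integer quotas summing to m."""
    if abs(sum(joint.values()) - 1.0) > 1e-9:
        raise ValueError("joint probabilities are expected to sum to 1")
    raw = {cell: p * m for cell, p in joint.items()}
    quotas = {cell: math.floor(value) for cell, value in raw.items()}
    leftover = m - sum(quotas.values())
    # Hand the remaining slots to the cells with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda cell: raw[cell] - quotas[cell], reverse=True)
    for cell in by_remainder[:leftover]:
        quotas[cell] += 1
    return quotas


# Made-up joint probabilities over three combinations, with m = 7.
quotas = allocate_quotas({"c1": 0.5, "c2": 0.3, "c3": 0.2}, m=7)
```

Here the raw products are 3.5, 2.1 and 1.4; flooring gives 3, 2 and 1, and the single leftover slot goes to the first combination, which has the largest fractional part.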
Referring to fig. 3C, fig. 3C is a schematic diagram illustrating the effect on the software share ratio determined with the terminal model taken into account. As can be seen from fig. 3C, where the terminal model causes the software share ratio to be too high (shown as 11 in fig. 3C), the counted software share ratio is reduced (shown as 12 in fig. 3C); that is, the software share ratio is corrected.
Further, referring to fig. 3B, before counting the software share proportion of the target software based on the joint probability, the stored multiple sample data, and the dimensional information of the multiple sample data, the server needs to determine the multiple sample data and the dimensional information of the multiple sample data.
Specifically, sample data reported by a plurality of data sources and the dimension information of the sample data are received, and the plurality of sample data and the dimension information of the plurality of sample data are determined from all the received sample data according to the weights of the plurality of data sources.
For example, if the weight of a certain data source is greater, more sample data reported by the data source may be extracted from all the received sample data. On the contrary, if the weight of a certain data source is smaller, a smaller part of sample data reported by the data source can be extracted from all the received sample data. Thus, the plurality of sample data can be obtained.
The dimension information of each of the plurality of sample data can be obtained by the data source from social software such as QQ and WeChat and then reported to the server.
It should be noted that the weights of the multiple data sources may be set by a technician according to actual situations, for example, if a data source reports a data sample through a software manager, and the software manager has a large share in the market, the weight of the data source may be set to be larger. The embodiment of the present invention does not limit the specific setting rule of the weight of each data source.
In addition, referring to fig. 3B, in an actual implementation, after the software share ratio is determined, it may be cross-analyzed and verified through the big data service platform; the specific verification process is not limited in this embodiment of the present invention. Moreover, the verified software share ratio can be put to practical use, for example in scenarios such as algorithm fixing, historical data tracing, thematic analysis and testing.
In the embodiment of the present invention, the marginal probability that the terminal corresponding to the target dimension installs the target software, whose software share ratio is to be counted, is obtained, and the fourth conditional probability that the terminals corresponding to the other dimensions of the at least two dimensions install the target software, under the condition that the terminal corresponding to the target dimension has already installed the target software, is counted. Then, the joint probability between the at least two dimensions is determined based on the marginal probability and the fourth conditional probability. Because the target dimension is a dimension, among the at least two dimensions affecting the software share ratio of the target software, whose marginal probability is accurate, and because the relationship between the target dimension and the other dimensions is taken into account, the resulting joint probability is accurate. Therefore, when the software share ratio of the target software is counted based on the joint probability, the stored plurality of sample data and the dimension information of the plurality of sample data, the accuracy of the counted software share ratio can be ensured. Moreover, no manual analysis and correction by data statisticians with extensive industry experience is required, so the efficiency of determining the software share ratio is improved.
Referring to fig. 4A, fig. 4A is a schematic structural diagram of another apparatus for determining a share ratio of software according to an embodiment of the present invention, where the apparatus includes: a first determination module 401, a second determination module 402, a first statistics module 403, a third determination module 404, and a fourth determination module 405;
a first determining module 401, configured to perform step 201 in the foregoing fig. 2A embodiment;
a second determining module 402, configured to perform step 202 in the embodiment of fig. 2A;
a first statistical module 403, configured to perform step 203 in the embodiment of fig. 2A;
a third determining module 404, configured to perform step 204 in the embodiment of fig. 2A;
a fourth determining module 405, configured to perform step 205 in the embodiment of fig. 2A.
Optionally, referring to fig. 4B, the apparatus further includes:
a fifth determining module 406, configured to perform step 2011 in the embodiment of fig. 2A;
a sixth determining module 407, configured to perform step 2012 in the embodiment of fig. 2A;
a second statistical module 408, configured to perform step 2013 in the embodiment of fig. 2A;
a seventh determining module 409, configured to perform step 2014 in the embodiment of fig. 2A.
In the embodiment of the present invention, the specified software that suppresses or increases the installation probability of the target software is determined from a plurality of specified software, and the joint probability between the target software and the determined specified software is determined. Then, under the condition that the determined specified software and the target software are installed, first conditional probabilities that the terminals corresponding to at least two dimensions install the target software are counted; the joint probability between the determined specified software, the target software and the at least two dimensions is determined based on the joint probability between the target software and the determined specified software and on the first conditional probabilities; and the software share ratio is determined based on that joint probability, the plurality of sample data, and the dimension information of the plurality of sample data. Because this process takes into account the conditional relationships between the specified software and the at least two dimensions, as well as their joint-distribution influence on the target software, the accuracy of the determined software share ratio is ensured. In addition, determining the software share ratio does not require manual participation by data statisticians with extensive industry experience, so the determination efficiency is improved.
Referring to fig. 5A, fig. 5A is a schematic structural diagram of an apparatus for determining a software share ratio according to an embodiment of the present invention, where the apparatus includes an obtaining module 501, a statistics module 502, a first determining module 503, and a second determining module 504.
An obtaining module 501, configured to execute step 301 in the embodiment of fig. 3A;
a statistics module 502, configured to perform step 302 in the embodiment of fig. 3A;
a first determining module 503, configured to perform step 303 in the embodiment of fig. 3A;
a second determining module 504, configured to perform step 304 in the embodiment of fig. 3A.
Optionally, referring to fig. 5B, the apparatus further comprises:
a receiving module 505, configured to receive sample data reported by multiple data sources and dimension information of the sample data;
a third determining module 506, configured to determine the multiple sample data and the dimension information of the multiple sample data from all received sample data according to the weights of the multiple data sources.
In the embodiment of the present invention, the marginal probability that the terminal corresponding to the target dimension installs the target software, whose software share ratio is to be counted, is obtained, and the fourth conditional probability that the terminals corresponding to the other dimensions of the at least two dimensions install the target software, under the condition that the terminal corresponding to the target dimension has already installed the target software, is counted. Then, the joint probability between the at least two dimensions is determined based on the marginal probability and the fourth conditional probability. Because the target dimension is a dimension, among the at least two dimensions affecting the software share ratio of the target software, whose marginal probability is accurate, and because the relationship between the target dimension and the other dimensions is taken into account, the resulting joint probability is accurate. Therefore, when the software share ratio of the target software is counted based on the joint probability, the stored plurality of sample data and the dimension information of the plurality of sample data, the accuracy of the counted software share ratio can be ensured. Moreover, no manual analysis and correction by data statisticians with extensive industry experience is required, so the efficiency of determining the software share ratio is improved.
It should be noted that, when the apparatus for determining a software share ratio provided in the above embodiments determines the software share ratio, the division into the above functional modules is only an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments for determining a software share ratio provided above belong to the same concept; the specific implementation processes are detailed in the method embodiments and are not described herein again.
Fig. 6 is a schematic structural diagram of a server serving as the apparatus for determining a software share ratio according to an embodiment of the present invention. The server may be a server in a background server cluster. Specifically:
the server 600 includes a Central Processing Unit (CPU)601, a system memory 604 including a Random Access Memory (RAM)602 and a Read Only Memory (ROM)603, and a system bus 605 connecting the system memory 604 and the central processing unit 601. The server 600 also includes a basic input/output system (I/O system) 606, which facilitates the transfer of information between devices within the computer, and a mass storage device 607, which stores an operating system 613, application programs 614, and other program modules 615.
The basic input/output system 606 includes a display 608 for displaying information and an input device 609, such as a mouse or keyboard, for a user to input information. The display 608 and the input device 609 are both connected to the central processing unit 601 through an input/output controller 610 connected to the system bus 605. The basic input/output system 606 may also include the input/output controller 610 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 610 may also provide output to a display screen, a printer, or another type of output device.
The mass storage device 607 is connected to the central processing unit 601 through a mass storage controller (not shown) connected to the system bus 605. The mass storage device 607 and its associated computer-readable media provide non-volatile storage for the server 600. That is, mass storage device 607 may include a computer-readable medium (not shown), such as a hard disk or CD-ROM drive.
Without loss of generality, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media is not limited to the foregoing. The system memory 604 and mass storage device 607 described above may be collectively referred to as memory.
According to various embodiments of the invention, the server 600 may also operate by connecting, through a network such as the Internet, to a remote computer on the network. That is, the server 600 may be connected to the network 612 through the network interface unit 611 connected to the system bus 605, or may use the network interface unit 611 to connect to other types of networks or remote computer systems (not shown).
The memory further stores one or more programs configured to be executed by the CPU. The one or more programs include instructions for performing the method for determining a software share ratio provided by the embodiments of the present invention.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 604 comprising instructions that are executable by the processor 620 of the apparatus 600 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Also provided is a non-transitory computer-readable storage medium whose instructions, when executed by a processor of a server, enable the server to perform the method for determining a software share ratio described above with reference to Fig. 2A or Fig. 3A.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (16)

1. A method for determining share ratios of software, the method comprising:
determining, from a plurality of specified software, specified software that has a suppression or promotion effect on the installation probability of target software, wherein each specified software is software that has a suppression or promotion effect on the installation probability of a preset number of other software, and the target software is software whose software share ratio is to be counted;
determining a joint probability between the target software and the determined specified software;
counting first conditional probabilities that terminals corresponding to at least two dimensions install the target software under the condition that the determined specified software and the target software are installed, wherein each of the at least two dimensions influences the software share ratio of the target software;
determining a joint probability between the determined specified software, the target software and the at least two dimensions based on the joint probability between the target software and the determined specified software and the first conditional probability;
extracting a specified number of sample data from a stored plurality of sample data based on the joint probability between the determined specified software, the target software and the at least two dimensions and on dimension information of the plurality of sample data, and determining the share ratio of the target software in the extracted specified number of sample data as the software share ratio of the target software.
2. The method of claim 1, wherein the determining a joint probability between the target software and the determined specified software comprises:
acquiring the marginal probability of the determined specified software, and counting a second conditional probability of installing the target software under the condition that the determined specified software is installed;
determining the joint probability between the target software and the determined specified software based on the marginal probability of the determined specified software and the second conditional probability.
3. The method of claim 1, wherein before the determining, from the plurality of specified software, of the specified software that has a suppression or promotion effect on the installation probability of the target software, the method further comprises:
determining the proportion of first software in the plurality of sample data, wherein the first software is any one of a plurality of software corresponding to the plurality of sample data;
if the proportion is greater than or equal to a preset proportion, determining a promotion degree between the first software and each of a plurality of second software, wherein the plurality of second software is the software other than the first software among the plurality of software, and the promotion degree indicates whether the first software has a suppression or promotion effect on the installation probability of other software;
counting, among the plurality of second software, the number of software whose promotion degree with the first software is less than 1 or greater than 1;
and when the counted number of software reaches the preset number, determining the first software as specified software.
4. The method of claim 2, wherein before the determining, from the plurality of specified software, of the specified software that has a suppression or promotion effect on the installation probability of the target software, the method further comprises:
determining the proportion of first software in the plurality of sample data, wherein the first software is any one of a plurality of software corresponding to the plurality of sample data;
if the proportion is greater than or equal to a preset proportion, determining a promotion degree between the first software and each of a plurality of second software, wherein the plurality of second software is the software other than the first software among the plurality of software, and the promotion degree indicates whether the first software has a suppression or promotion effect on the installation probability of other software;
counting, among the plurality of second software, the number of software whose promotion degree with the first software is less than 1 or greater than 1;
and when the counted number of software reaches the preset number, determining the first software as specified software.
5. The method of claim 3, wherein the determining a degree of promotion between the first software and each of a plurality of second software comprises:
for each second software, acquiring the marginal probability of the second software, and determining a third conditional probability of installing the second software under the condition that the first software is installed;
and dividing the third conditional probability by the marginal probability of the second software to obtain the promotion degree between the first software and the second software.
6. The method of claim 4, wherein the determining a degree of promotion between the first software and each of a plurality of second software comprises:
for each second software, acquiring the marginal probability of the second software, and determining a third conditional probability of installing the second software under the condition that the first software is installed;
and dividing the third conditional probability by the marginal probability of the second software to obtain the promotion degree between the first software and the second software.
7. The method of any of claims 1 to 6, wherein before the software share ratio of the target software is obtained based on the joint probability between the determined specified software, the target software and the at least two dimensions, the stored plurality of sample data, and the dimension information of the plurality of sample data, the method further comprises:
receiving sample data reported by a plurality of data sources and dimension information of the sample data;
and determining the plurality of sample data and the dimension information of the plurality of sample data from all the received sample data according to the weights of the plurality of data sources.
8. A method for determining share ratios of software, the method comprising:
acquiring the marginal probability that terminals corresponding to a target dimension install target software, wherein the target software is software whose software share ratio is to be counted, and the target dimension is a dimension whose marginal probability is accurate among at least two dimensions influencing the software share ratio of the target software;
counting a fourth conditional probability that terminals corresponding to the other dimensions of the at least two dimensions install the target software under the condition that the terminals corresponding to the target dimension have installed the target software;
determining a joint probability between the at least two dimensions based on the marginal probability and the fourth conditional probability;
and extracting a specified number of sample data from the stored multiple sample data based on the joint probability and the dimension information of the multiple sample data, and counting the share ratio of the target software in the extracted specified number of sample data to be used as the software share ratio of the target software.
9. The method of claim 8, wherein before the software share ratio of the target software is obtained based on the joint probability, the stored plurality of sample data, and the dimension information of the plurality of sample data, the method further comprises:
receiving sample data reported by a plurality of data sources and dimension information of the sample data;
and determining the plurality of sample data and the dimension information of the plurality of sample data from all the received sample data according to the weights of the plurality of data sources.
10. An apparatus for determining a share ratio of software, the apparatus comprising:
a first determining module, configured to determine, from a plurality of specified software, specified software that has a suppression or promotion effect on the installation probability of target software, wherein each specified software is software that has a suppression or promotion effect on the installation probability of a preset number of other software, and the target software is software whose software share ratio is to be counted;
a second determining module, configured to determine a joint probability between the target software and the determined specified software;
a first statistical module, configured to count first conditional probabilities that terminals corresponding to at least two dimensions install the target software under the condition that the determined specified software and the target software are installed, wherein each of the at least two dimensions influences the software share ratio of the target software;
a third determining module, configured to determine a joint probability between the determined specified software, the target software and the at least two dimensions based on the joint probability between the target software and the determined specified software and the first conditional probability;
and a fourth determining module, configured to extract a specified number of sample data from a stored plurality of sample data based on the joint probability between the determined specified software, the target software and the at least two dimensions and on dimension information of the plurality of sample data, and to determine the share ratio of the target software in the extracted specified number of sample data as the software share ratio of the target software.
11. The apparatus of claim 10, wherein the second determination module is to:
acquiring the marginal probability of the determined specified software, and counting a second conditional probability of installing the target software under the condition that the determined specified software is installed;
determining the joint probability between the target software and the determined specified software based on the marginal probability of the determined specified software and the second conditional probability.
12. The apparatus of claim 10 or 11, wherein the apparatus further comprises:
a fifth determining module, configured to determine a proportion of first software in the multiple sample data, where the first software is any one of multiple pieces of software corresponding to the multiple sample data;
a sixth determining module, configured to determine, if the proportion is greater than or equal to a preset proportion, a promotion degree between the first software and each of a plurality of second software, wherein the plurality of second software is the software other than the first software among the plurality of software, and the promotion degree indicates whether the first software has a suppression or promotion effect on the installation probability of other software;
a second statistical module, configured to count, among the plurality of second software, the number of software whose promotion degree with the first software is less than 1 or greater than 1;
and a seventh determining module, configured to determine the first software as specified software when the counted number of software reaches the preset number.
13. An apparatus for determining a share ratio of software, the apparatus comprising:
an acquisition module, configured to acquire the marginal probability that terminals corresponding to a target dimension install target software, wherein the target software is software whose software share ratio is to be counted, and the target dimension is a dimension whose marginal probability is accurate among at least two dimensions influencing the software share ratio of the target software;
a statistical module, configured to count a third conditional probability that terminals corresponding to the other dimensions of the at least two dimensions install the target software under the condition that the terminals corresponding to the target dimension have installed the target software;
a first determining module, configured to determine a joint probability between the at least two dimensions based on the marginal probability and the third conditional probability;
and the second determining module is used for extracting a specified number of sample data from the stored multiple sample data based on the joint probability and the dimension information of the multiple sample data, and counting the share ratio of the target software in the extracted specified number of sample data as the software share ratio of the target software.
14. The apparatus of claim 13, wherein the apparatus further comprises:
the receiving module is used for receiving the sample data reported by a plurality of data sources and the dimension information of the sample data;
and a third determining module, configured to determine the multiple sample data and the dimension information of the multiple sample data from all received sample data according to the weights of the multiple data sources.
15. A terminal, characterized in that it comprises a processor and a memory in which at least one instruction, at least one program, set of codes or set of instructions is stored, which is loaded and executed by the processor to implement the method of determining a share ratio of software according to any of claims 1 to 9.
16. A computer-readable storage medium, having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of determining a software share ratio as claimed in any one of claims 1 to 9.
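The promotion degree defined in claims 5 and 6, a conditional probability divided by a marginal probability, has the same form as the quantity usually called lift in association-rule mining. The Python sketch below illustrates how claims 3 to 6 and the joint probability of claim 2 could be computed from per-terminal install sets. The data, the function names and the thresholds `min_share` and `min_count` are hypothetical, and the sketch assumes that every software named appears on at least one terminal.

```python
import math

def lift(installs, first, second):
    """Promotion degree of `second` given `first`: P(second | first) / P(second)."""
    with_first = [apps for apps in installs if first in apps]
    p_second = sum(second in apps for apps in installs) / len(installs)
    p_second_given_first = sum(second in apps for apps in with_first) / len(with_first)
    return p_second_given_first / p_second

def is_specified(installs, first, others, min_share, min_count):
    """True if `first` is common enough and shifts at least `min_count` other software."""
    share = sum(first in apps for apps in installs) / len(installs)
    if share < min_share:
        return False
    # a promotion degree different from 1 means suppression (< 1) or promotion (> 1)
    affected = sum(
        1 for other in others
        if not math.isclose(lift(installs, first, other), 1.0)
    )
    return affected >= min_count

# Hypothetical install sets, one per terminal.
installs = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}, {"c"}, {"b", "c"}]

lift_ab = lift(installs, "a", "b")  # above 1: "a" promotes "b"
lift_ac = lift(installs, "a", "c")  # below 1: "a" suppresses "c"
specified = is_specified(installs, "a", ["b", "c"], min_share=0.3, min_count=2)

# Joint probability in the style of claim 2: P(a, b) = P(a) * P(b | a).
count_a = sum("a" in apps for apps in installs)
p_a = count_a / len(installs)
p_b_given_a = sum("b" in apps for apps in installs if "a" in apps) / count_a
joint_ab = p_a * p_b_given_a
```

In this toy data, `joint_ab` equals the fraction of terminals that have both "a" and "b" installed, which is what the chain rule predicts.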
CN201710521670.1A 2017-06-30 2017-06-30 Method and device for determining share ratio of software and computer readable storage medium Active CN109213513B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710521670.1A CN109213513B (en) 2017-06-30 2017-06-30 Method and device for determining share ratio of software and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109213513A CN109213513A (en) 2019-01-15
CN109213513B true CN109213513B (en) 2021-07-27

Family

ID=64960931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710521670.1A Active CN109213513B (en) 2017-06-30 2017-06-30 Method and device for determining share ratio of software and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109213513B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112016792A (en) * 2020-07-15 2020-12-01 北京淇瑀信息科技有限公司 User resource quota determining method and device and electronic equipment
CN111913987B (en) * 2020-08-10 2023-08-04 东北大学 Distributed query system and method based on dimension group-space-time-probability filtering

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130059738A (en) * 2011-11-29 2013-06-07 에스케이플래닛 주식회사 System and method for recommending application using contents analysis
CN105868248A (en) * 2015-12-15 2016-08-17 乐视网信息技术(北京)股份有限公司 Media recommendation method and device
CN106682056A (en) * 2016-07-15 2017-05-17 腾讯科技(深圳)有限公司 Method, device and system for determining correlation among different application software
CN106709298A (en) * 2017-01-04 2017-05-24 广东欧珀移动通信有限公司 Information processing method and device and intelligent terminal
CN106775850A (en) * 2016-12-02 2017-05-31 海马云(天津)信息技术有限公司 The installation calculating of instance system application and installation method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
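Implementation and Evaluation of a Trust-Behavior-Based Reputation and Recommendation System for Mobile Terminal Software (基于信任行为的移动终端软件信誉和推荐系统的实现与评测); 党田力 (Dang Tianli); China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》); 20160315; full text *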

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant