CN108628883B - Data processing method and device and electronic equipment - Google Patents

Info

Publication number
CN108628883B
Authority
CN
China
Prior art keywords
target
elements
intersection
combination
sets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710165468.XA
Other languages
Chinese (zh)
Other versions
CN108628883A (en
Inventor
吴家旭
史军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201710165468.XA
Publication of CN108628883A
Application granted
Publication of CN108628883B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a data processing method, a data processing device and electronic equipment. The method comprises: obtaining all target sets to which each element in the element set to be processed belongs; obtaining, according to all the target sets corresponding to all the elements in the element set to be processed, all combination intersections that can be formed by combining those target sets, and the number of elements contained in each combination intersection; obtaining a relationship chain representing the subordinate relationships among the target sets; and calculating, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any target set by using the inclusion-exclusion principle. In this technical solution, the element counts of the combination intersections are not affected by the relationship chain, so when the relationship chain changes, the set to which each element belongs and the ancestor sets of that set need not be counted again. This solves the technical problem of low element-count updating efficiency in the prior art and improves that efficiency.

Description

Data processing method and device and electronic equipment
Technical Field
The present invention relates to the field of software technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
Nowadays, many places such as websites need to count tags. A website defines tags, and each tag may contain several sub-tags, layer by layer, forming a tag relationship chain; for example, the tag "physical" may contain sub-tags. Each tag and sub-tag may correspond to multiple elements, and each element (a product, a question, an entry, and the like) may belong to several tags. For example, a television may belong to tags such as "life", "electric appliances" and "large commodities".
When a new element is added, the counts of the tags to which the element belongs, and of all ancestor tags of each such tag, are affected. The traditional solution to this tag-counting requirement is: whenever a new element is added, add 1 to the count of each tag it belongs to and of all ancestor tags of those tags. For example, if the sub-tags of tag A are B, C and D, and the sub-tag of tag C is D, then when an element is added to D, the element counts of tags D, C and A each need to be increased by 1.
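The traditional update described above can be sketched as follows. This is only an illustration under assumptions: the `PARENTS` map, the function names and the dictionary-based storage are hypothetical and are not taken from the patent; the tag layout follows the A, B, C, D example in the preceding paragraph.

```python
from collections import defaultdict

# Hypothetical parent map for the example: B, C and D are sub-tags of A,
# and D is also a sub-tag of C.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["A", "C"]}


def ancestors_and_self(tag, parents):
    """Return the tag plus every tag reachable by following parent links."""
    seen = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def add_element(tag_counts, element_tags, parents):
    """Traditional update: add 1 to each tag of the element and to all ancestors."""
    affected = set()
    for tag in element_tags:
        affected |= ancestors_and_self(tag, parents)
    for tag in affected:
        tag_counts[tag] += 1


counts = defaultdict(int)
add_element(counts, ["D"], PARENTS)  # one new element tagged D
print(sorted(counts.items()))  # [('A', 1), ('C', 1), ('D', 1)]
```

Because the increments are baked into `counts` using the parent map in force at insertion time, a later change to `PARENTS` leaves the stored numbers out of step with the new chain.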
The problem with this approach is that if the tag relationship chain changes, the previously accumulated statistics are no longer correct. The tags affected by each new element are determined by the relationship chain in force at the time; once the chain changes, the difference between the old chain that past counts relied on and the new chain makes those past results unreliable. To correct them, all elements must be counted again against the new relationship chain, which greatly reduces the efficiency of updating element counts.
Disclosure of Invention
The embodiment of the invention provides a data processing method, a data processing device and electronic equipment, which are used for solving the technical problem of low element number updating efficiency in the prior art and improving the element number updating efficiency.
The embodiment of the invention provides a data processing method, which comprises the following steps:
obtaining all target sets to which each element in the element set to be processed belongs;
acquiring all combination intersections which can be formed by combining all the target sets of all the elements and the number of the elements contained in each combination intersection according to all the target sets corresponding to all the elements in the element set to be processed;
obtaining a relationship chain for characterizing the affiliation between each of the target sets;
and calculating, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any target set by using the inclusion-exclusion principle.
Optionally, the calculating, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any one of the target sets by using the inclusion-exclusion principle includes:
obtaining the target combination intersections corresponding to the target set according to the relationship chain;
and calculating, according to the target combination intersections and the number of elements contained in each target combination intersection, the total number of elements contained in the target set by using the inclusion-exclusion principle.
Optionally, the obtaining, according to the relation chain, a target combination intersection corresponding to any one of the target sets includes:
obtaining all target subsets subordinate to the target set according to the relation chain;
and obtaining a combined intersection formed by combining all target subsets belonging to the target set as a target combined intersection corresponding to the target set.
Optionally, the calculating, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any one of the target sets by using the inclusion-exclusion principle includes calculating the total number of elements according to the following formula:
$$\left|\langle T\rangle\right| = \sum_{k=1}^{m} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} \left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$$
where $\left|\langle T\rangle\right|$ represents the total number of elements contained in the target set, $m$ is the number of target subsets subordinate to the target set, $T_i$ represents a target subset subordinate to the target set, $T_{i_1} \cap \cdots \cap T_{i_k}$ represents a target combination intersection corresponding to the target set, $\left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$ represents the number of elements contained in that target combination intersection, and $1 \le i_1 < \cdots < i_k \le m$.
Optionally, the obtaining, according to all target sets corresponding to all elements in the to-be-processed element set, all combination intersections that can be formed by combining all target sets of all elements includes:
and exhaustively enumerating the combination intersection formed by combining at least one target set in all the target sets of all the elements to obtain all the combination intersections which can be formed by combining all the target sets of all the elements.
Optionally, the method further includes:
when one element is newly added in the element set to be processed, acquiring all target sets to which the newly added element belongs;
and acquiring a combination intersection formed by all the target sets to which the newly added elements belong, and adding one to the number of elements contained in each combination intersection corresponding to the newly added elements.
An embodiment of the present application further provides a data processing apparatus, where the apparatus includes:
the first acquisition unit is used for acquiring all target sets to which each element in the element set to be processed belongs;
a second obtaining unit, configured to obtain, according to all target sets corresponding to all elements in the element set to be processed, all combination intersections that can be formed by combining all target sets of all elements, and the number of elements included in each combination intersection;
a third obtaining unit, configured to obtain a relationship chain used for representing a dependency relationship between the target sets;
and a calculating unit, configured to calculate, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any target set by using the inclusion-exclusion principle.
Optionally, the calculating unit includes:
an obtaining subunit, configured to obtain, according to the relationship chain, the target combination intersections corresponding to any one of the target sets;
and a calculating subunit, configured to calculate, according to the target combination intersections and the number of elements contained in each target combination intersection, the total number of elements contained in the target set by using the inclusion-exclusion principle.
Optionally, the obtaining subunit is configured to:
obtaining all target subsets subordinate to the target set according to the relation chain;
and obtaining a combined intersection formed by combining all target subsets belonging to the target set as a target combined intersection corresponding to the target set.
Optionally, the calculating unit is configured to calculate the total number of elements according to the following formula:
$$\left|\langle T\rangle\right| = \sum_{k=1}^{m} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} \left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$$
where $\left|\langle T\rangle\right|$ represents the total number of elements contained in the target set, $m$ is the number of target subsets subordinate to the target set, $T_i$ represents a target subset subordinate to the target set, $T_{i_1} \cap \cdots \cap T_{i_k}$ represents a target combination intersection corresponding to the target set, $\left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$ represents the number of elements contained in that target combination intersection, and $1 \le i_1 < \cdots < i_k \le m$.
Optionally, the second obtaining unit is configured to:
and exhaustively enumerating the combination intersection formed by combining at least one target set in all the target sets of all the elements to obtain all the combination intersections which can be formed by combining all the target sets of all the elements.
Optionally, the apparatus further comprises:
and the counting unit is used for acquiring all target sets to which the newly added elements belong when one element is newly added in the element set to be processed, acquiring a combined intersection which can be formed by combining all the target sets to which the newly added elements belong, and adding one to the number of elements contained in each combined intersection corresponding to the newly added elements.
Embodiments of the present application also provide an electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for:
obtaining all target sets to which each element in the element set to be processed belongs;
acquiring all combination intersections which can be formed by combining all the target sets of all the elements and the number of the elements contained in each combination intersection according to all the target sets corresponding to all the elements in the element set to be processed;
obtaining a relationship chain for characterizing the affiliation between each of the target sets;
and calculating, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any target set by using the inclusion-exclusion principle.
One or more technical solutions in the embodiments of the present application have at least the following technical effects:
For a target set that needs counting, all target sets to which each element in it (that is, in the element set to be processed) belongs are obtained, together with the combination intersections that can be formed by combining those target sets; the number of elements contained in each combination intersection is obtained, and a relationship chain representing the subordinate relationships among the target sets is obtained; then, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any target set is calculated by using the inclusion-exclusion principle.
Drawings
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a modification of the relationship chain according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 4 is a schematic diagram of an electronic device for implementing a data processing method according to an embodiment of the present application;
Fig. 5 is a block diagram illustrating an apparatus for data processing as a server according to an example embodiment.
Detailed Description
In the technical solution provided by the embodiments of the present application, the sets to which the elements belong and their combination intersections are counted, and the total number of elements contained in a target set is then calculated by the inclusion-exclusion principle from the element counts of those combination intersections and the relationship chain among the sets. As a result, even if the relationship chain among the sets changes, the target sets and their combination intersections do not need to be counted again; the total number of elements can be obtained simply by recalculating with the changed relationship chain and the existing combination-intersection counts. This solves the technical problem of low element-count updating efficiency in the prior art.
The main implementation principle, the specific implementation mode and the corresponding beneficial effects of the technical scheme of the embodiment of the present application are explained in detail with reference to the accompanying drawings.
Examples
Referring to fig. 1, an embodiment of the present application provides a data processing method, including:
S11: obtaining all target sets to which each element in the element set to be processed belongs;
S12: obtaining, according to all the target sets corresponding to all the elements in the element set to be processed, all combination intersections that can be formed by combining those target sets, and the number of elements contained in each combination intersection;
S13: obtaining a relationship chain representing the subordinate relationships among the target sets;
S14: calculating, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any target set by using the inclusion-exclusion principle.
In a specific implementation process, the elements referred to in the embodiments of the present application may be entities included in a web page, for example, data contents such as goods included in a shopping-type web page, questions included in a forum-type web page, and terms included in an encyclopedia-type web page. Each element may belong to several target sets, which may be data sets, databases, etc. containing a large number of elements. The data sets, databases and other sets can be labeled by labels, and the labels to which the elements belong can be used to represent the sets to which the elements belong. The following takes the target set as the labeled data set as an example, and the specific implementation process of the embodiment of the present application is illustrated.
When S11 is executed, all target sets to which each element currently belongs are obtained, without considering the subordinate relationships between the target sets. As shown in Table 1, suppose the labels of element 1001 are "literature" and "emotional novel", which correspond to sets A and D respectively; then the target sets to which element 1001 belongs are set A and set D.
Table 1 lists the correspondence between element IDs and their labels; the labels of an element represent the sets to which it belongs. For example, the label of element 1002 is "literature", that is, element 1002 belongs to the data set labeled "literature"; the labels of element 1003 are "literature", "emotion" and "emotional novel", that is, element 1003 belongs to the data set labeled "literature", the data set labeled "emotion" and the data set labeled "emotional novel".
Table 2 lists the correspondence between element IDs and the sets to which they belong. For example, element 1002 belongs to set A; element 1003 belongs to sets A, B and D.
Combining Table 1 and Table 2, it can be determined that the label corresponding to set A is "literature", the label corresponding to set B is "emotion", the label corresponding to set C is "novel", and the label corresponding to set D is "emotional novel".
Element ID    Labels
1001          Literature, emotional novel
1002          Literature
1003          Literature, emotion, emotional novel
1004          Literature, novel
1005          Emotion, novel
1006          Emotion, emotional novel
1007          Emotional novel
1008          Emotional novel
Table 1
Element ID    Sets to which it belongs
1001          A, D
1002          A
1003          A, B, D
1004          A, C
1005          B, C
1006          B, D
1007          D
1008          D
Table 2
After S11 obtains all the target sets to which each element in the element set to be processed belongs, S12 is performed to obtain, according to those target sets, all combination intersections that can be formed by combining the target sets corresponding to all the elements.
In S12, all combination intersections may be obtained by exhaustively enumerating the combination intersections formed by at least one of the target sets of all the elements. That is, assuming there are n target sets in total (n ≥ 1): intersect each of the n target sets with itself to obtain all combination intersections formed by any one target set; combine any two of the n target sets to obtain all combination intersections formed by any two target sets; combine any three of the n target sets to obtain all combination intersections formed by any three target sets; and so on, up to combining any n-1 of the n target sets; and finally combine all n target sets to obtain the combination intersection formed by the n target sets. All the combination intersections so obtained, taken together, are all the combination intersections that can be formed by combining the target sets corresponding to all the elements.
Likewise, the subordinate relationships between target sets are not considered at this point. For ease of writing, the embodiments of the present application use the symbol "∩" to denote a combination intersection: ∩A denotes the combination intersection of target set A with itself, ∩AB denotes the combination intersection of target sets A and B, and so on.
For all target sets P = {T_y1, T_y2, …, T_yn}, where each T_yi (i = 1 to n) is a target set, obtain the set R_P of all possible combination intersections that can be formed from at least one of the n target sets. For example:
If n = 1, e.g. P = {A}, then R_P = {∩A}.
If n = 2, e.g. P = {A, B}, then R_P = {∩A, ∩B, ∩AB}.
If n = 3, e.g. P = {A, B, C}, then R_P = {∩A, ∩B, ∩C, ∩AB, ∩AC, ∩BC, ∩ABC}, and so on.
For the example shown in Table 2 above, all target sets are P = {A, B, C, D}, and all the combination intersections that can be formed by combination are R_P = {∩A, ∩B, ∩C, ∩D, ∩AB, ∩AC, ∩AD, ∩BC, ∩BD, ∩CD, ∩ABC, ∩ABD, ∩ACD, ∩BCD, ∩ABCD}.
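The exhaustive enumeration of R_P can be sketched with the standard library's `itertools.combinations`. This is a minimal illustration, not the patent's implementation; the function name and the convention of writing ∩AB as the string "AB" are assumptions made for the example.

```python
from itertools import combinations


def combination_intersections(target_sets):
    """List every non-empty combination of the given target sets, smallest first.

    Each combination is written as a string of set names, so "AB" stands for
    the combination intersection of A and B.
    """
    names = sorted(target_sets)
    result = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            result.append("".join(combo))
    return result


r_p = combination_intersections({"A", "B", "C", "D"})
print(len(r_p))  # 15, i.e. 2**4 - 1
print(r_p)
```

For n target sets the enumeration yields 2^n - 1 combination intersections, so this brute-force listing is practical only when the number of target sets involved is small.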
S12 obtains all the combination intersections and also, for each combination intersection, counts the number of elements belonging to it. Specifically, when a new element is added to the element set to be processed, all target sets to which the new element belongs are obtained; the combination intersections that can be formed by combining those target sets are then obtained, and the element count of each such combination intersection is incremented by one. For example, if the combination intersections that can be formed from the target sets of the new element are R_P = {β1, β2, …, βz}, then the element count of each corresponding combination intersection is incremented by 1, i.e. C_βi = C_βi + 1. Table 3 and Table 4 show the element counts of each combination intersection for the data in Table 2. As another example, suppose element 1001 is a new element whose target sets are A and D; the combination intersections that can be formed from target sets A and D are ∩A, ∩D and ∩AD, and the element count of each of them is incremented by 1 during counting.
Table 3 (image in the original publication; not reproduced)
Combination intersection β    Count C_β
∩A                            4
∩B                            3
∩C                            2
∩D                            5
∩AB                           1
∩AC                           1
∩AD                           2
∩BC                           1
∩BD                           2
∩CD                           0
∩ABC                          0
∩ABD                          1
∩ACD                          0
∩BCD                          0
∩ABCD                         0
Table 4
S11 and S12 above constitute the process of counting newly added elements in the embodiments of the present application. This process does not consider the subordinate relationships of the target sets to which an element belongs; it depends only on the combination intersections that can be formed from those target sets. Therefore, even if the subordinate relationships between the target sets change, the element count of each combination intersection is not affected, and the total number of elements of each target set can still be calculated correctly from these counts.
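The per-element counting of S11 and S12 can be sketched as below, using the memberships from Table 2. This is a hedged illustration: `MEMBERSHIP`, `record_element` and the use of `collections.Counter` are assumptions for the example, and ∩AD is again written as the string "AD".

```python
from collections import Counter
from itertools import combinations

# Element-to-set membership taken from Table 2 ("AD" means sets A and D).
MEMBERSHIP = {
    1001: "AD", 1002: "A", 1003: "ABD", 1004: "AC",
    1005: "BC", 1006: "BD", 1007: "D", 1008: "D",
}


def record_element(counts, element_sets):
    """Add 1 to the count of every combination intersection of the element's sets."""
    names = sorted(element_sets)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            counts["".join(combo)] += 1


intersection_counts = Counter()
for sets in MEMBERSHIP.values():
    record_element(intersection_counts, sets)

# Counts for the intersections of A, AD and ABD, matching Table 4.
print(intersection_counts["A"], intersection_counts["AD"], intersection_counts["ABD"])  # 4 2 1
```

Note that `record_element` never consults the relationship chain, which is why these counts stay valid when the chain changes. Combination intersections that no element falls into, such as "CD", are simply absent from the counter and read as 0.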
When the total number of elements contained in each target set needs to be obtained, S13 and S14 are executed. In an online data processing method, S11-S12 and S13-S14 alternate according to the actual scenario.
S13 obtains a relationship chain representing the subordinate relationships among the target sets. As shown in Fig. 2, before the relationship chain is changed, set D is subordinate to sets A, B and C, and set C is subordinate to set A.
S14 calculates, according to the obtained relationship chain between the target sets and the number of elements contained in each combination intersection, the total number of elements contained in any target set by using the inclusion-exclusion principle. The total number of elements includes the elements contained in all target subsets of the target set. A target subset is a target set that is subordinate to the target set in the relationship chain.
Specifically, when the total number of elements of any target set is calculated in S14, the target combination intersections corresponding to that target set may first be obtained according to the relationship chain; then, according to the target combination intersections and the number of elements contained in each of them, the total number of elements contained in the target set is calculated by using the inclusion-exclusion principle.
Obtaining the target combination intersections corresponding to a target set may include: obtaining all target subsets subordinate to the target set according to the relationship chain, and taking the combination intersections that can be formed by combining those target subsets as the target combination intersections corresponding to the target set. Assuming the target set is T, all target subsets subordinate to T are obtained from the relationship chain as U_T = {T_1, T_2, …, T_m}, where U_T includes T itself; expanding the union of these target subsets into intersections gives the target combination intersections. Note that the target subsets subordinate to a target set include the target set itself.
For example, as shown in Fig. 2 and Table 5, for target set A corresponding to the label "literature", the label relationship chain before the change shows that set D, corresponding to the label "emotional novel", and set C, corresponding to the label "novel", are both subordinate to target set A, so all target subsets of target set A are U_A = {A, C, D}. The target combination intersections corresponding to target set A are obtained by expanding the union of U_A = {A, C, D} into intersections: ∩A, ∩C, ∩D, ∩AC, ∩AD, ∩CD and ∩ACD.
Label                Target set <T>    All target subsets U_T
<Literature>         <A>               {A, C, D}
<Emotion>            <B>               {B, D}
<Novel>              <C>               {C, D}
<Emotional novel>    <D>               {D}
Table 5
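Deriving U_T from the relationship chain amounts to collecting a set together with everything subordinate to it. The sketch below assumes the chain is stored as a map from each set to its directly subordinate sets; `CHILDREN` and `target_subsets` are hypothetical names, and the map encodes the pre-change chain described in the text (D subordinate to A, B and C; C subordinate to A).

```python
# Directly subordinate sets before the change in Fig. 2, as described in the text.
CHILDREN = {"A": ["C", "D"], "B": ["D"], "C": ["D"], "D": []}


def target_subsets(target, children):
    """Return the target set itself plus every set subordinate to it."""
    found = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children[current])
    return found


for name in CHILDREN:
    print(name, sorted(target_subsets(name, CHILDREN)))
# A ['A', 'C', 'D']
# B ['B', 'D']
# C ['C', 'D']
# D ['D']
```

The printed rows reproduce the U_T column of Table 5.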
After the target combination intersections corresponding to the target set are obtained, the total number of elements contained in the target set is calculated by using the inclusion-exclusion principle according to the target combination intersections and the number of elements contained in each of them. The number of elements contained in each target combination intersection can be read from the combination-intersection counts obtained in S12.
Here, the inclusion-exclusion principle means: first count all objects contained in each set without regard to overlap, and then exclude the objects that were counted more than once, so that the result has neither omissions nor repetitions. Specifically, the total number of elements of target set T may be calculated by the following formula:
$$\left|\langle T\rangle\right| = \left|\bigcup_{i=1}^{m} T_i\right|$$
Expanding gives:
$$\left|\langle T\rangle\right| = \sum_{k=1}^{m} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} \left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$$
where $\left|\langle T\rangle\right|$ represents the total number of elements contained in the target set, $m$ is the number of target subsets subordinate to the target set, $T_i$ represents a target subset subordinate to the target set, $T_{i_1} \cap \cdots \cap T_{i_k}$ represents a target combination intersection corresponding to the target set, $\left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$ represents the number of elements contained in that target combination intersection, and $1 \le i_1 < \cdots < i_k \le m$.
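The inclusion-exclusion sum can be sketched directly over the stored combination-intersection counts. This is an illustration under assumptions: `COUNTS` copies the non-zero rows of Table 4, `total_elements` is a hypothetical helper, and the target subsets passed in are the U_T values from Table 5.

```python
from itertools import combinations

# Combination-intersection counts from Table 4 (zero-count entries omitted).
COUNTS = {"A": 4, "B": 3, "C": 2, "D": 5, "AB": 1, "AC": 1,
          "AD": 2, "BC": 1, "BD": 2, "ABD": 1}


def total_elements(subsets, counts):
    """Inclusion-exclusion over the target subsets.

    Counts of odd-sized combination intersections are added and counts of
    even-sized ones are subtracted.
    """
    names = sorted(subsets)
    total = 0
    for size in range(1, len(names) + 1):
        sign = 1 if size % 2 == 1 else -1
        for combo in combinations(names, size):
            total += sign * counts.get("".join(combo), 0)
    return total


print(total_elements("ACD", COUNTS))  # 8  (U_A = {A, C, D})
print(total_elements("BD", COUNTS))   # 6  (U_B = {B, D})
print(total_elements("CD", COUNTS))   # 7  (U_C = {C, D})
print(total_elements("D", COUNTS))    # 5  (U_D = {D})
```

The function only reads counts and never touches individual elements, so its cost depends on the number of target subsets m (2^m - 1 terms), not on the number of elements.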
For example, treating each label as a target set, and based on the elements and labels provided in Tables 1 to 4 and the pre-change label relationship chain shown in Fig. 2:
Element count under <Literature> = │<A>│ = │∪ACD│
Element count under <Emotion> = │<B>│ = │∪BD│
Element count under <Novel> = │<C>│ = │∪CD│
Element count under <Emotional novel> = │<D>│ = │∪D│
Expanding according to the formula gives the total number of elements in each target set:
│<A>│ = │∪ACD│
= │∩A│ + │∩C│ + │∩D│
- │∩AC│ - │∩AD│ - │∩CD│
+ │∩ACD│
= 4 + 2 + 5 - 1 - 2 - 0 + 0 = 8
Similarly:
│<B>│ = │∪BD│ = │∩B│ + │∩D│ - │∩BD│ = 3 + 5 - 2 = 6
│<C>│ = │∪CD│ = │∩C│ + │∩D│ - │∩CD│ = 2 + 5 - 0 = 7
│<D>│ = │∪D│ = │∩D│ = 5
With the above method, consider what happens when the relationship chain between the target sets changes, as in the modified label relationship chain shown in Fig. 2: "novel" is no longer a sub-label of "literature" but becomes a label at the same level as "literature". The target subsets of <Literature> then become U_T = {A, D}, and the total number of elements of <Literature> is │<A>│ = │∪AD│ = │∩A│ + │∩D│ - │∩AD│ = 4 + 5 - 2 = 7.
This example shows that a change in the label relationship chain does not affect the element counts of the combination intersections, so there is no need to go back over past records (that is, the combination-intersection counts). The method therefore still guarantees correct and efficient counting when the subordinate relationships between sets change, and it is particularly suitable for scenarios in which the relationship chain changes frequently, such as web tags.
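The effect of a chain change can be sketched end to end: the stored counts stay fixed and only the chain passed to the calculation differs. All names here are hypothetical. `chain_after` encodes only what the text states (C is no longer subordinate to A, so U_A = {A, D}); leaving the entry for C unchanged is an assumption, since Fig. 2 is not reproduced.

```python
from itertools import combinations

# Combination-intersection counts from Table 4; they are never recomputed.
COUNTS = {"A": 4, "B": 3, "C": 2, "D": 5, "AB": 1, "AC": 1,
          "AD": 2, "BC": 1, "BD": 2, "ABD": 1}


def subsets_of(target, chain):
    """Return the target set plus every set subordinate to it in the chain."""
    found, stack = set(), [target]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(chain.get(current, []))
    return found


def total_for(target, chain, counts):
    """Total element count of a target set via inclusion-exclusion."""
    names = sorted(subsets_of(target, chain))
    total = 0
    for size in range(1, len(names) + 1):
        sign = 1 if size % 2 else -1
        for combo in combinations(names, size):
            total += sign * counts.get("".join(combo), 0)
    return total


chain_before = {"A": ["C", "D"], "C": ["D"]}
chain_after = {"A": ["D"], "C": ["D"]}  # assumed encoding of the changed chain

print(total_for("A", chain_before, COUNTS))  # 8
print(total_for("A", chain_after, COUNTS))   # 7
```

Both results come from the same `COUNTS` dictionary, which mirrors the point made above: a chain change requires a recalculation but no recount.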
Corresponding to the data processing method provided in the foregoing embodiment, an embodiment of the present application further provides a data processing apparatus. Referring to Fig. 3, the apparatus includes:
a first obtaining unit 31, configured to obtain all target sets to which each element in the set of elements to be processed belongs;
a second obtaining unit 32, configured to obtain, according to all target sets corresponding to all elements in the to-be-processed element set, all combination intersections that can be formed by combining all target sets of all elements, and the number of elements included in each combination intersection;
a third obtaining unit 33, configured to obtain a relationship chain used for characterizing a dependency relationship between the target sets;
and a calculating unit 34, configured to calculate, according to the relationship chain and the number of elements contained in each combination intersection, the total number of elements contained in any one of the target sets by using the inclusion-exclusion principle.
In a specific implementation, the calculating unit 34 includes an obtaining subunit and a calculating subunit. The obtaining subunit is configured to obtain, according to the relationship chain, the target combination intersections corresponding to any one of the target sets. The calculating subunit is configured to calculate, according to the target combination intersections and the number of elements contained in each target combination intersection, the total number of elements contained in the target set by using the inclusion-exclusion principle.
When acquiring the target combination intersection, the acquiring subunit is specifically configured to: obtaining all target subsets subordinate to the target set according to the relation chain; and obtaining a combined intersection formed by combining all target subsets belonging to the target set as a target combined intersection corresponding to the target set.
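The two steps performed by the obtaining subunit can be sketched as follows. This is an illustrative sketch only: the representation of the relationship chain as a parent-to-children dictionary, the tag names, and the function names `subordinate_sets` and `target_combinations` are assumptions, as is the choice to treat the target set itself as one of its subordinate sets so that directly tagged elements are covered.

```python
from itertools import combinations

def subordinate_sets(target, children):
    """Return the target set plus every set below it in the relationship chain."""
    found, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(children.get(node, []))
    return sorted(found)

def target_combinations(target, children):
    """Every non-empty combination of the subordinate sets, as frozensets."""
    subsets = subordinate_sets(target, children)
    return [
        frozenset(combo)
        for k in range(1, len(subsets) + 1)
        for combo in combinations(subsets, k)
    ]

# Hypothetical relationship chain: "sports" has two child tags.
chain = {"sports": ["football", "basketball"]}
subsets = subordinate_sets("sports", chain)
combos = target_combinations("sports", chain)
```

With three subordinate sets there are 2^3 - 1 = 7 target combination intersections, which is why the approach suits targets with a modest number of subordinate tags.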
Specifically, the calculating unit 34 may calculate the total number of elements according to the following formula:
$$\left|\bigcup_{i=1}^{m} T_i\right| = \sum_{k=1}^{m} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} \left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$$
where $\left|\bigcup_{i=1}^{m} T_i\right|$ represents the total number of elements contained in the target set, $m$ is the number of target subsets subordinate to the target set, $T_i$ represents a target subset subordinate to the target set, $T_{i_1} \cap \cdots \cap T_{i_k}$ represents a target combination intersection corresponding to the target set, $\left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$ represents the number of elements contained in that target combination intersection, and $1 \le i_1 < \cdots < i_k \le m$.
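The calculation performed by the calculating unit is the standard inclusion-exclusion sum over the target combination intersections. A minimal sketch follows, assuming the counts are stored in a dictionary keyed by a frozenset of set names; that storage layout, the function name `total_elements`, and the sample counts are illustrative assumptions rather than details taken from the specification.

```python
from itertools import combinations

def total_elements(subsets, intersection_counts):
    """Size of the union of `subsets`, computed by inclusion-exclusion.

    intersection_counts maps a frozenset of set names to the number of
    elements that belong to every set in that combination.
    """
    total = 0
    for k in range(1, len(subsets) + 1):
        sign = 1 if k % 2 == 1 else -1  # (-1)^(k-1)
        for combo in combinations(subsets, k):
            total += sign * intersection_counts.get(frozenset(combo), 0)
    return total

# Hypothetical counts: 3 elements carry tag "a", 2 carry tag "b",
# and 1 of them carries both.
counts = {
    frozenset({"a"}): 3,
    frozenset({"b"}): 2,
    frozenset({"a", "b"}): 1,
}
union_size = total_elements(["a", "b"], counts)
single_size = total_elements(["a"], counts)
```

Here the union is 3 + 2 - 1 = 4: the element carrying both tags is counted once in each single-tag term and subtracted once through the pairwise term.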
The second obtaining unit 32 is specifically configured to exhaustively enumerate the combination intersections formed by combining at least one target set from among all the target sets of all the elements, thereby obtaining all combination intersections that can be formed by combining all the target sets of all the elements.
In a specific implementation, the apparatus further includes a counting unit 35, configured to: when an element is newly added to the element set to be processed, obtain all target sets to which the newly added element belongs, obtain the combination intersections that can be formed by combining all target sets to which the newly added element belongs, and add one to the number of elements contained in each combination intersection corresponding to the newly added element.
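The incremental update described for the counting unit can be sketched as follows. The use of `collections.Counter` keyed by frozensets, the function name `add_element`, and the tag names are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def add_element(tags, intersection_counts):
    """Add one to every combination intersection formed from the element's tags."""
    tags = sorted(set(tags))  # drop duplicate tags before enumerating
    for k in range(1, len(tags) + 1):
        for combo in combinations(tags, k):
            intersection_counts[frozenset(combo)] += 1

counts = Counter()
add_element(["a", "b"], counts)  # hypothetical element tagged with "a" and "b"
add_element(["a"], counts)       # hypothetical element tagged with "a" only
```

An element with n tags touches 2^n - 1 counters, and no counter belonging to any other element or to the relationship chain is involved in the update.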
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 4 is a block diagram illustrating an electronic device 800 for implementing a data processing method according to an example embodiment. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 4, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the electronic device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor component 814 may detect an open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800. The sensor component 814 may also detect a change in the position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the electronic device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium having instructions therein, which when executed by a processor of a mobile terminal, enable the mobile terminal to perform a data processing method, the method comprising:
obtaining all target sets to which each element in the element set to be processed belongs; acquiring, according to all target sets corresponding to all elements in the element set to be processed, all combination intersections which can be formed by combining all the target sets of all the elements and the number of elements contained in each combination intersection; obtaining a relationship chain for characterizing the affiliation between each of the target sets; and calculating, by the inclusion-exclusion principle, the total number of elements contained in any target set according to the relationship chain and the number of elements contained in each combination intersection.
Fig. 5 is a block diagram illustrating an apparatus for data processing as a server according to an example embodiment. The server 1900 may vary widely by configuration or performance and may include one or more Central Processing Units (CPUs) 1922 (e.g., one or more processors) and memory 1932, one or more storage media 1930 (e.g., one or more mass storage devices) storing applications 1942 or data 1944. Memory 1932 and storage medium 1930 can be, among other things, transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instructions operating on a server. Still further, a central processor 1922 may be provided in communication with the storage medium 1930 to execute a series of instruction operations in the storage medium 1930 on the server 1900.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input-output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (14)

1. A method of data processing, the method comprising:
obtaining all target sets to which each element in the element set to be processed belongs;
acquiring all combination intersections which can be formed by combining all the target sets of all the elements and the number of the elements contained in each combination intersection according to all the target sets corresponding to all the elements in the element set to be processed;
obtaining a relationship chain for characterizing the affiliation between each of the target sets;
calculating, by the inclusion-exclusion principle, the total number of elements contained in any target set according to the relationship chain and the number of elements contained in each combination intersection;
the elements are entities included in the webpage, and the target set is a data set labeled by the webpage label.
2. The method of claim 1, wherein the calculating, by the inclusion-exclusion principle, of the total number of elements contained in any target set according to the relationship chain and the number of elements contained in each combination intersection comprises:
obtaining a target combination intersection corresponding to any one target set according to the relationship chain;
and calculating, by the inclusion-exclusion principle, the total number of elements contained in the target set according to the target combination intersection and the number of elements contained in each target combination intersection.
3. The method of claim 2, wherein obtaining a target combination intersection corresponding to any one of the target sets according to the relationship chain comprises:
obtaining all target subsets subordinate to the target set according to the relation chain;
and obtaining a combined intersection formed by combining all target subsets belonging to the target set as a target combined intersection corresponding to the target set.
4. The method of claim 3, wherein the total number of elements contained in any target set is calculated by the inclusion-exclusion principle, according to the relationship chain and the number of elements contained in each combination intersection, using the following formula:
$$\left|\bigcup_{i=1}^{m} T_i\right| = \sum_{k=1}^{m} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} \left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$$
where $\left|\bigcup_{i=1}^{m} T_i\right|$ represents the total number of elements contained in the target set, $m$ is the number of target subsets subordinate to the target set, $T_i$ represents a target subset subordinate to the target set, $T_{i_1} \cap \cdots \cap T_{i_k}$ represents a target combination intersection corresponding to the target set, $\left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$ represents the number of elements contained in that target combination intersection, and $1 \le i_1 < \cdots < i_k \le m$.
5. The method of claim 1, wherein the acquiring, according to all target sets corresponding to all elements in the element set to be processed, of all combination intersections which can be formed by combining all the target sets of all the elements comprises:
exhaustively enumerating the combination intersections formed by combining at least one target set from among all the target sets of all the elements, to obtain all combination intersections which can be formed by combining all the target sets of all the elements.
6. The method of claim 1, wherein the method further comprises:
when one element is newly added in the element set to be processed, acquiring all target sets to which the newly added element belongs;
and acquiring a combination intersection formed by all the target sets to which the newly added elements belong, and adding one to the number of elements contained in each combination intersection corresponding to the newly added elements.
7. A data processing apparatus, comprising:
a first obtaining unit, configured to obtain all target sets to which each element in the element set to be processed belongs;
a second obtaining unit, configured to obtain, according to all target sets corresponding to all elements in the element set to be processed, all combination intersections that can be formed by combining all target sets of all elements, and the number of elements included in each combination intersection;
a third obtaining unit, configured to obtain a relationship chain used for representing a dependency relationship between the target sets;
a calculating unit, configured to calculate, by the inclusion-exclusion principle, the total number of elements contained in any one of the target sets according to the relationship chain and the number of elements contained in each combination intersection;
the elements are entities included in the webpage, and the target set is a data set labeled by the webpage label.
8. The apparatus of claim 7, wherein the computing unit comprises:
the obtaining subunit is configured to obtain, according to the relation chain, a target combination intersection corresponding to any one of the target sets;
and the calculating subunit is configured to calculate, by the inclusion-exclusion principle, the total number of elements contained in the target set according to the target combination intersection and the number of elements contained in each target combination intersection.
9. The apparatus of claim 8, wherein the acquisition subunit is to:
obtaining all target subsets subordinate to the target set according to the relation chain;
and obtaining a combined intersection formed by combining all target subsets belonging to the target set as a target combined intersection corresponding to the target set.
10. The apparatus of claim 9, wherein the computing unit is configured to compute the total number of elements according to the following formula:
$$\left|\bigcup_{i=1}^{m} T_i\right| = \sum_{k=1}^{m} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} \left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$$
where $\left|\bigcup_{i=1}^{m} T_i\right|$ represents the total number of elements of the target set, $m$ is the number of target subsets contained in the target set, $T_i$ represents a target subset in the target set, $T_{i_1} \cap \cdots \cap T_{i_k}$ represents a target combination intersection formed by combining target subsets in the target set, $\left|T_{i_1} \cap \cdots \cap T_{i_k}\right|$ represents the number of elements of that target combination intersection, and $1 \le i_1 < \cdots < i_k \le m$.
11. The apparatus of claim 7, wherein the second obtaining unit is configured to:
exhaustively enumerate the combination intersections formed by combining at least one target set from among all the target sets of all the elements, to obtain all combination intersections which can be formed by combining all the target sets of all the elements.
12. The apparatus of claim 7, wherein the apparatus further comprises:
a counting unit, configured to: when one element is newly added to the element set to be processed, obtain all target sets to which the newly added element belongs, obtain the combination intersections which can be formed by combining all the target sets to which the newly added element belongs, and add one to the number of elements contained in each combination intersection corresponding to the newly added element.
13. An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising operating instructions for performing the method according to any one of claims 1-6.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
CN201710165468.XA 2017-03-20 2017-03-20 Data processing method and device and electronic equipment Active CN108628883B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710165468.XA CN108628883B (en) 2017-03-20 2017-03-20 Data processing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN108628883A CN108628883A (en) 2018-10-09
CN108628883B true CN108628883B (en) 2021-03-16

Family

ID=63687827

Country Status (1)

Country Link
CN (1) CN108628883B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant