WO2021137690A1 - Method of determining trending topics and a system thereof - Google Patents
Method of determining trending topics and a system thereof
- Publication number
- WO2021137690A1, PCT/MY2020/050147
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- attribute
- frequently used
- attributes
- report
- trending
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
Definitions
- This invention relates to data analysis and more particularly to a method of determining relevant trending topics on social media platforms based on frequently used attributes by user.
- data analysis In data analysis, data are collected, organized, and displayed in a form of table, chart, graph or other representation to interpret meaning of the data. In other words, data analysis helps user to derive helpful information and extract meaningful insights.
- social media platforms provide massive data of user behaviour and preferences. Trending topic on one or more social media platforms indicates subject that has a sudden popularity during a given period of time. By analysing trending topics on social media platforms, one can better understand societal needs which can be capitalized as business opportunity. Therefore, report generation based on the trending topics could offer a glimpse into behaviour and regular common interests of user.
- a report attribute in report generation reflects characteristic of a data that takes a value which is associated with an object, such as a person, place, or thing.
- An example of such characteristic is author, whose value is typically associated with a name of an object creator.
- most traditional analytic visualization tools require reporting data to be constructed manually, which involves a deep understanding of user behaviour and the trending topics.
- a prior art of patent application US 2006/0184464 A1 discloses a system and methods for data analysis and trend prediction, and more specifically relates to analysis of data relationships. ‘464 A1 obtains the impact of frequently used attributes associated with a particular item in a dataset and generates an impact profile for each item. Heuristic algorithms are then used to generate an expertise profile for each item based on the impact profile, to further determine additional relationships among the items in the dataset. Heuristic algorithms, however, trade optimality, accuracy, precision, or completeness for efficiency and speed.
- the present invention relates to a method of determining trending topics based on frequently used attributes, comprising a step of retrieving report attributes from a report repository, characterized by steps of analysing impact of the frequently used attributes from the report attributes by employing Attribute View Duration, AVD and User Access Location, UAL, by a statistical engine; generating relevant trending topics based on the impacted frequently used attributes, by a trending topic analyser; and generating a set of highest combination, HCn for each impacted frequently used attribute, by a report criteria generator, wherein each set of HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
- the step of analysing impact of the frequently used attributes from the report attributes by a statistical engine further comprises steps of accumulating a list of the frequently used attributes from the report attributes; analysing each frequently used attribute based on the AVD and UAL; conducting statistical analysis for each frequently used attribute based on the AVD and UAL using chi-square test; calculating UAL weightage for each frequently used attribute; and establishing a rank of the frequently used attributes according to highest order of chi-square values and highest UAL weightage.
- the step of accumulating a list of the frequently used attributes from the report attributes further comprises steps of clustering the frequently used attributes based on occurrences of similar attributes; and mapping each frequently used attribute with its associated view duration and access location.
- the step of analysing each frequently used attribute based on the AVD and UAL includes distributing the AVD and UAL for each frequently used attribute in a contingency table.
- the step of generating relevant trending topics based on the impacted frequently used attributes by a trending topic analyser further comprises steps of acquiring a list of trending topics from a trending topic repository based on access location and time duration of the trending topics; analysing relationship of the trending topics and related attribute values using Social Network Analysis, SNA; and grouping the attribute values and its relevant trending topics according to each impacted frequently used attributes.
- the step of analysing relationship of the trending topics and related attribute values using Social Network Analysis, SNA further comprises steps of mapping the trending topics to related attribute values of each associated frequently used attributes by creating a link, calculating total sum of links concluded for each attribute value and the associated frequently used attributes; and establishing a rank of the frequently used attributes according to highest order of the sum of links.
- the step of tagging impacted frequently used attribute with keywords further comprises steps of identifying attribute description of each impacted frequently used attribute; and tagging the attribute description using Named Entity Recognition, NER.
- the method further comprising a step of storing the set of highest combination, HCn, for each frequently used attribute in the report repository.
- the present invention also relates to a system for determining trending topics based on frequently used attributes, comprises a report repository for storing report attributes and outputs of the system, a trending topic repository for storing trending topics gathered from a plurality of social media, a statistical engine configured to conduct statistical analysis on the report attributes based on Attribute View Duration, AVD and User Access Location, UAL using chi-square test to analyse impact of the frequently used attributes, a trending topic analyser configured to analyse relationship of the trending topics and related attribute values for each impacted frequently used attribute using Social Network Analysis, SNA, and a report criteria generator configured to generate a set of highest combination of the trending topics and attributes for each frequently used attributes, HCn, wherein each set of the HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
- Figure 1 is a diagram illustrating a system for determining trending topics based on attributes in accordance to the present invention.
- Figure 2 is a flow chart for a method of determining trending topics based on attributes in accordance to the present invention.
- Figure 3 is a flow chart for a step of analysing impact of the frequently used attributes from the report attributes in accordance to the present invention.
- Figure 4 is a flow chart for a step of generating relevant trending topics based on the impacted frequently used attributes in accordance to the present invention.
- Figure 5 illustrates an exemplary embodiment for mapping the trending topics to related attribute values of each associated frequently used attributes in accordance to the present invention.
- Figure 6 is a flow chart for a step of generating a set of highest combination (HC n ) for each impacted frequently used attribute in accordance to the present invention.
- Figure 7 illustrates an exemplary embodiment to represent similarity between a set of attribute keywords and an attribute related words of the trending topics in accordance to the present invention.
- Figure 1 illustrates a block diagram of said system (10) in accordance to the preferred embodiment of the present invention.
- the system (10) comprising a statistical engine (11), a trending topic analyser (12) and a report criteria generator (13) in communication with a report repository (14) and a trending topic repository (15).
- the statistical engine (11) is configured to conduct statistical analysis for attributes based on Attribute View Duration, AVD and User Access Location, UAL using chi-square test.
- the AVD in the present invention is defined as an amount of time a user spent on accessing a report.
- the AVD factor is considered because view duration indicates the behavioural preferences of a user in generating a subsequent report. From the view duration, an appropriate attribute may be suggested for the next report generating analysis.
- the UAL relates to a location from where the user accesses the report which is required to determine trending topic for each access location.
- the trending topics analyser (12) is configured to analyse relationship of trending topics and related attribute values using Social Network Analysis, SNA such as Google Trends.
- Google Trends provides keyword related data including search volume index and geographical information about search engine users.
- the report criteria generator (13) is configured to analyse report attributes with current trend based on user profiling and trending topics.
- the user profiling is acquired by analysing the AVD and UAL.
- Output of the report criteria generator (13) may further be used in any data analytic visualization tools to provide an accessible approach to analyse and understand trends, outliers, and patterns in data.
- data analytic visualization delivers graphical representation of information and data by using visual elements like charts, graphs and maps.
- the report repository (14) stores a collection of report execution history comprises of report attributes (e.g. attribute name, attribute description, etc.) and outputs of the system (10) as well as the AVD and UAL information during the report execution.
- the trending topics repository (15) stores a collection of latest trending topics gathered from various social media platforms for example Facebook, Twitter, Instagram, etc.
- the present invention also relates to a method for determining trending topics (20) based on the frequently used attributes by user.
- Figure 2 illustrates a flow chart for said method (20), comprising a step of retrieving report attributes (100) from the report repository (14), analysing impact of the frequently used attributes (200) from the report attributes by employing Attribute View Duration, AVD and User Access Location, UAL, by the statistical engine (11).
- the method (20) further comprising a step of generating relevant trending topics (300) based on the impacted frequently used attributes, by the trending topic analyser (12) and generating a set of highest combination, HCn, for each impacted frequently used attribute (400), by the report criteria generator (13).
- the method (20) also comprising a step of storing the set of HCn for each frequently used attribute (500) in the report repository (14).
- the method (20) begins with retrieving all report attributes (100) from the report repository (14) by extracting all attributes from a current report generated analysis, wherein the report generated analysis may comprise a plurality of reports (e.g. R1, R2, R3, ..., Rn).
- the statistical engine (11) analyses impact of the frequently used attributes (200) from the report attributes by employing the Attribute View Duration, AVD and User Access Location, UAL as shown in Figure 3. All the attributes extracted in step 100 are accumulated to acquire a list of the frequently used attributes by the user (201).
- the step of accumulating a list of the frequently used attributes (201) from the report attributes further comprises steps of clustering the frequently used attributes based on occurrences of similar attributes; and mapping each frequently used attribute with its associated view duration and access location.
- Table 1 shows an example in which the accumulated attributes used by the users from each report are mapped, in step 201, with their associated view duration and access location.
- a number of the most frequent attributes are gathered and ranked based on the occurrences of similar attributes; for example, the top 100 attributes are selected for further analysis. In the exemplary embodiment of table 1, the top attributes such as ‘gender’, ‘age’, ‘state’, ‘income’, ‘height’, and ‘weight’ are gathered and ranked according to the occurrences of similar attributes in each report.
- the occurrences of similar attributes used by the users are identified as a plurality of sets of a frequent value, Fn.
- the frequently used attributes may further be represented by equation (1), wherein the frequently used attributes equation comprises the number of highest attributes and their frequent value, Fn.
- each attribute from table 1 is analysed using two factors (202) i.e. the Attribute View Duration, AVD and User Access Location, UAL.
- the most popular attributes are identified to determine user behavioural preferences in generating subsequent reports.
- the AVD is defined as the amount of time a user spends accessing the report. The view duration timer starts when the user logs in to a user profile, clicks to start performing analysis, and selects attributes to view or interact with the report.
- the report can be any type of report. Time tracking stops and the user is marked as offline when no physical movement or input-device activity (i.e. mouse movement, keyboard or touchscreen input) is detected.
- the report generated is geo-tagged to mark the location from which the report is accessed while the user is logged in to the user profile, in order to record the UAL.
- the UAL is further used in determining trending topic for each location.
- the step of analysing each frequently used attribute (202) based on the AVD and UAL includes distributing the AVD and UAL for each frequently used attribute in a contingency table. As shown in table 2, the AVD and UAL for each attribute from table 1 are distributed in the contingency table.
- each attribute is being accessed in different UAL, i.e. Bandar Baru Bangi, Putrajaya, Kajang and so forth until L n .
- Each attribute has different AVD value based on the UAL.
- each attribute from the contingency table is analysed (203) using a statistical analysis known as chi-square test to determine correlation between the AVD of each attribute for different UAL.
- the chi-square test outputs a significant value, V of difference between expected frequencies and observed frequencies of each attribute for each AVD at different UAL.
- the expected frequency, EF is calculated by equation (2):
- the observed frequency is a value of AVD to be tested, for example with reference to table 2, the observed frequency value for ‘Gender’ and ‘Putrajaya’ is 60.
- V calculated using the chi-square test for ‘Putrajaya’ may be represented by equation (3):
- V = (60 - EF)² / EF (3)
- UAL weightage for each frequently used attribute is calculated (204), by summing each attribute based on the UAL in Table 2. For example, with reference to table 2, the UAL weightage for ‘Putrajaya’ is T2. Then, the frequently used attributes are ranked (205) according to highest order of chi-square values and the highest UAL weightage as shown in table 3.
- the chi-square values are sorted and ranked from highest to lowest (205), together with the highest UAL weightage.
- a high value shows high correlation of AVD and UAL for the frequently used attributes, thus giving a higher position ranking to signify impact of the frequently used attributes.
- the impacted frequently used attributes and the values obtained from step 205 are subsequently stored into the report repository (14) for the purpose of determining the trending topics.
- the method (20) then proceeds to generate relevant trending topics (300) based on the impacted frequently used attributes by the trending topic analyser (12) with reference to Figure 4.
- the step 300 begins by acquiring a list of trending topics (301) from a trending topic repository (15) based on access location and time duration of the trending topics.
- An example of the trending topic repository is Google Trends.
- the list of the trending topics may be obtained from Google Trends using inputs such as country, state and city to indicate the location of the trending topics, while the time duration of the trending topics is presented, for example, as the past 24 hours, past 30 days or past 12 months.
- the list of trending topics, T is returned as in equation (4).
- T = {Lee Chong Wei, Gamuda Share Price, Liverpool, ..., Tn} (4)
- relationship of the trending topics and related attribute values is analysed (302) using Social Network Analysis, SNA.
- the step of analysing relationship of the trending topics and related attribute values (302) further comprises step of mapping each of the trending topics to related attribute values, A of each associated frequently used attributes by creating a link as illustrated in Figure 5.
- Examples of the attribute values, A for the attribute ‘Gender’ are ‘Female’ and ‘Male’.
- the trending topic of ‘Lee Chong Wei’ is linked to attribute value, A ‘Male’ for attribute ‘Gender’ and ‘P.Pinang’ for attribute ‘State’.
- Aage = {(0-20 Years, Liverpool), (20-40 Years, Gamuda Share Price)}
- the method (20) further generates a set of highest combination, HC n , for each impacted frequently used attribute (400), by the report criteria generator (13) as shown in Figure 6.
- Step 400 finds the highest-similarity combination of the keywords of the impacted frequently used attributes, known as attribute keywords, AK, and the related words of the trending topics, known as attribute related words, AR.
- the step of tagging impacted frequently used attribute with keywords (401) further comprises steps of identifying attribute description of each impacted frequently used attribute and tagging the attribute description using Named Entity Recognition, NER.
- AK = {Place, Year, Organisation}.
- LSA Latent Semantic Analysis
- the set of related words generated for each of the trending topics, T of the attribute, A using the LSA technique is:
- AR = {finance adviser, finance management, finance consultant, education fair, education articles, education act}.
- the steps of selecting each attribute keyword, Kn and calculating the similarity are iterated until there are no more attribute related words, Rn to be selected.
- the similarity of each attribute keyword, Kn with the set of attribute related words, AR = {finance adviser, finance management, finance consultant, education fair, education articles, education act} is calculated using the cosine similarity until all related words, R1, R2, ..., Rn from AR have been selected for obtaining the similarity values.
- the cosine similarity calculation outputs results such as 0.8, 0.7, 0.6, 0.3, 0.5, and 0.2 to represent the similarity between AK and AR.
- the similarity between the first attribute keyword of AK, i.e. K1, and the first attribute related word of the trending topics, i.e. R1, is shown as 0.8.
- the similarity with the second attribute related word of the trending topics, i.e. R2, is shown as 0.7.
- the small dots in Figure 7 represent the related words.
- the attribute keywords, AK and attribute related words, AR with highest similarity value are constituted (404) to generate the set of highest combination, HC n , for each impacted frequently used attribute, wherein each set of HC n comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
- the HCn represents a set of highest combination of the trending topics and attributes for each frequently used attribute, as in the following example:
- HCage = {Place, finance management, education act}
- HCgender = {Year, Organization, finance adviser, education articles, education act}
- each impacted frequently used attribute (i.e. age, state, gender) comprises a list of combination of each attribute keywords, AK and attribute related words, AR having highest similarity value.
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Finance (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Accounting & Taxation (AREA)
- Marketing (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Tourism & Hospitality (AREA)
- Primary Health Care (AREA)
- Human Resources & Organizations (AREA)
- Health & Medical Sciences (AREA)
- Entrepreneurship & Innovation (AREA)
- Computing Systems (AREA)
- Game Theory and Decision Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a system for determining trending topics (10) based on frequently used attributes, comprises a statistical engine (11), a trending topic analyser (12) and a report criteria generator (13) in communication with a report repository (14) and a trending topic repository (15). A method of determining trending topics (20) based on the frequently used attributes is also provided thereof, comprising steps of retrieving report attributes (100), analysing impact of the frequently used attributes (200) from the report attributes; generating relevant trending topics (300) based on the impacted frequently used attributes; generating a set of highest combination (400), for each impacted frequently used attribute; and storing the set of highest combination of the trending topics and the attributes, (HCn) for each frequently used attribute (500) in the report repository (14).
Description
METHOD OF DETERMINING TRENDING TOPICS AND A SYSTEM THEREOF
FIELD OF INVENTION
This invention relates to data analysis and more particularly to a method of determining relevant trending topics on social media platforms based on frequently used attributes by user.
BACKGROUND OF THE INVENTION
In data analysis, data are collected, organized, and displayed in a form of table, chart, graph or other representation to interpret meaning of the data. In other words, data analysis helps user to derive helpful information and extract meaningful insights. In recent days, social media platforms provide massive data of user behaviour and preferences. Trending topic on one or more social media platforms indicates subject that has a sudden popularity during a given period of time. By analysing trending topics on social media platforms, one can better understand societal needs which can be capitalized as business opportunity. Therefore, report generation based on the trending topics could offer a glimpse into behaviour and regular common interests of user.
A report attribute in report generation reflects a characteristic of data that takes a value associated with an object, such as a person, place, or thing. An example of such a characteristic is author, whose value is typically the name of the object's creator. In order to find out the relationship between current trending topics, existing report attributes and user interest in any prominent topic, most traditional analytic visualization tools require reporting data to be constructed manually, which involves a deep understanding of user behaviour and the trending topics.
Thus, historical report data alone is insufficient to be used as reference to understand current interest of user in certain topics or scenario. Consequently, there is a need to exploit additional input to gather more information on relevant topics based on user behaviour profile in accessing report attributes in order to improve report criteria relevancies with current trending topics.
A prior art of patent application US 2006/0184464 A1 (‘464 A1) discloses a system and methods for data analysis and trend prediction, and more specifically relates to analysis of data relationships. ‘464 A1 obtains the impact of frequently used attributes associated with a particular item in a dataset and generates an impact profile for each item. Heuristic algorithms are then used to generate an expertise profile for each item based on the impact profile, to further determine additional relationships among the items in the dataset. Heuristic algorithms, however, trade optimality, accuracy, precision, or completeness for efficiency and speed.
Another prior art, patent US 9213996 B2 (‘996 B2), discloses a system and method for analysing social media trends, and particularly for identifying trends in social media activity and correlations with sales data. Correlations between social media activity with respect to a concept and sales of products are used to predict sales for the same or different products by identifying second-tier influencers. In ‘996 B2, the second-tier influencers are monitored among the general population of expert contributors to evaluate their significance.
Despite the attempts in the prior art, there is still room for improvement in providing a data analysis method of determining trending topics that improves the relevance of report criteria to current trends based on user profiling and trending topics. Accordingly, there exists a need for a method that provides valuable information on the relationship between user behaviour in accessing reports and current social trending topics, so as to make better recommendations or predictions for the next report generating analysis.
SUMMARY OF INVENTION
The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive overview of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
It is an objective of the present invention to provide a system and method of determining trending topics based on frequently used attributes by user to improve report criteria relevancies.
It is another objective of the present invention to provide a system and method of determining trending topics by considering view duration and access location of the attributes.
It is also an objective of the present invention to provide a system and method of determining trending topics by combining highest similarity value of the trending topics from social media and the frequently used attributes by user.
The present invention relates to a method of determining trending topics based on frequently used attributes, comprising a step of retrieving report attributes from a report repository, characterized by steps of analysing impact of the frequently used attributes from the report attributes by employing Attribute View Duration, AVD and User Access Location, UAL, by a statistical engine; generating relevant trending topics based on the impacted frequently used attributes, by a trending topic analyser; and generating a set of highest combination, HCn for each impacted frequently used attribute, by a report criteria generator, wherein each set of HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
In a preferred embodiment of the present invention, the step of analysing impact of the frequently used attributes from the report attributes by a statistical engine, by employing Attribute View Duration, AVD and User Access Location, UAL, further comprises steps of accumulating a list of the frequently used attributes from the report attributes; analysing each frequently used attribute based on the AVD and UAL; conducting statistical analysis for each frequently used attribute based on the AVD and UAL using chi-square test; calculating UAL weightage for each frequently used attribute; and establishing a rank of the frequently used attributes according to highest order of chi-square values and highest UAL weightage.
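The ranking steps in this embodiment can be sketched in Python under stated assumptions. The table values below are invented for illustration, the expected frequency is taken to be the standard contingency-table estimate (row total × column total ÷ grand total), and the UAL weightage is taken to be the column total for each location. How the chi-square order and the UAL weightage are combined into one rank is not spelled out here, so this sketch ranks by chi-square value only and reports the weightage alongside.

```python
# Hypothetical AVD contingency table: rows are frequently used attributes,
# columns are access locations (UAL), cells are observed view durations.
# All numbers are invented for illustration.
observed = {
    "Gender": {"Bandar Baru Bangi": 30, "Putrajaya": 60, "Kajang": 10},
    "Age": {"Bandar Baru Bangi": 20, "Putrajaya": 20, "Kajang": 20},
    "State": {"Bandar Baru Bangi": 10, "Putrajaya": 20, "Kajang": 10},
}
locations = ["Bandar Baru Bangi", "Putrajaya", "Kajang"]

# Row totals per attribute, column totals per location, and the grand total.
row_totals = {attr: sum(cells.values()) for attr, cells in observed.items()}
ual_weightage = {loc: sum(observed[attr][loc] for attr in observed) for loc in locations}
grand_total = sum(row_totals.values())


def expected_frequency(attribute, location):
    """Assumed form of the expected frequency: row total * column total / grand total."""
    return row_totals[attribute] * ual_weightage[location] / grand_total


def chi_square(attribute):
    """Sum of (observed - expected)^2 / expected over all locations for one attribute."""
    total = 0.0
    for loc in locations:
        ef = expected_frequency(attribute, loc)
        total += (observed[attribute][loc] - ef) ** 2 / ef
    return total


# Rank attributes from highest to lowest chi-square value.
ranking = sorted(observed, key=chi_square, reverse=True)

for attr in ranking:
    print(attr, round(chi_square(attr), 3))
print("UAL weightage:", ual_weightage)
```

With these invented numbers, the attribute whose view durations deviate most from the expected frequencies across locations is ranked first.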
In a preferred embodiment of the present invention, the step of accumulating a list of the frequently used attributes from the report attributes, further comprises steps of clustering the frequently used attributes based on occurrences of similar attributes; and mapping each frequently used attribute with its associated view duration and access location.
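A minimal sketch of this accumulation step follows, assuming the report execution history is available as simple tuples. The record layout, the sample values and the `TOP_N` cut-off are hypothetical, and "clustering" is approximated here by counting exact matches of attribute names.

```python
from collections import Counter

# Hypothetical report execution history:
# (report id, attribute, view duration in minutes, access location).
history = [
    ("R1", "gender", 12, "Putrajaya"),
    ("R1", "age", 5, "Putrajaya"),
    ("R2", "gender", 8, "Kajang"),
    ("R2", "state", 3, "Kajang"),
    ("R3", "gender", 20, "Bandar Baru Bangi"),
    ("R3", "age", 7, "Bandar Baru Bangi"),
]

# Cluster by occurrences of the same attribute across reports.
occurrences = Counter(attribute for _, attribute, _, _ in history)

# Keep only the most frequent attributes (hypothetical cut-off).
TOP_N = 2
frequently_used = [name for name, _ in occurrences.most_common(TOP_N)]

# Map each frequently used attribute to its view durations and access locations.
usage = {name: [] for name in frequently_used}
for _, attribute, duration, location in history:
    if attribute in usage:
        usage[attribute].append((duration, location))

print(frequently_used)
print(usage)
```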
In a preferred embodiment of the present invention, the step of analysing each frequently used attribute based on the AVD and UAL includes distributing the AVD and UAL for each frequently used attribute in a contingency table.
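As an illustration, the contingency table could be built as below. The sketch assumes each cell holds the summed view duration for one attribute at one access location; the aggregation rule, the function name `build_contingency_table` and the sample records are assumptions rather than details taken from the description.

```python
from collections import defaultdict

# Hypothetical records: (attribute, access location, view duration).
records = [
    ("gender", "Putrajaya", 40),
    ("gender", "Putrajaya", 20),
    ("gender", "Kajang", 15),
    ("age", "Putrajaya", 10),
    ("age", "Bandar Baru Bangi", 25),
]


def build_contingency_table(records):
    """Distribute view durations into an attribute-by-location table."""
    locations = sorted({location for _, location, _ in records})
    # Every attribute row starts with a zero for every known location.
    table = defaultdict(lambda: dict.fromkeys(locations, 0))
    for attribute, location, duration in records:
        table[attribute][location] += duration
    return dict(table), locations


table, locations = build_contingency_table(records)
for attribute, row in table.items():
    print(attribute, row)
```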
In a preferred embodiment of the present invention, the step of generating relevant trending topics based on the impacted frequently used attributes by a trending topic analyser, further comprises steps of acquiring a list of trending topics from a trending topic repository based on access location and time duration of the trending topics; analysing relationship of the trending topics and related attribute values using Social Network Analysis, SNA; and grouping the attribute values and its relevant trending topics according to each impacted frequently used attributes.
In a preferred embodiment of the present invention, the step of analysing relationship of the trending topics and related attribute values using Social Network Analysis, SNA, further comprises steps of mapping the trending topics to related attribute values of each associated frequently used attributes by creating a link, calculating total sum of links concluded for each attribute value and the associated frequently used attributes; and establishing a rank of the frequently used attributes according to highest order of the sum of links.
In a preferred embodiment of the present invention, the step of generating a set of highest combination, HCn, for each impacted frequently used attribute, by a report criteria generator, further comprises steps of tagging each impacted frequently used attribute with keywords to constitute a set of attribute keywords, AK = {Kn}; generating semantically related words for each trending topic using Latent Semantic Analysis, LSA to constitute a set of attribute related words, AR = {Rn}; calculating similarity value of each Kn with each Rn using cosine similarity; and constituting the attribute keywords, AK and attribute related words, AR with highest similarity value to generate the set of highest combination, HCn, for each impacted frequently used attribute.
In a preferred embodiment of the present invention, the step of tagging impacted frequently used attribute with keywords, further comprises steps of identifying attribute description of each impacted frequently used attribute; and tagging the attribute description using Named Entity Recognition, NER.
In a preferred embodiment of the present invention, the method further comprises a step of storing the set of highest combination, HCn, for each frequently used attribute in the report repository.
The present invention also relates to a system for determining trending topics based on frequently used attributes, comprising a report repository for storing report attributes and outputs of the system, a trending topic repository for storing trending topics gathered from a plurality of social media, a statistical engine configured to conduct statistical analysis on the report attributes based on Attribute View Duration, AVD and User Access Location, UAL using chi-square test to analyse impact of the frequently used attributes, a trending topic analyser configured to analyse relationship of the trending topics and related attribute values for each impacted frequently used attribute using Social Network Analysis, SNA, and a report criteria generator configured to generate a set of highest combination of the trending topics and attributes for each frequently used attribute, HCn, wherein each set of the HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
BRIEF DESCRIPTION OF THE DRAWINGS
The features of the invention will be more readily understood and appreciated from the following detailed description when read in conjunction with the accompanying drawings of the preferred embodiment of the present invention.
Figure 1 is a diagram illustrating a system for determining trending topics based on attributes in accordance to the present invention.
Figure 2 is a flow chart for a method of determining trending topics based on attributes in accordance to the present invention.
Figure 3 is a flow chart for a step of analysing impact of the frequently used attributes from the report attributes in accordance to the present invention.
Figure 4 is a flow chart for a step of generating relevant trending topics based on the impacted frequently used attributes in accordance to the present invention.
Figure 5 illustrates an exemplary embodiment for mapping the trending topics to related attribute values of each associated frequently used attributes in accordance to the present invention.
Figure 6 is a flow chart for a step of generating a set of highest combination (HCn) for each impacted frequently used attribute in accordance to the present invention.
Figure 7 illustrates an exemplary embodiment to represent similarity between a set of attribute keywords and an attribute related words of the trending topics in accordance to the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT
The above mentioned features and objectives of this invention will become more apparent and better understood by reference to the following detailed description. It should be understood that the detailed description made known below is not intended to be exhaustive or limit the invention to the precise disclosed form, as the invention may assume various alternative forms. On the contrary, the detailed description covers all the relevant modifications and alterations made to the present invention, unless the claims expressly state otherwise.
The terminology used in the description of the example embodiments herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used in the description of the example embodiments and the appended examples, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The present invention provides a system for determining trending topics based on attributes frequently used by a user, wherein Figure 1 illustrates a block diagram of said system (10) in accordance with the preferred embodiment of the present invention. The system (10) comprises a statistical engine (11), a trending topic analyser (12) and a report criteria generator (13) in communication with a report repository (14) and a trending topic repository (15).
The statistical engine (11) is configured to conduct statistical analysis of attributes based on Attribute View Duration, AVD and User Access Location, UAL using a chi-square test. The AVD in the present invention is defined as the amount of time a user spends accessing a report. The AVD factor is considered because view duration indicates the behavioural preferences of a user in generating a subsequent report. From the view duration, an appropriate attribute may be suggested for the next report generating analysis. Meanwhile, the UAL relates to the location from which the user accesses the report, which is required to determine trending topics for each access location.
The trending topic analyser (12) is configured to analyse the relationship of trending topics and related attribute values using Social Network Analysis, SNA, such as Google Trends. In one embodiment, Google Trends provides keyword related data including search volume index and geographical information about search engine users. However, it should be appreciated by a person skilled in the art that any SNA tool that is capable of providing comparative keyword research and discovering event-triggered spikes in keyword search volume may be used to analyse the relationship of the trending topics and the related attribute values.
The report criteria generator (13) is configured to analyse report attributes with current trend based on user profiling and trending topics. The user profiling is acquired by analysing the AVD and UAL. Output of the report criteria generator (13) may further be used in any data analytic visualization tools to provide an accessible approach to analyse and understand trends, outliers, and patterns in data. In one embodiment, data analytic visualization delivers graphical representation of information and data by using visual elements like charts, graphs and maps.
The report repository (14) stores a collection of report execution history comprising report attributes (e.g. attribute name, attribute description, etc.) and outputs of the system (10), as well as the AVD and UAL information recorded during the report execution. The trending topic repository (15) stores a collection of the latest trending topics gathered from various social media platforms, for example Facebook, Twitter, Instagram, etc.
The present invention also relates to a method for determining trending topics (20) based on the attributes frequently used by a user. Figure 2 illustrates a flow chart for said method (20), comprising a step of retrieving report attributes (100) from the report repository (14), and analysing impact of the frequently used attributes (200) from the report attributes by employing Attribute View Duration, AVD and User Access Location, UAL, by the statistical engine (11). The method (20) further comprises a step of generating relevant trending topics (300) based on the impacted frequently used attributes, by the trending topic analyser (12) and generating a set of highest combination, HCn, for each impacted frequently used attribute (400), by the report criteria generator (13). The method (20) also comprises a step of storing the set of HCn for each frequently used attribute (500) in the report repository (14).
The method (20) begins with retrieving all report attributes (100) from the report repository (14) by extracting all attributes from a current report generated analysis, wherein the report generated analysis may comprise a plurality of reports (e.g. R1, R2, R3, ... Rn).
Then, the statistical engine (11) analyses impact of the frequently used attributes (200) from the report attributes by employing the Attribute View Duration, AVD and User Access Location, UAL as shown in Figure 3. All the attributes extracted in step 100 are accumulated to acquire a list of the attributes frequently used by the user (201).
The step of accumulating a list of the frequently used attributes (201) from the report attributes further comprises steps of clustering the frequently used attributes based on occurrences of similar attributes; and mapping each frequently used attribute with its associated view duration and access location. Table 1 shows an example in which the accumulated attributes used by the users from each report are mapped in a table with their associated view duration and access location from step 201.
Table 1: Attributes Used in Report Generated Analysis
A number of the most frequently occurring attributes are gathered and ranked based on the occurrences of similar attributes; for example, the top 100 attributes are selected for further analysis. From the exemplary embodiment of Table 1, the top attributes such as ‘gender’, ‘age’, ‘state’, ‘income’, ‘height’ and ‘weight’ are gathered and ranked according to the occurrences of similar attributes in each report. The occurrences of similar attributes used by the users are identified as a plurality of sets of a frequent value, Fn.
The frequently used attributes may further be represented by equation (1), wherein the equation lists each of the top attributes together with its frequent value, Fn.
Frequently Used Attribute = {Gender = 59}, {State = 27}, {Income = 14}, {Height = 11}, {Weight = 10}, ... , {A100 = Fn} (1)
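The accumulation in step 201 can be sketched as a simple frequency count. This is a minimal illustration, not the claimed implementation: the `reports` dictionary, the function name `frequently_used_attributes` and the `top_n` parameter are assumptions introduced here, and the sample data is hypothetical.

```python
from collections import Counter

# Hypothetical report execution history: each report lists the attributes it used.
reports = {
    "R1": ["gender", "state", "income"],
    "R2": ["gender", "age", "state"],
    "R3": ["gender", "height", "weight"],
}

def frequently_used_attributes(reports, top_n=100):
    """Count attribute occurrences across all reports and return the
    top_n attributes with their frequent value Fn, highest first."""
    counts = Counter(attr for attrs in reports.values() for attr in attrs)
    return counts.most_common(top_n)

ranked = frequently_used_attributes(reports, top_n=3)
for attribute, frequent_value in ranked:
    print(f"{{{attribute} = {frequent_value}}}")
```

Each printed pair corresponds to one `{Attribute = Fn}` entry of equation (1).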
Once the frequently used attributes are acquired, each attribute from Table 1 is analysed using two factors (202), i.e. the Attribute View Duration, AVD and User Access Location, UAL. By analysing each frequently used attribute (202) based on the AVD and UAL, the most popular attributes are identified to determine user behavioural preferences in generating a subsequent report.
The AVD is defined as the amount of time a user spends accessing the report. Timing of the view duration starts when the user logs in to a user profile, clicks to start performing analysis and selects attributes to view or interact with the report. The report can be any type of report. Time tracking stops and the user is marked as offline when no physical movement or input from input devices (i.e. mouse movement, keyboard and touchscreen) is detected.
In the preferred embodiment, the generated report has geo-tagging to mark the location from which the report is accessed when the user logs in to the user profile, in order to record the UAL. The UAL is further used in determining trending topics for each location.
The step of analysing each frequently used attribute (202) based on the AVD and UAL includes distributing the AVD and UAL for each frequently used attribute in a contingency table. As shown in table 2, the AVD and UAL for each attribute from table 1 are distributed in the contingency table.
From the contingency table, it is shown that each attribute is being accessed in different UAL, i.e. Bandar Baru Bangi, Putrajaya, Kajang and so forth until Ln. Each attribute has different AVD value based on the UAL.
Table 2: Contingency Table for AVD and UAL
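As a rough sketch of how such a contingency table could be assembled, the following sums view durations per attribute and access location. The access log, its units and the helper name `build_contingency` are assumptions for illustration; only the location names and the value 60 for ‘Gender’ at ‘Putrajaya’ echo the description.

```python
from collections import defaultdict

# Hypothetical access log: (attribute, access location, view duration).
records = [
    ("Gender", "Putrajaya", 40),
    ("Gender", "Putrajaya", 20),
    ("Gender", "Kajang", 15),
    ("State", "Putrajaya", 30),
    ("State", "Bandar Baru Bangi", 25),
]

def build_contingency(records):
    """Sum the AVD for every (attribute, UAL) cell of the contingency table."""
    table = defaultdict(lambda: defaultdict(int))
    for attribute, location, duration in records:
        table[attribute][location] += duration
    return table

table = build_contingency(records)

# Row totals (per attribute), column totals (per location) and grand total.
row_totals = {attr: sum(cells.values()) for attr, cells in table.items()}
col_totals = defaultdict(int)
for cells in table.values():
    for location, value in cells.items():
        col_totals[location] += value
grand_total = sum(row_totals.values())
```

The row, column and grand totals computed here are the quantities that the expected frequency calculation draws on.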
Next, each attribute from the contingency table is analysed (203) using a statistical analysis known as chi-square test to determine correlation between the AVD of each attribute for different UAL. The chi-square test outputs a significant value, V of difference between expected frequencies and observed frequencies of each attribute for each AVD at different UAL. In the preferred embodiment, the expected frequency, EF is calculated by equation (2):
EF = Total Attribute AVD x (Total Attribute UAL/ Overall Total of AVD) (2)
For example, the expected frequency (EF) for ‘Putrajaya’ calculated using equation (2) is, EF = 105 x (T2/GTn).
Meanwhile, the observed frequency is a value of AVD to be tested, for example with reference to table 2, the observed frequency value for ‘Gender’ and ‘Putrajaya’ is 60.
Therefore, the significant value, V calculated using the chi-square test for ‘Putrajaya’ may be represented by equation (3):
Significant Value, V = (60 - EF) / EF (3)
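The calculation of equations (2) and (3) may be illustrated as follows. This is a sketch under stated assumptions: the column total 200 and grand total 420 are hypothetical stand-ins for T2 and GTn, whose values are not given, and the function names are introduced here. Note also that equation (3) as printed shows an unsquared difference, whereas the sketch uses the standard Pearson chi-square contribution, (O - E)^2 / E, which is an assumption about the intended formula.

```python
def expected_frequency(row_total, col_total, grand_total):
    """Equation (2): attribute (row) total x (location (column) total / grand total)."""
    return row_total * (col_total / grand_total)

def chi_square_term(observed, expected):
    """Standard Pearson contribution of one cell: (O - E)^2 / E.
    Assumption: the printed equation (3) omits the square."""
    return (observed - expected) ** 2 / expected

# 105 and 60 come from the worked example; 200 and 420 are hypothetical.
ef = expected_frequency(row_total=105, col_total=200, grand_total=420)
v = chi_square_term(observed=60, expected=ef)
print(round(ef, 6), round(v, 6))
```

Summing `chi_square_term` over all cells of an attribute's row would give that attribute's overall chi-square value.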
After acquiring the significant values, V for all the attributes from the chi-square test, UAL weightage for each frequently used attribute is calculated (204), by summing each attribute based on the UAL in Table 2. For example, with reference to table 2, the UAL weightage for ‘Putrajaya’ is T2. Then, the frequently used attributes are ranked (205) according to highest order of chi-square values and the highest UAL weightage as shown in table 3.
Table 3: Statistical Analysis Output for AVD and UAL
From Table 3, it is shown that the chi-square values are sorted and ranked from highest to lowest (205) together with the highest UAL weightage. A high value shows high correlation of AVD and UAL for the frequently used attributes, thus giving a higher ranking position to signify the impact of the frequently used attributes. The impacted frequently used attributes and the values obtained from step 205 are subsequently stored in the report repository (14) for the purpose of determining the trending topics.
The method (20) then proceeds to generate relevant trending topics (300) based on the impacted frequently used attributes by the trending topic analyser (12), with reference to Figure 4. The step 300 begins by acquiring a list of trending topics (301) from a trending topic repository (15) based on access location and time duration of the trending topics. An example of the trending topic repository (15) is Google Trends. In an embodiment, the list of the trending topics may be obtained from Google Trends, with inputs such as country, state and city indicating the location of the trending topic, while the time duration of the trending topics is presented, for example, as the past 24 hours, past 30 days and past 12 months. In a preferred embodiment, the list of trending topics, T is returned as in equation (4).
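The ranking of step 205 can be sketched as a two-key sort. The description does not state how the two criteria are combined, so using the UAL weightage as a tie-breaker after the chi-square value is an assumption, as are the sample values and the function name `rank_attributes`.

```python
# Hypothetical per-attribute results: (attribute, chi-square value, UAL weightage).
results = [
    ("State", 4.2, 90),
    ("Gender", 7.5, 120),
    ("Income", 4.2, 110),
]

def rank_attributes(results):
    """Sort by chi-square value, then by UAL weightage, both descending.
    Assumption: UAL weightage only breaks ties between equal chi-square values."""
    return sorted(results, key=lambda r: (r[1], r[2]), reverse=True)

ranking = [name for name, _, _ in rank_attributes(results)]
print(ranking)
```

A weighted combination of the two values would be an equally plausible reading; the sort key is the only part that would change.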
T = {Lee Chong Wei, Gamuda Share Price, Liverpool, ... , Tn} (4)
Then, the relationship of the trending topics and related attribute values is analysed (302) using Social Network Analysis, SNA. The step of analysing the relationship of the trending topics and related attribute values (302) further comprises a step of mapping each of the trending topics to related attribute values, A of each associated frequently used attribute by creating a link as illustrated in Figure 5. Examples of the attribute values, A for the attribute ‘Gender’ are ‘Female’ and ‘Male’. In an exemplary embodiment for step 302, the trending topic ‘Lee Chong Wei’ is linked to the attribute value, A ‘Male’ for the attribute ‘Gender’ and ‘P.Pinang’ for the attribute ‘State’.
Total sum of links concluded for each attribute value, A and the associated frequently used attributes is then calculated, and the frequently used attributes are ranked according to highest order of the sum of links. Further, the attribute values and their relevant trending topics are grouped according to each impacted frequently used attribute (303) by generating a list of the frequently used attributes comprising the attribute values, A associated with their relevant trending topics. The following are examples of the list of most impacted frequently used attributes with social media trending topics:
i. Gender: (Female = 2 linked) + (Male = 3 linked) = 5 linked.
Gender = {(Female, Gamuda Share Price), (Female, Liverpool)}
ii. State: (Selangor = 1 linked) + (KL = 1 linked) + (P.Pinang = 2 linked) = 4 linked.
State = {(Selangor, Liverpool), (KL, Gamuda Share Price)}
iii. Age: (0-20 Years = 1 linked) + (20-40 Years = 2 linked) + (40-60 Years = 1 linked) = 4 linked.
Age = {(0-20 Years, Liverpool), (20-40 Years, Gamuda Share Price)}
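The link counting and grouping of steps 302 and 303 can be sketched as below. The links are taken from the pairs named in the examples, but they are not the complete set behind the listed sums, so the totals here differ from the 5 and 4 quoted above. The function name `summarise_links` and the alphabetical tie-break in the ranking are assumptions.

```python
from collections import defaultdict

# Links as (trending topic, attribute, attribute value), a partial set
# drawn from the pairs mentioned in the description.
links = [
    ("Lee Chong Wei", "Gender", "Male"),
    ("Lee Chong Wei", "State", "P.Pinang"),
    ("Gamuda Share Price", "Gender", "Female"),
    ("Liverpool", "Gender", "Female"),
    ("Liverpool", "State", "Selangor"),
    ("Gamuda Share Price", "State", "KL"),
]

def summarise_links(links):
    """Count links per attribute value and per attribute, group the
    (value, topic) pairs by attribute, and rank attributes by link sum."""
    per_value = defaultdict(int)
    per_attribute = defaultdict(int)
    groups = defaultdict(list)
    for topic, attribute, value in links:
        per_value[(attribute, value)] += 1
        per_attribute[attribute] += 1
        groups[attribute].append((value, topic))
    # Highest sum of links first; ties broken alphabetically (assumption).
    ranking = sorted(per_attribute, key=lambda a: (-per_attribute[a], a))
    return per_value, per_attribute, groups, ranking

per_value, per_attribute, groups, ranking = summarise_links(links)
```

Here `groups["Gender"]` plays the role of the list `Gender = {(Female, Gamuda Share Price), ...}` in the examples.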
The method (20) further generates a set of highest combination, HCn, for each impacted frequently used attribute (400), by the report criteria generator (13) as shown in Figure 6. Step 400 finds the combination with the highest similarity value between the keywords of the impacted frequently used attributes, known as attribute keywords, AK, and the related words of the trending topics, known as attribute related words, AR.
The step 400 begins with a step of tagging each impacted frequently used attribute with keywords (401) to constitute a set of the attribute keywords, AK = {Kn}, where n = 1, 2, 3, ... N. The step of tagging each impacted frequently used attribute with keywords (401) further comprises steps of identifying the attribute description of each impacted frequently used attribute and tagging the attribute description using Named Entity Recognition, NER.
The attribute description for each frequently used attribute is acquired by iterating over each attribute, and an NER repository is used to tag the attribute description with relevant attribute keywords, AK = {K1, K2, ... KN}.
In an exemplary embodiment, attribute A1 has the following attribute description: “This attribute is related to Hospital Serdang patients which stays in Selangor in 2010”. Said attribute description of the attribute A1 is then tagged as:
Place: Selangor
Year: 2010
Organization: Hospital Serdang
Therefore, the set of the attribute keywords related to attribute A1 is,
AK = {Place, Year, Organization}.
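The tagging of step 401 can be imitated with a toy tagger. This is only a stand-in: a real NER model or repository is not reproduced here, and the `GAZETTEER` dictionary, the year pattern and the function name `tag_description` are all hypothetical constructs that simply look up known names.

```python
import re

# Tiny hypothetical gazetteer standing in for the NER repository.
GAZETTEER = {
    "Place": ["Selangor", "Putrajaya", "Kajang"],
    "Organization": ["Hospital Serdang"],
}
# Four-digit years starting with 19 or 20.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")

def tag_description(description):
    """Return {label: matched text} for the first match of each label."""
    tags = {}
    for label, names in GAZETTEER.items():
        for name in names:
            if name in description:
                tags[label] = name
                break
    match = YEAR_PATTERN.search(description)
    if match:
        tags["Year"] = match.group(0)
    return tags

description = ("This attribute is related to Hospital Serdang patients "
               "which stays in Selangor in 2010")
tags = tag_description(description)
attribute_keywords = set(tags)  # the set AK for this attribute
```

The labels of the matched entities, rather than the matched text, form the set AK, mirroring the example above.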
Next, related words for each trending topic are generated semantically (402) using Latent Semantic Analysis, LSA to constitute a set of attribute related words, AR = {Rn}, where n = 1, 2, 3, ... N. Each of the trending topics from step 300 is iterated and the LSA technique is used to find sets of semantically related words based on the domain of the trending topics to generate the attribute related words, AR = {R1, R2, ... RN}.
In the exemplary embodiment, the trending topics, T generated based on the impacted frequently used attributes of the attribute A1 are T = {Finance, Education}. The sets of related words generated for each of the trending topics, T of the attribute A1 using the LSA technique are:
Finance = {finance adviser, finance management, finance consultant}
Education = {education fair, education articles, education act}
Therefore, the set of the attribute related words, AR for attribute A1 is:
AR = {finance adviser, finance management, finance consultant, education fair, education articles, education act}.
Upon identifying the AK and AR, the similarity of each attribute keyword Kn with each attribute related word Rn is calculated (403) using cosine similarity. For example, a first attribute keyword, K1, from the set of attribute keywords AK = {Place, Year, Organization} is selected, i.e. “Place”.
The steps of selecting each attribute keyword, Kn and calculating the similarity are iterated until there are no more attribute related words, Rn to be selected. For example, the similarity of K1 with the set of attribute related words, AR = {finance adviser, finance management, finance consultant, education fair, education articles, education act} is calculated using the cosine similarity until all related words, R1, R2, ... RN from AR have been selected for obtaining the similarity values.
The cosine similarity calculation outputs results such as 0.8, 0.7, 0.6, 0.3, 0.5, and 0.2 to represent the similarity between AK and AR. For example, as shown in Figure 7, the similarity between the first attribute keyword of AK, i.e. K1, and the first attribute related word of the trending topics, i.e. R1, is shown as 0.8, while the similarity with the attribute related word R2 is shown as 0.7. The small dots in Figure 7 represent the related words.
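Cosine similarity itself is standard and can be sketched in a few lines. The description does not say how keywords and related words are turned into vectors, so the three-dimensional vectors below are purely hypothetical, chosen so that one pair yields 0.8.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical vector representations of one keyword and two related words.
vectors = {
    "Place": [1.0, 0.0, 0.0],
    "finance adviser": [0.8, 0.6, 0.0],
    "education fair": [0.0, 1.0, 0.0],
}

sim_k1_r1 = cosine_similarity(vectors["Place"], vectors["finance adviser"])
sim_k1_r4 = cosine_similarity(vectors["Place"], vectors["education fair"])
```

Iterating this function over every pair (Kn, Rn) produces the table of similarity values from which the highest combinations are later drawn.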
Once all the similarity values are calculated, the attribute keywords, AK and attribute related words, AR with highest similarity value are constituted (404) to generate the set of highest combination, HCn, for each impacted frequently used attribute, wherein each set of HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
In other words, the HCn represents a set of highest combination of the trending topics and attributes for each frequently used attribute, as in the following example:
HCage = {Place, finance management, education act}
HCstate = {Organization, education fair}
HCgender = {Year, Organization, finance adviser, education articles, education act}
As shown in the example, each impacted frequently used attribute (i.e. age, state, gender) comprises a list of combination of each attribute keywords, AK and attribute related words, AR having highest similarity value.
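One possible way of constituting a set HCn from the similarity values is sketched below. The description does not specify the selection rule, so keeping every pair at or above a fixed threshold is an assumption, as are the threshold of 0.7, the function name `highest_combination` and the pairing of the quoted scores with particular words.

```python
def highest_combination(similarities, threshold=0.7):
    """Collect the keywords and related words of all pairs whose similarity
    is at or above the threshold, highest-scoring pairs first.
    Assumption: a fixed threshold defines 'highest similarity value'."""
    keywords, related = [], []
    ordered = sorted(similarities.items(), key=lambda kv: -kv[1])
    for (keyword, word), score in ordered:
        if score < threshold:
            break
        if keyword not in keywords:
            keywords.append(keyword)
        if word not in related:
            related.append(word)
    return keywords + related

# The scores echo the values quoted in the description; the pairs are hypothetical.
similarities = {
    ("Place", "finance adviser"): 0.8,
    ("Place", "finance management"): 0.7,
    ("Place", "finance consultant"): 0.6,
    ("Year", "education fair"): 0.3,
    ("Year", "education articles"): 0.5,
    ("Year", "education act"): 0.2,
}

hc = highest_combination(similarities)
print(hc)
```

Selecting only the single best related word per keyword would be another reasonable reading; only the loop condition would change.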
These combinations of trending topics and attributes for each frequently used attribute are stored in the report repository (14) (500) and may subsequently be used as additional input for better analysis in analytics visualization tools.
While this invention has been particularly shown and described with reference to the exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention as defined by the appended claims.
Claims
1. A method of determining trending topics (20) based on frequently used attributes, comprising a step of retrieving report attributes (100) from a report repository (14), characterized by steps of: analysing impact of the frequently used attributes (200) from the report attributes by employing Attribute View Duration, AVD and User Access Location, UAL, by a statistical engine (11); generating relevant trending topics (300) based on the impacted frequently used attributes, by a trending topic analyser (12); and generating a set of highest combination, HCn, for each impacted frequently used attribute (400), by a report criteria generator (13), wherein each set of HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
2. The method (20) according to claim 1, wherein the step of analysing impact of the frequently used attributes (200) from the report attributes by a statistical engine (11), by employing Attribute View Duration, AVD and User Access Location, UAL, further comprises steps of: accumulating a list of the frequently used attributes (201) from the report attributes; analysing each frequently used attribute (202) based on the AVD and UAL; conducting statistical analysis for each frequently used attribute (203) based on the AVD and UAL using chi-square test; calculating UAL weightage for each frequently used attribute (204); and establishing a rank of the frequently used attributes (205) according to highest order of chi-square values and highest UAL weightage.
3. The method (20) according to claim 2, wherein the step of accumulating a list of the frequently used attributes (201) from the report attributes, further comprises steps of: clustering the frequently used attributes based on occurrences of similar attributes; and mapping each frequently used attribute with its associated view duration and access location.
4. The method (20) according to claim 2, wherein the step of analysing each frequently used attribute (202) based on the AVD and UAL includes distributing the AVD and UAL for each frequently used attribute in a contingency table.
5. The method (20) according to claim 1, wherein the step of generating relevant trending topics (300) based on the impacted frequently used attributes by a trending topic analyser (12), further comprises steps of: acquiring a list of trending topics (301) from a trending topic repository (15) based on access location and time duration of the trending topics; analysing relationship of the trending topics and related attribute values (302) using Social Network Analysis, SNA; and grouping the attribute values and its relevant trending topics according to each impacted frequently used attributes (303).
6. The method according to claim 5, wherein the step of analysing relationship of the trending topics and related attribute values (302) using Social Network Analysis, SNA, further comprises steps of: mapping the trending topics to related attribute values of each associated frequently used attributes by creating a link; calculating total sum of links concluded for each attribute value and the associated frequently used attributes; and establishing a rank of the frequently used attributes according to highest order of the sum of links.
7. The method according to claim 1, wherein the step of generating a set of highest combination, HCn, for each impacted frequently used attribute (400), by a report criteria generator (13), further comprises steps of: tagging each impacted frequently used attribute with keywords (401) to constitute a set of attribute keywords, AK = {Kn}; generating semantically related words for each trending topic (402) using Latent Semantic Analysis, LSA to constitute a set of attribute related words, AR = {Rn}; calculating similarity value of each Kn with each Rn (403) using cosine similarity; and constituting the attribute keywords, AK and attribute related words, AR with highest similarity value (404) to generate the set of highest combination, HCn, for each impacted frequently used attribute.
8. The method according to claim 7, wherein the step of tagging impacted frequently used attribute with keywords (401), further comprises steps of: identifying attribute description of each impacted frequently used attribute; and tagging the attribute description using Named Entity Recognition, NER.
9. The method (20) according to claim 1 further comprising a step of storing the set of highest combination, HCn for each frequently used attribute (500) in the report repository (14).
10. A system for determining trending topics (10) based on frequently used attributes, comprises: a report repository (14) for storing report attributes and outputs of the system (10); and a trending topic repository (15) for storing trending topics gathered from a plurality of social media, characterised by a statistical engine (11) configured to conduct statistical analysis on the report attributes based on Attribute View Duration, AVD and User Access Location, UAL, using chi-square test to analyse impact of the frequently used attributes; a trending topic analyser (12) configured to analyse relationship of the trending topics and related attribute values for each impacted frequently used attribute using Social Network Analysis, SNA; and a report criteria generator (13) configured to generate a set of highest combination of the trending topics and attributes for each frequently used attributes, HCn, wherein each set of the HCn comprises attribute keywords, AK and attribute related words, AR with highest similarity value.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MYPI2019007943 | 2019-12-31 | ||
MYPI2019007943 | 2019-12-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021137690A1 (en) | 2021-07-08 |
Family
ID=76686746
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/MY2020/050147 WO2021137690A1 (en) | 2019-12-31 | 2020-11-12 | Method of determining trending topics and a system thereof |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2021137690A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012100067A1 (en) * | 2011-01-19 | 2012-07-26 | 24/7 Customer, Inc. | Analyzing and applying data related to customer interactions with social media |
US20130304658A1 (en) * | 2010-10-29 | 2013-11-14 | Facebook, Inc. | Inferring user profile attributes from social information |
US20150262069A1 (en) * | 2014-03-11 | 2015-09-17 | Delvv, Inc. | Automatic topic and interest based content recommendation system for mobile devices |
US20180373788A1 (en) * | 2014-12-30 | 2018-12-27 | Facebook, Inc. | Contrastive multilingual business intelligence |
KR20190109628A (en) * | 2018-02-27 | 2019-09-26 | 한국전자통신연구원 | Method for providing personalized article contents and apparatus for the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20910342 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20910342 Country of ref document: EP Kind code of ref document: A1 |